Claude Opus 4.8: Tops SWE-Bench Pro at 69.2%

Claude Opus 4.8: Tops SWE-Bench Pro at 69.2% | Flash News Detail | Blockchain.News

Latest Update

5/29/2026 3:56:00 AM

Claude Opus 4.8 scores 69.2% on SWE-Bench Pro to lead agentic coding and is more candid about its own uncertainty, but it still trails GPT-5.5 on Terminal-Bench 2.1.


Analysis

Claude Opus 4.8 posts 69.2% on SWE-Bench Pro, extending its lead in agentic coding benchmarks, yet it still trails GPT-5.5 on Terminal-Bench 2.1. The model now admits uncertainty on select tasks, a shift from prior versions that rarely flagged their own limits. Released at unchanged pricing, the update arrives alongside the launch of EasyRouterIO, which offers 400 free test credits via a promo code.


傅盛

@FuSheng_0306

Chairman and CEO of Cheetah Mobile, Chairman of OrionStar
