Anthropic releases Claude Opus 4.7: How to try it, benchmarks, safety

Anthropic has been shipping products and making news at a blistering pace in 2026, and on Thursday, the AI company announced the launch of Claude Opus 4.7.

Claude Opus 4.7 is Anthropic’s most intelligent model available to the general public. Notably, Anthropic said in a press release that Opus 4.7 is not as powerful as Claude Mythos, which Anthropic deemed too dangerous for public release.

Claude Opus is a family of hybrid reasoning models capable of multi-step reasoning and advanced coding. Until the announcement of Claude Mythos on April 7, Claude Opus was considered Anthropic’s most advanced series of AI models.

How to try Claude Opus 4.7

Claude Opus 4.7 is available now via Claude AI, the Claude API, and Anthropic partners such as Microsoft Foundry. The new model is priced the same as Claude Opus 4.6.

However, Anthropic noted that because “Opus 4.7 thinks more at higher effort levels,” it uses more output tokens than its predecessor. Users can read more about how to optimize token usage in the Opus 4.7 migration guide.

How Claude Opus 4.7 improves over 4.6

As expected, Claude Opus 4.7 offers improved capabilities across the board.

Specifically, Anthropic says Claude Opus 4.7 is better at advanced coding tasks, visual intelligence, and document analysis. Anthropic also says Opus 4.7 is “more tasteful and creative when completing professional tasks, producing higher-quality interfaces, slides, and docs.”

“Users report being able to hand off their hardest coding work, the kind that previously needed close supervision, to Opus 4.7 with confidence. Opus 4.7 handles complex, long-running tasks with rigor and consistency, pays precise attention to instructions, and devises ways to verify its own outputs before reporting back,” reads an Anthropic blog post.

Claude Opus 4.7: Benchmark performance

Anthropic released a detailed model card outlining how Claude Opus 4.7 compares to other Anthropic models and frontier models from OpenAI, Google, and xAI.

Opus 4.7 lags behind the unreleased Claude Mythos, which Anthropic reports scores significantly higher on popular benchmarks such as Humanity’s Last Exam. “Claude Opus 4.7 is less capable than Claude Mythos Preview on every relevant axis we measured and does not advance our capability frontier,” the model card states. That means Claude Opus 4.7 is not proof that AI development has accelerated beyond existing trend lines.

On Humanity’s Last Exam (without tools), Anthropic reports that Claude Opus 4.7 outperforms all other frontier models except Claude Mythos.

Claude Mythos scored 56.8 percent on HLE
Claude Opus 4.7 scored 46.9 percent
Gemini 3.1 Pro scored 44.4 percent
GPT-5.4 Pro scored 42.7 percent
Claude Opus 4.6 scored 40.0 percent

With tools, GPT-5.4 Pro scored 58.7 percent compared to Opus 4.7’s 54.7 percent. Mythos beat them both with 64.7 percent.

Mashable has not independently verified these benchmark results. Full results are available in the Opus 4.7 model card.

[Image: Table comparing Claude Opus 4.7 to other frontier models on benchmark tests. Credit: Anthropic]

Overall, Anthropic scored Opus 4.7 above other leading models in some benchmarks, though Gemini 3.1 Pro and GPT-5.4 score higher in some areas.

Claude Opus 4.7: Safety and hallucinations

Anthropic also reports that Opus 4.7 shows a low risk of misaligned behaviors, with the same risk profile as Opus 4.6.

For example, Anthropic says Opus 4.7 is less likely to hallucinate and shows lower rates of reward hacking.

“Claude Opus 4.7 is more reliably honest than Opus 4.6 or Sonnet 4.6, with large reductions in the rate of important omissions, and moderate improvements in factuality and rates of hallucinated input,” the model card states.
