Anthropic unveiled its new AI models, Claude Sonnet 4 and Claude Opus 4. The launch comes at a time when the company needed a strong response to maintain its position against rivals like Google Gemini.
Claude Opus 4 claims the title of “world’s best coding model”, while Claude Sonnet 4 offers significant improvements over its predecessor. Both models introduce extended thinking capabilities and enhanced tool usage.
However, one question remains: are these improvements substantial enough to stem the loss of users and make Claude a favourite among developers again?
Early feedback from major tech companies and several reports suggests that Claude 4 might have found the right formula for a comeback.
Building Rapport in AI Coding
Claude Opus 4’s performance on coding benchmarks appears impressive on paper. It achieved 72.5% on SWE-bench and 43.2% on Terminal-bench.
Claude Sonnet 4 builds on Sonnet 3.7’s industry-leading coding capabilities, posting a state-of-the-art 72.7% on SWE-bench.

While Claude Sonnet 4 does not match Opus 4 in most domains, it delivers what Anthropic calls “an optimal mix of capability and practicality”.
These scores position the two models ahead of competing offerings in software engineering tasks.
According to Anthropic, the Opus 4 model can work continuously for several hours, maintaining focus through thousands of steps. This capability addresses a standard limitation in current AI models that struggle with extended workflows.
The announcement blog post highlighted, “Rakuten validated its capabilities with a demanding open-source refactor running independently for 7 hours with sustained performance.”
Aman Sanger, co-founder of Cursor, mentioned on X, “Claude Sonnet 4 is much better at codebase understanding. Paired with recent improvements in Cursor, it’s SOTA on large codebases.”
While he did not share a comparison with other AI models, it is a noteworthy acknowledgement that could prompt developers to give Claude another try.
Companies like Replit and Block, the firm behind Square, have echoed the same sentiment based on their own use of Claude’s new models.
Mario Rodriguez, chief product officer at GitHub, mentioned in the Code with Claude opening keynote that they are using Claude Sonnet 4 as the base model for their new coding agent in GitHub Copilot.
He explained that Claude Sonnet was chosen for its strengths in deep software engineering, coding knowledge, problem-solving skills, and strong instruction following capabilities.
Claude Code Becomes Generally Available
Anthropic’s API now offers four new features for developers to create more robust AI agents—a code execution tool, an MCP connector, a Files API, and prompt caching for up to one hour.
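Of these features, prompt caching is the easiest to picture. The sketch below shows what a Messages API request payload with a cacheable system prompt might look like; the request shape follows Anthropic's documented `cache_control` convention, but the specific model ID and the `"ttl": "1h"` syntax for the one-hour cache are assumptions here, not verified details.

```python
# Sketch of a Messages API request payload using prompt caching.
# The "cache_control" field follows Anthropic's prompt-caching docs;
# the model ID and the "ttl" value are illustrative assumptions.
long_system_prompt = "You are a code-review assistant. " * 200  # large, reusable context

request_payload = {
    "model": "claude-sonnet-4-20250514",  # assumed model ID
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": long_system_prompt,
            # Mark the large static prefix as cacheable so repeated calls
            # can reuse it instead of re-billing the full input tokens.
            "cache_control": {"type": "ephemeral", "ttl": "1h"},  # assumed 1-hour TTL syntax
        }
    ],
    "messages": [
        {"role": "user", "content": "Review this function: def add(a, b): return a - b"}
    ],
}

print(request_payload["system"][0]["cache_control"])
```

The idea is that the long, unchanging prefix is billed at a reduced rate on cache hits, which matters most for agents that re-send the same large context on every step.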
Anthropic has also made Claude Code generally available. The tool now works with GitHub Actions to respond to pull requests and modify code.
The company also announced Claude Code integrations with VS Code and JetBrains in the form of extensions (in beta), with inline editing support.

The pricing structure remains consistent with previous models: Opus 4 is priced at $15 per million input tokens and $75 per million output tokens, while Sonnet 4 costs $3 and $15 per million tokens, respectively.
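At those published rates, the cost gap between the two models is easy to work out. A small, self-contained calculation (the token counts are made-up example numbers):

```python
# Cost comparison at the published per-million-token rates.
def cost_usd(input_tokens, output_tokens, in_rate, out_rate):
    """Return the API cost in dollars for a single request."""
    return (input_tokens / 1e6) * in_rate + (output_tokens / 1e6) * out_rate

# An example request with 50k input tokens and 5k output tokens:
opus = cost_usd(50_000, 5_000, 15, 75)    # Opus 4: $0.750 + $0.375
sonnet = cost_usd(50_000, 5_000, 3, 15)   # Sonnet 4: $0.150 + $0.075
print(f"Opus 4:   ${opus:.3f}")   # $1.125
print(f"Sonnet 4: ${sonnet:.3f}") # $0.225
```

For this workload, Opus 4 costs five times as much as Sonnet 4, which is the trade-off behind Anthropic's "capability and practicality" framing.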
Beyond Coding
The Claude 4 models introduce several technical improvements that extend beyond coding assistance. In the launch video, Claude can be seen working across multiple services, such as Asana and Google Docs, to help users prioritise daily tasks, in addition to its writing strengths.
In a post on X, Dan Shipper, CEO at Every, highlighted, “Claude 4 Opus can do something no other AI model I’ve used can. It can actually judge whether the writing is good.”
Shipper wrote a blog post sharing his experience with Claude 4, from coding to writing to researching. “My verdict: Anthropic cooked with this one. In fact, it does some things that no model I’ve ever tried has been able to do, including OpenAI’s o3 and Google’s Gemini 2.5 Pro,” he stated.
Memory capabilities represent another significant advancement, particularly for Opus 4. When given access to local files, the model can create and maintain memory files to store key information. This feature enables better long-term task awareness and coherence in agent applications.
Anthropic demonstrated this capability by showing Opus 4 creating a ‘Navigation Guide’ while playing Pokémon, illustrating how the model can maintain context and build knowledge over time. This memory function could prove valuable for complex, multi-session projects that require continuity.
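The pattern behind this is simple: an agent with local file access writes key facts to a file and reloads them at the start of the next session. A minimal sketch of that pattern (the file name, JSON format, and example fact are illustrative assumptions, not Anthropic's actual implementation):

```python
# Illustrative file-based memory pattern: persist facts between sessions.
# File name and format are assumptions for illustration only.
import json
from pathlib import Path

MEMORY_FILE = Path("agent_memory.json")  # hypothetical location

def recall() -> list:
    """Load previously stored facts, or an empty list on first run."""
    if MEMORY_FILE.exists():
        return json.loads(MEMORY_FILE.read_text())
    return []

def remember(fact: str) -> None:
    """Append a fact to the persistent memory file."""
    facts = recall()
    facts.append(fact)
    MEMORY_FILE.write_text(json.dumps(facts, indent=2))

# Session 1 stores a navigation note; a later session's recall() returns it.
remember("Route to Pewter City: exit Viridian Forest heading north")
print(recall())
```

Because the memory lives on disk rather than in the context window, it survives across sessions and does not consume input tokens until it is read back in.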
The models are also 65% less likely to use shortcuts or loopholes than Sonnet 3.7.
Both models can now use multiple tools simultaneously rather than sequentially. This capability could significantly speed up complex workflows that require multiple data sources or tools.
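A toy simulation shows why parallel tool execution matters: two independent "tools" (simulated here with sleeps; the tool functions are made up for this sketch) finish in roughly the time of the slower one, rather than the sum of both.

```python
# Toy illustration of parallel vs. sequential tool calls: two independent
# tools, each simulated as a 0.2s operation, run concurrently.
import time
from concurrent.futures import ThreadPoolExecutor

def fetch_weather(city: str) -> str:   # stand-in for one tool call
    time.sleep(0.2)
    return f"weather({city})"

def fetch_stock(ticker: str) -> str:   # stand-in for a second tool call
    time.sleep(0.2)
    return f"stock({ticker})"

start = time.perf_counter()
with ThreadPoolExecutor() as pool:
    futures = [pool.submit(fetch_weather, "Tokyo"), pool.submit(fetch_stock, "ANTH")]
    results = [f.result() for f in futures]
parallel_time = time.perf_counter() - start

print(results, f"{parallel_time:.2f}s")  # roughly 0.2s instead of ~0.4s sequentially
```

The same logic applies to an agent gathering context from several data sources before answering: independent calls need not wait on each other.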
With Claude 4, Anthropic seems to be back in the game, even if it is not the winner in every real-world use case.