Digital Product Studio

Anthropic Releases New Claude 3.5 Sonnet AI Model That Can Communicate With Any Desktop App

Anthropic has released an upgraded Claude 3.5 Sonnet model with significantly improved coding skills and a brand new capability: computer use. With this feature, developers can direct Claude to interact with desktop applications the way people do. Let’s get into the details!

The Upgraded Claude 3.5 Sonnet

The upgraded Sonnet model improves on its predecessor across coding benchmarks. On SWE-bench Verified, an evaluation of software engineering skills, it raises the score from 33.4% to 49.0%, higher than all publicly available models. Similarly, on TAU-bench, which tests agentic tool use, the upgraded Claude 3.5 Sonnet improves its scores in both the retail and airline domains. Early customers have also noticed substantially stronger reasoning abilities with this upgraded model.


Performance Evaluation of New Claude 3.5 Sonnet

Early customer feedback highlights the leap taken by Claude 3.5 Sonnet. GitLab, which tested the model on DevSecOps tasks, observed up to 10% stronger reasoning across use cases with no added latency. Cognition, which uses the model for autonomous AI evaluations, reported substantial improvements in coding, planning and problem-solving compared to the prior version. The Browser Company noted that, while automating web-based workflows, the upgraded model outperformed every model it had tested before.

Teaching Claude Computer Usage

Additionally, Anthropic is teaching Claude general “computer skills” rather than building tool-specific automations. Through the new computer use API, developers can direct Claude to perceive and interact with computer interfaces the way people do: looking at the screen, moving a cursor, clicking buttons and typing text. Claude translates instructions into sequences of these actions across standard programs and software. On OSWorld, a benchmark that evaluates how well AI models use computers, Claude 3.5 Sonnet scored 14.9% in the screenshot-only category, ahead of the next-best AI system’s 7.8%, and 22.0% when allowed more steps to complete a task.
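To make the idea of turning instructions into clicking and typing concrete, here is a minimal Python sketch of the kind of loop a developer's harness might run: capture the screen, ask the model for the next action, perform it, and repeat. Everything in it is an assumption for illustration. `FakeDesktop`, `run_agent_loop`, `scripted_model` and the action dictionaries are hypothetical names, not Anthropic's API, and the "model" is a stub that replays a fixed script instead of calling Claude.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class FakeDesktop:
    """Hypothetical stand-in for a real screen; records actions instead of performing them."""
    cursor: tuple = (0, 0)
    typed: list = field(default_factory=list)
    log: list = field(default_factory=list)

    def screenshot(self) -> str:
        # A real harness would return image bytes; a text summary is enough here.
        return f"cursor at {self.cursor}, typed {len(self.typed)} string(s)"

    def perform(self, action: dict) -> None:
        kind = action["action"]
        if kind == "mouse_move":
            self.cursor = tuple(action["coordinate"])
        elif kind == "left_click":
            pass  # a click happens at the current cursor position
        elif kind == "type":
            self.typed.append(action["text"])
        else:
            raise ValueError(f"unsupported action: {kind}")
        self.log.append(kind)


def run_agent_loop(
    desktop: FakeDesktop,
    next_action: Callable[[str], Optional[dict]],
    max_steps: int = 10,
) -> int:
    """Show the model a screenshot, apply the action it returns, repeat until it stops."""
    steps = 0
    while steps < max_steps:
        action = next_action(desktop.screenshot())
        if action is None:  # the model signals that the task is finished
            break
        desktop.perform(action)
        steps += 1
    return steps


def scripted_model(script: list) -> Callable[[str], Optional[dict]]:
    """Hypothetical model stub: ignores the screenshot and replays a fixed action list."""
    remaining = iter(script)
    return lambda _observation: next(remaining, None)
```

For example, passing `scripted_model([{"action": "mouse_move", "coordinate": [120, 48]}, {"action": "left_click"}, {"action": "type", "text": "hello"}])` to `run_agent_loop` with a fresh `FakeDesktop` moves the cursor, clicks and types in three steps. The `max_steps` cap reflects the point made by the OSWorld results above: how many steps the model is allowed affects how often it completes a task.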

The New Claude 3.5 Haiku

Claude 3.5 Haiku is the next generation of Anthropic’s fastest model. At the same announced price point and a similar speed to Claude 3 Haiku, it surpasses even Claude 3 Opus, the largest model of the previous generation, on many intelligence benchmarks. Notably, Claude 3.5 Haiku scores 40.6% on SWE-bench Verified, outperforming many agents built on publicly available state-of-the-art models. With low latency, improved instruction following and more accurate tool use, Claude 3.5 Haiku is well suited to user-facing products, specialized sub-agent tasks and personalized experiences generated from large volumes of data.

Developing Computer Use Responsibly

While computer use enables powerful skills, Anthropic acknowledges its limitations and has put safeguards in place to promote responsible adoption. The company conducted in-depth pre-release reviews of the new Sonnet with safety experts, and the US AI Safety Institute and the UK AI Safety Institute jointly tested the model before deployment. Anthropic also evaluated the upgraded Claude 3.5 Sonnet for catastrophic risks and found that the ASL-2 Standard outlined in its ‘Responsible Scaling Policy’ remains appropriate for this model.

Availability and Accessibility

The upgraded Claude 3.5 Sonnet is available now, and developers can build with the computer use public beta through the Anthropic API, Amazon Bedrock and Google Cloud’s Vertex AI. According to Anthropic’s announcement, the new Claude 3.5 Haiku follows later in the month. Anthropic invites developers to explore the new models and capabilities and to share feedback so the technology can progress responsibly.

Faizan Ali Naqvi

