Anthropic's latest innovation, the "computer use" feature in its Claude 3.5 Sonnet AI model, is creating waves among developers. This feature, now in public beta, allows Claude to interact with a computer like a human: reading the screen, moving the cursor, typing, and clicking. Introduced through Anthropic's API, the capability aims to streamline mundane digital tasks, letting users automate software manipulation, searching, and organizing information without constant manual input.
The “computer use” capability stands out from similar efforts by major players like Microsoft, OpenAI, and Google. While those companies have demonstrated AI’s ability to analyze computer screens, Anthropic has pushed further by releasing a public beta that allows AI to control a computer. However, the company has set limitations, particularly steering Claude away from social media, election-related activities, and government sites, to mitigate risks like misinformation and misuse.
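For developers, computer use is exposed as a tool in Anthropic's Messages API. The sketch below shows roughly what an opt-in request looks like using the official Python SDK; the model name, beta flag, and tool parameters follow the public-beta documentation at launch and may change with newer SDK versions, and the display dimensions here are arbitrary examples.

```python
# Minimal sketch: enabling the computer-use beta tool via the Anthropic Python SDK.
# Assumes an API key in the ANTHROPIC_API_KEY environment variable; the screen
# size values below are placeholders for whatever display the agent will control.
import anthropic

client = anthropic.Anthropic()

response = client.beta.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    # Opt in to the computer-use public beta.
    betas=["computer-use-2024-10-22"],
    tools=[
        {
            # Built-in tool that lets Claude request screenshots, clicks, and keystrokes.
            "type": "computer_20241022",
            "name": "computer",
            "display_width_px": 1280,
            "display_height_px": 800,
            "display_number": 1,
        }
    ],
    messages=[{"role": "user", "content": "Open the spreadsheet and sort column B."}],
)

# Claude replies with tool_use blocks describing the actions it wants to take;
# your own code is responsible for actually carrying them out on the machine.
print(response.content)
```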

Technically, this version of Claude still faces challenges. It captures the screen as a series of screenshots instead of a continuous video feed, making certain dynamic actions difficult. Quick movements, notifications, or scrolling can throw it off track. Despite these constraints, Claude’s computer use offers a glimpse into the future of how AI might handle repetitive digital tasks, with Anthropic expecting rapid improvements as feedback rolls in.
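In practice, that screenshot-by-screenshot view means an integration runs a simple loop: Claude asks for a screenshot or an action via a tool_use block, the developer's own code carries it out and sends the result back, and the exchange repeats until the task is done. Below is a rough sketch of that loop, again assuming the Python SDK; the screen capture and input control are hypothetical stubs you would implement with your own OS automation.

```python
# Rough sketch of the screenshot-driven agent loop (continues the request above).
import base64


def take_screenshot() -> bytes:
    # Hypothetical helper: capture the screen as PNG bytes with your OS tooling.
    raise NotImplementedError


def perform_action(action_input: dict) -> str:
    # Hypothetical helper: execute a click, keystroke, or scroll and return a status string.
    raise NotImplementedError


def agent_loop(client, messages, tools):
    while True:
        response = client.beta.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=1024,
            betas=["computer-use-2024-10-22"],
            tools=tools,
            messages=messages,
        )
        messages.append({"role": "assistant", "content": response.content})

        tool_uses = [block for block in response.content if block.type == "tool_use"]
        if not tool_uses:
            return response  # Claude requested no further actions; the task is done.

        results = []
        for block in tool_uses:
            if block.input.get("action") == "screenshot":
                # Each "frame" Claude sees is a single screenshot, not a video stream.
                image_b64 = base64.b64encode(take_screenshot()).decode()
                content = [{
                    "type": "image",
                    "source": {"type": "base64", "media_type": "image/png", "data": image_b64},
                }]
            else:
                content = perform_action(block.input)  # e.g. click, type, key press
            results.append({"type": "tool_result", "tool_use_id": block.id, "content": content})

        # Feed the results back so Claude can decide its next step.
        messages.append({"role": "user", "content": results})
```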
The updated Claude 3.5 Sonnet also brings notable advancements in coding and tool usage, outperforming previous models and competitors on various benchmarks. This model maintains Anthropic’s focus on practical AI applications, aiming to deliver more efficient assistance without compromising on safety and ethical standards.