Anthropic is doubling down on a future where AI doesn’t just answer questions in a chat window, but actually drives your computer for you—clicking buttons, filling in forms, and hopping between apps like a tireless digital coworker. That’s the backdrop for its latest move: acquiring Vercept, a small but highly specialized AI startup built around one idea—teaching models to see and operate software the way humans do.
In Anthropic’s world, this all rolls up under one banner: “computer use.” Claude can already write and run code across full repositories, synthesize research from dozens of sources, and juggle workflows that touch multiple tools and teams. Computer use is what lets it do those things inside live apps—the browser you’re already using, the spreadsheet that already has your data, the PDF viewer open on your second monitor—rather than just reasoning in the abstract. Think of it as the jump from a clever chatbot to a competent digital operator that can actually get things done in your environment.
Vercept slots into that vision almost perfectly. The company describes itself as building “specialized models designed to operate across computer applications,” essentially AI coworkers that quietly handle tedious tasks without getting in your way. Its flagship macOS app, Vy, lets users control and automate tasks across their computer—things like data entry, organizing invoices, or even helping with content workflows. Underneath that product is the hard problem Anthropic cares about: perception (what’s on the screen, which element matters, what changed?) and interaction (where to click, what to type, how to recover if something unexpected pops up).
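To make the perception-and-interaction split concrete, here is a minimal sketch of the observe, decide, act loop that this kind of agent runs. It is an illustration only, not Vercept's or Anthropic's implementation: `FakeScreen`, `Element`, and `run_task` are hypothetical names, and the "screen" is a toy object rather than a real GUI.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """One on-screen control, as a perception step might report it (hypothetical)."""
    name: str
    kind: str  # "button", "field", or "dialog"
    value: str = ""


@dataclass
class FakeScreen:
    """Toy stand-in for a live app; not a real GUI or any vendor's API."""
    elements: dict = field(default_factory=dict)
    popup_pending: bool = True  # an unexpected dialog covers the form at first
    submitted: bool = False

    def observe(self):
        # Perception: report only what is currently visible.
        if self.popup_pending:
            return [Element("Dismiss", "dialog")]
        return list(self.elements.values())

    def click(self, name):
        if name == "Dismiss":
            self.popup_pending = False
        elif name == "Submit" and not self.popup_pending:
            self.submitted = True

    def type_into(self, name, text):
        if not self.popup_pending:
            self.elements[name].value = text


def run_task(screen, form_data, max_steps=10):
    """Observe, pick one action, act, and repeat until done or out of steps."""
    log = []
    for _ in range(max_steps):
        visible = screen.observe()
        # Recovery: an unexpected dialog takes priority over the plan.
        dialogs = [e for e in visible if e.kind == "dialog"]
        if dialogs:
            screen.click(dialogs[0].name)
            log.append(f"click {dialogs[0].name}")
            continue
        # Interaction: fill the first field that still differs from the goal.
        todo = [e for e in visible
                if e.kind == "field" and e.value != form_data.get(e.name, e.value)]
        if todo:
            screen.type_into(todo[0].name, form_data[todo[0].name])
            log.append(f"type {todo[0].name}")
            continue
        # Nothing left to fill, so submit the form.
        screen.click("Submit")
        log.append("click Submit")
        if screen.submitted:
            break
    return log


screen = FakeScreen(elements={
    "Vendor": Element("Vendor", "field"),
    "Amount": Element("Amount", "field"),
    "Submit": Element("Submit", "button"),
})
steps = run_task(screen, {"Vendor": "Acme", "Amount": "120.00"})
print(steps)
```

The point of the sketch is that the agent re-reads the screen before every action, so a surprise pop-up gets handled as just another observation instead of derailing a script that assumed the form was visible.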
Those might sound like details, but they’re exactly where most AI “agents” fall apart. It’s one thing to generate code or explain how to do something; it’s another to robustly navigate a finicky enterprise web app with nested menus, pop-up dialogs, and login flows. Vercept’s team, including co-founders Kiana Ehsani, Luca Weihs, and Ross Girshick, has spent years thinking about how AI systems should see and act inside the same software humans use every day. Put bluntly, Anthropic is buying that expertise. Vercept will wind down its external product in the coming weeks, and the whole team is moving over to focus on Claude’s computer use stack.
The timing is not an accident. Earlier this month, Anthropic rolled out Claude Sonnet 4.6, and quietly attached a pretty striking metric: on OSWorld—a widely used benchmark for AI computer use—Sonnet models jumped from under 15% in late 2024 to 72.5% today. OSWorld isn’t a toy benchmark; it drops agents into real operating systems (Ubuntu, Windows, macOS) and asks them to complete hundreds of practical tasks across apps like Chrome, LibreOffice, VS Code, email clients, and more. That includes doing things like navigating spreadsheets, completing web forms across multiple tabs, and orchestrating workflows that cross several applications.
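For a sense of what a headline number like 72.5% measures, a benchmark score of this kind is essentially a task completion rate. The sketch below assumes each task is scored pass/fail and uses invented outcomes, not OSWorld data; `TaskResult` and `success_rate` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    task_id: str
    apps: tuple   # applications the task touches
    passed: bool  # assumed pass/fail scoring


def success_rate(results):
    """Percentage of tasks passed; 0.0 for an empty list."""
    if not results:
        return 0.0
    return 100.0 * sum(r.passed for r in results) / len(results)


# Invented outcomes for illustration only.
results = [
    TaskResult("t1", ("Chrome",), True),
    TaskResult("t2", ("LibreOffice",), True),
    TaskResult("t3", ("Chrome", "LibreOffice"), False),
    TaskResult("t4", ("VS Code",), True),
]

overall = success_rate(results)
multi_app = success_rate([r for r in results if len(r.apps) > 1])
print(f"overall: {overall:.1f}%  multi-app: {multi_app:.1f}%")
```

Splitting out the multi-app subset is a reminder that a single aggregate score can hide weak spots, such as tasks that require hopping between applications.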
Anthropic says Sonnet 4.6 is now approaching human-level performance on tasks such as navigating complex spreadsheets and completing multi-tab web forms. Independent analyses of OSWorld paint it as a serious, messy benchmark: about a third of tasks require switching between apps, some need 20–50 interaction steps, and state-of-the-art systems have gone from sub-10% to nearly 70% completion in just a few years. That trendline makes this acquisition easier to read: Anthropic sees computer use as one of the next big battlegrounds for enterprise AI, and it wants to accelerate from “strong benchmark scores” to “production-grade agents” as fast as possible.
You can already see the commercial shape of this in Anthropic’s broader product lineup. Claude is now integrated into Chrome, Excel, PowerPoint, Slack, and a growing catalog of connectors and plugins, all designed to let the model sit closer to where work actually happens. The company is also positioning Claude as the engine behind agents in areas like code modernization, customer support, education, healthcare, and financial services—domains where navigating existing tools and interfaces is half the battle. Bolting Vercept’s perception-and-interaction expertise onto that stack is a way to make those agents less brittle and more trustworthy in real workflows.
There’s also a competitive undercurrent here. UiPath, for example, recently highlighted that its Screen Agent, powered in part by Claude Opus 4.5, hit a top ranking on the OSWorld-Verified benchmark—basically a proof point that agentic automation is moving from hype decks into actual products. At the same time, investors and enterprises are increasingly looking at OSWorld-style scores as proxies for “how good is your AI at actually doing stuff on a computer, not just chatting.” For Anthropic, owning more of that capability in-house—and bringing in a team that’s been living those problems—signals that it wants to set the pace in this niche rather than just participate.
Strategically, Vercept is also part of a pattern. Anthropic has been selectively acquiring teams whose work pushes Claude into more specialized, high-value territory. In 2025, it bought Bun, a high-performance JavaScript runtime and toolkit, as Claude Code crossed $1 billion in run-rate revenue. That acquisition was about giving Claude a deeper, faster engine for code execution and developer workflows. Vercept is the same playbook, but for GUI automation and cross-app interaction. In both cases, Anthropic isn’t just patching a gap; it’s absorbing teams that can reshape how Claude behaves in entire classes of tasks.
For Vercept’s existing users, the short-term story is more mixed. The company has said it will wind down its external product in the coming weeks, which usually means an eventual sunset for tools like Vy in their current form. In exchange, the bet is that Vercept’s ideas show up inside a much larger platform: Claude as a coworker that can log into your systems, follow your internal procedures, and handle complex, multi-step chores without constant supervision. If Anthropic executes well, enterprise customers won’t buy “Vercept” as a standalone app—they’ll feel it as Claude becoming a lot more competent at living inside their real workflows.
The bigger question is what this does to the emerging ecosystem of “AI agents for everything.” On one side, you have generalist models tuned to be agents; on the other, highly focused teams like Vercept that obsess over the gritty details of how an agent moves through real software. Anthropic is clearly betting that you need both: a strong base model plus deeply specialized perception and control systems that handle the ambiguity of real UIs, odd edge cases, and non-standard enterprise setups. That’s not a trivial engineering problem, and it’s likely to be one of the core differentiators between agents that look impressive in demos and agents that people actually trust with important work.
Zoomed out, the Vercept acquisition is another signal that AI’s next frontier is less about clever text and more about execution. The first wave of large language models showed they could write, summarize, and reason in natural language. The next wave, which companies like Anthropic are racing to define, is about systems that can actually operate in the same digital environments humans use—knowledge workers’ desktops, browsers, and line-of-business apps—and do so safely, reliably, and at scale. Vercept gives Anthropic more of the technical DNA it needs to make that future feel less like a demo and more like a default part of how people work with Claude every day.
