Claude Code now flags vulnerabilities as you type

Anthropic has quietly shipped a new “security guidance” plugin for Claude Code that acts like a built-in security reviewer, watching over your shoulder and flagging vulnerabilities as you write and edit code in real time. It is available to all Claude Code users through the plugin marketplace, and it is designed to warn on risky patterns before those changes ever hit your repo or production systems.

At a high level, this plugin is Anthropic’s attempt to answer an uncomfortable but increasingly obvious question: if AI agents are going to generate and refactor large chunks of our codebases, who is watching the watcher? Traditional static analysis tools were not built for an era where an assistant can spin up a whole microservice in a single session, or refactor your CI pipeline with one prompt. Anthropic has been leaning into AI-assisted security for a while – notably with its Claude Code Security capability that scans entire codebases and proposes patches – and this new plugin is the next logical step, bringing that posture into the tight loop of everyday edits.

Instead of being yet another command you have to remember to run, the security-guidance plugin wires itself into Claude Code as a “pre-tool hook.” In plain English, that means it automatically intercepts key operations – like Write, Edit, and MultiEdit – and scans the code that Claude is about to apply, before it actually changes your files. If it sees something sketchy, it surfaces a warning with concrete remediation advice, and only then lets the edit proceed. You don’t have to toggle it or remember a slash command; once installed, it is just part of the environment.
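
To make the hook idea concrete, here is a minimal Python sketch of what a pre-tool check could look like. It is not Anthropic's implementation: the event shape (tool_name, tool_input, content, new_string, edits) and the function names extract_new_code and review are assumptions made for illustration, and the single eval() check merely stands in for a real rule set.

```python
import json
from typing import Optional

# Tools the article names as being intercepted by the plugin.
GUARDED_TOOLS = {"Write", "Edit", "MultiEdit"}


def extract_new_code(event: dict) -> Optional[str]:
    """Return the text a guarded tool is about to write, or None.

    The field names used here are assumptions made for this sketch.
    """
    if event.get("tool_name") not in GUARDED_TOOLS:
        return None
    tool_input = event.get("tool_input", {})
    if "edits" in tool_input:  # MultiEdit-style payload: several edits at once
        return "\n".join(e.get("new_string", "") for e in tool_input["edits"])
    return tool_input.get("content") or tool_input.get("new_string") or ""


def review(event: dict) -> list[str]:
    """Scan a pending edit and return warnings; an empty list means no findings."""
    code = extract_new_code(event)
    if code is None:
        return []  # not a file-changing tool, nothing to review
    warnings = []
    if "eval(" in code:  # placeholder rule for illustration only
        warnings.append("eval() on untrusted input can lead to code execution.")
    return warnings


if __name__ == "__main__":
    sample = json.loads(
        '{"tool_name": "Write",'
        ' "tool_input": {"file_path": "app.js", "content": "eval(userInput)"}}'
    )
    for message in review(sample):
        print("WARNING:", message)
```

The point of the design is ordering: the check sees the proposed text before anything touches disk, so the warning arrives while the edit can still be reconsidered.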

Under the hood, the plugin is opinionated about what “sketchy” looks like. Anthropic says it currently targets eight major categories of common vulnerabilities, including some of the classics that have haunted web and backend developers for years. That list covers things like command injection in GitHub Actions workflows, unsafe uses of child_process.exec() in Node, and the usual suspects like eval() and new Function() that can open the door to remote code execution if they’re fed untrusted input. It also looks for front-end XSS vectors such as innerHTML and dangerouslySetInnerHTML, Python’s pickle deserialization risks, and OS command injection patterns via os.system() and related calls.
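
As a rough illustration of pattern-based detection for the constructs listed above, the following Python sketch pairs each one with a regular expression and a short piece of advice. The regexes, rule names, advice strings, and the find_risky_patterns helper are all assumptions for this example, not the plugin's actual rules, and simple regexes like these will produce both false positives and false negatives.

```python
import re

# Illustrative rules only: (name, pattern, advice). Not the plugin's real list.
RULES = [
    ("github-actions-injection",
     re.compile(r"\$\{\{\s*github\.event\."),
     "Pass event data through an environment variable instead of inlining it in run:."),
    ("child_process.exec",
     re.compile(r"child_process\.exec\s*\("),
     "Prefer execFile() with an argument array so no shell parses the input."),
    ("eval",
     re.compile(r"\beval\s*\("),
     "Avoid eval(); parse the data with a dedicated parser instead."),
    ("new Function",
     re.compile(r"\bnew\s+Function\s*\("),
     "Avoid building functions from strings."),
    ("innerHTML",
     re.compile(r"\.innerHTML\s*="),
     "Use textContent, or sanitize the HTML before assigning it."),
    ("dangerouslySetInnerHTML",
     re.compile(r"dangerouslySetInnerHTML"),
     "Sanitize the markup before handing it to React."),
    ("pickle",
     re.compile(r"\bpickle\.loads?\s*\("),
     "Never unpickle untrusted data; use a data-only format such as JSON."),
    ("os.system",
     re.compile(r"\bos\.system\s*\("),
     "Use subprocess.run() with a list of arguments and no shell."),
]


def find_risky_patterns(code: str) -> list[tuple[str, str]]:
    """Return (rule name, advice) for every rule that matches, in RULES order."""
    return [(name, advice) for name, pattern, advice in RULES if pattern.search(code)]


if __name__ == "__main__":
    snippet = "el.innerHTML = comment; os.system('convert ' + filename)"
    for name, advice in find_risky_patterns(snippet):
        print(f"{name}: {advice}")
```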

When the plugin triggers, it doesn’t just throw a vague “this might be unsafe” banner and move on. The idea is to behave more like a senior engineer doing a targeted code review in the moment, explaining why a pattern is risky and how to fix it. For example, if Claude is about to introduce a GitHub Actions step that pipes untrusted input into a shell command, the warning can point out the injection risk and suggest safer alternatives or quoting strategies. Anthropic also scopes these warnings to a session, so you see each warning once per session instead of being spammed with the same nag every time you touch similar code.
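
The once-per-session behavior can be pictured as a small piece of bookkeeping. The Python sketch below is one hypothetical way to do it: the SessionWarnings class, the should_show method, and the choice to key on session ID plus rule name are assumptions for illustration, and the real plugin may track warnings at a different granularity.

```python
class SessionWarnings:
    """Remember which warnings a session has already seen."""

    def __init__(self) -> None:
        # Each entry is (session_id, rule_name); both names are hypothetical.
        self._seen: set[tuple[str, str]] = set()

    def should_show(self, session_id: str, rule_name: str) -> bool:
        """Return True the first time a rule fires in a session, False afterwards."""
        key = (session_id, rule_name)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


if __name__ == "__main__":
    tracker = SessionWarnings()
    for rule in ["eval", "eval", "pickle"]:
        if tracker.should_show("session-1", rule):
            print(f"showing warning for {rule}")
```

Keying on the session means a fresh session starts with a clean slate, so the reminder comes back the next time you sit down to work rather than disappearing for good.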

One important nuance: this plugin isn’t scanning everything in your repo continuously, nor is it some magical “secure my app” button. That broader, repo-wide analysis is what Anthropic is already experimenting with in Claude Code Security, a separate capability that scans codebases, identifies vulnerabilities, and suggests patches for human review in a dashboard-style workflow. The new plugin is more surgical and more immediate. It specifically reviews changes that Claude itself is about to make during your interactive coding session and nudges the assistant to fix what it finds before those changes land. Think of it as guardrails for AI-driven edits rather than a big-bang security audit.

To actually get this working, you install it like any other Claude Code plugin, through the marketplace system that Anthropic introduced to let users extend Claude with packaged slash commands, subagents, hooks, and MCP servers. You add a marketplace – for Anthropic’s official catalog, that is something like anthropics/claude-code – and then browse or directly install the plugin from there. Once installed, it shows up in your plugin list and can be enabled or disabled like any other extension, but the key point is that the security behavior is automatic: no extra prompts, no new UX to learn.

The timing of this feature is interesting given the recent focus on AI security, both in terms of protecting AI systems and preventing AI from becoming a new source of vulnerabilities. Anthropic has already been positioning Claude Code Security as a frontier capability for defenders, making the case that AI-assisted analysis can catch complex, context-dependent flaws that pattern-based tools miss. That narrative has unsettled parts of the cybersecurity industry; when Claude Code Security was announced, some security stocks dropped sharply as investors digested the idea that AI coding assistants might start competing with, or at least augmenting, existing security solutions. A plugin that bakes basic security review into every AI-driven code change fits neatly into that storyline.

There is also a more introspective angle here: Claude Code itself has already had to learn hard lessons about its own attack surface. Earlier this year, a vulnerability tracked as CVE-2026-21852 showed how a flaw in the project-load flow could allow a malicious repository to exfiltrate sensitive data, including Anthropic API keys, by manipulating configuration values before the user confirmed whether the project was trusted. That issue was patched in version 2.0.65, and users on the standard auto-update channel were automatically protected, but it underlined a simple truth – the tools we use to secure code can themselves become targets. Building a security-guidance plugin that scrutinizes Claude’s own edits feels like part of a broader push to harden the entire ecosystem.

At the same time, the plugin lands in a marketplace ecosystem that has already drawn scrutiny from security researchers. As third-party marketplaces and plugins emerged for Claude Code, researchers showed proof-of-concept attacks where a malicious plugin could rewrite Claude’s permissions files or auto-approve dangerous commands, effectively bypassing human-in-the-loop safeguards and exfiltrating data. Some of those demonstrations involved hooks that fire on every prompt submission, quietly changing how Claude executes shell commands like curl without the user realizing it. In that light, an official security plugin from Anthropic is not just about catching bugs in app code; it is also about setting expectations for what “good” defensive plugins should look like in this new ecosystem.

From a developer-experience perspective, the plugin is trying to hit a delicate balance. No one wants a nagging assistant that flags every innerHTML assignment when you are deliberately working in a controlled context. But the reality is that many vulnerabilities are introduced not by exotic zero-days, but by very familiar patterns used in slightly careless ways – a CI step with a bit too much string concatenation, a file upload path that is not properly validated, a deserialization helper that quietly accepts tainted input. Catching those issues at the moment of creation, with a specific pointer to safer patterns, is arguably more powerful than a long PDF report generated at the end of a sprint that everyone is too tired to read.
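
For a sense of what a "pointer to safer patterns" can mean in practice, here is a small Python sketch contrasting a string-built shell command with an argument-list call. The function names build_shell_command and echo_without_shell are made up for this example, and the child Python process is only a portable stand-in for whatever external program a real script would run.

```python
import subprocess
import sys


def build_shell_command(user_text: str) -> str:
    """Risky pattern: splice user text into a string meant for a shell."""
    return "echo " + user_text  # "; rm -rf ~" would become a second command


def echo_without_shell(user_text: str) -> str:
    """Safer pattern: pass arguments as a list so no shell ever parses them."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", user_text],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")


if __name__ == "__main__":
    hostile = "hello; echo INJECTED"
    print(build_shell_command(hostile))  # a shell would treat this as two commands
    print(echo_without_shell(hostile))   # the text stays a single argument
```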

Zoomed out, this plugin is another data point in a trend that is rapidly redefining what “IDE assistance” means. GitHub Copilot, JetBrains AI Assistant, and other tools increasingly pitch themselves not just as autocomplete on steroids, but as co-pilots for architecture, testing, and refactoring. Anthropic’s move with Claude Code Security and now this real-time security-guidance plugin pushes that even further toward “secure-by-default co-development.” Instead of treating security as a separate phase or separate product, the assistant itself becomes a vehicle for secure coding practices.

For teams that are already experimenting with Claude Code in their terminals and editors, the upside of turning this on is obvious: you get an extra layer of review on every AI-generated change without needing to overhaul your tooling pipeline. For security engineers, it is another knob to turn in the ongoing effort to bake good hygiene into the daily flow of development, rather than enforcing it as a gate at the end. And for Anthropic, it is a way of signaling that if AI is going to write code, then AI also needs to help own the responsibility of making that code safer.