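Securing a Codebase with Autonomous Agents
Cursor's security team built a fleet of autonomous security agents to find and fix vulnerabilities across a rapidly changing codebase.
Key points
Adoption of autonomous security agents
Led by its security team, Cursor built a group of autonomous agents that automatically find and fix vulnerabilities in its codebase.
Keeping pace with a fast-changing codebase
The effort targets the security challenges of modern codebases that change quickly under high development velocity.
Proactive security
Rather than reacting after the fact, the agents continuously monitor and repair the code, aiming to stop security issues before they reach production.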
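Impact analysis
This effort is a practical application of AI agents and points to new possibilities for automation, particularly in DevSecOps. It is a useful concrete example of AI in a company's internal development process, although information on its technical novelty and implementation details is limited.
Editorial comment
This is a case study published on the company's own blog, and it is regrettably light on technical detail and benchmark data. Even so, it is a useful reference for the industry as a practical application of AI agents.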
Over the last nine months, our PR velocity has increased 5x. Security tooling based on static analysis or rigid code ownership remains helpful, but is not enough at this scale. We've adapted by using Cursor Automations, which has allowed us to quickly build a fleet of security agents that continuously identify and repair vulnerabilities in our codebase.
Today, we're releasing four new automation templates with the exact blueprints of the security agents we've found to be most helpful. Other security teams can customize these templates to build agents that automatically resolve a wide range of security issues.
The automations architecture
For agents to be useful for security, they need two features, both of which Cursor Automations provides.
The first is out-of-the-box integrations for receiving webhooks, responding to GitHub pull requests, and monitoring codebase changes. This allows agents that operate in the background to know when to step forward and take action.
The second is a rich agent harness and environment. Automations are powered by cloud agents, which gives them all the tools, skills, and observability that cloud agents have access to.
To make automations more powerful for security-specific use cases, we built a security MCP tool and deployed it as a serverless Lambda function, available just-in-time when needed, and not otherwise running.
The MCP, whose reference code is available here, serves three purposes:
Persistent data. The agent uses the MCP to store data, so we can track and measure security impact over time. We use that data to continually refine when and how we trigger automations.
Deduplication. We run multiple review agents on every change, and because their findings are generated by an LLM, different agents can end up using different words to describe the same underlying issue. To avoid duplicate work, the MCP allows the agent to deploy a classifier powered by Gemini 2.5 Flash that determines when two differently worded findings describe the same problem.
Consistent output. Agents report every vulnerability they find through the MCP, which sends consistently formatted Slack messages and handles further actions like dismissing or snoozing a finding.
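The three roles above can be illustrated with a small in-memory sketch. Everything in it is hypothetical: the `Finding` and `FindingStore` names, the message layout, and especially `naive_same_issue`, a keyword-overlap heuristic that merely stands in for the LLM classifier. It is not the interface of the reference MCP.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Finding:
    agent: str     # which review agent reported it
    file: str      # path of the affected file
    title: str     # free-text summary written by the LLM
    severity: str  # e.g. "high", "medium", "low"


def naive_same_issue(a: Finding, b: Finding) -> bool:
    """Stand-in for the LLM classifier: same file and mostly shared words."""
    if a.file != b.file:
        return False
    words_a = set(a.title.lower().split())
    words_b = set(b.title.lower().split())
    overlap = len(words_a & words_b) / max(1, min(len(words_a), len(words_b)))
    return overlap >= 0.5


class FindingStore:
    """Keeps findings so they can be counted later (in memory only here)."""

    def __init__(self, same_issue: Callable[[Finding, Finding], bool] = naive_same_issue):
        self.same_issue = same_issue
        self.findings: List[Finding] = []

    def report(self, finding: Finding) -> bool:
        """Store the finding unless it duplicates a known one; True if new."""
        for known in self.findings:
            if self.same_issue(known, finding):
                return False
        self.findings.append(finding)
        return True


def format_message(f: Finding) -> str:
    """Render every finding in one fixed layout, whatever agent produced it."""
    return f"[{f.severity.upper()}] {f.title} ({f.file}) reported by {f.agent}"


store = FindingStore()
first = Finding("agent-a", "api/auth.py", "Missing authorization check on admin route", "high")
second = Finding("agent-b", "api/auth.py", "Admin route lacks authorization check", "high")
is_new_first = store.report(first)    # stored
is_new_second = store.report(second)  # recognized as the same issue, not stored
```

Passing the comparison function into the store keeps the storage and formatting logic unchanged if the heuristic is swapped for a model-backed classifier.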
With this foundation in place, the four security automations detailed below layer on their own workflows and trigger logic. We use Terraform to ensure that all changes to security tooling go through a standard review and deployment process.
Agentic Security Review
Internally, we were already using Bugbot to review PRs for code quality and general issues, including some security findings. But a general-purpose review tool isn't ideal for security because it can't be prompt-tuned to our specific threat model, and because we needed the ability to block CI on security findings specifically, without blocking on every general code quality issue.
Given that, we built a dedicated automation we call Agentic Security Review. Initially we had it forward its findings to a private Slack channel monitored by our security team.
Figure: Agentic Security Review sends findings to a private Slack channel monitored by the security team.
Once we were confident it was identifying genuine issues, we turned on PR commenting, then implemented a blocking gate check. In the last two months, Agentic Security Review has run on thousands of PRs and prevented hundreds of issues from reaching production.
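The staged rollout described here (Slack only, then PR comments, then a blocking gate) can be modeled as a single setting that decides what a run is allowed to do. The sketch below is an illustration under assumed names (`RolloutStage`, `actions_for`); the post does not describe how the stages are actually configured.

```python
from enum import Enum
from typing import Dict, List


class RolloutStage(Enum):
    SLACK_ONLY = 1     # findings go to a private channel only
    PR_COMMENT = 2     # findings are also posted on the pull request
    BLOCKING_GATE = 3  # security findings additionally fail the check


def actions_for(stage: RolloutStage, security_findings: List[str]) -> Dict[str, bool]:
    """Decide what one review run does, given the current rollout stage."""
    has_findings = bool(security_findings)
    return {
        "notify_slack": has_findings,
        "comment_on_pr": has_findings and stage.value >= RolloutStage.PR_COMMENT.value,
        # Only the final stage may fail the check, and only on security findings.
        "check_passes": not (has_findings and stage is RolloutStage.BLOCKING_GATE),
    }


early = actions_for(RolloutStage.SLACK_ONLY, ["possible SSRF in fetch helper"])
late = actions_for(RolloutStage.BLOCKING_GATE, ["possible SSRF in fetch helper"])
clean = actions_for(RolloutStage.BLOCKING_GATE, [])
```

Because the gate only looks at security findings, general code-quality comments from another reviewer would never fail this check.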
Vuln Hunter
After the success of Agentic Security Review on new code, we pointed agents at the existing codebase. Vuln Hunter is an automation that divides the code into logical segments and searches each one for vulnerabilities. Our team triages findings and typically fixes them, often using @Cursor from Slack to generate PRs.
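The post does not say how the "logical segments" are chosen. One simple possibility, shown purely as an assumption, is to group files by top-level directory and split large groups so that each unit of work stays small; `segment_by_directory` and `max_files` are hypothetical names.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, List


def segment_by_directory(paths: List[str], max_files: int = 50) -> Dict[str, List[str]]:
    """Group repo-relative paths by top-level directory, splitting large groups."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for p in sorted(paths):
        parts = PurePosixPath(p).parts
        key = parts[0] if len(parts) > 1 else "(root)"
        groups[key].append(p)

    segments: Dict[str, List[str]] = {}
    for key, files in groups.items():
        for i in range(0, len(files), max_files):
            # Number the chunks only when a directory had to be split.
            name = key if len(files) <= max_files else f"{key}#{i // max_files + 1}"
            segments[name] = files[i:i + max_files]
    return segments


paths = ["api/auth.py", "api/users.py", "web/app.ts", "README.md"]
whole = segment_by_directory(paths)
split = segment_by_directory(paths, max_files=1)
```

Each resulting segment could then be handed to a separate agent run, which keeps the context for any single search bounded.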
Anybump
Dependency patching is so time-intensive that most security teams eventually give up and push it to engineering, where it sits in backlogs. We created an automation called Anybump that automates nearly all of it.
Anybump runs reachability analysis to narrow vulnerabilities to those that are actually impactful, then traces through the relevant code paths, runs tests, checks for breakage, and opens a PR once tests pass. After the PR is merged, Cursor's canary deployment pipeline provides a final safety gate before anything reaches production.
Figure: Anybump automatically opens PRs to patch vulnerable dependencies after tests pass.
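The reachability step can be pictured as a graph search: an advisory matters only if the vulnerable function can be reached from the application's entry points. The sketch below assumes a precomputed caller-to-callees map and made-up advisory IDs and package names; how Anybump actually builds or queries such a graph is not described in the post.

```python
from collections import deque
from typing import Dict, Iterable, List, Set


def reachable(call_graph: Dict[str, List[str]], entry_points: Iterable[str]) -> Set[str]:
    """Breadth-first search over a caller -> callees map."""
    seen = set(entry_points)
    queue = deque(seen)
    while queue:
        node = queue.popleft()
        for callee in call_graph.get(node, []):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


def impactful(advisories: Dict[str, str],
              call_graph: Dict[str, List[str]],
              entry_points: Iterable[str]) -> List[str]:
    """Keep only advisories whose vulnerable symbol is reachable."""
    live = reachable(call_graph, entry_points)
    return sorted(adv for adv, symbol in advisories.items() if symbol in live)


# Hypothetical data: advisory ID -> vulnerable function in a dependency.
advisories = {"ADV-1": "yamlish.load", "ADV-2": "xmlish.parse"}
call_graph = {
    "app.main": ["app.parse", "yamlish.load"],
    "app.parse": ["jsonish.loads"],
    "tools.migrate": ["xmlish.parse"],  # never called from the entry point
}
to_patch = impactful(advisories, call_graph, ["app.main"])
```

Only the advisories that survive this filter would proceed to the later steps of tracing code paths, running tests, and opening a PR.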
Invariant Sentinel
Invariant Sentinel runs daily to monitor for drift against a set of security and compliance properties. It divides the repo into logical segments and spins up subagents to validate code against a list of invariants.
After analysis, the agent compares current state against previous runs using the automations memory feature. If it detects drift, it revalidates to ensure correctness, then updates its memory and sends a Slack report to the security team with a description of the change and specific code locations as evidence.
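The comparison against previous runs amounts to diffing two snapshots of invariant results. A minimal sketch, assuming each snapshot maps an invariant name to a pass/fail flag (the names and the `detect_drift` helper are hypothetical, and the revalidation and memory-update steps are left out):

```python
from typing import Dict, List, Tuple


def detect_drift(previous: Dict[str, bool], current: Dict[str, bool]) -> List[Tuple[str, str]]:
    """Return (invariant, change) pairs for every result that differs between runs."""
    changes: List[Tuple[str, str]] = []
    for name in sorted(set(previous) | set(current)):
        before, after = previous.get(name), current.get(name)
        if before == after:
            continue
        if before is None:
            changes.append((name, "new invariant"))
        elif after is None:
            changes.append((name, "no longer checked"))
        elif after:
            changes.append((name, "restored"))
        else:
            changes.append((name, "violated"))
    return changes


previous = {"no-public-buckets": True, "tls-everywhere": True}
current = {"no-public-buckets": False, "tls-everywhere": True, "audit-logging": True}
drift = detect_drift(previous, current)
```

In a fuller version, each entry in `drift` would be re-checked before the stored snapshot is replaced and a report is sent.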
Because this automation runs in a full development environment, the agent can write and execute code to validate its own assumptions, complementing traditional functional, unit, and integration tests.
More automations to come
Security is full of opportunities to apply automations, and these four are just the beginning of the work we plan to do. We're already extending them to encompass vulnerability report intake, privacy compliance monitoring, on-call alert triage, and access provisioning.
In each case, agents give us coverage and consistency at a scale we couldn't achieve manually.