Visa Vulnerability Agentic Harness
Visa Vulnerability Agentic Harness (VVAH) is Visa’s open-source agentic SAST pipeline for autonomous vulnerability discovery with frontier AI models. It runs nine stages, from attack-surface mapping through adversarial verification to Markdown and SARIF reports. We assess it under Code Scanner and AI Agent because the design is credible and Apache-licensed, but Visa has not yet published precision or recall figures.
Blurb
VVAH is Visa’s open-source harness for autonomous vulnerability discovery using frontier AI models, built on learnings from Project Glasswing (Anthropic’s initiative for AI-assisted vulnerability research).
Summary
VVAH targets the triage bottleneck in AI-assisted vulnerability management, not raw discovery alone. Its primary effectiveness metric is Mean Time to Adapt (MTTA): time from AI-discovered weakness to a validated fix in production.
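As a rough illustration of the MTTA definition above, the metric can be computed as the mean interval between discovery and validated production fix. This is a minimal sketch, not VVAH code: the record fields `discovered_at` and `fixed_in_production_at` are hypothetical names chosen for the example, and findings without a fix are simply excluded.

```python
from datetime import datetime, timedelta


def mean_time_to_adapt(findings):
    """Average the discovery-to-fix interval over findings that have a validated fix.

    Field names are hypothetical; VVAH's own artifact schema may differ.
    Returns None when no finding has been fixed yet.
    """
    durations = [
        f["fixed_in_production_at"] - f["discovered_at"]
        for f in findings
        if f.get("fixed_in_production_at") is not None
    ]
    if not durations:
        return None
    return sum(durations, timedelta()) / len(durations)


findings = [
    {"discovered_at": datetime(2025, 1, 1, 9), "fixed_in_production_at": datetime(2025, 1, 3, 9)},
    {"discovered_at": datetime(2025, 1, 2, 9), "fixed_in_production_at": datetime(2025, 1, 6, 9)},
    {"discovered_at": datetime(2025, 1, 4, 9), "fixed_in_production_at": None},  # still open
]

print(mean_time_to_adapt(findings))  # 3 days, 0:00:00
```

Excluding open findings flatters the number when many findings remain unfixed, so a real dashboard would likely report the open count alongside the mean.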
When to try: you own or have explicit permission to scan a repo; you want multi-model agentic SAST with structured triage artifacts; you already emit or consume SARIF in DevSecOps pipelines.
When to skip: you need deterministic, rule-based gates only (prefer Conftest or traditional SAST); you cannot review LLM-generated triage candidates; you lack budget for token-heavy multi-stage runs.
Install: requires Python >= 3.10. Running `pip install .` or `pipx install .` from a checkout of the repo yields the `vvaharness` CLI. The default profile uses the Claude Code CLI (`claude login`); the Anthropic SDK and OpenAI backends are also supported.
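Key commands named in this entry: `vvaharness estimate` (run before tuning per-stage budgets) and `vvaharness setup --install-agents` (installs agent operating instructions).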
Details
Pipeline
Three phases, nine stages:
| Phase | Stages | Purpose |
|---|---|---|
| Discovery and Modeling | S1-S3 | Attack surface mapping, threat modeling, hunting plan |
| Deep Dive and Verification | S4-S6 | Multi-lens research, policy gates, adversarial verification |
| Synthesis, Chaining, and Reporting | S7-S9 | Deduplication, chain construction, SARIF emission |
Each stage is a composable skill (attack surface mapper, AppSec threat modeler, language/crypto/logic/access-control lenses, adversarial reviewer, exploit strategist). See `docs/architecture.md` and `docs/SKILLS.md` in the repo.
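The phase table above can be modeled as a small data structure, which is handy when tagging findings or timing data by phase. This is only a sketch of the table as written here: the `Phase` class and helper names are hypothetical, and the purposes are kept at phase level because this entry does not state which purpose belongs to which individual stage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    stages: tuple
    purpose: str


# Mirrors the three-phase, nine-stage table; not taken from VVAH source code.
PHASES = (
    Phase("Discovery and Modeling", ("S1", "S2", "S3"),
          "Attack surface mapping, threat modeling, hunting plan"),
    Phase("Deep Dive and Verification", ("S4", "S5", "S6"),
          "Multi-lens research, policy gates, adversarial verification"),
    Phase("Synthesis, Chaining, and Reporting", ("S7", "S8", "S9"),
          "Deduplication, chain construction, SARIF emission"),
)


def stage_order():
    """Return all stage identifiers in pipeline order."""
    return [stage for phase in PHASES for stage in phase.stages]


def phase_of(stage):
    """Look up the phase that contains a given stage identifier."""
    for phase in PHASES:
        if stage in phase.stages:
            return phase
    raise KeyError(stage)


print(stage_order())
print(phase_of("S5").name)
```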
Design choices
- Threat modeling before deep analysis to focus the attack surface.
- Multi-agent deterministic voting on SDK and OpenAI backends to reduce false positives.
- Structured triage artifacts to compress the path from AI finding to actionable report.
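To make the voting idea concrete, here is a minimal sketch of a fixed aggregation rule over independent reviewer verdicts. It assumes a simple strict-majority rule and the verdict labels `"confirm"` and `"reject"`, both of which are illustrative; this entry does not describe VVAH's actual voting rule or thresholds. "Deterministic" here refers only to the aggregation step: the same set of verdicts always yields the same decision, even though the underlying model outputs can vary between runs.

```python
from collections import Counter


def majority_vote(verdicts, threshold=0.5):
    """Keep a finding only if strictly more than `threshold` of verdicts are 'confirm'.

    An empty verdict list never confirms a finding.
    """
    if not verdicts:
        return False
    confirms = Counter(verdicts)["confirm"]
    return confirms / len(verdicts) > threshold


print(majority_vote(["confirm", "confirm", "reject"]))  # True
print(majority_vote(["confirm", "reject"]))             # False (a tie is not a majority)
```

Treating ties as rejections biases the rule toward fewer false positives at the cost of possibly dropping real issues, which is one reason a dropped-findings appendix is useful for review.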
Output
Per target, under `<target>/security-scan/`:
- `<module>_<ts>_report.md` with findings and a dropped-findings appendix
- `<module>_<ts>_report.sarif` (SARIF 2.1.0)
- `<module>_<ts>_errors.jsonl` for non-fatal errors
A run_manifest.json records tool version, model roles, config hash, target git SHA, and timing.
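Because the report is standard SARIF 2.1.0, it can be consumed with a plain JSON reader. The sketch below flattens results using only fields defined by the SARIF specification (`runs`, `results`, `ruleId`, `level`, `message.text`, and the first physical location). The inline sample document, including its rule ID and file path, is invented for illustration and is not real VVAH output.

```python
import json


def iter_sarif_results(sarif):
    """Yield one flat dict per SARIF result, using the first location if present."""
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            locations = result.get("locations") or [{}]
            physical = locations[0].get("physicalLocation", {})
            yield {
                "rule_id": result.get("ruleId"),
                "level": result.get("level"),
                "message": result.get("message", {}).get("text", ""),
                "uri": physical.get("artifactLocation", {}).get("uri"),
                "start_line": physical.get("region", {}).get("startLine"),
            }


# Hypothetical sample; rule ID, path, and message are made up for the example.
sample = json.loads("""
{
  "version": "2.1.0",
  "runs": [{
    "tool": {"driver": {"name": "example-scanner"}},
    "results": [{
      "ruleId": "EXAMPLE-001",
      "level": "error",
      "message": {"text": "Possible SQL injection"},
      "locations": [{
        "physicalLocation": {
          "artifactLocation": {"uri": "src/app/db.py"},
          "region": {"startLine": 42}
        }
      }]
    }]
  }]
}
""")

for row in iter_sarif_results(sample):
    print(f'{row["level"]}: {row["rule_id"]} at {row["uri"]}:{row["start_line"]} - {row["message"]}')
```

For a real report, replace the inline sample with `json.load` on the `.sarif` file from the scan output directory.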
Limitations
- Findings are LLM-generated triage candidates, not confirmed vulnerabilities. Human review is required.
- Runs are non-deterministic. Two scans may differ.
- Token use is high. Run `vvaharness estimate` and tune per-stage `max_budget_usd` knobs.
- The tool runs with elevated privilege. Scan only trusted repos with authorized operators.
Contrast
- Daybreak (assess): OpenAI-hosted cyber-defense platform with Codex Security harness. VVAH is self-hosted, multi-provider, and open source.
- Codacy (assess): multi-repo quality dashboards with traditional SAST aggregation. VVAH adds agentic reasoning and exploit-chain validation.
- Conftest (trial): deterministic Rego policy on IaC. Complementary, not a substitute for VVAH’s application-source analysis.
Agent integration
`vvaharness setup --install-agents` drops operating instructions for Claude Code, Copilot, Gemini CLI, and a cross-tool `AGENTS.md`, without overwriting existing files.