Visa Vulnerability Agentic Harness

No Change

assess

First Added: June 11, 2026. Updated: July 2, 2026.

Visa Vulnerability Agentic Harness (VVAH) is currently rated assess in the Tool category of the garden.

Blurb

Visa Vulnerability Agentic Harness. Contribute to visa/visa-vulnerability-agentic-harness development by creating an account on GitHub.

Summary

VVAH targets the triage bottleneck in AI-assisted vulnerability management, not raw discovery alone. Its primary effectiveness metric is Mean Time to Adapt (MTTA): time from AI-discovered weakness to a validated fix in production.
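As a rough illustration of the metric, MTTA can be read as the mean of the per-finding intervals between discovery and a validated fix in production. The sketch below assumes each remediated finding carries two timestamps; the function name, tuple layout, and sample data are hypothetical and are not taken from VVAH.

```python
from datetime import datetime, timedelta
from statistics import mean


def mean_time_to_adapt(findings: list[tuple[datetime, datetime]]) -> timedelta:
    """Mean interval from discovery to validated fix in production.

    Each tuple is (discovered_at, fix_validated_at); both names are
    hypothetical and chosen only for this illustration.
    """
    if not findings:
        raise ValueError("need at least one remediated finding")
    seconds = [(fixed - found).total_seconds() for found, fixed in findings]
    return timedelta(seconds=mean(seconds))


# Made-up sample data: two findings fixed after 2 and 4 days.
sample = [
    (datetime(2026, 6, 1), datetime(2026, 6, 3)),
    (datetime(2026, 6, 10), datetime(2026, 6, 14)),
]
mtta = mean_time_to_adapt(sample)
print(mtta)  # 3 days, 0:00:00
```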

When to try: you own or have explicit permission to scan a repo; you want multi-model agentic SAST with structured triage artifacts; you already emit or consume SARIF in DevSecOps pipelines.

When to skip: you need deterministic, rule-based gates only (prefer Conftest or traditional SAST); you cannot review LLM-generated triage candidates; you lack budget for token-heavy multi-stage runs.

Install: requires Python >= 3.10. Running pip install . or pipx install . from a checkout of the repo yields the vvaharness CLI. The default profile uses the Claude Code CLI (authenticate with claude login); the Anthropic SDK and OpenAI backends are also supported.

Key commands:

vvaharness doctor
vvaharness estimate --repo /path/to/target
vvaharness scan --repo /path/to/target --application-id 12345
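For scripted use, the three commands above could be chained from Python so that a failed health check or estimate stops the run before a token-heavy scan starts. This is a minimal sketch: it uses only the subcommands and flags listed above, and it assumes (not confirmed by the source) that the CLI exits non-zero on failure. The helper names build_commands and run_all are hypothetical.

```python
import shutil
import subprocess


def build_commands(repo: str, application_id: str) -> list[list[str]]:
    """Return the three CLI invocations listed above, in order."""
    return [
        ["vvaharness", "doctor"],
        ["vvaharness", "estimate", "--repo", repo],
        ["vvaharness", "scan", "--repo", repo, "--application-id", application_id],
    ]


def run_all(repo: str, application_id: str) -> None:
    """Run each step in order, stopping at the first non-zero exit code.

    Assumption: vvaharness signals failure through its exit status.
    """
    if shutil.which("vvaharness") is None:
        raise RuntimeError("vvaharness CLI not found on PATH")
    for cmd in build_commands(repo, application_id):
        # check=True raises CalledProcessError if the command fails.
        subprocess.run(cmd, check=True)
```

Calling run_all("/path/to/target", "12345") would then mirror the sequence shown above.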

Details

Pipeline

Three phases, nine stages:

Phase	Stages	Purpose
Discovery and Modeling	S1-S3	Attack surface mapping, threat modeling, hunting plan
Deep Dive and Verification	S4-S6	Multi-lens research, policy gates, adversarial verification
Synthesis, Chaining, and Reporting	S7-S9	Deduplication, chain construction, SARIF emission

Each stage is a composable skill (attack surface mapper, AppSec threat modeler, language/crypto/logic/access-control lenses, adversarial reviewer, exploit strategist). See docs/architecture.md and docs/SKILLS.md in the repo.
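The phase table above can be captured as a small lookup structure, for example when grouping per-stage logs or budgets by phase. The sketch below only restates the table; the Phase class and the phase_of helper are hypothetical names and do not reflect how VVAH represents its pipeline internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    stages: tuple[str, ...]
    purpose: str


# Contents copied from the phase table; the structure itself is illustrative.
PIPELINE = (
    Phase("Discovery and Modeling", ("S1", "S2", "S3"),
          "Attack surface mapping, threat modeling, hunting plan"),
    Phase("Deep Dive and Verification", ("S4", "S5", "S6"),
          "Multi-lens research, policy gates, adversarial verification"),
    Phase("Synthesis, Chaining, and Reporting", ("S7", "S8", "S9"),
          "Deduplication, chain construction, SARIF emission"),
)


def phase_of(stage: str) -> Phase:
    """Return the phase that contains the given stage label, e.g. 'S5'."""
    for phase in PIPELINE:
        if stage in phase.stages:
            return phase
    raise KeyError(stage)


print(phase_of("S5").name)  # Deep Dive and Verification
```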

Design choices

Threat modeling before deep analysis to focus the attack surface.
Multi-agent deterministic voting on SDK and OpenAI backends to reduce false positives.
Structured triage artifacts to compress the path from AI finding to actionable report.
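The source does not spell out the voting rule, so the following is only a generic sketch of how deterministic voting can cut false positives: several reviewers each return a keep/drop verdict, and a finding survives only with a strict majority. The function names, finding IDs, and the strict-majority threshold are all assumptions made for illustration.

```python
from collections import Counter


def majority_keeps(votes: list[bool]) -> bool:
    """True if strictly more than half of the votes confirm the finding."""
    if not votes:
        return False
    return Counter(votes)[True] * 2 > len(votes)


def filter_findings(
    votes_by_finding: dict[str, list[bool]],
) -> tuple[list[str], list[str]]:
    """Split finding IDs into (kept, dropped) by strict majority vote."""
    kept, dropped = [], []
    for finding_id, votes in votes_by_finding.items():
        (kept if majority_keeps(votes) else dropped).append(finding_id)
    return kept, dropped


# Hypothetical verdicts from three reviewers (two for F-3).
votes = {
    "F-1": [True, True, False],
    "F-2": [True, False, False],
    "F-3": [True, False],  # a tie is not a majority, so it is dropped
}
kept, dropped = filter_findings(votes)
print(kept)     # ['F-1']
print(dropped)  # ['F-2', 'F-3']
```

The rule is deterministic in the sense that the same set of votes always gives the same outcome, even when the votes themselves come from non-deterministic models.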

Output

Per target, under <target>/security-scan/:

<module>_<ts>_report.md with findings and a dropped-findings appendix
<module>_<ts>_report.sarif (SARIF 2.1.0)
<module>_<ts>_errors.jsonl for non-fatal errors

A run_manifest.json records tool version, model roles, config hash, target git SHA, and timing.
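Because the report is standard SARIF 2.1.0, it can be read with ordinary JSON tooling. The sketch below flattens results from every *_report.sarif file under <target>/security-scan/ into simple rows. It relies only on standard SARIF properties (runs, results, ruleId, level, message.text, locations); which of these VVAH actually populates is an assumption, so every lookup tolerates missing keys. The helper names are hypothetical.

```python
import json
from pathlib import Path


def summarize_sarif(sarif: dict) -> list[dict]:
    """Flatten SARIF results into rows of rule, level, message, file, line."""
    rows = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # Use the first location if present; SARIF allows none.
            locations = result.get("locations") or [{}]
            physical = locations[0].get("physicalLocation", {})
            rows.append({
                "rule": result.get("ruleId"),
                "level": result.get("level", "warning"),  # fallback if absent
                "message": result.get("message", {}).get("text", ""),
                "file": physical.get("artifactLocation", {}).get("uri"),
                "line": physical.get("region", {}).get("startLine"),
            })
    return rows


def load_reports(target: Path) -> list[dict]:
    """Collect rows from every SARIF report under <target>/security-scan/."""
    rows = []
    for path in sorted((target / "security-scan").glob("*_report.sarif")):
        rows.extend(summarize_sarif(json.loads(path.read_text(encoding="utf-8"))))
    return rows
```

For example, load_reports(Path("/path/to/target")) returns one row per finding, which can then be sorted by level or grouped by file for human review.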

Limitations

Findings are LLM-generated triage candidates, not confirmed vulnerabilities. Human review is required.
Runs are non-deterministic. Two scans may differ.
Token use is high. Run vvaharness estimate and tune per-stage max_budget_usd knobs.
The tool runs with elevated privilege. Scan only trusted repos, and let only authorized operators run it.

Contrast

Daybreak (assess): OpenAI-hosted cyber-defense platform with Codex Security harness. VVAH is self-hosted, multi-provider, and open source.
Codacy (assess): multi-repo quality dashboards with traditional SAST aggregation. VVAH adds agentic reasoning and exploit-chain validation.
Conftest (trial): deterministic Rego policy on IaC. Complementary, not a substitute for VVAH’s application-source analysis.

Agent integration

vvaharness setup --install-agents drops operating instructions for Claude Code, Copilot, and Gemini CLI, plus a cross-tool AGENTS.md, without overwriting existing files.