How to use the Auditor.

Drop a file, paste a snippet, or point at a repo. The auditor runs a rigorous 10-step method — three code passes, calibrated severity, concrete patches — and produces a structured security report. No exploits generated. Ever.

| Phase | Stage | What it covers |
|---|---|---|
| Scope | Establish context | Threat model, trust boundaries, deployment target |
| Scan | Three-pass review | Patterns → Auth/Authz → Context-specific |
| Fix | Reproduce & Patch | Concrete repro steps and code diffs |
| Ship | Report & Plan | Calibrated report + remediation plan |

Installation

Claude Code / Cowork

Clone into your skills folder. The skill is auto-detected on the next session.

    # Clone into your skills folder
    git clone https://github.com/alboechat/appsec-vulnerability-auditor.git

    # Or clone a stable tag
    git clone --branch v1.1.0 https://github.com/alboechat/appsec-vulnerability-auditor.git

OpenAI Codex CLI

Place in your Codex skills directory.

    cd ~/.codex/skills
    git clone https://github.com/alboechat/appsec-vulnerability-auditor.git

Cursor / Windsurf / Cline

Copy the contents of SKILL.md into your custom instructions or system prompt. Grant filesystem access to references/, examples/, and templates/.

Any AI coding agent

The skill is a standard Markdown prompt file. Any agent that accepts system instructions or custom prompts can use it — paste or reference SKILL.md.

No dependencies. The skill is a prompt file + reference documents. No build step, no API keys, no runtime.

Your first audit

The simplest invocation: paste code and ask.

    // Paste your code, then ask:
    "Review this for security vulnerabilities"

    // Or point at a file:
    "Audit src/api/routes.ts for security"

    // Or an entire repo:
    "Do a full security review of this codebase"

The auditor will ask one clarifying question to establish scope (threat model, trust boundaries), then run the full 10-step method. For a small input, the entire audit completes in a single run. For repositories, it uses staged mode with resumable state tracking.

Tip: Providing a threat model upfront skips the clarification step and improves severity accuracy. Tell the auditor if it's a public web app, internal tool, CLI, library, or LLM agent.

Trigger phrases

The skill activates automatically on any of these patterns — in English or Portuguese.

"audit this code"
"review for security"
"find security bugs"
"check for OWASP Top 10"
"is this safe to deploy?"
"look for vulnerabilities"
"review this AI-generated code"
"I had Claude write this — check it"
"auditar segurança" (Portuguese: "audit security")
"é seguro para o deploy?" (Portuguese: "is it safe to deploy?")

You can also paste any code snippet accompanied by phrases like "is this OK?" or "can you check this?" — the auditor will recognize the intent.

The 10-step method

Every audit follows the same sequence. No required step is skipped, even on small inputs; only Step 2.5 is optional, and it applies when seed findings are provided. The rigor is the value.

| Step | Name | What happens |
|---|---|---|
| 1 | Establish scope | Threat model, trust boundary, authorship, deployment target |
| 2 | Map attack surface | Entry points, trust transitions, sensitive sinks |
| 2.5 | Ingest seed findings (optional) | Normalize SARIF/Semgrep input into the ledger |
| 3 | Pattern scan | OWASP Top 10, CWE Top 25, AI-code pitfalls, secrets |
| 4 | Auth, authz & data flow | Protected routes, IDOR, sanitizer fit, state machines |
| 5 | Context-specific passes | Stack-matched checks: web, LLM agent, mobile, IaC |
| 6 | Reproduce or sketch | Safe verification steps for Critical/High findings |
| 7 | Patch | Code diff, why it closes the risk, regression tests |
| 8 | Report | Structured Markdown: scope, surface, findings, next steps |
| 9 | Calibrate severity | Confidence, assumptions, downgrade/upgrade conditions |
| 10 | Remediation plan | Sprint tasks, acceptance criteria, dependency graph |

STEP 1

Establish scope

Ask about threat model, trust boundaries, who wrote the code (human, AI, or both), deployment target, and whether secrets/PII/PHI are in scope. If already provided, skip to Step 2.

STEP 2

Map the attack surface

Build a mental model: entry points (HTTP routes, CLI args, file uploads, MCP tools), trust transitions (where untrusted input enters a trusted context), and sensitive sinks (auth, crypto, file I/O, subprocess).

STEP 2.5

Ingest seed findings (optional)

If the user provides pre-existing findings (Semgrep, SARIF), normalize them into the internal ledger format with provenance tags, then validate them during Pass 1.

STEP 3

Pass 1 — Pattern scan

Walk the code against OWASP Top 10, CWE Top 25, AI-generated code pitfalls, LLM agent security patterns, and secrets/config checks. Tag each finding with severity, confidence, CWE ID, and file:line.

STEP 4

Pass 2 — Auth, authz & data-flow

Hand-trace authentication (every protected route actually protected?), authorization (IDOR checks on every user-owned resource), data flow (untrusted input → sanitizer → sink), and state machines (can steps be skipped?).
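
The authorization part of this pass comes down to one question per handler: does the code compare the resource's owner with the caller before acting? The sketch below shows the kind of ownership check the auditor looks for. The `Note` type, the in-memory `NOTES` store, and the `delete_note` handler are hypothetical names chosen for illustration; they are not part of the skill.

```python
from dataclasses import dataclass


@dataclass
class Note:
    note_id: int
    owner_id: int
    body: str


# Hypothetical in-memory store standing in for a database table.
NOTES = {
    1: Note(note_id=1, owner_id=100, body="alice's note"),
    2: Note(note_id=2, owner_id=200, body="bob's note"),
}


class Forbidden(Exception):
    """Raised when the caller does not own the requested resource."""


def delete_note(current_user_id: int, note_id: int) -> None:
    """Delete a note only if the authenticated caller owns it."""
    note = NOTES.get(note_id)
    # Respond identically for "missing" and "not yours" so the handler
    # does not reveal which note IDs exist.
    if note is None or note.owner_id != current_user_id:
        raise Forbidden("note not found or not owned by caller")
    del NOTES[note_id]
```

A handler that authenticates the caller but omits the `owner_id` comparison is the typical IDOR shape: any logged-in user can delete any note by guessing its ID.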

STEP 5

Pass 3 — Context-specific

Choose checks based on what the code is (web frontend, backend, LLM agent, mobile, IaC, library). Load language-specific reference packs only for the stack in scope.

STEP 6

Reproduce or sketch

For every Critical/High finding: show a concrete reproduction (curl call, request body) or describe steps in prose. The user must be able to verify without receiving an exploit kit.

STEP 7

Patch

For every finding: a code diff in the same language and style, an explanation of why it closes the vulnerability, and regression tests. If it requires architectural change, say so.

STEP 8

Report

Structured Markdown report: scope, attack-surface map, calibration note, findings table, detailed writeups, negative findings, out-of-scope notes, and recommended next steps.

STEP 9

Calibrate severity

Every finding gets a severity level, confidence, assumptions, downgrade/upgrade conditions, and validation questions. No inflation, no deflation: the report's credibility rests on calibrated severity.

STEP 10

Offer remediation plan

After delivering the report, offer to generate a prioritized remediation plan. Users choose which findings to fix by tier (T1/T2/T3), preset (quick-wins, ship-blocker, full), or individual selection. Output: sprint-allocated .md plan with task checklists, fix steps, acceptance criteria, and dependency graph.

Vibe-coding patterns

Code written via AI assistants ("vibe-coded") fails in characteristic ways. The auditor checks these even when the rest of the audit is clean — they are the highest-yield findings and the key differentiator vs. a generic SAST tool.

Top 10 vibe-coding checks

| # | Pattern | Why it matters |
|---|---|---|
| 1 | Confidently wrong crypto | md5, sha1, ECB mode, hand-rolled JWT verification that looks idiomatic |
| 2 | Missing authz on PATCH/DELETE | AI scaffolds CRUD, authenticates the route, forgets resource ownership check → IDOR |
| 3 | Prompt-injectable tool use | Untrusted text passed as system/user role with tool-calling enabled = pre-auth RCE-equivalent |
| 4 | Hardcoded keys "for testing" | In commit history, .env.example, comments, frontend bundles, mobile resources |
| 5 | Server-trusts-client | Pricing on client, role from JWT claim client sets, quantity validated only in JS |
| 6 | String-concatenated SQL/HTML/shell | AI falls back to interpolation even when ORM/template engine was available |
| 7 | Overly broad CORS (*) | Combined with credentialed endpoints = full cross-origin access |
| 8 | eval / exec / pickle.loads | On user input; disproportionately common in AI-generated "flexible" code |
| 9 | SSRF in URL fetchers | fetch(user_url) with no allowlist; agent stacks expose metadata services |
| 10 | Secrets in LLM context window | Agent with env var / file / DB access can be coerced into echoing them = data exfil |
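
To make pattern 6 concrete, here is a minimal sketch of the flagged form next to the patched form, using Python's standard `sqlite3` module. The `users` table and both function names are illustrative assumptions, not code from the skill.

```python
import sqlite3


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Flagged: the input becomes part of the SQL text itself.
    query = f"SELECT id, name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # Patched: the driver binds the value; it is never parsed as SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


# Throwaway in-memory database for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

# Input containing a quote changes the meaning of the unsafe query only.
tricky_input = "x' OR '1'='1"
print(len(find_user_unsafe(conn, tricky_input)))  # every row comes back
print(len(find_user_safe(conn, tricky_input)))    # no row matches the literal string
```

The patched version also doubles as a regression test target: the same input must return zero rows after the fix.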

Report structure

Every audit report contains these sections, in order:

| Section | Purpose |
|---|---|
| 1. Scope & threat model | What was reviewed, what was assumed |
| 2. Attack-surface map | Entry points, trust transitions, sensitive sinks |
| 3. Calibration note | How to interpret severity, confidence, provisional ratings |
| 4. Findings table | Summary sorted by severity |
| 5. Detailed writeups | One per finding: description, reproduction, patch, calibration |
| 6. Negative findings | What was checked but found clean (builds trust) |
| 7. Out-of-scope notes | Anything seen but not reviewed (e.g., third-party deps) |
| 8. Recommended next steps | Priority-ordered actions |
| 9. Findings Ledger | JSON format for programmatic consumption (optional) |

Output formats

Default output is a Markdown report. You can request specific formats:

| Format | What you get |
|---|---|
| Markdown (default) | Structured .md report following the example template |
| SARIF | SARIF 2.1.0 JSON for GitHub Code Scanning upload |
| GitHub Issue | Markdown body ready to paste into a GitHub issue |
| JIRA Ticket | Formatted for JIRA issue creation |
| Executive Summary | Non-technical one-pager for leadership |
| Findings Ledger | Machine-readable JSON per finding-schema.json |

    // Request a specific format:
    "Audit this code and output as SARIF"
    "Review for security — I need a JIRA ticket per finding"
    "Security review with executive summary for my CTO"
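
For readers who have not seen SARIF, the sketch below shows roughly what the SARIF option amounts to: each finding becomes a `result` with a rule ID, a level, a message, and a physical location. The severity-to-level mapping and the tool name are assumptions made for this example, and the input is assumed to carry `file` in `path:line` form; the skill's actual SARIF output may be richer.

```python
# Assumed mapping from the auditor's severity names to SARIF result levels.
SARIF_LEVEL = {
    "Critical": "error",
    "High": "error",
    "Medium": "warning",
    "Low": "note",
    "Info": "note",
}


def finding_to_sarif_result(finding: dict) -> dict:
    """Convert one ledger-style finding into a SARIF 2.1.0 result object."""
    # "src/api/search.ts:42" -> ("src/api/search.ts", "42")
    path, _, line = finding["file"].rpartition(":")
    return {
        "ruleId": finding["cwe"],
        "level": SARIF_LEVEL[finding["severity"]],
        "message": {"text": finding["title"]},
        "locations": [{
            "physicalLocation": {
                "artifactLocation": {"uri": path},
                "region": {"startLine": int(line)},
            }
        }],
    }


def to_sarif(findings: list) -> dict:
    """Wrap converted findings in a minimal single-run SARIF log."""
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{
            # Tool name is a placeholder chosen for this sketch.
            "tool": {"driver": {"name": "appsec-vulnerability-auditor"}},
            "results": [finding_to_sarif_result(f) for f in findings],
        }],
    }
```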

Severity scale

The auditor uses calibrated severity with strict definitions. No inflation, no deflation.

| Severity | Definition | SLA |
|---|---|---|
| Critical | Pre-auth RCE, full DB exfil, total auth bypass, ransomable | 24 hours |
| High | Authenticated RCE, SQLi behind auth, IDOR exposing other users' data, secrets in repo | 1 week |
| Medium | Stored XSS (auth), missing rate limits, weak crypto where threat model tolerates it | 1 month |
| Low | Info disclosure (low sensitivity), missing security headers, verbose errors | Convenient |
| Info | Best-practice deviations, deprecated patterns, hardening opportunities | No SLA |

Every finding also includes confidence (High/Medium/Low), assumptions, downgrade conditions, upgrade conditions, and validation questions when context is incomplete. Findings without a threat model are marked (provisional).
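
If you track findings in a ticketing system, the SLA column translates directly into due dates. A small sketch, assuming "1 month" is approximated as 30 days and that "Convenient" and "No SLA" mean no fixed deadline; `fix_deadline` is a hypothetical helper, not something the skill ships.

```python
from datetime import datetime, timedelta
from typing import Optional

# SLA windows from the severity table; "1 month" is approximated as 30 days.
SLA_WINDOW = {
    "Critical": timedelta(hours=24),
    "High": timedelta(weeks=1),
    "Medium": timedelta(days=30),
    "Low": None,   # "Convenient": no fixed deadline
    "Info": None,  # "No SLA"
}


def fix_deadline(severity: str, reported_at: datetime) -> Optional[datetime]:
    """Return the latest fix date for a finding, or None if no deadline applies."""
    window = SLA_WINDOW[severity]
    return None if window is None else reported_at + window
```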

Findings Ledger

For automation and tracking, the auditor can output a JSON Findings Ledger conforming to references/finding-schema.json (JSON Schema Draft 2020-12).

    {
      "schema_version": "1.0",
      "audit_id": "notes-app-2026-05-28",
      "findings": [
        {
          "id": "FINDING-001",
          "title": "SQL Injection in search endpoint",
          "severity": "Critical",
          "confidence": "High",
          "cwe": "CWE-89",
          "file": "src/api/search.ts:42",
          "source": "manual"
        }
      ]
    }
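
Because the ledger is plain JSON, downstream tooling can consume it with a few lines of code. The sketch below parses the sample ledger and produces a per-severity count plus a list of Critical/High IDs, which could for example gate a release. The helper names `summarize` and `blocking_findings` are hypothetical, and only the fields shown in the sample are assumed to exist.

```python
import json
from collections import Counter

SEVERITY_ORDER = ["Critical", "High", "Medium", "Low", "Info"]

# The sample ledger from this section, inlined so the sketch is self-contained.
ledger_json = """
{
  "schema_version": "1.0",
  "audit_id": "notes-app-2026-05-28",
  "findings": [
    {"id": "FINDING-001", "title": "SQL Injection in search endpoint",
     "severity": "Critical", "confidence": "High", "cwe": "CWE-89",
     "file": "src/api/search.ts:42", "source": "manual"}
  ]
}
"""


def summarize(ledger: dict) -> dict:
    """Count findings per severity, most severe first, omitting empty levels."""
    counts = Counter(f["severity"] for f in ledger["findings"])
    return {sev: counts[sev] for sev in SEVERITY_ORDER if counts[sev]}


def blocking_findings(ledger: dict) -> list:
    """Return the IDs of Critical and High findings."""
    return [f["id"] for f in ledger["findings"]
            if f["severity"] in ("Critical", "High")]


ledger = json.loads(ledger_json)
print(summarize(ledger))
print(blocking_findings(ledger))
```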

CLI tools

The skill ships with automation scripts in the tools/ directory. These are not required for auditing — the skill works with just SKILL.md — but they streamline common workflows.

| Script | Purpose |
|---|---|
| appsec-audit.py | Python CLI wrapper for invoking audits programmatically |
| run-evals.sh | Run the full evaluation suite against benchmark cases |
| eval-compare.py | Compare two scorecard snapshots for regression detection |
| seed-with-semgrep.sh | Run Semgrep and format output as seed findings for the auditor |
| validate.sh | Validate skill structure, internal links, and eval manifest |
| check-skill-links.sh | Check all internal file references in SKILL.md |

Agent configs

Ready-made configuration files for specific platforms live in agents/.

| File | Platform | Notes |
|---|---|---|
| agents/openai.yaml | OpenAI Codex CLI | Drop-in agent config with skill path and instructions |

Community contributions welcome. If you create a config for Cursor, Windsurf, Cline or another platform, open a PR. The goal is zero-friction setup on every major AI coding tool.

Worked examples

The examples/ directory contains complete, production-quality output samples showing exactly what the auditor produces.

| File | Shows |
|---|---|
| example-audit-report.md | Full Markdown audit report with all 9 sections |
| findings-ledger.example.json | Structured JSON ledger with severity, CWE, assumptions |
| remediation-plan-example.md | Sprint-ready remediation plan from Step 10 |
| repository-audit-summary.md | Repo-scale audit summary with module-level findings |
| audit-state.example.json | Resumable audit state for large repositories |
| output-formats/sarif-example.json | SARIF 2.1.0 for GitHub Code Scanning upload |
| output-formats/github-issue.md | GitHub issue body per finding |
| output-formats/jira-ticket.md | JIRA-friendly ticket per finding |
| output-formats/executive-summary.md | CTO/leadership risk summary |

CI pipeline

The repository includes a GitHub Actions workflow (.github/workflows/ci.yml) that validates the skill on every push:

| Check | What it validates |
|---|---|
| Structure | All referenced files exist, SKILL.md frontmatter is valid |
| Internal links | Every path in SKILL.md resolves to an actual file |
| Eval manifest | All benchmark cases exist and expected findings sum correctly |
| Schema | finding-schema.json is valid JSON Schema |

What's in the box

A complete map of the repository structure.

    SKILL.md                          # Core audit prompt
    references/                       # 11 security reference packs
      owasp-top-10.md
      cwe-top-25.md
      ai-generated-code-pitfalls.md
      llm-agent-security.md
      secrets-and-config.md
      severity-calibration.md
      output-formats.md
      repository-audit-protocol.md
      evaluation-protocol.md
      remediation-plan-protocol.md
      seed-input-protocol.md
      finding-schema.json
      lang/                           # 9 language-specific packs
        typescript.md
        python.md
        go.md
        rust.md
        java.md
        ruby.md
        php.md
        csharp.md
        swift-kotlin.md
    examples/                         # 9 worked output samples
    templates/                        # Remediation plan template
    evals/                            # 5 benchmark cases + scorecard
    tools/                            # CLI wrapper, eval runner, validators
    agents/                           # Ready-made agent configs
    docs/                             # Architecture, glossary, threat model
      glossario-appsec-pt-br.md       # PT-BR AppSec glossary
      threat-model-template.md        # Structured intake template

Repository-scale audits

For full-repo audits, the skill uses staged mode with resumable state tracking. It partitions the codebase into scoped segments and audits each in sequence, maintaining an audit-state.json that tracks progress.

Resumable. If a session runs out of context, re-invoke the skill — it picks up where it left off from the persisted audit state.
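
The resume behaviour can be pictured as a small checkpoint loop: persist after every segment, and on restart continue from the first unfinished one. The sketch below illustrates the idea only. The state layout (`segments` with `name` and `status`) and the three helper functions are invented for this example; the real format is whatever audit-state.example.json in the repository defines.

```python
import json
from pathlib import Path
from typing import Optional


def load_state(path: Path, segments: list) -> dict:
    """Load persisted state, or start fresh with every segment pending."""
    if path.exists():
        return json.loads(path.read_text())
    # Hypothetical layout; the skill's real state file may differ.
    return {"segments": [{"name": s, "status": "pending"} for s in segments]}


def next_pending(state: dict) -> Optional[str]:
    """Return the first segment not yet audited, or None when all are done."""
    for seg in state["segments"]:
        if seg["status"] != "done":
            return seg["name"]
    return None


def mark_done(state: dict, name: str, path: Path) -> None:
    """Record a finished segment and persist immediately.

    Writing after every segment means an interrupted session loses
    at most the segment that was in progress.
    """
    for seg in state["segments"]:
        if seg["name"] == name:
            seg["status"] = "done"
    path.write_text(json.dumps(state, indent=2))
```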

At the end, a repository audit summary consolidates all segment findings into a single prioritized report with cross-cutting themes and architectural recommendations.

Semgrep seed input

You can bootstrap the audit with pre-existing findings from static analysis tools. The skill accepts SARIF, Semgrep JSON, or custom JSON conforming to finding-schema.json.

    # Run Semgrep and feed results to the auditor
    semgrep --config auto --json -o semgrep-results.json src/

    # Or use the included wrapper
    ./tools/seed-with-semgrep.sh src/

Every seed finding is tagged with its provenance (e.g., source: semgrep) so the report distinguishes tool-found from manually-identified vulnerabilities. The auditor validates seed findings during Pass 1 to filter false positives.
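
As a rough picture of what normalization involves, the sketch below maps Semgrep's JSON results onto the ledger fields used elsewhere in this guide and tags each entry with its provenance. Several details are assumptions made for the example: the `SEED-NNN` ID scheme, the severity mapping, and setting confidence to Low until the finding has been reviewed. The skill's own seed-input protocol may define these differently.

```python
def normalize_semgrep(report: dict) -> list:
    """Turn Semgrep JSON results into ledger-style seed findings."""
    # Assumed severity mapping; the skill's real mapping may differ.
    severity_map = {"ERROR": "High", "WARNING": "Medium", "INFO": "Low"}
    findings = []
    for i, result in enumerate(report.get("results", []), start=1):
        extra = result.get("extra", {})
        cwe = extra.get("metadata", {}).get("cwe", [])
        if isinstance(cwe, str):  # some rules store a single string
            cwe = [cwe]
        findings.append({
            "id": f"SEED-{i:03d}",  # hypothetical ID scheme
            "title": extra.get("message", result["check_id"]),
            "severity": severity_map.get(extra.get("severity"), "Info"),
            "confidence": "Low",  # unvalidated until Pass 1 reviews it
            # Semgrep often writes "CWE-89: description"; keep the ID part.
            "cwe": cwe[0].split(":")[0] if cwe else None,
            "file": f'{result["path"]}:{result["start"]["line"]}',
            "source": "semgrep",  # provenance tag
        })
    return findings
```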

Language coverage

The skill loads language-specific reference packs only for the stack in scope — it does not load every pack by default.

| Language / Framework | Reference pack |
|---|---|
| TypeScript / JavaScript / Node / React / Next.js | references/lang/typescript.md |
| Python / Django / Flask / FastAPI | references/lang/python.md |
| Go / Gin / Echo / chi / gRPC | references/lang/go.md |
| Rust / Axum / Actix Web / Rocket | references/lang/rust.md |
| Java / Kotlin JVM / Spring / Jakarta | references/lang/java.md |
| Ruby / Rails / Sinatra | references/lang/ruby.md |
| PHP / Laravel / Symfony / WordPress | references/lang/php.md |
| C# / .NET (ASP.NET Core, Blazor, EF) | references/lang/csharp.md |
| Swift / Kotlin (iOS / Android) | references/lang/swift-kotlin.md |

Defensive posture

The skill is defensive only. It identifies vulnerabilities and writes patches. It does not:

| Refused | Offered instead |
|---|---|
| Write working exploits or weaponized PoCs | Detection rules and hardened test cases |
| Fingerprint third-party systems | Guidance on what to check in your own infra |
| Bypass authentication or licensing | Recommendations for proper auth hardening |
| Produce offensive instructions | The equivalent defensive version |

When asked for offensive output, the auditor declines and offers the equivalent defensive version — a detection rule, a fix, a hardened test case.