← Back to ThoughtProof

From CVEs to Custom Rules: The Compounding Security Pipeline

Audit Pipeline Semgrep March 3, 2026 · ThoughtProof Research

We spent a month auditing AI agent frameworks. 20+ findings, multiple confirmed CVEs, CVSS scores from 7.5 to 9.3.

Then we did something obvious in retrospect: we took the code patterns from those confirmed findings and turned them into automated detection rules.

The rules immediately found new vulnerabilities in repos we'd already audited.

8
Custom Rules
594→16
After FP Reduction
3
New Findings Day 1

The Patterns Nobody Is Scanning For

Standard Semgrep rulesets don't have an "AI agent security" category. The closest you get is generic injection rules that produce hundreds of false positives on agent code. Here's what we built:

Tool Output → Prompt Injection — The #1 pattern. Tool results flow into LLM context without sanitization. A malicious tool response can override the agent's system prompt. Found in multiple major frameworks.

Unguarded Code Executionsubprocess.Popen(command, shell=True) with zero confirmation between LLM output and system execution. The agent decides to run code, and the framework just... runs it.

Shell Injection via String Concatenation — f-strings concatenated into shell=True subprocess calls. Classic vulnerability, surprisingly common in ML tooling where "just make it work" culture prevails.

Memory Poisoning — Conversation memory that stores everything without content filtering. One injected message persists and influences all future interactions.

API Key Serializationself.__dict__json.dump(). Share your saved model program, leak your credentials.

The Pipeline

PHASE 1 — STATIC ANALYSIS

Custom Semgrep rules scan the codebase. Casts a wide net — optimized for recall, not precision. Finds candidates.

PHASE 2 — LLM TRIAGE

Each candidate is evaluated against a 3-dimension framework: Input Provenance × Architectural Context × Threat Model. Cuts ~90% of false positives. DSPy-optimized prompts achieve 100% accuracy on our labeled dataset.

PHASE 3 — DEEP VERIFICATION

Only triaged findings reach the deep verifier. It reads actual code paths, builds proof-of-concept exploits, assigns CVSS scores, and produces disclosure-ready reports.

The Feedback Loop

Here's what makes this compound:

1. Manual audit → find vulnerability
2. Confirm with vendor (CVE / MSRC / bounty)
3. Extract code pattern
4. Write Semgrep rule
5. Scan other targets → find new vulnerabilities
6. Go to 2

Every confirmed finding makes the next audit better. The rules we built from framework A's vulnerabilities found new issues in framework B. The patterns are structural, not implementation-specific.

Why This Matters

AI agent frameworks are being deployed into production faster than they're being secured. The attack surface is structural — tool output injection and unguarded execution appear in nearly every framework we've tested.

Static analysis catches the candidates. LLM triage filters the noise. Deep verification confirms the real ones. Each layer feeds back into the others.

CVEs are training data. Every confirmed finding is a pattern. Every pattern is a rule. Every rule finds the next finding. That's the loop.