
I Scanned 3 Major AI Agent Frameworks. Here Are the 332 Critical Vulnerabilities

I scanned three of the most popular AI agent frameworks with AgentGuard v0.6.1. The results were worse than I expected.

The Scan

| Framework  | Files | Findings | CRITICAL | HIGH | MEDIUM |
|------------|------:|---------:|---------:|-----:|-------:|
| LlamaIndex | 2,951 |    1,003 |      252 |  558 |    193 |
| AutoGen    |   549 |      229 |       80 |  113 |     36 |
| CrewAI     |    84 |      391 |        0 |    0 |    391 |

LlamaIndex (252 CRITICAL)

The most popular RAG framework: 252 critical findings. The scan also flagged 441 agent loop patterns, 178 data exfiltration paths and 141 trust boundary violations.

AutoGen (80 CRITICAL)

Microsoft's framework. Self-modification vectors, credential exposure in replay logs, an MCP host that trusts server prompts unsafely, and a Docker executor that mounts the host filesystem into the sandbox.

CrewAI (391 MEDIUM)

Data exfiltration patterns across 391 locations -- agent data flowing to external endpoints without constraints.

What This Means

These are frameworks with 30K+ stars and Fortune 500 production deployments, and the findings are in the code that ships today. Every finding has a clear fix -- input validation, Pydantic models, sandbox enforcement, log scrubbing. These are solved application security problems that have not yet been applied to AI agent code.
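Two of those fixes in working code, as an added example that is not AgentGuard's code and not from any of the scanned frameworks: log scrubbing for replay-log events (the credential exposure noted above for AutoGen) and an input-validation gate on tool calls that enforces an egress allowlist (the unconstrained external endpoints noted for CrewAI) and confines file access to a sandbox root. The tool names, the allowlisted host, the sandbox path and the sample log event are constructed for illustration, and the validation uses the standard library instead of Pydantic so it runs without dependencies.

```python
# Added example (not AgentGuard's code, nor LlamaIndex/AutoGen/CrewAI code):
# two of the "clear fixes" named above, applied to an agent runtime's tool-call
# path. Tool names, the egress allowlist, the sandbox root and the sample log
# event are all constructed for illustration.
import re
from pathlib import Path
from typing import Any, Tuple
from urllib.parse import urlparse

REDACTED = "[REDACTED]"

# ---- 1. Log scrubbing: strip credentials before a replay-log event is persisted ----
# Structural rule: dict keys that name a credential are redacted wholesale.
SENSITIVE_KEY_NAMES = {"api_key", "apikey", "token", "access_token", "secret",
                       "password", "authorization"}
# Textual rules: credentials embedded in free-text strings (tracebacks, notes, args).
KEY_VALUE_IN_TEXT = re.compile(
    r"(?i)\b(api[_-]?key|token|secret|password)\b(['\"]?\s*[=:]\s*['\"]?)([^\s'\",}]+)")
BEARER_IN_TEXT = re.compile(r"(?i)\bBearer\s+[A-Za-z0-9._\-]+")
PROVIDER_KEY_IN_TEXT = re.compile(r"\bsk-[A-Za-z0-9]{8,}\b")  # constructed key shape


def is_sensitive_key(key: str) -> bool:
    k = key.lower()
    return k in SENSITIVE_KEY_NAMES or k.endswith(("_secret", "_token", "_password"))


def scrub_text(text: str) -> str:
    text = KEY_VALUE_IN_TEXT.sub(lambda m: f"{m.group(1)}{m.group(2)}{REDACTED}", text)
    text = BEARER_IN_TEXT.sub(f"Bearer {REDACTED}", text)
    return PROVIDER_KEY_IN_TEXT.sub(REDACTED, text)


def scrub_event(event: Any) -> Any:
    """Return a copy of a replay-log event with credentials removed at any depth."""
    if isinstance(event, dict):
        return {k: (REDACTED if is_sensitive_key(str(k)) else scrub_event(v))
                for k, v in event.items()}
    if isinstance(event, list):
        return [scrub_event(v) for v in event]
    if isinstance(event, str):
        return scrub_text(event)
    return event  # ints, floats, None: nothing to scrub


# ---- 2. Input validation + sandbox enforcement on every tool call ----
ALLOWED_EGRESS_HOSTS = {"api.example.com"}            # constructed allowlist
SANDBOX_ROOT = Path("/tmp/agent-sandbox").resolve()   # constructed sandbox root
TOOL_SPECS = {"http_get": {"url": str}, "read_file": {"path": str}}  # constructed tools


def outbound_url_allowed(url: str) -> bool:
    """https only, exact host match, and no userinfo (user@host tricks)."""
    parts = urlparse(url)
    if parts.scheme != "https" or "@" in parts.netloc:
        return False
    return parts.hostname in ALLOWED_EGRESS_HOSTS


def path_inside_sandbox(path: str) -> bool:
    """Resolve symlinks and '..' first, then require the result to sit under the root."""
    resolved = Path(path).resolve()
    return resolved == SANDBOX_ROOT or SANDBOX_ROOT in resolved.parents


def validate_tool_call(call: dict) -> Tuple[bool, str]:
    """Gate a model-proposed tool call; (True, 'ok') only if every check passes."""
    name = call.get("tool")
    spec = TOOL_SPECS.get(name)
    if spec is None:
        return False, f"unknown tool {name!r}"
    args = call.get("args")
    if not isinstance(args, dict) or set(args) != set(spec):
        return False, f"{name}: expected exactly the arguments {sorted(spec)}"
    for key, typ in spec.items():
        if not isinstance(args[key], typ):
            return False, f"{name}.{key}: expected {typ.__name__}"
    if name == "http_get" and not outbound_url_allowed(args["url"]):
        return False, f"http_get: egress to {args['url']!r} is not allowlisted"
    if name == "read_file" and not path_inside_sandbox(args["path"]):
        return False, f"read_file: {args['path']!r} escapes the sandbox"
    return True, "ok"


if __name__ == "__main__":
    # Constructed replay-log event: a retried tool call that leaked two credentials.
    event = {"tool": "http_get",
             "args": {"url": "https://api.example.com/v1/search",
                      "headers": {"Authorization": "Bearer eyJ.sample.token"}},
             "usage": {"prompt_tokens": 12},
             "note": 'retry with api_key="sk-abcdefghijklmnop"'}
    print(scrub_event(event))
    print(validate_tool_call({"tool": "http_get", "args": {"url": "https://api.example.com/v1/search"}}))
    print(validate_tool_call({"tool": "read_file", "args": {"path": "/tmp/agent-sandbox/../../etc/passwd"}}))
```

The structural rule redacts by key name, so it catches values the regexes never saw the shape of; the textual rules cover credentials that leak inside strings. The egress check decides on the parsed hostname after rejecting userinfo, not on a string prefix, which is what lets `api.example.com.evil.example` and `user@api.example.com@evil.example` fail; the sandbox check resolves the path before comparing, so `..` and symlinks cannot walk out of the root.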

pip install dfx-agentguard

GitHub: https://github.com/dockfixlabs/agentguard

Benchmark: 36 samples, 100 percent detection, 0 false positives.
