DEV Community 1h ago

I Scanned 3 Major AI Agent Frameworks. Here Are the 332 Critical Vulnerabilities

I scanned three of the most popular AI agent frameworks with AgentGuard v0.6.1. The results were worse than I expected.

The Scan

Framework	Files	Findings	CRITICAL	HIGH	MEDIUM
LlamaIndex	2,951	1,003	252	558	193
AutoGen	549	229	80	113	36
CrewAI	84	391	0	0	391

LlamaIndex (252 CRITICAL)

The most popular RAG framework: 252 critical findings. 441 agent loop patterns, 178 data exfiltration paths, 141 trust boundary violations.

AutoGen (80 CRITICAL)

Microsoft. Self-modification vectors. Credential exposure in replay logs. MCP host trusts server prompts unsafely. Docker executor mounts host filesystem into sandbox.

CrewAI (391 MEDIUM)

Data exfiltration patterns across 391 locations -- agent data flowing to external endpoints without constraints.

What This Means

Frameworks with 30K+ stars, Fortune 500 production deployments. Findings in the code that ships today. Every finding has a clear fix -- input validation, Pydantic models, sandbox enforcement, log scrubbing. Solved application security problems not yet applied to AI agent code.

pip install dfx-agentguard

GitHub: https://github.com/dockfixlabs/agentguard

Benchmark: 36 samples, 100 percent detection, 0 FP

Read on DEV Community ↗ ← Back to News