From Regex to AST: Building Taint Tracking for AI Agent Code
The Regex Ceiling
Regex catches obvious patterns:
prompt = f"You are helpful. {user_input}"
A regex rule sees f"..." with {user_input} and flags it. Done.
But regex cannot track this:
query = request.json.get("query")
processed = query.strip().upper()
template = "Answer: {q}"
prompt = template.format(q=processed)
response = openai.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": prompt}]
)
The taint flows: request.json -> query -> processed -> template.format() -> prompt -> openai call. Four hops. Regex sees each line independently and cannot connect them.
AST to the Rescue
Python's ast module parses source code into a syntax tree. We can walk that tree and track how data flows.
Step 1: Identify Sources
A "source" is any expression that produces untrusted data:
SOURCE_PATTERNS = {
"user_input",
"user_msg",
"user_message",
"request",
"req",
"query",
"message",
"msg",
}
Plus attribute access patterns: request.args.get("q"), request.json["key"], input().
In AST terms, we check ast.Name nodes against the source set, and ast.Call nodes for request.args.get patterns.
Step 2: Track Propagation
When a source is assigned to a variable, that variable becomes tainted:
user_input = request.args.get("q") # user_input is now tainted
But taint also propagates through:
- Method calls:
processed = user_input.strip()--processedis still tainted - F-strings:
prompt = f"Hello {user_input}"--promptis tainted .format():prompt = template.format(q=query)--promptis tainted ifqueryis- String concatenation:
prompt = "Hello " + user_input--promptis tainted - List/dict construction:
messages = [{"role": "user", "content": user_input}]--messagesis tainted
The tracker walks assignments in order, maintaining a tainted_vars dict. When it sees x = tainted_expr, it adds x to the dict. When it sees x = safe_expr, it removes x.
Step 3: Identify Sinks
A "sink" is where tainted data reaches an LLM:
- Variable assignment:
prompt = <tainted>ormessages = [<tainted>] - Function call:
openai.chat.completions.create(messages=<tainted>)
When the tracker sees a tainted expression reaching a sink, it fires a finding.
Step 4: Sanitizers
Not all transformations preserve taint. Some explicitly make data safe:
safe = str(user_input)[:100] # truncated, cast to string
The tracker treats str(), int(), float(), len(), and explicit escape functions as sanitizers. When data passes through a sanitizer, the taint is removed.
What It Catches (That Regex Cannot)
# Multi-hop flow -- 4 variable assignments
user_input = request.args.get("message")
processed = user_input.strip()
prompt = f"You are helpful. {processed}"
# AgentGuard v0.5.0: DETECTED (2 findings: sink var + LLM call)
# Template .format() with named args
query = request.json.get("query")
template = "Answer: {q}"
prompt = template.format(q=query)
# AgentGuard v0.5.0: DETECTED
# Messages array with tainted content
user_msg = request.json.get("message")
messages = [
{"role": "system", "content": "You are helpful."},
{"role": "user", "content": user_msg}
]
# AgentGuard v0.5.0: DETECTED
What It Does Not Flag (Correctly)
# Sanitized input
user_input = request.args.get("q")
safe_input = str(user_input)[:100]
prompt = f"Query: {safe_input}"
# AgentGuard v0.5.0: NOT FLAGGED (sanitized)
# Hardcoded prompt
prompt = "What is the weather?"
response = openai.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": prompt}]
)
# AgentGuard v0.5.0: NOT FLAGGED (no taint source)
Limitations
This is v0.5.0 -- the first iteration. Known gaps:
- Python only. JavaScript/TypeScript AST support is on the roadmap.
- Intra-file only. Taint does not cross file boundaries (no interprocedural analysis yet).
- No control flow. If/else branches are not tracked separately.
- Conservative sanitizers.
str()is treated as a sanitizer, butstr(user_input)alone does not make input safe for all contexts.
The Architecture
Source code
|
v
ast.parse()
|
v
Walk tree
|
+--> Assign node?
| |
| +--> RHS tainted? --> Add LHS to tainted_vars
| +--> RHS safe? --> Remove LHS from tainted_vars
| +--> LHS is sink var? --> Fire finding
|
+--> Call node?
|
+--> Is LLM API call?
| |
| +--> Args tainted? --> Fire finding
|
+--> Is .format() on tainted var?
|
+--> Result is tainted
Try It
pip install --upgrade dfx-agentguard
agentguard src/ --format text
The taint tracking rule (ASI01-TAINT-TRACK) runs alongside the existing regex rules. Both layers work together: regex for speed, AST for precision.
AgentGuard is MIT-licensed. v0.5.0 includes 38 tests and a 32-sample benchmark with 100% detection rate.
Comments
No comments yet. Start the discussion.