DEV Community

From Regex to AST: Building Taint Tracking for AI Agent Code

The Regex Ceiling

Regex catches obvious patterns:

prompt = f"You are helpful. {user_input}"

A regex rule sees f"..." with {user_input} and flags it. Done.
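For illustration, a rule of this kind might look like the sketch below. The pattern and the `regex_flags` helper are assumptions made for this example, not AgentGuard's actual rule.

```python
import re

# Hypothetical rule: an f-string literal that interpolates a user-named variable.
FSTRING_USER_VAR = re.compile(
    r"""f["'][^"'\n]*\{\s*(user_input|user_msg|query)\b"""
)

def regex_flags(line: str) -> bool:
    """Return True if a single line matches the f-string injection pattern."""
    return FSTRING_USER_VAR.search(line) is not None

print(regex_flags('prompt = f"You are helpful. {user_input}"'))  # True
print(regex_flags('prompt = template.format(q=processed)'))      # False
```

The second call shows the blind spot: the rule only ever looks at one line of text, so a tainted value that arrives under a different name goes unnoticed.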

But regex cannot track this:

query = request.json.get("query")
processed = query.strip().upper()
template = "Answer: {q}"
prompt = template.format(q=processed)
response = openai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}]
)

The taint flows: request.json -> query -> processed -> template.format() -> prompt -> openai call. Four hops. Regex sees each line independently and cannot connect them.

AST to the Rescue

Python's ast module parses source code into a syntax tree. We can walk that tree and track how data flows.
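As a quick orientation, this is roughly what the tree looks like for one assignment from the example above. It uses only the standard library `ast` module; the variable names `assign` and `call` are just for this sketch.

```python
import ast

tree = ast.parse("prompt = template.format(q=processed)")

assign = tree.body[0]            # the single top-level statement
call = assign.value              # right-hand side of the assignment

print(type(assign).__name__)     # Assign
print(assign.targets[0].id)      # prompt
print(type(call).__name__)       # Call
print(call.func.value.id)        # template
print(call.func.attr)            # format
print(call.keywords[0].arg)      # q
print(call.keywords[0].value.id) # processed
```

Every name involved in the assignment is reachable as a node, which is what makes it possible to connect `processed` to `prompt` instead of treating the line as opaque text.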

Step 1: Identify Sources

A "source" is any expression that produces untrusted data:

SOURCE_PATTERNS = {
    "user_input",
    "user_msg",
    "user_message",
    "request",
    "req",
    "query",
    "message",
    "msg",
}

Plus attribute access patterns: request.args.get("q"), request.json["key"], input().

In AST terms, we check ast.Name nodes against the source set, and ast.Call nodes for request.args.get patterns.
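A minimal sketch of such a check, assuming a helper that follows an expression down to its leftmost name. The `root_name` and `is_source` functions are hypothetical and written for this example; the real rule may match sources differently.

```python
import ast

SOURCE_NAMES = {
    "user_input", "user_msg", "user_message",
    "request", "req", "query", "message", "msg",
}

def root_name(node):
    """Follow attribute, subscript, and call chains down to the leftmost name."""
    while isinstance(node, (ast.Attribute, ast.Subscript, ast.Call)):
        node = node.func if isinstance(node, ast.Call) else node.value
    return node.id if isinstance(node, ast.Name) else None

def is_source(node):
    """True for input() calls and for expressions rooted in a source name."""
    if (isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "input"):
        return True
    return root_name(node) in SOURCE_NAMES

def parse_expr(text):
    """Parse a single expression and return its AST node."""
    return ast.parse(text, mode="eval").body

print(is_source(parse_expr('request.args.get("q")')))  # True
print(is_source(parse_expr('request.json["key"]')))    # True
print(is_source(parse_expr('input()')))                # True
print(is_source(parse_expr('"hello".upper()')))        # False
```

Matching on names is a heuristic: a local variable that happens to be called `query` is treated as untrusted even if it holds a constant.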

Step 2: Track Propagation

When a source is assigned to a variable, that variable becomes tainted:

user_input = request.args.get("q")  # user_input is now tainted

But taint also propagates through:

  • Method calls: processed = user_input.strip() -- processed is still tainted
  • F-strings: prompt = f"Hello {user_input}" -- prompt is tainted
  • .format(): prompt = template.format(q=query) -- prompt is tainted if query is
  • String concatenation: prompt = "Hello " + user_input -- prompt is tainted
  • List/dict construction: messages = [{"role": "user", "content": user_input}] -- messages is tainted

The tracker walks assignments in order, maintaining a tainted_vars dict. When it sees x = tainted_expr, it adds x to the dict. When it sees x = safe_expr, it removes x.
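The propagation rules above can be approximated with a single check: an expression is tainted if any name inside it is a source or is already tainted. The sketch below takes that shortcut. The reduced `SOURCES` set, the `track` function, and the choice to store the line number as the dict value are all assumptions for this example, and it only visits top-level assignments.

```python
import ast

SOURCES = {"request", "req"}  # reduced source set, for this sketch only

def expr_is_tainted(expr, tainted):
    """True if any name inside the expression is a source or already tainted."""
    return any(
        isinstance(n, ast.Name) and (n.id in tainted or n.id in SOURCES)
        for n in ast.walk(expr)
    )

def track(source_code):
    """Walk top-level assignments in order; map tainted names to line numbers."""
    tainted = {}
    for stmt in ast.parse(source_code).body:
        if not isinstance(stmt, ast.Assign):
            continue
        for target in stmt.targets:
            if not isinstance(target, ast.Name):
                continue
            if expr_is_tainted(stmt.value, tainted):
                tainted[target.id] = stmt.lineno   # x = tainted_expr
            else:
                tainted.pop(target.id, None)       # x = safe_expr

    return tainted

SAMPLE = '''query = request.json.get("query")
processed = query.strip().upper()
template = "Answer: {q}"
prompt = template.format(q=processed)
'''

print(track(SAMPLE))  # {'query': 1, 'processed': 2, 'prompt': 4}
```

Because `ast.walk` descends into f-strings, keyword arguments, binary operators, and list or dict literals alike, one rule covers every propagation case in the list, at the cost of treating any mention of a tainted name as a flow.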

Step 3: Identify Sinks

A "sink" is where tainted data reaches an LLM:

  • Variable assignment: prompt = <tainted> or messages = [<tainted>]
  • Function call: openai.chat.completions.create(messages=<tainted>)

When the tracker sees a tainted expression reaching a sink, it fires a finding.
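Both sink types can be sketched as follows, given a set of names that are already known to be tainted. The `SINK_VARS` names, the suffix match on the call name, and the `find_sinks` function are assumptions for this example rather than the tool's actual matching logic.

```python
import ast

SINK_VARS = {"prompt", "messages"}           # assumed sink variable names
LLM_CALL_SUFFIX = "chat.completions.create"  # the call used in this article

def dotted_name(node):
    """Rebuild 'a.b.c' from nested ast.Attribute nodes, or return None."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if not isinstance(node, ast.Name):
        return None
    parts.append(node.id)
    return ".".join(reversed(parts))

def mentions_tainted(expr, tainted):
    """True if the expression references any tainted name."""
    return any(isinstance(n, ast.Name) and n.id in tainted for n in ast.walk(expr))

def find_sinks(source_code, tainted):
    """Return sorted (line, kind) findings for tainted data reaching a sink."""
    findings = []
    for node in ast.walk(ast.parse(source_code)):
        if isinstance(node, ast.Assign):
            hits_sink_var = any(
                isinstance(t, ast.Name) and t.id in SINK_VARS for t in node.targets
            )
            if hits_sink_var and mentions_tainted(node.value, tainted):
                findings.append((node.lineno, "sink-var"))
        elif isinstance(node, ast.Call):
            name = dotted_name(node.func) or ""
            if name.endswith(LLM_CALL_SUFFIX):
                values = list(node.args) + [kw.value for kw in node.keywords]
                if any(mentions_tainted(v, tainted) for v in values):
                    findings.append((node.lineno, "llm-call"))
    return sorted(findings)

CODE = (
    'prompt = f"Q: {user_input}"\n'
    'response = openai.chat.completions.create('
    'model="gpt-4", messages=[{"role": "user", "content": prompt}])\n'
)
print(find_sinks(CODE, {"user_input", "prompt"}))
# [(1, 'sink-var'), (2, 'llm-call')]
```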

Step 4: Sanitizers

Not all transformations preserve taint. Some explicitly make data safe:

safe = str(user_input)[:100]  # truncated, cast to string

The tracker treats str(), int(), float(), len(), and explicit escape functions as sanitizers. When data passes through a sanitizer, the taint is removed.
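One way to recognize a sanitized expression is to look through any slicing and check whether the underlying node is a call to a known sanitizer. This is a sketch under that assumption; the `escape` entry stands in for "explicit escape functions" and is a hypothetical name, as is the `is_sanitized` helper.

```python
import ast

# "escape" is a hypothetical placeholder for an explicit escape function.
SANITIZERS = {"str", "int", "float", "len", "escape"}

def strip_subscripts(node):
    """str(x)[:100] parses as Subscript(value=Call(...)); look through the slice."""
    while isinstance(node, ast.Subscript):
        node = node.value
    return node

def is_sanitized(expr):
    """True if the expression is a direct call to a known sanitizer."""
    node = strip_subscripts(expr)
    return (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id in SANITIZERS
    )

def parse_expr(text):
    """Parse a single expression and return its AST node."""
    return ast.parse(text, mode="eval").body

print(is_sanitized(parse_expr("str(user_input)[:100]")))  # True
print(is_sanitized(parse_expr("int(user_input)")))        # True
print(is_sanitized(parse_expr("user_input.strip()")))     # False
print(is_sanitized(parse_expr("user_input[:100]")))       # False
```

Note that this only decides whether the tracker stops following the value. It says nothing about whether the resulting string is actually safe to place in a prompt.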

What It Catches (That Regex Cannot)

# Multi-hop flow through chained variable assignments
user_input = request.args.get("message")
processed = user_input.strip()
prompt = f"You are helpful. {processed}"
# AgentGuard v0.5.0: DETECTED (2 findings: sink var + LLM call)

# Template .format() with named args
query = request.json.get("query")
template = "Answer: {q}"
prompt = template.format(q=query)
# AgentGuard v0.5.0: DETECTED

# Messages array with tainted content
user_msg = request.json.get("message")
messages = [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": user_msg}
]
# AgentGuard v0.5.0: DETECTED

What It Does Not Flag (Correctly)

# Sanitized input
user_input = request.args.get("q")
safe_input = str(user_input)[:100]
prompt = f"Query: {safe_input}"
# AgentGuard v0.5.0: NOT FLAGGED (sanitized)

# Hardcoded prompt
prompt = "What is the weather?"
response = openai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}]
)
# AgentGuard v0.5.0: NOT FLAGGED (no taint source)

Limitations

This is v0.5.0, the first iteration of the taint tracker. Known gaps:

  • Python only. JavaScript/TypeScript AST support is on the roadmap.
  • Intra-file only. Taint does not cross file boundaries (no cross-module analysis yet).
  • No control-flow sensitivity. Assignments are processed in source order, so if/else branches are not tracked separately.
  • Permissive sanitizers. str() is treated as a sanitizer, but str(user_input) alone does not make input safe for all contexts, so some real flows can go unreported.

The Architecture

Source code
    |
    v
ast.parse()
    |
    v
Walk tree
    |
    +--> Assign node?
    |       |
    |       +--> RHS tainted? --> Add LHS to tainted_vars
    |       +--> RHS safe? --> Remove LHS from tainted_vars
    |       +--> LHS is sink var? --> Fire finding
    |
    +--> Call node?
            |
            +--> Is LLM API call?
            |       |
            |       +--> Args tainted? --> Fire finding
            |
            +--> Is .format() on tainted var?
                    |
                    +--> Result is tainted

Try It

pip install --upgrade dfx-agentguard
agentguard src/ --format text

The taint tracking rule (ASI01-TAINT-TRACK) runs alongside the existing regex rules. Both layers work together: regex for speed, AST for precision.

AgentGuard is MIT-licensed. v0.5.0 includes 38 tests and a 32-sample benchmark with 100% detection rate.
