DEV Community 2h ago

Why evidence matters more than model memory in AI pentesting

An AI finding you cannot reproduce is a liability, not a result. Darkmoon attaches the exact commands and raw tool output to every finding so a human can peer review it.

The trust problem

Most AI security tools return a confidence score and a paragraph. In offensive security that is not enough. If you cannot show the command that proved a vulnerability, you cannot defend it in a report or a remediation meeting.

What an evidence trail actually contains

For every finding Darkmoon keeps the executed command, the raw output, and the reasoning that connected them. The finding is a reproducible artifact, not a claim you have to take on faith.
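
To make the three parts concrete, here is a minimal sketch of what one evidence record could look like. The `EvidenceRecord` class, its field names, and the SHA-256 digest are assumptions for illustration, not Darkmoon's actual schema, and the sample command and output are invented lab data.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class EvidenceRecord:
    """Hypothetical shape of one finding's evidence (not Darkmoon's real schema)."""
    title: str
    command: str      # the exact command that was executed
    raw_output: str   # the tool output, stored unmodified
    reasoning: str    # why the output supports the finding

    def output_digest(self) -> str:
        # Hash of the raw output so a reviewer can detect later edits.
        return hashlib.sha256(self.raw_output.encode("utf-8")).hexdigest()

    def to_json(self) -> str:
        # Serialize the record together with the digest of its output.
        payload = asdict(self)
        payload["output_sha256"] = self.output_digest()
        return json.dumps(payload, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Invented example values for a lab host; not real tool output.
    record = EvidenceRecord(
        title="Server version disclosed in response header",
        command="curl -s -I http://lab.local/",
        raw_output="HTTP/1.1 200 OK\nServer: ExampleHTTPd/1.0\n",
        reasoning="The Server header reveals the exact product and version.",
    )
    print(record.to_json())
```

Storing the output verbatim and hashing it is one way to let a reviewer confirm that what they read is what the tool actually printed.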

Why this beats a bigger model

A larger model reduces some errors but never removes them. The evidence trail is what lets a human catch the ones that remain, which is exactly why we made it the core of the design rather than an afterthought.

How it changes the workflow

Reviewers stop re-verifying everything by hand and start spot-checking the trail. Reports can be assembled from the recorded commands and raw output instead of paraphrased model output.
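
As a rough sketch of that workflow, the snippet below samples a few findings for manual review and formats a report entry straight from the stored command and output. The function names and the dictionary keys (`title`, `command`, `raw_output`) are hypothetical and do not reflect Darkmoon's real interface.

```python
import random


def pick_for_review(findings, k, seed=None):
    """Choose up to k findings for a manual spot check."""
    rng = random.Random(seed)  # a fixed seed makes the sample repeatable
    return rng.sample(findings, min(k, len(findings)))


def render_finding(finding):
    """Format one finding using the recorded command and output verbatim."""
    lines = [
        f"Finding: {finding['title']}",
        f"Command: {finding['command']}",
        "Output:",
    ]
    # Indent the raw output but do not summarize or rewrite it.
    lines += ["    " + line for line in finding["raw_output"].splitlines()]
    return "\n".join(lines)


if __name__ == "__main__":
    # Invented placeholder findings, for illustration only.
    findings = [
        {"title": "Example A", "command": "echo a", "raw_output": "a\n"},
        {"title": "Example B", "command": "echo b", "raw_output": "b\n"},
        {"title": "Example C", "command": "echo c", "raw_output": "c\n"},
    ]
    for item in pick_for_review(findings, k=2, seed=7):
        print(render_finding(item))
        print()
```

Because the report text is built from the recorded fields, a reviewer reads the same command and output the tool produced, not a model's summary of them.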

Try it

If you want to see the evidence trail on a live target, clone the Community Edition and point it at a lab.

Repo (GPLv3): https://github.com/ASCIT31/Dark-Moon
Docs: https://docs.dark-moon.org/
Demo:

Built by pentesters, open sourced for pentesters. Feedback on the methodology and the evidence trail is genuinely welcome.
