PG-011: The Cross-AI Adversarial Review Protocol

PG-011 May 17, 2026 Thomas W. Gantz

This guide expands practice #10 of PG-000: 10 Things Every AI User Should Do.

A practitioner guide for using a second AI system as a structured review pass on the first one’s output

Why this guide exists

One AI system reviewing its own output rarely catches its own systematic mistakes. The same model that produced the mistake is the model being asked to spot it. Self-review can clean up surface problems — clarity, organization, minor errors — but it tends to be blind to the kinds of failures that come from how the system reasons.

A different AI system, especially one from a different developer, is often better at catching these. It is not a perfect oracle. It makes its own mistakes. But it has a different set of priors, a different training history, and a different way of weighing evidence, so its blind spots overlap only partially with the first system’s.

Used well, a second AI review is one of the most effective lightweight review passes available to most practitioners. Used poorly, it is theater that gives false confidence. This guide explains how to get the first and avoid the second.

The core failure mode this addresses

When a single AI produces an output that contains systematic problems — a fabricated citation that follows a plausible pattern, a misrepresented source, an overconfident factual claim, an argument with a hidden weakness — the same AI re-reading its own work usually misses them. The problems are not surface-level slips; they are baked into how the response was generated.

A second AI looking at the same output as a fresh input is reading it without that generative history. It does not have a stake in the conclusion. It sees the words on the page, not the reasoning that produced them.

What a second AI is good and bad at catching

A second AI is typically better than the first one at catching fabricated citations, misrepresented sources, factual errors, internal contradictions, and weak or missing argument steps: the kinds of failures that come from how the first system generated its output. It is unreliable at catching deeper conceptual errors that both systems are likely to share, especially in contested or technical domains where the field itself is unsettled. This is a practitioner-derived heuristic, not an empirically established result.

This is the key practical point: a second AI review is a strong filter, not a verification. It catches a significant class of errors and misses a different class. Knowing which class it catches is what makes the protocol useful instead of misleading.

When you must use this procedure

Use this procedure whenever:

The output will be used in a real decision, public communication, or formal record
The output makes factual claims, attributes claims to sources, or relies on specific data
The output is long enough that a single read by you might miss a buried problem
You will repeat or stand behind the output once it leaves your hands

You do not need this for casual conversation, exploratory thinking, or first drafts that you will iterate on heavily yourself. The protocol is for finished work.
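The criteria above can be written down as a small checklist. The following is a minimal sketch, not part of the guide itself: the function and parameter names are hypothetical, and it assumes that any one of the four conditions is enough to trigger a review, which is one reasonable reading of "whenever".

```python
def needs_adversarial_review(
    *,
    is_finished_work: bool,
    feeds_decision_or_record: bool = False,
    makes_factual_claims: bool = False,
    too_long_for_one_read: bool = False,
    you_will_stand_behind_it: bool = False,
) -> bool:
    """Return True when the output should get a second-AI review pass.

    Assumption (not stated in the guide): a single matching
    condition is sufficient.
    """
    # Casual conversation, exploration, and early drafts are exempt.
    if not is_finished_work:
        return False
    return any(
        (
            feeds_decision_or_record,
            makes_factual_claims,
            too_long_for_one_read,
            you_will_stand_behind_it,
        )
    )


if __name__ == "__main__":
    # A finished report that cites sources qualifies for review.
    print(needs_adversarial_review(is_finished_work=True, makes_factual_claims=True))
```

Keyword-only arguments force each condition to be named at the call site, so the checklist stays readable when it is used.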

The protocol

Four steps.

Step 1 — Use a genuinely different system

The point of the second review is that the reviewer’s failure modes overlap less with the first system’s. The further apart the two systems, the more independent the review.

The options, from most to least independent:

An AI from a different developer (Claude reviewing GPT, Gemini reviewing Claude, Grok reviewing Gemini, and so on)
The same developer but a different model family or version, if a different developer is not available
The same model in a fresh session as a last resort

A fresh session of the same model catches some self-review blind spots but leaves the largest one in place: the systematic priors of that specific model. Use a cross-developer pairing when the stakes justify it.
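The ordering in Step 1 can be expressed as a simple ranking. This is an illustrative sketch under stated assumptions: the System record, the rank values, and the placeholder developer and model names are all hypothetical, and comparing names is only a rough proxy for how independent two systems really are.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    developer: str  # placeholder label, e.g. "dev_a"
    model: str      # placeholder label, e.g. "model_x"


def independence_rank(author: System, candidate: System) -> int:
    """Lower is more independent of the system that wrote the output."""
    if candidate.developer != author.developer:
        return 0  # different developer: preferred
    if candidate.model != author.model:
        return 1  # same developer, different model family or version
    return 2      # same model in a fresh session: last resort


def pick_reviewer(author: System, candidates: list[System]) -> System:
    """Choose the most independent reviewer that is available."""
    if not candidates:
        raise ValueError("no reviewer available")
    return min(candidates, key=lambda c: independence_rank(author, c))


if __name__ == "__main__":
    author = System("dev_a", "model_x")
    options = [System("dev_a", "model_x"), System("dev_b", "model_z")]
    print(pick_reviewer(author, options))
```

When only rank 2 is available, the review still runs, but the systematic priors of that model remain unchecked.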

Step 2 — Paste the output as input, with no preamble

The second AI should receive the first AI’s output as something it is reviewing — not as a continuation of a conversation, and not framed as “please confirm this is good.” The framing matters.

Recommended instruction: “The text below was produced by a different AI system. Review it adversarially. What is weak, missing, unsupported, internally contradictory, or potentially wrong? Be specific about each issue and point to the exact passage. If you are uncertain whether something is a real problem, say so explicitly. Do not summarize or rewrite the text. Do not soften your critique.”

Three things matter in this instruction:

"Produced by a different AI system." Tells the reviewer not to defer.
"Review it adversarially." Sets the frame. Default review tends to be supportive.
"Do not summarize or rewrite." Reviewers often drift into rewriting, which buries the critique and dilutes the signal.
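For practitioners who send the review request from a script, the framing in Step 2 can be fixed in one place so it is never softened by accident. This is a minimal sketch: the instruction text is the recommended wording above, while the function name and the BEGIN/END delimiters are assumptions added for illustration. It builds a string only and does not call any AI service.

```python
# The recommended instruction from Step 2, kept verbatim.
REVIEW_INSTRUCTION = (
    "The text below was produced by a different AI system. "
    "Review it adversarially. "
    "What is weak, missing, unsupported, internally contradictory, "
    "or potentially wrong? "
    "Be specific about each issue and point to the exact passage. "
    "If you are uncertain whether something is a real problem, "
    "say so explicitly. "
    "Do not summarize or rewrite the text. "
    "Do not soften your critique."
)


def build_review_prompt(output_text: str) -> str:
    """Wrap the first AI's output as material to be reviewed.

    The delimiters are a hypothetical convention; they mark the text
    as input under review rather than part of a conversation.
    """
    text = output_text.strip()
    if not text:
        raise ValueError("nothing to review")
    return (
        f"{REVIEW_INSTRUCTION}\n\n"
        f"--- BEGIN TEXT ---\n{text}\n--- END TEXT ---"
    )


if __name__ == "__main__":
    print(build_review_prompt("Draft produced by the first system."))
```

Send the result as the first message of a new session, so that no earlier conversation colors the review.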

Step 3 — Treat the second AI’s findings as leads, not verdicts

The second AI’s output is information to investigate, not a final ruling. Two failure modes are common at this step:

Overweighting the reviewer. If the reviewer flags a citation as suspect, that is a reason to check the citation — not proof that it is wrong. The reviewer can be confidently mistaken in exactly the same way the original system can.
Underweighting the reviewer. If the reviewer flags a problem and you do not see it, the temptation is to dismiss. Resist. The reviewer is reading without your investment in the original output. That is the entire reason it has any value.

The right move for each flag is the same: investigate the specific passage, decide for yourself, and make the change or not. The second AI is doing the work of pointing your attention at things worth examining. You are doing the work of deciding.
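One way to keep findings in the "lead" category is to log each flag and refuse to close it until you have recorded what you checked yourself. The sketch below is illustrative only: the Flag record, the status names, and the evidence field are hypothetical conventions, not part of the protocol.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    OPEN = "open"            # flagged by the reviewer, not yet checked
    CONFIRMED = "confirmed"  # you checked and the problem is real
    REJECTED = "rejected"    # you checked and the flag does not hold


@dataclass
class Flag:
    passage: str   # the exact passage the reviewer pointed to
    concern: str   # what the reviewer said is wrong with it
    status: Status = Status.OPEN
    evidence: str = ""  # what you personally verified

    def resolve(self, status: Status, evidence: str) -> None:
        """Close a flag only with a record of your own investigation."""
        if status is Status.OPEN:
            raise ValueError("resolve with CONFIRMED or REJECTED")
        if not evidence.strip():
            # Guards against accepting or dismissing on trust alone.
            raise ValueError("record what you checked before deciding")
        self.status = status
        self.evidence = evidence


def unresolved(flags: list[Flag]) -> list[Flag]:
    """Flags that still need investigation."""
    return [f for f in flags if f.status is Status.OPEN]


if __name__ == "__main__":
    flag = Flag("Citation 3", "statistic may not appear in the cited paper")
    flag.resolve(Status.CONFIRMED, "Read the paper; the statistic is absent.")
    print(flag.status.value, len(unresolved([flag])))
```

The required evidence field addresses both failure modes: a flag cannot be accepted because the reviewer sounded confident, and it cannot be dismissed because you did not see the problem at first glance.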

Step 4 — Stop after one good review

Running the same output through three or four different AIs does not produce three or four times the value. Each additional reviewer flags fewer new problems and contributes more noise. The marginal review past the first one is usually wasted effort.

If the first review surfaces enough issues that the output needs substantial rework, fix it and then run one more review on the revision. If the first review surfaces few or no issues, the output has cleared this filter and is usually good enough for its purpose, provided you still check specific claims against their sources. You are not looking for unanimous AI consensus. You are looking for the failures that the first system was structurally blind to.
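The stopping rule in Step 4 is small enough to state as a decision function. This is a sketch under assumptions: the function name and return labels are hypothetical, and whether rework counts as "substantial" remains your own judgment, passed in as a flag.

```python
def next_action(reviews_run: int, needs_substantial_rework: bool) -> str:
    """Decide what to do after counting the reviews run so far.

    Returns one of three hypothetical labels:
    "review", "revise_then_review_once_more", or "stop".
    """
    if reviews_run < 0:
        raise ValueError("reviews_run cannot be negative")
    if reviews_run == 0:
        return "review"  # finished work gets one pass by default
    if reviews_run == 1 and needs_substantial_rework:
        return "revise_then_review_once_more"
    # Few or no issues, or the follow-up review is already done.
    return "stop"


if __name__ == "__main__":
    print(next_action(reviews_run=1, needs_substantial_rework=True))
```

The function never returns a third review, which mirrors the guidance that extra reviewers add more noise than signal.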

What good adversarial review looks like

A second AI doing real work will produce comments like these:

"Citation 3 attributes a specific statistical claim to a paper that, based on the title and venue, almost certainly does not contain that statistic. Check the actual source."
"The argument in paragraph 4 depends on a premise stated only in paragraph 2 that the author has not defended. The conclusion is vulnerable if a reader rejects the premise."
"This claim and this claim cannot both be true as stated. One needs to be qualified."
"This section relies heavily on the word ‘significant’ without specifying a significance threshold or an effect size."

A second AI doing surface work, or simulating review without performing it, will produce:

"This is a well-structured and thoughtful piece. A few minor suggestions…"
Generic style notes that could apply to any document
Praise dressed as critique ("you could strengthen this by…" with no specifics)
Rewriting the text instead of reviewing it

If the review reads like the second group, repeat the request with a sharper instruction or move to a different system. A weak review is worse than no review because it produces false confidence.

Key rules

Use a genuinely different system when possible — cross-developer when stakes are real
Frame the input as something to review adversarially, not a finished piece to validate
Treat findings as leads to investigate, never as ground truth
Stop after one good review; piling on more reviewers does not improve the signal

What this procedure protects

Following this method protects against the systematic blind spots of any single AI system — especially around fabricated or misrepresented citations, factual errors, internal contradictions, and weak argument structure. It also makes one of the most useful skills in multi-system AI work explicit and repeatable, rather than an ad hoc thing some practitioners do well and most do not do at all.

What this procedure does not do

This method does not verify the output. The reviewer can be confidently wrong about the same things the original system was wrong about, especially in contested or technical domains where both systems share field-level priors. It does not catch deep conceptual errors that are not surface-detectable. And it is not a substitute for verifying specific claims against actual sources — for citations especially, the reviewer’s flag is a prompt to check the source, not a substitute for checking it.

When in doubt

If you are uncertain whether a piece of work needs adversarial review, the cost of running one is small. The cost of standing behind work that contains a problem the original AI was structurally blind to is much larger and lasts much longer. The default for finished work that will leave your hands is one review pass, not zero.

Core Practitioner Guides

Guides covering the foundational skills for working reliably with any AI system.

PG-000: 10 Things Every AI User Should Do
PG-001: How to Work Reliably With Conversational AI Over Time
PG-002: AI-Assisted Editing Without Silent Loss
PG-003: Verify Before You Work
PG-004: You Are Accepting the First Adequate Answer
PG-005: Your AI Updated the File. Did It Preserve What It Didn’t Touch?
PG-009: Make the AI Show You the Source
PG-010: Don’t Trust What the AI Says About Its Own Work
PG-011: The Cross-AI Adversarial Review Protocol (this guide)

Synthience Institute