How to Read a Zbigniew Assessment

This page explains how assessments are produced, validated, and tracked - so you can judge their quality yourself.


What This Is (and Isn’t)

Zbigniew Protocol is an open-source intelligence analysis methodology built on Carl Sagan’s Baloney Detection Kit (1995). It extends Sagan’s nine rules for detecting nonsense into a system for geopolitical pattern analysis.

What it does: Maps events across vectors, identifies beneficiaries, tracks predictions with deadlines, and states falsifiability criteria for every judgment.

What it doesn’t do: Predict the future. It recognizes patterns and makes testable claims with explicit uncertainty.

The full framework is open source.


Confidence Levels

Every claim in every assessment carries a confidence level. Here’s what they mean:

Level | Label | What It Requires | Language You'll See
5 | CONFIRMED | Primary source documentation exists (government doc, court filing, official transcript) | "is confirmed by…", "documents show…"
4 | HIGH | Multiple independent reliable sources agree | "strongly suggests…", "almost certainly…"
3 | MODERATE | Logical inference from confirmed facts | "likely…", "evidence indicates…"
2 | LOW | Single source, circumstantial, or contested | "possibly…", "some evidence suggests…"
1 | SPECULATIVE | Pattern-based hypothesis, thin evidence | "if true, would imply…", "conceivable…"

Rules I follow:


Source Hierarchy

Not all sources are equal. I use a five-tier system:

Tier | Type | Examples | Can Support Up To
1 | Primary | Government documents, official transcripts, court filings | Level 5 (CONFIRMED)
2 | Institutional | Think tanks, academic papers, peer-reviewed analysis | Level 4 (HIGH)
3 | Quality Journalism | Established outlets with direct quotes or documents | Level 3 (MODERATE)
4 | Specialized | Trade publications, Bellingcat, domain experts | Level 2 (LOW)
5 | Unverified | Social media, anonymous, single-source | Level 1 (SPECULATIVE) only

Key rule: A Tier 5 source can never support anything above a Level 1 (SPECULATIVE) claim, no matter how convincing it sounds. The source ceiling is enforced by the validation engine.

Every assessment includes a source diversity audit. I track: how many tiers are represented, how many languages, whether hostile sources (those who would benefit from the opposite conclusion) are included.
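The ceiling rule can be sketched as a lookup from source tier to the highest confidence level that tier can support. This is an illustrative Python sketch, not the project's actual validation engine (which is Prolog-based); the function names are assumptions:

```python
# Source tier -> maximum confidence level it can support (mirrors the table above).
SOURCE_CEILING = {1: 5, 2: 4, 3: 3, 4: 2, 5: 1}

def max_supported_level(source_tiers):
    """The best confidence level any of the cited sources can support."""
    return max(SOURCE_CEILING[t] for t in source_tiers)

def check_claim(claimed_level, source_tiers):
    """A claim passes only if at least one cited source's ceiling covers it."""
    return claimed_level <= max_supported_level(source_tiers)

# A Level 3 (MODERATE) claim backed only by Tier 5 sources fails the ceiling:
assert check_claim(3, [5]) is False
# The same claim backed by a Tier 3 outlet passes:
assert check_claim(3, [3, 5]) is True
```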


Seven Vectors

Events are mapped across seven analytical vectors:

Vector | What It Tracks
INSTITUTIONAL | Government capacity, civil service, rule of law
ALLIANCE | NATO, EU, bilateral treaties, trust between states
ECONOMIC | Trade, sanctions, currency, investment flows
INFORMATION | Media, propaganda, platform control, censorship
MILITARY | Posture, deployments, readiness, doctrine changes
POLITICAL | Domestic polarization, elections, democratic norms
SOCIAL | Civil unrest, migration patterns, public trust

When 5+ events across multiple vectors benefit the same actor, coincidence becomes improbable. That’s a pattern.
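That trigger can be sketched in a few lines, assuming each event records a beneficiary and a vector (the field names and thresholds here are illustrative assumptions, not the project's schema):

```python
from collections import defaultdict

VECTORS = {"INSTITUTIONAL", "ALLIANCE", "ECONOMIC", "INFORMATION",
           "MILITARY", "POLITICAL", "SOCIAL"}

def pattern_candidates(events, min_events=5, min_vectors=2):
    """Flag actors who benefit from 5+ events spanning multiple vectors."""
    by_actor = defaultdict(list)
    for e in events:
        assert e["vector"] in VECTORS
        by_actor[e["beneficiary"]].append(e["vector"])
    return {actor: vecs for actor, vecs in by_actor.items()
            if len(vecs) >= min_events and len(set(vecs)) >= min_vectors}

# Five events across four vectors benefiting one actor -> a pattern candidate:
events = [{"beneficiary": "ACTOR_X", "vector": v}
          for v in ("ECONOMIC", "INFORMATION", "POLITICAL", "MILITARY", "ECONOMIC")]
assert "ACTOR_X" in pattern_candidates(events)
```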


Cui Bono (Who Benefits?)

Every assessment asks: who benefits from this? Not who says they benefit, or who the narrative suggests benefits - who actually does, structurally?

The analysis maps four categories:

Then the Adversary Test: “If an adversary designed this policy to serve their interests, what would it look like? Does it look like this?”


Prediction Tracking

Predictions are analytical forecasts, not prophecies. Every prediction has:

Predictions are tracked in a public ledger. When a deadline passes, the prediction is resolved: confirmed, falsified, partially confirmed, or expired. I publish the results either way. Calibration analysis checks whether I’m over- or under-confident at each level.
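The ledger mechanics look roughly like this sketch; the record fields, helper names, and ISO-date deadlines are assumptions for illustration:

```python
from datetime import date

OUTCOMES = {"confirmed", "falsified", "partially confirmed", "expired"}

def resolve(prediction, today, outcome=None):
    """Before the deadline a prediction stays open; after it, it must get an outcome."""
    if today < date.fromisoformat(prediction["deadline"]):
        return "open"
    assert outcome in OUTCOMES, "results are published either way"
    return outcome

def calibration(resolved):
    """Per confidence level, the share of resolved predictions that were confirmed."""
    out = {}
    for level in sorted({p["confidence"] for p in resolved}):
        batch = [p for p in resolved if p["confidence"] == level]
        out[level] = sum(p["outcome"] == "confirmed" for p in batch) / len(batch)
    return out
```

If Level 4 (HIGH) predictions resolve as confirmed only half the time, the analyst is overconfident at that level; the calibration table makes that visible.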

Prediction Audits

Predictions are audited at regular intervals (30, 60, 90 days) against verified current events. Each audit grades every prediction as: CONFIRMED, ON TRACK, PARTIALLY RIGHT, TOO EARLY, or WRONG. Misses are analyzed for systematic bias - not to excuse them, but to calibrate future assessments.

The March 2026 audit (20 predictions, January-March 2026) showed:

Current track record: accuracy.md in the repository (March 2026 Scorecard).

Validation

Before publication, every assessment passes through four validation layers:

1. Data Integrity (automated)

Schema validation on all structured data. Prediction IDs, deadlines, vector names, source references - all checked against the schema.
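A hedged sketch of what such integrity checks look like, in Python for illustration; the ID format and field names are assumptions, not the actual schema:

```python
import re
from datetime import date

VECTORS = {"INSTITUTIONAL", "ALLIANCE", "ECONOMIC", "INFORMATION",
           "MILITARY", "POLITICAL", "SOCIAL"}

def validate_record(rec):
    """Return a list of schema violations (empty list = record passes)."""
    errors = []
    if not re.fullmatch(r"[A-Z0-9-]+", rec.get("id", "")):  # hypothetical ID format
        errors.append("id: missing or malformed")
    try:
        date.fromisoformat(rec.get("deadline", ""))
    except ValueError:
        errors.append("deadline: not an ISO date")
    if rec.get("vector") not in VECTORS:
        errors.append("vector: unknown vector name")
    if rec.get("confidence") not in range(1, 6):
        errors.append("confidence: must be 1-5")
    return errors

assert validate_record({"id": "DEMO-001", "deadline": "2026-06-30",
                        "vector": "ECONOMIC", "confidence": 3}) == []
```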

2. Cognitive Bias Checklist (manual)

Eight biases checked before every assessment:

3. Logical Consistency (formal)

A Prolog-based validation engine enforces:

This connects to research on AI reasoning depth: AI can produce explanations up to ~10 levels deep, but only ~2.5 survive external verification. The validation engine is designed to keep assessments in the verifiable zone.

4. Red Team (adversarial)

Before publishing, five questions must be answered:

  1. What’s the strongest argument AGAINST this assessment?
  2. What alternative explanation fits the same facts?
  3. What would a defender of the subject say?
  4. What am I missing?
  5. In two years, what might make this look foolish?

Named Patterns

When a pattern appears across multiple assessments, it gets a name. Current named patterns:

Pattern | What It Describes
DEMAND-SIDE SUBSIDY | Borrower given cheap credit but forced to buy from specific suppliers
CORE-PERIPHERY EXTRACTION | Periphery borrows under rules designed by the core to benefit the core
COMPETENCE LAUNDERING | Failed actor promoted; promotion treated as evidence of competence
UNFALSIFIABLE REFRAMING | Testable claim restructured to become untestable
PREDICTION MARKET SIGNAL LEAKAGE | Classified operational plans visible through public betting patterns
AMPLIFICATION LAUNDERING | State-originated narrative amplified through domestic actors to appear grassroots
REGULATORY CAPTURE BLIND SPOT | Assuming institutional response when the institution has conflicting financial interests
RUSSIAN ESCALATION SEQUENCE | Predictable ladder: economic pressure -> info ops -> diplomatic isolation -> military provocation -> frozen conflict -> military action. Each step only after the previous fails
RATCHET NOT PENDULUM | Crisis-driven structural changes (trade channels, currency agreements, alliance shifts) that persist after the crisis ends. Dedollarization, BRICS settlement, defense autonomy
STRUCTURAL BENEFIT WITHOUT ACTION | Actor benefits from crisis without causing it. No sanctions target, no deterrence point. More dangerous than active operations

Naming patterns makes them detectable. Once you see DEMAND-SIDE SUBSIDY in European defense procurement, you start noticing it in IMF lending, agricultural policy, and technology transfer agreements.


Interactive Models

For assessments with cascading dependencies, I build interactive models. You can adjust the inputs and watch how consequences ripple through the system.

Every model has:

The models encode my causal reasoning into a testable structure. If you disagree with a relationship or threshold, you can see exactly where and why.
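A toy version of such a model, just to show the shape: adjust an input, and effects ripple through declared edges. The node names, threshold, and transfer factor are invented for illustration, not values from any assessment:

```python
EDGES = {                 # node -> downstream nodes it pushes on when triggered
    "sanctions": ["trade_volume"],
    "trade_volume": ["currency_pressure"],
}
THRESHOLD = 0.5           # a node fires its downstream edges above this level
TRANSFER = 0.8            # damping: how much of a node's level carries downstream

def propagate(levels):
    """Ripple effects through the graph until nothing changes."""
    levels = dict(levels)
    changed = True
    while changed:
        changed = False
        for node, downstream in EDGES.items():
            if levels.get(node, 0) > THRESHOLD:
                for d in downstream:
                    new = max(levels.get(d, 0), levels[node] * TRANSFER)
                    if new != levels.get(d, 0):
                        levels[d] = new
                        changed = True
    return levels

# Raising "sanctions" to 1.0 cascades through trade volume into currency pressure.
result = propagate({"sanctions": 1.0})
```

Disagreeing with the model then means disagreeing with a specific edge, threshold, or transfer factor, which is exactly the point: the causal claims are inspectable.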


How to Evaluate This Work

Questions to ask when reading any assessment:

  1. Are confidence levels explicit? Every judgment should carry one. If not, it’s an oversight.
  2. Are sources cited? And are they of an appropriate tier for the claimed confidence?
  3. Is there a steel-man? The assessment should present the strongest case against itself.
  4. Are predictions falsifiable? Vague predictions (“tensions will increase”) are useless. Specific ones with deadlines are testable.
  5. Is the track record public? Mine is. If an analyst won’t show their prediction history, ask why.
  6. Is cui bono addressed? Who benefits from this analysis being wrong? Who benefits from it being right? Including me.

Intellectual Heritage

Built on Carl Sagan’s Baloney Detection Kit (1995, The Demon-Haunted World). All nine of Sagan’s rules are implemented. Extended with: cui bono analysis, actor background checks, pattern mapping across seven vectors, prediction accountability with signal watches, assessment versioning, and formal validation.

Sagan’s kit detects bullshit in science. This framework detects it in geopolitics. Same enemy: confident claims without falsifiable criteria.


Open Source

The full methodology, all assessment templates, validation tools, and prediction tracking are public:

github.com/maciejjankowski/zbigniew-protocol

Fork it. Apply it to your domain. File issues if you find logical gaps. The framework improves by being challenged, not protected.


“The question is not ‘who is an asset.’ The question is: ‘Why does this policy portfolio perfectly match the wish-list of adversaries?’”

por. Zbigniew - Pattern recognition, not prophecy