TLDR
Every claim in PAPER TRAIL is assigned one of four evidence tiers: T1 (government primary source), T2 (corpus-derived computation), T3 (journalism), or T4 (estimation/calculation). Each episode publishes its verification breakdown, with EP08 "Eight Aircraft" achieving the highest T1 ratio at 72% (PAPER TRAIL Project, 2026). This system makes the evidentiary basis of every statement visible and auditable.
The Problem of Trust
Documentaries about criminal networks face a credibility problem. The subject matter invites speculation. The audience has been conditioned by conspiracy content to distrust claims and by institutional failures to distrust denials. In this environment, presenting findings without making their evidentiary basis explicit is indistinguishable from presenting opinion.
PAPER TRAIL's response is a four-tier verification system applied to every factual claim in every episode. The tiers are not a rating of importance or confidence — they are a classification of provenance. Where did this number come from? Who generated this document? Is this a fact that was observed, computed, reported, or estimated?
The Four Tiers
Tier 1: Government Primary Sources. These are claims sourced directly to government records: FAA aircraft registrations, state corporate filings (New York Department of State, Delaware Division of Corporations), NYDFS consent orders, PACER docket entries, legislative text (Epstein Files Transparency Act, Pub. L. No. 119-38, 2025), DOJ press releases, and congressional hearing transcripts. T1 claims carry the highest institutional credibility because their provenance is independently verifiable through public registries (PAPER TRAIL Project, 2026).
Tier 2: Corpus-Derived Computation. These are claims generated by the analytical pipeline operating on the document corpus: wire transfer counts and amounts parsed by the bank records processing script, entity counts from Named Entity Recognition (NER) extraction, cluster counts from Splink (a probabilistic record-linkage tool that matches different references to the same entity), change-point totals from PELT (an algorithm that detects sudden shifts in patterns over time), community counts from the Leiden algorithm (which finds groups of entities that cluster together in a network), and FedEx shipment statistics. T2 claims are computationally reproducible — anyone with the same database and scripts would produce the same numbers — but they depend on the integrity of the pipeline and the completeness of the source data (PAPER TRAIL Project, 2026).
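The reproducibility property can be sketched in a few lines: the same query over the same database always returns the same figures. The table and column names below (wire_transfers, amount_usd) are illustrative assumptions, not the project's actual schema:

```python
import sqlite3

def wire_transfer_stats(con: sqlite3.Connection) -> tuple[int, float]:
    """Return (count, total USD) of parsed wire transfers.

    Running the same query against the same database always yields the
    same numbers -- the essence of a T2 (corpus-derived) claim.
    """
    count, total = con.execute(
        "SELECT COUNT(*), COALESCE(SUM(amount_usd), 0) FROM wire_transfers"
    ).fetchone()
    return count, total

# Demo: an in-memory database standing in for the real corpus.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE wire_transfers (amount_usd REAL)")
con.executemany("INSERT INTO wire_transfers VALUES (?)", [(1000.0,), (2500.0,)])
print(wire_transfer_stats(con))  # (2, 3500.0)
```

The point is not the query itself but that a T2 claim's audit trail is "re-run the script," whereas a T1 claim's audit trail is "look up the registry."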
Tier 3: Journalism. These are claims sourced from news reporting by outlets on a 26-domain whitelist maintained by the verification system. The whitelist includes NPR, CNN, Washington Post, BBC, PBS, CBS News, and similar organizations. T3 claims are used for contextual and current-events information that is not available through government records or corpus computation (PAPER TRAIL Project, 2026).
Tier 4: Estimation and Calculation. These are claims derived from statistical models and analytical inferences: Chao1 species richness estimates (a method that infers how many total entities exist from how many appear only once or twice), Granger causality p-values (from a statistical test of whether one type of event reliably happens before another), Analysis of Competing Hypotheses (ACH) scores from structured comparison tables that pit competing explanations against each other, and compound confidence figures, which capture how the overall reliability of a chain of evidence decreases with each additional link. T4 claims are explicitly labeled because they represent the pipeline's interpretive layer — what the data suggests rather than what the data contains (PAPER TRAIL Project, 2026).
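Two of these T4 methods reduce to short formulas, sketched below. The function names are illustrative, and the Chao1 variant shown is the bias-corrected form (chosen here so the estimate is defined even when there are no doubletons); the pipeline's actual implementation may differ:

```python
from collections import Counter

def chao1(observations: list[str]) -> float:
    """Bias-corrected Chao1 richness estimate: the observed entity count
    plus a correction term driven by singletons (f1, entities seen once)
    and doubletons (f2, entities seen twice).
    """
    counts = Counter(observations)
    s_obs = len(counts)
    f1 = sum(1 for c in counts.values() if c == 1)
    f2 = sum(1 for c in counts.values() if c == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

def compound_confidence(link_confidences: list[float]) -> float:
    """Reliability of a chain of evidence: the product of per-link
    confidences, so every extra link lowers the total.
    """
    result = 1.0
    for c in link_confidences:
        result *= c
    return result

# Three links at 90% each already drop a chain below 75%.
print(round(compound_confidence([0.9, 0.9, 0.9]), 3))  # 0.729
```

Both illustrate why T4 sits in its own tier: the outputs are defensible inferences from the data, not facts read out of it.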
Episode Verification Profiles
Each episode has a distinct verification profile that reflects its subject matter:
EP08 "Eight Aircraft" achieved the highest T1 ratio: 99 of 138 claims (72%) sourced to government primary records. This is because FAA registrations, corporate filings, and court records provide direct T1 sourcing for aviation and corporate entity claims (PAPER TRAIL Project, 2026).
EP06 "2,894 Packages" had the highest total claim count at 143, with 124 claims (87%) at T2. FedEx shipping data is corpus-derived — the numbers come from parsed invoices in the database — so the vast majority of claims are computational (PAPER TRAIL Project, 2026).
EP07 "The Wrong Robert" achieved zero unverified claims across 118 total (55 T1, 44 T2, 12 T3, 7 T4). As the episode about corroboration methodology, its own verification completeness had to be bulletproof (PAPER TRAIL Project, 2026).
EP05 "The SAR" was built entirely from a single source document — the TD Bank SAR filing (EFTA01656524.pdf) — making it the most narrowly sourced episode at 28 slides (PAPER TRAIL Project, 2026).
EP02 "The Pipeline" had the highest T4 concentration: 22 of 87 claims (25%) were estimation-tier, reflecting the statistical and computational methods being described (PAPER TRAIL Project, 2026).
The Verification Process
The verification system is not just a labeling exercise. The claim verification script implements a four-tier verification agent that takes natural-language claims and tests them against:
- T2: Database queries against entities, wire transfers, FedEx shipments, and full-text search
- T1: Government registries including OpenCorporates, FAA N-number lookup, and CourtListener
- T3: DuckDuckGo journalism search against the 26-domain whitelist
- T4: Detection of estimation language ("approximately," "estimated," "suggests")
Each claim receives a verdict: CONFIRMED, PARTIALLY_CONFIRMED, CONTRADICTED, UNVERIFIABLE, or CALCULATED. The verification reports are exported as JSON and CSV files (PAPER TRAIL Project, 2026).
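The routing and verdict logic can be sketched as follows. This is a simplified stand-in for the claim verification script, modeling only the T4 and T2 checks; the marker list, the db_lookup callback, and all names are assumptions for illustration, and the real agent also consults government registries (T1) and whitelisted journalism (T3):

```python
import json
import re

# T4 heuristic: estimation language marks a claim as CALCULATED.
ESTIMATION_MARKERS = re.compile(r"\b(approximately|estimated|suggests)\b", re.I)

def verify_claim(claim: str, db_lookup) -> dict:
    """Route a natural-language claim through simplified tier checks.

    db_lookup stands in for the T2 database query step and returns
    True (supported), False (contradicted), or None (no match).
    """
    if ESTIMATION_MARKERS.search(claim):
        return {"claim": claim, "tier": "T4", "verdict": "CALCULATED"}
    hit = db_lookup(claim)
    if hit is True:
        return {"claim": claim, "tier": "T2", "verdict": "CONFIRMED"}
    if hit is False:
        return {"claim": claim, "tier": "T2", "verdict": "CONTRADICTED"}
    return {"claim": claim, "tier": None, "verdict": "UNVERIFIABLE"}

# Export verdicts the way the reporting step does -- as JSON.
results = [verify_claim(c, lambda _: None)
           for c in ["The total is approximately $1.2 million."]]
report = json.dumps(results, indent=2)
```

A CSV export would serialize the same records with different formatting; the substance is that every claim leaves the pipeline carrying both its tier and its verdict.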
Why It Matters
The four-tier system serves two audiences. For the general viewer, the tier labels signal that claims have different evidentiary strengths — a government filing is not the same as a statistical estimate. For the professional audience — investigators, prosecutors, congressional staff — the tier labels enable rapid assessment of which claims can be cited in official proceedings (T1), which require pipeline access to verify (T2), which need independent confirmation (T3), and which represent analytical conclusions subject to methodological debate (T4).
In a domain where speculation and evidence have been thoroughly mixed by public discourse, making the boundary visible is itself an analytical contribution.
References
PAPER TRAIL Project. (2026). Four-tier verification system definition and episode breakdowns [Data]. Episode references.md files.
PAPER TRAIL Project. (2026). EP08 verification: 138 claims, 99 T1 (72%) [Data]. communications/ep08_slides/references.md.
PAPER TRAIL Project. (2026). EP07 verification: 118 claims, 0 unverified [Data]. communications/ep07_slides/references.md.
PAPER TRAIL Project. (2026). EP06 verification: 143 claims, 124 T2 [Data]. communications/ep06_slides/references.md.
PAPER TRAIL Project. (2026). Claim verification script implementation [Script]. app/scripts/26_verify_claim.py.
PAPER TRAIL Project. (2026). 26-domain journalism whitelist [Script]. app/scripts/utils/web_verify.py.
PAPER TRAIL Project. (2026). Verification exports [Data]. _exports/verification/.