TLDR
Episode 12 of PAPER TRAIL explains how computational findings become evidence: the Daubert standard for legal admissibility, five quality gates the pipeline must pass, a bootstrap calibration protocol using 50 verified dates and 46 anchor entities, and the corrections made when the pipeline's own outputs were wrong — including 2 retractions, 5 date corrections, and 1 refutation (PAPER TRAIL Project, 2026a).
The Daubert Standard
The episode opens with 0.84 — the minimum F1 score required for entity resolution before any finding is presented. Below this line, the pipeline's outputs are not trusted. The number derives from the Daubert standard, established by the Supreme Court in Daubert v. Merrell Dow Pharmaceuticals (1993), which sets four pillars for expert testimony: the methodology must be testable (deterministic pipeline, same input produces same output), the error rate must be known (F1, precision, recall computed at every stage), the methods must be peer-reviewable (all algorithms published in scientific literature), and the methodology must be generally accepted (Bayesian probability, Leiden, Splink, PELT are standard tools) (PAPER TRAIL Project, 2026a).
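The error-rate pillar is the most mechanical of the four, and a short sketch makes it concrete. The fragment below is illustrative rather than the project's actual scoring code: it computes precision, recall, and F1 for entity-resolution output against a hand-labeled sample and checks the 0.84 gate. The pair identifiers and sample data are invented for the example.

```python
# Minimal sketch of the "known error rate" pillar: score entity-resolution
# output against a hand-labeled sample and check the F1 gate.
# The 0.84 threshold comes from the episode; names and data are illustrative.

def f1_gate(predicted_pairs, true_pairs, threshold=0.84):
    """Return (precision, recall, f1, passed) for a set of candidate matches."""
    predicted, truth = set(predicted_pairs), set(true_pairs)
    tp = len(predicted & truth)                     # correctly linked record pairs
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(truth) if truth else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1, f1 >= threshold

# Illustrative labeled sample: (record_id_a, record_id_b) pairs.
predicted = {("A1", "B7"), ("A2", "B9"), ("A3", "B4")}
verified  = {("A1", "B7"), ("A2", "B9"), ("A5", "B2")}
print(f1_gate(predicted, verified))   # F1 ~= 0.67, below the 0.84 gate
```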
EP12 applies these pillars to a computational pipeline rather than a human expert. The project was not designed for court, but the episode argues that the Daubert standard is a minimum standard for intellectual honesty: findings that cannot survive legal scrutiny should not survive public scrutiny either.
Five Quality Gates
The pipeline enforces five checkpoints, each with a named threshold:

Gate 1: entity resolution F1 > 0.84 (designed, not formally tested at production scale).
Gate 2: network modularity Q > 0.30 (testable; 125,620 communities computed).
Gate 3: document retrieval recall 75–80% (designed).
Gate 4: calibration anchor alignment > 85% (50 dates calibrated, 5 corrected, 89.8% alignment achieved).
Gate 5: compound error propagation within bounds (model designed, worst-case ceiling 0.9162) (PAPER TRAIL Project, 2026a).
The episode is explicit about which gates have been tested and which remain pending. This transparency — stating "designed but not formally validated" alongside "tested and passed" — is itself part of the methodology. A quality gate that claims to have been passed when it has not been tested is worse than no gate at all.
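To make the checkpoint logic concrete, here is a minimal sketch of how such gates could be evaluated. The thresholds and the tested/designed statuses come from the gate list above; the data structure, key names, sample metrics, and the direction of the compound-error comparison are assumptions made for illustration.

```python
# Sketch of the five-gate checkpoint logic described above. Thresholds come
# from the episode; structure, field names, and example metrics are assumed.

GATES = {
    "entity_resolution_f1":   {"threshold": 0.84,   "status": "designed"},
    "network_modularity_q":   {"threshold": 0.30,   "status": "tested"},
    "retrieval_recall":       {"threshold": 0.75,   "status": "designed"},
    "anchor_alignment":       {"threshold": 0.85,   "status": "tested"},
    "compound_error_ceiling": {"threshold": 0.9162, "status": "designed"},
}

def report(metrics: dict[str, float]) -> None:
    """Print pass/fail per gate, flagging gates that are designed but untested."""
    for name, gate in GATES.items():
        value = metrics.get(name)
        if value is None:
            print(f"{name:24s} PENDING (no measurement yet)")
            continue
        verdict = "PASS" if value >= gate["threshold"] else "FAIL"
        caveat = "" if gate["status"] == "tested" else "  [designed, not formally validated]"
        print(f"{name:24s} {verdict}  value={value:.4f} threshold={gate['threshold']}{caveat}")

report({"network_modularity_q": 0.41, "anchor_alignment": 0.898})
```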
Calibration and Corrections
The calibration protocol uses 50 verified dates from primary government and court sources as fixed reference points. Five of the pipeline's dates failed to match these anchors and were corrected. The PELT temporal analysis initially produced 889 breakpoints; recalibrating with an optimal penalty of 0.2069 reduced this to 610 and achieved 89.8% anchor alignment, passing the 85% threshold (PAPER TRAIL Project, 2026b).
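A hedged sketch of what that recalibration step looks like in code follows. The ruptures library's PELT implementation and the 0.2069 penalty match the episode; the toy signal, the anchor list, and the tolerance-window definition of alignment are assumptions, and the project's actual alignment metric may be defined differently.

```python
# PELT change-point detection with a fixed penalty, then anchor alignment
# measured as the share of verified positions that fall within a tolerance
# window of some detected breakpoint. Signal and anchors are illustrative.
import numpy as np
import ruptures as rpt

rng = np.random.default_rng(0)
signal = np.concatenate([rng.normal(0, 1, 200), rng.normal(3, 1, 200)])  # toy series

algo = rpt.Pelt(model="rbf").fit(signal)
breakpoints = algo.predict(pen=0.2069)        # indices where segments end

def anchor_alignment(anchors, breakpoints, tolerance=5):
    """Fraction of anchor positions within `tolerance` samples of a breakpoint."""
    hits = sum(any(abs(a - b) <= tolerance for b in breakpoints) for a in anchors)
    return hits / len(anchors)

anchors = [200]                               # known regime change in the toy series
print(len(breakpoints), anchor_alignment(anchors, breakpoints))
```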
The corrections are presented as evidence of methodological integrity. Two observations were retracted (OBS-5 and OBS-6, both OCR hallucinations), and one hypothesis was refuted (OBS-1, the Robert Crumb FedEx finding). The N212JE tail number recycling, in which two different aircraft used the same registration, was caught by external corroboration research rather than by the pipeline itself. The episode uses this as a case study: no pipeline is complete, and external verification catches what internal validation misses (PAPER TRAIL Project, 2026c).
Why This Episode Matters
EP12 is the series' methodological backbone. Every finding in every other episode depends on the standards established here. The Chao1 estimate (63.7% complete) means every conclusion is drawn from partial evidence. The compound error model means even individually valid stages accumulate uncertainty. And the retraction standard — publish the error with the same prominence as the original finding — sets the project's credibility floor. A claim without an error rate is not evidence. It is a guess.
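The compound-error point can be illustrated in a few lines: if each stage's reliability were independent, the compound confidence would be the product of the per-stage values, which is lower than any single stage on its own. The stage names and numbers below are placeholders and do not reproduce the episode's 0.9162 figure, which depends on the pipeline's actual per-stage rates.

```python
# Illustrative sketch of why individually valid stages still accumulate
# uncertainty. Stage names and reliabilities are placeholders, not the
# project's published figures.
stages = {
    "ocr_and_extraction": 0.98,
    "entity_resolution":  0.95,
    "network_inference":  0.97,
    "temporal_alignment": 0.96,
}

compound = 1.0
for name, reliability in stages.items():
    compound *= reliability

print(round(compound, 4))   # ~0.867: lower than any single stage alone
```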
References
Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579 (1993).
PAPER TRAIL Project. (2026a). EP12 slide content: Daubert standard, quality gates, calibration protocol [Presentation]. communications/ep12_slides/
PAPER TRAIL Project. (2026b). PELT diagnostic: Optimal penalty calibration and anchor alignment [Data set]. _exports/temporal/anchor_alignment_report.csv
PAPER TRAIL Project. (2026c). Corroboration report: External source verification [Research document]. research/CORROBORATION_REPORT.md
This research is sponsored by Subthesis.