29 Letters to Congress: Corpus Evidence for Oversight

TLDR

Twenty-nine individualized letters were generated for H.R. 4405 cosponsors and committee chairs, each containing a corpus-sourced analytical enclosure with database-verified evidence (PAPER TRAIL Project, 2026a). Every recipient was first vetted against the corpus through a four-layer check -- entity mentions, co-occurrences, wire transfers, and FedEx records -- to ensure no letter was sent to someone implicated in the data.


Why Write to Congress

P.L. 119-38 mandated the release of Epstein records (Epstein Files Transparency Act, Pub. L. No. 119-38, 2025). It did not mandate that anyone analyze them. The DOJ released 3.5 million pages into a public portal. Congressional staffers received 33,000 pages directly. Members who wanted to review classified materials could visit a secure DOJ annex with four computers, one at a time, weekdays only, with their search queries monitored (PAPER TRAIL Project, 2026b).

Nobody was connecting the dots across the full corpus. The multi-script pipeline had already done that work: 2.38 million entities extracted, 519,000 clusters resolved, 224 wire transfers parsed, 2,894 FedEx shipments catalogued, 889 temporal change-points detected (PAPER TRAIL Project, 2026c). The question was how to get that analysis into the hands of the people with subpoena power.

The answer was letters. Not mass-produced form letters, but individualized correspondence to each of the 29 members of Congress most directly involved in Epstein oversight, with a standardized analytical enclosure providing the database-verified evidence they could not generate themselves.

Vetting the Recipients

Before generating any letter, each recipient passed through a four-layer corpus vetting process (PAPER TRAIL Project, 2026d). This was not optional. Sending corpus-derived evidence to a member of Congress who appears in that same corpus would be, at minimum, counterproductive.

The four layers: entity mention search (does the member's name appear in any document?), co-occurrence analysis (does the member co-occur with Epstein-linked entities?), wire transfer search (does the member's name appear in any wire transfer record?), and FedEx shipment search (does the member appear in shipping records?). A member who triggered any layer was flagged for review before inclusion. All 29 final recipients passed all four layers cleanly (PAPER TRAIL Project, 2026d).
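The four-layer check can be sketched roughly as follows. This is a minimal illustration, not the project's vetting script: the `Corpus` container, the `normalize` helper, and the layer names are assumptions, and real matching against 2.38 million entities would need fuzzy name resolution rather than exact lookup.

```python
from dataclasses import dataclass, field


@dataclass
class Corpus:
    # Hypothetical in-memory stand-ins for the four data sources.
    # Each set is assumed to hold already-normalized names.
    entity_mentions: set[str] = field(default_factory=set)
    cooccurrences: set[str] = field(default_factory=set)
    wire_parties: set[str] = field(default_factory=set)
    fedex_parties: set[str] = field(default_factory=set)


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so lookups are consistent."""
    return " ".join(name.lower().split())


def vet_recipient(name: str, corpus: Corpus) -> list[str]:
    """Return the layers a name triggers; an empty list means a clean pass."""
    key = normalize(name)
    layers = {
        "entity_mention": corpus.entity_mentions,
        "co_occurrence": corpus.cooccurrences,
        "wire_transfer": corpus.wire_parties,
        "fedex_shipment": corpus.fedex_parties,
    }
    return [layer for layer, names in layers.items() if key in names]


# Example with made-up names: one name appears in wire records, one is clean.
corpus = Corpus(wire_parties={"jane doe"})
flagged = vet_recipient("Jane  Doe", corpus)
clean = vet_recipient("John Roe", corpus)
```

Returning the list of triggered layers, rather than a single pass/fail flag, keeps enough detail for the review step that follows any hit.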

The original list had 30 recipients. Rep. Marjorie Taylor Greene was removed, reducing the count to 29. The remaining recipients span both parties and both chambers.

The Three Tiers

Recipients were organized into a priority system reflecting their leverage over the Epstein investigation.

Tier 1 comprised five members with maximum institutional power: Rep. Jim Jordan (House Judiciary Chair), Sen. Chuck Grassley (Senate Judiciary Chair), Rep. Ro Khanna (H.R. 4405 lead cosponsor), Rep. Thomas Massie (H.R. 4405 primary sponsor), and Rep. Sylvia Garcia (PAPER TRAIL Project, 2026e). These members either hold committee chairmanships or were architects of the transparency legislation itself.

Tier 2 included Rep. Lauren Boebert and Rep. Nancy Mace -- both discharge petition signatories who demonstrated willingness to defy party leadership on the issue. Tier 3 comprised the remaining 22 cosponsors and committee members.

The tiering did not affect letter content. Every recipient received the same analytical enclosure. The tiers governed sending order, ensuring the most consequential recipients received correspondence first.
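Since the tiers only govern sending order, the logic reduces to a stable sort on the tier number. The sketch below assumes each recipient record carries a `name` and a `tier` field; the actual schema of the recipients configuration file is not shown here, and the member names are placeholders.

```python
# Tier sizes as stated in the text: 5 + 2 + 22 recipients.
TIER_SIZES = {1: 5, 2: 2, 3: 22}
TOTAL_RECIPIENTS = sum(TIER_SIZES.values())


def sending_order(recipients: list[dict]) -> list[dict]:
    """Order recipients by tier; sorted() is stable, so the original
    order is preserved within each tier."""
    return sorted(recipients, key=lambda r: r["tier"])


# Placeholder records (hypothetical field names and values).
recipients = [
    {"name": "Member C", "tier": 3},
    {"name": "Member A", "tier": 1},
    {"name": "Member B", "tier": 2},
    {"name": "Member D", "tier": 1},
]
ordered_names = [r["name"] for r in sending_order(recipients)]
```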

What the Letters Contain

The letter generation script produced individualized .docx and .pdf letters for each recipient (PAPER TRAIL Project, 2026a). Each letter identified the sender (Angel Reyes, DrPH candidate, Claremont Graduate University), established the project's scope, and directed the recipient to the enclosed analytical summary.

The enclosure generation script produced the analytical enclosure -- a standardized document providing corpus statistics that no individual congressional office could produce (PAPER TRAIL Project, 2026f). The enclosure included entity counts (2.38 million raw entities, 519,000 resolved clusters), wire transfer summaries ($24.1 million across 224 transactions), FedEx shipping network structure (2,894 shipments, 148 third-party), network topology findings (125,620 community groupings, 535,318 structural-hole brokers), and temporal analysis results (889 change-points validated against 50 calibration dates).

The enclosure did not contain accusations. It contained data. The wire transfer table showed amounts, dates, originators, and beneficiaries -- all parsed directly from government-released documents. The network topology showed who appeared alongside whom across multiple document types. The temporal analysis showed when document activity spiked and when it went silent.

The burden of interpreting that data -- and deciding what to do with it -- belonged to the members and their staff.

The Parallel Channel

The letters complement the two existing channels through which Congress accesses Epstein evidence. The public release (3.5 million pages, searchable online) provides raw documents but no analysis. The DOJ direct production (33,000 pages to the Oversight Committee) provides a curated subset but is limited to what DOJ chose to produce. The letters provide a third channel: independent analytical output from a civilian researcher who processed the entire public corpus computationally (PAPER TRAIL Project, 2026b).

This creates a triangulation opportunity. If the analytical enclosure identifies a wire transfer that appears in the public release but was omitted from the DOJ's 33,000-page production to the Committee, that discrepancy itself becomes an oversight finding. If the entity resolution identifies connections across data sets that DOJ presented as separate, those connections inform the scope of subpoenas.
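The wire-transfer comparison amounts to a set difference between two record collections. The sketch below assumes transfers from both sources have been parsed into the same four fields (amounts, dates, originators, beneficiaries) and that exact equality is a good enough match. Both are simplifications, and the `Wire` type and the sample values are invented for illustration.

```python
from typing import NamedTuple


class Wire(NamedTuple):
    # Hypothetical normalized record; amounts in cents to avoid float drift.
    date: str
    originator: str
    beneficiary: str
    amount_cents: int


def missing_from_production(public: list[Wire], produced: list[Wire]) -> list[Wire]:
    """Transfers present in the public release but absent from the
    direct production, in their public-release order."""
    produced_set = set(produced)
    return [w for w in public if w not in produced_set]


# Made-up records, not corpus data.
public_release = [
    Wire("2000-01-05", "Entity X", "Entity Y", 1_000_000),
    Wire("2000-02-10", "Entity X", "Entity Z", 2_500_000),
]
direct_production = [
    Wire("2000-01-05", "Entity X", "Entity Y", 1_000_000),
]
discrepancies = missing_from_production(public_release, direct_production)
```

In practice the two sources would rarely agree character for character, so a real comparison would need tolerant matching on names and dates before any gap is treated as an omission.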

What the Letters Cannot Do

The letters are not evidence in any legal sense. They are analytical products derived from government-released documents by a civilian researcher using open-source tools. They have not been tested under the Daubert standard (the legal test courts use to determine whether expert testimony and scientific evidence are admissible). They have not been subjected to adversarial cross-examination. They are not sworn testimony.

What they can do is direct attention. A congressional staffer reviewing 33,000 pages of documents has no way to know which 50 pages matter most. The surprisal scoring can tell them (PAPER TRAIL Project, 2026g). A member preparing questions for a deposition has no way to map the corporate ownership structure from raw PDFs. The ownership graph has already done it (PAPER TRAIL Project, 2026h). A committee drafting a subpoena has no way to identify which entities appear across the most analytical domains. The cross-domain synthesis ranks them (PAPER TRAIL Project, 2026i).
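To make the page-ranking idea concrete, here is one way an IDF-plus-PMI score could work. It is an illustrative assumption, not the project's scoring script: each page is reduced to a set of entity names, IDF rewards entities that appear on few pages, PMI rewards pairs that co-occur more often than chance, and the way the two are combined in `page_score` is a hypothetical choice.

```python
import math
from collections import Counter
from itertools import combinations


def build_counts(pages):
    """Count on how many pages each entity and each entity pair appears."""
    df, pair_df = Counter(), Counter()
    for ents in pages:
        df.update(ents)
        pair_df.update(frozenset(p) for p in combinations(sorted(ents), 2))
    return df, pair_df


def idf(entity, df, n_pages):
    """Inverse document frequency in bits: rarer entities score higher."""
    return math.log2(n_pages / df[entity])


def pmi(a, b, df, pair_df, n_pages):
    """Pointwise mutual information of two entities that co-occur somewhere."""
    joint = pair_df[frozenset((a, b))] / n_pages
    return math.log2(joint / ((df[a] / n_pages) * (df[b] / n_pages)))


def page_score(ents, df, pair_df, n_pages):
    """Hypothetical combination: mean IDF plus mean positive PMI."""
    if not ents:
        return 0.0
    mean_idf = sum(idf(e, df, n_pages) for e in ents) / len(ents)
    pairs = list(combinations(sorted(ents), 2))
    if not pairs:
        return mean_idf
    mean_pmi = sum(max(0.0, pmi(a, b, df, pair_df, n_pages)) for a, b in pairs) / len(pairs)
    return mean_idf + mean_pmi


def top_pages(pages, k):
    """Indices of the k highest-scoring pages (ties broken by page index)."""
    df, pair_df = build_counts(pages)
    n = len(pages)
    scored = [(page_score(ents, df, pair_df, n), i) for i, ents in enumerate(pages)]
    return [i for _, i in sorted(scored, key=lambda t: (-t[0], t[1]))[:k]]


# Toy corpus: three routine pages and one page with a rare pairing.
pages = [{"A", "B"}, {"A", "B"}, {"A", "B"}, {"C", "D"}]
most_surprising = top_pages(pages, 1)
```

With this toy input the page containing the rare pair ranks first, which is the behavior a staffer would want from a "which pages matter most" filter.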

Twenty-nine letters. Twenty-nine enclosures. The evidence was already public. What was missing was the analysis.


References

Epstein Files Transparency Act, Pub. L. No. 119-38 (2025).

PAPER TRAIL Project. (2026a). Congressional letter generation [Script 14]. 14_generate_letters.py

PAPER TRAIL Project. (2026b). DOJ compliance status and implementation disputes [Research document]. doj_compliance_status.md

PAPER TRAIL Project. (2026c). Multi-script analytical pipeline [Data pipeline]. app/scripts/

PAPER TRAIL Project. (2026d). Congressional recipient vetting: Four-layer corpus check [Script 13]. 13_vet_recipients.py

PAPER TRAIL Project. (2026e). Congressional letter recipients [Configuration]. config/recipients.json

PAPER TRAIL Project. (2026f). Analytical summary enclosure generation [Script 15]. 15_generate_enclosure.py

PAPER TRAIL Project. (2026g). IDF+PMI surprisal scoring [Script 22]. 22_surprisal.py

PAPER TRAIL Project. (2026h). Institutional forensics: Ownership graph [Script 18]. 18_institutional_analysis.py

PAPER TRAIL Project. (2026i). Cross-domain synthesis [Script 25b]. 25_cross_domain_synthesis.py