Technical Deep Dives
Sub-Second Search Across 2.1 Million Documents
TLDR A fast full-text search system built into the database (known technically as a GIN-indexed tsvector) enables sub-second queries across 2.1 million...
4 investigations
TLDR A fast full-text search system built into the database (known technically as a GIN-indexed tsvector) enables sub-second queries across 2.1 million...
TLDR A pipeline of 27+ Python scripts transforms 2.1 million raw government documents into a searchable PostgreSQL database with 2.38 million extracted...
TLDR The entire 2.1 million document Epstein corpus was processed on a single Windows PC: an Intel i9-13950HX with 24 cores, an NVIDIA RTX 4070 with 8 GB of...
TLDR PostgreSQL 16 serves as the single master database for the entire Epstein corpus analysis. Six core tables hold 2.1 million documents, 2.38 million...