TLDR
A penalty sweep across 20 logarithmic steps reduced the corpus change-point count from 889 to 610. The optimal penalty of 0.2069 was selected by the elbow method from CROPS, a diagnostic that maps change-point counts across a range of penalty values (PAPER TRAIL Project, 2026a). Anchor alignment against 50 verified calibration dates held at 89.8%, passing the 85% quality gate (PAPER TRAIL Project, 2026b).
Why Recalibrate
The original PELT run used a penalty derived from the Modified Bayesian Information Criterion (MBIC), a model-selection penalty that discourages the algorithm from reporting spurious change-points (Killick et al., 2012). MBIC produced 889 breakpoints after database verification removed 77 artifacts from an initial 966 (PAPER TRAIL Project, 2026c). But MBIC is a general-purpose penalty. It does not account for the specific characteristics of this corpus: uneven dataset sizes, a 20-year time span, and a heavy concentration of documents in Data Sets 9 and 10.
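At its core, PELT is a dynamic program that trades segment fit against a per-breakpoint penalty, pruning candidate split points as it scans. The sketch below uses an L2 (mean-shift) cost on a one-dimensional mention-count series; it is for illustration only, and a production run such as Script 32 would use a tested library implementation:

```python
import numpy as np

def pelt_l2(y, penalty):
    """Minimal PELT with an L2 (mean-shift) cost.

    Returns sorted breakpoint indices. Illustrative sketch, not the
    project's detector.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    c1 = np.concatenate([[0.0], np.cumsum(y)])       # prefix sums
    c2 = np.concatenate([[0.0], np.cumsum(y * y)])   # prefix sums of squares

    def seg_cost(s, t):
        # sum of squared deviations from the segment mean on y[s:t]
        return (c2[t] - c2[s]) - (c1[t] - c1[s]) ** 2 / (t - s)

    F = np.full(n + 1, np.inf)   # F[t]: best penalized cost of y[:t]
    F[0] = -penalty
    last = np.zeros(n + 1, dtype=int)
    cands = [0]                  # candidate last-change positions (pruned)
    for t in range(1, n + 1):
        vals = [F[s] + seg_cost(s, t) + penalty for s in cands]
        best = int(np.argmin(vals))
        F[t] = vals[best]
        last[t] = cands[best]
        # PELT pruning: drop s that can never start the final segment
        cands = [s for s, v in zip(cands, vals) if v - penalty <= F[t]]
        cands.append(t)
    bkps, t = [], n              # backtrack through stored predecessors
    while t > 0:
        s = int(last[t])
        if s > 0:
            bkps.append(s)
        t = s
    return sorted(bkps)
```

On a clean mean shift -- fifty zeros followed by fifty tens -- this recovers the single breakpoint at index 50; raising the penalty past the shift's cost savings suppresses it, which is exactly the trade-off the sweep below explores.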
Script 32 addressed this by running CROPS -- Changepoints for a Range of Penalties -- a diagnostic tool that sweeps the penalty parameter across a range and counts the resulting change-points at each value (Haynes et al., 2017). Instead of trusting a single default penalty, CROPS maps the entire penalty-to-breakpoint curve and identifies where the curve stabilizes.
The Sweep
The diagnostic ran 20 penalty values in logarithmic space from 0.001 to 10.0 across 499 curated person entities with between 100 and 50,000 mentions each (PAPER TRAIL Project, 2026a). At the lowest penalty (0.001), the algorithm detected 27,597 breakpoints -- essentially fitting noise. At the highest penalty (10.0), it detected almost none. Between those extremes, the curve exhibits a characteristic elbow: a region where increasing the penalty produces diminishing reductions in breakpoint count.
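The grid and the sweep loop are simple to reproduce; the 20-step log spacing from 0.001 to 10.0 comes from the report, while the function and variable names below are illustrative (detector stands in for any PELT call returning breakpoint indices):

```python
import numpy as np

# 20 penalty values, evenly spaced in log space from 0.001 to 10.0;
# each value is roughly 1.62x the previous one
penalties = np.logspace(np.log10(0.001), np.log10(10.0), num=20)

def sweep(series_by_entity, detector, penalties):
    """Total breakpoint count across all entities at each penalty value."""
    return [
        sum(len(detector(y, pen)) for y in series_by_entity.values())
        for pen in penalties
    ]
```

Plotting the returned counts against the penalty grid yields the penalty-to-breakpoint curve whose elbow the diagnostic then locates.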
The elbow method identified the optimal penalty at 0.2069, producing 610 breakpoints across 223 entities -- an average of 2.7 breakpoints per entity (PAPER TRAIL Project, 2026a). This is 279 fewer breakpoints than the MBIC default. The removed breakpoints were not false in a computational sense -- PELT found them correctly given the lower penalty -- but they represented finer-grained segmentations that added noise without improving alignment to known events.
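One common geometric elbow heuristic picks the point on the curve farthest from the chord joining its endpoints, after mapping penalties to log space and scaling both axes. This is a standard technique, not necessarily the exact rule Script 32 applies:

```python
import numpy as np

def elbow_index(penalties, counts):
    """Index of the curve point farthest from the endpoint-to-endpoint chord.

    Penalties are mapped to log space and both axes are scaled to [0, 1]
    so the distance is not dominated by the raw breakpoint counts.
    """
    x = np.log10(np.asarray(penalties, dtype=float))
    y = np.asarray(counts, dtype=float)
    x = (x - x.min()) / (x.max() - x.min())
    y = (y - y.min()) / (y.max() - y.min())
    p0 = np.array([x[0], y[0]])
    d = np.array([x[-1], y[-1]]) - p0
    d /= np.linalg.norm(d)
    pts = np.column_stack([x, y]) - p0
    # perpendicular distance of each point from the chord
    dist = np.abs(pts[:, 0] * d[1] - pts[:, 1] * d[0])
    return int(np.argmax(dist))
```

Applied to the diagnostic's (penalty, count) pairs, the returned index names the penalty beyond which further increases buy little reduction in breakpoints.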
Anchor Alignment
The 610 breakpoints were tested against 50 verified calibration dates drawn from primary government and court sources spanning 2005 to 2026 (PAPER TRAIL Project, 2026b). A breakpoint "aligns" with a calibration anchor if it falls within a defined temporal window of a known event. At penalty 0.2069, 44 of the 49 testable anchors aligned, yielding 89.8% alignment (PAPER TRAIL Project, 2026a). The quality gate requires 85% or higher, so the recalibrated result passes.
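The alignment check reduces to: for each anchor, does any breakpoint fall inside the window around it? A minimal sketch with dates -- the 30-day default below is an illustrative assumption, since the window width is not stated here:

```python
from datetime import date, timedelta

def anchor_alignment(breakpoint_dates, anchor_dates, window_days=30):
    """Fraction of anchors with at least one breakpoint inside the window.

    window_days=30 is an illustrative assumption; the diagnostic defines
    its own tolerance.
    """
    window = timedelta(days=window_days)
    hits = sum(
        any(abs(bp - anchor) <= window for bp in breakpoint_dates)
        for anchor in anchor_dates
    )
    return hits / len(anchor_dates)
```

The quality gate then becomes a single comparison: alignment >= 0.85 passes.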
The alignment curve is not monotonic. At very low penalties, alignment is 100% because the algorithm finds so many breakpoints that every anchor has one nearby -- but this is meaningless because it also finds thousands of spurious breakpoints. As the penalty increases, alignment holds steady until approximately 0.55, where it drops sharply (PAPER TRAIL Project, 2026a). This alignment cliff marks the boundary between useful signal and over-penalization. The optimal penalty of 0.2069 sits well below this cliff, in the stable region.
Corpus Artifacts
Of the 610 breakpoints, 91 (14.9%) were flagged as corpus artifacts -- change-points driven by known corpus events rather than real-world events (PAPER TRAIL Project, 2026a). The most common corpus artifacts cluster around three corpus-shaping milestones: the P.L. 119-38 enactment in November 2025, the first DOJ batch release, and the final release in February 2026. These events generated massive spikes in document volume that the algorithm correctly identifies as change-points, but they reflect government disclosure schedules, not the activities being investigated.
Flagging corpus artifacts rather than removing them preserves transparency. The 91 artifacts remain in the dataset with their is_corpus_artifact flag set to True, allowing downstream analysis to include or exclude them as appropriate (PAPER TRAIL Project, 2026a). The 519 non-artifact breakpoints represent the analytical core: temporal shifts in document activity that correspond to arrests, filings, regulatory actions, and other real-world events.
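In practice the flag makes include-versus-exclude a one-line filter. A sketch with illustrative records -- only the is_corpus_artifact field name comes from the dataset above; the entities and dates are invented for the example:

```python
# Illustrative breakpoint records; only the flag name is from the dataset
breakpoints = [
    {"entity_id": 17, "date": "2025-11-20", "is_corpus_artifact": True},
    {"entity_id": 17, "date": "2019-07-08", "is_corpus_artifact": False},
    {"entity_id": 42, "date": "2026-02-14", "is_corpus_artifact": False},
]

# Analytical core: breakpoints tied to real-world events
core = [b for b in breakpoints if not b["is_corpus_artifact"]]

# Full set, artifacts included, e.g. for studying disclosure schedules
full = breakpoints
```

Because nothing is deleted, either view can be reconstructed from the same file at any time.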
What Changed Downstream
The reduction from 889 to 610 breakpoints propagates through the synthesis engine. Temporal alignment is one of the eight pipeline stages in the compound error calculation, with an error rate of 74.9% -- reflecting the proportion of breakpoints without a contextual event match (PAPER TRAIL Project, 2026d). Tighter breakpoint selection reduces this rate, though it remains the largest single contributor to compound error because temporal alignment is inherently uncertain: a breakpoint can be real without corresponding to any event in the calibration set.
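A standard way to combine per-stage error rates into a compound figure is to assume independent stages and multiply the survival rates; whether the error disclosure matrix uses this exact rule is not stated here. In the sketch below, only the 74.9% temporal-alignment rate comes from the report -- the other seven stage rates are placeholders:

```python
import math

# Eight pipeline stages; only temporal_alignment's rate is from the report,
# the rest are illustrative placeholders
stage_error = {
    "temporal_alignment": 0.749,
    "stage_2": 0.05, "stage_3": 0.05, "stage_4": 0.05,
    "stage_5": 0.05, "stage_6": 0.05, "stage_7": 0.05, "stage_8": 0.05,
}

# Compound error: probability at least one stage errs, assuming independence
compound = 1.0 - math.prod(1.0 - e for e in stage_error.values())
```

With these placeholders the compound rate is about 0.825, which shows why the largest single stage dominates: even perfect downstream stages cannot pull the compound error below 0.749.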
The 223 entities with breakpoints feed into cross-domain profiles, where temporal shifts are correlated with wire transfers, FedEx shipments, and banking events. Fewer, higher-confidence breakpoints mean fewer false temporal correlations and more reliable cross-domain leads.
References
Haynes, K., Eckley, I. A., & Fearnhead, P. (2017). Computationally efficient changepoint detection for a range of penalties (CROPS). Journal of Computational and Graphical Statistics, 26(1), 134-143. https://doi.org/10.1080/10618600.2015.1116445
Killick, R., Fearnhead, P., & Eckley, I. A. (2012). Optimal detection of changepoints with a linear computational cost. Journal of the American Statistical Association, 107(500), 1590-1598. https://doi.org/10.1080/01621459.2012.737745
PAPER TRAIL Project. (2026a). PELT temporal calibration diagnostic [Data set]. _exports/temporal/pelt_diagnostic.csv, curated_changepoints_summary.csv, curated_changepoints_by_entity.csv
PAPER TRAIL Project. (2026b). Calibration timeline: 50 verified anchor dates [Data set]. research/CALIBRATION_TIMELINE.md
PAPER TRAIL Project. (2026c). PELT change-point detection results [Data set]. _exports/temporal/changepoints_summary.csv
PAPER TRAIL Project. (2026d). Error disclosure matrix [Data set]. _exports/synthesis/error_disclosure_matrix.csv
PAPER TRAIL Project. (2026e). Script 32: PELT temporal calibration diagnostic [Software]. app/scripts/32_pelt_diagnostic.py
This investigation is part of the SubThesis accountability journalism network.