AI Safety

CaseLinker: An Open-Source System for Cross-Case Analysis of Internet Crimes Against Children Reports -- Technical Report & Initial Release

Open-source tool processes fragmented case data to identify patterns across cases, with the aim of reducing the emotional burden on investigators. It was initially tested on 47 publicly available reports.

Deep Dive

Researcher Mrinaal Ramachandran has released CaseLinker, an open-source modular system for analyzing Child Sexual Exploitation and Abuse (CSEA) case data. The system addresses a central challenge in this work: case information is fragmented across jurisdictions, agencies, and organizations, each with its own formats and level of detail. CaseLinker uses a hybrid deterministic approach that combines regex-based extraction for structured data (demographics, platforms, evidence) with pattern-based semantic analysis for severity indicators and case topics. Because both stages rely on explicit rules, the extraction stays interpretable and auditable even though the underlying case material is inherently disturbing.
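
The regex-based extraction step described above can be illustrated with a minimal sketch. The patterns, the platform list, and the `extract_fields` function are hypothetical and chosen only for illustration; they are not taken from CaseLinker's code, whose actual rules are not shown in this summary.

```python
import re

# Hypothetical patterns for illustration only; not CaseLinker's real rules.
AGE_PATTERN = re.compile(r"\b(\d{1,2})[- ]year[- ]old\b", re.IGNORECASE)
PLATFORM_PATTERN = re.compile(r"\b(Facebook|Craigslist|Skype|Kik)\b", re.IGNORECASE)


def extract_fields(text: str) -> dict:
    """Pull structured fields out of free-text case narrative with regexes."""
    # findall returns the captured group (the digits) for each match.
    ages = sorted({int(age) for age in AGE_PATTERN.findall(text)})
    # Normalize platform names to lowercase so duplicates collapse.
    platforms = sorted({name.lower() for name in PLATFORM_PATTERN.findall(text)})
    return {"ages": ages, "platforms": platforms}


if __name__ == "__main__":
    sample = "The 34-year-old suspect used Facebook and later Skype."
    print(extract_fields(sample))
```

Because every extracted value traces back to a named pattern, a reviewer can audit why a field was populated, which is the interpretability property the report emphasizes.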

CaseLinker populates a comprehensive case schema and generates six interactive visualizations: Timeline, Severity Indicators, Case Visualization, Previous Perpetrator Status, Environment/Platforms, and Organizations Involved. The system groups similar cases using weighted Jaccard similarity across multiple dimensions, including platforms, demographics, topics, severity, and investigation type. It also provides automated triage and generates insights from the collected case data. In initial testing on 47 publicly available AZICAC reports from 2011-2014, CaseLinker demonstrated effective information extraction, case clustering, and visualization capabilities.
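
One plausible reading of the weighted Jaccard grouping described above is to compute a Jaccard score per dimension and combine the scores with per-dimension weights. The sketch below assumes that reading; the weights, the field names, and the `case_similarity` function are hypothetical and are not taken from the report.

```python
# Hypothetical weights per dimension; the report's actual values are not given here.
DEFAULT_WEIGHTS = {
    "platforms": 3,
    "demographics": 2,
    "topics": 2,
    "severity": 2,
    "investigation_type": 1,
}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B|, defined as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def case_similarity(case_a: dict, case_b: dict, weights: dict = DEFAULT_WEIGHTS) -> float:
    """Weighted average of per-dimension Jaccard scores, in the range [0, 1]."""
    total_weight = sum(weights.values())
    score = sum(
        weight * jaccard(set(case_a.get(dim, ())), set(case_b.get(dim, ())))
        for dim, weight in weights.items()
    )
    return score / total_weight


if __name__ == "__main__":
    a = {"platforms": {"facebook"}, "topics": {"grooming", "travel"}}
    b = {"platforms": {"facebook"}, "topics": {"grooming"}}
    # Platforms match fully (1.0), topics half (0.5): (2*1.0 + 1*0.5) / 3
    print(case_similarity(a, b, weights={"platforms": 2, "topics": 1}))
```

Cases whose pairwise score exceeds a chosen threshold could then be placed in the same group; the threshold and the grouping procedure are further design choices this summary does not specify.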

Beyond technical functionality, CaseLinker addresses the significant emotional burden investigators face when repeatedly processing disturbing case material. By automating pattern identification and cross-case analysis, the system is intended to reduce the need for manual review of traumatic content while improving investigators' ability to detect trends and connections across cases. Because the project is open source, law enforcement agencies and researchers can adapt and extend the system for their specific needs.

Key Points
  • Hybrid extraction combines regex for structured data with semantic analysis for severity indicators and topics
  • Generates six interactive visualizations and groups cases using weighted Jaccard similarity across dimensions such as platforms, demographics, topics, severity, and investigation type
  • Tested on 47 publicly available AZICAC reports (2011-2014); aims to reduce the emotional burden on investigators processing disturbing content

Why It Matters

Helps law enforcement identify patterns in child exploitation cases while reducing trauma exposure for investigators analyzing disturbing material.