Photos pulled from somewhere else. An age mirrored down. Slang practiced. A favorite game picked to share. The conversation hasn't started — the strategy already has.
Our solution to detecting online grooming patterns.
Hash matching catches known CSAM imagery.
Image classifiers catch novel imagery.
Text classifiers catch harmful messages.
But no single model sees the whole conversation that produces them.
That's where most grooming actually happens, and it is the gap Polycreek set out to close.
The scale of the problem
How grooming actually works
Two decades of academic research has converged on a recognizable lifecycle. Aletheia models eight phases as classification targets — derived from O'Connell (2003), Olson et al. (2007), Black et al. (2015), and Winters & Jeglic (2017), with the operational synthesis drawn from Polycreek's internal grooming guidelines (proprietary).
A thirteen-year-old's profile, public. Minecraft clips uploaded after eleven on a school night. A bio that says "bored, dm me." The account that messages first lists itself as the same age, the same game, the same lonely.
In the literature, this phase is near-pathognomonic — almost uniquely diagnostic. A normal stranger does not ask these questions in this order.
Why text detection has stayed hard
Most published research trains on PAN12 — about 1,200 conversations collected in 2012. Grooming hasn't stood still for fourteen years.
2.5 million conversations from 20 distinct sources, refreshed against current platform behavior.
The overwhelming majority of grooming-detection research is English-only. Online exploitation is global.
10+ languages in training, with multilingual transfer across the conversation encoder.
Lexical-feature classifiers are defeated by euphemism, code language, and offenders who have read the same literature.
Deep transformer features over the whole dialogue — pattern, not vocabulary. Adversarial-augmented training included.
A 1% false-positive rate on a billion daily messages produces ten million false flags a day. Accuracy is the wrong metric.
Calibrated five-tier risk output (Safe → Critical) tied to operational escalation policy, not a single threshold.
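The false-positive arithmetic above can be checked in a few lines. This is a minimal sketch: the function names are illustrative, and the prevalence and recall figures in the second example are hypothetical values chosen only to show the base-rate effect, not Polycreek's numbers.

```python
def daily_false_flags(daily_items: int, false_positive_rate: float) -> int:
    """Benign items wrongly flagged per day, treating nearly all traffic as benign."""
    return round(daily_items * false_positive_rate)


def precision(prevalence: float, recall: float, false_positive_rate: float) -> float:
    """Share of flagged items that are truly harmful, via Bayes' rule."""
    true_pos = prevalence * recall
    false_pos = (1.0 - prevalence) * false_positive_rate
    return true_pos / (true_pos + false_pos)


# The figure from the text: 1% of one billion items.
flags = daily_false_flags(1_000_000_000, 0.01)
print(f"{flags:,}")  # 10,000,000

# Hypothetical assumption: 1 in 10,000 items is harmful and recall is 95%.
p = precision(prevalence=1e-4, recall=0.95, false_positive_rate=0.01)
print(f"{p:.2%}")  # under 1% of flags are true positives
```

Under those assumed inputs, fewer than one flag in a hundred is a true positive, which is why a single accuracy number or a single threshold says little about operational load.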
Existing work analyzes text in isolation. The behavioral and structural context that makes a conversation distinguishable sits unused.
Structural-flag inputs alongside text. Conversation rhythm, platform-migration cues, and risk-probe sequences are first-class features.
Aletheia surfaces a risk score, an offender attribution, and an eight-phase tag. A trained analyst makes the final call before any NCMEC report leaves the door.
How Aletheia works
Single-turn classifiers cannot capture the cross-turn patterns the literature identifies as discriminative. Aletheia is built to.
A grooming dialogue can run hundreds of turns over weeks. Standard transformer encoders have a fixed input window, typically 512 tokens for BERT-class models and up to 8,192 for newer long-context encoders. The whole conversation does not fit.
Aletheia's solution: tokenize the conversation into overlapping segments, then handle the cross-segment structure as a separate problem. When a conversation exceeds 64 segments, the model keeps a quarter of that budget for the opening segments and three-quarters for the most recent ones, dropping the middle so that both opening and recent behavior are preserved.
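The segmentation step can be sketched roughly as follows. This is an illustration under stated assumptions, not Aletheia's implementation: the window of 512 tokens, the stride of 384, and the function names are all hypothetical, and the head/tail split reads the truncation rule as a quarter and three-quarters of the 64-segment budget (16 opening segments, 48 most recent).

```python
from typing import List


def split_into_segments(token_ids: List[int], window: int = 512,
                        stride: int = 384) -> List[List[int]]:
    """Slice a token sequence into overlapping windows (overlap = window - stride)."""
    if not 0 < stride <= window:
        raise ValueError("stride must be in (0, window]")
    if not token_ids:
        return []
    segments = []
    start = 0
    while True:
        segments.append(token_ids[start:start + window])
        if start + window >= len(token_ids):
            break  # this window already reaches the end of the conversation
        start += stride
    return segments


def cap_segments(segments: List[List[int]], max_segments: int = 64,
                 head_fraction: float = 0.25) -> List[List[int]]:
    """Keep the opening and the most recent segments; drop the middle."""
    if len(segments) <= max_segments:
        return segments
    head = int(max_segments * head_fraction)  # assumed: 16 of 64
    tail = max_segments - head                # assumed: 48 of 64
    return segments[:head] + segments[len(segments) - tail:]


tokens = list(range(1000))          # stand-in for real token ids
segs = split_into_segments(tokens)  # 3 segments, 128 tokens of overlap
kept = cap_segments([[i] for i in range(100)])
print(len(segs), len(kept))  # 3 64
```

The overlap means a message that straddles a window boundary still appears whole in at least one segment, at the cost of encoding some tokens twice.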
A shared pretrained transformer (DeBERTa or ModernBERT class) processes each segment independently. CLS pooling produces one hidden vector per segment.
Learned segment-position embeddings preserve the order. The aggregator that comes next will use that order to model phase transitions across the conversation.
A two-layer transformer encoder, eight heads, attends over the whole sequence of segment embeddings. Attention pooling collapses the sequence into a single conversation-level representation.
This is where the model learns that the discriminative signal lives in co-occurrence and sequencing across the dialogue, not in any single utterance.
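To make the pooling step concrete, here is a minimal pure-Python sketch of attention pooling over segment vectors. It assumes a single learned query vector scored by scaled dot product, which is one common formulation; the source does not specify the exact form, and the two-layer encoder that precedes pooling is omitted here. The names `attention_pool` and `query` are illustrative.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(segment_vectors: List[Vector], query: Vector) -> Vector:
    """Collapse per-segment vectors into one vector.

    Each segment is scored by a scaled dot product with the query,
    the scores are normalized with softmax, and the result is the
    weighted sum of the segment vectors.
    """
    dim = len(query)
    scores = [
        sum(q * x for q, x in zip(query, vec)) / math.sqrt(dim)
        for vec in segment_vectors
    ]
    weights = softmax(scores)
    return [
        sum(w * vec[i] for w, vec in zip(weights, segment_vectors))
        for i in range(dim)
    ]


segments = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
uniform = attention_pool(segments, query=[0.0, 0.0])    # zero query: plain mean
focused = attention_pool(segments, query=[10.0, -10.0])  # dominated by the first segment
```

With a zero query every segment gets equal weight, so the result is the plain average; a trained query instead shifts weight toward the segments that carry the signal.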
A binary "is this harmful" output is not enough. Any flagged conversation needs to surface which participant is the predatory party so an analyst can review. Aletheia trains a second softmax head that emits user1 / user2 / neither.
Loss is a weighted sum: 0.6 harmful classification, 0.4 offender attribution. The harmful score is bucketed into five operational risk levels (Safe, Low, Medium, High, Critical) tied to escalation policy.
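The weighted loss and the five-level bucketing can be sketched as below. The 0.6 / 0.4 weights and the level names come from the text; the use of cross-entropy for both heads, the threshold values, and the function names are assumptions made for illustration. Real cut points would come from calibration and escalation policy.

```python
import bisect
import math
from typing import Sequence

RISK_LEVELS = ["Safe", "Low", "Medium", "High", "Critical"]
# Hypothetical cut points, evenly spaced for illustration only.
THRESHOLDS = [0.2, 0.4, 0.6, 0.8]


def cross_entropy(probs: Sequence[float], target: int) -> float:
    """Negative log-probability assigned to the correct class."""
    return -math.log(max(probs[target], 1e-12))


def combined_loss(harm_probs: Sequence[float], harm_target: int,
                  attr_probs: Sequence[float], attr_target: int,
                  w_harm: float = 0.6, w_attr: float = 0.4) -> float:
    """Weighted sum of the harmful-classification and offender-attribution losses."""
    return (w_harm * cross_entropy(harm_probs, harm_target)
            + w_attr * cross_entropy(attr_probs, attr_target))


def risk_level(harm_score: float) -> str:
    """Map a harmful score in [0, 1] to one of the five risk levels."""
    return RISK_LEVELS[bisect.bisect_right(THRESHOLDS, harm_score)]


# harm head: [benign, harmful]; attribution head: [user1, user2, neither]
loss = combined_loss([0.5, 0.5], 1, [1 / 3, 1 / 3, 1 / 3], 0)
print(risk_level(0.05), risk_level(0.95))  # Safe Critical
```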
What the numbers say
PAN12 academic baseline: F1 of 0.85 to 0.90.
On a corpus three orders of magnitude larger: F1 of 0.92 and AUC of 0.99 on an 81K-conversation held-out set.
The training corpus
The academic literature has spent more than a decade overfitting to PAN12. Aletheia is trained against a corpus more than three orders of magnitude larger and substantially more diverse.
Closing the gap
| | The status quo | Aletheia |
|---|---|---|
| Training data | ~1,200 conversations (PAN12) | 2,523,202 across 20 sources |
| Languages | English only | 10+ with multilingual transfer |
| Outputs | Binary harmful flag | Risk score, offender attribution, eight-phase tagging |
| Conversation length | Truncated at single-encoder window | Hierarchical, arbitrary length |
| Validation | F1 0.85–0.90 on PAN12 | F1 0.92, AUC 0.99 on 81K-conversation held-out set |
| Delivery | Closed academic artifact | Nonprofit-priced API and licensed deployment |
The conversation is where most grooming actually happens. Aletheia is one piece of the work needed to make it visible to the systems that already protect children from everything else. The only thing worse than the gap that exists today is the assumption that someone else will close it.
Aletheia is delivered as a paid API and licensed deployment.
Subscription revenue funds continued development, training data, the research staff who build the model, and the infrastructure that runs it. There are no shareholders. The model exists to do the work; the revenue exists to keep doing the work.