Predator Hunters

Child Safety Benchmark

Predator Hunters — AI Model Evaluation

Model Comparison

Suites (identical for every model): stranger-meeting, health-risk, nsfw, grooming-real, grooming

Model            Overall Score  Correct Alerts  Threats Caught  Confidence  False Alarm Rate  Latency  Cost
grok-4.1         79.9%          86.7%           80.3%           0.740       14.1%             1310ms   $4.7531
grok-3           59.5%          66.6%           59.4%           0.748       14.2%             1841ms   $3.8808
claude-opus-4.6  42.2%          51.3%           36.3%           0.709       6.2%              3043ms   $8.4057
gemini-3-pro     0.0%           0.0%            0.0%            0.718       0.0%              5255ms   $3.1661
gemini-2.5-pro   0.0%           0.0%            0.0%            0.718       0.0%              3319ms   $2.1123
gpt-5            0.0%           0.0%            0.0%            0.718       0.0%              484ms    $2.5889
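
For readers who want to work with these figures programmatically, the following is a minimal Python sketch that transcribes the comparison rows and ranks the models by Overall Score. The `Result` class, its field names, and the helper functions are hypothetical and not part of the benchmark itself. Rows reporting 0.0% on all three score columns are kept separate, since the page does not say whether they reflect genuine results or runs that did not complete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """One row of the Model Comparison table (field names are illustrative)."""
    model: str
    overall: float         # Overall Score, in percent
    correct_alerts: float  # Correct Alerts, in percent
    threats_caught: float  # Threats Caught, in percent
    confidence: float      # Confidence, as reported
    false_alarm: float     # False Alarm Rate, in percent
    latency_ms: int        # Latency, in milliseconds
    cost: float            # Cost, as reported (currency unit assumed from the "$" sign)


# Values transcribed from the table above.
RESULTS = [
    Result("grok-4.1", 79.9, 86.7, 80.3, 0.740, 14.1, 1310, 4.7531),
    Result("grok-3", 59.5, 66.6, 59.4, 0.748, 14.2, 1841, 3.8808),
    Result("claude-opus-4.6", 42.2, 51.3, 36.3, 0.709, 6.2, 3043, 8.4057),
    Result("gemini-3-pro", 0.0, 0.0, 0.0, 0.718, 0.0, 5255, 3.1661),
    Result("gemini-2.5-pro", 0.0, 0.0, 0.0, 0.718, 0.0, 3319, 2.1123),
    Result("gpt-5", 0.0, 0.0, 0.0, 0.718, 0.0, 484, 2.5889),
]


def is_all_zero(r: Result) -> bool:
    """True when every score column for the row is 0.0%."""
    return r.overall == 0.0 and r.correct_alerts == 0.0 and r.threats_caught == 0.0


def split_zero_rows(results):
    """Return (scored, zero): rows with any non-zero score, and all-zero rows."""
    scored = [r for r in results if not is_all_zero(r)]
    zero = [r for r in results if is_all_zero(r)]
    return scored, zero


def rank_by_overall(results):
    """Sort rows by Overall Score, highest first."""
    return sorted(results, key=lambda r: r.overall, reverse=True)


if __name__ == "__main__":
    scored, zero = split_zero_rows(RESULTS)
    for r in rank_by_overall(scored):
        print(f"{r.model:<16} {r.overall:5.1f}%  {r.latency_ms:5d}ms  ${r.cost:.4f}")
    print("All-zero rows:", ", ".join(r.model for r in zero))
```

Separating the all-zero rows keeps them from being read as a ranking among themselves; how to interpret them is left to the reader.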

Category Score Comparison

Cost vs Latency