2nth.hw Reports

2026-05-16
Best Model: gemma4:e2b (42/42 passed, 38.9s avg)
Arena Tests: 194 (188 passed across 5 models)
Showcase: 27 deliverables from 12 agents
Skills Audited: 256 (avg 5.0/10 completeness)
Knowledge: 36 domain articles

Model Arena 5 models

#  Model           Pass Rate      Latency  Avg Tokens
1  gemma4:e2b      100% (42/42)   38.9s    2085
2  gemma4:e4b      100% (38/38)   73.7s    2282
3  qwen3:8b        95% (36/38)    87.8s    2011
4  deepseek-r1:8b  95% (36/38)    96.8s    2139
5  qwen3:14b       95% (36/38)    156.8s   2065

Workload Runs 13 completed

Workload      Status     Tasks    Failed  Date
showcase      completed  26/52    0       2026-05-16 05:14:42
knowledge     completed  17/16    0       2026-05-15 20:03:25
brief-decomp  completed  14/60    0       2026-05-15 17:20:00
skill-audit   completed  255/300  0       2026-05-15 15:44:27
showcase      completed  26/52    0       2026-05-15 14:32:11
arena         completed  95/200   0       2026-05-15 11:28:34
knowledge     completed  17/16    0       2026-05-15 04:21:42
brief-decomp  completed  14/60    0       2026-05-15 01:39:45
skill-audit   completed  255/300  0       2026-05-15 00:04:26
showcase      completed  26/52    0       2026-05-14 22:52:18