Best Model: gemma4:e2b (42/42 passed, 38.9s avg)
Arena Tests: 194 (188 passed across 5 models)
Showcase: 80 deliverables from 12 agents
Skills Audited: 256 (avg 5.0/10 completeness)
Knowledge: 36 domain articles
Model Arena (5 models)
| # | Model | Pass Rate | Avg Latency | Avg Tokens |
|---|---|---|---|---|
| 1 | gemma4:e2b | 100% (42/42) | 38.9s | 2085 |
| 2 | gemma4:e4b | 100% (38/38) | 73.7s | 2282 |
| 3 | qwen3:8b | 95% (36/38) | 87.8s | 2011 |
| 4 | deepseek-r1:8b | 95% (36/38) | 96.8s | 2139 |
| 5 | qwen3:14b | 95% (36/38) | 156.8s | 2065 |
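
The arena totals in the summary (194 tests, 188 passed) can be cross-checked against the per-model rows. The sketch below is illustrative only: the names `results` and `pass_rate` are hypothetical, and rounding to the nearest whole percent is an assumption about how the Pass Rate column is derived, not something the dashboard states.

```python
# Per-model (passed, total) counts copied from the Model Arena table.
results = {
    "gemma4:e2b": (42, 42),
    "gemma4:e4b": (38, 38),
    "qwen3:8b": (36, 38),
    "deepseek-r1:8b": (36, 38),
    "qwen3:14b": (36, 38),
}


def pass_rate(passed: int, total: int) -> int:
    """Pass rate as a whole-number percentage (assumed rounding rule)."""
    return round(100 * passed / total)


# Aggregate counts across all models.
total_passed = sum(passed for passed, _ in results.values())
total_tests = sum(total for _, total in results.values())

# Per-model percentages, for comparison with the Pass Rate column.
rates = {model: pass_rate(p, t) for model, (p, t) in results.items()}

print(f"{total_passed}/{total_tests} passed across {len(results)} models")
for model, rate in rates.items():
    print(f"{model}: {rate}%")
```

Under that rounding assumption, 36/38 (about 94.7%) displays as 95%, which matches the three lower rows.
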
Workload Runs (27 completed)
| Workload | Status | Tasks | Failed | Date |
|---|---|---|---|---|
| daily-jobs | completed | 7/10 | 0 | 2026-05-18 05:22:22 |
| daily-blog | completed | 4/5 | 0 | 2026-05-18 05:06:24 |
| daily-blog | completed | 4/5 | 0 | 2026-05-17 12:53:38 |
| daily-jobs | completed | 7/10 | 0 | 2026-05-17 12:35:58 |
| daily-jobs | completed | 7/10 | 0 | 2026-05-17 12:04:11 |
| daily-blog | completed | 4/5 | 0 | 2026-05-17 11:52:48 |
| daily-blog | completed | 4/5 | 0 | 2026-05-17 08:56:09 |
| daily-jobs | completed | 7/10 | 0 | 2026-05-16 18:00:14 |
| daily-blog | completed | 4/5 | 0 | 2026-05-16 17:43:50 |
| daily-blog | completed | 4/5 | 0 | 2026-05-16 16:58:09 |