Community Benchmark
Open Model Leaderboard
The definitive, data-driven ranking of today's top open-source AI models. Sort, compare, and find the best model for your needs.
Rank | Model | Developer | Params (B) | MMLU | GSM8K | HumanEval | Tags |
---|---|---|---|---|---|---|---|
1 | Llama 3.1 8B | Meta | 8.0 | 79.5 | 92.0 | 85.1 | Reasoning Power |
2 | Gemma 3n E4B | Google | 4.0 | 74.5 | 86.5 | 72.0 | Efficiency King, Multimodal |
3 | Phi-3 Medium | Microsoft | 14.0 | 78.0 | 87.3 | 80.2 | |
4 | Qwen 2 7B | Alibaba | 7.0 | 72.3 | 85.1 | 75.8 | Strong Coder |
5 | Llama 3.2 3B | Meta | 3.0 | 66.7 | 79.0 | 68.0 | |
6 | Gemma 3n E2B | Google | 2.0 | 64.3 | 78.2 | 62.5 | On-Device, Fast |
* MMLU: Massive Multitask Language Understanding (multiple-choice questions across many subjects). GSM8K: Grade School Math 8K (grade-school math word problems). HumanEval: Python code generation.
* Performance data is based on publicly available information and may vary based on quantization and implementation.
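
For readers who want to sort and compare the table offline, here is a minimal Python sketch. The rows are copied from the leaderboard above. The helper names (`sort_by`, `best_under`) and the tuple layout are illustrative assumptions, not part of any leaderboard API.

```python
# Rows copied from the leaderboard table:
# (model, params in billions, MMLU, GSM8K, HumanEval)
ROWS = [
    ("Llama 3.1 8B", 8.0, 79.5, 92.0, 85.1),
    ("Gemma 3n E4B", 4.0, 74.5, 86.5, 72.0),
    ("Phi-3 Medium", 14.0, 78.0, 87.3, 80.2),
    ("Qwen 2 7B", 7.0, 72.3, 85.1, 75.8),
    ("Llama 3.2 3B", 3.0, 66.7, 79.0, 68.0),
    ("Gemma 3n E2B", 2.0, 64.3, 78.2, 62.5),
]

# Hypothetical column keys mapped to tuple positions.
COLUMNS = {"params": 1, "mmlu": 2, "gsm8k": 3, "humaneval": 4}


def sort_by(column, descending=True):
    """Return all rows ordered by the given column."""
    idx = COLUMNS[column]
    return sorted(ROWS, key=lambda row: row[idx], reverse=descending)


def best_under(max_params_b, column):
    """Return the top-scoring row on `column` among models with
    at most `max_params_b` billion parameters, or None if none fit."""
    idx = COLUMNS[column]
    candidates = [row for row in ROWS if row[1] <= max_params_b]
    return max(candidates, key=lambda row: row[idx]) if candidates else None


if __name__ == "__main__":
    for model, params, mmlu, gsm8k, humaneval in sort_by("humaneval"):
        print(f"{model:<14} {params:>5.1f}B  HumanEval {humaneval}")
    print(best_under(4.0, "mmlu"))
```

For example, `best_under(4.0, "mmlu")` picks the highest MMLU score among models of 4B parameters or fewer, which is one way to weigh the size and accuracy trade-off the table implies. As the footnote says, published scores can shift with quantization and implementation, so treat any ordering as approximate.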