How model performance grew exponentially from 2017 to 2025: from the Transformer to o3 across all major benchmarks
| Model | Release | Parameters | MMLU | ARC | Math | Notable feature |
|---|---|---|---|---|---|---|
| Transformer | 2017 | - | - | - | - | Architectural foundation |
| BERT | 2018 | 340M | 77.3% | 64.6% | - | Encoder-Only |
| GPT-3 175B | 2020 | 175B | 54.9% | 51.4% | 2% | In-Context Learning |
| LLaMA 2 70B | 2023 | 70B | 63.9% | 68.2% | 28.7% | Open-Source |
| GPT-4 | 2023 | ~1.8T | 86.4% | 92.3% | 49.9% | MoE, Multimodal |
| Claude 3.5 | 2024 | ~175B | 88.3% | 94.2% | 58% | Constitutional AI |
| Llama 3.1 405B | 2024 | 405B | 85.9% | 92.3% | 53.3% | Dense, Open |
| o3 | Apr 2025 | ? | 92.3% | 96.1% | 96.4% | Test-Time Compute |
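
To make the trend the table describes easier to see, here is a minimal Python sketch (not from the source) that plots the MMLU column against release year. It assumes matplotlib is installed, copies the scores verbatim from the table above, and omits rows without an MMLU score.

```python
# Illustrative sketch: visualize the MMLU column of the table above.
# Data points are copied from the table; rows without a score are skipped.
import matplotlib.pyplot as plt

# (release year, model, MMLU score in %)
mmlu = [
    (2018, "BERT", 77.3),
    (2020, "GPT-3 175B", 54.9),
    (2023, "LLaMA 2 70B", 63.9),
    (2023, "GPT-4", 86.4),
    (2024, "Claude 3.5", 88.3),
    (2024, "Llama 3.1 405B", 85.9),
    (2025, "o3", 92.3),
]

years = [year for year, _, _ in mmlu]
scores = [score for _, _, score in mmlu]

fig, ax = plt.subplots(figsize=(7, 4))
ax.scatter(years, scores)
# Label each point with its model name, offset slightly for readability
for year, model, score in mmlu:
    ax.annotate(model, (year, score), textcoords="offset points", xytext=(5, 5))
ax.set_xlabel("Release year")
ax.set_ylabel("MMLU (%)")
ax.set_ylim(0, 100)
ax.set_title("MMLU scores over time (data from the table above)")
plt.tight_layout()
plt.show()
```

Note that because MMLU is capped at 100%, the curve necessarily flattens near the top; the recent gains show up as saturation of the benchmark rather than as an unbounded exponential.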