Compare model intelligence, task cost, speed, and value relative to the market curve.
Log-scale cost with the market curve and value frontier.
Average score gain associated with a 10x increase in cost.
Displayed as the market band around the curve.
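As a rough illustration of the 10x slope mentioned above, the sketch below computes the score gain per tenfold cost increase from two points on a log-scale cost axis. The function name `score_gain_per_10x` is hypothetical. The two sample points come from rows #2 and #1 of the table, on the assumption that the sixth column holds the curve's expected score. The page does not describe its own fitting procedure, which is probably more involved than a two-point slope.

```python
import math


def score_gain_per_10x(cost_a, score_a, cost_b, score_b):
    """Slope of the line through two (cost, score) points on a log10 cost axis."""
    decades = math.log10(cost_b) - math.log10(cost_a)
    if decades == 0:
        raise ValueError("the two costs must differ")
    return (score_b - score_a) / decades


# Assumption: (cost, expected score) pairs read from rows #2 and #1 of the table.
slope = score_gain_per_10x(0.032, 25.7, 0.826, 48.1)
print(f"{slope:.1f} points per 10x cost")
```

With these two points the slope comes out at roughly 16 points per 10x cost. That figure is an estimate back-solved from rounded table values, not one published on the page.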
GPT-5.5 (xhigh), ranked #1: 6.7 surplus at $0.826
Sorted by the active Performance to Value preference unless you choose a column.
| Rank | Model | Provider | Score | Cost | Expected score | Surplus | Multiple | Speed | Time | Blend score | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| #1 | GPT-5.5 (xhigh) | OpenAI | 54.8 | $0.826 | 48.1 | 6.7 | 2.6x | 58 | 90.0s | 2.15 | |
| #2 | MiMo-V2.5-Pro | Xiaomi | 42.2 | $0.032 | 25.7 | 16.5 | 11x | 45 | 2.7s | 1.94 | |
| #3 | Gemini 3.5 Flash (high) | | 50.2 | $0.681 | 46.8 | 3.4 | 1.6x | 161 | 17.5s | 1.78 | |
| #4 | MiniMax-M3 | MiniMax | 44.4 | $0.157 | 36.6 | 7.8 | 3.1x | 63 | 2.6s | 1.67 | |
| #5 | Kimi K2.6 | Kimi | 42.8 | $0.315 | 41.4 | 1.4 | 1.2x | 44 | 2.5s | 1.32 | |
| #6 | MiniMax-M2.7 | MiniMax | 38.1 | $0.074 | 31.4 | 6.7 | 2.6x | 44 | 2.1s | 1.31 | |
| #7 | GLM-5.1 (Reasoning) | Z AI | 40.2 | $0.240 | 39.6 | 0.6 | 1.1x | 81 | 1.5s | 1.15 | |
| #8 | Nemotron 3 Ultra 550B A55B (Reasoning) | NVIDIA | 37.8 | $0.245 | 39.7 | -1.9 | 0.8x | 171 | 1.1s | 0.92 | |
| #9 | Qwen3.5 397B A17B (Reasoning) | Alibaba | 33.7 | $0.333 | 41.8 | -8.1 | 0.3x | 51 | 2.7s | 0.45 | |
| #10 | Qwen3.5 122B A10B (Reasoning) | Alibaba | 32.3 | $0.241 | 39.6 | -7.3 | 0.3x | 135 | 2.5s | 0.41 | |
| #11 | Gemini 3.1 Flash-Lite | | 25.0 | $0.043 | 27.7 | -2.7 | 0.7x | 284 | 5.4s | 0.25 | |
| #12 | gpt-oss-20B (high) | OpenAI | 14.9 | $0.018 | 21.6 | -6.7 | 0.4x | 215 | 0.8s | -0.44 | |
| #13 | Nova 2.0 Pro Preview (medium) | Amazon | 21.8 | $0.173 | 37.3 | -15.5 | 0.1x | 128 | 13.1s | -0.47 | |
Value Frontier compares each model's actual score with the robust score expected at the same log-cost level; the difference is the model's value surplus. Models are then ranked by a blend of performance and that surplus.
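The comparison can be sketched in a few lines, assuming the market curve is a straight line in log10 cost. `INTERCEPT` and `SLOPE` below are assumptions back-solved from the rounded table values, not parameters published on the page, and the helper names are hypothetical. The robust fitting step is not shown. `cost_multiple` is one reading of the "x" column that appears consistent with the table. The ranking blend is omitted because the page does not specify its formula.

```python
import math

# Assumed curve parameters, back-solved from the table; not published values.
INTERCEPT = 49.4  # expected score at a cost of $1
SLOPE = 15.9      # points gained per 10x increase in cost


def expected_score(cost):
    """Score the assumed log-linear market curve predicts at this cost."""
    return INTERCEPT + SLOPE * math.log10(cost)


def surplus(score, cost):
    """Actual score minus the curve's expected score at the same cost."""
    return score - expected_score(cost)


def cost_multiple(score, cost):
    """Curve cost for this score, as a multiple of the model's actual cost."""
    return 10 ** (surplus(score, cost) / SLOPE)


# (model, score, cost) triples copied from the table.
models = [
    ("GPT-5.5 (xhigh)", 54.8, 0.826),
    ("MiMo-V2.5-Pro", 42.2, 0.032),
    ("Nova 2.0 Pro Preview (medium)", 21.8, 0.173),
]
for name, score, cost in models:
    print(f"{name}: expected {expected_score(cost):.1f}, "
          f"surplus {surplus(score, cost):+.1f}, "
          f"multiple {cost_multiple(score, cost):.1f}x")
```

Under these assumptions the three sample rows agree with the table's expected score, surplus, and multiple to within rounding. A positive surplus places a model above the curve and a negative one below it.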