Project Benchmarks


    model data provided by Surge AI
  1. September 1, 2025
  2. November 1, 2025
  3. November 26, 2025
  4. March 23, 2026
  5. April 11, 2026
  6. April 30, 2026

Published on April 30, 2026

120 research-level problems
51 contributing researchers
29 subfields included
Model Name         Model Type      Correct Answer
GPT-5.5            Active Model    64%
GPT-5.4            Legacy Model    58%
Claude Opus 4.7    Active Model    41%
GPT-5.2            Legacy Model    40%
Gemini 3.1 Pro     Active Model    38%
DeepSeek V4 Pro    Active Model    32%
Claude Opus 4.6    Legacy Model    27%
Gemini 3 Pro       Legacy Model    24%
DeepSeek-V3.2      Legacy Model    13%
Grok-4.1           Legacy Model    11%
Grok-4.20          Active Model    11%
Based on 120 submissions that stump at least 1 active model. All models were queried via the API, using the strongest available version.
Model Name         Model Type      Correct Answer
GPT-5.5            Active Model    53%
GPT-5.4            Legacy Model    45%
GPT-5.2            Legacy Model    29%
Claude Opus 4.7    Active Model    23%
Gemini 3.1 Pro     Active Model    19%
Claude Opus 4.6    Legacy Model    15%
DeepSeek V4 Pro    Active Model    14%
DeepSeek-V3.2      Legacy Model    9%
Gemini 3 Pro       Legacy Model    9%
Grok-4.20          Active Model    9%
Grok-4.1           Legacy Model    6%
Based on 90 submissions that stump at least 2 active models. All models were queried via the API, using the strongest available version.
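The two tables differ only in which submissions are counted: those that stump at least 1 active model, and those that stump at least 2. The sketch below shows one way such a filter and the resulting percentages could be computed. It is an illustration only: the data layout, the function names `stumping_subset` and `accuracy`, and the model and problem identifiers are all hypothetical, and the project's actual scoring pipeline is not described on this page.

```python
from typing import Dict, List

# Hypothetical layout: results[problem_id][model_name] is True when the
# model answered that problem correctly. All names here are made up.
Results = Dict[str, Dict[str, bool]]


def stumping_subset(results: Results, active_models: List[str],
                    min_stumped: int) -> List[str]:
    """Return the problems that at least `min_stumped` active models got wrong."""
    return [
        pid
        for pid, by_model in results.items()
        if sum(not by_model[m] for m in active_models) >= min_stumped
    ]


def accuracy(results: Results, problem_ids: List[str], model: str) -> float:
    """Percentage of the given problems that `model` answered correctly."""
    if not problem_ids:
        raise ValueError("no problems in the selected subset")
    correct = sum(results[pid][model] for pid in problem_ids)
    return 100.0 * correct / len(problem_ids)


# Toy data, not taken from the benchmark.
toy_results: Results = {
    "p1": {"model_a": True, "model_b": False, "legacy_c": False},
    "p2": {"model_a": False, "model_b": False, "legacy_c": True},
    "p3": {"model_a": True, "model_b": True, "legacy_c": True},
    "p4": {"model_a": True, "model_b": False, "legacy_c": False},
}
toy_active = ["model_a", "model_b"]

at_least_1 = stumping_subset(toy_results, toy_active, 1)
at_least_2 = stumping_subset(toy_results, toy_active, 2)
```

In this toy data, `p3` is dropped from both subsets because no active model misses it, and only `p2` stumps both active models. Legacy models do not affect which problems are selected in this sketch, but they are still scored on the selected subset.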

Contributing Subfields

Algebraic Combinatorics 40
Combinatorics 20
Algebra 18
Algebraic Geometry 18
Enumerative Combinatorics 12
Matroid Theory 12
Discrete Geometry 11
Homological Algebra 11
Commutative Algebra 4
Graph Theory 4
Group Theory 4
Representation Theory 4
Algebraic Statistics 3
Analysis 3
Lie Theory 3
Metric Geometry 3
Number Theory 3
Symmetric Function Theory 3
Geometry 2
Topology 2
Complex Analysis 1
Euclidean Geometry 1
Monoid Theory 1
Partial Differential Equations 1
Polytope Theory 1
Probability Theory 1
Real Algebraic Geometry 1
Theoretical Computer Science 1
Tropical Geometry 1

Sample Problems

Number Theory, Analysis
For $n$ a positive integer define $V(n)$ to be the integer obtained by using the base 10 digits of $n$ in base 11. I want to evaluate the series $\sum_{p \text{ prime}} \frac{1}{V(p)}$ to an accuracy of $10^{-5}$.
Algebraic Combinatorics
Among all possible Bruhat intervals in any Coxeter group, find an interval with the smallest number of elements whose Kazhdan-Lusztig polynomial does not equal $1$. How many cover relations does this interval have?
Algebraic Geometry, Matroid Theory
Let $L$ denote the log-canonical bundle on $\overline{M}_{0,20}$ over a field of characteristic $2$. Compute the dimension of $H^{17}(\overline{M}_{0,20}, L^{-1})$.
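
The definition of $V(n)$ in the first sample problem can be made concrete with a short sketch. The function names `V`, `primes_up_to`, and `partial_sum` are chosen here for illustration. The partial sum only shows how the series is formed: a truncated sum does not by itself establish the requested accuracy of $10^{-5}$, which also requires a bound on the tail.

```python
from typing import List


def V(n: int) -> int:
    """Read the base-10 digits of n as a base-11 numeral."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return int(str(n), 11)


def primes_up_to(limit: int) -> List[int]:
    """Sieve of Eratosthenes: all primes p with p <= limit."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(limit ** 0.5) + 1):
        if is_prime[i]:
            for j in range(i * i, limit + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def partial_sum(limit: int) -> float:
    """Sum of 1 / V(p) over primes p <= limit (a truncation, not the answer)."""
    return sum(1.0 / V(p) for p in primes_up_to(limit))
```

For example, `V(13)` reads the digits 1 and 3 in base 11, giving 1 * 11 + 3 = 14, while single-digit primes are unchanged, so `V(7)` is 7.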