Project Benchmarks


    model data provided by Surge AI
  1. September 1, 2025
  2. November 1, 2025
  3. November 26, 2025
  4. March 23, 2026
  5. April 11, 2026
  6. April 30, 2026

Published on September 1, 2025

100 research-level problems
37 contributing researchers
Main areas: Algebra & Combinatorics
Model Name          Model Type      Correct Answer
GPT-5               Active Model    43%
DeepSeek-V3.1       Active Model    34%
Grok-4              Active Model    34%
o3                  Active Model    32%
Gemini 2.5 Pro      Active Model    29%
DeepSeek R1         Legacy Model    27%
o3-mini             Legacy Model    22%
Gemini 2.5 Flash    Legacy Model    18%
Claude Opus 4.1     Active Model    15%
Claude Sonnet 4     Legacy Model    9%
Based on 100 submissions that stump at least 1 active model. All models were queried via the API, using the strongest available version.
Model Name          Model Type      Correct Answer
GPT-5               Active Model    35%
DeepSeek-V3.1       Active Model    26%
Grok-4              Active Model    26%
o3                  Active Model    23%
DeepSeek R1         Legacy Model    22%
Gemini 2.5 Pro      Active Model    21%
o3-mini             Legacy Model    17%
Gemini 2.5 Flash    Legacy Model    11%
Claude Opus 4.1     Active Model    8%
Claude Sonnet 4     Legacy Model    6%
Based on 80 submissions that stump at least 2 active models. All models were queried via the API, using the strongest available version.
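
The page does not say how these tallies are computed. As a rough illustration only, the "stump at least k active models" filter could be sketched as below. The data layout, function names, and toy results are all hypothetical and are not taken from the benchmark.

```python
from typing import Dict, List

# Hypothetical layout: results[submission_id][model_name] is True if the
# model answered that submission correctly. None of this is real benchmark data.
Results = Dict[str, Dict[str, bool]]


def stumped_subset(results: Results, active_models: List[str], min_stumped: int) -> List[str]:
    """Return ids of submissions that at least `min_stumped` active models got wrong."""
    kept = []
    for sub_id, per_model in results.items():
        wrong = sum(1 for m in active_models if not per_model[m])
        if wrong >= min_stumped:
            kept.append(sub_id)
    return kept


def accuracy(results: Results, subset: List[str], model: str) -> int:
    """Percentage (rounded to an integer) of `subset` that `model` answered correctly."""
    correct = sum(1 for s in subset if results[s][model])
    return round(100 * correct / len(subset))


# Toy data: models "A" and "B" are treated as active, "C" as legacy.
toy_results: Results = {
    "p1": {"A": True, "B": False, "C": True},
    "p2": {"A": False, "B": False, "C": True},
    "p3": {"A": True, "B": True, "C": True},
    "p4": {"A": False, "B": True, "C": False},
}
toy_active = ["A", "B"]

at_least_1 = stumped_subset(toy_results, toy_active, 1)  # excludes p3, which no active model missed
at_least_2 = stumped_subset(toy_results, toy_active, 2)  # stricter filter, subset of at_least_1
```

Under this reading, every submission in the stricter set also belongs to the looser one, which is consistent with the 80-submission table covering fewer problems than the 100-submission table. Legacy models are scored on the filtered set but do not count toward the filter itself.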

Sample Problems

Number Theory
How many mutually non-isomorphic extensions of degree $4$ with Galois group of order at most $8$ does the field $\mathbb{Q}_2$ of $2$-adic numbers admit?
Algebraic Geometry
The graded pieces, by codimension, of the Chow ring $A^\bullet(\mathcal{F})$ of the two-step flag variety $\mathcal{F}=\operatorname{Fl}(2,4;\mathbb{C}^6)$ are vector spaces generated by Schubert classes indexed by two subsets of $\{0,\ldots,5\}$: the first of size $2$ and the second of size $4$ containing the first. Let $X$ be a variety of class $[1,5; 1,3,4,5]$ and $Y$ be of class $[3,5; 1,3,4,5]$. (For example, the former means that lines $L\in X$ meet a $\mathbb{P}^1$, and there are no additional conditions on the projective $3$-planes containing $L$.) Suppose the intersection of $X$ and $Y$ is transversal. What is the class of the intersection $X\cap Y$, written in terms of the basis of $A^k(\mathcal{F})$ for the correct codimension $k$?
Metric Geometry
Let $B$ be the unit ball in $\mathbb{R}^{65}$ with respect to the standard Euclidean norm. What is the smallest natural number $r$ such that there exist Hermitian $r\times r$ matrices $A_0,\ldots,A_{65}$ with $B=\{p\in\mathbb{R}^{65}\mid A_0+p_1\cdot A_1+\cdots+p_{65}\cdot A_{65}\textrm{ is positive semidefinite}\}$?