Coming up — May 11–13, 2026
Benchmarks in Leipzig Challenge
A new public benchmark dataset of research-level mathematics problems,
written by mathematicians to test the limits of frontier AI models.
Hosted at the Max Planck Institute for Mathematics in the Sciences.
Organized by Veronica Calvo Cortes (MPI MiS), Christian Stump (Ruhr-Universität Bochum), and Bernd Sturmfels (MPI MiS).
35+
research mathematicians
on-site in Leipzig and remotely
3 days
at the MPI MiS
May 11–13, 2026, Leipzig
1 dataset
released publicly
for AI labs to benchmark their models
Each problem is written by a researcher in their area of expertise.
Problems are designed to demand multi-step and domain-specific reasoning.
Public release — May 15, 2026
The full dataset will be released publicly on May 15, 2026 for benchmarking frontier models.
AI labs can contact us for early access or to coordinate evaluation runs.
Sample problems from previous benchmarks
Discrete Geometry
How many distinct combinatorial types of 3-dimensional polytopes can you obtain by intersecting the 4-dimensional cube with an affine hyperplane?
Number Theory
Analysis
For $n$ a positive integer define $V(n)$ to be the integer obtained by
using the base 10 digits of $n$ in base 11. I want to evaluate the
series
$\sum_{p \text{ prime}} \frac{1}{V(p)}$ to an accuracy of $10^{-5}$.
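To make the definition of $V(n)$ concrete, here is a minimal Python sketch that reinterprets the decimal digits of $n$ in base 11 and forms a truncated sum over primes. The helper names `V`, `primes_up_to`, and `partial_sum` are illustrative and not part of the benchmark. This is not a solution: the series converges slowly, so a plain truncation is not expected to reach the requested accuracy without a separate estimate of the tail.

```python
def V(n: int) -> int:
    """Read the base-10 digit string of n as a base-11 numeral."""
    return int(str(n), 11)


def primes_up_to(limit: int) -> list[int]:
    """Return all primes <= limit using a sieve of Eratosthenes."""
    if limit < 2:
        return []
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            # Mark every multiple of i, starting at i*i, as composite.
            count = len(range(i * i, limit + 1, i))
            sieve[i * i :: i] = bytearray(count)
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def partial_sum(limit: int) -> float:
    """Truncated sum of 1/V(p) over primes p <= limit (illustration only)."""
    return sum(1.0 / V(p) for p in primes_up_to(limit))


if __name__ == "__main__":
    print(V(10), V(123))       # digits "10" and "123" read in base 11
    print(partial_sum(10**6))  # a lower bound on the series, not its value
```

Because every term is positive, each truncated sum is a lower bound on the full series; the difficulty of the problem lies in controlling what the truncation leaves out.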
Algebraic Statistics
Algebraic Geometry
What is the maximum likelihood degree of the Grassmannian of lines
$\mathrm{Gr}(2,n)$ in its Plücker embedding in $\mathbb P^{\binom{n}{2}-1}$?
Algebraic Geometry
Algebraic Combinatorics
Let $X$ be the $14$-dimensional permutohedral toric variety. Compute the Chern number $c_6 c_5 c_2 c_1$, where $c_i = c_i(T_X)$.
Graph Theory
What is the number of connected, simple graphs with exactly 1 cycle, 100 vertices and all vertex degrees at most 3? To clarify, the graphs are unlabelled and not necessarily planar.