Coming up — May 11–13, 2026

Benchmarks in Leipzig Challenge

A new public benchmark dataset of research-level mathematics problems,
written by mathematicians to test the limits of frontier AI models.

Hosted at the Max Planck Institute for Mathematics in the Sciences.

Organized by Veronica Calvo Cortes (MPI MiS), Christian Stump (Ruhr-Universität Bochum), and Bernd Sturmfels (MPI MiS).

35+ research mathematicians, on-site in Leipzig and remotely
3 days at the MPI MiS, May 11–13, 2026, Leipzig
1 dataset released publicly, for AI labs to benchmark their models

Each problem is written by a researcher in their area of expertise.
Problems are designed to demand multi-step and domain-specific reasoning.

Public release — May 15, 2026

The full dataset will be released publicly on May 15, 2026, for benchmarking frontier models.
AI labs can contact us for early access or to coordinate evaluation runs.

Sample problems from previous benchmarks

Discrete Geometry
How many distinct combinatorial types of 3-dimensional polytopes can you obtain by intersecting the 4-dimensional cube with an affine hyperplane?
Number Theory, Analysis
For $n$ a positive integer define $V(n)$ to be the integer obtained by using the base 10 digits of $n$ in base 11. I want to evaluate the series $\sum_{p \text{ prime}} \frac{1}{V(p)}$ to an accuracy of $10^{-5}$.
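To make the definition of $V(n)$ concrete, here is a minimal Python sketch. The names `V`, `primes_up_to`, and `partial_sum` are illustrative choices, not part of the problem statement. The partial sum over primes up to a bound is only a lower bound on the series, since every term is positive; it does not by itself reach the requested accuracy of $10^{-5}$, because the series converges slowly and the tail still has to be estimated separately.

```python
def V(n: int) -> int:
    """Read the base-10 digits of n as a base-11 numeral."""
    return int(str(n), 11)


def primes_up_to(limit: int) -> list[int]:
    """Return all primes <= limit using a sieve of Eratosthenes."""
    if limit < 2:
        return []
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            # Mark every multiple of i starting at i*i as composite.
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def partial_sum(limit: int) -> float:
    """Sum 1/V(p) over primes p <= limit (a lower bound on the full series)."""
    return sum(1.0 / V(p) for p in primes_up_to(limit))


if __name__ == "__main__":
    print(V(13))             # digits "13" read in base 11: 1*11 + 3
    print(partial_sum(10**6))
```

For example, `V(13)` reads the digits "13" in base 11 and gives $1 \cdot 11 + 3 = 14$, while single-digit inputs are unchanged.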
Algebraic Statistics, Algebraic Geometry
What is the maximum likelihood degree of the Grassmannian of lines $\mathrm{Gr}(2,n)$ in its Plücker embedding in $\mathbb{P}^{\binom{n}{2}-1}$?
Algebraic Geometry, Algebraic Combinatorics
Let $X$ be the $14$-dimensional permutohedral toric variety. Compute the Chern number $c_6 c_5 c_2 c_1$, where $c_i = c_i(T_X)$.
Graph Theory
What is the number of connected, simple graphs with exactly 1 cycle, 100 vertices and all vertex degrees at most 3? To clarify, the graphs are unlabelled and not necessarily planar.