AI tools start taking on Olympiad and early research math tasks

In short: AI is getting good at some very hard math problems, so it is starting to overlap with work often done by top students, but it still cannot reliably check its own proofs.

What's going on

Some AI systems can now solve math problems at the level of major competitions. DeepMind said its AlphaProof and AlphaGeometry 2 systems together solved 4 of the 6 problems at the 2024 International Mathematical Olympiad, a score equivalent to a silver medal.

Researchers are also building larger test sets to see how consistent that performance really is. MIT’s MathNet collection has over 30,000 proof-based Olympiad-style problems. On a 6,400-problem benchmark drawn from MathNet, GPT-5 scored 69.3%, which means it still missed nearly 1 in 3 problems. Results got worse when problems included diagrams, which require the AI to interpret a picture, not just text.
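For readers who want to check the "nearly 1 in 3" figure, here is a minimal Python sketch of the arithmetic. It uses only the two numbers quoted above and assumes the score is a simple fraction of problems solved, with no partial credit; the benchmark's actual grading may differ. The function name `split_results` is made up for this illustration.

```python
# Figures quoted in the article above.
BENCHMARK_SIZE = 6_400   # problems in the MathNet-derived benchmark
ACCURACY = 0.693         # reported GPT-5 score

def split_results(total: int, accuracy: float) -> tuple[int, int]:
    """Return (solved, missed), assuming each problem is simply right or wrong."""
    solved = round(total * accuracy)
    return solved, total - solved

solved, missed = split_results(BENCHMARK_SIZE, ACCURACY)
miss_rate = missed / BENCHMARK_SIZE

print(f"solved: {solved}, missed: {missed}, miss rate: {miss_rate:.1%}")
```

Under that assumption the model missed roughly 1,965 of the 6,400 problems, about 30.7%, which is slightly under one in three.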

AI is also beginning to help with narrow research tasks. A Caltech-led team used an AI method designed for very long chains of reasoning to solve two families of problems linked to the Andrews–Curtis conjecture. It did not solve the main conjecture, but it closed subproblems that had been open for 25 to 44 years.

At the same time, recent tests on fresh USAMO problems found that several top models scored under 5%. Reviewers said a common failure is that models produce confident-looking proofs with gaps, instead of noticing they are wrong.

What to watch

For young mathematicians, this likely raises the bar for what counts as “impressive” in routine problem solving. Schools, contests, and hiring committees may put more weight on choosing good questions, explaining ideas clearly, and checking work carefully, especially because AI can sound sure even when it is mistaken.

Source: NYTimes

Jack Harrison
