Google DeepMind’s new AlphaProof Nexus framework has solved nine open Erdős problems and dozens of other mathematical conjectures using a combination of AI reasoning and formal verification tools.
The system pairs Google’s Gemini 3.1 Pro language model with the Lean proof assistant, allowing AI-generated mathematical proofs to be checked automatically, step by step, in software.
Despite the breakthrough, DeepMind CEO Demis Hassabis cautioned that the technology should not be interpreted as artificial general intelligence.
Formal Verification Changes the AI Math Process
Unlike standard chatbots, whose responses can sound convincing while still being incorrect, AlphaProof Nexus relies on formal verification.
The system generates possible proof strategies through Gemini and then validates each logical step using Lean, an open-source proof assistant widely used in mathematical research.
That verification step makes the workflow more rigorous: a proof must pass software checking rather than simply appear plausible.
DeepMind researchers described the approach as using the “power of compiler feedback in grounding LLM reasoning.”
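The generate-then-verify loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not DeepMind's implementation: `search_proof`, `toy_propose`, and `toy_check` are hypothetical names, the proposer stands in for a language model, and the checker stands in for a proof assistant such as Lean. The only idea it is meant to show is that a rejected attempt's error message is fed back into the next attempt.

```python
from typing import Callable, Optional

# Hypothetical interfaces (assumptions for illustration only):
# a proposer maps (goal, feedback so far) to a candidate proof, and
# a checker returns None on success or an error message on failure.
Proposer = Callable[[str, list[str]], str]
Checker = Callable[[str, str], Optional[str]]


def search_proof(goal: str, propose: Proposer, check: Checker,
                 max_attempts: int = 8) -> Optional[str]:
    """Return the first candidate the checker accepts, or None."""
    feedback: list[str] = []
    for _ in range(max_attempts):
        candidate = propose(goal, feedback)
        error = check(goal, candidate)
        if error is None:
            return candidate      # accepted by the checker
        feedback.append(error)    # checker feedback guides the next try
    return None


# Toy stand-ins so the sketch runs without a model or a proof assistant.
def toy_propose(goal: str, feedback: list[str]) -> str:
    attempts = ["by wrong_tactic", "by guess", "by rfl"]
    # Move to the next attempt each time the checker has rejected one.
    return attempts[min(len(feedback), len(attempts) - 1)]


def toy_check(goal: str, candidate: str) -> Optional[str]:
    return None if candidate == "by rfl" else f"rejected: {candidate}"


result = search_proof("2 + 2 = 4", toy_propose, toy_check)
```

In this toy run the first two candidates are rejected and the third is accepted. In a real system the checker would be the proof assistant itself, so a candidate counts as solved only once the software accepts it.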
The framework reportedly solved:
- 9 of 353 open Erdős problems
- 44 OEIS conjectures
- A Hilbert-functions problem in algebraic geometry
- An improved convex-optimization bound
The results were detailed in a May 21 arXiv preprint.
DeepMind Emphasizes Limits of the System
Hassabis and DeepMind researchers stressed that AlphaProof Nexus remains a highly specialized research framework rather than a general-purpose reasoning system.
The project focuses specifically on AI-assisted proof search in formal mathematics environments where answers can be objectively verified.
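For a sense of what "objectively verified" means in practice, here is a small Lean 4 illustration. It is not drawn from the DeepMind preprint or its repositories; it simply shows two statements whose proofs Lean checks mechanically, using only the core library.

```lean
-- A general statement about natural numbers, proved by citing
-- the core-library lemma `Nat.add_comm`.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A concrete numeric fact, checked by direct computation.
example : 2 + 2 = 4 := rfl
```

If the second statement were changed to `2 + 2 = 5`, Lean would reject the proof, which is the property that makes results in this setting checkable without trusting the system that produced them.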
Even a simpler baseline version of the system reportedly solved the same nine Erdős problems, though at significantly higher computational cost. That suggests the framework's central innovations are workflow efficiency and its verification loops.
DeepMind also noted that GitHub repositories containing successful proofs and accompanying natural-language explanations have been released publicly to allow outside researchers to inspect the work directly.
However, the released material does not include every failed search path or all unsuccessful proof attempts generated during the research process.
AI Math Research Competition Intensifies
The announcement arrives amid intensifying competition between major AI labs working on mathematical reasoning systems.
Earlier this month, OpenAI said one of its unreleased reasoning models independently disproved a longstanding conjecture connected to the planar unit distance problem in combinatorial geometry, another famous Erdős-linked challenge.
DeepMind previously demonstrated strong mathematical performance with AlphaProof during the 2024 International Mathematical Olympiad, where the system achieved silver-medal-level results.
The broader ecosystem also includes other formal proof systems such as Isabelle, highlighting a growing race around machine-verified reasoning rather than conventional chatbot outputs.
Verified AI Research Gains Momentum
DeepMind researchers argue that AI-driven proof search could eventually become an important research tool beyond benchmark competitions.
“AI-driven formal proof search can serve not only to solve problems but to deepen human understanding,” the paper’s authors wrote.
The work also highlights a broader trend inside AI research: moving from systems that merely generate answers toward systems capable of producing outputs that can be independently verified through external tools and formal logic frameworks.
For now, AlphaProof Nexus represents a specialized but increasingly important step toward AI systems that can participate more reliably in advanced scientific and mathematical research.