AlphaGeometry, the AI that can solve complex geometry problems

February 12, 2024

AlphaGeometry is a new AI system from Google DeepMind that solves complex geometry problems, achieving near-human performance.

It solved 25 out of 30 Olympiad-level geometry problems, compared to the human average of 25.9. This achievement is even more impressive considering the previous best AI only managed 10.

AlphaGeometry builds on other previous works by Google DeepMind and Google Research, such as FunSearch. The code and the model of AlphaGeometry are open-sourced and can be accessed here.

A neuro-symbolic approach

The system combines a pre-trained language model (GPT-f, which was trained on both synthetic and human proof examples) with a symbolic deduction engine, creating a neuro-symbolic architecture.

The neural language model can understand geometric concepts, quickly find patterns, and suggest solutions, but it cannot explain its choices. It is fast but not always trustworthy.
The symbolic deduction engine uses formal logic and clear rules to check each mathematical statement, acting as a meticulous proofreader. It is slow but reliable.

When combined, they create a better and stronger tool for problem-solving.

AlphaGeometry: overview of the model and how it solves
both a simple problem and the International Mathematical Olympiad (IMO) 2015 Problem 3 (source: paper)

Generate synthetic data for training

Instead of using the human-made proofs, AlphaGeometry built a massive dataset of 100 million problems with varying difficulty, generated by existing symbolic engines (see the next picture).

AlphaGeometry synthetic data generation process (source: paper)

One of the biggest challenges in automated geometry theorem proving is auxiliary construction (AR). This refers to the process of adding new geometric elements (such as points, lines, circles) that aren’t explicitly given in the problem but are crucial for solving it. AR is like adding missing pieces to a puzzle, requiring a deep understanding of geometric relationships.

To tackle this challenge, the researchers introduced and utilized a novel concept called dependency difference within their synthetic proof generation process. This concept empowers the method to automatically generate nearly 10 million synthetic proof steps, effectively constructing the necessary auxiliary points.

Training and pipeline

AlphaGeometry employed a two-stage training process:

Pre-training: the language model is trained on synthetic data that captures the logic and geometry of theorems. The model can infer the connections between geometric elements, statements, and logical deductions. It can also assist the theorem prover by recommending useful geometric constructions (such as points, lines, circles) that can simplify the proof.
Fine-tuning: the language model is fine-tuned to focus on auxiliary construction, which is the process of adding new geometric elements that are not given in the problem, but are crucial for solving it.

During the inference, AlphaGeometry uses the neural language model to analyze the problem and suggest geometric constructions that can simplify the proof. It then uses the symbolic engine to apply logic rules and derive new facts until it reaches the conclusion and generates a human-readable proof.

Key points

Neuro-symbolic approach: it combines a LLM for fast thinking and a symbolic engine for strict reasoning. The LLM suggests potentially constructs based on patterns and relationships. The symbolic deduction engine verifies and proves the suggested solutions using logical rules.
Training with 100 million synthetic data examples: it trains from scratch without human demonstrations, generating its own massive dataset of diverse geometric problems and proofs, thus overcoming data scarcity.
Human-readable solutions: it achieves verifiable results that resemble human reasoning. AlphaGeometry not only finds solutions to geometry problems, but also explains them in a human-readable way.
A novel technique (DD + AR) that improves the search efficiency of the symbolic engine. AR (auxiliary construction) adds new elements (like points, lines, circles) that are not part of the original problem but are essential for finding the solution. DD (deductive database) uses logic rules to infer new facts from given geometric relations.

Evaluation results

AlphaGeometry was tested against existing similar tools, including Wu’s method (state-of-the-art computer algebra approach). AlphaGeometry outperformed them, as detailed in the table below.

Main results on the IMO-AG-30 test benchmark (source: paper)

The new method successfully solved 25 out of 30 problems in a test set of recent IMO-level geometry problems, significantly outperforming previous methods (see the picture below). Its performance is close to an average IMO gold medalist.

AlphaGeometry performance (source: paper)

Unlike many AI systems, AlphaGeometry goes beyond just providing answers. It generates step-by-step proofs that anyone can understand, offering valuable insights into its problem-solving approach.

It solved all geometry problems in IMO 2000 and 2015 under human expert evaluation and even discovered a more general version of a translated IMO theorem from 2004, proving its ability to go beyond simply mimicking known solutions.

Conclusion

AlphaGeometry is a neuro-symbolic tool that can solve complex geometry problems at an expert level. It not only finds solutions but also generates human-readable proofs explaining its reasoning process.

It represents a significant step forward in automated reasoning, showing the power of combining neural networks and symbolic logic to solve difficult math problems. It avoids the need for human-given proofs by generating its own theorems and proofs with various complexity levels.

This method could have applications beyond geometry, making the AI systems learn and find new knowledge in different domains, like science and engineering.

Ongoing research aims to adapt the framework to diverse mathematical domains with different structures and reasoning methods, for example non-Euclidean geometry.