Google DeepMind unveils AI models for solving advanced mathematical problems
Google DeepMind, Google LLC’s artificial intelligence research unit, today unveiled two new AI models that are capable of advanced mathematical reasoning for solving complex math problems, which current models struggle with.
The company presented AlphaProof, a reinforcement-learning model capable of formal math reasoning, and AlphaGeometry 2, an improved version of the company’s existing geometry-solving system.
DeepMind said advanced mathematical reasoning would be required to pave the way to developing artificial general intelligence, or AGI. This is the “holy grail” of AI development that involves the production of computer systems capable of self-teaching, understanding and applying knowledge similar to the way humans do.
To test both models, DeepMind pitted them against problems from this year’s International Mathematical Olympiad. This is the oldest and most prestigious competition for young mathematicians, held annually since 1959. It consists of six exceptionally difficult questions from algebra, combinatorics, geometry and number theory.
The research team applied the two models to the six problems and together they solved four of them, achieving the same level of proficiency as a silver medalist. AlphaProof solved two algebra problems and one number theory problem, while AlphaGeometry 2 solved the geometry problem. The remaining two combinatorics problems remained unsolved.
AlphaProof works by proving mathematical statements in a formal language, Lean. It’s built using the pretrained language model AlphaZero, a reinforcement-learning algorithm, which previously taught itself to master the games of chess, shogi and Go. Formal languages allow the representation of mathematical equations in a way that can be verified for correctness, but there is a limited amount of human-written data available for AI algorithms.
In contrast, large language models using natural language instead of formal language can tend to hallucinate, providing plausible but erroneous answers even with a large amount of human-generated data.
“We established a bridge between these two complementary spheres by fine-tuning a Gemini model to automatically translate natural language problem statements into formal statements, creating a large library of formal problems of varying difficulty,” the DeepMind researchers said.
Gemini is Google’s largest and most capable large language model and the LLM behind the company’s Gemini chatbot. It can hold conversations, answer questions, perform advanced text generation, summarize text and generate code.
The researchers trained AlphaProof by having it prove and disprove millions of potential problems across a broad range of mathematical topic areas and difficulties. The same training loop was applied during the competition to self-generate variations of contest problems until a solution could be found.
AlphaGeometry 2 is also a hybrid model based on Gemini and a neuro-symbolic system that was trained from scratch with an order of magnitude more synthetic data than its predecessor AlphaGeometry.
“This helped the model tackle much more challenging geometry problems, including problems about movements of objects and equations of angles, ratio or distances,” the researchers said.
The symbolic engine within the model is also twice as fast as its predecessor, the team said. It uses a specialized knowledge-sharing mechanism to discover geometry solutions using search trees to understand complex problems.
Due to these upgrades, AlphaGeometry 2 capably solved 83% of all historical IMO geometry problems from the past 25 years, compared with a 53% solve rate achieved by the previous model. According to the researchers, the model solved Problem 4 within 19 seconds after its formalization.
The team also experimented with natural language reasoning built with Gemini. The system doesn’t require problems to be translated into formal language and thus could be combined with other AI systems. The researchers said it was tested against the problems from this year’s IMO and “the results showed great promise.”
Image: Pixabay
A message from John Furrier, co-founder of SiliconANGLE:
Your vote of support is important to us and it helps us keep the content FREE.
One click below supports our mission to provide free, deep, and relevant content.
Join our community on YouTube
Join the community that includes more than 15,000 #CubeAlumni experts, including Amazon.com CEO Andy Jassy, Dell Technologies founder and CEO Michael Dell, Intel CEO Pat Gelsinger, and many more luminaries and experts.
THANK YOU