‘It is incredible’: How AI is transforming mathematics
Liam Price has no formal training in mathematics and has yet to attend university, but last month, he managed to break new ground in mathematical research — with the help of ChatGPT.

From his home in southwest England, Price got the popular artificial-intelligence tool to solve what is known as Erdős problem #1196, one of more than 1,000 puzzles that Hungarian mathematician Paul Erdős (1913–1996) collected throughout his life. Unlike other AI-generated solutions to mathematical problems, this one used a strategy that surprised specialists (B. Alexeev et al. Preprint at arXiv https://doi.org/q6p7; 2026).

Posting on the social-media site X, mathematician Jared Duker Lichtman at Stanford University in California drew an analogy with chess. It was, he wrote, as if AI had discovered an opening no one had thought of before because of “human aesthetics and convention”.

This is one of the more remarkable examples in a string of successes for AI in mathematics. Researchers in academia and at AI companies have been making a major push to see how far the systems can go. Computers are now contributing not just brute-force calculations, but also the type of logically sound reasoning that has been the province of mathematicians since Euclid more than 2,300 years ago.

In many cases, advances have come from systems that are based on general-purpose large language models (LLMs), such as GPT, Gemini and Claude, without any special mathematical training.
And — as with many areas of AI — the progress has been astoundingly fast.

The systems are still mostly rehashing techniques they absorbed from the existing literature, and that was the case with some of the solutions to other Erdős problems that Price first achieved with his collaborator, Kevin Barreto, a mathematics undergraduate student at the University of Cambridge, UK.

Artificial intelligence has proposed an unusual solution to a puzzle posed by Hungarian mathematician Paul Erdős. Credit: George Csicsery

But in cases such as Erdős problem #1196, mathematicians have started to spot glimpses of original ‘thought’ in the models’ outputs — with the tools making surprising connections between subfields. “It is incredible,” says Sébastien Bubeck, a mathematician at OpenAI in San Francisco, California. “A year ago, people thought maybe there would be some fundamental obstruction — that LLMs could never go beyond their training data.”

Bubeck and others now think that it is only a matter of time before AI autonomously makes contributions at the level of the greatest mathematicians — and beyond. “I hope that perhaps by 2030, AI and mathematicians can jointly win a Fields Medal,” says Thang Luong, who heads the Superhuman Reasoning team at Google DeepMind in Mountain View, California.

Innovative approaches

Erdős posed problem #1196 in 1966, and it concerns ‘primitive’ sets of whole numbers — meaning that none of the numbers evenly divides any of the others. (The prime numbers are the prototypical example of a primitive set.)

According to several commenters on various platforms, those who had tried solving problem #1196 had used the language of probability theory, so their efforts began by rephrasing the problem that way.
GPT instead solved the problem in the original language in which it was formulated, and yet its solution implicitly established a link between numbers and probability, says Terence Tao, a mathematician at the University of California, Los Angeles.

Daniel Litt, a mathematician at the University of Toronto, Canada, says the result is “reasonably interesting”, unlike previous examples in recent months of AI solutions to Erdős problems. He is relatively unimpressed by the results AI has achieved so far — and critical of the hype surrounding them. But Litt says that when it comes to future potential, it’s the sceptics who have it wrong.

In fact, he says that he is puzzled that the AI systems are not already making big discoveries. Their knowledge of existing mathematics is superhuman and they have shown strong reasoning capabilities. Plus, they don’t get tired or demotivated.

“Part of the mystery is, we don’t know what makes a human mathematician good at math,” Litt says, adding that it is unclear whether humans have some “secret sauce” that makes them uniquely creative.

Proof positive

As in many areas of AI, scaling up — particularly by adding computing power — and improving the efficiency of the algorithms will continue to make the models more powerful. One of the main limitations of AI-produced mathematics is that current models can produce proofs that are at most three or four pages long. Models tested internally at Google can already do better, says Luong, and might reach ten pages soon.

“One hundred is not within their capabilities right now, but we are working towards there, and we see improvements,” Luong says — but that will be a mixed blessing, he adds. Already, human referees are stretched thin when it comes to evaluating the correctness of human-written mathematics papers, and scores of AI-generated ones are making the problem worse.
“AI models can be capable of producing something that looks pretty convincing, and it takes a lot of time to figure out if there is a mistake,” says Lauren Williams, a mathematician at Harvard University in Cambridge, Massachusetts.

Like many researchers in every discipline, she is worried about the proliferation of ‘AI slop’. “You can find several editors at math journals who can tell you horror stories,” Williams says.