Limitations of LLMs

Large language models (LLMs) have long been assumed to be incapable of discovering new knowledge. Testing that assumption is hard, because LLMs are prone to "hallucinations": statements that sound plausible but are not true. But what if we could harness the creativity of LLMs to find new solutions in math and science while filtering out the hallucinations? A paper published by Google DeepMind in Nature on December 14, 2023 explores this possibility.

FunSearch: Making new discoveries in mathematical sciences using Large Language Models

The "Fun" in FunSearch stands for "function": the tool searches for mathematical functions written as computer code. FunSearch pairs a pre-trained LLM, which proposes candidate programs, with an automated 'evaluator' that runs and scores each program and so filters out the 'hallucinations' in the LLM's output. The LLM currently used is Codey, a version of Google's PaLM 2 that has been fine-tuned on computer code.

The structure of FunSearch
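
The propose-and-evaluate idea can be illustrated with a toy loop. This is only a minimal sketch, not DeepMind's implementation: `propose` is a hypothetical stand-in for the LLM that picks from a fixed list of candidate programs (some deliberately broken, to play the role of hallucinations), and the task of doubling a number is an assumption chosen for brevity. The real system also feeds the best programs back into later prompts, which this sketch omits.

```python
import random

# Hypothetical stand-in for the LLM's output: bodies for a function f(x).
# Some are wrong or do not even run, mimicking hallucinations.
CANDIDATE_BODIES = [
    "return x * 2",           # correct
    "return x + x + 1",       # runs, but gives wrong answers
    "return x *",             # syntax error
    "return undefined_name",  # fails at runtime
    "return x << 1",          # correct, written differently
]

def propose(rng):
    """Return the source code of one candidate program."""
    body = rng.choice(CANDIDATE_BODIES)
    return f"def f(x):\n    {body}\n"

def evaluate(source, test_inputs):
    """Run a candidate and count correct answers; None if it does not run.

    A real system would execute untrusted code in a sandbox.
    """
    namespace = {}
    try:
        exec(source, namespace)
        f = namespace["f"]
        return sum(1 for x in test_inputs if f(x) == 2 * x)
    except Exception:
        return None

def search(iterations=50, seed=0):
    """Keep the best-scoring program among those that actually run."""
    rng = random.Random(seed)
    test_inputs = list(range(10))
    best_source, best_score = None, -1
    for _ in range(iterations):
        source = propose(rng)
        score = evaluate(source, test_inputs)
        if score is None:
            continue  # the evaluator discards programs that fail
        if score > best_score:
            best_source, best_score = source, score
    return best_source, best_score

if __name__ == "__main__":
    source, score = search()
    print(score)
    print(source)
```

The point of the design is that only programs that run and score well survive, so an incorrect suggestion from the model never reaches the final result.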

FunSearch reveals new math solutions

A surprising finding of the Google researchers was that FunSearch produced solutions that were not previously known. One problem it tackled is the "cap set problem": roughly, marking as many points as possible on a grid so that no three of them lie on a straight line. In some settings FunSearch found larger cap sets than any previously known, which the researchers describe as the largest improvement in 20 years.
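
To make the "no three points on a line" condition concrete, here is a small sketch. It assumes the standard formulation of the cap set problem, in which each coordinate is 0, 1, or 2 and arithmetic wraps around modulo 3; under that assumption, three distinct points lie on a line exactly when their coordinates sum to zero modulo 3 in every position. The function names and the simple greedy construction are illustrative only and are not the programs FunSearch discovered.

```python
from itertools import combinations, product

def is_cap_set(points):
    """Return True if no three distinct points lie on a line (mod 3)."""
    pts = set(points)
    for a, b in combinations(pts, 2):
        # The unique third point on the line through a and b.
        c = tuple((-(x + y)) % 3 for x, y in zip(a, b))
        if c in pts:
            return False
    return True

def greedy_cap_set(n):
    """Build a cap set in dimension n by adding points in a fixed order.

    Each point is kept only if the set stays free of lines.
    """
    chosen = []
    for point in product(range(3), repeat=n):
        if is_cap_set(chosen + [point]):
            chosen.append(point)
    return chosen

if __name__ == "__main__":
    cap = greedy_cap_set(2)
    print(len(cap), cap)
```

The order in which points are tried determines how large the resulting set is, so a search over better orderings is one natural way to look for larger cap sets.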

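The researchers also applied FunSearch to bin packing, the problem of fitting items of different sizes into as few bins as possible, which matters in a variety of computer science applications, from data center management to e-commerce. For this problem, FunSearch discovered more effective algorithms.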

bin packing illustration
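
As an illustration of what a bin-packing heuristic looks like in code, the sketch below packs items one at a time and uses a small `priority` function to decide which open bin receives each item. The default shown is the classic "best fit" rule, not a heuristic found by FunSearch; the function names and the integer sizes are assumptions made for this example.

```python
def best_fit_priority(item, remaining):
    """Score a bin for an item: prefer the bin left with the least free space."""
    return -(remaining - item)

def pack(items, capacity, priority=best_fit_priority):
    """Pack items in arrival order and return the number of bins used."""
    bins = []  # remaining free space in each open bin
    for item in items:
        if item > capacity:
            raise ValueError("item is larger than the bin capacity")
        # Bins that still have room for this item.
        candidates = [i for i, free in enumerate(bins) if free >= item]
        if candidates:
            best = max(candidates, key=lambda i: priority(item, bins[i]))
            bins[best] -= item
        else:
            bins.append(capacity - item)  # open a new bin
    return len(bins)

if __name__ == "__main__":
    print(pack([4, 8, 1, 4, 2, 1], capacity=10))
```

Swapping in a different `priority` function changes the packing strategy without touching the rest of the code, which is what makes a single small function a convenient target for an automated search.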

FunSearch ushers in a new era

The success of this research points to a new era in which artificial intelligence and human creativity are combined to solve more complex problems. Even pure mathematics and the sciences may be able to use LLMs to find new answers to previously unsolved problems. Mathematicians are still working out how best to integrate tools like FunSearch into their research workflows, and we're excited to see what the future holds for math and science.