research
By searching for “functions” written in computer code, FunSearch made the first discoveries in open-ended problems in mathematical sciences using LLMs
Large Language Models (LLMs) are useful adjuncts – they excel at combining concepts and can read, write, and program to help people solve problems. But can they discover completely new knowledge?
Because LLM holders have been shown to “hallucinate” information that is factually incorrect, using it to make verifiable valid discoveries is challenging. But what if we could harness the creativity of LLM holders by identifying only their best ideas and building on them?
Today, in a paper published in Nature, we introduce you to FunSearch, a way to search for new solutions in mathematics and computer science. FunSearch works by combining a pre-trained MBA, whose goal is to provide creative solutions in the form of computer codes, with an automated “evaluator” that protects against hallucinations and incorrect thoughts. By iterating back and forth between these two components, initial solutions “evolve” into new knowledge. The system looks for “functions” written in computer code; Hence the name FunSearch.
This work represents the first time that a new discovery has been made to challenge open problems in science or mathematics using LLMs. FunSearch has discovered new solutions to the maximum set problem, a long-standing open problem in mathematics. Additionally, to demonstrate the practical utility of FunSearch, we used it to discover more efficient algorithms for the “box-packing” problem, which has applications as ubiquitous as making data centers more efficient.
Scientific progress has always depended on the ability to share new understanding. What makes FunSearch a particularly powerful scientific tool is that it produces software that reveals what is there how Its solutions are built, not just what the solutions are. We hope this will inspire more ideas in scientists using FunSearch, leading to a virtuous cycle of improvement and discovery.
Driving discovery through development using language models
FunSearch uses an evolutionary method powered by LLMs, which leverages and develops the highest scoring ideas. These ideas are expressed as computer programs, so they can be run and evaluated automatically. First, the user writes a description of the problem in code form. This description includes a software evaluation procedure, and a preprogram used to initialize a set of programs.
FunSearch is an iterative procedure; In each iteration, the system selects some programs from the current set of programs, which are fed to the LLM. The LLM program builds on these elements creatively and creates new programmes, which are automatically assessed. The best software is added back to the existing software pool, creating a self-improving loop. FunSearch uses Google’s PaLM 2, but is compatible with other code-trained LLMs.
Discovering new mathematical knowledge and algorithms in different fields is an extremely difficult task, far beyond the power of the most advanced artificial intelligence systems. To address such challenging issues with FunSearch, we have introduced several key components. Instead of starting from scratch, we begin the evolutionary process with general knowledge about the problem, and let FunSearch focus on finding the most important ideas for making new discoveries. In addition, our evolutionary process uses a strategy to improve the diversity of ideas in order to avoid stagnation. Finally, we perform the development process in parallel to improve the system efficiency.
Opening new horizons in mathematics
We first address the problem of determining a maximum, an open challenge, which has puzzled mathematicians in multiple research fields for decades. The famous mathematician Terence Tao once called it his favorite open-ended question. We collaborated with Jordan Ellenberg, a professor of mathematics at the University of Wisconsin-Madison and author of the book that made important progress on the problem of determining the limit.
The problem consists of finding the largest set of points (called a Cap set) in a high-dimensional grid, where no three points lie on the line. This problem is important because it serves as a model for other problems in extreme combinatorics—the study of how large or small a set of numbers, graphs, or other objects is. Brute force computing methods do not solve this problem, as the number of possibilities that must be taken into account quickly becomes larger than the number of atoms in the universe.
FunSearch has created solutions – in the form of software – that have discovered in some settings the largest maximum groups ever found. This represents the largest increase in the size of cap collections in the past 20 years. Furthermore, FunSearch outperformed state-of-the-art computational solutions, as this problem exceeded their current capabilities.
These results demonstrate that FunSearch technology can take us beyond proven results in difficult combinatorial problems, where intuition may be difficult to build. We expect this approach to play a role in new discoveries of similar theoretical problems in combinatorics, and in the future may open new possibilities in areas such as contact theory.
FunSearch prefers software that is concise and amenable to human interpretation
While discovering new mathematical knowledge is important in itself, the FunSearch approach offers an additional benefit over traditional computer search techniques. This is because FunSearch is not a black box that only creates solutions to problems. Instead, it generates programs that describe how These solutions have been found. This ‘show your work’ approach is how scientists generally work, where new discoveries or phenomena are explained by the process used to produce them.
FunSearch prefers to find solutions represented by highly compressed programs – solutions with low Kolmogorov complexity †. Short programs can describe very large objects, allowing FunSearch to extend its scope to include large needle-in-a-haystack problems. Furthermore, this makes FunSearch’s output easier for researchers to understand. “FunSearch provides a completely new mechanism for developing attack strategies,” Ellenberg said. “The solutions generated by FunSearch are conceptually richer than just a list of numbers. When I study them, I learn something.”
Furthermore, the interpretability of these FunSearch programs can provide actionable insights to researchers. When we used FunSearch, we noticed, for example, interesting similarities in the code of some of its high-scoring outputs. This gave us new insight into the problem, and we used this insight to improve the problem submitted to FunSearch, resulting in better solutions. We see this as a model of collaborative action between humans and FunSearch across many problems in mathematics.
Addressing a notoriously difficult challenge in computing
Encouraged by our success in solving the theoretical maximum identification problem, we decided to explore the flexibility of FunSearch by applying it to an important practical challenge in computer science. The Box Packing problem investigates how to pack items of different sizes into the smallest number of boxes. It is at the heart of many real-world problems, from loading containers with objects to allocating compute jobs in data centers to reduce costs.
The problem of packing boxes online is usually addressed using practical algorithmic rules (heuristics) based on human experience. But finding a set of rules for each specific situation—of different sizes, timing, or capacity—can be difficult. Although very different from the thresholding problem, FunSearch was easy to set up for this problem. FunSearch provided an automatically tailored program (that adapts to the details of the data) that outperformed established heuristics – using fewer boxes to fill the same number of items.
Difficult combinatorial problems, such as online box packing, can be tackled using other AI methods, such as neural networks and reinforcement learning. These methods have also proven effective, but may also require significant resources to deploy. FunSearch, on the other hand, produces code that can be easily inspected and deployed, which means that its solutions can be incorporated into a variety of real-world industrial systems for quick benefits.
A discovery-driven LLM for science and beyond
FunSearch explains that if we protect against LLM hallucinations, the power of these models can be harnessed not only to produce new mathematical discoveries, but also to uncover potentially impactful solutions to important real-world problems.
We envision that for many problems in science and industry – old or new – generating efficient, tailored algorithms using LLM-based methods will become common practice.
In fact, this is just the beginning. FunSearch will improve as a natural result of broader advances in LLMs, and we will also expand its capabilities to address a variety of pressing scientific and engineering challenges in society.
Mathematical sciences have always been a realm of discovery and innovation, but with the emergence of large language models, such as FunSearch, the possibilities for new discoveries seem endless. This cutting-edge technology is revolutionizing the way mathematicians approach problem-solving and exploration, opening up new avenues for research and understanding. By harnessing the power of machine learning and natural language processing, FunSearch is paving the way for exciting breakthroughs in mathematical sciences, offering a new and thrilling dimension to the field.