San Francisco-based AI research laboratory OpenAI has added another member to its popular GPT (Generative Pre-trained Transformer) family. In a new paper, OpenAI researchers introduce GPT-f, an automated prover and proof assistant for the Metamath formalization language.
While artificial neural networks have made considerable advances in computer vision, natural language processing, robotics and so on, OpenAI believes they also have potential in the relatively underexplored area of reasoning tasks. The new research explores this potential by applying a transformer language model to automated theorem proving.
Automated theorem proving tends to require general and flexible reasoning to efficiently check the correctness of proofs. This makes it an appealing domain for checking the reasoning capabilities of language models and for the study of reasoning in general. The ability to verify proofs also helps researchers as it enables the automatic generation of new problems that can be used as training data.
The researchers found similarities between learning to prove theorems and learning to play the ancient board game Go, as both offer automated ways of determining success and generating new data via self-play approaches. Thus, AlphaZero’s success in Go suggests that automated theorem proving might prove a fruitful domain for the study of reasoning in neural networks.
Metamath is a language and computer program for archiving, verifying, and studying mathematical proofs. The researchers employ Metamath — powered by a simple meta logic system based on a single substitution rule — as their formal environment, using decoder-only transformers similar to GPT-2 and GPT-3 to create models with various pretraining datasets and in different sizes. Their largest model has 36 layers and 774m trainable parameters.
The researchers created the GPT-f online proof assistant to enable interactive proof constructions with the assistance of their models. In experiments, GPT-f achieved a new SOTA result on the Metamath library, closing 56.22 percent of proofs from a held-out test set versus 21.16 percent for the current SOTA model MetaGen-IL to demonstrate transformer architecture’s potential in formal reasoning.
GPT-f also found new short proofs that have been accepted into the main Metamath library. The researchers say this may be the first time a deep learning-based system has contributed proofs that were adopted by a formal mathematics community.
“Our results suggest that tightly coupling a deep learning system with a formal system opens up interesting opportunities for further research, with the goal of better leveraging the generative power of the former and the verification capabilities of the latter,” the researchers conclude.
The paper Generative Language Modeling for Automated Theorem Proving is on arXiv.
Reporter: Yuan Yuan | Editor: Michael Sarazen