Thanks to the four color problem, a rift rose between orthodox mathematicians and the budding computer science community over what comprises a proof. Recently when I was refreshing my memory on the four color problem, I became distracted by this disagreement and the question What is a proof?. So, I looked into some info surrounding these topics such as the history of proofs and proof assistants, problems mathematicians raise with proof assistants, and the validity of proof assistants.
A little history of math
Arguments intended to convince people of an idea have been under development for thousands of years. Aristotle‘s work (~350 BC) is the earliest study of formal logic and the core of deductive reasoning, the process of reaching conclusions logically from premises. Euclid (~300 BC) established the axiomatic method, where one begins with statements assumed to be true (axioms) then uses deductive reasoning to reach new conclusions.
In Euclid’s famous Elements, he applied logic to mathematics to construct proofs in geometry and number theory. This is widely considered the most influential textbook of all time, not only for its content, but also because the proofs established a standard of mathematical rigor and coherence.
In the most boring words possible: a mathematical proof is a deductive argument which can be traced back to assumed statements. So, begin with a set of assumptions, find the relevant logical consequences of the assumed statements, then keep finding other relevant consequences of the preceding statements again and again until the conclusion is reached.
A proof has two main goals:
- to convince that a statement is correct (logic)
- to explain why a statement is correct (intuition)
Technically, a proof communicated in a natural language is an informal proof, and this is how nearly all proofs in mathematics are conveyed. A formal proof is a sequence of formulas that follow a mathematically logical structure, so each formula follows from the preceding formulas.
It’s impossible to standardize the level of rigor necessary for an informal proof to be considered acceptable by the mathematics community. New proofs are reviewed by a number of academic peers before being accepted for publication, and this is the most common judge for whether or not the proof reaches the appropriate level of rigor. The connection between informal and formal proofs is that in principle a formal proof could be constructed from an informal proof, although it usually isn’t because that’s a terrible process.
As an example, if someone wanted to use that 1+1=2 in a proof, they would almost certainly not include any explanation of why 1+1=2. It’s something we all take as an accepted fact. However, the proof that 1+1=2 is backed by mathematical logic. Below are two propositions from the formal proof that 1+1=2. (The proof that 1+1=2 in the Principia Mathematica was 162 pages long! But it did also include all the basics to show “hey here’s how arithmetic works.”)
A little history of artificial intelligence (AI)
Near the turn of the 20th century, some important advancements were made in mathematical logic. Principia Mathematica was an attempt to report the axioms and inference rules needed to prove any mathematical truth. 20 years later, Gödel’s incompleteness theorem proved that such a goal could never be accomplished.
Advancements in artificial intelligence in the 1950s allowed computer scientists to begin working to better imitate how humans reason, which became known as automated reasoning. Allen Newell, Herbert Simon, and Cliff Shaw developed the first computer program with the goal of “mimicking human reason”. The result was Logic Theorist, which proved 38 of the 52 theorems in the Principia Mathematica.
Within automated reasoning, some of the hottest projects are working towards constructing computer programs to find and check mathematical proofs.
A proof assistant, or interactive theorem prover, is a software tool which helps find formal proofs with interaction from the user. The proof assistant Coq was used to prove the four color theorem in 2005.
Proof assistants work with formal proofs. As exemplified by the proof of 1+1=2, there’s a large jump in detail between formal and informal proofs. Also, most mathematicians are not using proof assistants to help construct or check their proofs, so the tool isn’t something everyone’s used to yet.
Despite the complications, proof assistants allow mathematicians to examine the validity of proofs when it would have otherwise been very challenging. Specifically for mathematical proofs, a proof assistant is useful when a proof is questionable because it’s large (and therefore difficult to understand in its entirety) or because it involved a computer program for case analysis, calculations, or some other reason.
But how? Why?
So, the real questions: How do proof assistants work? Why do proof assistants work?
Luckily, the internet is full of people who are ready to explain these answers. Unfortunately, almost all of them are not qualified to answer these questions. After much searching, I found someone who was qualified to teach me, and he, H. Geuvers, even wrote a paper on it.
For most proof assistants, the user inputs what’s called a proof script, which is a list of instructions to guide the proof assistant to accepting the theorem. Then, the proof assistant uses logic rules and reasoning to ensure that the directions the user inputs are logically correct and justified. Together, the user is guiding the search for the proof and the proof assistant is verifying the proof as it progresses, while recording all the formal proof details.
Exactly how the proof assistant is working with its system of logic and the user varies. Section 2 of the Geuvers’ paper details the specific features of different systems.
There are several aspects of a well-designed proof assistant, which help elucidate why proof assistants are reliable. In particular, there are four mechanisms formulated by Geuvers which ensure the trustworthiness of a proof assistant. Most systems don’t incorporate all four of these, but some combination of a couple.
- A small kernel: Proof assistants contain a kernel, which is a set of code that executes the proof. The kernel contains a set of rules from which all proof rules are defined. If the kernel is logically correct, then all steps made in the proof by the proof assistant are logically correct. So, trusting the proof boils down to trusting the kernel’s correctness. Mistakes are theoretically possible in the construction of a kernel, just as they are possible in the construction of any proof. Therefore, small kernels are desirable so they’re easy to check manually.
- No unnecessary assumed knowledge: A proof assistant needs systems of logic and math to be specified so it knows which assumptions, definitions, and inference rules to use in the proof. Proof assistants which allow the user to specify a description of logic and math to be used in the proof give the user greater control over what the proof assistant uses (assumptions, definitions, and inference rules) in the proof. This also prevents the user from conflicting with the proof assistant.
- A capability of being checked by a proof verifier: Many proof assistants can be checked for correctness by a proof verifier. A proof verifier mechanically checks the proof for logical correctness. It differs from a proof assistant because it doesn’t try to develop any new proofs or take any input from the user; it just checks the logic. After having specified the logic and math used in the proof assistant, the proof verifier checks that the proof assistant can prove a theorem exactly when that proof is logically sound. This can be accomplished in a couple different ways, but one particular method is expanded upon in the next point.
- Satisfying the De Bruijn criterion: For the seriously motivated individual, some proof assistants create an independently checkable proof “object” while it’s proving the theorem. A system which does this is said to be achieving the De Bruijn criterion. Thus, someone can write their own proof verifier, then check the proof object with that verifier.
Also notable, proof assistants get a lot of attention from some pretty smart people. For instance, the Coq system is the product of over 30 years of research with at least 50 contributors. Mistakes are quite likely to get caught in this context. Of course, it’s possible they slip through the cracks, but hey classic mathematical proofs don’t have a clean rap sheet either.
Paul Erdös explained his views on Appel and Haken’s proof of the four color theorem as:
I’m not an expert on the four color problem, but I assume the proof is true. However, it’s not beautiful. I’d prefer to see a proof that gives insight into why four colors are sufficient.
Erdös is known for characterizing the most elegant proofs as coming from “The Book,” a hypothetical collection containing the most beautiful proof for every theorem. Computer assisted proofs are, in Erdös’ opinion, not from “The Book.” Elegance is sought out in almost everything, and mathematics is no exception. Beautiful proofs may be concise, use minimal assumptions, or are based on original ideas. Often times the beauty of a proof is ineffable, but a common element in these proofs is that they provide irrefutable, and often surprising, insight as to why something is true.
If the correctness of the kernel and logical/mathematical descriptions used by a proof assistant are accepted as correct, then a proof generated by a proof assistant should convince a reader that a statement is valid. Yet from these proofs we can lose the ability to clearly explain to the reader why the statement is true. This understanding is essential to the progression of mathematics, as it often provides insights that lead to other new results.
Proof assistants are allowing the mathematics community to prove more statements. Is this not a valid measure of progress? Thurston argues (and I hope the mathematics community agrees) that a mathematician isn’t supposed to publish as many papers as they can each year, prove the most important theorem in their field, or lecture 300 students on Taylor series. A mathematician is supposed to advance the world’s understanding of mathematics, by whatever means they are most suited for, be that research, teaching, or something else.
From this viewpoint, if mathematicians aren’t constructing proofs in a manner that furthers mathematical understanding, then is it worth it? Thurston stated the following in his famous paper on progress in mathematics:
I think that mathematics is one of the most intellectually gratifying of human activities. Because we have a high standard for clear and convincing thinking and because we place a high value on listening to and trying to understand each other, we don’t engage in interminable arguments and endless redoing of our mathematics. We are prepared to be convinced by others… In contrast computers are good at performing formal processes… However, we should recognize that the humanly understandable and humanly checkable proofs that we actually do are what is most important to us, and that they are quite different from formal proofs.
So tl;dr: Proof assistants (and let’s say computers in general) can definitely advance mathematics, but it’s important to keep the priority on mathematical understanding, not results.
While mathematics does weave together in natural amazing ways, the field is so large and segmented that today it’s extremely rare for a single person to be well-versed in more than a couple areas. Imagine if all of mathematics could be formalized and collected. Every proof, definition, conjecture, computation in a system that could use all of this to formulate new theorems and proofs. QED the project of building a computer system that represents all of mathematical understanding. While the objective of this project is enormous, and probably unattainable, the work being done towards this goal could really help simplify and organize mathematics.
Some resources to check out
- To learn more about how a proof is constructed: Proofs and Refutations by Imre Lakatos. The first section of the book is written as a dialogue between students and a teacher as they discover a proof together. Within this section, his interpretation on the process of proof is described.
- Geuver’s paper (linked above)
- Thurston’s paper (linked above)
I cited Geuver and Thurston in this post in specific places. The statements that seem like “opinions” in this post are really my attempt at summing up what the majority of mathematicians think, so it’s not bullshit.