Checking Mathematics with Computer Assistance

By Prof. N.G. de Bruijn,
Eindhoven University of Technology
Department of Mathematics and Computing Science,
Eindhoven, The Netherlands.



In recent years researchers have gained considerable experience with computer systems for checking mathematics. These systems have aroused practical and theoretical interest among logicians and computer scientists, but very little among mathematicians. In this paper I shall try to inform a general mathematical readership about such systems, about why and how they are built, and about the possibility that some of those systems may alter our views on how to formalize mathematics, and may even alter the scope of mathematics.

The matter has a large number of aspects, and the reader should not think that these are all tightly linked. One may reject some of the opinions expounded in this paper and accept some of the others.

I will not always be able to avoid overlap with the recent paper in these Notices by Shankar [S], which I recommend reading.

Automated checking vs. automated proving

The idea of formal proof checking is much older than the computer. Leibniz and Boole had already played with the idea of replacing thinking by a kind of algebraic manipulation, but in their time they did not have the means to carry this out beyond the level of simple details of mathematical discourse. As we see it today, coping with all possible mathematical situations requires some feeling for formal languages as well as for language processing algorithms. It is not so much the availability of computers, but rather the experience acquired around computers, that led to the design of what I like to call justification systems. I prefer this term over ``proof checking systems'' since they handle much more than proofs. Some of these systems can check complete theories, including whether definitions and axioms are well-formed; they can check relations between different theories; and the material that can be checked can go beyond what is called mathematics today.

The activity of theorem proving should not be confused with that of proof checking. A justification system is not expected to invent proofs, but to verify whether some input is correct mathematics or not.

Automated theorem proving seems to be older than automated proof checking. In a general sense automatic production of proofs for all provable theorems is a very hard task, but inside limited areas it might be feasible and sometimes even easy.

Automatic theorem provers may occasionally do amazing things, but they have their limitations. A justification system, on the other hand, is expected to be able to handle everything that is offered: every correct piece of mathematics should get the system's approval, and all incorrect or incomplete material is to be rejected. An automatic theorem prover is a kind of automated professor, a justification system a kind of automated student. The professor does the harder work but has the advantage of being allowed to select topics and methods. The student seems to have the easier job, but is supposed to digest whatever is served to him.

Motives for justification

What motives can one invent for setting up a justification system? (I say ``invent'' since motives are often afterthoughts: one starts something when it seems attractive and promises some success; motives are invented afterwards as a kind of defense.) An obvious motive is protection against human oversight in long chains of arguments. Usually mathematicians do not need such checks: in most mathematical situations there are many possibilities to verify intermediate steps as well as final results by means of examples and analogies. But there are cases where a chain gets too long to be grasped by a single brain in a limited time, which is particularly worrying when the chain was not produced by a single person but by a group, possibly of both people and machines. Then the poor reader has to rely on mechanical verification of all details. I use the word ``mechanical'' in the old sense of human machinelike action. In that sense we can do pencil-and-paper work without having to think all the time about the meaning. But in order to have any value at all, such mechanical checking should be perfectly organized.

A very important class of applications can be the area of correctness proofs for computer software. In the near future this may be a kind of work that requires and deserves the attention of many mathematicians.

A further motive is to lighten the burden of referees of mathematical papers. If a justification system is so easy to handle that the average author can use it with little effort (at present no systems are that good), then the referee (or the thesis supervisor) can require that the author provide a version of his paper directly in the language of the justification system, so that the referee can run it on his own machine. Then he need not bother about correctness any more, and can concentrate on whether the paper is interesting and new. The question whether it is new might profit from the system as well: once there is a good justification system used on a large scale, one can think of organizing an enormous mathematical encyclopedia, a data bank of verified results, and such a bank can answer all sorts of questions about its contents.

Quite a different motive is the matter of understanding mathematics. This can mean several things.

One way in which a justification system helps understanding was explained by Shankar [S]: by being forced to convince a machine, a mathematician can sometimes transform proofs with subtle errors and duplications into faultless elegant proofs. And elegance supports understanding and insight. But it can also work out differently. Since a machine does not explicitly require elegance and shortness, there is the temptation to take the easy way as soon as the machine has accepted correctness: just go on, without bothering about polishing.

But in a more general sense the system that we use for explaining mathematics to a machine can give insights into the structure of mathematics and into the difficulties that beginners have in learning to play the game of mathematics. I have to admit that teaching a student is not the same thing as teaching a machine, but if a teacher is unable to arrange his arguments in a way acceptable to a machine, then his teaching of students may be an illusion, not beyond ``teaching by intimidation and learning by imitation''.

And apart from its qualities of precision and speed, the computer has its influence in forcing us into an absolutely rigorous form of formalization. If we are unable to leave something to a computer, then it has not yet been sufficiently formalized.

But not all justification systems are equally good at helping us to teach better. I feel that we learn much more from ``framework'' systems (to be discussed in this paper) than from classical systems.

What one is forced to learn anyway is to draw a strict borderline between language and metalanguage. Mixing language and metalanguage is a well-known source of errors and paradoxes. The language is the only thing the verification system checks; the metalanguage helps us to understand what we are doing.


Absolute safety?

One of the first questions people ask when hearing about a justification system is whether it would guarantee absolute dependability. This can mean two things. In the first place there is the matter of absolute dependability of mathematics, whatever the foundations may be. I think there is little hope ever to get final answers to that question.

The second thing it can mean is this: once we have accepted a rigorous formalization of some piece of mathematics, and we have accepted the idea that ``mechanical'' verification gives a kind of absolute guarantee of correctness, we ask whether this guarantee would be weakened by leaving the mechanical verification to a machine. This is a very reasonable, relevant and important question. It is related to proving the correctness of fairly extensive computer programs, and to checking the interpretation of the specifications of those programs. And there is more: the hardware and the operating system have to be inspected thoroughly, as well as the syntax, the semantics and the compiler of the programming language. And even if all this were covered to satisfaction, there is the fear that a computer might make errors without indicating them by total breakdown.

I do not see how we ever can get to an absolute guarantee. But one has to admit that compared to human mechanical verification, computers are superior in every respect.

Another question people raise is whether a justification system can justify itself. I think this is asking too much. A justification system can never justify more than certain fragments of itself, or certain interpretations of itself.

Philosophically, I hardly think that the question of self-justification, if it has a meaning at all, is very relevant. If someone tries to convince us with some story, then our doubts should not be allowed to melt away just because the narrator declares in the same convincing tone that he is absolutely sure about its truth. We should prefer to get support from a different person.

Doing it like humans

Some of the principles of the organization of justification systems are copies of what we have always been doing as humans.

We explain things to our automatic student in terms of a language, and according to the rules of that language we write a book, consisting of a sequence of lines, or at least a tree-shaped arrangement of lines, which anyway excludes circular reasoning.

The organization of memory in a justification system usually follows the human pattern too. We work with short and long range memory in our brain, directly accessible written memory on our desk, books on our shelves and books in possibly far away libraries. In a computer justification system we may observe a similar organization of memory.

Another aspect in which human behavior is followed closely is the production of mathematics, starting from a kind of raw material, clever but vague ideas, and ending with a final product of stupid but strictly precise formalities. The process looks like an assembly line in a factory. At the beginning of the conveyor belt there is the Mathematical Genius who puts his ideas on the belt. Next there is the Brilliant Mathematician who is able to write it all up in today's publication style. The next place along the belt is taken by the Competent Mathematician. He knows the tricks of the subject, is able to supply necessary material that his predecessor did not even mention, and is able to write meticulously, in every detail.

In presenting mathematics to human students we usually do not give that ultimate form, and it is questionable whether we should. After all, students should learn to fill gaps themselves. If we keep chewing all food for them they will never develop proper teeth. Nevertheless they can learn a great deal from chewed material. In this connection I mention the way Edmund Landau wrote his books on analysis and number theory.

So the final product of the Competent Mathematician is a mathematical text that requires no specialized knowledge or experience from the reader. Let me call it Landau style.

The next stages along the belt have people who transform this product into the language of the justification system. Again it will turn out that gaps have to be filled, in particular since the Competent Mathematician did not bother to give the proper references all the time, and certainly did not always mention the logical derivation rules. After all, generations of mathematicians have done their work efficiently even without consciously knowing their derivation rules!

The work may be subdivided. One can think of a first stage where a person with some mathematical training inserts a number of intermediate steps whenever he feels that further workers along the belt might have trouble, and a second stage where the logical inference rules are supplied and the actual coding is carried out. For the latter piece of work one might think of a person with just some elementary mathematics training, or of a computer provided with some artificial intelligence. But we should not be too optimistic about that: programming such jobs is by no means trivial.

Finally, at the end of the belt the Checker does the final verification. The work might be done by a human with an unusual amount of endurance, but it is much better and cheaper to leave it to a computer.

This picture of an assembly line may be a novelty for many mathematicians, since they have hardly ever delegated parts of their work to people below their own level. This is quite different in other sciences.

Platonism

Some people feel that mathematical thinking depends on the existence of a real world of mathematical objects, an idea called Platonism. For communication between mathematicians this idea is irrelevant: in a mathematical discussion between a believer and a non-believer, neither of the two notices their different backgrounds. And of course it is irrelevant in mathematical communication with a machine. The machine does not store the mathematical objects we are talking about. The only thing it may have to store is what has been said thus far.

It is instructive to compare a justification system for mathematics with a verification system for chess. The latter has to be able to read a chess game, consisting of a sequence of lines representing the moves. It has to find out whether that sequence of moves produces a legitimate chess game or not. The chess moves are intended to update the position on the board, and actually the rules of the game express legitimacy of the next move with respect to the updated position (including a few extra bits of information, since we have to know whose move it is, as well as a few details on castling and capturing ``en passant''). The chess verification system will of course store the position like a kind of Platonic reality, and can even forget about all previous moves. So we get the feeling that the sequence of moves talks about the positions, that board and pieces form the reality, and that the given sequence of moves is just some abstract coding of the sequence of positions.
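
For readers who want to see the shape of such a verifier, here is a minimal sketch in the syntax of the Lean system (used purely as an illustration). The types Position and Move, and the rule legal that yields the updated position of a legitimate move, are hypothetical placeholders, not a real implementation of chess:

    variable {Position Move : Type}

    -- A toy game verifier: walk through the moves, keeping only the
    -- updated position, exactly as described in the text above.
    def verify (legal : Position → Move → Option Position) :
        Position → List Move → Bool
      | _,   []      => true
      | pos, m :: ms =>
        match legal pos m with
        | some next => verify legal next ms  -- only the position is stored
        | none      => false                 -- an illegitimate move: reject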

The contrast with a mathematics verifier is striking. In mathematics we have nothing but the discussion, and even if a mathematical reality existed, none of it would be stored in or consulted by the machine.

The language of mathematics is not talking about a limited number of things. If it were, a justification system might try to take a kind of model-theoretical approach by testing every statement in that world of objects. The language of mathematics cannot be verified on the objects: there are too many of them. The only thing we can do is to apply the rule that things are correct if they have been correctly said. The notion of correctness is not formulated in terms of a mathematical reality, but involves rules about how a statement should be related to material that has been said before. Many non-mathematicians who hear about verification systems get the idea that such systems can handle only ``constructive'' situations like finite mathematics. This confusion stems from the same wrong idea of implementing mathematical reality.

I believe that every Platonist would be converted at once when having to explain his mathematics to a machine.

Doing it naturally

I think that in formalizing mathematics, and in particular in preparing mathematics for justification, it is usually elegant as well as efficient to do everything in the natural way. That word of course does not mean ``like in nature''; it can at most mean ``like normally in our culture''.

Justification can be achieved step by step, as in the assembly line mentioned before. Putting a piece of mathematics into a justification system is a process of successive refinement. We begin with a rough sketch, and in various rounds we supply more and more details. The first few of these rounds belong to our cultural habits. I would call it ``natural'' if we proceed by successive refinement of those first rounds, and ``unnatural'' if the line of attack has to be completely overthrown or remodelled. In that sense Boolean logic is unnatural, and natural deduction is natural. Boolean logic comes down to replacing reasoning by an algebraic machinery that is not a refinement of what it is supposed to implement. Similarly, Descartes' analytic geometry did not refine geometric proofs but replaced them by algebraic ones with a completely different structure. Both Descartes and Boole created a beautiful and powerful theory, but I would not call their work ``natural''.

But of course, since the word ``natural'' means ``cultural'', it is subject to change. Set theory is an example of this. In my private opinion, it is unnatural to base mathematics on type-free set theory, where almost all mathematical notions are coded as elements of the Zermelo-Fraenkel universe (to be referred to as ZF). When it started, it reorganized much of the existing structure of mathematics, and could not be seen as a refinement. But later generations of mathematicians have often been exposed to ZF very early in life, and would therefore call it quite natural.

Some even believe (like Cantor possibly did) that the ZF universe is not fiction but Platonic reality. This seems to be in conflict with the popular slogan ``everything is a set''. That slogan means that very many things (of course not everything) can be coded in ZF. In a case like that of the real number system, various different codings are in use. Platonism of course wants to consider the reals as a kind of reality that is independent of the way we talk about it. So we have the picture of the Platonic reals, with injections (the codings) into the Platonic ZF universe. In that picture the real reals are not sets at all.

Coming back to the idea of refinement, I must confess that it is difficult to keep it pure in the long run. The Genius at the beginning of the assembly line may look with one eye at what happens at the other end of the line, and may adapt his ideas to the needs of the technology displayed there. Technical realization can have its influence on design.

Moreover I am not sure that the idea of systematic refinement has an eternal value worth fighting for. After all, it is extremely conservative, and there is nothing against a revolution now and then.

Doing it efficiently

A very important feature of efficient justification systems is linearity. This refers to the length of a complete account of a piece of mathematics in standard language (Landau style) compared to its translation in the language of the justification system, and compared to the time a computer needs for verification. In a bad system the latter two items may grow exponentially compared to the first one, in a good system the relation is linear. It can be linear if the system makes full use of all the definitions, abbreviations, lemmas and theorems that the standard mathematical language already
provides.

If the piece of mathematics in standard language contains gaps, however, or if it tacitly appeals to experience the student is assumed to have acquired thus far, then the text for the justification system might become considerably longer. But then it is not fair to put the blame on the system.

Without linearity, verification would never be feasible. In the Automath project in the early 70's feasibility was put to a test. The test was to push a full mathematics textbook through the justification system. The book chosen was E. Landau's ``Grundlagen der Analysis'', and Landau was followed in every detail. No attempt was made to simplify or to modernize the text. The translation was carried out by L.S. van Benthem Jutting [J], and the test was completely successful. Linearity from the beginning to the end. And computer technology of the early 70's was good enough for this. The speed was never a problem, but memory limitations were (and would still be).

Checking a Landau-like text is not always exactly what we have in mind. There is not always a Landau so kind as to write books in a systematically detailed form. Most mathematicians hate to do that, and it should be said that in some parts of mathematics that work is more unattractive than in others. In some areas all the steps are equally elegant and interesting, whereas in other areas we see elegant steps with a lot of dull work in between. At least dull work should have been done in between; usually it is suppressed by the author. In such areas people are likely to hate formal verification.

Filling gaps

We would like to have an automatic student who behaves like an intelligent human student, and not like one that does ``mechanical'' checking without ``understanding''.

This indicates what we have to require from justification systems in the future. In the first place they should be able to deal with all the small gaps in any piece of mathematics, in particular gaps which can simply be filled by an appeal to a logical derivation rule. And in cases where references to previous material are needed, the machine should be able to find those, possibly guided by hints in the text.

Whatever a human student finds easy, should be automated, possibly by means of some artificial intelligence. This kind of automation is hard, but I think it can be done and will be done in some of the efficient systems which are being developed today.

But there is more. A good human student is not just assumed to have a general aptitude for mathematical reasoning, but is also assumed to learn from the particular subject that is presented, recognizing situations that have been understood before. This may amount to building a kind of subconscious library of lemmas that have not been formulated explicitly in the text, or even a subconscious library of methods. I think we are still very far from full automation of learning processes, so it will be a long time before we can automate the brilliant student. But for many purposes it may suffice to automate the average student.

Many mathematicians dislike pushing formalization to the extreme. The idea is that it kills intuitive thinking. I do not entirely agree. It may be true that unnatural formalization replaces intuitive thinking by an entirely different process of formula manipulation, but natural formalization supports intuition rather than destroying it. Formalization and intuition should be each other's best friends rather than enemies.

But part of what we call intuitive thinking is not of the kind that can be refined to proofs. That part cannot be formalized. Our brain processes are not based on logic or any other foundation of mathematics, and nevertheless they produce wonderful things. But all mathematicians agree that the results of intuitive thinking have to be justified by rigorous reasoning, even though there may be different opinions about the level of formality.

What is a proof?

Can a justification system check the computer proof of the four color theorem? Or, rather, is that proof really a proof?

Instead of the spectacular four color theorem I prefer discussing a simpler case. Imagine a combinatorial problem for which a computer search establishes the theorem that there are 24103 solutions. Do we consider that as a proof? Let us assume that there are correctness proofs, not just of the computer program, but also of the compiler of the computer language and of the operating system, and even that there is a complete description of the hardware specifications. We now buy a computer, trusting that it satisfies the specifications; we let it run and get as output that there are 24103 solutions. Is that a proof?

Of course it is not. At most we have a proof for a more complex statement: ``if the abstract execution of the program on an abstract machine with the prescribed hardware leads to 24103 solutions, then there are 24103 solutions''. This statement does not even have the form of the one we have in mind; it is a companion theorem in some kind of metalanguage.

But it is reasonable to ask whether the computer search can be refined to a proof in the ordinary sense. I think that in many combinatorial search problems such a proof by refinement can be written (in the language of a justification system) by a machine, and that the program producing that proof can be obtained by refinement of the computer search program. That machine-produced proof can be checked by the justification system. Possibly no human will read that proof, but nevertheless it is open for human inspection. We can inspect the general organization, and if we select any detail we can convince ourselves that the proof of that part is perfect. This is not very different from the situation in standard language in case of exceedingly long proofs that we have not written ourselves.

The use of machines for obtaining theorems was never questioned in the matter of numerical calculations. If, for example, as a step in a mathematical proof we need the fact that the product of 239 and 4649 equals 1111111, then it is not customary to require a formal proof in mathematical formalism. But it is reassuring to know that the process we carry out by the pencil-and-paper multiplication algorithm can be refined to a proof, and it is not hard to have that refined proof produced automatically.
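
To give an impression of how small such a refined proof can be in a present-day typed checker (Lean is used here purely as an illustration; the system discussed in this paper is Automath): the kernel replays the computation itself, so the refined proof of the multiplication fact is a single reflexivity step.

    -- the checker's kernel carries out the multiplication and compares
    example : 239 * 4649 = 1111111 := rfl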

Frameworks

When designing a justification system for mathematics, there is the crucial question how to start. In our present society it is more or less accepted to say (often only as lip service) that mathematics is founded on a basis of classical predicate logic and ZF set theory. Are we to base the justification system on that foundation? Accepting this, we get what I shall call a ``classical system''. But there is the alternative to start off from a more primitive level, with the possibility to present logic and set theory as explained material in the system, on an equal footing with the presentation of all other mathematical material. In that case I use the term ``framework system''. Such a set-up means that ``the usual way'' becomes an
option, possibly along with others, and that the user is allowed to be critical about the usual way.

The rules of the game in a framework system have to express how to handle mathematics and logic, more or less independently of the contents. Here ``handling'' involves how to work with things like definitions, assumptions, axioms, free and bound variables, substitution, proof rules, proofs, theorems. When designing the framework one has to bother about what it means to apply a definition or a theorem or a proof rule and not about what particular rules or axioms are to be taken as a basis. It is not very customary among mathematicians and logicians to discuss that framework: it is usually assumed to be available before mathematics and logic start off. But when trying to instruct a computer to follow our mathematical habits we are forced to be explicit about these matters.

Typing

When I started working on a justification system around 1966 I wanted to make something of a universal nature. I gave a great deal of thought to the framework, and that led me in a quite natural way to giving a central place to the idea of typing, i.e., attaching a type to every expression. The result was the system to be called Automath [dB1]. To insiders it might be described as natural deduction with (typed) variables and lambda-typed lambda calculus (lambda-typed means that the types may again be lambda terms), with an argumentation structure depending on the idea of ``proofs as objects'' (others use the term ``propositions as types'', but I consider that unfortunate).

My interpretation of typing is related to the use of ``is a'' in English. John is a soldier, London is a town, q is a rational number and P is a point. Now let us call ``soldier'' a type, and ``John'' an inhabitant of that type. In English these types play the role of substantives.

Replacing the ``is a'' by a colon, one gets ``John : soldier'', ``London : town'', ``q : rational number'' and ``P : point''. But unlike typed lambda calculus, natural language does not have uniqueness of types. ``John : soldier'' is in no way in conflict with ``John : Canadian''. Of course one may try to say that ``soldier'' and ``Canadian'' are subtypes of ``man'', and that ``man'' is the ultimate type, the archetype. But that does not work. In a different context one claims that John is a human being, or that John is a living being. It is hard to pretend that there is something like an ultimate most general typing for John.

In mathematics this is different. I would describe the situation as follows. At some point of the discussion a type A is introduced for the first time, and subtypes of A are introduced later. But after A has been created, it is not possible to create a type B and to say that B is a supertype of A. Of course there can be situations where a mathematician wants to have it that way, but then the rule of the game is that he has to start all over again, creating B first and introducing A as a subtype of B.

In the typed justification systems the types are essentially unique. So they are a kind of implementation of the archetypes mentioned above. Subtyping is not to be considered as typing, but has to be described by an archetype plus a predicate.
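
In the notation of a present-day typed system this prescription is quite literal. The following sketch is in Lean (used purely as an illustration; the names Even and four are mine, not from the text): the archetype is the type of naturals, the predicate picks out the even ones, and the subtype packages an inhabitant together with a proof of the predicate.

    -- subtyping = archetype Nat plus a predicate, as described above
    def Even : Type := { n : Nat // n % 2 = 0 }
    def four : Even := ⟨4, rfl⟩   -- the object 4 together with a proof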

Typing can implement reasoning

Let me try to explain how a typed framework system can handle mathematical reasoning. Instead of trying to build up a complete picture I start somewhere in the middle. We have some theorem T and want to apply it in a particular case; that application is a theorem T_1. In theorem T there are some variables and some assumptions; let us just take one of each, and have as an example ``Let x be a real number, assume that p(x)>3. Then q(x)<5'' (p and q are supposed to be known functions). Later we have an application. We have real numbers a and b, we know that p(a+b)>3, and our new theorem T_1 is that q(a+b)<5. How do we convince a machine that this is an acceptable application?

The machine wants to know several things. First it needs a reference to the place where theorem T was proved. It sees that the theorem contains a typed variable and an assumption. It wants to know from us what we want to take for x, and we say a+b. The machine is able to check that this has the right type: it is a real number indeed (typing is unique and can be found by straightforward calculation, for which the machine needs no references or hints). Next the machine requires a proof of the assumption, not of the original p(x)>3 but of the one we get upon replacing x by a+b. Let us say that p(a+b)>3 was proved earlier. We satisfy the machine's inquisitiveness by referring to a place Gamma where that proof is to be found. The machine checks this reference and then accepts all this as a proof of q(a+b)<5.

So we have supplied a+b, which is a real number, and Gamma, which is a proof for p(a+b)>3. In both cases we have the phrase ``is a'', and I interpret that as typing. The types are ``real number'' and ``proof for p(a+b)>3''. Proceeding in this style we note that proofs get the same treatment as objects, and in manipulations like substitution, the machine treats the two in exactly the same way. It does not even have to know whether expressions stand for objects or for proofs.
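
The whole exchange can be made concrete in the notation of a present-day typed system. The following sketch is in Lean, with stand-in names of my own: R plays the type ``real number'', p3 x and q5 x stand for the statements p(x)>3 and q(x)<5, and everything is introduced as a primitive, since only the typing mechanism matters here.

    axiom R : Type                     -- the type ``real number''
    axiom add : R → R → R              -- stands for +
    axiom p3 : R → Prop                -- p3 x stands for p(x) > 3
    axiom q5 : R → Prop                -- q5 x stands for q(x) < 5
    axiom T : ∀ x : R, p3 x → q5 x     -- theorem T: its variable and assumption
    axiom a : R
    axiom b : R
    axiom Gamma : p3 (add a b)         -- the earlier proof of p(a+b) > 3

    -- the application T_1: supply the object a+b and the proof Gamma
    theorem T1 : q5 (add a b) := T (add a b) Gamma

Note that the object a+b and the proof Gamma are supplied in exactly the same way, as arguments of T.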

The idea of treating proofs in the same way as mathematical expressions representing objects is quite natural, even if we do not think of a machine that requires information from us. It can be discussed in ordinary mathematics too. Instead of getting to theorem q(a+b)<5 by application of theorem T, we can get it by considering the proof of T as a blueprint that we just have to copy (with the proper adaptations) in order to get a proof for the application T_1. So the new proof is obtained by substitution into an old proof, and that is just like substituting into a function.

With this parallelism we get the full structure of mathematical reasoning at once. A definition has the form f:=P:Q, where f is a new name, P an acceptable expression and Q its type. In the proof world it corresponds to a theorem, where P expresses the proof, Q expresses (via ``proof of'') the proved statement, and f is the name of the proof. The introduction of a typed variable x:Q corresponds to an assumption. There Q represents (again via ``proof of'') the proposition that is assumed, and x is a name we can handle, during the lifetime of the assumption, as if it
were a proof of the assumption, completely parallel to the real variable x that is treated (during its lifetime) as if it were a real number.
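
Again a hedged Lean sketch may help (the names f, t, x, h are illustrative): a definition and a theorem have the same shape, and an assumption is nothing but a typed variable.

    -- both levels have the shape  name := P : Q
    def f : Nat := 2 + 3              -- P is an object, Q is its type
    theorem t : 2 + 3 = 5 := rfl      -- P is a proof, Q the proved statement

    -- a typed variable h acts as a proof of the assumption during its lifetime
    example (x : Nat) (h : x = 2) : x + 3 = 5 := by rw [h]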

There is a second kind of parallelism on a different level: types can be treated the same way as objects, in the sense that they can depend on variables and can also act as variables. Accordingly we have a kind of construction, similar to functions of several variables, where some of the variables are on the level of objects or proofs, and others are on the level of types. And the values of the functions can be objects as well as types.

So we have two levels of typing. On the one level (``low'' typing) we say things like ``3 is a natural'', on the other one (``high typing'') ``natural is a type''.
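
In Lean notation (purely illustrative) the two levels, and a type depending on an object-level variable, look as follows:

    #check (3 : Nat)   -- ``low'' typing:  3 : Nat    (3 is a natural)
    #check Nat         -- ``high'' typing: Nat : Type (natural is a type)

    -- a type depending on an object-level variable n (illustrative name)
    def Tuple (n : Nat) : Type := Fin n → Nat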

Such a simple systematic framework for dealing with typing, enriched with facilities (typed lambda calculus) for handling situations with dummy variables, is sufficiently simple to be natural, and rich enough to express almost anything we want. To take an example: the world of Greek geometry handled not only geometrical objects and proofs, but geometrical constructions too. And a text describing a construction, and proving that its result is the one we wanted, handles three kinds of low typing. Apart from the two mentioned before (with objects and proofs) we get a third one, of the form ``... is a construction for ...''. And the three worlds are happily intermingled, referring to one another all the time. The combination of various theories into a single formalism can be called integration (see [dB5]).

I think that this is much more natural than encoding geometrical constructions as points in the ZF universe.

Typed sets vs. untyped sets

These frameworks based on typing can cover a very large part of formalizable mathematics, at least as much as what one has been doing with ZF. Actually, one of the things one can describe by means of the framework is ZF itself. But we have the alternative to build mathematics on types (with sets as subtypes). In a way that gives much more, since types can depend on all sorts of parameters (including type parameters), can be freely created as primitives (as we do with axioms), and can be introduced as variables. The only thing that we do not do is create all our types in one stroke, by a single set of axioms. That is a very essential difference from ZF.
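
A small Lean sketch (with illustrative geometric names of my own) of types created freely as primitives when needed, and of sets as subtypes depending on a type parameter:

    -- primitive types and relations introduced on the spot, like axioms
    axiom Point : Type
    axiom Line : Type
    axiom lies_on : Point → Line → Prop

    -- a ``set'' as a subtype of a type variable A: it depends on parameters
    def SetOf (A : Type) (P : A → Prop) : Type := { a : A // P a }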

The type structure is also rich enough to enable us to add entirely new areas of application. I mentioned objects, proofs, geometrical constructions, but there can be much more. Like algorithms in general (of which the geometric constructions are a special case). And we might get closer to fulfilling Leibniz' dream of a general language for science.

A word about lambda calculus

Almost everything in framework systems depends on the notations of the lambda calculus. That is something that still seems to scare the majority of today's mathematicians. Very strange, since the lambda notation is such a great help in writing mathematics. It seems rather clumsy to live without it, in particular in fields like functional analysis. I can think of only one reason why mathematicians still don't use it: it is because Bourbaki did not do it!

The theory of lambda calculus may have difficult aspects, but the notation itself is simple. It just amounts to the use of a quantifier with a bound variable in order to build a function that is given by its values. Instead of introducing the symbol f for the function that sends every x of a set S to x^2 + x - 3, we can talk directly about that f as \lambda x \in S . (x^2 + x - 3). So we do not need the letter f, and we do not need the extra sentence which defines f by its values (it is a sentence in the metalanguage).
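
In Lean the same function is written with the keyword fun; a one-line illustration (with x*x written for x^2, and the integers standing in for S):

    -- the function given directly by its values; no name f, no extra sentence
    #check fun x : Int => x * x + x - 3   -- : Int → Int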

It is amazing how much can be written in terms of lambda expressions! In particular I refer to [dB4] where it is explained how a complete book written in the language of a justification system can be considered as a typed lambda expression, and where the notion of correctness of the whole book just reduces to the notion of correctness of that single expression.

Nevertheless we can do at least some mathematical reasoning without lambdas. In contrast to most other framework systems, Automath has a (very natural) feature of instantiation. It has the effect that material written in some context can be used later outside that context, without any appeal to lambda abstraction.

One might say, roughly, that until the 19th century mathematics needed no lambda calculus. Almost all of it might have been written in terms of the lambda-free fragment (Pal) of Automath, using the instantiation device for the description of explicit functions. It was only during the 19th century that the notion of a function moved very slowly from the metalanguage into the language, and that made the lambda notation indispensable.
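
The effect of instantiation can be imitated with Lean's section mechanism (an analogue for illustration, not Automath itself; the names are mine): material written in a context is reused outside that context by supplying an argument, and the user never writes a lambda.

    section
    variable (x : Nat)
    def shifted : Nat := x + 3   -- written ``in the context x : Nat''
    end

    #check shifted 10   -- instantiation: the context variable receives a value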

Scope of mathematics

A justification system is expected to be able to handle all formalizable mathematics. But the typed framework being as simple and rich as it is, we have the right to turn the tables, and say: whatever the framework can describe, is to be called formalizable mathematics. In particular it contains all logic and computer science.

The claim is not so pedantic as it seems, since formalization does not encompass all mathematical activity. There are other wonderful things like intuitive thinking, mental pictures, heuristics, metalanguage and interpretations.

Experiences

It may be instructive to tell something about the experience we gained in Eindhoven with teaching students to handle the Automath system. In the period 1971-1984 I gave an introductory course on Automath almost every year. Students could get credits for that course, not by passing an examination but by doing practical work. Quite often we gave them some material they were acquainted with, and required them to have it checked by the system. No treatment of preceding theories was required: all known material used in the proofs could be introduced by means of axioms. The students were usually mathematics majors with about 4 years of training in pure and applied mathematics, and no training at all in logic. The general rule was that they had to spend about 100 hours on the job, which included the time for attending the course. Most of them did rather well, and delivered a complete machine-checked proof.

Examples of such pieces of mathematics were: the elements of group theory, the Banach-Steinhaus theorem of functional analysis, Dirichlet's pigeonhole principle, the König-Hall theorem of combinatorics, and Van der Waerden's theorem on arithmetic progressions. One student wrote a master's thesis in Automath, treating a new theory of the real number system. It took him 9 months to deliver a fully checked text plus a report about the work. It would not have taken him much less time to organize and finish the same material in ordinary textbook style.

In general it can be said that the more abstract the piece of mathematics, the smaller the gaps and the easier the justification. Less abstract fields, like combinatorics, can be hard since they may contain simple intuitive ideas for which there is no tradition of formalization.

Looking around

It was the main purpose of this paper to try to explain what justification systems are, and not to enter into details of systems. But here are some references for readers who might like to have some real information about various systems.

For the Boyer-Moore system I refer to [S]. For a survey of the Automath project to [dB3], for the Nuprl system to [C], for the Calculus of Constructions to [CH]. A way to put many framework systems in a common scheme was described in [B] (a similar scheme was already given in [dB2]).

Finally, the collection [HP] can give a good impression of what is going on in the field.

I already mentioned that I see a future in techniques of automated gap-filling in order to make justification systems useful for the working mathematician. For that particular activity I think it makes little difference whether we take a classical or a typed approach. In both cases one will do roughly the same thing.

But different justification systems may have different views on the output. I would prefer the case where the machine's output is: ``I have been able to fill the gap, and here is the proof written in your own language''. Other systems might say: ``OK, I have been able to fill the gap to my satisfaction, but my proof would have been unreadable for you, and has been put in the garbage already''. I would call those systems ``black box systems''. They might work very fast, but slightly against the ethical principles of justification.

References

[B] Barendregt, H.P., ``Introduction to generalised type systems,'' Proceedings 3rd Italian Conference on Theoretical Computer Science (eds. A. Bertoni et al.), World Scientific, Singapore, 1989.

[dB1] Bruijn, N.G. de, ``The mathematical language Automath, its usage, and some of its extensions,'' Symposium on Automatic Demonstration (Versailles, December 1968), Lecture Notes in Mathematics, Vol. 125, Springer-Verlag, 1970, pp. 29-61.

[dB2] ---, ``A framework for the description of a number of members of the Automath family,'' Memorandum 1974-08, Department of Mathematics, Eindhoven University of Technology, 1974.

[dB3] ---, ``A survey of the project Automath,'' in: To H.B. Curry: Essays in Combinatory Logic, Lambda Calculus and Formalism, Academic Press, 1980, pp. 579-606.

[dB4] ---, ``Generalizing Automath by means of a lambda-typed lambda calculus,'' in: Mathematical Logic and Theoretical Computer Science (eds. D.W. Kueker, E.G.K. Lopez-Escobar, C.H. Smith), Lecture Notes in Pure and Applied Mathematics, Vol. 106, Marcel Dekker, New York, 1987, pp. 71-92.

[dB5] ---, ``The use of justification systems for integrated semantics,'' in: Colog-88 (eds. P. Martin-Löf, G. Mints), Lecture Notes in Computer Science, Vol. 417, Springer-Verlag, 1990, pp. 9-24.

[C] Constable, R.L., et al., Implementing Mathematics with the Nuprl Proof Development System, Prentice-Hall, 1986.

[CH] Coquand, T. and Huet, G., ``The calculus of constructions,'' Information and Computation, Vol. 76, 1988, pp. 95-120.

[J] Jutting, L.S. van Benthem, Checking Landau's ``Grundlagen'' in the Automath System, Mathematical Centre Tract Nr. 83, Amsterdam, 1979.

[HP] Huet, G. and Plotkin, G. (eds.), Proceedings of the First Workshop on Logical Frameworks, Cambridge University Press, to appear 1991.

[S] Shankar, N., ``Observations on the Use of Computers in Proof Checking,'' Notices of the American Mathematical Society, Vol. 35, Nr. 6, 1988.