**Checking Mathematics with Computer Assistance**

*By Prof. N.G. de Bruijn,
*Eindhoven University of Technology

Department of Mathematics and Computing Science,

Eindhoven, The Netherlands.

In recent years researchers have gained considerable experience with computer systems for checking mathematics. These have aroused practical and theoretical interest among logicians and computer scientists, very little among mathematicians. In this paper I shall try to inform a general mathematical readership about such systems, about why and how, and about the possibility that some of those systems may alter the views on how to formalize mathematics, and may even alter the scope of mathematics.

The matter has a large number of aspects, and the reader should not think that these are all tightly linked. One may reject some of the opinions exposed in this paper and accept some of the others.

I will not always be able to avoid duplication with the recent paper in these Notices by Shankar [S], which I recommend reading.

**Automated checking vs. automated proving**

The idea of formal proof checking is much older than the computer. Leibniz
and Boole had already played with the idea to replace thinking by a kind of
algebraic manipulation, but in their time they did not have the means to carry
this out beyond the level of simple details of mathematical discourse. As we
see it today, in order to cope with all possible mathematical situations one
needs some feeling for formal languages as well as for language processing algorithms.
It is not so much the availability of computers, but rather the experience acquired
around computers that led to the design of what I like to call *justification
systems*. I prefer this term over proof checking systems since they handle
much more than proofs. Some of these systems can check complete theories including
whether definitions and axioms are well-formed, they can check relations between
different theories, and the material that can be checked can go beyond what
is called mathematics today.

The activity of *theorem proving* should not be confused with the one
of proof checking. A justification system is not expected to *invent*
proofs, but to verify whether some input is correct mathematics or not.

Automated theorem proving seems to be older than automated proof checking. In a general sense automatic production of proofs for all provable theorems is a very hard task, but inside limited areas it might be feasible and sometimes even easy.

Automatic theorem provers may occasionally do amazing things, but they have their limitations. A justification system, on the other hand, is expected to be able to handle everything that is offered: every correct piece of mathematics should get the system's approval, and all incorrect or incomplete material is to be rejected. An automatic theorem prover is a kind of automated professor, a justification system a kind of automated student. The professor does the harder work but has the advantage of being allowed to select topics and methods. The student seems to have the easier job, but is supposed to digest whatever is served to him.

**Motives for justification**

What motives can one invent for setting up a justification system? (I say ``invent'' since motives are often afterthoughts: one starts something when it seems attractive and promises some success, motives are invented afterwards as a kind of defense). An obvious motive is protection against human oversight in long chains of arguments. Usually mathematicians do not need such checks: in most mathematical situations there are many possibilities to verify intermediate steps as well as final results by means of examples and analogies. But there are cases where a chain gets too long to be grasped by a single brain in a limited time, which worries in particular when the chain had not been produced by a single person but by a group, possibly of both people and machines. Then the poor reader has to rely on mechanical verification of all details. I use the word ``mechanical'' in the old sense of human machinelike action. In that sense we can do pencil-and-paper work without having to think all the time about the meaning. But in order to have any value at all, such mechanical checking should be perfectly organized.

A very important class of applications can be the area of correctness proofs for computer software. In the near future this may be a kind of work that requires and deserves the attention of many mathematicians.

A further motive is to lighten the burden of referees of mathematical papers. If a justification system is so easy to handle that the average author can use it with little effort (at present no systems are that good) then the referee (or the thesis supervisor) can require that the author provides a version of his paper directly in the language of the justification system, so that the referee can run it on his own machine. Then he need not bother about correctness any more, and can concentrate on whether the paper is interesting and new. The question whether it is new, might profit from the system as well: once there is a good justification system used on a large scale, one can think of organizing an enormous mathematical encyclopedia, a data bank of verified results, and such a bank can answer all sorts of questions about its contents.

Quite a different motive is the matter of *understanding* mathematics.
This can mean several things.

One way in which a justification system helps understanding was explained
by Shankar [S]: by being forced to convince a machine, a mathematician can sometimes
transform proofs with subtle errors and duplications into faultless elegant
proofs. And elegance supports understanding and insight. But it can also work
out differently. Since a machine does not *explicitly* require elegance
and shortness, there is the temptation to take the easy way as soon as the machine
has accepted correctness: just go on, without bothering about polishing.

But in a more general sense the *system* that we use for explaining
mathematics to a machine can give insights into the structure of mathematics
and in the difficulties that beginners have in learning to play the game of
mathematics. I have to admit that teaching to a student is not the same thing
as teaching to a machine, but if a teacher is unable to arrange his arguments
in a way acceptable to a machine, then his teaching to students may be an illusion,
not beyond ``teaching by intimidation and learning by imitation''.

And apart from the computer's qualities in precision and in speed, it has its influence in forcing us into an absolutely rigorous form of formalization. If we are unable to leave something to a computer, then it has not yet been sufficiently formalized.

But not all justification systems are equally good in helping us to teach better. I feel that we learn much more from ``framework'' systems (to be discussed in this paper) then from classical systems.

What one is forced to learn anyway is to draw a strict boarderline between language and metalanguage. Mixing language and metalanguage is a well known source of errors and paradoxes. The language is the only thing the verification system checks, the metalanguage helps us to understand what we are doing.

** Absolute safety?**

One of the first questions people ask when hearing about a justification system is whether it would guarantee absolute dependability. This can mean two things. In the first place there is the matter of absolute dependability of mathematics, whatever the foundations may be. I think there is little hope ever to get final answers to that question.

The second thing it can mean is this: once we have accepted a rigorous formalization of some piece of mathematics, and we have accepted the idea that ``mechanical'' verification gives a kind of absolute guarantee of correctness, we ask whether this guarantee would be weakened by leaving the mechanical verification to a machine. This is a very reasonable, relevant and important question. It is related to proving the correctness of fairly extensive computer programs, and checking the interpretation of the specifications of those programs. And there is more: the hardware, the operating system have to be inspected thoroughly, as well as the syntax, the semantics and the compiler of the programming language. And even if all this would be covered to satisfaction, there is the fear that a computer might make errors without indicating them by total breakdown.

I do not see how we ever can get to an absolute guarantee. But one has to admit that compared to human mechanical verification, computers are superior in every respect.

Another question people raise is whether a justification system can justify
*itself*. I think this is asking too much. A justification system can
never justify more than certain fragments of itself, or certain interpretations
of itself.

Philosophically, I hardly think that the question of self-justification, if
it has a meaning at all, is very relevant. If someone tries to convince us with
some story, then our doubts should not be allowed to melt away if the narrator
declares in the same convincing tone to be absolutely sure about the truth.
We should prefer to get support from a different person.

**Doing it like humans**

Some of the principles of the organization of justification systems are copies of what we have always been doing as humans.

We explain things to our automatic student in terms of a *language,*
and according to the rules of that language we write a *book*, consisting
of a sequence of *lines*, or at least a tree-shaped arrangement of lines,
which anyway excludes circular reasoning.

The organization of memory in a justification system usually follows the human pattern too. We work with short and long range memory in our brain, directly accessible written memory on our desk, books on our shelves and books in possibly far away libraries. In a computer justification system we may observe a similar organization of memory.

Another aspect in which human behavior is followed closely is the *production*
of mathematics, starting from a kind of raw material, clever but vague ideas,
and ending with a final product of stupid but strictly precise formalities.
The process looks like an assembly line in a factory. In the beginning of the
conveyor belt there is the Mathematical Genius who puts his ideas on the belt.
Next there is the Brilliant Mathematician who is able to write it all up in
today's publication style. The next place along the belt is taken by the Competent
Mathematician. He knows the tricks of the subject, is able to supply necessary
material that his predecessor did not even mention, and is able to write meticuously
in every detail.

In presenting mathematics to human students we usually do not give that ultimate form, and it is questionable whether we should. After all, students should learn to fill gaps themselves. If we keep chewing all food for them they will never develop proper teeth. Nevertheless they can learn a great deal from chewed material. In this connection I mention the way Edmund Landau wrote his books on analysis and number theory.

So the final product of the Competent Mathematician is a mathematical text
that requires no specialized knowledge or experience from the reader. Let me
call it *Landau style*.

The next stages along the belt have people who transform this product into the language of the justification system. Again it will turn out that gaps have to be filled, in particular since the Competent Mathematician did not bother to give the proper references all the time, and certainly did not always mention the logical derivation rules. After all, generations of mathematicians have done their work efficiently even without consciously knowing their derivation rules!

The work may be subdivided. One can think of a first stage where a person with some mathematical training inserts a number of intermediate steps whenever he feels that further workers along the belt might have trouble, and a second stage where the logical inference rules are supplied and the actual coding is carried out. For the latter piece of work one might think of a person with just some elemenary mathematics training, or of a computer provided with some artificial intelligence. But we should not be too optimistic about that: programming such jobs is by no means trivial.

Finally, at the end of the belt the Checker does the final verification. The work might be done by a human with an unusual amount of endurance, but it is much better and cheaper to leave it to a computer.

This picture of an assembly line may be a novelty for many mathematicians, since they hardly ever delegated parts of their work to people below their own level. This is quite different in other sciences.

**Platonism**

Some people feel that mathematical thinking depends on the existence of a real world of mathematical objects, an idea called Platonism. For communication between mathematicians this idea is irrelevant: in a mathematical discussion between a believer and a non-believer none of the two notices their different backgrounds. And of course it is irrelevant in mathematical communication with a machine. The machine does not store the mathematical objects we are talking about. The only thing it may have to store is what has been said thus far.

It is instructive to compare a justification system for mathematics with a
verification system for chess. The latter has to be able to read a chess game,
consisting of a sequence of lines representing the moves. It has to find out
whether that sequence of moves produces a legitimate chess game or not. The
chess moves are intended to update the position on the board, and actually the
rules of the game express legitimacy of the next move with respect to the updated
position (including a few extra bits of information since we have to know who's
move it is, as well as a few details on castling and capturing ``en passant'').
The chess verification system will of course store the position like a kind
of Platonic reality, and can even forget about all previous moves. So we get
the feeling that the sequence of moves talks about the positions, that board
and pieces form the reality and that the given sequence of moves is just some
abstract coding of the sequence of positions.

The difference with a mathematics verifier is striking. In mathematics we have
nothing but the discussion, and even if a mathematical reality would exist,
none of it would be stored in or consulted by the machine.

The language of mathematics is *not* talking about a limited number
of things. If it were, a justification system might try to take a kind of model-theoretical
approach by testing every statement in that world of objects. The language of
mathematics cannot be verified on the objects: there are too many of them. The
only thing we can do is applying the rule that things are correct if they have
been correctly said. The notion of correctness is not formulated in terms of
a mathematical reality, but involves rules about how a statement should be related
to material that has been said before. Many non-mathematicians who hear about
verification systems get the idea that such systems can handle only ``constructive''
situations like finite mathematics. This confusion depends on that wrong idea
of implementing mathematical reality.

I believe that every Platonist would be converted at once when having to explain his mathematics to a machine.

**Doing it naturally**

I think that in formalizing mathematics, and in particular in preparing mathematics
for justification, it is usually elegant as well as efficient to do everything
in the *natural* way. That word of course does not mean ``like in nature'';
it can at most mean ``like normally in our culture''.

Justification can be achieved step by step, like in the assembly line mentioned
before. Putting a piece of mathematics into a justification system is a process
of successive refinement. We begin by a rough sketch, and in various rounds
we supply more and more details. The first few of these rounds belong to our
cultural habits. I would call it ``natural'' if we proceed by successive refinement
of those first rounds, and ``unnatural'' if the line of attack has to be completely
overthrown or remodelled. In that sense Boolean logic is unnatural, and natural
deduction is natural. Boolean logic comes down to replacing reasoning by an
algebraic machinery that is *not* a refinement of what it is supposed
to implement. Similarly Descartes' analytic geometry did not refine geometric
proofs but replaced them by algebraic ones with completely different structure.
Both Descartes and Boole created a beautiful and powerful theory, but I would
not call their work ``natural''.

But of course, since the word ``natural'' means ``cultural'', it is subject to change. Set theory is an example for this. In my private opinion, it is unnatural to base mathematics on type-free set theory, where almost all mathematical notions are coded as elements of the Zermelo-Fraenkel universe (to be referred to as ZF). When it started, it reorganized much of the existing structure of mathematics, and could not be seen as a refinement. But later generations of mathematicians had often been exposed to ZF very early in life, and would therefore call it quite natural.

Some even believe (like Cantor possibly did) that the ZF universe is not fiction but Platonic reality. This seems to be in conflict with the popular slogan ``everything is a set''. That slogan means that very many things (of course not everything) can be coded in ZF. In a case like the one of the real number system, various different codings are in use. Platonism of course wants to consider the reals as a kind of reality that is independent of the way we talk about it. So we have the picture of the Platonic reals, with injections (the codings) into the Platonic ZF universe. In that picture the real reals would are no sets at all.

Coming back to the idea of refinement, I must confess that it is difficult
to keep it pure on the long run. The Genius at the beginning of the assembly
line may look with one eye at what happens at the other end of the line, and
may adapt his ideas to the needs of the technology displayed there. Technical

realization can have influence on design.

Moreover I am not sure that the idea of systematic refinement has an eternal
value worth fighting for. After all, it is extremely conservative, and there
is nothing against a revolution now and then.

**Doing it efficiently**

A very important feature of efficient justification systems is *linearity*.
This refers to the length of a complete account of a piece of mathematics in
standard language (Landau style) compared to its translation in the language
of the justification system, and compared to the time a computer needs for verification.
In a bad system the latter two items may grow exponentially compared to the
first one, in a good system the relation is linear. It can be linear if the
system makes full use of all the definitions, abbreviations, lemmas and theorems
that the standard mathematical language already

provides.

If the piece of mathematics in standard language contains gaps, however, or if it tacitly appeals to experience the student is assumed to have acquired thus far, then the text for the justification system might become considerably longer. But then it is not fair to put the blame on the system.

Without linearity, verification would never be feasible. In the Automath project in the early 70's feasibility was put to a test. The test was to push a full mathematics textbook through the justification system. The book chosen was E. Landau's ``Grundlagen der Analysis'', and Landau was followed in every detail. No attempt was made to simplify or to modernize the text. The translation was carried out by L.S. van Benthem Jutting [J], and the test was completely successful. Linearity from the beginning to the end. And computer technology of the early 70's was good enough for this. The speed was never a problem, but memory limitations were (and would still be).

Checking a Landau-like text is not always exactly what we have in mind. There is not always a Landau so kind as to write books in a systematically detailed form. Most mathematicians hate to do that, and it should be said that in some parts of mathematics that work is more inattractive than in others. In some areas all the steps are equally elegant and interesting, whereas in other areas we see elegant steps with a lot of dull work in between. At least there should have been done dull work in between, usually suppressed by the author. In such areas people are likely to hate formal verification.

**Filling gaps**

We would like to have an automatic student who behaves like an intelligent human student, and not like one that does ``mechanical'' checking without ``understanding''.

This indicates what we have to require from justification systems in the future. In the first place they should be able to deal with all the small gaps in any piece of mathematics, in particular gaps which can simply be filled by an appeal to a logical derivation rule. And in cases where references to previous material are needed, the machine should be able to find those, possibly guided by hints in the text.

Whatever a human student finds easy, should be automated, possibly by means of some artificial intelligence. This kind of automation is hard, but I think it can be done and will be done in some of the efficient systems which are being developed today.

But there is more. A good human student is not just assumed to have a general
aptitude in mathematical reasoning in general, but is also assumed to *learn*
from the particular subject that is presented, recognizing situations that have
been understood before. This may amount to building a kind of subconscious library
of lemmas that have not been formulated explicitly in the text, or even a subconscious
library of methods. I think one is still very far from full automation of learning
processes, so it will be a long time before we can automate the brilliant student.
But for many purposes it may suffice to automate the average student.

Many mathematicians dislike pushing formalization to the extreme. The idea
is that it kills intuitive thinking. I do not entirely agree. It may be true
that unnatural formalization replaces intuitive thinking by an entirely different
process of formula manipulation, but *natural* formalization supports
intuition rather than destroying it. Formalization and intuitition should be
each other's best friends rather than ennemies.

But part of what we call intuitive thinking is not of the kind that can be refined to proofs. That part cannot be formalized. Our brain processes are not based on logic or any other foundation of mathematics, and nevertheless they produce wonderful things. But all mathematicians agree that the results of intuitive thinking have to be justified by rigorous reasoning, even though there may be different opinions about the level of formality.

**What is a proof?**

Can a justification system check the computer proof of the four color theorem? Or, rather, is that proof really a proof?

Instead of the spectacular four color theorem I prefer discussing a simpler case. Imagine a combinatorial problem for which a computer search establishes the theorem that there are 24103 solutions. Do we consider that as a proof? Let us assume that there are correctness proofs, not just of computer program, of computer language compiler and operator system, but even that there is a complete description of the hardware specifications. We now buy a computer, trusting that it satisfies the specifications, we let it run and get as output that there are 24103 solutions. Is that a proof?

Of course it is not. At most we have a proof for a more complex statement:
``if the abstract execution of the program on an abstract machine with the prescribed
hardware leads to 24103 solutions, then there are 24103 solutions''. This statement
does not even have the *form* of the one we have in mind; it is a companion
theorem in some kind of metalanguage.

But it is reasonable to ask whether the computer search can be refined to a proof in the ordinary sense. I think that in many combinatorial search problems such a proof by refinement can be written (in the language of a justification system) by a machine, and that the program producing that proof can be obtained by refinement of the computer search program. That machine-produced proof can be checked by the justification system. Possibly no human will read that proof, but nevertheless it is open for human inspection. We can inspect the general organization, and if we select any detail we can convince ourselves that the proof of that part is perfect. This is not very different from the situation in standard language in case of exceedingly long proofs that we have not written ourselves.

The use of machines for obtaining theorems was never questioned in the matter of numerical calculations. If, for example, as a step in a mathematical proof we need the fact that the product of 239 and 4649 equals 1111111, then it is not customary to require a formal proof in mathematical formalism. But it is reassuring to know that the process we carry out by the pencil-and-paper multiplication algorithm can be refined to a proof, and it is not hard to have that refined proof produced automatically.

**Frameworks**

When designing a justification system for mathematics, there is the crucial
question how to start. In our present society it is more or less accepted to
say (often only as lip service) that mathematics is founded on a basis of classical
predicate logic and ZF set theory. Are we to base the justification system on
that foundation? Accepting this, we get what I shall call a ``classical system''.
But there is the alternative to start off from a more primitive level, with
the possibility to *present* logic and set theory as explained material
*in the system*, on an equal footing with the presentation of all other
mathematical material. In that case I use the term ``framework system''. Such
a set-up means that ``the usual way'' becomes an

option, possibly along with others, and that the user is allowed to be critical
about the usual way.

The rules of the game in a framework system have to express how to *handle*
mathematics and logic, more or less independently of the contents. Here ``handling''
involves how to work with things like definitions, assumptions, axioms, free
and bound variables, substitution, proof rules, proofs, theorems. When designing
the framework one has to bother about what it means to *apply* a definition
or a theorem or a proof rule and not about what particular rules or axioms are
to be taken as a basis. It is not very customary among mathematicians and logicians
to discuss that framework: it is usually assumed to be available before mathematics
and logic start off. But when trying to instruct a computer to follow our mathematical
habits we are forced to be explicit about these matters.

**Typing**

When I started working on a justification system around 1966 I wanted to make
something of a universal nature. I gave a great deal of thought about the framework,
and that led me in a quite natural way to giving a central place to the idea
of *typing*, i.e., attaching a type to every expression. The result was
the system to be called Automath [dB1]. To insiders it might be described as
natural deduction with (typed) variables and lambda-typed lambda calculus (lambda-typed
means that the types may again be lambda terms), with argumentation structure
depending on the idea of ``proofs as objects'' (others use the term ``propositions
as types'' but I consider that as unfortunate).

My interpretation of typing is related to the use of ``is a'' in english.
John is a soldier, London is a town, *q* is a rational number and *P*
is a point. Now let us call ``soldier'' a type, and ``John'' an inhabitant of
that type. In english these types play the role of substantives.

Replacing the ``is a'' by a colon, one gets ``John : soldier'', ``London :
town'', ``*q* : rational number'' and ``*P* : point''. But unlike
typed lambda calculus, natural language does not have uniqueness of types. ``John
: soldier'' is in no way in conflict with ``John : Canadian''. Of course one
may try to say that ``soldier'' and ``Canadian'' are subtypes of ``man'', and
that ``man'' is the ultimate type, the *archetype*. But that does not
work. In a different context one claims that John is a human being, or that
John is a living being. It is hard to pretend that there is something like an
ultimate most general typing for John.

In mathematics this is different. I would describe the situation as follows.
At some point of the discussion a type *A* is introduced for the first
time, and subtypes of *A* are introduced later. But after *A*
has been created, it is not possible to create a type *B* and to say
that *B* is a supertype of *A*. Of course there can be situations
that a mathematician wants to have it that way, but then the rule of the game
is that he has to start all over again, creating *B* first and introducing
*A* as a subtype of *B*.

In the typed justification systems the types are essentially unique. So they
are a kind of implementation of the archetypes mentioned above. Subtyping is
not to be considered as typing, but has to be described by an archetype plus
a predicate.

**Typing can implement reasoning**

Let me try to explain how a typed framework system can handle mathematical
reasoning. Instead of trying to build up a complete picture I start somewhere
in the middle. We have some theorem *T* and want to apply it in a particular
case; that application is a theorem *T_1*. In theorem *T* there
are some variables and some assumptions; let us just take one of each, and have
as an example ``Let *x* be a real number, assume that *p(x)>3*.
Then *q(x)<5*'' (*p* and *q* are supposed to be known
functions). Later we have an application. We have real numbers *a* and
*b*, we know that *p(a+b)>3*, and our new theorem *T_1*
is that *q(a+b)<5*. How do we convince a machine that this is an acceptable
application?

The machine wants to know several things. First it needs a reference to the
place where theorem *T* was proved. It sees that the theorem contains
a typed variable and an assumption. It wants to know from us what we want to
take for *x*, and we say * a+b*. The machine is able to check
that this has the right type: it is a real number indeed (typing is unique and
can be found by straightforwarded calculation, for which the machine needs no
references or hints). Next the machine requires a proof of the assumption, not
of the original *p(x)>3* but of the one we get upon replacing *x*
by *a+b*. Let us say that *p(a+b)>3* was proved earlier. We
satistisfy the machine's inquisivity by referring to a place *Gamma*
where that proof is to be found. The machine checks this reference and then
accepts all this as a proof of *q(a+b)<5*.

So we have supplied *a+b*, which is a real number, and *Gamma*,
which is a proof for *p(a+b)>3*. In both cases we have the phrase
``is a'', and I interpret that as typing. The types are ``real number'' and
``proof for *p(x)>3*''. Proceeding in this style we note that proofs
get the same treatment as

objects, and in manipulations like substitution, the machine treats the two
in exactly the same way. It does not even have to know whether expressions stand
for objects or for proofs.

The idea to treat proofs in the same way as mathematical expressions representing
objects, is quite natural even if we do not think of a machine that requires
information from us. It can be discussed in ordinary mathematics too. Instead
of getting to theorem *q(a+b)<5* by *application* of theorem
*T*, we can get it by considering the proof of *T* as a blueprint
that we just have to copy (with the proper adaptions) in order to get a proof
for the application *T_1*. So the new proof is obtained by substitution
into an old proof, and that is just like substituting into a function.

With this parallelism we get the full structure of mathematical reasoning
at once. A definition has the form * f:=P:Q*, where *f* is a new
name, *P* an acceptable expression and *Q* its type. In the proof
world it corresponds to a theorem, where *P* expresses the proof, *Q*
expresses (via ``proof of'') the proved statement, and *f* is the name
of the proof. The introduction of a typed variable *x:Q* corresponds
to an assumption. There *Q* represents (again via ``proof of'') the proposition
that is assumed, and *x* is a name we can handle, during the lifetime
of the assumption, *as if* it

were a proof of the assumption, completely parallel to the real variable *x*
that is treated (during its lifetime) as if it were a real number.

There is a second kind of parallelism on a different level: types can be treated the same way as objects, in the sense that they can depend on variables and also act as variables. Accordingly we have a kind of constructions, similar to functions of several variables, where some of the variables are on the level of objects or proofs, and others are on the level of types. And the values of the functions can be objects as well as types.

So we have two levels of typing. On the one level (``low'' typing) we say things like ``3 is a natural'', on the other one (``high typing'') ``natural is a type''.

Such a simple systematic framework for dealing with typing, enriched with
facilities (typed lambda calculus) for handling situations with dummy variables,
is sufficiently simple to be natural, and rich enough to express almost anything
we want. To take an example: the world of Greek geometry did not only handle
geometrical objects and proofs, but geometrical constructions too. And a text
describing a construction and proving that its result is the one we wanted,
handles three kinds of low typing. Apart from the two mentioned before (with
objects and proofs) we get a third one, of the form `` ... is a construction
for ...''. And the three worlds are happily intermingled, referring to one another
all the time. The combination of various theories into a single formalism can
be called *integration* (see [dB5]).

I think that this is much more natural then encoding geometrical constructions
as points in the ZF universe.

**Typed sets vs. untyped sets**

These frameworks based on typing can cover a very large part of formalizable
mathematics, at least as much as what one has been doing with ZF. Actually,
one of the things one can describe by means of the framework *is* ZF.
But we have the alternative to build mathematics on types (with sets as subtypes).
In a way that gives much more, since types can depend on all sorts of parameters
(including type parameters), can be freely created as primitives (like we do
with axioms), and can be introduced as variables. The only thing that we do
*not* do is to create all our types in one stroke, by a single set of
axioms. That is a very essential difference with ZF.

The type structure is also rich enough to enable us to add entirely new areas
of application. I mentioned objects, proofs, geometrical constructions, but
there can be much more. Like algorithms in general (of which the geometric constructions
are a special case). And we might get closer to fulfilling Leibniz' dream of
a general language for science.

*A word about lambda calculus*

Almost everything in framework systems depends on the notations of the lambda calculus. That is something that still seems to scare the majority of today's mathematicians. Very strange, since the lambda notation is such a great help in writing mathematics. It seems rather clumsy to live without it, in particular in fields like functional analysis. I can think of only one reason why mathematicians still don't use it: it is because Bourbaki did not do it!

The *theory* of lambda calculus may have difficult aspects, but the
notation itself is simple. It just amounts to the use of a quantifier with a
bound variable in order to build a function that is given by its values. Instead
of introducing the symbol *f* for the function that sends every *x*
of a set *S* to *x^2 + x -3*, we can talk directly about that
*f* as *Lambda : x \in S . (x^2 + x -3)*. So we do not need the
letter *f* and we do not need the extra sentence which defines *f*
by its values (it is a sentence in the metalanguage).

It is amazing how much can be written in terms of lambda expressions! In particular
I refer to [dB4] where it is explained how a complete book written in the language
of a justification system can be considered as a typed lambda expression, and
where the notion of correctness of the whole book just reduces to the notion
of correctness of that single expression.

Nevertheless we can do at least *some* mathematical reasoning *without*
lambda's. In contrast to most other framework systems, Automath has a (very
natural) feature of *instantiation*. It has the effect that material
written in some context can be used later outside that context, without any
appeal to lambda abstraction.

One might say, roughly, that until the 19th century mathematics needed no lambda
calculus. Almost all of it might have been written in terms of the lambda-free
fragment (Pal) of Automath, using the instantiation device for the description
of explicit functions. It was only during the 19th century that the notion of
a function moved very slowly from the metalanguage into the language, and that
made the lambda notation indispensable.

**Scope of mathematics**

A justification system is expected to be able to handle all formalizable mathematics. But the typed framework being as simple and rich as it is, we have the right to turn the tables, and say: whatever the framework can describe, is to be called formalizable mathematics. In particular it contains all logic and computer science.

The claim is not so pedantic as it seems, since formalization does not encompass all mathematical activity. There are other wonderful things like intuitive thinking, mental pictures, heuristics, metalanguage and interpretations.

**Experiences**

It may be instructive to tell something about the experience we got in Eindhoven with teaching students to handle the Automath system. In the period 1971-1984 I gave an introductory course on Automath almost every year. Students could get credits for that course, not by passing an examination but by doing practical work. Quite often we gave them some material they were acquainted with, and required them to have it checked by the system. No treatment of preceding theories was required: all known material used in the proofs could be introduced by means of axioms. The students were usually mathematics majors with about 4 years of training in pure and applied mathematics, and no training at all in logic. The general rule was that they had to spend about 100 hours on their job, which included the time for attending the course. Most of them did rather well, and delivered a complete machine-checked proof.

Examples of such pieces of mathematics were: the elements of group theory, the Banach-Steinhaus theorem of functional analysis, Dirichlet's pigeonhole principle, the Konig-Hall theorem of combinatorics, Van der Waerden's theorem on arithmetic progressions. One student wrote a master's thesis in Automath, treating a new theory of the real number system. It took him 9 months to deliver a fully checked text plus a report about the work. It would not have taken him much less time to organize and finish the same material in ordinary textbook style.

In general it can be said that the more abstract the piece of mathematics, the smaller the gaps and the easier the justification. Less abstract fields, like combinatorics, can be hard since they may contain simple intuitive ideas for which there is no tradition of formalization.

**Looking around**

It was the main purpose of this paper to try to explain what justification systems are, and not to enter into details of systems. But here are some references for readers who might like to have some real information about various systems.

For the Boyer-Moore system I refer to [S]. For a survey of the Automath project to [dB3], for the Nuprl system to [C], for the Calculus of Constructions to [CH]. A way to put many framework systems in a common scheme was described in [B] (a similar scheme was already given in [dB2]).

Finally, the collection [HP] can give a good impression of what is going on in the field.

I already mentioned that I see a future in techniques of automated gap-filling in order to make justification systems useful for the working mathematician. For that particular activity I think it makes little difference whether we take a classical or a typed approach. In both cases one will do roughly the same thing.

But different justification systems may have different views on the output. I would prefer the case where the machine's output is: ``I have been able to fill the gap, and here is the proof written in your own language''. Other systems might say: ``OK, I have been able to fill the gap to my satisfaction, but my proof would have been unreadable for you, and has been put in the garbage already''. I would call those systems ``black box systems''. They might work very fast, but slightly against the ethical principles of justification.

*References*

[B] H.P. Barendregt. ``Introduction to generalised type systems,''

Proceedings 3rd Italian Conference on Theoretical Computer

Science., Eds. A. Bertoni a.o., World Scientific, Singapore 1989.

[dB1] Bruijn, N.G. de, ``The mathematical language Automath,

its usage, and some of its extensions,'' Symposium on Automatic

Demonstration (Versailles, December 1968), Lecture Notes in

Mathematics vol. 125, Springer Verlag 1970, pp. 29-61.

[dB2] ---, ``A framework for the description of

a number of members of the Automath family,''

Memorandum 1974-08, Department of Mathematics,

Eindhoven University of Technology, 1974.

[dB3] ---, (1980) ``A survey of the project Automath,''

In: *To H.B. Curry: Essays in combinatory logic,
lambda calculus and formalism*, pp. 579-606,

Academic Press, 1980.

[dB4] ---, ``Generalizing Automath by means of a

lambda-typed lambda calculus.''

In: *Mathematical Logic and Theoretical Computer Science*,

Lecture Notes in Pure and Applied Mathematics, Vol. 106,

(ed. D. W. Kueker, E.G.K. Lopez-Escobar, C.H. Smith)

pp. 71-92. Marcel Dekker, New York 1987.

[dB5] ---,``The use of justification systems for

integrated semantics,''

In: *Colog-88* (ed. P. Martin-Lof, G. Mints), Lecture Notes

in Computer Science, Vol 417, pp. 9-24, Springer Verlag, 1990.

[C] Constable, R.L., et.al., *Implementing Mathematics with the Nuprl
Proof Development System*, Prentice-Hall Inc.,1986.

[CH] Coquand, T. and Huet, G., ``The calculus of constructions'',

Information and Computation, Vol. 76, 95-120, 1988.

[J] Jutting, L.S. van B., *Checking Landau's
``Grundlagen'' in the Automath system*,

Mathematical Centre Tract Nr. 83, Amsterdam 1979.

[HP] Huet, G. and Plotkin, G. (editors), *Proceedings of
the First Workshop on Logical Frameworks*, Cambridge University Press,

to appear 1991.

[S] Shankar, N., ``Observations on the Use of Computers in

Proof Checking'', *Notices of the American Math. Soc.*, Vol. 35,

Nr. 6, 1988.