Originating authors are Michèle Artigue and Ferdinando Arzarello.
Given a square grid, it is easy to draw squares whose vertices are intersections of the grid lines. But is it possible to do so for other regular polygons, for instance an octagon ? The answer is : « No » and it can be proved, for the octagon, as follows (Payan, 1994) :
Let’s forget about the grid for a minute. To each regular octagon (that is, an octagon such that the sides all have the same length and all the angles are equal), we associate another octagon as indicated in figure 2. The point 
 is the image of the point 
 by the anticlockwise 
 rotation of centre 
, the point 
 is the image of 
 by the anticlockwise 
 rotation of centre 
, and so on… We can show that we obtain then another regular polygon, homothetic to the first one, and with same centre 
. Its surface is strictly smaller than the surface of the first octagon.
Let’s now go back to regular polygons whose vertices are intersections of grid lines. We associate to such a polygon 
 the number 
 of intersections of grid lines strictly contained in the polygon (i.e. the intersections which are inside the polygon but not on a side). For example, if 
 is the pentagon of Figure 3, then 
 (reminder : This number is related to the surface of the pentagon by Pick’s formula  Pick’s theorem ).

Figure 2 : Construction of the second polygon

Figure 3 : Computation of S(P)
Suppose now that we are able to build a regular octagon whose vertices are intersections of grid lines : 
. By the process described previously we can associate to it another regular octagon 
 whose vertices are also intersections of grid lines, since this property is preserved by 
 rotations whose centre is a point on the grid. Moreover the integer 
 will be strictly smaller than 
. Iterating this process necessarily ends with a contradiction.
We have proved the impossibility of an octagon with vertices on grid intersections using an arithmetical method well known since Fermat’s period: a proof by infinite descent. What does this method consist in ? It consists in the impossibility of building a strictly decreasing infinite sequence of natural numbers. It relies on a property of order on the set of natural numbers : the property of being well-ordered, which means that every non-empty subset has a smallest element. Indeed, suppose that we could build a infinite decreasing sequence of positive integers 
. Let 
 be the set of all the terms of the sequence : 
. Then 
 has a smallest element, say 
. It is an element of the sequence, so there exists 
 such that 
. But then 
 and 
 belongs to 
, which contradicts the definition of 
.
Mathematical induction and the well-ordering of natural numbers
Mathematical induction, generally introduced in high school, is in fact a consequence of the well-ordering of natural numbers. It is the third of Peano’s axioms for arithmetic (1908), which is formulated as follows :
« If 
 is a set containing 
 and if for any element 
 in 
 then the successor of 
 is also in 
, then 
 contains every natural number ».
We usually formulate it as follows : If a property 
 about natural numbers is possessed by 
 and also by the successor 
 of every natural number 
 which possesses it, then it is possessed by all natural numbers.
If we admit that the order on natural numbers is a well-ordered relation, then mathematical induction is a direct consequence. Indeed, suppose that the property 
 is not possessed by all natural numbers, and let 
 be the set of natural numbers which do not possess it. Then 
 is not empty, it has a smallest element 
 and 
 is not 
 since 
 possesses 
. Then it has a predecessor 
 which is not in 
. But then 
 possesses 
 and so does 
, which contradicts the definition of 
.
Mathematical induction can also be used to define functions. For instance, this is what we do when we define a sequence by induction, by giving a relation 
 where 
 is a function defined on 
 and a first term 
. We also call this a definition by induction.
Recurrence and induction : first generalization
More generally, we can think with induction on every set which is well-ordered. Consider for example the set 
 with lexicographical order : 
 if 
 or 
 and 
. This order is a well-ordered relation. Indeed, let 
 be a non-empty subset of 
, we can have two cases :
- either all the elements of 
 are of the form 
. Then 
 is a non-empty subset of 
 and it has a smallest element 
. Consequently 
 is the smallest element of 
 ; - either there exists an element of 
 of the form 
. Then showing that 
 has a smallest element consists in showing that 
 has a smallest element, which is necessarily the case. 
Similarly, the lexicographical order on 
 is a well-ordering. We leave it to the reader to adapt the previous proof to this case. Thus, there cannot be an infinite strictly decreasing sequence in these sets, even if there are infinitely many elements which are smaller than any element of the form 
 with 
 all those of the form 
 with 
.
In 
 the principle of infinite descent which we used in the first example can be applied. And in 
 we can also define functions by induction. It is the case for example for the famous function from 
 to 
 defined by Ackermann in 1932 which we will talk about later.
Induction and ordinals
In set theory, this notion of well-ordering, which is essential for mathematical reasoning, is seen through the notion of ordinal and does not only explain well-ordering relations as seen previously. Ordinal numbers, and consequently natural numbers which are finite ordinals, are defined as well ordered sets by membership and transitivity (a set 
 is transitive if all element 
 of an element 
 of 
 is also an element of 
), but in this article we will only see an intuitive approach of transfinite ordinals introduced by Cantor at the end of 19th century. The smallest infinite ordinal is denoted 
 and is the union of all finite ordinals. It represents the order on natural numbers. It has a successor 
 which has a successor 
, and so on… The smallest ordinal greater than any 
 is 
 or 
. The smallest ordinal greater than any 
 is 
 and so on.
      ![]()
There are in fact two kinds of ordinals, those which are successors of a given ordinal, such as 
, 
, for 
, and those which have no predecessor but are the union of all smaller ordinals, such as 
, 
, 
 in the previous list. We notice that the lexicographical order on 
 gives to this set an order which corresponds to the order of the ordinal 
. The generalization of the lexicographical order on 
 corresponds to 
, … But there are many other ordinals than those we mentioned here and every one of them is countable. It can be shown with set theory, using the axiom of choice, that any set can be well-ordered.
We can define functions by induction on ordinals. To define a function by induction on an ordinal 
, we have to give its value at 
 and the process which allows us to obtain the value of 
 from the values of the function on previous ordinals. Note that it is useful to consider two cases to define 
, whether 
 is a limit or a successor. On ordinals we can proceed by infinite descent and the  Goodstein vignette is a good example of such a proof.
About the Ackermann function and the diagonal argument
An example of a function defined by induction is the Ackermann function, mentioned previously. It is denoted by 
, and is defined as follows :
- For any positive integer 
, 
 - For any positive integer 
, 
 - For any pair of positive integer 
, 
. 
Its value at a point 
 is then given explicitly if 
 is 
. If 
 the computation of 
 uses the values of the function at some points 
 such that 
 for the lexicographical order. This computation requires a strictly decreasing sequence of such points which is necessarily finite since the lexicographical order is a well-ordering relation. However, the sequence may be very long. How many terms are necessary to find 
?
Let’s start with a simpler example : 
, using the following table.

The first condition 
 allows us to fill in the first column. Then we successively have :
 ; 
 ; 
, and more generally we have :
      ![]()
We can then find :
 ; 
 and eventually 
.
In order to find 
, we have to find all the values of the function for the pairs of the first column up to 
, of the second column up to 
, and of the third column up to 
, i.e. we have to use 
 values.
What happens if we want to find 
 and finally how much is 
 ? We leave it to the reader.
By fixing one of the two variables of the Ackermann function, we obtain what is generally called arithmetic functions. Thus, we write 
 the function obtained by choosing the first variable to be 
. We saw that 
, so 
 is the successor function and 
. We can show that 
 and 
. Then, the expressions quickly become very complicated, using iterations of powers, and the values of 
 become very high in only a few steps. For example, 
 and 
 is a number with 
 digits.
In fact we can show that the function 
 defined by 
 is such that for all primitive recursive function, that is to say any function which can be defined with the successor function and projection functions from 
 to 
 by composition and induction, there is a positive integer 
 such that 
 for any 
 greater than 
. This how we prove there are computable functions which are not primitive recursive. The Ackermann function is an example of such a function. But usual functions on integers, such as functions used in secondary schools, are all primitive recursive. The functions which associate to two natural numbers their sum or product are primitive recursive. We can indeed define them by induction by : for all 
, 
 and 
 and for all 
, 
 and 
. We leave it to the reader to deduce that the function which associates any pair of natural numbers 
 to 
 is also primitive recursive, such as any iteration of such function. More generally, any function which can be defined using primitive recursive functions with an algorithm using a loop of the form « For 
 from 
 to 
 » is primitive recursive.
The proof requires the diagonal argument. This argument was invented in 1874, by Georg Cantor, in order to prove that the set of real numbers is not countable i.e. there is no bijection between 
 and 
. In 1891, he used this method again to prove that the power set of any set 
 (i.e. the set of all subsets of 
) has a cardinal strictly smaller than the cardinal of 
. The invention of the diagonal argument has been an important step in the development of mathematics. In particular it is the starting point of the proofs of many results, including Gödel’s incompleteness theorem, and of many theorems of computability theory. Russell’s paradox is also based on this idea. We can give a taste of this argument in high schools by proving that the interval 
 is not countable, using the fact that every real number has a unique decimal representation (considering only normalized representations, i.e. we do not consider improper decimal representations ending by infinitely many 
). Suppose there is a bijection between the interval 
 and 
 and hence consider a sequence 
 of these real numbers. The proper decimal representation of the real number 
 has the form : 
,
 , and may end by infinitely many 
. Consider the real number 
 whose decimal representation is 
 where 
 if 
 and 
 if 
. It is then clear that 
 is the decimal representation of a real number 
 in the interval 
 and that 
 is not equal to any element of the sequence 
. Consequently the interval 
 is not countable, and neither is the set of real numbers.
Conclusion
The principle of mathematical induction has been used since ancient history. For instance, in Euclid’s Elements, it is used to show there are infinitely many primes. Similarly, the principle of infinite descent has also been used for hundreds of years and is often related to Fermat who used it repetitively in his work on number theory. On the other hand, showing the equivalence of these two principles is more recent, and is due to the development of mathematical logic. The main tool for this is the notion of well-ordering in which Georg Cantor saw a « fundamental tool for thinking ». It allows us to see the equivalence between mathematical induction and infinite descent, but also to work on sets other than natural numbers and understand in particular how induction on more than one counter works (as in the Ackermann function).
The example chosen to start this vignette shows that mathematical induction is a general tool in mathematics, and not a tool only used in number theory. It plays an important role especially in discrete mathematics.
This vignette was also the opportunity to mention some important questions other than induction, but which are not independent, such as computability. The mathematisation of the intuitive notion of computability, which required the encoding of different types of discrete objects, and the use of diagonal arguments, has helped to find important results in various fields. It is also fundamental to connections between mathematics and computer science.
References
[1] Calude C., Marcus S., Tevy, I (1979). The first example of a recursive function which is not primitive recursive, Historia Math., vol. 6, n. 4, 380–84.
[2] Cori, R., Lascar D. (). Logique mathématique, tome 2 : Fonctions récursives, théorème de Gödel, théorie des ensembles, théorie des modèles. Paris : Editions Dunod.
[3] Payan C. (1992). Une belle histoire de polygones et de pixels. MATh.en.JEANS” au Palais de la Découverte, pp. 99-106. mathenjeans.free.fr/amej/edition/actes/actespdf/92099106.pdf
			
Very interesting.
Do you have anything about Theon from Smyrna and his diagonal numbers?
The tell the solutions to Diophantic equations in a very simple way.
Pascals triangle is reduced to a xy-table.
Theon of Smyrna’s work could be an example for this vignette. Maybe in another one !