The Revenge of the Infinitesimals

Originating author is Michèle Artigue.

Infinitesimals played an essential role in the emergence and development of differential and integral calculus. The evident productivity of this calculus did not prevent recurrent and fierce debates about the nature of these objects and the legitimacy of their use. At the end of the 19th century, when the construction of the real numbers from the integers and the modern definition of the concept of limit provided a solid foundation for differential and integral calculus, infinitesimals and the associated metaphysics were rejected, and their use came to be seen as synonymous with outdated and insufficiently rigorous practices. However, the language of infinitesimals continued to be used, for example in physics and even in mathematics. It never completely disappeared from the informal discourse and heuristic thinking of a number of researchers.

Is this language really incompatible with mathematical rigour? What does it offer that is interesting and specific enough to explain its persistence? Non-standard analysis, developed in the 20th century, provided answers to these questions and enabled infinitesimals to take their revenge.

From infinitesimal calculus to non-standard analysis

In the preface of the first treatise on infinitesimal analysis, published in 1696, its author, the Marquis de l’Hôpital, praises the power and ease of the new calculus that infinitesimals make possible $^i$ :

Figure 1: Cover and excerpt of the preface of the treatise by the Marquis de l’Hôpital

However, debates about infinitesimals and their use arose rather quickly. In a famous essay published in 1734, The Analyst, George Berkeley fiercely criticised the use of infinitesimals, or evanescent increments, in differential calculus, and Jean Le Rond d’Alembert, in the Encyclopédie Méthodique published in 1751, tried to dispense with them by relying on the intuitive idea of a limit. At the turn of the 20th century, with the development of modern analysis, this seemed to have been achieved. However, half a century later, the logician Abraham Robinson rehabilitated infinitesimals and the associated practices.

Abraham Robinson indeed showed that the language of infinitesimals is fully compatible with mathematical rigour. The logician Thoralf Skolem had already shown in 1934 that the set obtained by adding successive units to 0 could not be the only model of Peano’s axioms for arithmetic, and thus that there existed other models, called non-standard for that reason. In 1961, Abraham Robinson showed, by a construction based on ultraproducts, the existence of a non-standard model of the field of real numbers containing “infinitely large” and “infinitely small” numbers. This was the birth of non-standard analysis (NSA). Then in 1977 Edward Nelson found a way to axiomatise NSA. To do this, he added to the language of set theory a one-place predicate symbol $st(x)$ , expressing that an object $x$ is standard, and added three axioms to the ZFC $^{ii}$ axioms of set theory: the axioms of idealisation, standardisation and transfer $^{iii}$ , which made NSA easier to manipulate. It is on this axiomatisation, called IST (Internal Set Theory), that we rely in this vignette.

These three axioms have important consequences. It follows, for example, from the axiom of transfer that two standard sets $E_1$ and $E_2$ are equal if and only if they have the same standard elements $^{iv}$ . It also follows from the same axiom that if there exists an $x$ satisfying a classical property $P$ (that is to say, a property that can be expressed without using the predicate $st$ ), then there necessarily also exists a standard $x$ satisfying it. So the objects that can be defined in a unique way by a classical formula are standard. The numbers and ordinary objects that we encounter in mathematics, such as the numbers $\pi$ and $e$ and the trigonometric and exponential functions, belong to this category and are thus all standard objects, like the empty set, the set $\mathbb{N}$ of natural numbers and the set $\mathbb{R}$ of real numbers.

However, being a standard set does not imply having only standard elements. In fact, it follows from the axiom of idealisation that there exists in $\mathbb{N}$ an integer bigger than all standard integers, and thus necessarily non-standard. More generally, using the same axiom, it is easily proved that any infinite set has non-standard elements $^v$ . $\mathbb{N}$ and $\mathbb{R}$ are thus standard sets containing both standard and non-standard elements. How can we represent them?

In $\mathbb{N}$ , all standard integers come first, preceding non-standard integers. In $\mathbb{R}$ , the situation is more complicated. One can distinguish three categories of real numbers, according to their magnitude:

The real numbers which are very small, or infinitesimal (positive or negative): those whose absolute value is less than any standard positive real number,
The real numbers which are very big, or infinitely big (positive or negative): those whose absolute value is greater than any standard positive real number,
And, between them, the real numbers of human size, so to speak, often called appreciable numbers.

If a real number is infinitesimal or appreciable, it is said to be limited. If a real number is infinitely big, its inverse is an infinitesimal, and both of them are non-standard. Appreciable real numbers may be standard or non-standard. Consider for instance the real number $1$ ; adding to it an infinitesimal $\alpha$ different from $0$ , one obtains the number $1+\alpha$ , which is also appreciable but non-standard and infinitely close to $1$ . Around each standard real number there is thus a cloud of infinitely close non-standard real numbers. To account for this situation, a new relation is introduced: $x$ is said to be infinitely close to $y$ if $x-y$ is an infinitesimal, which is written $x \simeq y$ . The collection of real numbers infinitely close to a given real number is called its halo. The infinitesimals form the halo of $0$ , and each limited real number belongs to the halo of a unique standard real number, called its standard part. The non-standard real line can be represented as in Figure 2, with a blurry frontier between the real numbers that are appreciable and those that are infinitely big, as the collection of appreciable real numbers does not have a maximum and the collection of infinitely big numbers does not have a minimum.

Figure 2: Representation of the non-standard real line

Rules of calculation that take into account the different orders of magnitude of numbers extend those of arithmetic. Writing $ip$ for the infinitesimals, $app$ for the appreciable numbers and $ig$ for the infinitely big ones, one has for instance $ip \cdot app = ip$ , $ip + app = app$ , $ip \cdot ip = ip$ and $ip + ig = ig$ , and we invite the reader of this vignette to complete the following tables where possible.

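[Tables: sums and products of the orders of magnitude $ip$ , $app$ and $ig$ , to be completed by the reader; not reproduced here]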
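
One possible way of filling in such tables can be sketched in Python. This is only an illustration: the names `Mag`, `add` and `mul` are hypothetical, and each result is given as the set of orders of magnitude that the sum or product may have, so that an undetermined case shows up as a set with several elements. The sketch follows the definitions above, where $0$ counts as an infinitesimal.

```python
from enum import Enum


class Mag(Enum):
    IP = "ip"    # infinitesimal
    APP = "app"  # appreciable
    IG = "ig"    # infinitely big


IP, APP, IG = Mag.IP, Mag.APP, Mag.IG


def add(x, y):
    """Set of possible orders of magnitude of a sum."""
    pair = {x, y}
    if IG in pair:
        # ig + ig is undetermined (w + (-w) = 0); ig + limited stays ig.
        return {IP, APP, IG} if pair == {IG} else {IG}
    if pair == {IP}:
        return {IP}
    if pair == {APP}:
        # app + app is limited but may be infinitesimal: 1 + (-1) = 0.
        return {IP, APP}
    return {APP}  # ip + app


def mul(x, y):
    """Set of possible orders of magnitude of a product."""
    pair = {x, y}
    if pair == {IP, IG}:
        # Undetermined: a * (1/a) = 1, a**2 * (1/a) = a, a * (1/a**2) = 1/a.
        return {IP, APP, IG}
    if IG in pair:
        return {IG}
    if IP in pair:
        return {IP}
    return {APP}  # app * app


for x in Mag:
    for y in Mag:
        s = sorted(m.value for m in add(x, y))
        p = sorted(m.value for m in mul(x, y))
        print(f"{x.value:>3} + {y.value:<3} -> {s}    {x.value:>3} . {y.value:<3} -> {p}")
```

The four rules quoted in the text appear as singleton sets; the cells that cannot be completed (for instance $ip \cdot ig$ ) are those where all three orders of magnitude remain possible.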

The set of real numbers still satisfies the Archimedean axiom: if $a$ and $b$ are real numbers such that $0<a<b$ , there exists an integer $n$ such that $na>b$ . Of course, if $a$ is an infinitesimal and $b$ is not, $n$ will be infinitely big.

NSA and basic notions of Analysis

Within the framework of NSA, the basic notions of mathematical analysis (limits, continuity, differentiability and integrability) have very simple formalisations for standard objects, as shown in the table below $^{vi}$ :

[Table: non-standard formalisations of limit, continuity, differentiability and integrability; not reproduced here]

These simple formalisations, free of alternating quantifiers, and the ideas underlying them allow simplified proofs of many classical theorems. As an illustration, we give below a non-standard proof of the Intermediate Value Theorem.

Intermediate Value Theorem $^{vii}$ : let $I$ be an interval of $\mathbb{R}$ , $f$ a real-valued function continuous over $I$ , and $a$ and $b$ two real numbers of $I$ with $a < b$ ; for any $y$ between $f(a)$ and $f(b)$ , there exists a real number $c$ in $[a,b]$ such that $f(c)=y$ .
Proof: without loss of generality, we can suppose that $y=0$ and $f(a)<0<f(b)$ . Let $\omega$ be an infinitely big integer and consider the subdivision of $[a,b]$ with step $(b-a)/\omega$ ( $x_0=a$ , … , $x_n=a+n(b-a)/\omega$ , … , $x_{\omega}=b$ ). Let $m$ be the smallest index such that $f(x_m)>0$ ; it exists because $f(x_{\omega})=f(b)>0$ . By hypothesis, $m \neq 0$ , and $x_m$ is a limited real number because it belongs to $[a,b]$ . Choose for $c$ its standard part. We have $f(x_{m-1}) \leq 0$ and $f(x_m)>0$ , but $x_{m-1} \simeq x_m$ , the subdivision being of infinitesimal step, and $x_m \simeq c$ ; thus, $f$ being continuous at $c$ , $f(c) \simeq f(x_{m-1}) \simeq f(x_m)$ . So $f(c)$ is infinitely close both to a number that is $\leq 0$ and to a number that is $>0$ , hence is infinitesimal. As $f(c)$ is standard, necessarily $f(c)=0$ .
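
The structure of this proof can be mimicked numerically. The sketch below is only a finite analogue: a large ordinary integer `n` stands in for the infinitely big $\omega$ , and the node $x_m$ itself plays the role of its standard part. The function name `first_positive_node` and the test function $x^2-2$ are assumptions made for the illustration.

```python
def first_positive_node(f, a, b, n):
    """Return x_m, the first node of the uniform subdivision of [a, b]
    into n steps at which f is strictly positive."""
    step = (b - a) / n
    for m in range(n + 1):
        x_m = a + m * step
        if f(x_m) > 0:
            return x_m
    raise ValueError("f is never positive on the subdivision")


def f(x):
    return x * x - 2  # f(0) < 0 < f(2), with a zero at sqrt(2)


# With a step of 1e-5, x_m lies within one step of the zero of f.
c = first_positive_node(f, 0.0, 2.0, 200_000)
print(c)
```

With a finite `n` the value found is only within one step of a zero of `f`; in the non-standard proof the step is infinitesimal, which is what makes the standard part an exact zero.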

NSA also justifies the technique of cutting into infinitesimal slices, still used beyond the mathematics community for calculating areas, volumes, moments of inertia, centres of gravity…, or for modelling situations through differential equations. The example below is a very simple illustration.

Computation of the volume of the sphere:

Let $S$ be a sphere of radius $R$ . One cuts the sphere into slices of infinitesimal thickness $dz$ . The volume $dV$ of the slice situated at altitude $z$ is approximately that of the cylinder of radius $\sqrt{R^2 - z^2}$ and height $dz$ :
$dV = \pi (R^2 - z^2)dz$
Summing these volumes gives
$V=\int_{-R}^{R} \pi(R^2 - z^2) dz$
$V=\left[ \pi (R^2 z - z^3 /3) \right]_{-R}^{R}$
and finally $V=\frac{4 \pi R^3}{3}$ .

Figure 4: The volume of the sphere

The sum of the volumes of the infinitesimal cylinders is in fact the Riemann sum attached to the infinitesimal subdivision of step $dz$ of the interval $[-R,R]$ for the area function $S(z)=\pi (R^2-z^2)$ . By the non-standard definition of the integral for continuous functions, the standard part of this sum is thus equal to the integral $\int_{-R}^R S(z) dz$ .
However, the fact that this integral does give the volume of the sphere results from the fact that the approximation proposed for $dV$ is equivalent to the volume of the slice, that is, the ratio of the two is infinitely close to $1$ ; it is not just a number infinitely close to it. This would not be the case if the same approximation by cylindrical slices were used for calculating the area of the sphere, for instance.
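
The contrast between the two computations can be observed numerically. In the sketch below, a small finite step stands in for the infinitesimal $dz$ ; the function name `slice_sums` is hypothetical, and the lateral area $2\pi\sqrt{R^2-z^2}\,dz$ of each cylinder is used as the (incorrect) approximation of the area of the corresponding band of the sphere.

```python
import math


def slice_sums(R, n):
    """Sum the volumes and the lateral areas of n cylindrical slices
    of thickness dz = 2R/n cut through a sphere of radius R."""
    dz = 2 * R / n
    volume = 0.0
    area = 0.0
    for k in range(n):
        z = -R + k * dz                           # altitude of the slice
        r = math.sqrt(max(R * R - z * z, 0.0))    # radius of the cylinder
        volume += math.pi * r * r * dz
        area += 2 * math.pi * r * dz
    return volume, area


volume, area = slice_sums(1.0, 20_000)
print(volume, 4 * math.pi / 3)   # the two values agree closely
print(area, 4 * math.pi)         # the slice sum is close to pi**2, not 4*pi
```

For $R=1$ the sum of the volumes approaches $4\pi/3$ , whereas the sum of the lateral areas approaches $\int_{-R}^{R} 2\pi\sqrt{R^2-z^2}\,dz=\pi^2R^2$ rather than the area $4\pi R^2$ of the sphere: the cylindrical approximation of the area of a band is not equivalent to that area.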

The necessity of maintaining some vigilance

However, the manipulation of non-standard objects requires some vigilance. For instance, extending the definition of continuity given above to any real number and any function, standard or not, one obtains a new notion, $S$ -continuity at a point, which does not necessarily correspond to our vision of continuous functions. For instance, the step function which takes the value $0$ over the negative real numbers and the value $\alpha$ over the positive real numbers including $0$ , with $\alpha$ an infinitesimal different from $0$ , is $S$ -continuous at $0$ , as the image of any infinitesimal is $0$ or $\alpha$ and is thus infinitely close to $\alpha$ . Conversely, the square function is not $S$ -continuous at $\omega$ if $\omega$ is infinitely big, as $(\omega+1/\omega)^2=\omega^2+1/\omega^2+2$ is not infinitely close to $\omega^2$ . In fact, for a standard function, $S$ -continuity at every point of $\mathbb{R}$ is equivalent to uniform continuity over $\mathbb{R}$ . The calculation we have just made is therefore a very simple proof of the fact that the square function, while continuous over $\mathbb{R}$ , is not uniformly continuous over $\mathbb{R}$ .
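
The calculation for the square function can be replayed with ordinary large numbers. In this sketch, exact rational arithmetic is used and a large integer stands in for the infinitely big $\omega$ ; the function name `gap` is hypothetical.

```python
from fractions import Fraction


def gap(omega):
    """Return (omega + 1/omega)**2 - omega**2, computed exactly."""
    omega = Fraction(omega)
    return (omega + 1 / omega) ** 2 - omega ** 2


for omega in (10, 10**3, 10**6):
    # The shift 1/omega shrinks, but the gap between the squares stays above 2.
    print(omega, float(1 / Fraction(omega)), float(gap(omega)))
```

However small the shift $1/\omega$ becomes, the difference of the squares equals $2+1/\omega^2$ and never falls below $2$ , which is the finite counterpart of the failure of uniform continuity.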

The use of NSA also requires some vigilance when using the induction principle. In its usual form, this principle applies to classical properties (a consequence of the axiom of transfer), but not beyond. Consider the following reasoning, often presented as a paradox:

Let $P$ be the property “to be standard”. $P(0)$ is true and, for any integer $n$ , if $P(n)$ is true, then $P(n+1)$ is also true. Thus, for every $n$ , $P(n)$ is true and all integers are standard.

The principle of induction is applied here to a non-classical statement, written with the predicate $st$ . This reasoning is not a paradox at all; it is simply invalid. In fact, it can be proved that in NSA, for a non-classical property $P$ , only the following restricted principle of induction is valid: if $P(0)$ is true and, for any integer $n$ , $P(n)$ implies $P(n+1)$ , then $P(n)$ is true for all standard integers.

The potential of NSA: a question still debated

As shown above, NSA rehabilitates infinitesimal numbers, together with the modes of computation and the intuitions they convey, but at the price of some work and vigilance. So what is really gained through NSA? The question is still discussed in the mathematics community, as shown for instance by Terence Tao’s blog, which we consulted when preparing this vignette. NSA has been fruitfully used in a variety of mathematical domains: topology, probability, dynamical systems… It has supported interesting modelling work in control theory, ecology and economics, for instance. In France, at the initiative of the mathematician Georges Reeb, a non-standard community developed from the late seventies and obtained original results in various areas (cf. Lutz & Goze, 1982; Diener & Diener, 1995). Among the best known is the discovery of specific trajectories, called “duck trajectories”, in slow-fast vector fields in two and then three dimensions. We present this example below without entering into the technical details of its non-standard treatment, using the same differential equation as in (Benoît et al., 1981). The reader who is not familiar with differential equations can skip this part.

The ‘duck’ trajectories

We consider the differential equation $cx''+(x^2-1)x'+x-a=0$ , with $a \geq 0$ and $c>0$ . A classical study of this equation shows that if $a<1$ , the equation has a unique periodic solution, which is an attractive limit cycle. This periodic solution disappears for $a=1$ , and for $a \geq 1$ there exists an attractive stationary state $x=a$ . The phenomenon of duck trajectories precedes this bifurcation in the dynamics of the equation, known as a Hopf bifurcation. It has been observed for very small values of $c$ and values of $a$ very close to $1$ . A non-standard model made it possible to identify the phenomenon and to characterise the conditions for its appearance.

First, we transform the second-order differential equation into a first-order system. This is done in two different ways: first by setting $u=cx'+F(x)$ with $F(x)=x^3/3-x$ , a particular transformation used for studying equations of this type, known as Liénard’s type; second by setting $v=x'$ , the classical way of transforming an equation into a system. This gives the two systems, in which $\omega=1/c$ is an infinitely big number:

$\left\{ \begin{array}{cl} x' &= \omega (u-F(x)), \\ u' &= a-x \end{array}\right.$

$\left\{ \begin{array}{cl} x' &= v, \\ v' &= \omega(a-x-(x^2 -1)v) \end{array}\right.$
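
The consistency of the two changes of variable can be checked mechanically. The sketch below takes the second system in the form $x'=v$ , $v'=\omega(a-x-(x^2-1)v)$ , which is what setting $v=x'$ in the equation gives; this form, the function names and the sample values are assumptions of the illustration. Exact rational arithmetic is used so that the equalities can be tested without rounding.

```python
from fractions import Fraction


def F(x):
    return x**3 / 3 - x


def lienard_field(x, u, a, c):
    """(x', u') in the Lienard plane (x, u)."""
    omega = 1 / c
    return omega * (u - F(x)), a - x


def phase_field(x, v, a, c):
    """(x', v') in the plane (x, v), with v = x'."""
    omega = 1 / c
    return v, omega * (a - x - (x * x - 1) * v)


def consistent(x, v, a, c):
    """Check that both systems describe the same motion at the point (x, v)."""
    u = c * v + F(x)                      # the Lienard variable u = c x' + F(x)
    dx_l, du_l = lienard_field(x, u, a, c)
    dx_p, dv_p = phase_field(x, v, a, c)
    # Chain rule: u' = c v' + F'(x) x', with F'(x) = x**2 - 1.
    du_chain = c * dv_p + (x * x - 1) * dx_p
    return dx_l == dx_p and du_l == du_chain


print(consistent(Fraction(3, 7), Fraction(-5, 2), Fraction(9, 10), Fraction(1, 100)))
```

The check amounts to the identity $u'=cv'+(x^2-1)x'=a-x$ , which is the original second-order equation rewritten with $u=cx'+F(x)$ .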

The associated vector fields in the respective planes $(x,u)$ and $(x,v)$ are reproduced in Figures 1 and 2, copied from (Benoît et al., 1981). They are slow-fast vector fields. For instance, the field in the plane $(x,u)$ , or Liénard plane, is nearly horizontal outside the halo of the cubic $C$ with equation $u=F(x)$ , whose increasing parts are attractive, due to the direction of the horizontal arrows, while the decreasing part is repulsive. The double arrows serve as a reminder that these nearly horizontal parts are travelled at an infinitely big speed. Combining techniques of standard and non-standard analysis, especially changes of scale using macroscopes of infinitely big ratio along one direction in the planes $(x,u)$ and $(x,v)$ , one can show that, when $a<1$ but not in the halo of $1$ , there exists a unique big cycle, which is attractive (figure 3). In the plane $(x,u)$ , a particle moving along this cycle, for instance starting from a point $M$ not situated in the halo of $C$ , reaches this halo nearly horizontally, travelling at an infinitely big speed, then goes down along $C$ , staying in its halo, with an appreciable speed $x' \simeq (a-x)/(x^2-1)$ , until it arrives in the halo of the local minimum $S-$ of $C$ . Then it again takes a nearly horizontal trajectory until it reaches the halo of $C$ once again, and follows this curve up to the halo of the local maximum $S+$ ; it then starts a nearly horizontal movement again…

When $a \simeq 1$ , the situation is more complex, because separate branches of the curve $H_a$ associated with $v'=0$ in the plane $(x,v)$ become infinitely close. As a result, in the plane $(x,u)$ , a solution which has followed $C$ until the halo of $S-$ may, for some values of $a$ , follow a portion of the repulsive part of $C$ before becoming nearly horizontal and reaching the halo of the attractive part of $C$ (figure 4).

Figure 4: Morphology of duck trajectories

Due to their shape, researchers named the associated cycles duck trajectories. They showed, for instance, that for any real number $x$ between $-2$ and $1$ , there exists a value of $a$ for which there is a duck cycle whose beak abscissa is $x$ . Without the support of an NSA model, the existence of such trajectories might have remained unnoticed. Indeed, to observe them it is not enough to have $c$ small and $a$ close to $1$ ; it has been proved that $1-a$ must be very close to $c/8$ , more precisely that the ratio $(1-a-c/8)/c$ must be an infinitesimal. Beyond its purely mathematical interest, the existence of duck trajectories for systems of differential equations in two and three dimensions has had applications in extra-mathematical fields.

It must be stressed, however, that the original results obtained by NSA researchers have often been proved subsequently by standard methods. This is not surprising in itself, as IST is a conservative extension of ZFC: any theorem of IST which has a classical expression is also a theorem of ZFC. Combined with the fact that many mathematicians feel that the use of infinitesimals is a regression to something from which mathematics had so much difficulty freeing itself, this situation often leads to the view that the NSA construction is of little utility. Those who practise NSA reject these objections. They stress the change of vision conveyed by NSA and the intuitions and modelling it makes possible, which they see as better adapted to the reality of the world than those attached to standard views. To a vision of the set of real numbers as a homogeneous entity, they oppose the non-standard vision, which puts the distinction between orders of magnitude at the core of the number system, reflecting the diversity of scales that the sciences currently consider and the fuzzy frontiers between them. They insist on the potential that, through infinitesimals, NSA offers in terms of discrete modelling for various areas of application. According to them, such potential means that NSA deserves full recognition.

In mathematics education too, diverse attempts have been made to develop non-standard approaches to the teaching of calculus. These were either based on Robinson’s construction (such as those proposed by Keisler, 1976, and Henle & Kleinberg, 1979) or inspired by Nelson’s axiomatic approach (Deledicq & Diener, 1989). However, as pointed out by Hodgson (1994) in the synthesis of these experiments that he presented at the ICME-7 Congress in Québec, none of these innovative constructions proved sustainable in the long term. The marginal status of NSA certainly contributes to this, but so does the fact that, to use non-standard analysis efficiently, speaking of infinitesimals is not enough; one has to learn to manipulate new ideas and definitions, familiarise oneself with new modes of reasoning, build new references and intuitions, and develop new means of control.

Finally, this vignette illustrates two frequent phenomena in the history of science and mathematics:

The fact that the integration of intuitive ideas into well-founded theories may become possible only decades or even centuries after these intuitions have proved fruitful, thanks to other scientific advances. This was the case for the intuitive idea of infinitesimals, thanks to the development of mathematical logic in the 20th century, but also for the intuitive idea of limit, which became the basis of standard analysis in the 19th century, as recalled at the beginning of this vignette;
The fact also that several theorisations of the same domain of reality can coexist, offering fruitful and complementary perspectives for making sense of it and working on it. This is the case for standard and non-standard analysis, two different but complementary ways of approaching the field of functions and analysis and of thinking about its relations with the real world.

We will end by pointing out that the construction presented here, that of NSA, is not the only possible one for giving a mathematical status to infinitesimals. Many different attempts have been made since the 18th century (cf. Borovik & Katz, 2012). Smooth infinitesimal analysis, developed from the ideas of F.W. Lawvere in category theory, is another recent construction: in it, an infinitesimal is a number whose square is $0$ ; the underlying logic being intuitionistic, such numbers are not all equal to $0$ , although none of them can be shown to be different from $0$ (cf. Bell, 2008).

References:

Bell, J.L. (2008). A primer of infinitesimal analysis, 2nd edition. Cambridge: Cambridge University Press.
Borovik, A. & Katz, M. (2012). Who Gave you the Cauchy-Weierstrass Tale? The Dual History of Rigorous Calculus. Foundations of Science 17, no. 3, 245-276.
Berkeley, G. (1734). The Analyst. http://www.maths.tcd.ie/pub/HistMath/People/Berkeley/Analyst/
Benoît, E., Callot, J.L., Diener, F., & Diener, M. (1981). Chasse au canard. Collectanea Mathematica, 32.1, 38-74.
http://collectanea.ub.edu/index.php/Collectanea/article/view/3537/4216
Deledicq, A., & Diener, M. (1989). Leçons de calcul infinitésimal. Collection U. Paris: Armand Colin.
Diener, M., & Diener, F. (Eds.). (1995). Non standard analysis in practice. Berlin: Springer Verlag.
Henle, J.M., & Kleinberg, E.M. (1979). Infinitesimal Calculus. Cambridge: MIT Press.
Hodgson, B. (1994). Le calcul infinitésimal. In D.F. Robitaille, D.H. Wheeler & C. Kieran (Eds.), Choix de conférence du 7e Congrès international sur l’enseignement des mathématiques (ICME-7), pp. 157-170. Québec: Presses de l’Université Laval.
Keisler, H.J. (1976). Elementary calculus: An infinitesimal approach. Boston: Prindle, Weber & Schmidt.
Lakatos, I. (1978). Cauchy and the continuum: the significance of nonstandard analysis for the history and philosophy of mathematics. Math. Intelligencer 1, no. 3, 151–161 (paper originally presented in 1966).
Lutz, R. & Goze, M. (1982). Non standard analysis: a practical guide with applications. Springer Lecture Notes in Mathematics, vol. 881. Berlin: Springer.
Marquis de l’Hôpital, G.F.A. (1696). Analyse des infiniment petits pour l’intelligence des lignes courbes. Paris: Imprimerie Royale.
Nelson, E. (1977). Internal set theory: a new approach to nonstandard analysis. Bull. Amer. Math. Soc., vol. 83, no. 6, 1165-1198.
Robinson, A. (1996). Non standard analysis. Amsterdam: North Holland.
Skolem, Th. (1934). Über die Nicht-charakterisierbarkeit der Zahlenreihe mittels endlich oder abzählbar unendlich vieler Aussagen mit ausschliesslich Zahlenvariablen. Fundam. Math. 23, 150-161.

$^i$ The scope of this calculus is huge: it is suitable for mechanical as well as geometric curves; radical signs are indifferent to it, and often even convenient; it extends to as many indeterminates as one wishes; comparing infinitely small quantities of all types is equally easy for it. From that arises an infinity of surprising discoveries regarding tangents both curved and straight, questions of maxima and minima, inflection points and cusps of curves, evolutes, caustics by reflection or refraction, etc., as explained in this book. (our translation)
$^{ii}$ ZFC stands for the Zermelo-Fraenkel axioms plus the axiom of choice (see http://en.wikipedia.org/wiki/Zermelo–Fraenkel_set_theory).
$^{iii}$ The axiom of idealisation says that for any binary relation $R$ which is classical (i.e. expressed without the use of the predicate $st$ ), the following two statements are equivalent: (i) for any finite standard set $F$ , there exists an $x$ such that for any $y$ in $F$ , $R(x, y)$ ; (ii) there is an $x$ such that, for any standard $y$ , $R(x, y)$ .
The axiom of standardisation says that for any property $P$ and any standard set $X$ , there is a standard subset $Y$ of $X$ whose standard elements are exactly the standard elements of $X$ that satisfy $P$ . This axiom applies whether $P$ is a classical property or not.
The axiom of transfer says that for any classical property $P$ , $P(x)$ is true for all $x$ if and only if $P(x)$ is true for all standard $x$ .
$^{iv}$ Apply the axiom of transfer with the property $P(x)$ : $x \in E_1 \Leftrightarrow x \in E_2$ .
$^{v}$ In the first case, the axiom of idealisation is applied to the binary relation $x<y$ in $\mathbb{N}$ ; in the second case, it is applied to the binary relation $x \neq y$ in the infinite set considered.
$^{vi}$ In this table, all objects ( $(u_n)$ , $l$ , $f$ , $a$ , $I$ ) are assumed to be standard.
$^{vii}$ In this statement too, $I$ , $f$ , $a$ and $b$ are assumed to be standard.

4 Responses to The Revenge of the Infinitesimals

Mikhail Katz says:

May 19, 2014 at 4:14 pm

Nice post. Note that they are “Riemann sums”, not “Cauchy sums”. Also, there is a phrase in French: ” et nous invitons le lecteur à compléter lorsque possible les cases des deux tableaux suivants” that seems to be out of place.

- Michèle Artigue says:
  
  May 21, 2014 at 10:31 pm
  
  Thanks Mikhail for pointing this out. The first version of the vignette was in French. It has been corrected.
  
Mikhail Katz says:

May 20, 2014 at 12:36 pm

Also, Henle’s name is misspelled (“Heinle”).

Mikhail Katz says:

June 18, 2014 at 3:43 pm

Your observation that in smooth infinitesimal analysis of Lawvere (and others) “an infinitesimal is defined as a number different from 0 but whose square is 0” is not entirely correct. These numbers cannot be shown to be different from zero. As you point out, the background logic is intuitionistic, so this does not allow one to conclude that they are zero.
