### Infinitesimals: History & Application

Joel A. Tropp

Plan II Honors Program, WCH 4.104, The University of Texas at Austin, Austin, TX 78712

Abstract. An infinitesimal is a number whose magnitude exceeds zero but somehow fails to exceed any finite, positive number. Although logically problematic, infinitesimals are extremely appealing for investigating continuous phenomena. They were used extensively by mathematicians until the late 19^{th} century, at which point they were purged because they lacked a rigorous foundation. In 1960, the logician Abraham Robinson revived them by constructing a number system, the hyperreals, which contains infinitesimals and infinitely large quantities.

This thesis introduces Nonstandard Analysis (NSA), the set of techniques which Robinson invented. It contains a rigorous development of the hyperreals and shows how they can be used to prove the fundamental theorems of real analysis in a direct, natural way. (Incredibly, a great deal of the presentation echoes the work of Leibniz, which was performed in the 17^{th} century.) NSA has also extended mathematics in directions which exceed the scope of this thesis. These investigations may eventually result in fruitful discoveries.

### Contents

Introduction: Why Infinitesimals?

Chapter 1. Historical Background

1.1. Overview

1.2. Origins

1.3. Continuity

1.4. Eudoxus and Archimedes

1.5. Apply when Necessary

1.6. Banished

1.7. Regained

1.8. The Future

Chapter 2. Rigorous Infinitesimals

2.1. Developing Nonstandard Analysis

2.2. Direct Ultrapower Construction of ^{∗}R

2.3. Principles of NSA

2.4. Working with Hyperreals

Chapter 3. Straightforward Analysis

3.1. Sequences and Their Limits

3.2. Series

3.3. Continuity

3.4. Differentiation

3.5. Riemann Integration

Conclusion

Appendix A. Nonstandard Extensions

Appendix B. Axioms of Internal Set Theory

Appendix C. About Filters

Bibliography

About the Author

To Millie, who sat in my lap every time I tried to work.

To Sarah, whose wonderfulness catches me unaware.

To Elisa, the most beautiful roommate I have ever had.

To my family, for their continuing encouragement.

And to Jerry Bona, who got me started and ensured that I finished.

Traditionally, an infinitesimal quantity is one which, while not necessarily coinciding with zero, is in some sense smaller than any finite quantity.

—J.L. Bell [2, p. 2]

Infinitesimals . . . must be regarded as unnecessary, erroneous and self-contradictory.

—Bertrand Russell [13, p. 345]

### Introduction: Why Infinitesimals?

What is the slope of the curve y = x^{2} at a given point? Any calculus
student can tell you the answer. But few of them understand why that
answer is correct or how it can be deduced from first principles. Why
not? Perhaps because classical analysis has convoluted the intuitive
procedure of calculating slopes.

One calculus book [16, Ch. 3.1] explains the standard method for solving the slope problem as follows.

Let P be a fixed point on a curve and let Q be a nearby movable point on that curve. Consider the line through P and Q, called a secant line. The tangent line at P is the limiting position (if it exists) of the secant line as Q moves toward P along the curve (see Figure 0.1).

Suppose that the curve is the graph of the equation y = f(x). Then P has coordinates (c, f(c)), a nearby point Q has coordinates (c + h, f(c + h)), and the secant line through P and Q has slope m_{sec} given by (see Figure 0.2)

$$m_{\mathrm{sec}} = \frac{f(c + h) - f(c)}{h}.$$

Consequently, the tangent line to the curve y = f(x) at the point P(c, f(c))—if not vertical—is that line through P with slope m_{tan} satisfying

$$m_{\mathrm{tan}} = \lim_{h \to 0} m_{\mathrm{sec}} = \lim_{h \to 0} \frac{f(c + h) - f(c)}{h}.$$

Figure 0.1. The tangent line is the limiting position of the secant line.

Figure 0.2. m_{tan} = lim_{h→0} m_{sec}

Ignoring any flaws in the presentation, let us concentrate on the essential idea: “The tangent line is the limiting position . . . of the secant line as Q moves toward P.” This statement raises some serious questions. What does a “limit” have to do with the slope of the tangent line? Why can’t we calculate the slope without recourse to this migratory point Q? Rigor. When calculus was formalized, mathematicians did not see a better way.

There is a more intuitive way, but it could not be presented rigorously at the end of the 19^{th} century. Leibniz used it when he developed calculus in the 17^{th} century. Recent advances in mathematical logic have made it plausible again. It is called infinitesimal calculus.

An infinitesimal is a number whose magnitude exceeds zero but somehow fails to exceed any finite, positive number; it is infinitely small. (The logical difficulties already begin to surface.) But infinitesimals are extremely appealing for investigating continuous phenomena, since a lot can happen in a finite interval. On the other hand, very little can happen to a continuously changing variable within an infinitesimal interval. This fact alone explains their potential value.

Here is how Leibniz would have solved the problem heading this
introduction. Assume the existence of an infinitesimal quantity, ε. We
are seeking the slope of the curve y = x^{2} at the point x = c. We will
approximate it by finding the slope through x = c and x = c + ε, a
point infinitely nearby (since ε is infinitesimal). To calculate slope, we
divide the change in y by the change in x. The change in y is given by
y(c + ε)− y(c) = (c + ε)^{2}− c^{2}; the change in x is (c + ε)− c = ε. So
we form the quotient and simplify:

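$$\frac{(c + \varepsilon)^{2} - c^{2}}{\varepsilon} = \frac{c^{2} + 2c\varepsilon + \varepsilon^{2} - c^{2}}{\varepsilon} = \frac{2c\varepsilon + \varepsilon^{2}}{\varepsilon} = 2c + \varepsilon.$$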

Since ε is infinitely small in comparison with 2c, we can disregard it. We see that the slope of y = x^{2} at the point c is given by 2c. This is the correct answer, obtained in a natural, algebraic way without any type of limiting procedure.
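The algebra above can be checked mechanically. The following Python sketch is not part of Leibniz's method: it replaces the infinitesimal ε with a small but finite rational increment `h`, an assumption made so that ordinary exact arithmetic applies. It suggests that the quotient for y = x^{2} differs from 2c by exactly the increment. The names `difference_quotient` and `square` are illustrative.

```python
from fractions import Fraction


def difference_quotient(f, c, h):
    """Slope of the line through (c, f(c)) and (c + h, f(c + h))."""
    return (f(c + h) - f(c)) / h


def square(x):
    return x * x


c = Fraction(3)
increments = [Fraction(1, 10**k) for k in (1, 3, 6)]
# Exact rational arithmetic: each quotient is 2c + h, so the error is h itself.
errors = [difference_quotient(square, c, h) - 2 * c for h in increments]

for h, err in zip(increments, errors):
    print(f"h = {h}: quotient - 2c = {err}")
```

With a finite increment the leftover term never vanishes; it merely shrinks with `h`. Discarding it outright is the step that requires either a limit or a genuine infinitesimal.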

We can apply the infinitesimal method to many other problems. For instance, we can calculate the rate of change (i.e. slope) of a sine curve at a given point c. We let y = sin x and proceed as before. The quotient becomes

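$$\frac{\sin(c + \varepsilon) - \sin c}{\varepsilon} = \frac{\sin c \cdot \cos \varepsilon + \sin \varepsilon \cdot \cos c - \sin c}{\varepsilon}$$

by using the rule for the sine of a sum. For any infinitesimal ε, it can be shown geometrically or algebraically that cos ε = 1 and that sin ε = ε. So we have

$$\frac{\sin c \cdot \cos \varepsilon + \sin \varepsilon \cdot \cos c - \sin c}{\varepsilon} = \frac{\sin c + \varepsilon \cos c - \sin c}{\varepsilon} = \frac{\varepsilon \cos c}{\varepsilon} = \cos c.$$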

Again, the correct answer.
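A numerical experiment gives some confidence in this result. The sketch below again substitutes a finite increment `h` for the infinitesimal ε (an assumption; floating-point numbers have no infinitesimals) and measures how far the quotient for sin x lies from cos c as `h` shrinks. The function name `sine_quotient` is illustrative.

```python
import math


def sine_quotient(c, h):
    """Difference quotient of sin at c with a finite increment h."""
    return (math.sin(c + h) - math.sin(c)) / h


c = 1.0
steps = [1e-2, 1e-4, 1e-6]
# Distance between the quotient and the claimed slope cos(c).
gaps = [abs(sine_quotient(c, h) - math.cos(c)) for h in steps]

for h, gap in zip(steps, gaps):
    print(f"h = {h:g}: |quotient - cos c| = {gap:.2e}")
```

The gap shrinks roughly in proportion to `h`, which is consistent with treating cos ε = 1 and sin ε = ε as statements that hold up to terms negligible beside ε.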

This method even provides more general results. Leibniz determined the rate of change of a product of functions like this. Let x and y be functions of another variable t. First, we need to find the infinitesimal difference between two “successive” values of the function xy, which is called its differential and denoted d(xy). Leibniz reasoned that

$$d(xy) = (x + dx)(y + dy) - xy,$$

where dx and dy represent infinitesimal increments in the values of x and y. Simplifying,

$$d(xy) = xy + x\,dy + y\,dx + dx\,dy - xy = x\,dy + y\,dx + dx\,dy.$$

Since (dx dy) is infinitesimal in comparison with the other two terms, Leibniz concluded that

$$d(xy) = x\,dy + y\,dx.$$

The rate of change in xy with respect to t is given by d(xy)/dt. Therefore, we determine that

$$\frac{d(xy)}{dt} = x\,\frac{dy}{dt} + y\,\frac{dx}{dt},$$

which is the correct relationship.
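To see how small the discarded term really is, one can repeat Leibniz's computation with finite increments. This Python sketch uses exact fractions and hypothetical sample values for x, y, dx and dy. It illustrates the bookkeeping only and is not a rigorous justification.

```python
from fractions import Fraction


def product_increment(x, y, dx, dy):
    """Exact change in the product xy when x and y grow by dx and dy."""
    return (x + dx) * (y + dy) - x * y


x, y = Fraction(5), Fraction(7)
dx, dy = Fraction(1, 1000), Fraction(3, 1000)

exact = product_increment(x, y, dx, dy)
leibniz = x * dy + y * dx      # the two terms Leibniz keeps
discarded = exact - leibniz    # the term he throws away

print(discarded == dx * dy)
print(discarded / leibniz)
```

The discarded quantity is exactly dx dy, and its size relative to the retained terms shrinks as the increments do.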

At this point, some questions present themselves. If infinitesimals are so useful, why did they die off? Is there a way to resuscitate them? And how do they fit into modern mathematics? These questions I propose to answer.

CHAPTER 1

### Historical Background

Definition 1.1. An infinitesimal is a number whose magnitude exceeds zero yet remains smaller than every finite, positive number.

1.1. Overview

Infinitesimals have enjoyed an extensive and scandalous history. Almost as soon as the Pythagoreans suggested the concept 2500 years ago, Zeno proceeded to drown it in paradox. Nevertheless, many mathematicians continued to use infinitesimals until the end of the 19^{th} century because of their intuitive appeal in understanding continuity. When the foundations of calculus were formalized by Weierstrass et al. around 1872, infinitesimals were banished from mathematics.

As the 20^{th} century began, the mathematical community officially regarded infinitesimals as numerical chimeras, but engineers and physicists continued to use them as heuristic aids in their calculations. In 1960, the logician Abraham Robinson discovered a way to develop a rigorous theory of infinitesimals. His techniques are now referred to as Nonstandard Analysis, which is a small but growing field in mathematics. Practitioners have found many intuitive, direct proofs of classical results. They have also extended mathematics in new directions, which may eventually result in fruitful discoveries.

1.2. Origins

The first deductive mathematician, Pythagoras (569?–500? b.c.), taught that all is Number. E.T. Bell describes his fervor:

He . . . preached like an inspired prophet that all nature, the entire universe in fact, physical, metaphysical, mental, moral, mathematical—everything—is built on the discrete pattern of the integers, 1, 2, 3, . . . [1, p. 21].

Unfortunately, this grand philosophy collapsed when one of his students discovered that the length of the diagonal of a square cannot be written as the ratio of two whole numbers.

The argument was simple. If a square has sides of unit length, then its diagonal has a length of √2, according to the theorem which bears Pythagoras’ name. Assume then that √2 = p/q, where p and q are integers which do not share a factor greater than one. This is a reasonable assumption, since any common factor could be canceled immediately from the equation. An equivalent form of this equation is

p^{2} = 2q^{2}.

We know immediately that p cannot be odd, since the square of an odd number is odd, whereas p^{2} = 2q^{2} is even. We must accept the alternative that p is even, so we write p = 2r for some whole number r. In this case, 4r^{2} = 2q^{2}, or 2r^{2} = q^{2}. So we see that q is also even. But we assumed that p and q have no common factors, which yields a contradiction. Therefore, we reject our assumption and conclude that √2 cannot be written as a ratio of integers; it is an irrational number [1, p. 21].
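The proof settles the matter for every denominator at once. By contrast, the Python sketch below merely searches a finite range for integers satisfying p^{2} = 2q^{2}; it is evidence rather than proof. The bound of 10,000 and the function name are arbitrary choices made for illustration.

```python
import math


def rational_square_roots_of_two(max_q):
    """Return all (p, q) with 1 <= q <= max_q and p*p == 2*q*q."""
    hits = []
    for q in range(1, max_q + 1):
        p = math.isqrt(2 * q * q)  # largest integer with p*p <= 2*q*q
        if p * p == 2 * q * q:
            hits.append((p, q))
    return hits


print(rational_square_roots_of_two(10_000))  # prints []
```

No search of this kind could ever replace the parity argument, since it says nothing about denominators beyond the bound.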

According to some stories, this proof upset Pythagoras so much that he hanged its precocious young author. Equally apocryphal reports indicate that the student perished in a shipwreck. These tales should demonstrate how badly this concept unsettled the Greeks [3, p. 20].

Of course, the Pythagoreans could not undiscover the proof. They had to decide how to cope with these inconvenient, non-rational numbers. The solution they proposed was a crazy concept called a monad. To explain the genesis of this idea, Carl Boyer presents the question:

If there is no finite line segment so small that the diagonal and the side may both be expressed in terms of it, may there not be a monad or unit of such a nature that an indefinite number of them will be required for the diagonal and for the side of the square [3, p. 21]?

The details were sketchy, but the concept had a certain appeal, since it enabled the Pythagoreans to construct the rational and irrational numbers from a single unit. The monad was the first infinitesimal.

Zeno of Elea (495–435 b.c.) was widely renowned for his ability to topple the most well-laid arguments. The monad was an easy target. He presented the obvious objections: if the monad had any length, then an infinite number should have infinite length, whereas if the monad had no length, no number would have any length. He is also credited with the following slander against infinitesimals:

That which, being added to another does not make it greater, and being taken away from another does not make it less, is nothing [3, p. 23].

The Greeks were unable to measure the validity of Zeno’s arguments. In truth, ancient uncertainty about infinitesimals stemmed from a greater confusion about the nature of a continuum, a closely related question which still engages debate [1, pp. 22–24].

1.3. Continuity

Zeno propounded four famous paradoxes which demonstrate the subtleties of continuity. Here are the two most effective.

The Achilles. Achilles running to overtake a crawling tortoise ahead of him can never overtake it, because he must first reach the place from which the tortoise started; when Achilles reaches that place, the tortoise has departed and so is still ahead. Repeating the argument, we see that the tortoise will always be ahead.

The Arrow. A moving arrow at any instant is either at rest or not at rest, that is, moving. If the instant is indivisible, the arrow cannot move, for if it did the instant would immediately be divided. But time is made up of instants. As the arrow cannot move in any one instant, it cannot move in any time. Hence it always remains at rest.

The Achilles argues that the line cannot support infinite division. In this case, the continuum must be composed of finite atomic units. Meanwhile, the Arrow suggests the opposite position that the line must be infinitely divisible. On this second view, the continuum cannot be seen as a set of discrete points; perhaps infinitesimal monads result from the indefinite subdivision.

Taken together, Zeno’s arguments make the problem look insoluble; either way you slice it, the continuum seems to contradict itself [1, p. 24]. Modern mathematical analysis, which did not get formalized until about 1872, is necessary to resolve these paradoxes [3, pp. 24–25].

Yet, some mathematicians, notably L.E.J. Brouwer (1881–1966) and Errett Bishop (1928–1983), have challenged the premises underlying modern analysis. Brouwer, the founder of Intuitionism, regarded mathematics “as the formulation of mental constructions that are governed by self-evident laws” [4]. One corollary is that mathematics must develop from and correspond with physical insights.

Now, an intuitive definition of a continuum is “the domain over which a continuously varying magnitude actually varies” [2, p. 1]. The phrase “continuously varying” presumably means that no jumps or breaks occur. As a consequence, it seems as if a third point must lie between any two points of a continuum. From this premise, Brouwer concluded that a continuum can “never be thought of as a mere collection of units [i.e. points]” [2, p. 2]. Brouwer might have imagined that the discrete points of a continuum cohere due to some sort of infinitesimal “glue.”

Some philosophers would extend Brouwer’s argument even farther. The logician Charles S. Peirce (1839–1914) wrote that

[the] continuum does not consist of indivisibles, or points, or instants, and does not contain any except insofar as its continuity is ruptured [2, p. 4].

Peirce bases his complaint on the fact that it is impossible to single
out a point from a continuum, since none of the points are distinct.^{1}
On this view, a line is entirely composed of a series of indistinguishable
overlapping infinitesimal units which flow from one into the next [2,
Introduction].

Intuitionist notions of the continuum resurface in modern theories of infinitesimals.

1.4. Eudoxus and Archimedes

In ancient Greece, there were some attempts to skirt the logical difficulties of infinitesimals. Eudoxus (408–355 b.c.) recognized that he need not assume the existence of an infinitely small monad; it was sufficient to attain a magnitude as small as desired by repeated subdivision of a given unit. Eudoxus employed this concept in his method of exhaustion which is used to calculate areas and volumes by filling the entire figure with an increasingly large number of tiny partitions [1, pp. 26–27].

^{1} More precisely, all points of a continuum are topologically identical, although some have algebraic properties. For instance, a small neighborhood of zero is indistinguishable from a small neighborhood about another point, even though zero is the unique additive identity of the field R.

As an example, the Greeks knew that the area of a circle is given by A = (1/2)rC, where r is the radius and C is the circumference.^{2} They probably developed this formula by imagining that the circle was composed of a large number of isosceles triangles (see Figure 1.1). It is important to recognize that the method of exhaustion is strictly geometrical, not arithmetical. Furthermore, the Greeks did not compute the limit of a sequence of polygons, as a modern geometer would. Rather, they used an indirect reductio ad absurdum technique which showed that any result other than A = (1/2)rC would lead to a contradiction if the number of triangles were increased sufficiently [7, p. 4].
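A modern numerical illustration of the triangle picture may help, with the caveat that it is arithmetical and therefore not how the Greeks argued. The Python sketch below assumes a regular polygon inscribed in the circle, cut into isosceles triangles with their apex at the center, and reports how far the polygon's area falls short of A = (1/2)rC. The function name and the sample values of n are illustrative.

```python
import math


def inscribed_polygon_area(r, n):
    """Area of a regular n-gon inscribed in a circle of radius r,
    computed as n isosceles triangles with apex at the center."""
    base = 2 * r * math.sin(math.pi / n)   # one side of the polygon
    height = r * math.cos(math.pi / n)     # distance from center to that side
    return n * (0.5 * base * height)


r = 1.0
target = 0.5 * r * (2 * math.pi * r)       # A = (1/2) r C with C = 2*pi*r
shortfalls = [target - inscribed_polygon_area(r, n) for n in (6, 96, 10_000)]
print(shortfalls)
```

Each shortfall is positive and shrinks as the number of triangles grows, which matches the idea that any proposed area other than (1/2)rC is eventually contradicted by a fine enough polygon.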

Figure 1.1. Dividing a circle into isosceles triangles to approximate its area.

^{2} The more familiar formula A = πr^{2} results from the fact that π is defined by the relation C = 2πr.

Archimedes (287–212 b.c.), the greatest mathematician of antiquity, used another procedure to determine areas and volumes. To measure an unknown figure, he imagined that it was balanced on a lever against a known figure. To find the area or volume of the former in terms of the latter, he determined where the fulcrum must be placed to keep the lever even. In performing his calculations, he imagined that the figures were comprised of an indefinite number of laminae, that is, very thin strips or plates. It is unclear whether Archimedes actually regarded the laminae as having infinitesimal width or breadth. In any case, his results certainly attest to the power of his method: he discovered mensuration formulae for an entire menagerie of geometrical beasts, many of which are devilish to find, even with modern techniques. Archimedes recognized that his method did not prove his results. Once he had applied the mechanical technique to obtain a preliminary guess, he supplemented it with a rigorous proof by exhaustion [3, pp. 50–51].

1.5. Apply when Necessary

All the fuss about the validity of infinitesimals did not prevent
mathematicians from working with them throughout antiquity, the
Middle Ages, the Renaissance and the Enlightenment. Although some
people regarded them as logically problematic, infinitesimals were an
effective tool for researching continuous phenomena. They crept into
studies of slopes and areas, which eventually grew into the differential
and integral calculi. In fact, Newton and Leibniz, who independently
discovered the Fundamental Theorem of Calculus near the end of the
17^{th} century, were among the most inspired users of infinitesimals [3].

Sir Isaac Newton (1642–1727) is widely regarded as the greatest genius ever produced by the human race. His curriculum vitae easily supports this claim; his discoveries range from the law of universal gravitation to the method of fluxions (i.e. calculus), which was developed using infinitely small quantities [1, Ch. 6].

Newton began by considering a variable which changes continuously with time, which he called a fluent. Each fluent x has an associated rate of change or “generation,” called its fluxion and written ẋ. Moreover, any fluent x may be viewed as the fluxion of another fluent, denoted x^{|}. In modern terminology, ẋ is the derivative of x, and x^{|} is the indefinite integral of x.^{3} The problem which interested Newton was, given a fluent, to find its derivative and indefinite integral with respect to time.

Newton’s original approach involved the use of an infinitesimal quantity o, an infinitely small increment of time. Newton recognized that the concept of an infinitesimal was troublesome, so he began to focus his attention on the ratio of two infinitesimals, which is often finite. Given this ratio, it is easy enough to find two finite quantities with an identical quotient. This realization led Newton to view a fluxion as the “ultimate ratio” of finite quantities, rather than a quotient of infinitesimals. Eventually, he disinherited infinitesimals: “I have sought to demonstrate that in the method of fluxions, it is not necessary to introduce into geometry infinitely small figures.” Yet in complicated calculations, o sometimes resurfaced [3, Ch. V].

^{3} Newton’s disused notation seems like madness, but there is method to it. The fluxion ẋ is a “pricked letter,” indicating the rate of change at a point. The inverse fluent x^{|} suggests the fact that it is calculated by summing thin rectangular strips (see Figure 1.3).

The use of infinitesimals is even more evident in the work of Gottfried Wilhelm Leibniz (1646–1716). He founded his development of calculus on the concept of a differential, an infinitely small increment in the value of a continuously changing variable. To calculate the rate of change of y = f(x) with respect to the rate of change of x, Leibniz formed the quotient of their differentials, dy/dx, in analogy to the formula for computing a slope, ∆y/∆x (see Figure 1.2). To find the area under the curve f(x), he imagined summing an indefinite number of rectangles with height f(x) and infinitesimal width dx (see Figure 1.3). He expressed this sum with an elongated s, writing ∫ f(x) dx. Leibniz’s notation remains in use today, since it clearly expresses the essential ideas involved in calculating slopes and areas [3, Ch. V].

Figure 1.2. Using differentials to calculate the rate of change of a function. The slope of the curve at the point c is the ratio dy/dx.

Figure 1.3. Using differentials to calculate the area under a curve. The total area is the sum of the small rectangles whose areas are given by the products f(x) dx.

Although Leibniz began working with finite differences, his success with infinitesimal methods eventually converted him, despite ongoing doubts about their logical basis. When asked for justification, he tended to hedge: an infinitesimal was merely a quantity which may be taken “as small as one wishes” [3, Ch. V]. Elsewhere he wrote that it is safe to calculate with infinitesimals, since “the whole matter can be always referred back to assignable quantities” [7, p. 6]. Leibniz did not explain how one may alternate between “assignable” and “inassignable” quantities, a serious omission. But it serves to emphasize the confusion and ambivalence with which Leibniz regarded infinitesimals [3, Ch. V].

As a final example of infinitesimals in history, consider Leonhard Euler (1707–1783), the world’s most prolific mathematician. He unabashedly used the infinitely large and the infinitely small to prove many striking results, including the beautiful relation known as Euler’s Equation:

$$e^{i\theta} = \cos\theta + i\sin\theta,$$

where i = √−1. From a modern perspective, his derivations are bizarre. For instance, he claims that if N is infinitely large, then the quotient (N − 1)/N = 1. This formula may seem awkward, yet Euler used it to obtain correct results [7, pp. 8–9].

1.6. Banished

As the 19^{th} century dawned, there was a strong tension between the logical inconsistencies of infinitesimals and the fact that they often yielded the right answer. Objectors essentially reiterated Zeno’s complaints, while proponents offered metaphysical speculations. As the century progressed, a nascent trend toward formalism accelerated. Analysts began to prove all theorems rigorously, with each step requiring justification. Infinitesimals could not pass muster.

The first casualty was Leibniz’s view of the derivative as the quotient of differentials. Bernhard Bolzano (1781–1848) realized that the derivative is a single quantity, rather than a ratio. He defined the derivative of a continuous function f(x) at a point c as the number f′(c) which the quotient

$$\frac{f(c + h) - f(c)}{h}$$

approaches with arbitrary precision as h becomes small. Limits are evident in Bolzano’s work, although he did not define them explicitly.

Augustin-Louis Cauchy (1789–1857) took the next step by developing an arithmetic formulation of the limit concept which did not appeal to geometry. Interestingly, he used this notion to define an infinitesimal as any sequence of numbers which has zero as its limit. His theory lacked precision, which prevented it from gaining acceptance.

Cauchy also defined the integral in terms of limits; he imagined it as the ultimate sum of the rectangles beneath a curve as the rectangles become smaller and smaller [3, Ch. VII]. Bernhard Riemann (1826–1866) polished this definition to its current form, which avoids all infinitesimal considerations [16, Ch. 5], [12, Ch. 6].

In 1872, the limit finally received a complete, formal treatment from Karl Weierstrass (1815–1897). The idea is that a function f(x) will take on values arbitrarily close to its limit at the point c whenever its argument x is sufficiently close to c.^{4} This definition rendered infinitesimals unnecessary [3, p. 287].
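Weierstrass's idea can be probed numerically, although sampling finitely many points can only fail to find a counterexample; it cannot prove that a given δ works. The Python sketch below takes f(x) = x^{2} at c = 3 with limit 9 and uses the textbook choice δ = min(1, ε/(2|c| + 1)). The example function, the choice of δ, and the helper name `delta_works` are all assumptions introduced here for illustration.

```python
def delta_works(f, c, limit, eps, delta, samples=10_000):
    """Spot-check that 0 < |x - c| < delta implies |f(x) - limit| < eps
    at evenly spaced sample points. Sampling cannot prove the implication."""
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)   # strictly between 0 and delta
        for x in (c - offset, c + offset):
            if abs(f(x) - limit) >= eps:
                return False
    return True


def square(x):
    return x * x


c, limit = 3.0, 9.0
results = {}
for eps in (1.0, 1e-2, 1e-4):
    delta = min(1.0, eps / (2 * abs(c) + 1))   # textbook choice for x**2
    results[eps] = delta_works(square, c, limit, eps, delta)
print(results)
```

Every tolerance ε, however small, is met by some finite δ. Nothing infinitely small is ever invoked, which is the sense in which the definition made infinitesimals unnecessary.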

^{4} More formally, L is the limit of f(x) as x approaches c if and only if the following statement holds. For any ε > 0, there must exist a δ > 0 for which

0 < |c − x| < δ implies that |L − f(x)| < ε.

The killing blow also fell in 1872. Richard Dedekind (1831–1916) and Georg Cantor (1845–1918) both published constructions of the real numbers. Before their work, it was not clear that the real numbers actually existed. Dedekind and Cantor were the first to exhibit sets which satisfied all the properties desired of the reals.^{5} These models left no space for infinitesimals, which were quickly forgotten by mathematicians [3, Ch. VII].

1.7. Regained

In comparison with mathematicians, engineers and physicists are typically less concerned with rigor and more concerned with results. Since their studies revolve around dynamical systems and continuous phenomena, they continued to regard infinitesimals as useful heuristic aids in their calculations. A little care ensured correct answers, so they had few qualms about infinitely small quantities. Meanwhile, the formalists, led by David Hilbert (1862–1943), reigned over mathematics. No theorem was valid without a rigorous, deductive proof. Infinitesimals were scorned since they lacked sound definition.

In autumn 1960, a revolutionary, new idea was put forward by Abraham Robinson (1918–1974). He realized that recent advances in symbolic logic could lead to a new model of mathematical analysis. Using these concepts, Robinson introduced an extension of the real numbers, which he called the hyperreals. The hyperreals, denoted ^{∗}R, contain all the real numbers and obey the familiar laws of arithmetic. But ^{∗}R also contains infinitely small and infinitely large numbers.

With the hyperreals, it became possible to prove the basic theorems of calculus in an intuitive and direct manner, just as Leibniz had done in the 17^{th} century. A great advantage of Robinson’s system is that many properties of R still hold for ^{∗}R and that classical methods of proof apply with little revision [6, pp. 281–287]. Robinson’s landmark book, Non-standard Analysis, was published in 1966. Finally, the mysterious infinitesimals were placed on a firm foundation [7, pp. 10–11].

^{5} Never mind the fact that their constructions were ultimately based on the natural numbers, which did not receive a satisfactory definition until Frege’s 1884 book Grundlagen der Arithmetik [14].

In the 1970s, a second model of infinitesimal analysis appeared, based on considerations in category theory, another branch of mathematical logic. This method develops the nil-square infinitesimal, a quantity ε which is not necessarily equal to zero, yet has the property that ε^{2} = 0. Like hyperreals, nil-square infinitesimals may be used to develop calculus in a natural way. But this system of analysis possesses serious drawbacks. It is no longer possible to assert that either x = y or x ≠ y. Points are “fuzzy”; sometimes x and y are indistinguishable even though they are not identical. This is Peirce’s continuum: a series of overlapping infinitesimal segments [2, Introduction]. Although intuitionists believe that this type of model is the proper way to view a continuum, many standard mathematical tools can no longer be used.^{6} For this reason, the category-theoretical approach to infinitesimals is unlikely to gain wide acceptance.
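The arithmetic of a nil-square quantity can be imitated with dual numbers, expressions a + bε subject to the single rule ε^{2} = 0. The Python sketch below is only an algebraic analogy and not the category-theoretic model itself: in this toy class one can always decide whether two values are equal, which is exactly what the intuitionistic theory gives up. The class name `Dual` and its fields are introduced here for illustration.

```python
class Dual:
    """Number of the form a + b*eps, where eps*eps is defined to be 0."""

    def __init__(self, real, infinitesimal=0):
        self.real = real
        self.infinitesimal = infinitesimal

    @staticmethod
    def _lift(value):
        # Treat an ordinary number a as a + 0*eps.
        return value if isinstance(value, Dual) else Dual(value)

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.real + other.real,
                    self.infinitesimal + other.infinitesimal)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual._lift(other)
        # (a + b eps)(c + d eps) = ac + (ad + bc) eps, because eps*eps = 0
        return Dual(self.real * other.real,
                    self.real * other.infinitesimal
                    + self.infinitesimal * other.real)

    __rmul__ = __mul__

    def __eq__(self, other):
        other = Dual._lift(other)
        return ((self.real, self.infinitesimal)
                == (other.real, other.infinitesimal))

    def __repr__(self):
        return f"Dual({self.real}, {self.infinitesimal})"


eps = Dual(0, 1)
c = 3
increment = (c + eps) * (c + eps) + Dual(-c * c)   # (c + eps)^2 - c^2
print(eps * eps)      # Dual(0, 0)
print(increment)      # Dual(0, 6), i.e. 2c * eps
```

Here the increment of x^{2} is exactly 2cε, with no ε^{2} term left to discard, so the slope 2c can be read off as the coefficient of ε.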

1.8. The Future

The hyperreals satisfy a rule called the transfer principle:

Any appropriately formulated statement is true of ^{∗}R if and only if it is true of R.

As a result, any proof using nonstandard methods may be recast in terms of standard methods. Critics argue, therefore, that Nonstandard Analysis (NSA) is a trifle. Proponents, on the other hand, claim that infinitesimals and infinitely large numbers facilitate proofs and permit a more intuitive development of theorems [7, p. 11].

^{6} The specific casualties are the Law of Excluded Middle and the Axiom of Choice. This fact prevents proof by contradiction and destroys many important results, including Tychonoff’s Theorem and the Hahn-Banach Extension Theorem.

New mathematical objects have been constructed with NSA, and it has been very effective in attacking certain types of problems. A primary advantage is that it provides a more natural view of standard mathematics. For example, the space of distributions, D′(R), may be viewed as a set of nonstandard functions.^{7} A second benefit is that NSA allows mathematicians to apply discrete methods to continuous problems. Brownian motion, for instance, is essentially a random walk with infinitesimal steps. Finally, NSA shrinks the infinite to a manageable size. Infinite combinatorial problems may be solved with techniques from finite combinatorics [10, Preface].

So, infinitesimals are back, and they can no longer be dismissed as logically unsound. At this point, it is still difficult to project their future. Nonstandard Analysis, the dominant area of research using infinitesimal methods, is not yet a part of mainstream mathematics. But its intuitive appeal has gained it some formidable allies. Kurt Gödel (1906–1978), one of the most important mathematicians of the 20^{th} century, made this prediction: “There are good reasons to believe that nonstandard analysis, in some version or other, will be the analysis of the future” [7, p. v].

^{7} Incredibly, D′(R) may even be viewed as a set of infinitely differentiable nonstandard functions.

CHAPTER 2

### Rigorous Infinitesimals

There are now several formal theories of infinitesimals, the most common of which is Robinson’s Nonstandard Analysis (NSA). I believe that NSA provides the most satisfying view of infinitesimals. Furthermore, its toolbox is easy to use. Advanced applications require some practice, but the fundamentals quickly become arithmetic.

2.1. Developing Nonstandard Analysis

Different authors present NSA in radically different ways. Although the three major versions are essentially equivalent, they have distinct advantages and disadvantages.

2.1.1. A Nonstandard Extension of R. Robinson originally constructed a proper nonstandard extension of the real numbers, which he called the set of hyperreals, ^{∗}R [6, pp. 281–287]. One approach to NSA begins by defining the nonstandard extension ^{∗}X of a general set X. This extension consists of a non-unique mapping ∗ from the subsets of X to the subsets of ^{∗}X which preserves many set-theoretic properties (see Appendix A). Define the power set of X to be the collection of all its subsets, i.e. P(X) = {A : A ⊆ X}. Then, ∗ : P(X) → P(^{∗}X). It can be shown that any nonempty set has a proper nonstandard extension, i.e. X ⊊ ^{∗}X. The extension of R to ^{∗}R is just one example. Since R is already complete, it follows that ^{∗}R must contain infinitely small and infinitely large numbers. Infinitesimals are born [8].

I find this definition very unsatisfying, since it yields no information about what a hyperreal is. Before doing anything, it is also necessary to prove a spate of technical lemmata. The primary advantage of this method is that the extension can be applied to any set-theoretic object to obtain a corresponding nonstandard object.^{1} A minor benefit is that this system is not tied to a specific nonstandard construction, e.g. ^{∗}R. It specifies instead the properties which the nonstandard object should preserve. An unfortunate corollary is that the presentation is extremely abstract [8].

2.1.2. Nelson’s Axioms. Nonstandard extensions are involved (at best). Ed Nelson has made NSA friendlier by axiomatizing it. The rules are given a priori (see Appendix B), so there is no need for complicated constructions. Nelson’s approach is called Internal Set Theory (IST). It has been shown that IST is consistent with standard set theory,^{2} which is to say that it does not create any (new) mathematical contradictions [11].

Several details make IST awkward to use. To eliminate ^{∗}R from the picture, IST adds heretofore unknown elements to the reals. In fact, every infinite set of real numbers contains these nonstandard members. But IST provides no intuition about the nature of these new elements. How big are they? How many are there? How do they relate to the standard elements? Alain Robert answers, “These nonstandard integers have a certain charm that prevents us from really grasping them!” [11]. I see no charm.^{3} Another major complaint is that IST intermingles the properties of R and ^{∗}R, which serves to limit comprehension of both. It seems more transparent to regard the reals and the hyperreals as distinct systems.

^{1} This version of NSA strictly follows the Zermelo-Fraenkel axiomatics in regarding every mathematical object as a set. For example, an ordered pair (a, b) is written as {a, {a, b}}, and a function f is identified with its graph, f = {(x, f(x)) : x ∈ Dom f}. In my opinion, it is unnecessarily complicated to expand every object to its primitive form.

^{2} Standard set theory presumes the Zermelo-Fraenkel axioms and the Axiom of Choice.

2.2. Direct Ultrapower Construction of ^{∗}R

In my opinion, a direct construction of the hyperreals provides the most lucid approach to NSA. Although it is not as general as a nonstandard extension, it repays the loss with rich intuition about the hyperreals. Arithmetic develops quickly, and it is based largely on simple algebra and analysis.

Since the construction of the hyperreals from the reals is analogous to Cantor’s construction of the real numbers from the rationals, we begin with Cantor. I follow Goldblatt throughout this portion of the development [7].

2.2.1. Cantor’s Construction of R. Until the end of the 1800s, the rationals were the only “real” numbers in the sense that R was purely hypothetical. Mathematicians recognized that R should be an ordered field with the least-upper-bound property, but no one had demonstrated the existence of such an object. In 1872, both Richard Dedekind and Georg Cantor published solutions to this problem [3, Ch. VII]. Here is Cantor’s approach.

Since the rationals are well-defined, they are the logical starting point. The basic idea is to identify each real number r with those sequences of rationals which want to converge to r.

^{3} In Nelson’s defense, it must be said that the reason the nonstandard numbers are so slippery is that all sets under IST are internal sets (see Section 2.3.2), which are fundamental to NSA. Only the standard elements of an internal set are arbitrary, and these dictate the nonstandard elements.

Definition 2.1 (Sequence). A sequence is a function defined on the set of positive integers. It is denoted by

a = {a_{j}}_{j=1}^{∞} = {a_{j}}.

We will indicate the entire sequence by a boldface letter or by a single term enclosed in braces, with or without limits. The terms are written with a subscript index, and they are usually denoted by the same letter as the sequence.

Definition 2.2 (Cauchy Sequence). A sequence {r_{j}}_{j=1}^{∞} = {r_{j}} is Cauchy if it converges within itself. That is, lim_{j,k→∞} |r_{j} − r_{k}| = 0.

Consider the set of Cauchy sequences of rational numbers, and denote it by S. Let r = {r_{j}} and s = {s_{j}} be elements of S. Define addition and multiplication termwise:

r ⊕ s = {r_{j} + s_{j}}, and
r ⊙ s = {r_{j} · s_{j}}.

It is easy to check that these operations preserve the Cauchy property.

Furthermore, ⊕ and ⊙ are commutative and associative, and ⊙ distributes over ⊕. Hence, (S, ⊕, ⊙) is a commutative ring which has zero 0 = {0, 0, 0, . . .} and unity 1 = {1, 1, 1, . . .}.

Next, we will say that r, s ∈ S are equivalent to each other if and only if they share the same limit. More precisely,

r ≡ s if and only if lim_{j→∞} |r_{j} − s_{j}| = 0.

It is straightforward to check that ≡ is an equivalence relation by using the triangle inequality, and we denote its equivalence classes by [·].

Moreover, ≡ is a congruence on the ring S, which means r ≡ r′ and s ≡ s′ imply that r ⊕ s ≡ r′ ⊕ s′ and r ⊙ s ≡ r′ ⊙ s′.

Now, let R be the quotient ring given by S modulo the equivalence:

R = {[r] : r ∈ S}.

Define arithmetic operations in the obvious way, viz.

[r] + [s] = [r ⊕ s] = [{r_{j} + s_{j}}], and
[r] · [s] = [r ⊙ s] = [{r_{j} · s_{j}}].

The fact that ≡ is a congruence on S shows that these operations are independent of particular equivalence class members; they are well-defined.

Finally, define an ordering: [r] < [s] if and only if there exists a rational ε > 0 and an integer J ∈ N such that r_{j} + ε < s_{j} for each j > J.^{4} We must check the well-definition of this relation. Let [r] < [s], which dictates constants ε and J. Choose r′ ≡ r and s′ ≡ s. There exists an N > J such that j > N implies |r_{j} − r′_{j}| < ε/4 and |s′_{j} − s_{j}| < ε/4.

Then,

|r_{j} − r′_{j}| + |s′_{j} − s_{j}| < ε/2,

which shows that

|(r_{j} − s_{j}) + (s′_{j} − r′_{j})| < ε/2, or
−ε/2 < (r_{j} − s_{j}) + (s′_{j} − r′_{j}) < ε/2, which gives
(s_{j} − r_{j}) − ε/2 < (s′_{j} − r′_{j})

for any j > N. Since [r] < [s] and N > J, ε < (s_{j} − r_{j}) for all j > N. Then,

0 < ε − ε/2 < (s′_{j} − r′_{j}), or
r′_{j} + ε/2 < s′_{j}

for each j > N, which demonstrates that [r′] < [s′] by our definition.

It can be shown that (R, +, ·, <) is a complete, ordered field. Since all complete, ordered fields are isomorphic, we may as well identify this object as the set of real numbers. Notice that the rational numbers Q are embedded in R via the mapping q ↦ [{q, q, q, . . .}]. At this point, the construction becomes incidental. We hide the details by labeling the equivalence classes with more meaningful symbols, such as 2 or √2 or π.

^{4} The sequences r and s do not necessarily converge to rational numbers, which means that we cannot do arithmetic with their limits. In the current context, the more obvious definition “[r] < [s] iff lim_{j→∞} r_{j} < lim_{j→∞} s_{j}” is meaningless.
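The construction can be made concrete with a short computation. The following Python sketch is an illustration only, not part of Cantor’s argument. It builds a finite prefix of a rational Cauchy sequence which wants to converge to √2, and it checks that the termwise square approaches the constant sequence {2, 2, 2, . . .}. The Newton iteration and the names sqrt2_sequence and termwise_product are assumptions made for this example. A finite prefix can only suggest the equivalence r ⊙ r ≡ {2, 2, 2, . . .}; it cannot prove it.

```python
from fractions import Fraction


def sqrt2_sequence(n_terms):
    """First n_terms of a rational Cauchy sequence approaching sqrt(2).

    Illustrative choice: the Newton iteration x -> (x + 2/x) / 2 started
    at 1, which never leaves the rationals.
    """
    terms = []
    x = Fraction(1)
    for _ in range(n_terms):
        terms.append(x)
        x = (x + 2 / x) / 2
    return terms


def termwise_product(r, s):
    """Finite prefix of the termwise product of two sequences."""
    return [a * b for a, b in zip(r, s)]


r = sqrt2_sequence(6)
two = [Fraction(2)] * 6            # prefix of the constant sequence {2, 2, 2, ...}
square = termwise_product(r, r)    # prefix of the termwise square of r

# The gaps |r_j * r_j - 2| shrink toward zero along the prefix.
gaps = [abs(a - b) for a, b in zip(square, two)]
for j, gap in enumerate(gaps, start=1):
    print(j, float(gap))
```

Exact rational arithmetic is used so that every term really is an element of Q, as the construction requires.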

2.2.2. Cauchy’s Infinitesimals. The question at hand is how to define infinitesimals in a consistent manner so that we may calculate with them. Cauchy’s arithmetic definition of an infinitesimal provides a good starting point.

Cauchy suggested that any sequence which converges to zero may
be regarded as infinitesimal.^{5} In analogy, we may also regard divergent
sequences as infinitely large numbers. This concept suggests that rates
of convergence and divergence may be used to measure the magnitude
of a sequence.

Unfortunately, when we try to implement this notion, problems appear quickly. We might say that

{2, 4, 6, 8, . . .} is greater than {1, 2, 3, 4, . . .}

since it diverges faster. But how does

{1, 2, 3, 4, . . .} compare with {2, 3, 4, 5, . . .}?

They diverge at exactly the same rate, yet the second seems like it should be a little greater. What about sequences like

{−1, 2, −3, 4, −5, 6, . . .}?

How do we even determine its rate of divergence?

Clearly, a more stringent criterion is necessary. To say that two sequences are equivalent, we will require that they be “almost identical.”

^{5} Given such an infinitesimal, ε = {ε_{j}}, Cauchy also defined η = {η_{j}} to be an infinitesimal of order n with respect to ε if η_{j} ∈ O(ε_{j}^{n}) and ε_{j}^{n} ∈ O(η_{j}) as j → ∞ [3, Ch. VII].

2.2.3. The Ring of Real-Valued Sequences. We must formalize these ideas. As in Cantor’s construction, we will be working with sequences. This time, the elements will be real numbers with no convergence condition specified.

Let r = {r_{j}} and s = {s_{j}} be elements of R^{N}, the set of real-valued sequences. First, define

r ⊕ s = {r_{j} + s_{j}}, and
r ⊙ s = {r_{j} · s_{j}}.

(R^{N}, ⊕, ⊙) is another commutative ring^{6} with zero 0 = {0, 0, 0, . . .} and unity 1 = {1, 1, 1, . . .}.
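The termwise operations are easy to experiment with. The Python sketch below is an illustration under stated assumptions: sequences are modeled as functions of the index j ≥ 1, and the helper names oplus, odot and prefix are invented here. It shows two nonzero sequences whose termwise product is the zero sequence, which is why R^{N} is a ring but not a field.

```python
def o(j):
    """Term j of the sequence {1, 0, 1, 0, ...}, indexed from j = 1."""
    return 1 if j % 2 == 1 else 0


def e(j):
    """Term j of the sequence {0, 1, 0, 1, ...}, indexed from j = 1."""
    return 0 if j % 2 == 1 else 1


def oplus(r, s):
    """Termwise sum of two sequences."""
    return lambda j: r(j) + s(j)


def odot(r, s):
    """Termwise product of two sequences."""
    return lambda j: r(j) * s(j)


def prefix(seq, n):
    """The first n terms of a sequence, as a list."""
    return [seq(j) for j in range(1, n + 1)]


print(prefix(oplus(o, e), 8))  # [1, 1, 1, 1, 1, 1, 1, 1]
print(prefix(odot(o, e), 8))   # [0, 0, 0, 0, 0, 0, 0, 0]
```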

2.2.4. When Are Two Sequences Equivalent? The next step is to develop an equivalence relation on R^{N}. We would like r ≡ s when r and s are “almost identical,” that is, when their agreement set

E_{rs} = {j ∈ N : r_{j} = s_{j}}

is “large.” A nice idea, but there seems to be an undefined term. What is a large set? What properties should it have?

• Equivalence relations are reflexive, which means that any sequence must be equivalent to itself. Hence E_{rr} = {1, 2, 3, . . .} = N must be a large set.

• Equivalence is also transitive, which means that E_{rs} and E_{st} large must imply E_{rt} large. In general, the only nontrivial statement we can make about the agreement sets is that E_{rs} ∩ E_{st} ⊆ E_{rt}. Thus, the intersection of large sets ought to be large, and so should any set which contains a large set.

^{6} Note that R^{N} is not a field, since it contains nonzero elements which have a ⊙-product of 0, such as {1, 0, 1, 0, 1, . . .} and {0, 1, 0, 1, 0, . . .}.

• The empty set, ∅, should not be large, or else every subset of N would be large by the foregoing. In that case all sequences would be equivalent, which is less than useful.

• A set of integers A is called cofinite if N \ A is a finite set. Declaring any cofinite set to be large would satisfy the first three properties. But consider the sequences

o = {1, 0, 1, 0, 1, . . .} and e = {0, 1, 0, 1, 0, . . .}.

They agree nowhere, so they determine two distinct equivalence classes. We would like the hyperreals to be totally ordered, so one of e and o must exceed the other. Let us say that r < s if and only if L_{rs} = {j ∈ N : r_{j} < s_{j}} is a large set. Neither L_{oe} = {j : j is even} nor L_{eo} = {j : j is odd} is cofinite, so e ≮ o and e ≯ o. To obtain a total ordering using this potential definition, we need another stipulation: for any A ⊆ N, exactly one of A and N \ A must be large.
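The failure of the cofinite criterion can be checked mechanically for periodic sequences. The following Python sketch rests on a simplifying assumption that is not in the text: a periodic subset of N is represented by one period of membership flags, so it is cofinite exactly when every flag is true. The function names are invented for this example.

```python
def less_than_set(r_pattern, s_pattern):
    """One period of L_rs = {j : r_j < s_j} for sequences with a common period."""
    if len(r_pattern) != len(s_pattern):
        raise ValueError("patterns must share a period")
    return tuple(a < b for a, b in zip(r_pattern, s_pattern))


def is_cofinite(pattern):
    """A periodic set misses either no index or infinitely many indices."""
    return all(pattern)


def complement(pattern):
    """One period of the complementary periodic set."""
    return tuple(not member for member in pattern)


# Position 0 of each pattern corresponds to the index j = 1.
o = (1, 0)  # o = {1, 0, 1, 0, ...}
e = (0, 1)  # e = {0, 1, 0, 1, ...}

L_oe = less_than_set(o, e)  # (False, True): the even indices
L_eo = less_than_set(e, o)  # (True, False): the odd indices

print(is_cofinite(L_oe), is_cofinite(L_eo))  # False False
print(complement(L_oe) == L_eo)              # True
```

The two sets are complements of each other, yet neither is cofinite, so the cofinite criterion cannot decide which of o and e is greater.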

These requirements may seem rather stringent. But they are satisfied naturally by any nonprincipal ultrafilter F on N. (See Appendix C for more details about filters.) The existence of such an object is not trivial. Its complexity probably kept Cauchy and others from developing the hyperreals long ago. We are more interested in the applications of ^{∗}R than the minutiae of its construction. Therefore, we will not delve into the gory, logical details. Here, suffice it to say that there exists a nonprincipal ultrafilter on N.
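Some feeling for why existence is not trivial comes from the finite case, where nothing nonprincipal can occur. The Python sketch below is a brute-force illustration. It assumes the usual definition of an ultrafilter on a set X: a family of subsets which excludes ∅, contains X, is closed under intersections and supersets, and contains exactly one of A and X \ A for every A ⊆ X. That definition is presumed to agree with Appendix C. The code enumerates every family of subsets of {1, 2, 3} and reports the ultrafilters it finds; the function names are invented here.

```python
from itertools import combinations


def powerset(universe):
    """All subsets of a finite set, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]


def is_ultrafilter(family, universe):
    """Check the ultrafilter axioms for a family of subsets of a finite universe."""
    subsets = powerset(universe)
    full = frozenset(universe)
    if frozenset() in family or full not in family:
        return False
    # Closed under pairwise intersections.
    if any((a & b) not in family for a in family for b in family):
        return False
    # Closed under supersets.
    if any(a <= b and b not in family for a in family for b in subsets):
        return False
    # Exactly one of A and its complement is a member.
    return all((a in family) != ((full - a) in family) for a in subsets)


def ultrafilters(universe):
    """Brute-force search over every family of subsets (small universes only)."""
    subsets = powerset(universe)
    found = []
    for mask in range(1 << len(subsets)):
        family = frozenset(s for i, s in enumerate(subsets) if (mask >> i) & 1)
        if is_ultrafilter(family, universe):
            found.append(family)
    return found


universe = {1, 2, 3}
for family in ultrafilters(universe):
    core = frozenset.intersection(*family)  # elements common to every member
    print(sorted(sorted(a) for a in family), "is generated by", sorted(core))
```

Each ultrafilter found consists of all sets containing one fixed point, so it is principal. A nonprincipal ultrafilter requires an infinite index set such as N, and its existence there cannot be exhibited by any search of this kind.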

Definition 2.3 (Large Set). A set A ⊆ N is large with respect to the nonprincipal ultrafilter F ⊆ P(N) if and only if A ∈ F.

Notation ({{r R s}}). In the foregoing, E_{rs} denoted the set of places at which r = {r_{j}} and s = {s_{j}} are equal. We need a more general notation for the set of terms at which two sequences satisfy some relation. Write

{{r = s}} = {j ∈ N : r_{j} = s_{j}},
{{r < s}} = {j ∈ N : r_{j} < s_{j}}, or in general
{{r R s}} = {j ∈ N : r_{j} R s_{j}}.

Sometimes, it will be convenient to use a similar notation for the set of places at which a sequence satisfies some predicate P:

{{P(r)}} = {j ∈ N : P(r_{j})}.

Now, we are prepared to define an equivalence relation on R^{N}. Let

{r_{j}} ≡ {s_{j}} iff {{r = s}} ∈ F.

The properties of large sets guarantee that ≡ is reflexive, symmetric and transitive. Write the equivalence classes as [·]. And notice that ≡ is a congruence on the ring R^{N}.

Definition 2.4 (The Almost-All Criterion). When r ≡ s, we also say that they agree on a large set or agree at almost all j. In general, if P is a predicate and r is a sequence, we say that P holds almost everywhere on r if {{P(r)}} is a large set.

2.2.5. The Field of Hyperreals. Next, we develop arithmetic operations for the quotient ring ^{∗}R which equals R^{N} modulo the equivalence:

^{∗}R = {[r] : r ∈ R^{N}}.

Addition and multiplication are defined by

[r] + [s] = [r ⊕ s] = [{r_{j} + s_{j}}], and
[r] · [s] = [r ⊙ s] = [{r_{j} · s_{j}}].

Well-definition follows from the fact that ≡ is a congruence. Finally, define the ordering by

[r] < [s] iff {{r < s}} ∈ F iff {j ∈ N : r_{j} < s_{j}} ∈ F.

This ordering is likewise well-defined.

With these definitions, it can be shown that (^{∗}R, +, ·, <) is an ordered field. (See Goldblatt for a proof sketch [7, Ch. 3.6].)

This presentation is called an ultrapower construction of the hyperreals.^{7} Since our development depends quite explicitly on the choice of a nonprincipal ultrafilter F, we might ask whether the field of hyperreals is unique.^{8} For our purposes, the issue is tangential. It does not affect any calculations or proofs, so we will ignore it.

2.2.6. R Is Embedded in ^{∗}R. Identify any real number r ∈ R with the constant sequence r = {r, r, r, . . .}. Now, define a map ∗ : R → ^{∗}R by

^{∗}r = [r] = [{r, r, r, . . .}].

It is easy to see that for r, s ∈ R,

^{∗}(r + s) = ^{∗}r + ^{∗}s,
^{∗}(r · s) = ^{∗}r · ^{∗}s,
^{∗}r = ^{∗}s iff r = s, and
^{∗}r < ^{∗}s iff r < s.

In addition, ^{∗}0 = [0] = [{0, 0, 0, . . .}] is the zero of ^{∗}R, and ^{∗}1 = [1] = [{1, 1, 1, . . .}] is the unity.

Theorem 2.5. The map ∗ : R → ^{∗}R is an order-preserving field isomorphism of R onto its image in ^{∗}R.

^{7} The term ultrapower means that ^{∗}R is the quotient of a direct power (R^{N}) modulo a congruence (≡) given by an ultrafilter (F).

^{8} Unfortunately, the answer depends on which set-theoretic axioms we assume. The continuum hypothesis (CH) implies that we will obtain the same field (up to isomorphism) for any choice of F. Denying CH leaves the situation undetermined [7, 33]. Both CH and not-CH are consistent with standard set theory, but Schechter’s reference, Handbook of Analysis and Its Foundations, gives no indication that either axiom has any effect on standard mathematics [15].

Therefore, the reals are embedded quite naturally in the hyperreals. As a result, we may identify r with ^{∗}r as convenient.

2.2.7. R Is a Proper Subset of ^{∗}R. Let ε = {1, 1/2, 1/3, . . .} = {1/j}. It is clear that [ε] > 0:

{{0 < ε}} = {j ∈ N : 0 < 1/j} = N ∈ F.

Yet, for any positive real number r, the set

{{ε < r}} = {j ∈ N : 1/j < r}

is cofinite. Every cofinite set is large (see Appendix C), so {{ε < r}} ∈ F, which implies that [ε] < ^{∗}r. Therefore, [ε] is a positive infinitesimal!

Analogously, let ω = {1, 2, 3, . . .}. For any r ∈ R, the set

{{r < ω}} = {j ∈ N : r < j}

is cofinite, because the reals are Archimedean. We have proved that ^{∗}r < [ω]. Therefore, [ω] is infinitely large!
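The two cofiniteness claims can be verified by listing the finitely many exceptional indices. The Python sketch below is an illustration with invented function names. It uses exact rational arithmetic, and it assumes that r is supplied as an integer or a Fraction so that no rounding occurs.

```python
from fractions import Fraction
from math import floor


def exceptions_eps_below(r):
    """Indices j with 1/j >= r: the complement of {j : 1/j < r}, for r > 0."""
    r = Fraction(r)
    if r <= 0:
        raise ValueError("r must be positive")
    # 1/j >= r holds exactly when j <= 1/r.
    return list(range(1, floor(1 / r) + 1))


def exceptions_omega_above(r):
    """Indices j with j <= r: the complement of {j : r < j}, for any rational r."""
    return list(range(1, floor(Fraction(r)) + 1))


print(exceptions_eps_below(Fraction(1, 10)))   # [1, 2, ..., 10]
print(exceptions_omega_above(Fraction(7, 2)))  # [1, 2, 3]
print(exceptions_omega_above(-5))              # []
```

In each case the exceptional set is finite, so its complement is cofinite and therefore belongs to every nonprincipal ultrafilter on N.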

Remark 2.6. It is undesirable to discuss “infinitely large” and “infinitely small” numbers. These phrases are misleading because they suggest a connection between nonstandard numbers and the infinities which appear in other contexts. Hyperreals, however, have nothing to do with infinite cardinals, infinite sums, or sequences which diverge to infinity. Therefore, the terms hyperfinite and unlimited are preferable to “infinitely large.” Likewise, infinitesimal is preferable to “infinitely small.”

These facts demonstrate that R ⊊ ^{∗}R. Here is an even more direct proof of this result. For any r ∈ R, {{r = ω}} equals ∅ or {r}. Thus {{r = ω}} ∉ F, which shows that ^{∗}r ≠ [ω]. Thus, [ω] ∈ ^{∗}R \ R.

Definition 2.7 (Nonstandard Number). Any element of ^{∗}R \ R is called a nonstandard number. For every r ∈ R, ^{∗}r is standard. In fact, all standard elements of ^{∗}R take this form.

This discussion also shows that any sequence ε converging to zero generates an infinitesimal [ε], which vindicates Cauchy’s definition. Similarly, any sequence ω which diverges to infinity can be identified with an unlimited number [ω]. Moreover, for the particular sequences ε = {1/j} and ω = {j} used above, [ε] · [ω] = [1]. So [ε] and [ω] are multiplicative inverses.
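The inverse relationship can be written out from the definition of multiplication in ^{∗}R. The computation below is a worked illustration. The second line uses a different infinitesimal, generated by the sequence {1/j^{2}}, which is an example chosen here to show that the product of an infinitesimal and an unlimited number need not equal 1.

```latex
\begin{align*}
[\{1/j\}] \cdot [\{j\}]
  &= \bigl[\{\tfrac{1}{j} \cdot j\}\bigr]
   = [\{1, 1, 1, \ldots\}]
   = {}^{*}1, \\
[\{1/j^{2}\}] \cdot [\{j\}]
  &= \bigl[\{\tfrac{1}{j^{2}} \cdot j\}\bigr]
   = [\{1/j\}],
   \quad \text{which is again infinitesimal.}
\end{align*}
```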

Mission accomplished.

2.2.8. The ∗ Map. We would like to be able to extend functions
from R to ^{∗}R. As a first step, it is necessary to enlarge the function’s
domain.

Let A ⊆ R. Define the extension or enlargement ^{∗}A of A as follows. For each r ∈ R^{N},

[r] ∈ ^{∗}A iff {{r ∈ A}} = {j ∈ N : r_{j} ∈ A} ∈ F.

That is, ^{∗}A contains the equivalence classes of sequences whose terms are almost all in A. One consequence is that ^{∗}a ∈ ^{∗}A for each a ∈ A.

Now, we prove a crucial theorem about set extensions.

Theorem 2.8. Let A ⊆ R. ^{∗}A has nonstandard members if and
only if A is infinite. Otherwise, ^{∗}A = A.

Proof. If A is infinite, then there is a sequence r, where r_{j} ∈ A for each j, whose terms are all distinct. The set {{r ∈ A}} = N ∈ F, so [r] ∈ ^{∗}A. For any real s ∈ R, let s = {s, s, s, . . .}. The agreement set {{r = s}} is either ∅ or a singleton, neither of which is large. So ^{∗}s = [s] ≠ [r]. Thus, [r] is a nonstandard element of ^{∗}A.

On the other hand, assume that A is finite. Choose [r] ∈ ^{∗}A. By definition, r has a large set of terms in A. For each x ∈ A, let R_{x} = {{r = x}} = {j ∈ N : r_{j} = x}. Now, {R_{x}}_{x∈A} is a finite collection of pairwise disjoint sets, and their union is an element of F, i.e. a large set. The properties of ultrafilters (see Appendix C) dictate that R_{x} ∈ F for exactly one x ∈ A, say x_{0}. Therefore, {{r = x_{0}}} ∈ F, where x_{0} = {x_{0}, x_{0}, x_{0}, . . .} is the constant sequence. And so [r] = ^{∗}x_{0}.

As every element of A has a corresponding element in ^{∗}A, we conclude that ^{∗}A = A whenever A is finite.

The definition and theorem have several immediate consequences. ^{∗}A will have elements which differ from the accumulation points of A by an infinitesimal. In addition, the extension of an unbounded set will have infinitely large elements.

It should be noted that the ∗ map developed here is a special case of a nonstandard extension, described in Appendix A. Therefore, it preserves unions, intersections, set differences and Cartesian products.

Now, we are prepared to define the extension of a function, f : R → R. For any sequence r ∈ R^{N}, define f(r) = {f(r_{j})}. Then let

^{∗}f([r]) = [f(r)].

In general,

{{r = r′}} ⊆ {{f(r) = f(r′)}},

which means

r ≡ r′ implies f(r) ≡ f(r′).

Thus, ^{∗}f is well-defined. Now, ^{∗}f : ^{∗}R → ^{∗}R.
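The inclusion of agreement sets which underlies well-definition can be observed on a finite range of indices. The Python sketch below is an illustration only: the sequences r and r_prime, the function f(x) = x², and the helper names are all chosen for this example, and a finite window says nothing about membership in F.

```python
def agreement_set(r, s, n):
    """Indices j in 1..n at which the sequences r and s agree."""
    return {j for j in range(1, n + 1) if r(j) == s(j)}


def apply_termwise(f, r):
    """The sequence f(r) = {f(r_j)}."""
    return lambda j: f(r(j))


def r(j):
    return j


def r_prime(j):
    # Differs from r only in sign, and only at the first three indices.
    return -j if j <= 3 else j


def f(x):
    return x * x


agree_inputs = agreement_set(r, r_prime, 10)
agree_outputs = agreement_set(apply_termwise(f, r), apply_termwise(f, r_prime), 10)

print(sorted(agree_inputs))           # [4, 5, 6, 7, 8, 9, 10]
print(sorted(agree_outputs))          # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(agree_inputs <= agree_outputs)  # True
```

The inclusion can be proper, as it is here, because f may send distinct inputs to the same output. Since supersets of large sets are large, only the inclusion matters for the argument.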

We can also extend the partial function f : A → R to the partial function ^{∗}f : ^{∗}A → ^{∗}R. This construction is identical to the last, except that we avoid elements outside Dom f. For any [r] ∈ ^{∗}A, let

s_{j} = f(r_{j}) if r_{j} ∈ A, and s_{j} = 0 otherwise.

Since [r] ∈ ^{∗}A, r_{j} ∈ A for almost all j, which means that s_{j} = f(r_{j}) almost everywhere. Finally, we put

^{∗}f([r]) = [s].

Demonstrating well-definition of the extension of a partial function is similar to the proof for functions whose domain is R.

It is easy to show that ^{∗}(f(r)) = ^{∗}f(^{∗}r), so ^{∗}f is an extension of f. Therefore, the ∗ is not really necessary, and it is sometimes omitted.

Definition 2.9 (Hypersequence). Note that this discussion also applies to sequences, since a sequence is a function a : N → R. The extension of a sequence is called a hypersequence, and it maps ^{∗}N → ^{∗}R. The same symbol a is used to denote the hypersequence. Terms with hyperfinite indices are called extended terms.

Definition 2.10 (Standard Object). Any set of hyperreals, function on the hyperreals, or sequence of hyperreals which can be defined via this ∗ mapping is called standard.

2.3. Principles of NSA

Before we can exploit the power of NSA, we need a way to translate results from the reals to the hyperreals and vice-versa. I continue to follow Goldblatt’s presentation [7].

2.3.1. The Transfer Principle. The Transfer Principle is the most important tool in Nonstandard Analysis. First, it allows us to recast classical theorems for the hyperreals. Second, it permits the use of hyperreals to prove results about the reals. Roughly, transfer states that

any appropriately formulated statement is true of ^{∗}R
if and only if it is true of R [7, 11].

We must define what it means for a statement to be “appropriately formulated” and how the statement about ^{∗}R differs from the statement about R.

Any mathematical statement can be written in logical notation using the following symbols:

Logical Connectives: ∧ (and), ∨ (or), ¬ (not), → (implies), and ↔ (if and only if).

Quantifiers: ∀ (for all) and ∃ (there exists).

Parentheses: (), [].

Constants: Fixed elements of some fixed set or universe U , which are usually denoted by letter symbols.

Variables: A countable collection of letter symbols.

Definition 2.11 (Sentence). A sentence is a mathematical statement written in logical notation which contains no free variables.

In other words, every variable must be quantified to specify its bound, the set over which it ranges. For example, the statement (x > 2) contains a free occurrence of the variable x. On the other hand, the statement (∀y ∈ N)(y > 2) contains only the variable y, bound to N, which means that it is a sentence. A sentence in which all terms are defined may be assigned a definite truth value.

Next, we explain how to take the ∗-transform of a sentence ϕ. This is a further generalization of the ∗ map which was discussed in Section 2.2.8.

• Replace each constant τ by ^{∗}τ .

• Replace each relation (or function) R by ^{∗}R.

• Replace the bound A of each quantifier by its enlargement ^{∗}A.

Variables do not need to be renamed. Set operations like ∪, ∩, \, ×, etc. are preserved under the ∗ map, so they do not need renaming. As we saw before, we may identify r with ^{∗}r for any real number, so these constants do not require a ∗. It is also common to omit the ∗ from standard relations like =, ≠, <, ∈, etc. and from standard functions like sin, cos, log, exp, etc. The classical definition will dictate the ∗-transform. As before, A ⊊ ^{∗}A whenever A is infinite. Therefore, all sets must be replaced by their enlargements.

Be careful, however, when using sets as variables. The bound of a
variable is the set over which it ranges, hence (∀A ⊆ R) must be written
as (∀A ∈ P(R)). Furthermore, the transform of P(R) is ^{∗}P(R) and
neither P(^{∗}R) nor ^{∗}P(^{∗}R). This phenomenon results from the fact
that P is not a function; it is a special notation for a specific set.

It will be helpful to provide some examples of sentences and their ∗-transforms.

(∀x ∈ R)(sin^{2}x + cos^{2}x = 1) becomes
(∀x ∈ ^{∗}R)(sin^{2}x + cos^{2}x = 1).

(∀x ∈ R)(x ∈ [a, b] ↔ a ≤ x ≤ b) becomes
(∀x ∈ ^{∗}R)(x ∈ ^{∗}[a, b] ↔ a ≤ x ≤ b).

(∃y ∈ [a, b])(π < f(y)) becomes
(∃y ∈ ^{∗}[a, b])(π < ^{∗}f(y)).
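One further worked example, which does not appear among those above, shows how much depends on replacing the bound of each quantifier. The Archimedean property of the reals is a sentence, and the three rules turn it into a sentence about the hyperreals:

```latex
\begin{align*}
&(\forall x \in \mathbb{R})(\exists n \in \mathbb{N})(x < n)
  \quad \text{becomes} \\
&(\forall x \in {}^{*}\mathbb{R})(\exists n \in {}^{*}\mathbb{N})(x < n).
\end{align*}
```

The transformed sentence does not assert that every hyperreal lies below some ordinary natural number, which would be false for [ω]. It asserts only that every hyperreal lies below some hypernatural, and that hypernatural may itself be unlimited.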

Now, we can restate the transfer principle more formally. If ϕ is a sentence and ^{∗}ϕ is its ∗-transform,

^{∗}ϕ is true iff ϕ is true.

The transfer principle is a special case of Łoś’s Theorem, which is beyond the scope of this thesis.

As a result of transfer, many facts about real numbers are also true about the hyperreals. Trigonometric functions and logarithms, for instance, continue to behave the same way for hyperreal arguments. Transfer also permits the use of infinitesimals and unlimited numbers in lieu of limit arguments (see Section 3.1).

One more caution about the transfer principle: although every sentence concerning R has a ∗-transform, there are many sentences concerning ^{∗}R which are not ∗-transforms.

The rules for applying the ∗-transform may seem arcane, but they quickly become second nature. The proofs in the next chapter will foster familiarity.

2.3.2. Internal Sets. For any sequence of subsets of R, A = {A_{j}}, define a subset [A] ⊆ ^{∗}R by the following rule. For each [r] ∈ ^{∗}R,

[r] ∈ [A] iff {{r ∈ A}} = {j ∈ N : r_{j} ∈ A_{j}} ∈ F.

Subsets of ^{∗}R formed in this manner are called internal.

As examples, the enlargement ^{∗}A of A ⊆ R is internal, since it is constructed from the constant sequence {A, A, A, . . .}. Any finite set of hyperreals is internal, and the hyperreal interval, [a, b] = {x ∈ ^{∗}R : a ≤ x ≤ b}, is internal for any a, b ∈ ^{∗}R.

Internal sets may also be identified as the elements of ^{∗}P(R). Thus the transfer principle gives internal sets a special status. For example, the sentence

(∀A ∈ P(N))[(A ≠ ∅) → (∃n ∈ N)(n = min A)] becomes
(∀A ∈ ^{∗}P(N))[(A ≠ ∅) → (∃n ∈ ^{∗}N)(n = min A)].

Therefore, every nonempty internal subset of ^{∗}N has a least member.

Internal sets have many other fascinating properties, which are fundamental to NSA. It is also possible to define internal functions as the equivalence classes of sequences of real-valued functions. These, too, are crucial to NSA. Unfortunately, an explication of these facts would take us too far afield.

2.4. Working with Hyperreals

Having discussed some of the basic principles of NSA, we can begin to investigate the structure of the hyperreals. Then, we will be able to ignore the details of the ultrapower construction and use hyperreals for arithmetic. I am still following Goldblatt [7].

2.4.1. Types of Hyperreals. ^{∗}R contains the hyperreal numbers. Similarly, ^{∗}Q contains hyperrationals, ^{∗}Z contains hyperintegers and ^{∗}N contains hypernaturals. The sentence

(∀x ∈ R)[(x ∈ Q) ↔ (∃y, z ∈ Z)(z ≠ 0 ∧ x = y/z)]

transfers to

(∀x ∈ ^{∗}R)[(x ∈ ^{∗}Q) ↔ (∃y, z ∈ ^{∗}Z)(z ≠ 0 ∧ x = y/z)],

which demonstrates that ^{∗}Q contains quotients of hyperintegers.

Another important set of hyperreals is the set of unlimited natural numbers, ^{∗}N_{∞} = ^{∗}N \ N. One of its key properties is that it has no least member.^{9}

Hyperreal numbers come in several basic sizes. Terminology varies, but Goldblatt lists the most common definitions. The hyperreal b ∈ ^{∗}R is

• limited if r < b < s for some r, s ∈ R;

• positive unlimited if b > r for every r ∈ R;

• negative unlimited if b < r for every r ∈ R;

• unlimited or hyperfinite if it is positive or negative unlimited;

• positive infinitesimal if 0 < b < r for every positive r ∈ R;

• negative infinitesimal if r < b < 0 for every negative r ∈ R;

• infinitesimal if it is positive or negative infinitesimal or zero;^{10}

• appreciable if b is limited but not infinitesimal.
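These size classes can be explored in a small toy model. The Python sketch below is not the ultrapower construction. It assumes a substitute model: quotients p(ω)/q(ω) of polynomials in the unlimited number ω = [{1, 2, 3, . . .}], each given by a list of coefficients [c_0, c_1, . . .]. Such a quotient is represented by the sequence {p(j)/q(j)}, which is defined for all but finitely many j, and its size class is determined by comparing degrees. The function names are invented for this example.

```python
def degree(coeffs):
    """Degree of the polynomial c0 + c1*w + c2*w**2 + ...; None if it is zero."""
    for k in range(len(coeffs) - 1, -1, -1):
        if coeffs[k] != 0:
            return k
    return None


def classify(numerator, denominator):
    """Size class of p(w)/q(w), where w stands for the unlimited number omega."""
    deg_p = degree(numerator)
    deg_q = degree(denominator)
    if deg_q is None:
        raise ZeroDivisionError("denominator is the zero polynomial")
    if deg_p is None:
        return "infinitesimal (zero)"
    if deg_p < deg_q:
        return "infinitesimal"   # p(j)/q(j) tends to 0 without being 0
    if deg_p > deg_q:
        return "unlimited"       # |p(j)/q(j)| grows without bound
    return "appreciable"         # p(j)/q(j) tends to a nonzero real


print(classify([1], [0, 1]))        # 1/w
print(classify([0, 0, 1], [1, 1]))  # w**2 / (1 + w)
print(classify([3, 2], [1, 5]))     # (3 + 2w) / (1 + 5w), close to 2/5
```

The classification relies on the earlier observation that a sequence converging to zero generates an infinitesimal and a sequence diverging to infinity generates an unlimited number. Most hyperreals are not of this special form, so the sketch covers only a small subfield of ^{∗}R.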

^{9} Consequently, ^{∗}N_{∞} is not internal.

^{10} Zero is the only infinitesimal in R.