A Medley of Potpourri: Aug 16, 2022

Tuesday, August 16, 2022

Nonstandard calculus

From Wikipedia, the free encyclopedia

https://en.wikipedia.org/wiki/Nonstandard_calculus

In mathematics, nonstandard calculus is the modern application of infinitesimals, in the sense of nonstandard analysis, to infinitesimal calculus. It provides a rigorous justification for some arguments in calculus that were previously considered merely heuristic.

Non-rigorous calculations with infinitesimals were widely used before Karl Weierstrass sought to replace them with the (ε, δ)-definition of limit starting in the 1870s. (See history of calculus.) For almost one hundred years thereafter, mathematicians such as Richard Courant viewed infinitesimals as being naive and vague or meaningless.

Contrary to such views, Abraham Robinson showed in 1960 that infinitesimals are precise, clear, and meaningful, building upon work by Edwin Hewitt and Jerzy Łoś. According to Howard Keisler, "Robinson solved a three hundred year old problem by giving a precise treatment of infinitesimals. Robinson's achievement will probably rank as one of the major mathematical advances of the twentieth century."

History

The history of nonstandard calculus began with the use of infinitely small quantities, called infinitesimals in calculus. The use of infinitesimals can be found in the foundations of calculus independently developed by Gottfried Leibniz and Isaac Newton starting in the 1660s. John Wallis refined earlier techniques of indivisibles of Cavalieri and others by exploiting an infinitesimal quantity he denoted ${\tfrac {1}{\infty }}$ in area calculations, preparing the ground for integral calculus. They drew on the work of such mathematicians as Pierre de Fermat, Isaac Barrow and René Descartes.

In early calculus the use of infinitesimal quantities was criticized by a number of authors, most notably Michel Rolle and Bishop Berkeley in his book The Analyst.

Several mathematicians, including Maclaurin and d'Alembert, advocated the use of limits. Augustin Louis Cauchy developed a versatile spectrum of foundational approaches, including a definition of continuity in terms of infinitesimals and a (somewhat imprecise) prototype of an ε, δ argument in working with differentiation. Karl Weierstrass formalized the concept of limit in the context of a (real) number system without infinitesimals. Following the work of Weierstrass, it eventually became common to base calculus on ε, δ arguments instead of infinitesimals.

This approach formalized by Weierstrass came to be known as the standard calculus. After many years of the infinitesimal approach to calculus having fallen into disuse other than as an introductory pedagogical tool, use of infinitesimal quantities was finally given a rigorous foundation by Abraham Robinson in the 1960s. Robinson's approach is called nonstandard analysis to distinguish it from the standard use of limits. This approach used technical machinery from mathematical logic to create a theory of hyperreal numbers that interpret infinitesimals in a manner that allows a Leibniz-like development of the usual rules of calculus. An alternative approach, developed by Edward Nelson, finds infinitesimals on the ordinary real line itself, and involves a modification of the foundational setting by extending ZFC through the introduction of a new unary predicate "standard".

Motivation

To calculate the derivative $f'$ of the function $y=f(x)=x^{2}$ at x, both approaches agree on the algebraic manipulations:

{\displaystyle {\frac {\Delta y}{\Delta x}}={\frac {(x+\Delta x)^{2}-x^{2}}{\Delta x}}={\frac {2x\Delta x+(\Delta x)^{2}}{\Delta x}}=2x+\Delta x\approx 2x}

This becomes a computation of the derivatives using the hyperreals if $\Delta x$ is interpreted as an infinitesimal and the symbol " $\approx$ " is the relation "is infinitely close to".

In order to make f ' a real-valued function, the final term $\Delta x$ is dispensed with. In the standard approach using only real numbers, that is done by taking the limit as $\Delta x$ tends to zero. In the hyperreal approach, the quantity $\Delta x$ is taken to be an infinitesimal, a nonzero number that is closer to 0 than to any nonzero real. The manipulations displayed above then show that $\Delta y/\Delta x$ is infinitely close to 2x, so the derivative of f at x is then 2x.

Discarding the "error term" is accomplished by an application of the standard part function. Dispensing with infinitesimal error terms was historically considered paradoxical by some writers, most notably George Berkeley.

Once the hyperreal number system (an infinitesimal-enriched continuum) is in place, one has successfully incorporated a large part of the technical difficulties at the foundational level. Thus, the epsilon, delta techniques that some believe to be the essence of analysis can be implemented once and for all at the foundational level, and the students needn't be "dressed to perform multiple-quantifier logical stunts on pretense of being taught infinitesimal calculus", to quote a recent study. More specifically, the basic concepts of calculus such as continuity, derivative, and integral can be defined using infinitesimals without reference to epsilon, delta (see next section).

Keisler's textbook

Keisler's Elementary Calculus: An Infinitesimal Approach defines continuity on page 125 in terms of infinitesimals, to the exclusion of epsilon, delta methods. The derivative is defined on page 45 using infinitesimals rather than an epsilon-delta approach. The integral is defined on page 183 in terms of infinitesimals. Epsilon, delta definitions are introduced on page 282.

Definition of derivative

The hyperreals can be constructed in the framework of Zermelo–Fraenkel set theory, the standard axiomatisation of set theory used elsewhere in mathematics. To give an intuitive idea for the hyperreal approach, note that, naively speaking, nonstandard analysis postulates the existence of positive numbers ε which are infinitely small, meaning that ε is smaller than any standard positive real, yet greater than zero. Every real number x is surrounded by an infinitesimal "cloud" of hyperreal numbers infinitely close to it. To define the derivative of f at a standard real number x in this approach, one no longer needs an infinite limiting process as in standard calculus. Instead, one sets

f'(x)=\mathrm {st} \left({\frac {f^{*}(x+\varepsilon )-f^{*}(x)}{\varepsilon }}\right),

where st is the standard part function, yielding the real number infinitely close to the hyperreal argument of st, and $f^{*}$ is the natural extension of $f$ to the hyperreals.

Continuity

A real function f is continuous at a standard real number x if for every hyperreal x' infinitely close to x, the value f(x' ) is also infinitely close to f(x). This captures Cauchy's definition of continuity as presented in his 1821 textbook Cours d'Analyse, p. 34.

Here to be precise, f would have to be replaced by its natural hyperreal extension usually denoted f^* (see discussion of Transfer principle in main article at nonstandard analysis).

Using the notation $\approx$ for the relation of being infinitely close as above, the definition can be extended to arbitrary (standard or nonstandard) points as follows:

A function f is microcontinuous at x if whenever $x'\approx x$ , one has $f^{*}(x')\approx f^{*}(x)$

Here the point x' is assumed to be in the domain of (the natural extension of) f.

The above requires fewer quantifiers than the (ε, δ)-definition familiar from standard elementary calculus:

f is continuous at x if for every ε > 0, there exists a δ > 0 such that for every x' , whenever |x − x' | < δ, one has |f(x) − f(x' )| < ε.

Uniform continuity

A function f on an interval I is uniformly continuous if its natural extension f* in I* has the following property (see Keisler, Foundations of Infinitesimal Calculus ('07), p. 45):

for every pair of hyperreals x and y in I*, if $x\approx y$ then $f^{*}(x)\approx f^{*}(y)$ .

In terms of microcontinuity defined in the previous section, this can be stated as follows: a real function is uniformly continuous if its natural extension f* is microcontinuous at every point of the domain of f*.

This definition has a reduced quantifier complexity when compared with the standard (ε, δ)-definition. Namely, the epsilon-delta definition of uniform continuity requires four quantifiers, while the infinitesimal definition requires only two quantifiers. It has the same quantifier complexity as the definition of uniform continuity in terms of sequences in standard calculus, which however is not expressible in the first-order language of the real numbers.

The hyperreal definition can be illustrated by the following three examples.

Example 1: a function f is uniformly continuous on the semi-open interval (0,1], if and only if its natural extension f* is microcontinuous (in the sense of the formula above) at every positive infinitesimal, in addition to continuity at the standard points of the interval.

Example 2: a function f is uniformly continuous on the semi-open interval [0,∞) if and only if it is continuous at the standard points of the interval, and in addition, the natural extension f* is microcontinuous at every positive infinite hyperreal point.

Example 3: similarly, the failure of uniform continuity for the squaring function

x^{2}

is due to the absence of microcontinuity at a single infinite hyperreal point, see below.

Concerning quantifier complexity, the following remarks were made by Kevin Houston:

The number of quantifiers in a mathematical statement gives a rough measure of the statement’s complexity. Statements involving three or more quantifiers can be difficult to understand. This is the main reason why it is hard to understand the rigorous definitions of limit, convergence, continuity and differentiability in analysis as they have many quantifiers. In fact, it is the alternation of the

\forall

and

\exists

that causes the complexity.

Andreas Blass wrote as follows:

Often ... the nonstandard definition of a concept is simpler than the standard definition (both intuitively simpler and simpler in a technical sense, such as quantifiers over lower types or fewer alternations of quantifiers).

Compactness

A set A is compact if and only if its natural extension A* has the following property: every point in A* is infinitely close to a point of A. Thus, the open interval (0,1) is not compact because its natural extension contains positive infinitesimals which are not infinitely close to any positive real number.

Heine–Cantor theorem

The fact that a continuous function on a compact interval I is necessarily uniformly continuous (the Heine–Cantor theorem) admits a succinct hyperreal proof. Let x, y be hyperreals in the natural extension I* of I. Since I is compact, both st(x) and st(y) belong to I. If x and y were infinitely close, then by the triangle inequality, they would have the same standard part

c=\operatorname {st} (x)=\operatorname {st} (y).

Since the function is assumed continuous at c,

f(x)\approx f(c)\approx f(y),

and therefore f(x) and f(y) are infinitely close, proving uniform continuity of f.

Why is the squaring function not uniformly continuous?

Let f(x) = x² defined on $\mathbb {R}$ . Let $N\in \mathbb {R} ^{*}$ be an infinite hyperreal. The hyperreal number $N+{\tfrac {1}{N}}$ is infinitely close to N. Meanwhile, the difference

f(N+{\tfrac {1}{N}})-f(N)=N^{2}+2+{\tfrac {1}{N^{2}}}-N^{2}=2+{\tfrac {1}{N^{2}}}

is not infinitesimal. Therefore, f* fails to be microcontinuous at the hyperreal point N. Thus, the squaring function is not uniformly continuous, according to the definition in uniform continuity above.

A similar proof may be given in the standard setting (Fitzpatrick 2006, Example 3.15).

Example: Dirichlet function

Consider the Dirichlet function

{\displaystyle I_{Q}(x):={\begin{cases}1&{\text{ if }}x{\text{ is rational}},\\0&{\text{ if }}x{\text{ is irrational}}.\end{cases}}}

It is well known that, under the standard definition of continuity, the function is discontinuous at every point. Let us check this in terms of the hyperreal definition of continuity above, for instance let us show that the Dirichlet function is not continuous at π. Consider the continued fraction approximation a_n of π. Now let the index n be an infinite hypernatural number. By the transfer principle, the natural extension of the Dirichlet function takes the value 1 at a_n. Note that the hyperrational point a_n is infinitely close to π. Thus the natural extension of the Dirichlet function takes different values (0 and 1) at these two infinitely close points, and therefore the Dirichlet function is not continuous at π.

Limit

While the thrust of Robinson's approach is that one can dispense with the approach using multiple quantifiers, the notion of limit can be easily recaptured in terms of the standard part function st, namely

\lim _{x\to a}f(x)=L

if and only if whenever the difference x − a is infinitesimal, the difference f(x) − L is infinitesimal, as well, or in formulas:

if st(x) = a then st(f(x)) = L,

cf. (ε, δ)-definition of limit.

Limit of sequence

Given a sequence of real numbers $\{x_{n}\mid n\in \mathbb {N} \}$ , if $L\in \mathbb {R}$ L is the limit of the sequence and

L=\lim _{n\to \infty }x_{n}

if for every infinite hypernatural n, st(x_n)=L (here the extension principle is used to define x_n for every hyperinteger n).

This definition has no quantifier alternations. The standard (ε, δ)-style definition, on the other hand, does have quantifier alternations:

{\displaystyle L=\lim _{n\to \infty }x_{n}\Longleftrightarrow \forall \varepsilon >0\;,\exists N\in \mathbb {N} \;,\forall n\in \mathbb {N} :n>N\rightarrow |x_{n}-L|<\varepsilon .}

Extreme value theorem

To show that a real continuous function f on [0,1] has a maximum, let N be an infinite hyperinteger. The interval [0, 1] has a natural hyperreal extension. The function f is also naturally extended to hyperreals between 0 and 1. Consider the partition of the hyperreal interval [0,1] into N subintervals of equal infinitesimal length 1/N, with partition points x_i = i /N as i "runs" from 0 to N. In the standard setting (when N is finite), a point with the maximal value of f can always be chosen among the N+1 points x_i, by induction. Hence, by the transfer principle, there is a hyperinteger i₀ such that 0 ≤ i₀ ≤ N and $f(x_{i_{0}})\geq f(x_{i})$ for all i = 0, …, N (an alternative explanation is that every hyperfinite set admits a maximum). Consider the real point

c={\rm {st}}(x_{i_{0}})

where st is the standard part function. An arbitrary real point x lies in a suitable sub-interval of the partition, namely $x\in [x_{i},x_{i+1}]$ , so that st(x_i) = x. Applying st to the inequality $f(x_{i_{0}})\geq f(x_{i})$ , ${\rm {st}}(f(x_{i_{0}}))\geq {\rm {st}}(f(x_{i}))$ . By continuity of f,

{\rm {st}}(f(x_{i_{0}}))=f({\rm {st}}(x_{i_{0}}))=f(c)

Hence f(c) ≥ f(x), for all x, proving c to be a maximum of the real function f. See Keisler (1986, p. 164).

Intermediate value theorem

As another illustration of the power of Robinson's approach, a short proof of the intermediate value theorem (Bolzano's theorem) using infinitesimals is done by the following.

Let f be a continuous function on [a,b] such that f(a)<0 while f(b)>0. Then there exists a point c in [a,b] such that f(c)=0.

The proof proceeds as follows. Let N be an infinite hyperinteger. Consider a partition of [a,b] into N intervals of equal length, with partition points x_i as i runs from 0 to N. Consider the collection I of indices such that f(x_i)>0. Let i₀ be the least element in I (such an element exists by the transfer principle, as I is a hyperfinite set). Then the real number

c=\mathrm {st} (x_{i_{0}})

is the desired zero of f. Such a proof reduces the quantifier complexity of a standard proof of the IVT.

Basic theorems

If f is a real valued function defined on an interval [a, b], then the transfer operator applied to f, denoted by *f, is an internal, hyperreal-valued function defined on the hyperreal interval [*a, *b].

Theorem: Let f be a real-valued function defined on an interval [a, b]. Then f is differentiable at a < x < b if and only if for every non-zero infinitesimal h, the value

\Delta _{h}f:=\operatorname {st} {\frac {[{}^{*}\!f](x+h)-[{}^{*}\!f](x)}{h}}

is independent of h. In that case, the common value is the derivative of f at x.

This fact follows from the transfer principle of nonstandard analysis and overspill.

Note that a similar result holds for differentiability at the endpoints a, b provided the sign of the infinitesimal h is suitably restricted.

For the second theorem, the Riemann integral is defined as the limit, if it exists, of a directed family of Riemann sums; these are sums of the form

\sum _{k=0}^{n-1}f(\xi _{k})(x_{k+1}-x_{k})

where

a=x_{0}\leq \xi _{0}\leq x_{1}\leq \ldots x_{n-1}\leq \xi _{n-1}\leq x_{n}=b.

Such a sequence of values is called a partition or mesh and

\sup _{k}(x_{k+1}-x_{k})

the width of the mesh. In the definition of the Riemann integral, the limit of the Riemann sums is taken as the width of the mesh goes to 0.

Theorem: Let f be a real-valued function defined on an interval [a, b]. Then f is Riemann-integrable on [a, b] if and only if for every internal mesh of infinitesimal width, the quantity

S_{M}=\operatorname {st} \sum _{k=0}^{n-1}[*f](\xi _{k})(x_{k+1}-x_{k})

is independent of the mesh. In this case, the common value is the Riemann integral of f over [a, b].

Applications

One immediate application is an extension of the standard definitions of differentiation and integration to internal functions on intervals of hyperreal numbers.

An internal hyperreal-valued function f on [a, b] is S-differentiable at x, provided

\Delta _{h}f=\operatorname {st} {\frac {f(x+h)-f(x)}{h}}

exists and is independent of the infinitesimal h. The value is the S derivative at x.

Theorem: Suppose f is S-differentiable at every point of [a, b] where b − a is a bounded hyperreal. Suppose furthermore that

|f'(x)|\leq M\quad a\leq x\leq b.

Then for some infinitesimal ε

|f(b)-f(a)|\leq M(b-a)+\epsilon .

To prove this, let N be a nonstandard natural number. Divide the interval [a, b] into N subintervals by placing N − 1 equally spaced intermediate points:

a=x_{0}<x_{1}<\cdots <x_{N-1}<x_{N}=b

Then

{\displaystyle |f(b)-f(a)|\leq \sum _{k=1}^{N-1}|f(x_{k+1})-f(x_{k})|\leq \sum _{k=1}^{N-1}\left\{|f'(x_{k})|+\epsilon _{k}\right\}|x_{k+1}-x_{k}|.}

Now the maximum of any internal set of infinitesimals is infinitesimal. Thus all the ε_k's are dominated by an infinitesimal ε. Therefore,

|f(b)-f(a)|\leq \sum _{k=1}^{N-1}(M+\epsilon )(x_{k+1}-x_{k})=M(b-a)+\epsilon (b-a)

from which the result follows.

A Medley of Potpourri

Search This Blog