Hyperreals in a Nutshell



Yet, the biggest effect I think this will have is pedagogical. I've always found the definition of a limit kind of unintuitive, and it was specifically invented to add post hoc coherence to calculus after it had been invented and used widely. I suspect that formulating calculus via infinitesimals in introductory calculus classes would go a long way to making it more intuitive.

Different people will have different intuitions. I've always found the epsilon-delta method clear and simple, and infinitesimals made of shadows and fog when used as a basis for calculus. Every infinitesimals-first approach I have seen involves unexplained magic or papered-over cracks at some point, unexplained and papered-over because at the stage of first learning calculus the student usually doesn't know any formal logic. There's a reason that infinitesimals were only put on a sound footing a century after epsilon-delta. Mathematical logic had to be invented first.

Here the magic lies in depending on the axiom of choice to get a non-principal ultrafilter. And I believe I see a crack in the above definition of the derivative. f is a function on the non-standard reals, but its derivative is defined to only take standard values, so it will be constant in the infinitesimal range around any standard real. If , then its derivative should surely be everywhere. The above definition only gives you that for standard values of .

I also think that making it more intuitive is missing the point of learning—really learning—mathematics. The idea of the slope of a curve is already intuitive. What is needed is to show the student a way of thinking about these things that does not depend on the breath of intuition to keep it aloft.

Here the magic lies in depending on the axiom of choice to get a non-principal ultrafilter. And I believe I see a crack in the above definition of the derivative. f is a function on the non-standard reals, but its derivative is defined to only take standard values, so it will be constant in the infinitesimal range around any standard real. If , then its derivative should surely be everywhere. The above definition only gives you that for standard values of .

Yep, the definition is wrong. If then let denote the natural extension of this function to the hyperreals (considering behaves like this should work in most cases). Then, I think the derivative should be

W.r.t. what the derivative of should be, I imagine you can describe it similarly in terms of , which by the transfer principle should exist (which applies because of Łoś's theorem, which I don't claim to fully understand).

For the derivative then is:

Just in case anyone was wondering why we can't have any finite sets in the ultrafilter:

If some finite set {n1, n2, ..., n_k} is in an ultrafilter U, then either {n1, n2, ..., n_(k-1)} is in U or I \ {n1, n2, ..., n_(k-1)} is in U. In the latter case, the intersection with the original set is {n_k}, which must be in U. In the former case, you can keep repeating this until you are left with some other one-element set.

If any one-element set {n} is in U, then membership in U is just decided by whether a set contains n or not.

When you go through the equivalence construction, this means that two sequences are equivalent if and only if they agree at the n'th position, which means that all the operations are just the same as arithmetic on that position with the rest not mattering at all. So to get anything different, U really does have to be a *non-principal* ultrafilter.
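The collapse described above can be checked on a small scale. The following is a minimal sketch, not part of the original comment: sequences are modelled as Python functions from index to value, and the names `principal_ultrafilter`, `equivalent`, `harmonic` and `spike` are hypothetical. Only a finite `horizon` of indices is inspected, which is harmless here because membership in a principal ultrafilter depends on a single index.

```python
from typing import Callable, Set

# A sequence is modelled as a function from index to value (an assumption of this sketch).
Sequence = Callable[[int], float]


def principal_ultrafilter(n: int) -> Callable[[Set[int]], bool]:
    """Membership test for the principal ultrafilter generated by index n."""
    return lambda index_set: n in index_set


def equivalent(a: Sequence, b: Sequence, in_U, horizon: int = 100) -> bool:
    """a ~ b iff the agreement set {i : a(i) == b(i)} belongs to U.

    Only indices below `horizon` are inspected; for a principal ultrafilter
    generated by n < horizon this does not change the answer.
    """
    agreement = {i for i in range(horizon) if a(i) == b(i)}
    return in_U(agreement)


harmonic = lambda i: 1.0 / (i + 1)           # (1, 1/2, 1/3, 1/4, ...)
spike = lambda i: 0.25 if i == 3 else 7.0    # agrees with harmonic only at index 3
in_U = principal_ultrafilter(3)

# The two sequences differ almost everywhere, yet they are identified,
# because only position 3 is consulted.
print(equivalent(harmonic, spike, in_U))
```

With this ultrafilter every sequence is identified with the constant sequence of its entry at index 3, so the quotient is just a copy of the reals and contains no infinitesimals.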

Observe that $I$ is a set of natural numbers. If $I\in U$, then $I$ cannot be finite, and it seems pretty obvious that almost all the elements in $a,b$ are the same (they only disagree at a finite number of places after all).

The bracketed remark doesn't appear to be true. Why can we not have or ? Indeed, by the definition of an ultrafilter, we must have one of them in . Also, in the post, you use for two different purposes, which makes the post slightly less clear.

Some random thoughts.

First, it would be nice if one could go from rationals to hyperreals directly without having to define the reals in between (especially for people with limit allergies, as the reals are sometimes defined as limits of Cauchy sequences). I don't see a straightforward way to do so though, you can hardly allow people to encode their reals as sequences of rationals, otherwise the sequence would have to be equivalent to zero instead of an infinitesimal.

Also, one could split the hyperreals into equivalence classes within which the Archimedean property holds. Using the big-O adjacent notation, the reals would be , and the hyperreal called above would be . Stretching the big-O notation, one could call the equivalence class of something like . So one has a rather large zoo of these equivalence classes. This would imply that there is no Archimedean equivalence class for the smallest infinite hyperreal. If a hyperreal is infinite (that is, diverges), then is a smaller infinite hyperreal.

I am well used to there being no biggest infinity, but there being no smallest infinity would indicate that these things are neither equivalent to cardinals nor ordinals.

I found Terry Tao's writing on the topic to be helpful for understanding, especially the connection between nonprincipal ultrafilters and Arrow's Impossibility Theorem.

Yet, the biggest effect I think this will have is pedagogical. I've always found the definition of a limit kind of unintuitive, and it was specifically invented to add post hoc coherence to calculus after it had been invented and used widely. I suspect that formulating calculus via infinitesimals in introductory calculus classes would go a long way to making it more intuitive.

I think hyperreals are too complicated for calculus 1 and you should just talk about a non-rigorous "infinitesimal" like Newton and Leibniz did.

I agree. This is what I was going for in that paragraph. If you define derivatives & integrals with infinitesimals, then you can actually do things like treating dy/dx as a fraction without partaking in the half-in half-out dance that calc 1 teachers currently have to do.

I don't think the pedagogical benefit of nonstandard analysis is to replace Analysis I courses, but rather to give a rigorous backing to doing algebra with infinitesimals ("an infinitely small thing plus a real number is the same real number, an infinitely small thing times a real number is zero"). *Improper integrals would make a lot more sense this way, IMO.

Thank you, that makes sense!

Indefinite integrals would make a lot more sense this way, IMO

Why so? I thought they already made sense, they're "antiderivatives", so a function such that taking its derivative gives you the original functions. Do you need anything further to define them?

(I know about the definite integral Riemann and Lebesgue definitions, but I thought indefinite integrals were much easier in comparison.)

Voila! We have a suitable definition of "almost all agreement": if the agreement set $I$ is contained in some arbitrary nonprincipal ultrafilter $U$.

Isn't it easier to just say "If the agreement set has a nonfinite number of elements"? Why the extra complexity?

$U$ must contain a set or its complement

Oh I see, so defining it with ultrafilters rules out situations like and where both have infinite zeros and yet their product is zero.

The post is wrong in saying that U contains only cofinite sets. It obviously *must* contain plenty of sets that are neither finite nor cofinite, because the complements of those sets are also neither finite nor cofinite. Possibly the author intended to type "contains all cofinite sets" instead.

In particular, exactly one of *a* or *b* is equivalent to zero in *R.

Which one is equivalent to zero depends upon exactly which non-principal ultrafilter you choose, as there are infinitely many non-principal ultrafilters. Unfortunately (as with many other applications of the Axiom of Choice) there is no finite way to specify which ultrafilter you mean.

The post is wrong in saying that U contains only cofinite sets. It obviously must contain plenty of sets that are neither finite nor cofinite, because the complements of those sets are also neither finite nor cofinite. Possibly the author intended to type "contains all cofinite sets" instead.

Yep, this is correct! I've updated the post to reflect this.

E.g. if an ultrafilter contains the set of all even naturals, it won't contain the set of all odd naturals, neither of which are finite or cofinite.

Thanks, this is helpful to point out.

Of course, this makes all of this rather abstract. It looks to me like for almost any two hyperreals (e.g. a, b as above), the answer to "which of them is larger?" is "It depends on the ultrafilter. Also, I cannot tell you if a set is part of any specific ultrafilter. But fear not, for any given ultrafilter, the hyperreals are totally ordered."

Basically for any usable theorem, one would have to prove that the result is independent of the actual ultrafilter used, which means that numbers such as a and b will probably not feature in them a lot.

I cannot fault my analysis 1 professor for opting to stick to the reals (abstract as they already are) instead.

I don't understand some of the words you used, so please correct me if I am wrong. What are the equivalents of the original natural numbers here? Is it like 2 = { (2, 2, 2...), and all sequences that contain an infinite number of 2's and a finite number of anything else } ?

Then we would have a *partially* ordered set, because 2 is neither greater than nor smaller than { (1, 3, 1, 3, 1, 3...), and its equivalents }. Is that okay?

Yes. We have 2=[(2,2,2,...)]. But we can compare 2 with (1,3,1,3,1,3,...) since (1,3,1,3,1,3,1,3,...)=1 (this happens when the set of all even natural numbers is in your ultrafilter) or (1,3,1,3,1,3,1,3,...)=3 (this happens when the set of all odd natural numbers is in your ultrafilter). Your partially ordered set is actually a linear ordering because whenever we have two sequences (a_n), (b_n), one of the sets

{n : a_n < b_n}, {n : a_n = b_n}, {n : a_n > b_n}

is in your ultrafilter (you can think of an ultrafilter as a thing that selects one block out of every partition of the natural numbers into finitely many pieces), and if your ultrafilter contains {n : a_n < b_n}, then [(a_n)] < [(b_n)].

Epistemic status: Vaguely confused and probably lacking a sufficient technical background to get all the terms right. Is very cool though, so I figured I'd write this.

When calculus was invented, it didn't make sense. Newton and Leibniz played fast and loose with mathematical rigor to develop methods that arrived at the correct answers, but no one knew why. It took another one and a half centuries for Cauchy and Weierstrass to develop analysis, and in the meantime people like Berkeley refused to accept the methods utilizing these "ghosts of departed quantities."

Cauchy's and Weierstrass's solution to the crisis of calculus was to define infinitesimals in terms of limits. In other words, to not describe the behavior of functions directly acting on infinitesimals, but rather to frame the entire endeavour as studying the behaviors of certain operations in the limit, in that weird superposition of being arbitrarily close to something yet not it.

(And here I realize that math is better shown, not told)

The limit of a function $f(x)$ at $x=a$ is $L$ if for any $\epsilon>0$ there exists some $\delta>0$ such that if

$$0<|x-a|<\delta,$$

then

$$|f(x)-L|<\epsilon.$$

Essentially, the limit exists if there's some value $\delta$ that forces $f(x)$ to be within $\epsilon$ of $L$ if $x$ is within $\delta$ of $a$ (but not equal to $a$). Note that this has to hold true for all $\epsilon$, and you choose $\epsilon$ first!
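To make the "choose ϵ first, then find δ" order concrete, here is a small numerical sketch for one specific case that is not in the original post: f(x) = x² at a = 2 with L = 4. The function names `delta_for` and `check` are hypothetical, and the spot check on sample points only illustrates the definition; it does not prove the limit.

```python
def delta_for(epsilon: float) -> float:
    """A delta that works for f(x) = x**2 at a = 2 with limit L = 4.

    If |x - 2| < 1 then |x + 2| < 5, so |x**2 - 4| = |x - 2| * |x + 2| < 5 * |x - 2|.
    Taking delta = min(1, epsilon / 5) therefore forces |x**2 - 4| < epsilon.
    """
    return min(1.0, epsilon / 5.0)


def check(epsilon: float, samples: int = 1000) -> bool:
    """Spot-check |f(x) - L| < epsilon at sample points with 0 < |x - a| < delta."""
    a, L = 2.0, 4.0
    delta = delta_for(epsilon)
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)  # strictly between 0 and delta
        for x in (a - offset, a + offset):
            if not abs(x * x - L) < epsilon:
                return False
    return True


for eps in (10.0, 0.1, 1e-6):
    print(eps, delta_for(eps), check(eps))
```

The point of the design is that `delta_for` takes ϵ as its input: the challenger picks ϵ, and the definition is satisfied only if a suitable δ can be produced for every such choice.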

From this we get the well-known definition of the derivative:

$$f'(x)=\lim_{h\to 0}\frac{f(x+h)-f(x)}{h}$$

and you can define the integral similarly.

The limit solved calculus's rigor problem. From the limit the entire field of analysis was invented and placed on solid ground, and this foundation has stood to this day.

Yet, it seems like we lose something important when we replace the idea of the "infinitesimally small" with the "arbitrarily close to." Could we actually make numbers that were *infinitely small*?

## The Sequence Construction

Imagine some mathematical object that had all the relevant properties of the real numbers (addition, multiplication are associative and commutative, is closed, etc.) but had infinitely small and infinitely large numbers. What does this object look like?

We can take the set of all infinite sequences of real numbers $\mathbb{R}^{\mathbb{N}}$ as a starting point. A typical element $a\in\mathbb{R}^{\mathbb{N}}$ would be

$$a=(a_0,a_1,a_2,\dots)$$

where $a_0,a_1,a_2,\dots$ is some infinite sequence of real numbers.

We can define addition and multiplication element-wise as:

$$a+b=(a_0+b_0,a_1+b_1,a_2+b_2,\dots),$$

$$ab=(a_0b_0,a_1b_1,a_2b_2,\dots).$$

You can verify that this is a commutative ring, which means that these operations behave nicely. Yet, being a commutative ring is not the same thing as being an ordered field, which is what we eventually want if our desired object is to have the same properties as the reals.

To get from $\mathbb{R}^{\mathbb{N}}$ to a field structure, we have to modify it to accommodate well-defined division. The typical way of doing this is looking at how to introduce the zero product property: i.e. ensuring that if $a,b\in\mathbb{R}^{\mathbb{N}}$ and $ab=0$, then at least one of $a,b$ is $0$.

If we let $0$ be the sequence of all zeros $(0,0,0,\dots)$ in $\mathbb{R}^{\mathbb{N}}$, then it is clear that we can have two non-zero elements multiply to get zero. If we have

$$a=(a,0,0,0,\dots),$$

and

$$b=(0,b,b,b,\dots),$$

with nonzero real entries $a$ and $b$, then neither of these is the zero element, yet their product is zero.
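The zero-divisor example above is easy to reproduce. This is a minimal sketch, assuming sequences are represented as Python functions from index to value; the helper names `add`, `mul` and `prefix`, and the concrete entries 5 and 3, are illustrative choices rather than anything from the post.

```python
def add(a, b):
    """Element-wise sum of two sequences."""
    return lambda n: a(n) + b(n)


def mul(a, b):
    """Element-wise product of two sequences."""
    return lambda n: a(n) * b(n)


def prefix(seq, length=8):
    """First `length` entries of a sequence, for inspection."""
    return [seq(n) for n in range(length)]


zero = lambda n: 0.0
a = lambda n: 5.0 if n == 0 else 0.0   # (5, 0, 0, 0, ...)
b = lambda n: 0.0 if n == 0 else 3.0   # (0, 3, 3, 3, ...)

print(prefix(a))          # not the zero sequence
print(prefix(b))          # not the zero sequence
print(prefix(mul(a, b)))  # every entry is zero
print(prefix(add(a, b)))  # (5, 3, 3, 3, ...)
```

At every index at least one of the two factors is zero, so the product vanishes everywhere even though neither factor does.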

How do we fix this? Equivalence classes!

Our problem is that there are too many distinct "zero-like" things in the ring of real numbered sequences. Intuitively, we should expect the sequence $(0,1,0,0,\dots)$ to be *basically* zero, and we want to find a good condensation of $\mathbb{R}^{\mathbb{N}}$ that allows for this.

In other words, how do we make all the sequences with "almost all" of their elements equal to zero count as equal to zero?

## Almost All Agreement ft. Ultrafilters

Taken from "five ways to say "Almost Always" and actually mean it":

Let's say we define some nonprincipal ultrafilter $U$ on the natural numbers. This will contain all cofinite sets, and will exclude all finite sets. Now, let's take two sequences $a,b\in\mathbb{R}^{\mathbb{N}}$, and define their *agreement set* $I$ to be the indices on which $a,b$ are identical (have the same real number in the same position).

Observe that $I$ is a set of natural numbers. If $I\in U$, then $I$ cannot be finite, and it seems pretty obvious that almost all the elements in $a,b$ are the same (they only disagree at a finite number of places after all). Conversely, if $I\notin U$, this implies that $\mathbb{N}\setminus I\in U$, which means that $a,b$ disagree at almost all positions, so they probably shouldn't be equal.

Voila! We have a suitable definition of "almost all agreement": if the agreement set I is contained in some arbitrary nonprincipal ultrafilter U.

Let ${}^*\mathbb{R}$ be the quotient set of $\mathbb{R}^{\mathbb{N}}$ under this equivalence relation (essentially, the set of all distinct equivalence classes of $\mathbb{R}^{\mathbb{N}}$). Does this satisfy the zero product property?

(Notation note: we will let $(a)$ denote the constant infinite sequence of the real number $a$, and $[a]$ the equivalence class of the sequence $(a)$ in ${}^*\mathbb{R}$.)

## Yes, This Behaves Like The Real Numbers

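Let $a,b\in\mathbb{R}^{\mathbb{N}}$ such that $ab=(0)$. Let's break this down element-wise: for every $n\in\mathbb{N}$, at least one of $a_n,b_n$ must be zero. Let $Z_a$ and $Z_b$ be the index sets of the zero elements of $a$ and $b$, so that $Z_a\cup Z_b=\mathbb{N}$. One of the ultrafilter axioms is that $U$ must contain a set or its complement, so either $Z_a\in U$ or $\mathbb{N}\setminus Z_a\in U$. In the latter case $Z_b\in U$ as well, because $\mathbb{N}\setminus Z_a\subseteq Z_b$ and an ultrafilter contains every superset of each of its members. Therefore, either $a$ or $b$ is equivalent to $(0)$ in ${}^*\mathbb{R}$, so ${}^*\mathbb{R}$ satisfies the zero product property.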

Therefore, division is well defined on ${}^*\mathbb{R}$! Now all we need is an ordering, and luckily almost all agreement saves the day again. We can say for $a,b\in{}^*\mathbb{R}$ that $a>b$ if almost all elements in $a$ are greater than the elements in $b$ at the same positions, that is, if the set of indices $n$ with $a_n>b_n$ is in $U$.

So, ${}^*\mathbb{R}$ is an ordered field!

## Infinitesimals and Infinitely Large Numbers

We have the following hyperreal:

$$\epsilon=\left(1,\tfrac{1}{2},\tfrac{1}{3},\dots,\tfrac{1}{n},\dots\right).$$

Recall that we embed the real numbers into the hyperreals by assigning every real number $a$ to the equivalence class $[a]$. Now observe that $\epsilon$ is smaller than every positive real number embedded into the hyperreals this way.

Pick some arbitrary positive real number $a$. There exists $p\in\mathbb{N}$ such that $\frac{1}{p}<a$. Every fraction of the form $\frac{1}{n}$, where $n$ is a natural number greater than $p$, is then also smaller than $a$, so $\epsilon$ is smaller than $(a)$ at all but finitely many positions. That set of positions is cofinite and therefore in $U$, so $\epsilon$ is smaller than $[a]$.
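The argument can be spot-checked numerically for a positive real a. The sketch below is an illustration only: `witness_index` is a hypothetical helper that finds a suitable p, and the check over a finite range of n cannot replace the proof that 1/n < a for all n ≥ p.

```python
import math


def witness_index(a: float) -> int:
    """Return a natural number p >= 1 with 1/p < a, for a positive real a."""
    if a <= 0:
        raise ValueError("a must be positive")
    p = max(1, math.floor(1.0 / a))
    while 1.0 / p >= a:   # step forward until the inequality holds
        p += 1
    return p


for a in (2.0, 0.3, 0.007):
    p = witness_index(a)
    # The entries 1/n with n < p form the finite exceptional set;
    # from p onward every entry is below a.
    print(a, p, all(1.0 / n < a for n in range(p, p + 1000)))
```

Since the exceptional indices form a finite set, their complement is cofinite, which is exactly the kind of set every nonprincipal ultrafilter must contain.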

This is an infinitesimal! This is a rigorously defined, coherently defined, *infinitesimal number* smaller than all positive real numbers! In a number system which shares all of the important properties of the real numbers! (except the Archimedean one, as we will shortly see, but that doesn't really matter).

Consider the following:

$$\Omega=(1,2,3,\dots).$$

By a similar argument this is larger than all possible real numbers. I encourage you to try to prove this for yourself!
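One relationship between the two numbers can be worked out directly from the element-wise definition of multiplication. The following short computation is an added illustration, using the same sequences for $\epsilon$ and $\Omega$ as above:

$$\Omega\cdot\epsilon=\left(1\cdot 1,\;2\cdot\tfrac{1}{2},\;3\cdot\tfrac{1}{3},\;\dots,\;n\cdot\tfrac{1}{n},\;\dots\right)=(1,1,1,\dots),$$

so the product is the hyperreal $[1]$. In other words, this particular infinite number is the multiplicative inverse of this particular infinitesimal, which is what one would hope for in a field.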

(The Archimedean principle is that which guarantees that if you have any two real numbers, you can multiply the smaller by some natural number to become greater than the other. This is not true in the hyperreals. Why? (Hint: Ω breaks this if you consider a real number.))

## How does this tie into calculus, exactly?

Well, we have a coherent way of defining infinitesimals!

The short answer is that we can define the *star* operator (also called the *standard part* operator) $\operatorname{st}(x)$ as that which maps any finite hyperreal to its closest real counterpart. Then, the definition of a derivative becomes

$$f'(x)=\operatorname{st}\left(\frac{{}^*f(x+\Delta x)-{}^*f(x)}{\Delta x}\right)$$

where $\Delta x$ is some nonzero infinitesimal, and ${}^*f$ is the natural extension of $f$ to the hyperreals. More on this in a future blog post!
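As a worked example (an added illustration, not from the original post), take $f(x)=x^2$ at a real point $x$, with $\Delta x$ a nonzero infinitesimal:

$$\frac{{}^*f(x+\Delta x)-{}^*f(x)}{\Delta x}=\frac{(x+\Delta x)^2-x^2}{\Delta x}=\frac{2x\,\Delta x+\Delta x^2}{\Delta x}=2x+\Delta x,$$

$$f'(x)=\operatorname{st}(2x+\Delta x)=2x.$$

The division by $\Delta x$ is ordinary field arithmetic because $\Delta x\neq 0$, and the leftover infinitesimal is discarded by the standard part rather than by a limiting argument.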

It also turns out the hyperreals have a bunch of really cool applications in fields far removed from analysis. Check out my expository paper on the intersection of nonstandard analysis and Ramsey theory for an example!

Yet, the biggest effect I think this will have is pedagogical. I've always found the definition of a limit kind of unintuitive, and it was specifically invented to add *post hoc* coherence to calculus after it had been invented and used widely. I suspect that formulating calculus via infinitesimals in introductory calculus classes would go a long way to making it more intuitive.