This is a prototype attempt to create lessons aimed at teaching mathematical thinking to interested teenagers. The aim is to show them the nuts and bolts of how mathematics is built, rather than to teach them specific facts or how to solve specific problems. If successful I might write more.
Calculate $1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \dots$
What you'll quickly notice (or might already be aware of) as you keep on adding smaller and smaller steps is that it gets closer and closer to 2 but never quite gets there.
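This is easy to check numerically. Here's a minimal Python sketch (the function name is mine) that accumulates the terms one at a time:

```python
# Accumulate the terms 1, 1/2, 1/4, ... and watch the running total
# creep up towards 2 without ever reaching it.
def running_total(steps):
    total = 0.0
    for k in range(steps):
        total += 1 / 2**k
    return total

for steps in (1, 2, 3, 10, 30):
    print(steps, running_total(steps))
```

However many steps you add, the total stays strictly below 2.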
Now you might be tempted to say that $1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \dots = 2$.
When we see an equation like a = b, that implies certain things:
E.g. If $a = b$, then $b = a$.
E.g. If $a = b$, then $a + c = b + c$.
We've defined what it means for an equation to have a finite number of terms on one of the sides. We've never defined what it means for an equation to have an infinite number of terms. Do all those rules we have about equations apply to these infinite-term equations too?
If $a_1 + a_2 + a_3 + \dots = A$ and $b_1 + b_2 + b_3 + \dots = B$, does $(a_1 + b_1) + (a_2 + b_2) + (a_3 + b_3) + \dots = A + B$?
What about if we multiply the two together? Does $(a_1 + a_2 + a_3 + \dots)(b_1 + b_2 + b_3 + \dots) = AB$?
What does it even mean to multiply two infinite sums together?
To answer all these questions we need to define everything in terms of things we already understand. Then we can try and prove properties about them.
The first step is to break our infinite sum into pieces that each have only finitely many terms. To do that we stop talking about vague things like infinite sums, and start talking about sequences.
A sequence is just a neverending list of numbers. 1,2,3,4... is a sequence, as is 1,4,9,16...
More formally it's a mapping from the positive integers to the reals[1]. If you tell me you have a sequence $S$, I need to be able to ask you what the 172nd element of $S$ is, and you need to be able to answer me. We denote this element as $S_{172}$.
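In programming terms, a sequence is just a function from positive integers to numbers. Here are the two example sequences above, sketched as Python functions (the names are mine):

```python
# A sequence is a mapping from positive integers to reals: given any
# index n, it must produce the nth element.
def naturals(n):      # the sequence 1, 2, 3, 4, ...
    return n

def squares(n):       # the sequence 1, 4, 9, 16, ...
    return n * n

# Asking for the 172nd element always has an answer:
print(naturals(172), squares(172))
```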
In our case, we can define a sequence:

$$S_n = 1 + \frac{1}{2} + \frac{1}{4} + \dots + \frac{1}{2^{n-1}}$$

In other words, for any positive integer $n$, the $n$th number in the sequence is the sum of the first $n$ terms in our infinite sum. Now because we're only ever dealing with a finite number of terms, every member of this sequence is well defined.
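That definition translates directly into code. A sketch, with `S(n)` (my name for it) as the $n$th partial sum:

```python
# S(n) is the sum of the first n terms of 1 + 1/2 + 1/4 + ...
# Each S(n) involves only finitely many additions, so it is well defined.
def S(n):
    return sum(1 / 2**k for k in range(n))

print(S(1), S(2), S(3), S(4))
```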
Now we want to define a property called convergence, and say that $S$ converges to 2. How might we define that?
Our first attempt might be to say that:
A sequence $S$ converges to $A$ if for all $n$, $S_{n+1}$ is always closer to $A$ than $S_n$ is.
But a bit of exploration shows that this is totally insufficient: by that definition $S$ converges to 3, to 17, and in fact to any number greater than 2.
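You can check this failure numerically. A sketch (the function names are mine; the check only inspects the first 50 terms):

```python
# S(n) is the partial-sum sequence from before.
def S(n):
    return sum(1 / 2**k for k in range(n))

def always_gets_closer(A, terms=50):
    """Is S(n+1) strictly closer to A than S(n), for the first `terms` steps?"""
    return all(abs(S(n + 1) - A) < abs(S(n) - A) for n in range(1, terms))

# The check passes not just for 2 but for candidate limits above 2 as well.
print(always_gets_closer(2), always_gets_closer(3), always_gets_closer(17))
```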
So let's try and address that in our next attempt:
A sequence $S$ converges to $A$ if for all $n$, $S_{n+1}$ is always closer to $A$ than $S_n$ is, and eventually gets infinitely close to $A$.
But we can't just willy-nilly throw around terms like "infinitely close". We need to define convergence only in terms of things we already understand. What we want to express is that if you pick any value, the sequence eventually gets closer to A than that value. We can express that precisely as:
A sequence $S$ converges to $A$ if for all $n$, $S_{n+1}$ is always closer to $A$ than $S_n$ is, and for any positive number $B$, there is a number $N$, such that $|S_N - A| < B$[2].
That works for all the examples we've given so far. But it doesn't work for some other examples. What about:

$$1,\ -1,\ \frac{1}{2},\ -\frac{1}{2},\ \frac{1}{4},\ -\frac{1}{4},\ \dots$$
We definitely want to be able to express somehow that the value of this sequence is 0, but it doesn't fit our definition: it doesn't always get closer to 0, it keeps overshooting it, though the amount it overshoots by keeps getting smaller.
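A minimal numerical sketch of this behaviour, using $T_n = (-1)^{n+1} / 2^{\lfloor (n-1)/2 \rfloor}$ (i.e. 1, -1, 1/2, -1/2, 1/4, -1/4, ...) as a concrete overshooting sequence (the formula and name are my own choice of example):

```python
# T hops from one side of 0 to the other; consecutive terms can sit at
# the same distance from 0, so "always strictly closer" fails, even
# though the terms clearly shrink towards 0.
def T(n):
    return (-1) ** (n + 1) / 2 ** ((n - 1) // 2)

print([T(n) for n in range(1, 7)])
print(abs(T(2)) < abs(T(1)))  # False: the second term is NOT closer to 0
print(abs(T(41)))             # yet the terms shrink towards 0
```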
So we need to loosen our criteria. Instead of requiring it to always get closer to A, we can say that for any number, the sequence needs to eventually get closer to A than that number, and stay there:
A sequence $S$ converges to $A$ if for any positive number $B$, there is an integer $N$, such that for all integers $n$ where $n > N$, $|S_n - A| < B$.
And this is indeed the most common definition of convergence that mathematicians use.
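Here is a sketch of what the definition asks for, applied to our partial-sum sequence (the names are mine, and checking a finite window of 100 terms is a shortcut, not part of the definition; it is valid here only because this particular sequence approaches 2 steadily):

```python
# S(n) is the partial-sum sequence from before.
def S(n):
    return sum(1 / 2**k for k in range(n))

def find_N(A, B, limit=1000):
    """Smallest N (up to limit) with |S(n) - A| < B for the next 100 values
    of n. A finite window is only a heuristic, justified here because
    |S(n) - 2| shrinks with every step."""
    for N in range(limit):
        if all(abs(S(n) - A) < B for n in range(N + 1, N + 101)):
            return N
    return None

# Whatever B we pick, some N works; a smaller B just forces a larger N.
print(find_N(2, 0.1), find_N(2, 0.001))
```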
With this tool we can now start to explore concrete questions about convergence.
For example, let's define the sum of two sequences as follows:

$U = S + T$ if for all $n$, $U_n = S_n + T_n$.
Now if $S$ converges to $A$, and $T$ converges to $B$, does $S + T$ necessarily converge to $A + B$?
We can attempt to sketch out a proof:
For any positive number $C$, there is a value $N_S$ such that for all $n > N_S$, $|S_n - A| < \frac{C}{2}$, and a value $N_T$ such that for all $n > N_T$, $|T_n - B| < \frac{C}{2}$. Without loss of generality[3], assume that $N_S \geq N_T$. Then for all $n > N_S$, we have $|(S_n + T_n) - (A + B)| \leq |S_n - A| + |T_n - B| < \frac{C}{2} + \frac{C}{2} = C$ (the first step uses the triangle inequality, $|x + y| \leq |x| + |y|$).
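As a quick numerical sanity check of the result (an illustrative sketch, pairing our partial-sum sequence, which converges to 2, with $T_n = 1/n$, which converges to 0; both names are mine):

```python
def S(n):
    return sum(1 / 2**k for k in range(n))

def T(n):
    return 1 / n

# The termwise sum should converge to 2 + 0 = 2: for large enough n,
# (S + T)_n sits within any chosen bound of 2.
n = 1000
print(abs((S(n) + T(n)) - (2 + 0)))
```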
Not only does this prove that adding sequences works as we expect, it also hints that when we do so the resultant sequence converges nearly as fast as the slower of the two constituent sequences. Convergence speeds are a relatively advanced topic, but you can bet that the first thing you'll do if you study it is try to define precisely what it means for a sequence to converge quickly or slowly.
Now the usual notation for "$S$ converges to $A$" is:

$$\lim_{n \to \infty} S_n = A$$
However you'll sometimes see mathematicians skipping this notation. Instead of writing $\lim_{n \to \infty} S_n = 2$ they'll just write: $1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \dots = 2$.
What's going on here?
Firstly, mathematicians are lazy, and since everyone who reads the second form will understand that it means the same thing as the first, why bother writing it out in full?
But it's actually hinting at something deeper - when you work with limits of sums they often behave as though you literally are just adding an infinite number of terms. Manipulations like the one Euler used for the Basel problem often happen to work even though they aren't actually justified by our definitions, and this notation can give you the hint you need to attempt just such an "illegal" manipulation before you commit yourself to finding a full-blown formal proof.
Mathematicians later discovered some of the conditions under which you can treat a limit as a normal sum, and so such loose notation can actually provide fertile ground for future mathematical discoveries. This isn't an isolated event - such formalisation of informal notation has been repeated across many disparate branches of mathematics.
[1] It doesn't have to be the reals - you could have a sequence of functions or shapes or whatever, but for our purposes it's the reals.
[2] Those vertical lines mean the absolute value, which means to ignore whether the value is positive or negative. So $|3| = |-3| = 3$. Here it's just used to express that $S_N$ and $A$ are less than $B$ apart, without specifying whether $S_N$ is greater than or less than $A$.
[3] Without loss of generality is another way of saying that we're going to prove one scenario (here where $N_S \geq N_T$), but the proof is identical in the other scenario (when $N_T > N_S$) if you just switch the symbols around (so call $S$ $T$ and vice versa), so we don't want to repeat the proof multiple times for each scenario.