Why Did No One Teach Me This?: 2011

Monday, 17 October 2011

Limitless e, Part Two

We now have this expression for working out the number of ones, \(y\)'s, \(y^2\)'s, \(y^3\)'s that you will get no matter how many brackets you start with (or rather, no matter to what power you try to raise those brackets).

\[1+my+\frac{m(m-1)}{2!}y^2+\frac{m(m-1)(m-2)}{3!}y^3+\ldots\]

With a little thought we can see that this pattern is going to continue for \(y^4\)'s, \(y^5\)'s and every other power of \(y\). Why? Well each higher power of \(y\) is created by multiplying that many \(y\)'s together. To make sure you create every possible combination of rows you follow the procedure we established last time. First you fix one \(y\), then all but one of the remaining \(y\)'s, and then you float the last \(y\) around all the other available columns. Once you have used that last \(y\) in every available column, you change the position of the \(y\) you fixed last, and then you float the remaining \(y\) again. You keep going until your very first fixed \(y\) has been in every available column.

So the total number of rows you get for any power of \(y\) is a product of column counts (reducing by one each time as each \(y\) slots into place), with as many counts multiplied together as the power. So if you want to know how many total rows generate \(y\) to the power of two, you do (total columns) multiplied by (total columns less one). If you want to know the same for \(y\) to the power four, you have (total columns) multiplied by (total columns less one) multiplied by (total columns less two) multiplied by (total columns less three). You can see in each multiplication there are the same number of terms as the power of \(y\) you are trying to find.

Of course that just gives you the total number of rows that will be generated. You still need to get rid of the duplicates. That can be achieved by dividing the total number of rows created in the last step by the total number of ways you can arrange that number of \(y\)'s together. That will always be the total number of \(y\)'s (choices of position for the first \(y\)) multiplied by the choices of position for the second \(y\) placed, and so on down to the last \(y\) for which there is only one position left. Remember we are talking about the different ways of organising the \(y\)'s in the spaces for \(y\)'s already in the lines generated by the procedure in the previous paragraph. We are NOT organising the \(y\)'s in every possible space in those rows. The number that does the dividing is always going to be the same number as we are raising \(y\) to the power of, multiplied by all the whole numbers between it and zero (because we are using up a \(y\) position each time). We show this by using the \(!\) symbol after the number that starts that multiplication.

We have already written down what the first four terms look like when we multiply \(m\) brackets together. Having thought about the arguments above we can also say what ANY term of that equation will look like. If we want to know the \(k\)th term we would change the symbol \(k\) in the following to the number of the term we wanted to know:

\[\frac{m(m-1)(m-2)\ldots (m-(k-1))}{k!}y^k\]

That looks a bit tricky. Previously we have seen that the dots in a row mean "and so on" when added onto the end of a series of numbers. Here, although they appear between two terms in the top line of a division, they mean exactly the same. They just mean "keep putting in brackets with one more being deducted from \(m\) each time, until you get to the bracket where the number being deducted is one less than \(k\)". We do not want to go all the way to \(k\) because then we would have one too many brackets being multiplied.

I am not just being lazy by not writing in all these brackets. I literally cannot do so because I do not know what value \(k\) has, so I do not know how many brackets to write in until I pick a \(k\)! \(k\) may even be two, in which case I was wrong in writing in the \((m-2)\) bracket (because I should really have stopped at \((m-1)\), because one is one less than \(k\) when \(k\) is two). The idea behind the way that I have written that line is that I have given all the clues that will be needed for the whole line to be constructed once \(k\) is known, even if I have to add in or take away brackets.

What happens if we set \(k\) to four? We get this:

\[\begin{align}
&\frac{m(m-1)(m-2)\ldots (m-(4-1))}{4!}y^4\\
&\frac{m(m-1)(m-2)(m-(3))}{4!}y^4\\
&\frac{m(m-1)(m-2)(m-3)}{4!}y^4
\end{align}\]

That is the term that will always tell us the number of \(y\) to the fourth powers once we multiply out our brackets. If we say that we have four brackets, like we had above, \(m\) will be four, and the number of \(y\) to the fourth powers will be:

\[\begin{align}
&\frac{4(4-1)(4-2)(4-3)}{4!}y^4\\
&\frac{4(3)(2)(1)}{4!}y^4\\
&\frac{4(3)(2)(1)}{4\cdot 3\cdot 2 \cdot 1}y^4\\
&\frac{24}{24}y^4\\
&1\cdot y^4
\end{align}\]

Or, one \(y^4\). Which, if you look above, you will find is exactly the number we were expecting. So our logic seems sound. If you are a bit devious of mind, you may well ask, "what happens if I set \(k\) to be more than \(m\)?", or in other words, what if I ask this equation how many \(y^4\)'s I will get when I am only multiplying together, say, two brackets. The answer must be zero, because to get to \(y^4\) you must be multiplying four \(y\)'s together. If we only have two brackets, we will only have, at most, two \(y\)'s to multiply together. So what does this look like?

\[\begin{align}
&\frac{2(2-1)(2-2)(2-3)}{4!}y^4\\
&\frac{2(1)(0)(-1)}{4!}y^4\\
&\frac{0}{4!}y^4
\end{align}\]

Anything multiplied by zero is zero. So it doesn't matter what else we have in the top line of that division, as soon as a zero hoves into view the whole thing turns to zero. Zero divided by anything (apart from zero) is also zero. So we end up with zero \(y^4\)'s. Which is what we expected. We can also say that if \(m\) is less than \(k\) we will always end up creating at least one bracket on that line where a number equal to \(m\) is subtracted from \(m\), which reduces the whole thing to zero.
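If you would rather let a computer do the bracket counting, here is a minimal Python sketch of the general term. The function name `coefficient` is just something I made up for illustration, and the comparison with `math.comb` assumes Python 3.8 or later. It builds the top line one bracket at a time, exactly as described above, and then divides by \(k!\).

```python
from math import comb, factorial

def coefficient(m: int, k: int) -> int:
    """Number of y^k terms you get from multiplying m brackets of (1 + y).

    `coefficient` is a hypothetical helper name used only for this sketch.
    """
    top = 1
    for j in range(k):          # brackets m, (m-1), ..., (m-(k-1))
        top *= (m - j)
    return top // factorial(k)  # divide out the duplicated rows

print(coefficient(4, 4))                       # 1
print(coefficient(2, 4))                       # 0, a (2-2) bracket zeroes the top line
print([coefficient(4, k) for k in range(5)])   # [1, 4, 6, 4, 1]
print(coefficient(4, 2) == comb(4, 2))         # True
```

The zero for `coefficient(2, 4)` comes from the same \((2-2)\) bracket that appeared in the working above.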

So we are finally able to restate our limiting definition of \(e^x\):

\[\begin{multline}
e^x=\lim_{m \to \infty} 1+my+\frac{m(m-1)}{2!}y^2+\frac{m(m-1)(m-2)}{3!}y^3+\ldots\\
+\frac{m(m-1)(m-2)\ldots (m-(k-1))}{k!}y^k
\end{multline}\]
(Remember that we have defined \(y=\left (\frac{x}{m} \right )\), so we need to replace every \(y\) accordingly)

We have now got rid of the pesky raising to the power, but we still have the problem of this being a limiting function. Let's see if we can get rid of that. How do we do that? What I am going to do is just say, "to hell with approaching infinity, let's just make \(m\) infinity and see what happens". In particular I am going to look at each term in turn to see what effect infinity has.

The first is always one. We agreed that no matter how many times you multiply one by one you still get one.

The second is \(my\) at the moment, but we need to reintroduce \(\left (\frac{x}{m} \right )\). This makes it:

\[m\cdot \frac{x}{m}\]

That is also:
\[\begin{align}
&\frac{m}{1} \cdot \frac{x}{m}\\
&\frac{m\cdot x}{1\cdot m}\\
&\frac{m\cdot x}{m\cdot 1}\\
&\frac{m}{m}\cdot \frac{x}{1}
\end{align}\]

So no matter what \(m\) is, even infinity, once you divide it by itself you get one. So the second term is going to be \(x\) on its own. It has to be because the two infinite \(m\)'s cancel each other out.

Let's look at the third term now:

\[\begin{align}
&\frac{m(m-1)}{2!}y^2\\
&\frac{m(m-1)}{2!}\cdot \left (\frac{x}{m}\right )^2\\
&\frac{m(m-1)}{2!}\cdot \frac{x^2}{m^2}\\
&\frac{m(m-1)}{m^2}\cdot \frac{x^2}{2!}
\end{align}\]

We can just swap around the bottoms of those fractions because we are multiplying them together. Look back up at the discussion of the last term if you want to see the steps. We would just have brought these together into one big division, reorganised the order of the terms and then separated out into these two fractions. OK. Let's set \(m\) equal to infinity.

Look at the first part of that multiplication. We are setting \(m\) to infinity. If we do that then \((m-1)\) is still infinity because infinity minus one is still infinity. Agreed? Given that is the case, we get:

\[\frac{\infty(\infty-1)}{\infty^2}\cdot \frac{x^2}{2!}\]

All together now, what is infinity minus one? Yes! Infinity. So we actually have:

\[\frac{\infty(\infty)}{\infty^2}\cdot \frac{x^2}{2!}\]

Which is:

\[\frac{\infty^2}{\infty^2}\cdot \frac{x^2}{2!}\]

What is anything (even infinity squared) divided by itself? Yes - one.

\[1\cdot \frac{x^2}{2!}\]

So the horrible looking third term actually just dissolved down to a nice simple looking fraction. This is also going to be our pattern. Think about it. We have exactly the same number of brackets on the top line of each division as the power of the \(y\) that the division is multiplied by. So here we had two brackets and we squared \(y\). Next time we add a power and a bracket, because we have three brackets and we cube \(y\). This pattern continues. It means we can always be sure, no matter which term in this infinitely long series we are looking at, that the power of \(m\) on the bottom of the fraction exactly balances the number of brackets containing \(m\) on the top of the fraction. When we set \(m\) to infinity that makes all the brackets on the top contain infinity, no matter what number they try to deduct from infinity. And that means that we have the same number of infinities multiplying together on the top and bottom of the division. If we have the same on the top and bottom we know that just equals one.
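Plugging infinity straight in is a bit cavalier, so here is a rough way to see the same thing with ordinary finite numbers. This Python sketch (the name `bracket_ratio` is made up for this example) works out the top-line brackets divided by the matching power of \(m\) for bigger and bigger \(m\). If the argument above is right, the answers should creep towards one.

```python
def bracket_ratio(m: int, k: int) -> float:
    """m(m-1)...(m-(k-1)) divided by m^k, for a finite m.

    `bracket_ratio` is a hypothetical name used only in this sketch.
    """
    ratio = 1.0
    for j in range(k):
        ratio *= (m - j) / m   # pair each bracket on top with one m underneath
    return ratio

for m in (10, 1_000, 1_000_000):
    # ratios for the y^2, y^3 and y^4 terms
    print(m, [round(bracket_ratio(m, k), 6) for k in (2, 3, 4)])
```

For small \(m\) the ratio is noticeably less than one, but the gap shrinks as \(m\) grows, which is what "setting \(m\) to infinity" is standing in for.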

So we can say that when we set \(m\) to infinity, the \(k\)th term (assuming that the term at the very beginning is the zeroth term) of the series is \(\frac{x^k}{k!}\), which looks very neat indeed. So we can now write out \(e^x\) as an infinite series:

\[e^x=1+x+\frac{x^2}{2!}+\frac{x^3}{3!}+\frac{x^4}{4!}+\ldots\]

If only we could write the first two terms of the series in the form \(\frac{x^k}{k!}\) we could even write out a nice summation symbol for all this. Well, let's just see. The first term (as opposed to the zeroth term) is \(x\). What would happen if we tried to work out the first term using our rules? Well, it would be \(\frac{x^1}{1!}\). One factorial (\(1!\)) is just one. And \(x\) to the power of one is just \(x\). So we end up with \(\frac{x}{1}\) which is just \(x\) which is exactly what we want - so that works fine.

How about the zeroth term? Well we agreed ages ago that anything to the power of zero is one. That applies just as much to \(x^0\), so our rule \(\frac{x^0}{0!}\) becomes \(\frac{1}{0!}\). But surely \(0!\) is zero, right?

Nah. We assumed that anything to the power zero was going to be zero, but it turned out to be one. Same thing happens for much the same reason here. Here's what happens.

Any normal factorial, \(3!\) or \(2!\) for instance, is just the factorial of the number one smaller multiplied by the number of the factorial. So \(3!=3\cdot 2!\), and \(2!=2\cdot 1!\). If \(0!\) was zero, then this would not work with \(1!\), because that would turn into \(1\cdot 0!=1\cdot 0=0\) when we know that it is actually one. So to preserve the way the whole factorial system works, \(0!\) has to be one.
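The same rule can be written as a tiny Python function. This is only a sketch of the idea (Python already has `math.factorial`), and the point of it is that the chain of multiplications has to stop somewhere, and stopping at \(0!=1\) is the only choice that gives the right answer for \(1!\).

```python
def factorial(n: int) -> int:
    """n! built from the rule n! = n * (n-1)!, with 0! taken to be one."""
    if n == 0:
        return 1               # the stopping point; zero here would wipe out every factorial
    return n * factorial(n - 1)

print([factorial(n) for n in range(5)])   # [1, 1, 2, 6, 24]
```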

Going back to the zeroth term (\(\frac{1}{0!}\)) this gives us \(\frac{1}{1}\), or one. Which is the first term in our series. Woopee!

This means we CAN write the sum which generates the series like this:

\[\sum_{k=0}^\infty \frac{x^k}{k!}\]

What we are trying to do here is state what \(e^x\) is. So let's test all this with \(e^1\). Looking at our sum we can see that the \(x\)'s all appear on the top of the fractions. So all we need to do is replace them all with ones. So:

\[e^1=1+1+\frac{1^2}{2!}+\frac{1^3}{3!}+\frac{1^4}{4!}+\ldots\]

And of course one to any power is just one, so this simplifies to:

\[e^1=1+1+\frac{1}{2!}+\frac{1}{3!}+\frac{1}{4!}+\ldots\]

Let's work out those factorials:

\[e^1=1+1+\frac{1}{2}+\frac{1}{6}+\frac{1}{24}+\ldots\]

So we can see roughly what is going on, we'll stick to the first five terms here. We'll also convert them all into twenty fourths:

\[\begin{align}
e^1 &\approx \frac{24}{24}+\frac{24}{24}+\frac{12}{24}+\frac{4}{24}+\frac{1}{24}\\
e^1 &\approx \frac{65}{24}\\
e^1 &\approx 2.708333333
\end{align}\]

Well, that's not bad at all after only five terms. It is definitely heading in the right direction. Let's sum up then. We have gone from:

\[e=\lim_{n \to \infty} (1+\tfrac{1}{n})^n\]

to

\[\sum_{k=0}^\infty \frac{x^k}{k!}\]

or

\[e^x=1+x+\frac{x^2}{2!}+\frac{x^3}{3!}+\frac{x^4}{4!}+\ldots\]
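The five-term estimate of \(e^1\) worked out by hand earlier can be pushed further with a few lines of Python. This is a minimal sketch, with `exp_series` being a name I have invented for it. It adds up the first few terms of \(\frac{x^k}{k!}\) and prints the result next to Python's own value of \(e\).

```python
import math

def exp_series(x: float, terms: int) -> float:
    """Sum of the first `terms` terms of x^k / k!, starting at k = 0.

    `exp_series` is a hypothetical name used only in this sketch.
    """
    return sum(x**k / math.factorial(k) for k in range(terms))

for terms in (5, 10, 15):
    print(terms, exp_series(1.0, terms), math.e)
```

With five terms you get the same \(\frac{65}{24}\) as above, and each extra term nudges the total closer to \(e\).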

Monday, 10 October 2011

Limitless e, Part One

Limitless e

So last time we worked out HOW to raise the number \(e\) to an imaginary power, specifically the power that we are focussed on, namely \(e^{x}\). We were still using our definition of \(e\) as the limit of a function when one of the variables increased in size towards infinity. This function:

\[e^{x}=\lim_{m \to \infty} (1+\tfrac{x}{m})^{m}\]

That function was a little different to our first definition of \(e\) because it is actually the function that gives us \(e\) to the power of \(x\). Which is handy. When we ran through everything I decided that while the function seemed to work perfectly it did not actually tell us anything about WHY these numbers combine in this fashion.

So let's take a different tack. Let's stop defining \(e\) as the limit of a function as you change a variable. Let's try to define it purely in terms of a one variable function. In other words, what I want to do is get rid of the \(m\) in the above function.

Right-o. How do we do that? Well, what we have is this:

\[(1+\tfrac{x}{m})^{m}\]

We want to get rid of the \(m\). The \(m\) just tells us how many times we need to multiply together the stuff in the brackets. Let's say that \(m\) was two. The function would look like this:

\[(1+\tfrac{x}{2})^{2}\]

Which would expand into this:

\[(1+\tfrac{x}{2})\cdot (1+\tfrac{x}{2})\]

Now we have the same kind of problem as the red and blue baskets with apples and oranges in them. Only this time we have exactly the same things in each bracket. It doesn't change the way we deal with it though, we just need to unpack the first bracket like this:

\[1\cdot (1+\tfrac{x}{2})+\tfrac{x}{2}\cdot (1+\tfrac{x}{2})\]

And then we can unpack the second bracket like this:

\[1\cdot 1+1\cdot \tfrac{x}{2}+\tfrac{x}{2}\cdot 1+\tfrac{x}{2}\cdot \tfrac{x}{2}\]

Which turns into:

\[1+ \tfrac{x}{2}+\tfrac{x}{2}+\tfrac{(x)^2}{2^2}\]

And finally:

\[1+ 2\cdot \tfrac{x}{2}+\tfrac{x^2}{4}\]

That's fair enough, and perfectly logical. However, what we want to do is not just multiply two brackets together, but to multiply \(m\) brackets together. How on earth do we do that? What we are going to do is to look closely at exactly what we get out of the brackets when we multiply them together, and how that changes when we add more and more brackets to the mix. We want to see if we can work out a general rule to tell us what we will get without having to go through all that tedious multiplication and addition that I just did above.

To help us with this process I am going to simplify our brackets a bit. Instead of \((1+\tfrac{x}{m})\), which is a bit of a mouthful, I am going to replace the \(\tfrac{x}{m}\) bit with the letter \(y\). So our bracket now looks like this: \((1+y)\). That's much nicer to look at. All we have to do is remember that when we are finished, we need to replace \(y\) where ever we find it with \(\tfrac{x}{m}\).

Right, to get a feel for what we are talking about let's look at three examples, \((1+y)^2\), \((1+y)^3\), and \((1+y)^4\). I won't do much talking in between, let's just have a look at the logical steps.

First of all \((1+y)^4=(1+y)\cdot(1+y)^3\), and \((1+y)^3=(1+y)\cdot(1+y)^2\). So we just need to work out the power two answer and then multiply through by another bracket, and so on. So, power two:

\[\begin{align}
&(1+y)^2\\
&(1+y)(1+y)\\
&1(1+y)+y(1+y)\\
&1\cdot1+1\cdot y+y\cdot1+y\cdot y\\
&1+y+y+y^2\\
&1+2y+y^2
\end{align}\]

Now power three:

\[\begin{align}
&(1+y)(1+2y+y^2)\\
&1(1+2y+y^2)+y(1+2y+y^2)\\
&1\cdot 1+1\cdot 2y+1\cdot y^2+y\cdot 1+y\cdot 2y+y\cdot y^2\\
&1+2y+y^2+y+y(y+y)+y^3\\
&1+2y+y^2+y+y^2+y^2+y^3\\
&1+2y+y^2+y+2y^2+y^3\\
&1+2y+y+3y^2+y^3\\
&1+3y+3y^2+y^3
\end{align}\]

Now power four:

\[\begin{align}
&(1+y)(1+3y+3y^2+y^3)\\
&1(1+3y+3y^2+y^3)+y(1+3y+3y^2+y^3)\\
&(1+3y+3y^2+y^3)+y(1+3y+3y^2+y^3)\\
&1+3y+3y^2+y^3+y\cdot 1+y\cdot 3y+y\cdot 3y^2+y\cdot y^3\\
&1+3y+3y^2+y^3+y+3y^2+3y^3+y^4\\
&1+4y+3y^2+y^3+3y^2+3y^3+y^4\\
&1+4y+6y^2+y^3+3y^3+y^4\\
&1+4y+6y^2+4y^3+y^4
\end{align}\]

Let's summarise what we have found so far:

\[\begin{align}
(1+y)^2&=1+2y+y^2\\
(1+y)^3&=1+3y+3y^2+y^3\\
(1+y)^4&=1+4y+6y^2+4y^3+y^4
\end{align}\]

Can we start to describe our results? Well, first of all each result starts with a one. Secondly each result ends with a single \(y\) to the same power as we were raising the bracket to. Thirdly, in between the one and the single \(y\), we have some amount of every power of \(y\). In other words, reading from left to right, you have no \(y\)'s, then some number of just \(y\) and then some number of \(y^2\) then \(y^3\) and so on. We never miss a power of \(y\).

These are all fairly obvious. The last observation may have passed you by. Look at the numbers we multiply the \(y\)'s by. They are symmetrical! In the third power row you get three - three, and in the fourth power row you get four - six - four. This is promising, because it suggests that there is a pattern to be discovered! Let's have a try at working out how we get to those numbers.

Why does each line start with a one? Remember the process that we used to generate these lines. We are multiplying together numerous brackets. The way we multiply brackets together is to multiply the individual terms in each bracket by every other term in all the other brackets. That's easy for only two brackets, because in each multiplication you only have two things being multiplied. With three or four brackets, we are going to have three or four items being multiplied.

To generate the first row above we went through the whole term by term routine, but then when generating the other lines, we just multiplied the last line by another bracket as a short cut. Let's actually look at what would have happened with three brackets, from scratch as it were:

\[(1+y)(1+y)(1+y)\]

First we multiply the first term in each of the brackets:

\[1\cdot 1\cdot 1\]

That gives us our number one. The more brackets you have the more ones go into that multiplication, but even with \(m\) number of ones in that multiplication the result is still going to be one. That's where the one comes from. What next? Let's have the first terms of the first two brackets and then the second from the last:

\[1\cdot 1\cdot y\]

That gives us a \(y\) on its own (the ones just collapse down to tell us we end up with one multiplied by \(y\)). And the same thing will happen when the \(y\) in the middle bracket is multiplied by the ones in the other two. That looks like this:

\[1\cdot y\cdot 1\]

And finally we will also get the same result from choosing the \(y\) in the first bracket and ones in the last two:

\[y\cdot 1\cdot 1\]

If we then add up all those \(y\)'s we end up with, drumroll, \(3y\), which is what we were expecting. If you think about it there is no way for us to get any other individual \(y\)'s. There are only two options from each bracket a one or a \(y\). And if you multiply two or more \(y\)'s together you cannot end up with a \(y\) as opposed to a \(y^2\) or a \(y^3\). So the only options for getting \(y\)'s are to pick only one \(y\). We can only do that for as many brackets as we have. So that means if we have \(m\) brackets we will get \(m\) \(y\)'s. So far, with \(m\) brackets, we can say that we will have:

\[1+my+\ldots\]
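The "pick a one or a \(y\) from each bracket" idea can be checked by brute force. This Python sketch (with `count_powers` as a made-up name) uses `itertools.product` to list every possible row of picks and then tallies how many \(y\)'s each row contains. The tally for a single \(y\) should come out as the number of brackets.

```python
from collections import Counter
from itertools import product

def count_powers(m: int) -> Counter:
    """Pick '1' or 'y' from each of m brackets and tally the power of y in each row.

    `count_powers` is a hypothetical name used only in this sketch.
    """
    tally = Counter()
    for row in product("1y", repeat=m):   # every way of choosing one item per bracket
        tally[row.count("y")] += 1        # number of y's in the row is the power of y
    return tally

print(sorted(count_powers(3).items()))    # [(0, 1), (1, 3), (2, 3), (3, 1)]
```

Because this lists every row exactly once there are no duplicates to worry about, which makes it a handy answer key for the hand counting that follows.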

If we move onto \(y^2\)'s things get a bit trickier. We have to keep track of all the combinations we have tried so far, and that is getting complicated. So I am going to use a table to keep track of things.

That shows us the result for the first line. Now let's add on the other options that we have worked through so far:

In order to do things as systematically as possible when using multiple \(y\)'s, I am going to fix one, and then move the other one around. This means that I won't accidentally miss any of the possible arrangements. I'll fix one in the third bracket first, putting the other \(y\) in the second and then first brackets:

Good. Now let's put the fixed \(y\) in the second bracket, and we'll use the last and then first for the other \(y\):

And last of all we will fix the \(y\) in the first bracket and use the last and then second brackets for the other \(y\):

Hang on. We have ended up with six \(y^2\)'s. We were only expecting three. What has gone wrong? If you look at the first and third \(y^2\) rows you will see that they are identical. They both represent a one in the first bracket multiplied by a \(y\) in the second and third brackets. But once we have done that combination once, we can't do it again. So we have overestimated. If you look closely, you will see that each \(y^2\) line is duplicated once. Because we have twice as many \(y^2\)'s as we need, we need to divide the result by two. Six divided by two is three, which IS what we were expecting. Why did we end up with twice as many as we were expecting? Look at this line:

How did we create that twice? Well once we had the fixed \(y\) in the first column, and then we had it in the second column. But both times we generated the same row. There are two \(y\)'s in the row, so there have to be two copies of the row in our whole table. Why? You have one copy for when the first \(y\) is fixed and one copy for when the second \(y\) is fixed. Good.

Next question. How did we end up with six rows? Well we were creating our rows by a careful systematic process. First we picked a column for the fixed \(y\). We had three columns to choose from. Then we placed our floating \(y\). We no longer had three columns to choose from, because the fixed \(y\) was taking up a column. So we had one less than three, or two, columns to choose from. For each of the columns the fixed \(y\) was in, we had two options for the floating \(y\). That gave us three multiplied by two, or six total rows. Can we then come to any conclusions about the number of \(y^2\)'s we will have with \(m\) number of brackets?

We will have \(m\) choices of column for our first \(y\), and then \(m-1\) options for our second \(y\). We will, however, end up with more rows than we need, because we will get one for each possible rearrangement of the fixed and floating \(y\)'s. With two possible \(y\)'s (fixed and floating) there are two possible ways to position them in each different row. So if the total number of rows we get is \(m\) multiplied by \(m-1\), we then need to divide that by the number of copies of each row, which for two types of \(y\) will be two. So we can say that the total number of \(y^2\)'s we will get is \(\frac{m(m-1)}{2}y^2\). We can add that on to our running total like this:

\[1+my+\frac{m(m-1)}{2}y^2+\ldots\]

Let's pause for a moment and check that this works with two brackets. With two brackets \(m=2\) so we get:

\[\begin{align}
1+2\cdot y+\frac{2\cdot (2-1)y^2}{2}+\ldots\\
1+2y+\frac{2\cdot (1)y^2}{2}+\ldots\\
1+2y+\frac{2\cdot y^2}{2}+\ldots\\
1+2y+\tfrac{2}{2}\cdot y^2+\ldots\\
1+2y+1\cdot y^2+\ldots\\
1+2y+y^2+\ldots
\end{align}\]

So far so good - that's what we got when we did it manually - but what about the \(\ldots\) bit? Given that we do not need it (because the first three terms are all we need) it must be zero. We will see why next time.

Just before we go onto the next one, with our current table, there will be one final line to add:

That just confirms for three brackets multiplied together that we will end with one \(y^3\), because with only three brackets to choose from there will only ever be one way of multiplying one item from each to get a \(y^3\) and that is by choosing \(y\) in each bracket.

Right. On to four brackets. Let's think about our table. The first row is going to be all ones, representing the only way you can multiply all the ones together. Next will be the rows with one \(y\) in each row. There will be four rows of them, because there are four sets of brackets to choose an individual \(y\) from. Let's show that now:

Same old, same old. Now let's do the \(y^2\)'s. Just like last time, we will fix one \(y\) and then float one \(y\) around. This looks like this:

So you can see that for every \(y_{fixed}\) we have the other \(y\) in each of the other columns. So the total number of rows should be four multiplied by four minus one (three). That comes to twelve which is exactly the number of rows we find above. Excellent. As with the last time, if you ignore the distinction of \(y_{fixed}\) as opposed to \(y\), you have twice as many rows as you need, because you can rearrange \(y\) and \(y_{fixed}\) in two ways for each possible row. So again you need to divide by two. Twelve divided by two is six, which is just what we found when we did this manually.

Now. What about \(y^3\)? Before we start, let's think through logically what should happen. We will have three \(y\)'s to choose from the brackets available. If we do the same as last time we will fix one \(y\), from four different options, but then we will have two other \(y\)'s to pop in. For the second \(y\) you have one fewer bracket to choose from. And for the third \(y\) we have one fewer bracket again to choose from. So the total number of rows we will generate is four multiplied by three multiplied by two. That's going to be twenty four in total. How many duplicates are we going to get? Well, for each possible unique row we have three places for \(y\)'s. Does that mean we get triple the number of rows we need instead of double like last time? No.

Obviously, we can't be right, because we already know the answer is four (from when we did it manually above). But twenty four divided by three is eight, not four. Four is twenty four divided by six not three. So where does the six come from? If you have a row with three \(y\)'s in it, how many different ways can you order the \(y\)'s? With one \(y\) there was only one way to put it in a row with one space for it, so the answer is one. With two spaces and two \(y\)'s, you can place the first \(y\) in two places, with the other one falling into the only remaining place. So the answer is two multiplied by one, or two. For three \(y\)'s you have three choices for the first \(y\), then two places for the second \(y\) and only one for the last \(y\). The answer is then three multiplied by two multiplied by one. That comes to six, which is the number we predicted. So we can say that the total possible number of rows is going to be \(m\) (for the first \(y\)) multiplied by \(m-1\) (for the second \(y\)) multiplied by \(m-2\) (the places left for the last \(y\)). Each row generated by that process has three \(y\)'s in it. There are three multiplied by two multiplied by one ways to organise three \(y\)'s in three places, so we end up with six times as many rows as we need.

The number of \(y^3\)'s generated from \(m\) brackets is going to be \(\frac{m(m-1)(m-2)}{6}y^3\). We get the six from three times two times one. So we can expand our formula to:

\[1+my+\frac{m(m-1)}{2}y^2+\frac{m(m-1)(m-2)}{6}y^3+\ldots\]
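The fix-then-float procedure and its duplicates can be counted directly. In this Python sketch (names are mine, not anything standard) `itertools.permutations` plays the part of the procedure, listing every ordered way of dropping \(k\) \(y\)'s into \(m\) columns. Throwing away the order with a `frozenset` leaves only the genuinely different rows.

```python
from itertools import permutations

def ordered_and_unique(m: int, k: int) -> tuple[int, int]:
    """Count fix-then-float placements of k y's in m columns, then the distinct rows.

    `ordered_and_unique` is a hypothetical name used only in this sketch.
    """
    placements = list(permutations(range(m), k))       # order of placing matters here
    unique_rows = {frozenset(p) for p in placements}   # same columns used, order ignored
    return len(placements), len(unique_rows)

print(ordered_and_unique(3, 2))   # (6, 3)
print(ordered_and_unique(4, 2))   # (12, 6)
print(ordered_and_unique(4, 3))   # (24, 4)
```

In each case the first number divided by the second is the number of ways of shuffling the \(y\)'s amongst themselves: two for \(y^2\) and six for \(y^3\).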

If it wasn't obvious already, when we are looking at higher and higher powers of \(y\) from more and more brackets, this pattern is going to continue. For whatever power of \(y\) we are interested in, we are always going to start with the total number of possible columns available for each \(y\) you need to place. That number will always be one fewer each time you place a \(y\). For each of the unique lines that process generates, there will be the same number of copies as there are ways to organise the number of \(y\)'s that you are placing.

We are stuck with the complicated \(m(m-1)(m-2)\ldots (m-(m-1))\) mess, but can we do anything with the three multiplied by two multiplied by one stuff? The answer is yes. This kind of stuff crops up from time to time, and there is a simple notation for it. You just write \(3!\) to represent three multiplied by two multiplied by one and \(4!\) is four multiplied by three multiplied by two multiplied by one. And so on. That turns our formula into:

\[1+my+\frac{m(m-1)}{2!}y^2+\frac{m(m-1)(m-2)}{3!}y^3+\ldots\]

Which is a bit neater. We'll see where this takes us next time.

Monday, 3 October 2011

Intermission

OK, we have been distracted by shiny things in the form of the Mandelbrot set. Now we can get back to our task in hand, working out why \(e^{i\pi}+1=0\).

We have now met our cast of characters. Let's have a look at them all lined up on the complex plane:

So you take the number at \(\pi\) and multiply it by \(i\). That's easy enough to visualise now. We look at both numbers in polar form. \(\pi\) is just \(\pi\angle 0\) and \(i\) is \(1\angle \tfrac{\pi}{2}\). We just need to multiply the absolute size (\(\pi\cdot 1\)) and we add the angles (\(0+\tfrac{\pi}{2}\)). That gets us to \(\pi\angle \tfrac{\pi}{2}\). That looks like this:

So we are now left with the job of trying to work out how to raise the number \(e\) to this new number \(i\cdot \pi\). To do this, we are going to have to work out how to raise \(e\) to a complex power. To start with, let's go back and look at our definition of \(e\).

\[e=\lim_{n \to \infty} (1+\tfrac{1}{n})^n\]

Let's look only at the function bit:

\[(1+\tfrac{1}{n})^n\]

Let's now raise that bit to a power:

\[((1+\tfrac{1}{n})^n)^x\]

We now know that when raising something to a power and then to a power again, we can just multiply the powers. In other words the expression above really means take the bit in brackets and multiply \(n\) of them together. Now take that group of things being multiplied together and multiply \(x\) of those groups together. If you did that you would just end up with \(x\) groups of \(n\) things all multiplied together. So essentially you end up with \(n\cdot x\) numbers of the stuff in brackets being multiplied together. So we can just write this instead:

\[(1+\tfrac{1}{n})^{n\cdot x}\]

How does this help me work out \(e\) to a complex power? Well, what I know how to do is to multiply, divide and add complex numbers. I do not know how to raise a number to a complex exponent. So what I really want to do is to get that \(x\) away from the exponent. How do I do that?

Let's create a new variable \(m\). Let's define this to be \(n\cdot x\):

\[m=n\cdot x\]

Now I will multiply both sides of that equation by \(\tfrac{1}{n}\):

\[\begin{align*}
\tfrac{1}{n}\cdot m&=\tfrac{1}{n}\cdot n\cdot x\\
\tfrac{m}{n}&=\tfrac{n}{n}\cdot x\\
\tfrac{m}{n}&=1\cdot x
\end{align*}\]

We also need to multiply both sides by \(\tfrac{1}{m}\):

\[\begin{align*}
\tfrac{m}{n}\cdot \tfrac{1}{m}&=x\cdot \tfrac{1}{m}\\
\tfrac{m\cdot 1}{n\cdot m}&=\tfrac{x}{m}\\
\tfrac{m\cdot 1}{m\cdot n}&=\tfrac{x}{m}\\
\tfrac{m}{m}\cdot \tfrac{1}{n}&=\tfrac{x}{m}\\
1\cdot \tfrac{1}{n}&=\tfrac{x}{m}
\end{align*}\]

We now have two definitions to use:

\[\begin{align*}
n\cdot x&=m\\
\tfrac{1}{n}&=\tfrac{x}{m}\\
\end{align*}\]

And we just need to fit them into this equation:

\[(1+\tfrac{1}{n})^{n\cdot x}\]

We have both a \(\tfrac{1}{n}\) and a \(n\cdot x\), so let's have at it:

\[(1+\tfrac{x}{m})^{m}\]

There we go - mission accomplished. We have managed to get the \(x\) AWAY from the exponent. We can now say that:

\[e^x=\lim_{m \to \infty} (1+\tfrac{x}{m})^{m}\]
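If you do not quite trust the swap from \(n\) to \(m\), it is easy to check with actual numbers. This Python sketch (function names are my own invention) works out the expression before and after the substitution for one choice of \(n\) and \(x\), with \(m\) set to \(n\cdot x\) as defined above.

```python
import math

def original_form(n: float, x: float) -> float:
    """(1 + 1/n)^(n*x), the expression before the substitution."""
    return (1 + 1 / n) ** (n * x)

def substituted_form(m: float, x: float) -> float:
    """(1 + x/m)^m, the expression after the substitution."""
    return (1 + x / m) ** m

n, x = 1000.0, 2.0   # arbitrary values picked for illustration
m = n * x            # the definition m = n * x
print(original_form(n, x), substituted_form(m, x))
print(math.isclose(original_form(n, x), substituted_form(m, x)))
```

The two forms give the same number because they are the same expression written with different letters. All the substitution has done is move \(x\) out of the exponent.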

We can now see what \(e\) raised to a complex power is. First, though, let's just run through this with real powers to check it is working. If we square \(e\), then according to my calculator we should get \(7.389056099\). So let's try:

\[e^2=\lim_{m \to \infty} (1+\tfrac{2}{m})^{m}\]

If you set \(m\) to \(1000000\) you get an equation that looks like this:

\[\begin{align*}
\left (\frac{1000000}{1000000}+\frac{2}{1000000}\right )^{1000000}\\
\left (\frac{1000002}{1000000}\right )^{1000000}\\
\frac{1000002^{1000000}}{1000000^{1000000}}
\end{align*}\]
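If you would rather not grind through that by hand, here is a minimal Python sketch of the same arithmetic. The function name `limit_power` is just a label made up for this illustration, and it uses ordinary floating point numbers, so the last few digits are only as good as the machine's rounding.

```python
import math

def limit_power(x, m):
    """Approximate e**x as (1 + x/m)**m for one finite value of m."""
    return (1 + x / m) ** m

approx = limit_power(2, 1_000_000)
exact = math.exp(2)          # the calculator's answer for e squared

print(approx)                # close to 7.38904
print(exact)                 # close to 7.38906
print(exact - approx)        # the gap shrinks as m grows
```

Trying a bigger `m` in the call is the quickest way to watch the gap close.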

If you work out that horror you get \(7.389041321\), which agrees to four decimal places with the "real" answer. So we are on the right track! So what we can do now is to replace the \(2\) with \(i\pi\). That looks like this:

\[e^{i\pi}=\lim_{m \to \infty} (1+\tfrac{i\pi}{m})^{m}\]

Let's set \(m\) to a hundred to see how we get on. That makes our equation:

\[e^{i\pi}\approx (1+\tfrac{i\pi}{100})^{100}\]

Let's deal with the bit in brackets first. How do we do the division? It is unsurprisingly the opposite of multiplication. So instead of multiplying the absolute values together you divide the first by the second. Then you deduct the second angle from the first. Remember the number we are dividing looks like this:

In polar form (for division) that is \(\pi\angle \tfrac{\pi}{2}\). The polar form for the number doing the dividing (the one on the bottom) is \(100\angle 0\). So the maths looks like this:

\[\frac{\pi\angle \tfrac{\pi}{2}}{100\angle 0}\]
\[(\tfrac{\pi}{100})\angle (\tfrac{\pi}{2}-0)\]
\[(\tfrac{\pi}{100})\angle \tfrac{\pi}{2}\]

So it is going to be at the same angle but only a hundredth the distance away from the origin. We then want to add one onto that. That just moves the number one unit in the positive real direction. That looks like this:

We can now see the number on the complex plane. This is the thing that we are going to raise to the power one hundred. Raising to a power is all about multiplication, so we are going to want to put the number into polar form. So what is it in polar form? First the absolute value is (again using Pythagoras) the square root of one squared plus \(\left (\frac{\pi}{100}\right )^2\). So the absolute value is:

\[\sqrt{1^2+\left(\frac{\pi}{100}\right )^2}\]
\[\sqrt{1^2+\left(\frac{\pi^2}{100^2}\right )}\]
\[\sqrt{1+\frac{\pi^2}{10000}}\]
\[\sqrt{\frac{10000}{10000}+\frac{\pi^2}{10000}}\]
\[\sqrt{\frac{10000+\pi^2}{10000}}\]

Just looking at that, you can see that it is a number just a little bit bigger than one. Why? Well ten thousand and a little bit divided by ten thousand is pretty close to one. And the square root of something pretty close to one, is even closer to one. You can compare that with the diagram:

That makes sense. What about the size of the angle? The sine function of the angle gives us the height of the point above the real axis, divided by the absolute value (to scale everything to the unit circle). If we work out that scaled down height, we can use the inverse of sine function to give us the distance around the unit circle to that point, and hence the size of the angle. The scaled height is \(\frac{\frac{\pi}{100}}{\sqrt{\frac{10000+\pi^2}{10000}}}\). Hmm. That looks like a nightmare, but it really isn't. We have already seen that the bit on the bottom (the absolute value) is pretty close to one. Any number divided by one is just itself. So really, we are interested in the bit on top. That tells us that the sine of the angle we are looking for is roughly \(\tfrac{\pi}{100}\). In fact, if you work it out it is \(0.0314004349\).

Now to actually DO the inverse sine function we could start drawing our unit circle, and then make very very precise measurements, or we could just use a calculator. Calculator it is then. The distance round the unit circle to a point \(0.0314004349\) above the real axis is \(0.0314055972\). So that's our angle in radians.
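For anyone who wants to check those two calculator results, this is a small Python sketch of the same steps. The variable names are my own, and `math.asin` is standing in for the calculator's inverse sine button.

```python
import math

real_part = 1.0
imag_part = math.pi / 100

# Pythagoras gives the absolute value (distance from the origin).
absolute = math.sqrt(real_part**2 + imag_part**2)

# Height above the real axis, scaled down to the unit circle.
scaled_height = imag_part / absolute

# Inverse sine turns that scaled height back into an angle in radians.
angle = math.asin(scaled_height)

print(scaled_height)   # about 0.0314004
print(angle)           # about 0.0314056
```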

(If you think long enough about this you may well ask 'How the fuck does my calculator KNOW that this is the angle size that corresponds with that sine?' The calculator doesn't draw a circle and get out a ruler. Does it come with all the possible sine values for all possible angles to whatever number of decimal places? No. And before we have finished with this, you will find out what your calculator did to get this precise result).

If you look at the size of the angle, you may notice that it is pretty bloody close to the sine of the angle. In other words the distance around the unit circle to the point is almost exactly the same size as the height of the point above the real axis. Apparently this happens when your angle is very small. Why? Well, let's look at the diagram:

That's our number. Let's scale it to the unit circle (this does not change much because we know that the dashed line there is pretty close to one anyway), and use our old friends Imogen, Polly and Abby to see what is going on:

We can see the very small angle we are dealing with. What I am going to do next is to get rid of Raul because that is not relevant to this particular discussion. I am then going to move Imogen so that it is directly under Polly:

Now let's zoom in on the interesting bit:

Can you see that the orange line of Imogen is practically the same length as the section of circle round to Polly that it is pretty much obstructing? Let's move closer:

Can you see how the orange line and the blue line up to Polly are very close in length? That's exactly what I mean when I say that the sine of an angle is very close to the size of the angle in radians, when the angle is very small.
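The same thing can be seen in numbers rather than pictures. This is only a rough sketch: the helper name `shortfall` is made up here, and the list of angles is an arbitrary choice that happens to include our \(\tfrac{\pi}{100}\).

```python
import math

def shortfall(angle):
    """Fraction by which the sine falls short of the angle (in radians)."""
    return (angle - math.sin(angle)) / angle

for angle in (1.0, 0.1, math.pi / 100, 0.001):
    print(f"angle={angle:.6f}  sine={math.sin(angle):.6f}  shortfall={shortfall(angle):.2e}")
```

The shortfall column drops quickly as the angle shrinks, which is the orange line and the blue arc closing in on each other.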

Anyway, enough distractions. We have our polar form number: \(\sqrt{\frac{10000+\pi^2}{10000}}\angle 0.0314055972\).

What are we going to do with it? The absolute value of the number is going to be multiplied by itself one hundred times. The angle is going to be added to itself one hundred times. Where does that take us? Well, the size is actually a number very close to one. If it was the square root of ten thousand divided by ten thousand, it would be one. The only thing that stops it being one is the \(\pi^2\) on the top line. So it is roughly the square root of ten thousand and ten over ten thousand. That is very very close to one. So when we multiply it by itself one hundred times it should still be pretty close to one unit long. In fact it turns out to be a bit over \(1.05\), but not much. Now the angle. If you look closely you will see that the angle is, to four decimal places, one hundredth of \(\pi\). So what do you get if you multiply one hundredth of \(\pi\) by one hundred? \(\pi\)! In fact for our numbers you get to \(99.97\%\) of \(\pi\). So our end result is:

\[\left(\sqrt{\frac{10000+\pi^2}{10000}}\angle 0.0314055972\right)^{100}\approx 1.05\angle99.97\%\pi\]

Well, the angle \(\pi\) is of course half a circle, which makes the result, to within \(5\%\), negative one. So we can say, to within \(5\%\) that:

\[e^{i\pi}\approx (1+\tfrac{i\pi}{100})^{100}\approx -1\]

To generate the identity that started this whole thing, all we do is add one, an equals sign, and zero. So it looks like we are definitely on the right track. What about increasing the value for \(m\)? Well, I can't be bothered running through all the arithmetic again for a start. But let's try to imagine what would happen.

First the multiplication of \(\pi\) and \(i\) would proceed unchanged. But then we would divide that number by a much larger number, say a million. That would bring the point down to a millionth of \(\pi\) away from the real axis. We would still add one, which would take us out to a point very, very close to one. The angle would be much smaller and the absolute size would be much, much closer to one.

Secondly, when we then raised the absolute size to the power of a million, it would stay much closer to one. And although the angle would be much smaller, we would then add a million of them together, getting us even closer to \(\pi\) as the total. So as the \(m\) number gets larger, the result gets closer and closer to negative one.
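Rather than just imagining it, we can let a computer do the running through. This is a minimal sketch using Python's standard `cmath` module. The function name `polar_power` and the three values of \(m\) are my own choices for illustration. The first row is the \(m=100\) case worked through above.

```python
import cmath
import math

def polar_power(m):
    """Raise 1 + i*pi/m to the power m using the polar rules from the text."""
    length, angle = cmath.polar(1 + 1j * math.pi / m)
    # Multiply the length by itself m times; add the angle to itself m times.
    return length ** m, angle * m

for m in (100, 10_000, 1_000_000):
    length, angle = polar_power(m)
    print(f"m={m:>9}  length={length:.6f}  angle={angle / math.pi:.6f} * pi")
```

Each row should show the length creeping down towards one and the angle creeping up towards \(\pi\).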

In fact, if you set \(m\) to infinity, you will divide \(i\pi\) by infinity, getting an infinitely small number. One plus an infinitely small number is infinitely close to one. That number's imaginary part would be infinitely small. When you raised the absolute value of the number to infinity, it would stay at one. And the angle, which would be an infinitieth fraction of \(\pi\) multiplied by infinity, would be exactly \(\pi\)! Which is exactly what we wanted to prove, isn't it? Let off the fireworks and start the band, we're done!

No. No we're not. This:

\[e^{i\pi}=\lim_{m \to \infty} (1+\tfrac{i\pi}{m})^{m}=-1\]

gives us no real sense of what is going on here. In other words, sure, this formula tells us that raising \(e\) to \(i\pi\) gives you negative one, but it does not tell us WHY.

That's why this section is just called the intermission. This has been a test of all the concepts and tools that we have built up so far. The good news is that they work. So we are doing the right thing. The problem at the moment is this whole 'limit' feature of our definition of \(e\). What we are going to do next, is to try to get rid of the whole limit approach to the puzzle altogether.

Monday, 26 September 2011

Mandelbrot Set Part Two

OK, so last time we worked out that the limits of the Mandelbrot set on the real number line were negative two on the negative side and plus a quarter on the positive side. Between those limits you can keep squaring and adding your original number and your series will never head off to infinity. So let's complicate matters and move to the complex plane.

First of all, let's look at \(i\). We now know what you have to do to multiply and add complex numbers. Remember that you can also write the number \(i\) as \(0+i\) and as \(1\angle\tfrac{\pi}{2}\). The first version means our point on the complex plane has a real value of zero, and an imaginary value of one. The second version means the point is one unit away from the origin (the crossing point of the real and imaginary axes), a quarter turn of circle from the real axis. So we know what it is, let's make it the first in our sequence:

\(0+i,\ldots\)

So what's the next number? The first step of working out the next number is to square the last one. How do we do that with complex numbers? Well, squaring a number is just multiplying the number by itself, so that is a multiplication task. Best move is to use the polar form. The rule we worked out was to multiply the lengths and add the angles. So if we look at the polar form of \(i\) we need to multiply one by one (getting one), and then add \(\tfrac{\pi}{2}\) to \(\tfrac{\pi}{2}\) (getting \(\pi\)). So our squared number is \(1\angle\pi\). Well, of course it fucking is. \(i\) IS the square root of negative one, so if we square it we get negative one. And negative one is, of course, one unit away from the origin ON the real axis, a half circle round from the starting position.

The second step is to add our first number. We need to switch back to rectangular form for this. Negative one in rectangular form is \(-1+0i\). To this we add our starting number. We can do this by just adding the real and imaginary parts. The real part of our starting number is zero, and the real part of our intermediate number is negative one, so the final real part is negative one. The imaginary part of the original is one, and the intermediate is zero, so the result is one. So our final number is \(-1+i\). In polar form this is \(\sqrt{2}\angle\tfrac{3\pi}{4}\). The polar form means turn three quarters of the way to a half circle, and go a distance of the square root of two. We get the distance from Pythagoras as usual.
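Here is the same square-then-add step as a short Python sketch, in case it helps to see the two forms side by side. It leans on `cmath.polar` and `cmath.rect` from the standard library to switch between rectangular and polar form. The variable names are mine, and because \(\pi\) cannot be stored exactly the results are only right up to a tiny rounding error.

```python
import cmath
import math

start = cmath.rect(1, math.pi / 2)        # the number i, built from its polar form

# Squaring in polar form: multiply the lengths, add the angles.
length, angle = cmath.polar(start)
squared = cmath.rect(length * length, angle + angle)

# Adding in rectangular form: real parts together, imaginary parts together.
next_term = complex(squared.real + start.real, squared.imag + start.imag)

next_length, next_angle = cmath.polar(next_term)
print(next_term)                           # very nearly -1+1j
print(next_length, math.sqrt(2))           # the two lengths should agree
print(next_angle, 3 * math.pi / 4)         # the two angles should agree
```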

It is a bit tricky thinking about all this in abstract notation, even though we have gone through all of these steps before. Let's look at the process in picture form so we get a feel for what is going on. Our first number looks like this:

Now, let's look at the result of doing the squaring operation:

You can see we have doubled the angle (I have shown the extra angle with a dashed line), because we have added itself to itself. And the distance remains the same because one multiplied by one is one. Now we add on the original number:

And you can see that we end up, as predicted, at \(-1+i\). But there is something important to spot when you look at this that has been very difficult to spot when we have just been using pure algebra. Look at the dotted line that represents the addition of the first number. Compare it with the thick black line that goes up to the first number. They are exactly the same length and angle aren't they? Yes. You see when you "add" on the starting number, it is just like picking the starting number up off the diagram and plonking it down in a new place. The length of it and the angle it is sitting at, does not change. To see what we mean, let's see what the next number in our series is, geometrically. Remember we are here now:

\(0+i,-1+i,\ldots\)

So this time we start with \(-1+i\):

The angle is now \(\tfrac{3\pi}{4}\) as we calculated, with a length of \(\sqrt{2}\). When we square that, we will add \(\tfrac{3\pi}{4}\) to \(\tfrac{3\pi}{4}\) getting \(\tfrac{3\pi}{2}\). We will also multiply \(\sqrt{2}\) by itself getting, drumroll, two. That looks like this:

You can see again that the angle has doubled and the length is definitely two. Now we need to add on our starting number. That means you pick up the very first (from a couple of diagrams above) thick black line in your mind's eye, and you drop it onto the point at \(0-2i\):

That takes us to \(0-i\). Because our starting number was just \(0+i\), when we add it to \(0-2i\), we just move up the imaginary axis by one unit. So we now have:

\(0+i,-1+i,0-i,\ldots\)

What's next? Let's just do it in one step:

What's gone on there then? Well, our angle has now reached three quarters of a full circle, so when we add that to itself we get one and a half circles. That just means we go all the way round the circle, back to the beginning, and then add on a half. So we end up, after squaring, on the real axis at negative one. We then add on our first number which, hang on have we not been here before? Answer yes - it was in our first time through. So we end up adding \(-1+i\) to our series:

\(0+i,-1+i,0-i,-1+i, \ldots\)

Which means that we know what the next entry in the series is:

\(0+i,-1+i,0-i,-1+i,0-i, \ldots\)

And so on and on and on. We are going to be going around in this circle for ever and ever. So \(i\) IS in the Mandelbrot set, because we are never going off to infinity. Hurrah! You can go through almost the same sequence with \(-i\) which ends up looking like this:

\(0-i,-1-i,0+i,-1-i,0+i, \ldots\)
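Both of those series are easy to reproduce with Python's built-in complex numbers, which write \(i\) as `1j`. The function name `series` is just something I have made up for this sketch.

```python
def series(start, terms):
    """First `terms` entries of the square-and-add series for `start`."""
    entries = [start]
    while len(entries) < terms:
        last = entries[-1]
        entries.append(last * last + start)   # square the last one, add the first
    return entries

print(series(1j, 5))    # the series that starts at i
print(series(-1j, 5))   # the series that starts at -i
```

Asking for more terms just repeats the last two entries over and over, which is the going round in circles described above.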

I promised a shortcut to determine inclusion (or more accurately exclusion) from the Mandelbrot set. Now we have seen what the operations of multiplication and addition look like on the complex plane, we can consider this shortcut. First of all let's ask ourselves what happens when we do the squaring operation. We multiply the absolute value of the complex number by itself. If the absolute value is less than one, then when we square it it is going to end up closer to the origin. If the absolute value is more than one, when we square it we are going to end up further away from the origin. Just ignore the question of what we do with the angles for now, and concentrate on the length of the line.

Once we have squared the number, we add on the starting number in the series. That starting number also has an absolute value. If this starting number's absolute value is not big enough to counteract the increase created by the squaring, then the complex number will just get bigger and bigger, and will disappear off to infinity. What is the biggest absolute value that we have seen being able to hold back the series from drifting off to infinity? Well, it's not a quarter, which was the limit on the positive real axis. It is not plus or minus \(i\) either, their values were one. No, the largest absolute value is two, which is the value of the starting number negative two. Remember with absolute values I am only interested in how far away from the origin the point is on the complex plane, not what angle it is at. So negative two is two units away, so it has an absolute value of two.

It turns out that no matter which complex number you choose, if it is more than two units away from the origin, it is out of the Mandelbrot set. Why? Let's look and see with the number \(3\angle \tfrac{3\pi}{4}\):

The rectangular values of the points are a bit nightmarish to look at here, because we started with a polar form number. I can assure you that they are correct, and in the right place. But the more important thing is what happens at the squaring stage. You can see that the absolute value jumps all the way up to nine, and is then only slightly pulled back when we add on our original number. The problem, and it may be obvious, is that the square of the first number is so much further away from the origin that when you add the first number back on you are still left too far away from the origin to avoid flying off to infinity on the next squaring.
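If you want the actual sizes involved, this rough Python sketch runs the first square-and-add for \(3\angle \tfrac{3\pi}{4}\). It uses `cmath.rect` to build the number from its polar form. The variable names are my own.

```python
import cmath
import math

start = cmath.rect(3, 3 * math.pi / 4)    # length three, angle 3*pi/4
squared = start * start
after_adding = squared + start

print(abs(start))          # 3, up to rounding
print(abs(squared))        # 9, up to rounding
print(abs(after_adding))   # pulled back a little, but still well above 2
```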

Is there a boundary? In other words, is there an absolute value of the first number beyond which you are doomed to fly to infinity? Let's think of the very best case scenario for the addition stage. The very best case would be to head directly back to the origin, shrinking the absolute value as much as possible. You can see in the diagram above that we do not achieve that. The absolute value was off at an angle to the imaginary axis, so when we added it on we did not get the benefit of the full length of three. To get the full benefit of all of the absolute value of your starting number, what you want then is an original angle that points back to where you started when you add it to itself. Hmm. To point back in the direction we started, we would need to travel round half a circle. We could start at a quarter circle and travel round half a circle, but that would be a total of three quarters of a circle. Three quarters is not one quarter added to another quarter, which is what we are restricted to.

Now, there is only one angle which if you double it is the same as adding a half circle. That angle is a half circle, or \(\pi\) radians. That must mean our starting number has to lie on the negative real axis:

But, hang on. Last time we looked exclusively at the real axis. We worked out that the limit on the negative side of the real axis for the Mandelbrot set is negative two. So we can now say that the limit of the absolute value of a starting number in the series is two.

We can say that because when we square the previous number in the series, if its absolute value is more than one, we are going to end up further away from the origin. To be in the Mandelbrot set we want the numbers in our series to stay near the origin, so that they avoid heading off to infinity. So to get back towards the origin you need to get the best return possible from adding your starting number. That means using the full absolute value of your starting number, which means it must take you straight back to where you came from. The only starting angle that allows that is a half circle, which drops you onto the negative real axis. And we know that the limit for the negative real axis is two, but only two ON the negative real axis.

For instance two on the positive real axis is not in the set. It squares to four away from the origin (and remember its starting angle is zero, and double zero is still zero so the square is also on the positive real axis), and then adds on two - but it adds on two in the worst possible direction - directly away from the origin!

This all means that we can say that any starting number with an absolute value of MORE THAN TWO is NOT in the Mandelbrot set. I am going to call this Rule Alpha.

Now, that is a considerable short cut, because it means we can just ignore any numbers outside a circle with a radius of two about the origin. What about the numbers inside that circle? A lot of them are not in the Mandelbrot set. Last time I really fudged the question of how you tell that a number is in or out after a certain point in the series. I just said it was "obvious". Well, we now have a proper test to apply. What is it?

Well, once the series hits a number which has an absolute value of more than two, you can stop the series right there and then and declare that the starting number of the series is OUT of the Mandelbrot set. Why can you do this? Think about it. You now have a number which next time you are going to square, ending up with a number much further away than your starting number's absolute value. But hang on! I didn't TELL you the starting number of the series. What happens if you hit a number with an absolute value of three, and your starting number had an absolute value of nine? That would be fine, because squaring three you get nine, and if your angles worked out just perfectly, that would place you back at the previous number in the series, and the series would settle down and not drift off to infinity. That is all correct, and impossible. Why? Look at the assumption that we had to make, that we had a starting absolute value of nine. Now go and read Rule Alpha. We cannot ever HAVE a starting number with an absolute value of nine, or of anything more than two. This means that whenever the series reaches a number more than two, we know that even if the angles match up precisely, the absolute value of our starting number cannot be enough to drag the series away from infinity.

This all means that if a term in the series has an absolute value of MORE THAN TWO, the starting term in the series is NOT in the Mandelbrot set. We'll call that, for balance, Rule Beta.
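Rule Beta turns into a very short test. What follows is a minimal Python sketch, not the only way to write it. The function name `first_escape` and its `max_terms` limit are my own inventions, and the limit is there only because a computer cannot check an infinite series.

```python
def first_escape(start, max_terms):
    """Position (counting from 1) of the first term with absolute value above 2.

    Returns None if none of the first `max_terms` terms gets that far.
    """
    term = start
    for position in range(1, max_terms + 1):
        if abs(term) > 2:
            return position            # Rule Beta: the starting number is OUT
        term = term * term + start     # square the last term, add the first
    return None

print(first_escape(2, 50))     # 2, because the second term is 6
print(first_escape(-2, 50))    # None, the series sits at 2 for ever
print(first_escape(1j, 50))    # None, the series for i just goes round in a loop
```

A result of `None` only means "not caught yet". It is not a proof that the number is in the set.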

Some starting numbers may generate a series which takes a very long time to get to more than two. For instance we know that the number a quarter, \(0.250\), is IN the Mandelbrot set - we calculated that it was the boundary on the positive side of the real axis. If you increase it by just one thousandth to \(0.251\) we know that it is no longer in the Mandelbrot set. However, the series generated by that number does not increase beyond two until the 97th entry. And remember that with a quarter we saw that we could keep going for ever and ever getting closer and closer to a half. So just by calculating terms, you cannot tell whether a number is in or out of the set until you get a number more than two in the series.

What we do, to make life easier for ourselves, is say that if a series has not hit a number with an absolute value of more than two after a certain number of entries, we will just say for the sake of argument that the starting number is IN the Mandelbrot set. The farther along the series that we place our arbitrary cut off point the more accurately we can draw our Mandelbrot set. Even with relatively small series, we can still generate a decent looking Mandelbrot set picture.
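To see how much the cut off point matters, here is a small Python sketch of that "for the sake of argument" test. The name `looks_in_set` and the three cut off values are arbitrary choices of mine. I am counting the starting number as the first entry, which is an assumption about where the counting starts.

```python
def looks_in_set(start, cutoff):
    """True if none of the first `cutoff` entries has absolute value above 2."""
    term = start
    for _ in range(cutoff):
        if abs(term) > 2:
            return False
        term = term * term + start
    return True

for cutoff in (10, 50, 500):
    print(cutoff, looks_in_set(0.25, cutoff), looks_in_set(0.251, cutoff))
```

With a short cut off \(0.251\) is waved through as if it were in the set, and it is only found out once the cut off is long enough.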

For example, after, say, ten steps all the numbers we have previously identified as in the set are still in (of course). So we can draw them as points:

(I have coloured the axes and markings in blue now, because there is going to be an awful lot of black shortly, and I want them to still be legible.) Each of those dots is one tenth of a unit across. That means I can sit two of them side by side representing numbers that are apart by tenths only. I could draw a Mandelbrot set using dots this size by only checking complex numbers to one decimal place (-2.0, -1.9, -1.8, to 1.9, 2.0 and so on). I have better things to do with my time than to sit around working this out. Even at this scale I am still going to have to consider forty-one multiplied by forty-one different complex numbers (there are forty-one values from \(-2.0\) to \(2.0\) in steps of a tenth). That's one thousand six hundred and eighty-one numbers. I could just ignore the ones that are more than two units away from the origin, but to do that I will have to pythagorise their absolute values first. What a pain.

So I have told the computer to do it instead. I run a spreadsheet that spits out co-ordinates on the complex plane that are IN the Mandelbrot set, for a given resolution (dot size) and cut off point in the series. To see what a difference the choice of cut off point makes for the purpose of accuracy, let's start by cutting off the series after the very first squaring and adding. (Beardy types would call this the first iteration.) With the same sized dots as last time (but zoomed in a bit), it looks like this:
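I did this with a spreadsheet, but the same job can be sketched in a few lines of Python, which may be easier to play with. This is a stand-in, not the spreadsheet itself. The names `survives` and `draw` are made up, `steps_per_unit=10` means dots a tenth of a unit apart, and the picture is drawn with `#` characters instead of black dots. Here `cutoff` counts entries in the series including the starting number, which is my assumption about how to count.

```python
def survives(start, cutoff):
    """True if none of the first `cutoff` entries has absolute value above 2."""
    term = start
    for _ in range(cutoff):
        if abs(term) > 2:
            return False
        term = term * term + start
    return True

def draw(steps_per_unit, cutoff):
    """Text picture of the grid from -2 to 2 on both axes, '#' for points kept."""
    edge = 2 * steps_per_unit
    rows = []
    for imag_step in range(edge, -edge - 1, -1):        # top of the plane first
        row = ""
        for real_step in range(-edge, edge + 1):        # left to right
            point = complex(real_step / steps_per_unit, imag_step / steps_per_unit)
            row += "#" if survives(point, cutoff) else "."
        rows.append(row)
    return rows

picture = draw(steps_per_unit=10, cutoff=10)
print("\n".join(picture))
```

Changing `cutoff` and `steps_per_unit` in the last call lets you watch the shape change. Characters are taller than they are wide, so the text picture comes out a bit stretched.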

That's just a blotch isn't it. I have marked on the points we knew about before in blue so you can see where they are. I couldn't put on the point at a quarter, because we are only dealing with tenths here, and a quarter is between two and three tenths.

That's the set with the cut off point after two terms of the series. It makes a neat oval shape around the zero point, extending from negative two to one, and from negative \(i\) to \(i\). OK. Let's throw ANOTHER term onto the series:

Well now, there's a thing. The oval shape has disappeared after just one more iteration (I am working on the beard) and it now looks a bit pointy. Let's do another iteration:

Even more pointy! And now the tops and bottoms of the shape are looking pointy. I am feeling a bit bad for the quarter that we have to keep missing out. To bring it back in means that I have to halve the sizes of the circles, so that I can look at numbers not just in tenths, but in halves of tenths, or twentieths. That will let us get down to \(0.25\), because the \(0.05\) bit that we add on to the two tenths IS a twentieth. I am not going to change the cut off point for this one, just the resolution. The result looks like this:

OK, you can see right away that the edges have become more refined. That just comes from using smaller dots. It's exactly the same as adding more mega pixels to your camera sensor. You can also see that a quarter has joined the party. There is something a bit off with a quarter though. It is supposed to be the limit on the positive real axis, but here it looks miles away from the edge. Remember though that I said that you needed ninety seven numbers in the series for \(0.251\) to get above two, and we are only on number four here. Let's use our higher resolution and move on to the next iteration:

It is now starting to get really weird. The smooth edges have gone and we have all kinds of humps and bumps appearing. You can see where this is going though. Let's skip on to the tenth iteration (which would be the eleventh number in the series - bit confusing but the first number is just a number, the second number comes after the first maths bit which we are calling an iteration - beard is coming on nicely):

Now it is starting to look like some weird insect. It has developed spindly appendages. It is worth reminding ourselves at this point that the only maths going on here is squaring and addition. That's it. It is squaring and addition of complex numbers which sounds a bit tricky, but even that is actually simple. Add the rectangular parts for addition, multiply the lengths and add angles for multiplication. And after only ten iterations you get that weird shape looking back at you. Maths is fucking odd.

OK, two more before we finish up. First let's double the resolution once more, so our dots will be half the size again:

Not so great a change this time, I think we must be into the land of diminishing returns. Let's sign off though by boosting the cut off point all the way to iteration twenty five:

Wow. You can really now see a circular region forming around the number \(-1+0i\). Also you can see how close a quarter has got to the boundary on the positive real line. It is nestled in what I will call for want of a better description, the arse cheeks of the big circle. Finally you can also see how weird plus and minus \(i\) are. They now seem to be way off on their own at the end of long spindly structures.

These pictures are in black and white. To get colours into this I would just ask at what iteration did the points drop out of the set, and then give all the points that dropped out at the same iteration the same colour. Look back up at the difference between the first three images. You can see that we lose whole blocks of dots between each one. If you wanted to colour this you wouldn't delete the dots, you would leave them in but give them a different colour.
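As a rough illustration of that colouring idea, here is a Python sketch that records when a point drops out and turns that into a colour. Everything named here is hypothetical: the `PALETTE` list is an arbitrary handful of colours, black is my choice for points that never drop out, and the default cut off of twenty five just matches the last picture.

```python
PALETTE = ["red", "orange", "yellow", "green", "blue"]   # arbitrary example colours

def dropout_position(start, max_terms):
    """Position (from 1) of the first entry with absolute value above 2, or None."""
    term = start
    for position in range(1, max_terms + 1):
        if abs(term) > 2:
            return position
        term = term * term + start
    return None

def colour_for(start, max_terms=25):
    """Same drop out position, same colour; points still in at the cut off stay black."""
    position = dropout_position(start, max_terms)
    if position is None:
        return "black"
    return PALETTE[(position - 1) % len(PALETTE)]       # reuse colours if we run out

for point in (0, 1j, 3, 2, 1 + 1j):
    print(point, colour_for(point))
```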

If you keep adding more iterations, and if you keep increasing the resolution, and if you colour the dots that drop out, you eventually end up where we came in:

Monday, 19 September 2011

Mandelbrot Set Part One

Before we move on, interestingly we have now covered all the groundwork required to understand what the Mandelbrot set is, and why it is. This you may have heard of. It is the colourful weird shape that appears on T-shirts, mouse mats and so on. It looks like this:

So what is it, or more accurately how do you make one? First of all the picture is drawn on the complex plane around the \(0\) position. The picture above is from about \(-2\) on the left to about \(1\) on the right hand side.

We now know that every point on the complex plane is actually a number. Numbers can belong to sets. We have already encountered some sets before, although we did not call them by that name. For instance, all the positive integers (the counting numbers we use for tractors and apples and so on) are called the set of natural numbers. In that set of numbers you have sub sets of even and odd natural numbers. If you add on the negative integers to the natural numbers you get the set of whole numbers (sometimes also called the set of all integers). If you add in fractions you now have a set of all rational numbers. If you add in irrationals you now have all the numbers that we have looked at, in a set called the real numbers. Once we add in the imaginary axis, we end up with a set of all complex numbers.

When we drew the circle of convergence for the infinite sums we were looking at, we could have described every point within that circle as belonging to the set of numbers which caused the infinite sum to converge. The Mandelbrot set is like that kind of set. It defines an area on the complex plane, within which a number is in the set, and outside which a number is not in the set. Looking at the picture above though, it is a pretty complex figure. It must be created by a pretty complex mechanism, mustn't it? Well.....

Membership of the set is defined by whether or not the result of a mathematical operation converges or diverges. That is, just like an infinite sum, does it go off to infinity or does it approach an actual number? The operation you do here is a bit different to an infinite sum, but just like the infinite sum, we could keep on doing the maths for ever and ever. So what is this massively complicated operation?

Every starting number is used as the first in a long (actually infinitely long) series.
The next number in the series is calculated by squaring the last number and adding the starting number.
If the numbers in the series head off to infinity (get larger and larger) then the number IS NOT in the Mandelbrot set.
If the numbers in the series do not head off to infinity, but settle down, then the number IS in the Mandelbrot set.

That's it. That is literally all there is to it. Squaring and adding. Frankly it looks a lot simpler than the infinite sums nonsense, or taking a limit. Bizarrely though, this simple process of squaring, and adding the first number you thought of generates the very, very, complex pattern we see above.

It would be very boring, not to mention fatal, to try to calculate an infinitely long series, so when you are testing for membership, you usually just calculate a fixed number of terms. The more terms you calculate the finer the detail you can draw at the boundary, but the longer it takes to do the calculations. Once it is obvious that a series has left the building, so to speak, you note down the number of terms you have in the series at that point, and you convert that number of terms to a colour for that point on the complex plane. That's how you get the gradual colouring effects. All of the colours are OUT of the Mandelbrot set, but the colour of the point determines how close it was to getting in.

Let's consider an example so we know exactly where we are. To keep things simple, let's look at the real number line only, so we get rid of the imaginary element. This is very like the step BEFORE we saw the circle of convergence on the complex plane. At that point all we had was zones on the real number line coloured red and blue. Let's do that with the Mandelbrot set. Where are the blue regions on the real number line for the Mandelbrot set? Let's start with zero.

\[0,(0^2+0)\]
\[0,0,(0^2+0)\]

Well this is easy. Zero squared plus zero is still zero. Square it again? Still zero. Never going to get to be more than two no matter how many times we do this. So zero is IN the Mandelbrot set.

What about two?
\[2,(2^2+2)\]
\[2,6,(6^2+2)\]
\[2,6,38,(38^2+2)\]

Can you see that that is never going to settle down now? There is nothing in the function (square and add two) which is capable of putting the brakes on the series. It is off into the wild blue yonder. Two is NOT in the Mandelbrot set. So zero in, two out. Let's look on the other side of zero, and find out if negative two is in:

\[-2,((-2)^2-2)\]
\[-2,2,(2^2-2)\]
\[-2,2,2,(2^2-2)\]

Aha! Negative two squared is four. Four plus negative two is two. TWO squared is four, plus minus two is two. So negative two IS in the Mandelbrot set because its series settles down to an endless run of twos. But it is on a knife edge, isn't it? It only settles down to all these twos because the starting number is EXACTLY the opposite of the amount that the next number in the series increases by when the last one is squared. So the squaring operation is precisely balanced by the addition of the starting number. If we pushed the starting number a tenth further out, to \(-2.1\), look what happens:

\[-2.1,((-2.1)^2-2.1)\]

(It's not obvious how to square \(-2.1\), but if we write it as \(\tfrac{-21}{10}\) and then multiply it by itself \(\tfrac{-21}{10}\cdot \tfrac{-21}{10}\) we get \(\tfrac{-21\cdot -21}{10\cdot 10}\) or \(\tfrac{441}{100}\) which is \(4.41\).)

\[-2.1,2.31,(2.31^2-2.1)\]
\[-2.1,2.31,3.2361,(3.2361^2-2.1)\]
\[-2.1,2.31,3.2361,8.37234321,(8.37234321^2-2.1)\]

It's off as well, isn't it? The \(2.1\) that is taken off each time is no longer big enough to hold back what happens when the previous term is squared. So precisely at negative two is a boundary of the set on the real number line. We can conclude something about the Mandelbrot set already: it is not symmetrical about the zero point on the real axis. Negative two is in, while positive two is out. Let's try to find the boundary on the positive side of the real axis. Two is out, so let's try one:

\[1,(1^2+1)\]
\[1,2,(2^2+1)\]
\[1,2,5,(5^2+1)\]
\[1,2,5,26,(26^2+1)\]

It's gone, hasn't it? The fact that you are squaring one to get one doesn't help, because the one that you then add increases your number. Because the result is now above one, the next square operation increases it further. There is no suitable brake. The next number is always going to be much bigger than the previous one.

With negative numbers, what was important was having a starting number half the size of the resulting square, to pull the square back. Every time you squared, when you applied the starting number it brought you back to the same place. You squared two up to four, and took off two, back to two. Two steps forwards and precisely two steps back. Is there a starting number that will work for us on the positive side, or is zero the boundary? Well, the number that you add cannot be the brake for the positive starting numbers, because it is always going to increase the next term in the series - it's positive! What we need to find then is a number which, when you square it, gets smaller. That will be a number between zero and one. Numbers between zero and one get smaller when they are squared because you are taking only a fraction of something that was already less than one: three quarters of three quarters is nine sixteenths, which is less than three quarters. It is easier to see with fractions less than one, because the number on the top of the fraction does not increase as much as the number on the bottom of the fraction, so the ratio between the two gets smaller. But our special number may not be a fraction. We could just test all the numbers between zero and one to find the special one, but that could take an infinitely long time. Let's find it by logic instead.

Let's work backwards, and not look for the starting number in the series. Let's look for the number the series is going to settle down to. Let's call that number \(a\). \(a\) has to be a number whose square is half of itself. Why? Well, we would square \(a\), reducing it by half, and then add back on the half we just removed. That would completely balance the squaring and addition process. We can write this out mathematically as:

\[a^2=\frac{a}{2}\]

That says \(a\) squared is the same as half of \(a\). From there we can rearrange the right hand side to look like this:

\[a^2=\tfrac{1}{2}\cdot a\]

We can then divide both sides by \(a\) (which is fine, because the number we are looking for is not zero):

\[\frac{a^2}{a}=\frac{\tfrac{1}{2}\cdot a}{a}\]

That's the same as:

\[\frac{a\cdot a}{a}=\tfrac{1}{2}\cdot \tfrac{a}{a}\]

And:

\[a\cdot \tfrac{a}{a}=\tfrac{1}{2}\cdot \tfrac{a}{a}\]

We know that \(\tfrac{a}{a}\) is just one, so:

\[a\cdot 1=\tfrac{1}{2}\cdot 1\]

The ones cancel, and we are left with:

\[a=\tfrac{1}{2}\]
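As a quick check (a worked substitution added here, not a new step in the derivation), putting \(a=\tfrac{1}{2}\) back into the equation we started from gives the same value on both sides:

\[\left(\tfrac{1}{2}\right)^2=\tfrac{1}{4}\qquad\text{and}\qquad\frac{\tfrac{1}{2}}{2}=\tfrac{1}{4}\]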

So we have our magic number. It is a half. This is what the series will settle down to. So what is the first number in our series? It is the square of a half, which is a quarter. (A quarter is half of a half). So the prediction is that, if we plug in a quarter as the first number, the series should never go flying off to infinity. It will get closer and closer to the magic number of a half. This is because if it ever reached a half, it would bounce right back to a half with the next step in the series. Let's see what the first ten terms in the series starting with a quarter are:

\[0.25,(0.25^2+0.25)\]
\[0.25,0.3125,(0.3125^2+0.25)\]
\[0.25,0.3125,0.3476\ldots,(0.3476\ldots^2+0.25)\]
\[0.25,0.3125,0.3476\ldots,0.3708\ldots,(0.3708\ldots^2+0.25)\]
\[0.25,0.3125,0.3476\ldots,0.3708\ldots,0.3875\ldots,(0.3875\ldots^2+0.25)\]
\[\begin{multline}
0.25,0.3125,0.3476\ldots,0.3708\ldots,0.3875\ldots,\\
0.4001\ldots,(0.4001\ldots^2+0.25)
\end{multline}\]
\[\begin{multline}
0.25,0.3125,0.3476\ldots,0.3708\ldots,0.3875\ldots,\\
0.4001\ldots,0.4101\ldots,(0.4101\ldots^2+0.25)
\end{multline}\]
\[\begin{multline}
0.25,0.3125,0.3476\ldots,0.3708\ldots,0.3875\ldots,\\
0.4001\ldots,0.4101\ldots,\\
0.4182\ldots,(0.4182\ldots^2+0.25)
\end{multline}\]
\[\begin{multline}
0.25,0.3125,0.3476\ldots,0.3708\ldots,0.3875\ldots,\\
0.4001\ldots,0.4101\ldots,0.4182\ldots,\\
0.4249\ldots,(0.4249\ldots^2+0.25)
\end{multline}\]
\[\begin{multline}
0.25,0.3125,0.3476\ldots,0.3708\ldots,0.3875\ldots,\\
0.4001\ldots,0.4101\ldots,0.4182\ldots,0.4249\ldots,\\
0.4305\ldots,(0.4305\ldots^2+0.25)
\end{multline}\]
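If you would rather let a computer do the sums, the list above can be reproduced with a short Python sketch. The function name `first_terms` and the choice of 1,000 terms are assumptions made for illustration. Note that the printout is rounded to four decimal places, so a last digit may differ from the cut-off values shown above.

```python
def first_terms(start, terms):
    """Return the first `terms` numbers of the series for `start`."""
    series = [start]
    while len(series) < terms:
        last = series[-1]
        series.append(last * last + start)  # square, then add the starting number
    return series


terms = first_terms(0.25, 1000)

# The first ten terms, rounded (not truncated) to four decimal places.
for value in terms[:10]:
    print(f"{value:.4f}")

# Every term so far is still below the predicted resting place of a half.
print(max(terms) < 0.5)
```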

In fact I can tell you, because I have done the sums, that the 100th term is \(0.490604220129385\ldots\) and the 1,000th is \(0.49900860913856\ldots\). So, it turns out we were right. Whichever way you look at it, one quarter is IN the Mandelbrot set, and the more times you run through the process the closer and closer to one half the result gets - just as we predicted. If we started with a number just a sliver higher than a quarter, this would not work. The squaring would no longer be balanced by the addition, so that when the long series, like the one above, got near a half, it would eventually pop over a half. Why? Well, a half times a half, plus a quarter and a little bit, is bigger than a half. So eventually the series of numbers is going to get to the point where the gap between the last number in the series squared and a quarter is LESS than the little bit your starting number is bigger than a quarter. Once that point is reached, the next term in the series must be bigger than a half. As soon as it is, the brake fails, and the series will eventually head off to infinity.
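To watch the brake fail, here is a small Python sketch that counts how many terms it takes for the series to get past two in size. The function name, the limit of two and the cap of 10,000 terms are assumptions chosen for illustration, and the two "sliver" starting values are just examples.

```python
def terms_until_escape(start, max_terms=10_000, limit=2):
    """Return how many terms it takes for the series to pass `limit`, or None."""
    current = start
    for count in range(1, max_terms + 1):
        if abs(current) > limit:
            return count
        current = current * current + start
    return None  # no escape seen within max_terms terms


# A quarter itself never escapes; anything a sliver above it eventually does,
# and the smaller the sliver, the longer the series hangs on.
for start in (0.25, 0.251, 0.26):
    print(start, terms_until_escape(start))
```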

So we can say, on the real number line, the limits of the Mandelbrot set are negative two and one quarter. What about the limits on the complex plane? Well, there it gets more complicated, and strangely more simple. Let's look at that next time.