Why Did No One Teach Me This?: October 2011

Monday, 17 October 2011

Limitless e, Part Two

We now have this expression for working out the number of ones, \(y\)'s, \(y^2\)'s and \(y^3\)'s that you will get no matter how many brackets you start with (or rather, no matter to what power you try to raise those brackets).

\[1+my+\frac{m(m-1)}{2!}y^2+\frac{m(m-1)(m-2)}{3!}y^3+\ldots\]

With a little thought we can see that this pattern is going to continue for \(y^4\)'s, \(y^5\)'s and every other power of \(y\). Why? Well, each higher power of \(y\) is created by multiplying that many \(y\)'s together. To make sure you create every possible combination of rows you follow the procedure we established last time. First you fix one \(y\), then the rest of the available \(y\)'s but one, and then you float the remaining \(y\) around all the other available columns. Once you have used that last \(y\) in every available column, you change the position of the \(y\) you fixed last, and then you float the remaining \(y\) again. You keep going until your very first fixed \(y\) has been in every available column.

So the total number of possible rows you get for any power of \(y\) is a multiplication with as many terms as the power: the number of columns, reducing by one each time to represent each \(y\) slotting into place. So if you want to know how many rows in total generate \(y\) to the power of two, you do (total columns) multiplied by (total columns less one). If you want to know the same for \(y\) to the power four, you have (total columns) multiplied by (total columns less one) multiplied by (total columns less two) multiplied by (total columns less three). You can see in each multiplication there are the same number of terms as the power of \(y\) you are trying to find.

Of course that just gives you the total number of rows that will be generated. You still need to get rid of the duplicates. That can be achieved by dividing the total number of rows created in the last step by the total number of ways you can arrange that number of \(y\)'s together. That will always be the total number of \(y\)'s (the choices of position for the first \(y\)) multiplied by the choices of position for the second \(y\) placed, and so on down to the last \(y\), for which there is only one position left. Remember we are talking about the different ways of organising the \(y\)'s in the spaces for \(y\)'s already in the lines generated by the procedure in the previous paragraph. We are NOT organising the \(y\)'s in every possible space in those rows. The number that does the dividing is always going to be the same number as the power we are raising \(y\) to, multiplied by all the whole numbers between it and zero (because we are using up a \(y\) position each time). We show this by using the \(!\) symbol after the number that starts that multiplication.

We have already written down what the first four terms look like when we multiply \(m\) brackets together. Having thought about the arguments above we can also say what ANY term of that equation will look like. If we want to know the \(k\)th term we would change the symbol \(k\) in the following to the number of the term we wanted to know:

\[\frac{m(m-1)(m-2)\ldots (m-(k-1))}{k!}y^k\]

That looks a bit tricky. Previously we have seen that dots in a row mean "and so on" when added onto the end of a series of numbers. Here, although they appear between two terms in the top line of a division, they mean exactly the same. They just mean "keep putting in brackets with one more being deducted from \(m\) each time, until you get to the bracket where the number being deducted is one less than \(k\)". We do not want to go all the way to \(k\) because then we would have one too many brackets being multiplied.

I am not just being lazy by not writing in all these brackets. I literally cannot do so because I do not know what value \(k\) has, so I do not know how many brackets to write in until I pick a \(k\)! \(k\) may even be two, in which case I was wrong in writing in the \((m-2)\) bracket (I should really have stopped at \((m-1)\), because one is one less than \(k\) when \(k\) is two). The idea behind the way that I have written that line is that I have given all the clues that will be needed for the whole line to be constructed once \(k\) is known, even if I have to add in or take away brackets.

What happens if we set \(k\) to four? We get this:

\[\begin{align}
&\frac{m(m-1)(m-2)\ldots (m-(4-1))}{4!}y^4\\
&\frac{m(m-1)(m-2)(m-(3))}{4!}y^4\\
&\frac{m(m-1)(m-2)(m-3)}{4!}y^4
\end{align}\]

That is the term that will always tell us the number of \(y\) to the fourth powers once we multiply out our brackets. If we say that we have four brackets, like we had above, \(m\) will be four, and the number of \(y\) to the fourth powers will be:

\[\begin{align}
&\frac{4(4-1)(4-2)(4-3)}{4!}y^4\\
&\frac{4(3)(2)(1)}{4!}y^4\\
&\frac{4(3)(2)(1)}{4\cdot 3\cdot 2 \cdot 1}y^4\\
&\frac{24}{24}y^4\\
&1\cdot y^4
\end{align}\]

Or, one \(y^4\). Which, if you look above, you will find is exactly the number we were expecting. So our logic seems sound. If you are a bit devious of mind, you may well ask, "what happens if I set \(k\) to be more than \(m\)?", or in other words, what if I ask this equation how many \(y^4\)'s I will get when I am only multiplying together, say, two brackets? The answer must be zero, because to get to \(y^4\) you must be multiplying four \(y\)'s together. If we only have two brackets, we will only have, at most, two \(y\)'s to multiply together. So what does this look like?

\[\begin{align}
&\frac{2(2-1)(2-2)(2-3)}{4!}y^4\\
&\frac{2(1)(0)(-1)}{4!}y^4\\
&\frac{0}{4!}y^4
\end{align}\]

Anything multiplied by zero is zero. So it doesn't matter what else we have in the top line of that division, as soon as a zero hoves into view the whole thing turns to zero. Zero divided by anything (apart from zero) is also zero. So we end up with zero \(y^4\)'s. Which is what we expected. We can also say that if \(m\) is less than \(k\) we will always end up creating at least one bracket on that line where a number equal to \(m\) is subtracted from \(m\), which reduces the whole thing to zero.
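
If you would rather let a computer do the arithmetic, here is a minimal Python sketch of the general term. The function name `count_y_power` is my own invention for illustration, and `math.comb` (Python's built-in "choose" function) is only there as an independent check on the answer.

```python
from math import comb, factorial

def count_y_power(m: int, k: int) -> int:
    """Number of y^k terms produced by multiplying m brackets of (1 + y)."""
    top = 1
    # Multiply k brackets together: m, (m - 1), ..., (m - (k - 1)).
    for j in range(k):
        top *= m - j
    # Divide out the k! duplicate orderings; this division is always exact.
    return top // factorial(k)

print(count_y_power(4, 4))                      # four brackets, y^4 -> 1
print(count_y_power(2, 4))                      # two brackets, y^4  -> 0
print([count_y_power(4, k) for k in range(5)])  # [1, 4, 6, 4, 1]
print(count_y_power(10, 3) == comb(10, 3))      # True
```

Notice that the loop runs exactly \(k\) times, which is the "stop at the bracket that deducts one less than \(k\)" rule in action. When \(m\) is smaller than \(k\) the loop is bound to pass through \(m-m\), so the top line collapses to zero without any special handling.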

So we are finally able to restate our limiting definition of \(e^x\):

\[\begin{multline}
e^x=\lim_{m \to \infty} 1+my+\frac{m(m-1)}{2!}y^2+\frac{m(m-1)(m-2)}{3!}y^3+\ldots\\
+\frac{m(m-1)(m-2)\ldots (m-(k-1))}{k!}y^k
\end{multline}\]
(Remember that we have defined \(y=\left (\frac{x}{m} \right )\), so we need to replace every \(y\) accordingly)

We have now got rid of the pesky raising to the power, but we still have the problem of this being a limiting function. Let's see if we can get rid of that. How do we do that? What I am going to do is just say, "to hell with approaching infinity, let's just make \(m\) infinity and see what happens". In particular I am going to look at each term in turn to see what effect infinity has.

The first is always one. We agreed that no matter how many times you multiply one by one you still get one.

The second is \(my\) at the moment, but we need to reintroduce \(\left (\frac{x}{m} \right )\). This makes it:

\[m\cdot \frac{x}{m}\]

That is also:
\[\begin{align}
&\frac{m}{1} \cdot \frac{x}{m}\\
&\frac{m\cdot x}{1\cdot m}\\
&\frac{m\cdot x}{m\cdot 1}\\
&\frac{m}{m}\cdot \frac{x}{1}
\end{align}\]

So no matter what \(m\) is, even infinity, once you divide it by itself you get one. So the second term is going to be \(x\) on its own. It has to be, because the two infinite \(m\)'s cancel each other out.

Let's look at the third term now:

\[\begin{align}
&\frac{m(m-1)}{2!}y^2\\
&\frac{m(m-1)}{2!}\cdot \left (\frac{x}{m}\right )^2\\
&\frac{m(m-1)}{2!}\cdot \frac{x^2}{m^2}\\
&\frac{m(m-1)}{m^2}\cdot \frac{x^2}{2!}
\end{align}\]

We can just swap around the bottoms of those fractions because we are multiplying them together. Look back up at the discussion about the last term if you want to see the steps. We would just have brought these together into one big division, reorganised the order of the terms and then separated out into these two fractions. OK. Let's set \(m\) equal to infinity.

Look at the first part of that multiplication. We are setting \(m\) to infinity. If we do that then \((m-1)\) is still infinity because infinity minus one is still infinity. Agreed? Given that is the case, we get:

\[\frac{\infty(\infty-1)}{\infty^2}\cdot \frac{x^2}{2!}\]

All together now, what is infinity minus one? Yes! Infinity. So we actually have:

\[\frac{\infty(\infty)}{\infty^2}\cdot \frac{x^2}{2!}\]

Which is:

\[\frac{\infty^2}{\infty^2}\cdot \frac{x^2}{2!}\]

What is anything (even infinity squared) divided by itself? Yes - one.

\[1\cdot \frac{x^2}{2!}\]

So the horrible looking third term actually just dissolved down to a nice simple looking fraction. This is also going to be our pattern. Think about it. We have exactly the same number of brackets on the top line of each division as the power of the \(y\) that the division is multiplied by. So here we had two brackets and we squared \(y\). Next time we add a power and a bracket, because we have three brackets and we cube \(y\). This pattern continues. It means we can always be sure, no matter which term in this infinitely long series we are looking at, that the power of \(m\) on the bottom of the fraction exactly balances the number of brackets containing \(m\) on the top of the fraction. When we set \(m\) to infinity that makes all the brackets on the top contain infinity, no matter what number they try to deduct from infinity. And that means that we have the same number of infinities multiplying together on the top and bottom of the division. If we have the same on the top and bottom we know that just equals one.
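
Plugging in infinity directly is a bit of a leap of faith, so here is a small Python sketch that watches the same fraction for bigger and bigger ordinary values of \(m\). The helper name `bracket_ratio` is made up for this example, and it uses exact fractions so that rounding does not muddy the picture.

```python
from fractions import Fraction

def bracket_ratio(m: int, k: int) -> Fraction:
    """Return m(m-1)...(m-(k-1)) divided by m^k as an exact fraction."""
    ratio = Fraction(1)
    # Pair each bracket on the top with one m from the bottom.
    for j in range(k):
        ratio *= Fraction(m - j, m)
    return ratio

for m in (10, 1_000, 1_000_000):
    print(m, float(bracket_ratio(m, 3)))
```

With ten brackets the ratio for the \(y^3\) term is only 0.72, but it creeps towards one as \(m\) grows, which is what the "same number of infinities on the top and bottom" argument is claiming.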

So we can say that when we set \(m\) to infinity, the \(k\)th term (assuming that the term at the very beginning is the zeroth term) of the series is \(\frac{x^k}{k!}\), which looks very neat indeed. So we can now write out \(e^x\) as an infinite series:

\[e^x=1+x+\frac{x^2}{2!}+\frac{x^3}{3!}+\frac{x^4}{4!}+\ldots\]

If only we could write the first two terms of the series in the form \(\frac{x^k}{k!}\) we could even write out a nice summation symbol for all this. Well, let's just see. The first term (as opposed to the zeroth term) is \(x\). What would happen if we tried to work out the first term using our rules? Well, it would be \(\frac{x^1}{1!}\). One factorial (\(1!\)) is just one. And \(x\) to the power of one is just \(x\). So we end up with \(\frac{x}{1}\) which is just \(x\) which is exactly what we want - so that works fine.

How about the zeroth term? Well we agreed ages ago that anything to the power of zero is one. That applies just as much to \(x^0\), so our rule \(\frac{x^0}{0!}\) becomes \(\frac{1}{0!}\). But surely \(0!\) is zero, right?

Nah. We assumed that anything to the power zero was going to be zero, but it turned out to be one. Same thing happens for much the same reason here. Here's what happens.

Any normal factorial, \(3!\) or \(2!\) for instance, is just the factorial of the number one smaller multiplied by the number of the factorial. So \(3!=3\cdot 2!\), and \(2!=2\cdot 1!\). If \(0!\) was zero, then this would not work with \(1!\), because that would turn into \(1\cdot 0!=1\cdot 0=0\) when we know that it is actually one. So to preserve the way the whole factorial system works, \(0!\) has to be one.

Going back to the zeroth term (\(\frac{1}{0!}\)), this gives us \(\frac{1}{1}\), or one. Which is the first term in our series. Whoopee!

This means we CAN write the sum which generates the series like this:

\[\sum_{k=0}^\infty \frac{x^k}{k!}\]

What we are trying to do here is state what \(e^x\) is. So let's test all this with \(e^1\). Looking at our sum we can see that the \(x\)'s all appear on the top of the fractions. So all we need to do is replace them all with ones. So:

\[e^1=1+1+\frac{1^2}{2!}+\frac{1^3}{3!}+\frac{1^4}{4!}+\ldots\]

And of course one to any power is just one, so this simplifies to:

\[e^1=1+1+\frac{1}{2!}+\frac{1}{3!}+\frac{1}{4!}+\ldots\]

Let's work out those factorials:

\[e^1=1+1+\frac{1}{2}+\frac{1}{6}+\frac{1}{24}+\ldots\]

So we can see roughly what is going on, we'll stick to the first five terms here. We'll also convert them all into twenty fourths:

\[\begin{align}
e^1 &\approx \frac{24}{24}+\frac{24}{24}+\frac{12}{24}+\frac{4}{24}+\frac{1}{24}\\
e^1 &\approx \frac{65}{24}\\
e^1 &\approx 2.708333333
\end{align}\]

Well, that's not bad at all after only five terms. It is definitely heading in the right direction. Let's sum up then. We have gone from:

\[e=\lim_{n \to \infty} (1+\tfrac{1}{n})^n\]

to

\[\sum_{k=0}^\infty \frac{x^k}{k!}\]

or

\[e^x=1+x+\frac{x^2}{2!}+\frac{x^3}{3!}+\frac{x^4}{4!}+\ldots\]
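
To see how quickly the series closes in on the real thing, here is a minimal Python sketch that adds up the first few terms. The name `exp_series` is hypothetical, and `math.exp` is only used as the calculator's answer to compare against.

```python
from fractions import Fraction
from math import exp, factorial

def exp_series(x, terms: int) -> Fraction:
    """Add up the first `terms` terms of x^k / k!, starting from k = 0."""
    return sum(Fraction(x) ** k / factorial(k) for k in range(terms))

print(exp_series(1, 5))                   # 65/24, the five-term answer from above
print(float(exp_series(1, 5)))
print(float(exp_series(1, 15)), exp(1))   # fifteen terms against the calculator
```

Keeping everything as fractions means the five-term answer comes out as exactly \(\frac{65}{24}\), matching the sum we did by hand, and only gets turned into a decimal at the very end.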

Monday, 10 October 2011

Limitless e, Part One

Limitless e

So last time we worked out HOW to raise the number \(e\) to an imaginary power - specifically the imaginary power that we are focussed on namely \(e^{x}\). We were still using our definition of e as the limit of a function when one of the variables increased in size towards infinity. This function:

\[e^{x}=\lim_{m \to \infty} (1+\tfrac{x}{m})^{m}\]

That function was a little different to our first definition of \(e\) because it is actually the function that gives us \(e\) to the power of \(x\). Which is handy. When we ran through everything I decided that while the function seemed to work perfectly it did not actually tell us anything about WHY these numbers combine in this fashion.

So let's take a different tack. Let's stop defining \(e\) as the limit of a function as you change a variable. Let's try to define it purely in terms of a one variable function. In other words, what I want to do is get rid of the \(m\) in the above function.

Right-o. How do we do that? Well, what we have is this:

\[(1+\tfrac{x}{m})^{m}\]

We want to get rid of the \(m\). The \(m\) just tells us how many times we need to multiply together the stuff in the brackets. Let's say that \(m\) was two. The function would look like this:

\[(1+\tfrac{x}{2})^{2}\]

Which would expand into this:

\[(1+\tfrac{x}{2})\cdot (1+\tfrac{x}{2})\]

Now we have the same kind of problem as the red and blue baskets with apples and oranges in them. Only this time we have exactly the same things in each bracket. It doesn't change the way we deal with it though, we just need to unpack the first bracket like this:

\[1\cdot (1+\tfrac{x}{2})+\tfrac{x}{2}\cdot (1+\tfrac{x}{2})\]

And then we can unpack the second bracket like this:

\[1\cdot 1+1\cdot \tfrac{x}{2}+\tfrac{x}{2}\cdot 1+\tfrac{x}{2}\cdot \tfrac{x}{2}\]

Which turns into:

\[1+ \tfrac{x}{2}+\tfrac{x}{2}+\tfrac{(x)^2}{2^2}\]

And finally:

\[1+ 2\cdot \tfrac{x}{2}+\tfrac{x^2}{4}\]

That's fair enough, and perfectly logical. However, what we want to do is not just multiply two brackets together, but to multiply \(m\) brackets together. How on earth do we do that? What we are going to do is to look closely at exactly what we get out of the brackets when we multiply them together, and how that changes when we add more and more brackets to the mix. We want to see if we can work out a general rule to tell us what we will get without having to go through all that tedious multiplication and addition that I just did above.

To help us with this process I am going to simplify our brackets a bit. Instead of \((1+\tfrac{x}{m})\), which is a bit of a mouthful, I am going to replace the \(\tfrac{x}{m}\) bit with the letter \(y\). So our bracket now looks like this: \((1+y)\). That's much nicer to look at. All we have to do is remember that when we are finished, we need to replace \(y\) where ever we find it with \(\tfrac{x}{m}\).

Right, to get a feel for what we are talking about let's look at three examples, \((1+y)^2\), \((1+y)^3\), and \((1+y)^4\). I won't do much talking in between, let's just have a look at the logical steps.

First of all \((1+y)^4=(1+y)\cdot(1+y)^3\), and \((1+y)^3=(1+y)\cdot(1+y)^2\). So we just need to work out the power two answer and then multiply through by another bracket, and so on. So, power two:

\[\begin{align}
&(1+y)^2\\
&(1+y)(1+y)\\
&1(1+y)+y(1+y)\\
&1\cdot1+1\cdot y+y\cdot1+y\cdot y\\
&1+y+y+y^2\\
&1+2y+y^2
\end{align}\]

Now power three:

\[\begin{align}
&(1+y)(1+2y+y^2)\\
&1(1+2y+y^2)+y(1+2y+y^2)\\
&1\cdot 1+1\cdot 2y+1\cdot y^2+y\cdot 1+y\cdot 2y+y\cdot y^2\\
&1+2y+y^2+y+y(y+y)+y^3\\
&1+2y+y^2+y+y^2+y^2+y^3\\
&1+2y+y^2+y+2y^2+y^3\\
&1+2y+y+3y^2+y^3\\
&1+3y+3y^2+y^3
\end{align}\]

Now power four:

\[\begin{align}
&(1+y)(1+3y+3y^2+y^3)\\
&1(1+3y+3y^2+y^3)+y(1+3y+3y^2+y^3)\\
&(1+3y+3y^2+y^3)+y(1+3y+3y^2+y^3)\\
&1+3y+3y^2+y^3+y\cdot 1+y\cdot 3y+y\cdot 3y^2+y\cdot y^3\\
&1+3y+3y^2+y^3+y+3y^2+3y^3+y^4\\
&1+4y+3y^2+y^3+3y^2+3y^3+y^4\\
&1+4y+6y^2+y^3+3y^3+y^4\\
&1+4y+6y^2+4y^3+y^4
\end{align}\]

Let's summarise what we have found so far:

\[\begin{align}
(1+y)^2&=1+2y+y^2\\
(1+y)^3&=1+3y+3y^2+y^3\\
(1+y)^4&=1+4y+6y^2+4y^3+y^4
\end{align}\]
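
The "work out one power, then multiply through by another bracket" shortcut is easy to hand over to a computer. Here is a minimal Python sketch; storing each result as a list of counts, where position \(k\) holds the number of \(y^k\)'s, is my own choice for the illustration, as is the function name.

```python
def times_one_plus_y(coeffs: list[int]) -> list[int]:
    """Multiply a polynomial in y by (1 + y).

    coeffs[k] is the number of y^k terms, so [1, 2, 1] means 1 + 2y + y^2.
    """
    result = [0] * (len(coeffs) + 1)
    for k, c in enumerate(coeffs):
        result[k] += c       # the "1 times" part keeps the power the same
        result[k + 1] += c   # the "y times" part raises the power by one
    return result

row = [1]  # no brackets at all, just the number one
for power in range(1, 5):
    row = times_one_plus_y(row)
    print(power, row)
```

The last three lines printed are `[1, 2, 1]`, `[1, 3, 3, 1]` and `[1, 4, 6, 4, 1]`, the same numbers as in the summary above.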

Can we start to describe our results? Well, first of all each result starts with a one. Secondly each result ends with a single \(y\) to the same power as we were raising the bracket to. Thirdly, in between the one and the single \(y\), we have some amount of every power of \(y\). In other words, reading from left to right, you have no \(y\)'s, then some number of just \(y\) and then some number of \(y^2\) then \(y^3\) and so on. We never miss a power of \(y\).

These are all fairly obvious. The last observation may have passed you by. Look at the numbers we multiply the \(y\)'s by. They are symmetrical! In the third power row you get three - three, and in the fourth power row you get four - six - four. This is promising, because it suggests that there is a pattern to be discovered! Let's have a try at working out how we get to those numbers.

Why does each line start with a one? Remember the process that we used to generate these lines. We are multiplying together numerous brackets. The way we multiply brackets together is to multiply the individual terms in each bracket by every other term in all the other brackets. That's easy for only two brackets, because in each multiplication you only have two things being multiplied. With three or four brackets, we are going to have three or four items being multiplied.

To generate the first row above we went through the whole term by term routine, but then when generating the other lines, we just multiplied the last line by another bracket as a short cut. Let's actually look at what would have happened with three brackets, from scratch as it were:

\[(1+y)(1+y)(1+y)\]

First we multiply the first term in each of the brackets:

\[1\cdot 1\cdot 1\]

That gives us our number one. The more brackets you have the more ones go into that multiplication, but even with \(m\) number of ones in that multiplication the result is still going to be one. That's where the one comes from. What next? Let's have the first terms of the first two brackets and then the second from the last:

\[1\cdot 1\cdot y\]

That gives us a \(y\) on its own (the ones just collapse down to tell us we end up with one multiplied by \(y\)). And the same thing will happen when the \(y\) in the middle bracket is multiplied by the ones in the other two. That looks like this:

\[1\cdot y\cdot 1\]

And finally we will also get the same result from choosing the \(y\) in the first bracket and ones in the last two:

\[y\cdot 1\cdot 1\]

If we then add up all those \(y\)'s we end up with, drumroll, \(3y\), which is what we were expecting. If you think about it there is no way for us to get any other individual \(y\)'s. There are only two options from each bracket: a one or a \(y\). And if you multiply two or more \(y\)'s together you cannot end up with a \(y\) as opposed to a \(y^2\) or a \(y^3\). So the only way of getting a \(y\) is to pick exactly one \(y\). We can only do that for as many brackets as we have. So that means if we have \(m\) brackets we will get \(m\) \(y\)'s. So far, with \(m\) brackets, we can say that we will have:

\[1+my+\ldots\]

If we move onto \(y^2\)'s things get a bit trickier. We have to keep track of all the combinations we have tried so far, and that is getting complicated. So I am going to use a table to keep track of things.

That shows us the result for the first line. Now let's add on the other options that we have worked through so far:

In order to do things as systematically as possible when using multiple \(y\)'s, I am going to fix one, and then move the other one around. This means that I won't accidentally miss any of the possible arrangements. I'll fix one in the third bracket first, putting the other \(y\) in the second and then first brackets:

Good. Now let's put the fixed \(y\) in the second bracket, and we'll use the last and then first for the other \(y\):

And last of all we will fix the \(y\) in the first bracket and use the last and then second brackets for the other \(y\):

Hang on. We have ended up with six \(y^2\)'s. We were only expecting three. What has gone wrong? If you look at the first and third \(y^2\) rows you will see that they are identical. They both represent a one in the first bracket multiplied by a \(y\) in the second and third brackets. But once we have done that combination once, we can't do it again. So we have overestimated. If you look closely, you will see that each \(y^2\) line is duplicated once. Because we have twice as many \(y^2\)'s as we need, we need to divide the result by two. Six divided by two is three, which IS what we were expecting. Why did we end up with twice as many as we were expecting? Look at this line:

How did we create that twice? Well once we had the fixed \(y\) in the first column, and then we had it in the second column. But both times we generated the same row. There are two \(y\)'s in the row, so there have to be two copies of the row in our whole table. Why? You have one copy for when the first \(y\) is fixed and one copy for when the second \(y\) is fixed. Good.
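
The fix-one-and-float-the-other routine can also be run mechanically. Here is a rough Python sketch, assuming each row is stored as three entries, one per bracket, holding either a one or a \(y\). It uses `itertools.permutations` to pick a column for the fixed \(y\) and then a different column for the floating \(y\), so the rows may come out in a different order from the one I used by hand.

```python
from itertools import permutations

brackets = 3

rows = []
# Choose a column for the fixed y, then a different column for the floating y.
for fixed, floating in permutations(range(brackets), 2):
    row = ["1"] * brackets
    row[fixed] = "y"
    row[floating] = "y"
    rows.append(tuple(row))

for row in rows:
    print(" . ".join(row), "= y^2")

print(len(rows), "rows in total,", len(set(rows)), "of them different")
```

The final line reports six rows in total but only three different ones, with every different row turning up exactly twice: once for each choice of which \(y\) was the fixed one.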

Next question. How did we end up with six rows? Well we were creating our rows by a careful systematic process. First we picked a column for the fixed \(y\). We had three columns to choose from. Then we placed our floating \(y\). We no longer had three columns to choose from, because the fixed \(y\) was taking up a column. So we had one less than three, or two, columns to choose from. For each of the columns the fixed \(y\) was in, we had two options for the floating \(y\). That gave us three multiplied by two, or six total rows. Can we then come to any conclusions about the number of \(y^2\)'s we will have with \(m\) number of brackets?

We will have \(m\) choices of column for our first \(y\), and then \(m-1\) options for our second \(y\). We will, however, end up with more rows than we need, because we will get one for each possible rearrangement of the fixed and floating \(y\)'s. With two possible \(y\)'s (fixed and floating) there are two possible ways to position them in each different row. So if the total number of rows we get is \(m\) multiplied by \(m-1\), we then need to divide that by the number of times each row is repeated, which for two types of \(y\) will be two. So we can say that the total number of \(y^2\)'s we will get is \(\frac{m(m-1)}{2}y^2\). We can add that on to our running total like this:

\[1+my+\frac{m(m-1)}{2}y^2+\ldots\]

Let's pause for a moment and check that this works with two brackets. With two brackets \(m=2\) so we get:

\[\begin{align}
1+2\cdot y+\frac{2\cdot (2-1)y^2}{2}+\ldots\\
1+2y+\frac{2\cdot (1)y^2}{2}+\ldots\\
1+2y+\frac{2\cdot y^2}{2}+\ldots\\
1+2y+\tfrac{2}{2}\cdot y^2+\ldots\\
1+2y+1\cdot y^2+\ldots\\
1+2y+y^2+\ldots
\end{align}\]

So far so good - that's what we got when we did it manually - but what about the \(\ldots\) bit? Given that we do not need it (because the first three terms are all we need) it must be zero, so we will see why next time.

Just before we go onto the next one, with our current table, there will be one final line to add:

That just confirms for three brackets multiplied together that we will end with one \(y^3\), because with only three brackets to choose from there will only ever be one way of multiplying one item from each to get a \(y^3\) and that is by choosing \(y\) in each bracket.

Right. On to four brackets. Let's think about our table. The first row is going to be all ones, representing the only way you can multiply all the ones together. Next will be the rows with one \(y\) in each row. There will be four rows of them, because there are four sets of brackets to choose an individual \(y\) from. Let's show that now:

Same old, same old. Now let's do the \(y^2\)'s. Just like last time, we will fix one \(y\) and then float one \(y\) around. This looks like this:

So you can see that for every \(y_{fixed}\) we have the other \(y\) in each of the other columns. So the total number of rows should be four multiplied by four minus one (three). That comes to twelve which is exactly the number of rows we find above. Excellent. As with the last time, if you ignore the distinction of \(y_{fixed}\) as opposed to \(y\), you have twice as many rows as you need, because you can rearrange \(y\) and \(y_{fixed}\) in two ways for each possible row. So again you need to divide by two. Twelve divided by two is six, which is just what we found when we did this manually.

Now. What about \(y^3\)? Before we start, let's think through logically what should happen. We will have three \(y\)'s to choose from the brackets available. If we do the same as last time we will fix one \(y\), from four different options, but then we will have two other \(y\)'s to pop in. For the second \(y\) you have one fewer bracket to choose from. And for the third \(y\) we have one fewer bracket again to choose from. So the total number of rows we will generate is four multiplied by three multiplied by two. That's going to be twenty four in total. How many duplicates are we going to get? Well, for each possible unique row we have three places for \(y\)'s. Does that mean we get triple the number of rows we need instead of double like last time? No.

Obviously, we can't be right, because we already know the answer is four (from when we did it manually above). But twenty four divided by three is eight, not four. Four is twenty four divided by six, not three. So where does the six come from? If you have a row with three \(y\)'s in it, how many different ways can you order the \(y\)'s? With one \(y\) there was only one way to put it in a row with one space for it, so the answer is one. With two spaces and two \(y\)'s, you can place the first \(y\) in two places, with the other one falling into the only remaining place. So the answer is two multiplied by one, or two. For three \(y\)'s you have three choices for the first \(y\), then two places for the second \(y\) and only one for the last \(y\). The answer is then three multiplied by two multiplied by one. That comes to six, which is the number we predicted. So we can say that the total possible number of rows is going to be \(m\) (for the first \(y\)) multiplied by \(m-1\) (for the second \(y\)) multiplied by \(m-2\) (the places left for the last \(y\)). Each row generated by that process has three \(y\)'s in it. There are three multiplied by two multiplied by one ways to organise three \(y\)'s in three places, so we end up with six times as many rows as we need.

The number of \(y^3\)'s generated from \(m\) brackets is going to be \(\frac{m(m-1)(m-2)}{6}y^3\). We get the six from three times two times one. So we can expand our formula to:

\[1+my+\frac{m(m-1)}{2}y^2+\frac{m(m-1)(m-2)}{6}y^3+\ldots\]
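
If you want to check the counting without drawing out all twenty four rows, here is a minimal Python sketch that builds them by brute force. The function `count_rows` is a hypothetical helper written just for this check; it places the \(y\)'s one at a time, exactly as in the fix-and-float routine, and then throws away the duplicates.

```python
from itertools import permutations
from math import factorial

def count_rows(m: int, k: int) -> tuple[int, int]:
    """Place k y's into m brackets one at a time.

    Returns (rows generated, rows left after removing duplicates).
    """
    rows = []
    # Each permutation is one way of choosing a column for the first y,
    # then a different column for the second y, and so on.
    for columns in permutations(range(m), k):
        row = ["1"] * m
        for c in columns:
            row[c] = "y"
        rows.append(tuple(row))
    return len(rows), len(set(rows))

print(count_rows(4, 2))  # (12, 6)
print(count_rows(4, 3))  # (24, 4)

generated, different = count_rows(5, 3)
print(generated == different * factorial(3))  # True
```

For four brackets it reports twelve rows and six different ones for \(y^2\), and twenty four rows and four different ones for \(y^3\). The last line checks with five brackets that the rows generated really are six times the rows we need.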

If it wasn't obvious already, when we are looking at higher and higher powers of \(y\) from more and more brackets, this pattern is going to continue. For whatever power of \(y\) we are interested in, we are always going to multiply together the number of columns available for each \(y\) we need to place. That will always be one fewer each time you place a \(y\). For each of the unique lines that process generates, there will be the same number of copies as there are ways to organise the number of \(y\)'s that you are placing.

We are stuck with the complicated \(m(m-1)(m-2)\ldots (m-(m-1))\) mess, but can we do anything with the three multiplied by two multiplied by one stuff? The answer is yes. This kind of stuff crops up from time to time, and there is a simple notation for it. You just write \(3!\) to represent three multiplied by two multiplied by one, and \(4!\) is four multiplied by three multiplied by two multiplied by one. And so on. That turns our formula into:

\[1+my+\frac{m(m-1)}{2!}y^2+\frac{m(m-1)(m-2)}{3!}y^3+\ldots\]

Which is a bit neater. We'll see where this takes us next time.

Monday, 3 October 2011

Intermission

OK, we have been distracted by shiny things in the form of the Mandelbrot set. Now we can get back to our task in hand, working out why \(e^{i\pi}+1=0\).

We have now met our cast of characters. Let's have a look at them all lined up on the complex plane:

So you take the number at \(\pi\) and multiply it by \(i\). That's easy enough to visualise now. We look at both numbers in polar form. \(\pi\) is just \(\pi\angle 0\) and \(i\) is \(1\angle \tfrac{\pi}{2}\). We just need to multiply the absolute size (\(\pi\cdot 1\)) and we add the angles (\(0+\tfrac{\pi}{2}\)). That gets us to \(\pi\angle \tfrac{\pi}{2}\). That looks like this:

So we are now left with the job of trying to work out how to raise the number \(e\) to this new number \(i\cdot \pi\). To do this, we are going to have to work out how to raise \(e\) to a complex power. To start with, let's go back and look at our definition of \(e\).

\[e=\lim_{n \to \infty} (1+\tfrac{1}{n})^n\]

Let's look only at the function bit:

\[(1+\tfrac{1}{n})^n\]

Let's now raise that bit to a power:

\[((1+\tfrac{1}{n})^n)^x\]

We now know that when raising something to a power and then to a power again, we can just multiply the powers. In other words the expression above really means take the bit in brackets and multiply \(n\) of them together. Now take that group of things being multiplied together and multiply \(x\) of those groups together. If you did that you would just end up with \(x\) groups of \(n\) things all multiplied together. So essentially you end up with \(n\cdot x\) numbers of the stuff in brackets being multiplied together. So we can just write this instead:

\[(1+\tfrac{1}{n})^{n\cdot x}\]

How does this help me work out \(e\) to a complex power? Well, what I know how to do is to multiply, divide and add complex numbers. I do not know how to raise a number to a complex exponent. So what I really want to do is to get that \(x\) away from the exponent. How do I do that?

Let's create a new variable \(m\). Let's define this to be \(n\cdot x\):

\[m=n\cdot x\]

Now I will multiply both sides of that equation by \(\tfrac{1}{n}\):

\[\begin{align*}
\tfrac{1}{n}\cdot m&=\tfrac{1}{n}\cdot n\cdot x\\
\tfrac{m}{n}&=\tfrac{n}{n}\cdot x\\
\tfrac{m}{n}&=1\cdot x
\end{align*}\]

We also need to multiply both sides by \(\tfrac{1}{m}\):

\[\begin{align*}
\tfrac{m}{n}\cdot \tfrac{1}{m}&=x\cdot \tfrac{1}{m}\\
\tfrac{m\cdot 1}{n\cdot m}&=\tfrac{x}{m}\\
\tfrac{m\cdot 1}{m\cdot n}&=\tfrac{x}{m}\\
\tfrac{m}{m}\cdot \tfrac{1}{n}&=\tfrac{x}{m}\\
1\cdot \tfrac{1}{n}&=\tfrac{x}{m}
\end{align*}\]

We now have two definitions to use:

\[\begin{align*}
n\cdot x&=m\\
\tfrac{1}{n}&=\tfrac{x}{m}\\
\end{align*}\]

And we just need to fit them into this equation:

\[(1+\tfrac{1}{n})^{n\cdot x}\]

We have both a \(\tfrac{1}{n}\) and a \(n\cdot x\), so let's have at it:

\[(1+\tfrac{x}{m})^{m}\]

There we go - mission accomplished. We have managed to get the \(x\) AWAY from the exponent. We can now say that:

\[e^x=\lim_{m \to \infty} (1+\tfrac{x}{m})^{m}\]
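
As a quick sanity check on the substitution, here is a small Python sketch that evaluates both forms for the same numbers. The function names `with_n` and `with_m` and the sample values of \(x\) and \(m\) are arbitrary choices of mine, and `math.isclose` is used because the two answers can differ by a hair of floating point rounding.

```python
from math import isclose

def with_n(n: float, x: float) -> float:
    """The original form: (1 + 1/n) raised to the power n * x."""
    return (1 + 1 / n) ** (n * x)

def with_m(m: float, x: float) -> float:
    """The rewritten form: (1 + x/m) raised to the power m."""
    return (1 + x / m) ** m

x, m = 3.0, 1000.0
n = m / x  # from the definition m = n * x
print(with_n(n, x), with_m(m, x))
print(isclose(with_n(n, x), with_m(m, x)))
```

Both forms give the same number, which is all the substitution promised: nothing has changed except where the \(x\) lives.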

We can now see what \(e\) raised to a complex power is. First, though, let's just run through this with real powers to check it is working. If we square \(e\), then according to my calculator we should get \(7.389056099\). So let's try:

\[e^2=\lim_{m \to \infty} (1+\tfrac{2}{m})^{m}\]

If you set \(m\) to \(1000000\) you get an expression that looks like this:

\[\begin{align*}
\left (\frac{1000000}{1000000}+\frac{2}{1000000}\right )^{1000000}\\
\left (\frac{1000002}{1000000}\right )^{1000000}\\
\frac{1000002^{1000000}}{1000000^{1000000}}
\end{align*}\]
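If your calculator chokes on that, a few lines of Python will do the job. This is only a sketch using ordinary floating point numbers, so the last digit or two are not to be trusted:

```python
import math

m = 1_000_000

# (1 + 2/m)^m, the bracket raised to the power of a million
approx = (1 + 2 / m) ** m

# the library's own value for e squared, for comparison
exact = math.exp(2)

print(approx)
print(exact)
print(exact - approx)   # how far short the approximation falls
```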

If you work out that horror you get \(7.389041321\), which agrees to four decimal places with the "real" answer. So we are on the right track! So what we can do now is to replace the \(2\) with \(i\pi\). That looks like this:

\[e^{i\pi}=\lim_{m \to \infty} (1+\tfrac{i\pi}{m})^{m}\]

Let's set \(m\) to a hundred to see how we get on. That makes our equation:

\[e^{i\pi}\approx (1+\tfrac{i\pi}{100})^{100}\]

Let's deal with the bit in brackets first. How do we do the division? It is unsurprisingly the opposite of multiplication. So instead of multiplying the absolute values together you divide the first by the second. Then you deduct the second angle from the first. Remember the number we are dividing looks like this:

In polar form (for division) that is \(\pi\angle \tfrac{\pi}{2}\). The polar form for the number doing the dividing (the one on the bottom) is \(100\angle 0\). So the maths looks like this:

\[\frac{\pi\angle \tfrac{\pi}{2}}{100\angle 0}\]
\[(\tfrac{\pi}{100})\angle (\tfrac{\pi}{2}-0)\]
\[(\tfrac{\pi}{100})\angle \tfrac{\pi}{2}\]
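Here is the same division as a rough Python sketch. The helper `polar_divide` is a name I have made up for illustration; the comparison uses Python's built-in complex numbers and the standard `cmath.polar` function:

```python
import cmath
import math

def polar_divide(size1, angle1, size2, angle2):
    """Divide the sizes, subtract the second angle from the first."""
    return size1 / size2, angle1 - angle2

# i*pi is pi units long at a quarter turn; 100 is 100 units long at angle zero
size, angle = polar_divide(math.pi, math.pi / 2, 100, 0)

# the same division done directly on complex numbers, then converted to polar
direct_size, direct_angle = cmath.polar(1j * math.pi / 100)

print(size, angle)
print(direct_size, direct_angle)
```

Both lines should print the same pair: a size of \(\tfrac{\pi}{100}\) and an angle of \(\tfrac{\pi}{2}\).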

So it is going to be at the same angle but only a hundredth the distance away from the origin. We then want to add one onto that. That just moves the number one unit in the positive real direction. That looks like this:

We can now see the number on the complex plane. This is the thing that we are going to raise to the power one hundred. Raising to a power is all about multiplication, so we are going to want to put the number into polar form. So what is it in polar form? First the absolute value is (again using Pythagoras) the square root of one squared plus \(\left (\frac{\pi}{100}\right )^2\). So the absolute value is:

\[\sqrt{1^2+\left(\frac{\pi}{100}\right )^2}\]
\[\sqrt{1^2+\left(\frac{\pi^2}{100^2}\right )}\]
\[\sqrt{1+\frac{\pi^2}{10000}}\]
\[\sqrt{\frac{10000}{10000}+\frac{\pi^2}{10000}}\]
\[\sqrt{\frac{10000+\pi^2}{10000}}\]

Just looking at that, you can see that it is a number just a little bit bigger than one. Why? Well ten thousand and a little bit divided by ten thousand is pretty close to one. And the square root of something pretty close to one is even closer to one. You can compare that with the diagram:

That makes sense. What about the size of the angle? The sine of the angle gives us the height of the point above the real axis, divided by the absolute value (to scale everything to the unit circle). If we work out that scaled down height, we can use the inverse of the sine function to give us the distance around the unit circle to that point, and hence the size of the angle. The scaled height is \(\frac{\frac{\pi}{100}}{\sqrt{\frac{10000+\pi^2}{10000}}}\). Hmm. That looks like a nightmare, but it really isn't. We have already seen that the bit on the bottom (the absolute value) is pretty close to one. Any number divided by one is just itself. So really, we are interested in the bit on top. That tells us that the sine of the angle we are looking for is roughly \(\tfrac{\pi}{100}\). In fact, if you work it out it is \(0.0314004349\).

Now to actually DO the inverse sine function we could start drawing our unit circle, and then make very very precise measurements, or we could just use a calculator. Calculator it is then. The distance round the unit circle to a point \(0.0314004349\) above the real axis is \(0.0314055972\). So that's our angle in radians.
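With Python standing in for the calculator, the whole trip from the point \(1+\tfrac{i\pi}{100}\) to its polar form can be sketched like this. The variable names are mine, and `math.asin` is the standard library's inverse sine:

```python
import math

z = complex(1, math.pi / 100)    # the point 1 + i*pi/100

size = abs(z)                    # Pythagoras: sqrt(1^2 + (pi/100)^2)
scaled_height = z.imag / size    # height scaled down to the unit circle
angle = math.asin(scaled_height) # inverse sine gives the angle in radians

print(size)
print(scaled_height)
print(angle)
```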

(If you think long enough about this you may well ask 'How the fuck does my calculator KNOW that this is the angle size that corresponds with that sine?' The calculator doesn't draw a circle and get out a ruler. Does it come with all the possible sine values for all possible angles to whatever number of decimal places? No. And before we have finished with this, you will find out what your calculator did to get this precise result).

If you look at the size of the angle, you may notice that it is pretty bloody close to the sine of the angle. In other words the distance around the unit circle to the point is almost exactly the same size as the height of the point above the real axis. Apparently this happens when your angle is very small. Why? Well, let's look at the diagram:

That's our number. Let's scale it to the unit circle (this does not change much because we know that the dashed line there is pretty close to one anyway), and use our old friends Imogen, Polly and Abby to see what is going on:

We can see the very small angle we are dealing with. What I am going to do next is to get rid of Raul because that is not relevant to this particular discussion. I am then going to move Imogen so that it is directly under Polly:

Now let's zoom in on the interesting bit:

Can you see that the orange line of Imogen is practically the same length as the section of circle round to Polly that it is pretty much obstructing? Let's move closer:

Can you see how the orange line and the blue line up to Polly are very close in length? That's exactly what I mean when I say that the sine of an angle is very close to the size of the angle in radians, when the angle is very small.
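You can watch the same thing happen in numbers. This little sketch divides the sine of an angle by the angle itself for a few shrinking angles (the particular angles are just ones I picked, apart from our own \(0.0314055972\)):

```python
import math

def sine_over_angle(angle):
    """How close is the height (sine) to the distance round the circle (angle)?"""
    return math.sin(angle) / angle

angles = (1.0, 0.1, 0.0314055972, 0.001)
ratios = [sine_over_angle(a) for a in angles]

for a, r in zip(angles, ratios):
    print(a, r)
```

The ratio creeps up towards one as the angle shrinks, without ever quite getting there.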

Anyway, enough distractions. We have our polar form number: \(\sqrt{\frac{10000+\pi^2}{10000}}\angle 0.0314055972\).

What are we going to do with it? The absolute value of the number is going to be multiplied by itself one hundred times. The angle is going to be added to itself one hundred times. Where does that take us? Well, the size is actually a number very close to one. If it was the square root of ten thousand divided by ten thousand, it would be one. The only thing that stops it being one is the \(\pi^2\) on the top line. So it is roughly the square root of ten thousand and ten over ten thousand. That is very very close to one. So when we multiply it by itself one hundred times it should still be pretty close to one unit long. In fact it turns out to be a bit over \(1.05\), but not much. Now the angle. If you look closely you will see that the angle is, to four decimal places, one hundredth of \(\pi\). So what do you get if you multiply one hundredth of \(\pi\) by one hundred? \(\pi\)! In fact for our numbers you get to \(99.97\%\) of \(\pi\). So, our end result, is:

\[\left(\sqrt{\frac{10000+\pi^2}{10000}}\angle 0.0314055972\right)^{100}\approx 1.05\angle99.97\%\pi\]

Well, the angle \(\pi\) is of course half a circle, which makes the result, to within about \(5\%\), negative one. So we can say, to within about \(5\%\), that:

\[e^{i\pi}\approx (1+\tfrac{i\pi}{100})^{100}\approx -1\]
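If you would rather not take my arithmetic on trust, here is a sketch of the same steps in Python: turn the number into polar form, raise the size to the power one hundred, multiply the angle by one hundred, and turn it back. `cmath.polar` and `cmath.rect` are standard library functions; the variable names are my own:

```python
import cmath
import math

z = 1 + 1j * math.pi / 100

size, angle = cmath.polar(z)

final_size = size ** 100      # the size multiplied together 100 times
final_angle = angle * 100     # the angle added together 100 times

result = cmath.rect(final_size, final_angle)   # back to real + imaginary form
direct = z ** 100                              # Python doing the power in one go

print(final_size)
print(final_angle / math.pi)   # the angle as a fraction of pi
print(result)
print(direct)
```

The last two lines should agree with each other, and both should land a short distance from negative one.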

To generate the identity that started this whole thing, all we do is add one, an equals sign, and zero. So it looks like we are definitely on the right track. What about increasing the value for \(m\)? Well, I can't be bothered running through all the arithmetic again for a start. But let's try to imagine what would happen.

First the multiplication of \(\pi\) and \(i\) would proceed unchanged. But then we would divide that number by a much larger number, say a million. That would bring the point down to a millionth of \(\pi\) away from the real axis. We would still add one, which would take us out to a point very, very close to one. The angle would be much smaller and the absolute size would be much much closer to one.

Secondly, when we then raised the absolute size to the power of a million, it would stay much closer to one. And although the angle would be much smaller, we would then add a million of them together, getting us even closer to \(\pi\) as the total. So as the \(m\) number gets larger, the result gets closer and closer to negative one.
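A computer does not get bored of arithmetic, so here is a small sketch that does the imagining for us. The function name `approx_e_to_i_pi` and the three values of \(m\) are my own choices for illustration:

```python
import math

def approx_e_to_i_pi(m):
    """(1 + i*pi/m)^m for a finite m."""
    return (1 + 1j * math.pi / m) ** m

# distance between the approximation and negative one, for growing m
errors = {m: abs(approx_e_to_i_pi(m) + 1) for m in (100, 10_000, 1_000_000)}

for m, error in errors.items():
    print(m, approx_e_to_i_pi(m), error)
```

Each time \(m\) gets a hundred times bigger, the gap to negative one should shrink by roughly the same factor.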

In fact, if you set \(m\) to infinity, you will divide \(i\pi\) by infinity, getting an infinitely small number. One plus an infinitely small number is infinitely close to one. That number's imaginary part would be infinitely small. When you raised the absolute value of the number to infinity, it would stay at one. And the angle, which would be an infinitieth fraction of \(\pi\) multiplied by infinity, would be exactly \(\pi\)! Which is exactly what we wanted to prove, isn't it? Let off the fireworks and start the band, we're done!

No. No we're not. This:

\[e^{i\pi}=\lim_{m \to \infty} (1+\tfrac{i\pi}{m})^{m}=-1\]

gives us no real sense of what is going on here. In other words, sure, this formula tells us that raising \(e\) to \(i\pi\) gives you negative one, but it does not tell us WHY.

That's why this section is just called the intermission. This has been a test of all the concepts and tools that we have built up so far. The good news is that they work. So we are doing the right thing. The problem at the moment is this whole 'limit' feature of our definition of \(e\). What we are going to do next is to try to get rid of the whole limit approach to the puzzle altogether.