Now that we have got that fairly abstract stuff about functions and limits out of the way we can finally talk about \(e\). This is another number like \(\pi\) in that it is a constant. \(e\) is always the same number like 2 is two or 5 is five. It does not change like the \(x\)'s we have been looking at.
So what number is it? Well, let's talk about the interest on your bank account. We want to arrive at an exact figure for \(e\), not a multiple of something else, so we will start with a unit value in our bank account. If you remember, a unit is always one. So let's say you have £1 in your bank account. It could be Euros, Dollars, Yen, or Sestertii. What is important is that we have one of them.
Your bank pays you interest on the money that it holds for you. It does this by giving your money to other people to set up businesses or buy houses, and charging them for the service. Because the money was yours it passes on some of the payment for this service to you. In these financially desperate times, of course, what the bank does is charge a large amount to lend out money, give you a tiny amount back and keep the vast majority of the charge for itself. Anyway, what we want to know is how much you will have in your account at the end of a period of investment. Traditionally the period of investment used to compare different methods of saving is one year, so let's use that.
Now, we also need to know what interest rate is going to be applied to your account. Let's say that you have pictures of your bank manager in flagrante delicto with a local farm animal, and you have been given an interest rate of 100%. (In reality, by contrast, an interest rate of one half of one percent would be more likely.)
So how much do you have in your account at the end of the year? The answer seems to be obvious - if you get 100% and you start with £1 you will have your original £1 and 100% of that again (£1) in interest, giving a total of £2 at the end of the year.
Things, however, are not that simple. We have made an assumption that your interest will be calculated and paid at the end of the year. That may not necessarily be the case. Instead, let's consider what happens if you negotiate with the bank to be paid your interest in two lump sums, one halfway through the year and then at the end of the year.
After six months of the year have passed, you will get one half of 100% of the interest on £1. That would be 50% of the total of £1 or 50p. Just what we would have expected. BUT WAIT a moment, because when you come to get your interest for the second six months, you are not earning interest on £1 any more, you now have £1.50 in your account. So you get half of 100% of the interest on £1.50. 100% of that interest would be £1.50 so you get 75p. So at the end of the year you have £1.50 plus 75p, or £2.25.
Hey, there is an extra 25p in interest over and above what you get if the interest is worked out once at the end of the year. Where did that come from? It comes from getting paid interest on the interest that you have already earned. 25p is half of 100% of the interest that you could have got on the 50p, had you started the year with that in your account. That sounds excellent, more money from seemingly nothing. Can you try that trick again?
Let's go back to the bank manager and show him the CCTV footage you have of him attending a very private party with the coach of the local college's ladies' volleyball team. And the whole team. He agrees to calculate your interest payments every month. So at the end of the first month you get one twelfth of 100% of the interest on £1, or about 8p. At the end of the second month you get one twelfth of 100% of the interest on £1.08, or about 9p. At the end of the third month you get one twelfth of 100% of the interest on £1.17 (£1.08 plus 9p), which is nearly 10p. Note that each month the interest you get is a little bit more than what you got the previous month, because each month you get a little bit of interest paid on the interest you earned in the previous month.
So what do you get at the end of the year? If you do all of the adding up, you get a little bit more than £2.61. So that's pretty good. An extra 36p over the 25p we earned by calculating it twice per year. But notice - we are into the territory of diminishing returns. We haven't earned twelve lots of 25p extra. What is happening is that we are squeezing smaller and smaller amounts of interest out each time we calculate. So the increase over calculating it once per year is not that much more.
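If you would rather let a computer do all of the adding up, here is a minimal sketch in Python. The function name `balance_after_year` and its parameters are made up for this illustration. It simply repeats the step described above: at each payment, add an equal share of the annual rate, worked out on whatever is in the account at that moment.

```python
def balance_after_year(payments_per_year, start=1.0, annual_rate=1.0):
    """Simulate one year of interest paid in equal instalments.

    annual_rate=1.0 means 100%; start=1.0 is our unit value of one pound.
    """
    balance = start
    for _ in range(payments_per_year):
        # Each payment is an equal share of the annual rate, applied to
        # the current balance (so interest is earned on earlier interest).
        balance += balance * annual_rate / payments_per_year
    return balance


for n in (1, 2, 12):
    print(n, "payments:", round(balance_after_year(n), 2))
```

With one, two and twelve payments this should give about 2, 2.25 and 2.61, matching the figures worked out by hand.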
Can we work out what is happening here mathematically? Let's go back to payment once a year. How do we mathematically represent payment of 100% of the interest at the end of the year? We need to show that we still have our original £1, and we have been given an extra £1. That would be:
\[1+1=2\]
We could also write that down as:
\[1\cdot 2=2\]
because we end up with twice as many ones as we started with. Adding 100% to something means multiplying that thing by two, because you have your original thing, and you have a 100% copy of it, so you have two of your original thing.
Right. What happens with payment twice a year, after six months and then after twelve? Well, after the first six months you get half of 100% of the annual interest. How do we show that part? Well, we are not doubling our original thing, but we are adding half of it to the original. If you want to add half of something you would multiply it by one and a half. The one shows you are keeping your original, and the half gives you the half. So after six months we get:
\[1\cdot 1\tfrac {1}{2} = 1\tfrac{1}{2}\]
That's right, because we had £1.50 after the first six months. So what happens at the end of the year? Well, we want to take what we had at the halfway point and add half of 100% of the annual interest rate on that amount. Again that means we need to multiply the amount we had after six months by one (to show that that amount stays in our account) and a half (to show we are getting half of 100% of the annual rate). This looks like this:
\[1\tfrac{1}{2}\cdot 1\tfrac{1}{2}=2\tfrac{1}{4}\]
And that works out again, because you had £2.25 at the end of the year where you got interest after each six months. Let's try to work out what is happening. We wrote out the maths for the second interest payment above as the amount you had after six months multiplied by one and a half. But we already know that the amount you had after six months was one and a half times one. Remember that it does not make a difference in what order you multiply things. So we could have written the whole year out as:
\[1\cdot 1\tfrac{1}{2}\cdot 1\tfrac{1}{2}=2\tfrac{1}{4}\]
That works out mathematically because:
\[1\tfrac{1}{2} = \tfrac{3}{2}\]
So (remembering that to multiply two fractions together you just multiply the bottoms together and the tops together):
\[1\cdot 1\tfrac{1}{2}\cdot 1\tfrac{1}{2}=2\tfrac{1}{4}\]
\[1\cdot \tfrac{3}{2}\cdot \tfrac{3}{2}=2\tfrac{1}{4}\]
\[1\cdot \tfrac{9}{4}=2\tfrac{1}{4}\]
\[1\cdot (\tfrac{8}{4}+\tfrac{1}{4})=2\tfrac{1}{4}\]
\[1\cdot (2+\tfrac{1}{4})=2\tfrac{1}{4}\]
\[(2+\tfrac{1}{4})=2\tfrac{1}{4}\]
Anyway, back to the result. We also know how to write one number multiplied by itself don't we? That's just the square of the number, or the number raised to power two. That would make it:
\[1\cdot (1\tfrac{1}{2})^2=2\tfrac{1}{4}\]
Notice that we now have two twos on the left hand side of the equals sign. One is the denominator of the fraction, and the other is the power to which we raise the number in the brackets. We also get our interest paid twice a year. Coincidence? No! The fraction shows how much of the total annual interest we get applied each time, and the power number shows how many times we get paid the interest. So they are always going to be the same. If we get paid interest twice we get half each time. If we get paid four times, we get a quarter each time. And if we get paid monthly we get a twelfth each time. If we write out the sum for monthly interest payments, it looks like this:
\[\begin{multline}
1\cdot 1\tfrac{1}{12}\cdot 1\tfrac{1}{12}\cdot 1\tfrac{1}{12}\cdot 1\tfrac{1}{12}\cdot 1\tfrac{1}{12}\cdot 1\tfrac{1}{12}\cdot 1\tfrac{1}{12}\cdot\\
1\tfrac{1}{12}\cdot 1\tfrac{1}{12}\cdot 1\tfrac{1}{12}\cdot 1\tfrac{1}{12}\cdot 1\tfrac{1}{12}=\tfrac{23298085122481}{8916100448256}
\end{multline}\]
The fraction at the end looks insane because:
\[1\tfrac{1}{12} = \tfrac{13}{12}\]
and when you multiply \(\tfrac{13}{12}\) by itself twelve times you get \(\frac{13^{12}}{12^{12}}\) which is the fraction at the end of that line. The maths works though, because if you divide the number on the top of the fraction by the number on the bottom of the fraction you get about 2.61, and you got £2.61 at the end of the year with monthly payments. That all looks like a hell of a mess though, so let's tidy it up a bit.
\[1\cdot (1\tfrac{1}{12})^{12}=\tfrac{23298085122481}{8916100448256}\]
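If you want to check that insane-looking fraction without a very long piece of paper, Python's standard `fractions` module does exact fraction arithmetic. This is just a sketch for checking the sums; the variable names are my own.

```python
from fractions import Fraction

# One and a half, squared: the twice-a-year case.
half_yearly = (1 + Fraction(1, 2)) ** 2
print(half_yearly)            # the exact fraction for two payments

# One and a twelfth, raised to the power twelve: the monthly case.
monthly_factor = 1 + Fraction(1, 12)   # this is 13/12
year_end = monthly_factor ** 12

print(year_end)               # the exact fraction, top over bottom
print(float(year_end))        # the same number as a decimal
```

The exact result should be the same fraction as above, and its decimal value is a little over 2.61.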
Ladies and gentlemen, we have ourselves a pattern. If we calculate and pay interest \(n\) times a year, then at the end of the year we will have \(1\cdot(1\tfrac{1}{n})^n\) pounds, dollars, euros or whatever. We can describe those instructions as a function, and give it the name \(f\), and the variable in question is \(n\), so:
\[f(n)=1\cdot(1\tfrac{1}{n})^n\]
Which we can tidy up a bit. The 1 at the beginning is just redundant. Look at the examples above - it does nothing. It just hangs about telling you you have one of whatever comes. Remember when we decided to use a unit value - this is why. We can just get rid of the one. If we started with £2 in our account we would need to bring the two in at this point in place of the one, but we are just working in units of one now. So removing the one we get:
\[f(n)=(1\tfrac{1}{n})^n\]
Also as we saw above, one and a half is the same as one plus a half. So to complete the tidying:
\[f(n)=(1+\tfrac{1}{n})^n\]
So we can calculate what we would get if we got paid every day, by sticking in 365 for \(n\):
\[f(365)=(1+\tfrac{1}{365})^{365}\]
\[f(365)\approx 2.71456748\]
The \(\approx\) just means approximately. As we saw, just going to \(n=12\) generates stupidly big fractions which have very long decimal expansions, so I am just rounding off the result. We could also calculate the amount we get if interest was paid every second of every minute of every hour of every day of the year:
\[60\cdot60\cdot24\cdot365=31536000\]
\[f(31536000)=(1+\tfrac{1}{31536000})^{31536000}\]
\[f(31536000)\approx 2.7182818\]
I should point out at this stage that there is nothing special about the choice of a year as the period we are using. It could just as well have been a week or an hour or a decade. The important thing is how many times during that period we pay interest.
As you should hopefully be able to see, the bigger the number that we plug into our function the closer we get to a fixed number. We have gone from \(2 \to 2.25 \to 2.61 \to 2.71456748 \to 2.7182818\). We are getting closer and closer to a number just a little bit bigger than 2.7. If we were paid interest infinitely often during the year, in other words if interest constantly accrued into our account every moment, then the amount we would have at the end of the year would be the number that we are approaching. This number, then, is the limit of the function that we have worked out, which is approached as the size of the number that we plug in gets larger and larger. Remember that we write this as:
\[\lim_{n \to \infty} (1+\tfrac{1}{n})^n\approx2.7182818\]
I still have to use the \(\approx\) symbol because the number just keeps going and going. Just like \(\pi\). Just like \(\pi\) we use a letter instead to represent the number exactly. That letter is \(e\). So:
\[e=\lim_{n \to \infty} (1+\tfrac{1}{n})^n\]
What does that actually mean? It means that if you have an interest rate of 100% and you get paid interest infinitely often during any period, then you will end up with \(e\) times your original balance. It is not a ratio like \(\pi\) is, so we have to reach it by a more circuitous route, but there it is: an actual number represented by an actual symbol.
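To watch the function creep up on \(e\), here is a rough Python sketch. The name `f` matches the function above. Working through logarithms with `math.log1p` is my own choice, made only to keep the computer's rounding errors small when \(n\) is huge; it gives the same number as \((1+\tfrac{1}{n})^n\).

```python
import math


def f(n):
    """(1 + 1/n) ** n, computed via logarithms to limit rounding error."""
    return math.exp(n * math.log1p(1 / n))


for n in (1, 2, 12, 365, 31_536_000):
    gap = math.e - f(n)   # how far short of e we still are
    print(f"{n:>10}  f(n) = {f(n):.7f}  gap to e = {gap:.2e}")
```

The gap should shrink every time \(n\) grows, but it never quite reaches zero for any \(n\) you can actually type in.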
Monday, 27 June 2011
Monday, 20 June 2011
What is a Limit, Part Two?
There is one further thing we need to say about limits before we move on. Instead of making your input number get closer and closer to the number at which everything falls apart, you can also make your input number just get bigger and bigger. This can also help us answer the unanswerable, because you cannot really make your input number infinitely large (you would need an infinitely powerful calculator to work out the answer), but you can say what would happen if you did.
Let's look again at our original function:
\[x+1\]
Let's switch the sign again:
\[x-1\]
Let's add an \(x\) to it:
\[2x-1\]
And let's divide it by \(x\):
\[\frac{2x-1}{x}\]
This time, let's call the function \(g(x)\), so we distinguish it from the last one. What is \(g(1)\)? Well, that is two times one (two) less one (one) divided by one. One divided by one is one. Algebraically:
\[\frac{2\cdot 1 -1}{1}\]
\[\frac{2-1}{1}\]
\[\frac{1}{1}\]
\[1\]
What about \(g(10)\)? That is two times ten (twenty) less one (nineteen) divided by ten. Nineteen divided by ten is one and nine tenths. So:
\[\frac{2\cdot 10 -1}{10}\]
\[\frac{20-1}{10}\]
\[\frac{19}{10}\]
\[1.9\]
Multiplying the input number by 10 has not made a massive difference to the output - it just added on 0.9. The reason why is that while the input number gets larger it appears on both the top AND bottom of the function so it multiplies and divides. Each operation (multiplication and division) just about cancels the other out. So what if we go really big? Let's try \(g(10000)\):
\[\frac{2\cdot 10000 -1}{10000}\]
\[\frac{20000-1}{10000}\]
\[\frac{19999}{10000}\]
\[1.9999\]
So no matter how big the number you stick in, you double it and divide it by itself which nearly cancels one of the copies completely, and would cancel it BUT FOR the one that you take away on the top line. This also works for whatever number you stick in - it does not have to be 1, 10, 100, 1000 or so on, it is just that the results from other numbers look a bit messy. What we are seeing here is that the bigger and bigger the starting number you put in, the closer and closer the output is to two. In mathematical terms we describe this as:
\[\lim_{x \to \infty} \frac{2x-1}{x}=2\]
Which means that as \(x\) gets bigger and bigger without end (heads off towards infinity, \(\infty\)) the output of the function gets closer and closer to two.
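Here is a small Python sketch of the same experiment, using exact fractions so that nothing gets rounded. The name `g` matches the function above; everything else is just for illustration. It prints each output and how far short of two it falls.

```python
from fractions import Fraction


def g(x):
    """The function (2x - 1) / x, evaluated exactly."""
    x = Fraction(x)
    return (2 * x - 1) / x


for x in (1, 10, 10_000, 10**9):
    value = g(x)
    # The shortfall from two turns out to be exactly one over x.
    print(x, float(value), "short of two by", 2 - value)
```

The shortfall is always one divided by the number you put in, which is why bigger inputs land closer to two without ever getting there.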
Monday, 13 June 2011
What is a Limit, Part One?
The limit of something is the maximum amount of something you can have. It can be natural or artificial. There is usually a drink drive limit, which is an artificial level of alcohol set by the state, that you are allowed to have in your system and legally drive. So if you have more alcohol than that you are known as being over the limit. Then there are natural limits, such as the limit of human hearing - dogs can hear sounds beyond that limit.
In maths limits are important because they help us get the answers to questions that we cannot ask. If that sounds odd, it is supposed to. There are some questions that we want to ask, but if we do we get nonsensical answers. As an example, let's build a function. Let's start with the function we used when talking about functions the first time:
\[x+1\]
Now, let's change the sign from addition to subtraction:
\[x-1\]
Let's divide it by itself:
\[\frac{x-1}{x-1}\]
And finally let's multiply the \(x\) on top by itself once.
\[\frac{x^2-1}{x-1}\]
Right. Let's call this \(f(x)\). What is \(f(2)\)? Well, it is two times itself (four) less one (three) divided by two less one (one), so three divided by one. Or in algebra:
\[\frac{2^2-1}{2-1}\]
\[\frac{4-1}{2-1}\]
\[\frac{3}{1}\]
\[3\]
What about \(f(5)\)? Well...
\[\frac{5^2-1}{5-1}\]
\[\frac{25-1}{5-1}\]
\[\frac{24}{4}\]
\[6\]
Looking at this there seems to be an obvious connection between \(x\) and \(f(x)\). The function just seems to add one to \(x\). 3=2+1, and 6=5+1. We started with \(x+1\), and it appears we have got back there! So \(f(1)\) should be two. Let's see if it is:
\[\frac{1^2-1}{1-1}\]
\[\frac{1-1}{1-1}\]
\[\frac{0}{0}\]
Hmm. Zero divided by zero. That's just a mess. Is that infinity or zero? It is meaningless. It does not work because as soon as \(x\) is one, you get zeros everywhere because you are subtracting one from one on the top and bottom. So \(f(1)\) is special because of the way the function is built.
How does the concept of a limit help us to get an answer to \(f(1)\)? Well, we could ask what \(f(0.5)\) and \(f(1.5)\) are. The answers are 1.5 and 2.5 respectively. That matches what we thought - the function ends up adding one to your starting number. It also supports our guess for \(f(1)\) before we tried it, which was two. Let's get a bit closer to one, and try again this time with \(f(0.9)\) and \(f(1.1)\). The results are 1.9 and 2.1 respectively. Again, just what we would expect. The closer the number we plug in gets to one, the closer the result gets to two. If we try with something silly like \(f(0.99999)\) we get 1.99999. That pattern is going to repeat for as many nines as we add onto the end of the number we plug in.
The way we answer this question is to say that as the number we plug into the function gets closer and closer to one, the result of the function gets closer and closer to two, but because it breaks when we put in exactly one, we never get to exactly two. That's what a limit is. In mathematical script it looks like this:
\[\lim_{x \rightarrow 1} \frac{x^2-1}{x-1}=2\]
Spoken out loud that says the limit of the function (blah blah describe the function blah blah) as ecks approaches one is two.
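You can try the creeping-up-on-one experiment yourself with a few lines of Python. This is only a sketch: the name `f` matches the function above, and exact fractions are used so that the outputs are not blurred by rounding. Note what happens when we ask for exactly one.

```python
from fractions import Fraction


def f(x):
    """The function (x^2 - 1) / (x - 1), evaluated exactly."""
    x = Fraction(x)
    return (x * x - 1) / (x - 1)


# Inputs on either side of one, getting closer each time.
for text in ("0.5", "0.9", "0.99999", "1.00001", "1.1", "1.5"):
    print(text, "->", float(f(text)))

# Exactly one breaks the function: zero divided by zero.
try:
    f(1)
except ZeroDivisionError:
    print("f(1) is undefined: zero divided by zero")
```

Every input near one gives an output near two, but the function itself refuses to answer at one. The limit is the number the outputs are closing in on.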
Monday, 6 June 2011
Irrationality of the Square Root of Four?
OK, last time we proved that the square root of two is not one whole number divided by another whole number, because we could continue dividing for ever, and we know that we cannot do that because we would eventually run out of numbers. It does all sound a bit fishy though. Is this not just some trick of the maths? To double check, let's try the same process, but with the square root of four instead. The square root of four is easy - it is just two, because two multiplied by two is four.
Anyway, we are assuming that the square root of four can be written as one number over another number. So let's use the same letters as last time.
\[\frac{a}{b}=\sqrt{4}\]
Excellent, again. Some fraction equals the square root of four. That's the assumption that we want to make. Nothing wrong so far. Let's now do a mathematical operation on this equation.
What we want to do is get the right hand side to just be four. This means that we need to multiply it by itself to get four. Obviously the square root of four multiplied by the square root of four is four - because that's where it came from. Let's skip the not equals sign stuff this time around, you get the idea.
\[\frac{a^2}{b^2}=4\]
And again, we know that we can multiply both sides by \(b^2\) to get:
\[a^2=4\cdot b^2\]
Now at this point can we say that \(a\) is an even number? Yes, we can run through all of our odds and evens arguments and still say that, because four is just two multiplied by two, so the left hand side is still two times some other number (this time that other number is \(2\cdot b^2\)). So, writing \(a\) as \(2\cdot c\), that gives us:
\[(2\cdot c)^2=4\cdot b^2\]
And we can square out the contents of the brackets on the left hand side:
\[2\cdot 2\cdot c\cdot c=4\cdot b^2\]
\[2\cdot 2\cdot c^2=4\cdot b^2\]
\[4\cdot c^2=4\cdot b^2\]
OK. Last time when hunting for the square root of two we had this:
\[4\cdot c^2=2\cdot b^2\]
What is the difference here? The right hand side is multiplied by the SAME NUMBER as the left hand side - four. What can we gather from this? Well we can divide both sides by four to get this:
\[c^2 = b^2\]
And we can square root each side to get this:
\[c = b\]
So we now know that \(c\) is the same as \(b\). We also know that \(a\) is the same as \(2\cdot c\). So we can now convert both \(a\) and \(b\) into their equivalents of \(c\) and stick those back in the very first equation:
\[\frac{2c}{c}=\sqrt{4}\]
And from our arguments last time we also know that is the same as:
\[2\cdot \frac{c}{c}=\sqrt{4}\]
\[2\cdot 1=\sqrt{4}\]
\[2=\sqrt{4}\]
So we have established that two is the square root of four. Which it is. So our assumption this time, that the square root of four can be written as one number divided by another number IS correct. In fact we know that it is always correct as long as the top number is the bottom number multiplied by two.
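As a sanity check on both results, here is a brute-force Python sketch that hunts for whole numbers \(a\) and \(b\) whose squares satisfy \(a^2 = 4\cdot b^2\), and then does the same with two in place of four. The function name and the search limit of 50 are my own choices for illustration. Bear in mind that a search like this can only find examples; failing to find any is not a proof.

```python
def fractions_squaring_to(target, limit=50):
    """Return every (a, b) with 1 <= b <= limit and a*a == target*b*b."""
    found = []
    for b in range(1, limit + 1):
        # a/b cannot exceed target here, so a only needs to go up to target*b.
        for a in range(1, target * b + 1):
            if a * a == target * b * b:
                found.append((a, b))
    return found


print(fractions_squaring_to(4)[:5])   # first few fractions that square to four
print(fractions_squaring_to(2))       # fractions that square to two, if any
```

For four, every pair found has the top number equal to twice the bottom number. For two, the search comes back empty, which is what last time's argument said it had to.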
Monday, 30 May 2011
Irrationality of the Square Root of Two
Now we have learned a little bit about fractions and equations, we can try and satisfy ourselves that the square root of two cannot be written as a whole number divided by another whole number (a rational number). This is a small detour on the way to \(e\), but I did promise to deal with it. We should now have all the tools we need to work it out. The way we are going to do this is to simply assume that you can actually write the square root of two as one whole number divided by another, do some logic, and if something breaks then it means that our assumption was wrong. This is called proof by contradiction.
You can use proof by contradiction in your own life as well. When you wake up in the morning, and you can't remember if it is a work (or school) day or not, just assume it is the weekend, and go back to sleep. If nobody wakes you from your slumber, your assumption was correct. If your boss (or parent, or teacher), starts shouting at you to get up, your assumption was wrong, and you have to get up. This is an easy way of proving something by not doing very much, and waiting to see if it all goes tits up. It's the lazy approach to proof.
So, let's assume that the square root of two is actually a rational number. What would that look like? Well, it would be one number divided by another number. We don't know what those two numbers would be, so let's replace them with symbols in the meantime. Let's use 'a' for the top number and 'b' for the bottom number. So it would look like this:
\[\frac{a}{b}\]
Good start. Let's bring in the equals sign and the square root of two so we can see exactly what we are talking about:
\[\frac{a}{b}=\sqrt{2}\]
Excellent. Some fraction equals the square root of two. That's the assumption that we want to make. Nothing wrong so far. Let's now do a mathematical operation on this equation.
What we want to do is get the right hand side to just be two. This means that we need to multiply it by itself to get two. Obviously the square root of two multiplied by the square root of two is two - because that's where it came from. This looks like this:
\[\frac{a}{b}\neq 2\]
Notice that we have had to change the equals sign to a not equals sign. This is because we did something to the right hand side without doing the same to the left hand side. Now, we could just multiply the left hand side by \(\sqrt{2}\) as well, but that's not going to help us, because we want to get rid of the square root. So, instead let's multiply the left hand side by 'a divided by b' (because we have said that a divided by b is the same as \(\sqrt{2}\) and that's what we multiplied the right hand side by). If we do the same thing to both sides, we still get to use the equals sign. So:
\[\frac{a}{b}\cdot\frac{a}{b}=2\]
We also know what to do to multiply fractions together, so we get:
\[\frac{a\cdot a}{b\cdot b}=2\]
This is the same as:
\[\frac{a^2}{b^2}=2\]
For those last two steps, we were just rearranging the left hand side, not performing an operation on it, so we get to keep our equals sign. Now, we want to have only \(a^2\) on the left hand side. How do we achieve that? Well, we multiply the left hand side by \(b^2\). That looks like this:
\[b^2\cdot \frac{a^2}{b^2}\neq2\]
We have had to bring in the not equals sign again, so let's get the equals sign back by multiplying the right hand side by \(b^2\) as well:
\[b^2\cdot \frac{a^2}{b^2}=2\cdot b^2\]
Now we know that something times a fraction is just something times the top number of the fraction, leaving the bottom number untouched (two times one third equals two thirds). So the left hand side above is the same as:
\[\frac{b^2\cdot a^2}{b^2}=2\cdot b^2\]
It is important to note that, again, we are not actually doing anything to the left hand side just now. The number it equals stays the same. We are just rearranging the way we write the number. This means we do not lose our equals sign. Now, remember it does not matter in which order you multiply things, so \(b^2\cdot \frac{a^2}{b^2} =\frac{b^2\cdot a^2}{b^2} = \frac{a^2\cdot b^2}{b^2}=a^2\cdot\frac{b^2}{b^2}\). This all means that we can just take the \(a^2\) out on its own leaving:
\[\frac{b^2}{b^2}\cdot a^2=2\cdot b^2\]
Now what is any number divided by itself? One. So we get:
\[1\cdot a^2=2\cdot b^2\]
Which is the same as:
\[a^2=2\cdot b^2\]
Remember, it is worth mentioning again, that since multiplying both sides by \(b^2\) we have not actually changed the numbers at all - we have just rearranged the way they are written down. So in all the changes on the left hand side since we got our equals sign back again, we have not lost it. Now, what does that number above tell us? It tells us that \(a\) multiplied by itself is an even number. Why can we say that? Well, an even number is a number that can be divided by two. And look at the equation: \(a^2\) is two times some other number. So \(a^2\) divided by two IS that other number. So \(a^2\) CAN be divided by two, so \(a^2\) is even. So if \(a^2\) is even, can we draw any conclusions about \(a\)?
Let's think about this. An even number is any number that can be divided by two. That's the same as saying that any even number is some other number (the number you get when you divide your even number by two) MULTIPLIED by two. So let's call this number that you multiply by two to get an even number \(n\). Then any even number can be written as \(2\cdot n\) or \(2n\). If we then square that even number we get \(2\cdot n\cdot 2\cdot n\). Again remember that it doesn't matter what order we multiply things in. So that expression is the same as \(2\cdot 2\cdot n^2\). So, no matter what starting number \(n\) we choose, and in fact no matter whether the square of that number is odd or even, once we get the square of it we multiply it by two and then two again. Because we multiply by two to get the final number, what this tells us is that the final number has to be divisible by two. In turn this tells us that the square of any even number can be divided by two. And therefore that the square of any even number is an even number itself.
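If you would rather watch this happen than take my word for it, here is a small sketch in Python (my choice of language, and the name `square_of_even` is made up for the illustration). It only spot checks the first thousand values of \(n\), so it is a sanity check on the argument above, not a replacement for it.

```python
def square_of_even(n):
    """Build the even number 2n and multiply it by itself."""
    even = 2 * n
    return even * even


for n in range(1000):
    square = square_of_even(n)
    # Reordering the multiplication: 2n * 2n is the same as 2 * 2 * n * n.
    assert square == 2 * 2 * n * n
    # Dividing by two leaves no remainder, so the square is even.
    assert square % 2 == 0
```

If either of those `assert` lines were ever false, Python would stop and complain. It doesn't.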
OK. But what about the square root of an even number (which is what we are considering). Just because every even number squares to an even number, it does not necessarily follow that the square root of an even number has to be even. That's like saying that because all sheep are fluffy white things, every fluffy white thing is a sheep. Which would be baaad news for clouds. Sorry.
Let's consider the possibility that an odd number could be the square root of an even number. Every number is capable of being multiplied by two, that's simple. So once you have multiplied every number by two, you end up with a list of all the even numbers. There is a gap between each even number which is exactly one number wide. That's because your original list which you multiplied had no gaps. If you multiplied all the original numbers by three, the gap would be two numbers wide. So between every even number there is a gap of exactly one number, and that number is an odd number, because it does not appear on the list of evens, so it cannot be divided by two. So for every even number if you add one, you move into that gap and land on an odd number. So, every odd number is an even number plus one. We have already agreed that every even number is \(2n\), so every odd number is \(2n+1\).
Good show. But what happens when we square an odd number? We get \((2n+1)(2n+1)\). You remember how you get rid of brackets don't you? We went through that whole tortuous metaphor about the greengrocer with OCD. Once we have multiplied away the brackets we get:
\[2n\cdot 2n = 2\cdot 2\cdot n\cdot n = 4\cdot n^2\]
\[2n\cdot 1 = 2n\]
\[1\cdot 2n = 2n\]
\[1\cdot 1 = 1\]
And added all together those results look like this:
\[4n^2 + 2n + 2n +1\]
The \(n\)'s are the same thing, so we can just add them together:
\[4n^2 + 4n +1\]
Now we have added up everything, we can simplify it a bit by spotting that we have two things multiplied by four, so we can stick those things in a bracket and multiply the bracket by four:
\[4(n^2 + n)+1\]
Now, what can we tell from this? Well no matter what \(n^2+n\) actually is, we multiply it by four, which is two multiplied by two. Remember that as long as something is multiplied by two, the result can be divided by two, meaning that the result must be even. So no matter what goes on inside the brackets, when it gets multiplied by four it becomes even. What happens then? WE ADD ONE. We have already worked out that any even number plus one is an odd number. So no matter what \(n\) we choose to give us our original odd number, the square of that odd number will be odd because it is an even number plus one.
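The same sort of spot check works for the odd numbers. This is a sketch in Python, with `square_of_odd` being a name I have invented for the purpose. It confirms, for the first thousand values of \(n\), that \((2n+1)^2\) really does come out as \(4(n^2+n)+1\), and that the result is odd.

```python
def square_of_odd(n):
    """Build the odd number 2n + 1 and multiply it by itself."""
    odd = 2 * n + 1
    return odd * odd


for n in range(1000):
    square = square_of_odd(n)
    # The bracketed form we arrived at: four times something, plus one.
    assert square == 4 * (n * n + n) + 1
    # Dividing by two leaves a remainder of one, so the square is odd.
    assert square % 2 == 1
```

Again, a check over a thousand cases is not a proof. The algebra is the proof; this just shows it behaving itself.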
So we have now proved that any even number squared produces an even number, and any odd number squared produces an odd number. Now, you have to take one final thing on trust, because I have no proof for it. Every integer is either odd or even. There is no third type. (Strictly speaking zero counts as even and one counts as odd, but we will check those two separately in a moment just to be safe.) And remember that because we are talking about the square root of two being one whole number divided by another whole number, all we care about are whole numbers.
So what does this tell us about \(a\)? Well we know that \(a^2\) is even (because it is some other number multiplied by two so it has to be divisible by two). If \(a^2\) is even can \(a\) be odd? No, because we just proved that any odd number squared is also an odd number. Can it be zero? No, because then the left hand side of the equation would be zero immediately, and the square root of two is not zero. Can it be one? No, because remember we are dealing with two whole numbers and 1 is not two times another whole number, one is two times a half. So if \(a\) is not odd, is not zero, is not one, it can only be even (because I have asked you to take as a given that there is no other option).
So if \(a\) is even, it is divisible by two, meaning that there is some other number that, if multiplied by two, gives \(a\). Let's call this other number \(c\). Important point, \(c\) is half of \(a\) so is smaller than \(a\). So we can now rewrite our equation by replacing \(a\) with two multiplied by \(c\). Remember the equation is:
\[a^2=2\cdot b^2\]
Replacing \(a\) with \(2\cdot c\) gives us:
\[(2\cdot c)^2=2\cdot b^2\]
Or:
\[2\cdot 2\cdot c\cdot c=2\cdot b^2\]
\[2\cdot 2\cdot c^2=2\cdot b^2\]
\[4\cdot c^2=2\cdot b^2\]
Notice that we have a two on the right hand side of the formula. We can divide that side by two to get rid of the two, as long as we divide the left hand side by two as well. If we do so, we get:
\[2\cdot c^2=b^2\]
Reverse the two sides and we get:
\[b^2=2\cdot c^2\]
What is the actual practical difference between that statement and the one we had before the substitution:
\[a^2=2\cdot b^2\]
The only difference is that we are dealing now with \(b\) and \(c\). We can still follow exactly the same logical steps with these two symbols as we did with \(a\) and \(b\). What will we end up with?
\[c^2=2\cdot d^2\]
with \(d\) equal to one half of \(b\). We could then go back to the beginning and end up with \(e\) and \(f\) with \(f\) half of \(d\) which is half of \(b\).
How long can we keep on doing this? How long can we keep running through this series of steps, getting new symbols each time? You may say, until we get to \(z\), but then we could move on to Greek letters, or hieroglyphs, or any other symbols you care to imagine, like pictures of clouds or puppies. Trust me we have plenty of symbols. The answer in maths is that once you have done something once, you can always do it again and again. I can add one to two to get three five hundred million times, and on the five hundred millionth and first time it is not going to suddenly be four. So we HAVE to be able to keep halving our variables each time we run through, progressively getting smaller and smaller numbers.
But can we actually do that? Can we keep taking our original numbers and cut them in half forever and ever and never end up with a number less than one? Try it. Go on. Pick a number, any number. Cut it in half. Cut it in half again, and again, and do it forever. It will eventually get to a number that when you cut it in half it does not produce a whole number - the very last number, the finish line if you will, is the number one. If you hit that, you cut it in half and are left with a half.
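If you can't be bothered doing the halving by hand, here is a sketch in Python that does it for you. The function name `halvings_until_stuck` is my own invention. It keeps cutting a positive whole number exactly in half and stops the moment another cut would leave a fraction. Note that it stops at the first odd number it meets, which is the number one at the very latest.

```python
def halvings_until_stuck(number):
    """Cut a positive whole number in half until the next cut would not be whole.

    Returns how many cuts were possible and the number we got stuck on.
    """
    if number < 1:
        raise ValueError("pick a positive whole number")
    cuts = 0
    while number % 2 == 0:
        number //= 2
        cuts += 1
    return cuts, number


print(halvings_until_stuck(48))    # (4, 3): 48, 24, 12, 6, then stuck on 3
print(halvings_until_stuck(1024))  # (10, 1): ten cuts, then stuck on 1
```

However big a number you feed it, the loop always finishes, because each cut makes the number smaller and it can never drop below one.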
(You may say, if you are being a smart arse, "A ha! I picked infinity, and I can keep cutting that in half forever." Tough. You can't do that. For a start it isn't really a number, but a concept. And secondly, you would need to have infinity on both the top AND bottom of the fraction. And what is any number divided by itself? One. And one multiplied by one is one, not two.)
So, you cannot keep reducing these starting numbers forever and ever, because you will eventually cut one into non-whole number sized pieces. But our original assumption, that the square root of two can be written as one whole number over another, says that we have to be able to do just that. What is our conclusion? Our conclusion is that something has gone tits up, and broken. The office has phoned, and it's a work day, not the weekend. Our original assumption was false. Hence, the square root of two cannot be written as one whole number divided by another.
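For the sceptics, here is a brute force hunt in Python for whole numbers \(a\) and \(b\) that satisfy \(a^2=2\cdot b^2\). The function name `find_fraction_for_root_two` and the search limit are my own choices, and `math.isqrt` is the standard library's whole number square root. A search like this can only ever cover finitely many values of \(b\), so it backs up the proof rather than replacing it.

```python
from math import isqrt


def find_fraction_for_root_two(max_b):
    """Look for whole numbers a and b (with 1 <= b <= max_b) where a*a == 2*b*b."""
    for b in range(1, max_b + 1):
        target = 2 * b * b
        a = isqrt(target)  # largest whole number whose square is <= target
        if a * a == target:
            return a, b
    return None


print(find_fraction_for_root_two(100000))  # prints None
```

It comes back empty handed, which is exactly what the proof says it must do no matter how high you set the limit.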
Monday, 23 May 2011
Functions & Algebra
OK, before we can progress to look at \(e\) (and to the joys of proving the irrationality of the square root of two) we need to stop for a minute to think about functions. This is yet another fairly basic element of mathematics that was glossed over completely in my formal maths education.
So what is a function? In a basic sense, a function is a list of mathematical operations that you carry out on a variable. Hmm. I think what this scenario requires is an ill thought out analogy. Tradition dictates that it should involve some sort of hot drink.
So let's have our hot drink function. It would state:
1. Boil Kettle
2. Put boiled water in mug
3. Put drink flavouring in mug
4. Stir mug
Those instructions are good enough to make coffee, tea, hot orange, or a cup-a-soup. The only bit of the instructions that would have to be changed is the "drink flavouring" bit. That could be instant coffee for coffee, a tea bag for tea, some orange squash for hot orange, or a sachet of cup-a-soup powder for the soup. All we have to do is change that bit, and follow the rest of the instructions to the letter, and we get a different drink at the end. The "drink flavouring" bit is the VARIABLE, meaning that it is the bit which can change, if we want to change the hot drink we end up with.
Once we have listed all these instructions once, it would be dull to write them out every time. So instead we use a shorthand system. Let's give the group of instructions a name: "hotbeverage". To show that the outcome of the hotbeverage instructions depends on the drink flavouring variable, we put that in brackets after the name: hotbeverage(drink flavouring). If you were reading the notation aloud, you would say "hotbeverage of drink flavouring". Which makes some sense.
That notation means that you can change what goes in the brackets (tea bag, soup powder etc) and you STILL FOLLOW exactly the same instructions, and you end up with something else. If you decided that you were going to make hot orange, you would write that as hotbeverage(orangesquash). You have replaced the placeholder with the ingredient. Someone reading that would know that you meant "follow the list of instructions that I have called hotbeverage replacing the variable 'drink flavouring' with orangesquash".
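As it happens, programming languages write functions in almost exactly this way. Here is the hot drink function as a sketch in Python (my choice of language; the underscore in `drink_flavouring` is only there because Python does not allow spaces in names). The four steps are the ones listed above, with the variable slotted into step three.

```python
def hotbeverage(drink_flavouring):
    """Return the list of instructions, with the flavouring filled in."""
    return [
        "Boil kettle",
        "Put boiled water in mug",
        f"Put {drink_flavouring} in mug",
        "Stir mug",
    ]


# Same instructions every time; only the bit in the brackets changes.
for step in hotbeverage("orange squash"):
    print(step)
```

Swap `"orange squash"` for `"tea bag"` and you get tea, without rewriting a single instruction.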
Now, let's bring some algebra into this. First question, what the fuck is algebra? Answer, a very useful system of mathematical thinking that Muslims came up with in the middle ages, based on earlier work of Hindus, all at a time when western society was trying to work out how the Romans had built the bloody aqueducts. You can tell the word has Arabic origins because of the 'al' - much like alchemy, and algorithm.
So what actually is it? Well, in the example above we had a variable - something that we could change to change the outcome of the function. Algebra is a way of working with these variables. Sometimes they are referred to as 'unknowns' which is coming at the thing from a different angle. Before we had the concept of algebra, mathematical thinking was really done using geometry. So if you saw an ancient mathematician scribbling away on papyrus, or a sand board, they would be drawing lines and circles and so on, NOT the kind of symbols and operations that we use these days. The difference that algebra brought to the table, so to speak, was the ability to think more abstractly.
The Muslims made the leap from lines and circles to abstract ideas, but they could only express those ideas in long wordy sentences. Such as "if you take the third part of the first party raised to the second power and then find the root of the difference between that and the fifth part of the....". You get the drift. Much later than the Muslims, Western Europeans started using letters of the alphabet in place of the unknowns. They also started using the symbols we have already looked at to show what operations you were doing to numbers or these unknowns. So instead of drawings, or long wordy sentences, we finally had the numbers, letters and symbols that we now recognise as 'algebra'.
Tradition has it that we use letters from the end of the alphabet to represent these variables or unknowns, starting with \(x\). We have already used a letter from the Greek alphabet to represent a number - \(\pi\). Is \(x\) the same? No. \(\pi\) is always the same number a little bit more than 3, roughly \(\tfrac{22}{7}\). It does not vary. We just use the letter symbol, because we would never finish writing out the number otherwise. \(\pi\) does not vary - it remains constant, and so we call it a ... constant. It is like the symbol for two, '2', or five, '5'. Those symbols always mean two or five. In the same way \(\pi\) always means just a little bit more than three.
So we normally use \(x\) to represent the first variable, and then \(y\) and \(z\) if there are more variables. In other subjects where algebra is being used, you find different letters, or even other Greek letters. Physics is bloody littered with different letters ('s' usually stands for a variable which is the speed of something, 'd' for distance, 'v' for velocity (not the same as speed but never mind that now) and so on). So, a typical algebraic equation would look like this:
\[x+1=3\]
We are now supposed to follow some formal algebraic rules to get a statement that starts with:
\[x=\]
The bit on the right side of the = tells us what \(x\) actually is in this example. In this case, the rules we follow are to deduct 1 from each side of the equation:
\[x=2\]
We now know what \(x\) is. We can plug 2 into the place of \(x\) in the first equation:
\[2+1=3\]
Yep, that is correct. Richard Feynman explained algebra best when he said that it is just a puzzle game, where the goal is to find out what \(x\) is! You can look back at the original statement \(x+1=3\) and rephrase it as the question, "what number, if you add one to it, makes three?" Then it is just a puzzle which is easily solved.
That was an example of an algebraic equation. That English language question I translated it into was how algebra was done before the symbols were invented. Could ancient mathematicians have solved this using their lines and circles? Yes. What you do is draw a long line. And then take a compass set to a specific, and completely arbitrary, width. Stick the point anywhere along the long line. Draw a circle. The circle will cross the line at two points, the same distance from the pointy end of the compass. You now have three points, the original bit where you stuck the compass in, and two where the circle crosses the line. Can we answer the question what plus one is three? No, because we only have two identical line segments.
So now stick the compass pointy bit on either of the new points and (without changing the width of the compass) draw another circle. One crossing point will be the very first pointy bit, and the second will be a new point on the line. We now have four points and three identical line segments. In this step we added one identical line segment, so the answer to our question is the answer to "how many line segments did we have before we added this third one?", and the answer, as we saw a second ago, was "two".
It should be obvious, but it bears repeating. Whenever you hear of an ancient mathematical proof or theorem, and you think "that's a doddle, I could have done that, and I am an idiot", remember they didn't have symbolic algebra. They had tools to draw circles, and tools to draw straight lines. That was it. And they still managed to come up with good stuff. Geniuses.
Anyway, we started to talk about functions. We have seen a function written out in normal language, so what does a function look like in symbolic algebra? Well, we have already seen one of those as well. A function is basically just the side of the equation that \(x\) is on. So in the above example, the function is:
\[x+1\]
The instructions we follow are really just to add 1 to our variable. We need to give our function a name, and traditionally we give it a name one letter long. We start using the letter \(f\), for the first function we use. If we need to use more than one to solve a problem, we give the next one the name \(g\). And so on. It would be terribly complex if we chose letters nearer \(xyz\) because you would start to get confused between the names of functions and the names of variables. That would be bad.
There is nothing special about the letters chosen, by the way. A variable called \(x\) is not one greater or less than a variable called \(y\). We could just as well be using pictures of clouds, buses, or lawnmowers. It is nothing more than a marker. Same thing applies to the names chosen for functions. I said earlier that we put the variable in brackets after the name of the function. So we end up with:
\[f(x)\]
That means the function is called \(f\) and it involves a variable called \(x\). We know what the function looks like, so we can describe the whole thing as:
\[f(x)=x+1\]
We can then replace the \(x\) with two like this:
\[f(2)=2+1\]
\[f(2)=3\]
We can now talk about \(f(x)\) rather than repeating all the steps every time. This is not too onerous with \(x+1\), but we will see some more insane functions in due course. Much like the beverage example above you would say that this is "eff of ecks".
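For comparison, here is the same function written as a sketch in Python (my choice of language, not anything the maths depends on). The name \(f\) and the variable \(x\) carry over unchanged.

```python
def f(x):
    """The whole list of instructions: add one to the variable."""
    return x + 1


print(f(2))  # prints 3
```

Writing `f(2)` swaps the 2 in for \(x\), exactly as we did by hand above.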
The important difference here is that an equation has an equals sign in it, whereas a function does not. OK, pedant, yes the example just about has an equals sign, but that is just telling you what the function is. The actual function (the bit on the right of the equals sign) only has a variable, the sign for addition, and a number. It has no equals sign. An equation, on the other hand, tells you that two different looking things are equivalent to each other, whereas a function is a list of things that you do to \(x\), say, to get a result.
So what is a function? In a basic sense, a function is a list of mathematical operations that you carry out on a variable. Hmm. I think what this scenario requires is an ill thought out analogy. Tradition dictates that it should involve some sort of hot drink.
So lets have our hot drink function. It would state:
1. Boil Kettle
2. Put boiled water in mug
3. Put drink flavouring in mug.
4. Stir mug
Those instructions are good enough to make coffee, tea, hot orange, or a cup-a-soup. The only bit of the instructions that would have to be changed is the "drink flavouring" bit. That could be instant coffee for coffee, a tea bag for tea, some orange squash for hot orange, or a sachet of cup-a-soup powder for the soup. All we have to do is change that bit, and follow the rest of the instructions to the letter, and we get a different drink at the end. The "drink flavouring" bit is the VARIABLE, meaning that it is the bit which can change, if we want to change the hot drink we end up with.
Once we have listed all these instructions once, it would be dull to write them out every time. So instead we use a shorthand system. Lets give the group of instructions a name: "hotbeverage". To show that the outcome of the hotbeverage instructions depends on the drink flavouring variable, we put that in brackets after the name: hotbeverage(drink flavouring). If you were reading the notation aloud, you would say "hotbeverage of drink flavouring". Which makes some sense.
That notation means that you can change what goes in the brackets (tea bag, soup powder etc) and you STILL FOLLOW exactly the same instructions, and you end up with something else. If you decided that you were going to make hot orange, you would write that as hotbeverage(orangesquash). You have replaced the placeholder with the ingredient. Someone reading that would know that you meant "follow the list of instructions that I have called hotbeverage replacing the variable 'drink flavouring' with orangesquash".
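If you happen to know a little programming, the hot drink analogy maps almost word for word onto a function in code. What follows is only a rough sketch in Python: the name `hotbeverage` comes from the text above, but the parameter name `drink_flavouring` and the idea of handing back the steps as a list are my own assumptions for illustration.

```python
def hotbeverage(drink_flavouring):
    """Return the same four instructions, with the variable filled in."""
    return [
        "Boil kettle",
        "Put boiled water in mug",
        f"Put {drink_flavouring} in mug",  # the only step that varies
        "Stir mug",
    ]


# "hotbeverage of orange squash": same instructions, different drink.
for step in hotbeverage("orange squash"):
    print(step)
```

Swap "orange squash" for "tea bag" and only the third instruction changes, which is the whole point of the variable.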
Now, let's bring some algebra into this. First question, what the fuck is algebra? Answer, a very useful system of mathematical thinking that Muslims came up with in the Middle Ages, based on earlier work of Hindus, all at a time when western society was trying to work out how the Romans had built the bloody aqueducts. You can tell the word has Arabic origins because of the 'al' - much like alchemy, and algorithm.
So what actually is it? Well, in the example above we had a variable - something that we could change to change the outcome of the function. Algebra is a way of working with these variables. Sometimes they are referred to as 'unknowns', which is coming at the thing from a different angle. Before we had the concept of algebra, mathematical thinking was really done using geometry. So if you saw an ancient mathematician scribbling away on papyrus, or a sand board, they would be drawing lines and circles and so on, NOT the kind of symbols and operations that we use these days. The difference that algebra brought to the table, so to speak, was the ability to think more abstractly.
The Muslims made the leap from lines and circles to abstract ideas, but they could only express those ideas in long wordy sentences. Such as "if you take the third part of the first party raised to the second power and then find the root of the difference between that and the fifth part of the....". You get the drift. Much later than the Muslims, Western Europeans started using letters of the alphabet in place of the unknowns. They also started using the symbols we have already looked at to show what operations you were doing to numbers or these unknowns. So instead of drawings, or long wordy sentences, we finally had the numbers, letters and symbols that we now recognise as 'algebra'.
Tradition has it that we use letters from the end of the alphabet to represent these variables or unknowns, starting with \(x\). We have already used a letter from the Greek alphabet to represent a number - \(\pi\). Is \(x\) the same? No. \(\pi\) is always the same number, a little bit more than 3, roughly \(\tfrac{22}{7}\). It does not vary. We just use the letter symbol, because we would never finish writing out the number otherwise. \(\pi\) does not vary - it remains constant, and so we call it a ... constant. It is like the symbol for two, '2', or five, '5'. Those symbols always mean two or five. In the same way \(\pi\) always means just a little bit more than three.
So we normally use \(x\) to represent the first variable, and then \(y\) and \(z\) if there are more variables. In other subjects where algebra is being used, you find different letters, or even other Greek letters. Physics is bloody littered with different letters ('s' usually stands for a variable which is the speed of something, 'd' for distance, 'v' for velocity (not the same as speed but never mind that now) and so on). So, a typical algebraic equation would look like this:
\[x+1=3\]
We are now supposed to follow some formal algebraic rules to get a statement that starts with:
\[x=\]
The bit on the right side of the = tells us what \(x\) actually is in this example. In this case, the rules we follow are to deduct 1 from each side of the equation:
\[x=2\]
We now know what \(x\) is. We can plug 2 into the place of \(x\) in the first equation:
\[2+1=3\]
Yep, that is correct. Richard Feynman explained algebra best when he said that it is just a puzzle game, where the goal is to find out what \(x\) is! You can look back at the original statement \(x+1=3\) and rephrase it as the question, "what number, if you add one to it, makes three?" Then it is just a puzzle which is easily solved.
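To make the puzzle idea concrete, here is a small sketch in Python that plays the game both ways for \(x+1=3\): first by dumbly trying numbers until one fits, then by the algebraic rule of deducting 1 from each side. The function names and the range of candidates tried are my own inventions for illustration, nothing more.

```python
def solve_by_guessing(added, total, candidates):
    """Try each candidate x until x + added equals total."""
    for x in candidates:
        if x + added == total:
            return x
    return None  # no candidate solved the puzzle


def solve_by_algebra(added, total):
    """Deduct the same amount from each side: x = total - added."""
    return total - added


# "What number, if you add one to it, makes three?"
print(solve_by_guessing(1, 3, range(10)))  # 2
print(solve_by_algebra(1, 3))              # 2
```

Guessing works for a puzzle this small, but the algebraic rule gets you there in one step no matter how big the numbers are.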
That was an example of an algebraic equation. That English language question I translated it into was how algebra was done before the symbols were invented. Could ancient mathematicians have solved this using their lines and circles? Yes. What you do is draw a long line. And then take a compass set to a specific, and completely arbitrary, width. Stick the point anywhere along the long line. Draw a circle. The circle will cross the line at two points, the same distance from the pointy end of the compass. You now have three points, the original bit where you stuck the compass in, and two where the circle crosses the line. Can we answer the question what plus one is three? No, because we only have two identical line segments.
So now stick the compass pointy bit on either of the new points and (without changing the width of the compass) draw another circle. One crossing point will be the very first pointy bit, and the second will be a new point on the line. We now have four points and three identical line segments. In this step we added one identical line segment, so the question becomes "how many line segments did we have before we added this third one?", and the answer, as we saw a second ago, was "two".
It should be obvious, but it bears repeating. Whenever you hear of an ancient mathematical proof or theorem, and you think "that's a doddle, I could have done that, and I am an idiot", remember they didn't have symbolic algebra. They had tools to draw circles, and tools to draw straight lines. That was it. And they still managed to come up with good stuff. Geniuses.
Anyway, we started to talk about functions. We have seen a function written out in normal language, so what does a function look like in symbolic algebra? Well, we have already seen one of those as well. A function is basically just the side of the equation that \(x\) is on. So in the above example, the function is:
\[x+1\]
The instructions we follow are really just to add 1 to our variable. We need to give our function a name, and traditionally we give it a name one letter long. We start using the letter \(f\), for the first function we use. If we need to use more than one to solve a problem, we give the next one the name \(g\). And so on. It would be terribly complex if we chose letters nearer \(xyz\) because you would start to get confused between the names of functions and the names of variables. That would be bad.
There is nothing special about the letters chosen, by the way. A variable called \(x\) is not one greater or less than a variable called \(y\). We could just as well be using pictures of clouds, buses, or lawnmowers. It is nothing more than a marker. Same thing applies to the names chosen for functions. I said earlier that we put the variable in brackets after the name of the function. So we end up with:
\[f(x)\]
That means the function is called \(f\) and it involves a variable called \(x\). We know what the function looks like, so we can describe the whole thing as:
\[f(x)=x+1\]
We can then replace the \(x\) with two like this:
\[f(2)=2+1\]
\[f(2)=3\]
We can now talk about \(f(x)\) rather than repeating all the steps every time. This is not too onerous with \(x+1\), but we will see some more insane functions in due course. Much like the beverage example above you would say that this is "eff of ecks".
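For what it is worth, \(f(x)=x+1\) can be written almost symbol for symbol in a programming language. This is just a minimal sketch in Python. The loop over a few sample values is my own addition to show the same instructions being followed for different values of \(x\).

```python
def f(x):
    """The function f(x) = x + 1: take the variable and add one to it."""
    return x + 1


print(f(2))  # 3

# Same instructions, different values of the variable.
for x in [0, 1, 2, 10]:
    print(x, "->", f(x))
```

Once `f` has been defined we never have to spell out "add one" again. We just say `f(2)`, exactly as we would write \(f(2)\) on paper.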
The important difference here is that an equation has an equals sign in it, whereas a function does not. OK, pedant, yes the example just about has an equals sign, but that is just telling you what the function is. The actual function (the bit on the right of the equals sign) only has a variable, the sign for addition, and a number. It has no equals sign. An equation, on the other hand, tells you that two different looking things are equivalent to each other, whereas a function is a list of things that you do to \(x\), say, to get a result.
Monday, 16 May 2011
It's All Greek to Me
Yes \(\pi\) is a letter. In actual fact it is a letter of the Greek alphabet. Quite a lot of mathematical place markers are. \(\theta\) (pronounced theta) is commonly used to represent an unknown or varying angle in problems. \(\pi\) is a bit different because it does not represent an unknown or fluctuating quantity, but instead it represents a single, solitary number. The number it represents is a ratio, an unchanging ratio - at least in our universe.
Let us consider a circle. A useful definition of a circle is the set of all points which lie the same distance from a single point. You can easily make a circle by using a pointy sharp thing with a fixed length of string attached to it, and with some sort of device that leaves a mark attached to the end of the string. You also need a flat surface. You can draw circles on non-flat surfaces, like footballs or saddles, but that way lies a special kind of madness called non-Euclidean geometry, which I am not touching with a bargepole. You stick the pointy sharp bit into the surface you wish to draw a circle on, pull the string tight, and place the marking device on the surface. What you have marked is a point which is the string's length away from the pointy bit. Now if you lift the marking device up and place it down (with the string still held taut) ANYWHERE else, you will mark another point the string's length away from the pointy bit. If you keep doing this randomly, you will eventually see the outline of a circle start to form - defined by all the individual points.
Of course, that is not the sensible way to draw a circle. Instead of lifting the marking device up every time, you just leave it touching the surface and, again keeping the string tight, you move it in either direction. I say either direction because you will find that you are constrained to only move two ways, clockwise or counter-clockwise. Once you have moved back to the point you started from you have drawn a circle. Well done. Now, we can say five things about the circle you have marked. Firstly, we can of course say what colour it is, but that is irrelevant for geometrical purposes. Secondly we can say how thick the line is that marks out the circle. This will depend on your choice of marking device. A felt tip pen or highlighter will leave a thicker line than a biro, and a crayon or a piece of chalk will leave a thicker line than all of the foregoing.
In pure geometry though, lines do not have thickness, and points do not have an area. A line, including the outline of the circle we drew, has only one dimension: its length. This makes sense, because of our definition. The circle is the set of all points EXACTLY the same distance from another point, not sort of the same distance depending on how big your marking device is. So the thickness of the line marking the circle, as with its colour, is irrelevant to us.
Thirdly, we can say how big the area is inside the circle. This is going to be measured on a two dimensional surface, so the answer will be some number of whatever units you like to the power of two. By that I mean the units are squared, not the number of them. So if your chosen measure of length is the flangit, then your chosen measurement of area will be square flangits. A square flangit is just a square whose sides are each exactly one flangit long. Squaring a number is the same as raising it to the power of two. The area in the circle is therefore some number of square flangits.
Fourthly, we could talk about the length of the line we have drawn on the surface. If we took it in our mind's eye and straightened it out, and measured it, how long would it be? Fifthly and finally we can talk about the length of the bit of string between the pointy sharp bit and the marking device.
As it turns out, the area in the circle and the length of the line you draw are related ONLY to the length of that bit of string. If that seems remarkable, remember that it doesn't matter where on your surface you poke your pointy sharp thing, your circle will still look the same. The only determining factor of the size of the circle you create is the length of string you allow between the pointy sharp bit and the marking device.
So what does all this have to do with \(\pi\)? Well, if you take your length of string, and multiply it by \(2\pi\) you get the length of the line you draw. So \(2\pi\) is the RATIO between the length of the line around the outside of the circle and the length of the string you drew it with. It matters not a jot how long your bit of string is: the circle that results will ALWAYS have a line around its outside that is \(2\pi\) times the length of that string.
So why do we say \(2\pi\) and not just \(\pi\) for this ratio? Good question. It turns out that it is a lot more useful to work with \(2\pi\) because \(\pi\) itself turns up in far more places on its own. If we went with a single symbol for the whole ratio then we would keep having to halve it. For example, take the area of the circle. It is the length of the string times itself (this makes the units squared, remember) multiplied by \(\pi\). If we defined \(\pi\) as the direct ratio between the length of the string and the line, the number used to find the area would have to be half of \(\pi\). Which looks messy.
The line which forms the circle is called the circumference of the circle. Another word for something that runs around the outside of something else is the perimeter. The word for perimeter in ancient Greek started with a letter in the ancient Greek alphabet. Can you guess which letter? The length of the bit of string is called the radius of the circle. The point we made with the pointy sharp thing is the centre of the circle.
There is a line which is double the length of the radius, and is described as the straight line from one point on the circle, through the centre, to another point on the circle (which will inevitably be exactly opposite the starting point). This line is the diameter of the circle. Because this line is the radius multiplied by two, and because (as we have seen) it does not matter which order you multiply things in, you can get rid of the two from the \(2\pi\) when you talk about the ratio between the diameter and the circumference. The circumference is therefore just \(\pi\) times the diameter. This is because the radius times two times \(\pi\) is the same as two times the radius times \(\pi\), and two times the radius is the diameter.
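The relationships between the radius, diameter, circumference and area can be checked with a few lines of code. This is only a sketch in Python, using the value of \(\pi\) from the standard `math` module. The function names and the string length of 3 flangits are assumptions I have made up for the example.

```python
import math


def diameter(radius):
    """The diameter is the radius multiplied by two."""
    return 2 * radius


def circumference(radius):
    """Length of the line around the circle: 2 * pi * radius."""
    return 2 * math.pi * radius


def area(radius):
    """Area inside the circle: pi * radius * radius, in square units."""
    return math.pi * radius * radius


string_length = 3.0  # hypothetical radius, in flangits

print(circumference(string_length))                  # flangits
print(area(string_length))                           # square flangits
# The ratio of circumference to radius is always 2 * pi ...
print(circumference(string_length) / string_length)
# ... and the ratio of circumference to diameter is always pi.
print(circumference(string_length) / diameter(string_length))
```

Try changing `string_length` to anything you like: the last two numbers printed should stay the same, give or take a tiny rounding wobble from the computer's arithmetic, which is what it means for the ratio to be constant.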
What does this all look like? Glad you asked. Have a gander at this snazzy diagram:
I don't like the diameter though, because it has nothing to do with the construction of the circle. The diameter comes about once a circle has been made, while the radius is used in making the circle in the first place. So I prefer to think of \(\pi\) as it relates to the radius not the diameter.
The number itself is a bit more than three. It is not a whole number you can count on your fingers, and neither is it a fraction (although \(\tfrac{22}{7}\) comes pretty close). If you were to write out \(\pi\) as a decimal number as three point something something something, you would be writing "somethings" for ever, because there are infinitely many of them. So instead of worrying about all those "somethings" we just write \(\pi\). This is a fantastic animation of the circumference "unrolling" to show it is \(\pi\) times the diameter:
That animation was created (and GPL licensed) by Wikipedia user John Reid.
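Just how close does \(\tfrac{22}{7}\) come? Here is a quick sketch in Python that compares it with the value of \(\pi\) stored in the standard `math` module (which is itself only an approximation to about fifteen decimal places, since a computer cannot write out "somethings" for ever either). The variable names are mine.

```python
import math

approx = 22 / 7            # the well-known fraction
error = approx - math.pi   # how far off it is

print(approx)              # starts 3.142857...
print(math.pi)             # starts 3.141592...
print(approx > math.pi)    # True: 22/7 is slightly too big
print(round(error, 4))     # 0.0013
```

So the fraction agrees with \(\pi\) for the first two decimal places and then wanders off, which is good enough for a lot of everyday purposes but is not the same number.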
