Friday 3 December 2010

Electronics for Dogs 4 - On the nature of numbers

We all know what numbers are. If I want to write 10 I write 10 on a piece of paper. I can keep writing forever as big a number as I fancy - or at least as big as will fit on the paper.
Computers aren't like that. Not only do they have a limited amount of memory (the size of the notepad, as it were), but the size of the number they can represent is limited too. Before I say why this is the case I want to say why it normally isn't a problem.

Let's say for the sake of argument that the largest number you could represent was 256. Actually no, let's say 100 because we're still thinking in decimal at the moment. What would you do if you needed to represent the fact that you owned 101 sheep? Simple: you'd say you had 100 sheep plus 1 sheep.
Hang on a minute, that's exactly what our decimal system does. We only have 10 digits: 0, 1, 2, 3, 4, 5, 6, 7, 8 & 9. If you want to represent 10 then you say you have 1 lot of 10 plus 0 lots of 1. 23 is 2 lots of ten plus 3 lots of 1. 101 is 1 lot of 100, 0 lots of 10 and 1 lot of 1. We call this a base system: each position further to the left is worth one more multiple of the base. Most of us deal in base 10. Computers deal in base 2. Why this is and what it means we'll get to later, but don't forget that we tell the time in base 60 and base 12. Imperial measures are in whatever base they fancy at that particular time. You're used to different bases whether you realise it or not.
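If it helps to see the "lots of" idea written down, here is a minimal sketch in Python. The function names to_digits and from_digits are made up for this illustration; they just split a whole number into its digits in a given base and put it back together again.

```python
def to_digits(number, base):
    """Split a non-negative whole number into its digits in the given base,
    most significant digit first."""
    if number == 0:
        return [0]
    digits = []
    while number > 0:
        digits.append(number % base)  # the digit for the current position
        number //= base               # move one position to the left
    return digits[::-1]


def from_digits(digits, base):
    """Rebuild the number: each step to the left is worth one more
    multiple of the base."""
    total = 0
    for digit in digits:
        total = total * base + digit
    return total


print(to_digits(23, 10))          # [2, 3]    -> 2 lots of ten, 3 lots of one
print(to_digits(101, 10))         # [1, 0, 1]
print(to_digits(7, 2))            # [1, 1, 1] -> the same idea in base 2
print(from_digits([1, 0, 1], 10)) # 101
```

The same two functions work for any base of 2 or more, which is really the whole point: base 10 is only special because we have ten fingers.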
But let's say that even with this base system, this system of adding digits to the left to represent ever larger numbers, you still only have limited space. Let's say you're a bureaucrat making a form and you want to make it easy for yourself. So you make a form where the person has to put each letter of their name in a different box. You're not doing this to be annoying as such, you're just trying to stop people from using cursive handwriting to make sure you can get their names right. Of course as soon as someone who has 3 middle names tries to fill out your form your system breaks, but hey, that's not your fault, is it?
Anyway, this is exactly the sort of problem we deal with all the time in computing and electronics. Because you physically have to build something, you have to choose how big to make it in the first place. You can't just add a few extra bits on at the end, because there is no space for them: you haven't built that bit of the computer.

So what do you do? First you make things big; normally you make them way too big. Second, you don't just have a fixed space allowance for things. Let me once again explain by analogy.
Imagine an authoritarian government states that every family shall have a home. Every family shall have at most three children. All houses are identical. Every family shall have an address of city number and house number and occupant number. Their name shall be formed by sticking all three together.
Sounds great/terrible, but what happens when the supreme leader wants 4 children? He's not allowed to break the rules as such, but he really wants that extra child. No problem, he thinks, I'll create a new house number that the extra children will stay in. Child number 3 will be reassigned to the new house along with child 4, and the child 3 designation at the original house will be a pointer, a reference to this new house. Using this system of continually pointing to further houses he can have as many children as he wants without breaking the system.
Yes it's a mess, yes it's often confusing, yes there's a long trail of investigation to find what you need to know, but this way he can have as many children as he wants while still staying within the scheme.
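For the programmers in the audience, the supreme leader's trick can be sketched in a few lines of Python. This is only an illustration of the analogy: the House class and its method names are invented here, and real systems differ a lot in the details.

```python
class House:
    CAPACITY = 3  # every house has exactly three child slots, no exceptions

    def __init__(self):
        # Each slot holds either a child's name or a pointer to another House.
        self.slots = []

    def add_child(self, name):
        last = self.slots[-1] if self.slots else None
        if isinstance(last, House):
            # Slot 3 is already a pointer: follow it to the next house.
            last.add_child(name)
        elif len(self.slots) < self.CAPACITY:
            self.slots.append(name)
        else:
            # House is full: move child 3 into a new house along with the
            # newcomer, and turn slot 3 into a pointer to that house.
            annex = House()
            annex.add_child(self.slots[-1])
            annex.add_child(name)
            self.slots[-1] = annex

    def all_children(self):
        """Follow the trail of pointers to list every child in order."""
        names = []
        for slot in self.slots:
            if isinstance(slot, House):
                names.extend(slot.all_children())
            else:
                names.append(slot)
        return names


palace = House()
for child in ["Ann", "Bob", "Cat", "Dan", "Eve", "Fay"]:
    palace.add_child(child)

print(palace.all_children())  # ['Ann', 'Bob', 'Cat', 'Dan', 'Eve', 'Fay']
print(len(palace.slots))      # 3, the rules were never broken
```

Notice the cost: finding Fay means visiting three houses in turn. That is the long trail of investigation in action.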

This is the sort of problem we have all the time in electronics. Many processors in everyday devices, the ones that probably run your washing machine or burglar alarm, or even your calculator, are 8 bit devices. This means that they can't represent any number larger than 255 (that's 256 different values once you count zero). Ask one to add 3 to 255 and it will tell you the answer is 2.
Wait a minute you say, I just told my calculator to add 255 to 3 and it said 258 clear as day! Well yes, that's because it cheats, or perhaps the person who was designing it knew about this problem and worked around it. So in the case of the washing machine, it has 8 programs, has a delay timer of up to 120 minutes, has 3 sensors reporting nothing more complex than door open or closed so it doesn't need anything more sophisticated.
In the case of the calculator it'd be a good question to ask: well, why not make it bigger? Why not make it understand numbers larger than 255 rather than always working around the problem?
Well, this comes back to my very fast idiot again. To get an idiot that can deal with bigger numbers is a little bit harder. And even if I got an idiot that could deal with the concept of 4 billion, what if I needed to deal with 5 billion? No, first of all I need to come up with a set of instructions that can cope with anything I throw at the idiot/calculator, then go from there.
So back to the problem: the calculator is trying to add 255 to 3 and telling me the answer is 2. How do I work around that? Well, fortunately when it does the add it remembers that it ran out of space, so while it tells you the answer is 2 you can then go and ask "Did you run out of space?" In this case it would answer yes. Most of the time it would answer no. You as the programmer writing the instructions for this idiot can then put something in to keep track of this overflow and deal with it appropriately. The advantage is that if you write these instructions correctly the user of the calculator will never notice this problem.
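Here is a rough sketch in Python of what that conversation with the idiot looks like. The functions add_8bit and add_16bit are invented for this example, and real processors do this in hardware with a carry flag rather than by returning a pair of values, but the idea is the same: do the sum 8 bits at a time and pass the "ran out of space" answer along to the next 8 bits.

```python
def add_8bit(a, b, carry_in=0):
    """Add two 8 bit values (0 to 255). Returns the 8 bit answer and a carry
    flag: 1 if the true answer did not fit, 0 otherwise."""
    total = a + b + carry_in
    result = total & 0xFF             # keep only the lowest 8 bits
    carry = 1 if total > 0xFF else 0  # "did you run out of space?"
    return result, carry


def add_16bit(a, b):
    """Add two 16 bit values (0 to 65535) using only 8 bit additions."""
    low, carry = add_8bit(a & 0xFF, b & 0xFF)      # low bytes first
    high, carry = add_8bit(a >> 8, b >> 8, carry)  # high bytes plus the carry
    return (high << 8) | low, carry


print(add_8bit(255, 3))   # (2, 1): the answer is 2, and yes, it overflowed
print(add_8bit(100, 3))   # (103, 0): most of the time there is no overflow
print(add_16bit(255, 3))  # (258, 0): the carry was dealt with behind the scenes
```

Chain enough of these together and you can add numbers as large as you have memory for, which is roughly how a small processor fakes being a bigger one.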
Except that that's not entirely true either. On the average calculator try the calculation (9999999999 *100000) +1 - (9999999999 *100000) and you'll get an answer of 0 rather than 1, but hey nothing's perfect.

So anyway back to the whole point of this. I've been trying to hold this off in these articles but I can wait no longer. Binary and Hexadecimal. Bases and all their glory.
The building block of everything we've talked about is a switch. For the purposes of discussion almost all electronics is digital these days. The analogue stuff is, I would argue, more interesting, but the digital stuff is really what gets most of the work done. Why that's the case is not important right now, but it is simpler to understand so let's run with that. The switch can be on or off. Unless you have dimmer switches this is no surprise. How can you express numbers then? If you have only light bulbs then how do you represent 7? Exactly the same way we represent 15, or 23: by using more than one position, except that each position to the left is now worth twice the one before rather than ten times. 0 is 0 and 1 is 1. So it hopefully follows that 10 is 2 and 11 is 3. 100 then must be 4, which means 7 is 111.
Here comes the old joke that there are 10 types of people in the world, those who understand binary and those who don't. Now you can count yourself amongst the type that does.
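If you'd like to check your counting, here is a small Python sketch. The helper switches_to_number is made up for this post; format(n, "03b") and int(text, 2) are standard Python built-ins for converting to and from binary.

```python
def switches_to_number(switches):
    """Turn a row of switches (True = on, False = off, leftmost switch is
    worth the most) into the number it represents."""
    total = 0
    for is_on in switches:
        total = total * 2 + (1 if is_on else 0)
    return total


# Count from 0 to 7 using three light bulbs.
for n in range(8):
    print(n, format(n, "03b"))

print(switches_to_number([True, True, True]))  # 7, because 111 is 4 + 2 + 1
print(switches_to_number([True, False]))       # 2, so "10 types of people"
print(int("100", 2))                           # 4
```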

This was where the 256 came from earlier. If you have 8 switches then you can have 256 values, 0 to 255. This is why I say that while most people can count to 10 on their hands an engineer can count to over 1000, 1023 to be precise. Why 1023? Because I still want a value for zero, and with 1024 states that means I can count to 1023. How do I calculate 1024 for 10 fingers? Easy. With 1 finger I have 0 or 1. With two fingers I have 0, 1, 2 & 3, so I have 2*2 states. With 3 fingers I have 2*2*2 states, also known as 2^3, that is 2 raised to the power of 3. So 10 fingers gives you 2^10: 2 multiplied by itself ten times, 2 to the power of 10 or however else you want to phrase it.
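The arithmetic is easy to check for yourself. A tiny Python sketch, with the function names states and largest_value invented for the purpose:

```python
def states(switches):
    """How many different on/off patterns a row of switches can show."""
    return 2 ** switches


def largest_value(switches):
    """The biggest number you can count to, keeping one pattern for zero."""
    return states(switches) - 1


for n in (1, 2, 3, 8, 10):
    print(n, "switches:", states(n), "states, counting from 0 to", largest_value(n))

# 8 switches:  256 states, counting from 0 to 255
# 10 switches: 1024 states, counting from 0 to 1023
```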

And that I promise you is the most complex maths you'll need for this series. Well unless I get as far as digital coding theory, but that's just long division. Oh and analogue electronics is full of calculus, but I can't remember much of that anymore so you should be fine. Control theory is much worse but I'll be glossing over that because I never really understood it anyway; it's a great case of Someone Else's Problem; and unless you're interested in switched mode power supplies (the thing that powers your computer) then you're fine.

Right so by convention we call each of those individual switches a bit. If we have 8 of them that is a byte. Normally if we need more of them we tend to use multiple bytes to give us 16, 32 or 64 bits. While 24 bit processors do exist they are not common for most domestic purposes. I have used 10 and 14 bit processors in my time, but they were really specialist so let's not worry about those for the moment.

All this is well and good provided you just want to deal with whole numbers, or integers as they are often known. What if I want to represent 1.2 or 34.382476235076?
As always there are many ways you can solve this problem, but the one commonly used is known as floating point. Again if you're interested in why other systems aren't used then let me know, but otherwise trust me floating point solves most problems well enough. This is not how it actually works in most implementations - look here if you want to know some specifics, as usual this is about the concepts.
Because you want a fixed storage space for a number that could take a very large range you break the problem up.
Take the number 32.76256.
In floating point form we store 3.276256 and 1. That is 3.276256 * 10^1. We could also store 327.6256 as 3.276256 and 2, that is 3.276256 * 10^2.
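To make that split concrete, here is a sketch using Python's standard decimal module. It is the decimal version of the concept only: the function split_decimal is made up for this post, and as noted above real hardware does this in binary with a fixed number of bits for each part.

```python
from decimal import Decimal


def split_decimal(text):
    """Split a decimal number into the two parts described above:
    the digits with one figure before the point, and the power of ten."""
    number = Decimal(text)
    exponent = number.adjusted()          # position of the leading digit
    digits = number.scaleb(-exponent)     # shift the point next to that digit
    return digits, exponent


print(split_decimal("32.76256"))  # (Decimal('3.276256'), 1)
print(split_decimal("327.6256"))  # (Decimal('3.276256'), 2)
print(split_decimal("0.0032"))    # (Decimal('3.2'), -3)
```

The same digits with a different exponent give you a number ten or a hundred times bigger or smaller, which is exactly what lets a fixed amount of storage cover a very large range.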
What's the advantage here? Simply put it's that often we only care about the number of significant digits. That is, to use an earlier example if we have (9999999999 *100000) then we often don't care about that last 1 that is added to it. Or to put it another way we'd often rather sacrifice the precision of an answer in order to be able to deal with the full range of the problem. That is it's better to lose that 1 in that calculation than to not be able to perform the calculation at all. If you have gone through life not knowing about this problem then that just shows how true it is. If you have experienced it before then you know how to deal with it I'm sure ;-)
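You can reproduce the calculator's missing 1 in Python. This sketch assumes a calculator that keeps 10 significant digits, which is a guess on my part rather than a fact about any particular model, and imitates it with a Context from the standard decimal module.

```python
from decimal import Context, Decimal

# Assumption: the calculator keeps only 10 significant digits per result.
calculator = Context(prec=10)

big = calculator.multiply(Decimal(9999999999), Decimal(100000))
plus_one = calculator.add(big, Decimal(1))      # the 1 is rounded away here
answer = calculator.subtract(plus_one, big)

print(big)          # 9.999999999E+14
print(answer == 0)  # True: the calculator has lost the 1

# Python's own whole numbers grow as needed, so they keep every digit.
exact = (9999999999 * 100000) + 1 - (9999999999 * 100000)
print(exact)        # 1
```

The big number itself survives intact, which is the trade being made: range is kept and the last few digits are sacrificed.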

Now one last word on these numbers before we go. 32 bits gives you just over 4 billion separate values. That's an awful lot you say. Indeed for most things we need to do it's an excessively large number. However with more than that many people on earth it's clearly not enough to assign a separate number to each device attached to the global network. This is exactly the problem the internet will face soon, but that's a whole other issue...

Next time there are a few loose ends to tie up. I mentioned how things get into and out of a processor and I need to deal with that first before we can go on and look at some of how we would build a processor.
Feedback needed as always by the way, what is obvious and what is still a blur please do comment...
