impossible or improbable?

April 7th, 2007

Statistics is not exciting. It's not something that makes people giddy. It's a dull, mundane discipline. But... it's useful, very useful. Because statistics allows you to see truths that the naked eye (by which I mean the mind without using any special method) doesn't. And the reason I say it's useful is because statistics is a field that deals almost exclusively in application. There isn't really that much to gain from statistics as a mathematical field in and of itself; it's the ability to apply this method widely that makes it powerful.

When you learn statistics you explore examples like what is the probability of winning the lottery. It's a contrived example really, because I imagine there isn't much overlap between those who play the lottery and those interested in learning statistics. But what you find out is just how oblivious those people are to the fact that there is no chance in hell they will ever win the lottery.

But there are more interesting truths you can derive with statistics, like accurate predictions about who's going to win the election, with 50 million voters, by only polling 1000 people. Now that's pretty impressive, I'd say. Of course, a lot of people dismiss statistics as just manipulating numbers to suit your end. And that's true. But it's a very shallow view of statistics, which is a lot more than that.
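To put a rough number on the polling claim, here is a minimal sketch using the textbook margin-of-error formula for a proportion. It assumes a simple random sample and a 95% confidence level (z = 1.96); the function name and the sample sizes other than 1000 are my own choices for illustration, and real polls have extra sources of error that this ignores.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error for a polled proportion.

    Assumes a simple random sample; z=1.96 corresponds to roughly
    95% confidence. proportion=0.5 is the worst case (widest margin).
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

# Illustrative sample sizes; only 1000 comes from the post.
for n in (100, 1000, 10000):
    points = 100 * margin_of_error(n)
    print(f"{n} people polled: about +/- {points:.1f} percentage points")
```

Notice that the number of voters doesn't appear in the formula at all. With 50 million voters and only 1000 polled, the size of the population makes practically no difference; what matters is the size of the sample and how randomly it was drawn.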

Because statistics deals fundamentally with examining relations between things. And that is a very general idea you can apply to lots of scenarios. And what's more interesting: probability. We tend to accept the theory that if something has happened, there is a chance it will happen again. Probability is no more than a formalization of this idea. If you are late for class 9 times out of 10, then chances are, by the sheer virtue that history tends to repeat itself, you are going to be late next time. Of course, it's just a prognosis, it doesn't determine the outcome.

And this is where I think people dismiss probability as being some kind of useless game. Because the thing is that according to the physical models we have of our world, there is nothing definite about anything. It's all probability. When you drive under a bridge, you don't know for certain that the bridge isn't going to collapse and crush your car. Now the chemical qualities of the materials used to build this bridge are such that the probability of collapse is very small. The bridge will probably stand for at least 100 years. But there is nothing certain about this. And yet it's good enough for us to trust, isn't it?

So why isn't it definite? When you buy a hard disk, you're hoping that the disk won't crash on you and lose all your data. Why do you have this fear? Because hard disks are known to crash, sometimes they break. And if on average one disk in 1000 breaks within the first year, then you know that yours may break. Why do you know this? Because you believe that something that's happening won't just stop happening for no reason, you believe there is a certain continuity to events. So if the design and production of hard disks is such that 1 in 1000 breaks, then you know that there's a 1 in 1000 chance you'll be replacing yours. But why is this true? Does the prognosis now determine the outcome? No, it doesn't. Instead, the probability represents a truth about the present. That is to say it's not some product of guesswork and wishful thinking we hope will come true, it's a truth. What determines the outcome isn't a guess - it's the present situation.

If you slide a pencil off your desk and it drops to the floor, you will not be surprised. And if you repeat this experiment, you will not be surprised to learn that it happens every time. Does that mean you can be certain that it always will? Probably. On the other hand, if you sit down to dinner every day, and occasionally (let's say once a year) you spill your drink, can you be certain that there will come a time in the future when you'll again spill your drink? No, not really. In the first case, you can confirm the same outcome every time, there isn't even one counterexample. In the second, it happens very rarely, and so it's hard to say whether it will happen again, and if so when. Both are scenarios with a certain probability.

So how useful is probability in predicting the future? I haven't done a lot of air travel in my life, but in the last few years with vacations and moving abroad it has picked up. And I have never been late for a flight. Just like with the pencil. Could I say that if I've been on 30 flights and I've dropped the pencil 30 times, the probability of being late for my next flight is the same as the pencil not dropping? No, because the underlying conditions are much different. The only thing affecting the pencil is gravity, and we know gravity is pretty reliable. What it takes for me to get to the airport on time is timing skills, as well as public transportation being on time. So if I've never been late, and I know that the bus is 10 minutes late one time out of say 100, I know that sooner or later I may be on the bus that happens to be late. So essentially, every time I make it on time (approaching that 100th trip), I know that next time I'm more likely to be late than I was this time. That is to say, I anticipate that 100th occasion to happen.
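It helps to separate two quantities here: the chance that the very next trip is late, and the chance that at least one trip in a long run of them is late. Below is a minimal sketch under assumptions that are mine, not facts about any real bus: every trip is independent of the others, and the bus is late with the same 1 in 100 chance each time. The function names, the streak length of 30 and the number of simulated travellers are made up for illustration.

```python
import random

P_LATE = 1 / 100  # the post's figure: the bus is late about 1 time in 100

def chance_of_at_least_one_late(trips, p=P_LATE):
    """Chance that at least one of `trips` independent trips is late."""
    return 1 - (1 - p) ** trips

def late_rate_after_streak(streak, travellers=50_000, p=P_LATE, seed=1):
    """Simulate travellers and keep only those who were on time `streak`
    times in a row; return how often their next trip was late."""
    rng = random.Random(seed)
    had_streak = 0
    late_next = 0
    for _ in range(travellers):
        # all() over an empty range is True, so streak=0 keeps everyone.
        if all(rng.random() >= p for _ in range(streak)):
            had_streak += 1
            if rng.random() < p:
                late_next += 1
    return late_next / had_streak

print("at least one late bus in 100 trips:", chance_of_at_least_one_late(100))
print("late on the next trip, no history:", late_rate_after_streak(0))
print("late on the next trip after 30 on-time trips:", late_rate_after_streak(30))
```

Under these assumptions the chance of meeting a late bus somewhere in 100 trips is about 63%, but the chance on any single trip stays around 1 in 100 no matter how long the on-time streak has been. If the trips are not independent (say the bus gets less reliable as it ages), the sketch no longer applies.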

Of course, the probability of missing the next flight is a lot more complicated than just determining how likely the bus is to be late. Perhaps they switched buses to a new model that's more reliable. Perhaps every time I take the bus it has just undergone maintenance (and is less likely to break down right after maintenance than it is otherwise). Anyway, I still have to get to the bus in the first place. Perhaps the cops decide to close down the block and send me on a giant detour. Perhaps my knees give out. Perhaps I was playing sports the day before and injured myself, slowing me down. Perhaps I slip and injure myself on the way to the bus. Perhaps the machine at the airport that prints boarding cards runs out of paper just when it's my turn. Perhaps the security people decide to pull me over for an hour-long check because they're bored. There's a huge number of probabilities that enter into the calculation. One or more of them can make me late for the plane. The only thing I [probably] know for certain is that there is no certainty.

4 Responses to "impossible or improbable?"

  1. erik says:

    Well written piece. But I would make a distinction between calculating statistical probabilities where mathematical facts are concerned, versus those dependent on people's actions. Polling people to predict the outcome of an election is *very* tricky. If you simply ask people who they're gonna be voting for, it's reasonably reliable so long as you focus on a group representative of the entire population with voting rights (though it still depends on whether you allow people to give anonymous answers etc). But if you try to draw a conclusion about who people are going to be voting for by asking them all sorts of different questions, such as where they stand on environmental policy, you've got yourself a completely different ball game.

    People are tricky, which I suppose is what you were trying to say when you compared the pencil to arriving on time for the airplane. Still, personally, I would distinguish mathematical probability from anything involving people altogether.

  2. Graham says:

    "So essentially, every time I make it on time (approaching that 100th trip), I know that next time I’m more likely to be late than I was this time. That is to say, I anticipate that 100th occasion to happen."

    Something inside me doesn't agree with that statement. If you flip a coin and it turns up heads, does that make tails the more likely outcome? Of course not. If you throw a coin 1000 times, and it turns up heads 999 times, is tails any more likely on the next throw than it was on any of the throws before? No.

    You could say that flipping a coin is a much simpler concept than making it to a flight on time, but I'd have to disagree with that too, because there are countless things affecting it, from the thrower to wind speed to humidity to the alignment of the planets.

  3. numerodix says:

    Yes, there is a flaw in that reasoning, and it's a distinction I'm unable to formulate.

  4. John Healy says:

    It's about measurement of probability - if you have an unbiased coin (i.e. 50-50 chances of either outcome), and you flip it 1000 times, you might get 999 heads - it's possible, if unlikely. The probability of the next flip being heads is still 50-50 though, because your measurement isn't the same as the probability.

    Of course, if you did flip a coin 1000 times and got a result like that, it's far more likely that your coin is biased than not, but how biased? I don't know. My damned stats course was postponed for Good Friday!