How to remember Bayes’ Theorem without really trying

Bayes’ Theorem crops up a lot. There’s even a picture of it in neon tubes on the Wikipedia page.

Its beauty is that it relates the probability of one event occurring after another to its inverse i.e. P(A|B) or the probability of A after B, to P(B|A) or the probability of B after A.

The standard example given in the textbooks is when P(A) is the probability that an individual in a population has a disease, say cancer, and P(B) is the probability that a medical test for that cancer comes back positive.

We are given, or can work out, the probability that any individual in the population has cancer (say 1%). We are also given the efficacy of the test i.e. the percentage of true positives the test gives (I guess the manufacturer would provide this on the basis of trials, giving it to patients they know have cancer and patients they know don’t and then seeing how it performs. Say it’s 98%.) The manufacturers also give the rate of false positives i.e. the percentage of times the test comes up positive when in fact there is no cancer (say 3%). So now imagine someone, imagine you in fact, takes the cancer test and it comes up positive. What is the chance you have cancer?

Well, it’s not 98%.

But you can calculate the answer from the information given using Bayes’ Theorem. You want to know P(A|B), the probability of having cancer after getting a positive test. Having gotten a positive test, having cancer is what you are frightened of. You have been given P(B|A), which is the probability of the test being positive after having cancer (i.e. the efficacy of the test, the percentage of true positives, from the trials: 98%). Bayes’ Theorem allows you to relate the two.

The probabilities calculated are often surprising – much lower than most people would guess. In this instance it’s only 24.8%.

Unfortunately, if you just try and learn Bayes’ Theorem as some sort of magic formula into which you plug numbers, it’s very easy to get confused by all the P(A)’s and P(B)’s and P(A|B)’s and so on, which is a shame because the basic ideas are quite straightforward.

Fortunately it’s also very easy to become unconfused, when things are explained to you right.

In fact, as per usual, if you understand what is going on with pictures, you don’t need to learn any formulae at all. Just the pictures.

So here are the pictures.

Think of a sample space. We’re going to designate this by an empty rectangle. The rectangle represents every possible outcome for our experiments. Inside the rectangle we’re going to have 2 ovals, one for event A and one for event B. Oval A is blue. Oval B is yellow. Where oval A overlaps oval B is green. The size of each oval represents the probability of the event occurring i.e. if A is a huge oval almost filling the rectangle then this means A is very likely to happen. If we close our eyes and stick a pin inside the rectangle randomly, we are very likely to stick it inside the oval A. Similarly, if A is very small, the event A is unlikely to happen.

In order to understand Bayes’ Theorem there are only 3 scenarios we need consider.

Scenario 1

This is where if event B happens (i.e. you stick your pin in oval B) event A will never happen.

Imagine you hold an imaginary pin over the rectangle. You randomly plunge it down and, unfortunately, it sticks inside the oval B (unfortunately, because that is equivalent to getting your positive test result back). The moment the pin hits inside oval B, you shrink to the size of an ant and you find yourself standing inside oval B, exactly where your pin hit the paper. At the same time a ten foot high barbed wire fence suddenly shoots up around the boundary of oval B. The fence, not the rectangle, is now the limit of your world. You are frightened. You daren’t look down. Because if you look down and you see you are standing on a green carpet you know that means you have cancer. If you’re standing on a yellow carpet you’re safe – positive test, but no cancer.

What are the chances you are also standing on a green carpet? Well in this instance 0% – it can’t happen, because A is right over the other side of the rectangle and doesn’t connect with B at all. So you’re definitely safe.

This is the scenario with P(A \cap B) = 0 i.e. where A and B don’t intersect. It gives P(A|B) =0.

Now for scenario 2:

which is the scenario where B is right inside A.

So imagine again, you stuck your pin in the rectangle randomly and again you hit oval B (this time it’s totally green, because it’s all inside oval A). You tested positive. Again you shrink down to the size of an ant and find yourself standing where you stuck the pin. The ten foot high barbed wire fence shoots up around oval B, trapping you inside. You daren’t look down. What is the chance you’re standing on some green?

In this instance, you are definitely f*cked, because 100% of oval B is green.

P(A \cap B) = P(B) gives P(A|B) = 100%.

Finally scenario 3, the mixed case :

You plunge down your pin into the rectangle and, as always, are unlucky enough to stick it inside oval B. You shrink down to the size of an ant and stand up where the pin hit the oval. The fence shoots up around the boundary of oval B. You’re frightened to look down. What is the probability that you might be standing on green this time? Well, it’s not 100% nor 0% but something in between. Given a positive test it’s certainly possible you have cancer, but how possible?

In this case, some of the floor in the fenced off area is green, some is yellow. The probability you’re standing on green is the proportion of the green floor inside the fenced off area (all of the oval B).

P(A|B) = \dfrac {P(A \cap B)} {P(B)}      (i)

This is our general case. This formula also takes care of scenario 1 and scenario 2.
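If you'd rather stick the pins than trust the picture, here is a rough sketch of the experiment in Python. The shapes are my own assumption: plain rectangles inside a unit square instead of ovals (easier to code, same idea), and the names `estimate_conditional`, `in_a` and `in_b` are made up for this illustration. It throws random pins, keeps only the ones that land inside B (the fence), and counts what share of those are also inside A (the green carpet).

```python
import random

def estimate_conditional(in_a, in_b, trials=200_000, seed=0):
    """Estimate P(A|B) by sticking random pins into the unit square."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits_b = 0       # pins that landed inside B (inside the fence)
    hits_green = 0   # pins inside B that are also inside A (green carpet)
    for _ in range(trials):
        x, y = rng.random(), rng.random()
        if in_b(x, y):
            hits_b += 1
            if in_a(x, y):
                hits_green += 1
    return hits_green / hits_b if hits_b else float("nan")

# Hypothetical shapes (rectangles standing in for the ovals):
# A is the left half of the square, B is a box straddling the middle.
def in_a(x, y):
    return x < 0.5

def in_b(x, y):
    return 0.3 <= x < 0.7 and 0.2 <= y < 0.8

estimate = estimate_conditional(in_a, in_b)

# Exact answer from the areas: P(A and B) / P(B)
area_b = 0.4 * 0.6       # whole fenced-off area
area_green = 0.2 * 0.6   # the part of B that overlaps A
exact = area_green / area_b

print(f"simulated P(A|B) = {estimate:.3f}, from areas = {exact:.3f}")
```

With these particular shapes half of B is green, so the simulated figure should land close to 0.5. Move the boxes so they don't overlap, or put B wholly inside A, and you get scenarios 1 and 2 back.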

Now we want to go through the 3 scenarios again but this time imagining that when we plunged the pin down, we stuck it in oval A i.e. this time we definitely know we have cancer. We then take the test. What is the chance we are positive? So imagine jabbing the pin in oval A, and then shrinking to the size of an ant to stand where the pin hit the paper, and then the fence rising up around oval A, trapping you inside.

We could go through the whole rigmarole above, wondering what are the chances we are standing on blue or green.

Or we just swap the As and Bs around in the general formula (i) we already figured out.

This gives the other general case P(B|A) = \dfrac {P(B \cap A)} {P(A)} (ii)

Now we know P(B \cap A) = P(A \cap B) (check any of the scenario diagrams). Rearranging (ii) gives P(B \cap A) = P(B|A) * P(A), and putting that in place of P(A \cap B) on the top of (i) combines the two, giving:

P(A|B) = \dfrac {P(B|A) * P(A)} {P(B)}

as per the neon lights.

This is the general form of the theorem, but we are not done yet because there is also a long form of the Theorem.

To get the long form, all we do is get rid of P(B) from the bottom of the general formula. (In the original example, I didn’t give you P(B) which would be the probability of a random person in the population testing positive either because they had cancer, or because of a false positive.)

To get rid of P(B) we just look again at the mixed case (picture 3). Just think about the most bizarre way to describe P(B). In terms of intersections it is P(A \cap B) + P(\neg A \cap B) where \neg A stands for “not A”.

So using the formula (i) derived above P(A|B) = \dfrac {P(A \cap B)} {P(B)} (i)

we can say P(B|A) = \dfrac {P(B \cap A)} {P(A)} (which was our formula (ii), which we got by swapping A and B about)

and P(B|{\neg A}) = \dfrac {P(B \cap \neg A)} {P(\neg A)} (swapping {\neg A} for {A} in formula (ii)).

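So using our bizarre idea of P(B) as P(A \cap B) + P(\neg A \cap B), and rearranging the last two formulae to replace each intersection (P(B \cap A) = P(B|A) * P(A) and P(B \cap \neg A) = P(B|\neg A) * P(\neg A)), gives us a formula for P(B): P(B) = P(B|A) * P(A) + P(B|\neg A) * P(\neg A)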

We use this in the original Bayes Formula as the denominator, giving the extended version:

P(A|B) = \dfrac{P(B|A) * P(A)} {{P(B|A) * P(A)} + {P(B| \neg A) * P( \neg A)}}

Questions using Bayes’ Theorem then become just exercises in plugging percentages into these formulae.

In our original example, P(A)=0.01,  P(B|A)=0.98,  P(B| {\neg A}) = 0.03.

So using the extended formula, with P(\neg A) = 1 - 0.01 = 0.99, P(A|B) = \dfrac{0.98 * 0.01} {(0.98 * 0.01) + (0.03 * 0.99)} = \dfrac{0.0098} {0.0395} = 24.8%.
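If you want to check the arithmetic, or try other numbers, the extended formula is only a couple of lines of Python. This is just a sketch: the function name `posterior` and its parameter names are my own, not anything standard.

```python
def posterior(prior, true_positive_rate, false_positive_rate):
    """Extended Bayes formula: P(A|B) from P(A), P(B|A) and P(B|not A)."""
    # Top of the fraction: P(B|A) * P(A)
    numerator = true_positive_rate * prior
    # Bottom of the fraction: P(B), split into the A part and the not-A part
    p_b = numerator + false_positive_rate * (1 - prior)
    return numerator / p_b

# Numbers from the cancer example: P(A)=0.01, P(B|A)=0.98, P(B|not A)=0.03
p_cancer_given_positive = posterior(0.01, 0.98, 0.03)
print(f"{p_cancer_given_positive:.1%}")  # prints 24.8%
```

Try making the disease more common (a bigger prior) and watch how quickly the answer climbs. The smallness of the 1% is what drags the result down.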

There is hope – the chances of you having cancer after testing positive are not so dire after all.

Often using a table can be a good idea to make sure you have everything sorted out correctly. There are good examples of doing it like that here.
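As a rough idea of what such a table looks like for our example, here is a small Python sketch that turns the percentages into head counts. The population of 10,000 is my own assumption, picked only so the counts come out as whole numbers, and all the variable names are made up for the illustration.

```python
population = 10_000         # hypothetical population size (an assumption)
p_cancer = 0.01             # P(A)
p_pos_given_cancer = 0.98   # P(B|A), true positive rate
p_pos_given_healthy = 0.03  # P(B|not A), false positive rate

# Turn the probabilities into head counts
cancer = round(population * p_cancer)
healthy = population - cancer
true_pos = round(cancer * p_pos_given_cancer)
false_pos = round(healthy * p_pos_given_healthy)

rows = [
    ("Cancer", true_pos, cancer - true_pos, cancer),
    ("No cancer", false_pos, healthy - false_pos, healthy),
    ("Total", true_pos + false_pos, population - true_pos - false_pos, population),
]

print(f"{'':<10}{'Test +':>8}{'Test -':>8}{'Total':>8}")
for label, pos, neg, total in rows:
    print(f"{label:<10}{pos:>8}{neg:>8}{total:>8}")

# Of everyone in the "Test +" column, what share actually has cancer?
share = true_pos / (true_pos + false_pos)
print(f"P(cancer | positive) = {share:.1%}")
```

The "Test +" column is the fenced-off oval B: 98 people with cancer and 297 without, so 98 out of 395 positives really have it, which is the same 24.8% as before.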

