Probability, Choice,
and Reason
1st Edition
Leighton Vaughan Williams
Reasonable efforts have been made to publish reliable data and information, but the author and pub-
lisher cannot assume responsibility for the validity of all materials or the consequences of their use.
The authors and publishers have attempted to trace the copyright holders of all material reproduced
in this publication and apologize to copyright holders if permission to publish in this form has not
been obtained. If any copyright material has not been acknowledged please write and let us know so
we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, access www.copyright.
com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923,
978-750-8400. For works that are not available on CCC please contact mpkbookspermissions@tandf.
co.uk
Trademark notice: Product or corporate names may be trademarks or registered trademarks and are
used only for identification and explanation without intent to infringe.
DOI: 10.1201/9781003083610
Typeset in Palatino
by Deanta Global Publishing Services, Chennai, India
For Mum and Dad, and my wife, Julie.
Contents
Preface
Author Biography
2. Probability Paradoxes
2.1 The Bertrand’s Box Paradox
2.1.1 Exercise
2.1.2 Reading and Links
2.2 The Monty Hall Problem
2.2.1 Appendix
2.2.1.1 Alternative Derivation
2.2.2 Exercise
2.2.3 Reading and Links
2.3 The Three Prisoners Problem
2.3.1 Exercise
2.3.2 Reading and Links
2.4 The Deadly Doors Problem
2.4.1 Exercise
2.4.2 Reading and Links
2.5 Portia’s Challenge
2.5.1 Exercise
2.5.2 Reading and Links
2.6 The Boy–Girl Paradox
2.6.1 Appendix
2.6.2 Exercise
2.6.3 Reading and Links
2.7 The Girl Named Florida Problem
2.7.1 Appendix
2.7.2 Exercise
2.7.3 Reading and Links
2.8 The Two Envelopes Problem
2.8.1 Exercise
2.8.2 Reading and Links
2.9 The Birthday Problem
2.9.1 Exercise
2.9.2 Reading and Links
2.10 The Inspection Paradox
2.10.1 Exercise
2.10.2 Reading and Links
2.11 Berkson’s Paradox
2.11.2 Exercise
2.11.3 Reading and Links
2.12 Simpson’s Paradox
2.12.1 Exercise
2.12.2 Reading and Links
This book is designed as an invaluable resource for those studying the sci-
ences, social sciences, and humanities on a formal or informal basis, espe-
cially those with an interest in engaging with ideas rooted in chance and
probability and in the theory and application of choice and reason. Notably,
statistics and probability are topics that students often find difficult to get to
grips with, and this book fills a significant gap in this area by making them
accessible across a range of disciplines. These include economics, engineering,
finance, law, marketing, mathematics, medicine, psychology, and many oth-
ers. It will also appeal to those taking courses in probability and in statistics.
The target student audience includes university, college, and high school
students who wish to expand their reading, as well as teachers and lecturers
who want to liven up their courses while retaining academic rigour. More
generally, the book is designed with the intelligent and enquiring layperson
in mind, including anyone who wants to develop their skills to probe num-
bers and anyone who is interested in the many statistical and other para-
doxes that permeate our lives.
The underpinning of the book is that much of our thinking on a range of
subjects is flawed because we base it on faulty intuition.
The content is primarily about the tools and framework of logical thought
that we can use to address and overcome these fundamental cognitive flaws.
By using the framework and tools of probability and statistics, we can
overcome these barriers to provide solutions to many real-world problems
and paradoxes. We show how to do this and find answers that are frequently
very contrary to what we might expect. Along the way, we venture into
diverse realms and thought experiments which challenge the way that most
of us see the world, and we explore the big questions of choice and reason.
The tools of so-called Bayesian reasoning run through important sections
of the book. The ideas extend well beyond a Bayesian framework, however,
and include several topics and anomalies that are attractive on their own
merits and from which we might learn broader lessons. The reader will also
explore ideas, concepts, and applications rooted in game theory.
A recurring theme at the heart of this book, however, is the conflict between
intuition and logic.
Imagine, for example, a bus that arrives every 30 minutes, on average, and
you arrive at the bus stop at some random time, with no idea when the last
bus left. How long can you expect to wait for the next bus to arrive? Half of
30 minutes, i.e. 15 minutes? Intuitively, that sounds right, but you’d be lucky
to wait only 15 minutes. It’s likely to be somewhat longer, and the laws of
probability and statistics show why.
In medical trials, the success rate for a new drug is better than for an old
drug on each of the first two days of the trials. The new drug must, therefore,
have recorded a higher success rate than the old drug, judged over the entire
two days of the trials. Sounds right, but it’s not so. After the two days, the old
drug turns out to be more successful than the new drug even though it per-
formed worse on each of the first two days. Is this possible? It’s like saying
that a player performs better than another player in successive seasons but
performs worse overall. Can that happen? Yes. It can and does.
How many restaurants should you look at before starting to decide on a
place to eat? How many used cars should you pass on before you start look-
ing seriously for one? How many potential partners should you consider
before looking for the special one? In each case, we can derive the answer
from a simple formula.
A doctor performs a test on all her patients for a virus. The test she gives
them is 99% accurate, in the sense that 99% of people who have the virus test
positive, and 99% of the healthy people test negative. Now the question is:
If the patient tests positive, what is the chance the doctor should give to the
patient having the virus? The intuitive answer is 99%, but that is likely to be
a gross over-estimate of the true probability.
You meet a man at a sales convention, who mentions his two children, one
of whom, you learn, is a boy. You never found out anything about the other
child. What should be your best estimate of the probability that the other
child is a girl? It’s not a half, as you might think intuitively. If you had met the
same man in different circumstances, accompanied by his son, now what is
the probability that the man’s other child is a girl? Isn’t it the same as before?
In fact, it’s quite different. It does matter in estimating the chance of the other
child being a girl that you bumped into the young boy instead of being told
about him. It matters that his name was Barrington, and it would be slightly
different if it were Bob.
You turn up to watch your local team play football. There are 22 players
on the pitch, plus the referee. What’s the chance that two or more of them
share a birthday? Well, there are 365 days in the year and only 23 people on
the pitch, so the chance is likely to be slim, you might think. In fact, it’s more
likely than not that at least two of those on the pitch share the same birthday.
But the referee is unlikely to be one of them.
Can we improve our forecasts of football match outcomes by studying the
rate of fatalities from horse kicks of Prussian cavalry officers? Yes, we can.
Can we devise a game where you auction a dollar and be pretty much
guaranteed to turn a profit on the deal? There is a way to do this.
You have arranged to meet a stranger on a particular day for an important
appointment, but you forgot to name the time and place, and neither of you
have the contact details of the other. Where and when should you turn up?
You need to double your remaining money to pay off a pressing debt, and
you decide to take to the casino tables. If you don’t double up tonight, you are
doomed to a dusty demise. What staking plan should you adopt to maximise
your chances of survival? The answer may be surprising.
It’s possible to win at Blackjack by counting cards, memorising what cards
have already been dealt. You need a good memory for that, don’t you? You
don’t.
Choose a number between 0 and 100. You win a prize if your number is
equal or closest to two thirds of the average number chosen by all other par-
ticipants. What number should you choose?
Select a newspaper or magazine with a lot of numbers about naturally
occurring phenomena, such as the populations of different countries or the
heights of mountains. Now circle the numbers. Would you expect a very big
difference between the numbers starting with a 1, 2, 3, 4, 5, 6, 7, 8, or 9? Yes,
you would. And that fact can help identify fraudsters.
As a prize for winning a competition, you’re offered a chance to open a
gold, silver or lead casket, in one of which the host has placed a cheque for
£10,000. The others are empty. You choose the gold casket and the silver cas-
ket is opened. It is empty. You are generously offered a chance to swap to a
different casket before the reveal. Should you take the offer? The solution is
counter-intuitive.
The penalty-taker must decide which way to shoot. The goalkeeper must
decide which way, and whether, to dive. How can they use game theory
to maximise their chances of success? The answer involves thinking both
inside and outside the box.
Can we profit on the stock market by waiting till Halloween or by invest-
ing on a cold, overcast day?
Are professional golfers more successful when putting for par than for
birdie?
Does seeing a blue tennis shoe increase the likelihood that all flamingos
are pink?
Do we live in a simulation or is this world the real thing?
How long can we expect humanity as we know it to survive?
We ask and seek to resolve these and many more questions involving prob-
ability, choice, and reason. Not least, we solve the greatest mystery of them
all – why we always seem to end up in the slower lane.
Exercises, references, and links are provided for those wishing to cross-
reference or to probe further, and many of the chapters contain a technical
appendix. Solutions to the exercises are provided at the end of the book.
Author Biography
1. Probability, Evidence, and Reason
DOI: 10.1201/9781003083610-1
did yesterday and the day before, and so on, gradually approaches but never
quite reaches 100%.
The Bayesian viewpoint is just like that, the idea that we learn about the
world and everything in it through a process of gradually updating our
beliefs. In this way, we edge closer to the truth as we obtain more data, more
information, more evidence.
The Bayes Business School, formerly City University of London’s business
school, explained their choice of name in similar terms: “Bayes’ theorem sug-
gests that we get closer to the truth by constantly updating our beliefs in pro-
portion to the weight of new evidence. It is this idea … that is the motivation
behind adopting this name” (Significance, June 2021, p. 3).
As such, the perspective of Reverend Bayes differs from that of philoso-
pher David Hume. For Hume, assumptions about the future, such as that the
sun will rise again, cannot be rationally justified based simply on the past
because no law exists that the future will always resemble the past. Bayes
instead sees reason as a practical matter, to which we can apply the laws of
probability in a systematic way.
To Bayes, therefore, we step ever nearer to the truth based on new evi-
dence and the proper application of the laws of probability. This is called
Bayesian reasoning. According to this approach, we can see probability as
a bridge between ignorance and knowledge. Bayes’ Theorem is, in this way,
concerned with conditional probability. It tells us the probability, or updates
the probability, that a theory or hypothesis is correct, given that we observe
some new evidence. A particularly good thing about Bayesian reasoning is
that the mathematics of it is so straightforward.
At its heart, then, Bayes’ Theorem allows us to use all the information
available to us: our beliefs, our judgments, our subjective opinions, and what
we have already learned from the previous body of knowledge to which we
have had access. We can incorporate this in updating our estimate of the
probability that a hypothesis is true. As such, we can be explicit and open
about the uncertainty in our data and our beliefs. The problem with implicit
reasoning, or intuition, is that our intuition is often wrong and subject to
systematic biases. Instead, we should be trained to think in a Bayesian way
about the world.
Often the conclusions generated by the application of Bayes’ Theorem will
challenge intuition. This is because the world is, in many ways, a counter-
intuitive place. Accepting that fact is the first step towards mastering life’s
logical maze.
Intuition also often lets us down because our in-built judgment of the
weight that we should attach to new evidence tends to be skewed relative to
pre-existing evidence.
New evidence also tends to colour our perception of the pre-existing evi-
dence. Moreover, we tend to see evidence that is consistent with something
being true as evidence that it is in fact true. Bayes’ Theorem is the map that
helps guide us through this maze.
This idea can also be applied to beliefs. So, P (B | E) can be understood as the
degree of belief, B, given evidence, E. P (B) is our prior degree of belief before
we encountered evidence, E. Employing Bayes’ Theorem allows us to con-
vert our prior belief into a posterior belief. When new evidence is observed,
we can perform the same calculation again, this time our previous posterior
belief becoming our next prior belief. And so on. As McGrayne (2011, preface)
puts it, “by updating our initial belief about something with objective new
information, we get a new and improved belief. To its adherents, this is an
elegant statement about learning from experience”.
More generally, the probability that a hypothesis, H, is true given new
evidence, E, is written as P (H | E). Bayes’ Theorem states that:
P (H | E) = P (H) . P (E | H) / P (E).
The problem with P (E), the probability of the evidence, is that it is often
difficult to calculate in real-world cases. In such cases, it may sometimes be
preferable to use the Bayes Factor, which is a formula for comparing the
plausibility of one hypothesis with another.
An alternative approach which doesn’t require knowledge of P (E) is to use
the proportional form of Bayes’ Theorem.
The proportional form of Bayes’ Theorem sees the posterior probability of
a hypothesis, P (H | E), as proportional to the prior probability, P (H),
multiplied by the likelihood, P (E | H).
From this we derive a ratio of how well each of our hypotheses explains
the evidence we have observed.
The posterior odds is a measure of how many times better our hypothesis
explains the evidence compared to a competing hypothesis.
Take as an example a hypothesis that a machine is a perfect Coin Toss
Predictor, in that it can unfailingly calculate how a coin will land face up as
soon as it is thrown. You toss a fair coin a series of times.
If the Predictor is perfect, it will always calculate correctly, so P (E | H1) =
1, i.e. the probability of calling it correctly (E) given that the hypothesis, H1
(it is a perfect Predictor), is true = 1.
The alternative hypothesis, H2, is that the Predictor is simply guessing. In
this case, P (E | H2) = 0.5 from one toss of the coin.
Say you toss the coin five times, and the Predictor calls it correctly five times.
In this case, the probability of doing this by chance = 0.5 × 0.5 × 0.5 × 0.5 ×
0.5 = (0.5)^5 = 1/32. We can multiply the probabilities as each coin toss is an
independent event.
The ratio of the two likelihoods is therefore P (E | H1) / P (E | H2) = 1 / (1/32)
= 32. In other words, the hypothesis that the machine is a perfect Predictor
explains the evidence we have witnessed 32 times better than the alternative
hypothesis, that it is a guessing machine. If we start out regarding the two
hypotheses as equally likely, so that the prior odds are 1, the posterior odds
are 1 × 32 = 32.
If, on the other hand, the prior odds we assign to the machine being a
genuine perfect Predictor compared to a guesser are 1/64, then the posterior
odds = 1/64 × 32 = 1/2. Now, we believe that it is twice as likely that the
machine is a guessing machine as that it is a perfect Coin Toss Predictor.
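The Coin Toss Predictor arithmetic can be checked with a short Python sketch. It assumes that the posterior odds are the prior odds multiplied by the ratio of the two likelihoods, as in the figures quoted in the text. The function and variable names are illustrative only and do not come from the book.

```python
from fractions import Fraction


def posterior_odds(prior_odds, p_evidence_h1, p_evidence_h2):
    """Posterior odds of H1 against H2 = prior odds x likelihood ratio."""
    likelihood_ratio = Fraction(p_evidence_h1) / Fraction(p_evidence_h2)
    return Fraction(prior_odds) * likelihood_ratio


# Evidence: the Predictor calls five tosses of a fair coin correctly.
n_tosses = 5
p_perfect = Fraction(1) ** n_tosses      # P(E | H1): a perfect Predictor never misses
p_guessing = Fraction(1, 2) ** n_tosses  # P(E | H2): five independent 50:50 guesses

# Prior odds of 1 (both hypotheses equally likely) and of 1/64 (sceptical).
even_prior = posterior_odds(Fraction(1), p_perfect, p_guessing)
sceptical_prior = posterior_odds(Fraction(1, 64), p_perfect, p_guessing)

print(even_prior)       # 32
print(sceptical_prior)  # 1/2
```

Exact fractions are used rather than floating-point numbers so that the results match the hand calculation digit for digit.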
How big do the posterior odds have to be to prove convincing? To some
extent that depends on what you are using them for. If it’s to help resolve
a casual disagreement among friends, a small positive number might be
enough. If your life depends on getting it right, you might prefer that num-
ber to end in quite a few zeros!
Prosecutor’s Fallacy
The Prosecutor’s Fallacy is to represent P (H | E) as equivalent to P (E | H).
In fact, P (H | E) = P (E | H) P (H) / P (E), which is Bayes’ Theorem. Therefore,
P (H | E) only equals P (E | H) when P (H) = P (E), i.e. P (H) / P (E) = 1. Bayes’
Theorem can be expanded to:
P (H | E) = P (E | H) P (H) / [P (E | H) P (H) + P (E | H’) P (H’)]
1. Bayes’ Theorem makes clear the importance of not just new evidence
but also the (prior) probability that the hypothesis was true before
the arrival of the new evidence. This prior probability may in com-
mon intuition be given too little (or too much) weight relative to the
latest evidence. Bayes’ Theorem makes the assigned prior probabil-
ity explicit and shows how much weight to attach to it.
2. Bayes’ Theorem allows us a way to update the probability that a
hypothesis is true. It does so by combining the prior probability with
the probability that the new evidence would arise if the hypothesis
is true and the probability that it would arise if the hypothesis is
false.
3. Bayes’ Theorem shows that the probability that a hypothesis (H) is
true given the evidence (E) is not equal to the probability of the evi-
dence arising given that the hypothesis is true, except in limiting
circumstances. Specifically, P (H given E) does not equal P (E given
H) except when P (H) = P (E).
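A rough numerical illustration of the gap between P (H | E) and P (E | H) is sketched below in Python, using the expanded form of Bayes’ Theorem. The 99% figures are those of the virus test described in the Preface. The prior of 1 in 100 is an assumption made here purely for illustration, since the text does not state how common the virus is, and the function name is hypothetical.

```python
def posterior(p_h, p_e_given_h, p_e_given_not_h):
    """Expanded form of Bayes' Theorem: returns P(H | E)."""
    numerator = p_e_given_h * p_h
    denominator = numerator + p_e_given_not_h * (1 - p_h)
    return numerator / denominator


p_e_given_h = 0.99      # 99% of people who have the virus test positive
p_e_given_not_h = 0.01  # 1% of healthy people test positive (99% test negative)
p_h = 0.01              # ASSUMPTION: 1 patient in 100 has the virus

p_h_given_e = posterior(p_h, p_e_given_h, p_e_given_not_h)

print(f"P(E | H) = {p_e_given_h:.2f}")
print(f"P(H | E) = {p_h_given_e:.2f}")
```

Under this assumed prior, a positive result from a 99% accurate test leaves the chance of having the virus at about 50%. Treating P (E | H) as if it were P (H | E) would overstate it by a factor of roughly two.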
1.1.1 Appendix
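Bayes’ Theorem consists of three variables:
a is the prior probability that the hypothesis is true, P (H).
b is the probability of observing the new evidence if the hypothesis is true, P (E | H).
c is the probability of observing the new evidence if the hypothesis is false, P (E | H’).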
(1 − a) is, therefore, the prior probability that the hypothesis is not true. In
traditional notation, we represent this as P (H’) or 1 − P (H), i.e. one minus
the probability that the hypothesis is true.
Using the a, b, c notation, the probability that a hypothesis is true given
some new evidence (“posterior probability”) = ab/ [ab + c (1 − a)].
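The joint probability of the hypothesis and the evidence can be written as:
P (H ∩ E) = P (H | E) . P (E)
Similarly,
P (E ∩ H) = P (E | H) . P (H)
Now,
P (H ∩ E) = P (E ∩ H)
So:
P (H | E) . P (E) = P (E | H) . P (H)
Dividing both sides by P (E) gives Bayes’ Theorem:
P (H | E) = P (E | H) . P (H) / P (E)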
Intuitive Presentation
Bayes’ Theorem can be derived from the equation P (H | E) . P (E) = P (H) . P
(E | H).
The intuition underlying this equation is that both sides are alternative
ways of looking at the same thing: the joint probability that the hypothesis
is true and that the evidence relating to it is observed, P (H and E).
In the a, b, c notation, a = P (H); b = P (E | H); c = P (E | H’).
1.1.2 Exercise
Question a.
Write the Bayesian equation (using a, b, and c) for deriving the poste-
rior (updated) probability of a hypothesis being true after the arrival
of new evidence. Explain what a, b, and c represent.
Question b.
If P (H) is the probability that a hypothesis is true before some new
evidence (E), what is the updated (or posterior) probability after the
new evidence? Use the terms P (H), P (E | H), P (H | E), P (H’), and P
(E | H’) to construct the Bayesian equation.
Question c.
How do the terms used in Question b relate to a, b, and c in the
Bayesian formula referred to in Question a?
Question d.
1. Is the probability that a hypothesis is true, given the evidence,
P (H | E), equal to the probability of the evidence, given that the
hypothesis is true, P (E | H)? In other words, does P (H | E) = P
(E | H)?
2. Is the probability of feeling warm given that you are out in the
sun equal to the probability of being out in the sun given that
you are feeling warm?
Question e.
For a person emerging from a dark cave into the world for the first
time and watching the sun rise seven times, the estimate that it will
rise again is 88.9%, if we use a Bayesian “prior” of 1, 2. Calculate
the updated probability that the sun will rise again if we use a
Bayesian “prior” of 5, 10. What is the significance of using a 5, 10
prior compared to a 1, 2 prior?
Question f.
Uncle Austin and Uncle Idris each present you with a die. One is fair,
and one is biased. The fair die (A) lands on all numbers (1–6) with
equal probability. The biased die (B) lands on 6 with a 50% chance
and each of the other numbers (1–5) with an equal 10% chance each.
Now, choose one of the two dice at random. You can’t tell by
inspection whether it is the fair or the biased die. You now roll the
die, and it lands on 6. What is the probability that the die you rolled
is the biased die?
Answer guide: state the hypothesis to be that you chose the biased die.
What is P (H)? What is the probability that the die is biased before
the evidence that the die landed on a 6?
What is P (E | H)? Note that the evidence is that the die landed on a 6.
What is P (E | H’), i.e. the probability that you would throw a 6 if the
die was not biased?
What is P (H | E)?
Alternatively, you can use the formula: ab / [ab + c (1 − a)].
Question g.
Auntie Beatrice and Auntie Kit each present you with a coin. One
of these is a fair coin, and the other is weighted. The fair coin (Coin
1) lands on heads and tails with equal likelihood, the weighted coin
(Coin 2) lands on heads with a 75% chance.
Flam, F.D. 2014. The odds, continually updated. New York Times. 29 September. https://www.nytimes.com/2014/09/30/science/the-odds-continually-updated.html?referringSource=articleShare
Hooper, M. 2013. Richard Price, Bayes’ theorem and god. Significance. February, 36–39. https://www.york.ac.uk/depts/maths/histstat/price.pdf
Johnson, E.D. and Tubau, E. 2015. Comprehension and computation in Bayesian problem solving. Frontiers in Psychology, 27 July, 6: 938. https://www.frontiersin.org/articles/10.3389/fpsyg.2015.00938/full
Kurt, W. 2019. Bayesian Statistics: The Fun Way. Understanding Statistics and Probability with Star Wars, Lego, and Rubber Ducks. San Francisco, CA: No Starch Press.
Lee, M., and King, B. 2017. Bayes’ theorem: The maths tool we probably use every day. But what is it? The Conversation. 23 April. https://theconversation.com/bayes-theorem-the-maths-tool-we-probably-use-every-day-but-what-is-it-76140
LessWrong. 2011. A history of Bayes’ theorem. Lukeprog. 29 August. https://www.nytimes.com/2014/09/30/science/the-odds-continually-updated.html?referringSource=articleShare
Marianne. 2016. Maths in a minute: The prosecutor’s fallacy. + plus magazine. 11 October. https://plus.maths.org/content/maths-minute-prosecutor-s-fallacy
McGrayne, S.B. 2011. The Theory that Would Not Die: How Bayes’ Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant from Two Centuries of Controversy. New Haven, CT: Yale University Press.
McRaney, D. 2016. YANSS 073 – How to get the most out of realizing you are wrong by using Bayes’ theorem to update your beliefs. 8 April. [Podcast]. https://youarenotsosmart.com/2016/04/08/yanss-073-how-to-get-the-most-out-of-realizing-you-are-wrong-by-using-bayes-theorem-to-update-your-beliefs/
Olasov, I. 2016. Fundamentals: Bayes’ Theorem. 22 April. https://www.khanacademy.org/partner-content/wi-phi/wiphi-critical-thinking/wiphi-fundamentals/v/bayes-theorem
Puga, J., Krzywinski, M., and Altman, N. 2015. Points of significance: Bayes’ theorem. Nature Methods, 12, 4, April, 277–278. https://www.nature.com/articles/nmeth.3335.pdf?origin=ppub
Significance. 2021. A school named Bayes. June, 18, 3.
Stylianides, N. and Kontou, E. 2020. Bayes Theorem and Its Recent Applications. MA3517 Mathematics Research Journal, March, 1–7. file:///C:/Users/epa3willilv/Downloads/3488-9410-1-PB.pdf
Taylor, K. 2018. The Prosecutor’s Fallacy. Centre for Evidence-Based Medicine, 16 July. https://www.cebm.ox.ac.uk/news/views/the-prosecutors-fallacy
Tijms, H. 2019. Chapter 4: Was the champions league rigged? In Surprises in Probability – Seven Short Stories. CRC Press. Taylor & Francis Group, Boca Raton, pp. 23–30.
AMSI. 2020. Bayes’ theorem: The past and the future. 19 June. [Podcast]. https://amsi.org.au/2020/06/19/bayes-theorem-the-past-the-future-acems-podcast/
Rationally Speaking Podcast. 2012. RS58 – Intuition. 8 April. [Podcast]. http://rationallyspeakingpodcast.org/show/rs58-intuition.html
SuperDataScience. SDS 096: Bayes theorem. [Podcast]. https://soundcloud.com/superdatascience/sds-096-bayes-theorem
Wiblin, R., and Harris, K. 2018. How much should you change your beliefs based on new evidence? 7 August. [Podcast]. https://80000hours.org/podcast/episodes/spencer-greenberg-bayesian-updating/
A Derivation of Bayes’ Rule. Ox educ. 29 July 2014. YouTube. https://youtu.be/_DsO4ZSYpHU
A Visual Guide to Bayesian Thinking. Galef, J. 17 July 2015. YouTube. https://youtu.be/BrK7X_XlGB8
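In this case, the hypothesis is that the taxi that knocked down the pedestrian
was green, where a = 0.15 (the prior probability, given by the proportion of
green taxis), b = 0.8 (the probability that the witness says green if the taxi
was green), and c = 0.2 (the probability that the witness says green if the
taxi was not green):
Posterior probability = 0.15 × 0.8 / [0.15 × 0.8 + 0.2 (1 − 0.15)] = 0.41 = 41%.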
In other words, the actual probability that the taxi that knocked down the
pedestrian was green is not 80% (despite the witness evidence) but about half
of that. The baseline probability is important. A common error is to place too
much weight on new evidence about an event (the judgment of the witness)
and too little on the general frequency of that event (in this case, represented
by the proportion of green cabs in the taxi population).
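The calculation can be reproduced with a minimal Python sketch of the formula ab / [ab + c (1 − a)]. The three input values are those used in the worked figures above. The function name is illustrative only.

```python
def bayes_posterior(a, b, c):
    """Posterior probability from the formula ab / [ab + c(1 - a)].

    a: prior probability that the hypothesis is true
    b: probability of the evidence if the hypothesis is true
    c: probability of the evidence if the hypothesis is false
    """
    return a * b / (a * b + c * (1 - a))


a = 0.15  # proportion of green taxis (prior)
b = 0.8   # witness says green when the taxi is green
c = 0.2   # witness says green when the taxi is not green

taxi_posterior = bayes_posterior(a, b, c)
print(f"{taxi_posterior:.2f}")  # 0.41
```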
If new evidence subsequently arises, Bayesians are not content to leave the
probabilities alone. Say, for example, that a second witness appears and is
also given the observation test, revealing a reliability score of 90%. Again,
we have no reason to doubt the integrity of this second witness. A Bayesian
now inserts that number (0.9) into Bayes’ formula (b = 0.9) so that c (the prob-
ability that the witness is mistaken) = 0.1. The new baseline (or prior) prob-
ability, a, is no longer 0.15, as it was before the first witness appeared, but 0.41
(the probability incorporating the evidence of the first witness). In this sense,
yesterday’s posterior probabilities are today’s prior probabilities.
Inserting into Bayes’ Theorem, the new posterior probability = 0.86 = 86%.
This is the new baseline probability underpinning any further new evidence
which might arise.
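The idea that yesterday’s posterior becomes today’s prior can be sketched as a simple loop in Python. The sketch assumes that both witnesses say the taxi was green and that their reports are independent of each other. The names used are illustrative only.

```python
def bayes_posterior(a, b, c):
    """Posterior probability from the formula ab / [ab + c(1 - a)]."""
    return a * b / (a * b + c * (1 - a))


prior = 0.15                        # proportion of green taxis
witness_reliabilities = [0.8, 0.9]  # first and second witness

belief = prior
for reliability in witness_reliabilities:
    # b is the witness's reliability; c is the chance that the witness is mistaken.
    belief = bayes_posterior(belief, reliability, 1 - reliability)
    print(f"after a witness who is {reliability:.0%} reliable: {belief:.2f}")
```

The loop carries the unrounded posterior forward, whereas the text inserts the rounded value 0.41. Both routes give 0.86 to two decimal places.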
There are three critical illustrative cases of the Bayesian Taxi Problem
which bear highlighting. The first is a scenario where the new witness scores
50% on the observation test. Here is a case where intuition and Bayes’ for-
mula converge. A witness who is right only half the time is also wrong half
the time, and so any evidence they give is worthless. Bayes’ Theorem tells us
that this is indeed so, as the posterior probability ends up being equal to the
prior probability.
The second illustrative case is where a new witness is 100% reliable about
the colour of the taxi. In this case, b = 1 and c = 0. Intuition tells us that the
evidence of such a witness solves the case. If the infallible witness says the
taxi was green, it was green. Bayes’ Theorem agrees.
Now for the third illustrative case. If the new witness scores 0% on the
observation test, this indicates that they always identify the wrong colour for
the taxi. If they say it is green, it is not green. So the chance (posterior prob-
ability) that the cab is green if they say so is zero, which accords with Bayes’
Theorem.
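The three illustrative cases can be checked numerically. The sketch below assumes a prior of 0.41 (the probability after the first witness) and a new witness who says the taxi was green; the function name is a hypothetical label.

```python
def bayes_abc(a, b, c):
    """Posterior probability using ab / [ab + c(1 - a)]."""
    return a * b / (a * b + c * (1 - a))


prior = 0.41  # assumed prior, taken from the first witness's evidence

# b is the witness's score on the observation test; c = 1 - b.
results = {b: bayes_abc(prior, b, 1 - b) for b in (0.5, 1.0, 0.0)}

for b, posterior in results.items():
    print(f"b = {b}: posterior = {posterior:.2f}")
```

A score of 0.5 returns the prior unchanged, a score of 1 returns certainty, and a score of 0 returns zero.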
More generally, information that informs us that a witness is usually
wrong is valuable, as it can be reversed to beneficial effect. A witness who
always identifies a green taxi as blue and vice versa, and is 100% consistent in
doing so, yields us reliable information by merely reversing their designated
colour.
So if the witness says the taxi is blue, we can now identify the taxi as defi-
nitely being green. This now converges on the second illustrative case.
Similarly, a witness who is, say, right only 25% of the time in identifying
the colour of the taxi in the observation test also yields us valuable informa-
tion. By reversing the defined colour, this produces a 75% reliability score,
which can be inserted accordingly into Bayes’ Theorem to update the prob-
ability that the taxi that knocked down the pedestrian was green. In other
words, a witness who is 25% reliable and identifies the cab as green is equiv-
alent to the witness being 75% reliable in determining the taxi as blue, and
vice versa.
The only observation evidence that is worthless, therefore, is evidence that
could have been produced by the flip of a coin.
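The equivalence between an unreliable witness and a reversed reliable one can be sketched as follows. The function `posterior_green` is a hypothetical helper, and the prior of 0.15 is taken from the original taxi problem purely for illustration.

```python
def posterior_green(prior_green, reliability, says_green):
    """Probability that the taxi was green, given what the witness says.

    b is the probability of the statement if the taxi was green;
    c is the probability of the statement if the taxi was blue.
    """
    if says_green:
        b, c = reliability, 1 - reliability
    else:
        b, c = 1 - reliability, reliability
    return prior_green * b / (prior_green * b + c * (1 - prior_green))


prior = 0.15
unreliable_says_green = posterior_green(prior, 0.25, says_green=True)
reliable_says_blue = posterior_green(prior, 0.75, says_green=False)
coin_flip_witness = posterior_green(prior, 0.5, says_green=True)

# The first two values coincide; the third equals the prior.
print(unreliable_says_green, reliable_says_blue, coin_flip_witness)
```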
The Bayesian Taxi Problem is an instance of what is known as the Base
Rate Fallacy. This occurs when we undervalue prior information when mak-
ing a judgement as to how likely something is. If presented with general
(base rate) information and specific information (pertaining only to a par-
ticular case), the fallacy arises from a tendency to focus on the latter at the
expense of the former. For example, if someone is an avid book enthusiast,
we might think it more likely that they work in a bookshop or a library than
as, say, a nurse. There are, however, many more nurses than librarians and
bookshop assistants. Our mistake is not to take sufficient account of the base
rate numbers for each occupation.
And the conclusion to the case? CCTV evidence was later produced in
court, which was able to identify the taxi and the driver conclusively. The
pedestrian never regained consciousness. The driver of what transpired to
be a blue taxi told the jury that the pedestrian unexpectedly stepped out and
lightly brushed against the passenger side door. He thought at the time that
it was a minor incident and was completely unaware that the victim had
slipped and hit his head awkwardly. This account was rejected by the jury,
who accepted the prosecution’s contention that the driver had acted with
premeditation and malicious intent. They based their decision on their view
that a driver who was so motivated would indeed have driven off. It was all
they needed to reach their unanimous verdict of first-degree murder.
1.2.1 Appendix
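In the original taxi problem scenario, a = 0.15, b = 0.8 and c = 0.2, so that:
Posterior probability = ab / [ab + c (1 − a)] = 0.15 × 0.8 / [0.15 × 0.8 +
0.2 × (1 − 0.15)] = 0.12 / 0.29 = 0.41.
This is the new baseline probability underpinning any new evidence which
might arise.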
If new evidence subsequently arises, this should be used to update the new
baseline probability of 0.41.
Say, for example, that a new witness is correct 90% of the time (wrong 10%
of the time). New posterior probability = 0.41 × 0.9 / (0.41 × 0.9 + 0.1 ×
0.59) = 0.369 / (0.369 + 0.059) = 0.369 / 0.428 = 86% (rounded to the nearest
per cent). This is also the new baseline probability underpinning any further
new evidence which might arise.
Solution to the three illustrative cases of the Bayesian Taxi Problem:
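1. The first illustrative case is where the new witness scores 50% on the
observation test. In this case, b = 0.5 and c = 0.5. Inserting these values
into the formula gives:
ab / [ab + c (1 − a)] = 0.5a / [0.5a + 0.5 (1 − a)] = 0.5a / 0.5 = a
So when b and c both equal 0.5 in regard to new evidence, this evidence
has no impact on the probability of the hypothesis being tested
being true. The posterior probability equals the prior probability. In
this case, the evidence of the witness can be discounted.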
2. The second illustrative case is where a new witness is 100% accurate
about the colour of the taxi. In this case, b = 1 and c = 0. Intuition tells
us that the evidence of such a witness solves the case. If the infallible
witness says the taxi was green, it was green. Bayes’ formula agrees.
Inserting b = 1 and c = 0 into the formula gives:
ab / [ab + c (1 − a)] = a / (a + 0) = a / a = 1
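The third illustrative case, a witness who scores 0% on the observation test, can be worked through with the same substitution. The following is a sketch that assumes b = 0 and c = 1 (the witness never says green when the taxi is green and always says green when it is not) and a prior a below 1:

```latex
\[
\frac{ab}{ab + c\,(1 - a)}
  = \frac{a \cdot 0}{a \cdot 0 + 1 \cdot (1 - a)}
  = \frac{0}{1 - a}
  = 0 \qquad (a < 1)
\]
```

So if such a witness says the taxi was green, the posterior probability that it was green is zero, which matches the intuition described in the main text.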
1.2.2 Exercise
For the purpose of this exercise, use the a, b, c method to derive the solutions.
Question a.
New Amsterdam has 1,000 taxis, and 800 of them are yellow and
200 are white. The driver of one of these taxis knocks down a
pedestrian and drives away. There is no prior reason to believe
that the driver of a yellow taxi is more likely to have knocked
down the pedestrian than the driver of a white taxi, or vice versa. There is
one witness, however, who saw the event and says the colour of
the cab was white.
The witness, Reverend Latimer Williams, is given a well-respected
observation test and is right 80% of the time.
What is our best estimate now of the probability that the taxi was
white?
Question b.
What if a second witness now comes forward?
We determine that the probability that this witness is correct when
identifying the colour of the taxi is 70%.
The witness, Mr. Henry Morris, says the colour of the taxi was white.
What is the new posterior (updated) probability that the taxi that
knocked down the pedestrian is white?
Question c.
What if a third witness now comes forward?
We determine that the probability that this witness is correct when
identifying the colour of the taxi is 50%.
The witness, Mr. Edmund Coss, says the colour of the taxi was white.
What is the new posterior (updated) probability that the taxi that
knocked down the pedestrian is white?
Question d.
A witness, Mr. Smith, is correct 50% of the time.
A witness, Mr. Jones, is correct 100% of the time.
A witness, Mr. Evans, gets it wrong 100% of the time.
Which of the three witnesses is the most useful/least useful to
investigators?
Where a is the prior probability of the hypothesis (beetle is rare) being true.
b is the probability that we observe the pattern if the beetle is rare
(hypothesis is true). c is the probability that we observe the pattern if the
beetle is not rare (hypothesis is false).
In this case, a = 0.001 (0.1%); b = 0.98 (98%); c = 0.05 (5%).
So, updated probability = ab / [ab + c (1 − a)] = 0.0192. So there is just a
1.92% chance that the beetle is rare when the distinctive pattern is spotted
on its back.
Why the counter-intuitive result? Few beetles are rare, so it would take a
lot more evidence than observing the distinctive pattern to alter the prior
expectation that the beetle is not rare.
So the probability that the beetle is rare (the hypothesis) given that we
observe the distinctive pattern (the evidence) is 1.92%. What is the chance,
however, that we will observe the distinctive pattern if the beetle is rare? In
other words, what is the chance of observing the evidence (the pattern) if the
hypothesis (the beetle is rare) is correct? That is 98%.
To believe these two things are the same is a common mistake known as
the Inverse (or Prosecutor’s) Fallacy. In this instance, it is to believe that the
chance of observing the pattern given that the beetle is rare (98%) is the same
as the chance that the beetle is rare given the observation of the pattern (the
actual probability that the beetle is rare, which is 1.92%).
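One way to see why the two probabilities differ is to count beetles rather than apply the formula. The sketch below uses a natural-frequency illustration, which is a different presentation from the a, b, c formula in the text; the population of 100,000 beetles is an assumption chosen only to keep the counts whole.

```python
population = 100_000  # assumed population size, for illustration only

rare = population // 1000                   # 0.1% of beetles are rare
common = population - rare

rare_with_pattern = rare * 98 // 100        # 98% of rare beetles show the pattern
common_with_pattern = common * 5 // 100     # 5% of common beetles show the pattern

# Chance of the pattern given the beetle is rare (evidence given hypothesis).
p_pattern_given_rare = rare_with_pattern / rare

# Chance the beetle is rare given the pattern (hypothesis given evidence).
p_rare_given_pattern = rare_with_pattern / (rare_with_pattern + common_with_pattern)

print(p_pattern_given_rare)            # 0.98
print(round(p_rare_given_pattern, 4))  # 0.0192
```

Of the 5,093 beetles showing the pattern, only 98 are rare, because common beetles with the pattern heavily outnumber them.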
1.3.1 Appendix
We can also solve the beetle problem using the traditional notation version
of Bayes’ Theorem.
In this case, P(H) = 0.001 (0.1%); P(E|H) = 0.98 (98%); P(E|H’) = 0.05 (5%).
So, P(H|E) = 0.98 × 0.001 / [0.98 × 0.001 + 0.05 × 0.999] = 0.00098 / (0.00098
+ 0.04995) = 0.00098 / 0.05093 = 0.0192. So there is just a 1.92% chance that
the beetle is rare when the entomologist spots the distinctive pattern on its
back.
Note also that P(H|E) = 0.0192, while P(E|H) = 0.98.
The Prosecutor’s Fallacy is to conflate these two expressions.
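The traditional notation maps directly onto the a, b, c method, with a = P(H), b = P(E|H) and c = P(E|H’). A minimal sketch, in which the function and variable names are hypothetical, computes P(E) first and then divides:

```python
def posterior(p_h, p_e_given_h, p_e_given_not_h):
    """P(H|E) from Bayes' Theorem in traditional notation."""
    # Total probability of the evidence, P(E), across both cases.
    p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
    return p_e_given_h * p_h / p_e


p_e_given_h = 0.98
p_h_given_e = posterior(p_h=0.001, p_e_given_h=p_e_given_h, p_e_given_not_h=0.05)

print(round(p_h_given_e, 4))  # 0.0192
print(p_e_given_h)            # 0.98
```

Printing the two values side by side shows how far apart P(H|E) and P(E|H) are in this example.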
1.3.2 Exercise
A nature lover spots what might be a rare category of beetle, due to the pat-
tern on its back. In the rare category, 95% have the pattern. In the common
category, only 2% have the pattern. The rare category accounts for only 1% of
the population. How likely is the beetle to be rare?
In solving this question, what are a, b, and c?
Solve again, using traditional notation, in the case where 5% (instead of
2%) of those in the common category have the pattern.
surgery have the virus. The test is 99% accurate, in the sense that 99% of
people with the virus test positive, and 99% of those who do not have the
virus test negative.
Let us say that the first patient tests positive. What is the chance that the
patient has the virus?
The intuitive answer is 99%, as the test is 99% accurate. But is that right?
The information we are given relates to the probability of testing positive
given that you have the virus. What we want to know, however, is the probabil-
ity of having the virus given that you test positive. This is a crucial difference.
Common intuition conflates these two probabilities, but they are very dif-
ferent. If the test is 99% accurate, this means that 99% of those with the virus
test positive. But this is not the same thing as saying that 99% of patients
who test positive have the virus. This is another example of the “Inverse
Fallacy” or “Prosecutor’s Fallacy”. In fact, those two probabilities can diverge
markedly.
So what is the probability you have the virus if you test positive, given that
the test is 99% accurate? To answer this, we can use Bayes’ Theorem.
The probability that a hypothesis is true after obtaining new evidence,
according to the a, b, c formula of Bayes’ Theorem, is equal to: ab / [ab + c
(1 − a)], where: