The goal of this section is to introduce some key terms necessary for expressing Bayes’ theorem. To facilitate this, it is helpful to have a working example. Let’s assume the following: 1% of the population has the disease VX and 99% does not; everyone who has VX tests positive for it, and no one who lacks it tests positive. This information can be represented using a probability tree (see Figure 1).
First, let’s define what are known as unconditional probabilities or prior probabilities. These are probabilities that do not depend upon (are not conditioned by) something else being the case. In our example, there are three prior probabilities.
This information can be put in the language of hypothesis and evidence / data / observation. First, suppose we have the hypothesis that we have VX. Call this hypothesis h. The prior probability of h is the probability of h independent of (or prior to) any test for VX.
Definition 1 (prior probability of h) The probability of the hypothesis h before we take into consideration some evidence e. Let P(h) stand for “the probability of the hypothesis (h)” independent of evidence e.
Second, with this hypothesis in mind, we can characterize the results of a VX test as evidence. Call either a positive or negative test result e. The prior probability of e is the probability of e independent of whether or not we have the VX disease.
Definition 2 (prior probability of e) The probability of the evidence e before we take into consideration some hypothesis (theory) h. Let P(e) stand for “the probability of an observation (e)” independent of a hypothesis h.
What we have, then, are two different unconditional probabilities: P(h) and P(e). Note, however, that incoming information influences the probability of the hypothesis, and that scientists want to know the probability of a hypothesis given some piece of evidence. In other words, given some incoming evidence e, we should adjust the probability of h in light of e.
Definition 3 (conditional probability) The conditional probability of A given B is the probability of A on the condition that B is the case.
There are two important conditional probabilities in our example.
Definition 4 (posterior probability of e - P(e|h)) On the assumption (condition) that the hypothesis h is true, the probability of the evidence e. That is, the likelihood of some observation e given (in light of) the hypothesis h. Let P(e|h) stand for the probability of e assuming h is true.
Definition 5 (posterior probability of h - P(h|e)) On the assumption (condition) that some observation e is the case, the probability of the hypothesis (h). That is, the likelihood of the hypothesis h given (in light of) the evidence e. Let P(h|e) stand for the probability of h given e.
The posterior probability of h given e, that is, P(h|e), is what we want to know. Bayes’ theorem tells us how to calculate this information.
At the core of the Bayesian approach to science is Bayes’ theorem. What is the theorem? Let’s consider three different, increasingly precise articulations of Bayes’ theorem.
Definition 6 (Super simple Bayes’ theorem) Belief (hypothesis) + new evidence = new and improved belief (hypothesis)
Definition 7 (Bayes’ theorem in English) The probability of a hypothesis given some new evidence is equal to the probability of the evidence given the truth of the hypothesis times the probability of the hypothesis independent of the evidence divided by the probability of the evidence independent of the hypothesis.
Definition 8 (Bayes’ theorem) P(h|e) = P(e|h) × P(h) / P(e)
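Definition 8 can be expressed as a one-line computation. The sketch below is mine, not the text’s; the probabilities are simply passed in as numbers:

```python
# A minimal sketch of Bayes' theorem (Definition 8).
# Given the likelihood P(e|h), the prior P(h), and the prior
# probability of the evidence P(e), return the posterior P(h|e).

def bayes(p_e_given_h, p_h, p_e):
    """Return P(h|e) = P(e|h) * P(h) / P(e)."""
    return p_e_given_h * p_h / p_e
```

For instance, with a likelihood of 1, a prior of .01, and evidence probability .01, the posterior is 1.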
In what follows, some step-by-step examples are provided to show how Bayes’ theorem can be used to determine the posterior probability of some hypothesis given some evidence. In using Bayes’ theorem, the following step-by-step method will be followed:
Example 1 (VX screening)
Now take John. What is his probability of having VX, P(h)? Looking at the probability tree, we would say it is 1%. Now suppose John is tested for VX and he tests positive. What now is his probability of having VX given that he has tested positive, P(h|e)?
The answer, in this case, is obvious, but it can be computed using Bayes’ theorem. First, let’s write out Bayes’ theorem: P(h|e) = P(e|h) × P(h) / P(e)
Next, let’s input the values from our probability tree, beginning with the prior probability that John has VX: P(h) = .01
Next, input the probability that he tests positive given that he has VX: P(e|h) = 1
Next, we input the probability that he tests positive prior to determining whether he does or does not have VX. For this, we add the probability that he has VX and tests positive (.01 × 1) to the probability that he does not have VX and tests positive (.99 × 0): P(e) = (.01 × 1) + (.99 × 0) = .01
Finally, we do the calculations and, as expected, we find that the probability that John has VX given a positive test is 100%: P(h|e) = (1 × .01) / .01 = 1
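The step-by-step calculation above can be sketched in Python, using the values from Example 1’s probability tree:

```python
# Worked version of Example 1 (the VX test): 1% of people have VX,
# the test is positive for everyone with VX and for no one without it.

p_h = 0.01              # P(h): prior probability of having VX
p_e_given_h = 1.0       # P(e|h): positive test given VX
p_e_given_not_h = 0.0   # P(e|not-h): positive test given no VX

# P(e): total probability of a positive test (law of total probability)
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)

# P(h|e): Bayes' theorem
p_h_given_e = p_e_given_h * p_h / p_e
print(p_h_given_e)  # 1.0 -- on these values, a positive test guarantees VX
```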
What we see from the above is that Bayes’ theorem can be used to determine the probability of the hypothesis that someone has the VX disease given a positive test result. It allows us to update the probability of a hypothesis given new information. Let’s consider a more complicated and realistic example, since we cannot expect every test to be 100% accurate.
Example 2 (Cancer screening)
Suppose a woman between 40 and 50 goes to have a mammogram. Her doctor meets with her to tell her that her test has come back positive. Since the test is not 100% reliable, she does not know if, in fact, she has cancer. It could be a false positive. What is the probability that she actually has breast cancer?
We can now use the diagram and Bayes’ theorem to determine the probability that an individual has cancer given a positive test. First, we determine the probability that an individual tests positive given that they actually have cancer, P(e|h). This is multiplied by the probability that an individual has cancer independent of whether they test positive or negative, P(h).
Next, we determine the likelihood that someone will test positive for cancer independent of whether they have cancer or not: P(e) = P(e|h) × P(h) + P(e|¬h) × P(¬h).
Then, using Bayes’ theorem, we can calculate the probability that one has cancer given a positive test: P(h|e) = P(e|h) × P(h) / P(e).
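Since the exact values from the diagram are not reproduced here, the sketch below uses illustrative numbers common in presentations of this example; they are assumptions of mine, not the text’s:

```python
# Illustrative values (assumed, not from the text's diagram):
# 1% prevalence, an 80% true-positive rate, a 9.6% false-positive rate.

p_h = 0.01               # P(h): prior probability of breast cancer
p_e_given_h = 0.8        # P(e|h): positive mammogram given cancer
p_e_given_not_h = 0.096  # P(e|not-h): positive mammogram given no cancer

p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
p_h_given_e = p_e_given_h * p_h / p_e
print(round(p_h_given_e, 3))  # ~0.078: under 8%, despite the positive test
```

On these assumed rates, most positive tests are false positives, because the disease is rare to begin with.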
Discussion 1 Suppose you are working for a large corporation that drug tests employees. To keep things simple, let’s suppose that the drug is cocaine. If any individual fails the drug test, they are immediately fired and the employer informs the police.
Now suppose someone tests positive for cocaine. What is the probability that they are on cocaine given the positive test?
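The discussion does not fix the test’s statistics, so the sketch below assumes some rates purely for illustration; the point it brings out is how a low base rate dominates the result:

```python
# Assumed rates, chosen only to illustrate the calculation:
# 1% of employees use cocaine, the test catches 95% of users,
# and it falsely flags 5% of non-users.

p_h = 0.01              # P(h): prior probability of cocaine use
p_e_given_h = 0.95      # P(e|h): positive test given use
p_e_given_not_h = 0.05  # P(e|not-h): false positive

p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
p_h_given_e = p_e_given_h * p_h / p_e
print(round(p_h_given_e, 3))  # ~0.161: most positives are false positives
```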
Bayes’ theorem has a wide variety of applications. What role does it play in the philosophy of science?
Note 1 (Bayes’ theorem and the confirmation of a scientific theory.)
Bayes’ theorem can be used to formulate a theory of incremental confirmation. Namely, we can say that some evidence e confirms h if and only if P(h|e) > P(h). That is, if the posterior probability of a hypothesis is greater than the prior probability of the hypothesis, then the hypothesis has been confirmed by the evidence.
This does not mean that the hypothesis is true, likely, or that we should believe the theory when P(h|e) > P(h). Notice that in the case of cancer screening, even though the probability that you have cancer given a positive test is greater than the probability that you have cancer independent of a test (P(h|e) > P(h)), it is still more likely that you don’t have cancer than that you do.
Note 2 (Bayes’ theorem largely corresponds to how we think observation and evidence influence the probability of a scientific theory)
First, if e is likely whether h is the case or not, e won’t (strongly) support a hypothesis h. That is, if e is true under a variety of competing hypotheses, then e won’t increase the probability of h being true.
For example, suppose a hypothesis h1 and two competing hypotheses h2 and h3. If, under each hypothesis, on the condition that the hypothesis is true, it is the case that the sky is blue (that is, P(e|h1) = P(e|h2) = P(e|h3)), then e does not change the likelihood of h1. In short, evidence that supports a variety of competing hypotheses equally, or is true whether h is the case or not, won’t be strongly supportive (see Figure 2).
Second, if e is extremely unlikely unless h is true, and it turns out that e is the case, then e will significantly increase the likelihood of h. The idea here is that if a hypothesis makes a novel (or unusual) prediction, one that is not likely given other hypotheses, and this prediction is confirmed, then the probability of the hypothesis significantly increases.
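The contrast between these two cases can be made vivid with a small calculation; the numbers below are illustrative assumptions, not values from the text:

```python
# Compare the two cases in Note 2, holding the prior P(h) = 0.5 fixed.
# Case 1: e is likely whether or not h is true.
# Case 2: e is a novel prediction, very unlikely unless h is true.

def posterior(p_h, p_e_given_h, p_e_given_not_h):
    """P(h|e) via Bayes' theorem and the law of total probability."""
    p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
    return p_e_given_h * p_h / p_e

print(posterior(0.5, 0.9, 0.9))             # 0.5 -- e does not move h at all
print(round(posterior(0.5, 0.9, 0.01), 3))  # ~0.989 -- e strongly confirms h
```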
Note 3 (Bayes’ theorem and ad hoc modifications.)
Bayesianism is capable of accounting for problems associated with ad hoc modifications to theories. For suppose a hypothesis h runs into conflict with some observation e. A proponent of h might, however, modify the theory in an ad hoc way in order to preserve it.
However, the Bayesian can account for why ad hoc modifications fail to make a theory better than the original: in making the theory less susceptible to being falsified, the modification also deprives it of the improbable predictions whose confirmation would raise its probability.
Objection 1 (How do we know the prior probability of the hypothesis?)
The use of Bayes’ theorem requires that we know (i) the prior probability of the hypothesis, (ii) the prior probability of the evidence, and (iii) the conditional probability of the evidence given the hypothesis. How is the prior probability determined before applying Bayes’ theorem? For example:
Response 1 (All hypotheses have equal prior probability) One response is to hold that all hypotheses are equally probable until the evidence makes them more or less probable. Call this the objectivist position.
Imagine two boxers: Ryan and Frank. To determine the likelihood of one boxer beating the other, we begin by simply supposing that each has an equal chance. So the likelihood that Ryan will beat Frank is 50% and Frank beating Ryan is 50%. From there, we look at features in the world, using Bayes’ theorem, to adjust the likelihoods, e.g. injury, training, etc.
Objection 2 (Probability of scientific hypotheses) The objectivist approach might work in simple cases where there are two options, but it cannot work in science, where there are potentially an infinite number of hypotheses. For if there are an infinite number of hypotheses, then the prior probability of each is 0, and Bayes’ theorem won’t work if the prior probability of a hypothesis is 0 (no matter the evidence).
Response 2 (The prior probability of a hypothesis is determined subjectively) In contrast to an objectivist view of probability, let’s consider the subjectivist view. The subjectivist accepts Bayes’ theorem but interprets the prior probability of h in terms of the degree of confidence that people have in h.
There are, however, some important caveats to this claim:
So, in the case of the two boxers, Ryan and Frank, the probability that Ryan will beat Frank is whatever individuals would be willing to bet. From there, we look at features in the world, using Bayes’ theorem, to adjust the likelihoods, e.g. injury, training, etc.
Objection 3 (This makes probability rest on subjective considerations.) The problem seems to be that individuals might have different prior probabilities for the same hypothesis. For example, I might say that the prior probability of astrology being true is 99% and the prior probability of it not being true is 1%, whereas you might say that the prior probability of astrology being true is 1% while the probability of it being false is 99%.
Response 3 (Initial subjectivity is fine, probabilities converge.) Consider a hypothesis that you would not take to be very probable. Let’s suppose that I strongly believe that a certain man in an alley has psychic powers. I think the hypothesis that he can predict the future is 99.9% probable and you think it is closer to .1%. With this hypothesis in hand, we can subject him to various tests and this incoming information will (hopefully) lead us to assign the same probability to his ability to predict the future.
For example, suppose you and I go to test the hypothesis. You ask him to guess what number you are thinking of from 1 to 100, and he answers correctly; then from 1 to 10,000, and he answers correctly again. You are slightly more convinced. There are alternative explanations, but these seem less and less likely the more he guesses correctly. Finally, you ask him to guess tomorrow’s lottery numbers. He guesses correctly. This prediction is staggering, and with each unbelievable prediction, your re-use of Bayes’ theorem would lead you to change the probability you assign to the man having psychic powers from .1% toward 99.9%.
In short, the initial probability of a hypothesis is unimportant. What is important is that the probability of the hypothesis is allowed to adjust to incoming information.
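The convergence described in Response 3 can be sketched as repeated Bayesian updating; the per-guess probabilities below are illustrative assumptions:

```python
# Two people start with very different priors that the man is psychic
# (0.999 for the believer, 0.001 for the skeptic), then both update on
# the same string of correct guesses. Each correct guess is assumed to
# be near-certain if he is psychic (0.99) and very unlikely otherwise
# (0.01); these rates are illustrative.

def update(prior, p_e_given_h=0.99, p_e_given_not_h=0.01):
    """One application of Bayes' theorem after a correct guess."""
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / p_e

believer, skeptic = 0.999, 0.001
for _ in range(4):  # four correct guesses in a row
    believer, skeptic = update(believer), update(skeptic)

print(round(believer, 4), round(skeptic, 4))  # both end up near 1.0
```

Despite starting three orders of magnitude apart, the two priors converge after only a few pieces of strong evidence, which is the point of Response 3.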