Elementary Probability - Computing with Events
Overview
Probability can be somewhat difficult and counterintuitive.
=> It's better to approach the subject in stages.
Stages of learning in probability:
- Elementary probability (This module):
- Learn how to reason about coin flips, die rolls and the like.
- Understand sample spaces, events and what
probability means.
- Learn about combining events, conditional probability,
total-probability, and Bayes' rule.
- Be able to solve simple word problems of this kind:
- A biased coin has probability 0.7 of showing heads when tossed.
What is the probability of obtaining exactly 4 heads in 8 tosses?
- I have two coins, one of which is fair (Pr[heads]=0.5)
and the other biased (Pr[heads]=0.7). I pick one of them
randomly and flip 8 times. If I obtain 4 heads (in 8 tosses)
what is the probability that I picked the fair coin?
- Core probability:
- The concepts of: random variable, distribution, expectation.
- A few well-known discrete and continuous distributions.
- Difference between CDF and PDF.
- Joint distributions, conditional distributions, independence.
- Laws of large numbers and Central Limit Theorem.
- Elementary stats: mean, variance, confidence intervals.
- Simulation:
- How to generate random values from given distributions.
- How to write a discrete-event simulation.
- Simulation data structures.
- (Assorted) Intermediate topics:
- Conditional distributions, Markov chains.
- Queueing Theory.
- Monte Carlo methods.
- Hidden-Markov models.
Discrete vs. continuous:
- The discrete-continuous dichotomy in probability manifests in two different ways.
- Actual probabilities are themselves real numbers
=> That part is always continuous
- However, the outcomes of an experiment may be
- discrete: e.g., result of a coin flip
- continuous: e.g., location of a robot in the (x,y) plane
- It's usually easier to understand discrete probability,
so we'll start with that.
About this module:
- We will focus on elementary probability
=> As usual, we will take a computational approach.
- But first, let's familiarize ourselves with some classic examples.
Coin flips
Let's begin with some simple coin flips:
- Consider the following program
public class CoinExample {
    public static void main (String[] argv)
    {
        // Flip a coin 5 times.
        Coin coin = new Coin ();
        for (int i=0; i<5; i++) {
            int c = coin.flip ();     // Returns 1 (heads) or 0 (tails).
            System.out.println ("Flip #" + i + ": " + c);
        }
    }
}
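If Coin.java is not at hand, a stand-in with the same two operations used above can be sketched with java.util.Random. The class name SketchCoin, its constructors and its internals are assumptions made for illustration; the course's own Coin.java may well be written differently.

```java
import java.util.Random;

// Hypothetical stand-in for the course's Coin class (not the actual Coin.java).
class SketchCoin {

    private final double probHeads;   // Pr[heads]
    private final Random rand;

    // A fair coin.
    SketchCoin ()
    {
        this (0.5);
    }

    // A biased coin with the given Pr[heads].
    SketchCoin (double probHeads)
    {
        this (probHeads, new Random ());
    }

    // Passing in the Random allows a fixed seed for repeatable runs.
    SketchCoin (double probHeads, Random rand)
    {
        this.probHeads = probHeads;
        this.rand = rand;
    }

    // Returns 1 (heads) with probability probHeads, else 0 (tails).
    int flip ()
    {
        // nextDouble() is uniform in [0,1), so this comparison
        // succeeds with probability probHeads.
        return (rand.nextDouble () < probHeads) ? 1 : 0;
    }

    public static void main (String[] argv)
    {
        SketchCoin coin = new SketchCoin ();
        for (int i=0; i<5; i++) {
            System.out.println ("Flip #" + i + ": " + coin.flip ());
        }
    }

}
```

The comparison against a uniform random number is one common way to produce a biased 0/1 outcome; other designs are possible.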
Exercise 1:
Download Coin.java,
CoinExample.java,
and RandTool.java.
Then compile and execute.
Exercise 2:
Download StrangeCoin.class.
This class models a biased coin in which Pr[heads] is not 0.5.
Write a program to estimate Pr[heads] for this coin.
Next, we'll estimate the probability that, if we flip a
biased (Pr[heads]=0.6)
coin twice, at least one flip resulted in heads:
public class CoinExample2 {
    public static void main (String[] argv)
    {
        // "Large" # trials.
        double numTrials = 1000000;
        // Count # times desired outcome shows up.
        double numSuccesses = 0;
        Coin coin = new Coin (0.6);   // Pr[heads]=0.6
        for (int n=0; n<numTrials; n++) {
            // Notice what we're repeating for #trials: the two flips.
            int c1 = coin.flip ();
            int c2 = coin.flip ();
            if ( (c1==1) || (c2==1) ) {
                // If either resulted in heads, that's a successful outcome.
                numSuccesses ++;
            }
        }
        // Estimate.
        double prob = numSuccesses / numTrials;
        System.out.println ("Pr[At least 1 h in 2 flips]=" + prob + " theory=" + 0.84);
    }
}
- Thus far, the appropriate number of trials to use seems arbitrary.
- To avoid casting in the final division, we declared both numTrials and
numSuccesses as doubles.
Exercise 3:
Modify the above program to estimate the probability of obtaining
exactly 2 heads in 3 flips. Can you reason about the theoretically
correct answer?
Exercise 4:
This exercise has to do with obtaining tails in the first two flips
using the biased (Pr[heads]=0.6) coin:
- What is the probability that, in 3 flips, the third and only
the third flip results in heads?
- What is the probability that, with an unlimited number of
flips, you need at least three flips to see heads for the
first time?
Dice
The following simple experiment rolls a die and estimates
the probability that the outcome is odd:
public class DieRollExample {
    public static void main (String[] argv)
    {
        double numTrials = 1000000;
        double numSuccesses = 0;
        Die die = new Die ();
        for (int n=0; n<numTrials; n++) {
            // This is the experiment: a single roll.
            int k = die.roll ();
            // This is what we're interested in observing:
            if ( (k==1) || (k==3) || (k==5) ) {
                numSuccesses ++;
            }
        }
        double prob = numSuccesses / numTrials;
        System.out.println ("Pr[odd]=" + prob + " theory=" + 0.5);
    }
}
Exercise 5:
Roll a pair of dice (or roll one die twice).
What is the probability that the first outcome
is odd AND the second outcome is even?
Write a program to simulate this event
using Die.java and
see if you can reason about this theoretically.
Cards
Let's now draw cards from a deck of cards:
- We'll number the cards 0,..,51.
- Convention: 0..12 (spades), 13..25 (hearts), 26..38
(diamonds), 39..51 (clubs)
- Type of drawing:
- We can draw with replacement
=> Put the card back into the deck.
- Or, without replacement
=> Draw further cards without putting back cards already drawn.
- We'll use the class CardDeck,
which has methods
int drawWithReplacement ()
int drawWithoutReplacement ()
boolean isSpade (int c)
boolean isHeart (int c)
boolean isDiamond (int c)
boolean isClub (int c)
Now consider this experiment:
- Draw a card without replacement, then draw a second card.
=> What is the probability the second card is the Ace of Hearts?
- Here's a simple program to estimate this probability:
public class CardExample {
    public static void main (String[] argv)
    {
        double numTrials = 100000;
        double numSuccesses = 0;
        for (int n=0; n<numTrials; n++) {
            // Note: we use a fresh pack each time because the cards aren't replaced.
            CardDeck deck = new CardDeck ();
            int c1 = deck.drawWithoutReplacement ();
            int c2 = deck.drawWithoutReplacement ();
            // Card 13 is the Ace of Hearts (hearts are numbered 13..25).
            if (c2 == 13) {
                numSuccesses ++;
            }
        }
        double prob = numSuccesses / numTrials;
        System.out.println ("Pr[c2=13]=" + prob + " theory=" + (1.0/52.0));
    }
}
- We can test membership in a suit as follows:
int c = deck.drawWithoutReplacement ();
if ( deck.isClub(c) ) {
    // Take appropriate action ...
}
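To make the difference between the two types of drawing concrete, here is a minimal sketch of how such a deck might be implemented. The class SketchDeck and its method numRemaining are hypothetical and only illustrate the idea; this is not the actual CardDeck.java, which may work differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of the two kinds of drawing (not the actual CardDeck.java).
class SketchDeck {

    private final List<Integer> cards = new ArrayList<> ();
    private final Random rand;

    SketchDeck (Random rand)
    {
        this.rand = rand;
        for (int c=0; c<52; c++) {
            cards.add (c);
        }
    }

    // With replacement: look at a random card but leave the deck unchanged.
    int drawWithReplacement ()
    {
        return cards.get (rand.nextInt (cards.size ()));
    }

    // Without replacement: remove a random card, so it cannot be drawn again.
    int drawWithoutReplacement ()
    {
        int index = rand.nextInt (cards.size ());
        return cards.remove (index);
    }

    int numRemaining ()
    {
        return cards.size ();
    }

    public static void main (String[] argv)
    {
        SketchDeck deck = new SketchDeck (new Random ());
        int c1 = deck.drawWithoutReplacement ();
        int c2 = deck.drawWithoutReplacement ();
        System.out.println ("Drew " + c1 + " then " + c2 + "; cards left: " + deck.numRemaining ());
    }

}
```

In this sketch the only difference between the two methods is whether the chosen card is removed from the list, which is why a fresh deck is needed for each trial when drawing without replacement.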
Exercise 6:
Use CardDeck.java to estimate
the following probabilities when two cards are drawn
without replacement.
- The probability that the first card is a club given
that the second is a club.
- The probability that the first card is a diamond given
that the second is a club.
That is, we are interested in those outcomes when the second card
is a club.
Theory: sample spaces, events and probability
First, we'll define a few terms:
- Experiment: An action that produces an outcome that we're interested in,
and which can be, at least in thought, repeated under
exactly the same conditions.
e.g., Flip a coin
- An outcome is an observable result of an experiment.
e.g., Heads
- The sample space for an experiment is the set
of all possible outcomes.
e.g., {heads, tails}
- Typically, we use Ω to represent the sample space.
Perhaps the most important term:
- An event is a subset of the sample space
=> Any subset is an event.
Exercise 7:
List all possible events of the sample space Ω={heads, tails}.
Exercise 8:
How many possible events are there for a sample space of
size |Ω|=n?
Correspondence between events and "word problems":
- Consider this problem: what is the probability of obtaining
exactly 2 heads in three flips of a coin?
- What is the sample space?
Ω=
{(H,H,H),
(H,H,T),
(H,T,H),
(H,T,T),
(T,H,H),
(T,H,T),
(T,T,H),
(T,T,T)}
- What is the event of interest?
Let E = {(H,H,T), (H,T,H), (T,H,H)}
- Example: what is the probability of obtaining an odd number
in a die roll?
- What is the sample space?
Ω= {1, 2, 3, 4, 5, 6}
- What is the event of interest?
Let E = {1, 3, 5}
- The "occurrence" of an event:
- Consider a die roll and the event E = {1, 3, 5}.
- Suppose you roll the die and "3" shows up
=> Event E occurred (because one of its outcomes occurred).
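Since events are just sets of outcomes, the idea of an event "occurring" can be sketched directly with Java sets. The class and method names here (EventSketch, occurred) are made up for this illustration, and Set.of assumes Java 9 or later.

```java
import java.util.Set;

class EventSketch {

    // An event "occurs" when the observed outcome is one of its elements.
    static boolean occurred (Set<Integer> event, int outcome)
    {
        return event.contains (outcome);
    }

    public static void main (String[] argv)
    {
        Set<Integer> omega = Set.of (1, 2, 3, 4, 5, 6);   // Sample space for a die roll.
        Set<Integer> E = Set.of (1, 3, 5);                // Event: the outcome is odd.
        System.out.println ("E is a subset of the sample space: " + omega.containsAll (E));
        System.out.println ("Roll 3: E occurred? " + occurred (E, 3));
        System.out.println ("Roll 4: E occurred? " + occurred (E, 4));
    }

}
```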
Probability:
Exercise 9:
How many such numbers need to be specified for a die roll?
Operations on events:
- The usual set operators apply to events, since events are sets.
- A' = Ω-A = complement of A.
- A and B = intersection of event A and
event B.
Exercise 10:
Use the axioms to show that Pr[A'] = 1 - Pr[A]
for any event A.
Other ways of specifying probability measures:
- For a die-roll, specify as follows:
Pr[E] = |E| / 6, for any event E.
- Note: this achieves two purposes
- Compact representation (instead of full list of events).
- Assumes Axiom 3 is satisfied
=> Axiom 3 is used by construction
e.g., Pr[{1,3,5}] = Pr[{1}] + Pr[{3}] + Pr[{5}] = 1/6 + 1/6 + 1/6 = 3/6
- Similarly, for a single card drawing:
Pr[E] = |E| / 52, for any event E.
- For example, E={0,1,...,12}
- Pr[E] = 13/52 (using Axiom 3)
- Notation: we will occasionally use an informal event
description (in English) e.g.,
Pr[draw a spade] = 13/52.
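The compact rule Pr[E] = |E| / |Ω| translates into a one-line computation. The sketch below assumes, as in the text, that all outcomes are equally likely; the names UniformMeasureSketch and pr are invented for the example.

```java
import java.util.HashSet;
import java.util.Set;

class UniformMeasureSketch {

    // Pr[E] = |E| / |Omega| when every outcome is equally likely.
    static double pr (Set<Integer> event, int sampleSpaceSize)
    {
        return (double) event.size () / sampleSpaceSize;
    }

    public static void main (String[] argv)
    {
        // Die roll: the event "odd".
        Set<Integer> odd = Set.of (1, 3, 5);
        System.out.println ("Pr[odd]=" + pr (odd, 6));

        // Single card drawing: the event "spade" is {0,1,...,12}.
        Set<Integer> spades = new HashSet<> ();
        for (int c=0; c<=12; c++) {
            spades.add (c);
        }
        System.out.println ("Pr[draw a spade]=" + pr (spades, 52));
    }

}
```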
Simple and complex experiments:
- Consider two die rolls:
- Sample space:
Ω = {(1,1), (1,2), (1,3), (1,4), (1,5), (1,6),
(2,1), (2,2), (2,3), (2,4), (2,5), (2,6),
(3,1), (3,2), (3,3), (3,4), (3,5), (3,6),
(4,1), (4,2), (4,3), (4,4), (4,5), (4,6),
(5,1), (5,2), (5,3), (5,4), (5,5), (5,6),
(6,1), (6,2), (6,3), (6,4), (6,5), (6,6)}
- Thus, there are 2^36 possible events.
- One way to specify:
Pr[E] = |E|/36, for any event E.
- Example: Pr[same number on both dice] = 6/36.
- More complex experiments can be constructed out of simpler ones.
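For a sample space this small, one can also let the computer enumerate it. The following sketch (class and method names are assumptions for illustration) builds the 36 ordered pairs and counts the outcomes in the event "same number on both dice".

```java
import java.util.ArrayList;
import java.util.List;

class TwoDiceSketch {

    // The sample space: all ordered pairs (i,j) with i,j in 1..6.
    static List<int[]> sampleSpace ()
    {
        List<int[]> omega = new ArrayList<> ();
        for (int i=1; i<=6; i++) {
            for (int j=1; j<=6; j++) {
                omega.add (new int[] {i, j});
            }
        }
        return omega;
    }

    // Size of the event "same number on both dice".
    static int countSame (List<int[]> omega)
    {
        int count = 0;
        for (int[] outcome: omega) {
            if (outcome[0] == outcome[1]) {
                count ++;
            }
        }
        return count;
    }

    public static void main (String[] argv)
    {
        List<int[]> omega = sampleSpace ();
        int sizeE = countSame (omega);
        System.out.println ("|Omega|=" + omega.size () + " |E|=" + sizeE
                            + " Pr[E]=" + ((double) sizeE / omega.size ()));
    }

}
```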
Exercise 11:
What is a compact way of representing the sample space for
two card drawings without replacement?
What subset corresponds to the event "both are spades"?
Exercise 12:
Consider this experiment: flip a coin repeatedly until you
get heads. Count the number of flips. What is the sample space?
A minor tweak to Axiom 3:
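- Since Pr[A union B] = Pr[A] + Pr[B] for disjoint events A and B, it's true
that Pr[A union B union C] = Pr[A] + Pr[B] + Pr[C] for pairwise disjoint A, B, C.
- In general, for pairwise disjoint events A1, ..., An,
Pr[ union of A1, ..., An]
= Pr[A1] + ... + Pr[An]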
- But consider a sample space like this:
Ω = {1, 2, 3, ...} (countably infinite)
- Let event Ai={i}.
- Let event A = union of all Ai's.
- Countable additivity axiom:
Pr[A] = Pr[A1] + Pr[A2] + ...
=> Axiom 3 needs to hold for countable unions of disjoint events.
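As a small illustration of countable additivity, suppose (purely as an assumed example, not something fixed by the text) that Pr[Ai] = 1/2^i. Here A is the union of all the Ai's, which is all of Ω, and the infinitely many probabilities still add up to 1:

```latex
\Pr[A] \;=\; \sum_{i=1}^{\infty} \Pr[A_i]
       \;=\; \sum_{i=1}^{\infty} \frac{1}{2^{i}}
       \;=\; 1 \;=\; \Pr[\Omega]
```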
The meaning of probability: the frequentist interpretation
- Consider an experiment, an event E and the number Pr[E].
- Suppose we are able to repeat the experiment any number of times.
- Let f(n) = number of times event E occurs in
n repetitions.
- Then,
lim_{n→∞} f(n)/n = Pr[E].
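The frequentist reading can be observed numerically. The sketch below uses java.util.Random as a stand-in for a fair die (an assumption; it does not use Die.java) and prints f(n)/n for the event E = {1, 3, 5} as n grows. The class name FrequencySketch is invented for the example.

```java
import java.util.Random;

class FrequencySketch {

    // Returns f(n)/n for the event E = {1,3,5} in n simulated die rolls.
    static double relativeFrequency (int n, Random rand)
    {
        int f = 0;
        for (int i=0; i<n; i++) {
            int k = 1 + rand.nextInt (6);   // Uniform on 1..6.
            if (k % 2 == 1) {
                f ++;                       // E occurred on this repetition.
            }
        }
        return (double) f / n;
    }

    public static void main (String[] argv)
    {
        Random rand = new Random ();
        for (int n=10; n<=1000000; n*=10) {
            System.out.println ("n=" + n + " f(n)/n=" + relativeFrequency (n, rand));
        }
    }

}
```

The printed values differ from run to run, but for large n they should settle near Pr[E] = 0.5.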
As it turns out, there is another interpretation, based on the
intuitive term "belief".
But that needs a little more background (an understanding of distributions).
Conditional probability
About conditional probability:
- The theory developed so far can be used to solve many simple
word problems.
- However, it does not account for questions of this sort:
Given that the outcome of a die-roll is odd, what then is
the probability that the outcome is 3?
- Unfortunately, there does not seem to be an event
corresponding to the above item of interest.
Consider the die-roll example:
- Define the events
A = {3}
B = {1, 3, 5}
- Consider n repetitions of the experiment and let
f_B(n) = # occurrences of B
f_AB(n) = # occurrences of B when A also occurs
- Then, for large n, we are really interested in
desired conditional probability = f_AB(n) / f_B(n)
- Divide the numerator and the denominator by n:
f_AB(n) / f_B(n) = (f_AB(n)/n) / (f_B(n)/n)
- In the limit, f_B(n)/n → Pr[B]
- In the limit, f_AB(n)/n → Pr[A and B]
- Thus,
desired conditional probability =
Pr[A and B] / Pr[B]
More formally,
- The conditional probability Pr[A|B] is defined as
Pr[A|B] = Pr[A and B] / Pr[B]
- Note that [A|B] is not an event.
- Read Pr[A|B] as probability of "A given B".
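Both the definition and the counting argument that motivated it can be tried out on the die-roll example with A = {3} and B = {1, 3, 5}. This is only a sketch: the names ConditionalSketch, exact and estimate are made up, java.util.Random stands in for a fair die, and the exact method assumes equally likely outcomes, in which case Pr[A and B] / Pr[B] reduces to |A and B| / |B|.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

class ConditionalSketch {

    // Exact value under equally likely outcomes: |A and B| / |B|.
    static double exact (Set<Integer> a, Set<Integer> b)
    {
        Set<Integer> both = new HashSet<> (a);
        both.retainAll (b);                 // Intersection of A and B.
        return (double) both.size () / b.size ();
    }

    // Estimate: (# times both A and B occur) / (# times B occurs).
    // Assumes B occurs at least once in n rolls (otherwise the result is NaN).
    static double estimate (Set<Integer> a, Set<Integer> b, int n, Random rand)
    {
        double numB = 0;
        double numAB = 0;
        for (int i=0; i<n; i++) {
            int k = 1 + rand.nextInt (6);   // Uniform on 1..6.
            if (b.contains (k)) {
                numB ++;
                if (a.contains (k)) {
                    numAB ++;
                }
            }
        }
        return numAB / numB;
    }

    public static void main (String[] argv)
    {
        Set<Integer> A = Set.of (3);
        Set<Integer> B = Set.of (1, 3, 5);
        System.out.println ("Pr[A|B] exact=" + exact (A, B)
                            + " estimate=" + estimate (A, B, 1000000, new Random ()));
    }

}
```

Note that the estimate only looks at repetitions in which B occurred, which is exactly what conditioning on B means.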
Exercise 13:
Consider an experiment with three coin flips. What is the
probability that 3 heads occurred given that at least one
head occurred?
Exercise 14:
Consider two card drawings without replacement. What is the probability
that the second card is a club given that the first is a club?
An important variation of stating the rule for conditional probability:
- By re-arranging the definition
Pr[A|B] Pr[B] = Pr[A and B]
- By symmetry,
Pr[B|A] Pr[A] = Pr[B and A] = Pr[A and B]
- Thus, for any two events A, B:
Pr[A|B] Pr[B] = Pr[B|A] Pr[A]
An interesting twist:
- Notice that we can re-arrange the last equation as
Pr[A|B] = (Pr[B|A] Pr[A]) / Pr[B]
- Next, consider the card example (without replacement):
- Let A = first card was a club.
- Let B = second card was a club.
- Then, Pr[A|B] = probability first was a club
given that the second was a club.
- Thus, it seems as though we can reason about past events
given knowledge of current events.
- This is not strictly true, because the sample space consists
of the entire history (for two card drawings):
Ω =
{(0,1),...,(0,51),
(1,0), (1,2),...,(1,51),
...
(51,0), (51,1),...,(51,50)}
Thus, we are really reasoning about the entire history.
- This idea
Pr[A|B] = (Pr[B|A] Pr[A]) / Pr[B]
is often called Bayes' rule.
Back to the card problem:
- Recall the earlier problem:
- The probability that the first card is a club given
that the second is a club.
- The probability that the first card is a diamond given
that the second is a club.
- Define these events:
C1 = first card is a club
C2 = second card is a club
D1 = first card is a diamond
- Then,
Pr[C1|C2] =
Pr[C2|C1] Pr[C1] / Pr[C2]
- Observe that
Pr[C1] = 13/52
Pr[C2] = 13/52
Pr[C2|C1] = 12/51
Exercise 15:
Why?
- Substituting, we get Pr[C1|C2] = 12/51.
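The value Pr[C1|C2] = 12/51 can be cross-checked by brute force, since the sample space of ordered pairs is small enough to enumerate. The sketch below assumes every ordered pair of distinct cards is equally likely and uses the numbering convention from earlier (clubs are 39..51); the class name CardPairSketch and its methods are hypothetical.

```java
class CardPairSketch {

    // Clubs are numbered 39..51 in the convention used earlier.
    static boolean isClub (int c)
    {
        return (c >= 39) && (c <= 51);
    }

    // Returns { # pairs where the second card is a club,
    //           # pairs where both cards are clubs }.
    static int[] countPairs ()
    {
        int secondClub = 0;
        int bothClubs = 0;
        for (int c1=0; c1<52; c1++) {
            for (int c2=0; c2<52; c2++) {
                if (c1 == c2) {
                    continue;               // Without replacement: no repeats.
                }
                if (isClub (c2)) {
                    secondClub ++;
                    if (isClub (c1)) {
                        bothClubs ++;
                    }
                }
            }
        }
        return new int[] {secondClub, bothClubs};
    }

    public static void main (String[] argv)
    {
        int[] counts = countPairs ();
        double prob = (double) counts[1] / counts[0];
        System.out.println ("Pr[C1|C2]=" + prob + " theory=" + (12.0/51.0));
    }

}
```

Here the ratio of counts is 156/663, which reduces to 12/51.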
Exercise 16:
Compute Pr[D1|C2].
A variation of Bayes' rule:
Exercise 17:
Finish (by hand) the calculation above.
Consider another example:
- A lab performs a blood test to reveal the presence or absence
of a certain infection.

- The test is not perfect
=> If a person is infected, the test may not always work.
=> An uninfected person may have a test turn out positive (false positive).
- Assume we have the following model:
- 5% of the population is infected.
- The probability that the test works for an infected person
is 0.99.
- The probability of a false positive is 3%.
- We want the following probabilities:
- Probability that if a test is positive, the person is
infected.
- Probability that if a test is positive, the person is well.
- Define these events:
S = person is sick
T = test is positive
Exercise 18:
Write down the desired probabilities in terms of what's given
in the model, and complete the calculations.
Write a program to confirm your calculations:
Continuous sample spaces
Consider the following example:
Exercise 19:
Download and execute BusStopExample.java.
You will also need BusStop.java.
- What is Pr[A>1]? What is Pr[A>0.5]?
- What is an example of an event for the above sample space?
- Is {1.0, 1.2, 1.5} an event?
- Is the interval [1.0, 3.5] an event?
Exercise 20:
Download BusStop.java and
BusStopExample.java.
Then change the type of interarrival by replacing true
with false above, and estimate Pr[first-interarrival > 0.5].
Exercise 21:
Modify the code to estimate the following conditional
probability. Let A denote the interarrival time.
Estimate Pr[A>1.0|A>0.5] for each of
the two types of interarrivals (exponential and uniform).
What is strange about the result you get for
the exponential type?
Hint: compare this estimate to that in the previous exercise.
Exercise 22:
Estimate the average interarrival time for each
of the two types of interarrivals (exponential and uniform).
More on the same example:
- Suppose we arrive at the bus stop at time=10.0.

- We can count how many buses have gone by:
double myArrivalTime = 10;     // We arrive at 10.0
BusStop busStop = new BusStop (true);
double arrivalTime = 0;
double numBuses = -1;
while (arrivalTime < myArrivalTime) {
    numBuses ++;
    busStop.nextBus ();        // Keep generating successive buses
    arrivalTime = busStop.getArrivalTime ();
}
System.out.println ("Number of buses: " + numBuses);
Exercise 23:
Why doesn't the following code work?
...
double numBuses = 0;
while (arrivalTime < myArrivalTime) {
    busStop.nextBus ();
    arrivalTime = busStop.getArrivalTime ();
    numBuses ++;
}
...
Exercise 24:
Modify the above program to estimate the probability
that at least 5 buses have gone by.
Exercise 25:
Consider the waiting time: the time from your arrival
(at 10.0) to the time the next bus arrives.
Estimate your average waiting time.
- Does this depend on when you arrive (at 10.0 vs. 20.0, for example)?
- Try this for each of the two types of interarrivals
(exponential and uniform).
- What do you observe is unusual about this wait time?
(In)Dependence
Consider flipping a coin twice:
- Let C1 be the event: "first flip is heads"
- Let C2 be the event: "second flip is heads"
- We will compute Pr[C2]
and Pr[C2|C1].
- Here is a simple program to estimate both:
double numTrials = 1000000;
double numSuccesses = 0;      // #times C2 occurs.
double numFirstHeads = 0;     // #times C1 occurs.
double numBothHeads = 0;      // #times both are heads.
for (int n=0; n<numTrials; n++) {
    Coin coin = new Coin ();
    int c1 = coin.flip ();
    int c2 = coin.flip ();
    if (c2 == 1) {
        numSuccesses ++;
    }
    if (c1 == 1) {
        numFirstHeads ++;
        if (c2 == 1) {
            numBothHeads ++;
        }
    }
}
double prob1 = numSuccesses / numTrials;        // Pr[C2]
double prob2 = numBothHeads / numFirstHeads;    // Pr[C2|C1]
System.out.println ("Pr[c2=1]=" + prob1 + " Pr[c2=1|c1=1]=" + prob2);
Exercise 26:
Instead of Coin.java in the example
above, use BizarreCoin.java
and see what you get.
Some definitions:
- Two events A and B are independent
if Pr[A|B] = Pr[A].
- Thus, events A and B are dependent
if they are not independent.
- Recall that
Pr[A|B] = Pr[A and B] / Pr[B].
- Thus, if they are independent
Pr[A and B] / Pr[B] = Pr[A].
Or,
Pr[A and B] = Pr[A] Pr[B].
This is how some books define independence.
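The product form is convenient for checking independence by counting. As an illustration (the events are chosen here for the example and do not come from the text), take two die rolls with Pr[E] = |E|/36, let A = "first roll is 6" and B = "the two rolls sum to a given value". Then Pr[A and B] = Pr[A] Pr[B] is the same as 36 |A and B| = |A| |B|, which avoids floating-point comparisons. The class and method names are assumptions.

```java
class IndependenceSketch {

    // Returns { |A|, |B|, |A and B| } for A = "first roll is 6" and
    // B = "the two rolls add up to sum".
    static int[] counts (int sum)
    {
        int a = 0;
        int b = 0;
        int ab = 0;
        for (int i=1; i<=6; i++) {
            for (int j=1; j<=6; j++) {
                boolean inA = (i == 6);
                boolean inB = (i + j == sum);
                if (inA) a ++;
                if (inB) b ++;
                if (inA && inB) ab ++;
            }
        }
        return new int[] {a, b, ab};
    }

    // Pr[A and B] = Pr[A] Pr[B]  <=>  36 |A and B| = |A| |B|.
    static boolean independent (int sum)
    {
        int[] c = counts (sum);
        return 36 * c[2] == c[0] * c[1];
    }

    public static void main (String[] argv)
    {
        System.out.println ("first=6 and sum=7 independent? " + independent (7));
        System.out.println ("first=6 and sum=8 independent? " + independent (8));
    }

}
```

With a sum of 7 the test succeeds (6 × 6 = 36 × 1), whereas with a sum of 8 it fails (6 × 5 ≠ 36 × 1), so independence depends on the particular pair of events and not just on the experiment.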
Exercise 27:
For the bus-stop problem, define the following events:
- Event B1 = exactly one bus arrives in the period [0,0.5].
- Event B2 = exactly one bus arrives in the period [0.5,1].
First, without writing a program, what does your intuition tell you
about whether these events are independent?
Next, write a program to estimate Pr[B2] and
Pr[B2|B1].
Do this for each of the exponential and uniform cases.
Problem solving
Probability problems are often tricky and defy intuition.
We suggest the following high-level
problem-solving procedure:
- Identify the sample space and find some compact way of writing
it down.
- Identify the question being asked:
- Is it an event (subset of the sample space)?
- Or is it a conditional probability?
- Is conditioning involved? That is, does it seem
like one action will change the probabilities for a later action?
=> If so, the conditional probabilities are usually
easier to read off from the problem.
- Identify relevant events and write them down, if possible,
as subsets of the sample space.
- Try to identify, if conditioning is involved, which
conditional probabilities are useful.
- How is the probability measure specified?
- Finally, complete the calculations.
Let's look at a simple example: what is the probability that
a rolled pair of dice shows the same number?
- Identify the sample space:
Ω = {(i,j): i,j ∈ {1,2,...,6}}
- There appears to be no conditioning.
- The event of interest:
Let A = {(i,i): i ∈ {1,2,...,6}}
- The probability measure is not specified.
We'll make the assumption that each outcome is equally likely.
- Now we solve the problem:
Pr[A] = |A| / |Ω| = 6/36
Let's look a little deeper at the above problem:
- A more intuitive measure is: each number shows up with equal
probability on each die
Pr[i] = 1/6, i ∈ {1,...,6}
- How does this give us
Pr[(i,j)] = 1/36, i,j ∈ {1,...,6}?
- Define the events
Ai = "obtain i on first roll"
Bj = "obtain j on second roll"
- Then, one can assume independence (because the second roll
shouldn't be affected by the outcome of the first):
Pr[Ai] = 1/6
Pr[Bj] = 1/6
- Therefore,
Pr[(i,j)] = Pr[Ai] Pr[Bj] = 1/36
- Thus, the assumption of independence is buried when we state
Pr[A] = |A| / |Ω|
for this example.
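One way to see the independence assumption at work is to simulate the two rolls separately, each with Pr[i] = 1/6, and check that the estimate for "same number" comes out near 6/36. This is a sketch under the assumption that successive values from java.util.Random behave like independent rolls; the class name SameNumberSketch is invented and Die.java is not used.

```java
import java.util.Random;

class SameNumberSketch {

    // Estimates Pr[same number] by rolling two dice separately in each trial.
    static double estimateSame (int numTrials, Random rand)
    {
        double numSuccesses = 0;
        for (int n=0; n<numTrials; n++) {
            int i = 1 + rand.nextInt (6);   // First roll: uniform on 1..6.
            int j = 1 + rand.nextInt (6);   // Second roll: generated separately.
            if (i == j) {
                numSuccesses ++;
            }
        }
        return numSuccesses / numTrials;
    }

    public static void main (String[] argv)
    {
        double prob = estimateSame (1000000, new Random ());
        System.out.println ("Pr[same number]=" + prob + " theory=" + (6.0/36.0));
    }

}
```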
Another example:
Let's now look at an instructive variation of the above:
Exercise 28:
Three subcontractors - let's call them A,B,C - to a popular
clothing store contribute roughly
20%, 30% and 50% of the products in the store.
Turns out that 6% of A's products have some defect; the defect numbers
for B and C are 7% and 8% respectively.
A product is selected at random and found to be defective. What
is the probability that it came from subcontractor A?