Module 13: Probability and Algorithms
Introduction
Probability is the study of "chance":
The mathematics of probability is used to answer questions such as:
- If two dice are rolled, what is the probability that the
face-up numbers add up to 7?
(Simple probability)
- If n samples are used in an estimate of average
temperature, how large should n be for 3-decimal-place accuracy?
(First course in probability)
- Is the distribution of accident victims among ages uniform?
(Second course in probability/stats)
- What is the relationship between GDP and Dow Jones?
(Sophisticated statistical modeling required)
- What is the average time required by a particle
experiencing Brownian motion to escape a given enclosure?
(Fairly advanced probability)
Probability and computer science:
- There is a significant overlap between computing and probability.
- Simulation:
- Writing programs to simulate a real-life activity that
incorporates randomness
=> use multiple simulations to determine systems statistics.
- Random number generation.
- Simulation packages.
- Monte Carlo simulation:
- Estimating integrals using randomness.
- Computational probability:
- Computation of probability models, statistical models.
- Computation related to reliability estimation.
- Performance evaluation:
- Systematic approach to architecture/system design using
probabilistic models of load.
- Involves probabilistic modeling and simulation.
- Queueing theory, Markov chains.
- Analysis of algorithms:
- Assume a probability distribution over possible inputs (e.g,
a random permutation of integers).
- Analyse running time (e.g., of a sorting algorithm) mathematically.
- Generate random data and evaluate algorithm performance.
- Randomized algorithms:
- Introduce randomness deliberately into an algorithm
=> can help performance.
- Example: randomized choice of partition-element in QuickSort.
- Some randomized algorithms are easier to code than competing
deterministic algorithms.
Simple Estimation
In-Class Exercise 13.1:
Consider the following situation. I place three closed boxes A, B and
C, in front of you, one of which contains "the treasure" and the
others are empty. The box containing the treasure has a secret
marking on the exterior that only I can identify (So I know where the
treasure is). You guess one box by pointing to it. Suppose you pick
Box A. Now one or both of boxes B or C must be empty. If both are
empty, I pick one of them randomly, else I pick the only empty one.
Then, I show you that it is empty by opening it. After this, there
are two boxes left, one of which contains the treasure.
I now let you change your selection before finally revealing where the
treasure is. The question is: does the option to change your selection matter?
Consider a dice-estimation example:
- Experiment: roll two dice and see if the face-up numbers add up
to 7.
- Objective: write a program to estimate the probability of
this event occuring.
- Key ideas:
- Each die has numbers { 1, 2, 3, 4, 5, 6 }.
- Write code to select one of these randomly.
- Get random value for each die.
- Add numbers to see if they total 7.
- Repeat many trials of this experiment.
- Report fraction of trials in which this event occurs.
- Sample Java code:
(source file)
public static void main (String[] argv)
{
// We will count the number of times our event of interest occurs.
int count = 0;
// Repeat experiment 100 times.
for (int i=0; i < 100; i++) {
// Roll the dice.
int m = (int) UniformRandom.uniform (1, 6);
int n = (int) UniformRandom.uniform (1, 6);
// See if the event of interest occurs.
if (m+n == 7)
count ++;
}
// Compute the fraction of time the event occurred.
double avg = (double) count / 100;
// Output.
System.out.println ("Estimated Probability: " + avg);
}
- Improved program:
- The above program uses only 100 samples in its estimate
=> that may be too small.
- The desired total "7" is hardcoded
=> we might be interested in other totals.
- Let's re-write the code to make it a little more general:
(source file)
public static double estimate (int sum, int numSamples)
{
// We will count the number of times our event of interest occurs.
int count = 0;
// Repeat experiment "numSamples" times.
for (int i=0; i < numSamples; i++) {
// Roll the dice.
int m = (int) UniformRandom.uniform (1, 6);
int n = (int) UniformRandom.uniform (1, 6);
// See if the event of interest occurs.
if (m+n == sum)
count ++;
}
// Compute the fraction of time the event occurred.
double avg = (double) count / numSamples;
return avg;
}
A game example:
- Consider a generic random 2-person game:
- The players are called A and B.
- There is no skill involved, only randomness determines the result.
- If A wins, A gets a reward. Likewise, if B wins then B gets a reward.
- If the probability that A wins is 0.5, the game is fair.
- A and B repeat multiple games
=> in the long run their winnings should be approximately
the same.
- Consider this particular game:
- An n-bit string is randomly generated (Here, n is even).
- If the number of 1's is equal to the number of 0's, then A wins.
(Otherwise B wins).
- Is the game fair?
- How to make an unbalanced game fair:
- If the probability of A winning is less than 0.5, increase
A's reward.
- Let's write a program that:
- Simulates a number of games.
- Takes the reward values as input.
- Prints out the resulting winnings of each player.
- We will use n = 6
- First, here is code to play a single game:
public static boolean playGame ()
{
int count = 0;
// Generate a random 6-bit string.
for (int i=0; i<6; i++) {
// Generate i-th bit randomly.
int bit = (int) UniformRandom.uniform (0, 1);
// Count number of 1's.
if (bit == 1)
count ++;
}
if (count == 3)
return true; // A wins.
else
return false; // B wins.
}
- Now, let's add code to play multiple games and accumulate rewards:
(source file)
public static void playGames (int numGames, double rewardA, double rewardB)
{
// Each player starts with zero.
double sumA = 0, sumB = 0;
for (int i=0; i < numGames; i++) {
// Play a game.
boolean A_wins = playGame ();
if (A_wins) {
// A gets reward.
sumA += rewardA;
}
else {
// B gets reward.
sumB += rewardB;
}
}
// Results.
System.out.println (" A's winnings: " + sumA);
System.out.println (" B's winnings: " + sumB);
}
- By experimenting with different rewards, the game can be
made fair.
In-Class Exercise 13.2:
Write a program to estimate which of the two strategies for the 3-box
problem (in exercise 13.1) is better. To do this, you can use 2
players, one of whom bets on "changing", the other who "stays with the
current box". Then see whether anyone tends to win in the long run
(playing multiple games).
You will need UniformRandom.java.
A "real-valued" game example:
- The game is played as follows:
- A number R is given ahead of time.
- Two points are randomly generated in the unit square.
- If the distance between the two points is less than R, player A wins.
(Otherwise B wins).
- Two ways to make the game fair:
- Use differing rewards.
- Use an optimal value of R.
(The approach we will take).
- Question: what value of R will make this a fair game?
=> we want the probability of A winning to be about 0.5.
- Let's write code to simulate multiple games for a given
value of R:
(source file)
public static void playGames (int numGames, double R)
{
int numA_wins = 0, numB_wins = 0;
for (int i=0; i < numGames; i++) {
// Play a game.
boolean A_wins = playGame (R);
if (A_wins) {
numA_wins ++;
}
else {
numB_wins ++;
}
}
// Results.
double fracA_wins = (double) numA_wins / numGames;
double fracB_wins = (double) numB_wins / numGames;
System.out.println ("Fraction of games won by A: " + fracA_wins);
System.out.println ("Fraction of games won by B: " + fracB_wins);
}
public static boolean playGame (double R)
{
// Generate two points randomly in the unit square.
double x1 = UniformRandom.uniform();
double y1 = UniformRandom.uniform();
double x2 = UniformRandom.uniform();
double y2 = UniformRandom.uniform();
// Compute the distance between them.
double dist = Math.sqrt ( (x1-x2)*(x1-x2) + (y1-y2)*(y1-y2) );
// See if A wins.
if (dist < R)
return true;
else
return false;
}
In-Class Exercise 13.3:
Download the above code and experiment: what value of R makes
the game fair?
Estimation and Accuracy
About data:
- Definition: the sample space is the set of all possible
values for whatever is being estimated.
- Example: in observing a die-roll, the sample space {1, 2, 3, 4, 5, 6}.
- Example: in producing a random point in the unit square
=> sample space is [0,1] x [0,1].
- Types of data:
- Discrete-finite: sample space is a finite set.
- Discrete-infinite: sample space is countably infinite.
e.g. number of flips required to get your first "heads".
- Continuous: sample space is uncountably infinite
=> usually a subset of real numbers.
Estimation:
- Suppose we observe n samples
=> the outcomes of n experiments.
- Denote the samples by s1, s2, ..., sn.
- Example:
- Suppose roll a die 8 times and outcomes are 1, 2, 1, 1, 3, 6, 4, 4.
- Then,
s1 = 1, s2 = 2, s3 = 1, ...,
s8 = 4.
- Definition: the sample mean M is the average of the
n samples:
M = (s1 + s2 + ... +
sn) / n.
- Definition: the sample variance V is the average of
squared-distances from the mean.
V = ( (s1-M)2 + (s2-M)2 + ...
(sn-M)2 ) / n
- Definition: the sample standard deviation D is the
square-root of the variance.
D = V 0.5
Accuracy in estimation:
- Recall the Two-Point-Game:
the value R was close to but not 0.5.
=> how can we tell for sure it's not 0.5?
- Number of samples: how many samples are "sufficient"?
- The theory of probability provides a way to determine how
many samples are "sufficient":
- Suppose n samples are used to obtain estimates of mean and variance, say
M' and V'.
- Then, compute d = 1.96 * sqrt (V'/n)
- Then the probability the real mean lies in the range
[M'-d, M'+d] is 0.95.
- The interval [M'-d, M'+d] is called a confidence interval.
- Many statistical techniques help in improving the quality of
estimation.
(We would need a course on statistics to cover these methods).
- Sample Java code:
- We will define a useful class to collect data, accumulate
totals, variance and compute the confidence interval.
- To use the class:
Statistics stats = new Statistics ();
for (int i=0; i < numTrials; i++) {
// Produce i-th data.
stats.add ( ... );
}
// Summary:
System.out.println (stats);
- The code:
(source file)
public class Statistics {
long numSamples = 0;
double sampleTotal = 0;
double sampleMean = 0;
double sampleVar = 0;
public void addData (double x)
{
// Increment number of samples obtained.
numSamples ++;
// Accumulate total for sample mean.
sampleTotal += x;
// First, save current mean.
double oldMean = sampleMean;
// Now update sample mean.
sampleMean = sampleTotal / numSamples;
// Use recurrence relation for variance.
if (numSamples == 1) {
sampleVar = 0;
}
else {
sampleVar = (1 - 1.0 / (double) (numSamples - 1)) * sampleVar
+ numSamples * (sampleMean - oldMean) * (sampleMean - oldMean);
}
}
public double getMean ()
{
return sampleMean;
}
public double getVariance ()
{
return sampleVar;
}
public double getConfidenceHalfInterval ()
{
// mean +- this value, i.e., this is half the interval.
return 1.96 * Math.sqrt (sampleVar / numSamples);
}
public String toString()
{
if (numSamples == 0)
return "No samples";
else
return getMean() + " +- " + getConfidenceHalfInterval() + " #samples=" + numSamples;
}
}
In-Class Exercise 13.4:
Use the above Statistics.java class to obtain an accurate
estimate for the mean distance between two points randomly generated
in the unit square. (The code for generating the two points has been
given prior to Exercise 13.3).
Histograms
What are they?
- Sometimes, estimating just the mean (average value) does not
provide a complete picture
=> would like some sense of the "spread" of data.
- A histogram provides a way to estimate "spread".
Example: a group of coin-flips
- Suppose we flip a coin n times: how many "heads" are obtained?
- The average number of heads: n / 2.
- However, what does this say about how likely it is to get
n / 3 heads?
- Let P(i) be the probability that i heads are
obtained
(Here, 0 <= i <= n).
- We will write code to estimate each P(i)
=> a histogram.
static void computeHistogram (int numFlips, int numTrials)
{
// Create space for the bins.
int[] bins = new int [numFlips+1];
// Perform trials.
for (int i=0; i < numTrials; i++) {
// Flip coins "numFlips" times and count.
int count = 0;
for (int j=0; j < numFlips; j++) {
if (UniformRandom.uniform() < 0.5) {
// Heads.
count ++;
}
}
// Update counter.
bins[count] ++;
}
System.out.println ("Histogram: ");
System.out.println (" Bin #Trials Fraction");
for (int i=0; i <= numFlips; i++) {
System.out.println (" " + i + " " + bins[i] + " "
+ ((double)bins[i]/(double)numTrials) );
}
}
- The usage of "bin" will become clear below.
Example: evaluating whether a distribution is uniform
- This example is for "real-valued" data.
- For the coin flip case, there were only n+1 outcomes:
{ 0, 1, 2, ..., n }.
- For real-valued data, the outcomes are infinite.
=> we will create "intervals" or "bins"

- Sample Java code:
static void computeHistogram (int numBins, int numTrials)
{
// Space for the bins.
int[] bins = new int [numBins];
// Length of a single small interval.
double interval = 1.0 / (double) numBins;
// Perform trials.
for (int i=0; i < numTrials; i++) {
// Generate a value.
double u = UniformRandom.uniform ();
// Find out where this value lies (which bin or interval).
int whichBin = (int) Math.floor (u / interval);
// Update counter.
bins[whichBin] ++;
}
System.out.println ("Histogram: ");
System.out.println (" Bin #Trials Fraction");
for (int i=0; i < numBins; i++) {
System.out.println (" " + i + " " + bins[i] + " "
+ ((double)bins[i]/(double)numTrials) );
}
}
- Example with 10 bins:
- With 10 bins and 10 samples, here is the output of a sample run:
Histogram:
Bin #Trials Fraction
0 0 0.0
1 0 0.0
2 1 0.1
3 1 0.1
4 2 0.2
5 1 0.1
6 1 0.1
7 2 0.2
8 2 0.2
9 0 0.0
Note:
- We have also printed out the fraction of occurences for each bin.
- The fraction should be around 0.1 (about 1/10-th) for a
large number of samples.
- With 1000 samples:
Histogram:
Bin #Trials Fraction
0 100 0.1
1 98 0.098
2 108 0.108
3 105 0.105
4 117 0.117
5 103 0.103
6 112 0.112
7 81 0.081
8 92 0.092
9 84 0.084
- With 100000 samples:
Histogram:
Bin #Trials Fraction
0 9854 0.09854
1 10141 0.10141
2 9914 0.09914
3 9967 0.09967
4 10101 0.10101
5 9998 0.09998
6 10144 0.10144
7 10016 0.10016
8 9895 0.09895
9 9970 0.0997
In-Class Exercise 13.5:
Obtain a histogram (use 10 bins) of the distance between two randomly
generated points in the unit square. Is the distribution uniform?
Monte Carlo Estimation
What is it?
- Monte Carlo estimation is a way of computing functions (usually
definite integrals) using random generation.
- Key ideas:
- Consider a function that is difficult to integrate, e.g,

- First, draw a box around it of any height:

- Now suppose we generate points randomly in the box:

- Let N be the total number of points generated.
- Let Nf be the number points that fall below
the curve.
- Let A be the area of the box.
- Let Af be the area under the curve.
- Then, for large enough N
Nf / N = Af / A
Therefore
Af = (A * Nf) / N
- Note that all quantities on the right are easy to determine.
Example:
- Let's compute the integral of an easy function like f(x) =
x2 over the interval [0, 1].
- Sample Java code:
(source file)
// Initialize count.
int count = 0;
// Box dimensions and area.
double minY = 0;
double maxY = 1;
double boxArea = 1.0;
// Perform trials.
for (int i=0; i < numTrials; i++) {
// First pick an x value randomly.
double x = UniformRandom.uniform();
// Now pick a y-value randomly in range.
double y = UniformRandom.uniform (minY, maxY);
// Check if y < f(x)
if (y < x*x)
count ++;
}
// Apply formula.
double area = boxArea * ((double) count / (double) numTrials);
// Output.
System.out.println ("Area estimate: " + area);
In-Class Exercise 13.6:
Find the area under the curve of the function
f(x) = e- x2/2
in the interval [-3, 3].
Generating Data Randomly
About random data generation:
- Random generation is a subtantial topic in itself (several
books on the subject).
- There are efficient ways of using a standard uniform
generator to generate from most distributions of interest.
- There are many ways to generate uniform random numbers
=> the Lehmer modulo-generator is one of the most popular.
(That is what we use in UniformRandom.java).
- We will only consider a few simple cases.
Example: biased coin
In-Class Exercise 13.7:
Suppose we want to generate outcomes from rolling a biased die where
the probabilities of getting {1, 2, 3, 4, 5, 6} are
respectively {0.1, 0.2, 0.3, 0.15, 0.13, 0.12}.
Use the following template
and write code to generate from this distrbution.
Random permutations:
Randomized Algorithms
Can an algorithm actually exploit randomness?
- Recall Quicksort:
- Pick the leftmost element as the median element.
- Find the right place for the median.
- Re-arrange elements to the left (smaller) and right
(larger) of the median.
- Recurse on left and right parts.
- What could go wrong: leftmost element could be a poor choice.
- Randomized quicksort: pick a random element as the median.
- Does this really make a difference?
A different interpretation of worst-case performance:
- For standard quicksort, we ask a devil's advocate question:
=> Can we pick a worst-case input to make the algorithm perform badly?
- Answer: yes
=> Elements already sorted.
- An important point: for standard quicksort,
we always have such a worst-case input at our disposal.
- For randomized quicksort:
=> Can we pick a worst-case input to make randomized-quicksort perform badly?
- Answer: only with extremely low probability.
- Only if we knew the random choices, could we pick such a
worst-case input at will.
=> But we don't know the random choices ahead of time.
Let's look at a more revealing example of how randomness can help:
- Consider a group of devices that need to communicate
wirelessly:
- Each device has a transmitter/receiver pair.
- Each device has "stuff" (data) to send to others.

- Without coordination, there can be collisions
=> Two or more devices transmitting simultaneously.
- One possible solution:

- Divide time into fixed slots.
- Each device is assigned its own slot.
=> Equivalently, the devices take turns.
In-Class Exercise 13.8:
Identify some inefficiencies with the above round-robin approach.
A randomized algorithm (protocol):

- Divide time into fixed-size slots (as before).
- Any device can transmit in any slot.
- Allow collisions to occur, but re-transmit whenever
there's a collision.
- A device randomly decides not to transmit with
probability (1-p).
=> Equivalently, a device transmits with probability p.
- Why might this be useful?
=> If p is small, the chances of a collision are small.
In-Class Exercise 13.9:
Suppose there are n devices and
Device 3 wants to transmit in a given slot. What is
the probability that the transmission is successful?
- How would you write a simulation to determine this number?
Write pseudocode.
- Use this template to implement your idea.
- Can you write down a mathematical expression for the result?
In-Class Exercise 13.10:
Suppose there are n=5 devices.
What is the optimal value of p?
- Start by suggesting what "optimal" might mean.
- Write a small program to compute the optimal value.
- Can you derive a formula for the optimal value?
Evaluating Algorithms
What to evaluate:
- Performance: running time of an algorithm.
- Quality: quality of solutions produced (if it's an
optimization algorithm).
General approach:
- Generate many random data sets.
- Run algorithm on each data set.
- Evaluate (performance or quality) on each data set.
- Output average performance or quality.
Simple example: sorting
- We will evaluate the running time of Java's sorting algorithm
using several randomly created arrays (of integers).
- Comparison with theoretical running time:
- Java uses MergeSort
=> theoretical running time is O(n log(n)).
- We will compute the ratio of the actual running time to
theoretical time.
=> should be approximately constant.
- Sample code:
(source file)
static double performTrials (int numTrials, int arraySize)
{
// Initialize array A.
int[] A = new int [arraySize];
for (int i=0; i < arraySize; i++)
A[i] = i;
// We will measure the total time taken over all trials.
double totalTimeTaken = 0;
// Repeat "numTrials" times:
for (int n=0; n < numTrials; n++) {
// Randomly permute elements.
RandomPermutation.permute (A);
// Start timing.
double startTime = System.currentTimeMillis();
// Sort.
Arrays.sort (A);
// Measure time taken.
double timeTaken = System.currentTimeMillis() - startTime;
// Accumulate, for average.
totalTimeTaken += timeTaken;
}
// Return average time taken.
return totalTimeTaken / numTrials;
}
- To compare with theoretical time:
// Perform comparison with each array size.
// Here, arraySizes[i] = size of i-th array.
for (int i=0; i < arraySizes.length; i++) {
// Get estimate using randomly generated data.
double avgTimeTaken = performTrials (numTrials, arraySizes[i]);
// Get theoretical running time.
double theoretical = theoreticalTime (arraySizes[i]);
// Print ratio.
System.out.println ("Actual=" + avgTimeTaken + " theoretical=" + theoretical
+ " ratio=" + (avgTimeTaken/theoretical));
}
More interesting example: the Marriage Problem
- The Marriage Problem:
- There are n men and n women.
- Each man ranks the women in order of preference.
- Each woman ranks the men in order of preference.
- A (group) marriage is a 1-1 pairing of men and women.
- A marriage is unstable if there exists two couples
(Ma, Wb) and
(Mc, Wd) such that
- Ma prefers Wd over Wb
- Wd prefers Ma over Mc
If this were the case, Ma and
Wd would elope.
=> unstable solution.
- Goal: write an algorithm to produce a stable marriage.
- The Proposal algorithm: "Man proposes, woman disposes"
Algorithm: proposalAlgorithm (menPrefs, womenPrefs)
Input: preferences of the men and women
// Initially, nobody is married.
1. UnmarriedList = place all men in list sorted by ID;
// As long as there are unmarried guys (and therefore, unmarried gals)
// we'll find pairings.
2. while UnmarriedList not empty
3. Mi = lowest-numbered guy in UnmarriedList
// Mi tries his luck.
4. for j=0 to n-1
5. Wk = woman who is Mi's j-th preference.
6. // Mi proposes to Wk.
7. if Wk is not married
// She accepts.
8. Mi and Wk are married
9. exit for-loop
10. else if Wk prefers Mi to current husband
// She elopes.
11. Wk's current husband is dumped back on the UnmarriedList;
12. Mi and Wk are married
13. exit for-loop
14. endif
15. endfor
16. endwhile
Output: stable marriage
- Analysis:
- At every inner most step (the if-statement), a pairing occurs
only if Wk has not rejected Mi before.
- But every woman rejects each man only once.
=> maximum O(n2) rejections.
- Or, O(n2) executions in the if-statement
=> O(n2) time overall.
- We have assumed operations in the if-statement take
O(1) time
=> can be done if ranks are computed ahead of time (one
time sorting cost of O(n log(n)).
- What about random data?
- O(n2) is worst-case.
- Best-case: O(n) (first try).
- What is the average running time?
=> use randomly generated data.
- What is the average "quality"?
=> find the average ranking of each mate.
- Estimating the time taken:
- Create several random data sets: random preferences.
- Run the algorithm each time.
- Accumulate total running time, and find the average.
- Sample code:
(source file)
double totalTimeTaken = 0;
// Repeat "numTrials" times:
for (int n=0; n< numTrials; n++) {
// Create set of random preferences for men and women.
for (int i=0; i < numCouples; i++) {
// Create random prefs.
RandomPermutation.permute (menPrefs[i]);
RandomPermutation.permute (womenPrefs[i]);
}
// Start timing.
double startTime = System.currentTimeMillis();
// Perform matching.
marry (menPrefs, womenPrefs);
// Measure time taken.
double timeTaken = System.currentTimeMillis() - startTime;
// Accumulate, for average.
totalTimeTaken += timeTaken;
}
// Return average time taken.
return totalTimeTaken / numTrials;
- It is possible to show that the average running time is no
worse than O(n log(n)).
In-Class Exercise 13.11:
Download this template and
estimate the "quality" of solution for both men and women.
To do this, use the following:
- Once a solution (marriage) is produced, examine the spouse for
each person, and find where in the ranking of that person the spouse occured.
(Thus, the rank is equivalent to "how desirable in my list is my
assignent spouse").
- Call this the "spouse-rank" for a married person.
- Then, compute the average "spouse-rank" over all the men.
- Compute the average "spouse-rank" over all the women.
- For each particular marriage, you will compute: (1) the average
"spouse-rank" for the men, and (2) the average "spouse-rank" for
the women. Now, over multiple data sets (and therefore, multiple
marriages), compute the averages of each of these averages.
- You will also need the classes UniformRandom
and RandomPermutation. Both can be downloaded
together as a jar file
or as a gzipped tar file.