Module 7: Next-Event Simulation


What is a next event simulation?

First, recall the elements of continuous system simulation:

As we'll see, next-event simulation differs from continuous simulation in several ways:

Exercise 1: Modify QueueControl.java to print out the clock value (the variable clock in the method nextStep()) to see how the clock advances.

The single server queue

To learn how to write a next-event simulation, let's simplify the queue example to a single server queue:

Exercise 2: Download and execute Queue.java, which is a simulation of a single server queue. Cursorily examine the code.
  • What data structures are being used?
  • Where in the code are interarrival times being generated? From what distribution?
  • Where are service times being generated? From what distribution?

Let's now understand the details using an example:

There are two key ideas in next-event simulation. Let's explore these using our queue example:

Where are events stored?

Exercise 5: Examine method simulate() in Queue.java and verify that it has this structure. Then examine init() to see if the initialization makes sense. Why is there a call to scheduleArrival() in init()?

Event handling details:

The other important data structure:

We're now in a position to examine a complete simulation:

What do we want to get out of the simulation?


The single server queue - some analysis

The idea of a rate:

  • Let's go back to arrivals for a moment:

    Here, by time t=8.97, 4 arrivals have occurred.
              => Number of arrivals per unit time = 4 / 8.97.

  • In general, let N(t) be the number of arrivals occuring by time t.
              => Then, arrival rate = N(t) / t.

  • Let Ai = the i-th interarrival time.
              Σi≤n Ai is the time taken for the first n arrivals.

  • We know that as n grows large,
              (1/n) Σi≤n Ai   →   E[A1].

  • Thus,
              n / Σi≤n Ai   →   1 / E[A1].

  • But this is the same as saying
              arrival rate = N(t) / t   →   1 / E[A1].

  • Let λ denote the arrival rate.
    Exercise 8: Modify Queue.java to estimate the arrival rate λ and see if it matches the stated arrival rate in the program. Relate this to your earlier estimate of the average interarrival time.

  • There's another idea related to the arrival rate:
    • Consider any interval of length t, say [a, a+t].
    • How many arrivals occur in this interval on average?
                => rate * length of interval
                => λ (a+t - a) = λ t.

  • Similarly, the service rate is the rate at which service is provided.
              => The inverse of the average service time.

  • Let μ denote the service rate.
    Exercise 9: Modify Queue.java to estimate the service rate μ and see if it matches the stated service rate in the program.
Exercise 10: Fix the service rate at μ=1 and vary the arrival rate: try λ = 0.5, 0.75, 1.25. What do you observe when λ = 1.25?

Relationshiop between λ and μ:

  • First consider the case λ ≥ μ:
              => This is the same as saying E[A] < E[S].
              => There are more arrivals than the server can handle (on average).
              => The number in the waiting area gradually increases
              => Grows without bound (blows up).

  • Suppose we fix μ=1 and vary λ between 0 and 1.

  • Next, recall that
              Di = customer i's time spent in the system.

  • Let d denote the mean system time.
              Then, (1/n)Σi Di   → d.

  • Clearly, d depends on both μ=1 and λ.
              We'll use the notation d(λ,μ) to show that d is a function of both.

  • We've established one fact already
              As λ → μ, we get d → infty.

  • What about the other end, when λ → 0?
    Exercise 11: What about it?

  • Next, let's systematically consider different values of λ.
    Exercise 12: Vary λ as follows: 0.1, 0.2, ..., 0.9 and estimate d. What is the shape of the function d(λ,μ)?

Next, let's focus on the number of customers in the system:

  • An arriving customer either gets service immediately or has to wait.
              => An arriving customer either sees zero customers in the system or some non-zero number of customers.

  • Let Mi be the number of customers seen by customer i at the moment of arrival.

  • We want to estimate
              (1/n)Σi Mi.

  • Let m denote the mean number in the system (as seen by an arriving customer), i.e.,
              (1/n)Σi Mi   →   m.
    Exercise 13: For &lambda=0.75 and &mu=1, estimate m. Then compute m/d, where d is the mean system time.

Little's Law:

  • Consider the three numbers d, m and λ.

  • Now consider an "average" customer.

  • How much time does Mr.Average spend in the system?
              => d.

  • During this period d, how many arrivals occur (on average)?
              => λd.

  • When Mr.Average departs, what is the average number of customers remaining in the system?
              => m.

  • But these remaining customers are the ones who came during Mr.Average's sojourn
              => m = λ d (on average).
              => This is called Little's Law.

  • Little's Law can be shown to hold in great generality.
              => Doesn't have to do with queueing systems.

Other measures of interest:

  • We have thus far measured the average (number in the system, system time).

  • Often, we are also interested in particular probabilities
              e.g., the system utilization = Pr[server is busy].

  • In an underutilized system, the server is unused.
              => This might suggest allowing an increased arrival rate.

  • Sometimes, it's useful to know the distribution instead of the mean.
Exercise 14: For &lambda=0.75 and &mu=1, estimate the probability that an arriving customer finds the server free. Try this for λ = 0.5, 0.6 as well. Can you relate this probability to &lambda=0.75 and &mu=1?
Exercise 15: We will focus on two distributions: the distribution of the number in the system, and the system time. Let rv M denote the number of customers seen by an arriving customer, and let rv D denote the system time experienced by a random customer.
  • Is M discrete or continuous? What about D? What is the range of M? Of D?
  • For &lambda=0.75 and &mu=1, obtain the appropriate histogram of M. Which well-known distribution does this look like?
  • For &lambda=0.75 and &mu=1, obtain the appropriate histogram of D. Which well-known distribution does this look like?

An example with multiple queues

Consider this example with three queues:

  • There are three servers: 0, 1, 2.

  • Each server has a waiting area
              => We'll call the combination of server and waiting area: a queue.
              => There are three queues.

  • Customers arrive from the outside and enter queue 0.
              => We'll use exponentially-distributed interarrival times.

  • A customer departing from queue 0 goes to either queue 1 or queue 2 (to receive service from server 1 or 2).
              => We'll use a simple choice-model: a customer chooses between 1 and 2 with equal probability.

  • Upon departing either queue 1 or 2, a customer may either leave the system or may return for another sojourn through the system.
              => Some fraction q of customers return.

  • Our goal: to understand how to write a next-event simulation of this system.

Before writing the simulation, let's look at possible applications:

  • A web server farm:
    • HTTP requests come from browsers (users) to a website.
    • The website has a front-end server that then routes the request to one of two back-end servers.
    • The backend servers perform the computation required and send the results back to the client browser.
    • To improve response times, the requests are divided between the two backend servers.
    • Some requests result in an immediate follow up (this is the feedback part).

  • A dual-core processor:
    • The first queue represents the operating system overhead in scheduling.
    • The other two queues are the two processors.
    • Some jobs return after receiving a slice of CPU time.

  • Machine-shop:
    • A factory makes a product that need two types of machining.
    • There are three machines used: machine 0 does the first type, and machines 1 or 2 do the second type.
    • Some products need finishing touches or re-working, and go back after inspection.

  • Evil bureaucracy:
    • You need two kinds of approval on your application.
    • Desk 0 does the first kind, and desks 1,2 do the second kind.
    • Because they are nit-picky, you might have to go back through the system.

Applications of queueing systems in general:

  • Queueing systems are used in a variety of situations: computer systems, manufacturing, service-centers.

  • These systems share the following characteristics:
    • There are multiple "points of service" or stages of service.
    • Customers (jobs, tasks) are routed amongst these servers.
    • Each server has a waiting area.
    • Service times are often random (with some mean).

OK, back to our example:

  • First, let's identify the parameters:
    • The external arrival rate.
    • The mean service time at each server
                => Alternatively, the service rate for each server.
    • The fraction of customers who return.
    Thus, our code will have something like
    public class ThreeQueue {
    
        // Fraction that return to start.
        double fracReturn = 0.1;
    
        // Avg time between arrivals = 1.0, avg time at server=1/0.75.
        double arrivalRate = 1.0;
        double[] serviceRates = {2.0, 1.0, 1.0};
    
    }
         

  • Next, let's look at the data structures needed:
    • We'll use one queue (list) data structure for each queue.
    • And we'll use a single event list.
    public class ThreeQueue {
    
        // Fraction that return to start.
        double fracReturn = 0.1;
    
        // Avg time between arrivals = 1.0, avg time at server=1/0.75.
        double arrivalRate = 1.0;
        double[] serviceRates = {2.0, 1.0, 1.0};
    
        // Three queues: queue[0], queue[1], queue[2]
        LinkedList<Customer>[] queues;
    
        PriorityQueue<Event> eventList;
    
    }
           

  • The main simulation loop will be exactly the same:
        void simulate (int maxCustomers)
        {
            init ();
    
            while (numDepartures < maxCustomers) {
                Event e = eventList.poll ();
                clock = e.eventTime;
                if (e.type == Event.ARRIVAL) {
                    handleArrival (e);
                }
                else {
                    handleDeparture (e);
                }
            }
    
            stats ();
        }
         
    Note: the structure of the simulation is the same.
              => It's only the event-handling details that differ.

  • To handle an external arrival:
        void handleArrival (Event e)
        {
    	numArrivals ++;
    	queues[0].add (new Customer (clock));      // Place external arrival in queue[0]
    	if (queues[0].size() == 1) {
    	    scheduleDeparture (0);                 // Schedule departure if needed.
    	}
    	scheduleArrival ();
        }
           
    Notice that scheduleDeparture() now takes a parameter - the queue number.

  • Thus, in scheduleDeparture():
        void scheduleDeparture (int i)
        {
    	double nextDepartureTime = clock + randomServiceTime (i);
    	eventList.add (new Event (nextDepartureTime, Event.DEPARTURE, i));
        }
    
        double randomServiceTime (int i)
        {
            // Each server may have a different rate => different exponential parameters.
    	return exponential (serviceRates[i]);
        }
          

  • How is a departure event handled? In pseudocode
          Method: handleDeparture (k)
          Input: k, the queue from which the departure occurs
          1.   if k = 0
          2.       send to queue 1 or queue 2
          3.   else
          4.       if uniform() < q
          5.           send back to queue 0
          6.       else
          7.           remove from system
          8.       endif
          9.   endif
          
Exercise 16: Download and execute the above simulation: ThreeQueue.java
  • What is the average system time?
  • Examine the code to see where the system time is being accumulated.
  • Change the queue choice (between 1 and 2) to "shortest-queue" instead of "uniform-random" and see the difference this makes.
  • In what kind of application is "uniform-random" better than "shortest-queue"?
  • Measure the rate of departures from the system and explain the value you get.

Synchronous simulations

What is a synchronous simulation?

  • Recall that in a next-event simulation, the clock jumps by random amounts
              => From one event occurence to the next.

  • We could, loosely, call this an asychronous simulation.

  • In contrast, in a synchronous simulation, the clock steps are the integers.
              => Time proceeds in fixed steps.

  • Furthermore, it's usually the case that the actual time in a step doesn't matter
              => The steps are all that's needed.
Exercise 17: Download and execute Boids.java.

About the boids simulation:

  • It is meant to simulate the flocking behavior of birds.

  • Each bird, except the "leader" looks at other birds and tries to fly towards them.

  • The leader picks a random direction in which to fly.

  • To make it interesting, we occasionally change the leader.

  • We will use a synchronous simulation:
         Algorithm: boidsSimulation (n)
         Input: N - the number of steps to simulate
         1.   boids = random initial locations
         2.   for n=1 to N
         3.       nextStep ()               // This is one step in N steps.
         4.   endfor
    
    
         Method: nextStep ()
         1.   newBoidLocations = applyBoidRules ()      // Move according to boid rules.
         2.   for i=1 to numBoids         
         3.       boids[i] = newBoidLocations[i]        // Set new locations in one shot.
         4.   endfor
         

  • In the Boids.java program:
        void simulate ()
        {
            init ();
            for (int n=0; n<numSteps; n++) {
                nextStep ();
                try {                               // Sleep for animation effect.
                    Thread.sleep (100);
                }
                catch (InterruptedException e) {
                    break;
                }
                repaint ();
            }
        }
        
    
        void nextStep ()
        {
            // The new locations will be stored here temporarily.
    	Point[] newLocations = new Point [numBoids];
    
    	// Is it time to change the leader?
    	if (RandTool.uniform() < leaderChangeProb) {
    	    leader = RandTool.uniform (0, numBoids-1);
    	    leaderTheta = RandTool.uniform (0.0, Math.PI);
    	}
    
            // Except for the leader, move the others towards neighbors.
    	for (int i=0; i<numBoids; i++) {
    	    if (i == leader) {
                    // Apply different rules for leader.
                    newLocations[leader] = moveLeader ();
    	    }
                else {
                    newLocations[i] = moveOther (i);
                }
            }
            
            // Synchronous update for all.
            for (int i=0; i<numBoids; i++) {
                boids[i] = newLocations[i];
            }
        }
         

Let's see what an asynchronous version of boids might look like:

  • In the asynchronous version:
    • Each boid moves independently.
    • Each moves and then sleeps for a random amount of time.
                => The sleep time will be exponentially distributed.
  • Our main simulation loop will look like:
        void simulate ()
        {
            init ();
    
            for (int n=0; n<numSteps; n++) {
    
                // Next-event simulation:
                Event event = eventList.poll ();
                nextStep (event.i);
                scheduleNextMove (event.i);
    
                // Sleep for animation effect ... etc
            }
        }
         
    Thus, the next boid to "wake up" is moved, and its next wake-up time is scheduled.

  • In nextStep() we now only focus on moving the one boid that's awake:
        void nextStep (int i)
        {
    	// Is it time to change the leader?
    	if (RandTool.uniform() < leaderChangeProb) {
    	    leader = RandTool.uniform (0, numBoids-1);
    	    leaderTheta = RandTool.uniform (0.0, Math.PI);
    	}
    
            // Except for the leader, move the others towards neighbors.
            if (i == leader) {
                // Apply different rules for leader.
                boids[i] = moveLeader ();
            }
            else {
                boids[i] = moveOther (i);
            }
    
            // Note: No need for copying over new locations.
        }
         
Exercise 18: Download and execute AsynchBoids.java. What do you notice? Does it work?
Exercise 19: Examine the code in the molecular simulation from Module 4. Is this synchronous or asynchronous?
Exercise 20: Find a simulation of the Game-of-Life. Is this synchronous or asynchronous?

Both synchronous and asynchronous simulations share some features:

  • In the examples we've seen, there are fundamental units or "particles" that are processed
    • In queues, the particles are customers.
    • In the boids example, the particles are boids.
    • In molecular simulations, particles are molecules.

  • Each particle is subject to "something random"
    • Customers have random interarrivals, random service times, and random routing.
    • Boids have random movements (especially the leader).
    • Molecules have random movements.

  • The particles are also subjected to non-random rules:
    • Customers wait in order of arrival, they are served one at a time.
    • Boids (other than the leader) move according to others.
    • Molecular movement occurs in short distances. Collisions result in reactions.

Agent simulations:

  • Agent simulations are usually intended to model some higher-level entities such as:
    • People in an economy:
                The interactions are trades or buy-sell transactions.
    • Animals in a competitive environment
                => e.g., predator-prey.

  • Computationally, an agent simulation is no different than a particle simulation.

  • Most agent simulations have some spatial aspect
              => e.g., a 2D "landscape"

  • Predator-prey example:

    • The landscape: M x M cells.
    • Two types of creatures: prey and predator.
    • At time-step t, let
                N1(t) = number of prey
                N2(t) = number of predators.
    • Some cells have food (needed by prey).

  • Example of rules for predator-prey simulation:
    • Movement rules: e.g., each animal moves to a random neighboring cell.
    • If the cell has food, a prey animal consumes food.
                => Else, prey dies off.
    • If a predator is in the same cell as a prey, prey is consumed
                => predator survives.

  • Examples of simulation objectives:
    • Study correlation with food availability, speed of movement.
    • What happens to N1(t) and N2(t) as t gets large?
    • Is there a stable population or is there natural fluctuation as in the ODE model?
    • What is the effect of death rates amongst predators? A sudden catastrophe?
    • Multiple predators, multiple prey.

A simple one-particle simulation

The raindrop problem:

  • Our single particle will be a raindrop.
              => We'll model a raindrop falling down, buffeted by winds.

  • Our model will be simple:
    • A raindrop is a point particle.
    • It can move down, left or right.
                => A down-move occurs with probability p, and a left move occurs with probability (1-p)/2.
    • The distance moved is a fixed constant, s.
    • The raindrop is dropped from a height h.
    • When the raindrop hits the ground, we stop.

  • What do we want to do with a simulation?
    • Measure the distribution of where the rain drop falls.
    • Measure the distribution of the time taken.

  • The two rv's of interest are:
    • Let rv X = the point (along the x-axis) where the raindrop hits the ground.
                => We'll assume the drop point is along the y-axis.
    • Let rv T denote the time taken.

  • The simulation (for a single trial) is quite simple
         Algorithm: raindrop (h, s, p)
         Input: h - starting height, s = stepsize, p = Pr[move down]
         1.   x = 0
         3.   t = 0
         4.   while h > 0
         5.      if uniform() < p
         6.          h = h - s                        // Move down
         7.      else if uniform() < 0.5
         8.          x = x + s                        // Move right
         9.      else
         10          x = x - s                        // Move left
         11      endif
         12      t = t + 1
         13   endwhile
         14.  return x, t
         
Exercise 21: Fill in code in Raindrop.java to obtain histograms of X and T respectively. Use s=1, h=10, p=0.5.
  • What is the likely distribution of X?
  • Vary h: try h = 20, 30, 40, 50. What is the relationship between E[T] and h?
Exercise 22: Do you know the historical significance of the distributions of X and T?

The eventlist data structure

A data structure for an eventlist must support two operations:

  • insert: insert a new element into the collection.

  • extractMin: remove the smallest element in the collection.

  • The implicit assumption: elements can be ordered from "smallest" to "largest".
              => Usually based on some numeric value.

  • In a next-event simulation:
    • The elements are events.
    • The order is based on the time of occurence of an event.

  • In the CS jargon, such a data structure is called a priority queue.
              => The value used in ordering is called an element's priority.

  • In a simulation:
              priority = event's time.
              => Here, extractMin will remove the smallest-priority element.
Exercise 23: What is the size of the eventlist for the single-server queue? For the three-queue example?

When is this data structure important?

  • When the eventlist becomes large.

  • The eventlist becomes large in some applications:
              => e.g., an asynchronous simulation with thousands (or millions) or particles.

  • Suppose we use a sorted linked-list:

    • For extractMin, remove and return the first element.
    • For insert, insert a new event in correct position in list.
Exercise 24: If the eventlist has n events, how long does it take for each operation (in order-notation)?
Exercise 25: If instead of a sorted-list, we use an unsorted list, how long does each operation take?

Better data structures:

  • Using a binary heap, each operation can be done in O(log n) time.

  • For an explanation of how a binary heap works, see Module 2 of the Algorithms course.

  • Alternatives to the binary heap include: Binomial heap, Fibonacci heap and the run-relaxed heap.

Summary

There are several types of simulation:

  • Continuous-system simulation:
    • These are essentially differential equations.
                => Usually a system of equations (ODEs).
    • Fixed, small time steps, Δ t.
    • No randomness.
    • Typical goal: to compute the time-evolution of continuous functions.
    • Example:
                v'(t) = a
                d'(t) = v
      Goal: compute d(t) (distance as a function of time).

  • Next-event simulation:
    • Key entities (particles) are discrete.
    • Particles are subjected to randomness in some form.
    • Events occur at random times.
                => Nothing of significance happens between events.
    • Clock is advanced from event to event.

  • Synchronous simulation:
    • Time proceeds in fixed steps
                => The steps are numbered 1,2,3,..., and size usually has no meaning.
    • The entire next state is computed for each particle (in temporary variables).
                => Switch to next state all at once.

The world of simulation:

  • Sophisticated random generation techniques.
              => Techniques for efficient, statistically-sound generation.

  • Statistical techniques for simulation:
              e.g., rare-event simulation, variance reduction.

  • Simulation of large systems:
              => Parallel or distributed simulation, simulation approximation.

  • Continuous-system simulation:
              => Specialized algorithms for different types of differential equations.

  • Agent-based simulation (like Boids):
              => Large application area (economics, sociology).

  • Simulation optimization
              => How to optimize parameters in a simulation.

  • Simulated (virtual) worlds
              => 3D graphics, modeling worlds, realistic physics simulation.

  • Specialized applications:
              => e.g., queueing simulations, molecular simulations, complex systems.