To estimate the last component of the formula we need to know a little about distributions of random variables. A variable is random if it takes one of a specified set of values with a specified probability; that is, you cannot know exactly what its next value will be, but you may know the probability of all possible values.
Requests for service from an I/O system can be modeled by a random variable because the operating system is normally switching between several processes that generate independent I/O requests. We also model I/O service times by a random variable given the probabilistic nature of disks in terms of seek and rotational delays.
One way to characterize the distribution of values of a random variable with discrete values is a histogram, which divides the range between the minimum and maximum values into subranges called buckets. Histograms then plot the number in each bucket as columns.
Server utilization in the example on page D-26:

$$\text{Server utilization} = \text{Arrival rate} \times \text{Time}_{\text{server}} = \frac{50}{\text{sec}} \times 0.01\ \text{sec} = 0.50$$

Histograms work well for distributions that are discrete values, for example, the number of I/O requests. For distributions that are not discrete values, such as time waiting for an I/O request, we have two choices. Either we need a curve to plot the values over the full range, so that we can estimate the value accurately, or we need a very fine time unit so that we get a very large number of buckets to estimate time accurately. For example, a histogram can be built of disk service times measured in intervals of 10 μs although disk service times are truly continuous.
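As a rough illustration of the second choice, the sketch below groups continuous service times into 10 μs buckets. The function name `bucket_histogram` and the sample values are assumptions made up for this example, not measurements.

```python
from collections import Counter


def bucket_histogram(times_us, bucket_width_us=10.0):
    """Count samples per fixed-width bucket; keys are bucket start times (us)."""
    counts = Counter()
    for t in times_us:
        index = int(t // bucket_width_us)       # which bucket the sample falls in
        counts[index * bucket_width_us] += 1    # key by the bucket's lower edge
    return dict(sorted(counts.items()))


# Hypothetical disk service times in microseconds (made-up data).
samples_us = [4012.3, 4018.9, 4021.0, 4035.5, 4039.9, 4040.0]
histogram = bucket_histogram(samples_us)

# Print one text "column" per bucket.
for start, count in histogram.items():
    print(f"[{start:.0f}, {start + 10:.0f}) us: {'#' * count}")
```

A finer bucket width gives more buckets and a closer approximation to the underlying continuous distribution, at the cost of needing more samples per bucket.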
Hence, to be able to solve the last part of the previous equation we need to characterize the distribution of this random variable. The mean time and some measure of the variance are sufficient for that characterization.
For the first term, we use the weighted arithmetic mean time. Let’s first assume that after measuring the number of occurrences, say, $n_i$, of tasks, you could compute the frequency of occurrence of task $i$:

$$f_i = \frac{n_i}{\sum_{j=1}^{n} n_j}$$

Then the weighted arithmetic mean is

$$\text{Weighted arithmetic mean time} = f_1 \times T_1 + f_2 \times T_2 + \cdots + f_n \times T_n$$

where $T_i$ is the time for task $i$ and $f_i$ is the frequency of occurrence of task $i$.
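The weighted arithmetic mean can be sketched in a few lines. The helper names and the two-task workload below are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def frequencies(counts):
    """Turn occurrence counts n_i into frequencies f_i = n_i / sum(n)."""
    total = sum(counts)
    return [n / total for n in counts]


def weighted_mean_time(counts, times):
    """Weighted arithmetic mean: f_1*T_1 + f_2*T_2 + ... + f_n*T_n."""
    return sum(f * t for f, t in zip(frequencies(counts), times))


# Hypothetical workload: task 1 observed 3 times (10 ms), task 2 once (30 ms).
counts = [3, 1]
times_ms = [10.0, 30.0]
mean_ms = weighted_mean_time(counts, times_ms)
print(mean_ms)  # 0.75 * 10 + 0.25 * 30 = 15.0
```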
To characterize variability about the mean, many people use the standard deviation. Let’s use the variance instead, which is simply the square of the standard deviation, as it will help us with characterizing the probability distribution.
Given the weighted arithmetic mean, the variance can be calculated as

$$\text{Variance} = (f_1 \times T_1^2 + f_2 \times T_2^2 + \cdots + f_n \times T_n^2) - \text{Weighted arithmetic mean time}^2$$

It is important to remember the units when computing variance. Let’s assume the distribution is of time. If time is about 100 milliseconds, then squaring it yields 10,000 square milliseconds. This unit is certainly unusual. It would be more convenient if we had a unitless measure.
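To make the unit issue concrete, here is a minimal sketch that computes the variance as the frequency-weighted mean of squared times minus the squared weighted mean. The frequencies and times are assumed values centered on 100 ms.

```python
def weighted_mean(freqs, times):
    """Weighted arithmetic mean of the task times."""
    return sum(f * t for f, t in zip(freqs, times))


def variance(freqs, times):
    """Weighted mean of T_i squared, minus the squared weighted mean."""
    mean = weighted_mean(freqs, times)
    return sum(f * t * t for f, t in zip(freqs, times)) - mean * mean


# Hypothetical distribution: half the tasks take 90 ms, half take 110 ms.
freqs = [0.5, 0.5]
times_ms = [90.0, 110.0]

var_ms2 = variance(freqs, times_ms)  # units: square milliseconds
std_ms = var_ms2 ** 0.5              # back to milliseconds
print(var_ms2, std_ms)               # 100.0 10.0
```

The variance comes out in square milliseconds, while the standard deviation returns to milliseconds.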
To avoid this unit problem, we use the squared coefficient of variance, traditionally called $C^2$:

$$C^2 = \frac{\text{Variance}}{\text{Weighted arithmetic mean time}^2}$$

We can solve for $C$, the coefficient of variance, as

$$C = \frac{\sqrt{\text{Variance}}}{\text{Weighted arithmetic mean time}} = \frac{\text{Standard deviation}}{\text{Weighted arithmetic mean time}}$$

We are trying to characterize random events, but to be able to predict performance we need a distribution of random events where the mathematics is tractable. The most popular such distribution is the exponential distribution, which has a $C$ value of 1.
Note that we are using a constant to characterize variability about the mean.
The invariance of C over time reflects the property that the history of events has no impact on the probability of an event occurring now. This forgetful property is called memoryless, and this property is an important assumption used to predict behavior using these models. (Suppose this memoryless property did not exist; then we would have to worry about the exact arrival times of requests relative to each other, which would make the mathematics considerably less tractable!)
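Two of the claims above can be checked numerically with a small sketch. It assumes the standard survival function of an exponential distribution, P(T > t) = e^(-t/mean); the function names and the 20 ms mean are illustrative choices, not values from the text.

```python
import math


def squared_coefficient_of_variance(freqs, times):
    """C^2 = variance / mean^2 for a discrete distribution of times."""
    mean = sum(f * t for f, t in zip(freqs, times))
    var = sum(f * t * t for f, t in zip(freqs, times)) - mean * mean
    return var / (mean * mean)


def exponential_survival(t, mean):
    """P(T > t) for an exponential distribution with the given mean."""
    return math.exp(-t / mean)


# A constant (non-random) time has no variability, so C^2 is 0.
c2_constant = squared_coefficient_of_variance([1.0], [20.0])

# Memoryless check: having already waited s does not change the odds
# of waiting a further t, i.e. P(T > s + t | T > s) == P(T > t).
mean_ms, s, t = 20.0, 15.0, 5.0
conditional = exponential_survival(s + t, mean_ms) / exponential_survival(s, mean_ms)
unconditional = exponential_survival(t, mean_ms)
print(c2_constant, math.isclose(conditional, unconditional))
```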
A widely used distribution closely tied to the exponential distribution is the Poisson distribution, named after the mathematician Siméon Poisson. It is used to characterize random events in a given time interval and has several desirable mathematical properties. The Poisson distribution is described by the following equation (called the probability mass function):

$$\text{Probability}(k) = \frac{e^{-a} \times a^k}{k!}$$

where $a$ = Rate of events × Elapsed time. If interarrival times are exponentially distributed and we use the arrival rate from above for rate of events, the number of arrivals in a time interval $t$ is a Poisson process, which has the Poisson distribution with $a$ = Arrival rate × $t$. As mentioned on page D-26, the equation for $\text{Time}_{\text{server}}$ has another restriction on task arrival: It holds only for Poisson processes.
Finally, we can answer the question about the length of time a new task must wait for the server to complete a task, called the average residual service time, which again assumes Poisson arrivals:

$$\text{Average residual service time} = \tfrac{1}{2} \times \text{Arithmetic mean} \times (1 + C^2)$$

Although we won’t derive this formula, we can appeal to intuition. When the distribution is not random and all possible values are equal to the average, the standard deviation is 0 and so $C$ is 0. The average residual service time is then just half the average service time, as we would expect. If the distribution is random and it is exponential, then $C$ is 1 and the average residual service time equals the weighted arithmetic mean time.
Example Using the definitions and formulas above, derive the average time waiting in the queue ($\text{Time}_{\text{queue}}$) in terms of the average service time ($\text{Time}_{\text{server}}$) and server utilization.
Answer All tasks in the queue ($\text{Length}_{\text{queue}}$) ahead of the new task must be completed before the task can be serviced; each takes on average $\text{Time}_{\text{server}}$. If a task is at the server, it takes average residual service time to complete. The chance the server is busy is server utilization; hence, the expected time for service is Server utilization × Average residual service time. This leads to our initial formula:

$$\text{Time}_{\text{queue}} = \text{Length}_{\text{queue}} \times \text{Time}_{\text{server}} + \text{Server utilization} \times \text{Average residual service time}$$

Replacing the average residual service time by its definition and $\text{Length}_{\text{queue}}$ by Arrival rate × $\text{Time}_{\text{queue}}$ yields

$$\text{Time}_{\text{queue}} = \text{Server utilization} \times \left[\tfrac{1}{2} \times \text{Time}_{\text{server}} \times (1 + C^2)\right] + (\text{Arrival rate} \times \text{Time}_{\text{queue}}) \times \text{Time}_{\text{server}}$$

Since this section is concerned with exponential distributions, $C^2$ is 1. Thus

$$\text{Time}_{\text{queue}} = \text{Server utilization} \times \text{Time}_{\text{server}} + (\text{Arrival rate} \times \text{Time}_{\text{queue}}) \times \text{Time}_{\text{server}}$$

Rearranging the last term, let us replace Arrival rate × $\text{Time}_{\text{server}}$ by Server utilization:

$$\begin{aligned}
\text{Time}_{\text{queue}} &= \text{Server utilization} \times \text{Time}_{\text{server}} + (\text{Arrival rate} \times \text{Time}_{\text{server}}) \times \text{Time}_{\text{queue}} \\
&= \text{Server utilization} \times \text{Time}_{\text{server}} + \text{Server utilization} \times \text{Time}_{\text{queue}}
\end{aligned}$$

Rearranging terms and simplifying gives us the desired equation:

$$\begin{aligned}
\text{Time}_{\text{queue}} &= \text{Server utilization} \times \text{Time}_{\text{server}} + \text{Server utilization} \times \text{Time}_{\text{queue}} \\
\text{Time}_{\text{queue}} - \text{Server utilization} \times \text{Time}_{\text{queue}} &= \text{Server utilization} \times \text{Time}_{\text{server}} \\
\text{Time}_{\text{queue}} \times (1 - \text{Server utilization}) &= \text{Server utilization} \times \text{Time}_{\text{server}} \\
\text{Time}_{\text{queue}} &= \text{Time}_{\text{server}} \times \frac{\text{Server utilization}}{1 - \text{Server utilization}}
\end{aligned}$$

Little’s law can be applied to the components of the black box as well, since they must also be in equilibrium:

$$\text{Length}_{\text{queue}} = \text{Arrival rate} \times \text{Time}_{\text{queue}}$$

If we substitute for $\text{Time}_{\text{queue}}$ from above, we get:

$$\text{Length}_{\text{queue}} = \text{Arrival rate} \times \text{Time}_{\text{server}} \times \frac{\text{Server utilization}}{1 - \text{Server utilization}}$$

Since Arrival rate × $\text{Time}_{\text{server}}$ = Server utilization, we can simplify further:

$$\text{Length}_{\text{queue}} = \text{Server utilization} \times \frac{\text{Server utilization}}{1 - \text{Server utilization}} = \frac{\text{Server utilization}^2}{1 - \text{Server utilization}}$$

This relates the number of items in the queue to server utilization.
Example For the system in the example on page D-26, which has a server utilization of 0.5, what is the mean number of I/O requests in the queue?
Answer Using the equation above,

$$\text{Length}_{\text{queue}} = \frac{\text{Server utilization}^2}{1 - \text{Server utilization}} = \frac{0.5^2}{1 - 0.5} = \frac{0.25}{0.50} = 0.5$$

Therefore, there are 0.5 requests on average in the queue.
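The formulas in this part of the section can be collected into a short sketch. The function names are hypothetical; the inputs (50 requests per second, 0.01 sec service time) are taken from the utilization example referenced above, and the M/M/1 functions assume exponential service times ($C^2 = 1$).

```python
import math


def poisson_probability(k, rate, elapsed):
    """Probability of exactly k events, with a = rate * elapsed."""
    a = rate * elapsed
    return math.exp(-a) * a**k / math.factorial(k)


def average_residual_service_time(mean_service, c_squared):
    """1/2 * mean * (1 + C^2)."""
    return 0.5 * mean_service * (1.0 + c_squared)


def time_queue_mm1(time_server, utilization):
    """Time_queue = Time_server * U / (1 - U); only valid for 0 <= U < 1."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1) for equilibrium")
    return time_server * utilization / (1.0 - utilization)


def length_queue_mm1(utilization):
    """Length_queue = U^2 / (1 - U)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1) for equilibrium")
    return utilization**2 / (1.0 - utilization)


arrival_rate = 50.0   # requests per second
time_server = 0.01    # seconds
utilization = arrival_rate * time_server

print(length_queue_mm1(utilization))                       # 0.5 requests
print(arrival_rate * time_queue_mm1(time_server, utilization))  # Little's law, same value
print(poisson_probability(0, arrival_rate, time_server))   # chance of no arrival in 10 ms
```

Computing the queue length both directly and through Little's law is a cheap consistency check on the derivation.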
As mentioned earlier, these equations and this section are based on an area of applied mathematics called queuing theory, which offers equations to predict behavior of such random variables. Real systems are too complex for queuing theory to provide exact analysis; hence, queuing theory works best when only approximate answers are needed.
Queuing theory makes a sharp distinction between past events, which can be characterized by measurements using simple arithmetic, and future events, which are predictions requiring more sophisticated mathematics. In computer systems, we commonly predict the future from the past; one example is least recently used block replacement (see Chapter 2). Hence, the distinction between measurements and predicted distributions is often blurred; we use measurements to verify the type of distribution and then rely on the distribution thereafter.
Let’s review the assumptions about the queuing model:
■ The system is in equilibrium.
■ The times between two successive requests arriving, called the interarrival times, are exponentially distributed, which characterizes the arrival rate mentioned earlier.
■ The number of sources of requests is unlimited. (This is called an infinite population model in queuing theory; finite population models are used when arrival rates vary with the number of jobs already in the system.)
■ The server can start on the next job immediately after finishing the prior one.
■ There is no limit to the length of the queue, and it follows the first in, first out order discipline, so all tasks in line must be completed.
■ There is one server.
Such a queue is called M/M/1:
M = exponentially random request arrival ($C^2 = 1$), with M standing for A. A. Markov, the mathematician who defined and analyzed the memoryless processes mentioned earlier
M = exponentially random service time ($C^2 = 1$), with M again for Markov
1 = single server
The M/M/1 model is a simple and widely used model.
The assumption of exponential distribution is commonly used in queuing examples for three reasons: one good, one fair, and one bad. The good reason is that a superposition of many arbitrary distributions acts as an exponential distribution. Many times in computer systems, a particular behavior is the result of many components interacting, so an exponential distribution of interarrival times is the right model. The fair reason is that when variability is unclear, an exponential distribution with intermediate variability (C = 1) is a safer guess than low variability (C ≈ 0) or high variability (large C). The bad reason is that the math is simpler if you assume exponential distributions.
Let’s put queuing theory to work in a few examples.
Example Suppose a processor sends 40 disk I/Os per second, these requests are exponentially distributed, and the average service time of an older disk is 20 ms. Answer the following questions:
1. On average, how utilized is the disk?
2. What is the average time spent in the queue?
3. What is the average response time for a disk request, including the queuing time and disk service time?
Answer Let’s restate these facts:
Average number of arriving tasks/second is 40.
Average disk time to service a task is 20 ms (0.02 sec).
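The server utilization is then

$$\text{Server utilization} = \text{Arrival rate} \times \text{Time}_{\text{server}} = 40 \times 0.02 = 0.8$$

Since the service times are exponentially distributed, we can use the simplified formula for the average time spent waiting in line:

$$\text{Time}_{\text{queue}} = \text{Time}_{\text{server}} \times \frac{\text{Server utilization}}{1 - \text{Server utilization}} = 20\ \text{ms} \times \frac{0.8}{1 - 0.8} = 20 \times 4 = 80\ \text{ms}$$

The average response time is

$$\text{Time}_{\text{queue}} + \text{Time}_{\text{server}} = 80 + 20\ \text{ms} = 100\ \text{ms}$$

Thus, on average we spend 80% of our time waiting in the queue!
Example Suppose we get a new, faster disk. Recalculate the answers to the questions above, assuming the disk service time is 10 ms.
Answer The disk utilization is then

$$\text{Server utilization} = \text{Arrival rate} \times \text{Time}_{\text{server}} = 40 \times 0.01 = 0.4$$

The formula for the average time spent waiting in line:

$$\text{Time}_{\text{queue}} = \text{Time}_{\text{server}} \times \frac{\text{Server utilization}}{1 - \text{Server utilization}} = 10\ \text{ms} \times \frac{0.4}{1 - 0.4} = 10 \times \frac{2}{3} \approx 6.7\ \text{ms}$$

The average response time is 10 + 6.7 ms or 16.7 ms, 6.0 times faster than the old response time even though the new service time is only 2.0 times faster.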
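The two disk examples above can be reproduced with a small sketch. The function name `mm1_response` is hypothetical, and the code assumes the M/M/1 conditions listed earlier.

```python
def mm1_response(arrival_rate_per_s, service_ms):
    """Return (utilization, queue time in ms, response time in ms) for M/M/1."""
    utilization = arrival_rate_per_s * service_ms / 1000.0
    if utilization >= 1.0:
        raise ValueError("server is saturated; no equilibrium exists")
    queue_ms = service_ms * utilization / (1.0 - utilization)
    return utilization, queue_ms, queue_ms + service_ms


old_util, old_queue, old_response = mm1_response(40, 20.0)  # older 20 ms disk
new_util, new_queue, new_response = mm1_response(40, 10.0)  # faster 10 ms disk

print(f"old disk: U={old_util:.1f}, queue={old_queue:.1f} ms, response={old_response:.1f} ms")
print(f"new disk: U={new_util:.1f}, queue={new_queue:.1f} ms, response={new_response:.1f} ms")
print(f"response-time speedup: {old_response / new_response:.1f}x")
```

Halving the service time also halves the utilization, which shrinks the U / (1 - U) factor from 4 to 2/3; that compounding is why the response time improves by far more than a factor of 2.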
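Thus far, we have been assuming a single server, such as a single disk. Many real systems have multiple disks and hence could use multiple servers, as in Figure D.17. Such a system is called an M/M/m model in queuing theory.

Figure D.17 The M/M/m multiple-server model.

Let’s give the same formulas for the M/M/m queue, using $N_{\text{servers}}$ to represent the number of servers. The first two formulas are easy:

$$\text{Utilization} = \frac{\text{Arrival rate} \times \text{Time}_{\text{server}}}{N_{\text{servers}}}$$

$$\text{Length}_{\text{queue}} = \text{Arrival rate} \times \text{Time}_{\text{queue}}$$

The time waiting in the queue is

$$\text{Time}_{\text{queue}} = \text{Time}_{\text{server}} \times \frac{\text{Prob}_{\text{tasks} \ge N_{\text{servers}}}}{N_{\text{servers}} \times (1 - \text{Utilization})}$$

This formula is related to the one for M/M/1, except we replace utilization of a single server with the probability that a task will be queued as opposed to being immediately serviced, and divide the time in queue by the number of servers.
Alas, calculating the probability of jobs being in the queue is much more complicated when there are $N_{\text{servers}}$. First, the probability that there are no tasks in the system is

$$\text{Prob}_{0\ \text{tasks}} = \left[1 + \frac{(N_{\text{servers}} \times \text{Utilization})^{N_{\text{servers}}}}{N_{\text{servers}}! \times (1 - \text{Utilization})} + \sum_{n=1}^{N_{\text{servers}}-1} \frac{(N_{\text{servers}} \times \text{Utilization})^{n}}{n!}\right]^{-1}$$

Then the probability there are as many or more tasks than we have servers is

$$\text{Prob}_{\text{tasks} \ge N_{\text{servers}}} = \frac{(N_{\text{servers}} \times \text{Utilization})^{N_{\text{servers}}}}{N_{\text{servers}}! \times (1 - \text{Utilization})} \times \text{Prob}_{0\ \text{tasks}}$$

Note that if $N_{\text{servers}}$ is 1, $\text{Prob}_{\text{tasks} \ge N_{\text{servers}}}$ simplifies back to Utilization, and we get the same formula as for M/M/1. Let’s try an example.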
Example Suppose instead of a new, faster disk, we add a second slow disk and duplicate the data so that reads can be serviced by either disk. Let’s assume that the requests are all reads. Recalculate the answers to the earlier questions, this time using an M/M/m queue.
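Answer The average utilization of the two disks is then

$$\text{Server utilization} = \frac{\text{Arrival rate} \times \text{Time}_{\text{server}}}{N_{\text{servers}}} = \frac{40 \times 0.02}{2} = 0.4$$

We first calculate the probability of no tasks in the system:

$$\text{Prob}_{0\ \text{tasks}} = \left[1 + \frac{(2 \times 0.4)^2}{2! \times (1 - 0.4)} + (2 \times 0.4)\right]^{-1} = \left[1 + \frac{0.640}{1.2} + 0.800\right]^{-1} = [1 + 0.533 + 0.800]^{-1} = 2.333^{-1} \approx 0.429$$

We use this result to calculate the probability of tasks in the queue:

$$\text{Prob}_{\text{tasks} \ge N_{\text{servers}}} = \frac{(2 \times 0.4)^2}{2! \times (1 - 0.4)} \times \text{Prob}_{0\ \text{tasks}} = \frac{0.640}{1.2} \times 0.429 = 0.533 \times 0.429 \approx 0.229$$

Finally, the time waiting in the queue:

$$\text{Time}_{\text{queue}} = \text{Time}_{\text{server}} \times \frac{\text{Prob}_{\text{tasks} \ge N_{\text{servers}}}}{N_{\text{servers}} \times (1 - \text{Utilization})} = 20\ \text{ms} \times \frac{0.229}{2 \times (1 - 0.4)} = 20 \times \frac{0.229}{1.2} = 20 \times 0.190 \approx 3.8\ \text{ms}$$

The average response time is 20 + 3.8 ms or 23.8 ms. For this workload, two disks cut the queue waiting time by a factor of 21 over a single slow disk and a factor of 1.75 versus a single fast disk. The mean response time of a system with a single fast disk, however, is still 1.4 times faster than one with two disks since the disk service time is 2.0 times faster.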
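The M/M/m formulas and the two-disk example can be sketched as follows. The function names are hypothetical, and the probability formulas are written with the product $N_{\text{servers}} \times \text{Utilization}$ raised to a power as a whole, which is the reading that reproduces the 3.8 ms result.

```python
import math


def prob_zero_tasks(n_servers, utilization):
    """Probability that no tasks are in the system (M/M/m)."""
    load = n_servers * utilization
    tail = load**n_servers / (math.factorial(n_servers) * (1.0 - utilization))
    partial = sum(load**n / math.factorial(n) for n in range(1, n_servers))
    return 1.0 / (1.0 + tail + partial)


def prob_tasks_queued(n_servers, utilization):
    """Probability that there are at least as many tasks as servers."""
    load = n_servers * utilization
    tail = load**n_servers / (math.factorial(n_servers) * (1.0 - utilization))
    return tail * prob_zero_tasks(n_servers, utilization)


def time_queue_mmm(time_server, n_servers, utilization):
    """Average time waiting in the queue, in the same unit as time_server."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1) for equilibrium")
    queued = prob_tasks_queued(n_servers, utilization)
    return time_server * queued / (n_servers * (1.0 - utilization))


# Two slow disks: 40 I/Os per second, 20 ms (0.02 sec) service time each.
utilization = 40 * 0.02 / 2
queue_ms = time_queue_mmm(20.0, 2, utilization)
print(round(prob_zero_tasks(2, utilization), 3))    # 0.429
print(round(prob_tasks_queued(2, utilization), 3))  # 0.229
print(round(queue_ms, 1), round(queue_ms + 20.0, 1))  # 3.8 23.8
```

With a single server the same functions reduce to the M/M/1 result: `time_queue_mmm(20.0, 1, 0.8)` evaluates to about 80 ms, matching the single slow disk.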
It would be wonderful if we could generalize the M/M/m model to multiple queues and multiple servers, as this step is much more realistic. Alas, these models are very hard to solve and to use, and so we won’t cover them here.