SPEC System-Level File Server, Mail, and Web Benchmarks
D.5 A Little Queuing Theory
[Figure: black-box view of the system, with Arrivals entering and Departures leaving.]
than simulation, we won’t cover such analyses here. (See the references in Appendix L for more details.)
Hence, in this section we make the simplifying assumption that we are evaluating systems with multiple independent requests for I/O service that are in equilibrium: The input rate must be equal to the output rate. We also assume there is a steady supply of tasks independent of how long they wait for service. In many real systems, such as TPC-C, the task consumption rate is determined by other system characteristics, such as memory capacity.
This leads us to Little’s law, which relates the average number of tasks in the system, the average arrival rate of new tasks, and the average time to perform a task:
Little’s law applies to any system in equilibrium, as long as nothing inside the black box is creating new tasks or destroying them. Note that the arrival rate and the response time must use the same time unit; inconsistency in time units is a common cause of errors.
Let’s try to derive Little’s law. Assume we observe a system for Timeobserve minutes. During that observation, we record how long it took each task to be serviced, and then sum those times. The number of tasks completed during Timeobserve is Numbertasks, and the sum of the times each task spends in the system is Timeaccumulated. Note that the tasks can overlap in time, so Timeaccumulated can exceed Timeobserve. Then,
Algebra lets us split the first formula:
If we substitute the three definitions above into this formula, and swap the resulting two terms on the right-hand side, we get Little’s law:
This simple equation is surprisingly powerful, as we shall see.
If we open the black box, we see Figure D.16. The area where the tasks accumulate, waiting to be serviced, is called the queue, or waiting line. The device performing the requested service is called the server. Until we get to the last two pages of this section, we assume a single server.
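The equations for Little’s law and its derivation, in the order the text refers to them:

Little’s law:

$$\text{Mean number of tasks in system} = \text{Arrival rate} \times \text{Mean response time}$$

The three definitions used in the derivation:

$$\text{Mean number of tasks in system} = \frac{\text{Time}_{\text{accumulated}}}{\text{Time}_{\text{observe}}}$$

$$\text{Mean response time} = \frac{\text{Time}_{\text{accumulated}}}{\text{Number}_{\text{tasks}}}$$

$$\text{Arrival rate} = \frac{\text{Number}_{\text{tasks}}}{\text{Time}_{\text{observe}}}$$

Splitting the first formula:

$$\frac{\text{Time}_{\text{accumulated}}}{\text{Time}_{\text{observe}}} = \frac{\text{Time}_{\text{accumulated}}}{\text{Number}_{\text{tasks}}} \times \frac{\text{Number}_{\text{tasks}}}{\text{Time}_{\text{observe}}}$$

Substituting the definitions and swapping the two terms on the right-hand side:

$$\text{Mean number of tasks in system} = \text{Arrival rate} \times \text{Mean response time}$$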
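The derivation can be checked numerically. The following is a minimal Python sketch, assuming a hypothetical log of four tasks recorded as (arrival, departure) pairs inside a 10-minute observation window. The function name `littles_law_terms` and the sample times are illustrative and do not come from the text.

```python
import math

def littles_law_terms(tasks, time_observe):
    """Compute the three quantities in Little's law from an observation log.

    tasks: list of (arrival, departure) times, all inside [0, time_observe].
    time_observe: length of the observation window, in the same time unit.
    """
    number_tasks = len(tasks)
    # Sum of the time each task spends in the system (tasks may overlap).
    time_accumulated = sum(departure - arrival for arrival, departure in tasks)

    mean_tasks_in_system = time_accumulated / time_observe
    mean_response_time = time_accumulated / number_tasks
    arrival_rate = number_tasks / time_observe
    return mean_tasks_in_system, arrival_rate, mean_response_time

# Hypothetical log: times in minutes, observed over a 10-minute window.
tasks = [(0.0, 3.0), (1.0, 4.0), (2.0, 7.0), (6.0, 10.0)]
time_observe = 10.0

mean_tasks, rate, response = littles_law_terms(tasks, time_observe)
print(mean_tasks, rate, response)

# Little's law: mean tasks in system = arrival rate x mean response time.
assert math.isclose(mean_tasks, rate * response)
```

In this sample the accumulated time (15 minutes) exceeds the observation window (10 minutes) because the tasks overlap. All three quantities use minutes, which keeps the arrival rate and the response time in the same time unit.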
Little’s law and a series of definitions lead to several useful equations:
■ Timeserver—Average time to service a task; average service rate is 1/Timeserver, traditionally represented by the symbol µ in many queuing texts.
■ Timequeue—Average time per task in the queue.
■ Timesystem—Average time/task in the system, or the response time, which is the sum of Timequeue and Timeserver.
■ Arrival rate—Average number of arriving tasks/second, traditionally represented by the symbol λ in many queuing texts.
■ Lengthserver—Average number of tasks in service.
■ Lengthqueue—Average length of queue.
■ Lengthsystem—Average number of tasks in system, which is the sum of Lengthqueue and Lengthserver.
One common misunderstanding can be made clearer by these definitions: whether the question is how long a task must wait in the queue before service starts (Timequeue) or how long a task takes until it is completed (Timesystem). The latter term is what we mean by response time, and the relationship between the terms is Timesystem = Timequeue + Timeserver.
The mean number of tasks in service (Lengthserver) is simply Arrival rate × Timeserver, which is Little’s law. Server utilization is simply the arrival rate divided by the service rate. For a single server, the service rate is 1 ⁄ Timeserver. Hence, server utilization (and, in this case, the mean number of tasks per server) is simply:
Server utilization must be between 0 and 1; otherwise, there would be more tasks arriving than could be serviced, violating our assumption that the system is in equilibrium. Note that this formula is just a restatement of Little’s law. Utilization is also called traffic intensity and is represented by the symbol ρ in many queuing theory texts.
Figure D.16 The single-server model for this section. In this situation, an I/O request “departs” by being completed by the server. [Figure labels: Arrivals, Queue, Server (I/O controller and device).]
$$\text{Server utilization} = \text{Arrival rate} \times \text{Time}_{\text{server}}$$
Example Suppose an I/O system with a single disk gets on average 50 I/O requests per second. Assume the average time for a disk to service an I/O request is 10 ms. What is the utilization of the I/O system?
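Answer Using the equation above, with 10 ms represented as 0.01 seconds, we get:

$$\text{Server utilization} = \text{Arrival rate} \times \text{Time}_{\text{server}} = \frac{50}{\text{sec}} \times 0.01\ \text{sec} = 0.50$$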
Therefore, the I/O system utilization is 0.5.
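The same calculation can be written as a short Python sketch. The helper name `server_utilization` and its range check are illustrative assumptions. The check simply encodes the requirement that utilization stay between 0 and 1 for a system in equilibrium.

```python
def server_utilization(arrival_rate_per_sec, time_server_sec):
    """Server utilization = Arrival rate x Time_server (single server).

    Both arguments must use seconds so that the time units agree.
    """
    utilization = arrival_rate_per_sec * time_server_sec
    # Outside [0, 1] the equilibrium assumption of this section is violated.
    if not 0.0 <= utilization <= 1.0:
        raise ValueError(
            f"utilization {utilization} is outside [0, 1]; "
            "the system is not in equilibrium"
        )
    return utilization

# Values from the example: 50 I/O requests per second, 10 ms per request.
time_server_sec = 10 / 1000  # convert 10 ms to seconds
utilization = server_utilization(50, time_server_sec)
print(utilization)
```

Converting milliseconds to seconds before multiplying guards against the unit mismatch that the text warns about. With a hypothetical 150 requests per second, the same call would raise `ValueError` because the product is 1.5.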
How the queue delivers tasks to the server is called the queue discipline. The simplest and most common discipline is first in, first out (FIFO). If we assume FIFO, we can relate time waiting in the queue to the mean number of tasks in the queue:
$$\text{Time}_{\text{queue}} = \text{Length}_{\text{queue}} \times \text{Time}_{\text{server}} + \text{Mean time to complete service of task when new task arrives if server is busy}$$
That is, the time in the queue is the number of tasks in the queue times the mean service time plus the time it takes the server to complete whatever task is being serviced when a new task arrives. (There is one more restriction about the arrival of tasks, which we reveal on page D-28.)
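As a rough illustration of the FIFO relation, the sketch below evaluates it with hypothetical numbers. The function name `time_queue_fifo` and the sample values (2 queued tasks, 10 ms service time, 5 ms for the last term) are assumptions made for the example. In particular, the last term is passed in as a given value, since the text has not yet said how to determine it.

```python
def time_queue_fifo(length_queue, time_server, mean_time_to_finish_current):
    """Time_queue under FIFO, following the relation in the text.

    length_queue: mean number of tasks waiting in the queue.
    time_server: mean service time per task.
    mean_time_to_finish_current: mean time to complete the task being
        serviced when a new task arrives (supplied by the caller).
    All times must use the same unit.
    """
    return length_queue * time_server + mean_time_to_finish_current

# Hypothetical values, all in milliseconds.
wait_ms = time_queue_fifo(length_queue=2, time_server=10.0,
                          mean_time_to_finish_current=5.0)
print(wait_ms)  # 2 * 10.0 + 5.0 = 25.0
```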
The last component of the equation is not as simple as it first appears. A new task can arrive at any instant, so we have no basis to know how long the existing task has been in the server. Although such requests are random events, if we know something about the distribution of events, we can predict performance.