3.3 Deadline-driven, Random
Another way to distribute the system load across the clusters is to employ a random placement of data across the clusters ,  (Figure 3.6). With deadline-driven (DD) approach, each block request is tagged with a deadline and the disk services requests using an earliest-deadline-first (EDF) policy. Note that round-robin (RR) uses an object-based retrieval while random uses a block-based retrieval, thus, a request with RR is for an entire object while with random, it is for a block of an object. Thus, a request with RR is for an entire object while with DD, it is for a block of an object.
Figure 3.6 Random placement.
With random, the system may suffer from a statistical variation of the number of block requests in a cluster. Even though all the objects are displayed with a constant rate, the probability that a cluster receives a higher number of requests in a given time than other clusters might be significant. Thus, the time to retrieve a block might be greater than a time period, resulting in a hiccup. For example, assume that Figure 3.6 demonstrates a load distribution in a system consisting of six disks at a certain time. Each disk has its own queue and requests remain in queues until being serviced. For a simple discussion, assume that each block request is tagged with the same deadline, a Tp, and each disk drive can support up to three block retrievals during a Tp. Then, all requests can be serviced in a Tp but the last two requests in the queue of disk d3. The deadlines of these two block requests are violated and hiccups happen.
We can employ a queueing model to quantify the expected hiccup probability with DD. With a Poisson arrival process and a deterministic service process, this is the probability that a request remains in the system more than the duration of a Tp (including waiting time in the queue and the service time) . In particular, when a new request finds Nsys or more waiting requests in the queue of a disk, this request cannot be serviced in Tp and will experience a hiccup.
Suppose there are d disks in a system and each disk supports a maximum of Ndisk requests in a Tp. When there are k active requests in the system, each disk receives k/d requests on the average per Tp. This is because blocks are randomly distributed across disks and a request accesses a specific disk with a probability of 1/d. Using a M/D/1 queueing model , the probability that a request spends time less than or equal to t in the system can be calculated using the queue length distribution, Pn:
where js ≤ t < (j + l)s, j = 1, 2, . . and ω is the random variable describing the total time a request spends in the queueing system, and
where s is the average service time, ρ is the system utilization (load), and λ is the arrival rate. Then, the probability of hiccups is:
The average startup latency with random placement can be defined by the average time in the queueing system (average queueing time plus service time):
3.3.1 Functionality versus Performance
Based on the analytical models in Section 3.2 and 3.3, Figure 3.7.a shows an example of the average startup latency of two approaches, RR and random. The X axis of Figure 3.7.a shows the system load (utilization) and the y axis denotes the average startup latency in seconds. In all cases, the average startup latency with RR is significantly higher than that with random.  With a high system utilization this difference becomes larger. The knee of the curves prior to a rapid increase is 85% with RR and 95% with random. While the results of Figure 3.7.a argue in favor of random, there is a fundamental difference in servicing continuous displays between RR and random. With RR, once a request is accepted, a hiccup-free display for the request is guaranteed until the end of its display. However, with random, there is a possibility of hiccups. Because hiccup-free display of continuous media is an important functionality, the hiccup probability should be minimized with random.
Figure 3.7 RR vs. random.
Figure 3.7.b shows the probability of hiccups with random as a function of the system utilization. With a utilization higher than 80%, the quality of display would suffer due to a high probability of hiccup. Thus, the maximum utilization of a server that employs random should be less than 80%.
In general, random is more appropriate to latency-sensitive applications than RR. RR is more appropriate if an application can tolerate a higher startup latency. For example, if 10 seconds of average startup latency is tolerable, RR can reach up to 90% system utilization while random realizes an 80% system utilization. However, if the average startup latency is required to be less than 2 seconds, random is the right choice.
In Chapter 4, we will discuss several techniques to reduce the hiccup probability with the deadline driven and random approach.