# Minimizing Stall Time in Single and Parallel Disk Systems.

Abstract. We study integrated prefetching and caching problems following the work of Cao et al. [1995] and Kimbrel and Karlin [1996], who gave approximation algorithms for minimizing the total elapsed time in single and parallel disk settings. The total elapsed time is the sum of the processor stall times and the length of the request sequence to be served. We show that an optimum prefetching/caching schedule for a single disk problem can be computed in polynomial time, thereby settling an open question by Kimbrel and Karlin. For the parallel disk problem, we give an approximation algorithm for minimizing stall time; the solution uses a few extra memory blocks in cache. Stall time is an important measure for this problem and is harder to approximate than elapsed time. All of our algorithms are based on a new approach that formulates the prefetching/caching problems as linear programs.

Categories and subject descriptors: D.4.2 [Operating Systems]: Storage Management--storage hierarchies; F.2.2 [Analysis of Algorithms and Problem Complexity]: Nonnumerical Algorithms and Problems--sequencing and scheduling; G.1.6 [Mathematics of Computing]: Optimization--linear programming.

General Terms: Algorithms, Design, Performance

Additional Keywords: Approximation algorithms, caching algorithms, prefetching

1. Introduction

Prefetching and caching are powerful tools for increasing the performance of file and database systems. In prefetching, memory blocks are loaded from slow memory, for example, a disk, into cache before the actual references to the blocks, so as to reduce the waiting time incurred if a block had to be fetched from disk at the moment it is referenced. Caching, on the other hand, tries to maintain the most frequently accessed blocks in cache so that they do not have to be fetched from disk. Both prefetching and caching have separately been the subjects of extensive theoretical and experimental studies.(1) However, only recently have researchers started to look at these techniques in an integrated manner and to explore the interrelationships between them.(2) In a seminal work, Cao et al. [1995] introduced a model that allows an algorithmic study of the problem.

First, consider the case when all blocks reside on one disk (Figure 1). We are given a request sequence [Sigma] = [r.sub.1], ..., [r.sub.n] and a cache of size k. Each of the n requests [r.sub.i] specifies a memory block stored on disk. We emphasize that we study the offline problem, in which the entire request sequence is given in advance. Serving a request takes one time unit. However, a request can be served only if the block requested is in cache. Fetching a block not in cache takes F time units. Thus, if we encounter a request to a block that is not in cache, we can start fetching the block from disk; in this case, the processor has to stall for F time units. A better option is to prefetch the block, that is, to initiate the fetch of the block some i time units before the actual reference; the processor then has to stall for only max{F - i, 0} time units. A prefetch operation may be initiated at any time, provided it is the only prefetch happening at that time. However--and this is where caching enters the picture--when we initiate a prefetch, we also have to make room in cache for the incoming block by evicting some block from cache. Thus, not only do we need to decide when to initiate a prefetch but also which blocks to fetch and evict. Starting a prefetch too early might force us to evict blocks that are requested fairly soon, so that we have to initiate more prefetches to avoid stalling for these blocks. On the other hand, if a prefetch is started late, the processor might have to stall for a long time. Our goal is to minimize the total stall time, which is the total time the processor is idle. This is equivalent to minimizing the total time taken to serve the request sequence, since this is just the sum of the stall times and the length of the sequence.

[Figure 1 ILLUSTRATION OMITTED]

As an example, consider the request sequence a, b, c, g, a, b, g, h and a cache of size 4, with blocks a, b, c and d initially in cache. Assume F = 5. The minimum stall time required on this sequence is 3. We begin by prefetching g and evicting block d. Since 5 time units are required to fetch g and only the three requests to a, b and c can be served in the meantime, we stall for two time units waiting for g to arrive. On the request to g, we start prefetching h and evict c, and hence have to stall for one time unit before h is in cache.
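The timing in this example can be checked mechanically. The sketch below is our own illustration, not part of the paper: it assumes a schedule format in which a prefetch (a fetch/evict pair) is initiated the moment a given request begins service, and that the single disk performs at most one fetch at a time.

```python
def stall_time(seq, initial_cache, F, schedule):
    """Total stall time of a single-disk prefetch schedule.

    schedule maps a 1-based request index i to a (fetch, evict) pair:
    the prefetch is initiated the moment request i begins service.
    """
    cache = set(initial_cache)
    t = 0            # current time
    disk_free = 0    # time at which the disk finishes its current fetch
    pending = {}     # block -> time its fetch completes
    for i, r in enumerate(seq, start=1):
        if r in pending:                  # stall until the block arrives
            t = max(t, pending.pop(r))
            cache.add(r)
        assert r in cache, f"{r} not in cache at request {i}"
        if i in schedule:
            fetch, evict = schedule[i]
            start = max(t, disk_free)     # only one fetch at a time
            disk_free = start + F
            cache.remove(evict)
            pending[fetch] = disk_free
        t += 1                            # serving a request takes one unit
    return t - len(seq)

# The schedule from the example: prefetch g (evicting d) at the first
# request, and h (evicting c) on the request to g.
sched = {1: ("g", "d"), 4: ("h", "c")}
print(stall_time(list("abcgabgh"), "abcd", F=5, schedule=sched))  # 3
```

Running it on the schedule described above reproduces the stall time of 3: two units waiting for g and one unit waiting for h.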

In the case of a parallel disk system, first explored by Kimbrel and Karlin [1996], the memory blocks are distributed over D disks with each block stored on exactly one disk. At any time, at most one block may be fetched from a given disk. However, blocks that reside on different disks may be prefetched in parallel. Any block in cache may be evicted to make room for a block being fetched. Thus, this corresponds to the setting where blocks are read-only and do not have to be written back to disk. Again, the goal is to minimize the total stall time. Since blocks from different disks can be fetched in parallel, an efficient strategy for the parallel disk case involves balancing the load, that is, the number of fetches, amongst the disks.

We give a small illustrative example for three disks (Figure 2). Suppose that disk 1 stores blocks [a.sub.1], [a.sub.2], [a.sub.3], [a.sub.4], disk 2 stores blocks [b.sub.1], [b.sub.2] and disk 3 stores blocks [c.sub.1], [c.sub.2]. We assume F = 5 and a cache of size 4. Blocks [a.sub.1], [a.sub.2], [b.sub.1], [c.sub.1] are initially in cache. In Figure 2, we give a schedule for serving the request sequence [a.sub.1], [a.sub.2], [b.sub.1], [a.sub.3], [b.sub.2], [b.sub.1], [b.sub.1], [a.sub.2], [a.sub.4], [b.sub.2], [c.sub.2] with a stall time of 4 units. The schedule shows that, while the processor stalls, prefetches may proceed concurrently on several disks. This is the case at times 4 and 5 as well as at time 11. Recall that from a given disk one can only prefetch blocks that are stored on it, while evictions can be from any disk.

[Figure 2 ILLUSTRATION OMITTED]
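Extending the same simulation sketch to several disks only requires serializing fetches per disk; fetches on different disks overlap freely. Since Figure 2 is omitted here, the toy instance below is our own (blocks x and y on different disks), not the schedule from the figure.

```python
def stall_time_parallel(seq, initial_cache, F, schedule):
    """schedule maps a 1-based request index to a list of
    (disk, fetch, evict) prefetches initiated the moment that
    request begins service."""
    cache = set(initial_cache)
    t = 0
    disk_free = {}   # disk -> time its current fetch completes
    pending = {}     # block -> completion time of its fetch
    for i, r in enumerate(seq, start=1):
        if r in pending:
            t = max(t, pending.pop(r))
            cache.add(r)
        assert r in cache
        for disk, fetch, evict in schedule.get(i, []):
            start = max(t, disk_free.get(disk, 0))  # serialize per disk
            disk_free[disk] = start + F
            cache.remove(evict)
            pending[fetch] = disk_free[disk]
        t += 1
    return t - len(seq)

# Toy instance: x on disk 1 and y on disk 2, so their fetches overlap.
par = stall_time_parallel("pxy", "pqr", F=2,
                          schedule={1: [(1, "x", "q"), (2, "y", "r")]})
# Same requests with both blocks on one disk: the fetches serialize.
seq_ = stall_time_parallel("pxy", "pqr", F=2,
                           schedule={1: [(1, "x", "q"), (1, "y", "r")]})
print(par, seq_)  # 1 2
```

The two runs illustrate the load-balancing point made above: spreading the two fetches over two disks halves the stall time on this instance.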

1.1. PREVIOUS WORK. Cao et al. [1995] analyzed two algorithms, called conservative and aggressive, for the single disk problem. The conservative strategy incurs the same faults as Belady's optimal paging algorithm [Belady 1966] but starts prefetch operations at the earliest possible point in time. The aggressive strategy starts prefetch operations at the earliest "reasonable" time.

The elapsed time of the schedule obtained by conservative (respectively, aggressive) is at most 2 (respectively min{2, 1 + F/k}) times the optimum. In addition to combinatorial analyses, Cao et al. [1995] presented extensive experimental studies of the two algorithms.

Kimbrel and Karlin [1996] studied conservative and aggressive for the parallel disk problem. They showed that the approximation ratios, when the measure is the elapsed time, are D + 1 and D(1 + (F + 1)/k), respectively. They also presented an algorithm called reverse aggressive, which is the aggressive strategy on the reverse sequence. This algorithm achieves an approximation ratio of (1 + DF/k), which is good when F/k is small, as is true in many practical applications. Kimbrel and Karlin left open the question whether an optimum prefetching/caching schedule can be computed in polynomial time even for the single disk case. A partial answer to this question was given by Kimbrel [1997], who gave a dynamic programming algorithm that decides whether a request sequence can be served with zero stall time in the single disk setting.

1.2. OUR CONTRIBUTION. In this paper, we present a new approach to the problem of minimizing stall time in single and parallel disk systems based on a linear programming formulation of these problems.

First, in Sections 2 and 3, we give a polynomial time algorithm for minimizing the stall time for the single disk problem, thereby settling a question left open by Cao et al. [1995]. In particular, we show that any solution of our linear program can be written as a convex combination of (polynomially many) integral solutions. This is equivalent to saying that there is an optimum solution to the linear program that is integral.

All results in the mathematical programming literature that prove that the optimum solution to a certain linear program is integral do so by arguing that all vertices of the corresponding polytope are integral. This is done either by arguing that the constraint matrix is totally unimodular, as in the case of bipartite matching and maximum s-t flow, or by combinatorial arguments, as for the matching and matroid polytopes [Schrijver 1986]. However, the polytope corresponding to the LP we consider has nonintegral vertices. In our case, we prove that the optimal solution to the linear program is integral by exploiting specific characteristics of our objective function: we show that every optimal solution that is not integral is a convex combination of a set of integral solutions.

In Section 4, we study the parallel disk problem; the main novelty here is that we minimize the total stall time instead of the total elapsed time. While computing the optimum for these two measures is equivalent, approximating the total stall time is harder than approximating the elapsed time, since the length of the sequence is not part of the objective function. Minimizing the total stall time is the real objective of an efficient prefetching/caching strategy. We generalize the linear program and the proof techniques presented in Sections 2 and 3 for a single disk to the setting of parallel disks. An optimum solution to the linear program is then transformed into an integral solution that achieves a D-approximation for the total stall time. The solution constructed uses at most D - 1 additional memory locations in cache. This is very small compared with the size of the cache, since D is typically 4 or 5.

Note that for D = 1, we obtain our optimum algorithm for the single disk case. Another pleasing feature of our algorithm is that, if a sequence can be served with zero stall time, we obtain a schedule that also has no stall and uses at most D - 1 extra memory locations in cache. Finally, in Appendix A, we demonstrate an integrality gap of D for a linear programming formulation stronger than the one used for parallel disk systems and, in Appendix B, we show that if no extra memory locations are allowed, then the integrality gap of our linear program can be arbitrarily large.

In Section 5, we conclude with some remarks and open problems.

2. The LP Formulation for a Single Disk

We first formulate a 0-1 linear program for the single disk problem. We assume that the request sequence is of length n. It is no loss of generality to assume that the cache initially contains k blocks that are never requested in the sequence. We identify the periods in which a prefetch is performed by considering intervals of the request sequence of length at most F. An interval I = (i, j) of length |I| = j - i - 1 indicates a prefetch starting after request i and ending before request j, with i = 0, ..., n - 1, j = 1, ..., n, and i [is less than] j. Since a prefetch requires F time units, an interval I of length less than F is viewed as having a stall time of F - |I| units at the end. With every such interval I, we associate a variable x(I), which is 1 when a prefetch is performed in interval I and 0 otherwise. Thus, minimizing the total stall time is equivalent to minimizing [[Sigma].sub.I] x(I)(F - |I|).

Note that we have also included zero-length intervals of the form (i, i + 1), i = 0, ..., n - 1. A prefetch performed in such an interval corresponds to the processor stalling for F time units after request i. Also note that the total number of intervals is bounded by n min{(F + 1), n}.
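As a quick sanity check of this count (our own snippet, reading the model as using only intervals of length at most F, since a fetch never needs more than F time units):

```python
def prefetch_intervals(n, F):
    # interval (i, j): the prefetch starts after request i and ends
    # before request j; its length j - i - 1 never needs to exceed F
    return [(i, j) for i in range(n)
                   for j in range(i + 1, n + 1)
                   if j - i - 1 <= F]

# for each i there are at most min(F + 1, n) choices of j
for n, F in [(8, 5), (8, 3), (4, 10)]:
    assert len(prefetch_intervals(n, F)) <= n * min(F + 1, n)
```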

An interval (a, b) is contained in an interval (c, d) if c [is less than or equal to] a and d [is greater than or equal to] b; we denote this by (a, b) [subset or equal to] (c, d). To ensure that two prefetches are not performed simultaneously, we add for each i, 1 [is less than or equal to] i [is less than or equal to] n, the constraint that [[Sigma].sub.I:(i - 1,i) [subset or equal to] I] x(I) [is less than or equal to] 1.

With each interval I and each distinct block a, we associate a variable [f.sub.I,a] (respectively, [e.sub.I,a]), which is 1 if block a is fetched (respectively, evicted) in interval I and 0 otherwise. Clearly,

[[Sigma].sub.a] [f.sub.I,a] = [[Sigma].sub.a] [e.sub.I,a] = x(I), for all intervals I.

In a feasible solution, prefetches are scheduled so that a block is in cache when it is referenced. This constraint is enforced as follows: After each reference to a block, the block is in cache. It can then be evicted at most once up until the next reference to that block, and if it is, it must also be fetched back prior to that next reference. Thus, if i, j are two consecutive references to a block a, then

[[Sigma].sub.I [subset or equal to] (i,j)] [f.sub.I,a] = [[Sigma].sub.I [subset or equal to] (i,j)] [e.sub.I,a] [is less than or equal to] 1.

To ensure that every block is in cache at its first reference, we require that the total fetch of a block on intervals before its first reference is 1 and the total eviction of the block on these intervals is 0. Thus, if i is the first reference to block a, [[Sigma].sub.I [subset or equal to] (0,i)] [f.sub.I,a] = 1 and [[Sigma].sub.I [subset or equal to] (0,i)] [e.sub.I,a] = 0.

We also impose that a block is evicted to a total extent of at most 1 after its last reference. Let i be the last reference to a block a (with i = 0 for a block initially in cache and never requested); then [[Sigma].sub.I [subset or equal to] (i,n)] [e.sub.I,a] [is less than or equal to] 1.

Finally, we require that on each request, the requested block is neither prefetched nor evicted, that is, if i is a reference to block a, then

[f.sub.I,a] = [e.sub.I,a] = 0 for every interval I = (c, d) with c [is less than] i [is less than] d.

We give below a compact description of the 0-1 linear program. In the request sequence [Sigma], let [a.sub.1], ..., [a.sub.l(a)] be the requests to block a. Let l(a) = 0 for a block initially in cache that is never requested in the sequence.

minimize [[Sigma].sub.I] x(I)(F - |I|)

subject to

[[Sigma].sub.I:(i - 1,i) [subset or equal to] I] x(I) [is less than or equal to] 1, for all 1 [is less than or equal to] i [is less than or equal to] n;
[[Sigma].sub.a] [f.sub.I,a] = [[Sigma].sub.a] [e.sub.I,a] = x(I), for all I;
[[Sigma].sub.I [subset or equal to] ([a.sub.j], [a.sub.j+1])] [f.sub.I,a] = [[Sigma].sub.I [subset or equal to] ([a.sub.j], [a.sub.j+1])] [e.sub.I,a] [is less than or equal to] 1, for all a and 1 [is less than or equal to] j [is less than] l(a);
[[Sigma].sub.I [subset or equal to] (0, [a.sub.1])] [f.sub.I,a] = 1 and [[Sigma].sub.I [subset or equal to] (0, [a.sub.1])] [e.sub.I,a] = 0, for all a with l(a) [is greater than or equal to] 1;
[[Sigma].sub.I [subset or equal to] ([a.sub.l(a)], n)] [e.sub.I,a] [is less than or equal to] 1, for all a (with [a.sub.0] = 0);
[f.sub.I,a] = [e.sub.I,a] = 0, for all a and all I = (c, d) with c [is less than] [a.sub.j] [is less than] d for some j;
x(I), [f.sub.I,a], [e.sub.I,a] [element of] {0, 1}, for all I, a.

The linear programming relaxation of this 0-1 linear program is obtained by relaxing the 0-1 constraints on the variables x(I), [f.sub.I,a], [e.sub.I,a] to 0 [is less than or equal to] x(I), [f.sub.I,a], [e.sub.I,a] [is less than or equal to] 1. Clearly, the value of the optimum solution to this linear program is at most the minimum stall time.

3. Minimizing Stall Time for a Single Disk

In this section, we consider an arbitrary optimum solution to the linear program and show how to write it as a convex combination of polynomially many integral solutions. It then follows that all these integral solutions have a stall time that is at most the stall time of the fractional solution and hence at most the minimum stall time. Thus, we show the following result.

THEOREM 3.1. An optimum prefetching/caching schedule for a single disk can be computed in polynomial time.

In the next subsections, we first identify important properties of an optimum solution. We prove that there exists an optimum solution that does not contain nested pairs of intervals and then analyze the sequence of fetches and evictions of blocks.

3.1. MODIFYING INTERVALS. Let I be the set of intervals with x(I) [is greater than] 0, that is, I = {I|x(I) [is greater than] 0}. An interval [I.sub.1] = ([i.sub.1], [j.sub.1]) is properly contained in interval [I.sub.2] = ([i.sub.2], [j.sub.2]) iff [i.sub.1] [is greater than] [i.sub.2] and [j.sub.1] [is less than] [j.sub.2]; a pair of intervals such that one is properly contained in the other is called a nested-pair. Let [I.sub.1] [element of] I be properly contained in [I.sub.2] [element of] I and let x = min{x([I.sub.1]), x([I.sub.2])}. We reduce each of x([I.sub.1]), x([I.sub.2]) by an amount x; this causes one of x([I.sub.1]), x([I.sub.2]) to go down to zero and we remove the corresponding interval. We also add two new intervals [J.sub.1] = ([i.sub.2], [j.sub.1]) and [J.sub.2] = ([i.sub.1], [j.sub.2]) with x([J.sub.1]) = x([J.sub.2]) = x. The fetch in [J.sub.1] (respectively, [J.sub.2]) is the same as the fetch in [I.sub.1] (respectively, [I.sub.2]), while the evict in [J.sub.1] (respectively, [J.sub.2]) is the same as the evict in [I.sub.2] (respectively, [I.sub.1]). Since [J.sub.1] ends with [I.sub.1], the blocks that were fetched in [I.sub.1] still arrive in cache at the same time. Further, since [J.sub.1] begins with [I.sub.2], the blocks evicted in [I.sub.2] are evicted from cache at the same time as before. The same is true for the blocks fetched/evicted in interval [J.sub.2], and hence the new solution also satisfies all the constraints of the LP (Figure 3). Furthermore, since the total length of intervals [J.sub.1], [J.sub.2] is the same as that of [I.sub.1], [I.sub.2] and the reduction in x([I.sub.1]), x([I.sub.2]) is the same as the increase in x([J.sub.1]), x([J.sub.2]), the value of the objective function remains unchanged.

[Figure 3 ILLUSTRATION OMITTED]

Thus, any nested-pair of intervals can be replaced by a set of at most 3 intervals none of which properly contains the other. By performing this transformation for every nested-pair we obtain an equivalent fractional solution without nested-pairs. Henceforth, I denotes this new set of intervals.
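The uncrossing step can be sketched as follows. This is our illustrative code: it tracks only the x values, and omits the accompanying reassignment of fetches and evictions to the new intervals described above.

```python
def remove_nested_pairs(x):
    """x: dict mapping interval (i, j) -> positive weight.
    Repeatedly replaces weight on a nested pair I1 = (i1, j1) inside
    I2 = (i2, j2) by weight on J1 = (i2, j1) and J2 = (i1, j2)."""
    x = dict(x)
    while True:
        pair = next((((i1, j1), (i2, j2))
                     for (i1, j1) in x for (i2, j2) in x
                     if i2 < i1 and j1 < j2), None)
        if pair is None:
            return x
        (i1, j1), (i2, j2) = pair
        w = min(x[(i1, j1)], x[(i2, j2)])
        for k in ((i1, j1), (i2, j2)):   # reduce the nested pair
            x[k] -= w
            if x[k] == 0:
                del x[k]
        for k in ((i2, j1), (i1, j2)):   # add the uncrossed pair
            x[k] = x.get(k, 0) + w

print(remove_nested_pairs({(0, 5): 1.0, (1, 3): 0.5}))
# {(0, 5): 0.5, (0, 3): 0.5, (1, 5): 0.5}
```

Each swap preserves the total weighted length and the weight crossing every gap of the request sequence, so the objective value and the LP constraints are unaffected, exactly as argued above.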

We now order the intervals in I by increasing starting points; if two intervals have the same start point, then they are ordered by increasing end-points. We could also have ordered the intervals by increasing end-points, breaking ties by looking at starting points. It turns out that since I has no nested-pairs these two orderings are identical. Let [is less than] denote this total order on I.

3.2. THE OPTIMUM FRACTIONAL SOLUTION. As observed in Cao et al. [1995], there is an optimum (integral) solution that obeys the following two rules for fetching/evicting blocks: at any point the block fetched is the block not in cache whose next reference is earliest and the block evicted is that block in cache whose next reference is furthest in the future. The optimum fractional solution also follows these rules albeit in a fractional sense.

Consider intervals in the order [is less than] and let C denote the cache configuration after we have performed the fetches and evicts corresponding to the first i intervals in this order. Note that each block is in C to an extent between 0 and 1. Further, let I be the (i + 1)-st interval. As we shall see, there exists an optimum fractional solution for which the next two claims are satisfied.

CLAIM 3.1. In interval I, we fetch the block that is not completely in C and whose next reference is earliest.

PROOF. For contradiction, assume that this block, say a, is not fetched up to 1 in I and let b be one of the blocks fetched in I. Let [f.sub.I,b] be the amount of b fetched in I and let [[Delta].sub.a] be the amount of a that does not reside in cache when the fetches/evictions in I are complete. Let [Delta] = min{[f.sub.I,b], [[Delta].sub.a]}. We can now fetch a instead of b in interval I and fetch b in those intervals where a is fetched. More precisely, in I we reduce [f.sub.I,b], by [Delta] and increase [f.sub.I,a] by [Delta]. In the subsequent intervals I' (up to the next request to a) with [f.sub.I',a] [is greater than] 0 we reduce [f.sub.I',a] by a total of [Delta] and increase [f.sub.I',b], by a total of [Delta]. Since the next reference of b is later than the next reference of a, b will be fetched before it is referenced. Repeating this process, we obtain an interval I with the desired property. []

CLAIM 3.2. In interval I, we evict the block that is partially or completely in C and whose next reference is furthest.

PROOF. For contradiction, assume that this block, say a, is not evicted up to 1 in I and let b be one of the blocks evicted in I. Let [[Delta].sub.a] be the amount of a that still resides in cache when the fetches/evictions in I are complete and let [e.sub.I,b] be the amount of b evicted in I. Let [Delta] = min{[[Delta].sub.a], [e.sub.I,b]}. We can evict a instead of b in I and fetch back a in those intervals where b is fetched. In the interval I we reduce [e.sub.I,b] by [Delta] and increase [e.sub.I,a] by [Delta]. In the subsequent intervals I' with [f.sub.I',b] [is greater than] 0 we reduce [f.sub.I',b] by a total of [Delta] and increase [f.sub.I',a] by a total of [Delta]. Since the next reference of a is only after the next reference of b, a will be fetched before it is referenced. Repeating this process, the claim follows. []

The amount of fetch of a block prescribed by Claim 3.1 might be less than x(I) if the block is brought completely into cache. In such a case, we apply the same rule to fetch another block in I. The same is true for the case of evictions in Claim 3.2. The above two claims then tell us what blocks to fetch/evict in I. This then gives us a new cache configuration, which we use to decide what blocks to fetch/evict in the interval that follows I in the order [is less than].

Define the distance of interval I, dist(I), as the sum of x(I') over the intervals I' that precede I in [is less than], that is, dist(I) = [[Sigma].sub.I' [is less than] I] x(I'). We can also view the process of fetching/evicting as a process in time by associating the time interval [dist(I), dist(I) + x(I)) with interval I; thus, there is a unique interval in I associated with each time instant. Note that the time ranges from 0 to T, where T = [[Sigma].sub.I [element of] I] x(I). We will also associate a unique fetch/evict with each time instant. If I [element of] I is the interval associated with time t and a is the only block fetched and b the only block evicted in I, then we fetch a and evict b at time t. If there are many blocks fetched/evicted in I, then we order them as follows: For any two blocks a, b fetched in I, a precedes b iff the next reference to a is before the next reference to b. This defines a total order on the blocks fetched in I; let [a.sub.1], [a.sub.2], ..., [a.sub.p] be the blocks in this order. Block [a.sub.i] is now fetched for [f.sub.I,[a.sub.i]] time units starting at time dist(I) + [[Sigma].sub.j [is less than] i] [f.sub.I,[a.sub.j]]. Similarly, for any two blocks a, b evicted in I, a precedes b iff the next reference to a is after the next reference to b. This defines another total order on the blocks evicted in I; let [b.sub.1], [b.sub.2], ..., [b.sub.q] be the blocks in this order. Block [b.sub.i] is now evicted for [e.sub.I,[b.sub.i]] time units starting at time dist(I) + [[Sigma].sub.j [is less than] i] [e.sub.I,[b.sub.j]].

From the above two claims and our ordering of the fetches/evicts within an interval, it follows that a block a is fetched continuously till it is fully in cache. With regard to evictions the situation is different. The eviction of a could be interrupted before it is completely out of cache if during the course of its eviction, there is a reference to a block b whose subsequent reference is further in the future than the next reference to a.

In the remainder of this section, we consider the fetches/evictions of a block a between two consecutive references to a.

LEMMA 3.2. Every interruption in the eviction of a lasts an integral number of time units.

PROOF. The eviction of a is interrupted to evict a set of blocks whose next reference is further in the future than the next reference to a. The eviction of a is resumed only after these blocks have been fully evicted. Since these blocks were initially completely in cache, the total length of the interruption in the eviction of a is integral. []

We say that a is partially fetched/evicted if the total extent to which a is fetched/evicted between two consecutive references is strictly less than one.

LEMMA 3.3. If a is partially fetched/evicted, then the fetch of a begins an integral number of time units after the start of its eviction.

PROOF. At any point in time after the first eviction has begun, consider the blocks that are only partially in cache. Since the cache is full, the total extent to which these blocks are not in cache is integral. Hence, the total time spent in bringing these blocks completely into cache is integral.

When the eviction of a starts, the next reference to a is later than the next reference of any other partially evicted block. Hence, these partially evicted blocks will be fetched back completely into cache before a is fetched. By the above argument, fetching these partially evicted blocks requires an integral number of time units. Let us now consider the other blocks that are fetched between the time we start evicting a and the time we start fetching a back. These blocks were not in cache when we started evicting a and so are fetched completely into cache before we start fetching a. Hence, the total time required to fetch these blocks is also integral. The two contributions together amount to an integral number of time units, thus proving the claim. []

The next lemma holds for any time instant t [element of] [0, T), where T = [[Sigma].sub.I [element of] I] x(I), and not only for time instants where fetch/evict operations begin or end. This will be important in the remainder of the proof. Recall that, for any time instant, there is a unique block fetched and evicted at that time.

LEMMA 3.4. If a is evicted at time t and referenced again after time t, then there is a time t' = t + i, for some integer i, at which a is fetched back.

PROOF. We first assume that a is partially fetched/evicted. By Lemma 3.3, the difference in the times at which we start evicting a and fetching a back is integral. Once we start fetching a, we fetch it continuously till it is completely in cache. The eviction of a could however be interrupted. But, by Lemma 3.2, every interruption is for some integral time units.

If a is fetched/evicted completely, then it is no longer the case that the start of the eviction and the fetch of a are integral time units apart. However, it is still true that once we begin fetching a we fetch it continuously for one time unit, after which it is completely in cache, and that every interruption in the eviction of a is for some integral time units. []

3.3. THE CONVEX DECOMPOSITION. We decompose the fractional solution into a convex combination of integral solutions as described below. Let t be in the range [0, 1) and let [t.sub.i] = i + t for every integer i [is greater than or equal to] 0 with [t.sub.i] [is less than] T. Let [I.sub.t] be the set of intervals corresponding to the time instants [t.sub.i].

CLAIM 3.3. Let [t.sub.1], [t.sub.2] be two time instants such that [t.sub.2] = [t.sub.1] + i for some positive integer i, and let [I.sub.1], [I.sub.2] be the intervals associated with these time instants. Then [I.sub.1] and [I.sub.2] are disjoint.

PROOF. We have [t.sub.2] [is greater than or equal to] [t.sub.1] + 1. Therefore, the sum of the values of all intervals between (and including) [I.sub.1] and [I.sub.2] in [is less than] is at least 1. Hence, [I.sub.1], [I.sub.2] cannot overlap. []

LEMMA 3.5. For any time t [element of] [0, 1), the set of intervals [I.sub.t] forms a feasible solution.

PROOF. Claim 3.3 ensures that the intervals of [I.sub.t] are disjoint. In the interval corresponding to [t.sub.i], we fetch the block that is fetched at time [t.sub.i] and evict the block that is evicted at time [t.sub.i]. Lemma 3.4 ensures that if a block is evicted in some interval I [element of] [I.sub.t] and referenced again afterwards, then it is fetched back in some later interval I' [element of] [I.sub.t] before its next reference. Thus, the set of intervals [I.sub.t] and the associated fetches and evicts form a feasible integral solution to the problem. []

Consider the set of different solutions obtained as t varies from 0 to 1. Note that each solution is obtained not for just one value of t but for a range of values, say for all t in the range [[x.sub.p-1], [x.sub.p]), p = 1, ..., q, [x.sub.0] = 0, [x.sub.q] = 1. We assign to the pth solution a weight [x.sub.p] - [x.sub.p-1] in the decomposition. Clearly, the total weight of the solutions in which an interval I occurs equals x(I). Further, since t ranges from 0 to 1, the sum of the weights assigned to all solutions is 1. Hence, this collection of solutions with the associated weights is a convex decomposition of the optimum fractional solution. Denote by val(p) the cost of the pth solution and by I(p) the set of intervals of the pth solution. We then have

[[Sigma].sup.q.sub.p=1] ([x.sub.p] - [x.sub.p-1]) val(p) = [[Sigma].sub.I] x(I)(F - |I|),

which implies that each of the q integral solutions is an optimal feasible solution: the stall time of each integral solution is at least the optimum value of the linear program, and the weighted average above equals that value.
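The slicing construction above can be sketched as follows (our illustration: interval names stand in for the associated fetches and evicts, and the feasibility of each slice is exactly what Claim 3.3 and Lemma 3.4 guarantee):

```python
def convex_decomposition(intervals):
    """intervals: list of (name, x) pairs in the order <, with x in (0, 1].
    Returns a list of (weight, [names]): for each range of t in [0, 1),
    the intervals hit by the time instants t, 1 + t, 2 + t, ..."""
    starts, d = [], 0.0
    for _, x in intervals:
        starts.append(d)        # dist(I) = total x of preceding intervals
        d += x
    T = d
    # solutions change only where some dist(I) or dist(I) + x(I) has
    # fractional part t, so cut [0, 1) at those fractional parts
    cuts = sorted({0.0, 1.0}
                  | {s % 1.0 for s in starts}
                  | {(s + x) % 1.0 for s, (_, x) in zip(starts, intervals)})
    solutions = []
    for lo, hi in zip(cuts, cuts[1:]):
        t = (lo + hi) / 2       # any representative t in [lo, hi)
        chosen = []
        while t < T:
            # unique interval whose time range [dist, dist + x) contains t
            for (name, x), s in zip(intervals, starts):
                if s <= t < s + x:
                    chosen.append(name)
                    break
            t += 1.0
        solutions.append((hi - lo, chosen))
    return solutions

print(convex_decomposition([("A", 0.5), ("B", 0.5), ("C", 1.0)]))
# [(0.5, ['A', 'C']), (0.5, ['B', 'C'])]
```

On the toy input, the slice weights sum to 1 and each interval appears with total weight equal to its x value, as the decomposition requires.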

4. The Multiple Disk Case

In this setting, the blocks are distributed over D different disks. At any point, we can fetch at most one block residing on a disk but fetches from different disks may proceed simultaneously.

4.1. THE LINEAR PROGRAM. The 0-1 linear program for this case differs from the one for the single-disk setting in that we now have one copy of interval I [element of] I for each disk. Let [I.sup.d], d = 1, ..., D, denote the copy of interval I for disk d; henceforth, we view intervals [I.sup.1], [I.sup.2], ..., [I.sup.D] as distinct intervals. Let x([I.sup.d]) be 1 if a fetch is performed from disk d in interval I and 0 otherwise. Similarly, let [e.sub.[I.sup.d],a] (respectively, [f.sub.[I.sup.d],a]) be 1 if block a is evicted (respectively, fetched) in interval [I.sup.d]. Since only blocks that reside on disk d can be fetched in interval [I.sup.d], we have that [f.sub.[I.sup.d],a] = 0 if a is not on disk d. As before,

[[Sigma].sub.a] [f.sub.[I.sup.d],a] = [[Sigma].sub.a] [e.sub.[I.sup.d],a] = x([I.sup.d]), for all [I.sup.d].

To ensure that two prefetches from the same disk are not performed simultaneously, we add for each disk d, 1 [is less than or equal to] d [is less than or equal to] D, and request i, 1 [is less than or equal to] i [is less than or equal to] n, the constraint [[Sigma].sub.I:(i - 1,i) [subset or equal to] I] x([I.sup.d]) [is less than or equal to] 1. As in the single disk setting, we require that the total fetch of a block a on intervals between two consecutive references of a equals the total eviction of a on these intervals and is at most 1, and that a block is brought to cache for an extent equal to 1 before its first reference. Moreover, no block may be fetched or evicted while it is referenced.

Let I be the set of intervals with x([I.sup.d]) = 1 in a feasible solution to the 0-1 linear program. Note that for intervals [I.sub.1.sup.[d.sub.1]], [I.sub.2.sup.[d.sub.2]] [element of] I, if [d.sub.1] = [d.sub.2], then intervals [I.sub.1], [I.sub.2] are nonoverlapping. Thus, the stall time for this solution is at least ([[Sigma].sub.I,d] x([I.sup.d])(F - |[I.sup.d]|))/D. Hence, the objective function for this 0-1 linear program is to minimize ([[Sigma].sub.I,d] x([I.sup.d])(F - |[I.sup.d]|))/D. As in the single-disk setting, we relax the 0-1 constraints on the variables x([I.sup.d]), [f.sub.[I.sup.d],a], [e.sub.[I.sup.d],a] to 0 [is less than or equal to] x([I.sup.d]), [f.sub.[I.sup.d],a], [e.sub.[I.sup.d],a] [is less than or equal to] 1. An optimum solution to this linear program can now be found in polynomial time. In Appendix A, we give an alternate linear program that models the objective function more accurately. However, we show that the approximation ratio achievable using the corresponding linear program relaxation cannot be better than D.

4.2. THE OPTIMUM FRACTIONAL SOLUTION. Let [I.sup.d] = {[I.sup.d] | x([I.sup.d]) [is greater than] 0} be the set of intervals from disk d that have a positive value and let I = [[union].sub.d] [I.sup.d]. As in the single disk setting, we can modify intervals so that each [I.sup.d] contains no nested-pairs. We order the intervals in I by increasing starting points, with ties broken first by increasing end-points and then by the number of the disk to which the interval belongs; let [is less than] denote this order. Note that for intervals from one disk the order [is less than] is exactly the same as for the single-disk setting.

Once again, consider intervals in the order < and let C denote the cache configuration after we have performed the fetches and evictions corresponding to the first i intervals in this order. Let I^d be the (i + 1)-st interval. The following claims are the multi-disk counterparts of Claims 3.1 and 3.2.

CLAIM 4.1. In I^d, we fetch the block from disk d that is not completely in C and whose next reference is earliest.

CLAIM 4.2. If we evict a block from disk j in interval I^d, then it is the block from disk j that is partially or completely in C and whose next reference is furthest.

4.3. CONSTRUCTING A FEASIBLE SOLUTION. The multi-disk setting differs from the single-disk setting in that for an interval I^d we only know which block to evict from each disk; we do not know from which disk to select the block to evict.

As in the single-disk setting, define the distance of an interval I^d, dist(I^d), as the sum of x(I'^d) over intervals I'^d ∈ I^d that precede I^d in the order <, that is, dist(I^d) = ∑_{I'^d < I^d} x(I'^d). Once again, we view this as a process in time by associating the time interval [dist(I^d), dist(I^d) + x(I^d)] with interval I^d. Thus, on each disk there is a unique interval in I^d associated with each time instant. As before, we order the blocks fetched in I^d by increasing order of their next references. Let a_1, a_2, ..., a_p be the blocks in this order. Block a_i is now fetched for f(a_i, I^d) time units starting at time dist(I^d) + ∑_{j<i} f(a_j, I^d). Thus, at each time instant, we fetch a unique block from each disk.
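A minimal sketch of this bookkeeping, assuming intervals on one disk are already in the order < and that per-block fetch amounts are given (the list/dict shapes are illustrative):

```python
# Sketch: dist(I^d) is the cumulative fractional weight x of the
# same-disk intervals preceding I^d in the order <; within I^d, the
# fetch of block a_i starts after the fetches of earlier blocks.
def dists(intervals_on_disk, x):
    """intervals_on_disk: interval ids in the order <;
    x: fractional weight per interval id."""
    out, acc = [], 0.0
    for I in intervals_on_disk:
        out.append(acc)   # dist of this interval
        acc += x[I]
    return out

def fetch_starts(dist_I, fetch_amounts):
    """fetch_amounts: f(a_1, I), f(a_2, I), ... in order of next
    reference; returns each block's fetch start time."""
    starts, acc = [], dist_I
    for f in fetch_amounts:
        starts.append(acc)
        acc += f
    return starts
```

For example, two intervals of weight 1/2 on one disk occupy the disjoint time windows [0, 0.5) and [0.5, 1.0).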

At each time instant, we will also evict a unique block from each disk. Let P^d be the set of blocks that reside on disk d. Let a_1, ..., a_p ∈ P^d be the blocks from disk d that are evicted in interval I, ordered in decreasing order of their next reference. Block a_i is evicted for e(a_i, I) time units starting at time ev_d(I) + ∑_{j<i} e(a_j, I), where ev_d(I) denotes the total amount of eviction of blocks from disk d in intervals preceding I in the order <. Note that, if there were only one disk, then the time at which we start evicting a_i would be exactly dist(I) + ∑_{j<i} e(a_j, I), which is how we had defined the starting time of this eviction earlier. However, if a_i is evicted at time t, then, unlike the single-disk setting, it is not necessary that in the fractional solution a_i is evicted in one of the intervals associated with time instant t. This is because two or more blocks from the same disk d may be evicted simultaneously from cache while fetch operations are performed. Thus, within a time range, the amounts of evictions can be highly unbalanced across the different disks.

The machinery we developed for the single-disk case can now be applied to each disk in the multi-disk setting. We consider the fetches/evictions of blocks that reside on any disk as a process in time as in the single-disk case. Using Claims 4.1 and 4.2, we can extend Lemmas 3.2 and 3.3, which can then be used, exactly as before, to prove Lemma 3.4 for the multi-disk setting.

Extending Claim 3.3 to the multi-disk setting yields

CLAIM 4.3. Let t_1, t_2 be two time instants such that t_2 = t_1 + i for some positive integer i, and let I_1^d and I_2^d be the intervals on disk d associated with these time instants. Then I_1^d and I_2^d are disjoint.

We now show how to obtain an integral feasible solution that uses at most D − 1 extra memory locations. We show how to construct an integral solution for every t in the range [0, 1). Let t_i = i + t, for every integer i ≥ 0. For each time instant t_i and each disk d, the unique interval in I^d associated with t_i is part of the solution. Let I_t denote this set of intervals. We are left with the problem of assigning fetches and evictions to the intervals of I_t. In the interval corresponding to t_i and disk d, we fetch the block from disk d that is fetched at time t_i. The block that resides on disk d and is evicted at time t_i will also be evicted in this solution, albeit possibly in a different interval.
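The selection of I_t on one disk can be sketched as follows: given the time windows [dist, dist + x) of the intervals in the order <, pick for each instant t_i = i + t the window containing it. The window list format is an illustrative assumption.

```python
# Sketch: for an offset t in [0, 1), select on one disk the unique
# interval whose window [start, end) contains each instant i + t.
def select_I_t(windows, t, horizon):
    """windows: list of (interval_id, start, end), with end = start + x,
    covering [0, horizon) on one disk; returns intervals hit by i + t."""
    chosen, i = [], 0
    while i + t < horizon:
        ti = i + t
        for (I, s, e) in windows:
            if s <= ti < e:
                chosen.append(I)
                break
        i += 1
    return chosen

picked = select_I_t([('A', 0.0, 0.6), ('B', 0.6, 1.4), ('C', 1.4, 2.0)],
                    t=0.5, horizon=2.0)
```

By Claim 4.3, the windows of any two selected intervals on the same disk lie one full unit apart, so the chosen intervals are pairwise disjoint.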

Evictions are assigned to intervals of I_t in the following manner. Consider the intervals in I in the order < and let I be the current interval. Suppose there is a block that is evicted in I and the same block is evicted at time t_i for some i. We then add this block to a set S (S is the set of block evictions that still need to be assigned to intervals of I_t and is initially empty). If I ∈ I_t and S is not empty, then remove a block from S and assign it to interval I; no block is evicted in interval I ∈ I_t in this solution if the set S is empty.
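The assignment pass above is a single sweep with a pending set S; a minimal sketch (the input encoding is an assumption for illustration):

```python
# Sketch of the eviction-assignment sweep over intervals in the
# order <. Each entry is (in_I_t, evicted_block_or_None), where the
# block is non-None when this interval evicts a block that is also
# evicted at some integer instant t_i.
def assign_evictions(intervals):
    """Returns {interval index in sweep order: assigned block}."""
    S, assigned = [], {}
    for idx, (in_I_t, blk) in enumerate(intervals):
        if blk is not None:
            S.append(blk)          # eviction still to be placed
        if in_I_t and S:
            assigned[idx] = S.pop(0)  # serve a pending eviction
    return assigned
```

Intervals of I_t reached while S is empty end up with no eviction assigned; Lemma 4.1 bounds their number by D − 1.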

LEMMA 4.1. The solution I_t for any t ∈ [0, 1) is a feasible solution for the prefetching/caching problem on D parallel disks that uses at most D − 1 extra blocks in cache.

PROOF. By Claim 4.3, any two intervals in I_t that are from the same disk are disjoint. If in our solution we fetch a block in an interval I, then the same block is fetched in I in the fractional solution. If the fractional solution evicts a block in an interval I, then in our solution the block is evicted in an interval whose starting point is no earlier than the starting point of I. Next, consider two consecutive references to a block a. By Lemma 3.4, it follows that if a is evicted in some interval of this solution, then it is also fetched back. Thus, this assignment of fetches/evictions to intervals of I_t is a feasible solution to the problem, provided that every interval of I_t has an eviction assigned to it.

We next prove that at most D − 1 intervals do not have an eviction assigned. Our procedure for assigning evictions to intervals of I_t considers intervals of I in the order <. At any step, let F be the number of intervals of I_t encountered and E the number of evictions encountered that will be assigned to intervals in I_t. We prove that F − E ≤ D − 1.

Let f_d and e_d be the total amounts of fetching and eviction of blocks from disk d up to this point. Clearly, [expression not reproducible in source]. Further, [expression not reproducible in source]. The claim that F ≤ E + D − 1 follows from

[expression not reproducible in source]

Assume that the interval I is in I_t and that there are D − 1 intervals preceding I in the order < that belong to I_t and do not have an eviction assigned. Since at any point F − E ≤ D − 1, the set S is not empty, and hence I will be assigned an eviction.

Since at most D − 1 intervals do not have an eviction assigned, we can use D − 1 extra cache locations to fetch the blocks fetched in these intervals. Note that a block fetched into one of these extra locations can be evicted later and replaced by a different block. Thus, for every t ∈ [0, 1), we have a feasible solution that uses at most D − 1 extra blocks in cache. □

Consider the different solutions obtained as t varies from 0 to 1. Note that every solution is obtained not for just one value of t but for a range of values. Let 0 ≤ x_1 < x_2 < ... < x_s < 1 be a set of values such that if we start fetching/evicting a block at time t on disk d, or if dist(I^d) = t for some I^d, then there exists a value x_i such that x_i = t mod 1. From our definition of I_t and the fetches/evictions assigned to intervals in I_t, it follows that for all t in the range [x_i, x_{i+1}) we obtain the same solution. We assign this solution a weight x_{i+1} − x_i. Clearly, the total weight of the solutions in which an interval I^d occurs is equal to x(I^d). Further, since t ranges from 0 to 1, the sum of the weights assigned to all the solutions is 1. Hence, this collection of solutions with the associated weights is a convex decomposition of the optimum fractional solution. The total weighted stall time (the stall time of the solution corresponding to x_i ≤ t < x_{i+1} is weighted by x_{i+1} − x_i) of these s solutions is at most ∑_{I,d} x(I^d)(F − |I^d|). Hence, there is a solution of stall time at most ∑_{I,d} x(I^d)(F − |I^d|), which is at most D times the stall time of the optimum fractional solution.
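The breakpoints x_1 < ... < x_s and the weights of the convex decomposition can be computed from the fractional parts of the event times (fetch/evict start times and dist values); a small sketch with illustrative event times:

```python
# Sketch: breakpoints are the distinct fractional parts of all event
# times; the solution for t in [x_i, x_{i+1}) gets weight
# x_{i+1} - x_i, and the weights sum to 1.
def breakpoint_weights(event_times):
    xs = sorted({t % 1.0 for t in event_times} | {0.0})
    return [(xs[i], (xs[i + 1] if i + 1 < len(xs) else 1.0) - xs[i])
            for i in range(len(xs))]

w = breakpoint_weights([0.0, 1.25, 2.5, 0.25])
```

Picking the minimum-stall solution among these weighted candidates gives one whose stall time is at most the weighted average, hence at most D times the fractional optimum.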

We would like to select the best among the s integral solutions. The number of solutions we construct is bounded by the total number of fetches/evictions of blocks over all the intervals in the fractional solution. This number is bounded by O(Dn² min{F, n}).

We can therefore conclude with the following theorem:

THEOREM 4.2. There exists a polynomial-time algorithm for the prefetching/caching problem on D parallel disks that produces a solution with at most D times the optimum stall time, using at most D − 1 extra memory locations.

Observe that, for D = 1, we get a solution with minimum stall time without using any extra memory location. In Appendix B, we show that if no extra memory locations are used, then the integrality gap of our linear program can be arbitrarily large.

5. Conclusions

In this paper, we presented a polynomial time algorithm for optimal prefetching/caching on a single disk. For the parallel disk problem, we developed a D-approximation algorithm that is allowed to use D - 1 extra memory locations in cache.

We can remove the additional memory locations at the expense of increasing the stall time. The integral solution constructed in Section 4.3 works on a cache of size k. Consider one of the D − 1 prefetch operations that do not have an eviction assigned. In this operation, we now evict the block a in cache whose next reference is furthest in the future. If a is evicted in some other interval I before the next reference to a, then we cancel the eviction there; otherwise, we introduce an interval I right before the reference to a and fetch a. In either case, the block to be evicted in I is determined in the same way as before. We repeat this process until the end of the request sequence is reached. In the same way, we process the other D − 2 prefetch operations that do not have an eviction assigned. This gives us a schedule in which every prefetch operation has an eviction assigned. The extra stall time introduced is at most (D − 1)(F/k)n. The total elapsed time is bounded by (1 + (D − 1)(F/k))n + DS, where n is the length of the request sequence and S is the minimum stall time. The approximation of the elapsed time so obtained improves over the factor (1 + D(F/k)) of the algorithm by Kimbrel and Karlin if (F/k) ≥ 1.

An interesting open problem is to find a combinatorial, polynomial-time algorithm for minimizing stall time on a single disk. A challenging open problem is to find a constant approximation algorithm for the parallel disk problem or decide if the problem can be solved in polynomial time.

Appendix A

The objective function used in the linear program for the multiple disk problem in Section 4 simply adds up the processor stall times of the individual prefetch intervals, ignoring the fact that a processor stall time has to be counted only once if several prefetches are performed in parallel. A natural question is whether a more accurate objective function could lead to a better approximation. We first show that a very natural linear program formulation of the multiple disk problem has an integrality gap of D and hence would not yield a better solution.

The integral solution constructed in Section 4.3 uses D - 1 extra memory blocks in cache. One might ask whether this is necessary. In the second part of this appendix, we show that the integrality gap of the linear program is unbounded if no extra blocks are allowed.

An alternative LP formulation for minimizing stall time in the multi-disk setting is as follows: We have stall variables s_i indicating the duration of the stall after the (i − 1)-st request and before the i-th request is served. Thus, the objective is now to minimize ∑_{i=1}^{n} s_i. Once again, we have a variable x(I^d) associated with the copy of interval I on disk d, where I is of length at most F. We also have fetch and evict variables associated with every 3-tuple (block, interval, disk), as before. All constraints from the earlier LP still apply. However, we now need additional constraints to ensure that for every interval I that is chosen, the sum of the stall times before the requests in this interval is at least F − |I|. Then, for every disk d and interval I = (p, q), we have

∑_{i=p+1}^{q} s_i ≥ (F − |I|) · x(I^d).
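A small helper can generate one such per-interval stall constraint as a pair (indices of the s-variables involved, required lower bound); the indexing convention and the scaling by x(I^d) reflect my reading of the text and are assumptions:

```python
# Sketch: for disk d and interval I = (p, q), the stall variables
# s_{p+1}, ..., s_q must sum to at least (F - |I|) * x(I^d),
# covering the part of the fetch not hidden behind requests.
def stall_constraint(p, q, F, x_Id):
    """Returns (indices of s-variables, required lower bound)."""
    length = q - p                  # |I|
    return list(range(p + 1, q + 1)), (F - length) * x_Id
```

For an interval chosen integrally (x = 1), the bound is exactly F − |I|, the unhidden fetch time.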

The linear relaxation to the integer program is obtained by replacing the integrality constraint on x(I^d) by the linear constraint 0 ≤ x(I^d) ≤ 1. Using this relaxation, we cannot approximate stall time better than within a factor of D. Consider a cache of size D + 1, with blocks a_1, c_1, ..., c_D initially in cache. Block a_1 is stored on disk 1 and block c_i, for 1 ≤ i ≤ D, is stored on disk i. The request sequence to be served is (a_1)^F, b_1, c_1, ..., c_D, where block b_1 is stored on disk 1. Here (a_1)^F represents F requests to a_1.

An optimum fractional solution for serving this sequence prefetches b_1 during the F requests to a_1 and evicts every block c_i, 1 ≤ i ≤ D, to an extent of 1/D. Starting with the request to b_1, the D disks simultaneously fetch the missing portions of c_1, ..., c_D. Before the request to c_1, a stall of F − 1 time units has to be introduced. However, since each disk only prefetches a block to an extent of 1/D, the objective function value is (1/D)(F − 1).

An optimum integral solution, when prefetching b_1, evicts block c_D. On the request to b_1, disk D starts prefetching c_D while the other disks are idle. Before the request to c_D, a stall of F − D time units has to be inserted. This gives a performance ratio of (F − D)/((1/D)(F − 1)) = D(1 − (D − 1)/(F − 1)), which can be arbitrarily close to D.
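The ratio can be checked numerically; the sketch below evaluates the fractional objective (F − 1)/D against the integral stall F − D for this family of instances:

```python
# Numeric check of the gap instance: the ratio of integral stall
# F - D to fractional objective (F - 1)/D equals
# D * (1 - (D - 1)/(F - 1)), approaching D as F grows.
def gap_ratio(D, F):
    fractional = (F - 1) / D   # LP objective of the fractional schedule
    integral = F - D           # stall of the best integral schedule
    return integral / fractional

r = gap_ratio(D=4, F=1000)     # close to 4
```

As F → ∞ with D fixed, the ratio tends to D, matching the claimed integrality gap.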

Appendix B

Consider the integral solution constructed in Section 4.3. We show that if no extra memory blocks are allowed, the integrality gap of our linear program can be arbitrarily large. This holds even for problems on two disks. We give a request sequence Σ such that (a) there exists a fractional solution with zero stall time and (b) there exists no integral solution with zero stall time.

The request sequence Σ is composed of three subsequences Σ_1, Σ_2, and Σ_3. We first give zero stall time solutions for Σ_12 = Σ_1 Σ_2 and Σ_23 = Σ_2 Σ_3, and then show that there is no integral solution for Σ = Σ_1 Σ_2 Σ_3 that has zero stall time.

Consider a system with two disks. We need 12 blocks a_i, b_i, c_i, and d_i, 1 ≤ i ≤ 3, where a_i and c_i are stored on disk 1 and b_i and d_i are stored on disk 2, 1 ≤ i ≤ 3. Let

[expressions defining Σ_1, Σ_2, Σ_3 not reproducible in source]

We assume a cache of size 10, where initially all but blocks [a.sub.3] and [b.sub.3] are stored in cache. The prefetch time is F = 8.

Figure B1 shows zero stall time schedules for the sequence Σ_12 = Σ_1 Σ_2. The intervals above the request sequence represent an optimum fractional solution, where each interval I has an associated value x(I) = 1/2. The intervals below the request sequence represent the integral solution in which fetches on disk 1 are completed as early as possible. An earlier completion time on disk 1 could only be achieved if, in the first prefetch operations, disk 1 evicts b_1 and disk 2 evicts b_2. However, this leads to a schedule with nonzero stall time because disk 2 cannot simultaneously prefetch b_1 and b_2. Note that at the end of the schedules, blocks c_3 and d_3 are not in cache.

[Figure B1 ILLUSTRATION OMITTED]

Figure B2 shows solutions for the request sequence Σ_23 = Σ_2 Σ_3, given an initial cache in which blocks c_3 and d_3 are missing. The integral solution given below the request sequence is the only integral solution with zero stall time. In an integral solution, disk 1 must evict d_1 in the first prefetch operation. It is impossible to evict c_1 because c_1 cannot be fetched back in time. Given that disk 1 evicts d_1, disk 2 must evict c_1 in its first prefetch operation; otherwise, d_1 cannot be fetched back in time. This requires that the prefetch on disk 1 start at request d_2.

For the sequence Σ = Σ_1 Σ_2 Σ_3, the fractional solutions in Figures B1 and B2 can be combined to give an optimum fractional solution for Σ. However, there is no integral solution with zero stall time. To serve Σ_1 Σ_2, disk 1 must prefetch a_2 while serving request d_2 in Σ_2. To serve Σ_2 Σ_3, disk 2 must prefetch c_1 while serving that same request.

[Figure B2 ILLUSTRATION OMITTED]

ACKNOWLEDGMENTS. We thank two anonymous referees for their helpful comments improving the presentation of the paper.

REFERENCES

BELADY, L. A. 1966. A study of replacement algorithms for virtual storage computers. IBM Syst. J. 5, 78-101.

BORODIN, A., IRANI, S., RAGHAVAN, P., AND SCHIEBER, B. 1995. Competitive paging with locality of reference. J. Comput. Syst. Sci. 50, 244-258.

CAO, P., FELTEN, E. W., KARLIN, A. R., AND LI, K. 1995. A study of integrated prefetching and caching strategies. In Proceedings of the Joint ACM International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS '95/PERFORMANCE '95) (Ottawa, Ont., Canada, May 15-19). ACM, New York, 188-196.

CAO, P., FELTEN, E. W., KARLIN, A. R., AND LI, K. 1996. Implementation and performance of integrated application-controlled caching, Prefetching and disk scheduling. ACM Trans. Comput. Syst. (TOCS) 14, 4 (Nov.), 311-343.

CUREWITZ, K. M., KRISHNAN, P., AND VITTER, J. S. 1993. Practical prefetching via data compression. In Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data (Washington, D.C., May 26-28). ACM, New York, pp. 257-266.

ELLIS, C. S., AND KOTZ, D. 1989. Prefetching in file systems for MIMD multiprocessors. In Proceedings of the 1989 International Conference on Parallel Processing, pp. I306-314.

FIAT, A., KARP, R. M., MCGEOCH, L. A., SLEATOR, D. D., AND YOUNG, N.E. 1991. Competitive paging algorithms. J. Algorithms 12, 685-699.

FIAT, A., AND KARLIN, A. R. 1995. Randomized and multiprocessor paging with locality of reference. In Proceedings of the 27th Annual ACM Symposium on Theory of Computing (Las Vegas, Nev., May 29-June 1), ACM, New York, pp. 626-634.

FIAT, A., AND ROSEN, Z. 1997. Experimental studies of access graph based heuristics. Beating the LRU standard. In Proceedings of the 8th ACM-SIAM Symposium on Discrete Algorithms, ACM, New York, pp. 63-72.

FUCHS, D. R., AND KNUTH, D. E. 1985. Optimal prepaging and font caching. ACM Trans. Prog. Lang. Syst. 7, 62-79.

KARLIN, A. R., PHILLIPS, S., AND RAGHAVAN, P. 1992. Markov paging. In Proceedings of the 33rd Annual Symposium on Foundations of Computer Science. IEEE Computer Society Press, Los Alamitos, Calif., pp. 208-217.

KIMBREL, T. 1997. Parallel prefetching and caching (Ph.D. dissertation). Tech. rep. 97-07-03. Dept. Comput. Sci. Eng., Univ. Washington.

KIMBREL, T., TOMKINS, A., PATTERSON, R. H., BERSHAD, B., CAO, P., FELTEN, E. W., GIBSON, G. A., KARLIN, A. R., AND LI, K. 1996. A Trace-Driven Comparison of Algorithms for Parallel Prefetching and Caching. In Proceedings of the ACM SIGOPS/USENIX Association Symposium on Operating System Design and Implementation (OSDI) ACM, New York, 19-34.

KIMBREL, T., AND KARLIN, A. R. 1996. Near-optimal parallel prefetching and caching. In Proceedings of the 37th IEEE Annual Symposium on Foundations of Computer Science. IEEE Computer Science Press, Los Alamitos, Calif., pp. 540-549.

KOTZ, D., AND ELLIS, C.S. 1993. Practical prefetching techniques for multiprocessor file systems. J. Dist. Paral. Datab. 1, 33-51.

KRISHNAN, P., AND VITTER, J. S. 1994. Optimal prediction for prefetching in the worst case. In Proceedings of the 5th ACM-SIAM Symposium on Discrete Algorithms (Arlington, Va., Jan. 23-25). ACM, New York, pp. 392-401.

LEE, K.-K., AND VARMAN, P. 1995. Prefetching and I/O parallelism in multiple disk systems. In Proceedings of the 1995 International Conference on Parallel Processing. CRC Press, Boca Raton, Fla., pp. III160-163.

PALMER, M., AND ZDONIK, S.B. 1991. Fido: A cache that learns to fetch. In Proceedings of the 17th International Conference on Very Large Data Bases. Morgan-Kaufmann, San Francisco, Calif., pp. 255-264.

PATTERSON, R. H., GIBSON, G. A., GINTING, E., STODOLSKY, D., AND ZELENKA, J. 1995. Informed prefetching and caching. In Proceedings of the 15th Symposium on Operating Systems Principles. (Copper Mountain Resort, Colo., Dec. 3-6). ACM, New York, pp. 79-95.

SCHRIJVER, A. 1986. Linear and Integer Programming. Wiley, Chichester, England.

SLEATOR, D. D., AND TARJAN, R. E. 1985. Amortized efficiency of list update and paging rules. Commun. ACM 28, 2 (Feb.), 202-208.

VITTER, J., AND KRISHNAN, P. 1991. Optimal prefetching via data compression. In Proceedings of the 32nd Annual Symposium on Foundations of Computer Science. IEEE Computer Society Press, Los Alamitos, Calif., pp. 121-130.

RECEIVED MAY 1998; REVISED MARCH 2000; ACCEPTED MARCH 2000

A preliminary version of this paper appeared in the Proceedings of the 30th Annual ACM Symposium on Theory of Computing (Dallas, Tex., May 23-26). ACM, New York, 1998, pp. 454-462.

This work was done while S. Albers was at the Max-Planck-Institut fur Informatik, Saarbrucken, Germany.

The work of S. Leonardi was partly supported by EU ESPRIT Long-Term Research Project ALCOM-IT under contract no. 20244, and by Italian Ministry of Science Research Project 40% "Algoritmi, Modelli di Calcolo e Strutture Informative."

This work was done while S. Leonardi was visiting the Max-Planck-Institut fur Informatik, Saarbrucken, Germany.

Authors' addresses: S. Albers, Lehrstuhl Informatik II, Universitat Dortmund, 44221 Dortmund, Germany, e-mail: albers@ls2.cs.uni-dortmund.de; N. Garg, Department of Computer Science & Engineering, Indian Institute of Technology, New Delhi 110016, India, e-mail: naveen@iitd.ernet.in; S. Leonardi, Dipartimento di Informatica e Sistemistica, Universita di Roma "La Sapienza," via Salaria 113, 00198-Roma, Italia, e-mail: leon@dis.uniroma1.it.

Publication: Journal of the Association for Computing Machinery, November 1, 2000.