Stationary Distribution of a Generalized LRU-MRU Content Cache
George Kesidis∗
School of EECS, Pennsylvania State University
University Park, PA, 16802, USA
Email: [email protected]

December 19, 2018
Abstract
Many different caching mechanisms have been previously proposed, exploring different insertion and eviction policies and their performance individually and as part of caching networks. We obtain a novel closed-form stationary invariant distribution for a generalization of Least Recently Used (LRU) and Most Recently Used (MRU) eviction for single caching nodes under a reference Markov model. Numerical comparisons are made with "Incremental Rank Progress" (IRP, a.k.a. CLIMB) and Random Eviction (RE, a.k.a. random replacement, RANDOM) methods under a steady-state Zipf popularity distribution. The range of cache hit probabilities is smaller under MRU and larger under IRP compared to LRU. We conclude with the invariant distribution for a special case of a RE caching tree-network.

∗ This research supported in part by a Cisco Systems URP gift and NSF CNS grant 1526133.
Caching is a ubiquitous mechanism in communication and computer systems. The role of a content caching network is to reduce the load on the origin servers of requested data objects, to reduce the required network bandwidth to transmit content (that is, content that is not encrypted for particular end-users), and to reduce the response times to queries. Caching in computational settings reduces delays associated with disk IO (page caches). Data actively being, or likely soon to be, accessed by a CPU is stored in lower-level caches, i.e., memories closer (with less access time) to the CPU.

The invariant distribution of the widely deployed Least Recently Used (LRU) eviction mechanism for a caching node was found in [1]. LRU has a lower average miss rate compared to FIFO caching [2, 3, 4] (under FIFO caching, the oldest item in the cache is evicted upon a cache miss). Numerically useful approximations for LRU caching nodes are found in [5, 6, 7, 8, 9]; in particular, the expected working-set miss ratio (WS) approximation of [5] and that of [8] are equivalent [10]. In [11], LRU caching was studied for dependent (semi-Markov) object demand processes in a limiting regime for certain object popularity profiles. In [12, 13], time-to-live (TTL) caching networks are studied. Approximations for networks of "capacity-driven" caches include, e.g., [14, 15].

Under Most Recently Used (MRU) eviction, the youngest object in the cache is evicted upon cache miss. More specifically, an object is evicted under MRU when it is the subject of a cache hit or miss (so becomes youngest) and then a cache miss (query for an uncached object) immediately follows. MRU is used in cases where the older the object is in the cache, the more likely it is to be accessed [16]. That is, MRU is used when demand for hot (most popular) objects is such that they are not likely to be needed again soon after they are queried for, e.g., the inter-query times of hot objects are a.s. lower bounded by a strictly positive amount (cf. Section 5).

In this paper, we focus on single caching nodes and present a closed-form invariant distribution for a standard Markov model of a generalization of LRU and MRU eviction under the IRM. To this end, we provide a proof for LRU which we will subsequently adapt. For a Zipf popularity distribution, numerical comparisons are made with the simple Incremental Rank Progress (IRP) method (called CLIMB in [3]; IRP is somewhat related to the insertion scheme based on tandem virtual caches of "k-LRU" [15]) and the Random Eviction (RE, a.k.a. random replacement, RANDOM [3]) method. Our numerical examples focus on the range of cache-hit probabilities for steady-state Zipf popularity distributions. We numerically show that the range of cache hit probabilities is smaller under MRU and larger under IRP compared to LRU, and conjecture that this is true in general. We next give a result for a special case of an RE caching tree-network. The paper concludes with a summary.
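As a hypothetical illustration of the LRU and MRU update rules just described (not code from the paper), the following sketch encodes a cache as a Python list ordered by recency, rank 1 first:

```python
def lru_update(cache, obj, capacity):
    """LRU: a hit or miss makes obj most recent (rank 1); on a miss with a
    full cache, the least recently used object (last position) is evicted."""
    if obj in cache:
        cache.remove(obj)      # hit: promote to rank 1
    elif len(cache) >= capacity:
        cache.pop()            # miss: evict the oldest object (rank B)
    cache.insert(0, obj)

def mru_update(cache, obj, capacity):
    """MRU: a hit promotes obj to rank 1; on a miss with a full cache, the
    most recently used object (rank 1) is evicted before obj is inserted."""
    if obj in cache:
        cache.remove(obj)      # hit: promote to rank 1
    elif len(cache) >= capacity:
        cache.pop(0)           # miss: evict the youngest object (rank 1)
    cache.insert(0, obj)

# Querying objects 1,2,3,4 on an initially empty cache of capacity 3:
lru_cache, mru_cache = [], []
for obj in (1, 2, 3, 4):
    lru_update(lru_cache, obj, 3)
    mru_update(mru_cache, obj, 3)
# lru_cache == [4, 3, 2] (object 1 evicted); mru_cache == [4, 2, 1] (3 evicted)
```

The example query stream shows the difference: on the miss for object 4, LRU evicts the oldest object (1) while MRU evicts the youngest (3).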
The stationary state-space R of an LRU cache is the set of B-permutations of {1, 2, ..., N}, where N is the number of objects that could be cached, B is the capacity of the cache with N > B ≥ 1 (in practice, N ≫ B), and the objects are assumed identically sized (but cf. (8)). For r ∈ R, define r(k) as the element of r in the kth position. The entries of r are ranked in order of their position in r:

• the most recently accessed (LRU) object being r(1),
• the oldest object in the cache being r(B), and
• uncached objects n are denoted n ∉ r.

Note that in a transient regime, the cache may be in a state ∉ R with fewer than B objects cached.

For a single node, we assume that the demand process for object n ∈ {1, 2, ..., N} is Poisson with intensity λ_n. The Poisson demands are assumed independent. Let the total demand intensity be Λ = Σ_{n=1}^{N} λ_n. So, this is the classical "Independent Reference Model" (IRM) with query probabilities p_n = λ_n/Λ [3, 4].

For LRU, a cache miss of object r(1) at state M^-_n(r) resulting in a transition to state r ∈ R occurs at rate λ_{r(1)}, where n ∉ r and

(M^-_n(r))(k) = n if k = B;  r(k+1) if k < B,

i.e., n ∉ r is the oldest object in the cache in state M^-_n(r).

For LRU, a cache hit of object r(1) at state H^-_k(r) resulting in a transition to state r occurs at rate λ_{r(1)}, where 1 ≤ k ≤ B and

(H^-_k(r))(ℓ) = r(1) if ℓ = k;  r(ℓ+1) if ℓ < k;  r(ℓ) if k < ℓ ≤ B,

i.e., r(1) is the kth youngest object in the cache in state H^-_k(r), and H^-_1(r) = r.

As commonly assumed with the IRM [14], we also assume (i) that cache misses cause the query to be forwarded, possibly to a server holding the requested object, and, once resolved, the object is reverse-path forwarded so that caches that missed it can be updated; and (ii) the required time for this query resolution process is negligible compared to the inter-querying times of the caching network. The following invariant of LRU was found by W.F. King in [1].
Theorem 2.1
The unique invariant distribution of the LRU Markov chain is

π(r) = ∏_{k=1}^{B} λ_{r(k)} / (Λ − Σ_{i=1}^{k−1} λ_{r(i)})    (1)

for r ∈ R, where, ∀k, Σ_{i=k}^{k−1}(...) ≡ 0 [1].

proof  The full balance equations are: ∀r ∈ R,

(Λ − λ_{r(1)}) π(r) = Σ_{n∉r} λ_{r(1)} π(M^-_n(r)) + Σ_{j=2}^{B} λ_{r(1)} π(H^-_j(r)).    (2)

Under (1), for all n ∉ r,

π(M^-_n(r)) = [λ_n / (Λ − Σ_{i=2}^{B} λ_{r(i)})] ∏_{k=2}^{B} λ_{r(k)} / (Λ − Σ_{i=2}^{k−1} λ_{r(i)}).

Also under (1), for all j ∈ {2, 3, ..., B},

π(H^-_j(r)) = [∏_{k=2}^{j} λ_{r(k)} / (Λ − Σ_{i=2}^{k−1} λ_{r(i)})] · [λ_{r(1)} / (Λ − Σ_{i=2}^{j} λ_{r(i)})] · ∏_{k=j+1}^{B} λ_{r(k)} / (Λ − Σ_{i=1}^{k−1} λ_{r(i)}).

Substituting into (2), and after some term cancellation, we see that (1) satisfies (2) if and only if

1 = ∏_{k=3}^{B+1} (Λ − Σ_{i=1}^{k−1} λ_{r(i)}) / (Λ − Σ_{i=2}^{k−1} λ_{r(i)})
  + Σ_{j=2}^{B} [∏_{k=3}^{j} (Λ − Σ_{i=1}^{k−1} λ_{r(i)}) / (Λ − Σ_{i=2}^{k−1} λ_{r(i)})] · λ_{r(1)} / (Λ − Σ_{i=2}^{j} λ_{r(i)}),    (3)

where ∏_{k=3}^{2}(...) ≡ 1. Now sequentially, according to the distribution (1), object r(1) attempts to enter the cache after r(2). If it fails to enter in the kth attempt, then r(k+2) is placed in the cache instead and r(1) tries again. The summand of (3) with j = 2 is the probability that r(1) enters in the second position right after r(2): λ_{r(1)}/(Λ − λ_{r(2)}). Generally, the summand for j ∈ {2, 3, ..., B} is the probability that r(1) enters in the jth position (after having failed to enter in one of the more highly ranked ones). The first term of the right-hand side of (3) is the probability that r(1) fails to enter the cache. So, (3) must generally hold by the law of total probability.

Finally, since the stationary LRU Markov chain is irreducible on R, there is a unique invariant.

This result was generalized in [17] to add object-dependent insertion probabilities interpreted as access costs.
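For small N and B, Theorem 2.1 can be checked by brute force: enumerate the B-permutations, evaluate (1), and verify the full balance equations (2) directly. The following sketch (with hypothetical query rates) does so in Python:

```python
from itertools import permutations
from math import isclose

def lru_pi(r, lam, Lam):
    """King's LRU stationary probability of state r (rank 1 first), eq. (1)."""
    p, partial = 1.0, 0.0
    for obj in r:
        p *= lam[obj] / (Lam - partial)
        partial += lam[obj]
    return p

lam = {1: 1.0, 2: 0.5, 3: 0.25, 4: 0.125}   # hypothetical query rates
B = 2
Lam = sum(lam.values())
states = list(permutations(lam, B))          # the B-permutations R
pi = {r: lru_pi(r, lam, Lam) for r in states}
assert isclose(sum(pi.values()), 1.0)        # (1) is a distribution on R

# Full balance (2): inflow into r comes from hit states H_j^-(r), j = 2..B,
# and miss states M_n^-(r), n not in r, all at rate lam[r(1)].
for r in states:
    inflow = 0.0
    for j in range(2, B + 1):                # hit: r(1) was at rank j
        inflow += lam[r[0]] * pi[r[1:j] + (r[0],) + r[j:]]
    for n in lam:                            # miss: n was at rank B, evicted
        if n not in r:
            inflow += lam[r[0]] * pi[r[1:] + (n,)]
    assert isclose(inflow, (Lam - lam[r[0]]) * pi[r])
```

Increasing B (e.g., B = 3) leaves the balance check intact; the enumeration grows as N!/(N−B)!, which is why the numerical study later in the paper is restricted to small N and B.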
Also note that, generally, the LRU Markov chain is neither time-reversible nor quasi-reversible [18]. Obviously, more popular objects (larger λ) are more likely stored, and the LRU invariant is uniform in the special case that all the mean querying rates λ_n are the same. Finally, by PASTA, the stationary hit probability of object n in an LRU cache is

h_n = Σ_{r: n∈r} π(r),

where the approximations of hit probabilities in [8, 9] are obviously substantially simpler to compute.

Under LRU, a query for any object n results in it being ranked first in the cache. One can also consider slowing the "progress through the ranks" of objects as they are queried, leading to some obvious trade-offs with LRU: slowing progress would mean less popular content does not enter the cache at first rank, but also more popular content will take longer to reach the cache. Such issues are important when there are dynamic changes/churn in the objects cached and their popularity.

Under an Incremental Rank Progress (IRP) caching mechanism, a query for object n results in its rank improving by just one (or zero if the object is already ranked first), i.e., for 1 ≤ k ≤ B−1 and r ∈ R,

(T_k(r))(ℓ) = r(k) if ℓ = k+1;  r(k+1) if ℓ = k;  r(ℓ) else,

where the transition T_k(r) → r occurs with rate λ_{r(k)}. Missed objects enter the cache at lowest rank, i.e., for n ∉ r, define

(S_n(r))(ℓ) = r(ℓ) if ℓ < B;  n if ℓ = B,

where the transition S_n(r) → r occurs with rate λ_{r(B)}. The invariant for IRP is found in [3] and can be immediately shown using detailed balance.

Theorem 2.2
IRP is time-reversible with unique stationary invariant

π(r) = ∏_{k=1}^{B} λ_{r(k)}^{B+1−k} / Σ_{r'∈R} ∏_{k=1}^{B} λ_{r'(k)}^{B+1−k}.    (4)

For Random Eviction (RE), suppose that a cache miss of object n at state M^-_{ℓ,n}(r) results in a transition to state r ∈ R at rate λ_n/B, where n ∈ r, n ∉ M^-_{ℓ,n}(r), ℓ ∈ M^-_{ℓ,n}(r), and ℓ ∉ r. That is, a cache miss for object n results in n being inserted into the cache and the eviction of an object ℓ selected uniformly at random from the cache. The cache state r does not change if a cache hit occurs. The stationary state-space R is the set of B-combinations of the N different objects. The following invariant for RE is also found in [3] and can also be immediately shown by detailed balance.

Theorem 2.3
The RE Markov chain is time-reversible with unique stationary invariant distribution

π(r) = ∏_{n∈r} λ_n / Σ_{r'∈R} ∏_{n∈r'} λ_n.    (5)

Define the aggregate hit rate for a caching discipline as

H := Σ_{n=1}^{N} h_n p_n = Σ_{n=1}^{N} h_n λ_n / Λ,    (6)

i.e., the probability that a query is a cache hit. This is a single criterion that can be used to compare different caching disciplines. Typically H is largest for LRU eviction under the IRM. Note that under the IRM, by PASTA and Fubini's theorem, the following holds for all of the above capacity-driven caching disciplines:

Σ_{n=1}^{N} h_n = Σ_{n=1}^{N} Σ_{r∈R: n∈r} π(r) = Σ_{r∈R} π(r) B = B.    (7)

To account for objects of different lengths in capacity-driven caches (with ranked objects) like LRU, simply consider a "complete-rankings" LRU variation, where the ranking of all objects is maintained whether the objects are cached or not. That is, the state-space R is now the set of permutations of all N objects.

Corollary 2.1  The unique stationary invariant π of complete-rankings LRU is (1) with B replaced by N.

Additionally, consider the different sizes ℓ_n bytes of objects n, where the cache capacity B is in bytes. The number of objects in the cache is given by

K(r) = max{ K | Σ_{k=1}^{K} ℓ_{r(k)} ≤ B, 0 ≤ K ≤ N }.

So, the hit probability of object n when the objects are of variable length is

h_n = Σ_{r: r^{-1}(n) ≤ K(r)} π(r),    (8)

where r^{-1}(n) denotes the rank of object n in r. See the byte-hit performance metric of [19].

Again define the state-space R as the set of B-permutations of {1, 2, ..., N}. Under MRU [16, 19], a cache hit of object r(1) at state H^-_k(r) resulting in a transition to state r occurs at rate λ_{r(1)}, where 1 ≤ k ≤ B and (H^-_k(r))(ℓ) is as defined for LRU above.
But for MRU, a cache miss of object r(1) at state M^-_n(r) resulting in a transition to state r ∈ R occurs at rate λ_{r(1)}, where n ∉ r and

(M^-_n(r))(k) = n if k = 1;  r(k) if k > 1,

i.e., n ∉ r is the youngest object in the cache in state M^-_n(r).

Theorem 3.1
The unique invariant distribution of the MRU Markov chain is, for r ∈ R,

π(r) = [λ_{r(1)} / Λ] · [1 / binom(N−1, B−1)] · ∏_{k=2}^{B−1} λ_{r(k)} / (Λ − Σ_{i=1}^{k−1} λ_{r(i)} − Σ_{n∉r} λ_n).    (9)

proof  The full balance equations are as for LRU but with the different definition of M^-_n just given. Let Λ^-_r = Λ − Σ_{n∉r} λ_n. By substituting (9) into the full balance equations (and moving the cache-miss terms to the left-hand side), we get that (9) satisfies the full balance equations if and only if

[1/(Λ^-_r − λ_{r(1)})] · ( λ_{r(1)} Σ_{j=2}^{B−1} ∏_{k=2}^{j} (Λ^-_r − Σ_{i=1}^{k−1} λ_{r(i)})/(Λ^-_r − Σ_{i=2}^{k} λ_{r(i)})
  + λ_{r(B)} ∏_{k=2}^{B−1} (Λ^-_r − Σ_{i=1}^{k−1} λ_{r(i)})/(Λ^-_r − Σ_{i=2}^{k} λ_{r(i)}) )
= Σ_{j=2}^{B−1} ( ∏_{k=2}^{j−1} (Λ^-_r − Σ_{i=1}^{k} λ_{r(i)})/(Λ^-_r − Σ_{i=2}^{k} λ_{r(i)}) ) · λ_{r(1)}/(Λ^-_r − Σ_{i=2}^{j} λ_{r(i)})
  + ∏_{k=2}^{B−1} (Λ^-_r − Σ_{i=1}^{k} λ_{r(i)})/(Λ^-_r − Σ_{i=2}^{k} λ_{r(i)}),    (10)

where ∏_{k=2}^{1}(...) ≡ 1. The right-hand side of (10) is the total probability that r(1) is eventually chosen when filling the cache, given that only objects ∈ r will be chosen and that r(2) has already been chosen first: r(1) is chosen on the first try with probability λ_{r(1)}/(Λ^-_r − λ_{r(2)}), otherwise r(3) enters the cache; this is the summand of (10) with j = 2. Generally, the jth summand is the probability that r(1) enters the cache on the (j−1)th try, otherwise object r(j+1) is placed in the cache. The final term of (10) is the probability that r(1) fails to enter the cache before the last (Bth) position, because in the penultimate choice only objects r(B) and r(1) remain, i.e., λ_{r(B)} = Λ^-_r − Σ_{i=1}^{B−1} λ_{r(i)}. So, (10) must generally hold by the law of total probability.

Finally, since the stationary MRU Markov chain is irreducible on R, there is a unique invariant.

Note that it is easily directly verified that (9) satisfies the full balance equations for the cases B = 2 and B = 3, e.g., for B = 3 and N = 4,

π(r) = λ_{r(1)} λ_{r(2)} / (3Λ(λ_{r(2)} + λ_{r(3)})).

To interpret (9): r(1) is chosen with probability λ_{r(1)}/Λ; then the remaining B−1 cached objects of r are chosen from the remaining N−1 objects, a given subset having probability 1/binom(N−1, B−1); finally, the order of the remaining items r(2), r(3), ... is determined as in the LRU invariant distribution (1), restricted to the cached objects.

Finally, we make an observation about cache-hit probabilities under MRU eviction. Consider an MRU cache under the IRM that is "synchronized" so that a query for object n occurs at time 0. Thus, immediately thereafter, n is the MRU object in the cache. The next query for object n will be at time T_n ∼ exp(λ_n). Again, under MRU eviction, the only way an object n is evicted is when a cache miss occurs immediately after a query for n, i.e., a cache miss when n is the MRU object. So, the stationary hit probability h_n of object n equals the probability that a hit occurs at time T_n, which is

• the probability that no other queries occurred in the interval (0, T_n), plus
• the probability that a query does occur in (0, T_n) and the first such query is a hit.

Thus, we can write, ∀n,

h_n = E[ e^{−T_n Σ_{j≠n} λ_j} + (1 − e^{−T_n Σ_{j≠n} λ_j}) Σ_{j≠n} λ_j h_{j|n} / Σ_{i≠n} λ_i ],

where h_{j|n} is the probability that a query is a hit on j given that object n is MRU. We have therefore shown the following.

Proposition 3.1
For a MRU-eviction cache under the stationary IRM: ∀n,

h_n = p_n + Σ_{j≠n} p_j h_{j|n} = Σ_j p_j h_{j|n},

where p_j = λ_j / Σ_i λ_i and h_{j|j} = 1; equivalently, a kind of balance equation: ∀n,

Σ_j p_j h_{n|j} = Σ_j p_j h_{j|n}.

"kth Recently Used" (kRU) is a simple generalization of LRU and MRU wherein object r(k), for some fixed k ∈ {1, 2, ..., B}, is evicted upon cache miss; otherwise, cache insertion (at rank 1) upon misses, and promotion (to rank 1) and demotions (by 1) upon hits, are the same as in both MRU and LRU. That is, BRU is LRU and 1RU is MRU.
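The kRU dynamics are also easy to simulate under the IRM. The following hypothetical sketch (not from the paper) estimates the aggregate hit rate (6) by Monte Carlo; k = B gives LRU and k = 1 gives MRU:

```python
import random

def simulate_kru(k, lam, B, num_queries, seed=0):
    """Estimate the aggregate hit rate (6) of a kRU cache under the IRM.
    The cache list is ordered by rank (rank 1 first): a hit promotes the
    queried object to rank 1; a miss inserts it at rank 1 and, if the
    cache is full, evicts the object at rank k."""
    rng = random.Random(seed)
    objs, weights = list(lam), list(lam.values())
    cache, hits = [], 0
    for _ in range(num_queries):
        n = rng.choices(objs, weights)[0]   # IRM: i.i.d. queries
        if n in cache:
            hits += 1
            cache.remove(n)                 # hit: promote to rank 1
        elif len(cache) >= B:
            del cache[k - 1]                # miss on full cache: evict rank k
        cache.insert(0, n)
    return hits / num_queries

lam = {n: n ** -0.75 for n in range(1, 13)}  # Zipf(0.75) rates, N = 12
H_mru = simulate_kru(1, lam, 6, 200_000)     # 1RU = MRU
H_lru = simulate_kru(6, lam, 6, 200_000)     # BRU = LRU
```

With this Zipf profile and Poisson (IRM) queries, the LRU estimate exceeds the MRU estimate, consistent with the D = 0 column of Table 1 below.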
Corollary 4.1
The invariant distribution of kRU is

π(r) = [∏_{j=1}^{k} λ_{r(j)} / (Λ − Σ_{i=1}^{j−1} λ_{r(i)})]    (11)
  × [1 / binom(N−k, B−k)] · ∏_{j=k+1}^{B−1} λ_{r(j)} / (Λ − Σ_{i=1}^{j−1} λ_{r(i)} − Σ_{n∉r} λ_n).
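Corollary 4.1 can likewise be checked by exhaustive enumeration for small N and B: evaluate (11), confirm it normalizes, and confirm the full balance equations of the kRU chain state by state. A sketch with hypothetical rates (note that k = B recovers (1) and k = 1 recovers (9)):

```python
from itertools import permutations
from math import comb, isclose

def kru_pi(r, lam, k):
    """kRU stationary probability of state r (rank 1 first), eq. (11)."""
    Lam = sum(lam.values())
    N, B = len(lam), len(r)
    Lam_r = Lam - sum(v for n, v in lam.items() if n not in r)
    p, partial = 1.0, 0.0
    for j in range(k):                       # ranks 1..k
        p *= lam[r[j]] / (Lam - partial)
        partial += lam[r[j]]
    p /= comb(N - k, B - k)                  # uniform choice of the tail set
    for j in range(k, B - 1):                # ranks k+1..B-1
        p *= lam[r[j]] / (Lam_r - partial)
        partial += lam[r[j]]
    return p

lam = {1: 1.0, 2: 0.7, 3: 0.4, 4: 0.2}      # hypothetical rates, N = 4
Lam = sum(lam.values())
B = 3
states = list(permutations(lam, B))

for k in (1, 2, 3):                          # 1RU = MRU, ..., 3RU = LRU
    pi = {r: kru_pi(r, lam, k) for r in states}
    assert isclose(sum(pi.values()), 1.0)
    for r in states:                         # full balance at each state r
        inflow = 0.0
        for j in range(2, B + 1):            # hit: r(1) was at rank j
            inflow += lam[r[0]] * pi[r[1:j] + (r[0],) + r[j:]]
        for n in lam:                        # miss: n was at rank k, evicted
            if n not in r:
                inflow += lam[r[0]] * pi[r[1:k] + (n,) + r[k:]]
        assert isclose(inflow, (Lam - lam[r[0]]) * pi[r])
```

Here the pre-miss state for a miss on r(1) is (r(2), ..., r(k), n, r(k+1), ..., r(B)), since the evicted object n held rank k; at k = B and k = 1 this reduces to the LRU and MRU miss states, respectively.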
N, B
In this numerical study, we directly computed the invariants π by generating allpossible object permutations representing cache state by the Steinhaus-Johnson-Trotter algorithm. So, we considered only small values for the number of objectsand the cache size. Figure 1 is representative of our numerical study on cache-hitprobabilities using a Zipf popularity model λ n = n − α for with α = 0 .
75 (see Table1 of [20]) and most popular object indexed 1 with normalized rate λ = 1. k RU with 1 < k < B gives hit-probability performance between MRU ( k = 1)and LRU ( k = B ). That is, one can see that the range of hit probabilities forLRU is larger than that of MRU.Figure 1: k RU cache hit probabilities h n and popularity λ n versus object index n for a cache of size B = 6, N = 12 objects, and Zipf popularity parameter α = 0 . k RP with cacheentry at lowest rank B upon cache miss compared to LRU. Note that k RP hasgreater range of hit probability values than LRU. We postulate that generally forZipf popularity distributions, the range of hit probabilities of IRP is larger thanthose of LRU which is larger than those of MRU.Figure 2: k RP (with cache entry upon cache miss) and LRU cache hit probabilities h n and popularity λ n versus object index n for a cache of size B = 6, N = 12objects, and Zipf popularity parameter α = 0 . i.e. , that the sum of the stationary hit probabilities is thesame for all of these caching disciplines under the IRMThough our derivations herein are for the IRM, MRU may out-perform LRUfor non-Poisson arrivals in terms of aggregate hit rate (6). Recall mention in15igure 3: Cache hit probabilities h versus popularity λ for a cache of size B = 3, N = 12 objects, and Zipf popularity parameter α = 0 . D .Specifically, inter-query times equal D plus an exponentially distributed quantity,such that D = 0 corresponds to the IRM (here with intensities following a Zipfpopularity distribution). In Table 1, we see that LRU has best aggregate hit rateunder IRM (mean hit rate increases with k when D = 0), while MRU is best when D = 1 , k when D = 2).16 D = 0 D = 1 D = 21 (MRU) 0.52 0.4578 0.452 0.54 0.4213 0.403 0.56 0.4014 0.354 0.58 0.4026 0.315 0.60 0.4187 0.296 (LRU) 0.62 0.4423 0.29Table 1: k RU aggregate hit rate (6) for N = 12 objects, cache of capacity B = 6objects, and Zipf popularity distribution with exponent α = 0 .
75 ( D = 0 corre-sponds to the IRM). The performance of Markovian networks of such capacity-driven caches are approx-imated in e.g. , [14, 15]. To illustrate the difficulties with capacity-driven cachingnetworks, now consider the simplest ones based on RE. Though RE caches aretime-reversible, a tree of independent local caches whose collective query-missesare forwarded to an Internet cache (also running RE, see Figure 4), is not time-reversible and its non-local nodes do not operate under the IRM. To see why it’snot time-reversible, consider a cache miss of object n of local cache q of size B q instate r q , so that object n q is evicted, and suppose it’s also a miss on the Internetcache of size b in state R , so that object n is evicted; this can be reversed withone query (so that states r q and R are restored) only if n = n q .The following result is for the very special case that the Internet cache holds17igure 4: A tree-network of caching nodes that feeds forward cache misses withassumed independent local cachesonly one object. Proposition 6.1
The invariant distribution π of the network of Figure 4 with RE caching and b = 1 satisfies

π(R|r) = [Σ_q 1{R ∈ r_q} Λ_{q,r_q}/B_q] / Σ_q Λ_{q,r_q},    (12)

where Λ_{q,x} = Σ_{ℓ∉x} λ_{q,ℓ}, Σ_∅(...) ≡ 0, and the indicator 1{X} = 1 if X is true and = 0 otherwise.

proof  For n ∈ r_q and m ∉ r_q, let δ^q_{−n+m} r be r but with n in r_q replaced by m. Similarly define δ_{−n+ℓ} R. The full balance equations are

π(r, R) Σ_{q,m: m∉r_q} λ_{q,m} = Σ_{q,m,n: m∉r_q; n∈r_q∩R} π(δ^q_{−n+m} r, R) λ_{q,n}/B_q
  + Σ_{q,m,n,ℓ: m∉r_q; n∈r_q∩R; ℓ∉R} π(δ^q_{−n+m} r, δ_{−n+ℓ} R) λ_{q,n}/(B_q b).

Dividing by π(r, R) = π(R|r) ∏_q π(r_q) and then substituting the stationary joint distribution of the independent RE local caches (5) into the full balance equations gives: ∀r, R,

π(R|r) Σ_{q,m: m∉r_q} λ_{q,m} = Σ_{q,m: m∉r_q} (λ_{q,m}/B_q) Σ_{n∈r_q∩R} [ π(R | δ^q_{−n+m} r) + (1/b) Σ_{ℓ∉R} π(δ_{−n+ℓ} R | δ^q_{−n+m} r) ].

For the special case of b = 1, i.e., R is a single object, we get that the right-hand side simplifies to

Σ_{q,m: m∉r_q} (λ_{q,m}/B_q) 1{R ∈ r_q} [ π(R | δ^q_{−R+m} r) + Σ_{ℓ≠R} π(ℓ | δ^q_{−R+m} r) ]
= Σ_{q,m: m∉r_q} (λ_{q,m}/B_q) 1{R ∈ r_q}
= Σ_q (Λ_{q,r_q}/B_q) 1{R ∈ r_q}.

The invariant is unique since (R, r) is irreducible.

In steady state, R ⊂ ∪_q r_q a.s., i.e., if ∀q, R ∉ r_q, then π(R|r) = 0. Note that (12) is the eviction probability of object R upon local cache miss in local cache state r. An individual RE cache is not quasi-reversible since the total miss ("departure") rate in state r, π(r) Σ_{m∉r} λ_m, depends on the state r. Though quasi-reversibility is not a necessary condition [18], Proposition 6.1 shows that RE networks generally do not have product-form invariants. More specifically, one can identify the incident mean rate of queries for object n to the Internet cache,

λ̂_n := Σ_q λ_{q,n}(1 − h_{q,n}) = Σ_q λ_{q,n} Σ_{r_q: n∉r_q} π(r_q),

where 1 − h_{q,n} is the stationary miss probability of local cache q for object n under RE (in this way, one can easily identify the "flow-balance equations" for more general caching networks [14]). According to this proposition, π(R) does not depend on the λ̂_n in the way the IRM invariant π(r_q) depends on the λ_{q,n} in (5), i.e.,

π(R) = Σ_r π(R|r) π(r) = Σ_r π(R|r) ∏_q π(r_q) ≠ λ̂_R / Σ_n λ̂_n.

Finally, note that, since the capacity of the Internet cache is one object (b = 1), it could obviously be operating under any eviction policy.

In this paper, under the IRM, a closed-form expression for the invariant distribution was derived for a caching node using kRU eviction. Numerically, it was shown that, under the IRM and Zipf popularity distributions for the data objects, the range of cache-hit probabilities of the data objects under IRP caching is larger than under LRU, which is larger than under RE, which is larger than under MRU (also, a non-IRM example was given where MRU had a higher aggregate hit rate than LRU). Finally, the invariant distribution of a special case of a Markovian RE caching tree-network was also derived.

References

[1] W. King, "Analysis of paging algorithms," in Proc.
IFIP Congress, Ljubljana, Yugoslavia, Aug. 1971.

[2] L. Belady, R. Nelson, and G. Shedler, "An Anomaly in Space-time Characteristics of Certain Programs Running in a Paging Machine," Commun. ACM, vol. 12, no. 6, June 1969.

[3] O. Aven, E. Coffman, and Y. Kogan, Stochastic Analysis of Computer Storage. D. Reidel Publishing Co., 1987.

[4] J. V. D. Berg and A. Gandolfi, "LRU is better than FIFO under the independent reference model," J. Appl. Prob., vol. 29, 1992.

[5] P. Denning and S. Schwartz, "Properties of the working-set model," Commun. ACM, vol. 15, no. 3, pp. 191-198, March 1972.

[6] A. Dan and D. Towsley, "An approximate analysis of the LRU and FIFO buffer replacement schemes," SIGMETRICS Perform. Eval. Rev., vol. 18, pp. 143-152, April 1990.

[7] P. Jelenkovic, "Asymptotic approximation of the move-to-front search cost distribution and least-recently-used caching fault probabilities," Ann. Appl. Probab., vol. 9, no. 2, pp. 430-464, 1999.

[8] H. Che, Y. Tung, and Z. Wang, "Hierarchical Web Caching Systems: Modeling, Design and Experimental Results," IEEE JSAC, vol. 20, no. 7, Sept. 2002.

[9] C. Fricker, P. Robert, and J. Roberts, "A Versatile and Accurate Approximation for LRU Cache Performance," in Proc. International Teletraffic Congress, 2012.

[10] P. Jelenkovic, private communication, Dec. 2017.

[11] P. Jelenkovic and A. Radovanovic, "Least-recently-used caching with dependent requests," Theoretical Computer Science, vol. 326, pp. 293-327, Oct. 2004.

[12] D. Berger, S. Singla, P. Gland, and F. Ciucu, "Exact Analysis of TTL Cache Networks: The Case of Caching Policies Driven by Stopping Times," in Proc. ACM SIGMETRICS, Austin, Texas, June 2014.

[13] F. Cavallin, A. Marin, and S. Rossi, "A product-form model for the analysis of systems with aging objects," in Proc. IEEE MASCOTS, Atlanta, Sept. 2015.

[14] E. Rosensweig, J. Kurose, and D. Towsley, "Approximate models for general cache networks," in Proc. IEEE INFOCOM, March 2010.

[15] M. Garetto, E. Leonardi, and V. Martina, "A Unified Approach to the Performance Analysis of Caching Systems," ACM TOMPECS, vol. 1, no. 3, May 2016.

[16] S. Dar, M. Franklin, B. Jonsson, D. Srivastava, and M. Tan, "Semantic data caching and replacement," in Proc. Conf. on Very Large Databases (VLDB), 1996.

[17] D. Starobinski and D. Tse, "Probabilistic methods for web caching," Performance Evaluation, 2001.

[18] X. Chao, M. Miyazawa, R. Serfozo, and H. Takada, "Markov network processes with product form stationary distributions," Queueing Systems, vol. 28, pp. 377-401, 1998.

[19] A. Balamash and M. Krunz, "An overview of web caching replacement algorithms," IEEE Communications Surveys & Tutorials, vol. 6, no. 2, 2004.

[20] L. Breslau, P. Cao, L. Fan, G. Phillips, and S. Shenker, "Web Caching and Zipf-like Distributions: Evidence and Implications," in