The Study of Dynamic Caching via State Transition Field - the Case of Time-Varying Popularity

Abstract

In the second part of this two-part paper, we extend the study of dynamic caching via state transition field (STF) to the case of time-varying content popularity. The objective of this part is to investigate the impact of time-varying content popularity on the STF and how such impact accumulates to affect the performance of a replacement scheme. Unlike the case in the first part, the STF is no longer static over time, and we introduce the instantaneous STF to model it. Moreover, we demonstrate that many metrics, such as the instantaneous state caching probability and the average cache hit probability over an arbitrary sequence of requests, can be found using the instantaneous STF. As a steady state may not exist under time-varying content popularity, we characterize the performance of replacement schemes based on how the instantaneous STF of a replacement scheme after a content request affects its cache hit probability at the next request. This characterization reveals insights regarding the relations between the pattern of change in the content popularity, the knowledge of content popularity exploited by the replacement schemes, and the effectiveness of these schemes under time-varying popularity. In the simulations, different patterns of time-varying popularity, including the shot noise model, are tested. The effectiveness of example replacement schemes under time-varying popularity is demonstrated, and the numerical results support the observations from the analytic results.


arXiv preprint [cs.OS]

Jie Gao, Member, IEEE, Lian Zhao, Senior Member, IEEE, and Xuemin (Sherman) Shen, Fellow, IEEE


Index Terms: cache replacement policy, content popularity, shot noise model, temporal locality, online caching, mobile edge caching.

J. Gao and X. Shen are with the Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON, N2L 3G1, Canada (e-mail: {jie.gao, sshen}@uwaterloo.ca). L. Zhao is with the Department of Electrical, Computer, and Biomedical Engineering, Ryerson University, Toronto, ON, M5B 2K3, Canada (e-mail: [email protected]).

I. INTRODUCTION

Driven by the upsurge in the number of user devices and their demand for multimedia services, the role of caching in improving the content delivery performance of wireless networks has become prominent [1]-[3]. Accordingly, the modeling and analysis of caching have gained tremendous research attention [4]-[7]. While the independent reference model (IRM) is the de facto model for content requests, it has been argued that the IRM may not be sufficiently accurate in practice since the temporal correlation of content requests can be too important to neglect [8]. As a result, one particular topic, i.e., online caching with time-varying content popularity, has attracted great research interest lately [9], [10].

The above-mentioned temporal correlation of content requests is sometimes referred to as 'temporal locality', which suggests that a recently requested content is likely to be requested again in the near future. Temporal locality, however, has been shown to emerge from the temporal correlation of requests, the content popularity, or both [11]. Therefore, temporal locality exists even with the IRM, and time-varying content popularity complicates the locality by introducing temporal correlation. As a result, the study of online caching in the case of time-varying content popularity can be very challenging [9]. Existing research on caching with time-varying content popularity can be roughly categorized into two groups: the first group of works aims to analyze or model temporal locality, and the second group aims to propose caching solutions that cope with it.

Some early works on analyzing temporal locality focused on understanding its sources and developing metrics to measure it, e.g., [12]. In a recent work, Zhou et al. investigated the change of popularity over time in video-on-demand services [13]. While the above studies tend to be experiment-based, mathematical models for characterizing temporal locality can be found in a few works. An inter-reference gap model was developed in [14], which focused on describing temporal locality based on the gaps between successive requests. Traverso et al. proposed a shot noise model [15], which represents the requests for a content with an inhomogeneous Poisson process, and later applied it to the analysis of video-on-demand traffic [16]. Other approaches to integrating temporal locality into the analysis of caching also exist, most of which model the requests for each content as a (semi-)Markov-modulated process or a renewal process [17], [18].

By comparison, a larger number of works can be found in the second group, which proposes caching solutions to cope with temporal locality. Such solutions generally require the prediction of locality or the learning of content popularity. A cache replacement scheme based on predicting the interval between requests was proposed in [19] and shown to be effective in increasing cache hits. Li et al. developed a popularity-driven cache replacement scheme which learns the content popularity in an online fashion and makes replacement decisions based on the popularity forecast [20]. Zhang et al. proposed a model-free reinforcement learning algorithm for cache replacement based on a linear content popularity prediction model [21]. The above works can be labeled as online caching based on learning/prediction since decisions for cache update are made after every content request. Another type of solution is proactive caching based on prediction, which can handle time-varying content popularity assuming that cached contents are updated with a sufficiently high frequency. Sadeghi et al. exploited reinforcement learning to track content popularity in an online fashion and developed a Q-learning based algorithm for content placement [22]. Applegate et al. formulated content placement as an optimization problem and, through estimating content popularity, proposed strategies to update cache contents to track time-varying content popularity [23]. Bharath et al. characterized the performance of caching with non-stationary content popularity from a learning-theoretic perspective and proposed a cache update policy based on the estimation of content popularity [24].

Evidently, understanding the impact of time-varying content popularity on the performance of caching is important for the analysis and design of cache replacement schemes. However, analysis regarding the impact of time-varying popularity on the performance of replacement schemes is limited in the existing literature. The state transition field (STF) that we proposed in [25] can be used for such analysis. However, with time-varying content popularity, the STF is no longer a static field but a dynamically varying field, and, consequently, a steady state may not exist. The objective of the second part of this two-part paper is to investigate the impact of time-varying content popularity on the STF and, as a result, the performance of replacement schemes.

The contributions of the second part are as follows.

First, we extend the concept of STF from the first part of this two-part paper [25] and introduce the instantaneous STF to characterize replacement schemes in the case of time-varying content popularity. It is shown that many metrics, such as the instantaneous state caching probability (SCP) at an arbitrary instant and the average cache hit probability over an arbitrary sequence of requests, can be found based on the instantaneous STF. The results demonstrate the importance of the instantaneous STF in modeling and analyzing replacement schemes with time-varying content popularity.

Second, as steady states may not exist, we characterize the performance of a replacement scheme by analyzing the difference in instantaneous cache hit probability with and without applying that scheme after a content request. The result reveals insights regarding the relation between the change pattern in content popularity and the effectiveness of replacement schemes. We illustrate the results in the vector space of SCPs and relate them to the knowledge of content popularity exploited by replacement schemes.

Third, we demonstrate the instantaneous STF and the average cache hit ratio under time-varying popularity with extensive simulations using example schemes. For the instantaneous STF, we illustrate its relation with instantaneous content popularity and instantaneous cache hit probability. For the average cache hit ratio, we adopt different models of time-varying content popularity, including the shot noise model, and compare the performance of the example schemes. The results verify the observations from analysis and provide guidelines for designing replacement schemes under time-varying content popularity.

II. SYSTEM MODEL UNDER TIME-VARYING CONTENT POPULARITY

For the sake of presentation clarity, we reintroduce some formulations from the first part of this two-part paper in Sections II and III. The basic system model follows the basic model in the first part [25]. As the content popularity becomes time-varying, the symbols used here can be categorized into three groups based on their dependence on the time instant of content request or replacement:

G-1: independent in both [25] and this paper;
G-2: independent in [25] but dependent in this paper;
G-3: dependent in [25], with temporal locality introducing further dependence on the time instant in this paper.

A superscript $(\cdot)^{(n)}$ is added to symbols in groups G-2 and G-3 to denote the time instant related to the $n$th content request or replacement.

A. Request-independent Symbols

Cache State Vector/Matrix: the cache state vector $\mathbf{s}_k$ for state $k$ and the cache state matrix $\mathbf{C}_s = [\mathbf{s}_1, \dots, \mathbf{s}_{N_s}]$, where $N_s$ is the number of cache states.

Neighboring States: the set of neighbors $\mathcal{H}_k$ and the set of content-$l$ neighbors $\mathcal{H}_{k,l}$ of state $k$, for any $k$ and any $l \notin \mathcal{C}_k$, where $\mathcal{C}_k$ is the set of cached contents in state $k$.

The above symbols are in group G-1.
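The request-independent symbols can be made concrete with a small enumeration. The following is a minimal sketch rather than the authors' implementation: the function names are hypothetical, contents are indexed from 0, and it assumes (following the usual reading of the first part [25]) that each state caches exactly $L$ contents and that two states are neighbors when they differ in exactly one cached content.

```python
from itertools import combinations

def cache_states(num_contents, cache_size):
    """All cache states as sorted tuples of cached content indices (0-based)."""
    return list(combinations(range(num_contents), cache_size))

def state_matrix(states, num_contents):
    """C_s as a list of rows: entry [l][k] is 1 if content l is cached in state k."""
    return [[1 if l in state else 0 for state in states]
            for l in range(num_contents)]

def neighbors(states, k):
    """H_k (assumed definition): states differing from state k in exactly one content."""
    base = set(states[k])
    return [m for m, s in enumerate(states)
            if m != k and len(base & set(s)) == len(base) - 1]

def content_neighbors(states, k, l):
    """H_{k,l}: neighbors of state k that cache content l, for l not cached in k."""
    if l in states[k]:
        raise ValueError("content l must not be cached in state k")
    return [m for m in neighbors(states, k) if l in states[m]]

# Example: 4 contents and a cache of size 2 give 6 cache states.
states = cache_states(4, 2)
C_s = state_matrix(states, 4)
print(len(states), neighbors(states, 0), content_neighbors(states, 0, 2))
```

Under these assumptions every state has $L(N_c - L)$ neighbors and every set $\mathcal{H}_{k,l}$ has $L$ elements, since a newly requested content can replace any one of the $L$ cached contents.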

B. Request-dependent Symbols

Content Request Probabilities: the probability of content $l$ being requested at request instant $n$, denoted by $\upsilon_l^{(n)}$, and the overall content popularity at the $n$th content request, denoted by $\boldsymbol{\upsilon}^{(n)}$. The content request probabilities are in symbol group G-2.

Instantaneous Cache Hit Probability: the instantaneous cache hit probability at the $(n+1)$th request, denoted by $\gamma^{(n+1)}$, is given by:

$$\gamma^{(n+1)} = \left(\boldsymbol{\upsilon}^{(n+1)}\right)^T \boldsymbol{\lambda}^{(n)}, \tag{1}$$

where $(\cdot)^T$ represents transpose, and $\boldsymbol{\lambda}^{(n)}$ is the content caching probability (CCP) vector after the $n$th round of request and replacement. It can be seen that $\gamma^{(n+1)}$ is in symbol group G-3.

Note that, in a practical network, there can be different metrics for content delivery, e.g., latency. However, the cache hit probability is an underlying factor on which other metrics depend. Consequently, improving the cache hit probability can improve the performance under other metrics. For example, if the cache hit probability at an edge server increases, then the need for retrieving contents from the cloud, and thus the average content delivery latency, decreases. Therefore, our study centers on the cache hit probability.

State Transition Matrices: the conditional state transition matrix and the state transition matrix are generally time dependent and thus denoted by $\boldsymbol{\Theta}_l^{(n)}$ and $\boldsymbol{\Theta}^{(n)}$, respectively, under time-varying content popularity. However, the situation is complicated by the possible choices of various replacement schemes and will be analyzed in detail in Section III.

It is worth noting that the relation between state and content caching probabilities from the first part of this two-part paper [25], i.e.,

$$\boldsymbol{\lambda}^{(n)} = \mathbf{C}_s \boldsymbol{\eta}^{(n)}, \tag{2}$$

still applies in the second part, where $\boldsymbol{\eta}^{(n)}$ is the SCP vector after the $n$th round of request and replacement. The above equation can be rewritten as:

$$\boldsymbol{\eta}^{(n)} = \mathbf{C}_s^T \left(\mathbf{C}_s \mathbf{C}_s^T\right)^{-1} \boldsymbol{\lambda}^{(n)} + \mathbf{n}_C^{(n)}, \tag{3}$$

where $\mathbf{n}_C^{(n)}$ can be any vector in the null space of $\mathbf{C}_s$ that renders $\boldsymbol{\eta}^{(n)}$ a valid probability vector, i.e., $\boldsymbol{\eta}^{(n)} \succeq \mathbf{0}$, $\boldsymbol{\eta}^{(n)} \preceq \mathbf{1}$, and $\mathbf{1}^T \boldsymbol{\eta}^{(n)} = 1$. Therefore, the value of $\mathbf{n}_C^{(n)}$ is dependent on the value of $\boldsymbol{\lambda}^{(n)}$. The content and state caching probabilities $\boldsymbol{\lambda}^{(n)}$ and $\boldsymbol{\eta}^{(n)}$ belong to group G-3 and will be analyzed in detail in Section IV.

III. GENERAL REPLACEMENT MODEL AND SPECIFIC CASES

In this section, the state transition probability matrix of the general replacement model is formulated, followed by the study of the four example replacement schemes introduced in the first part of this two-part paper, i.e., random replacement (RR), replace less popular (LP), replace the least popular (TLP), and least-recently-used (LRU) [25]. Based on the state transition probability matrices, the instantaneous STF is defined at the end of this section.

A. General Replacement Model

Similar to the case in the first part, the state transition probability matrix in the general model can be written as:

$$\boldsymbol{\Theta}^{(n)} = \sum_{l \in \mathcal{C}} \upsilon_l^{(n)} \boldsymbol{\Theta}_l^{(n)}, \tag{4}$$

where $\mathcal{C}$ is the set of all contents, and the conditional cache state transition probability matrix given that content $l$ is requested, i.e., $\boldsymbol{\Theta}_l^{(n)}$, is given by:

$$\boldsymbol{\Theta}_l^{(n)}(m,k) = \begin{cases} 1, & \text{if } k = m \text{ and } l \in \mathcal{C}_k, \\ 1 - \sum_{m' \in \mathcal{H}_{k,l}} \phi_{l,e(k,m'),k}, & \text{if } k = m \text{ and } l \notin \mathcal{C}_k, \\ \phi_{l,e(k,m),k}, & \text{if } m \in \mathcal{H}_{k,l}, \\ 0, & \text{otherwise}, \end{cases} \tag{5}$$

where $\phi_{l,q,k}$ denotes the probability of replacing content $q$ with content $l$ given that the cache is at state $k$ and content $l$ is requested. Unlike the case with time-invariant content popularity, the conditional cache state transition probability matrix $\boldsymbol{\Theta}_l^{(n)}$ can be implicitly request-dependent as a result of $\phi_{l,q,k}$ being request-dependent. Consider the situation when $e(k,m) = q$ and content $q$ is less popular at instant $n$ but more popular at instant $n'$ compared to content $l$, i.e., $\upsilon_q^{(n)} < \upsilon_l^{(n)}$ and $\upsilon_q^{(n')} > \upsilon_l^{(n')}$. Consequently, $\phi_{l,q,k}$ can be different at instants $n$ and $n'$ if LRU, LP, or TLP is used, and thus $\boldsymbol{\Theta}_l^{(n)}(m,k)$ can be different from $\boldsymbol{\Theta}_l^{(n')}(m,k)$. Using LRU as an example, the probability of content $q$ being the LRU content can be different at instants $n$ and $n'$. Therefore, $\boldsymbol{\Theta}_l^{(n)}(m,k)$ is implicitly request-dependent although the request index $(\cdot)^{(n)}$ does not appear on the right-hand side of eq. (5).

B. RR

It is straightforward to see that the conditional cache state transition probability matrix $\boldsymbol{\Theta}_l(m,k)$ in the case of RR is request-independent and remains the same as that in the first part. The overall state transition probability matrix $\boldsymbol{\Theta}_{\mathrm{RR}}$, however, becomes dependent on $n$ through $\boldsymbol{\upsilon}^{(n)}$:

$$\boldsymbol{\Theta}_{\mathrm{RR}}^{(n)}(m,k) = \begin{cases} 1 - L\phi \sum_{l \notin \mathcal{C}_k} \upsilon_l^{(n)}, & \text{if } k = m, \\ \phi\, \upsilon_{e(m,k)}^{(n)}, & \text{if } m \in \mathcal{H}_k, \\ 0, & \text{otherwise}, \end{cases} \tag{6}$$

where $\phi \in (0, 1/L]$ represents the conditional replacement probability that any specified cached content is replaced given that the requested content is not in the cache.

C. LP

In LP, an existing content may be replaced by the new content after the $n$th request if the new content is more likely to be requested at the $(n+1)$th request. The case of LP can be complicated as it involves the prediction of content popularity. Denote the prediction of content popularity at the $(n+1)$th request as $\tilde{\boldsymbol{\upsilon}}^{(n+1)}$. Sort the states in a non-decreasing order based on the sum of predicted request probabilities of the cached contents, i.e.,

$$\sum_{q \in \mathcal{C}_m} \tilde{\upsilon}_q^{(n+1)} \geq \sum_{q \in \mathcal{C}_k} \tilde{\upsilon}_q^{(n+1)}, \quad \text{if } m \geq k. \tag{7}$$

The state transition probability matrix of LP is then given by:

$$\boldsymbol{\Theta}_{\mathrm{LP}}^{(n)}(m,k) = \begin{cases} \sum_{q \in \mathcal{C}_k} \upsilon_q^{(n)} + \sum_{l \in \tilde{\mathcal{C}}_{\bar{k}\downarrow}} \upsilon_l^{(n)} + \sum_{l \in \tilde{\mathcal{C}}_{\bar{k}\uparrow}} \upsilon_l^{(n)} (1 - \alpha), & \text{if } m = k, \\ \alpha\, \upsilon_{e(m,k)}^{(n)}\, \phi^{(n)}_{e(m,k),e(k,m),k}, & \text{if } m > k \text{ and } m \in \mathcal{H}_k, \\ 0, & \text{otherwise}, \end{cases} \tag{8}$$

in which $\alpha$ is the parameter for controlling the replacement probability,

$$\phi^{(n)}_{l,q,k} = \frac{\tilde{\upsilon}_l^{(n+1)} - \tilde{\upsilon}_q^{(n+1)}}{\sum_{\{t \,|\, t \in \mathcal{C}_k,\ \tilde{\upsilon}_t^{(n+1)} < \tilde{\upsilon}_l^{(n+1)}\}} \left(\tilde{\upsilon}_l^{(n+1)} - \tilde{\upsilon}_t^{(n+1)}\right)}, \tag{9}$$

and

$$\tilde{\mathcal{C}}_{\bar{k}\downarrow} = \left\{ l \,\middle|\, l \notin \mathcal{C}_k,\ \tilde{\upsilon}_l^{(n+1)} \leq \min_{t \in \mathcal{C}_k} \{\tilde{\upsilon}_t^{(n+1)}\} \right\}, \tag{10a}$$

$$\tilde{\mathcal{C}}_{\bar{k}\uparrow} = \left\{ l \,\middle|\, l \notin \mathcal{C}_k,\ \tilde{\upsilon}_l^{(n+1)} \geq \min_{t \in \mathcal{C}_k} \{\tilde{\upsilon}_t^{(n+1)}\} \right\}. \tag{10b}$$

Note that the prediction $\tilde{\boldsymbol{\upsilon}}^{(n+1)}$ is not necessarily updated for each content request, and, as a result, $\tilde{\boldsymbol{\upsilon}}^{(n+1)}$ can be a constant for a number of requests. The above state transition probability matrix applies regardless of what the predicted popularity stands for (i.e., the prediction can be for the next request or for a time period over multiple requests, etc.).

D. TLP

In TLP, an existing content is replaced after the $n$th request if it is both: i) the least likely to be requested among the cached contents at the $(n+1)$th request; and ii) less likely to be requested at the $(n+1)$th request compared to the new content at the $n$th request. Sort the states in a non-decreasing order based on the sum of predicted request probabilities of the cached contents. The state transition probability matrix of TLP is given by:

$$\boldsymbol{\Theta}_{\mathrm{TLP}}^{(n)}(m,k) = \begin{cases} \sum_{q \in \mathcal{C}_k} \upsilon_q^{(n)} + \sum_{l \in \tilde{\mathcal{C}}_{\bar{k}\downarrow}} \upsilon_l^{(n)} + \sum_{l \in \tilde{\mathcal{C}}_{\bar{k}\uparrow}} \upsilon_l^{(n)} \left(1 - \phi^{(n)}_{l,q^\dagger(k),k}\right), & \text{if } m = k, \\ \upsilon_{e(m,k)}^{(n)}\, \phi^{(n)}_{e(m,k),q^\dagger(k),k}, & \text{if } m > k \text{ and } k \in \mathcal{H}_{m,q^\dagger(k)}, \\ 0, & \text{otherwise}, \end{cases} \tag{11}$$

where $\phi^{(n)}_{l,q^\dagger(k),k}$ is the conditional probability of replacing $q^\dagger(k)$ with $l$ in state $k$, and

$$q^\dagger(k) = \operatorname*{argmin}_{t \in \mathcal{C}_k} \{\tilde{\upsilon}_t^{(n+1)}\}. \tag{12}$$

Note that $q^\dagger(k)$ changes over time although the superscript $(\cdot)^{(n)}$ is omitted here for simplicity of notation. The value of $\phi^{(n)}_{e(m,k),q^\dagger(k),k}$, where $m > k$ and $k \in \mathcal{H}_{m,q^\dagger(k)}$, can be either 1 or $\tilde{\upsilon}_{e(m,k)}^{(n+1)} - \tilde{\upsilon}_{q^\dagger(k)}^{(n+1)}$, referred to as TLP-A (always replace) and TLP-P (probabilistically replace), respectively.

Similar to the case in the first part, $\boldsymbol{\Theta}_{\mathrm{LP}}^{(n)}$ and $\boldsymbol{\Theta}_{\mathrm{TLP}}^{(n)}$ are both lower-triangular matrices.

The relation among the content popularity, the prediction, and the SCP, all of which are time varying, can be very complicated. As our focus is on understanding the impact of replacement schemes on the time-varying SCP instead of predicting content popularity, the prediction in the case of LP and TLP will be assumed to be accurate in this work. As in the first part, LP and TLP, unlike RR and LRU, are not practical replacement schemes but are considered here just for analyzing the impact of content popularity information on the STF of replacement schemes.

E. LRU

To fit LRU into the general cache state transition model, the conditional probability that a specific cached content is the LRU content given the current cache state needs to be found. In order to find this conditional probability, the following result is obtained.
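As a rough numerical cross-check that is independent of the analysis, this conditional probability can also be estimated by simulating many independent LRU caches and recording, at a chosen request instant, which cached content is least recently used in each observed state. The sketch below is illustrative only: the function names, the random initial cache, and the drifting popularity profile are assumptions and do not come from the paper.

```python
import random
from collections import Counter, defaultdict

def estimate_lru_probability(popularity_at, cache_size, num_requests,
                             num_runs=5000, seed=0):
    """Estimate P(content q is the LRU content | cache state k) after
    num_requests requests, by simulating independent LRU caches.

    popularity_at(n) returns the request probabilities at request n (1-based).
    Returns {state (frozenset): {content: estimated conditional probability}}.
    """
    rng = random.Random(seed)
    num_contents = len(popularity_at(1))
    contents = list(range(num_contents))
    counts = defaultdict(Counter)
    for _ in range(num_runs):
        # Recency order: index 0 is least recently used. The cache starts
        # full with a random order (an assumed initial condition).
        order = rng.sample(contents, cache_size)
        for n in range(1, num_requests + 1):
            requested = rng.choices(contents, weights=popularity_at(n))[0]
            if requested in order:
                order.remove(requested)   # cache hit: refresh recency
            else:
                order.pop(0)              # cache miss: evict the LRU content
            order.append(requested)       # requested content is now most recent
        counts[frozenset(order)][order[0]] += 1
    return {state: {q: c / sum(tally.values()) for q, c in tally.items()}
            for state, tally in counts.items()}

# Hypothetical drifting popularity: content 0 fades while content 2 rises.
def drifting(n, horizon=30):
    shift = 0.4 * min(n, horizon) / horizon
    return [0.6 - shift, 0.3, 0.1 + shift]

rho = estimate_lru_probability(drifting, cache_size=2, num_requests=30)
for state, probs in sorted(rho.items(), key=lambda item: sorted(item[0])):
    print(sorted(state), {q: round(p, 3) for q, p in sorted(probs.items())})
```

Changing `num_requests` while keeping the same popularity profile shows how the estimate depends on the request instant, which is the request dependence that the superscript $(n)$ expresses.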

Lemma 1:

The joint probability that: i) the current state is $k$; ii) content $q^\star \in \mathcal{C}_k$ is the LRU content at the $n$th request; and iii) the most recent request for $q^\star$ is the $(n-w)$th request, denoted by $\rho^{(n)}(q^\star, w, k)$, can be found by:

$$\rho^{(n)}(q^\star, w, k) = \sum_{u=1}^{U_w} \prod_{i=1}^{L-1} \prod_{t \in \mathcal{T}(k,i,u,\bar{q}^\star)} \upsilon^{(t)}_{k(i,\bar{q}^\star)}, \tag{13}$$

where $k(i,\bar{q}^\star)$, $i \in \{1, \dots, L-1\}$, represents the $i$th cached content in state $k$ that is not content $q^\star$, $U_w$ represents the number of all possible ways of ordering and allocating $w-1$ requests to $L-1$ contents while guaranteeing at least one request for each content, and $\mathcal{T}(k,i,u,\bar{q}^\star)$ represents the set of requests allocated to content $k(i,\bar{q}^\star)$ in the $u$th out of the $U_w$ allocations.

Proof: See Section A in Appendix.

Given the joint probability in Lemma 1, the conditional probability that content $q^\star \in \mathcal{C}_k$ is the LRU content given that the current state is $k$ can be found as follows:

$$\rho^{(n)}(q^\star \,|\, k) = \frac{\sum_{w=L}^{\infty} \rho^{(n)}(q^\star, w, k)}{\sum_{w=L}^{\infty} \sum_{q \in \mathcal{C}_k} \rho^{(n)}(q, w, k)}. \tag{14}$$

Here it is assumed that a sufficient number of requests have occurred, i.e., $n \to \infty$. Note that the above probability is the general case for the probability $\rho^{\mathrm{LRU}}_{e(k,m)|k}$ from the first part of this two-part paper.

Using the above conditional probability, the conditional state transition probability matrix $\boldsymbol{\Theta}_l$ can be given by:

$$\boldsymbol{\Theta}_{l,\mathrm{LRU}}^{(n)}(m,k) = \begin{cases} 1, & \text{if } l \in \mathcal{C}_k \text{ and } k = m, \\ \rho^{(n)}(e(k,m) \,|\, k), & \text{if } m \in \mathcal{H}_{k,l}, \\ 0, & \text{otherwise}. \end{cases} \tag{15}$$

The overall state transition probability matrix $\boldsymbol{\Theta}_{\mathrm{LRU}}$ is given by:

$$\boldsymbol{\Theta}_{\mathrm{LRU}}^{(n)}(m,k) = \begin{cases} \sum_{l \in \mathcal{C}_k} \upsilon_l^{(n)}, & \text{if } k = m, \\ \upsilon_{e(m,k)}^{(n)}\, \rho^{(n)}(e(k,m) \,|\, k), & \text{if } m \in \mathcal{H}_k, \\ 0, & \text{otherwise}. \end{cases} \tag{16}$$

IV. INSTANTANEOUS CCP AND STF

Based on the state transition probability matrix, this section analyzes the transition of the instantaneous CCP and formulates the instantaneous STF.
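As a concrete instance of a request-dependent state transition probability matrix, the RR matrix of eq. (6) can be assembled numerically and used to propagate the SCP vector, from which the CCP vector of eq. (2) and the hit probability of eq. (1) follow. The code below is a minimal sketch under stated assumptions: the function names, the 0-based content indices, the uniform initial SCP, and the popularity sequence are all hypothetical.

```python
from itertools import combinations

def rr_transition_matrix(states, popularity, phi):
    """Theta_RR of eq. (6): entry [m][k] is the probability of moving
    from state k to state m under random replacement."""
    cache_size = len(states[0])
    n_states = len(states)
    theta = [[0.0] * n_states for _ in range(n_states)]
    for k, cached in enumerate(states):
        miss = sum(p for l, p in enumerate(popularity) if l not in cached)
        theta[k][k] = 1.0 - cache_size * phi * miss
        for m, other in enumerate(states):
            added = set(other) - set(cached)
            if len(added) == 1:           # state m is a neighbor of state k
                theta[m][k] = phi * popularity[added.pop()]
    return theta

def mat_vec(matrix, vector):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

def hit_probability(next_popularity, states, eta):
    """gamma of eq. (1), using the CCP vector lambda = C_s eta of eq. (2)."""
    ccp = [sum(e for s, e in zip(states, eta) if l in s)
           for l in range(len(next_popularity))]
    return sum(p * c for p, c in zip(next_popularity, ccp))

states = list(combinations(range(3), 2))   # 3 contents, cache size 2
eta = [1 / 3, 1 / 3, 1 / 3]                # assumed initial SCP vector
# Hypothetical popularity sequence in which content 2 gradually takes over.
popularity_seq = [[0.6, 0.3, 0.1], [0.4, 0.3, 0.3],
                  [0.2, 0.3, 0.5], [0.1, 0.2, 0.7]]
for current, upcoming in zip(popularity_seq, popularity_seq[1:]):
    theta = rr_transition_matrix(states, current, phi=0.5)
    eta = mat_vec(theta, eta)              # SCP after this request and replacement
    print([round(e, 4) for e in eta],
          round(hit_probability(upcoming, states, eta), 4))
```

Each column of the assembled matrix sums to one because a missed content $l$ can replace any of the $L$ cached contents, each with probability $\phi$, which is why $\phi$ is restricted to $(0, 1/L]$.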

A. Instantaneous CCP

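Based on the relation between the content and the state caching probabilities in eq. (2), the resulting CCP vector after the $n$th request and replacement is given by:

$$\boldsymbol{\lambda}^{(n)} = \mathbf{C}_s \boldsymbol{\eta}^{(n)} = \mathbf{C}_s \sum_{l \in \mathcal{C}} \upsilon_l^{(n)} \boldsymbol{\Theta}_l^{(n)} \boldsymbol{\eta}^{(n-1)}. \tag{17}$$

Using eq. (3), it follows that:

$$\boldsymbol{\lambda}^{(n)} = \left( \mathbf{C}_s \sum_{l \in \mathcal{C}} \upsilon_l^{(n)} \boldsymbol{\Theta}_l^{(n)} \mathbf{C}_s^T \left(\mathbf{C}_s \mathbf{C}_s^T\right)^{-1} \right) \boldsymbol{\lambda}^{(n-1)} + \mathbf{C}_s \sum_{l \in \mathcal{C}} \upsilon_l^{(n)} \boldsymbol{\Theta}_l^{(n)} \mathbf{n}_C^{(n-1)}. \tag{18}$$

It can be seen that the mapping from $\boldsymbol{\lambda}^{(n-1)}$ to $\boldsymbol{\lambda}^{(n)}$ is complicated. Specifically, unlike the mapping between two consecutive SCP vectors, which can be simply written as $\boldsymbol{\eta}^{(n)} = \boldsymbol{\Theta}^{(n)} \boldsymbol{\eta}^{(n-1)}$, the mapping between consecutive CCP vectors cannot be written in a linear form due to the second term in eq. (18), i.e., $\mathbf{C}_s \sum_{l \in \mathcal{C}} \upsilon_l^{(n)} \boldsymbol{\Theta}_l^{(n)} \mathbf{n}_C^{(n-1)}$. Moreover, although eq. (18) appears to have an affine form, the mapping from $\boldsymbol{\lambda}^{(n-1)}$ to $\boldsymbol{\lambda}^{(n)}$ is not affine either. This is implicitly conveyed through the variable $\mathbf{n}_C^{(n-1)}$, since the value of $\mathbf{n}_C^{(n-1)}$ depends on $\boldsymbol{\lambda}^{(n-1)}$ and the dependence is nonlinear, as explained after eq. (3) in Section II.

B. Instantaneous STF - The General Case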

Under time-varying content popularity, the state transition probability matrix is $\boldsymbol{\Theta}^{(n)}$ when the SCP is $\boldsymbol{\eta}^{(n-1)}$. Therefore, the STF at the instant of the $n$th request and the point $\boldsymbol{\eta}^{(n-1)}$ is given by:

$$\mathbf{u}^{(n)}(\boldsymbol{\eta}^{(n-1)}) = \boldsymbol{\Theta}^{(n)} \boldsymbol{\eta}^{(n-1)} - \boldsymbol{\eta}^{(n-1)}. \tag{19}$$

The superscript $(n)$ in $\mathbf{u}^{(n)}(\cdot)$ reflects the fact that the STF is no longer static but time-varying as a result of the time-varying content popularity. The direction and strength of the instantaneous STF depend on both $\boldsymbol{\eta}$, i.e., the location in the state transition domain, and $n$, i.e., the request instant. The value of the instantaneous STF $\mathbf{u}^{(n)}(\boldsymbol{\eta}^{(n-1)})$ represents the change in the SCP after the $n$th round of request and replacement. The effect of a replacement scheme on the dynamic SCP over a sequence of requests can be decomposed into the summation over the instantaneous STFs:

$$\boldsymbol{\eta}^{(n+N-1)} - \boldsymbol{\eta}^{(n-1)} = \sum_{t=0}^{N-1} \left( \boldsymbol{\eta}^{(n+t)} - \boldsymbol{\eta}^{(n+t-1)} \right) = \sum_{t=0}^{N-1} \mathbf{u}^{(n+t)}(\boldsymbol{\eta}^{(n+t-1)}), \tag{20}$$

for any $n \geq 1$ and $N \geq 1$.

Similarly, other metrics can also be studied through instantaneous STFs, e.g., the average cache hit probability.

Lemma 2:

Using instantaneous STFs from the first to the $n$th request, the average cache hit probability over the $n$ requests can be given by:

$$\gamma_{\mathrm{avg}} = \frac{1}{n} \sum_{t=2}^{n} \left(\boldsymbol{\upsilon}^{(t)}\right)^T \mathbf{C}_s \left( \sum_{t'=0}^{t-2} \mathbf{u}^{(t'+1)} \right) + \boldsymbol{\upsilon}_{\mathrm{avg}}^T \mathbf{C}_s \boldsymbol{\eta}^{(0)}, \tag{21}$$

in which $\mathbf{u}^{(t'+1)}$ is the abbreviation for $\mathbf{u}^{(t'+1)}(\boldsymbol{\eta}^{(t')})$, and

$$\boldsymbol{\upsilon}_{\mathrm{avg}} = \frac{1}{n} \sum_{t=1}^{n} \boldsymbol{\upsilon}^{(t)} \tag{22}$$

is the average content popularity over the $n$ requests.

Proof: See Section B in Appendix.

Lemma 2 shows that the average cache hit probability over an arbitrary number of requests, starting from any initial SCP $\boldsymbol{\eta}^{(0)}$, can be obtained from instantaneous STFs, instantaneous content request probabilities, and the initial point $\boldsymbol{\eta}^{(0)}$. The inner summation over $t'$ in eq. (21) represents the effect of historical requests and replacements on the instantaneous cache hit probability at the $t$th request. The decomposition in eq. (20) and the result in eq. (21) demonstrate the importance of analyzing the instantaneous STF under different replacement schemes. If the instantaneous content request probabilities $\boldsymbol{\upsilon}^{(t)}$, $t \in \{1, \dots, n\}$, can be obtained, the instantaneous STF of a replacement scheme at any point in the state transition region can be calculated using eqs. (4), (5), and (19). For evaluating and comparing different cache replacement schemes, we can substitute the specific STF of the replacement schemes for $\mathbf{u}^{(1)}, \dots, \mathbf{u}^{(t-1)}$ in eq. (21).

The instantaneous STF can be decomposed. Define the $l$th component of $\mathbf{u}^{(n)}(\boldsymbol{\eta}^{(n-1)})$ as:

$$\mathbf{u}_l^{(n)} = \boldsymbol{\Theta}_l^{(n)} \boldsymbol{\eta}^{(n-1)} - \boldsymbol{\eta}^{(n-1)}. \tag{23}$$

It can be seen that:

$$\mathbf{u}^{(n)}(\boldsymbol{\eta}^{(n-1)}) = \boldsymbol{\Theta}^{(n)} \boldsymbol{\eta}^{(n-1)} - \boldsymbol{\eta}^{(n-1)} = \sum_{l \in \mathcal{C}} \upsilon_l^{(n)} \left( \boldsymbol{\Theta}_l^{(n)} \boldsymbol{\eta}^{(n-1)} - \boldsymbol{\eta}^{(n-1)} \right) = \sum_{l \in \mathcal{C}} \upsilon_l^{(n)} \mathbf{u}_l^{(n)}. \tag{24}$$

C. The Case of RR, LP, TLP, and LRU

When a specific replacement scheme is considered, $\mathbf{u}_l^{(n)}$ can be found based on its conditional state transition probability matrix $\boldsymbol{\Theta}_l^{(n)}$ using eq. (23).

For the case of RR, the $m$th element of $\mathbf{u}_l^{(n)}$ is given by:

$$u^{(n)}_{m,l,\mathrm{RR}} = \begin{cases} \phi \sum_{\{k \,|\, m \in \mathcal{H}_{k,l}\}} \eta_k^{(n-1)}, & \text{if } l \in \mathcal{C}_m, \\ -L\phi\, \eta_m^{(n-1)}, & \text{otherwise}. \end{cases} \tag{25}$$

The $m$th element of $\mathbf{u}_l^{(n)}$ for LP is given by:

$$u^{(n)}_{m,l,\mathrm{LP}} = \begin{cases} \sum_{k \in \mathcal{G}^{(n)}_{m,l}} \eta_k^{(n-1)} \phi^{(n)}_{l,e(k,m),k}, & \text{if } l \in \mathcal{C}_m, \\ -\eta_m^{(n-1)}, & \text{if } l \notin \mathcal{C}_m \text{ and } \min_{q \in \mathcal{C}_m} \{\tilde{\upsilon}_q^{(n+1)}\} < \tilde{\upsilon}_l^{(n+1)}, \\ 0, & \text{otherwise}, \end{cases} \tag{26}$$

where

$$\mathcal{G}^{(n)}_{m,l} = \left\{ k \,\middle|\, m \in \mathcal{H}_{k,l},\ \tilde{\upsilon}_{e(k,m)}^{(n+1)} < \tilde{\upsilon}_l^{(n+1)} \right\}, \tag{27}$$

representing the set of states which include state $m$ in their content-$l$ neighbors and cache a less popular content compared to state $m$ according to the predicted popularity for the $(n+1)$th request.

Similarly, the $m$th element of $\mathbf{u}_l^{(n)}$ for TLP is given by:

$$u^{(n)}_{m,l,\mathrm{TLP}} = \begin{cases} \sum_{k \in \hat{\mathcal{G}}^{(n)}_{m,l}} \phi^{(n)}_{e(m,k),q^\dagger(k),k}\, \eta_k^{(n-1)}, & \text{if } l \in \mathcal{C}_m, \\ -\phi^{(n)}_{e(m,k),q^\dagger(k),k}\, \eta_m^{(n-1)}, & \text{if } l \notin \mathcal{C}_m \text{ and } \min_{q \in \mathcal{C}_m} \{\tilde{\upsilon}_q^{(n+1)}\} < \tilde{\upsilon}_l^{(n+1)}, \\ 0, & \text{otherwise}, \end{cases} \tag{28}$$

where

$$\hat{\mathcal{G}}^{(n)}_{m,l} = \left\{ k \,\middle|\, m \in \mathcal{H}_{k,l},\ \tilde{\upsilon}_{e(k,m)}^{(n+1)} = \min_{q \in \mathcal{C}_k} \{\tilde{\upsilon}_q^{(n+1)}\} < \tilde{\upsilon}_l^{(n+1)} \right\}, \tag{29}$$

representing the set of states which include state $m$ in their content-$l$ neighbors and cache a content less popular than any content cached by state $m$ according to the predicted popularity for the $(n+1)$th request.

For the case of LRU, the $m$th element of $\mathbf{u}_l^{(n)}$ is given by:

$$u^{(n)}_{m,l,\mathrm{LRU}} = \begin{cases} \sum_{k \in \mathcal{G}_{m,l}} \rho^{(n)}(e(k,m) \,|\, k)\, \eta_k^{(n-1)}, & \text{if } l \in \mathcal{C}_m, \\ -\eta_m^{(n-1)}, & \text{otherwise}, \end{cases} \tag{30}$$

where

$$\mathcal{G}_{m,l} = \{ k \,|\, m \in \mathcal{H}_{k,l} \}. \tag{31}$$

In the next section, we study the instantaneous STF of the considered replacement schemes and its impact on their instantaneous cache hit probability.

V. IMPACT OF STF ON INSTANTANEOUS CACHE HIT PROBABILITY

When the content popularity varies over time, a replacement scheme may not lead to any steady state. As a result, the analysis of steady states and rate of convergence does not apply. Instead, the impact of a replacement scheme on the instantaneous cache hit probability at the next request is investigated.
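The quantity studied in this section can be computed directly for any column-stochastic transition matrix: it is the next-request hit probability with the replacement applied minus the value without it. The sketch below uses a hypothetical helper `hit_probability_change` and an assumed toy instance (two contents, cache size 1, random replacement with $\phi = 0.8$); none of these numbers come from the paper.

```python
def mat_vec(matrix, vector):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

def hit_probability_change(next_popularity, state_matrix, theta, eta_prev):
    """Next-request hit probability with the replacement applied minus the
    value without it: upsilon^T C_s (Theta eta - eta)."""
    eta_next = mat_vec(theta, eta_prev)
    stf = [a - b for a, b in zip(eta_next, eta_prev)]   # instantaneous STF
    ccp_change = mat_vec(state_matrix, stf)             # change in the CCP vector
    return sum(p * c for p, c in zip(next_popularity, ccp_change))

# Toy instance (assumed): two contents, cache size 1, random replacement.
phi = 0.8
popularity_now = [0.7, 0.3]
C_s = [[1, 0], [0, 1]]                                  # states {0} and {1}
theta_rr = [[1 - phi * popularity_now[1], phi * popularity_now[0]],
            [phi * popularity_now[1], 1 - phi * popularity_now[0]]]
eta_prev = [0.2, 0.8]

for popularity_next in ([0.9, 0.1], [0.5, 0.5], [0.1, 0.9]):
    change = hit_probability_change(popularity_next, C_s, theta_rr, eta_prev)
    print(popularity_next, round(change, 4))
```

In this toy instance the same replacement step helps when content 0 stays popular, has no effect when the next request is uniform over the contents, and hurts when the popularity flips toward content 1.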

A. The General Case

A replacement after the $n$th request affects the cache hit probability at the $(n+1)$th request. Consider the time instant right after the $n$th request and replacement so that $\mathbf{u}^{(n)}(\cdot)$ is the current STF and the $(n+1)$th request is the next request in the future. The effect of a replacement scheme can be conveyed through the difference between the cache hit probability at the $(n+1)$th request with and without a replacement (based on the chosen scheme) after the $n$th request. This difference is given by:

$$d_\gamma^{(n+1)} = \left(\boldsymbol{\upsilon}^{(n+1)}\right)^T \mathbf{C}_s \left( \boldsymbol{\eta}^{(n)} - \boldsymbol{\eta}^{(n-1)} \right) = \left(\boldsymbol{\upsilon}^{(n+1)}\right)^T \mathbf{C}_s \mathbf{u}^{(n)}(\boldsymbol{\eta}^{(n-1)}). \tag{32}$$

[Fig. 1: Illustration of the relation between instantaneous cache hit probability, $\boldsymbol{\eta}^{(n)}$, and $\boldsymbol{\upsilon}^{(n+1)}$. Labeled elements: hyperplane A: $\mathbf{1}^T \boldsymbol{\eta} = 1$; hyperplane B: $(\boldsymbol{\upsilon}^{(n+1)})^T \mathbf{C}_s (\boldsymbol{\eta} - \boldsymbol{\eta}^{(n-1)}) = 0$; vector $\mathbf{z}^{(n+1)} = \mathbf{C}_s^T \boldsymbol{\upsilon}^{(n+1)}$; point $\boldsymbol{\eta}^{(n-1)}$. The figure marks two areas: the area in which $\boldsymbol{\eta}^{(n+1)}$ may fall, i.e., the intersection of hyperplane A and the subspace $\boldsymbol{\eta}^{(n+1)} \succeq \mathbf{0}$, and the area in which $(\mathbf{z}^{(n+1)})^T \boldsymbol{\eta}^{(n+1)} \geq (\mathbf{z}^{(n+1)})^T \boldsymbol{\eta}^{(n)}$.]

The above result shows that the cache hit ratio at the $(n+1)$th request depends on the content popularity at the $(n+1)$th request, i.e., $\boldsymbol{\upsilon}^{(n+1)}$, the STF at the $n$th request, i.e., $\mathbf{u}^{(n)}(\cdot)$, and the SCP at the $(n-1)$th request, i.e., $\boldsymbol{\eta}^{(n-1)}$. Among these three factors, $\boldsymbol{\eta}^{(n-1)}$ reflects the accumulative effect of the previous $n-1$ rounds of request and replacement, $\mathbf{u}^{(n)}(\cdot)$ represents the current STF, and $\boldsymbol{\upsilon}^{(n+1)}$ represents the content popularity at the next request in the future. The result in eq. (32) shows the complication due to time-varying content popularity: $\boldsymbol{\upsilon}^{(n+1)}$ and $\mathbf{u}^{(n)}(\cdot)$ in eq. (32) would reduce to $\boldsymbol{\upsilon}$ and $\mathbf{u}(\cdot)$, respectively, if the content popularity becomes time-invariant.

Some general observations can be made:

1) Define $\mathbf{z}^{(n+1)} = \mathbf{C}_s^T \boldsymbol{\upsilon}^{(n+1)}$. Then $\mathbf{z}^{(n+1)}$ is the state cache hit probability vector at the $(n+1)$th request. Depending on $\boldsymbol{\eta}^{(n-1)}$, $\boldsymbol{\upsilon}^{(n)}$, and $\boldsymbol{\Theta}^{(n)}$, $\boldsymbol{\eta}^{(n+1)}$ may fall at any point in the feasible area in Fig. 1. The replacement after the $n$th request improves the instantaneous cache hit probability at the $(n+1)$th request if the replacement drives the SCP into the area of Fig. 1 in which $(\mathbf{z}^{(n+1)})^T \boldsymbol{\eta}$ does not decrease.

2) $d_\gamma^{(n+1)}$ is small, regardless of $\boldsymbol{\upsilon}^{(n+1)}$, when $\boldsymbol{\eta}^{(n-1)}$ is close to the steady state corresponding to $\boldsymbol{\upsilon}^{(n)}$ (i.e., the steady state if the content popularity is constant and remains equal to $\boldsymbol{\upsilon}^{(n)}$).

3) In the trivial case when $\boldsymbol{\upsilon}^{(n+1)}$ approaches $(1/N_c) \cdot \mathbf{1}$, where $N_c$ is the number of contents, the hyperplane $(\boldsymbol{\upsilon}^{(n+1)})^T \mathbf{C}_s (\boldsymbol{\eta} - \boldsymbol{\eta}^{(n)}) = 0$ coincides with the hyperplane $\mathbf{1}^T \boldsymbol{\eta} = 1$. In such a case, $d_\gamma^{(n+1)}$ becomes zero for any replacement scheme.

The effect of a replacement scheme on $d_\gamma^{(n+1)}$ can be conveyed through the set of content-specific instantaneous STFs $\{\mathbf{u}_l^{(n)}\}$ using eq. (24).

Theorem 1:

The $d_\gamma^{(n+1)}$ in eq. (32) can be equivalently rewritten as:

$$d_\gamma^{(n+1)} = \sum_{l \in \mathcal{C}} \left( \upsilon_l^{(n)} - \bar{\upsilon}_l \right) c_l^{(n+1)}, \tag{33}$$

where

$$c_l^{(n+1)} = \left(\boldsymbol{\upsilon}^{(n+1)}\right)^T \mathbf{C}_s \mathbf{u}_l^{(n)}, \tag{34}$$

and $\{\bar{\upsilon}_l\}_{l \in \mathcal{C}}$ represents the content popularity under which $\boldsymbol{\eta}^{(n-1)}$ would be the steady state.

Proof: See Section C in Appendix.

Based on eq. (33) and eq. (34), the factors that determine $d_\gamma^{(n+1)}$ are: $\{\upsilon_l^{(n+1)}\}$, $\{\upsilon_l^{(n)}\}$, $\{\bar{\upsilon}_l\}$, and $\mathbf{u}_l^{(n)}$. The factor $\{\bar{\upsilon}_l\}$ depends on the historical content requests up to the $(n-1)$th request, $\mathbf{u}_l^{(n)}$ depends on $\boldsymbol{\eta}^{(n-1)}$, and both $\{\bar{\upsilon}_l\}$ and $\mathbf{u}_l^{(n)}$ depend on the replacement scheme. The term $\upsilon_l^{(n)} - \bar{\upsilon}_l$ reflects the deviation in the request probability for content $l$ from its 'steady' request probability, which manifests the influence of historical requests. The term $c_l^{(n+1)}$ represents the change in the cache hit probability at the $(n+1)$th request, using the corresponding replacement scheme, when the current SCP is $\boldsymbol{\eta}^{(n-1)}$ and content $l$ is requested at the $n$th request.

Using Theorem 1, a more detailed investigation could be conducted for a specific content popularity model (e.g., the shot noise model [15]). Nevertheless, the study of specific content popularity models is not the focus of this work. Section VI, however, will cover numerical results on the performance of replacement schemes under specific content popularity models.

B. Upper and Lower Bounds of $d_\gamma^{(n+1)}$

The term $\mathbf{C}_s \mathbf{u}^{(n)}(\boldsymbol{\eta}^{(n-1)})$ in $d_\gamma^{(n+1)}$ represents the change in the content caching probabilities after the $n$th request under the chosen replacement scheme. Sort the contents based on their popularity at the instant of the $n$th request so that $\upsilon_1^{(n)} \geq \upsilon_2^{(n)} \geq \cdots \geq \upsilon_{N_c}^{(n)}$. The upper-bound and lower-bound of $d_\gamma^{(n+1)}$ can be found using the following result.

Theorem 2:

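The upper bound and lower bound of $d_\gamma^{(n+1)}$, denoted as $\hat{d}_\gamma^{(n+1)}$ and $\check{d}_\gamma^{(n+1)}$, respectively, for RR, LP, TLP, and LRU are given by:

$$\hat{d}_\gamma^{(n+1)} = \begin{cases} L \phi \max_l \{\upsilon_l^{(n)}\}, & \text{RR} \\ \alpha \max_l \{\upsilon_l^{(n)}\}, & \text{LP} \\ \max_l \{\upsilon_l^{(n)}\}, & \text{TLP-A or LRU} \\ \max_l \{\upsilon_l^{(n)}\} \max_l \{\tilde{\upsilon}_l^{(n+1)}\}, & \text{TLP-P} \end{cases} \tag{35}$$

and

$$\check{d}_\gamma^{(n+1)} = \begin{cases} -\phi, & \text{RR} \\ -\alpha, & \text{LP} \\ -1, & \text{TLP-A or LRU} \\ -\sum_{l=1}^{N_c} \upsilon_l^{(n)} \tilde{\upsilon}_l^{(n+1)}, & \text{TLP-P.} \end{cases} \tag{36}$$

Proof: See Section D in Appendix.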
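As a quick numerical illustration of Theorem 2, the bounds in eq. (35) and eq. (36) can be evaluated with a few lines of Python. This is only a minimal sketch: the function name `d_gamma_bounds` and its argument names are hypothetical, and the vector $\tilde{\upsilon}^{(n+1)}$ needed for TLP-P is assumed to be supplied by the caller, since its definition is given earlier in the paper and is not reproduced here.

```python
def d_gamma_bounds(pop_now, scheme, cache_size=None, phi=None, alpha=None,
                   pop_next_tilde=None):
    """Return (upper, lower) bounds on d_gamma^(n+1) following eqs. (35)-(36).

    pop_now        -- request probabilities upsilon^(n) at the n-th request
    scheme         -- one of "RR", "LP", "TLP-A", "LRU", "TLP-P"
    cache_size     -- L, only used by RR
    phi, alpha     -- replacement parameters of RR and LP, respectively
    pop_next_tilde -- tilde-upsilon^(n+1), only used by TLP-P (assumed given)
    """
    peak = max(pop_now)  # max_l upsilon_l^(n)
    if scheme == "RR":
        return cache_size * phi * peak, -phi
    if scheme == "LP":
        return alpha * peak, -alpha
    if scheme in ("TLP-A", "LRU"):
        return peak, -1.0
    if scheme == "TLP-P":
        upper = peak * max(pop_next_tilde)
        lower = -sum(p * q for p, q in zip(pop_now, pop_next_tilde))
        return upper, lower
    raise ValueError("unknown scheme: " + scheme)


# Toy popularity vector (illustrative values, not taken from the paper).
popularity = [0.5, 0.3, 0.2]
for name, kwargs in [("RR", {"cache_size": 2, "phi": 0.1}),
                     ("LP", {"alpha": 0.2}),
                     ("LRU", {})]:
    print(name, d_gamma_bounds(popularity, name, **kwargs))
```

The sketch makes the role of $\phi$ and $\alpha$ visible: they scale both bounds for RR and LP, whereas TLP-A and LRU always span the widest range.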

C. Observations

The following observations can be made from the preceding analysis of the relation between the instantaneous STF and the difference in cache hit probability.

• From eq. (25), eq. (33), and eq. (34), it can be seen that the parameter $\phi$ is only a scaling factor in $d_\gamma^{(n+1)}$ in the case of RR. Specifically, whether $d_\gamma^{(n+1)}$ is negative or not is jointly decided by $\upsilon^{(n+1)}$, $\upsilon^{(n)}$, and $\eta^{(n-1)}$. The parameter $\phi$ can scale $d_\gamma^{(n+1)}$ but does not have any impact on its sign. This explains the result in [25] that $\phi$ affects the convergence speed but not the steady state under constant content popularity.

• Four cases of instantaneous STF $u^{(n)}(\eta^{(n-1)})$ and $z^{(n+1)}$ are illustrated in Fig. 2a and Fig. 2b. In each single replacement, both LP and TLP drive the SCP $\eta$ towards a direction that increases $(z^{(n+1)})^T \eta$, i.e., $(z^{(n+1)})^T \eta^{(n+1)} \geq (z^{(n+1)})^T \eta^{(n)}$, where $z^{(n+1)} = C_s^T \upsilon^{(n+1)}$. Therefore, only case 2 in Fig. 2a and case 3 in Fig. 2b are possible for LP and TLP, while all four cases can occur for RR and LRU. Moreover, TLP drives $\eta$ towards the direction that increases $(z^{(n+1)})^T \eta$ the fastest, which resembles the steepest gradient in optimization. This explains the result in the first part that TLP converges faster than LP under constant content popularity.

• Under time-varying content popularity, LP and TLP may not effectively trace the varying content popularity, depending on the pattern of variation. Specifically, if $\upsilon^{(n)}$ varies so that $z^{(n)}$ changes along a straight path over time, as shown in Fig. 2c, then LP and TLP can still trace the content popularity well, and TLP should outperform LP. An example of such a scenario is when popularity concentrates so that the most popular contents become even more popular over time.

• If $\upsilon^{(n)}$ varies so that $z^{(n)}$ changes fast and randomly in an area, as shown in Fig. 2d, then LP and TLP may not trace the content popularity well, and TLP can perform worse than LP. An example of such a scenario is when content popularity varies drastically over time so that the most popular set of contents rapidly changes.

Footnote: For the lower bound of $d_\gamma^{(n+1)}$ in the case of TLP-P, it is assumed that the $L$ least popular contents at the $n$th request remain least popular at the $(n+1)$th request.

Footnote: Accurate prediction of content popularity is assumed for the case of LP and TLP.

[Fig. 2: Illustration of the relation between the replacement schemes, the instantaneous STF $u^{(n)}(\eta^{(n-1)})$, and the state cache hit probability $z^{(n+1)}$. Panels: (a) $u^{(n)}$ and $z^{(n+1)}$: cases 1 and 2; (b) $u^{(n)}$ and $z^{(n+1)}$: cases 3 and 4; (c) $z^{(n)}$ changes along a straight path; (d) $z^{(n)}$ changes randomly.]

[Fig. 3: Instantaneous STF and its impact on the instantaneous cache hit probability at the next request. Panels: (a) RR; (b) LP, example 1; (c) LP, example 2.]

VI. NUMERICAL EXAMPLES

A. Instantaneous STF under Time-varying Content Popularity

Fig. 3 demonstrates the instantaneous STF under time-varying content popularity and further illustrates Fig. 1 using RR and LP as examples. Similar to the first part of this two-part paper, we use 3-D STFs for illustrations.

Fig. 3a shows the case under RR. The content popularity at the $n$th and $(n+1)$th requests are $\upsilon^{(n)}$ = [0. , . , . ]^T and $\upsilon^{(n+1)}$ = [0. , . , . ]^T, respectively. The solid circle with red filling shows where the steady state would be if the content popularity were fixed and equal to $\upsilon^{(n)}$. The hollow circle shows where the stationary state would be if the content popularity were fixed and equal to $\upsilon^{(n+1)}$. The black triangular area with solid edges represents the state transition domain. The black arrows demonstrate the direction and strength of the STF at the instant of the $n$th request at the corresponding locations in the state transition domain. The colored straight lines in the x-y plane show the contour of the cache hit probability in the state transition domain. The solid straight line from the origin $(0, 0, 0)$ to the diamond marker in the STF is specified by the vector $C_s \upsilon^{(n+1)}$. Denote the SCP vector $\eta$ at the diamond marker as $\bar{\eta}^{(n)}$. The dashed triangle in blue represents the intersection of the plane $(\upsilon^{(n+1)})^T C_s (\eta - \bar{\eta}^{(n)}) = 0$ with the 3 planes $\eta_1 = 0$, $\eta_2 = 0$, and $\eta_3 = 0$. The dotted line represents the intersection of the plane $(\upsilon^{(n+1)})^T C_s (\eta - \bar{\eta}^{(n)}) = 0$ with the state transition domain.

From Fig. 3a, the effect of the $n$th replacement, given the replacement scheme of RR and the above change of content popularity from $\upsilon^{(n)}$ to $\upsilon^{(n+1)}$, can be observed. Specifically, given any SCP, i.e., a point in the state transition domain, if the arrow representing the instantaneous STF at that point can be scaled such that it crosses the dotted line from below to above, the $n$th replacement yields a smaller cache hit probability at the $(n+1)$th request compared with no replacement. By contrast, if the arrow can be scaled such that it crosses the dotted line from above to below, the $n$th replacement yields a larger cache hit probability at the $(n+1)$th request. If the arrow is parallel to the dotted line, the $n$th replacement has no impact on the cache hit probability at the $(n+1)$th request.

Fig. 3b shows the first of two examples with LP. The content popularity $\upsilon^{(n)}$ and $\upsilon^{(n+1)}$ are the same as in Fig. 3a. In this example, the change in the content popularity is not significant, so the state which caches the most popular contents does not change. As a result, the stationary state if the content popularity is fixed and equal to $\upsilon^{(n)}$ and that if the content popularity is fixed and equal to $\upsilon^{(n+1)}$ are identical and shown by a solid circle in the figure. The dashed triangle, solid straight line, and dotted line illustrate the same objects or variables as in Fig. 3a, respectively. The effect of the $n$th replacement on the cache hit probability at the $(n+1)$th request at any SCP point in the state transition domain can be observed from Fig. 3b following the same method described in the preceding paragraph. In this example, the arrow at any point (except the stationary point) can be scaled such that it crosses the dotted line with the arrow head below the line. As a result, a replacement after request $n$ based on LP always increases the cache hit probability at the $(n+1)$th request (except at $\eta = [1, 0, 0]^T$). This example corresponds to the scenario of varying content popularity which drives $z^{(n)}$ along a somewhat straight path, as shown in Fig. 2c.

Fig. 3c shows the second example with LP. The content popularity $\upsilon^{(n)}$ is the same as in Fig. 3a and Fig. 3b, while $\upsilon^{(n+1)}$ = [0. , . , . ]^T. The solid and the hollow circles show the stationary states in the cases when the content popularity is fixed and equal to $\upsilon^{(n)}$ and $\upsilon^{(n+1)}$, respectively. At any SCP point, if the arrow can be scaled such that it crosses the dotted line from right to left, the $n$th replacement yields a smaller cache hit probability at the $(n+1)$th request compared with no replacement. By contrast, if the arrow can be scaled such that it crosses the dotted line from left to right, the $n$th replacement yields a larger cache hit probability at the $(n+1)$th request. In this example, a replacement after request $n$ based on LP may either increase or decrease the cache hit probability at the $(n+1)$th request. This example corresponds to the scenario of varying content popularity which leads to a randomly changing $z^{(n)}$, as shown in Fig. 2d.

[Fig. 4: Cache hit ratio under shot noise model. Panels: (a) Cache hit ratio versus $t_0^{\max}$; (b) Request instants for 40 out of 1000 contents when $t_0^{\max} = 250$; (c) Request instants for 40 out of 1000 contents in one round when $t_0^{\max} = 2500$.]

B. Cache Hit Ratio under Time-varying Content Popularity

In the second set of examples, the cache hit ratio of the considered cache replacement schemes under time-varying content popularity is demonstrated.

First, the cache hit ratio is demonstrated when the time-varying content popularity is generated using the shot noise model [15]. Specifically, the request for content $l$ follows a time-inhomogeneous Poisson process with the instantaneous rate at time $t$ given by:

$$y_l(t) = \begin{cases} A_l b_l \exp\left( -b_l (t - t_{l,0}) \right), & \text{if } t \geq t_{l,0} \\ 0, & \text{otherwise.} \end{cases} \tag{37}$$

Accordingly, requests for content $l$ start occurring from $t_{l,0}$. The parameter $A_l$ limits the maximum request rate of content $l$. For content $l$, an allocation of $A_l$ over time is given by an exponential distribution with rate parameter $b_l$. It follows that contents have different life-spans and entrance times. The entrance time $t_{l,0}$ is uniformly generated in $[0, t_0^{\max}]$, and $A_l$ is uniformly generated in $[A_l^{\min}, A_l^{\max}]$. For RR, we test two cases, $\phi$ = 0. and $\phi$ = 0. . A larger $\phi$ results in more frequent content replacements and higher sensitivity to the changes in the content popularity. Similarly, for LP, we test two cases, i.e., $\alpha$ = 0. and $\alpha$ = 0. .

In the first example with the shot noise model, the number of contents $N_c$ is set to 1000 and the cache size $L$ is set to . A duration of 5000 seconds from $t = 0$ to $t = 5000$ is considered. The parameters $A_l^{\min}$ and $A_l^{\max}$ are set to 10 and 1000, respectively. Fig. 4a shows the resulting cache hit ratio of the considered replacement schemes versus $t_0^{\max}$. Each data point in Fig. 4a is averaged over 200 rounds of simulations for the considered 5000-second duration. For LP and TLP, accurate prediction of content popularity is assumed. It can be seen from Fig. 4a that LP and TLP have a significant advantage over RR and LRU when $t_0^{\max}$ is small (i.e., $t_0^{\max} \leq$ ). However, RR and LRU are much better than LP and TLP when $t_0^{\max}$ becomes large.

The content request time instants for 40 out of the 1000 contents (specifically, the contents whose content ID is a multiple of 25) in the cases when $t_0^{\max} = 250$ and $t_0^{\max} = 2500$ are plotted in Figs. 4b and 4c, respectively. Colors are used to distinguish the requests for different contents. Each asterisk in Figs. 4b and 4c represents a request, with its x and y coordinates specifying the corresponding request time instant and the content ID, respectively. It can be seen from Figs. 4b and 4c that, when $t_0^{\max}$ becomes large, the set of available contents can vary significantly over time. This has two effects on the cache hit ratio. On the one hand, the cache hit ratio should increase because the number of simultaneously available contents can be smaller when $t_0^{\max}$ is large. On the other hand, due to the property of the instantaneous rate given by eq. (37), the maximum instantaneous request rate of any content occurs when the content just becomes available. It follows that the varying set of available contents when $t_0^{\max}$ is large can lead to frequent and abrupt changes of content popularity over time, as illustrated in Fig. 2d and Fig. 3c. Since LP and TLP exploit the content popularity information in a greedy manner (i.e., maximizing the cache hit ratio based on the current content popularity information), the second effect can hinder the cache hit ratio, and the combined impact of the above two effects yields an almost steady cache hit ratio of LP and TLP in Fig. 4a.

[Fig. 5: Cache hit ratio under shot noise model, short life-span. Panels: (a) Cache hit ratio versus $t_0^{\max}$; (b) Request instants for 40 out of 2000 contents in one round when $t_0^{\max} = 2500$.]
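The request process of a single content under the shot noise rate in eq. (37) can be sketched as follows. This is an illustrative sketch rather than the simulation code behind Fig. 4: it uses the standard thinning method for inhomogeneous Poisson processes, and the names `shot_noise_rate` and `sample_requests` as well as the parameter values are assumptions chosen for the example.

```python
import math
import random


def shot_noise_rate(t, A, b, t0):
    """Instantaneous request rate of eq. (37) for one content."""
    return A * b * math.exp(-b * (t - t0)) if t >= t0 else 0.0


def sample_requests(A, b, t0, horizon, rng):
    """Sample request instants in [t0, horizon] by thinning.

    Candidates are drawn from a homogeneous Poisson process at the peak
    rate A * b (reached at t = t0) and each candidate is kept with
    probability rate(t) / peak.
    """
    rate_max = A * b
    times = []
    t = t0
    while True:
        t += rng.expovariate(rate_max)  # next candidate instant
        if t > horizon:
            return times
        if rng.random() * rate_max <= shot_noise_rate(t, A, b, t0):
            times.append(t)


rng = random.Random(1)
requests = sample_requests(A=50.0, b=0.01, t0=100.0, horizon=5000.0, rng=rng)
print(len(requests), "requests; first few:", [round(x, 1) for x in requests[:3]])
```

Integrating eq. (37) over $t \geq t_{l,0}$ gives $A_l$, so in this sketch $A_l$ is also the expected total number of requests for content $l$ over a long horizon, while $b_l$ controls how quickly the requests die out after the content appears.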
For RR and LRU, by contrast, the cache hit ratio increases with $t_0^{\max}$ as a result of the first effect, while the second effect has no significant impact because RR and LRU do not rely on the instantaneous content popularity information.

In the second example with the shot noise model, $N_c$ is increased from 1000 to 2000, and $A_l^{\max}$ and $A_l^{\min}$ are decreased from 1000 to 200 and from 10 to 1, respectively. The average content life-span also becomes shorter. Fig. 5a shows the resulting cache hit ratio versus $t_0^{\max}$, while the request time instants for 40 out of the 2000 contents when $t_0^{\max} = 2500$ are plotted in Fig. 5b. Comparing Fig. 5a with Fig. 4a, three observations can be made. First, the cache hit ratio in Fig. 5a becomes lower for all schemes when $t_0^{\max} = 0$, as a result of $N_c$ increasing to 2000. Second, the effect of $\phi$ and $\alpha$ on the performance of RR and LP, respectively, becomes obvious in Fig. 5a. This is because a larger $\phi$ or $\alpha$ allows for a faster adaptation to new content requests, which is important now that the number of requests for each content decreases significantly. Third, RR and LRU begin to outperform LP and TLP from a smaller $t_0^{\max}$ in Fig. 5a compared to that in Fig. 4a, and the performance gap between the two groups becomes larger. This is because the combination of active contents and their popularity varies even more rapidly compared with the case in Fig. 4a, as a result of a larger $N_c$ and shorter content life-span. The result in Fig. 5a shows that exploiting the instantaneous content popularity information in a content replacement scheme is not necessarily beneficial for increasing the cache hit ratio even if such information is predicted perfectly.

[Fig. 6: Cache hit ratio under time-inhomogeneous Poisson process represented by eq. (38). Panels: (a) Cache hit ratio versus $t_0^{\max}$; (b) Request instants for 40 out of 1000 contents in one round when $t_0^{\max} = 2500$.]
This is because the usefulness of the instantaneous content popularity information depends on how rapidly the content popularity changes. This example corresponds to the case illustrated in Fig. 2d.

Fig. 6 shows the cache hit ratio with a time-varying content popularity model different from eq. (37). Specifically, the request for content $l$ follows a time-inhomogeneous Poisson process with the instantaneous rate at time $t$ given by:

$$y_l(t) = \frac{A_l}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(t - t_{l,0})^2}{2\sigma^2} \right). \tag{38}$$

The parameter $t_{l,0}$ is no longer the entrance time of content $l$ in eq. (38). However, $t_{l,0}$ in both eq. (37) and eq. (38) corresponds to the time instant of the peak instantaneous request rate for content $l$. Similarly, $t_{l,0}$ is uniformly generated in $[0, t_0^{\max}]$, and $A_l$ is uniformly generated in $[A_l^{\min}, A_l^{\max}]$.

In this simulation, $N_c$ is set to 1000 and the cache size $L$ is set to . A duration of 5000 seconds from $t = 0$ to $t = 5000$ is considered. The parameters $A_l^{\min}$ and $A_l^{\max}$ are set to 1 and 50, respectively. Fig. 6a shows the resulting cache hit ratio of the considered replacement schemes versus $t_0^{\max}$, in which each data point is averaged over 200 rounds of simulations. Accurate prediction of content popularity is again assumed for LP and TLP. The request time instants for 40 out of the 1000 contents when $t_0^{\max} = 2500$ are plotted in Fig. 6b. It can be seen that Fig. 6a shows a very different result when compared with Fig. 4a or Fig. 5a. Specifically, LP and TLP always perform better than RR and LRU in Fig. 6a, and the performance gap between the two groups increases with $t_0^{\max}$. This is because, unlike the abrupt and frequent variations introduced by eq. (37), the instantaneous rate model in eq. (38) leads to smooth and gradual variations in the content popularity.
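The contrast between the abrupt onset of the shot noise rate and the gradual change of the bell-shaped rate can be checked numerically. The sketch below is an illustration under stated assumptions: it takes eq. (38) to be a Gaussian-shaped rate with standard deviation $\sigma$, the helper `largest_relative_step` is a hypothetical measure introduced here (largest change of the rate between adjacent grid instants, relative to the peak rate on the grid), and all parameter values are made up for the example.

```python
import math


def shot_noise_rate(t, A, b, t0):
    """Rate of eq. (37): jumps to its peak A * b at t0, then decays."""
    return A * b * math.exp(-b * (t - t0)) if t >= t0 else 0.0


def gaussian_rate(t, A, sigma, t0):
    """Rate of eq. (38), assumed Gaussian-shaped with its peak at t0."""
    scale = A / (math.sqrt(2.0 * math.pi) * sigma)
    return scale * math.exp(-((t - t0) ** 2) / (2.0 * sigma ** 2))


def largest_relative_step(rate, t_start, t_end, dt=1.0):
    """Largest |rate(t + dt) - rate(t)| on a grid, divided by the grid peak."""
    steps = int(round((t_end - t_start) / dt))
    values = [rate(t_start + i * dt) for i in range(steps + 1)]
    peak = max(values)
    if peak == 0.0:
        return 0.0
    return max(abs(hi - lo) for lo, hi in zip(values, values[1:])) / peak


abrupt = largest_relative_step(
    lambda t: shot_noise_rate(t, A=100.0, b=0.01, t0=100.0), 0.0, 1000.0)
smooth = largest_relative_step(
    lambda t: gaussian_rate(t, A=100.0, sigma=100.0, t0=500.0), 0.0, 1000.0)
print("shot noise:", abrupt, "bell-shaped:", smooth)
```

With these values the shot noise rate changes by its full peak within a single grid step, whereas the bell-shaped rate changes by well under one percent of its peak per step, which is the kind of smoothness a greedy scheme can exploit.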
With such smooth variations, the instantaneous content popularity at any instant can be close to the instantaneous content popularity for a number of subsequent requests. Therefore, the greedy maximization of the cache hit ratio based on the current content popularity information used by LP and TLP can benefit the cache hit ratio for both the immediate next request and subsequent requests. Consequently, LP and TLP outperform RR and LRU due to the exploitation of the instantaneous content popularity information in such a case. This example corresponds to the case illustrated in Fig. 2c.

VII. CONCLUSION

We have extended the study of dynamic caching via STF to the case of time-varying content popularity. In our analysis, we have focused on developing the model and methodology without assuming a specific pattern of change in content popularity. The results have demonstrated the impact of varying popularity on the STF and the performance of replacement schemes in the general case. Further extensions can be conducted by incorporating a specific model of time-varying content popularity. In our simulations, we have adopted different models of varying popularity, and the numerical results have been shown to be consistent with the observations from the analysis.

Through the two parts of this paper, we have provided a novel perspective and developed methods for studying cache replacement in the vector space of SCP using STF. It has been demonstrated that the design of replacement schemes is essentially the design of STF and that the knowledge of content popularity is beneficial only if exploited properly, depending on the pattern of change in the content popularity. As there are many open issues, especially in the case of time-varying content popularity, the results of this paper have been developed in the effort of inspiring the analysis or design of cache replacement schemes for various specific problems and scenarios.

APPENDIX

A. Proof of Lemma 1

Suppose that the LRU content at the $n$th request is content $q^\star$, and the most recent request for $q^\star$ is the $(n-w)$th request. It must hold that $w \geq L$, and all requests from the $(n-w+1)$th request to the $(n-1)$th request must be for contents $l \in \mathcal{C}_k \setminus \{q^\star\}$. Denote the $L-1$ contents in $\mathcal{C}_k \setminus \{q^\star\}$ as $k(1, \bar{q}^\star), \ldots, k(L-1, \bar{q}^\star)$. To allocate the total number of $w-1$ requests (i.e., from the $(n-w+1)$th request to the $(n-1)$th request) to the $L-1$ contents in $\mathcal{C}_k \setminus \{q^\star\}$, there are $P_w = \binom{w-2}{L-2}$ different allocations, without considering the order of requests, that guarantee at least one request for each content. Denote the number of requests for content $k(i, \bar{q}^\star)$ in the $j$th combination as $T(k, i, j, \bar{q}^\star)$, where $i \in \{1, \ldots, L-1\}$ and $j \in \{1, \ldots, P_w\}$. It follows that:

$$\sum_{i=1}^{L-1} T(k, i, j, \bar{q}^\star) = w - 1, \quad \forall j. \tag{39}$$

Then, considering the order of requests, the number of different ordered allocations is:

$$U_w = \sum_{j=1}^{P_w} \prod_{i=1}^{L-1} \binom{w - 1 - a_{k,i,j,\bar{q}^\star}}{T(k, i, j, \bar{q}^\star)}, \tag{40}$$

in which

$$a_{k,i,j,\bar{q}^\star} = \begin{cases} 0, & \text{if } i = 1 \\ \sum_{y=1}^{i-1} T(k, y, j, \bar{q}^\star), & \text{if } i \geq 2. \end{cases} \tag{41}$$

Denote the set of request instants for content $k(i, \bar{q}^\star)$ in the $u$th ordered combination as $\mathcal{T}(k, i, u, \bar{q}^\star)$, where $i \in \{1, \ldots, L-1\}$ and $u \in \{1, \ldots, U_w\}$. It follows that:

$$\bigcup_{i=1}^{L-1} \mathcal{T}(k, i, u, \bar{q}^\star) = \{n-1, \ldots, n-w+1\}, \quad \forall u. \tag{42}$$

Accordingly, the joint probability that the current state is $k$, content $q^\star = e(k, m) \in \mathcal{C}_k$ is the LRU content at the $n$th request, and the most recent request for the LRU content is the $(n-w)$th request is given by eq. (13). ∎

B. Proof of Lemma 2

The average cache hit probability from the 1st till the $n$th request is given by:

$$\gamma_{\mathrm{avg}} = \frac{1}{n} \sum_{t=1}^{n} \left( \upsilon^{(t)} \right)^T \lambda^{(t-1)} = \frac{1}{n} \sum_{t=1}^{n} \left( \upsilon^{(t)} \right)^T C_s \eta^{(t-1)}. \tag{43}$$

Using eq. (20) (and setting $n = 1$ and $N = t - 1$ in eq. (20)), it holds that:

$$\eta^{(t-1)} = \sum_{t'=0}^{t-2} u^{(t'+1)}(\eta^{(t')}) + \eta^{(0)} \tag{44}$$

for any $t \geq 2$. Substituting eq. (44) into eq. (43), it holds that:

$$\begin{aligned} \gamma_{\mathrm{avg}} &= \frac{1}{n} \sum_{t=2}^{n} \left( \upsilon^{(t)} \right)^T C_s \left( \sum_{t'=0}^{t-2} u^{(t'+1)}(\eta^{(t')}) + \eta^{(0)} \right) + \frac{1}{n} \left( \upsilon^{(1)} \right)^T C_s \eta^{(0)} \\ &= \frac{1}{n} \sum_{t=2}^{n} \left( \upsilon^{(t)} \right)^T C_s \left( \sum_{t'=0}^{t-2} u^{(t'+1)}(\eta^{(t')}) \right) + \frac{1}{n} \sum_{t=1}^{n} \left( \upsilon^{(t)} \right)^T C_s \eta^{(0)} \\ &= \frac{1}{n} \sum_{t=2}^{n} \left( \upsilon^{(t)} \right)^T C_s \left( \sum_{t'=0}^{t-2} u^{(t'+1)}(\eta^{(t')}) \right) + \left( \frac{1}{n} \sum_{t=1}^{n} \upsilon^{(t)} \right)^T C_s \eta^{(0)}, \end{aligned} \tag{45}$$

which leads to eq. (21). ∎

C. Proof of Theorem 1

Since $\bar{\upsilon}_l$ is defined such that $\eta^{(n-1)}$ would be the steady state if the content request probabilities were time-invariant and equal to $\{\bar{\upsilon}_l\}$, it follows that:

$$\sum_{l \in \mathcal{C}} \bar{\upsilon}_l \Theta_l \eta^{(n-1)} = \eta^{(n-1)}. \tag{46}$$

Based on eq. (24) and eq. (46), it holds that:

$$\begin{aligned} \eta^{(n)} - \eta^{(n-1)} &= \sum_{l \in \mathcal{C}} \left( \upsilon_l^{(n)} - \bar{\upsilon}_l \right) \Theta_l \eta^{(n-1)} \\ &= \sum_{l \in \mathcal{C}} \left( \upsilon_l^{(n)} - \bar{\upsilon}_l \right) \left( \eta^{(n-1)} + u_l^{(n)} \right) \\ &= \sum_{l \in \mathcal{C}} \left( \upsilon_l^{(n)} - \bar{\upsilon}_l \right) u_l^{(n)}, \end{aligned} \tag{47}$$

where the last equality uses the fact that $\sum_{l \in \mathcal{C}} \left( \upsilon_l^{(n)} - \bar{\upsilon}_l \right) = 0$.

Substituting the above equation into eq. (32) gives

$$d_\gamma^{(n+1)} = \left( \upsilon^{(n+1)} \right)^T C_s \sum_{l \in \mathcal{C}} \left( \upsilon_l^{(n)} - \bar{\upsilon}_l \right) u_l^{(n)}. \tag{48}$$

Rearranging the above equation using eq. (34) leads to eq. (33). ∎

D. Proof of Theorem 2

The proof is based on the equality $d_\gamma^{(n+1)} = \left( \upsilon^{(n+1)} \right)^T C_s u^{(n)}$ in eq. (32). The elements of the $N_c \times 1$ vector $C_s u^{(n)}$ are the changes in the caching probabilities of the $N_c$ contents after the $n$th request and replacement. It is straightforward to see that the upper and lower bounds of $d_\gamma^{(n+1)}$ are decided by the maximum and minimum elements of $C_s u^{(n)}$, respectively.

Given that contents are sorted based on their popularity at the $n$th request, the maximum element of $C_s u^{(n)}$ for all cases but TLP-P corresponds to the case when content 1 is requested while it is being cached with probability zero. Using eq. (24) and eqs. (25)-(30), it can be seen that the maximum element of $C_s u^{(n)}$ is $L \phi \max_l \{\upsilon_l^{(n)}\}$, $\alpha \max_l \{\upsilon_l^{(n)}\}$, $\max_l \{\upsilon_l^{(n)}\}$, and $\max_l \{\upsilon_l^{(n)}\}$ for RR, LP, TLP-A, and LRU, respectively. For the case of TLP-P, it holds that

$$\hat{d}_\gamma^{(n+1)} \leq \max_l \{\upsilon_l^{(n)}\} \cdot \max_l \{\tilde{\upsilon}_l^{(n+1)} - \tilde{\upsilon}_{N_c}^{(n+1)}\}. \tag{49}$$

For all cases but TLP-P, the minimum of $C_s u^{(n)}$ corresponds to the following scenario: i) the state with the $L$ least popular contents is being cached with probability 1; and ii) a content not in the cache is requested. The change in the SCP of this state in the described scenario gives the minimum of $C_s u^{(n)}$.

For RR, the change in the above SCP is given by

$$\check{d}_\gamma^{(n+1)} = -\phi \sum_{l=1}^{N_c - L} \upsilon_l^{(n)} \geq -\phi, \tag{50}$$

where the inequality is based on the approximation that the summation of request probabilities of all but the $L$ least popular contents should be close to 1.

For LP, the change is given by

$$\check{d}_\gamma^{(n+1)} = -\alpha \sum_{l=1}^{N_c - L} \upsilon_l^{(n)} \frac{\tilde{\upsilon}_l^{(n+1)} - \tilde{\upsilon}_{N_c}^{(n+1)}}{\sum_{q = N_c - L + 1}^{N_c} \left( \tilde{\upsilon}_l^{(n+1)} - \tilde{\upsilon}_q^{(n+1)} \right)} \geq -\alpha \sum_{l=1}^{N_c - L} \upsilon_l^{(n)} \geq -\alpha. \tag{51}$$

For both TLP-A and LRU, the state will change as long as the requested content is not in the cache. Therefore, the aforementioned change is given by

$$\check{d}_\gamma^{(n+1)} = -\sum_{l=1}^{N_c - L} \upsilon_l^{(n)} \geq -1. \tag{52}$$

For TLP-P, assuming that the $L$ least popular contents at the $n$th request remain the least popular at the $(n+1)$th request, the change is given by

$$\check{d}_\gamma^{(n+1)} = -\sum_{l=1}^{N_c - L} \upsilon_l^{(n)} \left( \tilde{\upsilon}_l^{(n+1)} - \tilde{\upsilon}_{N_c}^{(n+1)} \right) \geq -\sum_{l=1}^{N_c - L} \upsilon_l^{(n)} \tilde{\upsilon}_l^{(n+1)} \geq -\sum_{l=1}^{N_c} \upsilon_l^{(n)} \tilde{\upsilon}_l^{(n+1)}. \tag{53}$$

This completes the proof of Theorem 2. ∎

REFERENCES

[1] X. Wang, M. Chen, T. Taleb, A. Ksentini, and V. C. M. Leung, “Cache in the air: exploiting content caching and delivery techniques for 5G systems,” IEEE Commun. Mag., vol. 52, no. 2, pp. 131–139, Feb. 2014.
[2] S. Zhang, W. Quan, J. Li, W. Shi, P. Yang, and X. Shen, “Air-Ground Integrated Vehicular Network Slicing With Content Pushing and Caching,” IEEE J. Sel. Areas Commun., vol. 36, no. 9, pp. 2114–2127, Sept. 2018.
[3] J. Gao, L. Zhao, and L. Sun, “Probabilistic Caching as Mixed Strategies in Spatially-Coupled Edge Caching,” in Proc. 29th Biennial Symp. Commun., Toronto, Canada, 2018.
[4] S. Müller, O. Atan, M. van der Schaar, and A. Klein, “Context-Aware Proactive Content Caching With Service Differentiation in Wireless Networks,” IEEE Trans. Wireless Commun., vol. 16, no. 2, pp. 1024–1036, Feb. 2017.
[5] P. Yang, N. Zhang, S. Zhang, L. Yu, J. Zhang, and X. Shen, “Content Popularity Prediction Towards Location-Aware Mobile Edge Caching,” IEEE Trans. Multimedia, vol. 21, no. 4, pp. 915–929, Apr. 2019.
[6] J. Gao, S. Zhang, L. Zhao, and X. Shen, “The Design of Dynamic Probabilistic Caching with Time-Varying Content Popularity,” submitted to IEEE Trans. Mobile Comput., under review.
[7] K. Li, C. Yang, Z. Chen, and M. Tao, “Optimization and Analysis of Probabilistic Caching in N-Tier Heterogeneous Networks,” IEEE Trans. Wireless Commun., vol. 17, no. 2, pp. 1283–1297, Feb. 2018.
[8] G. Paschos, E. Bastug, I. Land, G. Caire, and M. Debbah, “Wireless Caching: Technical Misconceptions and Business Barriers,” IEEE Commun. Mag., vol. 54, no. 8, pp. 16–22, Aug. 2016.
[9] G. S. Paschos, G. Iosifidis, M. Tao, D. Towsley, and G. Caire, “The Role of Caching in Future Communication Systems and Networks,” IEEE J. Sel. Areas Commun., vol. 36, no. 6, pp. 1111–1125, June 2018.
[10] S. M. Azimi, O. Simeone, A. Sengupta, and R. Tandon, “Online Edge Caching and Wireless Delivery in Fog-Aided Networks With Dynamic Content Popularity,” IEEE J. Sel. Areas Commun., vol. 36, no. 6, pp. 1189–1202, June 2018.
[11] S. Jin and A. Bestavros, “Temporal Locality in Web Request Streams: Sources, Characteristics, and Caching Implications,” Technical Report, BUCS-1999-014, Boston, USA, Oct. 1999.
[12] M. Busari and C. Williamson, “On the Sensitivity of Web Proxy Cache Performance to Workload Characteristics,” in Proc. IEEE INFOCOM, Anchorage, USA, 2001, pp. 1225–1234.
[13] Y. Zhou, L. Chen, C. Yang, and D. M. Chiu, “Video Popularity Dynamics and Its Implication for Replication,” IEEE Trans. Multimedia, vol. 17, no. 8, pp. 1273–1285, Aug. 2015.
[14] V. Phalke and B. Gopinath, “An Inter-Reference Gap Model for Temporal Locality in Program Behavior,” in Proc. ACM SIGMETRICS/PERFORMANCE Conf., Ottawa, Canada, May 1995, pp. 291–300.
[15] S. Traverso, M. Ahmed, M. Garetto, P. Giaccone, E. Leonardi, and S. Niccolini, “Temporal Locality in Today’s Content Caching: Why It Matters and How to Model It,” ACM SIGCOMM Comput. Commun. Rev., vol. 43, no. 5, pp. 5–12, Nov. 2013.
[16] S. Traverso, M. Ahmed, M. Garetto, P. Giaccone, E. Leonardi, and S. Niccolini, “Unravelling the Impact of Temporal and Geographical Locality in Content Caching Systems,” IEEE Trans. Multimedia, vol. 17, no. 10, pp. 1839–1854, Oct. 2015.
[17] M. Garetto, E. Leonardi, and V. Martina, “A Unified Approach to the Performance Analysis of Caching Systems,” ACM Trans. Model. Perform. Eval. Comput. Syst., vol. 1, no. 3, May 2016.
[18] M. Garetto, E. Leonardi, and S. Traverso, “Efficient Analysis of Caching Strategies under Dynamic Content Popularity,” in Proc. IEEE INFOCOM, Kowloon, China, Apr./May 2015, pp. 2263–2271.
[19] A. Jaleel, K. B. Theobald, S. C. Steely Jr., and J. Emer, “High Performance Cache Replacement Using Re-reference Interval Prediction (RRIP),” ACM SIGARCH Comput. Archit. News, vol. 38, no. 3, pp. 60–71, June 2010.
[20] S. Li, J. Xu, M. van der Schaar, and W. Li, “Popularity-driven content caching,” in Proc. IEEE INFOCOM, San Francisco, USA, 2016.
[21] N. Zhang, K. Zheng, and M. Tao, “Using Grouped Linear Prediction and Accelerated Reinforcement Learning for Online Content Caching,” in Proc. IEEE ICC Workshops, Kansas City, USA, May 2018.
[22] A. Sadeghi, F. Sheikholeslami, and G. B. Giannakis, “Optimal and Scalable Caching for 5G Using Reinforcement Learning of Space-Time Popularities,” IEEE J. Sel. Topics Signal Process., vol. 12, no. 1, pp. 180–190, Feb. 2018.
[23] D. Applegate, A. Archer, V. Gopalakrishnan, S. Lee, and K. K. Ramakrishnan, “Optimal Content Placement for a Large-Scale VoD System,” IEEE/ACM Trans. Netw., vol. 24, no. 4, pp. 2114–2127, Aug. 2016.
[24] B. N. Bharath, K. G. Nagananda, D. Gündüz, and H. V. Poor, “Caching With Time-Varying Popularity Profiles: A Learning-Theoretic Perspective,” IEEE Trans. Commun., vol. 66, no. 9, pp. 3837–3847, Sept. 2018.
[25] J. Gao, L. Zhao, and X. Shen, “The Study of Caching via State Transition Field - the Case of Time-Invariant Popularity”,