On Optimal Proactive and Retention-Aware Caching with User Mobility

Abstract

Caching popular contents at edge devices is an effective solution to alleviate the burden of the backhaul networks. Earlier investigations commonly neglected the storage cost in caching. More recently, retention-aware caching, where both the downloading cost and storage cost are accounted for, is attracting attention. Motivated by this, we address proactive and retention-aware caching with user mobility, optimizing the sum of the two types of costs. This is a combinatorial optimization problem. However, we derive a stream of analytical results and they together lead to an algorithm that guarantees global optimum with polynomial-time complexity. Numerical results show significant improvements in comparison to popular caching and random caching.


arXiv [cs.DC], Jan

Ghafour Ahani and Di Yuan

Department of Information Technology, Uppsala University, Sweden
Emails: {ghafour.ahani, di.yuan}@it.uu.se

Abstract: Caching popular contents at edge devices is an effective solution to alleviate the burden of the backhaul networks. Earlier investigations commonly neglected the storage cost in caching. More recently, retention-aware caching, where both the downloading cost and storage cost are accounted for, is attracting attention. Motivated by this, we address the proactive and retention-aware caching problem with the presence of user mobility, optimizing the sum of the two types of costs. More precisely, a cost-optimal caching problem for vehicle-to-vehicle networks is formulated with joint consideration of the impact of the number of vehicles, cache size, storage cost, and content request probability. This is a combinatorial optimization problem. However, we derive a stream of analytical results and they together lead to an algorithm that guarantees global optimum with polynomial-time complexity. Numerical results show significant improvements in comparison to popular caching and random caching.

Index Terms: Caching, retention time, storage cost, mobility.

I. INTRODUCTION

The explosive growth of mobile data traffic is putting a heavy burden on backhaul links, causing delays in downloading contents. However, a large portion of the mobile traffic is due to duplicate downloads of a few popular contents. Caching technology has been considered as an effective solution to reduce the burden of the backhaul, by storing the contents on edge devices [1]. This enables the mobile users to download their requested contents from nearby devices instead of downloading them from the core network. In modeling such scenarios, most research efforts have focused on the downloading cost. However, storing a content may be subject to a cost as well. A storage cost may be due to the flash rental cost incurred by cloud service providers or the flash damage caused by writing a content to the memory device [2]. In both cases, the storage cost typically depends on the time duration of storage, hereafter referred to as the retention time. Intuitively, with a longer retention time, the requested contents can be obtained with higher probability from the cache, thus avoiding the cost of downloading from the network. But a longer retention time results in a higher storage cost. Therefore, what to cache and for how long are both key aspects in optimal caching.

Few works studying optimal caching have considered the impact of storage cost. Works such as [3]–[5] considered only the downloading cost. The study in [3] proposed an approximation algorithm with performance guarantee for multicast-aware proactive caching. The authors in [4] considered cost-optimal caching with user mobility. They presented an extension in [5] by providing a linear lower bound of the objective function. In these works, storage cost was neglected. The study in [6] chose to represent storage cost using a random variable. Later, the work in [7] suggested that the storage cost can be better modeled by an increasing linear/convex function. Another limitation of [6] is that the retention time is fixed.
Later, this assumption was relaxed in [8] and the retention time was treated as an optimization variable in a time-slotted system. We remark that in [8], a user is associated with only one cache. A generalization of a multiple-path routing model with retention-aware caching was studied in [9].

For mobility scenarios, contents are often cached at mobile devices. They can exchange the requested contents when they move into the communication range of each other. Making the best of mobility information between mobile users can significantly improve the caching efficiency [4], [10], [11]. However, considering both downloading and storage costs, with presence of user mobility, calls for further research. To this end, our main contributions are as follows.

• We formulate a Proactive Retention-Aware Caching Optimization (PRACO) problem with user mobility in a time-slotted system, taking into account both the downloading and storage costs.
• This problem is a combinatorial optimization problem by nature. However, we provide mathematical analysis in order to facilitate the time-efficient computation of the global optimum, namely,
  – we first prove that for any content, the optimal caching decisions over the time slots can be derived, given the initial number of mobile users caching the content;
  – the above analysis is then embedded into a dynamic programming algorithm and we prove it is both globally optimal and of polynomial-time complexity.
• The numerical results show significant improvements in comparison to two conventional algorithms, namely, popular caching and random caching.

II. SYSTEM MODEL AND PROBLEM FORMULATION

A. System Model

We consider a vehicular network scenario which consists of a content server having all the contents, a number of vehicles, and road side units (RSUs) providing signal coverage for the vehicles. Denote by $R$ the number of vehicles that are interested in requesting contents, referred to as requesters, whose index set is $\mathcal{R} = \{1, 2, \ldots, R\}$. Denote by $H$ the number of vehicles that we call helpers. Each helper is equipped with a cache of size $s$ and can supply the requesters with contents from its cache, thereby mitigating backhaul congestion. We consider a library of $C$ contents, whose index set is $\mathcal{C} = \{1, 2, \ldots, C\}$. The sizes of all the contents are the same and are assumed to be one. In addition, each content is either fully stored or not stored at all at a helper. Figure 1 shows the system scenario.

[Figure 1 labels: Helper, Requester, RSU, Server]

Figure 1. System scenario.

The event that vehicles move into the transmission range of each other is called a contact, during which communication between them can occur. We assume that the contacts between any two vehicles follow a Poisson distribution. This choice is supported by the analysis of real-world vehicle mobility traces, which shows that the tail behavior of the inter-contact time distribution can be characterized as an exponential distribution, see [12]. Here, we assume a homogeneous contact rate for all the vehicles, denoted by $\lambda$. This assumption is in fact common [13], [14]. As a consequence, it is not necessary to explicitly consider the content cached by each helper, as there is no difference between the helpers from the perspective of the requesters. Thus, the caching performance is fully determined by the number of helpers for each content. Moreover, it is obvious that a content is cached by no more than $H$ helpers. Therefore, in modeling cache capacity, it is sufficient to require that the total number of the cached contents of all the helpers does not exceed $S = sH$.

We consider a time-slotted system where each time slot is of duration $\delta$. The time period subject to optimization consists of $T$ time slots. In each slot, all the requesters are active and ask for some content. Also, no requester becomes a helper in later slots or vice versa. Each requester has its own content request probabilities, whose distribution is independent of the time slot. The probability that content $c$ is requested by requester $r$ is denoted by $w_{rc}$, with $\sum_{c \in \mathcal{C}} w_{rc} = 1$. The contents, if cached by the helpers, are fetched at the beginning of the time period. When a requester asks for a content in a slot, the requester will first try to collect the content from the encountered helpers. If the requester fails to obtain the content by the end of this slot, it downloads the content from the server. In the latter case, a downloading cost is incurred.

B. Cost Model

Denote by $x$ our caching decision, which is a $C \times T$ matrix for the $C$ contents and the $T$ slots. The entry at location $(c, t)$, i.e., $x_{ct}$, denotes the number of helpers storing content $c$ in slot $t$, $x_{ct} \in \{0, 1, \ldots, H\}$. The caching optimization process, applied at the beginning of the time period, will determine the number of helpers for each content as well as the retention time. The latter is represented by the number of helpers over the time slots, and this number either remains or decreases from one time slot to the next.

Note: the time duration of a slot is different from that in LTE. Here, the order of magnitude in performance evaluations is an hour.

Downloading a content from the server results in a downloading cost. Also, caching a content in a helper has a storage cost due to storage rental cost and flash memory damage. As in [8] and [9], we neglect the cost of the helpers to fill their caches at the beginning of the time period. In addition, for the requesters, the downloading cost from helpers is negligible in comparison to that from the server. Therefore, the total cost consists of the downloading cost from the server for the requesters and the storage cost for the helpers.

Denote by $f(t)$ the storage cost due to storing a content in a helper's cache in slot $t$. A longer retention time needs a higher threshold voltage, which results in higher memory damage and consequently a higher storage cost; for a more detailed discussion, see [2]. Motivated by this, we assume that $f(t)$ is an increasing function.

When content $c$ is requested by requester $r$ in slot $t$, the probability that the requester has to download the content from the server is denoted by $p_{crt}$. If $r$ does not meet any helper having $c$ within the slot, the only way of obtaining $c$ is to download from the server. As the contacts between the users follow a Poisson distribution, $p_{crt}$ is given by:

$$p_{crt} = e^{-x_{ct} \lambda \delta}$$

Thus, the total cost, denoted by $\mathrm{Cost}(x)$, reads as:

$$\mathrm{Cost}(x) = \underbrace{\sum_{r \in \mathcal{R}} \sum_{c \in \mathcal{C}} \sum_{t=1}^{T} w_{rc}\, p_{crt}}_{\text{downloading cost}} + \alpha \underbrace{\sum_{c \in \mathcal{C}} \sum_{t=1}^{T} f(t)\, x_{ct}}_{\text{storage cost}}, \tag{1}$$

where $\alpha$ is the weighting factor of the two cost types.

C. Problem Formulation
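The objective minimized in this formulation is the cost in (1). As a minimal sketch, assuming a plain list-of-lists representation of $x$ and $w$ with 0-based indices, it could be evaluated as follows. The function name `total_cost` and its argument names are illustrative and do not come from the paper.

```python
import math


def total_cost(x, w, lam, delta, alpha, f):
    """Evaluate Cost(x) as in (1).

    x[c][t] : number of helpers storing content c in slot t (0-based indices)
    w[r][c] : probability that requester r asks for content c
    lam     : homogeneous contact rate (lambda)
    delta   : slot duration
    alpha   : weighting factor between the two cost types
    f       : storage cost as a function of the 1-based slot index
    """
    num_contents = len(x)
    num_slots = len(x[0])
    download = 0.0
    storage = 0.0
    for c in range(num_contents):
        # Sum of w_rc over all requesters for this content.
        demand = sum(w_r[c] for w_r in w)
        for t in range(num_slots):
            # p_crt: probability of meeting no helper that holds c in the slot.
            miss = math.exp(-x[c][t] * lam * delta)
            download += demand * miss
            storage += f(t + 1) * x[c][t]
    return download + alpha * storage
```

With no caching at all, every request in every slot is served by the server, so the cost reduces to $R \cdot T$ because each requester's probabilities sum to one.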

The proactive retention-aware caching optimization (PRACO) problem can be formulated as (2).

$$\min_{x} \ \mathrm{Cost}(x) \tag{2a}$$

$$\text{s.t.} \ \sum_{c \in \mathcal{C}} x_{ct} \le S, \ \forall t \in \{1, 2, \ldots, T\} \tag{2b}$$

$$x_{ct} \in \{0, 1, 2, \ldots, H\}, \ c \in \mathcal{C}, \ t \in \{1, 2, \ldots, T\} \tag{2c}$$

Constraint (2b) guarantees that the total number of stored contents in each time slot does not exceed the total cache capacity, i.e., $S$. The formulation does not explicitly require that the number of helpers of any content does not increase over time. This aspect is analyzed later on in Section III.

III. PROBLEM ANALYSIS

We prove that for each content, the optimal number of helpers caching the content decreases over the time slots. Next, we present an algorithm, which, with respect to the possible initial numbers of helpers of a content, computes the optimal number of helpers for this content over time. We then prove the algorithm's optimality and its polynomial-time complexity.

Lemma 1. For any content $c$ and time slot $t$, if $k$ helpers minimize the total cost for $t$, then for any $t' > t$, the total cost of using $k' > k$ helpers is higher than that of using $k$ helpers.

Proof. The total cost for content $c$ and time slot $t$ is:

$$\Delta(x_{ct}) = \sum_{r \in \mathcal{R}} w_{rc} e^{-x_{ct} \lambda \delta} + \alpha f(t) x_{ct} \tag{3}$$

As the minimum of $\Delta(x_{ct})$ occurs for $x_{ct} = k$, we have:

$$\sum_{r \in \mathcal{R}} w_{rc} e^{-k \lambda \delta} - \sum_{r \in \mathcal{R}} w_{rc} e^{-x'_{ct} \lambda \delta} < \alpha f(t) (x'_{ct} - k), \tag{4}$$

for $x'_{ct} > k$. Also, $f(t)$ is an increasing function. Thus:

$$\alpha f(t) (x'_{ct'} - k) < \alpha f(t') (x'_{ct'} - k), \tag{5}$$

for $x'_{ct'} > k$. Since (4) holds for any number of helpers larger than $k$, from (4) and (5) we obtain:

$$\sum_{r \in \mathcal{R}} w_{rc} e^{-k \lambda \delta} - \sum_{r \in \mathcal{R}} w_{rc} e^{-x'_{ct'} \lambda \delta} < \alpha f(t') (x'_{ct'} - k), \tag{6}$$

for $x'_{ct'} > k$. By rearranging the terms of (6), we have $\Delta(x_{ct'}) < \Delta(x'_{ct'})$ for $x_{ct'} = k$ and $x'_{ct'} > k$.

Next, we prove that for a content, if the initial number of helpers is given, the optimum can be obtained in polynomial time. The procedure is in Algorithm 1, see Lines 6 and 7.

Algorithm 1: Optimization for given initial conditions
Input: $C$, $H$, $T$
Output: $z$
1: for $k = 1$ to $C$ do
2:     $z_k \leftarrow [0]_{(H+1)}$
3:     for $x_{k1} = 0$ to $H$ do
4:         $x^*_{k1} \leftarrow x_{k1}$
5:         $z_k(x_{k1}) \leftarrow \Delta(x_{k1})$
6:         for $t = 2$ to $T$ do
7:             $x^*_{kt} \leftarrow \arg\min_{x_{kt} \in \{0, 1, \ldots, x^*_{k(t-1)}\}} \Delta(x_{kt})$
8:             $z_k(x_{k1}) \leftarrow z_k(x_{k1}) + \min_{x_{kt} \in \{0, 1, \ldots, x^*_{k(t-1)}\}} \Delta(x_{kt})$
9: return $z$

In the algorithm, for each content $k \in \mathcal{C}$, a vector $z_k$ of size $(H+1)$ is used to store the optimal cost for all possible initial numbers of helpers. Line 3 considers all possible initial numbers of helpers in the range $[0, H]$. Lines 6 and 7 compute the optimal values of $x_{c2}, \ldots, x_{cT}$, denoted by $x^*_{c2}, \ldots, x^*_{cT}$ respectively, for given $x_{c1}$. The computation is of complexity $O(HTR)$. For any content $c$ and any possible initial number of helpers $h$, the optimal cost over all the slots is saved in $z_c(h)$. The overall complexity of Algorithm 1 is $O(\max\{CH^2T, CTHR\})$, given $\Delta$ computed.

Note that Lines 6 and 7 are greedy by construction. Namely, for time slot $t$, Line 7 determines the number of helpers that minimizes the cost of that specific time slot. Even though this is intuitive, it is not obvious that the greedy choice is globally optimal for the given initial number of helpers. The optimality analysis is formalized in Lemma 2.

Lemma 2.

For any content $c \in \mathcal{C}$, if $x_{c1}$ is given, then the optimal values of $x_{ct}$, i.e., $x^*_{ct}$, $t \in \{2, 3, \ldots, T\}$, are computed via Lines 6 and 7 in Algorithm 1 in polynomial time.

Proof. For any $c$, consider $x^*_{c2}, \ldots, x^*_{cT}$, the numbers of helpers over the time slots returned by Algorithm 1 when $x_{c1}$ is given. Consider another sequence $x'_{c2}, \ldots, x'_{cT}$ that differs from the first sequence and offers a lower total cost.

First consider the case that $x^*_{ct} < x'_{ct}$ for some $t \in \{2, \ldots, T\}$, and $x^*_{ct}$ remains smaller than the values of the second sequence in consecutive time slots until time slot $t + n$, where $0 \le n \le T - t$. That is, the second sequence has elements $x'_{ct}, x'_{c(t+1)}, \ldots, x'_{c(t+n)}$ all being greater than $x^*_{ct}$, whereas for slot $t + n + 1$, $x'_{c(t+n+1)} \le x^*_{ct}$. Consider changing all of $x'_{ct}, x'_{c(t+1)}, \ldots, x'_{c(t+n)}$ to $x^*_{ct}$ in sequence two, while keeping the values of all other time slots of this sequence. The updated sequence is feasible because $x'_{c(t+n+1)} \le x^*_{ct}$. Thus, monotonicity remains for the updated sequence. The update reduces the cost of the second sequence by Lemma 1, hence a contradiction. A special case is $t + n = T$, for which $t + n + 1$ does not exist. However, the same update and conclusion apply.

One case remains, namely there is no time slot $t$ with $x^*_{ct} < x'_{ct}$, yet sequence two is different from sequence one. In other words, $x^*_{c2} \ge x'_{c2}, \ldots, x^*_{cT} \ge x'_{cT}$. Let $t$, $t \in \{2, \ldots, T\}$, be the first time slot with strict inequality, i.e., $x^*_{ct} > x'_{ct}$. Such a time slot must exist, because otherwise the two sequences coincide. Consider increasing $x'_{ct}$ to $x^*_{ct}$. Sequence two remains feasible in terms of being monotonically decreasing, because $x'_{ct} \le x'_{c(t-1)}$ after setting $x'_{ct}$ to $x^*_{ct}$, as $x^*_{ct} \le x^*_{c(t-1)} = x'_{c(t-1)}$. The cost of $t$, due to the update, becomes lower because when $t$ is considered by the algorithm, $x^*_{ct}$ is the optimum. Therefore in this case the second sequence cannot be better either. Hence the result.

By Algorithm 1, $x^*_{ct}$, $t = 2, 3, \ldots, T$, can be computed if $x_{c1}$, $c \in \mathcal{C}$, is given. Consequently, solving PRACO is equivalent to finding the optimal values of $x_{c1}$, $c \in \mathcal{C}$. We drop the second subscript and use $x_c$, $c \in \mathcal{C}$, as optimization variables for the initial numbers of helpers, and reformulate PRACO as follows. The cost of $x_c$, i.e., $z_c(x_c)$, is from Algorithm 1. Constraint (7b) models the cache capacity.

$$\min_{x_c} \ \sum_{c \in \mathcal{C}} z_c(x_c) \tag{7a}$$

$$\text{s.t.} \ \sum_{c \in \mathcal{C}} x_c \le S \tag{7b}$$

$$x_c \in \{0, 1, \ldots, H\}, \ c \in \mathcal{C} \tag{7c}$$

IV. THE OVERALL ALGORITHM AND OPTIMALITY

A. Dynamic Programming
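The DP in this section takes the per-content costs $z_c(\cdot)$ produced by Algorithm 1 as input. A minimal Python sketch of that greedy computation for a single content is given here, under the assumption that the aggregate demand $\sum_{r \in \mathcal{R}} w_{rc}$ is passed in as `demand`. The function names are hypothetical, and the code is an illustration rather than the authors' implementation.

```python
import math


def per_slot_cost(x, demand, t, lam, delta, alpha, f):
    """Delta(x_ct) as in (3): expected download cost plus storage cost in slot t."""
    return demand * math.exp(-x * lam * delta) + alpha * f(t) * x


def greedy_costs(demand, H, T, lam, delta, alpha, f):
    """For one content, return (z, plans).

    z[h]     : total cost over the T slots when h helpers cache the content in slot 1
    plans[h] : the helper count chosen for each slot, starting from h
    """
    z, plans = [], []
    for h in range(H + 1):
        plan = [h]
        cost = per_slot_cost(h, demand, 1, lam, delta, alpha, f)
        for t in range(2, T + 1):
            # Greedy step: cheapest count for slot t that does not exceed
            # the count used in the previous slot.
            best = min(
                range(plan[-1] + 1),
                key=lambda x: per_slot_cost(x, demand, t, lam, delta, alpha, f),
            )
            plan.append(best)
            cost += per_slot_cost(best, demand, t, lam, delta, alpha, f)
        z.append(cost)
        plans.append(plan)
    return z, plans
```

Calling `greedy_costs` once per content yields one row of costs per content, which is the form of input the DP expects.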

We use dynamic programming (DP) to obtain the optimal values of $x_c$, $c \in \mathcal{C}$. Denote by $a^*(k, i)$ the cost of optimal caching of the first $k$ contents with a total cache capacity of $i$ units. Thus, by definition, $a^*(C, S)$ is the overall optimal cost. The values of $a^*(k, i)$, $k = 1, \ldots, C$, $i = 0, \ldots, S$, obey a recursion, as formalized in the lemma below.

Lemma 3. The value of $a^*(k, i)$ can be derived from the recursive formula shown in (9) for $k = 2, \ldots, C$, with:

$$a^*(1, i) = \min_{x_1 \in \{0, 1, \ldots, \min\{i, H\}\}} z_1(x_1) \tag{8}$$

$$a^*(k, i) = \min_{x_k \in \{0, 1, \ldots, \min\{i, H\}\}} \{ z_k(x_k) + a^*(k-1, i - x_k) \} \tag{9}$$

Proof. We use induction. For $k = 1$, the result is obvious for any $i = 0, 1, \ldots, H$. Suppose that $a^*(k, i)$ is the optimal value for some $k$, with $i$ in any range of interest. By (9), we have:

$$a^*(k+1, i) = \min_{x_{k+1} \in \{0, 1, \ldots, \min\{i, H\}\}} \{ z_{k+1}(x_{k+1}) + a^*(k, i - x_{k+1}) \}$$

For $k + 1$, the initial number of helpers $x_{k+1}$ must be one of the values in $\{0, 1, \ldots, \min\{i, H\}\}$. For any value of $x_{k+1}$, $z_{k+1}(x_{k+1})$ is the optimal cost for content $k + 1$ (Lemma 2), and the corresponding cache capacity for contents up to $k$ is $i - x_{k+1}$. For the latter, $a^*(k, i - x_{k+1})$ is optimal. These, together with the min-operator, give the optimum for $k + 1$.

B. Algorithm Description and Optimality
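The recursion (8)-(9), together with the backtracking step that recovers the solution, can be sketched in Python as below. This is an illustrative version, not the authors' code: it assumes `z[k][x]` holds the cost of content `k` (0-based) with `x` initial helpers, assumes at least one content, and fills the whole table for every content, whereas Algorithm 2 evaluates only $a^*(C, S)$ for the last one.

```python
def solve_dp(z, S, H):
    """Return (optimal cost, initial helper count per content).

    z[k][x] : cost of content k when x helpers cache it initially, x = 0..H
    S       : total cache capacity (s * H in the paper's notation)
    H       : number of helpers
    """
    C = len(z)
    inf = float("inf")
    # a[k][i]: best cost of contents 0..k using at most i capacity units.
    a = [[inf] * (S + 1) for _ in range(C)]
    # b[k][i]: number of helpers assigned to content k in that optimum.
    b = [[0] * (S + 1) for _ in range(C)]
    for k in range(C):
        for i in range(S + 1):
            for x in range(min(i, H) + 1):
                prev = a[k - 1][i - x] if k > 0 else 0.0
                if z[k][x] + prev < a[k][i]:
                    a[k][i] = z[k][x] + prev
                    b[k][i] = x
    # Backtrack from the last content, tracking the capacity still available.
    x_opt = [0] * C
    remaining = S
    for k in range(C - 1, -1, -1):
        x_opt[k] = b[k][remaining]
        remaining -= x_opt[k]
    return a[C - 1][S], x_opt
```

The three nested loops make the $O(HCS)$ running time of the DP step visible directly in the code.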

Algorithm 2 describes the DP approach. The input parameters consist of $z$, $C$, $S$, and $H$. Here, $z$ is from the output of Algorithm 1. Apart from $a^*$ as defined earlier, $b^*$ is used to store the optimal caching solution. The first loop computes $a^*(k, i)$ and $b^*(k, i)$ for $k < C$, and then $a^*(C, S)$ and $b^*(C, S)$ for $k = C$. Finally, the second loop maps $b^*$ to the optimal values of $x$, denoted by $x^*$.

Theorem 4. Algorithm 2 delivers the global optimum of PRACO in polynomial time.

Proof. The optimality follows from Lemma 2 and the recursion, the correctness of which is established in Lemma 3. As for time complexity, the steps in Algorithm 2 together require a complexity of $O(HCS) = O(H^2 C s)$, since $S = sH$. However, a prerequisite is that the $z$-values are given. Computing these values with Algorithm 1, given $\Delta$ computed, has complexity $O(\max\{CH^2T, CTHR\})$. Hence the overall complexity is $O(\max\{H^2Cs, CH^2T, CTHR\})$. Finally, note that, even though $s$ is not a parameter for input size, its value is bounded by $C$, because otherwise the capacity constraint is redundant and the problem decomposes by content (and is solved without the need of DP). Hence the complexity is $O(\max\{H^2C^2, CH^2T, CTHR\})$, which is polynomial in the input size.

V. PERFORMANCE EVALUATION

We compare the DP algorithm to two conventional caching algorithms, i.e., random caching [15] and popular caching [16]. Both algorithms consider contents for caching one by one. In the former, the contents are considered randomly, but with respect to the contents' request probabilities; a content with a higher request probability is more likely to be selected for caching. In the latter, popular contents, i.e., contents with higher request probabilities, are considered first. For the content under consideration, the cache decision is the number of helpers with minimum total cost.
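As a rough illustration of the popular caching baseline, the sketch below processes contents in decreasing order of popularity and gives each one the number of helpers that minimizes its own cost. Two points are assumptions of this sketch rather than statements from the paper: the per-content cost is taken to be a table `z[c][h]` over initial helper counts, and the choice is limited to the cache capacity that remains. The function name `popular_caching` is hypothetical.

```python
def popular_caching(z, popularity, S, H):
    """Popularity-ordered baseline (illustrative sketch).

    z[c][h]       : cost of content c with h initial helpers (assumed input)
    popularity[c] : request probability of content c
    Returns (total cost, helper count per content).
    """
    order = sorted(range(len(z)), key=lambda c: popularity[c], reverse=True)
    remaining = S
    x = [0] * len(z)
    for c in order:
        # Assumption: a content cannot use more capacity than is left.
        limit = min(H, remaining)
        x[c] = min(range(limit + 1), key=lambda h: z[c][h])
        remaining -= x[c]
    total = sum(z[c][x[c]] for c in range(len(z)))
    return total, x
```

Because each content is decided in isolation, a popular content can exhaust capacity that would have been better shared with other contents, which is the inefficiency the DP algorithm avoids.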

Algorithm 2: The DP algorithm
Input: $z$, $C$, $S$, $H$
Output: $x^*$
$a^* \leftarrow [0]_{C \times (S+1)}$, $b^* \leftarrow [0]_{C \times (S+1)}$, $x^* \leftarrow [0]_{C}$
for $k = 1 : C$ do
  if $k < C$ then
    for $i = 0 : S$ do
      if $k = 1$ then
        $a^*(1, i) \leftarrow \min_{x_1 \in \{0, 1, \ldots, \min\{i, H\}\}} z_1(x_1)$
        $b^*(1, i) \leftarrow \arg\min_{x_1 \in \{0, 1, \ldots, \min\{i, H\}\}} z_1(x_1)$
      else
        $a^*(k, i) \leftarrow \min_{x_k \in \{0, 1, \ldots, \min\{i, H\}\}} \{ z_k(x_k) + a^*(k-1, i - x_k) \}$
        $b^*(k, i) \leftarrow \arg\min_{x_k \in \{0, 1, \ldots, \min\{i, H\}\}} \{ z_k(x_k) + a^*(k-1, i - x_k) \}$
  else
    $a^*(C, S) \leftarrow \min_{x_C \in \{0, 1, \ldots, H\}} \{ z_C(x_C) + a^*(C-1, S - x_C) \}$
    $b^*(C, S) \leftarrow \arg\min_{x_C \in \{0, 1, \ldots, H\}} \{ z_C(x_C) + a^*(C-1, S - x_C) \}$
for $k = C : 1$ do
  if $k = C$ then
    $x^*_C \leftarrow b^*(C, S)$
    $e \leftarrow S - b^*(C, S)$
  else
    $x^*_k \leftarrow b^*(k, e)$
    $e \leftarrow e - b^*(k, e)$
return $x^*$

We use a Zipf distribution with shape parameter $\gamma$ to characterize the content request probability for any requester. Thus, $w_{rc} = \frac{c^{-\gamma}}{\sum_{k \in \mathcal{C}} k^{-\gamma}}$, $r \in \mathcal{R}$. As in [8], the time period is set to 24 hours, and the duration of each time slot ($\delta$) is 1 hour. The storage cost is simulated using $f(t) = t$.

Figures 2-5 provide the results and show the impacts of parameters $H$, $\alpha$, $s$, and $\gamma$ on the cost, respectively. It can be seen that the cost decreases with respect to all the mentioned parameters. This is quite expected. For example, when $H$ increases, the requesters have more opportunity to meet helpers, leading to lower cost. The same conclusion can be made for cheaper storage (small $\alpha$) and higher capacity (larger $s$). For parameter $\gamma$, a higher value means more variation in the contents' request probabilities, thus it is easier for the algorithms to identify caching solutions such that the helpers more likely store the requested contents.

The DP algorithm outperforms the two conventional caching algorithms. In Figures 2-4, the improvement is significant when $H$ and $\delta$ increase and $\alpha$ decreases. For example, when $H$ increases from to , the DP algorithm outperforms the popular caching algorithm by to , and outperforms the random caching algorithm by to . This is because the DP algorithm uses the storage capacity of helpers optimally in comparison to the conventional algorithms.

Recall that small $\alpha$ means low storage cost. When α = 0 . , which is a fairly large value in the context, the optimal strategy tends not to store contents; it is preferable to download from the server. Hence cache optimization is less relevant and the algorithms are similar in performance. When $\alpha$ decreases, the difference between the performance of the DP and the other algorithms becomes apparent, as the DP algorithm uses the storage capacity optimally while the conventional algorithms are not able to accomplish this.
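The Zipf request model used in this evaluation can be reproduced with a few lines of Python. The helper name `zipf_probabilities` is illustrative; since all requesters share the same distribution here, one vector serves every $r$.

```python
def zipf_probabilities(C, gamma):
    """Return [w_1, ..., w_C] with w_c proportional to c^(-gamma)."""
    weights = [c ** (-gamma) for c in range(1, C + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# Setting taken from the figure captions: C = 100 contents, gamma = 1.
w = zipf_probabilities(100, 1.0)
```

With $\gamma = 0$ the distribution is uniform, and larger values of $\gamma$ concentrate the probability on the first few contents.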

Figure 2. Impact of $H$ on cost when $s = 4$, $C = 100$, $T = 24$, $\delta = 1$, $R = 10$, $\gamma = 1$, $\lambda = 1$, and α = 0 . .

[Figure 3 plot: cost for DP, popular caching, and random caching]

Figure 3. Impact of $\alpha$ on cost when $H = 12$, $s = 4$, $C = 100$, $T = 24$, $\delta = 1$, $R = 10$, $\lambda = 1$, and $\gamma = 1$.

VI. CONCLUSIONS

The paper has studied a proactive retention-aware caching problem, considering user mobility, storage cost, and cache size. We have provided analysis and algorithm development, proving that the global optimum is within reach in polynomial time. Simulation results have shown significant improvements by the proposed algorithm in comparison to two conventional caching algorithms. In our future work, we will consider a more general system scenario including non-homogeneous contact rates, helpers with different cache sizes, and contents with different sizes. Thus, the problem becomes more challenging and new solution approaches need to be developed.

[Figure 4 plot: cost versus $s$ for DP, popular caching, and random caching]

Figure 4. Impact of $s$ on cost when $H = 12$, $C = 100$, $T = 24$, $\delta = 1$, $R = 10$, $\gamma = 1$, $\lambda = 1$, and α = 0 . .

[Figure 5 plot: cost for DP, popular caching, and random caching]

Figure 5. Impact of $\gamma$ on cost when $H = 12$, $s = 4$, $C = 100$, $T = 24$, $\delta = 1$, $R = 10$, $\lambda = 1$, and α = 0 . .

REFERENCES

[1] X. Wang, M. Chen, T. Taleb, A. Ksentini, and V. Leung, “Cache in the air: Exploiting content caching and delivery techniques for 5G systems,” IEEE Commun. Mag., vol. 52, no. 2, pp. 131–139, Feb. 2014.
[2] S. Shukla and A. A. Abouzeid, “Optimal device-aware caching,” IEEE Trans. Mobile Comput., vol. 16, no. 7, pp. 1994–2007, Jul. 2017.
[3] K. Poularakis, G. Iosifidis, V. Sourlas, and L. Tassiulas, “Exploiting caching and multicast for 5G wireless networks,” IEEE Trans. Wireless Commun., vol. 15, no. 4, pp. 2995–3007, Apr. 2016.
[4] T. Deng, G. Ahani, P. Fan, and D. Yuan, “Cost-optimal caching for D2D networks with presence of user mobility,” in Proc. IEEE Globecom, 2017, pp. 1–6.
[5] T. Deng, G. Ahani, P. Fan, and D. Yuan, “Cost-optimal caching for D2D networks with user mobility: Modeling, analysis, and computational approaches,” IEEE Trans. on Wireless Commun., vol. 17, no. 5, pp. 3082–3094, May 2018.
[6] N. Abedini and S. Shakkottai, “Content caching and scheduling in wireless networks with elastic and inelastic traffic,” IEEE/ACM Trans. Netw., vol. 22, no. 3, pp. 864–874, Jun. 2014.
[7] B. Schroeder, R. Lagisetty, and A. Merchant, “Flash reliability in production: The expected and the unexpected,” in Proc. Usenix FAST, 2016, pp. 67–80.
[8] S. Shukla and A. Abouzeid, “Proactive retention-aware caching,” in Proc. IEEE Infocom, 2017, pp. 1–9.
[9] S. Shukla, O. Bhardwaj, A. Abouzeid, T. Salonidis, and T. He, “Hold’em caching: Proactive retention-aware caching with multi-path routing for wireless edge networks,” in Proc. ACM Mobihoc, 2017, pp. 1–10.
[10] W. Wang, X. Peng, J. Zhang, and K. Letaief, “Mobility-aware caching for content-centric wireless networks: Modeling and methodology,” IEEE Commun. Mag., vol. 54, no. 8, pp. 77–83, Aug. 2016.
[11] R. Wang, J. Zhang, and K. Letaief, “Mobility-aware caching in D2D networks,” IEEE Trans. Wireless Commun., vol. 16, no. 8, pp. 5001–5015, Aug. 2017.
[12] H. Zhu, L. Fu, G. Xue, Y. Zhu, M. Li, and L. Ni, “Recognizing exponential inter-contact time in vanets,” in Proc. IEEE Infocom, 2010.
[13] T. Spyropoulos, K. Psounis, and C. Raghavendra, “Efficient routing in intermittently connected mobile networks: The multiple-copy case,” IEEE/ACM Trans. Netw., vol. 16, no. 1, pp. 77–90, Feb. 2008.
[14] X. Zhang, G. Neglia, J. Kurose, and D. Towsley, “Performance modeling of epidemic routing,” Comput. Netw., vol. 51, no. 10, pp. 2867–2891, Jul. 2007.
[15] B. Blaszczyszyn and A. Giovanidis, “Optimal geographic caching in cellular networks,” in Proc. IEEE ICC, 2015, pp. 3358–3363.
[16] H. Ahlehagh and S. Dey, “Video-aware scheduling and caching in the radio access network,”