Can We Achieve Fresh Information with Selfish Users in Mobile Crowd-Learning?
Bin Li†, Jia Liu∗
†Dept. of Electrical, Computer and Biomedical Engineering, University of Rhode Island
∗Dept. of Computer Science, Iowa State University
Abstract: The proliferation of smart mobile devices has spurred an explosive growth of mobile crowd-learning services, where service providers rely on the user community to voluntarily collect, report, and share real-time information for a collection of scattered points of interest (PoIs). A critical factor affecting the future large-scale adoption of such mobile crowd-learning applications is the freshness of the crowd-learned information, which can be measured by a metric termed "age-of-information" (AoI). However, we show that the AoI of mobile crowd-learning could be arbitrarily bad under selfish users' behaviors if the system is poorly designed. This motivates us to design efficient reward mechanisms to incentivize mobile users to report information in time, with the goal of keeping the AoI and congestion level of each PoI low. Toward this end, we consider a simple linear AoI-based reward mechanism and analyze its AoI and congestion performances in terms of price of anarchy (PoA), which characterizes the degradation of the system efficiency due to selfish behavior of users. Remarkably, we show that the proposed mechanism achieves the optimal AoI performance asymptotically in a deterministic scenario. Further, we prove that the proposed mechanism achieves a bounded PoA in general stochastic cases, and the bound only depends on system parameters. Particularly, when the service rates of PoIs are symmetric in stochastic cases, the achieved PoA is upper-bounded by 1/2 asymptotically. Collectively, this work advances our understanding of information freshness in mobile crowd-learning systems.

I. INTRODUCTION
Fueled by the proliferation of smart mobile devices (e.g., smartphones, tablets, etc.), recent years have witnessed a rapid growth of information services and data analytics based on large-scale crowd-learning. A key defining feature of these crowd-learning applications is that they rely on the user community to voluntarily collect, report, and share real-time information for a set of distributed points of interest (PoIs). Such crowd-learned information will in turn benefit the users themselves and attract more users to join the community (by reputation, word of mouth, etc.), which further enhances the accuracy, value, and significance of the crowd-learning applications. For example, the real-time traffic congestion and accident information on Google Waze [1] (a community-based GPS system) relies on the reports from mobile devices and the tracking of their locations, densities, and trajectories. As another example, by offering a variety of incentives, many data analytics services leverage their user communities to share real-time information of scattered commodities and resources, such as cheap gasoline stations (e.g., GasBuddy [2]), parking space availability (e.g., Pavemint [3]), free WiFi hotspots (e.g., WiFi Finder [4]), and popular grocery deals (e.g., Basket [5]), to name just a few. It can be foreseen that new crowd-learning applications will continue to emerge.

(Author note: Authors are listed in alphabetical order. Both authors are primary authors.)

Although mobile crowd-learning holds a great potential to fundamentally change our modern society, a critical factor affecting its future large-scale adoption is the freshness of the crowd-learned information, which can be measured by a fundamental metric termed "Age-of-Information" (AoI). Guaranteeing information freshness in crowd-learning is critical because stale information discourages existing and new users from participating, which in turn degrades the information freshness and creates a vicious circle. Unfortunately, due to the special dynamics between the service provider and the users, there is an inherent lack of information freshness guarantee in mobile crowd-learning: First, to maintain information freshness, the service provider needs to incentivize the users to update the states of the PoIs. Second, the crowd-learning users are "selfish" in the sense that their best interest is to maximize their own benefit from participating in crowd-learning, rather than minimizing the AoI for the service provider. Hence, a poorly designed incentive mechanism could result in two undesirable consequences: (i) too many users flock to an attractive PoI, which leads to redundant sampling and severe queueing congestion; and (ii) all other PoIs suffer from large AoI because of under-sampling.
In light of these unique characteristics of mobile crowd-learning, several fundamental open questions naturally arise:
1) Is it possible to guarantee information freshness by incentivizing selfish users in mobile crowd-learning?
2) If the answer to 1) is "yes," what is the fundamental relationship between reward and AoI in crowd-learning?
3) How to design reward mechanisms to avoid large queueing congestion while guaranteeing AoI in crowd-learning?

However, answering the above questions is non-trivial because the AoI and congestion analysis in mobile crowd-learning faces the following challenges: First, the literature lacks an analytical model that characterizes the essential features of mobile crowd-learning. Most of the existing work on crowd-sensing is based on static models that hardly capture the dynamic and stochastic nature of participating users in mobile crowd-learning. Second, as shown by recent studies (see, e.g., [6]–[9]), AoI dynamics are fundamentally different from the traditional queueing evolution, which necessitates new theoretical tools. Third, as will be shown later, there is a strong coupling between the AoI and queue-length processes in crowd-learning, where changing the design of either one would significantly affect that of the other.

In this paper, we overcome the above challenges and propose a new analytical model coupled with the Price of Anarchy (PoA) metric, which characterizes the degradation of a system due to selfish behavior of users.¹ This enables us to analyze and understand the relationships between AoI, queueing congestion, and rewards under users' selfishness. The main results and contributions of this paper are as follows:
• First, we develop a new analytical model for mobile crowd-learning, which takes into account the strong couplings between the stochastic arrivals of participating users, PoIs' information evolutions, and reward mechanisms. As will be discussed next, this new analytical model enables us to reveal the fundamental scaling law between AoI, queueing congestion, and the reward rate set by the service provider.
• Next, as a starting point, we analyze the AoI performance under a linear AoI-based reward mechanism in a deterministic setting, where there is exactly one arriving user in each time slot, and each PoI serves exactly one user (if any) in each time slot (and hence there is no queueing effect in this setting). We show that given an AoI reward rate $\beta$, the PoA is upper-bounded by $O(1/\beta)$, which implies that the system achieves the optimal AoI as $\beta$ increases asymptotically.
• Finally, based on our results for the deterministic case, we characterize the joint AoI-congestion performance of mobile crowd-learning for stochastic settings. Although the reward policy design for joint AoI and queueing congestion optimization remains an open problem in stochastic settings, surprisingly, we show that the above linear AoI-based reward mechanism yields a bounded PoA, which only depends on the arrival and service parameters of the system. In the case of symmetric services, the PoA is upper-bounded by $1/2$ as the reward rate $\beta$ increases asymptotically.

Collectively, our results in this paper advance the understanding of achieving information freshness in mobile crowd-learning with selfish users.
The remainder of this paper is organized as follows: Section II reviews related work. Section III introduces the system model and problem statement. Section IV introduces a linear reward mechanism, and Sections V–VI study its PoAs in the deterministic and stochastic cases, respectively. Section VII presents numerical results and Section VIII concludes this paper.

II. RELATED WORK

To put our work in comparative perspective, in this section we provide an overview of the related work in the areas of crowd-sensing and age-of-information, respectively.

a) Crowd-Sensing:
In the literature, crowd-sensing refers to the sensing model where a group of individuals collectively measure some common phenomena, e.g., environmental quality monitoring [10], noise pollution assessment [11], [12], and traffic monitoring [13], etc. Although crowd-sensing bears some similarity to mobile crowd-learning, the main focuses of the crowd-sensing research community are on network resource management, system infrastructure, incentive mechanism designs, etc. (see [14] for a comprehensive survey). In contrast, the overarching theme of this paper is to guarantee information freshness in learning scattered objects by a selfish crowd. Moreover, most of the existing crowd-sensing research either adopts a static model, where the set of sensing individuals is fixed (see, e.g., [15] and references therein), or is based on a static game-theoretic model, where a fixed set of sensing individuals are incentivized/contracted by a fixed set of employers (see, e.g., [16] and references therein). These are fundamentally different from our dynamic model described in Section III. Hence, our work fills a critical gap in understanding large-scale mobile crowd-learning.

¹ The value of PoA is always between 0 and 1, and the larger the PoA, the less efficient the system. See Sections IV–VI for more in-depth discussions.

b) Age-of-Information (AoI): Originating from sensing systems, AoI has attracted increasing attention from the information theory, signal processing, and communications communities in recent years. Besides being a useful performance metric, AoI also possesses several key features that distinguish it from the traditional notion of queueing delay. Most notably, in many sensing systems, it has been found that while queueing delay benefits from lower sampling rates (implying less data traffic), AoI is non-monotone with respect to sampling rates.
This key difference has sparked AoI research in several aspects, e.g., real-time sampling and remote estimation trade-off [17], [18], joint source-channel coding exploitation [19], [20], caching [21], [22], optimization algorithms for AoI minimization [23], [24], and age-based scheduling [25], [26], just to name a few. We note that the key differences between our research and the existing AoI research are: i) the tight coupling and dependence between multi-user arrival dynamics and multi-source information time series on a network level; and ii) the complex interactions between AoI, fresh/outdated information, and queueing, all of which are governed by the service provider's reward mechanism designs. These key differences introduce new challenges in guaranteeing stochastic network information freshness unseen in existing AoI research.

III. NETWORK MODEL AND PROBLEM STATEMENT
As shown in Fig. 1, we consider a mobile crowd-learning system consisting of $N$ nodes that represent $N$ points of interest (PoIs), e.g., road intersections, parking garages, potential WiFi hotspots, gas stations, etc. We consider a time-slotted system. In each time slot $t$, each PoI $n$ has some state information $p_n[t]$ (e.g., congestion level, parking rate and space, gas price, etc.) that is time-varying and to be sampled by its users. We assume that $p_n[t] \in [p_{\min}, p_{\max}]$, $\forall t$, for some positive constants $p_{\min}$ and $p_{\max}$. A service provider (i.e., a crowd-learning-based information/data analytics platform) relies on randomly arriving users to sample and report the states of the PoIs. The service provider maintains a record for each PoI, whose value in time slot $t$ is denoted as $r_n[t]$, $n = 1, \ldots, N$. For ease of exposition, we will refer to $p_n[t]$ and $r_n[t]$ as "price" and "recorded price" in the rest of this paper, respectively. Let $u_n[t]$ be the most recent update time up to time slot $t$ for PoI $n$'s record. Hence, the age (freshness) of record $r_n[t]$ in time slot $t$ can be represented as $\Delta_n[t] = t - u_n[t]$.

[Fig. 1: A system model for mobile crowd-learning. Stochastic user arrivals with random geo-locations decide whether to report; PoI $n$ has state information $p_n[t]$ and queue $Q_n[t]$, and the service provider keeps node $n$'s record $r_n[t]$.]

Let $A[t]$ be the number of users arriving at the system in time slot $t$. We assume that $A[t]$, $t \ge 0$, are independently and identically distributed (i.i.d.) across time with mean $\lambda \triangleq \mathbb{E}[A[t]] > 0$ and bounded second moment $\mathbb{E}[A^2[t]] < \infty$. The arrivals model the scenario in which users at different locations use their mobile apps in each time slot to acquire information about the PoIs before making decisions. Each arriving user will first observe the current records of all PoIs and choose a favorable one (e.g., choosing the least congested route, the lowest gas price, or the cheapest and nearest parking space, etc.). However, due to the random updating time in crowd-learning, the information in some PoI $n$'s record could be old and hence $r_n[t]$ may be outdated and inaccurate.

On the other hand, upon arrival at his/her chosen PoI, say $n$ in time slot $t$, the user will report the PoI's real-time state (e.g., real-time price, congestion level, etc.), i.e., $p_n[t]$. Let $R_n[t]$ denote the number of users that can be served by PoI $n$ in time slot $t$. We assume that $R_n[t]$, $t \ge 0$, are i.i.d. across time and independently distributed across PoIs with mean $\mu_n \triangleq \mathbb{E}[R_n[t]] > 0$, $\forall n$, and $R_n[t] \le R_{\max}$, $\forall n, t$, for some $R_{\max} < \infty$. We use $Q_n[t]$ to denote the number of users awaiting service at PoI $n$ in time slot $t$.

The service provider's goal is to achieve minimum time-average AoI while keeping queueing congestion at each PoI low.
The rationale behind this goal is that low AoI (i.e., fresh information) implies multiple benefits, e.g., high information accuracy, which attracts more users, and hence more advertising revenue due to large user volume, etc. However, the following toy example shows that the natural greedy behavior of selfish users could yield AoI instability in mobile crowd-learning:
A Motivating Example (AoI Instability due to Selfishness):
Consider a two-PoI example as shown in Fig. 2. Consider the most "natural" price-greedy decision made by selfish users: In time slot $t$, each arriving user compares the recorded prices $r_1[t]$ and $r_2[t]$ and chooses the cheaper PoI, i.e., choosing $n^*[t] \in \arg\min_{n \in \{1,2\}} \{r_n[t]\}$. Suppose that $p_n[t] \in [0, p_{\max}]$, $n = 1, 2$. Assume that the probability $\Pr\{p_n[t] = p_{\max}\} = \epsilon$, $n = 1, 2$, where $\epsilon > 0$ is some small value. Suppose also that in the initial state, $p_1[0] = p_{\max}$ and $p_2[0] = \delta < p_{\max}$. Thus, at $t = 0$, all users choose PoI 2 and the record $r_2[t]$ will be updated, in which case the age of PoI 2 in time slot 1 becomes zero, i.e., $\Delta_2[1] = 0$. However, due to the high initial price $p_1[0]$, no user chooses PoI 1. Also, due to the low probability of $p_2[t]$ reaching $p_{\max}$, it would take a long time (could be unbounded if $\epsilon$ is arbitrarily small) for PoI 1 to receive any user to update $r_1[\cdot]$, although $p_1[t]$ may be lower than $p_2[t]$. For example, in Fig. 3, $p_1[t]$ and $p_2[t]$ are uniformly distributed over a bounded interval, and $p_1[0]$ is set to a large initial value. Clearly, we can see that PoI 1's AoI is large and grows linearly with respect to time.

[Fig. 2: A two-PoI motivating example.]

[Fig. 3: Large and unstable AoI of PoI 1 in Fig. 2. Axes: time (horizontal) versus age of information (vertical), with one curve each for PoI 1 and PoI 2.]

The above observation of AoI instability due to users' selfishness motivates us to design crowd-learning reward mechanisms to ensure information freshness in crowd-learning.

IV. A LINEAR AOI-BASED REWARD MECHANISM
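To make the instability of the motivating example in Section III concrete before introducing the reward mechanism, the following minimal Python sketch simulates the two-PoI price-greedy dynamics. It is an illustration under stated assumptions, not the code behind Fig. 3: exactly one user arrives per slot, prices are drawn i.i.d. uniformly on $[0, 1)$ in every slot, ties are broken in favor of PoI 2, and the function name `simulate_price_greedy` and the initial prices are hypothetical choices.

```python
import random


def simulate_price_greedy(T, p1_init, p2_init, seed=0):
    """Two-PoI price-greedy dynamics (illustrative assumptions only).

    In each slot one user visits the PoI with the lower *recorded* price
    and refreshes that record with the PoI's current price.
    Returns the list of (age of PoI 1, age of PoI 2) after each slot.
    """
    rng = random.Random(seed)
    record = [p1_init, p2_init]  # recorded prices r_1[t], r_2[t]
    age = [0, 0]                 # ages Delta_1[t], Delta_2[t]
    history = []
    for _ in range(T):
        # Assumption: current prices are i.i.d. uniform on [0, 1).
        price = [rng.random(), rng.random()]
        # Price-greedy choice; ties go to PoI 2 (index 1) by assumption.
        chosen = 0 if record[0] < record[1] else 1
        for n in range(2):
            if n == chosen:
                record[n] = price[n]  # the visiting user reports p_n[t]
                age[n] = 0
            else:
                age[n] += 1           # unvisited record grows older
        history.append((age[0], age[1]))
    return history


if __name__ == "__main__":
    hist = simulate_price_greedy(T=200, p1_init=1.0, p2_init=0.5)
    print("final ages (PoI 1, PoI 2):", hist[-1])
```

When `p1_init` equals the largest possible price, PoI 1 is never selected under these assumptions, so its age equals the number of elapsed slots, which is the linear growth described in the example.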
To keep the AoI bounded, the service provider would like users to go to and sample a PoI with the most outdated information. However, unlike traditional scheduling problems, the crowd-learning service provider cannot force each arriving selfish user to go to a certain PoI. Rather, the service provider can only offer incentives/rewards to influence the users to choose certain PoIs. So far, however, the problem of optimal reward mechanism design for mobile crowd-learning with selfish users has not been addressed in the literature. Therefore, in this paper, we start from considering a simple linear reward mechanism for mobile crowd-learning.

Specifically, we let $\beta > 0$ represent the "reward per unit of age" offered by the service provider. Note that each user prefers to select a PoI with both low price and low congestion level. We use a parameter $\gamma > 0$ to denote users' sensitivity to queueing congestion, which depends on the specific mobile crowd-learning application.² Hence, in each time slot $t$, an arriving user's presumed benefit for choosing PoI $n$ and reporting its state is $\beta \Delta_n[t] - \gamma Q_n[t] - r_n[t]$. In this work, we assume that all arriving users are selfish and rational, so that they would select a PoI $n^*[t]$ to maximize their presumed benefit, i.e.,
$$n^*[t] \in \arg\max_{n \in \{1,2,\ldots,N\}} \big(\beta \Delta_n[t] - \gamma Q_n[t] - r_n[t]\big), \quad \forall t. \tag{1}$$
Note that for any fixed $\gamma$, when the reward rate diminishes, i.e., $\beta \downarrow 0$, each user essentially follows the "greedy" scheme to select a PoI with the smallest value of $\gamma Q_n[t] + r_n[t]$. By contrast, when the reward rate approaches infinity, i.e., $\beta \uparrow \infty$, the effects of $Q_n[t]$ and $r_n[t]$ become negligible in users' presumed benefit and thus it encourages users to help the service provider maintain information freshness.

² Here, we assume that all users are homogeneous and have the same $\gamma$-value. The impact of users' heterogeneity in congestion sensitivity will be studied in our future work.

To facilitate our subsequent analysis, we use $S_n[t]$ to denote whether there is at least one user selecting PoI $n$ in time slot $t$. In particular, $S_n[t] = 1$ if at least one user selects PoI $n$ in time slot $t$, and $S_n[t] = 0$ otherwise. Under the assumption of users' selfishness and rationality, the dynamics of the queue-length and age of PoI $n$ can be described as follows:
$$Q_n[t+1] = \max\{Q_n[t] + A[t] S_n^*[t] - R_n[t],\, 0\}, \quad \forall n, \tag{2}$$
and
$$\Delta_n[t+1] = \begin{cases} \Delta_n[t] + 1, & \text{if } S_n^*[t]\, \mathbb{1}\{A[t] > 0\} = 0, \\ 0, & \text{otherwise}, \end{cases} \tag{3}$$
where $S_n^*[t] = 1$ if $n = n^*[t]$ and $S_n^*[t] = 0$ otherwise, and $\mathbb{1}\{\cdot\}$ is an indicator function. Let $\mathbf{S}^*[t] \triangleq (S_n^*[t])_{n=1}^N$.

To understand the impact of users' selfishness on AoI and queueing congestion, in this paper we adopt the so-called Price of Anarchy (PoA) metric from the game theory literature, which characterizes the degradation of the system efficiency due to the selfish behavior of users compared to the optimum. Roughly speaking, the notion of PoA $\rho$ is defined as:
$$\rho = 1 - \frac{\text{Minimum cost}}{\text{Cost under selfish behavior}}. \tag{4}$$
Note that $\rho \in [0, 1]$ and the smaller the PoA, the more efficient the system under selfish user behavior. In what follows, we will analyze the PoA of the linear reward scheme (1), where the definition of cost in (4) depends on specific system scenarios that will be clarified in subsequent sections.

V. PRICE OF ANARCHY: A DETERMINISTIC CASE
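The deterministic setting of this section (one arriving user per slot, no queueing) is simple enough to simulate directly, which gives a feel for how the reward rate $\beta$ shapes the maximum age. The sketch below is a minimal illustration under stated assumptions: users apply rule (1) without the queueing term, prices are drawn i.i.d. uniformly on $[p_{\min}, p_{\max}]$, all ages start at zero, and the function name `simulate_linear_reward` and default parameter values are hypothetical. It compares the empirical time-average maximum age against $N - 1 + p_{\max}/\beta$, the upper bound established in (12) below.

```python
import random


def simulate_linear_reward(N, beta, T, p_min=0.1, p_max=1.0, seed=0):
    """Deterministic case: one user per slot picks argmax of beta*age - record.

    Returns the time-average of the maximum age over T slots.
    Price distribution and initial conditions are illustrative assumptions.
    """
    rng = random.Random(seed)
    age = [0] * N
    record = [rng.uniform(p_min, p_max) for _ in range(N)]
    total_max_age = 0
    for _ in range(T):
        total_max_age += max(age)
        # Selfish choice: linear AoI-based reward minus recorded price.
        chosen = max(range(N), key=lambda n: beta * age[n] - record[n])
        for n in range(N):
            if n == chosen:
                age[n] = 0                             # record is refreshed
                record[n] = rng.uniform(p_min, p_max)  # reported current price
            else:
                age[n] += 1
    return total_max_age / T


if __name__ == "__main__":
    N, p_max = 5, 1.0
    for beta in (0.01, 0.1, 1.0, 100.0):
        avg = simulate_linear_reward(N, beta, T=5000, p_max=p_max)
        print(f"beta={beta:7.2f}  avg max age={avg:8.3f}  "
              f"bound={N - 1 + p_max / beta:8.3f}")
```

For large $\beta$ the age term dominates any price difference, so the selection reduces to "visit the oldest PoI" and the average maximum age settles near $N - 1$.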
In this section, we first consider a simple deterministic case, where, in each time slot, there is exactly one arriving user and each PoI serves exactly one user if there is any. This deterministic case not only provides interesting insights, but its results and proof strategies also serve as a foundation for analyzing general cases with stochastic arrivals and services. Note that due to the special arrival and service patterns in this deterministic case, there is no queueing effect at each PoI. Hence, the user's selfish selection (cf. (1)) becomes:
$$n^*[t] \in \arg\max_{n \in \{1,2,\ldots,N\}} \big(\beta \Delta_n[t] - r_n[t]\big). \tag{5}$$
In addition, the evolution of the age of PoI $n$ in (3) becomes:
$$\Delta_n[t+1] = (\Delta_n[t] + 1)(1 - S_n^*[t]). \tag{6}$$
Next, we study the information freshness performance under the selfish behavior of users (cf. (5)) based on the notion of PoA. Since queueing does not play a role and the system is symmetric, we define the cost function with selfish users under some reward rate $\beta$ as $\overline{\Delta}^{(\beta)}_{\max}$ and hence the PoA is $\rho(\beta) \triangleq 1 - \overline{\Delta}^{(\mathrm{OPT})}_{\max} / \overline{\Delta}^{(\beta)}_{\max}$, where $\overline{\Delta}^{(\mathrm{OPT})}_{\max}$ and $\overline{\Delta}^{(\beta)}_{\max}$ are the average maximum age under an optimal policy (with unselfish users) and under the users' selfishness, respectively. The first main result of this paper is stated as follows:

Theorem 1 (AoI-Based PoA for the Deterministic Case). If there is exactly one user arriving in each time slot and each PoI serves exactly one user per time slot if there is any, the users' selfishness yields the following PoA performance:
$$\rho(\beta) \le \frac{p_{\max}}{(N-1)\beta + p_{\max}} = O(1/\beta). \tag{7}$$

Proof.
The proof consists of two main steps: (i) finding an upper bound on the average maximum age $\overline{\Delta}^{(\beta)}_{\max}$ due to users' selfishness; and (ii) deriving a lower bound on the average maximum age $\overline{\Delta}^{(\mathrm{OPT})}_{\max}$ achieved by an optimal policy.

Step 1): To find an upper bound on the average maximum age due to users' selfish behavior, we perform Lyapunov drift analysis through an age-based Lyapunov function as follows:
$$V[t] \triangleq \sum_{n=1}^{N} \Delta_n[t]. \tag{8}$$
Let $\mathbf{M}[t] \triangleq (\{\Delta_n[t]\}_{n=1}^N, \{r_n[t]\}_{n=1}^N)$ and consider an unselfish policy $\widetilde{\mathbf{S}}[t] \triangleq (\widetilde{S}_n[t])_{n=1}^N \in \arg\max_{\mathbf{S}} \sum_{n=1}^N \Delta_n[t] S_n[t]$, i.e., users select the PoI with the largest age. Then, the one-step conditional expected drift of $V[t]$ can be computed as:
\begin{align}
\Delta V[t] &\triangleq \mathbb{E}\big[V[t+1] - V[t] \,\big|\, \mathbf{M}[t]\big] = \sum_{n=1}^{N} \mathbb{E}\big[\Delta_n[t+1] - \Delta_n[t] \,\big|\, \mathbf{M}[t]\big] \notag \\
&\overset{(a)}{=} \sum_{n=1}^{N} \mathbb{E}\big[1 - (\Delta_n[t] + 1) S_n^*[t] \,\big|\, \mathbf{M}[t]\big] \notag \\
&\overset{(b)}{=} N - 1 - \sum_{n=1}^{N} \mathbb{E}\big[\Delta_n[t] S_n^*[t] \,\big|\, \mathbf{M}[t]\big] \tag{9} \\
&\le N - 1 - \sum_{n=1}^{N} \mathbb{E}\Big[\Big(\Delta_n[t] - \frac{1}{\beta} r_n[t]\Big) S_n^*[t] \,\Big|\, \mathbf{M}[t]\Big] \notag \\
&\overset{(c)}{\le} N - 1 - \sum_{n=1}^{N} \mathbb{E}\Big[\Big(\Delta_n[t] - \frac{1}{\beta} r_n[t]\Big) \widetilde{S}_n[t] \,\Big|\, \mathbf{M}[t]\Big] \notag \\
&\overset{(d)}{\le} N - 1 - \Delta_{\max}[t] + \frac{1}{\beta} p_{\max}, \tag{10}
\end{align}
where $(a)$ uses the dynamics of $\Delta_n[t]$ in (6); $(b)$ follows from the fact that each user joins one of the PoIs in each time slot, i.e., $\sum_{n=1}^N S_n^*[t] = 1$; the unlabeled inequality holds because the recorded prices are non-negative; $(c)$ follows from the definition of $S_n^*[t]$; and $(d)$ uses the fact that $r_n[t] \le p_{\max}$, $\forall n, t \ge 0$, the definition of $\widetilde{\mathbf{S}}[t]$, and the fact that exactly one $\widetilde{S}_n[t]$ is non-zero. It then follows from (10) that:
$$\mathbb{E}\big[V[t+1] - V[t]\big] \le N - 1 - \mathbb{E}[\Delta_{\max}[t]] + \frac{1}{\beta} p_{\max}. \tag{11}$$
Summing (11) for $t = 0, 1, 2, \ldots, T-1$, we obtain:
$$\mathbb{E}\big[V[T] - V[0]\big] \le -\sum_{t=0}^{T-1} \mathbb{E}[\Delta_{\max}[t]] + (N-1)T + \frac{T}{\beta} p_{\max},$$
which implies that
$$\overline{\Delta}^{(\beta)}_{\max} \triangleq \limsup_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}[\Delta_{\max}[t]] \le N - 1 + \frac{1}{\beta} p_{\max}. \tag{12}$$

Step 2): Next, we derive a fundamental lower bound on the average maximum age that can be achieved by the optimal policy. By using the same Lyapunov function in (8) to compute the conditional expected one-step drift under the optimal policy $\{\mathbf{S}^{(\mathrm{OPT})}[t]\}$ and following similar steps, we have $\Delta V[t] = N - 1 - \sum_{n=1}^N \mathbb{E}[\Delta^{(\mathrm{OPT})}_n[t] S^{(\mathrm{OPT})}_n[t] \mid \mathbf{M}[t]]$, where $\Delta^{(\mathrm{OPT})}_n[t]$ is the age of PoI $n$ in time slot $t$ under the optimal policy. In Step 1, we have already shown that the average maximum age is finite under the selfish policy. This readily implies that the average maximum age is also finite under the optimal policy. Therefore, $\mathbb{E}[\Delta V[t]]$ will be equal to zero in steady-state and thus we have
$$\sum_{n=1}^{N} \mathbb{E}\big[\widehat{\Delta}^{(\mathrm{OPT})}_n \widehat{S}^{(\mathrm{OPT})}_n\big] = N - 1, \tag{13}$$
where $\widehat{\Delta}^{(\mathrm{OPT})}_n$ and $\widehat{S}^{(\mathrm{OPT})}_n$ are random variables with the same distribution as $\Delta^{(\mathrm{OPT})}_n[t]$ and $S^{(\mathrm{OPT})}_n[t]$ in steady-state under the optimal policy, respectively. Hence, we have
$$\overline{\Delta}^{(\mathrm{OPT})}_{\max} \overset{(a)}{=} \mathbb{E}\big[\widehat{\Delta}^{(\mathrm{OPT})}_{\max}\big] \overset{(b)}{=} \mathbb{E}\big[\max_n \widehat{\Delta}^{(\mathrm{OPT})}_n\big] \overset{(c)}{\ge} \sum_{n=1}^{N} \mathbb{E}\big[\widehat{\Delta}^{(\mathrm{OPT})}_n \widehat{S}^{(\mathrm{OPT})}_n\big] \overset{(d)}{=} N - 1, \tag{14}$$
where step $(a)$ follows from the boundedness of the average maximum age under the optimal policy; $(b)$ holds because $\widehat{\Delta}^{(\mathrm{OPT})}_{\max} \triangleq \max_n \widehat{\Delta}^{(\mathrm{OPT})}_n$; $(c)$ follows from the fact that each arriving user joins exactly one of the PoIs, i.e., $\sum_{n=1}^N \widehat{S}^{(\mathrm{OPT})}_n = 1$; and $(d)$ uses (13). Lastly, by combining the upper bound in Step 1 and the lower bound in Step 2, the desired PoA result in Theorem 1 follows and the proof is complete.

Remark 1.
Two insightful remarks for Theorem 1 are in order: i) In Step 2, the lower bound of $\overline{\Delta}^{(\mathrm{OPT})}_{\max}$ is tight and can be achieved by the Round-Robin policy, i.e., the system guides each arriving user to the PoIs in a Round-Robin fashion. Indeed, under Round-Robin, the ages of PoIs are a permutation of $\{0, 1, 2, \ldots, N-1\}$ in each time slot, and hence the maximum age under Round-Robin is $\Delta^{(\mathrm{RR})}_{\max}[t] = N - 1$, $\forall t \ge 0$, which implies that $\overline{\Delta}^{(\mathrm{RR})}_{\max} = N - 1 = \overline{\Delta}^{(\mathrm{OPT})}_{\max}$; ii) From Theorem 1, we can observe that if $\beta$ increases asymptotically (i.e., $\beta \uparrow \infty$), we have $\rho(\beta) \downarrow 0$. This implies that the system is optimal and mimics Round-Robin when the service provider increases the incentive asymptotically. On the other hand, if $\beta$ reduces to zero (i.e., $\beta \downarrow 0$), we can see from (5) that each user just follows a price-greedy strategy. In this case, Theorem 1 suggests that the upper bound of $\rho(\beta)$ approaches 1, which is consistent with our observation (cf. the motivating example in Section III) that the system suffers a poor AoI performance and potentially AoI instability (i.e., $\overline{\Delta}^{(\beta)}_{\max} \uparrow \infty$).

VI. PRICE OF ANARCHY: STOCHASTIC CASES
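For orientation, the stochastic dynamics (1)–(3) analyzed in this section can be sketched in a few lines of Python. This is a minimal illustration, not the authors' simulator, and it rests on several assumptions that are not in the paper: at most one user arrives per slot (Bernoulli arrivals with probability `arrival_prob`), each PoI serves at most one user per slot (Bernoulli service with probability `mu[n]`), prices are i.i.d. uniform on $[p_{\min}, p_{\max}]$, and the weighted cost helper uses the largest $\epsilon$ satisfying $\mu_n/\lambda \ge \mu_n/\mu_\Sigma + \epsilon/N$ for all $n$. The names `simulate_stochastic` and `cost_J` are hypothetical.

```python
import random


def simulate_stochastic(mu, arrival_prob, beta, gamma, T,
                        p_min=0.1, p_max=1.0, seed=0):
    """Simulate rule (1) with queue dynamics (2) and age dynamics (3).

    Returns (time-average queue lengths, time-average ages), one entry per PoI.
    Bernoulli arrivals/services and uniform prices are assumptions.
    """
    rng = random.Random(seed)
    N = len(mu)
    Q = [0] * N
    age = [0] * N
    record = [rng.uniform(p_min, p_max) for _ in range(N)]
    sum_Q = [0] * N
    sum_age = [0] * N
    for _ in range(T):
        for n in range(N):
            sum_Q[n] += Q[n]
            sum_age[n] += age[n]
        A = 1 if rng.random() < arrival_prob else 0
        # Rule (1): maximize beta*age - gamma*queue - recorded price.
        chosen = max(range(N),
                     key=lambda n: beta * age[n] - gamma * Q[n] - record[n])
        for n in range(N):
            S = 1 if n == chosen else 0
            R = 1 if rng.random() < mu[n] else 0
            Q[n] = max(Q[n] + A * S - R, 0)            # queue update (2)
            if S == 1 and A > 0:                       # age update (3)
                age[n] = 0
                record[n] = rng.uniform(p_min, p_max)  # reported price
            else:
                age[n] += 1
    return [s / T for s in sum_Q], [s / T for s in sum_age]


def cost_J(avg_Q, avg_age, mu, lam, beta, gamma):
    """Weighted cost: gamma*eps/N * sum(Q) + beta * sum(mu_n/mu_sum * age_n)."""
    N = len(mu)
    mu_sum = sum(mu)
    assert lam < mu_sum, "requires lambda < sum of service rates"
    # Assumption: take the largest eps allowed by the condition on eps.
    eps = N * min(mu) * (1.0 / lam - 1.0 / mu_sum)
    queue_cost = gamma * eps / N * sum(avg_Q)
    age_cost = beta * sum(m / mu_sum * a for m, a in zip(mu, avg_age))
    return queue_cost + age_cost


if __name__ == "__main__":
    mu, lam = [0.4, 0.4, 0.4], 0.5
    for beta in (0.1, 1.0, 10.0):
        avg_Q, avg_age = simulate_stochastic(mu, lam, beta, 1.0, T=20000)
        print(beta, round(cost_J(avg_Q, avg_age, mu, lam, beta, 1.0), 3))
```

Because the arrivals here are Bernoulli, the arrival rate $\lambda$ coincides with `arrival_prob` and with $q = \Pr\{A[t] > 0\}$; other arrival distributions would need these quantities computed separately.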
Based on the results for the deterministic case, we are now in a position to analyze the AoI and congestion performances under users' selfishness in cases with stochastic arrivals and services. To facilitate analysis, we define a parameter $q \triangleq \Pr\{A[t] > 0\}$ for the arrivals, which is strictly positive for $\lambda \triangleq \mathbb{E}[A[t]] > 0$. Let $\mu_\Sigma \triangleq \sum_{n=1}^N \mu_n$. Here, we adopt the cost function
$$J(\beta, \gamma) \triangleq \frac{\gamma \epsilon}{N} \sum_{n=1}^{N} \overline{Q}_n + \beta \sum_{n=1}^{N} \frac{\mu_n}{\mu_\Sigma} \overline{\Delta}_n,$$
where $\epsilon > 0$ satisfies $\mu_n/\lambda \ge \mu_n/\mu_\Sigma + \epsilon/N$, $\forall n = 1, 2, \ldots, N$, which is possible due to the fact that $\lambda < \mu_\Sigma$ (necessary for guaranteeing the system's queueing stability³), and $\overline{Q}_n$ and $\overline{\Delta}_n$ are the average queue-length and average age of PoI $n$ under the users' selfishness, respectively. We note that in $J(\beta, \gamma)$, $\epsilon$ is used as a scaling parameter to reduce the cost's sensitivity to the average queue-length $\frac{1}{N} \sum_{n=1}^N \overline{Q}_n$ under different arrival rates $\lambda$. Also, $\gamma$ and $\beta$ are used to emphasize the relative importance between queueing and AoI costs, as in the presumed benefit for users' selfish decisions (cf. (1)). Also, note that $J(\beta, \gamma)$ is based on weighted average age, where the weight $\frac{\mu_n}{\mu_\Sigma}$ is used to "equalize" the different AoI scales caused by the heterogeneity of the PoIs.⁴ As a result, the PoA is specialized to $\rho(\beta, \gamma) \triangleq 1 - \frac{J^{(\mathrm{OPT})}(\beta, \gamma)}{J(\beta, \gamma)}$. Our second key result is for the stochastic cases and is stated as follows:

Theorem 2 (Joint AoI-Congestion PoA of Stochastic Cases). If $\lambda < \mu_\Sigma$, then there exists an $\epsilon > 0$ satisfying $\mu_n/\lambda \ge \mu_n/\mu_\Sigma + \epsilon/N$, $\forall n = 1, 2, \ldots, N$. In such a case, the users' selfishness yields the following PoA performance:
$$\rho(\beta, \gamma) \le \frac{B(\gamma) - \gamma M + p_{\max}}{B(\gamma) + \beta\big(\frac{N}{q} - 1\big) + p_{\max}} + \frac{\beta\Big(\frac{N}{q} - \frac{1}{2 q \mu_{\max}} \sum_{n=1}^{N} \mu_n - \frac{1}{2}\Big)}{B(\gamma) + \beta\big(\frac{N}{q} - 1\big) + p_{\max}}, \tag{15}$$
where $B(\gamma) \triangleq \frac{\gamma}{2\lambda}\big(\mathbb{E}[A^2[t]] + \sum_{n=1}^N \mathbb{E}[R_n^2[t]]\big)$, $M \triangleq \frac{\epsilon}{2N(\mu_\Sigma - \lambda)}\big(\mathrm{Var}(A[t]) + \sum_{n=1}^N \mathrm{Var}(R_n[t]) + (\mu_\Sigma - \lambda)^2\big) - \frac{\epsilon R_{\max}}{2}$, and $\mu_{\max} \triangleq \max_n \mu_n$.

Proof.
Similar to the proof of Theorem 1, we first find an upper bound on $J(\beta, \gamma)$ by using the Lyapunov drift analysis and then determine a fundamental lower bound on $J^{(\mathrm{OPT})}(\beta, \gamma)$.

Step 1): Consider the following Lyapunov function: $L[t] \triangleq \frac{\gamma}{2\lambda\beta} \sum_{n=1}^N Q_n^2[t] + \frac{1}{q} \sum_{n=1}^N \Delta_n[t]$. Let $\mathbf{Z}[t] \triangleq ((Q_n[t])_{n=1}^N, (\Delta_n[t])_{n=1}^N, (r_n[t])_{n=1}^N)$. Then, the one-step conditional expected drift can be computed as:
\begin{align}
\Delta L[t] &\triangleq \mathbb{E}\big[L[t+1] - L[t] \,\big|\, \mathbf{Z}[t]\big] \notag \\
&= \mathbb{E}\Big[\frac{\gamma}{2\lambda\beta} \sum_{n=1}^{N} \big(Q_n^2[t+1] - Q_n^2[t]\big) + \frac{1}{q} \sum_{n=1}^{N} \big(\Delta_n[t+1] - \Delta_n[t]\big) \,\Big|\, \mathbf{Z}[t]\Big] \notag \\
&\overset{(a)}{\le} \frac{B(\gamma)}{\beta} + \mathbb{E}\Big[\frac{\gamma}{\lambda\beta} \sum_{n=1}^{N} Q_n[t]\big(A[t] S_n^*[t] - R_n[t]\big) + \frac{1}{q} \sum_{n=1}^{N} \big(1 - (\Delta_n[t] + 1) S_n^*[t]\, \mathbb{1}\{A[t] > 0\}\big) \,\Big|\, \mathbf{Z}[t]\Big], \tag{16}
\end{align}
where $(a)$ holds with $B(\gamma)$ as defined in Theorem 2, which is finite because $\mathbb{E}[A^2[t]] < \infty$ and $R_n[t] \le R_{\max}$, and uses the dynamics of $Q_n[t]$ (cf. (2)) and $\Delta_n[t]$ (cf. (3)), and the fact that $(\max\{x, 0\})^2 \le x^2$, $\forall x$.

³ In this paper, we say that a queue $n$ is stable if its average queue-length is finite, i.e., $\limsup_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}[Q_n[t]] < \infty$. A system is stable if all its queues are stable.

⁴ This is also motivated by the fact that the service provider prefers a better AoI for the PoI with a faster service rate.

Next, we let $\mathbf{Z}'[t] \triangleq (\mathbf{Z}[t], \{A[t] > 0\})$, i.e., $\mathbf{Z}[t]$ together with the event that at least one user arrives. Then, for any function $f(\mathbf{Z}[t])$, the following sequence of equalities holds:
\begin{align}
\mathbb{E}\{f(\mathbf{Z}[t]) A[t] \mid \mathbf{Z}[t]\} &= q\, \mathbb{E}\{f(\mathbf{Z}[t]) A[t] \mid \mathbf{Z}'[t]\} \notag \\
&= q\, \mathbb{E}\{A[t] \mid A[t] > 0\}\, \mathbb{E}\{f(\mathbf{Z}[t]) \mid \mathbf{Z}'[t]\} \notag \\
&= \mathbb{E}\{A[t]\}\, \mathbb{E}\{f(\mathbf{Z}[t]) \mid \mathbf{Z}'[t]\}. \tag{17}
\end{align}
Note that each arriving user joins one of the PoIs in each time slot, i.e., $\sum_{n=1}^N S_n^*[t] = 1$. Also, the users' decisions $\mathbf{S}^*[t]$ only depend on $\mathbf{Z}[t]$.
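Hence, we have that:
\begin{align}
\text{(16)} &\overset{(a)}{\le} \frac{B(\gamma)}{\beta} + \frac{N}{q} - 1 - \frac{\gamma}{\lambda\beta} \sum_{n=1}^{N} \mu_n Q_n[t] + \mathbb{E}\Big[\frac{\gamma}{\beta} \sum_{n=1}^{N} Q_n[t] S_n^*[t] - \sum_{n=1}^{N} \Delta_n[t] S_n^*[t] \,\Big|\, \mathbf{Z}'[t]\Big] \notag \\
&\le \frac{B(\gamma)}{\beta} + \frac{N}{q} - 1 - \frac{\gamma}{\lambda\beta} \sum_{n=1}^{N} \mu_n Q_n[t] - \mathbb{E}\Big[\sum_{n=1}^{N} \Big(\Delta_n[t] - \frac{\gamma}{\beta} Q_n[t] - \frac{1}{\beta} r_n[t]\Big) S_n^*[t] \,\Big|\, \mathbf{Z}'[t]\Big], \tag{18}
\end{align}
where $(a)$ follows from (17) and the fact that $q \in (0, 1]$. Next, consider an unselfish stationary randomized policy with $\mathbb{E}[\widetilde{S}_n[t]] = \mu_n/\mu_\Sigma$, $\forall n$, if $A[t] > 0$, where $\mu_\Sigma \triangleq \sum_{n=1}^N \mu_n$. Clearly, from the definition of $\mathbf{S}^*[t]$, we have:
$$\text{(18)} \le \frac{B(\gamma)}{\beta} + \frac{N}{q} - 1 - \frac{\gamma}{\beta} \sum_{n=1}^{N} \frac{\mu_n}{\lambda} Q_n[t] - \mathbb{E}\Big[\sum_{n=1}^{N} \Big(\Delta_n[t] - \frac{\gamma}{\beta} Q_n[t] - \frac{1}{\beta} r_n[t]\Big) \widetilde{S}_n[t] \,\Big|\, \mathbf{Z}'[t]\Big]. \tag{19}$$
Noting that $\mu_n/\lambda \ge \mu_n/\mu_\Sigma + \epsilon/N$, $\forall n$, we have
\begin{align}
\Delta L[t] &\overset{(a)}{\le} \frac{B(\gamma)}{\beta} + \frac{N}{q} - 1 + \frac{p_{\max}}{\beta} - \frac{\gamma\epsilon}{N\beta} \sum_{n=1}^{N} Q_n[t] - \frac{\gamma}{\beta} \sum_{n=1}^{N} \frac{\mu_n}{\mu_\Sigma} Q_n[t] - \sum_{n=1}^{N} \frac{\mu_n}{\mu_\Sigma} \mathbb{E}\Big[\Big(\Delta_n[t] - \frac{\gamma}{\beta} Q_n[t]\Big) \,\Big|\, \mathbf{Z}'[t]\Big] \notag \\
&= -\frac{\gamma\epsilon}{N\beta} \sum_{n=1}^{N} Q_n[t] - \sum_{n=1}^{N} \frac{\mu_n}{\mu_\Sigma} \Delta_n[t] + \frac{B(\gamma)}{\beta} + \frac{N}{q} - 1 + \frac{p_{\max}}{\beta}, \notag
\end{align}
where $(a)$ follows from $r_n[t] \le p_{\max}$, $\forall n, t$, and the definition of the stationary randomized policy $\{\widetilde{\mathbf{S}}[t]\}_{t \ge 0}$. This implies
$$\mathbb{E}\big[L[t+1] - L[t]\big] \le -\frac{\gamma\epsilon}{N\beta} \sum_{n=1}^{N} \mathbb{E}[Q_n[t]] - \sum_{n=1}^{N} \frac{\mu_n}{\mu_\Sigma} \mathbb{E}[\Delta_n[t]] + \frac{B(\gamma)}{\beta} + \frac{N}{q} - 1 + \frac{p_{\max}}{\beta}. \tag{20}$$
Summing (20) for $t = 0, 1, 2, \ldots, T-1$, we obtain
$$\mathbb{E}\big[L[T] - L[0]\big] \le -\frac{\gamma\epsilon}{N\beta} \sum_{t=0}^{T-1} \sum_{n=1}^{N} \mathbb{E}[Q_n[t]] - \sum_{t=0}^{T-1} \sum_{n=1}^{N} \frac{\mu_n}{\mu_\Sigma} \mathbb{E}[\Delta_n[t]] + \Big(\frac{B(\gamma)}{\beta} + \frac{N}{q} - 1 + \frac{p_{\max}}{\beta}\Big) T,$$
which further implies the following upper bound on $J(\beta, \gamma)$:
$$J(\beta, \gamma) \triangleq \limsup_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \Big[\frac{\gamma\epsilon}{N} \sum_{n=1}^{N} \mathbb{E}[Q_n[t]] + \beta \sum_{n=1}^{N} \frac{\mu_n}{\mu_\Sigma} \mathbb{E}[\Delta_n[t]]\Big] \le B(\gamma) + \beta\Big(\frac{N}{q} - 1\Big) + p_{\max}. \tag{21}$$

Step 2): Next, we derive a fundamental lower bound on $J^{(\mathrm{OPT})}(\beta, \gamma)$.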
Since we have shown that $J(\beta, \gamma)$ is upper-bounded under the selfish policy in Step 1, $J^{(\mathrm{OPT})}(\beta, \gamma)$ is also bounded under the optimal policy. Therefore, we have $J^{(\mathrm{OPT})}(\beta, \gamma) = \frac{\gamma\epsilon}{N} \sum_{n=1}^N \mathbb{E}[\widehat{Q}^{(\mathrm{OPT})}_n] + \beta \sum_{n=1}^N \frac{\mu_n}{\mu_\Sigma} \mathbb{E}[\widehat{\Delta}^{(\mathrm{OPT})}_n]$, where $\widehat{Q}^{(\mathrm{OPT})}_n$ and $\widehat{\Delta}^{(\mathrm{OPT})}_n$ are random variables with the same distribution as $Q_n[t]$ and $\Delta_n[t]$ in steady-state under the optimal policy, respectively. Next, we lower-bound $\sum_{n=1}^N \mathbb{E}[\widehat{Q}^{(\mathrm{OPT})}_n]$ and $\sum_{n=1}^N \mu_n \mathbb{E}[\widehat{\Delta}^{(\mathrm{OPT})}_n]$ individually. In the rest of the proof, we omit the signifier "(OPT)" for notational convenience and better readability.

We first consider $\sum_{n=1}^N \mu_n \mathbb{E}[\widehat{\Delta}_n]$. By choosing the Lyapunov function $V_1[t] \triangleq \sum_{n=1}^N \mu_n \Delta_n[t]$ and following similar steps as in the derivation of (9), we have
$$\Delta V_1[t] \triangleq \mathbb{E}\big[V_1[t+1] - V_1[t] \,\big|\, \mathbf{Z}'[t]\big] = \mu_\Sigma - q \sum_{n=1}^{N} \mu_n \mathbb{E}\{S_n[t] \mid \mathbf{Z}'[t]\} - q \sum_{n=1}^{N} \mu_n \mathbb{E}\{\Delta_n[t] S_n[t] \mid \mathbf{Z}'[t]\}.$$
Since $J^{(\mathrm{OPT})}(\beta, \gamma)$ is bounded under the optimal policy, the weighted sum of average age must also be finite under the optimal policy. Therefore, one can conclude that $\mathbb{E}[\Delta V_1[t]] = 0$ in steady-state. It then follows that:
$$\sum_{n=1}^{N} \mu_n \mathbb{E}\big[\widehat{\Delta}_n \widehat{S}_n\big] = \frac{1}{q} \mu_\Sigma - \sum_{n=1}^{N} \mu_n \mathbb{E}[\widehat{S}_n], \tag{22}$$
where $\widehat{S}_n$ is the random variable with the same distribution as $S_n[t]$ in steady-state under the optimal policy.

Similarly, using the Lyapunov function $V_2[t] \triangleq \sum_{n=1}^N \mu_n \Delta_n^2[t]$ and setting its drift to zero in steady-state yields:
$$\sum_{n=1}^{N} \mu_n \mathbb{E}[\widehat{\Delta}_n] = \frac{q}{2} \sum_{n=1}^{N} \mu_n \mathbb{E}\big[\widehat{\Delta}_n^2 \widehat{S}_n\big] + \frac{q}{2} \sum_{n=1}^{N} \mu_n \mathbb{E}\big[\widehat{\Delta}_n \widehat{S}_n\big]. \tag{23}$$
For any sample path, by the Cauchy-Schwarz inequality, we have
$$\Big(\sum_{n=1}^{N} \mu_n \widehat{\Delta}_n \widehat{S}_n\Big)^2 = \Big(\sum_{n=1}^{N} \sqrt{\mu_n \widehat{S}_n} \cdot \sqrt{\mu_n \widehat{S}_n}\, \widehat{\Delta}_n\Big)^2 \le \Big(\sum_{n=1}^{N} \mu_n \widehat{S}_n\Big)\Big(\sum_{n=1}^{N} \mu_n \widehat{\Delta}_n^2 \widehat{S}_n\Big), \tag{24}$$
which implies $\sum_{n=1}^N \mu_n \widehat{\Delta}_n^2 \widehat{S}_n \ge \frac{(\sum_{n=1}^N \mu_n \widehat{\Delta}_n \widehat{S}_n)^2}{\sum_{n=1}^N \mu_n \widehat{S}_n}$, and hence
$$\mathbb{E}\Big[\sum_{n=1}^{N} \mu_n \widehat{\Delta}_n^2 \widehat{S}_n\Big] \ge \mathbb{E}\Bigg[\frac{\big(\sum_{n=1}^{N} \mu_n \widehat{\Delta}_n \widehat{S}_n\big)^2}{\sum_{n=1}^{N} \mu_n \widehat{S}_n}\Bigg]. \tag{25}$$
Since $f(X, Y) = X^2/Y$ is convex for all $X \ge 0$ and $Y > 0$, by using Jensen's inequality, we have $\mathbb{E}\big[\frac{X^2}{Y}\big] \ge \frac{(\mathbb{E}[X])^2}{\mathbb{E}[Y]}$. Thus, setting $X = \sum_{n=1}^N \mu_n \widehat{\Delta}_n \widehat{S}_n$ and $Y = \sum_{n=1}^N \mu_n \widehat{S}_n$, inequality (25) becomes:
$$\sum_{n=1}^{N} \mu_n \mathbb{E}\big[\widehat{\Delta}_n^2 \widehat{S}_n\big] \ge \frac{\big(\sum_{n=1}^{N} \mu_n \mathbb{E}[\widehat{\Delta}_n \widehat{S}_n]\big)^2}{\sum_{n=1}^{N} \mu_n \mathbb{E}[\widehat{S}_n]}. \tag{26}$$
By combining (22), (23) and (26), we have:
$$\sum_{n=1}^{N} \mu_n \mathbb{E}[\widehat{\Delta}_n] \ge \frac{\mu_\Sigma}{2}\Bigg[\frac{\mu_\Sigma}{q \sum_{n=1}^{N} \mu_n \mathbb{E}[\widehat{S}_n]} - 1\Bigg] \ge \frac{\mu_\Sigma}{2}\Big[\frac{\mu_\Sigma}{q \mu_{\max}} - 1\Big], \tag{27}$$
where the last step holds because $\mu_{\max} \triangleq \max_n \mu_n$ and $\sum_{n=1}^N \widehat{S}_n \le 1$.

In order to lower-bound $\sum_{n=1}^N \mathbb{E}[\widehat{Q}_n]$, we construct a hypothetical single-server queue $\{\Phi[t]\}$ with the same arrival process $\{A[t]\}_{t \ge 0}$ and an aggregated service process $\{R_\Sigma[t]\}_{t \ge 0}$, where $R_\Sigma[t] \triangleq \sum_{n=1}^N R_n[t]$. The queue-length evolution of this single-server queue can be written as: $\Phi[t+1] = \max\{\Phi[t] + A[t] - R_\Sigma[t], 0\}$. Due to resource pooling, the constructed hypothetical single-server queue-length $\{\Phi[t]\}_{t \ge 0}$ is stochastically smaller than $\{\sum_{n=1}^N Q_n[t]\}_{t \ge 0}$ under any feasible policy. Hence, by [27, Lemma 5], we immediately have the following lower bound:
$$\sum_{n=1}^{N} \mathbb{E}[\widehat{Q}_n] \ge \frac{M N}{\epsilon}, \tag{28}$$
where $M \triangleq \frac{\epsilon}{2N(\mu_\Sigma - \lambda)}\big(\mathrm{Var}(A[t]) + \sum_{n=1}^N \mathrm{Var}(R_n[t]) + (\mu_\Sigma - \lambda)^2\big) - \frac{\epsilon R_{\max}}{2}$. Lastly, combining (27), (28), and (21) yields the desired result in Theorem 2 and the proof is complete.

Remark 2.
From Theorem 2, we can see that for any fixed $\gamma$ value, we have

$$
\lim_{\beta \to \infty}\rho(\beta, \gamma) \le 1 - \frac{\frac{1}{q\mu_{\max}}\sum_{n=1}^{N}\mu_n - 1}{2\left(\frac{N}{q} - 1\right)},
$$

whose upper bound is equal to $1/2$ in the case with symmetric services, i.e., $\mu_1 = \mu_2 = \cdots = \mu_N$. However, we shall see from the numerical results presented in Section VII that for any fixed $\gamma$ value, as $\beta$ increases, the PoA actually converges to zero in the case with symmetric services. The looseness of the upper bound analysis is due to the intrinsic nature of the Lyapunov analysis methodology, which only captures the drift among neighboring slots in the temporal domain and does not characterize the Round-Robin behavior in the spatial domain.

VII. NUMERICAL RESULTS
In this section, we conduct simulations to study the PoA performance under users' selfishness (cf. (1)) in a mobile crowd-learning system. We use a 10-PoI system and assume that each PoI n's state information p_n[t] belongs to the finite set { . , . , . , }, and p_n[t] changes to a different value uniformly at random every time slots. We consider both deterministic and stochastic cases. For the deterministic case, we assume that there is exactly one arriving user in each time slot and that each PoI can serve one user per time slot, if any. For the stochastic case, we assume that users arrive at the system according to the Bernoulli distribution with mean λ = 0 . and that the service provided by each PoI n follows an i.i.d. Bernoulli distribution with mean µ_n, n = 1, 2, . . . , 10. We consider both symmetric and asymmetric services: for symmetric services, we let µ_n = 0 . , ∀n; for asymmetric services, we let µ_n = 0 . , ∀n = 1, 2, . . . , 5 and µ_n = 0 . , ∀n = 6, 7, . . . , 10.
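The stochastic setup described above can be sketched as a small simulation. This is a minimal sketch under stated assumptions, not the authors' simulator: the selfish rule (1) is not reproduced in this section, so the utility `beta * age - gamma * queue - known` is an assumed form; the AoI is assumed to reset to zero when the selecting user reports (with probability `q`) and to grow by one otherwise; and the default price set, `price_period`, arrival rate, and service rates below are hypothetical placeholders rather than the values used in the paper.

```python
import random


def simulate_selfish(beta, gamma, lam, mu, q=1.0,
                     prices=(1.0, 2.0, 3.0, 4.0), price_period=50,
                     horizon=5000, seed=0):
    """Return (time-average total AoI, time-average total queue length).

    All defaults are hypothetical placeholders, not the paper's settings.
    """
    rng = random.Random(seed)
    n = len(mu)
    age = [0] * n                                    # Delta_n[t]
    queue = [0] * n                                  # Q_n[t]
    price = [rng.choice(prices) for _ in range(n)]   # true state p_n[t]
    known = list(price)                              # last reported value r_n[t]
    age_total = 0
    queue_total = 0
    for t in range(horizon):
        # The true state jumps to a different value every price_period slots.
        if t > 0 and t % price_period == 0:
            price = [rng.choice([v for v in prices if v != p]) for p in price]
        # Bernoulli arrival; the user picks the PoI with the best assumed utility.
        chosen = None
        if rng.random() < lam:
            chosen = max(
                range(n),
                key=lambda i: beta * age[i] - gamma * queue[i] - known[i],
            )
            queue[chosen] += 1
        # Bernoulli service: each PoI serves at most one waiting user per slot.
        for i in range(n):
            if queue[i] > 0 and rng.random() < mu[i]:
                queue[i] -= 1
        # AoI update: reset on a successful report, otherwise grow by one.
        for i in range(n):
            if i == chosen and rng.random() < q:
                age[i] = 0
                known[i] = price[i]
            else:
                age[i] += 1
        age_total += sum(age)
        queue_total += sum(queue)
    return age_total / horizon, queue_total / horizon


if __name__ == "__main__":
    mu = [0.2] * 10  # hypothetical symmetric service rates
    for beta in (0.0, 1.0, 10.0):
        print(beta, simulate_selfish(beta, gamma=1.0, lam=0.5, mu=mu))
```

Sweeping `beta` for a fixed `gamma` gives the selfish-policy side of a PoA curve. The optimal-policy side still has to come from a benchmark or lower bound such as the ones discussed in this section.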
1) Deterministic Scenario:
Fig. 4 illustrates the PoA performance in the deterministic case. In this case, there is no queueing effect, and the PoA performance reflects the information freshness under users' selfish behavior compared to the optimal AoI performance. We can observe from Fig. 4 that PoA decreases as the reward rate β increases and roughly follows the O(1/β) law, meaning that the AoI performance improves. Moreover, PoA decreases to zero for β ≥ . . This means that the AoI performance is optimal even with selfish users. Both observations corroborate the result in Theorem 1.
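One way to see why the PoA can reach zero at a finite reward rate is the following sketch of the deterministic case. It assumes, hypothetically, that each user maximizes `beta * age - cost`, that the per-PoI `cost` values are fixed, and that a visit resets the age to zero (q = 1); none of these details are confirmed by this section. Under these assumptions, once `beta` exceeds the spread of the costs, one extra slot of age outweighs any cost difference, so the selfish choice coincides with visiting the stalest PoI, which is a Round-Robin schedule.

```python
def run_deterministic(beta, cost, horizon):
    """One user per slot, no queueing; return (visit order, final ages)."""
    n = len(cost)
    age = [0] * n
    visits = []
    for _ in range(horizon):
        # Assumed selfish rule: maximize beta * age - cost (ties -> lowest index).
        k = max(range(n), key=lambda i: beta * age[i] - cost[i])
        visits.append(k)
        # The visited PoI is refreshed; every other PoI ages by one slot.
        age = [0 if i == k else a + 1 for i, a in enumerate(age)]
    return visits, age


def is_round_robin(visits, n):
    """True if every window of n consecutive visits covers all n PoIs."""
    return all(
        len(set(visits[s:s + n])) == n
        for s in range(len(visits) - n + 1)
    )


if __name__ == "__main__":
    cost = [1.0, 4.0, 2.0, 3.0]  # hypothetical fixed costs, spread = 3
    for beta in (0.0, 5.0):
        visits, age = run_deterministic(beta, cost, horizon=40)
        print(beta, is_round_robin(visits, len(cost)), sum(age))
```

With `beta = 0` the user always returns to the cheapest PoI and the other ages grow without bound, whereas with `beta` above the cost spread the total age settles at N(N-1)/2, the value produced by a Round-Robin schedule.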
2) Stochastic Scenario:
Next, we study the PoA performance in stochastic cases. We consider both symmetric and asymmetric services. Here, PoA reflects the gap between the joint AoI-congestion performance under users' selfishness and the optimal performance. We note that, even without incorporating AoI, it remains an open problem to find an optimal policy to minimize the total mean queue-length. In deriving the upper bound on PoA, we use the fundamental lower bound on total mean queue-length (cf. [27, Lemma 5]), which may not be tight. In this simulation, we adopt the Join-the-Shortest-Queue (JSQ) policy (e.g., [27]) and use its mean queue-length to serve as a lower bound for the queueing component in PoA. This is because JSQ minimizes the total mean queue-length (see [28, Proposition 3]) in the case with Bernoulli arrival and symmetric Bernoulli services, and it is optimal (see [27]) in the case with general arrival and service processes in the heavy-traffic regime (i.e., arrival rate approaches the total service rate asymptotically).

Fig. 5 shows the PoA performance of the case with asymmetric services under different values of γ. We can see that, for any fixed γ value, PoA converges to . instead of as β increases. The main reason is that we adopt the weighted sum of mean age as the metric for information freshness, and the policy that achieves optimal information freshness is unknown. Thus, we use our derived fundamental lower bound on the weighted mean sum-age to replace the optimal value for the information freshness, which may render a loose bound on PoA. However, we point out that our derived lower bound is tight in the symmetric service case as the reward rate β increases asymptotically, even though the derived upper bound of PoA is 1/2 (cf. Remark 2). Indeed, we can observe from Fig. 6 that PoA actually converges to zero as β increases in the case with symmetric services.

Fig. 4: PoA with respect to reward rate β in the deterministic case. [Plot: Price of Anarchy (PoA) versus Reward Rate.]

Fig. 5: PoA with respect to β and γ in the stochastic case with asymmetric services. [Plot: Price of Anarchy (PoA) versus Reward Rate; curves for γ = 0, 1, 5, 10.]

Fig. 6: PoA with respect to β and γ in the stochastic case with symmetric services. [Plot: Price of Anarchy (PoA) versus Reward Rate; curves for γ = 0, 1, 5, 10.]

VIII. CONCLUSION
In this paper, we have strived to understand whether or not we can achieve an information freshness guarantee with selfish users in mobile crowd-learning. To answer this question, we first developed a new analytical model that takes into account the essential features of mobile crowd-learning. Then, based on this model, we showed that the natural greedy behavior of selfish users could lead to AoI instability, which necessitates the design of reward mechanisms to induce an information freshness guarantee. Toward this end, we proposed a linear AoI-based reward mechanism, under which we analyzed the impacts of users' selfishness on AoI based on the notion of Price of Anarchy (PoA). We showed that the proposed reward mechanism achieves bounded AoI and congestion performances in terms of PoA, and can even achieve optimal AoI asymptotically in a deterministic scenario. Collectively, these results serve as an exciting first step toward optimizing information freshness in mobile crowd-learning systems.

REFERENCES
Proc. CISS, Princeton, NJ, USA, March 2012, pp. 1–6.
[7] R. D. Yates and S. Kaul, “Real-time status updating: Multiple sources,” in Proc. IEEE ISIT, Cambridge, MA, USA, July 2012, pp. 2666–2670.
[8] S. Kaul, R. D. Yates, and M. Gruteser, “Real-time status: How often should one update?” in Proc. IEEE INFOCOM, Orlando, FL, USA, March 2012, pp. 2731–2735.
[9] S. Kaul, M. Gruteser, V. Rai, and J. Kenney, “Minimizing age of information in vehicular networks,” in Proc. IEEE SECON, Salt Lake City, UT, USA, June 2011, pp. 350–358.
[10] “Creek watch,” 2010. [Online]. Available: http://creekwatch.researchlabs.ibm.com/
[11] N. Maisonneuve, M. Stevens, M. E. Niessen, and L. Steels, “Noisetube: Measuring and mapping noise pollution with mobile phones.” Springer, 2009, pp. 215–228.
[12] R. K. Rana, C. T. Chou, S. S. Kanhere, N. Bulusu, and W. Hu, “Ear-phone: an end-to-end participatory urban noise mapping system,” in Proceedings of the 9th ACM/IEEE International Conference on Information Processing in Sensor Networks. ACM, 2010, pp. 105–116.
[13] D. Zhang, H. Xiong, L. Wang, and G. Chen, “Crowdrecruiter: selecting participants for piggyback crowdsensing under probabilistic coverage constraint,” in Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing. ACM, 2014, pp. 703–714.
[14] J. Liu, H. Shen, and X. Zhang, “A survey of mobile crowdsensing techniques: A critical component for the internet of things,” in Proc. IEEE ICCCN, 2016.
[15] Y. Chen, J. Zhou, and M. Guo, “A context-aware search system for internet of things based on hierarchical context model.” Telecommunication Systems, vol. 62, no. 1, pp. 77–91, 2016.
[16] K. Han, H. Huang, and J. Luo, “Posted pricing for robust crowdsensing,” in Proc. ACM MobiHoc, 2016, pp. 261–270.
[17] Y. Sun, Y. Polyanskiy, and E. Uysal-Biyikoglu, “Remote estimation of the Wiener process over a channel with random delay,” in Proc. IEEE ISIT, June 2017, pp. 321–325.
[18] X. Gao, E. Akyol, and T. Basar, “Optimal estimation with limited measurements and noisy communication,” in Proc. IEEE CDC, Osaka, Japan, December 2015, pp. 1775–1780.
[19] E. T. Ceran, D. Gunduz, and A. Gyorgy. (2017, October) Average age of information with hybrid ARQ under a resource constraint. [Online]. Available: https://arxiv.org/abs/1710.04971v1
[20] R. D. Yates, E. Najm, E. Soljanin, and J. Zhong, “Timely updates over an erasure channel,” in Proc. IEEE ISIT, 2017.
[21] C. Kam, S. Kompella, G. Nguyen, J. Wieselthier, and A. Ephremides, “Information freshness and popularity in mobile caching,” in Proc. IEEE ISIT, June 2017, pp. 136–140.
[22] R. D. Yates, P. Ciblat, A. Yener, and M. Wigger, “Age-optimal constrained cache updating,” in Proc. IEEE ISIT, June 2017, pp. 141–145.
[23] Y. Sun, E. Uysal-Biyikoglu, R. D. Yates, C. E. Koksal, and N. B. Shroff, “Update or wait: How to keep your data fresh,” IEEE Transactions on Information Theory, vol. 63, no. 11, pp. 7492–7508, November 2017.
[24] S. K. Kaul and R. D. Yates, “Status updates over unreliable multiaccess channels,” in Proc. IEEE ISIT, 2017, pp. 331–335.
[25] B. Li, A. Eryilmaz, and R. Srikant, “On the universality of age-based scheduling in wireless networks,” in Proc. IEEE INFOCOM, Kowloon, Hong Kong, April 2015, pp. 1302–1310.
[26] N. B. Lakshminarayana, J. Lee, and H. Kim, “Age based scheduling for asymmetric multiprocessors,” in Conference on High Performance Computing Networking, Storage and Analysis, Portland, OR, USA, November 2009.
[27] A. Eryilmaz and R. Srikant, “Asymptotically tight steady-state queue length bounds implied by drift conditions,” Queueing Systems, vol. 72, no. 3-4, pp. 311–359, 2012.
[28] B. Li, A. Eryilmaz, R. Srikant, and L. Tassiulas, “On optimal routing in overloaded parallel queues,” in