Power-Delay Tradeoff in Multi-User Mobile-Edge Computing Systems
Yuyi Mao†, Jun Zhang†, S.H. Song†, and K. B. Letaief†∗, Fellow, IEEE
†Dept. of ECE, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong
∗Hamad bin Khalifa University, Doha, Qatar
Email: {ymaoac, eejzhang, eeshsong, eekhaled}@ust.hk

Abstract: Mobile-edge computing (MEC) has recently emerged as a promising paradigm to liberate mobile devices from increasingly intensive computation workloads, as well as to improve the quality of computation experience. In this paper, we investigate the tradeoff between two critical but conflicting objectives in multi-user MEC systems, namely, the power consumption of mobile devices and the execution delay of computation tasks. A power consumption minimization problem with task buffer stability constraints is formulated to investigate the tradeoff, and an online algorithm that decides the local execution and computation offloading policy is developed based on Lyapunov optimization. Specifically, at each time slot, the optimal frequencies of the local CPUs are obtained in closed forms, while the optimal transmit power and bandwidth allocation for computation offloading are determined with the Gauss-Seidel method. Performance analysis is conducted for the proposed algorithm, which indicates that the power consumption and execution delay obey an $[O(1/V), O(V)]$ tradeoff with $V$ as a control parameter. Simulation results are provided to validate the theoretical analysis and demonstrate the impacts of various parameters on the system performance.

Index Terms: Mobile-edge computing, dynamic voltage and frequency scaling, power control, bandwidth allocation, Lyapunov optimization, quality of computation experience.
I. INTRODUCTION
The increasing popularity of smart mobile devices is driving the development of mobile applications, which can be computation-intensive, e.g., interactive online gaming, face recognition and 3D modeling. This poses more stringent requirements on the quality of computation experience, which cannot be easily satisfied by the limited processing capability of mobile devices. As a result, new solutions to handle the explosive computation demands and the ever-increasing computation quality requirements are emerging [1]. Mobile-edge computing (MEC) is one such promising technique to release the tension between the computation-intensive applications and the resource-limited mobile devices [2]. Different from conventional cloud computing systems, where remote public clouds are utilized, MEC offers computation capability within the radio access network. Therefore, by offloading the computation tasks from the mobile devices to the MEC servers, the quality of computation experience, including the device energy consumption and execution latency, can be greatly improved [3].
This work is supported by the Hong Kong Research Grants Council under Grant No. 16200214.
Nevertheless, the efficiency of computation offloading highly depends on the wireless channel conditions, as offloading tasks requires effective data transmission. Therefore, computation offloading policies for MEC systems have attracted significant attention in recent years [4]-[11]. For applications with strict deadline requirements, the local execution energy consumption was minimized by adopting dynamic voltage and frequency scaling (DVFS) techniques, and the energy consumption for computation offloading was optimized using data transmission scheduling in [4]. In [5], joint allocation of communication and computational resources for femto-cloud computing systems was proposed, where each computation task should be completed before its deadline. In [6], a dynamic computation offloading policy was developed for MEC systems with energy harvesting devices under a strict execution delay requirement. Besides, a decentralized computation offloading algorithm was proposed to minimize the computation overhead for multi-user MEC systems in [7].

Imposing strict execution delay constraints makes the computation offloading design more tractable, as only short-term performance, e.g., the performance for executing a single task, needs to be considered. However, it may be impractical for applications that can tolerate a certain period of execution latency, such as multi-media streaming. For such types of applications, the long-term system performance is more relevant, where the coupling among the randomly arrived tasks cannot be ignored. In order to minimize the long-term average energy consumption, a stochastic control algorithm was proposed in [8], which determines the offloaded software components. In [9], a delay-optimal stochastic task scheduling algorithm was developed for single-user MEC systems.
Moreover, an online task scheduling algorithm was proposed to investigate the energy-delay tradeoff for MEC systems with a multi-core mobile device in [10], and this study was later extended to scenarios with heterogeneous types of mobile applications in [11]. Unfortunately, these works on long-term performance only focused on single-user MEC systems, and the design methodologies for multi-user MEC systems remain unknown.

In this paper, we consider a general MEC system with multiple mobile devices, where computation tasks arrive at the mobile devices in a stochastic manner. Joint design of local execution and computation offloading strategies will be investigated. With multiple devices, the design becomes much more challenging as intelligent management of the radio resources for computation offloading, e.g., the transmit power and available spectrum, is needed. We formulate a power consumption minimization problem with task buffer stability constraints. An online algorithm is proposed based on Lyapunov optimization, which decides the CPU-cycle frequencies for local execution, and the transmit power and bandwidth allocation for computation offloading. In particular, the optimal CPU-cycle frequencies are obtained in closed forms, while the optimal transmit power and bandwidth allocation are determined by the Gauss-Seidel method. Performance analysis is conducted for the proposed algorithm, which explicitly characterizes the tradeoff between the power consumption of the mobile devices and the execution delay. Simulation results verify the theoretical analysis and demonstrate that the proposed algorithm is capable of controlling the power consumption and execution delay performance in multi-user MEC systems.

The organization of this paper is as follows. We introduce the system model in Section II. The power consumption minimization problem is formulated in Section III, and an online local execution and computation offloading policy is developed in Section IV.
Simulation results will be shown in Section V, and we will conclude this paper in Section VI.

II. SYSTEM MODEL

[Figure 1: an MEC server serving four mobile devices, labeled MD 1 to MD 4]
Fig. 1. A mobile-edge computing system with four mobile devices (MDs).
We consider a mobile-edge computing (MEC) system as shown in Fig. 1, where $N$ mobile devices running computation-intensive applications are assisted by an MEC server. The MEC server could be a small data center installed at a wireless access point deployed by the telecom operator. Therefore, it can be accessed by the mobile devices through wireless channels, and will execute the computation tasks on behalf of the mobile devices [3], [4]. By offloading part of the computation tasks to the MEC server, the mobile devices could enjoy a higher level of quality of computation experience [3].

The available system bandwidth is $w$ Hz, which is shared by the mobile devices, and the noise power spectral density at the receiver of the MEC server is denoted as $N_0$. Time is slotted and the time slot length is $\tau$. For convenience, we denote the index sets of the mobile devices and the time slots as $\mathcal{N} \triangleq \{1, \cdots, N\}$ and $\mathcal{T} \triangleq \{0, 1, \cdots\}$, respectively.

A. Computation Task and Task Queueing Models
We assume the mobile devices are running fine-grained tasks [11]: At the beginning of the $t$-th time slot, $A_i(t)$ (bits) of computation tasks arrive at the $i$-th mobile device, which can be processed starting from the $(t+1)$-th time slot. Without loss of generality, we assume the $A_i(t)$'s in different time slots are independent and identically distributed (i.i.d.) within $[A_{i,\min}, A_{i,\max}]$ with $\mathbb{E}[A_i(t)] = \lambda_i$, $i \in \mathcal{N}$.

In each time slot, part of the computation tasks of the $i$-th mobile device, denoted as $D_{l,i}(t)$, will be executed at the local CPU, while $D_{r,i}(t)$ bits of the computation tasks will be offloaded to and executed by the MEC server. The arrived but not yet executed tasks will be queued in the task buffer at each mobile device, and the queue lengths of the task buffers at the beginning of the $t$-th time slot are denoted as $\mathbf{Q}(t) \triangleq [Q_1(t), \cdots, Q_N(t)]$ with $\mathbf{Q}(0) = \mathbf{0}$, where $Q_i(t)$ evolves according to the following equation:

$$Q_i(t+1) = \max\{Q_i(t) - D_{\Sigma,i}(t), 0\} + A_i(t), \quad t \in \mathcal{T}. \tag{1}$$

In (1), $D_{\Sigma,i}(t) = D_{l,i}(t) + D_{r,i}(t)$ is the amount of tasks departing from the task buffer at the $i$-th device in time slot $t$.

B. Local Execution Model
In order to process one bit of task input at the $i$-th mobile device, $L_i$ CPU cycles will be needed, which depends on the types of applications and can be obtained by off-line measurements [12]. Denote the scheduled CPU-cycle frequency for the $i$-th mobile device in the $t$-th time slot as $f_i(t)$, which cannot exceed $f_{i,\max}$. Thus, $D_{l,i}(t)$ can be expressed as

$$D_{l,i}(t) = \tau f_i(t) L_i^{-1}. \tag{2}$$

Accordingly, the power consumption for local execution at the $i$-th mobile device is given by

$$p_{l,i}(t) = \kappa f_i^3(t), \tag{3}$$

where $\kappa$ is the effective switched capacitance related to the chip architecture [13].

C. MEC Server Execution Model
To offload the computation tasks for MEC server execution, the input bits of the tasks need to be delivered to the MEC server. For simplicity, we assume the MEC server is equipped with an $N$-core high-speed CPU so that it can execute $N$ different applications in parallel, and the processing latency at the MEC server is negligible. We leave the investigation of more general MEC servers to our future work.

The wireless channels between the mobile devices and the MEC server are i.i.d. frequency-flat block fading. Denote the small-scale fading channel power gain from the $i$-th mobile device to the MEC server at the $t$-th time slot as $h_i(t)$, which is assumed to have a finite mean value, i.e., $\mathbb{E}[h_i(t)] = \bar{h}_i < \infty$. Thus, the channel power gain from the $i$-th mobile device to the MEC server can be represented by $H_i(t) = h_i(t) g_0 (d_0/d_i)^{\theta}$, where $g_0$ is the path-loss constant, $d_0$ is the reference distance, $\theta$ is the path-loss exponent, and $d_i$ is the distance from mobile device $i$ to the MEC server. Hence, the amount of offloaded tasks at the $i$-th mobile device in time slot $t$ is given by

$$D_{r,i}(t) = \begin{cases} \alpha_i(t) w \tau \log_2\left(1 + \dfrac{H_i(t) p_{\mathrm{tx},i}(t)}{\alpha_i(t) N_0 w}\right), & \alpha_i(t) > 0 \\ 0, & \alpha_i(t) = 0, \end{cases} \tag{4}$$

where $p_{\mathrm{tx},i}(t)$ is the transmit power with the maximum value of $p_{i,\max}$, and $\alpha_i(t)$ is the portion of bandwidth allocated to the $i$-th mobile device. Denote $\boldsymbol{\alpha}(t) \triangleq [\alpha_1(t), \cdots, \alpha_N(t)]$ as the bandwidth allocation vector, which should be chosen from the feasible set $\mathcal{A}$ [14], i.e.,

$$\boldsymbol{\alpha}(t) \in \mathcal{A} \triangleq \left\{ \boldsymbol{\alpha} \in \mathbb{R}_+^N \;\Big|\; \sum_{i \in \mathcal{N}} \alpha_i \le 1 \right\}. \tag{5}$$

III. PROBLEM FORMULATION
In this section, we will first introduce the performance metrics, namely, the power consumption of the mobile devices and the average queue lengths of the task buffers. A power consumption minimization problem will then be formulated to facilitate the investigation of the power-delay tradeoff.

The average power consumption of the mobile devices, including the power consumed by the local CPUs and the transmit power for computation offloading, can be expressed as

$$\bar{P}_{\Sigma} = \lim_{T \to \infty} \frac{1}{T} \mathbb{E}\left[ \sum_{t=0}^{T-1} P_{\Sigma}(t) \right], \tag{6}$$

where $P_{\Sigma}(t) \triangleq \sum_{i \in \mathcal{N}} \left( p_{\mathrm{tx},i}(t) + p_{l,i}(t) \right)$.

According to Little's Law [15], the execution delay is proportional to the average queue length of the task buffer. Hence, we adopt the average queue length of the task buffer as a measurement of the execution delay, which can be written as

$$\bar{Q}_i = \lim_{T \to \infty} \frac{1}{T} \mathbb{E}\left[ \sum_{t=0}^{T-1} Q_i(t) \right], \quad i \in \mathcal{N}. \tag{7}$$

Denote $\mathbf{f}(t) \triangleq [f_1(t), \cdots, f_N(t)]$ and $\mathbf{p}_{\mathrm{tx}}(t) \triangleq [p_{\mathrm{tx},1}(t), \cdots, p_{\mathrm{tx},N}(t)]$. Thus, the power consumption minimization problem is formulated as

\begin{align}
\mathcal{P}_1: \min_{\mathbf{f}(t), \mathbf{p}_{\mathrm{tx}}(t), \boldsymbol{\alpha}(t)} \quad & \bar{P}_{\Sigma} \nonumber \\
\mathrm{s.t.} \quad & \boldsymbol{\alpha}(t) \in \mathcal{A}, \; t \in \mathcal{T} \tag{8} \\
& 0 \le f_i(t) \le f_{i,\max}, \; i \in \mathcal{N}, \; t \in \mathcal{T} \tag{9} \\
& 0 \le p_{\mathrm{tx},i}(t) \le p_{i,\max}, \; i \in \mathcal{N}, \; t \in \mathcal{T} \tag{10} \\
& \lim_{t \to \infty} \frac{\mathbb{E}\left[ |Q_i(t)| \right]}{t} = 0, \; i \in \mathcal{N}, \tag{11}
\end{align}

where (9) and (10) are the CPU-cycle frequency constraint and the transmit power constraint, respectively. Constraint (11) requires the task buffers to be mean rate stable [16], which ensures that all the arrived computation tasks can be executed with finite delay.

In general, $\mathcal{P}_1$ is a stochastic optimization problem, for which the CPU-cycle frequency, the transmit power as well as the bandwidth allocation need to be determined for each device at each time slot. This problem is difficult to solve as the optimal decisions are temporally correlated.
Also, a joint consideration on the local execution and computation offloading strategies is needed, as both of them affect the system performance. Besides, the spatial coupling of the bandwidth allocation among different mobile devices poses another challenge.

Instead of solving $\mathcal{P}_1$ directly, we consider $\mathcal{P}_2$, which is a modified version of $\mathcal{P}_1$ obtained by replacing set $\mathcal{A}$ in (5) by set $\tilde{\mathcal{A}}$, with $\tilde{\mathcal{A}}$ defined as

$$\tilde{\mathcal{A}} = \left\{ \boldsymbol{\alpha} \in \mathbb{R}_+^N \;\Big|\; \sum_{i \in \mathcal{N}} \alpha_i \le 1, \; \alpha_i \ge \epsilon_A, \; i \in \mathcal{N} \right\}. \tag{12}$$

With such modification, the departure function of MEC server execution, $D_{r,i}(t)$, is continuous and differentiable with respect to $\boldsymbol{\alpha}(t) \in \tilde{\mathcal{A}}$. In addition, the optimal value of $\mathcal{P}_2$ is larger than but can be made arbitrarily close to that of $\mathcal{P}_1$ by setting $\epsilon_A$ ($\epsilon_A \in (0, 1/N)$) to be sufficiently small. Furthermore, any feasible solution for $\mathcal{P}_2$ is also feasible for $\mathcal{P}_1$. Thus, we will focus on $\mathcal{P}_2$ in the remainder of this paper.

IV. ONLINE LOCAL EXECUTION AND COMPUTATION OFFLOADING POLICY
In this section, we will propose an online local execution and computation offloading policy to solve $\mathcal{P}_2$ based on Lyapunov optimization [16], where a deterministic problem needs to be solved at each time slot. We will then analyze the performance of the proposed algorithm and reveal the power-delay tradeoff in multi-user MEC systems.

A. Lyapunov Optimization-Based Online Algorithm
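To present the algorithm, we first define the Lyapunov function as

$$L(\mathbf{Q}(t)) = \frac{1}{2} \sum_{i \in \mathcal{N}} Q_i^2(t). \tag{13}$$

Thus, the Lyapunov drift function can be written as

$$\Delta(\mathbf{Q}(t)) = \mathbb{E}\left[ L(\mathbf{Q}(t+1)) - L(\mathbf{Q}(t)) \mid \mathbf{Q}(t) \right]. \tag{14}$$

Accordingly, the Lyapunov drift-plus-penalty function can be expressed as

$$\Delta_V(\mathbf{Q}(t)) = \Delta(\mathbf{Q}(t)) + V \cdot \mathbb{E}\left[ P_{\Sigma}(t) \mid \mathbf{Q}(t) \right], \tag{15}$$

where $V$ (bits·W$^{-1}$) is a control parameter in the proposed algorithm. We find an upper bound of $\Delta_V(\mathbf{Q}(t))$ under any feasible $\mathbf{f}(t)$, $\mathbf{p}_{\mathrm{tx}}(t)$, and $\boldsymbol{\alpha}(t)$, as specified in Lemma 1.

Lemma 1. For arbitrary $\mathbf{f}(t)$, $\mathbf{p}_{\mathrm{tx}}(t)$, $\boldsymbol{\alpha}(t)$ such that $\forall i \in \mathcal{N}$, $f_i(t) \in [0, f_{i,\max}]$, $p_{\mathrm{tx},i}(t) \in [0, p_{i,\max}]$, and $\boldsymbol{\alpha}(t) \in \tilde{\mathcal{A}}$, $\Delta_V(\mathbf{Q}(t))$ is upper bounded by

$$\Delta_V(\mathbf{Q}(t)) \le -\mathbb{E}\left[ \sum_{i \in \mathcal{N}} Q_i(t) \left( D_{\Sigma,i}(t) - A_i(t) \right) \,\Big|\, \mathbf{Q}(t) \right] + V \cdot \mathbb{E}\left[ P_{\Sigma}(t) \mid \mathbf{Q}(t) \right] + C, \tag{16}$$

where $C$ is a constant.

Proof: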
Proof is omitted due to space limitation.

The main idea of the proposed online local execution and computation offloading policy is to minimize the upper bound of $\Delta_V(\mathbf{Q}(t))$ in the right-hand side of (16) greedily at each time slot. By doing so, the amount of tasks waiting in the task buffers can be maintained at a small level. Meanwhile, the power consumption of the mobile devices can be minimized. The proposed algorithm is summarized in Algorithm 1, where a deterministic optimization problem $\mathcal{P}_{\mathrm{PTS}}$ needs to be solved at each time slot. It is worth noting that the objective function of $\mathcal{P}_{\mathrm{PTS}}$ corresponds to the right-hand side of (16), and all the constraints in $\mathcal{P}_2$ except the stability constraints in (11) are retained in $\mathcal{P}_{\mathrm{PTS}}$. The optimal solution for $\mathcal{P}_{\mathrm{PTS}}$ will be developed in the next subsection.
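Although the proof of Lemma 1 is omitted, a bound of the form (16) follows from a standard Lyapunov-drift argument of the kind found in [16]. The sketch below is not necessarily the authors' proof; it assumes (an assumption not stated in the text) that the conditional second moments of $D_{\Sigma,i}(t)$ and $A_i(t)$ are bounded, so that a finite constant $C$ exists. Time indices are dropped on the right-hand side for brevity.

```latex
% Square the queue dynamics (1), using (max{x,0})^2 <= x^2 and
% max{Q_i - D_{\Sigma,i}, 0} <= Q_i (since D_{\Sigma,i} >= 0, A_i >= 0):
\begin{align*}
Q_i^2(t+1)
  &= \left(\max\{Q_i - D_{\Sigma,i}, 0\}\right)^2 + A_i^2
     + 2 A_i \max\{Q_i - D_{\Sigma,i}, 0\} \\
  &\le \left(Q_i - D_{\Sigma,i}\right)^2 + A_i^2 + 2 A_i Q_i \\
  &= Q_i^2 + D_{\Sigma,i}^2 + A_i^2 - 2 Q_i \left(D_{\Sigma,i} - A_i\right).
\end{align*}
% Sum over i, divide by 2, and take the expectation conditioned on Q(t):
\begin{equation*}
\Delta(\mathbf{Q}(t)) \le C
  - \mathbb{E}\Big[\sum_{i \in \mathcal{N}} Q_i(t)\big(D_{\Sigma,i}(t) - A_i(t)\big)
    \,\Big|\, \mathbf{Q}(t)\Big],
\qquad
C \ge \frac{1}{2} \sum_{i \in \mathcal{N}}
  \mathbb{E}\big[D_{\Sigma,i}^2(t) + A_i^2(t) \,\big|\, \mathbf{Q}(t)\big].
\end{equation*}
% Adding V * E[P_\Sigma(t) | Q(t)] to both sides gives the form of (16).
```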
Algorithm 1 Lyapunov Optimization-Based Online Local Execution and Computation Offloading Policy
  At the beginning of the $t$-th time slot, obtain $\{Q_i(t)\}$, $\{H_i(t)\}$, and $\{A_i(t)\}$.
  Determine $\mathbf{f}(t)$, $\mathbf{p}_{\mathrm{tx}}(t)$ and $\boldsymbol{\alpha}(t)$ by solving
    $$\mathcal{P}_{\mathrm{PTS}}: \min_{\mathbf{f}(t), \mathbf{p}_{\mathrm{tx}}(t), \boldsymbol{\alpha}(t)} \; -\sum_{i \in \mathcal{N}} Q_i(t) D_{\Sigma,i}(t) + V \cdot P_{\Sigma}(t) \quad \mathrm{s.t.} \;\; \boldsymbol{\alpha}(t) \in \tilde{\mathcal{A}}, \; (9) \text{ and } (10).$$
  Update $\{Q_i(t)\}$ according to (1) and set $t = t + 1$.

B. Optimal Solution For $\mathcal{P}_{\mathrm{PTS}}$
In this subsection, we will derive the optimal CPU-cycle frequencies, transmit powers and bandwidth allocation vector for $\mathcal{P}_{\mathrm{PTS}}$.

Optimal CPU-cycle Frequencies:
It is straightforward to show that the optimal CPU-cycle frequency for the $i$-th mobile device in time slot $t$ can be obtained by solving

$$\mathcal{SP}_1: \min_{0 \le f_i(t) \le f_{i,\max}} \; -Q_i(t) \tau f_i(t) L_i^{-1} + V \kappa f_i^3(t), \tag{17}$$

and its optimal solution is achieved at either the stationary point of the objective function or one of the boundary points, which is given by

$$f_i^{\star}(t) = \min\left\{ f_{i,\max}, \sqrt{\frac{Q_i(t) \tau}{3 \kappa V L_i}} \right\}, \quad i \in \mathcal{N}. \tag{18}$$

Remark 1. Note that $f_i^{\star}(t)$ increases with $Q_i(t)$ as it is desirable to execute more tasks in order to keep the queue length of the task buffer small. Besides, $f_i^{\star}(t)$ decreases with both $V$ and $L_i$: With a larger value of $V$, the weight of the power consumption becomes larger, and thus the local CPU slows down its frequency to reduce power consumption; with a larger value of $L_i$, local execution becomes less efficient as more CPU cycles are needed to process each bit of task input, which leads to a smaller CPU-cycle frequency.

Optimal Transmit Power and Bandwidth Allocation:
After decoupling $\mathbf{f}(t)$ from $\mathcal{P}_{\mathrm{PTS}}$, the optimal $\mathbf{p}_{\mathrm{tx}}^{\star}(t)$ and $\boldsymbol{\alpha}^{\star}(t)$ can be obtained by solving

$$\mathcal{SP}_2: \min_{\boldsymbol{\alpha}(t), \mathbf{p}_{\mathrm{tx}}(t)} \; -\sum_{i \in \mathcal{N}} Q_i(t) D_{r,i}(t) + V \sum_{i \in \mathcal{N}} p_{\mathrm{tx},i}(t) \quad \mathrm{s.t.} \;\; 0 \le p_{\mathrm{tx},i}(t) \le p_{i,\max}, \; i \in \mathcal{N}, \;\; \boldsymbol{\alpha}(t) \in \tilde{\mathcal{A}}. \tag{19}$$

It is not difficult to identify that $\mathcal{SP}_2$ is a convex optimization problem. However, generic convex algorithms suffer from relatively high complexity as they are developed for general convex problems and do not make use of the problem structures [17]. Motivated by this, we propose to solve $\mathcal{SP}_2$ by optimizing the transmit power and bandwidth allocation in an alternating manner, where in each iteration, the optimal transmit powers are obtained in closed forms and the optimal bandwidth allocation is determined by the Lagrangian method. Since $\mathcal{SP}_2$ is jointly convex with respect to $\mathbf{p}_{\mathrm{tx}}(t)$ and $\boldsymbol{\alpha}(t)$, and its feasible region is a Cartesian product of those of $\mathbf{p}_{\mathrm{tx}}(t)$ and $\boldsymbol{\alpha}(t)$, the alternating minimization procedure is guaranteed to converge to the global optimal solution, which is termed the Gauss-Seidel method in the literature [18].
1) Optimal Transmit Power:
For a fixed bandwidth allocation vector $\boldsymbol{\alpha}(t)$, the optimal transmit power for the $i$-th mobile device can be obtained by solving

$$\mathcal{P}_{\mathrm{PWR}}: \min_{0 \le p_{\mathrm{tx},i}(t) \le p_{i,\max}} \; -Q_i(t) D_{r,i}(t) + V p_{\mathrm{tx},i}(t), \tag{20}$$

whose optimal solution is achieved at either the stationary point of the objective function or one of the boundary points, similar to $\mathcal{SP}_1$, and it is given in closed form by

$$p_{\mathrm{tx},i}^{\star}(t) = \min\left\{ \alpha_i(t) w \max\left\{ \frac{Q_i(t) \tau}{\ln 2 \cdot V} - \frac{N_0}{H_i(t)}, 0 \right\}, p_{i,\max} \right\}, \quad i \in \mathcal{N}. \tag{21}$$
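The closed form in (21) can be checked numerically against the objective in (20). The following Python sketch does this for one device; the function names and all numerical values are illustrative assumptions chosen only so that the stationary point lies inside $[0, p_{i,\max}]$, and they are not the simulation settings of this paper.

```python
import math


def offloaded_bits(alpha, p_tx, H, w, tau, N0):
    """Bits offloaded in one slot, following (4)."""
    if alpha <= 0.0:
        return 0.0
    return alpha * w * tau * math.log2(1.0 + H * p_tx / (alpha * N0 * w))


def optimal_tx_power(Q, H, alpha, V, w, tau, N0, p_max):
    """Closed-form minimizer of (20) over [0, p_max], following (21).

    Assumes V > 0 and H > 0.
    """
    level = Q * tau / (math.log(2) * V) - N0 / H
    return min(alpha * w * max(level, 0.0), p_max)


def power_objective(p_tx, Q, H, alpha, V, w, tau, N0):
    """Objective of (20): -Q * D_r + V * p_tx."""
    return -Q * offloaded_bits(alpha, p_tx, H, w, tau, N0) + V * p_tx


# Illustrative (assumed) values: queue in bits, bandwidth in Hz, slot in s,
# noise density in W/Hz, powers in W.
example = dict(Q=4000.0, H=2e-13, alpha=0.2, V=1e8, w=1e7, tau=1e-3, N0=4e-21)
p_star = optimal_tx_power(p_max=0.5, **example)

# Brute-force check: no grid point in [0, p_max] should beat the closed form.
grid_best = min(power_objective(0.5 * k / 1000, **example) for k in range(1001))
closed_form_is_optimal = power_objective(p_star, **example) <= grid_best + 1e-6
```

An empty queue drives the bracket in (21) negative, so the device stays silent, while a very long queue saturates the power at $p_{i,\max}$.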
2) Optimal Bandwidth Allocation:
For a fixed transmit power vector $\mathbf{p}_{\mathrm{tx}}(t)$, the optimal bandwidth allocation can be obtained by solving the following problem:

$$\mathcal{P}_{\mathrm{BW}}: \min_{\boldsymbol{\alpha}(t) \in \tilde{\mathcal{A}}} \; -\sum_{i \in \mathcal{N}} Q_i(t) D_{r,i}(t), \tag{22}$$

which is more challenging as the bandwidth allocation decision is coupled among different mobile devices. Fortunately, the Lagrangian method offers an effective solution for $\mathcal{P}_{\mathrm{BW}}$. Specifically, the partial Lagrangian can be written as

$$L(\boldsymbol{\alpha}(t), \lambda) = -\sum_{i \in \mathcal{N}} Q_i(t) D_{r,i}(t) + \lambda \left( \sum_{i \in \mathcal{N}} \alpha_i(t) - 1 \right), \tag{23}$$

where $\lambda \ge 0$ is the Lagrangian multiplier associated with constraint $\sum_{i \in \mathcal{N}} \alpha_i(t) \le 1$. Based on the Karush-Kuhn-Tucker (KKT) conditions, the optimal bandwidth allocation $\boldsymbol{\alpha}^{\star}(t)$ and the optimal Lagrangian multiplier $\lambda^{\star}$ should satisfy the following equation set:

$$\begin{cases} \alpha_i^{\star}(t) = \max\{\epsilon_A, R_i(\lambda^{\star})\}, \; i \in \mathcal{N}, \; \lambda^{\star} > 0 \\ \sum_{i \in \mathcal{N}} \alpha_i^{\star}(t) = 1. \end{cases} \tag{24}$$

In (24), if $p_{\mathrm{tx},i}^{\star}(t) = 0$, $R_i(\lambda) \triangleq \epsilon_A$; otherwise, $R_i(\lambda)$ denotes the root of $Q_i(t) \frac{dD_{r,i}(t)}{d\alpha_i(t)} = \lambda$ for $\lambda > 0$, which is positive and unique as $\frac{dD_{r,i}(t)}{d\alpha_i(t)}$ decreases with $\alpha_i(t)$. Thus, it suggests a bisection search over $[\lambda_L, \lambda_U]$ for the optimal $\lambda^{\star}$, where $\lambda_L = \max_{i \in \mathcal{N}} Q_i(t) \frac{dD_{r,i}(t)}{d\alpha_i(t)} \big|_{\alpha_i(t)=1}$, and $\lambda_U$ satisfies $\sum_{i \in \mathcal{N}} \max\{\epsilon_A, R_i(\lambda_U)\} < 1$. Hence, $R_i(\lambda)$ can be obtained by a bisection search over $(0, 1]$, and the searching process for the optimal $\lambda^{\star}$ will be terminated when $\left| \sum_{i \in \mathcal{N}} \max\{\epsilon_A, R_i(\lambda)\} - 1 \right| < \xi$, where $\xi$ is the accuracy of the algorithm. Details of the Lagrangian method for $\mathcal{P}_{\mathrm{BW}}$ are summarized in Algorithm 2.

Algorithm 2 Lagrangian Method for $\mathcal{P}_{\mathrm{BW}}$
  Set $\xi = 10^{-[\ldots]}$, $\lambda_U = \lambda_L$, $l = 0$, $I_{\max} = 200$, $\beta = 1.[\ldots]$, $\epsilon_A = 10^{-[\ldots]}$.
  Set $\alpha_i(t) = \max\{\epsilon_A, R_i(\lambda_U)\}$, $i \in \mathcal{N}$.
  While $\sum_{i \in \mathcal{N}} \alpha_i(t) \ge 1$ do
    $\lambda_U = \beta \cdot \lambda_U$.
    Set $\alpha_i(t) = \max\{\epsilon_A, R_i(\lambda_U)\}$, $i \in \mathcal{N}$.
  Endwhile
  While $\left| \sum_{i \in \mathcal{N}} \alpha_i(t) - 1 \right| \ge \xi$ and $l \le I_{\max}$ do
    $\tilde{\lambda} = \frac{1}{2}(\lambda_L + \lambda_U)$ and $l = l + 1$.
    Set $\alpha_i(t) = \max\{\epsilon_A, R_i(\tilde{\lambda})\}$, $i \in \mathcal{N}$.
    If $\sum_{i \in \mathcal{N}} \alpha_i(t) > 1$ then
      $\lambda_L = \tilde{\lambda}$.
    Else
      $\lambda_U = \tilde{\lambda}$.
    Endif
  Endwhile
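The steps of Algorithm 2 can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the helper names are hypothetical, the defaults `eps=1e-4`, `xi=1e-7` and `beta=2.0` are assumed values (any $\beta > 1$ and $\epsilon_A < 1/N$ would do), and each device is summarized by `snr[i]`, an assumed shorthand for $H_i(t) p_{\mathrm{tx},i}(t) / (N_0 w)$, so that $D_{r,i} = \alpha_i w \tau \log_2(1 + \mathrm{snr}_i / \alpha_i)$.

```python
import math


def marginal_bits(alpha, snr, w, tau):
    """dD_r/d(alpha) for D_r = alpha*w*tau*log2(1 + snr/alpha); decreasing in alpha."""
    x = snr / alpha
    return w * tau / math.log(2) * (math.log1p(x) - x / (1.0 + x))


def root_R(lam, Q, snr, w, tau, eps, iters=60):
    """R_i(lambda): root of Q * dD_r/d(alpha) = lambda, found by bisection on [eps, 1]."""
    if Q <= 0.0 or snr <= 0.0:
        return eps  # nothing to offload (e.g. zero transmit power): minimum share
    gap = lambda a: Q * marginal_bits(a, snr, w, tau) - lam  # decreasing in a
    if gap(1.0) >= 0.0:
        return 1.0  # root at or beyond 1; only reached when lam <= lambda_L
    if gap(eps) <= 0.0:
        return eps
    lo, hi = eps, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def allocate_bandwidth(Q, snr, w, tau, eps=1e-4, xi=1e-7, beta=2.0, max_iter=200):
    """Bisection on the multiplier lambda, following the structure of Algorithm 2."""
    n = len(Q)
    shares = lambda lam: [max(eps, root_R(lam, Q[i], snr[i], w, tau, eps))
                          for i in range(n)]
    lam_lo = max(Q[i] * marginal_bits(1.0, snr[i], w, tau) for i in range(n))
    if lam_lo <= 0.0:
        return [eps] * n  # no device benefits from bandwidth
    lam_hi = lam_lo
    alpha = shares(lam_hi)
    while sum(alpha) >= 1.0:  # grow lambda_U until the allocation is feasible
        lam_hi *= beta
        alpha = shares(lam_hi)
    it = 0
    while abs(sum(alpha) - 1.0) >= xi and it <= max_iter:
        lam = 0.5 * (lam_lo + lam_hi)
        it += 1
        alpha = shares(lam)
        if sum(alpha) > 1.0:
            lam_lo = lam
        else:
            lam_hi = lam
    return alpha


# Three devices with identical channels but different backlogs (assumed values).
alpha_example = allocate_bandwidth(Q=[4000.0, 8000.0, 2000.0], snr=[0.5, 0.5, 0.5],
                                   w=1e7, tau=1e-3)
```

With identical channels, a device with a longer queue receives a larger share, which matches the intuition behind the KKT conditions in (24).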
Remark 2. One main benefit of the proposed online algorithm is that it does not require prior information on the computation task arrival and wireless channel fading processes, which makes it also applicable for unpredictable environments. Besides, the proposed algorithm is of low complexity, as at each time slot, the optimal CPU-cycle frequencies are obtained in closed forms, while the computation offloading policy is determined by an efficient alternating minimization algorithm. Furthermore, as will be shown in the next subsection, the achievable performance of the proposed algorithm can be analytically characterized and thus facilitates the analysis on the power-delay tradeoff in multi-user MEC systems.
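To illustrate how little state the online policy needs, the Python sketch below runs the per-slot loop of Algorithm 1 in a deliberately simplified setting: a single device with local execution only (no offloading), so that each slot reduces to the closed form (18), the departure (2), the power (3) and the queue update (1). The function names and the values of $\kappa$, $L_i$, $A_{i,\max}$ and the seed are illustrative assumptions, not the settings of this paper.

```python
import random


def local_cpu_frequency(Q, V, tau, kappa, L, f_max):
    """Closed-form CPU-cycle frequency, following (18)."""
    return min(f_max, (Q * tau / (3.0 * kappa * V * L)) ** 0.5)


def simulate_local_only(V, slots=5000, A_max=1000.0, tau=1e-3,
                        kappa=1e-27, L=1000.0, f_max=1e9, seed=0):
    """Return (average power in W, average queue length in bits) for one device.

    Simplifying assumptions: one device, no MEC offloading, arrivals uniform
    on [0, A_max] bits per slot; kappa and L are illustrative values.
    """
    rng = random.Random(seed)
    Q, power_sum, queue_sum = 0.0, 0.0, 0.0
    for _ in range(slots):
        A = rng.uniform(0.0, A_max)          # new arrivals, usable from the next slot
        f = local_cpu_frequency(Q, V, tau, kappa, L, f_max)
        D_l = tau * f / L                    # locally executed bits, (2)
        power_sum += kappa * f ** 3          # local execution power, (3)
        queue_sum += Q
        Q = max(Q - D_l, 0.0) + A            # queue dynamics, (1)
    return power_sum / slots, queue_sum / slots


# A small V favors short queues, a large V favors low power.
power_small_V, queue_small_V = simulate_local_only(V=1e3)
power_large_V, queue_large_V = simulate_local_only(V=1e8)
```

The policy uses only the current queue length, with no knowledge of the arrival distribution, and the two runs differ only in $V$.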
C. Performance Analysis
In this subsection, we will provide the main theoretical result in this paper, which characterizes the upper bounds for the power consumption of the mobile devices and the average sum queue length of the task buffers. Also, the tradeoff between the power consumption and execution delay will be revealed.
Theorem 1. Assume that $\mathcal{P}_2$ is feasible. Then:
• The average power consumption of the mobile devices under the proposed algorithm satisfies
$$\bar{P}_{\Sigma} \le P_{\Sigma}^{\mathrm{opt}} + C \cdot V^{-1}, \tag{25}$$
where $P_{\Sigma}^{\mathrm{opt}}$ is the optimal value of $\mathcal{P}_2$.
• For arbitrary $i \in \mathcal{N}$, $Q_i(t)$ is mean rate stable.
• Suppose there exist $\epsilon > 0$ and $\Psi(\epsilon)$ ($\Psi(\epsilon) > P_{\Sigma}^{\mathrm{opt}}$) that satisfy the Slater conditions [16]; then the average sum queue length of the task buffers satisfies
$$\sum_{i \in \mathcal{N}} \bar{Q}_i \le \left[ C + V \left( \Psi(\epsilon) - P_{\Sigma}^{\mathrm{opt}} \right) \right] \cdot \epsilon^{-1}. \tag{26}$$

Proof:
Proof is omitted due to space limitation.
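To make the two bounds concrete, the following short calculation combines (25) and (26). The target gap $\delta > 0$ is a symbol introduced here for illustration only; it does not appear elsewhere in this paper.

```latex
% Choose V so that the power bound (25) is within \delta of the optimum:
\begin{equation*}
V = \frac{C}{\delta}
\;\;\Longrightarrow\;\;
\bar{P}_{\Sigma} \le P_{\Sigma}^{\mathrm{opt}} + \delta .
\end{equation*}
% Substituting the same V into the queue bound (26):
\begin{equation*}
\sum_{i \in \mathcal{N}} \bar{Q}_i
\le \frac{1}{\epsilon}\left[ C + \frac{C}{\delta}
      \left( \Psi(\epsilon) - P_{\Sigma}^{\mathrm{opt}} \right) \right]
= \frac{C}{\epsilon}\left( 1 +
      \frac{\Psi(\epsilon) - P_{\Sigma}^{\mathrm{opt}}}{\delta} \right).
\end{equation*}
```

For small $\delta$, halving the guaranteed power gap therefore roughly doubles the queue-length bound.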
Remark 3. Theorem 1 shows that under the proposed online local execution and computation offloading policy, the worst-case power consumption of the mobile devices decreases in inverse proportion to $V$, while the upper bound of the execution delay increases linearly with $V$, i.e., there exists an $[O(1/V), O(V)]$ tradeoff between these two objectives. Thus, we can balance the power consumption and execution delay by adjusting $V$: For delay-sensitive types of applications, we can use a small value of $V$; while for energy-sensitive networks and delay-tolerant applications, a large value of $V$ can be adopted.

V. SIMULATION RESULTS
In simulations, we assume $N$ mobile devices are located at an equal distance of […] m from the MEC server. The small-scale fading channel power gains are exponentially distributed with unit mean. Besides, $\kappa = 10^{-[\ldots]}$, $\tau = 1$ ms, $w = 10$ MHz, $N_0 = -[\ldots]$ dBm/Hz, $g_0 = -[\ldots]$ dB, $d_0 = 1$ m, $\theta = 4$, $f_{i,\max} = 1$ GHz, $p_{i,\max} = 500$ mW, $A_i(t)$ is uniformly distributed within $[0, A_{i,\max}]$, and $L_i = 737.[\ldots]$ cycles/bit, $i \in \mathcal{N}$ [12]. The simulation results are averaged over 5000 time slots.

[Figure 2: (a) power consumption (W) vs. $V$ (bits·W$^{-1}$); (b) average queue length per user (bits) vs. $V$ (bits·W$^{-1}$)]
Fig. 2. Power consumption of the mobile devices/average queue length per user vs. the control parameter $V$, $N = 5$ and $A_{i,\max} = 4$ kbits.

We first show the relationship between the power consumption of the mobile devices/average queue length of the task buffers and the control parameter $V$ in Fig. 2. We see from Fig. 2(a) that the power consumption decreases in inverse proportion to $V$ and converges to $P_{\Sigma}^{\mathrm{opt}}$ when $V$ is sufficiently large. Meanwhile, as shown in Fig. 2(b), the average queue length of the task buffers increases linearly with $V$ and becomes unbounded when $V$ goes to infinity. These results verify the $[O(1/V), O(V)]$ tradeoff between the power consumption and execution delay as shown in Theorem 1.

[Figure 3: power consumption (W) vs. execution delay, with curves "With MEC" and "Without MEC"; curve end points are marked with values of $V$ (bits·W$^{-1}$)]
Fig. 3. Power consumption of the mobile devices vs. execution delay for systems with and without MEC, $N = 5$ and $A_{i,\max} = 4$ kbits.

[Figure 4: power consumption (W) vs. average execution delay (ms), with curves for $N = 10$, $A_{i,\max} = 4$ kbits; $N = 5$, $A_{i,\max} = 8$ kbits; and $N = 5$, $A_{i,\max} = 4$ kbits; curve end points are marked with values of $V$ (bits·W$^{-1}$)]
Fig. 4. Power consumption of the mobile devices vs. execution delay.

In Fig. 3, we show the relationship between the power consumption and execution delay for scenarios with and without MEC. (The average execution delay is calculated by $\sum_{i \in \mathcal{N}} \bar{Q}_i / \sum_{i \in \mathcal{N}} \lambda_i$ (time slots) according to Little's Law.) It is observed that by increasing $V$ from […] to […] bits·W$^{-1}$, the power consumption of the mobile devices decreases significantly for both cases. However, the behaviors of the execution delay are substantially different: With MEC, the execution delay decreases sharply from 33.2 to 1.05 ms as $V$ decreases, while without MEC, the execution delay has minor changes at around […] ms. This is because without the aid of the MEC server, the devices cannot stabilize their task buffers even with a small $V$, where the local CPUs operate at their maximum frequencies. Therefore, we verify the benefits of MEC for improving the quality of computation experience.

By varying $A_{i,\max}$ and $N$, we show the relationship between the power consumption and execution delay in Fig. 4. In general, the average execution delay increases as the power consumption decreases, which indicates that a proper $V$ should be chosen to balance the two desirable objectives. For instance, with $N = 5$ and $A_{i,\max} = 4$ kbits, if the average execution delay requirement is 20 ms, $V = 3 \times 10^{[\ldots]}$ bits·W$^{-1}$ can be chosen, and the power consumption will be 0.1 W. Besides, with a given execution delay, the power consumption increases with the computation task arrival rate (the number of mobile devices), which agrees with the intuition that as the workload of the MEC system becomes heavier, more power is needed to stabilize the task buffers. In addition, when $V$ goes to infinity, doubling the computation task arrival rates results in a higher power consumption than doubling the number of mobile devices, which is due to the increased multi-user diversity gain and the availability of extra local CPUs.

VI. CONCLUSIONS
In this paper, we investigated the power-delay tradeoff in a multi-user mobile-edge computing system. A power consumption minimization problem with task buffer stability constraints was formulated, and an online algorithm that decides the local execution and computation offloading policy was derived based on Lyapunov optimization. Performance analysis was conducted for the proposed algorithm, which explicitly characterizes the $[O(1/V), O(V)]$ tradeoff between the power consumption and execution delay performance. Simulation results validated the theoretical analysis, and showed that the proposed algorithm is capable of balancing the power consumption of the mobile devices and the quality of computation experience. For future investigation, it would be interesting to extend the findings in this work to scenarios with fairness considerations among multiple devices.

REFERENCES
[1] J. Gubbi, R. Buyya, S. Marusic, and M. Palaniswami, “Internet of Things (IoT): A vision, architectural elements, and future directions,” Elsevier Future Gener. Comp. Syst., vol. 29, no. 7, pp. 1645-1660, Sep. 2013.
[2] European Telecommunications Standards Institute (ETSI), “Mobile-edge computing-Introductory technical white paper,” Sep. 2014.
[3] M. Satyanarayanan, P. Bahl, R. Caceres, and N. Davies, “The case for VM-based cloudlets in mobile computing,” IEEE Pervasive Comput., vol. 8, no. 4, pp. 14-23, Oct. 2009.
[4] W. Zhang, Y. Wen, K. Guan, D. Kilper, H. Luo, and D. Wu, “Energy-optimal mobile cloud computing under stochastic wireless channel,” IEEE Trans. Wireless Commun., vol. 12, no. 9, pp. 4569-4581, Sep. 2013.
[5] O. Munoz, A. Iserte, and J. Vidal, “Optimization of radio and computational resources for energy efficiency in latency-constrained application offloading,” IEEE Trans. Veh. Technol., vol. 64, no. 10, pp. 4738-4755, Oct. 2015.
[6] Y. Mao, J. Zhang, and K. B. Letaief, “Dynamic computation offloading for mobile-edge computing with energy harvesting devices,” IEEE J. Sel. Areas Commun., to appear.
[7] X. Chen, “Decentralized computation offloading game for mobile cloud computing,” IEEE Trans. Parallel Distrib. Syst., vol. 26, no. 4, pp. 974-983, Apr. 2015.
[8] D. Huang, P. Wang, and D. Niyato, “A dynamic offloading algorithm for mobile computing,” IEEE Trans. Wireless Commun., vol. 11, no. 6, pp. 1991-1995, Jun. 2012.
[9] J. Liu, Y. Mao, J. Zhang, and K. B. Letaief, “Delay-optimal computation task scheduling for mobile-edge computing systems,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT), Barcelona, Spain, Jul. 2016.
[10] Z. Jiang and S. Mao, “Energy delay trade-off in cloud offloading for multi-core mobile devices,” in Proc. IEEE Global Commun. Conf. (GLOBECOM), San Diego, CA, Dec. 2015.
[11] J. Kwak, Y. Kim, J. Lee, and S. Chong, “DREAM: Dynamic resource and task allocation for energy minimization in mobile cloud systems,” IEEE J. Sel. Areas Commun., vol. 33, no. 12, pp. 2510-2523, Dec. 2015.
[12] A. P. Miettinen and J. K. Nurminen, “Energy efficiency of mobile clients in cloud computing,” in Proc. USENIX Conference on Hot Topics in Cloud Computing (HotCloud), Boston, MA, Jun. 2010.
[13] T. D. Burd and R. W. Brodersen, “Processor design for portable systems,” Kluwer J. VLSI Signal Process. Syst., vol. 13, no. 2/3, pp. 203-221, Aug. 1996.
[14] Z. Wang, V. Aggarwal, and X. Wang, “Joint energy-bandwidth allocation in multiple broadcast channels with energy harvesting,” IEEE Trans. Commun., vol. 63, no. 10, pp. 3842-3885, Oct. 2015.
[15] S. M. Ross, Introduction to probability models. Academic Press, 2014.
[16] M. J. Neely, Stochastic network optimization with application to communication and queueing systems. Morgan & Claypool, 2010.
[17] S. Boyd and L. Vandenberghe, Convex optimization. Cambridge University Press, 2004.
[18] L. Grippo and M. Sciandrone, “On the convergence of the block nonlinear Gauss-Seidel method under convex constraints,”