Distributed Resource Allocation in Device-to-Device Enhanced Cellular Networks

Abstract

Cellular network performance can significantly benefit from direct device-to-device (D2D) communication, but interference from cochannel D2D communication limits the performance gain. In hybrid networks consisting of D2D and cellular links, finding the optimal interference management is challenging. In particular, we show that the problem of maximizing network throughput while guaranteeing predefined service levels to cellular users is non-convex and hence intractable. Instead, we adopt a distributed approach that is computationally extremely efficient, and requires minimal coordination, communication and cooperation among the nodes. The key algorithmic idea is a signaling mechanism that can be seen as a fictional pricing mechanism, that the base stations optimize and transmit to the D2D users, who then play a best response (i.e., selfishly) to this signal. Numerical results show that our algorithms converge quickly, have low overhead, and achieve a significant throughput gain, while maintaining the quality of cellular links at a predefined service level.



Qiaoyang Ye, Mazin Al-Shalash, Constantine Caramanis and Jeffrey G. Andrews


I. INTRODUCTION

To meet the growing demand for local wireless services, direct communication between user equipments (UEs), called device-to-device (D2D) communication, is envisioned as an important technology component for LTE-Advanced. Taking advantage of physical proximity, D2D enables a more flexible infrastructure than conventional cellular networks, with potential benefits such as efficient resource utilization, large throughput and reduced end-to-end latency. Exploiting these benefits, D2D can create new business opportunities by supporting various applications, such as content sharing, multiplayer gaming, social networking services and mobile advertising [1–6]. In this paper, we focus on the use cases where the D2D traffic is generated by the UEs themselves (e.g., sharing a just-taken video), and thus only consider one-hop D2D transmission, which is different from the case where D2D works as a relay [7].

In D2D-enabled cellular networks, resources can be either allocated orthogonally to D2D and cellular links, or shared between them. While interference management is certainly simplified in a network with orthogonal allocation, the resource utilization is less efficient. To improve the resource utilization, we consider a hybrid network with shared allocation, called a shared network in this paper.

The uplink spectrum is often under-utilized compared to downlink [8], and thus letting D2D links use uplink resources may improve the resource utilization. Moreover, when D2D links share downlink resources, base stations (BSs) become fairly strong interferers for D2D receivers, and D2D transmitters may cause high interference to nearby co-channel cellular UEs, which may significantly degrade the network performance [9]. In contrast, when D2D links use uplink resources, the interference from D2D to cellular transmissions can be better handled, since the D2D interference is then received by the BSs, which are more powerful than UEs. Therefore, sharing uplink spectrum is preferable overall [5].

The objective of this paper is to improve the network throughput by allowing D2D communication to share the cellular uplink resources. At the same time, we restrict the access of D2D links to the uplink spectrum in order to manage the interference. Generally, centralized interference management requires the central controller (e.g., the BS) to acquire the channel state information (CSI) between each transmitter and receiver. This requires high overhead, particularly in scenarios where channels vary rapidly (e.g., in a high-mobility environment). Therefore, a distributed algorithm requiring only local information is preferable. We address the following design question in this paper: how to intelligently manage spectrum for D2D with only local information and BS assistance (e.g., setting a high cost for D2D links causing strong interference), so as to manage the interference and improve the network throughput?

Q. Ye, C. Caramanis and J. G. Andrews are with the University of Texas at Austin, USA. M. Shalash is with Huawei Technologies. Manuscript last revised: June 6, 2018.

A. Related Work

There has been increasing interest recently in the investigation of interference management in shared networks. Power control is one viable approach; existing works include a greatly simplified model with one cellular UE and one D2D link [10], a simple power reduction method based on the derived signal-to-interference-plus-noise ratio (SINR) [11], and a study of several power control schemes including fixed power and fixed SNR target [12]. Another popular approach, related to the direction we propose, is to intelligently manage spectrum for D2D links based on channel conditions and nearby interfering UEs, e.g., maximal mutual interference minimization [13], network throughput maximization [14], setting an exclusive D2D transmission zone to achieve interference avoidance [15, 16], and interference randomization through frequency and/or time hopping [9, 17]. However, the key new aspect of D2D-enabled cellular networks, namely the assistance of BSs, has received much less treatment in the literature. One possible reason is that the computational problem itself is quite difficult, and the communication and coordination alone required for a good centralized solution might be prohibitive.

As we discuss below, the key approach adopted in this paper is a two-stage distributed algorithm: BSs send out a signal representing a fictional price, which can be considered as the assistance from BSs, and then D2D users optimize a local objective function adapting to that price and to what the other users are doing. Since individual D2D links optimize their local functions "selfishly", this approach has a game theoretic interpretation, which allows us to use algorithms and analysis concepts from game theory, even though there is no actual market: the users agree to "play" this game using the BS's signal, without actually exchanging currency.

Game theory has been used in various disciplines to model competition for limited resources in more general networks. For example, work has been done considering spectrum sharing based on local bargaining [18], repeated games [19], auction mechanisms [20, 21] and two-stage games [22, 23]. Paper [24] demonstrates several different game models for D2D resource allocation, where an interesting example is the use of the reverse iterative combinatorial auction [25]. In this paper, the game theoretic approach is used as an algorithmic technique to obtain efficient distributed spectrum management. Similar to the recent work [23], we model a Stackelberg game to control the interference from D2D to the cellular network. The key difference from [23] is in the upper-stage problem, where we take into account the D2D rate. Moreover, we investigate the convergence of the algorithms for both lower-stage and upper-stage problems. Work in cognitive radio such as [22], which proposes that secondary users adapt their powers to alleviate interference to primary users, is related to the second part of our work (i.e., the investigation of optimal prices to charge D2D links accessing the shared resources). The key techniques used in this paper to study the optimal prices are similar to [22], but we in addition investigate the convergence of the spectrum management scheme for the D2D network (i.e., the lower-stage problem), as well as the convergence of the proposed heuristic algorithm.

B. Contributions and Organization

We present a distributed, efficient and low-overhead spectrum management method for D2D links to improve the throughput while keeping the performance of cellular users at a guaranteed level. Specifically, the main contributions are:

A Two-stage Distributed Algorithm.

We propose an iterative two-stage algorithm in Section III. In the first stage, the BSs send a pricing signal that adapts to the gap between the aggregate interference from D2D links and a predefined interference tolerance level: the price increases if the D2D interference is higher than the tolerance level and decreases otherwise. In the second stage, each D2D link independently maximizes its utility, which consists of a reward equal to its expected rate and a penalty proportional to the interference caused by this link to the BS, as measured by the pricing signal. Note that this two-stage model is a Stackelberg game [26], and the algorithm can be seen as a pricing mechanism. This algorithm requires no cooperation among D2D links, yet succeeds in discouraging strongly interfering or low-SINR D2D links from accessing more resource blocks (RBs).

Utility-based D2D resource allocation adaptation.

In Section IV, we consider the lower-stage problem, where we maximize the D2D rate in terms of expected SINR for tractability, which provides a performance upper bound and can serve as a benchmark. Each D2D link selfishly maximizes its utility given other D2D links' decisions and the price broadcast from BSs, which essentially forms a non-cooperative game. To reduce the computational complexity and overhead, we further consider the problem of maximizing a lower bound of the utility function for each D2D link. We then propose an efficient iterative algorithm similar to a waterfilling algorithm, which only requires local information. Our simulation results show that the result obtained by the proposed iterative algorithm is very close to the solution to the upper bound problem. This further lightens the computational burden on each user.

Cellular link performance protection.

Given the solution of the lower-stage problem, we study the optimal price the BSs report in Section V, to maximize the network utility while protecting cellular links. We show that this problem can be transformed into a linear complementarity problem (LCP). This allows us to take advantage of, and adapt for our problem, general algorithms for LCP. We further propose a simpler heuristic algorithm based on the bisection method, and observe that it has low overhead and converges very quickly with almost no loss. We also propose a simple greedy algorithm that leads to efficient computation at the cost of overall throughput, where the throughput loss decreases as the interference tolerance level increases, e.g., the throughput losses compared to the algorithm for LCP are about […] and […] when the interference tolerance level is […] dB lower and above the cellular signal in our setup, respectively.

Numerical results in Section VI show that the cellular links can be well protected with an average D2D throughput reduction of only […] in our setup, compared to the scenario where all D2D links are active. On the other hand, compared to conventional cellular networks without D2D communication, the proposed algorithms provide a significant throughput gain (about […]x with […] D2D links per cell and an average D2D link length of […] m in our simulation setup). Note that the throughput gain highly depends on the amount of D2D traffic and the average D2D link length. We take […] D2D links per cell and a link length of […] m as an illustrative example in this paper.

II. SYSTEM MODEL

We consider an uplink shared network, where cellular UEs in the same cell get different sub-bands (i.e., orthogonal chunks of RBs). Any general scheduling scheme for cellular UEs can be used. A potential D2D link can either transmit directly by D2D communication, or transmit to a BS. We call this choice mode selection. By intelligently conducting mode selection, we can adjust the aggregate interference in the network and thus optimize the achievable network performance. However, the mode selection variables in the SINR expression result in non-convexity of objective functions that are expressed in terms of rate, i.e., $\log(1+\mathrm{SINR})$. Moreover, the mode selection variables are binary, making the problem combinatorial. Note that different mode selection schemes lead to different optimal spectrum management, due to the differences in the resource allocation of cellular users and thus the differences in the interference tolerance level. On the other hand, spectrum management affects the achievable rate and thus affects the mode selection of D2D links. As discussed above, it is difficult to find the optimal mode selection, let alone the joint optimal mode selection and spectrum management of D2D links, where mode selection and spectrum management are coupled with each other. Despite the intractability of the optimization problem, there are various practical (but not necessarily optimal) mode selection approaches (e.g., distance-based mode selection [27]).

We propose the following mode selection as one viable scheme. The potential D2D transmitters are treated the same as cellular UEs when scheduling, except that we can add a weight to the scheduling metric. For example, with proportional fair scheduling [28], the user with the largest $q R_i / \bar{R}_i$ would be scheduled, where $q$ is the weight, and $R_i$ and $\bar{R}_i$ are the instantaneous rate and average rate of link $i$, respectively. Without loss of generality, we assume cellular users have $q = 1$. A typical value for $q$ of potential D2D links might be $1/2$, since a potential D2D link in cellular mode would occupy both uplink and downlink resources. We can also let the weight $q$ reflect the cost of the backhaul usage of core networks. Considering that D2D mode has more efficient resource utilization, the potential D2D links are biased against cellular mode using $q < 1$. Note that other mode selection schemes can also be applied to our framework easily. We assume potential D2D links that are not scheduled by BSs would be in D2D mode. In other words, we propose to let each BS complete the mode selection of potential D2D links in its coverage area.
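The weighted proportional-fair rule described above can be illustrated with a minimal Python sketch. The names `Link`, `pf_metric` and `select_modes`, the single-winner scheduling per decision, and the numbers in the example are assumptions made for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Link:
    name: str
    inst_rate: float             # R_i: instantaneous rate
    avg_rate: float              # average rate of link i, must be positive
    potential_d2d: bool = False  # True if the link could also run in D2D mode
    weight: float = 1.0          # q: 1 for cellular UEs, q < 1 for potential D2D links


def pf_metric(link):
    """Weighted proportional-fair metric q * R_i / (average rate of i)."""
    return link.weight * link.inst_rate / link.avg_rate


def select_modes(links):
    """Schedule the link with the largest metric (assumed: one winner per decision).

    Potential D2D links that are not scheduled by the BS fall back to D2D mode.
    """
    scheduled = max(links, key=pf_metric)
    modes = {
        link.name: ("cellular" if link is scheduled else "d2d")
        for link in links
        if link.potential_d2d
    }
    return scheduled, modes


# Hypothetical example: the D2D candidate has the better raw ratio (3.0 vs 2.0),
# but the weight q = 0.5 biases it against cellular mode.
links = [
    Link("ue1", inst_rate=2.0, avg_rate=1.0),
    Link("d2d1", inst_rate=3.0, avg_rate=1.0, potential_d2d=True, weight=0.5),
]
scheduled, modes = select_modes(links)
```

With these numbers the cellular UE wins the scheduling decision and the potential D2D link stays in D2D mode; with `weight=1.0` the outcome would flip.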
Given the mode selection, we aim to find the optimal spectrum management of D2D links to maximize the network utility. We leave the joint optimization of mode selection and spectrum management to future work.

Assuming that the cellular resource allocation is done by the BSs, our focus is on the spectrum management of D2D links. In this paper, we consider resource allocation at each RB to simplify the notation and explanation, but any general units of RBs can be considered similarly. The sets of cellular UEs accessing the $k$th RB and of D2D links are denoted by $\mathcal{C}_k$ and $\mathcal{D}$, respectively, where the set of cellular UEs includes the potential D2D links in cellular mode. We define $\mathcal{I}_k$ as the set of D2D links accessing RB $k$ (i.e., the set of interfering D2D links). Then the SINRs of D2D link $i$ at RB $k$ and of a cellular UE belonging to $\mathcal{C}_k$ are, respectively,

$$\mathrm{SINR}^{(D)}_{ik} = \frac{\mathbb{1}\{i \in \mathcal{I}_k\}\, P^D_i h^{(k)}_{ii}}{\sum_{j \in \mathcal{I}_k, j \neq i} P^D_j h^{(k)}_{ji} + \sum_{j \in \mathcal{C}_k} P^C_j h^{(k)}_{ji} + W_{ik}}, \tag{1}$$

$$\mathrm{SINR}^{(C)}_{ik} = \frac{P^C_i g^{(k)}_{ii}}{\sum_{j \in \mathcal{I}_k} P^D_j g^{(k)}_{ji} + \sum_{j \in \mathcal{C}_k, j \neq i} P^C_j g^{(k)}_{ji} + W_{ik}}, \tag{2}$$

where $\mathbb{1}\{a \in \mathcal{A}\}$ is an indicator function with $\mathbb{1}\{a \in \mathcal{A}\} = 1$ if $a \in \mathcal{A}$ and $\mathbb{1}\{a \in \mathcal{A}\} = 0$ otherwise, $P^D_i$ and $P^C_i$ are the transmit powers of D2D and cellular links, respectively, $h^{(k)}_{ji}$ is the channel gain from UE $j$ to D2D receiver $i$ at RB $k$, $g^{(k)}_{ji}$ is the channel gain from UE $j$ to the BS serving cellular UE $i$, and $W_{ik}$ is the noise power of link $i$ at RB $k$. We use Shannon capacity to calculate rate, i.e., $R_{ik} = B \log_2(1 + \mathrm{SINR}_{ik})$, where $B$ is the frequency bandwidth of a RB.

III. PROBLEM FORMULATION

In this section, we first formulate a single-stage optimization problem to maximize the D2D throughput with a performance protection for cellular links. The computational intractability of the single-stage optimization then motivates us to consider a distributed setting, where each D2D link tries to maximize its own utility based only on local information.

A. Single-stage Problem Formulation

Without loss of generality, we let $x_{ik}$ be the probability that D2D link $i$ accesses RB $k$. The investigation of optimal access probabilities upper bounds the channel assignment problem where D2D links either access a RB or not (i.e., the access probability is either 1 or 0). We consider the following utility maximization problem, subject to a D2D interference constraint to guarantee the cellular performance:

$$\max_{x} \ \sum_{i \in \mathcal{D}} w_i \sum_{k=1}^{K} R^{(D)}_{ik}(x) \quad \text{s.t.} \quad \sum_{i \in \mathcal{D}} x_{ik} P^D_i g^{(k)}_{ii} \le Q_k, \ \forall k, \qquad x_{ik} \in [0, 1], \tag{3}$$

where $w_i$ is the weight for the $i$th D2D link, and $K$ is the number of total available RBs for D2D links. Denoting the power set of $\mathcal{D}$ by $2^{\mathcal{D}}$, the rate of D2D link $i$ at RB $k$ is

$$R^{(D)}_{ik} = \sum_{\mathcal{I}_k \in 2^{\mathcal{D}}} \ \prod_{j \in \mathcal{I}_k} x_{jk} \prod_{n \in \mathcal{D} \setminus \mathcal{I}_k} (1 - x_{nk}) \, \log_2\!\left(1 + \mathrm{SINR}^{(D)}_{ik}\right). \tag{4}$$

The first constraint in (3) is for protection of cellular transmissions, where $Q_k$, called the interference tolerance level, depends on the channel condition of cellular transmission on RB $k$ (e.g., $Q_k$ could be the signal strength of the cellular link using RB $k$ multiplied by a predefined threshold). Note that $Q_k$ can be optimized to maximize a utility function incorporating the cellular rate. In this paper, we consider $Q_k$ as a predefined parameter and leave the joint optimization of $Q_k$ and D2D resource allocation to future work. We observe that (3) is not a convex optimization problem. The computational complexity of a brute-force approach to solve (3) is $O(N_x^{N_D} N_D)$, where $N_x$ is the number of possible values of $x$ to be searched and $N_D$ is the number of D2D links.
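To make the cost of evaluating the expected rate (4) concrete, the following Python sketch enumerates every access pattern of the D2D links on one RB. It is an illustration under stated assumptions: the function names, the pre-summed cellular interference `cell_interf[i]`, the index convention `h[j][i]` (gain from transmitter `j` to D2D receiver `i`), the base-2 logarithm, and the numerical values are all hypothetical.

```python
import math
from itertools import combinations


def d2d_sinr(i, active, p_d, h, cell_interf, noise):
    """SINR (1) of D2D link i on one RB, given the set of D2D links accessing it."""
    if i not in active:
        return 0.0  # indicator in (1): link i is silent on this RB
    interf = sum(p_d[j] * h[j][i] for j in active if j != i)
    return p_d[i] * h[i][i] / (interf + cell_interf[i] + noise[i])


def expected_rate(i, x, p_d, h, cell_interf, noise):
    """Expected rate (4): average of log2(1 + SINR) over all 2^N access patterns."""
    n = len(x)
    total = 0.0
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            active = set(subset)
            # Probability of this pattern under independent access decisions.
            prob = math.prod(x[j] if j in active else 1.0 - x[j] for j in range(n))
            sinr = d2d_sinr(i, active, p_d, h, cell_interf, noise)
            total += prob * math.log2(1.0 + sinr)
    return total


# Hypothetical two-link example on a single RB.
p_d = [1.0, 1.0]
h = [[1.0, 0.1], [0.2, 1.0]]
cell_interf = [0.0, 0.0]
noise = [1.0, 1.0]
rate_0 = expected_rate(0, [0.5, 0.5], p_d, h, cell_interf, noise)
```

The double loop visits all $2^{N_D}$ subsets for a single link and a single candidate $x$, which is the source of the exponential cost.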
A brute-force computation is thus essentially impossible for even a modest-sized network.

Instead of the centralized approach, we adopt a different strategy that results in an efficient, distributed algorithm with low coordination, cooperation and communication overhead. For tractability, we introduce variables, called prices for accessing RBs, to decouple the interference constraint in (3) and develop a distributed tractable framework. In particular, BSs adjust prices to control the total D2D interference, and each D2D link individually maximizes its utility in terms of the expected rate and the prices charged by BSs. This leads to a two-stage optimization problem, which consists of a problem to find optimal prices and several small-size convex optimization problems for D2D links. Though solutions to the two-stage problem may not provide the optimal solution to the original single-stage problem (3), this relaxation allows us to efficiently allocate resources in a distributed fashion, and the numerical results in Section VI demonstrate a large rate gain without serious degradation in cellular performance using the proposed algorithm for the two-stage problem.

B. Two-stage Problem Formulation

We propose a pricing mechanism, where a BS charges the D2D link $i$ in its coverage area the amount $\mu_{ik}$ per unit of the interference caused by this D2D link to the BS at RB $k$, i.e., the cost for a D2D link to access RB $k$ is $\mu_{ik} x_{ik} P^D_i g^{(k)}_{ii}$. Assuming that each cell runs this mechanism independently, the cost of a D2D link only depends on the interference caused by this D2D link to its associated BS.

We assume the interference from other cells is invariant when we consider the resource allocation in a typical cell. Therefore, we can incorporate the interference from neighboring cells into noise, and the multi-cell scenario is simplified to a single-cell scenario. Under this assumption, the interference constraint is for the interference caused by D2D links in this cell. Note that in this case, the updated noise (incorporating inter-cell interference) is different from user to user, where generally cell-edge users suffer larger noise. Though we focus on the asynchronous scheduling scenario, the proposed framework can be easily generalized to a synchronous multi-cell scenario if the price at each RB is unified among different cells, where the BS in the proposed model becomes a network controller, and the interference becomes the aggregate interference from D2D links to all BSs in the network.

The net utility of D2D link $i$ is $U_i = w_i \sum_{k=1}^{K} R^{(D)}_{ik}(x) - \sum_{k=1}^{K} \mu_{ik} x_{ik} P^D_i g^{(k)}_{ii}$, where the first and second terms can be considered as the reward and penalty functions, respectively. The problem involves a non-cooperative network, where each D2D link aims to maximize its utility selfishly. We denote the access probabilities of D2D link $i$ by $x_i := [x_{i1}, x_{i2}, \cdots, x_{iK}]^T$. The access probabilities of all other D2D links are denoted by $x_{-i} := [x_1^T, \cdots, x_{i-1}^T, x_{i+1}^T, \cdots, x_{N_D}^T]^T$, where $N_D$ is the number of D2D links.
Similarly, we define the price vector of D2D link $i$ as $\mu_i := [\mu_{i1}, \mu_{i2}, \cdots, \mu_{iK}]^T$. Given $\mu_i$ and $x_{-i}$, the problem for the D2D link $i$ is

$$\max_{x_i} \ U_i(x_i; x_{-i}, \mu_i) \quad \text{s.t.} \quad x_{ik} \in [0, 1], \ \forall k. \tag{5}$$

On the other hand, the network aims to find optimal prices:

$$\max_{\mu \ge 0} \ U_c(\mu, x^*(\mu)) \quad \text{s.t.} \quad \sum_{i \in \mathcal{D}} x^*_{ik}(\mu) P^D_i g^{(k)}_{ii} \le Q_k, \ \forall k, \tag{6}$$

where $U_c(\mu, x^*(\mu)) = \sum_{i \in \mathcal{D}} \sum_{k=1}^{K} \mu_{ik} x^*_{ik}(\mu) P^D_i g^{(k)}_{ii}$, and $x^*(\mu)$ is the solution of (5) for a given $\mu$. Taking a game theoretic perspective, the above problem is a decentralized Stackelberg game (a two-stage game), where the leader moves first and then the followers move accordingly. In this paper, the BS is the leader and the D2D links are the followers.

To solve the two-stage problem, we use a backward induction technique. We start with the problem of the D2D links, called the lower problem, and get the D2D access probability $x^*(\mu)$. By plugging $x^*(\mu)$ into (6), we then investigate the network utility maximization, called the upper problem.

IV. LOWER PROBLEM: A NON-COOPERATIVE D2D NETWORK

Given $\mu$, D2D links try to maximize their utility selfishly. This defines a non-cooperative game $G_D = [\mathcal{D}, \{x_i\}, \{U_i\}]$. For tractability, we use Jensen's inequality and consider the following objective function that upper bounds (4):

$$\max_{x_i} \ w_i \sum_{k=1}^{K} \tilde{R}^{(D)}_{ik}(x) - \sum_{k=1}^{K} \mu_{ik} x_{ik} P^D_i g^{(k)}_{ii} \quad \text{s.t.} \quad x_{ik} \in [0, 1], \ \forall k, \tag{7}$$

where

$$\tilde{R}^{(D)}_{ik} = \log_2\!\left(1 + \sum_{\mathcal{I}_k \in 2^{\mathcal{D}}} \ \prod_{j \in \mathcal{I}_k} x_{jk} \prod_{n \in \mathcal{D} \setminus \mathcal{I}_k} (1 - x_{nk}) \, \mathrm{SINR}^{(D)}_{ik}\right).$$

The upper bound is tight if most $x_{ik}$ are binary. We compare the gap between the solutions maximizing (4) and (7) numerically in Section VI and leave the analysis to future work.

We adopt an identical price for D2D links accessing the same RB, i.e., $\mu_{ik} = \mu_{jk}$. The rationale for doing this is that the BS only cares about the aggregate interference, rather than the differences between the interference values from different D2D links. The structure of (7) suggests decoupling the lower problem into $K$ subproblems, where we consider each RB independently. In the rest of this paper, we consider a typical RB, and drop the RB index $k$ for notational simplicity.

A. Distributed Algorithms Design

Optimization problems produce solutions with certain optimality guarantees. In our setting, however, the D2D links behave in a non-cooperative fashion. Thus, understanding the behavior and performance of our algorithm requires consideration of a different solution concept. This notion has been well studied in game theory, and it is known that the analogs of stationary points in an optimization solution are the so-called Nash Equilibrium (NE) points. In our context, these are the fixed points from which no D2D link would want to unilaterally deviate [26]. In the rest of this paper, the NE points always refer to the NE of the D2D non-cooperative game $G_D$. In this subsection, we study what these NE points are and propose an algorithm that converges to a NE.

We denote the feasible region of $x_i$ by $\mathcal{X}_i$, where $\mathcal{X}_i = \{x_i \in [0, 1]\}$. The existence of a NE for the non-cooperative game is given by Lemma 1, according to the Debreu-Glicksberg-Fan Theorem [29–31].

Lemma 1. If $\mathcal{X}_i$ is compact and convex, and $U_i$ is continuous and concave in $x_i$ given $x_{-i}$, then the NE exists.

It is straightforward to show that the above conditions are satisfied, and thus we have at least one NE. Then a natural question follows: how to attain a NE?

For fixed $x_{-i}$ and $\mu$, the problem (7) is a convex optimization problem, and the optimal solution is the point at which the first derivative of the objective function vanishes (if feasible):

$$x_i^* = \left[ \frac{w_i}{\mu P^D_i g_{ii} \ln 2} - \frac{1}{\sum_{\mathcal{I} \in 2^{\mathcal{D}}} \prod_{j \in \mathcal{I} \setminus \{i\}} x_j \prod_{n \in \mathcal{D} \setminus \mathcal{I}} (1 - x_n) \, \mathrm{SINR}^{(D)}_i} \right]_0^1, \tag{8}$$

where $[x]_0^1 = \min\{1, \max\{0, x\}\}$. We define the following function: $f(x_1, \cdots, x_{N_D}; \mu) = \left(x_1^*(x_{-1}), \cdots, x_{N_D}^*(x_{-N_D})\right)$, where $x_i^*(x_{-i})$ is given by (8). Function $f$ describes the optimal resource access probabilities given that the access probabilities of other links are fixed, and thus is called the best-response (BR) function. We propose a synchronous iterative algorithm, called the BR Algorithm, where all D2D links adjust their access probabilities simultaneously according to

$$(x_1(t+1), \cdots, x_{N_D}(t+1)) = f(x_1(t), \cdots, x_{N_D}(t); \mu).$$

Applying the Maximum Theorem [32], we can show that $f$ is continuous. Note that the BR Algorithm will never converge to a solution that is not a NE: at a convergence point each D2D link has the access probability that maximizes its utility, which implies that no link can gain by unilaterally changing its own access probability.

Though the procedures of the BR Algorithm are simple, the complexity of calculating (8) is high, due to the expectation calculation involving $N_D$ Bernoulli random variables, whose complexity is $O(2^{N_D} N_D)$. In addition, D2D links need to exchange their current access probabilities, which causes high overhead. The overhead and complex computation are not desirable, especially for UEs that are power limited. Other algorithms such as gradient-projection based algorithms [33] or algorithms in learning automata [34] can also be applied, with the disadvantages of either slow convergence or memory space limits. These motivate the following subsection, where we consider a lower bound of the objective function in (7).

B. Joint Resource Allocation and Power Control: A Lower Bound Problem
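As a point of comparison for the simplification developed in this subsection, the best response (8) of Section IV-A can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names, the pre-summed cellular interference `cell_interf[i]`, the index convention `h[j][i]` (gain from transmitter `j` to D2D receiver `i`), and the requirement `mu > 0` are assumptions.

```python
import math
from itertools import combinations


def conditional_mean_sinr(i, x, p_d, h, cell_interf, noise):
    """Sum in the denominator of (8): mean SINR of link i given that i transmits.

    Enumerates all 2^(N-1) access patterns of the other D2D links.
    """
    others = [j for j in range(len(x)) if j != i]
    total = 0.0
    for size in range(len(others) + 1):
        for subset in combinations(others, size):
            prob = math.prod(x[j] if j in subset else 1.0 - x[j] for j in others)
            interf = sum(p_d[j] * h[j][i] for j in subset)
            total += prob * p_d[i] * h[i][i] / (interf + cell_interf[i] + noise[i])
    return total


def best_response(i, x, mu, w, p_d, g, h, cell_interf, noise):
    """Best response (8) of link i, clipped to [0, 1]; assumes mu > 0."""
    mean_sinr = conditional_mean_sinr(i, x, p_d, h, cell_interf, noise)
    value = w[i] / (mu * p_d[i] * g[i] * math.log(2)) - 1.0 / mean_sinr
    return min(1.0, max(0.0, value))


def br_step(x, mu, w, p_d, g, h, cell_interf, noise):
    """One synchronous BR iteration: every link responds to the same x(t)."""
    return [
        best_response(i, x, mu, w, p_d, g, h, cell_interf, noise)
        for i in range(len(x))
    ]
```

Each call to `best_response` needs the access probabilities of all other links and an exponential enumeration, which is exactly the overhead and complexity that the lower bound problem avoids.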

In problem (7), each D2D link maximizes the utility in terms of the expected SINR. Approximating the rate by one calculated from the expected interference rather than the expected SINR, we have

$$\max_{x_i} \ w_i \log\!\left(1 + \mathrm{SINR}'_i\right) - \mu x_i P^D_i g_{ii} \quad \text{s.t.} \quad 0 \le x_i \le 1, \tag{9}$$

where

$$\mathrm{SINR}'_i = \frac{x_i P^D_i h_{ii}}{\sum_{j \in \mathcal{D}, j \neq i} x_j P^D_j h_{ji} + \sum_{j \in \mathcal{C}} P^C_j h_{ji} + W_i}.$$

This problem motivates a low-complexity, low-overhead algorithm, as shown below.

The variable $x_i$ in (9) can be considered as a joint resource allocation and power control variable, where $\mathbb{1}\{x_i > 0\}$ indicates whether D2D link $i$ accesses the RB, and the value of $x_i$ denotes the fraction of maximal transmit power to use. The strategy with respect to (7) can be considered as a scheme similar to random hopping (with different hopping probabilities at each link), while the strategy in (9) is deterministic and considers power control in addition to resource allocation. Intuitively, the hopping scheme randomizes strong interference, and thus may potentially provide a larger gain than the latter case, even though the latter considers power control jointly. We show this relationship mathematically in Proposition 1.

Proposition 1. The optimization problem (9) maximizes a lower bound of the utility function in (7).

Proof: Denoting the interference from other D2D links by $I$, the expected SINR can be written as $E[\mathrm{SINR}^{D}_i] = E_I\!\left[\frac{P^D_i h_{ii}}{I + \sum_{j \in \mathcal{C}} P^C_j h_{ji} + W_i}\right]$. It is straightforward to verify that $f(I) = \frac{P^D_i h_{ii}}{I + \sum_{j \in \mathcal{C}} P^C_j h_{ji} + W_i}$ is convex. By Jensen's inequality, we have $f(E[I]) \le E[f(I)]$, which completes the proof.

We call (9) the lower bound problem of (7) in this paper. Invoking Lemma 1 again, we can show that there is at least one NE for the D2D game formulated in this subsection. Though there may exist multiple NEs in general, our setup admits a unique NE under some specific conditions; we specify those precisely in Section IV-C. Note that the NEs of the games with (7) and with (9) are not necessarily the same, and thus Proposition 1 does not say that the BR Algorithm in Section IV-A always performs better than the algorithms proposed in the following subsection.

Given $x_{-i}$ and $\mu$, (9) is a convex optimization problem and its optimal solution is given by Proposition 2.

Proposition 2. The solution of (9) has the following form

$$x_i^* = \left[ \frac{a_i - s_i}{P^D_i h_{ii}} \right]_0^1, \tag{10}$$

where $a_i = \frac{w_i P^D_i h_{ii}}{\mu P^D_i g_{ii}} - \sum_{j \in \mathcal{C}} P^C_j h_{ji} - W_i$, $s_i = \sum_{j \neq i} x_j P^D_j h_{ji}$, and $[x]_0^1 = \min\{1, \max\{x, 0\}\}$.

Proof: According to the Karush-Kuhn-Tucker (KKT) conditions [35], we have $\frac{\partial U_i}{\partial x_i} = 0$ if $x_i \in (0, 1)$, $\frac{\partial U_i}{\partial x_i} \le 0$ if $x_i = 0$, and $\frac{\partial U_i}{\partial x_i} \ge 0$ otherwise, where

$$\frac{\partial U_i}{\partial x_i} = \frac{w_i P^D_i h_{ii}}{x_i P^D_i h_{ii} + \sum_{j \in \mathcal{D}, j \neq i} x_j P^D_j h_{ji} + \sum_{j \in \mathcal{C}} P^C_j h_{ji} + W_i} - \mu P^D_i g_{ii}.$$

The above equalities and inequalities result in (10).

Eq. (10) is similar to the waterfilling function in power allocation problems, except that our constraint $x_i \in [0, 1]$ is independent over different RBs and thus we obtain the closed-form solution (10). Leveraging existing works on waterfilling problems, we propose an iterative algorithm similar to the iterative waterfilling algorithm (see, e.g., [32, 36, 37]).

C. Algorithm Design for the Lower Bound Problem

Similar to Section IV-A, we propose a synchronous iterative algorithm based on the BR function, defined as $f_L(x_1, \cdots, x_{N_D}; \mu) = \left(x_1^*(x_{-1}), \cdots, x_{N_D}^*(x_{-N_D})\right)$, where $x_i^*(x_{-i})$ is given by (10). The algorithm, called the LB Algorithm, is given by Algorithm 1. Similar to the BR Algorithm, we have that if the LB Algorithm converges, then it converges to a NE.

Implementation interpretations.

Adopting the LB Algorithm, each D2D link first acquires the CSI of the link from its transmitter to the BS. This can be either estimated based on the downlink signal (e.g., in a time-division duplexing (TDD) uplink/downlink configuration), or provided by the BS, which measures the uplink channel and sends the information to the D2D user (e.g., in a frequency-division duplexing (FDD) uplink/downlink configuration). Apart from the uplink CSI, each D2D link also measures the channel between its transmitter and its paired receiver. How often the CSI is updated depends on how fast the channel varies. For example, in a slow mobility scenario, D2D links may update the information just once (at the beginning of each resource allocation period). At each iteration of the LB Algorithm, every D2D link measures the interference it would experience if accessing a RB. There is no additional message exchange in this step. Thus, the LB Algorithm only requires local information and reduces the overhead, as shown in Fig. 1.

Algorithm 1: LB Algorithm, an iterative algorithm for the lower bound problem of D2D
  Initialization: given price $\mu \ge 0$, let $x_i(0) = 1$, $\forall i$, and $t = 0$;
  while $\|x(t) - x(t-1)\| \ge \epsilon$ do
    let $x(t+1) = f_L(x(t); \mu)$;
    let $t = t + 1$;
  end while

Fig. 1. Illustration of the proposed algorithm. The arrows filled with dark color indicate the procedures requiring message exchange, while the arrows filled with light color indicate the procedures involving only local measurements. The lower part describes the LB Algorithm for the lower problem, while the upper part illustrates algorithms proposed for the upper problem.
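The LB Algorithm, built on the closed form (10), can be sketched in a few lines of Python. The sketch rests on several assumptions that are not from the paper: the function names, the sup-norm stopping rule, the `max_iter` safeguard, a strictly positive price (the closed form divides by $\mu$), the pre-summed cellular interference `cell_interf[i]`, and the example values.

```python
def lb_best_response(i, x, mu, w, p_d, g, h, cell_interf, noise):
    """Closed-form best response (10) of link i; h[j][i] is the gain from Tx j to Rx i."""
    # s_i: aggregate D2D interference that link i can measure locally at its receiver.
    s_i = sum(x[j] * p_d[j] * h[j][i] for j in range(len(x)) if j != i)
    a_i = w[i] * p_d[i] * h[i][i] / (mu * p_d[i] * g[i]) - cell_interf[i] - noise[i]
    return min(1.0, max(0.0, (a_i - s_i) / (p_d[i] * h[i][i])))


def lb_algorithm(mu, w, p_d, g, h, cell_interf, noise, eps=1e-9, max_iter=1000):
    """Synchronous iteration x(t+1) = f_L(x(t); mu), started from x_i(0) = 1."""
    n = len(p_d)
    x = [1.0] * n
    for t in range(max_iter):
        x_new = [
            lb_best_response(i, x, mu, w, p_d, g, h, cell_interf, noise)
            for i in range(n)
        ]
        # Stop when the sup-norm change falls below eps (assumed choice of norm).
        if max(abs(u - v) for u, v in zip(x_new, x)) < eps:
            return x_new, t + 1
        x = x_new
    raise RuntimeError("LB iteration did not converge within max_iter")


# Hypothetical two-link example with weak cross gains.
w = [1.0, 1.0]
p_d = [1.0, 1.0]
g = [1.0, 1.0]
h = [[1.0, 0.1], [0.2, 1.0]]
cell_interf = [0.0, 0.0]
noise = [0.1, 0.1]
x_eq, n_iter = lb_algorithm(2.0, w, p_d, g, h, cell_interf, noise)
```

In a deployment along the lines described above, `s_i` would be a local interference measurement rather than a sum over known gains; the sum is used here only so that the sketch runs on its own.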

Convergence analysis.

To get the convergence criteria of the LB Algorithm, we first investigate some basic properties of function $f_L$. We assume that there is a finite number of D2D links. We call the set of D2D links with $x_i = 1$ saturated D2D links, denoted by $\mathcal{S} := \{ i \in \mathcal{D} : x_i = 1 \}$, and the set of D2D links with $x_i \in (0, 1)$ active D2D links, denoted by $\mathcal{A} := \{ i \in \mathcal{D} : x_i \in (0, 1) \}$. Denoting $s = [s_1, \cdots, s_{N_D}]^T$, we have $s = Gx$, where $G$ is an $N_D \times N_D$ matrix with zero diagonal elements and $(i, j)$th element (with $i \neq j$) being $P_j^D h_{ji}$.

Proposition 3.

The best-response function $f_L$ has the following properties:
1) $f_L$ is a continuous mapping from $\mathcal{X}$ to $\mathcal{X}$.
2) $f_L$ is piecewise affine, which means that $f_L$ has the following two properties:
   a) The domain of function $f_L$ can be partitioned into finitely many polyhedral regions, denoted by $P_1, \cdots, P_d$, which are determined by the choice of $\mathcal{A}$ and $\mathcal{S}$;
   b) On the polyhedron $P_n$ defined by $\mathcal{A}^{(n)}$ and $\mathcal{S}^{(n)}$, we have $f_L(x) = M^{(n)} x + b^{(n)}$, where $b^{(n)}$ is a constant vector, and $M^{(n)} = B^{(n)} G$ with $B^{(n)}$ being a diagonal matrix, which has $\left[ B_i^{(n)} \right]_{kl} = -\frac{1}{P_i^D h_{ii}}$ if $k = l$, $i \in \mathcal{A}^{(n)}$, and $\left[ B_i^{(n)} \right]_{kl} = 0$ otherwise.
Proof: See Appendix A.

We assume that the resource allocation is carried out well within the channel coherence time, and thus the channel can be regarded as static during resource allocation updates. We leave the stochastic channel analysis as future work. Defining the matrix norm of a matrix $M$ induced by the vector norm $\| \cdot \|$ as $\| M \| := \max \{ \| Mx \| : \| x \| = 1 \}$ [32], a sufficient condition for the convergence of the proposed algorithm with general matrix norms can be found as follows, leveraging the techniques used in Theorem 7 in [32].

Theorem 1. If $\| M_n \| < 1$, we have
1) the synchronous iterative algorithm converges for any initial resource allocation;
2) there is a unique fixed point $x^*$;
3) $\| x(t) - x^* \| \leq \eta^t \| x(0) - x^* \|$, where $\eta = \max_n \| M_n \|$. The upper bound of the convergence rate is $\eta$.
Proof: If $f_L$ is a contraction mapping, then global stability follows from the Banach Fixed Point Theorem (see, e.g., [38]). The proof of contraction mapping is similar to the proof of Theorem 7 in [32], and thus we omit the details. Given that $f_L$ is a contraction mapping, we have $\| f_L(x') - f_L(x) \| \leq \eta \| x - x' \|$.
The rate of convergence for a sequence $\{ x_n \}$ converging to $L$ is defined as $\lim_{n \to \infty} \frac{| x_{n+1} - L |}{| x_n - L |}$. Observing the above inequality, we conclude that the convergence rate of the BR Algorithm is upper bounded by $\eta$.

The number of polyhedral regions that partition the domain of $f_L$ is $O(3^{N_D})$, which is very large, and thus it is impractical to check the conditions in Theorem 1 directly for all regions. We further provide sufficient conditions in Proposition 4 that are easy to apply.

Proposition 4.

If the matrix $G$ satisfies $\| G \| \leq \min_{i,k} P_i^D h_{ii}^{(k)}$, then the algorithm converges to the unique fixed point regardless of the initial point.
Proof: According to Prop. 3, we have $M_n = B^{(n)} G$. To make $f_L$ a contraction mapping, we have to satisfy $\| B^{(n)} G \| \leq 1$. By the property of matrix norm that $\| AB \| \leq \| A \| \cdot \| B \|$, we obtain a sufficient condition that is $\| B^{(n)} \| \cdot \| G \| \leq 1$. Matrix $B^{(n)}$ is a diagonal matrix, whose norm is $\| B^{(n)} \| \leq \max_i \frac{1}{P_i^D h_{ii}}, \forall n$. Then we can get one sufficient condition as $\| G \| \leq \left( \max_{i,k} \frac{1}{P_i^D h_{ii}^{(k)}} \right)^{-1} = \min_{i,k} P_i^D h_{ii}^{(k)}$.

Fig. 2. The access probabilities of D2D links vs. $\mu$: (a) $\mu = 80$ dB; (b) $\mu = 90$ dB. The areas in the dark shade show the locations of silent D2D links with $x_i = 0$. The light shaded areas show the locations of active D2D links (i.e., $x_i \in (0, 1)$). The remaining parts show the locations of saturated D2D links (i.e., $x_i = 1$). (Legend: cellular user, BS, D2D Tx, D2D Rx, inactive D2D link, active D2D link, saturated D2D link.)

Design interpretations.

The above result is true for any general $l_p$ norm with $p \geq 1$. As in [32], we apply it to some special matrix norms and give the corresponding interpretations as follows.

Example 1 ($l_1$ norm). We have $\| G \|_1 = \max \{ \| Gx \|_1 : \| x \|_1 = 1 \} = \max \{ \sum_{i=1}^{N_D} | s_i | \}$. This implies that a sufficient condition for the convergence of the LB Algorithm is that no D2D transmitter causes very strong interference to other D2D links.

Example 2 ($l_\infty$ norm). We have $\| G \|_\infty = \max \{ \| Gx \|_\infty : \| x \|_\infty = 1 \} = \max \{ \max_i | s_i | \}$. This implies that a sufficient condition for the convergence is that no D2D receiver suffers excessive interference.

We show examples of D2D access probabilities obtained by the BR Algorithm versus different $\mu$ in Fig. 2. We can observe that the BR Algorithm discourages D2D links whose transmitters are near the BS from accessing the RB. In addition, the BR Algorithm takes into account the SINR of D2D links, and encourages a D2D link far from the BS to keep silent if there are many D2D links nearby, to decrease the interference in D2D networks. Comparing the two subfigures, we conclude that fewer D2D links would be active as $\mu$ increases. This suggests the potential effectiveness of $\mu$, which is investigated in the following section.

V. UPPER PROBLEM: NETWORK'S PRICING MECHANISM

It is difficult to analyze the upper problem by plugging (8) directly into (6), due to the complex term in the denominator. Simulation results in Section VI show that the performance of the LB Algorithm for problem (9) is very close to the performance of the BR Algorithm for the original problem (7). This suggests approximating the solution of (7) by the solution of the lower bound problem (10). Using this approximation, we propose an algorithm leveraging techniques from LCP [39]. We further propose a bisection algorithm, which has low overhead and can be applied to the original two-stage problem (with the lower problem (7)). In addition, motivated by the results of the lower problem (see, e.g., Fig. 2), we propose a simple greedy heuristic algorithm, which can be applied to networks with a high interference tolerance level.
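The greedy heuristic mentioned above is specified later in this section as the IO Algorithm: D2D links are sorted by the interference they cause at the BS, $P_i^D g_{ii}$, and are admitted in ascending order (with $x_i = 1$) as long as the accumulated interference stays within the tolerance level $Q$; all remaining links stay silent. A minimal Python sketch of that rule follows. The function name `io_algorithm`, the list-based interface and the example numbers are illustrative assumptions rather than anything taken from the paper.

```python
def io_algorithm(bs_interference, Q):
    """Greedy interference-ordering rule (sketch).

    bs_interference : list where entry i is P_i^D * g_ii, the interference
                      link i causes at the BS when it transmits
    Q               : interference tolerance level at the BS
    Returns a list x with x[i] = 1 for admitted links and 0 for silent links.
    """
    # Visit links from the weakest to the strongest interferer.
    order = sorted(range(len(bs_interference)), key=lambda i: bs_interference[i])
    x = [0] * len(bs_interference)
    total = 0.0
    for i in order:
        if total + bs_interference[i] > Q:
            break  # admitting this (and any stronger) link would exceed Q
        total += bs_interference[i]
        x[i] = 1
    return x


# Hypothetical example: four links, tolerance Q = 6.5.
example_access = io_algorithm([3.0, 1.0, 2.0, 5.0], 6.5)
```

The rule needs no iteration and no price signal, which is why it is cheap; the trade-off is that it ignores the interference experienced at the D2D receivers.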

A. An Equivalent Upper Problem

According to Lemma 1 in [22], the upper problem is equivalent to the following problem:

$$\max_{\mu \geq 0} \ \min \left\{ \mu \sum_{i \in \mathcal{D}} x_i P_i^D g_{ii}, \ \mu Q \right\} \quad \text{s.t. } x_i = x_i^*, \qquad (11)$$

where $x_i^*$ is given by (8). For simplicity, we use the notation $0 \leq a \perp b \geq 0$ to represent the complementarity condition of $a$ and $b$, i.e., $ab = 0$ and $a, b \geq 0$ [39]. Letting $I_i^C = \sum_{j \in \mathcal{C}} P_j^C h_{ji} + W_i$, and $\lambda_i$ be the Lagrange multiplier to relax the constraint $x_i \leq 1$, we have the following lemma.

Lemma 2.

Denoting $t_i = \frac{w_i P_i^D h_{ii}}{\mu P_i^D g_{ii} \left( \mu P_i^D g_{ii} + \lambda_i \right)}$, (10) is equivalent to the following parametric LCP with variables $(x_i, t_i)$ and parameter $\mu$ [22, 39]:

$$0 \leq x_i \perp \left[ t_i - \frac{w_i h_{ii}}{\mu g_{ii}} + \sum_{j \in \mathcal{D}} P_j^D h_{ji} x_j + I_i^C \right] \geq 0, \qquad 0 \leq t_i \perp (1 - x_i) \geq 0. \qquad (12)$$

Proof:

Details can be found in [22]. Key steps are to multiply (10) by $\frac{\sum_{j \in \mathcal{D}} P_j^D h_{ji} x_j + I_i^C}{\mu P_i^D g_{ii} + \lambda_i}$ and change the variable to $t_i$.

In the following, we explore properties of the objective function in (11), which provide clues for designing efficient algorithms for the upper problem.

B. Algorithm Design for the Upper Problem

In this paper, we use the symmetric parametric principal pivoting algorithm (SPPP), a classical algorithm for parametric LCP [39], to find the optimal $\mu$ and its corresponding feasible solutions $x_i$ in (12). We write (12) in matrix form as $0 \leq y \perp Ay + q + \nu d \geq 0$, where $\nu = \mu^{-1}$, $y = [x_1, \ldots, x_{N_D}, t_1, \ldots, t_{N_D}]^T$, $q = [I_1^C, \ldots, I_{N_D}^C, 1, \ldots, 1]^T$, $d = \left[ -\frac{w_1 h_{11}}{g_{11}}, \ldots, -\frac{w_{N_D} h_{N_D N_D}}{g_{N_D N_D}}, 0, \ldots, 0 \right]^T$, $A_1$ is a matrix with $(i, j)$th element $P_j^D h_{ji}$, and $A = \begin{bmatrix} A_1 & I \\ -I & 0 \end{bmatrix}$. Note that $\mu \geq 0$ implies that $\nu \geq 0$. We set an upper limit for $\nu$, denoted by $\bar{\nu} < \infty$, which is a sufficiently large real number. The SPPP is given by Algorithm 2.

Algorithm 2

SPPP-based algorithm
  Initialization: $\tau = 0$, $\mu^* = 0$, $U^* = 0$, $q(\tau) = q$, $d(\tau) = d$, $A(\tau) = A$, $\nu(\tau) = 0$, $y(\tau) = 0$, and $z(\tau) = q + \nu(\tau) d + A y(\tau)$;
  ▷ comment: critical value
  Determine the next critical value of $\mu$: $\nu(\tau + 1) = \min \left\{ \min_i \left\{ -\frac{q_i(\tau)}{d_i(\tau)} : d_i(\tau) < 0 \right\}, \bar{\nu} \right\}$;
  Set $(z(\tau + 1), y(\tau + 1)) = (q(\tau) + \nu d(\tau), 0)$ for all $\nu \in [\nu(\tau), \nu(\tau + 1)]$;
  if $\nu(\tau + 1) = \bar{\nu}$ then
    stop;
  else
    let $r = \arg\min_i \left\{ -\frac{q_i(\tau)}{d_i(\tau)} : d_i(\tau) < 0 \right\}$;
  end if
  The new critical value of $\nu$ is $\nu(\tau + 1) = -q_r(\tau) / d_r(\tau)$, and thus $\mu(\tau + 1) = (\nu(\tau + 1))^{-1}$;
  ▷ comment: pivoting
  if $A_{rr}(\tau) > 0$ then
    pivot $\langle z_r(\tau), y_r(\tau) \rangle$;
    let $z_r(\tau + 1) = y_r(\tau)$, $y_r(\tau + 1) = z_r(\tau)$;
    let $z_i(\tau + 1) = z_i(\tau)$, $y_i(\tau + 1) = y_i(\tau)$, for $i \neq r$;
    let $\tau = \tau + 1$, and return to Step 2;
  else if $A_{rr}(\tau) = 0$ then
    use $y_r(\tau)$ as a driving variable and determine the basic blocking variable $z_s(\tau)$;
    pivot $\langle z_s(\tau), y_r(\tau) \rangle$, $\langle z_r(\tau), y_s(\tau) \rangle$;
    let $z_s(\tau + 1) = y_r(\tau)$, $y_s(\tau + 1) = z_r(\tau)$, $z_r(\tau + 1) = y_s(\tau)$, $y_r(\tau + 1) = z_s(\tau)$;
    let $z_i(\tau + 1) = z_i(\tau)$, $y_i(\tau + 1) = y_i(\tau)$, for $i \neq r, s$;
    let $\tau = \tau + 1$, and return to Step 2;
  end if
  get $x_i(\tau + 1)$ from $y_i(\tau + 1)$;
  let $U$ be (11) at $\mu(\tau + 1)$ and $x_i(\tau + 1)$;
  if $U > U^*$ then
    let $U^* = U$, $\mu^* = \mu(\tau + 1)$;
  end if

Since the SPPP Algorithm requires CSI between each transmitter and receiver (i.e., matrix $A$), which may cause high overhead, the result of SPPP can be used as a performance benchmark, and another low-overhead algorithm is desirable. To propose such algorithms, we first explore the properties of the objective function in (11), which is denoted by $U_c := \min \{ U_{c1}, U_{c2} \}$, where $U_{c1} = \mu \sum_{i \in \mathcal{D}} x_i P_i^D g_{ii}$ and $U_{c2} = \mu Q$. Function $U_{c2}$ is a linear increasing function of $\mu$, while $U_{c1}$ is more complicated since it involves solving (10). The properties of function $U_{c1}$ are given by Proposition 5, leveraging the techniques in [22].

Proposition 5.

The function $U_{c1}(\mu)$ has the following properties:
1) $U_{c1}$ is a continuous function of $\mu$;
2) $U_{c1}$ is piecewise affine;
3) If $\sum_{j \in \mathcal{D}, j \neq i} \frac{h_{ij}}{h_{jj}} g_{jj} < g_{ii}, \forall i \in \mathcal{D}$, and $\sum_{j \in \mathcal{D}_s} P_j^D \left( h_{ji} - \frac{h_{1i}}{g_{11}} g_{jj} \right) \geq 0, \forall i \in \mathcal{D}_a$, then $U_{c1}$ is a non-increasing function.
Proof: See Appendix B.

The sufficient conditions given in Proposition 5 to make $U_{c1}$ non-increasing essentially say that the interference among D2D links and the interference from saturated D2D links (i.e., $x_i = 1$) to the BS should be weak. Given by Proposition 5 that $U_{c1}$ is piecewise affine, and $U_{c2}$ is linear, the optimal $\mu^*$ must either be at a break point of $U_{c1}$ (a point where the derivative of $U_{c1}$ is discontinuous) or at the intersection of $U_{c1}$ and $U_{c2}$ [22]. When $U_{c1}$ is non-increasing, the optimal $\mu^*$ must be at the intersection of $U_{c1}$ and $U_{c2}$, because $U_{c2}$ is increasing in $\mu$ and the minimum of the two is therefore largest where they cross. This motivates the heuristic algorithm, given by Algorithm 3, which converges to an intersection point of $U_{c1}$ and $U_{c2}$ [22].

Algorithm 3

A bisection algorithm for finding optimal price $\mu^*$ at monotonic case
  Initialization: given accuracy $\epsilon \geq 0$, let $\mu_u = \mu_{\max}$, and $\mu_l \geq 0$;
  while $| \mu_u - \mu_l | \geq \epsilon$ do
    let $\mu_m = \frac{\mu_u + \mu_l}{2}$;
    get $x_i(\mu_m)$ by running the LB Algorithm;
    if $U_{c1}(\mu_m, x_i(\mu_m)) \leq U_{c2}(\mu_m, x_i(\mu_m))$ then
      $\mu_u = \mu_m$;
    else
      $\mu_l = \mu_m$;
    end if
  end while
  let $\mu^* = \mu_m$.

In the bisection algorithm, we set an upper limit for $\mu$, denoted by $\mu_{\max} < \infty$, which is a sufficiently large real number. We consider non-trivial cases, where the interference from D2D to BSs, when all D2D links are active, is greater than the interference tolerance level; otherwise, we just let all D2D links access a RB with probability one. Under this assumption, we have the following result.

Proposition 6.

The bisection algorithm always converges. In particular, the algorithm requires at most $\log_2(\mu_{\max} / \epsilon)$ iterations to converge.
Proof: See Appendix C.

Note that the bisection algorithm also converges when we use the BR Algorithm instead of the LB Algorithm to solve the lower problem, because the function $f$ is continuous. Under the conditions given by Proposition 5, i.e., the interference among D2D links and the interference from saturated D2D links to BSs are weak, the bisection algorithm achieves the optimal $\mu^*$. In other words, the optimal strategy in this case is to let the number of active D2D links be as large as possible, until the total interference from D2D links reaches the tolerance level.

Implementation interpretations.

Adopting the bisection algorithm, the BS first broadcasts a price, and then measures the aggregate interference at this price. If the interference is greater than the tolerance level, the BS increases the price; otherwise, the BS decreases the price. In fact, the behavior is consistent with the law of supply and demand: if the demand (the interference) exceeds the supply (the interference tolerance level), the price increases to make the RB less attractive. The algorithm can also be implemented adaptively. The network locally measures the total D2D interference, and increases (decreases) the price if the interference level is above (below) the predefined tolerance level $Q$, until the interference level reaches $Q$. Fig. 1 illustrates the structure of Algorithm 3, which shows that the signaling overhead is caused by the price broadcast and the channel measurements. The overhead due to price broadcast is proportional to the number of RBs, which is quite small. As for the channel measurements, the BS requires the channel information of cellular links, and each D2D link needs the CSI of the link from its transmitter to the paired receiver and of the link between the transmitter and the BS. Thus, the algorithm only requires local information and the overhead due to the channel measurements is proportional to the total number of cellular and D2D links, which is much lower than the overhead of centralized algorithms (e.g., the brute force approach or the SPPP Algorithm) that require global CSI. Note that the channel information updating frequency depends on the channel variance over time, which is quite low in a slow mobility environment (e.g., we may only measure channels once for each or several resource allocation time scales).
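The price update described above (and in Algorithm 3) can be sketched in Python as follows. The sketch treats the lower problem as a black box: `aggregate_interference` is a hypothetical callback that, for a broadcast price $\mu$, returns the measured D2D interference $\sum_i x_i(\mu) P_i^D g_{ii}$ at the BS after the D2D links have settled. For $\mu > 0$, comparing this value with $Q$ is equivalent to comparing $\mu \sum_i x_i P_i^D g_{ii}$ with $\mu Q$ in Algorithm 3. The function names, the starting lower price of 0 and the toy interference curve are assumptions made for illustration only.

```python
import math


def bisection_price(aggregate_interference, Q, mu_max, eps, mu_low=0.0):
    """Bisection search for a price at which D2D interference meets Q (sketch).

    aggregate_interference : callable mu -> measured D2D interference at the BS
                             (hypothetical stand-in for running the lower problem)
    Q                      : interference tolerance level
    mu_max, eps            : upper price limit and accuracy
    Returns (price, number_of_iterations).
    """
    mu_l, mu_u = mu_low, mu_max
    mu_m = 0.5 * (mu_l + mu_u)
    iterations = 0
    while abs(mu_u - mu_l) >= eps:
        mu_m = 0.5 * (mu_u + mu_l)
        if aggregate_interference(mu_m) <= Q:
            mu_u = mu_m  # interference within tolerance: the price can come down
        else:
            mu_l = mu_m  # interference too high: the price must go up
        iterations += 1
    return mu_m, iterations


# Toy stand-in for the lower problem: interference falls as the price rises.
def toy_interference(mu):
    return 10.0 / (1.0 + mu)


price, n_iter = bisection_price(toy_interference, Q=2.0, mu_max=100.0, eps=1e-6)
bound = math.ceil(math.log2(100.0 / 1e-6))  # iteration bound from Proposition 6, rounded up
```

With this toy curve the interference equals $Q = 2$ at $\mu = 4$, and the number of iterations stays within the $\log_2(\mu_{\max} / \epsilon)$ bound of Proposition 6 (rounded up to an integer). In a real deployment each call to the callback corresponds to one price broadcast followed by one interference measurement.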
Therefore, the required overhead is not significant compared to the potential advantages of our algorithm.

As for the complexity, letting $T$ be the number of required iterations for the LB Algorithm, the computational complexity of Algorithm 3 is $O(N_D \log_2(\mu_{\max} / \epsilon) + N_D T)$. The parameters $T$ and $\log_2(\mu_{\max} / \epsilon)$ are generally much smaller than $N_x^{N_D}$ as illustrated in Section VI, where $T$ and $\log_2(\mu_{\max} / \epsilon)$ are between 5 and 10, while $N_x^{N_D}$ is . Thus, the complexity of Algorithm 3 is much lower than the complexity of the centralized scheme, which is $O(N_x^{N_D} N_D)$.

Observing Fig. 2, D2D links mostly have larger access probabilities when they are far from the BS. This motivates another greedy heuristic algorithm, called the IO Algorithm (short for interference ordering), which needs no iteration. The D2D links are sorted by the interference caused to the BS in ascending order, i.e., $P_1^D g_{11} \leq P_2^D g_{22} \leq \cdots \leq P_{N_D}^D g_{N_D N_D}$. The BS lets $x_1 = 1, \ldots, x_n = 1$ and other D2D links be silent, where $n$ satisfies $\sum_{i=1}^{n} P_i^D g_{ii} \leq Q$ and $\sum_{i=1}^{n+1} P_i^D g_{ii} > Q$. Adopting the IO Algorithm, the BS measures the uplink CSI from D2D transmitters, based on which the BS determines the access probabilities. Therefore, this algorithm has lower overhead than the bisection algorithm, and gets the solution more quickly, at the cost of overall performance, which is shown in the following section.

VI. PERFORMANCE EVALUATION

We consider an uplink system with a hexagonal BS model. The main simulation parameters are listed as follows, unless otherwise specified. The BS density is 1 per $\pi (500\,\mathrm{m})^2$. The cellular UEs and D2D links are deployed according to two independent Poisson point processes with the same density links per macrocell. We let the average length of D2D links be m. We assume the total bandwidth is MHz with MHz per sub-band. The transmitters adopt fractional power control, i.e., $P = \min \{ P_{\max}, d^{\kappa \alpha} \}$, where $P_{\max}$ is the maximum transmit power, $d$ is the distance of the link, $\kappa$ is the compensation factor for path loss, and $\alpha$ is the path loss exponent. We let cellular UEs and D2D links have the same power control factor $\kappa$ = 0. . The maximum transmit powers of cellular UEs and D2D links over one sub-band are mW and mW, respectively, due to the fact that cellular UEs only access one sub-band while D2D links can access multiple sub-bands. Note that D2D links may not access all sub-bands, and thus we set a conservative maximum transmit power for D2D links. The noise power spectrum density is − dBm/Hz. Path loss exponents of UE-UE and UE-BS links are . and . , respectively. We compare the performance of our proposed algorithms to the scenario where all D2D links are active, as well as the scheme where D2D links become silent when their transmitters are within a circle around their nearest BSs, called a guard zone scheme.

A. The Lower Problem: D2D Non-cooperative Game

In this section, we consider a single cell scenario, where the interference tolerance level is dB above the cellular signal. We investigate the performance of the BR Algorithm and the LB Algorithm. Note that the BR Algorithm provides the NE result, which may not be optimal. Due to the complexity of solving (3) via brute force search ($O(N_x^{N_D} N_D)$), we compare the gap between the NE and optimal results in a small network with three D2D links. The average total D2D rates obtained by brute force search and by the BR Algorithm are 3.362 bps/Hz and 3.355 bps/Hz, respectively. Thus, we observe that the NE solution is near-optimal in small networks, mainly because the D2D links are active with probability close to one in most cases in the small network. On the other hand, the D2D links have fractional active probabilities in most cases in the large network, and thus the observation may be quite different in large networks. We leave the analysis of the gap between the NE and optimal solution of (7) in more general networks to future work. To compare the BR Algorithm and LB Algorithm, we consider a case with ten D2D links. Fig. 3 shows that the rate distributions using the BR Algorithm and the LB Algorithm are almost the same. This implies that we can use the solution of the LB Algorithm to approximate the solution of the BR Algorithm. By comparing to Figs. 6 and 7, we observe that the performance of different algorithms in single-cell networks is similar to that in multi-cell networks. Therefore, more discussion is left to the following subsection.

B. The Upper Problem: Network Pricing Mechanism

In this section, we consider an asynchronous multi-cell network, where each cell allocates resources independently. We let the interference tolerance level be the same as the received signal of cellular link (i.e., the interference level normalized by the cellular signal is 0 dB). From the simulation results, the number of iterations required for the convergence of the LB Algorithm is about 4-8, which is quite small. Fig. 4 shows the convergence of the SPPP and bisection algorithms. Both algorithms converge quickly. While SPPP provides a larger cellular rate, it converges more slowly than the bisection algorithm. The quick convergence of the LB Algorithm and the bisection algorithm implies that the complexity of the proposed scheme, $O(N_D \log_2(\mu_{\max} / \epsilon) + N_D K)$, is much lower than the complexity of the centralized scheme, $O(N_x^{N_D} N_D)$, where $K \in [4, 8]$ and $\log_2(\mu_{\max} / \epsilon) \in [5, 10]$, while $N_x \geq$ generally and thus $N_x^{N_D} \geq$ in our setup.

Fig. 3. The rate distributions of D2D and cellular links using different algorithms in a single-cell network. (Rate CDF; curves for D2D and cellular links under the BR Algorithm, the LB Algorithm, the guard zone scheme with D = 150 m, and all D2D links active.)

Fig. 4. The convergence of different algorithms. (Total rate in bps/Hz; curves for cellular and D2D links under Algo. 2 (SPPP) and Algo. 3 (bisection).)

In Fig. 5, we compare the rates of D2D and cellular links using different algorithms. The SPPP and bisection algorithms provide larger D2D and/or cellular rates than the guard zone schemes. If there is no interference management, the rate of cellular UE is very small (see "All D2D active" in the figure). Adopting the proposed algorithms, the cellular links can get much better performance (total cellular rate increasing from . to about . bps/Hz), at the cost of less total throughput (about loss in our setup). We observe that SPPP provides a slightly larger rate for cellular links than the bisection algorithm. This implies that in some cases, the function $U_{c1}$ is non-monotonic and the optimal $\mu$ is not at the intersection of $U_{c1}$ and $U_{c2}$. However, in general, the gap between the bisection algorithm and the SPPP algorithm is small regardless of the monotonicity of $U_{c1}$. The average total rate in conventional networks, where potential D2D links operate only in cellular mode, is . bps/Hz in our setup. Defining the rate gain by the increased total rate divided by the average rate in conventional networks, we conclude that allowing D2D links and using proposed algorithms achieves a very large rate gain compared to conventional networks (about x in our simulation setup), and meanwhile keeps the performance of cellular UEs at an acceptable level (with average rate per cellular link being . bps/Hz). Note that the rate gain depends on various system parameters, such as average D2D link length and the amount of D2D traffic.

Fig. 5. The total rates of cellular and D2D links using different approaches. (Total rate in bps/Hz for cellular links and D2D links under Algo. 2 (SPPP), Algo. 3 (bisection), all D2D links active, guard zone with D = 150, and guard zone with D = 200.)

The rate distributions of cellular and D2D links are shown in Figs. 6 and 7, respectively. Fig. 6 shows that proposed algorithms can effectively protect the cellular performance. Comparing to Fig. 3, where the average cellular rate is about . bps/Hz with the normalized interference tolerance level being dB, we can conclude that a lower normalized interference tolerance level ( dB) is needed in the multi-cell scenario. Also, the rate of cellular links has a larger range than the single cell scenario (i.e., the variance is larger). One possible reason is that there may be some nearby interfering D2D links and cellular UEs in neighboring cells. Though the D2D links have large rates without interference management, they hurt cellular links a lot. Adopting the guard zone scheme, cellular links can be protected, at the cost of the degradation of D2D throughput. Moreover, it is difficult to develop a tractable framework to study the guard zone scheme, and thus difficult to find the optimal distance threshold analytically.
Therefore, the SPPP and bisection schemes are preferable.

Note that the interference tolerance level can be tuned to maximize utility functions in terms of both the cellular and D2D links (e.g., the total rate in the hybrid network). We show the rates of cellular and D2D links versus the interference tolerance level numerically in Figs. 8 and 9, respectively. The analysis of optimal $Q$ with respect to different utility functions is left to future work. Fig. 8 shows that as the interference tolerance level increases, the rate of cellular users decreases, because more D2D links are allowed to transmit. On the other hand, as $Q$ increases, D2D links can access the RBs more aggressively and the total rate of D2D links increases. The IO Algorithm protects the performance of cellular links well. However, in a network with strict interference constraints, the total rate of D2D links using the IO Algorithm is less than with the SPPP and bisection algorithms, which implies the importance of considering power control for D2D resource allocation, as well as the interference experienced at D2D receivers. From Fig. 9, we can conclude that although the IO Algorithm is very simple, it can only be applied to the cases with high interference tolerance (e.g., cases with normalized interference tolerance level larger than dB).

Fig. 6. The rate distribution of cellular links using different approaches. (Rate CDF; curves for Algo. 2 (SPPP), Algo. 3 (bisection), all D2D links active, guard zone with D = 150 m, and guard zone with D = 200 m.)

Fig. 7. The rate distribution of D2D links using different approaches. (Rate CDF; same set of curves as Fig. 6.)

Figs. 10 and 11 show the rates of cellular links and D2D links versus the D2D density, respectively. We ignore the guard zone scheme with radius 150 m in these figures, since its performance is similar to the guard zone scheme with radius 200 m. As shown in Fig. 11, the total rate of D2D links increases as D2D density increases, while the rate of cellular link decreases in Fig. 10. The decrease of cellular rate using the SPPP and bisection algorithms vanishes much more quickly than the scenario with all D2D links being active, which suggests the efficiency of the SPPP and bisection algorithms for protecting cellular transmissions. The figures also show that besides the interference tolerance level, the throughput gain of the proposed algorithms also highly depends on the density of D2D links.

Fig. 8. The rate of cellular links vs. the interference tolerance level. The normalized interference tolerance level means that the interference tolerance level $Q$ is divided by the signal of the cellular link accessing the considered RB. (Horizontal axis: interference tolerance level normalized by cellular signal, in dB; vertical axis: total cellular rate in bps/Hz; curves for Algo. 2 (SPPP), Algo. 3 (bisection), all D2D links active, IO Algorithm, guard zone with D = 150 m, and guard zone with D = 200 m.)

Fig. 9. The total rate of D2D links vs. the interference tolerance level. The normalized interference tolerance level means that the interference tolerance level $Q$ is divided by the signal of the cellular link accessing the considered RB. (Horizontal axis: interference tolerance level normalized by cellular signal, in dB; vertical axis: total D2D rate in bps/Hz; same set of curves as Fig. 8.)

VII. CONCLUSION

This paper presents a decentralized spectrum management for a shared network consisting of D2D and cellular links, aiming to maximize the total throughput of D2D links with an interference constraint for protecting cellular transmissions. We propose a low-complexity low-overhead distributed algorithm to update D2D access probabilities, and use the SPPP algorithm to get the optimal price for controlling the interference from co-channel D2D links. Though the SPPP Algorithm requires global CSI, it provides a benchmark for other algorithms. We further propose a low-overhead efficient heuristic algorithm based on the bisection method, which is shown to be convergent. Numerical results show that the heuristic algorithm has about the same performance as the SPPP algorithm, especially in the cases with low interference tolerance level. Another simple greedy algorithm is proposed and shown to perform well in scenarios with high interference tolerance level. The proposed algorithms provide a large throughput gain with a performance guarantee of cellular links, compared to a conventional network with links operating only in cellular mode. Compared to the cases without interference management (i.e., all D2D links are active), the average rate of cellular links improves significantly (e.g., average rate per cellular link increases from . to . bps/Hz in our setup). This implies that the proposed algorithms can efficiently manage the interference from D2D links to the cellular network. Future work could include investigation of more general utility functions incorporating both throughput and fairness, joint optimization of D2D mode selection, and consideration of a more flexible multiple-cell system.

Fig. 10. The total rate of cellular links vs. different D2D densities. (Horizontal axis: average number of D2D links per macrocell; vertical axis: total rate in bps/Hz; curves for Algo. 2 (SPPP), Algo. 3 (bisection), all D2D links active, and guard zone with D = 200 m.)

Fig. 11. The total rate of D2D links vs. different D2D densities. (Same axes and curves as Fig. 10.)

APPENDIX A
PROOF OF PROPOSITION 3

1) We can complete the proof by applying the Maximum Theorem with $\Phi = U_i(x_i; x_{-i}, \mu)$ [32].
2) Let $\mathcal{A}$ and $\mathcal{S}$ be any two disjoint subsets of set $\mathcal{D} = \{ 1, 2, \cdots, N_D \}$. We have

$$\begin{cases} a_i - s_i \leq 0, & \text{for } i \in \mathcal{D} \setminus (\mathcal{A} \cup \mathcal{S}), \\ 0 < a_i - s_i < P_i^D h_{ii}, & \text{for } i \in \mathcal{A}, \\ a_i - s_i \geq P_i^D h_{ii}, & \text{for } i \in \mathcal{S}. \end{cases} \qquad (13)$$

Given that the number of D2D links is finite, we have that the number of choices of disjoint $\mathcal{A}$ and $\mathcal{S}$ is finite. Therefore, the domain of $f_L$ can be partitioned into finitely many polyhedra. With $s = Gx$, the inequalities (13) can be changed to inequalities in terms of $x$, which define a (possibly empty) polyhedron in $\mathcal{X}$. On the polyhedron $P_n$, according to (10), we have $x_i = \frac{a_i - s_i}{P_i^D h_{ii}}$ for $i \in \mathcal{A}^{(n)}$, $x_i = 1$ for $i \in \mathcal{S}^{(n)}$ and $x_i = 0$ otherwise, which can be expressed in a matrix form as $x_i = B_i^{(n)} s_i + b_i^{(n)}$, where $B_i^{(n)}$ is defined in Proposition 3. Combining with $s = Gx$, we complete the proof.

APPENDIX B
PROOF OF PROPOSITION 5

1) We define a function $g : \mathbb{R}_+ \to \mathcal{X}$ as $g(\mu) = (g_1(\mu), \ldots, g_{N_D}(\mu))$, where $g_i(\mu) = x_i(\mu, x_{-i}(0))$, $x(0)$ is a given initial vector, and $x_i(\mu, x_{-i}(0))$ is calculated according to (10) by fixing $x_{-i} = x_{-i}(0)$. Observing (10), we can see that $g_i(\mu)$ is a continuous function for a given $x_{-i}$. According to properties of continuous functions (see, e.g., Theorem 4.10 in [40]), $g(\mu)$ is continuous due to the fact that each of the functions $g_1(\mu), \ldots, g_{N_D}(\mu)$ is continuous. Proposition 3 shows that the best-response function $f_L : \mathcal{X} \to \mathcal{X}$ is continuous. Invoking properties of continuous functions again (see, e.g., Theorem 4.7 in [40]), we can conclude that $f_L(g(\mu))$ is a continuous mapping, which implies that the NE $x_i^*(\mu)$ is a continuous function of $\mu$. Therefore, $U_{c1}$ is also a continuous function of $\mu$.

2) Recall that $\mathcal{A}$ and $\mathcal{S}$ denote the sets of active D2D links and of saturated D2D links, respectively. Without loss of generality, let $\mathcal{A} = \{ 1, \ldots, n \}$ and $\mathcal{S} = \{ n+1, \ldots, n+m \}$. We denote $x_a = [x_1, \cdots, x_n]^T$, $x_s = [x_{n+1}, \cdots, x_{n+m}]^T$ and $x_0 = [x_{n+m+1}, \cdots, x_{N_D}]^T$. Let $H_{aa}$ be a matrix with $(i, j)$th element being $\frac{P_j^D h_{ji}}{P_i^D h_{ii}}$ for $i, j \in \mathcal{A}$, $W_a = \left[ \frac{w_1}{P_1^D g_{11}}, \cdots, \frac{w_n}{P_n^D g_{nn}} \right]^T$ and $C_a = \left[ \frac{I_1^C + I_1^D}{P_1^D h_{11}}, \cdots, \frac{I_n^C + I_n^D}{P_n^D h_{nn}} \right]^T$, where $I_i^D = \sum_{j \in \mathcal{S}} P_j^D h_{ji}$. According to (10), we have $x_a = (H_{aa})^{-1} \left[ \frac{W_a}{\mu} - C_a \right]$, $x_s = \mathbf{1}$ and $x_0 = \mathbf{0}$. The domain of function $U_{c1}$ can be divided into finitely many polyhedra according to different $\mathcal{A}$ and $\mathcal{S}$. In each polyhedron, we have

$$U_{c1}(\mu) = \mu \sum_{i \in \mathcal{A}} x_i P_i^D g_{ii} + \mu \sum_{i \in \mathcal{S}} P_i^D g_{ii} = \mu \beta_a^T (H_{aa})^{-1} \left[ \frac{W_a}{\mu} - C_a \right] + \mu \beta_s^T \mathbf{1} = \beta_a^T (H_{aa})^{-1} \left[ W_a + \left( H_{aa} \tilde{\beta} \left( \beta_s^T \mathbf{1} \right) - C_a \right) \mu \right], \qquad (14)$$

where $\beta_a = [P_1^D g_{11}, P_2^D g_{22}, \cdots, P_n^D g_{nn}]^T$, $\beta_s = \left[ P_{n+1}^D g_{(n+1)(n+1)}, \cdots, P_{n+m}^D g_{(n+m)(n+m)} \right]^T$, $\tilde{\beta} = \left[ \frac{1}{P_1^D g_{11}}, 0, \cdots, 0 \right]^T$, and $\mathbf{1} = [1, 1, \cdots, 1]^T$. Therefore, $U_{c1}$ is a linear function in each given polyhedron, and thus it is piecewise affine.

3) We use the same argument as the proof of Theorem 1 in [22]. The first condition is equivalent to the matrix $H_{aa}^{(\beta)} := \mathrm{diag}(\beta_a) H_{aa} (\mathrm{diag}(\beta_a))^{-1}$ being strictly (column-wise) diagonally dominant. Defining $C_a^{(\beta)} = \mathrm{diag}(\beta_a) \left( C_a - H_{aa} \tilde{\beta} \left( \beta_s^T \mathbf{1} \right) \right)$, we have

$$\beta_a^T H_{aa}^{-1} \left( C_a - H_{aa} \tilde{\beta} \left( \beta_s^T \mathbf{1} \right) \right) = \mathbf{1}^T \left( H_{aa}^{(\beta)} \right)^{-1} C_a^{(\beta)}. \qquad (15)$$

To show that $U_{c1}$ is non-increasing, we need to show that (15) is non-negative. The first condition is a sufficient condition for $\mathbf{1}^T \left( H_{aa}^{(\beta)} \right)^{-1} \geq 0$. A similar proof can be found in [22], and we omit the details. The remaining proof is to show $C_a - H_{aa} \tilde{\beta} \left( \beta_s^T \mathbf{1} \right) \geq 0$. The $i$th element of the left term of the above inequality is $\frac{1}{P_i^D h_{ii}} \left( I_i^C + \sum_{j \in \mathcal{D}_s} P_j^D h_{ji} - \frac{h_{1i}}{g_{11}} \sum_{j \in \mathcal{D}_s} P_j^D g_{jj} \right)$, which implies that $\sum_{j \in \mathcal{D}_s} P_j^D \left( h_{ji} - \frac{h_{1i}}{g_{11}} g_{jj} \right) \geq 0, \forall i \in \mathcal{D}_a$ is a sufficient condition to make $C_a - H_{aa} \tilde{\beta} \left( \beta_s^T \mathbf{1} \right) \geq 0$. Combining with $\mathbf{1}^T \left( H_{aa}^{(\beta)} \right)^{-1} \geq 0$, we can conclude that (15) is non-negative, and thus $U_{c1}$ is non-increasing.

APPENDIX C
PROOF OF PROPOSITION 6

We first show that the set of intersection points of $U_{c1}$ and $U_{c2}$ on $[0, \mu_{\max}]$ is non-empty, and then show that the bisection algorithm converges to one of the intersection points. Prop. 5 shows that $U_{c1}$ is a continuous function of $\mu$.
It is easy to observe that the other $U_c$ function is also a continuous function of $\mu$. Therefore, the difference of the two functions is a continuous function of $\mu$. Recalling the assumption that when all D2D links are active, the interference from D2D to BSs is greater than the interference tolerance level, this difference is positive when $\mu = 0$. On the other hand, when $\mu = \mu_{\max}$, the difference is negative. According to the intermediate value theorem, there is some $\mu \in [0, \mu_{\max}]$ at which the difference equals zero. In other words, there is at least one intersection point between the two functions on $[0, \mu_{\max}]$.

Adopting the bisection algorithm, the interval is divided into two halves at each iteration. The interval at iteration $t$ is denoted by $L(t) = [a_t, b_t]$, where $a_0 = 0$ and $b_0 = \mu_{\max}$. According to the procedure of the bisection algorithm, the difference is non-negative at $a_t$ and non-positive at $b_t$ at each iteration $t$. As in the proof of the existence of intersection points on $[0, \mu_{\max}]$, we can show that there is at least one intersection point between the two functions on $L(t)$. Therefore, the bisection algorithm preserves the existence of intersection points in the current interval. The length of the interval $L(t)$ satisfies $|L(t)| = |L(t-1)|/2 = \cdots = \mu_{\max}/2^t$. The algorithm must stop when $|L(t)| \leq \epsilon$, which implies that it converges, and the maximum number of iterations for convergence, denoted by $T$, satisfies $\mu_{\max}/2^T = \epsilon$, i.e., $T = \log_2(\mu_{\max}/\epsilon)$.

REFERENCES
[1] K. Doppler, M. Rinne, C. Wijting, C. Ribeiro, and K. Hugl, “Device-to-device communication as an underlay to LTE-Advanced networks,”
IEEE Communications Magazine, vol. 47, pp. 42–49, Dec. 2009.
[2] M. S. Corson, R. Laroia, J. Li, V. Park, T. Richardson, and G. Tsirtsis, “Toward proximity-aware internetworking,” IEEE Wireless Communications, vol. 17, pp. 26–33, Dec. 2010.
[3] X. Wu, S. Tavildar, S. Shakkottai, T. Richardson, J. Li, R. Laroia, and A. Jovicic, “FlashlinQ: A synchronous distributed scheduler for peer-to-peer ad hoc networks,” in Allerton Conference on Communication, Control, and Computing, pp. 514–521, 2010.
[4] G. Fodor, E. Dahlman, G. Mildh, S. Parkvall, N. Reider, G. Miklós, and Z. Turányi, “Design aspects of network assisted device-to-device communications,” IEEE Communications Magazine, vol. 50, pp. 170–177, Mar. 2012.
[5] X. Lin, J. G. Andrews, A. Ghosh, and R. Ratasuk, “An overview on 3GPP device-to-device proximity services,” IEEE Communications Magazine, vol. 52, pp. 40–48, Apr. 2014.
[6] Qualcomm, “LTE direct always-on device-to-device proximal discovery,” white paper.
[7] …, IEEE Communications Magazine, vol. 52, pp. 56–65, Apr. 2014.
[8] P. Marques, J. Bastos, and A. Gameiro, “Opportunistic use of 3G uplink licensed bands,” in Proc., IEEE Intl. Conf. on Communications, pp. 3588–3592, 2008.
[9] Q. Ye, M. Al-Shalash, C. Caramanis, and J. G. Andrews, “Resource optimization in device-to-device cellular systems using time-frequency hopping,” IEEE Trans. on Communications, vol. 13, pp. 5467–5480, Oct. 2014.
[10] C.-H. Yu, K. Doppler, C. B. Ribeiro, and O. Tirkkonen, “Resource sharing optimization for device-to-device communication underlaying cellular networks,” IEEE Transactions on Wireless Communications, vol. 10, pp. 2752–2763, Aug. 2011.
[11] C.-H. Yu, O. Tirkkonen, K. Doppler, and C. Ribeiro, “On the performance of device-to-device underlay communication with simple power control,” in Proc., IEEE Veh. Technology Conf., 2009.
[12] J. Gu, S. J. Bae, B.-G. Choi, and M. Y. Chung, “Dynamic power control mechanism for interference coordination of device-to-device communication in cellular networks,” in International Conference on Ubiquitous and Future Networks (ICUFN), pp. 71–75, 2011.
[13] P. Janis, V. Koivunen, C. Ribeiro, J. Korhonen, K. Doppler, and K. Hugl, “Interference-aware resource allocation for device-to-device radio underlaying cellular networks,” in Proc., IEEE Veh. Technology Conf., 2009.
[14] M. Zulhasnine, C. Huang, and A. Srinivasan, “Efficient resource allocation for device-to-device communication underlaying LTE network,” in IEEE Wireless Mobile Computing, Networking and Communications (WiMob), pp. 368–375, 2010.
[15] S. Xu, H. Wang, T. Chen, Q. Huang, and T. Peng, “Effective interference cancellation scheme for device-to-device communication underlaying cellular networks,” in Proc., IEEE Veh. Technology Conf., 2010.
[16] H. Min, J. Lee, S. Park, and D. Hong, “Capacity enhancement using an interference limited area for device-to-device uplink underlaying cellular networks,” IEEE Transactions on Wireless Communications, vol. 10, pp. 3995–4000, Dec. 2011.
[17] T. Chen, G. Charbit, and S. Hakola, “Time hopping for device-to-device communication in LTE cellular system,” in Proc., IEEE Wireless Networking and Comm. Conf., 2010.
[18] L. Cao and H. Zheng, “Distributed spectrum allocation via local bargaining,” in Proc. IEEE SECON, pp. 475–486, 2005.
[19] R. Etkin, A. Parekh, and D. Tse, “Spectrum sharing for unlicensed bands,” IEEE Journal on Sel. Areas in Communications, vol. 25, pp. 517–528, Apr. 2007.
[20] C. Xu, L. Song, Z. Han, Q. Zhao, X. Wang, and B. Jiao, “Interference-aware resource allocation for device-to-device communications as an underlay using sequential second price auction,” in Proc., IEEE Intl. Conf. on Communications, pp. 445–449, 2012.
[21] C. Xu, L. Song, Z. Han, Q. Zhao, X. Wang, X. Cheng, and B. Jiao, “Efficiency resource allocation for device-to-device underlay communication systems: a reverse iterative combinatorial auction based approach,” IEEE Journal on Sel. Areas in Communications, vol. 31, pp. 348–358, Sep. 2013.
[22] M. Razaviyayn, Z.-Q. Luo, P. Tseng, and J.-S. Pang, “A Stackelberg game approach to distributed spectrum management,” Mathematical Programming, vol. 129, pp. 197–224, July 2011.
[23] F. Wang, L. Song, Z. Han, Q. Zhao, and X. Wang, “Joint scheduling and resource allocation for device-to-device underlay communication,” in Proc., IEEE Wireless Networking and Comm. Conf., 2013.
[24] L. Song, D. Niyato, Z. Han, and E. Hossain, “Game-theoretic resource allocation methods for device-to-device communication,” IEEE Wireless Communications, vol. 21, pp. 136–144, June 2014.
[25] C. Xu, L. Song, Z. Han, D. Li, and B. Jiao, “Resource allocation using a reverse iterative combinatorial auction for device-to-device underlay cellular networks,” in Proc., IEEE Globecom, pp. 4542–4547, 2012.
[26] M. J. Osborne, A Course in Game Theory. Cambridge, Mass.: MIT Press, 1994.
[27] X. Lin, J. G. Andrews, and A. Ghosh, “Spectrum sharing for device-to-device communication in cellular networks,” IEEE Trans. on Communications, vol. PP, pp. 1–1, Sep. 2014.
[28] D. Tse and P. Viswanath, Fundamentals of Wireless Communication. Cambridge University Press, 2005.
[29] G. Debreu, “A social equilibrium existence theorem,” Proc. of the National Academy of Sciences, vol. 38, pp. 886–893, Oct. 1952.
[30] K. Fan, “Fixed-point and minimax theorems in locally convex topological linear spaces,” Proc. of the National Academy of Sciences, vol. 38, pp. 121–126, Feb. 1952.
[31] I. L. Glicksberg, “A further generalization of the Kakutani fixed point theorem, with application to Nash equilibrium points,” Proc. of the American Mathematical Society, vol. 3, no. 1, pp. 170–174, 1952.
[32] K. W. Shum, K. K. Leung, and C. W. Sung, “Convergence of iterative waterfilling algorithm for Gaussian interference channels,” IEEE Journal on Sel. Areas in Communications, vol. 25, pp. 1091–1100, Aug. 2007.
[33] D. P. Bertsekas and J. N. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods. Prentice Hall Inc., 1989.
[34] K. S. Narendra and M. A. Thathachar, Learning Automata: An Introduction. Prentice-Hall, 1989.
[35] S. P. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.
[36] W. Yu, G. Ginis, and J. M. Cioffi, “Distributed multiuser power control for digital subscriber lines,” IEEE Journal on Sel. Areas in Communications, vol. 20, pp. 1105–1115, June 2002.
[37] N. Jindal, W. Rhee, S. Vishwanath, S. A. Jafar, and A. Goldsmith, “Sum power iterative water-filling for multi-antenna Gaussian broadcast channels,” IEEE Transactions on Information Theory, vol. 51, pp. 1570–1580, Apr. 2005.
[38] A. Granas, Fixed Point Theory. Springer, 2003.
[39] R. W. Cottle, J. S. Pang, and R. E. Stone, The Linear Complementarity Problem. Academic Press, 1992.
[40] W. Rudin,