Power control with partial observation in wireless ad hoc networks
Sara Berri∗,†, Samson Lasaulce†, and Mohammed Said Radjef∗
∗ Research Unit LaMOS (Modeling and Optimization of Systems), Faculty of Exact Sciences, University of Bejaia, Bejaia, 06000, Algeria
† L2S (CNRS-CentraleSupelec-Univ. Paris Sud), Gif-sur-Yvette, France
ABSTRACT
In this paper, the well-known forwarder's dilemma is generalized by accounting for the presence of link quality fluctuations; the forwarder's dilemma is a four-node interaction model with two source nodes and two destination nodes. It is known to be very useful to study ad hoc networks. To characterize the long-term utility region when the source nodes have to control their power with partial channel state information (CSI), we resort to a recent result in Shannon theory. It is shown how to exploit this theoretical result to find the long-term utility region and determine good power control policies. This region is of prime importance since it provides the best performance possible for a given knowledge at the nodes. Numerical results provide several new insights into the repeated forwarder's dilemma power control problem; for instance, the knowledge of global CSI only brings a marginal performance improvement with respect to the local CSI case.
1. INTRODUCTION
In wireless ad hoc networks, nodes are typically interdependent. One node needs the assistance of neighboring nodes to relay the information it wants to send to the receiver(s). Therefore, nodes have to relay signals or packets while managing the energy they spend on helping other nodes. To study the tradeoff between a cooperative behavior, which is necessary to convey information through an ad hoc network, and a selfish behavior, which aims at managing the node energy, the authors of [1] proposed a very simple but efficient model. This model has been found to be important and insightful in the ad hoc network literature, as shown by the many papers which exploit it. The model consists in studying, in possibly large ad hoc networks, the local interaction between four nodes (see Fig. 1). Node 1 (resp. Node 2) wants to send information to its destination node and, for this purpose, needs the assistance of Node 2 (resp. Node 1). In the original model of [1], Node 1 (resp. Node 2) has two possible choices, namely, forward or drop the packets it receives from Node 2 (resp. Node 1). Each node is assumed to maximize a utility function which is the sum of a data rate term (which is maximized when the other node forwards its packets) and an energy term (which is maximized when the node does not forward the packets of the other node). At the Nash equilibrium of the corresponding strategic-form game (called the forwarder's dilemma in the corresponding literature), nodes do not transmit at all. To avoid this, cooperation has to be stimulated, e.g., by studying the repeated interaction between the nodes [1] [2] or by implementing incentive mechanisms [3] [4].
While providing an efficient solution, these models still have some limitations, which this paper aims to overcome.

If we interpret the model of [1] as a power control problem in which each node has to decide whether to transmit at maximum or minimum power to relay the packet of the other node, four limitations appear in the formulation of [1]. First, the node only chooses the cooperation power while, in an ad hoc wireless network such as a sensor network, it also has to choose the power used to send its own packets. Second, the power control scheme does not take into account the quality of the link between the transmitting node and the receiving node. Third, no framework is provided to tackle the scenario where the nodes have only partial observation of the actions of the other nodes and of the channel gains. Fourth, no framework is provided to reach any point of the feasible utility region, especially in the presence of partial observation.

To contribute to overcoming these limitations, we exploit the recent Shannon-theoretic works [5], [6]. These works give coding theorems which we exploit here in a constructive manner. This allows us to find the feasible utility region of the considered problem in the presence of partial observation and, more importantly, to provide a numerical algorithm to obtain globally efficient power control policies.

The paper is structured as follows. In Sec. 2, we provide a local interaction model between four nodes of a (possibly large) ad hoc network which generalizes the famous model introduced in [1]. In Sec. 3, we explain how the feasible utility region of a strategic-form game with partial observation can be obtained and provide a numerical technique to determine decision functions; note that this works for any utility function and not only for those assumed in Sec. 2. Sec. 4 corresponds to the numerical performance analysis.

Fig. 1. The studied scenario: each node $i \in \{1, 2\}$ uses the power couple $(p_i, p'_i)$, with $p_i$ for its own packets and $p'_i$ for the other node's packets.
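The one-shot forwarder's dilemma recalled in this section can be checked numerically. The sketch below is only an illustration: the payoff values `benefit` and `cost` and the function names are assumptions chosen for the example, not quantities taken from [1]. Each node earns `benefit` when the other node forwards and pays `cost` when it forwards itself; a brute-force search over pure action profiles then shows that mutual dropping is the only Nash equilibrium, although mutual forwarding gives both nodes a higher payoff.

```python
from itertools import product

FORWARD, DROP = "F", "D"

def payoff(own, other, benefit=1.0, cost=0.3):
    """Rate term earned if the other node forwards, minus own forwarding cost.

    `benefit` and `cost` are hypothetical values chosen for illustration.
    """
    gain = benefit if other == FORWARD else 0.0
    spent = cost if own == FORWARD else 0.0
    return gain - spent

def pure_nash_equilibria(benefit=1.0, cost=0.3):
    """Enumerate pure action profiles where no node gains by deviating alone."""
    actions = (FORWARD, DROP)
    equilibria = []
    for a1, a2 in product(actions, repeat=2):
        node1_ok = all(
            payoff(a1, a2, benefit, cost) >= payoff(d, a2, benefit, cost)
            for d in actions
        )
        node2_ok = all(
            payoff(a2, a1, benefit, cost) >= payoff(d, a1, benefit, cost)
            for d in actions
        )
        if node1_ok and node2_ok:
            equilibria.append((a1, a2))
    return equilibria
```

Calling `pure_nash_equilibria()` with any `0 < cost < benefit` returns the single profile where both nodes drop, which is the inefficiency that repeated interaction or incentive mechanisms are meant to remove.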
2. PROPOSED PROBLEM FORMULATION
In this section, we provide a model which generalizes the model of [1] (see Fig. 1). The quality of the four links between the four nodes is assumed to fluctuate over time. More precisely, we assume a block-fading law for the four channel gains (see Fig. 1), namely, each channel gain is assumed to be constant over the duration of a block or packet and to vary from block to block in an i.i.d. manner. Nodes 1 and 2 are the source nodes while the two other nodes are the destination nodes. For each block, Node $i$, $i \in \{1, 2\}$, has to choose $p_i$, the power it uses to send its own packet to the closest node, and $p'_i$, the cooperative power it uses to relay the packet of the other source node. The channel gains $g_1, g'_1, g_2, g'_2$ and the transmit powers $p_1, p'_1, p_2, p'_2$ are assumed to lie in discrete sets: $\mathcal{G}_i = \{g_i^1, \ldots, g_i^N\}$ and $\mathcal{G}'_i = \{g_i'^1, \ldots, g_i'^N\}$, with $g_i^1 = g_i^{\min}$, $g_i^N = g_i^{\max}$, $g_i'^1 = g_i'^{\min}$ and $g_i'^N = g_i'^{\max}$, $N \geq 1$; $\mathcal{P}_i = \mathcal{P}'_i = \{p_i^1, \ldots, p_i^M\}$, with $p_i^1 = P_i^{\min}$ and $p_i^M = P_i^{\max}$, $M \geq 1$. Assuming that the sets are discrete is of practical interest, as there exist wireless communication standards in which the power can only be decreased or increased by steps and in which quantized wireless channel state information is used (see e.g., [7] [8]).

The performance metric of Node $i \in \{1, 2\}$, which will be called utility, is chosen as follows:
$$u_i(x, a_1, a_2) = \varphi(\mathrm{SNR}_i) - \alpha (p_i + p'_i), \qquad (1)$$
where: $x = (g_1, g'_1, g_2, g'_2)$ is the global channel state; $a_i = (p_i, p'_i)$ is the action of Node $i$; $\mathrm{SNR}_i$ is the signal-to-noise ratio.
Since a source node sends packets through an intermediate relay node, using a two-hop communication link, $\mathrm{SNR}_i$ depends on both the source node and the relay node:
$$\mathrm{SNR}_i = \frac{p_i g_i \, p'_{-i} g'_{-i}}{\sigma^2}, \qquad (2)$$
$\sigma^2$ being the noise variance and $-i$ standing for the other source node. The function $\varphi$ is a communication efficiency function which is assumed to be increasing and to lie in $[0, 1]$; it may typically represent the packet success rate. Typical choices for $\varphi$ are $\varphi(x) = (1 - e^{-x})^L$, $L$ being the number of symbols per packet (see e.g., [9] [10] [11]), or $\varphi(x) = e^{-c/x}$ with $c = 2^r - 1$, $r$ being the spectral efficiency in bit/s per Hz [12]. The parameter $\alpha \geq 0$ allows one to assign more or less importance to the energy consumption of the node. Note that the assumed expression of the SNR may, for instance, be justified when nodes implement the amplify-and-forward protocol to relay the signals or packets to the destination [13]. The set of nodes $\mathcal{I} = \{1, 2\}$, the action sets $\mathcal{A}_i = \mathcal{P}_i \times \mathcal{P}'_i$, and the utility functions $u_1, u_2$ define a static or one-shot strategic-form game. The original one-shot game model of [1] can be obtained by assuming that $\varphi$ is a step function, $p_i$ is constant, $p'_i$ is binary, and all the channel gains are constant.

The problem we want to solve in this paper is as follows. Assuming that the nodes interact over $T \geq 1$ stages or blocks and that they have a certain knowledge of the channel state, what is the utility region of the system? And which power control policy should be used to reach a given point of this region? To answer these questions, we associate a long-term dynamic game with the assumed one-shot game. The dynamic game we consider can be put in strategic form. The set of players is $\mathcal{I} = \{1, 2\}$. The strategies or power control policies are defined as follows:
$$\sigma_{i,t} : \mathcal{S}_i^t \to \mathcal{A}_i, \quad (s_i(1), \ldots, s_i(t)) \mapsto a_i(t), \qquad (3)$$
where $t \geq 1$ is the stage or block index, $\mathcal{S}_i^t = \mathcal{S}_i \times \mathcal{S}_i \times \cdots \times \mathcal{S}_i$ ($t$ times), $\mathcal{S}_i$ being the set of signals or observations of Node $i$. The observation $s_i(t) \in \mathcal{S}_i$ corresponds to the image Node $i$ has of the global channel state $x(t)$ at stage $t$. This image is assumed to be the output of a memoryless channel whose conditional probability is $\daleth_i(s_i | x)$. The long-term utility of Node $i$ is defined as:
$$U_i(\sigma_1, \sigma_2) = \mathbb{E}\left[ \frac{1}{T} \sum_{t=1}^{T} u_i(x(t), a_1(t), a_2(t)) \right]. \qquad (4)$$
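The utility (1), the two-hop SNR (2) and the time average in (4) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the efficiency function is taken as $\varphi(x) = e^{-c/x}$ with $c = 2^r - 1$ (one of the choices cited from [12]), the default values of `alpha`, `r` and `noise_var` are arbitrary, and `long_term_utility` assumes stationary policies that see the full state, which is a special case of the strategies in (3). All function and parameter names are hypothetical.

```python
import math

def efficiency(snr, r=1.0):
    """phi(x) = exp(-c / x) with c = 2**r - 1; defined as 0 at zero SNR."""
    c = 2.0 ** r - 1.0
    return math.exp(-c / snr) if snr > 0 else 0.0

def two_hop_snr(p_own, g_own, p_relay, g_relay, noise_var=1.0):
    """Eq. (2): p_i * g_i * p'_{-i} * g'_{-i} / sigma^2."""
    return p_own * g_own * p_relay * g_relay / noise_var

def utility(i, x, a1, a2, alpha=0.01, r=1.0, noise_var=1.0):
    """Eq. (1) for node i in {1, 2}.

    x  = (g1, g1p, g2, g2p) is the global channel state;
    a1 = (p1, p1p) and a2 = (p2, p2p) are (own power, relaying power).
    """
    g1, g1p, g2, g2p = x
    (p1, p1p), (p2, p2p) = a1, a2
    if i == 1:
        snr = two_hop_snr(p1, g1, p2p, g2p, noise_var)
        return efficiency(snr, r) - alpha * (p1 + p1p)
    if i == 2:
        snr = two_hop_snr(p2, g2, p1p, g1p, noise_var)
        return efficiency(snr, r) - alpha * (p2 + p2p)
    raise ValueError("node index must be 1 or 2")

def long_term_utility(i, policy1, policy2, sample_state, T=1000, **kwargs):
    """Empirical counterpart of eq. (4) for stationary policies x -> a_i.

    `sample_state` is a user-supplied callable returning one i.i.d. state.
    """
    total = 0.0
    for _ in range(T):
        x = sample_state()
        total += utility(i, x, policy1(x), policy2(x), **kwargs)
    return total / T
```

For example, `utility(1, (1, 1, 1, 1), (1, 1), (1, 0), alpha=0.1)` is negative: when Node 2 does not relay, Node 1 collects no rate term and only pays for the power it spends.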
3. UTILITY REGION CHARACTERIZATION ANDPROPOSED POWER CONTROL POLICIES
The characterization of the feasible utility region of dynamic games (which include repeated games as a special case) with an arbitrary observation structure is still an open problem [14]. Remarkably, as shown recently in [5] [6], the problem can be solved for some setups which are quite generic in wireless communications. Indeed, when $T$ is assumed to be large, the random process associated with the system state $X(1), \ldots, X(T)$ is i.i.d., and the observation structure is given by $(\daleth_1, \daleth_2)$, it is possible to derive a coding theorem which characterizes the joint probability distributions $Q(x, a_1, a_2)$ which can be attained. We exploit this theorem here to determine numerically the feasible utility region of the considered dynamic game and to derive a good power control policy. According to [6] [15], a joint distribution $Q(x, a_1, a_2)$ is achievable in the limit $T \to \infty$ if and only if it factorizes as:
$$Q(x, a_1, a_2) = \sum_{v, s_1, s_2} \rho(x) \, P_V(v) \, \daleth(s_1, s_2 | x) \, P_{A_1 | S_1, V}(a_1 | s_1, v) \, P_{A_2 | S_2, V}(a_2 | s_2, v), \qquad (5)$$
where: $\rho$ is the probability distribution of the channel state; $\daleth$ is the joint conditional probability which defines the assumed observation structure; $V \in \mathcal{V}$ is an auxiliary random variable or lottery which can be proved to improve the performance in general (see [5] for more details). By exploiting this result, it follows that a pair of expected payoffs $(U_1, U_2)$ is achievable if it can be written as $(\mathbb{E}_Q(u_1), \mathbb{E}_Q(u_2))$ where $Q$ factorizes as in (5). By a time-sharing argument, the achievable utility region is convex. Therefore, the Pareto frontier of the utility region can be obtained by maximizing the weighted utility
$$W_\lambda = \lambda \, \mathbb{E}_Q(u_1) + (1 - \lambda) \, \mathbb{E}_Q(u_2) \qquad (6)$$
with respect to $Q$, for every $\lambda \in [0, 1]$. In (5), $\rho$ and $\daleth$ are given. Thus, $W_\lambda$ has to be maximized with respect to the triplet $(P_{A_1 | S_1, V}, P_{A_2 | S_2, V}, P_V)$.
In this paper, we restrict our attention to the optimization of $(P_{A_1 | S_1, V}, P_{A_2 | S_2, V})$ for a fixed lottery $P_V$ and leave the general case as an extension. Given this, the optimization problem at hand becomes a bilinear problem: the function to be maximized is bilinear and the optimization space is the unit simplex $\Delta(\mathcal{A}_1 \times \mathcal{A}_2 \times \mathcal{V})$. It can be checked (see e.g., [16]) that the optimal solutions of this problem lie at the vertices of the unit simplex $\Delta(\mathcal{A}_1 \times \mathcal{A}_2 \times \mathcal{V})$. This key observation indicates that there is no loss of optimality in searching only for functions instead of general conditional probability distributions. The variables to be optimized are thus functions, which we denote by $f_i : \mathcal{S}_i \times \mathcal{V} \to \mathcal{A}_i$. The corresponding bilinear program can be solved by using techniques such as the one proposed in [17], but global convergence is not guaranteed. Two other relevant numerical techniques have also been proposed in [16]. The first technique is based on cutting planes while the second one consists of an enlarging polytope approach. For both techniques, convergence may also be an issue since, for the first technique, no convergence result is provided and, for the second technique, cycles may appear [18]. To solve the convergence issue, we propose to exploit the sequential best-response dynamics (see e.g., [19], [20]), which has been used recently for the interference channel [21]. The sequential best-response dynamics applied in the considered setup works as follows. At iteration 0, an initial choice for the conditional probabilities which define the decisions of the two nodes is made: $(P_{A_1 | S_1, V}, P_{A_2 | S_2, V}) = (P^{(0)}_{A_1 | S_1, V}, P^{(0)}_{A_2 | S_2, V})$. At iteration 1, (6) is maximized with respect to $P^{(1)}_{A_1 | S_1, V}$. At iteration 2, $P_{A_2 | S_2, V}$ is updated by maximizing (6) with respect to $P^{(2)}_{A_2 | S_2, V}$, and so on.
Since the same utility (namely, $W_\lambda$) is maximized at each iteration, its value cannot decrease from one iteration to the next and, the sets of decision functions being finite, the sequential best-response dynamics is guaranteed to converge; the proof follows by induction or by using a more general argument such as exact potentiality. Convergence to a global maximum is however not guaranteed in general. The only argument the authors can provide is that extensive simulations show that the loss of optimality induced by operating at the points the algorithm reaches is small. An important point to note here is that the proposed procedure provides stationary power control strategies, or power control decision functions, which can be used in practice.
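The sequential best-response dynamics described above can be sketched on a toy instance. The following is an illustration under strong simplifying assumptions, not the simulation setup of the paper: the lottery $V$ is trivial, each node observes its own two gains without noise (a deterministic special case of the memoryless observation channel), the gain and power alphabets have two levels each, $\varphi(x) = e^{-c/x}$ with $c = 1$ and unit noise variance, and all names and numerical values are hypothetical.

```python
import math
from itertools import product

GAINS = (0.5, 2.0)                            # hypothetical gain levels
POWERS = (0.0, 1.0)                           # hypothetical power levels
ACTIONS = tuple(product(POWERS, repeat=2))    # a_i = (own power, relaying power)
STATES = tuple(product(GAINS, repeat=4))      # x = (g1, g1', g2, g2')
SIGNALS = tuple(product(GAINS, repeat=2))     # local CSI of one node
RHO = {x: 1.0 / len(STATES) for x in STATES}  # uniform i.i.d. state law

def phi(snr, c=1.0):
    return math.exp(-c / snr) if snr > 0 else 0.0

def w(x, a1, a2, lam=0.5, alpha=0.1):
    """lam * u1 + (1 - lam) * u2 for one state and one action pair."""
    g1, g1p, g2, g2p = x
    (p1, p1p), (p2, p2p) = a1, a2
    u1 = phi(p1 * g1 * p2p * g2p) - alpha * (p1 + p1p)
    u2 = phi(p2 * g2 * p1p * g1p) - alpha * (p2 + p2p)
    return lam * u1 + (1.0 - lam) * u2

def local_csi(x):
    """Deterministic observations: node i sees only its own two gains."""
    return (x[0], x[1]), (x[2], x[3])

def value(f1, f2):
    """Weighted utility (6) for decision functions stored as dicts."""
    total = 0.0
    for x in STATES:
        s1, s2 = local_csi(x)
        total += RHO[x] * w(x, f1[s1], f2[s2])
    return total

def best_response(node, f_other):
    """Maximize (6) over one node's decision function, the other being fixed."""
    scores = {}
    for x in STATES:
        s = local_csi(x)
        s_own, s_oth = s[node], s[1 - node]
        row = scores.setdefault(s_own, dict.fromkeys(ACTIONS, 0.0))
        for a in ACTIONS:
            pair = (a, f_other[s_oth]) if node == 0 else (f_other[s_oth], a)
            row[a] += RHO[x] * w(x, *pair)
    return {s: max(row, key=row.get) for s, row in scores.items()}

def sequential_best_response(f1, f2, max_iter=50, tol=1e-12):
    """Alternate the two updates until (6) stops increasing."""
    history = [value(f1, f2)]
    for _ in range(max_iter):
        f1 = best_response(0, f2)
        f2 = best_response(1, f1)
        history.append(value(f1, f2))
        if history[-1] - history[-2] <= tol:
            break
    return f1, f2, history

full = {s: (1.0, 1.0) for s in SIGNALS}
silent = {s: (0.0, 0.0) for s in SIGNALS}
f1, f2, hist = sequential_best_response(dict(full), dict(full))
_, _, hist_silent = sequential_best_response(dict(silent), dict(silent))
```

In this toy instance the recorded values in `hist` never decrease, in line with the convergence argument. Starting from the all-silent decision functions, the dynamics stays at zero weighted utility, whereas starting from full power it ends at a strictly positive value: this illustrates why the reached point depends on the initialization and why global optimality is not guaranteed.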
4. NUMERICAL PERFORMANCE ANALYSIS
By default, the following simulation setup is assumed; otherwise, the parameters are explicitly mentioned in the figures. Set of possible power levels: $\forall i \in \{1, 2\}$, $\mathcal{P}_i = \mathcal{P}'_i$ with $M = 25$, $P_i^{\min}$ (dB) $= -$…, $P_i^{\max}$ (dB) $= +20$, and the power increment in dB equals …. Set of possible channel gains: $\forall i \in \{1, 2\}$, $\mathcal{G}_i = \mathcal{G}'_i$ with $N = 20$, $g_i^{\min} = 0.$…, $g_i^{\max} = 10$, and the channel gain increment equals …. The means of the channel gains are given by $(\bar{g}_1, \bar{g}'_1, \bar{g}_2, \bar{g}'_2)$ = (1, …, …, …). The communication efficiency function is chosen as in [12]: $\varphi(x) = e^{-c/x}$ with $c = 2^r - 1$, $r$ being the spectral efficiency in bit/s per Hz. In the simulations provided we have either $r = 1$ or $r = 3$, which are reasonable values for wireless sensor network communications.

Fig. 2 represents the long-term utility region of the dynamic game presented in this paper. It has been obtained by applying the sequential best-response dynamics to (6) for different values of $\lambda$. Three scenarios are considered. The point in the bottom left corner corresponds to the performance of the Nash equilibrium of the forwarder's dilemma of [1]. The two curves correspond to the achievable pairs of long-term utilities when global channel state information is known to the two source nodes and when they have only local channel state information. First, it is seen that our new formulation provides very significant improvements over the approach which consists in studying the packet forwarding problem without accounting for the link quality fluctuations. Second, the framework we use in this paper allows one to quantify what is gained when more information is available. Here, in particular, it is seen that global channel state information only provides a relatively small gain for the utilities. This therefore advocates the use of distributed power control policies which only rely on local channel state information.

Fig. 3 represents the long-term utility of Node 1 (which coincides with that of Node 2 since the considered scenarios are symmetric) against the reciprocal of the weight assigned to the energy consumption, namely, $1/\alpha$; two values for the spectral efficiency are retained, $r = 1$ bit/s per Hz and $r = 3$ bit/s per Hz. The figure clearly shows a very significant gain in terms of utility and fully supports the proposed approach, i.e., the power control policies obtained through the proposed numerical procedure outperform classical policies.

Fig. 4 corresponds to an ad hoc network which comprises 50 nodes; 4-node interactions are assumed, which are not all disjoint, i.e., there may exist two sets of 4 nodes having up to three nodes in common. The figure shows the performance gain against the fraction of nodes which implement the proposed power control policies while the other nodes implement the Nash equilibrium forwarder's dilemma policies of [1]. The performance gain scales linearly with the fraction of "advanced nodes" and the network performance is very significantly improved when one compares the case where the fraction equals 0 to the one where it equals 1.

Fig. 2. Achievable utility region for two scenarios of partial information for the source nodes: global CSI $(g_1, g'_1, g_2, g'_2)$ and local CSI $(g_i, g'_i)$. The axes are the expected utilities $E(u_1)$ and $E(u_2)$. The performance can be compared to the isolated point which represents the one-shot forwarder's dilemma Nash equilibrium [1].
Fig. 3. Long-term utility with local CSI against the reciprocal of the weight assigned to the energy cost part in (1), that is, $1/\alpha$. The horizontal axis ranges from 100 to 1000000 and the vertical axis is the expected utility $\bar{u}$.

Fig. 4. Expected sum-utility against the proportion of advanced nodes for an ad hoc network of 50 nodes with 4-node interactions; by "advanced" it is meant that the nodes implement the power control policy proposed in this work while the other nodes implement the one-shot forwarder's dilemma Nash equilibrium power control policies.
5. CONCLUSION
One of the contributions of this work is to generalize the famous and insightful model of the forwarder's dilemma [1] by accounting for channel gain fluctuations. The question of how much the nodes know about the global CSI therefore arises. We have seen that it is possible to characterize the performance of the studied system even in the presence of partial information; the corresponding observation structure is arbitrary provided the observations are generated by a discrete memoryless channel, denoted by $\daleth$ in this paper. In terms of performance, designing power control policies which exploit the available knowledge as well as possible is shown to lead to very significant gains. A relevant extension of the present work would be to relax the i.i.d. assumption on the system state. In this work, the system state corresponds to the global channel state and the i.i.d. assumption is known to be very reasonable but, in other setups, where the state represents e.g., a queue length or an energy level, the framework used here needs to be extended.
REFERENCES
[1] M. Felegyhazi and J.-P. Hubaux, "Game Theory in Wireless Networks: A Tutorial", EPFL Technical Report, LCA-REPORT-2006-002, February 2006.
[2] J. J. Jaramillo and R. Srikant, "A Game Theory Based Reputation Mechanism to Incentivize Cooperation in Wireless Ad hoc Networks", Ad Hoc Networks, Vol. 8, No. 4, pp. 416-429, 2010.
[3] L. Buttyán and J.-P. Hubaux, "Stimulating Cooperation in Self-Organizing Mobile Ad Hoc Networks", Mobile Networks and Applications, Vol. 8, No. 5, pp. 579-592, 2003.
[4] D. E. Charilas, K. D. Georgilakis, and A. D. Panagopoulos, "ICARUS: hybrId inCentive mechAnism for coopeRation stimUlation in ad hoc networkS", Ad Hoc Networks, Vol. 10, No. 6, pp. 976-989, 2012.
[5] B. Larrousse, S. Lasaulce, and M. Wigger, "Coordinating Partially-Informed Agents over State-Dependent Networks", IEEE Proc. of the Information Theory Workshop (ITW), Jerusalem, Israel, Apr-May 2015.
[6] B. Larrousse and S. Lasaulce, "Coded Power Control: Performance Analysis", IEEE International Symposium on Information Theory, July 2013.
[7] S. Sesia, I. Toufik, and M. Baker, "LTE, The UMTS Long Term Evolution: From Theory to Practice", Wiley Publishing, 2009.
[8] A. Gjendemsjø, D. Gesbert, G. E. Øien, and S. G. Kiani, "Binary Power Control for Sum Rate Maximization over Multiple Interfering Links", IEEE Transactions on Wireless Communications, Vol. 7, No. 8, pp. 3164-3173, 2008.
[9] D. J. Goodman and N. Mandayam, "Power Control for Wireless Data", IEEE Personal Communications, Vol. 7, No. 2, pp. 48-54, April 2000.
[10] F. Meshkati, M. Chiang, H. Poor, and S. Schwartz, "A Game Theoretic Approach to Energy-Efficient Power Control in Multicarrier CDMA Systems", IEEE Journal on Selected Areas in Communications, Vol. 24, No. 6, pp. 1115-1129, 2006.
[11] S. Lasaulce, Y. Hayel, R. El Azouzi, and M. Debbah, "Introducing Hierarchy in Energy Games", IEEE Trans. on Wireless Communications, Vol. 8, No. 7, pp. 3833-3843, July 2009.
[12] E. V. Belmega and S. Lasaulce, "Energy-Efficient Precoding for Multiple-Antenna Terminals", IEEE Trans. on Signal Processing, Vol. 59, No. 1, pp. 329-340, January 2011.
[13] A. El Gamal and Y. Kim, "Network Information Theory", Cambridge University Press, 2011.
[14] M. Maschler, E. Solan, and S. Zamir, "Game Theory", Cambridge University Press, 2013.
[15] B. Larrousse, S. Lasaulce, and M. Bloch, "Coordination in Distributed Networks via Coded Actions with Application to Power Control", IEEE Transactions on Information Theory, submitted in Aug. 2015, available on arXiv.
[16] G. Gallo and A. Ülkücü, "Bilinear Programming: An Exact Algorithm", Mathematical Programming, Vol. 12, No. 1, pp. 173-194, 1977.
[17] H. Konno, "A Cutting Plane Algorithm for Solving Bilinear Programs", Mathematical Programming, Vol. 11, No. 1, pp. 14-27, 1976.
[18] H. Vaish and C. M. Shetty, "The Bilinear Programming Problem", Naval Research Logistics Quarterly, Vol. 23, No. 2, pp. 303-309, 1976.
[19] D. Fudenberg and J. Tirole, "Game Theory", MIT Press, 1991.
[20] S. Lasaulce and H. Tembine, "Game Theory and Learning for Wireless Networks: Fundamentals and Applications", Academic Press, 2011.
[21] A. Agrawal, S. Lasaulce, O. Beaude, and R. Visoz,