Green Codes: Energy-Efficient Short-Range Communication
Pulkit Grover and Anant Sahai
Wireless Foundations, Department of EECS
University of California at Berkeley, CA-94720, USA
{pulkit, sahai}@eecs.berkeley.edu

Abstract—A green code attempts to minimize the total energy per bit required to communicate across a noisy channel. The classical information-theoretic approach neglects the energy expended in processing the data at the encoder and the decoder, and only minimizes the energy required for transmissions. Since there is no cost associated with using more degrees of freedom, the traditionally optimal strategy is to communicate at rate zero. In this work, we use our recently proposed model for the power consumed by iterative message passing. Using generalized sphere-packing bounds on the decoding power, we find lower bounds on the total energy consumed in the transmissions and the decoding, allowing for freedom in the choice of the rate. We show that, contrary to the classical intuition, the rate for green codes is bounded away from zero for any given error probability. In fact, as the desired bit-error probability goes to zero, the optimizing rate for our bounds converges to 1.

I. INTRODUCTION
With the development of billion-transistor chips, the range of communication has come down dramatically from hundreds of kilometers (e.g. deep-space communication) to a few meters (e.g. ad-hoc wireless networks), a few millimeters, or even less (e.g. on-chip communication). To communicate over smaller distances, the required transmit power is much smaller. At these distances, the energy used in transmissions can be comparable to that expended by the system processes. The small size limits the ability of these chips to dissipate heat. Further, the chip might be battery operated, imposing stringent constraints on its energy usage. It is therefore of interest to design coding techniques that minimize the total energy consumed, which includes the transmission energy as well as the processing energy. We refer to coding techniques that minimize the total energy as green codes.

The classical information-theoretic approach finds the minimum transmission energy required to communicate reliably across the channel. The approach is motivated by long-range communication, which corresponds to power-constrained channels. Shannon [1] first characterized the minimum energy required to communicate across a channel at a fixed rate. The resulting bounds are expressed using 'waterfall' curves that convey the revolutionary idea that unboundedly low probabilities of bit error are attainable using only finite transmit power. This characterization raises a natural question: what is the minimum energy required for communication that is free of a rate constraint? The classical approach [2], [3] gives the minimum transmission energy required (on average) to communicate one bit reliably across the channel. For example, for an AWGN channel of noise variance 1, this minimum energy is

  lim_{P_T → 0} P_T / C(P_T) = 2 ln(2) Joules.  (1)

Since there is no penalty associated with lower rates, it is good to use as many degrees of freedom as are available, and the optimal transmission rate is zero.

The problem of minimizing combined transmission and processing energy is well studied in networks. The common thread in [4], [5], [6], [7], [8], [9] is that the energy consumed in processing the signals can be a substantial fraction of the total power. In [7], an information-theoretic formulation is considered. The authors model the processing energy by a constant ε per unit time when the transmitter is transmitting (and hence is in the 'on' state). A total of r channel uses are allowed, and the total energy available is rE, where E is a constant. Let P_i be the transmit power at the i-th time instant, and let C(P_i) be the capacity of the corresponding channel. The problem is then to transmit the maximum number of bits with total energy less than rE. That is,

  max Σ_{i=1}^{r} 1_i C(P_i)  (2)

subject to

  Σ_{i=1}^{r} 1_i (P_i + ε) ≤ rE,  (3)

where 1_i = 1 if a symbol is transmitted in the i-th channel use, and 1_i = 0 otherwise. This is equivalent to dividing the channel into r sub-channels, with independent coding on each sub-channel. Since the capacity function C(P) is concave in its argument, to maximize the total number of information bits communicated, the transmit power P_i should be equal for all i where 1_i = 1. Without accounting for the energy consumed by the system processes, the optimal strategy would be to use all the r parallel channels and share the energy equally amongst them.
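To make this tension concrete, the following sketch (our own illustration, not taken from [7]) evaluates the transmit energy per bit P/C(P) for an AWGN channel with unit noise variance and capacity C(P) = ½ log₂(1 + P), which approaches the 2 ln(2) Joules of (1) as P → 0, and then adds a per-use processing cost ε in the spirit of (3), after which the minimizer of (P + ε)/C(P) moves to a strictly positive power. The value of ε below is illustrative only.

```python
import math

def capacity(P):
    """AWGN capacity in bits per channel use, unit noise variance."""
    return 0.5 * math.log2(1.0 + P)

def transmit_energy_per_bit(P):
    """Energy per bit with no processing cost: P / C(P)."""
    return P / capacity(P)

def total_energy_per_bit(P, eps):
    """Energy per bit when each used channel symbol also costs eps."""
    return (P + eps) / capacity(P)

# With no processing cost, energy per bit falls monotonically toward
# 2 ln(2) ~ 1.386 as P -> 0, so the optimal rate is zero.
for P in (1.0, 0.1, 1e-3, 1e-6):
    print(P, transmit_energy_per_bit(P))

# With a per-use cost eps > 0, a crude grid search finds an interior
# minimizer P* > 0, i.e. a per-channel-use rate bounded away from zero.
eps = 0.5  # illustrative processing cost, not a value from [7]
grid = [i * 0.01 for i in range(1, 2000)]
P_star = min(grid, key=lambda P: total_energy_per_bit(P, eps))
print(P_star, capacity(P_star))
```

The grid search is only for transparency; any one-dimensional minimizer would do.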
However, the energy consumed by the system processes imposes a fixed penalty on each channel use. The authors quantify this tension by measuring the 'burstiness' Θ of signaling, defined as the ratio of the total number of channel uses to the number actually used,

  Θ = r / Σ_{i=1}^{r} 1_i.

The transmissions should not be too bursty because of the law of diminishing returns associated with the log(·) function. On the other hand, the transmission strategy should not make use of all degrees of freedom either, since there is an ε cost associated with the use of each degree of freedom. The authors conclude that for minimum total energy, 1 < Θ < ∞. Contrary to conventional information-theoretic wisdom, it is no longer optimal to use all available degrees of freedom. Consequently, the optimal rate that minimizes the total energy consumption is bounded away from zero. That is, if processing energy is taken into account, green codes may not communicate at zero rate!

The objective in [7], [5], [9] is to reduce the energy consumption of wireless devices that consume energy continuously when operating, e.g. hand-held computers, high-end laptops, etc. Energy consumption per unit time for such devices is indeed well modeled by a constant, possibly independent of the coding strategy being used. In this paper, we are interested in the energy expended by the decoding process itself. The decoding circuit requires some non-zero energy to perform each operation. As opposed to the energy consumed by system processes in [7], [5], [9], the decoding energy depends significantly on the code construction, the rate, and the desired error probability, and therefore needs more careful modeling. In this work, we study explicit models of the energy expended at the decoder. Owing to their low implementation complexity, and hence low energy consumption, we concentrate on message-passing decoders. For this decoder, we derive lower bounds on the combined transmission and decoding energy, with no constraint on the rate.
We show that the optimizing rate for green codes based on message-passing decoding is indeed bounded away from zero. As the error probability decreases to zero, the optimizing rate increases. In a result that is qualitatively different from those in [7], we show that there is no advantage in increasing the rate beyond 1. Therefore, as the error probability converges to zero, the optimizing rate converges to 1!

The organization of the paper is as follows. In Section II, we introduce the channel model, the decoder model, and the energy model. In Section III, we summarize some of our results in [10]. In Section IV, we build on the results in [10] to find bounds on the minimum total energy required to communicate across a channel, with no rate constraint, taking the decoding energy into account as well. We conclude in Section V.

II. SYSTEM MODEL
Consider a point-to-point communication link. An information sequence B^k is encoded into a codeword X^m using a possibly randomized encoder. The observed channel output is Y^m. The information sequences are assumed to consist of iid fair coin tosses, and hence the rate of the code is R = k/m. The channel model considered is an average-power-constrained AWGN channel of noise variance σ_P². We also obtain some results for the BSC arising from performing hard decisions on BPSK symbols transmitted over an AWGN channel. The true channel is denoted by P. The channel capacity is denoted by C_σ(P_T), where σ² is the noise variance and P_T is the average power constraint. We drop σ from this notation when no ambiguity is created in doing so.

For maximum generality, we do not impose any a priori structure on the code itself. Instead, inspired by [11], [12], [13], we focus on the parallelism of the decoder and the energy consumed within it. We assume that the decoder is physically made of computational nodes that pass messages to each other in parallel along physical (and hence unchanging) wires. A subset of nodes are designated 'message nodes' in that each is responsible for decoding the value of a particular message bit. Another subset of nodes (not necessarily disjoint), called the 'observation nodes', has members that are each initialized with at most one observation of the received channel-output symbols. There may be additional computational nodes that merely help in decoding. The implementation technology is assumed to dictate that each computational node is connected to at most α + 1 > 2 other nodes with bidirectional wires. No other restriction is assumed on the topology of the decoder. In each iteration, each node sends (possibly different) messages to all its neighboring nodes. No restriction is placed on the size or content of these messages except for the fact that they must depend only on the information that has reached the computational node in previous iterations.
If a node wants to communicate with a more distant node, it has to have its message relayed through other nodes. The neighborhood size at the end of l iterations is denoted by n ≤ α^{l+1}. Each computational node is assumed to consume a fixed E_node Joules of energy in each iteration.

Let the average probability of bit error of a code be denoted by ⟨P_e⟩ when it is used over channel P. The main tool is a lower bound on the neighborhood size n as a function of ⟨P_e⟩ and R. This then translates into a lower bound on the number of iterations, which can in turn be used to lower bound the required decoding power.

Throughout this paper, we allow the encoding and decoding to be randomized, with all computational nodes allowed to share a pool of common randomness. We use the term 'average probability of error' to refer to the probability of bit error averaged over the channel realizations, the messages, the encoding, and the decoding.

III. LOWER BOUNDS ON THE DECODING COMPLEXITY AND TOTAL ENERGY
In this section we summarize our results for lower bounds on decoding complexity for an AWGN channel from [10]. The main bounds are given by theorems that capture a local sphere-packing effect. These can be turned around to give a family of lower bounds on the neighborhood size n as a function of ⟨P_e⟩ and R. Using a simple lower bound on the number of iterations, l ≥ log(n)/log(α) − 1, we get a lower bound on complexity. (In practice, the connectivity limit α could come from the number of metal layers on a chip; α = 1 would just correspond to a big ring of nodes and is therefore uninteresting. We approximate the iteration bound by l ≥ log(n)/log(α) for the rest of the paper.) The family of lower bounds is indexed by the choice of a hypothetical channel G, and the bounds can be optimized numerically for any desired set of parameters.

Theorem 3.1: For the AWGN channel and the decoder model in Section II, let n be the maximum size of the decoding neighborhood of any individual message bit. The following lower bound holds on the average probability of bit error:

  ⟨P_e⟩ ≥ sup_{σ_G² : C_{σ_G²}(P_T) < R} h_b^{-1}(δ(σ_G²)) exp( − [ n D(σ_G²‖σ_P²) + √( 2n ( 3/2 + 2 ln(1/h_b^{-1}(δ(σ_G²))) ) ) (σ_G²/σ_P² − 1) ] ),  (4)

where δ(σ_G²) = 1 − C_{σ_G²}(P_T)/R, the capacity C_{σ_G²}(P_T) = ½ log₂(1 + P_T/σ_G²), and the KL divergence D(σ_G²‖σ_P²) = ½ [σ_G²/σ_P² − 1 − ln(σ_G²/σ_P²)].

Proof: See [10]. There is a better bound in [10] as well; this bound is presented here for ease of exposition.

Observe that the required value of n increases as ⟨P_e⟩ decreases. Taking the log on both sides of (4), it is evident that for small ⟨P_e⟩, the term n D(σ_G²‖σ_P²) dominates the other terms on the RHS. For small ⟨P_e⟩, σ_G² can be taken close to the σ_G*² that satisfies C_{σ_G*²}(P_T) = R. Neglecting the other two terms, we get

  n ≳ log(1/⟨P_e⟩) / D(σ_G*²‖σ_P²).  (5)

IV. MINIMIZATION OF TOTAL ENERGY BY OPTIMIZING OVER THE RATE AND TRANSMIT POWER

Consider the total energy spent in transmission.
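Before proceeding, note that the neighborhood bound (5) is straightforward to evaluate; the minimal sketch below assumes the AWGN capacity C_{σ²}(P_T) = ½ log₂(1 + P_T/σ²) and interprets the logarithm in (5) as natural, consistent with the exponential in (4).

```python
import math

def awgn_capacity(P, sigma2):
    """C_{sigma^2}(P) = 0.5 * log2(1 + P / sigma2), bits per channel use."""
    return 0.5 * math.log2(1.0 + P / sigma2)

def kl_divergence(sg2, sp2):
    """D(sigma_G^2 || sigma_P^2) = 0.5 * (sg2/sp2 - 1 - ln(sg2/sp2))."""
    return 0.5 * (sg2 / sp2 - 1.0 - math.log(sg2 / sp2))

def neighborhood_bound(Pe, P_T, R, sp2):
    """Approximation (5): n >~ ln(1/Pe) / D(sigma_G*^2 || sigma_P^2),
    with sigma_G*^2 chosen so that C_{sigma_G*^2}(P_T) = R."""
    sg2_star = P_T / (2.0 ** (2.0 * R) - 1.0)
    return math.log(1.0 / Pe) / kl_divergence(sg2_star, sp2)

# Example: rate below the true channel's capacity, so sigma_G*^2 > sigma_P^2.
P_T, R, sp2 = 1.0, 0.4, 1.0
assert awgn_capacity(P_T, sp2) > R
for Pe in (1e-3, 1e-6, 1e-12):
    print(Pe, neighborhood_bound(Pe, P_T, R, sp2))
```

As the theorem suggests, the bound on the neighborhood size (and hence on the iteration count) grows as the target error probability shrinks.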
For transmitting k bits at rate R, the number of channel uses is m = k/R. If each transmission has power P_T and the transmit energy is weighted by ξ_T, the total weighted energy used in the transmissions is ξ_T P_T m.

At the decoder, let the number of iterations be l. Assume that each node consumes E_node Joules of energy in each iteration. The number of computational nodes can be lower bounded by m, the number of received channel outputs, and also by k, the number of bits to be decoded. (A lower bound of m + k would not allow for node sharing between the set of observation nodes and the message nodes.) We lower bound by the maximum of the two:

  E_dec ≥ E_node × max{k, m} × l.  (6)

There is no lower bound on the encoding complexity, and so the encoding is considered free. For m transmissions with average power P_T, we require m P_T Joules of energy. This results in the following bound on the weighted total energy:

  E_total ≥ ξ_T m P_T + ξ_D E_node max{k, m} × l.  (7)

The parameters ξ_T and ξ_D are weights assigned to the transmit and the decoding energy, respectively. ξ_T depends on the path loss across the channel. ξ_D indicates the relative importance of decoding energy; for example, if the energy use at the decoder is severely constrained, ξ_D would be large. Using l ≥ log(n)/log(α),

  E_total ≥ m ξ_T P_T + ξ_D E_node max{k, m} log(n)/log(α)
          ∝ m P_T/σ_P² + γ max{k, m} log(n),  (8)

where γ = ξ_D E_node / (ξ_T σ_P² log(α)) is a constant that summarizes all the technological and environmental terms. The expression in (8) gives the total energy normalized by the noise variance σ_P². Figure 1 provides example behavior of γ with distance. The neighborhood size n itself can be lower bounded by plugging the desired average probability of error into Theorem 3.1.

Fig. 1. The plot shows the behavior of γ (in dB) with distance d, for a polynomial path-loss weight ξ_T in d beyond a fraction of a mm (and a different path-loss model for smaller d). E_node ≈ 1 pJ, α = 4, ξ_D = 1, and σ_P² = 4 × 10⁻²¹ J. The energy per bit is normalized by σ_P².
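As a rough illustration of how γ varies with distance (cf. Figure 1), the sketch below evaluates γ = ξ_D E_node / (ξ_T σ_P² log(α)). The cubic path-loss exponent, reference distance, and constants are assumptions for illustration only, loosely echoing the values quoted for Figure 1 rather than the paper's exact model; the base of log(α) only rescales γ, and natural log is used here.

```python
import math

def gamma(d, eta=3.0, d0=1e-3, E_node=1e-12, sigma_P2=4e-21,
          xi_D=1.0, alpha=4):
    """gamma = xi_D * E_node / (xi_T * sigma_P^2 * ln(alpha)), with an
    assumed polynomial path-loss weight xi_T = (d / d0)**eta for d >= d0."""
    xi_T = (d / d0) ** eta
    return xi_D * E_node / (xi_T * sigma_P2 * math.log(alpha))

# gamma shrinks rapidly with distance: at long range the transmit energy
# dominates the tradeoff, while at short range the decoding energy does.
for d in (1e-3, 1e-2, 1e-1, 1.0):
    print(d, gamma(d))
```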
We thus obtain the following expression for the minimum normalized total energy per bit:

  E_per-bit = min_{P_T, R} [ P_T/(σ_P² R) + (γ/R) max{R, 1} log(n) ].  (9)

Observe that in (9), the decoding energy increases as the error probability decreases, for constant transmit power and rate. This behavior is not reflected by using a decoding-energy model inspired by [7]. The bounds in [7] are for error probability converging to zero. To compare our bounds with the black-box model of [7], in Appendix I we derive bounds for non-zero error probability based on the model in [7]. We plot the two bounds against each other in Figure 2 for k = 10,000 bits. We choose ε = 4, for which the total energy per bit of the black-box model equals the energy per bit of our bound with γ = 0.2 at a particular threshold value of ⟨P_e⟩. (The energy cost E_node ≈ 1 pJ of one iteration at one node is arrived at by an optimistic extrapolation from the reported values in [14], [15]; the thermal-noise energy per sample σ_P² ≈ 4 × 10⁻²¹ J comes from kT with T around room temperature.) The figure shows that for ⟨P_e⟩ smaller than this threshold, the model inspired by [7] underestimates the total energy. This is because that model treats the decoder as a black box whose ε does not change with error probability or rate.

It is interesting to observe what values of R optimize (9). Under the small-⟨P_e⟩ approximation in (5), we now heuristically argue that the optimal rate R_opt should converge to 1 as ⟨P_e⟩ → 0. Observe that for R < 1,

  E_per-bit = P_T/(σ_P² R) + (γ/R) log(n)
            = P_T/(σ_P² R) + (γ/R) [ log(log(1/⟨P_e⟩)) − log(D(σ_G*²‖σ_P²)) ].

As ⟨P_e⟩ → 0, n → ∞. Therefore, the decoding energy increases to infinity. Increasing the rate R at the cost of increasing P_T offsets the increasing decoding costs.
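Under the approximation (5), the minimization (9) itself can be explored numerically. The sketch below is a crude grid search, not the paper's procedure; it assumes unit noise variance, natural logarithms, and clamps n below at 2 so that log(n) stays positive.

```python
import math

def capacity(P):
    return 0.5 * math.log2(1.0 + P)           # unit noise variance

def kl(sg2):
    return 0.5 * (sg2 - 1.0 - math.log(sg2))  # D(sg2 || 1)

def energy_bound(P_T, R, Pe, gamma):
    """Per-bit bound (9), with n taken from approximation (5)."""
    if capacity(P_T) <= R:                    # hypothetical channel must exist
        return float("inf")
    sg2 = P_T / (2.0 ** (2.0 * R) - 1.0)      # solves C_{sg2}(P_T) = R
    n = max(math.log(1.0 / Pe) / kl(sg2), 2.0)
    return P_T / R + (gamma / R) * max(R, 1.0) * math.log(n)

def optimal_rate(Pe, gamma=0.2):
    rates = [i * 0.01 for i in range(5, 101)]
    powers = [i * 0.05 for i in range(1, 401)]
    return min(((energy_bound(P, R, Pe, gamma), R)
                for R in rates for P in powers))[1]

# The minimizing rate stays bounded away from zero for both targets.
print(optimal_rate(1e-3), optimal_rate(1e-10))
```

The grid resolution is coarse, so the output is only indicative of where the minimum sits, not a precise optimizer.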
However, for R ≥ 1,

  E_per-bit ≳ P_T/(σ_P² R) + γ log( log(1/⟨P_e⟩) / D(σ_G*²‖σ_P²) ),  (10)

which indicates that there is no advantage in increasing the rate beyond R = 1, since doing so no longer decreases the decoding energy. Evidently, for finite ⟨P_e⟩, there exists an optimal rate R_opt > 0 that minimizes the combined energy consumed. Using numerical evaluation of the bound (9), we plot the behavior of the optimal rate with ⟨P_e⟩ in Figure 5. The plots demonstrate that the optimal rate indeed converges to 1. Figure 3 shows the behavior of our lower bound on the sum energy with ⟨P_e⟩ for various values of γ. Figure 4 shows that similar behavior also holds for a BSC arising from performing hard decisions on BPSK symbols transmitted over an AWGN channel. The optimal rate for this channel also converges to 1 as ⟨P_e⟩ → 0. Due to lack of space, we omit the plots.

Fig. 2. The plot shows the comparison of lower bounds on the minimum normalized energy per bit (vs. log(⟨P_e⟩)) for k = 10,000 bits: the black-box bounds (ε = 4), our bounds (γ = 0.2), and the limiting value of optimal transmit power neglecting processing energy. The 'black-box bounds' plot is based on the model in [7], where the details of the processor are ignored. Our bounds take the decoder structure into account as well.

Fig. 3. The plot shows the behavior of the lower bound on the normalized sum energy (vs. log(⟨P_e⟩)) for γ = 0.1, 0.2, 0.3, 0.4, along with the limiting value of optimal transmit power neglecting processing energy. The sum energy goes to infinity as ⟨P_e⟩ → 0.

Fig. 4. The plot shows the behavior of the lower bound on the normalized sum energy (vs. log(⟨P_e⟩)) for γ = 0.1, 0.4, 1, together with uncoded transmission, for a BSC arising from performing hard decisions on BPSK symbols transmitted over an AWGN channel.
The optimizing rate converges to 1 as ⟨P_e⟩ → 0. Even so, this plot shows that the optimal strategy is not uncoded transmission at low ⟨P_e⟩, since coded transmission outperforms uncoded transmission at small ⟨P_e⟩.

V. DISCUSSIONS AND CONCLUSIONS

In this work, we derived lower bounds on the combined transmission and decoding energy for iterative decoding with unconstrained rates. It is important to note that these are lower bounds, and the actual energy consumption would only be higher. An interesting feature of our bounds is that the optimizing rate for green codes is bounded away from zero and, in fact, converges to 1 as the error probability converges to zero. This is qualitatively different from a pure black-box modeling of the decoding process, where the energy consumption is independent of the desired error probability and the rate. In that case, as observed in [7], the optimal rate is a constant that can be greater than 1.

Fig. 5. Optimal value of rate vs. error probability for γ = 0.1, 0.2, 0.3, 0.4: as ⟨P_e⟩ converges to 0, the optimizing rate converges extremely slowly to 1.

For an AWGN channel, the value of 1 for the optimal rate is a result of a bit-wise representation of the information at the decoder. If, however, the message nodes represent the information in base M, then the optimizing rate would converge to log₂(M).

For the BSC arising from performing hard decisions on BPSK symbols transmitted over an AWGN channel, the optimal rate still converges to 1. The rate is upper bounded by 1 because the channel has a binary input alphabet, and thus this case might seem somewhat uninteresting. However, uncoded transmission over a BSC also corresponds to rate 1, which might falsely suggest that uncoded transmission is asymptotically optimal for minimizing the total energy.
We observe that despite the optimal rate approaching 1, coded transmission attains the same error probability with much smaller total energy than uncoded transmission.

We note that the total energy per bit required to communicate at arbitrarily low error probability increases to infinity for the message-passing decoder. This is in contrast to the classical information-theoretic result for transmit power, which shows that the transmit power is bounded even as ⟨P_e⟩ → 0. Based on results in [10], the total energy per bit increases to infinity for most known codes and decoding algorithms. It would be interesting to extend this result to all possible codes and decoding algorithms. An approach based on laws of physics is suggested in [10] for the fixed-rate problem. That approach might yield results here as well.

APPENDIX I
BOUNDS IN [7] FOR NON-ZERO ERROR PROBABILITY

Observe that the results in [7] are for ⟨P_e⟩ → 0 and infinitely many information bits. Parallel to our analysis for message-passing decoding, in this appendix we build on the analysis in [7] to derive bounds on the minimum energy required for communicating with a non-zero error probability ⟨P_e⟩ and finitely many information bits. Assume k bits are to be transmitted across the channel, with desired error probability ⟨P_e⟩. In [7], the authors maximize the information bits communicated under a total energy constraint. Turning the problem in [7] around, we can instead minimize the total energy consumed given the number of bits transmitted. We can then add an error-probability constraint on the bits transmitted. Assume that a block code is used to communicate across the channel. The corresponding error exponent is bounded by the sphere-packing bound [16]. Assuming optimistically that the code actually achieves the sphere-packing bound in the exponent,

  ⟨P_e⟩ ≤ P_e,block ≈ e^{−m E_sp(P_T, R)},

where E_sp(P_T, R) is the sphere-packing bound at rate R and transmit power P_T.
The objective, therefore, is

  min_{P_T, m} m × (P_T + ε)
  subject to m × E_sp(P_T, k/m) = ln(1/⟨P_e⟩).  (11)

REFERENCES

[1] C. E. Shannon, "A mathematical theory of communication," Bell System Technical Journal, vol. 27, pp. 379–423, 623–656, Jul./Oct. 1948.
[2] J. Pierce, "Optical channels: Practical limits with photon counting," IEEE Transactions on Communications, pp. 1819–1821, Dec. 1978.
[3] S. Verdu, "On channel capacity per unit cost," IEEE Trans. Inform. Theory, vol. 36, pp. 1019–1030, 1990.
[4] P. Agrawal, "Energy efficient protocols for wireless systems," in IEEE International Symposium on Personal, Indoor, Mobile Radio Communication, 1998, pp. 564–569.
[5] S. Cui, A. J. Goldsmith, and A. Bahai, "Energy constrained modulation optimization," IEEE Trans. Wireless Commun., vol. 4, no. 5, pp. 1–11, 2005.
[6] A. J. Goldsmith and S. B. Wicker, "Design challenges for energy constrained ad hoc wireless networks," IEEE Trans. Wireless Commun., pp. 8–27, 2002.
[7] P. Massaad, M. Medard, and L. Zheng, "Impact of processing energy on the capacity of wireless channels," in International Symposium on Information Theory and its Applications (ISITA), 2004.
[8] S. Vasudevan, C. Zhang, D. Goeckel, and D. Towsley, "Optimal power allocation in wireless networks with transmitter-receiver power tradeoffs," in Proceedings of the 25th IEEE International Conference on Computer Communications (INFOCOM), pp. 1–11, Apr. 2006.
[9] R. Kravets and P. Krishnan, "Power management techniques for mobile communications," in The Fourth Annual ACM/IEEE International Conference on Mobile Computing and Networking, 1998.
[10] A. Sahai and P. Grover, "The price of certainty: 'waterslide curves' and the gap to capacity," submitted to IEEE Transactions on Information Theory, available online at http://arXiv.org/abs/0801.0352v1, Dec. 2007.
[11] P. P. Sotiriadis, V. Tarokh, and A. P. Chandrakasan, "Energy reduction in VLSI computation modules: an information-theoretic approach," IEEE Trans.
Inform. Theory, vol. 49, no. 4, pp. 790–808, Apr. 2003.
[12] N. Shanbhag, "A mathematical basis for power-reduction in digital VLSI systems," IEEE Trans. Circuits Syst. II, vol. 44, no. 11, pp. 935–951, Nov. 1997.
[13] T. Koch, A. Lapidoth, and P. P. Sotiriadis, "A channel that heats up," in Proceedings of the 2007 IEEE International Symposium on Information Theory, Nice, France, 2007.
[14] M. Bhardwaj and A. Chandrakasan, "Coding under observation constraints," in Proceedings of the Allerton Conference on Communication, Control, and Computing, Monticello, IL, Sep. 2007.
[15] S. L. Howard, C. Schlegel, and K. Iniewski, "Error control coding in low-power wireless sensor networks: when is ECC energy-efficient?" EURASIP Journal on Wireless Communications and Networking, pp. 1–14, 2006.
[16] C. E. Shannon, "Probability of error for optimal codes in a Gaussian channel," Bell System Technical Journal, vol. 38, pp. 611–656, May 1959.