Random Access: An Information-Theoretic Perspective

Abstract

This paper considers a random access system where each sender can be in two modes of operation, active or not active, and where the set of active users is available to a common receiver only. Active transmitters encode data into independent streams of information, a subset of which are decoded by the receiver, depending on the value of the collective interference. The main contribution is to present an information-theoretic formulation of the problem which allows us to characterize, with a guaranteed gap to optimality, the rates that can be achieved by different data streams. Our results are articulated as follows. First, we exactly characterize the capacity region of a two-user system assuming a binary-expansion deterministic channel model. Second, we extend this result to a two-user additive white Gaussian noise channel, providing an approximate characterization within $\sqrt{3}/2$ bit of the actual capacity. Third, we focus on the symmetric scenario in which users are active with the same probability and subject to the same received power constraint, and study the maximum achievable expected sum-rate, or throughput, for any number of users. In this case, for the symmetric binary-expansion deterministic channel (which is related to the packet collision model used in the networking literature), we show that a simple coding scheme which does not employ superposition coding achieves the system throughput. This result also shows that the performance of slotted ALOHA systems can be improved by allowing encoding rate adaptation at the transmitters, achieving constant (rather than zero) throughput as the number of users tends to infinity. For the symmetric additive white Gaussian noise channel, we propose a scheme that is within one bit of the system throughput for any value of the underlying parameters.


Paolo Minero, Massimo Franceschetti, and David N. C. Tse

P. Minero and M. Franceschetti are with the Advanced Network Science group (ANS) of the California Institute of Telecommunications and Information Technologies (CALIT2), Department of Electrical and Computer Engineering, University of California, San Diego CA, 92093, USA. Email: [email protected], [email protected]

D. N. C. Tse is with the Wireless Foundations at the Department of Electrical Engineering and Computer Science, University of California, Berkeley, CA 94720, USA. Email: [email protected]

This work was partially supported by the National Science Foundation CAREER award CNS-0546235, and CCF-0635048, and by the Office of Naval Research YIP award N00014-07-1-0870.

October 28, 2018 DRAFT

I. INTRODUCTION

Random access is one of the most commonly used medium access control schemes for channel sharing by a number of transmitters. Despite decades of active research in the field, the theory of random access communication is far from complete. What has been notably pointed out by Gallager in his review paper more than two decades ago [12] is still largely true: on the one hand, information theory provides accurate models for the noise and for the interference caused by simultaneous transmissions, but it ignores random message arrivals at the transmitters; on the other hand, network oriented studies focus on the bursty nature of messages, but do not accurately describe the underlying physical channel model. As an example of the first approach, the classic results by Ahlswede [3] and Liao [15] provide a complete characterization of the set of rates that can be simultaneously achieved communicating over a discrete memoryless (DM) multiple access channel (MAC). But the coding scheme they develop assumes a fixed number of transmitters with continuous presence of data to send. As an example of the second approach, Abramson's classic collision model for the ALOHA network [2] assumes that packets are transmitted at random times and are encoded at a fixed rate, such that a packet collision occurs whenever two or more transmitters are simultaneously active. The gap between these two lines of research is notorious and well documented by Ephremides and Hajek in their survey article [10].

In this paper, we try to bridge the divide between the two approaches described above. We present the analysis of a model which is information-theoretic in nature, but which also accounts for the random activity of users, as in models arising in the networking literature. We consider a crucial aspect of random access, namely that the number of simultaneously transmitting users is unknown to the transmitters themselves.
This uncertainty can lead to packet collisions, which occur whenever the underlying physical channel cannot support the transmission rates of all active users simultaneously. However, our viewpoint is that the random level of the interference created by the random set of transmitters can also be exploited opportunistically by allowing transmission of different data streams, each of which might be decoded or not, depending on the interference level at the receiver.

To be fair, the idea of transmitting information in layers in random access communication is not new; however, an information-theoretic treatment of this layering idea has not previously been given. Previously, Medard et al. [17] studied the performance of Gaussian superposition coding in a two-user additive white Gaussian noise (AWGN) system, but did not investigate the information-theoretic optimality of such a scheme. In the present work, we present coding schemes with guaranteed gaps to the information-theoretic capacity. We do so under different channel models, and we also extend the treatment to networks with a large number of users. Interestingly, it turns out that in the symmetric case in which all users are subject to the same received power constraint and are active with the same probability, superposition coding is not needed to achieve up to one bit from the throughput of an AWGN system.

The paper is organized in incremental steps, the first ones laying the foundation for the more complex scenarios. Initially, we consider a two-user random access system, in which each sender can be in two modes of operation, active or not active. The set of active users is available to the decoder only, and active users encode data into two streams: one high priority stream ensures that part of the transmitted information is always received reliably, while one low priority stream opportunistically takes advantage of the channel when the other user is not transmitting. Given this set-up, we consider two different channel models. First, we consider a binary-expansion deterministic (BD) channel model in which the input symbols are bits and the output is the binary sum of a shifted version of the codewords sent by the transmitters. This is a first-order approximation of an AWGN channel in which the shift of each input sequence corresponds to the amount of path loss experienced by the communication link. In this case, we exactly characterize the capacity region and it turns out that senders need to simultaneously transmit both streams to achieve capacity. Second, we consider the AWGN channel and present a coding scheme that combines time-sharing and Gaussian superposition coding. This turns out to be within $\sqrt{3}/2$ bit from capacity.
Furthermore, we also show that in the symmetric case in which both users are subject to the same received power constraint, superposition coding is not needed to achieve up to $\sqrt{3}/2$ bit from capacity.

Next, we consider an $m$-user random access system, in which active transmitters encode data into independent streams of information, a subset of which are decoded by a common receiver, depending on the value of the collective interference. We cast this communication problem into an equivalent information-theoretic network with multiple transmitters and receivers. We focus on the symmetric scenario in which users are active with the same probability $p$, independently of each other, and are subject to the same received power constraint, and we study the maximum achievable expected sum-rate, or throughput. Given this set-up, we again consider the two channel models described above. First, we consider the BD channel model in the symmetric case in which all codewords are shifted by the same amount. In this setting, input and output symbols are bits, so that the receiver observes the binary sum of the codewords sent by the active transmitters. The possibility of decoding different messages in the event of multiple simultaneous transmissions depends on the rate at which the transmitted messages were encoded. Colliding codewords are correctly decoded when the sum of the rates at which they were encoded does not exceed one. This is a natural generalization of the classic packet collision model widely used in the networking literature, where packets are always encoded at rate one, so that transmissions are successful only when there is one active user. We present a simple coding scheme which does not employ superposition coding and which achieves the throughput. The coding scheme can be described as follows. When $p$ is close to zero, active transmitters ignore the presence of potential interferers and transmit a stream of data encoded at rate equal to one.
By doing so, decoding at the receiver is successful if there is only one active user, and it fails otherwise. This is what happens in the classic slotted ALOHA protocol, for which a collision occurs whenever two or more users are simultaneously active in a given slot. In contrast, when $p$ is close to one, the communication channel is well approximated by the standard $m$-user binary sum DM-MAC, for which the number of transmitters is fixed and equal to $m$. In this regime, active users transmit a stream of data encoded at rate equal to $1/m$, that is, each active user requests an equal fraction of the $m$-user binary sum DM-MAC sum-rate capacity. Any further increase in the per-user encoding rate would result in a collision. When $p$ is not close to either of the two extreme values, based on the total number of users $m$ and the access probability $p$, transmitters estimate the number of active users by solving a set of polynomial equations. If $k$ is the inferred number, then transmitters send one stream of data encoded at rate $1/k$, that is, each user requests an equal fraction of the $k$-user binary sum DM-MAC sum-rate capacity. Interestingly, it turns out that the estimator needed to achieve the throughput is different from the maximum-likelihood estimator $\lfloor mp \rfloor$ for the number of active users. The analysis also shows that the performance of slotted ALOHA systems can be improved by allowing encoding rate adaptation at the transmitters. In fact, we show that the expected sum-rate of our proposed scheme tends to one as $m$ tends to infinity. Hence, there is no loss due to packet collisions in the so-called scaling limit of large networks. This is in striking contrast with the well-known behavior of slotted ALOHA systems in which users cannot adjust the encoding rate, for which the expected sum-rate tends to zero as $m$ tends to infinity. In practice, however, medium access schemes such as 802.11x typically use backoff mechanisms to effectively adapt the rates of the different users to the channel state.
It is interesting to note that while these rate control strategies used in practice are similar to the information-theoretic optimum scheme described above for the case of equal received powers, practical receivers typically implement suboptimal decoding strategies, such as decoding one user while treating interference as noise.

Next, we consider the case of the $m$-user AWGN channel. For this channel, we present a simple coding scheme which does not employ superposition coding and which achieves the throughput to within one bit, for any value of the underlying parameters. Perhaps not surprisingly, this coding scheme is very similar to the one described above for the case of the BD channel. In fact, the close connection between these two channel models has recently been exploited to solve capacity problems for AWGN networks through their deterministic model counterpart [5].

Finally, we wish to mention some additional related works. Extensions of ALOHA resorting to probabilistic models to explain when multiple packets can be decoded in the presence of other simultaneous transmissions appear in [13] and [19]. An information-theoretic model to study layered coding in a two-user AWGN-MAC with no channel state information (CSI) available to the transmitters was presented in a preliminary incarnation of this work [18]. The two-user BD channel has been studied in the adaptive capacity framework in [14] and in this paper we also provide a direct comparison with that model. We also rely on the broadcast approach which has been pursued in [20] and [22] to study multiple access channels with no CSI available. A survey of the broadcast approach and its application to the analysis of multiple antenna systems appeared in [21], and we refer the reader to this work and to [6] for an overview of the method and for additional references.
The DM-MAC with partial CSI was studied in [8] assuming two compressed descriptions of the state are available to the encoders.

The rest of the paper is organized as follows. The next section formally defines the problem in the case of a two-user AWGN random access system. Section V presents the extension to the $m$-user random access system assuming an additive channel model. Section VI considers the case of a BD channel model, while Section VII deals with the AWGN channel. A discussion about practical considerations and limitations of our model concludes the paper.

II. THE TWO-USER ADDITIVE RANDOM ACCESS CHANNEL

Consider a two-user synchronous additive DM-MAC where each sender can be in two modes of operation, active or not active, independently of each other. The set of active users is available to the decoder, while encoders only know their own mode of operation. This problem is the compound DM-MAC with distributed state information depicted in Fig. 1. Specifically, the state of the channel is determined by two statistically independent binary random variables $S_1$ and $S_2$, which indicate whether user one and user two, respectively, are active, and it remains unchanged during the course of a transmission. Each sender knows its own state, while the receiver knows all the senders' states. The presence of side information allows each transmitter to adapt its coding scheme to its state component. We can assume without loss of generality that senders transmit a codeword only when active, otherwise they remain silent.

Fig. 1. The two-user MAC with partial CSI modeling random access communications. The variables in bold represent vectors of length $n$.

Each sender transmits several streams of data, which are modeled via independent information messages, a subset of which is decoded by the common receiver, depending on the state of the channel. The notation we use is as follows. We denote by $\mathcal{W}_1 = \{W_{1,1}, \ldots, W_{1,|\mathcal{W}_1|}\}$ and $\mathcal{W}_2 = \{W_{2,1}, \ldots, W_{2,|\mathcal{W}_2|}\}$ the ensemble of independent messages transmitted by user 1 and user 2, respectively. We assume that each message $W_{i,j}$ is a random variable independent of everything else and uniformly distributed over a set with cardinality $2^{nR_{i,j}}$, for some non-negative rate $R_{i,j}$, $j \in \{1, \ldots, |\mathcal{W}_i|\}$, $i \in \{1, 2\}$. We let $\mathcal{W}_i(A) \subseteq \mathcal{W}_i$ denote the set of messages transmitted by user $i$, $i \in \{1, 2\}$, that are decoded when the set of senders $A \subseteq \{1, 2\}$ is active. Finally, $r_i(A)$ denotes the sum of the rates at which messages in $\mathcal{W}_i(A)$ are encoded.

Therefore, we can distinguish three non-trivial cases: if user 1 is the only active user, then the receiver decodes the messages in $\mathcal{W}_1(\{1\})$ and the transmission rate is equal to $r_1(\{1\})$; similarly, if user 2 is the only active user, then the receiver decodes the messages in $\mathcal{W}_2(\{2\})$, which are encoded at total rate of $r_2(\{2\})$; finally, the receiver decodes messages in $\mathcal{W}_1(\{1,2\})$ and $\mathcal{W}_2(\{1,2\})$ when both users are active, so senders communicate at rate $r_1(\{1,2\})$ and $r_2(\{1,2\})$, respectively. The resulting information-theoretic network is illustrated in Fig. 2, where one auxiliary receiver is introduced for each channel state component. It is clear from the figure that, upon transmission, each transmitter is connected to the receiver either through a point-to-point link or through an additive DM-MAC, depending on the channel state.

Observe that if the additive noises in Fig.
2 have the same marginal distribution, then the channel output sequence observed by the MAC receiver is a degraded version of the sequence observed by each of the two point-to-point receivers, because of the mutual interference between the transmitted codewords. As in a degraded broadcast channel, where the "better" receiver can always decode the message intended for the "worse" receiver, here each point-to-point receiver can decode what can be decoded at the MAC receiver. Thus, there is no loss of generality in assuming that

$$\mathcal{W}_1(\{1,2\}) \subseteq \mathcal{W}_1(\{1\}) \tag{1}$$

and that

$$\mathcal{W}_2(\{1,2\}) \subseteq \mathcal{W}_2(\{2\}). \tag{2}$$

Then, messages in $\mathcal{W}_1(\{1,2\})$ and $\mathcal{W}_2(\{1,2\})$ ensure that some transmitted information is always received reliably, while the remaining messages provide additional information that can be opportunistically decoded when there is no interference. If conditions (1) and (2) are satisfied, then we say that $\mathsf{W} = (\{\mathcal{W}_1, \mathcal{W}_2\}, \{\mathcal{W}_1(\{1\}), \mathcal{W}_1(\{1,2\}), \mathcal{W}_2(\{2\}), \mathcal{W}_2(\{1,2\})\})$ denotes a message structure for the channel in Fig. 2.

Fig. 2. The network used to model a two-user random access system. The subscript index in $Y$ and in $Z$ denotes the set of active users, $\mathcal{W}_1(\{1,2\})$ and $\mathcal{W}_2(\{1,2\})$ encode information which is always decoded, while $\mathcal{W}_1(\{1\}) \setminus \mathcal{W}_1(\{1,2\})$ and $\mathcal{W}_2(\{2\}) \setminus \mathcal{W}_2(\{1,2\})$ denote messages which are decoded when there is no interference.

For a given message structure $\mathsf{W}$, we say that the rate tuple $(r_1(\{1\}), r_2(\{2\}), r_1(\{1,2\}), r_2(\{1,2\}))$ is achievable if there exists a sequence of coding and decoding functions such that each receiver in Fig. 2 can decode all intended messages with arbitrarily small error probability as the coding block size tends to infinity. We define the capacity region $\mathcal{C}_{\mathsf{W}}$ as the closure of the set of achievable rate tuples.

Observe that as we vary $|\mathcal{W}_1|$, $|\mathcal{W}_2|$, and the sets of decoded messages, there are infinitely many possible message structures for a given channel. For each one of them we define $\mathcal{C}_{\mathsf{W}}$.

Next, we define the capacity of the channel in Fig. 2, denoted by $\mathcal{C}$, as the closure of the union of $\mathcal{C}_{\mathsf{W}}$ over all possible message structures $\mathsf{W}$.
Note that $\mathcal{C}$ represents the optimal tradeoff among the rates $(r_1(\{1\}), r_2(\{2\}), r_1(\{1,2\}), r_2(\{1,2\}))$ over all possible ways of partitioning information into different information messages such that conditions (1) and (2) are satisfied.

In the next section we answer the question of characterizing $\mathcal{C}$ for two additive channels of practical interest. First, we consider the BD channel model, for which we completely characterize the capacity region $\mathcal{C}$. Perhaps not surprisingly, we show that to achieve $\mathcal{C}$ it suffices that each sender transmits two independent information messages, one of which carries some reliable information which is always decoded, while the remaining one carries additional information which is decoded when the other user is not transmitting. Second, we consider the AWGN channel, for which we provide a constant gap characterization of $\mathcal{C}$, where the constant is universal and independent of the channel parameters. Finally, we apply this result to the study of the throughput of a two-user random access system under symmetry assumptions.

III. EXAMPLE: THE TWO-USER BD RANDOM ACCESS CHANNEL

Suppose that channel input and output alphabets are each the set $\{0,1\}^{n_1}$, for some integer number $n_1$, and that at each time unit $t \in \{1, \ldots, n\}$ inputs and outputs are related as follows:

$$\begin{aligned}
Y_{1,t} &= X_{1,t}, \\
Y_{12,t} &= X_{1,t} + S^{n_1 - n_2} X_{2,t}, \\
Y_{2,t} &= S^{n_1 - n_2} X_{2,t},
\end{aligned} \tag{3}$$

where $n_2 \le n_1$ denotes an integer number, summation and product are over GF(2), and $S^{n_1 - n_2}$ denotes the $n_1 \times n_1$ shift matrix having the $(i,j)$th component equal to 1 if $i = j + (n_1 - n_2)$, and 0 otherwise. By pre-multiplying $X_{2,t}$ by $S^{n_1 - n_2}$, the first $n_2$ components of $X_{2,t}$ are down-shifted by $(n_1 - n_2)$ positions and the remaining elements are set equal to zero. We refer to this model as the two-user BD random access channel (RAC); see Fig. 3 for a pictorial representation. Physically, this channel represents a first-order approximation of a wireless channel in which continuous signals are represented by their binary expansion, the codeword length $n_1$ represents the noise cut-off value, and the amount of shift $n_1 - n_2$ corresponds to the path loss of user 2 relative to user 1 [5]. The following theorem characterizes the capacity region of this channel.

Theorem 1: The capacity region $\mathcal{C}$ of the two-user BD-RAC is the set of non-negative rate tuples such that

$$\begin{aligned}
r_1(\{1\}) &\le n_1, \\
r_2(\{2\}) &\le n_2, \\
r_1(\{1\}) + r_2(\{1,2\}) &\le n_1, \\
r_2(\{2\}) + r_1(\{1,2\}) &\le n_1, \\
r_1(\{1,2\}) &\le r_1(\{1\}), \\
r_2(\{1,2\}) &\le r_2(\{2\}).
\end{aligned} \tag{4}$$

Fig. 3. The two-user BD-RAC, and the message structure used to prove the achievability of the capacity region.

The proof of the converse part of the above theorem can be sketched as follows. Observe that the common receiver observing $\mathbf{Y}_{12} \triangleq \{Y_{12,1}, \ldots, Y_{12,n}\}$ can decode messages in $\mathcal{W}_2(\{1,2\})$. Let us suppose that this receiver is given messages in $\mathcal{W}_2 \setminus \mathcal{W}_2(\{1,2\})$ as side information. Then, it has full knowledge of $\mathcal{W}_2$, so it can compute the codeword $\mathbf{X}_2$ transmitted by user 2 and subtract it from the aggregate received signal $\mathbf{Y}_{12}$, obtaining $\mathbf{X}_1$. Thus, given the side information, the channel output observed by the common receiver becomes statistically equivalent to $\mathbf{Y}_1$. Since receiver 1 can decode $\mathcal{W}_1(\{1\})$ upon observing $\mathbf{Y}_1$, we conclude that the common receiver, given the side information, must be able to decode $\mathcal{W}_1(\{1\})$ in addition to $\mathcal{W}_2(\{1,2\})$. Hence, $r_1(\{1\}) + r_2(\{1,2\}) \le n_1$. By providing side information about the messages in $\mathcal{W}_1 \setminus \mathcal{W}_1(\{1,2\})$ and following the same argument as above, we obtain that $r_2(\{2\}) + r_1(\{1,2\}) \le n_1$. The remaining bounds are trivial.

The proof of the achievability part of the theorem shows that it suffices to partition information into two independent messages, such that $\mathcal{W}_1 = \{W_{1,1}, W_{1,2}\}$ and $\mathcal{W}_2 = \{W_{2,1}, W_{2,2}\}$. Messages $W_{1,2}$ and $W_{2,2}$ ensure that part of the transmitted information is always received reliably, while $W_{1,1}$ and $W_{2,1}$ are decoded opportunistically when one user is not transmitting. The corresponding message structure is illustrated in Fig. 3. In general, the coding scheme which we employ in the proof of the achievability requires that user 1 simultaneously transmits $W_{1,1}$ and $W_{1,2}$.
However, in the special symmetric case in which $n_1 = n_2$, all rate tuples in the capacity region can be achieved by means of coding strategies in which each user transmits only one of the two messages.

Proof: First, we prove the converse part of the theorem. The first two inequalities which define $\mathcal{C}$ are standard point-to-point bounds which can be derived via standard techniques. To obtain the third inequality, observe that by Fano's inequality we have that $H(\mathcal{W}_1(\{1,2\}) \mid \mathbf{Y}_{12}) \le n\epsilon_n$, $H(\mathcal{W}_2(\{1,2\}) \mid \mathbf{Y}_{12}) \le n\epsilon_n$, as well as $H(\mathcal{W}_i(\{i\}) \mid \mathbf{Y}_i) \le n\epsilon_n$, where $\epsilon_n$ tends to zero as the block length $n$ tends to infinity. From the independence of the source messages, we have that

$$\begin{aligned}
n\,(r_1(\{1\}) + r_2(\{1,2\})) &= H(\mathcal{W}_1(\{1\}), \mathcal{W}_2(\{1,2\})) \\
&= H(\mathcal{W}_1(\{1\}), \mathcal{W}_2(\{1,2\}) \mid \mathcal{W}_2 \setminus \mathcal{W}_2(\{1,2\})) \\
&= I(\mathcal{W}_1(\{1\}), \mathcal{W}_2(\{1,2\}); \mathbf{Y}_{12} \mid \mathcal{W}_2 \setminus \mathcal{W}_2(\{1,2\})) \\
&\quad + H(\mathcal{W}_1(\{1\}), \mathcal{W}_2(\{1,2\}) \mid \mathbf{Y}_{12}, \mathcal{W}_2 \setminus \mathcal{W}_2(\{1,2\})).
\end{aligned} \tag{5}$$

Using the memoryless property of the channel and the fact that conditioning reduces the entropy, the first term on the right-hand side of (5) can be upper bounded as

$$I(\mathcal{W}_1(\{1\}), \mathcal{W}_2(\{1,2\}); \mathbf{Y}_{12} \mid \mathcal{W}_2 \setminus \mathcal{W}_2(\{1,2\})) \le n\, n_1. \tag{6}$$

On the other hand, from the chain rule, the fact that conditioning reduces the entropy, and Fano's inequality, we have that

$$\begin{aligned}
&H(\mathcal{W}_1(\{1\}), \mathcal{W}_2(\{1,2\}) \mid \mathbf{Y}_{12}, \mathcal{W}_2 \setminus \mathcal{W}_2(\{1,2\})) \\
&\quad = H(\mathcal{W}_2(\{1,2\}) \mid \mathbf{Y}_{12}, \mathcal{W}_2 \setminus \mathcal{W}_2(\{1,2\})) + H(\mathcal{W}_1(\{1\}) \mid \mathbf{Y}_{12}, \mathcal{W}_2) \\
&\quad \le H(\mathcal{W}_2(\{1,2\}) \mid \mathbf{Y}_{12}) + H(\mathcal{W}_1(\{1\}) \mid \mathbf{Y}_{12}, \mathcal{W}_2, \mathbf{X}_2) \\
&\quad = H(\mathcal{W}_2(\{1,2\}) \mid \mathbf{Y}_{12}) + H(\mathcal{W}_1(\{1\}) \mid \mathbf{Y}_1) \\
&\quad \le 2 n \epsilon_n,
\end{aligned} \tag{7}$$

where the last equality is obtained observing from (3) that, if $\mathbf{X}_2$ is given, then $\mathbf{Y}_{12}$ is statistically equivalent to $\mathbf{Y}_1$. Substituting (6) and (7) into (5), we have that

$$n\,(r_1(\{1\}) + r_2(\{1,2\})) \le n\, n_1 + 2 n \epsilon_n,$$

and the desired inequality is obtained by dividing by $n$ and taking the limit of $n$ going to infinity. The fourth inequality in (4) is obtained by a similar argument. Finally, the last two inequalities in (4) follow from (1) and (2).

Next, to prove the direct part of the theorem, we establish that $\mathcal{C}$ is equal to the capacity of the two-user BD-RAC for the specific message structure defined by $\mathcal{W}_i = \{W_{i,1}, W_{i,2}\}$, $\mathcal{W}_i(\{i\}) = \mathcal{W}_i$, and $\mathcal{W}_i(\{1,2\}) = \{W_{i,2}\}$, $i \in \{1, 2\}$.
For this message structure we have that

$$\begin{aligned}
r_1(\{1\}) &= R_{1,1} + R_{1,2}, \\
r_2(\{2\}) &= R_{2,1} + R_{2,2}, \\
r_1(\{1,2\}) &= R_{1,2}, \\
r_2(\{1,2\}) &= R_{2,2}.
\end{aligned} \tag{8}$$

We have established above that if $(r_1(\{1\}), r_2(\{2\}), r_1(\{1,2\}), r_2(\{1,2\})) \in \mathcal{C}_{\mathsf{W}} \subseteq \mathcal{C}$, then inequalities (4) have to be satisfied. Combining the non-negativity of the rates, (4), and (8), and eliminating $(r_1(\{1\}), r_2(\{2\}), r_1(\{1,2\}), r_2(\{1,2\}))$ from the resulting system of inequalities, we obtain

$$\begin{aligned}
R_{1,1} + R_{1,2} &\le n_1, \\
R_{2,1} + R_{2,2} &\le n_2, \\
R_{1,1} + R_{1,2} + R_{2,2} &\le n_1, \\
R_{2,1} + R_{2,2} + R_{1,2} &\le n_1.
\end{aligned} \tag{9}$$

The above system of inequalities is the image of (4) under the linear map (8). Since the map is invertible, proving the achievability of all rate tuples $(r_1(\{1\}), r_2(\{2\}), r_1(\{1,2\}), r_2(\{1,2\}))$ satisfying (4) is equivalent to proving the achievability of all rate tuples $(R_{1,1}, R_{2,1}, R_{1,2}, R_{2,2})$ satisfying (9). It is tedious but simple to verify that the set of non-negative rate tuples satisfying (9) is equal to the convex hull of ten extreme points, four of which are dominated by one of the remaining six. Given two vectors $u$ and $v$, we say that $u$ dominates $v$ if each coordinate of $u$ is greater than or equal to the corresponding coordinate of $v$. The six dominant extreme points of (9) are given by

$$\begin{aligned}
v_1 &= [n_2,\; n_2,\; n_1 - n_2,\; 0]^T, & v_2 &= [n_1 - n_2,\; 0,\; 0,\; n_2]^T, & v_3 &= [0,\; 0,\; n_1 - n_2,\; n_2]^T, \\
v_4 &= [n_1,\; n_2,\; 0,\; 0]^T, & v_5 &= [0,\; 0,\; n_1,\; 0]^T, & v_6 &= [0,\; 0,\; 0,\; n_2]^T,
\end{aligned}$$

where the four coordinates denote $(R_{1,1}, R_{2,1}, R_{1,2}, R_{2,2})$, respectively.

The achievability of $v_1, \ldots, v_6$ can be sketched as follows. To achieve $v_1$, sender 1 transmits simultaneously $W_{1,2}$ and $W_{1,1}$, in the first $n_1 - n_2$ and last $n_2$ components of $X_1$, respectively. User 2, instead, transmits $W_{2,1}$ in the first $n_2$ components of $X_2$. Because of the downshift in $X_2$, the multiple access decoder receives the binary sum of $W_{1,1}$ and $W_{2,1}$ in the last $n_2$ components of $Y_{12}$, and can successfully decode $W_{1,2}$ from the first $n_1 - n_2$ interference-free components.
Effectively, coding is performed so that $W_{1,1}$ and $W_{2,1}$ are received "aligned" at the common receiver; see Fig. 4 for a pictorial representation. Observe that in the special case in which $n_1 = n_2$, sender 1 only transmits message $W_{1,1}$. Likewise, $v_2, \ldots, v_6$ can be achieved by transmitting one message per user, in such a way that the transmitted codewords do not interfere with each other at the multiple access receiver. For example, to achieve $v_3$ user 2 transmits $W_{2,2}$ in the first $n_2$ components of $X_2$, while user 1 transmits $W_{1,2}$ in the first $n_1 - n_2$ components of $X_1$.

Fig. 4. Pictorial representation of the coding scheme which achieves the rate tuple $R_{1,1} = n_2$, $R_{1,2} = n_1 - n_2$, $R_{2,1} = n_2$, $R_{2,2} = 0$. Message $W_{1,2}$ is transmitted via the first $n_1 - n_2$ interference-free links (dotted lines), while $W_{1,1}$ and $W_{2,1}$ are sent through the remaining $n_2$ links (solid lines), so that the interference they generate is "aligned" at the common receiver.

Next, observe that if an extreme point $v$ is achievable, then all extreme points dominated by $v$ are also achievable by simply decreasing the rate of some of the messages. Finally, any point satisfying (9) can be written as a convex combination of the extreme points, hence it can be achieved by time-sharing among the basic coding strategies which achieve $v_1, \ldots, v_6$. This shows that all rate tuples satisfying (9) are achievable.

A. The throughput in a symmetric scenario.

Having an exact characterization of the capacity region at hands, it is now possible to formulate andsolve optimization problems of practical interests. As an example, we consider the problem of maximizingthe throughput in the symmetric scenario where n = n = 1 , and where each user is independentlyactive with probability p .This model represents a ﬁrst-order approximation of a wireless channel in which data arrivals followthe same law, and where transmitted signals are received at the same power level. The codeword length is October 28, 2018 DRAFT3 normalized to 1 so that the maximum amount of information which can be conveyed across the channelis one bit per channel use, regardless the number of active users. The possibility of decoding differentmessages in the event of multiple simultaneous transmissions depends on the rate at which the messageswere encoded. Colliding codewords are correctly decoded when the sum of the rates at which they wereencoded does not exceed one. This is a natural generalization of the classic packet collision model widelyused in the networking literature, where packets are always encoded at rate one, so that transmissionsare successful only when there is one active user. The parameter p represents the burstiness of dataarrivals, and determines the law of the variables S and S in Fig. 1, hence the channel law. Based onthe knowledge of p , each sender can “guess” the state of operation of the other user, and optimize thechoice of the encoding rates so that the expected sum-rate, or throughput, is maximized.Formally, we look for the solution of the following optimization problem: max p (1 − p ) [ r ( { } ) + r ( { } )] + p [ r ( { , } ) + r ( { , } )] subject to the constraint that the rates should be in C . Observe that the weight assigned to each ratecomponent r i ( A ) is uniquely determined by p , and is equal to the probability that users in the set A areactive. 
By means of Theorem 1, it is easy to show that the solution to the above problem is equal to
\[
\begin{cases}
2p(1-p), & \text{if } p \in (0, 1/2];\\
p, & \text{if } p \in (1/2, 1].
\end{cases}
\]
The coding strategy used to achieve the throughput can be described as follows. If the transmission probability $p$ lies in the interval $(0, 1/2]$, then user $i$ transmits message $W_{i,1}$ encoded at rate 1. A collision occurs in the event that both senders are simultaneously transmitting, which occurs with probability $p^2$, in which case the common receiver cannot decode the transmitted codewords. Decoding is successful if only one of the two users is active, so the expected sum-rate achieved by this scheme is equal to $2p(1-p)$. If, instead, the transmission probability $p$ lies in the interval $(1/2, 1]$, then user $i$ transmits message $W_{i,2}$ encoded at rate $1/2$, i.e., at half the sum-rate capacity of the two-user binary additive MAC. By doing so, the transmitted codewords are never affected by collisions, and can be decoded in any channel state. This yields an expected sum-rate of $2p(1-p)\cdot\tfrac{1}{2} + p^2 = p$. It should be highlighted that in this symmetric scenario each user transmits only one of the two messages for any value of $p$.

We show later in the paper that this optimization problem can be solved in the general case of a network with more than two users.
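The two-regime solution above can be checked numerically. The following is a minimal Python sketch, not part of the original paper; the function name `bd_two_user_throughput` is hypothetical. It simply compares the expected sum-rates of the two single-message strategies described in the text (rate 1 versus rate 1/2) and returns the better one.

```python
def bd_two_user_throughput(p: float) -> tuple[float, float]:
    """Return (throughput, per-user encoding rate) for activity probability p.

    Illustrative helper (hypothetical name): it evaluates the two
    single-message strategies described in the text for the symmetric
    two-user BD-RAC and keeps the better one.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    # Rate 1: a codeword is decoded only when its sender is the sole active user.
    risky = 2.0 * p * (1.0 - p) * 1.0
    # Rate 1/2: both codewords are decoded in every channel state,
    # so the sum-rate is 1/2 with one active user and 1 with two.
    safe = 2.0 * p * (1.0 - p) * 0.5 + p * p * (0.5 + 0.5)
    # Ties go to rate 1, matching the closed interval (0, 1/2].
    return (risky, 1.0) if risky >= safe else (safe, 0.5)


if __name__ == "__main__":
    for p in (0.1, 0.25, 0.5, 0.75, 1.0):
        print(p, bd_two_user_throughput(p))
```

The two strategies cross at $p = 1/2$, where $2p(1-p) = p$, which is the breakpoint in the expression above.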

IV. EXAMPLE: THE TWO-USER AWGN-RAC

We now turn to another example of additive channels. Assume that at each discrete time step inputs and outputs are related as follows:
\[
\begin{aligned}
Y_{1,t} &= X_{1,t} + Z_{1,t},\\
Y_{12,t} &= X_{1,t} + X_{2,t} + Z_{12,t},\\
Y_{2,t} &= X_{2,t} + Z_{2,t},
\end{aligned}
\tag{10}
\]
where $Z_{1,t}$, $Z_{12,t}$, and $Z_{2,t}$ are independent standard Gaussian random variables, and the sum is over the field of real numbers. Assume that the realizations of $\{X_{i,t}\}$ satisfy the average power constraint
\[
\sum_{t=1}^{n} x_{i,t}^2 \le n P_i
\]
for some positive constant $P_i$, $i = 1, 2$, and that $P_1 \ge P_2$. We refer to the model in (10) as the two-user AWGN-RAC. In the rest of the paper, we use the notation $C(x) \triangleq \tfrac{1}{2}\log(1+x)$.

An outer bound to the capacity region $\mathcal{C}$ of the two-user AWGN-RAC in (10) is given by the following theorem.

Theorem 2:

Let $\overline{\mathcal{C}}$ denote the set of non-negative rates such that
\[
\begin{aligned}
r_1(\{1\}) &\le C(P_1),\\
r_2(\{2\}) &\le C(P_2),\\
r_1(\{1\}) + r_2(\{1,2\}) &\le C(P_1 + P_2),\\
r_2(\{2\}) + r_1(\{1,2\}) &\le C(P_1 + P_2),\\
r_1(\{1,2\}) &\le r_1(\{1\}),\\
r_2(\{1,2\}) &\le r_2(\{2\}).
\end{aligned}
\tag{11}
\]
Then, $\mathcal{C} \subseteq \overline{\mathcal{C}}$.

The proof of the above theorem is similar to the converse part of Theorem 1 and is hence omitted.

Next, we prove an achievability result by computing an inner bound to the capacity region $\mathcal{C}_{\mathcal{W}}$ of the two-user AWGN-RAC for a specific message structure $\mathcal{W}$. As for the BD-RAC, we let $\mathcal{W}_i = \{W_{i,1}, W_{i,2}\}$, $\mathcal{W}_i(i) = \mathcal{W}_i$, and $\mathcal{W}_i(12) = \{W_{i,2}\}$, $i \in \{1, 2\}$. The encoding scheme we use is Gaussian superposition coding. Each sender encodes the messages using independent Gaussian codewords having sum-power less than or equal to the power constraint. Decoding is performed using successive interference cancelation: messages are decoded in a prescribed decoding order, treating the interference of messages which follow in the order as noise. Then, each decoded codeword is subtracted from the aggregate received signal.

Proposition 3:

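Let $\underline{\mathcal{C}}'_{\mathcal{W}}$ denote the set of non-negative rates such that
\[
\begin{aligned}
r_1(\{1\}) &\le C(P_1),\\
r_2(\{2\}) &\le C(P_2),\\
r_1(\{1\}) + r_2(\{2\}) &\le C(P_1 + P_2),\\
r_1(\{1,2\}) &\le r_1(\{1\}),\\
r_2(\{1,2\}) &= r_2(\{2\}).
\end{aligned}
\tag{12}
\]
Similarly, let $\underline{\mathcal{C}}''_{\mathcal{W}}$ denote the set of non-negative rates satisfying (12) after swapping the indices 1 and 2. Finally, let $\underline{\mathcal{C}}'''_{\mathcal{W}}$ denote the set of non-negative rates satisfying the following inequalities
\[
\begin{aligned}
r_1(\{1,2\}) &\le C\!\left(\frac{(1-\beta_1)P_1}{\beta_1 P_1 + \beta_2 P_2 + 1}\right),\\
r_2(\{1,2\}) &\le C\!\left(\frac{(1-\beta_2)P_2}{\beta_1 P_1 + \beta_2 P_2 + 1}\right),\\
r_1(\{1,2\}) + r_2(\{1,2\}) &\le C\!\left(\frac{(1-\beta_1)P_1 + (1-\beta_2)P_2}{\beta_1 P_1 + \beta_2 P_2 + 1}\right),\\
r_1(\{1\}) &\le r_1(\{1,2\}) + C(\beta_1 P_1),\\
r_2(\{2\}) &\le r_2(\{1,2\}) + C(\beta_2 P_2),
\end{aligned}
\tag{13}
\]
for some $(\beta_1, \beta_2) \in [0,1] \times [0,1]$. Let $\underline{\mathcal{C}}_{\mathcal{W}} = \mathrm{closure}\bigl(\underline{\mathcal{C}}'_{\mathcal{W}} \cup \underline{\mathcal{C}}''_{\mathcal{W}} \cup \underline{\mathcal{C}}'''_{\mathcal{W}}\bigr)$. Then, $\underline{\mathcal{C}}_{\mathcal{W}} \subseteq \mathcal{C}_{\mathcal{W}} \subseteq \mathcal{C}$.

Proof: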

Suppose that sender two does not transmit message $W_{2,1}$, i.e., $R_{2,1} = 0$. The achievability of $\underline{\mathcal{C}}'_{\mathcal{W}}$ can then be shown by using a standard random coding argument as for the AWGN-MAC. To send $(W_{1,1}, W_{1,2})$, encoder one sends the sum of two independent Gaussian codewords having sum-power equal to $P_1$. On the other hand, sender two encodes $W_{2,2}$ into a Gaussian codeword having power $P_2$. A key observation is that the common receiver observing $Y_{12}$ can decode all transmitted messages: $W_{1,2}$ and $W_{2,2}$ can be decoded by assumption, while $W_{1,1}$ can be decoded after having subtracted $X_2$ from the received channel output. Thus, by joint typicality decoding, decoding is successful with arbitrarily small error probability if $R_{1,1} + R_{1,2} + R_{2,2} < C(P_1 + P_2)$, i.e., $r_1(\{1\}) + r_2(\{2\}) < C(P_1 + P_2)$. Similarly, the receiver observing $Y_1$ can decode messages $W_{1,1}, W_{1,2}$ as long as $R_{1,1} + R_{1,2} < C(P_1)$, i.e., $r_1(\{1\}) < C(P_1)$, while the receiver observing $Y_2$ can decode message $W_{2,2}$ if $r_2(\{2\}) \le C(P_2)$. We conclude that $\underline{\mathcal{C}}'_{\mathcal{W}}$ is an inner bound to the capacity region. By swapping the roles of user 1 and user 2, it is easy to see that $\underline{\mathcal{C}}''_{\mathcal{W}}$ is also an inner bound to the capacity region.

We claim that $\underline{\mathcal{C}}'''_{\mathcal{W}}$ can be achieved by a coding scheme which combines Gaussian superposition coding and multiple access decoding. As in the Gaussian broadcast channel, to send the message pair $(W_{i,1}, W_{i,2})$, encoder $i$ sends the codeword $X_i(W_{i,1}, W_{i,2}) = U_i(W_{i,2}) + V_i(W_{i,1})$, where the sequences $U_i$ and $V_i$ are independent Gaussian codewords having power $(1-\beta_i)P_i$ and $\beta_i P_i$ respectively, $i = 1, 2$. Upon receiving $Y_{12}$, decoder 12 first decodes $W_{1,2}$ and $W_{2,2}$ using a MAC decoder and treating $V_1(W_{1,1}) + V_2(W_{2,1})$ as noise. Decoding is successful with arbitrarily small error probability if
\[
\begin{aligned}
R_{1,2} &< C\!\left(\frac{(1-\beta_1)P_1}{\beta_1 P_1 + \beta_2 P_2 + 1}\right),\\
R_{2,2} &< C\!\left(\frac{(1-\beta_2)P_2}{\beta_1 P_1 + \beta_2 P_2 + 1}\right),\\
R_{1,2} + R_{2,2} &< C\!\left(\frac{(1-\beta_1)P_1 + (1-\beta_2)P_2}{\beta_1 P_1 + \beta_2 P_2 + 1}\right).
\end{aligned}
\tag{14}
\]
Upon receiving $Y_i = U_i(W_{i,2}) + V_i(W_{i,1}) + Z_i$, decoder $i$ performs decoding via successive interference cancelation: it first decodes $W_{i,2}$ treating $V_i(W_{i,1}) + Z_i$ as noise, then it subtracts $U_i(W_{i,2})$ from $Y_i$ and decodes $W_{i,1}$ from $V_i(W_{i,1}) + Z_i$. Thus, decoding of $W_{i,2}$ is successful if $R_{i,2} < C\!\left(\frac{(1-\beta_i)P_i}{\beta_i P_i + 1}\right)$, while decoding of $W_{i,1}$ is successful if $R_{i,1} < C(\beta_i P_i)$. After combining these conditions with the equalities which relate $(r_1(\{1\}), r_2(\{2\}), r_1(\{1,2\}), r_2(\{1,2\}))$ to $(R_{1,1}, R_{1,2}, R_{2,1}, R_{2,2})$, and eliminating $(R_{1,1}, R_{1,2}, R_{2,1}, R_{2,2})$ from the resulting system of inequalities, we obtain that (13) has to be satisfied for the above coding scheme to work. Finally, a standard time-sharing argument can be used to show that $\mathrm{closure}\bigl(\underline{\mathcal{C}}'_{\mathcal{W}} \cup \underline{\mathcal{C}}''_{\mathcal{W}} \cup \underline{\mathcal{C}}'''_{\mathcal{W}}\bigr) \subseteq \mathcal{C}_{\mathcal{W}}$.

The following theorem explicitly characterizes the gap between the above inner and outer bounds on $\mathcal{C}$.

Theorem 4:

Let $R \in \overline{\mathcal{C}}$. Then, there exists $R' \in \underline{\mathcal{C}}_{\mathcal{W}}$ such that $\|R - R'\| \le \sqrt{3}/2$.

Proof: See Appendix I.

Observe that Gaussian superposition coding is not the optimal coding strategy for the AWGN channel under consideration. However, the above theorem ensures that Gaussian superposition coding achieves to within $\sqrt{3}/2$ bit of the capacity region $\mathcal{C}$. It is important to note that this bound holds independently of the power constraints $P_1$ and $P_2$. The proof of the above theorem is established by showing that for any extreme point $v$ of $\overline{\mathcal{C}}$, there exists an $r \in \underline{\mathcal{C}}_{\mathcal{W}}$ at distance less than $\sqrt{3}/2$ from $v$. Since any point $R$ in $\overline{\mathcal{C}}$ is a convex combination of extreme points of $\overline{\mathcal{C}}$, we can employ a time-sharing protocol among the various achievable rate points $\{r\}$ and achieve a rate point at distance less than $\sqrt{3}/2$ from $R$.

A. An approximate expression for the throughput

As an application of the above result, consider the symmetric scenario where $P_1 = P_2 = P$, and where each user is active with probability $p$. Based on the knowledge of $p$, transmitters optimize the choice of the encoding rates so that the throughput is maximized. Formally, we look for the solution of the following optimization problem:
\[
\max \; p(1-p)\,\bigl[r_1(\{1\}) + r_2(\{2\})\bigr] + p^2\,\bigl[r_1(\{1,2\}) + r_2(\{1,2\})\bigr]
\]
subject to the constraint that the rates should be in $\mathcal{C}$. Combining Theorem 2 and Proposition 3, it is possible to show that the above maximum is equal to $T(p, P) + \varepsilon(p, P)$, where
\[
T(p, P) =
\begin{cases}
2p(1-p)\,C(P), & \text{if } p \in (0, p_1(P)];\\
p\,C(2P), & \text{if } p \in (p_1(P), 1],
\end{cases}
\]
$p_1(P) = 1 - C(2P)/(2C(P)) \in (0, 1/2)$, and $0 \le \varepsilon(p, P) \le 1$. Observe that the bound on the error term holds for any choice of the parameters $p$ and $P$.

Fig. 5. Throughput of the two-user symmetric AWGN-RAC ($P = 20$ dB). Curves: adaptive-rate approach [14], throughput (inner bound), throughput (outer bound), and full-CSI transmitters; horizontal axis $p$, vertical axis in bits/symbol.

The coding strategy used to achieve $T(p, P)$ is similar to the one described for the case of the symmetric BD channel. If the transmission probability $p$ lies in the interval $(0, p_1(P)]$, then user $i$ transmits message $W_{i,1}$ encoded at the maximum point-to-point coding rate, i.e., $C(P)$. If, instead, the transmission probability $p$ lies in the interval $(p_1(P), 1]$, then each active user transmits message $W_{i,2}$ encoded at rate $\tfrac{1}{2}C(2P)$, i.e., at half the sum-rate capacity of the two-user AWGN-MAC. The parameter $p_1(P)$ represents a threshold value below which it is worth taking the risk of incurring a packet collision. Observe that $p_1(P) \to 1/2$ as $P \to \infty$.

Fig. 5 compares $T(p, P)$ to the expected sum-rate achieved under the adaptive-rate framework [14], and to its counterpart assuming that full CSI is available to the transmitters. In the adaptive-rate framework, each sender transmits at a rate of $C(2P)/2$, so that users can always be decoded. The figure illustrates how our approach allows us to improve upon the expected adaptive sum-rate for small values of $p$, for which the collision probability is small. In this regime, our inner bound is in fact close to the curve obtained by giving full CSI to the transmitters. Later in the paper, we shall see that the gain provided by our approach becomes more significant when the population size of the network increases.

V. THE m-USER ADDITIVE RAC

In this section we extend the analysis previously developed for a two-user system to the case of an $m$-user MAC, where $m$ denotes an integer $\ge 2$, and in which each transmitter can be in two modes of operation, active or not active. The set of active users, denoted in the sequel by $A$, determines the state of the channel. That is, the channel is said to be in state $A$ if all users in the set $A$ are active. As in the two-user case, transmitters only know their own state component, and encode data into independent streams of information. The common receiver knows the set of active users, and decodes subsets of the transmitted data streams depending on the state of the channel.

By introducing one auxiliary receiver for each channel state, we can map this problem to a broadcast network with $m$ transmitters and $2^m - 1$ receivers. A one-to-one correspondence exists between the set of receivers and the set of non-empty subsets of $\{1, \dots, m\}$, so that for each set of active users $A$, there exists a unique corresponding receiver, which with abuse of notation we refer to as receiver $A$. Receiver $A$ observes the sum of the codewords transmitted by users in $A$ plus noise, and decodes a subset of the data streams sent by the active users. Observe that for a given channel state, only one among these auxiliary broadcast receivers corresponds to the actual physical receiver.

The formal description of the problem is as follows.

A. Problem formulation

Definition 5.1: An $m$-user DM-RAC $\bigl(\{\mathcal{X}_1, \dots, \mathcal{X}_m\}, \{\mathcal{Y}_A : A \subseteq \{1, \dots, m\}\}, p(\{y_A : A \subseteq \{1, \dots, m\}\} \mid x_1, \dots, x_m)\bigr)$ consists of $m$ input sets $\mathcal{X}_1, \dots, \mathcal{X}_m$, $2^m - 1$ output sets $\{\mathcal{Y}_A\}$, and a collection of conditional probabilities on the output sets.

The channel is additive if at any discrete unit of time $t \in \{1, \dots, n\}$, the input symbols $(X_{1,t}, \dots, X_{m,t})$ are mapped into $2^m - 1$ channel output symbols $\{Y_{A,t}\}$ via the additive map
\[
Y_{A,t} = \sum_{a \in A} X_{a,t} + Z_{A,t}, \tag{15}
\]
where the $\{Z_{A,t}\}$ are mutually independent random variables with values in a set $\mathcal{Z}$, and the sum is over a field $\mathbb{F}$ such that there exist $m$ embeddings $F_i : \mathcal{X}_i \to \mathbb{F}$, and one embedding $F_{m+1} : \mathcal{Z} \to \mathbb{F}$. In the next sections we consider two classes of additive random access channels: the symmetric BD-RAC, for which the channel inputs are strings of bits, and the sum is binary; and the symmetric AWGN-RAC, for which $\mathcal{X} = \mathcal{Z} = \mathbb{R}$, the channel inputs are subject to an average power constraint, and the sum is over the reals.

Definition 5.2: A message structure $\mathcal{W} = \bigl(\{\mathcal{W}_1, \dots, \mathcal{W}_m\}, \{\mathcal{W}_i(A) : i \in A \subseteq \{1, \dots, m\}\}\bigr)$ for an $m$-user RAC consists of $m$ input message sets $\mathcal{W}_i$, $\mathcal{W}_i = \{W_{i,1}, \dots, W_{i,|\mathcal{W}_i|}\}$, and $m 2^{m-1}$ output sets $\mathcal{W}_i(A)$, $\mathcal{W}_i(A) \subseteq \mathcal{W}_i$, such that the following condition is satisfied:

A1. $\mathcal{W}_i(B) \subseteq \mathcal{W}_i(A)$ for all $i \in A \subseteq B \subseteq \{1, \dots, m\}$.

For each $i$ and $j \in \{1, \dots, |\mathcal{W}_i|\}$, message $W_{i,j}$ is a random variable independent of everything else and uniformly distributed over a set with cardinality $2^{nR_{i,j}}$, for some non-negative rate $R_{i,j}$, $j \in \{1, \dots, |\mathcal{W}_i|\}$.

The reason for imposing condition A1 is as follows. Observe from (15) that if $A \subseteq B$ and the marginal distributions of the noises $Z_B$ and $Z_A$ are equal, then $Y_B$ is a (stochastically) degraded version of $Y_A$. Then, condition A1 says that the "better" receiver $A$ must decode what can be decoded at the "worse" receiver $B$.

For a given message structure $\mathcal{W}$, let
\[
r_i(A) = \sum_{j : W_{i,j} \in \mathcal{W}_i(A)} R_{i,j} \tag{16}
\]
denote the sum of the rates of the messages in $\mathcal{W}_i(A)$. Observe that (16) defines a linear mapping from $\mathbb{R}_+^{|\mathcal{W}_1| \times \dots \times |\mathcal{W}_m|}$ into $\mathbb{R}_+^{m 2^{m-1}}$ that shows how a macroscopic quantity, the rate at which user $i$ communicates to receiver $A$, is related to various microscopic quantities, the coding rates of the individual transmitted messages.
Definition 5.3: An $n$-code for the RAC $\bigl(\{\mathcal{X}_1, \dots, \mathcal{X}_m\}, \{\mathcal{Y}_A : A \subseteq \{1, \dots, m\}\}, p(\{y_A : A \subseteq \{1, \dots, m\}\} \mid x_1, \dots, x_m)\bigr)$ and for the message structure $\mathcal{W}$ consists of $m$ encoding functions (encoders) and $2^m - 1$ decoding functions (decoders). Encoder $i$ maps each $\{W_{i,1}, \dots, W_{i,|\mathcal{W}_i|}\}$ into a random codeword $X_i \triangleq \{X_{i,1}, X_{i,2}, \dots, X_{i,n}\}$ of $n$ random variables with values in the set $\mathcal{X}_i$. Decoder $A$ maps each channel output sequence $Y_A \in \mathcal{Y}_A^n$ into a set of indexes $\bigcup_{j : W_{i,j} \in \mathcal{W}_i(A)} \{\hat{W}_{i,j}\}$, where each index $\hat{W}_{i,j} \in \{1, \dots, 2^{nR_{i,j}}\}$ is an estimate of the corresponding transmitted message $W_{i,j} \in \mathcal{W}_i(A)$.

Definition 5.4: For a given $n$-code, the average probability of decoding error at the decoder $A$ is defined as
\[
\Pr\bigl\{\hat{W}_{i,j} \neq W_{i,j} \text{ for some } W_{i,j} \in \mathcal{W}_i(A),\; j \in \{1, \dots, |\mathcal{W}_i(A)|\},\; i \in A\bigr\}. \tag{17}
\]

Definition 5.5: A rate tuple $\{r_i(A)\}$ is said to be achievable if there exists a sequence of $n$-codes such that the average probability of a decoding error (17) for each decoder vanishes as the block size $n$ tends to infinity.

Definition 5.6: The capacity region $\mathcal{C}_{\mathcal{W}}$ of the $m$-user RAC $\bigl(\{\mathcal{X}_1, \dots, \mathcal{X}_m\}, \{\mathcal{Y}_A : A \subseteq \{1, \dots, m\}\}, p(\{y_A : A \subseteq \{1, \dots, m\}\} \mid x_1, \dots, x_m)\bigr)$ for the message structure $\mathcal{W}$ is the closure of the set of achievable rate vectors $\{r_i(A)\}$.

Finally,

Definition 5.7: The capacity region $\mathcal{C}$ of the $m$-user RAC $\bigl(\{\mathcal{X}_1, \dots, \mathcal{X}_m\}, \{\mathcal{Y}_A : A \subseteq \{1, \dots, m\}\}, p(\{y_A : A \subseteq \{1, \dots, m\}\} \mid x_1, \dots, x_m)\bigr)$ is defined as $\mathcal{C} = \mathrm{closure}\bigl(\bigcup_{\mathcal{W}} \mathcal{C}_{\mathcal{W}}\bigr)$.

B. An outer bound to the capacity $\mathcal{C}$

Theorem 5:

The capacity region $\mathcal{C}$ of the $m$-user additive RAC in (15) is contained inside the set of non-negative rate tuples satisfying
\[
r_i(B) \le r_i(A) \quad \text{for all } i \in A \subseteq B, \tag{18}
\]
and
\[
\sum_{k=1}^{K} r_{i_k}(\{i_1 \dots i_k\}) \le I(X_{i_1}, \dots, X_{i_K}; Y_{i_1 \dots i_K}), \tag{19}
\]
for all $K \in \{1, \dots, m\}$ and $i_1 \neq \dots \neq i_m \in \{1, \dots, m\}$, and some joint distribution $p(q)\,p(x_1 \mid q) \cdots p(x_m \mid q)$, where $|\mathcal{Q}| \le m! \times m$.

Proof:

See Appendix II.

Remark 1:

In the special case of a network with two users, it is immediate to verify that the outer bound given by the above theorem reduces to the region given by Theorem 1 and Theorem 2 for the two-user BD-RAC and the two-user AWGN-RAC, respectively.
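The two-user reduction in Remark 1 can be checked mechanically. The following minimal Python sketch is not from the paper; the function name `outer_bound_terms` is hypothetical. For every ordering $(i_1, \dots, i_m)$ and every prefix length $K$, it lists the pairs $(i_k, \{i_1 \dots i_k\})$ whose rates $r_{i_k}(\{i_1 \dots i_k\})$ are summed on the left-hand side of (19). Only the index structure is enumerated; the mutual-information right-hand side is not evaluated.

```python
from itertools import permutations


def outer_bound_terms(m: int) -> list[tuple[tuple[int, frozenset], ...]]:
    """Enumerate the left-hand sides of (19) for an m-user RAC.

    Each returned entry is a tuple of (user, active set) pairs, standing for
    the sum of r_user(active set) over those pairs. Hypothetical helper.
    """
    constraints = []
    for order in permutations(range(1, m + 1)):
        for K in range(1, m + 1):
            # k-th term: user i_k together with the prefix set {i_1, ..., i_k}.
            terms = tuple(
                (order[k], frozenset(order[: k + 1])) for k in range(K)
            )
            constraints.append(terms)
    return constraints


if __name__ == "__main__":
    for terms in outer_bound_terms(2):
        print(" + ".join(f"r_{i}({sorted(A)})" for i, A in terms))
```

For $m = 2$ the enumeration yields four sums: $r_1(\{1\})$, $r_2(\{2\})$, $r_1(\{1\}) + r_2(\{1,2\})$, and $r_2(\{2\}) + r_1(\{1,2\})$, which are the left-hand sides of the first four inequalities in (11).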

Remark 2:

An inspection of the proof of the above theorem shows that the additive channel model assumed in the theorem can be replaced with a more general family of maps, namely with those channels with the property that, if $X_{A'}$ is given, then $Y_A$ is statistically equivalent to $Y_{A \setminus A'}$, $A' \subseteq A$.

Remark 3:

Observe that (19) gives $m$ constraints for any permutation of the set $\{1, \dots, m\}$, so it defines $m \times m!$ inequalities.

Equation (19) can be obtained as follows. Suppose that we fix a set of active users $i_1, \dots, i_K$, for some $K \in \{1, \dots, m\}$, and we provide the receiver observing $Y_{i_1 \dots i_K}$ with the messages in the set $\bigcup_{r=1}^{K} \mathcal{W}_{i_{K-r+1}} \setminus \mathcal{W}_{i_{K-r+1}}(\{i_1 \dots i_{K-r+1}\})$ as side information. Suppose that this receiver decodes one user at a time, starting with user $i_K$ and progressing down to user $i_1$. Let us consider the first decoding step. By assumption, receiver $\{i_1 \dots i_K\}$ can decode the information in $\mathcal{W}_{i_K}(\{i_1 \dots i_K\})$ so, given the side information $\mathcal{W}_{i_K} \setminus \mathcal{W}_{i_K}(\{i_1 \dots i_K\})$, it has full knowledge of $\mathcal{W}_{i_K}$; it can compute the codeword $X_{i_K}$ transmitted by user $i_K$ and subtract it from the aggregate received signal, obtaining $Y_{i_1 \dots i_K} - X_{i_K} = Y_{i_1 \dots i_{K-1}}$. Thus, at the end of the first decoding step the channel output observed by receiver $\{i_1 \dots i_K\}$ is statistically equivalent to $Y_{i_1 \dots i_{K-1}}$. It follows that at the next decoding step it can decode the information in $\mathcal{W}_{i_{K-1}}(\{i_1 \dots i_{K-1}\})$. By proceeding this way, at the $r$th iteration we obtain a sequence which is statistically equivalent to $Y_{i_1 \dots i_{K-r+1}}$. Hence, receiver $\{i_1 \dots i_K\}$ can decode the information in $\mathcal{W}_{i_{K-r+1}}(\{i_1 \dots i_{K-r+1}\})$, then make use of the side information $\mathcal{W}_{i_{K-r+1}} \setminus \mathcal{W}_{i_{K-r+1}}(\{i_1 \dots i_{K-r+1}\})$ to compute $X_{i_{K-r+1}}$ and subtract it from the aggregate received signal before turning to decoding the next user. In other words, at the $r$th step of the iteration user $i_{K-r+1}$'s signal is only subject to interference from users $i_1, \dots, i_{K-r}$, as the signal of the remaining users has already been canceled. Therefore, user $i_{K-r+1}$ communicates to the receiver at a rate equal to $r_{i_{K-r+1}}(\{i_1 \dots i_{K-r+1}\})$.

In summary, equation (19) says that the sum of the communication rates across the $K$ iterations cannot exceed the mutual information between the channel inputs on the transmitters' side and the channel output on the receiver side, regardless of the permutation on the set of users originally chosen.

C. The throughput of a RAC

Assume that each user is active with probability $p$, independently of other users, and that $p$ is available to the encoders. In light of these assumptions,

Definition 5.8: The maximum expected sum-rate, or throughput, of a RAC is defined as
\[
T(p, m) \triangleq \max \sum_{A \subseteq \{1, \dots, m\}} p^{|A|} (1-p)^{m-|A|} \sum_{i \in A} r_i(A), \tag{20}
\]
where the maximization is subject to the constraint that the rates should be in the capacity region $\mathcal{C}$ of that channel.
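To make the objective in (20) concrete, here is a minimal Python sketch, not from the paper, that evaluates the expected sum-rate of a given rate assignment. The function name `expected_sum_rate` and the callback `rate(i, A)`, standing for $r_i(A)$, are hypothetical. The sketch only evaluates the objective; it does not check that the supplied rates lie in the capacity region, and it performs no maximization.

```python
from itertools import combinations
from typing import Callable, FrozenSet


def expected_sum_rate(
    p: float, m: int, rate: Callable[[int, FrozenSet[int]], float]
) -> float:
    """Evaluate the objective of (20) for a user-supplied rate(i, A) = r_i(A).

    Hypothetical helper: membership of the rates in the capacity region
    is the caller's responsibility.
    """
    total = 0.0
    users = range(1, m + 1)
    for k in range(1, m + 1):
        # Probability that exactly the users of a given set A, |A| = k, are active.
        weight = p**k * (1.0 - p) ** (m - k)
        for subset in combinations(users, k):
            A = frozenset(subset)
            total += weight * sum(rate(i, A) for i in A)
    return total


if __name__ == "__main__":
    # Two-user BD-RAC, rate-1 strategy: decoded only when alone.
    print(expected_sum_rate(0.25, 2, lambda i, A: 1.0 if len(A) == 1 else 0.0))
    # Two-user BD-RAC, rate-1/2 strategy: decoded in every state.
    print(expected_sum_rate(0.25, 2, lambda i, A: 0.5))
```

With $m = 2$ the two example calls reproduce the values $2p(1-p)$ and $p$ obtained earlier for the symmetric two-user BD-RAC.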

The fact that each user is active with the same probability $p$ has one important consequence. By rewriting the objective function in (20) as
\[
\sum_{k=1}^{m} p^k (1-p)^{m-k} \sum_{\substack{A \subseteq \{1, \dots, m\} \\ |A| = k}} \sum_{i \in A} r_i(A)
\]
and defining
\[
\rho_k = \sum_{\substack{A \subseteq \{1, \dots, m\} \\ |A| = k}} \sum_{i \in A} r_i(A), \qquad k \in \{1, \dots, m\}, \tag{21}
\]
it is clear that the objective function in (20) depends only on $\rho_1, \dots, \rho_m$. It follows that in order to compute $T(p, m)$ it is sufficient to characterize the optimal tradeoff among these $m$ variables. This motivates the following definition.

Definition 5.9: Let $\mathcal{C}_{\boldsymbol{\rho}}$ denote the image of the capacity region $\mathcal{C}$ of an $m$-user additive RAC under the linear transformation given by (21).

It should be emphasized that the symmetry of the problem allows us to greatly reduce its complexity: instead of characterizing $\mathcal{C}$, which is a convex subset of $\mathbb{R}_+^{m 2^{m-1}}$, it suffices to study the set $\mathcal{C}_{\boldsymbol{\rho}}$, which is a convex subset of $\mathbb{R}_+^{m}$. Thus, we have that
\[
T(p, m) = \max_{(\rho_1, \dots, \rho_m) \in \mathcal{C}_{\boldsymbol{\rho}}} \sum_{k=1}^{m} p^k (1-p)^{m-k} \rho_k. \tag{22}
\]
In the sequel, outer and inner bounds on $\mathcal{C}_{\boldsymbol{\rho}}$ are denoted by $\overline{\mathcal{C}}_{\boldsymbol{\rho}}$ and $\underline{\mathcal{C}}_{\boldsymbol{\rho}}$ respectively. In what follows, we denote by
\[
f_{m,k}(p) \triangleq \binom{m}{k} p^k (1-p)^{m-k}
\]
the probability of getting exactly $k$ successes in $m$ independent trials with success probability $p$, and we denote by
\[
F_{m,k}(p) \triangleq \sum_{i=0}^{k} f_{m,i}(p)
\]
the probability of getting at most $k$ successes.

VI. EXAMPLE: THE m-USER SYMMETRIC BD-RAC

In this section, we consider the $m$-user generalization of the symmetric BD-RAC considered in Section III, where all transmitted codewords are shifted by the same amount. This model represents an approximation of a wireless channel in which signals are received at the same power level.

Suppose the $\mathcal{X}$ and $\mathcal{Y}$ alphabets are each the set $\{0, 1\}$, the additive channel (15) is noise-free, so $Z_A \equiv 0$, and the sum is over $GF(2)$. Observe that this is the $m$-user version of the channel model in (3) in the special case where $n_1 = \dots = n_m = 1$. The codeword length is normalized to 1. As mentioned above, this channel model can be thought of as a natural generalization of the packet collision model widely used in the networking literature, where packets are always encoded at rate one, so that transmissions are successful only when there is one active user. Theorem 5 yields the following proposition.

Proposition 6: The capacity region $\mathcal{C}$ of the $m$-user symmetric BD-RAC is contained inside the set of $\{r_i(A)\}$ tuples satisfying
\[
r_i(B) \le r_i(A) \quad \text{for all } i \in A \subseteq B, \tag{23}
\]
and
\[
\sum_{k=1}^{m} r_{i_k}(\{i_1 \dots i_k\}) \le 1, \tag{24}
\]
for all $i_1 \neq \dots \neq i_m \in \{1, \dots, m\}$.

A. The throughput of the symmetric BD-RAC

Next, we turn to the problem of characterizing the throughput $T(p, m)$ for the symmetric BD-RAC. The following theorem provides the exact characterization of $\mathcal{C}_{\boldsymbol{\rho}}$ for this channel.

Theorem 7: $\mathcal{C}_{\boldsymbol{\rho}}$ for the $m$-user symmetric BD-RAC is equal to the set of $(\rho_1, \dots, \rho_m)$ tuples satisfying
\[
\frac{\rho_k}{k \binom{m}{k}} \ge \frac{\rho_{k+1}}{(k+1) \binom{m}{k+1}} \ge \dots \ge \frac{\rho_m}{m \binom{m}{m}} \ge 0, \tag{25a}
\]
and
\[
\sum_{k=1}^{m} \frac{\rho_k}{k \binom{m}{k}} \le 1. \tag{25b}
\]

Proof: See Appendix III.

We outline the proof of the theorem as follows. The outer bound in the above theorem makes use of Proposition 6. To prove the achievability, we show that $\mathcal{C}_{\boldsymbol{\rho}}$ is equal to the image under the linear transformation given by (21) of the capacity region $\mathcal{C}_{\mathcal{W}}$ of the $m$-user symmetric BD-RAC for the message structure $\mathcal{W}$ defined by
\[
\mathcal{W}_i = \{W_{i,1}, \dots, W_{i,m}\}, \qquad i \in \{1, \dots, m\}, \tag{26}
\]
and
\[
\mathcal{W}_i(A) = \bigcup_{j \ge |A|} W_{i,j}, \tag{27}
\]
for $i \in A \subseteq \{1, \dots, m\}$. This message structure is the natural generalization of the message structure used for the two-user BD-RAC. Each sender transmits $m$ independent messages, which are ordered according to the amount of interference which they can tolerate, so that message $W_{i,j}$ is decoded when there are fewer than $j$ interferers, regardless of the identity of the interferers.

To prove the achievability of $\mathcal{C}_{\boldsymbol{\rho}}$ using this message structure, we observe that $\mathcal{C}_{\boldsymbol{\rho}}$ is the convex hull of $m$ extreme points, and that to achieve the $k$th extreme point it suffices that user $i$ transmits a single information message, namely $W_{i,k}$, encoded at rate $1/k$. Thus, a simple single-layer coding strategy can achieve all extreme points of $\mathcal{C}_{\boldsymbol{\rho}}$, and the proof of the achievability is completed by means of a time-sharing argument.

Having an exact characterization of $\mathcal{C}_{\boldsymbol{\rho}}$ at hand, we can explicitly solve the throughput optimization problem. The main result of this section is given by the following theorem.

Theorem 8:

Let $\Pi_m$ represent the partition of the unit interval into the set of $m$ intervals
\[
(p_0, p_1],\ (p_1, p_2],\ \dots,\ (p_{m-1}, p_m],
\]
where $p_0 \triangleq 0$, $p_m \triangleq 1$, and, for $0 < k < m$, $p_k$ is defined as the unique solution in $(0, 1)$ to the following polynomial equation in $p$:
\[
\frac{1}{k+1} F_{m-1,k}(p) = \frac{1}{k} F_{m-1,k-1}(p). \tag{28}
\]
Then, the following facts hold:

1) $p_1 = \frac{1}{m}$, $p_{m-1} = \bigl(\frac{1}{m}\bigr)^{1/(m-1)}$, and $p_k \in \bigl(0, \frac{k}{m}\bigr)$ for $k \in \{2, \dots, m-1\}$.

2) The throughput of the $m$-user symmetric BD-RAC is given by
\[
T(p, m) = \frac{mp}{k} F_{m-1,k-1}(p), \quad \text{if } p \in (p_{k-1}, p_k], \tag{29}
\]
for $k \in \{1, \dots, m\}$.

3) $T(p, m)$ is achieved when all active senders transmit a single message encoded at rate
\[
r(p) = \frac{1}{k}, \quad \text{if } p \in (p_{k-1}, p_k], \tag{30}
\]
for $k \in \{1, \dots, m\}$.

4) $T(p, m)$ is a continuous function of $p$; it is concave and strictly increasing in each interval of the partition $\Pi_m$.

Proof:

See Appendix IV.
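The statement of Theorem 8 lends itself to a direct numerical check. The minimal Python sketch below is not from the paper; the names `binom_cdf`, `bd_threshold`, and `bd_throughput` are hypothetical. It assumes, as the theorem states, that the throughput is attained by one of the $m$ single-rate strategies, where rate $1/k$ is decoded whenever at most $k$ users are active. Under that assumption the throughput is the maximum over $k$ of $\frac{mp}{k} F_{m-1,k-1}(p)$, and the breakpoints $p_k$ are found by bisection on (28).

```python
from math import comb


def binom_cdf(n: int, k: int, p: float) -> float:
    """F_{n,k}(p): probability of at most k successes in n Bernoulli(p) trials."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def bd_threshold(m: int, k: int, iters: int = 200) -> float:
    """Solve (28) for p_k by bisection, for 1 <= k <= m - 1 (sketch only)."""
    if not 1 <= k <= m - 1:
        raise ValueError("k must satisfy 1 <= k <= m - 1")

    def gap(p: float) -> float:
        # Negative while rate 1/k beats rate 1/(k+1), positive afterwards.
        return binom_cdf(m - 1, k, p) / (k + 1) - binom_cdf(m - 1, k - 1, p) / k

    lo, hi = 1e-12, 1.0 - 1e-12
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def bd_throughput(p: float, m: int) -> tuple[float, int]:
    """Return (T, k): best single-rate expected sum-rate and the k with rate 1/k."""
    def value(k: int) -> float:
        # m users, each active w.p. p, rate 1/k, decoded if at most k - 1 others are active.
        return m * p / k * binom_cdf(m - 1, k - 1, p)

    best_k = max(range(1, m + 1), key=value)  # ties resolve to the smaller k
    return value(best_k), best_k


if __name__ == "__main__":
    m = 4
    print([round(bd_threshold(m, k), 6) for k in range(1, m)])
    print(bd_throughput(0.3, m))
```

For $m = 4$ the computed breakpoints start at $p_1 = 1/4$ and end at $p_3 = 4^{-1/3}$, in agreement with item 1 of the theorem.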

Remarks:

The above theorem says that $T(p, m)$ can be achieved by a coding strategy which does not require simultaneous transmission of multiple messages. Instead, each active user transmits a single message encoded at rate $r(p)$. Inspection of (30) reveals that $r(p)$ is a piecewise constant function of $p$. If $p$ is in the $k$th interval of the partition $\Pi_m$, then $r(p)$ is equal to $1/k$. Similarly, the corresponding achievable throughput $T(p, m)$ is a piecewise polynomial function of $p$. The boundary values of the partition, denoted by the sequence $\{p_k\}$, are given in semi-analytic form as solutions of (28), and closed-form expressions are available only for some special values of $m$ and $k$. Nevertheless, Theorem 8 provides the upper bound $p_k < \frac{k}{m}$.

The structure of the solution is amenable to the following intuitive interpretation. Based on the knowledge of $m$ and $p$, transmitters estimate the number of active users. More precisely, if $p$ is in the $k$th interval of the partition $\Pi_m$, i.e., $p_{k-1} < p \le p_k$, then transmitters estimate that there are $k$ active users. Since $p_k < \frac{k}{m}$, it is interesting to observe that the computed estimator is in general different from the maximum-likelihood estimator $\lfloor mp \rfloor$. Then, they encode their data at rate $1/k$, that is, each user requests an equal fraction of the $k$-user binary MAC sum-rate capacity. Clearly, there is a chance that the actual number of active users exceeds $k$, in which case a collision occurs. Vice versa, the scheme results in an inefficient use of the channel when the number of active users is less than $k$. However, this strategy represents the right balance between the risk of packet collisions and inefficiency.

It is interesting to note that when $p \le p_1$ the optimal strategy consists of encoding at rate 1, i.e., at the maximum rate supported by the channel. As already remarked, this is the coding strategy used in the classic ALOHA protocol. Notice that since $p_1 = \frac{1}{m}$, this strategy is optimal when the probability of being active is less than the inverse of the population size in the network. In this case, there is no advantage in exploiting the multi-user capability at the receiver. On the other hand, for $p > \frac{1}{m}$, the throughput of an ALOHA system is limited by packet collisions, which become more and more frequent as $p$ increases. In this regime, the encoding rate has to decrease in order to accommodate the presence of other potential active users, which becomes more and more likely as $p$ increases.

B. Throughput scaling for increasing values of m

If we let the population size $m$ grow while keeping $p$ constant, the law of large numbers implies that the number of active users concentrates around $mp$, so one would expect that the uncertainty about the number of active users decreases as $m$ increases. This intuition is confirmed by the following corollary, which states that the probability of collision tends to zero as $m$ grows to infinity.

Corollary 9:

Let $p \in (0, 1]$. Then, $\lim_{m \to \infty} T(p, m) = 1$.

So far, we have been assuming that $p$ does not depend on $m$. Assume now that the total packet arrival rate in the system is $\lambda$, and let $p = \frac{\lambda}{m}$ be the arrival probability at each transmitting node. Let $T(\lambda)$ denote the throughput in the limit $m \to \infty$. Then, by applying the law of rare events to (28) and (29) we obtain the following corollary to Theorem 8.

Corollary 10: Let $\lambda_0 \triangleq 0$, $\lambda_\infty \triangleq \infty$ and, for $0 < k < \infty$, let $\lambda_k$ be defined as the unique solution in $(0, \infty)$ to the following polynomial equation in $\lambda$:
\[
\frac{1}{k+1}\, \Gamma(k+1, \lambda) = \Gamma(k, \lambda),
\]
where $\Gamma(k+1, \lambda)$ is the upper incomplete gamma function. Then, as $m$ tends to infinity, the throughput is given by
\[
T(\lambda) = \frac{\lambda}{k!}\, \Gamma(k, \lambda), \quad \text{if } \lambda \in (\lambda_{k-1}, \lambda_k],
\]
for $k \in \mathbb{Z}^{+}$. The rate which attains the throughput is given by
\[
r(\lambda) = \frac{1}{k}, \quad \text{if } \lambda \in (\lambda_{k-1}, \lambda_k],\ k \in \mathbb{Z}^{+}.
\]
Finally, $T(\lambda)$ is a continuous function of $\lambda$; it is concave and strictly increasing in each interval $(\lambda_{k-1}, \lambda_k]$, and $\lim_{\lambda \to \infty} T(\lambda) = 1$.

Note that the claim above is in striking contrast with the throughput scaling of the classic slotted ALOHA protocol. The throughput of slotted ALOHA increases for small $\lambda$, reaches a maximum of $e^{-1}$ at $\lambda = 1$ (that is, at $p = 1/m$), and then decreases to zero as $\lambda$ tends to infinity. See Fig. 6 for a comparison between $T(\lambda)$ and the throughput of standard ALOHA as a function of $\lambda$.

VII. EXAMPLE: THE m-USER SYMMETRIC AWGN-RAC

We now turn to another important example of additive channels. Suppose that the codewords generated by the $m$ encoders are composed of $n$ random variables taking values over the reals, and whose realizations satisfy the following average power constraint
\[
\sum_{t=1}^{n} x_{i,t}^2 \le n P
\]
for some positive constant $P$. Observe that we focus on the symmetric case in which all users are subject to the same received power constraint. Furthermore, suppose that $\{Z_A\}$ in (15) are independent standard Gaussian random variables, and that the sum in (15) is over the field of real numbers. Applying Theorem 5, we obtain the following proposition.

Fig. 6. Comparison between $T(\lambda)$ and the throughput $\lambda e^{-\lambda}$ of the slotted ALOHA protocol; horizontal axis $\lambda$, vertical axis in bits/symbol.

Proposition 11:

The capacity region $\mathcal{C}$ of the $m$-user symmetric AWGN-RAC is contained inside the set of $\{r_i(A)\}$ tuples satisfying
\[
r_i(B) \le r_i(A) \quad \text{for all } i \in A \subseteq B,
\]
and
\[
\sum_{k=1}^{K} r_{i_k}(\{i_1 \dots i_k\}) \le C(KP),
\]
for all $K \in \{1, \dots, m\}$ and $i_1 \neq \dots \neq i_m \in \{1, \dots, m\}$.

A. An approximate expression to within one bit for the throughput

Next, we turn to the problem of characterizing the throughput $T(p, m, P)$ for the symmetric AWGN-RAC as a function of the transmission probability $p$, the population size $m$, and the available power $P$. First, we provide inner and outer bounds on $\mathcal{C}_{\boldsymbol{\rho}}$ for this channel.

Theorem 12: Let $\overline{\mathcal{C}}_{\boldsymbol{\rho}}$ denote the set of rates $\{\rho_k\} \in \mathbb{R}_+^{m}$ such that
\[
\frac{\rho_k}{k \binom{m}{k}} \ge \frac{\rho_{k+1}}{(k+1) \binom{m}{k+1}} \ge \dots \ge \frac{\rho_m}{m \binom{m}{m}} \ge 0,
\]
and
\[
\sum_{k=1}^{K} \frac{\rho_k}{k \binom{m}{k}} \le C(KP),
\]
for all $K \in \{1, \dots, m\}$. Let $\underline{\mathcal{C}}_{\boldsymbol{\rho}}$ denote the set of rates $\{\rho_k\} \in \mathbb{R}_+^{m}$ that satisfy (25a) and
\[
\frac{1}{C(P)} \frac{\rho_1}{\binom{m}{1}} + \sum_{k=2}^{m} \left( \frac{k}{C(kP)} - \frac{k-1}{C((k-1)P)} \right) \frac{\rho_k}{k \binom{m}{k}} \le 1.
\]
Then, $\underline{\mathcal{C}}_{\boldsymbol{\rho}} \subseteq \mathcal{C}_{\boldsymbol{\rho}} \subseteq \overline{\mathcal{C}}_{\boldsymbol{\rho}}$.

The proof of the above theorem is omitted since it closely follows the proof of Theorem 7. As for the case of the BD-RAC, the achievable region in the above theorem is obtained by considering the message structure defined by (26) and (27), and the coding scheme we utilize does not require the use of Gaussian superposition coding.

By virtue of Theorem 12 it is possible to bound $T(p, m, P)$ as
\[
\underline{T}(p, m, P) \le T(p, m, P) \le \overline{T}(p, m, P),
\]
where the lower and upper bounds are given by (22) after replacing $\mathcal{C}_{\boldsymbol{\rho}}$ with $\underline{\mathcal{C}}_{\boldsymbol{\rho}}$ and $\overline{\mathcal{C}}_{\boldsymbol{\rho}}$ respectively. The following theorem provides an expression for $\underline{T}(p, m, P)$.

Theorem 13:

Let Π m (P) represent the partition of the unit interval into the set of m intervals ( p (P) , p (P)] , . . . , ( p m − (P) , p m (P)] , where p (P) , , p m (P) , and, for k ∈ { , . . . , m − } , p k (P) is deﬁned as the unique solution in (cid:0) , km (cid:1) to the following polynomial equation in p C (( k + 1)P) k + 1 F m − ,k ( p ) = C ( k P) k F m − ,k − ( p ) . (31)Then, T ( p, m, P) is a continuous function of p , concave, strictly increasing in each interval of the partition Π m (P) , and is given by T ( p, m, P) = C ( k P) k mpF m − ,k − ( p ) , if p ∈ ( p k − (P) , p k (P)] , (32)for k ∈ { , . . . , m } . To achieve T ( p, m, P) , it sufﬁces that each active user transmits a unique messageencoded at rate r ( p, m, P) = C ( k P) k if p ∈ ( p k − (P) , p k (P)] , (33) October 28, 2018 DRAFT9 for k ∈ { , . . . , m } .The proof of the above theorem is omitted since it closely follows the proof of Theorem 7. Similarlyto what stated by Theorem 7 for the BD-RAC, the above theorem says that T ( p, m, P) can be achievedby a coding strategy which does not require superposition coding: each active user transmits a singlemessage encoded at rate r ( p, m, P) . Both r ( p, m, P) and T ( p, m, P) are piecewise constant function of p , whose value depends on the transmission probability p .The coding scheme used to achieve T ( p, m, P) for the symmetric AWGN-RAC is similar to the oneused to achieve the throughput of the symmetric BD-RAC: based on the knowledge of m and P and p ,transmitters estimate the number of active users. More precisely, if p is in the k th interval of the partition Π m (P) , i.e., p k − (P) < p ≤ p k (P) , then transmitters estimate that there are k active users. Then, theyencode their data at rate k C ( k P) , that is, each user requests an equal fraction of the k -user AWGN MACsum-rate capacity.A natural question to ask is how close this scheme is to the optimal performance. 
To answer this question, we first need to provide an expression for $\overline{T}(p, m, \mathrm{P})$. This is done in the next theorem.

Theorem 14:

Let $\Pi_m$ represent the partition of the unit interval into the set of $m$ intervals $(p_0, p_1], \dots, (p_{m-1}, p_m]$, where $p_0 \triangleq 0$, $p_m \triangleq 1$, and, for every $k \in \{1, \dots, m-1\}$, $p_k$ is defined as the unique solution in $\left(0, \frac{k}{m}\right]$ to the following polynomial equation in $p$:

$$\frac{1}{k+1}\, F_{m-1,k}(p) = \frac{1}{k}\, F_{m-1,k-1}(p). \qquad (34)$$

Then, $\overline{T}(p, m, \mathrm{P})$ is a continuous function of $p$, concave and strictly increasing in each interval of the partition $\Pi_m$, and is given by

$$\overline{T}(p, m, \mathrm{P}) = m p \sum_{i=1}^{m} v_{k,i}\, F_{m-1,i-1}(p) \quad \text{if } p \in (p_{k-1}, p_k], \qquad (35)$$

for $k \in \{1, \dots, m\}$, where

$$v_{1,i} = \begin{cases} 2C(\mathrm{P}) - C(2\mathrm{P}), & i = 1, \\ 2C(i\mathrm{P}) - C((i+1)\mathrm{P}) - C((i-1)\mathrm{P}), & i \in \{2, \dots, m-1\}, \\ C(m\mathrm{P}) - C((m-1)\mathrm{P}), & i = m. \end{cases} \qquad (36)$$

For $k \in \{2, \dots, m-2\}$,

$$v_{k,i} = \begin{cases} 0, & i \in \{1, \dots, k-1\}, \\ \frac{k+1}{k} C(k\mathrm{P}) - C((k+1)\mathrm{P}), & i = k, \\ 2C(i\mathrm{P}) - C((i+1)\mathrm{P}) - C((i-1)\mathrm{P}), & i \in \{k+1, \dots, m-1\}, \\ C(m\mathrm{P}) - C((m-1)\mathrm{P}), & i = m. \end{cases} \qquad (37)$$

For $k = m-1$,

$$v_{m-1,i} = \begin{cases} 0, & i \in \{1, \dots, m-2\}, \\ \frac{m}{m-1} C((m-1)\mathrm{P}) - C(m\mathrm{P}), & i = m-1, \\ C(m\mathrm{P}) - C((m-1)\mathrm{P}), & i = m. \end{cases} \qquad (38)$$

For $k = m$,

$$v_{m,i} = \begin{cases} 0, & i \in \{1, \dots, m-1\}, \\ \frac{1}{m} C(m\mathrm{P}), & i = m. \end{cases} \qquad (39)$$

Proof:

See Appendix V.

The proof of the above theorem is conceptually simple but technical, as it requires finding the analytic solution of a linear program. Comparing the statements of Theorems 13 and 14, one can observe that the basic structure of $\underline{T}(p, m, \mathrm{P})$ and $\overline{T}(p, m, \mathrm{P})$ is the same. As opposed to the sequence $\{p_k(\mathrm{P})\}$ defined in Theorem 13, the sequence $\{p_k\}$ in Theorem 14 does not depend on the power $\mathrm{P}$. It is easy to see that $p_k(\mathrm{P}) \le p_k \le k/m$, for every $k$. Furthermore, the sequence $\{p_k\}$ defined in Theorem 14 is equal to the sequence defined in Theorem 7. By directly comparing $\underline{T}(p, m, \mathrm{P})$ and $\overline{T}(p, m, \mathrm{P})$ we obtain the following result.

Theorem 15:

Let $p \in (0, 1]$, $m \ge 2$ and $\mathrm{P} > 0$. Then,

$$\overline{T}(p, m, \mathrm{P}) - \underline{T}(p, m, \mathrm{P}) \le 1.$$

Proof:

See Appendix VI.

The above theorem says that our suggested coding scheme achieves an expected sum-rate which is only one bit away from the optimum, independently of the values of $p$, $\mathrm{P}$ and $m$. It is remarkable that the gap does not increase with the population size of the system. Thus we conclude that transmitting at rate $\frac{1}{k} C(k\mathrm{P})$ when $p$ is in the $k$th interval of the partition $\Pi_m(\mathrm{P})$ represents the right balance between risk of collision and efficiency: encoding rates above $\frac{1}{k} C(k\mathrm{P})$ would increase the collision probability, yielding a decrease in the expected sum-rate. Conversely, rates lower than $\frac{1}{k} C(k\mathrm{P})$ would result in an inefficient use of the channel.

Fig. 7. Upper and lower bounds on the throughput of a four-user symmetric AWGN-RAC and encoding rate achieving the lower bound, as a function of the transmission probability $p$ ($\mathrm{P} = 15$ dB). [Two panels, "Throughput" and "Rate allocation", both in bits/symbol versus $p$.]

Fig. 7 shows plots of $\underline{T}(p, m, \mathrm{P})$, $\overline{T}(p, m, \mathrm{P})$, and $r(p, m, \mathrm{P})$ for the case of networks with four users. Observe that $\underline{T}(p, m, \mathrm{P})$ is a piecewise concave function of the transmission probability.

B. Comparison with other notions of capacity

The expression for the throughput derived in the previous section can be compared to similar expressions obtained assuming other notions of capacity. A natural outer bound is given by the throughput achieved assuming that full CSI is available to the transmitters. In this case, the sum-rate of the $k$-user AWGN-MAC can be achieved whenever $k$ users are active. Averaging over the message arrival probability, we obtain the following expression for the throughput:

$$T_{\mathrm{CSI}}(p, m, \mathrm{P}) \triangleq \sum_{k=1}^{m} f_{m,k}(p)\, C(k\mathrm{P}). \qquad (40)$$

On the other hand, if we study the symmetric AWGN-RAC following the adaptive capacity framework as in [14], then each transmitter designs a code which has to be decoded regardless of the number of active users. This is a conservative viewpoint and forces each user to choose a rate of $\frac{1}{m} C(m\mathrm{P})$ so that users can be decoded even when all $m$ transmitters are active. Since $mp$ users are active on average, we obtain

$$T_{\mathrm{AD}}(p, m, \mathrm{P}) \triangleq p\, C(m\mathrm{P}). \qquad (41)$$
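The two benchmarks (40) and (41) are easy to evaluate numerically. The following is a minimal sketch, not part of the original analysis: it assumes the usual Gaussian capacity function $C(x) = \frac{1}{2}\log_2(1+x)$ in bits per symbol and takes $f_{m,k}(p)$ to be the binomial probability of $k$ active users out of $m$. The function names are ours and purely illustrative.

```python
import math


def capacity(x: float) -> float:
    """Assumed definition: C(x) = 0.5 * log2(1 + x), in bits per symbol."""
    return 0.5 * math.log2(1.0 + x)


def binom_pmf(n: int, k: int, p: float) -> float:
    """Assumed meaning of f_{n,k}(p): P(exactly k of n users are active)."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def throughput_csi(p: float, m: int, power: float) -> float:
    """Eq. (40): average of the k-user MAC sum-rate over the number of active users."""
    return sum(binom_pmf(m, k, p) * capacity(k * power) for k in range(1, m + 1))


def throughput_adaptive(p: float, m: int, power: float) -> float:
    """Eq. (41): every user encodes at C(mP)/m and m*p users are active on average."""
    return p * capacity(m * power)


if __name__ == "__main__":
    m, power = 25, 10 ** (20 / 10)  # m = 25 users, P = 20 dB (linear scale)
    for p in (0.05, 0.2, 0.5, 1.0):
        print(p, throughput_adaptive(p, m, power), throughput_csi(p, m, power))
```

Under these assumptions the adaptive-rate throughput never exceeds the full-CSI throughput, because $C(k\mathrm{P})/k \ge C(m\mathrm{P})/m$ for $k \le m$, and the two coincide when $p = 1$.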

Fig. 8. Throughput of the symmetric AWGN-RAC with $m = 25$ users ($\mathrm{P} = 20$ dB). [Curves: $T_{\mathrm{AD}}(p, m, \mathrm{P})$, $\underline{T}(p, m, \mathrm{P})$, $\overline{T}(p, m, \mathrm{P})$ and $T_{\mathrm{CSI}}(p, m, \mathrm{P})$, in bits/symbol versus $p$.]

Fig. 8 compares the obtained bounds on $T(p, m, \mathrm{P})$ for the case $m = 25$ and $\mathrm{P} = 20$ dB to the throughput under the adaptive-rate framework (41), and assuming full CSI available to the transmitters (40).

Finally, observe that in order to achieve $\underline{T}(p, m, \mathrm{P})$ transmitters have to estimate the number of active users by solving the polynomial equations (31). A natural question to ask is what is the achievable throughput performance if a maximum-likelihood estimator for the number of active users is used instead. Consider the following strategy. Suppose that, based on the knowledge of $m$ and $p$, and assuming no prior on the number of active users, transmitters compute $k_{\mathrm{ML}}$, the maximum-likelihood estimator for the number of active users, and encode their data at rate $C(k_{\mathrm{ML}}\mathrm{P})/k_{\mathrm{ML}}$. Each active transmitter estimates the state of the remaining $(m-1)$ users. Since the most probable outcome of $(m-1)$ Bernoulli trials with success probability $p$ is the integer number between $mp - 1$ and $mp$, we have that $k_{\mathrm{ML}} = \lfloor mp \rfloor$. Thus, we obtain the following expression for the expected sum-rate:

$$T_{\mathrm{ML}}(p, m, \mathrm{P}) \triangleq \frac{mp}{k_{\mathrm{ML}}}\, C(k_{\mathrm{ML}}\mathrm{P})\, F_{m-1,k_{\mathrm{ML}}-1}(p). \qquad (42)$$

Fig. 9 compares $\underline{T}(p, m, \mathrm{P})$ and (42) for the case $m = 25$ and $\mathrm{P} = 20$ dB. We remark that the ML estimator for the number of active users results in a strictly suboptimal throughput performance.
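The comparison between the proposed estimator and the ML estimator can be sketched numerically as follows. This is an illustration under stated assumptions rather than the authors' implementation: $C(x) = \frac{1}{2}\log_2(1+x)$, $F_{n,k}(p)$ is taken to be the binomial cumulative distribution function, the thresholds $p_k(\mathrm{P})$ of (31) are found by plain bisection on $(0, k/m)$, and $k_{\mathrm{ML}} = \lfloor mp \rfloor$ is clamped to at least 1 (our choice, so that the rate in (42) is defined when $mp < 1$). All function names are hypothetical.

```python
import math


def capacity(x: float) -> float:
    """Assumed definition: C(x) = 0.5 * log2(1 + x), in bits per symbol."""
    return 0.5 * math.log2(1.0 + x)


def binom_cdf(n: int, k: int, p: float) -> float:
    """Assumed meaning of F_{n,k}(p): P(at most k of n users are active)."""
    return sum(math.comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def scheme_throughput(k: int, p: float, m: int, power: float) -> float:
    """Expected sum-rate when every active user encodes at C(kP)/k, as in (32) and (42)."""
    return m * p * capacity(k * power) / k * binom_cdf(m - 1, k - 1, p)


def threshold(k: int, m: int, power: float, iters: int = 200) -> float:
    """Root p_k(P) of Eq. (31) in (0, k/m), located by bisection."""

    def gap(p: float) -> float:
        left = capacity(k * power) / k * binom_cdf(m - 1, k - 1, p)
        right = capacity((k + 1) * power) / (k + 1) * binom_cdf(m - 1, k, p)
        return left - right

    lo, hi = 0.0, k / m  # gap(lo) > 0 and gap(hi) < 0 for 1 <= k <= m - 1
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def proposed_k(p: float, m: int, power: float) -> int:
    """Index k of the interval (p_{k-1}(P), p_k(P)] of the partition that contains p."""
    cuts = [threshold(k, m, power) for k in range(1, m)]
    return 1 + sum(1 for c in cuts if p > c)


def ml_k(p: float, m: int) -> int:
    """k_ML = floor(m * p) as in the text, clamped to >= 1 (our assumption)."""
    return max(1, math.floor(m * p))


if __name__ == "__main__":
    m, power = 25, 10 ** (20 / 10)  # m = 25 users, P = 20 dB (linear scale)
    for p in (0.1, 0.3, 0.6):
        k_prop, k_ml = proposed_k(p, m, power), ml_k(p, m)
        print(p, k_prop, scheme_throughput(k_prop, p, m, power),
              k_ml, scheme_throughput(k_ml, p, m, power))
```

For a quick sanity check with $m = 4$, the first threshold has the closed form $p_1(\mathrm{P}) = \bigl(C(\mathrm{P}) - \tfrac{1}{2}C(2\mathrm{P})\bigr) / \bigl(C(\mathrm{P}) + C(2\mathrm{P})\bigr)$, which the bisection should reproduce.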

Fig. 9. Throughput performance of the proposed estimator vs ML estimator for the number of active users ($\mathrm{P} = 20$ dB, $m = 25$).

VIII. DISCUSSION AND PRACTICAL CONSIDERATIONS

In networking, much research effort has been put into the design of distributed algorithms where each agent has limited information about the global state of the network. The model we developed in this paper allowed us to focus on the rate allocation problem that occurs when multiple nodes attempt to access a common medium, and when the set of active users is not available to the transmitters. Our analysis has led to a distributed algorithm which is easily implementable in practical systems, and which is optimal in some information-theoretic sense. The rule of thumb which we have developed is that, upon transmission, senders should estimate the number of active users according to a prescribed algorithm based on the knowledge of the population size $m$ and the transmission probability $p$, and then choose the encoding rate accordingly.

In this paper we focused primarily on the problem of characterizing the throughput assuming perfect symmetry in the network, that is, the same transmission probability and received power constraint across users. The reasons for enforcing symmetry are twofold. First, throughput maximization is a meaningful performance metric only in symmetric scenarios. Second, it allows us to focus on random packet arrivals at the transmitters, and not on the different power levels at which transmitted signals are received by the common receiver. This set-up is a realistic model for uplink communications in power-controlled cellular wireless systems. Nevertheless, an interesting open question is how to apply the layering approach to the $m$-user AWGN-RAC with unequal power levels at the receiver, assuming that each sender only knows its own power level and state.

We made the underlying assumption that users can be synchronized, both at block and symbol level. In light of this assumption, a time-sharing protocol could be employed to prove achievability results. A simple way to achieve this partial form of cooperation among senders is to establish, prior to any transmission, that different coding schemes are used in different fractions of the transmission time. However, in practice achieving such complete synchronization may not be feasible. An interesting open question is to characterize the performance loss due to lack of synchronism. In this case, the resulting capacity region need not be convex, as for the collision model without feedback studied by Massey and Mathys [16].

We also assumed that the receiver has perfect CSI, that is, it knows the set of active users. The question, relevant in practice, of how the receiver can acquire such information is not discussed here, and we refer the reader to the recent studies of Fletcher et al. [11], Angelosante et al. [4], and Biglieri and Lops [7], which address the issue using sparse signal representation techniques and random set theory.

Finally, in this paper the transmission probability $p$ and the number of users $m$ play a pivotal role in setting the encoding rate, and these quantities are supposed to be known at the transmitters. The probability $p$ is determined by the burstiness of the sources, while $m$ has to be communicated from the receiver to the transmitters. In practice, our model applies to communication scenarios in which the base station grants access to the uplink channel to $m$ users, but where only a subset of these users actually transmit data.

IX. ACKNOWLEDGMENT

Prof. Young-Han Kim is gratefully acknowledged for many inspiring discussions during the course of the work reported here. A

PPENDIX

I: P

ROOF OF T HEOREM C is a polytope in R deﬁned as the intersection of eight hyperplanes, two of whichrepresenting non-negativity constraints. By the Weyl-Minkowski theorem, C is the convex hull of ﬁnitelymany rate vectors. It is tedious but simple to verify that C = conv { v , . . . , v } October 28, 2018 DRAFT5 = conv   ,  C (P )000  ,  C (P )00  ,  C (P ) C (P )00  ,  C (P )0 C (P )0  ,  C (P )0 C (P )  ,  C (cid:0) P P +1 (cid:1) C (P )0 C (P )  ,  C (P ) C (cid:0) P P +1 (cid:1) C (cid:0) P P +1 (cid:1) ,  C (P ) C (cid:0) P P +1 (cid:1) C (P )0  ,  C (P ) C (cid:0) P P +1 (cid:1) C (P ) C (cid:0) P P +1 (cid:1) ,  C (cid:0) P P +1 (cid:1) C (P ) C (cid:0) P P +1 (cid:1) C (P )  ,  C (P ) C (P ) C (cid:0) P P +1 (cid:1)  ,  C (P ) C (P )0 C (cid:0) P P +1 (cid:1) ,  C (P ) C (P ) C (cid:0) P P +1 (cid:1) C (cid:0) P P +1 (cid:1) . (43)By convexity, it sufﬁces to show that for every i ∈ { , . . . , } , there exists an achievable rate vector r i such that d ( v i , r i ) ≤ . It is straightforward to verify that, for every i ∈ { , . . . , } , v i ∈ C ′ ∪ C ′′ .Thus, d ( v i , r i ) = 0 for all i ∈ { , . . . , } .Consider the rate vector r ∈ C ′′′ obtained by setting equality sign in the inequalities (14) with β = P P and β = 1 , i.e., r = (cid:2) C (P ) + C (cid:16) P − P +1 (cid:17) , C (P ) , C (cid:16) P − P +1 (cid:17) , (cid:3) T . We have that d ( v , r ) ≤ s(cid:12)(cid:12)(cid:12)(cid:12) C (P ) − C (P ) − C (cid:18) P − P + 1 (cid:19)(cid:12)(cid:12)(cid:12)(cid:12) + (cid:12)(cid:12)(cid:12)(cid:12) C (cid:0) P P +1 (cid:1) − C (cid:18) P − P + 1 (cid:19)(cid:12)(cid:12)(cid:12)(cid:12) = s(cid:12)(cid:12)(cid:12)(cid:12)

12 log (cid:18) P − P P P − P (cid:19)(cid:12)(cid:12)(cid:12)(cid:12) + (cid:12)(cid:12)(cid:12)(cid:12)

12 log 2P + 1P + 1 (cid:12)(cid:12)(cid:12)(cid:12) ≤ s(cid:12)(cid:12)(cid:12)(cid:12)

12 log 2P P P P − P (cid:12)(cid:12)(cid:12)(cid:12) + (cid:12)(cid:12)(cid:12)(cid:12)

12 log 2P + 1P + 1 (cid:12)(cid:12)(cid:12)(cid:12) ≤ √ . (44)Next, consider the rate vector r = [ C (P ) , C (P ) , , T ∈ C ′′′ , obtained by setting equality sign in theinequalities (14) with β = 1 and β = 1 . We have that d ( v , r ) = (cid:12)(cid:12)(cid:12) C (cid:0) P P +1 (cid:1)(cid:12)(cid:12)(cid:12) ≤ √ . (45)Finally, the distance between v and r can be bounded as follows d ( v , r ) ≤ s(cid:12)(cid:12)(cid:12)(cid:12) C (P ) − C (P ) − C (cid:18) P − P + 1 (cid:19)(cid:12)(cid:12)(cid:12)(cid:12) + (cid:12)(cid:12)(cid:12)(cid:12) C (cid:0) P P +1 (cid:1) − C (cid:18) P − P + 1 (cid:19)(cid:12)(cid:12)(cid:12)(cid:12) + (cid:12)(cid:12)(cid:12) C (cid:0) P P +1 (cid:1)(cid:12)(cid:12)(cid:12) ≤ √ . (46)Combining (44), (45), and (46) we conclude that d ( v i , r i ) ≤ √ , i ∈ { , , } , which concludes the October 28, 2018 DRAFT6 proof. A

APPENDIX II: PROOF OF THEOREM

A1. . Next, ﬁx i = i = . . . = i m ∈ { , . . . , m } .By Fano’s inequality, we have that, for all r ∈ { , . . . , m } , H (cid:18) r ∪ k =1 W i k ( { i . . . i r } ) | Y i ...i r (cid:19) ≤ nǫ n , (47)where ǫ n → in the limit of n going to inﬁnity. In particular, (47) implies that H ( W i r ( { i . . . i r } ) | Y i ...i r ) ≤ nǫ n . (48)Let K ∈ { , . . . , m } . Then, the following chain of equalities holds: n K X k =1 r i k ( { i . . . i k } )= H (cid:18) K ∪ k =1 W i k ( { i . . . i k } ) (cid:19) = H (cid:18) K ∪ k =1 W i k ( { i . . . i k } ) (cid:12)(cid:12)(cid:12)(cid:12) K ∪ k =1 (cid:8) W i k \ W i k ( { i . . . i k } ) (cid:9) (cid:19) = I (cid:18) K ∪ k =1 W i k ( { i . . . i k } ); Y i ...i K (cid:12)(cid:12)(cid:12)(cid:12) K ∪ k =1 (cid:8) W i k \ W i k ( { i . . . i k } ) (cid:9) (cid:19) + H (cid:18) K ∪ k =1 W i k ( { i . . . i k } ) (cid:12)(cid:12)(cid:12)(cid:12) Y i ...i K , K ∪ k =1 (cid:8) W i k \ W i k ( { i . . . i k } ) (cid:9) (cid:19) (49)The ﬁrst term in the right hand side of (49) can be upper bounded as follows I (cid:18) K ∪ k =1 W i k ( { i . . . i k } ); Y i ...i K (cid:12)(cid:12)(cid:12)(cid:12) K ∪ k =1 (cid:8) W i k \ W i k ( { i . . . i k } ) (cid:9) (cid:19) = H ( Y i ...i K ) − H (cid:18) Y i ...i K (cid:12)(cid:12)(cid:12)(cid:12) K ∪ k =1 W i k (cid:19) ≤ H ( Y i ...i K ) − H (cid:18) Y i ...i K (cid:12)(cid:12)(cid:12)(cid:12) K ∪ k =1 W i k , X i , . . . , X i K (cid:19) = n X t =1 I ( X i ,t , . . . , X i K ,t ; Y i ...i K ,t ) (50)where we use the fact conditioning reduces the entropy and the memoryless property of the channel. Onthe other hand, application of the chain rule on the second term at the left hand side of (49) yields H (cid:18) K ∪ k =1 W i k ( { i . . . i k } ) (cid:12)(cid:12)(cid:12)(cid:12) Y i ...i K , K ∪ k =1 (cid:8) W i k \ W i k ( { i . . . i k } ) (cid:9) (cid:19) October 28, 2018 DRAFT7 = K X r =1 H (cid:18) W i r ( { i . . . 
i r } ) (cid:12)(cid:12)(cid:12)(cid:12) Y i ...i K , K ∪ k =1 (cid:8) W i k \ W i k ( { i . . . i k } ) (cid:9) , K ∪ k = r +1 W i k ( { i . . . i k } ) (cid:19) = K X r =1 H (cid:18) W i r ( { i . . . i r } ) (cid:12)(cid:12)(cid:12)(cid:12) Y i ...i K , r ∪ k =1 (cid:8) W i k \ W i k ( { i . . . i k } ) (cid:9) , K ∪ k = r +1 W i k (cid:19) = K X r =1 H (cid:18) W i r ( { i . . . i r } ) (cid:12)(cid:12)(cid:12)(cid:12) Y i ...i K , r ∪ k =1 (cid:8) W i k \ W i k ( { i . . . i k } ) (cid:9) , K ∪ k = r +1 W i k , K ∪ k = r +1 X i k (cid:19) (51) = K X r =1 H (cid:18) W i r ( { i . . . i r } ) (cid:12)(cid:12)(cid:12)(cid:12) Y i ...i r , r ∪ k =1 (cid:8) W i k \ W i k ( { i . . . i k } ) (cid:9) (cid:19) = K X r =1 H ( W i r ( { i . . . i r } ) | Y i ...i r ) (52) ≤ Knǫ n , (53)where (51) uses the fact that X i k is a function of W i k , (52) uses the fact conditioning reduces the entropy,and (53) follows from (48).Therefore, substituting (50) and (53) into (49), we obtain that n K X k =1 r i k ( { i . . . i k } ) ≤ n X t =1 I ( X i ,t , . . . , X i K ,t ; Y i ...i K ,t ) + nKǫ n , (54)and the claim is completed by introducing a standard timesharing random variable and letting the blocksize n tend to inﬁnity. A PPENDIX

III: PROOF OF THEOREM

Let $\mathcal{P}$ denote the convex subset of $\mathbb{R}^m$ described by inequalities (25a) and (25b). First we prove the converse part, by establishing that $\mathcal{C}_{\boldsymbol{\rho}} \subseteq \mathcal{P}$. As a first step, we derive a useful identity. Let $k \in \{1, \dots, m\}$. Then,

$$\begin{aligned}
\sum_{i_1 \ne \dots \ne i_m \in \{1,\dots,m\}} r_{i_k}(\{i_1 \dots i_k\}) &= (m-k)! \sum_{i_1 \ne \dots \ne i_k \in \{1,\dots,m\}} r_{i_k}(\{i_1 \dots i_k\}) \\
&= (m-k)!\,(k-1)! \sum_{\substack{A \subseteq \{1,\dots,m\} \\ |A| = k}} \sum_{i \in A} r_i(A) \\
&= (m-k)!\,(k-1)!\, \rho_k, \qquad (55)
\end{aligned}$$

where the second equality uses the fact that $r_{i_k}(\{i_1 \dots i_k\}) = r_{i_k}(\{i_{\sigma_1}, \dots, i_{\sigma_{k-1}}, i_k\})$ for any permutation $\boldsymbol{\sigma}$ over the set $\{1, \dots, k-1\}$. Now we can establish the necessity of (25b). It follows from (24) that the following inequality has to hold

$$\sum_{k=1}^{m} r_{i_k}(\{i_1 \dots i_k\}) \le 1, \qquad (56)$$

for all $i_1 \ne \dots \ne i_m \in \{1, \dots, m\}$. By summing both sides of (56) over all permutations of the first $m$ integers, we obtain

$$\sum_{i_1 \ne \dots \ne i_m \in \{1,\dots,m\}} \sum_{k=1}^{m} r_{i_k}(\{i_1 \dots i_k\}) \le m!. \qquad (57)$$

By means of (55), (57) can be re-written as

$$\sum_{k=1}^{m} (m-k)!\,(k-1)!\, \rho_k \le m!. \qquad (58)$$

Dividing both sides of (58) by $m!$, we conclude that (25b) is a necessary condition for the achievability of a rate vector $\boldsymbol{\rho}$.

Next, note from (23) that $r_i(A) \ge r_i(B)$ for all $i \in A \subseteq B \subseteq \{1, \dots, m\}$ is a necessary condition for the achievability of a rate vector $\{r_i(A)\}$. By summing these inequalities over all $B$ having cardinality $|A| + 1$, we obtain that

$$r_i(A) \ge \frac{1}{m - |A|} \sum_{\substack{B:\ i \in A \subseteq B \subseteq \{1,\dots,m\} \\ |B| = |A| + 1}} r_i(B). \qquad (59)$$

Next, observe that, for every $k \in \{1, \dots, m-1\}$,

$$\begin{aligned}
\rho_k &= \sum_{\substack{A \subseteq \{1,\dots,m\} \\ |A| = k}} \sum_{i \in A} r_i(A) = \sum_{i=1}^{m} \sum_{\substack{A \subseteq \{1,\dots,m\} \\ i \in A,\ |A| = k}} r_i(A) \\
&\ge \sum_{i=1}^{m} \sum_{\substack{A \subseteq \{1,\dots,m\} \\ i \in A,\ |A| = k}} \frac{1}{m-k} \sum_{\substack{B:\ i \in A \subseteq B \subseteq \{1,\dots,m\} \\ |B| = k+1}} r_i(B) \qquad (60) \\
&= \frac{1}{m-k} \sum_{i=1}^{m} \sum_{\substack{B \subseteq \{1,\dots,m\} \\ i \in B,\ |B| = k+1}} \ \sum_{\substack{A \subseteq B \\ i \in A,\ |A| = k}} r_i(B) \qquad (61) \\
&= \frac{k}{m-k} \sum_{i=1}^{m} \sum_{\substack{B \subseteq \{1,\dots,m\} \\ i \in B,\ |B| = k+1}} r_i(B) = \frac{k}{m-k}\, \rho_{k+1}, \qquad (62)
\end{aligned}$$

where (60) follows from (59), while (61) is obtained by observing that there are $k$ subsets of $B$ which have cardinality $k$ and contain the element $i$. After multiplying both sides of (62) by $(m-k)!\,(k-1)!/m!$ and rearranging the terms, we obtain the desired inequality

$$\frac{\rho_k}{k\binom{m}{k}} \ge \frac{\rho_{k+1}}{(k+1)\binom{m}{k+1}},$$

which proves (25a). In summary, we showed that inequalities (25a) and (25b) are necessary conditions for the achievability of a rate vector $\boldsymbol{\rho}$, i.e., $\mathcal{C}_{\boldsymbol{\rho}} \subseteq \mathcal{P}$.

Next, we prove the achievability of $\mathcal{P}$, establishing the reversed inclusion $\mathcal{P} \subseteq \mathcal{C}_{\boldsymbol{\rho}}$. To do so, it suffices to show that the extreme points of $\mathcal{P}$ are achievable, as the rest of the region can be achieved by means of a time-sharing protocol. We claim that

$$\mathcal{P} = \operatorname{conv}\left\{ \mathbf{0}, \left\{ \frac{1}{k} \sum_{i=1}^{k} i \binom{m}{i} \mathbf{e}_i \right\}_{k=1}^{m} \right\}, \qquad (63)$$

where the vector $\mathbf{e}_i$ denotes the $i$th unit vector in $\mathbb{R}^m$. To see this, consider an invertible linear transformation $L : \mathbb{R}^m \to \mathbb{R}^m$ given by

$$\begin{cases} x_m = \dfrac{\rho_m}{m\binom{m}{m}}, \\[2mm] x_k = \dfrac{\rho_k}{k\binom{m}{k}} - \dfrac{\rho_{k+1}}{(k+1)\binom{m}{k+1}}, & k \in \{1, \dots, m-1\}. \end{cases} \qquad (64)$$

It is straightforward to check that the image of $\mathcal{P}$ under $L$ is given by the oriented $m$-simplex $L\mathcal{P} = \{\mathbf{x} \in \mathbb{R}^m_+ : \sum_{k=1}^{m} k x_k \le 1\} = \operatorname{conv}\{\mathbf{0}, \mathbf{v}'_1, \dots, \mathbf{v}'_m\}$, wherein $\mathbf{v}'_k = \frac{1}{k}\mathbf{e}_k$. Since $L$ is invertible, the extreme points of $\mathcal{P}$ can be obtained by applying $L^{-1}$ to the extreme points of $L\mathcal{P}$. Thus, $\mathcal{P} = \operatorname{conv}\{\mathbf{0}, \boldsymbol{\rho}_1, \dots, \boldsymbol{\rho}_m\}$ where

$$\boldsymbol{\rho}_k = L^{-1}\mathbf{v}'_k = \frac{1}{k} \sum_{i=1}^{k} i \binom{m}{i} \mathbf{e}_i, \qquad (65)$$

$k \in \{1, \dots, m\}$. Hence (63) is proved.

Next, we show that each rate vector $\boldsymbol{\rho}_k$ given by (65) is achievable. Consider the following message structure:

$$W_i = \{W_{i,1}, \dots, W_{i,m}\} \qquad (66)$$

and

$$W_i(A) = \begin{cases} \cup_{j \ge |A|} W_{i,j}, & i \in A; \\ \emptyset, & i \notin A. \end{cases} \qquad (67)$$

It is immediate to verify that the above sets satisfy the conditions beginning with A1., so the message structure is well defined. For every $i$, sender $i$ transmits $m$ independent messages $\{W_{i,1}, \dots, W_{i,m}\}$ encoded at rates $\{R_{i,1}, \dots, R_{i,m}\}$. For every $k \in \{1, \dots, m\}$, the $k$th message $W_{i,k}$ is decoded at receiver $A$ if $i \in A$ and if $|A| \le k$, that is, if user $i$ is active and there are at most $k$ active users. To achieve the rate vector $\boldsymbol{\rho}_k$ it suffices to set $R_{i,k} = \frac{1}{k}$ for all $i$, and the other rates equal to zero, that is, each sender $i$ transmits a single message $W_{i,k}$ encoded at rate $\frac{1}{k}$. Encoding is performed by means of a standard multiple-access random codebook. It follows from (67) that receiver $A$ decodes $W_{i,k}$ if $i \in A$ and $|A| \le k$. Thus, we have

$$r_i(A) = \begin{cases} \frac{1}{k}, & i \in A \text{ and } |A| \le k; \\ 0, & \text{otherwise}. \end{cases} \qquad (68)$$

Observe that for every receiver $A$ the sum of the rates of the decoded messages is at most 1. It follows that decoding can be performed by means of a standard $k$-user multiple-access decoder. By plugging (68) into (21), we obtain that

$$\rho_{k,i} = \begin{cases} \frac{i}{k}\binom{m}{i}, & \text{if } i \in \{1, \dots, k\}, \\ 0, & \text{otherwise}, \end{cases} \qquad (69)$$

hence (65) is achievable.

APPENDIX IV: PROOF OF THEOREM

Lemma 16:

Let $k \in \{1, \dots, m-1\}$. There exists a $p_k \in \left(0, \frac{k}{m}\right]$ such that

$$\frac{1}{k} F_{m-1,k-1}(p) - \frac{1}{k+1} F_{m-1,k}(p) \ \begin{cases} > 0, & p < p_k, \\ = 0, & p = p_k, \\ < 0, & p > p_k. \end{cases} \qquad (70)$$

Proof:

Define $f(p) = \frac{1}{k} F_{m-1,k-1}(p) - \frac{1}{k+1} F_{m-1,k}(p)$. The binomial sum $F_{m-1,k-1}(p)$ is related to the incomplete Beta function by [1, (6.6.4) page 263]

$$F_{m-1,k-1}(p) = 1 - k\binom{m-1}{k} \int_0^p t^{k-1} (1-t)^{m-1-k}\, dt. \qquad (71)$$

Substituting (71) into the definition of $f(p)$ and differentiating, we obtain the following expression for the derivative of $f$ with respect to $p$:

$$f'(p) = -\frac{1}{p(1-p)}\, f_{m-1,k}(p) \left[ 1 - \frac{pm}{k+1} \right].$$

By studying the sign of $f'(p)$ one can see that $f(p)$ is a strictly decreasing function of $p$ in the range $\left(0, \frac{k+1}{m}\right)$, reaches a minimum at $p = \frac{k+1}{m}$ and is strictly increasing in the interval $\left(\frac{k+1}{m}, 1\right)$. We have $f(1) = 0$, and the Taylor expansion centered at $p = 1$ shows that $f(p)$ increases to zero as $p$ tends to one. Thus, $f\left(\frac{k+1}{m}\right) < 0$. Note that $f(0) > 0$ so, by the monotonicity of $f$ and by the intermediate value theorem, there exists a unique $p_k \in \left(0, \frac{k+1}{m}\right)$ such that

$$f(p) \ \begin{cases} > 0, & p < p_k, \\ = 0, & p = p_k, \\ < 0, & p > p_k. \end{cases} \qquad (72)$$

To complete the proof, we show that $p_k \le \frac{k}{m}$. Direct computation shows that $p_1 = 1/m$, while for $k \in \{2, \dots, m-1\}$, we have that

$$\begin{aligned}
f\left(\tfrac{k}{m}\right) &= \frac{1}{k} F_{m-1,k-1}\left(\tfrac{k}{m}\right) - \frac{1}{k+1} F_{m-1,k}\left(\tfrac{k}{m}\right) \\
&= \left( \frac{1}{k} - \frac{1}{k+1} \right) F_{m-1,k-1}\left(\tfrac{k}{m}\right) - \frac{1}{k+1} f_{m-1,k}\left(\tfrac{k}{m}\right) \\
&< \left( \frac{1}{k} - \frac{1}{k+1} \right) k\, f_{m-1,k-1}\left(\tfrac{k}{m}\right) - \frac{1}{k+1} f_{m-1,k-1}\left(\tfrac{k}{m}\right) = 0, \qquad (73)
\end{aligned}$$

where the inequality follows from the fact that $f_{m-1,i}\left(\frac{k}{m}\right) \le f_{m-1,k-1}\left(\frac{k}{m}\right)$ for $i \in \{0, \dots, k-1\}$, with equality iff $i = k-1$, and that $f_{m-1,k-1}\left(\frac{k}{m}\right) = f_{m-1,k}\left(\frac{k}{m}\right)$ for $k \in \{1, \dots, m-1\}$. Thus, (72) and (73) show that $p_k < \frac{k}{m}$ for $k \ge 2$, so that $p_k \le \frac{k}{m}$ in all cases, as claimed.

Roughly speaking, the above lemma says that to achieve the throughput the encoding rate has to decrease as the transmission probability increases. The second lemma shows that $1/k$ is the optimal encoding rate when $p$ is in the $k$th interval of the partition $\Pi_m$.

Lemma 17:

Let $k \in \{1, \dots, m\}$. Define $p_0 \triangleq 0$ and $p_m \triangleq 1$, and let $\{p_k\}_{k=1}^{m-1}$ be as in Lemma 16. Then,

$$\frac{1}{k} F_{m-1,k-1}(p) \ge \frac{1}{j} F_{m-1,j-1}(p), \quad j \in \{1, \dots, m\}, \qquad (74)$$

for $p \in [p_{k-1}, p_k]$.

Proof:

In virtue of Lemma 16, it suffices to show that $p_k < p_{k+1}$, for $k \in \{0, \dots, m-2\}$. As $p_1 \in (0, 1/m]$, it follows that $p_0 < p_1$. Next, suppose that $k \in \{1, \dots, m-2\}$. Lemma 16 shows that $\frac{1}{k} F_{m-1,k-1}(p_k) = \frac{1}{k+1} F_{m-1,k}(p_k)$ and that $p_k \in \left(0, \frac{k}{m}\right]$. Thus, we have

$$\begin{aligned}
&\frac{1}{k+1} F_{m-1,k}(p_k) - \frac{1}{k+2} F_{m-1,k+1}(p_k) \\
&\quad = \frac{1}{k} F_{m-1,k-1}(p_k) - \frac{1}{k+2} F_{m-1,k+1}(p_k) \\
&\quad = \left( \frac{1}{k} + \frac{1}{k+2} \right) F_{m-1,k-1}(p_k) - \frac{1}{k+2} \bigl( F_{m-1,k-1}(p_k) + F_{m-1,k+1}(p_k) \bigr) \\
&\quad > \frac{2(k+1)}{k(k+2)} F_{m-1,k-1}(p_k) - \frac{2}{k+2} F_{m-1,k}(p_k) \\
&\quad = \frac{2(k+1)}{k(k+2)} F_{m-1,k-1}(p_k) - \frac{2(k+1)}{k(k+2)} F_{m-1,k-1}(p_k) = 0, \qquad (75)
\end{aligned}$$

where the inequality uses the fact that $F_{m-1,k-1}(p_k) + F_{m-1,k+1}(p_k) < 2F_{m-1,k}(p_k)$ for $p_k \le k/m$. Comparing (70) and (75), we obtain the desired inequality $p_k < p_{k+1}$.

Using the above lemma, it is immediate to prove Theorem 8.

Proof:

Observe that the optimum value of a linear program, if it exists, is always achieved at one of the extreme points of the feasibility set. Thus, (63) implies that

$$\begin{aligned}
T(p, m, \mathrm{P}) &= \max_{k \in \{1,\dots,m\}} \frac{1}{k} \sum_{i=1}^{k} i \binom{m}{i} p^i (1-p)^{m-i} \\
&= \max_{k \in \{1,\dots,m\}} m p\, \frac{1}{k}\, F_{m-1,k-1}(p) \\
&= m p \sum_{k=1}^{m} \frac{1}{k}\, F_{m-1,k-1}(p)\, \mathbf{1}_{\{p \in (p_{k-1}, p_k]\}},
\end{aligned}$$

where the last equality follows from Lemma 17.

APPENDIX

V: P

ROOF OF T HEOREM c k , C ( k P) . In order to evaluate T ( p, m, P) , it is convenient to make the change of variable  x m = ρ m m (cid:0) mm (cid:1) ,x k = ρ k k (cid:0) mk (cid:1) − ρ k +1 ( k +1) (cid:0) mk +1 (cid:1) , k ∈ { , . . . , m − } . (76)Substituting the new variables into (22), (25a), and (25b) and performing a modicum of algebra, weobtain, T ( p, m, P) = max x ∈ C x ,m mp m X i =1 F m − ,i − ( p ) x i , (77) October 28, 2018 DRAFT3 where C x ,m denote the set of rates { x k } ∈ R m + such that K − X k =1 kx k + K m X k = K x k ≤ c K (78)for every K ∈ { , . . . , m } . Observe that the optimum value of the linear program (77) is achieved atone of the extreme point of the feasibility set. Therefore, to prove the theorem it sufﬁces to show that { v k } mk =1 as deﬁned in (36-39) are extreme points of C x ,m , and that the objective function in (77) reachesa strict local maximum at v k when p is in the k th interval of the partition Π m (P) .For every k ∈ { , . . . , m } , it is straightforward to check that v k satisﬁes (78) for K ∈ { k, . . . , m } ,and that v k has k − zero components. Thus, we conclude that v k is an extreme point of C x ,m .Next, we establish that if p ∈ [ p k − , p k ] , where { p k − } are deﬁned in Lemma 16, then the objectivefunction reaches a local maximum at v k . We proceed by showing that the objective function at v k is strictly greater than at any of its neighboring extreme points. By deﬁnition, two extreme points areneighbors if they are connected by an edge. It is possible to show that v k has exactly m neighbor extremepoints, which we denote by n n ( k ) j o mj =1 . The proof of this fact is straightforward albeit fairly lengthy, sois not reported here. For k ∈ { , . . . , m − } , we have that • If j ∈ { , . . . , k − } , then n ( k ) j,i =  kj ( k − j ) c j − k − j c k , i = j, , i ∈ { , . . . , j − } ∪ { j + 1 , . . . 
, k − } , k − j +1 j ( k − j ) c k − k − j c j − c k +1 , i = k,v k,i , i ∈ { k + 1 , . . . , m } (79) • If j = k , then n ( k ) j = v k +1 . • If j ∈ { k + 1 , . . . , m − } , then n ( k ) j,i =  c j − − c j − − c j +1 , i = j − , , i = j, c j +1 − c j +2 − c j − , i = j + 1 ,v k,i , i ∈ { , . . . , j − } ∪ { j + 2 , . . . , m } (80) October 28, 2018 DRAFT4 • If j = m − , then n ( k ) m − ,i =  c m − − c m − − c m , i = m − , , i = m − , c m − c m − , i = m,v k,i , i ∈ { , . . . , m − } (81) • Finally, if j = m then n ( k ) m,i =  c m − − c m − , i = m − , , i = m,v k,i , i ∈ { , . . . , m − } , (82)On the other hand, for k = m and j ∈ { , . . . , m − } , we have that n ( m ) j,i =  mj ( m − j ) c j − m − j c m , i = j, m − j c m − m − j c j , i = m, (83)It can be immediately veriﬁed that n n ( k ) j o mj =1 as deﬁned above are extreme points of C x ,m , and neighborsof v k .Next, we establish that the objective function in (77) reaches a local maximum at v k by comparingthe value achieved at v k to the one at its neighboring extreme points. First, suppose k ∈ { , . . . , m − } . • If j ∈ { , . . . , k − } , we can observe, from plugging (79) into (77) and performing some algebraicmanipulations, that mp m X i =1 ( v k,i − n ( k ) j,i ) F m − ,i − ( p )= F m − ,k − ( p ) (cid:18) k − j c j − jk ( k − j ) c k (cid:19) − F m − ,j − ( p ) (cid:18) kj ( k − j ) c j − k − j c k (cid:19) > F m − ,k − ( p ) (cid:18) k − j c j − jk ( k − j ) c k (cid:19) − jk F m − ,k − ( p ) (cid:18) kj ( k − j ) c j − k − j c k (cid:19) = 0 , because F j − < jk F m − ,k − ( p ) if p is in the k th interval of the partition Π m (P) . • If j = k , then mp m X i =1 ( v k,i − n ( k ) k,i ) F m − ,i − ( p ) October 28, 2018 DRAFT5 = 1 k + 1 (cid:18) k + 1 k c k − c k − (cid:19) (cid:18) k F m − ,k − ( p ) − k + 1 F m − ,k ( p ) (cid:19) > , • If j ∈ { k + 1 , . . . 
, m − } , then mp m X i =1 ( v k,i − n ( k ) j,i ) F m − ,i − ( p ) == 2 (cid:18) c j − c j − + c j +1 (cid:19) (cid:18) F m − ,k ( p ) − F m − ,k − ( p ) + F m − ,k +1 ( p )2 (cid:19) > , • If j = m , then mp m X i =1 ( v k,i − n ( k ) m,i ) F m − ,i − ( p ) = ( c m − c m − ) ( F m − ,m − ( p ) − F m − ,m − ( p )) > , Next, suppose k = m . Compare the utility function at v m and n ( m ) j . mp m X i =1 ( v k,i − n ( k ) m,i ) F m − ,i − ( p ) == F m − ,m − ( p ) (cid:18) m − j c j − jm ( m − j ) c m (cid:19) − F m − ,j − ( p ) (cid:18) mj ( m − j ) c j − jm − j c m (cid:19) = F m − ,m − ( p ) (cid:18) m − j c j − jm ( m − j ) c m (cid:19) − jm F m − ,m − ( p ) (cid:18) mj ( m − j ) c j − jm − j c k (cid:19) = 0 . Therefore, we have established that the objective function reaches a local maximum at v k and completedthe proof. A PPENDIX

VI: PROOF OF THEOREM 15

Let $c_k \triangleq C(k\mathrm{P})$. For every $k \in \{1, \dots, m\}$, if $p \in (p_{k-1}, p_k]$ we have that

$$\underline{T}(p, m, \mathrm{P}) \ge m p\, \frac{c_k}{k}\, F_{m-1,k-1}(p). \qquad (84)$$

In particular, equality holds in (84) when p ∈ [max( p k − , p k − (P)) , min( p k , p k (P))] . It follows that T ( p, m, P) − T ( p, m, P) ≤ mp m X i =1 v k,i F m − ,i − ( p ) − mp c k k F m − ,k − ( p ) . (85)for p ∈ ( p k − , p k ] . To prove the theorem, we show that the right hand side of (85) is upper boundedby one for every k ∈ { , . . . , m } . First, we consider the case k = 1 . By substituting (36) into (85), weobtain that mp " m X i =1 v ,i F m − ,i − ( p ) − C (P) F m − , ( p ) = mp  c − c ) F m − , ( p ) + m X j =1 ( c j +1 − c j ) f m − ,j  = mp  F m − , ( p ) + m − X j =1 j f m − ,j  = 32 f m, ( p ) + m − X j =1 j + 12 j f m,j +1 ≤ f m, ( p ) + m X j =2 f m,j = 32 f m, ( p ) + (1 − f m, ( p ) − f m, ( p )) ≤ where the second equality uses the fact that c j +1 − c j ≤ / (2 j ) , while the last equality follows from f m, ( p ) ≥ f m, ( p ) for p ∈ (0 , /m ] . Similarly, from (37) we obtain that, for every k ∈ { , . . . , m − } , mp " m X i =1 v k,i F m − ,i − ( p ) − c k k F m − ,k − ( p ) = mp " ( c k − c k − ) F m − ,k − ( p ) + m − X i = k +1 (2 c i − c i − − c i +1 ) F m − ,i − ( p )+( c m − c m − ) F m − ,m − ( p )]= mp " c k − c k − + m − X i = k +1 (cid:0) c i − c i − − c i +1 (cid:1) + c m − c m − ! F m − ,k − ( p )+ m X j = k +1 m − X i = j (cid:0) c i − c i − − c i +1 + c m − c m − (cid:1) f m − ,j − ( p )  October 28, 2018 DRAFT7 = mp m X j = k +1 ( c j − c j − ) f m − ,j − ( p ) ≤ mp m X j = k +1 j − f m − ,j − ( p )= mp m − X j = k j f m − ,j ( p ) ≤ The proof is concluded observing that we have T ( p, m, P) = T ( p, m, P) when k = m .R EFERENCES [1] M. Abramowitz and I. A. Stegun,

Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, 9th ed. New York: Dover, 1972.

[2] N. Abramson, “The ALOHA system – Another alternative for computer communications,” in Proc. Fall Joint Computer Conf., AFIPS Conf., vol. 37, 1970, pp. 281–285.

[3] R. Ahlswede, “Multi-way communication channels,” in Proc. 2nd Int. Symp. Information Theory, Tsahkadsor, Armenian S.S.R., 1971, Hungarian Acad. Sc., pp. 23–52, 1973.

[4] D. Angelosante, E. Biglieri, and M. Lops, “Low-complexity receivers for multiuser detection with an unknown number of active users,” IEEE J. Sel. Areas Comm., submitted 2007.

[5] S. Avestimehr, S. Diggavi, and D. N. C. Tse, “A deterministic approach to wireless relay networks,” in Proc. Allerton Conf. on Communication, Control, and Computing, Illinois, USA, Sept. 2007.

[6] E. Biglieri, J. G. Proakis, and S. Shamai (Shitz), “Fading Channels: Information-Theoretic and Communication Aspects,” IEEE Trans. Inform. Theory, vol. IT-44, no. 6, pp. 2619–2692, Oct. 1998.

[7] E. Biglieri and M. Lops, “Multiuser Detection in a Dynamic Environment Part I: User Identification and Data Detection,” IEEE Trans. Inform. Theory, vol. IT-53, no. 9, pp. 3158–3170, Sept. 2007.

[8] Y. Cemal and Y. Steinberg, “The multiple-access channel with partial state information at the encoders,” IEEE Trans. Inform. Theory, vol. IT-51, no. 11, pp. 3992–4003, Nov. 2005.

[9] T. M. Cover, “Broadcast channels,” IEEE Trans. Inform. Theory, vol. IT-18, no. 1, pp. 2–14, Jan. 1972.

[10] A. Ephremides and B. Hajek, “Information theory and communication networks: an unconsummated union,” IEEE Trans. Inform. Theory, vol. IT-44, no. 6, pp. 2416–2434, Oct. 1998.

[11] A. K. Fletcher, S. Rangan, and V. K. Goyal, “On-Off random access channels: a compressed sensing framework,” arXiv:0903.1022.

[12] R. G. Gallager, “A perspective on multiaccess channels,” IEEE Trans. Inform. Theory, vol. IT-31, no. 2, pp. 124–142, Mar. 1985.

[13] S. Ghez, S. Verdú, and S. Schwartz, “Stability properties of slotted ALOHA with multipacket reception capability,” IEEE Trans. Autom. Control, vol. 33, no. 7, pp. 640–649, Jul. 1988.

[14] C. Hwang, M. Malkin, A. El Gamal, and J. M. Cioffi, “Multiple-access channels with distributed channel state information,” in Proc. of IEEE Symposium on Information Theory, pp. 1561–1565, 24–29 June 2007, Nice, France.

[15] H. Liao, “A coding theorem for multiple access communications,” in Proc. Int. Symp. Information Theory, Asilomar, CA, 1972.

[16] J. L. Massey and P. Mathys, “The collision channel without feedback,” IEEE Trans. Inform. Theory, vol. IT-31, no. 2, pp. 192–204, Mar. 1985.

[17] M. Medard, J. Huang, A. J. Goldsmith, S. P. Meyn, and T. P. Coleman, “Capacity of time-slotted ALOHA packetized multiple-access systems over the AWGN channel,” IEEE Trans. on Wireless Communications, vol. 3, no. 2, pp. 486–499, Mar. 2004.

[18] P. Minero and D. N. C. Tse, “A broadcast approach to multiple access with random states,” in Proc. of IEEE Symposium on Information Theory, pp. 2566–2570, 24–29 June 2007, Nice, France.

[19] V. Naware, G. Mergen, and L. Tong, “Stability and delay of finite user slotted ALOHA with multipacket reception,” IEEE Trans. Inform. Theory, vol. 51, no. 7, pp. 2636–2656, Jul. 2005.

[20] S. Shamai (Shitz), “A broadcast approach for the multiple-access slow fading channel,” in Proc. of IEEE Symposium on Information Theory, p. 128, 25–30 June 2000, Sorrento, Italy.

[21] A. Steiner and S. Shamai (Shitz), “The broadcast approach in communications systems,” Dec. 3–5, 2008, Eilat, Israel.

[22] S. Shamai (Shitz) and A. Steiner, “A broadcast approach for a single-user slowly fading MIMO channel,” IEEE Trans. Inform. Theory, vol. 49, no. 10, pp. 2617–2635, Oct. 2003.