Distributed User Association in B5G Networks Using Early Acceptance Matching Games

Abstract

We study distributed user association in 5G and beyond millimeter-wave enabled heterogeneous networks using matching theory. We propose a novel and efficient distributed matching game, called early acceptance (EA), which allows users to apply for association with their ranked-preference base station in a distributed fashion and get accepted as soon as they are in the base station's preference list with available quota. Several variants of the EA matching game with preference list updating and reapplying are compared with the original and stability-optimal deferred acceptance (DA) matching game, which implements a waiting list at each base station and delays user association until the game finishes. We show that matching stability need not lead to optimal performance in other metrics such as throughput. Analysis and simulations show that, compared to DA, the proposed EA matching games achieve higher network throughput while exhibiting a significantly faster association process. Furthermore, the EA games, whether played once or multiple times, can closely approach the network utility of centralized user association while having much lower complexity.



Alireza Alizadeh, Student Member, IEEE, and Mai Vu, Senior Member, IEEE


Index Terms: Matching theory, user association, early acceptance, mmWave-enabled networks.

I. Introduction

5G and beyond heterogeneous networks (HetNets) will utilize both sub-6 GHz and millimeter wave (mmWave) frequency bands. These networks will be highly dense and composed of different types of base stations (BSs) with different sizes, transmit powers, and capabilities. In these dense multi-tier networks, finding the optimal user association is a challenging problem: it defines the best possible connections between BSs and user equipment (UEs) to achieve optimal network performance while satisfying the BSs' load constraints. The traditional association method of connecting to the BS with the highest SINR may overload certain BSs and no longer works well in a HetNet with different BS transmit powers and under highly directional and variable mmWave channel conditions. Furthermore, the user association process needs to be fast and efficient to adapt to the low-latency requirements of beyond 5G (B5G) networks.

In a user association problem, the presence or absence of a connection between a UE and a BS is usually indicated by an integer variable with value one or zero, making it an integer optimization problem. In this paper, we focus on unique user association, in which each UE can only be associated with one BS at a time. In B5G HetNets, UEs will be able to work in dual connectivity mode as they are equipped with a multi-mode modem supporting both sub-6 GHz and mmWave bands, allowing association with either a macro cell BS (MCBS) or a small cell BS (SCBS).

A. Background and Related Works

User association for load balancing in LTE HetNets is studied in [1], where a user association algorithm is introduced by relaxing the unique association constraint and then using a rounding method to obtain unique association coefficients. In [2], the researchers studied optimal user association in massive MIMO networks and solved the user association problem using Lagrangian duality. User association in a 60-GHz wireless network is studied in [3], where the researchers assumed that interference is negligible due to directional steerable antenna arrays and the strong attenuation of signals with distance at 60 GHz. This assumption, however, becomes inaccurate at other mmWave frequencies considered for cellular use, as it has been shown that mmWave systems can transition from noise-limited to interference-limited regimes [4], [5].

The unique user association problem usually results in a complex integer nonlinear program, which is NP-hard in general. Heuristic algorithms have been designed to solve this problem and achieve a near-optimal solution [6], [7]. These algorithms often require centralized implementation, which usually suffers from high computational complexity and relies on a central coordinator to collect the channel state information (CSI) between all BSs and UEs. As a result, practical implementation of these algorithms in mmWave-enabled networks is potentially inefficient given the erratic nature of mmWave channels. The authors in [8] proposed two distributed user association methods for a 60-GHz mmWave network. They only considered a simple mmWave channel model based on large-scale CSI and antenna gains, and assumed that the interference is negligible because of highly directional transmissions.

Matching theory has been proposed as a promising low-complexity mathematical framework with interesting applications from labor markets to wireless networks. In a pioneering work [9], Gale and Shapley introduced a matching game, called the deferred acceptance (DA) game, to solve the problem of one-to-one and many-to-one matching. The DA matching game has recently found applications for user association in wireless networks. For example, the DA game is employed for resource management in wireless networks using algorithmic implementations in [10]. A two-tier matching game is proposed for user association based on task offloading to BSs jointly with collaboration among BSs for task transferring to underloaded BSs [11]. Matching algorithms based on the DA game have also been used for user association in the downlink of small cell networks in [12] and for the uplink in [13].

B. Our Contributions

We propose matching-theory-based user association schemes designed by considering system aspects specifically relevant to B5G cellular networks. We consider a mmWave clustered channel model for generating the mmWave links. In the system model, the UEs have dual connectivity and are equipped with both sub-6 GHz and mmWave antenna arrays. The effect of directional transmissions in B5G systems is directly integrated via beamforming transmission and reception in both frequency bands. Moreover, we take into account the dependency between user association and interference, which is crucial for beamforming transmission, while previous works either ignore interference or assume it to be independent of user association. Such an assumption is applicable in LTE cellular networks because of omni-directional transmissions, but is no longer suitable for B5G mmWave-enabled networks because of directional transmissions by beamforming at both BSs and UEs.

Existing matching theory user association works are based exclusively on the DA game. One important issue with the DA game is excessive association delay in a distributed implementation, as the associations of all UEs are postponed to the end of the game. This is due to the presence of a waiting list at each BS, described in more detail in Sec. III-C. In [14], we proposed a new matching game with lower delay and provided some preliminary results. In this paper, we propose a set of low-delay distributed matching games tailored for user association in B5G cellular networks. The main contributions of this paper are:

• We introduce a distributed user association framework, in which UEs and BSs exchange application and response messages to make decisions, and define relevant metrics to assess its performance. We show that stability of a matching game does not directly lead to optimality in other objectives such as the network throughput. Instead of stability, which is also less relevant in a dynamic network, we consider performance metrics for distributed user association in terms of association delay, users' power consumption in the association process, percentage of unassociated users, and network utility including throughput or spectral efficiency.

• We propose three novel and purely distributed matching games, called early acceptance (EA), and compare them with the well-known DA matching game. Unlike DA, which postpones association decisions to the last iteration of the game, our proposed EA games allow the BSs to decide immediately whether to accept or reject UEs. This approach results in a significantly faster association process and at the same time a slightly higher network throughput, and hence presents a better choice for association in fast-varying mmWave systems.

• Our proposed EA games follow a set of similar rules, but differ in how the UE/BS preference lists are updated and whether UEs reapply to BSs. Numerical results show that the basic EA game without updating or reapplying is the fastest, whereas the EA game with preference list updating and reapplication leads to the highest percentage of associated UEs. Thus, there is a tradeoff between the game simplicity and the number of unassociated UEs.

• We also propose a multi-game user association algorithm to further improve the network throughput by performing multiple rounds of a matching game. The proposed algorithm requires minimal centralized coordination and, as the number of rounds of the game increases, closely approaches the performance of the centralized worst connection swapping (WCS) algorithm introduced in [7].

• Our simulations show that the proposed EA games have comparable performance with the DA game in terms of power consumption and signaling overhead, while resulting in a slightly higher network utility and a superior performance in terms of association delay. Considering the fact that the number of UEs to be associated is usually different from the total quota of BSs, we show that our proposed EA games are effective in all loading scenarios (underload, critical load, and overload).

C. Notation

In this paper, scalars and sets are denoted by italic letters (e.g., $x$ or $X$) and calligraphic letters (e.g., $\mathcal{X}$), respectively. Vectors are represented by lowercase boldface letters (e.g., $\mathbf{x}$), and matrices by uppercase boldface letters (e.g., $\mathbf{X}$). Superscripts $(\cdot)^T$ and $(\cdot)^*$ represent the transpose operator and the conjugate transpose operator, respectively. $\log(\cdot)$ stands for the base-2 logarithm, and big-O notation $\mathcal{O}(\cdot)$ expresses the complexity. $|\mathcal{X}|$ denotes the cardinality of set $\mathcal{X}$. $\mathbf{I}_N$ is the $N \times N$ identity matrix, and $|\mathbf{X}|$ denotes the determinant of matrix $\mathbf{X}$.

II. System and User Association Models

We study the problem of user association in the downlink of a two-tier HetNet with $B$ macro cell BSs (MCBSs) operating in the microwave band, $S$ small cell BSs (SCBSs) working in the mmWave band, and $K$ UEs. Let $\mathcal{B}$, $\mathcal{S}$, and $\mathcal{J} = \{1, ..., J\}$ denote the respective sets of MCBSs, SCBSs, and all BSs with $J = B + S$, and let $\mathcal{K} = \{1, ..., K\}$ represent the set of UEs. Each BS $j$ has $M_j$ antennas, and each UE $k$ is equipped with two antenna modules: 1) a single antenna for LTE connections in the microwave band, and 2) an antenna array with $N_k$ elements for 5G connections in the mmWave band. Each UE $k$ aims to receive $n_k$ data streams from its serving BS such that $1 \le n_k \le N_k$, where the upper inequality indicates that the number of data streams for each UE cannot exceed the number of its antennas. We also assume that each UE supports dual connectivity so that association with either an MCBS or an SCBS is possible.

A. Microwave and mmWave Channel Models

In this subsection, we introduce the microwave and mmWave channel models. In the microwave band, the transmissions are omnidirectional and we use the well-known Gaussian channel model [15]. We denote $\mathbf{h}^{\mu W}$ as the channel vector between an MCBS and a UE, where its entries are i.i.d. complex Gaussian random variables with zero mean and unit variance, i.e., $h^{\mu W} \sim \mathcal{CN}(0, 1)$. In the mmWave band, the transmissions are highly directional and the Gaussian MIMO channel model no longer applies. We employ the specific clustered mmWave channel model, which includes $C$ clusters with $L$ rays per cluster, defined as [16], [17]

$$\mathbf{H}^{mmW} = \frac{1}{\sqrt{CL}} \sum_{c=1}^{C} \sum_{l=1}^{L} \sqrt{\gamma_c}\, \mathbf{a}(\phi^{UE}_{c,l}, \theta^{UE}_{c,l})\, \mathbf{a}^*(\phi^{BS}_{c,l}, \theta^{BS}_{c,l}) \qquad (1)$$

where $\gamma_c$ is the power gain of the $c$th cluster. The parameters $\phi^{UE}$, $\theta^{UE}$, $\phi^{BS}$, $\theta^{BS}$ represent the azimuth angle of arrival, elevation angle of arrival, azimuth angle of departure, and elevation angle of departure, respectively. The vector $\mathbf{a}(\phi, \theta)$ is the response vector of a uniform planar array (UPA), which allows 3D beamforming in both the azimuth and elevation directions. We consider the probability of LoS and NLoS as given in [18], and utilize the path loss model for LoS and NLoS links as given in [16]. The numerical results provided in Sec. VI are based on these channel models and related parameters.

B. Signal Model

For tier-1 working in the sub-6 GHz band, the effective interfering channel on UE $k$ from MCBS $j \in \mathcal{B}$ serving UE $l$ is defined as

$$h_{k,l,j} = \mathbf{h}^{\mu W}_{k,j} \mathbf{f}_{l,j} \qquad (2)$$

where $\mathbf{f}_{l,j} \in \mathbb{C}^{M_j \times 1}$ is the linear precoder (transmit beamforming vector) at MCBS $j$ intended for UE $l$. If $l = k$, this defines the effective channel between MCBS $j$ and UE $k$ as $h_{k,j} = \mathbf{h}^{\mu W}_{k,j} \mathbf{f}_{k,j}$. Similarly, for tier-2 operating in the mmWave band, the effective interfering channel on UE $k$ from SCBS $j \in \mathcal{S}$ serving UE $l$ is defined as

$$\mathbf{H}_{k,l,j} = \mathbf{W}^*_k \mathbf{H}^{mmW}_{k,j} \mathbf{F}_{l,j} \qquad (3)$$

where $\mathbf{F}_{l,j} \in \mathbb{C}^{M_j \times n_l}$ is the linear precoder at SCBS $j$ intended for UE $l$, and $\mathbf{W}_k \in \mathbb{C}^{N_k \times n_k}$ is the linear combiner (receive beamforming matrix) of UE $k$. If $l = k$, (3) becomes the effective channel between SCBS $j \in \mathcal{S}$ and UE $k$, which includes the beamforming vectors/matrices at both the BS and the UE, and can be expressed as $\mathbf{H}_{k,j} = \mathbf{W}^*_k \mathbf{H}^{mmW}_{k,j} \mathbf{F}_{k,j}$. Thus, the received signal at UE $k$ connected to MCBS $j \in \mathcal{B}$ can be written as

$$y^{\mu W}_k = \sum_{j \in \mathcal{B}} h_{k,j} s_{k,j} + z_k \qquad (4)$$

where $s_{k,j} \in \mathbb{C}$ is the data symbol intended for UE $k$ with $\mathbb{E}[s^*_{k,j} s_{k,j}] = P_{k,j}$, $z_k \in \mathbb{C}$ is the complex additive white Gaussian noise at UE $k$ with $z_k \sim \mathcal{CN}(0, N_0)$, and $N_0$ is the noise power. We consider an equal power allocation scheme that splits the transmit power $P_j$ of each BS $j$ equally among its associated users, i.e., $P_{k,j} = P_j / |\mathcal{K}_j|$.

Similarly, the received signal at UE $k$ connected to SCBS $j \in \mathcal{S}$ is given by

$$\mathbf{y}^{mmW}_k = \sum_{j \in \mathcal{S}} \mathbf{H}_{k,j} \mathbf{s}_{k,j} + \mathbf{W}^*_k \mathbf{z}_k \qquad (5)$$

where $\mathbf{s}_{k,j} \in \mathbb{C}^{n_k}$ is the data stream vector for UE $k$ consisting of mutually uncorrelated zero-mean symbols with $\mathbb{E}[\mathbf{s}^*_{k,j} \mathbf{s}_{k,j}] = P_{k,j}$, and $\mathbf{z}_k \in \mathbb{C}^{N_k}$ is the complex additive white Gaussian noise vector at UE $k$. The presented signal model, developed algorithms, and insights for user association in this paper are applicable to all types of channel models, transmit beamforming, and receive combining.

C. User Association Model

We follow the mmWave-specific user association model introduced in [7]. Because of directional beamforming in mmWave systems, the interference structure depends on the user association. Taking this dependency into account is important for mmWave systems, where the channels are probabilistic and fast time-varying, and the interference depends on the highly directional connections between BSs and UEs. We perform user association once per time duration that we call an association block, which can span a single or multiple time slots depending on the availability of CSI and is a design choice (Fig. 1).

Figure 1. Structure of an association block: User association is established in a distributed fashion during the association time interval and is then applied to the transmission time interval. [Diagram labels: Association Time, Transmission Time, Association Block (one or multiple time slots), Application/Rejection loop.]

We assume a model where, in each association block, the user association process occurs in the association time interval, which establishes the UE-BS connections for the transmission time interval. The association time interval is further divided into sub-slots for distributed implementation, where during each sub-slot UEs can apply to a BS for association. In a fully distributed algorithm, UE-BS associations can be determined at the end of each sub-slot within the association time interval. This is fundamentally different from a centralized implementation, where all user associations are determined at the end of the association time interval. In a distributed implementation, the duration of the association time interval can vary for each UE's association block, depending on the delay in the association process for that UE. In the next association block, the user association process needs to be performed again to update associations according to users' mobility and channel variations. An association block can represent a single time slot, when both instantaneous (small-scale) and large-scale CSI are available, or can be composed of several consecutive time slots when only large-scale CSI is available. This choice leads to a trade-off between user association overhead and the resulting network performance.

We study a unique user association problem in which each UE can be served by only one BS in each association block. In this problem, UE-BS associations can be completely determined by an association vector $\boldsymbol{\beta}$ defined as follows

$$\boldsymbol{\beta} = [\beta_1, ..., \beta_K]^T \qquad (6)$$

and the unique association constraints can be expressed by

$$\sum_{j \in \mathcal{J}} \mathbb{1}_{\beta_k = j} \le 1, \quad \forall k \in \mathcal{K} \qquad (7)$$

where $\beta_k$ represents the index of the BS with which user $k$ is associated, and $\mathbb{1}_{\beta_k = j}$ is the indicator function such that $\mathbb{1}_{\beta_k = j} = 1$ if $\beta_k = j$, and $\mathbb{1}_{\beta_k = j} = 0$ if $\beta_k \ne j$. All analysis and results in this paper are per association block, and thus we do not consider a time index in our formulations.

In a HetNet composed of different types of BSs, each BS $j$ can have a different quota $q_j$, i.e., the number of UEs it can serve simultaneously.
We define $\mathbf{q} = [q_1, ..., q_J]$ as the quota vector of BSs, and $\mathcal{K}_j$ as the activation set of BS $j$, which represents the set of active UEs in BS $j$, such that $\mathcal{K}_j \subseteq \mathcal{K}$, $|\mathcal{K}_j| \le q_j$. Thus, we can define load balancing constraints for BSs as

$$\sum_{k \in \mathcal{K}} \mathbb{1}_{\beta_k = j} \le q_j, \quad \forall j \in \mathcal{J} \qquad (8)$$

This set of constraints states that the number of UEs served by BS $j$ cannot exceed its quota $q_j$. The load balancing constraints allow our formulation to specify each BS's quota separately. This makes the resulting user association scheme applicable to HetNets where there are different types of BSs with different capabilities.

When UE $k$ is connected to MCBS $j \in \mathcal{B}$, its instantaneous rate (in bps/Hz) is

$$R^{\mu W}_{k,j}(\boldsymbol{\beta}) = \log\left(1 + \frac{P_{k,j} h_{k,j} h^*_{k,j}}{v_{k,j}(\boldsymbol{\beta})}\right) \qquad (9)$$

where $P_{k,j}$ represents the transmit power from BS $j$ dedicated to UE $k$, and $v_{k,j}$ is the interference plus noise given as

$$v_{k,j}(\boldsymbol{\beta}) = \sum_{l \in \mathcal{K}_j,\, l \ne k} P_{l,j} h_{k,l,j} h^*_{k,l,j} + \sum_{i \in \mathcal{B},\, i \ne j} \sum_{l \in \mathcal{K}_i} P_{l,i} h_{k,l,i} h^*_{k,l,i} + N_0$$

Similarly, the instantaneous rate of UE $k$ connected to SCBS $j \in \mathcal{S}$ is given by

$$R^{mmW}_{k,j}(\boldsymbol{\beta}) = \log\left|\mathbf{I}_{n_k} + \mathbf{V}^{-1}_{k,j}(\boldsymbol{\beta}) P_{k,j} \mathbf{H}_{k,j} \mathbf{H}^*_{k,j}\right| \qquad (10)$$

where $\mathbf{V}_{k,j}$ is the interference and noise covariance matrix given as

$$\mathbf{V}_{k,j}(\boldsymbol{\beta}) = \sum_{l \in \mathcal{K}_j,\, l \ne k} P_{l,j} \mathbf{H}_{k,l,j} \mathbf{H}^*_{k,l,j} + \sum_{i \in \mathcal{S},\, i \ne j} \sum_{l \in \mathcal{K}_i} P_{l,i} \mathbf{H}_{k,l,i} \mathbf{H}^*_{k,l,i} + N_0 \mathbf{W}^*_k \mathbf{W}_k$$

Note that the summations in $v_{k,j}$ and $\mathbf{V}_{k,j}$ are taken over the activation sets of the BSs, indicating the dependency between interference and user association.

D. Centralized vs. Distributed User Associations

User association is usually studied in the literature in the form of centralized algorithms. Centralized algorithms can reach a near-optimal solution; however, they require a central coordinator, for example, located in a cloud-radio access network (C-RAN), to collect all required CSI and run the user association algorithm. In this centralized structure, BSs transmit CSI reference signals via physical downlink control channels (PDCCHs) to enable UEs to estimate the CSI. The CSI is sent back to the BSs and then forwarded to the C-RAN where the central coordinator is located. Since at each iteration of a centralized algorithm, such as the WCS algorithm in [7], user instantaneous rates are updated based on the current association vector, raw CSI rather than the SINR must be available to compute these user rates. After collecting all required CSI from the network, the central coordinator runs the user association algorithm to find the best possible connections. The signaling overhead in this ideal centralized structure is usually high due to network densification in B5G HetNets, which requires a significant amount of CSI to be reported, leading to high computational cost and time complexity as the network size increases.

Distributed user association algorithms have been introduced as low-complexity approaches with a reasonably low convergence time. These algorithms only involve low bit-rate signaling exchanges between UEs and BSs such that association decisions happen in a distributed fashion without the need for a central coordinator. We assume the UEs exchange messages with the BSs directly and the association decisions for different UEs can occur asynchronously at different times. In this paper, we employ matching theory and propose a new matching game to solve the user association problem in a distributed fashion. We also introduce a multi-game matching algorithm to further improve the network performance.

III. Matching Theory for Distributed User Association

Matching theory has attracted the attention of researchers due to its low complexity and fast convergence time [10]. These promising features make matching theory a suitable framework for distributed user association in fast-varying mmWave systems. User association can be posed as a matching game with two sets of players: the BSs and the UEs. In this game, each player collects the required information to build a preference list based on its own objective function using local measurements. Each user then applies to the BSs based on its preference list, and the association decision is made by the BSs individually. Thus, no central coordinator is required and user association can be performed in a fully distributed manner. This feature makes matching theory an efficient approach for designing distributed user association in B5G HetNets.

A. User Association Matching Game Concepts

In the context of matching theory, the user association problem can be considered as a college admission game where the BSs with their specific quotas represent the colleges and the UEs are considered as students. This framework is suitable for user association in a HetNet where BSs may have different quotas and capabilities. In order to formulate our user association as a matching game, we first introduce some basic concepts based on two-sided matching theory [19].

Definition 1. A preference relation $\succeq_k$ helps UE $k$ to specify the preferred BS between any two BSs $i, j \in \mathcal{J}$, $i \ne j$ such that

$$j \succeq_k i \Leftrightarrow \Psi^{UE}_{k,j} \ge \Psi^{UE}_{k,i} \Leftrightarrow \text{UE } k \text{ prefers BS } j \text{ to BS } i \qquad (11)$$

where $\Psi^{UE}_{k,j}$ is the preference value between UE $k$ and BS $j$, which can simply be a local measurement at the UE (e.g., SINR). Similarly, for any two UEs $k, l \in \mathcal{K}$, $k \ne l$, each BS builds a preference relation $\succeq_j$ such that

$$k \succeq_j l \Leftrightarrow \Psi^{BS}_{k,j} \ge \Psi^{BS}_{l,j} \Leftrightarrow \text{BS } j \text{ prefers UE } k \text{ to UE } l \qquad (12)$$

Definition 2. Based on the preference relations, each UE $k$ (BS $j$) builds its own preference list $\mathcal{P}^{UE}_k$ ($\mathcal{P}^{BS}_j$) over the set of all BSs (UEs) in descending order of preference.

Definition 3. The association vector $\boldsymbol{\beta}$ defines a matching relation, which specifies the association between UEs and BSs and has the following properties:

1) $\beta_k \in \mathcal{J}$ with $k \in \mathcal{K}$;
2) $\beta_k = j$ if and only if $k \in \mathcal{K}_j$.

The second property states that the association vector $\boldsymbol{\beta}$ is a bilateral matching.

Definition 4. A user association matching game $\mathcal{G}$ is a game with two sets of players (BSs and UEs) and a set of rules which apply to the input data to obtain a result. The input data of the game are:

• Preference lists of BSs: $\mathcal{P}^{BS}_j$, $\forall j \in \mathcal{J}$
• Preference lists of UEs: $\mathcal{P}^{UE}_k$, $\forall k \in \mathcal{K}$
• BSs' quotas: $q_j$, $\forall j \in \mathcal{J}$

and the game outcome or result is the association vector $\boldsymbol{\beta}$. Each particular game is defined by the specific way of building preference lists and its set of rules.

B. Building Preference Lists

A preliminary step in a matching game is to build the preference lists of the players. In this subsection, we describe the process of building the preference lists for the UEs and BSs. These preference lists can be built based on a number of metrics, including users' instantaneous rates, channel norms, or UEs' local measurements.
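As a minimal illustration of Definitions 1 and 2, the following Python sketch turns tables of preference values into descending-order preference lists. The function name `build_preference_lists`, the 0-based indexing, and the numeric values are assumptions made for illustration only; ties are broken by index order, which the text does not specify.

```python
def build_preference_lists(psi_ue, psi_bs):
    """Build preference lists from preference values (Definitions 1-2).

    psi_ue[k][j]: preference value of UE k for BS j (Psi^UE_{k,j}).
    psi_bs[k][j]: preference value of BS j for UE k (Psi^BS_{k,j}).
    Returns (pref_ue, pref_bs), each sorted in descending order of preference.
    Assumption: ties are broken by the lower index (sorted() is stable).
    """
    num_ues, num_bss = len(psi_ue), len(psi_ue[0])
    # P^UE_k: BS indices ranked by UE k, most preferred first
    pref_ue = [sorted(range(num_bss), key=lambda j: -psi_ue[k][j])
               for k in range(num_ues)]
    # P^BS_j: UE indices ranked by BS j, most preferred first
    pref_bs = [sorted(range(num_ues), key=lambda k: -psi_bs[k][j])
               for j in range(num_bss)]
    return pref_ue, pref_bs


# Hypothetical values for 3 UEs and 2 BSs, used by both sides of the game.
psi = [[3.0, 1.0], [2.0, 5.0], [4.0, 0.5]]
pref_ue, pref_bs = build_preference_lists(psi, psi)
print(pref_ue)  # [[0, 1], [1, 0], [0, 1]]
print(pref_bs)  # [[2, 0, 1], [1, 0, 2]]
```

The same helper applies to any of the metrics discussed in this subsection, since only the ordering of the preference values matters.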

1) Users’ instantaneous rates:

Using users' instantaneous rates to build preference lists requires knowledge of both instantaneous and large-scale CSI. For example, we can use the user instantaneous rate in (9)-(10) as the objective function for both sides of the game, i.e.,

$$\Psi^{UE}_{k,j} = \Psi^{BS}_{k,j} = R_{k,j}, \quad \forall k \in \mathcal{K}, j \in \mathcal{J} \qquad (13)$$

In other words, each UE prefers the BS which provides the highest user instantaneous rate, and each BS is willing to serve the UE that can get the highest user instantaneous rate. This metric, however, requires frequent updates of the preference value, which depends on the association of other users and can complicate the process. One approach is to fix the association of all other UEs based on an initial association vector when computing the current user's instantaneous rates from all BSs. Alternatively, we can switch the association of the current UE with another UE connected to a BS while computing the instantaneous rate from that BS to the current UE, in order to maintain the BS quota. This switching can be either random or with the weakest UE connected to that BS as in [7].

Footnote: In this paper we use the terms "association vector" and "matching relation" interchangeably.
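The first approach above (fixing the association of all other UEs) can be sketched as follows for the microwave tier, using the rate in (9) with equal power allocation. This is only an illustrative sketch: the function name `microwave_rate`, the dictionary `gain` holding the effective channel gains $|h_{k,l,i}|^2$, and all numeric values are assumptions, and every BS in the example is treated as an MCBS.

```python
import math


def microwave_rate(k, j, beta, gain, tx_power, noise_power):
    """Rate (9) of UE k if served by MCBS j, with all other UEs fixed by beta.

    beta[u]:         BS index serving UE u, or None if unassociated.
    gain[(k, l, i)]: |h_{k,l,i}|^2, effective gain at UE k of the signal
                     that BS i sends to UE l (hypothetical input format).
    tx_power[i]:     total transmit power P_i of BS i.
    """
    # Activation sets K_i implied by beta, with UE k tentatively placed at j.
    active = {}
    for u, b in enumerate(beta):
        b = j if u == k else b
        if b is not None:
            active.setdefault(b, []).append(u)

    # Interference plus noise v_{k,j}(beta): intra-cell and inter-cell terms.
    v = noise_power
    for i, users in active.items():
        p = tx_power[i] / len(users)  # equal power split P_i / |K_i|
        for l in users:
            if not (i == j and l == k):
                v += p * gain[(k, l, i)]

    p_kj = tx_power[j] / len(active[j])
    return math.log2(1.0 + p_kj * gain[(k, k, j)] / v)


# Hypothetical example: 2 UEs, 2 MCBSs, unit gains, UE 1 already at BS 1.
gain = {(k, l, i): 1.0 for k in range(2) for l in range(2) for i in range(2)}
beta = [None, 1]
psi_ue_0 = [microwave_rate(0, j, beta, gain, [2.0, 2.0], 1.0) for j in range(2)]
print(psi_ue_0)  # rates of UE 0 from BS 0 and BS 1, usable as preference values
```

In this toy setting UE 0 obtains a higher rate from BS 0 than from BS 1, because joining BS 1 halves the power per user there; this is the dependency of the preference value on the association of other users noted above.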

2) Channel norms:

The preference lists can also be built based on CSI alone. As stated earlier, we assume both instantaneous and large-scale CSI are available through beamforming CSI estimation techniques. In this case, the preference values can be expressed as the Frobenius norm of a MIMO channel

$$\Psi^{UE}_{k,j} = \Psi^{BS}_{k,j} = \|\mathbf{H}_{k,j}\|_F, \quad \forall k \in \mathcal{K}, j \in \mathcal{J} \qquad (14)$$

where $\mathbf{H}_{k,j}$ is the channel matrix between UE $k$ and BS $j$, and the operator $\|\cdot\|_F$ represents the Frobenius norm. Using the channel norm, each UE (BS) ranks the BSs (UEs) and builds its own preference list such that the BS (UE) with the strongest channel (highest channel norm) is the most preferred one. The preference lists can also be built based on large-scale CSI alone, which will stay valid for longer durations but may result in a lower overall network utility.
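The channel-norm metric in (14) is straightforward to compute. The sketch below, with hypothetical helper names and made-up channel entries, evaluates the Frobenius norm of each channel matrix seen by one UE and ranks the BSs accordingly.

```python
import math


def frobenius_norm(matrix):
    """Frobenius norm: square root of the sum of squared entry magnitudes."""
    return math.sqrt(sum(abs(x) ** 2 for row in matrix for x in row))


def rank_bss_by_channel_norm(channels):
    """Rank BSs for one UE using (14).

    channels[j]: channel matrix H_{k,j} (nested lists of complex numbers)
                 between the UE and BS j; values here are hypothetical.
    Returns (preference list of BS indices, list of norms).
    """
    norms = [frobenius_norm(h) for h in channels]
    order = sorted(range(len(channels)), key=lambda j: -norms[j])
    return order, norms


# Hypothetical 2x2 channels from one UE to two BSs.
h_bs0 = [[3 + 4j, 0], [0, 0]]   # norm 5
h_bs1 = [[1, 1], [1, 1]]        # norm 2
order, norms = rank_bss_by_channel_norm([h_bs0, h_bs1])
print(order)  # [0, 1]
```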

3) Local CQI measurements:

A more practical approach to building the preference lists is based on UEs' local measurements. Using reference signal received power information, each UE measures the received SINR from each BS as a ratio of a valid signal to non-valid signals. Then, the UE converts this SINR information to channel quality indicator (CQI) values and reports them to the BSs via the PUCCH signaling mechanism [20]. In this approach, the preference values are given by

$$\Psi^{UE}_{k,j} = \Psi^{BS}_{k,j} = \mathrm{CQI}_{k,j}, \quad \forall k \in \mathcal{K}, j \in \mathcal{J} \qquad (15)$$

where $\mathrm{CQI}_{k,j}$ represents the channel quality between UE $k$ and BS $j$. Using CQI values, each UE (BS) ranks the BSs (UEs) such that the BS (UE) with the highest CQI is the most preferred one. The periodicity of the CQI report is a configurable parameter and it could be as fast as every four time slots [22].

Footnote: The duration of a time slot in 5G NR depends on the transmission numerology, and is less than 1 ms, which is the subframe duration [21]. Thus, CQI can be reported as fast as every 4 ms.

C. Deferred Acceptance [9] User Association Matching Game

Matching theory dates back to the early 1960s, when mathematicians Gale and Shapley proposed the now-famous DA matching game, which can be posed as a college admission game and produces optimal and stable results [9]. Applied to user association, the input data of this game are the preference lists of BSs and UEs as well as the quotas of BSs, and its result is a matching relation $\boldsymbol{\beta}$. While the DA game can be implemented in a centralized way, in this paper we focus on its distributed implementation, which does not require a central entity to collect the preference lists of all BSs and UEs and run the game [10]. In what follows, we describe the rules of this game in the context of user association.

We first define $m_k$ as the preference index of UE $k$, $\mathcal{R}$ as the rejection set of UEs, $\mathcal{U}$ as the set of unassociated UEs, and $\mathcal{W}_j$ as the waiting list of BS $j$. Before starting the game, we set the preference index of each UE to one ($m_k = 1, \forall k$), form a rejection set $\mathcal{R}$ including all $K$ UEs, and initialize the set of unassociated UEs ($\mathcal{U} = \emptyset$) and the waiting list of each BS ($\mathcal{W}_j = \emptyset, \forall j$). At the $n$th iteration of the game, each UE $k$ in the rejection set applies to its $m_k$th preferred BS by sending an application message. Each BS $j$ then ranks its new applicants together with the UEs in its current waiting list ($\mathcal{W}^{n-1}_j$) based on its preference list, keeps the first $q_j$ UEs in its new waiting list $\mathcal{W}^n_j$, and rejects the rest of the UEs. Each BS accordingly sends a response of either rejection or waitlisting to the new UE applicants as well as to those previously in its waiting list that are now rejected. Rejected UEs remain in the rejection set, update their preference index $m_k$, and apply to their next preferred BS in the next iteration. If $m_k > J$, UE $k$ has applied to all BSs and been rejected by each of them. Thus, we remove UE $k$ from the rejection set $\mathcal{R}$ and add it to the set of unassociated UEs $\mathcal{U}$. At each new iteration, each BS forms a new waiting list by selecting the top $q_j$ UEs among the new applicants and those on its current waiting list. The game terminates when the rejection set is empty.

In the DA game, the associations are only determined when the game terminates. The BSs use their waiting lists to keep the most preferred UEs over all application rounds, and the final waiting lists after the last iteration determine the associations. This DA user association process is shown in Fig. 2, and the algorithm for the DA user association game is described in Alg. 1.

Figure 2. User association process in the DA game. (a) UE $k$ is associated with BS $j$; (b) UE $k$ is pushed out of waiting lists and eventually rejected by all BSs.

Algorithm 1: DA User Association Game (based on [9])

Data: $\mathcal{P}^{BS}_j$, $\mathcal{P}^{UE}_k$, $q_j$, $\forall k \in \mathcal{K}, j \in \mathcal{J}$
Result: Association vector $\boldsymbol{\beta} = [\beta_1, \beta_2, ..., \beta_K]$
Initialization: Set $m_k = 1, \forall k$, $n = 1$, form a rejection set $\mathcal{R} = \{1, 2, ..., K\}$, initialize a set of unassociated UEs $\mathcal{U} = \emptyset$ and the waiting list of each BS $\mathcal{W}_j = \emptyset, \forall j$;
while $\mathcal{R} \ne \emptyset$ do
    Each UE $k \in \mathcal{R}$ applies to its $m_k$th preferred BS;
    Each BS $j$ forms its current waiting list $\mathcal{W}^n_j$ from its new applicants and its previous waiting list $\mathcal{W}^{n-1}_j$;
    Each BS $j$ keeps the first $q_j$ preferred UEs from $\mathcal{W}^n_j$, and rejects the rest of them;
    for $k \in \mathcal{R}$ do
        $m_k \leftarrow m_k + 1$
        if $m_k > J$ then
            Remove UE $k$ from $\mathcal{R}$ and add it to $\mathcal{U}$;
        end
    end
    $n \leftarrow n + 1$
end
Form $\boldsymbol{\beta}$ based on the final waiting lists of BSs $\mathcal{W}_j$, $j = 1, ..., J$.
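The DA rules described above can be sketched in Python as follows. This is a simplified, sequentially simulated sketch under stated assumptions, not the authors' implementation: indices are 0-based, the function name `deferred_acceptance` is hypothetical, the rejection set is rebuilt at each iteration from the UEs that each BS rejects or pushes out of its waiting list, and a UE missing from a BS's preference list is ranked last.

```python
def deferred_acceptance(pref_ue, pref_bs, quota):
    """Simulate the DA user association game (sketch of the rules in Sec. III-C).

    pref_ue[k]: list of BS indices, most preferred first (P^UE_k).
    pref_bs[j]: list of UE indices, most preferred first (P^BS_j).
    quota[j]:   quota q_j of BS j.
    Returns beta, where beta[k] is the serving BS of UE k or None.
    """
    num_ues, num_bss = len(pref_ue), len(pref_bs)
    # rank[j][k]: position of UE k in BS j's preference list (lower is better)
    rank = [{k: r for r, k in enumerate(pref_bs[j])} for j in range(num_bss)]
    m = [0] * num_ues                       # preference index m_k (0-based)
    rejected = set(range(num_ues))          # rejection set R
    unassociated = set()                    # set U
    waiting = [[] for _ in range(num_bss)]  # waiting lists W_j

    while rejected:
        # Each UE in R applies to its m_k-th preferred BS.
        applicants = [[] for _ in range(num_bss)]
        for k in rejected:
            applicants[pref_ue[k][m[k]]].append(k)

        newly_rejected = set()
        for j in range(num_bss):
            # Rank new applicants together with the previous waiting list.
            pool = waiting[j] + applicants[j]
            pool.sort(key=lambda k: rank[j].get(k, num_ues))
            waiting[j] = pool[:quota[j]]            # keep the first q_j UEs
            newly_rejected.update(pool[quota[j]:])  # reject the rest

        # Rejected UEs move to their next preferred BS, if any is left.
        rejected = set()
        for k in newly_rejected:
            m[k] += 1
            if m[k] >= len(pref_ue[k]):
                unassociated.add(k)
            else:
                rejected.add(k)

    # Associations are only determined from the final waiting lists.
    beta = [None] * num_ues
    for j in range(num_bss):
        for k in waiting[j]:
            beta[k] = j
    return beta


# Hypothetical example: UE 2 pushes UE 0 out of the waiting list of BS 0.
beta = deferred_acceptance(
    pref_ue=[[0, 1], [1, 0], [1, 0]],
    pref_bs=[[2, 0, 1], [1, 2, 0]],
    quota=[1, 1],
)
print(beta)  # [None, 1, 0]
```

In the example, UE 0 is waitlisted by BS 0 in the first iteration but is displaced in the second iteration by UE 2, which illustrates why no association is final until the game terminates.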

D. User Association Matching Game Metrics

1) Game Stability vs. Other Objectives:

In the context of matching theory, stability has been considered as a key performance metric [9]. Stability means that there is no blocking pair, i.e., no UE-BS pair $(k, j) \notin \boldsymbol{\beta}$ in which both players prefer each other to their associations under the matching relation $\boldsymbol{\beta}$ [9].

Stability in matching is an important criterion as discussed in the original paper by Gale and Shapley [9], who showed that DA is optimal in terms of stability. There is, however, no direct implication that a matching that is optimal in terms of stability is also optimal in terms of another performance objective, including the metric used to build the preference lists. We illustrate this lack of connection via an example. Table I shows the preference lists of the players (in the same format as in [9]): the first number of each cell is the preference of the Greek-letter player for the Roman-letter player, and the second number is the preference of the Roman-letter player for the Greek-letter player. Each preference list is built based on the metric values (for example, spectral efficiency in user association) in Table II. The stable assignment here is $(\alpha, A)$ and $(\beta, B)$, i.e., the diagonal entries of Table I. The other assignment, $(\alpha, B)$ and $(\beta, A)$, is inherently unstable since $\alpha$ prefers $A$ to $B$, and $A$ also prefers $\alpha$ to $\beta$ (per the definition of stability in [9]). The total metric value for the stable assignment, however, is 5, which is less than the value of 6 for the unstable assignment. Thus, optimality in terms of stability does not necessarily result in the highest (optimal) metric values. As such, when applying matching to a cellular system where a performance metric such as spectral efficiency is of primary interest, a stable association does not necessarily lead to the optimal spectral efficiency. This fact will also be confirmed via simulation results in Sec. VI (see Fig. 9).

Furthermore, stability becomes less relevant in a setting where users' preferences change over time, […]

Table I
Preference lists of players

         A        B
α      (1,1)    (2,1)
β      (1,2)    (2,2)

Table II
Metric values of players

         A        B
α
β
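The gap between stability and the underlying metric can be checked numerically. The sketch below uses hypothetical metric values (4, 3, 3, 1) that we chose only because they reproduce the preference lists of Table I and the totals of 5 and 6 quoted in the text; they are not the values of Table II. The helper names `blocking_pairs` and `total_metric` are likewise ours.

```python
# Hypothetical metric values, metric[greek][roman]; rows alpha, beta and
# columns A, B. Chosen to match Table I rankings and the totals 5 and 6.
metric = [[4, 3],
          [3, 1]]

def blocking_pairs(match, metric):
    """Return all blocking pairs of a one-to-one matching.

    match[g] = r means Greek-letter player g is paired with Roman-letter
    player r. A pair (g, r) blocks the matching if both players strictly
    prefer each other (larger metric) to their current partners.
    """
    partner_of = {r: g for g, r in enumerate(match)}
    pairs = []
    for g in range(len(metric)):
        for r in range(len(metric[0])):
            if match[g] == r:
                continue
            g_prefers = metric[g][r] > metric[g][match[g]]
            r_prefers = metric[g][r] > metric[partner_of[r]][r]
            if g_prefers and r_prefers:
                pairs.append((g, r))
    return pairs

def total_metric(match, metric):
    """Sum of the metric values over all matched pairs."""
    return sum(metric[g][r] for g, r in enumerate(match))

stable = [0, 1]      # (alpha, A), (beta, B)
unstable = [1, 0]    # (alpha, B), (beta, A)
```

With these values, the stable assignment has no blocking pair but a total of 5, whereas the unstable assignment is blocked by $(\alpha, A)$ yet reaches a total of 6.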

2) User Association Delay:

This metric represents the amount of time it takes for a UE to get associated with a BS. Denote the time delay for one iteration as $t_{\mathrm{iter}}$, which is a system parameter and independent of the matching game. Thus, we can compare the association delay of different matching games by comparing their respective time delay until association. In a DA game, the association decision is postponed to the last iteration of the game due to the presence of waiting lists. Thus for DA, all UEs have the same association delay ($\tau_{k,\mathrm{DA}} = \tau_{\mathrm{DA}}$), which is proportional to the total number of iterations $N^{\mathrm{iter}}_{\mathrm{DA}}$ of the game as follows

$$\tau_{\mathrm{DA}} \triangleq N^{\mathrm{iter}}_{\mathrm{DA}} \, t_{\mathrm{iter}} \tag{16}$$

For an EA game, as will be described in Sec. IV, the $k$th UE's association delay is

$$\tau_{k,\mathrm{EA}} \triangleq N^{\mathrm{appl}}_{k} \, t_{\mathrm{iter}} \tag{17}$$

where $N^{\mathrm{appl}}_{k}$ is the number of applications of UE $k$ until it is associated with a BS. Thus, the average association delay for the EA game can be obtained as

$$\tau_{\mathrm{EA}} = \frac{1}{K} \sum_{k \in \mathcal{K}} \tau_{k,\mathrm{EA}} \tag{18}$$

We assume that all applicants apply simultaneously at each iteration of the game, thus the time duration for each iteration is the same for every applicant.

3) User Power Consumption for Association:

In a matching game, each UE applies to its preferred BSs in order until it is accepted by one of them. Each application requires the UE to send a signal to a BS, which is a power-consuming process. We assume that each application consumes a specific amount of power ($P_{\mathrm{appl}}$), which is also a system parameter independent of the matching game. Thus, the power consumption of UE $k$ during the association process can be defined as

$$\gamma_k \triangleq N^{\mathrm{appl}}_{k} \, P_{\mathrm{appl}} \tag{19}$$

Thus, the power consumption of different users during the association process can be compared through their numbers of applications: the lower the number of applications of a UE, the lower its power consumption.
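The delay and power metrics in (16)-(19) reduce to simple counts of iterations and applications. A minimal sketch follows, assuming the per-UE application counts have already been recorded during a game; the function names and the sample numbers in the usage note are ours and purely illustrative.

```python
def da_delay(n_iter_da, t_iter):
    """Common DA association delay, as in (16)."""
    return n_iter_da * t_iter

def ea_delays(n_appl, t_iter):
    """Per-UE EA association delays, as in (17); n_appl[k] is N_k^appl."""
    return [n * t_iter for n in n_appl]

def average_ea_delay(n_appl, t_iter):
    """Average EA association delay over all UEs, as in (18)."""
    delays = ea_delays(n_appl, t_iter)
    return sum(delays) / len(delays)

def ue_power(n_appl, p_appl):
    """Per-UE power consumption for association, as in (19)."""
    return [n * p_appl for n in n_appl]
```

For instance, with hypothetical application counts `[1, 1, 2, 4]` and `t_iter = 0.5`, `average_ea_delay` gives 1.0, while a DA game lasting 5 iterations would give every UE a delay of `da_delay(5, 0.5)`, i.e., 2.5.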

Figure 3. User association process in an EA game. (a) UE $k$ is associated with BS $j$; (b) UE $k$ is not associated as it is rejected at the $m_k$th (last) iteration of the game, with $m_k \geq J$.

4) Percentage of Unassociated Users:

If the number of UEs is larger than the total quota of the BSs, or if some BSs are out of range such that the preference lists of certain UEs are shorter than $J$, there may be unassociated UEs at the end of a matching game. We can evaluate the performance of matching games by comparing the percentage of unassociated UEs under these games. In Sec. VI, we consider the following scenarios in evaluating the performance of matching games.

• Underload: $K < \sum_{j \in \mathcal{J}} q_j$
• Critical load: $K = \sum_{j \in \mathcal{J}} q_j$
• Overload: $K > \sum_{j \in \mathcal{J}} q_j$
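The three loading scenarios above depend only on $K$ and the BS quotas. As a small sketch (the function name `load_scenario` and the returned labels are ours):

```python
def load_scenario(num_ues, quotas):
    """Classify the network load from K and the BS quotas q_j."""
    total_quota = sum(quotas)
    if num_ues < total_quota:
        return "underload"
    if num_ues == total_quota:
        return "critical load"
    return "overload"
```

For example, quotas of `[15, 5, 5, 5, 5]` give a total quota of 35, so 35 UEs correspond to the critical load case.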

5) Network Utility Function:

A network utility function can be used to compare the performance of centralized and distributed user association algorithms. In particular, we employ the sum-rate utility function to assess the performance of association schemes. Defining the instantaneous user throughput vector $\mathbf{r}(\boldsymbol{\beta}) \triangleq (R_{1,\beta_1}, \ldots, R_{K,\beta_K})$, we can express the sum-rate utility function as

$$U(\mathbf{r}(\boldsymbol{\beta})) \triangleq \sum_{k \in \mathcal{K}} R_{k,\beta_k} \tag{20}$$

where $R_{k,\beta_k}$ is the instantaneous rate of UE $k$ associated with BS $\beta_k$ given in (9)-(10).

IV. Proposed Early Acceptance Matching Games

The original DA game defers the matching decision until the last iteration of the game, and thus is suitable for processes which do not require making decisions in real time. When applied to user association, however, all UEs are kept in the BSs' waiting lists until the game terminates. This can result in an excessive delay for the association process and can be problematic for user association in fast-varying mmWave systems which require low-latency communications. In order to overcome this problem, we propose a set of new matching games, called early acceptance (EA) games, to solve the user association problem in B5G HetNets. In an EA game, BSs immediately decide about acceptance or rejection of UEs at each application round. This leads to a significantly faster and more efficient user association process. The DA game, however, has an advantage over the EA games in terms of stability, but this property is less relevant in the user association context since the preference lists of users change with time. Furthermore, stability need not lead to optimal throughput, as shown in Sec. III-D1. Our simulations also confirm that the EA game achieves slightly higher network throughput at a much lower delay.

A. Proposed Early Acceptance User Association Matching Games

We introduce three distributed EA matching games which follow a set of similar rules, but differ in how they update the UE/BS preference lists and whether UEs may reapply to BSs. Similar to the DA game, an EA game takes as input the preference lists of BSs and UEs and the quotas of BSs, and delivers a matching relation $\boldsymbol{\beta}$. The initial steps of these games are similar to those of DA: set the preference index of all UEs to one ($m_k = 1$, $\forall k$), form a rejection set $\mathcal{R}$ composed of all $K$ UEs, and initialize a set of unassociated UEs ($\mathcal{U} = \emptyset$).

1) EA-Base Game:

This game defines the basic rules of the EA games. At each iteration of the game, each UE $k \in \mathcal{R}$ applies to its $m_k$th preferred BS (say BS $j$) regardless of its available quota. If UE $k$ is among the top $q_j$ UEs in the preference list of BS $j$, it will be immediately accepted by this BS, and the game updates the association vector with $\beta_k = j$. The accepted UEs will be removed from the rejection set $\mathcal{R}$ and any rejected UEs will apply to their next preferred BS in the next iteration. If a UE applies to all the BSs in its preference list and is rejected by all of them, we add the UE to the set of unassociated UEs ($\mathcal{U}$).

This game is the simplest form of EA since the UEs and BSs do not update their preference lists during the game, and if a UE is rejected by a BS it will not reapply to that BS. These features make the EA-Base game very fast, with only a small number of applications ($\leq J$) for each UE. However, at the end of this game some UEs may be unassociated regardless of the loading scenario.
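The EA-Base rules can be sketched as below. Because no preference list or quota is updated during this game, each UE's outcome depends only on the fixed lists, so the sketch simply walks each UE through its own list; the number of game iterations equals the largest application count. The function name `ea_base` and the 0-based indexing are our own choices.

```python
def ea_base(ue_prefs, bs_prefs, quota):
    """Sketch of the EA-Base game (0-based indices).

    ue_prefs[k]: BS indices ranked by UE k, most preferred first.
    bs_prefs[j]: UE indices ranked by BS j, most preferred first.
    quota[j]:    q_j; a UE is accepted by BS j only if it is among the
                 top q_j entries of the (never updated) list of BS j.
    Returns (beta, n_appl): beta[k] is the BS of UE k (None if the UE
    ends up unassociated) and n_appl[k] is its number of applications.
    """
    K = len(ue_prefs)
    top = [set(p[:q]) for p, q in zip(bs_prefs, quota)]
    beta = [None] * K
    n_appl = [0] * K
    for k in range(K):
        for j in ue_prefs[k]:      # m_k-th preferred BS, in order
            n_appl[k] += 1
            if k in top[j]:        # immediate (early) acceptance
                beta[k] = j
                break
    return beta, n_appl
```

In the two-player example of Table I with unit quotas, the second UE is in the top-$q_j$ set of neither BS and therefore stays unassociated even though one BS still has quota, which illustrates the limitation noted above.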

2) EA-PLU (EA with Preference List Updating) Game:

In order to improve the performance of EA-Base, we update the preference lists of UEs and BSs at the end of each iteration. When an association happens, the following updates occur: 1) associated UEs are removed from the rejection set and the preference lists of all BSs, and 2) for each new association with BS $j$, its quota is updated as $q_j \leftarrow q_j - 1$. When a BS runs out of quota, it informs all the UEs by sending a broadcast message and the UEs remove this BS from their preference lists. As a result, the UEs will no longer apply to that BS. Similar to EA-Base though, there is still no guarantee that all UEs are associated at the end of the EA-PLU game.

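Algorithm 2: Proposed EA-PLU-RA User Association Game

Data: $\mathcal{P}^{\mathrm{BS}}_j$, $\mathcal{P}^{\mathrm{UE}}_k$, $q_j$, $\forall k \in \mathcal{K}$, $j \in \mathcal{J}$
Result: Association vector $\boldsymbol{\beta} = [\beta_1, \beta_2, \ldots, \beta_K]$
Initialization: Set $m_k = 1$, $\forall k$, form a rejection set $\mathcal{R} = \{1, 2, \ldots, K\}$, and initialize a set of unassociated UEs $\mathcal{U} = \emptyset$;
while $\mathcal{R} \neq \emptyset$ do
    Each UE $k \in \mathcal{R}$ applies to its $m_k$th preferred BS (namely BS $j$ with $q_j \neq 0$);
    if $k \in \mathcal{P}^{\mathrm{BS}}_j(q_j)$ then
        $\beta_k = j$;
        $q_j \leftarrow q_j - 1$;
        if $q_j = 0$ then
            Remove BS $j$ from $\mathcal{P}^{\mathrm{UE}}_k$, $\forall k \in \mathcal{K}$;
        end
        Remove UE $k$ from $\mathcal{R}$ and $\mathcal{P}^{\mathrm{BS}}_j$, $\forall j \in \mathcal{J}$;
    else
        if $\mathcal{P}^{\mathrm{UE}}_k \neq \emptyset$ then
            $m_k \leftarrow m_k + 1$;
            Keep UE $k$ in $\mathcal{R}$;
        else
            Remove UE $k$ from $\mathcal{R}$ and add it to $\mathcal{U}$;
        end
    end
end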
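A runnable sketch of the game in Alg. 2 is given below under several assumptions of ours: 0-based indices, synchronous iterations with list and quota updates applied at the end of each iteration, mutually consistent preference lists (UE $k$ appears in the list of BS $j$ whenever $j$ appears in the list of UE $k$), and a `max_iter` safety cap that is not part of the algorithm. Reapplication is implemented by cycling through the UE's original list while skipping BSs that have run out of quota.

```python
def ea_plu_ra(ue_prefs, bs_prefs, quota, max_iter=10_000):
    """Sketch of the EA-PLU-RA game. Returns (beta, n_appl, n_iter)."""
    K = len(ue_prefs)
    q = list(quota)                          # remaining quotas q_j
    bs_list = [list(p) for p in bs_prefs]    # updated BS preference lists
    beta, n_appl = [None] * K, [0] * K
    pos = [-1] * K                           # position of the last application
    rejected, n_iter = set(range(K)), 0
    while rejected:
        targets = {}
        for k in sorted(rejected):
            prefs = ue_prefs[k]
            if not any(q[j] > 0 for j in prefs):
                rejected.discard(k)          # updated list is empty: unassociated
                continue
            # Move to the next BS with quota, wrapping around to reapply.
            for step in range(1, len(prefs) + 1):
                cand = (pos[k] + step) % len(prefs)
                if q[prefs[cand]] > 0:
                    pos[k] = cand
                    break
            targets[k] = prefs[pos[k]]
        if not targets:
            break
        n_iter += 1
        if n_iter > max_iter:
            raise RuntimeError("no convergence; check list consistency")
        accepted = []
        for k, j in targets.items():
            n_appl[k] += 1
            if k in bs_list[j][:q[j]]:       # among the top q_j of BS j
                accepted.append((k, j))
        for k, j in accepted:                # updates at the end of the iteration
            beta[k] = j
            q[j] -= 1
            rejected.discard(k)
            for lst in bs_list:
                if k in lst:
                    lst.remove(k)
    return beta, n_appl, n_iter
```

On the two-player example of Table I with unit quotas, the second UE is rejected in the first iteration, the full BS drops out of its list, and it is accepted by the remaining BS in the second iteration, so no UE is left unassociated.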

3) EA-PLU-RA (EA with Preference List Updating and ReApplying) Game:

In order to improve the percentage of associated users, we allow each UE to reapply to those BSs from which it has been rejected in previous iterations. Recall that if a UE is rejected by a BS, it is kept in the rejection set. In the following iterations, each UE applies to the next preferred BS in its updated preference list. When a UE reaches the end of its updated preference list, it comes back to the beginning and repeats the application process until its updated preference list becomes empty or it is associated. In the case that the updated preference list becomes empty, no reapplication is possible and the UE is removed from the rejection set $\mathcal{R}$ and added to the set of unassociated UEs $\mathcal{U}$. With this reapplication, when the length of the original preference list of each UE is equal to $J$, no UE is unassociated at the end of the game in the underload and critical load cases (see Lemma 2). In the case of an incomplete original preference list, a UE may remain unassociated. The user association process in the EA-PLU-RA game is depicted in Fig. 3, and described in Alg. 2.

We note that it is practical to update the preference lists of UEs and BSs after each round of applications during an EA matching game. When UE-BS associations occur, the associating BSs update their quotas and preference lists. They also inform all other BSs (via backhaul links) and UEs (via a cell broadcast message [23]) to update their preference lists accordingly. In particular, each BS updates its quota and removes associated UEs from its preference list. If a BS runs out of quota, all unassociated UEs remove that BS from their preference lists. This reporting mechanism incurs minimal signaling overhead in practical implementations.

B. Convergence of Matching Games

For a matching game, convergence implies that the game will eventually terminate and produce a matching result. Depending on the game and loading scenario, different events can happen at convergence. In the EA-Base and EA-PLU games, which do not allow reapplication, convergence occurs when the last unassociated UE(s) apply to their least preferred BS or all BSs run out of quota. In the EA-PLU-RA game, convergence occurs when either all UEs are associated or all BSs are full. We provide four lemmas on the convergence of the proposed EA matching games and obtain their worst convergence time (maximum number of iterations).

Lemma 1. EA-Base and EA-PLU games always converge and their worst convergence time is $J$ iterations.

Proof. EA-Base and EA-PLU games converge when the last unassociated UE(s) apply to their least preferred BS after having been rejected by all others. Since each UE can only apply once to each BS (reapplying is not allowed), the games terminate in at most $J$ iterations. ∎

Lemma 2. For underload and critical load cases, when the length of the original preference list of each UE is equal to the number of BSs $J$, all UEs will be associated at the end of the EA-PLU-RA game.

Proof. If the length of the original preference list of each UE is equal to $J$, i.e., $|\mathcal{P}^{\mathrm{UE}}_k| = J$, $\forall k \in \mathcal{K}$, then any rejected UE will have a chance to apply to all BSs and therefore will be associated as long as there is quota left at any of the BSs, which is always the case in underload and critical load scenarios. But if there is a UE with $|\mathcal{P}^{\mathrm{UE}}_k| < J$, the UE cannot apply to all BSs and becomes unassociated if all BSs in its preference list run out of quota. ∎

Lemma 3. The EA-PLU-RA game converges within a finite number of iterations.

Proof. As stated in Alg. 2, the EA-PLU-RA game terminates when the rejection set $\mathcal{R}$ becomes empty. According to Lemma 2, for underload and critical load cases when the length of the original preference list of each UE is equal to $J$, all UEs will be associated by the end of the game and the rejection set becomes empty. If there is a UE with an original preference list shorter than $J$, it keeps reapplying to the BSs in its preference list. Once a BS runs out of quota, it is removed from the preference list of the UE. If all the BSs in this list run out of quota, the preference list of the UE becomes empty and the UE is moved to the set $\mathcal{U}$ (unassociated), so the rejection set becomes empty. For the overload scenario, the game converges when all BSs run out of quota, which eventually happens because of preference list updating and reapplication. After this point the preference lists of all rejected UEs are empty, and they are moved from $\mathcal{R}$ to $\mathcal{U}$, causing the game to terminate. ∎

Because of reapplication, the worst convergence time of EA-PLU-RA is longer than that of the other two EA games and is harder to determine as it depends on the preference setting. The next lemma provides an upper bound on the worst convergence time of the EA-PLU-RA game.

Lemma 4. For a network with $J$ BSs and $K$ UEs, an upper bound on the maximum number of iterations for the EA-PLU-RA game under the critical load scenario is $N^{\max}_{K,J} = J(K - J) + \sum_{j=1}^{J} j$.

Proof. We prove the bound by reduction. When there are more than $J$ quotas left in the network, at least one UE must be associated after every $J$ iterations. The worst case is when no association occurs in the first $J - 1$ iterations and one occurs only at the $J$th iteration. Using this logic, when there are $J$ unassociated users left, an upper bound on the maximum number of iterations up to this point is $J(K - J)$. From this point on, if there are $L$ quotas left, then the longest time it takes to get at least one UE associated is $L$ applications, in the case that those quotas belong to different BSs. Thus after at most $L$ applications, the number of unassociated UEs and the available quota reduce to $L - 1$. Using this reduction, we can obtain an upper bound on the maximum number of iterations for a network with $J$ BSs and $K$ UEs as $N^{\max}_{K,J} = J(K - J) + \sum_{j=1}^{J} j$. ∎

The bound in Lemma 4 is conservative and quite loose, as indicated by our numerical results; nevertheless, it provides a concrete cutoff value on the worst convergence time. The actual worst convergence time is found numerically to be significantly smaller than the bound.
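For reference, the bound of Lemma 4 is straightforward to evaluate, using $\sum_{j=1}^{J} j = J(J+1)/2$. The function name below is ours, and the check $K \geq J$ is an assumption we add so that the first term is non-negative.

```python
def ea_plu_ra_iteration_bound(num_ues, num_bss):
    """Upper bound J(K - J) + J(J + 1)/2 on EA-PLU-RA iterations
    under critical load (Lemma 4)."""
    K, J = num_ues, num_bss
    if K < J:
        raise ValueError("the sketch assumes K >= J")
    return J * (K - J) + J * (J + 1) // 2
```

For a network with $J = 5$ BSs and $K = 35$ UEs, this gives $5 \cdot 30 + 15 = 165$ iterations.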

C. Complexity of Matching Games

In this subsection, we analyze the complexity of user association matching games.

1) Computation complexity of building preference lists: Prior to starting the game, we need to build the preference lists of BSs and UEs. This process needs to be performed by each player of the game. As mentioned earlier, preference lists can be built based on the user's instantaneous rate or some local measurements at the UE. In practical scenarios, this information can be measured at a minimal computational complexity using the mechanism described in Sec. III-B. In a distributed matching game, each UE can locally measure the received SINR from each BS separately, then compute the instantaneous rate using a single computation. After computing the instantaneous rates, the UE sorts these rates as a means of ranking the BSs in descending order to build its preference list. This sorting step results in a computational cost of $O(J \log(J))$ at each UE. Similarly, we obtain the computational cost of building a preference list at each BS as $O(K \log(K))$. As a result, the total computational cost for building the preference lists of all UEs and BSs is $O(JK \log(JK))$.

2) Game execution complexity: During a matching game, each UE applies to a BS by sending an application message, and it is notified about the BS decision via a response message. This process is the same for DA and EA games at the UEs' side, and the number of iterations specifies the execution complexity for each game. However, the execution complexity of each game is different at the BSs' side because the BSs respond to applicants in different ways. At each iteration of the DA game, each BS needs to perform a sorting procedure (Step 5) with cost $O(K \log(K))$, which incurs a total computational cost of order $O(JK \log(K))$ at each iteration. In the EA games, no such sorting is required by the BSs, and thus they have a much lower execution complexity than the DA game on the BSs' side.

Our numerical results show that the number of iterations of these games is usually around the number of BSs ($J$). Thus, the total execution complexities of the DA game and EA games are $O(J^2 K \log(K))$ and $O(J)$, respectively. We assume BSs and UEs perform their actions based on their most recently updated preference lists through the reporting mechanism described in Sec. IV-A3. Thus, updating the preference lists in the EA-PLU and EA-PLU-RA games does not increase the execution complexity of these games. Considering both the computational complexity of building the preference lists and executing the game, the total computational complexities for the DA game and EA games are $O(J^2 K \log(K))$ and $O(JK \log(JK))$, respectively. In the complexity analysis of the centralized WCS algorithm in [7], we showed that the total complexity of that algorithm is $O(M_j K \log(K))$, which is much higher than the one for distributed matching games since $M_j \gg J$. The computational complexities of the centralized, distributed, and semi-distributed (discussed later in Sec. V-B) user association schemes are summarized in Table III.

V. Multi-Game Matching Algorithm

The distributed matching games are fast and efficient in terms of delay and power consumption, but due to their distributed nature, they may not reach the performance of centralized algorithms. If we can afford some delay and additional power consumption as well as a minimal signaling exchange, we can further enhance the performance in terms of a network utility. In this section, we introduce a user association optimization problem which aims to maximize a network utility function, then propose a multi-game matching algorithm which requires running multiple rounds of a game and a central entity to keep track of the best association vector. Each game is still run in an entirely distributed fashion, and only the resulting association vector is sent to the central entity for tracking.

Algorithm 3: Multi-Game Matching Algorithm for User Association with Max-throughput

Data: $\mathcal{J}$, $\mathcal{K}$, $q_j$, $\forall j \in \mathcal{J}$, Path loss information
Result: Near-optimal association vector $\boldsymbol{\beta}^\star$
Initialization:
    - Set the number of games ($N$);
    - Build initial preference lists of BSs and UEs ($\mathcal{P}_j$, $\mathcal{P}_k$, $\forall k, j$) based on channel norms;
    - Perform a matching game (DA or EA) to obtain initial $\boldsymbol{\beta}$;
for $n = 1, \ldots, N$ do
    Calculate $R_{k,j}(\boldsymbol{\beta}^n)$, $\forall k, j$;
    Build preference lists $\mathcal{P}^n_j$, $\forall j \in \mathcal{J}$ and $\mathcal{P}^n_k$, $\forall k \in \mathcal{K}$;
    Perform a matching game (DA or EA) to obtain $\boldsymbol{\beta}^{n+1}$;
end
$\boldsymbol{\beta}^\star = \arg\max_{n = 1, \ldots, N} U(\mathbf{r}(\boldsymbol{\beta}^n))$.

Due to the dependency between user association and the interference structure of the network, user instantaneous rates or local measurements could change with different associations. Thus, the preference lists may change according to user associations. In particular, at the end of a user association matching game, we obtain an association vector $\boldsymbol{\beta}$ which specifies the UE-BS connections. Since the user instantaneous rate is a function of $\boldsymbol{\beta}$, the resulting preference lists at the end of a game round may be different from the original ones at the start of the same round, and performing another round of a matching game may produce a better user association in terms of a network utility. In order to keep improving the network performance, we introduce a matching algorithm which plays multiple rounds of a matching game in an iterative manner. Each round of a game aims to maximize the sum-rate as in (20), given the initial association vector obtained from the previous round of the game.

A. Multi-Game Matching Algorithm for Max-throughput

This matching algorithm requires an initial association vector $\boldsymbol{\beta}$ which can be obtained by performing a user association matching game in the initialization procedure. The preference lists of BSs and UEs for this initial game can be built based on channel norms as described in Sec. III-B.

At each subsequent iteration of the algorithm, by fixing the associations of all other UEs based on the current association vector $\boldsymbol{\beta}^n$, each UE computes the instantaneous rate it can get from each BS and reports this rate to the corresponding BS. Then, each BS (UE) updates its preference list by ranking all UEs (BSs) based on the computed instantaneous rates. Next, a matching game (DA or EA) is performed to obtain the new association vector $\boldsymbol{\beta}^{n+1}$. This new association vector will be used to establish the preference lists for the next round.

The algorithm performs the matching game $N$ times, where $N$ is a design parameter. Each time, the game is run in a distributed fashion. At the end of each game, the BSs report their associations to a central entity, called the best-$\boldsymbol{\beta}$-tracker, which computes the utility function and keeps track of the best association vector. As $N$ increases, there is a higher chance of obtaining a better association at the cost of more association delay and higher power consumption at the UEs. The value of $N$ can be determined based on practical delay and power constraints. At the end of the algorithm, the best-$\boldsymbol{\beta}$-tracker notifies the BSs of the best association vector, corresponding to the highest network utility. Although this algorithm requires a central entity to keep track of the best association vector, at each round the game is performed in a purely distributed manner. This algorithm is described in Alg. 3.

Table III
Computational complexity of centralized, distributed, and semi-distributed user association schemes

User Association Scheme                              Complexity
WCS Algorithm (centralized) [7]                      $O(M_j K \log(K))$
DA Game (distributed) [9]                            $O(J^2 K \log(K))$
Proposed EA-PLU-RA Game (distributed)                $O(JK \log(JK))$
Proposed Multi-game DA Alg. (semi-distributed)       $O(N J^2 K \log(K))$
Proposed Multi-game EA Alg. (semi-distributed)       $O(N JK \log(JK))$

B. Complexity of Multi-Game Matching Algorithm

Based on the complexity analysis in Sec. IV-C, the complexity of the proposed $N$-game matching algorithm using the DA game is $O(N J^2 K \log(K))$ and using the EA game is $O(N JK \log(JK))$, both of which are still much smaller than the cost of the centralized WCS algorithm ($O(M_j K \log(K))$) since $N$ and $J$ are usually much smaller than $K$, and $M_j$ is typically a large number. We note that all computations in the WCS algorithm are carried out by a central coordinator. In the proposed multi-game matching algorithm, however, computations are distributed among the best-$\boldsymbol{\beta}$-tracker (for computing the network utility and keeping track of the best association vector), and the BSs and UEs, which need to build their own preference lists. A summary of the computational complexity of different user association schemes is shown in Table III.
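The multi-game procedure of Alg. 3 can be sketched as follows. This is an illustration under our own assumptions: `play_game` stands for any single matching game (DA or EA) that returns an association vector, `rate_fn` is a hypothetical callable returning the $K \times J$ matrix of rates $R_{k,j}(\boldsymbol{\beta})$ for a given association, and the best-$\boldsymbol{\beta}$-tracker is reduced to two local variables.

```python
def sum_rate(rates, beta):
    """Sum-rate utility as in (20); unassociated UEs contribute nothing."""
    return sum(rates[k][j] for k, j in enumerate(beta) if j is not None)

def build_prefs(rates):
    """Rank BSs per UE and UEs per BS in descending order of rate."""
    K, J = len(rates), len(rates[0])
    ue_prefs = [sorted(range(J), key=lambda j: -rates[k][j]) for k in range(K)]
    bs_prefs = [sorted(range(K), key=lambda k: -rates[k][j]) for j in range(J)]
    return ue_prefs, bs_prefs

def multi_game(play_game, rate_fn, ue_prefs, bs_prefs, quota, n_games):
    """Play n_games further rounds after an initial game and keep the
    association vector with the highest sum-rate.

    play_game(ue_prefs, bs_prefs, quota) -> beta   (hypothetical signature)
    rate_fn(beta) -> K x J matrix of rates under association beta
    """
    beta = play_game(ue_prefs, bs_prefs, quota)          # initial game
    best_beta, best_u = beta, sum_rate(rate_fn(beta), beta)
    for _ in range(n_games):
        rates = rate_fn(beta)                    # rates depend on current beta
        ue_prefs, bs_prefs = build_prefs(rates)  # rebuilt preference lists
        beta = play_game(ue_prefs, bs_prefs, quota)
        u = sum_rate(rate_fn(beta), beta)
        if u > best_u:                           # role of the best-beta tracker
            best_beta, best_u = beta, u
    return best_beta, best_u
```

Only the comparison of utilities needs a central view; the game itself is passed in as a black box, which reflects the semi-distributed structure described in this section.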

Figure 4. Comparing the association delay of DA and EA games under critical load scenario. (a) Average association delay in a HetNet with 1 MCBS and $J - 1$ SCBSs; (b) empirical PDF of users' association delay in a HetNet with $J = 5$ BSs and $K = 35$ UEs. The vertical lines show the 25 and 75 percentile bars. The time unit is the amount of time for sending an application and receiving a response.

Figure 5. Comparing the power consumption of DA and EA games under critical load scenario with $q_1 = 15$. (a) Average user power consumption for association in a HetNet with 1 MCBS and $J - 1$ SCBSs; (b) empirical PDF of users' power consumption of association in a HetNet with $J = 5$ BSs and $K = 35$ UEs. The vertical lines show the 25 and 75 percentile bars. The power unit is the amount of power consumed during an application and response step.

VI. Numerical Results

In this section, we evaluate the performance of the proposed user association matching games and algorithm in the downlink of a mmWave-enabled HetNet with $J$ BSs and $K$ UEs. The network includes 1 MCBS operating at 1.8 GHz with quota $q_1 = 15$ and $J - 1$ SCBSs […] $q_j = 5$, $j \in \{2, \ldots, J\}$. Unless otherwise stated, we consider a HetNet with 1 MCBS, 4 SCBSs, and 35 UEs. Sub-6 GHz channels and mmWave channels are generated as described in Sec. II-A. We assume each mmWave channel is composed of 5 clusters with 10 rays per cluster. In order to implement 3D beamforming, each MCBS is equipped with a massive MIMO antenna with 64 elements, each SCBS has an $8 \times$ […] $\times$ 500 m square area where the BSs are placed at specific locations and the UEs are distributed randomly according to a PPP with a density of $K$ UEs within the given area.

A. Association Delay and Power Consumption

Fig. 4 compares the DA and EA games in terms of association delay under the critical load scenario, with average delay on the left and delay distribution on the right. Subfigure (a) shows that the EA games significantly outperform the DA game in terms of average association delay. This advantage becomes more significant as the network size increases. Subfigure (b) illustrates that all EA games perform better than the DA game in terms of user association delay, with the delay distribution more concentrated around low delay values. For example, the association delay under the EA games for about 50% of the UEs is only one time unit, while for the DA game, more than 96% of the UEs have an association delay of at least 5 time units. Also, the maximum delay is 8 time units for the DA game, […]

Figure 6. Comparing the empirical CDF of user association delay for EA and DA matching games in a HetNet with $J = 5$ BSs and $K = 35$ UEs.

Figure 7. Comparing the effect of different loading scenarios on average association delay in a HetNet with $J = 5$ BSs and $\mathbf{q} = [15, 5, 5, 5, 5]$. The shaded regions mark the underload, lightly overload, and heavily overload cases, and the vertical line marks the critical load.

[…] can serve a maximum of 35 UEs, we increase the number of UEs such that the network transitions from underload (left shaded region) to critical load (vertical line at $K = 35$ UEs), lightly overload (middle shaded region), and finally heavily overload (right shaded region) cases, in order to investigate the effect of different loading scenarios on association delay. We observe an interesting effect: in the DA game, the average delay in the overload cases is exactly equal to the number of BSs, since the DA game terminates when all UEs are waitlisted by BSs, and this happens at the end of the $J$th iteration since each UE has no more than $J$ options. For the EA-Base and EA-PLU games, the association delay is always less than $J$ since UEs do not reapply to BSs. A different trend is observed for the EA-PLU-RA game due to the reapplication process and since BSs only accept UEs within their quota. Thus, the higher the number of UEs, the more reapplications and the larger the association delay. Note that the heavily overload region, where the delay of EA-PLU-RA crosses that of DA, occurs when the number of UEs in the network is about twice the BSs' total quota, which is unlikely to happen in the real world.

B. Percentage of Unassociated Users

In practice, the number of UEs in the network is not always equal to the total quota of BSs. Thus, there may be unassociated UEs at the end of an association game. For the next simulation, we increase both the numbers of BSs and UEs while keeping the BSs' quotas fixed. Fig. 8 compares the percentage of unassociated UEs under three loading scenarios: a) underload, b) critical load, and c) overload. For the underload case (a), the number of UEs is 20% less than the total quota of BSs, whereas for the overload case (c), the number of UEs is 20% more. This figure shows that the proposed EA-PLU-RA game has similar performance to the DA game under all three loading scenarios, since both games guarantee that the maximum number of UEs (limited by the BSs' quotas) are associated at the end of the game (see Lemma 2). Also, it can be inferred that without the preference list updating and reapplying steps (EA-Base and EA-PLU), there may be unassociated UEs even for the underload and critical load cases. Thus, these two steps are necessary to make the best use of the available resources provided by BSs.

Figure 8. Percentage of unassociated UEs under three different loading scenarios: a) underload (-20%), b) critical load (0%), and c) overload (20%). The number of BSs and UEs increases while BS quotas are fixed at $q_1 = 15$, $q_j = 5$, $j \in \{2, \ldots, J\}$.

Fig. 9 compares the network spectral efficiency of the HetNet for several association schemes: 1) centralized WCS algorithm [7], 2) proposed multi-game DA algorithm with $N = 10$, 3) proposed multi-game EA algorithm with $N = 10$, 4) distributed single DA game [9], 5) proposed distributed single EA game, 6) max-SINR association, and 7) random association. The EA game used in this simulation is the one with preference list updating and reapplying (EA-PLU-RA). For the single matching games and the initial round of the multi-game algorithms, the preference lists are built based on channel norms which include both instantaneous and large-scale CSI. For the multi-game matching algorithms, we use the matching obtained from the corresponding single matching game as the initial association vector, and run each algorithm for $N = 10$ […]

Figure 9. Comparing the average network spectral efficiency of user association schemes in a HetNet with $J = 5$ BSs and $K = 35$ UEs, as a function of the transmit power of MCBSs (dBm).

[…] slightly more power overhead, the EA games result in a significantly faster association process compared to the DA game while achieving better network spectral efficiency. The EA-PLU-RA game with preference list updating and reapplication provides the best overall network performance in terms of both association delay and percentage of associated users. These results suggest that stability is a less relevant metric for user association and EA may be more suitable in B5G wireless networks for real-time distributed association. Next, we proposed a multi-game matching algorithm to further enhance the network spectral efficiency by running multiple rounds of a matching game. Numerical results show that the proposed distributed EA games and multi-game algorithm achieve a network spectral efficiency within 80-90% of the near-optimal centralized WCS benchmark algorithm, while incurring a complexity several orders of magnitude lower and significantly less overhead due to their distributed or semi-distributed nature.

References

[1] Q. Ye, B. Rong, Y. Chen, M. Al-Shalash, C. Caramanis, and J. G. Andrews, “User association for load balancing in heterogeneous cellular networks,” IEEE Trans. Wireless Commun., vol. 12, no. 6, pp. 2706–2716, 2013.
[2] D. Bethanabhotla, O. Y. Bursalioglu, H. C. Papadopoulos, and G. Caire, “Optimal user-cell association for massive MIMO wireless networks,” IEEE Trans. Wireless Commun., vol. 15, no. 3, pp. 1835–1850, 2016.
[3] G. Athanasiou, P. C. Weeraddana, C. Fischione, and L. Tassiulas, “Optimizing client association for load balancing and fairness in millimeter-wave wireless networks,” IEEE/ACM Trans. Netw., vol. 23, no. 3, pp. 836–850, 2015.
[4] S. Niknam and B. Natarajan, “On the regimes in millimeter wave networks: Noise-limited or interference-limited?” in […], May 2018, pp. 1–6.
[5] M. Rebato, M. Mezzavilla, S. Rangan, F. Boccardi, and M. Zorzi, “Understanding noise and interference regimes in 5G millimeter-wave cellular networks,” in […], May 2016, pp. 1–5.
[6] M. Zalghout, J. Helard, M. Crussiere, S. Abdul-Nabi, and A. Khalil, “A greedy heuristic algorithm for context-aware user association and resource allocation in heterogeneous wireless networks,” in

Proc. IEEE 86th Veh. Technol. Conf. (VTC-Fall) ,Sep. 2017, pp. 1–7.

Quota of SCBSs E x e c u t i on R un T i m e ( t i m e un i t ) Centralized WCS Algorithm [7]Distributed DA Game (Ch. norm) [9]Proposed Distributed EA Game (Ch. norm)Proposed Multi-game DA Alg. (N=10)Proposed Multi-game EA Alg. (N=10)

Figure 10. Comparing the execution run time of user associationschemes in a HetNet with 𝐽 = 𝐾 =

35 UEs.[7] A. Alizadeh and M. Vu, “Load balancing user association inmillimeter wave mimo networks,”

IEEE Trans. Wireless Com-mun. , vol. 18, no. 6, pp. 2932–2945, 2019.[8] Y. Xu, H. S. Ghadikolaei, and C. Fischione, “Adaptive dis-tributed association in time-variant millimeter wave networks,”

IEEE Trans. Wireless Commun. , vol. 18, no. 1, pp. 459–472,2019.[9] D. Gale and L. S. Shapley, “College admissions and the stabilityof marriage,”

The American Mathematical Monthly , vol. 69,no. 1, pp. 9–15, 1962.[10] Y. Gu, W. Saad, M. Bennis, M. Debbah, and Z. Han, “Matchingtheory for future wireless networks: Fundamentals and appli-cations,”

IEEE Commun. Mag. , vol. 53, no. 5, pp. 52–59, 2015.[11] Y. Du, J. Li, L. Shi, T. Liu, F. Shu, and Z. Han, “Two-tier match-ing game in small cell networks for mobile edge computing,”

IEEE Transactions on Services Computing , pp. 1–1, 2019.[12] O. Semiari, W. Saad, S. Valentin, M. Bennis, and B. Ma-ham, “Matching theory for priority-based cell association inthe downlink of wireless small cell networks,” in

Proc. IEEEICASSP , 2014, pp. 444–448.[13] W. Saad, Z. Han, R. Zheng, M. Debbah, and H. V. Poor, “Acollege admissions game for uplink user association in wirelesssmall cell networks,” in

Proc. IEEE INFOCOM , 2014, pp. 1096–1104.[14] A. Alizadeh and M. Vu, “Early Acceptance Matching Game forUser Association in 5G Cellular HetNets,” in

Proc. IEEE GlobalCommun. Conf. (GLOBECOM) , Dec. 2019.[15] E. Telatar, “Capacity of multi-antenna gaussian channels,”

Eur.Trans. Telecommun. , vol. 10, no. 6, pp. 585–595, 1999.[16] 3rd Generation Partnership Project (3GPP), “Study on channelmodel for frequencies from 0.5 to 100 GHz,” Technical Report38.901, Jun. 2018, v. 15.0.0.[17] T. A. Thomas, H. C. Nguyen, G. R. MacCartney, and T. S.Rappaport, “3D mmWave channel model proposal,” in

Proc.IEEE 80th Veh. Technol. Conf. (VTC Fall) , 2014, pp. 1–6.[18] M. K. Samimi, T. S. Rappaport, and G. R. MacCartney, “Prob-abilistic omnidirectional path loss models for millimeter-waveoutdoor communications,”

IEEE Wireless Commun. Lett. , vol. 4,no. 4, pp. 357–360, 2015.[19] A. E. Roth and M. Sotomayor, “Two-sided matching,”