Cooperative Interference Management for Over-the-Air Computation Networks

Abstract

This paper considers a multi-cell over-the-air computation (AirComp) network and investigates the optimal power control policies over multiple cells to regulate the effect of inter-cell interference. First, we consider the scenario of centralized multi-cell power control, where we characterize the Pareto boundary of the multi-cell mean squared error (MSE) region by minimizing the sum MSE subject to a set of constraints on individual MSEs. Though the sum-MSE minimization problem is non-convex and its direct solution is intractable, we solve it optimally via a sequence of convex second-order cone program (SOCP) feasibility problems together with a bisection search. Next, we consider distributed power control for the scenario without a centralized controller, for which an alternative method based on interference temperature (IT) is proposed to characterize the same MSE Pareto boundary and to enable a decentralized power control algorithm. Accordingly, each access point (AP) only needs to individually control the power of its associated devices, subject to a set of IT constraints on their interference to neighboring cells, while different APs cooperate in iteratively updating the IT levels by pairwise information exchange to achieve a Pareto-optimal MSE tuple. Last, simulation results demonstrate that cooperative power control using the proposed algorithms can substantially reduce the sum MSE of AirComp networks.


arXiv [cs.IT]

Xiaowen Cao, Guangxu Zhu, Jie Xu, and Kaibin Huang

Abstract

Recently, over-the-air computation (AirComp) has emerged as an efficient solution for access points (APs) to aggregate distributed data from many edge devices (e.g., sensors) by exploiting the waveform superposition property of multiple access (uplink) channels. While prior work focuses on the single-cell setting where inter-cell interference is absent, this paper considers a multi-cell AirComp network limited by such interference and investigates the optimal policies for controlling devices' transmit power to minimize the mean squared errors (MSEs) in aggregated signals received at different APs. First, we consider the scenario of centralized multi-cell power control. To quantify the fundamental AirComp performance tradeoff among different cells, we characterize the Pareto boundary of the multi-cell MSE region by minimizing the sum MSE subject to a set of constraints on individual MSEs. Though the sum-MSE minimization problem is non-convex and its direct solution is intractable, we show that this problem can be optimally solved via equivalently solving a sequence of convex second-order cone program (SOCP) feasibility problems together with a bisection search. This results in an efficient algorithm for computing the optimal centralized multi-cell power control, which optimally balances the interference-and-noise-induced errors and the signal misalignment errors unique to AirComp. Next, we consider the other scenario of distributed power control, e.g., when a centralized controller is unavailable. In this scenario, we introduce a set of interference temperature (IT) constraints, each of which constrains the maximum total inter-cell interference power between a specific pair of cells. Accordingly, each AP only needs to individually control the power of its associated devices for single-cell MSE minimization, but subject to a set of IT constraints on their interference to neighboring cells. By optimizing the IT levels, the distributed power control is shown to provide an alternative method for characterizing the same multi-cell MSE Pareto boundary as the centralized counterpart. Building on this result, we further propose an efficient algorithm for different APs to cooperate in iteratively updating the IT levels to achieve a Pareto-optimal MSE tuple, by pairwise information exchange. Last, simulation results demonstrate that cooperative power control using the proposed algorithms can substantially reduce the sum MSE of AirComp networks compared with the conventional single-cell approaches.

X. Cao is with the Future Network of Intelligence Institute (FNii), The Chinese University of Hong Kong (Shenzhen), Shenzhen, China, and the School of Information Engineering, Guangdong University of Technology, Guangzhou, China (e-mail: [email protected]).
G. Zhu is with Shenzhen Research Institute of Big Data, Shenzhen, China (e-mail: [email protected]).
J. Xu is with the Future Network of Intelligence Institute (FNii) and the School of Science and Engineering (SSE), The Chinese University of Hong Kong (Shenzhen), Shenzhen, China (e-mail: [email protected]). J. Xu is the corresponding author.
K. Huang is with the Dept. of Electrical and Electronic Engineering, The University of Hong Kong, Pok Fu Lam, Hong Kong (e-mail: [email protected]).

Index Terms

Over-the-air computation, multi-cell cooperation, power control, interference management, interference temperature.

I. INTRODUCTION

One common operation of future Internet-of-Things (IoT) is to aggregate sensing data or computation results transmitted by many edge devices (e.g., sensors and smart phones). Recently, over-the-air computation (AirComp) has emerged as a promising solution for such fast wireless data aggregation (WDA) as required by ultra-low-latency and high-mobility applications [1]–[3]. The core idea of AirComp is to exploit the signal-superposition property of a multiple access channel (MAC) for "over-the-air aggregation". This enables an access point (AP) to directly receive the aggregated version of the simultaneously transmitted data from devices. The sharing of the whole spectrum by all devices overcomes the issue of long latency faced in massive access. With proper pre-processing at devices and post-processing at the AP, AirComp can go beyond averaging to compute a class of so-called nomographic functions (e.g., geometric mean and polynomial functions). As a result, AirComp finds a wide range of applications ranging from distributed sensing [2], [3] to distributed consensus [4] to distributed machine learning [5]–[9].

The theme of this paper is to design techniques for cooperative interference management to facilitate the large-scale implementation of AirComp in multi-cell networks.

A. Over-the-Air Computation

The concept of AirComp was first studied from the information-theoretic perspective in [1], where structured codes are designed to exploit interference arising from simultaneous transmissions for fast functional computation over a MAC. Subsequently, a strong result was proved in [2] that simple AirComp with uncoded analog transmission is optimal in terms of minimizing the noise-induced distortion in WDA, which we term AirComp error, if the sensing data sources are independent and identically Gaussian distributed [2]. Another vein of research on AirComp focuses on the signal processing perspective [10]–[12]. The optimal scheme for power allocation that minimizes the AirComp error was studied targeting distributed signal estimation from noisy observations [10]. In [11], the authors proposed power allocation schemes under a different criterion of minimum outage probability where an outage event occurs when the AirComp error exceeds a certain threshold. One requirement for implementing AirComp is the synchronization between transmissions by devices. A solution for meeting the requirement was proposed in [12], in which the AP broadcasts a reference-clock signal to all devices.

Most recent advancements in AirComp have led to its integration with more complex wireless techniques and systems, and its new application to the area of distributed machine learning. Multiple-input-multiple-output (MIMO) AirComp was developed to exploit spatial multiplexing for supporting vector-valued functional computation targeting multi-modal sensing [13], [14]. The channel feedback overhead in MIMO AirComp was then exempted in [15], by solving a bilinear estimation problem that can recover both the channel information and the desired functions simultaneously from a set of noisy received aggregated signals. In the fast growing area of distributed machine learning, AirComp finds a new application in efficiently enabling an edge server to aggregate distributed learning results transmitted by edge devices [5]–[9].

Power control for AirComp, which is the theme of this work, concerns controlling the transmit power of energy-constrained edge devices to cope with channel fading and noise that can potentially result in unacceptable AirComp errors. The simple transmission scheme of channel inversion is widely adopted in the AirComp literature to overcome fading so that multiuser signals arriving at an AP are aligned in magnitude, which is required for receiving a desired functional value of distributed data [5], [7], [13], [14]. However, it is well established that the channel inversion incurs severe noise amplification when channels are in deep fade, resulting in large AirComp errors. To address this issue, the corresponding power control needs to be jointly designed over devices with the objective of minimizing the AirComp error. This differs from the power control in conventional systems with different objectives of enhancing data rates or ensuring link reliability, which result in well-known policies such as water filling or channel truncation (see, e.g., [16]). Based on the metric of mean squared error (MSE), the optimal power control policies were studied for AirComp under individual power constraints in [8], [17], [18] and under a sum power constraint in [10], [11], which were found to have different structures from their counterparts for conventional wireless communication systems. For instance, to minimize MSE, devices with relatively weak links should transmit with full power but others should perform channel inversion [17], achieving a balance between the noise-induced errors and the signal misalignment errors.

While the prior work assumes a single-cell network, we envision the large-scale deployment of AirComp in a multi-cell network, to support ubiquitous coverage for next-generation IoT. This leads to simultaneous AirComp tasks in different cells, each of which is characterized by its application and corresponding data type (e.g., sensing or learning) as well as aggregation function (e.g., averaging or geometric mean). The coexistence of different AirComp tasks, however, affects each other due to the inter-cell interference. This gives rise to the new challenge of managing such interference by multi-cell cooperation so as to rein in the errors in the coexisting tasks.

B. Cooperative Interference Management

Cooperative interference management for conventional radio access networks is a well-studied area (see, e.g., [19] and the references therein). A wide range of relevant techniques and issues have been studied such as beamforming [20], [21], network throughput [22], [23] and power control [24], [25]. However, the challenges faced by designing cooperative interference management for AirComp networks differ from those for conventional radio access networks as they provide different services. A conventional radio access network is designed to support radio access to users. In contrast, the function of an AirComp network is to perform WDA over devices that are either sensors or workers. This results in different operations and performance metrics for the conventional radio access networks and the emerging AirComp networks. In terms of operations, the former suppresses multiuser interference so as to support multiuser data streams while the latter aggregates simultaneous data streams to compute a desired function. In terms of performance metrics, the conventional ones measure rates or reliability (e.g., sum rate [21], [22] and outage probability [25]) while those for AirComp should measure the accuracy in the received functional value (e.g., MSE). Cooperative interference management for AirComp networks remains a largely uncharted area. Recently, an initial study in this area was reported in [26], where a scheme called simultaneous signal-and-interference alignment is proposed to maximize the number of interference-free aggregated data streams in a two-cell AirComp network. In this work, we study the same theme of multi-cell cooperation but from a different perspective, namely power control.

C. Main Contributions

In this paper, we consider an AirComp network comprising multiple cells. In each cell, one AP serves as a fusion center to aggregate data from multiple devices in the same cell. The aggregated signal received at each AP is exposed to inter-cell interference due to simultaneous uplink transmissions by devices in the neighboring cells. A novel framework of coordinated power control for managing such interference to suppress the errors in coexisting AirComp tasks is presented in the current work. The main contributions of this work are summarized as follows.

• Multi-cell MSE tradeoffs with centralized coordinated power control: In this scenario, the transmit power of all devices is subject to centralized control by a centralized network controller. First, to understand the fundamental MSE performance trade-offs among these cells, we characterize the Pareto boundary of the AirComp MSE region of simultaneous AirComp tasks in different cells using the so-called MSE-profiling technique. This is equivalent to minimizing the sum MSE of all APs subject to a set of MSE constraints for individual cells and individual transmit power constraints at devices, in which the devices' transmit powers and APs' signal scaling factors for noise suppression, called denoising factors, are jointly optimized. The problem is non-convex due to the coupling between power control variables and denoising factors. Though the direct solution is intractable, we propose an alternative approach to obtain its optimal solution by equivalently solving a sequence of problems, each being a convex second order cone program (SOCP), combined with a simple bisection search. This leads to an efficient algorithm for computing the optimal policy for centralized coordinated power control, which optimally balances the suppression of the interference-and-noise-induced errors and signal misalignment errors.

• Distributed power control with interference-temperature coordination: We consider another scenario of distributed power control, where the centralized controller is unavailable. The distributed power control is realized by introducing a set of interference temperature (IT) constraints, each of which limits the maximum power of total interference from one cell to the other. Given the IT constraints, the multi-cell power control reduces to single-cell operations, where each AP only needs to control the power of its associated devices for single-cell MSE minimization. While power control is distributed, multi-cell cooperation is realized by optimizing the IT levels. It is shown that by proper control of the IT levels, the same MSE Pareto boundary as the centralized counterpart can be achieved. To materialize the gain promised by such optimality, we further propose an efficient algorithm for different APs to cooperate in iteratively updating the IT levels for practically achieving a Pareto-optimal MSE tuple, by only pairwise information exchange. Based on the algorithm, all cells are ensured to monotonically reduce their individual MSE values from the starting point corresponding to no cooperation, providing incentives for the cells to cooperate.

• Performance evaluation: Simulation results are presented to validate the derived analytical results. It is shown that both the centralized and distributed implementation of coordinated power control can substantially improve the AirComp performance, compared with the conventional design without cooperation.

Figure 1. A multi-cell AirComp network, where, in each cell, an AP aims at aggregating data from its associated edge devices.

The remainder of the paper is organized as follows. Section II presents the system model of the AirComp networks. Sections III and IV present the centralized and distributed power control for characterizing the Pareto boundary of MSE region, respectively. Finally, Section V presents the simulation results, followed by the conclusion in Section VI.

Notations: Bold lowercase and uppercase letters refer to column vectors and matrices, respectively. $\mathbb{E}(\cdot)$ denotes the expectation operation, and the superscript $T$ represents the transpose operation. For a complex number $a$, $\mathrm{Re}\{a\}$ denotes the real part and the superscript $\dagger$ denotes the conjugate operation. For a vector $\mathbf{a}$, $\|\mathbf{a}\|$ denotes the Euclidean norm. $|\mathbf{A}|$ denotes the determinant of a square matrix $\mathbf{A}$.

II. SYSTEM MODEL AND PERFORMANCE METRICS

A. System Model

We consider a multi-cell AirComp network with multiple APs as shown in Fig. 1, where each AP acting as a fusion center aggregates sensing data (e.g., temperature, humidity) from edge devices. In each cell, the aggregated signal received at the AP is exposed to the inter-cell interference caused by uplink transmission of devices in neighboring cells. Let $\mathcal{L} \triangleq \{1, 2, \cdots, L\}$ denote the set of $L$ APs, each dealing with a heterogeneous type of data of interest, and let $\mathcal{K}_\ell \triangleq \{\sum_{i=1}^{\ell-1} K_i + 1, \cdots, \sum_{i=1}^{\ell-1} K_i + K_\ell\}$ denote the set of $K_\ell \ge 1$ edge devices collecting sensing readings associated with AP $\ell \in \mathcal{L}$, with $K_0 \triangleq 0$ and $\mathcal{K}_\ell \cap \mathcal{K}_j = \emptyset$, $\forall \ell \ne j$, $\ell, j \in \mathcal{L}$. Let $\mathcal{K}$ denote the set of all $K$ devices with $\mathcal{K} \triangleq \mathcal{K}_1 \cup \mathcal{K}_2 \cup \cdots \cup \mathcal{K}_L$ and $K = \sum_{\ell\in\mathcal{L}} K_\ell$. Specifically, AP $\ell$ needs to estimate the average of the type-$\ell$ data from the $K_\ell$ devices in $\mathcal{K}_\ell$. Let $X_k$ denote the sensing reading measured by device $k \in \mathcal{K}_\ell$ associated with AP $\ell \in \mathcal{L}$, which is assumed to be independent and identically distributed (i.i.d.) over devices. The desired average of type-$\ell$ data at AP $\ell$, denoted by $\tilde f_\ell$, is given by

$$ \tilde f_\ell = \frac{1}{K_\ell}\Big(\sum_{k\in\mathcal{K}_\ell} X_k\Big), \quad \forall \ell \in \mathcal{L}. \tag{1} $$

To facilitate power control, $X_k$ is normalized as $s_k \triangleq \Psi_\ell(X_k)$, $\forall k \in \mathcal{K}_\ell$, $\ell \in \mathcal{L}$ [17]. The linear function $\Psi_\ell(\cdot)$ denotes the normalization operation that ensures $\{s_k\}_{k\in\mathcal{K}_\ell}$ have zero mean and unit variance, assuming $\{X_k\}_{k\in\mathcal{K}_\ell}$ have identical means and variances. Upon receiving the average of the transmitted data $\{s_k\}_{k\in\mathcal{K}_\ell}$ at each AP $\ell \in \mathcal{L}$, i.e.,

$$ f_\ell = \frac{1}{K_\ell}\sum_{k\in\mathcal{K}_\ell} s_k, \quad \forall \ell \in \mathcal{L}, \tag{2} $$

the AP can simply recover the desired $\tilde f_\ell$ from $f_\ell$ via the de-normalization operation

$$ \tilde f_\ell = \Psi_\ell^{-1}(f_\ell), \tag{3} $$

in which $\Psi_\ell^{-1}(\cdot)$ represents the inverse function of $\Psi_\ell(\cdot)$.
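The normalization and de-normalization steps in (1)–(3) can be illustrated with a minimal Python sketch. It assumes that the normalization takes the affine form $\Psi_\ell(x) = (x - \mu)/\sigma_X$ with a known mean and standard deviation; the function names, the readings, and the statistics below are hypothetical and chosen only for illustration.

```python
from statistics import mean

def normalize(x, mu, sd):
    """Assumed form of Psi: shift and scale a raw reading X_k into s_k."""
    return (x - mu) / sd

def denormalize(f, mu, sd):
    """Inverse of the assumed Psi, applied to the aggregated value f."""
    return sd * f + mu

# Hypothetical temperature readings and assumed (known) statistics.
readings = [21.0, 23.5, 22.0, 24.5]
mu, sd = 22.0, 2.0

s = [normalize(x, mu, sd) for x in readings]  # transmitted symbols s_k
f = mean(s)                                   # target-function value, as in (2)
f_tilde = denormalize(f, mu, sd)              # recovered average, as in (3)

# Because the map is affine, averaging commutes with it.
assert abs(f_tilde - mean(readings)) < 1e-12
```

Since the assumed map is affine, averaging the normalized symbols and then inverting the map gives the same result as averaging the raw readings, which is why the AP only needs $f_\ell$.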
Given this one-to-one mapping between $f_\ell$ and $\tilde f_\ell$, we refer to $f_\ell$ as the target-function value in this paper.

To design adaptive power control, it is sufficient to consider a single realization of channels and analyze the control policy as a function of the channel states. Let $h_k$ and $g_{k,j}$, $\forall k \in \mathcal{K}_\ell$, denote the channel coefficient of the data link between device $k \in \mathcal{K}_\ell$ and its associated AP $\ell \in \mathcal{L}$, and that of the interference link between device $k \in \mathcal{K}_\ell$ and non-associated AP $j \in \mathcal{L}\setminus\{\ell\}$, respectively. Let $b_k$ denote the transmit coefficient at device $k \in \mathcal{K}_\ell$, $\ell \in \mathcal{L}$, for transmitting information to AP $\ell$. Therefore, the received signal at each AP $\ell$ is

$$ y_\ell = \sum_{k\in\mathcal{K}_\ell} h_k b_k s_k + \sum_{j\in\mathcal{L}\setminus\{\ell\}} \sum_{i\in\mathcal{K}_j} g_{i,\ell} b_i s_i + w_\ell, \quad \forall \ell \in \mathcal{L}, \tag{4} $$

where $w_\ell \sim \mathcal{CN}(0, \sigma^2)$ models channel noise at AP $\ell$. To invert the phase of the data link for signal alignment at the AP, the transmit coefficient $b_k$ is set as $b_k = \sqrt{p_k}\, h_k^\dagger / |h_k|$, where $p_k \ge 0$ denotes the transmit power at device $k$, which is the control variable of interest. Then (4) reduces to

$$ y_\ell = \sum_{k\in\mathcal{K}_\ell} |h_k| \sqrt{p_k}\, s_k + \sum_{j\in\mathcal{L}\setminus\{\ell\}} \sum_{i\in\mathcal{K}_j} \tilde g_{i,\ell} \sqrt{p_i}\, s_i + w_\ell, \quad \forall \ell \in \mathcal{L}, \tag{5} $$

where $\tilde g_{i,\ell} \triangleq g_{i,\ell} h_i^\dagger / |h_i|$, $i \in \mathcal{K}_j$, $j \in \mathcal{L}\setminus\{\ell\}$, represents the effective interference channel to AP $\ell$. Following the existing approach (see, e.g., [17]), the signal $y_\ell$ is scaled at AP $\ell$ using a denoising factor denoted by $\eta_\ell$. The scaled signal is given as

$$ \hat f_\ell = \frac{\mathrm{Re}\{y_\ell\}}{K_\ell \sqrt{\eta_\ell}}. \tag{6} $$

Furthermore, in practice, each device $k \in \mathcal{K}_\ell$ is constrained by a maximum power budget $\bar P_k$:

$$ p_k \le \bar P_k, \quad \forall k \in \mathcal{K}_\ell,\ \ell \in \mathcal{L}. \tag{7} $$

B. Performance Metrics

We are interested in minimizing the distortion of the recovered average of the transmitted data, with respect to (w.r.t.) the ground truth average $f_\ell$, $\forall \ell \in \mathcal{L}$, at each AP $\ell$. The AirComp error in cell $\ell$ is measured by the corresponding instantaneous MSE defined as

$$ \widetilde{\mathrm{MSE}}_\ell(\{p_k\}_{k\in\mathcal{K}}, \eta_\ell) = \mathbb{E}\big[(\hat f_\ell - f_\ell)^2\big] = \frac{1}{K_\ell^2}\, \mathbb{E}\Big[\Big(\frac{\mathrm{Re}\{y_\ell\}}{\sqrt{\eta_\ell}} - \sum_{k\in\mathcal{K}_\ell} s_k\Big)^2\Big] = \frac{1}{K_\ell^2}\Big[\sum_{k\in\mathcal{K}_\ell}\Big(\frac{\sqrt{p_k}\,|h_k|}{\sqrt{\eta_\ell}} - 1\Big)^2 + \frac{\sigma^2 + \sum_{j\in\mathcal{L}\setminus\{\ell\}}\sum_{i\in\mathcal{K}_j} p_i |\hat g_{i,\ell}|^2}{\eta_\ell}\Big], \tag{8} $$

where the expectation is over the distribution of the transmitted signals $\{s_k\}_{k\in\mathcal{K}}$ and $\hat g_{k,\ell} = \mathrm{Re}\{\tilde g_{k,\ell}\}$. For notational convenience, we use $\mathrm{MSE}_\ell(\{p_k\}_{k\in\mathcal{K}}, \eta_\ell) = K_\ell^2\, \widetilde{\mathrm{MSE}}_\ell(\{p_k\}_{k\in\mathcal{K}}, \eta_\ell)$ in the sequel to represent the MSE without the constant factor $1/K_\ell^2$ in (8). It is observed from (8) that the "intra-cell interference" is exploited to enable functional computation, while the inter-cell interference interferes with the operation.

We define the MSE region for AirComp to be the set of MSE-tuples that are simultaneously achievable by all $L$ APs under a given set of individual maximum power constraints for the devices, given as

• MSE Region:

$$ \mathcal{M} \triangleq \bigcup_{0 \le p_k \le \bar P_k,\ \forall k\in\mathcal{K},\ \eta_\ell \ge 0,\ \forall \ell\in\mathcal{L}} \big\{(\Phi_1, \Phi_2, \cdots, \Phi_L) : \Phi_\ell \ge \mathrm{MSE}_\ell(\{p_k\}_{k\in\mathcal{K}}, \eta_\ell),\ \forall \ell \in \mathcal{L}\big\}. \tag{9} $$

We are particularly interested in the operational points on the Pareto boundary (or, equivalently, the Pareto optimal points) of the MSE region $\mathcal{M}$, which corresponds to the lower-left boundary of this MSE region. Note that on the Pareto boundary, we can only reduce a particular AP's MSE at the cost of increasing the MSE at others, as illustrated in Fig. 2.

Figure 2. Illustration of the Pareto boundary of MSE region based on the optimal coordinated power control. Axes: MSE of each cell. Legend: Pareto boundary, no power control, optimal power control, MSE profile.

III. CENTRALIZED POWER CONTROL VIA MSE-PROFILING

In this section, we focus on the scenario of centralized power control, when there exists a centralized controller with global channel state information (CSI) to coordinate all APs on the sum MSE reduction. In such a scenario, we first introduce the so-called MSE-profiling technique to characterize the Pareto boundary of MSE region. Based on this, an optimal algorithm for joint power control and denoising factor design is presented to achieve the Pareto boundary.
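The per-cell MSE in (8), without the constant factor $1/K_\ell^2$, is the quantity that the remainder of this section optimizes. A minimal Python sketch of how it can be evaluated for given powers and a given denoising factor is shown below. The function name `mse_cell`, the data layout (lists indexed by device and by cell), and the two-cell toy numbers are assumptions made for illustration only.

```python
import math

def mse_cell(l, p, eta, h, g_hat, cells, sigma2):
    """Unnormalized MSE of cell l, i.e. K_l^2 times the MSE in (8).

    p[k]        : transmit power of device k
    h[k]        : data-link magnitude |h_k|
    g_hat[k][j] : real effective interference gain from device k to AP j
    cells[l]    : list of device indices associated with AP l
    """
    # Signal misalignment error of the devices in the own cell.
    misalign = sum((math.sqrt(p[k]) * h[k] / math.sqrt(eta) - 1.0) ** 2
                   for k in cells[l])
    # Total inter-cell interference power received at AP l.
    interference = sum(p[i] * g_hat[i][l] ** 2
                       for j, devs in enumerate(cells) if j != l
                       for i in devs)
    # Noise and interference are both scaled down by the denoising factor.
    return misalign + (sigma2 + interference) / eta

# Hypothetical two-cell example with one device per cell.
cells = [[0], [1]]
p = [1.0, 1.0]
h = [1.0, 1.0]
g_hat = [[0.0, 0.5], [0.5, 0.0]]
val = mse_cell(0, p, 4.0, h, g_hat, cells, sigma2=1.0)
```

In this toy setting the misalignment term is $(1/2 - 1)^2 = 0.25$ and the noise-plus-interference term is $(1 + 0.25)/4 = 0.3125$, which shows how a larger $\eta_\ell$ trades signal misalignment against noise and interference suppression.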

A. Characterization of Pareto Boundary of MSE Region via MSE Profiling

Inspired by the "rate profile" approach proposed in [27], which is a widely used approach to characterize the Pareto boundary of rate region in multiuser communication systems, we propose to characterize the Pareto boundary of the MSE region by using the MSE-profiling technique as presented in this subsection. Define a particular MSE-profiling vector as $\boldsymbol{\beta} = [\beta_1, \beta_2, \cdots, \beta_L]$. Then the MSE-tuple on the Pareto boundary of the MSE region can be obtained by solving the following optimization problem with a specified MSE-profiling vector $\boldsymbol{\beta}$:

$$ (\mathrm{P1}):\quad \min_{\{p_k\}_{k\in\mathcal{K}},\ \{\eta_\ell\}_{\ell\in\mathcal{L}},\ \varepsilon \ge 0}\ \varepsilon $$
$$ \text{s.t.}\quad \mathrm{MSE}_\ell(\{p_k\}_{k\in\mathcal{K}}, \eta_\ell) \le \beta_\ell \varepsilon, \quad \forall \ell \in \mathcal{L} \tag{10} $$
$$ 0 \le p_k \le \bar P_k, \quad \forall k \in \mathcal{K} \tag{11} $$
$$ \eta_\ell \ge 0, \quad \forall \ell \in \mathcal{L}, \tag{12} $$

where $\varepsilon$ denotes the achievable sum MSE of the $L$ APs, and $\beta_\ell$ represents the target ratio of the $\ell$-th AP's achievable MSE to the sum MSE $\varepsilon$ achieved by all the $L$ APs. In general, we assume that $\beta_\ell \ge 0$, $\forall \ell \in \mathcal{L}$, and that $\sum_{\ell\in\mathcal{L}} \beta_\ell = 1$, in which a smaller $\beta_\ell$ means that AP $\ell$ has higher priority in minimizing the MSE, $\mathrm{MSE}_\ell(\{p_k\}_{k\in\mathcal{K}}, \eta_\ell)$. With a given $\boldsymbol{\beta}$, we denote the optimal value of problem (P1) by $\varepsilon^{\mathrm{opt1}}$. Accordingly, the achieved MSE tuple $\boldsymbol{\beta}\varepsilon^{\mathrm{opt1}}$ corresponds to the Pareto-optimal point, which is exactly the intersection point between a ray in the direction of $\boldsymbol{\beta}$ and the Pareto boundary of the MSE region as geometrically illustrated in Fig. 2. Therefore, by varying the values of $\boldsymbol{\beta}$, solving problem (P1) can yield the complete Pareto boundary for the MSE region.

B. Algorithm for Centralized Power Control

In this subsection, we present the algorithm to optimally solve problem (P1) with a given $\boldsymbol{\beta}$. Note that problem (P1) is non-convex due to the coupling between power control $\{p_k\}_{k\in\mathcal{K}}$ and denoising factors $\{\eta_\ell\}_{\ell\in\mathcal{L}}$ in constraint (10), and thus is hard to solve directly. Thereby, we first consider the optimization of denoising factors $\{\eta_\ell\}_{\ell\in\mathcal{L}}$ with any given $\{p_k\}_{k\in\mathcal{K}}$, and then find the optimal $\{p_k\}_{k\in\mathcal{K}}$ to solve problem (P1) with the optimal $\{\eta_\ell\}_{\ell\in\mathcal{L}}$.

First, with any given $\{p_k\}_{k\in\mathcal{K}}$, problem (P1) can be decoupled into $L$ subproblems, each for optimizing $\eta_\ell$ to minimize the MSE at one AP $\ell$. The $\ell$-th subproblem is written as

$$ \min_{\eta_\ell \ge 0}\ \frac{1}{\beta_\ell}\Big[\sum_{k\in\mathcal{K}_\ell}\Big(\frac{\sqrt{p_k}\,|h_k|}{\sqrt{\eta_\ell}} - 1\Big)^2 + \frac{\sigma^2 + \sum_{j\in\mathcal{L}\setminus\{\ell\}}\sum_{i\in\mathcal{K}_j} p_i |\hat g_{i,\ell}|^2}{\eta_\ell}\Big]. \tag{13} $$

Let $\nu_\ell = 1/\sqrt{\eta_\ell}$; then problem (13) can be transformed into a convex quadratic problem as

$$ \min_{\nu_\ell \ge 0}\ \frac{1}{\beta_\ell}\Big[\sum_{k\in\mathcal{K}_\ell}\big(\sqrt{p_k}\,|h_k|\,\nu_\ell - 1\big)^2 + \Big(\sigma^2 + \sum_{j\in\mathcal{L}\setminus\{\ell\}}\sum_{i\in\mathcal{K}_j} p_i |\hat g_{i,\ell}|^2\Big)\nu_\ell^2\Big]. \tag{14} $$

By setting the first derivative of the objective function in problem (14) to zero, we can obtain the optimal solution $\nu_\ell^*$ to problem (14). As a result, the optimal solution to problem (13) is obtained as $\eta_\ell^* = (1/\nu_\ell^*)^2$, $\forall \ell \in \mathcal{L}$, given in the following proposition.

Proposition 1. With any given $\{p_k\}_{k\in\mathcal{K}}$, the optimal $\eta_\ell$ to problem (13) is given by

$$ \eta_\ell^* = \Bigg(\frac{\sum_{k\in\mathcal{K}_\ell} p_k |h_k|^2 + \sum_{j\in\mathcal{L}\setminus\{\ell\}}\sum_{i\in\mathcal{K}_j} p_i |\hat g_{i,\ell}|^2 + \sigma^2}{\sum_{k\in\mathcal{K}_\ell} \sqrt{p_k}\,|h_k|}\Bigg)^2, \quad \forall \ell \in \mathcal{L}. \tag{15} $$

Remark 1 (Interference-and-Noise-Induced Error Reduction). It is observed from (15) that the optimal $\eta_\ell^*$ is monotonically increasing w.r.t. the noise variance $\sigma^2$, the received power $\sum_{k\in\mathcal{K}_\ell} p_k |h_k|^2$, and the interference power from devices associated with other APs, $\sum_{j\in\mathcal{L}\setminus\{\ell\}}\sum_{i\in\mathcal{K}_j} p_i |\hat g_{i,\ell}|^2$. On the one hand, as $\sigma^2$ increases, a large denoising factor $\eta_\ell^*$ is needed for suppressing the dominant noise-induced error. On the other hand, as the interference power increases, a relatively larger $\eta_\ell^*$ is needed for inter-cell interference suppression to enable reliable multi-cell AirComp.

Next, we optimize $\{p_k\}_{k\in\mathcal{K}}$ and $\varepsilon$ by substituting $\eta_\ell^*$ in (15) into problem (P1). Thus, we have

$$ \min_{\{0 \le p_k \le \bar P_k\},\ \varepsilon \ge 0}\ \varepsilon \tag{16} $$
$$ \text{s.t.}\quad K_\ell - \frac{\big(\sum_{k\in\mathcal{K}_\ell} \sqrt{p_k}\,|h_k|\big)^2}{\sum_{k\in\mathcal{K}_\ell} p_k |h_k|^2 + \sum_{j\in\mathcal{L}\setminus\{\ell\}}\sum_{i\in\mathcal{K}_j} p_i |\hat g_{i,\ell}|^2 + \sigma^2} \le \beta_\ell \varepsilon, \quad \forall \ell \in \mathcal{L}. $$

In the following, we show that problem (16) can be optimally solved by equivalently solving a sequence of feasibility problems, each for a fixed $\varepsilon$. Denoting $\psi_\ell = K_\ell - \beta_\ell \varepsilon$, we define a series of feasibility problems with given $\varepsilon$ as

$$ \text{Find}\quad \{p_k\} \tag{17} $$
$$ \text{s.t.}\quad \psi_\ell \Big(\sum_{k\in\mathcal{K}_\ell} p_k |h_k|^2 + \sum_{j\in\mathcal{L}\setminus\{\ell\}}\sum_{i\in\mathcal{K}_j} p_i |\hat g_{i,\ell}|^2 + \sigma^2\Big) \le \Big(\sum_{k\in\mathcal{K}_\ell} \sqrt{p_k}\,|h_k|\Big)^2, \quad \forall \ell \in \mathcal{L} \tag{18} $$
$$ 0 \le p_k \le \bar P_k, \quad \forall k \in \mathcal{K}. \tag{19} $$

Recall that $\varepsilon^{\mathrm{opt1}}$ denotes the optimal value achieved by problem (P1). With any given sum-MSE target $\varepsilon$, if problem (17) is feasible, then we have $\varepsilon^{\mathrm{opt1}} \le \varepsilon$; otherwise, $\varepsilon^{\mathrm{opt1}} > \varepsilon$ holds. Therefore, we can solve problem (16) by equivalently solving the feasibility problems in (17) with different $\varepsilon$ together with a bisection search over $\varepsilon$.

It thus remains to solve problem (17) with given $\varepsilon$. Notice that $\psi_\ell$ must be nonnegative, as the MSE $\beta_\ell \varepsilon$ is upper bounded by $K_\ell$.
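The closed form in (15) and the substituted constraint in (16) can be checked numerically. The following Python sketch is illustrative only: the helper names, the data layout, and the two-cell toy numbers are assumptions, not part of the original algorithm description.

```python
import math

def terms(l, p, h, g_hat, cells, sigma2):
    """Return (a, d): a = sum sqrt(p_k)|h_k| and
    d = received power + inter-cell interference + noise for cell l."""
    a = sum(math.sqrt(p[k]) * h[k] for k in cells[l])
    b = sum(p[k] * h[k] ** 2 for k in cells[l])
    c = sum(p[i] * g_hat[i][l] ** 2
            for j, devs in enumerate(cells) if j != l for i in devs)
    return a, b + c + sigma2

def optimal_eta(l, p, h, g_hat, cells, sigma2):
    """Closed-form denoising factor of (15) for fixed powers."""
    a, d = terms(l, p, h, g_hat, cells, sigma2)
    return (d / a) ** 2

def min_mse(l, p, h, g_hat, cells, sigma2):
    """Left-hand side of the constraint in (16): K_l - a^2 / d."""
    a, d = terms(l, p, h, g_hat, cells, sigma2)
    return len(cells[l]) - a ** 2 / d

def mse(l, p, eta, h, g_hat, cells, sigma2):
    """Direct evaluation of the bracketed objective in (13)."""
    a, d = terms(l, p, h, g_hat, cells, sigma2)
    own = sum(p[k] * h[k] ** 2 for k in cells[l])
    mis = sum((math.sqrt(p[k]) * h[k] / math.sqrt(eta) - 1.0) ** 2
              for k in cells[l])
    return mis + (d - own) / eta

# Hypothetical two-cell example with one device per cell.
cells, p, h = [[0], [1]], [1.0, 1.0], [1.0, 1.0]
g_hat = [[0.0, 0.5], [0.5, 0.0]]
eta_star = optimal_eta(0, p, h, g_hat, cells, 1.0)
best = mse(0, p, eta_star, h, g_hat, cells, 1.0)
```

In this toy case `eta_star` evaluates to $2.25^2 = 5.0625$, the direct MSE at that point agrees with $K_\ell - a^2/d = 1 - 1/2.25$, and perturbing $\eta_\ell$ in either direction increases the MSE, consistent with Proposition 1.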
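Since $\psi_\ell \ge 0$, the constraints in (18) can be rewritten as

$$ \sqrt{\psi_\ell \Big(\sum_{k\in\mathcal{K}_\ell} p_k |h_k|^2 + \sum_{j\in\mathcal{L}\setminus\{\ell\}}\sum_{i\in\mathcal{K}_j} p_i |\hat g_{i,\ell}|^2 + \sigma^2\Big)} \le \sum_{k\in\mathcal{K}_\ell} \sqrt{p_k}\,|h_k|, \quad \forall \ell \in \mathcal{L}. \tag{20} $$

By introducing auxiliary variables $q_k = \sqrt{p_k}$, $\forall k \in \mathcal{K}_\ell$, $\ell \in \mathcal{L}$, and letting $\mathbf{q}_\ell \triangleq [q_{K_{\ell-1}+1}, \cdots, q_{K_\ell}]^T$, $\mathbf{h}_\ell = [|h_{K_{\ell-1}+1}|, \cdots, |h_{K_\ell}|]^T$, $\mathbf{g}_{j,\ell} = [|\hat g_{(K_{j-1}+1),\ell}|, \cdots, |\hat g_{K_j,\ell}|]$, $\mathbf{H}_\ell = \mathrm{diag}(\mathbf{h}_\ell)^T$, and $\mathbf{G}_{j,\ell} = \mathrm{diag}(\mathbf{g}_{j,\ell})$, $\forall \ell \in \mathcal{L}$, $j \in \mathcal{L}\setminus\{\ell\}$, we can transform the constraints in (20), or equivalently (18), into a set of second order cone (SOC) constraints:

$$ \sqrt{\psi_\ell}\, \|\boldsymbol{\Sigma}_\ell\| \le \mathbf{q}_\ell^T \mathbf{h}_\ell, \quad \forall \ell \in \mathcal{L}, \tag{21} $$

where $\boldsymbol{\Sigma}_\ell = [\mathbf{q}_1^T \mathbf{G}_{1,\ell}, \cdots, \mathbf{q}_\ell^T \mathbf{H}_\ell, \cdots, \mathbf{q}_L^T \mathbf{G}_{L,\ell}, \sigma]^T$. Then, problem (17) is reformulated as the following SOCP, which can be solved efficiently by convex optimization tools, e.g., CVX [28]:

$$ \text{Find}\quad \{q_k\} \tag{22} $$
$$ \text{s.t.}\quad 0 \le q_k \le \bar q_k, \quad \forall k \in \mathcal{K} \tag{23} $$
$$ (21), $$

where $\bar q_k \triangleq \sqrt{\bar P_k}$, $\forall k \in \mathcal{K}_\ell$, $\ell \in \mathcal{L}$. Denote $\{q_k^*\}_{k\in\mathcal{K}}$ as the optimal solution to problem (22) with any given sum-MSE target $\varepsilon$; then we have $p_k^* = (q_k^*)^2$, $\forall k \in \mathcal{K}$, as the optimal solution to problem (17). Based on the solution to problem (17) together with the bisection search over $\varepsilon$, the optimal $\varepsilon^{\mathrm{opt1}}$ to problem (16) is thus obtained. With the obtained $\varepsilon^{\mathrm{opt1}}$, we can accordingly attain the globally optimal power control $\{p_k^{\mathrm{opt1}}\}_{k\in\mathcal{K}}$ by solving problem (17), as well as the globally optimal denoising factors $\{\eta_\ell^{\mathrm{opt1}}\}_{\ell\in\mathcal{L}}$ for problem (P1) based on Proposition 1. In summary, the algorithm for optimally solving problem (P1) is presented in Algorithm 1.

Algorithm 1 for Optimally Solving Problem (P1)
a) Input: Maximum power budgets $\{\bar P_k\}_{k\in\mathcal{K}}$, MSE-profiling vector $\boldsymbol{\beta}$.
b) Initialization: Let $\varepsilon_{\mathrm{low}} = 0$, $\varepsilon_{\mathrm{high}} = \min_{\ell\in\mathcal{L}} K_\ell / \beta_\ell$.
c) Repeat
   1) Compute $\varepsilon = (\varepsilon_{\mathrm{low}} + \varepsilon_{\mathrm{high}})/2$, and then solve problem (17) with given $\varepsilon$, with the optimal solution of $\{p_k\}_{k\in\mathcal{K}}$ being $\{p_k^*\}_{k\in\mathcal{K}}$.
   2) If problem (17) is feasible, then set $\varepsilon_{\mathrm{high}} = \varepsilon$; otherwise, set $\varepsilon_{\mathrm{low}} = \varepsilon$.
d) Until $|\varepsilon_{\mathrm{high}} - \varepsilon_{\mathrm{low}}|$ converges within a prescribed accuracy.
e) Set $\varepsilon^{\mathrm{opt1}} = (\varepsilon_{\mathrm{low}} + \varepsilon_{\mathrm{high}})/2$ and $p_k^{\mathrm{opt1}} = p_k^*$, $\forall k \in \mathcal{K}_\ell$, $\ell \in \mathcal{L}$.
f) Compute $\{\eta_\ell^{\mathrm{opt1}}\}$ based on (15) in Proposition 1.
g) Output: Obtain the optimal solution $\{p_k^{\mathrm{opt1}}\}_{k\in\mathcal{K}}$, $\{\eta_\ell^{\mathrm{opt1}}\}_{\ell\in\mathcal{L}}$, and $\varepsilon^{\mathrm{opt1}}$ to problem (P1).

IV. DISTRIBUTED POWER CONTROL BASED ON INTERFERENCE TEMPERATURE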

The optimal centralized power control algorithm in the previous section requires the full cooperation among all APs coordinated by a centralized controller, to achieve the Pareto boundary of the MSE region. In this section, we consider a practical scenario where a centralized controller is unavailable and study the distributed power control by exploiting the IT technique. It will be proved that the IT-based distributed power control can actually provide an alternative method for achieving the same Pareto boundary of the MSE region as the centralized counterpart.
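As a point of reference for what the centralized controller has to compute, the bisection loop of Algorithm 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the feasibility test is passed in as a callable, and the toy oracle below covers only a hypothetical single-cell, single-device case ($L = 1$, $K = 1$, $\beta_1 = 1$), where constraint (18) can be checked in closed form. In the general multi-cell case the oracle would instead solve the SOCP feasibility problem with a convex solver, which is not shown here.

```python
def bisect_sum_mse(is_feasible, eps_high, tol=1e-9):
    """Bisection over the sum-MSE target eps, following Algorithm 1.

    is_feasible(eps) must return True when the feasibility problem
    for the given eps has a solution (a monotone oracle is assumed).
    """
    eps_low = 0.0
    while eps_high - eps_low > tol:
        eps = 0.5 * (eps_low + eps_high)
        if is_feasible(eps):
            eps_high = eps   # target achievable: tighten from above
        else:
            eps_low = eps    # target too ambitious: raise the lower end
    return 0.5 * (eps_low + eps_high)

def single_device_oracle(h, p_max, sigma2):
    """Toy oracle for L = 1, K = 1, beta = 1 (an illustrative assumption).

    Constraint (18) reads (1 - eps) * (p h^2 + sigma^2) <= p h^2, which is
    easiest to satisfy at full power, so only p = p_max is checked.
    """
    def is_feasible(eps):
        psi = 1.0 - eps
        return psi * (p_max * h ** 2 + sigma2) <= p_max * h ** 2
    return is_feasible

# eps_high = min_l K_l / beta_l = 1 in this toy case.
eps_opt = bisect_sum_mse(single_device_oracle(2.0, 1.0, 1.0), eps_high=1.0)
```

For this toy case the minimum MSE is $\sigma^2 / (\bar P h^2 + \sigma^2) = 0.2$, which the bisection recovers to within the prescribed tolerance.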

A. Alternative Characterization of Pareto Boundary of MSE Region via IT control

Different from the MSE-profiling-based design, in this subsection we provide an alternative problem formulation based on the IT technique to characterize the Pareto boundary of the MSE region, which features (distributed) single-cell power control under a set of IT constraints that limit each cell's interference to its neighboring cells. To this end, we first introduce a set of IT levels denoted by $\Gamma_{\ell,j}$, which is the maximum interference power from all devices associated with AP $\ell$ to AP $j$, $\forall \ell \in \mathcal{L}, j \in \mathcal{L}\setminus\{\ell\}$. For the purpose of illustration, we denote $\boldsymbol{\Gamma}$ as an $L(L-1)\times 1$ vector composed of the $\Gamma_{\ell,j}$'s, and $\boldsymbol{\Gamma}_\ell$ as a $2(L-1)\times 1$ vector consisting of the $\Gamma_{j,\ell}$'s and $\Gamma_{\ell,j}$'s, $\forall j \neq \ell, j \in \mathcal{L}$, for any given $\ell \in \mathcal{L}$. Accordingly, for each AP $\ell$, we replace the interference term $\sum_{i\in\mathcal{K}_j} p_i |\hat{g}_{i,\ell}|^2$ in the MSE formula $\mathrm{MSE}_\ell(\{p_k\}_{k\in\mathcal{K}}, \eta_\ell)$ by $\Gamma_{j,\ell}$, and impose a set of IT constraints, one for each neighboring AP $j$, $\forall j \neq \ell$. As a result, the MSE minimization is implemented at each AP $\ell \in \mathcal{L}$ individually, which is explicitly expressed as

$$(\text{P2.1.}\ell):\ \min_{\{p_k\}_{k\in\mathcal{K}_\ell},\,\eta_\ell \ge 0}\ \sum_{k\in\mathcal{K}_\ell}\left(\frac{\sqrt{p_k}\,|h_k|}{\sqrt{\eta_\ell}}-1\right)^2+\frac{\sigma^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}\Gamma_{j,\ell}}{\eta_\ell}$$

$$\text{s.t.}\quad \sum_{k\in\mathcal{K}_\ell} p_k |\hat{g}_{k,j}|^2 \le \Gamma_{\ell,j},\ \forall j\in\mathcal{L}\setminus\{\ell\}, \tag{24}$$

$$0 \le p_k \le \bar{P}_k,\ \forall k\in\mathcal{K}_\ell. \tag{25}$$

For notational convenience, we denote $\Phi_\ell(\boldsymbol{\Gamma}_\ell)$ as the optimal value of problem (P2.1.$\ell$) with any given $\boldsymbol{\Gamma}_\ell$, and denote $\{p_k^{\mathrm{opt2}}\}_{k\in\mathcal{K}_\ell}$ and $\eta_\ell^{\mathrm{opt2}}$ as the optimal solution to problem (P2.1.$\ell$).

Before solving problem (P2.1.$\ell$), we show in the following proposition that, via the IT control, the single-cell distributed power control in problem (P2.1.$\ell$) leads to a parametric characterization of the Pareto boundary of the MSE region w.r.t. $\boldsymbol{\Gamma}$.

Proposition 2 (Pareto Optimality Based on Interference Temperature Control). For any MSE tuple $(\Phi_1,\cdots,\Phi_L)$ on the Pareto boundary achieved by $\{\tilde{p}_k\}_{k\in\mathcal{K}}$ and $\{\tilde{\eta}_\ell\}_{\ell\in\mathcal{L}}$, there exists a set of corresponding IT levels $\boldsymbol{\Gamma}$, with $\Gamma_{\ell,j}=\sum_{k\in\mathcal{K}_\ell}\tilde{p}_k|\hat{g}_{k,j}|^2$, $\forall \ell\neq j$, $\ell,j\in\mathcal{L}$, such that $\Phi_\ell=\Phi_\ell(\boldsymbol{\Gamma}_\ell)$, $\forall \ell\in\mathcal{L}$, and $\{\tilde{p}_k\}_{k\in\mathcal{K}_\ell}$ and $\tilde{\eta}_\ell$ are the optimal solution to problem (P2.1.$\ell$) for each cell $\ell\in\mathcal{L}$ with given $\boldsymbol{\Gamma}$.

Proof:

See Appendix A.

Based on Proposition 2, it follows that by solving problem (P2.1.$\ell$) and exhausting $\boldsymbol{\Gamma}$, we can obtain the complete Pareto boundary of the MSE region, which is the same as that characterized from problem (P1) via MSE profiling. Notice that in the IT-based method, we need to determine $L(L-1)$ parameters in $\boldsymbol{\Gamma}$ in order to find each boundary point, while only $K$ parameters are needed for the MSE-profiling-based design in the previous section. Nevertheless, the IT-based design enables distributed power control that is efficient for practical implementation (as illustrated next), while the MSE-profiling-based design must be realized in a centralized manner. Furthermore, in terms of CSI requirement, the IT-based design requires each AP to have access to only the CSI of channels in its own cell and the related IT constraints, without requiring the global CSI of the whole network to be available at a centralized controller.

B. Distributed Power Control under IT Constraints

In this subsection, the single-cell power control problem (P2.1.$\ell$) is solved under the IT constraints. Note that problem (P2.1.$\ell$) is non-convex w.r.t. $\{p_k\}_{k\in\mathcal{K}_\ell}$ and $\eta_\ell$, due to the non-convex term $\sqrt{p_k}/\sqrt{\eta_\ell}$, and is thus hard to solve optimally in general. To overcome this difficulty, we define $\nu_\ell = 1/\eta_\ell$ as the inverse of the denoising factor, and introduce an auxiliary variable $Q_k=\sqrt{p_k\nu_\ell}$ for each device $k\in\mathcal{K}_\ell$, such that problem (P2.1.$\ell$) w.r.t. $\{p_k\}_{k\in\mathcal{K}_\ell}$ and $\eta_\ell$ can be transformed into the following equivalent problem w.r.t. $\{Q_k\}_{k\in\mathcal{K}_\ell}$ and $\nu_\ell$:

$$(\text{P2.2.}\ell):\ \min_{\{Q_k\ge 0\}_{k\in\mathcal{K}_\ell},\,\nu_\ell\ge 0}\ \sum_{k\in\mathcal{K}_\ell}\left(|h_k|Q_k-1\right)^2+\nu_\ell\Big(\sigma^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}\Gamma_{j,\ell}\Big)$$

$$\text{s.t.}\quad \sum_{k\in\mathcal{K}_\ell}Q_k^2|\hat{g}_{k,j}|^2\le\Gamma_{\ell,j}\nu_\ell,\ \forall j\in\mathcal{L}\setminus\{\ell\}, \tag{26}$$

$$Q_k^2\le\bar{P}_k\nu_\ell,\ \forall k\in\mathcal{K}_\ell. \tag{27}$$

It can be easily verified that problem (P2.2.$\ell$) is jointly convex w.r.t. $\{Q_k\}_{k\in\mathcal{K}_\ell}$ and $\nu_\ell$, and thus can be efficiently solved by standard convex optimization methods such as the interior point method [29]. Alternatively, to gain more design insights, we use the Lagrange duality method [29] to obtain a well-structured optimal power control solution. We denote $\{Q_k^{\mathrm{opt2}}\}_{k\in\mathcal{K}_\ell}$ and $\nu_\ell^{\mathrm{opt2}}$ as the optimal solution to problem (P2.2.$\ell$). Let $\lambda_{\ell,j}\ge 0$ denote the dual variable associated with the $j$-th IT constraint in (26) for problem (P2.2.$\ell$). Then the partial Lagrangian of problem (P2.2.$\ell$) is given by

$$\mathcal{L}_\ell\big(\{Q_k\}_{k\in\mathcal{K}_\ell},\nu_\ell,\{\lambda_{\ell,j}\}_{j\in\mathcal{L}\setminus\{\ell\}}\big)=\sum_{k\in\mathcal{K}_\ell}\Big(|h_k|^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}\lambda_{\ell,j}|\hat{g}_{k,j}|^2\Big)Q_k^2-2\sum_{k\in\mathcal{K}_\ell}|h_k|Q_k+\nu_\ell\Big(\sigma^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}(\Gamma_{j,\ell}-\lambda_{\ell,j}\Gamma_{\ell,j})\Big)+K_\ell.$$

Then the dual function of problem (P2.2.$\ell$) is given by

$$g_\ell\big(\{\lambda_{\ell,j}\}_{j\in\mathcal{L}\setminus\{\ell\}}\big)=\min_{\{Q_k\ge 0\},\,\nu_\ell\ge 0}\ \mathcal{L}_\ell\big(\{Q_k\}_{k\in\mathcal{K}_\ell},\nu_\ell,\{\lambda_{\ell,j}\}_{j\in\mathcal{L}\setminus\{\ell\}}\big) \tag{28}$$

$$\text{s.t.}\quad Q_k^2\le\bar{P}_k\nu_\ell,\ \forall k\in\mathcal{K}_\ell. \tag{29}$$

Proposition 3. In order for the dual function $g_\ell(\{\lambda_{\ell,j}\}_{j\in\mathcal{L}\setminus\{\ell\}})$ to be lower bounded, it must hold that $\sigma^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}(\Gamma_{j,\ell}-\lambda_{\ell,j}\Gamma_{\ell,j})\ge 0$.

Proof: See Appendix B.

The corresponding dual problem of problem (P2.2.$\ell$) is thus given by

$$\max_{\{\lambda_{\ell,j}\ge 0\}}\ g_\ell\big(\{\lambda_{\ell,j}\}_{j\in\mathcal{L}\setminus\{\ell\}}\big) \tag{30}$$

$$\text{s.t.}\quad \sigma^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}(\Gamma_{j,\ell}-\lambda_{\ell,j}\Gamma_{\ell,j})\ge 0. \tag{31}$$

Since problem (P2.2.$\ell$) is convex and satisfies Slater's condition [29], strong duality holds between problem (P2.2.$\ell$) and its dual problem (30). As a result, we can solve problem (P2.2.$\ell$) by equivalently solving problem (30). For notational convenience, we denote the optimal solution to the dual problem (30) as $\{\lambda_{\ell,j}^{\mathrm{opt2}}\}_{j\in\mathcal{L}\setminus\{\ell\}}$, and that to problem (28) as $\{Q_k^*\}_{k\in\mathcal{K}_\ell}$ and $\nu_\ell^*$ with any given $\{\lambda_{\ell,j}\}$. In the following, we first evaluate the dual function $g_\ell(\{\lambda_{\ell,j}\}_{j\in\mathcal{L}\setminus\{\ell\}})$ with any given $\{\lambda_{\ell,j}\}_{j\in\mathcal{L}\setminus\{\ell\}}$ by solving problem (28), and then obtain the optimal dual variables $\{\lambda_{\ell,j}^{\mathrm{opt2}}\}_{j\in\mathcal{L}\setminus\{\ell\}}$ that maximize $g_\ell(\{\lambda_{\ell,j}\}_{j\in\mathcal{L}\setminus\{\ell\}})$.
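The expanded partial Lagrangian given above can be sanity-checked numerically. The following is a minimal sketch rather than the authors' code: it assumes real-valued channel magnitudes (`h[k]` stands for the magnitude of the direct channel of device k and `g[k][j]` for the magnitude of its interfering channel to neighbor j), and every function name and number is hypothetical. It checks that the expanded form equals the objective of problem (P2.2.ℓ) plus the dual-weighted residuals of the IT constraints in (26).

```python
def objective_p22(Q, nu, h, sigma2, gamma_in):
    """Objective of (P2.2.l); gamma_in[j] plays the role of Gamma_{j,l}."""
    misalign = sum((hk * qk - 1.0) ** 2 for hk, qk in zip(h, Q))
    return misalign + nu * (sigma2 + sum(gamma_in))


def it_residuals(Q, nu, g, gamma_out):
    """Left side minus right side of each IT constraint (26).

    gamma_out[j] plays the role of Gamma_{l,j}.
    """
    return [
        sum(qk ** 2 * gk[j] ** 2 for qk, gk in zip(Q, g)) - gamma_out[j] * nu
        for j in range(len(gamma_out))
    ]


def lagrangian(Q, nu, lam, h, g, sigma2, gamma_in, gamma_out):
    """Expanded partial Lagrangian of (P2.2.l), term by term as in the text."""
    quad = sum(
        (hk ** 2 + sum(l * gkj ** 2 for l, gkj in zip(lam, gk))) * qk ** 2
        for hk, gk, qk in zip(h, g, Q)
    )
    lin = 2.0 * sum(hk * qk for hk, qk in zip(h, Q))
    slope = sigma2 + sum(gi - l * go for gi, l, go in zip(gamma_in, lam, gamma_out))
    return quad - lin + nu * slope + len(h)  # len(h) is K_l


# Hypothetical data: 3 devices in cell l, 2 neighboring APs.
h = [1.0, 0.6, 0.3]
g = [[0.2, 0.1], [0.1, 0.3], [0.3, 0.2]]
Q, nu, lam = [0.8, 0.9, 0.5], 1.5, [0.4, 0.7]
sigma2, gamma_in, gamma_out = 0.1, [0.05, 0.02], [0.04, 0.06]

direct = objective_p22(Q, nu, h, sigma2, gamma_in) + sum(
    l * r for l, r in zip(lam, it_residuals(Q, nu, g, gamma_out))
)
expanded = lagrangian(Q, nu, lam, h, g, sigma2, gamma_in, gamma_out)
```

The two quantities `direct` and `expanded` should agree up to floating-point rounding for any choice of inputs, since the expansion is purely algebraic.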

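1) Derivation of Dual Function:

To obtain the dual function, we need to solve problem (28) with any given $\{\lambda_{\ell,j}\}_{j\in\mathcal{L}\setminus\{\ell\}}$.

a) Optimizing $\{Q_k\}_{k\in\mathcal{K}_\ell}$ with Given $\nu_\ell$: First, with any given inverse denoising factor $\nu_\ell$, problem (28) can be decomposed into the following $K_\ell$ subproblems, each for optimizing one $Q_k$, $k\in\mathcal{K}_\ell$:

$$\min_{0\le Q_k\le\sqrt{\bar{P}_k\nu_\ell}}\ \left(|h_k|Q_k-1\right)^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}\lambda_{\ell,j}Q_k^2|\hat{g}_{k,j}|^2. \tag{32}$$

By setting the first derivative of the objective function in problem (32) to zero and accounting for the upper bound on $Q_k$, the optimal solution $Q_k^*$ is obtained as

$$Q_k^*=\min\left(\sqrt{\bar{P}_k\nu_\ell},\ \frac{|h_k|}{|h_k|^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}\lambda_{\ell,j}|\hat{g}_{k,j}|^2}\right). \tag{33}$$

b) Optimizing $\nu_\ell$ with Obtained Optimal $\{Q_k^*\}_{k\in\mathcal{K}_\ell}$: Next, we find the optimal inverse denoising factor $\nu_\ell$ for problem (28) by substituting back the optimized $\{Q_k^*\}_{k\in\mathcal{K}_\ell}$ in (33). To facilitate the description, we first define $B_k$ as the policy indicator at device $k\in\mathcal{K}_\ell$, given by

$$B_k\triangleq\bar{P}_k\left(\frac{|h_k|^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}\lambda_{\ell,j}|\hat{g}_{k,j}|^2}{|h_k|}\right)^2,\ \forall k\in\mathcal{K}_\ell, \tag{34}$$

and assume that $B_1\le\cdots\le B_k\le\cdots\le B_{K_\ell}$ without loss of generality. Notice that the value of $B_k$ determines the adopted power control policy (full power transmission or regularized channel inversion) at each device $k\in\mathcal{K}_\ell$, as will be discussed later. Substituting the $Q_k^*$'s in (33) into problem (28), we obtain the following optimization problem:

$$\min_{\nu_\ell\ge 0}\ F_\ell(\nu_\ell)\triangleq\sum_{k\in\mathcal{K}_\ell}\min\left[\Big(|h_k|^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}\lambda_{\ell,j}|\hat{g}_{k,j}|^2\Big)\bar{P}_k\nu_\ell,\ \frac{|h_k|^2}{|h_k|^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}\lambda_{\ell,j}|\hat{g}_{k,j}|^2}\right]+K_\ell$$

$$\qquad-2\sum_{k\in\mathcal{K}_\ell}\min\left[|h_k|\sqrt{\bar{P}_k\nu_\ell},\ \frac{|h_k|^2}{|h_k|^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}\lambda_{\ell,j}|\hat{g}_{k,j}|^2}\right]+\nu_\ell\Big(\sigma^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}(\Gamma_{j,\ell}-\lambda_{\ell,j}\Gamma_{\ell,j})\Big). \tag{35}$$

Let $\nu_\ell^*$ denote the globally optimal solution to problem (35). To solve problem (35), we first remove the "min" operations in the objective function by adopting a divide-and-conquer approach. In particular, we divide the feasible set of problem (35), namely $\{\nu_\ell\ge 0\}$, into $K_\ell+1$ intervals, each given by

$$\mathcal{F}_{\ell,k}=\left\{\nu_\ell\ \Big|\ \frac{1}{B_{k+1}}\le\nu_\ell\le\frac{1}{B_k}\right\},\ \forall k\in\{0\}\cup\mathcal{K}_\ell, \tag{36}$$

where $B_0\triangleq 0$ and $B_{K_\ell+1}\to\infty$ are defined for notational convenience (so that $1/B_0\to\infty$ and $1/B_{K_\ell+1}\to 0$). Within $\mathcal{F}_{\ell,k}$, devices $1,\cdots,k$ transmit with full power and the remaining devices use regularized channel inversion, in line with (33) and (34). Then, we have

$$\{\nu_\ell\ge 0\}=\bigcup_{k\in\{0\}\cup\mathcal{K}_\ell}\mathcal{F}_{\ell,k}. \tag{37}$$

Given (37), solving problem (35) is equivalent to first solving $K_\ell+1$ subproblems (one for each interval $\mathcal{F}_{\ell,k}$, $\forall k\in\{0\}\cup\mathcal{K}_\ell$, given as follows), and then comparing their optimal values to obtain the minimum one:

$$\min_{\nu_\ell\in\mathcal{F}_{\ell,k}}\ F_{\ell,k}(\nu_\ell), \tag{38}$$

where

$$F_{\ell,k}(\nu_\ell)=\left[\sum_{i=1}^{k}\Big(|h_i|^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}\lambda_{\ell,j}|\hat{g}_{i,j}|^2\Big)\bar{P}_i+\sigma^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}(\Gamma_{j,\ell}-\lambda_{\ell,j}\Gamma_{\ell,j})\right]\nu_\ell+k-2\Big(\sum_{i=1}^{k}|h_i|\sqrt{\bar{P}_i}\Big)\sqrt{\nu_\ell}+\sum_{n=k+1}^{K_\ell}\left(1-\frac{|h_n|^2}{|h_n|^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}\lambda_{\ell,j}|\hat{g}_{n,j}|^2}\right),\ \forall k\in\mathcal{K}_\ell\setminus\{K_\ell\}, \tag{39}$$

$$F_{\ell,0}(\nu_\ell)=\sum_{i=1}^{K_\ell}\left(1-\frac{|h_i|^2}{|h_i|^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}\lambda_{\ell,j}|\hat{g}_{i,j}|^2}\right)+\nu_\ell\Big(\sigma^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}(\Gamma_{j,\ell}-\lambda_{\ell,j}\Gamma_{\ell,j})\Big), \tag{40}$$

$$F_{\ell,K_\ell}(\nu_\ell)=\left[\sum_{i=1}^{K_\ell}\Big(|h_i|^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}\lambda_{\ell,j}|\hat{g}_{i,j}|^2\Big)\bar{P}_i+\sigma^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}(\Gamma_{j,\ell}-\lambda_{\ell,j}\Gamma_{\ell,j})\right]\nu_\ell-2\Big(\sum_{i=1}^{K_\ell}|h_i|\sqrt{\bar{P}_i}\Big)\sqrt{\nu_\ell}+K_\ell. \tag{41}$$

Let $\nu_{\ell,k}^*$ and $F_{\ell,k}(\nu_{\ell,k}^*)$ denote the optimal solution and the optimal value of the $k$-th subproblem in (38), respectively. By comparing the optimal values $\{F_{\ell,k}(\nu_{\ell,k}^*)\}$, we can obtain the optimal solution to problem (35), given in the following proposition.

Proposition 4. The optimal $\nu_\ell^*$ for problem (35) is obtained as

$$\nu_\ell^*=\nu_{\ell,k^*}^*=\left(\frac{\sum_{i=1}^{k^*}|h_i|\sqrt{\bar{P}_i}}{\sum_{i=1}^{k^*}\Big(|h_i|^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}\lambda_{\ell,j}|\hat{g}_{i,j}|^2\Big)\bar{P}_i+\sigma^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}(\Gamma_{j,\ell}-\lambda_{\ell,j}\Gamma_{\ell,j})}\right)^2, \tag{42}$$

where $k^*=\arg\min_{k\in\mathcal{K}_\ell}F_{\ell,k}(\nu_{\ell,k}^*)$.

Proof: Please refer to Appendix C.

With $\nu_\ell^*$ at hand, $\{Q_k^*\}_{k\in\mathcal{K}_\ell}$ is derived accordingly from (33). Therefore, the optimal solution to problem (28), and hence the dual function, are obtained.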
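The evaluation of the dual function described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' implementation: channel magnitudes are real and strictly positive, power budgets are strictly positive, all names (`inner_objective`, `minimize_inner`) and numbers are hypothetical, and each interval's stationary point is clamped to the interval before evaluation instead of relying on the closed form in (42) alone. The full-power set for interval k is taken to be the k devices with the smallest policy indicators.

```python
import math


def inner_objective(nu, h, c, p_max, slope0):
    """F_l(nu) in (35): the Lagrangian after plugging in Q_k^* from (33)."""
    total = nu * slope0
    for hk, ck, pk in zip(h, c, p_max):
        q = min(math.sqrt(pk * nu), hk / ck)  # Q_k^* from (33)
        total += ck * q * q - 2.0 * hk * q + 1.0
    return total


def minimize_inner(h, g, p_max, lam, sigma2, gamma_in, gamma_out):
    """Minimize (35) over nu >= 0, interval by interval; returns (value, nu, Q)."""
    K = len(h)
    # c_k = |h_k|^2 + sum_j lam_j |g_{k,j}|^2
    c = [hk ** 2 + sum(l * gkj ** 2 for l, gkj in zip(lam, gk)) for hk, gk in zip(h, g)]
    slope0 = sigma2 + sum(gi - l * go for gi, l, go in zip(gamma_in, lam, gamma_out))
    if slope0 < 0:
        raise ValueError("dual function is unbounded below (Proposition 3)")
    B = [pk * (ck / hk) ** 2 for pk, ck, hk in zip(p_max, c, h)]  # (34)
    order = sorted(range(K), key=lambda k: B[k])  # devices by ascending B_k
    best_val, best_nu = math.inf, 0.0
    for k in range(K + 1):  # the k devices with the smallest B_k use full power
        lo = 0.0 if k == K else 1.0 / B[order[k]]           # 1 / B_{k+1}
        hi = math.inf if k == 0 else 1.0 / B[order[k - 1]]  # 1 / B_k
        if k == 0:
            nu = lo  # F_{l,0} in (40) is linear in nu with slope >= 0
        else:
            num = sum(h[i] * math.sqrt(p_max[i]) for i in order[:k])
            den = sum(c[i] * p_max[i] for i in order[:k]) + slope0
            nu = min(max((num / den) ** 2, lo), hi)  # clamped stationary point
        val = inner_objective(nu, h, c, p_max, slope0)
        if val < best_val:
            best_val, best_nu = val, nu
    Q = [min(math.sqrt(pk * best_nu), hk / ck) for hk, ck, pk in zip(h, c, p_max)]
    return best_val, best_nu, Q


# Hypothetical data: 3 devices, 1 neighboring AP.
h = [1.0, 0.6, 0.3]
g = [[0.2], [0.1], [0.3]]
p_max = [1.0, 1.0, 1.0]
lam, sigma2, gamma_in, gamma_out = [0.5], 0.1, [0.05], [0.05]
value, nu_star, Q_star = minimize_inner(h, g, p_max, lam, sigma2, gamma_in, gamma_out)
```

Clamping is a defensive choice: it keeps every candidate inside its own interval, so the comparison across intervals remains valid even when a stationary point falls outside the interval it was computed for.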

2) Obtaining Optimal Dual Variables to Maximize Dual Function:

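Next, it remains to search for the $\{\lambda_{\ell,j}^{\mathrm{opt2}}\}_{j\in\mathcal{L}\setminus\{\ell\}}$ that maximize $g_\ell(\{\lambda_{\ell,j}\}_{j\in\mathcal{L}\setminus\{\ell\}})$ in order to solve the dual problem (30). Since the dual function $g_\ell(\{\lambda_{\ell,j}\}_{j\in\mathcal{L}\setminus\{\ell\}})$ is concave but non-differentiable in general, one can use subgradient-based methods, such as the ellipsoid method [30], to obtain the optimal $\{\lambda_{\ell,j}^{\mathrm{opt2}}\}$ for the dual problem (30). For the objective function in (28), a subgradient w.r.t. $\lambda_{\ell,j}$ is $\sum_{k\in\mathcal{K}_\ell}Q_k^2|\hat{g}_{k,j}|^2-\Gamma_{\ell,j}\nu_\ell$, $\forall j\in\mathcal{L}\setminus\{\ell\}$, evaluated at the minimizers $\{Q_k^*\}_{k\in\mathcal{K}_\ell}$ and $\nu_\ell^*$ of problem (28).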
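As an illustration of how the subgradient above could drive a dual update, the sketch below computes it and applies one projected ascent step. It is an assumption-laden sketch: a plain projected subgradient step is used as a simpler stand-in for the ellipsoid method cited in the text, only the non-negativity of the dual variables is enforced (the constraint (31) would have to be handled separately), and all names and numbers are hypothetical.

```python
def it_subgradient(Q, nu, g, gamma_out):
    """Subgradient of the dual function w.r.t. each lambda_{l,j}.

    Q and nu are assumed to be the minimizers of problem (28) for the
    current dual variables; gamma_out[j] plays the role of Gamma_{l,j}.
    """
    return [
        sum(qk ** 2 * gk[j] ** 2 for qk, gk in zip(Q, g)) - gamma_out[j] * nu
        for j in range(len(gamma_out))
    ]


def projected_ascent_step(lam, subgrad, step):
    """One ascent step on the dual function, projected onto lambda >= 0."""
    return [max(0.0, l + step * s) for l, s in zip(lam, subgrad)]


# Hypothetical inner minimizer for a cell with 2 devices and 2 neighbors.
Q, nu = [0.9, 0.7], 2.0
g = [[0.5, 0.1], [0.4, 0.2]]
gamma_out = [0.1, 0.2]
lam = [0.3, 0.05]

s = it_subgradient(Q, nu, g, gamma_out)
lam_next = projected_ascent_step(lam, s, step=0.5)
```

A positive subgradient entry means the corresponding IT constraint is violated at the current inner minimizer, so its dual variable is raised; a negative entry means the constraint is slack, so the dual variable is lowered and clipped at zero.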

3) Optimal Solution to Problems (P2.2.$\ell$) and (P2.1.$\ell$): With the obtained $\{\lambda_{\ell,j}^{\mathrm{opt2}}\}_{j\in\mathcal{L}\setminus\{\ell\}}$, we can obtain the optimal solution to problem (P2.2.$\ell$), i.e., $\{Q_k^{\mathrm{opt2}}\}_{k\in\mathcal{K}_\ell}$ and $\nu_\ell^{\mathrm{opt2}}$, according to (33) and (42), respectively. After obtaining the optimal solution to problem (P2.2.$\ell$), the optimal solution $\{p_k^{\mathrm{opt2}}\}_{k\in\mathcal{K}_\ell}$ and $\eta_\ell^{\mathrm{opt2}}$ to problem (P2.1.$\ell$) can be correspondingly found by calculating $p_k^{\mathrm{opt2}}=(Q_k^{\mathrm{opt2}})^2/\nu_\ell^{\mathrm{opt2}}$ and $\eta_\ell^{\mathrm{opt2}}=1/\nu_\ell^{\mathrm{opt2}}$, as summarized in Theorem 1, for which the proof is omitted for brevity.

Theorem 1. The optimal power control solution to problem (P2.1.$\ell$) is given by

$$p_k^{\mathrm{opt2}}=\begin{cases}\bar{P}_k, & k\in\{1,\cdots,k^{\mathrm{opt2}}\},\\[4pt] \dfrac{\eta_\ell^{\mathrm{opt2}}|h_k|^2}{\Big(|h_k|^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}\lambda_{\ell,j}^{\mathrm{opt2}}|\hat{g}_{k,j}|^2\Big)^2}, & k\in\{k^{\mathrm{opt2}}+1,\cdots,K_\ell\},\end{cases} \tag{43}$$

where the threshold is given as

$$\eta_\ell^{\mathrm{opt2}}=\left(\frac{\sum_{i=1}^{k^{\mathrm{opt2}}}\Big(|h_i|^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}\lambda_{\ell,j}^{\mathrm{opt2}}|\hat{g}_{i,j}|^2\Big)\bar{P}_i+\sigma^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}\big(\Gamma_{j,\ell}-\lambda_{\ell,j}^{\mathrm{opt2}}\Gamma_{\ell,j}\big)}{\sum_{i=1}^{k^{\mathrm{opt2}}}|h_i|\sqrt{\bar{P}_i}}\right)^2, \tag{44}$$

with $k^{\mathrm{opt2}}=\arg\min_{k\in\mathcal{K}_\ell}F_{\ell,k}(\nu_{\ell,k}^{\mathrm{opt2}})$.

Remark 2. From Theorem 1, it is observed that the optimal power control at each AP, derived from the MSE minimization problem under a set of IT constraints, exhibits a threshold-based structure. In particular, if the policy indicator of device $k$ (i.e., a function of the power budget and the channel quality of both the direct and interfering links, given as $B_k$ in (34)) exceeds an optimized threshold (i.e., $B_k\ge\eta_\ell^{\mathrm{opt2}}$), then device $k$ transmits with regularized channel inversion, where the regularization balances the tradeoff between signal-magnitude alignment and interference-induced error suppression; otherwise, device $k$ employs full power transmission. Similar observations were also made in [17], which studies single-cell AirComp. However, the policy indicator (also called quality indicator) defined in [17] is determined only by the power budget and the channel quality of the direct link, while that in the current work also accounts for the interference to other APs.

C. Efficient Algorithm for Optimizing IT Levels

The optimization of IT levels provides a mechanism for harnessing the cooperation gain on top of the distributed power control. In this subsection, we present an efficient algorithm for updating $\boldsymbol{\Gamma}$ in an iterative manner with only peer-to-peer signaling between APs. In each iteration, a particular pair of APs updates their mutual IT levels in a way that ensures that the achievable MSE values at both APs are reduced, or at least not increased, and that the MSE values at other APs are not affected. To implement such a design, we suppose that there is a reliable backhaul link between each pair of APs, such that every pair of APs can communicate to exchange their mutual IT levels.

Consider the update of the mutual IT levels for a particular pair of APs. Recall that Proposition 2 shows that for any MSE tuple on the Pareto boundary, there must exist a $\boldsymbol{\Gamma}$ such that the optimal solutions to the corresponding problems (P2.1.$\ell$), $\forall\ell\in\mathcal{L}$, lead to that MSE tuple. However, for any given $\boldsymbol{\Gamma}$, it is not guaranteed that the resulting MSE tuple of the AirComp network is Pareto optimal. Inspired by the simple rule for updating the IT levels in conventional multi-cell communication networks (instead of AirComp) [21], in the following proposition we present the necessary condition on the IT levels in order for the resultant MSE tuple to be Pareto optimal.

Proposition 5 (Necessary Condition for Pareto Optimality). With any given $\boldsymbol{\Gamma}$, if the optimal MSE values $\Phi_\ell(\boldsymbol{\Gamma}_\ell)$'s in problem (P2.1.$\ell$) (or equivalently (P2.2.$\ell$)) are Pareto optimal, then for any pair of APs $\ell$ and $j$, it must hold that $|\mathbf{D}_{\ell,j}|=0$, where $\mathbf{D}_{\ell,j}$ is a $2\times 2$ matrix defined as

$$\mathbf{D}_{\ell,j}=\begin{bmatrix}\dfrac{\partial\Phi_\ell(\boldsymbol{\Gamma}_\ell)}{\partial\Gamma_{\ell,j}} & \dfrac{\partial\Phi_\ell(\boldsymbol{\Gamma}_\ell)}{\partial\Gamma_{j,\ell}}\\[8pt] \dfrac{\partial\Phi_j(\boldsymbol{\Gamma}_j)}{\partial\Gamma_{\ell,j}} & \dfrac{\partial\Phi_j(\boldsymbol{\Gamma}_j)}{\partial\Gamma_{j,\ell}}\end{bmatrix}. \tag{45}$$

Proof: See Appendix D.

Notice that one can obtain each component of $\mathbf{D}_{\ell,j}$ based on the primal and dual optimal solutions to problem (P2.2.$\ell$) with any given $\boldsymbol{\Gamma}_\ell$ [21]. To be specific, we have

$$\frac{\partial\Phi_\ell(\boldsymbol{\Gamma}_\ell)}{\partial\Gamma_{\ell,j}}=-\lambda_{\ell,j}^{\mathrm{opt2}}\nu_\ell^{\mathrm{opt2}}, \tag{46}$$

where $\{\lambda_{\ell,j}^{\mathrm{opt2}}\}$ are the optimal dual variables associated with the constraints in (26) of problem (P2.2.$\ell$) and $\nu_\ell^{\mathrm{opt2}}$ is the optimal solution to problem (P2.2.$\ell$). Furthermore, we have

$$\frac{\partial\Phi_\ell(\boldsymbol{\Gamma}_\ell)}{\partial\Gamma_{j,\ell}}=\nu_\ell^{\mathrm{opt2}}, \tag{47}$$

and $\frac{\partial\Phi_j(\boldsymbol{\Gamma}_j)}{\partial\Gamma_{\ell,j}}$ and $\frac{\partial\Phi_j(\boldsymbol{\Gamma}_j)}{\partial\Gamma_{j,\ell}}$ can be derived similarly.

Remark 3 (Tightness of IT Constraints). Combining Propositions 2 and 5, it is observed that, for any particular $\boldsymbol{\Gamma}$ corresponding to a Pareto-optimal MSE tuple of the AirComp network, it must hold that $\sum_{k\in\mathcal{K}_\ell}p_k^{\mathrm{opt2}}|\hat{g}_{k,j}|^2=\Gamma_{\ell,j}$, $\forall\ell\in\mathcal{L}$, $j\in\mathcal{L}\setminus\{\ell\}$. That is, all the IT constraints at the APs must be tight in order to achieve Pareto optimality.

Next, we present an efficient rule to update the IT levels in a pairwise manner, inspired by Proposition 2 and [21]. Let $\boldsymbol{\Gamma}'$ denote the updated $\boldsymbol{\Gamma}$, where all the elements of $\boldsymbol{\Gamma}$ remain unchanged except $[\Gamma_{\ell,j},\Gamma_{j,\ell}]^T$, which is replaced by

$$[\Gamma'_{\ell,j},\Gamma'_{j,\ell}]^T=[\Gamma_{\ell,j},\Gamma_{j,\ell}]^T+\delta_{\ell,j}\cdot\mathbf{d}_{\ell,j}, \tag{48}$$

where $\delta_{\ell,j}$ is a sufficiently small step size, and $\mathbf{d}_{\ell,j}$ is a vector satisfying $\mathbf{D}_{\ell,j}\mathbf{d}_{\ell,j}<\mathbf{0}$ (component-wise). For notational conciseness, let $\mathbf{D}_{\ell,j}=\begin{bmatrix}a & b\\ c & d\end{bmatrix}$; then one feasible $\mathbf{d}_{\ell,j}$ is given as [21]

$$\mathbf{d}_{\ell,j}=\mathrm{sign}(bc-ad)\cdot[\alpha_{\ell,j}d-b,\ a-\alpha_{\ell,j}c]^T, \tag{49}$$

where $\mathrm{sign}(x)=1$ if $x\ge 0$ and $\mathrm{sign}(x)=-1$ otherwise, and $\alpha_{\ell,j}\ge 0$ is a control parameter determining the ratio between the MSE decrements for APs $\ell$ and $j$. Indeed, direct multiplication gives $\mathbf{D}_{\ell,j}\mathbf{d}_{\ell,j}=-|ad-bc|\cdot[\alpha_{\ell,j},\ 1]^T$.

Remark 4.

Provided a sufficiently small step size $\delta_{\ell,j}$, we can set the control variable $\alpha_{\ell,j}\ge 1$ (or $\alpha_{\ell,j}\le 1$) to ensure that a larger (smaller) MSE decrement is achieved by AP $\ell$ than by AP $j$. Therefore, by adjusting $\alpha_{\ell,j}$ from zero to infinity, we can achieve different points on the Pareto boundary with lower MSE at both APs $\ell$ and $j$ than at the starting point (e.g., under the design without cooperation).

In summary, the detailed procedure for the distributed update of the IT levels is described as follows.

Step 1): APs $\ell$ and $j$, $\ell\neq j$, exchange the current IT levels through the backhaul link;
Step 2): APs $\ell$ and $j$ solve problems (P2.2.$\ell$) and (P2.2.$j$) individually to obtain the optimal $\{Q_k\}_{k\in\mathcal{K}_\ell}$ and $\nu_\ell$ (and accordingly the optimal $\{p_k\}_{k\in\mathcal{K}_\ell}$ and $\eta_\ell$), as well as $\{Q_k\}_{k\in\mathcal{K}_j}$ and $\nu_j$ (and accordingly the optimal $\{p_k\}_{k\in\mathcal{K}_j}$ and $\eta_j$), respectively;
Step 3): According to (46) and (47), APs $\ell$ and $j$, $\ell\neq j$, individually compute the elements $a$ and $b$, as well as $c$ and $d$, in $\mathbf{D}_{\ell,j}$, respectively;
Step 4): The two APs share the computed results with each other to construct $\mathbf{D}_{\ell,j}$ according to (45) and $\mathbf{d}_{\ell,j}$ according to (49);
Step 5): Both APs $\ell$ and $j$, $\ell\neq j$, update their IT levels $\{\Gamma'_{\ell,j}\}$ according to (48).

The above procedure is repeated among different AP pairs until $|\mathbf{D}_{\ell,j}|$, $\forall\ell\neq j$, is less than a sufficiently small threshold $D$. In summary, the complete algorithm for updating the IT levels in a decentralized manner is presented in Algorithm 2.

Algorithm 2 for Updating the IT Levels in a Decentralized Way
a) Initialization: Let $\Gamma_{\ell,j}\ge 0$, $\forall\ell\neq j$, $\ell,j\in\mathcal{L}$.
b) Repeat
   For $\ell=1,2,\cdots,L$, and $j=1,2,\cdots,L$, $j\neq\ell$
      1) APs $\ell$ and $j$ exchange the current IT levels, i.e., $\Gamma_{\ell,j}$ and $\Gamma_{j,\ell}$;
      2) AP $\ell$ computes $a$ and $b$ in $\mathbf{D}_{\ell,j}$ with $\boldsymbol{\Gamma}_\ell$ according to (46) and (47), respectively;
      3) Similarly, AP $j$ computes $c$ and $d$ in $\mathbf{D}_{\ell,j}$ with $\boldsymbol{\Gamma}_j$;
      4) AP $\ell$ sends the results $a$ and $b$ to AP $j$ for constructing $\mathbf{D}_{\ell,j}$, and similarly AP $j$ sends its results $c$ and $d$ to AP $\ell$;
      5) APs $\ell$ and $j$ update $\boldsymbol{\Gamma}_\ell$ and $\boldsymbol{\Gamma}_j$ according to (48);
   End For
c) Until $|\mathbf{D}_{\ell,j}|$ is lower than a predetermined threshold $D$, i.e., $|\mathbf{D}_{\ell,j}|\le D$, $\forall\ell\neq j$.

Remark 5 (Incentivized Multi-Cell Cooperation). Essentially, the distributed algorithm reduces the overall MSE achieved by all APs in a pairwise manner. In other words, in each iteration, only a selected pair of APs updates their IT levels for their own MSE reduction, without affecting the MSE performance of other APs. In this way, the generated solution can approach the Pareto boundary of the MSE region with the MSE reduced at all APs as compared to the starting point (e.g., without cooperation). This provides incentives for APs (even self-interested ones) to participate in the cooperation for MSE reduction.

V. SIMULATION RESULTS

In this section, we provide simulation results to show the MSE performance of the multi-cell AirComp networks. In the simulation, we consider an AirComp network with two cells in Figs. 3, 4, 5(a), and 6(a), where the APs are located with a distance of meters (m) and the horizontal coordinates of them are fixed as (0 , and (0 , , respectively. We also study the three-cell case in Figs. 5(b) and 6(b), in which the third AP's horizontal coordinate is fixed as (20m , . All the devices are randomly located in a circle with its associated AP located at the center and a radius of m. The direct and interference links follow Rayleigh fading channel models, specified by $h_k=\theta_0(\Theta/\Theta_0)^{-\zeta}\bar{h}_k$ and $g_{k,j}=\theta_0(\Theta/\Theta_0)^{-\zeta}\bar{g}_{k,j}$, where the $\bar{h}_k$'s and $\bar{g}_{k,j}$'s are modeled as i.i.d. circularly symmetric complex Gaussian (CSCG) random variables with zero mean and unit variance, $\theta_0$ = − dB corresponds to the path loss at the reference distance of $\Theta_0=10$ m, $\Theta$ denotes the distance from the transmitter to the receiver, and $\zeta=3$ is the pathloss exponent. Furthermore, we denote the per-device power budget by $P$, and set $\bar{P}_k=P$ for each device $k\in\mathcal{K}_\ell$, $\ell\in\mathcal{L}$, and noise variance $\sigma^2$ = − dBm.

A. Benchmarking Schemes

For performance comparison, we consider the following benchmark schemes, none of which requires cooperation among APs.

• Full power transmission: Full power transmission is applied at all devices by setting $p_k=\bar{P}_k$, $\forall k\in\mathcal{K}$, in problem (P1). This is the most primitive control scheme, requiring neither CSI collection nor signaling overhead.

• Power control by ignoring interference: Each AP optimizes the power allocation and denoising factor based on its own local CSI without any cooperation with other APs. Thus, the MSE is minimized independently at each AP while ignoring the inter-cell interference. This leads to the single-cell AirComp power control problem for each cell $\ell\in\mathcal{L}$, as presented in [17]:

$$\min_{\{0\le p_k\le\bar{P}_k\}_{k\in\mathcal{K}_\ell},\,\eta_\ell\ge 0}\ \sum_{k\in\mathcal{K}_\ell}\left(\frac{\sqrt{p_k}\,|h_k|}{\sqrt{\eta_\ell}}-1\right)^2+\frac{\sigma^2}{\eta_\ell}. \tag{50}$$

• Power control with maximum interference: Each AP can access the local CSI and obtain the estimated interference based on an initial stage with full power transmission at all devices. However, the actual transmit power on the interference links is unknown. As a result, each AP minimizes the worst-case MSE by treating the maximum interference power $\sum_{j\in\mathcal{L}\setminus\{\ell\}}\sum_{k\in\mathcal{K}_j}\bar{P}_k|\hat{g}_{k,\ell}|^2$ as noise. This leads to the following problem, which has the same form as that in (50) and can thus be solved similarly:

$$\min_{\{0\le p_k\le\bar{P}_k\}_{k\in\mathcal{K}_\ell},\,\eta_\ell\ge 0}\ \sum_{k\in\mathcal{K}_\ell}\left(\frac{\sqrt{p_k}\,|h_k|}{\sqrt{\eta_\ell}}-1\right)^2+\frac{\sigma^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}\sum_{i\in\mathcal{K}_j}\bar{P}_i|\hat{g}_{i,\ell}|^2}{\eta_\ell}. \tag{51}$$

Figure 3. Effect of per-device power budget on the MSE performance of multi-cell AirComp networks. (Horizontal axis: per-device power budget, P (W); vertical axis: common MSE. Curves: optimal power control, full power transmission, power control by ignoring interference, power control with maximum interference.)

B. Performance Evaluation of the Proposed Cooperative Interference Management

We test the MSE performance by varying the per-device power budget in Fig. 3 with $L=2$ and $K_1=K_2=20$, where the power budgets at different devices are set to be uniform. By setting $\beta_1=\beta_2$ = 0 . , the MSE metric is the common MSE between the two cells. Firstly, the proposed optimal centralized power control is observed to considerably outperform the three benchmark schemes within the considered regimes. In the low power budget regime (e.g., less than . W), both the power control schemes by ignoring interference and with maximum interference achieve close-to-optimal MSE performance, and all three schemes with power control outperform the full-power-transmission scheme. This shows the effectiveness of power control optimization in suppressing the noise-induced error, which dominates the MSE in the low power budget regime. As the power budget increases, the performance gap between the optimal centralized power control and the power-control-by-ignoring-interference scheme becomes large, as does that between the power control schemes with maximum interference and by ignoring interference. This is because cooperative interference management is helpful in MSE reduction for AirComp networks, especially when the inter-cell interference becomes the main contributor to the MSE in the high power budget regime. Besides, it is observed that the curves for both the optimal centralized power control and the power-control-with-maximum-interference schemes saturate in the high power budget regime, meaning that the MSE performance is limited by the inter-cell interference and cannot be improved further by simply increasing the transmit power.

Figure 4. Effect of the number of devices in each cell on the MSE performance of multi-cell AirComp networks. (Horizontal axis: number of devices associated with each AP, K, from 10 to 60; vertical axis: common MSE. Curves: optimal power control, full power transmission, power control by ignoring interference, power control with maximum interference.)

The effect of device population on the MSE performance is illustrated in Fig. 4 with $\beta_1=\beta_2$ = 0 . and $K_1=K_2=K$, where the power budgets at all devices are identically set to be W. Firstly, it is observed that the MSE achieved by all the schemes decreases as $K$ increases, because the AP receivers can aggregate more data for averaging. Secondly, the performance gain achieved by the optimal power control over the benchmark schemes is significant throughout the whole regime of $K$, and is especially prominent in the small $K$ regime. The full-power-transmission scheme is observed to perform the worst among all schemes, showing the necessity of power control. Furthermore, it is worth noting that, for all schemes with power control, the MSE performance saturates in the large $K$ regime. This is because, as $K$ increases, the inter-cell interference becomes the bottleneck of MSE reduction.

Figs. 5(a) and 5(b) show the MSE region of AirComp networks with $L=2$ and $L=3$, respectively, where we set $P=1$ W, $K_\ell=20$, $\ell\in\mathcal{L}$, and $\alpha_{1,2}=\alpha_{1,3}=\alpha_{2,3}=\alpha$. It is observed that all the benchmark schemes lie within the Pareto boundary that is obtained by the centralized power control by varying the MSE-profiling vector $\boldsymbol{\beta}$. The distributed power control based on IT is observed to achieve the Pareto boundary. Furthermore, through a comparison between the two cases of distributed power control with $\alpha=1$ and $\alpha=10$ under the two-cell AirComp network shown in Fig. 5(a), it is observed that a larger value of $\alpha$ biases the MSE minimization towards cell 1, which is consistent with Remark 4. Besides, the power-control-by-ignoring-interference scheme is observed to outperform the full-power-transmission scheme (i.e., it is closer to the Pareto boundary), while the power-control-with-maximum-interference scheme outperforms the power-control-by-ignoring-interference scheme. The former observation reveals the benefit of power control in improving the MSE performance of AirComp, while the latter shows the effectiveness of interference management. Under the decentralized design, the MSEs of all cells are reduced as compared to the case without cooperation, which is consistent with Remark 5.

Figure 5. MSE region of multi-cell AirComp networks. (a) Two-cell case (axes: MSE of cell 1, MSE of cell 2). (b) Three-cell case (axes: MSE of cell 1, MSE of cell 2, MSE of cell 3; labeled points: (0.0179, 0.0135, 0.0117), (0.008, 0.00379, 0.00968), (0.019, 0.0177, 0.0169), (0.0085, 0.00373, 0.00906), (0.0091, 0.00723, 0.00928)). Legend in both panels: Pareto boundary, full power transmission, power control by ignoring interference, power control with maximum interference, distributed power control with α = 1, distributed power control with α = 10.

Figure 6. Convergence analysis of the proposed distributed IT-based power control. (a) Two-cell case. (b) Three-cell case. (Horizontal axis: number of iterations; vertical axis: MSE of each cell.)

Furthermore, the convergence performance of the pairwise decentralized algorithm in one channel realization is depicted for the setting of $\alpha=1$ and $P=1$ W in Figs. 6(a) and 6(b), corresponding to the two-cell and three-cell cases, respectively, with each cell including $K_\ell=20$ devices. The MSE values at all APs are monotonically non-increasing over the iterations, which again validates Remark 5 in Section IV-C and shows the effectiveness of our design in practical decentralized implementation.

VI. CONCLUSION

In this work, we considered multi-cell AirComp, for which the power control was optimized over multiple cells to regulate the effect of inter-cell interference on AirComp performance. Firstly, we considered the scenario of centralized multi-cell power control, based on which we characterized the Pareto boundary of the achievable multi-cell MSE region for AirComp networks. This was done by minimizing the overall MSE of all cells subject to a set of MSE-profiling constraints, a problem that is solved via a sequence of convex SOCP feasibility problems together with a bisection search. Next, we considered the scenario of distributed power control without a centralized controller, for which an alternative IT-based method was proposed to characterize the same MSE Pareto boundary and to enable a decentralized power control algorithm. In the decentralized design, each AP only needs to minimize its own MSE under a set of IT constraints, while different cells iteratively update the IT levels based on pairwise information exchange. A remarkable performance gain in terms of AirComp accuracy was observed in comparison with other designs without cooperation.

This work opens up several directions for further investigation on multi-cell AirComp. One direction is to develop multi-cell MIMO AirComp techniques for enabling coexisting vector-valued function computation over multi-dimensional data, where the key challenge lies in the joint design of multi-cell cooperative beamforming and power control. Another interesting direction is to explore the cluster-based hierarchical design for large-scale AirComp, which needs to determine the optimal clustering policy for interference suppression and computation distortion reduction.

APPENDIX

A. Proof of Proposition 2

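Suppose that the given $\{p_k\}_{k\in\mathcal{K}_\ell}$ and $\eta_\ell$ achieve the Pareto-optimal MSE tuple. Then we have the following MSE for each AP $\ell\in\mathcal{L}$:

$$\Phi_\ell=\sum_{k\in\mathcal{K}_\ell}\left(\frac{\sqrt{p_k}\,|h_k|}{\sqrt{\eta_\ell}}-1\right)^2+\frac{\sigma^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}\sum_{i\in\mathcal{K}_j}p_i|\hat{g}_{i,\ell}|^2}{\eta_\ell},\ \forall\ell\in\mathcal{L}. \tag{52}$$

With $\Gamma_{j,\ell}=\sum_{i\in\mathcal{K}_j}p_i|\hat{g}_{i,\ell}|^2$, $\forall\ell\in\mathcal{L}$, $j\in\mathcal{L}\setminus\{\ell\}$, $\Phi_\ell$ in (52) can be rewritten as

$$\Phi_\ell=\sum_{k\in\mathcal{K}_\ell}\left(\frac{\sqrt{p_k}\,|h_k|}{\sqrt{\eta_\ell}}-1\right)^2+\frac{\sigma^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}\Gamma_{j,\ell}}{\eta_\ell},\ \forall\ell\in\mathcal{L}. \tag{53}$$

Note that (53) has the same form as the objective function of problem (P2.1.$\ell$) for each AP $\ell\in\mathcal{L}$. With $\Gamma_{\ell,j}=\sum_{k\in\mathcal{K}_\ell}p_k|\hat{g}_{k,j}|^2$, $\forall j\in\mathcal{L}\setminus\{\ell\}$, it is observed that $\{p_k\}_{k\in\mathcal{K}_\ell}$ and $\eta_\ell$ in (9) satisfy all the constraints in problem (P2.1.$\ell$). Therefore, $\{p_k\}_{k\in\mathcal{K}_\ell}$ and $\eta_\ell$ must be a feasible solution to problem (P2.1.$\ell$).

Hence, it remains to prove that $\{p_k\}_{k\in\mathcal{K}_\ell}$ and $\eta_\ell$ are exactly the optimal solution to problem (P2.1.$\ell$) for each AP $\ell$, so that the achievable MSE is equal to the optimal value of problem (P2.1.$\ell$), i.e., $\Phi_\ell=\Phi_\ell(\boldsymbol{\Gamma}_\ell)$. We prove this result by contradiction. Suppose that the optimal solution to problem (P2.1.$\ell$) at AP $\ell$ is denoted by $\{p_k^\star\}_{k\in\mathcal{K}_\ell}$ and $\eta_\ell^\star$, which are not equal to $\{p_k\}_{k\in\mathcal{K}_\ell}$ and $\eta_\ell$. Then, the new MSE achieved at AP $\ell$, denoted by $\phi_\ell$, is

$$\phi_\ell=\sum_{k\in\mathcal{K}_\ell}\left(\frac{\sqrt{p_k^\star}\,|h_k|}{\sqrt{\eta_\ell^\star}}-1\right)^2+\frac{\sigma^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}\Gamma_{j,\ell}}{\eta_\ell^\star}=\sum_{k\in\mathcal{K}_\ell}\left(\frac{\sqrt{p_k^\star}\,|h_k|}{\sqrt{\eta_\ell^\star}}-1\right)^2+\frac{\sigma^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}\sum_{i\in\mathcal{K}_j}p_i|\hat{g}_{i,\ell}|^2}{\eta_\ell^\star}<\Phi_\ell. \tag{54}$$

As $\sum_{k\in\mathcal{K}_\ell}p_k^\star|\hat{g}_{k,j}|^2\le\Gamma_{\ell,j}$, we further have the achievable MSE at any AP $j\neq\ell$ as

$$\begin{aligned}\phi_j&=\sum_{k\in\mathcal{K}_j}\left(\frac{\sqrt{p_k}\,|h_k|}{\sqrt{\eta_j}}-1\right)^2+\frac{\sigma^2+\sum_{i\in\mathcal{L}\setminus\{j,\ell\}}\sum_{k\in\mathcal{K}_i}p_k|\hat{g}_{k,j}|^2+\sum_{k\in\mathcal{K}_\ell}p_k^\star|\hat{g}_{k,j}|^2}{\eta_j}\\&\le\sum_{k\in\mathcal{K}_j}\left(\frac{\sqrt{p_k}\,|h_k|}{\sqrt{\eta_j}}-1\right)^2+\frac{\sigma^2+\sum_{i\in\mathcal{L}\setminus\{j\}}\Gamma_{i,j}}{\eta_j}\\&=\sum_{k\in\mathcal{K}_j}\left(\frac{\sqrt{p_k}\,|h_k|}{\sqrt{\eta_j}}-1\right)^2+\frac{\sigma^2+\sum_{i\in\mathcal{L}\setminus\{j\}}\sum_{k\in\mathcal{K}_i}p_k|\hat{g}_{k,j}|^2}{\eta_j}=\Phi_j.\end{aligned} \tag{55}$$

Note that the MSE tuple $(\phi_1,\cdots,\phi_L)$ achieved by $\{\{p_k\}_{k\in\mathcal{K}_1},\eta_1\},\cdots,\{\{p_k^\star\}_{k\in\mathcal{K}_\ell},\eta_\ell^\star\},\cdots,\{\{p_k\}_{k\in\mathcal{K}_L},\eta_L\}$ satisfies $\phi_\ell<\Phi_\ell$ and $\phi_j\le\Phi_j$, $\forall j\neq\ell$, which contradicts the fact that $(\Phi_1,\cdots,\Phi_L)$ is Pareto optimal. Thus the presumption cannot be true, and it holds that $\{p_k\}_{k\in\mathcal{K}_\ell}$ and $\eta_\ell$ are the optimal solution to problem (P2.1.$\ell$) for each AP $\ell$, i.e., $p_k=p_k^\star$, $\forall k\in\mathcal{K}_\ell$, and $\eta_\ell=\eta_\ell^\star$, and the achievable MSE is equal to the optimal value of problem (P2.1.$\ell$), i.e., $\Phi_\ell=\Phi_\ell(\boldsymbol{\Gamma}_\ell)$, $\forall\ell\in\mathcal{L}$.

B. Proof of Proposition 3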

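Suppose $\sigma^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}(\Gamma_{j,\ell}-\lambda_{\ell,j}\Gamma_{\ell,j})<0$. It then follows that $\mathcal{L}_\ell\big(\{Q_k\}_{k\in\mathcal{K}_\ell},\nu_\ell,\{\lambda_{\ell,j}\}_{j\in\mathcal{L}\setminus\{\ell\}}\big)$ goes to negative infinity when $\nu_\ell\to+\infty$. This implies that the dual function $g_\ell(\{\lambda_{\ell,j}\}_{j\in\mathcal{L}\setminus\{\ell\}})$ is unbounded from below in this case. Hence, $\sigma^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}(\Gamma_{j,\ell}-\lambda_{\ell,j}\Gamma_{\ell,j})\ge 0$ is required to guarantee that $g_\ell(\{\lambda_{\ell,j}\}_{j\in\mathcal{L}\setminus\{\ell\}})$ is bounded from below.

C. Proof of Proposition 4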

To solve problem (35), we solve each subproblem in (38) and then compare their optimal values $\{F_{\ell,k}(\nu_{\ell,k}^*)\}$. First, we have the following lemma.

Lemma 1. The optimal solution $\nu_{\ell,0}^*$ to problem (38) when $\nu_\ell\in\mathcal{F}_{\ell,0}$ is given by

$$\nu_{\ell,0}^*=\max\left[\gamma,\ \frac{1}{B_1}\right], \tag{56}$$

where $\gamma=\max_{j\in\mathcal{L}\setminus\{\ell\}}\frac{1}{\Gamma_{\ell,j}}\sum_{k\in\mathcal{K}_\ell}\frac{|h_k|^2|\hat{g}_{k,j}|^2}{\big(|h_k|^2+\sum_{j'\in\mathcal{L}\setminus\{\ell\}}\lambda_{\ell,j'}|\hat{g}_{k,j'}|^2\big)^2}$.

Proof: See Appendix E.

Combining with the IT constraints in (26), it is worth pointing out that if $\gamma\ge 1/B_1$, then the IT levels are unreasonably low, or the power budgets are unreasonably high, in which case all the power constraints become inactive. This case does not make sense in practice and is thus excluded from the subsequent discussion; we hence assume practical power budgets and IT levels satisfying $\gamma\le 1/B_1$.

Furthermore, for any $k\in\mathcal{K}_\ell$, the function $F_{\ell,k}(\nu_\ell)$ is shown to be a unimodal function that first decreases in $[0,\tilde{\nu}_k]$ and then increases in $[\tilde{\nu}_k,\infty)$, where $\tilde{\nu}_k$ is the stationary point given by

$$\tilde{\nu}_k=\left(\frac{\sum_{i=1}^{k}|h_i|\sqrt{\bar{P}_i}}{\sum_{i=1}^{k}\Big(|h_i|^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}\lambda_{\ell,j}|\hat{g}_{i,j}|^2\Big)\bar{P}_i+\sigma^2+\sum_{j\in\mathcal{L}\setminus\{\ell\}}(\Gamma_{j,\ell}-\lambda_{\ell,j}\Gamma_{\ell,j})}\right)^2.$$

Thus, the optimal solution $\nu_{\ell,k}^*$ to problem (38) when $\nu_\ell\in\mathcal{F}_{\ell,k}$, $\forall k\in\mathcal{K}_\ell$, is given in the following lemma.

Lemma 2. The optimal solution $\nu_{\ell,k}^*$ to the $k$-th subproblem in (38) is given by

$$\nu_{\ell,k}^*=\max\left(\frac{1}{B_{k+1}},\ \min\left(\tilde{\nu}_k,\ \frac{1}{B_k}\right)\right)=\tilde{\nu}_k. \tag{57}$$

Proof: Similarly as in [17], it can be shown that $1/B_{k+1}\le\tilde{\nu}_k\le 1/B_k$. This lemma thus follows directly.

Therefore, with Lemma 2 and by comparing the optimal values $\{F_{\ell,k}(\nu_{\ell,k}^*)\}$ among all subproblems, we can obtain the optimal solution to problem (35). This completes the proof.

D. Proof of Proposition 5

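Based on Proposition 2, the optimal values of the problems in (P2.1.$\ell$) for all APs, $\{\Phi_\ell(\boldsymbol{\Gamma}_\ell)\}$, achieved by the optimal solution denoted by $\{p_k\}_{k \in \mathcal{K}_\ell}$ and $\{\eta_\ell\}_{\ell \in \mathcal{L}}$, correspond to a Pareto-optimal MSE tuple, denoted by $(\Phi_1, \cdots, \Phi_L)$, given as

$$\Phi_\ell(\boldsymbol{\Gamma}_\ell) = \Phi_\ell = \sum_{k \in \mathcal{K}_\ell} \left(\frac{\sqrt{p_k}\,|h_k|}{\sqrt{\eta_\ell}} - 1\right)^2 + \frac{\sigma^2 + \sum_{j \in \mathcal{L} \setminus \{\ell\}} \Gamma_{j,\ell}}{\eta_\ell}, \quad \forall \ell \in \mathcal{L}. \tag{58}$$

We then prove Proposition 5 by contradiction. Suppose that $|\boldsymbol{D}_{\ell,j}| \neq 0$. With $\Gamma'_{\ell,j}$ and $\Gamma'_{j,\ell}$ updated according to the updating rule of $\boldsymbol{\Gamma}'$ in (48), the optimal solutions to problems (P2.2.$\ell$) and (P2.2.$j$) for cells $\ell$ and $j$ change to $\{\{p^\star_k\}_{k \in \mathcal{K}_\ell}, \eta^\star_\ell\}$ and $\{\{p^\star_k\}_{k \in \mathcal{K}_j}, \eta^\star_j\}$, respectively, while the optimal solutions to problems (P2.2.$i$) for cells $i \neq \ell, j$ remain unchanged. Accordingly, the new achievable MSE for any AP $i \neq \ell, j$ is given by

$$\begin{aligned}
\phi_i &= \sum_{k \in \mathcal{K}_i} \left(\frac{\sqrt{p_k}\,|h_k|}{\sqrt{\eta_i}} - 1\right)^2 + \frac{\sigma^2 + \sum_{\bar{\imath} \in \mathcal{L} \setminus \{\ell,j,i\}} \sum_{k \in \mathcal{K}_{\bar{\imath}}} p_k |\hat{g}_{k,i}|^2 + \sum_{k \in \mathcal{K}_\ell} p^\star_k |\hat{g}_{k,i}|^2 + \sum_{k \in \mathcal{K}_j} p^\star_k |\hat{g}_{k,i}|^2}{\eta_i} \\
&= \sum_{k \in \mathcal{K}_i} \left(\frac{\sqrt{p_k}\,|h_k|}{\sqrt{\eta_i}} - 1\right)^2 + \frac{\sigma^2 + \sum_{\bar{\imath} \in \mathcal{L} \setminus \{\ell,j,i\}} \Gamma_{\bar{\imath},i} + \sum_{k \in \mathcal{K}_\ell} p^\star_k |\hat{g}_{k,i}|^2 + \sum_{k \in \mathcal{K}_j} p^\star_k |\hat{g}_{k,i}|^2}{\eta_i} \le \Phi_i,
\end{aligned} \tag{59}$$

in which the inequality in (59) holds because $\sum_{k \in \mathcal{K}_\ell} p^\star_k |\hat{g}_{k,i}|^2 \le \Gamma_{\ell,i}$ and $\sum_{k \in \mathcal{K}_j} p^\star_k |\hat{g}_{k,i}|^2 \le \Gamma_{j,i}$. Then, based on $\sum_{k \in \mathcal{K}_j} p^\star_k |\hat{g}_{k,\ell}|^2 \le \Gamma'_{j,\ell}$, the updated achievable MSE for AP $\ell$ is

$$\begin{aligned}
\phi_\ell &= \sum_{k \in \mathcal{K}_\ell} \left(\frac{\sqrt{p^\star_k}\,|h_k|}{\sqrt{\eta^\star_\ell}} - 1\right)^2 + \frac{\sigma^2 + \sum_{n \in \mathcal{L} \setminus \{\ell,j\}} \sum_{k \in \mathcal{K}_n} p_k |\hat{g}_{k,\ell}|^2 + \sum_{k \in \mathcal{K}_j} p^\star_k |\hat{g}_{k,\ell}|^2}{\eta^\star_\ell} \\
&= \sum_{k \in \mathcal{K}_\ell} \left(\frac{\sqrt{p^\star_k}\,|h_k|}{\sqrt{\eta^\star_\ell}} - 1\right)^2 + \frac{\sigma^2 + \sum_{n \in \mathcal{L} \setminus \{\ell,j\}} \Gamma_{n,\ell} + \sum_{k \in \mathcal{K}_j} p^\star_k |\hat{g}_{k,\ell}|^2}{\eta^\star_\ell} \le \Phi_\ell(\boldsymbol{\Gamma}'_\ell).
\end{aligned} \tag{60}$$

Similarly, it also holds that $\phi_j \le \Phi_j(\boldsymbol{\Gamma}'_j)$ due to $\sum_{k \in \mathcal{K}_\ell} p^\star_k |\hat{g}_{k,j}|^2 \le \Gamma'_{\ell,j}$. Furthermore, based on (48) and $\boldsymbol{D}_{\ell,j} \boldsymbol{d}_{\ell,j} < 0$, it follows that

$$\begin{bmatrix} \phi_\ell \\ \phi_j \end{bmatrix} \le \begin{bmatrix} \Phi_\ell(\boldsymbol{\Gamma}'_\ell) \\ \Phi_j(\boldsymbol{\Gamma}'_j) \end{bmatrix} \cong \begin{bmatrix} \Phi_\ell(\boldsymbol{\Gamma}_\ell) \\ \Phi_j(\boldsymbol{\Gamma}_j) \end{bmatrix} + \delta_{\ell,j} \cdot \boldsymbol{D}_{\ell,j} \boldsymbol{d}_{\ell,j} < \begin{bmatrix} \Phi_\ell \\ \Phi_j \end{bmatrix}.$$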
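The bounding step used in (59) and (60) can be checked numerically. The following Python sketch is an illustration only and is not taken from the paper: it evaluates the per-cell MSE expression of (58) and confirms that replacing the actual inter-cell interference by an IT level that upper-bounds it can only increase the value. The function name `cell_mse`, the random channel and power draws, the single neighboring cell, and all numeric values are assumptions made for this example.

```python
import math
import random


def cell_mse(p, h_abs, eta, sigma2, interference):
    """Per-cell MSE in the form of (58): signal misalignment error plus
    (noise power + inter-cell interference power) divided by eta."""
    misalignment = sum(
        (math.sqrt(pk) * hk / math.sqrt(eta) - 1.0) ** 2
        for pk, hk in zip(p, h_abs)
    )
    return misalignment + (sigma2 + interference) / eta


random.seed(0)
K = 4                                                  # devices per cell (assumed)
h_abs = [random.uniform(0.5, 1.5) for _ in range(K)]   # |h_k| in the cell of interest
p = [random.uniform(0.1, 1.0) for _ in range(K)]       # p_k in the cell of interest
eta, sigma2 = 0.8, 0.1                                 # denoising factor, noise power

# One neighboring cell: transmit powers and squared cross-channel gains.
p_nb = [random.uniform(0.1, 1.0) for _ in range(K)]
g_abs2 = [random.uniform(0.01, 0.1) for _ in range(K)]
actual = sum(pk * g for pk, g in zip(p_nb, g_abs2))    # sum_k p_k |g_hat_k|^2
it_level = 1.2 * actual                                # IT level chosen so that actual <= it_level

phi = cell_mse(p, h_abs, eta, sigma2, actual)          # MSE under the actual interference
Phi_bound = cell_mse(p, h_abs, eta, sigma2, it_level)  # MSE with interference at the IT level
print(phi <= Phi_bound, Phi_bound - phi, (it_level - actual) / eta)
```

Because the two evaluations differ only in the interference term, their gap equals `(it_level - actual) / eta` up to floating-point rounding, which is nonnegative whenever the IT constraint holds.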
Note that the MSE tuple $(\phi_1, \cdots, \phi_L)$ achieved with the updated $\boldsymbol{\Gamma}'$ satisfies $\phi_\ell < \Phi_\ell$, $\phi_j \le \Phi_j$, and $\phi_i \le \Phi_i$, $\forall i \neq \ell, j$, which contradicts the fact that $(\Phi_1, \cdots, \Phi_L)$ is Pareto-optimal. Thus the presumption is not true, and the proof is completed.

E. Proof of Lemma 1

Consider the case with $\nu_\ell \in \mathcal{F}_{\ell,0}$, for which $F_{\ell,0}(\nu_{\ell,0})$ is linear and monotonically increasing with respect to $\nu_{\ell,0}$ based on Proposition 3. In this case, we have $\nu_{\ell,0} \ge 1/B_1$ accordingly. Furthermore, with the IT constraints in (26), there exists a possible solution in which only the IT constraint is tight, rather than the power budget constraints. Thus, we correspondingly have the following potential constraints:

$$\nu_{\ell,0} \ge \sum_{k \in \mathcal{K}_\ell} \frac{|h_k|^2 |\hat{g}_{k,j}|^2}{\left(|h_k|^2 + \sum_{j' \in \mathcal{L} \setminus \{\ell\}} \lambda_{\ell,j'} |\hat{g}_{k,j'}|^2\right)^2 \Gamma_{\ell,j}}, \quad \forall j \in \mathcal{L} \setminus \{\ell\}. \tag{61}$$

Therefore, the optimal solution to problem (38) when $\nu_\ell \in \mathcal{F}_{\ell,0}$ is as given in Lemma 1. This completes the proof.

REFERENCES

[1] B. Nazer and M. Gastpar, “Computation over multiple-access channels,” IEEE Trans. Inf. Theory, vol. 53, no. 10, pp. 3498–3516, Oct. 2007.
[2] M. Gastpar, “Uncoded transmission is exactly optimal for a simple Gaussian sensor network,” IEEE Trans. Inf. Theory, vol. 54, no. 11, pp. 5247–5251, Nov. 2008.
[3] O. Abari, H. Rahul, and D. Katabi, “Over-the-air function computation in sensor networks,” 2016. [Online]. Available: http://arxiv.org/abs/1612.02307
[4] F. Molinari, S. Stanczak, and J. Raisch, “Exploiting the superposition property of wireless communication for average consensus problems in multi-agent systems,” in Proc. European Control Conference (ECC), Limassol, Cyprus, Aug. 2018, pp. 1766–1772.
[5] G. Zhu, Y. Wang, and K. Huang, “Broadband analog aggregation for low-latency federated edge learning,” IEEE Trans. Wireless Commun., vol. 19, no. 1, pp. 491–506, Jan. 2020.
[6] G. Zhu, Y. Du, D. Gündüz, and K. Huang, “One-bit over-the-air aggregation for communication-efficient federated edge learning: Design and convergence analysis,” 2020. [Online]. Available: https://arxiv.org/abs/2001.05713.pdf
[7] K. Yang, T. Jiang, Y. Shi, and Z. Ding, “Federated learning via over-the-air computation,” IEEE Trans. Wireless Commun., vol. 19, no. 3, pp. 2022–2035, Mar. 2020.
[8] N. Zhang and M. Tao, “Gradient statistics aware power control for over-the-air federated learning,” 2019. [Online]. Available: https://arxiv.org/abs/2003.02089.pdf
[9] M. M. Amiri and D. Gündüz, “Machine learning at the wireless edge: Distributed stochastic gradient descent over-the-air,” IEEE Trans. Signal Process., vol. 68, pp. 2155–2169, Apr. 2020.
[10] J. Xiao, S. Cui, Z. Luo, and A. Goldsmith, “Linear coherent decentralized estimation,” IEEE Trans. Signal Process., vol. 56, no. 2, pp. 757–770, Feb. 2008.
[11] C. Wang, A. Leong, and S. Dey, “Distortion outage minimization and diversity order analysis for coherent multiaccess,” IEEE Trans. Signal Process., vol. 59, no. 12, pp. 6144–6159, Dec. 2011.
[12] O. Abari, H. Rahul, D. Katabi, and M. Pant, “AirShare: Distributed coherent transmission made seamless,” in Proc. IEEE INFOCOM, Kowloon, Hong Kong, Apr. 2015, pp. 1742–1750.
[13] G. Zhu and K. Huang, “MIMO over-the-air computation for high-mobility multi-modal sensing,” IEEE Internet Things J., vol. 6, no. 4, pp. 6089–6103, Aug. 2019.
[14] D. Wen, G. Zhu, and K. Huang, “Reduced-dimension design of MIMO over-the-air computing for data aggregation in clustered IoT networks,” IEEE Trans. Wireless Commun., vol. 18, no. 11, pp. 5255–5268, Nov. 2019.
[15] J. Dong, Y. Shi, and Z. Ding, “Blind over-the-air computation and data fusion via provable Wirtinger flow,” IEEE Trans. Signal Process., vol. 68, pp. 1136–1151, Jan. 2020.
[16] A. Goldsmith, Wireless Communications, Cambridge, U.K.: Cambridge Univ. Press, Aug. 2005.
[17] X. Cao, G. Zhu, J. Xu, and K. Huang, “Optimized power control for over-the-air computation in fading channels,” to appear in IEEE Trans. Wireless Commun. (available online at https://arxiv.org/pdf/1906.06858.pdf)
[18] W. Liu and X. Zang, “Over-the-air computation systems: Optimization, analysis and scaling laws,” 2019. [Online]. Available: https://arxiv.org/pdf/1909.00329.pdf
[19] D. Gesbert, S. Hanly, H. Huang, S. Shamai Shitz, O. Simeone, and W. Yu, “Multi-cell MIMO cooperative networks: A new look at interference,” IEEE J. Sel. Areas Commun., vol. 28, no. 9, pp. 1380–1408, Dec. 2010.
[20] Q. Shi, M. Razaviyayn, Z. Luo, and C. He, “An iteratively weighted MMSE approach to distributed sum-utility maximization for a MIMO interfering broadcast channel,” IEEE Trans. Signal Process., vol. 59, no. 9, pp. 4331–4340, Sep. 2011.
[21] R. Zhang and S. Cui, “Cooperative interference management with MISO beamforming,” IEEE Trans. Signal Process., vol. 58, no. 10, pp. 5450–5458, Oct. 2010.
[22] L. Liu, R. Zhang, and K. Chua, “Achieving global optimality for weighted sum-rate maximization in the K-user Gaussian interference channel with multiple antennas,” IEEE Trans. Wireless Commun., vol. 11, no. 5, pp. 1933–1945, May 2012.
[23] R. H. Etkin, D. N. C. Tse, and H. Wang, “Gaussian interference channel capacity to within one bit,” IEEE Trans. Inf. Theory, vol. 54, no. 12, pp. 5534–5562, Dec. 2008.
[24] L. P. Qian, Y. J. Zhang, and J. Huang, “MAPEL: Achieving global optimality for a non-convex wireless power control problem,” IEEE Trans. Wireless Commun., vol. 8, no. 3, pp. 1553–1563, Mar. 2009.
[25] S. Kandukuri and S. Boyd, “Optimal power control in interference-limited fading wireless channels with outage-probability specifications,” IEEE Trans. Wireless Commun., vol. 1, no. 1, pp. 46–55, Jan. 2002.
[26] Q. Lan, H. S. Kang, and K. Huang, “Simultaneous signal-and-interference alignment for two-cell over-the-air computation,” 2020. [Online]. Available: https://arxiv.org/abs/2001.03309.pdf
[27] M. Mohseni, R. Zhang, and J. M. Cioffi, “Optimized transmission for fading multiple-access and broadcast channels with multiple antennas,” IEEE J. Sel. Areas Commun., vol. 24, no. 8, pp. 1627–1639, Aug. 2006.
[28] M. Grant and S. Boyd, CVX: MATLAB Software for Disciplined Convex Programming, Version 2.1, 2016. [Online]. Available: http://cvxr.com/cvx
[29] S. Boyd and L. Vandenberghe,