Exploiting Deep Learning for Secure Transmission in an Underlay Cognitive Radio Network

Abstract

This paper investigates a machine learning-based power allocation design for secure transmission in a cognitive radio (CR) network. In particular, a neural network (NN)-based approach is proposed to maximize the secrecy rate of the secondary receiver under constraints on the total transmit power of the secondary transmitter and on the interference leakage to the primary receiver, within which three different regularization schemes are developed. The key advantage of the proposed algorithm over conventional approaches is its capability to solve the power allocation problem with both perfect and imperfect channel state information. In a conventional setting, two completely different optimization frameworks have to be designed, namely the robust and non-robust designs. Furthermore, conventional algorithms are often based on iterative techniques and hence require a considerable number of iterations, rendering them less suitable for future wireless networks with very stringent delay constraints. To meet the unprecedented requirements of future ultra-reliable low-latency networks, we propose an NN-based approach that can determine the power allocation in a CR network with significantly reduced computational time and complexity. As the trained NN only requires a small number of linear operations to yield the required power allocations, the approach can also be extended to different delay-sensitive applications and services in future wireless networks. When evaluated against conventional approaches on a suitable test set, the proposed approach achieves more than 94% of the secrecy rate performance with less than 1% of the computation time, and satisfies the interference leakage constraints in more than 93% of the cases. These results are obtained with a significant reduction in computational time, which we believe makes the approach suitable for future real-time wireless applications.

arXiv [cs.IT], Jan

Miao Zhang, Member, IEEE, Kanapathippillai Cumanan, Senior Member, IEEE, Jeyarajan Thiyagalingam, Senior Member, IEEE, Yanqun Tang, Wei Wang, Member, IEEE, Zhiguo Ding, Fellow, IEEE, and Octavia A. Dobre, Fellow, IEEE

The work of M. Zhang was supported by the Research Start Up Funding of Chongqing Jiaotong University under grant number 2020020070. The work of Y. Tang was supported by the Guangdong Natural Science Foundation under grant number 2019A1515011622 and the National Natural Science Foundation under grant number 62071499. The work of W. Wang was supported in part by the Six Categories Talent Peak of Jiangsu Province under Grant KTHY-039, the Science and Technology Program of Nantong under Grant MS22019019 and the Verification Platform of Multi-tier Coverage Communication Network for oceans under Grant LZC0020. (Corresponding author: Yanqun Tang.)

M. Zhang is with the School of Information Science and Engineering, Chongqing Jiaotong University, Chongqing, China.
K. Cumanan is with the Department of Electronic Engineering, University of York, York, United Kingdom, YO10 5DD.
J. Thiyagalingam is with the Scientific Computing Department of Rutherford Appleton Laboratory, Science and Technology Facilities Council, Harwell Campus, Didcot, UK.
Y. Tang is with the School of Information Science and Engineering, Chongqing Jiaotong University, Chongqing, China, and also with the School of Electronics and Communication Engineering, Sun Yat-Sen University, Shenzhen, 510006, China.
W. Wang is with the School of Information Science and Technology, Nantong University, Nantong, China, with the Nantong Research Institute for Advanced Communication Technologies, Nantong, China, and also with the Research Center of Networks and Communications, Peng Cheng Laboratory, Shenzhen, China.
Z. Ding is with the School of Electrical and Electronic Engineering, The University of Manchester, Manchester, UK.
O. A. Dobre is with the Department of Electrical and Computer Engineering, Memorial University, St. John's, NL A1B 3X5, Canada.

Index Terms: Deep learning, neural network, physical layer security, cognitive radio networks, resource allocation techniques.

I. INTRODUCTION

Wireless communications have become an indispensable part of daily life, as they play a crucial role in our day-to-day activities and in the means of interaction in the current networked society. However, information security is one of the major challenges in wireless networks, because the open nature of wireless signal transmission makes it vulnerable to interception and eavesdropping. The conventional security methods employed at the upper layers of current communication systems rely completely on cryptographic techniques [1], [2]. Although existing conventional security techniques, which are built on highly complex and intractable mathematical problems, are difficult to break, the broadcast nature of wireless transmissions introduces challenges in terms of secret key exchange and distribution [3], [4]. As a result, information-theoretic physical layer security has been proposed to complement the conventional cryptographic methods and to provide additional security measures in wireless transmissions. This approach exploits the dynamics of the physical layer characteristics of wireless channels to establish secure transmission [1]. A reasonable secrecy rate can be realized through physical layer security techniques provided that the signal-to-interference-plus-noise ratio (SINR) of the channel of the legitimate user is better than that of the channel of the eavesdropper [5]. This technique was first theoretically proved by Shannon [5], and the secrecy capacities of wiretap and related channels were then developed by Wyner [6] and Csiszar [7]. In contrast to the conventional cryptographic methods, physical layer security schemes are more suitable for practical implementations, as these techniques do not require any secret key distribution or exchange. Furthermore, it is difficult for interceptors to decipher the information transmitted across wireless channels based on physical layer security [3].

Recently, machine learning techniques have been applied widely to solve challenging problems that have complicated structures and stringent constraints on computational time [8]. Furthermore, artificial intelligence has become one of the fastest growing techniques in many research topics [9], and its practical implementations can be realized through different machine learning techniques. As such, these techniques enable machines to acquire knowledge from their computations and to make decisions according to the environment [10]–[12]. There are various machine learning frameworks available in the literature [9]–[13], such as linear regression, logistic regression, and neural networks (NNs). The NN is one of the best-known machine learning techniques due to its capability to capture different relationships in complicated and statistical data sets [14], [15]. In recent years, considerable research interest has developed in utilizing NNs to design and optimize wireless communication systems, where researchers believe that the NN will be the core technique for 5G and beyond wireless systems [16]–[19].

A. Motivation and Contributions

In secure transmission designs, different optimization approaches with various approximation techniques have been widely exploited to solve complicated and mathematically intractable resource allocation problems [20]–[25]. However, these techniques are often based on iterative approaches that yield either optimal or sub-optimal solutions. The computational complexity associated with these conventional optimization techniques is neither affordable for low-powered devices in the Internet-of-Things (IoT) nor suitable for applications requiring ultra reliability and low latency in future wireless networks. Furthermore, these conventional optimization techniques pose challenges in delay-sensitive systems, as the dynamic nature of real-time parameters requires frequent updates within a very short time [26]. This introduces stringent delay requirements for updating those design parameters, which conventional optimization approaches cannot meet. Machine learning techniques can be considered as potential solution approaches to these real-time update issues. Among the many machine learning approaches, deep learning has a number of benefits. Although some of these benefits are shared across different methods, deep learning offers better learnability with increased volumes of data. We summarize these benefits as follows:
1) An NN has the potential to provide a solution within a short time frame and with reduced computational complexity [27], compared to other machine learning techniques such as the support vector machine (SVM) and Gaussian processes (GP) [27], [28]. A particular advantage here is that conventional machine learning approaches use all available data, whereas the NN relies on samples of data drawn in batches (mini-batch gradient descent algorithm). This process demands only a subset of the available large dataset at each training step, as opposed to every data point;
2) A single NN model can be trained to meet the objectives of multiple tasks [27], whereas it is difficult for other machine learning techniques to achieve those multiple objectives with the same model; and
3) An NN is able to automatically extract features from highly complex datasets and to formulate latent representations, which can further help with learning [29].

In the literature, several bodies of work have demonstrated that machine learning techniques can be exploited to solve these types of problems in different real-time wireless communication applications. For example, deep learning-based channel estimation and signal detection techniques in orthogonal frequency division multiplexing (OFDM) systems are investigated in [30]. A deep NN-based method for efficient on-line configuration of reconfigurable intelligent surfaces is proposed in [31], where the focusing of the transmitted signal is improved in an indoor environment. A deep reinforcement learning based joint transmit beamforming and phase shift matrix design for reconfigurable intelligent surface aided MISO systems is studied in [32]. NN-based spectrum and energy efficiency maximization techniques are proposed for a cognitive radio (CR) network in [33]. A learning-based approach for wireless resource management is presented in [26], whereas a reinforcement learning based resource allocation technique is developed for vehicle-to-vehicle communications in [34]. A deep NN is utilized to learn interference management over interference-limited channels in [35], whereas the authors of [36] design a deep NN for channel calibration between the uplink and downlink directions in generic massive MIMO systems.
However, none of these works has considered employing machine learning techniques to simultaneously solve resource allocation problems with perfect and imperfect channel state information (CSI) in secure communication systems.

In general, the motivations behind this work can be summarized as follows: (1) Although the conventional optimization approaches can yield globally or locally optimal solutions for resource allocation problems, their complex implementations render them less practical for real-world deployments, particularly on resource-limited edge devices where the tolerance for delay is minimal. For NNs, once trained, the inference step does not demand back-propagation, so it relies only on a limited number of floating-point operations. This offers a two-fold benefit. First, once trained (using powerful computational resources), the NN model can be moved around for inference purposes, particularly to edge devices where computational resources are often limited. Second, an NN-based approach can offer near real-time performance. This is clearly evidenced by modern edge devices, such as smart phones and cameras. (2) As an NN provides reduced computational complexity for inference compared to other machine learning techniques, it is useful for our resource allocation problem. (3) Finally, with the rise of machine learning algorithms in various domains of science, the community would benefit if some baseline performance could be established for resource allocation problems in wireless communications and compared against conventional optimization-based solutions.

To carry out the study, in this paper, we consider secure transmission in a CR network, as shown in Fig. 1. This secure network consists of one primary transmitter (PU-Tx), one primary receiver (PU-Rx), one secondary transmitter (SU-Tx), one secondary receiver (SU-Rx) and one eavesdropper (EVE). Each of these terminals is equipped with a single antenna.
Our main objective is to design an NN approach that can achieve near-optimal secrecy rate performance with significantly reduced computational time compared to the existing conventional optimization schemes in the literature. In particular, the optimal power allocation is determined to maximize the achievable secrecy rate under constraints on the total transmit power of the SU-Tx and on the interference leakage to the PU-Rx. We develop two approaches in this paper: the conventional optimization approach and the NN-based framework. We show that the NN-based approach can be exploited to solve both the robust and the non-robust secrecy rate maximization problems, whereas the conventional optimization techniques require two completely different problem formulations and solution approaches. The contributions of this work are summarized as follows:
1) Firstly, to the best of our knowledge and according to the surveys [27], [37], [38], none of the existing works has considered developing an NN framework to solve the secrecy rate maximization problems in an underlay CR network.
2) Secondly, due to the imperfections and non-linearities in practical systems [39], we also consider a more practical imperfect CSI scenario in this paper, whereas most of the previous works that apply NNs to resource allocation problems only consider perfect CSI scenarios. Therefore, the framework of the proposed NN is different from those found in related works, i.e., we have added the channel error bounds as input parameters to enable the NN to learn the impact of these errors on the power allocation.
3) Thirdly, we propose an NN-based algorithm to simultaneously solve the secrecy rate maximization problem with perfect and imperfect CSI at the SU-Tx. The key advantage of the developed approach is that the same NN-based algorithm can be exploited to solve both the robust and the non-robust secrecy rate maximization problems, with imperfect and perfect CSI, respectively. In contrast, in the conventional optimization approaches, these problems need to be formulated into two completely different optimization frameworks. Furthermore, to reduce over-fitting, we also embed two regularization techniques into our proposed NN designs. To generate the required training set, we utilize the conventional optimization framework and then train the NN with this training set to determine appropriate weights for the connections in the proposed NN. These weighted connections establish a mathematical relationship between the input and the corresponding output. After completing the training process, we evaluate the performance of the proposed NN-based approach against the conventional optimization approaches available in the literature.
4) Finally, we compare the performance of both schemes in terms of the achieved secrecy rate and the required computation time to demonstrate the effectiveness and superiority of our proposed NN scheme.

Fig. 1: A CR network with one PU-Tx, one PU-Rx, one SU-Tx, one SU-Rx and one EVE. Each is equipped with a single antenna.

The remainder of this paper is organized as follows. The system model is presented in Section II, whereas the secrecy rate maximization problems with both perfect and imperfect CSI are formulated and solved by using conventional optimization techniques in Section III. Section IV presents an NN-based optimization framework. Section V provides simulation results to demonstrate the effectiveness of the proposed approach. Section VI discusses the limitations of the proposed approach and several potential directions for future work, and finally, Section VII concludes this paper.

B. Notations

We use upper and lower case boldface letters for matrices and vectors, respectively. $(\cdot)^{-1}$, $(\cdot)^{T}$ and $(\cdot)^{H}$ stand for the inverse, transpose and conjugate transpose operations, respectively. $|a|$ represents the absolute value of $a$. $[x]^{+}$ denotes $\max\{x, 0\}$. The 1-norm and 2-norm of $\mathbf{x}$ are expressed as $\|\mathbf{x}\|_1$ and $\|\mathbf{x}\|_2$, respectively. $\mathbf{A} \cdot \mathbf{B}$ represents the dot product of matrices $\mathbf{A}$ and $\mathbf{B}$. $h'(x)$ is the first derivative of the function $h$ at $x$. The circularly symmetric complex Gaussian distribution with mean $\mu$ and variance $\sigma^2$ is represented by $\mathcal{CN}(\mu, \sigma^2)$.

II. SYSTEM MODEL AND PROBLEM FORMULATION

We consider a CR network, as shown in Fig. 1, with five terminals: one PU-Tx, one SU-Tx, one SU-Rx, one PU-Rx and one EVE. All terminals are equipped with a single antenna. The SU-Tx intends to send a confidential message to the SU-Rx while ensuring that the interference leakage to the PU-Rx is less than a predefined threshold. At the same time, the EVE attempts to intercept the information sent by the SU-Tx to the SU-Rx. The channels between the PU-Tx and the PU-Rx, SU-Rx, and EVE are represented by $g_p$, $g_s$, and $g_e$, respectively, whereas the channels between the SU-Tx and the PU-Rx, SU-Rx, and EVE are denoted by $h_p$, $h_s$, and $h_e$, respectively. The received signals at the SU-Rx and the EVE can be expressed, respectively, as

$$y_s = \sqrt{P_s}\, h_s x_s + \sqrt{P_p}\, g_s x_p + n_s, \tag{1}$$

$$y_e = \sqrt{P_s}\, h_e x_s + \sqrt{P_p}\, g_e x_p + n_e, \tag{2}$$

where $x_s$ ($E\{|x_s|^2\} = 1$) and $x_p$ ($E\{|x_p|^2\} = 1$) are the symbols sent from the SU-Tx to the SU-Rx and from the PU-Tx to the PU-Rx, respectively. The noise terms at the SU-Rx and the EVE are denoted by $n_s$ ($E\{|n_s|^2\} = \sigma_s^2$) and $n_e$ ($E\{|n_e|^2\} = \sigma_e^2$), respectively. Furthermore, as (1) and (2) show, $P_s$ and $P_p$ are the transmit powers of the SU-Tx and the PU-Tx, respectively. The SINRs at the SU-Rx and the EVE are defined as

$$\gamma_s = \frac{P_s |h_s|^2}{P_p |g_s|^2 + \sigma_s^2}, \tag{3}$$

$$\gamma_e = \frac{P_s |h_e|^2}{P_p |g_e|^2 + \sigma_e^2}. \tag{4}$$

The achievable secrecy rate at the SU-Rx can be written as [40]

$$R_s = \left[ \log_2(1 + \gamma_s) - \log_2(1 + \gamma_e) \right]^{+}. \tag{5}$$

The interference leakage to the PU-Rx can be expressed as

$$P_{in} = P_s |h_p|^2. \tag{6}$$

With these definitions, the secrecy rate maximization problem can be formulated as

$$\max_{P_s} \ R_s \quad \text{s.t.} \quad P_s |h_p|^2 \le q, \quad P_s \le P_t, \quad P_s \ge 0, \tag{7}$$

where $q$ is the maximum interference leakage to the PU-Rx, and $P_t$ is the maximum transmit power available at the SU-Tx. In the following sections, we present two ways to solve this problem: conventional optimization approaches and an NN-based approach.

III. CONVENTIONAL OPTIMIZATION BASED POWER ALLOCATION APPROACH

In this section, we present conventional convex optimization approaches to solve the secrecy rate maximization problem defined in (7), taking into account both the perfect and the imperfect CSI scenarios at the SU-Tx.
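Before the two cases are treated, the quantities that enter problem (7) can be made concrete with a small numerical sketch. The Python snippet below evaluates the SINRs in (3) and (4), the secrecy rate in (5) and the feasibility conditions of (7) for real-valued channel power gains. The function names and the convention of passing squared magnitudes (for example `h_s2` for $|h_s|^2$) are illustrative assumptions and do not come from the paper.

```python
import math


def sinr(P_s, h2, P_p, g2, sigma2):
    """SINR as in (3)/(4): P_s*|h|^2 / (P_p*|g|^2 + sigma^2)."""
    return P_s * h2 / (P_p * g2 + sigma2)


def secrecy_rate(P_s, h_s2, h_e2, g_s2, g_e2, P_p, sigma_s2, sigma_e2):
    """Achievable secrecy rate as in (5), in bits per channel use."""
    gamma_s = sinr(P_s, h_s2, P_p, g_s2, sigma_s2)
    gamma_e = sinr(P_s, h_e2, P_p, g_e2, sigma_e2)
    # [x]^+ = max{x, 0}: the rate is zero when the EVE has the better SINR.
    return max(0.0, math.log2(1.0 + gamma_s) - math.log2(1.0 + gamma_e))


def is_feasible(P_s, h_p2, q, P_t):
    """Constraints of (7): leakage P_s*|h_p|^2 <= q and 0 <= P_s <= P_t."""
    return 0.0 <= P_s <= P_t and P_s * h_p2 <= q


def max_feasible_power(h_p2, q, P_t):
    """Largest P_s that satisfies both upper bounds in (7)."""
    return min(P_t, q / h_p2)
```

For instance, `secrecy_rate(1.0, 3.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)` gives $\gamma_s = 1.5$ and $\gamma_e = 0.5$, so the rate equals $\log_2(2.5/1.5)$.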

A. Perfect CSI

In this subsection, we present the conventional convex optimization-based approach to solve the problem defined in (7) under the perfect CSI assumption. The problem in (7) is non-convex in its original form due to the non-convex objective function. Based on the monotonicity of logarithmic functions, we reformulate the original problem in (7) as

$$\max_{P_s} \ \frac{1 + \dfrac{P_s |h_s|^2}{P_p |g_s|^2 + \sigma_s^2}}{1 + \dfrac{P_s |h_e|^2}{P_p |g_e|^2 + \sigma_e^2}} \quad \text{s.t.} \quad P_s |h_p|^2 \le q, \quad P_s \le P_t, \quad P_s \ge 0. \tag{8}$$

The above problem remains non-convex due to the fractional objective function, and therefore, it cannot be directly solved using existing convex optimization tools. To circumvent this non-convexity issue, we convert the original problem into a two-level optimization problem, consisting of an outer problem and an inner problem. The outer problem can be written with respect to (w.r.t.) a new scalar variable $t$ as

$$\max_{t \ge 0} \ \frac{f(t)}{1 + t}, \tag{9}$$

whereas the inner problem can be expressed for a given $t$ as

$$f(t) = \max_{P_s} \ 1 + \frac{P_s |h_s|^2}{P_p |g_s|^2 + \sigma_s^2} \quad \text{s.t.} \quad P_s |h_p|^2 \le q, \quad \frac{P_s |h_e|^2}{P_p |g_e|^2 + \sigma_e^2} \le t, \quad P_s \le P_t, \quad P_s \ge 0. \tag{10}$$

The inner problem in (10) is convex for a given $t$ and can be solved by using standard interior-point methods. Since the inner problem in (10) is a convex problem, the outer problem in (9) is a quasi-convex optimization problem w.r.t. the variable $t$. Therefore, we employ a one-dimensional search to obtain the optimal $t^*$ and $P_s^*$ [41]. The proposed one-dimensional search algorithm is summarized in Algorithm 1.

Algorithm 1: One-dimensional search based on bisection method
1) Initialize $t \in [0, t_{\max}]$, $c = (\sqrt{5} - 1)/2$, $a = 0$, $b = t_{\max}$, $t_1 = (1 - c)b$, $t_2 = cb$;
2) Compute $f(t_1)$, $f(t_2)$;
3) Repeat
4) If $\frac{f(t_1)}{1 + t_1} > \frac{f(t_2)}{1 + t_2}$, set $b = t_2$, $t_2 = t_1$, $f(t_2) = f(t_1)$, $t_1 = a + (1 - c)(b - a)$, and update $f(t_1)$;
5) Else, set $a = t_1$, $t_1 = t_2$, $f(t_1) = f(t_2)$, $t_2 = a + c(b - a)$, and update $f(t_2)$;
6) Until $|b - a| \le \epsilon$, where $\epsilon$ is the threshold to terminate the algorithm.

B. Imperfect CSI
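As a point of reference for the robust design developed in this subsection, the non-robust search of Algorithm 1 (perfect CSI) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the constant $c = (\sqrt{5}-1)/2$ makes the interval update a golden-section step; the inner problem (10) is solved in closed form instead of with an interior-point solver, which is possible here only because $P_s$ is a scalar and the objective of (10) grows with $P_s$; and $t_{\max}$ is taken as the EVE SINR at full power $P_t$. The names `a_s`, `a_e` and `h_p2` (standing for $|h_s|^2/(P_p|g_s|^2+\sigma_s^2)$, $|h_e|^2/(P_p|g_e|^2+\sigma_e^2)$ and $|h_p|^2$) are hypothetical, and `a_e` is assumed to be strictly positive.

```python
import math


def inner_problem(t, a_s, a_e, h_p2, q, P_t):
    """Closed-form solution of the inner problem (10) for a given t.

    Returns (f(t), P_s) with f(t) = 1 + a_s * P_s.
    Assumption: the scalar structure lets us replace the interior-point
    solver mentioned in the text by the tightest upper bound on P_s.
    """
    P_s = min(P_t, q / h_p2, t / a_e)
    return 1.0 + a_s * P_s, P_s


def one_dimensional_search(a_s, a_e, h_p2, q, P_t, eps=1e-9):
    """Search over t following the steps of Algorithm 1."""
    c = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = 0.0, a_e * P_t  # t_max = EVE SINR at full power (assumption)
    t1, t2 = (1.0 - c) * b, c * b
    f1, _ = inner_problem(t1, a_s, a_e, h_p2, q, P_t)
    f2, _ = inner_problem(t2, a_s, a_e, h_p2, q, P_t)
    while abs(b - a) > eps:
        if f1 / (1.0 + t1) > f2 / (1.0 + t2):
            # Maximum lies in [a, t2]: shrink from the right, reuse t1.
            b, t2, f2 = t2, t1, f1
            t1 = a + (1.0 - c) * (b - a)
            f1, _ = inner_problem(t1, a_s, a_e, h_p2, q, P_t)
        else:
            # Maximum lies in [t1, b]: shrink from the left, reuse t2.
            a, t1, f1 = t1, t2, f2
            t2 = a + c * (b - a)
            f2, _ = inner_problem(t2, a_s, a_e, h_p2, q, P_t)
    t_opt = 0.5 * (a + b)
    _, P_opt = inner_problem(t_opt, a_s, a_e, h_p2, q, P_t)
    return t_opt, P_opt
```

In this scalar setting the result can be checked by hand: the objective of (8) increases with $P_s$ exactly when `a_s > a_e`, so the search should return the largest feasible power $\min(P_t, q/|h_p|^2)$ in that case and a power close to zero otherwise.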

In this subsection, we develop a tractable approach to solve the secrecy rate maximization problem with imperfect CSI available at the SU-Tx. We reformulate this robust problem into a tractable one by exploiting the Charnes-Cooper transformation [42] and the S-Procedure [43].

In practical scenarios, it is difficult for the SU-Tx to obtain perfect CSI due to channel estimation and quantization errors [44]. Instead, the SU-Tx has knowledge of its estimated CSI and of the uncertainty regions that contain the actual channel realizations, which is referred to as imperfect CSI. Note that the imperfect CSI of the PU-Rx and the SU-Rx can be estimated based on standard CSI feedback techniques [45]. Furthermore, the imperfect CSI of the EVE can be obtained by the following two methods: (1) when the EVE is part of the system, the CSI can be estimated with standard CSI feedback techniques, as the EVE should be able to cooperate with the SU-Tx through its CSI feedback [45]; (2) when the EVE is not part of the system, the CSI can be estimated at the SU-Tx through the local oscillator power leakage from the EVE's RF front end, the details of which can be found in [46].

In this work, the imperfect CSI is modelled based on the deterministic models [1], [44], [47], in which it is assumed that the actual channel lies in an ellipsoid centred at the channel mean. Under this CSI assumption, the estimated CSI and the error bounds are known at the SU-Tx, while the actual values of the channel errors are unknown [1], [44], [47]. The actual channel coefficients can be modelled with the corresponding channel uncertainties as follows:

$$h_s = \hat{h}_s + e_s, \quad h_e = \hat{h}_e + e_e, \quad h_p = \hat{h}_p + e_p, \tag{11}$$

where $\hat{h}_s$, $\hat{h}_e$ and $\hat{h}_p$ are the channel coefficients estimated by the SU-Tx. Furthermore, the symbols $e_s$, $e_e$ and $e_p$ represent the channel uncertainties.
These channel uncertainties are assumed to be bounded by predefined ellipsoids, as follows:

$$|e_s| = |h_s - \hat{h}_s| \le \epsilon_s, \tag{12}$$

$$|e_e| = |h_e - \hat{h}_e| \le \epsilon_e, \tag{13}$$

$$|e_p| = |h_p - \hat{h}_p| \le \epsilon_p, \tag{14}$$

where $\epsilon_s \ge 0$, $\epsilon_e \ge 0$ and $\epsilon_p \ge 0$ are the error bounds. Based on these bounded channel uncertainties and the monotonicity of log functions, the robust secrecy rate maximization problem can be reformulated into the following robust optimization framework:

$$\max_{P_s} \ \frac{1 + \dfrac{P_s |\hat{h}_s + e_s|^2}{P_p |g_s|^2 + \sigma_s^2}}{1 + \dfrac{P_s |\hat{h}_e + e_e|^2}{P_p |g_e|^2 + \sigma_e^2}} \quad \text{s.t.} \quad P_s |\hat{h}_p + e_p|^2 \le q, \quad P_s \le P_t, \quad P_s \ge 0. \tag{15}$$

First, we introduce the Charnes-Cooper transformation [42] as

$$P_s = \frac{\bar{P}_s}{t}, \tag{16}$$

to recast the problem defined in (15) as

$$\max_{\bar{P}_s,\, t} \ t + \frac{\bar{P}_s |\hat{h}_s + e_s|^2}{P_p |g_s|^2 + \sigma_s^2} \quad \text{s.t.} \quad t + \frac{\bar{P}_s |\hat{h}_e + e_e|^2}{P_p |g_e|^2 + \sigma_e^2} \le 1, \quad \bar{P}_s |\hat{h}_p + e_p|^2 \le tq, \quad \bar{P}_s \le t P_t, \quad \bar{P}_s \ge 0. \tag{17}$$

The problem defined in (17) can be rewritten in epigraph form by introducing a new slack variable $\tau$:

$$\max_{\bar{P}_s,\, t,\, \tau} \ \tau \tag{18a}$$

$$\text{s.t.} \quad t + \frac{\bar{P}_s |\hat{h}_s + e_s|^2}{P_p |g_s|^2 + \sigma_s^2} \ge \tau, \tag{18b}$$

$$t + \frac{\bar{P}_s |\hat{h}_e + e_e|^2}{P_p |g_e|^2 + \sigma_e^2} \le 1, \tag{18c}$$

$$\bar{P}_s |\hat{h}_p + e_p|^2 \le tq, \tag{18d}$$

$$\bar{P}_s \le t P_t, \quad \bar{P}_s \ge 0. \tag{18e}$$

The above problem is still intractable, because the constraints (18b)-(18d) must hold over the infinitely many channel realizations in the uncertainty sets. To address this issue, we employ the following proposition:

Proposition 1: The constraints in (18b)-(18d) can be equivalently written as

$$\begin{bmatrix} \lambda_1 + \dfrac{\bar{P}_s}{P_p |g_s|^2 + \sigma_s^2} & \dfrac{\bar{P}_s \hat{h}_s}{P_p |g_s|^2 + \sigma_s^2} \\ \dfrac{\bar{P}_s \hat{h}_s^{H}}{P_p |g_s|^2 + \sigma_s^2} & \dfrac{\bar{P}_s |\hat{h}_s|^2}{P_p |g_s|^2 + \sigma_s^2} + t - \tau - \lambda_1 \epsilon_s^2 \end{bmatrix} \succeq 0, \quad \lambda_1 \ge 0, \tag{19}$$

$$\begin{bmatrix} \lambda_2 - \dfrac{\bar{P}_s}{P_p |g_e|^2 + \sigma_e^2} & -\dfrac{\bar{P}_s \hat{h}_e}{P_p |g_e|^2 + \sigma_e^2} \\ -\dfrac{\bar{P}_s \hat{h}_e^{H}}{P_p |g_e|^2 + \sigma_e^2} & 1 - \dfrac{\bar{P}_s |\hat{h}_e|^2}{P_p |g_e|^2 + \sigma_e^2} - t - \lambda_2 \epsilon_e^2 \end{bmatrix} \succeq 0, \quad \lambda_2 \ge 0, \tag{20}$$

and

$$\begin{bmatrix} \lambda_3 - \bar{P}_s & -\bar{P}_s \hat{h}_p \\ -\bar{P}_s \hat{h}_p^{H} & tq - \bar{P}_s |\hat{h}_p|^2 - \lambda_3 \epsilon_p^2 \end{bmatrix} \succeq 0, \quad \lambda_3 \ge 0. \tag{21}$$

Proof: Please refer to Appendix A.

Therefore, we rewrite the problem in (18) in the following equivalent form:

$$\max_{\bar{P}_s,\, t,\, \tau} \ \tau \quad \text{s.t.} \quad \text{(19)-(21)}, \quad \bar{P}_s \le t P_t, \quad \bar{P}_s \ge 0. \tag{22}$$

The above problem is convex, and therefore, the optimal $P_s^* = \bar{P}_s^* / t^*$ can be obtained efficiently by using the convex optimization toolbox [48].

IV. POWER ALLOCATION FRAMEWORK BASED ON NN

In this section, we present our proposed NN-based schemes. In this approach, the secrecy rate maximization problem is treated as an unknown non-linear mapping, and an NN is trained to learn the relationship between the input and the output parameters.

First, NNs can be considered as universal function approximators [49] and have been shown to have remarkable capabilities for algorithmic learning [50]. As such, they are akin to conventional optimizer-based solutions. Second, the literature demonstrates that NN schemes have the capability to substantially reduce the computational complexity and processing time for a variety of problems in wireless communications, such as resource allocation [33], [35], [51], channel estimation and signal detection [30], and physical layer designs [39]. Third, once the networks are trained (ideally using scalable computational resources), the resulting model is suitable for inference with very limited resources in real-time applications [52].

Fig. 2: The structure of proposed NN.

Remark 1:

Although the notion of function approximation is useful to derive a powerful learned model, rendering an adaptive learning model is a challenging goal, including, but not limited to, anticipating varying inputs, noisy conditions, and failures. As such, a simple NN-based approach alone cannot handle dynamic problems effectively. Instead, a solution to a dynamic problem will involve a hybrid approach, covering optimization techniques, NNs, on-line learning, reinforcement learning and possibly other techniques.

Remark 2:

The proposed NN-based approach performs its differentiation in the real domain to determine the weights and biases by minimizing loss functions. In the complex domain, however, a problem arises: complex derivatives exist if and only if the loss function satisfies the Cauchy-Riemann equations. Functions that satisfy these equations are called holomorphic functions; otherwise, they are called non-holomorphic functions [53]. This condition introduces challenges for directly employing the proposed NN-based approach to learn to optimize in multiple-antenna wireless communication systems. For example, in future wireless networks aided by holographic multiple-input multiple-output (MIMO) surfaces and reconfigurable intelligent surfaces, the NNs need to deal with different parameters in the complex domain.

Our aim is to utilize the high computational efficiency of the NN in its testing stage to design a time- and computation-efficient real-time power allocation scheme that can be applied to solve the power allocation problem with both perfect and imperfect CSI. As shown in Fig. 2, the proposed NN consists of three types of layers: an input layer, multiple hidden layers and an output layer. In particular, we choose $|\hat{h}_s|$, $|\hat{h}_p|$, $|\hat{h}_e|$, $|g_s|$, $|g_e|$, $\epsilon_s$, $\epsilon_e$ and $\epsilon_p$ as the inputs and $P_s^*$ as the output of the training data. Note that the perfect CSI scheme becomes a special case of the imperfect CSI scheme by setting the inputs as $|\hat{h}_s| = |h_s|$, $|\hat{h}_p| = |h_p|$, $|\hat{h}_e| = |h_e|$, and $\epsilon_s = \epsilon_e = \epsilon_p = 0$. The mapping between the input and the output parameters can be expressed as

$$P_s^* = f(|\hat{h}_s|, |\hat{h}_p|, |\hat{h}_e|, |g_s|, |g_e|, \epsilon_s, \epsilon_e, \epsilon_p). \tag{23}$$

We start from the input, pass the input data through the NN and calculate the actual output directly, which is referred to as feed-forward.
Furthermore, the calculation flow follows the natural forward direction from the input layer to the hidden layers and finally to the output layer. This process can be expressed mathematically as

$$\mathbf{z}^{(l+1)} = \mathbf{W}^{(l)} \mathbf{a}^{(l)} + \mathbf{b}^{(l)}, \tag{24}$$

$$\mathbf{a}^{(l+1)} = g(\mathbf{z}^{(l+1)}), \tag{25}$$

where $\mathbf{z}^{(l+1)}$ is the linear transformation of the given inputs at the $(l+1)$-th layer, whereas $\mathbf{a}^{(l+1)}$ is the output activation value of the $(l+1)$-th layer. $g(\cdot)$ denotes the activation function; in this work, we choose the rectified linear unit (ReLU) function as the activation function, which can be expressed as $g(x) = \max\{0, x\}$. $\mathbf{W}^{(l)}$ and $\mathbf{b}^{(l)}$ are the weight matrix and the bias vector for the $l$-th layer, respectively. For an $N$-layer NN, the mapping between the inputs and the output parameters can be expressed as

$$y = f(\mathbf{S}, \mathbf{W}, \mathbf{b}), \tag{26}$$

where $\mathbf{S} = [|\hat{h}_s|, |\hat{h}_p|, |\hat{h}_e|, |g_s|, |g_e|, \epsilon_s, \epsilon_e, \epsilon_p]$. Our goal is to determine the weights $\mathbf{W} = [\mathbf{W}^{(1)}, \ldots, \mathbf{W}^{(N-1)}]$ and the biases $\mathbf{b} = [\mathbf{b}^{(1)}, \ldots, \mathbf{b}^{(N-1)}]$ such that the functions in (23) and (26) yield similar outputs for the same set of inputs.

Proposition 2:

In order to have a similar outputs from both(23) and (26), we should minimize the following normalizedloss function: J ( W , b ) = 1 M M X m =1 ( y m − P ∗ s,m ) , (27) where M is the number of training data sets. y m and P ∗ s,m are the m -th output of the NN and the optimal transmitpower obtained by the conventional optimization approach,respectively. Proof:

Please refer to Appendix B.We iteratively use the back-propagation based gradientdescent algorithm to update the weights matrices W and thebias vectors b . Proposition 3:

Based on the back-propagation and the gra-dient descent algorithm, the weight matrix and the bias vectorfor the l -th layer W ( l ) and b ( l ) can be updated respectivelyby W ( l ) = W ( l ) − αM M X m =1 [ δ ( l +1) m ( a ( l ) m ) T ] , (28) b ( l ) = b ( l ) − αM M X m =1 δ ( l +1) m , (29)where α is the learning rate and δ ( l +1) m is deﬁned as δ ( l +1) m = ∂J ( W , b ) ∂ z ( l +1) m . Proof:

Please refer to Appendix C.In an NN, over-ﬁtting is the result of a model that isvery closely to or precisely aligned to a speciﬁc set of data[54], which occurs when the model learns the training dataset along with noises [55]. Over-ﬁtting leads the model notto be able to ﬁt additional data or reliably predict futureobservations [54]. Regularization is an approach to reduce thewell-known over-ﬁtting problem of a machine learning model[56], [57]. To overcome this over-ﬁtting problem, the L and L regularizations are most widely utilized techniques in theliterature [58], [59].The regularization term is added to the loss function toreduce the sum of absolute values of the weights in the L regularization method, where the loss function can be writtenas J ( W , b ) = 1 M M X m =1 ( y m − P ∗ s,m ) + λ M N − X l =1 || W ( l ) || , (30)where λ is the regularization parameter. Following the similarderivation of Proposition 2, the weights for the L regulariza-tion can be updated as W ( l ) = − αM M X m =1 [ δ ( l +1) m ( a ( l ) m ) T ] − αλM . (31)The bias b ( l ) can be updated by using the equation providedin (29). In the L regularization method, the sum of squares of theweights are reduced by adding the regularization term to theloss function, which can be mathematically expressed as J ( W , b ) = 1 M M X m =1 ( y m − P ∗ s,m ) + λ M N − X l =1 || W ( l ) || , (32)where λ is the regularization parameter. Following a derivationsimilar to that of Proposition 2, the weights for the L regularization can be updated as W ( l ) = (1 − αλM ) W ( l ) − αM M X m =1 [ δ ( l +1) m ( a ( l ) m ) T ] . 
(33)The bias b ( l ) for L regularization can be updated by usingthe equation provided in (29).The development of our proposed NN scheme can bedivided into three steps: (1) Obtaining the training data setby solving the secrecy rate maximization problem throughconventional optimization approach; (2) developing an NN-based algorithm to learn the relationship between the input andoutput parameters of this secure transmission system; (3) aftercompleting the training process, evaluating the performance ofthe trained NN over the conventional optimization algorithm.The details of these steps are provided in Algorithm 2.V. S IMULATION R ESULTS

In this section, we present numerical results to demonstrate the performance of our proposed NN schemes. The data set is obtained by utilizing the conventional optimization scheme in Section III with × different random channel realizations. We split the data set into two subsets: × for training and for validation. In the training process, all the NN parameters are updated by the mini-batch gradient descent algorithm based on the Adam optimizer [60], where the batch size is chosen to be ten. All the NN parameters are initialized by the Xavier initializer [61]. Furthermore, similar to [33], it is assumed that the NN has two hidden layers with one hundred neurons in each layer. The learning rate $\alpha$ is set to − and the regularization parameter $\lambda$ is assumed to be × − [28], [58], [62]. The test data set is obtained by using 3000 channel realizations. The transmit power of the PU-Tx is assumed to be 60 mW, whereas all the noise variances are set to 0.001. The channels $\hat{h}_s$, $\hat{h}_p$, $\hat{h}_e$, $g_s$, and $g_e$ are all generated by $\hat{h}_i = \chi_i \sqrt{d_i^{-\alpha}}$, $i = s, e, p$, and $g_j = \chi_j \sqrt{c_j^{-\alpha}}$, $j = s, e$, where $\chi_i \sim \mathcal{CN}(0,\ )$, $\chi_j \sim \mathcal{CN}(0,\ )$, $d_i$ is the distance between the SU-Tx and the $i$-th user and $c_j$ denotes the distance between the PU-Tx and the $j$-th user. The parameter $\alpha$ = 1 . denotes the path loss exponent. The distances between the transmitters and the corresponding receivers are assumed to be $d_s = 10$ m, $d_e = 20$ m, $d_p = 10$ m, $c_s = 20$ m and $c_e = 20$ m, respectively. The simulated data sets for training and testing were generated using MATLAB scripts, and the performance of data generation is irrelevant to the results. For training and testing the model, we used a system with an Intel Core i7-9700K processor with eight cores, clocked at 3.9 GHz, with 12 MB cache memory and 32 GB random access memory. The training was performed purely on CPUs (as opposed to GPUs).

Algorithm 2: The NN approach

Preparing process:
  Obtain the training data set by utilizing the conventional approaches in Section III: the optimal transmit power $P_s^*$ for the corresponding channel coefficients $|\hat{h}_s|$, $|\hat{h}_p|$, $|\hat{h}_e|$, $|g_s|$, $|g_e|$ and channel error bounds $\epsilon_s$, $\epsilon_e$, $\epsilon_p$;
Training process:
  Initialize the weight matrices $W$, the bias vectors $b$ and the learning rate $\alpha$;
  Divide the training data set into $I$ mini-batches, each of size $M$;
  For each batch:
    Input the training set $S = [S_1, \ldots, S_M]$ and $y = [y_1, \ldots, y_M]$, where $S_m = [|\hat{h}_{s,m}|, |\hat{h}_{p,m}|, |\hat{h}_{e,m}|, |g_{s,m}|, |g_{e,m}|, \epsilon_{s,m}, \epsilon_{e,m}, \epsilon_{p,m}]$;
    For the NN without any regularization, update the weight matrices $W$ and the bias vectors $b$ by minimizing the loss function defined in (27) using the back-propagation based gradient descent method provided in (28) and (29);
    For the NN with $L_1$ regularization, update the weight matrices $W$ and the bias vectors $b$ by minimizing the loss function defined in (30) using the back-propagation based gradient descent method provided in (31) and (29);
    For the NN with $L_2$ regularization, update the weight matrices $W$ and the bias vectors $b$ by minimizing the loss function defined in (32) using the back-propagation based gradient descent method provided in (33) and (29);
  End for;
  Save the trained NN.
Testing process:
  Generate the channel coefficients for the test data set $S_{test}$;
  Feed $S_{test}$ as the input parameters and determine the output results based on the trained NN.

First, we show the mean square error obtained by the NN schemes without regularization, with $L_1$ regularization and with $L_2$ regularization versus the number of training steps in Figs. 3-5, respectively. For a clearer presentation, we sample every 100th point of the training steps. The mean square error decreases and approaches zero as the number of iterations increases. This is because the weights $W$ and the biases $b$ of the NN are iteratively updated by the mini-batch gradient descent algorithm. Furthermore, the mean square errors on the validation data for the three schemes are also provided in Figs. 3-5. As seen in these figures, the mean square errors first decrease and then remain constant, which confirms that the training process does not over-fit the NN in any of the three cases. Over-fitting is a phenomenon where a machine learning model becomes overly sensitive to a given dataset, and hence fails to generalize beyond the training data [9], [28], [63]. Generally, a model can easily be tested for over-fitting using a validation
dataset that the model has not been exposed to. An over-fitted model has the signature characteristic of performing well on the validation data for the first few training steps and then showing degrading validation performance for larger numbers of training steps [9], [28], [63], while the training performance keeps improving. In our cases, the validation performance approaches a steady state with increasing training steps. This is clear evidence that the model is not over-fitted.

Fig. 3: The mean square error between the power allocations obtained by the conventional approach and the NN scheme without regularization versus the number of training steps.

Fig. 4: The mean square error between the power allocations obtained by the conventional approach and the NN scheme with $L_1$ regularization versus the number of training steps.

Next, Fig. 6 compares the optimal transmit power obtained by the conventional optimization scheme with the output of the proposed NN scheme (without regularization) versus the number of training steps. Similar to Figs. 3-5, the results of this figure are obtained by sampling every 100th point of the training steps. As seen in this figure, the output transmit power of the proposed NN scheme approaches the optimal transmit power obtained from the conventional scheme as the number of training steps increases. The reason is that the weights $W$ and the biases $b$ of the proposed NN are continuously updated in the training process to minimize the mean square error. Note that the output power of the proposed NN may be negative or larger than the available transmit power, since the training errors between the NN
output power and the optimal power obtained by the conventional optimization scheme cannot be completely eliminated. In order to incorporate the power constraint ($0 \le P_s \le P_t$), we choose $P_s^{NN} = \min(\max(P_s^{OUT}, 0), P_t)$ as the SU-Tx transmit power of our proposed NN scheme in the following simulation results.

Fig. 5: The mean square error between the power allocations obtained by the conventional approach and the NN scheme with $L_2$ regularization versus the number of training steps.

Fig. 6: The performance comparison in terms of the optimal transmit power obtained by the conventional optimization approach and the proposed NN-based scheme without regularization versus the number of training steps.

Next, Fig. 7 presents the achievable secrecy rates of the SU-Rx versus the interference leakage tolerance of the PU-Rx obtained by both the conventional optimization and our proposed NN schemes under the perfect CSI assumption. The maximum available transmit power of the SU-Tx is assumed to be 100 mW. It can be seen that the achievable secrecy rate increases with the interference leakage tolerance for all schemes. In addition, the three NN-based schemes achieve a performance similar to that of the conventional optimization approach. Note that there is a performance gap between the conventional scheme and the three NN schemes, which is due to the training errors between the output power and the desired optimal power.

Next, we evaluate the achievable secrecy rates versus the available transmit power with perfect CSI. Fig. 8 presents
the achievable secrecy rates of the SU-Rx for both the conventional optimization and our proposed NN schemes. The interference leakage tolerance is set to 6 mW. It can be seen that the achievable secrecy rate increases with the transmit power for all schemes. Similar to Fig. 7, our proposed NN schemes show a performance similar to that of the conventional optimization approach.

Fig. 7: The achievable secrecy rates versus the interference tolerance of the PU-Rx obtained by the conventional optimization approach and the proposed NN framework under perfect CSI assumption.

Fig. 8: The achievable secrecy rates versus the maximum transmit power of the SU-Tx obtained by the conventional optimization approach and the proposed NN framework under perfect CSI assumption.

Next, Fig. 9 presents the achievable secrecy rates versus the interference leakage tolerance at the PU-Rx obtained by both the conventional optimization and our proposed NN schemes under the imperfect CSI assumption. The channel error bound is assumed to be $\epsilon_s = \epsilon_e = \epsilon_p$ = 0 . . The maximum available transmit power of the SU-Tx is assumed to be 100 mW. As seen in Fig. 9, the achievable secrecy rate increases with the interference leakage tolerance for all schemes. Furthermore, the three NN-based schemes show performances similar to that of the conventional optimization approach.

Next, we evaluate the achievable secrecy rates of the conventional optimization and the proposed NN schemes with different available transmit powers at the SU-Tx. Fig. 10 presents the achievable secrecy rates of the SU-Rx for these schemes. The interference leakage tolerance is set to 6 mW. As seen in Fig.
10, the achievable secrecy rate increases with the maximum available transmit power. Similar to previous results, the proposed NN schemes provide a performance similar to that of the conventional convex optimization approach.

Fig. 9: The achievable secrecy rates versus the interference tolerance of the PU-Rx obtained by the conventional optimization approach and the proposed NN framework under imperfect CSI assumption.

Fig. 10: The achievable secrecy rates versus the maximum transmit power of the SU-Tx obtained by the conventional optimization approach and the proposed NN framework under imperfect CSI assumption.

The achievable secrecy rates of the conventional optimization and the proposed NN schemes with different channel error bounds are provided in Fig. 11. The maximum available transmit power at the SU-Tx is set to 100 mW and the interference leakage tolerance at the PU-Rx is assumed to be 6 mW. All the channel error bounds are assumed to be the same for each point, i.e., $\epsilon_s = \epsilon_e = \epsilon_p$. As seen in this figure, the achieved secrecy rate decreases as the channel error bound increases for all the schemes. Furthermore, as observed in the previous set of simulation results, the proposed NN schemes achieve performances similar to that of the conventional convex optimization scheme.

In Fig. 12, we present the achieved secrecy rate (left axis) and computation time (right axis) versus the number of hidden layers for the NN scheme without any regularization. The maximum available transmit power at the SU-Tx and the interference leakage tolerance at the PU-Rx are assumed to be 100
mW and 9 mW, respectively. All channel error bounds are set to 0 to represent the perfect CSI scenario. As shown in this figure, the performance differences among different numbers of hidden layers are within a range of 1%. However, the computation time for the testing set increases with the number of hidden layers. In other words, introducing more hidden layers does not lead to much performance improvement, while it adds computational complexity to the NN.

Fig. 11: The achievable secrecy rates versus channel error bounds obtained by the conventional optimization approach and the proposed NN framework.

Fig. 12: The achievable secrecy rates (left axis) and computation time for the training set (right axis) versus the number of hidden layers.

Next, we present the statistical results in Figs. 13 and 14 to evaluate the interference leakage tolerance satisfaction at the PU-Rx for the proposed NN scheme without regularization. These statistical results are calculated by combining the test results of both the perfect and imperfect CSI scenarios. In Fig. 13, the interference leakage tolerance is set to 6 mW, while the maximum available transmit power at the SU-Tx is assumed to be 100 mW in Fig. 14. Fig. 13 provides the interference leakage tolerance satisfaction versus the maximum transmit power, whereas Fig. 14 presents it versus the interference leakage tolerance. As shown in these figures, more than 93% of the test results meet the interference leakage constraint at the PU-Rx.

Fig. 13: Distributions of the interference leakage satisfactions versus the maximum available transmit power at SU-Tx.

Fig. 14: Distributions of the interference leakage satisfactions versus the interference leakages at PU-Rx.

TABLE I: The achieved secrecy rates of all schemes versus the interference leakage tolerances

Perfect CSI

| Interference leakage tolerance (mW) | NN scheme without regularization (bps/Hz) | NN scheme with L1 regularization (bps/Hz) | NN scheme with L2 regularization (bps/Hz) | Conventional scheme (bps/Hz) | Minimum ratio (%) |
|---|---|---|---|---|---|
| 1 | 0.5636 | 0.5618 | 0.5642 | 0.5679 | 98.93 |
| 2 | 0.7438 | 0.7393 | 0.7355 | 0.7486 | 98.25 |
| 3 | 0.846 | 0.8377 | 0.8414 | 0.8523 | 98.29 |

Imperfect CSI

| Interference leakage tolerance (mW) | NN scheme without regularization (bps/Hz) | NN scheme with L1 regularization (bps/Hz) | NN scheme with L2 regularization (bps/Hz) | Conventional scheme (bps/Hz) | Minimum ratio (%) |
|---|---|---|---|---|---|
| 1 | 0.2202 | 0.2256 | 0.2216 | 0.2320 | 94.91 |
| 2 | 0.3357 | 0.3427 | 0.3431 | 0.3474 | 96.63 |
| 3 | 0.4056 | 0.4138 | 0.4186 | 0.4235 | 95.77 |

TABLE II: The required computational time for all schemes versus the interference leakage tolerances

Perfect CSI

| Interference leakage tolerance (mW) | NN scheme without regularization (s) | NN scheme with L1 regularization (s) | NN scheme with L2 regularization (s) | Conventional scheme (s) | Maximum ratio (%) |
|---|---|---|---|---|---|
| 1 | 3.59 | 4.40 | 4.38 | 558.71 | 0.79 |
| 2 | 3.69 | 4.58 | 4.47 | 584.68 | 0.78 |
| 3 | 3.77 | 4.45 | 4.46 | 573.38 | 0.78 |

Imperfect CSI

| Interference leakage tolerance (mW) | NN scheme without regularization (s) | NN scheme with L1 regularization (s) | NN scheme with L2 regularization (s) | Conventional scheme (s) | Maximum ratio (%) |
|---|---|---|---|---|---|
| 1 | 3.72 | 4.67 | 4.22 | 651.18 | 0.65 |
| 2 | 3.71 | 4.79 | 4.39 | 639.32 | 0.75 |
| 3 | 3.62 | 4.74 | 4.29 | 647.28 | 0.73 |

TABLE III: The achieved secrecy rates of all schemes versus the maximum transmit powers

Perfect CSI

| Maximum available transmit power (mW) | NN scheme without regularization (bps/Hz) | NN scheme with L1 regularization (bps/Hz) | NN scheme with L2 regularization (bps/Hz) | Conventional scheme (bps/Hz) | Minimum ratio (%) |
|---|---|---|---|---|---|
| 10 | 0.5636 | 0.5618 | 0.5642 | 0.5679 | 98.93 |
| 20 | 0.7438 | 0.7393 | 0.7355 | 0.7486 | 98.25 |
| 30 | 0.846 | 0.8377 | 0.8414 | 0.8523 | 98.29 |

Imperfect CSI

| Maximum available transmit power (mW) | NN scheme without regularization (bps/Hz) | NN scheme with L1 regularization (bps/Hz) | NN scheme with L2 regularization (bps/Hz) | Conventional scheme (bps/Hz) | Minimum ratio (%) |
|---|---|---|---|---|---|
| 10 | 0.2123 | 0.2138 | 0.2127 | 0.2224 | 95.46 |
| 20 | 0.3338 | 0.3258 | 0.3351 | 0.3400 | 95.82 |
| 30 | 0.4033 | 0.3954 | 0.4019 | 0.4108 | 96.25 |

TABLE IV: The required computational time for all schemes versus the maximum transmit powers

Perfect CSI

| Maximum available transmit power (mW) | NN scheme without regularization (s) | NN scheme with L1 regularization (s) | NN scheme with L2 regularization (s) | Conventional scheme (s) | Maximum ratio (%) |
|---|---|---|---|---|---|
| 10 | 4.21 | 5.03 | 5.34 | 578.06 | 0.92 |
| 20 | 3.90 | 4.65 | 4.77 | 574.38 | 0.83 |
| 30 | 3.63 | 4.39 | 4.58 | 566.87 | 0.81 |

Imperfect CSI

| Maximum available transmit power (mW) | NN scheme without regularization (s) | NN scheme with L1 regularization (s) | NN scheme with L2 regularization (s) | Conventional scheme (s) | Maximum ratio (%) |
|---|---|---|---|---|---|
| 10 | 3.87 | 4.62 | 4.76 | 658.36 | 0.72 |
| 20 | 4.07 | 4.52 | 4.44 | 654.87 | 0.69 |
| 30 | 3.93 | 4.82 | 4.68 | 661.43 | 0.73 |

Table I provides the secrecy rate performance of all schemes with perfect and imperfect CSI versus different interference leakage tolerances, similar to the results depicted in Figs. 7 and 9. Table II compares the required computation time of the four schemes versus the interference leakage tolerances. Similarly, Table III presents the achieved secrecy rate performance of all schemes versus the available transmit power, and Table IV provides the required computation time of all schemes versus the maximum transmit power. To compare the achieved secrecy rates, we employ the minimum ratio, which is calculated by dividing the minimum achieved secrecy rate among the three NN schemes by that of the conventional scheme. Similarly, to compare the computational time, we use the maximum ratio, which is obtained by dividing the maximum computation time among the three NN schemes by that of the conventional scheme. Note that the testing process for all schemes is performed on the same computer. For the results provided in these tables, the achievable secrecy rates are obtained by averaging over the test data with 3000 channel realizations, while the computational time is the total computation time over the 3000 channel realizations. From these results, we can conclude that the proposed NN schemes achieve at least 94% of the optimal performance of the conventional scheme, while significantly reducing the required computation time. In particular, the proposed NN-based schemes require less than 1% of the time needed by the conventional optimization scheme. This is because the conventional optimization-based solutions under the perfect CSI assumption are obtained through an iterative approach and sub-gradient algorithms, while the conventional scheme under the imperfect CSI assumption requires sub-gradient algorithms. These conventional optimization algorithms for both perfect and imperfect CSI scenarios are more complex and hence require a higher computation time.
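As a worked check of the two ratios defined above, the short Python sketch below recomputes them for the first imperfect-CSI row of Table I and the first perfect-CSI row of Table II. The function names are illustrative and not part of the paper.

```python
def minimum_ratio(nn_rates, conventional_rate):
    """Worst NN secrecy rate as a percentage of the conventional one."""
    return 100.0 * min(nn_rates) / conventional_rate


def maximum_ratio(nn_times, conventional_time):
    """Slowest NN computation time as a percentage of the conventional one."""
    return 100.0 * max(nn_times) / conventional_time


# Table I, imperfect CSI, interference leakage tolerance of 1 mW (bps/Hz).
rate_ratio = minimum_ratio([0.2202, 0.2256, 0.2216], 0.2320)

# Table II, perfect CSI, interference leakage tolerance of 1 mW (seconds).
time_ratio = maximum_ratio([3.59, 4.40, 4.38], 558.71)

print(f"minimum ratio: {rate_ratio:.2f}%")  # 94.91%
print(f"maximum ratio: {time_ratio:.2f}%")  # 0.79%
```

Both values match the corresponding table entries, which illustrates that the reported ratios are conservative: they always compare the worst of the three NN variants against the conventional scheme.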
In the NN-based schemes, once the weights and biases are determined, the solution can be computed with reasonable complexity and within a short time compared with the conventional approach.

VI. DISCUSSIONS

Despite offering a number of benefits, the proposed approach also has a number of shortcomings. We discuss these below and highlight a number of potential directions for future work:

1) An NN is a supervised learning approach, and hence relies on labelled data for training. In our context, the necessity for valid labels implies that the training data should also be reliable, which is required for guaranteeing a valid solution [9]. In addition, the training is an off-line process, which effectively limits the applicability of the proposed approach to dynamic wireless systems. As we have mentioned in Remark 1, hybrid approaches can be considered for problems in dynamic systems, which might include optimization techniques, NNs, on-line learning, reinforcement learning and possibly other techniques.

2) As mentioned in Remark 2, the proposed NN approach may not be able to learn to optimize for multiple-antenna wireless transmission scenarios. To extend this NN-based scheme to multiple-antenna systems, one can consider two approaches. One is to separate both the complex input and output parameters into real and imaginary parts [64], so that the NN can be trained in the real domain. The other is to handle the complex parameters by employing the Wirtinger calculus to deal with non-holomorphic functions in the complex domain [53].

3) Finally, because training errors cannot be completely eliminated, the proposed NN cannot include constraints in the training process, as presented in Fig. 13 and Fig. 14. This introduces challenges for extending the proposed scheme to design problems that have numerous system constraints. Fortunately, a constrained training algorithm was developed in the literature [65], where the key idea is to employ the Lagrange dual formulation to accommodate the constraints [48]. This is another potential direction of future research work.

VII. CONCLUSION

In this paper, we proposed an NN-based approach for the power allocation design to maximize the secrecy rate in a CR network under transmit power and interference leakage constraints. We showed that the developed NN algorithm is capable of solving the power allocation problem with both perfect and imperfect CSI, whereas the conventional approach requires developing both robust and non-robust optimization frameworks. First, the conventional optimization scheme for the perfect CSI scenario was developed based on a one-dimensional search, while that for the imperfect CSI assumption was developed based on the Charnes-Cooper transformation and the S-Procedure approach. Then, the NN-based schemes were proposed, where a relationship between the input and output parameters is established by determining an approximated function. The training set used to determine the relationship between the inputs and the output was obtained through the conventional optimization approaches, and the NN was trained to calculate the weights of the connections in the network. After training the NN, the performance was evaluated with a test set in terms of achieved secrecy rate and required computational time. We demonstrated that the proposed NN schemes can achieve more than 94% of the secrecy rate performance with less than 1% of the computation time and more than 93% satisfaction of the interference leakage constraints compared with those of the conventional approaches. Simulation results were provided to demonstrate the effectiveness of the proposed NN-based approach against the benchmark conventional optimization approaches. Finally, we discussed some limitations of the proposed NN-based approach and a number of potential future directions of research.

APPENDIX A
PROOF OF PROPOSITION 1

Lemma 1 (S-Procedure [43]): Define $f_i(x)$, $i = 1, 2$, as

$$f_i(x) = x^H A_i x + 2\,\mathrm{Re}\{b_i^H x\} + c_i, \tag{34}$$

in which $x \in \mathbb{R}^n$, $A_i \in \mathbb{S}^n$, $b_i \in \mathbb{R}^n$ and $c_i \in \mathbb{R}$. The implication $f_1(x) \le 0 \Rightarrow f_2(x) \le 0$ holds if and only if there exists a $\vartheta \ge 0$ such that

$$\vartheta \begin{bmatrix} A_1 & b_1 \\ b_1^H & c_1 \end{bmatrix} - \begin{bmatrix} A_2 & b_2 \\ b_2^H & c_2 \end{bmatrix} \succeq 0. \tag{35}$$

We first rewrite the constraint in (18b) as

$$|e_s|^2 - \epsilon_s^2 \le 0, \quad \tau - t - \frac{P_s |\hat{h}_s|^2 + 2\,\mathrm{Re}\{P_s \hat{h}_s e_s\} + P_s |e_s|^2}{P_p |g_s|^2 + \sigma_s^2} \le 0. \tag{36}$$

Then, by applying Lemma 1, this constraint can be reformulated with a slack variable $\lambda_1$ as

$$\begin{bmatrix} \lambda_1 + \dfrac{P_s}{P_p |g_s|^2 + \sigma_s^2} & \dfrac{P_s \hat{h}_s}{P_p |g_s|^2 + \sigma_s^2} \\ \dfrac{P_s \hat{h}_s}{P_p |g_s|^2 + \sigma_s^2} & \dfrac{P_s |\hat{h}_s|^2}{P_p |g_s|^2 + \sigma_s^2} + t - \tau - \lambda_1 \epsilon_s^2 \end{bmatrix} \succeq 0, \quad \lambda_1 \ge 0. \tag{37}$$

Similarly, (18c) and (18d) can be rewritten as

$$|e_e|^2 - \epsilon_e^2 \le 0, \quad t - 1 + \frac{P_s |\hat{h}_e|^2 + 2\,\mathrm{Re}\{P_s \hat{h}_e e_e\} + P_s |e_e|^2}{P_p |g_e|^2 + \sigma_e^2} \le 0, \tag{38}$$

and

$$|e_p|^2 - \epsilon_p^2 \le 0, \quad P_s |\hat{h}_p|^2 + 2\,\mathrm{Re}\{P_s \hat{h}_p e_p\} + P_s |e_p|^2 - t q \le 0, \tag{39}$$

respectively. Then, by adopting Lemma 1, these constraints can be reformulated, respectively, as

$$\begin{bmatrix} \lambda_2 - \dfrac{P_s}{P_p |g_e|^2 + \sigma_e^2} & -\dfrac{P_s \hat{h}_e}{P_p |g_e|^2 + \sigma_e^2} \\ -\dfrac{P_s \hat{h}_e}{P_p |g_e|^2 + \sigma_e^2} & 1 - \dfrac{P_s |\hat{h}_e|^2}{P_p |g_e|^2 + \sigma_e^2} - t - \lambda_2 \epsilon_e^2 \end{bmatrix} \succeq 0, \quad \lambda_2 \ge 0, \tag{40}$$

and

$$\begin{bmatrix} \lambda_3 - P_s & -P_s \hat{h}_p \\ -P_s \hat{h}_p & t q - P_s |\hat{h}_p|^2 - \lambda_3 \epsilon_p^2 \end{bmatrix} \succeq 0, \quad \lambda_3 \ge 0. \tag{41}$$

This completes the proof of Proposition 1. ∎

APPENDIX B
PROOF OF PROPOSITION 2

$$L(W, b) = \prod_{m=1}^{M} p_m(y \mid x; W, b) = \prod_{m=1}^{M} \frac{1}{\sqrt{2\pi}\sigma} \exp\left( -\frac{(f_m(S, W, b) - P_{s,m}^*)^2}{2\sigma^2} \right). \tag{42}$$

By utilizing the monotonicity of the logarithmic function, the logarithmic likelihood function can be expressed as

$$\log L(W, b) = \log \prod_{m=1}^{M} \frac{1}{\sqrt{2\pi}\sigma} \exp\left( -\frac{(f_m(S, W, b) - P_{s,m}^*)^2}{2\sigma^2} \right) = M \log\left( \frac{1}{\sqrt{2\pi}\sigma} \right) - \frac{1}{2\sigma^2} \sum_{m=1}^{M} (f_m(S, W, b) - P_{s,m}^*)^2. \tag{43}$$

Since $M \log \frac{1}{\sqrt{2\pi}\sigma}$ and $\sigma$ are constants, maximizing the likelihood function is equivalent to minimizing the following loss function:

$$J(W, b) = \sum_{m=1}^{M} (y_m - P_{s,m}^*)^2. \tag{44}$$

Furthermore, this loss function can be normalized without loss of generality as follows:

$$J(W, b) = \frac{1}{M} \sum_{m=1}^{M} (y_m - P_{s,m}^*)^2, \tag{45}$$

which completes the proof of Proposition 2. ∎

APPENDIX C
PROOF OF PROPOSITION 3

$$\frac{\partial h(g)}{\partial z} = \frac{\partial h}{\partial g} \frac{\partial g}{\partial z}. \tag{46}$$

From the feed-forward process, we have

$$z^{(l+1)} = W^{(l)} a^{(l)} + b^{(l)}, \tag{47}$$

$$a^{(l+1)} = g(z^{(l+1)}), \tag{48}$$

where $z^{(l+1)}$ is the linear transformation of a given set of input parameters at the $(l+1)$-th layer, whereas $a^{(l+1)}$ is the output activation value of the $(l+1)$-th layer. The function $g(z)$ represents the activation function. By assuming that $J(W, b)$ is the loss function of the NN, we can write the following equations based on the chain rule defined in (46):

$$\frac{\partial J(W, b)}{\partial W^{(l)}} = \frac{\partial J(W, b)}{\partial z^{(l+1)}} \frac{\partial z^{(l+1)}}{\partial W^{(l)}} = \frac{1}{M} \sum_{m=1}^{M} \delta_m^{(l+1)} (a_m^{(l)})^T, \tag{49}$$

$$\frac{\partial J(W, b)}{\partial b^{(l)}} = \frac{\partial J(W, b)}{\partial z^{(l+1)}} \frac{\partial z^{(l+1)}}{\partial b^{(l)}} = \frac{1}{M} \sum_{m=1}^{M} \delta_m^{(l+1)}. \tag{50}$$

Since $a_m^{(l)}$ can be calculated from the feed-forward process, $\delta_m^{(l)}$ can be derived as follows. Based on the chain rule, we have

$$\delta_m^{(l)} = \frac{\partial J(W, b)}{\partial z_m^{(l)}} = \frac{\partial J(W, b)}{\partial z_m^{(l+1)}} \frac{\partial z_m^{(l+1)}}{\partial a_m^{(l)}} \frac{\partial a_m^{(l)}}{\partial z_m^{(l)}} = \left[ (W^{(l)})^T \delta_m^{(l+1)} \right] \cdot g'(z_m^{(l)}). \tag{51}$$

Starting from the output layer, we can calculate $\delta^{(l)}$ backwards layer by layer until the input layer is reached. Finally, considering the gradient descent method, the weight matrix $W^{(l)}$ and the bias vector $b^{(l)}$ for the $l$-th layer can be updated respectively as follows:

$$W^{(l)} = W^{(l)} - \frac{\alpha}{M} \sum_{m=1}^{M} \left[ \delta_m^{(l+1)} (a_m^{(l)})^T \right], \tag{52}$$

$$b^{(l)} = b^{(l)} - \frac{\alpha}{M} \sum_{m=1}^{M} \delta_m^{(l+1)}, \tag{53}$$

which completes the proof of Proposition 3. ∎

REFERENCES

[1] K. Cumanan, Z. Ding, B. Sharif, G. Y. Tian, and K. K. Leung, “Secrecy rate optimizations for a MIMO secrecy channel with a multiple-antenna eavesdropper,”
IEEE Trans. Veh. Technol., vol. 63, no. 4, pp. 1678–1690, May 2014.
[2] Z. Chu, K. Cumanan, Z. Ding, M. Johnston, and S. Le Goff, “Robust outage secrecy rate optimizations for a MIMO secrecy channel,” IEEE Wireless Commun. Lett., vol. 4, no. 1, pp. 86–89, Feb. 2015.
[3] Z. Chu, H. Xing, M. Johnston, and S. Le Goff, “Secrecy rate optimizations for a MISO secrecy channel with multiple multi-antenna eavesdroppers,” IEEE Trans. Wireless Commun., vol. 15, no. 1, pp. 283–297, Jan. 2016.
[4] M. Zhang, K. Cumanan, and A. G. Burr, “Secure energy efficiency optimization for MISO cognitive radio network with energy harvesting,” in Proc. IEEE WCSP, Nanjing, Oct. 2017, pp. 1–6.
[5] C. Shannon, “Communication theory of secrecy systems,” Bell Syst. Tech. J., vol. 28, no. 4, pp. 656–715, Oct. 1949.
[6] A. D. Wyner, “The wire-tap channel,” Bell Syst. Tech. J., vol. 54, no. 8, pp. 1355–1387, Jan. 1975.
[7] I. Csiszár and J. Körner, “Broadcast channels with confidential messages,” IEEE Trans. Inf. Theory, vol. 24, no. 3, pp. 339–348, May 1978.
[8] C. Jiang, H. Zhang, Y. Ren, Z. Han, K.-C. Chen, and L. Hanzo, “Machine learning paradigms for next-generation wireless networks,” IEEE Wireless Commun., vol. 24, no. 2, pp. 98–105, Apr. 2016.
[9] I. Goodfellow, Y. Bengio, and A. Courville, Deep learning. MIT Press Cambridge, 2016, vol. 1.
[10] C. Andrieu, N. De Freitas, A. Doucet, and M. I. Jordan, “An introduction to MCMC for machine learning,” Machine learning, vol. 50, no. 1-2, pp. 5–43, Jan. 2003.
[11] F. Sebastiani, “Machine learning in automated text categorization,” ACM Computing Surveys, vol. 34, no. 1, pp. 1–47, Mar. 2002.
[12] C. M. Bishop, Pattern Recognition and Machine Learning. Springer, 2006.
[13] J. W. Kalat, Biological Psychology. Nelson Education, 1995.
[14] M. Chen, U. Challita, W. Saad, C. Yin, and M. Debbah, “Machine learning for wireless networks with artificial intelligence: A tutorial on neural networks,” IEEE Commun. Surveys & Tutorials, vol. 21, no. 4, pp. 3039–3071, Jul. 2019.
[15] H. B. Demuth, M. H. Beale, O. De Jess, and M. T. Hagan, Neural Network Design. Martin Hagan, 2014.
[16] T. Lin and Y. Zhu, “Beamforming design for large-scale antenna arrays using deep learning,” IEEE Wireless Commun. Lett., vol. 9, no. 1, Jan. 2020.
[17] H. Huang, Y. Song, J. Yang, G. Gui, and F. Adachi, “Deep-learning-based millimeter-wave massive MIMO for hybrid precoding,” IEEE Trans. Veh. Tech., vol. 68, no. 3, pp. 3027–3032, Mar. 2019.
[18] F. Zhou, G. Lu, M. Wen, Y. Liang, Z. Chu, and Y. Wang, “Dynamic spectrum management via machine learning: State of the art, taxonomy, challenges, and open research issues,” IEEE Network, vol. 33, no. 4, Jul. 2019.
[19] C. Huang, G. C. Alexandropoulos, A. Zappone, C. Yuen, and M. Debbah, “Deep learning for UL/DL channel calibration in generic massive MIMO systems,” in Proc. ICC, Shanghai, May 2019, pp. 1–6.
[20] M. Zhang, K. Cumanan, L. Ni, H. Hu, A. G. Burr, and Z. Ding, “Robust beamforming for AN aided MISO SWIPT system with unknown eavesdroppers and non-linear EH model,” in Proc. IEEE GLOBECOM WORKSHOP, Abu Dhabi, Dec. 2018, pp. 1–7.
[21] K. Cumanan, H. Xing, P. Xu, G. Zheng, X. Dai, A. Nallanathan, Z. Ding, and G. K. Karagiannidis, “Physical layer security jamming: Theoretical limits and practical designs in wireless networks,” IEEE Access, vol. 5, pp. 3603–3611, Dec. 2016.
[22] K. Cumanan, Z. Ding, M. Xu, and H. V. Poor, “Secrecy rate optimization for secure multicast communications,” IEEE J. Sel. Topics Signal Process., vol. 10, no. 8, pp. 1417–1432, Dec. 2016.
[23] Z. Chu, Z. Zhu, M. Johnston, and S. Y. Le Goff, “Simultaneous wireless information power transfer for MISO secrecy channel,”

IEEE Trans. Veh.Technol. , vol. 65, no. 9, pp. 6913–6925, Nov. 2016.[24] Z. Chu, K. Cumanan, M. Xu, and Z. Ding, “Robust secrecy rateoptimisations for multiuser multiple-input-single-output channel withdevice-to-device communications,”

IET Commun. , vol. 9, no. 3, pp. 396–403, Feb. 2015.[25] M. Zeng, N. Nguyen, O. A. Dobre, and H. V. Poor, “Securing downlinkmassive MIMO-NOMA networks with artiﬁcial noise,”

IEEE J. Sel.Signal Process. , vol. 13, no. 3, pp. 685–699, Feb. 2019.[26] H. Sun, X. Chen, Q. Shi, M. Hong, X. Fu, and N. D. Sidiropoulos,“Learning to optimize: Training deep neural networks for wirelessresource management,” in

Proc. IEEE SPAWC , Sapporo, Jul. 2017, pp.1–6.[27] C. Zhang, P. Patras, and H. Haddadi, “Deep learning in mobile andwireless networking: A survey,”

IEEE Commun. Surveys & Tutorials ,vol. 21, no. 3, pp. 2224– 2287, Mar. 2019.[28] P. M. Domingos, “A few useful things to know about machine learning.”

ACM Commun. , vol. 55, no. 10, pp. 78–87, Spring 2012.[29] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,”

Nature , vol. 521,no. 7553, pp. 436–444, May 2015.[30] H. Ye, G. Y. Li, and B.-H. Juang, “Power of deep learning for channelestimation and signal detection in OFDM systems,”

IEEE WirelessCommun. Lett. , vol. 7, no. 1, pp. 114–117, Feb. 2018. [31] C. Huang, G. C. Alexandropoulos, C. Yuen, and M. Debbah, “Indoorsignal focusing with deep learning designed reconﬁgurable intelligentsurfaces,” in Proc. IEEE SPAWC . IEEE, Cannes, Jul. 2019, pp. 1–5.[32] C. Huang, R. Mo, and C. Yuen, “Reconﬁgurable intelligent surface as-sisted multiuser MISO systems exploiting deep reinforcement learning,”

IEEE J. Sel. Commun. , vol. 38, no. 8, pp. 1839 – 1850, Aug. 2020.[33] F. Zhou, X. Zhang, R. Q. Hu, A. Papathanassiou, and W. Meng,“Resource allocation based on deep neural networks for cognitive radionetworks,” in

Proc. IEEE ICCC , Beijing, Feb. 2018, pp. 40–45.[34] H. Ye and G. Y. Li, “Deep reinforcement learning for resource allocationin V2V communications,” in

Proc. IEEE ICC , Kansas City, May 2018,pp. 1–6.[35] H. Sun, X. Chen, Q. Shi, M. Hong, X. Fu, and N. D. Sidiropoulos,“Learning to optimize: Training deep neural networks for interferencemanagement,”

IEEE Trans. Signal Process. , vol. 66, no. 20, pp. 5438–5453, Oct. 2018.[36] C. Huang, G. C. Alexandropoulos, A. Zappone, C. Yuen, and M. Deb-bah, “Deep learning for UL/DL channel calibration in generic massiveMIMO systems,” in

Proc. IEEE ICC , Shanghai, May 2019, pp. 1–6.[37] Y. Sun, J. Liu, J. Wang, Y. Cao, and N. Kato, “When machine learningmeets privacy in 6g: A survey,”

IEEE Communications Surveys &Tutorials , Early Access 2020.[38] M. A. Al-Garadi, A. Mohamed, A. Al-Ali, X. Du, I. Ali, and M. Guizani,“A survey of machine and deep learning methods for Internet of things(IoT) security,”

IEEE Communications Surveys & Tutorials , Thirdquarter2020.[39] T. O’Shea and J. Hoydis, “An introduction to deep learning for thephysical layer,”

IEEE Trans. Cognitive Commun. & Network. , vol. 3,no. 4, pp. 563–575, Dec. 2017.[40] M. Zhang, K. Cumanan, and A. G. Burr, “Secrecy rate maximizationfor MISO multicasting SWIPT system with power splitting scheme,” in

Proc. IEEE SPAWC , Edinburgh, Jul. 2016, pp. 1–5.[41] K. Cumanan, G. C. Alexandropoulos, Z. Ding, and G. K. Karagiannidis,“Secure communications with cooperative jamming: Optimal powerallocation and secrecy outage analysis,”

IEEE Trans. Veh. Technol. ,vol. 66, no. 8, pp. 7495–7505, Aug. 2017.[42] A. Charnes and W. W. Cooper, “Programming with linear fractionalfunctionals,”

Naval Res. Logist. Quart. , vol. 9, no. 3-4, pp. 181–186,1962.[43] S. P. Boyd, L. El Ghaoui, E. Feron, and V. Balakrishnan,

Linear MatrixInequalities in System and Control Theory . SIAM, 1994, vol. 15.[44] S. Shahbazpanahi, A. B. Gershman, Zhi-Quan Luo, and Kon Max Wong,“Robust adaptive beamforming using worst-case sinr optimization: a newdiagonal loading-type solution for general-rank signal models,” in

Proc.IEEE ICASSP , vol. 5, Hong Kong, Apr. 2003, pp. V–333.[45] G. Geraci, M. Egan, J. Yuan, A. Razi, and I. B. Collings, “Secrecy sum-rates for multi-user MIMO regularized channel inversion precoding,”

IEEE Trans. Commun. , vol. 60, no. 11, pp. 3472–3482, Nov. 2012.[46] A. Mukherjee and A. L. Swindlehurst, “Detecting passive eavesdroppersin the MIMO wiretap channel,” in

Proc. IEEE ICASSP , Kyoto, Mar.2012, pp. 2809–2812.[47] Y. Guo and B. C. Levy, “Worst-case MSE precoder design for imper-fectly known MIMO communications channels,”

IEEE Trans. SignalProcess. , vol. 53, no. 8, pp. 2918–2930, Aug. 2005.[48] S. Boyd and L. Vandenberghe,

Convex Optimization . CambridgeUniversity Press, 2004.[49] K. Hornik, M. Stinchcombe, H. White et al. , “Multilayer feedforwardnetworks are universal approximators.”

Neural networks , vol. 2, no. 5,pp. 359–366, 1989.[50] S. Reed and N. De Freitas, “Neural programmer-interpreters,” arXivpreprint arXiv:1511.06279 , 2015.[51] J. Luo, J. Tang, D. K. C. So, G. Chen, K. Cumanan, and J. A. Chambers,“A deep learning-based approach to power minimization in multi-carrierNOMA with SWIPT,”

IEEE Access , vol. 7, pp. 17 450–17 460, Jan.2019.[52] C. Jiang, H. Zhang, Y. Ren, Z. Han, K.-C. Chen, and L. Hanzo,“Machine learning paradigms for next-generation wireless networks,”

IEEE Wireless Commun. , vol. 24, no. 2, pp. 98–105, Apr. 2017.[53] A. Hirose,

Complex-valued Neural Networks: Advances and Applica-tions . John Wiley & Sons, 2013, vol. 18.[54] D. J. Leinweber, “Stupid data miner tricks: overﬁtting the s&p 500,”

The Journal of Investing , vol. 16, no. 1, pp. 15–22, Spring 2007.[55] D. Chicco, “Ten quick tips for machine learning in computationalbiology,”

BioData Mining , vol. 10, no. 1, p. 35, Dec. 2017.[56] Y. Bengio, F. Bastien, A. Bergeron, N. Boulanger-Lewandowski,T. Breuel, Y. Chherawala, M. Cisse, M. Cˆot´e, D. Erhan, J. Eustache,X. Glorot, X. Muller, S. P. Lebeuf, R. Pascanu, S. Rifai, F. Savard, and G. Sicard, “Deep learners beneﬁt more from out-of-distributionexamples,” in

Proc. JMLR AISTATS , Fort Lauderdale, Apr. 2011, pp.164–172.[57] F. Girosi, M. Jones, and T. Poggio, “Regularization theory and neuralnetworks architectures,”

Neural Computation , vol. 7, no. 2, pp. 219–269,Mar. 1995.[58] L. Wang, M. D. Gordon, and J. Zhu, “Regularized least absolutedeviations regression and an efﬁcient algorithm for parameter tuning,”in

Proc. IEEE ICDM , Hong Kong, Dec. 2006, pp. 690–700.[59] A. E. Hoerl and R. W. Kennard, “Ridge regression: biased estimationfor nonorthogonal problems,”

Technometrics , vol. 42, no. 1, pp. 80–86,Feb. 2000.[60] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980 , 2014.[61] X. Glorot and Y. Bengio, “Understanding the difﬁculty of training deepfeedforward neural networks,” in

Proc, ICAIS , Sardinia, 2010, pp. 249–256.[62] L. N. Smith, “Cyclical learning rates for training neural networks,” in

Proc. IEEE WACV , Santa Rosa, Mar. 2017, pp. 464–472.[63] I. V. Tetko, D. J. Livingstone, and A. I. Luik, “Neural network studies.1. comparison of overﬁtting and overtraining,”

J. Chem. Inf. Comput.Sci. , vol. 35, no. 5, pp. 826–833, Jan. 1995.[64] T. Li, M. R. A. Khandaker, F. Tariq, K. Wong, and R. T. Khan, “Learningthe wireless V2I channels using deep neural networks,” in

Proc. IEEEVTC-Fall , Honolulu, Sep. 2019, pp. 1–5.[65] H. Lee, S. H. Lee, and T. Q. S. Quek, “Deep learning for distributedoptimization: Applications to wireless resource management,”