Optimal Co-Designs of Communication and Control in Bandwidth-Constrained Cyber-Physical Systems


Nandini Negi and Aranya Chakrabortty

Electrical & Computer Engineering, North Carolina State University

Abstract

We address the problem of sparsity-promoting optimal control of cyber-physical systems (CPSs) in the presence of communication delays. The delays are categorized into two types: an inter-layer delay for passing state and control information between the physical layer and the cyber layer, and an intra-layer delay that operates between the computing agents, referred to here as control nodes (CNs), within the cyber layer. Our objective is to minimize the closed-loop $H_2$-norm of the physical system by co-designing an optimal combination of these two delays and a sparse state-feedback controller while respecting a given bandwidth cost constraint. We propose a two-loop optimization algorithm for this. Based on the alternating directions method of multipliers (ADMM), the inner loop handles the conflicting directions between the decreasing $H_2$-norm and the increasing sparsity level of the controller. The outer loop comprises a semidefinite program (SDP)-based relaxation of non-convex inequalities necessary for closed-loop stability. Moreover, for CPSs where the state and control information assigned to the CNs are not private, we derive an additional algorithm that further sparsifies the communication topology by modifying the row and column structures of the obtained controller, resulting in reassigning the communication map between the cyber and physical layers, and determining which physical agent should send its state information to which CN. Proofs for closed-loop stability and optimality are provided for both algorithms, followed by numerical simulations.

1 Introduction

Over the recent years, sparsity-promoting optimal control has emerged as a key tool for enabling economic control of large-scale cyber-physical systems (CPSs) (Hespanha et al., 2007; Sinopoli et al., 2003) in both continuous-time (Lin et al., 2013) and discrete-time settings (Geromel et al., 1989).
The fundamental idea is to minimize the number of communication links needed for control without sacrificing the closed-loop performance of the physical system below a specified threshold. Optimization methods such as alternating directions method of multipliers (ADMM) (Lin et al., 2013; Boyd et al., 2011), proximal Newton method (Wytock and Kolter, 2013), gradient support pursuit (GraSP) (Lian et al., 2017), and rank-constrained convex optimization (Arastoo et al., 2014), among others, have been successfully used to achieve this trade-off with applications to a wide range of CPSs such as electric power systems, robotics, transportation networks, and multi-agent control. An underlying assumption behind these designs is that the communication of state and control inputs between the different agents is instantaneous. In reality, however, all practical CPSs will encounter communication delays arising from propagation as well as from routing and queuing. How these conventional sparse control designs would perform in the presence of such delays is still an open question. Our recent work in Negi and Chakrabortty (2020b) showed that inclusion of delays is not just a trivial extension of the conventional algorithms for sparse control, but instead demands an entirely new design approach due to various complex stability constraints that are typical to time-delayed systems (Youcef-Toumi and Wu, 1992; Bresch-Pietri and Krstic, 2009; Heemels et al., 2010; Liu and Chopra, 2012; Hale et al., 2015).

While the work in Negi and Chakrabortty (2020b) addressed this problem for a specific class of CPSs that operate over peer-to-peer communication, in this paper we present a new design framework for sparsity-promoting optimal control of linear time-invariant (LTI) systems with feedback delay that are defined over a far more generic cyber-physical architecture. Our approach is inspired by recent advancements in cloud computing, fog computing, and software-defined networking (SDN). The physical layer in our CPS consists of the physical plant that needs to be controlled, including its sensors, estimators, actuators, and other physical devices, while the cyber layer is defined over a cloud computing network consisting of multiple spatially distributed virtual computing agents, referred to here as control nodes (CNs) (Xin et al., 2011). The physical layer sensors collectively measure the instantaneous values of the system state and communicate them to designated CNs through a local area network (LAN). Inside the cloud, the CNs then share that state information through an SDN by following the sparsity pattern of the controller. Upon receiving its respective state information, each CN computes a control input using a linear quadratic regulator (LQR) law and sends that information back to a designated actuator in the physical layer, which then actuates that control input. The feedback loop continues like this over time by continuous interactions between the two layers. Unlike the setup in Negi and Chakrabortty (2020b) where a single delay was used, in this case we have two distinct delays: (1) the inter-layer delay $\tau_d$ that arises in the LAN connecting the sensors or actuators in the physical layer to the corresponding CNs in the cyber layer, and (2) the intra-layer delay $\tau_c$ that arises in the SDN links connecting the CNs across the cyber layer. Both delays are a function of the respective LAN and SDN bandwidths and the distances over which the corresponding communication links are operating (Bertsekas et al., 1992). Given this premise, our primary objectives and contributions are as follows.

[1] We first present a new sparse optimal control design that minimizes the closed-loop $H_2$ norm of the physical system while at the same time designing the optimal values of $\tau_d$ and $\tau_c$ to reduce the bandwidth cost.
We co-optimize the controller and these two delays, which are all coupled to each other through complex implicit relationships arising from stability, $H_2$ performance, and bandwidth constraints. To handle these dependencies, we develop an algorithm (Algorithm 1) with two hierarchical loops. The outer loop designs the two delays and finds a corresponding stabilizing controller by sequentially relaxing the non-linear matrix equations required for the co-design. The inner loop sparsifies this controller while minimizing the closed-loop $H_2$-norm. Our results show that the relative magnitudes of $\tau_c$ and $\tau_d$ for achieving the optimal $H_2$-norm can be notably different depending on the plant dynamics.

[2] For the case when preserving privacy of the information handled by the CNs is not an issue, we provide a strategy for reassignment of the states and the control inputs by manipulating the block-wise row and column structures of the sparse controller obtained from Algorithm 1. This reassignment changes the two delays, resulting in a subsequent change in the $H_2$ performance. We propose a series of algorithms, collectively referred to as Algorithm 2, which further minimize the closed-loop $H_2$ norm under the constraint that the computation overhead of the CNs remains below that for Algorithm 1. We derive the conditions under which such an optimal reassignment exists.

Note that our problem is fundamentally different from the conventional bandwidth allocation and delay assignment problems reported in the computer networking literature (Kelly et al., 1998; Barrera and Garcia, 2015). The utility functions in these papers are static, and do not include any plant dynamics like ours. We illustrate the effectiveness of our algorithms using simulations in Sec. 5 and 7.
These simulations highlight the impacts of delays and sparsity on $H_2$-performance and provide important insights on their interdependencies.

Some preliminary results on this topic have been presented in our recent conference paper Negi and Chakrabortty (2020a), where we used a simplified relationship between delay and sparsity to satisfy only the bandwidth cost while minimizing the closed-loop $H_2$ norm. The results in this paper, however, are significantly extended in comparison. We use a more practical delay versus sparsity relationship in our problem formulation, and minimize both the bandwidth and the cost of computation overhead for the CNs. Moreover, the concept of CN reassignment and the related algorithms developed in Sec. 6 are also added as entirely new contributions.

The rest of the paper is organized as follows. Sec. 2 states the problem formulation, followed by Sec. 3 that describes the proposed co-design of the delays. Sec. 4 introduces the two-loop algorithm to solve the problem, followed by the corresponding simulation examples in Sec. 5. Sec. 6 introduces the reassignment problem via topology design and derives the corresponding algorithms, followed by simulations in Sec. 7 and conclusion in Sec. 8. Finally, the proofs of all lemmas, theorems, and propositions are listed in the Appendix.

Notations: $\mathbb{R}$, $\mathbb{Z}$ and $\mathbb{N}_n$ are the set of real numbers, integers and natural numbers from 1 to $n$. $\mathcal{U}(0,1)$ is the continuous uniform distribution over $[0,1]$. The natural order of $i$ refers to ascending order of the indices $i$. $A^T$, $\mathrm{Tr}(A)$ and $\lambda_{\max}(A)$ represent the transpose, the trace and the maximum eigenvalue of $A$. $A \otimes B$ and $A \circ B$ represent the Kronecker and Hadamard product between $A$ and $B$, respectively. $A'(B)$ denotes the derivative of $A$ with respect to $B$. A permutation matrix is obtained by permuting the rows and columns of the $n \times n$ identity matrix $I_n$. The rowgroups and colgroups of an $N \times N$ block matrix $A \in \mathbb{R}^{n \times m}$ refer to partitioning $n$ and $m$ into separate collections of $N$ sets.
$\mathrm{floor}(x)$ rounds $x$ to the nearest integer $\le x$. The notation $B = \mathrm{Reshape}(A, [p, q])$ is used to reshape an $A \in \mathbb{R}^{m \times n}$ in row-traversing order to another matrix $B \in \mathbb{R}^{p \times q}$, provided $pq = mn$.

Consider an LTI system with the following dynamics:

$$\dot{x}(t) = A x(t) + B u(t) + B_w w(t), \tag{1}$$

where $x \in \mathbb{R}^n$ is the state, $u \in \mathbb{R}^m$ is the control, and $w \in \mathbb{R}^r$ is the exogenous input, with the corresponding matrices $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times m}$, and $B_w \in \mathbb{R}^{n \times r}$. We design a state-feedback controller, ideally represented as $u(t) = -K x(t)$. However, due to limited bandwidth availability, the controller includes finite delays in the feedback. The architecture of the closed-loop system consists of state information $x(t)$ being sent from the sensors in the physical layer to the $N$ CNs located in a virtual cloud, the CNs sharing this state information with each other and computing the control input $u(t)$ in a distributed way, and finally these control signals being transmitted back to the actuators in the physical layer. The exact CPS model to carry out these three executions is described as follows.

A.1 Every CN $i$ is associated with $n_i \ge 1$ states such that $\sum_{i=1}^{N} n_i = n$. The subscripts of these states are represented by the set $\mathbf{x}_i \in \mathbb{Z}^{n_i}$ with $\bigcup_{i=1}^{N} \mathbf{x}_i = \mathbb{N}_n$. For example, if CN $i$ is associated with two states, then $\mathbf{x}_i$ is the set containing the subscripts of those two states. These $n_i$ states are received from the physical layer through LAN or inter-layer communication links with delay $\tau_d > 0$.

A.2 Inside the cloud, each CN $i$ shares its corresponding state $x_l(t - \tau_d)$, $l \in \mathbf{x}_i$, with the others over point-to-point SDN or intra-layer links with delay $\tau_c > 0$.

A.3 Each CN $j$ is also associated with $m_j \ge 1$ control inputs such that $\sum_{j=1}^{N} m_j = m$. The subscripts of these inputs are represented by the set $\mathbf{u}_j \in \mathbb{Z}^{m_j}$ with $\bigcup_{j=1}^{N} \mathbf{u}_j = \mathbb{N}_m$. For example, if CN $j$ is associated with two inputs, then $\mathbf{u}_j$ is the set containing the subscripts of those two inputs. These $m_j$ inputs are calculated by CN $j$ at each time $t$.

A.4 Each CN $j$ calculates $u_k(t) = -\sum_{l=1}^{n} K_{kl}\, x_l(t - \tau_c - \tau_d)$ $\forall k \in \mathbf{u}_j$, which are then transmitted back to the physical layer with delay $\tau_d$. The total round trip delay is, therefore, $\tau_o = \tau_c + \tau_d$.

A sample CPS is shown in Fig. 1. The control input $u(t)$ for the CPS described by A.1 - A.4 can be expressed as:

$$u(t) = -\underbrace{(K \circ \mathcal{I}_d)}_{K_d}\, x(t - \tau_d) - \underbrace{(K \circ \mathcal{I}_o)}_{K_o}\, x(t - \tau_o), \tag{2}$$

where $\mathcal{I}_d, \mathcal{I}_o \in \mathbb{R}^{m \times n}$ are binary matrices such that

$$\mathcal{I}_d(i,j) = \begin{cases} 1, & \text{if } \exists\, q \in \{1, \dots, N\} : i \in \mathbf{u}_q,\ j \in \mathbf{x}_q, \\ 0, & \text{otherwise}, \end{cases} \tag{3}$$

and $\mathcal{I}_o$ is the complement of $\mathcal{I}_d$. For instance, $\mathcal{I}_d$ and $\mathcal{I}_o$ for the CPS of Fig. 1 are given in (4).

[Equation (4): entries of $\mathcal{I}_d$ and $\mathcal{I}_o$ for the CPS of Fig. 1; the matrix entries are not recoverable from the source text.]

The closed-loop system of (1)-(2) can be written as:

$$\dot{x}(t) = A x(t) - B K_d\, x(t - \tau_d) - B K_o\, x(t - \tau_o) + B_w w(t), \quad z(t) = C x(t) + D u(t), \quad C = [\,Q^{1/2},\ 0\,]^T, \quad D = [\,0,\ R^{1/2}\,]^T, \tag{5}$$

where $z(t)$ is the measurable output, $Q \succeq 0$ and $R > 0$. We make the standard assumption that $(A, B)$ and $(A, Q^{1/2})$ are stabilizable and detectable, respectively (Lin et al., 2013, Sec. II). Before proceeding, we introduce the following three terms that will be used frequently over the rest of the paper.

Definition 1

Let us define tuples $\mathcal{X} := (\mathbf{x}_i)$ and $\mathcal{U} := (\mathbf{u}_i)$ that respectively represent the state and input indices arranged in the natural order of $i$. For example, if $\mathbf{x}_1$ and $\mathbf{x}_2$ each contain two indices, then $\mathcal{X}$ lists the two indices of $\mathbf{x}_1$ followed by the two indices of $\mathbf{x}_2$. These tuples provide the order in which states and inputs are allocated to the CNs; for instance, the first $n_1$ ($m_1$) values in $\mathcal{X}$ ($\mathcal{U}$) correspond to the states (inputs) associated with the first CN, the next $n_2$ ($m_2$) values with the second CN, and so on. The topology of a CPS, i.e., the state and control inputs associated with all the CNs in the cyber layer, is defined for $N \in [1, N_m = \min(m, n)]$ CNs by the tuple $\mathcal{T} := (\mathcal{X}, \mathcal{U}, \mathbf{n}, \mathbf{m})$, where $\mathbf{n} := \{n_i\}$ and $\mathbf{m} := \{m_i\}$.

Definition 2

The propagation delay $\tau_{cpr}$ ($\tau_{dpr}$) in the intra-layer (inter-layer) link is the delay arising due to the physical distance between the source and the destination.

Definition 3

The transmission delay $\tau_{ctr}$ ($\tau_{dtr}$) in any intra-layer (inter-layer) link, discussed shortly in Sec. 2.2, is the delay arising from routing and queuing. This delay is a function of the SDN (LAN) bandwidth and the total number of corresponding links (Bertsekas et al., 1992). In our CPS setting, since both the bandwidth and the number of links can be selected a priori, we will use $\tau_{ctr}$ and $\tau_{dtr}$ as design variables.

Footnote: Since each CN must be associated with at least 1 state and 1 control input as given in A.1 and A.3, the maximum number of CNs in the cyber layer can be at most $\min(m, n)$.

2.2 Problem Setup

Our goal is to design a $K$ that minimizes the $H_2$-norm of the transfer function from $w(t)$ to $z(t)$ for the time-delayed LTI system (5). In general, the $H_2$-performance of (5) will be worse than that of the delay-free system (Gu et al., 2003, Sec. 5.6). Therefore, reducing both $\tau_d$ and $\tau_c$ will improve the $H_2$-performance. Of course, the trivial solution would be to use $\tau_d = \tau_c = 0$, which is not practical as that would require infinite bandwidth and all the link lengths to be zero (Def. 2, 3). We next define constraints on bandwidth and CN computation overhead costs that lower bound the two delays $\tau_d$ and $\tau_c$.

– CN cost $S_{CN}$ is the sum of the cost of renting and the computation overhead of each CN (i.e., the number of states and control inputs handled by the CN). This cost is constant for a given topology. In Sec. 6, we use topology as a design variable, and accordingly, $S_{CN}$ varies. We provide a mathematical definition of $S_{CN}$ when we come to Sec. 6.

– Bandwidth cost $S_{BW}$ is the cost of allocating the total bandwidth to the LAN (inter-layer) and SDN (intra-layer). Let the combined bandwidth of the LAN and SDN links be denoted by $b_{cp}$ and $b_{cc}$, respectively. Then, $S_{BW}$ can be written as:

$$S_{BW} = m_{cp}\, b_{cp} + m_{cc}\, b_{cc}, \tag{6}$$

where $m_{cp}$ and $m_{cc}$ are the respective dollar costs for renting LAN and SDN links.

The total cost $S$ is $S_{CN} + S_{BW}$. The bandwidths $b_{cp}$ and $b_{cc}$ are divided according to their respective number of links as follows.

$b_{cc}$: For ease of exposition, let us denote $\mathcal{K}(K, \mathcal{T}) \in \mathbb{R}^{m \times n}$ as the block matrix obtained by first permuting the $m$ rows and $n$ columns of $K$ to follow the ordering of $\mathcal{U}$ and $\mathcal{X}$ (Def. 1), and then partitioning into $\mathbf{n}$ rowgroups and $\mathbf{m}$ colgroups. This is shown in the following example.

Example 1

For the CPS in Fig. 1, the topology $\mathcal{T} \equiv (\mathcal{X}, \mathcal{U}, \mathbf{n}, \mathbf{m})$ and the sparse gain $K$, whose non-zero entries are denoted $a, b, \dots, s$, are given in (7) and (8). The block matrix $\mathcal{K}(K, \mathcal{T})$, with its block rows labeled by the input sets $\mathbf{u}_j$ and its block columns labeled by the state sets $\mathbf{x}_i$, is obtained as in (9).

[Equations (7)-(9): the numerical entries of $\mathcal{X}$, $\mathcal{U}$, $\mathbf{n}$, $\mathbf{m}$ and the layouts of $K$ and $\mathcal{K}(K, \mathcal{T})$ are not recoverable from the source text.]

Given a $K$ for a fixed $\mathcal{T}$, an intra-layer link from CN $i$ to $j$ is not needed if the calculation of $u_k$, $k \in \mathbf{u}_j$, does not require $x_l$, $l \in \mathbf{x}_i$. This happens if $K_{kl} = 0$ $\forall k \in \mathbf{u}_j$, $l \in \mathbf{x}_i$ (see A.4), i.e., $\mathcal{K}(K, \mathcal{T})_{j,i} = 0$. Here, $\mathcal{K}(K, \mathcal{T})_{j,i}$ represents the $(j,i)$-th block of the block matrix $\mathcal{K}$. Therefore, the number of outgoing intra-layer links from CN $i$ is the number of non-zero off-diagonal blocks in the $i$-th block column of $\mathcal{K}(K, \mathcal{T})$, denoted by $n_{\mathrm{off},i}$. We define $n_{\mathrm{off}} = \{n_{\mathrm{off},i}\}$ as the vector of these entries for all the CNs.

Example 1 (Contd.) For the $\mathcal{K}$ in (9), $n_{\mathrm{off},1} = 0$ since all the off-diagonal blocks in the first block-column are zero. This means that none of the CNs (except CN 1 itself) require the states held by CN 1 to calculate their respective control inputs. The remaining entries of $n_{\mathrm{off}}$, and hence the total number of intra-layer links, are obtained in the same way.

CN $i$ transmits $n_i$ states in each of the $n_{\mathrm{off},i}$ outgoing intra-layer links. Thus, the intra-layer bandwidth $b_{cc}$ is divided into the total number of channels in all the links, denoted as $n_{cc}(K, \mathcal{T})$, resulting in the intra-layer delay

$$\tau_c = \underbrace{\kappa \left( \frac{n_{cc}(K, \mathcal{T})}{b_{cc}} \right)}_{\tau_{ctr}(K, \mathcal{T})} + \tau_{cpr}(\mathcal{T}), \tag{10}$$

where $n_{cc}(K, \mathcal{T}) = \mathbf{n}^T(\mathcal{T})\, n_{\mathrm{off}}(\mathcal{K}(K, \mathcal{T}))$ and $\kappa$ is a proportionality constant.

$b_{cp}$: The uplink for carrying $u_j$ back to the physical layer is not needed if the $j$-th row of $K$ is entirely 0. Similarly, if the $i$-th column of $K$ is 0, then $x_i$ is no longer required for calculating any control input, and the corresponding downlink becomes redundant.
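The link-counting rules described so far can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function names `link_counts` and `intra_layer_delay`, the 0-based index sets, and the small 3-by-4 gain in the usage note are all hypothetical.

```python
def link_counts(K, x_sets, u_sets):
    """Count communication links implied by a sparse gain K (list of rows).

    x_sets[i] / u_sets[i] hold the 0-based state / input indices of CN i
    (hypothetical encoding of the sets x_i and u_i from A.1 and A.3).
    """
    num_cns = len(x_sets)

    def block_nonzero(j, i):
        # Block (j, i) is nonzero if CN j needs any state held by CN i.
        return any(K[k][l] != 0 for k in u_sets[j] for l in x_sets[i])

    # n_off[i]: nonzero off-diagonal blocks in block-column i,
    # i.e. the number of outgoing intra-layer links of CN i.
    n_off = [
        sum(1 for j in range(num_cns) if j != i and block_nonzero(j, i))
        for i in range(num_cns)
    ]
    # n_cc = n^T n_off: CN i sends n_i states on each outgoing link.
    n_cc = sum(len(x_sets[i]) * n_off[i] for i in range(num_cns))
    # Non-zero rows / columns of K: the inter-layer links still needed.
    nonzero_rows = sum(1 for row in K if any(v != 0 for v in row))
    nonzero_cols = sum(
        1 for l in range(len(K[0])) if any(row[l] != 0 for row in K)
    )
    return n_off, n_cc, nonzero_rows, nonzero_cols


def intra_layer_delay(n_cc, b_cc, kappa, tau_cpr):
    # Eq. (10): transmission delay kappa * n_cc / b_cc plus propagation delay.
    return kappa * n_cc / b_cc + tau_cpr
```

For a hypothetical gain `K = [[1, 2, 0, 0], [0, 3, 4, 0], [0, 0, 0, 0]]` with `x_sets = [[0, 1], [2], [3]]` and `u_sets = [[0], [1], [2]]`, only CN 0 has an outgoing intra-layer link (its second state is needed by CN 1), so `link_counts` returns `([1, 0, 0], 2, 2, 3)`.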
Thus, $b_{cp}$ is effectively divided into the number of non-zero rows and columns of $K$, denoted by $n_{row}(K)$ and $n_{col}(K)$, respectively. The delay in the inter-layer links is, therefore, written as:

$$\tau_d = \underbrace{\kappa \left( \frac{n_{cp}(K)}{b_{cp}} \right)}_{\tau_{dtr}(K)} + \tau_{dpr}(\mathcal{T}), \tag{11}$$

where $n_{cp}(K) = n_{row}(K) + n_{col}(K)$.

Remark 1

Note that the definitions in (10) and (11) do not involve any subscript for $\tau_c$ and $\tau_d$, indicating that the delays are assumed to be equal across all links in the SDN as well as in the LAN. While this assumption is made to simplify the design, its practical relevance is as follows. For our practical purposes, the propagation delay is of a small, fixed order of magnitude (Bertsekas et al., 1992), due to which we assume it to be equal for all intra-layer and inter-layer links. For the transmission component of the delay, one can assign the per-link bandwidth in a way that the per-link transmission delay becomes equal for all links.

Using (6), (10) and (11), and absorbing the constant $\kappa$ into the cost coefficients $m_{cp}$ and $m_{cc}$, we can write the bandwidth cost constraint as:

$$S_{BW} = \frac{m_{cp}\, n_{cp}(K)}{\underbrace{\tau_d - \tau_{dpr}}_{\tau_{dtr}}} + \frac{m_{cc}\, n_{cc}(K, \mathcal{T})}{\underbrace{\underbrace{\tau_o - \tau_d}_{\tau_c} - \tau_{cpr}}_{\tau_{ctr}}} \le S_b, \tag{12}$$

where $S_b > 0$ is the given budget. Our design objectives, therefore, are listed as:

P1:

Given a topology $\mathcal{T}$, design $\tau_d$, $\tau_o$ and $K$ such that

• the $H_2$-norm of the closed-loop transfer function of (5) from $w(t)$ to $z(t)$, denoted as $J$, is minimized;
• the bandwidth cost $S_{BW}$ satisfies (12) and the budget $S_b$, which is assumed to be large enough for the problem to be feasible;
• sparsity of $K$ is promoted.

Given a budget $S_b$ and a topology $\mathcal{T}$, P1 can be mathematically stated as:

$$\textbf{O1}: \quad \underset{K,\, \tau_d,\, \tau_o}{\text{minimize}} \quad J(K, \tau_d, \tau_o) + g(K), \tag{13a}$$
$$\text{subject to} \quad K \text{ stabilizes (5) for } \tau_o \text{ and } \tau_d, \tag{13b}$$
$$S_{BW}(\tau_d, \tau_o, K) \le S_b, \tag{13c}$$

where $S_{BW}$ is given by (12), and $g(K)$ is a sparsity-promoting function which will be introduced in Sec. 4.1. The closed-form expression of $J$ is derived next.

2.3 $H_2$ norm for the Delayed System

The delayed system (5) is infinite dimensional. In order to obtain a linear, finite dimensional LTI approximation of (5), we use the method of spectral discretization given in Vanbiervliet et al. (2011). Since $\tau_o > \tau_d$ in (5), following Vanbiervliet et al. (2011), we divide $[-\tau_o, 0]$ into a grid of $N$ scaled and shifted Chebyshev extremal points

$$\theta_{k+1} = \frac{\tau_o}{2} \left( \cos\left( \frac{(N - k - 1)\pi}{N - 1} \right) - 1 \right), \quad k = \{0, \dots, N - 1\}, \tag{14}$$

such that $\theta_1 = -\tau_o$ and $\theta_N = 0$. The choice of $N$ is guided by (Vanbiervliet et al., 2011, Sec. 4). Let $\upsilon(\theta) = x(t + \theta)$ denote the $\theta$-shifted state vector. The extended state $\eta$ and the closed-loop state matrix $A_{cl}$ can then be written as:

$$\eta = [\,\upsilon^T(\theta_1), \cdots, \upsilon^T(\theta_N) = x^T(t)\,]^T, \qquad l_j(\theta) = \prod_{m=1,\, m \ne j}^{N} \frac{\theta - \theta_m}{\theta_j - \theta_m}, \tag{15a}$$

$$A_{cl,ij} = \begin{cases} \partial_\theta l_j(\theta_i)\, I_n, & j = 1, \dots, N,\ i = 1, \dots, N-1, \\ -l_N(-\tau_d)\, B K_d + A, & j = N,\ i = N, \\ -l_1(-\tau_d)\, B K_d - B K_o, & j = 1,\ i = N, \\ -l_j(-\tau_d)\, B K_d, & j = 2, \dots, N-1,\ i = N, \end{cases} \tag{15b}$$

where $K_d = K \circ \mathcal{I}_d$, $K_o = K \circ \mathcal{I}_o$. We can separate $A_{cl}$ into three sub-components:

$$A_{cl} = \tilde{A} - \mathcal{B} K_o N_o^T - \mathcal{B} K_d N_d^T, \tag{16}$$
$$\mathcal{B} = M B, \quad M = [\,0, \dots, 0, I_n\,]^T, \quad N_o = [\,I_n, 0, \dots, 0\,]^T, \tag{17}$$

where the first sub-component $\tilde{A}$ is independent of $K_d$ and $K_o$, the second is only dependent on $K_o$, and the third on $K_d$. The explicit expressions for $\tilde{A}$ and $N_d$ in terms of $\tau_d$ and $\tau_o$ will be derived in Sec. 3.1. The linear approximation of the closed-loop system (5) becomes:

$$\dot{\eta}(t) = A_{cl}\, \eta(t) + \mathcal{B}_w w(t), \tag{18a}$$
$$z(t) = \mathcal{C}\, \eta(t), \qquad \mathcal{C} = [\,M Q^{1/2},\ -(K_d N_d^T + K_o N_o^T)^T R^{1/2}\,]^T, \tag{18b}$$

where $\mathcal{B}_w = M B_w$. The Lyapunov equations and the closed-loop $H_2$-norm $J$ can be written as:

$$A_{cl}^T P + P A_{cl} = -\mathcal{C}^T \mathcal{C} = -\left( \tilde{Q} + \tilde{C}^T R\, \tilde{C} \right), \tag{19}$$
$$A_{cl} L + L A_{cl}^T = -\mathcal{B} \mathcal{B}^T, \tag{20}$$
$$J(K, \tau_d, \tau_o) = \mathrm{Tr}(\mathcal{B}^T P\, \mathcal{B}) = \mathrm{Tr}(\mathcal{C} L\, \mathcal{C}^T), \tag{21}$$

where $\tilde{Q} = M Q M^T$ and $\tilde{C} = K_d N_d^T + K_o N_o^T$.

3 Derivation of the gradient of $H_2$ norm

Our goal is to design $(K, \tau_d, \tau_o)$ to minimize $J$. However, from (19)-(21), we see that $J$ is a function of $\tilde{A}$ and $N_d$, besides $K$. To compute the gradient of $J$ with respect to $(K, \tau_d, \tau_o)$, it is essential to express $\tilde{A}$ and $N_d$ in terms of these three design variables. We begin this section with these derivations as follows.

3.1 $H_2$ Performance and Design Variables

Recall that the closed-loop state matrix is $A_{cl} = \tilde{A} - \mathcal{B}(K_o N_o^T + K_d N_d^T)$. In the next two lemmas, we express $A_{cl}$ as a function of $\tau_o$, $K$ and the delay ratio $c = \tau_d / \tau_o$.

Lemma 1 $\tilde{A}$ is a function of $\tau_o$, and can be written as:

$$\tilde{A} = \frac{1}{\tau_o} \Lambda + \bar{A}, \qquad \bar{A} = \mathrm{Diag}(0, A), \tag{22}$$

where $\Lambda$ is a constant matrix for constant $N$. $\blacksquare$

Lemma 2 $N_d$ is a function of the ratio $c = \tau_d / \tau_o \in [0, 1]$, and can be written as:

$$N_d(c) = (\Gamma\, \nu(c)) \otimes I_n, \qquad \nu(c) = [\,c^{N-1}\ \ c^{N-2}\ \ \dots\ \ c^{1}\ \ c^{0}\,]^T, \tag{23}$$

where $\Gamma \in \mathbb{R}^{N \times N}$ is a constant matrix for constant $N$. $\blacksquare$

Lemmas 1 and 2 show that for fixed $N$, $J$ for the system in (18) is a function of $\tau_o$ and $c$. Henceforth, all of our analysis for minimizing $J$ will be carried out using $\tau_o$ and $c$, instead of $\tau_o$ and $\tau_d$. This change of variables is invertible, and therefore, there is no loss of generality.

3.2 $H_2$ norm

In order to minimize $J$, we next derive the gradient of $J$. We define the set of solutions that guarantee closed-loop stability of (18) as:

$$\mathbb{K} := \{ (K, \tau_o, c) : \mathrm{Re}\left( \lambda_{\max}(A_{cl}) \right) < 0 \}. \tag{24}$$

Given this definition, we next derive the gradient of the closed-loop $H_2$ norm $J$ at $K$, $\tau_o$ and $c$ in the following theorem.

Theorem 1 $J$ in (21) is differentiable on $\mathbb{K}$. With $P$ and $L$ obtained from (19) and (20), the gradient of $J$ is evaluated as:

$$J'(\tau_o) = -\frac{2}{\tau_o^2}\, \mathrm{Tr}(\Lambda^T P L), \tag{25}$$
$$J'(c) = 2\, \mathrm{Tr}(N_d'(c)\, K_d^T\, G L), \tag{26}$$
$$\nabla J(K) = 2\left( (G L N_d) \circ \mathcal{I}_d + (G L N_o) \circ \mathcal{I}_o \right), \tag{27}$$

where $G = R (K_d N_d^T + K_o N_o^T) - \mathcal{B}^T P$ and $N_d' = (\Gamma\, \partial_c \nu(c)) \otimes I_n$.

The negative directions of $J'(c)$ and $J'(\tau_o)$, as derived in Theorem 1, always point to the trivial solution $c = 0$, $\tau_o = 0$, i.e., zero $\tau_d$ and $\tau_o$. This is because the partial derivatives in (26)-(27) are derived with the assumption that $K$, $\tau_o$ and $c$ are independent of each other, as $K'(\tau_o)$ and $K'(c)$ cannot be computed directly given the implicit dependence of $K$ on $\tau_o$ and $c$. Therefore, it would be incorrect to co-design $c$, $\tau_o$ and $K$ using just the gradient information.
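As a numerical aside on Lemma 2, the sketch below checks that the Lagrange weights $l_j(-\tau_d)$ appearing in the last block row of (15b) depend on the delays only through the ratio $c = \tau_d/\tau_o$. It is a minimal pure-Python illustration under stated assumptions: the helper names (`chebyshev_grid`, `lagrange_basis`, `interpolation_weights`) and the test values of $\tau_o$, $c$ and $N$ are hypothetical, and the grid follows (14) as reconstructed here.

```python
import math


def chebyshev_grid(tau_o, N):
    # Eq. (14): scaled and shifted Chebyshev extremal points on [-tau_o, 0],
    # ordered so that the first point is -tau_o and the last point is 0.
    return [
        0.5 * tau_o * (math.cos((N - k - 1) * math.pi / (N - 1)) - 1.0)
        for k in range(N)
    ]


def lagrange_basis(grid, j, theta):
    # l_j(theta) from Eq. (15a), with 0-based index j.
    value = 1.0
    for m, theta_m in enumerate(grid):
        if m != j:
            value *= (theta - theta_m) / (grid[j] - theta_m)
    return value


def interpolation_weights(tau_o, c, N):
    # Weights l_j(-tau_d) with tau_d = c * tau_o; these multiply B K_d
    # in the last block row of A_cl and make up the entries of N_d.
    grid = chebyshev_grid(tau_o, N)
    return [lagrange_basis(grid, j, -c * tau_o) for j in range(N)]


# Same ratio c, very different total delays: the weights coincide
# up to floating-point rounding.
w_small = interpolation_weights(tau_o=0.1, c=0.3, N=6)
w_large = interpolation_weights(tau_o=2.0, c=0.3, N=6)
max_gap = max(abs(a - b) for a, b in zip(w_small, w_large))
```

Because every grid point in (14) scales linearly with $\tau_o$, each factor $(\theta - \theta_m)/(\theta_j - \theta_m)$ evaluated at $\theta = -c\,\tau_o$ is free of $\tau_o$, which is why each weight is a polynomial of degree $N-1$ in $c$ alone.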
Starting from a stabilizing $(K, \tau_o, c) \in \mathbb{K}$, as soon as we change either $\tau_o$ or $c$, we must update $K$ to ensure stability of (18). In other words, $(K, \tau_o)$ and $(K, c)$ must be co-designed separately in sequence while holding $c$ and $\tau_o$ as constant in the respective steps.

3.3 Co-design of Controller and Delays

We next describe how the equations in (19)-(20) can be relaxed for each of the two co-designs.

• Co-design of $(K, \tau_o)$

Theorem 2

Let $\omega_o = 1/\tau_o$. Consider a known tuple $(K^*, \omega_o^*, c^*) \in \mathbb{K}$ satisfying (19) with a known $P^*$ for the closed-loop state matrix $A_{cl}^*(K^*, \omega^*, c^*)$. Let $\omega_o = \omega_o^* + \Delta\omega$, $K = K^* + \Delta K$, $P = P^* + \Delta P$ and $\alpha > 0$ be obtained as a solution of the following SDP:

$$\phi_1 + \phi_2 + \psi + \alpha I \succeq 0, \tag{28a}$$
$$|\Delta\omega| \le \zeta_1, \quad \|\Delta P\| \le \zeta_2, \tag{28b}$$
$$\alpha \ge \zeta_1 \|\Lambda^T \Delta P\| + \zeta_2 \|\mathcal{B}\, \Delta\tilde{C}\| + \|R^{1/2} \Delta\tilde{C}\|, \tag{28c}$$

where $\alpha$, $\Delta K$, $\Delta P$ and $\Delta\omega$ are the design variables, $\phi_1 = A_{cl}^{*T} P + P A_{cl}^*$, $K_d^* = K^* \circ \mathcal{I}_d$, $\Delta K_d = \Delta K \circ \mathcal{I}_d$, $K_o^* = K^* \circ \mathcal{I}_o$, $\Delta K_o = \Delta K \circ \mathcal{I}_o$, $\tilde{C}^* = (K_d^* N_d^T + K_o^* N_o^T)$, $\Delta\tilde{C} = (\Delta K_d N_d^T + \Delta K_o N_o^T)$, $\mathcal{A}_1 = -\mathcal{B}(\Delta\tilde{C}) + \Delta\omega\, \Lambda$, $\phi_2 = \mathcal{A}_1^T P^* + P^* \mathcal{A}_1$, $\psi = \tilde{Q} + \tilde{C}^{*T} R\, \tilde{C}^* + \Delta\tilde{C}^T R\, \tilde{C}^* + \tilde{C}^{*T} R\, \Delta\tilde{C}$, and $\zeta_1$, $\zeta_2$ are chosen constants. Then, $(K, 1/\omega_o, c^*)$ is a stabilizing tuple for (18). $\blacksquare$

• Co-design of $(K, c)$

Next, consider the co-design step for $(K, c)$. Recall that $A_{cl}$ is a non-linear function of $c \in [0, 1]$ through $N_d(c)$ as shown in Lemma 2, and therefore, the exact expression of $N_d(c)$ cannot be used while forming the SDP relaxations. To circumvent this problem, we divide $[0, 1]$ into $k_c$ sub-intervals $[c_1, c_2], \dots, [c_{k_c}, c_{k_c+1}]$, with each sub-interval small enough to allow $N_d(c)$ to be approximated as an affine function $\hat{N}_d(c)$. Let each sub-interval $[c_i, c_{i+1}]$ have an associated matrix of affine coefficients $\chi^{(i)} \in \mathbb{R}^{N \times 2}$. The approximated function is written as:

$$\hat{N}_d(c) = \left( \chi^{(i)}\, [\,c,\ 1\,]^T \right) \otimes I_n, \quad c \in [c_i, c_{i+1}],\ i = 1, \dots, k_c. \tag{29}$$

The coefficients can be computed from a linear curve fitting on (23). The larger the number of sub-intervals $k_c$, the lower the approximation error $\|\hat{N}_d - N_d\|$. For our simulations in Sec. 5, we have used $k_c = 10$. We next present the SDP relaxation for the co-design of $(K, c)$.

Theorem 3

Consider a known tuple $(K^*, \tau_o^*, c^*) \in \mathbb{K}$ with $c^* \in [c_i, c_{i+1}]$ for some $i \in \{1, \dots, k_c\}$ satisfying (20) with a known $L^*$ for the closed-loop state matrix $A_{cl}^*(K^*, \tau_o^*, c^*)$. Let $c = c^* + \Delta c$, $K = K^* + \Delta K$, $L = L^* + \Delta L$ and $\alpha > 0$ be a solution of the following SDP:

$$\phi_3 + \phi_4 + \mathcal{B}\mathcal{B}^T + \alpha I \succeq 0, \tag{30a}$$
$$c \in [c_i, c_{i+1}], \quad \|\Delta L\| \le \beta, \tag{30b}$$
$$\alpha \ge \beta \|\mathcal{B}(\Delta K_d N_d^T(c^*) + \Delta K_o N_o^T)\| + \left( \beta S \|\mathcal{B}\, \Delta K_d\| + \beta \|\mathcal{B} K_d^* \Delta N_d^T\| + S \|\mathcal{B}\, \Delta K_d\| \|L^*\| \right), \tag{30c}$$

where $\alpha$, $\Delta K$, $\Delta L$ and $\Delta c$ are the design variables, $\Delta N_d = \hat{N}_d(c) - N_d(c^*)$, $\phi_3 = A_{cl}^* L + L A_{cl}^{*T}$, $\phi_4 = \mathcal{A}_2 L^* + L^* \mathcal{A}_2^T$, $\mathcal{A}_2 = -\mathcal{B}(K_d^* \Delta N_d^T(c) + \Delta K_d N_d^T(c^*) + \Delta K_o N_o^T)$, $\beta > 0$ is a chosen constant, and $S \ge \|N_d(c)\|$. Then, $(K, \tau_o^*, c)$ is a stabilizing tuple for (18). $\blacksquare$

Starting from a known stabilizing tuple $(K^*, \tau^*, c^*)$, Theorems 2 and 3 enable us to co-design stabilizing pairs $(K, \tau_o)$ and $(K, c)$, respectively. We next integrate the bandwidth cost constraint (12) with the SDPs in (28) and (30).

We impose the bandwidth cost constraint (12) as part of P1, which can be rewritten as:

$$S_{BW} = \frac{m_{cp}\, n_{cp}(K)}{c\, \tau_o - \tau_{dpr}} + \frac{m_{cc}\, n_{cc}(K, \mathcal{T})}{\bar{c}\, \tau_o - \tau_{cpr}} \le S_b, \tag{31}$$

where $n_{cp}(K) = n_{row}(K) + n_{col}(K)$, $n_{cc}(K) = \mathbf{n}^T(\mathcal{T})\, n_{\mathrm{off}}(K, \mathcal{T})$ and $\bar{c} = 1 - c$. Recall that $S_{BW}$ is the total bandwidth cost and $S_b$ is the upper bound imposed on it as stated in O1. When (31) is imposed on the SDPs (28) and (30), we obtain an alternative form of (31), which is stated in the next proposition.

Proposition 1

Given the topology $\mathcal{T}$, with corresponding constant propagation delays $\tau_{dpr}$ and $\tau_{cpr}$, consider a known tuple $(K^*, \tau_o^*, c^*) \in \mathbb{K}$ with an associated bandwidth cost $S_{BW}^* \le S_b$. Denoting $n_{cp}^* = n_{row}(K^*) + n_{col}(K^*)$ and $n_{cc}^* = \mathbf{n}^T n_{\mathrm{off}}(K^*)$, the following statements are true.

1) Keeping $\tau_o = \tau_o^*$, let $c^*$ be perturbed to $c \in (\tau_{dpr}/\tau_o^*,\ 1 - (\tau_{cpr}/\tau_o^*))$, resulting in a cost $S_{BW}(c)$. Then, $\delta S_{BW}(c) := S_{BW} - S_{BW}^* \le 0$ is a convex constraint and is written as:

$$\delta S_{BW}(c) = (p_1 - p_2)\, c^2 + (q_1 - q_2)\, c + (r_1 - r_2) \le 0, \tag{32}$$

where $p_1 = S_{BW}^* \tau_o^{*2}$, $p_2 = -\tau_o^{*2}$, $q_1 = \tau_o^* \left( -S_{BW}^* (\tau_o^* + (\tau_{dpr} - \tau_{cpr})) - m_{cp} n_{cp}^* + m_{cc} n_{cc}^* \right)$, $q_2 = \tau_o^* (\tau_o^* + (\tau_{dpr} - \tau_{cpr}))$, $r_1 = (S_{BW}^* \tau_{dpr} + m_{cp} n_{cp}^*)(\tau_o^* - \tau_{cpr}) - m_{cc} n_{cc}^* \tau_{dpr}$ and $r_2 = -\tau_{dpr} (\tau_o^* - \tau_{cpr})$. The constraint $\delta S_{BW}(c) \le 0$ implies $S_{BW} \le S_b$.

2) Keeping $c = c^*$, let $\tau_o^* = \tau_{dtr}^* + \tau_{ctr}^* + \tau_{dpr} + \tau_{cpr}$ be perturbed to $\tau_o = \tau_{dtr} + \tau_{ctr} + \tau_{dpr} + \tau_{cpr}$ such that $\tau_{dtr}$ and $\tau_{ctr}$ satisfy

$$c^* \tau_{ctr} - \bar{c}^* \tau_{dtr} = \bar{c}^* \tau_{dpr} - c^* \tau_{cpr}, \tag{33}$$

resulting in a new bandwidth cost $S_{BW}(\tau_o)$. Then, $\delta S_{BW}(\tau_o) := S_{BW} - S_{BW}^* \le 0$ is concave and is written as:

$$\delta S_{BW}(\tau_o) = (p_3 - p_4)\, \tau_o^2 + (q_3 - q_4)\, \tau_o + (r_3 - r_4), \tag{34}$$

where $p_3 = -S_{BW}^*\, p_4$, $p_4 = c^* \bar{c}^*$, $q_3 = m_{cp} n_{cp}^* \bar{c}^* + m_{cc} n_{cc}^* c^* + S_{BW}^* (c^* \tau_{cpr} + \bar{c}^* \tau_{dpr})$, $q_4 = -c^* \tau_{cpr} - \bar{c}^* \tau_{dpr}$, $r_3 = -\left( S_{BW}^* \tau_{dpr} \tau_{cpr} + m_{cp} n_{cp}^* \tau_{cpr} + m_{cc} n_{cc}^* \tau_{dpr} \right)$, $r_4 = \tau_{dpr} \tau_{cpr}$ and $\bar{c}^* = 1 - c^*$. The constraint $\delta S_{BW}(\tau_o) \le 0$ implies $S_{BW} \le S_b$. $\blacksquare$

Since $\delta S_{BW}(c)$ and $\delta S_{BW}(\tau_o)$ are respectively convex and concave from Proposition 1, we can easily incorporate them in the co-design SDPs of Theorems 2 and 3 to satisfy the bandwidth constraint in (31). Note that since $K$ is co-designed with either $\tau_o$ or $c$, the true bandwidth cost $S_{BW}$ depends on $K$ as well through $n_{row}(K)$, $n_{col}(K)$ and $n_{\mathrm{off}}(K)$.
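To make the dependence of the bandwidth cost on the delay variables concrete, the following sketch evaluates (31) for given link counts. It is a hedged illustration only: the function names `bandwidth_cost` and `feasible_ratio_interval` and all numerical values in the usage note are hypothetical, and the proportionality constant is assumed to be absorbed into the cost coefficients as in (12).

```python
def bandwidth_cost(tau_o, c, n_cp, n_cc, m_cp, m_cc, tau_dpr, tau_cpr):
    """Bandwidth cost of Eq. (31) for total delay tau_o and ratio c = tau_d / tau_o."""
    tau_dtr = c * tau_o - tau_dpr          # inter-layer transmission delay
    tau_ctr = (1.0 - c) * tau_o - tau_cpr  # intra-layer transmission delay
    if tau_dtr <= 0 or tau_ctr <= 0:
        # Each delay must strictly exceed its propagation component.
        raise ValueError("transmission delays must be positive")
    return m_cp * n_cp / tau_dtr + m_cc * n_cc / tau_ctr


def feasible_ratio_interval(tau_o, tau_dpr, tau_cpr):
    # Open interval of ratios c used in Proposition 1, part 1.
    return tau_dpr / tau_o, 1.0 - tau_cpr / tau_o
```

With the hypothetical values `tau_o=1.0`, `c=0.5`, `n_cp=4`, `n_cc=2`, `m_cp=1.0`, `m_cc=2.0`, `tau_dpr=0.25` and `tau_cpr=0.25`, both transmission delays equal 0.25, so the cost is 4/0.25 + 4/0.25 = 32. Fewer non-zero rows, columns, or off-diagonal blocks in $K$ lower `n_cp` and `n_cc`, and therefore lower the cost for the same delays.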
If $n_{row}(K) + n_{col}(K) \le n_{row}(K^*) + n_{col}(K^*)$ and $\mathbf{n}^T n_{\mathrm{off}}(K) \le \mathbf{n}^T n_{\mathrm{off}}(K^*)$, one can easily verify that $\delta S_{BW}(c) \le 0$ and $\delta S_{BW}(\tau_o) \le 0$ still imply $S_{BW} \le S_b$.

The $H_2$-norm $J$, in general, increases with increasing sparsity of $K$ (Negi and Chakrabortty, 2020b), while the bandwidth cost $S_{BW}$ reduces. Due to these inherent trade-offs between the objectives and the constraints, P1 is a prime candidate to be reformulated as a two-loop ADMM optimization. The outer loop co-designs $(K, \tau_o)$ and $(K, c)$ using (28)-(30) under the bandwidth constraints (32)-(34). The inner loop, on the other hand, sparsifies $K$ while minimizing $J$. We describe the inner and outer loops in Sec. 4.1 and 4.2 respectively, followed by the main algorithm in Sec. 4.3.

Throughout the inner ADMM loop, we hold both $\tau_o$ and $c$ as constants. The mathematical program of the inner loop, denoted as O2.0, is written as follows:

$$\textbf{O2.0}: \quad \underset{K,\, F}{\text{minimize}} \quad J(K) + \gamma\, g(F), \tag{35a}$$
$$\text{subject to} \quad K = F, \tag{35b}$$

where $\gamma$ is a regularization parameter and $g(F) = \|W \circ F\|_{\ell_1}$ is the weighted $\ell_1$ norm function which is used to induce sparsity in $F$. The weight matrix $W$ for $g(F)$ is updated iteratively through a series of reweighting steps from the solution of the previous iteration (Candes et al., 2008):

$$W_{ij} = \frac{1}{|F_{ij}| + \epsilon}, \quad 0 < \epsilon \ll 1. \tag{36}$$

Footnote: Since the topology $\mathcal{T}$ is assumed to be constant, with a slight abuse of notation we write $n_{\mathrm{off}}$ as a function of $K^*$.

The augmented Lagrangian for O2.0 is

$$\mathcal{L}_\rho = J(K) + \gamma\, g(F) + \mathrm{Tr}(\Theta^T (K - F)) + \frac{\rho}{2} \|K - F\|_F^2, \tag{37}$$

where $\rho$ is a positive scalar, $\Theta$ is the dual variable and $\|\cdot\|_F$ is the Frobenius norm. ADMM involves solving each objective separately while simultaneously projecting onto the solution set of the other. As shown in Lin et al. (2013) and Boyd et al. (2011), (37) is used to derive a sequence of iterative steps, $K$-min, $F$-min and $\Theta$-min, by completing the squares with respect to each variable:

$$K^{k+1} = \underset{K}{\mathrm{argmin}}\ \Phi_1(K) = \underset{K}{\mathrm{argmin}}\ J(K) + \frac{\rho}{2} \|K - U^k\|_F^2, \tag{38a}$$
$$F^{k+1} = \underset{F}{\mathrm{argmin}}\ \Phi_2(F) = \underset{F}{\mathrm{argmin}}\ \gamma\, g(F) + \frac{\rho}{2} \|F - V^k\|_F^2, \tag{38b}$$
$$\Theta^{k+1} = \Theta^k + \rho\, (K^{k+1} - F^{k+1}), \tag{38c}$$

where $U^k = F^k - \frac{1}{\rho} \Theta^k$ and $V^k = K^{k+1} + \frac{1}{\rho} \Theta^k$. We next present a method to solve $K$-min and provide an analytical expression for $F$-min.

For the $K$-min step, setting the gradient of $\Phi_1$ to zero gives

$$\nabla \Phi_1(K) = 2\left[ (G L N_d) \circ \mathcal{I}_d + (G L N_o) \circ \mathcal{I}_o \right] + \rho\, (K - U) = 0, \tag{39}$$

where $U = U^k$ for the $(k+1)$-th iteration of the ADMM loop and $N_d(c)$ is denoted as $N_d$ as $c$ is constant for O2.0. $P$ and $L$ are the solutions of (19) and (20), respectively.
K -min begins with a stabilizing K , solves(19)-(20) for P and L , and then solves (39) to obtain a new gain K as follows: K “ Reshape ´ p ˆ V d ˝ T d ` ˆ V o ˝ T o ` ρ I n q ´ µ , [m,n] ¯ , (40) T d “ p T dd ˝ ˆ V Td ` T od ˝ ˆ V To q , T o “ p T oo ˝ ˆ V To ` T do ˝ ˆ V Td q , T dd “ p N Td LN d b R q , T od “ p N To LN d b R q , T oo “ p N To LN o b R q , T do “ p N Td LN o b R q , µ “ vec ´ p B T PLN d q ˝ I d ` p B T PLN o q ˝ I o ` ρ U ¯ , ˆ V d “ b v d , v d “ vec p I d q , ˆ V o “ b v o , v o “ vec p I o q , For details of (40), see Appendix (Sec. 8.6). It can be shown that ˜ K “ K ´ K is the descent direction for Φ (Rautert and Sachs, 1997, See Lemma 4.1). The Armijo-Goldstein line search method can then be usedto determine a step size s to ensure p K ` s ˜ K q stabilizes (18). The iterative process continues till we obtain ∇ Φ p K q « The solution of the F -min step is well-known in the literature (Boyd et al., 2011, Sec. 4.4.3) as: F ij “ p ´ a ij | V ij | q V ij , if | V ij | ą a ij , , otherwise , (41)11here a ij “ γρ W ij . Note that large values of γ will induce more sparsity, and therefore may lead to a suddenincrease in J . Therefore, γ must be increased in small steps. The regularization path, for example, can belogarithmically spaced from 0 . γ max to 0 . γ max , where γ max is ideally the critical value of γ above whichthe solution of P1 in is K “ F “ (Boyd et al., 2011). In our simulations, γ max “ The outer-loop of our algorithm designs τ o and c with bandwidth constraint (31) and updates the weightmatrix W for minimizing the weighted l norm in (36). Co-design of K in this loop is necessary to ensurestability as τ o and c change. Let K ˚ “ F ˚ and Θ ˚ be the output of the last converged inner loop with U ˚ “ K ˚ ´ p { ρ q Θ ˚ . Programs O2.1 and

O2.2 directly design $(K, \tau_o)$ and $(K, c)$, respectively, in sequence as follows:

O2.1: $\underset{K,\,\tau_o,\,P}{\text{minimize}} \;\; \hat{J}(K, \tau_o) + \dfrac{\rho}{2}\|K - U^*\|_F^2$, (42a)
s.t. $\delta S_{BW}(\tau_o) \le 0$, SDP in Eq. (28), (42b)

O2.2: $\underset{K,\,c,\,L}{\text{minimize}} \;\; \hat{J}(K, c) + \dfrac{\rho}{2}\|K - U^*\|_F^2$, (42c)
s.t. $\delta S_{BW}(c) \le 0$, SDP in Eq. (30), (42d)

where $\hat{J}(K, \tau_o) = \mathrm{Tr}(B^T P B)$, $\hat{J}(K, c) = \mathrm{Tr}(L\, C^{*T} C^*)$, and $C^* = (K^* \circ I_d)\, N_d^T(c^*) + (K^* \circ I_o)\, N_o^T$. We next present our main algorithm to show the iterative solutions of O2, beginning from a known stabilizing tuple $(K^*, \tau_o^*, c^*)$.

Algorithm 1 Main Algorithm
Input: Initial feasible point $(K_o^*, \tau_o^*, c^*) \in \mathcal{K}$
for $\gamma_i$ from the smallest to the largest value on the regularization path (fractions of $\gamma_{\max}$) do
    Input: $K^*$, $\tau_o^*$ and $c^*$ stabilizing for (5)
    for each outer-loop iteration do
        Solve O2.1 using $K^*$, $\tau_o^*$, $c^*$ to get $\hat{K}$, $\tau_o$
        Solve O2.2 using $\hat{K}$, $\tau_o$, $c^*$ to get updated $K$, $c$
        Input: Inner loop initial: $K$, $c$, $\tau_o$
        while ADMM stopping criteria not met do
            K-min: Solve (38a) for $K^{k+1}$
            F-min: Solve (38b) for $F^{k+1}$
            Update $\Theta$ using (38c)
        Result: $K^* = K$, $\tau_o^* = \tau_o$, $c^* = c$
        Update $W$ using $K^*$ from (36) and $S^*$ using $n_{\text{row}}(K^*)$, $n_{\text{col}}(K^*)$, $n_{\text{off}}(K^*)$, $\tau_o^*$ and $c^*$ from (31)
    Result: $K$, $\tau_o$ and $c$ are obtained for $\gamma_i$

Our main algorithm is listed in Algorithm 1; the following points explain its key steps.

• Using O2.1, we first co-design a stabilizing pair $(\hat{K}, \tau_o)$ from an initial tuple $(K^*, \tau_o^*, c^*) \in \mathcal{K}$. The two are designed together because the initial $K^*$ may not be stabilizing for a $\tau_o$ that satisfies the bandwidth constraint (34).

• We then use the solution of O2.1, i.e., $(\hat{K}, \tau_o, c^*) \in \mathcal{K}$, as the initial point for O2.2 to find an updated pair $(K, c)$. From Proposition 1, $\delta S_{BW}(c)$ in (34) is convex in $c$. Let $c_{\min}$ be the minimizer of $\delta S_{BW}(c)$. If $\hat{K}$ is stabilizing for $c_{\min}$, then instead of co-designing $(K, c)$, we can directly set $c = c_{\min}$ and $K = \hat{K}$, and then use a procedure similar to K-min to minimize $J(K)$, starting from $\hat{K}$.

• The inner loop begins with $(K, \tau_o, c) \in \mathcal{K}$ and updates $K$ in the direction of decreasing $J$ and increasing sparsity while $\tau_o$ and $c$ remain constant.


Figure 2: Panels (a), (b), (c) and (d) show the normalized $J$, $\tau_o + \tau_d$, $\tau_o$ and $c$ versus iterations. Right axis: Model $I_a$ (•); left axis: Model $I_b$ (•).

• Following Lin et al. (2013, Sec. III-D) and Boyd et al. (2011, Sec. 3.4.1), $\rho$ in (38) is chosen to be sufficiently large to ensure the convergence of the inner ADMM loop. Since $J$ is nonconvex, convergence of this loop is not guaranteed in general, as is commonly seen in the sparsity-promoting literature (Lin et al., 2013). However, large values of $\rho$ have been shown to facilitate convergence. We use $\rho = 100$ for our simulations. The stopping criterion for the inner loop in Line 7 of Algorithm 1 follows Boyd et al. (2011, Sec. 3.3.1).
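For reference, the residual-based stopping test of Boyd et al. (2011, Sec. 3.3.1), specialized to the constraint $K - F = 0$, can be sketched as follows. The primal residual is $K^{k+1} - F^{k+1}$ and the dual residual has norm $\rho\,\|F^{k+1} - F^k\|_F$. The function name and the default tolerances `eps_abs` and `eps_rel` are assumptions; the paper does not state the tolerances it uses.

```python
import math

def fro(M):
    """Frobenius norm of a matrix stored as a list of lists."""
    return math.sqrt(sum(x * x for row in M for x in row))

def sub(A, B):
    """Elementwise difference A - B."""
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def admm_converged(K, F, F_prev, Theta, rho, eps_abs=1e-4, eps_rel=1e-3):
    """Return True when both primal and dual residuals are below tolerance."""
    size = len(K) * len(K[0])                  # number of entries in K
    r_norm = fro(sub(K, F))                    # primal residual ||K - F||_F
    s_norm = rho * fro(sub(F, F_prev))         # dual residual rho*||F - F_prev||_F
    eps_pri = math.sqrt(size) * eps_abs + eps_rel * max(fro(K), fro(F))
    eps_dual = math.sqrt(size) * eps_abs + eps_rel * fro(Theta)
    return r_norm <= eps_pri and s_norm <= eps_dual

I2 = [[1.0, 0.0], [0.0, 1.0]]
Z2 = [[0.0, 0.0], [0.0, 0.0]]
print(admm_converged(I2, I2, I2, Z2, rho=100.0))  # K = F and F unchanged
print(admm_converged(I2, Z2, Z2, Z2, rho=100.0))  # K and F disagree
```

Because the dual residual scales with $\rho$, a large $\rho$ such as the value used here makes the dual test sensitive to small changes in $F$ between iterations.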

We first present simulations where only the outer loop is iterated, without considering any bandwidth constraint in Algorithm 1. This example shows that the relative magnitudes of $\tau_d$ and $\tau_o$ for obtaining the minimum $H_2$-norm can be significantly different for different systems. As indicated before, the absence of the bandwidth cost leads to the trivial solution $\tau_o = \tau_d = 0$. To avoid this, we impose a simple artificial constraint $|(\tau_d - \tau_d^*) + (\tau_o - \tau_o^*)| \le \epsilon$, where $0 < \epsilon \ll 1$. Here, $(K^*, \tau_o^*, \tau_d^*) \in \mathcal{K}$ is the initial point for every iteration. This initial tuple is replaced by the newly designed $(K, \tau_o, \tau_d) \in \mathcal{K}$ at the end of every iteration. We simulate two randomly generated models $I_a$ and $I_b$ with $A \in \mathbb{R}^{n \times n}$, $B = B_w = I_n$, $K^* = K_{LQR}$, $Q = R = I_n$ for two different initial conditions as part of Case A. The logarithms of the ratios of $J$, $\tau_d + \tau_o$, $\tau_o$ and $c$ with respect to their respective minima are plotted in Fig. 2.

Footnote: $K_{LQR}$ is the solution of the Linear Quadratic Regulator (LQR) problem for the system in (18) for the given $Q$ and $R$.

Case A: The right and left axes of all the sub-figures in Fig. 2 correspond to systems $I_a$ and $I_b$, each with its own initial pair $(c^*, \tau_o^*)$. For both systems, $J$ in Fig. 2 (a) is seen to decrease as $\tau_o + \tau_d$ decreases. This is expected, as the $H_2$-performance improves with a decrease in the overall delay. Fig. 2 (a), (c) and (d) show that, to achieve a lower $J$, model $I_a$ requires a lower $\tau_o$ and a higher $c$, while $I_b$ requires a higher $\tau_o$ and a lower $c$. We can infer that obtaining a better $H_2$-performance can demand completely different relative magnitudes of $\tau_d$ and $\tau_o$ depending on the system model and the initial conditions. Thus, this example validates the motivation of our problem in determining the trade-off between $\tau_d$ and $\tau_o$.

Figure 3: $J$, $S_{BW}$, $\tau_o$ and $c$ vs $N_z(K)$, where $N_z$ is the number of zero elements of $K$. 'DD' and 'W/O DD' indicate Algorithm 1 and the constant-delay algorithm, respectively.

We next validate Algorithm 1. To illustrate its benefits, we compare it to an algorithm that consists of only the inner ADMM loop, referred to as the constant-delay algorithm. Both algorithms start from $(K^*, \tau_o^*, \tau_d^*) \in \mathcal{K}$. The delays $\tau_o^*$ and $\tau_d^*$ are kept constant throughout the constant-delay algorithm.
In Case B, we present the simulations for a randomly generated LTI model with $A \in \mathbb{R}^{n \times n}$ and $B = B_w = Q = R = I_n$. We denote the number of zero elements of $K$ by $N_z(K)$.

Case B: We consider a randomly generated $A \in \mathbb{R}^{n \times n}$ with 900 links in the cyber-layer, $m_{cp} = 84$ and $m_{cc} = 81$. The initial conditions result in $c_{\min} > c^*$ from (32). However, $(K^*, c_{\min})$ is an unstable tuple, and therefore we rely on O2.2 to co-design $(K, c)$. Fig. 3 (a), (c) and (d) show that as sparsity increases, Algorithm 1 initially increases $\tau_o$ and maintains $c$ to maintain optimality of $J$. As shown in Proposition 1, an increase in $\tau_o$ decreases the bandwidth cost $S_{BW}$. Moreover, since $c$ moves towards $c_{\min}$, $S_{BW}$ decreases steeply for Algorithm 1. A further increase in the sparsity of $K$ causes the algorithm to decrease $\tau_o$ and increase $c$, which slows down the rate of decrease of $S_{BW}$ with respect to $N_z(K)$. Fig. 3 (a) and (b) show that, as a trade-off for a much lower $S_{BW}$ from Algorithm 1, we obtain a $J$ that is comparable to that of the constant-delay algorithm for all the sparsity levels.

Recall that the total system cost $S$ is composed of the bandwidth cost $S_{BW}$ and the CN cost $S_{CN}$. In Sec. 4, we designed an $H_2$ optimal combination of sparse $K$ and the delays $\tau_d$ and $\tau_c$ (i.e., bandwidths $b_{cp}$ and $b_{cc}$) to decrease $S_{BW}$ while keeping $S_{CN}$ constant (i.e., keeping the topology $\mathcal{T}$ constant). In this section, as an additional step to Algorithm 1, we aim to further reduce the closed-loop $H_2$ norm of the system by redesigning the CPS topology parameters $\mathcal{X}$, $\mathbf{n}$, $\mathcal{U}$ and $\mathbf{m}$ while keeping the gain matrix fixed at the sparse solution $K$ of Algorithm 1. This redesign changes $S_{CN}$, $\tau_d$, and $\tau_c$, which in turn changes the $H_2$ norm $J$. In this process, $S_{BW}$ remains constant (i.e., the bandwidths remain fixed to the solution of Algorithm 1). Since an optimal redesign of the topology should decrease both $S_{CN}$ and $J$ beyond the solution of Algorithm 1, we provide results on the existence of such an optimal topology and present a set of algorithms to obtain it.

Consider the system (5), which is implemented as a CPS following A.1 - A.4. Before stating our design objective, we first discuss the effect of redesigning $\mathcal{T}$ with fixed $K$, $b_{cc}$ and $b_{cp}$ on the following CPS parameters.

Recall from Sec. 2.2.1 and 2.2.2 that the number of inter-layer links depends on $K$, while the total number of intra-layer links depends on both $K$ and $\mathcal{T}$. Specifically, a non-zero $(i, j)$-th off-diagonal block of the matrix $\mathbb{K}(K, \mathcal{T})$, defined in Sec. 2.2.1, represents an intra-layer link transmitting the $n_i$ states of CN $i$ to CN $j$.

Figure 4: A simple diagram that depicts the change in the CPS topology of Fig. 1 that occurs due to row and column permutations of $K$; see Example 2. There are 9 intra-layer links in this topology compared to 14 in Fig. 1, even though $\mathrm{nnz}(K)$ remains the same.

Example 2

Let us recall Example 1. For the system in Fig. 1, there are a total of ř i n oﬀ i “ intra-layerlinks. If we change the topology to T ” ´ X “ p , , , , , , q , U “ p , , , , , q , n “ r , , , , , s T , “ r , , , , , s T ¯ , then, K p K , T q is obtained as: u z x x hkkikkj x hkkikkj x hkkikkj x hkkikkj x hkkikkj x hkkikkj »———————————– ﬁﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬂ u ! e g h f u ! a d b c u ! p o q u ! n l m u ! i j k u ! r s . (43) Here, n oﬀ “ r , , , , , s and therefore, the number of intra-layer links decrease from links in Example1 to ř i n oﬀ i “ , even though K remains the same. This eﬀect of topology redesign on the assignment ofthe inter-layer and intra-layer links is shown in Fig. 4. As seen from the ﬁgure, the reassignment changesthe destinations of the inter-layer links as well as both source and destinations of the intra-layer links. It,however, preserves the number of inter-layer links. From (10) and (11), it follows that varying T for a ﬁxed p K , b cc , b cp q leads to a variation in τ ctr “ κ p n T p T q n oﬀ p K , T qq{ b cc but τ dtr “ κ p n row p K q ` n col p K qq { b cp q is kept constant (since the number of zero rowsand columns will remain constant for a given K ). In Algorithm 1, as T was ﬁxed, both τ dpr and τ cpr wereconstant. In the current problem, however, T is the design variable, as a result of which both of thesepropagation delays become variable. Furthermore, we can no longer assume the per-link propagation delaysfor τ c and τ d to be equal as the choice of the destinations for the links is also variable. Therefore, weconsider the worst case propagation delay for both, and accordingly replace the design variables τ cpr and τ dpr with max i p τ cpri q and max i p τ dpri q respectively, where τ cpri and τ dpri are the propagation delays in thecorresponding i -th link. This will be shown shortly when we formulate our optimization problem for theredesign. 
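The effect described in Example 2 can be illustrated with a small, made-up gain matrix. The sketch below counts non-zero off-diagonal blocks before and after a row and column permutation and evaluates the transmission-delay expression $\tau_{ctr} = \kappa\,(\mathbf{n}^T \mathbf{n}_{\text{off}})/b_{cc}$. The matrix, the values of $\kappa$ and $b_{cc}$, the helper names, and the convention that the $j$-th entry of $\mathbf{n}_{\text{off}}$ counts the non-zero off-diagonal blocks in block column $j$ are assumptions for illustration; none of them is taken from Example 1 or Example 2.

```python
def block_bounds(sizes):
    """Turn block sizes [2, 2] into index ranges [(0, 2), (2, 4)]."""
    bounds, start = [], 0
    for s in sizes:
        bounds.append((start, start + s))
        start += s
    return bounds

def n_off(K, row_sizes, col_sizes):
    """Per-CN count of non-zero off-diagonal blocks (counted per block column)."""
    rows, cols = block_bounds(row_sizes), block_bounds(col_sizes)
    counts = [0] * len(cols)
    for j, (c0, c1) in enumerate(cols):
        for i, (r0, r1) in enumerate(rows):
            if i != j and any(K[r][c] != 0
                              for r in range(r0, r1) for c in range(c0, c1)):
                counts[j] += 1
    return counts

def permute(K, row_order, col_order):
    """Reorder the rows and columns of K (the role played by U K X in (48))."""
    return [[K[r][c] for c in col_order] for r in row_order]

def tx_delay(kappa, n_sizes, off_counts, b_cc):
    """Intra-layer transmission delay kappa * (n^T n_off) / b_cc."""
    return kappa * sum(n * k for n, k in zip(n_sizes, off_counts)) / b_cc

# Hypothetical 4x4 sparse gain shared by two CNs with two states/inputs each.
K = [[1, 0, 2, 0],
     [0, 3, 0, 4],
     [5, 0, 6, 0],
     [0, 7, 0, 8]]
before = n_off(K, [2, 2], [2, 2])
Kp = permute(K, [0, 2, 1, 3], [0, 2, 1, 3])
after = n_off(Kp, [2, 2], [2, 2])
print(before, tx_delay(1.0, [2, 2], before, b_cc=4.0))
print(after, tx_delay(1.0, [2, 2], after, b_cc=4.0))
```

In this toy case the permutation makes the matrix block diagonal, so both intra-layer links disappear while the number of non-zero entries of the gain stays at eight.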
The CN cost S CN consists of two parts - the rent cost S CN r and the computation overhead cost S CN c . Assumption 1 S CN r is a strictly increasing function of the number of CNs N . S CN c is a strictly increasingfunction of p n i ` m i q for each CN i . Following Assumption 1, we deﬁne the computation overhead cost as S CN c p T q : “ N ÿ i “ p n i ` m i q , (44)and the rent cost as S CN r p T q : “ t S maxCN c X p N q | X p N q is the N -th order statistic of random samples X i „ U p , q , i P N N m u , (45)where S maxCN c “ p m ` n ´ q ` m inputs and n states. An example of total CN cost ( S CN ) vs. number of CNs ( N ) for m “ n “

25 is shown in Fig. 5. We can see that the $S_{CN}$ vs. $N$ characteristics resemble a conventional marginal cost curve following the static pricing scheme used in cloud computing (Aldossary et al., 2019). As the number of CNs reduces, $S_{CN}$ first decreases due to a decrease in $S_{CNr}$. However, as $N$ reduces further, the increase in $S_{CNc}$ overtakes the decrease in $S_{CNr}$.

Figure 5: An example of the $S_{CN}$ vs $N$ characteristics.

We next state our design objective as follows.

P2:

Let the system (5) be implemented as a CPS following

A.1 - A.4 with maximum number of CNs N m “ max p m, n q CNs (Def. 1) and the initial topology denoted as T p in q | N m . Let the outputs of Algorithm1 when applied to this system be a sparse control gain matrix K ˚ , bandwidths b ˚ cc and b ˚ cp , and H norm J p K ˚ , T p in q | N m q . Then, assigning p K , b cc , b cp q ” p K ˚ | N m , b ˚ cc | N m , b ˚ cp | N m q , ﬁnd the optimal number ofCNs N ˚ P r , N m s and the corresponding optimal topology T ˚ | N ˚ such that P2.1 the closed-loop H norm is minimized and, P2.2 the corresponding CN cost S CN p T ˚ | N ˚ q ď S CN p T p in q | N m q , i.e., it remains at most the same as theCN cost for the initial topology. The rationale behind the formulation of P2 is as follows. For a given topology T , there is a higher loss inthe closed-loop H norm of (5) when block-wise sparsity is promoted in K compared to when the element-wise sparsity is promoted (Algorithm 1), assuming the same nnz p K q in both the cases. However, block-wisesparsity in K is required to reduce the number of intra-layer links. Therefore, P1 focuses on promotingelement-wise sparsity in K , and P2 rearranges its zero entries such that a block-wise sparse structure can beobtained. When combined with the variation in T to change the delays, this approach enables us to reduce J further than that obtained from Algorithm 1. We illustrate this approach in more detail in the simulationexample in Sec. 7.2. For a ﬁxed K ˚ , each topology T | N of the CPS (Def. 1) corresponds to an N ˆ N block matrix K | N througha bijective function K : p K ˚ , T | N q Ñ K | N deﬁned next. Let permutation matrices U P R m ˆ m and X P R n ˆ n be deﬁned for a given T | N “ p X | N , U | N , n | N , m | N q as U : “ t I m : permuted as p , , . . . , m q Ñ U u , (46) X : “ t I n : permuted as p , , . . . , n q Ñ X u . (47) Since both τ d and τ c are functions of T , with a slight abuse of notation, we write J p K , T q in this section. 
K | N is obtained as: K | N “ K p K ˚ , T | N q “ r UK ˚ X s r n | N , m | N s . (48)Here, r X s r a , b s represents the partitioning of X using rowgroups a and colgroups b (see Example 1). Since K is invertible, the set of block matrices K | N that each correspond to a valid topology for N CNs can bewritten as: t K | N u : “ t K | N : T | N “ K ´ p K ˚ , K | N q deﬁnes a topology from Def. 1 u . (49)To solve P2 , we ﬁrst choose optimal K ˚ | , ¨ ¨ ¨ , K ˚ | N m from the sets t K | u , ¨ ¨ ¨ , t K | N m u such that theircorresponding CN costs fulﬁll P2.2 ; the guidelines for this choice will be deﬁned shortly. Out of these N m ´ K ˚ | N ˚ as the one that corresponds to the minimum closed-loop H norm. The solutionof P2 will then be obtained as T ˚ | N ˚ “ K ´ p K ˚ , K ˚ | N ˚ q . We begin with the steps for obtaining t K | N u in(49) for N “ To obtain K ˚ | N “ , we ﬁrst need to ﬁnd U and X such that the majority of the zeros in K ˚ are delegatedto the oﬀ-diagonal blocks K , and K , . K | “ r UK ˚ X s r n | , m | s “ „ K , K , K , K ,  . (50)This is done so that the total number of non-zero oﬀ-diagonal blocks ř n oﬀ p K | N “ q , i.e., the number ofintra-layer links decreases. For a ﬁxed b ˚ cc , this can result in a decrease in the intra-layer transmission delay τ ctr and eventually in the closed-loop H norm.We use the spectral partitioning method based on the Fiedler vector to carry out the block diagonalpermutations, following its variant for rectangular matrices as presented in Kolda (1998). Algorithm 2astates the steps for carrying out these permutations. Since the block sizes, i.e., n | and m | are unknown,we assign m “ floor r m { s and n “ floor r n { s to avoid a trivial solution. The actual block sizes aredetermined after obtaining UK ˚ X , as seen next. Once U and X are obtained from Algorithm 2a, we next search for sets of n | and m | to obtain the feasibleset t K | u . There are p m ´ q ˆ p n ´ q ways to partition UK ˚ X as a 2 ˆ m “ n “

3, then K could be partitioned into a 2 ˆ »– ‚ ‚ ‚‚ › ĲĲ ‚ ˛ ﬁﬂ »– ‚ ‚ ‚‚ › ĲĲ ‚ ˛ ﬁﬂ »– ‚ ‚ ‚‚ › ĲĲ ‚ ˛ ﬁﬂ »– ‚ ‚ ‚‚ › ĲĲ ‚ ˛ ﬁﬂ n | r , s r , s r , s r , s m | r , s r , s r , s r , s (51)The computation of t K | u is detailed in Algorithm 2b. We next discuss the variation of τ c , τ d , S CN over theset t K | N u . ‚ τ d p K | N q : While τ dtr p K ˚ q remains constant, τ dpr p T | N q ” τ dpr p K ´ p K ˚ , K | N qq varies over t K | N u . ‚ τ c p K | N q : Both τ ctr and τ cpr are functions of T and therefore, can vary over t K | N u . To ensure that τ cpr remains constant or decreases over t K | N u for N ă N m , we can choose an N -dimensional set of CNs(out of N m ). We formally state the existence of such a set in the following lemma. Here, K ´ p K ˚ , K | N q is the inverse of the function K p K ˚ T | N q in the argument T | N for ﬁxed K ˚ , as given in (48). lgorithm 2a Spectral Partitioning Algorithm Input:

Sparse matrix $K^* \in \mathbb{R}^{m \times n}$
    Symmetrize $K^*$ as $\hat{K}^* = \begin{bmatrix} 0 & K^* \\ K^{*T} & 0 \end{bmatrix}$
    Compute the Laplacian of $\hat{K}^*$: $L = D - \hat{K}^*$, where $D = \mathrm{Diag}\{d_1, d_2, \ldots, d_{m+n}\}$ and $d_i = \sum_j \hat{K}^*_{ij}$
    Find the Fiedler vector $w$, i.e., the eigenvector corresponding to the smallest positive eigenvalue of $L$
    Let $w_f \in \mathbb{R}^m$ and $w_l \in \mathbb{R}^n$ denote the first $m$ and last $n$ elements of $w$, respectively
    Sort $w_f$ and $w_l$ in descending order and denote the corresponding sorted indices as $\{i_1, i_2, \ldots, i_m\}$ and $\{j_1, j_2, \ldots, j_n\}$
    $\mathbf{U} := I_m$ with rows permuted as $\{i_1, i_2, \ldots, i_m\}$
    $\mathbf{X} := I_n$ with rows permuted as $\{j_1, j_2, \ldots, j_n\}$
Result: Permutation matrices $\mathbf{U}$, $\mathbf{X}$, rowgroup $\mathcal{U} = \{i_1, i_2, \ldots, i_m\}$ and colgroup $\mathcal{X} = \{j_1, j_2, \ldots, j_n\}$

Algorithm 2b Optimal Partitioning Algorithm
Input: Initial sparse gain matrix $K^*$, CN cost $S_{CN}(\mathcal{T}^{(in)}|_{N_m})$, $H_2$ norm $J(K^*, \mathcal{T}^{(in)}|_{N_m})$
Input: $[\mathbf{U} K^* \mathbf{X}]$, $\mathcal{U}$ and $\mathcal{X}$ from Algorithm 2a
for $i = 1, \ldots, m-1$ do
    Set $\mathbf{m} = [i,\; m-i]^T$
    for $j = 1, \ldots, n-1$ do
        Set $\mathbf{n} = [j,\; n-j]^T$
        Set $\mathcal{T} := (\mathcal{X}, \mathcal{U}, \mathbf{n}, \mathbf{m})$ for the current iteration
        Calculate $\tau_c(\mathcal{T})$, $\tau_d(\mathcal{T})$ and $S_{CN}(\mathcal{T})$
        Calculate $J(K^*, \tau_c, \tau_d)$ for the current iteration's $\mathbf{m}$ and $\mathbf{n}$
Choose the partitionings $\mathbf{m}$, $\mathbf{n}$ for which the corresponding $S_{CN}(\mathcal{T}) \le S_{CN}(\mathcal{T}^{(in)}|_{N_m})$
Out of the remaining partitionings, choose the one that corresponds to the lowest $H_2$ norm and set $\mathcal{T}^*|_N = \mathcal{T}$ for that partitioning
Result: Topology $\mathcal{T}^*|_N$ for $N$ number of CNs

Lemma 3

Assume a CPS with parameters p K ˚ , b ˚ cc , N m q and topology T p in q | N m that results in the intra-layer propagation delay τ cpr p T p in q | N m q . For any N ď N m , one can always choose an N -dimensional set ofCNs such that for all topologies T | N , τ cpr p T | N q ď τ cpr p T p in q | N m q . (cid:4) ‚ S CN p K | N q : Rent cost S CN r is constant but the computation overhead cost S CN c p T | N q ” S CN c p K ´ p K | N qq varies over t K | N u .Our objective is to ﬁnd the optimal K ˚ | N out of t K | N u that minimizes the closed-loop H norm underthe constraint that the CN cost does not increase from that in Algorithm 1. O3: K ˚ | N “ argmin t K | N u J p τ c p K | N q , τ d p T p K | N qq , K ˚ q , (52)s.t. S CN p K | N q ď S CN p T p in q | N m q . (53)Algorithm 2 obtains T ˚ | ” K ´ p K ˚ , K ˚ | q by solving O3 . This is the optimal topology that can beimplemented for a 2 CN CPS. To obtain the same for N ą K | are furtherdivided into their constituent 2 ˆ T ˚ | N m ” K ´ p K ˚ | N m q . Finally, the optimal topology from the designed T ˚ | , T ˚ | , ¨ ¨ ¨ , T ˚ | N m is chosen such that min N J p T ˚ | N q . The optimization O3 is trivially feasible for N m CNs. We nextstate the proposition that derives the conditions under which O3 is feasible for any N ă N m . Proposition 2

Assume a CPS with N m CNs and ﬁxed p K ˚ , b ˚ cc , b ˚ cp q has a topology T p in q | N m , intra-linktransmission delay τ ctr p T p in q | N m q , and CN cost S CN p T p in q | N m q . Let Algorithm 2 be used to obtain a topology T ˚ | N for N ă N m such that the fraction of the total non-zero oﬀ-diagonal blocks in K p K ˚ , T ˚ | N q is less han or equal to that in K p K ˚ , T ˚ | N m q , i.e., ř N i n oﬀ i p K ˚ , T ˚ | N q N p N ´ q ď ř N m i n oﬀ i p K ˚ , T p in q | N m q N m p N m ´ q . (54) When n ď m , there always exists an optimal number of CNs N ˚ ă N m for which τ ctr p T ˚ | N ˚ q ă τ ctr p T p in q | N m q and S CN p T ˚ | N ˚ q ď S CN p T p in q | N m q . When n ą m . then this result holds if n ď m p m ´ q . (cid:4) Lemma 3 and Proposition 2 provide suﬃcient conditions under which intra-layer delay τ c p T ˚ | N ˚ q ă τ c p T p in q | N m q for some N ˚ ă N m . If the increase in the corresponding τ d p T ˚ | N ˚ q is limited such that p τ c p T ˚ | N ˚ q ` τ d p T ˚ | N ˚ qq ă p τ c p T p in q | N m q ` τ d p T p in q | N m qq , one can obtain J p T ˚ | N ˚ q ă J p T p in q | N m q . Otherwise, oneshould stick to the initial topology T p in q | N m . Note that τ cpr p T ˚ | N ˚ q may also be lesser than the initialvalue τ cpr p T p in q | N m q , as shown in Lemma 3. This can result in a further decrease in the overall delay τ c p T ˚ | N ˚ q ` τ d p T ˚ | N ˚ q .(a) T p in q | N m “ (b) T ˚ | N “ (c) T ˚ | N “ (d) T ˚ | N m “ (e) T ˚ | N “ (f) T ˚ | N “ Ĳ Actuator ‚ Sensor ‚ CN ´´ Inter-layer link carrying u p t q¨ ¨ ¨ Inter-layer link carrying x p t q ´ ¨ ´ Intra-layer link carrying x p t q Figure 6: Case B: (a) shows the initial topology T p in q | N m for N m “

30 CNs used in Algorithm 1. Theoptimal topologies for N “

4, 11, 20, 28 and 30 obtained from Algorithm 2 are shown in (b), (c), (d), (e)and (f) respectively.
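The exhaustive search over two-CN partitionings described for Algorithm 2b can be sketched as follows. The enumeration of the $(m-1) \times (n-1)$ block-size pairs follows the text; the callables `cn_cost` and `h2_norm`, and the toy models passed to them here, are placeholders and assumptions, not the paper's $S_{CN}$ or $J$.

```python
def two_cn_partitions(m, n):
    """All (m-1)*(n-1) ways to split m inputs and n states between two CNs."""
    return [((i, m - i), (j, n - j))
            for i in range(1, m) for j in range(1, n)]

def best_partition(m, n, cn_cost, h2_norm, cost_cap):
    """Keep partitions with cn_cost <= cost_cap; return the lowest-h2_norm one.
    Returns None when no partition satisfies the cost constraint."""
    feasible = [p for p in two_cn_partitions(m, n) if cn_cost(*p) <= cost_cap]
    if not feasible:
        return None
    return min(feasible, key=lambda p: h2_norm(*p))

# Placeholder models for illustration only (assumptions).
def toy_cost(m_sizes, n_sizes):
    return sum((mi + ni) ** 2 for mi, ni in zip(m_sizes, n_sizes))

def toy_norm(m_sizes, n_sizes):
    return abs(m_sizes[0] - n_sizes[0]) + 1.0

parts = two_cn_partitions(3, 3)
print(len(parts), parts)
print(best_partition(3, 3, toy_cost, toy_norm, cost_cap=20))
```

For $m = n = 3$ the enumeration yields four candidate partitionings, in line with the example in (51). In the actual algorithm each candidate would be scored with the delays, CN cost and $H_2$ norm computed for the corresponding topology.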

We apply Algorithm 2 to the simulation example of Case B presented in Sec. 5. A sparse K ˚ obtainedfrom Algorithm 1 with the corresponding topology T p in q | N m “ is used as the input to Algorithm 2. Thevalues of X p in q | , U p in q | , n p in q | and m p in q | for this initial topology are randomly chosen and given innegi (2021). Moreover, the values of τ d p T p in q | q , τ c p T p in q | q , S CN p T p in q | q and J p K ˚ , T p in q | q are used20o normalize the corresponding τ d p T ˚ | N q , τ c p T ˚ | N q , S CN p T ˚ | N q and J p K ˚ , T ˚ | N q for optimal topologies T ˚ | N obtained as the outputs of Algorithm 2. For e.g., a normalized τ c p T ˚ | N q “ . N ă N m , Algorithm 2 obtains a topology with 20% less intra-layer delay compared to the initial topology. The input to Algorithm 2 is a sparse K ˚ P R ˆ with nnz “ T p in q | , while Fig. 6 (b) - (f) show the optimal topologies T ˚ | N obtained as the output of Algorithm 2 for N “

4, 11, 20, 28 and 30, respectively. In these ﬁgures, the size of the square representing a CN is directlyproportional to the number of states and inputs associated with it. Thus, these ﬁgures show how Algorithm 2modiﬁes the number of states and inputs associated with each CN to satisfy O3 as N changes. Furthermore,they show the subset of N chosen out of N m CNs to fulﬁll Lemma 3. Fig. 7 shows that Algorithm 2guarantees a lower value of the closed-loop H norm for all N ď N m compared to J p K ˚ , T p in q | N m q . For theoptimal topologies, any potential increase in τ dpr is compensated by the decrease in τ ctr , as seen for N “ S CN decreases continuously from N “

30 until N “

15, after which it starts increasing againas the sharp rise in S CN c overshadows the decrease in the cost of renting a lower number of CNs. The mainmessage conveyed by Fig. 7 is that if one desires the lowest J such that S CN ď S CN p T p in q | N m q , then onecan choose the optimal topology T ˚ | N obtained for N “

2; however, if the goal is to obtain the lowest S CN such that J ď J p T p in q | N m q , then one must choose the optimal topology corresponding to N “ J , S , τ d and τ o between diﬀerent designed topologies T ˚ | , T ˚ | , ¨ ¨ ¨ T ˚ | . The normalization is with respect to the corresponding parameter values of the initiallygiven topology. Note that our approach is to ﬁrst promote element-wise sparsity in the initial K ˚ using Algorithm 1, andthen divide it into N “ , , . . . , N m blocks using Algorithm 2. If one wants to directly promote block-wisesparsity in K ˚ for any number of blocks N , then one needs to know the corresponding initial block-structurea priori. We can use the block-structure obtained from Algorithm 2 for each N as that initial given structureand use the algorithm proposed in our recent paper Negi and Chakrabortty (2020b) to carry out directblock-sparsity promotion in K ˚ . The corresponding S BW , S CN and J values obtained are compared with theoutput of Algorithm 2 in Fig. 8. While S BW is held constant in Algorithm 2 (since it is already optimizedin Algorithm 1), the S BW for block-sparse algorithm increases steeply as N increases. On the other hand,21 CN for a given N is the same for both the algorithms because of the same corresponding block-structures.However, the H norm obtained through Algorithm 2 in each case is lower than that obtained through directblock-sparsity promotion for all N . The reason is that the latter must collectively sparsify an entire blockas a result of which the H norm becomes more conservative. Our element-wise sparsiﬁcation approach inAlgorithm 1 manages to avoid this conservatism.Figure 8: Comparison of S BW , S CN and J between Algorithm 2 output and direct block-sparsity promotionin K ˚ for Case B This paper presented the co-design of network delays and sparse controller for LTI systems to improve their H performance. 
A bandwidth cost constraint is imposed to ensure finite bandwidth distribution among the communication links. The challenges of co-design, which arise from implicit functional relationships between the delays, the sparse controller, and the $H_2$-norm, are overcome by a hierarchical algorithm in which the inner and outer loops are based on ADMM and SDP relaxations, respectively. Additional algorithms are derived for carrying out structural modifications in the sparse controller to manipulate the number and computation overhead of CNs such that both the $H_2$ norm and the price of renting the agents are further reduced, at the cost of forgoing privacy of information. Numerical simulations show the effectiveness of the designs and bring out interesting observations about the relationship between the delays, sparsity, and $H_2$ performance of the closed-loop system.

Appendix

The ij -th block of ˜ A is given as:˜ A ij “ $’’’’’&’’’’’% N ř k “ , k ‰ j θ j ´ θ k I n , i P N N ´ , i “ j θ j ´ θ i N ś m “ , m ‰ j,i θ i ´ θ m θ j ´ θ m I n , i P N N ´ , i ‰ j A , i “ N, j P N N ´ . (55)22ubstituting (14) in (55), the diagonal and oﬀ-diagonal block matrices of the ﬁrst N ´ A ii “ τ o N ÿ k “ , k ‰ i a ik I n , ˜ A ji “ a ji τ o N ź m “ , m ‰ j,i a jm a im I n , (56a) a ik “ ´ ˆ sin ˆ p N ´ i ´ k q π q p N ´ q ˙ sin ˆ p k ´ i q π p N ´ q ˙˙ ´ , (56b)where i, j P t , . . . , N u . Therefore, Λ can be written as: Λ ij “ ˜ A ij , i “ , . . . , N ´ , j “ , . . . , N, , i “ N, j “ , . . . , N. (57)The proof follows from (55), (56) and (57). (cid:4) From (15) and (16), N d can be written as: N d “ r l p´ τ d q , . . . , l N p´ τ d qs T b I n . (58)Let ϑ k “ cos ´ p N ´ k ´ q πN ´ ¯ for k “ t , . . . , N ´ u . Using (14), (15a) and c “ τ d { τ o we can write l j p´ τ d q “ N ź m “ , m ‰ j ´ c ´ . p ϑ m ´ ´ q . p ϑ j ´ ´ ϑ m ´ q . (59)Using (58) and (59), N d can be rewritten in the form of (23) where the j -th row of Γ contains the coeﬃcientsof l j p´ τ d q . From (59), l j p´ τ d q is a product of N ´ c whose coeﬃcients are only dependenton N , and therefore, Γ is a constant for constant N . (cid:4) The proof of uniqueness of solution of (19) and diﬀerentiability of P utilizes Lemma 1 and 2, and followsprocedure similar to Theorem 2.1 and Lemma 3.1 in Rautert and Sachs (1997), respectively. Speciﬁcally, P p τ o qB τ o , P p c qB c and P p K q dK follow as solutions of the following Lyapunov equations: A Tcl P p τ o qB τ o ` P p τ o qB τ o A cl “ B τ o τ o p Λ T P ` PΛ q , (60) A Tcl P p c qB c ` P p c qB c A cl “ N d B c K Td G ` G T K d B c T N Td , (61) A Tcl P p K qB K ` P p K qB K A cl “ ´ Z d ´ Z Td ´ Z o ´ Z To , (62)where Z d “ N Td pB K ˝ I d q G and Z o “ N To pB K ˝ I o q G . The partial derivative of J p K q is J p K qB K “ Tr p B T P p K q B q “ Tr p ∇ J p K q T B K q , where B K P R m ˆ n . 
Post-multiplying (62) with L and taking its trace, we obtainTr pB K T ∇ J p K qq “ Tr ` B K Td GLN d ` B K To GLN o ˘ , (63)where B K d “ B K ˝ I d and B K o “ B K ˝ I o . Using the property Tr pp X ˝ Y q T Z q “ Tr p X T p Y ˝ Z qq (Schott,2016, Prob. 8.37), where X , Y , Z P R m ˆ n in (63), we get (27). Using (60) and (61), and a similar procedureas above, we obtain J p τ o q “ ´ τ o Tr p Λ T PL ` LPΛ q , (64) J p c q “ Tr p N d K Td GL ` LG T K d N dT q . (65)We can subsequently obtain (26) from (64) and (65). (cid:4) .4 Proof of Theorem 2 Using φ , φ , ψ , A and ∆ ˜ C as stated in the theorem, we deﬁne φ and ψ using Lemma 1 as: φ “ φ ` φ ` φ , φ “ A T ∆ P ` ∆ PA T , ψ “ ψ ` ψ , ψ “ ∆ ˜ C T R ∆ ˜ C . (66)The equation φ ` ψ “ p K , ω o , c ˚ q with ω o “ { τ o and therefore, p K , ω o , c ˚ q is a stabilizing tuple if φ ` ψ ĺ φ and ψ if they satisfy λ max p φ q` λ max p ψ q ĺ φ ` φ ` ψ ` λ max p φ q ` λ max p ψ q ĺ . (67)Equation (67) can be equivalently written as: φ ` φ ` ψ ` αI ĺ , α ě λ max p φ q ` λ max p ψ q . (68)Following (Horn and Johnson, 2013, Theorem 4.3.50) and (Goldberg and Tadmor, 1982, Theorem 1.2), | λ max p φ q| ď } A T ∆ P } , | λ max p ψ q| “ } R { ∆ ˜ C } . Therefore, (28) yields the necessary α for satisfying (68). (cid:4) Let φ “ ř i “ φ i where φ “ A ∆ L ` ∆ LA T , φ “ A L ˚ ` L ˚ A T , φ “ A ∆ L ` ∆ LA T and A “´ B ∆ K d ∆ N Td . The equation φ ` BB T “ p K , τ ˚ o , c q , and φ ` BB T ĺ p K , τ ˚ o , c q is a stabilizing tuple. The rest of the proof can be obtained through similar arguments asTheorem 2. Let k “ vec p K q . Using the property vec p ABC q “ p C T b A q B , on (39), we obtain the following: p T dd p k ˝ v d q ` T od p k ˝ v o qq ˝ v d ` p T do p k ˝ v d q` T oo p k ˝ v o qq ˝ v o ` ρ p k ˝ v d q ` ρ p k ˝ v o q “ µ. (69)Since v d and v o are binary vectors, p T dd p k ˝ v d qq˝ v d “ ` p T dd ˝ ˆ V Td q k ˘ ˝ v d . Furthermore, ` p T dd ˝ ˆ V Td q k ˘ ˝ v d “p ˆ V d ˝ T dd ˝ ˆ V Td q k . 
Substituting this in (69), p ˆ V d ˝ T dd ˝ ˆ V Td ` ˆ V d ˝ T od ˝ ˆ V To ` ˆ V o ˝ T do ˝ ˆ V Td ` ˆ V o ˝ T oo ˝ ˆ V To ` ρI n q k “ µ. (70)We get (40) from above, thereby completing the proof. (cid:4)

1) The delay ratio c “ τ d τ o can only be theoretically perturbed between 0 and 1 as τ d ď τ o . Due to τ dpr ‰ τ cpr ‰ c can only be perturbed in the open interval p τ dpr τ ˚ o , ´ τ cpr τ ˚ o q if τ o “ τ ˚ o is kept constant. Theminimum value of this interval is obtained when τ dtr “ τ ctr “ c ˚ be perturbed to c P r τ dpr { τ ˚ o , ´ p τ cpr { τ ˚ o qs resulting in a cost S BW p c q . Then, δS BW p c q is given as: “ S BW p c q ´ S ˚ BW (71) “ m cp n ˚ cp cτ ˚ o ´ τ dpr ` m cc n ˚ cc p ´ c q τ ˚ o ´ τ cpr ´ S ˚ BW (72)24 ´ S ˚ BW τ ˚ o ¯looooomooooon p c ` ´ τ ˚ o ` ´ S ˚ BW p τ ˚ o ` p τ dpr ´ τ cpr qq ´ m cp n ˚ cp ` m cc n ˚ cc ˘¯looooooooooooooooooooooooooooooooooooomooooooooooooooooooooooooooooooooooooon q c ` ´` S ˚ BW τ dpr ` m cp n ˚ cp ˘` τ ˚ o ´ τ cpr ˘ ´ m cc n ˚ cc τ dpr ¯looooooooooooooooooooooooooooooooomooooooooooooooooooooooooooooooooon r ´ ´ τ ˚ o ¯loooomoooon p c ` ´ τ ˚ o ` τ ˚ o ` p τ dpr ´ τ cpr q ˘¯looooooooooooooomooooooooooooooon q c ` ´ ´ τ dpr p τ ˚ o ´ τ cpr q ¯loooooooooooomoooooooooooon r . (73)Thus, the constraint δS BW p c q ď p p ´ p q c ` p q ´ q q c ` p r ´ r q ď ùñ δS BW p c q ď . (74)Since p ´ p “ p S ˚ BW ` q τ ˚ o ą δS BW p c q ď c .2) Keeping c “ c ˚ constant, we can write: c ˚ “ τ ˚ d τ ˚ o “ c “ τ d τ o “ τ dtr ` τ dpr τ dtr ` τ dpr ` τ ctr ` τ cpr , (75) ùñ τ ctr “ c ˚ p τ dtr ` τ dpr q c ˚ ´ τ cpr , (76) ùñ c ˚ τ ctr ´ c ˚ τ dtr “ c ˚ τ dpr ´ c ˚ τ cpr . (77)Thus, τ ˚ o “ τ ˚ ctr ` τ ˚ dtr ` τ cpr ` τ cpr is perturbed to τ o “ τ ctr ` τ dtr ` τ cpr ` τ cpr such that (77) is true. 
Theconstraint δS BW p τ o q is written as: “ S BW p τ o q ´ S ˚ BW (78) “ m cp n ˚ cp c ˚ τ o ´ τ dpr ` m cc n ˚ cc c ˚ τ o ´ τ cpr ´ S ˚ BW (79) “ ´ ´ S ˚ BW c ˚ c ˚ ¯loooooooomoooooooon p τ o ` ´ m cp n ˚ cp c ˚ ` m cc n ˚ cc c ˚ ` S ˚ BW p c ˚ τ cpr ` c ˚ τ dpr q ¯looooooooooooooooooooooooooooooooomooooooooooooooooooooooooooooooooon q τ o ` ´ ´ S ˚ BW τ dpr τ cpr ´ m cp n ˚ cp τ cpr ´ m cc n ˚ cc τ dpr ¯looooooooooooooooooooooooooooooomooooooooooooooooooooooooooooooon r ´ c ˚ c ˚ ¯loomoon p τ o ` ´ ´ c ˚ τ cpr ´ c ˚ τ dpr ¯loooooooooooomoooooooooooon q τ o ` ´ τ dpr τ cpr ¯looooomooooon r . (80)Thus, the constraint δS BW p τ o q ď p p ´ p q τ o ` p q ´ q q τ o ` p r ´ r q ď ùñ δS BW p τ o q ď . (81)Since p ´ p “ ´p S ˚ BW ` q c ˚ c ˚ ă δS BW p τ o q ď τ o . There are a total of N m “ max p m, n q CNs available in the cyber-layer of a given CPS. To implement anytopology T | N for a given K ˚ , one needs to choose a subset of N CNs out of these N m CNs. Let these subsetsbe denoted as V | N , N “ t , ¨ ¨ ¨ , N m u . For example, any topology for N “ V | . Let (cid:96) i,j “ (cid:96) j,i be the physical length of the edge between CN i and j . We can then deﬁne asuperset of physical lengths of the intra-layer links that will be utilized by a topology T | N as: L | N : “ t (cid:96) i,j p“ (cid:96) j,i q : i, j P V | N u . (82)25ince for a given T | N , we assume the worst case intra-layer link propagation delay as given in Sec. 6.1.2, τ cpr p T | N q is a function of max p L | N q written as: τ cpr p T | N q “ ¯ κ max p L | N q , ¯ κ “ { c , (83)where c is the speed of light in the intra-layer link medium (Bertsekas et al., 1992). Let (cid:96) p,q “ max ˆŤ N L | N ˙ be the largest link length between any two CNs. Thus, we can infer τ cpr p T p in q | N m q “ ¯ κ max p L | N m q “ ¯ κ max ˜ď N L | N ¸ “ (cid:96) p,q . 
(84)One can choose the subset of N m ´ T | N m ´ as V | N m ´ “ V | N m { p , i.e.,all CNs except CN p (or q ) that corresponds to the source/destination of the link with the largest lengthin L | N m . Therefore, the corresponding link length superset becomes max p L | N m ´ q ď (cid:96) p,q . Clearly, for thischoice of N m ´ τ cpr p T | N m ´ q ď τ cpr p T p in q | N m q . We can apply a similar logic for any N ă N m andchoose V | ∈ , V | (cid:51) , ¨ ¨ ¨ , V | N m ´ , V | N m such thatmax p L | q ď max p L | q ď ¨ ¨ ¨ ď max p L | N m ´ q ď max p L | N m q , (85) ùñ τ cpr p T | q ď τ cpr p T | q ď ¨ ¨ ¨ ď τ cpr p T | N m ´ q ď τ cpr p T | N m q “ τ cpr p T p in q | N m q . (86) We use the shorthand ř a | b for ř N i “ a i p K ˚ , T ˚ | b q and b “ b ´ a P R N and b P R for this proof.Using (54), for some N ă N m we can write N m N m N N ď ř n oﬀ | N m ř n oﬀ | N (87) ùñ N m N m N N ď ř n oﬀ | N m ř n | N m ř n oﬀ | N ř n | N , (88) ùñ N m N m n cc | N N N ď n cc | N m ` ˜ÿ i n oﬀ i | N m ¸ ˜ÿ j ‰ i n j | N m ¸ , (89)where n cc | N “ n | N T n oﬀ | N . From (89), n cc | N ď n cc | N m is true for some N ă N m for any given n oﬀ | N m and n | N m if N m N m N N ě ` ´ř n oﬀ i | N m ř j ‰ i n j | N m ¯ n cc | N m . (90)The maximum value of RHS in (90) is n ; we prove this in (C3) shortly. Therefore, we can rewrite (90) as: N m N m N N ě n. (91)We will now prove the theorem separately for the cases n ď m and n ą m . (C1) Case n ď m If the following constraints are satisﬁed for any n, m with n ď m : n ě N p N ´ q ` , N P r , n s , (92)then, from (91), the theorem is true. Clearly the above constraints are satisﬁed for any n ě

2. Therefore,the theorem is proven for the case n ď m . (C2) Case n ą m n ą m , N m “ m and N P r , m s . We can thus rewrite (90) as: m p m ´ q N p N ´ q ě n. (93)Therefore, using (93) and simple algebra, we can conclude that the above constraints will be satisﬁedsimultaneously, i.e., the theorem is valid only when m ď n ď p m p m ´ qq{ n ą m . (C3) Proof for the maximum value of the RHS in (90)We need to prove that the maximum value of RHS of (90) is n . To do that, we write the RHS of (90) as1 ` φ and prove that the maximum value of φ “ n ´

1, i.e.,max N m ě φ “ ˜ ř i n oﬀ i | N m ř j, j ‰ i n j | N m n | T N m n oﬀ | N m ¸ “ n ´ , (94)given that ř i n i | N m “ n , n i | N m P r , n ´ N m s and n oﬀ i | N m P r , N m ´ s . For ease of notation, we write n oﬀ | N m “ r n oﬀ1 , n oﬀ2 , . . . , n oﬀ N m s T and n | N m “ r n , n , . . . , n N m s T . Substituting n N m “ n ´ ř i n i in (94),we can write φ “ ` ř N m ´ i n oﬀ i p n ´ n i q ˘ ` n oﬀ N m ř N m ´ i n i ř N m ´ i “ n oﬀ i n i ` nn oﬀ N m ´ n oﬀ N m ř N m ´ i “ n i . (95)Let ∆ n oﬀ i “ n oﬀ N m ´ n oﬀ i @ i . Then, we can write: φ “ p ř N m ´ i “ n oﬀ i q n ` ř N m ´ i “ n i ∆ n oﬀ i nn oﬀ N m ´ ř N m ´ i “ n i ∆ n oﬀ i . (96)Substituting n oﬀ i “ n oﬀ N m ´ n oﬀ i @ i P N N m in (96): φ “ p N m ´ q nn oﬀ N m ` ř N m ´ i “ p n i ´ n q ∆ n oﬀ i nn oﬀ N m ´ ř N m ´ i “ p n i ´ n q ∆ n oﬀ i . (97)To maximize φ , we need to choose n oﬀ i , i P N N m such that we maximize ř N m ´ i “ p n i ´ n q ∆ n oﬀ i and ř N m ´ i “ n i ∆ n oﬀ i , which is only possible if we choose n i and ∆ n oﬀ i @ i P N N m ´ to be maximum. Therefore, weassume in (97) that ř N m ´ i “ n i “ n ´ n oﬀ i “ N m ´ @ i P N N m ´ . Substituting these in (97):max φ “ n p N m ´ q ` p N m ´ qpp ř N m ´ i “ n i q ´ p N m ´ q n q n p N m ´ q ´ p ř N m ´ i “ n i qp N m ´ q (98)max φ “ p N m ´ q n ` p n ´ ´ p N m ´ q n q n ´ p n ´ q “ n ´ . Therefore, (94) is proved. Since RHS of (90) is 1 ` φ , its maximum value is n . The obtained sparse K ˚ from Algorithm 1 for Case B has nnz p K ˚ q “ T p in q | N m “ is randomly chosen as: X p in q | N m “ t , , , , , , , , , , , , , , , , , , , , , , , , , , , , , u (99) U p in q | N m “ t , , , , , , , , , , , , , , , , , , , , , , , , , , , , , u (100) n p in q | N m “ r , , , ¨ ¨ ¨ , , s T P Z , m p in q | N m “ r , , , ¨ ¨ ¨ , , s T P Z (101)The corresponding block structure K p K ˚ , T p in q | N m “ q is given in Fig. 9.27igure 9: Block Structure K p K ˚ , T p in q | N m “ q References

Aldossary, M., Djemame, K., Alzamil, I., Kostopoulos, A., Dimakis, A., and Agiatzidou, E. (2019). Energy-aware cost prediction and pricing of virtual machines in cloud computing environments. Future Generation Computer Systems, 93:442–459.

Arastoo, R., Motee, N., and Kothare, M. V. (2014). Optimal sparse output feedback control design: a rank constrained optimization approach. arXiv preprint arXiv:1412.8236.

Barrera, J. and Garcia, A. (2015). Dynamic incentives for congestion control. IEEE Trans. Automat. Contr., 60(2):299–310.

Bertsekas, D. P., Gallager, R. G., and Humblet, P. (1992). Data networks, volume 2. Prentice-Hall International, New Jersey.

Boyd, S., Parikh, N., Chu, E., Peleato, B., and Eckstein, J. (2011). Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn., 3(1).

Bresch-Pietri, D. and Krstic, M. (2009). Adaptive trajectory tracking despite unknown input delay and plant parameters. Automatica, 45(9):2074–2081.

Candes, E. J., Wakin, M. B., and Boyd, S. P. (2008). Enhancing sparsity by reweighted ℓ1 minimization. Journal of Fourier Analysis and Applications, 14(5):877–905.

Geromel, J., Yamakami, A., and Armentano, V. (1989). Structural constrained controllers for discrete-time linear systems. Journal of Optimization Theory and Applications, 61(1):73–94.

Goldberg, M. and Tadmor, E. (1982). On the numerical radius and its applications. Linear Algebra Appl., 42:263–284.

Gu, K., Chen, J., and Kharitonov, V. L. (2003). Stability of time-delay systems. Springer Science & Business Media.

Hale, M. T., Nedić, A., and Egerstedt, M. (2015). Cloud-based centralized/decentralized multi-agent optimization with communication delays. In , pages 700–705. IEEE.

Heemels, W. M. H., Teel, A. R., Van de Wouw, N., and Nesic, D. (2010). Networked control systems with communication constraints: Tradeoffs between transmission intervals, delays and performance. IEEE Transactions on Automatic Control, 55(8):1781–1796.

Hespanha, J. P., Naghshtabrizi, P., and Xu, Y. (2007). A survey of recent results in networked control systems. Proceedings of the IEEE, 95(1):138–162.

Horn, R. A. and Johnson, C. R. (2013). Matrix analysis. Cambridge University Press.

Kelly, F. P., Maulloo, A. K., and Tan, D. K. (1998). Rate control for communication networks: shadow prices, proportional fairness and stability. Journal of the Operational Research Society, 49(3):237–252.

Kolda, T. G. (1998). Partitioning sparse rectangular matrices for parallel processing. In International Symposium on Solving Irregularly Structured Problems in Parallel, pages 68–79. Springer.

Lian, F., Chakrabortty, A., and Duel-Hallen, A. (2017). Game-theoretic multi-agent control and network cost allocation under communication constraints. IEEE J. Sel. Areas Commun., 35(2).

Lin, F., Fardad, M., and Jovanović, M. R. (2013). Design of optimal sparse feedback gains via the alternating direction method of multipliers. IEEE Trans. Autom. Control, 58(9).

Liu, Y.-C. and Chopra, N. (2012). Control of robotic manipulators under input/output communication delays: Theory and experiments. IEEE Transactions on Robotics, 28(3):742–751.

Negi, N. and Chakrabortty, A. (2020a). Co-designing delays with sparse controller for bandwidth-constrained cyber-physical systems. In . IEEE.

Negi, N. and Chakrabortty, A. (2020b). Sparsity-promoting optimal control of cyber–physical systems over shared communication networks. Automatica, 122:109217.

Rautert, T. and Sachs, E. W. (1997). Computational design of optimal output feedback controllers. SIAM J. Optim., 7(3).

Schott, J. R. (2016). Matrix analysis for statistics. John Wiley & Sons.

Sinopoli, B., Sharp, C., Schenato, L., Schaffert, S., and Sastry, S. S. (2003). Distributed control applications within sensor networks. Proceedings of the IEEE, 91(8):1235–1246.

Vanbiervliet, J., Michiels, W., and Jarlebring, E. (2011). Using spectral discretisation for the optimal H2 design of time-delay systems. International Journal of Control, 84(2):228–241.

Wytock, M. and Kolter, J. Z. (2013). A fast algorithm for sparse controller design. arXiv preprint arXiv:1312.4892.

Xin, Y., Baldine, I., Chase, J., Beyene, T., Parkhurst, B., and Chakrabortty, A. (2011). Virtual smart grid architecture and control framework. In , pages 1–6. IEEE.

Youcef-Toumi, K. and Wu, S.-T. (1992). Input/output linearization using time delay control.