The evolution of network controllability in growing networks

Abstract

The study of network structural controllability focuses on the minimum number of driver nodes needed to control a whole network. Despite intensive studies on this topic, most of them consider static networks only. It is well-known, however, that real networks are growing, with new nodes and links added to the system. Here, we analyze controllability of evolving networks and propose a general rule for the change of driver nodes. We further apply the rule to solve the problem of network augmentation subject to the controllability constraint. The findings fill a gap in our understanding of network controllability and shed light on controllability of real systems.

Rui Zhang (a), Xiaomeng Wang (a), Ming Cheng (b), Tao Jia (a)
(a) College of Computer and Information Science, Southwest University, Chongqing, 400715, P. R. China
(b) School of Rail Transportation, Soochow University, Suzhou, Jiangsu, 215131, P. R. China

Keywords: network controllability, growing networks, complex networks

1. Introduction

How to control a complex system is one of the most challenging problems in science and engineering, with a long history. In recent years, a significant amount of work has addressed the controllability of complex networks, that is, our ability to drive a network from any initial state to any desired final state within finite time [1, 2, 3]. A general framework based on the structural controllability of linear systems was first proposed to identify the minimum set of driver nodes (MDS) [4], whose control leads to the control of the whole network. Following it, related problems under this framework were also investigated, ranging from the cost of control [5, 6, 7] to the robustness and optimization of controllability [8, 9, 10, 11], from the multiplicity feature in control [12, 13, 14, 15] to the controllability of multi-layer or temporal networks [16, 17, 18, 19], and more [20, 21, 22, 23]. This framework was also applied to different real networked systems, such as financial networks, power networks, social networks, protein-protein interaction networks, disease networks, and gene regulatory networks [24, 25, 26, 27, 28, 29, 30, 31]. Meanwhile, research on different directions of control was also stimulated, such as edge controllability [32, 33], exact controllability [34, 35], strong structural controllability [36] and dominating sets [37, 38, 39], which significantly advanced our understanding of this fundamental problem.

However, except for a few works considering the temporal feature of networks [40, 41, 42, 43, 44], most of the current advances on network controllability focus on static networks, in which a fixed number of nodes are connected by a fixed number of links that do not change over time. But real networks are growing, with new links and nodes constantly added to the system [45]. To the best of our knowledge, there is no study on the general principle for the change of controllability in growing networks. In this work, we analyze controllability of evolving networks and propose a general rule for the change of driver nodes under network expansion. This rule allows us to further study a problem of network augmentation subject to controllability constraints [43]. The maximum number of new nodes that can be added to the network while keeping the controllability unchanged is difficult to obtain. However, upper and lower bounds for this number can be efficiently identified. Both bounds are also affected by the different types of degree correlation in directed networks.

In the following discussion, we briefly review the framework for identifying minimum driver nodes and a node classification scheme based on the multiplicity feature in choosing driver nodes. With these basic concepts, we propose a general rule for the change of driver nodes when a new node is added and connected to existing nodes in the network. Finally, we use this rule to solve a problem of maintaining network controllability while adding new nodes to the system.

Email address: [email protected] (Tao Jia)

Preprint submitted to Physica A, January 27, 2021

2. Results

A dynamical system is controllable if it can be driven from any initial state to any desired final state within finite time. In many systems, altering the state of a few nodes is sufficient to drive the dynamics of the whole network. For a linear time-invariant system, the minimum driver node set (MDS) can be identified efficiently [4]. First, the directed network is converted into a bipartite graph by splitting each node into two nodes, forming two disjoint sets of + and − nodes. Consequently, a directed link from node i to node j in the directed network becomes a link from node i^+ to node j^− in the bipartite graph. Then the maximum matching [46, 47, 48] of the bipartite graph is identified, in which each node can be matched to at most one other node via one link. The unmatched nodes in the − set are the driver nodes. By imposing properly chosen signals on these N_D driver nodes of the MDS, we gain control over the whole system.
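The matching procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the paper: it finds a maximum matching with a simple augmenting-path search rather than Hopcroft-Karp, the function names `maximum_matching` and `driver_nodes` are hypothetical, and the five-node network (nodes numbered from 0) is an assumed example chosen only because it gives N_D = 3.

```python
def maximum_matching(n, edges):
    """Maximum matching of the bipartite graph built from a directed network.

    A directed link (i, j) is read as a link between i^+ and j^-.
    Returns match_in, where match_in[j] is the node i whose + copy is
    matched to j^-, or None if j^- is unmatched.
    """
    out = [[] for _ in range(n)]
    for i, j in edges:
        out[i].append(j)
    match_in = [None] * n

    def augment(i, seen):
        # Try to match i^+, re-routing earlier matches where possible.
        for j in out[i]:
            if j not in seen:
                seen.add(j)
                if match_in[j] is None or augment(match_in[j], seen):
                    match_in[j] = i
                    return True
        return False

    for i in range(n):
        augment(i, set())
    return match_in


def driver_nodes(n, edges):
    """Unmatched nodes in the - set, i.e. one minimum driver node set."""
    match_in = maximum_matching(n, edges)
    return [j for j in range(n) if match_in[j] is None]


# Assumed example: node 0 points to node 1, which points to nodes 2, 3 and 4.
example_edges = [(0, 1), (1, 2), (1, 3), (1, 4)]
print(driver_nodes(5, example_edges))  # -> [0, 3, 4]
```

Visiting the out-links of node 1 in a different order would leave a different leaf matched, so the same N_D can be realized by several driver node sets.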

Figure 1: (a) A directed network with five nodes. (b) The directed network in (a) can be transformed into a bipartite network by splitting each node into two nodes. The maximum matching is then performed in the bipartite network, leaving nodes 1^−, 4^− and 5^− unmatched. (c) One minimum driver node set (MDS) is obtained from the maximum matching in (b). The whole network is controllable by controlling nodes 1, 4 and 5. (d) Node classification based on the likelihood of being a driver node. Node 1 is critical, node 2 is redundant and nodes 3, 4, 5 are intermittent.

The number of driver nodes necessary and sufficient for control, N_D, is fixed for a given network. However, there are multiple choices of MDSs with the same N_D (Fig. 1a), giving rise to a multiplicity feature [12, 13]. Correspondingly, a node classification scheme has been proposed based on a node's likelihood of being in the MDS. A node may appear in all MDSs; such a node is critical, because the network cannot be brought under control without controlling it. A node may not appear in any MDS; such a node is redundant, as it never requires an external input. The remaining nodes, which appear in some but not all MDSs, are intermittent. It has been found that a node is critical if and only if it has no incoming links [12]. Hence, the fraction of critical nodes in the network (n_c) is solely determined by the degree distribution and equals the fraction of nodes with zero in-degree. The redundant nodes in a network can be identified by an algorithm with O(LN) complexity. The intermittent nodes are therefore readily known once the critical and redundant nodes are identified.

An aim of this work is to answer the question of how the number of driver nodes N_D changes when a new node is added to the network with new links pointing to or from the existing nodes.
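The classification into critical, redundant and intermittent nodes can be reproduced on small networks with a brute-force check. The sketch below is not the O(LN) algorithm mentioned in the text. It assumes the standard matching fact that a − node is matched in every maximum matching exactly when deleting its incoming links shrinks the maximum matching. The names `matching_size` and `classify` and the five-node network are hypothetical.

```python
def matching_size(n, edges):
    """Size of a maximum matching between + copies and - copies."""
    out = [[] for _ in range(n)]
    for i, j in edges:
        out[i].append(j)
    match_in = [None] * n

    def augment(i, seen):
        for j in out[i]:
            if j not in seen:
                seen.add(j)
                if match_in[j] is None or augment(match_in[j], seen):
                    match_in[j] = i
                    return True
        return False

    return sum(augment(i, set()) for i in range(n))


def classify(n, edges):
    """Label every node as critical, redundant or intermittent."""
    base = matching_size(n, edges)
    has_incoming = {j for _, j in edges}
    labels = {}
    for j in range(n):
        if j not in has_incoming:
            labels[j] = "critical"  # zero in-degree: a driver in every MDS
            continue
        # Delete the links into j; if the matching shrinks, j^- was
        # matched in every maximum matching, so j is never a driver.
        reduced = [(a, b) for a, b in edges if b != j]
        if matching_size(n, reduced) < base:
            labels[j] = "redundant"
        else:
            labels[j] = "intermittent"
    return labels


# Assumed example: node 0 points to node 1, which points to nodes 2, 3 and 4.
example_edges = [(0, 1), (1, 2), (1, 3), (1, 4)]
print(classify(5, example_edges))
```

On this assumed network the result is one critical node, one redundant node and three intermittent nodes, the same mix of categories that the caption of Fig. 1 describes.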
For simplicity, we separately analyze two cases: the new node has only incoming links, or only outgoing links. Indeed, the correlation between a node's in- and out-degree does not affect the overall controllability [49]. Therefore, the general case of adding a new node with both incoming and outgoing links can be considered as a process that adds one node with only outgoing links and one node with only incoming links, and then merges these two nodes into a single node.

We first consider adding to a network a single node that has only outgoing links. Our conclusion is that if at least one new link is connected to a non-redundant node (either a critical node or an intermittent node, denoted NR node for short) in the original network, the number of driver nodes stays the same. Otherwise, if all links are connected to redundant nodes (denoted R nodes for short) in the original network, the number of driver nodes increases by 1. While the detailed proof is given in Appendix A, this conclusion can be intuitively understood as follows. A node without incoming links always requires an independent external signal to control. Hence, when this node is added to a network, the number of driver nodes will either increase by 1 or stay unchanged, depending on whether an existing driver node in the original network becomes a non-driver node after the new node is added. Since a redundant node can never become a driver node, linking to redundant nodes does not change the original number of external signals. In this case, N_D increases by 1. In contrast, a critical node is always a driver node, and an intermittent node can become a driver node in some circumstances. Connecting to these two types of nodes can save one original signal, making N_D stay the same.

The situation in which a single node with only incoming links is added to a network can be analyzed in a similar way by introducing the transpose network, in which the direction of all links in the original network is reversed. The value of N_D is the same in the original network and the transpose network. Therefore, the problem of how N_D changes when adding a node with only incoming links is equivalent to the problem of how N_D changes when adding a node with only outgoing links in the transpose network. Correspondingly, we can first identify each node's category in the transpose network, i.e. whether the node is redundant or non-redundant in the transpose network. Then we can apply the result above and conclude that if at least one new link is from a non-redundant node in the transpose network (denoted NR^T node for short), N_D stays the same. Otherwise, if all links are from redundant nodes in the transpose network (denoted R^T nodes for short), N_D increases by 1.

To summarize, when a node with only outgoing links is added, there are two options for the change of N_D: (1) N_D increases by 1 if all links are connected to R nodes; (2) N_D stays the same if at least one link is connected to an NR node. Likewise, when a node with only incoming links is added, there are also two options: (1) N_D increases by 1 if all links are from R^T nodes; (2) N_D stays the same if at least one link is from an NR^T node. As mentioned above, the general case of adding a new node with both incoming and outgoing links can be considered as a process that adds one node with only outgoing links and one node with only incoming links, and then merges these two nodes into a single node. It is also noteworthy that N_D decreases by 1 during this particular merging process (see Appendix B).
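The two single-direction rules summarized above can be checked numerically on a small network by recomputing N_D after each addition. The sketch below is a brute-force illustration under stated assumptions: the helper names (`redundant_nodes`, `predicted_change`, `check_rules`) are hypothetical, redundancy is tested by deleting a node's incoming links and recomputing the matching, and N_D is taken as the number of nodes minus the matching size. Only the out-links-only and in-links-only cases are exercised; the merged case, where in- and out-links can interact through the same matching, is not covered.

```python
from itertools import combinations


def matching_size(n, edges):
    """Size of a maximum matching between + copies and - copies."""
    out = [[] for _ in range(n)]
    for i, j in edges:
        out[i].append(j)
    match_in = [None] * n

    def augment(i, seen):
        for j in out[i]:
            if j not in seen:
                seen.add(j)
                if match_in[j] is None or augment(match_in[j], seen):
                    match_in[j] = i
                    return True
        return False

    return sum(augment(i, set()) for i in range(n))


def n_drivers(n, edges):
    return n - matching_size(n, edges)


def redundant_nodes(n, edges):
    """Nodes whose - copy is matched in every maximum matching."""
    base = matching_size(n, edges)
    return {j for j in range(n)
            if matching_size(n, [(a, b) for a, b in edges if b != j]) < base}


def predicted_change(neighbours, redundant):
    """Rule from the text: +1 if every new link touches a redundant node."""
    return 1 if all(v in redundant for v in neighbours) else 0


def check_rules(n, edges):
    """Compare the rule with recomputed N_D for all 1- and 2-link additions."""
    edges = list(edges)
    transpose = [(b, a) for a, b in edges]
    r_nodes = redundant_nodes(n, edges)       # R nodes
    rt_nodes = redundant_nodes(n, transpose)  # R^T nodes
    base = n_drivers(n, edges)
    s = n  # index of the new node
    for k in (1, 2):
        for group in combinations(range(n), k):
            out_only = edges + [(s, t) for t in group]
            in_only = edges + [(u, s) for u in group]
            if n_drivers(n + 1, out_only) - base != predicted_change(group, r_nodes):
                return False
            if n_drivers(n + 1, in_only) - base != predicted_change(group, rt_nodes):
                return False
    return True


print(check_rules(5, [(0, 1), (1, 2), (1, 3), (1, 4)]))
```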
Taken together, we have the general conclusion on the change of controllability as follows:

- Identify the category of all existing nodes, i.e. R or NR in the original network, and R^T or NR^T in the transpose network (Fig. 2a).
- N_D increases by 1 if all out-links are connected to R nodes and all in-links are from R^T nodes (Fig. 2b).
- N_D stays the same if all out-links are connected to R nodes and at least one in-link is from an NR^T node, or at least one out-link is connected to an NR node and all in-links are from R^T nodes (Fig. 2c).
- N_D decreases by 1 if at least one out-link is connected to an NR node and at least one in-link is from an NR^T node (Fig. 2d).

Figure 2: (a) A network with five nodes that can be controlled via three driver nodes (N_D = 3). The category of the node, i.e. R or NR in the original network, and R^T or NR^T in the transpose network, is labeled for each of the five nodes. When a new node with one outgoing link and one incoming link is added to the network, (b) N_D increases by 1 when the out-link connects to an R node and the in-link is from an R^T node. (c) N_D stays the same when the out-link connects to an NR node and the in-link is from an R^T node. (d) N_D decreases by 1 when the out-link connects to an NR node and the in-link is from an NR^T node.

A recent study raises an interesting problem about network augmentation: what is the maximum number of nodes that can be added to a network while keeping N_D unchanged [43]? The problem is subject to several constraints that exclude trivial solutions. First, each new node has only one outgoing link. The restriction to one link slightly simplifies the problem, and it also means that the new nodes are not able to form a cycle. Second, the new nodes are not allowed to connect to critical nodes, i.e. the nodes in the network with zero in-degree. Therefore, the trivial solution in which new nodes connect one after another to form a directed path is excluded. Finally, the MDS used to control the original network is recorded, and the new nodes are not allowed to connect to any node in the original MDS. Indeed, one trivial way to keep N_D unchanged is to connect the new node to a node in the MDS. This constraint excludes that trivial solution, and it significantly increases the difficulty of the problem, as will be explained later.

The problem itself has several implications for real systems [43, 50], which are not the focus of our work. We are interested in identifying the maximum number of nodes N_a that can be added to the network in this problem. One intuitive answer is N_a = N_D − N_c. The reason is as follows. Because each new node has zero in-degree, it must always be controlled and is a driver node once it is added. To keep N_D the same, we can add at most N_D of these new nodes. Because nodes with zero in-degree are not allowed to be connected to, the number of critical nodes in the original network should be deducted, which yields the answer N_a = N_D − N_c.

However, this solution does not necessarily satisfy the third constraint. Because the new node cannot connect to nodes in the original MDS, the way the original network is controlled affects N_a. Indeed, our conclusion on controllability change shows that the new node has to connect to an NR node to keep N_D the same. But if the NR node is also in the MDS, it is not allowed to be connected to. Fig. 3 shows an example of how the original MDS affects N_a.

Hence, N_D − N_c is an upper bound of N_a, but N_a in many cases can be less than N_D − N_c. Finding the exact value of N_a turns out to be highly non-trivial and is related to solving an integer programming problem. But based on the principle identified, we can use a greedy algorithm to find a local maximum, denoted by N_a^o, which represents a lower bound of N_a.
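To make the gap between the upper bound and a greedy lower bound concrete, the following sketch adds single-out-link nodes one at a time under the three constraints. It is an assumption-laden illustration and not the algorithm of Appendix C: node categories are recomputed by brute force after every addition, the first admissible target is always taken, and the names `greedy_lower_bound`, `example_edges` and `example_mds` are hypothetical.

```python
def matching_size(n, edges):
    """Size of a maximum matching between + copies and - copies."""
    out = [[] for _ in range(n)]
    for i, j in edges:
        out[i].append(j)
    match_in = [None] * n

    def augment(i, seen):
        for j in out[i]:
            if j not in seen:
                seen.add(j)
                if match_in[j] is None or augment(match_in[j], seen):
                    match_in[j] = i
                    return True
        return False

    return sum(augment(i, set()) for i in range(n))


def n_drivers(n, edges):
    return n - matching_size(n, edges)


def redundant_nodes(n, edges):
    base = matching_size(n, edges)
    return {j for j in range(n)
            if matching_size(n, [(a, b) for a, b in edges if b != j]) < base}


def greedy_lower_bound(n, edges, mds):
    """Greedily add nodes with one out-link while N_D stays unchanged.

    Targets must be original nodes that are non-redundant in the current
    network, have non-zero in-degree, and are outside the recorded MDS.
    Returns the number of added nodes and the grown edge list.
    """
    edges = list(edges)
    zero_in = set(range(n)) - {j for _, j in edges}
    forbidden = set(mds) | zero_in
    size = n
    while True:
        redundant = redundant_nodes(size, edges)
        candidates = [t for t in range(n)
                      if t not in redundant and t not in forbidden]
        if not candidates:
            return size - n, edges
        edges.append((size, candidates[0]))  # new node `size` points to target
        size += 1


# Assumed example: node 0 points to node 1, which points to nodes 2, 3 and 4.
example_edges = [(0, 1), (1, 2), (1, 3), (1, 4)]
example_mds = [0, 3, 4]
n_added, grown = greedy_lower_bound(5, example_edges, example_mds)
n_c = 5 - len({j for _, j in example_edges})      # nodes with zero in-degree
upper_bound = n_drivers(5, example_edges) - n_c   # N_D - N_c
print(n_added, upper_bound)  # -> 1 2
```

In this assumed network only one node can be added for the recorded MDS, although N_D − N_c equals 2, which illustrates how the third constraint can keep N_a below the upper bound.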
The idea is to identify an NR node that is not in the MDS and connect the new node to this NR node. The algorithm (see Appendix C) has O(NL) complexity and identifies a set of N_a^o nodes for a given choice of the MDS used to control the original network. We find that the N_a^o identified by our algorithm can be considerably less than N_D − N_c. The difference varies non-monotonically with the average degree of the network ⟨k⟩ and reaches its peak at an intermediate value of ⟨k⟩ (Fig. 4a).

N_a^o depends on the particular choice of MDS, and there are multiple MDSs for a given network. To take this multiplicity feature into account, we apply the random sampling method [13] to generate a collection of random MDSs, in which each MDS gives rise to an N_a^o value. We then calculate the mean, maximal and minimal value of N_a^o based on the collection of MDSs (Fig. 4b). In general, the mean and maximal values of N_a^o are very close. Statistically, N_a^o is not significantly affected by the choice of MDS. But there exist rare cases in which the N_a^o value is much less than its mean.

Figure 3: An example of how the original MDS can affect N_a. (a) Nodes 1, 3, 4, 5 form the MDS. After adding a new node s connecting to node 6, node 2 remains an NR node, allowing an extra node to be added. In this case, N_a = 2. (b) Nodes 1, 2, 3, 5 form the MDS. After adding a new node s connecting to node 6, there is no NR node that is not included in the original MDS. In this case, N_a = 1.

Figure 4: (a) The upper bound (N_D − N_c) and the lower bound (N_a^o) of N_a for ER networks with N = 10000 and varying ⟨k⟩. N_a^o is obtained via one realization of the MDS. Both the upper and lower bound vary non-monotonically with ⟨k⟩. The difference between the upper and lower bound, ∆ (inset), follows a trend similar to those of N_D − N_c and N_a^o. (b) The average, maximal and minimal value of N_a^o based on 100 randomly generated MDSs in ER networks with N = 10000 and varying ⟨k⟩.

Finally, we analyze the effect of degree correlation on N_a. In most real systems, connections between nodes are not neutral [51, 52, 53]. Nodes have a certain tendency to connect to nodes with similar or different degree. Such tendency, or degree correlation, is usually quantified by the Pearson correlation coefficient between the degrees of two nodes connected by a single link [53, 51]. In directed networks, a node is characterized by both in- and out-degree. Hence, there are four different quantifications of degree correlation [54, 49].
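The four directed degree correlations can be illustrated with a short sketch that computes the plain Pearson coefficient over links, pairing the α-degree of each link's source with the β-degree of its target. This is an assumption about the intended quantity: Eq. (1) is written in a symmetrized form, so its normalization may differ from the coefficient computed here. The function name `degree_correlation` and the three-link example are hypothetical.

```python
import math


def degree_correlation(edges, alpha, beta):
    """Pearson correlation, over links, between the alpha-degree of the
    source node and the beta-degree of the target node.

    alpha and beta are each "in" or "out". Returns NaN when either
    degree sequence has zero variance.
    """
    degree = {"in": {}, "out": {}}
    for s, t in edges:
        degree["out"][s] = degree["out"].get(s, 0) + 1
        degree["in"][t] = degree["in"].get(t, 0) + 1

    a = [degree[alpha].get(s, 0) for s, _ in edges]
    b = [degree[beta].get(t, 0) for _, t in edges]
    n_links = len(edges)
    mean_a = sum(a) / n_links
    mean_b = sum(b) / n_links
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n_links
    var_a = sum((x - mean_a) ** 2 for x in a) / n_links
    var_b = sum((y - mean_b) ** 2 for y in b) / n_links
    if var_a == 0 or var_b == 0:
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


# Assumed example with three links.
example_edges = [(0, 1), (0, 2), (1, 2)]
for alpha in ("in", "out"):
    for beta in ("in", "out"):
        print(alpha, beta, degree_correlation(example_edges, alpha, beta))
```

For the three-link example, the out-in coefficient works out to −0.5 and the in-in coefficient to 0.5.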
More specifically, for a directed link that starts at node s and ends at node t, the degree correlation r is given by

$$ r^{(\alpha,\beta)} = \frac{L^{-1}\sum_i \alpha_i^{s}\beta_i^{t} - \left[L^{-1}\sum_i \tfrac{1}{2}\left(\alpha_i^{s}+\beta_i^{t}\right)\right]^2}{L^{-1}\sum_i \tfrac{1}{2}\left[(\alpha_i^{s})^2+(\beta_i^{t})^2\right] - \left[L^{-1}\sum_i \tfrac{1}{2}\left(\alpha_i^{s}+\beta_i^{t}\right)\right]^2}, \qquad (1) $$

where L is the total number of links in the network and α, β ∈ {in, out} correspond to the two different types of degree. The four types of degree correlation are hence denoted by r_in−in, r_in−out, r_out−out, r_out−in.

Figure 5: The relationship between the degree correlation and N_a in ER networks with N = 10000 and ⟨k⟩ = 4. The four types of degree correlation are denoted by r_in−in, r_in−out, r_out−out, r_out−in. The upper bound (N_D − N_c), the lower bound (N_a^o), and the difference between the two (∆, in the inset) change very similarly.

Since N_c does not change with degree correlation but only depends on the number of nodes with zero in-degree (P_in(0)), the upper bound of N_a, which is N_D − N_c, changes with N_D alone. It does not change with r_in−out, increases with the absolute value of r_in−in and r_out−out, and monotonically decreases with r_out−in [49] (Fig. 5). The lower bound N_a^o, identified using our method, follows a trend similar to that of N_D in all cases. Furthermore, the difference ∆ between the upper and lower bound also shows a trend similar to that of N_a and N_a^o.

3. Discussion

In summary, we study the change of network controllability in growing networks. We introduce two sets of node categories, R and NR, and R^T and NR^T. We find that the number of driver nodes N_D can increase by 1, stay the same or decrease by 1 when a new node is added. The change depends on the categories of the nodes (R or NR) that the outgoing links connect to and the categories of the nodes (R^T or NR^T) that the incoming links come from. This principle on the change of controllability helps us to solve a recently proposed problem on network augmentation: the maximum number of nodes N_a that can be added to a network while keeping N_D unchanged. We propose an algorithm that efficiently finds a lower bound of N_a. We demonstrate how the upper bound and lower bound change with the average degree and the degree correlation of the network.

The results presented have many potential applications in future work [55]. Network expansion or augmentation, such as adding nodes or edges to an existing network, is a ubiquitous feature of our rapidly growing technological society. Generally, as a network grows larger, more nodes are required to achieve full control and the cost of controlling the network increases. Our approach can offer insights for future work exploring the augmentation of nodes in control and offer fundamental tools to explore control in temporal complex systems.

Acknowledgement

This work is supported by the Natural Science Foundation of China (No. 6160309). M. C. is also supported by the Natural Science Foundation of Jiangsu Province (No. BK20150344) and the China Postdoctoral Science Foundation (No. 2016M601885).

Appendix A.

Our conclusion states that if at least one new link is connected to a non-redundant node (either a critical node or an intermittent node, denoted NR node for short) in the original network, the number of driver nodes stays the same. Otherwise, if all links are connected to redundant nodes (denoted R nodes for short) in the original network, the number of driver nodes increases by 1. The proof of this conclusion is best described in a bipartite graph. Therefore, we change the terminology from the “driver node in a directed network” to the “matched or unmatched node in the − set of a bipartite graph”. In particular, there are several equivalent terms:

number of driver nodes = number of nodes in a set (either + or −) − number of matched pairs (or number of matched nodes in a set)

redundant node = always matched node in the − set of a bipartite graph, i.e. the node is matched in all different maximum matching configurations

non-redundant node = not always matched node in the − set of a bipartite graph. This includes the node that is not matched in the current maximum matching configuration, and the node that is currently matched but can be unmatched in a different maximum matching configuration

Now let us consider the case in which a new node s with one outgoing link is added to a directed network (Fig. A.6a). In the bipartite graph representation, this adds a node s^− with zero links and a node s^+ with one link. Denoting the node that s^+ connects to by t^−, there are three different situations:

1. Node t^− is unmatched. Then a new maximum matching is achieved by matching node s^+ and node t^− (Fig. A.6b).
2. Node t^− is matched in the current maximum matching configuration, but it is not always matched. Because node t^− is not always matched, there are configurations in which node t^− is unmatched that yield the same number of matched pairs. Then we can always change the matching configuration to one in which node t^− is unmatched and then match the pair node s^+ and node t^−. In the terminology of the Hopcroft-Karp algorithm, which is applied to find the maximum matching in a bipartite graph [46], this means that there exists an augmenting path that goes through node s^+ and node t^−.
3. Node t^− is matched and is always matched. In this case, there is no maximum matching configuration with t^− unmatched. Hence, node s^+ and node t^− cannot be matched. In other words, there is no augmenting path that goes through node s^+ and node t^−.

In cases 1 and 2 (connecting to a node that is not always matched), the number of matched pairs in the maximum matching increases by 1, which offsets the increase in the total number of nodes. Consequently, the number of driver nodes stays the same. In case 3 (connecting to an always matched node), the number of matched pairs stays the same while the total number of nodes increases by 1, so the number of driver nodes increases by 1.
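The bookkeeping behind the three cases can be written compactly. The symbols N (number of nodes in one set of the bipartite graph before the addition), |M| (number of matched pairs in a maximum matching) and the prime (values after node s is added) are notation introduced here for illustration and are not taken from the paper.

```latex
\begin{align}
N_D &= N - |M| \\
\text{cases 1 and 2:}\quad N_D' &= (N+1) - (|M|+1) = N_D \\
\text{case 3:}\quad N_D' &= (N+1) - |M| = N_D + 1
\end{align}
```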

Figure A.6: (a) A new node s with one outgoing link is added to a directed network with five nodes. (b) The corresponding bipartite network with a maximum matching obtained. Nodes 1^−, 4^− and 5^− are unmatched in the original network. (c) Node 1^− is unmatched. If node s^+ connects to node 1^−, a new maximum matching is achieved with three matched pairs. (d) Node 3^− is currently matched, but it is not always matched. Consequently, there is a different matching configuration that preserves the same number of matched pairs while leaving node 3^− unmatched. Therefore, when node s^+ connects to node 3^−, a new maximum matching with three matched pairs can be achieved.

Appendix B.

Assume that node s_1 has only incoming links and node s_2 has only outgoing links. In the corresponding bipartite network, node s_1^+ and node s_2^− have zero degree. Now consider the case in which node s_1 and node s_2 are merged to form a node s with both incoming and outgoing links. In the bipartite network, this corresponds to merging node s_1^+ with node s_2^+, and node s_1^− with node s_2^−. Note that node s_1^+ and node s_2^− cannot be matched in the original bipartite network. Therefore, the merging does not change the number of matched pairs. But the number of nodes in each set is reduced by 1. Hence, N_D decreases by 1 in this merging process, as N_D = number of nodes in a set − number of matched pairs (see Appendix A).

Appendix C.