The evolution of network controllability in growing networks
Rui Zhang a, Xiaomeng Wang a, Ming Cheng b, Tao Jia a
a College of Computer and Information Science, Southwest University, Chongqing, 400715, P. R. China
b School of Rail Transportation, Soochow University, Suzhou, Jiangsu, 215131, P. R. China
Abstract
The study of network structural controllability focuses on the minimum number of driver nodes needed to control a whole network. Despite intensive studies on this topic, most works consider static networks only. It is well known, however, that real networks are growing, with new nodes and links added to the system. Here, we analyze the controllability of evolving networks and propose a general rule for the change of driver nodes. We further apply the rule to solve the problem of network augmentation subject to the controllability constraint. The findings fill a gap in our understanding of network controllability and shed light on the controllability of real systems.
Keywords: network controllability, growing networks, complex networks
1. Introduction
How to control a complex system is one of the most challenging problems in science and engineering, with a long history. In recent years, a significant amount of work has addressed the controllability of complex networks, that is, our ability to drive a network from any initial state to any desired final state within finite time [1, 2, 3]. A general framework based on the structural controllability of linear systems was first proposed to identify the minimum driver node set (MDS) [4], whose control leads to the control of the whole network. Following it, related problems under this framework were also investigated, ranging from the cost of control [5, 6, 7] to the robustness and optimization of controllability [8, 9, 10, 11], from the multiplicity feature in control [12, 13, 14, 15] to the controllability of multi-layer or temporal networks [16, 17, 18, 19], and more [20, 21, 22, 23]. This framework was also applied to different real networked systems, such as financial networks, power networks, social networks, protein-protein interaction networks, disease networks, and gene regulatory networks [24, 25, 26, 27, 28, 29, 30, 31]. Meanwhile, research on different directions of control was also stimulated, such as edge controllability [32, 33], exact controllability [34, 35], strong structural controllability [36] and dominating sets [37, 38, 39], which significantly advanced our understanding of this fundamental problem.
Email address: [email protected] (Tao Jia)
Preprint submitted to Physica A, January 27, 2021 (arXiv, physics.soc-ph)
However, except for a few works considering the temporal feature of networks [40, 41, 42, 43, 44], most of the current advances on network controllability focus on static networks, in which a fixed number of nodes is connected by a fixed number of links that do not change over time. But real networks are growing, with new links and nodes constantly added to the system [45]. To the best of our knowledge, there is no study on the general principle for the change of controllability in growing networks. In this work, we analyze the controllability of evolving networks and propose a general rule for the change of driver nodes under network expansion. This rule allows us to further study a problem of network augmentation subject to controllability constraints [43]. The maximum number of new nodes that can be added to the network while keeping the controllability unchanged is difficult to obtain. However, upper and lower bounds for this number can be efficiently identified. These bounds are also affected by different types of degree correlations in directed networks.
In the following discussion, we will briefly review the framework for identifying minimum driver nodes and a node classification scheme based on the multiplicity feature in choosing driver nodes. With these basic concepts, we propose a general rule for the change of driver nodes when a new node is added and connected to existing nodes in the network. Finally, we use this rule to solve a problem of maintaining network controllability while adding new nodes to the system.
2. Results
A dynamical system is controllable if it can be driven from any initial state to any desired final state within finite time. In many systems, altering the state of a few nodes is sufficient to drive the dynamics of the whole network. For a linear time-invariant system, the minimum driver node set (MDS) can be identified efficiently [4]. First, the directed network is converted into a bipartite graph by splitting each node of the directed network into two nodes, forming two disjoint sets of + and − nodes. Consequently, a directed link from node $i$ to node $j$ in the directed network becomes a link from node $i^+$ to node $j^-$ in the bipartite graph. Then the maximum matching [46, 47, 48] of the bipartite graph is identified, in which a node can be matched to at most one other node via one link. The unmatched nodes in the − set are the driver nodes. By imposing properly chosen signals on these $N_D$ driver nodes of the MDS, we gain control over the whole system.
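The matching procedure described above can be sketched in a few lines. This is a minimal illustration and not the implementation used in the paper: it uses a simple augmenting-path search instead of the Hopcroft-Karp algorithm, the function names `maximum_matching` and `driver_nodes` are hypothetical, nodes are labeled from 0, and the two toy networks are made up for illustration (they are not the network of Fig. 1).

```python
def maximum_matching(n, edges):
    """Maximum matching of the bipartite representation of a directed
    network with nodes 0..n-1. Returns a dict mapping each matched node j
    (its '-' copy) to the node i whose '+' copy it is matched with."""
    out_links = [[] for _ in range(n)]
    for i, j in edges:
        out_links[i].append(j)
    matched_to = {}  # j (in the '-' set) -> i (in the '+' set)

    def augment(i, visited):
        # Try to match i+ to some j-, re-matching earlier pairs if needed.
        for j in out_links[i]:
            if j in visited:
                continue
            visited.add(j)
            if j not in matched_to or augment(matched_to[j], visited):
                matched_to[j] = i
                return True
        return False

    for i in range(n):
        augment(i, set())
    return matched_to


def driver_nodes(n, edges):
    """Unmatched nodes in the '-' set, i.e. one minimum driver node set."""
    matched_to = maximum_matching(n, edges)
    return [j for j in range(n) if j not in matched_to]


# Hypothetical toy networks (assumptions for illustration only).
path = [(0, 1), (1, 2)]   # 0 -> 1 -> 2
star = [(0, 1), (0, 2)]   # 0 -> 1 and 0 -> 2
print(driver_nodes(3, path))  # [0]
print(driver_nodes(3, star))  # [0, 2]
```

For the star, the set {0, 1} would be an equally valid MDS; the sketch returns only the one produced by the particular matching it happens to find.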
Figure 1: (a) A directed network with five nodes. (b) The directed network in (a) can be converted into a bipartite network by splitting each node into two nodes. The maximum matching is then performed in the bipartite network, leaving nodes $1^-$, $4^-$ and $5^-$ unmatched. (c) One minimum driver node set (MDS) is obtained from the maximum matching in (b). The whole network is controllable by controlling nodes 1, 4 and 5. (d) Node classification based on the likelihood of being a driver node. Node 1 is critical, node 2 is redundant and nodes 3, 4, 5 are intermittent.
The number of driver nodes necessary and sufficient for control, $N_D$, is fixed for a given network. However, there are multiple choices of MDSs with the same $N_D$ (Fig. 1a), giving rise to a multiplicity feature [12, 13]. Correspondingly, a node classification scheme is proposed based on a node's likelihood of being in the MDS. A node may appear in all MDSs. Such a node is critical, because the network cannot be brought under control without controlling it. A node may not appear in any MDS. Such a node is redundant, as it never requires an external input. The remaining nodes, which appear in some but not all MDSs, are intermittent. It has been found that a node is critical if and only if it has no incoming links [12]. Hence, the fraction of critical nodes in the network ($n_c$) is solely determined by the degree distribution and equals the fraction of nodes with zero in-degree. The redundant nodes in a network can be identified by an algorithm with $O(LN)$ complexity. The intermittent nodes are therefore readily known once the critical and redundant nodes are identified.
An aim of this work is to answer the question of how the number of driver nodes $N_D$ changes when a new node is added to the network with new links pointing to or from the existing nodes.
For simplicity, we separately analyze the two cases in which the new node has only incoming links or only outgoing links. Indeed, the correlation between a node's in- and out-degree does not affect the overall controllability [49]. Therefore, the general case of adding a new node with both incoming and outgoing links can be considered as a process that adds one node with only outgoing links and one node with only incoming links, and then merges these two nodes into a single node.
We first consider adding to a network a single node that has only outgoing links. Our conclusion is that if at least one new link is connected to a non-redundant node (either a critical node or an intermittent node, denoted by NR node for short) in the original network, the number of driver nodes stays the same. Otherwise, if all links are connected to redundant nodes (denoted by R nodes for short) in the original network, the number of driver nodes increases by 1. While we give the detailed proof in the Appendix (see Appendix A), this conclusion can be intuitively understood as follows. A node without incoming links always requires an independent external signal to control. Hence when this node is added to a network, the number of driver nodes will either increase by 1 or stay unchanged, depending on whether an existing driver node in the original network becomes a non-driver node after the new node is added. Since a redundant node can never be a driver node, linking to it does not change the original number of external signals. In this case, $N_D$ increases by 1. In contrast, a critical node is always a driver node, and an intermittent node can be a driver node in some circumstances.
Connecting to these two types of nodes can save one original signal, making $N_D$ stay the same.
The situation in which a single node with only incoming links is added to a network can be analyzed in a similar way by introducing the transpose network, in which the direction of all links in the original network is reversed. The value of $N_D$ is the same in both the original network and the transpose network. Therefore, the question of how $N_D$ changes when a node with only incoming links is added is equivalent to the question of how $N_D$ changes when a node with only outgoing links is added to the transpose network. Correspondingly, we can first identify a node's category in the transpose network, i.e. identify whether a node is redundant or non-redundant in the transpose network. Then we can apply the result above and conclude that if at least one new link is from a non-redundant node in the transpose network (denoted by $NR^T$ node for short), $N_D$ stays the same. Otherwise, if all links are from redundant nodes in the transpose network (denoted by $R^T$ nodes for short), $N_D$ increases by 1.
To quickly summarize, when a node with only outgoing links is added, there are two options for the change of $N_D$: (1) $N_D$ increases by 1 if all links are connected to R nodes; (2) $N_D$ stays the same if at least one link is connected to an NR node. Likewise, when a node with only incoming links is added, there are also two options for the change of $N_D$: (1) $N_D$ increases by 1 if all links are from $R^T$ nodes; (2) $N_D$ stays the same if at least one link is from an $NR^T$ node. As mentioned above, the general case of adding a new node with both incoming and outgoing links can be considered as a process that adds one node with only outgoing links and one node with only incoming links, and then merges these two nodes into a single node. It is also noteworthy that $N_D$ decreases by 1 during this particular merging process (see Appendix B).
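The rule for a new node with only outgoing links can be checked numerically on a small example. The sketch below is an illustration under stated assumptions, not the authors' code: it computes $N_D$ as the number of nodes minus the number of matched pairs (the definition used in Appendix A), decides whether a node is NR by testing whether a maximum matching of the same size exists once all of that node's incoming links are removed, and uses brute-force recomputation rather than the $O(LN)$ algorithm mentioned earlier. All function names and the five-node network are hypothetical.

```python
def matching_size(n, edges):
    """Number of matched pairs in a maximum matching of the bipartite
    representation of a directed network with nodes 0..n-1."""
    out_links = [[] for _ in range(n)]
    for i, j in edges:
        out_links[i].append(j)
    match = [-1] * n  # match[j] = i if i+ is matched to j-

    def augment(i, visited):
        for j in out_links[i]:
            if j not in visited:
                visited.add(j)
                if match[j] == -1 or augment(match[j], visited):
                    match[j] = i
                    return True
        return False

    return sum(augment(i, set()) for i in range(n))


def num_drivers(n, edges):
    """N_D = number of nodes - number of matched pairs."""
    return n - matching_size(n, edges)


def non_redundant(n, edges):
    """Nodes whose '-' copy is unmatched in at least one maximum matching."""
    full = matching_size(n, edges)
    return {v for v in range(n)
            if matching_size(n, [(a, b) for a, b in edges if b != v]) == full}


def predicted_change_out_only(n, edges, targets):
    """Predicted change of N_D when a node with only outgoing links to
    `targets` is added: 0 if any target is NR, otherwise +1."""
    nr = non_redundant(n, edges)
    return 0 if any(t in nr for t in targets) else 1


# Hypothetical five-node network (an assumption, not the one in Fig. 2).
n, edges = 5, [(0, 1), (1, 2), (1, 3), (3, 4)]
print(sorted(non_redundant(n, edges)))  # [0, 2, 3]
for targets in ([1], [2], [1, 4]):
    new_edges = edges + [(n, t) for t in targets]
    actual = num_drivers(n + 1, new_edges) - num_drivers(n, edges)
    print(targets, predicted_change_out_only(n, edges, targets), actual)
```

The case of a new node with only incoming links can be handled with the same functions by passing the reversed edge list, which corresponds to the transpose network.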
Taken together, we have the general conclusion on the change of controllability as follows:
- Identify the category of all existing nodes, i.e. R or NR in the original network, and $R^T$ or $NR^T$ in the transpose network (Fig. 2a).
- $N_D$ increases by 1 if all out-links are connected to R nodes and all in-links are from $R^T$ nodes (Fig. 2b).
- $N_D$ stays the same if all out-links are connected to R nodes and at least one in-link is from an $NR^T$ node, or at least one out-link is connected to an NR node and all in-links are from $R^T$ nodes (Fig. 2c).
- $N_D$ decreases by 1 if at least one out-link is connected to an NR node and at least one in-link is from an $NR^T$ node (Fig. 2d).
Figure 2: (a) A network with five nodes that can be controlled via three driver nodes ($N_D = 3$). The category of the node, i.e. R or NR in the original network, and $R^T$ or $NR^T$ in the transpose network, is labeled for each of the five nodes. When a new node with one outgoing link and one incoming link is added to the network, (b) $N_D$ increases by 1 when the out-link connects to an R node and the in-link is from an $R^T$ node. (c) $N_D$ stays the same when the out-link connects to an NR node and the in-link is from an $R^T$ node. (d) $N_D$ decreases by 1 when the out-link connects to an NR node and the in-link is from an $NR^T$ node.
A recent study raises an interesting problem about network augmentation: what is the maximum number of nodes that can be added to a network while keeping $N_D$ unchanged [43]? The problem is posed under several constraints so that some trivial solutions are excluded. First, each new node has only one outgoing link. The restriction to one link slightly simplifies the problem. But it also means that the new nodes are not able to form a cycle. Second, the new nodes are not allowed to connect to critical nodes, i.e. the nodes in the network with zero in-degree.
Therefore, the trivial solution in which new nodes connect one after another to form a directed path is excluded. Finally, the MDS needed to control the original network is recorded and the new nodes are not allowed to connect to any node in the original MDS. Indeed, one trivial solution of this problem is to connect the new node to a node in the MDS to keep $N_D$ unchanged. This constraint excludes that trivial solution, and it significantly increases the difficulty of the problem, as explained later.
The problem itself has several implications for real systems [43, 50], which are not the focus of our work. We are interested in identifying the maximum number of nodes $N_a$ that can be added to the network in this problem. One intuitive answer is $N_a = N_D - N_c$, where $N_c$ is the number of critical nodes. The reason is as follows. Because a new node has zero in-degree, it always needs to be controlled and becomes a driver node once it is added. To keep $N_D$ the same, we can add at most $N_D$ of these new nodes. Because new nodes are not allowed to connect to nodes with zero in-degree, the number of critical nodes in the original network should be deducted, which yields the answer $N_a = N_D - N_c$.
However, this solution does not take the third constraint into account. Because a new node cannot connect to nodes in the original MDS, the way the original network is controlled affects $N_a$. Indeed, our conclusion on the change of controllability shows that a new node has to connect to an NR node to keep $N_D$ the same. But if the NR node is also in the MDS, it is not allowed to be connected. Fig. 3 shows an example of how the original MDS affects $N_a$.
Hence, $N_D - N_c$ is the upper bound of $N_a$, but $N_a$ in many cases can be less than $N_D - N_c$. The exact value of $N_a$ turns out to be highly non-trivial, related to solving an integer programming problem. But based on the principle identified, we can use a greedy algorithm to find a local maximum, denoted by $N_a^o$, which represents the lower bound of $N_a$.
The idea is to identify an NR node that is not in the MDS and connect the new node to this NR node. The algorithm (see Appendix C) takes $O(NL)$ time to identify a set of $N_a^o$ nodes for a given choice of the MDS used to control the original network. We find that $N_a^o$ identified using our algorithm can be considerably smaller than $N_D - N_c$. This difference varies non-monotonically with the average degree of the network $\langle k \rangle$ and reaches its peak at an intermediate value of $\langle k \rangle$ (Fig. 4a).
$N_a^o$ depends on the particular choice of MDS, and there are multiple MDSs for a given network. To take this multiplicity feature into account, we apply the random sampling method [13] to generate a collection of random MDSs,
Figure 3: An example of how the original MDS can affect $N_a$. (a) Nodes 1, 3, 4, 5 form the MDS. After adding a new node s connecting to node 6, node 2 remains an NR node, allowing an extra node to be added. In this case, $N_a = 2$. (b) Nodes 1, 2, 3, 5 form the MDS. After adding a new node s connecting to node 6, there is no NR node that is not included in the original MDS. In this case, $N_a = 1$.
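The greedy idea described in the text (repeatedly connect a new node to an NR node that is neither critical nor in the recorded MDS) can be sketched as follows. This is only an illustrative sketch under stated assumptions and is not the algorithm of Appendix C: it re-classifies nodes by brute force after every addition, so it does not achieve the $O(NL)$ complexity quoted above, it simply picks the first admissible node it finds, and the function names and the five-node test network are hypothetical.

```python
def matching_size(n, edges):
    """Number of matched pairs in a maximum matching of the bipartite
    representation of a directed network with nodes 0..n-1."""
    out_links = [[] for _ in range(n)]
    for i, j in edges:
        out_links[i].append(j)
    match = [-1] * n  # match[j] = i if i+ is matched to j-

    def augment(i, visited):
        for j in out_links[i]:
            if j not in visited:
                visited.add(j)
                if match[j] == -1 or augment(match[j], visited):
                    match[j] = i
                    return True
        return False

    return sum(augment(i, set()) for i in range(n))


def greedy_augmentation(n, edges, mds):
    """Count how many nodes with a single outgoing link a greedy procedure
    can add without changing N_D, given the recorded original MDS `mds`."""
    edges = list(edges)
    added = 0
    while True:
        total = n + added
        has_in_link = {j for _, j in edges}
        full = matching_size(total, edges)
        target = None
        # New nodes have zero in-degree, so only original nodes qualify.
        for v in range(n):
            if v in mds or v not in has_in_link:
                continue  # excluded: in the original MDS, or critical
            without_v = [(a, b) for a, b in edges if b != v]
            if matching_size(total, without_v) == full:
                target = v  # v is NR in the current network
                break
        if target is None:
            return added
        edges.append((total, target))  # the new node gets label `total`
        added += 1


# Hypothetical five-node network with N_D = 2 and one critical node (node 0).
edges = [(0, 1), (1, 2), (1, 3), (3, 4)]
print(greedy_augmentation(5, edges, mds={0, 3}))  # 1
print(greedy_augmentation(5, edges, mds={0, 2}))  # 1
```

In this toy case both choices of MDS give the same count, which also equals the upper bound $N_D - N_c = 1$; a network in the spirit of Fig. 3 would be needed to see the dependence on the MDS.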
This work is supported by the Natural Science Foundation of China (No. 6160309). M. C. is also supported by the Natural Science Foundation of Jiangsu Province (No. BK20150344) and the China Postdoctoral Science Foundation (No. 2016M601885).
Appendix A.
Our conclusion says that if at least one new link is connected to a non-redundant node (either a critical node or an intermittent node, denoted by NR node for short) in the original network, the number of driver nodes stays the same. Otherwise, if all links are connected to redundant nodes (denoted by R nodes for short) in the original network, the number of driver nodes increases by 1. The proof of this conclusion is best described in a bipartite graph. Therefore, we change the terminology from the "driver node in a directed network" to the "matched or unmatched node in the − set of a bipartite graph". In particular, there are several equivalent terms.
number of driver nodes = number of nodes in a set (either + or −) − number of matched pairs (or number of matched nodes in a set)
redundant node = always matched node in the − set of a bipartite graph, i.e. the node is matched in all different maximum matching configurations
non-redundant node = not always matched node in the − set of a bipartite graph. This includes a node that is not matched in the current maximum matching configuration, and a node that is currently matched but can be unmatched in a different maximum matching configuration
Now let us consider the case when a new node $s$ with one outgoing link is added to a directed network (Fig. A.6a). In the bipartite graph representation, this adds a node $s^-$ with no links and a node $s^+$ with one link. Assuming that the node that $s^+$ connects to is $t^-$, there are 3 different situations:
1. Node $t^-$ is unmatched. Then a new maximum matching is achieved by matching node $s^+$ and node $t^-$ (Fig. A.6b).
2. Node $t^-$ is matched in the current maximum matching configuration, but it is not always matched. Because node $t^-$ is not always matched, there are configurations in which node $t^-$ is unmatched but which yield the same number of matched pairs. Then we can always change the matching configuration to one in which node $t^-$ is unmatched and then match the pair node $s^+$ and node $t^-$. In the terminology of the Hopcroft-Karp algorithm that is applied to find the maximum matching in a bipartite graph [46], this means that there exists an augmenting path that goes through node $s^+$ and node $t^-$.
3. Node $t^-$ is matched and is always matched. In this case, there is no maximum matching configuration with $t^-$ unmatched. Hence, node $s^+$ and node $t^-$ cannot be matched. In other words, there is no augmenting path that goes through node $s^+$ and node $t^-$.
In cases 1 and 2 (connecting to a node that is not always matched), the number of matched pairs in the maximum matching increases by 1, which offsets the increase of the total number of nodes. Consequently, the number of driver nodes stays the same. In case 3 (connecting to an always matched node), the number of matched pairs stays the same while the total number of nodes increases by 1. Consequently, the number of driver nodes increases by 1.
Figure A.6: (a) A new node $s$ with one outgoing link is added to a directed network with five nodes. (b) The corresponding bipartite network with a maximum matching obtained. Nodes $1^-$, $4^-$ and $5^-$ are unmatched in the original network. (c) Node $1^-$ is unmatched. If node $s^+$ connects to node $1^-$, a new maximum matching is achieved with three matched pairs. (d) Node $3^-$ is currently matched, but it is not always matched. Consequently, there is a different matching configuration that preserves the same number of matched pairs while leaving node $3^-$ unmatched. Therefore, when node $s^+$ connects to node $3^-$, a new maximum matching with three matched pairs can be achieved.
Appendix B.
Assume that node $s_1$ has only incoming links and node $s_2$ has only outgoing links. In the corresponding bipartite network, node $s_1^+$ and node $s_2^-$ have zero degree. Now consider the case that node $s_1$ and node $s_2$ are merged together to form a node $s$ with both incoming and outgoing links. In the bipartite network, this corresponds to merging node $s_1^+$ with node $s_2^+$, and node $s_1^-$ with node $s_2^-$. Note that node $s_1^+$ and node $s_2^-$ cannot be matched in the original bipartite network. Therefore, the merging does not change the number of matched pairs. But the number of nodes in each set is reduced by 1. Hence, $N_D$ decreases by 1 in this merging process, as $N_D$ = number of nodes in a set − number of matched pairs (see Appendix A).
Appendix C.