Multi-Swarm Herding: Protecting against Adversarial Swarms
arXiv preprint [cs.MA], Jul
Vishnu S. Chipade and Dimitra Panagou
Abstract: This paper studies a defense approach against one or more swarms of adversarial agents. In our earlier work, we employed a closed formation (‘StringNet’) of defending agents (defenders) around a swarm of adversarial agents (attackers) to confine their motion within given bounds and guide them to a safe area. That control design relies on the assumption that the adversarial agents remain close enough to each other, i.e., within a prescribed connectivity region. To handle situations in which the attackers no longer stay within such a connectivity region, but rather split into smaller swarms (clusters) to maximize the chance or impact of attack, this paper proposes an approach to identify the attacking sub-swarms and reassign defenders to them. We use the ‘Density-Based Spatial Clustering of Applications with Noise (DBSCAN)’ algorithm to identify the spatially distributed swarms of the attackers. Then, the defenders are assigned to each identified swarm of attackers by solving a constrained generalized assignment problem. Simulations are provided to demonstrate the effectiveness of the approach.
I. Introduction
Swarms of low-cost agents such as small aerial robots may pose a risk to safety-critical infrastructure such as government facilities, airports, and military bases. Interception strategies [1], [2] against these threats may not be feasible or desirable in an urban environment because they pose greater risks to humans and the surrounding infrastructure. Under the assumption of risk-averse and self-interested adversarial agents (attackers) that tend to move away from the defending agents (defenders) and from other dynamic objects, herding can be used as an indirect way of guiding the attackers to a safe area.

In our recent work [3], [4], we developed a herding algorithm, called ‘StringNet Herding’, to herd a swarm of adversarial attackers away from a safety-critical (protected) area. A closed formation (‘StringNet’) of defending agents connected by string barriers is formed around a swarm of attackers that stay together, in order to confine their motion within given bounds and guide them to a safe area. However, the assumption that the attackers will stay together in a connectivity region, and that they will react to the defenders collectively as a single swarm while attacking the protected area, can be quite conservative in practice.

In this paper, we build upon our earlier work on ‘StringNet Herding’ [4] and study the problem of defending a safety-critical (protected) area from adversarial agents that may or may not stay together. We propose a ‘Multi-Swarm StringNet Herding’ approach that uses clustering-based defender assignment and the ‘StringNet Herding’ method to herd the adversarial attackers to known safe areas.

The authors are with the Department of Aerospace Engineering, University of Michigan, Ann Arbor, MI, USA; (vishnuc, dpanagou)@umich.edu

This work has been funded by the Center for Unmanned Aircraft Systems (C-UAS), a National Science Foundation Industry/University Cooperative Research Center (I/UCRC) under NSF Award No. 1738714 along with significant contributions from C-UAS industry members.
1) Related work:
Several approaches have been proposed to solve the problem of herding. Some examples are: the $n$-wavefront algorithm [5], [6], where the motion of the birds on the boundary of the flock is influenced based on the locations of the airport and the safe area; herding via formation control based on a potential-field approach [7]; biologically inspired "wall" and "encirclement" methods that dolphins use to capture a school of fish [8]; an RRT approach that finds a motion plan for the agents while maintaining a cage of potentials around the sheep [9]; and sequential switching among the chased targets [10]. In general, the above approaches suffer from one or more of the following: 1) dependence on knowing the analytical model of the attackers’ motion, 2) lack of modeling of the adversarial agents’ intent to reach or attack a certain protected area, 3) simplified motion and environment models. The proposed ‘StringNet Herding’ approach relaxes the first and the third issues above, and takes the second one into account in the control design.

Clustering of data points is a popular machine learning technique [11]. There are various categories of clustering algorithms: 1) partition based (K-means [12]), 2) hierarchy based (BIRCH [13]), 3) density based (DBSCAN [14]), 4) stream based (STREAM [15]), 5) graph theory based (CLICK [16]). Spatial proximity of the agents is crucial for the problem at hand, so our focus in this paper is mostly on the density-based approaches.

Assignment problems have also been studied extensively [17]. In this paper, we are interested in a generalized assignment problem (GAP) [18], in which there are more objects than knapsacks to be filled. GAP is known to be NP-hard, but approximation algorithms exist to solve an arbitrary instance of GAP [18].
2) Overview of the proposed approach:
The proposed approach involves: 1) identification of the clusters (swarms) of the attackers that stay together, 2) distribution and assignment of the defenders to each of the identified swarms of the attackers, 3) use of the ‘StringNet Herding’ approach by the defenders to herd each identified swarm of attackers to the closest safe area.

More specifically, we use the “Density-Based Spatial Clustering of Applications with Noise (DBSCAN)” algorithm [14] to identify the swarms of the attackers, in which each attacker stays in close proximity to the other attackers in the same swarm. We then formulate a generalized assignment problem with additional constraints on the connectivity of the defenders to determine which defender should go against which swarm of attackers and herd it to one of the safe areas. This connectivity constrained generalized assignment problem (C2GAP) is modeled as a mixed integer quadratically constrained program (MIQCP) to obtain an optimal assignment. We also provide a hierarchical algorithm to find the assignment quickly, which, along with the MIQCP formulation, is the major contribution of this paper.
3) Structure of the paper:
Section II describes the mathematical modeling and problem statement. The StringNet herding approach is briefly discussed in Section III. The approach to clustering and the defenders-to-attackers assignment for multiple-swarm herding is discussed in Section IV. Simulations and conclusions are provided in Sections V and VI, respectively.
II. Modeling and Problem Statement
Notation: The set of integers greater than 0 is denoted by $\mathbb{Z}_{>0}$. Vectors and matrices are denoted by small and capital bold letters, respectively (e.g., $\mathbf{r}$, $\mathbf{P}$). $\|\cdot\|$ denotes the Euclidean norm of its argument. $|\cdot|$ denotes the absolute value of a scalar, and the cardinality if the argument is a set. $n!$ is the factorial of $n$.

We consider $N_a$ attackers $\mathcal{A}_i$, $i \in I_a = \{1, 2, ..., N_a\}$, and $N_d$ defenders $\mathcal{D}_j$, $j \in I_d = \{1, 2, ..., N_d\}$, operating in a 2D environment $\mathcal{W} \subseteq \mathbb{R}^2$ that contains a protected area $\mathcal{P} \subset \mathcal{W}$, defined as $\mathcal{P} = \{\mathbf{r} \in \mathbb{R}^2 \mid \|\mathbf{r} - \mathbf{r}_p\| \le \rho_p\}$, and $N_s$ safe areas $\mathcal{S}_m \subset \mathcal{W}$, defined as $\mathcal{S}_m = \{\mathbf{r} \in \mathbb{R}^2 \mid \|\mathbf{r} - \mathbf{r}_{sm}\| \le \rho_{sm}\}$, for all $m \in I_s = \{1, 2, ..., N_s\}$, where $(\mathbf{r}_p, \rho_p)$ and $(\mathbf{r}_{sm}, \rho_{sm})$ are the centers and radii of the corresponding areas, respectively. The number of defenders is no less than that of attackers, i.e., $N_d \ge N_a$. The agents $\mathcal{A}_i$ and $\mathcal{D}_j$ are modeled as discs of radii $\rho_a$ and $\rho_d \le \rho_a$, respectively, and move under double integrator (DI) dynamics with quadratic drag:

$$\dot{\mathbf{r}}_{ai} = \mathbf{v}_{ai}, \quad \dot{\mathbf{v}}_{ai} = \mathbf{u}_{ai} - C_D \|\mathbf{v}_{ai}\| \mathbf{v}_{ai}; \tag{1a}$$
$$\dot{\mathbf{r}}_{dj} = \mathbf{v}_{dj}, \quad \dot{\mathbf{v}}_{dj} = \mathbf{u}_{dj} - C_D \|\mathbf{v}_{dj}\| \mathbf{v}_{dj}; \tag{1b}$$
$$\|\mathbf{u}_{ai}\| \le \bar{u}_a, \quad \|\mathbf{u}_{dj}\| \le \bar{u}_d; \tag{1c}$$

where $C_D$ is the drag coefficient; $\mathbf{r}_{ai} = [x_{ai} \; y_{ai}]^T$ and $\mathbf{r}_{dj} = [x_{dj} \; y_{dj}]^T$ are the position vectors of $\mathcal{A}_i$ and $\mathcal{D}_j$, respectively; $\mathbf{v}_{ai} = [v_{x_{ai}} \; v_{y_{ai}}]^T$ and $\mathbf{v}_{dj} = [v_{x_{dj}} \; v_{y_{dj}}]^T$ are the velocity vectors; and $\mathbf{u}_{ai} = [u_{x_{ai}} \; u_{y_{ai}}]^T$ and $\mathbf{u}_{dj} = [u_{x_{dj}} \; u_{y_{dj}}]^T$ are the accelerations (the control inputs). The defenders are assumed to be faster than the attackers, i.e., $\bar{u}_a < \bar{u}_d$. This model imposes a speed bound on each player with limited acceleration control, i.e., $v_{ai} = \|\mathbf{v}_{ai}\| < \bar{v}_a = \sqrt{\bar{u}_a / C_D}$ and $v_{dj} = \|\mathbf{v}_{dj}\| < \bar{v}_d = \sqrt{\bar{u}_d / C_D}$.

We assume that every defender $\mathcal{D}_j$ senses the position $\mathbf{r}_{ai}$ and velocity $\mathbf{v}_{ai}$ of the attacker $\mathcal{A}_i$ when $\mathcal{A}_i$ is inside a circular sensing zone $\mathcal{Z}^s_d = \{\mathbf{r} \in \mathbb{R}^2 \mid \|\mathbf{r} - \mathbf{r}_p\| \le \rho^s_d\}$.
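To illustrate the speed bound implied by the dynamics (1), the following Python sketch integrates the attacker dynamics (1a) with a forward-Euler scheme under full thrust in a fixed direction and compares the largest speed reached with $\sqrt{\bar{u}_a / C_D}$. The function names, the step size, and the numerical values of $\bar{u}_a$ and $C_D$ are illustrative assumptions and are not taken from the paper.

```python
import math


def step(r, v, u, c_drag, dt):
    """One forward-Euler step of r' = v, v' = u - C_D * ||v|| * v."""
    speed = math.hypot(v[0], v[1])
    ax = u[0] - c_drag * speed * v[0]
    ay = u[1] - c_drag * speed * v[1]
    r_next = (r[0] + dt * v[0], r[1] + dt * v[1])
    v_next = (v[0] + dt * ax, v[1] + dt * ay)
    return r_next, v_next


def saturate(u, u_bar):
    """Scale u so that ||u|| <= u_bar, as required by (1c)."""
    norm = math.hypot(u[0], u[1])
    if norm <= u_bar or norm == 0.0:
        return u
    return (u[0] * u_bar / norm, u[1] * u_bar / norm)


def max_speed_reached(u_bar, c_drag, dt=1e-3, t_end=60.0):
    """Largest speed seen while thrusting at the acceleration limit along x."""
    r, v = (0.0, 0.0), (0.0, 0.0)
    u = saturate((10.0 * u_bar, 0.0), u_bar)  # request is clipped to u_bar
    top = 0.0
    for _ in range(int(t_end / dt)):
        r, v = step(r, v, u, c_drag, dt)
        top = max(top, math.hypot(v[0], v[1]))
    return top


# Illustrative values only (assumptions, not from the paper).
u_bar, c_drag = 2.0, 0.5
v_bar = math.sqrt(u_bar / c_drag)  # analytical speed bound
top = max_speed_reached(u_bar, c_drag)
print(f"largest simulated speed: {top:.6f}, bound: {v_bar:.6f}")
```

Under these assumptions the simulated speed approaches the bound from below, because the drag term balances the maximum thrust exactly at $\bar{v}_a$.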
Each attacker $\mathcal{A}_i$ has a similar local sensing zone $\mathcal{Z}^s_{ai} = \{\mathbf{r} \in \mathbb{R}^2 \mid \|\mathbf{r} - \mathbf{r}_{ai}\| \le \rho^s_{ai}\}$ inside which it senses the defenders’ positions and velocities.

The attackers aim to reach the protected area $\mathcal{P}$. The attackers may use flocking controllers [19] to stay together, or they may choose to split into different smaller swarms [20], [21]. The defenders aim to herd each of these attackers to one of the safe areas in $\mathcal{S} = \{\mathcal{S}_1, \mathcal{S}_2, ..., \mathcal{S}_{N_s}\}$ before they reach $\mathcal{P}$. We consider the following problems.

Problem 1: Identify the swarms $\{\mathcal{A}_{c_1}, \mathcal{A}_{c_2}, ..., \mathcal{A}_{c_{N_{ac}}}\}$ of the attackers, for some unknown $N_{ac} \ge 1$, such that the attackers in the same swarm $\mathcal{A}_{c_k}$, and only them, satisfy prescribed conditions on spatial proximity.

Problem 2: Find subgroups $\{\mathcal{D}_{c_1}, \mathcal{D}_{c_2}, ..., \mathcal{D}_{c_{N_{ac}}}\}$ of the defenders and their assignment to the attackers’ swarms identified in Problem 1, such that all the defenders in the same subgroup are connected via string barriers to enclose and herd the assigned attackers’ swarm.

III. Herding a Single Swarm of Attackers
To herd a swarm of attackers to $\mathcal{S}$, we use ‘StringNet Herding’, developed in [4]. StringNet is a closed net of strings formed by the defenders, as shown in Fig. 1. The strings are realized as impenetrable and extendable line barriers (e.g., a spring-loaded pulley and a rope, or another similar mechanism [22]) that prevent attackers from passing through them. The extendable string barrier allows free relative motion of the two defenders connected by the string. The string barrier can have a maximum length of $\bar{R}_s$. If the string barrier is a physical one, then it can be established between two defenders $\mathcal{D}_j$ and $\mathcal{D}_{j'}$ only when they are close to each other and have almost the same velocity, i.e., $\|\mathbf{r}_{dj} - \mathbf{r}_{dj'}\| \le R_s < \bar{R}_s$ and $\|\mathbf{v}_{dj} - \mathbf{v}_{dj'}\| \le \epsilon$, where $R_s$ and $\epsilon$ are small numbers. The underlying graph structures of the two different “StringNet” formations, for a subset of defenders $\mathcal{D}' = \{\mathcal{D}_j \mid j \in I'_d\}$, where $I'_d \subseteq I_d$, are defined as follows:

Definition 1: The Closed-StringNet $\mathcal{G}^s_{cl}(I'_d) = (\mathcal{V}^s_{cl}(I'_d), \mathcal{E}^s_{cl}(I'_d))$ is a cycle graph consisting of: 1) a subset of defenders as the vertices, $\mathcal{V}^s_{cl}(I'_d) = \{\mathcal{D}_j \mid j \in I'_d\}$, 2) a set of edges, $\mathcal{E}^s_{cl}(I'_d) = \{(\mathcal{D}_j, \mathcal{D}_{j'}) \in \mathcal{V}^s_{cl}(I'_d) \times \mathcal{V}^s_{cl}(I'_d) \mid \mathcal{D}_j \overset{s}{\longleftrightarrow} \mathcal{D}_{j'}\}$, where the operator $\overset{s}{\longleftrightarrow}$ denotes an impenetrable line barrier between the defenders.

Definition 2:
The Open-StringNet $\mathcal{G}^s_{op}(I'_d) = (\mathcal{V}^s_{op}(I'_d), \mathcal{E}^s_{op}(I'_d))$ is a path graph consisting of: 1) a set of vertices, $\mathcal{V}^s_{op}(I'_d)$, and 2) a set of edges, $\mathcal{E}^s_{op}(I'_d)$, similar to those in Definition 1.

The StringNet herding consists of four phases: 1) gathering, 2) seeking, 3) enclosing, and 4) herding to a safe area. These phases are discussed as follows.

A. Gathering

We assume that the attackers start as a single swarm that stays together; however, they may start splitting into smaller groups as they sense the defenders in their path. The aim of the defenders is to converge to an open formation $\mathscr{F}^g_d$ centered at the gathering center $\mathbf{r}_{df^g}$, located on the expected path of the attackers, before the attackers reach $\mathbf{r}_{df^g}$. Here the expected path is defined as the shortest path of the attackers to the protected area. Let $R_d(N_a) : \mathbb{Z}_{>0} \to \mathbb{Z}_{>0}$ be the resource allocation function that outputs the number of the defenders that can be assigned to the given $N_a$ attackers. The open formation $\mathscr{F}^g_d$ is characterized by the positions $\xi^g_l$, for all $l \in I_{dc} = \{1, 2, ..., R_d(N_a)\}$, as shown in Fig. 1. Once the defenders arrive at these positions, they get connected by strings as follows: the defender at $\xi^g_l$ gets connected to the defender at $\xi^g_{l+1}$ for all $l \in \{1, 2, ..., R_d(N_a) - 1\}$ (see Fig. 1). The formation $\mathscr{F}^g_d$ is chosen to be a straight-line formation, as opposed to the semicircular formation chosen in [4], to allow for the largest blockage in the path of the attackers. The angle made by the normal to the line joining $\xi^g_1$ and $\xi^g_{N_d}$ (clockwise from $\xi^g_1$, see Fig. 1) is the orientation of the formation.
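The graph structures in Definitions 1 and 2, and the chain of string connections described above, can be sketched in Python as follows. This is a minimal illustration: the function names, the list `order` of defender indices, and the thresholds passed to `can_establish_string` are hypothetical and only mirror the conditions stated in the text.

```python
import math


def open_stringnet_edges(order):
    """Path graph: the defender at slot l is linked to the one at slot l + 1."""
    return [(order[l], order[l + 1]) for l in range(len(order) - 1)]


def closed_stringnet_edges(order):
    """Cycle graph: the path edges plus one edge that closes the loop."""
    edges = open_stringnet_edges(order)
    if len(order) > 2:
        edges.append((order[-1], order[0]))
    return edges


def can_establish_string(r1, v1, r2, v2, r_s, eps):
    """Physical-barrier condition: close enough and nearly equal velocities."""
    close = math.dist(r1, r2) <= r_s
    matched = math.dist(v1, v2) <= eps
    return close and matched


# Hypothetical defender indices listed in formation order.
order = [4, 1, 3, 2]
print(open_stringnet_edges(order))
print(closed_stringnet_edges(order))
```

The open formation has one edge fewer than the closed one, which is why the two end defenders of the path are the ones that must meet to complete the enclosure.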
The formation $\mathscr{F}^g_d$ is chosen such that its orientation is toward the attackers on their expected path (defined above); see the blue formation in Fig. 1. The desired positions $\xi^g_l$ on $\mathscr{F}^g_d$ centered at the gathering center $\mathbf{r}_{df^g}$ are:

$$\xi^g_l = \mathbf{r}_{df^g} + \hat{R}_l \, \hat{\mathbf{o}}\!\left(\theta_{df^g} + \tfrac{\pi}{2}\right), \quad \text{for all } l \in I_{dc}; \tag{2}$$

where $\hat{R}_l = \hat{R}^{d,g}_d \left(\frac{N_d - 2l + 1}{2}\right)$, $\hat{\mathbf{o}}(\theta) = [\cos(\theta), \sin(\theta)]^T$ is the unit vector making an angle $\theta$ with the $x$-axis, and $\theta_{df^g} = \theta^*_{a_{cm}} + \pi$, where $\theta^*_{a_{cm}}$ is the angle made with the $x$-axis by the line segment joining the attackers’ center of mass (ACoM) to the center of the protected area (the shortest path from the initial position of the ACoM to $\mathcal{P}$). These positions are static, i.e., $\dot{\xi}^g_l = \ddot{\xi}^g_l = \mathbf{0}$. The gathering center $\mathbf{r}_{df^g} = \rho^g_{df} \, \hat{\mathbf{o}}(\theta_{df^g})$ is such that $\rho^g_{df} > \rho_p$.

We define the defender-goal assignment as:

Definition 3: A bijective mapping $\beta : \{1, 2, ..., R_d(N_a)\} \to I_d$ such that the defender $\mathcal{D}_{\beta(l)}$ is assigned to go to the goal $\xi^g_l$.

As discussed in [4], we design a time-optimal motion plan so that the defenders converge to the formation $\mathscr{F}^g_d$ as early as possible. Given initial positions of the $N_d$ defenders and desired goal positions on the formation $\mathscr{F}^g_d$, we recursively solve a mixed integer quadratic program (MIQP) using the bisection method to find: 1) the best gathering center, if feasible, and 2) the best defender-goal assignment. The MIQP finds the best defender-goal assignment by using: 1) the time information of the time-optimal trajectories obtained for each defender to go from its initial position to any goal position $\xi^g_l$ under bounded acceleration [4], and 2) the information on collisions among all pairs of the time-optimal trajectories. The bisection method is then used to find the best gathering center by comparing the maximum time for the defenders obtained from the MIQP with the minimum time required by the attackers to reach the gathering center.

Footnote (on the choice of the line formation): Completing a circular formation starting from a semicircular formation of the same radius is faster. However, the semicircular formation, for a given length constraint on the string barrier ($\bar{R}_s$), creates a smaller blockage to the attackers than the line formation. It is a trade-off between speed and effectiveness.

Fig. 1: Assignment of defenders to the attackers’ swarms

B. Seeking
After the defenders accomplish gathering, suppose a group of defenders $\mathcal{D}_{c_k} = \{\mathcal{D}_j \mid j \in I_{dc_k}\}$, $I_{dc_k} \subseteq I_d$, is tasked to herd a swarm of attackers $\mathcal{A}_{c_k} = \{\mathcal{A}_i \mid i \in I_{ac_k}\}$, $I_{ac_k} \subseteq I_a$; the details are discussed later in Section IV. Let $\beta_k : \{1, 2, ..., |\mathcal{D}_{c_k}|\} \to I_{dc_k}$ be the mapping that gives the indexing order of the defenders in $\mathcal{D}_{c_k}$ on the Open-StringNet line formation $\mathscr{F}^s_{dc_k}$ (similar to $\mathscr{F}^g_d$). In the seeking phase, the defenders in $\mathcal{D}_{c_k}$ maintain the line formation $\mathscr{F}^s_{dc_k}$ and try to get closer to the swarm of attackers $\mathcal{A}_{c_k}$ by using state-feedback, finite-time convergent, bounded control laws as discussed in [4]. The control actions derived in [4] for the defenders in $\mathcal{D}_{c_k}$ are modified to incorporate collision avoidance with the other StringNet formations formed by $\mathcal{D}_{c_{k'}}$, for $k' \neq k$.

C. Enclosing: Closed-StringNet formation
Once the Open-StringNet formation gets close to the attackers’ formation, the enclosing phase begins, in which the defenders start enclosing the attackers by moving to their desired positions on the enclosing formations while staying connected to their neighbors. We choose two formations for this phase that the defenders achieve sequentially: 1) a semicircular Open-StringNet formation ($\mathscr{F}^{e_{op}}_{dc_k}$), 2) a circular Closed-StringNet formation ($\mathscr{F}^{e_{cl}}_{dc_k}$). If the defenders tried to converge directly to a circular formation from a line formation during this phase, the defenders at either end of the Open-StringNet formation would start coming closer to each other, significantly reducing the length of the overall barrier in the attackers’ path. This is because the desired positions of these terminal defenders in the circular formation would be very close to each other on the opposite side of the circular formation (see Fig. 1), and the collision avoidance part of the controller is only active locally near the circle of maximum radius $\bar{\rho}_{ac_k}$ around the swarm $\mathcal{A}_{c_k}$. So the defenders first converge to a semicircular formation and converge to a circular formation only after the former is achieved.

The desired position $\xi^{e_{op}}_{c_k,l}$ on the Open-StringNet formation $\mathscr{F}^{e_{op}}_{dc_k}$ (Fig. 1) is chosen on the circle with radius $\rho_{sn_k}$ centered at $\mathbf{r}_{\widehat{ac}_k}$ as:

$$\xi^{e_{op}}_{c_k,l} = \mathbf{r}_{\widehat{ac}_k} + \rho_{sn_k} \, \hat{\mathbf{o}}(\theta_l), \quad \text{where } \theta_l = \theta^{e*}_{df_k} + \frac{\pi}{2} + \frac{\pi (l - 1)}{|\mathcal{D}_{c_k}| - 1}, \tag{3}$$

for all $l \in \{1, 2, ..., |\mathcal{D}_{c_k}|\}$, where $\theta^{e*}_{df_k} = \theta^{s*}_{df_k}$. The center is $\mathbf{r}_{\widehat{ac}_k} = \mathbf{r}_{ac_k} + \tilde{\mathbf{r}}_{\widehat{ac}_k}$, where $\tilde{\mathbf{r}}_{\widehat{ac}_k}$ is the position of the centroid of the convex hull of the position coordinates of the attackers in $\mathcal{A}_{c_k}$ relative to the center of mass $\mathbf{r}_{ac_k} = \frac{\sum_{i \in I_{ac_k}} \mathbf{r}_{ai}}{|\mathcal{A}_{c_k}|}$ of $\mathcal{A}_{c_k}$ at the latest time when the swarm $\mathcal{A}_{c_k}$ was identified. The radius $\rho_{sn_k}$ should satisfy $\bar{\rho}_{ac_k} + b_d < \rho_{sn_k}$, where $\bar{\rho}_{ac_k}$ is the maximum radius of swarm $\mathcal{A}_{c_k}$.
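As an illustration of Eq. (3), the sketch below computes goal positions on the semicircular formation, assuming that the angles $\theta_l$ are spread evenly over a half circle from $\theta^{e*}_{df_k} + \pi/2$ to $\theta^{e*}_{df_k} + 3\pi/2$. The function names and the numerical values are hypothetical; the controller that drives the defenders to these goals is the one in [4] and is not reproduced here.

```python
import math


def semicircle_goals(center, rho_sn, theta_e, n_def):
    """Goal positions on the semicircular Open-StringNet formation."""
    if n_def < 2:
        raise ValueError("need at least two defenders to span a semicircle")
    goals = []
    for l in range(1, n_def + 1):
        # Evenly spaced angles covering half a circle (assumed form of Eq. (3)).
        theta_l = theta_e + math.pi / 2 + math.pi * (l - 1) / (n_def - 1)
        goals.append((center[0] + rho_sn * math.cos(theta_l),
                      center[1] + rho_sn * math.sin(theta_l)))
    return goals


def radius_is_admissible(rho_sn, rho_ac_max, b_d):
    """Check the condition rho_ac_max + b_d < rho_sn on the formation radius."""
    return rho_ac_max + b_d < rho_sn


# Hypothetical numbers: five defenders around a swarm centered at the origin.
goals = semicircle_goals(center=(0.0, 0.0), rho_sn=10.0, theta_e=0.0, n_def=5)
for l, g in enumerate(goals, start=1):
    print(l, round(g[0], 3), round(g[1], 3))
```

With this spacing the first and last goals are diametrically opposite, so the open formation spans the full width of the circle before it is closed.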
The parameter $b_d$ is the tracking error for the defenders in this phase [4].

Similarly, the desired positions $\xi^{e_{cl}}_{c_k,l}$ on the Closed-StringNet formation $\mathscr{F}^{e_{cl}}_{dc_k}$ are the same as in Eq. (3) with $\theta_l = \theta^{e*}_{df_k} + \frac{\pi (2l - 1)}{|\mathcal{D}_{c_k}|}$, for all $l \in \{1, 2, ..., |\mathcal{D}_{c_k}|\}$. Both formations move with the same velocity as that of the attackers’ center of mass, i.e., $\dot{\xi}^{e_{op}}_{c_k,l} = \dot{\xi}^{e_{cl}}_{c_k,l} = \dot{\mathbf{r}}_{ac_k}$.

The defenders $\mathcal{D}_{c_k}$ first track the desired goal positions $\xi^{e_{op}}_{c_k,l}$ by using the finite-time convergent, bounded control actions given in [4]. Once the defenders $\mathcal{D}_{\beta_k(1)}$ and $\mathcal{D}_{\beta_k(|\mathcal{D}_{c_k}|)}$ reach within a distance of $b_d$ from $\xi^{e_{op}}_{c_k,1}$ and $\xi^{e_{op}}_{c_k,|\mathcal{D}_{c_k}|}$, respectively, i.e., $\|\mathbf{r}_{d\beta_k(1)} - \xi^{e_{op}}_{c_k,1}\| < b_d$ and $\|\mathbf{r}_{d\beta_k(|\mathcal{D}_{c_k}|)} - \xi^{e_{op}}_{c_k,|\mathcal{D}_{c_k}|}\| < b_d$, the desired goal positions are changed from $\xi^{e_{op}}_{c_k,l}$ to $\xi^{e_{cl}}_{c_k,l}$ for all $l \in \{1, 2, ..., |\mathcal{D}_{c_k}|\}$. The StringNet is achieved when $\|\mathbf{r}_{d\beta_k(l)} - \xi^{e_{cl}}_{c_k,l}\| \le b_d$ for all $l \in \{1, 2, ..., |\mathcal{D}_{c_k}|\}$ during this phase.

D. Herding: moving the Closed-StringNet to safe area
Once a group of defenders $\mathcal{D}_{c_k} = \{\mathcal{D}_j \mid j \in I_{dc_k}\}$, for $I_{dc_k} \subseteq I_d$, forms a StringNet around a swarm of attackers, they move while tracking a desired rigid closed circular formation $\mathscr{F}^h_{dc_k}$ centered at a virtual agent $\mathbf{r}_{df^h_k}$, as discussed in [4]. The swarm is herded to the closest safe area $\mathcal{S}_{\varsigma(k)}$, where $\varsigma(k) = \arg\min_{m \in I_s} \|\mathbf{r}_{df^h_k} - \mathbf{r}_{sm}\|$.

IV. Multi-Swarm Herding
We consider that the attackers split into smaller groups as they sense the defenders in their path, to maximize the chance of at least some attackers reaching the protected area by circumnavigating the oncoming defenders. To respond to such strategic movements of the attackers, the defenders need to collaborate intelligently. In the approach presented in this paper, the defenders first identify the spatial clusters of the attackers. Then, the defenders distribute themselves into smaller connected groups, i.e., subsets of defenders that have already established an Open-StringNet formation, in order to herd these different spatial clusters (swarms) of the attackers to safe areas. In the next subsections, we discuss the clustering and the defender-to-swarm assignment algorithms.
A. Identifying Swarms of the Attackers
In order to identify the spatially distributed clusters (swarms) of the attackers, the defenders utilize the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm [14]. Given a set of points, the DBSCAN algorithm finds clusters of high-density points (points with many nearby neighbors), and marks points as outliers if they lie alone in low-density regions (their nearest neighbors are too far away). The DBSCAN algorithm can identify clusters of any shape in the data and requires two parameters that define the density of the points in the clusters: 1) $\varepsilon_{nb}$ (radius of the neighborhood of a point), 2) $m_{pts}$ (minimum number of points in the $\varepsilon_{nb}$-neighborhood of a point). In general, attackers can split into formations with a varied range of densities, which makes the choice of the parameters $\varepsilon_{nb}$ and $m_{pts}$ challenging. Variants of the DBSCAN algorithm, such as OPTICS [23], can find clusters of varying density; however, they are more time-consuming. To keep computational demands low, we use the DBSCAN algorithm with fixed parameters $\varepsilon_{nb}$ and $m_{pts}$, which quickly yields useful clustering information about the attackers satisfying the specified connectivity constraints.

The neighborhood of an attacker is defined using a weighted distance between two attackers: $d(\mathbf{x}_{ai}, \mathbf{x}_{ai'}) = \sqrt{(\mathbf{x}_{ai} - \mathbf{x}_{ai'})^T \mathbf{M} (\mathbf{x}_{ai} - \mathbf{x}_{ai'})}$, where $\mathbf{x}_{ai} = [\mathbf{r}_{ai}^T, \mathbf{v}_{ai}^T]^T$ and $\mathbf{M}$ is a weighting matrix defined as $\mathbf{M} = \mathrm{diag}([1, 1, \varphi, \varphi])$, where $\varphi$ weights relative velocity against relative position. We choose $\varphi < 1$. The $\varepsilon_{nb}$-neighborhood of an attacker $\mathcal{A}_i$ is then defined as the set of points $\mathbf{x} \in \mathbb{R}^4$ such that $d(\mathbf{x}_{ai}, \mathbf{x}) < \varepsilon_{nb}$.

The largest circle inscribed in the largest Closed-StringNet formation formed by the $N_d$ defenders has radius $\bar{\rho}_{ac} = \frac{\bar{R}_s}{2} \cot\left(\frac{\pi}{N_d}\right)$. The maximum radius of any cluster with $N_a$ points identified by the DBSCAN algorithm with parameters $\varepsilon_{nb}$ and $m_{pts}$ is $\varepsilon_{nb} \frac{N_a - 1}{m_{pts} - 1}$.
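The clustering step can be sketched in plain Python as follows. This is a textbook-style DBSCAN written for illustration with the weighted distance defined above; it is not the authors' implementation, and the function names, the attacker states, and the parameter values are assumptions. A point counts itself among its neighbors, and the label -1 marks noise.

```python
import math


def weighted_distance(xa, xb, phi):
    """d = sqrt(dx^T M dx) with M = diag([1, 1, phi, phi]); x = (rx, ry, vx, vy)."""
    weights = (1.0, 1.0, phi, phi)
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, xa, xb)))


def dbscan(states, eps_nb, m_pts, phi):
    """Return one label per state: 0, 1, ... for clusters and -1 for noise."""
    n = len(states)
    neighbors = [[j for j in range(n)
                  if weighted_distance(states[i], states[j], phi) < eps_nb]
                 for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < m_pts:
            labels[i] = -1  # provisional noise, may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        frontier = list(neighbors[i])
        while frontier:
            j = frontier.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= m_pts:
                frontier.extend(neighbors[j])  # j is a core point: keep growing
    return labels


# Hypothetical attacker states (x, y, vx, vy): two groups and one straggler.
states = [(0, 0, 0, 0), (1, 0, 0, 0), (0, 1, 0, 0), (1, 1, 0, 0),
          (20, 0, 0, 0), (21, 0, 0, 0), (20, 1, 0, 0),
          (50, 50, 0, 0)]
labels = dbscan(states, eps_nb=2.0, m_pts=3, phi=0.1)
print(labels)
```

Precomputing all pairwise neighborhoods costs on the order of $N_a^2$ distance evaluations, which is acceptable for a sketch but reflects why the clustering is not rerun more often than needed.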
If all of the attackers were to form a single swarm enclosed inside the region with radius $\bar{\rho}_{ac}$, then we would require $\varepsilon_{nb}$ to be no smaller than $\bar{\rho}_{ac} \frac{m_{pts} - 1}{N_a - 1}$ in order to identify them as a single cluster. So we choose $\varepsilon_{nb} = \bar{\rho}_{ac} \frac{m_{pts} - 1}{N_a - 1}$, and since we want to identify even clusters with as few as 3 agents, we need to choose $m_{pts} = 3$. With these parameters for the DBSCAN algorithm, we have:

Lemma 1: Let $\{\mathcal{A}_{c_1}, \mathcal{A}_{c_2}, ..., \mathcal{A}_{c_{N_{ac}}}\}$ be the clusters identified by the DBSCAN algorithm with the parameters $\varepsilon_{nb}$ and $m_{pts}$ chosen as above. For all $k \in I_{ac} = \{1, 2, ..., N_{ac}\}$, we have $\rho_{ac_k} = \max_{i \in I_{ac_k}} \|\mathbf{r}_{ai} - \mathbf{r}_{\widehat{ac}_k}\| \le \frac{\bar{R}_s}{2} \cot\left(\frac{\pi}{|\mathcal{A}_{c_k}|}\right)$, if $|\mathcal{A}_{c_k}| > 2$ and $N_a = N_d$.

As the number of attackers increases, the computational cost of DBSCAN becomes higher and the algorithm loses its practical usefulness. Furthermore, the knowledge of the clusters is only required by the defenders when a swarm of attackers does not satisfy the assumed constraint on its connectivity radius. So the DBSCAN algorithm is run only for a swarm of attackers $\mathcal{A}_{c_k}$, for some $k \in I_{ac}$, whenever the connectivity constraint is violated by it, i.e., when the radius of the swarm $\mathcal{A}_{c_k}$, defined as $\rho_{ac_k} = \max_{i \in I_{ac_k}} \|\mathbf{r}_{ai} - \mathbf{r}_{\widehat{ac}_k}\|$, exceeds the value $\bar{\rho}_{ac_k} = \frac{\bar{R}_s}{2} \cot\left(\frac{\pi}{N_d}\right) \frac{|\mathcal{A}_{c_k}| - 1}{N_a - 1}$.

B. Defender Assignment to the Swarms of Attackers
As the initial swarm of attackers splits into smaller swarms, the defenders must distribute themselves into smaller groups and assign the attackers’ swarms (clusters) to these groups in order to enclose these swarms and subsequently herd them to the closest safe area. Let $\mathcal{A}_c = \{\mathcal{A}_{c_1}, \mathcal{A}_{c_2}, ..., \mathcal{A}_{c_{N_{ac}}}\}$ be a set of swarms of the attackers after a split event has happened at time $t_{se}$. We assume that none of the swarms in $\mathcal{A}_c$ is a singular one (i.e., a swarm with fewer than three agents), that is, $|\mathcal{A}_{c_k}| > 2$ for all $k \in I_{ac} = \{1, 2, ..., N_{ac}\}$. We formally define the defender-to-swarm assignment as:

Definition 4: A set $\beta = \{\beta_1, \beta_2, ..., \beta_{N_{ac}}\}$ of mappings $\beta_k : \{1, 2, ..., R_d(|\mathcal{A}_{c_k}|)\} \to I_d$, where $\beta_k$ gives the indices of the defenders assigned to the swarm $\mathcal{A}_{c_k}$ for all $k \in I_{ac}$.

We consider an optimization problem to find the best defender-swarm assignment as:

$$\begin{aligned}
\beta^\star = \arg\min_{\beta} \; & \sum_{k=1}^{N_{ac}} \sum_{j'=1}^{R_d(|\mathcal{A}_{c_k}|)} \|\mathbf{r}_{\widehat{ac}_k} - \mathbf{r}_{d\beta_k(j')}\| \\
\text{subject to} \; & (\mathcal{D}_{\beta_k(j')}, \mathcal{D}_{\beta_k(j'-1)}) \in \mathcal{E}^s_{op}(I_d), \; \forall j' \in \{2, ..., R_d(|\mathcal{A}_{c_k}|)\}, \; \forall k \in I_{ac}.
\end{aligned} \tag{4}$$

The optimization cost is the sum of distances of the defenders from the centers of the attackers’ swarms to which they are assigned. This ensures that the collective effort needed by all the defenders is minimized when enclosing the swarms of the attackers. The constraints in Eq. (4) require that all the defenders assigned to a particular swarm of the attackers are neighbors of each other, are already connected to each other via string barriers, and that the underlying graph is an Open-StringNet. Assuming $N_d = N_a$, we choose $R_d(|\mathcal{A}_{c_k}|) = |\mathcal{A}_{c_k}|$, i.e., the number of defenders assigned to a swarm $\mathcal{A}_{c_k}$ is equal to the number of attackers in $\mathcal{A}_{c_k}$. This ensures that there is an adequate number of defenders to go after each attacker in the event that the attackers in swarm $\mathcal{A}_{c_k}$ disintegrate into singular swarms.

Footnote (on singular swarms): In this case, herding may not be the most economical way of defense. How to handle situations with singular swarms is out of the scope of this paper and will be studied in future work.
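For a small number of swarms, problem (4) can be explored by brute force. The sketch below assumes that the defenders are listed in their order along the Open-StringNet, that $N_d = N_a$, and that each swarm receives as many defenders as it has attackers, so that each swarm gets a contiguous block of defenders and a feasible assignment corresponds to an ordering of the swarms along the line. It enumerates all $N_{ac}!$ orderings; it is an illustrative alternative and not the method proposed in this paper, and all names and numbers are hypothetical.

```python
import math
from itertools import permutations


def block_cost(center, positions):
    """Sum of distances from one swarm center to its assigned defenders."""
    return sum(math.dist(center, p) for p in positions)


def assign_contiguous(defender_positions, swarm_centers, swarm_sizes):
    """Brute-force search over the order of swarms along the defender line.

    defender_positions are listed in Open-StringNet order (slots 0, 1, ...).
    Swarm k receives a contiguous block of swarm_sizes[k] defenders.
    Returns (best_cost, blocks), where blocks[k] lists the slots of swarm k.
    """
    if sum(swarm_sizes) != len(defender_positions):
        raise ValueError("sketch assumes N_d equals the total number of attackers")
    best_cost, best_blocks = math.inf, None
    for order in permutations(range(len(swarm_centers))):
        start, cost, blocks = 0, 0.0, {}
        for k in order:
            stop = start + swarm_sizes[k]
            cost += block_cost(swarm_centers[k], defender_positions[start:stop])
            blocks[k] = list(range(start, stop))
            start = stop
        if cost < best_cost:
            best_cost, best_blocks = cost, blocks
    return best_cost, best_blocks


# Hypothetical instance: six defenders on a line, two swarms of three attackers.
defenders = [(float(x), 0.0) for x in range(6)]
centers = [(5.0, 4.0), (0.0, 4.0)]
sizes = [3, 3]
cost, blocks = assign_contiguous(defenders, centers, sizes)
print(round(cost, 3), blocks)
```

The factorial growth of this enumeration is the practical reason for preferring an integer-programming formulation or a heuristic when the number of swarms is not small.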
This assignment problem is closely related to the generalized assignment problem (GAP) [18], in which $n$ objects are to be placed in $m$ knapsacks ($n \ge m$). We model the problem as a GAP with additional constraints on the objects (defenders) that are assigned to a given knapsack (attackers’ swarm). We call this constrained assignment problem the connectivity constrained generalized assignment problem (C2GAP) and provide a mixed integer quadratically constrained program (MIQCP) to find the optimal assignment:

$$\begin{aligned}
\text{Minimize} \quad & J = \sum_{k=1}^{N_{ac}} \sum_{j=1}^{N_d} \|\mathbf{r}_{\widehat{ac}_k} - \mathbf{r}_{dj}\| \, \delta_{jk} && \text{(5a)} \\
\text{Subject to} \quad & \sum_{k \in I_{ac}} \delta_{jk} = 1, \quad \forall j \in I_d; && \text{(5b)} \\
& \sum_{j \in I_d} \delta_{jk} = R_d(|\mathcal{A}_{c_k}|), \quad \forall k \in I_{ac}; && \text{(5c)} \\
& \sum_{j \in \tilde{I}_d} \delta_{jk} \, \delta_{(j+1)k} \ge R_d(|\mathcal{A}_{c_k}|) - 1, \quad \forall k \in I_{ac}; && \text{(5d)} \\
& \sum_{k \in I_{ac}} \sum_{j \in I_d} \delta_{jk} = R_d(N_a); && \text{(5e)} \\
& \delta_{jk} \in \{0, 1\}, \quad \forall j \in I_d, \; k \in I_{ac}; && \text{(5f)}
\end{aligned}$$

where $\tilde{I}_d = I_d - \{N_d\}$, and $\delta_{jk}$ is a decision variable that is equal to 1 when the defender $\mathcal{D}_j$ is assigned to the swarm $\mathcal{A}_{c_k}$ and 0 otherwise. The constraints (5b) ensure that each defender is assigned to exactly one swarm of the attackers; the capacity constraints (5c) ensure that, for all $k \in I_{ac}$, swarm $\mathcal{A}_{c_k}$ has exactly $R_d(|\mathcal{A}_{c_k}|)$ defenders assigned to it; the quadratic constraints (5d) ensure that all the defenders assigned to swarm $\mathcal{A}_{c_k}$ are connected together with an underlying Open-StringNet for all $k \in I_{ac}$; and the constraint (5e) ensures that all the $R_d(N_a)$ defenders are assigned to the attackers’ swarms. This MIQCP can be solved using a MIP solver such as Gurobi [24]. In the instance of the defender-swarm assignment shown in Fig. 1, the defenders at one contiguous range of positions $\xi^g_l$ are assigned to one swarm and those at the remaining positions are assigned to the other swarm.

C. Hierarchical Approach to defender-swarm assignment
Finding the optimal defender-swarm assignment by solving the MIQCP discussed above may not be implementable in real time for a large number of agents. We therefore use a hierarchical approach that reduces the problem of size $N_{ac}$ to smaller problems of size smaller than or equal to $\bar{N}_{ac}$ ($< N_{ac}$).

In Algorithm 1, $\mathcal{A}$ is a data structure that stores the following information: the centers of the attackers’ swarms $\mathbf{r}_{ac} = [\mathbf{r}_{\widehat{ac}_1}, \mathbf{r}_{\widehat{ac}_2}, ..., \mathbf{r}_{\widehat{ac}_{N_{ac}}}]$, the numbers of attackers in each swarm $\mathbf{n}_{ac} = [|\mathcal{A}_{c_1}|, |\mathcal{A}_{c_2}|, ..., |\mathcal{A}_{c_{N_{ac}}}|]$, and the total number of attackers $N_a$; and $\mathcal{D}$ is a data structure that stores the defenders’ positions $\mathbf{r}_d = \{\mathbf{r}_{dj} \mid j \in I'_d\}$ and the goal assignment $\beta$. The splitEqual function splits the attackers into two groups $\mathcal{A}_l$ and $\mathcal{A}_r$ with roughly equal numbers of attackers, and the defenders into two groups $\mathcal{D}_l$ and $\mathcal{D}_r$. The split is performed based on the angles $\psi_k$ made by the relative vectors $\mathbf{r}_{\widehat{ac}_k} - \mathbf{r}_{dc}$, for all $k \in I_{ac}$, with the vector $\mathbf{r}_{\widehat{ac}_k} - \mathbf{r}_{dc}$, where $\mathbf{r}_{dc}$ is the center of $\mathbf{r}_d$. We first arrange these angles $\psi_k$ in descending order. The first few clusters in the arranged list with roughly half the total number of attackers become the left group $\mathcal{A}_l$, and the rest become the right group $\mathcal{A}_r$. Similarly, the left group $\mathcal{D}_l$ is formed by the first $\mathcal{A}_l.N_a$ defenders as per the assignment $\beta$, and the remaining defenders form the right group $\mathcal{D}_r$. We assign the defenders in $\mathcal{D}_l$ only to the swarms in $\mathcal{A}_l$ and those in $\mathcal{D}_r$ only to the swarms in $\mathcal{A}_r$. By doing so, we may not obtain an assignment that minimizes the cost in (5a), but we reduce the computation time significantly and obtain a reasonably good assignment quickly. As in Algorithm 1, the process of splitting is done recursively until the number of attackers’ swarms is no larger than the pre-specified number $\bar{N}_{ac}$. The function assignMIQCP finds the defender-swarm assignment by solving (5). As shown in Fig. 2, the average computation time of the hierarchical approach, taken over a number of cluster configurations and initial conditions, is significantly smaller than that of the MIQCP formulation, and the cost of the hierarchical algorithm is very close to the optimal cost (MIQCP); see Fig. 3.

Algorithm 1: Defender-Swarm Assignment
  Function assignHierarchical($\mathcal{A}$, $\mathcal{D}$):
    if $\mathcal{A}.N_{ac} > \bar{N}_{ac}$ then
      [$\mathcal{A}_l$, $\mathcal{D}_l$, $\mathcal{A}_r$, $\mathcal{D}_r$] = splitEqual($\mathcal{A}$, $\mathcal{D}$);
      if $\mathcal{A}_l.N_{ac} > \bar{N}_{ac}$ then $\beta_l$ = assignHierarchical($\mathcal{A}_l$, $\mathcal{D}_l$);
      else $\beta_l$ = assignMIQCP($\mathcal{A}_l$, $\mathcal{D}_l$);
      if $\mathcal{A}_r.N_{ac} > \bar{N}_{ac}$ then $\beta_r$ = assignHierarchical($\mathcal{A}_r$, $\mathcal{D}_r$);
      else $\beta_r$ = assignMIQCP($\mathcal{A}_r$, $\mathcal{D}_r$);
      $\beta = \{\beta_l, \beta_r\}$;
    else
      $\beta$ = assignMIQCP($\mathcal{A}$, $\mathcal{D}$);
    return $\beta = \{\beta_1, \beta_2, ..., \beta_{N_{ac}}\}$
  End Function

V. Simulations
We provide a simulation of 18 defenders herding 18 attackers to $\mathcal{S}$ with bounded control inputs. Figure 4 shows snapshots of the paths taken by all agents. The positions and paths of the defenders are shown in blue, and those of the attackers in red. The string barriers between the defenders are shown as wide solid blue lines with white dashes in them.

Snapshot 1 shows the paths during the gathering phase. As observed, the defenders are able to gather at a location on the shortest path of the attackers to the protected area before the attackers reach it. Five attackers are already separated from the remaining thirteen in reaction to the incoming defenders in their path. The defenders have identified two swarms of the attackers at the end of the gathering phase and assign two subgroups of the defenders to them using Algorithm 1. As shown in snapshot 2, the two subgroups of defenders seek their assigned swarms, but the attackers in one of the swarms start splitting further, and the defenders identify the two newly formed swarms at about $t = 120$ s. The corresponding group of defenders is then split into two subgroups of appropriate sizes, which are assigned to the new swarms using Algorithm 1.

Snapshot 3 shows how the 3 subgroups of the defenders are able to enclose the 3 identified swarms of the attackers by forming Closed-StringNets around them. Snapshot 4 shows how all three enclosed swarms of the attackers are taken to the respective closest safe areas while each defenders’ group ensures collision avoidance with the other defenders’ groups. Additional simulations can be found at /drive/video.

Fig. 2: Run-time for assignment algorithms

Fig. 3: % Error in the costs of the assignment algorithms

VI. Conclusions
We proposed a clustering-based, connectivity-constrained assignment algorithm that distributes and assigns groups of defenders against swarms of the attackers, to herd them to the closest safe area using the ‘StringNet Herding’ approach. We also provide a heuristic for the defender-swarm assignment, based on the optimal MIQCP, that finds the assignment quickly. Simulations show how the proposed method improves the original ‘StringNet Herding’ method and enables the defenders to herd all the attackers to safe areas even though the attackers start splitting into smaller swarms in reaction to the defenders.

Fig. 4: Snapshots of the paths of the agents during Multi-Swarm StringNet Herding