An Overview on Optimal Flocking

Abstract

The study of robotic flocking has received considerable attention in the past twenty years. As we begin to deploy flocking control algorithms on physical multi-agent and swarm systems, there is an increasing necessity for rigorous promises on safety and performance. In this paper, we present an overview of the literature focusing on optimization approaches to achieve flocking behavior that provide strong safety guarantees. We separate the literature into cluster and line flocking, and categorize cluster flocking with respect to the system-level objective, which may be realized by a reactive or planning control algorithm. We also categorize the line flocking literature by the energy-saving mechanism that is exploited by the agents. We present several approaches aimed at minimizing the communication and computational requirements in real systems via neighbor filtering and event-driven planning, and conclude with our perspective on the outlook and future research direction of optimal flocking as a field.

Logan E. Beaver, Andreas A. Malikopoulos

Department of Mechanical Engineering, University of Delaware, Newark, DE, 19716, USA

Keywords:

Flocking, optimization, emergence, multi-agent systems, swarm systems

1. Introduction

Generating emergent flocking behavior has been of particular interest since Reynolds proposed three heuristic rules for multi-agent flocking in computer animation; see Reynolds (1987). Robotic flocking has been proposed in several applications including mobile sensing networks, coordinated delivery, reconnaissance, and surveillance; see Olfati-Saber (2006). With the significant advances in computational power in recent decades, the control of robotic swarm systems has attracted considerable attention due to their adaptability, scalability, and robustness to individual failure; see Oh et al. (2017). However, constructing a robot swarm with a large number of robots imposes significant cost constraints on each individual robot. Thus, any real robot swarm consists of individual robots with limited sensing, communication, actuation, memory, and computational abilities. To achieve robotic flocking in a physical swarm, we must develop and employ energy-optimal approaches to flocking under these strict cost constraints.

There have been several surveys and tutorials on decentralized control that include flocking; see Barve and Nene (2013); Bayindir (2016); Ferrari et al. (2016); Zhu et al. (2016, 2017); Albert and Imsland (2018). In one motivating example, Fine and Shell (2013) discuss various flocking controllers without considering optimality. In general, these surveys have all considered flocking and optimal control to be two distinct problems. Thus, we believe it is appropriate to present a comprehensive overview of optimal flocking control algorithms as robotic swarm systems begin to roll out in laboratories, e.g., Rubenstein et al. (2012); Jang et al. (2019); Beaver et al. (2020a); Malikopoulos et al. (2020); Wilson et al. (2020), and field tests, e.g., Vásárhelyi et al. (2018); Mahbub et al. (2020). Our objective for this overview is to establish the current frontier of optimal flocking research and present our vision of the research path for the field.

One significant problem throughout the literature is the use of the term “flocking” to describe very different modes of aggregate motion. The biology literature emphasizes this point, e.g., Bajec and Heppner (2009), where the distinction of line flocking (e.g., geese) and cluster flocking (e.g., sparrows) is necessary. To this end, we believe it is helpful to partition the engineered flocking literature into cluster and line flocking. As with natural systems, these modes of flocking have vastly different applications and implementations. Unlike biological systems, the behavior of engineering systems is limited only by the creativity of the designer. Thus, based on the current state of the literature, we have further partitioned cluster flocking into several categories based on the system-level objective. Our proposed flocking taxonomy is shown in Fig. 1. This taxonomy is motivated by the need to control robotic swarms, which is, in general, application dependent.

Figure 1: Our proposed flocking classification scheme for cluster and line flocking.

While an extensive body of literature has studied the convergence of flocking behavior, there has been almost no attention to the development of optimal flocking control algorithms. Although Molzahn et al. (2017) focused on optimal decentralized control in a recent survey, the approaches covered in the paper tend to focus on formation configuration, achieving consensus, or area coverage. In this paper, we seek to summarize the existing literature at the interface of flocking and optimization with an emphasis on minimizing agents' energy consumption.

The objectives of this paper are to: (1) elaborate on a new classification scheme for engineered flocking literature aimed at enhancing the description of flocking research (Fig. 1), (2) summarize the results of the existing optimal flocking literature across engineering disciplines and present the frontier of flocking and optimization research, and (3) propose a new paradigm to understand flocking as an emergent phenomenon to be controlled rather than a desirable group behavior for agents to mimic.

The contribution of this paper is the collection and review of the literature in this area. In several cases, the optimal flocking and formation reconfiguration literature overlap. We have attempted to separate them and present only the material relevant to flocking in this review. Any such effort has obvious limitations. Space constraints limit the presentation of technical details, and thus, extensive discussions are included only where they are important for understanding the fundamental concepts or explaining significant departures from previous work.

The remainder of this paper is structured as follows. In Section 1.1, we present the common notation used throughout the paper. Then, in Section 2, we give an introduction to cluster flocking, which we further partition into Reynolds flocking (Section 3), reference state tracking (Section 4), and remaining cases (Section 5). We further divide each of these sections into reactive and planning approaches. We present the line flocking literature in Section 6, and in Section 7, we discuss several approaches to Pareto front selection for optimal flocking. In Section 8, we discuss the implications of flocking with real robots. In Section 8.1, we present approaches to reducing cyberphysical costs, while in Section 8.2, we present flocking as a group strategy. Finally, in Section 9, we present our research outlook and concluding remarks, and motivate a new direction for flocking research.

1.1. Notation

We consider a swarm of $N \in \mathbb{N}$ agents indexed by the set $\mathcal{A} = \{1, 2, \dots, N\}$. For each agent $i \in \mathcal{A}$, we denote their position and velocity by $p_i(t)$ and $v_i(t)$, respectively, at time $t \in \mathbb{R}_{\geq 0}$. Agent $i$'s state is denoted by the vector $x_i(t)$, and the state of the system by $x(t) = [x_1^T(t), \dots, x_N^T(t)]^T$. Each agent $i \in \mathcal{A}$ has a neighborhood $\mathcal{N}_i(t) \subseteq \mathcal{A}$, which contains all neighbors that $i$ may sense and communicate with. For consistency we explicitly include $i \in \mathcal{N}_i(t)$ for all $t$. There are many ways to define a neighborhood, including inter-agent distance, $k$-nearest neighbors, and Voronoi partitions; see Fine and Shell (2013) for further discussion. In most cases, each agent's neighborhood is only a fraction of the swarm; thus, each agent is only able to make partial observations of the entire system state. Using neighborhoods as our basis for local information, we propose the following definition for connected agents.

Definition 1. Two agents $i, j \in \mathcal{A}$ are connected at time $t$ over a period $T \in \mathbb{R}_{>0}$ if there exists a sequence of neighborhoods

$$\big\{ \mathcal{N}_i(t_1), \mathcal{N}_{k_1}(t_2), \mathcal{N}_{k_2}(t_3), \dots, \mathcal{N}_{k_n}(t_{n+1}) \big\}, \tag{1}$$

such that

$$k_1 \in \mathcal{N}_i(t_1),\; k_2 \in \mathcal{N}_{k_1}(t_2),\; \dots,\; j \in \mathcal{N}_{k_n}(t_{n+1}), \tag{2}$$

where $n + 1$ is the length of the sequence and $t \leq t_1 \leq t_2 \leq \dots \leq t_{n+1} \leq t + T$.

Finally, for any two agents $i, j \in \mathcal{A}$, we denote their relative position as

$$s_{ij}(t) = p_j(t) - p_i(t). \tag{3}$$
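Definition 1 can be checked numerically when the neighborhoods are sampled at a finite set of instants. The following is a minimal sketch, not part of the original overview: it assumes the window $[t, t + T]$ is represented by a time-ordered list of snapshots, each a dictionary mapping an agent to its neighborhood set, and the function name `connected` is hypothetical. Because the times in Definition 1 need only be nondecreasing, several hops are allowed within one snapshot.

```python
def connected(i, j, snapshots):
    """Return True if agent j can be reached from agent i by hopping through
    neighborhoods in time order, in the sense of Definition 1.

    snapshots: time-ordered list of dicts {agent: set of neighbors}, one per
    sampled instant in [t, t + T] (an assumed discretization).
    """
    reached = {i}  # agents reachable from i so far
    for neighborhoods in snapshots:
        frontier = list(reached)
        # Times may repeat (t_1 <= t_2 <= ...), so chain hops inside a snapshot.
        while frontier:
            a = frontier.pop()
            for b in neighborhoods.get(a, ()):
                if b not in reached:
                    reached.add(b)
                    frontier.append(b)
        if j in reached:
            return True
    return j in reached


# Agent 1 senses agent 2 at the first instant; agent 2 senses agent 3 at the second.
snapshots = [
    {1: {1, 2}, 2: {2}, 3: {3}},
    {1: {1}, 2: {2, 3}, 3: {3}},
]
print(connected(1, 3, snapshots))        # True
print(connected(1, 3, snapshots[::-1]))  # False: the hops occur in the wrong order
```

The second call shows why the time ordering matters: the same two neighborhoods, visited in reverse, no longer form a chain from agent 1 to agent 3.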

2. Cluster Flocking and Swarming

The swarming, aggregate motion of small birds is known as cluster flocking in biological literature. The benefit of cluster flocking in natural systems is unknown; however, several hypotheses have been proposed. These include predator evasion, estimating the flock population, and sensor fusion. It is also unclear if leadership is necessary to generate the organized motion in cluster flocks of actual birds; Bajec and Heppner (2009) provides a review of swarming in biological systems. In this section, we present each formulation considering that all agents have access to any global reference information when solving their local optimization problem. With this in mind, and based on the work of Olfati-Saber (2006); Cucker and Smale (2007); Tanner et al. (2007), we present a general definition for cluster-flocking behavior in engineered swarms.

Definition 2. (Cluster Flocking) A group of agents achieve cluster flocking if:
  1. There exists a finite distance $D \in \mathbb{R}_{>0}$ such that $\|p_i(t) - p_j(t)\| \leq D$ for all $i, j \in \mathcal{A}$ and all $t \in \mathbb{R}_{\geq 0}$.
  2. There exists a finite period of time $T \in \mathbb{R}_{>0}$ such that every pair of agents $i, j \in \mathcal{A}$ is connected for all $t \in \mathbb{R}_{\geq 0}$ (Definition 1).
  3. No agent $i \in \mathcal{A}$ has a desired final state (i.e., there is no explicit formation).
  4. For each agent $i \in \mathcal{A}$ at each time $t \in \mathbb{R}_{\geq 0}$, there exists a time $T \in \mathbb{R}_{>0}$ such that $\|v_i(t + T)\| >$

3. Reynolds Flocking

A vast amount of literature exists that seeks to achieve flocking under Reynolds flocking rules. Generally, flock centering, velocity matching, and collision avoidance can be captured by imposing the following cost function for each agent $i \in \mathcal{A}$,

$$J_i = V\big(\|s_{ij}(t)\|\big) + \sum_{j \in \mathcal{N}_i(t)} \|\dot{s}_{ij}(t)\|, \tag{4}$$

where $j \in \mathcal{N}_i(t)$ and $V$ is an attractive-repulsive potential function with a local minimum at some desired distance. The first term of (4) manages collision avoidance and flock centering, while the second term ensures velocity alignment. Fig. 2 shows each component of an agent flocking under Reynolds rules.

Figure 2: A diagram showing the influence of collision avoidance, flock centering, and velocity matching for agent $i$, in green.

Given a distance $d \in \mathbb{R}_{>0}$ that minimizes the potential function in (4), Olfati-Saber (2006) proposed the $\alpha$-lattice, i.e., any configuration of agents such that each agent $i \in \mathcal{A}$ satisfies

$$\|s_{ij}(t)\| = d, \tag{5}$$

for all $j \in \mathcal{N}_i(t)$. This definition coincides with the global minimum of (4), and many authors have substituted (5) for the flock centering and collision avoidance rules of Reynolds. Next, we present three different approaches to designing optimal Reynolds flocking controllers.

To optimally flock in a reactive system, each agent works to minimize a global objective, such as velocity alignment of the flock, while only making partial observations of the total state $x(t)$. Therefore, optimal reactive flocking methods generally rely on designing an optimal control policy using simulation and experimental data.
Generally, these approaches seek to find the optimal weights for a controller of the form

$$u_i(t) = -\sum_{j \in \mathcal{N}_i(t) \setminus \{i\}} \nabla V\big(\|s_{ij}\|\big) - \sum_{j \in \mathcal{N}_i(t) \setminus \{i\}} \dot{s}_{ij}(t), \tag{6}$$

where the first term minimizes the potential field and the second term handles velocity alignment.

An early approach to optimally follow Reynolds flocking rules was presented by Morihiro et al. (2006a), where the authors took a learning-based approach to velocity alignment. In this work, each agent $i \in \mathcal{A}$ observes the state, $x_j(t)$, of a randomly selected agent $j \in \mathcal{A} \setminus \{i\}$ at each time step $t$. Agent $i$ then follows one of four motion primitives: (1) move toward $j$, (2) move away from $j$, (3) move in the same direction as $j$, or (4) move in the opposite direction of $j$. The agents are rewarded for achieving velocity alignment and staying near some desirable distance $d$ of their neighbors, i.e., velocity matching and flock centering. In addition, the authors included a set of predators that would attempt to disrupt the flock. In this case, the agents observe the state of the predator with probability 1 whenever it is within range. Agent $i$ is then rewarded for evading the predator and maintaining the structure of the flock. Further simulation results for this method are presented in Morihiro et al. (2006b).

Flocking was formulated as a dynamic program by Wang et al. (2018) to generate optimal trajectories for a swarm of quadrotors in $\mathbb{R}^2$. The objective is for the quadrotors to follow Reynolds flocking rules while moving the swarm center to a global reference position. The agents follow unicycle dynamics, and each agent observes the state of its nearest left and right neighbor to determine its control action. This angular symmetry in neighbor selection reduces the likelihood of the agents forming isolated cliques, which is a common issue in the distance and nearest-neighbor definitions of neighborhoods; see Camperi et al. (2012); Fine and Shell (2013).
The authors penalize each agent for violating Reynolds flocking rules, coming within some distance of an obstacle, and not moving toward the desired location. They also incorporate a constant transition penalty if the agent is not within a fixed distance of the goal. Then an infinite-horizon discounted problem is formulated and the optimal policy is learned using a standard deep reinforcement learning approach. The policy is verified on a group of $N = 3$ agents, and the authors show that the decentralized control policy generalizes to larger systems of 5 and 7 uncrewed aerial vehicles (UAVs) without significant deterioration of the final objective function value.

Metaheuristic algorithms, including pigeon-inspired optimization, e.g., see Duan and Qiao (2014), and particle swarm optimization, e.g., see Kennedy and Eberhart (1995), have been used to generate systems that optimally follow Reynolds flocking rules. In Qiu and Duan (2020), the authors optimized the control actions of a UAV with state and control constraints. This is achieved by breaking the controller into flocking and obstacle avoidance components, then using pigeon-inspired optimization to weight each component such that the deviation from Reynolds flocking rules was minimized while avoiding collisions.

Navarro et al. (2015) applied particle swarm optimization to optimize a neural network controller with 50 weights, nine inputs, and two outputs. The inputs consist of distance measurements for each octant around the agent and the average heading of all neighboring agents. The outputs of the neural network are speed commands for the left and right motor of a differential drive robot. The system is trained to maximize a linear combination of local velocity alignment, desired inter-robot spacing, and the average velocity of the flock. The agents are trained in simulation in both the local and global information cases.
The authors showed that a neural network trained on 4 agents can be generalized up to groups of 16.

The effect of control input constraints for an optimal flocking controller was studied in Celikkanat (2008). In this work, the authors sought to design a local control law based on maximizing velocity alignment while minimizing deviation from an $\alpha$-lattice. They included the average heading of the system as a global order parameter and an entropy parameter which applied Shannon's information entropy metric, e.g., see Shannon (1948), to the proportion of robots within a disk of diameter $h$. The flocking controller parameters are optimized using a genetic algorithm while its performance is validated in simulation. Another genetic algorithm was employed by Vásárhelyi et al. (2018) to design the feedback controller for an individual agent, which is parameterized in terms of 11 optimization variables. The authors optimize the agents in a constrained environment with a complicated objective function that includes the minimization of collision risk with walls and other agents, deviation from desired speed, and the number of disconnected agents, while simultaneously maximizing velocity alignment and the largest cluster size. The control variables are optimized offline in a realistic simulation that includes stochastic disturbances for desired flock speeds of 4, 6, and 8 m/s. The controller is validated in outdoor flight experiments with 30 Pixhawk drones flown over 10-minute intervals.

Up to this point, obstacle avoidance and safety have been accomplished through artificial potential fields and attractive-repulsive forces. In addition, the design of potential fields has been the subject of significant research for general navigation problems; see Vadakkepat et al. (2000). However, applying potential fields to multi-agent systems has been shown to have several drawbacks; see Koren and Borenstein (1991).
These include introducing steady oscillations to trajectories and exacerbating deadlock in crowded environments.

A promising alternative to potential field methods, which explicitly guarantees safety, has been proposed as a novel paradigm for the design of long-duration robotic systems by Egerstedt et al. (2018). In this approach, the tasks of each agent are imposed as motion constraints while the agents always seek to follow energy-minimizing trajectories. We interpret this constraint-driven approach to control as understanding why agents take particular control actions, rather than designing control algorithms that mimic a desirable behavior. To the best of our knowledge, reactive constraint-driven Reynolds flocking has only been explored by Ibuki et al. (2020). Under this approach, each agent $i \in \mathcal{A}$ generates an optimal control trajectory by solving the following optimal control problem at each time $t$,

$$
\begin{aligned}
\min_{u_i(t),\, \delta_i \in \mathbb{R}} \quad & \|u_i(t)\| + \delta_i \\
\text{subject to:} \quad & \lim_{t \to \infty} \|s_{ij}(t)\| \leq \delta_i, && (7) \\
& \lim_{t \to \infty} \|\phi_{ij}(t)\| = 0, && (8) \\
& \|s_{ij}(t)\| > 2R \quad \forall t \in \mathbb{R}_{\geq 0}, && (9) \\
& \forall j \in \mathcal{A} \setminus \{i\},
\end{aligned}
$$

where $\delta_i$ is a slack variable, $\phi_{ij}$ is a metric for attitude error, and $R$ is the radius of a circle that circumscribes each agent. Constraint (7) corresponds to pose synchronization (flock centering), (8) to attitude synchronization (velocity alignment), and (9) to collision avoidance. The authors generated control inputs for each agent by applying gradient flow coupled with control barrier functions to achieve constraint satisfaction. This guarantees that the agents satisfy the safety constraint, satisfy Reynolds flocking rules within a threshold $\delta_i$, and simultaneously minimize energy consumption.
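The reactive controller form in (6) can be illustrated with a short sketch. This is not any cited author's implementation: it assumes planar positions, the hypothetical quadratic potential $V(r) = \tfrac{1}{2}(r - d)^2$ (which has its minimum at $r = d$ but, unlike a true attractive-repulsive potential, stays finite as $r \to 0$), and hypothetical gains `k_p` and `k_v` standing in for the weights that the surveyed methods tune. Signs are chosen, as an assumption, so that with $s_{ij} = p_j - p_i$ from (3) the first term drives the spacing toward $d$ and the second term reduces relative velocity.

```python
import math

def reactive_control(i, pos, vel, neighbors, d=1.0, k_p=1.0, k_v=1.0):
    """Potential-field plus velocity-alignment input for agent i (planar case).

    pos, vel: dicts {agent: (x, y)}; neighbors: dict {agent: set of agents}.
    d, k_p, k_v are illustrative parameters, not values from the literature.
    """
    ux, uy = 0.0, 0.0
    for j in neighbors[i]:
        if j == i:
            continue  # the sums in (6) exclude the agent itself
        sx, sy = pos[j][0] - pos[i][0], pos[j][1] - pos[i][1]
        r = math.hypot(sx, sy)
        if r > 0.0:
            # Negative gradient of V = 0.5 * (r - d)^2 with respect to p_i:
            # attractive when r > d, repulsive when r < d.
            g = k_p * (r - d) / r
            ux += g * sx
            uy += g * sy
        # Velocity alignment: steer toward the neighbor's velocity.
        ux += k_v * (vel[j][0] - vel[i][0])
        uy += k_v * (vel[j][1] - vel[i][1])
    return (ux, uy)


pos = {0: (0.0, 0.0), 1: (2.0, 0.0)}
vel = {0: (0.0, 0.0), 1: (0.0, 0.0)}
nbrs = {0: {0, 1}, 1: {0, 1}}
print(reactive_control(0, pos, vel, nbrs))  # (1.0, 0.0): pulled toward agent 1
```

The learning-based and metaheuristic methods surveyed above can be read as different ways of choosing the potential and the gains in a controller of this shape.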
As an alternative to simply reacting to the environment and system state, agents may instead plan an optimal trajectory over a time horizon. This can generally improve the performance of the agent, e.g., by avoiding local minima; however, planning generally requires more computational power than a reactive approach. The structure of the information in a decentralized system also creates challenges with respect to the information available over a planning horizon. It has been shown that there is separation between estimation and control for particular decentralized information structures; see Nayyar et al. (2013); Dave and Malikopoulos (2020). However, this is an open problem for the general case. Some proposed solutions include sharing information between agents, e.g., see Morgan et al. (2016), only planning with agents shared between neighbors, e.g., see Dave and Malikopoulos (2019), and applying model predictive control (MPC). For large swarms of inexpensive agents, widespread information sharing is generally infeasible, and it is unlikely that any common information exists. For this reason, MPC has been a preferred approach in swarm systems. With MPC, each agent plans a sequence of control actions over a time horizon based on its current information about the system. After some time, the agent will replan its trajectory based on whatever new information it has received. Next, we present several approaches to planning optimal trajectories that use Reynolds flocking rules.

A significant number of MPC flocking algorithms seek to minimize deviation from Reynolds flocking rules, which may be implemented through a linear combination of the following objectives:

$$J_i^d(t) = \sum_{j \in \mathcal{N}_i(t)} \big( \|s_{ij}(t)\| - d \big)^2, \tag{10}$$

$$J_i^v(t) = \|\bar{v}_i(t) - v_i(t)\|^2, \tag{11}$$

$$J_i^u(t) = \|u_i(t)\|^2, \tag{12}$$

where $d$ is the desired separating distance in (5), and $\bar{v}_i(t)$ is the average velocity of all agents $j \in \mathcal{N}_i(t)$. Eq. (10) corresponds to flock centering, (11) to velocity matching, and (12) is a control effort penalty term.

The analysis by Zhang et al. (2008) presents a mechanism for flocking agents to estimate their neighbors' future trajectories. This predictive mechanism was applied by Zhan and Li (2011b) to achieve Reynolds flocking under a fully connected communication topology. This was extended to the decentralized information case in Zhan and Li (2011a) and validated experimentally with outdoor flight tests in Yuan et al. (2017).

A continuous-time MPC approach that minimizes flocking error over an infinite horizon was employed in Xu and Carrillo (2015) and Xu and G. Carrillo (2017). The resulting Hamilton-Jacobi-Bellman equation is nonlinear and without an explicit solution. As a result, the authors applied an original reinforcement learning technique to optimize the agent trajectories online and validated the performance in simulation. The reinforcement learning architecture is expanded on in Jafari et al. (2020), where the authors include model mismatch and significant environmental disturbances acting upon the agents. They also present simulation and experimental results for a flock of quadrotors.

To guarantee feasibility of the planned trajectories, it is necessary to explicitly impose constraints that bound the maximum control and velocity of each individual agent within their physical limits, i.e., for each agent $i \in \mathcal{A}$,

$$\|v_i(t)\| \leq v_{\max}, \tag{13}$$

$$\|u_i(t)\| \leq u_{\max}, \tag{14}$$

for all $t \in \mathbb{R}_{\geq 0}$. An analysis of constrained $\alpha$-lattice flocking under MPC which incorporated (13) and (14) was explored in Zhang et al. (2015), and was extended to velocity alignment in Zhang et al. (2016).

As we begin to implement flocking in physical swarms, explicit guarantees of safety are imperative for any proposed control algorithm. The most straightforward approach to guarantee safety is to circumscribe each agent $i \in \mathcal{A}$ entirely within a closed disk of radius $R \in \mathbb{R}_{>0}$.
The safety constraint for $i$ may then be formulated as

$$\|s_{ij}(t)\| \geq 2R, \quad \forall j \in \mathcal{A} \setminus \{i\}. \tag{15}$$

In general, applying MPC to each agent does not guarantee that coupled constraints, such as (15), are satisfied. At any planning instant, agent $i$ only has the trajectories generated by $j \in \mathcal{N}_i(t)$ at a previous time step. Thus, agent $i$ cannot guarantee safety constraints for the trajectory generated by agent $j$ at the current time. For this reason, in the decentralized case, agents must either cooperatively plan trajectories or impose a compatibility constraint. To guarantee that coupled constraints between agents are satisfied, significant research effort has been dedicated to decentralized MPC (DMPC). A common approach to DMPC is to design a communication protocol for agents to iteratively generate trajectories while driving their cost to a local minimum. An iterative approach, proposed by Zhan and Li (2013), cooperatively generates trajectories while limiting the number of messages exchanged between agents. The agents apply an impulse acceleration at discrete intervals and seek to minimize the flock centering error over a finite horizon. The agents sequentially generate trajectories up to some index $l \geq N$, where at each iteration, agent $i = (k \bmod N) + 1$, $i \in \mathcal{A}$, $k = 0, 1, \dots, l - 1$, [...] $i$'s trajectory is nonincreasing with each planning iteration. Beaver et al. (2020c) applied Reynolds flocking rules as an endpoint cost in a continuous optimal control problem while including (13)-(15) as constraints. Each agent $i \in \mathcal{A}$ first generates a trajectory while relaxing the safety constraint (15). Agent $i$ then exchanges trajectory information with every other $j \in \mathcal{N}_i(t)$. Finally, any agents violating (15) cooperatively generate the centralized safety-constrained trajectory between fixed start and end points, which guarantees safety.
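The objectives (10)-(12) and constraints (13)-(15) can be evaluated at a single time instant as sketched below. This is an illustrative sketch rather than any cited algorithm: the function names, the planar tuples, and the dictionary layout are assumptions; the costs are taken as squared terms; the term $j = i$ is skipped in the spacing cost because it only adds a constant; and the safety check assumes each agent occupies a disk of radius $R$, so centers must stay at least $2R$ apart.

```python
import math

def flocking_costs(i, pos, vel, u_i, neighbors, d):
    """Spacing, velocity-matching, and control-effort costs for agent i."""
    nbrs = neighbors[i]  # includes i itself, following the paper's convention
    # (10): squared deviation from the desired spacing d (skipping j == i)
    j_d = sum((math.dist(pos[i], pos[j]) - d) ** 2 for j in nbrs if j != i)
    # (11): squared deviation from the neighborhood's average velocity
    v_bar = [sum(vel[j][k] for j in nbrs) / len(nbrs) for k in range(2)]
    j_v = sum((v_bar[k] - vel[i][k]) ** 2 for k in range(2))
    # (12): control effort
    j_u = sum(c ** 2 for c in u_i)
    return j_d, j_v, j_u

def is_feasible(i, pos, vel, u_i, v_max, u_max, radius):
    """Check (13)-(15) for agent i against every other agent in the swarm."""
    speed_ok = math.hypot(*vel[i]) <= v_max
    effort_ok = math.hypot(*u_i) <= u_max
    safe = all(math.dist(pos[i], pos[j]) >= 2 * radius for j in pos if j != i)
    return speed_ok and effort_ok and safe


pos = {1: (0.0, 0.0), 2: (1.0, 0.0), 3: (0.0, 2.0)}
vel = {1: (1.0, 0.0), 2: (1.0, 0.0), 3: (1.0, 3.0)}
neighbors = {1: {1, 2, 3}}
print(flocking_costs(1, pos, vel, (3.0, 4.0), neighbors, d=1.0))  # (1.0, 1.0, 25.0)
print(is_feasible(1, pos, vel, (3.0, 4.0), v_max=4.0, u_max=5.0, radius=0.5))  # True
```

An MPC scheme would sum a weighted combination of these costs over the planning horizon and treat `is_feasible` as a hard constraint at every step; the weights and horizon are design choices that the surveyed papers handle differently.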

4. Reference State Cluster Flocking

A common application that exhibits cluster flocking behavior is tracking a reference trajectory with the center of mass of a swarm. In this application, the reference trajectory (also called a virtual leader) is generally presented as a time-varying reference state, $x_r(t)$, which may be known to all agents. In general, this is appended to the standard flocking controller (6) of informed agents with the feedback term

$$f_i^r(t) = \|x_i(t) - x_r(t)\|, \tag{16}$$

which may be scaled with a positive control gain. To track the center of mass, the swarm must satisfy the condition

$$\Big\| \frac{1}{N} \sum_{i \in \mathcal{A}} x_i(t) - x_r(t) \Big\| \leq \epsilon, \tag{17}$$

for some threshold $\epsilon \geq 0$. As with Reynolds flocking, the information available to each agent $i \in \mathcal{A}$ is restricted to its neighborhood, $\mathcal{N}_i(t)$. This is insufficient to evaluate (17). Thus, the center of mass tracking problem has generally been formulated as an optimal controller design problem. A schematic of reference state cluster flocking agents is presented in Fig. 3.

Figure 3: Agent $i$, in green, selects the control input that drives the center of the flock toward the reference state, $x_r$.

An early approach proposed by Hayes and Dormiani-Tabatabaei (2002) sought to track a reference point with a flock of agents that followed Reynolds flocking rules with an additional attractive force toward the reference state. The agents are placed in a rectangular domain, where each agent has a uniform probability of failing over a given period, i.e., the agent would stop moving but still be detectable. Simulations are performed to determine the controller gains that minimize a combination of travel time, cumulative distance traveled, and average inter-agent spacing. The resulting controller is validated in physical experiments with 10 robots. This objective has become standard in many flocking applications; see Bayindir (2016).

Another approach to reference tracking, proposed by La et al. (2009), involves the selection of the optimal weights for a feedback controller such that the reference trajectory $x_r(t)$ is tracked in minimum time while maintaining an $\alpha$-lattice configuration. The authors construct a cost function which penalizes the time taken for the flock to catch the reference trajectory scaled by their initial position. The resulting cost function is non-convex and non-differentiable, and thus it is minimized by applying a genetic algorithm. To guarantee that (17) is globally satisfied, all cases that do not yield an $\alpha$-lattice within some error bound are discarded. The discrete-time version of this system is optimized by Khodayari et al. (2016) using a gravitational search algorithm.

La et al. (2015) proposed a hybrid flocking-learning system to guarantee flocking behavior in the presence of obstacles and predators. At the agent level, each agent seeks to reach the static reference position $p_r$ with the center of their local neighborhoods. The objective of the system is to have each agent $i \in \mathcal{A}$ select, in a decentralized way, the optimal $p_r \in P$ from a finite set of positions, $P$. Each agent is rewarded proportionately to the size of its neighborhood at each time step, up to a maximum value of 6. The authors implemented a cooperative $Q$-learning approach, where each agent $i \in \mathcal{A}$ was rewarded for selecting the appropriate $p_r$ by

$$Q_i^{k+1} = w\, Q_i^k(s_i, a_i) + (1 - w) \sum_{j \in \mathcal{N}_i(t)} Q_j^k(s_j, a_j), \tag{18}$$

where $w \in [0, 1]$ weighs the influence of $i$'s neighbors, and $s_i$, $a_i$ are the state and action taken by agent $i$, respectively. The convergence properties of this cooperative learning scheme are proved and the performance is demonstrated in simulations and experiments.

To track the reference trajectory under realistic conditions, Virágh et al. (2016) sought optimal values for a potential-field based controller in $\mathbb{R}^2$ and $\mathbb{R}^3$ under the effects of sensor noise, communication delay, limited sensor update rate, and constraints on the agent's maximum velocity and acceleration. The work is framed in terms of aerial traffic; thus, multiple competing flocks are placed into shared airspace such that their reference trajectories result in conflict between the flocks. The authors presented two controllers, one that maintained constant speed and one with a fixed heading. The potential fields used in the controllers are composed of sigmoid functions parameterized by optimization variables. A compound objective function, proportional to effective velocity and inversely proportional to collision risk, is constructed, while 20 scenarios are generated to find 20 parameter sets for the two optimal controllers. The scenarios consist of every combination of five different initial configurations for both the constant-speed and constant-heading controllers in $\mathbb{R}^2$ and $\mathbb{R}^3$.

As an alternative to deriving an optimal feedback gain, Atrianfar and Haeri (2013) sought to minimize the number of informed agents such that the entire flock could track a known reference trajectory. First, the authors impose that, for a given sensing distance $h > d \in \mathbb{R}_{>0}$, the potential field must tend to infinity as $s_{ij}$ approaches $h$. This property guarantees that any connected group of agents would remain connected for all time. Thus, any initially connected group of agents containing an informed agent is guaranteed to converge to the reference trajectory. The latter implies that at most one informed agent would be required for each group of connected agents.
In addition, as a function of their initial conditions, some uninformed groups may merge with an informed cluster. Following this reasoning, the authors show that each initial cluster of agents requires at most one informed agent.

Departing from the aforementioned approaches, a centralized approach to tracking a virtual velocity reference was rigorously studied in Piccoli et al. (2016) for double-integrator agents in $\mathbb{R}^k$. The authors presented a consensus-driven control law, based on Cucker-Smale flocking, of the form

$$u_i(t) = \alpha_i \big( v_r(t) - v_i(t) \big) + (1 - \alpha_i) \cdot \frac{1}{N - 1} \sum_{j \in \mathcal{N}_i(t) \setminus \{i\}} \|s_{ij}(t)\|\, \dot{s}_{ij}(t), \tag{19}$$

where $\alpha_i \in [0, 1]$, $i \in \mathcal{A}$, weighs the tradeoff between consensus and velocity tracking. The authors sought values of $\alpha_i$ such that $\sum_{i \in \mathcal{A}} \alpha_i \leq M$, $M \in \mathbb{R}_{>0}$, while minimizing the error function

$$e(t) = \frac{1}{N} \sum_{i=1}^{N} \|v_i(t) - v_r(t)\|, \tag{20}$$

over a time interval $[t_0, t_f] \subset \mathbb{R}_{\geq 0}$. The optimal values of $\alpha_i$ were presented for three cases: (1) instantaneously minimizing $\frac{de}{dt}$, (2) minimizing the terminal cost $e(t_f)$, and (3) minimizing the integral cost, $\int_{t_0}^{t_f} e(t)\, dt$. The resulting optimal control analysis implies that, in general, the optimal strategy is to apply the maximum feedback to a few agents before applying moderate feedback to all agents. This aims at driving agents with high variance toward the reference velocity, enhancing the rate of consensus. The authors also noted the presence of dwell time in the terminal cost case, i.e., the optimal strategy includes applying no control input over a nonzero interval of time starting at $t_0$.

Optimal planning has several advantages over reactive methods, although it suffers from a handful of challenges related to information structure. As with the reactive methods, the desired reference trajectory is a time-varying function denoted by $x_r(t)$.
To guarantee that the reference trajectory can be maintained, the agents must be capable of evaluating $x_r(t)$ over their entire planning horizon. In addition, each agent $i \in \mathcal{A}$ generally must plan under the assumption that its neighborhood, $\mathcal{N}_i(t)$, is invariant over the planning horizon. Relaxing this assumption may require an amount of information sharing that is infeasible for large swarm systems.

Lee and Myung (2013) applied collective particle swarm optimization to generate the control trajectory of each agent for a general cost function. In their approach, each agent $i \in \mathcal{A}$ performs a particle swarm optimization with $M \in \mathbb{N}$ particles that correspond to possible control inputs of agent $i$. The agents then transmit their $g < M$ best particles to all $j \in \mathcal{N}_i(t)$ and iteratively solve their local particle swarm optimization until the planned trajectories converge system-wide.

The approach proposed by Lyu et al. (2019) tracks a known reference trajectory by generating the virtual state for each agent $i \in \mathcal{A}$,

$$z_i(t) = \frac{1}{|\mathcal{N}_i(t)|} \sum_{j \in \mathcal{N}_i(t)} x_j(t), \tag{21}$$

which corresponds to the average state of agent $i$’s neighborhood. Agent $i$ then imposes the constraint

$$z_j(t) = x_r(t), \quad \forall j \in \mathcal{N}_i(t), \tag{22}$$

using Lagrange relaxation. Since $i \in \mathcal{N}_j(t)$ for every $j \in \mathcal{N}_i(t)$, the components of (22) are shared between neighboring agents. A gradient descent technique is applied to minimize the deviation from (22).

Reference tracking under uncertainty was explored by Quintero et al. (2013) to track the position of a mobile ground vehicle with a known trajectory, $x_r(t)$. The flocking UAVs travel at a constant speed and altitude with stochasticity in their dynamics. The objective of each agent is to remain within a predefined annulus centered on the ground vehicle. The cost for agent $i \in \mathcal{A}$ is defined as the signed distance of agent $i$ from the edge of the annulus plus a heading alignment term.
The authors then applied dynamic programming to generate an optimal control policy for each agent. This approach was extended by Hung and Givigi (2017) to include external disturbances, and the optimal policy is derived in real time under a reinforcement learning framework.
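To make the structure of the consensus-plus-tracking law (19) and the tracking error (20) from earlier in this section concrete, the following is a minimal sketch rather than the implementation of Piccoli et al. (2016). It assumes that $s_{ij} = p_j - p_i$ (so that $\dot{s}_{ij} = v_j - v_i$) and that the consensus sum is scaled by $1/(N-1)$; the function names and the data layout are hypothetical.

```python
import math


def tracking_control(i, positions, velocities, neighbors, alpha, v_ref):
    """Control input for agent i with the structure of (19).

    Assumptions (not from the source): s_ij = p_j - p_i, so that
    ds_ij/dt = v_j - v_i, and the consensus sum is scaled by 1 / (N - 1).
    """
    n = len(positions)
    dim = len(v_ref)
    # Velocity-tracking term: alpha_i * (v_r - v_i)
    u = [alpha[i] * (v_ref[k] - velocities[i][k]) for k in range(dim)]
    # Consensus term: (1 - alpha_i) / (N - 1) * sum_j ||s_ij|| * ds_ij/dt
    for j in neighbors[i]:
        if j == i:
            continue
        dist = math.dist(positions[i], positions[j])
        for k in range(dim):
            rel_vel = velocities[j][k] - velocities[i][k]
            u[k] += (1.0 - alpha[i]) / (n - 1) * dist * rel_vel
    return u


def tracking_error(velocities, v_ref):
    """Average velocity-tracking error over all agents, as in (20)."""
    return sum(math.dist(v, v_ref) for v in velocities) / len(velocities)


# Hypothetical three-agent example in the plane.
positions = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
velocities = [(1.0, 0.0), (0.0, 1.0), (0.0, 0.0)]
neighbors = {0: [0, 1, 2], 1: [0, 1, 2], 2: [0, 1, 2]}
alpha = [1.0, 0.0, 0.5]  # agent 0 only tracks, agent 1 only seeks consensus
v_ref = (1.0, 1.0)
controls = [
    tracking_control(i, positions, velocities, neighbors, alpha, v_ref)
    for i in range(3)
]
error = tracking_error(velocities, v_ref)
```

Setting $\alpha_i = 1$ reduces the input to pure velocity tracking and $\alpha_i = 0$ to pure consensus, which is the tradeoff that the constraint on $\sum_i \alpha_i$ budgets across the flock.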

5. Other Cluster Flocking

In addition to Reynolds flocking and centroid tracking, several other applications have been shown to induce cluster flocking behavior. Although not widely addressed in the literature, these results demonstrate the breadth of applications that may yield cluster flocking behavior.

Anisotropy in the angle between neighboring flockmates was proposed as a metric for measuring the quality of a flock of birds by Ballerini et al. (2008). Makiguchi and Inoue (2010) constructed a measure for anisotropy using a projection matrix

$$M^{(n)}_{pq} = \frac{1}{N} \sum_{i \in \mathcal{A}} (\hat{s}_{ik} \cdot p)(\hat{s}_{ik} \cdot q), \tag{23}$$

where $k$ indexes the $n$’th nearest neighbor of $i$, and $p, q \in \{\hat{x}, \hat{y}, \hat{z}\}$ are vector components of an orthonormal basis for $\mathbb{R}^3$. Eq. (23) can be used to calculate normalized anisotropy, denoted by $\gamma \in [0, 1]$, by relating $M^{(n)}_{pq}$ to the average agent velocity. The authors’ objective was to select the optimal weights for each of Reynolds flocking rules (cohesion, alignment, and separation) to maximize flock anisotropy for the case that $n = 1$ in (23). The authors discarded any parameters that yielded collisions or flock fragmentation and achieved a final anisotropy of γ > . γ = .

Other optimization techniques outside of genetic algorithms have been applied to the problem of optimal flocking. In Vatankhah et al. (2009), each agent uses local measurements to determine the control input that would maximize the velocity of the swarm center via particle swarm optimization. Veitch et al. (2019) employed ergodic trajectories to achieve flocking. An ergodic trajectory is one where the average position of the agents over time is equal to some spatially distributed probability mass (or density) function. The authors presented a measure of ergodicity by decomposing the probability density function into a finite Fourier series. The proposed control policy for each robot maximizes this metric along an agent’s trajectory. Each agent $i \in \mathcal{A}$ periodically shares its Fourier coefficients with all $j \in \mathcal{N}_i(t)$.
This allows the agents to predict where their neighbors have previously explored while also guaranteeing collision avoidance by the nature of ergodicity. Finally, to achieve flocking, the authors generate a uniform probability distribution in a closed disk centered on a reference state in $\mathbb{R}^2$. By construction, this guarantees that all agents will enter the closed disk in finite time and remain within it. Additionally, by smoothly moving the disk around $\mathbb{R}^2$, the average velocity and centroid of the flock can be precisely controlled.

Inspired by Reynolds flocking rules and the constraint-driven paradigm for control, Beaver and Malikopoulos (2020) proposed a set of flocking rules over a planned horizon to achieve cluster flocking by: (1) minimizing energy consumption, (2) staying near the neighborhood center, and (3) avoiding collisions. Condition 2 (aggregation) is imposed with the constraint

$$\left\| p_i(t) - \frac{1}{|\mathcal{N}_i(t)| - 1} \sum_{j \in \mathcal{N}_i(t) \setminus \{i\}} p_j(t) \right\| \le D, \tag{24}$$

for some distance $D$ much greater than the diameter of any agent, and for $|\mathcal{N}_i(t)| > 1$. This approach is visualized in Fig. 4. The proposed constraint seeks to confine each agent within a disk of radius $D$ positioned at its neighborhood center. The intuition is that the agents may move freely within the disk; however, their velocity cannot vary dramatically from the average velocity in their neighborhood for long periods of time. It has been proven that these rules yield velocity consensus asymptotically when $\mathcal{N}_i(t)$ is forward-invariant. In a more recent effort, Beaver et al. (2020b) proposed a method for a constraint-driven agent to generate an optimal control policy in real-time. This is an important next step in real-time optimal control of physical flocks.

Figure 4: Agent $i$, in green, is constrained to remain within a disk positioned at its neighborhood center.
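The aggregation constraint (24) can be checked numerically for a single agent. The sketch below is illustrative only and is not the planner of Beaver and Malikopoulos (2020); it assumes that the neighborhood center is the mean position of agent $i$’s neighbors excluding $i$ itself, and the function names are hypothetical.

```python
import math


def neighborhood_center(i, positions, neighbors):
    """Mean position of agent i's neighbors, excluding i itself.

    Assumption (not from the source): the center in (24) averages over
    the other neighbors only, which requires at least one of them.
    """
    others = [j for j in neighbors[i] if j != i]
    if not others:
        raise ValueError("agent has no neighbors besides itself")
    dim = len(positions[i])
    return tuple(
        sum(positions[j][k] for j in others) / len(others) for k in range(dim)
    )


def satisfies_disk_constraint(i, positions, neighbors, D):
    """True if agent i lies within distance D of its neighborhood center."""
    center = neighborhood_center(i, positions, neighbors)
    return math.dist(positions[i], center) <= D


# Hypothetical example: agent 0 with two neighbors in the plane.
positions = {0: (0.0, 0.0), 1: (2.0, 0.0), 2: (0.0, 2.0)}
neighbors = {0: [0, 1, 2]}
center = neighborhood_center(0, positions, neighbors)
inside_large = satisfies_disk_constraint(0, positions, neighbors, D=1.5)
inside_small = satisfies_disk_constraint(0, positions, neighbors, D=1.0)
```

In a planning setting this check would be enforced along the whole horizon rather than at a single instant; the pointwise version above only illustrates the geometry in Fig. 4.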

6. Line Flocking

In this section, we review literature related to line flocking, which is a naturally occurring phenomenon commonly found in large birds (such as geese) that travel in vee, jay, and echelon formations over long distances. It has long been understood that saving energy is a significant benefit of flying in such formations; see Cutts and Speakman (1994); Mirzaeinia et al. (2020). In aerial systems, the main energy savings come from upwash, i.e., trailing regions of upward momentum that can be exploited by birds to induce lift and expend less energy. This is illustrated in Fig. 5. Similar benefits have been found in terrestrial and underwater vehicles, where a leader may create a low-pressure wake and reduce the overall drag force imposed on the following vehicles.

In this context, the most straightforward method to achieve line flocking is to generate an optimal set of formation points based on the drag, wake, and upwash characteristics of each agent. This effectively transforms the line flocking problem into a formation reconfiguration problem, where each agent must assign itself to a unique goal and reach it within some fixed time, as is the case in Nathan and Barbosa (2008). However, a formation reconfiguration approach generally requires the formation to be computed offline and does not necessarily consider differences between individual agents (e.g., age, weight, size, and efficiency) or environmental effects. Although formation reconfiguration algorithms have rich supporting literature, they are beyond the scope of this paper. For recent reviews of formation control see Oh et al. (2015, 2017).

Figure 5: The lead agent induces upwash and downwash in its wake due to its trailing wing vortices, and the following agents exploit the upwash to induce lift and reduce energy consumption.

Another approach to line flocking is modeling the aerodynamic and hydrodynamic interactions between agents so that they may dynamically position themselves without a predefined formation.
This approach was proposed by Bedruz et al. (2019b), who applied computational fluid dynamics simulations to wheeled mobile robots in order to determine the optimal drafting distances between them. This was extended in Bedruz et al. (2019a), where the authors proposed a fuzzy logic controller to maximize the effect of drafting. The authors validated their controllers in simulation and experiments with wheeled differential drive robots.

An early approach to capture line flocking behavior in a robotic system with model predictive control was explored by Yang et al. (2016). In this work, the authors attempted to maximize velocity matching and upwash benefits for each agent $i \in \mathcal{A}$ while minimizing the field of view occluded by leading agents. The resulting line flocking behavior was demonstrated in simulation, where a vee formation consistently emerged independently of the flock’s initial conditions.

As a next step toward optimal line flocking, an analysis of the effect of upwash on energy consumption in fixed-wing UAVs was presented by Mirzaeinia et al. (2019). The authors found that the front and tail agents in a vee formation have the highest rate of energy consumption in the flock. This implies that the lead or tail agents become the limiting factor in the total distance traveled by the flock. The authors proposed a load balancing algorithm based on a root-selection protocol, where the highest-energy agents replace the lead and tail agents periodically. The authors then demonstrated, in simulation, that periodic replacement of the lead and tail agents significantly increases the total travel distance of the flock.

A final facet of line flocking is the effect of environmental disturbances, such as turbulence and currents. Energy-optimal flocking in the presence of strong background flows was investigated by Song et al. (2017). In this approach, the authors derived an energy-optimal reference trajectory, $x_r(t)$, for the flock centroid to track.
To generate this trajectory, the authors approximated the flock as a point mass at the centroid and sought to minimize its energy consumption in the presence of a background flow, $U(p, t)$, where $p \in \mathbb{R}$ is a position. The normalized rate of power consumption of an agent was given by

$$P(t) = \|v_r(t) - U(p_r(t), t)\| \, v, \tag{25}$$

and the authors sought to solve the optimal control problem

$$\min_{t_0,\, t_f,\, p_r(t)} \int_{t_0}^{t_f} P(t)\,dt$$

subject to: $p_r(t_0) = p_0$, $p_r(t_f) = p_f$, $\|v_r(t)\| \le v_{\max}$, $t_{\min} \le t_0 < t_f \le t_{\max}$.

The authors showed that the background flows would dominate the energy consumption of the agents, and therefore a tight cluster of agents would closely approximate the energy-optimal trajectory traced out by the center of the flock.

7. Pareto Front Selection

An essential consideration in multi-objective optimal control is the tradeoff between the individual objectives. This can be observed, for example, in the tradeoff between neighborhood centering and velocity alignment in Reynolds flocking. This tradeoff can be explored by finding Pareto-efficient outcomes. An outcome is Pareto-efficient if no individual term in the cost function can be improved without degrading at least one other term; see Malikopoulos et al. (2015). The set of all Pareto-efficient outcomes is called the Pareto frontier. After establishing a Pareto frontier, the most desirable outcome can be selected as the Pareto-optimal control policy; see Malikopoulos (2016). Generally, multi-objective flocking problems have not applied Pareto optimality in the past. Instead, authors have used various evolutionary algorithms that generate families of optimal solutions; see Virágh et al. (2016); Vásárhelyi et al. (2018).

Hauert et al. (2011) examined the impact of design parameters on flocking performance for a group of UAVs. The authors noted that, due to hardware limitations, designers must weigh the cost of enhanced communication range versus increasing the maximum turning rate for each agent. The authors explored this tradeoff by exhaustively exploring the design space and calculating the resulting heading angle (velocity alignment) and relative drift (flock centering) error. Using extensive simulation data, the authors constructed the Pareto frontier of optimal design choices. Finally, to validate their analysis, the authors conducted four outdoor experiments using 10 UAVs.

Pareto frontier generation was explicitly discussed in terms of control by Kesireddy et al. (2019), who noted that almost all optimal flocking algorithms apply arbitrary weights to the components of a multiobjective flocking problem. The authors presented three cooperative evolutionary algorithms that are used to generate a Pareto frontier, yielding a family of control policies that are Pareto-efficient with respect to Reynolds flocking rules. This type of analysis provides a useful tool to find the optimal tradeoff in different cluster flocking applications.

Recent work by Zheng et al. (2020) examines the tradeoff between flocking performance and privacy. The authors describe a system that follows Reynolds flocking rules guided by a leader robot.
The system is observed by a discriminator attempting to determine which agent is the leader. The authors propose a genetic algorithm that co-optimizes the flocking controller parameters and the discrimination function. The paper presents a measure of the resulting flocking performance and leader detectability to find a set of optimal control parameters for several kinds of leader trajectories.
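As a small illustration of how a Pareto frontier can be extracted from a finite set of candidate designs, the sketch below filters out dominated cost vectors. It is a generic brute-force filter, not the evolutionary algorithms used in the works cited above; it assumes every objective is a cost to be minimized, and the function names and sample data are hypothetical.

```python
def dominates(a, b):
    """True if cost vector a is no worse than b in every term
    and strictly better in at least one (all terms are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(costs):
    """Return the cost vectors that no other candidate dominates."""
    return [c for c in costs if not any(dominates(o, c) for o in costs)]


# Hypothetical (alignment error, centering error) pairs for five designs.
candidates = [(1.0, 5.0), (2.0, 2.0), (3.0, 3.0), (5.0, 1.0), (2.0, 6.0)]
front = pareto_front(candidates)
```

The filter compares every pair of candidates, so its cost grows quadratically with the number of designs; this is acceptable for an exhaustive sweep of a small design space, while larger problems typically rely on evolutionary multi-objective methods.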

8. Considerations for Physical Swarms

As the number of agents in a flock increases, the amount of inter-agent communication required may become a significant energy and performance bottleneck. This has motivated several approaches to minimize the cyberphysical costs incurred by each agent by either reducing the amount of communication required, explicitly including communication cost into an agent’s objective function, or explicitly breaking communication links with a subset of neighboring agents. In the following subsections, we explore these approaches and discuss their potential value to optimal flocking.

The cost of communication was explicitly included by Li et al. (2013) as part of a holistic cyber-physical approach. To account for environmental and inter-agent communication disturbances, the authors calculated the probability of communication errors as a function of physical antenna properties. Based on the collision avoidance constraint and maximum communication distance, each agent $i \in \mathcal{A}$ determines a minimum and maximum distance to every neighbor $j \in \mathcal{N}_i(t)$. Agent $i$ then selects the optimal separating distance within these bounds to minimize a combination of communication error and a crowding penalty. The authors propose an adaptive controller to find the optimal separating distance and extend the analysis to include both near and far-field communication in Li et al. (2017).

A control method for preserving agent connectivity while minimizing the number of neighbors was presented in Zavlanos et al. (2009). In this formulation, agents receive a number of communication hops from their neighbors that they use to estimate the communication graph diameter. Each agent uses this information to remove and preserve particular communication links within their neighborhood. Graph topology was explicitly linked with antenna power in Dolev et al. (2010), where the agents sought to minimize communication power while guaranteeing a minimum global graph diameter. This work was extended in Dolev et al. (2013), where agents applied a gossip algorithm to achieve global information about the system trajectory. Although these are not directly applicable to swarm systems, a similar approach may be beneficial to ensure that all agents satisfy Definition 2 while minimizing communication costs between agents. Communication hop approaches have not been applied to flocking. However, they have been successfully used in more centralized and structured swarm problems, particularly pattern formation; see Rubenstein et al. (2012); Wang et al. (2020). Finally, Chen et al.
(2012) sought the minimum possible communication distance to guarantee convergence to velocity consensus for agents under the Vicsek model. The authors showed that, if the agents’ positions and orientations were randomly and uniformly distributed in $[0, 1]^2 \times [-\pi, \pi]$, the minimum possible communication distance was $\sqrt{\frac{\log N}{\pi N}}$. This provides a lower bound on communication energy cost for the flock.

Camperi et al. (2012) studied the stability of a flock when noise and external perturbations are introduced. The authors sought to optimize the response of a large swarm of Vicsek agents in $\mathbb{R}$ by changing the neighborhood topology. The authors note that, as Ballerini et al. (2008) found, a $k$-nearest or Voronoi neighborhood topology leads to more stable flocking while reducing the number of neighbors of each agent as compared to a distance-based neighborhood. This has significant implications for how the selection of a neighborhood topology may affect the energy cost of communication.

Zhou and Li (2017) proposed to minimize the communication and computational cost of generating optimal trajectories by screening out neighbors that do not negatively impact the objective function of agent $i \in \mathcal{A}$. The authors applied MPC to a discrete-time flocking system with the $\alpha$-lattice objective (5) and a control penalty term (12). In this case, given a desired distance $d > 0$, agent $i$ constructed the screened neighbor sets

$$S_i^1(t) = \{ j \in \mathcal{N}_i(t) \setminus \{i\} : \|s_{ij}(t)\| > d,\ s_{ij}(t) \cdot \dot{s}_{ij}(t) \ge 0 \}, \tag{26}$$

$$S_i^2(t) = \{ j \in \mathcal{N}_i(t) \setminus \{i\} : \|s_{ij}(t)\| < d,\ s_{ij}(t) \cdot \dot{s}_{ij}(t) \le 0 \}, \tag{27}$$

where $S_i^1(t)$ consists of neighbors further than $d$ and moving away, and $S_i^2(t)$ consists of neighbors closer than $d$ and moving closer. Thus, agent $i$ must only consider $j \in S_i^1(t) \cup S_i^2(t)$ when planning.

Another approach toward reducing communication and computational cost is to perform sparse planning updates by employing event-triggered control. Sun et al. (2019) proposed an update rule for flocking with time delays for systems using a potential field control law (6). A continuously differentiable and bounded function $\tau(t)$ acts as a time delay on all position measurements. The authors let the portion of the control input that achieves velocity consensus for agent $i \in \mathcal{A}$ be constant over an interval [ t , t ). Then the authors proposed an error function that the agent uses to update the potential field portion of its controller. This occurs whenever the error magnitude exceeds a threshold. However, the proposed threshold requires global knowledge of the average agent velocity, communication graph Laplacian, and a Lipschitz bound on the agent dynamics. The authors proved that, under this event-triggered scheme, the agents converge to steady-state flocking behavior and the system is free of Zeno, i.e., chattering, behavior. This is a promising result for reducing the computational burden on agents, and the development of a decentralized triggering function is a promising area of research.

8.2. Flocking as a Strategy

As we begin to deploy robotic swarm systems in situ, it is crucial to consider when a particular flocking behavior is an optimal strategy for a swarm. To the best of our knowledge, determining when cluster flocking is an optimal strategy has not been explored in the literature.
Instead, cluster flocking is generally proposed as either a convenient method of aggregate motion or as the result of optimizing particular types of tracking problems, e.g., reference state tracking. As engineering systems become more complex, addressing the question of when cluster flocking is an optimal team strategy will become necessary to achieve long-term swarm operation. Line flocking as a strategy has been explored as a tradeoff between the energy savings of flocking and the energy cost of rerouting to join a flock. Significant research effort has gone toward the rendezvous problem, that is, given a set of agents with distinct origins and destinations, when is it optimal for agents to expend energy in order to form an energy-saving flock. This has primarily been explored through the lens of air traffic management, where commercial aircraft may rendezvous to form flocks between origin and destination airports given a takeoff and landing window.

A centralized approach to the rendezvous problem was presented in Ribichini and Frazzoli (2003), where the authors proved several properties of energy-optimal rendezvous for two agents. The two-agent case was further explored in Rao and Kabamba (2006) for minimum-time graph traversal. The effect of wind and environmental factors was explored in Marks et al. (2018), where the authors used historical traffic and environmental data to show a 5–7% increase in fuel economy resulting from coordination. A flocking protocol for selfish agents was presented in Azoulay and Reches (2019), and the air traffic routing problem has been extensively explored in Kent (2015) and Verhagen (2015).

Flocking as a strategy was also explored in the context of passenger vehicle eco-routing by Fredette (2017).
In this approach, the author adapted Reynolds flocking rules to a two-lane highway with the objective of minimizing vehicle fuel consumption while maintaining a desired velocity subject to the physical parameters describing each vehicle. This resulted in each vehicle approaching its desired speed while dynamically forming and exiting flocks under a centralized control scheme.
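Returning to the neighbor screening rule of Zhou and Li (2017), (26) and (27), the following minimal sketch shows how an agent might split its neighbors into the two screened sets. It is not the authors’ implementation; it assumes $s_{ij} = p_j - p_i$ and $\dot{s}_{ij} = v_j - v_i$, takes the inequalities to be relative to zero, and uses hypothetical function and variable names.

```python
import math


def screen_neighbors(i, positions, velocities, neighbors, d):
    """Split agent i's neighbors following the structure of (26)-(27).

    Assumptions (not from the source): s_ij = p_j - p_i and
    ds_ij/dt = v_j - v_i. Returns two lists:
      far_receding     - farther than d and not approaching,
      near_approaching - closer than d and not receding.
    Neighbors in neither list are screened out of the planning problem.
    """
    far_receding, near_approaching = [], []
    for j in neighbors[i]:
        if j == i:
            continue
        s = [pj - pi for pi, pj in zip(positions[i], positions[j])]
        s_dot = [vj - vi for vi, vj in zip(velocities[i], velocities[j])]
        dist = math.hypot(*s)
        # Positive when the separation is growing, negative when shrinking.
        radial_rate = sum(a * b for a, b in zip(s, s_dot))
        if dist > d and radial_rate >= 0:
            far_receding.append(j)
        elif dist < d and radial_rate <= 0:
            near_approaching.append(j)
    return far_receding, near_approaching


# Hypothetical example: agent 0 at rest at the origin, desired distance d = 2.
positions = {0: (0.0, 0.0), 1: (3.0, 0.0), 2: (1.0, 0.0), 3: (0.0, 3.0)}
velocities = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (-1.0, 0.0), 3: (0.0, -1.0)}
neighbors = {0: [0, 1, 2, 3]}
far, near = screen_neighbors(0, positions, velocities, neighbors, d=2.0)
```

In this example agent 3 is far away but already closing the gap, so it belongs to neither set and does not need to be considered when agent 0 plans.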

9. Outlook and Research Directions

In the past twenty years, a rich literature on the control of flocking systems has been produced. Control algorithms that implement variants of Reynolds rules have proven rigorous guarantees on their steady-state behavior. Recently, control algorithms that optimally implement these rules have been demonstrated in simulation and large-scale outdoor flight tests. Flocking, as defined by Reynolds, will seemingly be driven by advances in decentralized control, robust control, and long-duration autonomy in the future. However, some application areas, such as mobile sensor networks, have criticized Reynolds flocking as a novelty that does not necessarily have advantages in terms of performance or ease of implementation; see Albert and Imsland (2018).

Therefore, we think that a new paradigm for viewing the nature of flocking is necessary. As we demonstrated, there is a distinction in the natural world between cluster and line flocking. We wish to strengthen this distinction, and to that end, we propose a partition of the literature into line and cluster flocking. We have also presented several types of cluster flocking, defined by the system-level objective, that have been conflated using the nebulous term “flocking” throughout the literature. In fact, we see no compelling reason why a controller based on potential fields or $\alpha$-lattices ought to capture Reynolds flocking, reference state flocking, or ergodic flocking. Due to the nature of engineering systems, new types of cluster flocking have already emerged that have no natural counterpart. For this reason, we believe that precisely classifying and differentiating between these types of flocking will be essential to advancing the research frontier of flocking as a desirable emergent behavior.

Furthermore, we think that constraint-driven optimal control should be the “natural language” to formulate flocking and other emergence problems.
Under this design paradigm, it is possible to achieve rigorous guarantees on the safety and tasks imposed on agents as they travel along energy-minimizing surfaces. There has already been some initial exploration into Reynolds flocking, e.g., see Ibuki et al. (2020), and systems with limited communication range under disk flocking; see Beaver and Malikopoulos (2020). These approaches have also shown a capacity for generating emergence in relatively simple multi-agent systems, e.g., see Notomista and Egerstedt (2019), and the imposed constraints provide guarantees on agent behavior to neighbors and the system designer. Moving forward, we expect that by applying similar solution methods to those used in the past, e.g., see Jadbabaie et al. (2003); Tanner et al. (2007), we may provide guarantees on the behavior of many types of cluster flocking agents.

Finally, including heterogeneity in cluster and line flocking will be essential as we roll out optimal flocking control algorithms to physical systems, where it is impossible for any two robots to have identical physical properties and performance capabilities. Heterogeneity of agent properties is particularly important in the line flocking literature, where the variable size, wingspan, metabolism, and age of flock members significantly affect the system’s overall energy savings; see Mirzaeinia et al. (2020). Prorok et al. (2017) have also shown that for general swarm systems, an increase in agent diversity will expand the feasible solution space for each agent’s control action. This may be beneficial in terms of system robustness, especially for applications related to emerging transportation systems; however, it may also increase the difficulty of finding an optimal solution. By explicitly including heterogeneity into a flocking system, it is possible to generate a larger space of possible emergent behavior. Future flocking research ought to consider diversity in agent properties and behaviors to exploit the full benefits of swarm intelligence.
Acknowledgement

The authors would like to thank Bert Tanner for the insightful remarks and suggestions.

References

Albert, A., Imsland, L., 2018. Survey: mobile sensor networks for target searching and tracking. Cyber-Physical Systems 4, 57–98.
Atrianfar, H., Haeri, M., 2013. Flocking of multi-agent dynamic systems with virtual leader having the reduced number of informed agents. Transactions of the Institute of Measurement and Control 35, 1104–1115.
Azoulay, R., Reches, S., 2019. UAV flocks forming for crowded flight environments, in: ICAART 2019 - Proceedings of the 11th International Conference on Agents and Artificial Intelligence, SciTePress. pp. 154–163.
Bajec, I.L., Heppner, F.H., 2009. Organized flight in birds. Animal Behaviour 78, 777–789.
Ballerini, M., Cabibbo, N., Candelier, R., Cavagna, A., Cisbani, E., Giardina, I., Lecomte, V., Orlandi, A., Parisi, G., Procaccini, A., Viale, M., Zdravkovic, V., 2008. Interaction ruling animal collective behavior depends on topological rather than metric distance: Evidence from a field study. Proceedings of the National Academy of Sciences of the United States of America 105, 1232–1237.
Barve, A., Nene, M.J., 2013. Survey of Flocking Algorithms in Multi-agent Systems. International Journal of Computer Science 19, 110–117.
Bayindir, L., 2016. A review of swarm robotics tasks. Neurocomputing 172, 292–321.
Beaver, L.E., Chalaki, B., Mahbub, A.M., Zhao, L., Zayas, R., Malikopoulos, A.A., 2020a. Demonstration of a Time-Efficient Mobility System Using a Scaled Smart City. Vehicle System Dynamics 58, 787–804.
Beaver, L.E., Dorothy, M., Kroninger, C., Malikopoulos, A.A., 2020b. Energy-Optimal Motion Planning for Agents: Barycentric Motion and Collision Avoidance Constraints, in: arxiv:2009.00588.
Beaver, L.E., Kroninger, C., Malikopoulos, A.A., 2020c. An Optimal Control Approach to Flocking, in: 2020 American Control Conference, pp. 683–688.
Beaver, L.E., Malikopoulos, A.A., 2020. Beyond Reynolds: A Constraint-Driven Approach to Cluster Flocking, in: IEEE 59th Conference on Decision and Control (to appear).
Bedruz, R.A., Bandala, A.A., Vicerra, R.R., Conception, R., Dadios, E., 2019a. Design of a Robot Controller for Peloton Formation using Fuzzy Logic, in: 7th Conference on Robot Intelligence Technology and Applications, pp. 83–88.
Bedruz, R.A.R., Maningo, J.M.Z., Fernando, A.H., Bandala, A.A., Vicerra, R.R.P., Dadios, E.P., 2019b. Dynamic Peloton Formation Configuration Algorithm of Swarm Robots for Aerodynamic Effects Optimization, in: Proceedings of the 7th International Conference on Robot Intelligence Technology and Applications, pp. 264–267.
Camperi, M., Cavagna, A., Giardina, I., Parisi, G., Silvestri, E., 2012. Spatially balanced topological interaction grants optimal cohesion in flocking models. Interface Focus 2, 715–725.
Celikkanat, H., 2008. Optimization of self-organized flocking of a robot swarm via evolutionary strategies, in: 23rd International Symposium on Computer and Information Sciences, IEEE. pp. 1–4.
Chen, G., Liu, Z., Guo, L., 2012. The smallest possible interaction radius for flock synchronization. SIAM Journal on Control and Optimization 50, 1950–1970.
Cucker, F., Smale, S., 2007. Emergent behavior in flocks. IEEE Transactions on Automatic Control 52, 852–862.
Cutts, C.J., Speakman, J.R., 1994. Energy Savings in Formation Flight of Pink-Footed Geese. J. exp. Biol 189, 251–261.
Dave, A., Malikopoulos, A.A., 2019. Decentralized Stochastic Control in Partially Nested Information Structures, in: IFAC-PapersOnLine, Chicago, IL, USA.
Dave, A., Malikopoulos, A.A., 2020. Structural results for decentralized stochastic control with a word-of-mouth communication, in: 2020 American Control Conference (ACC), IEEE. pp. 2796–2801.
Dolev, S., Segal, M., Shpungin, H., 2010. Bounded-Hop Strong Connectivity for Flocking Swarms, in: WiOpt’10: Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks, pp. 269–277.
Dolev, S., Segal, M., Shpungin, H., 2013. Bounded-hop energy-efficient liveness of flocking swarms. IEEE Transactions on Mobile Computing 12, 516–528.
Duan, H., Qiao, P., 2014. Pigeon-inspired optimization: A new swarm intelligence optimizer for air robot path planning. International Journal of Intelligent Computing and Cybernetics 7, 24–37.
Egerstedt, M., Pauli, J.N., Notomista, G., Hutchinson, S., 2018. Robot ecology: Constraint-based control design for long duration autonomy. Annual Reviews in Control 46, 1–7.
Ferrari, S., Foderaro, G., Zhu, P., Wettergren, T.A., 2016. Distributed Optimal Control of Multiscale Dynamical Systems: A Tutorial. IEEE Control Systems 36, 102–116.
Fine, B.T., Shell, D.A., 2013. Unifying microscopic flocking motion models for virtual, robotic, and biological flock members. Autonomous Robots 35, 195–219.
Fredette, D., 2017. Fuel-Saving behavior for Multi-Vehicle Systems: Analysis, Modeling, and Control. Ph.D. thesis. The Ohio State University.
Hauert, S., Leven, S., Varga, M., Ruini, F., Cangelosi, A., Zufferey, J.C., Floreano, D., 2011. Reynolds flocking in reality with fixed-wing robots: Communication range vs. maximum turning rate, in: 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 5015–5020.
Hayes, A.T., Dormiani-Tabatabaei, P., 2002. Self-Organized Flocking with Agent Failure: Off-Line Optimization and Demonstration with Real Robots, in: IEEE International Conference on Robotics and Automation, pp. 3900–3905.
Hung, S.M., Givigi, S.N., 2017. A Q-Learning Approach to Flocking with UAVs in a Stochastic Environment. IEEE Transactions on Cybernetics 47, 186–197.
Ibuki, T., Wilson, S., Yamauchi, J., Fujita, M., Egerstedt, M., 2020. Optimization-Based Distributed Flocking Control for Multiple Rigid Bodies. IEEE Robotics and Automation Letters 5, 1891–1898.
Jadbabaie, A., Lin, J., Morse, A.S., 2003. Mobile Autonomous Agents Using Nearest Neighbor Rules. IEEE Transactions on Automatic Control 48, 988–1001.
Jafari, M., Xu, H., Carrillo, L.R.G., 2020. A biologically-inspired reinforcement learning based intelligent distributed flocking control for Multi-Agent Systems in presence of uncertain system and dynamic environment. IFAC Journal of Systems and Control 13, 100096.
Jang, K., Vinitsky, E., Chalaki, B., Remer, B., Beaver, L., Malikopoulos, A.A., Bayen, A., 2019. Simulation to scaled city: zero-shot policy transfer for traffic control via autonomous vehicles, in: Proceedings of the 10th ACM/IEEE International Conference on Cyber-Physical Systems, pp. 291–300.
Kennedy, J., Eberhart, R., 1995. Particle Swarm Optimization, in: International Conference on Neural Networks, pp. 1942–1948.
Kent, T.E., 2015. Optimal Routing and Assignment for Commercial Formation Flight. Ph.D. thesis. University of Bristol.
Kesireddy, A., Shan, W., Xu, H., 2019. Global Optimal Path Planning for Multi-agent Flocking: A Multi-Objective Optimization Approach with NSGA-III, in: Proceedings of the 2019 IEEE Symposium Series on Computational Intelligence, pp. 64–71.
Khodayari, E., Sattari-Naeini, V., Mirhosseini, M., 2016. Flocking Control with Single-COM for Tracking a Moving Target in Mobile Sensor Network Using Gravitational Search Algorithm, in: Proceedings of the 1st Conference on Swarm Intelligence and Evolutionary Computation, pp. 125–130.
Koren, Y., Borenstein, J., 1991. Potential Field Methods and their Inherent Limitations for Mobile Robot Navigation, in: Proceedings of the 1991 IEEE International Conference on Robotics and Automation.
La, H.M., Lim, R., Sheng, W., 2015. Multirobot cooperative learning for predator avoidance. IEEE Transactions on Control Systems Technology 23, 52–63.
La, H.M., Nguyen, T.H., Nguyen, C.H., Nguyen, H.N., 2009. Optimal Flocking Control for a Mobile Sensor Network Based on a Moving Target Tracking, in: Proceedings of the 2009 IEEE International Conference on Systems, Man, and Cybernetics, IEEE. pp. 4801–4806.
Lee, S.M., Myung, H., 2013. Particle swarm optimization-based distributed control scheme for flocking robots, in: Advances in Intelligent Systems and Computing, Springer Verlag. pp. 517–524.
Li, H., Peng, J., Liu, W., Wang, J., Liu, J., Huang, Z.,
nd Automation, pp. 3293–3298.
Shannon, C.E., 1948. A Mathematical Theory of Communication. The Bell System Technical Journal 27, 379–423.
Song, Z., Lipinski, D., Mohseni, K., 2017. Multi-vehicle cooperation and nearly fuel-optimal flock guidance in strong background flows. Ocean Engineering 141, 388–404.
Sun, F., Wang, R., Zhu, W., Li, Y., 2019. Flocking in nonlinear multi-agent systems with time-varying delay via event-triggered control. Applied Mathematics and Computation 350, 66–77.
Tanner, H.G., Jadbabaie, A., Pappas, G.J., 2007. Flocking in fixed and switching networks. IEEE Transactions on Automatic Control 52, 863–868.
Vadakkepat, P., Tan, K.C., Ming-Liang, W., 2000. Evolutionary Artificial Potential Fields and Their Application in Real Time Robot Path Planning, in: 2000 Congress on Evolutionary Computation, IEEE. pp. 256–263.
Vásárhelyi, G., Virágh, C., Somorjai, G., Nepusz, T., Eiben, A.E., Vicsek, T., 2018. Optimized flocking of autonomous drones in confined environments. Science Robotics 3.
Vatankhah, R., Etemadi, S., Honarvar, M., Alasty, A., Boroushaki, M., Vossoughi, G., 2009. Online velocity optimization of robotic swarm flocking using particle swarm optimization (PSO) method, in: 2009 6th International Symposium on Mechatronics and its Applications, pp. 1–6.
Veitch, C., Render, D., Aravind, A., 2019. Ergodic Flocking, in: Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 6957–6962.
Verhagen, C.M.A., 2015. Formation flight in civil aviation Development of a decentralized approach to formation flight routing. Ph.D. thesis. Delft University of Technology.
Vicsek, T., Czirok, A., Ben-Jacob, E., Cohen, I., Shochet, O., 1995. Novel Type of Phase Transition in a System of Self-Driven Particles. Physical Review Letters 75, 1226–1229.
Virágh, C., Nagy, M., Gershenson, C., Vásárhelyi, G., 2016. Self-organized UAV Traffic in Realistic Environments, in: 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 1645–1652.
Wang, C., Wang, J., Zhang, X., 2018. A deep reinforcement learning approach to flocking and navigation of UAVs in large-scale complex environments, in: 2018 IEEE Global Conference on Signal and Information Processing, pp. 1228–1232.
Wang, H., Rubenstein, M., 2020. Shape Formation in Homogeneous Swarms Using Local Task Swapping. IEEE Transactions on Robotics 36, 597–612.
Wilson, S., Glotfelter, P., Wang, L., Mayya, S., Notomista, G., Mote, M., Egerstedt, M., 2020. The Robotarium: Globally Impactful Opportunities, Challenges, and Lessons Learned in Remote-Access, Distributed Control of Multirobot Systems. IEEE Control Systems 40, 26–44.
Xu, H., Carrillo, L.R.G., 2015. Distributed Near Optimal Flocking Control for Multiple Unmanned Aircraft Systems, in: Proceedings of the 2015 International Conference on Unmanned Aircraft Systems, pp. 879–885.
Xu, H., Carrillo, L.R.G., 2017. Fast reinforcement learning based distributed optimal flocking control and network co-design for uncertain networked multi-UAV system, in: Unmanned Systems Technology XIX, SPIE. p. 1019511.
Yang, J., Grosu, R., Smolka, S.A., Tiwari, A., 2016. Love thy neighbor: V-formation as a problem of model predictive control, in: 27th International Conference on Concurrency Theory, pp. 4:1–4:5.
Yuan, Q., Zhan, J., Li, X., 2017. Outdoor flocking of quadcopter drones with decentralized model predictive control. ISA Transactions 71, 84–92.
Zavlanos, M.M., Tanner, H.G., Jadbabaie, A., Pappas, G.J., 2009. Hybrid control for connectivity preserving flocking. IEEE Transactions on Automatic Control 54, 2869–2875.
Zhan, J., Li, X., 2011a.
Decentralized Flocking Protocolof Multi-agent Systems with Predictive Mechanisms, in:Proceedings of the 30th Chinese Control Conference, pp.5995–6000.Zhan, J., Li, X., 2011b. Flocking of Discrete-time Multi-Agent Systems with Predictive Mechanisms, in: 18thIFAC World Congress, pp. 5669–5674.Zhan, J., Li, X., 2013. Flocking of multi-agent systemsvia model predictive control based on position-only mea-surements. IEEE Transactions on Industrial Informatics9, 377–385.Zhang, H.T., Chen, M.Z., Stan, G.B., Zhou, T., Ma-cIejowski, J.M., 2008. Collective behavior coordinationwith predictive mechanisms. IEEE Circuits and SystemsMagazine 8, 67–85.Zhang, H.T., Cheng, Z., Chen, G., Li, C., 2015. Modelpredictive ﬂocking control for second-order multi-agentsystems with input constraints. IEEE Transactions onCircuits and Systems I: Regular Papers 62, 1599–1606.Zhang, H.T., Liu, B., Cheng, Z., Chen, G., 2016. ModelPredictive Flocking Control of the Cucker-Smale Multi-Agent Model with Input Constraints. IEEE Transactionson Circuits and Systems I: Regular Papers 63, 1265–1275.Zheng, H., Panerati, J., Beltrame, G., Prorok, A., 2020.An adversarial approach to private ﬂocking in mobilerobot teams. IEEE Robotics and Automation Letters 5,1009–1016.Zhou, L., Li, S., 2017. Distributed model predictive controlfor multi-agent ﬂocking via neighbor screening optimiza-tion. International Journal of Robust and NonlinearControl 27, 1690–1705.Zhu, B., Xie, L., Han, D., 2016. Recent Developments inControl and Optimization of Swarm Systems: A BriefSurvey, in: 12th IEEE International Conference on Con- rol and Automation, pp. 19–24.Zhu, B., Xie, L., Han, D., Meng, X., Teo, R., 2017. Asurvey on recent progress in control of swarm systems.Science China Information Sciences 60.rol and Automation, pp. 19–24.Zhu, B., Xie, L., Han, D., Meng, X., Teo, R., 2017. Asurvey on recent progress in control of swarm systems.Science China Information Sciences 60.