An Overview on Optimal Flocking
Logan E. Beaver, Andreas A. Malikopoulos
Department of Mechanical Engineering, University of Delaware, Newark, DE, 19716, USA
Abstract
The study of robotic flocking has received considerable attention in the past twenty years. As we begin to deploy flocking control algorithms on physical multi-agent and swarm systems, there is an increasing need for rigorous guarantees on safety and performance. In this paper, we present an overview of the literature focusing on optimization approaches to achieve flocking behavior that provide strong safety guarantees. We separate the literature into cluster and line flocking, and categorize cluster flocking with respect to the system objective, which may be realized by a reactive or planning control algorithm. We also present several approaches aimed at minimizing flocking communication and computational requirements in real systems via neighbor filtering and event-driven planning. We conclude the overview with our perspective on the outlook and future research direction of optimal flocking algorithms.
Keywords:
Flocking, optimization, emergence, multi-agent systems, swarm systems
1. Introduction
Generating emergent flocking behavior has been of particular interest since Reynolds proposed three heuristic rules for multi-agent flocking in computer animation; see Reynolds (1987). Robotic flocking has been proposed in several applications including mobile sensing networks, coordinated delivery, reconnaissance, and surveillance; see Olfati-Saber (2006). With the significant advances in computational power in recent decades, the control of robotic swarm systems has attracted considerable attention due to their adaptability, scalability, and robustness to individual failure; see Oh et al. (2017). However, constructing a robot swarm with a large number of robots imposes significant cost constraints on each individual robot. Thus, any real robot swarm consists of individual robots with limited sensing, communication, actuation, memory, and computational abilities. To achieve robotic flocking in a physical swarm, we must develop and employ energy-optimal approaches to flocking under these strict cost constraints.

There have been several surveys and tutorials on decentralized control that include flocking; see Barve and Nene (2013); Bayindir (2016); Ferrari et al. (2016); Zhu et al. (2016, 2017); Albert and Imsland (2018). In one motivating example, Fine and Shell (2013) discuss various flocking controllers without considering optimality. In general, these surveys have all considered flocking and optimal control to be two distinct problems. Thus, we believe it is appropriate to present a comprehensive overview of optimal flocking control algorithms as robotic swarm systems begin to roll out in laboratories, e.g., Rubenstein et al. (2012); Jang et al. (2019); Beaver et al. (2020a); Malikopoulos et al. (2020); Wilson et al. (2020), and field tests, e.g., Vásárhelyi et al. (2018); Mahbub et al.
(2020).

Our objective for this overview is to establish the current frontier of optimal flocking research and present our vision of the research path for the field.

One significant problem throughout the literature is the use of the term "flocking" to describe very different modes of aggregate motion. The biology literature emphasizes this point, e.g., Bajec and Heppner (2009), where the distinction between line flocking (e.g., geese) and cluster flocking (e.g., sparrows) is necessary. To this end, we believe it is helpful to partition the engineered flocking literature into cluster and line flocking. As with natural systems, these modes of flocking have vastly different applications and implementations. Unlike biological systems, the behavior of engineered systems is limited only by the creativity of the designer. Thus, based on the current state of the literature, we have further partitioned cluster flocking into several categories based on the system-level objective. Our proposed flocking taxonomy is shown in Fig. 1. This taxonomy is motivated by the need to control robotic swarms, which is, in general, application dependent.

Figure 1: Our proposed flocking classification scheme for cluster and line flocking.

While an extensive body of literature has studied the convergence of flocking behavior, there has been almost no attention to the development of optimal flocking control algorithms. Although Molzahn et al. (2017) focused on optimal decentralized control in a recent survey, the approaches covered in the paper tend to focus on formation configuration, achieving consensus, or area coverage. In this paper, we seek to summarize the existing literature at the interface of flocking and optimization with an emphasis on minimizing agents' energy consumption.

The objectives of this paper are to: (1) elaborate on a new classification scheme for engineered flocking literature aimed at enhancing the description of flocking research (Fig.
1), (2) summarize the results of the existing optimal flocking literature across engineering disciplines and present the frontier of flocking and optimization research, and (3) propose a new paradigm to understand flocking as an emergent phenomenon to be controlled rather than a desirable group behavior for agents to mimic.

The contribution of this paper is the collection and review of the literature in this area. In several cases, the optimal flocking and formation reconfiguration literature overlap. We have attempted to separate them and present only the material relevant to flocking in this review. Any such effort has obvious limitations. Space constraints limit the presentation of technical details, and thus, extensive discussions are included only where they are important for understanding the fundamental concepts or explaining significant departures from previous work.

The remainder of this paper is structured as follows. In Section 1.1, we present the common notation used throughout the paper. Then, in Section 2, we give an introduction to cluster flocking, which we further partition into Reynolds flocking (Section 3), reference state tracking (Section 4), and remaining cases (Section 5). We further divide each of these sections into reactive and planning approaches. We present the line flocking literature in Section 6, and in Section 7, we discuss several approaches to Pareto front selection for optimal flocking. In Section 8, we discuss the implications of flocking with real robots. In Section 8.1, we present approaches to reducing cyberphysical costs, while in Section 8.2, we present flocking as a group strategy. Finally, in Section 9, we present our research outlook and concluding remarks, and motivate a new direction for flocking research.
We consider a swarm of $N \in \mathbb{N}$ agents indexed by the set $\mathcal{A} = \{1, 2, \dots, N\}$. For each agent $i \in \mathcal{A}$, we denote their position and velocity by $p_i(t)$ and $v_i(t)$, respectively, at time $t \in \mathbb{R}_{\geq 0}$. Agent $i$'s state is denoted by the vector $x_i(t)$, and the state of the system by $x(t) = [x_1^T(t), \dots, x_N^T(t)]^T$. Each agent $i \in \mathcal{A}$ has a neighborhood $\mathcal{N}_i(t) \subseteq \mathcal{A}$, which contains all neighbors that $i$ may sense and communicate with. For consistency, we explicitly include $i \in \mathcal{N}_i(t)$ for all $t$. There are many ways to define a neighborhood, including inter-agent distance, $k$-nearest neighbors, and Voronoi partitions; see Fine and Shell (2013) for further discussion. In most cases, each agent's neighborhood is only a fraction of the swarm; thus, each agent is only able to make partial observations of the entire system state. Using neighborhoods as our basis for local information, we propose the following definition for connected agents.

Definition 1.
Two agents $i, j \in \mathcal{A}$ are connected at time $t$ over a period $T \in \mathbb{R}_{>0}$ if there exists a sequence of neighborhoods
\[ \big\{ \mathcal{N}_i(t_1), \mathcal{N}_{k_1}(t_2), \mathcal{N}_{k_2}(t_3), \dots, \mathcal{N}_{k_n}(t_{n+1}) \big\}, \tag{1} \]
such that
\[ k_1 \in \mathcal{N}_i(t_1),\; k_2 \in \mathcal{N}_{k_1}(t_2),\; \dots,\; j \in \mathcal{N}_{k_n}(t_{n+1}), \tag{2} \]
where $n + 1$ is the length of the sequence and $t \leq t_1 \leq t_2 \leq \dots \leq t_{n+1} \leq t + T$.

Finally, for any two agents $i, j \in \mathcal{A}$, we denote their relative position as
\[ s_{ij}(t) = p_j(t) - p_i(t). \tag{3} \]
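Definition 1 can be checked numerically when the neighborhoods are sampled at a finite number of instants inside the window $[t, t + T]$. The following is a minimal sketch under that assumption; the names `Snapshot`, `reachable_agents`, and `is_connected` are hypothetical and do not come from the surveyed literature. Because the times in the sequence only need to be nondecreasing, several hops are allowed within one sampled instant.

```python
from typing import Dict, Iterable, Set

# One snapshot maps each agent k to its neighborhood N_k at one sampled time.
# Following the notation above, every agent belongs to its own neighborhood.
Snapshot = Dict[int, Set[int]]


def reachable_agents(i: int, snapshots: Iterable[Snapshot]) -> Set[int]:
    """Agents that i reaches through a time-ordered chain of neighborhoods."""
    reached = {i}
    for nbhd in snapshots:          # assumed sorted by time within [t, t + T]
        frontier = set(reached)
        while frontier:             # several hops may share one time instant
            k = frontier.pop()
            new = nbhd.get(k, set()) - reached
            reached |= new
            frontier |= new
    return reached


def is_connected(i: int, j: int, snapshots: Iterable[Snapshot]) -> bool:
    """True if a sequence satisfying (1)-(2) exists in the sampled data."""
    return j in reachable_agents(i, snapshots)


# Agents 1 and 2 are neighbors first; agents 2 and 3 are neighbors later.
snaps = [
    {1: {1, 2}, 2: {1, 2}, 3: {3}},
    {1: {1}, 2: {2, 3}, 3: {2, 3}},
]
print(is_connected(1, 3, snaps))  # True
print(is_connected(3, 1, snaps))  # False
```

In this example agent 1 is connected to agent 3 through agent 2, but not the reverse, since the link between agents 1 and 2 has already disappeared by the time agent 3 reaches agent 2. The sketch therefore illustrates that connectivity over a window depends on the order in which neighborhoods occur.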
2. Cluster Flocking and Swarming
The swarming, aggregate motion of small birds is known as cluster flocking in the biological literature. The benefit of cluster flocking in natural systems is unknown; however, several hypotheses have been proposed. These include predator evasion, estimating the flock population, and sensor fusion. It is also unclear if leadership is necessary to generate the organized motion in cluster flocks of actual birds; Bajec and Heppner (2009) provide a review of swarming in biological systems. In this section, we present each formulation considering that all agents have access to any global reference information when solving their local optimization problem. With this in mind, and based on the work of Olfati-Saber (2006); Cucker and Smale (2007); Tanner et al. (2007), we present a general definition for cluster-flocking behavior in engineered swarms.
Definition 2. (Cluster Flocking) A group of agents achieves cluster flocking if:
1. There exists a finite distance $D \in \mathbb{R}_{>0}$ such that $\|p_i(t) - p_j(t)\| \leq D$ for all $i, j \in \mathcal{A}$ and all $t \in \mathbb{R}_{\geq 0}$.
2. There exists a finite period of time $T \in \mathbb{R}_{>0}$ such that every pair of agents $i, j \in \mathcal{A}$ is connected for all $t \in \mathbb{R}_{\geq 0}$ (Definition 1).
3. No agent $i \in \mathcal{A}$ has a desired final state (i.e., there is no explicit formation).
4. For each agent $i \in \mathcal{A}$ at each time $t \in \mathbb{R}_{\geq 0}$, there exists a time $T \in \mathbb{R}_{>0}$ such that $\|v_i(t + T)\| > 0$.
3. Reynolds Flocking
A vast amount of literature exists that seeks to achieve flocking under Reynolds flocking rules. Generally, flock centering, velocity matching, and collision avoidance can be captured by imposing the following cost function for each agent $i \in \mathcal{A}$,
\[ J_i = V(\|s_{ij}(t)\|) + \sum_{j \in \mathcal{N}_i(t)} \|\dot{s}_{ij}(t)\|, \tag{4} \]
where $j \in \mathcal{N}_i(t)$ and $V$ is an attractive-repulsive potential function with a local minimum at some desired distance. The first term of (4) manages collision avoidance and flock centering, while the second term ensures velocity alignment. Fig. 2 shows each component of an agent flocking under Reynolds rules.

Figure 2: A diagram showing the influence of collision avoidance, flock centering, and velocity matching for agent $i$, in green.

Given a distance $d \in \mathbb{R}_{>0}$ that minimizes the potential function in (4), Olfati-Saber (2006) proposed the $\alpha$-lattice, i.e., any configuration of agents such that each agent $i \in \mathcal{A}$ satisfies
\[ \|s_{ij}(t)\| = d, \tag{5} \]
for all $j \in \mathcal{N}_i(t)$. This definition coincides with the global minimum of (4), and many authors have substituted (5) for the flock centering and collision avoidance rules of Reynolds. Next, we present three different approaches to designing optimal Reynolds flocking controllers.

To optimally flock in a reactive system, each agent works to minimize a global objective, such as velocity alignment of the flock, while only making partial observations of the total state $x(t)$. Therefore, optimal reactive flocking methods generally rely on designing an optimal control policy using simulation and experimental data.
Generally, these approaches seek to find the optimal weights for a controller of the form
\[ u_i(t) = -\sum_{j \in \mathcal{N}_i(t) \setminus \{i\}} \nabla V(\|s_{ij}\|) - \sum_{j \in \mathcal{N}_i(t) \setminus \{i\}} \dot{s}_{ij}(t), \tag{6} \]
where the first term minimizes the potential field and the second term handles velocity alignment.

An early approach to optimally follow Reynolds flocking rules was presented by Morihiro et al. (2006a), where the authors took a learning-based approach to velocity alignment. In this work, each agent $i \in \mathcal{A}$ observes the state, $x_j(t)$, of a randomly selected agent $j \in \mathcal{A} \setminus \{i\}$ at each time step $t$. Agent $i$ then follows one of four motion primitives: (1) move toward $j$, (2) move away from $j$, (3) move in the same direction as $j$, or (4) move in the opposite direction of $j$. The agents are rewarded for achieving velocity alignment and staying near some desirable distance $d$ of their neighbors, i.e., velocity matching and flock centering. In addition, the authors included a set of predators that would attempt to disrupt the flock. In this case, the agents observe the state of the predator with probability 1 whenever it is within range. Agent $i$ is then rewarded for evading the predator and maintaining the structure of the flock. Further simulation results for this method are presented in Morihiro et al. (2006b).

Flocking was formulated as a dynamic program by Wang et al. (2018) to generate optimal trajectories for a swarm of quadrotors. The objective is for the quadrotors to follow Reynolds flocking rules while moving the swarm center to a global reference position. The agents follow unicycle dynamics, and each agent observes the state of its nearest left and right neighbor to determine its control action. This angular symmetry in neighbor selection reduces the likelihood of the agents forming isolated cliques, which is a common issue in the distance and nearest-neighbor definitions of neighborhoods; see Camperi et al. (2012); Fine and Shell (2013).
The authors penalize each agent for violating Reynolds flocking rules, coming within some distance of an obstacle, and not moving toward the desired location. They also incorporate a constant transition penalty if the agent is not within a fixed distance of the goal. Then an infinite-horizon discounted problem is formulated and the optimal policy is learned using a standard deep reinforcement learning approach. The policy is verified on a group of $N = 3$ agents, and the authors show that the decentralized control policy generalizes to larger systems of 5 and 7 uncrewed aerial vehicles (UAVs) without significant deterioration of the final objective function value.

Metaheuristic algorithms, including pigeon-inspired optimization, e.g., see Duan and Qiao (2014), and particle swarm optimization, e.g., see Kennedy and Eberhart (1995), have been used to generate systems that optimally follow Reynolds flocking rules. In Qiu and Duan (2020), the authors optimized the control actions of a UAV with state and control constraints. This is achieved by breaking the controller into flocking and obstacle avoidance components, then using pigeon-inspired optimization to weight each component such that the deviation from Reynolds flocking rules is minimized while avoiding collisions.

Navarro et al. (2015) applied particle swarm optimization to optimize a neural network controller with 50 weights, nine inputs, and two outputs. The inputs consist of distance measurements for each octant around the agent and the average heading of all neighboring agents. The outputs of the neural network are speed commands for the left and right motor of a differential drive robot. The system is trained to maximize a linear combination of local velocity alignment, desired inter-robot spacing, and the average velocity of the flock. The agents are trained in simulation in both the local and global information cases.
The authors showed that a neural network trained on 4 agents can be generalized up to groups of 16.

The effect of control input constraints for an optimal flocking controller was studied in Celikkanat (2008). In this work, the authors sought to design a local control law based on maximizing velocity alignment while minimizing deviation from an $\alpha$-lattice. They included the average heading of the system as a global order parameter and an entropy parameter which applied Shannon's information entropy metric, e.g., see Shannon (1948), to the proportion of robots within a disk of diameter $h$. The flocking controller parameters are optimized using a genetic algorithm while its performance is validated in simulation. Another genetic algorithm was employed by Vásárhelyi et al. (2018) to design the feedback controller for an individual agent, which is parameterized in terms of 11 optimization variables. The authors optimize the agents in a constrained environment with a complicated objective function that includes the minimization of collision risk with walls and other agents, deviation from desired speed, and the number of disconnected agents, while simultaneously maximizing velocity alignment and the largest cluster size. The control variables are optimized offline in a realistic simulation that includes stochastic disturbances for desired flock speeds of 4, 6, and 8 m/s. The controller is validated in outdoor flight experiments with 30 Pixhawk drones flown over 10-minute intervals.

Up to this point, obstacle avoidance and safety have been accomplished through artificial potential fields and attractive-repulsive forces. In addition, the design of potential fields has been the subject of significant research for general navigation problems; see Vadakkepat et al. (2000). However, applying potential fields to multi-agent systems has been shown to have several drawbacks; see Koren and Borenstein (1991).
These include introducing steady oscillations to trajectories and exacerbating deadlock in crowded environments. A promising alternative to potential field methods, which explicitly guarantees safety, has been proposed as a novel paradigm for the design of long-duration robotic systems by Egerstedt et al. (2018). In this approach, the tasks of each agent are imposed as motion constraints while the agents always seek to follow energy-minimizing trajectories. We interpret this constraint-driven approach to control as understanding why agents take particular control actions, rather than designing control algorithms that mimic a desirable behavior. To the best of our knowledge, reactive constraint-driven Reynolds flocking has only been explored by Ibuki et al. (2020). Under this approach, each agent $i \in \mathcal{A}$ generates an optimal control trajectory by solving the following optimal control problem at each time $t$,
\[ \min_{u_i(t),\ \delta_i \in \mathbb{R}} \ \|u_i(t)\| + \delta_i \]
subject to:
\[ \lim_{t \to \infty} \|s_{ij}(t)\| \leq \delta_i, \tag{7} \]
\[ \lim_{t \to \infty} \|\phi_{ij}(t)\| = 0, \tag{8} \]
\[ \|s_{ij}(t)\| > 2R \quad \forall t \in \mathbb{R}_{\geq 0}, \tag{9} \]
\[ \forall j \in \mathcal{A} \setminus \{i\}, \]
where $\delta_i$ is a slack variable, $\phi_{ij}$ is a metric for attitude error, and $R$ is the radius of a circle that circumscribes each agent. Constraint (7) corresponds to pose synchronization (flock centering), (8) to attitude synchronization (velocity alignment), and (9) to collision avoidance. The authors generated control inputs for each agent by applying gradient flow coupled with control barrier functions to achieve constraint satisfaction. This guarantees that the agents satisfy the safety constraint, satisfy Reynolds flocking rules within a threshold $\delta_i$, and simultaneously minimize energy consumption.
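To make the reactive controller in (6) concrete, the following is a minimal planar sketch, not the implementation of any surveyed work. It assumes a hypothetical quadratic potential $V(r) = \tfrac{1}{2}(r - d)^2$ with its minimum at the desired distance $d$, and hypothetical gains `k_p` and `k_v` standing in for the weights that the surveyed methods optimize. With $s_{ij} = p_j - p_i$ as in (3), the signs below are chosen so that each term descends the corresponding cost with respect to agent $i$'s own position and velocity; this sign convention is an assumption about how (6) is intended.

```python
import math
from typing import Dict, Iterable, Tuple

Vec = Tuple[float, float]


def reynolds_control(i: int, pos: Dict[int, Vec], vel: Dict[int, Vec],
                     neighbors: Iterable[int], d: float,
                     k_p: float = 1.0, k_v: float = 1.0) -> Vec:
    """Potential-field plus velocity-alignment input for agent i (sketch)."""
    ux, uy = 0.0, 0.0
    for j in neighbors:
        if j == i:
            continue
        sx, sy = pos[j][0] - pos[i][0], pos[j][1] - pos[i][1]
        r = math.hypot(sx, sy)
        if r > 0.0:
            # Negative gradient of V(r) = 0.5 * (r - d)^2 w.r.t. p_i:
            # attracts when r > d, repels when r < d.
            ux += k_p * (r - d) * sx / r
            uy += k_p * (r - d) * sy / r
        # Alignment term: steer v_i toward the neighbor's velocity.
        ux += k_v * (vel[j][0] - vel[i][0])
        uy += k_v * (vel[j][1] - vel[i][1])
    return ux, uy


pos = {1: (0.0, 0.0), 2: (2.0, 0.0)}
vel = {1: (0.0, 0.0), 2: (0.0, 0.0)}
print(reynolds_control(1, pos, vel, {1, 2}, d=1.0))  # (1.0, 0.0)
```

In the usage line, agent 1 is farther from agent 2 than the desired distance, so the input points toward agent 2. The learning-based and metaheuristic methods surveyed above can be read as searching over parameters such as `k_p`, `k_v`, and the shape of the potential.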
As an alternative to simply reacting to the environment and system state, agents may instead plan an optimal trajectory over a time horizon. This can generally improve the performance of the agent, e.g., by avoiding local minima; however, planning generally requires more computational power than a reactive approach. The structure of the information in a decentralized system also creates challenges with respect to the information available over a planning horizon. It has been shown that there is separation between estimation and control for particular decentralized information structures; see Nayyar et al. (2013); Dave and Malikopoulos (2020). However, this is an open problem for the general case. Some proposed solutions include sharing information between agents, e.g., see Morgan et al. (2016), only planning with agents shared between neighbors, e.g., see Dave and Malikopoulos (2019), and applying model predictive control (MPC). For large swarms of inexpensive agents, widespread information sharing is generally infeasible, and it is unlikely that any common information exists. For this reason, MPC has been a preferred approach in swarm systems. With MPC, each agent plans a sequence of control actions over a time horizon based on its current information about the system. After some time, the agent will replan its trajectory based on whatever new information it has received. Next, we present several approaches to planning optimal trajectories that use Reynolds flocking rules.

A significant number of MPC flocking algorithms seek to minimize deviation from Reynolds flocking rules, which may be implemented through a linear combination of the following objectives:
\[ J_i^d(t) = \sum_{j \in \mathcal{N}_i(t)} \big( \|s_{ij}(t)\| - d \big)^2, \tag{10} \]
\[ J_i^v(t) = \|\bar{v}_i(t) - v_i(t)\|, \tag{11} \]
\[ J_i^u(t) = \|u_i(t)\|, \tag{12} \]
where $d$ is the desired separating distance in (5), and $\bar{v}_i(t)$ is the average velocity of all agents $j \in \mathcal{N}_i(t)$. Eq.
(10) corresponds to flock centering, (11) to velocity matching, and (12) is a control effort penalty term.

The analysis by Zhang et al. (2008) presents a mechanism for flocking agents to estimate their neighbors' future trajectories. The predictive device was applied by Zhan and Li (2011b) to achieve Reynolds flocking under a fully connected communication topology. This was extended to the decentralized information case in Zhan and Li (2011a) and validated experimentally with outdoor flight tests in Yuan et al. (2017).

An infinite-horizon, continuous-time MPC approach that minimizes flocking error was employed in Xu and Carrillo (2015) and Xu and G. Carrillo (2017). The resulting Hamilton-Jacobi-Bellman equation is nonlinear and without an explicit solution. As a result, the authors applied an original reinforcement learning technique to optimize the agent trajectories online and validated the performance in simulation. The reinforcement learning architecture is expanded on in Jafari et al. (2020), where the authors include model mismatch and significant environmental disturbances acting upon the agents. They also present simulation and experimental results for a flock of quadrotors.

To guarantee feasibility of the planned trajectories, it is necessary to explicitly impose constraints that bound the maximum control and velocity of each individual agent within their physical limits, i.e., for each agent $i \in \mathcal{A}$,
\[ \|v_i(t)\| \leq v_{\max}, \tag{13} \]
\[ \|u_i(t)\| \leq u_{\max}, \tag{14} \]
for all $t \in \mathbb{R}_{\geq 0}$. An analysis of constrained $\alpha$-lattice flocking under MPC which incorporated (13) and (14) was explored in Zhang et al. (2015), and was extended to velocity alignment in Zhang et al. (2016).

As we begin to implement flocking in physical swarms, explicit guarantees of safety are imperative for any proposed control algorithm.
The most straightforward approach to guarantee safety is to circumscribe each agent $i \in \mathcal{A}$ entirely within a closed disk of radius $R \in \mathbb{R}_{>0}$. The safety constraint for $i$ may then be formulated as
\[ \|s_{ij}(t)\| \geq 2R, \quad \forall j \in \mathcal{A} \setminus \{i\}. \tag{15} \]
In general, applying MPC to each agent does not guarantee that coupled constraints, such as (15), are satisfied. At any planning instant, agent $i$ only has the trajectories generated by $j \in \mathcal{N}_i(t)$ at a previous time step. Thus, agent $i$ cannot guarantee safety constraints for the trajectory generated by agent $j$ at the current time. For this reason, in the decentralized case, agents must either cooperatively plan trajectories or impose a compatibility constraint. To guarantee that coupled constraints between agents are satisfied, significant research effort has been dedicated to decentralized MPC (DMPC). A common approach to DMPC is to design a communication protocol for agents to iteratively generate trajectories while driving their cost to a local minimum. An iterative approach, proposed by Zhan and Li (2013), cooperatively generates trajectories while limiting the number of messages exchanged between agents. The agents apply an impulse acceleration at discrete intervals and seek to minimize the flock centering error over a finite horizon. The agents sequentially generate trajectories up to some index $l \geq N$, where at each iteration $k$, agent
\[ i = (k \bmod N) + 1, \quad i \in \mathcal{A}, \quad k = 0, 1, \dots, l - 1, \]
plans its trajectory, and the cost of agent $i$'s trajectory is nonincreasing with each planning iteration. Beaver et al. (2020c) applied Reynolds flocking rules as an endpoint cost in a continuous optimal control problem while including (13)-(15) as constraints. Each agent $i \in \mathcal{A}$ first generates a trajectory while relaxing the safety constraint (15). Agent $i$ then exchanges trajectory information with every other $j \in \mathcal{N}_i(t)$. Finally, any agents violating (15) cooperatively generate the centralized safety-constrained trajectory between fixed start and end points, which guarantees safety.
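The objectives (10)-(12) and the constraints (13)-(15) can be evaluated at a single time instant as in the sketch below, which is an illustration under stated assumptions rather than any surveyed planner. The function names, the weight tuple, and the parameter `min_sep` (the minimum center-to-center distance in the safety constraint) are hypothetical. The spacing term is squared here by assumption, and agent $i$ itself is skipped in that sum, since its own term would only add a constant.

```python
import math
from typing import Dict, Iterable, Tuple

Vec = Tuple[float, float]


def _norm(a: Vec) -> float:
    return math.hypot(a[0], a[1])


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1])


def flocking_cost(i: int, pos: Dict[int, Vec], vel: Dict[int, Vec], u_i: Vec,
                  nbrs: Iterable[int], d: float,
                  weights: Tuple[float, float, float] = (1.0, 1.0, 1.0)) -> float:
    """Weighted sum of spacing, velocity-matching, and effort terms."""
    nbrs = list(nbrs)  # expected to contain i, as in the notation above
    j_d = sum((_norm(_sub(pos[j], pos[i])) - d) ** 2 for j in nbrs if j != i)
    n = len(nbrs)
    v_bar = (sum(vel[j][0] for j in nbrs) / n, sum(vel[j][1] for j in nbrs) / n)
    j_v = _norm(_sub(v_bar, vel[i]))
    j_u = _norm(u_i)
    w_d, w_v, w_u = weights
    return w_d * j_d + w_v * j_v + w_u * j_u


def is_feasible(i: int, pos: Dict[int, Vec], vel: Dict[int, Vec], u_i: Vec,
                v_max: float, u_max: float, min_sep: float) -> bool:
    """Check speed, input, and pairwise-separation bounds for agent i."""
    if _norm(vel[i]) > v_max or _norm(u_i) > u_max:
        return False
    return all(_norm(_sub(pos[j], pos[i])) >= min_sep for j in pos if j != i)
```

An MPC planner would sum such a cost over the planning horizon and enforce the feasibility check at every step. Note that the separation check ranges over all agents in `pos`, which a single agent can only evaluate for the trajectories it has actually received; this is the coupling issue that the DMPC approaches above address.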
4. Reference State Cluster Flocking
A common application that exhibits cluster flocking behavior is tracking a reference trajectory with the center of mass of a swarm. In this application, the reference trajectory (also called a virtual leader) is generally presented as a time-varying reference state, $x_r(t)$, which may be known to all agents. In general, this is appended to the standard flocking controller (6) of informed agents with the feedback term
\[ f_i^r(t) = \|x_i(t) - x_r(t)\|, \tag{16} \]
which may be scaled with a positive control gain. To track the center of mass, the swarm must satisfy the condition
\[ \Big\| \frac{1}{N} \sum_{i \in \mathcal{A}} x_i(t) - x_r(t) \Big\| \leq \epsilon, \tag{17} \]
for some threshold $\epsilon \geq 0$. As with Reynolds flocking, the information available to each agent $i \in \mathcal{A}$ is restricted to its neighborhood, $\mathcal{N}_i(t)$. This is insufficient to evaluate (17). Thus, the center of mass tracking problem has generally been formulated as an optimal controller design problem. A schematic of reference state cluster flocking agents is presented in Fig. 3.

Figure 3: Agent $i$, in green, selects the control input that drives the center of the flock toward the reference state, $x_r$.

An early approach proposed by Hayes and Dormiani-Tabatabaei (2002) sought to track a reference point with a flock of agents that followed Reynolds flocking rules with an additional attractive force toward the reference state. The agents are placed in a rectangular domain, where each agent has a uniform probability of failing over a given period, i.e., the agent would stop moving but still be detectable. Simulations are performed to determine the controller gains that minimize a combination of travel time, cumulative distance traveled, and average inter-agent spacing. The resulting controller is validated in physical experiments with 10 robots. This objective has become standard in many flocking applications; see Bayindir (2016).

Another approach to reference tracking, proposed by La et al. (2009), involves the selection of the optimal weights for a feedback controller such that the reference trajectory $x_r(t)$ is tracked in minimum time while maintaining an $\alpha$-lattice configuration. The authors construct a cost function which penalizes the time taken for the flock to catch the reference trajectory scaled by their initial position. The resulting cost function is non-convex and non-differentiable, and thus it is minimized by applying a genetic algorithm. To guarantee that (17) is globally satisfied, all cases that do not yield an $\alpha$-lattice within some error bound are discarded. The discrete-time version of this system is optimized by Khodayari et al. (2016) using a gravitational search algorithm.

La et al.
(2015) proposed a hybrid flocking-learning system to guarantee flocking behavior in the presence of obstacles and predators. At the agent level, each agent seeks to reach the static reference position $p_r$ with the center of their local neighborhoods. The objective of the system is to have each agent $i \in \mathcal{A}$, in a decentralized way, select the optimal $p_r \in \mathcal{P}$ from a finite set of positions, $\mathcal{P}$. Each agent is rewarded proportionately to the size of its neighborhood at each time step, up to a maximum value of 6. The authors implemented a cooperative $Q$-learning approach, where each agent $i \in \mathcal{A}$ was rewarded for selecting the appropriate $p_r$ by
\[ Q_i^{k+1} = w\, Q_i^k(s_i, a_i) + (1 - w) \sum_{j \in \mathcal{N}_i(t)} Q_j^k(s_j, a_j), \tag{18} \]
where $w \in [0, 1]$ weighs the influence of $i$'s neighbors, and $s_i$, $a_i$ are the state and action taken by agent $i$, respectively. The convergence properties of this cooperative learning scheme are proved and the performance is demonstrated in simulations and experiments.

To track the reference trajectory under realistic conditions, Virágh et al. (2016) sought optimal values for a potential-field based controller in $\mathbb{R}^2$ and $\mathbb{R}^3$ under the effects of sensor noise, communication delay, limited sensor update rate, and constraints on the agent's maximum velocity and acceleration. The work is framed in terms of aerial traffic; thus, multiple competing flocks are placed into shared airspace such that their reference trajectories result in conflict between the flocks. The authors presented two controllers, one that maintained constant speed and one with a fixed heading. The potential fields used in the controllers are composed of sigmoid functions parameterized by optimization variables. A compound objective function, proportional to effective velocity and inversely proportional to collision risk, is constructed, while 20 scenarios are generated to find 20 parameter sets for the two optimal controllers. The scenarios consist of every combination of five different initial configurations for both the constant-speed and constant-heading controllers in $\mathbb{R}^2$ and $\mathbb{R}^3$.

As an alternative to deriving an optimal feedback gain, Atrianfar and Haeri (2013) sought to minimize the number of informed agents such that the entire flock could track a known reference trajectory. First, the authors impose that, for a given sensing distance $h > d \in \mathbb{R}_{>0}$, the potential field must tend to infinity as $\|s_{ij}\|$ approaches $h$. This property guarantees that any connected group of agents would remain connected for all time. Thus, any initially connected group of agents containing an informed agent is guaranteed to converge to the reference trajectory.
The latter implies that at most one informed agent would be required for each group of connected agents. In addition, as a function of their initial conditions, some uninformed groups may merge with an informed cluster. Following this reasoning, the authors show that each initial cluster of agents requires at most one informed agent.

Departing from the aforementioned approaches, a centralized approach to tracking a virtual velocity reference was rigorously studied in Piccoli et al. (2016) for double-integrator agents in $\mathbb{R}^k$. The authors presented a consensus-driven control law, based on Cucker-Smale flocking, of the form
\[ u_i(t) = \alpha_i \big( v_r(t) - v_i(t) \big) + (1 - \alpha_i) \cdot \frac{1}{N - 1} \sum_{j \in \mathcal{N}_i(t) \setminus \{i\}} \|s_{ij}(t)\|\, \dot{s}_{ij}(t), \tag{19} \]
where $\alpha_i \in [0, 1]$, $i \in \mathcal{A}$, weighs the tradeoff between consensus and velocity tracking. The authors sought values of $\alpha_i$ such that $\sum_{i \in \mathcal{A}} \alpha_i \leq M$, $M \in \mathbb{R}_{>0}$, while minimizing the error function
\[ e(t) = \frac{1}{N} \sum_{i=1}^{N} \|v_i(t) - v_r(t)\|, \tag{20} \]
over a time interval $[t_0, t_f] \subset \mathbb{R}_{\geq 0}$. The optimal values of $\alpha_i$ were presented for three cases: (1) instantaneously minimizing $\frac{de}{dt}$, (2) minimizing the terminal cost $e(t_f)$, and (3) minimizing the integral cost, $\int_{t_0}^{t_f} e(t)\, dt$. The resulting optimal control analysis implies that, in general, the optimal strategy is to apply the maximum feedback to a few agents before applying moderate feedback to all agents. This aims at driving agents with high variance toward the reference velocity, enhancing the rate of consensus. The authors also noted the presence of dwell time in the terminal cost case, i.e., the optimal strategy includes applying no control input over a nonzero interval of time starting at $t_0$.

Optimal planning has several advantages over reactive methods, although it suffers from a handful of challenges related to information structure. As with the reactive methods, the desired reference trajectory is a time-varying function denoted by $x_r(t)$.
To guarantee that the reference trajectory can be maintained, the agents must be capable of evaluating $x_r(t)$ over their entire planning horizon. In addition, each agent $i \in \mathcal{A}$ generally must plan under the assumption that their neighborhood, $\mathcal{N}_i(t)$, is invariant over the planning horizon. Relaxing this assumption may require an amount of information sharing that is infeasible for large swarm systems.

Lee and Myung (2013) applied collective particle swarm optimization to generate the control trajectory of each agent for a general cost function. In their approach, each agent $i \in \mathcal{A}$ performs a particle swarm optimization with $M \in \mathbb{N}$ particles that correspond to possible control inputs of agent $i$. The agents then transmit their $g < M$ best particles to all $j \in \mathcal{N}_i(t)$ and iteratively solve their local particle swarm optimization until the planned trajectories converge system-wide.

The approach proposed by Lyu et al. (2019) tracks a known reference trajectory by generating the virtual state for each agent $i \in \mathcal{A}$,
\[ z_i(t) = \frac{1}{|\mathcal{N}_i(t)|} \sum_{j \in \mathcal{N}_i(t)} x_j(t), \tag{21} \]
which corresponds to the average state of agent $i$'s neighborhood. Agent $i$ then imposes the constraint
\[ z_j(t) = x_r(t), \quad \forall j \in \mathcal{N}_i(t), \tag{22} \]
using Lagrange relaxation. Since $i \in \mathcal{N}_j(t)$, $j \in \mathcal{N}_i(t)$, the components of (22) are shared between neighboring agents. A gradient descent technique is applied to minimize the deviation from (22).

Reference tracking under uncertainty was explored by Quintero et al. (2013) to track the position of a mobile ground vehicle with a known trajectory, $x_r(t)$. The flocking UAVs travel at a constant speed and altitude with stochasticity in their dynamics. The objective of each agent is to remain within a predefined annulus centered on the ground vehicle. The cost for agent $i \in \mathcal{A}$ is defined as the signed distance of agent $i$ from the edge of the annulus plus a heading alignment term.
The authors then applied dynamic programming to generate an optimal control policy for each agent. This approach was extended by Hung and Givigi (2017) to include external disturbances, with the optimal policy derived in real time under a reinforcement learning framework.
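The virtual state in (21) and the deviation from constraint (22) can be illustrated with a short sketch. This is not the implementation of Lyu et al. (2019); it assumes that each neighborhood contains the agent itself, that states are plain coordinate lists, and that "deviation" is measured as a sum of squared errors. The function names `virtual_state` and `tracking_residual` are hypothetical.

```python
from typing import Dict, List

Vector = List[float]


def virtual_state(i: int, states: Dict[int, Vector],
                  neighbors: Dict[int, List[int]]) -> Vector:
    """Average state over agent i's neighborhood, in the spirit of Eq. (21).

    Assumption: neighbors[i] lists every member of N_i(t), including i.
    """
    members = neighbors[i]
    dim = len(states[i])
    total = [0.0] * dim
    for j in members:
        for k in range(dim):
            total[k] += states[j][k]
    return [component / len(members) for component in total]


def tracking_residual(i: int, states: Dict[int, Vector],
                      neighbors: Dict[int, List[int]], x_ref: Vector) -> float:
    """Sum of squared deviations of z_j from x_ref over all j in N_i(t).

    This is one possible scalar measure of how far constraint (22) is from
    being satisfied; the squared-error form is an assumption of this sketch.
    """
    residual = 0.0
    for j in neighbors[i]:
        z_j = virtual_state(j, states, neighbors)
        residual += sum((a - b) ** 2 for a, b in zip(z_j, x_ref))
    return residual
```

Because each $z_j$ depends on the states of agent $j$'s own neighbors, evaluating the residual for agent $i$ requires information from two hops away unless neighbors share their virtual states, which is why the components of (22) are exchanged between neighboring agents.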
5. Other Cluster Flocking
In addition to Reynolds flocking and centroid tracking, several other applications have been shown to induce cluster flocking behavior. Although not widely addressed in the literature, these results demonstrate the breadth of applications that may yield cluster flocking behavior.

Anisotropy in the angle between neighboring flockmates was proposed as a metric for measuring the quality of a flock of birds by Ballerini et al. (2008). Makiguchi and Inoue (2010) constructed a measure for anisotropy using a projection matrix

$$M^{(n)}_{pq} = \frac{1}{N} \sum_{i \in \mathcal{A}} (\hat{s}_{ik} \cdot p)(\hat{s}_{ik} \cdot q), \tag{23}$$

where $k$ indexes the $n$'th nearest neighbor of $i$, and $p, q \in \{\hat{x}, \hat{y}, \hat{z}\}$ are vector components of an orthonormal basis for $\mathbb{R}^3$. Eq. (23) can be used to calculate the normalized anisotropy, denoted by $\gamma \in [0, 1]$, which relates $M^{(n)}_{pq}$ to the average agent velocity. The authors' objective was to select the optimal weights for each of Reynolds flocking rules (cohesion, alignment, and separation) to maximize flock anisotropy for the case that $n = 1$ in (23). The authors discarded any parameters that yielded collisions or flock fragmentation and achieved a final anisotropy of $\gamma > \ldots$; $\gamma = \ldots$.

Other optimization techniques outside of genetic algorithms have been applied to the problem of optimal flocking. In Vatankhah et al. (2009), each agent uses local measurements to determine the control input that would maximize the velocity of the swarm center via particle swarm optimization. Veitch et al. (2019) employed ergodic trajectories to achieve flocking. An ergodic trajectory is one where the average position of the agents over time is equal to some spatially distributed probability mass (or density) function. The authors presented a measure of ergodicity by decomposing the probability density function into a finite Fourier series. The proposed control policy for each robot maximizes this metric along an agent's trajectory. Each agent $i \in \mathcal{A}$ periodically shares its Fourier coefficients with all $j \in \mathcal{N}_i(t)$.
This allows the agents to predict where their neighbors have previously explored while also guaranteeing collision avoidance by the nature of ergodicity. Finally, to achieve flocking, the authors generate a uniform probability distribution in a closed disk centered on a reference state in $\mathbb{R}^2$. By construction, this guarantees that all agents will enter the closed disk in finite time and remain within it. Additionally, by smoothly moving the disk around $\mathbb{R}^2$, the average velocity and centroid of the flock can be precisely controlled.

Inspired by Reynolds flocking rules and the constraint-driven paradigm for control, Beaver and Malikopoulos (2020) proposed a set of flocking rules over a planned horizon to achieve cluster flocking by: (1) minimizing energy consumption, (2) staying near the neighborhood center, and (3) avoiding collisions. Condition 2 (aggregation) is imposed with the constraint

$$\left\| p_i(t) - \frac{1}{|\mathcal{N}_i(t)| - 1} \sum_{j \in \mathcal{N}_i(t) \setminus \{i\}} p_j(t) \right\| \le D, \tag{24}$$

for some distance $D$ much greater than the diameter of any agent, and for $|\mathcal{N}_i(t)| > 1$. This approach is visualized in Fig. 4. The proposed constraint seeks to confine each agent within a diameter $D$ disk positioned at their neighborhood center. The intuition is that the agents may move freely within the disk; however, their velocity cannot vary dramatically from the average velocity in their neighborhood for long periods of time. It has been proven that these rules yield velocity consensus asymptotically when $\mathcal{N}_i(t)$ is forward-invariant. In a more recent effort, Beaver et al. (2020b) proposed a method for a constraint-driven agent to generate an optimal control policy in real time. This is an important next step in real-time optimal control of physical flocks.

Figure 4: Agent $i$, in green, is constrained to remain within a disk positioned at its neighborhood center.
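As a rough illustration of the aggregation condition in (24), the following sketch checks whether a single agent currently lies within distance $D$ of its neighborhood center. It assumes the center is the mean position of the neighbors excluding the agent itself and that positions are coordinate tuples; the names `neighborhood_center` and `satisfies_aggregation` are hypothetical and are not taken from Beaver and Malikopoulos (2020).

```python
import math
from typing import Dict, List, Sequence


def neighborhood_center(i: int, positions: Dict[int, Sequence[float]],
                        neighbors: Dict[int, List[int]]) -> List[float]:
    """Mean position of agent i's neighbors, excluding agent i itself."""
    others = [j for j in neighbors[i] if j != i]
    if not others:
        # The constraint is only stated for |N_i(t)| > 1.
        raise ValueError("agent has no neighbors other than itself")
    dim = len(positions[i])
    return [sum(positions[j][k] for j in others) / len(others)
            for k in range(dim)]


def satisfies_aggregation(i: int, positions: Dict[int, Sequence[float]],
                          neighbors: Dict[int, List[int]], D: float) -> bool:
    """True if agent i is within distance D of its neighborhood center."""
    center = neighborhood_center(i, positions, neighbors)
    return math.dist(positions[i], center) <= D
```

In a planning setting, the same inequality would be imposed along the entire planned trajectory rather than checked at a single instant; the pointwise check above only shows how the left-hand side of the constraint is evaluated.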
6. Line Flocking
In this section, we review literature related to line flocking, a naturally occurring phenomenon commonly found in large birds (such as geese) that travel in vee, jay, and echelon formations over long distances. It has long been understood that saving energy is a significant benefit of flying in such formations; see Cutts and Speakman (1994); Mirzaeinia et al. (2020). In aerial systems, the main energy savings come from upwash, i.e., trailing regions of upward momentum that can be exploited by birds to induce lift and expend less energy. This is illustrated in Fig. 5. Similar benefits have been found in terrestrial and underwater vehicles, where a leader may create a low-pressure wake and reduce the overall drag force imposed on the following vehicles.

Figure 5: The lead agent induces upwash and downwash in its wake due to its trailing wing vortices, and the following agents exploit the upwash to induce lift and reduce energy consumption.

In this context, the most straightforward method to achieve line flocking is to generate an optimal set of formation points based on the drag, wake, and upwash characteristics of each agent. This effectively transforms the line flocking problem into a formation reconfiguration problem, where each agent must assign itself to a unique goal and reach it within some fixed time, as is the case in Nathan and Barbosa (2008). However, a formation reconfiguration approach generally requires the formation to be computed offline and does not necessarily consider differences between individual agents (e.g., age, weight, size, and efficiency) or environmental effects. Although formation reconfiguration algorithms have rich supporting literature, they are beyond the scope of this paper. For recent reviews of formation control, see Oh et al. (2015, 2017).

Another approach to line flocking is modeling the aerodynamic and hydrodynamic interactions between agents so that they may dynamically position themselves without a predefined formation.
This approach was proposed by Bedruz et al. (2019b), who applied computational fluid dynamics simulations to wheeled mobile robots in order to determine the optimal drafting distances between them. This was extended in Bedruz et al. (2019a), where the authors proposed a fuzzy logic controller to maximize the effect of drafting. The authors validated their controllers in simulation and experiments with wheeled differential drive robots.

An early approach to capture line flocking behavior in a robotic system with model predictive control was explored by Yang et al. (2016). In this work, the authors attempted to maximize velocity matching and upwash benefits for each agent $i \in \mathcal{A}$ while minimizing the field of view occluded by leading agents. The resulting line flocking behavior was demonstrated in simulation, where a vee formation consistently emerged independently of the flock's initial conditions.

As a next step toward optimal line flocking, an analysis of the effect of upwash on energy consumption in fixed-wing UAVs was presented by Mirzaeinia et al. (2019). The authors found that the front and tail agents in a vee formation have the highest rate of energy consumption in the flock. This implies that the lead or tail agents become the limiting factor in the total distance traveled by the flock. The authors proposed a load balancing algorithm based on a root-selection protocol, where the highest-energy agents replace the lead and tail agents periodically. The authors then demonstrated, in simulation, that periodic replacement of the lead and tail agents significantly increases the total travel distance of the flock.

A final facet of line flocking is the effect of environmental disturbances, such as turbulence and currents. Energy-optimal flocking in the presence of strong background flows was investigated by Song et al. (2017). In this approach, the authors derived an energy-optimal reference trajectory, $x_r(t)$, for the flock centroid to track.
To generate this trajectory, the authors approximated the flock as a point mass at the centroid and sought to minimize its energy consumption in the presence of a background flow, $U(p, t)$, where $p$ denotes a position. The normalized rate of power consumption of an agent was given by

$$P(t) = \frac{\| v_r(t) - U(p_r(t), t) \|}{v}, \tag{25}$$

and the authors sought to solve the optimal control problem

$$\min_{t_0,\, t_f,\, p_r(t)} \int_{t_0}^{t_f} P(t)\, dt$$

subject to:

$$p_r(t_0) = p_0, \quad p_r(t_f) = p_f, \quad \|v_r(t)\| \le v_{\max}, \quad t_{\min} \le t_0 < t_f \le t_{\max}.$$

The authors showed that the background flows would dominate the energy consumption of the agents, and therefore a tight cluster of agents would closely approximate the energy-optimal trajectory traced out by the center of the flock.

7. Pareto Front Selection
An essential consideration in multi-objective optimal control is the tradeoff between the individual objectives. This can be observed, for example, in the tradeoff between neighborhood centering and velocity alignment in Reynolds flocking. This tradeoff can be explored by finding Pareto-efficient outcomes. An outcome is Pareto-efficient if no individual term in the cost function can be improved without worsening at least one other term; see Malikopoulos et al. (2015). The set of all Pareto-efficient outcomes is called the Pareto frontier. After establishing a Pareto frontier, the most desirable outcome can be selected as the Pareto-optimal control policy; see Malikopoulos (2016). Generally, Pareto optimality has not been applied to multi-objective flocking problems. Instead, authors have used various evolutionary algorithms that generate families of optimal solutions; see Virágh et al. (2016); Vásárhelyi et al. (2018).

Hauert et al. (2011) examined the impact of design parameters on flocking performance for a group of UAVs. The authors noted that, due to hardware limitations, designers must weigh the cost of enhanced communication range against that of increasing the maximum turning rate for each agent. The authors explored this tradeoff by exhaustively exploring the design space and calculating the resulting heading angle (velocity alignment) and relative drift (flock centering) error. Using extensive simulation data, the authors constructed the Pareto frontier of optimal design choices. Finally, to validate their analysis, the authors conducted four outdoor experiments using 10 UAVs.

Pareto frontier generation was explicitly discussed in terms of control by Kesireddy et al. (2019), who noted that almost all optimal flocking algorithms apply arbitrary weights to the components of a multi-objective flocking problem. The authors presented three cooperative evolutionary algorithms that are used to generate a Pareto frontier, yielding a family of control policies that are Pareto-efficient with respect to Reynolds flocking rules. This type of analysis provides a useful tool to find the optimal tradeoff in different cluster flocking applications.

Recent work by Zheng et al. (2020) examines the tradeoff between flocking performance and privacy. The authors describe a system that follows Reynolds flocking rules guided by a leader robot.
The system is observed by a discriminator attempting to determine which agent is the leader. The authors propose a genetic algorithm that co-optimizes the flocking controller parameters and the discrimination function. The paper presents a measure of the resulting flocking performance and leader detectability to find a set of optimal control parameters for several kinds of leader trajectories.
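To make the notion of a Pareto frontier concrete, the following minimal sketch filters a finite list of candidate outcomes down to the Pareto-efficient ones. It assumes every objective is a cost to be minimized and that each outcome is a tuple of cost terms (for instance, a hypothetical pair of heading error and drift error). The brute-force comparison is chosen for clarity and is not the method used in any of the works cited above; the names `dominates` and `pareto_frontier` are illustrative.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every cost term and strictly
    better in at least one (all terms are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_frontier(outcomes: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Return the outcomes that no other outcome dominates.

    Brute-force O(n^2) pairwise check, adequate for small design sweeps.
    """
    return [a for a in outcomes
            if not any(dominates(b, a) for b in outcomes)]
```

Selecting a single operating point from the returned frontier still requires a designer's preference among the objectives; the filter only removes outcomes that are worse in every respect.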
8. Considerations for Physical Swarms
As the number of agents in a flock increases, the amount of inter-agent communication required may become a significant energy and performance bottleneck. This has motivated several approaches to minimize the cyberphysical costs incurred by each agent by either reducing the amount of communication required, explicitly including communication cost in an agent's objective function, or explicitly breaking communication links with a subset of neighboring agents. In the following subsections, we explore these approaches and discuss their potential value to optimal flocking.
The cost of communication was explicitly included by Li et al. (2013) as part of a holistic cyber-physical approach. To account for environmental and inter-agent communication disturbances, the authors calculated the probability of communication errors as a function of physical antenna properties. Based on the collision avoidance constraint and maximum communication distance, each agent $i \in \mathcal{A}$ determines a minimum and maximum distance to every neighbor $j \in \mathcal{N}_i(t)$. Agent $i$ then selects the optimal separating distance within these bounds to minimize a combination of communication error and a crowding penalty. The authors propose an adaptive controller to find the optimal separating distance and extend the analysis to include both near and far-field communication in Li et al. (2017).

A control method for preserving agent connectivity while minimizing the number of neighbors was presented in Zavlanos et al. (2009). In this formulation, agents receive a number of communication hops from their neighbors that they use to estimate the communication graph diameter. Each agent uses this information to remove and preserve particular communication links within their neighborhood. Graph topology was explicitly linked with antenna power in Dolev et al. (2010), where the agents sought to minimize communication power while guaranteeing a minimum global graph diameter. This work was extended in Dolev et al. (2013), where agents applied a gossip algorithm to achieve global information about the system trajectory. Although these are not directly applicable to swarm systems, a similar approach may be beneficial to ensure that all agents satisfy Definition 2 while minimizing communication costs between agents. Communication hop approaches have not been applied to flocking. However, they have been successfully used in more centralized and structured swarm problems, particularly pattern formation; see Rubenstein et al. (2012); Wang et al. (2020). Finally, Chen et al.
(2012) sought the minimum possible communication distance to guarantee convergence to velocity consensus for agents under the Vicsek model. The authors showed that, if the agents' positions and orientations were randomly and uniformly distributed in $[0, 1]^2 \times [-\pi, \pi]$, the minimum possible communication distance was $\sqrt{\frac{\log N}{\pi N}}$. This provides a lower bound on communication energy cost for the flock.

Camperi et al. (2012) studied the stability of a flock when noise and external perturbations are introduced. The authors sought to optimize the response of a large swarm of Vicsek agents by changing the neighborhood topology. The authors note that, as Ballerini et al. (2008) found, a $k$-nearest or Voronoi neighborhood topology leads to more stable flocking while reducing the number of neighbors of each agent as compared to a distance-based neighborhood. This has significant implications in how the selection of a neighborhood topology may affect the energy cost of communication.

Zhou and Li (2017) proposed to minimize the communication and computational cost of generating optimal trajectories by screening out neighbors that do not negatively impact the objective function of agent $i \in \mathcal{A}$. The authors applied MPC to a discrete-time flocking system with the $\alpha$-lattice objective (5) and a control penalty term (12). In this case, given a desired distance $d > 0$, agent $i$ constructed the screened neighbor sets

$$S_i^{1}(t) = \{ j \in \mathcal{N}_i(t) \setminus \{i\} : \|s_{ij}(t)\| > d, \; s_{ij}(t) \cdot \dot{s}_{ij}(t) \ge 0 \}, \tag{26}$$

$$S_i^{2}(t) = \{ j \in \mathcal{N}_i(t) \setminus \{i\} : \|s_{ij}(t)\| < d, \; s_{ij}(t) \cdot \dot{s}_{ij}(t) \le 0 \}, \tag{27}$$

where $S_i^{1}(t)$ consists of neighbors further than $d$ and moving away, and $S_i^{2}(t)$ consists of neighbors closer than $d$ and moving closer. Thus, agent $i$ must only consider $j \in S_i^{1}(t) \cup S_i^{2}(t)$ when planning.

Another approach toward reducing communication and computational cost is to perform sparse planning updates by employing event-triggered control. Sun et al. (2019) proposed an update rule for flocking with time delays for systems using a potential field control law (6). A continuously differentiable and bounded function $\tau(t)$ acts as a time delay on all position measurements. The authors let the portion of the control input that achieves velocity consensus for agent $i \in \mathcal{A}$ be constant over a half-open time interval. Then the authors proposed an error function that the agent uses to update the potential field portion of its controller. This update occurs whenever the error magnitude exceeds a threshold. However, the proposed threshold requires global knowledge of the average agent velocity, communication graph Laplacian, and a Lipschitz bound on the agent dynamics. The authors proved that, under this event-triggered scheme, the agents converge to steady-state flocking behavior and the system is free of Zeno, i.e., chattering, behavior. This is a promising result for reducing the computational burden on agents, and the development of a decentralized triggering function is a promising area of research.

8.2. Flocking as a Strategy

As we begin to deploy robotic swarm systems in situ, it is crucial to consider when a particular flocking behavior is an optimal strategy for a swarm. To the best of our knowledge, determining when cluster flocking is an optimal strategy has not been explored in the literature.
Instead, cluster flocking is generally proposed as either a convenient method of aggregate motion or as the result of optimizing particular types of tracking problems, e.g., reference state tracking. As engineering systems become more complex, addressing the question of when cluster flocking is an optimal team strategy will become necessary to achieve long-term swarm operation. Line flocking as a strategy has been explored as a tradeoff between the energy savings of flocking and the energy cost of rerouting to join a flock. Significant research effort has gone toward the rendezvous problem, that is, given a set of agents with distinct origins and destinations, when is it optimal for agents to expend energy in order to form an energy-saving flock. This has primarily been explored through the lens of air traffic management, where commercial aircraft may rendezvous to form flocks between origin and destination airports given a takeoff and landing window.

A centralized approach to the rendezvous problem was presented in Ribichini and Frazzoli (2003), where the authors proved several properties of energy-optimal rendezvous for two agents. The two-agent case was further explored in Rao and Kabamba (2006) for minimum-time graph traversal. The effect of wind and environmental factors was explored in Marks et al. (2018), where the authors used historical traffic and environmental data to show a 5–7% increase in fuel economy resulting from coordination. A flocking protocol for selfish agents was presented in Azoulay and Reches (2019), and the air traffic routing problem has been extensively explored in Kent (2015) and Verhagen (2015).

Flocking as a strategy was also explored in the context of passenger vehicle eco-routing by Fredette (2017).
In this approach, the author adapted Reynolds flocking rules to a two-lane highway with the objective of minimizing vehicle fuel consumption while maintaining a desired velocity subject to the physical parameters describing each vehicle. This resulted in each vehicle approaching its desired speed while dynamically forming and exiting flocks under a centralized control scheme.
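Returning to the neighbor screening rule of Zhou and Li (2017) in (26) and (27), the two screened sets can be computed from relative positions and velocities alone. The sketch below assumes $s_{ij} = p_j - p_i$ and $\dot{s}_{ij} = v_j - v_i$ (the sign of $s_{ij} \cdot \dot{s}_{ij}$ does not depend on this convention) and uses the hypothetical function name `screened_neighbors`; it illustrates the set definitions only, not the authors' MPC formulation.

```python
import math
from typing import Dict, List, Sequence, Tuple


def screened_neighbors(i: int,
                       positions: Dict[int, Sequence[float]],
                       velocities: Dict[int, Sequence[float]],
                       neighbors: Dict[int, List[int]],
                       d: float) -> Tuple[List[int], List[int]]:
    """Split agent i's neighbors into the two screened sets.

    Returns (far_and_receding, near_and_approaching):
      - far_and_receding: farther than d and not closing, as in Eq. (26)
      - near_and_approaching: closer than d and not separating, as in Eq. (27)
    Neighbors in neither list are screened out of agent i's planning problem.
    """
    far_and_receding: List[int] = []
    near_and_approaching: List[int] = []
    for j in neighbors[i]:
        if j == i:
            continue
        s = [pj - pi for pi, pj in zip(positions[i], positions[j])]
        s_dot = [vj - vi for vi, vj in zip(velocities[i], velocities[j])]
        distance = math.sqrt(sum(c * c for c in s))
        # s . s_dot is half the time derivative of the squared distance.
        rate = sum(a * b for a, b in zip(s, s_dot))
        if distance > d and rate >= 0:
            far_and_receding.append(j)
        elif distance < d and rate <= 0:
            near_and_approaching.append(j)
    return far_and_receding, near_and_approaching
```

A neighbor that is too far but already approaching, or too close but already separating, is moving toward the desired distance $d$ on its own, which is the intuition for excluding it from the planning problem.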
9. Outlook and Research Directions
In the past twenty years, a rich literature on the control of flocking systems has been produced. Control algorithms that implement variants of Reynolds rules have been shown to provide rigorous guarantees on their steady-state behavior. Recently, control algorithms that optimally implement these rules have been demonstrated in simulation and large-scale outdoor flight tests. Flocking, as defined by Reynolds, will seemingly be driven by advances in decentralized control, robust control, and long-duration autonomy in the future. However, researchers in some application areas, such as mobile sensor networks, have criticized Reynolds flocking as a novelty that does not necessarily have advantages in terms of performance or ease of implementation; see Albert and Imsland (2018).

Therefore, we think that a new paradigm for viewing the nature of flocking is necessary. As we demonstrated, there is a distinction in the natural world between cluster and line flocking. We wish to strengthen this distinction, and to that end, we propose a partition of the literature into line and cluster flocking. We have also presented several types of cluster flocking, defined by the system-level objective, that have been conflated using the nebulous term "flocking" throughout the literature. In fact, we see no compelling reason why a controller based on potential fields or $\alpha$-lattices ought to capture Reynolds flocking, reference state flocking, or ergodic flocking. Due to the nature of engineering systems, new types of cluster flocking have already emerged that have no natural counterpart. For this reason, we believe that precisely classifying and differentiating between these types of flocking will be essential to advancing the research frontier of flocking as a desirable emergent behavior.

Furthermore, we think that constraint-driven optimal control should be the "natural language" to formulate flocking and other emergence problems.
Under this design paradigm, it is possible to achieve rigorous guarantees on the safety and tasks imposed on agents as they travel along energy-minimizing surfaces. There has already been some initial exploration into Reynolds flocking, e.g., see Ibuki et al. (2020), and systems with limited communication range under disk flocking; see Beaver and Malikopoulos (2020). These approaches have also shown a capacity for generating emergence in relatively simple multi-agent systems, e.g., see Notomista and Egerstedt (2019), and the imposed constraints provide guarantees on agent behavior to neighbors and the system designer. Moving forward, we expect that by applying similar solution methods to those used in the past, e.g., see Jadbabaie et al. (2003); Tanner et al. (2007), we may provide guarantees on the behavior of many types of cluster flocking agents.

Finally, including heterogeneity in cluster and line flocking will be essential as we roll out optimal flocking control algorithms to physical systems, where it is impossible for any two robots to have identical physical properties and performance capabilities. Heterogeneity of agent properties is particularly important in the line flocking literature, where the variable size, wingspan, metabolism, and age of flock members significantly affect the system's overall energy savings; see Mirzaeinia et al. (2020). Prorok et al. (2017) have also shown that, for general swarm systems, an increase in agent diversity will expand the feasible solution space for each agent's control action. This may be beneficial in terms of system robustness, especially for applications related to emerging transportation systems; however, it may also increase the difficulty of finding an optimal solution. By explicitly including heterogeneity in a flocking system, it is possible to generate a larger space of possible emergent behavior.
Future flocking research ought to consider diversity in agent properties and behaviors to exploit the full benefits of swarm intelligence.

Acknowledgement
The authors would like to thank Bert Tanner for the insightful remarks and suggestions.
References
Albert, A., Imsland, L., 2018. Survey: mobile sensor networks for target searching and tracking. Cyber-Physical Systems 4, 57–98.
Atrianfar, H., Haeri, M., 2013. Flocking of multi-agent dynamic systems with virtual leader having the reduced number of informed agents. Transactions of the Institute of Measurement and Control 35, 1104–1115.
Azoulay, R., Reches, S., 2019. UAV flocks forming for crowded flight environments, in: ICAART 2019 - Proceedings of the 11th International Conference on Agents and Artificial Intelligence, SciTePress. pp. 154–163.
Bajec, I.L., Heppner, F.H., 2009. Organized flight in birds. Animal Behaviour 78, 777–789.
Ballerini, M., Cabibbo, N., Candelier, R., Cavagna, A., Cisbani, E., Giardina, I., Lecomte, V., Orlandi, A., Parisi, G., Procaccini, A., Viale, M., Zdravkovic, V., 2008. Interaction ruling animal collective behavior depends on topological rather than metric distance: Evidence from a field study. Proceedings of the National Academy of Sciences of the United States of America 105, 1232–1237.
Barve, A., Nene, M.J., 2013. Survey of Flocking Algorithms in Multi-agent Systems. International Journal of Computer Science 19, 110–117.
Bayindir, L., 2016. A review of swarm robotics tasks. Neurocomputing 172, 292–321.
Beaver, L.E., Chalaki, B., Mahbub, A.M., Zhao, L., Zayas, R., Malikopoulos, A.A., 2020a. Demonstration of a Time-Efficient Mobility System Using a Scaled Smart City. Vehicle System Dynamics 58, 787–804.
Beaver, L.E., Dorothy, M., Kroninger, C., Malikopoulos, A.A., 2020b. Energy-Optimal Motion Planning for Agents: Barycentric Motion and Collision Avoidance Constraints, in: arXiv:2009.00588.
Beaver, L.E., Kroninger, C., Malikopoulos, A.A., 2020c. An Optimal Control Approach to Flocking, in: 2020 American Control Conference, pp. 683–688.
Beaver, L.E., Malikopoulos, A.A., 2020. Beyond Reynolds: A Constraint-Driven Approach to Cluster Flocking, in: IEEE 59th Conference on Decision and Control (to appear).
Bedruz, R.A., Bandala, A.A., Vicerra, R.R., Conception, R., Dadios, E., 2019a. Design of a Robot Controller for Peloton Formation using Fuzzy Logic, in: 7th Conference on Robot Intelligence Technology and Applications, pp. 83–88.
Bedruz, R.A.R., Maningo, J.M.Z., Fernando, A.H., Bandala, A.A., Vicerra, R.R.P., Dadios, E.P., 2019b. Dynamic Peloton Formation Configuration Algorithm of Swarm Robots for Aerodynamic Effects Optimization, in: Proceedings of the 7th International Conference on Robot Intelligence Technology and Applications, pp. 264–267.
Camperi, M., Cavagna, A., Giardina, I., Parisi, G., Silvestri, E., 2012. Spatially balanced topological interaction grants optimal cohesion in flocking models. Interface Focus 2, 715–725.
Celikkanat, H., 2008. Optimization of self-organized flocking of a robot swarm via evolutionary strategies, in: 23rd International Symposium on Computer and Information Sciences, IEEE. pp. 1–4.
Chen, G., Liu, Z., Guo, L., 2012. The smallest possible interaction radius for flock synchronization. SIAM Journal on Control and Optimization 50, 1950–1970.
Cucker, F., Smale, S., 2007. Emergent behavior in flocks. IEEE Transactions on Automatic Control 52, 852–862.
Cutts, C.J., Speakman, J.R., 1994. Energy Savings in Formation Flight of Pink-Footed Geese. J. exp. Biol 189, 251–261.
Dave, A., Malikopoulos, A.A., 2019. Decentralized Stochastic Control in Partially Nested Information Structures, in: IFAC-PapersOnLine, Chicago, IL, USA.
Dave, A., Malikopoulos, A.A., 2020. Structural results for decentralized stochastic control with a word-of-mouth communication, in: 2020 American Control Conference (ACC), IEEE. pp. 2796–2801.
Dolev, S., Segal, M., Shpungin, H., 2010. Bounded-Hop Strong Connectivity for Flocking Swarms, in: WiOpt'10: Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks, pp. 269–277.
Dolev, S., Segal, M., Shpungin, H., 2013. Bounded-hop energy-efficient liveness of flocking swarms. IEEE Transactions on Mobile Computing 12, 516–528.
Duan, H., Qiao, P., 2014. Pigeon-inspired optimization: A new swarm intelligence optimizer for air robot path planning. International Journal of Intelligent Computing and Cybernetics 7, 24–37.
Egerstedt, M., Pauli, J.N., Notomista, G., Hutchinson, S., 2018. Robot ecology: Constraint-based control design for long duration autonomy. Annual Reviews in Control 46, 1–7.
Ferrari, S., Foderaro, G., Zhu, P., Wettergren, T.A., 2016. Distributed Optimal Control of Multiscale Dynamical Systems: A Tutorial. IEEE Control Systems 36, 102–116.
Fine, B.T., Shell, D.A., 2013. Unifying microscopic flocking motion models for virtual, robotic, and biological flock members. Autonomous Robots 35, 195–219.
Fredette, D., 2017. Fuel-Saving behavior for Multi-Vehicle Systems: Analysis, Modeling, and Control. Ph.D. thesis. The Ohio State University.
Hauert, S., Leven, S., Varga, M., Ruini, F., Cangelosi, A., Zufferey, J.C., Floreano, D., 2011. Reynolds flocking in reality with fixed-wing robots: Communication range vs. maximum turning rate, in: 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 5015–5020.
Hayes, A.T., Dormiani-Tabatabaei, P., 2002. Self-Organized Flocking with Agent Failure: Off-Line Optimization and Demonstration with Real Robots, in: IEEE International Conference on Robotics and Automation, pp. 3900–3905.
Hung, S.M., Givigi, S.N., 2017. A Q-Learning Approach to Flocking with UAVs in a Stochastic Environment. IEEE Transactions on Cybernetics 47, 186–197.
Ibuki, T., Wilson, S., Yamauchi, J., Fujita, M., Egerstedt, M., 2020. Optimization-Based Distributed Flocking Control for Multiple Rigid Bodies. IEEE Robotics and Automation Letters 5, 1891–1898.
Jadbabaie, A., Lin, J., Morse, A.S., 2003. Mobile Autonomous Agents Using Nearest Neighbor Rules. IEEE Transactions on Automatic Control 48, 988–1001.
Jafari, M., Xu, H., Carrillo, L.R.G., 2020. A biologically-inspired reinforcement learning based intelligent distributed flocking control for Multi-Agent Systems in presence of uncertain system and dynamic environment. IFAC Journal of Systems and Control 13, 100096.
Jang, K., Vinitsky, E., Chalaki, B., Remer, B., Beaver, L., Malikopoulos, A.A., Bayen, A., 2019. Simulation to scaled city: zero-shot policy transfer for traffic control via autonomous vehicles, in: Proceedings of the 10th ACM/IEEE International Conference on Cyber-Physical Systems, pp. 291–300.
Kennedy, J., Eberhart, R., 1995. Particle Swarm Optimization, in: International Conference on Neural Networks, pp. 1942–1948.
Kent, T.E., 2015. Optimal Routing and Assignment for Commercial Formation Flight. Ph.D. thesis. University of Bristol.
Kesireddy, A., Shan, W., Xu, H., 2019. Global Optimal Path Planning for Multi-agent Flocking: A Multi-Objective Optimization Approach with NSGA-III, in: Proceedings of the 2019 IEEE Symposium Series on Computational Intelligence, pp. 64–71.
Khodayari, E., Sattari-Naeini, V., Mirhosseini, M., 2016. Flocking Control with Single-COM for Tracking a Moving Target in Mobile Sensor Network Using Gravitational Search Algorithm, in: Proceedings of the 1st Conference on Swarm Intelligence and Evolutionary Computation, pp. 125–130.
Koren, Y., Borenstein, J., 1991. Potential Field Methods and their Inherent Limitations for Mobile Robot Navigation, in: Proceedings of the 1991 IEEE International Conference on Robotics and Automation.
La, H.M., Lim, R., Sheng, W., 2015. Multirobot cooperative learning for predator avoidance. IEEE Transactions on Control Systems Technology 23, 52–63.
La, H.M., Nguyen, T.H., Nguyen, C.H., Nguyen, H.N., 2009. Optimal Flocking Control for a Mobile Sensor Network Based on a Moving Target Tracking, in: Proceedings of the 2009 IEEE International Conference on Systems, Man, and Cybernetics, IEEE. pp. 4801–4806.
Lee, S.M., Myung, H., 2013. Particle swarm optimization-based distributed control scheme for flocking robots, in: Advances in Intelligent Systems and Computing, Springer Verlag. pp. 517–524.
Li, H., Peng, J., Liu, W., Wang, J., Liu, J., Huang, Z., nd Automation, pp. 3293–3298.
Shannon, C.E., 1948. A Mathematical Theory of Communication. The Bell System Technical Journal 27, 379–423.
Song, Z., Lipinski, D., Mohseni, K., 2017. Multi-vehicle cooperation and nearly fuel-optimal flock guidance in strong background flows. Ocean Engineering 141, 388–404.
Sun, F., Wang, R., Zhu, W., Li, Y., 2019. Flocking in nonlinear multi-agent systems with time-varying delay via event-triggered control. Applied Mathematics and Computation 350, 66–77.
Tanner, H.G., Jadbabaie, A., Pappas, G.J., 2007. Flocking in fixed and switching networks. IEEE Transactions on Automatic Control 52, 863–868.
Vadakkepat, P., Tan, K.C., Ming-Liang, W., 2000. Evolutionary Artificial Potential Fields and Their Application in Real Time Robot Path Planning, in: 2000 Congress on Evolutionary Computation, IEEE. pp. 256–263.
Vásárhelyi, G., Virágh, C., Somorjai, G., Nepusz, T., Eiben, A.E., Vicsek, T., 2018. Optimized flocking of autonomous drones in confined environments. Science Robotics 3.
Vatankhah, R., Etemadi, S., Honarvar, M., Alasty, A., Boroushaki, M., Vossoughi, G., 2009. Online velocity optimization of robotic swarm flocking using particle swarm optimization (PSO) method, in: 2009 6th International Symposium on Mechatronics and its Applications, pp. 1–6.
Veitch, C., Render, D., Aravind, A., 2019. Ergodic Flocking, in: Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 6957–6962.
Verhagen, C.M.A., 2015. Formation flight in civil aviation: Development of a decentralized approach to formation flight routing. Ph.D. thesis. Delft University of Technology.
Vicsek, T., Czirok, A., Ben-Jacob, E., Cohen, I., Shochet, O., 1995. Novel Type of Phase Transition in a System of Self-Driven Particles. Physical Review Letters 75, 1226–1229.
Virágh, C., Nagy, M., Gershenson, C., Vásárhelyi, G., 2016. Self-organized UAV Traffic in Realistic Environments, in: 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 1645–1652.
Wang, C., Wang, J., Zhang, X., 2018. A deep reinforcement learning approach to flocking and navigation of UAVs in large-scale complex environments, in: 2018 IEEE Global Conference on Signal and Information Processing, pp. 1228–1232.
Wang, H., Rubenstein, M., 2020. Shape Formation in Homogeneous Swarms Using Local Task Swapping. IEEE Transactions on Robotics 36, 597–612.
Wilson, S., Glotfelter, P., Wang, L., Mayya, S., Notomista, G., Mote, M., Egerstedt, M., 2020. The Robotarium: Globally Impactful Opportunities, Challenges, and Lessons Learned in Remote-Access, Distributed Control of Multirobot Systems. IEEE Control Systems 40, 26–44.
Xu, H., Carrillo, L.R.G., 2015. Distributed Near Optimal Flocking Control for Multiple Unmanned Aircraft Systems, in: Proceedings of the 2015 International Conference on Unmanned Aircraft Systems, pp. 879–885.
Xu, H., Carrillo, L.R.G., 2017. Fast reinforcement learning based distributed optimal flocking control and network co-design for uncertain networked multi-UAV system, in: Unmanned Systems Technology XIX, SPIE. p. 1019511.
Yang, J., Grosu, R., Smolka, S.A., Tiwari, A., 2016. Love thy neighbor: V-formation as a problem of model predictive control, in: 27th International Conference on Concurrency Theory, pp. 4:1–4:5.
Yuan, Q., Zhan, J., Li, X., 2017. Outdoor flocking of quadcopter drones with decentralized model predictive control. ISA Transactions 71, 84–92.
Zavlanos, M.M., Tanner, H.G., Jadbabaie, A., Pappas, G.J., 2009. Hybrid control for connectivity preserving flocking. IEEE Transactions on Automatic Control 54, 2869–2875.
Zhan, J., Li, X., 2011a. Decentralized Flocking Protocol of Multi-agent Systems with Predictive Mechanisms, in: Proceedings of the 30th Chinese Control Conference, pp. 5995–6000.
Zhan, J., Li, X., 2011b. Flocking of Discrete-time Multi-Agent Systems with Predictive Mechanisms, in: 18th IFAC World Congress, pp. 5669–5674.
Zhan, J., Li, X., 2013. Flocking of multi-agent systems via model predictive control based on position-only measurements. IEEE Transactions on Industrial Informatics 9, 377–385.
Zhang, H.T., Chen, M.Z., Stan, G.B., Zhou, T., Maciejowski, J.M., 2008. Collective behavior coordination with predictive mechanisms. IEEE Circuits and Systems Magazine 8, 67–85.
Zhang, H.T., Cheng, Z., Chen, G., Li, C., 2015. Model predictive flocking control for second-order multi-agent systems with input constraints. IEEE Transactions on Circuits and Systems I: Regular Papers 62, 1599–1606.
Zhang, H.T., Liu, B., Cheng, Z., Chen, G., 2016. Model Predictive Flocking Control of the Cucker-Smale Multi-Agent Model with Input Constraints. IEEE Transactions on Circuits and Systems I: Regular Papers 63, 1265–1275.
Zheng, H., Panerati, J., Beltrame, G., Prorok, A., 2020. An adversarial approach to private flocking in mobile robot teams. IEEE Robotics and Automation Letters 5, 1009–1016.
Zhou, L., Li, S., 2017. Distributed model predictive control for multi-agent flocking via neighbor screening optimization. International Journal of Robust and Nonlinear Control 27, 1690–1705.
Zhu, B., Xie, L., Han, D., 2016. Recent Developments in Control and Optimization of Swarm Systems: A Brief Survey, in: 12th IEEE International Conference on Control and Automation, pp. 19–24.
Zhu, B., Xie, L., Han, D., Meng, X., Teo, R., 2017. A survey on recent progress in control of swarm systems. Science China Information Sciences 60.