A Framework for Automatic Behavior Generation in Multi-Function Swarms
Sondre A. Engebraaten, Jonas Moen, Oleg A. Yakimenko, Kyrre Glette
University of Oslo, Norway; Norwegian Defence Research Establishment, Kjeller, Norway; Naval Postgraduate School, Monterey, CA, USA
Correspondence: Sondre A. Engebraaten, Instituttveien 20, 2007 Kjeller, Norway. sondre.engebraten@ffi.no
ABSTRACT
Multi-function swarms are swarms that solve multiple tasks at once. For example, a quadcopter swarm could be tasked with exploring an area of interest while simultaneously functioning as ad-hoc relays. With this type of multi-function comes the challenge of handling potentially conflicting requirements simultaneously. Using the Quality-Diversity algorithm MAP-elites in combination with a suitable controller structure, a framework for automatic behavior generation in multi-function swarms is proposed. The framework is tested on a scenario with three simultaneous tasks: exploration, communication network creation and geolocation of Radio Frequency (RF) emitters. A repertoire is evolved, consisting of a wide range of controllers, or behavior primitives, with different characteristics and trade-offs in the different tasks. This repertoire would enable the swarm to transition between behavior trade-offs online, according to the situational requirements. Furthermore, the effect of noise on the behavior characteristics in MAP-elites is investigated. A moderate number of re-evaluations is found to increase the robustness while keeping the computational requirements relatively low. A few selected controllers are examined, and the dynamics of transitioning between these controllers are explored. Finally, the study develops a methodology for analyzing the makeup of the resulting controllers. This is done through a parameter variation study where the importance of individual inputs to the swarm controllers is assessed and analyzed.
Typical applications for swarms are tasks that are either too big or too complex for single agents to do well. Although it might be possible to imagine a single large and complex agent that is able to solve these tasks, this is often undesirable due to system complexity or cost. Trying to solve several tasks with optimal performance also adds to the complexity (Bayındır [2016]; Brambilla et al. [2013]), as each task may place its own requirements or demands on the system. The requirements for being a good long-distance runner are not the same as for being a good sprinter. Similarly, in swarms, the requirements for being good at exploring an area are not the same as for maintaining a communication infrastructure. However, an operator that requires capacity and performance in both tasks has limited options. One way of tackling this challenge could be to launch two swarms, giving each swarm a task and operating them independently. This adds complexity to the operation and doubles the system cost. Another option is to develop a concept for a multi-function swarm.
(Axes: exploration vs. network coverage.)
Figure 1.
Evolving repertoires of swarm behaviors allows a user to adapt the behavior of the swarm by simply selecting a new behavior from the repertoire. The upper figure shows part of a repertoire where a few selected controllers are highlighted.

A multi-function swarm, or a swarm that seeks to solve multiple tasks simultaneously, is a novel concept in swarm research. This is different from multi-task assignment (Brutschy et al. [2014]; Meng and Gan [2008]; Berman et al. [2009]; Jevtic et al. [2011]) in that each agent contributes to all tasks at the same time. It is also different from multi-modal behaviors (Schrum and Miikkulainen [2014, 2012, 2015]), or behaviors that solve tasks requiring multiple actions in a sequence (Brutschy et al. [2014]; Meng and Gan [2008]; Berman et al. [2009]; Jevtic et al. [2011]). A multi-function swarm tackles multiple tasks at once while retaining some performance on all the tasks simultaneously. Figure 1 shows example swarm behaviors selected from a repertoire, with increasing performance in the networking application from left to right.

Most swarm behaviors are defined bottom-up, where the agent-to-agent interactions are specified. For the operator or the user, the desired behavior is commonly on a macroscopic level, considering the swarm as a whole. Deducing the required low-level rules in order to achieve a specific high-level behavior is a non-trivial problem, and subject to research (Jones et al. [2018]; Francesca et al. [2014]). Through the use of evolution, this paper seeks to tackle the problem of top-down automated swarm behavior generation. Previous works show how it is possible to have robots that adapt like animals by using a repertoire of behaviors generated offline (Cully et al. [2015]). Similar adaptation techniques could be incorporated into all robotic systems, enabling recovery from crippling failures or simply a change in the goals of the operator (Engebråten et al. [2018b]).
A key element of this is that adaptation must happen live, or at least in a semi-live timeframe. Evolving these behaviors online would be the optimal solution, but limited compute and real-time constraints make this infeasible.

Instead of deriving new behaviors on the fly, a swarm could be based on behavior primitives. A behavior primitive is a simple pre-computed behavior that solves some task or sub-task. In this paper, each evolved behavior is considered a behavior primitive. Each behavior represents a full solution to the multi-function behavior optimization problem; however, the behaviors differ in their trade-offs between the applications. A swarm could easily contain many of these behavior primitives. This would allow the swarm to update its behavior on the fly, as the circumstances or the requirements given by the operator change. This negates the need for a full online optimization of behaviors.
A good set of behavior primitives that are easily understandable and solve common tasks goes a long way towards reducing the need for human oversight (Cummings [2015]). A common problem with scaling swarms or multi-agent systems is that the number of humans required to operate the system scales linearly with the number of agents or platforms (Wang et al. [2009]; Cummings [2015]). This prevents the swarm from being the capability multiplier it should be. Allowing the operator to choose from a set of predefined high-level behavior primitives stored onboard would greatly reduce the need for micromanagement and might break the linear relation between the number of agents and operators. By using a Quality-Diversity method (Pugh et al. [2016]; Cully and Demiris [2017]) for optimization, it is possible to use the repertoire itself to glean some insights into the performance of individual behaviors.

In this paper the Quality-Diversity method MAP-elites (Mouret and Clune [2015]) is employed. MAP-elites is used to explore the search space of possible swarming behaviors. This is done to give the operator the greatest possible range of choices when adapting the system to their needs. In combination with a swarm behavior framework based on physical forces (Engebråten et al. [2018b]), this provides a robust and expandable system with the level of abstraction required for transfer to real drones. Three tasks are explored: area surveillance, communication network creation and geolocation of RF emitters. Each task induces new requirements on the swarm behavior. For instance, covering an area in the surveillance task requires agents to be on the move. Coverage in the communication network application increases if agents are stationary, keeping a fixed distance to each other.
It is the combination of these requirements that makes this a challenging swarm use-case.

By using a well-known RF geolocation technique based on Power Difference of Arrival (PDOA), it is possible to estimate the location of an unknown, uncooperative RF emitter (Engebråten [2015]). Previously published work on PDOA (Engebråten et al. [2017]) allows this method to be applied even on energy- and computationally-limited devices. Through extensive simulations the performance of this concurrent multi-function swarm is evaluated, and the relevance of neighbor interactions examined. Finally, it is shown that the behaviors can indeed be used as behavior primitives, i.e. building blocks for more complex sequential behavior or as commands from an operator.

The contributions of this paper are an extensible and rigorous framework for multi-function swarms, incorporating automated behavior generation and methods for analyzing the resulting behaviors. Data mining on the results from the evolutionary methods is essential to fully utilize all the available data, as the behaviors are simply too many to review manually. This is a major extension of previous works (Engebråten et al. [2018b]). Through the use of ablation, or the selective disabling or removal of parts of the controller, the importance of individual sensory inputs is determined. The simulator used is updated to better reflect reality and the experiences from previous real-world tests (Engebråten et al. [2018a]). Through a combination of extensive simulations, new visualizations and a deep analysis of swarm behaviors, insights can be gained which enable more efficient use of limited computational resources in future evolutionary experiments.

Section 2 presents related works. Section 3 presents the methods, framework and simulator used in this study. Section 4 presents the findings and results of the simulations. Section 5 provides thoughts and views on the presented results, and Section 6 concludes the paper.
Controllers for swarms in the literature vary greatly. Some propose using neural networks for control (Trianni et al. [2003]; Dorigo et al. [2004]; Duarte et al. [2016a]), handwritten rules (Krupke et al. [2015]) or even a combination or hybrid controller structure (Duarte et al. [2014]). Common to all of them is that individual robots, or agents, in some way must receive inputs from the environment or other robots. Based on this information each agent decides what to do next. This is the basis for a decentralized swarm system and allows the swarm to be robust against single-point failures.

The controller structure in this work is an extension of artificial potential fields (Krogh [1984]; Kuntze and Schill [1982]; Khatib [1986]). Artificial potential fields were originally a method for avoiding collisions in industrial robot control. Additional research allowed this method to be applied to general collision avoidance in robotics (Vadakkepat et al. [2000]; Park et al. [2001]; Lee and Park [2003]). Further generalization resulted in artificial physics forces; this is known as Physicomimetics (Spears et al. [2004]).
Evolution of controllers is a common way of tackling the challenge of automated behavior generation (Jones et al. [2018]; Francesca et al. [2014]). Evolving a set of sequential behaviors allows agents to tackle multi-modal tasks (Schrum and Miikkulainen [2012, 2014, 2015]). Similarly, evolving behavior trees allows for controllers that can easily be understood by human operators (Jones et al. [2018]). Using evolution offline, only time and available computation power limit the complexity of the problems that can be tackled. Online embodied evolution is more limited in problem complexity, but allows behaviors to evolve in-vivo, in the operating environment itself (Bredeche et al. [2009]; Eiben et al. [2010]). Embodied evolution is a way of allowing robots to learn on the fly, but also of removing the reality gap, as agents are tested in the actual environment they operate in. Combining testing of behaviors in a simulator, such as ARGoS (Pinciroli et al. [2011, 2012]), with some evolution on real robots to fine-tune behaviors improves the performance of evolved behaviors while retaining the benefit and speed of offline evolution (Miglino et al. [1995]).

Evolving a large repertoire of controllers or behaviors before the robot is deployed might allow it to recover from otherwise crippling physical and hardware faults (Mouret and Clune [2015]; Cully et al. [2015]; Cully and Mouret [2016]). Extending this concept, it is also possible to use evolved behaviors to control complex robots, as in EvoRBC (Duarte et al. [2016c]). When evolving controllers it is important to consider the properties of the chosen evolutionary method. In particular, Quality-Diversity methods perform better with direct encodings than with indirect ones (Tarapore et al. [2016]), and might struggle when faced with noisy behavior characteristics or fitness metrics (Justesen et al. [2019]). Challenges with noise in traditional evolutionary optimization have been well documented (Cliff et al. [1993]; Hancock [1994]; Beyer [2000]; Jin and Branke [2005]). However, as MAP-elites is a fairly new method, the effect of noise on it has not been reviewed to the same extent.
The ability to operate not only a single Unmanned Aerial Vehicle (UAV) but multiple UAVs is beneficial (Bayraktar et al. [2004]). Multiple UAVs may offer increased performance through task allocation (How et al. [2004]). A controlled indoor environment allows many swarm concepts to be evaluated without the constraints and uncertainty outdoor tests might bring (Lindsey et al. [2012]; Schuler et al. [2019]; Kushleyev et al. [2013]; Hsieh et al. [2008]; Preiss et al. [2017]). However, finding a way to move swarms out of the labs and into the real world allows for the verification of early bio-inspired swarm behaviors (Reynolds [1987]), and the effect of reduced communication can be investigated (Hauert et al. [2011]). Flocking can also be tested on a larger scale than previously possible (Vásárhelyi et al. [2018]).

Outdoors, the potential applications for swarms are many. Swarms could provide a communication network, as is the case in the SMAVNET project (Hauert et al. [2009, 2010]). Teams or swarms of UAVs may be used to survey large areas (Basilico and Carpin [2015]; Atten et al. [2016]). SWARM-BOT shows how smaller ground-based robots can work together to traverse challenging terrain (Mondada et al. [2004]). Pushing the boundaries of what is possible, scaling a swarm still presents a challenge, but ARSENL shows that a swarm of 50 UAVs is possible in live flight experiments (Chung et al. [2016]). The main challenge in these large experiments, apart from logistics (Mulgaonkar [2012]), is that of communication and maintaining consensus (Davis et al. [2016]). A new frontier for outdoor swarming might be to incorporate heterogeneous platforms with wide sets of different capabilities, further extending the number of applications for the swarm (Dorigo et al. [2013]). Swarms of Unmanned Surface Vehicles (USVs) might also prove valuable in environmental monitoring of vast maritime areas (Duarte et al. [2016a,b]).
The proposed framework uses evolution to automatically create a large set of swarm behaviors. This set of multi-function swarm behaviors is generated based on high-level metrics that measure performance in each application. The core of the framework is the combination of evolutionary methods with a directly encoded physics-based controller, which allows the framework to produce a varied set of swarm behaviors. Three applications were chosen to evaluate the framework: area surveillance, communication network creation and geolocation of RF emitters. The first two were introduced in previous works (Engebråten et al. [2018b]), while the combination with geolocation of RF emitters is new in this paper.

Making a framework for a multi-function swarm requires development and research into controllers for swarm agents, adaptation of existing evolutionary methods to this task, a suitable simulator to test the proposed swarm behaviors, and realistic assumptions about the capabilities of each individual swarm agent. This section will go into additional detail about each of these, starting with the structure of the controllers for each agent.
These experiments employ an event-based particle simulator. Every agent is modeled as a point mass with limits on maximum velocity and acceleration. A modular architecture allows the simulator to be easily expanded with new sensors, platforms or features. Using the Celery framework for distributed computation allows for task-based parallelization. The full source code for the simulator setup used can be found on GitHub (uploaded upon final acceptance).

Each agent is assumed to be equipped with a radio for communication with other agents and the ground, a camera, and a simple Received Signal Strength (RSS) sensor. All of these are small in both size and weight and constitute a feasible sensor package for a UAV. For these experiments the agents are assumed to have a downward-facing camera that can capture the ground below and look for objects of interest. To further simplify the simulation, the simulated environment does not emulate internal/external camera geometry and instead simply assumes that the area of interest can be divided into cells. Each cell is smaller than the area covered by the camera at any given time. If a sufficient number of cells are used, this makes it likely that the entire area is covered.

The communication radio is dual-use and acts as both the interlink for the agents (agent-to-agent communication) and the communication channel to the ground control station or other entities on the ground. In previous live-flight experiments WiFi was used (Engebråten et al. [2018a]). Newer unpublished experiments have employed a mesh radio, which removes the need for a central WiFi router and, as such, furthers the concept of a swarm.

Compared to previous works (Engebråten et al. [2018b]), a more conservative vehicle model is employed. Maximum acceleration was reduced to 1.0 m/s² and maximum velocity was set to 10.0 m/s.
This was based on the results of previous real-flight experiments (Engebråten et al. [2018a]) and is a way to compensate for the slower reaction time of the physical vehicles. Furthermore, it was found that the ranges of the controller parameters determining the behavior of the platform were in many cases too high to be readily employed on real UAVs. This led to oscillating behaviors, where the time delay in the physical system could not keep up with the controller.

A swarm of UAVs might in the future be used to provide real-time visual observations over large areas. On a conceptually high level, such a swarm can be considered a potential replacement for a fixed security system, providing better coverage, a more flexible and adaptable setup, and the ability to react to new situations with ease. The downside is that today it requires more maintenance and logistics, as well as more operator oversight. In this work a simplified area surveillance scenario is used as one of the applications. Each agent is equipped with a camera and tasked to explore an area. Exploration is measured by dividing the area of interest into a number of cells. The agents in the swarm seek to explore, or cover, all the cells as frequently as possible (see left part of Figure 2). The median visitation count across all cells in the area is used to measure performance in the area surveillance task.

In the absence of a common wireless infrastructure, it is also natural to imagine that the swarm must provide its own communication network in order to relay information back to an operator or a ground control station. This requires the UAVs to continuously be in range and to communicate for the forwarding of data to be possible. A simplified scenario for maintaining a communication infrastructure is used as a second application for the swarm (middle part of Figure 2). Swarm behaviors are measured on the ability to maintain coverage and connectivity over a large area.
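The surveillance metric, the median visitation count over a grid of cells, can be sketched as follows. This is a minimal illustration; the area size and cell size used here are assumptions, not values from the paper.

```python
import numpy as np

def exploration_score(positions_log, area_size=3000.0, cell_size=100.0):
    """Median visitation count over all grid cells.

    positions_log: iterable of (x, y) agent positions sampled over time.
    The area and cell dimensions are illustrative assumptions.
    """
    n = int(area_size // cell_size)
    visits = np.zeros((n, n), dtype=int)
    for x, y in positions_log:
        i, j = int(x // cell_size), int(y // cell_size)
        if 0 <= i < n and 0 <= j < n:
            visits[i, j] += 1
    return float(np.median(visits))
```

Using the median rather than the mean rewards behaviors that visit all cells, since a swarm that repeatedly covers only a few cells leaves the median at zero.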
The performance in the communication network task is measured by calculating the area covered by the largest connected subgroup of the swarm, given a fixed communication radius.

PDOA geolocation is introduced as a third application for the multi-function swarm (right part of Figure 2). Geolocation refers to trying to find the geographic position, or coordinates, of an RF emitter based on sensor measurements. PDOA uses the RSS, or the received power, at multiple different points in space in order to estimate the location of the emitter (right part of Figure 2). PDOA geolocation is a form of trilateration, or more specifically multilateration, in that the sensor readings give an indication of distance (as opposed to the direction used in triangulation). A prediction of the emitter location can be made by minimizing Q(x, y). Q(x, y) (Eq. 1) is an error function that indicates the error compared to the Free Space Path Loss model, given that the emitter is at coordinates (x, y). P_k and P_l represent the RSS at
Figure 2.
The multi-function swarm is optimized on three applications: exploration, network creation and geolocation (from left to right). Each application requires distinct behaviors for optimal performance. Red dots indicate the swarm agents.

positions (x_k, y_k) and (x_l, y_l), respectively. Previous works presented a method of estimating the location of a transmitter using significantly fewer resources than commonly employed estimators (Engebråten et al. [2017]). In this work, Q(x, y) is sampled at 60 random locations and the location with the least error is used as an estimate for the emitter location. This forgoes the local search used in (Engebråten et al. [2017]).

Q(x, y) = \sum_{k=1}^{n} \sum_{l=k}^{n} \left[ (P_k - P_l) - \alpha \log \frac{(x - x_l)^2 + (y - y_l)^2}{(x - x_k)^2 + (y - y_k)^2} \right]^2   (1)

Over time, multiple estimates of the emitter location are produced by the swarm. The variance of all these predictions is calculated and used as a metric for performance in the geolocation task. It is important to note that PDOA geolocation places specific requirements on sensor placement, to avoid ambiguities that lead to great variance and inaccuracies in the predicted emitter locations. For more information about this, refer to previous works (Engebråten [2015]). In most cases, the variance naturally converges towards zero as the mean converges on the true mean.

Controllers for each swarm agent are based on a variant of Physicomimetics, or artificial physics (Spears et al. [2004]). Artificial forces act between the agents and, ultimately, define the behavior of the swarm. Unlike traditional physics, there is no limit on the type of forces that can act between agents.
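The sampling-based PDOA estimator can be sketched as follows. The 60-sample strategy follows the text; the path-loss constant `alpha` (10 per decade of squared distance under free-space loss), the square search area, and the function name are assumptions for illustration.

```python
import numpy as np

def pdoa_estimate(sensor_xy, rss, alpha=10.0, n_samples=60, area=3000.0, rng=None):
    """Estimate an emitter location by sampling Q(x, y) at random points
    and keeping the point with the least error (no local search).

    sensor_xy: (n, 2) sensor positions; rss: (n,) received powers in dB.
    alpha and the search-area bounds are illustrative assumptions.
    """
    rng = np.random.default_rng(rng)
    candidates = rng.uniform(0.0, area, size=(n_samples, 2))
    best, best_q = None, np.inf
    for cand in candidates:
        d2 = np.sum((sensor_xy - cand) ** 2, axis=1)  # squared distances to sensors
        q, n = 0.0, len(rss)
        for k in range(n):
            for l in range(k + 1, n):
                # Error versus the free-space path loss model for this pair
                q += ((rss[k] - rss[l]) - alpha * np.log10(d2[l] / d2[k])) ** 2
        if q < best_q:
            best, best_q = cand, q
    return best
```

With more candidate samples the estimate tightens around the true minimizer of Q; the paper's choice of 60 samples trades accuracy for the low compute budget of an embedded platform.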
The controller (Figure 3) uses eight inputs:

F_1) Nearest neighbor
F_2) Second nearest neighbor
F_3) Third nearest neighbor
F_4) Fourth nearest neighbor
F_5) Fifth nearest neighbor
F_6) Sixth nearest neighbor
F_7) Least frequently visited neighboring square
F_8) Average predicted emitter location

In this work, the force that acts between agents is defined by the Sigmoid-Well function. This function is comprised of two parts, a_i(d_i) (Eq. 3) and g_i(d_i) (Eq. 2). a_i(d_i) is a distance-dependent attraction-repulsion force.
Figure 3.
Each agent uses the distance to the six nearest neighbors, the direction to the least visited surrounding square, and the direction and distance to the average of the predicted emitter locations. Together, using the Sigmoid-Well w_i(d_i), they form a velocity setpoint V_sp.

g_i(d_i) is the gravity well component, which can contribute distance-holding type behaviors. These functions are defined by four parameters: the weight k_i, the scale parameter t_i, the center-distance c_i and the range parameter σ_i. The weight k_i determines the strength of the attraction-repulsion force. The scale parameter t_i defines the affinity towards the distance given by the center-distance c_i. The range parameter σ_i can increase or decrease the effective range of the gravity well, by lengthening the transition around the center distance c_i. Together a_i(d_i) and g_i(d_i) form the Sigmoid-Well function w_i(d_i) (Eq. 4). An example of the shape of each of these components can be seen in Figure 4, which shows how the function w_i(d_i) can be set to enact a repulsion/attraction force, in addition to a preference for holding a distance of 500 m.

g_i(d_i) = -t_i \cdot 2 (d_i - c_i) \cdot e^{-((d_i - c_i)/\sigma_i)^2}   (2)

a_i(d_i) = k_i \left( \frac{2}{1 + e^{-(d_i - c_i)/\sigma_i}} - 1 \right)   (3)

w_i(d_i) = a_i(d_i) + g_i(d_i)   (4)

V_sp = \frac{1}{8} \sum_{i=1}^{8} \frac{F_i}{\|F_i\|} \cdot w_i(\|F_i\|)   (5)

The eight inputs are combined by scaling them with the Sigmoid-Well function w_i(d_i) before summing the result to form a single velocity setpoint V_sp (Eq. 5). F_i is the distance delta vector from the agent position to the sensed object position. V_sp is calculated based on F_i and w_i(d_i). For each input there are 4 parameters: a weight k_i, a scale t_i, a center c_i and a range σ_i. With a total of 8 inputs this gives 32 parameters. The least visited neighboring square input is slightly different from the rest of the sensory inputs. This input gives only direction information to the controller, not a distance. The controller handles this by only weighting this input and not applying the distance-dependent Sigmoid-Well function. This means that in practice, for each controller, only 29 parameters impact the swarm behavior.
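Assuming one plausible reading of Eqs. (2) through (5), the controller core might be sketched in Python as follows. Parameter names follow the paper; the exact form of the gravity-well term is a best-effort interpretation, not a definitive implementation.

```python
import numpy as np

def sigmoid_well(d, k, t, c, sigma):
    """Sigmoid-Well force magnitude for one input at distance d.

    a(d): sigmoid attraction/repulsion scaled by the weight k.
    g(d): gravity-well term favoring the center distance c (best-effort
    reading of Eq. 2). Parameters follow the paper's k_i, t_i, c_i, sigma_i.
    """
    a = k * (2.0 / (1.0 + np.exp(-(d - c) / sigma)) - 1.0)
    g = -t * 2.0 * (d - c) * np.exp(-((d - c) / sigma) ** 2)
    return a + g

def velocity_setpoint(deltas, params):
    """Combine the input vectors into one velocity setpoint (Eq. 5).

    deltas: list of 2D offset vectors F_i (agent -> sensed object).
    params: list of (k, t, c, sigma) tuples, one per input.
    """
    v = np.zeros(2)
    for F, (k, t, c, sigma) in zip(deltas, params):
        dist = np.linalg.norm(F)
        if dist > 0.0:
            v += F / dist * sigmoid_well(dist, k, t, c, sigma)
    return v / len(deltas)
```

With the Figure 4 example parameters (k = 5.0, t = -0.1, c = 500.0, σ = 100.0), the force vanishes exactly at the 500 m center distance and saturates towards k far from it.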
(Left panel: the Sigmoid-Well function w_i(d) and its components a_i(d) and g_i(d), versus distance (m). Right panel: the integral of the Sigmoid-Well function, versus distance (m).)
Figure 4.
The Sigmoid-Well function used in these experiments. The weight and scale parameters are coupled to the center and spread parameters. This is because the sign of the Sigmoid-Well function changes at the center distance. Each agent minimizes the combined potential of all the contributing forces, indirectly moving to the minimum of the integral function on the right. k_i is 5.0, t_i is -0.1, c_i is 500.0 and σ_i is 100.

The controller inputs used in this work represent a very limited subset of everything that could be shared between agents. This is intentional, to reduce the amount of communication required. While this paper only pertains to simulation results, the goal is to fly this system outdoors. Outdoor challenges such as limited bandwidth, non-uniform antenna diagrams, loss of links, interference and link latency must be considered. When designing a swarm to work in the real world, the range and rate of communication need to be limited. If all agents in the swarm needed information from all other agents, the system would quickly break down due to network saturation.

Compared to previous works (Engebråten et al. [2018b]), the ranges of the weight and scale parameters were reduced. For these experiments, the weight parameter is limited from -2.0 to 2.0 and the scaling parameter is limited from -0.5 to 0.5. It should be noted that due to the form of the Sigmoid-Well function (Engebråten et al. [2018b]) the scaling parameter is stronger and has a smaller allowable range.

The Quality-Diversity method MAP-elites is used to evolve swarm controllers. MAP-elites seeks to explore the search space of all possible controllers by filling a number of characteristics bins spanning the search space of all controllers (Mouret and Clune [2015]). Variation of solutions in MAP-elites is done by mutations.
In this work, mutation is performed by first selecting a parameter at random, then adding a Gaussian perturbation with a mean of 0.0 and a variance that is 10% of the range of the chosen parameter. Only a single parameter is changed per mutation. Adaptation to fill new bins might require multiple sequential mutations in order to move from one characteristics bin to another. This type of adaptation might be challenging for the MAP-elites algorithm, as it also requires that the solution outperforms existing solutions along the path required to reach the new unoccupied bins.

As part of the evolutionary process, three behavior characteristics and a fitness metric are used. As mentioned in Subsection 3.3, the characteristics are: the median visitation count across all the cells in the area of operation, the area covered by the largest connected subgroup of the swarm, and the
variance in predicted locations. For the networking application each agent is assumed to have a fixed communication radius of 200 m. The fitness f is calculated in a deterministic manner based on the scale and weight parameters of the controller (Eq. 6). These parameters determine the magnitude of the output of the controller and as such correlate well with the motion that can be expected from the controller. In order to limit aggressiveness and reduce battery consumption, behaviors are optimized to minimize motion and maximize f.

f = \frac{1}{\|t\| + \|k\|}   (6)

A series of experiments are conducted to explore the evolutionary process itself, the viability of using the evolved swarm behaviors as behavior primitives, and the effect of noise on an evolutionary process using MAP-elites; finally, the value of disabling certain controller inputs is examined. The evolution of a single repertoire takes around 16 hours on a cluster running 132 threads, thus approximately 2112 CPU hours per repertoire. The total simulation time for the ablation study is approximately 152 064 CPU hours, or 17.6 CPU years.
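A minimal MAP-elites loop consistent with this setup might look as follows. Here `simulate`, `random_genome`, `mutate` and `to_bin` are placeholders for the swarm evaluation, variation and binning described in the text, and the fitness follows Eq. (6).

```python
import random

def fitness(t_params, k_params):
    """Eq. (6): favor controllers with small scale/weight magnitudes."""
    return 1.0 / (sum(abs(t) for t in t_params) + sum(abs(k) for k in k_params) + 1e-9)

def map_elites(simulate, random_genome, mutate, to_bin, generations=200, batch=200):
    """Greedy MAP-elites: keep a mutant if its bin is empty or if it beats
    the current elite in that bin. simulate(genome) returns
    (behavior characteristics, fitness)."""
    archive = {}  # bin index -> (genome, fitness)
    for _ in range(generations):
        for _ in range(batch):
            if archive:
                parent = random.choice(list(archive.values()))[0]
                child = mutate(parent)
            else:
                child = random_genome()
            chars, fit = simulate(child)
            b = to_bin(chars)
            if b not in archive or fit > archive[b][1]:
                archive[b] = (child, fit)
    return archive
```

The greedy replacement rule in the inner loop is exactly what makes the method sensitive to noisy characteristics, as discussed later in the paper.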
All the repertoires in this work have 10 exploration bins × 100 network bins × 10 localization characteristics bins. These are filled during 200 pseudo-generations, each evaluating and testing 200 individuals, resulting in the final repertoire. Figure 5 shows the progression of evolving a single repertoire. A total of 8 independent evolutionary runs are conducted. On average, across the 8 runs, the evolution resulted in 2031 solutions in each repertoire. This represents a coverage of 20.3%, with a standard deviation in the number of solutions of 101.1. As can be seen in Figure 5, the first half of the evolution fills out most of the repertoire. Solutions are further optimized, and the repertoire is slightly extended, during the second half of the evolutionary process.
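The 10 × 100 × 10 binning can be sketched as a simple quantization of the three characteristics. The bin counts follow the paper; the value ranges used here are hypothetical placeholders.

```python
def characteristics_to_bin(exploration, network, localization,
                           bins=(10, 100, 10),
                           ranges=((0.0, 50.0), (0.0, 4e6), (0.0, 1e6))):
    """Map the three behavior characteristics to a bin index tuple.

    Bin counts follow the paper; the characteristic value ranges are
    illustrative assumptions. Out-of-range values are clamped.
    """
    values = (exploration, network, localization)
    idx = []
    for v, n, (lo, hi) in zip(values, bins, ranges):
        frac = (v - lo) / (hi - lo)
        idx.append(min(n - 1, max(0, int(frac * n))))
    return tuple(idx)
```

Each evaluated controller is routed to exactly one of the 10 000 bins, and the greedy replacement rule then decides whether it becomes that bin's elite.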
In this subsection, a subset of the evolved controllers is examined and their viability as behavior primitives is investigated. The best controllers found across all runs are stored in a repertoire (Figure 6). From this repertoire, 16 controllers are selected by visual inspection and examined in greater detail. Selecting behaviors by visual inspection is possible because any repertoire can be flattened by slicing it (Cully et al. [2015]). Controllers are selected on the boundary of the feasible controller region, as this is assumed to provide the most extreme set of varied behaviors. Figure 6 shows the location of each of the solutions and Figure 7 shows trace plots of the behaviors.

Transitioning between different behaviors shows whether the behaviors may be used as behavior primitives. The transitions between the selected controllers are examined using a surrogate metric for the exploration behavior characteristic. The exploration metric used when evolving behaviors calculates the median visitation count across all cells in the simulation area. This does not work on a short timescale, as the area is too large and the median visitation count rarely gets above 0. Instead, the value of this metric is approximated using the derivative of the total visitation count, calculated over the time interval since the metric was last evaluated. This gives a rough indication of how good the behaviors are in the different
First generation100th generation
Network coverage E x p l o r a t i o n L o c a li z a t i o n v a r i a n c e Figure 5.
Progression of the evolution of one repertoire. Repertoires are visualized by slicing the three-dimensional behavior space along all three axes, this allows higher dimensions to be flattened for easiervisualization. The right repertoire is the final result after 200 generations. Brighter yellow indicates bettersolutions, as measured by fitness.applications, but is subject to noise. To alleviate the noise in this measurement each transition is tested1000 times. The average time series are shown in Figure 8. The figure shows four examples of transitionsbetween behaviors. As far as the controller is concerned, there is no discernible difference between startingfrom where another behavior left off versus from a clean simulation. This is essential for these behaviors tobe applicable as a set of behavior primitives and is what could enable the operator to change the behaviorof the swarm on the fly.
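The two exploration measures described above can be sketched as follows. The class and method names are hypothetical, and the simulation grid is flattened into a simple list of cell counters:

```python
import statistics

class ExplorationMetric:
    """Sketch of the two exploration measures: `median_visits` is the metric
    used during evolution; `visit_rate` is the short-timescale surrogate,
    the change in total visitation count since the last call divided by the
    elapsed time. Names are assumptions, not the paper's code."""

    def __init__(self, n_cells):
        self.visits = [0] * n_cells
        self._last_total = 0
        self._last_time = 0.0

    def record_visit(self, cell):
        self.visits[cell] += 1

    def median_visits(self):
        # On a large area this stays at 0 for a long time, hence the surrogate.
        return statistics.median(self.visits)

    def visit_rate(self, now):
        total = sum(self.visits)
        dt = now - self._last_time
        rate = (total - self._last_total) / dt if dt > 0 else 0.0
        self._last_total, self._last_time = total, now
        return rate
```

On short intervals the median stays pinned at zero while the rate responds immediately, which is exactly why the surrogate is used for the transition plots.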
MAP-elites is a greedy algorithm: every time a solution is mutated, it is kept as part of the repertoire if it fills a characteristics bin where there previously was no solution, or if it is better than the existing solution in that bin. This greediness is excellent for maintaining diversity and allows for better exploration of the search landscape, but it also poses a challenge. Many common metrics or fitness functions used in evolutionary optimization are stochastic. This applies to both fitness metrics and behavior characteristics. If a stochastic variable has high variance but a low mean, it might still outcompete a stochastic variable with a high mean and low variance. In the case of these experiments, a behavior might get a lucky draw from the metrics used to evaluate performance, resulting in an inferior solution being chosen over a superior solution. This is a challenge, as in many cases a solution with a higher mean and lower variability would be preferable to one with a lower mean and greater variability.

Figure 6. Combined repertoire from 8 separate evolutionary runs. Each filled square represents a characteristics bin. Circled controllers are selected for a more in-depth examination. Brighter yellow indicates better solutions, as measured by fitness.

To test whether a repertoire is reproducible, an entire repertoire is re-evaluated using more evaluations per solution. This gives a clear visual indication of whether each solution stays in the same characteristics bin or moves around. Figure 9 and Table 1 show that there is a noticeable reduction in the number of solutions in the repertoires as they are re-evaluated. The figure shows an overview of repertoires evolved and re-evaluated with 1, 5 and 10 evaluations per controller. It is important to note that using more evaluations in the initial evolution results in a smaller repertoire. However, this smaller repertoire is likely closer to the true shape of the space of all feasible swarming behaviors. Using only a single evaluation results in a repertoire with 2937 solutions, while 5 evaluations results in only 1957 solutions. The further reduction with 10 evaluations is much smaller, with the final repertoire having 1841 solutions.
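The re-evaluation procedure can be sketched as follows. This is a hedged reconstruction: the paper does not spell out how the repeated measurements are aggregated, so averaging the characteristics before re-binning is an assumption, as are all names:

```python
import statistics

def reevaluate(repertoire, simulate, to_bin, n_evals=20):
    """Re-run every stored controller n_evals times, average its measured
    characteristics, and keep it only if it still maps to the bin it
    occupies. The averaging scheme is an assumption, not the paper's code."""
    surviving = {}
    for b, (fit, genome) in repertoire.items():
        samples = [simulate(genome)[1] for _ in range(n_evals)]
        mean_chars = tuple(statistics.mean(dim) for dim in zip(*samples))
        if to_bin(mean_chars) == b:  # solution stayed in its bin
            surviving[b] = (fit, genome)
    return surviving
```

Comparing `len(surviving)` to `len(repertoire)` gives the shrinkage percentages reported in Table 1.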
Figure 7.
Trace plots of the 16 selected behaviors from Figure 6 (panels LLL, LLH, LHL, LHH, LL4, LM4, LH4, HL4, HM4, HH4, LL7, LM7, LH7, HL7, HM7 and HH7). Labels refer to locations in the repertoire; LHL, for instance, is a behavior with low exploration, high network coverage and low variance in geolocation predictions. Red lines indicate the paths of the UAVs, and black lines connect UAVs that are within communication radius of each other. Grey dots indicate location predictions. Deeper blue squares are more frequently explored. Behavior labels link to videos, and a complete overview can be found at …
It is important to note that re-evaluating using 20 evaluations per solution cannot be compared to evolving a repertoire with 20 evaluations per solution. During the evolutionary process, a total of 40 000 solutions are tested; during re-evaluation, only the 2000-3000 solutions in the repertoire are re-evaluated. As such, it is natural that the re-evaluated repertoire contains far fewer solutions, as it was not given time to search for solutions to fill all the characteristics bins.
Figure 8.
Four example transitions between two behaviors (HM4 to LM7, LH4 to LLL, LHH to LL7, and HL4 to LH7), showing the exploration, network coverage and localization variance scores over time. Transitions happen at time 900. Graphs show averages over 1000 tests. There is a temporary reduction in exploration score during the transition due to the way exploration is measured and the agents re-organizing to the new behavior.
Table 1. Overview of results of re-evaluation of controller repertoires.

                      1-eval repertoire   5-eval repertoire   10-eval repertoire
  Original                 2937                1957                 1841
  Re-eval 20-eval       823 (28.0%)         857 (43.8%)          859 (46.7%)
  Re-eval 50-eval       744 (25.3%)         759 (38.8%)          800 (43.4%)
  Re-eval 100-eval      700 (23.8%)         724 (37.0%)          743 (40.4%)

Increasing the number of evaluations seems to produce a repertoire that is more correct, or closer to the true underlying shape. However, the effect is also diminishing: going from 5 to 10 evaluations has little effect. Therefore, all other experiments in this work used 5 evaluations per individual. Table 1 shows that the number of evaluations used in the re-evaluation step is less important. The greatest reduction in repertoire size is found when using the highest number of evaluations in the re-evaluation step (100 evaluations per solution). However, the effect of increasing the number of evaluations on the robustness of the initial repertoires can also be clearly seen when re-evaluating the repertoire with only 20 evaluations per individual.
Figure 9.
Original (top) and re-evaluated (bottom) repertoires for runs using 1, 5, and 10 evaluations per controller in the initial repertoire evolution. Re-evaluation with 100 evaluations per candidate solution results in a large reduction in repertoire size. The number of solutions in each repertoire is shown in parentheses (700, 724 and 743, respectively, after re-evaluation).

To further investigate the challenge of reproducing behaviors and repertoires, a single behavior (HL7) is re-evaluated 1000 times, and the probability distribution of its behavior characteristics is shown in Figure 10. In the experiments conducted in this paper, the fitness can be computed deterministically from the genome, so only the three behavior characteristics (exploration, network coverage and localization variance) are reviewed.
Figure 10.
Example of a probability distribution of the characteristics (exploration, network coverage and localization) over 1000 simulations of the same controller. Counts are normalized to sum to one across all 10 bins for each characteristic. The solid line is the mean of the samples; dotted lines indicate one standard deviation.

Figure 10 shows that the distributions of each of the three metrics are fairly well behaved and resemble a normal distribution in most cases. Variation, as indicated in the figure, can cause a behavior to seemingly move between characteristics bins when re-evaluated; this is believed to be the primary cause of the reduction in repertoire size.

A challenge that remains is to successfully quantify the properties of the solutions. If the measure has stochastic properties, the exact same solution might fit in multiple bins in the repertoire. This, again, means that it is uncertain whether the evolutionary method captures the true shape of the behavior space. To visualize this, Figure 11 shows a combined repertoire over 8 evolutionary runs with uncertainty ellipses plotted as slices in 3D. As can be seen from the figure, the behavior is not always found at the center, or mean, of the distribution. The uncertainty ellipses (one standard deviation) are estimates generated by evaluating each of the selected controllers 1000 times and calculating the variance and mean over these runs. Figure 11 shows how solutions might jump between characteristics bins if re-evaluated. It also indicates that many of the solutions in the repertoire are at the very edge of the potential range of values that the characteristics may take on. This suggests that MAP-elites might be biased towards accepting solutions that have a high variance, as they sometimes get lucky and provide a solution for a hard-to-reach characteristics bin.
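Since the characteristic distributions look roughly normal, the chance that a re-evaluated behavior stays in its bin can be estimated directly from the measured mean and standard deviation. A small illustrative helper (the uniform bin edges on [0, 1] are an assumption made to match the normalized characteristics; the function name is hypothetical):

```python
import math

def bin_stay_probability(mean, std, n_bins=10, lo=0.0, hi=1.0):
    """Probability that a normally distributed characteristic (mean, std
    estimated from repeated simulations) falls inside the bin containing
    its mean, assuming a uniform bin grid on [lo, hi]."""
    width = (hi - lo) / n_bins
    idx = min(int((mean - lo) / width), n_bins - 1)
    left, right = lo + idx * width, lo + (idx + 1) * width
    # standard normal CDF via the error function
    cdf = lambda x: 0.5 * (1 + math.erf((x - mean) / (std * math.sqrt(2))))
    return cdf(right) - cdf(left)
```

For example, with 10 bins of width 0.1, a characteristic measured at 0.55 with a standard deviation of 0.2 stays in its home bin less than half the time, which is consistent with the bin-jumping behavior discussed above.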
The proposed controller has 8 inputs that determine the action of a swarm agent at any given time. In previous works (Engebråten et al. [2018b]), a simpler parametric controller with only 4 inputs was employed, as well as a controller with only scalar weights, which did not enable the agents to evolve holding-distance type behaviors. In this work, the distance and direction to three additional neighbors were added. This was done in the interest of improving the performance of the controller in the three given applications. To quantify the effect of this change, and the ability of the evolutionary process to find good swarm behaviors, an ablation study is performed. Individual inputs are disabled, which allows the effect of each input to be examined separately. Ablation refers to the selective disabling or removal of certain parts of a larger system in order to investigate the effect this has.
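The repertoire-size comparison behind the ablation study can be performed with a rank-sum (Mann-Whitney) statistic, as this paper does. Below is a minimal hand-rolled version of the U statistic for illustration; in practice `scipy.stats.ranksums` or `scipy.stats.mannwhitneyu` would be used and a p-value computed:

```python
def rank_sum_u(a, b):
    """Mann-Whitney U statistic for group `a` versus group `b`
    (e.g. repertoire sizes with and without an input). Ties get
    average ranks; tie corrections and p-values are omitted for brevity."""
    combined = sorted((v, src) for src, vals in ((0, a), (1, b)) for v in vals)
    ranks = {}
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # average of 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        i = j
    r_a = sum(ranks[k] for k, (_, src) in enumerate(combined) if src == 0)
    return r_a - len(a) * (len(a) + 1) / 2
```

A U near 0 or near `len(a) * len(b)` indicates that one group's repertoire sizes almost entirely dominate the other's, i.e. a clear effect of disabling the input.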
Figure 11.
Repertoire with uncertainty ellipses plotted for selected behaviors. The small circles indicate the location of each behavior in the final repertoire. Slices of the uncertainty ellipse are shown in the same color as the circle indicating the behavior. Note that behaviors at the edge of the repertoire are often also at the edge of their uncertainty ellipse. This indicates that behaviors with these characteristics may be hard to come by.

Figure 12 shows the effect on the number of individuals in the final repertoire when disabling a given input to the controller. The average number of individuals in the repertoire with one input disabled is compared to the repertoire utilizing all the information available. Disabling the nearest neighbor, the least frequently visited neighboring square, or the average predicted location results in a significant reduction in the number of individuals in the repertoire. This is tested using a rank-sum statistical test, comparing against repertoires evolved using the full set of inputs.
Figure 12.
Difference in average number of individuals when disabling each input to the swarm controller. An (s) after the x-axis label indicates statistical significance (P < …).

For each behavior, the attraction-repulsion value at a distance of d_i = 100 is examined. The value of a_i(100) is shown because the sign of the attraction-repulsion component of the Sigmoid-Well function a_i(d) changes around the center point. Showing the value of a_i(100) is a direct way of visualizing whether a weight contributes to a net negative or positive attraction towards the sensed object, without the dependency on the center parameter (c_i).

From Figure 13 it is possible to see that with increasing degree of exploration, Weight … scale parameter indicates that the behaviors are trying to hold a set distance to this neighbor, as opposed to a general attraction or repulsion.

Figure 13.
Visualization of the controller parameter space (Weights 1-8 and Scales 1-8, for each of the exploration, network and localization characteristics). Each pair of horizontal subplots is independent of the other pairs. For every slice in a repertoire (a column in this plot), a set of average parameters is calculated. Each pair of horizontal subplots shows the average of these average parameters across 8 independent repertoires. The upper pair shows the experiment with all information available; the lower three pairs show experiments where an input … Behaviors with very high variance in location prediction also seem to require input …
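The exact Sigmoid-Well form a_i(d) is defined in the paper's method section and is not reproduced here. As an illustration only, any sigmoid that flips sign around a center c_i, with weight w_i setting the magnitude and sign and scale s_i the steepness, reproduces the property discussed above. A hypothetical stand-in:

```python
import math

def sigmoid_well(d, w, c, s):
    """Illustrative stand-in for the Sigmoid-Well attraction-repulsion term
    a_i(d); NOT the paper's exact function. The key property is preserved:
    the sign flips around the center c, so the term is repulsive on one
    side of c and attractive on the other."""
    return w * math.tanh((d - c) / s)
```

Evaluating such a function at a fixed distance, e.g. a_i(100), shows at a glance whether a weight contributes net attraction or repulsion at that range, without having to inspect c_i separately.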
A key motivation in this work was to enable the top-down definition of swarm behaviors and to develop a framework for automating behavior generation. Operators, or even researchers, designing and using swarms are commonly interested in the macroscopic behavior of the swarm, not the low-level interaction between swarm agents. How to find the low-level controllers that enable a given high-level behavior is an unresolved question in swarm research. To this end, this work contributes another method of generating behaviors based on high-level goals or metrics. The methods presented here are powerful, but not complete. It is easy to develop behaviors that are fluid or organic. However, this framework would be less suitable for producing behaviors that require agents to assemble into pre-defined patterns. This is a trade-off, as the controllers in this work were made simplistic by design. Enabling more complex behaviors may require controllers with an internal state machine or, at the very least, more complex rule-based structures. This would further add to the time required to optimize or evolve the controllers.

Re-evaluating a whole repertoire highlights the issue of combining stochastic metrics with MAP-elites or, in general, Quality-Diversity methods. In these experiments, re-evaluation of the entire repertoire resulted in a reduction in repertoire size of up to 76.2%. Through the use of 5 evaluations per individual this was reduced to 63.0%, but this is still a drastic reduction in the size of the original repertoire. Multiple evaluations contribute to tackling this challenge. However, as seen from Figure 11, there is still room for the behaviors to seemingly move within the repertoire. These experiments highlight that noise in Quality-Diversity methods is a challenge, and that noise must be considered when designing experiments in order to discover the true shape of the underlying repertoire.
In traditional genetic algorithms, it is common to operate with a limited number of elites. In MAP-elites, all solutions in the repertoire become elites and none are re-evaluated. One idea could be to enforce a shelf-life on solutions, or to require re-evaluation if a solution has persisted in the repertoire for too long. This might remove solutions that got a lucky draw from a single or a few simulation runs. More research is required to find the appropriate measure to fully address this issue.

Behavior characteristics can be challenging to design, specifically because evolutionary methods excel at finding ways to exploit metrics without actually providing the intended, or desired, type of behavior. In this work, exploration is measured by the median visitation count. This was a result of previous experiments using an average-based metric, which resulted in behaviors merely alternating between two cells instead of actually exploring the area. This provided the same gain in the metric, but did not produce the type of behavior that was desired. For geolocation, the metric used is the variance of the predicted location. The assumption is that variance decreases as the estimated mean converges on the true mean. This is often the case, but not always: in some very specific cases it is possible to introduce a skew or a bias, where there is a fairly low variance in predictions while most of the predictions are in the wrong place (Engebråten [2015]).

The combination of a direct encoding and an open-ended evolutionary method makes it possible to further analyze the results. This would not have been as easy if the controllers had been, for instance, a neural network, as neural networks are inherently hard to fully analyze and understand. In particular, the directly encoded controller made it possible to visualize the effect and contribution of individual controller parameters to the overall swarm behaviors.
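The shelf-life idea suggested above could be sketched as follows. Everything here, the age bookkeeping, the eviction rule and the threshold, is a hypothetical design, not something the paper implements:

```python
def refresh_stale_elites(repertoire, simulate, to_bin, age, shelf_life=50):
    """Hypothetical shelf-life mechanism: any elite that has survived more
    than `shelf_life` generations is re-evaluated, and evicted if it no
    longer maps to its bin. `age` maps bin -> generations since the elite
    last changed; `repertoire` maps bin -> (fitness, genome)."""
    for b in list(repertoire):
        if age.get(b, 0) > shelf_life:
            _, genome = repertoire[b]
            _, chars = simulate(genome)
            if to_bin(chars) != b:
                del repertoire[b]  # a lucky draw is exposed; free the bin
            age[b] = 0
    return repertoire
```

This would trade extra simulation time for a repertoire that drifts closer to the true shape of the behavior space, in the same spirit as the multiple-evaluation scheme already used.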
Understanding the methods in use is key to furthering the field, and having tools that simplify this is important. This type of analysis would not have been possible with traditional multi-objective optimization, as the intermediate solutions that are not on the Pareto front are discarded.

The parameter heatmaps may show which controller parameters are in use, but this unfortunately does not paint the entire picture. Disabling the nearest neighbor input resulted in a drastic decrease in performance in the network application. However, when evolution could use all available inputs, the magnitude of Weight …
This paper presents a concept for automated behavior generation using evolution for a multi-function swarm. Multi-function swarms have the potential to allow for a new type of multi-tasking previously not seen in swarms. With complex environments and scenarios, it is likely that the operator's needs and requirements will change over time, and as such, the swarm should be capable of adapting to these changes.

The viability of evolving large repertoires of behaviors is demonstrated using MAP-elites. These behaviors can be considered behavior primitives that allow for easy adaptation of the swarm to new requirements. This can potentially even be achieved on the fly, if simple messages can be broadcast to the entire swarm. This allows the operator to change the behavior based on a change in preferences, desires or other external events.

Noise is a challenge in MAP-elites. The combination of a greedy algorithm and noisy metrics can result in repertoires that do not reflect the true shape and properties of the underlying system. In this work, multiple evaluations are used to reduce the effect of noise. Noise in metrics may enable poorly performing solutions with high variance to outperform better solutions with lower variance due to a lucky draw. Multiple evaluations are not a complete solution; however, this study highlights that noise must be considered when applying Quality-Diversity methods such as MAP-elites.

It is possible to investigate the effect each input to a controller has on the swarm performance through parameter ablation. The three most important inputs for this type of artificial physics controller were the nearest neighbor, the least frequently visited neighboring cell and the average predicted emitter location.
Results indicate that more information might be better, but more research is required to conclude with certainty.

Similarly to the adaptation mechanism presented by Cully et al. [2015], it is possible to use a repertoire of behaviors as a way of rapidly adapting to hardware faults or communication errors. In real-world systems, communication is unreliable. Having a repertoire could, in the future, enable even more graceful degradation of performance than what is currently innate to swarms. Optimizing repertoires not only for the three applications (exploration, network coverage and localization), but also for varying degrees of allowed communication, could bolster the resilience of swarm systems. This is future work.
REFERENCES
Atten, C., Channouf, L., Danoy, G., and Bouvry, P. (2016). UAV fleet mobility model with multiple pheromones for tracking moving observation targets. In European Conference on the Applications of Evolutionary Computation (Springer), 332–347
Basilico, N. and Carpin, S. (2015). Deploying teams of heterogeneous UAVs in cooperative two-level surveillance missions. In (IEEE), 610–615
Bayındır, L. (2016). A review of swarm robotics tasks. Neurocomputing (IEEE), vol. 4, 4292–4298
Berman, S., Halász, Á., Hsieh, M. A., and Kumar, V. (2009). Optimized stochastic policies for task allocation in swarms of robots. IEEE Transactions on Robotics 25, 927–937
Beyer, H.-G. (2000). Evolutionary algorithms in noisy environments: Theoretical issues and guidelines for practice. Computer Methods in Applied Mechanics and Engineering
… Swarm Intelligence 7, 1–41
Bredeche, N., Haasdijk, E., and Eiben, A. (2009). On-line, on-board evolution of robot controllers. In International Conference on Artificial Evolution (Evolution Artificielle) (Springer), 110–121
Brutschy, A., Pini, G., Pinciroli, C., Birattari, M., and Dorigo, M. (2014). Self-organized task allocation to sequentially interdependent tasks in swarm robotics. Autonomous Agents and Multi-Agent Systems (IEEE), 1255–1262
Cliff, D., Husbands, P., and Harvey, I. (1993). Explorations in evolutionary robotics. Adaptive Behavior
Cully, A., Clune, J., Tarapore, D., and Mouret, J.-B. (2015). Robots that can adapt like animals. Nature 521, 503–507
… IEEE Transactions on Evolutionary Computation 22, 245–259
Cully, A. and Mouret, J.-B. (2016). Evolving a behavioral repertoire for a walking robot. Evolutionary Computation 24, 59–88
Cummings, M. (2015). Operator interaction with centralized versus decentralized UAV architectures. Handbook of Unmanned Aerial Vehicles, 977–992
Davis, D. T., Chung, T. H., Clement, M. R., and Day, M. A. (2016). Consensus-based data sharing for large-scale aerial swarm coordination in lossy communications environments. In (IEEE), 3801–3808
Dorigo, M., Floreano, D., Gambardella, L. M., Mondada, F., Nolfi, S., Baaboura, T., et al. (2013). Swarmanoid: a novel concept for the study of heterogeneous robotic swarms. IEEE Robotics & Automation Magazine 20, 60–71
Dorigo, M., Trianni, V., Şahin, E., Groß, R., Labella, T. H., Baldassarre, G., et al. (2004). Evolving self-organizing behaviors for a swarm-bot. Autonomous Robots 17, 223–245
Duarte, M., Costa, V., Gomes, J., Rodrigues, T., Silva, F., Oliveira, S. M., et al. (2016a). Evolution of collective behaviors for a real swarm of aquatic surface robots. PloS One 11, e0151834
Duarte, M., Gomes, J., Costa, V., Rodrigues, T., Silva, F., Lobo, V., et al. (2016b). Application of swarm robotics systems to marine environmental monitoring. In OCEANS 2016 - Shanghai (IEEE), 1–8
Duarte, M., Gomes, J., Oliveira, S. M., and Christensen, A. L. (2016c). EvoRBC: Evolutionary repertoire-based control for robots with arbitrary locomotion complexity. In Proceedings of the Genetic and Evolutionary Computation Conference 2016 (ACM), 93–100
Duarte, M., Oliveira, S. M., and Christensen, A. L. (2014). Hybrid control for large swarms of aquatic drones. In Proceedings of the 14th International Conference on the Synthesis & Simulation of Living Systems (MIT Press, Cambridge, MA), 785–792
[Dataset] Eiben, A., Haasdijk, E., and Bredeche, N. (2010). Embodied, on-line, on-board evolution for autonomous robotics
Engebråten, S., Glette, K., and Yakimenko, O. (2018a). Field-testing of high-level decentralized controllers for a multi-function drone swarm. In (IEEE), 379–386
Engebråten, S. A. (2015). RF Emitter Geolocation using PDOA Algorithms and UAVs - A Strategy from Emitter Detection to Location Prediction. Master's thesis, NTNU
Engebråten, S. A., Moen, J., and Glette, K. (2017). Meta-heuristics for improved RF emitter localization. In European Conference on the Applications of Evolutionary Computation (Springer), 207–223
Engebråten, S. A., Moen, J., Yakimenko, O., and Glette, K. (2018b). Evolving a repertoire of controllers for a multi-function swarm. In International Conference on the Applications of Evolutionary Computation (Springer), 734–749
Francesca, G., Brambilla, M., Brutschy, A., Trianni, V., and Birattari, M. (2014). AutoMoDe: A novel approach to the automatic design of control software for robot swarms. Swarm Intelligence 8, 89–112
Hancock, P. J. (1994). An empirical comparison of selection methods in evolutionary algorithms. In AISB Workshop on Evolutionary Computing (Springer), 80–94
Hauert, S., Leven, S., Varga, M., Ruini, F., Cangelosi, A., Zufferey, J.-C., et al. (2011). Reynolds flocking in reality with fixed-wing robots: communication range vs. maximum turning rate. In (IEEE), 5015–5020
Hauert, S., Leven, S., Zufferey, J.-C., and Floreano, D. (2010). Communication-based swarming for flying robots. In Proceedings of the Workshop on Network Science and Systems Issues in Multi-Robot Autonomy, IEEE International Conference on Robotics and Automation (IEEE)
Hauert, S., Zufferey, J.-C., and Floreano, D. (2009). Evolved swarming without positioning information: an application in aerial communication relay. Autonomous Robots 26, 21–32
How, J., Kuwata, Y., and King, E. (2004). Flight demonstrations of cooperative control for UAV teams. In AIAA 3rd "Unmanned Unlimited" Technical Conference, Workshop and Exhibit, 6490
Hsieh, M. A., Kumar, V., and Chaimowicz, L. (2008). Decentralized controllers for shape generation with robotic swarms. Robotica 26, 691–701
Jevtic, A., Gutiérrez, A., Andina, D., and Jamshidi, M. (2011). Distributed bees algorithm for task allocation in swarm of robots. IEEE Systems Journal 6, 296–304
Jin, Y. and Branke, J. (2005). Evolutionary optimization in uncertain environments - a survey. IEEE Transactions on Evolutionary Computation 9, 303–317
Jones, S., Studley, M., Hauert, S., and Winfield, A. (2018). Evolving behaviour trees for swarm robotics. In Distributed Autonomous Robotic Systems (Springer), 487–501
Justesen, N., Risi, S., and Mouret, J.-B. (2019). MAP-elites for noisy domains by adaptive sampling. In Proceedings of the Genetic and Evolutionary Computation Conference Companion (ACM), 121–122
Khatib, O. (1986). Real-time obstacle avoidance for manipulators and mobile robots. In Autonomous Robot Vehicles (Springer), 396–404
Krogh, B. (1984). A generalized potential field approach to obstacle avoidance control. In Proc. SME Conf. on Robotics Research: The Next Five Years and Beyond, Bethlehem, PA, 11–22
Krupke, D., Ernestus, M., Hemmer, M., and Fekete, S. P. (2015). Distributed cohesive control for robot swarms: Maintaining good connectivity in the presence of exterior forces. In Intelligent Robots and Systems (IROS), 2015 IEEE/RSJ International Conference on (IEEE), 413–420
Kuntze, H. and Schill, W. (1982). Methods for collision avoidance in computer controlled industrial robots. In Proceedings of the 12th International Symposium on Industrial Robots, 519–530
Kushleyev, A., Mellinger, D., Powers, C., and Kumar, V. (2013). Towards a swarm of agile micro quadrotors. Autonomous Robots 35, 287–300
Lee, M. C. and Park, M. G. (2003). Artificial potential field based path planning for mobile robots using a virtual obstacle concept. In Advanced Intelligent Mechatronics, 2003 (AIM 2003), IEEE/ASME International Conference on (IEEE), vol. 2, 735–740
Lindsey, Q., Mellinger, D., and Kumar, V. (2012). Construction with quadrotor teams. Autonomous Robots 33, 323–336
Meng, Y. and Gan, J. (2008). Self-adaptive distributed multi-task allocation in a multi-robot system. In (IEEE), 398–404
Miglino, O., Lund, H. H., and Nolfi, S. (1995). Evolving mobile robots in simulated and real environments. Artificial Life 2, 417–434
Mondada, F., Pettinaro, G. C., Guignard, A., Kwee, I. W., Floreano, D., Deneubourg, J.-L., et al. (2004). Swarm-bot: A new distributed robotic concept. Autonomous Robots 17, 193–221
Mouret, J.-B. and Clune, J. (2015). Illuminating search spaces by mapping elites. arXiv preprint arXiv:1504.04909
Mulgaonkar, Y. (2012). Automated Recharging for Persistence Missions with Multiple Micro Aerial Vehicles. Ph.D. thesis, University of Pennsylvania
Park, M. G., Jeon, J. H., and Lee, M. C. (2001). Obstacle avoidance for mobile robots using artificial potential field approach with simulated annealing. In Industrial Electronics, 2001 (ISIE 2001), IEEE International Symposium on (IEEE), vol. 3, 1530–1535
Pinciroli, C., Trianni, V., O'Grady, R., Pini, G., Brutschy, A., Brambilla, M., et al. (2011). ARGoS: a modular, multi-engine simulator for heterogeneous swarm robotics. In (IEEE), 5027–5034
Pinciroli, C., Trianni, V., O'Grady, R., Pini, G., Brutschy, A., Brambilla, M., et al. (2012). ARGoS: a modular, parallel, multi-engine simulator for multi-robot systems. Swarm Intelligence 6, 271–295
Preiss, J. A., Honig, W., Sukhatme, G. S., and Ayanian, N. (2017). Crazyswarm: A large nano-quadcopter swarm. In (IEEE), 3299–3304
Pugh, J. K., Soros, L. B., and Stanley, K. O. (2016). Quality diversity: A new frontier for evolutionary computation. Frontiers in Robotics and AI 3, 40
Reynolds, C. W. (1987). Flocks, herds and schools: A distributed behavioral model, vol. 21 (ACM)
Schrum, J. and Miikkulainen, R. (2012). Evolving multimodal networks for multitask games. IEEE Transactions on Computational Intelligence and AI in Games 4, 94–111
Schrum, J. and Miikkulainen, R. (2014). Evolving multimodal behavior with modular neural networks in Ms. Pac-Man. In Proceedings of the 2014 Annual Conference on Genetic and Evolutionary Computation (ACM), 325–332
Schrum, J. and Miikkulainen, R. (2015). Discovering multimodal behavior in Ms. Pac-Man through evolution of modular neural networks. IEEE Transactions on Computational Intelligence and AI in Games 8, 67–81
Schuler, T., Lofaro, D., Mcguire, L., Schroer, A., Lin, T., and Sofge, D. (2019). A study of robotic swarms and emergent behaviors using 25+ real-world lighter-than-air autonomous agents
Spears, W. M., Spears, D. F., Heil, R., Kerr, W., and Hettiarachchi, S. (2004). An overview of physicomimetics. In International Workshop on Swarm Robotics (Springer), 84–97
Tarapore, D., Clune, J., Cully, A., and Mouret, J.-B. (2016). How do different encodings influence the performance of the MAP-elites algorithm? In Proceedings of the Genetic and Evolutionary Computation Conference 2016 (ACM), 173–180
Trianni, V., Groß, R., Labella, T. H., Şahin, E., and Dorigo, M. (2003). Evolving aggregation behaviors in a swarm of robots. In European Conference on Artificial Life (Springer), 865–874
Vadakkepat, P., Tan, K. C., and Ming-Liang, W. (2000). Evolutionary artificial potential fields and their application in real time robot path planning. In Evolutionary Computation, 2000, Proceedings of the 2000 Congress on (IEEE), vol. 1, 256–263
Vásárhelyi, G., Virágh, C., Somorjai, G., Nepusz, T., Eiben, A. E., and Vicsek, T. (2018). Optimized flocking of autonomous drones in confined environments. Science Robotics 3, eaat3536
Wang, H., Chien, S. Y., Lewis, M., Velagapudi, P., Scerri, P., and Sycara, K. (2009). Human teams for large scale multirobot control. In 2009 IEEE International Conference on Systems, Man and Cybernetics