CIMAX: Collective Information Maximization in Robotic Swarms Using Local Communication
Hannes Hornischer, Joshua Cherian Varughese, Ronald Thenius, Franz Wotawa, Manfred Füllsack, Thomas Schmickl
Artificial Life Laboratory, Department of Zoology, Institute of Biology, Karl-Franzens-University, Graz, Austria
Institute of Systems Sciences, Innovation and Sustainability Research, Karl-Franzens-University, Graz, Austria
Institute for Software Technology, Technical University of Graz, Austria
Abstract
Robotic swarms and mobile sensor networks are used for environmental monitoring in various domains and areas of operation. Especially in otherwise inaccessible environments, decentralized robotic swarms can be advantageous due to their high spatial resolution of measurements and their resilience to failure of individuals in the swarm. However, such robotic swarms might need to compensate for misplacement during deployment or adapt to dynamic changes in the environment. Reaching a collective decision in a swarm with limited communication abilities, without a central entity serving as decision-maker, can be a challenging task. Here we present the CIMAX algorithm for collective decision making, which maximizes the information gathered by the swarm as a whole. Agents negotiate based on their individual sensor readings and ultimately decide to collectively move in a particular direction, so that the swarm as a whole increases the amount of relevant measurements and thus of accessible information. We use both simulation and real robotic experiments for presenting, testing and validating our algorithm. CIMAX is designed to be used in underwater swarm robots for troubleshooting an oxygen depletion phenomenon known as “anoxia”.
Swarms of various lifeforms have been observed to utilize emergent group dynamics (Eberhart et al., 2001) for various tasks such as foraging (Seeley, 1992), reproduction (Bonner, 1949; Durston, 1973) or escaping predators (Cavagna et al., 2010; Brock and Riffenburgh, 1960; Magurran and Pitcher, 1987). Seeley (1992) discovered how bees use waggle dances for foraging by pointing their hive to high-quality food sources. Bonner (1949) and Durston (1973) examined the communication behavior of slime mould cells which, despite its simplicity, enables self-organization with respect to foraging, reproduction et cetera. Cavagna et al. (2010) analyzed how starling flocks respond to external stimuli as a collective in order to evade predators. Due to the availability of many eyes in a swarm, each individual spends less time on scanning the area for predators while spending more time on foraging. Magurran and Pitcher (1987) experimentally demonstrated various formations used by shoals of minnows when detecting predators. Decentralized intelligence of this kind is popularly known as swarm intelligence (Beni and Wang, 1989). Natural systems exhibiting this decentralized intelligence have inspired researchers due to their adaptability to the environment, resilience to perturbations and underlying simplicity. However, despite the simple rules governing the behavior of individuals in a swarm, the resulting collective behavior often shows a stunning degree of complexity, as can also be observed in the synchronized flashing of the fireflies Lampyridae (Buck, 1988). Researchers in many emerging fields such as ubiquitous computing (Kim and Follmer, 2017), multi-robot systems (Zahadat and Schmickl, 2016; Kernbach et al., 2009), traffic management (Renfrew and Yu, 2009) etc. have recognized parallels between such multi-agent artificial systems and natural systems containing several actors (Garnier et al., 2007). As a result, extensive research has been dedicated to self-organization and decentralization in complex systems (Dorigo et al., 2004). When designing swarms or sensor networks, one challenge that often needs to be addressed is the collective decision making task (Kernbach et al., 2013). In this paper we present an algorithm enabling a swarm of individuals with limited communication abilities to make a collective decision regarding its direction of motion in order to maximize the information accessible to the collective.

∗ Corresponding author: [email protected]

The algorithm presented in this paper enables a swarm to increase its information entropy over time. For example, consider a swarm of N agents where each measurement x_i independently follows a uniform probability distribution, p(x_i) = 1/M with M possible measurements. The resulting information entropy for each agent according to Shannon's measure of information entropy,

H(x) = - \sum_{i=1}^{M} p(x_i) \log_b(p(x_i)),    (1)

for a binary logarithm (b = 2) is H = log_2(M). The entropy for N independent agents is H = N · log_2(M). As the number of possible options in the distribution decreases, the information entropy of the entire system decreases. In terms of the quantities measured by the swarm, a larger variance in those measurements means larger values of M and therefore a larger information entropy of the swarm.

In the algorithm presented in this work we use the variance in the measurements of the swarm in combination with a simple bio-inspired communication mechanism to enable swarms to maximize the information available to them. Swarms move in a direction which leads to an increase in the information available to the swarm as a whole. In contrast to centralized swarms, here the individual agents use only local information.
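As a concrete illustration of Eq. (1): with M = 8 equally likely measurement values and b = 2, each agent contributes H = log2(8) = 3 bits, and N = 10 independent agents contribute 30 bits. A minimal sketch (the values M and N are chosen purely for illustration):

```python
import math

def shannon_entropy(probs, base=2):
    """Shannon entropy H = -sum_i p_i * log_base(p_i), per Eq. (1)."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

M = 8                               # number of possible measurement values
uniform = [1.0 / M] * M             # p(x_i) = 1/M for all i
H_single = shannon_entropy(uniform) # log2(M) = 3 bits per agent
N = 10                              # independent agents
H_swarm = N * H_single              # entropies of independent agents add up
```

Narrowing the distribution (smaller effective M) lowers `H_single` and hence the swarm's total entropy, which is why larger measurement variance corresponds to more accessible information.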
In the following we refer to the algorithm as CIMAX. We initially designed CIMAX to address the task of documenting, examining and ultimately forecasting the frequently but irregularly appearing anoxic waters phenomenon (Runca et al., 1996) in the lagoon of Venice. During this phenomenon, which we refer to as “anoxia”, the oxygen content of a small part of the lagoon decreases dramatically, resulting in the death of animal life in that specific area. Anoxia adversely affects the flora and fauna in the lagoon and also causes difficulties for the inhabitants and tourists in Venice.

A strategy to examine and document this phenomenon is to utilize a swarm of underwater robots for monitoring a set of environmental parameters, i.a. oxygen concentration levels. For determining the dynamics and spatio-temporal evolution of anoxic areas in the water body, a swarm of robots allows monitoring at various underwater locations and thus a high spatial resolution. One implementation of such swarm robots used for autonomous long-term underwater monitoring was developed and extensively tested in real-world marine environments within the project “subCULTron” (subCULTron, 2015): the so-called “aMussel” (Donati et al., 2017). Due to problems such as expensive hardware (Akyildiz et al., 2005), high power consumption (Stojanovic and Preisig, 2009) and the general complexity of communication underwater (Lanbo et al., 2008), the main approach for communication among members of a swarm of aMussels is based on using modulated light for local information transfer.

When deploying a swarm at a target location it is not guaranteed that the location is sufficiently covered. It is possible that only a few robots are in contact with the anoxic area while the majority of the swarm is not. Moreover, even in case the swarm is optimally placed, the target area is dynamic and hence mobile.
Therefore, the CIMAX algorithm can continuously guide the swarm to areas of interest.

The problem that CIMAX addresses is a classical problem of collective decision making in multi-agent systems, where individual entities might make conflicting decisions based on local information. According to Trianni and Campo (2015), algorithms for collective decision making in natural and artificial swarms can be categorized into three main mechanisms. In the first mechanism, the swarm waits for one entity to have enough information to make a decision and then propagates that decision within the swarm. Organizational structures following this mechanism can be found in the form of hierarchies within animal and human societies (Rabb et al., 1967; Ahl and Allen, 1996). The second mechanism is called opinion averaging, in which all individuals constantly adjust their own opinion based on their neighbours' opinions until the entire swarm eventually converges to one opinion. This mechanism for collective decision making in robot swarms can also be found in groups of animals, which use it for effectively navigating as a collective (Simons, 2004; Codling et al., 2007). The third mechanism is based on the amplification of a particular opinion to produce a collective decision. In this mechanism, each individual randomly starts with an opinion and then changes to other opinions depending on how often it hears them. The amplification mechanism is also found in animals, such as the pheromone trail selection in ants (Beckers et al., 1990) or the temperature-based site selection of young bees (Szopek et al., 2013).
The underlying mechanism of collective decision making of the algorithm presented in this paper relies on agents gradually adjusting their opinions towards the most widely held opinions within the swarm, which is associated with the second category of mechanisms presented in (Trianni and Campo, 2015).

Apart from collective decision making in swarm robotics, our approach is broadly related to the relocation of sensor nodes in mobile wireless sensor networks (“MWSN”) (Wang et al., 2005; Li et al., 2007; Cui et al., 2004). When deploying a swarm, e.g. in an otherwise inaccessible environment, the swarm is often not arranged properly for effective measurement and observation due to inaccurate knowledge of the target area, dynamic changes in local conditions or unforeseeable events. For optimizing parameters such as coverage, connectivity or network longevity, individual members of the network need to be relocated, for which a variety of approaches has been suggested.

While some approaches to sensor relocation rely on having access to global information about the position of sensors (Wang et al., 2005), this problem is often approached in a decentralized manner. In (Wang et al., 2005) the exact position of the sensors is known by the base station or a similar central entity. The area covered by the sensors is increased while minimizing the travel time and the distance travelled using genetic algorithms. Such a system is used to compensate for coverage loss when sensors fail in the field.

In (Li et al., 2007; Cui et al., 2004), a more decentralized approach for the relocation of sensors is followed in order to maintain coverage of a sensor network. The sensors periodically broadcast their locations and identifiers to their neighbours and construct a Voronoi diagram. Voronoi polygons are computed using the received information. Once a node finds a hole in the Voronoi diagram, i.e. a relatively large polygon, the relocation of a sensor is initiated.
In Cui et al. (2004), a simulation of an odor localization scenario with a group of mobile robots is presented. The authors focus on using fuzzy logic to decide which direction to move in, in order to eventually localize the source. They assume that measurements from each agent are easily available to the other agents in the swarm, and therefore the agent-to-agent communication aspect is not adequately addressed.

In contrast to such approaches, we here present a method to maximize the information about the environment collected by a swarm, based on a bio-inspired communication mechanism. The CIMAX algorithm differs from existing approaches in the following ways: 1) the swarm has no direct access to global information, i.e. there is no central entity knowing the positions of all sensors; 2) nor are agents able to receive instructions or be organized by a central entity; 3) CIMAX maximizes the diversity or variance of the measurements collected by the swarm as a whole; and 4) our approach utilizes not only the content of received signals but also their properties. We present both numerical simulations and robotic experiments to validate the presented method. Furthermore, this algorithm can be embedded into the “wave oriented swarm programming paradigm” (WOSPP) (Thenius et al., 2018), a framework for controlling swarms using the communication mechanism we briefly introduce in Section 2. In Section 2.2 we present the algorithm and its implementation. The computational results and a theoretical analysis of the algorithm are shown in Section 3, including numerical simulations in the aforementioned target environment and scenario. In Section 4 we present the experimental setup and results, which are then discussed in Section 5.

The CIMAX algorithm enables a swarm of individuals with limited communication abilities to make a collective decision regarding its direction of motion in order to maximize the information accessible to the collective.
The fundamental communication mechanism presented here is inspired by slime mold (Dictyostelium discoideum) and fireflies (Lampyridae) and has previously been used to design various algorithms (Varughese et al., 2016; Thenius et al., 2018). Thenius et al. (2018) unified various swarm behaviours into one general framework called the “wave oriented swarm programming paradigm” or WOSPP.
In the WOSPP communication paradigm, all agents can enter three different states, similar to the behavior of slime mold (Dictyostelium discoideum): an “inactive” state in which agents are receptive to incoming communication, an “active” state in which they send or relay a signal, followed by a “refractory” state in which agents are temporarily insensitive to incoming signals. This communication mechanism is schematically shown in Figure 1. Agents initiate a signal randomly by initially setting a timer within t_p ∈ (0, t_p^max]. In this manner, each agent initiates the sending of a signal at least once within a time period t_p^max (the maximum the timer can randomly be set to), which we refer to as a 'cycle'.

Figure 1: The WOSPP communication mechanism. (a) Agents can be in one of three states. From the inactive state, an incoming message or the decision to initiate a message lets an agent transition into the active state. In the active state an agent either relays an incoming message or initiates a new message. Subsequently agents enter the refractory state, being insensitive to incoming messages for a finite time until transitioning to the inactive state again. The conceptual operating structure of an agent is illustrated in (b).

The three states of the agents ultimately allow wave-like propagation of signals through the swarm, as shown in Figure 2. Signals are solely received by agents in the close neighborhood, i.e. within perception range R of the sender, and subsequently relayed, thus propagating through the system. After receiving a signal, agents relay it with a delay of one timestep, t_delay = 1 s, which we use in the following as the basic unit of time. The refractory state assures that a signal will neither 'flood' the swarm, i.e. signals will not (re)activate the initial sender, nor periodically propagate through the swarm, e.g. as a spiraling wave. In Figure 2 (a)-(e) a temporal sequence of a signal propagating through a swarm is shown.
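The three-state cycle can be sketched as a minimal per-agent state machine. This is a schematic sketch under the assumptions stated above (one-timestep relay delay, fixed refractory duration), not the aMussel firmware; the class and method names are our own:

```python
from enum import Enum

class State(Enum):
    INACTIVE = 0    # receptive to incoming signals
    ACTIVE = 1      # broadcasting / relaying a signal
    REFRACTORY = 2  # temporarily insensitive to signals

class WosppAgent:
    def __init__(self, t_ref=10):
        self.state = State.INACTIVE
        self.t_ref = t_ref      # refractory duration in timesteps
        self.ref_timer = 0

    def receive(self):
        """An incoming signal only triggers an inactive agent."""
        if self.state is State.INACTIVE:
            self.state = State.ACTIVE   # will relay on the next timestep
            return True
        return False                    # active/refractory agents ignore it

    def step(self):
        """Advance one timestep (t_delay = 1 s in the paper's units)."""
        if self.state is State.ACTIVE:
            # after broadcasting, become insensitive for t_ref timesteps
            self.state = State.REFRACTORY
            self.ref_timer = self.t_ref
        elif self.state is State.REFRACTORY:
            self.ref_timer -= 1
            if self.ref_timer <= 0:
                self.state = State.INACTIVE
```

Because `receive` only fires from the inactive state, a relayed wave cannot re-activate agents that have just broadcast, which is exactly what prevents flooding.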
Figure 2(f) shows several trajectories the signals took during the propagation in panels (a)-(e), indicated by red lines. For this algorithm agents need to be able to communicate with their nearest neighbors, move or have some means of transportation, and have a common sense of direction.

Figure 2: Illustration of wave-based communication. In (a) almost all agents are in the inactive state, shown in black, except one agent which broadcasts a message, i.e. enters the active state, shown in red. It afterwards transitions into the refractory state, shown in blue. Neighboring agents receive the signal and switch to the active state, as shown in (b) and (c). The signal spreads in a wave-like manner. In (d) the initiating agent switches from the refractory state into the inactive state again. Due to the fixed duration of the refractory state, the transition to the inactive state spreads in a wave-like manner as well, shown in (d) and (e). In (f) several trajectories along which the signals were broadcast are shown as red lines. The perception range R of an agent is shown as a bar in the bottom right corner of (a). Times [s]: (a) 0, (b) 2, (c) 4, (d) 11, (e) 16. Parameters: number of agents N = 80, physical size of the swarm in units of perception range R_s = 5R, refractory time t_ref = 10 s.

In the following we present the algorithm for maximizing the information accessible to, or collected by, the swarm. For this scenario we define information as the diversity of measurements throughout the swarm, quantified using the variance of the measurements.
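The wave-like spread and the role of the refractory state can be reproduced in a minimal 1-D sketch. The chain topology, agent count and step counts below are illustrative assumptions, not the paper's 2-D setup; the point is that with a sufficiently long refractory time each agent fires exactly once and the signal never reflects back:

```python
def propagate(n_agents=10, t_ref=10, steps=20):
    """Simulate a signal initiated at agent 0 travelling along a chain.
    Returns how often each agent entered the active state."""
    INACTIVE, ACTIVE, REFRACTORY = 0, 1, 2
    state = [INACTIVE] * n_agents
    ref = [0] * n_agents
    state[0] = ACTIVE                  # agent 0 initiates the message
    activations = [0] * n_agents
    activations[0] = 1
    for _ in range(steps):             # one loop iteration = one timestep
        # active agents trigger their inactive neighbors (range R = 1 hop)
        to_activate = set()
        for i, s in enumerate(state):
            if s == ACTIVE:
                for j in (i - 1, i + 1):
                    if 0 <= j < n_agents and state[j] == INACTIVE:
                        to_activate.add(j)
        # state transitions: active -> refractory -> (after t_ref) inactive
        for i in range(n_agents):
            if state[i] == ACTIVE:
                state[i], ref[i] = REFRACTORY, t_ref
            elif state[i] == REFRACTORY:
                ref[i] -= 1
                if ref[i] <= 0:
                    state[i] = INACTIVE
        for j in to_activate:          # relay with a delay of one timestep
            state[j] = ACTIVE
            activations[j] += 1
    return activations
```

Running `propagate()` yields an activation count of exactly 1 for every agent: the wave crosses the chain once and dies out, mirroring Figure 2 (a)-(e).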
Thus the swarm ultimately detects diverse domains or transition areas between homogeneous domains, while uniform domains are considered redundant and providing less information.

Each agent in the swarm measures the same single quantity X, which we use as a generic placeholder for any environmental parameter or quantity measured by swarms. When one agent initiates a message, it sends its own measurement value as the message. Neighboring agents receive the message, append their own measurements and relay the message. This way a message propagating through the system incrementally grows in length with every relay. With this in mind, for easier illustration of the algorithm we divide the entire procedure into three parts: “information gathering”, “evaluation” and “collective decision”. However, for implementation there are various possibilities, depending on the abilities and specific tasks of the target medium, without the need for this division.

The three parts of the algorithm are exemplarily illustrated in Figure 4 for a swarm of N = 4 agents. The agents, represented by black circles, are arranged in a line. They are able to measure a quantity X of the environment, which is represented by the background colors red and yellow.

Figure 3: (a) An agent (black circle) receives an incoming message with measurements m_i. (b) It then appends its own measurement and stores the message before in (c) it broadcasts this extended message.

• Information gathering: agents randomly (in time) initiate sending a message containing their own sensor readings. Each agent which has received this message stores the received information as well as the direction from which it received the message. Finally, each agent appends its own sensor readings before then broadcasting it to its neighbours.
This process is schematically shown in Figure 3, resulting in a dispersion of information about the sensor readings of agents throughout the whole swarm.

• Evaluation: agents evaluate the stored messages with respect to the directions from which they were received. Agents then determine the diversity of the content of all messages associated with a certain direction. Depending on the system's characteristics this is practically done e.g. by calculating the variance of all elements contained in those messages. The calculated diversities serve as “weights” for all directions. Agents finally consider the direction with the largest weight (e.g. variance) as their preferred direction to move towards. Figure 4 (b) shows the evaluation of the two messages initiated in Figure 4 (a).

• Collective decision: agents agree on a common direction to move towards a target location, based on the individual preferences of directions. One option is to let agents communicate their opinions on a preferred direction to their neighbors. Those then, instead of relaying a message, simply change their own preferred direction by a small factor towards the received direction. This way opinions 'diffuse' through the swarm, letting it converge to a common opinion. Figure 4 (c) shows the result of a collective decision for the example shown in Figure 4 (a) and (b).

Algs. 1, 2 and 3 show the pseudo-code for the three parts “information gathering”, “evaluation” and “collective decision”, respectively.
Please note that the presented pseudo-code is an exemplary implementation of the algorithm and does not exclude alternative ways of implementing it.

Mode ← “information gathering”
state ← inactive
timer(t_p) ← random integer ∈ (0, t_p^max]
while Mode = “information gathering” do
    decrement timer(t_p)
    if agent in refractory state then
        wait for refractory time
        if refractory time is over then
            state ← inactive
    if agent in active state then
        broadcast message
        state ← refractory
    if agent in inactive state then
        listen for incoming pings
        if message received then
            state ← active
            append own measurements to received message i
            calculate variance V_i of the measurements contained in message i
            store variance V_i with respect to the direction of reception of the message, V_i^dir
    if timer(t_p) ≤ 0 then
        state ← active
        create empty message and append own measurements to message

Algorithm 1: Pseudo-code of “information gathering”
In this section we first present the behavior of a swarm in systems consisting of a discrete and a smooth linear transition, respectively, to give an intuitive understanding of its behavioral dynamics. We then examine a computational scenario close to a real application case.

Mode ← “evaluation”
calculate average of variances V^dir for each direction of reception dir
preferred direction dir_preferred ← choose direction associated with the largest average variance, V^dir_preferred = max{V^dir}
empty storage

Algorithm 2:
Pseudo-code for “evaluation”

Mode ← “collective decision”
state ← inactive
timer(t_p) ← random integer ∈ (0, t_p^max]
while Mode = “collective decision” do
    decrement timer(t_p)
    if agent in refractory state then
        wait for refractory time
        if refractory time is over then
            state ← inactive
    if agent in active state then
        broadcast preferred direction dir_preferred
        state ← refractory
    if agent in inactive state then
        listen for incoming pings
        if message received then
            state ← active
            adjust own preferred direction by 10% towards the preferred direction contained in the incoming message
    if timer(t_p) ≤ 0 then
        state ← active

Algorithm 3: Pseudo-code of “collective decision”

Figure 4: The three sub-parts the algorithm can be divided into: (a) information gathering, (b) evaluation, (c) collective decision. Four agents illustrated as black dots constitute a swarm in a system with two domains, yellow and red. The colors represent two different measurements of quantity X. (a) The dispersion of two independent messages. On the left hand side the top agent (in the yellow domain) initiates a message with its own measurement, illustrated as a yellow dot in curly brackets next to the agent. The message propagates from agent to agent, each of which appends its own measurement (here depicted as color). On the right hand side the same scenario is shown, only this time the bottom agent, in the red domain, initiates the message, which then propagates upwards. (b) The evaluation of the two messages. The diversity of a message is illustrated as the number of different measured colors. The top agent received no messages from the upward direction and thus considers a weight of w = 0 colors for upward, however a weight of w = 2 colors for downward. Its preferred direction therefore is down, which is indicated by the arrow in the green box on the right hand side. All agents calculate their preferred directions in this way.
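The cores of Algorithms 2 and 3 can be sketched in a few lines. Here `statistics.pvariance` stands in for the variance-based diversity weight, and the heading update mirrors the 10% opinion adjustment; the function names and the representation of opinions as angles in radians are our own assumptions:

```python
import math
import statistics

def preferred_direction(messages_by_direction):
    """Algorithm 2 sketch: weight each direction of reception by the
    average variance of the messages received from it, and prefer the
    direction with the largest weight."""
    weights = {}
    for direction, messages in messages_by_direction.items():
        variances = [statistics.pvariance(m) for m in messages if len(m) > 1]
        weights[direction] = statistics.mean(variances) if variances else 0.0
    return max(weights, key=weights.get)

def diffuse_opinion(own_heading, received_heading, rate=0.1):
    """Algorithm 3 sketch: shift the agent's preferred heading by 10%
    towards a neighbour's opinion, wrapping angles into (-pi, pi]."""
    diff = (received_heading - own_heading + math.pi) % (2 * math.pi) - math.pi
    return own_heading + rate * diff
```

For instance, an agent whose messages from "up" all carried identical measurements but whose messages from "down" were mixed will prefer "down"; repeated calls to `diffuse_opinion` across neighbours let the headings converge.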
(c) All agents communicate their preferred direction (not explicitly shown) and ultimately agree on a direction to move. Since three agents prefer to move downwards and one agent prefers upwards, the resulting common direction is downwards.

We consider a swarm of N = 61 agents within a 2-dimensional space. Each agent has a perception range of R. Agents in the swarm are distributed in a circular area of diameter D = 6R. We chose the number of agents N relative to D such that agents on average have five neighbors within perception range, in order to ensure sufficient connectivity within the swarm. Every negotiation period, after the swarm has decided on a direction to move, the swarm moves a step of length s = 0. R along this direction. For simplicity we let the swarm move as a whole without changing the agents' relative positions. Hence we exclude any interaction between agents other than communication and treat agents as point particles. Each agent is able to measure a dimensionless quantity X in the system, which we use as a placeholder for any environmental parameter or quantity. Finally, in the following we quantify diversity using the messages stored by agents. We define the diversity V_k associated with an agent k as the variance of the measurements m_kj contained in all messages stored by this agent,

V_k = \frac{1}{n} \sum_{j=1}^{n} (m_{kj} - \bar{m}_k)^2,    (2)

where n represents the total number of measurements m_kj and \bar{m}_k represents the average of those measurements, \bar{m}_k = \frac{1}{n} \sum_{j=1}^{n} m_{kj}.

In Figure 5 the scenario of a swarm close to a sharp transition of a measured quantity X is shown, the left hand side in yellow for low values, the right hand side in red for high values of X. Everywhere the quantity X is subject to a small time-dependent random noise − . < ξ(t) < . . The center of mass of the swarm, represented by a black “+”, is initially at position (x, y) = (− .
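Eq. (2) is the population variance of the measurements an agent has stored; a direct transcription (the function name is ours):

```python
def diversity(measurements):
    """Eq. (2): V_k = (1/n) * sum_j (m_kj - mean_k)^2, i.e. the
    population variance of the measurements stored by agent k."""
    n = len(measurements)
    mean_k = sum(measurements) / n       # \bar{m}_k
    return sum((m - mean_k) ** 2 for m in measurements) / n
```

An agent holding identical measurements has V_k = 0 (a redundant, uniform domain), while mixed measurements from two domains yield a large V_k.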
, ), in the yellow domain. All agents are illustrated as grey dots at their initial position (with the center of mass of the swarm at (x, y) = (− . , )). In the following we use “the position of the center of mass of the swarm” synonymously with “the position of the swarm”.

In the beginning of the simulation the swarm moves straight to the right, towards the border of the two domains. From there, at (x, y) ≈ (0, ), it moves upwards along the border of the two domains in a less directed manner, effectively performing a one-dimensional random walk. Figure 5 (b) shows the average diversity V_k within the swarm (as viewed by an external and all-knowing observer) against time. Initially the average diversity is close to V_k = 0 and increases until t = 8, where it reaches a plateau around V_k = 4. This corresponds to the point when the swarm reaches the border. Please note that the swarm is not attracted by domains of higher values of X, but instead by the largest average diversity of measurements, and therefore moves towards the transition.

In Figure 6 we show the rate at which a swarm successfully reaches the border between the two domains. We count a simulation as successful if the center of mass of the swarm reaches a distance to the border smaller than |x| < . R within a finite simulation time of t_fin = 100 t_p^max, i.e. the time in which the swarm can take 50 steps. In Figure 6 the success rates (histogram in the top figure) and the corresponding mean time until success (bottom figure) are shown for different initial distances of the swarm from the border. For initial distances smaller than |x_init| < . R the success rates are 1 and the corresponding success time decreases linearly. This shows that, if the swarm initially perceives the other domain (the yellow and red domain as shown in Fig. 5, respectively), it consistently moves there directly. For larger distances the swarm randomly moves around and only by chance perceives the respective other domain.

Figure 5: (a) shows the trajectory of the average position of the swarm (+) within a system with two domains, yellow with X ∈ [− . , . ] and red with X ∈ [4. , . ]. Initially the swarm moves towards the right until it reaches the border between the domains. For the remaining simulation time it moves randomly along this border. In light grey circles the swarm at its initial position (with the center of the swarm at x ≈ − . ) is shown, each circle representing one agent. [N = 61, negotiation period: 2 cycle lengths]. (b) shows the diversity averaged over all agents in the swarm, V_k, against time t from an observer's perspective. Initially the diversity is small (t = 0, V_k = 0. ), as most agents have similar measurements in the yellow domain with quantity X fluctuating around 0. With increasing time, more agents have different measurements as the swarm approaches the border.

Figure 6: The top graph shows the success rates of the swarm moving to the border between the two domains vs. its initial position relative to the border in our simulation experiment depicted in Fig. 5. A simulation is counted as successful if the swarm reaches a distance from the border smaller than |x| < . R. For initial distances from the border smaller than |x_init| < . , the swarm consistently succeeds in finding the border. After 50 negotiation periods (corresponding to 100 cycles) a simulation was stopped and counted as failed. The bottom graph shows the mean time for a swarm until it reaches the border. Each data point is the result of 50 independent simulations.

This consistent behavior allows us to illustrate the expected behavior of a swarm close to the border, as shown in Figure 7. It shows the preferred direction of such a swarm as arrows. Each arrow represents the preferred direction of a swarm with its center at the arrow's location. For distances from the border |d| > . the preferred direction is random, since no agent in the swarm is located in the respective other domain and thus the swarm has no information about its existence.
In this case the swarm is located in an almost uniform area and thus does not develop a preferred direction. For distances from the border |d| < . , the swarm moves towards the border.

Figure 7: The preferred direction of a swarm in a system with two domains, yellow (X ∈ [− . , . ]) and red (X ∈ [4. , . ]). Every arrow indicates the preferred direction of a swarm with its center at the arrow's position. The arrows were each calculated with a single simulation with negotiation periods of 4 cycle lengths. For |x| ≳ . the swarm moves randomly; for |x| ≲ . it can perceive the other domain and moves towards it, from both directions respectively towards the border at x = 0.

In Figure 8 a swarm close to a gradient in X is shown. For x < 0 the system exhibits fluctuating values X ∈ [− . , . ]; for x ≥ 0 the temporal average of X increases linearly. The swarm initially starts at position (x, y) = (− . , ) and moves towards the right in a directed manner. For x ≳ . the swarm moves less directed and effectively performs a random walk. In Figure 8 (b) the diversity averaged over all agents in the swarm, V_k, is shown against time. It increases from V_k ≈ . until at t = 15 (when the swarm starts moving randomly) it reaches a plateau where it fluctuates around V_k = 20.

In Figure 9 we show the rate at which a swarm successfully maximizes its average diversity V_k. We count a simulation as successful if the swarm reaches a position of x ≥ . within a finite simulation time of t_fin = 100 t_p^max, i.e. the time in which the swarm takes 50 steps. In Figure 9 the success rates (top histogram) and the corresponding mean time until success (bottom graph) are shown for different initial distances of the swarm from the onset point of the linearly increasing domain. For initial positions x_init ≥ . the swarm succeeds instantly, as its initial position already fulfills the condition for success. For − ≲ x_init < .
the swarm succeeds in the majority of conducted simulations; the success times increase with decreasing distance from the border between the two domains. For initial positions further from the border, the success rates decrease significantly; at the most distant considered positions they reach their minimum. There the swarm is too far away from the domain of increasing values in X and therefore does not perceive it anymore. Only if it moves closer to that domain by chance does it ultimately succeed, i.e. reach the goal position within simulation time. Only successful simulations were considered when calculating the mean success times.

Instead of diffusing along the border, as is the case for a sharp transition, for this gradient the swarm diffuses in both dimensions once it is on the gradient. As soon as the swarm is entirely on a linear gradient, both directions (towards lower and towards higher values, respectively) provide the same average diversity. This is implicitly shown in Figure 10, where each arrow denotes the preferred direction of a swarm with its center at its position. Far left of the onset the swarm moves randomly, as it does not perceive the linear gradient (starting at the dashed line). Right of the onset the swarm moves towards the right up to x = 2., where it moves randomly.

Figure 8: (a) The trajectory of the center of mass of the swarm (+) in a system with two domains: left of the dashed line the environment exhibits fluctuating values of X, right of it the average values of X increase linearly with increasing x. Initially the swarm moves towards the right, until eventually it moves randomly. The dashed line indicates the border between the area of on average uniform levels of X (left) and the (along the x-axis) linearly increasing domain. For the initial position of the swarm, the agents are shown as light grey circles. [N = 61, negotiation period: 2 cycle lengths] (b) The average of the agents' diversities V_k over time. Initially V_k is close to zero, as the swarm is located in an almost uniform area. With increasing time, as the swarm moves towards the right, the average diversity increases until around t = 15 it saturates and fluctuates around V_k = 8. This corresponds to the random walk of the swarm.

Figure 9: The top graph shows the success rates of the swarm reaching the border (as depicted in Fig. 5) between the two domains versus the initial position of the swarm. For each distance we conducted 50 independent simulations. We counted as successful those simulations in which the swarm reached the goal position, corresponding to the average diversity shown in Figure 8. Beyond the goal position the swarm performs a random walk. After 50 negotiation periods (corresponding to 100 cycles) a simulation was stopped and counted as failed. The bottom graph shows the mean time until a swarm succeeds, taking into account only successful simulations.

We consider an area of radially varying values of X. In Figure 11 the quantity X fluctuates around a constant high value in the yellow domain. Over an intermediate range of radial distances d from the center of the cloud, the quantity X linearly decreases to low values; close to the center the red domain exhibits low, fluctuating values of X. Figure 11 shows the preferred direction of a swarm as arrows, the position of each arrow indicating the center of mass of the swarm. For large radial distances from the center of the cloud the swarm moves randomly, as it does not detect the circular domain. At intermediate distances the swarm moves towards the circular domain, whereas inside it the swarm moves radially outwards, towards its border. As soon as the swarm detects the circular domain of deviating levels, it proceeds to move towards its border, where it measures the largest average diversity.
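The success-rate bookkeeping behind Figure 9 can be sketched as follows. This is a minimal illustration under the criterion quoted above (a run succeeds if the swarm reaches the goal position within the finite simulation time); the function name and the representation of the per-run outcomes are our own assumptions, and the swarm simulation itself is assumed to exist elsewhere.

```python
def success_statistics(first_success_times):
    """first_success_times: one entry per independent simulation run,
    holding the time at which the swarm first reached the goal
    position, or None if it failed to do so within simulation time.
    Returns the success rate over all runs and the mean success time
    computed over successful runs only (as done in the paper)."""
    successes = [t for t in first_success_times if t is not None]
    rate = len(successes) / len(first_success_times)
    mean_time = sum(successes) / len(successes) if successes else float("nan")
    return rate, mean_time
```

For example, three successful runs at times 10, 20 and 30 plus one failed run yield a success rate of 0.75 and a mean success time of 20.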
For experimental validation of the CIMAX algorithm we used aMussel robots, developed in the project subCULTron (Donati et al., 2017). They communicate via modulated light and are used, among other purposes, for examining the anoxic waters phenomenon in the lagoon of Venice (Runca et al., 1996) by diving down to the floor of the lagoon. They are equipped with a variety of sensors and communication devices (Donati et al., 2017).

Figure 10: The preferred direction of a swarm in a system with (towards the right hand side) linearly increasing average levels of X. Left of the dashed line, X fluctuates around a constant average; right of it the average levels of X increase linearly with increasing x. The dashed line indicates the onset of the increase. Every arrow indicates the preferred direction of a swarm with its center at the arrow's position. Each arrow was calculated from a single simulation. Far left of the onset the swarm moves randomly, as it does not perceive the gradient domain. Closer to the onset the swarm is attracted to the gradient domain and moves towards the right, effectively maximizing its average diversity. Well onto the gradient the swarm moves randomly: at this point both directions (along the x-axis) exhibit the same linear gradient, and since the swarm detects the variance in measurements instead of absolute values, both directions are equivalent.

Figure 11: Preferred direction of a swarm within a system with a circular domain of deviating levels of X. Each arrow represents the preferred direction of a swarm with its center at its position. Each arrow was calculated from a single simulation with a negotiation period of 2 cycle lengths. The circular domain extends radially with a radius of r = 5. The levels of X linearly decrease from a maximum of 5 down to 0; noise between 0 and 1 is added at every position in the system. For large distances from the center of the circular area the swarm is too far away to perceive it and thus does not find a coherent preferred direction, i.e. the swarm moves randomly. Closer in, the swarm consistently moves towards the border between the two domains.

For testing the algorithm we used aMussels under lab conditions, outside water, in a one-dimensional setup. Four aMussels were arranged in a linear manner in an arena, as shown in Figure 12(a).
As an emulation of oxygen gradients we used an ambient light gradient, which allowed us to perform the experiments in the lab outside of a water environment. We were hence able to establish precisely controlled environmental situations and predictably changing environments. Two projectors were located above the arena and used for varying the light intensity on the arena floor, as shown in Figure 12(b) and (c), where some parts of the floor are brightly illuminated and others dark. In these experiments we considered two states of illuminance: lights on or off. The setup of the system corresponds to the simulation of a swarm close to a sharp transition, presented in Section 3.1. The sensor for measuring ambient light values is located at the top cap of the aMussels. In this experiment they communicated via modulated green light. We counted an experiment as successful as soon as the robots agreed on the direction towards the border between the two different domains. While the lights for communication are located in the center of their bodies, the LEDs in their top caps (visible in green in Figure 12(c)) indicated their preferred direction. From the perspective of the camera, green represents the preferred direction "left" and blue represents "right".

The algorithm introduced in Section 2.2 was implemented on the aMussels; however, the information gathering and negotiation phases were fused into a single phase. All messages sent by aMussels in this experiment contained both the sensor readings and the preferred directions of the senders. This is illustrated in Figure 12(d), where the red aMussel initiates the sending of a message. For this, it broadcast its own sensor reading as well as its current preferred direction as a message.
When another aMussel received a message, it stored it and afterwards appended its own sensor reading and current preferred direction to the message before relaying it. Based on the sensor readings in the stored messages, the aMussels continuously evaluated from which direction they received messages with the largest variance in measurements, i.e. which direction they individually considered most preferable to move towards, and which direction they hence broadcast as their own current preferred direction.

Figure 12: (a) Four aMussels in an arena used for the experiments. (b) The experimental setup with one of the six considered light configurations. The aMussels have not yet decided on a preferred direction, as their top caps are not illuminated. (c) An experiment counted as successful, with all aMussels agreeing on moving left, towards the illuminated domain, indicated by the green LEDs in their top caps. (d) Schematic illustration of broadcasting and relaying messages. The red colored aMussel initiates a message containing its own local ambient light measurement value (here: g ∈ { , }) as well as its own preferred direction (p ∈ {R, L}). Other aMussels which successively receive the message (shown on the right side in black) add their own sensor readings and preferred directions to the message before relaying it. The messages broadcast by the aMussels are shown to the left of each aMussel. Messages started and ended with the characters "ss" for easier parsing.

Table 1: The parameter values used in the experiments.
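The broadcast-and-relay scheme described above can be sketched as follows. The message fields (sensor reading plus preferred direction) and the "ss" framing follow the description in the text; the function names and the concrete binary encoding g ∈ {0, 1} of the two illuminance states are our own assumptions, and the variance-based direction evaluation is simplified to a single left/right comparison.

```python
import statistics

def make_message(reading, preferred):
    """Initiate a message: own sensor reading g in {0, 1} and preferred
    direction p in {'L', 'R'}, framed by 'ss' for easier parsing."""
    return f"ss{reading}{preferred}ss"

def relay(message, reading, preferred):
    """Append own reading and preferred direction before relaying."""
    body = message[2:-2]  # strip the 'ss' framing
    return f"ss{body}{reading}{preferred}ss"

def parse(message):
    """Recover the list of (reading, direction) pairs from a message."""
    body = message[2:-2]
    return [(int(body[i]), body[i + 1]) for i in range(0, len(body), 2)]

def preferred_direction(readings_left, readings_right):
    """Prefer the side whose relayed readings show the larger variance,
    i.e. the side promising the larger gain in diversity."""
    var_l = statistics.pvariance(readings_left) if readings_left else 0.0
    var_r = statistics.pvariance(readings_right) if readings_right else 0.0
    return "L" if var_l > var_r else "R"
```

For instance, a message initiated as "ss1Lss" and relayed once by a robot reading 0 and preferring "R" becomes "ss1L0Rss", from which every receiver can recover both readings and both opinions.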
Based on the preferred directions in the stored messages, at the end of this phase (comprising both the information gathering and the negotiation phase) the aMussels evaluated which direction was favored by the majority of the swarm, and thus which direction they ultimately decided to move towards. This phase consisted of a period of 10 cycles, meaning every aMussel initiated at least 10 messages.

The parameter values used in the experiment are given in Table 1. At the beginning of every cycle, each robot randomized the time within the cycle at which it would initiate a message. As a result, the effective cycle length of individual robots in this experiment varied between 40 and 70 seconds, as indicated in Table 1. The reason for this randomization is that occasionally robots initiated messages at approximately the same time. In this case the messages were not received by all other robots, since right after broadcasting a message every robot stays insensitive to incoming messages for a brief amount of time. Randomizing the initiation of messages made it less likely for this event to occur repeatedly.
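The end-of-phase decision and the randomized message initiation can be sketched as follows. The majority vote over collected preferences and the 40–70 second effective cycle length are taken from the text; the split of that range into a 40 s base cycle plus up to 30 s of jitter, as well as all names, are our own assumptions.

```python
import random
from collections import Counter

def collective_decision(preferred_directions):
    """Majority vote over the preferred directions ('L'/'R') collected
    from relayed messages during the 10-cycle negotiation phase."""
    counts = Counter(preferred_directions)
    return counts.most_common(1)[0][0]

def next_initiation_time(cycle_start, base_cycle=40.0, jitter=30.0):
    """Randomize when a robot initiates its message within a cycle, so
    that two robots rarely broadcast at (almost) the same moment twice
    in a row; the effective cycle length then varies between
    base_cycle and base_cycle + jitter seconds."""
    return cycle_start + base_cycle + random.uniform(0.0, jitter)
```

With four robots, as in the experiment, a tie (two "L", two "R") corresponds to the indecision configuration of Figure 13(c); larger swarms make such ties increasingly unlikely.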
We conducted experiments for six different light configurations, as schematically shown in Figure 13. The arrows to the right of each configuration indicate the results of each set of experiments. The direction of an arrow indicates the collective decision of the swarm on which direction to move, and its color denotes the color used by the aMussels to indicate the respective direction they decided on (e.g. see Figure 12(c)). For Figure 13(c) we counted an experiment as successful if the aMussels ultimately decided to move towards the border between the two domains of luminosity, i.e. the two aMussels on the left chose to move towards the right and vice versa.

For each light configuration the experiment was independently repeated five times, with a success rate of 100%. In order to test how well the aMussels adapt to changing light configurations, we conducted another set of experiments with alternating light configurations, in which the robots needed to change their previously reached consensus. After reaching consensus, the light configuration was changed such that the expected direction for the robots to decide on was inverted. An experiment was considered successful if the robots correctly found consensus for the initial light configuration and then switched their opinion accordingly. We conducted this experiment five independent times, with all experiments successful.

Figure 13: The six different light configurations tested in the experiments. (a) and (b) show the configurations for which we expect the aMussels to decide to move to the left and to the right, respectively. The color of an arrow denotes the corresponding color the aMussels used to indicate their preferred direction via the LEDs in their top caps. (c) The configurations for which we expect the aMussels not to agree on a common direction but to choose directions towards the border between the domains of different luminosity: the two aMussels left of the border decide to move to the right and vice versa.
In this paper, we demonstrated how a simple bio-inspired communication behavior can be used to reach a swarm-level decision on which direction to move in order to maximize swarm-level information access. We also demonstrated how this algorithm works in robots of an underwater swarm with limited communication range and local information. We presented simulation results in Section 3 to give an intuitive understanding of the algorithm's functionality. For both a spatially discrete and a gradual change in the measured quantity X, the swarm could successfully maximize its diversity in the measurement. For systems with a discrete change in X (Section 3.1) the swarm, within proximity of nearby variations, succeeded in nearly all simulations, whereas for systems with a gradual change in X (Section 3.2) the success rates varied more strongly. Also, the mean success times in the latter system (Figure 9) are significantly larger compared to the former (Figure 6). This shows that artificial swarms using this algorithm perform better the steeper the gradient in the measured quantity is.

It is also worth pointing out that the configuration of a swarm, i.e. the spatial distribution of agents, has a significant influence on the preferred direction. Consider a system of entirely random values in X. If agents are distributed e.g. in a line, the preferred direction can only be along the linear distribution of the swarm, as messages are only shared along this line. A swarm shaped as a perfect cross will theoretically move like a rook on a chess board, solely up-down-left-right. This needs to be taken into account in case a swarm tends to group or shape up in symmetric ways, in order to avoid systematic errors.

In Section 4 we presented experimental results of a simplified laboratory demonstration of the algorithm, implemented on robots using N = 4 aMussels in a one-dimensional setup.
The resulting behavior is in full qualitative agreement with the results of the corresponding numerical simulations (Figure 7) in all of the experiments. The chance of reaching an indecision point, where two aMussels decide to move left and two decide to move right (Figure 13(c)), decreases with an increasing number of swarm members; for a sufficiently large swarm of robots distributed in two spatial dimensions, the chance of reaching an indecision point would be negligibly small. Despite the simplification of restricting the experiments to one dimension, they serve as a proof of concept and of the general functionality of the implementation of the algorithm in robots.

For both the simulations and the experiments shown in this work we ensured an interconnected swarm, with every agent being connected to at least one neighbor at all times. This assumption allowed the demonstration of the collective decision aspect of the algorithm; however, it is not a feasible assumption in a real world scenario. Although previous work showed that the underlying communication mechanism exhibits significant resilience to signal loss (Varughese et al., 2017), in practice a number of steps need to be taken in order to ensure connectivity of robots. In a real world scenario, one has to account for possible occlusions, alignment problems and other prospective challenges while using modulated light communication.

For evaluating incoming messages between robots we used the variance of the received measurements. Although variance is a simple measure, it is an effective measure of information entropy for a swarm measuring a single parameter. In contrast to our approach, Cui et al. (2004) use a fuzzy logic based evaluation.
Such complex measures could be used in place of variance in the CIMAX algorithm when dealing with complex parameter spaces, while following the same information gathering, evaluation and collective decision phases.

Lastly, in this work we only considered a single measured quantity; however, the algorithm constitutes a general approach for collective decision making in this particular class of swarms or networks. Therefore several quantities can be considered, resulting in a swarm maximizing the data points within a phase space spanned by the number of considered quantities. This lets such a swarm autonomously explore an environment of high complexity, taking into account previously collected data and adjusting to environmental changes and variations.

As the algorithm has proven conceptually functional with respect to collective decision making, it will in the future be implemented and tested on larger swarms and in the field within the framework of subCULTron.

ACKNOWLEDGMENT
This work was supported by the EU-H2020 project subCULTron, funded by the European Union's Horizon 2020 research and innovation programme under grant agreement No 640967. Furthermore, this work was supported by the COLIBRI initiative at the University of Graz.
References
Ahl, V. and Allen, T. F. (1996). Hierarchy Theory: A Vision, Vocabulary, and Epistemology. Columbia University Press.

Akyildiz, I. F., Pompili, D., and Melodia, T. (2005). Underwater acoustic sensor networks: research challenges. Ad Hoc Networks, 3(3):257–279.

Beckers, R., Deneubourg, J.-L., Goss, S., and Pasteels, J. M. (1990). Collective decision making through food recruitment. Insectes Sociaux, 37(3):258–267.

Beni, G. and Wang, J. (1989). Swarm intelligence in cellular robotic systems. In Proceedings of the NATO Advanced Workshop on Robots and Biological Systems, volume 3, pages 268–308.

Bonner, J. T. (1949). The social amoebae. Scientific American, 180(6):44–47.

Brock, V. E. and Riffenburgh, R. H. (1960). Fish schooling: A possible factor in reducing predation. ICES Journal of Marine Science, 25(3):307–317.

Buck, J. (1988). Synchronous rhythmic flashing of fireflies. II. The Quarterly Review of Biology, 63(3):265–289.

Cavagna, A., Cimarelli, A., Giardina, I., Parisi, G., Santagati, R., Stefanini, F., and Viale, M. (2010). Scale-free correlations in starling flocks. Proceedings of the National Academy of Sciences, 107(26):11865–11870.

Codling, E., Pitchford, J., and Simpson, S. (2007). Group navigation and the many-wrongs principle in models of animal movement. Ecology, 88(7):1864–1870.

Cui, X., Hardin, C. T., Ragade, R. K., and Elmaghraby, A. S. (2004). A swarm-based fuzzy logic control mobile sensor network for hazardous contaminants localization. Pages 194–203.

Donati, E., van Vuuren, G. J., Tanaka, K., Romano, D., Schmickl, T., and Stefanini, C. (2017). aMussels: Diving and anchoring in a new bio-inspired under-actuated robot class for long-term environmental exploration and monitoring. In Conference Towards Autonomous Robotic Systems, pages 300–314. Springer.

Dorigo, M., Trianni, V., Şahin, E., Groß, R., Labella, T. H., Baldassarre, G., Nolfi, S., Deneubourg, J.-L., Mondada, F., Floreano, D., and Gambardella, L. M. (2004). Evolving self-organizing behaviors for a swarm-bot. Autonomous Robots, 17(2):223–245.

Durston, A. (1973). Dictyostelium discoideum aggregation fields as excitable media. Journal of Theoretical Biology, 42(3):483–504.

Eberhart, R. C., Shi, Y., and Kennedy, J. (2001). Swarm Intelligence. Elsevier.

Garnier, S., Gautrais, J., and Theraulaz, G. (2007). The biological principles of swarm intelligence. Swarm Intelligence, 1(1):3–31.

Kernbach, S., Häbe, D., Kernbach, O., Thenius, R., Radspieler, G., Kimura, T., and Schmickl, T. (2013). Adaptive collective decision-making in limited robot swarms without communication. The International Journal of Robotics Research, 32(1):35–55.

Kernbach, S., Thenius, R., Kernbach, O., and Schmickl, T. (2009). Re-embodiment of honeybee aggregation behavior in an artificial micro-robotic system. Adaptive Behavior, 17(3):237–259.

Kim, L. H. and Follmer, S. (2017). UbiSwarm: Ubiquitous robotic interfaces and investigation of abstract motion as a display. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 1(3):66.

Lanbo, L., Shengli, Z., and Jun-Hong, C. (2008). Prospects and problems of wireless communication for underwater sensor networks. Wireless Communications and Mobile Computing, 8(8):977–994.

Li, X., Santoro, N., and Stojmenovic, I. (2007). Mesh-based sensor relocation for coverage maintenance in mobile sensor networks. In International Conference on Ubiquitous Intelligence and Computing, pages 696–708. Springer.

Magurran, A. E. and Pitcher, T. J. (1987). Provenance, shoal size and the sociobiology of predator-evasion behaviour in minnow shoals. Proceedings of the Royal Society of London B, 229(1257):439–465.

Rabb, G. B., Woolpy, J. H., and Ginsburg, B. E. (1967). Social relationships in a group of captive wolves. American Zoologist, 7(2):305–311.

Renfrew, D. and Yu, X.-H. (2009). Traffic signal control with swarm intelligence. In Proceedings of the Fifth International Conference on Natural Computation, pages 79–83. IEEE.

Runca, E., Bernstein, A., Postma, L., and Silvio, G. D. (1996). Control of macroalgae blooms in the lagoon of Venice. Ocean & Coastal Management, 30(2):235–257.

Seeley, T. D. (1992). The tremble dance of the honey bee: Message and meanings. Behavioral Ecology and Sociobiology, 31(6):375–383.

Simons, A. M. (2004). Many wrongs: The advantage of group navigation. Trends in Ecology & Evolution, 19(9):453–455.

Stojanovic, M. and Preisig, J. (2009). Underwater acoustic communication channels: Propagation models and statistical characterization. IEEE Communications Magazine.

PLoS ONE, 8(10):e76250.

Thenius, R., Moser, D., Varughese, J. C., Kernbach, S., Kuksin, I., Kernbach, O., Kuksina, E., Mišković, N., Bogdan, S., Petrović, T., et al. (2016). subCULTron – cultural development as a tool in underwater robotics. In Artificial Life and Intelligent Agents Symposium, pages 27–41. Springer.

Thenius, R., Varughese, J. C., Moser, D., and Schmickl, T. (2018). WOSPP – a wave oriented swarm programming paradigm. IFAC-PapersOnLine, 51(2):379–384.

Trianni, V. and Campo, A. (2015). Fundamental Collective Behaviors in Swarm Robotics, pages 1377–1394. Springer Berlin Heidelberg, Berlin, Heidelberg.

Varughese, J. C., Thenius, R., Schmickl, T., and Wotawa, F. (2017). Quantification and analysis of the resilience of two swarm intelligent algorithms. In Benzmüller, C., Lisetti, C., and Theobald, M., editors, GCAI 2017. 3rd Global Conference on Artificial Intelligence, volume 50 of EPiC Series in Computing, pages 148–161.

Varughese, J. C., Thenius, R., Wotawa, F., and Schmickl, T. (2016). FSTaxis algorithm: Bio-inspired emergent gradient taxis. In Proceedings of the 15th International Conference on the Synthesis and Simulation of Living Systems, pages 330–337. MIT Press.

Wang, G., Cao, G., La Porta, T., and Zhang, W. (2005). Sensor relocation in mobile sensor networks. In Proceedings of the 24th Annual Joint Conference of the IEEE Computer and Communications Societies, volume 4, pages 2302–2312. IEEE.

Zahadat, P. and Schmickl, T. (2016). Division of labor in a swarm of autonomous underwater robots by improved partitioning social inhibition.