A Hierarchical Multi-Robot Mapping Architecture Subject to Communication Constraints
Copyright by Henry Fielding Cappel 2020

Hierarchical Multi-Robot Mapping Architecture Subject to Communication Constraints

by

Henry Fielding Cappel

Thesis
Presented to the Faculty of the Graduate School of The University of Texas at Austin in Partial Fulfillment of the Requirements for the Degree of
MASTER OF SCIENCE IN ENGINEERING
The University of Texas at Austin
December 2020
Hierarchical Multi-Robot Mapping Architecture Subject to Communication Constraints
Approved by Supervising Committee:
Luis Sentis, Supervisor
Takashi Tanaka

To my parents Jeff and Catherine

Acknowledgments
I want to take the opportunity to thank everyone who helped me along the way towards completing my thesis and my master's degree. Thank you to the University of Texas and specifically to Aerospace Engineering for providing me with an opportunity to explore my passion for robotics.

A big thank you to my advisor Dr. Sentis for providing me with the opportunity to research deeply into the field of human centered robotics, supporting me in all my interests, and giving me advice at just the right moments. Thank you to Dr. Tanaka for teaching me about Network Control Systems and supporting the development of my thesis.

Thank you to the Human Centered Robotics Laboratory and all members, coworkers, and most importantly friends that I have made along the way. A special thank you to Steven Jens Jorgensen for giving me the chance to do research at NASA Johnson Space Center and for having immense patience with me as I pestered him with questions. Thank you to Mihir Vedantam and Ryan Gupta for sharing my passion for robotics and providing great conversation. Thank you to Huang Huang, Binghan He, and Gray Thomas for collaborating with me to develop an improved exoskeleton, and thank you Huang Huang for making six hours of experiments an enjoyable experience. Thank you to all the other members of the HCRL. I will thoroughly miss our thoughtful yet silly conversations and distractions.

Thank you to all other friends and colleagues I have met along the way within Aerospace Engineering and the University of Texas as a whole.

Finally, thank you to my parents Jeff and Catherine for supporting me, believing in me, and always reminding me that I am far more capable than I lead myself to believe.
Henry Fielding Cappel
The University of Texas at Austin
December 2020

Abstract

Hierarchical Multi-Robot Mapping Architecture Subject to Communication Constraints
Henry Fielding Cappel, MSE
The University of Texas at Austin, 2020
Supervisor: Luis Sentis

Multi-robot systems are an efficient method to explore and map an unknown environment. The simultaneous localization and mapping (SLAM) algorithm is common for single robot systems; however, multiple robots can share their respective map data in order to merge it into a larger global map. This thesis contributes to the multi-robot mapping problem by considering cases in which robots have communication range limitations. The architecture coordinates a team of robots and a central server to explore an unknown environment by exploiting a hierarchical choice structure. The coordination algorithms ensure that the hierarchy of robots choose frontier points that provide maximum information gain, while maintaining viable communication amongst themselves and the central computer through an ad-hoc relay network. In addition, the robots employ a backup choice algorithm in cases when no valid frontier points remain by arranging the communication relay network as a fireline back to the source.

This work contributes a scalable, efficient, and robust architecture towards hybrid multi-robot mapping systems that take into account communication range limitations. The architecture is tested in a simulation environment using various maps.

Contents
Acknowledgments
Abstract
Chapter 1 Introduction
Chapter 2 Related Work
Chapter 3 Methods
Chapter 4 Results and Figures
Chapter 5 Conclusion and Future Work
Bibliography
Vita

Chapter 1 Introduction
Intelligent robotic systems are not a new concept but have significantly advanced in recent years. The progression towards efficient network systems goes back to early computing days when operating systems relied on single process computing, where one processor could only run a single process at a time. Later came the process of time sharing, which allowed multiple clients to access the capabilities of the central processor through a time slice such that the system could process multiple users' requests. Engineers improved further by developing client-server architectures and object oriented programming as ways to increase efficiency and distribute computing over an extended network.

In the context of robotics, a single robot is an inherently distributed system in the sense that each component of a robot plays a specific role as a piece of a larger system. Components can be classified as actuators, processors, sensors, mechanical components, or a mixture of the four. In order to connect these components the robot has a communication network that links each of the components into a structure where the distribution of tasks is allocated such that the robot fulfills its overall objective. For example, a quadcopter may have an inertial measurement unit (IMU) and lidar sensor to take in data about its surrounding environment. The information is routed to a central processor that sends out commands to the actuators (in this case motors to spin the propellers).

As an extension of a single robot system, multi-agent systems, or more specifically multi-robot systems, place the global objective of one robot as a subtask of a greater global system. Examples of these include drone swarms or networks of autonomous vehicles. An application of multi-robot systems that has gained traction in recent years is multi-robot mapping of unknown environments. In many instances a user may wish to deploy a team of robots to fulfill a task or set of tasks in an environment unknown to the robots or user.
Developing an accurate map of the environment stands as the foundation for successful completion of other high level tasks, and multi-robot systems can provide accurate maps efficiently.

It is no secret that efficiency lies in the proper distribution of tasks, as can be seen in a well run workforce or organization, but how exactly to distribute operations from low level computing to high level robotic networks remains an open question.
The research question of this thesis exists at the intersection of three distinct areas of robotics: multi-robot systems, simultaneous localization and mapping (SLAM), and robot exploration. Each of these subfields is well studied and has seen significant improvements in recent years. The question is: how can one develop an efficient, robust, and scalable architecture for multi-robot SLAM subject to communication constraints?
I will first discuss the objectives of this project in relation to answering the main research question and then explain the hurdles around achieving these goals. Before diving into the objectives of this project it should be noted that the system used for this research is a hybrid multi-agent system. A hybrid system contains robot agents along with a central processing computer, as outlined in section 2.1.4. This type of system has advantages towards achieving the main objective of an efficient, scalable, and robust architecture for multi-robot mapping.

The efficiency of the system will be determined by a minimal information exchange amongst agents. Information exchange, especially across WiFi networks, runs the risk of delays and drops. Systems that rely heavily on information exchange lose their functionality if communication fails at any point. Therefore, relying on minimal information exchange amongst agents and the central processor mitigates potential issues arising with communication channels. Efficiency is also determined by intelligent distribution of tasks by both the agents and the central server. Tasks include information processing as well as exploration of the environment. The objective is to develop a system where each agent takes on as much of the information processing as possible such that no one agent is overloaded with processing requirements and agents optimally distribute exploration tasks. Finally, the efficiency will be measured according to minimal completion time for developing a map of the environment.

Scalability will be considered successful if the system can be extended to include many more agents without losing efficiency and robustness. An environment may be expansive and dense such that many agents are required to complete the mapping in a reasonable time.
The objective of this research is to develop a system that can be used with small numbers of agents (2 or 3) or large numbers of agents (20 to 30).

Robustness will be measured according to how well the architecture can adapt to unforeseen challenges. These challenges may include robot mechanical or electrical failure, communication delays, communication drops, errors in the SLAM algorithm, unforeseen objects in the environment, and the unknown unknowns.

The final objective is to be able to map a range of environments. These environments may be indoor spaces full of objects and obstacles such as a warehouse, office building, or home, or more expansive outdoor spaces including college campuses, cities, or residential areas. The versatility of a multi-robot mapping system expands a user's capability to employ the system in a unique environment and prevents pigeonholing into specific structures.

The main challenge with a multi-robot system is communication range. Many robot systems rely upon WiFi or Bluetooth in order to transmit and receive data amongst agents. These platforms have strong data rates (up to 54 Mbps for WiFi and around 2 Mbps for Bluetooth), but are limited in range (100-150 feet for WiFi and around 33 feet for Bluetooth) [1]. This means that a system that relies upon either of these two platforms for inter-robot communication cannot permit the robots to travel beyond the available range. This is generally not problematic for indoor environments, but outdoor environments extending well beyond the WiFi range of any one robot may pose important issues.

The other challenge associated with communication is channel capacity for large sources of information.
Although the channel capacities for both WiFi and Bluetooth are generally more than sufficient for certain data exchanges, there may be instances when robots want to send large amounts of data through the network, such as video sharing, where WiFi or Bluetooth modules are limited in capacity. In the scope of this project this does not pose an issue, as map data is the largest amount of data transmitted amongst agents, but in the case of extending the architecture to a more information heavy context the data rates may become an issue.

The final challenge associated with a large multi-robot system is collisions amongst agents. Many agents in a small, dense environment almost certainly restrict path planning and navigation. The coordination of agents to avoid collisions with each other and the environment while still completing the objective is an important consideration.
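The communication range limitation motivates the ad-hoc relay network described in the abstract: robots may only leave the server's direct range if a chain of peers relays their data back. As a hedged illustration (not the thesis's actual coordination algorithm; the function names and the planar disk-range model are my own), a breadth-first search can check which robots still reach the central server through multi-hop relays:

```python
from collections import deque

def connected_to_base(positions, base, comm_range):
    """Return the set of robot indices that can reach the base
    station through a chain of hops, each hop within comm_range.

    positions : list of (x, y) robot positions
    base      : (x, y) position of the central server
    comm_range: maximum single-hop communication distance
    """
    in_range = lambda p, q: (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 <= comm_range ** 2
    # Seed the search with every robot directly in range of the base.
    frontier = deque(i for i, p in enumerate(positions) if in_range(p, base))
    reachable = set(frontier)
    while frontier:
        i = frontier.popleft()
        for j, q in enumerate(positions):
            if j not in reachable and in_range(positions[i], q):
                reachable.add(j)
                frontier.append(j)
    return reachable
```

For example, with a 15-unit range, a robot 40 units from the base is disconnected unless intermediate robots form a relay chain; a frontier-selection scheme like the one developed later must reject goals that would break this connectivity.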
Multi-robot systems may prove to be the future of space exploration as they are resource efficient, relatively lightweight, more robust than single robot systems, and can perform dangerous yet necessary tasks that humans otherwise cannot perform. Space agencies are currently investigating robot systems for lunar and planetary tasks. The Japanese Aerospace Exploration Agency (JAXA) plans to use robots to construct lunar structures and set up a space based solar power system. Currently, the Mars Exploration Rovers (MER) launched by NASA in 2003 inhabit the Mars surface to investigate past water activity [2]; however, they operate independently of each other. Manned missions to Mars will likely be a reality in the next decades, and multi-robot systems will need to set up structures allowing humans to survive. As a foundation of building habitable structures, there is a need for a robust autonomous mapping architecture that allows the robot network to localize objects and other agents in the environment.

Figure 1.1: The NASA Mars Exploration Rovers (MER) Spirit and Opportunity may set the stage for a robot age of planetary exploration as well as space habitation construction. Image courtesy of [3]
Climate change and biodiversity losses call for a technological revolution that can help clean up bodies of water, mimic species playing a crucial role in the ecosystem, and monitor endangered species. Robots are advantageous in their ability to explore areas inaccessible to humans and, if used correctly, provide non-invasive monitoring. [4] marks the distinction between two types of environmental robots: robots in ecology and ecological robots. The former classifies robots that collect ecological data used for research purposes. This may include a drone swarm deployed after a volcanic eruption to monitor air quality. The latter classifies robots that have a direct ecological role. These may include robots that mimic a certain species to fill a scarcity or target an invasive species to preserve an ecosystem. Multi-robot systems provide a more efficient means to fulfill these tasks, especially if environments are expansive, by intelligently distributing tasks to more agents. Similar to space applications, many environmental tasks may be in GPS denied environments, forcing robot systems to develop accurate maps of the space.
Robot integration into household chores has already taken hold. The "Roomba" by iRobot traverses the floor of a house sweeping as it moves, but the "Roomba" acts as an independent agent. Many common household chores can be outsourced to multi-robot systems, facilitating efficient completion of tasks. Multi-robot systems may be able to assist with cleaning, surveillance, grocery delivery, or garbage collection [5]. Additionally, an area that deserves more attention is multi-robot systems for elderly care. Robots, and specifically robot networks, can provide an affordable option for in-home care for elderly residents [6]. Although robots do not provide the human to human interaction that is essential for caretaking, many of the mundane and tedious tasks of in-home care can be given to the robots such that family members and caretakers are freed up for more personal interactions.
Certainly, the ethical considerations in a robotics age are not trivial. Multi-robot systems provide an efficient and cost effective means to complete mundane and oftentimes repetitive tasks. However, a heavy reliance on these systems means that many human jobs can become obsolete, leading to greater unemployment. It is important to weigh the advantages of increased productivity against the cost of taking jobs from people. Additionally, many of the resources used to establish robots in everyday life come at a cost to society and the environment [4]. This includes the waste produced when a robot reaches the end of its lifespan as well as the waste produced from energy consumption that is often not from renewable resources. The final ethical consideration is human interpersonal connection. Human dependence on robot systems has the potential to deter human to human interaction if robots take the place of daily tasks. It is important to integrate robots into society such that human to human interaction is not heavily sacrificed.

1.6 Thesis Roadmap
This thesis will consist of the following main topics. Chapter 2 will provide the reader with detailed background information regarding multi-agent systems, SLAM algorithms, robot path planning and navigation, an explanation of hierarchical structures, and an extensive literature review of the aforementioned topics within the context of this thesis. The reader will find a detailed mathematical formulation of extended Kalman filter (EKF) based SLAM as well as particle filter based SLAM extended to the use of occupancy grid maps. Additionally, as motivating examples, chapter 2 will also include an application of path planning implemented on the NASA Valkyrie humanoid robot and an application of a hierarchical information structure applied to whole body control, both of which rely on algorithms similar to those presented in this project.

Chapter 3 is a detailed explanation of the methods used in this project, including an overview of the project architecture, the algorithms developed for multi-robot coordination, and specifications of the multi-robot system architecture including system type, the hierarchical structure, information exchanges amongst agents, and communication limits. The chapter will also include algorithms related to frontier point extraction and robot choosing operations.

Chapter 4 provides the reader with simulation results based on two different maps, and a discussion of the potential benefits and contributions of this work.

In conclusion, chapter 5 will discuss how successfully the project has addressed the objectives outlined in section 1.3 and where there is need for further research. Finally, chapter 5 will provide possibilities for future work.

Chapter 2 Related Work
I will provide a brief introduction to multi-agent systems, with definitions of relevant concepts within the context of this thesis. Additionally, I will provide a taxonomy of multi-robot systems explaining the variations among different structures.
The question, "what defines an agent?" is the foundation for establishing the classifications of a multi-agent system. [7] defines agents as "autonomous, computational entities that can be viewed as perceiving their environment through sensors and acting upon their environment through effectors." Varying definitions exist for the concept of "agent", but many overlap on core attributes such as autonomy, the capacity to process and exchange information, following causal structures, acting towards objectives, and containing boundaries and states [8]. It is easy to assume an agent represents a physical robotic structure, but the term is far more general. For example, an agent can be as physically defined as a drone within a drone swarm or as abstract as a function of code.
Multi-agent systems expand the definition of a single agent to a system containing multiple agents. [9] defines a multi-agent system as "a loosely coupled network of problem solving entities (agents) that work together to find answers to problems that are beyond the individual capabilities or knowledge of each entity (agent)." Once again, the idea of an agent within this context remains general. The term "multi-agent systems" represents an umbrella for many architectures, including the multi-robot systems described below.
Multi-robot systems are a subclass of multi-agent systems. They follow the same definitions as multi-agent systems with the added stipulation that each agent is a robot. The robots work together or in opposition in order to achieve some goal. The distinction of a multi-robot system is that interaction with the environment and other agents occurs through a physical medium, from one physical structure to another.
Multi-robot systems fall into a variety of distinct categories. The first to consider is organization. Three main organization structures classify multi-robot systems: centralized architectures, decentralized architectures, and hybrid architectures. A centralized architecture gives control authority to one agent to dictate the actions of other agents. In many instances the central agent takes on the majority of the computational burden and passes task commands to the other agents in the system. On the other hand, the second category, a decentralized architecture, distributes tasks amongst all the agents in the system. Decentralized systems are advantageous in computational efficiency, as all agents are utilized to perform calculations, but limited in optimality due to incomplete information sharing. The third category, hybrid architectures, is a combination of the two. In many hybrid architectures a central agent will take on more computational burden and send commands to other agents, but also allocate tasks and decision making to other agents.

The second taxonomic category is robot composition. Multi-robot systems can be classified as either homogeneous (all robots are the same) or heterogeneous (robot models differ). Both categories come with costs and benefits. For example, homogeneous systems lack versatility. If a task or set of tasks requires many different capabilities, homogeneous robots fall short. On the other hand, homogeneous systems are easier to implement due to redundant hardware and system specifications. On the contrary, heterogeneous systems are more versatile, but require more tuning and considerations during implementation.

The third category is strategy type. Multi-robot systems can either be cooperative (robots working together to achieve a common goal) or competitive (robots working against one another for personal goals). An example of a cooperative system is two robots playing hide and seek trying to find a person in a house.
The robots distribute tasks such that they both achieve the same global goal. On the other hand, a competitive strategy may be two robots playing hide and seek against each other. One robot acts towards the goal of localizing the other robot, whereas the other robot works towards the goal of staying out of view of the first robot. Cooperative strategies can be broken down further into static and dynamic teams. A static team implies that information received by any agent is independent of its own or other robots' actions. A dynamic team implies the opposite [10].

The fourth category is information structure, which is an important classification in terms of problem difficulty. There are two main types of information structures for a multi-robot system. The first is called a classical information structure. This structure is generally easier to solve and can be reduced to standard optimal control/optimization problems. An example is a system with complete information sharing [10],

$$I_t^i = \{ y_{[1,t]}, u_{[1,t-1]} \}$$

where every agent $i$ receives full histories of measurements and control inputs from all other agents in the system. Non-classical information structures are generally more difficult. Examples of these include completely decentralized information structures [10],

$$I_t^i = \{ y_{[1,t]}^i \}$$

where each agent only receives information of its own measurement history, or one-step delayed control sharing [10],

$$I_t^i = \{ y_{[1,t]}^i, u_{[1,t-1]} \}$$

where each agent only has access to its own measurement history and the control input histories of all other agents. Quasi-classical or partially nested information structures lie between classical and non-classical information structures and many times produce tractable problems to solve. An example is one-step delayed information sharing [10],

$$I_t^i = \{ y_t^i, y_{[1,t-1]}, u_{[1,t-1]} \}$$

The fifth classification is communication.
Communication can be divided into direct (transfer of information from one robot to another through on-board hardware) or indirect (receiving information from modifications in the environment) [11]. Direct methods of communication are more robust to information errors, but require a communication channel to transmit the data. On the other hand, indirect methods require estimation techniques to gain information of other agents' states and are more prone to errors, but do not depend on communication channels.

The sixth and final classification is system architecture. This dictates the strategy of a system to adapt to unforeseen changes. Deliberative architectures reorganize the system as a whole in response to unforeseen changes. This may include a central agent recalculating routes for all the agents in response to a changing environment. The other is a reactive architecture. Instead of a coordinated reorganization, each agent reactively changes its behavior to adapt to the change while still attempting to achieve its personal goal [11].

Simultaneous localization and mapping, more commonly referred to by its acronym "SLAM", is a hot topic for research within multi-robot systems. The algorithm builds a map of an unknown environment while also localizing the agent within the map. This parallel process is thought of as a chicken and egg type problem, because the parameters needed to build a map rely upon the robot's state (position and orientation), while the robot's state depends on the map generated. In some instances a user provides a pre-generated map to the robot such that the SLAM structure downgrades to a localization problem. However, oftentimes a map is unknown a priori and the robot or team of robots needs to build a map in real time.

The SLAM algorithm works as an iterative likelihood estimate of both landmarks in the environment and the robot's state in relation to these estimated landmarks.
Landmarks represent distinguishing features of the environment such as trees in a park, walls in a building, or other static objects in an area of space. These landmarks are determined based on sensor data of the robot itself. For instance, a robot may use a lidar rangefinder to collect range and bearing data of surrounding objects. The robot will estimate the landmarks' positions with respect to its own state, predict the state in the next step, and condition its prediction on new sensor data received at the next timestep.

Table 2.1: Taxonomy of multi-robot systems

| Classification | Attribute | Definition |
|---|---|---|
| Organization | Centralized | One central agent taking the computational load and giving commands to other agents |
| | Decentralized | Tasks distributed amongst all agents; each agent has shared responsibility |
| | Hybrid | Combination of centralized and decentralized |
| Robot Composition | Homogeneous | One type of robot in the system |
| | Heterogeneous | More than one type of robot in the system |
| Strategy | Cooperative | Agents work together to achieve a global goal |
| | Competitive | Agents work against one another to achieve personal goals |
| Information Structure | Classical | Easy to solve; more information sharing |
| | Non-Classical | Difficult to solve; less information sharing |
| | Partially Nested (Quasi-Classical) | Tractable problems; sufficient information sharing |
| Communication | Direct | Communication amongst agents via on-board hardware |
| | Indirect | Information gain via environmental change |
| System Architecture | Deliberative | Complete system adaptation to unforeseen changes |
| | Reactive | Local robot adaptation to unforeseen changes |

At its root the SLAM problem involves maximizing the likelihood of the probability distribution

$$p(m, x \mid u, z)$$

where $m$ is the state of all landmarks in the map, $x$ is the state of the robot, $u$ is the control input of the robot, and $z$ is the measurement of the robot through sensor data.

SLAM varies by technique and type of map generated. The two most common SLAM based algorithms, which will be outlined in great detail below, are based on the Extended Kalman Filter (EKF) and the particle filter. The SLAM algorithm can produce maps of varying type including volumetric (3D) maps, feature based maps (features or landmarks in an environment), topological maps, geometric maps (outline of a space), or occupancy grid maps (grid maps indicating the likelihood of an obstacle at each grid point).

Figure 2.1: An illustration of three types of maps generated by SLAM. a) represents a volumetric (3D) map, b) represents a landmark or feature based map, and c) represents an occupancy grid map. Images courtesy of [12], [13]
I will provide a detailed explanation of EKF based SLAM for estimating environmental landmarks and the robot state. This derivation is based on the video lecture series [14].

The goal of EKF based SLAM is to estimate the state vector

$$\mu = \begin{bmatrix} r_x & r_y & r_\theta & m_{1,x} & m_{1,y} & \dots & m_{n,x} & m_{n,y} \end{bmatrix}^T$$

where $r_x$, $r_y$, and $r_\theta$ are the robot state variables (position and orientation) and $m_{i,x}$ and $m_{i,y}$ are the $n$ landmark positions. Additionally, the covariance matrix can be represented as

$$\Sigma = \begin{bmatrix} \Sigma_{r,r} & \Sigma_{r,m_1} & \dots & \Sigma_{r,m_n} \\ \Sigma_{m_1,r} & \Sigma_{m_1,m_1} & \dots & \vdots \\ \vdots & \vdots & \ddots & \vdots \\ \Sigma_{m_n,r} & \dots & \dots & \Sigma_{m_n,m_n} \end{bmatrix}$$

where subscript $r$ represents a condensed robot state and $m_i$ represents a condensed landmark state. The state estimate relies on the workings of the Extended Kalman Filter. The EKF contains two stages: the prediction step and the update step. The prediction step uses the previous best state estimate and current control input to propagate this estimate based on the motion model of the system,

$$\bar{\mu}_t = f(u_t, \mu_{t-1})$$

where $\bar{\mu}_t$ is the prediction estimate, $f$ represents the dynamics of the system, $u_t$ is the current timestep control input, and $\mu_{t-1}$ is the previous timestep state estimate. Similarly, it predicts the covariance matrix,

$$\bar{\Sigma}_t = F_t \Sigma_{t-1} F_t^T + R_t$$

where $\bar{\Sigma}_t$ is the predicted covariance matrix, $F_t$ is a linearized matrix of the motion model at the current timestep, and $R_t$ is the process noise of the system.

After determining the predicted state vector and covariance matrix, the EKF conditions these predictions on the data received at the timestep. First, the Kalman gain is computed as

$$K_t = \bar{\Sigma}_t H_t^T (H_t \bar{\Sigma}_t H_t^T + Q_t)^{-1}$$

where $K_t$ is the Kalman gain, $H_t$ is the linearized matrix of the measurement model, and $Q_t$ is the measurement noise. Next, the state and covariance matrices are updated according to

$$\mu_t = \bar{\mu}_t + K_t (z_t - h(\bar{\mu}_t))$$
$$\Sigma_t = (I - K_t H_t) \bar{\Sigma}_t$$

where $z_t$ is the measurement at timestep $t$. Algorithm 1 shows the full process.
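The five equations above can be collected into a single predict/update cycle. The following Python sketch (my own illustration using NumPy; the motion model, measurement model, and their Jacobians are assumed to be supplied as callables) mirrors the derivation, not the thesis's particular implementation:

```python
import numpy as np

def ekf_step(mu, Sigma, u, z, f, F, h, H, R, Q):
    """One EKF prediction/update cycle.

    mu, Sigma : previous state estimate and covariance
    u, z      : current control input and measurement
    f(u, mu)  : motion model;  F(u, mu) = its Jacobian
    h(mu)     : measurement model;  H(mu) = its Jacobian
    R, Q      : process and measurement noise covariances
    """
    # Prediction step: propagate the estimate through the motion model.
    mu_bar = f(u, mu)
    F_t = F(u, mu)
    Sigma_bar = F_t @ Sigma @ F_t.T + R
    # Update step: condition the prediction on the new measurement.
    H_t = H(mu_bar)
    K = Sigma_bar @ H_t.T @ np.linalg.inv(H_t @ Sigma_bar @ H_t.T + Q)
    mu_new = mu_bar + K @ (z - h(mu_bar))
    Sigma_new = (np.eye(len(mu)) - K @ H_t) @ Sigma_bar
    return mu_new, Sigma_new
```

In EKF SLAM, `mu` and `Sigma` carry the full robot-plus-landmark state defined above, so the covariance grows quadratically with the number of landmarks.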
Algorithm 1: Extended Kalman Filter Based SLAM
Require: $\mu_{t-1}$ //Previous state estimate
Require: $\Sigma_{t-1}$ //Previous covariance matrix
  $\bar{\mu}_t = f(u_t, \mu_{t-1})$ //State prediction step
  $\bar{\Sigma}_t = F_t \Sigma_{t-1} F_t^T + R_t$ //Covariance prediction step
  $K_t = \bar{\Sigma}_t H_t^T (H_t \bar{\Sigma}_t H_t^T + Q_t)^{-1}$ //Kalman gain
  $\mu_t = \bar{\mu}_t + K_t (z_t - h(\bar{\mu}_t))$ //State condition step
  $\Sigma_t = (I - K_t H_t) \bar{\Sigma}_t$ //Covariance condition step
  return $\mu_t$, $\Sigma_t$

Although EKF based SLAM is sufficient in many cases, it depends on assumptions about the system dynamics and probability densities. Namely, the EKF linearizes the system dynamics at each timestep, which in many cases does not pose a problem, but if the system dynamics show strong nonlinearities the algorithm can deviate from the truth. In addition, the EKF assumes an approximately Gaussian posterior density. Dense environments can produce multi-modal behavior of the posterior distribution, and in these cases the EKF fails to accurately predict the state.

The particle filter presents an improved strategy, but comes at the cost of increased computational load. In particular, as the state space increases the computational efficiency dramatically decreases, making the particle filter only viable for low-dimensional cases. I will outline the main mathematical features of the particle filter based on the lecture series [15].

The particle filter uses a set of "particles" along with corresponding weights as a sampling pool of possible states,

$$\chi = \{ \langle x^{[j]}, w^{[j]} \rangle \}_{j=1,\dots,J}$$

The set $\chi$ contains $J$ particles with corresponding states ($x^{[j]}$) and weights ($w^{[j]}$). At each timestep the particle filter will initialize a new set $\chi_t$ as an empty set and use the previous timestep's set $\chi_{t-1}$ in order to resample according to each particle's weight.
After sampling from $\chi_{t-1}$ the particle filter propagates the sample according to the posterior

$$x_t^{[j]} \sim p(x_t \mid u_t, x_{t-1}^{[j]})$$

The propagation occurs with some uncertainty due to process noise; therefore, the new particle is sampled according to the state posterior given the control input $u_t$ and previous state estimate $x_{t-1}^{[j]}$.

The most important step of the particle filter is to calculate the weight of each newly sampled particle according to the measurements received at the timestep. The weights are calculated according to

$$w_t^{[j]} = p(z_t \mid x_t^{[j]})$$

Finally, the weights are normalized and the particles added to the new set,

$$\chi_t = \chi_t + \langle x_t^{[j]}, w_t^{[j]} \rangle$$

which is used for the next sampling period. The full algorithm is presented in Algorithm 2.

In the context of SLAM the states of each particle may be the robot's position and orientation along with all positions of landmarks. However, in practice, this becomes computationally expensive as the number of particles required to cover the entire state space dramatically increases. As we will see in later sections, the particle filter is used in combination with the EKF in the FastSLAM algorithm [16].
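The resample, propagate, weight, and normalize steps can be sketched as one Python function (an illustrative sketch of the generic particle filter described above, with my own function names; the motion and measurement models are assumed supplied as callables):

```python
import random

def particle_filter_step(particles, u, z, sample_motion, measurement_prob):
    """One particle-filter cycle over a list of (state, weight) pairs.

    sample_motion(u, x)   : draws x_t ~ p(x_t | u_t, x_{t-1})
    measurement_prob(z, x): evaluates p(z_t | x_t)
    """
    states, weights = zip(*particles)
    # Resample with replacement, proportionally to the previous weights.
    resampled = random.choices(states, weights=weights, k=len(states))
    new = []
    for x_prev in resampled:
        x = sample_motion(u, x_prev)         # prediction step
        w = measurement_prob(z, x)           # update (weighting) step
        new.append((x, w))
    total = sum(w for _, w in new) or 1.0
    return [(x, w / total) for x, w in new]  # normalized weights
```

Because the weight of every particle must be evaluated at each timestep, the cost grows linearly in $J$ and the required $J$ grows quickly with state dimension, which is the computational concern noted above.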
Particle Filter Based SLAM
Require: χ_{t−1} // Previous timestep's particle set
Require: u_t // Current timestep control input
Require: z_t // Current timestep measurement
χ_t = ∅ // Initialize set as empty
for j = 1 to J do
  sample x_t^[j] ∼ p(x_t | u_t, x_{t−1}^[j]) // Prediction step
  w_t^[j] = p(z_t | x_t^[j]) // Update step
  χ_t = χ_t + ⟨x_t^[j], w_t^[j]⟩ // Append particle to set
end for
for j = 1 to J do
  normalize w_t^[j] // Normalize all weights in set
end for
return χ_t

Occupancy grid maps are one of the most commonly generated map representations, so I will explain their purpose in detail. As their name suggests, occupancy grid maps are a discretized grid containing cells that represent obstacles, free space, or unknown space. An illustration of an occupancy grid map is shown in figure 2.2.

Occupancy grid maps are advantageous because they do not require feature detectors. Unlike feature-based maps, which depend upon an accurate feature detector to establish landmarks, every cell on the map receives a corresponding state. On the other hand, occupancy grid maps are far more computationally expensive because every cell is established as a landmark [17].

I will provide a full derivation of the algorithm used to populate each of the grid map cells. This derivation is based on the video lecture [17].

Two main assumptions establish a foundation for occupancy grid maps: first, each cell is assumed to be independent of all other cells, and, second, the world remains static.

Figure 2.2: An illustration of an occupancy grid map. The map is discretized into rectangular cells which are color coded corresponding to free (white), occupied (black), or unknown (grey).

The probability density is expressed as

p(m | z_{1:t}, x_{1:t})

where the map m is conditioned on both the robot's measurements and the robot's states. It is important to note that this probability distribution assumes that the robot's state is known a priori. Because each cell is independent of all other cells in the map we can write

p(m | z_{1:t}, x_{1:t}) = ∏_i p(m_i | z_{1:t}, x_{1:t})

According to Bayes' rule this can be rearranged as

p(m_i | z_{1:t}, x_{1:t}) = p(z_t | m_i, z_{1:t−1}, x_{1:t}) p(m_i | z_{1:t−1}, x_{1:t}) / p(z_t | z_{1:t−1}, x_{1:t})

Due to the Markov property of the measurements and knowledge of the state of the cell this can be rewritten as

p(m_i | z_{1:t}, x_{1:t}) = p(z_t | m_i, x_t) p(m_i | z_{1:t−1}, x_{1:t−1}) / p(z_t | z_{1:t−1}, x_{1:t})

We can use Bayes' rule on the term p(z_t | m_i, x_t) to write

p(z_t | m_i, x_t) = p(m_i | z_t, x_t) p(z_t | x_t) / p(m_i | x_t)

which leads to the full probability distribution

p(m_i | z_{1:t}, x_{1:t}) = p(m_i | z_t, x_t) p(z_t | x_t) p(m_i | z_{1:t−1}, x_{1:t−1}) / (p(m_i) p(z_t | z_{1:t−1}, x_{1:t}))

It is important to note that the term p(m_i | x_t) reduces to p(m_i) due to the grid map structure.
If we take the ratio of the probability that a cell is occupied to the probability that a cell is unoccupied we arrive at the expression

p(m_i | z_{1:t}, x_{1:t}) / (1 − p(m_i | z_{1:t}, x_{1:t})) = [p(m_i | z_t, x_t) / (1 − p(m_i | z_t, x_t))] · [p(m_i | z_{1:t−1}, x_{1:t−1}) / (1 − p(m_i | z_{1:t−1}, x_{1:t−1}))] · [(1 − p(m_i)) / p(m_i)]

This is the key step, in that it allows one to reclaim the original probability distribution in the form

p(m_i | z_{1:t}, x_{1:t}) = (1 + [(1 − p(m_i | z_t, x_t)) / p(m_i | z_t, x_t)] · [(1 − p(m_i | z_{1:t−1}, x_{1:t−1})) / p(m_i | z_{1:t−1}, x_{1:t−1})] · [p(m_i) / (1 − p(m_i))])^{−1}

If we look closely we can see that three separate useful terms appear in the expression:

(1 − p(m_i | z_t, x_t)) / p(m_i | z_t, x_t)

represents the inverse sensor model of the robot,

(1 − p(m_i | z_{1:t−1}, x_{1:t−1})) / p(m_i | z_{1:t−1}, x_{1:t−1})

represents a recursive term from previous states and sensor observations, and

p(m_i) / (1 − p(m_i))

is a prior term on the state of the cell. This allows us to form a general algorithm for updating each of the individual cells in the occupancy grid as the robot receives more information from the environment.

2.2.4 FastSLAM

FastSLAM is an efficient SLAM algorithm that combines the advantages of the EKF and the particle filter into a single algorithm [16]. EKF-based SLAM and particle filter based SLAM each pose significant drawbacks on their own: EKF-based SLAM cannot account for highly nonlinear system dynamics or highly non-Gaussian distributions, while particle filter based SLAM becomes computationally infeasible for high state space dimensions. Therefore, FastSLAM combines the two based on a Rao-Blackwellization of the posterior distribution,

p(x_{1:t}, m_{1:M} | z_{1:t}, u_{1:t}) = p(x_{1:t} | z_{1:t}, u_{1:t}) p(m_{1:M} | x_{1:t}, z_{1:t})

which can be rewritten as

p(x_{1:t}, m_{1:M} | z_{1:t}, u_{1:t}) = p(x_{1:t} | z_{1:t}, u_{1:t}) ∏_{i=1}^{M} p(m_i | x_{1:t}, z_{1:t})

by accounting for the independence of each landmark. In this form the posterior density can be broken into two parts.
The first piece, p(x_{1:t} | z_{1:t}, u_{1:t}), can be estimated with a particle filter to determine the robot poses. The second piece, ∏_{i=1}^{M} p(m_i | x_{1:t}, z_{1:t}), can be estimated with M separate low-dimensional EKFs, one per landmark. The pose estimate can be refined by finding the most likely current pose,

x*_t = argmax_{x_t} p(z_t | x_t, m_{t−1}) p(x_t | u_{t−1}, x*_{t−1})

and in some cases the algorithm uses an improved proposal distribution for choosing particles based on current sensor information [19]:

x_t^[k] ∼ p(x_t | x_{1:t−1}^[k], u_{1:t}, z_{1:t})

In conclusion, FastSLAM is an efficient SLAM algorithm that uses Rao-Blackwellization to divide the posterior density such that the EKF and particle filter work together to produce accurate and efficient map estimates.

Map merging is a key component of multi-robot SLAM, because multiple robots can share local map information presented as occupancy grid maps and overlay the maps to produce a larger global map. This is especially useful if robots are spread out in distant locations, because a global map presents each robot with much more information about the environment than each local map.

Map merging algorithms have two main challenges. The first challenge is feature detection amongst the maps. This process allows the map merging algorithm to determine the features of the map that are significant for overlay with other maps. In theory, robots that map the same area should produce maps that have a significant number of features in common. Two approaches used for feature detection are the Scale-Invariant Feature Transform (SIFT) and Speeded Up Robust Features (SURF). However, both of these are patented and not available for open source use, and both have limitations. Another algorithm, Oriented FAST and Rotated BRIEF (ORB), is open source and ameliorates the issues of SIFT and SURF [20]. The second challenge is to estimate the transformation amongst individual maps.
[20] uses an algorithm based on image stitching for panoramic images to estimate the affine transformation amongst individual maps and extends this to a global map transformation. The algorithm uses random sample consensus (RANSAC) to estimate the transformation, which can produce more accurate results by focusing on inliers rather than fitting to all data (inliers and outliers).

The advantage of the algorithm presented in [20] is that initial robot poses do not need to be known. Oftentimes it is difficult for robots to directly estimate the poses of other robots in the system; the algorithm sidesteps this by estimating affine transformations based on map overlaps.

The algorithm presented in [20] is provided in the multirobot map merge package included with the ROS navigation stack. The package is used in this project and will be explained in detail in later sections. A visualization of the result of the map merging algorithm is shown in figure 2.3.
Figure 2.3: An illustration of the map merging algorithm presented in [20]. Each ofthe local maps (a),(b),(c) are stitched together through the ROS navigation multi-robot map merge package. The resulting map is shown in (d)
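To make the RANSAC idea concrete, the toy sketch below estimates a translation-only alignment between two lists of matched 2-D map features. It is deliberately simplified relative to the approach in [20], which estimates full affine transformations from ORB feature matches; the function name, the translation-only model, and the single-match minimal sample are all assumptions made for illustration.

```python
import random

def ransac_translation(src, dst, iters=200, tol=0.5, seed=0):
    """Estimate the 2-D translation mapping src[i] -> dst[i] while ignoring
    outlier correspondences, in the spirit of RANSAC: hypothesize from a
    minimal sample, keep the hypothesis that explains the most inliers.
    src, dst: equal-length lists of (x, y) matched points."""
    rng = random.Random(seed)
    best_t, best_inliers = (0, 0), -1
    for _ in range(iters):
        i = rng.randrange(len(src))          # minimal sample: one match
        tx = dst[i][0] - src[i][0]
        ty = dst[i][1] - src[i][1]
        inliers = sum(
            1 for (sx, sy), (dx, dy) in zip(src, dst)
            if abs(sx + tx - dx) <= tol and abs(sy + ty - dy) <= tol)
        if inliers > best_inliers:
            best_t, best_inliers = (tx, ty), inliers
    return best_t, best_inliers
```

Fitting to the consensus set rather than to all matches is what lets RANSAC tolerate the spurious correspondences that inevitably arise between noisy occupancy grids.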
Path planning is crucial for robots to safely traverse a space from point A to point B. Without path planning and navigation, one could not guarantee that a mobile robot network functions safely and effectively. The navigation procedure is generally divided into a global planning algorithm, which produces a safe path around static and known obstacles, and a local planner, which adheres to the global plan while avoiding dynamic and unknown obstacles. A common global planning algorithm, A*, is outlined below, and an application to a humanoid robot is included as an illustrating example.
The A* graph traversal algorithm is based on Dijkstra's algorithm for finding the shortest path in a graph, but uses informed search heuristics to achieve better performance. Nodes are explored based on a priority queue ordered by the heuristic function

f(n) = g(n) + h(n)

where f(n) represents the total estimated cost of a path through node n, g(n) is the cost of the path from the start node to node n, and h(n) is a heuristic estimate of the cost from node n to the goal node.

At each iteration the algorithm chooses the node in the priority queue with the lowest heuristic cost. Neighbor nodes are explored with the chosen node as the parent. The algorithm determines the heuristic cost of each neighbor, checks whether or not the node has already been explored, and updates the priority queue accordingly. The algorithm repeats until a node within a tolerance of the goal is chosen, at which point the path is reconstructed. The complete A* algorithm is shown in Algorithm 3.

In some instances a weighted A* is preferred to reduce search time at the cost of optimality. The weighted A* aggressively chooses nodes that lead towards the goal in order to open fewer total nodes in the graph [21]. The heuristic function becomes

f(n) = g(n) + εh(n)

where ε > 1.

Algorithm 3 A* Search Algorithm
Require: StartPose
Require: GoalPose
ENQUEUE(OpenSet, StartPose)
StartPose → Gscore := 0
StartPose → Fscore := 0
while OpenSet is not empty do
  current := node in OpenSet with lowest Fscore
  if current within tolerance of GoalPose then
    reconstruct and return path
  end if
  for each neighbor of current do
    TentativeGscore := current → Gscore + d(current, neighbor)
    if TentativeGscore < neighbor → Gscore then
      neighbor → parent := current
      neighbor → Gscore := TentativeGscore
      neighbor → Fscore := neighbor → Gscore + d(neighbor, GoalPose)
      if neighbor not in OpenSet then
        ENQUEUE(OpenSet, neighbor)
      end if
    end if
  end for
end while
return empty path

We can apply the weighted A* path planning algorithm to find locomanipulation trajectories for the NASA Valkyrie humanoid robot [21]. Locomanipulation is a difficult problem for humanoid robots, because locomotion (moving around an environment) and manipulation (moving objects within an environment) often destabilize the robot if done in parallel. As a result, most locomotion and manipulation tasks are performed sequentially. However, the goal of this work was to generate feasible locomanipulation plans quickly such that the robot's end-effector reaches the goal pose.

Figure 2.4: A locomanipulation planner based on a weighted A* search algorithm implemented on the NASA Valkyrie humanoid robot. The planner generates plans based on locomotion progression, manipulation progression, or locomanipulation progression. Footsteps are denoted by l_i and r_i and the end effector progression is denoted by s. Image courtesy of [21].

In order to generate valid locomanipulation trajectories it is useful to fix a locomotion plan and find feasible manipulation trajectories on this locomotion-constrained manifold. In essence, the trajectory of the end-effector is calculated only within the nullspace of the locomotion plan.

We use a weighted A* search algorithm to generate plans based on this model, where edge transitions can be one of three options: pure locomotion, pure manipulation, or locomanipulation. The edge cost function becomes

Δg(v₁, v₂) = w_s(1 − s) + w_step + w_L r(v₂)

where Δg(v₁, v₂) is the cost of the transition between vertices v₁ and v₂, w_s(1 − s) is the cost associated with moving the end effector according to the s variable, w_step is the cost associated with taking a footstep, and w_L r(v₂) penalizes states that deviate from a particular body path. The important thing to consider is that edge transitions are now a function of two variables (footstep poses and end effector poses s).
This allows the planner to consider transitions that progress both locomotion and manipulation trajectories simultaneously.

In order to speed up the process of determining feasible plans, a neural network classifier is used to prune nodes that produce infeasible trajectories. The classifier's output is considered in the cost function

Δg(v₁, v₂) = w_s(1 − s) + w_step + w_L r(v₂) + w_d(1 − n(v₁, v₂))

where w_d(1 − n(v₁, v₂)) is a cost associated with the feasibility of the edge transition as determined by the neural network. This ensures that the planner chooses nodes that adhere to feasibility while still progressing towards the goal.

As a final improvement we use a weighted A* cost heuristic. The edge cost function becomes

Δg(v₁, v₂) = w_h(w_s(1 − s)) + w_step + w_L r(v₂) + w_d(1 − n(v₁, v₂))

in order to quickly progress the plans towards completing the end effector task. This formulation was validated on a cart pushing task and a door opening task as shown in figure 2.5.

Figure 2.5: The locomanipulation formulation is tested using two tasks. The left task involves moving the end effector from s = 0 to s = 1 in order to open a door. The right task involves progressing left and right end effectors (s_l and s_r respectively) towards pushing a cart. Images courtesy of [21].

Hierarchies are a common approach towards solving multi-agent problems. For example, they can be used in the context of multi-robot task allocation [22], multi-robot reinforcement learning [23], multi-robot mapping [24], and even whole body control [25]. Hierarchical structures are advantageous in many scenarios because they eliminate the need to solve for a global optimal solution involving all agents, and instead solve optimal solutions within a hierarchy.

The premise of a hierarchical structure is that agents are ranked according to a criterion. In the case of multi-robot systems the robots may be ranked according to capabilities.
In homogeneous cases the ranking may be arbitrary, as all agents are classified in similar ways. Solutions are computed according to the priority queue of the hierarchy. For instance, the first agent may compute a solution and all following agents will compute solutions within the null space of preceding agents.

This structure is especially common in the context of whole body control. Robots contain n degrees of freedom, which means that joints can often interfere with other joints if the robot is expected to perform multiple simultaneous tasks. A hierarchy is set up to prevent this issue: tasks are prioritized according to importance and any task is performed only within the nullspace of preceding tasks.

As previously mentioned, whole body control uses a hierarchical task space representation to ensure that no specified tasks interfere. Tasks can be end effector position or velocity in cartesian space. I will provide an example of a whole body control structure within the kinematics domain using a hierarchical task space, and compare two optimal control strategies for solving the problem. The example will be based on the robot shown in figure 2.6.

Figure 2.6: The kinematic diagram for this robot. Each of the end effectors is labelled as 1, 2, 3, 4 respectively.
Task Space Representation
I will give a brief overview of how the joint space of a whole body control structure can be mapped into a task space using a series of null space projections. The mapping from a joint displacement to the end effector displacement in cartesian space is described by

dx_i = J_i dq    (2.1)

where dx_i is the end-effector displacement of the i-th task, J_i is the Jacobian of the i-th task, and dq is the joint displacement.

If we define

dq = dq_1 + N_1 dq_2 + N_1 N_{2/1} dq_3 + ... + N^[s] dq_s    (2.2)

where

N^[s] = ∏_{i=1}^{s} N_{i/i−1}    (2.3)

N_{i/i−1} is the null space projector defined to be

N_{i/i−1} = I − (J_i N_{i−1/i−2})† (J_i N_{i−1/i−2})    (2.4)

where

N_{1/0} = N_1 = (I − J_1† J_1)    (2.5)

By plugging 2.2 into 2.1 for every i in the task space we can establish a set of decoupled task space equations of the form

dx_1 = J_1 dq_1
dx_2 = J_2 dq_1 + J_2 N_1 dq_2
dx_3 = J_3 dq_1 + J_3 N_1 dq_2 + J_3 N_1 N_{2/1} dq_3

With this set of task space equations we can generate a discrete time state space model of the form

x_{t+1} = x_t + B dq_t + w_t

where x_t = [x^1_t, x^2_t, ..., x^k_t]^T stacks the task states, dq_t = [dq^1_t, dq^2_t, ..., dq^k_t]^T stacks the joint displacement contributions, w_t = [w^1_t, ..., w^k_t]^T is process noise, the state transition matrix is the identity I_{k×k}, and B is the block lower-triangular matrix whose i-th block row is

[ J_i   J_i N_1   J_i N_1 N_{2/1}   ...   J_i ∏_{l=1}^{i−1} N_{l/l−1}   0   ...   0 ]

Task Hierarchy
Due to the sequential null space projections of tasks it is easy to establish a hierarchy of task objectives, defined as end-effector positions in cartesian space, as shown by the discrete time state space model in the previous section. The top prioritized task generates a control policy based on information from its own state, whereas all lower level tasks rely on information from their own state as well as the control inputs of all higher level tasks. The information structure is displayed in figure 2.7 as well as in equations 2.6, 2.7, 2.8, and 2.9.
Figure 2.7: A representation of the information structure for this system. Task 1 represents the prioritized task and all information flows downstream to lower level tasks.

u_1(t) = dq̃^1_t = γ_{1,t}(x^1_t)    (2.6)

u_2(t) = dq̃^2_t = γ_{2,t}(x^2_t, u^1_t)    (2.7)

u_3(t) = dq̃^3_t = γ_{3,t}(x^3_t, u^1_t, u^2_t)    (2.8)

u_k(t) = dq̃^k_t = γ_{k,t}(x^k_t, u^1_t, u^2_t, u^3_t, ..., u^{k−1}_t)    (2.9)

Optimal Control Formulation
The aim of the control policies is to minimize the cost function

J = lim sup_{T→∞} (1/T) ∑_{t=1}^{T} E(x̃_t^T Q x̃_t + u_t^T R u_t)    (2.10)

Because the information structure contains a hierarchy, we can solve this system with the sparsity dynamic programming approach from [26]. There is no time delay in this system, therefore the solution reduces to k solutions of the discrete backward Riccati recursion. Because this system contains a time varying B matrix, the standard LQR dynamic programming approach solves for a feedback gain matrix K at each timestep by solving the backward Riccati equation:

P_{t−1} = Q + A^T P_t A − A^T P_t B (R + B^T P_t B)^{−1} B^T P_t A
P_T = I
K_t = (R + B^T P_{t+1} B)^{−1} B^T P_{t+1} A

However, if we consider a sparsity structure, a feedback gain K^s_t is solved for each node, where s corresponds to the information node. The control input is reconstructed according to a propagation of the noise multiplied by each K matrix. If we let

ζ^s_{t+1} = ∑_{i: w^i → s} I_{s,i} w^i_t

where

ζ^s_1 = ∑_{i: w^i → s} I_{s,i} x^i_1

and let

x^i_1 ∼ N(0, σ²)

then the control input can be reconstructed as

u_t = ∑_s I_{i,s} K^s_t ζ^s_t

This formulation enforces sparsity on K and significantly decreases the computation required for the control input.

The weighting matrices Q and R are chosen as identity matrices in order to put equal weight on control input and state error. These are initial choices for these parameters and may be updated in future work.

Results and Figures
The results of a simulation comparing the control performance of standard dynamic programming using the backward Riccati recursion against dynamic programming with sparsity are shown below. Figure 2.8 shows the actual joint trajectories overlaid on the nominal joint trajectories for both approaches. Figure 2.9 shows the error of both approaches as a function of time. Figure 2.10 shows the input commands for both approaches as a function of time, and figure 2.11 shows the average cost of both approaches as a function of time.

Figure 2.8: Actual simulated task trajectories for both dynamic programming and dynamic programming with sparsity.

Figure 2.9: Task errors over time for dynamic programming and dynamic programming with sparsity.

Figure 2.10: Joint inputs over time for dynamic programming and dynamic programming with sparsity.

Figure 2.11: Average cost over time for dynamic programming and dynamic programming with sparsity.

2.5 Literature Review
The multi-robot SLAM problem is not a novel research area; researchers have proposed a number of strategies to coordinate robots, merge local maps, and execute the SLAM algorithm.

The types of maps generated and the measurement sensors used vary. For example, [27] generates graph based maps using a vision camera, [28] uses RGBD cameras to generate landmark based maps, and [29] uses a laser rangefinder to create occupancy grid maps.

Some researchers present novel methods to augment the local maps of individual agents, but do not consider communication limitations. [30] uses stereo cameras to perform vision-based SLAM, estimating landmarks with a FastSLAM algorithm. The research is based on a hybrid multi-robot system, as local robots use FastSLAM to generate local maps and send these maps to a central server to perform map merging. However, the researchers do not address communication issues and assume perfect communication with the central server. In addition, [29] compares using a standard EKF against FastSLAM for building a global landmark based map for a multi-robot system. Robots share map and state information on a local level and do not rely on a central server for map merging, but robot communication is not guaranteed, as robots only communicate when they come within range of one another.

On the other hand, [27] and [28] present novel methods to augment the maps of individual agents and do consider communication limitations. [27] combines local robot pose graphs into a global graph based map using visual markers and uses a correction strategy to handle communication delays amongst the robots. [28] compares 3D object based maps to pointcloud based maps for a decentralized multi-robot system.
The researchers take advantage of local robot computational capacity by allocating object detection and object pose estimation to individual agents in order to reduce communication burdens.

In addition, [31] considers computational efficiency and a hierarchical structure to develop control policies for a multi-robot SLAM system. The researchers develop control policies by minimizing the entropy of the system conditioned on future measurements. The strategy exploits sparsity in the planning process and reduces computational burden. The formulation considers a partially nested information structure and a hierarchy of sensors to develop a hierarchical control policy.

[32], [33], [34], and [35] consider multi-robot exploration tasks in which robots are subject specifically to communication range limitations. [36] draws the distinction among three classes of communication range constraints: continuous, event-based, and none. [33] develops a strategy for event-based communication (robots only communicate when novel, relevant information is acquired). The researchers use a central computer to determine robot poses based on a centralized process and then extend the framework to a distributed process where robots choose their own goal poses. [32] considers continuous communication range restrictions. The researchers use a population sampling method to determine robot control policies based on a heuristic function taking into account the viable distances amongst robots and the central source. Although agents maintain viable network structures, the coordination is completely centralized and can run into computational burdens for the central computer.

[34] and [35] solve the communication range issue with ad-hoc information relay networks. Robots use one another to relay information either amongst themselves or back to the source.
This gives the robot network a means to explore areas beyond the range of any one robot.

Many researchers implement frontier based methods to explore the unknown portions of a map, but it is worth noting that other methods also provide efficient results. [37] uses an artificial potential field to guide a system of robots. The potential field is based on attractors (unexplored areas of the map) and repulsors (other robots, obstacles, and already explored areas). Additionally, [38] and [39] propose multi-robot exploration methods based on ant colonies, where robots leave traces to signal already explored areas to other robots.

Chapter 3
Methods
The objective of this project is to develop and test an autonomous multi-robot mapping architecture that is constrained by robot communication limits. The goal is to adhere to three main criteria: efficiency, scalability, and robustness.

The project combines multi-robot systems, SLAM, and robot exploration into a cohesive architecture. Multiple robots communicate with a central computer while autonomously exploring and gathering information for building the map of the space. The robots are constrained by communication range limits that prevent exploration of distant points outside of a viable communication network. However, the robots can form an ad-hoc relay network to pass information from any one robot through other robots until it eventually reaches the central computer. The architecture attempts to optimize the coordination of the robot system such that the agents explore as much of the map as possible while remaining in network communication amongst themselves and with the central computer.

The robots arrange themselves in an ad-hoc communication relay network by exploiting a hierarchical structure explained in section 3.4.2.

The full system architecture is shown in diagram 3.1. Each box corresponds to a node within each robot or the central computer. Each node serves a particular function, and the network of nodes and information sharing constitutes the entire architecture. Up until this point I have explained the nodes related to SLAM, navigation/path planning, and map merging. In the following sections I will provide a detailed explanation of the frontier point exploration and robot coordination algorithms. Finally, I will provide an overview of the system architecture, including the classification of the multi-robot system, the hierarchical structure, the information exchange, and the communication constraints.
The central computer determines frontier points from the merged map in order to send goal poses to the hierarchy of robots. Frontier points lie on the threshold between the known and unknown portions of the map. In other words, the frontiers indicate the areas that still need to be explored. The algorithm for finding these points is presented in Algorithm 4.

The algorithm is based on the Wavefront Frontier Detector algorithm presented in [40] and modified to prune irrelevant frontiers and extract optimal frontier points. The search algorithm uses a double breadth-first search over the map. The outer breadth-first search scans the map as a whole looking for relevant frontier points, while the inner breadth-first search scans the span of an individual frontier. The algorithm ensures that no point occurs within more than one frontier. This prevents frontier points from being double counted on the final frontier list.

Following frontier determination, frontiers are sorted to ensure that irrelevant frontiers, arising either from mapping errors or from spaces too small for robots to search, are not appended to the frontier list. The algorithm sorts based on a size threshold for each frontier array: only arrays above this threshold are considered.

Finally, the frontier point that is the spatial average of each relevant frontier array is chosen as the representative point for that frontier and appended to the frontier list. An illustration of the complete set of frontier points and the final frontier list is shown in figure 3.2.
The algorithm sorts the final frontier list according to the respective frontier array sizes. This assumes that the frontier arrays with the largest number of points indicate the areas of the map with the most potential information gain. The hierarchical frontier list is then used for the robot coordination algorithms explained in section 3.3.
Figure 3.2: An illustration of the frontier point search algorithm. a) shows allfrontier arrays (red) and b) shows only the representative frontier points (green) ofeach frontier array
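A minimal sketch of this pruning-and-ranking step might look as follows; the function name and the representation of a frontier as a list of cells are assumptions made for illustration.

```python
def rank_frontiers(frontier_arrays, min_size):
    """Drop frontier arrays below the size threshold, then sort the survivors
    so the largest frontier (assumed to offer the most information gain)
    comes first in the hierarchical frontier list."""
    kept = [f for f in frontier_arrays if len(f) >= min_size]
    return sorted(kept, key=len, reverse=True)
```

The resulting order is exactly the priority in which frontier points are offered to the robots in the hierarchy.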
Robot coordination is the backbone of a successful architecture. Robots choose frontier points as navigation goals based on the frontier list of section 3.2 such that each robot locally optimizes its choice. I will outline the data structures used for the choice algorithm, the main algorithm itself, and a backup choice algorithm for cases when no valid frontier points remain.
The choice algorithm depends on three main data structures. The first is the frontier list received from the preceding robot. The frontier list contains all available frontier points arranged according to the hierarchy of section 3.2.2. The second is the chosen pose that each robot sends to the navigation/path planning algorithm in order to reach that pose. The third and final structure is the pass down path, which marks pose dependencies for succeeding robots such that the current robot can navigate to the chosen pose and use other robots as relays to maintain valid information exchange with the system.

Algorithm 4 Frontier Points Search Algorithm
Require: MapQueue
Require: FrontierQueue
Require: Pose
MapQueue ← ∅
ENQUEUE(MapQueue, Pose)
Pose → MapOpen = true
while MapQueue not empty do
  current_m ← DEQUEUE(MapQueue)
  if current_m → MapClosed == true then
    continue
  end if
  if current_m is a frontier point then
    FrontierQueue ← ∅
    NewFrontier ← ∅
    ENQUEUE(FrontierQueue, current_m)
    current_m → FrontierOpen = true
    while FrontierQueue not empty do
      current_f ← DEQUEUE(FrontierQueue)
      if current_f → MapClosed or FrontierClosed == true then
        continue
      end if
      if current_f is a frontier point then
        append current_f to NewFrontier
        for all neighbors of current_f do
          if neighbor_f → FrontierOpen, FrontierClosed, MapClosed != true then
            ENQUEUE(FrontierQueue, neighbor_f)
            neighbor_f → FrontierOpen = true
          end if
        end for
      end if
      current_f → FrontierClosed = true
    end while
    if size of NewFrontier is greater than threshold then
      OptimalFrontierPoint = average pose of NewFrontier
      append OptimalFrontierPoint to FrontierList
    end if
  end if
  for all neighbors of current_m do
    if neighbor_m → MapOpen, MapClosed != true and neighbor_m has at least one open space neighbor then
      ENQUEUE(MapQueue, neighbor_m)
      neighbor_m → MapOpen = true
    end if
  end for
  current_m → MapClosed = true
end while
return FrontierList
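A condensed Python sketch of the double breadth-first search in Algorithm 4 is shown below. It is an illustrative reimplementation, not the project's code; the cell encoding (0 free, 1 occupied, −1 unknown), the function names, and the unit-grid geometry are assumptions made for the example.

```python
from collections import deque

FREE, OCC, UNK = 0, 1, -1

def is_frontier(grid, r, c):
    """A frontier cell is a free cell bordering at least one unknown cell."""
    if grid[r][c] != FREE:
        return False
    return any(grid[r + dr][c + dc] == UNK
               for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
               if 0 <= r + dr < len(grid) and 0 <= c + dc < len(grid[0]))

def frontier_points(grid, start, min_size=1):
    """Outer BFS over reachable free space; inner BFS spans one connected
    frontier at a time; each frontier above min_size contributes its
    average cell as the representative frontier point."""
    seen, grouped, points = {start}, set(), []
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        if is_frontier(grid, r, c) and (r, c) not in grouped:
            group, q2 = [], deque([(r, c)])
            grouped.add((r, c))
            while q2:                        # inner BFS: span one frontier
                fr, fc = q2.popleft()
                group.append((fr, fc))
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nb = (fr + dr, fc + dc)
                    if (0 <= nb[0] < len(grid) and 0 <= nb[1] < len(grid[0])
                            and nb not in grouped and is_frontier(grid, *nb)):
                        grouped.add(nb)
                        q2.append(nb)
            if len(group) >= min_size:       # prune frontiers that are too small
                points.append((sum(p[0] for p in group) / len(group),
                               sum(p[1] for p in group) / len(group)))
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nb = (r + dr, c + dc)
            if (0 <= nb[0] < len(grid) and 0 <= nb[1] < len(grid[0])
                    and nb not in seen and grid[nb[0]][nb[1]] == FREE):
                seen.add(nb)
                queue.append(nb)
    return points
```

Because the `grouped` set is shared across all inner searches, no cell can belong to more than one frontier, which mirrors the double-counting guarantee of Algorithm 4.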
The choice algorithm guarantees that a robot either chooses the best available frontier list pose or adheres to a dependency of a preceding robot for communication relay. The full algorithm is shown in Algorithm 5. I will explain each of the main features of the algorithm in detail below.
Algorithm 5 Choose Frontier Point
Require: FrontierList
Require: PassDownPathList
if PassDownPathList not empty then
  ChosenPose ← DEQUEUE(PassDownPathList)
  erase ChosenPose from FrontierList
  return ChosenPose, FrontierList, PassDownPathList
else
  for all FrontierPts in FrontierList do
    if FrontierPt is valid then
      ENQUEUE(ValidFrontierList, FrontierPt)
      find neighbors of FrontierPt
    end if
  end for
  while ValidFrontierQueue not empty do
    CandidatePose ← DEQUEUE(ValidFrontierQueue)
    compute valid paths of CandidatePose
    PathList := CandidatePose → ValidPaths
    if PathList not empty then
      ChosenPose := CandidatePose
      PassDownPathList := OptimalPath
      return ChosenPose, PassDownPathList, FrontierList
    end if
  end while
  if fail to find ChosenPose then
    perform BackupChoiceAlgorithm
  end if
end if

The optimal path is chosen according to the cost function

g(f) = ∑_{i}^{k} (w_r(f_{i,r}) + w_n(f_{i,n}))

where g(f) is the cost associated with each path, f_{i,r} is the rank value of each frontier point of the path determined by its index in the FrontierList, f_{i,n} is a cost associated with including more frontier points in the path and is set to 1, and w_r and w_n are the respective weights. The algorithm finds the optimal path based on two criteria: most prioritized frontier points and least dependency on other robots in the system.

Finally, if the robot fails to find a valid chosen frontier pose then it resorts to the backup choice algorithm presented in the next section.

3.3.3 Backup Choice Algorithm

The backup choice algorithm is designed to explore frontier points that otherwise do not contain a path that can be traced back to the WiFi source through pre-established frontier points. The algorithm chooses the first frontier point from the FrontierList and computes a shortest distance path from the pose to the WiFi source considering static obstacles (in the same way as a global planning algorithm). If the path distance is greater than the number of robots remaining multiplied by the WiFi distance threshold, then the frontier point is not considered and the algorithm moves to the next frontier point. If the path distance is within the distance threshold, then the algorithm computes poses for the minimum number of robots needed to relay the information back to the source. The algorithm provides a means to explore larger maps, as it uses robots as a fireline to relay information from a distant pose back to the central computer. Finally, the robot may not find a viable frontier pose to choose, in which case the chosen pose defaults to the robot's starting position in the map and the pass down path becomes empty.
Algorithm 6 Backup Choice
Require: FrontierList
    CandidateFrontierList := FrontierList
    while CandidateFrontierList not empty do
        CandidatePose ← DEQUEUE(CandidateFrontierList)
        compute path from CandidatePose to WiFi source
        if path distance < (robots remaining) * (WiFi range threshold) then
            ChosenPose := CandidatePose
            erase ChosenPose from FrontierList
            find relay poses along path
            append relay poses to PassDownPathList
            return ChosenPose, FrontierList, PassDownPathList
        end if
    end while
    if fail to find ChosenPose then
        ChosenPose ← starting pose
        PassDownPathList ← ∅
        return ChosenPose, FrontierList, PassDownPathList
    end if

Figure 3.3: A representation of the robot coordination algorithms. a) depicts the robots choosing frontier points based solely on the choose frontier point algorithm, b) shows a one-robot relay using the backup choice algorithm, where robot 1 uses robot 2 as an information relay, c) and d) show a two-robot relay where robot 1 uses robots 2 and 3 to relay information. The chosen poses are shown as red, blue, and pink arrows respectively and the WiFi source is denoted by the orange star.
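The "find relay poses along path" step could be sketched as follows. The path representation and names are assumptions, and a full implementation would also verify that enough robots remain to occupy the relay poses:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Pose2D { double x, y; };

// Walk a discretized path from the chosen pose toward the WiFi source
// and drop a relay pose each time the distance travelled since the last
// relay (or since the chosen pose) reaches the WiFi range threshold.
// The final path point is the source itself and needs no relay.
std::vector<Pose2D> relayPoses(const std::vector<Pose2D>& path,
                               double wifiRange) {
    std::vector<Pose2D> relays;
    double travelled = 0.0;
    for (std::size_t i = 1; i + 1 < path.size(); ++i) {
        double dx = path[i].x - path[i - 1].x;
        double dy = path[i].y - path[i - 1].y;
        travelled += std::hypot(dx, dy);
        if (travelled >= wifiRange) {   // out of range: place a relay here
            relays.push_back(path[i]);
            travelled = 0.0;
        }
    }
    return relays;
}
```

For a 4 m straight-line path with a 2 m range, this places a single relay at the midpoint, so each hop (chosen pose → relay → source) stays within range.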
The system type is based on the taxonomy of multi-robot systems shown in table 2.1.

The first classification is a hybrid architecture. This system contains n individual agents along with a central computer. Each of the agents generates local maps according to its own SLAM algorithm and sends the local map data to a central computer, which performs map merging and frontier point extrapolation and sends frontier points to the individual agents for further exploration.

The second classification is a homogeneous system. Each of the robots within this system has the exact same specifications, including hardware and the algorithms for performing SLAM, coordination, navigation, and communication.

The third classification is a cooperative system. Every robot works with every other robot to generate a map of the environment as efficiently as possible. This includes working together to optimally distribute tasks as well as to prevent collisions during mobility.

The fourth classification is a hierarchical information structure. The robots arrange themselves according to a hierarchy for decision making. The central computer computes optimal frontier points to be explored by the system and passes the information to the prioritized robot, which makes a choice based on the data and then passes the unchosen points to the next robot on the priority list. This process repeats until all robots choose a task to perform.

The fifth classification is direct communication. All information sent from agent to agent occurs through onboard hardware and a direct communication channel. Robots send all necessary data through an established communication protocol within the ROS framework, either to the central computer or to one another.

The sixth and final classification is a reactive system architecture. Unforeseen changes in the environment are common, especially as many robots navigate the environment all at once.
In effect, each robot individually adjusts its path plans according to unforeseen changes. This highlights the importance of a local planning algorithm for this type of architecture, because robots often need to adjust their course as other robots get in the way of their path.

3.4.2 Hierarchy
Figure 3.4: An illustration of the hierarchical structure of this system. The central computer sends frontier points to the top robot in the hierarchy, which performs a choosing operation and sends all necessary information to the next robot.

The information structure of this system is a hierarchy. As seen in figure 3.4, the central computer sends a list of potential exploration (frontier) points to the first robot, which performs the necessary choosing operation and sends all pass down information to the next robot. The robots perform this procedure in sequence until all robots choose a task.

Within the context of this project the hierarchy is organized arbitrarily because the system is homogeneous. There is no advantage to considering one robot as more capable than any other, because all robots have the same specifications. However, in the case of heterogeneous robot systems the organization of this hierarchy becomes very important, as it factors significantly into performance.

There are a few advantages to using a hierarchy within this context. First, I assume that this architecture can be scaled to many agents. This means that a globally optimal solution for robot exploration becomes an intractable problem for a large number of agents. A hierarchy allows the system to solve a series of locally optimal solutions, each within the "nullspace" of the preceding solution. In this case the first robot chooses a frontier point that is optimal for itself. All less prioritized robots then choose locally optimal frontier points while not interfering with the choices of preceding robots. The second advantage is that solutions can be computed with only local measurement data. Robots do not need to know the positions of other robots when computing solutions. This frees up communication channels and prevents issues with incomplete state information.

The disadvantage of a hierarchy is a sacrifice of optimality for efficiency.
The "get out of the way" procedure outlined above, where each preceding robot makes a decision and all other robots only make decisions within the nullspace of established choices, leaves out better team solutions. For example, robot 1 may choose a frontier point based on its own optimality criteria and let all other robots choose based on the hierarchy, but it may be more beneficial for robot 1 to choose the frontier points for all other robots based on its own initial decision and state knowledge.
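The sequential choosing described above can be sketched as a greedy pass down the priority list: each robot picks its lowest-cost frontier from whatever the higher-priority robots left behind. The cost function and data types here are illustrative stand-ins:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <vector>

// Hierarchical ("nullspace") assignment sketch: robots choose in
// priority order; each chosen frontier is removed so lower-priority
// robots can only pick from the remaining set.
std::vector<int> hierarchicalAssign(
        std::size_t numRobots,
        std::vector<int> frontiers,   // by value: we erase as we assign
        const std::function<double(std::size_t, int)>& cost) {
    std::vector<int> chosen;
    for (std::size_t robot = 0; robot < numRobots && !frontiers.empty(); ++robot) {
        std::size_t best = 0;
        double bestCost = std::numeric_limits<double>::infinity();
        for (std::size_t i = 0; i < frontiers.size(); ++i) {
            double c = cost(robot, frontiers[i]);
            if (c < bestCost) { bestCost = c; best = i; }
        }
        chosen.push_back(frontiers[best]);
        // "Nullspace": this frontier is no longer available downstream.
        frontiers.erase(frontiers.begin() + best);
    }
    return chosen;
}
```

This makes the stated trade-off concrete: robot 2's best frontier may already be gone, even when swapping choices between robots 1 and 2 would lower the team's total cost.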
Figure 3.5: An illustration of the information flow of this system. The central computer sends a list of frontier points to the first robot, which chooses a locally optimal solution and sends a pass down path and the updated list of frontier points to the next robot. This process repeats until all robots choose a task. The robots send a completion notice back up the hierarchy after reaching their respective goal poses to indicate to the central computer to begin the next cycle.

Information propagates through the hierarchy such that each robot chooses a locally optimal task to achieve the global goal.

The first question is what type of information is propagated in the system. The messages sent from the central computer to the robots, or amongst the robots individually, contain occupancy grid maps (each robot to the central computer), frontier point lists (central computer to robot and robot to robot), pass down path lists (robot to robot), and boolean completion messages (robot to robot and robot to central computer).

The second question is how the information propagates. The central computer receives local map data from each of the robots individually in the form of occupancy grid map messages. The central computer computes optimal frontier points based on a merged map of all local maps received from the robots, and sends this list to the first robot in the hierarchy. After robot 1 receives the list of frontier points, it chooses a pose based on the algorithms in sections 3.3.2 and 3.3.3. Robot 1 then passes down two messages in the form of an updated frontier list and a pass down path (the pass down path message is empty if there is no robot dependency). This process is repeated until all robots choose a point.
Finally, after all robots have completed their respective tasks, a boolean completion message is sent back up the hierarchy until reaching the central computer as an indication to restart the process. The full information flow diagram is presented in figure 3.5.

This information flow is advantageous because the most demanding information exchange occurs as occupancy grid map messages from each robot to the central computer. All other information exchange occurs as limited-size arrays (frontier points, pass down paths) or booleans (completion messages).
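The four message types above might be sketched as plain data structures. The real system uses ROS message types; the names and fields below are assumptions for illustration only:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Pose { double x, y, theta; };

// robot -> central computer (the largest message in the system)
struct OccupancyGridMsg {
    std::uint32_t width, height;
    std::vector<std::int8_t> data;  // -1 unknown, 0 free, 100 occupied
};

// central computer -> robot, and robot -> robot down the hierarchy
struct FrontierListMsg {
    std::vector<Pose> frontiers;
};

// robot -> robot; empty when the choice created no relay dependency
struct PassDownPathMsg {
    std::vector<Pose> relayPoses;
};

// robot -> robot -> central computer, propagated back up the hierarchy
struct CompletionMsg {
    bool done;
};
```

The size asymmetry is visible directly in the types: only OccupancyGridMsg scales with the map, while the hierarchy traffic is short pose arrays and a boolean.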
There are two main communication constraints assumed by this system. First, each robot serves as a WiFi hotspot to send and receive data, either to the central computer or to other robots. The WiFi range of each robot is limited, therefore robots can only communicate within a certain range of one another. Small indoor environments generally do not pose a problem, because WiFi has a range of about 100-150 feet indoors, but as the size of the space increases, robots may need to explore areas that are beyond the available range of the onboard WiFi module. This project aims to solve this issue by coordinating the robots as an efficient network relay structure. As the number of robots increases, the available range of exploration also increases, because robots can relay information back to the central computer. A simulated example is shown in figure 3.6.

The second communication constraint is the channel capacity of each robot. Although WiFi has a capacity of about 54 Mbps and does not pose an issue for limited information exchange such as map data, limited channel capacity is a bottleneck for sending large data files such as video. In future work, the second aim of this project is to exploit the network structure established by the robot coordination algorithm in order to provide the user with an option of increased channel capacity from all robots back to the source. The details are explained in the future work section.
Figure 3.6: A simulation of the WiFi range limitation. a) shows a viable structure where robot 1 (blue) and robot 2 (red) are within acceptable range of the WiFi source (star), b) shows a viable structure where robot 1 uses robot 2 as a relay for communication back to the source, and c) shows an invalid structure where robot 1 is not within acceptable range of robot 2 and robot 2 is not within acceptable range of the source.
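The range-feasibility check behind figure 3.6 can be sketched as follows; the geometry type and function names are assumptions. A relay chain robot 1 → robot 2 → … → source is viable only if every consecutive hop is within the WiFi range threshold:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Point { double x, y; };

// True iff every consecutive link in the chain (ordered from the
// farthest robot down to the WiFi source) is within wifiRange.
bool chainViable(const std::vector<Point>& chain, double wifiRange) {
    for (std::size_t i = 1; i < chain.size(); ++i) {
        double dx = chain[i].x - chain[i - 1].x;
        double dy = chain[i].y - chain[i - 1].y;
        if (std::hypot(dx, dy) > wifiRange) return false;  // broken link
    }
    return true;
}
```

Cases b) and c) of the figure differ only in whether every hop passes this check: a single out-of-range link anywhere in the chain invalidates the whole structure.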
Figure 3.1: The full system architecture. Every rectangle represents a node within each agent (separated by dashed lines). The arrows between nodes represent information exchanges, with message types written above the lines.

Chapter 4
Results and Figures
The simulation tests the performance of the architecture presented in chapter 3 in a ROS/Gazebo environment using two different maps: a house structure shown in figure 4.1 and an office environment shown in figure 4.2. All simulated robots are based on the TurtleBot3 Waffle Pi from ROBOTIS in the Gazebo simulation environment. The simulation was performed using a desktop with an Intel Core i5 processor and an NVIDIA GeForce RTX 2070 GPU running on a Linux Ubuntu operating system. All the code is written in C++.

The SLAM nodes are taken from the gmapping package from the ROS navigation stack. Each robot contains a gmapping node and all parameters remain consistent for all robots. In addition, the simulation uses the multirobot map merge package from the ROS navigation stack in order to perform map merging on the central computer. Finally, the robots use the move base package from the ROS navigation stack for navigation and path planning.

The simulation runs for cases of 1, 2, 3, and 4 robots and varies the WiFi range parameter from 2m to 5m in 1m increments. The simulated WiFi range is not an accurate representation of actual WiFi range capabilities, but is diminished in order to test the capabilities of the algorithm on smaller maps. The assumption is that the results here will remain at least fairly consistent with accurate real-life WiFi range variations. Additionally, the simulation collects data on the number of iterations of hierarchical choosing operations. An iteration is defined as the central computer extracting frontier points from the merged map, sending the list of frontier points down the robot hierarchy, each robot choosing a goal pose, and each robot completing its respective task. The simulation also notes the percentage of the map that the robot system completes for each category.
The map percentage completion is based on

P = \frac{m_{partial}}{m_{complete}} \times 100

where P is the percent completion, m_partial is the number of known occupancy grid map values within the map completed by the robots, and m_complete is the number of known occupancy grid map values for a completed map. The full table of results for each respective map is shown in tables 4.1 and 4.2.

Figure 4.1: The house map used for the first simulation

Figure 4.2: The office map used for the second simulation

4.2 Tables and Figures

Table 4.1: Simulation data for house map
Number of Robots   Number of Iterations   Map Completion Percentage   Simulated WiFi Range
1                  5                      15.38                       2m
1                  6                      21.32                       3m
1                  10                     26.10                       4m
1                  16                     58.21                       5m
2                  10                     50.23                       2m
2                  7                      51.12                       3m
2                  14                     64.87                       4m
2                  11                     69.44                       5m
3                  7                      61.97                       2m
3                  9                      71.90                       3m
3                  7                      96.17                       4m
3                  18                     100.00                      5m
4                  10                     64.10                       2m
4                  11                     90.68                       3m
4                  15                     100.00                      4m
4                  12                     100.00                      5m
Table 4.2: Simulation data for office map

Number of Robots   Number of Iterations   Map Completion Percentage   Simulated WiFi Range
1                  4                      11.32                       2m
1                  10                     16.94                       3m
1                  14                     41.87                       4m
1                  6                      30.67                       5m
2                  8                      36.58                       2m
2                  12                     50.59                       3m
2                  16                     69.02                       4m
2                  23                     73.81                       5m
3                  14                     49.96                       2m
3                  14                     71.45                       3m
3                  29                     87.79                       4m
3                  19                     100.00                      5m
4                  12                     65.60                       2m
4                  19                     100.00                      3m
4                  27                     100.00                      4m
4                  19                     100.00                      5m

Figure 4.3: Plots for the house map. a) shows the map completion percentage vs. number of robots averaged over the WiFi ranges, b) shows the map completion percentage vs. number of robots for each of the WiFi ranges, c) shows the number of iterations vs. number of robots averaged over all WiFi ranges, and d) shows the number of iterations vs. number of robots for each WiFi range.
Figure 4.4: Plots for the office map. a) shows the map completion percentage vs. number of robots averaged over the WiFi ranges, b) shows the map completion percentage vs. number of robots for each of the WiFi ranges, c) shows the number of iterations vs. number of robots averaged over all WiFi ranges, and d) shows the number of iterations vs. number of robots for each WiFi range.
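The map completion percentage P reported in the tables and plots can be computed from occupancy grids roughly as follows, assuming ROS-style grids where -1 marks unknown cells (this sketch is my reconstruction of the metric, not the thesis code):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Count the known cells (anything other than the "unknown" value -1)
// in the robots' merged map and divide by the known-cell count of a
// fully explored reference map, scaled to a percentage.
double completionPercent(const std::vector<std::int8_t>& partialMap,
                         const std::vector<std::int8_t>& completeMap) {
    std::size_t mPartial = 0, mComplete = 0;
    for (std::int8_t cell : partialMap)
        if (cell != -1) ++mPartial;    // known cell in the robots' map
    for (std::int8_t cell : completeMap)
        if (cell != -1) ++mComplete;   // known cell in the reference map
    if (mComplete == 0) return 0.0;
    return 100.0 * static_cast<double>(mPartial)
                 / static_cast<double>(mComplete);
}
```

For example, a run that has resolved half of the reference map's known cells reports P = 50.00, the same scale used in tables 4.1 and 4.2.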
The performance between the two maps is fairly consistent. The map completion percentage shows a steady increase as the number of robots increases for all WiFi ranges, based on figures 4.3b and 4.4b. Additionally, the average map completion over all WiFi ranges shows a steadily increasing behavior, plateauing as the map completion percentage reaches 100, based on figures 4.3a and 4.4a. The question is whether the improved performance in map exploration outweighs the larger number of iterations with more robots in the system. The raw data shows that the number of iterations per simulation does not increase at a constant rate, based on figures 4.3d and 4.4d; however, the number of iterations averaged over WiFi range generally holds an upward trend with the number of robots, based on figures 4.3c and 4.4c. As robots branch out to explore farther reaches of the map they rely heavily on the backup choice procedure, which in many instances only permits one robot to explore, with incremental advances at each iteration. This likely accounts for the heavy increase in iterations as the number of robots increases, since more robots can branch out farther to explore the map.

Due to limited computing capacity, a maximum of 4 robots were included in the simulation, which provides limited data on the scalability of this architecture. In addition, as the number of robots increased, the SLAM and map merging algorithms lost performance due to computational load. This may affect the outcome of the data, as the 3 and 4 robot systems relied on a less accurate map.

Chapter 5
Conclusion and Future Work
The goal of this project is to develop a multi-robot mapping architecture that takes into account communication limitations and adheres to three main criteria: efficiency, scalability, and robustness.

The results of chapter 4 show that the performance significantly increases as the number of robots increases. The percentage of map completion consistently increases as more robots are added to the system, and the number of coordination iterations does not increase at a constant rate. In addition, the communication amongst robots is efficient. The robots send messages no larger than an occupancy grid map at each iteration; all other messages are small arrays and booleans. Finally, robots distribute tasks efficiently through the hierarchy. This includes information processing (SLAM algorithms, navigation, choice algorithms, frontier point extraction, and map merging) as well as frontier point allocation (choosing optimal frontier points to explore the map).

The architecture can theoretically scale to large numbers of robots; however, due to limited computation capacity only 4 robots were tested. Based on the results of chapter 4, the robots complete more of the mapping exploration as the number of agents increases, and the communication burden does not significantly increase. This theoretically points to an ability to scale this architecture to many agents, but the data from this paper is not sufficient to prove this claim.

The system is robust to navigation challenges, including unaccounted objects in the static map, but still has room to improve for communication errors, mapping errors, and mechanical/electrical failures. Each agent updates its navigation path based on unforeseen objects in the environment (such as other robots appearing in the calculated path) using a local planning algorithm. This ensures that robots will consistently reach their chosen goal poses.
However, the system does not adapt to local failures, including communication delays/drops, mechanical/electrical failures, or mapping errors. In the future work section I describe how the architecture can be improved to account for dynamic and unpredictable challenges in the environment.
Future Work
I completed this project within a few months and unfortunately did not have the opportunity to implement some modules along the way. Here I will outline some of the next steps for this project.

The system did not adapt well to system failures, including individual robot failures, communication drops, and software hiccups, so a next step is to implement coordinated backup features such that the system becomes less of a reactive architecture and more of a deliberative architecture. These backup features may reorganize the hierarchy if one robot fails, or coordinate the agents around communication failures.

In addition, the results and data collection are very limited. The next steps are to simulate the system with many more agents and extend the testing to real hardware. Real hardware can include both homogeneous and heterogeneous robots.

Because the algorithms developed in this paper are very general, I propose to extend the architecture to other high-level search tasks. This may include multi-robot search and rescue, multi-robot person following, or other user-defined tasks.

Lastly, the communication amongst the robot network can be optimized. I have outlined some of the communication limitations with regard to channel capacity for large data sharing. Because this architecture arranges the robots in viable "hotspot" relays for communication sharing at every iteration, it is possible to propagate information through the network using a max flow algorithm such as the Ford-Fulkerson algorithm. This may increase channel capacity for each robot and provide the user with the option to send large data files amongst the robots or back to the source.
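As a sketch of the proposed max flow idea (not part of the implemented system), an Edmonds-Karp variant of Ford-Fulkerson could estimate how much data per unit time a hypothetical relay network carries from a distant robot back to the source; the nodes, edges, and capacities below are illustrative:

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <queue>
#include <vector>

// Edmonds-Karp: repeatedly find a shortest augmenting path by BFS in
// the residual graph and push the bottleneck capacity along it.
struct FlowNetwork {
    int n;
    std::vector<std::vector<int>> cap;  // residual capacities
    explicit FlowNetwork(int nodes)
        : n(nodes), cap(nodes, std::vector<int>(nodes, 0)) {}
    void addEdge(int u, int v, int c) { cap[u][v] += c; }

    int maxFlow(int s, int t) {
        int flow = 0;
        while (true) {
            std::vector<int> parent(n, -1);
            parent[s] = s;
            std::queue<int> q;
            q.push(s);
            while (!q.empty() && parent[t] == -1) {
                int u = q.front(); q.pop();
                for (int v = 0; v < n; ++v)
                    if (parent[v] == -1 && cap[u][v] > 0) {
                        parent[v] = u;
                        q.push(v);
                    }
            }
            if (parent[t] == -1) break;      // no augmenting path remains
            int aug = INT_MAX;               // bottleneck along the path
            for (int v = t; v != s; v = parent[v])
                aug = std::min(aug, cap[parent[v]][v]);
            for (int v = t; v != s; v = parent[v]) {
                cap[parent[v]][v] -= aug;    // consume forward capacity
                cap[v][parent[v]] += aug;    // allow flow to be undone
            }
            flow += aug;
        }
        return flow;
    }
};
```

In the relay setting, nodes would be the distant robot, the intermediate relays, and the source, with edge capacities set by per-link WiFi bandwidth; the max flow value bounds the aggregate throughput the relay structure can offer.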
Vita

Henry Fielding Cappel was born on November 25, 1994 in Oak Park, Illinois. He attended school in the Chicago area through his high school education and moved to Pennsylvania for his undergraduate education. He received a bachelor of arts in psychology with a minor in physics from Swarthmore College, located in Swarthmore, Pennsylvania. After completing his bachelor's degree he decided that his true passion was in physics related to building robots.

Henry spent a year teaching robotics to children in the Chicago area, traveling to various schools to give lessons for an after-school robotics program. Although he loved to teach, his passion was to learn the most challenging robotics-related problems and know how to build a complex robot from scratch.

He decided to pursue his own degree from the University of Texas at Austin. He is receiving his master of science in engineering in the field of aerospace engineering with a focus on robotics and control systems.

Permanent Address: [email protected]

This thesis was typeset with LaTeX 2e by the author. LaTeX 2e is an extension of LaTeX. LaTeX is a collection of macros for TeX. TeX is a trademark of the American Mathematical Society. The macros used in formatting this thesis were written by Dinesh Das, Department of Computer Sciences, The University of Texas at Austin, and extended by Bert Kay, James A. Bednar, and Ayman El-Khashab.