Improving Continuous-time Conflict Based Search

Abstract

Conflict-Based Search (CBS) is a powerful algorithmic framework for optimally solving classical multi-agent path finding (MAPF) problems, where time is discretized into time steps. Continuous-time CBS (CCBS) is a recently proposed version of CBS that guarantees optimal solutions without the need to discretize time. However, the scalability of CCBS is limited because it does not include any known improvements of CBS. In this paper, we begin to close this gap and explore how to adapt successful CBS improvements, namely, prioritizing conflicts (PC), disjoint splitting (DS), and high-level heuristics, to the continuous-time setting of CCBS. These adaptations are not trivial, and require careful handling of different types of constraints, applying a generalized version of the Safe Interval Path Planning (SIPP) algorithm, and extending the notion of cardinal conflicts. We evaluate the effect of the suggested enhancements by running experiments both on general graphs and on $2^k$-neighborhood grids. CCBS with these improvements significantly outperforms vanilla CCBS, solving problems with almost twice as many agents in some cases and pushing the limits of multi-agent path finding in continuous-time domains.


Improving Continuous-time Conflict Based Search*

Anton Andreychuk (1, 2), Konstantin Yakovlev (2, 3), Eli Boyarski, Roni Stern (4, 5)

(1) Peoples' Friendship University of Russia (RUDN University)
(2) Federal Research Center for Computer Science and Control of Russian Academy of Sciences
(3) HSE University
(4) Ben-Gurion University of the Negev
(5) Palo Alto Research

Introduction

Multi-Agent Pathfinding (MAPF) is the problem of finding paths for $n$ agents in a graph such that each agent reaches its goal vertex and the agents do not collide with each other while moving along these paths. Many real-world applications require solving variants of MAPF, including managing aircraft-towing vehicles (Morris et al. 2016), video game characters (Silver 2005), office robots (Veloso et al. 2015), and warehouse robots (Wurman, D'Andrea, and Mountz 2007). Solving MAPF optimally (for common objective functions) is NP-Hard (Surynek 2010; Yu and LaValle 2013), but modern optimal MAPF algorithms can scale to problems with over a hundred agents (Sharon et al. 2015; Boyarski et al. 2015; Felner et al. 2018; Lam et al. 2019; Gange, Harabor, and Stuckey 2019; Surynek et al. 2016).

However, such scaling was shown mostly on the classical version of the MAPF problem (Stern et al. 2019), which embodies several simplifying assumptions, such as that all actions have the same duration and that time is discretized into time steps. MAPF$_R$ (Walker, Sturtevant, and Felner 2018) is a generalization of the classical MAPF problem in which actions' durations can be non-uniform, agents have geometric shapes that must be considered, and time is continuous. Handling continuous time is challenging because it implies an agent may wait in a location for an arbitrary amount of time, i.e., the number of wait actions is infinite.

*This is a pre-print of the paper accepted to AAAI 2021.

Several recently proposed algorithms address the MAPF$_R$ problem or its variants, such as Extended ICTS (E-ICTS) (Walker, Sturtevant, and Felner 2018), CBS with Continuous Time-steps (CBS-CT) (Cohen et al. 2019), and Continuous-time conflict-based search (CCBS) (Andreychuk et al. 2019). In this work, we propose several improvements to CCBS that allow it to solve MAPF$_R$ problems with significantly more agents. CCBS is based on the Conflict-based search (CBS) algorithm for classical MAPF, and the improvements we propose for CCBS are based on known improvements of CBS, namely Disjoint Splitting (DS), Prioritizing Conflicts (PC), and high-level heuristics. Adapting the DS technique to the continuous-time setting of MAPF$_R$ requires solving a single-agent pathfinding problem with temporally-constrained action landmarks (Karpas and Domshlak 2009). We show how to efficiently solve this pathfinding problem in our context by applying a generalized version of the SIPP algorithm (Phillips and Likhachev 2011). A naive application of PC to CCBS is shown to be ineffective, and we propose an adapted version of PC that can cut the number of expanded nodes significantly. The third CCBS improvement we propose is an admissible heuristic function for CCBS that requires only a negligible amount of overhead when applied together with the PC technique.

Finally, we evaluate the impact of these improvements individually and collectively on several benchmarks, including both roadmaps and grids. The results show that the number of MAPF$_R$ instances solved by CCBS with all the proposed improvements, compared to vanilla CCBS, increased by 49.2%, from 3,792 to 5,659. In some cases, it can even solve problems with approximately twice the number of agents compared to vanilla CCBS and reduce the runtime by up to two orders of magnitude.

Background and Problem Statement

In a MAPF$_R$ problem (Walker, Sturtevant, and Felner 2018), the agents are confined to a weighted graph $G = (V, E)$ whose vertices ($V$) correspond to locations in some metric space, e.g., a Euclidean space, and whose edges ($E$) correspond to possible transitions between these locations. Each agent $i$ is initially located at vertex $s_i \in V$ and aims to reach vertex $g_i \in V$. When at a vertex, an agent can either perform a move action or a wait action. A move action means moving the agent along an edge. We assume that the agent moves at a constant velocity and that inertial effects are neglected. The duration of a move action is the weight of its respective edge. A wait action means the agent stays in its current location for some duration. The duration of a wait action can be any positive real value. Since we do not discretize time, the set of possible wait actions is uncountable.

A timed action is a pair $(a_i, t_i)$ representing that action $a_i$ (either move or wait) starts at time $t_i$. A plan for an agent is a sequence of timed actions such that executing this sequence moves the agent from its initial location to its goal location. The cost of a plan is the sum of the durations of its constituent actions. We assume that after finishing the plan the agent does not disappear but rather stays at the last vertex forever, but this “dummy” wait action does not add to the cost of the plan.

The plans of two agents are said to be conflict-free if the agents following them never collide, i.e., their shapes never overlap. A joint plan is a set of plans, one for each agent. A solution to a MAPF$_R$ problem is a joint plan whose constituent plans are pairwise conflict-free. The cost of a solution is its sum of costs (SOC), i.e., the sum of the costs of its constituent plans. In this work, we are interested in solving MAPF$_R$ problems optimally, i.e., finding a solution with minimal cost. CCBS (Andreychuk et al. 2019) is a CBS-based algorithm that does so.
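The cost definitions above can be made concrete with a small sketch. The `TimedAction` class, its field names, and the two example plans are illustrative assumptions and are not taken from the paper; the sketch only shows that a plan's cost is the sum of its action durations and that the SOC sums those costs over agents.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class TimedAction:
    """Hypothetical container for a timed action (a, t) plus its duration."""
    kind: str        # "move" or "wait"
    source: str      # vertex where the action starts
    target: str      # equals source for a wait action
    start: float     # time t at which the action starts
    duration: float  # edge weight for a move, any positive real for a wait


def plan_cost(plan: List[TimedAction]) -> float:
    # Cost of a plan: sum of the durations of its actions. The implicit
    # "dummy" wait at the goal is not part of the plan and adds nothing.
    return sum(action.duration for action in plan)


def sum_of_costs(joint_plan: Dict[int, List[TimedAction]]) -> float:
    # Cost of a solution: sum of the costs of the individual plans.
    return sum(plan_cost(plan) for plan in joint_plan.values())


plan_1 = [
    TimedAction("move", "A", "B", 0.0, 1.5),
    TimedAction("wait", "B", "B", 1.5, 0.25),
    TimedAction("move", "B", "C", 1.75, 2.0),
]
plan_2 = [TimedAction("move", "D", "E", 0.0, 3.0)]

print(plan_cost(plan_1))                      # 3.75
print(sum_of_costs({1: plan_1, 2: plan_2}))   # 6.75
```

The sketch does not check that the plans are conflict-free; that requires the geometric collision test, which is outside its scope.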
For completeness, we provide a brief description of CBS and CCBS below.

Conflict-based search (CBS)

CBS (Sharon et al. 2015) is a complete and optimal algorithm for solving classical MAPF problems, i.e., MAPF problems where time is discretized and all actions have the same duration. CBS works by finding plans for each agent separately, detecting conflicts between these plans, and resolving them by replanning for the individual agents subject to specific constraints. A CBS conflict is defined by a tuple $(i, j, x, t)$ stating that agents $i$ and $j$ have a conflict in location $x$ (either a vertex or an edge) at time $t$. A CBS constraint is defined by a tuple $(i, x, t)$, which states that agent $i$ cannot occupy $x$ at time $t$. To resolve a conflict $(i, j, x, t)$, CBS replans for agent $i$ or $j$ or both, subject to CBS constraints $(i, x, t)$ and $(j, x, t)$, respectively. To guarantee completeness and optimality, CBS runs two search algorithms: a low-level search algorithm that finds paths for individual agents subject to a given set of constraints, and a high-level search algorithm that chooses which constraints to impose and which conflicts to resolve.

CBS: Low-Level Search.

In the basic CBS implementation, the low-level search is a search in the state space of vertex-time pairs. Expanding a state $(v, t)$ generates states of the form $(v', t + 1)$, where $v'$ is either equal to $v$, representing a wait action, or equal to one of the locations adjacent to $v$. States generated by actions that violate the given set of CBS constraints are pruned. CBS runs A* on this search space to return the lowest-cost path to the agent's goal that is consistent with the given set of CBS constraints, as required.

Footnote: The assumption that an agent stays at its last vertex forever after finishing its plan is common in the MAPF literature.

CBS: High-Level Search.

The CBS high-level search is a search in a binary tree called the Constraint Tree (CT). In the CT, each node $N$ represents a set of CBS constraints $N.\mathit{constraints}$ and a joint plan $N.\Pi$ that is consistent with these constraints. Generating a node $N$ involves setting its constraints $N.\mathit{constraints}$ and running the low-level search to create $N.\Pi$. If $N.\Pi$ does not contain any conflict, then $N$ is a goal. Expanding a non-goal node $N$ involves choosing a conflict $(i, j, x, t)$ in $N.\Pi$ and generating two child nodes $N_i$ and $N_j$. Both nodes have the same set of constraints as $N$, plus a new CBS constraint: $(i, x, t)$ for $N_i$ and $(j, x, t)$ for $N_j$. This type of node expansion is referred to as splitting node $N$ over conflict $(i, j, x, t)$. The high-level search finds a goal node by searching the CT in a best-first manner, expanding in every iteration the CT node $N$ with the lowest-cost joint plan.

Continuous-Time Conflict Based Search (CCBS)
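As a point of comparison for the continuous-time variant, the discrete CBS split described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: paths are lists of vertex names indexed by time step, only vertex conflicts are detected (edge conflicts are omitted), and the function names are hypothetical rather than taken from any CBS implementation.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

Conflict = Tuple[int, int, str, int]   # (i, j, x, t)
Constraint = Tuple[int, str, int]      # (i, x, t)


def position(path: List[str], t: int) -> str:
    # After finishing its plan, an agent stays at its last vertex.
    return path[min(t, len(path) - 1)]


def first_vertex_conflict(paths: Dict[int, List[str]]) -> Optional[Conflict]:
    """Return the earliest vertex conflict in a joint plan, or None."""
    agents = sorted(paths)
    horizon = max(len(p) for p in paths.values())
    for t in range(horizon):
        for idx, i in enumerate(agents):
            for j in agents[idx + 1:]:
                if position(paths[i], t) == position(paths[j], t):
                    return (i, j, position(paths[i], t), t)
    return None


def split(constraints: FrozenSet[Constraint], conflict: Conflict):
    """Constraint sets of the two children N_i and N_j of a CT node."""
    i, j, x, t = conflict
    return constraints | {(i, x, t)}, constraints | {(j, x, t)}


paths = {1: ["A", "B", "C"], 2: ["D", "B", "E"]}
conflict = first_vertex_conflict(paths)
child_i, child_j = split(frozenset(), conflict)
print(conflict)           # (1, 2, 'B', 1)
print(child_i, child_j)
```

A full high-level search would additionally re-plan the constrained agent in each child and keep the CT nodes in a priority queue ordered by joint-plan cost.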

To consider continuous time, CCBS (Andreychuk et al. 2019) reasons over time intervals, detects conflicts between timed actions, and resolves conflicts by imposing constraints that specify the time intervals in which the conflicting timed actions can be moved to avoid the conflict. Formally, a CCBS conflict is a tuple $(a_i, t_i, a_j, t_j)$, specifying that the timed action $(a_i, t_i)$ of agent $i$ has a conflict with the timed action $(a_j, t_j)$ of agent $j$. The unsafe interval of timed action $(a_i, t_i)$ w.r.t. the timed action $(a_j, t_j)$, denoted $[t_i, t_i^u)$, is the maximal time interval starting from $t_i$ in which performing $a_i$ creates a conflict with performing $a_j$ at time $t_j$. A CCBS constraint is a tuple $(i, a_i, [t_i, t_i^u))$ specifying that agent $i$ cannot perform action $a_i$ in the time interval $[t_i, t_i^u)$. To resolve a CCBS conflict, CCBS generates two new CT nodes, where it adds the constraint $(i, a_i, [t_i, t_i^u))$ to one node and the constraint $(j, a_j, [t_j, t_j^u))$ to the other.

The low-level planner of CCBS is an adaptation of the SIPP algorithm (Phillips and Likhachev 2011). SIPP was originally designed to find time-optimal paths for an agent moving among dynamic obstacles with known trajectories. SIPP runs a heuristic search in the state space of $(v, [t, t'])$ tuples, where $v$ is a graph vertex and $[t, t']$ is a safe interval of $v$, i.e., a maximal contiguous time interval in which an agent can stay or arrive at $v$ without colliding with a moving obstacle. As numerous obstacles may pass through $v$, there can exist numerous search nodes corresponding to the same graph vertex but different time intervals in the SIPP search tree.

The CCBS low-level search is based on SIPP except for how it handles the given CCBS constraints. Instead of dynamic obstacles, the low-level CCBS computes safe intervals for each vertex $v$ with respect to the CCBS constraints imposed over wait actions at $v$. Initially, vertex $v$ has a single safe interval $[0, \infty)$. Then, for every CCBS constraint $(i, a_i, [t_i, t_i^u))$ where $a_i$ is a wait action at vertex $v$, we split the safe interval for $v$ into one for arriving before $t_i$ and one for arriving after $t_i^u$. CCBS constraints imposed over move actions are integrated into the low-level search by modifying the constrained actions, as follows. Let $v$ and $v'$ be the source and target vertices of $a_i$. If the agent arrives at $v$ at $t \in [t_i, t_i^u)$, then we remove the action that moves it from $v$ to $v'$ at time $t$, and add an action that represents waiting at $v$ until $t_i^u$ and then moving to $v'$.

Disjoint Splitting for CCBS
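The two constraint-handling rules of the CCBS low-level search described above (splitting safe intervals for wait constraints, and delaying constrained moves) can be sketched as follows. The function names and the representation of an unsafe interval as a `(t, t_u)` pair are assumptions made for illustration. The sketch also ignores whether waiting at the source vertex until `t_u` is itself permitted, which a complete low-level search would have to check against the safe intervals of that vertex.

```python
import math
from typing import List, Tuple

Interval = Tuple[float, float]  # half-open unsafe interval [t, t_u)


def safe_intervals(wait_constraints: List[Interval]) -> List[Interval]:
    """Split the initial safe interval [0, inf) of a vertex around the
    unsafe intervals of the wait constraints imposed on that vertex."""
    safe: List[Interval] = []
    start = 0.0
    for t, t_u in sorted(wait_constraints):
        if t > start:
            safe.append((start, t))   # arriving before t is still safe
        start = max(start, t_u)       # next safe interval begins at t_u
    safe.append((start, math.inf))
    return safe


def earliest_move_start(arrival: float, move_constraints: List[Interval]) -> float:
    """Earliest time a constrained move may start: if the agent arrives
    inside an unsafe interval [t, t_u), it waits until t_u and then moves."""
    start = arrival
    for t, t_u in sorted(move_constraints):
        if t <= start < t_u:
            start = t_u
    return start


print(safe_intervals([(2.0, 3.5)]))            # [(0.0, 2.0), (3.5, inf)]
print(earliest_move_start(2.5, [(2.0, 3.5)]))  # 3.5
print(earliest_move_start(1.0, [(2.0, 3.5)]))  # 1.0
```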

The first technique we migrate from CBS to CCBS is called Disjoint Splitting (DS) (Li et al. 2019b). DS is a technique designed to ensure that expanding a CT node $N$ creates a disjoint partition of the space of solutions that satisfy the constraints in $N.\mathit{constraints}$. That is, every solution that satisfies $N.\mathit{constraints}$ is in exactly one of its children. Observe that this is not the case in CBS: for a conflict $(i, j, v, t)$ there may be solutions that satisfy both $(i, v, t)$ and $(j, v, t)$. This introduces an inefficiency in the high-level search.

To address this inefficiency, CBS with DS (CBS-DS) introduces the notion of positive and negative constraints. A negative constraint $(i, x, k)$ is the regular CBS constraint stating that agent $i$ must not be at $x$ at time step $k$. A positive constraint $(i, x, k)$ means that agent $i$ must be at $x$ at time step $k$. When splitting a CT node $N$ over a CBS conflict $(i, j, x, k)$, CBS-DS chooses one of the conflicting agents, say $i$, and generates two child nodes, one with the negative constraint $(i, x, k)$ and the other with the positive constraint $(i, x, k)$. The choice of which agent to split on, $i$ or $j$, does not affect the theoretical properties of the algorithm, and several heuristics were proposed for it (Li et al. 2019b).

The low-level search in CBS-DS treats each positive constraint as a special type of fact landmark (Richter, Helmert, and Westphal 2008), i.e., a fact that must be true in any plan. The CBS-DS low-level search generates a plan that satisfies these fact landmarks by planning to achieve them in ascending order of their time dimension. This effectively decomposes the low-level search into a sequence of simpler search tasks, each searching for a path from one fact landmark to the next. The agent's goal is set as the last fact landmark, to ensure the agent reaches it eventually.

Positive and Negative Constraints in CCBS

A CCBS constraint $(i, a_i, [t_i, t_i^u))$ can be stated formally as follows:

$\forall t \in [t_i, t_i^u) : (a_i, t)$ is not in a plan for agent $i$

This is a negative constraint from a DS perspective. The corresponding positive constraint is therefore the inverse:

$\exists t \in [t_i, t_i^u) : (a_i, t)$ is in a plan for agent $i$

This means that agent $i$ must perform $a_i$ at some moment of time within the given interval. Thus, a positive constraint in CCBS is an action landmark (Karpas and Domshlak 2009), i.e., an action that must be performed in any solution. Next, we show how the low-level search of Continuous-time conflict-based search with disjoint splitting (CCBS-DS) is able to find a plan that achieves all these action landmarks.

Figure 1: An example where performing the action landmark as early as possible leads to a suboptimal plan.

Figure 2: An example where performing the action landmark as early as possible results in failing to find a plan.

Low-Level Search in CCBS-DS
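The negative and positive forms of a CCBS constraint defined above can be checked against a candidate plan directly. The sketch below assumes a plan is a list of `(action, start_time)` pairs and that actions are plain tuples; both representations, and the function names, are hypothetical. It illustrates why the two children of a disjoint split partition the plans of the chosen agent: every plan satisfies exactly one of the two constraints.

```python
from typing import Hashable, List, Tuple

TimedAction = Tuple[Hashable, float]   # (action, start time)
Interval = Tuple[float, float]         # half-open interval [t, t_u)


def satisfies_negative(plan: List[TimedAction], action: Hashable,
                       interval: Interval) -> bool:
    # For all t in [t, t_u): (action, t) is not in the plan.
    lo, hi = interval
    return all(not (a == action and lo <= t < hi) for a, t in plan)


def satisfies_positive(plan: List[TimedAction], action: Hashable,
                       interval: Interval) -> bool:
    # There exists t in [t, t_u) such that (action, t) is in the plan.
    lo, hi = interval
    return any(a == action and lo <= t < hi for a, t in plan)


move_ab = ("move", "A", "B")
early = [(("move", "S", "A"), 0.0), (move_ab, 1.0)]
late = [(("move", "S", "A"), 0.0), (("wait", "A"), 1.0), (move_ab, 4.0)]
unsafe = (0.5, 3.0)

for plan in (early, late):
    neg = satisfies_negative(plan, move_ab, unsafe)
    pos = satisfies_positive(plan, move_ab, unsafe)
    print(neg, pos)   # exactly one of the two is True for each plan
```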

The low-level search in CCBS-DS sorts the positive constraints in ascending order of their time dimension and plans to achieve each of them in that order. For example, assume there is a single positive constraint $(i, \text{move}(A, B), [t_i, t_i^u))$. Then, the low-level search works by (1) searching for a plan from $s_i$ to $A$ that ends in the time range $[t_i, t_i^u)$, then (2) performing the action landmark (i.e., moving from $A$ to $B$), and finally (3) searching for a plan from $B$ to $g_i$ (starting immediately after the action landmark is performed).

However, in CCBS-DS there is an additional challenge for the low-level search: there may be more than one plan to perform each landmark. In our example above, there may be an infinite number of plans from $s_i$ to $A$ that end in the time range $[t_i, t_i^u)$. As we show below, choosing the plan that performs the action landmark earliest does not necessarily lead to finding an optimal solution and might even lead to incompleteness, especially when there are both positive and negative constraints.

Example

Consider the illustration depicted in Figure 1. The low-level search needs to find a plan that satisfies three constraints:
• A positive constraint $(i, \text{move}(A, B), [t_{ab}, t^u_{ab}))$.
• A negative constraint $(i, \text{wait at}(A), [t_{aa}, t^u_{aa}))$.
• A negative constraint $(i, \text{wait at}(B), [t_{bb}, t^u_{bb}))$.
where $t_{ab} < t_{aa} < t^u_{aa} < t^u_{ab}$. Thus, the negative constraint on waiting at $A$ creates two safe intervals for $A$, $I^A_1 = [0, t_{aa}]$ and $I^A_2 = [t^u_{aa}, \infty)$, that overlap the interval of the positive constraint. The negative constraint on waiting at $B$ creates two safe intervals for $B$, $I^B_1 = [0, t_{bb}]$ and $I^B_2 = [t^u_{bb}, \infty)$.

Now assume that there are two plans that satisfy the action landmark for the positive constraint, one that reaches $A$ before $t_{aa}$ (shown in yellow) and one that reaches $A$ after $t^u_{aa}$ (shown in green). Clearly, the lowest-cost plan to achieve the action landmark is the one that reaches $A$ before $t_{aa}$, but to find the optimal solution one must use the second plan. Figure 2 illustrates an even more extreme case, where the lowest-cost plan that achieves the action landmark cannot be extended to a full plan, because it reaches $B$ during its unsafe interval (marked in red).

Algorithm 1: Low-level search for CCBS with DS

Input: Negative constraints $C^{(-)}$
Input: Positive constraints $C^{(+)}$
Input: Agent $i$
$S$ ← ComputeSafeIntervals($C^{(-)}$)
$L$ ← ComputeLandmarks($C^{(+)}$, $S$)
Starts ← {$s_i$}
foreach landmark $l = (i, \text{move}(A, B), [t, t^u))$ in $L$ do
    Goals ← computeGoals($l$)
    Plans ← GSIPP(Starts, Goals)
    Starts ← ∅
    foreach plan in Plans do
        Append move($A$, $B$) to plan
        Add last state in plan to Starts
    Starts ← Prune Plans/Starts if possible
return SIPP(Starts, $g_i$)

Generalized SIPP

There may be infinitely many plans that satisfy an action landmark $l = (i, \text{move}(A, B), [t, t^u))$, i.e., reach $A$ within $[t, t^u)$. Finding only the least-cost plan might lead to incompleteness, as we showed above. To guarantee completeness and optimality we need to find the lowest-cost plan for reaching $A$ for every safe interval of $A$ that overlaps with $[t, t^u)$. Only in this case can we deem that every possibility of performing the action landmark has been explored, which preserves completeness. Optimality is preserved because we find the least-cost plan for reaching $A$ for every safe interval.

To this end, we create a generalized version of SIPP (GSIPP) such that: (1) it accepts a set of goal states, one per safe interval of $A$ that overlaps with $[t, t^u)$, and (2) it outputs a set of plans, one per goal state. To each of these plans, we concatenate the action landmark itself, move($A$, $B$). These plans may end in different safe intervals in $B$, which then become distinct start states when searching for a plan to get from $B$ to the next landmark. Thus, GSIPP accepts a set of start states and a set of goal states and outputs a set of plans, one per goal. It works as follows. First, the open list is initialized with all start states. Then, the search proceeds as in regular SIPP, except that the stopping criterion is either that the open list is exhausted or that all goal nodes have been expanded. The worst-case runtime of both GSIPP and SIPP is the same, corresponding to expanding all states in the (vertex, safe interval) state space defined by SIPP.

Pseudo Code
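The multi-start, multi-goal stopping rule of GSIPP described above can be illustrated with a heavily simplified sketch. It is not SIPP: states are assumed to be opaque (vertex, safe-interval index) pairs with fixed transition costs given in a dictionary, there is no heuristic, and arrival-time-dependent waiting is not modelled. The function name `multi_goal_search` and the data layout are assumptions. What the sketch does show is the open list seeded with all starts and the search ending when the open list is exhausted or every goal has been expanded.

```python
import heapq
import itertools
import math
from typing import Dict, Hashable, List, Set, Tuple

State = Hashable


def multi_goal_search(successors: Dict[State, List[Tuple[State, float]]],
                      starts: Dict[State, float],
                      goals: Set[State]) -> Dict[State, Tuple[float, List[State]]]:
    """Best-first search from several starts to several goals.
    Returns, for each reachable goal, (cost, path)."""
    tie = itertools.count()                 # tie-breaker for the heap
    best = dict(starts)                     # best known cost per state
    parent = {s: None for s in starts}
    open_list = [(g, next(tie), s) for s, g in starts.items()]
    heapq.heapify(open_list)
    remaining = set(goals)
    closed = set()
    # Stop when the open list is exhausted or all goals are expanded.
    while open_list and remaining:
        g, _, state = heapq.heappop(open_list)
        if state in closed:
            continue                        # stale heap entry
        closed.add(state)
        remaining.discard(state)
        for nxt, cost in successors.get(state, []):
            new_g = g + cost
            if new_g < best.get(nxt, math.inf):
                best[nxt] = new_g
                parent[nxt] = state
                heapq.heappush(open_list, (new_g, next(tie), nxt))
    plans = {}
    for goal in goals:
        if goal in closed:
            path, state = [], goal
            while state is not None:
                path.append(state)
                state = parent[state]
            plans[goal] = (best[goal], path[::-1])
    return plans


# Two goal states for vertex A, one per safe interval (indices 0 and 1).
succ = {
    ("S", 0): [(("A", 0), 1.0), (("X", 0), 0.5)],
    ("X", 0): [(("A", 1), 3.0)],
}
plans = multi_goal_search(succ, {("S", 0): 0.0}, {("A", 0), ("A", 1)})
print(plans[("A", 0)])   # (1.0, [('S', 0), ('A', 0)])
print(plans[("A", 1)])   # (3.5, [('S', 0), ('X', 0), ('A', 1)])
```

Returning one plan per goal, rather than only the cheapest, is what allows the later plan to be kept when the earlier one cannot be extended.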

Finally, we can describe the pseudo-code for the CCBS-DS low-level search. It accepts a list of negative and positive constraints for an agent $i$. Initially, the low-level search computes the safe intervals of every vertex based on the negative constraints (Line 1). Then, it computes the action landmarks based on the positive constraints (Line 2). These landmarks are sorted by time, and then it iterates over these landmarks (Line 3). For each action landmark $l = (i, \text{move}(A, B), [t, t^u))$, it computes the safe intervals of $A$ that intersect with $[t, t^u)$ (Line 5). Every such safe interval is considered a goal for GSIPP. When all such goals are added, we run GSIPP to find a set of plans, one per goal (Line 6). Then, for each found plan we concatenate the action move($A$, $B$) to its end (Lines 9-10). If $B$ is not reachable within a safe interval then the plan is discarded. If two or more concatenated plans safely reach $B$ in the same interval $I^{safe}_k$, we prune such plans, leaving only the one that reaches this interval earliest (Line 11). A node $(B, I^{safe}_k)$ now becomes one of the start nodes for the subsequent search and is added to Starts.

Note that the number of plans satisfying each landmark $l$ is proportional to the number of negative constraints over the wait actions for the target vertex of $l$. Consequently, if no such constraints exist then only one plan to this landmark will be present after pruning (no matter with how many different start and goal nodes the search was initialized). In general, in the process of the iterative invocation of the modified SIPP and plan pruning, numerous plans constructed so far might eventually collapse to a single one. This definitely happens when the planning to the goal is carried out. The reason is that the goal is defined by a single graph vertex and a single time interval ending with $\infty$, as we assume that the agent arrives at its goal and stays there forever. Thus, even if numerous plans to the preceding landmark were found, they all collapse into a single one, i.e., the one that achieves the goal at the earliest possible time, which is what CCBS requires.

Prioritizing Conflicts

Prioritizing Conflicts (PC) (Boyarski et al. 2015) is the second CBS enhancement we migrate to CCBS. PC is a heuristic for choosing which conflict to resolve when expanding a CT node. Different ways of choosing conflicts often lead in practice to CTs of different sizes, and thus have a significant effect on the overall runtime. PC systematically prioritizes conflicts by classifying each conflict as cardinal, semi-cardinal, or non-cardinal. A conflict is called cardinal iff splitting a CT node $N$ over it results in two child nodes whose cost is higher than the cost of $N$. A conflict is semi-cardinal iff the cost of only one child increases while the cost of the other does not. A conflict that is neither cardinal nor semi-cardinal is non-cardinal. CBS with PC prefers cardinal conflicts to semi-cardinal ones and semi-cardinal conflicts to non-cardinal ones. This way of prioritizing conflicts results in a significant reduction in the number of expanded CT nodes compared to vanilla CBS and makes the algorithm much faster in practice.

In MAPF$_R$, most conflicts are cardinal, i.e., the agents involved in these conflicts are not able to find paths that respect the corresponding constraints and have the same cost as before. This is because the ability to perform wait actions of arbitrary duration, paired with non-uniform move action durations, reduces symmetries. By “symmetry” here we mean having multiple shortest paths that have exactly the same cost. Thus, differentiating the conflicts based just on their cardinality type is insufficient.

To this end, we propose a generalized version of PC that introduces a finer-grained prioritization of conflicts, by introducing the notion of cost impact. Intuitively, the cost impact of a conflict is how much the cost of the solution is increased when it is resolved. More formally, for a CT node $N$ with a CCBS conflict $Con = (a_i, t_i, a_j, t_j)$, let $N_i$ and $N_j$ be the CCBS nodes obtained by splitting over this conflict, and let $\delta_i$ be the difference between the cost of $N$ and $N_i$ (and $\delta_j$ likewise for $N_j$). We define the cost impact of the conflict $Con$, denoted $\Delta(Con)$, as $\min(\delta_i, \delta_j)$. Our adaptation of PC to CCBS chooses to split a CT node on the conflict with the largest cost impact. This follows the same rationale as PC, as we prioritize the resolution of conflicts that will reveal the highest unavoidable cost that was so far hidden in conflicts.

Heuristics for High-Level Search
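The cost impact defined above, and the rule of splitting on the conflict with the largest cost impact, can be sketched as follows. The sketch assumes the costs of both child nodes are already known for every conflict (in practice they come from running the low-level search); the function names and the example numbers are made up for illustration.

```python
from typing import Dict, Hashable, Tuple


def cost_impact(node_cost: float, child_cost_i: float, child_cost_j: float) -> float:
    """Cost impact of a conflict: min(delta_i, delta_j), where delta is the
    cost increase of a child node over its parent."""
    delta_i = child_cost_i - node_cost
    delta_j = child_cost_j - node_cost
    return min(delta_i, delta_j)


def choose_conflict(node_cost: float,
                    child_costs: Dict[Hashable, Tuple[float, float]]) -> Hashable:
    """Pick the conflict with the largest cost impact.
    child_costs maps a conflict id to (cost of N_i, cost of N_j)."""
    return max(child_costs,
               key=lambda c: cost_impact(node_cost, *child_costs[c]))


node_cost = 10.0
child_costs = {
    "c1": (10.5, 12.0),   # impact 0.5
    "c2": (11.0, 11.25),  # impact 1.0
    "c3": (10.0, 15.0),   # impact 0.0 (semi-cardinal in PC terms)
}
print(choose_conflict(node_cost, child_costs))   # c2
```

Note that `c3` has the largest single-child increase but the smallest cost impact, because one of its children does not increase the cost at all.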

To guarantee optimality, the high-level search in CBS explores the CT in a best-first fashion. Felner et al. (2018) and Li et al. (2019a) introduced admissible heuristics to the CBS high-level search. These heuristics estimate the difference in cost between a CT node and the optimal solution. Both heuristics are admissible, i.e., they are a lower bound on the actual cost difference, and therefore can be safely added to the cost of a CT node when choosing which node to expand next. Indeed, these heuristics were shown to significantly decrease the number of expanded CT nodes and improve the performance of CBS.

Drawing from these works we suggest two admissible heuristics for CCBS. The first admissible heuristic, denoted $H_1$, is based on solving the following linear programming problem (LPP). This LPP has $n$ non-negative variables $x_1, \ldots, x_n$, one for each agent. Each conflict $Con_{i,j}$ between agents $i$ and $j$ in the CT node for which we are computing the heuristic introduces the LPP constraint $x_i + x_j \geq \Delta(Con_{i,j})$. The objective to be minimized is $\sum_{i=1}^{n} x_i$. By construction, the minimized value of $\sum_{i=1}^{n} x_i$ is an admissible heuristic, since for every conflict $Con_{i,j}$ the solution cost is increased by at least $\Delta(Con_{i,j})$.

The second admissible heuristic we propose, denoted $H_2$, follows the approach suggested in (Felner et al. 2018). There, the heuristic was based on identifying disjoint cardinal conflicts, which are cardinal conflicts between disjoint pairs of agents. As discussed above, in CCBS most conflicts are cardinal but their cost impact can vary greatly. Therefore, in the $H_2$ heuristic we aim to choose the disjoint cardinal conflicts that would have the largest cost impact. We do so in a greedy manner, sorting the conflicts in $N.\Pi$ in descending order of their cost impact. Then, conflicts are picked one by one in this order. After a conflict is picked, we remove from the conflict list all conflicts that involve any of the agents in this conflict. This continues until all the conflicts are either picked or removed. The $H_2$ heuristic is the sum of the cost impacts of the chosen conflicts. By construction the chosen conflicts are disjoint and so $H_2$ is admissible. While $H_2$ is less informed than $H_1$ (the one computed by solving the LPP), it is faster to compute. We observed experimentally that the practical difference between these heuristics was negligible: an average difference of 1%. We conjecture that the reason $H_1$ and $H_2$ perform similarly is that often the conflict graph consists of disjoint pairs of connected agents, in which case the minimum vertex cover ($H_1$) would also be found by the simple greedy approach ($H_2$). In our experiments described below we used $H_2$ and refer to it as H.

Footnote: We also experimented with $\Delta(Con) = \max(\delta_i, \delta_j)$ and $\Delta(Con) = \delta_i + \delta_j$, but the effect on performance was minimal.

Empirical Evaluation
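The greedy heuristic described above (sort conflicts by cost impact, pick one, discard every other conflict that shares an agent with it, repeat) can be sketched in a few lines. The input format, a list of `(agent_i, agent_j, cost_impact)` triples, and the function name are assumptions made for illustration.

```python
from typing import List, Tuple

ConflictImpact = Tuple[int, int, float]   # (agent i, agent j, cost impact)


def greedy_disjoint_heuristic(conflicts: List[ConflictImpact]) -> float:
    """Sum of cost impacts over a greedily chosen set of conflicts between
    pairwise-disjoint pairs of agents."""
    used_agents = set()
    h_value = 0.0
    # Visit conflicts in descending order of cost impact.
    for i, j, impact in sorted(conflicts, key=lambda c: c[2], reverse=True):
        if i in used_agents or j in used_agents:
            continue                      # removed: shares an agent with a pick
        used_agents.update((i, j))
        h_value += impact
    return h_value


conflicts = [(1, 2, 3.0), (2, 3, 2.5), (3, 4, 1.0), (1, 4, 0.5)]
# Picks (1, 2) and (3, 4); (2, 3) and (1, 4) are removed.
print(greedy_disjoint_heuristic(conflicts))   # 4.0
```

Because each chosen pair of agents is disjoint from the others, the individual lower bounds can be added without double counting, which is the reason the sum remains admissible.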

We have incorporated all the CCBS enhancements described so far and evaluated different versions of CCBS in different MAPF$_R$ scenarios involving general graphs (roadmaps) and grids. Specifically, we evaluated the basic CCBS, CCBS with PC (CCBS +PC), CCBS with DS (CCBS +DS), CCBS with both DS and PC (CCBS +DS +PC), and CCBS with all the improvements (CCBS +DS +PC +H). In the conducted experiments all agents were assumed to be disk-shaped with radius equal to √ / .

In each run of the evaluated algorithm, we recorded the runtime, the number of expanded CT nodes, and whether the algorithm was able to find a solution under a time limit of 30 seconds. We chose this specific limit to demonstrate near real-time performance. Moreover, in preliminary experiments with different time limits (from 1s to 300s) we observed that the difference in performance of CCBS with a 30s time limit and a 300s time limit is not significant.

Implementation Details

Conflict detection in MAPF$_R$ is more involved than in classical MAPF and is more computationally intensive. To compensate for that we have implemented the following approach to cache the intermediate conflict detection results and speed up the search. We detect all the conflicts in the root CT node and store them with the node. After choosing a conflict and performing a split we copy all the conflicts to a successor node except the ones involving the agent that has to re-plan its path. After such re-planning, newly introduced conflicts (if any) are added to the set of conflicts for that CT node. Indeed, this leads to a memory overhead, which in our experiments varied from 15% to 250%, depending on how many conflicts were discovered.

To compute the cost impacts of the conflicts for versions of CCBS that use PC or the high-level search heuristic H, we run the low-level search explicitly to resolve these conflicts and acquire the needed cost increase values.

To speed up the low-level search, we pre-compute a set of heuristics, $h_1, \ldots, h_n$, to estimate the cost-to-go to each goal. To compute $h_i$ we run Dijkstra's algorithm with $g_i$ as the source node. Such heuristics are more informative compared to Euclidean distance, but their computational complexity is polynomial in the graph size. However, the runtime needed to compute all heuristics is significantly less than the overall runtime of solving the MAPF$_R$ problem. When DS is used, the low-level search performs multiple searches to achieve the landmarks created by the positive constraints. When searching for the intermediate goals associated with each landmark, we implemented a Differential Heuristic (DH) (Goldenberg et al. 2011) with the pre-computed heuristics $h_1, \ldots, h_n$ as pivots.

Footnote: Our implementation and all the raw results are available at: github.com/PathPlanning/Continuous-CBS.

Figure 3: The performance of CCBS and its variants on the sparse, dense and super-dense roadmaps.

Evaluation on the Roadmaps
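The pre-computed cost-to-go heuristics and the differential heuristic mentioned in the implementation details can be sketched as follows. The graph is assumed to be undirected and stored as an adjacency dictionary, and the function names are hypothetical; the linked repository may organize this differently. Each goal acts as a pivot: one Dijkstra run from the goal gives exact distances to it, and the absolute difference of two such distances is a lower bound on the distance between any two vertices.

```python
import heapq
import math
from typing import Dict, List, Tuple

Graph = Dict[str, List[Tuple[str, float]]]   # vertex -> [(neighbor, weight)]


def cost_to_go(graph: Graph, goal: str) -> Dict[str, float]:
    """Dijkstra from the goal: exact distance from every vertex to the goal
    (valid as written for undirected graphs)."""
    dist = {v: math.inf for v in graph}
    dist[goal] = 0.0
    heap = [(0.0, goal)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue                          # stale heap entry
        for u, w in graph[v]:
            if d + w < dist[u]:
                dist[u] = d + w
                heapq.heappush(heap, (d + w, u))
    return dist


def differential_heuristic(pivot_tables: List[Dict[str, float]],
                           a: str, b: str) -> float:
    """Lower bound on dist(a, b): max over pivots p of |d(p, a) - d(p, b)|."""
    return max(abs(table[a] - table[b]) for table in pivot_tables)


graph: Graph = {
    "A": [("B", 1.0), ("C", 4.0)],
    "B": [("A", 1.0), ("C", 2.0)],
    "C": [("A", 4.0), ("B", 2.0)],
}
h_goal_c = cost_to_go(graph, "C")
h_goal_a = cost_to_go(graph, "A")
print(h_goal_c)                                              # {'A': 3.0, 'B': 2.0, 'C': 0.0}
print(differential_heuristic([h_goal_c, h_goal_a], "A", "B"))  # 1.0
```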

In the ﬁrst set of experiments we have evaluated CCBS on3 different roadmaps, referred to here as sparse , dense and super-dense . The sparse roadmap contains 158 nodes and349 edges, the dense roadmap contains 878 nodes and 7,341edges, and the super-dense roadmap contains 11,342 ver-tices and 263,533 edges. All of these graphs were automati-cally generated by applying a roadmap-generation tool fromthe Open Motion Planning Library (OMPL) (S¸ ucan, Moll,and Kavraki 2012) on the den520d map from the gameDragon Age Origin (DAO). This map is publicly availablein the MovingAI MAPF benchmark (Stern et al. 2019).For each roadmap, 25 different scenarios were generated.Each scenario is a list of start-goal vertices, chosen ran-domly from the graph. Then, we pick the ﬁrst n = 2 start-goal pairs and create a MAPF R instance for n agents. If theevaluated algorithm solves this instance within the 30 sec-onds time limit, we proceed by increasing n by 1 and creat-ing a new MAPF R instance. This is repeated until the evalu-ated algorithm is not able to solve the instance in 30 seconds.We then proceed to the next scenario.The results are shown in Fig.3. Consider ﬁrst the successrate plots (left). The ﬁrst clear trend we observe is that allthe proposed CCBS improvements are signiﬁcantly betterthan the baseline CBS in almost all cases. E.g., on the denseroadmap CCBS +DS +PC +H manages to achieve 0.8 suc- cess rate for the instances with 20 agents, while CCBS suc-cess rate for this number of agents is only 0.1.Next, consider the relative performance of CCBS with dif-ferent combinations of improvements. In general, the mostadvanced version of the algorithm, i.e. CCBS +DS +PC +H,outperforms the competitors on sparse and dense roadmaps.However on the super-dense this is not the case. 
On this roadmap, CCBS+DS+PC+H is dominated by CCBS+DS, which solved instances with up to 25 agents, while the former reached only 20. Indeed, on this roadmap the PC component on its own is ineffective, as can be seen by comparing the basic CCBS and CCBS+PC. We explain this behavior by observing that this roadmap has a very high branching factor (every vertex has almost 50 neighbors on average). This helps to eliminate conflicts by finding an appropriate detour of nearly the same cost. Thus the cost impacts, which are computationally intensive to compute, are very low and provide limited value in differentiating between the conflicts.

Next, consider the runtime and expanded CT nodes plots in Figure 3. These plots are built as follows. Each data point (x, y) on a plot says that an algorithm was able to solve x problem instances within y seconds (or y CT node expansions). For example, on the dense roadmap CCBS solved only 276 instances in less than 1 second, CCBS+PC solved 340 instances, and CCBS+DS+PC+H solved 404. In general, the closer the line is to the x-axis and the longer it is, the better. The values at the end of the lines show the exact numbers of solved instances.

The general trends for runtime and high-level expansions are similar to those for the success rate: CCBS+DS+PC+H is the best on the sparse and dense roadmaps, and CCBS+DS is the best on the super-dense roadmap. These results highlight our improvement over vanilla CCBS, where our best CCBS version is up to 2 orders of magnitude faster in some cases.

Figure 4: Success rates for CCBS and its modifications on different k-connected grids.

We also analyzed separately the impact of adding the high-level heuristic (H) on the instances that involve large numbers of CT expansions. We took the 100 instances with the highest numbers of expanded CT nodes solved by CCBS+DS+PC and CCBS+DS+PC+H, averaged the number of expansions, and compared them.
The number of expansions for CCBS+DS+PC+H was lower by 26.5%, 21.6% and 17.8% for the sparse, dense and super-dense roadmaps, respectively. Thus, adding the heuristic proved to be a valuable technique, especially for hard instances involving a large number of expansions.

Evaluation on Grids

The second set of experiments we conducted was on 8-connected (k = 3) and 32-connected (k = 5) grids from the MovingAI MAPF benchmark (Stern et al. 2019). We used a 16x16 empty grid (16x16 empty), a warehouse-like grid (warehouse-10-20-10-2-2), and a grid representation of the den520d DAO map. Here we used the 25 scenario files supplied by the MAPF benchmark for each grid. The results of the second series of experiments are shown in Fig. 4.

Here we can see that in almost all cases the best results were obtained by CCBS with all our enhancements (CCBS+DS+PC+H). Comparing the results on grids with different connectedness, one can notice the same trend as observed for roadmaps with respect to the benefit of PC and DS: increasing the branching factor makes PC less effective and DS more effective. This benefit for DS is explained by the fact that positive constraints help to reduce the branching factor by reducing the number of possible alternative trajectories to one. Thus, a higher branching factor means stronger pruning by positive constraints.

Finally, we considered the 100 instances in each grid for which basic CCBS expanded the most CT nodes. Table 1 presents the median of the ratios between the number of CT nodes expanded by each of the other versions and the number expanded by basic CCBS. As one can see, CCBS+DS+PC+H expands the fewest CT nodes. Also, we observe that in most cases additional connectivity of the grid makes all the enhancements less beneficial.

            PC                DS                DS+PC             DS+PC+H
            k=3     k=5       k=3     k=5       k=3     k=5       k=3     k=5
16x16       33.10%  72.15%    13.97%  14.85%    6.72%   10.25%    5.59%   9.77%
warehouse   14.04%  15.69%    28.64%  23.70%    10.84%  18.36%    10.78%  14.31%
den520d     31.25%  100.00%   37.50%  67.71%    17.42%  76.42%    14.29%  67.11%

Table 1: The ratio of expanded CT nodes between CCBS and its modifications on grids (lower = better).
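
The statistic reported in Table 1 can be made concrete with a small sketch. The code below is only an illustration of one way to compute a median expansion ratio over the hardest instances: the function name median_expansion_ratio, the dictionary-based input format and the sample numbers are hypothetical, and it assumes that instances not solved by the variant are simply skipped.

```python
from statistics import median


def median_expansion_ratio(baseline, variant, top_k=100):
    """Median of variant/baseline CT expansions, in percent.

    baseline and variant map an instance id to the number of expanded CT
    nodes. Only the top_k instances with the most baseline expansions are
    considered; instances missing from variant are skipped (an assumption).
    """
    hardest = sorted(baseline, key=baseline.get, reverse=True)[:top_k]
    ratios = [variant[i] / baseline[i] for i in hardest if i in variant]
    return 100.0 * median(ratios)


# Hypothetical expansion counts for four instances.
ccbs = {"s1": 1000, "s2": 400, "s3": 200, "s4": 10}
ccbs_ds = {"s1": 100, "s2": 100, "s3": 100, "s4": 10}

# Restrict to the three hardest instances for basic CCBS.
ratio = median_expansion_ratio(ccbs, ccbs_ds, top_k=3)
```

Under this reading, a value of 100% means the variant expanded as many CT nodes as basic CCBS on the median hard instance, and smaller values mean fewer expansions.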

Conclusions and Future Work

In this work, we have proposed three improvements to CCBS, an algorithm for finding optimal solutions to MAPF_R problems in which time is continuous. The first CCBS improvement we proposed, called DS, changes how CT nodes are expanded by introducing positive and negative constraints. To implement this improvement, we modified the CCBS low-level search and applied a generalized version of SIPP with multiple start and goal nodes. The second improvement, called PC, prioritizes the conflicts to resolve by computing the cost of the solution that resolves them. The third CCBS improvement we proposed consists of two admissible heuristics for the high-level search. In a comprehensive experimental evaluation, we observed that with these improvements CCBS can solve many more problems than the basic CCBS, in some cases handling almost twice as many agents. Allowing CCBS to scale to larger problems is key to applying it to a wider range of real-world applications and also to using it as a foundation for more general MAPF settings in which the underlying graph is also changing rapidly.

Acknowledgments

The research for this project is partially funded by ISF grant

References

Andreychuk, A.; Yakovlev, K.; Atzmon, D.; and Stern, R. 2019. Multi-Agent Pathfinding with Continuous Time. In Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI 2019), 39–45.

Boyarski, E.; Felner, A.; Stern, R.; Sharon, G.; Tolpin, D.; Betzalel, O.; and Shimony, S. E. 2015. ICBS: Improved Conflict-Based Search Algorithm for Multi-Agent Pathfinding. In the International Joint Conference on Artificial Intelligence (IJCAI), 740–746.

Cohen, L.; Uras, T.; Kumar, T. S.; and Koenig, S. 2019. Optimal and bounded-suboptimal multi-agent motion planning. In Symposium on Combinatorial Search (SoCS).

Felner, A.; Li, J.; Boyarski, E.; Ma, H.; Cohen, L.; Kumar, T. S.; and Koenig, S. 2018. Adding Heuristics to Conflict-Based Search for Multi-Agent Path Finding. In the International Conference on Automated Planning and Scheduling (ICAPS), 83–87.

Gange, G.; Harabor, D.; and Stuckey, P. J. 2019. Lazy CBS: Implicit Conflict-Based Search Using Lazy Clause Generation. In the International Conference on Automated Planning and Scheduling (ICAPS), 155–162.

Goldenberg, M.; Sturtevant, N. R.; Felner, A.; and Schaeffer, J. 2011. The Compressed Differential Heuristic. In Proceedings of the 25th AAAI Conference on Artificial Intelligence (AAAI 2011), 24–29.

Karpas, E.; and Domshlak, C. 2009. Cost-Optimal Planning with Landmarks. In IJCAI, 1728–1733.

Lam, E.; Bodic, P. L.; Harabor, D. D.; and Stuckey, P. J. 2019. Branch-and-Cut-and-Price for Multi-Agent Pathfinding. In International Joint Conference on Artificial Intelligence (IJCAI), 1289–1296.

Li, J.; Felner, A.; Boyarski, E.; Ma, H.; and Koenig, S. 2019a. Improved Heuristics for Multi-Agent Path Finding with Conflict-Based Search. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI-2019), 442–449. doi:10.24963/ijcai.2019/63.

Li, J.; Harabor, D.; Stuckey, P. J.; Felner, A.; Ma, H.; and Koenig, S. 2019b. Disjoint splitting for multi-agent path finding with conflict-based search. In International Conference on Automated Planning and Scheduling (ICAPS), volume 29, 279–283.

Morris, R.; Pasareanu, C. S.; Luckow, K. S.; Malik, W.; Ma, H.; Kumar, T. K. S.; and Koenig, S. 2016. Planning, Scheduling and Monitoring for Airport Surface Operations. In Planning for Hybrid Systems, Papers from the 2016 AAAI Workshop.

Phillips, M.; and Likhachev, M. 2011. SIPP: Safe interval path planning for dynamic environments. In Proceedings of The 2011 IEEE International Conference on Robotics and Automation (ICRA 2011), 5628–5635.

Richter, S.; Helmert, M.; and Westphal, M. 2008. Landmarks Revisited. In AAAI, volume 8, 975–982.

Sharon, G.; Stern, R.; Felner, A.; and Sturtevant, N. R. 2015. Conflict-based search for optimal multiagent path finding. Artificial Intelligence Journal

the First Artificial Intelligence and Interactive Digital Entertainment Conference, 117–122.

Stern, R.; Sturtevant, N. R.; Felner, A.; Koenig, S.; Ma, H.; Walker, T. T.; Li, J.; Atzmon, D.; Cohen, L.; Kumar, T. S.; et al. 2019. Multi-agent pathfinding: Definitions, variants, and benchmarks. In Proceedings of the 12th Annual Symposium on Combinatorial Search (SoCS 2019), 151–158.

Şucan, I. A.; Moll, M.; and Kavraki, L. E. 2012. The Open Motion Planning Library. IEEE Robotics & Automation Magazine

AAAI, 1261–1263.

Surynek, P.; Felner, A.; Stern, R.; and Boyarski, E. 2016. Efficient SAT Approach to Multi-Agent Path Finding Under the Sum of Costs Objective. In ECAI.

Veloso, M. M.; Biswas, J.; Coltin, B.; and Rosenthal, S. 2015. CoBots: Robust Symbiotic Autonomous Mobile Service Robots. In the International Joint Conference on Artificial Intelligence (IJCAI), 4423–4429.

Walker, T. T.; Sturtevant, N. R.; and Felner, A. 2018. Extended Increasing Cost Tree Search for Non-Unit Cost Domains. In IJCAI, 534–540.

Wurman, P. R.; D'Andrea, R.; and Mountz, M. 2007. Coordinating Hundreds of Cooperative, Autonomous Vehicles in Warehouses. In the AAAI Conference on Artificial Intelligence (AAAI), 1752–1760.

Yu, J.; and LaValle, S. M. 2013. Structure and Intractability of Optimal Multi-Robot Path Planning on Graphs. In