Improving Continuous-time Conflict Based Search*
Anton Andreychuk (1, 2), Konstantin Yakovlev (2, 3), Eli Boyarski, Roni Stern (4, 5)
(1) Peoples’ Friendship University of Russia (RUDN University)
(2) Federal Research Center for Computer Science and Control of Russian Academy of Sciences
(3) HSE University
(4) Ben-Gurion University of the Negev
(5) Palo Alto Research Center
Abstract
Conflict-Based Search (CBS) is a powerful algorithmic framework for optimally solving classical multi-agent pathfinding (MAPF) problems, where time is discretized into time steps. Continuous-time CBS (CCBS) is a recently proposed version of CBS that guarantees optimal solutions without the need to discretize time. However, the scalability of CCBS is limited because it does not include any known improvements of CBS. In this paper, we begin to close this gap and explore how to adapt successful CBS improvements, namely, prioritizing conflicts (PC), disjoint splitting (DS), and high-level heuristics, to the continuous-time setting of CCBS. These adaptations are not trivial, and require careful handling of different types of constraints, applying a generalized version of the Safe Interval Path Planning (SIPP) algorithm, and extending the notion of cardinal conflicts. We evaluate the effect of the suggested enhancements by running experiments both on general graphs and 2^k-neighborhood grids. CCBS with these improvements significantly outperforms vanilla CCBS, solving problems with almost twice as many agents in some cases and pushing the limits of multi-agent path finding in continuous-time domains.
Introduction
Multi-Agent Pathfinding (MAPF) is the problem of finding paths for n agents in a graph such that each agent reaches its goal vertex and the agents do not collide with each other while moving along these paths. Many real-world applications require solving variants of MAPF, including managing aircraft-towing vehicles (Morris et al. 2016), video game characters (Silver 2005), office robots (Veloso et al. 2015), and warehouse robots (Wurman, D’Andrea, and Mountz 2007). Solving MAPF optimally (for common objective functions) is NP-Hard (Surynek 2010; Yu and LaValle 2013), but modern optimal MAPF algorithms can scale to problems with over a hundred agents (Sharon et al. 2015; Boyarski et al. 2015; Felner et al. 2018; Lam et al. 2019; Gange, Harabor, and Stuckey 2019; Surynek et al. 2016).
However, such scaling was shown mostly on the classical version of the MAPF problem (Stern et al. 2019), which embodies several simplifying assumptions, such as that all actions have the same duration and that time is discretized into time steps. MAPF_R (Walker, Sturtevant, and Felner 2018) is a generalization of the classical MAPF problem in which actions’ durations can be non-uniform, agents have geometric shapes that must be considered, and time is continuous. Handling continuous time is challenging because it implies an agent may wait in a location for an arbitrary amount of time, i.e., the number of wait actions is infinite.
*This is a pre-print of the paper accepted to AAAI 2021.
Several recently proposed algorithms address the MAPF_R problem or its variants, such as Extended ICTS (E-ICTS) (Walker, Sturtevant, and Felner 2018), CBS with Continuous Time-steps (CBS-CT) (Cohen et al. 2019), and Continuous-time conflict-based search (CCBS) (Andreychuk et al. 2019). In this work, we propose several improvements to CCBS that allow it to solve MAPF_R problems with significantly more agents.
CCBS is based on the Conflict-based search (CBS) algorithm for classical MAPF, and the improvements we propose for CCBS are based on known improvements of CBS, namely Disjoint Splitting (DS), Prioritizing Conflicts (PC), and high-level heuristics. Adapting the DS technique to the continuous-time setting of MAPF_R requires solving a single-agent pathfinding problem with temporally-constrained action landmarks (Karpas and Domshlak 2009). We show how to efficiently solve this pathfinding problem in our context by applying a generalized version of the SIPP algorithm (Phillips and Likhachev 2011). A naive application of PC to CCBS is shown to be ineffective, and we propose an adapted version of PC that can cut the number of expanded nodes significantly. The third CCBS improvement we propose is an admissible heuristic function for CCBS that requires only a negligible amount of overhead when applied together with the PC technique.
Finally, we evaluate the impact of these improvements individually and collectively on several benchmarks, including both roadmaps and grids. The results show that CCBS with all the proposed improvements solves 49.2% more MAPF_R instances than vanilla CCBS (5,659 versus 3,792). In some cases, it can even solve problems with approximately twice the number of agents compared to vanilla CCBS and reduce the runtime by up to two orders of magnitude.
Background and Problem Statement
In a MAPF_R problem (Walker, Sturtevant, and Felner 2018), the agents are confined to a weighted graph G = (V, E) whose vertices (V) correspond to locations in some metric space, e.g. R^2 in a Euclidean space, and edges (E) correspond to possible transitions between these locations. Each agent i is initially located at vertex s_i ∈ V and aims to reach vertex g_i ∈ V. When at a vertex, an agent can either perform a move action or a wait action. A move action means moving the agent along an edge. We assume that the agent moves at a constant velocity and inertial effects are neglected. The duration of a move action is the weight of its respective edge. A wait action means the agent stays in its current location for some duration. The duration of a wait action can be any positive real value. Since we do not discretize time, the set of possible wait actions is uncountable.
A timed action is a pair (a_i, t_i) representing that action a_i (either move or wait) starts at time t_i. A plan for an agent is a sequence of timed actions such that executing this sequence of timed actions moves the agent from its initial location to its goal location. The cost of a plan is the sum of the durations of its constituent actions. We assume that after finishing the plan the agent does not disappear but rather stays at the last vertex forever, but this “dummy” wait action does not add to the cost of the plan.
The plans of two agents are said to be conflict-free if the agents following them never collide, i.e., their shapes never overlap. A joint plan is a set of plans, one per agent. A solution to a MAPF_R problem is a joint plan whose constituent plans are pairwise conflict-free. The cost of a solution is its sum of costs (SOC), i.e., the sum of costs of its constituent plans. In this work, we are interested in solving MAPF_R problems optimally, i.e., finding a solution with a minimal cost. CCBS (Andreychuk et al. 2019) is a CBS-based algorithm that does so. For completeness, we provide a brief description of CBS and CCBS below.
Conflict-based search (CBS)
CBS (Sharon et al. 2015) is a complete and optimal algorithm for solving classical MAPF problems, i.e., MAPF problems where time is discretized and all actions have the same duration. CBS works by finding plans for each agent separately, detecting conflicts between these plans, and resolving them by replanning for the individual agents subject to specific constraints. A CBS conflict is defined by a tuple (i, j, x, t) stating that agents i and j have a conflict in location x (either a vertex or an edge) at time t. A CBS constraint is defined by a tuple (i, x, t), which states that agent i cannot occupy x at time t. To resolve a conflict (i, j, x, t), CBS replans for agent i or j or both, subject to CBS constraints (i, x, t) and (j, x, t), respectively. To guarantee completeness and optimality, CBS runs two search algorithms: a low-level search algorithm that finds paths for individual agents subject to a given set of constraints, and a high-level search algorithm that chooses which constraints to impose and which conflicts to resolve.
CBS: Low-Level Search.
In the basic CBS implementation, the low-level search is a search in the state space of vertex-time pairs. Expanding a state (v, t) generates states of the form (v', t + 1), where v' is either equal to v, representing a wait action, or equal to one of the locations adjacent to v. States generated by actions that violate the given set of CBS constraints are pruned. CBS runs A* on this search space to return the lowest-cost path to the agent’s goal that is consistent with the given set of CBS constraints, as required.
(Footnote: The assumption that an agent stays at its last vertex after finishing its plan is common in the MAPF literature.)
CBS: High-Level Search.
The CBS high-level search is a search in a binary tree called the Constraint Tree (CT). In the CT, each node N represents a set of CBS constraints N.constraints and a joint plan N.Π that is consistent with these constraints. Generating a node N involves setting its constraints N.constraints and running the low-level search to create N.Π. If N.Π does not contain any conflict, then N is a goal. Expanding a non-goal node N involves choosing a conflict (i, j, x, t) in N.Π and generating two child nodes N_i and N_j. Both nodes have the same set of constraints as N, plus a new CBS constraint: (i, x, t) for N_i and (j, x, t) for N_j. This type of node expansion is referred to as splitting node N over conflict (i, j, x, t). The high-level search finds a goal node by searching the CT in a best-first manner, expanding in every iteration the CT node N with the lowest-cost joint plan.
Continuous-Time Conflict Based Search (CCBS)
To consider continuous time, CCBS (Andreychuk et al. 2019) reasons over time intervals, detects conflicts between timed actions, and resolves conflicts by imposing constraints that specify the time intervals in which the conflicting timed actions can be moved to avoid the conflict. Formally, a CCBS conflict is a tuple (a_i, t_i, a_j, t_j), specifying that the timed action (a_i, t_i) of agent i has a conflict with the timed action (a_j, t_j) of agent j. The unsafe interval of timed action (a_i, t_i) w.r.t. the timed action (a_j, t_j), denoted [t_i, t_i^u), is the maximal time interval starting from t_i in which performing a_i creates a conflict with performing a_j at time t_j. A CCBS constraint is a tuple (i, a_i, [t_i, t_i^u)) specifying that agent i cannot perform action a_i in the time interval [t_i, t_i^u). To resolve a CCBS conflict, CCBS generates two new CT nodes, where it adds the constraint (i, a_i, [t_i, t_i^u)) to one node and the constraint (j, a_j, [t_j, t_j^u)) to the other.
The low-level planner of CCBS is an adaptation of the SIPP algorithm (Phillips and Likhachev 2011). SIPP was originally designed to find time-optimal paths for an agent moving among dynamic obstacles with known trajectories. SIPP runs a heuristic search in the state space of (v, [t, t']) tuples, where v is a graph vertex and [t, t'] is a safe interval of v, i.e., a maximal contiguous time interval in which an agent can stay or arrive at v without colliding with a moving obstacle. As numerous obstacles may pass through v, there can exist numerous search nodes corresponding to the same graph vertex but different time intervals in the SIPP search tree.
The CCBS low-level search is based on SIPP except for how it handles the given CCBS constraints. Instead of dynamic obstacles, the low-level CCBS computes safe intervals for each vertex v with respect to the CCBS constraints imposed over wait actions at v.
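As a rough illustration of how safe intervals can be derived from wait constraints, the sketch below subtracts the unsafe intervals [t_i, t_i^u) of one vertex from the initial interval [0, ∞). It is a minimal sketch under stated assumptions, not the authors' implementation: the function name, the (t, t_u) tuple input, and the (start, end) tuple output are hypothetical, and open versus closed interval endpoints are not modelled.

```python
import math


def compute_safe_intervals(unsafe_intervals):
    """Return the safe intervals of one vertex as sorted (start, end) pairs.

    unsafe_intervals: iterable of (t, t_u) pairs, one per negative constraint
    on the wait action at this vertex (hypothetical input format).
    """
    safe = []
    cursor = 0.0  # earliest time not yet covered by an unsafe interval
    for t, t_u in sorted(unsafe_intervals):
        if t > cursor:
            # The gap between the previous unsafe interval and this one is safe.
            safe.append((cursor, t))
        # Overlapping unsafe intervals are merged by advancing the cursor.
        cursor = max(cursor, t_u)
    # After the last unsafe interval the vertex stays safe forever.
    safe.append((cursor, math.inf))
    return safe


if __name__ == "__main__":
    # One wait constraint over [2.0, 3.5) splits [0, inf) into two safe intervals.
    print(compute_safe_intervals([(2.0, 3.5)]))
```

With no wait constraints the vertex keeps its single safe interval, and each additional, non-overlapping constraint adds at most one more interval.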
Initially, vertex v has a single safe interval [0, ∞). Then, for every CCBS constraint (i, a_i, [t_i, t_i^u)) where a_i is a wait action at vertex v, we split the safe interval for v into one for arriving before t_i and one for arriving after t_i^u. CCBS constraints imposed over move actions are integrated into the low-level search by modifying the constrained actions, as follows. Let v and v' be the source and target vertices of a_i. If the agent arrives at v at t ∈ [t_i, t_i^u), then we remove the action that moves it from v to v' at time t, and add an action that represents waiting at v until t_i^u and then moving to v'.
Disjoint Splitting for CCBS
The first technique we migrate from CBS to CCBS is called Disjoint Splitting (DS) (Li et al. 2019b). DS is a technique designed to ensure that expanding a CT node N creates a disjoint partition of the space of solutions that satisfy the constraints in N.constraints. That is, every solution that satisfies N.constraints is in exactly one of its children. Observe that this is not the case in CBS: for a conflict (i, j, v, t) there may be solutions that satisfy both (i, v, t) and (j, v, t). This introduces an inefficiency in the high-level search.
To address this inefficiency, CBS with DS (CBS-DS) introduces the notion of positive and negative constraints. A negative constraint (i, x, k) is the regular CBS constraint stating that agent i must not be at x at time step k. A positive constraint (i, x, k) means that agent i must be at x at time step k. When splitting a CT node N over a CBS conflict (i, j, x, k), CBS-DS chooses one of the conflicting agents, say i, and generates two child nodes, one with the negative constraint (i, x, k) and the other with the positive constraint (i, x, k). Deciding which agent to split on, either i or j, does not affect the theoretical properties of the algorithm, and several heuristics were proposed (Li et al. 2019b).
The low-level search in CBS-DS treats each positive constraint as a special type of fact landmark (Richter, Helmert, and Westphal 2008), i.e., a fact that must be true in any plan. The CBS-DS low-level search generates a plan that satisfies these fact landmarks by planning to achieve them in ascending order of their time dimension. This effectively decomposes the low-level search into a sequence of simpler search tasks, each searching for a path from one fact landmark to the next. The agent’s goal is set as the last fact landmark, to ensure the agent reaches it eventually.
Positive and Negative Constraints in CCBS
A CCBS constraint (i, a_i, [t_i, t_i^u)) can be stated formally as follows:
∀ t ∈ [t_i, t_i^u): (a_i, t) is not in a plan for agent i
This is a negative constraint from a DS perspective. The corresponding positive constraint is therefore the inverse:
∃ t ∈ [t_i, t_i^u): (a_i, t) is in a plan for agent i
This means that agent i must perform a_i at some moment of time within the given interval. Thus, a positive constraint in CCBS is an action landmark (Karpas and Domshlak 2009), i.e., an action that must be performed in any solution. Next, we show how the low-level search of Continuous-time conflict-based search with disjoint splitting (CCBS-DS) is able to find a plan that achieves all these action landmarks.
Figure 1: An example where performing the action landmark as early as possible leads to a suboptimal plan.
Figure 2: An example where performing the action landmark as early as possible results in failing to find a plan.
Low-Level Search in CCBS-DS
The low-level search in CCBS-DS sorts the positive constraints in ascending order of their time dimension and plans to achieve each of them in that order. For example, assume there is a single positive constraint (i, move(A,B), [t_i, t_i^u)). Then, the low-level search works by (1) searching for a plan from s_i to A that ends in the time range [t_i, t_i^u), then (2) performing the action landmark (i.e., moving from A to B), and finally (3) searching for a plan from B to g_i (starting immediately after the action landmark is performed).
However, in CCBS-DS there is an additional challenge for the low-level search: there may be more than one plan to perform each landmark. In our example above, there may be infinitely many plans from s_i to A that end in the time range [t_i, t_i^u). As we show below, choosing the plan that performs the action landmark earliest does not necessarily lead to finding an optimal solution and might even lead to incompleteness, especially when there are both positive and negative constraints.
Example
Consider the illustration depicted in Figure 1. The low-level search needs to find a plan that satisfies three constraints:
• A positive constraint (i, move(A,B), [t_ab, t_ab^u)).
• A negative constraint (i, wait at (A), [t_aa, t_aa^u)).
• A negative constraint (i, wait at (B), [t_bb, t_bb^u)).
where t_ab < t_aa < t_aa^u < t_ab^u. Thus, the negative constraint on waiting at A creates two safe intervals for A, [0, t_aa] and [t_aa^u, ∞), both of which overlap the interval of the positive constraint. The negative constraint on waiting at B creates two safe intervals for B, [0, t_bb] and [t_bb^u, ∞).
Now assume that there are two plans that satisfy the action landmark for the positive constraint, one that reaches A before t_aa (shown in yellow) and one that reaches A after t_aa^u (shown in green). Clearly, the lowest-cost plan to achieve the action landmark is the one that reaches A before t_aa, but to find the optimal solution one must use the second plan. Figure 2 illustrates an even more extreme case, where the lowest-cost plan that achieves the action landmark cannot be extended to a full plan, because it reaches B during its unsafe interval (marked in red).

Algorithm 1: Low-level search for CCBS with DS
Input: Negative constraints C(−)
Input: Positive constraints C(+)
Input: Agent i
 1  S ← ComputeSafeIntervals(C(−))
 2  L ← ComputeLandmarks(C(+), S)
 3  Starts ← {s_i}
 4  foreach landmark l = (i, move(A, B), [t, t^u)) in L do
 5      Goals ← computeGoals(l)
 6      Plans ← GSIPP(Starts, Goals)
 7      Starts ← ∅
 8      foreach plan in Plans do
 9          Append move(A, B) to plan
10          Add last state in plan to Starts
11      Starts ← Prune Plans/Starts if possible
12  return SIPP(Starts, g_i)

Generalized SIPP
There may be infinitely many plans that satisfy an action landmark l = (i, move(A, B), [t, t^u)), i.e., reach A within [t, t^u). Finding only the least-cost plan might lead to incompleteness, as we showed above. To guarantee completeness and optimality we need to find the lowest-cost plan for reaching A in every safe interval of A that overlaps with [t, t^u). Only then can we deem that every possibility of performing the action landmark has been explored, which preserves completeness. Optimality is preserved because we find the least-cost plan for reaching A in every safe interval.
To this end, we create a generalized version of SIPP (GSIPP) such that: (1) it accepts a set of goal states, one per safe interval of A that overlaps with [t, t^u), and (2) it outputs a set of plans, one per goal state. To each of these plans, we concatenate the action landmark itself, move(A, B). These plans may end in different safe intervals in B, which then become distinct start states when searching for a plan to get from B to the next landmark. Thus, GSIPP accepts a set of start states and a set of goal states and outputs a set of plans, one per goal. It works as follows. First, the open list is initialized with all start states. Then, the search proceeds as in regular SIPP, except that the stopping criterion is either that the open list is exhausted or that all goal nodes have been expanded. The worst-case runtime of both GSIPP and SIPP is the same, corresponding to expanding all states in the (vertex, safe interval) state space defined by SIPP.
Pseudo Code
Finally, we can describe the pseudo-code for the CCBS-DS low-level search. It accepts a list of negative and positive constraints for an agent i. Initially, the low-level search computes the safe intervals of every vertex based on the negative constraints (Line 1). Then, it computes the action landmarks based on the positive constraints (Line 2). These landmarks are sorted by time, and then it iterates over these landmarks (Line 4). For each action landmark l = (i, move(A, B), [t, t^u)), it computes the safe intervals of A that intersect with [t, t^u) (Line 5). Every such safe interval is considered a goal for GSIPP. When all such goals are added, we run GSIPP to find a set of plans, one per goal (Line 6). Then, for each found plan we concatenate the action move(A, B) to its end (Lines 9-10). If B is not reachable within a safe interval, then the plan is discarded. If two or more concatenated plans safely reach B in the same interval I_k^safe, we prune such plans, keeping only the one that reaches this interval earliest (Line 11). A node (B, I_k^safe) now becomes one of the start nodes for the subsequent search and is added to Starts.
Note that the number of plans satisfying each landmark l is proportional to the number of negative constraints over the wait actions for the target vertex of l. Consequently, if no such constraints exist, then only one plan to this landmark will be present after pruning (no matter how many different start and goal nodes the search was initialized with). In general, in the process of the iterative invocation of the modified SIPP and plan pruning, numerous plans constructed so far might eventually collapse to a single one. This definitely happens when the planning to the goal is carried out. The reason is that the goal is defined by a single graph vertex and a single time interval ending with ∞, as we assume that the agent arrives at its goal and stays there forever.
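The goal-selection step (Line 5) and the pruning step (Line 11) described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the (start, end) tuple representation of safe intervals, the closed treatment of interval endpoints, and the (arrival_time, plan_id) pairs are all hypothetical choices made for the example.

```python
import math


def landmark_goals(safe_intervals_a, window):
    """Safe intervals of A that overlap the landmark window [t, t_u)."""
    t, t_u = window
    return [(s, e) for (s, e) in safe_intervals_a if s < t_u and e >= t]


def prune_starts(arrivals, safe_intervals_b):
    """Keep, per safe interval of B, only the plan that arrives earliest.

    arrivals: list of (arrival_time, plan_id) for plans that were already
    extended with move(A, B). Plans arriving outside every safe interval
    of B are discarded.
    """
    best = {}
    for arrival, plan_id in arrivals:
        for interval in safe_intervals_b:
            start, end = interval
            if start <= arrival <= end:
                if interval not in best or arrival < best[interval][0]:
                    best[interval] = (arrival, plan_id)
                break  # safe intervals are disjoint, so stop at the first hit
    return best


if __name__ == "__main__":
    safe_b = [(0.0, 2.0), (3.5, math.inf)]
    plans = [(4.0, "p1"), (1.0, "p2"), (3.6, "p3"), (2.5, "p4")]
    print(prune_starts(plans, safe_b))
```

In this toy run the plan arriving at 2.5 is dropped because it reaches B during an unsafe interval, and at most one start state survives per safe interval of B.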
Thus, even if numerous plans to the preceding landmark were found, they all will collapse into a single one, i.e., the one that achieves the goal at the earliest possible time, which is what CCBS requires.
Prioritizing Conflicts
Prioritizing Conflicts (PC) (Boyarski et al. 2015) is the second CBS enhancement we migrate to CCBS. PC is a heuristic for choosing which conflict to resolve when expanding a CT node. Different ways of choosing conflicts often lead in practice to CTs of different sizes, and thus have a significant effect on the overall runtime. PC systematically prioritizes conflicts by classifying each conflict as cardinal, semi-cardinal, or non-cardinal. A conflict is called cardinal iff splitting a CT node N over it results in two child nodes whose cost is higher than the cost of N. A conflict is semi-cardinal iff the cost of only one child increases while the cost of the other does not. A conflict that is not cardinal or semi-cardinal is non-cardinal. CBS with PC prefers cardinal conflicts to semi-cardinal and semi-cardinal to non-cardinal. This way of prioritizing conflicts results in a significant reduction of the expanded CT nodes compared to vanilla CBS and makes the algorithm much faster in practice.
In MAPF_R, most conflicts are cardinal, i.e., the agents involved in such conflicts are not able to find paths that respect the corresponding constraints and are of the same cost as before. This is because the ability to perform wait actions of arbitrary duration paired with non-uniform move action durations reduces symmetries. By “symmetry” here we mean having multiple shortest paths that have exactly the same cost. Thus, differentiating the conflicts based just on their cardinality type is insufficient.
To this end, we propose a generalized version of PC that introduces a finer-grained prioritization of conflicts, by introducing the notion of cost impact. Intuitively, the cost impact of a conflict is how much the cost of the solution is increased when it is resolved.
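To make this intuition concrete, the sketch below scores each conflict by the smaller of the two cost increases obtained when either conflicting agent is forced to re-plan (the definition this paper adopts), and then picks the conflict with the largest score. The dictionary layout, the field names, and the function names are hypothetical and serve only as an illustration.

```python
def cost_impact(delta_i, delta_j):
    """Cost impact of a conflict: the smaller of the two child cost increases."""
    return min(delta_i, delta_j)


def choose_conflict(conflicts):
    """Pick the conflict with the largest cost impact.

    conflicts: list of dicts with keys 'id', 'delta_i', 'delta_j', where the
    deltas are the cost increases of the two child CT nodes (assumed to be
    already computed by running the low-level search).
    """
    return max(conflicts, key=lambda c: cost_impact(c["delta_i"], c["delta_j"]))


if __name__ == "__main__":
    candidates = [
        {"id": "c1", "delta_i": 0.0, "delta_j": 2.0},  # semi-cardinal: impact 0.0
        {"id": "c2", "delta_i": 0.4, "delta_j": 0.9},  # cardinal: impact 0.4
        {"id": "c3", "delta_i": 1.1, "delta_j": 0.7},  # cardinal: impact 0.7
    ]
    print(choose_conflict(candidates)["id"])
```

Under this score a semi-cardinal or non-cardinal conflict has impact zero, so any cardinal conflict is still preferred, while cardinal conflicts are further ranked among themselves.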
More formally, for a CT node N with a CCBS conflict Con = (a_i, t_i, a_j, t_j), let N_i and N_j be the CT nodes obtained by splitting over this conflict, and let δ_i be the difference between the cost of N and N_i (δ_j is defined analogously for N_j). We define the cost impact of the conflict Con, denoted Δ(Con), as min(δ_i, δ_j). Our adaptation of PC to CCBS chooses to split a CT node on the conflict with the largest cost impact. This follows the same rationale as PC, as we prioritize the resolution of conflicts that will reveal the highest unavoidable cost that was so far hidden in conflicts.
Heuristics for High-Level Search
To guarantee optimality, the high-level search in CBS explores the CT in a best-first fashion. Felner et al. (2018) and Li et al. (2019a) introduced admissible heuristics to the CBS high-level search. These heuristics estimate the difference in cost between a CT node and the optimal solution. Both heuristics are admissible, i.e., they are a lower bound on the actual cost difference, and therefore can be safely added to the cost of a CT node when choosing which node to expand next. Indeed, these heuristics were shown to significantly decrease the number of expanded CT nodes and improve the performance of CBS.
Drawing from these works, we suggest two admissible heuristics for CCBS. The first admissible heuristic, denoted H_1, is based on solving the following linear programming problem (LPP). This LPP has n non-negative variables x_1, ..., x_n, one for each agent. Each conflict Con_{i,j} between agents i and j in the CT node for which we are computing the heuristic introduces the LPP constraint x_i + x_j ≥ Δ(Con_{i,j}). The objective to be minimized is \sum_{i=1}^{n} x_i. By construction, for any solution to this LPP, the value \sum_{i=1}^{n} x_i is an admissible heuristic, since for every conflict Con_{i,j} the solution cost is increased by at least Δ(Con_{i,j}).
The second admissible heuristic we propose, denoted H_2, follows the approach suggested in (Felner et al. 2018). There, the heuristic was based on identifying disjoint cardinal conflicts, which are cardinal conflicts between disjoint pairs of agents. As discussed above, in CCBS most conflicts are cardinal but their cost impact can vary greatly. Therefore, in the H_2 heuristic we aim to choose the disjoint cardinal conflicts that would have the largest cost impact. We do so in a greedy manner, sorting the conflicts in N.Π in descending order of their cost impact. Then, conflicts are picked one by one in this order.
After a conflict is picked, we remove from the conflict list all conflicts that involve any of the agents in this conflict. This continues until all the conflicts are either picked or removed. The H_2 heuristic is the sum of the cost impacts of the chosen conflicts. By construction, the chosen conflicts are disjoint and so H_2 is admissible. While H_2 is less informed than H_1 (the one computed by solving the LPP), it is faster to compute. We observed experimentally that the practical difference between these heuristics was negligible, with an average difference of 1%. We conjecture that the reason H_1 and H_2 perform similarly is that often the conflict graph consists of disjoint pairs of connected agents, in which case the minimum vertex cover (H_1) would also be found by the simple greedy approach (H_2). In our experiments described below we used H_2 and refer to it as H.
(Footnote: We also experimented with Δ(Con) = max(δ_i, δ_j) and Δ(Con) = δ_i + δ_j, but the effect on performance was minimal.)
Empirical Evaluation
We have incorporated all the CCBS enhancements described so far and evaluated different versions of CCBS in different MAPF_R scenarios involving general graphs (roadmaps) and grids. Specifically, we evaluated the basic CCBS, CCBS with PC (CCBS +PC), CCBS with DS (CCBS +DS), CCBS with both DS and PC (CCBS +DS +PC), and CCBS with all the improvements (CCBS +DS +PC +H). In the conducted experiments all agents were assumed to be disk-shaped with radius equal to √ / .
In each run of the evaluated algorithm, we recorded the runtime, the number of expanded CT nodes, and whether the algorithm was able to find a solution under a time limit of 30 seconds. We chose this specific limit to demonstrate near real-time performance. Moreover, in preliminary experiments with different time limits (from 1s to 300s) we observed that the difference in performance of CCBS between a 30s time limit and a 300s time limit is not significant.
Implementation Details
Conflict detection in MAPF_R is more involved than in classical MAPF and is more computationally intensive. To compensate for that, we have implemented the following approach to cache the intermediate conflict detection results and speed up the search. We detect all the conflicts in the root CT node and store them with the node. After choosing a conflict and performing a split, we copy all the conflicts to a successor node except the ones involving the agent that has to re-plan its path. After such re-planning, newly introduced conflicts (if any) are added to the set of conflicts for that CT node. Indeed, this leads to a memory overhead, which in our experiments varied from 15% to 250%, depending on how many conflicts were discovered.
To compute the cost impacts of the conflicts for versions of CCBS that use PC or the high-level search heuristic H, we run the low-level search explicitly to resolve these conflicts and acquire the needed cost increase values.
To speed up the low-level search, we pre-compute a set of heuristics, h_1, ..., h_n, to estimate the cost-to-go to each goal. To compute h_i we run Dijkstra’s algorithm with g_i as the source node. Such heuristics are more informative than Euclidean distance, but their computation complexity is polynomial in the graph size. However, the runtime needed to compute all heuristics is significantly less than the overall runtime of solving the MAPF_R problem. When DS is used, the low-level search performs multiple searches to achieve the landmarks created by the positive constraints. When searching for the intermediate goals associated with each landmark, we implemented a Differential Heuristic (DH) (Goldenberg et al. 2011) with the pre-computed heuristics h_1, ..., h_n as pivots.
(Footnote: Our implementation and all the raw results are available at: github.com/PathPlanning/Continuous-CBS.)
Figure 3: The performance of CCBS and its variants on the sparse, dense and super-dense roadmaps.
Evaluation on the Roadmaps
In the first set of experiments we have evaluated CCBS on 3 different roadmaps, referred to here as sparse, dense and super-dense. The sparse roadmap contains 158 nodes and 349 edges, the dense roadmap contains 878 nodes and 7,341 edges, and the super-dense roadmap contains 11,342 vertices and 263,533 edges. All of these graphs were automatically generated by applying a roadmap-generation tool from the Open Motion Planning Library (OMPL) (Şucan, Moll, and Kavraki 2012) on the den520d map from the game Dragon Age: Origins (DAO). This map is publicly available in the MovingAI MAPF benchmark (Stern et al. 2019).
For each roadmap, 25 different scenarios were generated. Each scenario is a list of start-goal vertices, chosen randomly from the graph. Then, we pick the first n = 2 start-goal pairs and create a MAPF_R instance for n agents. If the evaluated algorithm solves this instance within the 30-second time limit, we proceed by increasing n by 1 and creating a new MAPF_R instance. This is repeated until the evaluated algorithm is not able to solve the instance in 30 seconds. We then proceed to the next scenario.
The results are shown in Fig. 3. Consider first the success rate plots (left). The first clear trend we observe is that all the proposed CCBS improvements are significantly better than the baseline CCBS in almost all cases. E.g., on the dense roadmap CCBS +DS +PC +H manages to achieve a 0.8 success rate for the instances with 20 agents, while the CCBS success rate for this number of agents is only 0.1.
Next, consider the relative performance of CCBS with different combinations of improvements. In general, the most advanced version of the algorithm, i.e., CCBS +DS +PC +H, outperforms the competitors on the sparse and dense roadmaps. However, on the super-dense roadmap this is not the case.
On this roadmap, CCBS +DS +PC +H is dominated by CCBS +DS, which was able to solve instances with 25 agents while the former solved only up to 20. Indeed, in this roadmap the PC component on its own is ineffective, as can be seen when comparing the basic CCBS and CCBS +PC. We explain this behavior by observing that this roadmap has a very high branching factor (every vertex has almost 50 neighbors on average). This helps to eliminate conflicts by finding an appropriate detour of nearly the same cost. Thus the cost impacts, which are computationally intensive to compute, are very low and provide limited value in differentiating between the conflicts.
Next, consider the runtime and expanded CT nodes plots in Figure 3. These plots are built in the following fashion. Each data point (x, y) on a plot says that an algorithm was able to solve x problem instances within y seconds or CT node expansions. For example, on the dense roadmap CCBS solved only 276 instances in less than 1 second, CCBS +PC solved 340 instances, and CCBS +DS +PC +H solved 404. In general, the closer a line is to the x-axis and the longer it is, the better. The values at the end of the lines show the exact numbers of solved instances.
The general trends for runtime and high-level expansions are similar to the ones for the success rate: CCBS +DS +PC +H is the best on the sparse and dense roadmaps and CCBS +DS is the best on the super-dense roadmap. These results highlight our improvement over vanilla CCBS, where our best CCBS version is up to 2 orders of magnitude faster in some cases.
Figure 4: Success rates for CCBS and its modifications on different 2^k-connected grids.
We also analyzed separately the impact of adding the high-level heuristic (H) on the instances that involve large numbers of CT expansions. We took the 100 instances with the highest numbers of expanded CT nodes among those solved by CCBS +DS +PC and CCBS +DS +PC +H, averaged the number of expansions, and compared them.
The number of expansions for CCBS+DS+PC+H was lower by 26.5%, 21.6%, and 17.8% for the sparse, dense, and super-dense roadmaps, respectively. Thus, adding the heuristic proved to be a valuable technique, especially for hard instances involving a large number of expansions.
Evaluation on Grids
The second set of experiments was conducted on 8-connected (k = 3) and 32-connected (k = 5) grids from the MovingAI MAPF benchmark (Stern et al. 2019). We used a 16x16 empty grid (16x16 empty), a warehouse-like grid (warehouse-10-20-10-2-2), and a grid representation of the den520d DAO map. Here we used the 25 scenario files supplied by the MAPF benchmark for each grid. The results of the second series of experiments are shown in Fig. 4.
Here we can see that in almost all cases the best results were obtained by CCBS with all our enhancements (CCBS+DS+PC+H). Comparing the results on grids with different connectedness, one can notice the same trend as observed for roadmaps with respect to the benefit of PC and DS: increasing the branching factor makes PC less effective and DS more effective. This benefit for DS is explained by the fact that positive constraints reduce the branching factor by reducing the number of possible alternative trajectories to one. Thus, a higher branching factor means stronger pruning by positive constraints.
Finally, we considered the 100 instances in each grid for which basic CCBS expanded the most CT nodes. Table 1 presents the median of the ratios of expansions between basic CCBS and the other versions. As one can see, CCBS+DS+PC+H expands the fewest CT nodes. Also, we observe that in most cases additional connectivity of the grid makes all the enhancements less beneficial.
            PC                 DS                 DS+PC              DS+PC+H
            k=3      k=5       k=3      k=5       k=3      k=5       k=3      k=5
16x16       33.10%   72.15%    13.97%   14.85%    6.72%    10.25%    5.59%    9.77%
warehouse   14.04%   15.69%    28.64%   23.70%    10.84%   18.36%    10.78%   14.31%
den520d     31.25%   100.00%   37.50%   67.71%    17.42%   76.42%    14.29%   67.11%
Table 1: The ratio of expanded CT nodes between CCBS and its modifications on grids (lower = better).
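A minimal sketch of how a statistic like the one in Table 1 could be computed. It assumes that the ratio is the variant's expansions divided by the expansions of basic CCBS (consistent with "lower = better") and that only instances solved by both versions are compared. The function name, the dictionary layout, and the sample numbers are hypothetical and are not taken from the paper.

```python
from statistics import median


def median_expansion_ratio(baseline, variant, top_n=100):
    """Median of variant/baseline CT-node expansion ratios on the hardest instances.

    baseline, variant: dicts mapping a (hypothetical) instance id to the number
    of expanded CT nodes. "Hardest" means most expansions for the baseline.
    Assumption: only instances present in both dicts are compared.
    """
    common = [i for i in baseline if i in variant and baseline[i] > 0]
    # Keep the top_n instances on which the baseline expanded the most CT nodes.
    hardest = sorted(common, key=lambda i: baseline[i], reverse=True)[:top_n]
    if not hardest:
        raise ValueError("no common instances to compare")
    return median(variant[i] / baseline[i] for i in hardest)


# Made-up expansion counts, for illustration only.
ccbs = {"a": 200, "b": 100, "c": 50, "d": 10}
ccbs_ds = {"a": 50, "b": 10, "c": 25, "d": 10}

ratio = median_expansion_ratio(ccbs, ccbs_ds, top_n=3)
print(f"{ratio:.2%}")  # 25.00%
```

Using the median rather than the mean keeps a few extreme instances from dominating the reported ratio, which matters when expansion counts span several orders of magnitude.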
Conclusions and Future Work
In this work, we have proposed three improvements to CCBS, an algorithm for finding optimal solutions to MAPF R problems in which time is continuous. The first improvement, called DS, changes how CT nodes are expanded by introducing positive and negative constraints. To implement it, we modified the CCBS low-level search and applied a generalized version of SIPP with multiple start and goal nodes. The second improvement, called PC, prioritizes the conflicts to resolve by computing the cost of the solution that resolves them. The third improvement consists of two admissible heuristics for the high-level search. In a comprehensive experimental evaluation, we observed that with these improvements CCBS can solve many more problems than basic CCBS, in some cases solving instances with almost twice as many agents. Allowing CCBS to scale to larger problems is key to applying it to a wider range of real-world applications and to using it as a foundation for more general MAPF settings in which the underlying graph is also changing rapidly.
Acknowledgments
The research for this project is partially funded by ISF grant
References
Andreychuk, A.; Yakovlev, K.; Atzmon, D.; and Stern, R. 2019. Multi-Agent Pathfinding with Continuous Time. In Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI 2019), 39–45.
Boyarski, E.; Felner, A.; Stern, R.; Sharon, G.; Tolpin, D.; Betzalel, O.; and Shimony, S. E. 2015. ICBS: Improved Conflict-Based Search Algorithm for Multi-Agent Pathfinding. In the International Joint Conference on Artificial Intelligence (IJCAI), 740–746.
Cohen, L.; Uras, T.; Kumar, T. S.; and Koenig, S. 2019. Optimal and bounded-suboptimal multi-agent motion planning. In Symposium on Combinatorial Search (SoCS).
Felner, A.; Li, J.; Boyarski, E.; Ma, H.; Cohen, L.; Kumar, T. S.; and Koenig, S. 2018. Adding Heuristics to Conflict-Based Search for Multi-Agent Path Finding. In the International Conference on Automated Planning and Scheduling (ICAPS), 83–87.
Gange, G.; Harabor, D.; and Stuckey, P. J. 2019. Lazy CBS: Implicit Conflict-Based Search Using Lazy Clause Generation. In the International Conference on Automated Planning and Scheduling (ICAPS), 155–162.
Goldenberg, M.; Sturtevant, N. R.; Felner, A.; and Schaeffer, J. 2011. The Compressed Differential Heuristic. In Proceedings of the 25th AAAI Conference on Artificial Intelligence (AAAI 2011), 24–29.
Karpas, E.; and Domshlak, C. 2009. Cost-Optimal Planning with Landmarks. In IJCAI, 1728–1733.
Lam, E.; Bodic, P. L.; Harabor, D. D.; and Stuckey, P. J. 2019. Branch-and-Cut-and-Price for Multi-Agent Pathfinding. In International Joint Conference on Artificial Intelligence (IJCAI), 1289–1296.
Li, J.; Felner, A.; Boyarski, E.; Ma, H.; and Koenig, S. 2019a. Improved Heuristics for Multi-Agent Path Finding with Conflict-Based Search. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI-2019), 442–449. doi:10.24963/ijcai.2019/63.
Li, J.; Harabor, D.; Stuckey, P. J.; Felner, A.; Ma, H.; and Koenig, S. 2019b. Disjoint splitting for multi-agent pathfinding with conflict-based search. In International Conference on Automated Planning and Scheduling (ICAPS), volume 29, 279–283.
Morris, R.; Pasareanu, C. S.; Luckow, K. S.; Malik, W.; Ma, H.; Kumar, T. K. S.; and Koenig, S. 2016. Planning, Scheduling and Monitoring for Airport Surface Operations. In Planning for Hybrid Systems, Papers from the 2016 AAAI Workshop.
Phillips, M.; and Likhachev, M. 2011. SIPP: Safe interval path planning for dynamic environments. In Proceedings of The 2011 IEEE International Conference on Robotics and Automation (ICRA 2011), 5628–5635.
Richter, S.; Helmert, M.; and Westphal, M. 2008. Landmarks Revisited. In AAAI, volume 8, 975–982.
Sharon, G.; Stern, R.; Felner, A.; and Sturtevant, N. R. 2015. Conflict-based search for optimal multiagent pathfinding. Artificial Intelligence Journal
the First Artificial Intelligence and Interactive Digital Entertainment Conference, 117–122.
Stern, R.; Sturtevant, N. R.; Felner, A.; Koenig, S.; Ma, H.; Walker, T. T.; Li, J.; Atzmon, D.; Cohen, L.; Kumar, T. S.; et al. 2019. Multi-agent pathfinding: Definitions, variants, and benchmarks. In Proceedings of the 12th Annual Symposium on Combinatorial Search (SoCS 2019), 151–158.
Şucan, I. A.; Moll, M.; and Kavraki, L. E. 2012. The Open Motion Planning Library. IEEE Robotics & Automation Magazine
AAAI, 1261–1263.
Surynek, P.; Felner, A.; Stern, R.; and Boyarski, E. 2016. Efficient SAT Approach to Multi-Agent Path Finding Under the Sum of Costs Objective. In ECAI.
Veloso, M. M.; Biswas, J.; Coltin, B.; and Rosenthal, S. 2015. CoBots: Robust Symbiotic Autonomous Mobile Service Robots. In the International Joint Conference on Artificial Intelligence (IJCAI), 4423–4429.
Walker, T. T.; Sturtevant, N. R.; and Felner, A. 2018. Extended Increasing Cost Tree Search for Non-Unit Cost Domains. In IJCAI, 534–540.
Wurman, P. R.; D’Andrea, R.; and Mountz, M. 2007. Coordinating Hundreds of Cooperative, Autonomous Vehicles in Warehouses. In the AAAI Conference on Artificial Intelligence (AAAI), 1752–1760.
Yu, J.; and LaValle, S. M. 2013. Structure and Intractability of Optimal Multi-Robot Path Planning on Graphs. In