LES: Locally Exploitative Sampling for Robot Path Planning
Sagar Suhas Joshi, Seth Hutchinson, Panagiotis Tsiotras

Abstract—Sampling-based algorithms solve the path planning problem by generating random samples in the search-space and incrementally growing a connectivity graph or a tree. Conventionally, the sampling strategy used in these algorithms is biased towards exploration to acquire information about the search-space. In contrast, this work proposes an optimization-based procedure that generates new samples to improve the cost-to-come value of vertices in a neighborhood. The application of the proposed algorithm adds an exploitative bias to sampling and results in faster convergence to the optimal solution compared to other state-of-the-art sampling techniques. This is demonstrated using benchmarking experiments performed for a variety of higher-dimensional robotic planning tasks.

I. INTRODUCTION

Sampling-based motion planning (SBMP) algorithms have become the default choice for solving robotic planning tasks due to their scalability to higher-dimensional problems. These algorithms do not resort to discretization or explicit construction of the search-space. Instead, popular single-query SBMP algorithms such as RRT [1] and multi-query algorithms such as PRM [2] use a black-box collision checking function to probe a set of random samples and local connections to incrementally build a connectivity graph. These algorithms are probabilistically complete, i.e., the probability of finding a feasible solution, if it exists, approaches unity as the number of samples tends to infinity.
Asymptotically optimal variants of RRT, such as RRT* [3], converge to the optimal solution almost surely. These algorithms comprise two fundamental modules, namely, graph-growth and graph-processing. The graph-growth module generates random samples and performs nearest-neighbor, local steering and collision checking calculations to build a connectivity graph during planning time. The graph-processing module then tries to improve the cost-to-come value of the vertices by performing operations such as edge rewiring. The graph is said to be rewired if the parent of a vertex changes, improving its cost-to-come value. In particular, the “local rewiring” procedure of RRT* first selects the best parent for a newly initialized vertex. It then checks if this new vertex can be a better parent for any of the vertices in its neighborhood. The RRT# [4] algorithm provides an extension to the RRT* procedure by “globally rewiring” the graph using dynamic programming. It uses value-iteration [4] or policy-iteration [5] to optimally connect each vertex in the graph in order to minimize its cost-to-come value.

Institute for Robotics and Intelligent Machines, Georgia Institute of Technology, USA. Email: {sagarsjoshi94, seth, tsiotras}@gatech.edu

In this work, convergence implies convergence to the optimal solution, unless stated otherwise.
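The choose-parent and rewire-neighbors steps described above can be sketched in a few lines. This is an illustrative sketch, not the RRT* implementation: vertex bookkeeping is reduced to a cost-to-come dictionary, edge costs are Euclidean, and collision checks are omitted (all simplifying assumptions).

```python
import math

def local_rewire(x_new, neighbors, g, cost=math.dist):
    """RRT*-style local rewiring sketch. g maps vertex -> cost-to-come.
    1) connect x_new to the neighbor giving the cheapest cost-to-come;
    2) check whether routing through x_new improves any neighbor."""
    parent = min(neighbors, key=lambda u: g[u] + cost(u, x_new))
    g[x_new] = g[parent] + cost(parent, x_new)
    rewired = []
    for u in neighbors:
        if g[x_new] + cost(x_new, u) < g[u]:
            g[u] = g[x_new] + cost(x_new, u)  # x_new becomes u's parent
            rewired.append(u)
    return parent, rewired
```

A vertex appears in `rewired` exactly when the new sample improved its cost-to-come value, which is the event the sampling strategy proposed in this work tries to make more likely.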
Fig. 1: Schematic motivating the proposed LES algorithm, which leverages local information and solves an optimization problem to generate the blue sample. In contrast to the red sample, the blue sample can initiate rewirings and improve the cost-to-come value of the (green) vertices in the graph.

Recently proposed methods such as BIT* [6] and FMT* [7] also use ideas from dynamic programming and heuristics to obtain faster convergence than RRT*. Using an intelligent sampling strategy in conjunction with these graph-processing methods is effective for accelerating the convergence of SBMP algorithms. Uniform random sampling, a widely used approach, biases the graph growth towards vertices with larger Voronoi regions in RRT-style methods [1]. This results in rapid exploration of the search-space and is effective for finding an initial solution in single-query scenarios. However, this strategy, like many others, prioritizes the acquisition of new information over the improvement of current paths in the planner’s graph. This bias towards exploration can have a detrimental effect on convergence, especially in higher dimensions [8].

The algorithm proposed in this work aims to generate new samples that can improve the cost-to-come value of vertices and initiate rewirings, in contrast to exploration-biased techniques. The proposed algorithm first selects a vertex and then generates a new sample in its vicinity. This sample is generated by solving an optimization problem, wherein the objective is to minimize the sum of the cost-to-come values of a vertex and its randomly selected descendants. The proposed sampling algorithm thus leverages local information to provide an exploitative bias. The combination of global exploratory and locally exploitative sampling results in faster convergence for SBMP algorithms, as demonstrated by several benchmarking experiments.

II. RELATED WORK

Many approaches have been suggested to address the exploration-exploitation trade-off in SBMP.
Akgun and Stilman [9] generate samples near a randomly selected state on the current solution path. This local biasing technique increases the probability of improving the current solution at the cost of exploring other homotopy classes. The RRM algorithm [10] adds edges to the current roadmap to balance exploration and refinement. Techniques such as [11], [12], [13], [14], [15] use heuristics and obstacle information to guide search during planning. T-RRT [16] and its variants [17], [18] implement a transition-test to avoid unhindered exploration in high-cost regions. These approaches provide a way to focus search during planning. However, they do not directly address the problem of improving the cost-to-come value of vertices through sampling.

Unlike the above approaches, Informed Sampling [8] avoids redundant exploration after an initial solution is discovered. It focuses search onto a subset of the search-space, called the Informed Set, that contains all the points that can potentially improve the current solution. Generating new samples in the Informed Set is thus a necessary (but not sufficient) condition to improve the current solution. The Relevant Region [19], a subset of the Informed Set, leverages cost-to-come information from the planner’s graph to further focus search during planning. The combination of Relevant Region and Informed Sampling results in accelerated convergence in uniform and general cost-space environments. However, these techniques do not generate samples to directly improve the cost-to-come value of vertices. Hence, some of the samples may fail to trigger any improvement in the planner’s graph. The sampling algorithm proposed in this work also generates new samples in the Relevant Region to avoid redundant exploration. However, it does so by solving an optimization problem aimed at improving the cost-to-come value of vertices in the graph.
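The vertex filter underlying the Relevant Region (formalized later in Eq. (12)) keeps only vertices whose cost-to-come plus an admissible heuristic undercuts the current solution cost. A minimal sketch, assuming a Euclidean heuristic and a dictionary for cost-to-come bookkeeping (both illustrative choices):

```python
import math

def relevant_vertices(g, x_goal, c_best):
    """Keep vertices v with g(v) + h(v, x_goal) < c_best, i.e. those that
    could still lie on a path better than the current solution.
    g: dict mapping vertex tuple -> cost-to-come; h is the Euclidean distance."""
    return [v for v in g if g[v] + math.dist(v, x_goal) < c_best]
```

Samples generated near the surviving vertices cannot be trivially useless: every discarded vertex already fails the necessary condition for improving the current solution.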
Application of the proposed sampling algorithm thus initiates a higher number of rewirings and results in faster convergence; see Fig. 1 for an illustration.

Approaches combining sampling-based planning and local optimizers have also been explored. RABIT* [20] uses CHOMP [21] to obtain feasible, high-quality edges connecting any two vertices during a global search performed by BIT*. However, RABIT* requires pre-computed domain information, such as an obstacle potential function, which may not be available in many practical problems. Volumetric Tree* [22] addresses this limitation by constructing an approximation of the obstacle-free configuration space on-the-fly. However, it relies on uniform random sampling for graph construction, which may lead to redundant exploration. DRRT [23] employs a gradient-descent based procedure in the graph-processing module. It attempts to optimize the locations of vertices to improve their cost-to-come values. However, DRRT incurs a higher computational cost due to the extra calls to the nearest-neighbor and collision checking functions needed to ensure edge feasibility after vertex movement. This work combines ideas from DRRT and [19] to propose an optimization-based sampling procedure. The proposed method does not require extra calls to the collision checker or nearest-neighbor structure and can be used in conjunction with any graph-processing module.

Fig. 2: Planning with the proposed LES algorithm on a potential cost-map. The robot incurs a higher cost if it travels in the white regions.

In the following sections, the path planning problem is formally defined, followed by a description of and motivation behind the optimization problem used to generate new samples. The proposed sampling algorithm is then discussed, followed by benchmarking experiments.

III. PROBLEM DEFINITION

A. Path Planning Problem
Consider the search-space $X \subset \mathbb{R}^d$, with dimension $d \geq 2$. Let the obstacle space and the free space be denoted by $X_{obs}$ and $X_{free}$ respectively. Then $X_{free} = \mathrm{cl}(X \setminus X_{obs})$, where $\mathrm{cl}(A)$ represents the closure of the set $A \subset \mathbb{R}^d$. Let the cost of moving from a point $x_1 \in X$ to $x_2 \in X$ along a path $\pi : [0,1] \to X$, $\pi(0) = x_1$, $\pi(1) = x_2$, be denoted by $c_\pi(x_1, x_2)$,

$$c_\pi(x_1, x_2) = \int_0^1 C(\pi(s)) \left\| \frac{d\pi(s)}{ds} \right\| ds. \quad (1)$$

Here, $C : X \to \mathbb{R}_{\geq 1}$ denotes a continuous state-cost function. Note that (1) represents the integral of state-cost (IC) metric as a measure of path quality [16]. The path $\pi$ in (1) is assumed to be collision-free; the path cost is infinite otherwise. The optimal path planning problem can be formally defined as the search for the minimum-cost path $\pi^*$ from the set of feasible paths $\Pi$ connecting the start state $x_s \in X_{free}$ to the goal region $X_{goal} \subset X_{free}$:

$$\pi^* = \arg\min_{\pi \in \Pi} c_\pi(x_s, x_g), \quad \text{subject to: } \pi(0) = x_s,\ \pi(1) = x_g \in X_{goal},\ \pi(s) \in X_{free}\ \forall s \in [0,1]. \quad (2)$$

SBMP algorithms solve the above problem (2) by constructing a connectivity graph $G = (V, E)$ with a finite set of vertices $V \subset X_{free}$ and a set of edges $E \subseteq V \times V$. The “geometric” versions of SBMP algorithms ignore the kino-dynamic constraints of the robot. Conventionally, these planners construct an edge $(u, v) \in E$ using the straight-line path $\pi(s) = u + (v - u)s$, $s \in [0,1]$, connecting $u$ and $v$. Using (1), the edge cost can be written as

$$c_\ell(u, v) = \|u - v\| \int_0^1 C(u + (v - u)s)\, ds. \quad (3)$$

Fig. 3: Neighborhood around a vertex $v$. Here, $n_v = 4$ and $\hat{d}_{v,V_v} = 4 + (1 + 2) = 7$.

SBMP algorithms can perform numerical integration to calculate the edge cost $c_\ell(u, v)$ for any edge $(u, v) \in E$. The graph $G$ embeds a spanning tree $T = (V_t, E_t)$ with $V_t = V$ and $E_t = \{(u, v) \in E \mid v = \mathrm{parent}(u)\}$. Here, $\mathrm{parent} : V \to V$ denotes the function mapping a vertex to its unique parent in the tree.
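The edge cost (3) is a one-dimensional integral along the straight-line edge and is typically evaluated numerically; a minimal sketch using the trapezoidal rule (the discretization count `n` is an arbitrary choice here):

```python
def edge_cost(u, v, C, n=64):
    """Approximate c_l(u, v) = ||u - v|| * integral_0^1 C(u + (v-u)s) ds
    (Eq. (3)) with the trapezoidal rule on n+1 samples of the state cost C."""
    length = sum((vi - ui) ** 2 for ui, vi in zip(u, v)) ** 0.5
    total = 0.0
    for k in range(n + 1):
        s = k / n
        x = [ui + (vi - ui) * s for ui, vi in zip(u, v)]
        w = 0.5 if k in (0, n) else 1.0  # trapezoidal end-point weights
        total += w * C(x)
    return length * total / n
```

With $C \equiv 1$ the edge cost reduces to the Euclidean length of the edge, which is the metric used in the manipulator experiments later in the paper.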
By definition, we have $\mathrm{parent}(x_s) = x_s$. The cost-to-come value $g_T(v)$ for a vertex $v$ denotes the sum of the edge costs along the path from $v$ to the root $x_s$ in $T$. The function $g_T : V \to \mathbb{R}_{\geq 0}$ can be written recursively as

$$g_T(v) = g_T(v_p) + c_\ell(v_p, v), \quad (4)$$

where $v_p = \mathrm{parent}(v)$. By definition, the recursion ends at $x_s$ with $g_T(x_s) = 0$. Let the set of children of a vertex $v$ be denoted by $V_v = \{u \in V \mid v = \mathrm{parent}(u)\}$ and the number of children by $n_v = |V_v|$. Descendants of a vertex $v$ are all the vertices $u \in V$ whose path from $u$ to the root $x_s$ in $T$ contains $v$. Let $D_v$ denote the set of descendants of $v$ and $d_v = |D_v|$. Then,

$$d_v = n_v + \sum_{u \in V_v} d_u. \quad (5)$$

Note that for a leaf vertex $v \in V$, we have $n_v = d_v = 0$. Let $h : X \times X \to \mathbb{R}_{\geq 0}$ denote a consistent heuristic function. This function obeys the triangle inequality and gives an under-estimate of the path cost $c_\pi(x_1, x_2)$ between any two points $x_1, x_2 \in X$. An example of such a function $h$ is the $L_2$-norm (Euclidean distance). Let $B_\epsilon(x_o)$ denote an $\epsilon$-ball around $x_o \in X$, given by $B_\epsilon(x_o) = \{x \in X \mid \|x - x_o\| < \epsilon\}$, for $\epsilon > 0$. Finally, let $\mu(A)$ denote the Lebesgue measure of the set $A \subset \mathbb{R}^d$.

B. Optimization Problem for Sampling
Given $G$, the objective of the graph-processing module is to minimize the cost-to-come value of all vertices. This objective can be written as

$$J_T = \sum_{u \in V} g_T(u). \quad (6)$$

Let $J_T(v)$ denote the terms of $J_T$ that depend only on a particular vertex $v \in V$. The position of vertex $v$ impacts the cost-to-come value of itself and its descendants. Then,

$$J_T(v) = g_T(v) + \sum_{w \in D_v} g_T(w). \quad (7)$$

Fig. 4: Planning in the joint space of Panda ($\mathbb{R}^7$) and Baxter ($\mathbb{R}^{14}$) manipulator arms. The start and goal positions for both robots are indicated in the top and bottom figures respectively.

Using (4), the above equation for $J_T(v)$ can be written in terms of the edge costs $c_\ell(v_p, v)$ and $c_\ell(v, u)$, where $v_p = \mathrm{parent}(v)$ and $u$ is any child of $v$. The edge cost $c_\ell(v_p, v)$ appears $1 + d_v$ times in total, to calculate the cost-to-come value of $v$ and its descendants. Similarly, the edge cost $c_\ell(v, u)$ appears $1 + d_u$ times in total, to calculate the cost-to-come value of $u$ and its descendants. Then,

$$J_T(v) = k + (1 + d_v)\, c_\ell(v_p, v) + \sum_{u \in V_v} (1 + d_u)\, c_\ell(v, u). \quad (8)$$

Note again that equation (8) for $J_T(v)$ only contains terms dependent on $v$; other terms are incorporated into the constant $k$. Also, $d_v$ and $d_u$ in (8) are linked by equation (5). A new sample can be generated by first selecting a vertex $v$ and then finding a “better” position for it by optimizing $J_T$ with respect to $v$. Note that $\arg\min_v J_T = \arg\min_v J_T(v)$. However, calculating the values of the coefficients $d_v$, $d_u$ in (8) requires a depth-first search with time complexity $O(|V_t|)$. This may become computationally cumbersome, especially as the planner tree grows larger with the number of iterations. The vertex data structure in standard implementations of SBMP algorithms (such as OMPL [24]) only stores information about the vertex’s children.
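The recursions (4) and (5), and the child-only bookkeeping that motivates the approximation in the next subsection, can be captured by a minimal vertex data structure. This is an illustrative sketch (not the OMPL structure), assuming Euclidean edge costs, i.e. unit state cost:

```python
import math

class Vertex:
    """Tree vertex storing its parent and children, as in standard SBMP
    implementations. Edge costs here are Euclidean (unit state cost)."""
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent          # None marks the root x_s
        self.children = []            # the set V_v
        if parent is not None:
            parent.children.append(self)

    def edge_cost(self):
        return math.dist(self.state, self.parent.state)

    def g(self):
        """Cost-to-come, Eq. (4): g(v) = g(parent(v)) + c_l(parent, v)."""
        return 0.0 if self.parent is None else self.parent.g() + self.edge_cost()

    def d(self):
        """Exact descendant count, Eq. (5): d_v = n_v + sum over children of d_u.
        Requires a full traversal of the subtree below v."""
        return len(self.children) + sum(c.d() for c in self.children)

    def d_hat(self):
        """Approximate count: only the children's child-counts are read,
        so no deep recursion is needed (depth-2 information only)."""
        return len(self.children) + sum(len(c.children) for c in self.children)
```

`d_hat` agrees with `d` whenever the subtree below `v` is at most two levels deep, and undercounts otherwise; this is exactly the trade-off the approximate objective accepts to avoid the $O(|V_t|)$ depth-first search.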
Hence, the following objective function can be considered instead:

$$\hat{J}_{T,V_v}(v) = k + (1 + \hat{d}_{v,V_v})\, c_\ell(v_p, v) + \sum_{u \in V_v} (1 + n_u)\, c_\ell(v, u), \qquad \hat{d}_{v,V_v} = n_v + \sum_{u \in V_v} n_u. \quad (9)$$

Please see Fig. 3. Note that minimizing $\hat{J}_{T,V_v}(v)$ in (9) with respect to $v$ is equivalent to minimizing the cost-to-come values of $v$, its set of children $V_v$ and their children. The objective $\hat{J}_{T,V_v}(v)$ can be calculated efficiently with the information contained in the data structure of vertex $v$, without recursing deeper down the tree. Effectively, $\hat{J}_{T,V_v}(v)$ considers descendants of $v$ up to a depth of 2. This can be generalized to depth-$k$ descendants, at a higher computational cost for calculating the coefficients.

Finally, a random subset of the children, denoted by $\hat{V}_v \subseteq V_v$, can be selected and a new sample generated by minimizing $\hat{J}_{T,\hat{V}_v}(v)$. This serves two purposes. First, it promotes a desirable randomness in the sampling process. Second, focusing on the subset $\hat{V}_v$ effectively assigns a weight of zero to the terms corresponding to the vertices $V_v \setminus \hat{V}_v$ in the objective (9). This can lead to a better improvement in the cost-to-come value of the vertices in $\hat{V}_v$.

Algorithm 1: LES Algorithm Flow
1: V ← {x_s}; E ← ∅; G ← (V, E)
2: for i = 1 : N do
3:   c_i ← getBestSolutionCost()
4:   u_rand ∼ U(0, 1)
5:   if u_rand < p_LES and c_i < ∞ then
6:     v ← chooseVertex(V_rel)
7:     ê ← getGradientDirection(v)
8:     γ ← getStepSize(v, ê)
9:     x_rand ← v − γ ê
10:  else
11:    x_rand ← InformedSampling(c_i)
12:  Extend(x_rand)
13:  GraphProcessing(G)
14: return G

Algorithm 2: Calculate Gradient Direction
getGradientDirection(v):
1: V̂_v ← getRandomSubset(V_v)
2: d̂_{v,V̂_v} ← n_v + Σ_{u∈V̂_v} n_u
3: e ← (1 + d̂_{v,V̂_v}) ∂c_ℓ(v_p, v)/∂v + Σ_{u∈V̂_v} (1 + n_u) ∂c_ℓ(v, u)/∂v
4: ê ← e/‖e‖
5: return ê

IV. LOCALLY EXPLOITATIVE SAMPLING

The proposed “Locally Exploitative Sampling” (LES) procedure first selects a vertex $v$ and then generates a new sample considering $\hat{J}_{T,\hat{V}_v}(v)$. Expansive Space Trees (EST) [25] and its variants, such as [12], [13], also proceed by selecting a vertex and generating a random sample in its vicinity. However, the probability of generating a “good” sample (one that can improve $\hat{J}_{T,\hat{V}_v}(v)$) with such random search may decrease rapidly in higher dimensions.
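The gradient-and-step-size machinery of Algorithms 2 and 3 can be sketched in plain Python. Here `J` stands in for $\hat{J}_{T,\hat{V}_v}$, the symmetric-difference gradient follows Algorithm 2, and the shrinking step-size search follows Algorithm 3; the function names and the quadratic test objective below are illustrative assumptions, not the paper's implementation.

```python
import random

def gradient_direction(J, x, h=1e-6):
    """Algorithm 2 sketch: numerical gradient of J at x via the
    symmetric-difference formula, normalized to a unit direction."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((J(xp) - J(xm)) / (2.0 * h))
    norm = sum(g * g for g in grad) ** 0.5
    return [g / norm for g in grad] if norm > 0 else grad

def get_step_size(J, v, e_hat, gamma_rel, d, delta=1e-5, rng=random):
    """Algorithm 3 sketch: sample gamma in (0, gamma_max], biased toward
    gamma_max by the 1/d exponent; shrink the interval when the candidate
    v - gamma*e_hat fails to improve J, and fall back to a random step
    in (0, gamma_rel) if the interval collapses below delta."""
    gamma_max = gamma_rel
    while gamma_max > delta:
        gamma = (rng.random() ** (1.0 / d)) * gamma_max
        cand = [vi - gamma * ei for vi, ei in zip(v, e_hat)]
        if J(cand) < J(v):
            return gamma
        gamma_max = gamma               # candidate failed: shrink the interval
    return (rng.random() ** (1.0 / d)) * gamma_rel  # avoid clumping near v
```

For a convex test objective such as $J(x) = \sum_i x_i^2$, moving along $-\hat{e}$ strictly decreases $J$ for small steps, so the sampled step is typically accepted on the first try.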
This is illustrated in the Appendix by considering the problem of minimizing a quadratic function $J_q(x) = x^T x$ with random local search: the probability of generating a sample that improves $J_q$ diminishes exponentially with the dimension $d$.

This motivates the LES procedure, given in Algorithm 1. With probability $p_{LES}$, LES is used to generate a new sample $x_{rand}$ (Algorithm 1, lines 6-8). Otherwise, a new sample is

Algorithm 3: Calculate Step-size
getStepSize(v, ê):
1: γ_rel ← getMaxStepSize(v, ê)
2: γ_max ← γ_rel
3: while γ_max > δ do
4:   u_rand ∼ U(0, 1)
5:   γ ← (u_rand)^{1/d} γ_max
6:   x_rand ← v − γ ê
7:   if Ĵ_{T,V̂_v}(x_rand) < Ĵ_{T,V̂_v}(v) then
8:     break
9:   else
10:    γ_max ← γ
11: if γ_max < δ then
12:   u_rand ∼ U(0, 1); γ ← (u_rand)^{1/d} γ_rel
13: return γ

generated using the conventional Informed Sampling technique given in [8] (Algorithm 1, line 11). This ensures a balance between exploration and exploitation (controlled by the parameter $p_{LES}$) and graph growth in all the relevant homotopy classes. The
Extend function takes this random sample and performs the relevant procedures (nearest-neighbor, local steering and collision checking) to incorporate a new vertex into the graph (Algorithm 1, line 12). Finally, the graph-processing module operates on $G$ considering the addition of the new vertex (Algorithm 1, line 13).

If the best solution cost $c_i$ after $i$ iterations is finite (indicating that a sub-optimal solution has been discovered), redundant exploration can be avoided by focusing the search on the Informed Set or the Relevant Region. As the Informed Set may be ineffective in focusing search for general cost-space problems, LES generates new samples in the Relevant Region $X^\epsilon_{rel}$ [19], defined as follows:

$$X^\epsilon_{rel} = \bigcup_{v \in V_{rel}} B^\epsilon_{rel}(v), \quad (10)$$

where

$$B^\epsilon_{rel}(v) = \{x \in B_\epsilon(v) \mid \hat{f}_v(x) < c_i\}, \qquad \hat{f}_v(x) = c_\ell(v, x) + g_T(v) + h(x, x_g), \quad (11)$$

and $V_{rel}$ denotes the set of “relevant vertices”,

$$V_{rel} = \{v \in V \mid g_T(v) + h(v, x_g) < c_i\}. \quad (12)$$

The value of $\epsilon$ in (10) is set proportional to $\eta$, the range parameter in SBMP algorithms [24], which controls the maximum edge-length in $G$. The procedure for selecting a vertex (chooseVertex) is similar to the implementation in [19]. It assigns a weight $q_v$ to each $v \in V_{rel}$ and uses a binary heap data structure for sorting. Start, goal and leaf vertices (vertices with no children) are ignored by the chooseVertex function.

Note that $\hat{J}_{T,\hat{V}_v}(v)$ represents a non-linear objective function. Hence, LES proceeds by numerically calculating the gradient of $\hat{J}_{T,\hat{V}_v}(v)$ and moving an appropriate step-size in the direction of the gradient. The procedure to calculate the gradient direction $\hat{e}$ is given in Algorithm 2. First, a random

Fig. 5: Benchmarking plots for the numerical experiments. Solid lines indicate the value averaged over 100 trials and the error bars represent standard deviation.
Application of the proposed LES method (red) leads to faster convergence and a larger number of tree rewirings in higher dimensions. However, it incurs a higher computational cost and hence executes fewer iterations compared to Informed (magenta) and Relevant Region (blue) sampling.

subset of children $\hat{V}_v$ is obtained. The gradient of $\hat{J}_{T,\hat{V}_v}$ with respect to $v$ is calculated numerically using the symmetric difference formula (Algorithm 2, line 3). Having obtained the gradient direction $\hat{e}$, the algorithm to calculate the step-size is given in Algorithm 3. As finding the optimal step-size $\gamma^*$ by solving $\arg\min_\gamma \hat{J}_{T,\hat{V}_v}(v - \gamma\hat{e})$ is intractable, approaches such as backtracking line search [26] have been suggested. However, executing backtracking line search is computationally not viable for the current application, as it requires a large number of expensive evaluations of $\hat{J}_{T,\hat{V}_v}$. Instead, LES uses the procedure given in Algorithm 3, which is similar to the Hit-and-Run sampler implemented in [27]. First, given a vertex $v$ and the travel direction $-\hat{e}$, the procedure in [19] is used to calculate the maximum step-size $\gamma_{rel}$. This ensures that a candidate $v - \gamma\hat{e} \in X^\epsilon_{rel}$ for any $\gamma \in (0, \gamma_{rel})$. The variable $\gamma_{max}$ is set to $\gamma_{rel}$. Next, a random step-size $\gamma$ is sampled from the interval $(0, \gamma_{max})$. The exponent of $1/d$ in Algorithm 3, line 5, biases $\gamma$ towards $\gamma_{max}$. If the candidate $v - \gamma\hat{e}$ results in an improvement of $\hat{J}_{T,\hat{V}_v}$, the step-size $\gamma$ is returned. Otherwise, $\gamma_{max}$ is updated to $\gamma$. Thus, the search interval is sequentially reduced until a suitable step-size is discovered. Theoretically, a travel of infinitesimal magnitude in the direction of the gradient always results in an improvement. However, if $\gamma_{max}$ is less than a small quantity $\delta \ll \eta$, then a random $\gamma$ in the interval $(0, \gamma_{rel})$ is returned (Algorithm
3, lines 11-12) to avoid clumping of new vertices around $v$.

V. NUMERICAL EXPERIMENTS

The proposed LES algorithm was benchmarked against the Informed sampler and the Relevant Region sampler described in [8] and [19] respectively. Note that LES and the Relevant Region sampler share a similar chooseVertex procedure. However, the Relevant Region sampler only generates random samples in $X^\epsilon_{rel}$ and does not consider the optimization problem corresponding to (9). All the algorithms were implemented using C++/OMPL [24]. Data was gathered over 100 trials for each experiment using the standardized OMPL benchmarking tools [28]. All experiments were performed on a 64-bit laptop running Ubuntu 16.04, with 16 GB RAM and an Intel i7 processor. The parameter $p_{LES}$ and an analogous parameter $p_{rel}$ for the Relevant Region sampler were set to the same value, and $\delta$ was set to a small threshold. All sampling strategies used a goal bias and were paired with RRT#’s global rewiring for graph-processing. A description of the different benchmarking environments is given below.

Potential Cost-map:
This environment, illustrated in Fig. 2, has the state-cost function

$$C(x) = 1 + 9 \sum_i \exp\left(-\|x_{c_i} - x\|\right). \quad (13)$$

Here, the $x_{c_i}$ represent the center points of the high-cost white regions. The objective for the robot is to plan a path to the goal while avoiding these soft obstacles. The range parameter $\eta$ was tuned separately for the 2D, 4D and 6D versions of the environment.

Robot Manipulators:
A planning problem for a 7-DOF Panda and a 14-DOF Baxter arm is illustrated in Fig. 4. The objective was to find the minimum-length path ($C(x) = 1$ for all $x \in X$) in the configuration space with strict joint limits ($\mathbb{R}^7$ for Panda, $\mathbb{R}^{14}$ for Baxter). These joint limits and the collision checking calculations were implemented with the help of MoveIt! [29]. The range parameter $\eta$ was tuned separately for the Panda and Baxter experiments.

Results from the numerical experiments are illustrated in Fig. 5. The proposed LES algorithm outperforms the Informed (magenta) and Relevant Region (blue) samplers in the higher-dimensional settings (Potential 6D, Panda, Baxter) in terms of cost convergence. LES also initiates a larger number of rewirings in $T$. However, similar performance gains are not seen in the lower-dimensional environments (Potential 2D, 4D). The Relevant Region sampler, with its focusing properties, performs better than Informed Sampling. LES incurs a higher computational cost due to the numerical gradient calculations in Algorithm 2 and the expensive function evaluations of $\hat{J}_{T,\hat{V}_v}$ in Algorithm 3. Thus, the application of LES leads to fewer iterations executed in a given time period compared to the other two methods. This can slow down convergence in lower dimensions. However, random search techniques are affected by the “curse of dimensionality”, as illustrated in the Appendix. This justifies the computationally costly procedures of LES, which lead to accelerated convergence in higher dimensions.

VI. CONCLUSION

This work proposes the “Locally Exploitative Sampling” (LES) algorithm, which generates new samples to improve the cost-to-come value of vertices in a neighborhood. LES numerically calculates the gradient of (9) and decides an appropriate step-size to obtain a new sample. Although computationally costlier, LES adds an “exploitative bias” that can accelerate convergence of SBMP algorithms, especially in higher dimensions.
LES generates new samples in the Relevant Region, a subset of the Informed Set, to avoid redundant exploration after an initial solution is discovered. As discussed earlier, Informed Sampling is a necessary condition to improve the current solution. However, it is not sufficient, as an “Informed sample” is not guaranteed to bring about improvements in the current solution or the cost-to-come value of vertices. LES can be seen as a way to address this limitation of Informed Sampling.

LES presents many openings for future research. LES can be extended to kino-dynamic settings and be used with planners such as SST [30]. While the current implementation does not leverage the obstacle data gathered by the planner, ideas from [14] can be used to build an “obstacle-aware” version of LES. Exploring the depth-$k$ generalization of the objective (9) and analysing its effect on convergence and computational cost is also the focus of future work.

APPENDIX

The following analysis is similar to the one provided in [31]. Consider the problem of minimizing a quadratic objective function $J_q(x) = x^T x$ with random local search. Let the starting state be $x_o \in \mathbb{R}^d$ with the corresponding objective cost $J_q(x_o)$. Random search generates samples in the set $B_\epsilon(x_o)$ to find a new state with cost less than $J_q(x_o)$. Assume $\epsilon < \|x_o\|$. The set of states that provide an improvement over $J_q(x_o)$ satisfies $x^T x < x_o^T x_o$. This set can be denoted as $B_{\|x_o\|}(0)$, where $0$ is the origin. The set of good samples thus lies in the set $B_\epsilon(x_o) \cap B_{\|x_o\|}(0)$; see Fig. 6.

Fig. 6: Schematic for the analysis in the Appendix. Black and magenta circles illustrate the sets $B_{\|x_o\|}(0)$ and $B_\epsilon(x_o)$ respectively. The intersection $B_\epsilon(x_o) \cap B_{\|x_o\|}(0)$ can be over-approximated by a hyper-sphere centered at $x_c$ with radius $r_c$.
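The shrinkage of this intersection with growing dimension can be sanity-checked empirically: draw uniform samples from $B_\epsilon(x_o)$ and count how often they land inside $B_{\|x_o\|}(0)$, comparing against the analytic upper bound $(1 - \epsilon^2/(4\|x_o\|^2))^{d/2}$ of (15). A sketch (the sample count and seed are arbitrary choices):

```python
import math, random

def uniform_in_ball(rng, d, eps):
    """Uniform sample in a d-dimensional ball of radius eps: Gaussian
    direction plus the u^(1/d) radius trick (the same exponent that
    biases step-sizes in Algorithm 3)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.hypot(*v)
    r = eps * (rng.random() ** (1.0 / d))
    return [r * vi / norm for vi in v]

def improvement_probability(d, eps, R, n=20000, seed=0):
    """Monte Carlo estimate of the probability that a uniform sample in
    B_eps(x_o), with ||x_o|| = R, improves J_q(x) = x^T x, returned
    together with the analytic upper bound (1 - eps^2/(4R^2))^(d/2)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        p = uniform_in_ball(rng, d, eps)
        p[0] += R                  # place x_o on the first axis w.l.o.g.
        if math.hypot(*p) < R:     # inside B_R(0): the sample improves J_q
            hits += 1
    bound = (1.0 - eps * eps / (4.0 * R * R)) ** (d / 2.0)
    return hits / n, bound
```

For a fixed ratio $\epsilon / \|x_o\|$, the empirical hit-rate stays below the bound and both decay as $d$ grows, which is the curse-of-dimensionality effect this Appendix quantifies.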
This intersection between two hyper-spheres can be over-approximated by $B_{r_c}(x_c)$, where

$$r_c = \epsilon \sqrt{1 - \frac{\epsilon^2}{4\|x_o\|^2}}. \quad (14)$$

The probability of generating a good sample using random search is then

$$P\big(x \in B_\epsilon(x_o) \cap B_{\|x_o\|}(0)\big) = \frac{\mu\big(B_\epsilon(x_o) \cap B_{\|x_o\|}(0)\big)}{\mu\big(B_\epsilon(x_o)\big)} < \frac{\mu\big(B_{r_c}(x_c)\big)}{\mu\big(B_\epsilon(x_o)\big)} = \left(1 - \frac{\epsilon^2}{4\|x_o\|^2}\right)^{d/2}. \quad (15)$$

Thus, the probability of generating a good sample decreases exponentially with the dimension $d$.

Acknowledgements: This work has been supported by NSF awards IIS-1617630 and IIS-2008686.

REFERENCES
[1] S. M. LaValle and J. J. Kuffner Jr, “Randomized kinodynamic planning,”
The International Journal of Robotics Research, vol. 20, no. 5, pp. 378–400, 2001.
[2] L. E. Kavraki, P. Svestka, J.-C. Latombe, and M. H. Overmars, “Probabilistic roadmaps for path planning in high-dimensional configuration spaces,” IEEE Transactions on Robotics and Automation, vol. 12, no. 4, pp. 566–580, 1996.
[3] S. Karaman and E. Frazzoli, “Sampling-based algorithms for optimal motion planning,” The International Journal of Robotics Research, vol. 30, no. 7, pp. 846–894, June 2011.
[4] O. Arslan and P. Tsiotras, “Use of relaxation methods in sampling-based algorithms for optimal motion planning,” in IEEE International Conference on Robotics and Automation, Karlsruhe, Germany, May 6–10 2013, pp. 2421–2428.
[5] ——, “Incremental sampling-based motion planners using policy iteration methods,” in IEEE 55th Conference on Decision and Control, Las Vegas, NV, Dec. 12–15 2016, pp. 5004–5009.
[6] J. D. Gammell, S. S. Srinivasa, and T. D. Barfoot, “Batch informed trees (BIT*): Sampling-based optimal planning via the heuristically guided search of implicit random geometric graphs,” in IEEE International Conference on Robotics and Automation, Seattle, WA, May 25–30 2015, pp. 3067–3074.
[7] L. Janson, E. Schmerling, A. Clark, and M. Pavone, “Fast marching tree: A fast marching sampling-based method for optimal motion planning in many dimensions,” The International Journal of Robotics Research, vol. 34, no. 7, pp. 883–921, May 2015.
[8] J. D. Gammell, T. D. Barfoot, and S. S. Srinivasa, “Informed sampling for asymptotically optimal path planning,” IEEE Transactions on Robotics, vol. 34, no. 4, pp. 966–984, Aug. 2018.
[9] B. Akgun and M. Stilman, “Sampling heuristics for optimal motion planning in high dimensions,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, San Francisco, CA, Sept. 25–30 2011, pp. 2640–2645.
[10] R. Alterovitz, S. Patil, and A. Derbakova, “Rapidly-exploring roadmaps: Weighing exploration vs. refinement in optimal motion planning,” in IEEE International Conference on Robotics and Automation, Shanghai, China, 2011, pp. 3706–3712.
[11] C. Urmson and R. Simmons, “Approaches for heuristically biasing RRT growth,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, vol. 2, Las Vegas, NV, Oct. 27–31 2003, pp. 1178–1183.
[12] J. M. Phillips, N. Bedrossian, and L. E. Kavraki, “Guided expansive spaces trees: a search strategy for motion- and cost-constrained state spaces,” in IEEE International Conference on Robotics and Automation, New Orleans, LA, April 26–30 2004, pp. 3968–3973.
[13] S. M. Persson and I. Sharf, “Sampling-based A* algorithm for robot path-planning,” The International Journal of Robotics Research, vol. 33, no. 13, pp. 1683–1708, 2014.
[14] T. Lai, P. Morere, F. Ramos, and G. Francis, “Bayesian local sampling-based planning,” IEEE Robotics and Automation Letters, vol. 5, no. 2, pp. 1954–1961, 2020.
[15] S. Rodriguez, X. Tang, J.-M. Lien, and N. M. Amato, “An obstacle-based rapidly-exploring random tree,” in IEEE International Conference on Robotics and Automation, Orlando, FL, May 15–19 2006, pp. 895–900.
[16] L. Jaillet, J. Cortés, and T. Siméon, “Sampling-based path planning on configuration-space costmaps,” IEEE Transactions on Robotics, vol. 26, no. 4, pp. 635–646, 2010.
[17] D. Devaurs, T. Siméon, and J. Cortés, “Enhancing the transition-based RRT to deal with complex cost spaces,” in IEEE International Conference on Robotics and Automation, Karlsruhe, Germany, May 6–10 2013, pp. 4120–4125.
[18] ——, “Optimal path planning in complex cost spaces with sampling-based algorithms,” IEEE Transactions on Automation Science and Engineering, vol. 13, no. 2, pp. 415–424, 2015.
[19] S. S. Joshi and P. Tsiotras, “Relevant region exploration on general cost-maps for sampling-based motion planning,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, Oct. 25–29 2020.
[20] S. Choudhury, J. D. Gammell, T. D. Barfoot, S. S. Srinivasa, and S. Scherer, “Regionally accelerated batch informed trees (RABIT*): A framework to integrate local information into optimal path planning,” in IEEE International Conference on Robotics and Automation (ICRA), 2016, pp. 4207–4214.
[21] M. Zucker, N. Ratliff, A. D. Dragan, M. Pivtoraiko, M. Klingensmith, C. M. Dellin, J. A. Bagnell, and S. S. Srinivasa, “CHOMP: Covariant Hamiltonian optimization for motion planning,” The International Journal of Robotics Research, vol. 32, no. 9–10, pp. 1164–1193, 2013.
[22] D. Kim, M. Kang, and S.-E. Yoon, “Volumetric tree*: Adaptive sparse graph for effective exploration of homotopy classes,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2019, pp. 1496–1503.
[23] F. Hauer and P. Tsiotras, “Deformable rapidly-exploring random trees,” in Robotics: Science and Systems, Cambridge, MA, July 12–16 2017.
[24] I. A. Sucan, M. Moll, and L. E. Kavraki, “The open motion planning library,” IEEE Robotics & Automation Magazine, vol. 19, no. 4, pp. 72–82, Dec. 2012.
[25] D. Hsu, J.-C. Latombe, and R. Motwani, “Path planning in expansive configuration spaces,” in IEEE International Conference on Robotics and Automation, vol. 3, Albuquerque, NM, April 25–29 1997, pp. 2719–2726.
[26] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.
[27] D. Yi, R. Thakker, C. Gulino, O. Salzman, and S. Srinivasa, “Generalizing informed sampling for asymptotically-optimal sampling-based kinodynamic planning via Markov Chain Monte Carlo,” in IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia, May 21–25 2018, pp. 7063–7070.
[28] M. Moll, I. A. Sucan, and L. E. Kavraki, “Benchmarking motion planning algorithms: An extensible infrastructure for analysis and visualization,” IEEE Robotics & Automation Magazine, vol. 22, no. 3, pp. 96–102, 2015.
[29] S. Chitta, I. Sucan, and S. Cousins, “MoveIt! [ROS topics],” IEEE Robotics & Automation Magazine, vol. 19, no. 1, pp. 18–19, 2012.
[30] Y. Li, Z. Littlefield, and K. E. Bekris, “Sparse methods for efficient asymptotically optimal kinodynamic planning,” in Algorithmic Foundations of Robotics XI. Springer, 2015, pp. 263–282.
[31] J. Watt, R. Borhani, and A. K. Katsaggelos,