Structured Prediction for CRiSP Inverse Kinematics Learning with Misspecified Robot Models
Gian Maria Marconi, Raffaello Camoriano, Lorenzo Rosasco, Carlo Ciliberto
Abstract
With the recent advances in machine learning, problems that traditionally would require accurate modeling to be solved analytically can now be successfully approached with data-driven strategies. Among these, computing the inverse kinematics of a redundant robot arm poses a significant challenge due to the non-linear structure of the robot, the hard joint constraints, and the non-invertible kinematics map. Moreover, most learning algorithms consider a completely data-driven approach, while useful information on the structure of the robot is often available and should be positively exploited. In this work, we present a simple yet effective approach for learning the inverse kinematics. We introduce a structured prediction algorithm that combines a data-driven strategy with the model provided by a forward kinematics function — even when this function is misspecified — to accurately solve the problem. The proposed approach ensures that predicted joint configurations are well within the robot's constraints. We also provide statistical guarantees on the generalization properties of our estimator, as well as an empirical evaluation of its performance on trajectory reconstruction tasks.
1 Introduction

Computing the inverse kinematics of a robot is a well-known key problem in several applications requiring robot control [20]. This task consists in finding a set of joint configurations that would result in a given pose of the end effector, and is traditionally solved by assuming access to an accurate model of the robot and employing geometric or numerical optimization techniques. However, a major drawback of these strategies is that they are extremely sensitive to inaccuracies in the model. This can be a significant limitation in settings where the kinematic parameters of the robot are only available up to a given precision.

A recently proposed alternative to model-based approaches is to learn the inverse kinematics function from examples of joint configurations and workspace pairs [10, 11, 16, 17]. However, traditional regression techniques are not suited for this task, since computing the inverse kinematics of robots with redundant joints is an ill-posed problem and there are multiple joint configurations that correspond to the same workspace pose [19]. To address this issue, previous works have recast the problem in the velocity domain. The goal becomes that of learning a map from the velocity of the end effector to the velocity of the joints. Alternatively, because this problem is locally linear [23], regression techniques can be employed to compute piecewise functions.

* Equal contribution. Affiliations: RIKEN Center for AI Project, Tokyo, Japan; Laboratory for Computational and Statistical Learning, Istituto Italiano di Tecnologia, Italy, and Massachusetts Institute of Technology, Cambridge, MA, USA; MaLGa & DIBRIS, Università degli Studi di Genova, Genova, Italy; Center for Brains, Minds and Machines, MIT, Cambridge, MA, USA; Computer Science, University College London, London, United Kingdom.

Figure 1: Franka Emika Panda arm tracking a spiral trajectory via the CRiSP-FK inverse kinematics structured predictor.
This idea has been explored both with linear estimators and neural networks (NNs) [16, 23]. In contrast to learning in the velocity domain, the works in [3, 14, 17] proposed to solve the learning problem in the position domain. Albeit more challenging, this strategy has the advantage that it does not limit the algorithm to computing local inverse kinematics estimators, but allows for more global solutions.

In this work, we propose a novel structured prediction strategy to learn the inverse kinematics of a redundant robot. Our approach combines a data-driven strategy with the (possibly inaccurate or biased) forward kinematics function of the robot, potentially obtaining the best of both worlds. We empirically show that our approach can compensate for biases in the forward kinematics and still learn an accurate inverse kinematics. This scenario is common when it is possible to gather data from a real robot, but the available kinematic model is imprecise. Our approach aims to estimate the joint configuration required to achieve a target pose, in contrast with most previous data-driven methods, which only consider the position of the end effector. We test our approach on trajectory reconstruction applications. As a byproduct of our work, we also provide a new dataset for inverse kinematics learning on a 5-DoF planar manipulator and on the 7-DoF Panda robot (see Fig. 1).

The remainder of this paper is organized as follows: in Sec. 2 we introduce the problem of inverse kinematics and review related work on the topic. In Sec. 3 we introduce our method for inverse kinematics and characterize its theoretical properties. In Sec. 4 we empirically evaluate the proposed approach on trajectory reconstruction applications. Sec. 5 concludes this work and discusses potential directions for future research.
2 Problem Setting and Related Work

We introduce here the problem of learning the inverse kinematics of a robotic manipulator and discuss previous work on the topic. Let SO(d) be the special orthogonal group of dimension d ∈ N+. We denote by X = R^d × SO(d) the space of possible end-effector poses, comprising the Cartesian position and orientation of the robot's end effector in a d-dimensional Euclidean space. Assuming a robot with J ∈ N+ joints, we denote by Y = [a_1, b_1] × ... × [a_J, b_J] the space of all admissible joint configurations, such that for any j ∈ {1, ..., J} the set [a_j, b_j] ⊂ [0, 2π) identifies the physical limits of the j-th joint.

Assuming the robot to have forward kinematics g : Y → X, our goal is to learn an inverse kinematics map, namely a function f : X → Y such that

    g ∘ f(x) = x.    (1)

However, finding such a function is not straightforward. Since the forward kinematics g of redundant manipulators is not injective, there are multiple joint configurations that result in the same end-effector pose. A common approach to this problem consists in defining it in the velocity domain and enforcing the uniqueness of the solution with further constraints. The resulting problem can be solved numerically. However, the solution can be highly sensitive to model inaccuracies (i.e., it requires very good knowledge of g) [12].

Data-driven approaches can overcome model inaccuracies by learning the inverse kinematics function from input-output pairs. Assuming g to be unknown, these methods aim to learn f by relying on a finite number n of examples (x_i, y_i)_{i=1}^n such that x_i = g(y_i). To allow for a statistical analysis of our proposed method, in the rest of this work we will assume the pairs (x_i, y_i) to be sampled i.i.d. according to a distribution ρ on X × Y. This can be done even when g is not available, by first randomly sampling a joint configuration y_i and then measuring the robot pose x_i = g(y_i) (see Sec.
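To make the non-injectivity of g concrete, consider the following sketch: a hypothetical 2-link planar arm with unit-length links (not the robots used in the experiments), where two distinct joint configurations reach the exact same Cartesian position.

```python
import numpy as np

def forward_kinematics(joints, link_lengths):
    """Planar forward kinematics g: joint angles -> (x, y, heading).

    `joints` holds relative joint angles; the end-effector heading (the sum of
    all joint angles) is a 1-D stand-in for the SO(2) orientation component.
    """
    angles = np.cumsum(joints)                       # absolute link angles
    x = float(np.sum(link_lengths * np.cos(angles)))
    y = float(np.sum(link_lengths * np.sin(angles)))
    return np.array([x, y, angles[-1]])

links = np.array([1.0, 1.0])
beta = 0.5
y1 = np.array([beta, -2 * beta])    # "elbow-down" configuration
y2 = np.array([-beta, 2 * beta])    # mirrored "elbow-up" configuration

p1 = forward_kinematics(y1, links)
p2 = forward_kinematics(y2, links)
# Both configurations reach the same position (2*cos(beta), 0), so g admits
# no well-defined inverse when only the end-effector position is specified.
```

This is exactly why a plain regression from poses to joints is ill-posed: the training set may contain very different configurations for near-identical poses.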
4 for more details on this process in practice).

The work in [14] is an example of this approach: a NN learns the forward kinematics of the robot, and it is then used to train a second NN such that the composition of the two is the identity function. However, given the non-convex nature of the optimization problems associated with NN models, this method is significantly unstable during training. In [17], a method to learn a locally linear function in the velocity domain is proposed. This results in a good local approximation which, however, lacks global smoothness. The work in [3] is related to our proposed approach in that it tackles the inverse kinematics problem directly in the Cartesian domain by using structured prediction techniques [2]. By training a one-class structured SVM, their algorithm learns the inverse kinematics for some trajectories. However, this approach has a high sample complexity and requires sampling the training set only in a neighbourhood of the goal trajectory. This poses a challenge in most practical applications, since it requires retraining the model for each new trajectory, or limits the usage of a model to trajectories close to the one used for training.

Similarly to [3], in this work we rely on ideas from structured prediction to learn the inverse kinematics of a robot. However, we significantly improve over previous work by providing the following contributions: 1) we introduce a novel approach that takes full advantage of useful side information, such as partial or inaccurate knowledge of the forward kinematics; 2) we extend the learning problem to both position and orientation; 3) we characterize the theoretical properties of the proposed estimator, proving universal consistency and excess risk bounds; 4) we empirically demonstrate the effectiveness of our approach on a number of challenging and realistic scenarios in simulation.
3 Structured Prediction for Inverse Kinematics

Structured prediction methods are used to learn input-output relations when the output space is not a linear space, but still presents some relevant structure that can be leveraged to compare and identify the best prediction for a given input. Notable examples are quantile estimation [22], image segmentation [15], manifold-valued prediction [18], and protein folding [13]. Given an input x ∈ X and a structured output y ∈ Y, many of the methods used in structured prediction can be abstracted as learning a function F_α : X × Y → R that measures the quality of a candidate y as a prediction for a specific input x, and depends on some learnable parameters α ∈ Θ. The structured estimator f̂ : X → Y is then the function identifying the best output, maximizing F:

    f̂(x) = argmax_{y ∈ Y} F_α(y, x)    (2)

While structured methods are often more computationally expensive than traditional supervised learning approaches, they can be applied to much more complex spaces and encode all the extra information that such spaces entail. In the case of inverse kinematics learning, the output space consisting of the set of joint angles is highly non-linear, and it typically presents hard constraints on the admissible joint angles. Many of the data-driven methods used to learn the inverse kinematics learn a mapping f̂ : X → Y by minimizing some loss that measures the fidelity of the predicted ŷ with respect to the training outputs {y_i}_{i=1}^n in the configuration space Y. However, the goal of learning the inverse kinematics is often subordinated to a higher-level task aimed at controlling the robot in the Cartesian space X. Therefore, the true goal is to learn an inverse kinematics function f̂ such that g(f̂(x)) ≈ x, where g is the true forward mapping of the robot.

Figure 2: Dependency of learning inverse kinematics on the structured prediction loss. (a) Configuration space loss; (b) forward kinematics loss. Pose reconstruction for x* = g(y*) for the 5-DoF planar manipulator described in Sec. 4.2, given three training examples with poses x_1, x_2, x_3 close to the target pose x*, but significantly different joint configurations y_1, y_2, y_3. Applying CRiSP with the configuration space loss produces configurations that are far from the target in Cartesian space (Fig. 2a). In contrast, the forward kinematics loss in (4) (even if possibly inaccurate) leads to a significantly better pose estimation (Fig. 2b).
To this end, we propose a novel structured prediction approach that faithfully encodes the structure of the configuration space (including constraints on the joints), but also evaluates the fidelity of the prediction in the Cartesian space. This leads to lower errors and more robust behaviour. To do so, we assume to have at our disposal the forward kinematics function of the robot at hand, even if affected by some errors, i.e. g̃ ≈ g.

Our approach provides a solution to problem (1) using the framework for structured prediction proposed in [8]. Given a set of joint configurations and corresponding end-effector poses D = (x_i, y_i)_{i=1}^n, generated by a forward kinematics function g : Y → X, our goal is to learn a function f̂(x) ≈ g^{-1}(x). Let ℓ : Y × Y → R be a structured loss function that measures prediction errors in the output space. In the context of this work, Y corresponds to the space of (constrained) joint configurations. Let also k : X × X → R be a kernel function [21] on input data (the target end-effector pose, in the setting of this work). We recall that kernel functions are a standard tool used in machine learning to learn non-linear non-parametric models (see [21] and references therein). Several choices of kernel function are typically available in practice, such as the linear kernel k(x, x') = x^T x', the Gaussian kernel k(x, x') = e^{-||x − x'||^2 / σ}, or the Laplacian kernel k(x, x') = e^{-||x − x'|| / σ} (where σ > 0 is a bandwidth hyperparameter). Following [8], we define the estimator via a function F_α of the form

    f̂(x) = argmin_{y ∈ Y} { F_{α(x)}(y) = Σ_{i=1}^n α_i(x) ℓ(y, y_i) },    (3)

with weights α(x) = [α_1(x), ..., α_n(x)]^T = (K + nλ I_n)^{-1} K_x, where K ∈ R^{n×n} is the kernel matrix associated with the kernel k, with entries {K}_{ij} = k(x_i, x_j), and K_x ∈ R^n is the evaluation vector with entries {K_x}_i = k(x, x_i). Finally, I_n ∈ R^{n×n} is the identity matrix and λ > 0 is a regularization hyperparameter.
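A minimal sketch of how an estimator of the form (3) can be implemented — the names, toy data, and the choice of a Gaussian kernel and squared configuration-space loss are illustrative, not the paper's released code. The weights α(x) reduce to a single linear solve (done here via a Cholesky factorization) and the prediction to a box-constrained L-BFGS-B call:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve
from scipy.optimize import minimize

def gaussian_kernel(A, B, sigma=1.0):
    """Gaussian kernel matrix K[i, j] = exp(-||a_i - b_j||^2 / sigma)."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / sigma)

def crisp_predict(x, X_train, Y_train, loss, bounds, lam=1e-6, y0=None):
    """Sketch of estimator (3): compute alpha(x), then minimize F_{alpha(x)}
    over the box constraints Y with L-BFGS-B."""
    n = X_train.shape[0]
    K = gaussian_kernel(X_train, X_train)
    factor = cho_factor(K + n * lam * np.eye(n))   # "training" step
    Kx = gaussian_kernel(X_train, x[None, :])[:, 0]
    alpha = cho_solve(factor, Kx)                  # alpha(x) = (K + n*lam*I)^{-1} K_x

    def objective(y):
        return sum(a * loss(y, yi) for a, yi in zip(alpha, Y_train))

    y0 = Y_train.mean(axis=0) if y0 is None else y0
    return minimize(objective, y0, method="L-BFGS-B", bounds=bounds).x

# Toy check with a squared loss in configuration space on random data.
rng = np.random.default_rng(0)
X_tr = rng.normal(size=(20, 2))
Y_tr = rng.uniform(-1.0, 1.0, size=(20, 3))
sq_loss = lambda y, yi: float(np.sum((y - yi) ** 2))
y_hat = crisp_predict(X_tr[0], X_tr, Y_tr, sq_loss, bounds=[(-2.0, 2.0)] * 3)
```

Note that the joint limits enter only as `bounds`, so any prediction automatically respects the robot's constraints; swapping `loss` changes the estimator (e.g. to the FK loss introduced below).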
In general, an estimator of the form in (3) can be interpreted as mapping an input point x to a configuration y = f̂(x) corresponding to a weighted barycenter of the training output samples y_i according to the loss ℓ. The weights {α_i(x)}_{i=1}^n define how relevant the output examples {y_i}_{i=1}^n are for the considered test point.

Alg. 1 summarizes the two key steps characterizing CRiSP: training and prediction. First, during the training phase, we compute the inverse A of the regularized kernel matrix (K + nλ I_n). This process is akin to training a kernel ridge regression (KRR) model and can be carried out using any solver for linear systems (time complexity O(n^3) and memory complexity O(n^2)). In our experiments, we compute A via the Cholesky decomposition of (K + nλ I_n) to obtain a numerically robust solution. Then, given a new input x, the prediction step in Alg. 1 consists in first computing the weights (α_i(x))_{i=1}^n via the matrix-vector product α(x) = A K_x, and then solving the constrained minimization problem in (3). In our experiments, we addressed this latter problem by adopting the L-BFGS-B optimizer [5] from the SciPy scientific computing library [24], which proved to be the most efficient among the constrained optimizers we tried. For the purpose of reproducibility, we have made the implementation of CRiSP available to the community.

The optimization of (3) strongly depends on the loss ℓ employed. Below, we consider how to design such a loss in the context of learning the inverse kinematics of a robot.

A first key question is how to choose the loss ℓ to measure prediction errors. In principle, it might be tempting to consider a loss such as the squared sum of joint angle differences, which naturally quantifies the discrepancy between the predicted and the measured joint configurations.
However, we argue that this might cause problems, since configurations that are distant with respect to the metric on Y could correspond to similar poses in Cartesian space. This issue is illustrated by Fig. 2a, where the CRiSP estimator trained with such a configuration space loss is unable to predict a correct joint configuration to reach the desired target, and often shows a positional bias depending on the considered workspace region.

https://github.com/gmmarconi/CRiSP-for-Misspecified-Robot-Model

Algorithm 1: CRiSP
Input: training set D = (x_i, y_i)_{i=1}^n, kernel k, λ > 0
Training:
    Compute the kernel matrix (K)_{ij} = k(x_i, x_j)
    Compute A = (K + nλ I_n)^{-1} ∈ R^{n×n}
Prediction. For any new input x:
    Evaluate K_x = (k(x, x_1), ..., k(x, x_n))^T ∈ R^n
    Compute the weights α(x) = A K_x
    ŷ = argmin_{y ∈ Y} F_{α(x)}(y)    // e.g. via L-BFGS-B [5]
Return: ŷ

In contrast, here we propose a structured loss that measures how much two joint configurations "differ" in Cartesian space. More precisely, we assume that a — possibly inaccurate — forward kinematics function g̃(y) = [g̃_p(y), g̃_o(y)]^T is available, where g̃_p : Y → R^d and g̃_o : Y → SO(d) are the components that map to the position and the orientation of the end effector, respectively. It is important to note that this function can be different from the ground-truth forward kinematics g used to generate the dataset D. Then, we propose the forward kinematics (FK) loss

    ℓ(y, y_i) = ||g̃_p(y) − g̃_p(y_i)||² + d_O(g̃_o(y), g̃_o(y_i)),    (4)

where the first term measures the Euclidean distance of the end effector from the desired position, while the second term

    d_O(y, z) := Σ_{j=1}^c min(|y_j − z_j|, 2π − |y_j − z_j|)²    (5)

measures the error between the predicted and the target end-effector orientation with respect to the squared geodesic distance on the circle, which can be used as an alternative representation for SO(d), with c = 1 for SO(2) and c = 3 for SO(3). Note that it is also possible to weight the two terms in (4) differently, to adjust position and orientation accuracies according to the desired performance. We refer to CRiSP-FK as the estimator employing such loss. Fig. 2b shows that, as designed, this estimator is better suited to learn the inverse kinematics of a robotic manipulator.

By leveraging the structured prediction perspective from [7], we now characterize the statistical properties of the proposed CRiSP estimator.
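Before turning to the analysis, the FK loss in (4)-(5) can be transcribed almost literally. In this sketch, the callables `fk_pos` and `fk_orn` are hypothetical stand-ins for g̃_p and g̃_o, the position term uses a squared norm, and the optional weights implement the position/orientation trade-off mentioned above:

```python
import numpy as np

def geodesic_dist(y_orn, z_orn):
    """Squared geodesic distance on the circle, summed over c angles (Eq. 5)."""
    d = np.abs(np.asarray(y_orn, dtype=float) - np.asarray(z_orn, dtype=float))
    return float(np.sum(np.minimum(d, 2 * np.pi - d) ** 2))

def fk_loss(y, y_i, fk_pos, fk_orn, w_pos=1.0, w_orn=1.0):
    """Forward kinematics loss (Eq. 4) under a possibly misspecified model.

    fk_pos, fk_orn: callables standing in for g̃_p and g̃_o.
    w_pos, w_orn: optional weights trading off position vs orientation.
    """
    pos_err = float(np.sum((fk_pos(y) - fk_pos(y_i)) ** 2))
    orn_err = geodesic_dist(fk_orn(y), fk_orn(y_i))
    return w_pos * pos_err + w_orn * orn_err
```

A useful property of `geodesic_dist` is wrap-around: angles 0.1 and 2π − 0.1 are only 0.2 rad apart on the circle, not 2π − 0.2.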
In particular, the following result proves that f̂ is universally consistent, namely that as the number n of training examples grows, f̂ is guaranteed to asymptotically converge to the ideal inverse kinematics f* : X → Y minimizing the expected risk

    f* = argmin_{f : X → Y} E_ρ [ℓ(f(x), y)],    (6)

where ρ is the probability distribution on X × Y from which we sample the train-test pairs (x, y). More formally, we have the following result.
Theorem 1 (Universal Consistency).
Let D = (x_i, y_i)_{i=1}^n be sampled i.i.d. according to a distribution ρ on X × Y, and let f̂ be the estimator in (3) trained on D with the FK loss from (4) and λ = n^{-1/2}, using a universal kernel [21] (e.g. Gaussian or Laplacian). Let f* be the ideal inverse kinematics from (6). Then, with probability 1,

    lim_{n → +∞} E_ρ [ℓ(f̂(x), y)] = E_ρ [ℓ(f*(x), y)].    (7)

Proof.
The result requires showing that the FK loss ℓ satisfies the Implicit Loss Embedding (ILE) property [7, Def. 1] (a technical property whose definition is outside the scope of this paper; we refer the interested reader to the original work). We first note that the squared difference ||· − ·||² and the orientation loss d_O(·, ·) satisfy such property (see e.g. [18]). It follows that the FK loss also satisfies the ILE property, since sums and compositions with smooth functions (namely g̃_p and g̃_o) preserve the ILE property [7, Thm. 10 and Cor. 11, respectively]. Then, Thm. 1 follows as a direct corollary of [7, Thm. 4]. □

Thm. 1 shows that the proposed algorithm asymptotically yields the best possible inverse kinematics map f* with respect to the training distribution ρ. By introducing additional regularity assumptions on the inverse kinematics map, it is possible to improve the result above, yielding also non-asymptotic rates. In particular, we require that the ideal inverse kinematics f* belongs to a suitable Sobolev space W^{s,2} of square-integrable functions whose weak derivatives up to order s ∈ R are also square-integrable (see [1] for a formal definition). The latter is a standard assumption in supervised learning settings [6], and essentially requires the inverse kinematics to be a regular function (e.g. smooth with controlled derivatives). Then, the following result concludes our analysis, providing upper bounds on CRiSP's excess risk.

Theorem 2 (Rates).
Under the same hypotheses of Thm. 1, and using a Laplacian kernel, assume that the ideal solution satisfies f* ∈ W^{s,2}(R^d) with s > d/2. Then, with high probability with respect to ρ,

    E_ρ [ℓ(f̂(x), y) − ℓ(f*(x), y)] ≤ O(n^{-1/4}).    (8)

Proof.
The proof is analogous to that of Thm. 1, but relies on the assumption f* ∈ W^{s,2}(R^d) to apply [7, Thm. 5] to the CRiSP estimator, yielding (8) as required. □

4 Experiments

In this section, we empirically validate the proposed approach. In particular, we show that:

• The flexibility provided by structured prediction allows CRiSP-FK to overcome the limitations of other data-driven methods. In particular, in Sec. 4.3 we show that being able to define a structured loss directly in Cartesian space allows our method to outperform other machine-learning-based IK approaches that employ losses in joint configuration space. CRiSP-FK also yields IK solutions respecting joint position limits by construction, by simply defining box constraints in (3).

• Thanks to the previously introduced properties, CRiSP-FK is more robust to model misspecification than model-based IK, and consistently outperforms it in several 2D and 3D settings, as reported in Sec. 4.4.

Figure 3: Qualitative results for trajectory reconstruction on a planar robotic manipulator for the One-Class Structured SVM (a), the neural network model (b), CRiSP with radians loss on the joints (c), and CRiSP with the forward kinematics loss (d). Scale in meters.
4.1 Trajectory Reconstruction

We evaluate the performance of our approach on the trajectory reconstruction task described below. By using an estimator in the form of (2), it is possible to instantiate a single global model over the whole Cartesian space, where each local minimum of F_{α(x)}(y) corresponds to a possible solution. This overcomes the problem of the non-injectivity of g, since there is always a unique local solution found by gradient-based iterative optimizers such as L-BFGS-B, depending on the starting y. This property can be leveraged in common robotics tasks such as trajectory reconstruction, where it is necessary to solve the inverse kinematics for a sequence of points {x_t}_{t=1}^L which describes a trajectory in space. We follow the idea from [3] and compute the inverse kinematics of a trajectory one point at a time, using the inferred f̂(x_{t−1}) for the previous trajectory point as the starting L-BFGS-B value for the prediction of the next point. This idea hinges on the assumption that, by starting from a configuration y_t, the solution y_{t+1} for the next point will be similar, providing continuity in the joint movement.

4.2 Experimental Setup

In our first set of experiments, we employ a 5-DoF simulated planar manipulator with 5 links of 2 m each and 5 revolute joints, each constrained to a sub-interval of [−π, π] (in radians). We sample n = 25000 random configurations {y_i}_{i=1}^n with uniform distribution in Y and compute the corresponding poses {x_i}_{i=1}^n = {g(y_i)}_{i=1}^n, with x_i ∈ R² × SO(2) for all i = 1, ..., n. Since g is a highly non-linear function, the corresponding poses are not uniformly distributed even though the sampled configurations are. We aim at reconstructing an eight-shaped trajectory, well within the robot's workspace, and a circle-shaped trajectory, closer to the boundary of the reachable region.

The second set of experiments was executed in 3D Cartesian space, and it involves realistic IK tasks.
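The warm-started tracking loop just described can be sketched as follows; `predict_ik(x, y0)` is a hypothetical stand-in for any single-pose IK solver, such as the estimator in (3) with the previous solution used as the L-BFGS-B starting point.

```python
import numpy as np

def track_trajectory(traj, predict_ik, y_start):
    """Warm-started trajectory reconstruction.

    Each point's IK solve is initialized at the previous solution,
    encouraging continuity of the joint trajectory.
    """
    y = np.asarray(y_start, dtype=float)
    solutions = []
    for x_t in traj:
        y = predict_ik(x_t, y)      # warm start: y0 = previous configuration
        solutions.append(y)
    return np.array(solutions)

# Toy 1-DoF check: a fake "solver" that moves halfway toward the target
# angle; warm-starting then yields a smooth, monotone joint trajectory.
toy_solver = lambda x, y0: y0 + 0.5 * (x - y0)
qs = track_trajectory(np.linspace(0.0, 1.0, 5), toy_solver, y_start=0.0)
```

With a real solver, the same loop selects, among the multiple IK solutions, the one closest to the current configuration.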
To this aim, we use a simulated 7-DoF Franka Emika Panda manipulator (shown in Fig. 1). We employed the official joint position limits throughout our experiments to define the configuration space Y. For the first task, we sample training points in a neighbourhood of a circular trajectory centered at [0.0, 4.4, −55] cm and with radius 3 cm. Each sample has either one of two fixed orientations with respect to the z axis of the world reference frame: +π/2 rad and −π/2 rad.

The second task involves the reconstruction of a spiraling trajectory of radius 3 cm and height 6 cm. We uniformly sample 35000 training points, but this time from a larger region of the workspace, so as to have a lower density of examples per unit of volume. Moreover, the collected end-effector orientations are no longer constrained, making the learning task more challenging and high-dimensional. We have experimentally observed that our method relies on local information from training samples, which suggests that restricting our experiments to specific areas of Cartesian space does not imply a loss in generality, as long as the robot is not too close to singular configurations. However, a lower sample density sets up a harder problem.

Our experiments are based on the following software components. We use the Bullet 3 simulator [9] (via the PyBullet Python interface) for simulating our robots and tasks. We employ the Selectively-Damped Least Squares (SDLS) algorithm [4] available in Bullet as an IK baseline. Hyperparameter selection for CRiSP-FK was performed via grid search on a separate validation set.

4.3 Choice of the Loss Function

We assess the impact of the two different choices of loss function from both a qualitative and a quantitative perspective. For reference, we also compare CRiSP with a neural network estimator that does not take into account the structure of the robot (i.e. the joint constraints), as well as with the One-Class Structured SVM (OCSSVM) introduced in [3]. For the neural model, we trained a five-layer neural network (NN) comprised of fully connected layers with hyperbolic tangent activations, where the first two and last two layers have 64 hidden units each, while the middle layers have 128 hidden units each. The NN performs multivariate regression directly into the joint space and does not take into account the joint constraints of the robot.

Table 1: Root mean square error for position (in cm) and orientation (in radians) on two test trajectories, eight and circle. Results are reported for the neural network model (NN), the One-Class Structured SVM (OCSSVM), CRiSP-R, and CRiSP-FK.
When the NN produces joint configurations outside the robot's constraints, we clip the predictions to satisfy them. Regarding our proposed estimator, we consider: 1) a CRiSP predictor with loss d_O, corresponding to the sum of squared radial distances between joint angles from (5); 2) the CRiSP model with the FK loss introduced in (4). We refer to these models as CRiSP-R and CRiSP-FK, respectively. We cross-validate the hyperparameters on 10000 randomly-sampled validation points and evaluate the performance on two test trajectories: an eight-shaped one and a circular one.

Fig. 3 offers a qualitative comparison between the different methods, showing the corresponding predictions when tracking the eight-shaped trajectory. It can be noticed that all methods but CRiSP-FK are unable to correctly track the required trajectory. This is likely due to the fact that CRiSP-FK combines the best of both data-driven and model-based approaches, by employing a loss function comparing the resulting Cartesian poses instead of the joint values. These observations are quantitatively supported in Tab. 1, which reports the average prediction error in both position and orientation for the different methods.

Figure 4: On the y axis, the errors for orientation (in radians) and position (in cm) on the circle-shaped and eight-shaped trajectories for models with bias in the links (Fig. 4a) and bias in the joints (Fig. 4b). On the x axis, the magnitude of the bias, in cm for the links and in degrees for the joints.

We note that OCSSVM performs particularly poorly on the circular trajectory. We argue that this is due to intrinsic limitations of OCSSVM, which was originally developed for support estimation purposes and therefore needs a much higher sample complexity. In fact, the circular trajectory is very close to the boundaries of the manipulator workspace, where training samples are sparse. Given the empirical observations in this section, in the following we do not report additional results for the NN model, OCSSVM, and CRiSP-R, since their performance is sub-optimal with respect to the proposed CRiSP-FK strategy (and, in the case of OCSSVM, extremely demanding in terms of computational time). Rather, we focus on a comparison with model-based methods, which are more computationally efficient.

4.4 Robustness to Model Misspecification

We evaluate the capability of CRiSP-FK to compensate for errors in the supplied forward kinematics model. To do so, we generate a dataset D using the true forward model g of the robot, and then train our model using the loss ℓ(y, y_i) = ||g̃_p(y) − g̃_p(y_i)||² + d_O(g̃_o(y), g̃_o(y_i)), where g̃ is computed as the original g plus a fixed bias, either in the joint angles or in the link lengths. For joint angles, if a configuration has values y = [y_1, ..., y_J], then g̃(y) = g(y + b̄), where b̄ = b · [1, ..., 1] ∈ R^J and b ∈ R controls the amount of bias. For link lengths, we add b to the nominal link lengths used to build the true g, with the sign of b chosen at random at every experiment repetition. We test the trained model on trajectory reconstruction and report qualitative and quantitative results for increasing values of b.
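The joint-angle corruption g̃(y) = g(y + b̄) can be sketched as a simple wrapper around any forward kinematics function; the toy planar model below is illustrative, not the simulated robots used in the experiments.

```python
import numpy as np

def make_biased_fk(fk, joint_bias, n_joints):
    """Return the misspecified model g̃(y) = g(y + b̄), with b̄ = b · [1, ..., 1]."""
    b_bar = joint_bias * np.ones(n_joints)
    return lambda y: fk(np.asarray(y, dtype=float) + b_bar)

def planar_fk(y, links=(1.0, 1.0, 1.0, 1.0, 1.0)):
    """Toy planar forward kinematics (position only), standing in for g."""
    a = np.cumsum(y)
    L = np.asarray(links)
    return np.array([np.sum(L * np.cos(a)), np.sum(L * np.sin(a))])

# A 1-degree bias on every joint already perturbs the predicted pose.
g_tilde = make_biased_fk(planar_fk, joint_bias=np.deg2rad(1.0), n_joints=5)
err = np.linalg.norm(g_tilde(np.zeros(5)) - planar_fk(np.zeros(5)))
# err > 0: the biased model no longer matches the true kinematics
```

Training CRiSP-FK with `g_tilde` in place of g̃ while generating the data with `planar_fk` reproduces, in miniature, the misspecification setting studied here.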
As a comparison baseline, we approach the same task using the SDLS inverse kinematics solver of PyBullet. This experiment is performed with both the planar manipulator and the Panda robot, training on the datasets and trajectories described in Sec. 4.2. Fig. 4b and Fig. 4a show the RMSE in position and orientation on both trajectories as a function of, respectively, the bias in the joints and in the links. Tab. 2 shows the same type of error on four different trajectories: the spiral (S) and the circular trajectory with three different end-effector orientations (C_{π/2}, C_{−π/2}, C_0), where C_0 uses an out-of-sample orientation that does not appear in the training set. In Fig. 6b and Fig. 6a we show the magnitudes of α(x) from (3), to provide a visual intuition on the datasets and the relevance of each training sample during prediction. Finally, in Fig. 5b and Fig. 5a we show an example of a reconstructed trajectory with a misspecified model.

Table 2: IK performance comparison of CRiSP-FK and SDLS on 4 desired trajectories (position and orientation) for the end effector (EE) of a 7-DoF Panda arm. S: spiral with fixed desired EE orientation; C_0, C_{−π/2}, C_{+π/2}: circumference with 3 different EE orientations. The Panda arm's kinematic model is corrupted by introducing increasing biases at each joint (0 to 3 deg) or at each link (0 to 30 mm). Position (mm) and orientation (rad · 10^{-3}) root mean square errors (RMSE) are reported with one standard deviation along the trajectories.

The performance of CRiSP-FK is validated by the results for zero bias in Tab. 2, where the presented problem has more degrees of freedom than the planar one and is, therefore, more challenging. In the case of the circular trajectory, Fig. 6b shows that, even when the local sample density is high, only close examples are actually used to compute the proposed solution. When presented with a task for which the dataset is sparser, such as the spiral trajectory shown in Fig. 5a, the performance is worse than SDLS for orientation, but still better in position. A qualitative overview of our experiments is reported in the supplementary video.

The second part of the experiments highlights the capability of our approach to compensate when provided with a misspecified model. This is shown both by the plots in Fig. 4b for the planar manipulator and by Tab. 2 for the Panda robot. In the planar case, with no bias, SDLS outperforms CRiSP-FK. However, just a small bias makes the prediction error of SDLS increase sharply, while CRiSP-FK shows much higher robustness, even for biases of up to 10 deg in the joint angles or on the order of centimeters in the link lengths. A similar trend is observed in the experiments with the Panda robot in Tab. 2. In the simpler scenario of the circular trajectory with known orientations, CRiSP-FK outperforms SDLS from the start. On C_0 and S, an error of half a degree is enough to observe a significant increase in the error for SDLS, while CRiSP-FK remains more robust.

Figure 5:
Trajectories reconstructed by CRiSP-FK and SDLS under model bias (0.1 deg) for the S (a) and C (b) tasks, respectively.

We conclude with a note on computational times. For reference, all experiments were performed on an Intel® Core™ i7-1075H laptop CPU. Within this setting, CRiSP requires on average 5 minutes to train (and perform model selection) on 25000 training points and 0.7 s for each prediction. SDLS, which is model-based and has no training phase, takes 10− s on average at prediction time. As expected, the SDLS model-based approach is significantly faster than the data-driven strategies. However, the implementation of SDLS, albeit accessed through a Python interface, is based on the C++ layer of the Bullet simulator, while CRiSP is implemented entirely in Python, except for the matrix inversion routine used in the training step. We note that these numbers are highly dependent on the complexity of the forward kinematics and on the number of training points. While the focus of our work is on the robustness and predictive accuracy of CRiSP, we argue that further work can significantly improve prediction time by leveraging classical large-scale machine learning techniques such as Nyström approximation, determinantal point sampling, or gradient preconditioning.

In this work we studied the problem of learning the inverse kinematics of a robot manipulator using a data-driven approach. We focused on settings in which the kinematic structure of the robot is known, but potentially inaccurate. Within this setting, we proposed CRiSP, a structured prediction algorithm combining a-priori knowledge of the model with non-parametric regression to efficiently learn an inverse kinematics map. We characterized the generalization properties of the proposed approach and empirically demonstrated the effectiveness of CRiSP on trajectory reconstruction tasks. Our approach is significantly more effective than previous data-driven methods, as well as model-based ones, even in settings affected by varying types of bias in the kinematics.
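To make the inference step concrete, the sketch below illustrates one way a CRiSP-FK-style prediction can be computed: kernel scores α(x) are obtained from a regularized kernel solve, and the joint configuration is found by minimizing the α-weighted workspace loss through the forward-kinematics model with the bound-constrained L-BFGS-B solver [5, 24], so that joint limits are enforced by construction. This is a minimal illustration under assumed choices (a Gaussian kernel, a 3-link planar FK stand-in, and hypothetical function names such as `fk_planar` and `crisp_fk_predict`), not the exact implementation used in the experiments.

```python
import numpy as np
from scipy.optimize import minimize

def fk_planar(theta, lengths=(0.5, 0.4, 0.3)):
    """Forward kinematics of a 3-link planar arm; an illustrative
    stand-in for the (possibly misspecified) FK model."""
    angles = np.cumsum(theta)
    return np.array([np.sum(np.asarray(lengths) * np.cos(angles)),
                     np.sum(np.asarray(lengths) * np.sin(angles))])

def alpha_weights(X_train, x, lam=1e-3, sigma=0.3):
    """Structured-prediction scores alpha(x) = (K + n*lam*I)^{-1} v(x)
    for a Gaussian kernel on end-effector poses; the kernel choice and
    hyperparameters here are illustrative assumptions."""
    n = len(X_train)
    K = np.exp(-np.sum((X_train[:, None] - X_train) ** 2, -1) / (2 * sigma**2))
    v = np.exp(-np.sum((X_train - x) ** 2, axis=1) / (2 * sigma**2))
    return np.linalg.solve(K + n * lam * np.eye(n), v)

def crisp_fk_predict(x, X_train, fk, bounds, theta0, sigma=0.3):
    """Inference step: minimize the alpha-weighted squared workspace
    loss through the FK model, with box constraints on the joints."""
    alpha = alpha_weights(X_train, x, sigma=sigma)

    def objective(theta):
        d = fk(theta) - X_train          # (n, 2) workspace residuals
        return np.sum(alpha * np.sum(d**2, axis=1))

    res = minimize(objective, theta0, method="L-BFGS-B", bounds=bounds)
    return res.x
```

Because the optimizer only ever searches inside the box constraints, any returned configuration respects the joint limits regardless of how biased the FK model is.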
Figure 6: (a) and (b) show CRiSP-FK's α_i(x) values for a representative spiral and circular trajectory point, respectively.

Future research will focus on two main directions. First, we will explore acceleration strategies at the inference stage, to make our method appealing for real-time applications. Second, we will investigate active-learning-based strategies to find better anchor points for training CRiSP, ultimately yielding a more concise and efficient estimator.
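One concrete instance of such acceleration strategies is to restrict the kernel solve behind the α(x) scores to a subset of m ≪ n anchor points, in the spirit of the Nyström-type approximations mentioned above. The sketch below is a deliberately simple subset-of-anchors variant under assumed choices (Gaussian kernel, hypothetical function name `nystrom_alpha`), not the implementation used in this work.

```python
import numpy as np

def nystrom_alpha(X_train, x, m=100, lam=1e-3, sigma=0.3, seed=0):
    """Subset-of-anchors approximation of the alpha(x) scores:
    solve the regularized kernel system on m randomly chosen
    training points only, cutting the solve from O(n^3) to O(m^3)."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X_train), size=min(m, len(X_train)), replace=False)
    Xm = X_train[idx]
    Kmm = np.exp(-np.sum((Xm[:, None] - Xm) ** 2, -1) / (2 * sigma**2))
    v = np.exp(-np.sum((Xm - x) ** 2, axis=1) / (2 * sigma**2))
    alpha_m = np.linalg.solve(Kmm + len(idx) * lam * np.eye(len(idx)), v)
    return idx, alpha_m  # weights over the selected anchors only
```

The exact solve scales cubically in the number of training points, so restricting it to m anchors trades a controllable loss of accuracy for a substantial speed-up at both training and prediction time.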
Acknowledgements
This material is based upon work supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216, and the Italian Institute of Technology. We gratefully acknowledge the support of NVIDIA Corporation for the donation of the Titan Xp GPUs and the Tesla K40 GPU used for this research. Part of this work has been carried out at the Machine Learning Genoa (MaLGa) center, Università di Genova (IT). We thank Silvio Traversaro and Yeshasvi Tirupachuri from Istituto Italiano di Tecnologia for the fruitful discussions and insights. L. R. acknowledges the financial support of the European Research Council (grant SLING 819789), the AFOSR projects FA9550-17-1-0390 and BAA-AFRL-AFOSR-2016-0007 (European Office of Aerospace Research and Development), and the EU H2020-MSCA-RISE project NoMADS - DLV-777826. C. C. acknowledges the financial support of the Royal Society of Engineering, grant SPREM RGS/R1/201149.
References

[1] Robert A Adams and John JF Fournier. Sobolev Spaces. Elsevier, 2003.
[2] GH Bakir, T Hofmann, B Schölkopf, AJ Smola, B Taskar, and SVN Vishwanathan. Predicting structured data, 2007.
[3] Botond Bócsi, Duy Nguyen-Tuong, Lehel Csató, Bernhard Schoelkopf, and Jan Peters. Learning inverse kinematics with structured prediction. In International Conference on Intelligent Robots and Systems. IEEE, 2011.
[4] Samuel R Buss and Jin-Su Kim. Selectively damped least squares for inverse kinematics. Journal of Graphics Tools, 10(3):37–49, 2005.
[5] Richard H Byrd, Peihuang Lu, Jorge Nocedal, and Ciyou Zhu. A limited memory algorithm for bound constrained optimization. SIAM Journal on Scientific Computing, 16(5):1190–1208, 1995.
[6] Carlo Ciliberto, Lorenzo Rosasco, and Alessandro Rudi. A consistent regularization approach for structured prediction. In Advances in Neural Information Processing Systems, pages 4412–4420, 2016.
[7] Carlo Ciliberto, Lorenzo Rosasco, and Alessandro Rudi. A general framework for consistent structured prediction with implicit loss embeddings. Journal of Machine Learning Research, 21(98):1–67, 2020.
[8] Carlo Ciliberto, Alessandro Rudi, Lorenzo Rosasco, and Massimiliano Pontil. Consistent multitask learning with nonlinear output relations. In Advances in Neural Information Processing Systems, pages 1986–1996, 2017.
[9] Erwin Coumans and Yunfei Bai. PyBullet, a Python module for physics simulation for games, robotics and machine learning. 2016.
[10] Vicente Ruiz De Angulo and Carme Torras. Learning inverse kinematics: Reduced sampling through decomposition into virtual robots. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 2008.
[11] Aaron D'Souza, Sethu Vijayakumar, and Stefan Schaal. Learning inverse kinematics. In Proceedings 2001 IEEE/RSJ International Conference on Intelligent Robots and Systems, volume 1, pages 298–303. IEEE, 2001.
[12] Thomas George Thuruthel, Yasmin Ansari, Egidio Falotico, and Cecilia Laschi. Control strategies for soft robotic manipulators: A survey. Soft Robotics, 2018.
[13] Thorsten Joachims, Thomas Hofmann, Yisong Yue, and Chun-Nam Yu. Predicting structured objects with support vector machines. Communications of the ACM, 52(11):97–104, 2009.
[14] Michael I Jordan and David E Rumelhart. Forward models: Supervised learning with a distal teacher. Cognitive Science, 1992.
[15] Sebastian Nowozin, Christoph H Lampert, et al. Structured learning and prediction in computer vision. Foundations and Trends® in Computer Graphics and Vision, 6(3–4):185–365, 2011.
[16] Eimei Oyama, Nak Young Chong, Arvin Agah, and Taro Maeda. Inverse kinematics learning by modular architecture neural networks with performance prediction networks. In Proceedings 2001 ICRA. IEEE International Conference on Robotics and Automation, 2001.
[17] Eimei Oyama, Taro Maeda, et al. Inverse kinematics learning for robotic arms with fewer degrees of freedom by modular neural network systems, 2005.
[18] Alessandro Rudi, Carlo Ciliberto, Gian Maria Marconi, and Lorenzo Rosasco. Manifold structured prediction. In Advances in Neural Information Processing Systems, pages 5610–5621, 2018.
[19] Bruno Siciliano. Kinematic control of redundant robot manipulators: A tutorial. Journal of Intelligent and Robotic Systems, 1990.
[20] Mark W Spong, Seth Hutchinson, Mathukumalli Vidyasagar, et al. Robot Modeling and Control, volume 3. Wiley, New York, 2006.
[21] Ingo Steinwart and Andreas Christmann. Support Vector Machines. Springer, 2008.
[22] Ichiro Takeuchi, Quoc V Le, Timothy D Sears, and Alexander J Smola. Nonparametric quantile estimation. Journal of Machine Learning Research, 2006.
[23] Gaurav Tevatia and Stefan Schaal. Inverse kinematics for humanoid robots. In Proceedings 2000 ICRA. IEEE International Conference on Robotics and Automation, volume 1, pages 294–299. IEEE, 2000.
[24] Pauli Virtanen, Ralf Gommers, et al. SciPy 1.0: fundamental algorithms for scientific computing in Python.