Learning Singularity Avoidance
Jeevan Manavalan∗ & Matthew Howard

Abstract — With the increase in complexity of robotic systems and the rise in non-expert users, it can be assumed that task constraints are not explicitly known. For tasks where avoiding singularity is critical to success, this paper provides an approach, aimed especially at non-expert users, by which the system learns the constraints contained in a set of demonstrations, such that they can be used to optimise an autonomous controller to avoid singularity without the task constraints being explicitly known. The proposed approach avoids singularity, and thereby unpredictable behaviour when carrying out a task, by maximising the learnt manipulability throughout the motion of the constrained system, and is not limited to kinematic systems. Its benefits are demonstrated through comparisons with other control policies, which show that the constrained manipulability of a system learnt through demonstration can be used to avoid singularities in cases where these other policies would fail. Where the system's manipulability subject to a task's constraints is not available, the proposed approach can instead be used to infer it, with results showing low errors both in 3DOF simulated systems and on a 7DOF real world robotic system.

I. INTRODUCTION
In recent years, there has been a booming shift from the development of specialised factory robots to versatile autonomous ones targeted at non-expert users. However, systems are performing tasks for which they were not specifically designed, and there is uncertainty over the degree of redundancy (or, contrastingly, overconstrainedness) in the control of their movements. Consequently, increasing importance is placed on improving control techniques to reduce such uncertainty, which may otherwise inadvertently affect the task at hand.

Traditionally, these issues are assessed through the so-called manipulability of the system, by analysing the extent to which solutions exist to the inverse problem of finding control solutions for a given set of task constraints. First introduced by Yoshikawa [1], the manipulability index works by identifying linear dependencies in the task Jacobian that may cause a singular configuration to be reached. Initially used to devise control algorithms to avoid kinematic singularities in manipulators, it has since been used in a wide variety of contexts, such as real-time end-pose planning in walking tasks [2], grasp planning [3], and planning of human-robot interaction work spaces [4]. This measure has also been extended to account for joint limits, self-collision in redundant systems, and the need for adaptability to avoid obstacles in the work space [5].

The common assumption among studies using different forms of Yoshikawa's manipulability index [6] is that the nature of the constraints affecting the system's manipulability (more specifically, the system's task Jacobian) is known analytically, a priori, for design of the controller.

Jeevan Manavalan and Matthew J. Howard are with the Centre for Robotics Research, Department of Informatics, King's College London, London, UK. [email protected]
∗ This work was partially supported by the EPSRC Grant Agreement No. EP/P010202/1.
However, with the rise in non-expert users and the increase in complexity of tasks, this assumption is increasingly untenable. Ignoring the manipulability in such systems risks, for instance, a non-expert user driving a robot through an unstable singular point and causing possible damage to the robot, or worse, injury to the user.

As an alternative, this paper provides a data-driven approach in which the constrained joint manipulability is learnt from user demonstrations, without the need for explicit definition through analytical approaches. This is beneficial to non-expert users without explicit knowledge of task constraints, as it optimises a system's control to avoid instabilities caused by singularities when carrying out a task. The proposed approach uses constraint consistent learning [7], [8], [9], [10] to, first, learn the task constraint and, second, optimise the manipulability derived from the learnt constraint matrix within the null space of the primary task constraint. It thereby avoids singularity by maximising the learnt manipulability throughout the motion of the constrained system, in the absence of analytical prior knowledge of the constraints. Results of estimating the manipulability subject to task constraints show low errors both in 3DOF simulated systems and on a 7DOF real world robotic system.

II. PROBLEM DEFINITION
This work considers the control of systems subject to uncertain constraints, due to the complexity of the system and/or the naivety of non-expert users, and the need to prioritise joint configurations which lead to greater degrees of freedom to flexibly perform demonstrated tasks.
A. Task Prioritised Constraints
Formally, a system of S-dimensional (self-imposed or environmental) constraints can be defined as

A(x)u(x) = b(x)    (1)

where x ∈ R^P represents the state (usually represented either in end-effector or joint space), A(x) ∈ R^{S×Q} is a matrix describing the constraints, u ∈ R^Q represents the action, and b(x) ∈ R^S is the task space policy describing the primary task to be accomplished. It is assumed that A(x) should be learnt by the robot through demonstrations, while also autonomously handling the degree of constrainedness of the system such that flexibility is enhanced.

The general solution to (1) is given by

u(x) = A†(x)b(x) + N(x)π(x)    (2)

where the first term, v := A†(x)b(x), is the task space component that implements the task space policy, the second term, w := N(x)π(x), is the null space component, and

N(x) := I − A†(x)A(x) ∈ R^{Q×Q}    (3)

is the null space projection matrix that projects the null space policy π(x) onto the null space of A. Here, I ∈ R^{Q×Q} denotes the identity matrix and A† = A⊤(AA⊤)⁻¹ is the Moore-Penrose pseudo-inverse of A. This applies not only to kinematics, but also to redundant actuation [11] and redundancy in dynamics [12].

Fig. 1: Demonstrating the task of opening drawers subject to their linear constraint.

Commonly, in the context of programming by demonstration by non-expert users, A, N, b and π are not explicitly known. Instead, the controller must be derived from data. In this paper, it is assumed that data are given as pairs of N observed states x_n and actions u_n.

What non-expert users may not be aware of is that w controls the additional free, non-task-related degrees of freedom of the system. It can be thought of as a lower-priority task which does not conflict with the goal defined in the task space component. The benefit of having a null space component is evident in tasks which have multiple solutions.
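The decomposition (1)-(3) can be sketched numerically. In this minimal example, the constraint A, task value b and null space policy value π are arbitrary illustrative choices (S = 1, Q = 3), not values from the experiments in this paper:

```python
import numpy as np

# Illustrative values (not from the paper): one constraint on a 3-DOF action.
A = np.array([[1.0, 0.0, 0.0]])     # constraint matrix, S x Q with S=1, Q=3
b = np.array([0.5])                 # task space policy value b(x)
pi = np.array([0.1, -0.2, 0.3])     # null space policy value pi(x)

A_pinv = np.linalg.pinv(A)          # Moore-Penrose pseudo-inverse of A
N = np.eye(3) - A_pinv @ A          # null space projection matrix, eq. (3)

v = A_pinv @ b                      # task space component
w = N @ pi                          # null space component
u = v + w                           # general solution, eq. (2)

# w lies in the null space of A, so it does not disturb the task: A u = b.
print(np.allclose(A @ w, 0.0), np.allclose(A @ u, b))   # True True
```

Any π can be substituted here without affecting satisfaction of (1), which is what the proposed approach later exploits by choosing π to maximise manipulability.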
For example, consider a reaching task where multiple paths to a goal may be available: some paths may drive a system closer to its joint limits or singular configurations, which can lead to an increased risk of getting stuck, or cause turbulent movement in the face of perturbations or the imposition of additional constraints.

B. Example: Opening Drawers
As a real world example, consider placing the robot in an environment designed for use by people, such as an office. A robot trained to assist staff with, for example, collecting documents needs to be able to perform tasks such as opening filing cabinets. In this case, the problem of teaching the robot to open and close drawers can be solved through programming by demonstration (as shown in Fig. 1). Programming by demonstration is suitable because office staff can decide to retrain systems when required, in another office or if the room layout changes, without having any expert knowledge of the system or the need to call in a specialist. The primary task policy is to have the robot end-effector open the drawer, in which the constraint matrix A is determined by a linear constraint following the direction of the drawer opening.

Where the task is kinaesthetically demonstrated, the user might manually guide a robot facing a filing cabinet, from holding onto a random point on the handle to opening the drawer towards the robot, and repeat this action multiple times using different starting points. In this scenario, the primary task might be learnt in a straightforward manner using one of several constraint learning methods (see § III-C). Unless explicitly instructed, however, the user might not take specific care of how the motion appears in the null space when performing those demonstrations. For example, the user might choose to grab the drawer from the nearest point, or in a pose/grip which allows demonstrations in the most comfortable manner for the user. It is unlikely that an average user will know to avoid unstable or singular configurations; however, these may occur in various task-dependent situations.
If, for example, a system placed in front of a drawer starts with all the joints fully extended, and a novice user then directs it to open the drawer by moving the arm directly back towards itself without resolving redundancy in the joint space, the system may propel itself in unpredictable directions due to the singularity. Moreover, systems that use an ad hoc way of dealing with singularities, such as Matlab's pseudoinverse function, which replaces small singular values with zero, are not satisfactory: this prevents any movement of the system and thus comes at the expense of completing the task. A better approach is to maximise the system's manipulability.
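The effect of such truncation can be reproduced with NumPy's pinv, which, like Matlab's, zeroes singular values below a cutoff (set explicitly here via rcond); the near-singular Jacobian is a contrived example, not one of the systems studied in this paper:

```python
import numpy as np

# A contrived, nearly rank-deficient 2x2 Jacobian.
J = np.array([[1.0, 1.0],
              [1.0, 1.0 + 1e-12]])

b = np.array([0.0, -1.0])           # task command: move straight down

# Truncating the tiny singular value discards exactly the direction needed
# to realise this command, so the resulting motion leaves a large residual:
# the system barely moves toward the target instead of behaving erratically.
u = np.linalg.pinv(J, rcond=1e-8) @ b
print(np.linalg.norm(J @ u - b))    # ~0.707: the task is not achieved
```

This illustrates why truncation trades unpredictable motion for no motion at all, motivating the manipulability-maximising alternative developed below.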
C. Task Manipulability and Programming by Demonstration
To avoid these problems and reduce the uncertainty of encountering unstable or singular configurations (§ II-B), traditionally Yoshikawa's manipulability index is used, whereby the null space degrees of freedom are used to maximise the distance from singular points during the execution of the primary task.

The manipulability is a measure of a system's ability to position and orientate its end-effector [1]. In order to help with the design and control of systems, Yoshikawa developed the manipulability index [1]. It works by finding the linear dependencies in the task Jacobian which could result in reaching a singular configuration. Thus the following measure is used to assess manipulability:

µ(x) = √(det(A(x)A(x)⊤))    (4)

Note that computation of the manipulability presupposes the availability of A in analytical form. However, as noted in § II-A, this is usually unavailable in the context of programming by demonstration, where the primary task and associated constraints are implicit in the demonstrations.

As µ → 0, the manipulability index indicates that the system is approaching a singular pose. The upper limit of µ depends on the system itself and can only be determined once the entire work space is assessed (however, the proposed method does not require this, as it compares manipulability locally). One application of µ lies in its use as a cost function to replace π in (2), which results in the secondary task moving the system towards the goal using a path which favours higher manipulability (see § III-D).

To obtain µ such that it can be used in this manner, this paper proposes to form an estimate of the constraints in a task and derive an estimate of (4). The approach uses demonstration data, such that it is applicable even for non-experts with no formal knowledge of the constraints involved in a task.
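When A is known analytically, (4) is straightforward to evaluate. As an illustrative sketch (not part of the proposed method, which avoids this assumption), consider a hypothetical 2-link planar arm with unit link lengths, taking A = J:

```python
import numpy as np

# Hypothetical 2-link planar arm with unit link lengths, taking A = J.
def jacobian(q):
    q1, q2 = q
    return np.array([
        [-np.sin(q1) - np.sin(q1 + q2), -np.sin(q1 + q2)],
        [ np.cos(q1) + np.cos(q1 + q2),  np.cos(q1 + q2)],
    ])

def manipulability(A):
    # Yoshikawa's index, eq. (4); clipped at 0 to guard against tiny
    # negative determinants caused by floating-point round-off.
    return np.sqrt(max(np.linalg.det(A @ A.T), 0.0))

# For this arm the index reduces to |sin(q2)|: it vanishes exactly when the
# arm is stretched out straight, i.e. at the singular configuration.
print(manipulability(jacobian([0.3, 1.2])))   # ~0.932, well-conditioned
print(manipulability(jacobian([0.3, 0.0])))   # ~0, singular configuration
```

The quantity printed second is what the zero policy in § IV gets trapped at, and what the proposed controller climbs away from.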
In doing so, it allows the robot to more accurately manoeuvre towards targets while minimising the risk of crossing singular points and avoiding potentially unstable, unpredictable behaviour.

III. METHOD
In this paper, the proposed approach forms an estimate of the task space constraint matrix A to be used in π to manipulate systems away from singular points while performing a task. This minimises the risk of encountering singularities, which lead to unpredictable behaviour.

A. Data Collection
The proposed method works on data given as N pairs of observed states x_n and actions u_n, collected through kinaesthetic demonstrations of the primary task. Assumptions on the data include that (i) observations are in the form presented in (2), (ii) b from the task space varies throughout all observations, (iii) π from the null space is consistent throughout u, and (iv) A, N, b, and π are not explicitly known for any given observation.

B. Separating the task and null space components
Given the demonstration data {x_n, u_n}, n = 1, …, N, the first step is to separate the task and null space components. For this, the approach first proposed in [13] can be used. As shown there, the first and second terms of (2) can be separated by seeking an estimate w̃ that minimises

E[w̃] = Σ_{n=1}^{N} ||P̃_n u_n − w̃_n||²    (5)

where w̃_n := w̃(x_n) and P̃_n := w̃_n w̃_n⊤ / ||w̃_n||². This works on the principle that there exists a projection P for which Pu = P(v + w) = w.

Similarly, ṽ is required, as it functions as the primary task controller for the system; it can be extracted by subtracting the newly estimated w̃ from u, i.e., ṽ = u − w̃.

C. Representation & Learning of the Constraint A

At this point, the original demonstrated actions u are separated into the task and null space parts. Based on the latter, the goal now is to compose an estimate of A that can be used in assessing manipulability. Constraints imposed on motion in the task space can refer to translational and orientational coordinates in the end-effector or joint space, depending on the task at hand. An important criterion for using the manipulability for control is that constraints are state-dependent [14]. Otherwise, if A is constant across the state space [9], every state has the same manipulability index and the singularity avoidance controller has no role.

Taking this into consideration, a suitable representation of the constraint matrix is [8]

A(x) = ΛΦ(x)    (6)

where Λ ∈ R^{S×P} is an unknown selection matrix (to be estimated in the learning) and Φ(x) ∈ R^{P×Q} is a feature matrix. The rows of the latter contain candidate constraints that can be predefined where there is prior knowledge of potential constraints affecting the system, or can take generic forms such as a series of polynomials.
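To make the constraint-learning step concrete, the following sketch treats the simplest case S = 1 with Φ(x) = I (so A = Λ directly), on synthetic null space observations; both simplifications and the data are assumptions for illustration. In this special case, minimising the projection of the observations w̃_n onto the candidate constraint row reduces to taking the eigenvector of Σ_n w̃_n w̃_n⊤ with the smallest eigenvalue; the full method [14] optimises Λ̃ for general Φ:

```python
import numpy as np

# Synthetic null space observations consistent with a hypothetical true
# constraint Lambda = (1, 0, 0): each w_n lies in the null space of Lambda.
rng = np.random.default_rng(1)
Q = 3
Lam_true = np.array([[1.0, 0.0, 0.0]])
N_true = np.eye(Q) - np.linalg.pinv(Lam_true) @ Lam_true
W = (N_true @ rng.standard_normal((Q, 50))).T     # rows are w_n

# For S = 1 and Phi = I, the minimiser of the projected-observation
# objective is the direction least present in the data: the eigenvector
# of C with the smallest eigenvalue.
C = W.T @ W                                       # sum_n w_n w_n^T
eigvals, eigvecs = np.linalg.eigh(C)              # eigenvalues ascending
lam_hat = eigvecs[:, 0]                           # estimated constraint row
print(np.round(np.abs(lam_hat), 6))               # ~ [1, 0, 0] up to sign
```

The recovered row is orthogonal to all observed null space motion, exactly the property the estimate Ã must have for the manipulability map built from it to be meaningful.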
For example, one may choose Φ(x) = J(x), the Jacobian of the manipulator, where A(x) = ΛJ(x) encodes constraints on the motion of specific degrees of freedom in the end-effector space.

Depending on the assumptions made on the representation of A, one of several learning methods could potentially be used to form the estimate of the selection matrix Λ̃ [9], [14], [15]. Of these, this paper picks [14], as it requires relatively few parameters and little data to perform robustly [9]. The estimate is formed by minimising

E[Ã] = Σ_{n=1}^{N} w̃_n⊤ (Λ̃Φ_n)† Λ̃Φ_n w̃_n    (7)

where Φ_n := Φ(x_n). This results in the estimate Ã(x) = Λ̃Φ(x).

D. Estimated Singularity Avoidance Policy
Using the estimated constraint matrix Ã, it is now possible to form an estimate of the system manipulability for any configuration within the support of the data. The latter is given by substituting Ã for A in Yoshikawa's manipulability index:

µ̃(x) = √(det(Ã(x)Ã(x)⊤)).    (8)

From this constrained manipulability map, states for particular end-effector poses can be selected based on µ̃, using π to update the joint angles. When used as a cost function, this information can provide the direction in which the system should move in order to increase its manipulability and maximise the distance from singular points, thereby reducing the risk of unpredictable behaviour. The simplest such approach is to use gradient ascent, by replacing π in (2) with

π_µ̃(x) = ∇_x µ̃.    (9)

Alternatively, if the task space trajectory is predictable, µ̃ can be used in combination with global approximation in the null space (see, e.g., [16]).

Fig. 2 shows a brief overview of the major steps involved in the proposed approach.

E. Evaluation Criteria for Learning Methods
Learning the constrained manipulability may not always be a trivial matter, depending on the task at hand. Factors such as high dimensionality of a system, or the structure of particular constraints in comparison to others, can lead to poor learning performance. It is therefore necessary to define the following metric to assess learning performance.

The Normalised Manipulability Index Error (NMIE) evaluates the fitness of the learnt manipulability index µ̃. The following can be used to measure the distance between the true and learnt manipulability index:

E_NMIE = (1 / (N σ²_µ)) Σ_{n=1}^{N} ||µ_n − µ̃_n||    (10)

where N is the number of data points. The error is normalised by the variance of µ. The NMIE approaches zero as µ̃ → µ.

Fig. 2: Overview of the approach to maximising manipulability in programming by demonstration tasks. (A) Motion data is collected through demonstrations of the task. (B1) The data is used to determine the separate task and null space components so that (B2) the latter can be used to estimate the constraint. (C) Using the estimated constraint matrix, an estimate of the constrained manipulability µ̃ is made. (D) This estimate is used to select states with greater manipulability, and (E) control the robot toward these when performing the primary task.

IV. EVALUATION
In this section, the proposed approach is first examined through a simulated 3-link planar system, before evaluating its performance in the context of programming by demonstration on a physical robot.

A. Three Link Planar Arm
The aim of the first evaluation is to test the robustness of learning the manipulability from motion data. The setup is as follows.

Constrained motion data are gathered from a kinematic simulation of a 3-link planar robot. The state and action space refer to the joint angle positions and velocities, respectively, i.e., x := q ∈ R³ and u := q̇ ∈ R³. The task space is described by the end-effector coordinates r = (r_x, r_y, r_θ)⊤, referring to the positions and orientation, respectively. The simulation runs at a fixed rate. Joint space motion of the system is recorded as it performs tasks under different constraints in the end-effector space. (The data supporting this research are openly available from King's College London at http://doi.org/[link will be made available soon]. Further information about the data and conditions of access can be found by emailing [email protected].) As described in § III-C,
a task constraint at state x is described through

A(x) = ΛJ(x)    (11)

where J ∈ R^{3×3} is the manipulator Jacobian and Λ ∈ R^{3×3} is the selection matrix specifying the coordinates to be constrained. The following three constraints are evaluated:

Λ_{x,y} = ((1, 0, 0), (0, 1, 0), (0, 0, 0))⊤
Λ_{x,θ} = ((1, 0, 0), (0, 0, 1), (0, 0, 0))⊤
Λ_{y,θ} = ((0, 1, 0), (0, 0, 1), (0, 0, 0))⊤

To simulate demonstrations of reaching behaviour, the robot end-effector starts from a point chosen uniform-randomly, q1 ∼ U[0°, °], q2 ∼ U[90°, °], q3 ∼ U[0°, °], and moves to a task space target r* following a linear point attractor policy

b(x) = r* − r    (12)

where r* is drawn uniformly from r*_x ∼ U[−, ], r*_y ∼ U[0, ], r*_θ ∼ U[0, π]. Targets without a valid inverse kinematic solution are removed. All trajectories also use a point attractor as the control policy in w,

π(x) = ψ* − x.    (13)

The latter enforces consistency, which makes it easier to separate the constraint from the control policy. ψ* is arbitrarily chosen as q1 = 10°, q2 = −°, q3 = 10°. For each constraint, 100 trajectories are generated, each containing 10 data points (1,000 points per constraint). This is done for separate training and testing data sets (a total of 2,000 points per constraint). Finally, this whole experiment is repeated 50 times.

TABLE I: NMIE (mean ± s.d.) for each constraint over 50 trials.

Constraint | NMIE
Λ_{x,y} | . × 10− ± . × 10−
Λ_{x,θ} | . × 10− ± . × 10−
Λ_{y,θ} | . × 10− ± . × 10−

The NMIE is presented in Table I. As can be seen, learning of the constrained manipulability index is successful, with low errors. This shows that the manipulability index can be learnt with very high precision through demonstrations, without having to explicitly know how the constraints affect the system's motions.

To further assess the suitability of using µ̃ instead of µ, the RMSE is evaluated for 20 randomly generated trajectories of 100 points, using π_µ under the constraint Λ_{x,y} and π_µ̃ learnt following § III-D.
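The constrained manipulability under (11) and the gradient-ascent policy (9) can be sketched for this 3-link system as follows. Assumptions for illustration: unit link lengths, the zero row of Λ_{x,y} dropped (a zero row would make det(AA⊤) vanish identically), a finite-difference estimate in place of the analytic gradient, and arbitrary step size and iteration count:

```python
import numpy as np

def jacobian(q):
    # Standard Jacobian of a 3-link planar arm, unit link lengths assumed.
    s = lambda *i: np.sin(sum(q[j] for j in i))
    c = lambda *i: np.cos(sum(q[j] for j in i))
    return np.array([
        [-s(0) - s(0, 1) - s(0, 1, 2), -s(0, 1) - s(0, 1, 2), -s(0, 1, 2)],
        [ c(0) + c(0, 1) + c(0, 1, 2),  c(0, 1) + c(0, 1, 2),  c(0, 1, 2)],
        [1.0, 1.0, 1.0],
    ])

# Lambda_{x,y} with its zero row dropped: only the active constraint rows.
Lam = np.array([[1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0]])

def mu(q):
    # Constrained manipulability under A(x) = Lambda J(x), as in (11).
    A = Lam @ jacobian(q)
    return np.sqrt(max(np.linalg.det(A @ A.T), 0.0))

def grad_mu(q, eps=1e-6):
    # Central finite differences in place of the analytic gradient in (9).
    g = np.zeros(3)
    for i in range(3):
        d = np.zeros(3)
        d[i] = eps
        g[i] = (mu(q + d) - mu(q - d)) / (2 * eps)
    return g

# Starting near the stretched-out (singular) pose, repeated ascent steps
# drive the arm toward configurations of higher manipulability.
q0 = np.deg2rad([10.0, 1.0, 1.0])
q = q0.copy()
for _ in range(500):
    q = q + 0.01 * grad_mu(q)
print(mu(q0), mu(q))    # manipulability increases substantially
```

In the proposed approach the learnt µ̃ from (8) takes the place of mu here, so the same ascent step runs without analytic knowledge of Λ or J.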
The starting point is chosen uniform-randomly, q1 ∼ U[0°, °], q2 ∼ U[90°, °], q3 ∼ U[0°, °], and, following (12), r* is drawn uniformly from r*_x ∼ U[−, ], r*_y ∼ U[0, ], r*_θ ∼ U[0, π]. This produces two trajectories, one using π_µ and the other π_µ̃. The resulting errors are low in both the mean and the standard deviation, indicating that all 20 trajectories are accurately reproduced; therefore π_µ̃ is an appropriate replacement for when π_µ is difficult to infer.

At this point, the suitability of π_µ̃ in the absence of explicit knowledge of the constraint has been established. In order to understand the benefit of using the manipulability-based controller (9) to handle redundancy in the joint space, its performance can be compared to that of other commonly used policies when encountering singularities under Λ_{x,y}. As examples of the latter, a zero policy and a linear point attractor in π are chosen. The zero policy emulates the most common and simplest approach (being the shortest path directly towards the target subject to the task constraints). The linear point attractor is also a common choice as a null space policy, as a way of bringing the arm to a default posture. When considering how systems behave near singular configurations, it is also important to consider the case where a system has an ad hoc way to deal with singularities, which is becoming more common among safety protocols in commercialised systems for novice users (see § II-C). Thus two cases are presented here: one where a system starts in a singular configuration without any impromptu way of dealing with singular values, and one where singularities are dealt with by setting the singular value to zero after it crosses a certain threshold. Here, Matlab's ad hoc approach is used, whereby the pseudoinverse function determines when singularities are encountered by using a threshold of max(size(A)) × eps(norm(A)), which in this case is . × 10−.
The eps function in Matlab returns the floating-point relative accuracy.

Figure 3 shows how a system behaves under the three different control policies in π. This system is subject to a task constraint in the r_x and r_y coordinates, and three control policies are used to evaluate how each handles movement from a singular pose. As shown, the proposed method is able to move away from the singular starting pose q1 = (90 + 10−)°, q2 = 360° and q3 = −° using the learnt manipulability. In contrast, the zero policy gets stuck at the starting point as it attempts to move directly down towards the target r* = (0, ). The point attractor does succeed in the task; however, the movement to the default posture q* = (−°, °, −°)⊤ brings the robot close to singular configurations. While both the manipulability and point attractor policies reach their target, it is evident when looking at the bottom row that the (true) manipulability indices of the two systems are vastly different. As shown, the point attractor comes close to a singular configuration at around 70 steps into the movement, which explains the overall erratic movement to reach the target. The manipulability policy, by contrast, encounters no such problem as it moves towards the target while maximising its manipulability throughout the movement.

Figure 4 looks at a case where no ad hoc method is used to detect singularities. In this case, the starting pose is set near, rather than exactly at, a singular pose of q1 = 90°, q2 = −° and q3 = (−180 + 10−)° (starting exactly in a singular configuration would lead to a division by 0 in the task space within the first step of the system's movement, regardless of the control policy). As shown, both the manipulability and zero policies move away from the near-singular configuration. The point attractor with a default posture of q* = (−°, −°, °)⊤, on the other hand, exhibits highly unstable behaviour (its next state diverges for each joint). This type of unpredictable behaviour is the most hazardous when having systems work in the real world.

Overall, these experiments show that the constrained manipulability of a system can be learnt through programming by demonstration. Moreover, the learnt manipulability index can be used as a controller to avoid singularities, in contrast to a straightforward zero policy or a simple point attractor.

Fig. 3: Comparing the manipulability, zero and point attractor policies in π, where singular values in the task space are replaced with 0. The bottom row shows the manipulability over time of the corresponding systems in the top row throughout the trajectory.

Fig. 4: Comparing the manipulability, zero and point attractor policies in π, where singularities are not dealt with. The bottom row shows the manipulability over time of the corresponding systems in the top row throughout the trajectory.

B. 7-Link Sawyer Arm
The final experiment assesses the proposed approach in a real world task executed on the Sawyer, a 7DOF revolute physical robotic system with a maximum reach of 1260 mm and precision of ± mm. The experimental scenario chosen is the closing of a drawer.

The state and action space refer to the joint angle positions and velocities, respectively, i.e., x := q ∈ R⁷ and u := q̇ ∈ R⁷. The task space is described by the end-effector coordinates r = (r_x, r_y, r_z)⊤, referring to positions in 3D Cartesian space. Joint space motion is kinaesthetically recorded by guiding the Sawyer as it performs tasks under a constraint in the end-effector space. J ∈ R^{3×7} is the manipulator Jacobian, and Λ ∈ R^{3×3} is the selection matrix specifying the coordinates to be constrained. The following constraint is evaluated, which models a drawer orientated such that the constraint's null space lies along the x-axis (as shown in Fig. 5):

Λ_x = ((1, , ), (0, , ), (0, , ))⊤

In order to have consistency in π, the system starts in a default pose, q1 ∼ −°, q2 ∼ °, q3 ∼ −°, q4 ∼ °, q5 ∼ −°, q6 ∼ −°, q7 ∼ °, where the joints point outward such that stretching the system's arm away from its body and along the constraint resolves redundancy in a similar manner for each trajectory.

Fifty trajectories are recorded and the data are down-sampled such that each trajectory is reduced to 10 points. This is done as the direction of the constrained movements is captured even with such little data, and more data simply results in longer computation times.

The NMIE learnt from 50 trajectories is low; therefore the manipulability index is learnt successfully.

Fig. 5: Learning the task constraint when closing the drawer through programming by demonstration using the Sawyer.
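The NMIE (10) used to report these results can be written as a small function; the sample values below are illustrative only, not the experimental data:

```python
import numpy as np

def nmie(mu, mu_est):
    # Mean absolute error between the true and learnt manipulability
    # indices, normalised by the variance of mu, as in eq. (10).
    mu, mu_est = np.asarray(mu), np.asarray(mu_est)
    return np.mean(np.abs(mu - mu_est)) / np.var(mu)

mu_true = np.array([0.2, 0.5, 0.9, 0.4])
print(nmie(mu_true, mu_true))          # 0.0: perfect estimate
print(nmie(mu_true, mu_true + 0.01))   # small but nonzero
```

The variance normalisation makes scores comparable across systems whose manipulability ranges differ, such as the 3-link simulation and the Sawyer.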
While it has been shown that the manipulability index is learnt, it is important to establish whether the estimated manipulability is still suitable as a cost function to avoid singularity, despite a greater error margin in comparison to the simulated 3DOF system. To this end, the RMSE is evaluated for 20 randomly generated trajectories of 100 points using the learnt model. The starting point is chosen uniform-randomly, q1 ∼ U[0°, °], q2 ∼ U[90°, °], q3 ∼ U[0°, °], q4 ∼ U[90°, °], q5 ∼ U[0°, °], q6 ∼ U[90°, °], q7 ∼ U[0°, °], and r* is drawn uniformly from U[−, ] for the x-axis. Two trajectories are produced, one using π_µ and the other π_µ̃. The resulting errors (mean ± s.d.) are low. Considering the high dimensionality of the robot, it is reasonable to expect an increased error in comparison to the simulated 3-link system.

V. CONCLUSION
This paper uses learning by demonstration to merge learning and manipulability-based control optimisation of an autonomous system to avoid singularities. The control optimisation uses a learnt cost function that maximises manipulability throughout the motion of a constrained system, and is not limited to kinematic systems. An approach is provided to learn the constraint of the task, if not known, from data. Results have been presented for a 3-link simulation in a 2D workspace and a real world experiment using the Sawyer's arm in its 7DOF joint space. (A similar experiment, in which the null space of the constraint lies along the y-axis, gave alike performance.) All experiments are in agreement that manipulabilities can be learnt through demonstration. The simulation demonstrates using the learnt manipulability as a cost function to have the system avoid singularities while performing a task. This approach is also verified in the real world using a robotic system with a high-dimensional configuration space, showing that constraints can be learnt with enough precision to identify and avoid singular regions when µ̃ is substituted for µ and used as a cost function. When compared to other control policies, such as a zero policy and a point attractor, the optimised movements from the proposed approach result in an autonomous system that moves towards the goal while handling redundancy by moving away from singular regions through local optimisation. In contrast to the aforementioned policies, the proposed approach allows the task to be completed, where the other policies are shown to succumb to the singularities within the same task, resulting in either no movement at all or unpredictable behaviour.

Future work looks at conducting a study with naive subjects to evaluate the usefulness of the programming by demonstration approach to avoid singularity in practice.

REFERENCES

[1] T. Yoshikawa, "Analysis and control of robot manipulators with redundancy," in
Robotics Research: The First International Symposium. MIT Press, Cambridge, MA, USA, 1984, pp. 735–747.
[2] Y. Yang, V. Ivan, Z. Li, M. Fallon, and S. Vijayakumar, "iDRM: Humanoid motion planning with realtime end-pose selection in complex environments," in IEEE Int. Conf. Humanoid Robots, 2016, pp. 271–278.
[3] A. M. Sundaram, O. Porges, and M. A. Roa, "Planning realistic interactions for bimanual grasping and manipulation," in IEEE-RAS Int. Conf. Humanoid Robots, 2016, pp. 987–994.
[4] N. Vahrenkamp, H. Arnst, M. Wächter, D. Schiebener, P. Sotiropoulos, M. Kowalik, and T. Asfour, "Workspace analysis for planning human-robot interaction tasks," in IEEE Int. Conf. Humanoid Robots, 2016, pp. 1298–1303.
[5] N. Vahrenkamp, T. Asfour, G. Metta, G. Sandini, and R. Dillmann, "Manipulability analysis," in IEEE Int. Conf. Humanoid Robots, 2012, pp. 568–573.
[6] S. Patel and T. Sobh, "Manipulator performance measures - a comprehensive literature survey," Journal of Intelligent and Robotic Systems, vol. 77, no. 3, pp. 547–570, 2015.
[7] J. Manavalan and M. Howard, Learning Null Space Projections Fast, 2017.
[8] H.-C. Lin, P. Ray, and M. Howard, "Learning task constraints in operational space formulation," in IEEE Int. Conf. Robotics & Automation, 2017, pp. 309–315.
[9] H.-C. Lin, S. Rathod, and M. Howard, "Learning state dependent constraints," IEEE Transactions on Robotics, 2016.
[10] M. Howard, S. Klanke, M. Gienger, C. Goerick, and S. Vijayakumar, "Robust constraint-consistent learning," in IEEE/RSJ Int. Conf. Intelligent Robots and Systems, 2009, pp. 4629–4636.
[11] K. Tahara, S. Arimoto, M. Sekimoto, and Z.-W. Luo, "On control of reaching movements for musculo-skeletal redundant arm model," Applied Bionics and Biomechanics, vol. 6, no. 1, pp. 11–26, 2009.
[12] F. Udwadia and R. Kalaba, Analytical Dynamics: A New Approach. Cambridge University Press, 2007.
[13] C. Towell, M. Howard, and S. Vijayakumar, "Learning nullspace policies," in IEEE/RSJ Int. Conf. Intelligent Robots and Systems, 2010, pp. 241–248.
[14] H.-C. Lin, M. Howard, and S. Vijayakumar, "Learning null space projections," in IEEE Int. Conf. Robotics & Automation, 2015, pp. 2613–2619.
[15] L. Armesto, J. Bosga, V. Ivan, and S. Vijayakumar, "Efficient learning of constraints and generic null space policies," in IEEE Int. Conf. Robotics & Automation, 2017, pp. 1520–1526.
[16] Y. Nakamura,