A Hierarchical Architecture for Human-Robot Cooperation Processes

Kourosh Darvish, Enrico Simetti, Fulvio Mastrogiovanni, Giuseppe Casalino

©2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Abstract—In this paper we propose FlexHRC+, a hierarchical human-robot cooperation architecture designed to provide collaborative robots with an extended degree of autonomy when supporting human operators in high-variability shop-floor tasks. The architecture encompasses three levels, namely for perception, representation, and action. Building up on previous work, here we focus on (i) an in-the-loop decision making process for the operations of collaborative robots coping with the variability of actions carried out by human operators, and (ii) the representation level, integrating a hierarchical AND/OR graph whose online behaviour is formally specified using First Order Logic. The architecture is accompanied by experiments including collaborative furniture assembly and object positioning tasks.
Index Terms—Human-robot cooperation; Smart factory; AND/OR graph; Task representation; Online decision making.
I. INTRODUCTION
The paradigm of consumer- and demand-driven manufacturing introduces the need for small-scale, customised, high quality production at lower prices, and with faster delivery times [1], [2]. However, small-scale production does not fully exploit the benefit of robot-based manufacturing yet [3], [4]. Consumer- and demand-driven manufacturing requires robots characterised by high flexibility, fast reconfiguration and installation, as well as low maintenance costs. Collaborative robots are expected to meet such demands, decrease costs, and therefore increase product variability and customisation [1], [2], [3]. In fact, they are considered key enabling factors to automate small-scale production when operations to be carried out are highly dynamic and partially unstructured [5].

Recently, many authors argued that consumer- and demand-driven manufacturing can benefit from the introduction of human-robot cooperation (HRC) processes [6]. HRC assumes that human operators and robots purposely interact in a shared workspace to achieve a common objective. The design of collaborative robots should adhere to a number of human-centric principles. Human-centric design enforces such factors as the explainability of robot decisions, the usability of robot interfaces, the awareness of the cooperation process, and a fair human-robot workload, as well as safety requirements for human operators [7], [8]. On the one hand, since it has been shown that the effectiveness and the overall performance of human operators are positively correlated with robot motion predictability [9], collaborative robots should (requirement R1) prevent psychological discomfort, stress, and a high induced cognitive load on human operators [10], [3], [11]. On the other hand, it has been demonstrated that a natural and efficient cooperation is possible only by a reasoned trade-off between the cooperation objective (e.g., the assemblage of a semi-finished product) and the human or robot degrees of autonomy. This is especially true when the task is only partially well-defined (e.g., such assemblage can be done using different action sequences), which can be somewhat enforced or relaxed on a context-dependent basis [12], [13]. Collaborative robots should (R2) be able to react to human operator actions while retaining the capability of planning goal-oriented action sequences [14], [15], [16], and (R3) do so by abstracting the structure of tasks from perceptual variability and uncertainties [10], [3], [17]. The customised production advocated by consumer- and demand-driven manufacturing still relies on the cognitive capabilities of human operators, since collaborative robots are largely unable to efficiently manage inter-task or intra-task variations [5]. The ability of human operators to decompose complex tasks into simpler operations (e.g., assembling furniture parts to obtain other semi-finished parts to be used later), or to naturally manage small variations (e.g., assembling furniture with parts of different size, like tables with differing flat top size or leg length), still poses a significant challenge for collaborative robots [18]. As a consequence, collaborative robots (R4) should exhibit decision making capabilities grounded on flexible task representations, and (R5) should enforce a definition of hierarchical action sequences able to map high-level complex tasks to low-level, simple robot operations.

In this paper, we present an integrated architecture for HRC processes, which we refer to as FlexHRC+, aimed at addressing requirements R2, R4 and R5 outlined above. FlexHRC+ can adapt the behaviour of collaborative robots to human operator actions, while proactively taking decisions aimed at meeting the cooperation goals (addressing R2). FlexHRC+ enables online human-robot decision making, a flexible execution of HRC tasks (addressing R4), and a hierarchical representation of such tasks enforcing modularity and reuse (addressing R5). Our contribution is at two levels.

• Human-robot cooperation level. The first contribution is an in-the-loop, hybrid reactive-deliberative architecture for online, flexible and scalable HRC processes. The architecture is characterised by proactive decision making and reactive adaptation to a perceived sequence of human operator actions or unsuccessful robot operations.

• Task representation level. The second contribution is an integrated hierarchical representation of HRC processes employing First Order Logic (FOL) and AND/OR graphs to model static and dynamic aspects of HRC-related tasks.

With respect to our previous work [16], [17], this paper provides two significant improvements. The former is the use of a FOL-based encoding of information stored in a hierarchical AND/OR graph. The FOL-based task representation has two important consequences. First, it allows for modelling (part of) cooperation tasks on a non-grounded, terminological level, which can therefore be specified as a set of assertions anchored to objects in the robot workspace. This allows for a more compact representation, with a consequent increased flexibility of the whole cooperation process. Second, it decouples the cooperation task from the involved objects. The latter foresees the use of hierarchical AND/OR graphs, which enable a great deal of modularity (when coupled with the FOL-based representation, graphs can be reused in different phases of the cooperation), as well as scalability.

The paper is organized as follows. Section II describes relevant state-of-the-art approaches in HRC processes. Section III introduces the main traits of FlexHRC+. We describe the FOL-based and hierarchical AND/OR graph task representation structure in Section IV, and the task management process in Section V. In Section VI, we describe the experimental scenarios and discuss relevant results. Conclusions follow.

(All the authors are with the Department of Informatics, Bioengineering, Robotics, and Systems Engineering, University of Genoa, Via Opera Pia 13, 16145, Genoa, Italy. Corresponding author's email: [email protected]. This paper has been accepted for publication in the IEEE Transactions on Robotics (T-RO).)

II. RELATED WORK
Task representation. Different approaches have been proposed to model HRC processes. Some of them are aimed at introducing aspects of social interaction [19], [20], [21]. Others, similarly to the work described in this paper, focus explicitly on HRC processes for collaborative manipulation or assembly. In this regard, the work in [22] proposed a three-layer framework at the team, agent, and skill execution levels using AND/OR graphs. Task allocation is done by minimizing a cost function offline, whereas the reactive behaviour is managed at the control level. Such a work is extended in [23], where the task allocation is modulated according to the co-worker ergonomics, and where the use of an augmented reality technology combined with the recognition of gestures of the human co-worker allows for an intuitive human-robot interaction. In both contributions, the flexibility aspects and task allocation are restricted to the offline phase. The uncertainties associated with the robot perception and the outcomes of robot actions are merely simplified at the control level. Therefore, the robot does not proactively make decisions. Moreover, the scalability of the solution cannot be easily determined. In our work, we overcome these limitations by addressing requirements R2, R4 and R5 with an online task allocation for the human operator or the robot, according both to human decisions as they unfold at run-time and to the robot online simulation results. An example of these advancements is provided in Section VI-D and Section VI-C.

The work presented in [24], [16] is aimed at recognising action sequences performed by human operators online, and at providing robots with methods to adapt accordingly. The recognition of such action sequences assumes actions to be completed before they can be properly recognised. This is expected to introduce possibly unacceptable delays in the cooperation process, and therefore it might jeopardise efficiency and naturalness. A slightly different approach, pursued in [25], employs probabilistic methods and AND/OR graphs to predict human operator actions, thereby trading off recognition performance and prediction accuracy. In case of wrong predictions, the effectiveness of the overall cooperation process can be negatively affected. Such an approach considers neither the intrinsic uncertainties of robot actions (partially fulfilling R3) nor the variations in assembly scenarios (therefore missing R4), while the table assembly scenario illustrated in Section VI-D meets those requirements. The approach proposed in [26] envisions a dyadic, mutual adaptation between human operators and robots. Robots act as leaders guiding human operators towards an efficient task execution strategy. It is no mystery that such an approach can lead to a lack of naturalness in the cooperation process. While human action prediction, recognition, and adaptation are forms of implicit human-robot communication, an explicit, speech-based communication is adopted instead in [20], [19]. Aspects related to naturalness are enforced using speech-based communication only in principle. In fact, this is done at the detriment of effectiveness and efficiency, since speech recognition can yield dramatically poor results in industrial scenarios.

The work in [27] implemented a concurrent, cooperative assembly task with relational Markov Decision Processes (MDPs), taking action durations into account. Using a probabilistic modelling of state transitions, the model can recover from failures in a reactive fashion. However, this is possible by disregarding perception uncertainties. MDPs have been used in [28], [21] to enforce adaptation to human operator behaviours online. Such an approach leads to purely reactive behaviours, which are considered indeed natural, but neither effective nor efficient. In our case, failures are avoided both with proactive decision trees and with the reactive traversal of an AND/OR graph. Finally, [27] models the cooperation with a tree-like structure whereas, in our case, the AND/OR graph and its hierarchical structure make the representation compact and modular. This enables the implementation of complex scenarios, such as the one described in Section VI-C.

Task networks and approaches based on classical planning [29], [30] are characterised by a natural description layer based on FOL [20]. Such a layer is expected to enforce effectiveness since it constitutes a close-to-human language used to associate semantics to each robot action [31]. A cognitive approach based on an attention-based mechanism is proposed by [32], in which plans are generated using hierarchical task networks, and an attention-based system executes and monitors multiple plans while resolving possible conflicts. To demonstrate the capabilities of the approach, a pick and place task in a simulated environment is shown, therefore eliminating the intrinsic uncertainties characterising perception and action in real-world environments. Differently from our proposition put forth in Section VI-D, the experiments do not support reactive behaviour or proactive decision making at the team and task levels. Similarly, [33] proposed a collaborative system with a graphical user interface design toolkit for task automation and recognition based on first-order behaviour trees (R5). Such a collaborative system does not adapt to the intrinsic variability of human decisions or to the workspace status. To accommodate for human preferences, an approach based on a scheduling and control framework has been introduced in [34]. In particular, an optimal scheduling policy embedding temporal constraints and human preferences is learned offline and executed online (partially meeting R2). However, it does not support an online adaptation to the possible workspace variability (therefore missing R4). Our contribution aims at overcoming these limitations with an online team and task level flexibility. Moreover, none of these works provide evidence for scalability, as shown in Section VI-C.

In order to model action planning in HRC scenarios, hierarchical approaches have been used [20]. The work in [35] took an industry-oriented perspective for human-robot workplace design. It allows for scaling collaboration to complex scenarios using a three-layer task representation approach (R5). Yet, differently from our approach, as demonstrated in Section VI-C, the one in [35] limits scalability to three layers at most.

Action planning. As mentioned above, a hierarchical representation allows for the modelling of complex cooperation tasks efficiently, and it enforces modularity and scalability. A few approaches consider the interplay between efficiency and naturalness in the cooperation process [20], [36], [37]. One of the main challenges to address in HRC scenarios is deciding how to allocate actions, either to the human operator, the robot, or in principle to both [12]. When a human operator is given the freedom of autonomously deciding how to accomplish a task, action allocation cannot be defined beforehand and must be resolved online [38], [39]. A multi-objective optimisation problem is typically formulated [39], [35]. It defines a utility measure considering the quality of an action result, its cost, its associated cognitive load, and the resources needed for its completion [40]. Such an optimisation problem is then resolved online for dynamic task allocation [41], [39], [42]. Other approaches are limited to offline solutions [35], [22].
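A scalarised utility of this kind can be sketched in a few lines. The sketch below is ours, not a reference implementation from the works cited above: the factor names follow the list in the text (quality, cost, cognitive load, resources), while the weights and the example numbers are purely illustrative.

```python
from dataclasses import dataclass

@dataclass
class ActionEstimate:
    quality: float    # expected quality of the action result, in [0, 1]
    cost: float       # expected execution cost (e.g., time in seconds)
    load: float       # cognitive/physical load induced on the agent, in [0, 1]
    resources: float  # fraction of the agent's resources the action occupies

def utility(est: ActionEstimate, weights=(1.0, 0.1, 0.5, 0.2)) -> float:
    """Scalarised multi-objective utility: higher is better."""
    wq, wc, wl, wr = weights
    return wq * est.quality - wc * est.cost - wl * est.load - wr * est.resources

def allocate(action_estimates: dict) -> str:
    """Assign the action to the agent (human or robot) with the best utility."""
    return max(action_estimates, key=lambda agent: utility(action_estimates[agent]))

# Illustrative trade-off: the human is faster but incurs a high cognitive load.
estimates = {
    "human": ActionEstimate(quality=0.9, cost=10.0, load=0.8, resources=0.5),
    "robot": ActionEstimate(quality=0.8, cost=20.0, load=0.0, resources=0.5),
}
print(allocate(estimates))
```

Resolving the allocation online then amounts to re-evaluating such a utility whenever the estimates change as the cooperation unfolds.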
As described in [17], which is expanded in this paper, resolving action allocation online can be done only if the relevant parameters of the employed utility measure are either estimated beforehand, or it is safe to assume they can be quantified while the cooperation process unfolds.

It is noteworthy that the integration of task representation, online task planning, task allocation, and motion planning is expected to enhance the robustness to failures and the overall HRC process efficiency [17]. It has been shown in [17] that in-the-loop robot motion predictions are beneficial to a natural interaction, as opposed to the prediction of human operator motions [43], [44].

III. THE FLEXHRC+ ARCHITECTURE
Overview. The architecture of FlexHRC+ is organised in three levels, namely the representation level (depicted in green in Figure 1), the perception level (in blue), and the action level (in red). The representation level maintains all the relevant information related to cooperative tasks via the Task Representation module, and to the shared workspace in the Knowledge Base module. The representation level orchestrates action planning and decision making for task execution, as well as action allocation, via the Task Manager module. The perception level acquires information about the workspace in terms of objects and other entities therein using the Object and Scene Perception module. It is responsible for detecting and classifying actions performed by human operators, as done in the Human Action Recognition module. Starting from raw data, the perception level updates the representation level with relevant information about objects and human operator actions. Then, the action level serialises the execution of robot actions (via the Robot Execution Manager module), performs in-the-loop robot action simulations in the Simulator module, and controls online all robot motions using the Controller module.

Fig. 1: FlexHRC+'s architecture: in green the representation level, in blue the perception level, in red the action level.
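The information flow just described can be caricatured as a minimal message-passing loop. This is a sketch of ours, not the authors' implementation: the class names mirror the module names in the text, but all logic is placeholder.

```python
class KnowledgeBase:
    """Representation level: stores workspace-related facts."""
    def __init__(self):
        self.facts = {}
    def update(self, key, value):
        self.facts[key] = value

class TaskManager:
    """Representation level: maps perceived events to the next command."""
    def __init__(self, kb):
        self.kb = kb
    def on_human_action(self, action):
        # Placeholder decision: record the action, emit a follow-up command.
        self.kb.update("last_human_action", action)
        return f"robot_action_after_{action}"

class RobotExecutionManager:
    """Action level: would forward commands to the Simulator or Controller."""
    def execute(self, command):
        return ("done", command)

kb = KnowledgeBase()
tm = TaskManager(kb)
rem = RobotExecutionManager()

# Perception reports a recognised human action; the representation level
# decides what to do next; the action level executes the resulting command.
status, cmd = rem.execute(tm.on_human_action("pick_table_leg"))
print(status, cmd)
```

The real modules exchange richer messages (object models, simulation outcomes, control references), but the direction of the flow, perception to representation to action, is the same.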
Representation level. The Task Representation module maintains knowledge about all possible states and state transitions modelling cooperative tasks. The module also defines how the HRC process can progress by providing suggestions to the human operator or commands to the robot about what to do next. As anticipated above, we apply AND/OR graphs, originally introduced by [45], to represent cooperative tasks [16], as better described in the next Section. The module receives as input the current cooperation status from the Task Manager, and in turn provides it with next action suggestions. The Task Manager module maps cooperation states as represented in the AND/OR graph structure to either human or robot actions. The module plans for suggested states receiving appropriate information from the Task Representation module, it grounds action parameters to actual values, and it performs action assignments on the basis of incoming perceptual information [16], [17]. The Knowledge Base module explicitly represents the cooperation state and workspace-related perceptual information using custom data structures [17].
Perception level. The Human Action Recognition module provides FlexHRC+ with information about actions performed by human operators. The module models action templates using Gaussian Mixture Models and Gaussian Mixture Regression. Models originate from a dataset of inertial data obtained using operator-worn sensors; the module then applies online a pattern matching algorithm to detect and recognise meaningful actions as performed by human operators [46]. To do so, the module receives an inertial data stream, and informs the Task Manager about the detected and recognised human operator actions [47], [16]. The Object and Scene Perception module provides information about objects in the robot workspace and models them using a set of primitive shapes characterised by their geometrical features. The module applies Euclidean distance-based clustering to a point cloud originating from an RGB-D sensor located on the robot body, applies the Random Sample Consensus (RANSAC) algorithm to model those clusters as primitive shapes, and determines their relevant features. Additionally, Principal Component Analysis is used to find complementary object features for manipulation purposes. Recognised objects and their features are maintained in the Knowledge Base module [48], [49], [50].
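The two geometric steps mentioned above, Euclidean clustering and RANSAC shape fitting, can be illustrated with a self-contained NumPy sketch. This is our simplified version under stated assumptions (greedy clustering, plane-only RANSAC), not the module's actual implementation, which follows [48], [49], [50].

```python
import numpy as np

def euclidean_cluster(points, radius=0.05):
    """Greedy Euclidean clustering: group points closer than `radius`."""
    remaining = list(range(len(points)))
    clusters = []
    while remaining:
        seed = [remaining.pop(0)]
        cluster = []
        while seed:
            i = seed.pop()
            cluster.append(i)
            near = [j for j in remaining
                    if np.linalg.norm(points[i] - points[j]) < radius]
            for j in near:
                remaining.remove(j)
            seed.extend(near)
        clusters.append(points[cluster])
    return clusters

def ransac_plane(points, n_iter=200, tol=0.01, seed=0):
    """Fit a plane (unit normal n, offset d with n·p = d) by RANSAC."""
    rng = np.random.default_rng(seed)
    best_inliers, best = 0, None
    for _ in range(n_iter):
        sample = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        if np.linalg.norm(n) < 1e-9:   # degenerate (collinear) sample
            continue
        n = n / np.linalg.norm(n)
        d = n @ sample[0]
        inliers = np.sum(np.abs(points @ n - d) < tol)
        if inliers > best_inliers:
            best_inliers, best = inliers, (n, d)
    return best
```

In the same spirit, each cluster would then be tested against a small library of primitives (planes, cylinders, spheres), keeping the model with the largest inlier set.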
Action level. The Robot Execution Manager maps action commands issued by the representation level to control actions fed to the Simulator or the Controller module. Each control action is associated with a hierarchy of equality or inequality control objectives. These include reaching a desired end-effector position, avoiding obstacles and joint limits, or respecting the kinematic constraints imposed by a rigid object manipulated by two arms.

In order to execute each control action, the Controller module exploits a kinematic task-priority based framework [51], [52], which solves a sequence of prioritized optimization problems, computing the reference velocities for the robot's actuators. The workspace-related feedback necessary to execute an action is received from the Knowledge Base. The task-priority framework allows for the activation and deactivation of such inequality objectives as, e.g., maintaining a minimum distance from obstacles or from humans, without over-constraining the solution when it is not necessary; hence, those objectives can be put at the highest priority, increasing the safety of the system. Other safety and compliance features, such as detecting and responding to contacts and impacts, preventing the application of continuous or excessive pressure, or keeping impact forces below design limits, are more related to the robot dynamic control, and are usually implemented within low-level control architectures. Therefore, they are not the focus of this paper.

The in-the-loop Simulator module replicates Controller operations, integrating its output to simulate the robot behaviour online. It receives reference information from the Robot Execution Manager, and provides it with the results of the simulation, e.g., failure/success of a given command, action execution time, final robot pose, or estimated energy consumption [17].

IV. THE HUMAN-ROBOT COOPERATION MODEL
A. Single-layer AND/OR Graphs
In order to formalise the human-robot cooperation process in FlexHRC+, we adopt AND/OR graphs. An AND/OR graph allows for an easy decomposition of problems and procedures in their building blocks (as parts of the graph), as well as the logic relationships among them (i.e., the graph connectivity). Since in AND/OR graphs the root node represents the solution to the problem being modelled, solving it means traversing the graph from leaf nodes to the root node according to its structure. AND/OR graphs can take limited forms of non-determinism or uncertainty into account [45], [53], [31] via the availability of various branches leading to the solution. In previous work [16], [17], such a representation has been adopted to model the online behaviour of simple cooperation processes. In this paper we systematise the original formulation and extend it along two directions: first, we provide a conceptualisation of AND/OR graphs compatible with a FOL-based task representation framework, which allows us to better model cooperation templates; second, we introduce and analyse the benefits of hierarchical AND/OR graphs to support modular, scalable, and flexible task representation.

An AND/OR graph G can be formally defined as a tuple ⟨N, H⟩, where N is a set of |N| nodes, and H is a set of |H| hyper-arcs. A hyper-arc h ∈ H induces two sets of nodes, namely the set N_c(h) ⊂ N of its child nodes, and the singleton N_p(h) ⊂ N made up of a parent node, such that

  h : N_c(h) → N_p(h).   (1)

For the scenarios we consider, at the semantic level each node n ∈ N represents a particular state related to the cooperation between a human operator and a collaborative robot. Each hyper-arc h ∈ H represents a (possibly) many-to-one transition among states, i.e., activities performed by human operators or robots that make the cooperation move forward. In FlexHRC+, a node n ∈ N is associated with a conjunction S(n) of literals, such that

  S(n) = s_1 ∧ … ∧ s_k ∧ … ∧ s_{|S(n)|},   (2)

where each literal s_k may consist of variables, constants or logic predicates, also negated. As will be described later, each literal can be considered a representation fragment related to the cooperation state defined by n. As a consequence, we will refer to S(n) as the cooperation state represented in n. It is noteworthy that a given S does not have to include only grounded literals, i.e., constants or grounded predicates. It can include variables as well, and as such the corresponding node can be treated as a class of states (or cooperation templates) at the Task Representation level [31].

Using the definition of state in (2), it is possible to better specify the state transition in (1). This is a relationship, induced by a hyper-arc h, between a set of requirements made up by joining all the literals defining the states S(n_k) associated with all the nodes n_k ∈ N_c(h), and a set of effects made up by the state S(n) associated with the single node n ∈ N_p(h), such that

  h : S(n_1) ∧ … ∧ S(n_k) ∧ … ∧ S(n_{|N_c|}) → S(n).   (3)

The relation among child nodes in hyper-arcs is the logical and, whereas the relation between different hyper-arcs inducing on the same parent node is the logical or. Different hyper-arcs inducing on the same parent node represent alternative ways for a cooperation process to move on.
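The structure in (1)-(3) can be sketched as a small data structure. The sketch below is ours (class names and the furniture-assembly literals are illustrative, not the authors' implementation): each hyper-arc maps a set of child nodes N_c(h) to a single parent N_p(h), and a node carries the conjunction of literals S(n) describing its cooperation state.

```python
class Node:
    """A cooperation state: S(n) as a conjunction of literals."""
    def __init__(self, name, literals):
        self.name = name
        self.literals = set(literals)

class HyperArc:
    """A many-to-one state transition h : N_c(h) -> N_p(h)."""
    def __init__(self, name, children, parent):
        self.name = name
        self.children = set(children)  # N_c(h)
        self.parent = parent           # N_p(h), a single node

    def requirements(self):
        # Requirements of (3): the join of the literals of all child states.
        return set().union(*(n.literals for n in self.children))

# Two hyper-arcs inducing on the same parent encode a logical OR: the table
# can be completed either from an attached leg or from a screwed-in leg.
leg_attached = Node("leg_attached", ["attached(leg1, top)"])
leg_screwed = Node("leg_screwed", ["screwed(leg1, top)"])
table_done = Node("table_done", ["assembled(table)"])
h1 = HyperArc("h1", [leg_attached], table_done)
h2 = HyperArc("h2", [leg_screwed], table_done)
print(sorted(h1.requirements()))
```

Because literals are kept as plain strings here, swapping them for non-grounded FOL terms (with variables rather than constants) would turn each node into the class of states, or cooperation template, discussed above.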
We define n ∈ N as a leaf node if n is not acting as a parent node for any hyper-arc, i.e., if no h ∈ H exists such that n ∈ N_p(h). Similarly, we define n as a root node if it is the only node that is not a child node for any hyper-arc, i.e., if no h ∈ H exists such that n ∈ N_c(h).

Each hyper-arc h ∈ H implements the transition in (3) by checking the truth values associated with all the requirements defined by nodes in N_c(h), executing a number of actions associated with h, and generating effects compatible with the cooperation state of the parent node. In particular, each hyper-arc h ∈ H is responsible for executing an ordered set A(h) of actions, such that

  A(h) = (a_1, …, a_k, …, a_{|A|}; ⪯),   (4)

where the precedence operator ⪯ defines the pairwise expected order of action execution. Before a hyper-arc h is executed, all actions a ∈ A(h) are marked as undone, and we refer to this using a predicate done(a) ← false. When one action a is executed, either by the human operator or the robot, its status changes to done, as done(a) ← true. A hyper-arc h ∈ H is marked as solved, i.e., solved(h) ← true, iff all actions a ∈ A(h) are done in the expected order.

In a similar way, nodes n ∈ N may be associated with a (possibly ordered) set of processes P(n), i.e.,

  P(n) = (p_1, …, p_k, …, p_{|P|}; ⪯).   (5)

Differently from actions, which are instrumental to perform transitions from one cooperation state to another, and must necessarily be executed by human operators or robots, processes are robot behaviours which do not imply any state transition. They are used to control physical or other non-functional variations of some quantity over time. An example may be a robot behaviour aimed at keeping a certain object in a given pose or configuration using two grippers, the effects of external forces notwithstanding, because the related cooperation states assume that object pose.
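The ordered action set A(h) of (4) and the solved(h) predicate can be sketched as follows; this is an illustrative encoding of ours, in which the list order stands in for the precedence operator ⪯ and the action names are made up.

```python
class HyperArc:
    """A hyper-arc with its ordered action set A(h) of (4)."""
    def __init__(self, actions):
        self.actions = list(actions)         # list order encodes the ⪯ relation
        self.done = {a: False for a in actions}  # done(a) <- false initially

    def mark_done(self, action):
        # done(a) <- true is legal only once all predecessors of a are done.
        idx = self.actions.index(action)
        if not all(self.done[a] for a in self.actions[:idx]):
            raise ValueError(f"{action} executed out of order")
        self.done[action] = True

    def solved(self):
        # solved(h) <- true iff all actions are done in the expected order.
        return all(self.done.values())

h = HyperArc(["pick_leg", "place_leg", "screw_leg"])
h.mark_done("pick_leg")
h.mark_done("place_leg")
h.mark_done("screw_leg")
print(h.solved())
```

Processes could be modelled analogously, with activated(p) flags that are set when the parent node is reached and cleared when the process-specific termination conditions hold.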
Processes are meant to be executed by robots, whereas a human operator may carry them out occasionally. Each process is characterised by a priority, or precedence. Actions and processes are associated with hyper-arcs and states, respectively. However, there is no causal relationship between actions and processes; hence, they can be executed in any order, according to the priority or preference. When a node n ∈ N is reached, all of its processes are activated, i.e., activated(p) ← true for each p ∈ P(n). A process p ∈ P(n) can be deactivated when certain process-specific termination conditions are met, i.e., activated(p) ← false. A node n is marked as met, i.e., met(n) ← true, if all the associated processes are deactivated, if necessary in the prescribed order, or P(n) is an empty set.

Using these definitions, it is possible to introduce the notion of feasibility for nodes and hyper-arcs. A node n ∈ N is feasible, which we refer to as feasible(n) ← true, iff a solved hyper-arc h ∈ H exists for which n ∈ N_p(h), and met(n) ← false, i.e.,

  ∃ h ∈ H . (solved(h) ∩ n ∈ N_p(h) ∩ ¬met(n)).   (6)

All leaf nodes in an AND/OR graph are usually feasible at the beginning of the human-robot cooperation process, which means that the cooperation itself can be performed in many ways, and is not constrained to follow certain sequences of operations. In a similar way, a hyper-arc h ∈ H is feasible, i.e., feasible(h) ← true, iff for each node n ∈ N_c(h), met(n) ← true and solved(h) ← false, i.e.,

  ∀ n ∈ N_c(h) . (met(n) ∩ ¬solved(h)).   (7)

Once a hyper-arc h_i ∈ H is solved, all other feasible hyper-arcs h_j ∈ H \ {h_i} which share with h_i at least one child node, i.e., N_c(h_i) ∩ N_c(h_j) ≠ ∅, are marked as unfeasible, in order to prevent the cooperation process from considering alternative ways to cooperate that have become irrelevant.

The human-robot cooperation process is modelled as a graph traversal procedure. Starting from a set of leaf nodes, it must reach the root node by selecting hyper-arcs and reaching states in one of the available sequences, depending on the feasibility statuses of nodes and hyper-arcs. To this aim, each node n ∈ N is associated with a weight w(n), and each hyper-arc h ∈ H is similarly associated with a weight w(h). Weights are related to the number, difficulty or time-to-completion of actions/processes, and to other more qualitative metrics related to human operator preferences [54]. Nodes or hyper-arcs with lower weights are privileged compared to others with higher weights. Weights are identified through several demonstrations by expert users. Then, a cooperation path cp induced by G is a set of nodes and hyper-arcs, such that

  cp = (n_1, …, n_k, h_1, …, h_l),   (8)

which represents a particular way to connect leaf nodes to the root node. We refer to the set of cooperation paths induced by G as CP(G), where each element cp ∈ CP is in the form described by (8). According to the structure of the modelled human-robot cooperation task, multiple cooperation paths may exist, meaning that multiple ways to solve the task may be equally legitimate. Each cooperation path cp ∈ CP can be associated with an overall cost c(cp), such that

  c(cp) = Σ_{j=1}^{k} w(n_j) + Σ_{j=1}^{l} w(h_j).   (9)

The different cooperation paths in CP can be ranked according to their overall costs.
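The cost of (9) and the resulting ranking amount to a weighted sum followed by a minimum. The snippet below is a literal transcription of ours; the path encoding (a pair of node and hyper-arc name lists) and the example weights are illustrative.

```python
def path_cost(path, node_weights, arc_weights):
    """Overall cost c(cp) of (9): sum of node and hyper-arc weights."""
    nodes, arcs = path
    return (sum(node_weights[n] for n in nodes)
            + sum(arc_weights[h] for h in arcs))

# Two alternative cooperation paths to the same root; lower cost is preferred.
node_w = {"n1": 1.0, "n2": 2.0, "root": 0.5}
arc_w = {"h1": 3.0, "h2": 1.0}
paths = [
    (["n1", "root"], ["h1"]),
    (["n2", "root"], ["h2"]),
]
best = min(paths, key=lambda p: path_cost(p, node_w, arc_w))
print(best)
```

Ranking CP this way is what lets the traversal procedure described next always follow the currently cheapest cooperation path.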
Two cooperation paths cp_i and cp_j ∈ CP are equal iff the corresponding sets of nodes and hyper-arcs are the same, and are equivalent iff their corresponding overall costs are the same.

The traversal procedure dynamically follows the cooperation path that at any time is characterised by the lowest cost. The traversal procedure suggests to human operators the actions in the hyper-arcs that are part of the path, and sends to robots the actions they must execute. Human operators can override the suggestions at any time, executing different actions, which may cause the system to be in a cooperation state that is not part of the current cooperation path. When this situation is detected, FlexHRC+ tries to progress from that state onwards [16], [17]. This mechanism enables FlexHRC+ to pursue an optimal path leading to the solution, while it allows human operators to choose alternative paths when they deem it fit. As the human-robot cooperation process unfolds, and the AND/OR graph is traversed, we refer with N_f and H_f to the sets of currently feasible nodes and hyper-arcs, respectively. In fact, the actual elements of these two sets depend on the evolution of the cooperation process.

We say that an AND/OR graph G is solved, and we write solved(G) ← true, iff its root node r ∈ N is met, i.e., met(r) ← true. Otherwise, if the condition N_f ∪ H_f = ∅ holds (i.e., there are no feasible nodes nor hyper-arcs), then the human-robot cooperation process has failed, because there is no feasible cooperation path leading to the root. It is noteworthy that representations based on AND/OR graphs, when updated online, do not require full knowledge of the robot workspace. In fact, while a given cooperation path is followed, the traversal algorithm only needs knowledge about feasible nodes and hyper-arcs to make the task progress.

The AND/OR graph structure presented in this paper is based on the one introduced in [16]. Notable differences are the possibility of allowing for multiple hyper-arcs connecting the same child nodes to a parent node, ensuring the minimum cost returned from each hyper-arc or node, and supporting a FOL-based task representation. The first feature allows the AND/OR graph to model different state transitions from one cooperation state to another, the second one ensures an optimal, predictable, and therefore explainable robot behaviour, whereas the last one increases the overall expressive power of the representation structure.

The single-layer AND/OR graph traversal procedure is composed of two phases, the first being offline and the second online. The offline phase loads the description of the AND/OR graph, generates the data structure G, and sets the graph status as unsolved.
Later, the feasibility of all nodes and hyper-arcs is checked, the set CP of cooperation paths is generated, and suggestions for next actions (in terms of nodes and hyper-arcs) are computed as defined in (9).

The update of the feasibility statuses of all the involved nodes and hyper-arcs, i.e., populating the corresponding sets N_f and H_f, is done simply by iteratively invoking two functions, namely UPDATENODEFEASIBILITY() and UPDATEHYPERARCFEASIBILITY(), which are detailed in Algorithm 1 and Algorithm 2. Given a node n and a hyper-arc h, the two Algorithms use such predicative knowledge about n and h as the values of feasible(n), feasible(h), met(n), and solved(h) to update the feasibility of graph nodes and hyper-arcs, respectively, therefore producing updated sets N_f and H_f.

Algorithm 1 UPDATENODEFEASIBILITY()
Require: A node n, feasibility sets N_f and H_f
Ensure: Updated feasibility sets N_f and H_f
 1: feasible(n) ← false
 2: N_f ← N_f \ {n}
 3: if met(n) then
 4:   for all h s.t. n ∈ N_c(h) do
 5:     if solved(h) then
 6:       feasible(h) ← false
 7:       H_f ← H_f \ {h}
 8:     else
 9:       feasible(h) ← true
10:      H_f ← H_f ∪ {h}
11:      for all n′ s.t. n′ ∈ N_c(h) do
12:        if ¬met(n′) then
13:          feasible(h) ← false
14:          H_f ← H_f \ {h}
15:          break
16:        end if
17:      end for
18:    end if
19:  end for
20: else
21:  if N_c(n) = ∅ then
22:    feasible(n) ← true
23:    N_f ← N_f ∪ {n}
24:  else
25:    for all h s.t. n ∈ N_p(h) do
26:      if solved(h) then
27:        feasible(n) ← true
28:        N_f ← N_f ∪ {n}
29:        break
30:      end if
31:    end for
32:  end if
33: end if
34: return (N_f, H_f)

In Algorithm 1, lines 3-19 update the feasibility status of a relevant hyper-arc h when n ∈ N_c(h) and met(n) holds true. In case met(n) holds false (lines 20-32), lines 21-23 make the node feasible when it does not have any child node, i.e., if n is not a parent of any hyper-arc. Lines 24-31 check for a solved hyper-arc connected to node n and, in case at least one such hyper-arc exists, the node is marked as feasible.

Algorithm 2 UPDATEHYPERARCFEASIBILITY()
Require: A hyper-arc h, feasibility sets N_f and H_f
Ensure: Updated feasibility sets N_f and H_f
 1: feasible(h) ← false
 2: if solved(h) then
 3:   n ← N_p(h)
 4:   if ¬met(n) then
 5:     feasible(n) ← true
 6:     N_f ← N_f ∪ {n}
 7:   end if
 8:   for all n s.t. n ∈ N_c(h) do
 9:     for all h′ s.t. n ∈ N_c(h′) do
10:      feasible(h′) ← false
11:      H_f ← H_f \ {h′}
12:    end for
13:  end for
14: else
15:  feasible(h) ← true
16:  H_f ← H_f ∪ {h}
17:  for all n s.t. n ∈ N_c(h) do
18:    if ¬met(n) then
19:      feasible(h) ← false
20:      H_f ← H_f \ {h}
21:      break
22:    else if ∃h′ s.t. solved(h′) ∧ n ∈ N_c(h′) then
23:      feasible(h′) ← false
24:      H_f ← H_f \ {h′}
25:    end if
26:  end for
27: end if
28: return (N_f, H_f)

In Algorithm 2, lines 2-13 update the feasibility statuses when the hyper-arc h is solved. The feasibility of h's parent node (line 3) is updated in lines 4-7. The feasibility of all the hyper-arcs having a common set of child nodes with h is updated in lines 8-13. Lines 14-27 check the feasibility of the unsolved hyper-arc h: if a child node of h (line 17) is not met (lines 18-21), h becomes infeasible, whereas if there is another solved hyper-arc h′ with a set of child nodes common with h (lines 22-25), h′ is marked as infeasible.

Finally, FINDSUGGESTIONS() in Algorithm 3 determines the set Φ of suggestions, whose elements are generically indicated using x, and their associated cost c(x). There might be different paths from nodes in N_f or hyper-arcs in H_f to the root of G. Therefore, the AND/OR graph is expected to provide the minimum cost among all these cooperation paths to ensure optimality. The cost c(x) for a node or hyper-arc is the minimum cost among the cooperation paths cp to which the node or the hyper-arc belongs. Therefore, the Algorithm guarantees the optimality of the AND/OR graph because, for all nodes or hyper-arcs in N_f or H_f, respectively, it holds that:

c(x) = min_{x ∈ cp} c(cp). (10)

Algorithm 3 FINDSUGGESTIONS()
Require: An AND/OR graph G = ⟨N, H⟩
Ensure: A set Φ = {⟨x, c(x)⟩} of suggestions
 1: Φ ← ∅
 2: cost ← inf
 3: for all n ∈ N s.t. feasible(n) do
 4:   cost ← inf
 5:   for all cp ∈ CP(G) s.t. n ∈ cp do
 6:     if c(cp) < cost then
 7:       cost ← c(cp)
 8:     end if
 9:   end for
10:  Φ ← Φ ∪ {⟨n, cost⟩}
11: end for
12: for all h ∈ H s.t. feasible(h) do
13:  cost ← inf
14:  for all cp ∈ CP(G) s.t. h ∈ cp do
15:    if c(cp) < cost then
16:      cost ← c(cp)
17:    end if
18:  end for
19:  Φ ← Φ ∪ {⟨h, cost⟩}
20: end for
21: return Φ

In Algorithm 3, lines 3-11 return feasible nodes and the minimum cost of the cooperation paths which include them. The same applies to hyper-arcs in lines 12-20.

The online phase follows Algorithm 4. The two sets of met nodes and solved hyper-arcs are referred to as N_m and H_s, respectively. Upon the reception of the Task Manager's query, the Algorithm updates node and hyper-arc statuses (in terms of the solved, met, and feasible predicates) in lines 2-9. The solved status for the whole AND/OR graph is then checked: if the root node is met, the graph is marked as solved (line 11). Otherwise, line 14 updates all the path weights, which include nodes in N_m and hyper-arcs in H_s. In line 15, the new feasible nodes and hyper-arcs, and their associated costs, are made available.

Algorithm 4 ONLINEPHASE()
Require: An AND/OR graph G = ⟨N, H⟩, feasibility sets N_f and H_f, the met set N_m, the solved set H_s
Ensure: An updated AND/OR graph G, updated feasibility sets N_f and H_f, a set Φ = {⟨x, c(x)⟩} of suggestions
 1: n_r ← GETROOT(G)
 2: for all n ∈ N_m do
 3:   METNODE(n, G)
 4:   UPDATENODEFEASIBILITY(n, N_f, H_f)
 5: end for
 6: for all h ∈ H_s do
 7:   SOLVEDHYPERARC(h, G)
 8:   UPDATEHYPERARCFEASIBILITY(h, N_f, H_f)
 9: end for
10: if met(n_r) then
11:  solved(G) ← true
12:  return
13: end if
14: UPDATEALLPATHS(G, N_m, H_s)
15: Φ ← FINDSUGGESTIONS(G)
16: return

In the Algorithm, the functions METNODE(n, G) and SOLVEDHYPERARC(h, G) first check whether feasible(n) or feasible(h) holds true, and then update G by setting met(n) ← true and solved(h) ← true, respectively. In particular, UPDATEALLPATHS(G, N_m, H_s) updates the cooperation path costs at each query. For a given cooperation path cp ∈ CP, the path cost c(cp) at each moment is the cost of traversing it from the current state to the root state. Initially, all the path costs are computed from the leaves to the root using (9). When a node or hyper-arc belonging to a given cooperation path is met or solved, the path's overall cost is reduced by an amount related to its weight, i.e.:

∀x ∈ N_m ∪ H_s : c(cp) = c(cp) − w(x). (11)

B. Hierarchical AND/OR Graphs
The use of hierarchical AND/OR graphs in the context of HRC tasks has two motivations. The first is related to the computational complexity of single-layer AND/OR graphs, while the second is related to flexibility and scalability requirements. It has been shown that AND/OR graphs are characterised by a polynomial time complexity in the number of nodes and hyper-arcs [55]. The problem of determining whether a solution exists, in terms of a path from the set N_L of leaf nodes to the root node, is NP-hard [56]. In the online phase of HRC tasks, being able to quickly determine and select an alternative cooperation path to take into account human operator preferences is therefore of the utmost importance. On the computational side, this means reducing the number of nodes and hyper-arcs which the Task Manager module must reason upon. Different real-world operations are structured as mandatory or alternative sets of human or robot actions, which can be seen as atomic. Being able to identify and re-use the same sub-sequences of operations in different parts of the same HRC process, or as part of different processes, is expected to enhance flexibility, because such sub-sequences can be easily substituted if needed, and scalability, since the overall complexity can be increased while maintaining a manageable representation overhead.

Analogously to single-layer AND/OR graphs, a hierarchical AND/OR graph H is defined as a tuple ⟨Γ, Θ⟩, where Γ is an ordered set of |Γ| AND/OR graphs, such that:

Γ = (G^0, . . . , G^{|Γ|}; ⪯), (12)

and Θ is a set of |Θ| transitions between couples of AND/OR graphs. In (12), the AND/OR graphs are pairwise ordered according to their depth level. With a slight abuse of notation, we associate a depth level l with an AND/OR graph G and indicate it with G^l, the highest level being l = 0. AND/OR graphs with increasing depth levels are characterised by a decreasing level of abstraction, i.e., deeper graphs model HRC more accurately. Transitions in Θ define how different AND/OR graphs in Γ are connected, and in particular model the relationship between any G^l and a deeper connected graph G^{l+1}.

It is necessary to better define transitions. If we recall (3) and contextualise it for an AND/OR graph G^l = ⟨N^l, H^l⟩, we observe that a given hyper-arc in H^l represents a mapping between the set of its child nodes and the singleton parent node. We can think of a generalised version of such a mapping encompassing a whole AND/OR graph G^{l+1} = ⟨N^{l+1}, H^{l+1}⟩, where the set of child nodes is constituted by the set N_L^{l+1} of leaf nodes, and the singleton parent node by the graph's root node r^{l+1} ∈ N^{l+1}, such as:

G^{l+1} : S(n_1^{l+1}) ∧ . . . ∧ S(n_k^{l+1}) ∧ . . . ∧ S(n_{|N_L^{l+1}|}^{l+1}) → S(r^{l+1}). (13)

A transition T can be defined between a hyper-arc h^l ∈ H^l and an entire deeper AND/OR graph G^{l+1}, such that:

T : h^l → G^{l+1}, (14)

subject to the fact that appropriate mappings can be defined between the set of child nodes of h^l and the set of leaf nodes of the deeper graph, i.e.,

M : N_c(h^l) → N_L^{l+1} ⊆ N^{l+1}, (15)

and the singleton set of parent nodes of h^l and the root node of the deeper graph, i.e.,

M : N_p(h^l) → r^{l+1} ∈ N^{l+1}. (16)

The mappings (15) and (16) must be such that the conjunction of literals of nodes in N_c(h^l) and the conjunction of literals of leaves in G^{l+1} are semantically equivalent. They should be the same, or represent the same information with a different depth of representation; for example, each literal of nodes in N_c(h^l) may correspond to one or more literals of nodes in N_L^{l+1}. The same applies to the root of G^{l+1} and N_p(h^l). Once these mappings are defined, it is easy to see that H has a tree-like structure, where graphs in Γ are nodes and transitions in Θ are edges.

An AND/OR graph G^l is feasible, and we refer to it as feasible(G^l), if it has at least one feasible node or hyper-arc. If a transition T exists in the form (14), a hyper-arc h^l ∈ H^l is feasible iff the associated deeper AND/OR graph G^{l+1} is feasible:

∀T. (feasible(G^{l+1}) ↔ feasible(h^l)). (17)

When the hyper-arc h^l becomes feasible in G^l, the nodes in N_L^{l+1} of G^{l+1} become feasible as well. Furthermore, the hyper-arc h^l is solved iff the associated deeper AND/OR graph G^{l+1} is solved:

∀T. (solved(G^{l+1}) ↔ solved(h^l)). (18)

For all hyper-arcs in H^l for which a transition T towards G^{l+1} exists, we must define how to compute the related weight. If we denote by cp^{l+1,*} the cooperation path in G^{l+1} characterised by the lowest cost, we can simply define:

w(h^l) = c(cp^{l+1,*}). (19)

In this case, the weight is attributed using an optimistic strategy, because, should the cooperation path followed in G^{l+1} change, the actual w(h^l) may turn out to be underestimated.

Similarly to the single-layer case, hierarchical AND/OR graphs are used in two phases, first offline and then online. A transition T is modelled using a function in the form G^{l+1} = LOWERGRAPH(h^l), whereas the inverse relationship is obtained using h^l = UPPERHYPERARC(G^{l+1}). The offline phase first loads the description of the highest-level AND/OR graph G^0.
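Conditions (17)-(19) can be sketched compactly. In the following illustration (the `SubGraph` class and helper names are ours, not part of the original implementation), a deeper graph is summarised only by the quantities the conditions need:

```python
class SubGraph:
    """Minimal stand-in for a deeper AND/OR graph G^{l+1}."""
    def __init__(self, feasible_nodes, feasible_arcs, solved, path_costs):
        self.feasible_nodes = set(feasible_nodes)
        self.feasible_arcs = set(feasible_arcs)
        self.solved = solved
        self.path_costs = list(path_costs)   # c(cp) of each cooperation path

    def feasible(self):
        # feasible(G): at least one feasible node or hyper-arc
        return bool(self.feasible_nodes or self.feasible_arcs)

def arc_feasible(deeper: SubGraph) -> bool:
    # (17): feasible(G^{l+1}) <-> feasible(h^l)
    return deeper.feasible()

def arc_solved(deeper: SubGraph) -> bool:
    # (18): solved(G^{l+1}) <-> solved(h^l)
    return deeper.solved

def arc_weight(deeper: SubGraph) -> float:
    # (19): w(h^l) = c(cp^{l+1,*}), the optimistic minimum-cost path
    return min(deeper.path_costs)
```

For instance, a deeper graph with one feasible leaf and path costs {3.0, 5.0} makes its upper hyper-arc feasible with optimistic weight 3.0, which may later prove an underestimate if the cheapest path is abandoned.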
Considering any nesting level l, if a hyper-arc h ∈ H^l is associated with a deeper AND/OR graph description G^{l+1} by a transition, the Algorithm calls the function OFFLINEPHASE() on G^{l+1} to build it before going on with G^l.

Algorithm 5 describes the workflow associated with the hierarchical AND/OR graph during online execution.

Algorithm 5 ONLINEHIERARCHICALPHASE()
Require: A hierarchical AND/OR graph H = ⟨Γ, Θ⟩, feasibility sets N_{i,f} and H_{i,f}, the met set N_{i,m}, the solved set H_{i,s} for each G_i ∈ Γ
Ensure: An updated hierarchical AND/OR graph H, updated feasibility sets N_{i,f} and H_{i,f} for each G_i ∈ Γ, a set Φ_H = {⟨x, c(x), g(x)⟩} of suggestions
 1: G^0 ← GETROOTGRAPH(Γ)
 2: r ← GETROOT(G^0)
 3: for all G_i ∈ Γ do
 4:   for all n ∈ N_{i,m} do
 5:     METNODE(n, G_i)
 6:     UPDATENODEFEASIBILITY(n, G_i)
 7:     if solved(G_i) and G_i ≠ G^0 then
 8:       h ← UPPERHYPERARC(G_i)
 9:       solved(h) ← true
10:      H_{i,s} ← H_{i,s} ∪ {h}
11:    end if
12:  end for
13: end for
14: for all G_i ∈ Γ do
15:  for all h ∈ H_{i,s} do
16:    SOLVEDHYPERARC(h, G_i)
17:    UPDATEHYPERARCFEASIBILITY(h, N_{i,f}, H_{i,f})
18:  end for
19: end for
20: if met(r) then
21:  solved(G^0) ← true
22:  return
23: end if
24: for all G_i ∈ Γ do
25:  UPDATEALLPATHS(G_i, N_{i,m}, H_{i,s})
26: end for
27: N_f ← N_{0,f} ∪ . . . ∪ N_{|Γ|,f}
28: H_f ← H_{0,f} ∪ . . . ∪ H_{|Γ|,f}
29: Φ_H ← ∅
30: Φ_H ← FINDSUGGESTIONS(H, Φ_H)
31: return

Whenever the status of the HRC process needs updating, the graph representation is updated starting from all the sets N_{i,m} of met nodes and the sets H_{i,s} of solved hyper-arcs, for all AND/OR graphs in Γ (lines 3-12). After node statuses are updated in lines 5-6, the Algorithm checks whether any graph is solved (line 7). If this holds true and the solved graph is not the root graph of H, then the associated higher-level hyper-arc is labelled as solved (line 9) and then included in the corresponding set of solved hyper-arcs H_{i,s} (line 10). Lines 14-19 update the feasibility statuses for all solved hyper-arcs. Then, if the root node of the root graph is met, the whole graph is solved (line 21) and the Algorithm terminates (line 22). Otherwise, all cooperation paths are updated (line 25), and the set Φ_H of next suggestions is found (line 30), as better described in Algorithm 6. It is noteworthy that Φ_H includes the elements of Φ and adds to each of them the label of the graph containing the node or hyper-arc, yielding triplets ⟨x, c(x), g(x)⟩.

Algorithm 6 FINDSUGGESTIONS()
Require: A hierarchical AND/OR graph H = ⟨Γ, Θ⟩, a set Φ_H = {⟨x, c(x), g(x)⟩} of suggestions
Ensure: An updated set Φ_H
 1: cost ← inf
 2: for all G_i = ⟨N_i, H_i⟩ ∈ Γ do
 3:   for all n ∈ N_i s.t. feasible(n) do
 4:     cost ← inf
 5:     for all cp ∈ CP(G_i) s.t. n ∈ cp do
 6:       if c(cp) < cost then
 7:         cost ← c(cp)
 8:       end if
 9:     end for
10:    Φ_H ← Φ_H ∪ {⟨n, cost, G_i⟩}
11:  end for
12: end for
13: for all G_i = ⟨N_i, H_i⟩ ∈ Γ do
14:  for all h ∈ H_i s.t. feasible(h) do
15:    cost ← inf
16:    for all cp ∈ CP(G_i) s.t. h ∈ cp do
17:      if c(cp) < cost then
18:        cost ← c(cp)
19:      end if
20:    end for
21:    if LOWERGRAPH(h) = null then
22:      Φ_H ← Φ_H ∪ {⟨h, cost, G_i⟩}
23:    else
24:      G_j ← LOWERGRAPH(h)
25:      Φ_H ← FINDSUGGESTIONS(G_j, Φ_H), with x ∈ N_{j,f} ∪ H_{j,f}
26:      for all ⟨x, c(x), g(x)⟩ ∈ Φ do
27:        cost ← cost − w(h) + c(x)
28:        Φ_H ← Φ_H ∪ {⟨x, cost, g(x)⟩}
29:      end for
30:    end if
31:  end for
32: end for
33: return Φ_H

Algorithm 6 first finds the feasible nodes that are part of an optimal cooperation path (lines 2-12), as well as the associated cost and graph. A similar operation is done in lines 13-32 for hyper-arcs. In this case, it is necessary to check whether a transition exists towards a deeper AND/OR graph. If this is not the case, the hyper-arc is stored as a suggestion. Otherwise, the associated graph is determined and the function is recursively called on it. Finally, the minimum cost from the parent node of a hyper-arc to the root node of the corresponding graph is computed in line 27, and the suggestion is updated accordingly.

Figure 4 provides an example of a hierarchical AND/OR graph for a kitchen assembly scenario. On the left hand side, the first layer includes nodes, with the root node being on top, and hyper-arcs. Two hyper-arcs can be further specialised as second-layer graphs, e.g., the one in the middle of the Figure. This is characterised by nodes and hyper-arcs (partly depicted in the Figure), two of which are further specialised as a third layer on the right.

V. REASONING ON THE COOPERATION MODEL
All reasoning tasks on the cooperation model are carried out within the Task Manager module. The module receives the sets of feasible states or hyper-arcs from the Task Representation module, determines the sequence of actions for each hyper-arc (or the sequence of processes for states), grounds relevant parameters to actions, and allocates actions to human operators or robots so as to maximise a utility indicator.
A. Reasoning upon First Order Logic based AND/OR Graphs
Differently from what happens in standard AND/OR graphs, FlexHRC+ encodes actions in hyper-arcs using a notation compliant with the Planning Domain Definition Language (PDDL) formalism [29]. In FlexHRC+, an action a contributes to a transition modelled as a hyper-arc (3). To do so, it acts on a set param(a) of parameters to anchor, and it maps a set pre(a) of precondition literals in conjunction (possibly defining cooperation states) to a set eff(a) of effect literals, which are part of other cooperation states in the graph or are intermediate literals. The set of effect literals can be split into two disjoint sets, namely a set of literals eff+(a) holding true after the action has been executed, and a set of literals eff−(a) not holding anymore after the action. Therefore, an action in FlexHRC+ can be defined as:

a = (param(a), pre(a), eff+(a), eff−(a)). (20)

Each action a is associated with a set agents(a) enumerating the agents (either human operators or robots) which may be responsible for performing a. It is noteworthy that such a set may be a singleton (e.g., when only one agent can be allocated to the action), may define a set of possibilities (e.g., when a human operator or a robot may be tasked with the same action), or may define a list of agents which may be required to perform the action jointly. Using such a formalisation, although the semantics associated with the literals is known, the literals may not be anchored to any real object in the robot workspace at the modelling level.

Using the sequences of actions for all the feasible states and hyper-arcs, with their associated costs, the Task Manager creates a data structure called
Action-State table [16]. The table keeps the information of the chosen state to follow and the progress of the associated action executions. Given the Action-State table, the Task Manager either proactively selects which hyper-arc to follow, or adapts to human preferences as soon as their actions are duly recognised. In both cases, in order to reach the goal node in the graph, it identifies a cooperation path to follow with the minimum cost according to (9). Once the
Task Manager selects a feasible hyper-arc that is part of the minimum-cost path, in order to perform each action associated with the hyper-arc, FlexHRC+ must anchor the non-grounded literals in the action definitions and allocate agents to each action. To do so, updated information about the workspace is first retrieved online from the Knowledge Base module. Then, all the possible literal groundings are determined, as well as all the possible combinations of agents which may be tasked with the set of ordered actions associated with the selected hyper-arc. Using such information, a decision tree is automatically generated [17], whose various branches are related to the diverse parameter groundings and the agents who can perform the actions. All branches in the tree are ranked using a utility function, i.e., a metric to estimate the performance and quality of action execution [40]. The value is generated using the utility function for each leaf of the decision tree, by simulating the ordered set of actions associated with the selected hyper-arc while varying action parameters or assigned agents.

In general terms, a utility function to determine the best course of action for a cooperation model can be defined as the weighted sum of several robot-centred or human-centred metrics, e.g., the closest simulated distance to obstacles, the maximum joint accelerations, the maximum velocities, or the overall execution time, which can be used to evaluate the overall execution quality. In our work, we have opted to consider only the execution time. Hence, given a branch b of the decision tree, the utility function J(b) is defined so as to maximise the inverse of the total execution time associated with the branch b, namely:

J(b) = ε(b) / Σ_{i=1}^{K} t_i, (21)

where K is the number of actions of the selected hyper-arc or branch, t_i is the execution time of the i-th action in the branch b, and the unit function ε(b) equals one if all actions in b are executed successfully, and zero otherwise. A simulation is successful if the agent can reach its given goal in a defined time interval. We ground the parameters of the optimal state and assign the actions to an agent based on the branch with the maximum utility value.
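A minimal sketch of this utility (function and parameter names are ours): the score of a branch is the inverse of its total simulated execution time, gated by the success indicator ε(b):

```python
def utility(action_times, all_succeeded):
    """action_times: simulated durations t_i of the K actions in branch b;
    all_succeeded: True iff every simulated action reached its goal."""
    eps = 1.0 if all_succeeded else 0.0   # the unit function eps(b)
    return eps / sum(action_times)        # J(b) = eps(b) / sum_i t_i
```

A branch whose two actions take one second each scores 0.5, while any failed simulation zeroes the branch, which is how branches that cannot be executed are discarded.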
If the utility value of all the branches in the decision tree becomes zero, the Task Manager sets the optimal state as infeasible and attempts a new optimal state. With this method, the Task Manager proactively prevents the cooperation from failing and increases the overall robustness of the architecture. Action success or failure and, in case of success, the time it takes to complete an action, are determined by the Simulator module, which mimics the robot kinematic behaviour using the currently available knowledge of the robot state and its workspace. If disturbances are limited and the robot model is known with sufficient precision, the likelihood of having consistent results in simulation and with the real robot is increased. One may decide to simulate the robot dynamics as well, but in our experience this option, depending on the given task, may not provide more detailed information to compute the utility value, because of uncertainties in the interaction with the environment.

As an example, let us consider a table assembly task where a tabletop T must be connected to either leg A or leg B, as shown in Figure 2. The example will be further explored in Section VI. The task can be performed by a human operator H or by either robot R1 or R2, interchangeably. When the Task Manager receives the set of feasible hyper-arcs and their associated costs from the Task Representation module, it generates the corresponding Action-State table. Each row of the Action-State table shows the sequence of atomic actions to be carried out in the given state or hyper-arc. In the example shown in the Figure, the cooperation cost of hyper-arc h1 is the lowest (with a value of 5) among the three feasible hyper-arcs h1, h2, and h3. Therefore, the Task Manager selects h1 as the optimal state, and generates the corresponding decision tree (on the right hand side of the Figure). Using the information stored in the knowledge base, the predicate TABLETOP(?X) can be grounded to T only, and LEG(?Y) can be grounded to one among A or B, whereas AGENT(?Z) can be assigned to either R1 or R2. In this case, the decision tree ramifies into four branches in total. The Task Manager computes the utility value by simulating all the actions in each branch, and the branch with maximum utility, i.e., the second branch with J = 2.5, is selected to ground the parameters (leg B and tabletop T) and the responsible agent (robot R1) of the set of actions associated with the given hyper-arc. Grounding the literals online according to the perceived information of the workspace allows the architecture to adapt to task-level variations, therefore enforcing flexibility. If, during an actual cooperation process, a human operator attempts to achieve a different feasible state, e.g., h2, then the Task Manager adapts to such a decision and grounds the predicates accordingly. Moreover, if the robot cannot perform a given action in a certain amount of time despite a successful simulation, for instance TRANSPORT(LEG(B), TABLETOP(T), AGENTS(R1)) in h1, then the Task Manager stops the robot from executing its current task, sets the optimal hyper-arc to follow as infeasible, and finds a new optimal hyper-arc, i.e., h3, among the available ones, making the system more robust to failures and to environment uncertainties. However, if the execution of all the hyper-arcs fails, the collaboration as a whole fails.

Fig. 2: An example of the interconnections among feasible states or hyper-arcs, the Action-State table, and the decision tree. LG stands for LEG, TT stands for TABLETOP, MP stands for MIDDLEPOSE (in front of the human operator), which are action parameters, whereas AG stands for the set of AGENTs.

It is noteworthy that a discussion of formal properties could be relevant for (i) the traversal procedure associated with the AND/OR graph, and (ii) the decision tree. In the first case, formal properties of AND/OR graphs when used as context-free semi-Thue systems, as in our case, have been demonstrated by Nguyen and Szalas [57], including succinctness, correctness, and completeness. In the second case, the decision tree is generated to model different assignments to variables, and as such is akin to brute force, although encoded in a compact representation.

B. Behaviour of the Task Manager
The Task Manager is organised in two phases, respectively offline and online. Offline, the Task Manager initialises the list of agents participating in the collaboration scenario, all action descriptions, and the possible robots or humans who can perform each action. An action may be executed by different agents, individually or jointly. However, action assignment is done online, and the skill necessary to perform each robot action is instructed in the Robot Execution Manager or the Controller. Then, the Task Manager loads the set of action sequences for all states or hyper-arcs involved in the cooperation process. Using such information, we create the necessary data structures for the online execution.

Figure 3 shows a flowchart associated with the online phase. The phase starts with an empty query from the
Task Representation module. When the response to such a query is available, together with the set of feasible states or state transitions, the Task Manager first checks whether the cooperation graph is successfully solved. Otherwise, it generates the Action-State table as described above and checks for a met state or a solved state transition in the Check state execution function. Afterwards, among all feasible states, it finds the optimal state using the function Find optimal state, and checks if the actions in such a state are grounded or assigned, as well as whether the robot can successfully execute the actions in the simulation. In order to ground the optimal state action parameters and to assign actions to agents, the Task Manager first generates the decision tree in Generate decision tree, then it simulates all the actions associated with its branches using a breadth-first search algorithm, and then it computes the utility value for all the branches in Evaluate decision tree. Finally, using the function Update optimal state, it checks for the maximum utility value, grounds all action parameters, and assigns the actions to the agents in the optimal state. If the maximum utility value is zero, the function Update optimal state sets the current optimal state as infeasible. In this case, Find optimal state selects another state with minimum cost among the others available in the Action-State table, and again loops between simulation and decision tree evaluation. Eventually, Find next action finds the first action in the grounded or evaluated optimal state that is not done yet, and Find responsible agent is called to assign the command to an agent.

When the acknowledgement of an action execution is available (either successful or failed), the Update Action-State table function updates the representation in the Action-State table. If a human operator performs an action that was not suggested, then the Task Manager commands the robot to terminate its current action and to go to its resting configuration. Finally, if the Action-State table does not hold any feasible states or state transitions, the cooperation is failed.

VI. EXPERIMENTAL EVALUATION
A. Implementation and Experiments

FlexHRC+ has been validated using a dual-arm Baxter manipulator, equipped with an additional RGB-D device located on the robot's head and pointing downward to a table where objects to manipulate are located. In order to gather information about the motion of human operators, we use an LG G Watch R (W110) smartwatch to acquire inertial data from their right wrist. Data are routed via a Bluetooth connection to an LG G3 smartphone, and then to a workstation through standard WiFi. The workstation has an Intel Core i7-4790 3.60 GHz × Task Representation module and to compare the benefits of the hierarchical AND/OR graph structure in real-world scenarios; (iii) cooperative assembly operations to show the flexibility, the proactive decision making, and the reactive adaptation capabilities of FlexHRC+. The AND/OR graph representation is accompanied by an open source implementation.

B. Performance Evaluation of the Task Representation Module
In order to carry out a performance evaluation of standard, FOL-based, and hierarchical AND/OR graph representations, let us consider the problem of assembling a table with a different number of legs, whereby we gradually increase the number of table legs to be assembled from one to nine. This implicitly makes the graph more complex, due to the increasing number of required hyper-arcs. Figure 5 shows the hierarchical (on the left) and the standard version of the AND/OR graph (on the right) for a table assembly with only two legs. The task is obviously unrealistic, but it can be used to highlight the difference between the single-layer and the hierarchical representations, as well as between standard and FOL-based models. The hierarchical representation is modelled by a FOL-based representation. For this task, we use a two-layer hierarchical AND/OR graph.

In order to connect a leg to the tabletop according to Figure 5 in the middle, there are four cooperation paths that can be followed: (i) the robot connects the leg and the tabletop directly (blue path in the Figure, cost set to 1); (ii) the human operator connects the leg and the tabletop directly (red path, cost equal to 3); (iii) the robot places the leg in a new position in the workspace, and later connects the leg to the tabletop (black path, cost set to 2); (iv) when the leg is in a new position, the human operator connects it to the tabletop (green path, cost equal to 3). In the third case, the reason for introducing a temporary new position for the table leg is that, when a dual-arm robot is used, there might be situations whereby one arm can reach the initial leg position, but only the other one can move it to its final position. Although the costs of the red and green paths are equal (both of them are 3), if the robot can perform hyper-arc h1, it follows the green path. In such a path, the first feasible hyper-arc h1 is common to the black and green paths. Therefore, solving this hyper-arc can lead the cooperation to move to the black path, which is characterised by a lower cost (equal to 2), as well as to the green path. Before solving the Leg middle pose node, the robot cannot ascertain whether it can perform hyper-arc h2 or not, and therefore the green path is followed if possible. While the cooperation process unfolds, human operators can switch the cooperation to the red path if they deem it fit.

In order to compare standard and FOL-based AND/OR graph representations, let us label the identical legs with A and B. Even if the objects are identical, their locations in the robot workspace are different. Therefore, during execution, the predicate LEG(?X) should be grounded to either A or B, so that the leg's location is known to the robot when issuing motion commands. However, at the representation level, since the legs are identical, the one actually chosen is irrelevant. Figure 5 on the right hand side shows abstractly all the possible ways to place the two legs on the tabletop when they are grounded in the Task Representation module. For the sake of clarity, a simplified AND/OR graph is shown in this Figure. In the experiments, all the details (similar to what is depicted in Figure 5 in the middle) are modelled in the same layer.

Figure 6 depicts the average computational time (in a logarithmic scale) for the table assembly task in the offline (on the top in the Figure) and online (bottom) phases. For each number of considered legs, we have performed the task ten times and we report average timings. In this experiment, the selection of the cooperation path is done randomly to avoid bias. As it can be noted, the single-layer AND/OR graph representations, i.e., the standard and the FOL-based graphs, seem to be characterised by an exponential complexity in the number of legs (both offline and online), whereas in the hierarchical representation the computational time grows linearly.

Web: https://github.com/TheEngineRoom-UniGe/ANDOR.
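The exponential-versus-linear trend can be illustrated with a toy count. The model below is our own simplification, not the exact graph sizes measured in the experiments: a grounded single-layer graph needs one cooperation state per subset of already-placed legs, while the hierarchical one re-uses a single per-leg sub-graph:

```python
def flat_states(n_legs):
    # grounded single-layer graph: one cooperation state per subset of
    # already-placed legs, i.e. 2^n states for n interchangeable legs
    return 2 ** n_legs

def hierarchical_states(n_legs, sub_states=4):
    # hierarchical graph: one re-used leg sub-graph (sub_states states)
    # per leg, plus one top-layer state per leg and a root node
    return n_legs * sub_states + n_legs + 1
```

With nine legs the toy flat graph already has 512 states against a few dozen in the hierarchical one, mirroring the diverging curves of Figure 6.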
In the FOL-based, single-layer representation, the average computational time increases from (2. ± .) × − s (one leg) to (1. ± .) × s (nine legs) for the offline phase, and from (3. ± .) × − s to (1. ± .) × s for the online phase. In the FOL-based hierarchical case, the average computational time for the offline phase increases from (2. ± .) × − s (one leg) to (7. ± .) × − s (nine legs), and for the online phase it rises from (4. ± .) × − s to (1. ± .) × − s. As demonstrated in Figure 6, the FOL-based representation outperforms the standard AND/OR graph in terms of computational time for both the offline and online phases. The offline phase takes (3. ± .) × s to solve the standard AND/OR graph with five legs, and
Fig. 3: A flowchart depicting the Task Manager online phase.
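The online phase depicted in Figure 3 can be read as a control loop: find the optimal state, check that it is grounded and feasible, dispatch the next action to the responsible agent (a command to the robot, a suggestion to the human), and update the action-state table with what actually happened. The following is a heavily simplified, runnable sketch; all class and function names are placeholders, not the FlexHRC+ implementation.

```python
# Hedged sketch of the Task Manager online loop of Figure 3.
class CooperationState:
    def __init__(self, name, agent, grounded=True, feasible=True):
        self.name, self.agent = name, agent
        self.grounded, self.feasible = grounded, feasible

def online_phase(states, perform):
    """Walk the action-state table until the cooperation is solved.

    `states` stands in for the optimal states found at each cycle;
    `perform` executes/observes one action (robot command or recognised
    human action) and returns True on success.
    """
    log = []
    for state in states:                 # "find optimal state" each cycle
        if not (state.grounded and state.feasible):
            return "failure", log        # no agent can progress the task
        log.append((state.agent, state.name))
        if not perform(state):           # command robot / suggest to human
            return "failure", log
    return "success", log                # all states solved: task complete

states = [CooperationState("place tabletop", "robot"),
          CooperationState("fix leg", "human"),
          CooperationState("check connections", "human")]
outcome, log = online_phase(states, perform=lambda s: True)
print(outcome)
```

In the actual architecture the optimal state is re-computed after every observed action, which is how the robot switches cooperation paths when the human operator intervenes.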
Fig. 4: A formalisation of an IKEA kitchen assembly task representation using a hierarchical AND/OR graph with five layers, according to the products' documentation. On the left: the first layer of the AND/OR graph showing high-level kitchen assembly instructions. In the middle: the second layer, related to the assembly of the wall cabinet with small or large sizes. On the right: the third layer, with the connection of cam-lock screws and support brackets to the side plates of the cabinets (images courtesy of IKEA).

(2. ± .) × − s for the equivalent FOL-based graph. For the online phase, the computational time for the standard AND/OR graph is (3. ± .) × s, whereas for the FOL-based AND/OR graph it is (2. ± .) × − s, i.e., a high advantage is given in terms of average time. The computational times in the single leg case are similar for the FOL-based and standard AND/OR graphs.

It can be observed, as also shown in Figure 5, that in principle the hierarchical AND/OR graph representation should allow designers to easily represent a table assembly task with a varying number of legs without the need for detailing which specific leg a robot or a human operator should connect to the tabletop. This is characterised by an obvious advantage in terms of representation scalability for increasingly more
Fig. 5: An AND/OR graph modelling a table assembly task with two legs. On the left: the high-level AND/OR graph for table assembly. In the middle: the low-level AND/OR graph connecting one leg to a tabletop. On the right: the standard AND/OR graph with two identical legs A and B. For the sake of clarity, not all the hyper-arcs are shown, i.e., the red dashed hyper-arcs are simplified.

Fig. 6: The mean computational time in logarithmic scale to solve a single-layer standard (dotted red line), FOL-based (dashed blue line), and hierarchical (solid black line) AND/OR graph for a table assembly task with a varying number of legs: on the top, the offline phase; on the bottom, the online phase.

complex tasks. In practice, the designer of this particular cooperation process has specified nodes and hyper-arcs to model the table assembly with nine legs employing a hierarchical representation, whereas in the case of a FOL-based representation, nodes and hyper-arcs have been identified. Similarly, to compare standard and FOL-based AND/OR graphs, the table assembly with five legs needs and nodes, and and hyper-arcs, respectively. Clearly, it is not doable in practice to model the assembly of a table with a high number of legs in the case of a standard AND/OR graph because of the modelling complexity. Therefore, we can reasonably argue that the adoption of a FOL-based and hierarchical AND/OR graph facilitates to a great extent the modelling process. Obviously, such an evaluation should be carried out by means of a methodological and experimental study, which for its extensive nature is out of scope here.

C. An IKEA Kitchen Assembly Scenario
Figure 4 shows the hierarchical AND/OR graph task representation realisation for an IKEA kitchen composed of two wall cabinets (small and large), one wide wall cabinet, and a base cabinet with a sink and a faucet. The AND/OR graph is implemented according to the assembly documentation of the products provided on the IKEA website [58]. In this experiment, we benefit from a hierarchical AND/OR graph with five layers; however, only three layers are shown in Figure 4. The interested reader can find the complete AND/OR model in the GitHub repository associated with the developed software architecture. The kitchen assembly is comprised of pieces and connectors (such as screws and nuts, but excluding the nails) in total, assembled by distinct AND/OR graphs used in different layers, spawning nodes and hyper-arcs online. Exploiting the hierarchical representation, the designer has defined a lower number of nodes and hyper-arcs in the modelling process, i.e., nodes and hyper-arcs. Over ten trials, and by solving hyper-arcs and nodes randomly, the offline and online average computational times are (5. ± .) × − s and (1. ± .) × − s, respectively.

While designing the associated AND/OR graph, we assume that the agents assembling the kitchen can perform the following actions, namely: keeping a piece, approaching a desired pose, transporting an object to a desired pose, fitting a piece to another piece, screwing using bolts and nuts, hammering nails, following a trajectory with a contact force, grasping, and ungrasping. Moreover, the agents should perceive their own actions as well as actions by other agents, and recognise the pieces and their features, such as their size.

The first layer of the AND/OR graph, as represented in Figure 4 on the left hand side, decomposes the kitchen assembly task into assembly scenarios for different cabinets.
The assembly of each cabinet is modelled in more detail in lower-level AND/OR graphs, therefore increasing the modularity and scalability at the representation level. The three cabinets to be attached to the wall are simpler than the base cabinet, therefore their assemblies are modelled by two additional lower-level layers, whereas the base cabinet assembly (including faucet, sink, strainer, and a drawer) is more complex, and therefore it has been modelled by four extra layers. In this scenario, the large and small wall cabinets (hyper-arcs h1 and h2 of the first-layer AND/OR graph) are similar in shape and differ in size. Owing to the FOL-based AND/OR graph representation, we can model the assembly of the two cabinets using a unique AND/OR graph. As shown in the middle of Figure 4, related to the wall cabinet assembly process, hyper-arcs h1, h2, h3, and h4 model the connection of different screws, support brackets, or dowels to the cabinet's wooden plank parts. Hyper-arcs h5 and h6 model the connection of different planks using cam-lock nuts for constructing the structure of the cabinet. Hyper-arc h8 models hammering the backplate of the cabinet to the cabinet structure using nails. Later, the assembly proceeds with fixing the hinges to the structure and front door, attaching the structure to the wall, and connecting the front door to the structure. Finally, the right hand side of Figure 4 details the hyper-arcs h1 and h2 of the cabinet assembly. It models the connection of the cam-lock screws and bracket supports to the cabinet side wooden planks. Accordingly, the Task Manager specifies the actions to carry out in order to execute hyper-arcs.

The IKEA kitchen assembly scenario demonstrates the scalability feature of AND/OR graphs with several layers, i.e., the possibility to define layers of different semantics, and to exploit the same lower-level assembly scenario in several higher-level assembly processes.
Although this representation model is here presented only as it is, i.e., we do not show a real robot actually collaborating with a human operator in the process, nonetheless we believe it is a good example of a real-world, yet complex, assembly process, which could be an excellent use case.
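The reuse of a lower-level graph by several higher-level hyper-arcs, as with the cam-lock subgraph shared by the small and large wall cabinets, can be illustrated with a minimal sketch. The classes, node names, and resulting counts below are invented for illustration; they are not the paper's data model.

```python
# Illustrative sketch: a hyper-arc in one layer may point to a whole
# lower-level AND/OR graph, so the same sub-assembly graph is reused by
# several parents. A referenced subgraph is expanded (and counted) at each
# reference site, mirroring how the online phase spawns nodes and
# hyper-arcs from a much smaller designed model.
class AndOrGraph:
    def __init__(self, name, nodes, hyper_arcs):
        self.name = name
        self.nodes = nodes
        # hyper_arcs: {arc_name: lower-level AndOrGraph, or None if the
        # arc is primitive (directly executable)}
        self.hyper_arcs = hyper_arcs

    def count(self):
        """Total (nodes, hyper-arcs) once all layers are expanded."""
        n, h = len(self.nodes), len(self.hyper_arcs)
        for sub in self.hyper_arcs.values():
            if sub is not None:
                sn, sh = sub.count()
                n, h = n + sn, h + sh
        return n, h

# Third layer, reused by both the small and the large wall cabinet:
camlock = AndOrGraph("cam-lock & bracket", ["screw in", "bracket on"],
                     {"h1": None, "h2": None})
# Second layer: the wall cabinet assembly references it twice:
cabinet = AndOrGraph("wall cabinet", ["sides ready", "structure", "done"],
                     {"h1": camlock, "h2": camlock, "h5": None})
print(cabinet.count())
```

The gap between the designed size (one `camlock` graph) and the expanded size returned by `count()` is the scalability advantage the text attributes to the hierarchical representation.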
D. A Human-Robot Cooperative Table Assembly Task
In a third set of tests, a number of human-robot cooperative table assembly tasks have been carried out using tables different in terms of tabletop sizes and leg lengths, without any change at the task representation level.

These tests demonstrate FlexHRC+ capabilities at two levels. On the one hand, at the team level, the human operator and the robot perform the same table assembly task using different cooperation paths online, including no cooperation at all, i.e., the operator or the robot manage the assembly process on their own. On the other hand, at the task level, the human operator and the robot can cooperate on similar tasks, such as the assembly of tables with different physical properties, without the need for ad hoc representations. In particular, we have used two rectangular tabletops, two sets of four legs, and four customised 3D printed skirts to guide the legs when placing them onto the screws for precision compensation, and also to fix the legs to the tabletop, all of them shown in Figure 7. Hence, four different types of tables can be assembled, which are not a priori known to the human operator nor the robot.

Fig. 7: The image shows the two different tabletops (left), the two sets of legs (bottom-right), and the four 3D printed skirts (top-right).
Initially, the legs and the tabletop are randomly located in the shared human-robot workspace.

Modelled human operator actions include pick up (the human operator picks up one of the legs for manipulation purposes), put down (the manipulated object is put on the table in front of the robot), and screw (the human operator fixes the leg to the tabletop using a rotation movement), whereas robot actions include approach (the robot approaches the grasping pose of an object with one of its end-effectors), transport (the robot moves an object to a desired goal position on the table), screw (similar to the one to be executed by human operators), unscrew (the robot counter-rotates a leg with respect to the screw action), grasp (the robot closes one of its grippers after approaching an object), ungrasp (the robot opens a gripper), and update workspace (the robot updates its internal representation of the workspace using perception modules).

We model the cooperative table assembly task with four legs using a single-layer FOL-based AND/OR graph similar to the one shown in Figure 5. The AND/OR graph representation encompasses all the possibilities, ranging from the case whereby either the human operator or the robot assemble the whole table on their own, to various cases in which the operator and the robot cooperate.

The table assembly task has been performed eight times, as presented in the accompanying video. Results are summarised in Table I. The Table shows that the overall execution of Task Manager, Task Representation, and
Simulator require less than of the complete table assembly task, with a low standard deviation. The more time consuming portions

Web: https://youtu.be/CEIARyW422o.
TABLE I: Computational performance of FlexHRC+ modules for the table assembly task.

Computation         | avg time [s] | avg time [%] | std [s]
Task Manager        |              |              |
Task Representation |              |              |
Simulator           |              |              |
Robot actions       |              |              |
Human actions       |              |              |
Total               |              |              |
Fig. 8: The sequence of actions associated with tabletop placement by the robot.

of the cooperation are the actions performed by the human operator or the robot. The high standard deviation associated with experiment lengths is mainly due to human decisions taken online, and to the significant difference between the human and robot speeds in performing the actions. As a reference, the maximum robot joint angular velocity, and end-effector angular and linear velocities are . rad/s, . rad/s, and . m/s, respectively. To avoid repetition in showing the results for different parts of the experiments, we divide the cooperative table assembly task in three segments: in the first (tabletop placement), the robot places the tabletop in the workspace; the second segment (legs fix) is related to all operations to fix the legs to the tabletop; in the last segment (check) the human operator checks all connections.

Figure 8 shows an example of tabletop placement. In our experiments, this sequence is always carried out by the robot: the robot's left and right arms approach the tabletop (a-b), grasp it (c), change its orientation and place it on the table horizontally (d-e), and finally return to the resting pose (f).

Figures 9, 10, 11, and 12 show different situations whereby the human operator and the robot perform a legs fix cooperatively. They are free to choose which leg to pick up for manipulation, whereas the order of the bolts on the tabletop is a priori defined in our scenario. Figure 9 shows the case in which the robot connects all the legs to the tabletop. At the beginning, the robot can choose which one of the four legs to pick up, as well as which arm to use to do it. As shown in the previous Section, this is modelled using FOL-based predicates whose literals must be anchored to actual percepts.
To do so, the robot evaluates the utility function for all eight options (i.e., all the couples of legs and arms) using the Simulator module. In this specific example, the second leg from the right and the right arm are selected (a-b, robot point of view). The human operator decides not to intervene in the assembly process, and the robot performs all the assembly operations autonomously. Therefore, the robot follows the blue cooperation path shown in Figure 5, and fixes all the legs to the tabletop. Figure 10 shows a case in which the human operator decides to connect all the legs to the tabletop. The robot tries, by default, to follow the minimum cost cooperation path (in blue in Figure 5), but as soon as the pick up action by the human operator is detected, the robot follows the corresponding cooperation path in red. Figure 11 shows the case in which the human operator and the robot cooperate to connect all the legs to the tabletop. In particular, in (a-c) it is shown how the robot fixes the rightmost leg to the tabletop. Later, the robot decides to connect another leg with the left arm, but the human operator intervenes and performs the fix operation on the second tabletop bolt (d-f). The robot adapts to the human decision when the human action is recognised. It updates its representation of the workspace via the Object and Scene Perception module, it updates the AND/OR graph, and determines the new set of feasible states. Then, the robot connects the third leg to the tabletop, and finally the human decides to fix the last leg (g-l). Figure 12 shows how the robot adapts to human decisions, and how it can proactively request the human to perform a certain task which it cannot perform. In (a-e) the robot connects the first two legs to the tabletop. Afterwards, the human operator decides to connect the third leg to the tabletop (f-h). Finally, in (i-l) a situation is depicted whereby the robot is not capable of performing a given task, and therefore it asks the human operator to connect the last leg to the tabletop. By following the green cooperation path in Figure 5, first the robot transports the leg in front of the human operator, and finally the operator connects it to the tabletop.

Finally, Figure 13 shows the human operator controlling all leg connections to the tabletops at the end of the task.

It is noteworthy that in Figures 9, 10, and 12, the human operator and the robot assemble the large tabletop with the long legs, while in Figure 11 they assemble the small table with the short legs. These examples show the ability of FlexHRC+ to handle different real-world objects, i.e., anchoring different percepts to the same symbol-mediated predicates in the FOL-based representation. As a consequence, we can consider the perception layer and the representation layer as decoupled, i.e., changes in the robot workspace need not be represented.
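The simulation-based option selection described above can be sketched as scoring every (leg, arm) couple with a rollout and keeping the best one. In this hypothetical sketch the utility is simply the negative reach distance on a one-dimensional workspace in centimetres; the actual Simulator module evaluates full robot motions, and all names and values below are invented for illustration.

```python
from itertools import product

def simulate_utility(leg_pos, arm_base):
    """Toy rollout: prefer the couple with the shortest reach (in cm)."""
    return -abs(leg_pos - arm_base)

def select_option(leg_positions, arm_bases):
    """Evaluate every (leg, arm) couple and return the highest-utility one."""
    options = product(leg_positions.items(), arm_bases.items())
    best = max(options,
               key=lambda o: simulate_utility(o[0][1], o[1][1]))
    (leg, _), (arm, _) = best
    return leg, arm

# Four legs and two arms give the eight options mentioned in the text.
legs = {"leg1": -40, "leg2": -10, "leg3": 25, "leg4": 50}
arms = {"left": -30, "right": 30}
print(select_option(legs, arms))
```

Because the scoring happens in the loop, the same mechanism lets the robot re-evaluate its options after every recognised human action.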
E. Discussion
As posited in the Introduction, FlexHRC+ targets three of the identified requirements for HRC, namely R , which is related to the trade-off between flexibility and actions aimed at meeting the cooperation objectives, R , whereby the robot should exhibit flexible decision making capabilities, and R , the capability of doing so exploiting hierarchical task representation structures.

Fig. 9: The robot connects all the legs to the tabletop.

Fig. 10: The human operator connects all the legs to the tabletop; the robot adapts to human decisions online.

Fig. 11: The human operator and the robot connect the legs to the tabletop cooperatively; the robot adapts in any case to human decisions.

Fig. 12: The human operator and the robot connect the legs to the tabletop cooperatively; the robot adapts to human decisions, and proactively asks the human to perform a certain task.

Fig. 13: A sequence of check actions the human operator carries out to verify all connections between the legs and the tabletop.

In the first case, it is posited that a control architecture for collaborative robots should not only provide the necessary autonomy for general-purpose action planning and execution, but should also allow the collaborative robot to adapt to the behaviour of human operators. Such an adaptation should be limited to the available plans reaching the cooperation goal. FlexHRC+ copes with R by equipping the robot with the ability of autonomously scheduling the most appropriate actions to carry out, while exploiting perception and action modules to adapt to human operators. The hierarchical, FOL-based AND/OR graph representation is capable of switching among possible, intrinsically feasible plans, following the currently optimal one among them. This is achieved also by means of an in-the-loop, simulation-based approach to anchor internal symbols to actual perceptions. The results of the simulation include the actions to be performed, as well as the upcoming end-effector trajectories. Although human-robot communication is not the focus of the paper, it is noteworthy that current work is devoted to the use of Augmented Reality technology to intuitively communicate upcoming robot actions to human operators [59], [60].

In the second case, FlexHRC+ enables some level of flexibility as far as the representation of the cooperative task is concerned, as required by R . The framework separates low-level perception activities from the high-level structure of the cooperation task. At the task and architectural levels, the robustness to noisy or inaccurate perception is mainly related to what happens in the Knowledge Base module. Therein, two assumptions are made, i.e., closed-world and continuity, which imply that any change in the robot workspace is the consequence of actions performed by the human operator or the robot. In the current version of FlexHRC+, all relevant information for robot behaviour is maintained in the knowledge base. If newly perceived information is not compatible with stored information, then the knowledge base enters an inconsistent state, and waits for new, compatible information. In previous work [30], we have shown how to deal with inconsistent states in a knowledge base, by using the notion of normative knowledge (i.e., information expected to hold at a given time instant) to plan for a series of robot actions aimed at solving inconsistencies. Such an approach is not integrated with FlexHRC+ yet.

As far as R is concerned, a hierarchical representation is a key enabling factor to model complex scenarios adopting a bottom-up approach and achieving lower reasoning times. In the Task Representation module, the use of hierarchical, FOL-based AND/OR graphs simplifies the definition of qualitatively similar tasks, which can then be used to model different phases of the cooperation, and it greatly increases the computational efficiency in graph traversal, which is of the utmost importance to allow collaborative robots to be reactive and adaptive to plan changes. This allows for building a library of simpler tasks, which can be composed together to create composite tasks [61]. In this respect, current work is devoted to integrating in FlexHRC+ the capability of learning from human demonstrations and through interactions with the environment [62].

VII. CONCLUSIONS

In this paper, we have introduced a human-robot cooperation framework enabling a greater flexibility and scalability in assembly tasks. In terms of flexibility, two levels are involved. At the task level, the robot deals with the differences in the objects being manipulated, without the need for explicitly encoding such differences in the cooperation model. At the cooperation level, different modules in the framework allow the human operator or the robot to choose the degree of cooperation as the task progresses. As far as scalability is concerned, it is possible to design complex cooperation scenarios building upon simpler ones. To this aim, a First Order Logic hierarchical task representation has been developed and integrated within the framework. The introduced FOL-based hierarchical AND/OR graph structure makes task representation efficient enough to perform highly complex scenarios, while at the same time simplifying the design phase and ensuring explainable robot behaviour. However, a designer is still required to manually build the graph. This implies a deep knowledge of the assembly task as well as of the involved robot capabilities. Obviously enough, such a process is time-consuming, and it is prone to modelling errors or inaccuracies.

In order to cope with these shortcomings, as mentioned in the Introduction, a must-have feature for next generation collaborative robots would be the capability of learning the task structure both at the task and the action level [63], [64], [15], [65]. While at the task level a robot may learn the sequence of discrete actions [64], [66], at the action level it should learn how to control its movements to reach a given goal [15], [67]. We plan to tackle such learning aspects in a future development of FlexHRC+.

REFERENCES
Journal of Manufacturing Systems, vol. 39, pp. 79–100, 2016.
[3] S. Kock, T. Vittor, B. Matthias, H. Jerregard, M. Källman, I. Lundberg, R. Mellander, and M. Hedelind, "Robot concept for scalable, flexible assembly automation: A technology study on a harmless dual-armed robot," in Proceedings of the 2011 IEEE International Symposium on Assembly and Manufacturing (ISAM).
Robotics and Computer-Integrated Manufacturing, vol. 40, pp. 1–13, 2016.
[7] J. A. Adams, "Human-robot interaction design: Understanding user needs and requirements," in Proceedings of the 2005 Annual Meeting of the Human Factors and Ergonomics Society (HFES), vol. 49, no. 3, Orlando, FL, USA, September 2005.
[8] A. Steinfeld, T. Fong, D. Kaber, M. Lewis, J. Scholtz, A. Schultz, and M. Goodrich, "Common metrics for human-robot interaction," in Proceedings of the 2006 ACM SIGCHI/SIGART Conference on Human-Robot Interaction (HRI), Salt Lake City, Utah, USA, March 2006.
[9] D. Bortot, M. Born, and K. Bengler, "Directly or on detours? How should industrial robots approximate humans?" in Proceedings of the 2013 ACM/IEEE International Conference on Human-Robot Interaction (HRI), Tokyo, Japan, March 2013.
[10] A. De Santis, B. Siciliano, A. De Luca, and A. Bicchi, "An atlas of physical human-robot interaction," Mechanism and Machine Theory, vol. 43, no. 3, pp. 253–270, 2008.
[11] P. A. Lasota, T. Fong, and J. A. Shah, "A survey of methods for safe human-robot interaction," Foundations and Trends® in Robotics, vol. 5, no. 4, pp. 261–349, 2017.
[12] M. A. Goodrich and A. C. Schultz, "Human-robot interaction: A survey," Foundations and Trends® in Human–Computer Interaction, vol. 1, no. 3, pp. 203–275, 2007.
[13] F. Ferland, D. Létourneau, A. Aumont, J. Frémy, M.-A. Legault, M. Lauria, and F. Michaud, "Natural interaction design of a humanoid robot," Journal of Human-Robot Interaction, vol. 1, no. 2, pp. 118–134, 2013.
[14] A. Valli, "The design of natural interaction," Multimedia Tools and Applications, vol. 38, no. 3, pp. 295–305, 2008.
[15] B. D. Argall, S. Chernova, M. Veloso, and B. Browning, "A survey of robot learning from demonstration," Robotics and Autonomous Systems, vol. 57, no. 5, pp. 469–483, 2009.
[16] K. Darvish, F. Wanderlingh, B. Bruno, E. Simetti, F. Mastrogiovanni, and G. Casalino, "Flexible human-robot cooperation models for assisted shop-floor tasks," Mechatronics, vol. 51, pp. 97–114, 2018.
[17] K. Darvish, B. Bruno, E. Simetti, F. Mastrogiovanni, and G. Casalino, "Interleaved online task planning, simulation, task allocation and motion control for flexible human-robot cooperation," in Proceedings of the 2018 IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), Nanjing and Tai'an, China, August 2018.
[18] P. García, P. Caamaño, R. J. Duro, and F. Bellas, "Scalable task assignment for heterogeneous multi-robot teams," International Journal of Advanced Robotic Systems, vol. 10, no. 2, p. 105, 2013.
[19] J. Shah, J. Wiken, B. Williams, and C. Breazeal, "Improved human-robot team performance using Chaski, a human-inspired plan execution system," in Proceedings of the 2011 ACM/IEEE International Conference on Human-Robot Interaction (HRI), Switzerland, March 2011.
[20] S. Lemaignan, M. Warnier, E. A. Sisbot, A. Clodic, and R. Alami, "Artificial cognition for social human-robot interaction: An implementation," Artificial Intelligence, vol. 247, pp. 45–69, 2017.
[21] J. W. Crandall, M. Oudah, T. Chenlinangjia, F. Ishowo-Oloko, S. Abdallah, J. Bonnefon, M. Cebrián, A. Shariff, M. A. Goodrich, and I. Rahwan, "Cooperating with machines," Nature Communications, vol. 9, no. 1, p. 233, 2018.
[22] L. Johannsmeier and S. Haddadin, "A hierarchical human-robot interaction-planning framework for task allocation in collaborative industrial assembly processes," IEEE Robotics and Automation Letters, vol. 2, no. 1, pp. 41–48, January 2017.
[23] E. Lamon, A. De Franco, L. Peternel, and A. Ajoudani, "A capability-aware role allocation approach to industrial assembly tasks," IEEE Robotics and Automation Letters, vol. 4, no. 4, pp. 3378–3385, 2019.
[24] S. J. Levine and B. C. Williams, "Concurrent plan recognition and execution for human-robot teams," in Proceedings of the 2014 International Conference on Automated Planning and Scheduling (ICAPS), Portsmouth, NH, USA, June 2014.
[25] K. P. Hawkins, S. Bansal, N. N. Vo, and A. F. Bobick, "Anticipating human actions for collaboration in the presence of task and sensor uncertainty," in Proceedings of the 2014 IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, China, May 2014.
[26] S. Nikolaidis, D. Hsu, and S. Srinivasa, "Human-robot mutual adaptation in collaborative tasks: Models and experiments," The International Journal of Robotics Research, vol. 36, no. 5-7, pp. 618–634, 2017.
[27] M. Toussaint, T. Munzer, Y. Mollard, L. Y. Wu, N. A. Vien, and M. Lopes, "Relational activity processes for modeling concurrent cooperation," in Proceedings of the 2016 IEEE International Conference on Robotics and Automation (ICRA), Stockholm, Sweden, May 2016.
[28] D. Claes and K. Tuyls, "Human robot-team interaction," in Proceedings of the 2013 International Symposium on Artificial Life and Intelligent Agents (ALIA), Bangor, Wales, UK, 2014.
[29] M. Cashmore, M. Fox, D. Long, D. Magazzeni, B. Ridder, A. Carrera, N. Palomeras, N. Hurtos, and M. Carreras, "ROSPlan: Planning in the robot operating system," in Proceedings of the 2015 International Conference on Automated Planning and Scheduling (ICAPS), Jerusalem, Israel, June 2015.
[30] A. Capitanelli, M. Maratea, F. Mastrogiovanni, and M. Vallati, "On the manipulation of articulated objects in human-robot cooperation scenarios," CoRR, vol. abs/1801.01757, 2018.
[31] S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, 3rd ed. Upper Saddle River, NJ, USA: Prentice Hall Press, 2010.
[32] R. Caccavale, J. Cacace, M. Fiore, R. Alami, and A. Finzi, "Attentional supervision of human-robot collaborative plans," in Proceedings of the 2016 IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), New York City, NY, USA, August 2016.
[33] C. Paxton, A. Hundt, F. Jonathan, K. Guerin, and G. D. Hager, "CoSTAR: Instructing collaborative robots with behavior trees and vision," IEEE, 2017, pp. 564–571.
[34] R. Wilcox, S. Nikolaidis, and J. Shah, "Optimization of temporal dynamics for adaptive human-robot interaction in assembly manufacturing," in Proceedings of the 2012 Robotics: Science and Systems, Sydney, Australia, July 2012.
[35] P. Tsarouchi, G. Michalos, S. Makris, T. Athanasatos, K. Dimoulas, and G. Chryssolouris, "On a human-robot workplace design and task allocation system," International Journal of Computer Integrated Manufacturing, vol. 30, no. 12, pp. 1272–1279, 2017.
[36] R. Caccavale and A. Finzi, "Flexible task execution and attentional regulations in human-robot interaction," IEEE Transactions on Cognitive and Developmental Systems, vol. 9, no. 1, pp. 68–79, 2017.
[37] E. Sebastiani, R. Lallement, R. Alami, and L. Iocchi, "Dealing with on-line human-robot negotiations in hierarchical agent-based task planner," in Proceedings of the 2017 International Conference on Automated Planning and Scheduling (ICAPS), Pittsburgh, PA, USA, June 2017.
[38] N. Miyata, J. Ota, T. Arai, and H. Asama, "Cooperative transport by multiple mobile robots in unknown static environments associated with real-time task assignment," IEEE Transactions on Robotics and Automation, vol. 18, no. 5, pp. 769–780, 2002.
[39] F. Chen, K. Sekiyama, F. Cannella, and T. Fukuda, "Optimal subtask allocation for human and robot collaboration within hybrid assembly system," IEEE Transactions on Automation Science and Engineering, vol. 11, no. 4, pp. 1065–1075, 2014.
[40] B. P. Gerkey and M. J. Matarić, "A formal analysis and taxonomy of task allocation in multi-robot systems," The International Journal of Robotics Research, vol. 23, no. 9, pp. 939–954, 2004.
[41] J. A. Shah, P. R. Conrad, and B. C. Williams, "Fast distributed multi-agent plan execution with dynamic task assignment and scheduling," in Proceedings of the 2009 International Conference on Automated Planning and Scheduling (ICAPS), September 2009.
[42] T. R. A. Giele, T. Mioch, M. A. Neerincx, and J.-J. C. Meyer, "Dynamic task allocation for human-robot teams," in Proceedings of the 2015 International Conference on Agents and Artificial Intelligence (ICAART), Lisbon, Portugal, January 2015.
[43] H. S. Koppula and A. Saxena, "Anticipating human activities for reactive robotic response," in Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Tokyo, Japan, November 2013.
[44] J. Mainprice, R. Hayne, and D. Berenson, "Predicting human reaching motion in collaborative tasks using inverse optimal control and iterative re-planning," in Proceedings of the 2015 IEEE International Conference on Robotics and Automation (ICRA), Seattle, WA, USA, May 2015.
[45] L. S. H. de Mello and A. C. Sanderson, "AND/OR graph representation of assembly plans," IEEE Transactions on Robotics and Automation, vol. 6, no. 2, pp. 188–199, 1990.
[46] B. Bruno, F. Mastrogiovanni, A. Sgorbissa, T. Vernazza, and R. Zaccaria, "Analysis of human behavior recognition algorithms based on acceleration data," in Proceedings of the 2013 IEEE International Conference on Robotics and Automation (ICRA), Karlsruhe, Germany, May 2013.
[47] B. Bruno, F. Mastrogiovanni, A. Saffiotti, and A. Sgorbissa, "Using fuzzy logic to enhance classification of human motion primitives," in Information Processing and Management of Uncertainty in Knowledge-Based Systems, A. Laurent, O. Strauss, B. Bouchon-Meunier, and R. R. Yager, Eds. Montpellier, France: Springer International Publishing, 2014, pp. 596–605.
[48] M. A. Fischler and R. C. Bolles, "Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography," Communications of the ACM, vol. 24, no. 6, pp. 381–395, 1981.
[49] S. Wold, K. Esbensen, and P. Geladi, "Principal component analysis," Chemometrics and Intelligent Laboratory Systems, vol. 2, no. 1-3, pp. 37–52, 1987.
[50] L. Buoncompagni and F. Mastrogiovanni, "A software architecture for object perception and semantic representation," in Proceedings of the 2015 Italian Workshop on Artificial Intelligence and Robotics (AIRO), Ferrara, Italy, September 2015.
[51] E. Simetti and G. Casalino, "A novel practical technique to integrate inequality control objectives and task transitions in priority based control," Journal of Intelligent & Robotic Systems, vol. 84, no. 1, pp. 877–902, 2016.
[52] E. Simetti, G. Casalino, F. Wanderlingh, and M. Aicardi, "Task priority control of underwater intervention systems: Theory and applications," Ocean Engineering, vol. 164, pp. 40–54, 2018.
[53] G. F. Luger, Artificial Intelligence: Structures and Strategies for Complex Problem Solving, 6th ed. Boston, MA, USA: Addison-Wesley Publishing Company, 2009.
[54] L. Buoncompagni and F. Mastrogiovanni, "Dialogue-based supervision and explanation of robot spatial beliefs: a software architecture perspective," in
Proceedings of the 2018 IEEE International Symposium onRobot and Human Interactive Communication (RO-MAN) , Nanjing andTai’an, China, August 2018.[55] E. Laber, “A randomized competitive algorithm for evaluating pricedand/or trees,”
Theorical Computer Science , vol. 1, no. 401, pp. 120–130, 2008.[56] S. Sahni, “Computationally related problems,”
SIAM Journal on Com-puting , vol. 4, no. 3, pp. 262–279, 1974.[57] L. A. Nguyen and A. Szałas, “A tableau calculus for regular grammarlogics with converse,” in
Proceedings of International Conference onAutomated Deduction
[59] Proceedings of the 2008 International Conference on Mechatronics and Machine Vision in Practice, Auckland, New Zealand, December 2008.
[60] G. Michalos, P. Karagiannis, S. Makris, Ö. Tokçalar, and G. Chryssolouris, "Augmented reality (AR) applications for supporting human-robot interactive cooperation," Procedia CIRP, vol. 41, pp. 370–375, 2016.
[61] F. Mastrogiovanni and A. Sgorbissa, "A behaviour sequencing and composition architecture based on ontologies for entertainment humanoid robots," Robotics and Autonomous Systems, vol. 61, no. 2, pp. 170–183, 2013.
[62] A. Carfì, J. Villalobos, E. Coronado, B. Bruno, and F. Mastrogiovanni, "Can human-inspired learning behaviour facilitate human-robot interaction?" International Journal of Social Robotics, to appear.
[63] A. Stopp, S. Horstmann, S. Kristensen, and F. Lohnert, "Toward interactive learning for manufacturing assistants," IEEE Transactions on Industrial Electronics, vol. 50, no. 4, pp. 705–707, August 2003.
[64] S. Ekvall and D. Kragic, "Robot learning from demonstration: A task-level planning approach," International Journal of Advanced Robotic Systems, vol. 5, no. 3, p. 33, 2008.
[65] G. Konidaris, S. Kuindersma, R. Grupen, and A. Barto, "Robot learning from demonstration by constructing skill trees," The International Journal of Robotics Research, vol. 31, no. 3, pp. 360–375, 2012.
[66] T. Munzer, M. Toussaint, and M. Lopes, "Preference learning on the execution of collaborative human-robot tasks," in Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), May 2017.
[67] P. Kormushev, S. Calinon, and D. G. Caldwell, "Robot motor skill coordination with EM-based reinforcement learning," in