Towards Evaluating Plan Generation Approaches with Instructional Texts
Debajyoti Paul Chowdhury, Arghya Biswas, Tomasz Sosnowski, Kristina Yordanova
University of Rostock, Rostock, Germany; University of Bristol, Bristol, UK; Kurt Lewin Center for Theoretical Psychology
Abstract.
Recent research in behaviour understanding through language grounding has shown that it is possible to automatically generate behaviour models from textual instructions. These models usually have a goal-oriented structure and are modelled with different formalisms from the planning domain, such as the Planning Domain Definition Language. One major problem that remains is that there are no benchmark datasets for comparing the different model generation approaches, as each approach is usually evaluated on a domain-specific application. To allow the objective comparison of different methods for model generation from textual instructions, in this report we introduce a dataset consisting of 83 textual instructions in English, their refinement in a more structured form, as well as manually developed plans for each of the instructions. The dataset is publicly available to the community.
Keywords: textual instructions, model generation, planning, knowledge extraction, benchmark

1. Introduction
Intelligent assistive systems support daily activities and allow healthy people as well as people with impairments to continue their independent life [10]. To provide timely and adequate support, such systems have to recognise the user actions and intentions, track the user interactions with objects, detect errors in the user behaviour, and find the best way of assisting them [10]. This can be done by activity recognition (AR) approaches that utilise human behaviour models (HBM) in the form of rules. These rules are used to generate probabilistic models with which the system can infer the user actions and goals [12, 21, 9, 20, 2]. This type of model is also known as a computational state space model (CSSM) [12]. CSSMs perform activity recognition by treating it as a plan recognition problem: given an initial state, a set of possible actions, and a set of observations, the executed actions and the user goals have to be recognised [19]. CSSMs use prior knowledge to obtain the context information needed for building the user actions and the problem domain. The prior knowledge is provided in the form of precondition-effect rules by a domain expert or by the model designer. This knowledge is then used to manually build a CSSM. The manual modelling is, however, a time-consuming and error-prone process [16, 14].

To address this problem, different works propose to learn the models from sensor data [36, 17]. One problem these approaches face is that sensor data is expensive [23]. Another problem is that sensors are not always able to capture fine-grained activities [8]; thus, such activities might not be learned.
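For reference, the plan recognition problem underlying CSSMs can be stated compactly as follows. This formalisation is our own sketch, roughly following [19]; the report does not state it in this form.

    R = \langle P, \mathcal{G}, O \rangle, \qquad P = \langle F, I, A \rangle, \qquad O = (o_1, \ldots, o_m),

where F is a set of fluents, I \subseteq F the initial state, A the set of actions given as precondition-effect rules, \mathcal{G} a set of candidate goals, and O an observed action sequence. The task is to find the goals G \in \mathcal{G} that admit a plan compatible with the observations O.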
To reduce the impact of domain experts or sensor data on the model performance, one can substitute them with textual data [18]. In other words, one can utilise the knowledge encoded in textual instructions to learn the model structure. Textual instructions specify tasks for achieving a given goal without explicitly stating all the required steps. This omission of certain steps makes them a challenging source for learning a model [6]. On the positive side, they are usually written in imperative form, have a simple sentence structure, and are highly organised. Compared to rich texts, this makes them a better source for identifying the sequence of actions needed for reaching the goal [35].

There are different works that address the problem of learning models from textual instructions. Such models are used for constructing plans of human behaviour [15, 4, 26, 28], for learning an optimal action execution sequence based on natural instructions [5, 6, 22, 7, 1, 3], for constructing machine-understandable models from natural language instructions [35, 11], and for automatically generating semantic annotation for sensor datasets [27, 30]. Model learning from textual instructions has applications in different fields of computer science: constructing plans of terrorist attacks [15], improving task execution (such as navigation, following computer commands, playing games) by interpreting natural language instructions [5, 6, 22, 7, 1, 3], enabling a robot or machine to interpret instructions given in natural language [35, 11], or behaviour analysis based on sensor observations [26, 32].

According to [4], to learn a model of human behaviour from textual instructions, the system has to:
(1) extract the actions' semantics from the text,
(2) learn the model semantics through language grounding,
(3) and, finally, translate it into a computational model of human behaviour for planning problems.
In addition to these requirements, [25] introduces (4) the need to learn the domain ontology that is used to abstract and/or specialise the model.

One general challenge that remains is how to empirically compare the different approaches. Yordanova [26] proposes different measures such as correctness of the identified elements, complexity of the model, model coverage, and similarity to handcrafted models. To compare different approaches with these metrics, however, one needs a common dataset. (Providing datasets and the corresponding annotation is a problem in the community that is often underestimated, but that can have a negative effect on the overall performance of the system [29, 31, 33]. In the case of automatically generated models for behaviour analysis, the ground truth is a collection of plans that correspond to the provided textual instructions.) In this technical report we address this problem by introducing a dataset consisting of 83 textual instructions on different topics from our everyday life, together with refinements of the instructions and the corresponding plans. The dataset can be used to compare the ability of approaches for model generation from text to learn a model structure able to explain the corresponding plans.

2. Procedure for collecting the textual instructions
The dataset was collected within the context of the project Text2HBM (https://text2hbm.org/). The goal of the project is to develop methods for the automatic generation of planning operators and behaviour models from textual instructions.

To collect the dataset, we first decided on different topics from our everyday life for which the instructions are to be collected. The topics include cooking recipes, buying movie tickets, changing elements of house appliances, and deciding on a vacation plan. For each type of instruction, we attempted to find different opinions on the execution sequence. In other words, we collected the descriptions from different sources addressing the same process.

Concretely, we first selected a topic (e.g. how to make chocolate cookies, how to make a cake, etc.), then we asked different people to provide their way of achieving the objective. For example, suppose we would like to make chocolate cookies. We took three different opinions about the making process; the three processes are depicted in Figure 1. For each type of instruction, we collected between one and 14 instructions to cover different variations of the same activity. The originally collected texts are our starting point or "raw data". For example, Figure 2 shows the original description collected for preparing chocolate cookies.

After initial tests with approaches for model generation, we found that this form of action sentences is not suitable for generating models, as each sentence is too complex. To address this problem, we performed a refinement step in which the sentences of each instruction were simplified. Any sentence consisting of more than one verb, or of one verb applied to many objects, was broken into a single verb-object pair per sentence. In other words, we refined the texts so that there is only one action in each step.

In Figure 2 one can see that there are multiple objects to which the same action is applied (for example, the action "cream" is applied to the objects "brown sugar", "white sugar", and "butter"). To refine this, we broke each such sentence down into one action (verb) per sentence (see Figure 3). We could have created a more structured representation of the steps by removing the articles "a" and "the", but then an approach relying on natural sentences would probably not be able to parse the sentences correctly. A minimal sketch of this refinement step is given below.
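Although the refinement was performed manually, the following minimal Python sketch illustrates the kind of transformation involved for the simple case of one verb applied to several coordinated objects. The function name and the naive comma-based splitting are our own illustration, not part of the dataset tooling.

    import re

    def split_coordinated_objects(sentence: str) -> list[str]:
        """Naive refinement: turn one verb with several coordinated objects
        into one verb-object sentence per object. Assumes the sentence
        starts with a single imperative verb; an illustration only."""
        verb, _, rest = sentence.rstrip(".").partition(" ")
        # Split the object list on commas, optionally followed by 'and'.
        objects = [o.strip() for o in re.split(r",\s*(?:and\s+)?", rest) if o.strip()]
        return [f"{verb} {obj}." for obj in objects]

    print(split_coordinated_objects("Cream the butter, brown sugar, and white sugar."))
    # ['Cream the butter.', 'Cream brown sugar.', 'Cream white sugar.']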
After refining the texts, we needed a mechanism for validating the correctness of the generated models. To achieve that, we manually created plans that represent every "refined text" instruction. The format of the plans can be seen in Figure 4. Here the first element is the time at which the action starts (we assume instantaneous actions, such that each action takes one time step; it is, however, also possible to have actions with real durations, see e.g. [34]). The second element is the *, which tells us whether this is a new action or whether it was transferred from the previous step (this parameter is redundant in single-agent behaviour, as in our case; there are, however, applications where the behaviour of multiple agents is tracked, and when a new action starts for one agent, the action of another agent may simply be carried over from the previous step, see e.g. [13, 24]). The third element shows the concrete step of the plan in the form of an action followed by the objects (entities) on which the action is executed.

It can be seen that we followed a certain structure when defining the plan steps. After initialising, the steps are written in a format similar to the structure of the refined sentences. Each step starts with a verb, followed by the primary object and then the secondary object. Usually, the secondary object is the source or destination location of the primary object, for example, cream (verb) butter (primary object) mixer (secondary object). More generally, the form of the steps follows the formula N,*,(action object1/location1 object2/location2).
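For illustration, the following short Python sketch parses a plan step in this format into its components. The step syntax is taken from the report; the class and function names are our own.

    from dataclasses import dataclass

    @dataclass
    class PlanStep:
        time: int           # start time of the action (one time step per action)
        new_action: bool    # '*' marks a new action (always new for a single agent)
        action: str         # the verb
        arguments: list     # primary object, then optional secondary object/location

    def parse_plan_step(line: str) -> PlanStep:
        """Parse a step such as '1,*,(take cup table)'."""
        time, marker, rest = line.strip().split(",", 2)
        tokens = rest.strip().lstrip("(").rstrip(")").split()
        return PlanStep(int(time), marker == "*", tokens[0], tokens[1:])

    step = parse_plan_step("1,*,(take cup table)")
    print(step.action, step.arguments)  # prints: take ['cup', 'table']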
3. Dataset
In the following we describe the data format. The dataset is publicly available and can be downloaded from https://github.com/stenialo/Text2HBM.
Recipe 1:
Preheat oven to 350F.
Cover a sheet with parchment paper.
Cream the butter, brown sugar, and white sugar using an electric mixer.
In a separate bowl, combine the flour, hot cocoa, salt, baking soda, and baking powder.
Pour in the dry ingredients and blend on low speed for about two minutes, until the cookie dough is a dark brown and there are no flour streaks.
Fold in the chocolate chips and marshmallows.
Cover the bowl with plastic wrap and allow the cookie dough to chill for about one hour.
Using a mini ice cream scoop, drop the cookie dough onto the baking sheet.
Bake the cookies for about 9-11 minutes, until the marshmallows melt and the cookies slightly brown.
Allow the cookies to cool on a wire rack for about 5-7 minutes.
Serve.

Recipe 2:
Add the vanilla, sugar, brown sugar, eggs, and butter to a bowl.
Whisk flour, salt, and baking soda into another bowl.
Mix dry and wet ingredients until fully blended, without over-mixing.
Apply cooking spray to a pan so that the cookies won't stick to the baking pan.
Take a small quantity of your cookie dough and roll it in your hands into a ball shape.
Place the cookie balls on the baking pan.
Flatten each cookie dough ball with a fork.
Put cookies in the oven for about 8-10 minutes.
Take the cookies out of the oven and let them sit on a wire cooling rack for about 15 minutes to cool.
Store in an airtight container or eat the cookies once they are cooled.

Recipe 3:
Lightly spoon flour into dry measuring cup.
Combine flour, cocoa, baking powder, baking soda, and salt, stirring with a whisk.
Place sugar and butter in a large bowl; beat with a mixer at high speed until well blended.
Add vanilla and eggs and beat well.
With mixer on low speed, gradually add flour mixture.
Beat just until combined.
Fold in cherries and chocolate chips.
Drop by tablespoonfuls 2 inches apart onto baking sheets coated with cooking spray.
Bake at 350F for 12 minutes or just until set.
Remove from oven; cool on pans for 5 minutes.
Cool completely on wire racks and enjoy.

Figure 1. Three different execution sequences or "recipes" for preparing chocolate cookies.
Preheat oven to 350F.
Cover a baking sheet with a parchment paper.
Cream the butter, brown sugar, white sugar using an electric mixer.
In a separate bowl, combine the dry ingredients. Combine the flour, hot cocoa, salt, baking soda, and baking powder.
Pour in the dry ingredients and blend on low speed for about two minutes, until the cookie dough is a dark brown and there are no flour streaks.
Fold in the chocolate chips and marshmallows.
Cover the bowl with plastic wrap and allow the cookie dough to chill for about one hour.
Using a mini ice cream scoop, drop the cookie dough onto the baking sheet.
Bake the cookies for about 9-11 minutes, until the marshmallows melt and the cookies slightly brown.
Allow the cookies to cool on a wire rack for about 5-7 minutes.
Serve.
Figure 2. The originally collected text for preparing chocolate cookies.
Heat the oven to 350F.
Cover the baking sheet with a paper.
Take the electric mixer.
Cream the butter.
Cream the brown sugar.
Cream the white sugar.
Take a separate bowl.
Combine the dry ingredients.
Combine the flour.
Combine the hot cocoa.
Combine the salt.
Combine the baking soda.
Combine the baking powder.
Pour the dry ingredients.
Blend for two minutes.
Fold the chocolate chips.
Fold the marshmallows.
Cover the bowl with plastic wrap.
Chill the cookie dough.
Put the cookie dough on sheet.
Bake the cookies for 10 minutes.
Cool the cookies for 6 minutes.
Serve.
Figure 3. The text for preparing chocolate cookies after refinement.

3.1. Data format.
The entire dataset consists of various activities resulting in 83 unique descriptions. It is separated into eleven categories: Cake, Chicken, Filter System, Movie Tickets, PC Cooler, Portable AC, Vacation Plan, Cookies, French Toasts, Sushi, Washing Machine. Within each of these activities there are several procedures, collected under different names.

• Cookies has fourteen different procedures for making various kinds of cookies.
• French Toast has fourteen different procedures for making various kinds of French toast.
• Sushi has thirteen different procedures for making various kinds of sushi.
• Washing Machine has five different procedures for cleaning and other tasks related to the washing machine.
• Vacation Plans has eight different procedures for planning some kind of trip or vacation.
Figure 4. The plan corresponding to the refined instruction for preparing cookies.

• Cake has ten different opinions from ten different persons on the procedure for making a cake.
• Chicken also has ten different opinions from ten different persons on the procedure for preparing chicken.
• Portable Air Conditioner has four different installation processes.
• Movie Ticket booking has two different options.
• House Water-Filter System also has two different opinions.
• PC Cooling System has one procedure.

Each of these procedures/opinions/processes comes with three types of data:
• Raw Data: As the term suggests, these are the raw texts collected from different people for the different kinds of preparation in each of the activities mentioned. The sentences can be complex or simple, depending on the amount of parts of speech in the particular context.
• Refined Text: To make the Raw Data easily processable, the sentences of the steps in the Raw Data needed to be simplified. Each refined sentence starts with a main verb, followed by the main object, and ends with the secondary object. Usually, the last term is the location of the object, but there are cases where the location is not important. Each step has to follow the structure of a single verb and a single primary object. Also, each sentence has to be grammatically correct and meaningful; for this reason, the required prepositions and articles cannot be omitted.
• Plans: To evaluate the generated models, we needed plan data for each of the corresponding texts. After the initialisation step, each plan step has the format N,*,(action object1/location1 object2/location2).
3.2. Folder structure. The dataset has the following folder structure.
• Textual Descriptions
  – Plans
  – Raw Data
  – Refined Text
  – Refined Text2

Folder Plans contains the plans corresponding to the refined text. As the descriptions were collected by two people, in some of the folders there is an additional folder Refined Text2. If this folder exists for the given instruction, then the plans are based on Refined Text2. Each step in a plan has the form Time,*,(action object object) (e.g. 1,*,(take cup table)). The time indicates the start time of the action, * indicates that the action is a new action, and the concrete action with the involved elements of the environment is in brackets.

Folder Raw Data contains the original instructions.

Folder Refined Text contains a refinement of the original instructions from folder Raw Data. This refinement consists of making the sentences shorter, with only one verb per sentence. The sentences also start with a verb in imperative form.

Folder Refined Text2 contains a refinement of the texts in folder Refined Text. As the data was collected by two persons, for some instructions the intermediate refinement was not kept; in these cases the final refinement is in folder Refined Text.

The dataset can be downloaded from the GitHub repository https://github.com/stenialo/Text2HBM. A minimal loading sketch is given below.
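As an illustration of how this layout can be traversed, the following Python sketch loads one instruction's raw text, refined text, and plan, preferring Refined Text2 where it exists. The per-instruction file naming inside the folders is not specified in this report, so the name argument here is a hypothetical placeholder.

    from pathlib import Path

    def load_instruction(base: Path, name: str) -> dict:
        """Load raw text, refined text, and plan for one instruction.
        'base' is a folder laid out as described above; 'name' is a
        hypothetical per-instruction file name."""
        refined_dir = base / "Refined Text2"
        if not refined_dir.exists():      # fall back to the first refinement
            refined_dir = base / "Refined Text"
        return {
            "raw": (base / "Raw Data" / name).read_text(encoding="utf-8"),
            "refined": (refined_dir / name).read_text(encoding="utf-8"),
            "plan": (base / "Plans" / name).read_text(encoding="utf-8"),
        }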
4. Conclusion
In this work we presented a dataset for evaluating automatically generated planning models. These models are generated from textual instructions and can be used for various applications. The aim of the dataset is to provide a benchmark for approaches that attempt to learn planning operators from instructional texts. The manually built plans allow testing the validity of each generated model against the corresponding plan, or even against other plans describing the same activity; a sketch of one such check is given after this paragraph. We hope that this dataset will be the first step towards identifying common metrics and data for evaluating methods for the automatic generation of behaviour models.
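The report does not prescribe how this validity check is performed; one natural reading, given the precondition-effect rules mentioned in the introduction, is that a generated model explains a plan if every plan step is executable in sequence. The following Python sketch illustrates this under that assumption; the simplified, ground (argument-free) rule representation is our own, not the report's format.

    def plan_is_valid(plan: list, operators: dict, state: set) -> bool:
        """Check that each action's preconditions hold in the current state,
        then apply its add and delete effects. 'operators' maps an action
        name to (preconditions, add_effects, delete_effects) as sets."""
        for action, *_args in plan:
            pre, add, delete = operators[action]
            if not pre <= state:          # a precondition is unsatisfied
                return False
            state = (state - delete) | add
        return True

    # Toy example with a single ground rule:
    ops = {"take": ({"cup-on-table"}, {"holding-cup"}, {"cup-on-table"})}
    print(plan_is_valid([("take", "cup", "table")], ops, {"cup-on-table"}))  # True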
Acknowledgements
This work is part of the Text2HBM project, funded by the German Research Foundation(DFG), grant number YO 226/1-1.
References

[1] M. Babeş-Vroman, J. MacGlashan, R. Gao, K. Winner, R. Adjogah, M. desJardins, M. Littman, and S. Muresan. Learning to interpret natural language instructions. In Proceedings of the Second Workshop on Semantic Interpretation in an Actionable Context, SIAC '12, pages 1–6, Stroudsburg, PA, USA, 2012. Association for Computational Linguistics.
[2] C. L. Baker, R. Saxe, and J. B. Tenenbaum. Action understanding as inverse planning. Cognition, 113(3):329–349, 2009.
[3] L. Benotti, T. Lau, and M. Villalba. Interpreting natural language instructions using language, vision, and behavior. ACM Trans. Interact. Intell. Syst., 4(3):13:1–13:22, Aug. 2014.
[4] S. R. K. Branavan, N. Kushman, T. Lei, and R. Barzilay. Learning high-level planning from text. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1, ACL '12, pages 126–135, Stroudsburg, PA, USA, 2012. Association for Computational Linguistics.
[5] S. R. K. Branavan, D. Silver, and R. Barzilay. Learning to win by reading manuals in a monte-carlo framework. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1, HLT '11, pages 268–277, Stroudsburg, PA, USA, 2011. Association for Computational Linguistics.
[6] S. R. K. Branavan, L. S. Zettlemoyer, and R. Barzilay. Reading between the lines: Learning to map high-level instructions to commands. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, ACL '10, pages 1268–1277, Stroudsburg, PA, USA, 2010. Association for Computational Linguistics.
[7] D. L. Chen and R. J. Mooney. Learning to interpret natural language navigation instructions from observations. In Proceedings of the 25th AAAI Conference on Artificial Intelligence (AAAI-2011), pages 859–865, August 2011.
[8] L. Chen, J. Hoey, C. Nugent, D. Cook, and Z. Yu. Sensor-based activity recognition. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 42(6):790–808, November 2012.
[9] L. M. Hiatt, A. M. Harrison, and J. G. Trafton. Accommodating human variability in human-robot teams through theory of mind. In Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, IJCAI'11, pages 2066–2071, Barcelona, Spain, 2011. AAAI Press.
[10] J. Hoey, P. Poupart, A. v. Bertoldi, T. Craig, C. Boutilier, and A. Mihailidis. Automated handwashing assistance for persons with dementia using video and a partially observable markov decision process. Computer Vision and Image Understanding, 114(5):503–519, May 2010.
[11] T. Kollar, S. Tellex, D. Roy, and N. Roy. Grounding verbs of motion in natural language commands to robots. In O. Khatib, V. Kumar, and G. Sukhatme, editors, Experimental Robotics, volume 79 of Springer Tracts in Advanced Robotics, pages 31–47. Springer Berlin Heidelberg, 2014.
[12] F. Krüger, M. Nyolt, K. Yordanova, A. Hein, and T. Kirste. Computational state space models for activity and intention recognition. A feasibility study. PLoS ONE, 9(11):e109381, 11 2014.
[13] F. Krüger, K. Yordanova, A. Hein, and T. Kirste. Plan synthesis for probabilistic activity recognition. In J. Filipe and A. L. N. Fred, editors, Proceedings of the 5th International Conference on Agents and Artificial Intelligence (ICAART 2013), pages 283–288, Barcelona, Spain, February 2013. SciTePress.
[14] F. Krüger, K. Yordanova, V. Köppen, and T. Kirste. Towards tool support for computational causal behavior models for activity recognition. In Proceedings of the 1st Workshop "Situation-Aware Assistant Systems Engineering: Requirements, Methods, and Challenges" (SeASE 2012) held at Informatik 2012, pages 561–572, Braunschweig, Germany, September 2012.
[15] X. Li, W. Mao, D. Zeng, and F.-Y. Wang. Automatic construction of domain theory for attack planning. In IEEE International Conference on Intelligence and Security Informatics (ISI), 2010, pages 65–70, May 2010.
[16] T. A. Nguyen, S. Kambhampati, and M. Do. Synthesizing robust plans under incomplete domain models. In C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Weinberger, editors, Advances in Neural Information Processing Systems 26, pages 2472–2480. Curran Associates, Inc., 2013.
[17] G. Okeyo, L. Chen, H. Wang, and R. Sterritt. Ontology-based learning framework for activity assistance in an adaptive smart home. In L. Chen, C. D. Nugent, J. Biswas, and J. Hoey, editors, Activity Recognition in Pervasive Intelligent Environments, volume 4 of Atlantis Ambient and Pervasive Intelligence, pages 237–263. Atlantis Press, 2011.
[18] M. Philipose, K. P. Fishkin, M. Perkowitz, D. J. Patterson, D. Fox, H. Kautz, and D. Hahnel. Inferring activities from interactions with objects. IEEE Pervasive Computing, 3(4):50–57, Oct. 2004.
[19] M. Ramírez and H. Geffner. Plan recognition as planning. In Proceedings of the 21st International Joint Conference on Artificial Intelligence, IJCAI'09, pages 1778–1783, San Francisco, CA, USA, 2009. Morgan Kaufmann Publishers Inc.
[20] M. Ramírez and H. Geffner. Goal recognition over POMDPs: Inferring the intention of a POMDP agent. In Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, volume 3 of IJCAI'11, pages 2009–2014, Barcelona, Spain, 2011. AAAI Press.
[21] J. G. Trafton, L. M. Hiatt, A. M. Harrison, F. P. Tamborello, S. S. Khemlani, and A. C. Schultz. ACT-R/E: An embodied cognitive architecture for human-robot interaction. Journal of Human-Robot Interaction, 2(1):30–55, 2013.
[22] A. Vogel and D. Jurafsky. Learning to follow navigational directions. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, ACL '10, pages 806–814, Stroudsburg, PA, USA, 2010. Association for Computational Linguistics.
[23] J. Ye, G. Stevenson, and S. Dobson. USMART: An unsupervised semantic mining activity recognition technique. ACM Trans. Interact. Intell. Syst., 4(4):16:1–16:27, Nov. 2014.
[24] K. Yordanova. Methods for Engineering Symbolic Human Behaviour Models for Activity Recognition. PhD thesis, Institute of Computer Science, Rostock, Germany, June 2014. urn:nbn:de:gbv:28-diss2014-0133-5.
[25] K. Yordanova. From textual instructions to sensor-based recognition of user behaviour. In Companion Publication of the 21st International Conference on Intelligent User Interfaces, IUI '16 Companion, pages 67–73, New York, NY, USA, 2016. ACM.
[26] K. Yordanova. Extracting planning operators from instructional texts for behaviour interpretation. In German Conference on Artificial Intelligence, pages 215–228, Berlin, Germany, September 2018.
[27] K. Yordanova. Towards automated generation of semantic annotation for activity recognition problems. March 2020. Under review.
[28] K. Yordanova and T. Kirste. Learning models of human behaviour from textual instructions. In Proceedings of the 8th International Conference on Agents and Artificial Intelligence (ICAART 2016), pages 415–422, Rome, Italy, February 2016.
[29] K. Yordanova and F. Krüger. Creating and exploring semantic annotation for behaviour analysis. Sensors, 18(9):2778:1–22, 2018.
[30] K. Yordanova, F. Krüger, and T. Kirste. Providing semantic annotation for the CMU grand challenge dataset. Pages 579–584, March 2018.
[31] K. Yordanova, A. Paiement, M. Schröder, E. Tonkin, P. Woznowski, C. M. Olsson, J. Rafferty, and T. Sztyler. Challenges in annotation of user data for ubiquitous systems: Results from the 1st ARDUOUS workshop. Technical Report arXiv:1803.05843, arXiv preprint, March 2018.
[32] K. Y. Yordanova, C. Monserrat, D. Nieves, and J. Hernández-Orallo. Knowledge extraction from task narratives. In Proceedings of the 4th International Workshop on Sensor-based Activity Recognition and Interaction, iWOAR '17, pages 7:1–7:6, New York, NY, USA, 2017. ACM.
[33] K. Yordanova. Challenges providing ground truth for pervasive healthcare systems. IEEE Pervasive Computing, 18(2):100–104, Apr 2019.
[34] K. Yordanova, S. Lüdtke, S. Whitehouse, F. Krüger, A. Paiement, M. Mirmehdi, I. Craddock, and T. Kirste. Analysing cooking behaviour in home settings: Towards health monitoring. Sensors, 19(3), 2019.
[35] Z. Zhang, P. Webster, V. Uren, A. Varga, and F. Ciravegna. Automatically extracting procedural knowledge from instructional texts using natural language processing. In N. Calzolari, K. Choukri, T. Declerck, M. U. Doğan, B. Maegaard, J. Mariani, J. Odijk, and S. Piperidis, editors, Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC-2012), pages 520–527, Istanbul, Turkey, May 2012. European Language Resources Association (ELRA). ACL Anthology Identifier: L12-1094.
[36] H. H. Zhuo and S. Kambhampati. Action-model acquisition from noisy plan traces. In Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence (IJCAI 2013), 2013.