Publication


Featured research published by Thomas Kollar.


The International Journal of Robotics Research | 2008

Trajectory Optimization using Reinforcement Learning for Map Exploration

Thomas Kollar; Nicholas Roy

Automatically building maps from sensor data is a necessary and fundamental skill for mobile robots; as a result, considerable research attention has focused on the technical challenges inherent in the mapping problem. While statistical inference techniques have led to computationally efficient mapping algorithms, the next major challenge in robotic mapping is to automate the data collection process. In this paper, we address the problem of how a robot should plan to explore an unknown environment and collect data in order to maximize the accuracy of the resulting map. We formulate exploration as a constrained optimization problem and use reinforcement learning to find trajectories that lead to accurate maps. We demonstrate this process in simulation and show that the learned policy not only results in improved map building, but that the learned policy also transfers successfully to a real robot exploring on MIT campus.
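
To make the exploration-as-optimization idea concrete, here is a minimal policy-search sketch: candidate trajectories are scored by a stand-in reward (map coverage as a proxy for map accuracy), and the best-scoring policy is kept. The grid world, reward model, and naive search procedure are illustrative assumptions, not the paper's actual formulation.

# Toy sketch of exploration-as-optimization: score candidate trajectories by a
# stand-in "map accuracy" reward and keep the best one. The reward model and
# grid world here are illustrative inventions, not the paper's formulation.
import random

GRID = 8                     # toy 8x8 world
MOVES = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}

def rollout(policy, steps=30):
    """Follow a (state -> action) policy and return the visited cells."""
    x = y = 0
    visited = {(x, y)}
    for _ in range(steps):
        dx, dy = MOVES[policy((x, y))]
        x = min(max(x + dx, 0), GRID - 1)
        y = min(max(y + dy, 0), GRID - 1)
        visited.add((x, y))
    return visited

def map_accuracy(visited):
    """Proxy reward: coverage stands in for expected map accuracy."""
    return len(visited) / (GRID * GRID)

def random_policy_table():
    return {(x, y): random.choice(list(MOVES)) for x in range(GRID) for y in range(GRID)}

# Naive policy search: sample policies, keep the one with the best reward.
best, best_r = None, -1.0
for _ in range(2000):
    table = random_policy_table()
    r = map_accuracy(rollout(lambda s, t=table: t[s]))
    if r > best_r:
        best, best_r = table, r
print(f"best coverage reward: {best_r:.2f}")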


Intelligent Robots and Systems | 2007

Topological mapping using spectral clustering and classification

Emma Brunskill; Thomas Kollar; Nicholas Roy

In this work we present an online method for generating topological maps from raw sensor information. We first describe an algorithm to automatically decompose a map into submap segments using a graph partitioning technique known as spectral clustering. We then describe how to train a classifier to recognize graph submaps from laser signatures using the AdaBoost machine learning algorithm. We demonstrate that we can perform topological mapping by incrementally segmenting the world as the robot moves through its environment, and that we can close the loop when the learned classifier recognizes that the robot has returned to a previously visited location.
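
The spectral-clustering step can be illustrated with a minimal sketch: build the graph Laplacian of a small pose graph and split it along the sign of the Fiedler vector. The six-node adjacency matrix is invented; the paper's pipeline additionally segments incrementally and closes loops with an AdaBoost classifier.

# Minimal spectral-partitioning sketch: split a pose graph into two submaps
# using the sign of the Fiedler vector. The 6-node adjacency matrix is made up.
import numpy as np

# Toy pose graph: two 3-node cliques joined by one weak edge (a "doorway").
A = np.array([[0, 1, 1, 0, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [1, 1, 0, 1, 0, 0],
              [0, 0, 1, 0, 1, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 1, 1, 0]], dtype=float)

D = np.diag(A.sum(axis=1))          # degree matrix
L = D - A                           # unnormalized graph Laplacian
eigvals, eigvecs = np.linalg.eigh(L)
fiedler = eigvecs[:, 1]             # eigenvector of 2nd-smallest eigenvalue
labels = (fiedler > 0).astype(int)  # sign gives a 2-way partition
print("submap labels:", labels)     # e.g. [0 0 0 1 1 1]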


AI Magazine | 2011

Approaching the Symbol Grounding Problem with Probabilistic Graphical Models

Stefanie Tellex; Thomas Kollar; Steven R. Dickerson; Matthew R. Walter; Ashis Gopal Banerjee; Seth J. Teller; Nicholas Roy

In order for robots to engage in dialog with human teammates, they must have the ability to map between words in the language and aspects of the external world. A solution to this symbol grounding problem (Harnad, 1990) would enable a robot to interpret commands such as “Drive over to receiving and pick up the tire pallet.” In this article we describe several of our results that use probabilistic inference to address the symbol grounding problem. Our specific approach is to develop models that factor according to the linguistic structure of a command. We first describe an early result, a generative model that factors according to the sequential structure of language, and then discuss our new framework, generalized grounding graphs (G3). The G3 framework dynamically instantiates a probabilistic graphical model for a natural language input, enabling a mapping between words in language and concrete objects, places, paths, and events in the external world. We report on corpus-based experiments where the robot is able to learn and use word meanings in three real-world tasks: indoor navigation, spatial language video retrieval, and mobile manipulation.
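
A toy sketch of the factored-grounding idea: each phrase gets a grounding variable, a per-phrase factor scores (phrase, grounding) pairs, and the joint assignment maximizing the product of factors is selected. The factor table and world objects below are invented placeholders, not the learned factors of the G3 model.

# Toy factored grounding: pick the joint (phrase -> grounding) assignment that
# maximizes the product of per-phrase factor scores. Scores are invented.
from itertools import product

world = ["tire_pallet", "box_pallet", "receiving_area"]

# Hypothetical factor phi(phrase, grounding); a learned model in the paper.
phi = {
    ("the tire pallet", "tire_pallet"): 0.9,
    ("the tire pallet", "box_pallet"): 0.2,
    ("the tire pallet", "receiving_area"): 0.05,
    ("receiving", "receiving_area"): 0.8,
    ("receiving", "tire_pallet"): 0.1,
    ("receiving", "box_pallet"): 0.1,
}

phrases = ["receiving", "the tire pallet"]

def joint_score(assignment):
    score = 1.0
    for phrase, grounding in zip(phrases, assignment):
        score *= phi.get((phrase, grounding), 1e-6)
    return score

best = max(product(world, repeat=len(phrases)), key=joint_score)
print(dict(zip(phrases, best)))  # {'receiving': 'receiving_area', 'the tire pallet': 'tire_pallet'}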


International Conference on Robotics and Automation | 2011

Following and interpreting narrated guided tours

Sachithra Hemachandra; Thomas Kollar; Nicholas Roy; Seth J. Teller

We describe a robotic tour-taking capability enabling a robot to acquire local knowledge of a human-occupied environment. A tour-taking robot autonomously follows a human guide through an environment, interpreting the guide's spoken utterances and the shared spatiotemporal context in order to acquire a spatially segmented and semantically labeled metrical-topological representation of the environment. The described tour-taking capability enables scalable deployment of mobile robots into human-occupied environments, and natural human-robot interaction for commanded mobility. Our primary contributions are an efficient, socially acceptable autonomous tour-following behavior and a tour interpretation algorithm that partitions a map into spaces labeled according to the guide's utterances. The tour-taking behavior is demonstrated in a multi-floor office building and evaluated by assessing the comfort of the tour guides, and by comparing the robot's map partitions to those produced by humans.


International Conference on Robotics and Automation | 2010

Indoor scene recognition through object detection

Pablo Espinace; Thomas Kollar; Alvaro Soto; Nicholas Roy

Scene recognition is a highly valuable perceptual ability for an indoor mobile robot; however, current approaches for scene recognition present a significant drop in performance for the case of indoor scenes. We believe that this can be explained by the high appearance variability of indoor environments. This stresses the need to include high-level semantic information in the recognition process. In this work we propose a new approach for indoor scene recognition based on a generative probabilistic hierarchical model that uses common objects as an intermediate semantic representation. Under this model, we use object classifiers to associate low-level visual features to objects, and at the same time, we use contextual relations to associate objects to scenes. As a further contribution, we improve the performance of current state-of-the-art category-level object classifiers by including geometrical information obtained from a 3D range sensor that facilitates the implementation of a focus of attention mechanism within a Monte Carlo sampling scheme. We test our approach using real data, showing significant advantages with respect to previous state-of-the-art methods.
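
The objects-as-intermediate-representation idea can be sketched as a small Bayesian computation: P(scene | objects) is proportional to P(scene) times the product of P(object | scene) over the detected objects. The priors and likelihoods below are invented placeholders, not the paper's learned terms.

# Minimal scenes-from-objects sketch: objects form the intermediate layer, so
# P(scene | detected objects) ∝ P(scene) * prod_o P(o | scene). Toy numbers.
p_scene = {"office": 0.5, "kitchen": 0.5}

# Hypothetical contextual relations P(object present | scene).
p_obj_given_scene = {
    "office":  {"monitor": 0.8, "mug": 0.4, "stove": 0.01},
    "kitchen": {"monitor": 0.05, "mug": 0.6, "stove": 0.7},
}

def posterior(detected_objects):
    scores = {}
    for scene, prior in p_scene.items():
        lik = prior
        for obj in detected_objects:
            lik *= p_obj_given_scene[scene].get(obj, 0.01)
        scores[scene] = lik
    z = sum(scores.values())                   # normalize
    return {s: v / z for s, v in scores.items()}

print(posterior(["monitor", "mug"]))  # office should dominate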


Other repository | 2014

Grounding Verbs of Motion in Natural Language Commands to Robots

Thomas Kollar; Stefanie Tellex; Deb Roy; Nicholas Roy

To be useful teammates to human partners, robots must be able to follow spoken instructions given in natural language. An important class of instructions involves interacting with people, such as “Follow the person to the kitchen” or “Meet the person at the elevators.” These instructions require that the robot fluidly react to changes in the environment, not simply follow a pre-computed plan. We present an algorithm for understanding natural language commands with three components. First, we create a cost function that scores the language according to how well it matches a candidate plan in the environment, defined as the log-likelihood of the plan given the command. Components of the cost function include novel models for the meanings of motion verbs such as “follow,” “meet,” and “avoid,” as well as spatial relations such as “to” and landmark phrases such as “the kitchen.” Second, an inference method uses this cost function to perform forward search, finding a plan that matches the natural language command. Third, a high-level controller repeatedly calls the inference method at each timestep to compute a new plan in response to changes in the environment such as the movement of the human partner or other people in the scene. When a command consists of more than a single task, the controller switches to the next task when an earlier one is satisfied. We evaluate our approach on a set of example tasks that require the ability to follow both simple and complex natural language commands.
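
A toy version of the score-and-search loop: enumerate short candidate plans over a small place graph and keep the one with the highest stand-in log-likelihood under the command. The places and cost terms are illustrative assumptions, far simpler than the learned verb and spatial-relation models described above.

# Toy forward search: enumerate short candidate plans and keep the one whose
# stand-in log-likelihood under the command is highest. All terms are invented.
import math
from itertools import permutations

places = ["hallway", "elevators", "kitchen"]

def log_likelihood(plan, command):
    """Hypothetical stand-in for the paper's learned cost function."""
    score = 0.0
    for place in plan:
        # Reward visiting places named in the command, penalize detours.
        score += math.log(0.9) if place in command else math.log(0.3)
    return score

command = "meet the person at the elevators"
candidates = [list(p) for n in (1, 2) for p in permutations(places, n)]
best = max(candidates, key=lambda p: log_likelihood(p, command))
print(best)  # ['elevators']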


Robotics: Science and Systems | 2012

Toward Information Theoretic Human-Robot Dialog

Stefanie Tellex; Pratiksha Thaker; Robin Deits; Thomas Kollar; Nicholas Roy

Our goal is to build robots that can robustly interact with humans using natural language. This problem is challenging because human language is filled with ambiguity, and furthermore, due to limitations in sensing, the robot's perception of its environment might be much more limited than that of its human partner. To enable a robot to recover from a failure to understand a natural language utterance, this paper describes an information-theoretic strategy for asking targeted clarifying questions and using information from the answer to disambiguate the language. To identify good questions, we derive an estimate of the robot's uncertainty about the mapping between specific phrases in the language and aspects of the external world. This metric enables the robot to ask a targeted question about the parts of the language for which it is most uncertain. After receiving an answer, the robot fuses information from the command, the question, and the answer in a joint probabilistic graphical model in the G3 framework. When using answers to questions, we show the robot is able to infer mappings between parts of the language and concrete object groundings in the external world with higher accuracy than by using information from the command alone. Furthermore, we demonstrate that by effectively selecting which questions to ask, the robot is able to achieve significant performance gains while asking many fewer questions than baseline metrics.
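
The question-selection heuristic can be sketched in a few lines: estimate a grounding distribution per phrase, then ask about the phrase with the highest Shannon entropy. The belief tables are invented; in the paper they come from the G3 model.

# Entropy heuristic sketch: ask about the phrase whose grounding distribution
# is most uncertain. Belief values below are toy placeholders.
import math

def entropy(dist):
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

grounding_beliefs = {
    "the tire pallet": {"pallet_1": 0.5, "pallet_2": 0.5},    # ambiguous
    "receiving":       {"receiving_area": 0.95, "dock": 0.05},
}

phrase = max(grounding_beliefs, key=lambda ph: entropy(grounding_beliefs[ph]))
print(f"ask about: {phrase!r}")  # 'the tire pallet' (entropy 1.0 vs ~0.29)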


Human-Robot Interaction | 2013

Clarifying commands with information-theoretic human-robot dialog

Robin Deits; Stefanie Tellex; Pratiksha Thaker; Dimitar Simeonov; Thomas Kollar; Nicholas Roy

Our goal is to improve the efficiency and effectiveness of natural language communication between humans and robots. Human language is frequently ambiguous, and a robot's limited sensing makes complete understanding of a statement even more difficult. To address these challenges, we describe an approach for enabling a robot to engage in clarifying dialog with a human partner, just as a human might do in a similar situation. Given an unconstrained command from a human operator, the robot asks one or more questions and receives natural language answers from the human. We apply an information-theoretic approach to choosing questions for the robot to ask. Specifically, we choose the type and subject of questions in order to maximize the reduction in Shannon entropy of the robot's mapping between language and entities in the world. Within the framework of the G3 graphical model, we derive a method to estimate this entropy reduction, choose the optimal question to ask, and merge the information gained from the human operator's answer. We demonstrate that this improves the accuracy of command understanding over prior work while asking fewer questions as compared to baseline question-selection strategies.
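
A minimal sketch of choosing the question with the largest expected entropy reduction: for each candidate question, average the posterior entropy over its possible answers and compare against the prior entropy. All distributions are toy values, not outputs of the G3 model.

# Expected-entropy-reduction sketch: pick the question whose answers are
# expected to shrink the grounding belief's entropy the most. Toy numbers.
import math

def entropy(dist):
    return -sum(p * math.log2(p) for p in dist if p > 0)

prior = [0.5, 0.5]  # belief over two candidate groundings

# Hypothetical questions: (answer probability, induced posterior) pairs.
questions = {
    "Is it near the door?": [(0.5, [0.9, 0.1]), (0.5, [0.1, 0.9])],
    "Is it a pallet?":      [(1.0, [0.5, 0.5])],  # uninformative: both are pallets
}

def expected_entropy(outcomes):
    return sum(p_ans * entropy(post) for p_ans, post in outcomes)

best_q = max(questions, key=lambda q: entropy(prior) - expected_entropy(questions[q]))
print(best_q)  # "Is it near the door?" yields the larger expected gain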


Robotics and Autonomous Systems | 2013

Indoor scene recognition by a mobile robot through adaptive object detection

Pablo Espinace; Thomas Kollar; Nicholas Roy; Alvaro Soto

Mobile robotics has achieved notable progress; however, to increase the complexity of the tasks that mobile robots can perform in natural environments, we need to provide them with a greater semantic understanding of their surroundings. In particular, identifying indoor scenes, such as an Office or a Kitchen, is a highly valuable perceptual ability for an indoor mobile robot, and in this paper we propose a new technique to achieve this goal. As a distinguishing feature, we use common objects, such as Doors or furniture, as a key intermediate representation to recognize indoor scenes. We frame our method as a generative probabilistic hierarchical model, where we use object category classifiers to associate low-level visual features to objects, and contextual relations to associate objects to scenes. The inherent semantic interpretation of common objects allows us to use rich sources of online data to populate the probabilistic terms of our model. In contrast to alternative computer-vision-based methods, we boost performance by exploiting the embedded and dynamic nature of a mobile robot. In particular, we increase detection accuracy and efficiency by using a 3D range sensor that allows us to implement a focus of attention mechanism based on geometric and structural information. Furthermore, we use concepts from information theory to propose an adaptive scheme that limits computational load by selectively guiding the search for informative objects. The operation of this scheme is facilitated by the dynamic nature of a mobile robot that is constantly changing its field of view. We test our approach using real data captured by a mobile robot navigating in Office and home environments. Our results indicate that the proposed approach outperforms several state-of-the-art techniques for scene recognition.
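
The adaptive, information-driven part can be sketched as detector selection by expected information gain: rank candidate object detectors by how much observing a detection (or a miss) would be expected to sharpen the scene belief, and run the most informative one first. The numbers are illustrative placeholders, not the paper's learned terms.

# Information-guided detection sketch: run the detector whose outcome is
# expected to reduce scene-belief entropy the most. Toy probabilities.
import math

def entropy(dist):
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

scene_belief = {"office": 0.5, "kitchen": 0.5}
p_obj_given_scene = {"stove": {"office": 0.01, "kitchen": 0.7},
                     "mug":   {"office": 0.4,  "kitchen": 0.6}}

def info_gain(obj):
    """Expected entropy drop from observing whether `obj` is detected."""
    p_det = sum(scene_belief[s] * p_obj_given_scene[obj][s] for s in scene_belief)
    gain = entropy(scene_belief)
    for outcome, p_out in (("hit", p_det), ("miss", 1 - p_det)):
        post = {}
        for s in scene_belief:
            p = p_obj_given_scene[obj][s]
            post[s] = scene_belief[s] * (p if outcome == "hit" else 1 - p)
        z = sum(post.values())
        gain -= p_out * entropy({s: v / z for s, v in post.items()})
    return gain

print(max(p_obj_given_scene, key=info_gain))  # "stove": far more scene-discriminative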


International Conference on Robotics and Automation | 2013

Imitation learning for natural language direction following through unknown environments

Felix Duvallet; Thomas Kollar; Anthony Stentz

The use of spoken instructions in human-robot teams holds the promise of enabling untrained users to effectively control complex robotic systems in a natural and intuitive way. Providing robots with the capability to understand natural language directions would enable effortless coordination in human-robot teams that operate in non-specialized unknown environments. However, natural language direction following through unknown environments requires understanding the meaning of language, using a partial semantic world model to generate actions in the world, and reasoning about the environment and landmarks that have not yet been detected. We address the problem of robots following natural language directions through complex unknown environments. By exploiting the structure of spatial language, we can frame direction following as a problem of sequential decision making under uncertainty. We learn a policy which predicts a sequence of actions that follow the directions by exploring the environment and discovering landmarks, backtracking when necessary, and explicitly declaring when it has reached the destination. We use imitation learning to train the policy, using demonstrations of people following directions. By training explicitly in unknown environments, we can generalize to situations that have not been encountered previously.
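
A minimal behavior-cloning sketch of the imitation-learning idea: fit a policy to expert (state, action) pairs taken from demonstrations of people following directions. The feature encoding, demonstration data, and majority-vote "learner" are stand-ins; the paper's policy reasons over landmarks and partial maps and is trained with a more sophisticated imitation learner.

# Behavior-cloning sketch: learn a (state -> action) policy from expert
# demonstrations. Demonstrations and the majority-vote learner are invented.
from collections import Counter, defaultdict

# Hypothetical demonstrations: (observed landmark, expert action).
demos = [("hallway", "forward"), ("hallway", "forward"),
         ("doorway", "turn_left"), ("doorway", "turn_left"),
         ("dead_end", "backtrack")]

# "Training": majority vote per state stands in for a learned classifier.
counts = defaultdict(Counter)
for state, action in demos:
    counts[state][action] += 1
policy = {state: c.most_common(1)[0][0] for state, c in counts.items()}

print(policy["doorway"])   # turn_left
print(policy["dead_end"])  # backtrack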

Collaboration


Thomas Kollar's top co-authors and their affiliations.

Top Co-Authors

Nicholas Roy
Massachusetts Institute of Technology

Deb Roy
Massachusetts Institute of Technology

Manuela M. Veloso
Carnegie Mellon University

Mehdi Samadi
Carnegie Mellon University

Seth J. Teller
Massachusetts Institute of Technology

Emma Brunskill
Carnegie Mellon University

Matthew R. Walter
Toyota Technological Institute at Chicago