Publication


Featured research published by George Konidaris.


International Conference on Machine Learning | 2006

Autonomous shaping: knowledge transfer in reinforcement learning

George Konidaris; Andrew G. Barto

We introduce the use of learned shaping rewards in reinforcement learning tasks, where an agent uses prior experience on a sequence of tasks to learn a portable predictor that estimates intermediate rewards, resulting in accelerated learning in later tasks that are related but distinct. Such agents can be trained on a sequence of relatively easy tasks in order to develop a more informative measure of reward that can be transferred to improve performance on more difficult tasks without requiring a hand-coded shaping function. We use a rod positioning task to show that this significantly improves performance even after a very brief training period.
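
A minimal sketch of the idea, assuming a tabular agent: the learned predictor (here an arbitrary shaping_model callable mapping agent-space features to a scalar) is simply added to the environment reward inside an otherwise standard Q-learning update. The env interface and feature map are illustrative assumptions, not the paper's setup.

```python
import random
from collections import defaultdict

def q_learning_with_learned_shaping(env, shaping_model, episodes=100,
                                    alpha=0.1, gamma=0.99, epsilon=0.1):
    """Tabular Q-learning that augments the environment reward with a
    learned intermediate-reward estimate. Assumes (hypothetically) that
    `env` exposes `actions`, `reset()`, `step(a) -> (s', r, done)`, and a
    portable feature map `features(s)` that `shaping_model` consumes."""
    Q = defaultdict(float)
    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            # epsilon-greedy action selection
            if random.random() < epsilon:
                action = random.choice(env.actions)
            else:
                action = max(env.actions, key=lambda a: Q[(state, a)])
            next_state, reward, done = env.step(action)
            # add the learned shaping signal to the environment reward
            shaped = reward + shaping_model(env.features(next_state))
            best_next = max(Q[(next_state, a)] for a in env.actions)
            Q[(state, action)] += alpha * (shaped + gamma * best_next
                                           - Q[(state, action)])
            state = next_state
    return Q
```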


The International Journal of Robotics Research | 2012

Robot learning from demonstration by constructing skill trees

George Konidaris; Scott Kuindersma; Roderic A. Grupen; Andrew G. Barto

We describe CST, an online algorithm for constructing skill trees from demonstration trajectories. CST segments a demonstration trajectory into a chain of component skills, where each skill has a goal and is assigned a suitable abstraction from an abstraction library. These properties permit skills to be improved efficiently using a policy learning algorithm. Chains from multiple demonstration trajectories are merged into a skill tree. We show that CST can be used to acquire skills from human demonstration in a dynamic continuous domain, and from both expert demonstration and learned control sequences on the uBot-5 mobile manipulator.
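
For illustration, a simplified offline changepoint segmentation in the spirit of CST (the published algorithm is online and uses MAP changepoint detection over value-function fits): dynamic programming chooses segment boundaries and, for each segment, the best-fitting abstraction from a library of feature subsets. The regression target below is a proxy, so treat this as an assumption-laden sketch, not the published method.

```python
import numpy as np

def segment_trajectory(trajectory, abstractions, penalty=5.0):
    """Segment an (T, d) observation trajectory into a chain of skills.
    `abstractions` is a list of column-index lists (feature subsets).
    Returns [(start, end, abstraction_idx)] for each segment."""
    T = len(trajectory)

    def fit_cost(s, e, cols):
        # linear fit of a proxy return-to-go target from the abstraction's
        # features; CST instead fits value-function models online
        X = np.hstack([trajectory[s:e, cols], np.ones((e - s, 1))])
        y = np.arange(e - s, 0, -1, dtype=float)
        beta, res, *_ = np.linalg.lstsq(X, y, rcond=None)
        return float(np.sum((y - X @ beta) ** 2))

    best = [0.0] + [np.inf] * T      # best[t] = min cost of segmenting [0, t)
    choice = [None] * (T + 1)
    for e in range(1, T + 1):
        for s in range(max(0, e - 200), e):
            if e - s < 2:
                continue
            for k, cols in enumerate(abstractions):
                c = best[s] + fit_cost(s, e, cols) + penalty
                if c < best[e]:
                    best[e], choice[e] = c, (s, k)
    # backtrack the chosen segment boundaries
    segments, e = [], T
    while e > 0:
        s, k = choice[e]
        segments.append((s, e, k))
        e = s
    return list(reversed(segments))
```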


International Conference on Robotics and Automation | 2012

LQR-RRT*: Optimal sampling-based motion planning with automatically derived extension heuristics

Alejandro Perez; Robert Platt; George Konidaris; Leslie Pack Kaelbling; Tomás Lozano-Pérez

The RRT* algorithm has recently been proposed as an optimal extension to the standard RRT algorithm [1]. However, like RRT, RRT* is difficult to apply in problems with complicated or underactuated dynamics because it requires the design of two domain-specific extension heuristics: a distance metric and a node-extension method. We propose automatically deriving these two heuristics for RRT* by locally linearizing the domain dynamics and applying linear quadratic regulation (LQR). The resulting algorithm, LQR-RRT*, finds optimal plans in domains with complex or underactuated dynamics without requiring domain-specific design choices. We demonstrate its application in domains that are successively torque-limited, underactuated, and in belief space.
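
A sketch of the core construction, assuming dynamics locally linearized as x_dot = A x + B u: solving the continuous-time algebraic Riccati equation yields a cost-to-go matrix that can serve as the distance metric and a gain that can serve as the extension policy. The SciPy solver and the double-integrator example are illustrative choices, not the paper's code.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def lqr_distance_and_policy(A, B, Q, R):
    """From the local linearization x_dot = A x + B u, compute the LQR
    cost-to-go matrix S and gain K. The quadratic form dx^T S dx acts as
    an RRT* distance metric, and u = -K dx as the node-extension policy."""
    S = solve_continuous_are(A, B, Q, R)
    K = np.linalg.solve(R, B.T @ S)          # K = R^{-1} B^T S
    def distance(x, x_goal):
        dx = np.asarray(x) - np.asarray(x_goal)
        return float(dx @ S @ dx)
    def extend(x, x_goal):
        dx = np.asarray(x) - np.asarray(x_goal)
        return -K @ dx                       # steer x toward x_goal
    return distance, extend

# example: double integrator linearized about the target state
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
dist, extend = lqr_distance_and_policy(A, B, Q=np.eye(2), R=np.eye(1))
print(dist([1.0, 0.0], [0.0, 0.0]))
```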


Intelligent Robots and Systems | 2012

Learning and generalization of complex tasks from unstructured demonstrations

Scott Niekum; Sarah Osentoski; George Konidaris; Andrew G. Barto

We present a novel method for segmenting demonstrations, recognizing repeated skills, and generalizing complex tasks from unstructured demonstrations. This method combines many of the advantages of recent automatic segmentation methods for learning from demonstration into a single principled, integrated framework. Specifically, we use the Beta Process Autoregressive Hidden Markov Model and Dynamic Movement Primitives to learn and generalize a multi-step task on the PR2 mobile manipulator and to demonstrate the potential of our framework to learn a large library of skills over time.
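
Dynamic Movement Primitives, the motor-primitive representation used here, can be summarized in a few lines. Below is a sketch of a one-dimensional discrete DMP rollout; the forcing-term weights would be fit from the segmented demonstrations, and are zeros here purely for illustration.

```python
import numpy as np

def dmp_rollout(y0, goal, weights, centers, widths,
                tau=1.0, alpha=25.0, beta=6.25, alpha_x=8.0, dt=0.001):
    """Roll out a 1-D discrete DMP:
        tau*v' = alpha*(beta*(g - y) - v) + f(x),  tau*y' = v,
        tau*x' = -alpha_x*x,
    where f(x) is a radial-basis forcing term scaled by x*(g - y0)."""
    y, v, x = y0, 0.0, 1.0
    traj = [y]
    for _ in range(int(1.0 / dt)):
        psi = np.exp(-widths * (x - centers) ** 2)
        f = (psi @ weights) / (psi.sum() + 1e-10) * x * (goal - y0)
        v += dt / tau * (alpha * (beta * (goal - y) - v) + f)
        y += dt / tau * v
        x += dt / tau * (-alpha_x * x)
        traj.append(y)
    return np.array(traj)

# illustrative weights; in the paper they are fit to demonstration segments
centers = np.linspace(0.0, 1.0, 10)
path = dmp_rollout(y0=0.0, goal=1.0, weights=np.zeros(10),
                   centers=centers, widths=np.full(10, 25.0))
print(path[-1])   # converges near the goal
```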


The International Journal of Robotics Research | 2015

Learning grounded finite-state representations from unstructured demonstrations

Scott Niekum; Sarah Osentoski; George Konidaris; Sachin Chitta; Bhaskara Marthi; Andrew G. Barto

Robots exhibit flexible behavior largely in proportion to their degree of knowledge about the world. Such knowledge is often meticulously hand-coded for a narrow class of tasks, limiting the scope of possible robot competencies. Thus, the primary limiting factor of robot capabilities is often not the physical attributes of the robot, but the limited time and skill of expert programmers. One way to deal with the vast number of situations and environments that robots face outside the laboratory is to provide users with simple methods for programming robots that do not require the skill of an expert. For this reason, learning from demonstration (LfD) has become a popular alternative to traditional robot programming methods, aiming to provide a natural mechanism for quickly teaching robots. By simply showing a robot how to perform a task, users can easily demonstrate new tasks as needed, without any special knowledge about the robot. Unfortunately, LfD often yields little knowledge about the world, and thus lacks robust generalization capabilities, especially for complex, multi-step tasks. We present a series of algorithms that draw from recent advances in Bayesian non-parametric statistics and control theory to automatically detect and leverage repeated structure at multiple levels of abstraction in demonstration data. The discovery of repeated structure provides critical insights into task invariants, features of importance, high-level task structure, and appropriate skills for the task. This culminates in the discovery of a finite-state representation of the task, composed of grounded skills that are flexible and reusable, providing robust generalization and transfer in complex, multi-step robotic tasks. These algorithms are tested and evaluated using a PR2 mobile manipulator, showing success on several complex real-world tasks, such as furniture assembly.


International Conference on Robotics and Automation | 2015

Planning for decentralized control of multiple robots under uncertainty

Christopher Amato; George Konidaris; Gabriel Cruz; Christopher A. Maynor; Jonathan P. How; Leslie Pack Kaelbling

This paper presents a probabilistic framework for synthesizing control policies for general multi-robot systems that is based on decentralized partially observable Markov decision processes (Dec-POMDPs). Dec-POMDPs are a general model of decision-making where a team of agents must cooperate to optimize a shared objective in the presence of uncertainty. Dec-POMDPs also consider communication limitations, so execution is decentralized. While Dec-POMDPs are typically intractable to solve for real-world problems, recent research on the use of macro-actions in Dec-POMDPs has significantly increased the size of problem that can be practically solved. We show that, in contrast to most existing methods that are specialized to a particular problem class, our approach can synthesize control policies that exploit any opportunities for coordination that are present in the problem, while balancing uncertainty, sensor information, and information about other agents. We use three variants of a warehouse task to show that a single planner of this type can generate cooperative behavior using task allocation, direct communication, and signaling, as appropriate. This demonstrates that our algorithmic framework can automatically optimize control and communication policies for complex multi-robot systems.
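
A minimal data-structure sketch of the Dec-POMDP tuple <I, S, {A_i}, T, R, {Omega_i}, O>, plus a Monte-Carlo evaluation of a decentralized joint policy in which each agent conditions only on its own observation history. This is illustrative scaffolding under assumed interfaces, not the macro-action planner described in the paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class DecPOMDP:
    """A Dec-POMDP: n agents act jointly, but each receives only its own
    local observation. Stochasticity can live inside the callables."""
    states: List[str]
    joint_actions: List[Tuple[str, ...]]             # one action per agent
    transition: Callable[[str, Tuple[str, ...]], str]
    reward: Callable[[str, Tuple[str, ...]], float]
    observe: Callable[[str, int], str]               # local obs for agent i
    start: str

def evaluate(model: DecPOMDP, policies, horizon=10, rollouts=1000):
    """Monte-Carlo value estimate of a decentralized joint policy; each
    policies[i] maps agent i's own observation history to an action."""
    total, n = 0.0, len(policies)
    for _ in range(rollouts):
        s = model.start
        histories = [[] for _ in range(n)]
        for _ in range(horizon):
            for i in range(n):
                histories[i].append(model.observe(s, i))
            a = tuple(policies[i](histories[i]) for i in range(n))
            total += model.reward(s, a)
            s = model.transition(s, a)
    return total / rollouts
```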


International Conference on Robotics and Automation | 2013

Optimal sampling-based planning for linear-quadratic kinodynamic systems

Gustavo Goretkin; Alejandro Perez; Robert Platt; George Konidaris

We propose a new method for applying RRT* to kinodynamic motion planning problems by using finite-horizon linear quadratic regulation (LQR) to measure cost and to extend the tree. First, we introduce the method in the context of arbitrary affine dynamical systems with quadratic costs. For these systems, the algorithm is shown to converge to optimal solutions almost surely. Second, we extend the algorithm to non-linear systems with non-quadratic costs, and demonstrate its performance experimentally.
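
The finite-horizon LQR cost used to measure distance and extend the tree can be computed with the standard backward Riccati recursion; a sketch for discrete-time linear dynamics (the paper handles affine systems and extends to non-linear ones):

```python
import numpy as np

def finite_horizon_lqr(A, B, Q, R, Qf, N):
    """Backward Riccati recursion for x_{t+1} = A x_t + B u_t with stage
    cost x^T Q x + u^T R u and terminal cost x^T Qf x. Returns gains K_t
    and cost-to-go matrices P_t; x0^T P_0 x0 is the optimal N-step cost
    from x0, usable as the extension cost in an RRT*-style planner."""
    P = Qf
    Ks, Ps = [], [P]
    for _ in range(N):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        Ks.append(K)
        Ps.append(P)
    return list(reversed(Ks)), list(reversed(Ps))

# example: discretized double integrator
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.5 * dt**2], [dt]])
Ks, Ps = finite_horizon_lqr(A, B, np.eye(2), np.eye(1), 10 * np.eye(2), N=50)
x0 = np.array([1.0, 0.0])
print(x0 @ Ps[0] @ x0)   # optimal finite-horizon cost from x0
```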


Simulation of Adaptive Behavior | 2006

An adaptive robot motivational system

George Konidaris; Andrew G. Barto

We present a robot motivational system design framework. The framework represents the underlying (possibly conflicting) goals of the robot as a set of drives, while ensuring comparable drive levels and providing a mechanism for drive priority adaptation during the robot's lifetime. The resulting drive reward signals are compatible with existing reinforcement learning methods for balancing multiple reward functions. We illustrate the framework with an experiment that demonstrates some of its benefits.
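
A toy sketch of one way such a drive system could be realized (a hypothetical formulation, not the paper's design): each drive keeps a bounded level and an adaptable priority, and the scalar reward is the priority-weighted reduction in drive levels, which plugs directly into standard reinforcement learning.

```python
import numpy as np

class DriveSystem:
    """Hypothetical drive-based reward: drive levels live in [0, 1] so
    they are comparable, and reward is the priority-weighted reduction
    in levels, so satisfying urgent drives pays more."""
    def __init__(self, names, priorities):
        self.names = names
        self.levels = np.zeros(len(names))   # 0 = satisfied, 1 = urgent
        self.priorities = np.asarray(priorities, dtype=float)

    def update(self, new_levels):
        new_levels = np.clip(np.asarray(new_levels, dtype=float), 0.0, 1.0)
        reward = float(self.priorities @ (self.levels - new_levels))
        self.levels = new_levels
        return reward

    def adapt_priority(self, name, priority):
        # drive priorities can be re-weighted during the robot's lifetime
        self.priorities[self.names.index(name)] = priority

drives = DriveSystem(["energy", "curiosity"], priorities=[1.0, 0.5])
drives.update([0.8, 0.2])            # drives become more urgent
print(drives.update([0.3, 0.2]))     # reward for reducing the energy drive
```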


Adaptive Behavior | 2005

An Architecture for Behavior-Based Reinforcement Learning

George Konidaris; Gillian M. Hayes

This paper introduces an integration of reinforcement learning and behavior-based control designed to produce real-time learning in situated agents. The model layers a distributed and asynchronous reinforcement learning algorithm over a learned topological map and standard behavioral substrate to create a reinforcement learning complex. The topological map creates a small and task-relevant state space that aims to make learning feasible, while the distributed and asynchronous aspects of the architecture make it compatible with behavior-based design principles. We present the design, implementation and results of an experiment that requires a mobile robot to perform puck foraging in three artificial arenas using the new model, random decision making, and layered standard reinforcement learning. The results show that our model is able to learn rapidly on a real robot in a real environment, learning and adapting to change more quickly than both alternatives. We show that the robot is able to make the best choices it can given its drives and experiences using only local decisions and therefore displays planning behavior without the use of classical planning techniques.


International Conference on Robotics and Automation | 2014

Learning parameterized motor skills on a humanoid robot

Bruno Castro da Silva; Gianluca Baldassarre; George Konidaris; Andrew G. Barto

We demonstrate a sample-efficient method for constructing reusable parameterized skills that can solve families of related motor tasks. Our method uses learned policies to analyze the policy space topology and learn a set of regression models which, given a novel task, appropriately parameterizes an underlying low-level controller. By identifying the disjoint charts that compose the policy manifold, the method can separately model the qualitatively different sub-skills required for solving distinct classes of tasks. Such sub-skills are useful because they can be treated as new discrete, specialized actions by higher-level planning processes. We also propose a method for reusing seemingly unsuccessful policies as additional, valid training samples for synthesizing the skill, thus accelerating learning. We evaluate our method on a humanoid iCub robot tasked with learning to accurately throw plastic balls at parameterized target locations.
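
A sketch of the regression step under simplifying assumptions: training pairs of task parameters and policy parameters are grouped by chart label, one regressor per chart maps a novel task to policy parameters, and chart selection falls back to the nearest training task (a simplification of the manifold analysis the abstract describes). scikit-learn's Ridge is an assumed tool choice.

```python
import numpy as np
from sklearn.linear_model import Ridge

class ParameterizedSkill:
    """One regressor per chart (sub-skill); given a novel task's
    parameters, pick a chart and predict low-level policy parameters."""
    def __init__(self):
        self.models, self.chart_tasks = {}, {}

    def fit(self, tasks, policy_params, chart_labels):
        tasks = np.asarray(tasks)
        policy_params = np.asarray(policy_params)
        for c in set(chart_labels):
            idx = [i for i, l in enumerate(chart_labels) if l == c]
            self.models[c] = Ridge(alpha=1.0).fit(tasks[idx],
                                                  policy_params[idx])
            self.chart_tasks[c] = tasks[idx]

    def predict(self, task):
        task = np.asarray(task).reshape(1, -1)
        # choose the chart whose training tasks are nearest to the query
        c = min(self.models, key=lambda k: np.min(
            np.linalg.norm(self.chart_tasks[k] - task, axis=1)))
        return self.models[c].predict(task)[0]
```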

Collaboration


Dive into George Konidaris's collaborations.

Top Co-Authors

Andrew G. Barto, University of Massachusetts Amherst
Leslie Pack Kaelbling, Massachusetts Institute of Technology
Roderic A. Grupen, University of Massachusetts Amherst
Scott Niekum, University of Massachusetts Amherst
Tomás Lozano-Pérez, Massachusetts Institute of Technology
Benjamin Rosman, Council for Scientific and Industrial Research
Philip S. Thomas, University of Massachusetts Amherst