Yevgen Chebotar

University of Southern California

Publication

Featured research published by Yevgen Chebotar.

international conference on robotics and automation | 2017

Path integral guided policy search

Yevgen Chebotar; Mrinal Kalakrishnan; Ali Yahya; Adrian Li; Stefan Schaal; Sergey Levine

Sergey Levine is with Google Brain, Mountain View, CA 94043, USA.

We present a policy search method for learning complex feedback control policies that map from high-dimensional sensory inputs to motor torques, for manipulation tasks with discontinuous contact dynamics. We build on a prior technique called guided policy search (GPS), which iteratively optimizes a set of local policies for specific instances of a task, and uses these to train a complex, high-dimensional global policy that generalizes across task instances. We extend GPS in the following ways: (1) we propose the use of a model-free local optimizer based on path integral stochastic optimal control (PI2), which enables us to learn local policies for tasks with highly discontinuous contact dynamics; and (2) we enable GPS to train on a new set of task instances in every iteration by using on-policy sampling: this increases the diversity of the instances that the policy is trained on, and is crucial for achieving good generalization. We show that these contributions enable us to learn deep neural network policies that can directly perform torque control from visual input. We validate the method on a challenging door opening task and a pick-and-place task, and we demonstrate that our approach substantially outperforms the prior LQR-based local policy optimizer on these tasks. Furthermore, we show that on-policy sampling significantly increases the generalization ability of these policies.
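
The model-free local optimizer described in this abstract belongs to a family of methods that sample perturbed rollouts and re-weight them by exponentiated negative cost. The sketch below illustrates only that weighting step and is not the paper's implementation. The function names, the `temperature` parameter, and the toy numbers are assumptions made for this example.

```python
import math


def path_integral_weights(costs, temperature=1.0):
    """Turn rollout costs into probabilities: lower cost, higher weight."""
    best = min(costs)  # subtract the minimum cost for numerical stability
    exps = [math.exp(-(c - best) / temperature) for c in costs]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_parameter_update(samples, costs, temperature=1.0):
    """Return the cost-weighted average of the sampled parameter vectors."""
    weights = path_integral_weights(costs, temperature)
    dim = len(samples[0])
    return [sum(w * s[d] for w, s in zip(weights, samples)) for d in range(dim)]


# Hypothetical toy data: three sampled parameter vectors and their rollout costs.
samples = [[0.0, 1.0], [1.0, 1.0], [2.0, 3.0]]
costs = [5.0, 1.0, 9.0]
new_mean = weighted_parameter_update(samples, costs, temperature=2.0)
```

A smaller `temperature` concentrates the weight on the lowest-cost rollout, while a larger one moves the update toward a plain average of the samples.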

ieee-ras international conference on humanoid robots | 2015

Force estimation and slip detection/classification for grip control using a biomimetic tactile sensor

Zhe Su; Karol Hausman; Yevgen Chebotar; Artem Molchanov; Gerald E. Loeb; Gaurav S. Sukhatme; Stefan Schaal

We introduce and evaluate contact-based techniques to estimate tactile properties and detect manipulation events using a biomimetic tactile sensor. In particular, we estimate finger forces, and detect and classify slip events. In addition, we present a grip force controller that uses the estimation results to gently pick up objects of various weights and texture. The estimation techniques and the grip controller are experimentally evaluated on a robotic system consisting of Barrett arms and hands. Our results indicate that we are able to accurately estimate forces acting in all directions, detect the incipient slip, and classify slip with over 80% success rate.

intelligent robots and systems | 2014

Learning robot tactile sensing for object manipulation

Yevgen Chebotar; Oliver Kroemer; Jan Peters

Tactile sensing is a fundamental component of object manipulation and tool handling skills. With robots entering unstructured environments, tactile feedback also becomes an important ability for robot manipulation. In this work, we explore how a robot can learn to use tactile sensing in object manipulation tasks. We first address the problem of in-hand object localization and adapt three pose estimation algorithms from computer vision. Second, we employ dynamic motor primitives to learn robot movements from human demonstrations and record desired tactile signal trajectories. Then, we add tactile feedback to the control loop and apply relative entropy policy search to learn the parameters of the tactile coupling. Additionally, we show how the learning of tactile feedback can be performed more efficiently by reducing the dimensionality of the tactile information through spectral clustering and principal component analysis. Our approach is implemented on a real robot, which learns to perform a scraping task with a spatula in an altered environment.
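
The dimensionality-reduction step mentioned in this abstract can be illustrated with principal component analysis. The following is a minimal sketch that finds the first principal component by power iteration in pure Python. It is not the authors' pipeline: the function names and the two-dimensional toy "tactile readings" are assumptions, and spectral clustering is not shown.

```python
import math


def center(rows):
    """Subtract the per-dimension mean from every sample."""
    n, dim = len(rows), len(rows[0])
    means = [sum(r[d] for r in rows) / n for d in range(dim)]
    return [[r[d] - means[d] for d in range(dim)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of rows that are already centered."""
    n, dim = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dim)]
            for i in range(dim)]


def first_principal_component(cov, iterations=100):
    """Power iteration: repeatedly apply the covariance matrix and renormalize.

    Assumes cov is non-zero and that the all-ones start vector is not
    orthogonal to the dominant eigenvector.
    """
    dim = len(cov)
    v = [1.0] * dim
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(centered, component):
    """Reduce each sample to its coordinate along one component."""
    return [sum(a * b for a, b in zip(r, component)) for r in centered]


# Hypothetical readings from two sensor channels that vary together.
readings = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
centered = center(readings)
component = first_principal_component(covariance(centered))
reduced = project(centered, component)
```

Each two-channel reading is reduced to a single number along the direction of greatest variance, which is the kind of compact feature a policy search method can then learn a feedback gain for.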

conference of the international speech communication association | 2016

Distilling Knowledge from Ensembles of Neural Networks for Speech Recognition

Austin Waters; Yevgen Chebotar

Speech recognition systems that combine multiple types of acoustic models have been shown to outperform single-model systems. However, such systems can be complex to implement and too resource-intensive to use in production. This paper describes how to use knowledge distillation to combine acoustic models in a way that has the best of many worlds: It improves recognition accuracy significantly, can be implemented with standard training tools, and requires no additional complexity during recognition. First, we identify a simple but particularly strong type of ensemble: a late combination of recurrent neural networks with different architectures and training objectives. To harness such an ensemble, we use a variant of standard cross-entropy training to distill it into a single model and then discriminatively fine-tune the result. An evaluation on 2,000-hour large vocabulary tasks in 5 languages shows that the distilled models provide up to 8.9% relative WER improvement over conventionally-trained baselines with an identical number of parameters.
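
The distillation idea in this abstract can be sketched as training a student against the averaged output of several teacher models. The code below is a generic illustration, not the paper's recipe: averaging posteriors as the "late combination", the soft-target cross-entropy, all function names, and the toy numbers are assumptions. The last helper shows how a relative WER improvement is computed.

```python
import math


def softmax(logits):
    """Convert logits to probabilities, shifted by the max for stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_soft_targets(posteriors_per_model):
    """Average the class posteriors of several teacher models."""
    n_models = len(posteriors_per_model)
    n_classes = len(posteriors_per_model[0])
    return [sum(p[k] for p in posteriors_per_model) / n_models
            for k in range(n_classes)]


def soft_cross_entropy(student_logits, soft_targets):
    """Cross-entropy of the student distribution against soft teacher targets."""
    student = softmax(student_logits)
    return -sum(t * math.log(q) for t, q in zip(soft_targets, student) if t > 0.0)


def relative_wer_improvement(baseline_wer, new_wer):
    """Fraction by which the word error rate dropped relative to the baseline."""
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical posteriors from two teachers over three classes.
teachers = [[0.7, 0.2, 0.1], [0.5, 0.4, 0.1]]
targets = ensemble_soft_targets(teachers)
loss = soft_cross_entropy([0.0, 0.0, 0.0], targets)
```

Because the student is a single model trained with an ordinary loss, nothing about the ensemble has to be kept at recognition time, which matches the motivation given in the abstract.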

international symposium on experimental robotics | 2016

Generalizing Regrasping with Supervised Policy Learning

Yevgen Chebotar; Karol Hausman; Oliver Kroemer; Gaurav S. Sukhatme; Stefan Schaal

We present a method for learning a general regrasping behavior by using supervised policy learning. First, we use reinforcement learning to learn linear regrasping policies, with a small number of parameters, for single objects. Next, a general high-dimensional regrasping policy is learned in a supervised manner by using the outputs of the individual policies. In our experiments with multiple objects, we show that learning low-dimensional policies makes the reinforcement learning feasible with a small amount of data. Our experiments indicate that the general high-dimensional policy learned using our method is able to outperform the respective linear policies on each of the single objects that they were trained on. Moreover, the general policy is able to generalize to a novel object that was not present during training.
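
The two-stage scheme in this abstract can be sketched as follows: per-object linear policies generate action labels, and a single general policy is fit to those labels by supervised regression. This is a minimal illustration, not the authors' method. The reinforcement learning stage is skipped (the per-object gains are hand-picked), and the scalar object feature, the least-squares model standing in for the high-dimensional policy, and all names are assumptions.

```python
def collect_supervised_data(object_policies, states):
    """Query each per-object linear policy to build (features, action) pairs.

    object_policies maps a scalar object feature f to the gain w of the
    linear policy a = w * s assumed to have been learned for that object.
    """
    data = []
    for feature, gain in object_policies.items():
        for s in states:
            data.append(((s, feature * s), gain * s))
    return data


def fit_general_policy(data):
    """Least-squares fit of a = theta0 * s + theta1 * (f * s).

    Solves the 2x2 normal equations with Cramer's rule.
    """
    a11 = sum(x[0] * x[0] for x, _ in data)
    a12 = sum(x[0] * x[1] for x, _ in data)
    a22 = sum(x[1] * x[1] for x, _ in data)
    b1 = sum(x[0] * y for x, y in data)
    b2 = sum(x[1] * y for x, y in data)
    det = a11 * a22 - a12 * a12
    theta0 = (b1 * a22 - a12 * b2) / det
    theta1 = (a11 * b2 - a12 * b1) / det
    return theta0, theta1


def general_policy(theta, feature, state):
    """Evaluate the fitted general policy for any object feature."""
    return theta[0] * state + theta[1] * feature * state


# Hypothetical per-object gains, keyed by a scalar object feature.
object_policies = {1.0: 2.0, 2.0: 3.0}
states = [-1.0, 0.5, 2.0]
theta = fit_general_policy(collect_supervised_data(object_policies, states))
action_for_novel_object = general_policy(theta, feature=3.0, state=1.0)
```

The toy gains were chosen to vary linearly with the object feature, so the fitted policy reproduces both training objects and extrapolates to a feature value it never saw. Real regrasping data would not be this clean.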

conference of the european chapter of the association for computational linguistics | 2012