Richard Liaw
University of California, Berkeley
Publication
Featured research published by Richard Liaw.
arXiv: Distributed, Parallel, and Cluster Computing | 2017
Robert Nishihara; Philipp Moritz; Stephanie Wang; Alexey Tumanov; William Paul; Johann Schleier-Smith; Richard Liaw; Mehrdad Niknami; Michael I. Jordan; Ion Stoica
Machine learning applications are increasingly deployed not only to serve predictions using static models, but also as tightly-integrated components of feedback loops involving dynamic, real-time decision making. These applications pose a new set of requirements, none of which are difficult to achieve in isolation, but the combination of which creates a challenge for existing distributed execution frameworks: computation with millisecond latency at high throughput, adaptive construction of arbitrary task graphs, and execution of heterogeneous kernels over diverse sets of resources. We assert that a new distributed execution framework is needed for such ML applications and propose a candidate approach with a proof-of-concept architecture that achieves a 63x performance improvement over a state-of-the-art execution framework for a representative application.
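As a rough illustration of the dynamic task-graph programming model the paper argues for, the following sketch uses the public Python API of Ray (ray.init, @ray.remote, ray.get), the system that grew out of this line of work. The simulate and update functions and the training loop are illustrative assumptions for this sketch, not the paper's benchmark application.

import ray

ray.init()  # start a local Ray instance

@ray.remote
def simulate(policy, seed):
    # Placeholder rollout: a real application would run an environment
    # simulation here and return data used to improve the policy.
    return sum((policy + seed * i) % 7 for i in range(100))

@ray.remote
def update(policy, results):
    # Placeholder policy update that combines rollout results.
    return policy + sum(results) % 3

policy = 0
for step in range(10):
    # Tasks are launched in parallel and the task graph is constructed
    # dynamically, step by step, rather than declared up front.
    futures = [simulate.remote(policy, seed) for seed in range(8)]
    results = ray.get(futures)
    policy = ray.get(update.remote(policy, results))
print("final policy value:", policy)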
The International Journal of Robotics Research | 2018
Sanjay Krishnan; Animesh Garg; Richard Liaw; Brijen Thananjeyan; Lauren Miller; Florian T. Pokorny; Ken Goldberg
We present sequential windowed inverse reinforcement learning (SWIRL), a policy search algorithm that is a hybrid of exploration and demonstration paradigms for robot learning. We apply unsupervised learning to a small number of initial expert demonstrations to structure future autonomous exploration. SWIRL approximates a long time horizon task as a sequence of local reward functions and subtask transition conditions. Over this approximation, SWIRL applies Q-learning to compute a policy that maximizes rewards. Experiments suggest that SWIRL requires significantly fewer rollouts than pure reinforcement learning and fewer expert demonstrations than behavioral cloning to learn a policy. We evaluate SWIRL in two simulated control tasks, parallel parking and a two-link pendulum. On the parallel parking task, SWIRL achieves the maximum reward with 85% fewer rollouts than Q-learning and one-eighth of the demonstrations needed by behavioral cloning. We also consider physical experiments on surgical tensioning and cutting deformable sheets using a da Vinci surgical robot. On the deformable tensioning task, SWIRL achieves a 36% relative improvement in reward compared with a baseline of behavioral cloning with segmentation.
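The sketch below only illustrates the algorithmic structure described in the abstract (segment demonstration states into subtasks, give each subtask a local reward, then run Q-learning that advances when a subtask's transition condition is met). The toy 1-D chain environment, the quantile-based segmentation standing in for the unsupervised learning step, and all names are assumptions for illustration, not the paper's implementation.

import numpy as np

# Toy 1-D chain MDP: states 0..N-1, actions {0: left, 1: right}.
N = 12
ACTIONS = [-1, +1]

def step(s, a):
    return int(np.clip(s + ACTIONS[a], 0, N - 1))

# A few "expert demonstrations" (state sequences) that walk right to the goal.
demos = [list(range(N)) for _ in range(3)]

# Segment demonstration states into k subtasks (a simple quantile split,
# standing in for the unsupervised segmentation step described above).
k = 3
all_states = np.concatenate(demos)
boundaries = np.quantile(all_states, [i / k for i in range(1, k + 1)])
transition_states = [int(round(b)) for b in boundaries]  # end of each subtask

# Q-learning over (subtask, state): local reward of 1 for reaching the
# subtask's transition state, which also advances to the next subtask.
Q = np.zeros((k, N, len(ACTIONS)))
alpha, gamma, eps = 0.5, 0.95, 0.1
rng = np.random.default_rng(0)

for episode in range(500):
    s, sub = 0, 0
    for t in range(100):
        a = rng.integers(len(ACTIONS)) if rng.random() < eps else int(np.argmax(Q[sub, s]))
        s2 = step(s, a)
        reached = s2 >= transition_states[sub]
        r = 1.0 if reached else 0.0
        sub2 = min(sub + (1 if reached else 0), k - 1)
        Q[sub, s, a] += alpha * (r + gamma * np.max(Q[sub2, s2]) - Q[sub, s, a])
        s, sub = s2, sub2
        if sub == k - 1 and s == N - 1:
            break

print("greedy action per state, by subtask:")
for sub in range(k):
    print(sub, np.argmax(Q[sub], axis=1))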
arXiv: Distributed, Parallel, and Cluster Computing | 2017
Philipp Moritz; Robert Nishihara; Stephanie Wang; Alexey Tumanov; Richard Liaw; Eric Liang; William Paul; Michael I. Jordan; Ion Stoica
arXiv: Robotics | 2016
Sanjay Krishnan; Animesh Garg; Richard Liaw; Lauren Miller; Florian T. Pokorny; Ken Goldberg
arXiv: Artificial Intelligence | 2017
Richard Liaw; Sanjay Krishnan; Animesh Garg; Daniel Crankshaw; Joseph E. Gonzalez; Ken Goldberg
arXiv: Artificial Intelligence | 2017
Eric Liang; Richard Liaw; Robert Nishihara; Philipp Moritz; Roy Fox; Joseph E. Gonzalez; Ken Goldberg; Ion Stoica
arXiv: Artificial Intelligence | 2017
Eric Liang; Richard Liaw; Philipp Moritz; Robert Nishihara; Roy Fox; Ken Goldberg; Joseph E. Gonzalez; Michael I. Jordan; Ion Stoica
Archive | 2017
Michael Laskey; Jonathan Lee; Wesley Yu-Shu Hsieh; Richard Liaw; Jeffrey Mahler; Roy Fox; Ken Goldberg
International Conference on Machine Learning | 2018
Eric Liang; Richard Liaw; Robert Nishihara; Philipp Moritz; Roy Fox; Ken Goldberg; Joseph E. Gonzalez; Michael I. Jordan; Ion Stoica