Philip S. Thomas
University of Massachusetts Amherst
Publications
Featured researches published by Philip S. Thomas.
international conference on development and learning | 2012
Philip S. Thomas; Andrew G. Barto
We present a method for autonomous on-line discovery of motor primitives for Markov decision processes with high-dimensional continuous action spaces. These biologically-inspired motor primitives require overhead to compute but form a compressed representation of the action set that allows for improved performance on subsequent learning tasks that have similar dynamics.
IEEE Transactions on Human-Machine Systems | 2016
Kathleen M. Jagodnik; Philip S. Thomas; Antonie J. van den Bogert; Michael S. Branicky; Robert F. Kirsch
High-level spinal cord injury (SCI) in humans causes paralysis below the neck. Functional electrical stimulation (FES) technology applies electrical current to nerves and muscles to restore movement, and controllers for upper extremity FES neuroprostheses calculate stimulation patterns to produce desired arm movement. However, currently available FES controllers have yet to restore natural movements. Reinforcement learning (RL) is a reward-driven control technique; it can employ user-generated rewards, and human preferences can be used in training. To test this concept with FES, we conducted simulation experiments using computer-generated “pseudo-human” rewards. Rewards with varying properties were used with an actor-critic RL controller for a planar two-degree-of-freedom biomechanical human arm model performing reaching movements. Results demonstrate that sparse, delayed pseudo-human rewards permit stable and effective RL controller learning. Learning success increased with reward frequency, and human-scale sparse rewards permitted greater learning than exclusively automated rewards. Diversity of training task sets did not affect learning. Long-term stability of trained controllers was observed. These results suggest that human-generated rewards may be useful for training RL controllers for upper-extremity FES systems. Our findings represent progress toward achieving human-machine teaming in control of upper-extremity FES systems for more natural arm movements based on human user preferences and RL algorithm learning capabilities.
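The actor-critic setup with sparse, delayed rewards described in this abstract can be illustrated with a minimal tabular sketch. This is a toy chain-walk task, not the biomechanical arm model used in the paper; all state/action names, learning rates, and episode counts below are illustrative assumptions. Reward arrives only on reaching the goal state, yet the TD critic propagates value backward and the softmax actor learns to move toward it:

```python
import math
import random

random.seed(1)
N_STATES, GOAL = 5, 4
ALPHA_V, ALPHA_PI, GAMMA = 0.1, 0.1, 0.95

V = [0.0] * N_STATES                            # critic: state-value estimates
theta = [[0.0, 0.0] for _ in range(N_STATES)]   # actor: preferences for (left, right)

def softmax_action(s):
    """Sample an action from the softmax of the actor's preferences."""
    prefs = theta[s]
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    r = random.random() * sum(exps)
    return 0 if r < exps[0] else 1

def step(s, a):
    """Chain dynamics: action 0 moves left, action 1 moves right.

    The reward is sparse and delayed: 1.0 only on entering the goal state.
    """
    s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
    reward = 1.0 if s2 == GOAL else 0.0
    return s2, reward, s2 == GOAL

for _ in range(300):
    s = 0
    for _ in range(50):                         # per-episode step limit
        a = softmax_action(s)
        s2, r, done = step(s, a)
        # TD error drives both critic and actor updates
        td = r + (0.0 if done else GAMMA * V[s2]) - V[s]
        V[s] += ALPHA_V * td
        # policy-gradient update for a softmax actor: d log pi / d theta
        prefs = theta[s]
        m = max(prefs)
        exps = [math.exp(p - m) for p in prefs]
        z = sum(exps)
        pi = [e / z for e in exps]
        for act in range(2):
            grad = (1.0 if act == a else 0.0) - pi[act]
            theta[s][act] += ALPHA_PI * td * grad
        s = s2
        if done:
            break
```

After training, the actor's preference for moving right dominates near the goal even though no intermediate rewards were ever given, mirroring the abstract's finding that sparse, delayed rewards can still support stable learning.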
international joint conference on artificial intelligence | 2018
Shayan Doroudi; Philip S. Thomas; Emma Brunskill
We consider the problem of off-policy policy selection in reinforcement learning: using historical data generated from running one policy to compare two or more policies. We show that approaches based on importance sampling can be unfair—they can select the worse of two policies more often than not. We give two examples where the unfairness of importance sampling could be practically concerning. We then present sufficient conditions to theoretically guarantee fairness and a related notion of safety. Finally, we provide a practical importance sampling-based estimator to help mitigate one of the systematic sources of unfairness resulting from using importance sampling for policy selection.
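The off-policy policy selection setup in this abstract can be sketched with ordinary importance sampling in a one-step bandit (a simplification of the paper's trajectory setting; the policies, action names, and reward values below are illustrative assumptions, not the paper's examples). Each reward observed under the behavior policy is reweighted by the ratio of evaluation-policy to behavior-policy probability, and the candidate with the higher estimate is selected:

```python
import random

def is_estimate(episodes, pi, b):
    """Ordinary importance sampling estimate of a policy's expected reward.

    episodes: list of (action, reward) pairs collected under behavior policy b.
    pi, b: dicts mapping action -> probability under the evaluation and
    behavior policies (one-step bandit, so the importance weight is a
    single ratio rather than a product over a trajectory).
    """
    total = 0.0
    for action, reward in episodes:
        total += (pi[action] / b[action]) * reward
    return total / len(episodes)

random.seed(0)
b = {"a": 0.5, "b": 0.5}         # behavior policy: uniform over two actions
pi1 = {"a": 0.9, "b": 0.1}       # candidate that favors the better action
pi2 = {"a": 0.1, "b": 0.9}       # candidate that favors the worse action
rewards = {"a": 1.0, "b": 0.0}   # action "a" is strictly better

# Collect historical data under the behavior policy only.
episodes = []
for _ in range(1000):
    a = "a" if random.random() < b["a"] else "b"
    episodes.append((a, rewards[a]))

v1 = is_estimate(episodes, pi1, b)
v2 = is_estimate(episodes, pi2, b)
best = "pi1" if v1 > v2 else "pi2"
```

The estimator is unbiased, which is what makes the paper's point notable: unbiasedness alone does not prevent the comparison from being unfair, since the estimates for two candidate policies can have very different variances, and the higher-variance estimate can win the comparison more often than its true value warrants.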
IEEE Transactions on Neural Systems and Rehabilitation Engineering | 2017
Kathleen M. Jagodnik; Philip S. Thomas; Antonie J. van den Bogert; Michael S. Branicky; Robert F. Kirsch
Functional Electrical Stimulation (FES) employs neuroprostheses to apply electrical current to the nerves and muscles of individuals paralyzed by spinal cord injury to restore voluntary movement. Neuroprosthesis controllers calculate stimulation patterns to produce desired actions. To date, no existing controller is able to efficiently adapt its control strategy to the wide range of possible physiological arm characteristics, reaching movements, and user preferences that vary over time. Reinforcement learning (RL) is a control strategy that can incorporate human reward signals as inputs to allow human users to shape controller behavior. In this paper, ten neurologically intact human participants assigned subjective numerical rewards to train RL controllers, evaluating animations of goal-oriented reaching tasks performed using a planar musculoskeletal human arm simulation. The RL controller learning achieved using human trainers was compared with learning accomplished using human-like rewards generated by an algorithm; metrics included success at reaching the specified target; time required to reach the target; and target overshoot. Both sets of controllers learned efficiently and with minimal differences, significantly outperforming standard controllers. Reward positivity and consistency were found to be unrelated to learning success. These results suggest that human rewards can be used effectively to train RL-based FES controllers.
national conference on artificial intelligence | 2011
George Konidaris; Sarah Osentoski; Philip S. Thomas
international conference on machine learning | 2014
Philip S. Thomas
national conference on artificial intelligence | 2016
Marc G. Bellemare; Georg Ostrovski; Arthur Guez; Philip S. Thomas; Rémi Munos
international conference on artificial intelligence | 2015
Georgios Theocharous; Philip S. Thomas; Mohammad Ghavamzadeh
innovative applications of artificial intelligence | 2009
Philip S. Thomas; Antonie J. van den Bogert; Kathleen M. Jagodnik; Michael S. Branicky
national conference on artificial intelligence | 2015
Philip S. Thomas; Georgios Theocharous; Mohammad Ghavamzadeh