Daniel R. Jiang
Princeton University
Publications
Featured research published by Daniel R. Jiang.
Operations Research | 2015
Daniel R. Jiang; Warren B. Powell
Many sequential decision problems can be formulated as Markov decision processes (MDPs) where the optimal value function (or cost-to-go function) can be shown to satisfy a monotone structure in some or all of its dimensions. When the state space becomes large, traditional techniques, such as the backward dynamic programming algorithm (i.e., backward induction or value iteration), may no longer be effective in finding a solution within a reasonable time frame, and thus we are forced to consider other approaches, such as approximate dynamic programming (ADP). We propose a provably convergent ADP algorithm called Monotone-ADP that exploits the monotonicity of the value functions to increase the rate of convergence. In this paper, we describe a general finite-horizon problem setting where the optimal value function is monotone, present a convergence proof for Monotone-ADP under various technical assumptions, and show numerical results for three application domains: optimal stopping, energy storage/allocation, and glycemic control for diabetes patients. The empirical results indicate that by taking advantage of monotonicity, we can attain high-quality solutions within a relatively small number of iterations, using up to two orders of magnitude less computation than is needed to compute the optimal solution exactly.
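The monotonicity-projection step at the heart of Monotone-ADP can be illustrated in a few lines. The sketch below is a simplified rendering, assuming a one-dimensional state space and a value function known to be nondecreasing; the paper treats general partial orders over multidimensional states.

```python
import numpy as np

# Sketch of one Monotone-ADP update (simplifying assumption: scalar
# states 0..n-1 and a value function that is nondecreasing in the state).
def monotone_update(v, s, observation, stepsize):
    """Smooth a new observed value into v[s], then restore monotonicity."""
    v = v.copy()
    v[s] = (1 - stepsize) * v[s] + stepsize * observation
    # Projection: states below s may not exceed v[s]; states above s
    # may not fall below it. This propagates one observation to neighbors.
    v[:s] = np.minimum(v[:s], v[s])
    v[s + 1:] = np.maximum(v[s + 1:], v[s])
    return v

v = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
print(monotone_update(v, s=3, observation=7.0, stepsize=0.5))
# -> [0. 1. 2. 5. 5. 5.]; the single update also lifted states 4 and 5
```

The projection is what accelerates convergence: every observation carries information about all states ordered relative to the one visited.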
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning | 2014
Daniel R. Jiang; Thuy V. Pham; Warren B. Powell; Daniel F. Salas; Warren R. Scott
As more renewable, yet volatile, forms of energy like solar and wind are incorporated into the grid, the problem of finding optimal control policies for energy storage is becoming increasingly important. These sequential decision problems are often modeled as stochastic dynamic programs, but when the state space becomes large, traditional (exact) techniques such as backward induction, policy iteration, or value iteration quickly become computationally intractable. Approximate dynamic programming (ADP) thus becomes a natural solution technique for solving these problems to near-optimality using significantly fewer computational resources. In this paper, we compare the performance of the following: various approximation architectures with approximate policy iteration (API), approximate value iteration (AVI) with a structured lookup table, and direct policy search, on a benchmark energy storage problem (i.e., one whose optimal solution is computable).
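As a point of reference for a "computable optimal solution," here is a minimal backward-induction (exact DP) sketch on a toy storage problem. The discretized storage levels, i.i.d. price model, and unit buy/hold/sell actions are illustrative assumptions, not the paper's benchmark specification.

```python
import numpy as np

# Toy finite-horizon energy storage problem solved exactly by
# backward induction. State: (storage level, current price).
T = 24                                # decision epochs
S = 11                                # storage levels 0..10
prices = np.array([20.0, 35.0, 50.0])
probs = np.array([0.3, 0.4, 0.3])     # i.i.d. price distribution
actions = (-1, 0, 1)                  # sell, hold, buy one unit

V = np.zeros((T + 1, S, len(prices)))  # terminal value: zero

for t in reversed(range(T)):
    # Continuation value E[V_{t+1}(s', P')] under the i.i.d. price model.
    cont = V[t + 1] @ probs            # shape (S,)
    for s in range(S):
        for k, p in enumerate(prices):
            # Selling a unit earns p; buying costs p; holding is free.
            V[t, s, k] = max(-a * p + cont[s + a]
                             for a in actions if 0 <= s + a < S)

print(V[0, 5, 1])  # optimal expected profit from half-full storage at price 35
```

On a problem this small the exact solution is trivial; the paper's point is that ADP methods can be scored against such a baseline before being applied where exact DP is intractable.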
Mathematics of Operations Research | 2017
Daniel R. Jiang; Warren B. Powell
In this paper, we consider a finite-horizon Markov decision process (MDP) for which the objective at each stage is to minimize a quantile-based risk measure (QBRM) of the sequence of future costs; we call the overall objective a dynamic quantile-based risk measure (DQBRM). In particular, we consider optimizing dynamic risk measures where the one-step risk measures are QBRMs, a class of risk measures that includes the popular value at risk (VaR) and the conditional value at risk (CVaR). Although there is considerable theoretical development of risk-averse MDPs in the literature, the computational challenges have not been explored as thoroughly. We propose data-driven and simulation-based approximate dynamic programming (ADP) algorithms to solve the risk-averse sequential decision problem. We address the issue of inefficient sampling for risk applications in simulated settings and present a procedure, based on importance sampling, to direct samples toward the “risky region” as the ADP algorithm progresses. ...
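For concreteness, the two quantile-based risk measures named above can be estimated from Monte Carlo cost samples as follows. The 0.95 risk level and the lognormal cost distribution are illustrative assumptions; conventions here treat costs as losses, so the risk measures focus on the worst 5% of outcomes.

```python
import numpy as np

def value_at_risk(costs, alpha=0.95):
    """Empirical VaR: the alpha-quantile of the cost distribution."""
    return np.quantile(costs, alpha)

def conditional_value_at_risk(costs, alpha=0.95):
    """Empirical CVaR: the mean cost in the tail at or beyond VaR."""
    var = value_at_risk(costs, alpha)
    return costs[costs >= var].mean()

rng = np.random.default_rng(0)
costs = rng.lognormal(mean=0.0, sigma=1.0, size=100_000)  # heavy-tailed costs
print(value_at_risk(costs), conditional_value_at_risk(costs))
```

Plain Monte Carlo spends most samples outside the tail, which is the inefficiency the paper's importance-sampling procedure is designed to address by steering samples toward the "risky region."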
Informs Journal on Computing | 2015
Daniel R. Jiang; Warren B. Powell
Archive | 2017
Daniel R. Jiang; Lina Al-Kanj; Warren B. Powell
arXiv: Optimization and Control | 2016
Daniel R. Jiang; Warren B. Powell
Archive | 2016
Daniel R. Jiang; Warren B. Powell
International Conference on Machine Learning | 2018
Daniel R. Jiang; Emmanuel Ekwedike; Han Liu
arXiv: Optimization and Control | 2018
Yijia Wang; Daniel R. Jiang
Archive | 2016
Daniel R. Jiang