William B. Haskell
National University of Singapore
Publication
Featured researches published by William B. Haskell.
Operations Research | 2014
Alejandro Toriello; William B. Haskell; Michael Poremba
We propose a dynamic traveling salesman problem (TSP) with stochastic arc costs motivated by applications, such as dynamic vehicle routing, in which the cost of a decision is known only probabilistically beforehand but is revealed dynamically before the decision is executed. We formulate this problem as a dynamic program (DP) and compare it to static counterparts to demonstrate the advantage of the dynamic paradigm over an a priori approach. We then apply approximate linear programming (ALP) to overcome the DP's curse of dimensionality, obtain a semi-infinite linear programming lower bound, and discuss its tractability. We also analyze a rollout version of the price-directed policy implied by our ALP and derive worst-case guarantees for its performance. Our computational study demonstrates the quality of a heuristically modified rollout policy using a computationally effective a posteriori bound.
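The rollout idea lends itself to a compact sketch. The toy routing example below is illustrative only: the nearest-neighbor base policy, the uniform cost noise, and all names are assumptions for exposition, not the paper's price-directed ALP policy.

```python
# Sketch: one-step rollout for a TSP-like problem with random arc costs.
# Assumption (hypothetical): each arc cost is drawn fresh when traversed.
import random
import statistics

random.seed(1)
N = 5  # cities 0..4; city 0 is the depot
MEAN = [[0 if i == j else (i + j + 1) for j in range(N)] for i in range(N)]

def draw_cost(i, j):
    """Sample a realized arc cost around its mean."""
    return random.uniform(0.5, 1.5) * MEAN[i][j]

def nn_tour_cost(current, unvisited):
    """Base policy: greedy nearest-neighbor completion (by mean cost),
    returning the sampled cost of finishing the tour and going home."""
    total, here, left = 0.0, current, set(unvisited)
    while left:
        nxt = min(left, key=lambda j: MEAN[here][j])
        total += draw_cost(here, nxt)
        left.remove(nxt)
        here = nxt
    return total + draw_cost(here, 0)

def rollout_step(current, unvisited, samples=50):
    """Pick the next city minimizing mean immediate cost plus a
    Monte Carlo estimate of the base policy's cost-to-go."""
    def score(j):
        rest = unvisited - {j}
        est = statistics.mean(nn_tour_cost(j, rest) for _ in range(samples))
        return MEAN[current][j] + est
    return min(unvisited, key=score)

state, unvisited, tour = 0, set(range(1, N)), [0]
while unvisited:
    state = rollout_step(state, unvisited)
    unvisited.discard(state)
    tour.append(state)
print(tour)
```

The rollout policy is guaranteed to be no worse (in expectation) than its base policy, which is the source of the worst-case guarantees the abstract mentions.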
Decision and Game Theory for Security | 2014
Matthew Brown; William B. Haskell; Milind Tambe
Boundedly rational human adversaries pose a serious challenge to security because they deviate from the classical assumption of perfect rationality. An emerging trend in security game research addresses this challenge by using behavioral models such as quantal response (QR) and subjective utility quantal response (SUQR). These models improve the quality of the defender’s strategy by more accurately modeling the decisions made by real human adversaries. Work on incorporating human behavioral models into security games has typically followed two threads. The first thread, scalability, seeks to develop efficient algorithms to design patrols for large-scale domains that protect against a single adversary. However, this thread cannot handle the common situation of multiple adversary types with heterogeneous behavioral models. Having multiple adversary types introduces considerable uncertainty into the defender’s planning problem. The second thread, robustness, uses either Bayesian or maximin approaches to handle this uncertainty caused by multiple adversary types. However, the robust approach has so far not been able to scale up to complex, large-scale security games. Thus, each of these two threads alone fails to work in key real-world security games. Our present work addresses this shortcoming and merges these two research threads to yield a scalable and robust algorithm, MIDAS (MaxImin Defense Against SUQR), for generating game-theoretic patrols to defend against multiple boundedly rational human adversaries. Given the size of the defender’s optimization problem, the key component of MIDAS is incremental cut and strategy generation using a master/slave optimization approach. Innovations in MIDAS include (i) a maximin mixed-integer linear programming formulation in the master and (ii) a compact transition graph formulation in the slave. Additionally, we provide a theoretical analysis of our new model and report its performance in simulations.
In collaboration with the United States Coast Guard (USCG), we consider the problem of defending fishery stocks from illegal fishing in the Gulf of Mexico and use MIDAS to handle heterogeneity in adversary types (i.e., illegal fishermen) in order to construct robust patrol strategies for USCG assets.
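For readers unfamiliar with SUQR, the model's attack probabilities take a softmax form over subjective utilities of the targets. A minimal sketch with hypothetical weight values (the signs follow the usual convention: coverage deters, reward attracts); these are not MIDAS's fitted parameters:

```python
# Sketch of SUQR attack probabilities: softmax over subjective utilities
# w1 * coverage + w2 * reward + w3 * penalty at each target.
import math

def suqr_response(coverage, rewards, penalties, w=(-9.0, 0.8, 0.6)):
    """Return the adversary's attack probability for each target.
    The weights w are hypothetical, for illustration only."""
    w1, w2, w3 = w
    scores = [math.exp(w1 * x + w2 * r + w3 * p)
              for x, r, p in zip(coverage, rewards, penalties)]
    z = sum(scores)
    return [s / z for s in scores]

# Two identical targets, one heavily covered: attacks shift to the other.
probs = suqr_response([0.9, 0.1], [5.0, 5.0], [-1.0, -1.0])
print(probs)
```

The defender's maximin problem in MIDAS then optimizes coverage against the worst case over a set of such adversary parameter vectors.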
Mathematics of Operations Research | 2016
William B. Haskell; Rahul Jain; Dileep M. Kalathil
We propose empirical dynamic programming algorithms for Markov decision processes. In these algorithms, the exact expectation in the Bellman operator in classical value iteration is replaced by an empirical estimate to get “empirical value iteration” (EVI). Policy evaluation and policy improvement in classical policy iteration are also replaced by simulation to get “empirical policy iteration” (EPI). Thus, these empirical dynamic programming algorithms involve iteration of a random operator, the empirical Bellman operator. We introduce notions of probabilistic fixed points for such random monotone operators. We develop a stochastic dominance framework for convergence analysis of such operators. We then use this to give sample complexity bounds for both EVI and EPI. We then provide various variations and extensions to asynchronous empirical dynamic programming, the minimax empirical dynamic program, and show how this can also be used to solve the dynamic newsvendor problem. Preliminary experimental results suggest a faster rate of convergence than stochastic approximation algorithms.
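The core EVI idea, replacing the exact expectation in the Bellman operator with a sample average, can be sketched on a toy MDP. The two-state model, sample count, and iteration count below are illustrative stand-ins, not the paper's n(ε, δ) and t(ε, δ) prescriptions:

```python
# Sketch of empirical value iteration (EVI) on a toy 2-state, 2-action MDP.
# The kernel P is hidden inside the simulator; EVI only draws samples.
import random

random.seed(0)

STATES, ACTIONS, GAMMA = [0, 1], [0, 1], 0.9
P = {  # probability of landing in state 0 after (state, action)
    (0, 0): 0.8, (0, 1): 0.3,
    (1, 0): 0.5, (1, 1): 0.1,
}
R = {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 0.5, (1, 1): 2.0}

def simulate(s, a):
    """One draw of the next state; the algorithm never reads P directly."""
    return 0 if random.random() < P[(s, a)] else 1

def empirical_bellman(V, n):
    """Empirical Bellman operator: a sample average replaces E[V(s')]."""
    newV = {}
    for s in STATES:
        newV[s] = max(
            R[(s, a)] + GAMMA * sum(V[simulate(s, a)] for _ in range(n)) / n
            for a in ACTIONS
        )
    return newV

V = {s: 0.0 for s in STATES}
for _ in range(200):               # iteration count: illustrative only
    V = empirical_bellman(V, 100)  # samples per (s, a): illustrative only
print(V)
```

Because the operator is random, the iterates fluctuate around the true fixed point rather than converging to it exactly; this is precisely why the paper introduces probabilistic fixed points and a stochastic dominance framework for the analysis.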
SIAM Journal on Control and Optimization | 2013
William B. Haskell; Rahul Jain
We are interested in risk constraints for infinite horizon discrete time Markov decision processes (MDPs). Starting with average reward MDPs, we show that increasing concave stochastic dominance constraints on the empirical distribution of reward lead to linear constraints on occupation measures. An optimal policy for the resulting class of dominance-constrained MDPs is obtained by solving a linear program. We compute the dual of this linear program to obtain average dynamic programming optimality equations that reflect the dominance constraint. In particular, a new pricing term appears in the optimality equations corresponding to the dominance constraint. We show that many types of stochastic orders can be used in place of the increasing concave stochastic order. We also carry out a parallel development for discounted reward MDPs with stochastic dominance constraints. A portfolio optimization example is used to motivate the paper.
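To see why the dominance constraint becomes linear, a sketch of the resulting LP for the average-reward case may help. The notation here is assumed for exposition, not taken verbatim from the paper: \(\mu(s,a)\) is the state–action occupation measure, \(r(s,a)\) the reward, and \(Y\) the benchmark random variable.

```latex
\begin{align*}
\max_{\mu \ge 0} \quad & \sum_{s,a} r(s,a)\,\mu(s,a) \\
\text{s.t.} \quad & \sum_{a} \mu(s',a) = \sum_{s,a} P(s' \mid s,a)\,\mu(s,a) && \forall s', \\
& \sum_{s,a} \mu(s,a) = 1, \\
& \sum_{s,a} \big(\eta - r(s,a)\big)_+\,\mu(s,a) \le \mathbb{E}\big[(\eta - Y)_+\big] && \forall \eta \in \mathbb{R}.
\end{align*}
```

For each fixed threshold \(\eta\), the last constraint is linear in \(\mu\); it is this family of shortfall constraints that expresses increasing concave (second-order) dominance, and its dual multipliers give the pricing term in the optimality equations.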
SIAM Journal on Control and Optimization | 2015
William B. Haskell; Rahul Jain
In classical Markov decision process (MDP) theory, we search for a policy that, say, minimizes the expected infinite horizon discounted cost. Expectation is, of course, a risk neutral measure, which does not suffice in many applications, particularly in finance. We replace the expectation with a general risk functional, and call such models risk-aware MDP models. We consider minimization of such risk functionals in two cases, the expected utility framework, and conditional value-at-risk, a popular coherent risk measure. Later, we consider risk-aware MDPs wherein the risk is expressed in the constraints. This includes stochastic dominance constraints, and the classical chance-constrained optimization problems. In each case, we develop a convex analytic approach to solve such risk-aware MDPs. In most cases, we show that the problem can be formulated as an infinite-dimensional linear program (LP) in occupation measures when we augment the state space. We provide a discretization method and finite approximations…
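As background, the conditional value-at-risk functional used here admits the Rockafellar–Uryasev variational form, which for an empirical sample reduces to a one-dimensional minimization over the threshold η. A minimal standalone sketch of scalar CVaR, not the MDP formulation itself:

```python
# Sketch: empirical CVaR via the Rockafellar-Uryasev formula
#   CVaR_alpha(X) = min_eta  eta + E[(X - eta)_+] / (1 - alpha).
# For an empirical sample the objective is piecewise linear and convex,
# so the minimum is attained at one of the sample points.
def cvar(losses, alpha):
    """Conditional value-at-risk of a loss sample at level alpha."""
    xs = sorted(losses)
    n = len(xs)
    return min(
        eta + sum(max(x - eta, 0.0) for x in xs) / ((1 - alpha) * n)
        for eta in xs
    )

print(cvar(list(range(1, 11)), 0.8))
```

For losses 1 through 10 and α = 0.8, this is the average of the worst 20% of outcomes, i.e. the mean of 9 and 10.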
Annals of Operations Research | 2013
William B. Haskell; J. George Shanthikumar; Z. Max Shen
We study convex optimization problems with a class of multivariate integral stochastic order constraints defined in terms of parametrized families of increasing concave functions. We show that utility functions act as the Lagrange multipliers of the stochastic order constraints in this general setting, and that the dual problem is a search over utility functions. Practical implementation issues are discussed.
European Journal of Operational Research | 2016
William B. Haskell; Lunce Fu; Maged Dessouky
We consider robust stochastic optimization problems for risk-averse decision makers, where there is ambiguity about both the decision maker’s risk preferences and the underlying probability distribution. We propose and analyze a robust optimization problem that accounts for both types of ambiguity. First, we derive a duality theory for this problem class and identify random utility functions as the Lagrange multipliers. Second, we turn to the computational aspects of this problem. We show how to evaluate our robust optimization problem exactly in some special cases, and then we consider some tractable relaxations for the general case. Finally, we apply our model to both the newsvendor and portfolio optimization problems and discuss its implications.
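To make the distributional-ambiguity side concrete, here is a toy max–min newsvendor over a finite ambiguity set of demand distributions. The prices, candidate distributions, and exhaustive search are illustrative simplifications, not the paper's general model (which also treats ambiguity in risk preferences):

```python
# Sketch: robust newsvendor. Choose the order quantity q maximizing the
# worst-case expected profit over a small set of candidate demand
# distributions (the ambiguity set). All numbers are hypothetical.
PRICE, COST = 5.0, 3.0
CANDIDATES = [                      # each: list of (demand, probability)
    [(5, 0.5), (10, 0.5)],
    [(3, 0.3), (8, 0.7)],
]

def expected_profit(q, dist):
    """Expected profit of ordering q units under one demand distribution."""
    return sum(p * (PRICE * min(q, d) - COST * q) for d, p in dist)

def robust_order():
    """Max-min over a small integer grid of order quantities."""
    return max(
        range(0, 12),
        key=lambda q: min(expected_profit(q, dist) for dist in CANDIDATES),
    )

print(robust_order())
```

The inner minimum over distributions is what the paper's duality theory handles in general, with random utility functions emerging as the Lagrange multipliers.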
Annals of Operations Research | 2017
William B. Haskell; J. George Shanthikumar; Z. Max Shen
We consider stochastic optimization problems with integral stochastic order constraints. This problem class is characterized by an infinite number of constraints indexed by a function space of increasing concave utility functions. We are interested in effective numerical methods and a Lagrangian duality theory. First, we show how sample average approximation and linear programming can be combined to provide a computational scheme for this problem class. Then, we compute the Lagrangian dual problem to gain more insight into this problem class.
SIAM Journal on Optimization | 2017
William B. Haskell; J. George Shanthikumar; Z. Max Shen
Stochastic dominance, a pairwise comparison between random variables, is an effective tool for expressing risk aversion in stochastic optimization. In this paper, we develop a family of primal-dual algorithms for optimization problems with stochastic dominance constraints. First, we develop an offline primal-dual algorithm and bound its optimality gap as a function of the number of iterations. Then, we extend this algorithm to the online setting where only one random sample is given in each decision epoch. We give probabilistic bounds on the optimality gap in this setting. This technique also yields an online algorithm for the stochastic dominance-constrained multiarmed bandit with partial feedback. The paper concludes by discussing a dual approach for a batch learning problem with robust stochastic dominance constraints.
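The pairwise comparison underlying these constraints is simple to state computationally: X dominates Y in the increasing concave (second) order iff the expected shortfall of X is no larger than that of Y at every threshold. A sketch for empirical samples; checking the merged sample grid suffices because each shortfall function is piecewise linear with kinks only at sample points:

```python
# Sketch: empirical second-order (increasing concave) dominance check.
def shortfall(sample, eta):
    """Expected shortfall E[(eta - X)_+] under the empirical distribution."""
    return sum(max(eta - x, 0.0) for x in sample) / len(sample)

def ssd_dominates(X, Y, tol=1e-12):
    """True if X dominates Y in the increasing concave (second) order."""
    grid = sorted(set(X) | set(Y))  # kinks of both shortfall functions
    return all(shortfall(X, eta) <= shortfall(Y, eta) + tol for eta in grid)

# [2, 4] dominates [1, 3]: it has less shortfall at every threshold.
print(ssd_dominates([2.0, 4.0], [1.0, 3.0]))
```

In the primal-dual algorithms of the paper, violated thresholds of exactly this shortfall inequality drive the dual (utility-function) updates.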
Advances in Computing and Communications | 2014
William B. Haskell; Rahul Jain; Dileep M. Kalathil
We propose a simulation-based algorithm, empirical value iteration (EVI), for finding the optimal value function of an MDP with the infinite horizon discounted cost criterion when the transition probability kernels are unknown. Unlike simulation-based algorithms using stochastic approximation techniques, which give only asymptotic convergence results, we give provable, non-asymptotic performance guarantees in terms of sample complexity results: given ε > 0 and δ > 0, we specify the minimum number of simulation samples n(ε, δ) needed in each iteration and the minimum number of iterations t(ε, δ) that are sufficient for EVI to yield, with probability at least 1 - δ, an approximate value function that is within ε of the optimal value function.