Onésimo Hernández-Lerma

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Onésimo Hernández-Lerma is active.

Explore More

Publication

Featured researches published by Onésimo Hernández-Lerma.

Journal of the American Statistical Association | 1996

Discrete-time Markov control processes : basic optimality criteria

Onésimo Hernández-Lerma; Jean B. Lasserre

1 Introduction and Summary.- 1.1 Introduction.- 1.2 Markov control processes.- 1.3 Preliminary examples.- 1.4 Summary of the following chapters.- 2 Markov Control Processes.- 2.1 Introduction.- 2.2 Markov control processes.- 2.3 Markov policies and the Markov property.- 3 Finite-Horizon Problems.- 3.1 Introduction.- 3.2 Dynamic programming.- 3.3 The measurable selection condition.- 3.4 Variants of the DP equation.- 3.5 LQ problems.- 3.6 A consumption-investment problem.- 3.7 An inventory-production system.- 4 Infinite-Horizon Discounted-Cost Problems.- 4.1 Introduction.- 4.2 The discounted-cost optimality equation.- 4.3 Complements to the DCOE.- 4.4 Policy iteration and other approximations.- 4.5 Further optimality criteria.- 4.6 Asymptotic discount optimality.- 4.7 The discounted LQ problem.- 4.8 Concluding remarks.- 5 Long-Run Average-Cost Problems.- 5.1 Introduction.- 5.2 Canonical triplets.- 5.3 The vanishing discount approach.- 5.4 The average-cost optimality inequality.- 5.5 The average-cost optimality equation.- 5.6 Value iteration.- 5.7 Other optimality results.- 5.8 Concluding remarks.- 6 The Linear Programming Formulation.- 6.1 Introduction.- 6.2 Infinite-dimensional linear programming.- 6.3 Discounted cost.- 6.4 Average cost: preliminaries.- 6.5 Average cost: solvability.- 6.6 Further remarks.- Appendix A Miscellaneous Results.- Appendix B Conditional Expectation.- Appendix C Stochastic Kernels.- Appendix D Multifunctions and Selectors.- Appendix E Convergence of Probability Measures.- References.

European Journal of Operational Research | 2001

Adaptive Markov Control Processes

Onésimo Hernández-Lerma

1 Controlled Markov Processes.- 1.1 Introduction.- 1.2 Stochastic Control Problems.- Control Models.- Policies.- Performance Criteria.- Control Problems.- 1.3 Examples.- An Inventory/Production System.- Control of Water Reservoirs.- Fisheries Management.- Nonstationary MCMs.- Semi-Markov Control Models.- 1.4 Further Comments.- 2 Discounted Reward Criterion.- 2.1 Introduction.- Summary.- 2.2 Optimality Conditions.- Continuity of ?*.- 2.3 Asymptotic Discount Optimality.- 2.4 Approximation of MCMs.- Nonstationary Value-Iteration.- Finite-State Approximations.- 2.5 Adaptive Control Models.- Preliminaries.- Nonstationary Value-Iteration.- The Principle of Estimation and Control.- Adaptive Policies.- 2.6 Nonparametric Adaptive Control.- The Parametric Approach.- New Setting.- The Empirical Distribution Process.- Nonparametric Adaptive Policies.- 2.7 Comments and References.- 3 Average Reward Criterion.- 3.1 Introduction.- Summary.- 3.2 The Optimality Equation.- 3.3 Ergodicity Conditions.- 3.4 Value Iteration.- Uniform Approximations.- Successive Averagings.- 3.5 Approximating Models.- 3.6 Nonstationary Value Iteration.- Nonstationary Successive Averagings.- Discounted-Like NVI.- 3.7 Adaptive Control Models.- Preliminaries.- The Principle of Estimation and Control (PEC).- Nonstationary Value Iteration (NVI).- 3.8 Comments and References.- 4 Partially Observable Control Models.- 4.1 Introduction.- Summary.- 4.2 PO-CM: Case of Known Parameters.- The PO Control Problem.- 4.3 Transformation into a CO Control Problem.- I-Policies.- The New Control Model.- 4.4 Optimal I-Policies.- 4.5 PO-CMs with Unknown Parameters.- PEC and NVI I-Policies.- 4.6 Comments and References.- 5 Parameter Estimation in MCMs.- 5.1 Introduction.- Summary.- 5.2 Contrast Functions.- 5.3 Minimum Contrast Estimators.- 5.4 Comments and References.- 6 Discretization Procedures.- 6.1 Introduction.- Summary.- 6.2 Preliminaries.- 6.3 The Non-Adaptive Case.- A Non-Recursive Procedure.- A Recursive Procedure.- 6.4 Adaptive Control Problems.- Preliminaries.- Discretization of the PEC Adaptive Policy.- Discretization of the NVI Adaptive Policy.- 6.5 Proofs.- The Non-Adaptive Case.- The Adaptive Case.- 6.6 Comments and References.- Appendix A. Contraction Operators.- Appendix B. Probability Measures.- Total Variation Norm.- Weak Convergence.- Appendix C. Stochastic Kernels.- Appendix D. Multifunctions and Measurable Selectors.- The Hausdorff Metric.- Multifunctions.- References.- Author Index.

Archive | 2003

Markov Chains and Invariant Probabilities

Onésimo Hernández-Lerma; Jean B. Lasserre

1 Preliminaries.- 1.1 Introduction.- 1.2 Measures and Functions.- 1.3 Weak Topologies.- 1.4 Convergence of Measures.- 1.5 Complements.- 1.6 Notes.- I Markov Chains and Ergodicity.- 2 Markov Chains and Ergodic Theorems.- 2.1 Introduction.- 2.2 Basic Notation and Definitions.- 2.3 Ergodic Theorems.- 2.4 The Ergodicity Property.- 2.5 Pathwise Results.- 2.6 Notes.- 3 Countable Markov Chains.- 3.1 Introduction.- 3.2 Classification of States and Class Properties.- 3.3 Limit Theorems.- 3.4 Notes.- 4 Harris Markov Chains.- 4.1 Introduction.- 4.2 Basic Definitions and Properties.- 4.3 Characterization of Harris recurrence.- 4.4 Sufficient Conditions for P.H.R.- 4.5 Harris and Doeblin Decompositions.- 4.6 Notes.- 5 Markov Chains in Metric Spaces.- 5.1 Introduction.- 5.2 The Limit in Ergodic Theorems.- 5.3 Yosidas Ergodic Decomposition.- 5.4 Pathwise Results.- 5.5 Proofs.- 5.6 Notes.- 6 Classification of Markov Chains via Occupation Measures.- 6.1 Introduction.- 6.2 A Classification.- 6.3 On the Birkhoff Individual Ergodic Theorem.- 6.4 Notes.- II Further Ergodicity Properties.- 7 Feller Markov Chains.- 7.1 Introduction.- 7.2 Weak-and Strong-Feller Markov Chains.- 7.3 Quasi Feller Chains.- 7.4 Notes.- 8 The Poisson Equation.- 8.1 Introduction.- 8.2 The Poisson Equation.- 8.3 Canonical Pairs.- 8.4 The Cesaro-Averages Approach.- 8.5 The Abelian Approach.- 8.6 Notes.- 9 Strong and Uniform Ergodicity.- 9.1 Introduction.- 9.2 Strong and Uniform Ergodicity.- 9.3 Weak and Weak Uniform Ergodicity.- 9.4 Notes.- III Existence and Approximation of Invariant Probability Measures.- 10 Existence of Invariant Probability Measures.- 10.1 Introduction and Statement of the Problems.- 10.2 Notation and Definitions.- 10.3 Existence Results.- 10.4 Markov Chains in Locally Compact Separable Metric Spaces.- 10.5 Other Existence Results in Locally Compact Separable Metric Spaces.- 10.6 Technical Preliminaries.- 10.7 Proofs.- 10.8 Notes.- 11 Existence and Uniqueness of Fixed Points for Markov Operators.- 11.1 Introduction and Statement of the Problems.- 11.2 Notation and Definitions.- 11.3 Existence Results.- 11.4 Proofs.- 11.5 Notes.- 12 Approximation Procedures for Invariant Probability Measures.- 12.1 Introduction.- 12.2 Statement of the Problem and Preliminaries.- 12.3 An Approximation Scheme.- 12.4 A Moment Approach for a Special Class of Markov Chains.- 12.5 Notes.

Archive | 2009

Continuous-Time Markov Decision Processes

Xianping Guo; Onésimo Hernández-Lerma

In Chap. 2, we formally introduce the concepts associated to a continuous time MDP. Namely, the basic model of continuous-time MDPs and the concept of a Markov policy are stated in precise terms in Sect. 2.2. We also give, in Sect. 2.3, a precise definition of state and action processes in continuous-time MDPs, together with some fundamental properties of these two processes. Then, in Sect. 2.4, we introduce the basic optimality criteria that we are interested in.

Top | 2006

A Survey of Recent Results on Continuous-Time Markov Decision Processes

Xianping Guo; Onésimo Hernández-Lerma; Tomás Prieto-Rumeau; Xi-Ren Cao; Junyu Zhang; Qiying Hu; Mark E. Lewis; Ricardo Vélez

This paper is a survey of recent results on continuous-time Markov decision processes (MDPs) withunbounded transition rates, and reward rates that may beunbounded from above and from below. These results pertain to discounted and average reward optimality criteria, which are the most commonly used criteria, and also to more selective concepts, such as bias optimality and sensitive discount criteria. For concreteness, we consider only MDPs with a countable state space, but we indicate how the results can be extended to more general MDPs or to Markov games.

Siam Journal on Optimization | 1998

Approximation Schemes for Infinite Linear Programs

Onésimo Hernández-Lerma; Jean B. Lasserre

This paper presents approximation schemes for an infinite linear program. In particular, it is shown that, under suitable assumptions, the programs optimum value can be approximated by the values of finite-dimensional linear programs, and that, in addition, every accumulation point of a sequence of optimal solutions for the approximating programs is an optimal solution for the original problem.

Siam Journal on Control and Optimization | 2003

Constrained Average Cost Markov Control Processes in Borel Spaces

Onésimo Hernández-Lerma; Juan González-Hernández; Raquiel R. López-Martínez

This paper considers constrained Markov control processes in Borel spaces, with unbounded costs. The criterion to be minimized is a long-run expected average cost, and the constraints can be imposed on similar average costs, or on average rewards, or discounted costs or rewards. We give conditions under which the constrained problem (CP) is solvable and equivalent to an equality constrained (EC) linear program. Furthermore, we show that there is no duality gap between EC and the dual program EC* and that in fact the strong duality condition holds. Finally, we introduce an explicit procedure to solve CP in some cases which is illustrated with a detailed example.

Siam Journal on Control and Optimization | 2002

Minimax Control of Discrete-Time Stochastic Systems

Jesus Gonzalez-Trejo; Onésimo Hernández-Lerma; Luis F. Hoyos-Reyes

This paper gives a unified, self-contained presentation of minimax control problems for discrete-time stochastic systems on Borel spaces, with possibly unbounded costs. The main results include conditions for the existence of minimax strategies for finite-horizon problems and infinite-horizon discounted and undiscounted (average) cost criteria. The results are specialized to control systems with unknown disturbance distributions---also known as games against nature. Two examples illustrate the theory, one of them on the mold level control problem, which is a key problem in the steelmaking industry.

IEEE Transactions on Automatic Control | 1990

Error bounds for rolling horizon policies in discrete-time Markov control processes

Onésimo Hernández-Lerma; Jean B. Lasserre

Error bounds are presented for rolling horizon (RH) policies in general, stationary and nonstationary, (Borel) Markov control problems with both discounted and average reward criteria. In each of these cases, conditions are given under which the reward of the rolling horizon policy converges geometrically to the optimal reward function, uniformly in the initial state, as the length of the rolling horizon increases. A description of the control model and the general assumptions are given. The approach is based on extending the results of J.M. Alden and A.R.L. Smith (1988) on nonstationary processes with finite state and action spaces. However the proofs presented are simpler. This is because, when stationary models are analyzed first, the error bounds follow more or less directly from well-known value iteration results. The corresponding error bounds for nonstationary models are obtained by reducing these models to stationary ones. >

Siam Journal on Control and Optimization | 1999

Sample-Path Optimality and Variance-Minimization of Average Cost Markov Control Processes

Onésimo Hernández-Lerma; Oscar Vega-Amaya; Guadalupe Carrasco

This paper studies several average-cost criteria for Markov control processes on Borel spaces with possibly unbounded costs. Under suitable hypotheses we show (i) the existence of a sample-path average cost (SPAC-) optimal stationary policy; (ii) a stationary policy is SPAC-optimal if and only if it is expected average cost (EAC-) optimal; and (iii) within the class of stationary SPAC-optimal (equivalently, EAC-optimal) policies there exists one with a minimal limiting average variance.

Explore More