Raúl Montes-de-Oca
Universidad Autónoma Metropolitana
Publications
Featured research published by Raúl Montes-de-Oca.
Annals of Operations Research | 1991
Onésimo Hernández-Lerma; Raúl Montes-de-Oca; Rolando Cavazos-Cadena
This paper describes virtually all the recurrence conditions used heretofore for Markov decision processes with Borel state and action spaces, which include some forms of mixing and contraction properties, Doeblin's condition, Harris recurrence, strong ergodicity, and the existence of bounded solutions to the optimality equation for average reward processes. The aim is to establish (when possible) implications and equivalences between these conditions.
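For reference, the average-reward optimality equation mentioned in the abstract has the following standard form; the notation is illustrative and not drawn from the paper itself.

```latex
% Average-reward optimality equation (standard form; notation illustrative).
% g is the optimal average reward, h a bounded bias/relative value function,
% r the one-stage reward, and Q(dy | x, a) the transition law.
g + h(x) = \sup_{a \in A(x)} \left[ r(x,a) + \int_X h(y)\, Q(dy \mid x,a) \right],
\qquad x \in X.
```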
Mathematical Methods of Operations Research | 2004
Daniel Cruz-Suárez; Raúl Montes-de-Oca; Francisco Salem-Silva
This paper presents three conditions, each of which guarantees the uniqueness of optimal policies of discounted Markov decision processes. The conditions presented here impose hypotheses specifically on the state space X, the action space A, the admissible action sets A(x), x∈X, the transition probability Q, and the cost function c. Two of these conditions rely mainly on convexity assumptions; the third does not need assumptions of this kind, but instead requires certain stochastic order relations in Q and requires the cost function c to attain its minimum with respect to the actions at exactly one action. We illustrate the conditions with several examples, including discrete models, the linear regulator problem, and a model of an inventory control system.
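A minimal sketch of how the convexity-type conditions operate, in standard notation (α is the discount factor; the details are illustrative, not the paper's exact hypotheses): if the bracketed map in the discounted optimality equation below is strictly convex in a for each x, its minimizer is unique, which forces the optimal policy to be unique.

```latex
% Discounted optimality equation (standard form; notation illustrative).
V(x) = \min_{a \in A(x)} \Big[ c(x,a) + \alpha \int_X V(y)\, Q(dy \mid x,a) \Big],
\qquad x \in X.
% If a \mapsto c(x,a) + \alpha \int_X V(y)\, Q(dy \mid x,a) is strictly
% convex on A(x), the minimum is attained at a single action f^*(x).
```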
Acta Applicandae Mathematicae | 1996
Raúl Montes-de-Oca; Onésimo Hernández-Lerma
This paper deals with discrete-time Markov control processes with Borel state and control spaces, with possibly unbounded costs and noncompact control constraint sets, and the average cost criterion. Conditions are given for the convergence of the value iteration algorithm to the optimal average cost, and for a sequence of finite-horizon optimal policies to have an accumulation point which is average cost optimal.
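As a toy companion to this result, the following Python sketch runs relative value iteration on a small finite MDP with made-up data; the paper itself works on Borel spaces with possibly unbounded costs, so this only illustrates the algorithm whose convergence is being studied.

```python
import numpy as np

# Relative value iteration on a tiny finite MDP (all data invented).
n_states, n_actions = 3, 2
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[x, a, y]
c = rng.uniform(0.0, 1.0, size=(n_states, n_actions))             # cost c(x, a)

h = np.zeros(n_states)
for _ in range(500):
    Th = (c + np.einsum("xay,y->xa", P, h)).min(axis=1)  # Bellman update
    g = Th[0]              # estimate of the optimal average cost (reference state 0)
    h_new = Th - g         # relative values
    if np.max(np.abs(h_new - h)) < 1e-10:
        break
    h = h_new

print("approximate optimal average cost:", g)
```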
Mathematics of Operations Research | 2003
Rolando Cavazos-Cadena; Raúl Montes-de-Oca
This work concerns discrete-time Markov decision chains with finite state space and bounded costs. The controller has constant risk sensitivity λ, and the performance of a control policy is measured by the corresponding risk-sensitive average cost criterion. Assuming that the optimality equation has a solution, it is shown that the value iteration scheme can be implemented to obtain, in a finite number of steps, (1) an approximation to the optimal λ-sensitive average cost with an error less than a given tolerance, and (2) a stationary policy whose performance index is arbitrarily close to the optimal value. The argument used to establish these results is based on a modification of the original model, which is an extension of a transformation introduced by Schweitzer (1971) to analyze the risk-neutral case.
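A hedged sketch of such a scheme on a finite model: the risk-sensitive Bellman operator below uses the exponential certainty equivalent, and the span of successive differences brackets the optimal λ-sensitive average cost. All model data are invented, and the span stopping rule is the standard one rather than anything taken from the paper.

```python
import numpy as np

# Risk-sensitive value iteration on a finite MDP (illustrative data only).
lam = 0.5                      # risk-sensitivity coefficient (> 0: risk-averse)
n_states, n_actions = 3, 2
rng = np.random.default_rng(1)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
c = rng.uniform(0.0, 1.0, size=(n_states, n_actions))

J = np.zeros(n_states)
for _ in range(2000):
    # Multiplicative Bellman operator: min_a (1/lam) * log E[exp(lam*(c + J))]
    M = np.einsum("xay,y->xa", P, np.exp(lam * J))
    J_new = (lam * c + np.log(M)).min(axis=1) / lam
    diff = J_new - J
    lo, hi = diff.min(), diff.max()   # span brackets the optimal average cost
    J = J_new - J_new[0]              # renormalize to avoid overflow
    if hi - lo < 1e-10:
        break

print(f"optimal risk-sensitive average cost in [{lo:.6f}, {hi:.6f}]")
```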
Mathematical Methods of Operations Research | 1998
Oscar Vega-Amaya; Raúl Montes-de-Oca
We show the existence of an average cost (AC-) optimal policy for an inventory system with uncountable state space; in fact, the AC-optimal cost and an AC-optimal stationary policy are explicitly computed. To do this, we use a variant of the vanishing discount factor approach, which has been intensively studied in recent years, but the available results do not cover the inventory problem we are interested in.
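The vanishing discount idea can be seen numerically on a finite toy model (the paper's inventory system lives on an uncountable state space, so this Python sketch with invented data is purely illustrative): as the discount factor α approaches 1, the normalized value (1 − α)V_α approaches the optimal average cost.

```python
import numpy as np

# Vanishing discount factor approach, illustrated on a finite MDP.
n_states, n_actions = 4, 3
rng = np.random.default_rng(2)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
c = rng.uniform(0.0, 1.0, size=(n_states, n_actions))

def discounted_value(alpha, iters=20000):
    """Discounted value iteration for discount factor alpha."""
    V = np.zeros(n_states)
    for _ in range(iters):
        V = (c + alpha * np.einsum("xay,y->xa", P, V)).min(axis=1)
    return V

for alpha in (0.9, 0.99, 0.999):
    V = discounted_value(alpha)
    print(f"alpha={alpha}: (1 - alpha) * V_alpha(0) = {(1 - alpha) * V[0]:.6f}")
```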
Systems & Control Letters | 1994
Raúl Montes-de-Oca
This paper deals with discrete-time Markov control processes with Borel state space, allowing unbounded costs and noncompact control sets. For these models, the existence of average optimal stationary policies has recently been established under very general assumptions, using an optimality inequality. Here we give a condition, which is a strengthened version of a variant of the ‘vanishing discount factor’ approach, for the optimality equation to hold.
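For contrast, here are the two objects in question in standard notation (illustrative, not quoted from the paper): the average-cost optimality inequality, which suffices for the existence results cited, and the optimality equation, which the strengthened condition is designed to recover.

```latex
% rho is the optimal average cost, h a relative value function.
% Optimality inequality (left) versus optimality equation (right).
\rho + h(x) \;\geq\; \min_{a \in A(x)} \Big[ c(x,a) + \int_X h(y)\, Q(dy \mid x,a) \Big],
\qquad
\rho + h(x) \;=\; \min_{a \in A(x)} \Big[ c(x,a) + \int_X h(y)\, Q(dy \mid x,a) \Big].
```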
Mathematical Methods of Operations Research | 2009
Evgueni Gordienko; Enrique Lemus-Rodríguez; Raúl Montes-de-Oca
We study perturbations of a discrete-time Markov control process on a general state space. The amount of perturbation is measured by means of the Kantorovich distance. We assume that an average (per unit of time on the infinite horizon) optimal control policy can be found for the perturbed (supposedly known) process, and that it is used to control the original (unperturbed) process. The one-stage cost is not assumed to be bounded. Under Lyapunov-like conditions we find upper bounds for the average cost excess when such an approximation is used in place of the optimal (unknown) control policy. As an application of the inequalities obtained, we consider approximation by relevant empirical distributions. We illustrate our results by estimating the stability of a simple autoregressive control process. Examples of unstable processes are also provided.
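On the real line the Kantorovich metric coincides with the 1-Wasserstein distance, so the empirical-approximation mechanism can be illustrated with SciPy's wasserstein_distance; the distribution and sample sizes below are invented for the sketch.

```python
import numpy as np
from scipy.stats import wasserstein_distance

# The Kantorovich distance between a reference law and its empirical
# approximation shrinks as the sample grows, which is what drives the
# empirical-distribution application described above.
rng = np.random.default_rng(3)
reference = rng.normal(0.0, 1.0, size=200_000)   # stand-in for the true noise law

for n in (10, 100, 1_000, 10_000):
    empirical = rng.normal(0.0, 1.0, size=n)     # n observed disturbances
    d = wasserstein_distance(reference, empirical)
    print(f"n={n:>6}: Kantorovich distance ~ {d:.4f}")
```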
Mathematical Methods of Operations Research | 2008
Evgueni Gordienko; Enrique Lemus-Rodríguez; Raúl Montes-de-Oca
We find inequalities to estimate the stability (robustness) of a discounted cost optimization problem for discrete-time Markov control processes on a Borel state space. The one-stage cost is allowed to be unbounded. Unlike the known results in this area, we consider a perturbation of transition probabilities measured by the Kantorovich metric, which is closely related to weak convergence. The results obtained make it possible to estimate the vanishing rate of the stability index when the approximation is made through empirical measures.
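A typical shape of such a stability estimate, stated only to illustrate the kind of inequality involved (the constants and the paper's exact statement differ):

```latex
% Illustrative only: the excess discounted cost of using the policy
% \tilde{f} that is optimal for the perturbed transition law \tilde{Q}
% is controlled by the Kantorovich distance W between the laws.
\Delta(x) \;:=\; V_\alpha(x, \tilde{f}) - V_\alpha^*(x)
\;\leq\; K \,\sup_{(x,a)} W\big(Q(\cdot \mid x,a),\, \tilde{Q}(\cdot \mid x,a)\big).
```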
Journal of Optimization Theory and Applications | 2014
Rolando Cavazos-Cadena; Raúl Montes-de-Oca; Karel Sladký
This note deals with Markov decision chains evolving on a denumerable state space. Under standard continuity-compactness requirements, an explicit example is provided to show that, with respect to a strong sample-path average reward criterion, the Lyapunov function condition does not ensure the existence of an optimal stationary policy.
Archive | 2012
Rolando Cavazos-Cadena; Raúl Montes-de-Oca
This work concerns discrete-time average Markov decision chains on a denumerable state space. Besides standard continuity-compactness requirements, the main structural condition on the model is that the cost function has a Lyapunov function l and that a power larger than two of l also admits a Lyapunov function. In this context, the existence of optimal stationary policies in the (strong) sample-path sense is established, and it is shown that the Markov policies obtained from methods commonly used to approximate a solution of the optimality equation are also sample-path average optimal.
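One common drift form of a Lyapunov function condition for the cost (illustrative; the paper's precise hypotheses may differ) reads as follows, with l ≥ 1 and a distinguished state z:

```latex
% Hordijk-style Lyapunov function condition (illustrative form):
% the one-stage cost plus the expected value of \ell off the state z
% is dominated by \ell at the current state.
c(x,a) + \sum_{y \neq z} p(y \mid x,a)\, \ell(y) \;\leq\; \ell(x),
\qquad (x,a) \text{ admissible}.
```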