Rolando Cavazos-Cadena
Universidad Autónoma Agraria Antonio Narro
Publications
Featured research published by Rolando Cavazos-Cadena.
Operations Research Letters | 1992
Rolando Cavazos-Cadena; Linn I. Sennott
We consider discrete time average cost Markov decision processes with countable state space and finite action sets. Conditions recently proposed by Borkar, Cavazos-Cadena, Weber and Stidham, and Sennott for the existence of an expected average cost optimal stationary policy are compared. The conclusion is that the Sennott conditions are the weakest. We also give an example for which the Sennott axioms hold but the others fail.
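For orientation, the expected average cost criterion that these conditions address is conventionally written (in generic notation, not taken from the paper) as

J(\pi, x) \;=\; \limsup_{n \to \infty} \frac{1}{n}\, \mathbb{E}^{\pi}_{x}\!\left[\sum_{t=0}^{n-1} c(X_t, A_t)\right],

where c is the one-stage cost, X_t the state, and A_t the action at time t; a stationary policy \pi^{*} is average cost optimal when J(\pi^{*}, x) \le J(\pi, x) for every policy \pi and initial state x.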
Annals of Operations Research | 1991
Onésimo Hernández-Lerma; Raúl Montes-de-Oca; Rolando Cavazos-Cadena
This paper describes virtually all the recurrence conditions used heretofore for Markov decision processes with Borel state and action spaces, which include some forms of mixing and contraction properties, Doeblin's condition, Harris recurrence, strong ergodicity, and the existence of bounded solutions to the optimality equation for average reward processes. The aim is to establish (when possible) implications and equivalences between these conditions.
Applied Mathematics and Optimization | 1986
Rolando Cavazos-Cadena
A finite-state iterative scheme introduced by White [9] to approximate the optimal value function of denumerable-state Markov decision processes with bounded rewards is extended to the case of unbounded rewards. Convergence theorems are proved that, when applied to the case of bounded rewards, give stronger results than those in [9]. Moreover, bounds on the rates of convergence under several assumptions are given, and the extended scheme is used to obtain policies with asymptotic optimality properties.
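As a rough illustration of the kind of finite-state approximation involved, the sketch below truncates the denumerable state space to {0, ..., N}, lumps the probability of leaving the truncation onto the boundary state, and runs discounted value iteration on the truncated model; this is a generic scheme under illustrative assumptions, not White's construction or the extension developed in the paper.

```python
import numpy as np

def truncated_value_iteration(P, r, beta, tol=1e-8, max_iter=100_000):
    """Discounted value iteration on a finite truncation of a denumerable-state MDP.

    P    : array of shape (A, N+1, N+1); transition matrices already restricted to
           states 0..N, with the mass of transitions leaving the truncation lumped
           onto state N (an illustrative choice of boundary behaviour).
    r    : array of shape (A, N+1) with one-stage rewards on the truncated model.
    beta : discount factor in (0, 1).
    Returns the approximate optimal value function and a greedy stationary policy.
    """
    v = np.zeros(P.shape[1])
    for _ in range(max_iter):
        # Bellman operator of the truncated model: maximize r(x, a) + beta * E[v(next state)]
        q = r + beta * np.einsum('axy,y->ax', P, v)
        v_new = q.max(axis=0)
        if np.max(np.abs(v_new - v)) < tol:
            v = v_new
            break
        v = v_new
    return v, q.argmax(axis=0)
```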
Systems & Control Letters | 1991
Rolando Cavazos-Cadena
We consider Markov decision processes with denumerable state space and finite control sets; the performance index of a control policy is a long-run expected average cost criterion and the cost function is bounded below. For these models, the existence of average optimal stationary policies was recently established in [11] under very general assumptions. Such a result was obtained via an optimality inequality. Here, we use a simple example to prove that the conditions in [11] do not imply the existence of a solution to the average cost optimality equation.
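In generic notation (not the paper's), the distinction at issue is between the average cost optimality equation

\rho + h(x) \;=\; \min_{a \in A(x)} \Big[ c(x, a) + \sum_{y} p(y \mid x, a)\, h(y) \Big], \qquad x \in S,

and the weaker optimality inequality obtained by replacing "=" with "\ge". The conditions in [11] yield a solution (\rho, h) of the inequality, which already suffices for the existence of an average optimal stationary policy; the example shows that a solution of the equation itself need not exist.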
Applied Mathematics and Optimization | 1989
Rolando Cavazos-Cadena
An average-reward Markov decision process (MDP) with discrete-time parameter, denumerable state space, and bounded reward function is considered. With such a model, we associate a family of MDPs. Then, we determine necessary conditions for the existence of a bounded solution to the optimality equation for each one of the models in the family. Moreover, necessary and sufficient conditions are given so that the optimality equations have a bounded solution with an additional property.
Systems & Control Letters | 1988
Rolando Cavazos-Cadena
We consider average reward Markov decision processes with discrete time parameter and denumerable state space. We are concerned with the following problem: find necessary and sufficient conditions so that, for an arbitrary bounded reward function, the corresponding average reward optimality equation has a bounded solution. This problem is solved for a class of systems including the case in which, under the action of any stationary policy, the state space is an irreducible positive recurrent class.
Annals of Operations Research | 1991
Rolando Cavazos-Cadena
This paper concerns countable state space Markov decision processes endowed with a (long-run expected) average reward criterion. For these models we summarize and, in some cases, extend some recent results on sufficient conditions to establish the existence of optimal stationary policies. The topics considered are the following: (i) the new assumptions introduced by Sennott in [20–23], (ii) necessary and sufficient conditions for the existence of a bounded solution to the optimality equation, and (iii) equivalence of average optimality criteria. Some problems are posed.
Applied Mathematics and Optimization | 1992
Rolando Cavazos-Cadena; Onésimo Hernández-Lerma
We are concerned with Markov decision processes with countable state space and discrete-time parameter. The main structural restriction on the model is the following: under the action of any stationary policy the state space is a communicating class. In this context, we prove the equivalence of ten stability/ergodicity conditions on the transition law of the model, which imply the existence of average optimal stationary policies for an arbitrary continuous and bounded reward function; these conditions include the Lyapunov function condition (LFC) introduced by A. Hordijk. As a consequence of our results, the LFC is proved to be equivalent to the following: under the action of any stationary policy the corresponding Markov chain has a unique invariant distribution which depends continuously on the stationary policy being used. A weak form of the latter condition was used by one of the authors to establish the existence of optimal stationary policies using an approach based on renewal theory.
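Hordijk's Lyapunov function condition belongs to the family of drift conditions; a representative Foster–Lyapunov form (given here only to fix ideas, not the exact statement used in the paper) asks for a function v \ge 1 on the state space, a constant b < \infty, and a finite set F such that

\sum_{y \in S} p(y \mid x, a)\, v(y) \;\le\; v(x) - 1 + b\, \mathbf{1}_{F}(x) \qquad \text{for all } x \in S,\ a \in A(x),

so that, combined with the communicating assumption above, every stationary policy drives the chain back to F and induces a positive recurrent chain.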
IEEE Transactions on Automatic Control | 2000
Rolando Cavazos-Cadena
In this paper stochastic dynamic systems are studied, modeled by a countable state space Markov cost/reward chain, satisfying a Lyapunov-type stability condition. For an infinite planning horizon, risk-sensitive (exponential) discounted and average cost criteria are considered. The main contribution is the development of a vanishing discount approach to relate the discounted criterion problem with the average criterion one, as the discount factor increases to one, i.e., no discounting. In comparison to the well-established risk-neutral case, our results are novel and reveal several fundamental and surprising differences. Other contributions made include the use of convex analytic arguments to obtain appropriately convergent sequences and a verification theorem for the case of unbounded solutions to the average cost Poisson equation arising in the risk-sensitive case. Also of importance is the fact that our developments are very much self-contained and employ only basic probabilistic and analysis principles.
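The risk-sensitive (exponential) average criterion treated here is conventionally written, in generic notation and for a risk-sensitivity parameter \lambda > 0, as

J_{\lambda}(\pi, x) \;=\; \limsup_{n \to \infty} \frac{1}{\lambda n} \log \mathbb{E}^{\pi}_{x}\!\left[\exp\!\Big(\lambda \sum_{t=0}^{n-1} c(X_t, A_t)\Big)\right],

which penalizes cost variability more heavily than the risk-neutral average; the vanishing discount approach studies the analogous discounted functionals and their behaviour as the discount factor tends to one.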
Acta Applicandae Mathematicae | 1990
Onésimo Hernández-Lerma; Rolando Cavazos-Cadena
We consider a class of discrete-time Markov control processes with Borel state and action spaces, and ℝ^d-valued i.i.d. disturbances with unknown distribution μ. Under mild semi-continuity and compactness conditions, and assuming that μ is absolutely continuous with respect to Lebesgue measure, we establish the existence of adaptive control policies which are (1) optimal for the average-reward criterion, and (2) asymptotically optimal in the discounted case. Our results are obtained by taking advantage of some well-known facts in the theory of density estimation. This approach allows us to avoid restrictive conditions on the state space and/or on the system's transition law imposed in recent works, and on the other hand, it clearly shows the way to other applications of nonparametric (density) estimation to adaptive control.
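A minimal sketch of the density estimation step, assuming a Gaussian kernel estimator built from observed i.i.d. disturbances; the function name and bandwidth rule below are illustrative and not taken from the paper.

```python
import numpy as np

def kde_density(samples, bandwidth=None):
    """Gaussian kernel density estimate of an unknown disturbance density on R^d.

    samples : array of shape (n, d) with the observed i.i.d. disturbances.
    Returns a callable f_hat(z) approximating the density at a point z in R^d.
    """
    samples = np.atleast_2d(samples)
    n, d = samples.shape
    if bandwidth is None:
        # Simple rule-of-thumb bandwidth; the estimator used in the paper may differ.
        bandwidth = n ** (-1.0 / (d + 4)) * samples.std(axis=0).mean()

    norm_const = n * (bandwidth ** d) * (2 * np.pi) ** (d / 2)

    def f_hat(z):
        diff = (np.asarray(z) - samples) / bandwidth
        return np.exp(-0.5 * np.sum(diff ** 2, axis=1)).sum() / norm_const

    return f_hat

# Hypothetical usage: estimate the disturbance density from 500 observations, then
# plug the estimate into the adaptive controller in place of the unknown density.
rng = np.random.default_rng(0)
f_hat = kde_density(rng.normal(size=(500, 2)))
print(f_hat([0.0, 0.0]))
```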