Is this you? Create Your Porfile

John N. Tsitsiklis

Massachusetts Institute of Technology

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where John N. Tsitsiklis is active.

Explore More

Publication

Featured researches published by John N. Tsitsiklis.

IEEE Transactions on Automatic Control | 1986

Distributed asynchronous deterministic and stochastic gradient optimization algorithms

John N. Tsitsiklis; Dimitri P. Bertsekas; Michael Athans

We present a model for asynchronous distributed computation and then proceed to analyze the convergence of natural asynchronous distributed versions of a large class of deterministic and stochastic gradient-like algorithms. We show that such algorithms retain the desirable convergence properties of their centralized counterparts, provided that the time between consecutive interprocessor communications and the communication delays are not too large.

Mathematics of Operations Research | 1987

The complexity of Markov decision processes

Christos H. Papadimitriou; John N. Tsitsiklis

We investigate the complexity of the classical problem of optimal policy computation in Markov decision processes. All three variants of the problem finite horizon, infinite horizon discounted, and infinite horizon average cost were known to be solvable in polynomial time by dynamic programming finite horizon problems, linear programming, or successive approximation techniques infinite horizon. We show that they are complete for P, and therefore most likely cannot be solved by highly parallel algorithms. We also show that, in contrast, the deterministic cases of all three problems can be solved very fast in parallel. The version with partially observed states is shown to be PSPACE-complete, and thus even less likely to be solved in polynomial time than the NP-complete problems; in fact, we show that, most likely, it is not possible to have an efficient on-line implementation involving polynomial time on-line computations and memory of an optimal policy, even if an arbitrary amount of precomputation is allowed. Finally, the variant of the problem in which there are no observations is shown to be NP-complete.

IEEE Transactions on Automatic Control | 1997

An analysis of temporal-difference learning with function approximation

John N. Tsitsiklis; B. Van Roy

We present new results about the temporal-difference learning algorithm, as applied to approximating the cost-to-go function of a Markov chain using linear function approximators. The algorithm we analyze performs on-line updating of a parameter vector during a single endless trajectory of an aperiodic irreducible finite state Markov chain. Results include convergence (with probability 1), a characterization of the limit of convergence, and a bound on the resulting approximation error. In addition to establishing new and stronger results than those previously available, our analysis is based on a new line of reasoning that provides new intuition about the dynamics of temporal-difference learning. Furthermore, we discuss the implications of two counter-examples with regards to the Significance of on-line updating and linearly parameterized function approximators.

IEEE Transactions on Automatic Control | 1995

Efficient algorithms for globally optimal trajectories

John N. Tsitsiklis

We present serial and parallel algorithms for solving a system of equations that arises from the discretization of the Hamilton-Jacobi equation associated to a trajectory optimization problem of the following type. A vehicle starts at a prespecified point x/sub o/ and follows a unit speed trajectory x(t) inside a region in /spl Rscr//sup m/ until an unspecified time T that the region is exited. A trajectory minimizing a cost function of the form /spl int//sub 0//sup T/ r(x(t))dt+q(x(T)) is sought. The discretized Hamilton-Jacobi equation corresponding to this problem is usually solved using iterative methods. Nevertheless, assuming that the function r is positive, we are able to exploit the problem structure and develop one-pass algorithms for the discretized problem. The first algorithm resembles Dijkstras shortest path algorithm and runs in time O(n log n), where n is the number of grid points. The second algorithm uses a somewhat different discretization and borrows some ideas from a variation of Dials shortest path algorithm (1969) that we develop here; it runs in time O(n), which is the best possible, under some fairly mild assumptions. Finally, we show that the latter algorithm can be efficiently parallelized: for two-dimensional problems and with p processors, its running time becomes O(n/p), provided that p=O(/spl radic/n/log n). >

conference on decision and control | 2005

Convergence in Multiagent Coordination, Consensus, and Flocking

Vincent D. Blondel; Julien M. Hendrickx; Alex Olshevsky; John N. Tsitsiklis

We discuss an old distributed algorithm for reaching consensus that has received a fair amount of recent attention. In this algorithm, a number of agents exchange their values asynchronously and form weighted averages with (possibly outdated) values possessed by their neighbors. We overview existing convergence results, and establish some new ones, for the case of unbounded intercommunication intervals.

Automatica | 2000

Survey A survey of computational complexity results in systems and control

Vincent D. Blondel; John N. Tsitsiklis

The purpose of this paper is twofold: (a) to provide a tutorial introduction to some key concepts from the theory of computational complexity, highlighting their relevance to systems and control theory, and (b) to survey the relatively recent research activity lying at the interface between these fields. We begin with a brief introduction to models of computation, the concepts of undecidability, polynomial-time algorithms, NP-completeness, and the implications of intractability results. We then survey a number of problems that arise in systems and control theory, some of them classical, some of them related to current research. We discuss them from the point of view of computational complexity and also point out many open problems. In particular, we consider problems related to stability or stabilizability of linear systems with parametric uncertainty, robust control, time-varying linear systems, nonlinear and hybrid systems, and stochastic optimal control.

Machine Learning | 1994

Asynchronous Stochastic Approximation and Q-Learning

John N. Tsitsiklis

We provide some general results on the convergence of a class of stochastic approximation algorithms and their parallel and asynchronous variants. We then use these results to study the Q-learning algorithm, a reinforcement learning method for solving Markov decision problems, and establish its convergence under conditions more general than previously available.

conference on decision and control | 2008

On distributed averaging algorithms and quantization effects

Angelia Nedic; Alex Olshevsky; Asuman E. Ozdaglar; John N. Tsitsiklis

We consider distributed iterative algorithms for the averaging problem over time-varying topologies. Our focus is on the convergence time of such algorithms when complete (unquantized) information is available, and on the degradation of performance when only quantized information is available. We study a large and natural class of averaging algorithms, which includes the vast majority of algorithms proposed to date, and provide tight polynomial bounds on their convergence time. We also describe an algorithm within this class whose convergence time is the best among currently available averaging algorithms for time-varying topologies. We then propose and analyze distributed averaging algorithms under the additional constraint that agents can only store and communicate quantized information, so that they can only converge to the average of the initial values of the agents within some error. We establish bounds on the error and tight bounds on the convergence time, as a function of the number of quantization levels.

Mathematics of Operations Research | 1991

An Analysis of Stochastic Shortest Path Problems

Dimitri P. Bertsekas; John N. Tsitsiklis

We consider a stochastic version of the classical shortest path problem whereby for each node of a graph, we must choose a probability distribution over the set of successor nodes so as to reach a certain destination node with minimum expected cost. The costs of transition between successive nodes can be positive as well as negative. We prove natural generalizations of the standard results for the deterministic shortest path problem, and we extend the corresponding theory for undiscounted finite state Markovian decision problems by removing the usual restriction that costs are either all nonnegative or all nonpositive.

Siam Journal on Control and Optimization | 2003

On Actor-Critic Algorithms

Vijay R. Konda; John N. Tsitsiklis

In this article, we propose and analyze a class of actor-critic algorithms. These are two-time-scale algorithms in which the critic uses temporal difference learning with a linearly parameterized approximation architecture, and the actor is updated in an approximate gradient direction, based on information provided by the critic. We show that the features for the critic should ideally span a subspace prescribed by the choice of parameterization of the actor. We study actor-critic algorithms for Markov decision processes with Polish state and action spaces. We state and prove two results regarding their convergence.

Explore More