
Vivek S. Borkar

Tata Institute of Fundamental Research

Publication

Featured research published by Vivek S. Borkar.

SIAM Journal on Control and Optimization | 2000

The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning

Vivek S. Borkar; Sean P. Meyn

It is shown here that stability of the stochastic approximation algorithm is implied by the asymptotic stability of the origin for an associated ODE. This in turn implies convergence of the algorithm. Several specific classes of algorithms are considered as applications. It is found that the results provide (i) a simpler derivation of known results for reinforcement learning algorithms; (ii) a proof for the first time that a class of asynchronous stochastic approximation algorithms are convergent without using any a priori assumption of stability; (iii) a proof for the first time that asynchronous adaptive critic and Q-learning algorithms are convergent for the average cost optimal control problem.
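As a toy illustration of the link between a stochastic approximation recursion and its associated ODE, the sketch below runs the iteration x_{n+1} = x_n + a_n (h(x_n) + noise) with decreasing step sizes a_n = 1/(n+1). The mean field h(x) = THETA - x, the constant `THETA`, the Gaussian noise model, and the function names are assumptions chosen for illustration and are not taken from the paper. The ODE dx/dt = THETA - x has a globally asymptotically stable equilibrium at THETA, so the noisy iterates would be expected to settle near it.

```python
import random


def stochastic_approximation(h, x0, num_steps, noise_std=1.0, seed=0):
    """Run x_{n+1} = x_n + a_n * (h(x_n) + noise_n) with a_n = 1 / (n + 1)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    x = x0
    for n in range(num_steps):
        step = 1.0 / (n + 1)                # decreasing step size a_n
        noise = rng.gauss(0.0, noise_std)   # zero-mean noise (assumed Gaussian)
        x = x + step * (h(x) + noise)
    return x


THETA = 2.0  # hypothetical equilibrium of the associated ODE


def mean_field(x):
    """Hypothetical mean field h(x); the associated ODE is dx/dt = THETA - x."""
    return THETA - x


estimate = stochastic_approximation(mean_field, x0=10.0, num_steps=20000)
print(f"final iterate: {estimate:.4f} (ODE equilibrium: {THETA})")
```

With this particular step size the recursion reduces to a running average of noisy observations of THETA, which is why the initial condition is forgotten after the first step.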

Conference on Decision and Control | 2006

A New Distributed Time Synchronization Protocol for Multihop Wireless Networks

Roberto Solis; Vivek S. Borkar; P. R. Kumar

A distributed algorithm to achieve accurate time synchronization in large multihop wireless networks is presented. The central idea is to exploit the large number of global constraints that have to be satisfied by a common notion of time in a multihop network. If, at a certain instant, $O_{ij}$ is the clock offset between two neighboring nodes $i$ and $j$, then for any loop $i_1, i_2, i_3, \ldots, i_n, i_{n+1} = i_1$ in the multihop network, these offsets must satisfy the global constraint $\sum_{k=1}^{n} O_{i_k, i_{k+1}} = 0$. Noisy estimates $\hat{O}_{ij}$ of $O_{ij}$ are usually arrived at by bilateral exchanges of timestamped messages or local broadcasts. By imposing the large number of global constraints for all the loops in the multihop network, these estimates can be smoothed and made more accurate. A fully distributed and asynchronous algorithm which functions by simple local broadcasts is designed. Changing the time reference node for synchronization is also easy, consisting simply of one node switching on adaptation, and another switching it off. Implementation results on a forty node network, and comparative evaluation against a leading algorithm, are presented.
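One way to see how loop constraints can smooth noisy offsets is to assign each node a clock value and fit those values to the pairwise estimates by least squares; offsets derived from node values then sum to zero around every loop automatically. The sketch below does this with sequential sweeps in which each non-reference node averages what its neighbors imply for it. It is a centralized toy version under assumptions, not the paper's asynchronous protocol: the function `smooth_offsets`, the sign convention (an offset on edge (i, j) estimates clock_j - clock_i), and the triangle data are all hypothetical.

```python
def smooth_offsets(num_nodes, noisy_offsets, reference=0, sweeps=200):
    """Fit node values v to noisy pairwise offsets by repeated local averaging.

    noisy_offsets maps an edge (i, j) to a noisy estimate of v[j] - v[i]
    (sign convention assumed for this sketch). The reference node stays at 0.
    """
    neighbors = {i: [] for i in range(num_nodes)}
    for (i, j), offset in noisy_offsets.items():
        neighbors[i].append((j, -offset))  # seen from i: v[i] ~= v[j] - offset
        neighbors[j].append((i, offset))   # seen from j: v[j] ~= v[i] + offset

    v = [0.0] * num_nodes
    for _ in range(sweeps):
        for i in range(num_nodes):
            if i == reference or not neighbors[i]:
                continue
            # Average the values that each neighbor implies for node i.
            v[i] = sum(v[j] + delta for j, delta in neighbors[i]) / len(neighbors[i])
    return v


# Hypothetical triangle network with inconsistent measurements:
# around the loop 0 -> 1 -> 2 -> 0 the raw offsets do not sum to zero.
measured = {(0, 1): 5.3, (1, 2): 6.8, (0, 2): 12.2}
raw_loop_sum = measured[(0, 1)] + measured[(1, 2)] - measured[(0, 2)]

v = smooth_offsets(3, measured)
smoothed = {(i, j): v[j] - v[i] for (i, j) in measured}
smoothed_loop_sum = smoothed[(0, 1)] + smoothed[(1, 2)] - smoothed[(0, 2)]
print("raw loop sum:", raw_loop_sum)
print("smoothed loop sum:", smoothed_loop_sum)
```

In this sketch, switching the reference node only changes which entry of `v` is pinned to zero; the smoothed pairwise offsets are unaffected.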

International Conference on Computer Communications | 2009

A Theory of QoS for Wireless

I-Hong Hou; Vivek S. Borkar; P. R. Kumar

Wireless networks are increasingly used to carry applications with QoS constraints. Two problems arise when dealing with traffic with QoS constraints. One is admission control, which consists of determining whether it is possible to fulfill the demands of a set of clients. The other is finding an optimal scheduling policy to meet the demands of all clients. In this paper, we propose a framework for jointly addressing three QoS criteria: delay, delivery ratio, and channel reliability. We analytically prove the necessary and sufficient condition for a set of clients to be feasible with respect to the above three criteria. We then establish an efficient algorithm for admission control to decide whether a set of clients is feasible. We further propose two scheduling policies and prove that they are feasibility optimal in the sense that they can meet the demands of every feasible set of clients. In addition, we show that these policies are easily implementable on the IEEE 802.11 mechanisms. We also present the results of simulation studies that appear to confirm the theoretical studies and suggest that the proposed policies outperform others tested under a variety of settings.

Conference on Decision and Control | 1996

ε-approximation of differential inclusions

Anuj Puri; Vivek S. Borkar; Pravin Varaiya

For a Lipschitz differential inclusion $\dot{x} \in f(x)$, we give a method to compute an arbitrarily close approximation of $\mathrm{Reach}_f(X_0, t)$, the set of states reached after time $t$ starting from an initial set $X_0$. For a differential inclusion $\dot{x} \in f(x)$, and any $\epsilon > 0$, we define a finite sample graph $A^\epsilon$. Every trajectory $\phi$ of the differential inclusion $\dot{x} \in f(x)$ is also a “trajectory” in $A^\epsilon$. And every “trajectory” $\eta$ of $A^\epsilon$ has the property that $\mathrm{dist}(\dot{\eta}(t), f(\eta(t))) \le \epsilon$. Using this, we can compute the $\epsilon$-invariant sets of the differential inclusion, that is, the sets that remain invariant under $\epsilon$-perturbations in $f$.

Mathematics of Operations Research | 2002

Risk-Sensitive Optimal Control for Markov Decision Processes with Monotone Cost

Vivek S. Borkar; Sean P. Meyn

The existence of an optimal feedback law is established for the risk-sensitive optimal control problem with denumerable state space. The main assumptions imposed are irreducibility and a near-monotonicity condition on the one-step cost function. A solution can be found constructively using either value iteration or policy iteration under suitable conditions on the initial feedback law.

Operations Research | 2006

Adaptive Importance Sampling Technique for Markov Chains Using Stochastic Approximation

T. P. I. Ahamed; Vivek S. Borkar; Sandeep Juneja

For a discrete-time finite-state Markov chain, we develop an adaptive importance sampling scheme to estimate the expected total cost before hitting a set of terminal states. This scheme updates the change of measure at every transition using constant or decreasing step-size stochastic approximation. The updates are shown to concentrate asymptotically in a neighborhood of the desired zero-variance estimator. Through simulation experiments on simple Markovian queues, we observe that the proposed technique performs very well in estimating performance measures related to rare events associated with queue lengths exceeding prescribed thresholds. We include performance comparisons of the proposed algorithm with existing adaptive importance sampling algorithms on some examples. We also discuss the extension of the technique to estimate the infinite horizon expected discounted cost and the expected average cost.

SIAM Journal on Control and Optimization | 1989

Control of Markov chains with long-run average cost criterion: the dynamic programming equations

Vivek S. Borkar

The long-run average cost control problem for discrete time Markov chains on a countable state space is studied in a very general framework. Necessary and sufficient conditions for optimality in terms of the dynamic programming equations are given when an optimal stable stationary strategy is known to exist (e.g., for the situations studied in [Stochastic Differential Systems, Stochastic Control Theory and Applications, IMA Vol. Math. App. 10, Springer-Verlag, New York, Berlin, 1988, pp. 57–77]). A characterization of the desired solution of the dynamic programming equations is given in a special case. Also included is a novel convex analytic argument for deducing the existence of an optimal stable stationary strategy when that of a randomized one is known.

SIAM Journal on Control and Optimization | 1988

Ergodic control of multidimensional diffusions 1: the existence results

Vivek S. Borkar; Mrinal K. Ghosh

The existence of optimal stable Markov relaxed controls for the ergodic control of multidimensional diffusions is established by direct probabilistic methods based on a characterization of a.s. limit sets of empirical measures. The optimality of the above is established in the strong (i.e., almost sure) sense among all admissible controls under very general conditions.

International Conference on Computer Communications | 2008

Index Policies for Real-Time Multicast Scheduling for Wireless Broadcast Systems

Vivek Raghunathan; Vivek S. Borkar; Min Cao; P. R. Kumar

Motivated by the increasing usage of wireless broadcast networks for multicast real-time applications like video, this paper considers a canonical real-time multicast scheduling problem for a wireless broadcast LAN. A wireless access point (AP) has $N$ latency-sensitive flows, each associated with a deadline and a multicast group of receivers that desire to receive all the packets successfully by their corresponding deadlines. We consider periodic and one-shot models of real-time arrivals. The channel from the AP to each receiver is a wireless erasure channel, independent across users and slots. We wish to find a communication strategy that minimizes the total deadlines missed across all receivers, where a receiver counts a miss if it does not receive a packet by its deadline. We cast this problem as a restless bandit in stochastic control. We use Whittle's relaxation framework for restless bandits to establish Whittle-indexability for multicast real-time scheduling under the assumption of complete feedback from all receivers in every slot. For the Whittle relaxation, we show that for each flow, the AP's decision between transmitting in a slot and idling has a threshold structure. For the homogeneous case where the erasure channel to each receiver is identically distributed with parameter $p$, the Whittle index of a flow is $x_i(1 - p)$, where $x_i$ is the number of receivers who have yet to receive the current packet of flow $i$. For the general heterogeneous case in which the erasure channel to receiver $j$ has loss probability $p_j$, the Whittle index corresponding to each flow is $\sum_j (1 - p_j)$, where the sum is over all multicast receivers who are yet to receive the packet. We bound the performance of the optimal Whittle relaxation with respect to the optimal wireless multicast real-time scheduler. The heuristic index policy that schedules the flow with the maximum Whittle index in each slot is simple. To relax the complete feedback assumption, we design a scalable mechanism based on statistical estimation theory that obtains the required feedback from all the receivers using a single ACK per packet transmission. The resultant policy is amenable to low-complexity implementation.
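The index rule stated in the abstract is simple enough to sketch directly: a flow's index is the sum of (1 - p_j) over the receivers still waiting for its current packet, and the heuristic policy transmits the flow with the largest index. The sketch below assumes complete feedback (the pending set of every flow is known in each slot); the function names, flow labels, and loss probabilities are hypothetical and serve only as an illustration.

```python
def whittle_index(pending_receivers, loss_prob):
    """Sum of (1 - p_j) over receivers that have not yet received the packet."""
    return sum(1.0 - loss_prob[r] for r in pending_receivers)


def pick_flow(pending_by_flow, loss_prob):
    """Index policy: choose the flow with the maximum index in this slot."""
    return max(
        pending_by_flow,
        key=lambda flow: whittle_index(pending_by_flow[flow], loss_prob),
    )


# Hypothetical heterogeneous example: per-receiver erasure probabilities p_j.
loss_prob = {"a": 0.1, "b": 0.5, "c": 0.9, "d": 0.2}
pending_by_flow = {
    "video": {"a", "b"},  # index (1 - 0.1) + (1 - 0.5)
    "audio": {"c", "d"},  # index (1 - 0.9) + (1 - 0.2)
}
chosen = pick_flow(pending_by_flow, loss_prob)
print("transmit flow:", chosen)

# Homogeneous special case: every receiver has the same p, so the index
# reduces to x_i * (1 - p) with x_i the number of pending receivers.
uniform = {r: 0.25 for r in ("r1", "r2", "r3", "r4")}
homogeneous_index = whittle_index(uniform.keys(), uniform)
```

Because the index depends only on which receivers are still pending, it can be recomputed cheaply after every slot as acknowledgments arrive.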

IEEE Transactions on Automatic Control | 2008

Structural Properties of Optimal Transmission Policies Over a Randomly Varying Channel

Mukul Agarwal; Vivek S. Borkar; Abhay Karandikar

We consider the problem of transmitting packets over a randomly varying point-to-point channel with the objective of minimizing the expected power consumption subject to a constraint on the average packet delay. By casting it as a constrained Markov decision process in discrete time with time-averaged costs, we prove structural results about the dependence of the optimal policy on buffer occupancy, number of packet arrivals in the previous slot and the channel fading state for both i.i.d. and Markov arrivals and channel fading. The techniques we use to establish such results (convexity, stochastic dominance, and decreasing differences) are among the standard ones for the purpose. Our main contribution, however, is the passage to the average cost case, a notoriously difficult problem for which rather limited results are available. The novel proof techniques used here are likely to have utility in other stochastic control problems well beyond their immediate application considered here.
