Gurdal Arslan
University of Hawaii at Manoa
Publication
Featured research published by Gurdal Arslan.
Systems, Man, and Cybernetics | 2009
Jason R. Marden; Gurdal Arslan; Jeff S. Shamma
We present a view of cooperative control using the language of learning in games. We review the game-theoretic concepts of potential and weakly acyclic games, and demonstrate how several cooperative control problems, such as consensus and dynamic sensor coverage, can be formulated in these settings. Motivated by this connection, we build upon game-theoretic concepts to better accommodate a broader class of cooperative control problems. In particular, we extend existing learning algorithms to accommodate restricted action sets caused by the limitations of agent capabilities and group based decision making. Furthermore, we also introduce a new class of games called sometimes weakly acyclic games for time-varying objective functions and action sets, and provide distributed algorithms for convergence to an equilibrium.
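As a minimal sketch of the potential-game view described in this abstract, the Python below sets up a toy consensus game on an undirected graph and runs sequential best-response updates until no agent can improve. The graph, the utility (negative count of disagreeing neighbors), and all function names are illustrative assumptions, not the authors' formulation or their learning algorithms.

```python
def utility(i, profile, neighbors):
    """Agent i's utility: minus the number of neighbors it disagrees with."""
    return -sum(1 for j in neighbors[i] if profile[j] != profile[i])


def best_response_dynamics(profile, neighbors, actions):
    """Let agents take turns switching to a strictly better action.

    With symmetric neighbor sets, the negative number of disagreeing edges
    is a potential function, so every strict improvement raises it and the
    loop must stop after finitely many switches.
    """
    profile = list(profile)
    improved = True
    while improved:
        improved = False
        for i in range(len(profile)):
            current = utility(i, profile, neighbors)
            for a in actions:
                trial = profile[:i] + [a] + profile[i + 1:]
                if utility(i, trial, neighbors) > current:
                    profile[i] = a
                    current = utility(i, profile, neighbors)
                    improved = True
    return profile


def is_nash(profile, neighbors, actions):
    """Check that no single agent can gain by deviating unilaterally."""
    for i in range(len(profile)):
        current = utility(i, profile, neighbors)
        for a in actions:
            trial = list(profile[:i]) + [a] + list(profile[i + 1:])
            if utility(i, trial, neighbors) > current:
                return False
    return True


# Hypothetical example: four agents on a line graph 0-1-2-3, binary actions.
line_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
final_profile = best_response_dynamics([0, 1, 0, 1], line_graph, [0, 1])
print(final_profile, is_nash(final_profile, line_graph, [0, 1]))
```

The stopping point is a Nash equilibrium of the toy game. It need not be full agreement on every graph, which is one reason the choice of utility design matters in this setting.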
SIAM Journal on Control and Optimization | 2009
Jason R. Marden; H. Peyton Young; Gurdal Arslan; Jeff S. Shamma
We consider repeated multiplayer games in which players repeatedly and simultaneously choose strategies from a finite set of available strategies according to some strategy adjustment process. We focus on the specific class of weakly acyclic games, which is particularly relevant for multiagent cooperative control problems. A strategy adjustment process determines how players select their strategies at any stage as a function of the information gathered over previous stages. Of particular interest are “payoff-based” processes in which, at any stage, players know only their own actions and (noise corrupted) payoffs from previous stages. In particular, players do not know the actions taken by other players and do not know the structural form of payoff functions. We introduce three different payoff-based processes for increasingly general scenarios and prove that, after a sufficiently large number of stages, player actions constitute a Nash equilibrium at any stage with arbitrarily high probability. We also show how to modify player utility functions through tolls and incentives in so-called congestion games, a special class of weakly acyclic games, to guarantee that a centralized objective can be realized as a Nash equilibrium. We illustrate the methods with a simulation of distributed routing over a network.
IEEE Transactions on Automatic Control | 2002
Gurdal Arslan; Tamer Basar
This note investigates the control of stochastic nonlinear systems with parametric uncertainty. The class of systems considered is single-input single-output and in strict-feedback form, with performance measured with respect to a risk-sensitive cost criterion. The uncertainty in the system description is assumed to be linearly parameterized, where the unmeasured parameters are generated by stochastic differential equations. By applying the backstepping design technique to the estimates of the unmeasured states, provided by a simple state estimator, an output-feedback adaptive controller is constructed that maintains an arbitrarily small average value for the risk-sensitive cost. The resulting controller achieves boundedness in probability for all closed-loop signals and, under certain conditions, the tracking error converges to zero almost surely.
Automatica | 2001
Gurdal Arslan; Tamer Basar
Neural-net based approximators can be used to design disturbance attenuating adaptive controllers for strict-feedback systems with structurally unknown nonlinearities.
Machine Learning | 2007
Shie Mannor; Jeff S. Shamma; Gurdal Arslan
We provide a simple learning process that enables an agent to forecast a sequence of outcomes. Our forecasting scheme, termed tracking forecast, is based on tracking the past observations while emphasizing recent outcomes. As opposed to other forecasting schemes, we sacrifice universality in favor of a significantly reduced memory requirement. We show that if the sequence of outcomes has certain properties, namely an internal (hidden) state that does not change too rapidly, then the tracking forecast is weakly calibrated so that the forecast appears to be correct most of the time. For binary outcomes, this result holds without any internal state assumptions. We consider learning in a repeated strategic game where each player attempts to compute some forecast of the opponent's actions and play a best response to it. We show that if one of the players uses a tracking forecast, while the other player uses a standard learning algorithm (such as exponential regret matching or smooth fictitious play), then the player using the tracking forecast obtains the best response to the actual play of the other players. We further show that if both players use tracking forecasts, then under certain conditions on the game matrix, convergence to a Nash equilibrium is possible with positive probability for a larger class of games than the class of games for which smooth fictitious play converges to a Nash equilibrium.
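The idea of tracking past observations while emphasizing recent outcomes can be illustrated with a constant step-size exponential average. The abstract does not give the authors' update rule, so the rule below, the `step` parameter, and the function name are assumptions made for illustration only.

```python
def tracking_forecast(outcomes, num_outcomes, step=0.1):
    """Forecast a sequence of outcomes with a constant step-size average.

    outcomes: iterable of integers in range(num_outcomes).
    Returns (history, forecast), where history[t] is the forecast held
    before outcome t was observed and forecast is the final forecast.
    A constant step keeps recent outcomes weighted more heavily than old
    ones, and only the current forecast vector has to be stored.
    """
    forecast = [1.0 / num_outcomes] * num_outcomes  # uniform starting guess
    history = []
    for y in outcomes:
        history.append(list(forecast))
        for k in range(num_outcomes):
            indicator = 1.0 if k == y else 0.0
            # Move each coordinate a fraction `step` toward the new outcome.
            forecast[k] += step * (indicator - forecast[k])
    return history, forecast


# Hypothetical usage: the outcome process switches from 0 to 1 halfway.
sequence = [0] * 50 + [1] * 50
_, final = tracking_forecast(sequence, num_outcomes=2, step=0.1)
print(final)
```

A larger `step` follows changes in the hidden state faster but produces noisier forecasts, while a smaller `step` does the opposite.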
IEEE Transactions on Automatic Control | 2006
Jeff S. Shamma; Gurdal Arslan
This note considers the decentralized control of spatially invariant systems, i.e., systems of homogeneous interacting components. The main idea is for individual components to model interactions with neighbors as disturbances that satisfy certain magnitude bounds while simultaneously self-imposing symmetric magnitude bounds. These magnitude bounds can be interpreted as negotiated levels of interaction among components. It turns out that this approach is equivalent to constructing a feedback that is robustly stabilizing with structured uncertainties.
Conference on Decision and Control | 2002
Gurdal Arslan; Jonathan D. Wolfe; Jeff S. Shamma; Jason L. Speyer
We formulate a dynamic air vehicle assignment problem to sequentially determine the optimal allocation of air vehicle and ammunition resources to threat clusters in an air-to-ground campaign. A threat cluster includes a number of different types of threats, which are assumed to cooperate among themselves when they are engaged by a team of air vehicles. The objective is to efficiently allocate the assets to eliminate as many valuable threats as possible while minimizing the air vehicle attrition. This problem is formulated as an optimal control problem, whose exact solution can be obtained for small problems by dynamic programming. For larger, more realistic problems, we investigate hierarchical control and potential function methods to solve the optimal control problem in a computationally efficient manner.
IEEE Transactions on Automatic Control | 2017
Gurdal Arslan; Serdar Yüksel
There are only a few learning algorithms applicable to stochastic dynamic teams and games, which generalize Markov decision processes to decentralized stochastic control problems involving possibly self-interested decision makers. Learning in games is generally difficult because of the non-stationary environment in which each decision maker aims to learn its optimal decisions with minimal information in the presence of the other decision makers, who are also learning. In stochastic dynamic games, learning is more challenging because, while learning, the decision makers alter the state of the system and hence the future cost. In this paper, we present decentralized Q-learning algorithms for stochastic games, and study their convergence for the weakly acyclic case, which includes team problems as an important special case. The algorithms are decentralized in that each decision maker has access only to its own decisions and cost realizations as well as the state transitions; in particular, each decision maker is completely oblivious to the presence of the other decision makers. We show that these algorithms converge to equilibrium policies almost surely in large classes of stochastic games.
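To make the information structure concrete, the sketch below shows a standard Q-learning update in cost-minimizing form that uses only what the abstract says each decision maker observes: the state transition, its own decision, and its own cost. The abstract does not specify the authors' algorithms, so this is not their method; the function name, the `step` and `discount` parameters, and the table layout are assumptions.

```python
from collections import defaultdict


def q_update(q, state, action, cost, next_state, actions,
             step=0.1, discount=0.9):
    """Apply one cost-minimizing Q-learning update for a single agent.

    q: mapping from (state, own_action) to an estimated cost-to-go.
    The agent never sees other agents' actions; their effect reaches it
    only through the observed cost and the next state.
    """
    best_next = min(q[(next_state, a)] for a in actions)
    target = cost + discount * best_next
    q[(state, action)] += step * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical usage: one agent with two actions, one observed transition.
q_table = defaultdict(float)
updated = q_update(q_table, state=0, action="a", cost=1.0,
                   next_state=1, actions=["a", "b"])
print(updated)
```

Running this update independently in every agent makes each agent's environment non-stationary, which is the difficulty the abstract highlights.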
Conference on Decision and Control | 2012
Gurdal Arslan; M. Fatih Demirkol; Serdar Yüksel
We study the problem of cost minimization in competitive resource allocation problems, motivated by our previous work on power minimization in MIMO interference systems. Our setup leads to a general cost minimization game in which each player wishes to minimize the cost of its resource consumption while achieving a target utility level. In general, the player strategies are coupled through both their cost functions and their utility functions. An equilibrium exists only for a certain set of target utility levels, which in general is a proper subset of the set of all achievable utility levels. To characterize the set of equilibrium utility levels, we introduce the dual of a cost minimization game, called a utility maximization game, in which each player wishes to maximize its utility while keeping the cost of its resource consumption below a cost threshold. We associate the set of equilibrium utility levels with the set of equilibria of the dual game corresponding to all cost thresholds, and show that the dual game always possesses an equilibrium. We also obtain an inner estimate of the set of equilibrium utility levels in the case of decoupled cost functions by a minimax approach. We then relax the hard constraint on achieving a target utility level, and introduce a weighted cost minimization game which always possesses an equilibrium. We recover the original equilibria through the equilibria of the weighted cost minimization game as the penalty on not achieving the target utility levels increases.
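One way to write down the two games described in this abstract is given below. The notation is an assumption introduced for illustration: $x_i \in X_i$ is player $i$'s strategy, $x_{-i}$ the strategies of the other players, $c_i$ and $u_i$ the cost and utility functions, and $\bar{u}_i$, $\bar{c}_i$ the target utility level and cost threshold. Reading "achieving a target utility level" as an inequality constraint is also an assumption.

```latex
\begin{align*}
\text{Cost minimization game (player } i\text{):} \quad
  & \min_{x_i \in X_i} \; c_i(x_i, x_{-i})
  \quad \text{subject to} \quad u_i(x_i, x_{-i}) \ge \bar{u}_i, \\
\text{Utility maximization game (player } i\text{):} \quad
  & \max_{x_i \in X_i} \; u_i(x_i, x_{-i})
  \quad \text{subject to} \quad c_i(x_i, x_{-i}) \le \bar{c}_i.
\end{align*}
```

In the first problem the feasible set of player $i$ depends on $x_{-i}$ through $u_i$, which is consistent with the abstract's statement that an equilibrium exists only for some target utility levels.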
International Conference on Game Theory for Networks | 2009
Gurdal Arslan; M. Fatih Demirkol; Serdar Yüksel
We consider a multi-link and multi-input-multi-output (MIMO) interference system in which each link wishes to minimize its own power by choosing its own signal vector subject to an information theoretic Quality-of-Service (QoS) requirement. Our setup leads to a multi-link game, referred to as a “power game”, in which the feasible strategy set of an individual link depends on the strategies of the other links. We characterize the rates for which an equilibrium solution exists in a power game in terms of the equilibria of “capacity games” introduced in our earlier work [1]. We provide an example where the set of equilibrium rates is properly contained in the set of achievable rates. We provide a conservative estimate of the region of equilibrium rates using a minmax approach. We discuss the uniqueness of equilibrium as well as the convergence of best response dynamics (a.k.a. iterative water-filling) for all rates when the interference is sufficiently small and some other mild conditions are met. Finally, we extend our results to the case where the QoS requirements are softened.