Michael Kaisers
Maastricht University
Publication
Featured research published by Michael Kaisers.
Entertainment Computing | 2009
Marc J. V. Ponsen; Karl Tuyls; Michael Kaisers; Jan Ramon
In this paper we investigate the evolutionary dynamics of strategic behavior in the game of poker by means of data gathered from a large number of real-world poker games. We perform this study from an evolutionary game theoretic perspective using two replicator dynamics models: first the basic selection model, and second a model that includes both selection and mutation. We investigate the dynamic properties by studying how rational players switch between different strategies under different circumstances, what the basins of attraction of the equilibria look like, and what the stability properties of the attractors are. We illustrate the dynamics using a simplex analysis. Our experimental results confirm existing domain knowledge of the game, namely that certain strategies are clearly inferior while others can be successful given certain game conditions.
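The two replicator models above lend themselves to a compact numerical sketch. The Python snippet below integrates the basic selection dynamics and a selection-mutation variant for three strategies; the payoff matrix, the uniform mutation term, and all parameters are illustrative assumptions, not the paper's poker data or exact model.

```python
import numpy as np

# Hypothetical 3x3 payoff matrix for three strategies (illustration only;
# the paper estimates payoffs from real poker data).
A = np.array([[ 0.0,  1.0, -1.0],
              [-1.0,  0.0,  1.0],
              [ 1.0, -1.0,  0.0]])

def selection(x, A):
    """Basic replicator dynamics: dx_i = x_i * ((Ax)_i - x.Ax)."""
    f = A @ x
    return x * (f - x @ f)

def selection_mutation(x, A, mu=0.05):
    """Selection plus a simple uniform mutation term (one common choice)."""
    return selection(x, A) + mu * (1.0 / len(x) - x)

def simulate(dynamics, x0, steps=5000, dt=0.01):
    """Euler-integrate the dynamics from an initial mixed strategy x0."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x += dt * dynamics(x, A)
        x = np.clip(x, 1e-12, None)
        x /= x.sum()  # project back onto the simplex after integration error
    return x

print(simulate(selection, [0.5, 0.3, 0.2]))           # cycles near the boundary
print(simulate(selection_mutation, [0.5, 0.3, 0.2]))  # mutation pulls inward
```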
cooperative information agents | 2007
H. Jaap van den Herik; Daniel Hennes; Michael Kaisers; Karl Tuyls; Katja Verbeeck
In this paper we compare state-of-the-art multi-agent reinforcement learning algorithms in a wide variety of games. We consider two types of algorithms: value iteration and policy iteration. Four characteristics are studied: initial conditions, parameter settings, convergence speed, and local versus global convergence. Global convergence is still difficult to achieve in practice, despite existing theoretical guarantees. Multiple visualizations are included to provide comprehensive insight into the learning dynamics.
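As a minimal illustration of the two algorithm families compared above, the sketch below pits a value-based learner (Boltzmann Q-learning) against a policy-based learner (a linear reward-inaction automaton) on a hypothetical 2x2 coordination game. The game, parameters, and pairing are assumptions for illustration, not the paper's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2x2 coordination game; both players receive the same payoff.
A = np.array([[1.0, 0.0],
              [0.0, 0.5]])

def softmax(q, tau):
    z = np.exp(q / tau)
    return z / z.sum()

q = np.zeros(2)                   # value-based learner: Boltzmann Q-learning
policy = np.array([0.5, 0.5])     # policy-based learner: reward-inaction automaton
alpha, tau, beta = 0.1, 0.2, 0.01

for _ in range(10_000):
    a = rng.choice(2, p=softmax(q, tau))   # row action
    b = rng.choice(2, p=policy)            # column action
    r = A[a, b]                            # shared reward in [0, 1]
    q[a] += alpha * (r - q[a])                        # value update
    policy += beta * r * (np.eye(2)[b] - policy)      # policy update (stays a distribution)

print("Q-learner policy:", softmax(q, tau))
print("automaton policy:", policy)
```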
web intelligence | 2008
Michael Kaisers; Karl Tuyls; Frank Thuijsman; Simon Parsons
Auctions are pervasive in today's society and provide a variety of real markets. This article facilitates a strategic choice among a set of available trading strategies by introducing a methodology to approximate heuristic payoff tables by normal form games. An example from the auction domain is transformed by this means, and an evolutionary game theory analysis is applied subsequently. The information loss in the normal form approximation is shown to be reasonably small, such that the concise normal form representation can be leveraged to make strategic decisions in auctions. In particular, a mix of trading strategies that guarantees a certain profit is computed, and further applications are indicated.
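A hedged sketch of the kind of transformation described above: given a heuristic payoff table for two trading strategies, read off a 2x2 normal form approximation. The table values and the specific read-off rule (payoff to a lone deviator in an otherwise homogeneous population) are illustrative assumptions, not the paper's data or exact methodology.

```python
import numpy as np

# Hypothetical heuristic payoff table for 2 strategies and 6 agents: each row is
# (count playing S0, count playing S1, mean payoff to S0, mean payoff to S1).
hpt = np.array([
    [6, 0, 1.00, np.nan],
    [5, 1, 0.90, 1.40],
    [4, 2, 0.85, 1.20],
    [3, 3, 0.80, 1.00],
    [2, 4, 0.75, 0.85],
    [1, 5, 0.70, 0.72],
    [0, 6, np.nan, 0.60],
])

def normal_form(hpt):
    """One simple approximation: A[i, j] is the payoff to a single agent playing
    strategy i while all other agents play strategy j."""
    A = np.empty((2, 2))
    A[0, 0] = hpt[0, 2]   # everyone plays S0
    A[0, 1] = hpt[5, 2]   # one S0 agent among S1 players
    A[1, 0] = hpt[1, 3]   # one S1 agent among S0 players
    A[1, 1] = hpt[6, 3]   # everyone plays S1
    return A

print(normal_form(hpt))
```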
genetic and evolutionary computation conference | 2012
Daniel Hennes; Daan Bloembergen; Michael Kaisers; Karl Tuyls; Simon Parsons
We analyze the competitive advantage of price signal information for traders in simulated double auctions. Previous work has established that more information about the price development does not guarantee higher performance. In particular, traders with limited information perform below market average and are outperformed by random traders; only insiders beat the market. However, this result has only been shown in markets with a few traders and a uniform distribution over information levels. We present additional simulations of several more realistic information distributions, extending previous findings. In addition, we analyze the market dynamics with an evolutionary model of competing information levels. Results show that the highest information level will dominate if information comes for free. If information is costly, less-informed traders may prevail, reflecting a more realistic distribution over information levels.
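The evolutionary model of competing information levels can be caricatured as follows. This sketch assumes constant (frequency-independent) gross payoffs per information level, whereas the paper derives frequency-dependent fitness from market simulations; the numbers and costs are made up to reproduce the qualitative finding.

```python
import numpy as np

# Hypothetical gross payoffs for three information levels: none, partial, insider.
gross = np.array([0.9, 1.0, 1.3])

def evolve(cost, x0=(1/3, 1/3, 1/3), steps=20_000, dt=0.01):
    """Replicator dynamics where fitness is gross payoff minus information cost."""
    x = np.array(x0, dtype=float)
    fit = gross - cost
    for _ in range(steps):
        x += dt * x * (fit - x @ fit)
        x = np.clip(x, 1e-12, None)
        x /= x.sum()
    return x.round(3)

print("free information:  ", evolve(cost=np.zeros(3)))               # insiders take over
print("costly information:", evolve(cost=np.array([0.0, 0.2, 0.5]))) # uninformed prevail
```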
multiagent system technologies | 2012
Haitham Bou Ammar; Karl Tuyls; Michael Kaisers
Swarm intelligence has been successfully applied in various domains, e.g., path planning, resource allocation and data mining. Despite its wide use, a theoretical framework in which the behavior of swarm intelligence can be formally understood is still lacking. This article starts by formally deriving the evolutionary dynamics of ant colony optimization, an important swarm intelligence algorithm. We then formally link these dynamics to reinforcement learning. Specifically, we show that the attained evolutionary dynamics are equivalent to the dynamics of Q-learning. Both algorithms are equivalent to a dynamical system known as the replicator dynamics in the domain of evolutionary game theory. In conclusion, the process of improvement described by the replicator dynamics appears to be a fundamental principle driving processes in swarm intelligence, evolution, and learning.
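The claimed equivalence can be made concrete: the expected policy change of Boltzmann Q-learning follows a replicator (selection) equation plus an entropy-driven (mutation) term, and the article derives the same form for ant colony optimization. Below is a sketch of that dynamical system; the rewards and constants are illustrative assumptions.

```python
import numpy as np

def q_learning_dynamics(x, r, alpha=0.05, tau=0.1):
    """Expected dynamics of Boltzmann Q-learning: a replicator (selection) term
    plus an entropy-driven (mutation) term. x is the policy, r the vector of
    expected rewards per action."""
    selection = (alpha / tau) * x * (r - x @ r)
    mutation = alpha * x * (x @ np.log(x) - np.log(x))
    return selection + mutation

x = np.array([0.2, 0.8])
r = np.array([1.0, 0.5])          # illustrative, fixed expected rewards
for _ in range(5_000):
    x += 0.01 * q_learning_dynamics(x, r)
    x = np.clip(x, 1e-12, None)
    x /= x.sum()
print(x)  # concentrates on the better action, softened by the entropy term
```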
multiagent system technologies | 2012
Marcel Neumann; Karl Tuyls; Michael Kaisers
Numerous pricing strategies have been proposed and incorporated into software trading agents. Early simulation experiments have shown that markets are efficient even with only a few traders placing randomly priced offers at each time step using the Zero-Intelligence strategy. This article investigates the strategic effect of timing on trader profits and market efficiency. The trading strategies Zero-Intelligence and Zero-Intelligence-Plus are enhanced with new timing strategies, replacing their heuristics with random and strategic behavior. As expected, agents using random timing earn less profit than those using the heuristic. However, market efficiency remains the same, confirming that continuous double auctions are highly efficient mechanisms even when traders place profitable offers with random prices and timing. Furthermore, Zero-Intelligence-Plus agents also achieve high efficiency, but strategic timing reduces the risk of being exploited by trading far from the equilibrium price. Thus, there is a clear individual incentive to exploit timing strategically.
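For intuition about why such markets stay efficient, here is a minimal Zero-Intelligence-with-Constraint sketch in which random arrival order stands in for a timing strategy. The valuations, costs, and matching loop are illustrative assumptions; the paper's market model and timing strategies are richer.

```python
import random

random.seed(1)

buyer_values = [120, 110, 100, 90]   # hypothetical private valuations
seller_costs = [60, 70, 80, 95]      # hypothetical private costs

def zi_round(values, costs, quotes=200):
    """One trading period: ZI-C agents quote random prices within their budget
    constraint; a trade occurs whenever a bid meets an ask."""
    surplus, buyers, sellers = 0.0, list(values), list(costs)
    for _ in range(quotes):          # random arrival order stands in for timing
        if not buyers or not sellers:
            break
        b, s = random.choice(buyers), random.choice(sellers)
        bid = random.uniform(0, b)    # never bid above valuation
        ask = random.uniform(s, 200)  # never ask below cost
        if bid >= ask:
            surplus += b - s          # pair surplus is independent of the price
            buyers.remove(b)
            sellers.remove(s)
    return surplus

max_surplus = sum(v - c for v, c in zip(buyer_values, seller_costs) if v > c)
realised = sum(zi_round(buyer_values, seller_costs) for _ in range(500)) / 500
print(f"allocative efficiency ~ {realised / max_surplus:.2f}")
```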
european workshop on multi-agent systems | 2011
Michael Kaisers; Daan Bloembergen; Karl Tuyls
This article shows that seemingly diverse implementations of multi-agent reinforcement learning share the same basic building block in their learning dynamics: a mathematical term that is closely related to the gradient of the expected reward. Gradient Ascent on the expected reward has been used to derive strong convergence results in two-player two-action games, at the expense of strong assumptions such as full information about the game being played. Variations of Gradient Ascent, such as Infinitesimal Gradient Ascent (IGA), Win-or-Learn-Fast IGA, and Weighted Policy Learning (WPL), assume a known value function for which the reinforcement gradient can be computed directly. In contrast, independent multi-agent reinforcement learning algorithms that assume less information about the game being played, such as Cross learning, variations of Q-learning, and regret minimization, base their learning on feedback from discrete interactions with the environment, requiring neither an explicit representation of the value function nor its gradient. Despite this much stricter limitation on the information available to these algorithms, they yield dynamics that are very similar to Gradient Ascent and exhibit equivalent convergence behavior. In addition to the formal derivation, directional field plots of the learning dynamics in representative classes of two-player two-action games illustrate the similarities and strengthen the theoretical findings.
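A sketch of the simplest of these variants, Infinitesimal Gradient Ascent in a two-player two-action game, is given below; the matching-pennies payoffs and step size are illustrative assumptions. Each player moves its action probability along the gradient of its own expected payoff.

```python
import numpy as np

# Hypothetical matching-pennies payoffs for the row (A) and column (B) players.
A = np.array([[1, -1], [-1, 1]])
B = -A

def iga_step(p, q, eta=0.01):
    """One Infinitesimal Gradient Ascent step: p and q are the probabilities
    with which the row and column players choose their first action."""
    dV_dp = q * (A[0, 0] - A[1, 0]) + (1 - q) * (A[0, 1] - A[1, 1])
    dV_dq = p * (B[0, 0] - B[0, 1]) + (1 - p) * (B[1, 0] - B[1, 1])
    p = float(np.clip(p + eta * dV_dp, 0.0, 1.0))
    q = float(np.clip(q + eta * dV_dq, 0.0, 1.0))
    return p, q

p, q = 0.8, 0.3
for _ in range(2_000):
    p, q = iga_step(p, q)
print(p, q)  # plain IGA orbits the mixed equilibrium (0.5, 0.5) in this game
```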
adaptive and learning agents | 2009
Michael Kaisers; Karl Tuyls
Today's society is largely connected, and many real-life applications lend themselves to be modeled as multi-agent systems. Although such systems and their models are desirable, e.g., for reasons of stability or parallelism, they are highly complex and therefore difficult to understand or predict. Multi-agent learning has been acknowledged to be indispensable for controlling or finding solutions to such systems. Recently, evolutionary game theory has been linked to multi-agent reinforcement learning. However, gaining insight into the dynamics of games, especially time-dependent ones, remains a challenging problem. This article introduces a new perspective on the reinforcement learning process described by the replicator dynamics, providing a tool to design time-dependent parameters of the game or the learning process. This perspective is orthogonal to the common view of policy trajectories driven by the replicator dynamics. Rather than letting the time dimension collapse, the set of initial policies is considered to be a particle cloud that approximates a distribution, and we look at the evolution of this distribution over time. First the methodology is described, then it is applied to an example game, and viable extensions are discussed.
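The particle-cloud perspective can be sketched directly: sample many initial policies, push each along the replicator dynamics, and watch the distribution's mass concentrate on the attractors. The 2x2 coordination payoffs below are an illustrative assumption, not the paper's example game.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical symmetric 2x2 coordination game; the basin boundary lies at x0 = 1/3.
A = np.array([[2.0, 0.0],
              [0.0, 1.0]])

# A cloud of 1000 policies approximating a uniform initial distribution.
p = rng.uniform(0.0, 1.0, size=1000)
cloud = np.stack([p, 1.0 - p], axis=1)

dt = 0.01
for _ in range(3_000):
    f = cloud @ A.T                               # fitness of each action per particle
    avg = (cloud * f).sum(axis=1, keepdims=True)  # average fitness per particle
    cloud += dt * cloud * (f - avg)               # replicator step for all particles at once
    cloud = np.clip(cloud, 1e-12, 1.0)
    cloud /= cloud.sum(axis=1, keepdims=True)

# Mass concentrates on the two pure equilibria according to their basins.
print(np.mean(cloud[:, 0] > 0.99), np.mean(cloud[:, 0] < 0.01))
```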
adaptive agents and multi agents systems | 2010
Michael Kaisers; Karl Tuyls
adaptive agents and multi agents systems | 2011
Michael Wunder; Michael Kaisers; John Robert Yaros; Michael L. Littman