Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Bram Bakker is active.

Publication


Featured research published by Bram Bakker.


International Conference on Robotics and Automation | 2006

Hierarchical map building and planning based on graph partitioning

Zoran Zivkovic; Bram Bakker; Ben J. A. Kröse

Mobile robot localization and navigation require a map: the robot's internal representation of the environment. A common problem is that path planning becomes very inefficient for large maps. In this paper we address the problem of segmenting a base-level map in order to construct a higher-level representation of the space which can be used for more efficient planning. We represent the base-level map as a graph for both geometric and appearance-based space representations. Then we use a graph partitioning method to cluster nodes of the base-level map and in this way construct a high-level map, which is also a graph. We apply a hierarchical path planning method for stochastic tasks based on Markov decision processes (MDPs) and investigate the effect of choosing different numbers of clusters.
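The clustering step lends itself to a short sketch. Below is generic spectral bisection, one common graph-partitioning method (the abstract does not commit to this specific algorithm), applied to an invented toy map graph: two triangular "rooms" joined by a single doorway edge.

```python
import numpy as np

# Toy base-level map: two triangle "rooms" (nodes 0-2 and 3-5) joined by
# a single doorway edge (2, 3). The edge list is invented for illustration.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# Graph Laplacian L = D - A. Its eigenvector for the second-smallest
# eigenvalue (the Fiedler vector) yields a 2-way partition: each node's
# high-level cluster is given by the sign of its entry.
L = np.diag(A.sum(axis=1)) - A
eigvals, eigvecs = np.linalg.eigh(L)
cluster = (eigvecs[:, 1] > 0).astype(int)  # high-level "room" label per node
```

The cluster labels become the nodes of the high-level graph, on which an MDP planner can operate with far fewer states than on the base-level map.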


Cryptographers' Track at the RSA Conference | 2011

Improving differential power analysis by elastic alignment

Jasper G. J. van Woudenberg; Marc F. Witteman; Bram Bakker

To prevent smart card attacks using Differential Power Analysis (DPA), manufacturers commonly implement DPA countermeasures that create misalignment in power trace sets and decrease the effectiveness of DPA. We design and investigate the elastic alignment algorithm for non-linearly warping trace sets in order to align them. Elastic alignment uses FastDTW, originally a method for aligning speech utterances in speech recognition systems, to obtain so-called warp paths that can be used to perform alignment. We show on traces obtained from a smart card with random process interrupts that misalignment is reduced significantly, and that even under an unstable clock the algorithm is able to perform alignment.
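The core warping idea can be sketched with plain dynamic time warping (the paper uses FastDTW, a linear-time approximation of this exact algorithm; the traces below are invented toy data):

```python
import numpy as np

def dtw_warp_path(x, y):
    """Classic O(len(x)*len(y)) DTW; FastDTW approximates this step."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    # Backtrack from (n, m) to recover the warp path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]

def elastic_align(reference, trace):
    """Warp `trace` onto the time base of `reference` via the warp path."""
    path = dtw_warp_path(reference, trace)
    aligned = np.zeros_like(reference, dtype=float)
    counts = np.zeros_like(reference, dtype=float)
    for i, j in path:
        aligned[i] += trace[j]
        counts[i] += 1
    return aligned / np.maximum(counts, 1)

# A toy "power trace" with one peak, and a misaligned copy shifted 3 samples.
ref = np.array([0, 0, 0, 5, 9, 5, 0, 0, 0, 0], dtype=float)
shifted = np.roll(ref, 3)
aligned = elastic_align(ref, shifted)
```

After warping, the peak sits back at its reference position, which is what restores the sample-wise correlations that DPA depends on.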


European Conference on Machine Learning | 2008

Multiagent Reinforcement Learning for Urban Traffic Control Using Coordination Graphs

Lior Kuyer; Shimon Whiteson; Bram Bakker; Nikos Vlassis

Since traffic jams are ubiquitous in the modern world, optimizing the behavior of traffic lights for efficient traffic flow is a critically important goal. Though most current traffic lights use simple heuristic protocols, more efficient controllers can be discovered automatically via multiagent reinforcement learning, where each agent controls a single traffic light. However, in previous work on this approach, agents select only locally optimal actions without coordinating their behavior. This paper extends this approach to include explicit coordination between neighboring traffic lights. Coordination is achieved using the max-plus algorithm, which estimates the optimal joint action by sending locally optimized messages among connected agents. This paper presents the first application of max-plus to a large-scale problem and thus verifies its efficacy in realistic settings. It also provides empirical evidence that max-plus performs well on cyclic graphs, though it has been proven to converge only for tree-structured graphs. Furthermore, it provides a new understanding of the properties a traffic network must have for such coordination to be beneficial and shows that max-plus outperforms previous methods on networks that possess those properties.
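A minimal max-plus sketch on a three-agent chain (a tree, where the algorithm is exact) looks like this; the pairwise payoff tables are invented toy numbers:

```python
import numpy as np

# Coordination graph: a chain of three agents 0 - 1 - 2, two actions each.
# Pairwise payoff tables f[(i, j)][a_i, a_j] are toy numbers.
f = {
    (0, 1): np.array([[3.0, 1.0], [0.0, 2.0]]),
    (1, 2): np.array([[2.0, 0.0], [1.0, 4.0]]),
}
neighbors = {0: [1], 1: [0, 2], 2: [1]}
n_actions = 2

def payoff(i, j, a_i, a_j):
    return f[(i, j)][a_i, a_j] if (i, j) in f else f[(j, i)][a_j, a_i]

# mu[(i, j)][a_j]: agent i's locally optimized message about j's actions.
mu = {(i, j): np.zeros(n_actions) for i in neighbors for j in neighbors[i]}
for _ in range(10):  # repeated passes; converges exactly on trees
    for (i, j) in mu:
        incoming = sum((mu[(k, i)] for k in neighbors[i] if k != j),
                       np.zeros(n_actions))
        mu[(i, j)] = np.array([
            max(payoff(i, j, a_i, a_j) + incoming[a_i]
                for a_i in range(n_actions))
            for a_j in range(n_actions)
        ])

# Each agent picks its action from summed incoming messages alone.
joint = [int(np.argmax(sum(mu[(k, i)] for k in neighbors[i])))
         for i in neighbors]
```

Brute-force enumeration of all eight joint actions confirms the decentralized choice matches the globally optimal one for these tables.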


International Conference on Robotics and Automation | 2006

Quasi-online reinforcement learning for robots

Bram Bakker; Viktor Zhumatiy; Gabriel Gruener; Jürgen Schmidhuber

This paper describes quasi-online reinforcement learning: while a robot is exploring its environment, in the background a probabilistic model of the environment is built on the fly as new experiences arrive; the policy is trained concurrently based on this model using an anytime algorithm. Prioritized sweeping, directed exploration, and transformed reward functions provide additional speed-ups. The robot quickly learns goal-directed policies from scratch, requiring few interactions with the environment and making efficient use of available computation time. From an outside perspective it learns the behavior online and in real time. We describe comparisons with standard methods and show the individual utility of each of the proposed techniques.
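Prioritized sweeping, one of the speed-ups mentioned, can be sketched in a few lines. This toy uses a deterministic corridor MDP whose model is filled in directly rather than learned from experience as in the paper:

```python
import heapq

# Toy corridor MDP: states 0..4, goal at 4, deterministic "right" moves.
# In the quasi-online setting this model would be estimated on the fly;
# here it is given directly to keep the sketch short.
n_states, goal, gamma = 5, 4, 0.9
model = {s: s + 1 for s in range(n_states - 1)}        # s --right--> s+1
predecessors = {s + 1: [s] for s in range(n_states - 1)}
reward = {s: (1.0 if model[s] == goal else 0.0) for s in model}

V = [0.0] * n_states
pq = [(-1.0, goal - 1)]  # seed: the state next to the goal has top priority
while pq:
    neg_prio, s = heapq.heappop(pq)
    old = V[s]
    V[s] = reward[s] + gamma * V[model[s]]             # one-step backup
    change = abs(V[s] - old)
    if change > 1e-6:
        # Push predecessors whose backups are now stale, ordered by impact.
        for p in predecessors.get(s, []):
            heapq.heappush(pq, (-change, p))
```

The priority queue propagates value changes backwards from the goal, so computation is spent only where estimates actually moved; this is what lets the anytime planner keep up with the robot's experience stream.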


Studies in Computational Intelligence | 2010

Traffic light control by multiagent reinforcement learning systems

Bram Bakker; Shimon Whiteson; Leon J. H. M. Kester; Frans C. A. Groen

Traffic light control is one of the main means of controlling road traffic. Improving traffic control is important because it can lead to higher traffic throughput and reduced traffic congestion. This chapter describes multiagent reinforcement learning techniques for automatic optimization of traffic light controllers. Such techniques are attractive because they can automatically discover efficient control strategies for complex tasks, such as traffic control, for which it is hard or impossible to compute optimal solutions directly and hard to develop hand-coded solutions. First, the general multi-agent reinforcement learning framework is described, which is used to control traffic lights in this work. In this framework, multiple local controllers (agents) are each responsible for the optimization of traffic lights around a single traffic junction, making use of locally perceived traffic state information (sensed cars on the road), a learned probabilistic model of car behavior, and a learned value function which indicates how traffic light decisions affect long-term utility, in terms of the average waiting time of cars. Next, three extensions are described which improve upon the basic framework in various ways: agents (traffic junction controllers) taking into account congestion information from neighboring agents; handling partial observability of traffic states; and coordinating the behavior of multiple agents by coordination graphs and the max-plus algorithm.
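The value-function idea in the basic framework can be sketched for a single junction. The queue-based state, arrival pattern, and discharge rate below are invented simplifications, and plain value iteration over a given model stands in for the chapter's learned car-behavior model and learned value function:

```python
# Toy junction: state is (ns, ew) queue lengths. Each step one car arrives
# in each direction (capped), the green direction discharges two cars, and
# the reward is minus the number of cars still waiting (a stand-in for
# waiting time). All dynamics are invented for illustration.
CAP, gamma = 3, 0.9
states = [(ns, ew) for ns in range(CAP + 1) for ew in range(CAP + 1)]

def step(ns, ew, phase):
    ns, ew = min(CAP, ns + 1), min(CAP, ew + 1)   # arrivals
    if phase == 0:
        ns = max(0, ns - 2)                        # NS green discharges 2
    else:
        ew = max(0, ew - 2)                        # EW green discharges 2
    return (ns, ew), -(ns + ew)                    # cost: cars waiting

V = {s: 0.0 for s in states}
for _ in range(200):                               # value iteration
    V = {s: max(r + gamma * V[s2]
                for s2, r in (step(*s, p) for p in (0, 1)))
         for s in states}

# Greedy light decision: pick the phase with the best long-term utility.
policy = {s: max((0, 1),
                 key=lambda p: step(*s, p)[1] + gamma * V[step(*s, p)[0]])
          for s in states}
```

The resulting policy behaves as the learned value function is meant to, e.g. giving green to a long queue rather than an empty one.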


International Journal of High Performance Computing Applications | 2007

Reinforcement learning by backpropagation through an LSTM model/critic

Bram Bakker

This paper describes backpropagation through an LSTM recurrent neural network model/critic for reinforcement learning tasks in partially observable domains. This combines the LSTM's strength at learning long-term temporal dependencies, used to infer states in partially observable tasks, with the ability to learn high-dimensional and/or continuous actions through backpropagation's focused credit-assignment mechanism.
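The credit-assignment idea reduces to gradient ascent on the critic's output with respect to the action. The sketch below substitutes a known quadratic for the learned LSTM critic so its gradient can be written analytically; the critic, its optimum, and the step size are all invented stand-ins:

```python
import numpy as np

# Stand-in for a learned model/critic: a differentiable function mapping a
# continuous action to a predicted long-term return, maximized at
# a = [1.0, -2.0]. The paper's critic is an LSTM network instead.
a_star = np.array([1.0, -2.0])

def critic(a):
    return -np.sum((a - a_star) ** 2)

def critic_grad(a):
    # dV/da: the signal backpropagation would deliver through the network
    # down to the action inputs.
    return -2.0 * (a - a_star)

# Improve a continuous action by ascending the critic's predicted return.
a = np.zeros(2)
for _ in range(100):
    a += 0.1 * critic_grad(a)
```

With a real network critic, `critic_grad` is obtained by backpropagating through the network to its action inputs, which is exactly what makes high-dimensional continuous actions tractable.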


IAS | 2005

Hierarchical Reinforcement Learning Based on Subgoal Discovery and Subpolicy Specialization

Bram Bakker


Lecture Notes in Computer Science | 2008

Multiagent reinforcement learning for urban traffic control using coordination graphs

L. Kuyer; Shimon Whiteson; Bram Bakker; Nikos Vlassis


ACM Transactions on Algorithms | 2005

Utile Coordination: Learning Interdependencies Among Cooperative Agents

Jelle R. Kok; Pieter Jan't Hoen; Bram Bakker; Nikos A. Vlassis


Computational Intelligence and Games | 2005

Utile coordination: Learning interdependencies among cooperative agents

Jelle R. Kok; Pieter Jan't Hoen; Bram Bakker; Nikos A. Vlassis

Collaboration


Dive into Bram Bakker's collaborations.

Top Co-Authors

Emil Nijhuis (University of Amsterdam)

Jürgen Schmidhuber (Dalle Molle Institute for Artificial Intelligence Research)