Sondre Glimsdal
University of Agder
Publications
Featured research published by Sondre Glimsdal.
Applied Intelligence | 2013
Ole-Christoffer Granmo; Sondre Glimsdal
The two-armed bandit problem is a classical optimization problem where a decision maker sequentially pulls one of two arms attached to a gambling machine, with each pull resulting in a random reward. The reward distributions are unknown, and thus one must balance exploiting existing knowledge about the arms against obtaining new information. Bandit problems are particularly fascinating because a large class of real-world problems, including routing, Quality of Service (QoS) control, game playing, and resource allocation, can be solved in a decentralized manner when modeled as a system of interacting gambling machines. Although computationally intractable in many cases, Bayesian methods provide a standard for optimal decision making. This paper proposes a novel scheme for decentralized decision making based on the Goore Game in which each decision maker is inherently Bayesian in nature, yet avoids computational intractability by relying simply on updating the hyperparameters of sibling conjugate priors and on random sampling from these posteriors. We further report theoretical results on the variance of the random rewards experienced by each individual decision maker. Based on these theoretical results, each decision maker is able to accelerate its own learning by taking advantage of the increasingly reliable feedback that is obtained as exploration gradually turns into exploitation in bandit-based learning. Extensive experiments, involving QoS control in simulated wireless sensor networks, demonstrate that the accelerated learning allows us to combine the benefit of conservative learning, which is high accuracy, with the benefit of hurried learning, which is fast convergence. In this manner, our scheme outperforms recently proposed Goore Game solution schemes, where one has to trade off accuracy against speed. As an additional benefit, performance also becomes more stable. We thus believe that our methodology opens avenues for improved performance in a number of applications of bandit-based decentralized decision making.
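The core mechanism the abstract alludes to, updating the hyperparameters of Beta conjugate priors and sampling from the posteriors to choose an arm, can be sketched as follows. This is a minimal, illustrative sketch of that bookkeeping, not the paper's Goore Game construction; the class name and the toy reward probabilities are assumptions.

```python
import random

class BayesianTwoArmedPlayer:
    """Minimal sketch: keep Beta(alpha, beta) conjugate priors over each arm's
    unknown Bernoulli reward probability and pick arms by sampling from the
    posteriors (Thompson sampling)."""

    def __init__(self):
        # One (alpha, beta) hyperparameter pair per arm; Beta(1, 1) is uniform.
        self.alpha = [1.0, 1.0]
        self.beta = [1.0, 1.0]

    def select_arm(self):
        # Draw one sample per arm from its Beta posterior and play the larger.
        samples = [random.betavariate(self.alpha[i], self.beta[i]) for i in (0, 1)]
        return 0 if samples[0] >= samples[1] else 1

    def update(self, arm, reward):
        # Conjugate update: a reward increments alpha, a penalty increments beta.
        if reward:
            self.alpha[arm] += 1.0
        else:
            self.beta[arm] += 1.0

# Toy environment with hidden reward probabilities (illustrative values only).
true_p = [0.35, 0.6]
player = BayesianTwoArmedPlayer()
for _ in range(1000):
    arm = player.select_arm()
    player.update(arm, random.random() < true_p[arm])
print("posterior means:",
      [player.alpha[i] / (player.alpha[i] + player.beta[i]) for i in (0, 1)])
```

As the posteriors sharpen, the sampled values concentrate on the better arm, which is how exploration gradually turns into exploitation in the scheme described above.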
International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems | 2013
Morten Goodwin; Ole-Christoffer Granmo; Jaziar Radianti; Parvaneh Sarshar; Sondre Glimsdal
An emergency requiring evacuation is a chaotic event filled with uncertainties, both for the people affected and for rescuers. The evacuees are often left to themselves to navigate to the escape area. The chaos increases when a predefined escape route is blocked by a hazard and a rethink of which escape route is safest becomes necessary. This paper addresses automatically finding the safest escape route in emergency situations in large buildings or ships with imperfect knowledge of the hazards. The proposed solution, based on Ant Colony Optimisation, suggests a near-optimal escape plan for every affected person, considering both the dynamic spread of hazards and congestion avoidance. The solution can be used both on an individual basis, such as from the personal smartphone of one of the evacuees, and from a remote location by emergency personnel trying to assist large groups.
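To make the approach concrete, here is a minimal Ant Colony Optimisation sketch for route selection on a small, hand-made graph. The graph, edge costs (standing in for hazard and congestion risk), and parameter values are illustrative assumptions, not taken from the paper.

```python
import random

# Minimal ACO sketch: ants repeatedly build candidate escape routes, and
# pheromone accumulates on the edges of cheap (safer) routes.
graph = {
    "room":   {"hall": 1.0, "stairs": 4.0},
    "hall":   {"stairs": 1.0, "lobby": 2.5},
    "stairs": {"lobby": 1.0},
    "lobby":  {"exit": 1.0},
}
pheromone = {(u, v): 1.0 for u, nbrs in graph.items() for v in nbrs}

def construct_path(start="room", goal="exit", alpha=1.0, beta=2.0):
    path, node = [start], start
    while node != goal:
        choices = [(v, (pheromone[(node, v)] ** alpha) * ((1.0 / c) ** beta))
                   for v, c in graph.get(node, {}).items() if v not in path]
        if not choices:
            return None                     # dead end: abandon this ant
        total = sum(w for _, w in choices)
        r, acc = random.uniform(0, total), 0.0
        for v, w in choices:
            acc += w
            if r <= acc:
                node = v
                break
        path.append(node)
    return path

def path_cost(path):
    return sum(graph[u][v] for u, v in zip(path, path[1:]))

best = None
for _ in range(100):                        # iterations
    paths = [p for p in (construct_path() for _ in range(10)) if p]
    for key in pheromone:                   # evaporation
        pheromone[key] *= 0.9
    for p in paths:                         # deposit, favouring low-cost routes
        for edge in zip(p, p[1:]):
            pheromone[edge] += 1.0 / path_cost(p)
    for p in paths:
        if best is None or path_cost(p) < path_cost(best):
            best = p
print("suggested escape route:", best)
```

In a real deployment the edge costs would be refreshed as the hazard spreads and congestion changes, which is what lets the colony re-route around blocked corridors.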
International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems | 2013
Ole-Christoffer Granmo; Jaziar Radianti; Morten Goodwin; Julie Dugdale; Parvaneh Sarshar; Sondre Glimsdal; Jose J. Gonzalez
Managing the uncertainties that arise in disasters - such as a ship fire - can be extremely challenging. Previous work has typically focused either on modeling crowd behavior or on hazard dynamics, targeting fully known environments. However, when a disaster strikes, uncertainty about the nature, extent and further development of the hazard is the rule rather than the exception. Additionally, crowd and hazard dynamics are both intertwined and uncertain, making evacuation planning extremely difficult. To address this challenge, we propose a novel spatio-temporal probabilistic model that integrates crowd and hazard dynamics, using a ship fire as a proof-of-concept scenario. The model is realized as a dynamic Bayesian network (DBN), supporting distinct kinds of crowd evacuation behavior - both descriptive and normative (optimal). Descriptive modeling is based on studies of physical fire models, crowd psychology models, and corresponding flow models, while we identify optimal behavior using Ant-Based Colony Optimization (ACO). Simulation results demonstrate that the DBN model allows us to track and forecast the movement of people until they escape, as the hazard develops from time step to time step. Furthermore, the ACO provides safe paths, dynamically responding to current threats.
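As an illustration of the two-slice updates such a dynamic Bayesian network performs, the sketch below propagates a joint belief over a hazard variable and a crowd-location variable through a few time steps. The states, transition probabilities, and variable names are invented for the example and are not the paper's model.

```python
import numpy as np

hazard_states = ["low", "high"]
location_states = ["cabin", "corridor", "deck"]

# P(H_t | H_{t-1}): the hazard tends to persist and grow (illustrative numbers).
hazard_trans = np.array([[0.8, 0.2],
                         [0.0, 1.0]])

# P(L_t | L_{t-1}, H_t), indexed as move_trans[h][l_prev][l_next]:
# people move toward the deck faster when the hazard is high.
move_trans = np.array([
    [[0.6, 0.4, 0.0],   # hazard low
     [0.0, 0.5, 0.5],
     [0.0, 0.0, 1.0]],
    [[0.2, 0.8, 0.0],   # hazard high
     [0.0, 0.2, 0.8],
     [0.0, 0.0, 1.0]],
])

# Joint belief over (hazard, location); start in a cabin with a low hazard.
belief = np.zeros((2, 3))
belief[0, 0] = 1.0

for t in range(1, 6):
    new_belief = np.zeros_like(belief)
    for h_prev in range(2):
        for l_prev in range(3):
            for h in range(2):
                for l in range(3):
                    new_belief[h, l] += (belief[h_prev, l_prev]
                                         * hazard_trans[h_prev, h]
                                         * move_trans[h, l_prev, l])
    belief = new_belief
    print(f"t={t}  P(on deck)={belief[:, 2].sum():.3f}  "
          f"P(hazard high)={belief[1, :].sum():.3f}")
```

The paper's model adds spatial structure, evidence, and the ACO-derived normative behavior on top of this kind of forward propagation; the sketch only shows the basic filtering step.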
International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems | 2011
Ole-Christoffer Granmo; Sondre Glimsdal
The two-armed bandit problem is a classical optimization problem where a decision maker sequentially pulls one of two arms attached to a gambling machine, with each pull resulting in a random reward. The reward distributions are unknown, and thus one must balance exploiting existing knowledge about the arms against obtaining new information. Bandit problems are particularly fascinating because a large class of real-world problems, including routing, QoS control, game playing, and resource allocation, can be solved in a decentralized manner when modeled as a system of interacting gambling machines. Although computationally intractable in many cases, Bayesian methods provide a standard for optimal decision making. This paper proposes a novel scheme for decentralized decision making based on the Goore Game in which each decision maker is inherently Bayesian in nature, yet avoids computational intractability by relying simply on updating the hyperparameters of sibling conjugate priors and on random sampling from these posteriors. We further report theoretical results on the variance of the random rewards experienced by each individual decision maker. Based on these theoretical results, each decision maker is able to accelerate its own learning by taking advantage of the increasingly reliable feedback that is obtained as exploration gradually turns into exploitation in bandit-based learning. Extensive experiments demonstrate that the accelerated learning allows us to combine the benefit of conservative learning, which is high accuracy, with the benefit of hurried learning, which is fast convergence. In this manner, our scheme outperforms recently proposed Goore Game solution schemes, where one has to trade off accuracy against speed. We thus believe that our methodology opens avenues for improved performance in a number of applications of bandit-based decentralized decision making.
Applied Intelligence | 2018
Sondre Glimsdal; Ole-Christoffer Granmo
A number of intriguing decision scenarios revolve around partitioning a collection of objects to optimize some application-specific objective function. This problem is generally referred to as the Object Partitioning Problem (OPP) and is known to be NP-hard. We here consider a particularly challenging version of OPP, namely, the Stochastic On-line Equi-Partitioning Problem (SO-EPP). In SO-EPP, the target partitioning is unknown and has to be inferred purely from observing an on-line sequence of object pairs. The paired objects belong to the same partition with probability p and to different partitions with probability 1 − p, with p also being unknown. As an additional complication, the partitions are required to be of equal cardinality. Previously, only heuristic sub-optimal solution strategies have been proposed for SO-EPP. In this paper, we propose the first Bayesian solution strategy. In brief, the scheme that we propose, BN-EPP, is founded on a Bayesian network representation of SO-EPP problems. Based on probabilistic reasoning, we are not only able to infer the underlying object partitioning with superior accuracy. We are also able to simultaneously infer p, allowing us to accelerate learning as object pairs arrive. Furthermore, our scheme is the first to support a wide range of constraints on the partitioning (Constrained SO-EPP). Being Bayesian, BN-EPP provides superior performance compared to existing solution schemes. We additionally introduce Walk-BN-EPP, a novel WalkSAT-inspired algorithm for solving large-scale BN-EPP problems. Finally, we provide a BN-EPP based solution to the problem of order picking, a representative real-life application of BN-EPP.
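For intuition, the following sketch performs exact Bayesian inference on a tiny SO-EPP instance in the spirit of BN-EPP: it keeps a joint posterior over every equi-partition of four objects and a small grid of candidate values for p, updating it as noisy pairs arrive. The objects, grid, and simulation parameters are illustrative assumptions; the paper's Bayesian-network formulation and Walk-BN-EPP are not reproduced here.

```python
import random

objects = ["a", "b", "c", "d"]

def equi_partitions(objs):
    # All ways to split four objects into two groups of two.
    first = objs[0]
    for partner in objs[1:]:
        group1 = frozenset({first, partner})
        group2 = frozenset(o for o in objs if o not in group1)
        yield (group1, group2)

partitions = list(equi_partitions(objects))
p_grid = [0.55, 0.65, 0.75, 0.85, 0.95]
posterior = {(part, p): 1.0 / (len(partitions) * len(p_grid))
             for part in partitions for p in p_grid}

def update(pair):
    # Bayes rule: under a candidate partition, an observed pair has likelihood
    # p if the two objects share a group and (1 - p) otherwise.
    for (part, p), prior in posterior.items():
        same = any(pair <= group for group in part)
        posterior[(part, p)] = prior * (p if same else 1.0 - p)
    z = sum(posterior.values())
    for key in posterior:
        posterior[key] /= z

# Simulate arrivals from a hidden partition {a, b} / {c, d} with true p = 0.8.
true_part, true_p = (frozenset("ab"), frozenset("cd")), 0.8
for _ in range(30):
    if random.random() < true_p:            # same-group pair
        pair = frozenset(random.choice(true_part))
    else:                                   # cross-group pair
        pair = frozenset({random.choice(sorted(true_part[0])),
                          random.choice(sorted(true_part[1]))})
    update(pair)

best_part = max(partitions,
                key=lambda part: sum(posterior[(part, p)] for p in p_grid))
print("most probable partitioning:", [sorted(g) for g in best_part])
```

Because the posterior is joint over the partitioning and p, the same updates that reveal the partitioning also sharpen the estimate of p, which is the acceleration effect the abstract mentions.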
International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems | 2015
Robin Tollisen; Jon Vegard Jansen; Morten Goodwin; Sondre Glimsdal
Dominion is a complex game, with hidden information and stochastic elements. This makes creating any artificial intelligence (AI) challenging. To date, there is little work in the literature on AI for Dominion, and existing solutions rely upon carefully tuned finite-state solutions. This paper presents two novel AIs for Dominion based on Monte Carlo Tree Search (MCTS) methods. This is achieved by employing Upper Confidence Bounds (UCB) and Upper Confidence Bounds applied to Trees (UCT). The proposed solutions are notably better than existing work. The strongest proposal is able to win 67% of games played against a known, good finite-state solution, even when the finite-state solution has the unfair advantage of starting the game.
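The selection rule at the heart of UCT is UCB1: at each node, pick the child that maximizes average reward plus an exploration bonus. The sketch below shows that rule on three made-up candidate moves; the Dominion game logic, rollouts, and tree expansion are omitted, and the move names and win rates are assumptions.

```python
import math
import random

def ucb1_select(children, c=math.sqrt(2)):
    """children: list of dicts with 'visits' and 'total_reward' counters."""
    parent_visits = sum(child["visits"] for child in children)
    def score(child):
        if child["visits"] == 0:
            return float("inf")            # always try unvisited children first
        mean = child["total_reward"] / child["visits"]
        bonus = c * math.sqrt(math.log(parent_visits) / child["visits"])
        return mean + bonus
    return max(children, key=score)

# Toy usage: three candidate moves played against a made-up reward signal.
moves = [{"name": m, "visits": 0, "total_reward": 0.0}
         for m in ("buy", "play", "pass")]
hidden_win_rate = {"buy": 0.55, "play": 0.40, "pass": 0.20}
for _ in range(500):
    move = ucb1_select(moves)
    reward = 1.0 if random.random() < hidden_win_rate[move["name"]] else 0.0
    move["visits"] += 1
    move["total_reward"] += reward
print("most played move:", max(moves, key=lambda m: m["visits"])["name"])
```

In full MCTS this selection is applied recursively down the tree, with a random rollout at the leaf and the result backed up along the visited path.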
Artificial Intelligence Applications and Innovations | 2015
Sondre Glimsdal; Ole-Christoffer Granmo
The multi-armed bandit problem has been studied for decades. In brief, a gambler repeatedly pulls one out of N slot machine arms, randomly receiving a reward or a penalty from each pull. The aim of the gambler is to maximize the expected number of rewards received, when the probabilities of receiving rewards are unknown. Thus, the gambler must, as quickly as possible, identify the arm with the largest probability of producing rewards, compactly capturing the exploration-exploitation dilemma in reinforcement learning. In this paper we introduce a particularly challenging variant of the multi-armed bandit problem, inspired by the so-called N-Door Puzzle. In this variant, the gambler is only told whether the optimal arm lies to the “left” or to the “right” of the one pulled, with the feedback being erroneous with probability 1 − p. Our novel scheme for this problem is based on a Bayesian representation of the solution space, and combines this representation with Thompson sampling to balance exploration against exploitation. Furthermore, we introduce the possibility of traitorous environments that lie about the direction of the optimal arm (an adversarial learning problem). Empirical results show that our scheme handles both traitorous and non-traitorous environments, significantly outperforming competing algorithms.
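A minimal sketch in the spirit of this scheme is shown below: a posterior over which arm is optimal is updated from noisy directional feedback, and the next arm is chosen by Thompson sampling from that posterior. It assumes, beyond what the abstract states, that the feedback reliability p is known and that feedback is uniformly random when the pulled arm is itself the optimal one.

```python
import random

N, p = 10, 0.8
true_best = 6                              # hidden optimal arm (toy value)
posterior = [1.0 / N] * N                  # P(arm i is optimal)

def feedback(pulled):
    # Directional hint, flipped with probability 1 - p.
    if pulled == true_best:
        truth = random.choice(["left", "right"])
    else:
        truth = "left" if true_best < pulled else "right"
    if random.random() < p:
        return truth
    return "right" if truth == "left" else "left"

def update(pulled, direction):
    # Bayes rule over candidate positions of the optimal arm.
    for j in range(N):
        if j == pulled:
            likelihood = 0.5
        elif (j < pulled) == (direction == "left"):
            likelihood = p                 # feedback agrees with candidate j
        else:
            likelihood = 1.0 - p
        posterior[j] *= likelihood
    z = sum(posterior)
    for j in range(N):
        posterior[j] /= z

for _ in range(200):
    # Thompson sampling: draw a candidate optimal arm from the posterior.
    pulled = random.choices(range(N), weights=posterior)[0]
    update(pulled, feedback(pulled))

print("most probable optimal arm:", posterior.index(max(posterior)))
```

Sampling the pulled arm from the posterior, rather than always pulling the current maximum, is what keeps the scheme exploring while the posterior is still diffuse.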
International Conference on Machine Learning and Applications | 2014
Sondre Glimsdal; Ole-Christoffer Granmo
A number of intriguing decision scenarios, such as order picking, revolve around partitioning a collection of objects so as to optimize some application-specific objective function. In its general form, this problem is referred to as the Object Partitioning Problem (OPP), known to be NP-hard. We here consider a variant of OPP, namely the Stochastic Online Equi-Partitioning Problem (SO-EPP). In SO-EPP, objects arrive sequentially, in pairs. The relationship between the arriving object pairs is stochastic: they belong to the same partition with probability p. From a history of object arrivals, the goal is to predict which objects will appear together in future arrivals. As an additional complication, the partitions of related objects are required to be of equal cardinality. The decision maker, however, is not informed about the true relation between the objects; it merely observes the stream of object pairs and has to predict future behavior. Inferring the correct partitioning from historical behavior is thus a significant challenge, which becomes even more difficult when p is unknown. Previously, only heuristic sub-optimal solution strategies have been proposed for SO-EPP. In this paper, we propose the first optimal solution strategy. In brief, the scheme that we propose, BN-EPP, is founded on a Bayesian network representation of SO-EPP problems. Based on probabilistic reasoning, we are not only able to infer the correct object partitioning with optimal accuracy. We are also able to simultaneously infer p, allowing us to accelerate learning as object pairs arrive. Being optimal, BN-EPP provides superior performance compared to existing state-of-the-art solution schemes. BN-EPP is also highly flexible, being capable of encoding object partitioning constraints. Finally, BN-EPP is parameter-free: its performance does not rely on fine-tuning any parameters. As a result of these advantages, BN-EPP opens avenues for significantly improved performance in OPP-based applications.
61, [8] s. | 2013
Sondre Glimsdal
Archive | 2017
Sondre Glimsdal; Ole-Christoffer Granmo