Landon Kraemer
University of Southern Mississippi
Publications
Featured research published by Landon Kraemer.
Workshop on Parallel and Distributed Simulation | 2008
Bikramjit Banerjee; Ahmed Abukmail; Landon Kraemer
We adapt a scalable layered intelligence technique from the game industry for agent-based crowd simulation. We extend this approach to planned movements, pursuit of assignable goals, and avoidance of dynamically introduced obstacles/threats, while keeping the system scalable with the number of agents. We exploit parallel processing to expedite the pre-processing step that generates the path plans offline. We demonstrate the various behaviors in a hall-evacuation scenario, and experimentally establish the scalability of the frame rates with increasing numbers of agents.
Simulation | 2009
Bikramjit Banerjee; Ahmed Abukmail; Landon Kraemer
We adapt a scalable layered intelligence technique from the game industry for agent-based crowd simulation. We extend this approach to planned movements, pursuit of assignable goals, and avoidance of dynamically introduced obstacles/threats as well as congestion, while keeping the system scalable with the number of agents. We demonstrate the various behaviors in hall-evacuation scenarios, and experimentally establish the scalability of the frame rates with increasing numbers of agents.
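As a rough illustration of the layered idea, the sketch below precomputes a flow field of shortest paths toward an exit (standing in for the offline path plans) and has an agent follow it at runtime with a simple rule for dynamically blocked cells. The grid layout, the 4-connected neighborhood, and all function names are assumptions for illustration only, not the authors' implementation.

```python
from collections import deque

def compute_flow_field(grid, exits):
    """BFS outward from the exit cells; each free cell stores the next cell
    on a shortest path toward the nearest exit (the offline 'path plan')."""
    next_cell = {e: e for e in exits}
    frontier = deque(exits)
    while frontier:
        x, y = cell = frontier.popleft()
        for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if grid.get(nb) == '.' and nb not in next_cell:
                next_cell[nb] = cell
                frontier.append(nb)
    return next_cell

def step_agent(pos, next_cell, blocked):
    """Follow the precomputed plan unless the next cell is dynamically
    blocked; in that case wait (a stand-in for local re-routing)."""
    target = next_cell.get(pos, pos)
    return target if target not in blocked else pos

# Tiny hall: '.' is walkable, '#' is a wall, 'E' marks the exit at (4, 0).
layout = ["....E",
          ".##..",
          "....."]
grid = {(x, y): ('.' if ch in '.E' else '#')
        for y, row in enumerate(layout) for x, ch in enumerate(row)}
plan = compute_flow_field(grid, exits=[(4, 0)])

agent = (0, 2)
blocked = {(2, 2)}            # a dynamically blocked cell the wait rule would avoid
for _ in range(8):
    agent = step_agent(agent, plan, blocked)
print(agent)                  # (4, 0): the agent has reached the exit
```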
Neurocomputing | 2016
Landon Kraemer; Bikramjit Banerjee
Decentralized partially observable Markov decision processes (Dec-POMDPs) are a powerful tool for modeling multi-agent planning and decision-making under uncertainty. Prevalent Dec-POMDP solution techniques require centralized computation given full knowledge of the underlying model. Multi-agent reinforcement learning (MARL) based approaches have been recently proposed for distributed solution of Dec-POMDPs without full prior knowledge of the model, but these methods assume that conditions during learning and policy execution are identical. In some practical scenarios this may not be the case. We propose a novel MARL approach in which agents are allowed to rehearse with information that will not be available during policy execution. The key is for the agents to learn policies that do not explicitly rely on these rehearsal features. We also establish a weak convergence result for our algorithm, RLaR, demonstrating that RLaR converges in probability when certain conditions are met. We show experimentally that incorporating rehearsal features can enhance the learning rate compared to non-rehearsal-based learners, and demonstrate fast, near-optimal performance on many existing benchmark Dec-POMDP problems. We also compare RLaR against an existing approximate Dec-POMDP solver which, like RLaR, does not assume a priori knowledge of the model. While RLaR's policy representation is not as scalable, we show that RLaR produces higher quality policies for most problems and horizons studied.
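The sketch below illustrates the rehearsal idea in its simplest tabular form: during learning the agent conditions its value estimates on a rehearsal feature (here, the hidden state) that will not be available at execution time, and also records how often each hidden state co-occurs with each observation; the executable policy then averages the rehearsal-conditioned values under that learned distribution. The toy environment, the tabular representation, and the averaging rule are illustrative assumptions, not the RLaR algorithm as published.

```python
import random
from collections import defaultdict

ACTIONS = ["left", "right"]

class TinyHiddenEnv:
    """Two hidden states; the observation is a noisy hint about the state."""
    def reset(self):
        self.s = random.choice([0, 1])
        o = self.s if random.random() < 0.8 else 1 - self.s
        return self.s, o
    def step(self, a):
        r = 1.0 if (a == "left") == (self.s == 0) else -1.0
        return self.s, 0, r, True            # one-step episodes for brevity

def rehearsal_q_learning(env, episodes=2000, alpha=0.1, gamma=0.95, eps=0.1):
    Q = defaultdict(float)                   # Q[(s, o, a)] uses the rehearsal feature s
    counts = defaultdict(lambda: defaultdict(int))   # counts[o][s]: hidden state given o
    for _ in range(episodes):
        s, o = env.reset()
        done = False
        while not done:
            counts[o][s] += 1
            a = (random.choice(ACTIONS) if random.random() < eps
                 else max(ACTIONS, key=lambda x: Q[(s, o, x)]))
            s2, o2, r, done = env.step(a)
            target = r if done else r + gamma * max(Q[(s2, o2, x)] for x in ACTIONS)
            Q[(s, o, a)] += alpha * (target - Q[(s, o, a)])
            s, o = s2, o2
    return Q, counts

def execution_policy(Q, counts, o):
    """At execution time the hidden state is unavailable, so average the
    rehearsal-conditioned Q-values under the learned state distribution."""
    total = sum(counts[o].values()) or 1
    def expected_q(a):
        return sum(n * Q[(s, o, a)] for s, n in counts[o].items()) / total
    return max(ACTIONS, key=expected_q)

env = TinyHiddenEnv()
Q, counts = rehearsal_q_learning(env)
print(execution_policy(Q, counts, o=0))      # likely "left": state 0 is most probable under o=0
```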
Agents for Games and Simulations II | 2011
Bikramjit Banerjee; Landon Kraemer
We present a novel automated technique for the quantitative validation and comparison of multi-agent based crowd egress simulation systems. Despite much progress in the simulation technology itself, little attention has been accorded to the problem of validating these systems against reality. Previous approaches focused on local (spatial or temporal) crowd patterns, and either resorted to visual comparison (e.g., the U-shaped crowd at bottlenecks) or relied on ad hoc applications of measures such as egress rates and densities to compare with reality. To the best of our knowledge, we offer the first systematic and unified approach to validate the global performance of a multi-agent based crowd egress simulation system. We employ this technique to evaluate a multi-agent based crowd egress simulation system that we have also recently developed, and compare two different simulation technologies in this system.
Autonomous Agents and Multi-Agent Systems | 2015
Bikramjit Banerjee; Jeremy Lyle; Landon Kraemer
Multi-agent plan recognition (MAPR) seeks to identify the dynamic team structures and team plans from observations of the action sequences of a set of intelligent agents, based on a library of known team plans (plan library) and an evaluation function. It has important applications in decision support, teamwork, analyzing data from automated monitoring, surveillance, and intelligence analysis in general. We introduce a general model for MAPR that accommodates different representations of the plan library, and includes single agent plan recognition as a special case. Thus, it provides an ideal substrate to investigate and contrast the complexities of single and multi-agent plan recognition. Using this model we generate theoretical insights on hardness, with practical implications. A key feature of these results is that they are baseline, i.e., the polynomial solvability results are given in terms of a compact and expressive plan language (context free language), while the hardness results are given in terms of a less compact language. Consequently, the hardness results continue to hold in virtually all realistic plan languages, while the polynomial solvability results extend to the subsets of the context free plan language. In particular, we show that MAPR is in P (polynomial in the size of the plan library and the observation trace) if the number of agents is fixed (in particular 1) but NP-complete otherwise. If the number of agents is a variable, then even the one-step MAPR problem is NP-complete. While these results pertain to abduction, we also investigate a related question: adaptation, i.e., the problem of refining the evaluation function based on feedback. We show that adaptation is also NP-hard for a variable number of agents, but easy for a single agent. These results establish a clear distinction between the hardness of single and multi-agent plan recognition even in idealized settings, indicating the necessity of a fundamentally different set of techniques for the latter.
ACM Transactions on Autonomous and Adaptive Systems | 2015
Landon Kraemer; Bikramjit Banerjee
Decentralized partially observable Markov decision processes (Dec-POMDPs) offer a formal model for planning in cooperative multiagent systems where agents operate with noisy sensors and actuators, as well as local information. Prevalent solution techniques are centralized and model-based, limitations that we address with distributed reinforcement learning (RL). We particularly favor alternate learning, where agents alternately learn best responses to each other, which appears to outperform concurrent RL. However, alternate learning requires an initial policy. We propose two principled approaches to generating informed initial policies: a naive approach, and a more sophisticated approach that builds on it. We empirically demonstrate that the refined approach produces near-optimal solutions in many challenging benchmark settings, staking a claim to being an efficient (and realistic) approximate solver in its own right. Furthermore, alternate best response learning seeded with such policies quickly learns high-quality policies as well.
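The sketch below illustrates why the seeding matters, using alternate best responses on a tiny cooperative matrix game as a stand-in for a Dec-POMDP: from a poor initial joint policy the process settles on a local optimum, while an informed seed reaches the global optimum. The payoff matrix, seeding, and stopping rule are assumptions for illustration only, not the paper's initialization methods.

```python
# Joint reward for (action of agent 0, action of agent 1) in a cooperative game.
PAYOFF = [[4, 0],
          [0, 7]]

def best_response(payoff, other_action, agent):
    """Best action for `agent` when the other agent's action is held fixed."""
    if agent == 0:
        return max((0, 1), key=lambda a: payoff[a][other_action])
    return max((0, 1), key=lambda a: payoff[other_action][a])

def alternate_learning(payoff, initial, rounds=10):
    actions = list(initial)                    # the (possibly informed) starting policies
    for _ in range(rounds):
        old = tuple(actions)
        for agent in (0, 1):                   # agents take turns improving
            actions[agent] = best_response(payoff, actions[1 - agent], agent)
        if tuple(actions) == old:              # no agent changed: converged
            break
    return tuple(actions)

print(alternate_learning(PAYOFF, initial=(0, 0)))   # (0, 0): stuck at the local optimum worth 4
print(alternate_learning(PAYOFF, initial=(0, 1)))   # (1, 1): an informed seed reaches the optimum worth 7
```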
Infotech@Aerospace 2011 | 2011
Bikramjit Banerjee; Landon Kraemer; Wanda Solano
We present a novel technique for anomaly detection and prognosis in sensor data from rocket engine test stands. We apply a combination of particle filtering and machine learning approaches to capture the model of nominal operations, and use voting techniques in conjunction with particle filtering to detect anomalies in test runs. We use two approaches, pure particle filtering and pure machine learning, for prognosis. Our experiments on test stand sensor data show successful detection of a known anomaly in the test data, while producing almost no false positives. Both prognostic approaches, however, predict no further impact had the test been completed, perhaps indicating that the anomaly was innocuous. We present the application of two well-known AI techniques, Bayesian filtering and machine learning, to the problems of diagnosis and prognosis of anomalies in sensor network data. Our objective was to develop a system that post-processes a CSV file showing the sensor readings and activities (time series) from a rocket engine test, detects any anomalies that might have occurred during the test, and predicts the future evolution of these (and other) anomalies had the test been allowed to continue. The output was required to be in the form of the names of the sensors that show anomalous behavior, and the start and end times of each anomaly, both diagnosed and predicted. Since our approach was model-based, we needed to automatically learn a model of nominal behavior from tests that were marked nominal. In this paper, we describe this system and show experimental results that demonstrate that it has successfully detected a known anomaly in a given test stand data set, and delivered a prognosis that matches the broad conclusion of the test engineers. The paper is organized as follows. In section III we present the theoretical background, viz., dynamic Bayesian networks and the particle filtering framework that underlies our approach. In section IV we present our anomaly detection model and explain how it is tied to the particle filtering framework. In section V we describe the anomaly detection module in detail, and present the experimental results in section V.D. In section VI we present the prognosis module and mention the experimental results. We present our conclusions and future work in section VII.
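A minimal sketch of the particle-filtering component is given below: particles are propagated under a nominal dynamics model, weighted by the likelihood of each sensor reading, and a reading is flagged as anomalous when the nominal model cannot explain it. The random-walk dynamics, the thresholds, and the injected fault are illustrative assumptions, not the system described in the paper.

```python
import math
import random

def gaussian(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def detect_anomalies(readings, n_particles=500, proc_noise=0.05, obs_noise=0.2,
                     threshold=1e-3):
    particles = [readings[0]] * n_particles
    flags = []
    for z in readings:
        # Propagate particles under the nominal (slow random walk) dynamics.
        particles = [p + random.gauss(0.0, proc_noise) for p in particles]
        weights = [gaussian(z, p, obs_noise) for p in particles]
        avg_likelihood = sum(weights) / n_particles
        flags.append(avg_likelihood < threshold)   # reading unexplained by the nominal model
        # Resample; if every weight vanishes, reinitialize at the current reading.
        if sum(weights) > 0:
            particles = random.choices(particles, weights=weights, k=n_particles)
        else:
            particles = [z] * n_particles
    return flags

nominal = [1.0 + 0.01 * t for t in range(50)]
faulty = nominal[:40] + [v + 3.0 for v in nominal[40:]]   # an injected step fault at t = 40
flags = detect_anomalies(faulty)
print([t for t, f in enumerate(flags) if f])   # flagged from t = 40 onward: the fault persists
```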
Advances in Complex Systems | 2011
Bikramjit Banerjee; Landon Kraemer
The design of reinforcement learning solutions to many problems artificially constrains the action set available to an agent, in order to limit the exploration/sample complexity. While exploring, if an agent can discover new actions that break through the constraints of its basic/atomic action set, then the quality of the learned decision policy could improve. On the flip side, considering all possible non-atomic actions might explode the exploration complexity. We present a novel heuristic solution to this dilemma, and empirically evaluate it in grid navigation tasks. In particular, we show that both the solution quality and the sample complexity improve significantly when basic reinforcement learning is coupled with action discovery. Our approach relies on reducing the number of decision points, which is particularly suited for multi-agent coordination learning, since agents tend to learn more easily with fewer coordination problems (CPs). To demonstrate this, we extend action discovery to multi-agent reinforcement learning. We show that Joint Action Learners (JALs) indeed learn coordination policies of higher quality with lower sample complexity when coupled with action discovery, in a multi-agent box-pushing task.
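The sketch below gives a rough sense of the trade-off: tabular Q-learning on a short corridor, once with only atomic single-step actions and once with a "discovered" non-atomic action that covers three cells, which reduces the number of decision points the learner faces. The corridor task, the hand-supplied macro, and the parameters are assumptions for illustration only, not the action-discovery heuristic of the paper.

```python
import random

LENGTH, GOAL = 10, 9   # corridor states 0..9; the agent starts at 0 and must reach 9

def run(actions, episodes=200, alpha=0.2, gamma=0.95, eps=0.2):
    """Tabular Q-learning; returns the total number of decision points used."""
    Q = {(s, a): 0.0 for s in range(LENGTH) for a in actions}
    decisions = 0
    for _ in range(episodes):
        s = 0
        while s != GOAL:
            decisions += 1
            a = (random.choice(actions) if random.random() < eps
                 else max(actions, key=lambda x: Q[(s, x)]))
            step = -1 if a == "left" else (1 if a == "right" else 3)
            s2 = min(GOAL, max(0, s + step))
            r = 1.0 if s2 == GOAL else -0.01
            best_next = 0.0 if s2 == GOAL else max(Q[(s2, x)] for x in actions)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s2
    return decisions

atomic = ["left", "right"]
with_macro = atomic + ["right_x3"]       # a "discovered" non-atomic action covering 3 cells
print(run(atomic), run(with_macro))      # the macro run typically needs far fewer decision points
```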
National Conference on Artificial Intelligence | 2010
Bikramjit Banerjee; Landon Kraemer; Jeremy Lyle
National Conference on Artificial Intelligence | 2012
Bikramjit Banerjee; Jeremy Lyle; Landon Kraemer; Rajesh Yellamraju