Is this you? Create Your Porfile

Adrian Šošić

Technische Universität Darmstadt

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Adrian Šošić is active.

Explore More

Publication

Featured researches published by Adrian Šošić.

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2018

A Bayesian Approach to Policy Recognition and State Representation Learning

Adrian Šošić; Abdelhak M. Zoubir; Heinz Koeppl

Learning from demonstration (LfD) is the process of building behavioral models of a task from demonstrations provided by an expert. These models can be used, e.g., for system control by generalizing the expert demonstrations to previously unencountered situations. Most LfD methods, however, make strong assumptions about the expert behavior, e.g., they assume the existence of a deterministic optimal ground truth policy or require direct monitoring of the experts controls, which limits their practical use as part of a general system identification framework. In this work, we consider the LfD problem in a more general setting where we allow for arbitrary stochastic expert policies, without reasoning about the optimality of the demonstrations. Following a Bayesian methodology, we model the full posterior distribution of possible expert controllers that explain the provided demonstration data. Moreover, we show that our methodology can be applied in a nonparametric context to infer the complexity of the state representation used by the expert, and to learn task-appropriate partitionings of the system state space.

international conference on acoustics, speech, and signal processing | 2016

Policy recognition via expectation maximization

Adrian Šošić; Abdelhak M. Zoubir; Heinz Koeppl

Learning from Demonstrations (LfD) has proven to be a powerful concept for solving optimal control problems in high-dimensional state spaces where demonstrations can be used to facilitate the search for efficient control policies. However, many existing LfD approaches suffer from either theoretical, practical, or computational drawbacks such as the need to learn a latent reward model, to monitor the experts controls, or to repeatedly solve potentially demanding planning problems. In this work, we consider the LfD objective from a system identification perspective and propose a probabilistic policy recognition framework based on expectation maximization that operates directly on the observed expert trajectories, avoiding the aforementioned problems. Using a spatial prior over policies, we are able to make accurate predictions in regions of the state space that are scarcely explored.

international conference on artificial neural networks | 2014

sNN-LDS: Spatio-temporal Non-negative Sparse Coding for Human Action Recognition

Thomas Guthier; Adrian Šošić; Volker Willert; Julian Eggert

Current state-of-the-art approaches for visual human action recognition focus on complex local spatio-temporal descriptors, while the spatio-temporal relations between the descriptors are discarded. These bag-of-features (BOF) based methods come with the disadvantage of limited descriptive power, because class-specific mid- and large-scale spatio-temporal information, such as body pose sequences, cannot be represented. To overcome this restriction, we propose sparse non-negative linear dynamical systems (sNN-LDS) as a dynamic, parts-based, spatio-temporal representation of local descriptors. We provide novel learning rules based on sparse non-negative matrix factorization (sNMF) to simultaneously learn both the parts as well as their transitions. On the challenging UCF-Sports dataset our sNN-LDS combined with simple local features is competitive with state-of-the-art BOF-SVM methods.

international conference on swarm intelligence | 2018

Local Communication Protocols for Learning Complex Swarm Behaviors with Deep Reinforcement Learning

Maximilian Hüttenrauch; Adrian Šošić; Gerhard Neumann

Swarm systems constitute a challenging problem for reinforcement learning (RL) as the algorithm needs to learn decentralized control policies that can cope with limited local sensing and communication abilities of the agents. While it is often difficult to directly define the behavior of the agents, simple communication protocols can be defined more easily using prior knowledge about the given task. In this paper, we propose a number of simple communication protocols that can be exploited by deep reinforcement learning to find decentralized control policies in a multi-robot swarm environment. The protocols are based on histograms that encode the local neighborhood relations of the agents and can also transmit task-specific information, such as the shortest distance and direction to a desired target. In our framework, we use an adaptation of Trust Region Policy Optimization to learn complex collaborative tasks, such as formation building and building a communication link. We evaluate our findings in a simulated 2D-physics environment, and compare the implications of different communication protocols.

Swarm Intelligence | 2018

Reinforcement learning in a continuum of agents

Adrian Šošić; Abdelhak M. Zoubir; Heinz Koeppl

We present a decision-making framework for modeling the collective behavior of large groups of cooperatively interacting agents based on a continuum description of the agents’ joint state. The continuum model is derived from an agent-based system of locally coupled stochastic differential equations, taking into account that each agent in the group is only partially informed about the global system state. The usefulness of the proposed framework is twofold: (i) for multi-agent scenarios, it provides a computational approach to handling large-scale distributed decision-making problems and learning decentralized control policies. (ii) For single-agent systems, it offers an alternative approximation scheme for evaluating expectations of state distributions. We demonstrate our framework on a variant of the Kuramoto model using a variety of distributed control tasks, such as positioning and aggregation. As part of our experiments, we compare the effectiveness of the controllers learned by the continuum model and agent-based systems of different sizes, and we analyze how the degree of observability in the system affects the learning process.

Swarm Intelligence | 2018

Correction to: Reinforcement learning in a continuum of agents

Adrian Šošić; Abdelhak M. Zoubir; Heinz Koeppl

The original version of this article unfortunately contained a mistake. The presentation of Equation (21) was incorrect. The corrected equation is given below.

adaptive agents and multi agents systems | 2017