Martin Bäuml

Karlsruhe Institute of Technology

Publication

Featured research published by Martin Bäuml.

computer vision and pattern recognition | 2012

Improving foreground segmentations with probabilistic superpixel Markov random fields

Alexander Schick; Martin Bäuml; Rainer Stiefelhagen

We propose a novel post-processing framework to improve foreground segmentations with the use of Probabilistic Superpixel Markov Random Fields. First, we convert a given pixel-based segmentation into a probabilistic superpixel representation. Based on these probabilistic superpixels, a Markov random field exploits structural information and similarities to improve the segmentation. We evaluate our approach on all categories of the Change Detection 2012 dataset. Our approach improves all performance measures simultaneously for eight different basis foreground segmentation algorithms.
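The first step, converting a pixel-based segmentation into a probabilistic superpixel representation, can be illustrated with a small sketch. The abstract does not specify how the probabilities are computed, so the version below assumes the simplest choice: each superpixel's foreground probability is the fraction of its pixels that the input mask marks as foreground. The function name and the toy data are hypothetical.

```python
from collections import defaultdict

def superpixel_foreground_probabilities(superpixel_labels, foreground_mask):
    """Return {superpixel_id: fraction of its pixels marked as foreground}.

    Assumption: probability = foreground pixel count / total pixel count.
    Both inputs are 2D lists of the same shape.
    """
    totals = defaultdict(int)
    foreground = defaultdict(int)
    for label_row, mask_row in zip(superpixel_labels, foreground_mask):
        for label, is_foreground in zip(label_row, mask_row):
            totals[label] += 1
            if is_foreground:
                foreground[label] += 1
    return {label: foreground[label] / totals[label] for label in totals}

# Toy 2x4 image split into two superpixels (ids 0 and 1).
labels = [[0, 0, 1, 1],
          [0, 0, 1, 1]]
mask = [[1, 1, 0, 1],
        [1, 0, 0, 0]]
probabilities = superpixel_foreground_probabilities(labels, mask)
```

In this toy case superpixel 0 is mostly foreground and superpixel 1 mostly background, so a later smoothing step over superpixels could plausibly correct the isolated disagreeing pixels.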

computer vision and pattern recognition | 2013

Semi-supervised Learning with Constraints for Person Identification in Multimedia Data

Martin Bäuml; Makarand Tapaswi; Rainer Stiefelhagen

We address the problem of person identification in TV series. We propose a unified learning framework for multi-class classification which incorporates labeled and unlabeled data, and constraints between pairs of features in the training. We apply the framework to train multinomial logistic regression classifiers for multi-class face recognition. The method is completely automatic, as the labeled data is obtained by tagging speaking faces using subtitles and fan transcripts of the videos. We demonstrate our approach on six episodes each of two diverse TV series and achieve state-of-the-art performance.

computer vision and pattern recognition | 2012

“Knock! Knock! Who is it?” probabilistic person identification in TV-series

Makarand Tapaswi; Martin Bäuml; Rainer Stiefelhagen

We describe a probabilistic method for identifying characters in TV series or movies. We aim at labeling every character appearance, and not only those where a face can be detected. Consequently, our basic unit of appearance is a person track (as opposed to a face track). We model each TV series episode as a Markov Random Field, integrating face recognition, clothing appearance, speaker recognition and contextual constraints in a probabilistic manner. The identification task is then formulated as an energy minimization problem. In order to identify tracks without faces, we learn clothing models by adapting available face recognition results. Within a scene, as indicated by prior analysis of the temporal structure of the TV series, clothing features are combined by agglomerative clustering. We evaluate our approach on the first 6 episodes of The Big Bang Theory and achieve an absolute improvement of 20% for person identification and 12% for face recognition.
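As a rough illustration of identification as energy minimization, the sketch below scores a joint labeling of person tracks with per-track costs plus a penalty when two co-occurring tracks receive the same name. The cost values, the names "A" and "B", the penalty weight, and the brute-force search are all assumptions made for the example; the abstract does not state the actual energy terms or the solver.

```python
from itertools import product

def energy(labels, unary, conflicts, penalty=10.0):
    """Total cost of a joint labeling.

    unary[t][name]: cost of giving track t that name (lower is better).
    conflicts: pairs of track indices that appear on screen together.
    """
    total = sum(unary[t][name] for t, name in enumerate(labels))
    total += sum(penalty for a, b in conflicts if labels[a] == labels[b])
    return total

def minimize_energy(unary, names, conflicts, penalty=10.0):
    """Exhaustive search over all labelings; only feasible for toy problems."""
    return min(product(names, repeat=len(unary)),
               key=lambda labels: energy(labels, unary, conflicts, penalty))

# Hypothetical costs for two tracks and two candidate names.
unary = [
    {"A": 0.2, "B": 1.5},  # strong cue, e.g. a confident face result
    {"A": 0.6, "B": 0.9},  # weak cue, e.g. a track without a visible face
]
conflicts = [(0, 1)]  # both tracks are visible at the same time
best_labels = minimize_energy(unary, ["A", "B"], conflicts)
```

Taken alone, the second track slightly prefers "A", but the co-occurrence constraint pushes it to "B". This is the kind of contextual correction that a joint model can provide and independent per-track decisions cannot.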

advanced video and signal based surveillance | 2011

Evaluation of local features for person re-identification in image sequences

Martin Bäuml; Rainer Stiefelhagen

In this paper we present a comparative study of local features for the task of person (re-)identification. A combination of state-of-the-art interest point detectors and descriptors is evaluated. The experiments are performed on a novel dataset which we make publicly available for future research in this area. The results indicate that there are significant differences between the evaluated descriptors, with GLOH and SIFT outperforming both Shape Context and SURF descriptors. The evaluated interest point detectors perform equally well, with a slight advantage for the Hessian-Laplace detector. The Harris-Affine and Hessian-Affine affine invariant region detectors do not provide any performance advantage and therefore do not justify their additional computational expense.

advanced video and signal based surveillance | 2010

Multi-pose Face Recognition for Person Retrieval in Camera Networks

Martin Bäuml; Keni Bernardin; Mika Fischer; Hazım Kemal Ekenel; Rainer Stiefelhagen

In this paper, we study the use of facial appearance features for the re-identification of persons using distributed camera networks in a realistic surveillance scenario. In contrast to features commonly used for person re-identification, such as whole body appearance, facial features offer the advantage of remaining stable over much larger intervals of time. The challenge in using faces for such applications, apart from low captured face resolutions, is that their appearance across camera sightings is largely influenced by lighting and viewing pose. Here, a number of techniques to address these problems are presented and evaluated on a database of surveillance-type recordings. A system for online capture and interactive retrieval is presented that allows searching for sightings of particular persons in the video database. Evaluation results are presented on surveillance data recorded with four cameras over several days. A mean average precision of 0.60 was achieved for inter-camera retrieval using just a single track as query set, and up to 0.86 after relevance feedback by an operator.
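For readers unfamiliar with the reported metric, the following sketch shows one common way to compute mean average precision over ranked retrieval results. It assumes each ranked list contains every relevant item for its query, so average precision is normalized by the number of relevant items found; the paper's exact evaluation protocol is not given in the abstract, and the example rankings are made up.

```python
def average_precision(ranked_relevance):
    """Average precision for one query.

    ranked_relevance: booleans in ranked order, True where the retrieved
    track shows the queried person.
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this cut-off
    return precision_sum / hits if hits else 0.0

def mean_average_precision(queries):
    """Mean of the per-query average precision values."""
    return sum(average_precision(q) for q in queries) / len(queries)

# Two hypothetical queries with their ranked results.
example_queries = [[True, False, True], [False, True]]
example_map = mean_average_precision(example_queries)
```

Relevance feedback, as mentioned in the abstract, would change the ranked lists themselves; the metric is computed the same way before and after.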

advanced video and signal based surveillance | 2011

Part-based clothing segmentation for person retrieval

Michael Weber; Martin Bäuml

Recent advances have shown that clothing appearance provides important features for person re-identification and retrieval in surveillance and multimedia data. However, the regions from which such features are extracted are usually only very crudely segmented, due to the difficulty of segmenting highly articulated entities such as persons. In order to overcome the problem of unconstrained poses, we propose a segmentation approach based on a large number of part detectors. Our approach is able to separately segment a person's upper and lower clothing regions, taking into account the person's body pose. We evaluate our approach on the task of character retrieval on a new challenging data set and present promising results.

computer vision and pattern recognition | 2015

Book2Movie: Aligning video scenes with book chapters

Makarand Tapaswi; Martin Bäuml; Rainer Stiefelhagen

Film adaptations of novels often visually display in a few shots what is described in many pages of the source novel. In this paper we present a new problem: to align book chapters with video scenes. Such an alignment facilitates finding differences between the adaptation and the original source, and also acts as a basis for deriving rich descriptions from the novel for the video clips. We propose an efficient method to compute an alignment between book chapters and video scenes using matching dialogs and character identities as cues. A major consideration is to allow the alignment to be non-sequential. Our suggested shortest path based approach deals with the non-sequential alignments and can be used to determine whether a video scene was part of the original book. We create a new data set involving two popular novel-to-film adaptations with widely varying properties and compare our method against other text-to-video alignment baselines. Using the alignment, we present a qualitative analysis of describing the video through rich narratives obtained from the novel.

computer vision and pattern recognition | 2014

StoryGraphs: Visualizing Character Interactions as a Timeline

Makarand Tapaswi; Martin Bäuml; Rainer Stiefelhagen

We present a novel way to automatically summarize and represent the storyline of a TV episode by visualizing character interactions as a chart. We also propose a scene detection method that lends itself well to generating over-segmented scenes, which are used to partition the video. The positioning of character lines in the chart is formulated as an optimization problem which trades off between the aesthetics and functionality of the chart. Using automatic person identification, we present StoryGraphs for 3 diverse TV series encompassing a total of 22 episodes. We define quantitative criteria to evaluate StoryGraphs and also compare them against episode summaries to evaluate their ability to provide an overview of the episode.

international conference on multimedia retrieval | 2014

Story-based Video Retrieval in TV series using Plot Synopses

Makarand Tapaswi; Martin Bäuml; Rainer Stiefelhagen

We present a novel approach to search for plots in the storyline of structured videos such as TV series. To this end, we propose to align natural language descriptions of the videos, such as plot synopses, with the corresponding shots in the video. Guided by subtitles and person identities, the alignment problem is formulated as an optimization task over all possible assignments and solved efficiently using dynamic programming. We evaluate our approach on a novel dataset comprising the complete season 5 of Buffy the Vampire Slayer, and show good alignment performance and the ability to retrieve plots in the storyline.
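The dynamic programming idea can be sketched as follows, under simplifying assumptions that are not taken from the paper: every shot is assigned to exactly one synopsis sentence, sentence indices may not decrease as the shots progress, and the objective is the sum of a given sentence-to-shot similarity score. The similarity values and the function name are hypothetical.

```python
def align_shots_to_sentences(similarity):
    """Monotone alignment maximizing total similarity.

    similarity[i][j]: score between sentence i and shot j.
    Returns (assignment, score), where assignment[j] is the sentence index
    for shot j and is non-decreasing in j.
    """
    n_sentences = len(similarity)
    n_shots = len(similarity[0])
    # best[j][i]: best score for shots 0..j with shot j assigned to sentence i.
    best = [[0.0] * n_sentences for _ in range(n_shots)]
    back = [[0] * n_sentences for _ in range(n_shots)]
    for i in range(n_sentences):
        best[0][i] = similarity[i][0]
    for j in range(1, n_shots):
        prev_best, prev_arg = float("-inf"), 0
        for i in range(n_sentences):
            # Running maximum over predecessors i' <= i enforces monotonicity.
            if best[j - 1][i] > prev_best:
                prev_best, prev_arg = best[j - 1][i], i
            best[j][i] = similarity[i][j] + prev_best
            back[j][i] = prev_arg
    last = max(range(n_sentences), key=lambda i: best[-1][i])
    score = best[-1][last]
    assignment = [last]
    for j in range(n_shots - 1, 0, -1):
        last = back[j][last]  # sentence chosen for shot j - 1
        assignment.append(last)
    assignment.reverse()
    return assignment, score

# Two sentences, four shots, hypothetical similarity scores.
similarity = [[0.9, 0.8, 0.1, 0.2],
              [0.1, 0.3, 0.7, 0.6]]
assignment, score = align_shots_to_sentences(similarity)
```

Keeping a running maximum over earlier sentences makes the table fill linear in the number of sentences times the number of shots, rather than searching all possible assignments explicitly.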

ieee international conference on automatic face gesture recognition | 2015

Improved weak labels using contextual cues for person identification in videos

Makarand Tapaswi; Martin Bäuml; Rainer Stiefelhagen

Fully automatic person identification in TV series has been achieved by obtaining weak labels from subtitles and transcripts [11]. In this paper, we revisit the problem of matching subtitles with face tracks to obtain more assignments and more accurate weak labels. We perform a detailed analysis of the state of the art, showing the types of errors during the assignment and providing insights into their cause. We then propose to model the problem of assigning names to face tracks as a joint optimization problem. Using negative constraints between co-occurring pairs of tracks and positive constraints from track threads, we are able to significantly improve the speaker assignment performance. This directly influences the identification performance on all face tracks. We also propose a new feature to determine whether a tracked face is speaking and show further improvements in performance while being computationally more efficient.
