Carsten Stoll | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Carsten Stoll is active.

Explore More

Publication

Featured researches published by Carsten Stoll.

international conference on computer graphics and interactive techniques | 2008

Performance capture from sparse multi-view video

Edilson de Aguiar; Carsten Stoll; Christian Theobalt; Naveed Ahmed; Hans-Peter Seidel; Sebastian Thrun

This paper proposes a new marker-less approach to capturing human performances from multi-view video. Our algorithm can jointly reconstruct spatio-temporally coherent geometry, motion and textural surface appearance of actors that perform complex and rapid moves. Furthermore, since our algorithm is purely meshbased and makes as few as possible prior assumptions about the type of subject being tracked, it can even capture performances of people wearing wide apparel, such as a dancer wearing a skirt. To serve this purpose our method efficiently and effectively combines the power of surface- and volume-based shape deformation techniques with a new mesh-based analysis-through-synthesis framework. This framework extracts motion constraints from video and makes the laser-scan of the tracked subject mimic the recorded performance. Also small-scale time-varying shape detail is recovered by applying model-guided multi-view stereo to refine the model surface. Our method delivers captured performance data at high level of detail, is highly versatile, and is applicable to many complex types of scenes that could not be handled by alternative marker-based or marker-free recording techniques.

computer vision and pattern recognition | 2009

Motion capture using joint skeleton tracking and surface estimation

Juergen Gall; Carsten Stoll; Edilson de Aguiar; Christian Theobalt; Bodo Rosenhahn; Hans-Peter Seidel

This paper proposes a method for capturing the performance of a human or an animal from a multi-view video sequence. Given an articulated template model and silhouettes from a multi-view image sequence, our approach recovers not only the movement of the skeleton, but also the possibly non-rigid temporal deformation of the 3D surface. While large scale deformations or fast movements are captured by the skeleton pose and approximate surface skinning, true small scale deformations or non-rigid garment motion are captured by fitting the surface to the silhouette. We further propose a novel optimization scheme for skeleton-based pose estimation that exploits the skeletons tree structure to split the optimization problem into a local one and a lower dimensional global one. We show on various sequences that our approach can capture the 3D motion of animals and humans accurately even in the case of rapid movements and wide apparel like skirts.

computer vision and pattern recognition | 2007

Marker-less Deformable Mesh Tracking for Human Shape and Motion Capture

E. de Aguiar; Christian Theobalt; Carsten Stoll; Hans-Peter Seidel

We present a novel algorithm to jointly capture the motion and the dynamic shape of humans from multiple video streams without using optical markers. Instead of relying on kinematic skeletons, as traditional motion capture methods, our approach uses a deformable high-quality mesh of a human as scene representation. It jointly uses an image-based 3D correspondence estimation algorithm and a fast Laplacian mesh deformation scheme to capture both motion and surface deformation of the actor from the input video footage. As opposed to many related methods, our algorithm can track people wearing wide apparel, it can straightforwardly be applied to any type of subject, e.g. animals, and it preserves the connectivity of the mesh over time. We demonstrate the performance of our approach using synthetic and captured real-world video sequences and validate its accuracy by comparison to the ground truth.

international conference on computer vision | 2011

Fast articulated motion tracking using a sums of Gaussians body model

Carsten Stoll; Nils Hasler; Juergen Gall; Hans-Peter Seidel; Christian Theobalt

We present an approach for modeling the human body by Sums of spatial Gaussians (SoG), allowing us to perform fast and high-quality markerless motion capture from multi-view video sequences. The SoG model is equipped with a color model to represent the shape and appearance of the human and can be reconstructed from a sparse set of images. Similar to the human body, we also represent the image domain as SoG that models color consistent image blobs. Based on the SoG models of the image and the human body, we introduce a novel continuous and differentiable model-to-image similarity measure that can be used to estimate the skeletal motion of a human at 5–15 frames per second even for many camera views. In our experiments, we show that our method, which does not rely on silhouettes or training data, offers an good balance between accuracy and computational cost.

Computer Graphics Forum | 2012

Coherent Spatiotemporal Filtering, Upsampling and Rendering of RGBZ Videos

Christian Richardt; Carsten Stoll; Neil A. Dodgson; Hans-Peter Seidel; Christian Theobalt

Sophisticated video processing effects require both image and geometry information. We explore the possibility to augment a video camera with a recent infrared time‐of‐flight depth camera, to capture high‐resolution RGB and low‐resolution, noisy depth at video frame rates. To turn such a setup into a practical RGBZ video camera, we develop efficient data filtering techniques that are tailored to the noise characteristics of IR depth cameras. We first remove typical artefacts in the RGBZ data and then apply an efficient spatiotemporal denoising and upsampling scheme. This allows us to record temporally coherent RGBZ videos at interactive frame rates and to use them to render a variety of effects in unprecedented quality. We show effects such as video relighting, geometry‐based abstraction and stylisation, background segmentation and rendering in stereoscopic 3D.

Computers & Graphics | 2009

Technical Section: Estimating body shape of dressed humans

Nils Hasler; Carsten Stoll; Bodo Rosenhahn; Thorsten Thormählen; Hans-Peter Seidel

The paper presents a method to estimate the detailed 3D body shape of a person even if heavy or loose clothing is worn. The approach is based on a space of human shapes, learned from a large database of registered body scans. Together with this database we use as input a 3D scan or model of the person wearing clothes and apply a fitting method, based on ICP (iterated closest point) registration and Laplacian mesh deformation. The statistical model of human body shapes enforces that the model stays within the space of human shapes. The method therefore allows us to compute the most likely shape and pose of the subject, even if it is heavily occluded or body parts are not visible. Several experiments demonstrate the applicability and accuracy of our approach to recover occluded or missing body parts from 3D laser scans.

international conference on computer graphics and interactive techniques | 2010

Video-based reconstruction of animatable human characters

Carsten Stoll; Juergen Gall; Edilson de Aguiar; Sebastian Thrun; Christian Theobalt

We present a new performance capture approach that incorporates a physically-based cloth model to reconstruct a rigged fully-animatable virtual double of a real person in loose apparel from multi-view video recordings. Our algorithm only requires a minimum of manual interaction. Without the use of optical markers in the scene, our algorithm first reconstructs skeleton motion and detailed time-varying surface geometry of a real person from a reference video sequence. These captured reference performance data are then analyzed to automatically identify non-rigidly deforming pieces of apparel on the animated geometry. For each piece of apparel, parameters of a physically-based real-time cloth simulation model are estimated, and surface geometry of occluded body regions is approximated. The reconstructed character model comprises a skeleton-based representation for the actual body parts and a physically-based simulation model for the apparel. In contrast to previous performance capture methods, we can now also create new real-time animations of actors captured in general apparel.

computer vision and pattern recognition | 2011

Markerless motion capture of interacting characters using multi-view image segmentation

Yebin Liu; Carsten Stoll; Juergen Gall; Hans-Peter Seidel; Christian Theobalt

We present a markerless motion capture approach that reconstructs the skeletal motion and detailed time-varying surface geometry of two closely interacting people from multi-view video. Due to ambiguities in feature-to-person assignments and frequent occlusions, it is not feasible to directly apply single-person capture approaches to the multi-person case. We therefore propose a combined image segmentation and tracking approach to overcome these difficulties. A new probabilistic shape and appearance model is employed to segment the input images and to assign each pixel uniquely to one person. Thereafter, a single-person markerless motion and surface capture approach can be applied to each individual, either one-by-one or in parallel, even under strong occlusions. We demonstrate the performance of our approach on several challenging multi-person motions, including dance and martial arts, and also provide a reference dataset for multi-person motion capture with ground truth.

international conference on computer graphics and interactive techniques | 2011

Video-based characters: creating new human performances from a multi-view video database

Feng Xu; Yebin Liu; Carsten Stoll; James Tompkin; Gaurav Bharaj; Qionghai Dai; Hans-Peter Seidel; Jan Kautz; Christian Theobalt

We present a method to synthesize plausible video sequences of humans according to user-defined body motions and viewpoints. We first capture a small database of multi-view video sequences of an actor performing various basic motions. This database needs to be captured only once and serves as the input to our synthesis algorithm. We then apply a marker-less model-based performance capture approach to the entire database to obtain pose and geometry of the actor in each database frame. To create novel video sequences of the actor from the database, a user animates a 3D human skeleton with novel motion and viewpoints. Our technique then synthesizes a realistic video sequence of the actor performing the specified motion based only on the initial database. The first key component of our approach is a new efficient retrieval strategy to find appropriate spatio-temporally coherent database frames from which to synthesize target video frames. The second key component is a warping-based texture synthesis approach that uses the retrieved most-similar database frames to synthesize spatio-temporally coherent target video frames. For instance, this enables us to easily create video sequences of actors performing dangerous stunts without them being placed in harms way. We show through a variety of result videos and a user study that we can synthesize realistic videos of people, even if the target motions and camera views are different from the database content.

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2013

Markerless Motion Capture of Multiple Characters Using Multiview Image Segmentation

Yebin Liu; Juergen Gall; Carsten Stoll; Qionghai Dai; Hans-Peter Seidel; Christian Theobalt

Capturing the skeleton motion and detailed time-varying surface geometry of multiple, closely interacting peoples is a very challenging task, even in a multicamera setup, due to frequent occlusions and ambiguities in feature-to-person assignments. To address this task, we propose a framework that exploits multiview image segmentation. To this end, a probabilistic shape and appearance model is employed to segment the input images and to assign each pixel uniquely to one person. Given the articulated template models of each person and the labeled pixels, a combined optimization scheme, which splits the skeleton pose optimization problem into a local one and a lower dimensional global one, is applied one by one to each individual, followed with surface estimation to capture detailed nonrigid deformations. We show on various sequences that our approach can capture the 3D motion of humans accurately even if they move rapidly, if they wear wide apparel, and if they are engaged in challenging multiperson motions, including dancing, wrestling, and hugging.

Explore More