Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Chenglei Wu is active.

Publication


Featured research published by Chenglei Wu.


International Conference on Computer Graphics and Interactive Techniques | 2013

Reconstructing detailed dynamic face geometry from monocular video

Pablo Garrido; Levi Valgaerts; Chenglei Wu; Christian Theobalt

Detailed facial performance geometry can be reconstructed using dense camera and light setups in controlled studios. However, a wide range of important applications cannot employ these approaches, including all movie productions shot from a single principal camera. For post-production, these require dynamic monocular face capture for appearance modification. We present a new method for capturing face geometry from monocular video. Our approach captures detailed, dynamic, spatio-temporally coherent 3D face geometry without the need for markers. It works under uncontrolled lighting, and it successfully reconstructs expressive motion including high-frequency face detail such as folds and laugh lines. After simple manual initialization, the capturing process is fully automatic, which makes it versatile, lightweight and easy-to-deploy. Our approach tracks accurate sparse 2D features between automatically selected key frames to animate a parametric blend shape model, which is further refined in pose, expression and shape by temporally coherent optical flow and photometric stereo. We demonstrate performance capture results for long and complex face sequences captured indoors and outdoors, and we exemplify the relevance of our approach as an enabling technology for model-based face editing in movies and video, such as adding new facial textures, as well as a step towards enabling everyone to do facial performance capture with a single affordable camera.
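
The blendshape-fitting step described above can be pictured as a small regularized linear least-squares problem. Below is a minimal sketch in Python/NumPy, assuming an orthographic camera and a precomputed blendshape basis restricted to the tracked landmark vertices; the function and argument names are illustrative, not the paper's code.

```python
import numpy as np

def fit_blendshape_weights(landmarks_2d, neutral_3d, basis, P, lam=1e-2):
    """Solve min_w ||P (v0 + B w) - x||^2 + lam ||w||^2 for blendshape
    weights w from sparse tracked 2D features.

    landmarks_2d : (L, 2) tracked 2D feature positions
    neutral_3d   : (L, 3) neutral-pose positions of the landmark vertices
    basis        : (K, L, 3) per-blendshape offsets at those vertices
    P            : (2, 3) orthographic projection matrix (assumed known)
    """
    K = basis.shape[0]
    # Residual of the neutral face under the camera.
    r = (landmarks_2d - neutral_3d @ P.T).reshape(-1)                 # (2L,)
    # Column k holds the projected offsets of blendshape k.
    A = np.stack([(basis[k] @ P.T).reshape(-1) for k in range(K)], axis=1)
    # Tikhonov regularization keeps the recovered expression plausible.
    return np.linalg.solve(A.T @ A + lam * np.eye(K), A.T @ r)
```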


International Conference on Computer Graphics and Interactive Techniques | 2012

Lightweight binocular facial performance capture under uncontrolled lighting

Levi Valgaerts; Chenglei Wu; Andrés Bruhn; Hans-Peter Seidel; Christian Theobalt

Recent progress in passive facial performance capture has shown impressively detailed results on highly articulated motion. However, most methods rely on complex multi-camera set-ups, controlled lighting or fiducial markers. This prevents them from being used in general environments, outdoor scenes, during live action on a film set, or by freelance animators and everyday users who want to capture their digital selves. In this paper, we therefore propose a lightweight passive facial performance capture approach that is able to reconstruct high-quality dynamic facial geometry from only a single pair of stereo cameras. Our method succeeds under uncontrolled and time-varying lighting, and also in outdoor scenes. Our approach builds upon and extends recent image-based scene flow computation, lighting estimation and shading-based refinement algorithms. It integrates them into a pipeline that is specifically tailored towards facial performance reconstruction from challenging binocular footage under uncontrolled lighting. In an experimental evaluation, the strong capabilities of our method become explicit: We achieve detailed and spatio-temporally coherent results for expressive facial motion in both indoor and outdoor scenes -- even from low quality input images recorded with a hand-held consumer stereo camera. We believe that our approach is the first to capture facial performances of such high quality from a single stereo rig and we demonstrate that it brings facial performance capture out of the studio, into the wild, and within the reach of everybody.
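
The shading-based refinement stage rests on a Lambertian image-formation model: pixel intensity equals albedo times irradiance, with irradiance approximated by low-order spherical harmonics (SH) evaluated at the surface normal. A minimal forward-shading sketch, assuming first-order (4-coefficient) SH and known albedo; this illustrates the model, not the paper's implementation.

```python
import numpy as np

def sh_basis_order1(normals):
    """First-order real spherical harmonics evaluated at unit normals.
    normals: (..., 3) unit vectors -> (..., 4) basis values."""
    nx, ny, nz = normals[..., 0], normals[..., 1], normals[..., 2]
    c0, c1 = 0.282095, 0.488603    # 1/(2*sqrt(pi)), sqrt(3)/(2*sqrt(pi))
    return np.stack([np.full_like(nx, c0), c1 * ny, c1 * nz, c1 * nx], axis=-1)

def render_shading(normals, sh_coeffs, albedo):
    """Lambertian shading I = albedo * (B(n) . l); refinement perturbs
    the geometry so this prediction matches the input images."""
    return albedo * (sh_basis_order1(normals) @ sh_coeffs)
```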


Computer Vision and Pattern Recognition | 2011

High-quality shape from multi-view stereo and shading under general illumination

Chenglei Wu; Bennett Wilburn; Yasuyuki Matsushita; Christian Theobalt

Multi-view stereo methods reconstruct 3D geometry from images well for sufficiently textured scenes, but often fail to recover high-frequency surface detail, particularly for smoothly shaded surfaces. On the other hand, shape-from-shading methods can recover fine detail from shading variations. Unfortunately, it is non-trivial to apply shape-from-shading alone to multi-view data, and most shading-based estimation methods only succeed under very restricted or controlled illumination. We present a new algorithm that combines multi-view stereo and shading-based refinement for high-quality reconstruction of 3D geometry models from images taken under constant but otherwise arbitrary illumination. We have tested our algorithm on several scenes captured under different general and unknown lighting conditions, and we show that our final reconstructions rival laser range scans.
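
Under the Lambertian assumption, estimating the unknown constant illumination reduces to a linear least-squares fit once coarse normals from multi-view stereo are available. A hedged sketch (first-order SH only; a uniform albedo is folded into the coefficients, and real systems would use second-order SH with nine terms):

```python
import numpy as np

def estimate_sh_lighting(intensities, normals):
    """Fit 4 first-order SH lighting coefficients from observed pixel
    intensities and coarse MVS normals: I_p ~ B(n_p) . l.

    intensities : (P,) grayscale observations of surface points
    normals     : (P, 3) unit normals from the coarse reconstruction
    """
    nx, ny, nz = normals[:, 0], normals[:, 1], normals[:, 2]
    B = np.stack([np.full_like(nx, 0.282095),
                  0.488603 * ny, 0.488603 * nz, 0.488603 * nx], axis=1)
    l, *_ = np.linalg.lstsq(B, intensities, rcond=None)
    return l  # lighting (times albedo scale) in the SH basis
```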


IEEE Transactions on Visualization and Computer Graphics | 2011

Fusing Multiview and Photometric Stereo for 3D Reconstruction under Uncalibrated Illumination

Chenglei Wu; Yebin Liu; Qionghai Dai; Bennett Wilburn

We propose a method to obtain a complete and accurate 3D model from multiview images captured under a variety of unknown illuminations. Based on recent results showing that for Lambertian objects, general illumination can be approximated well using low-order spherical harmonics, we develop a robust alternating approach to recover surface normals. Surface normals are initialized using a multi-illumination multiview stereo algorithm, then refined using a robust alternating optimization method based on the ℓ1 metric. Erroneous normal estimates are detected using a shape prior. Finally, the computed normals are used to improve the preliminary 3D model. The reconstruction system achieves watertight and robust 3D reconstruction while neither requiring manual interactions nor imposing any constraints on the illumination. Experimental results on both real world and synthetic data show that the technique can acquire accurate 3D models for Lambertian surfaces, and even tolerates small violations of the Lambertian assumption.
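
A standard way to realize a robust ℓ1-type alternating optimization like the one described is iteratively reweighted least squares (IRLS): the ℓ1 cost is minimized through a sequence of weighted ℓ2 problems. A generic sketch, not the paper's exact solver:

```python
import numpy as np

def irls_l1(A, b, iters=20, eps=1e-6):
    """Minimize ||A x - b||_1 via iteratively reweighted least squares.
    The weights 1/|r_i| down-weight outlier observations, e.g. pixels
    that violate the Lambertian assumption (shadows, specularities)."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]       # plain l2 start
    for _ in range(iters):
        r = A @ x - b
        w = 1.0 / np.maximum(np.abs(r), eps)       # IRLS weights for l1
        Aw = A * w[:, None]                        # row-scaled system
        x = np.linalg.solve(A.T @ Aw, Aw.T @ b)
    return x
```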


International Conference on Computer Graphics and Interactive Techniques | 2014

Real-time shading-based refinement for consumer depth cameras

Chenglei Wu; Michael Zollhöfer; Matthias Nießner; Marc Stamminger; Shahram Izadi; Christian Theobalt

We present the first real-time method for refinement of depth data using shape-from-shading in general uncontrolled scenes. Per frame, our real-time algorithm takes raw noisy depth data and an aligned RGB image as input, and approximates the time-varying incident lighting, which is then used for geometry refinement. This leads to dramatically enhanced depth maps at 30Hz. Our algorithm makes few scene assumptions, handling arbitrary scene objects even under motion. To enable this type of real-time depth map enhancement, we contribute a new highly parallel algorithm that reformulates the inverse rendering optimization problem in prior work, allowing us to estimate lighting and shape in a temporally coherent way at video frame-rates. Our optimization problem is minimized using a new regular grid Gauss-Newton solver implemented fully on the GPU. We demonstrate results showing enhanced depth maps, which are comparable to offline methods but are computed orders of magnitude faster, as well as baseline comparisons with online filtering-based methods. We conclude with applications of our higher quality depth maps for improved real-time surface reconstruction and performance capture.
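
The solver described above is, at its core, a damped Gauss-Newton iteration: linearize the shading and regularization residuals around the current depth, solve the normal equations, update, repeat. A small CPU illustration with hypothetical residual/Jacobian callbacks; the paper's real-time version runs a highly parallel scheme on the GPU rather than forming dense matrices:

```python
import numpy as np

def gauss_newton_refine(depth0, residual_fn, jacobian_fn,
                        iters=5, damping=1e-4):
    """Generic damped Gauss-Newton refinement of a depth map (flattened).
    residual_fn(d) -> stacked shading + regularizer residuals r(d)
    jacobian_fn(d) -> dense Jacobian dr/dd (illustrative; a real-time
    implementation would be matrix-free)."""
    d = depth0.ravel().astype(np.float64).copy()
    for _ in range(iters):
        r = residual_fn(d)
        J = jacobian_fn(d)
        H = J.T @ J + damping * np.eye(d.size)     # damped normal equations
        d -= np.linalg.solve(H, J.T @ r)           # Gauss-Newton step
    return d.reshape(depth0.shape)
```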


International Conference on Computer Vision | 2011

Shading-based dynamic shape refinement from multi-view video under general illumination

Chenglei Wu; Kiran Varanasi; Yebin Liu; Hans-Peter Seidel; Christian Theobalt

We present an approach to add true fine-scale spatio-temporal shape detail to dynamic scene geometry captured from multi-view video footage. Our approach exploits shading information to recover the millimeter-scale surface structure, but in contrast to related approaches succeeds under general unconstrained lighting conditions. Our method starts off from a set of multi-view video frames and an initial series of reconstructed coarse 3D meshes that lack any surface detail. In a spatio-temporal maximum a posteriori probability (MAP) inference framework, our approach first estimates the incident illumination and the spatially-varying albedo map on the mesh surface for every time instant. Thereafter, albedo and illumination are used to estimate the true geometric detail visible in the images and add it to the coarse reconstructions. The MAP framework uses weak temporal priors on lighting, albedo and geometry which improve reconstruction quality yet allow for temporal variations in the data.
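
In energy terms, the MAP inference amounts to minimizing a per-frame shading data term plus weak temporal priors that penalize frame-to-frame changes in lighting, albedo and geometry. A schematic negative log-posterior; the weights and quadratic priors are illustrative stand-ins for the paper's formulation:

```python
import numpy as np

def map_energy(shading_pred, image, light, albedo, geom,
               light_prev, albedo_prev, geom_prev,
               w_l=0.1, w_a=1.0, w_g=0.5):
    """Data term + weak temporal priors (all arrays are flattened
    per-frame estimates; the weights are illustrative)."""
    data = np.sum((shading_pred - image) ** 2)            # shading data term
    prior = (w_l * np.sum((light - light_prev) ** 2)      # lighting prior
             + w_a * np.sum((albedo - albedo_prev) ** 2)  # albedo prior
             + w_g * np.sum((geom - geom_prev) ** 2))     # geometry prior
    return data + prior
```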


International Conference on Computer Graphics and Interactive Techniques | 2013

On-set performance capture of multiple actors with a stereo camera

Chenglei Wu; Carsten Stoll; Levi Valgaerts; Christian Theobalt

State-of-the-art marker-less performance capture algorithms reconstruct detailed human skeletal motion and space-time coherent surface geometry. Despite being a big improvement over marker-based motion capture methods, they are still rarely applied in practical VFX productions as they require ten or more cameras and a studio with controlled lighting or a green screen background. If one was able to capture performances directly on a general set using only the primary stereo camera used for principal photography, many possibilities would open up in virtual production and previsualization, the creation of virtual actors, and video editing during post-production. We describe a new algorithm which works towards this goal. It is able to track skeletal motion and detailed surface geometry of one or more actors from footage recorded with a stereo rig that is allowed to move. It succeeds in general sets with uncontrolled background and uncontrolled illumination, and scenes in which actors strike non-frontal poses. It is one of the first performance capture methods to exploit detailed BRDF information and scene illumination for accurate pose tracking and surface refinement in general scenes. It also relies on a new foreground segmentation approach that combines appearance, stereo, and pose tracking results to segment out actors from the background. Appearance, segmentation, and motion cues are combined in a new pose optimization framework that is robust under uncontrolled lighting, uncontrolled background and very sparse camera views.
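
The segmentation idea of fusing appearance, stereo and pose-tracking cues can be sketched as a per-pixel log-odds score; the cue definitions and weights below are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def foreground_logodds(app_fg, app_bg, depth_agreement, pose_mask,
                       w_app=1.0, w_depth=0.5, w_pose=0.8):
    """Per-pixel foreground score from three cues.
    app_fg, app_bg  : color likelihoods under foreground/background models
    depth_agreement : stereo depth consistency with the tracked actor
    pose_mask       : soft mask from projecting the previous pose estimate
    """
    eps = 1e-8
    return (w_app * (np.log(app_fg + eps) - np.log(app_bg + eps))
            + w_depth * depth_agreement
            + w_pose * (2.0 * pose_mask - 1.0))  # threshold at 0 for a mask
```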


International Conference on Computer Graphics and Interactive Techniques | 2015

Shading-based refinement on volumetric signed distance functions

Michael Zollhöfer; Angela Dai; Matthias Innmann; Chenglei Wu; Marc Stamminger; Christian Theobalt; Matthias Nießner

We present a novel method to obtain fine-scale detail in 3D reconstructions generated with low-budget RGB-D cameras or other commodity scanning devices. As the depth data of these sensors is noisy, truncated signed distance fields are typically used to regularize out the noise, which unfortunately leads to over-smoothed results. In our approach, we leverage RGB data to refine these reconstructions through shading cues, as color input is typically of much higher resolution than the depth data. As a result, we obtain reconstructions with high geometric detail, far beyond the depth resolution of the camera itself. Our core contribution is shading-based refinement directly on the implicit surface representation, which is generated from globally-aligned RGB-D images. We formulate the inverse shading problem on the volumetric distance field, and present a novel objective function which jointly optimizes for fine-scale surface geometry and spatially-varying surface reflectance. In order to enable the efficient reconstruction of sub-millimeter detail, we store and process our surface using a sparse voxel hashing scheme which we augment by introducing a grid hierarchy. A tailored GPU-based Gauss-Newton solver enables us to refine large shape models to previously unseen resolution within only a few seconds.
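
The sparse voxel hashing idea stores only voxel blocks near the surface, addressed by hashing their integer block coordinates. A toy CPU sketch of such storage with the standard weighted-average TSDF fusion update; block size, hash constants and collision handling are simplifications:

```python
import numpy as np

class SparseTSDF:
    """Minimal sparse TSDF: 8^3 voxel blocks allocated on demand, keyed
    by a spatial hash of block coordinates (collisions ignored here;
    real systems resolve them with chaining)."""
    BLOCK = 8

    def __init__(self, voxel_size=0.005):
        self.voxel_size = voxel_size
        self.blocks = {}                        # hash key -> (tsdf, weight)

    @staticmethod
    def _hash(bx, by, bz):
        # XOR of coordinates times large primes, a common spatial hash.
        return (bx * 73856093) ^ (by * 19349669) ^ (bz * 83492791)

    def fuse(self, point, sdf, w=1.0):
        """Fuse one truncated signed-distance observation at a 3D point."""
        v = np.floor(np.asarray(point) / self.voxel_size).astype(np.int64)
        key = self._hash(*(v // self.BLOCK))
        local = tuple(v % self.BLOCK)
        if key not in self.blocks:
            shape = (self.BLOCK,) * 3
            self.blocks[key] = (np.zeros(shape), np.zeros(shape))
        tsdf, weight = self.blocks[key]
        # Standard running weighted average used for TSDF fusion.
        tsdf[local] = (tsdf[local] * weight[local] + sdf * w) / (weight[local] + w)
        weight[local] += w
```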


International Conference on Computer Graphics and Interactive Techniques | 2016

An anatomically-constrained local deformation model for monocular face capture

Chenglei Wu; Derek Bradley; Markus H. Gross; Thabo Beeler

We present a new anatomically-constrained local face model and fitting approach for tracking 3D faces from 2D motion data at very high quality. In contrast to traditional global face models, often built from a large set of blendshapes, we propose a local deformation model composed of many small subspaces spatially distributed over the face. Our local model offers far more flexibility and expressiveness than global blendshape models, even with a much smaller model size. This flexibility would typically come at the cost of reduced robustness, in particular during the under-constrained task of monocular reconstruction. However, a key contribution of this work is that we consider the face anatomy and introduce subspace skin thickness constraints into our model, which constrain the face to only valid expressions and help counteract depth ambiguities in monocular tracking. Given our new model, we present a novel fitting optimization that allows 3D facial performance reconstruction from a single view at extremely high quality, far beyond previous fitting approaches. Our model is flexible, and can also be applied when only sparse motion data is available, for example with marker-based motion capture or even face posing from artistic sketches. Furthermore, by incorporating anatomical constraints we can automatically estimate the rigid motion of the skull, obtaining a rigid stabilization of the performance for free. We demonstrate our model and single-view fitting method on a number of examples, including, for the first time, extreme local skin deformation caused by external forces such as wind, captured from a single high-speed camera.
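
Evaluating a local, patch-based deformation model differs from a global blendshape model in that each patch has its own small subspace and overlapping patch reconstructions must be blended. A toy sketch (simple averaging in overlaps; the paper's anatomical skin-thickness constraints are omitted):

```python
import numpy as np

def evaluate_local_model(patch_means, patch_bases, patch_coeffs,
                         patch_vertex_ids, n_vertices):
    """Reconstruct a mesh from per-patch subspaces.
    patch_means[k]      : (Vk, 3) mean shape of patch k
    patch_bases[k]      : (Vk, 3, Ck) local deformation basis
    patch_coeffs[k]     : (Ck,) fitted coefficients of patch k
    patch_vertex_ids[k] : (Vk,) global vertex indices of patch k
    """
    acc = np.zeros((n_vertices, 3))
    cnt = np.zeros((n_vertices, 1))
    for mean, basis, c, ids in zip(patch_means, patch_bases,
                                   patch_coeffs, patch_vertex_ids):
        acc[ids] += mean + basis @ c      # (Vk, 3) patch reconstruction
        cnt[ids] += 1.0
    return acc / np.maximum(cnt, 1.0)     # average where patches overlap
```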


Eurographics | 2013

Capturing Relightable Human Performances under General Uncontrolled Illumination

Guannan Li; Chenglei Wu; Carsten Stoll; Yebin Liu; Kiran Varanasi; Qionghai Dai; Christian Theobalt

We present a novel approach to create relightable free-viewpoint human performances from multi-view video recorded under general uncontrolled and uncalibrated illumination. We first capture a multi-view sequence of an actor wearing arbitrary apparel and reconstruct a spatio-temporally coherent coarse 3D model of the performance using a marker-less tracking approach. Using these coarse reconstructions, we estimate the low-frequency component of the illumination in a spherical harmonics (SH) basis as well as the diffuse reflectance, and then utilize them to estimate the dynamic geometry detail of human actors based on shading cues. Given the high-quality time-varying geometry, the estimated illumination is extended to the all-frequency domain by re-estimating it in the wavelet basis. Finally, the high-quality all-frequency illumination is utilized to reconstruct the spatially-varying BRDF of the surface. The recovered time-varying surface geometry and spatially-varying non-Lambertian reflectance allow us to generate high-quality model-based free-viewpoint videos of the actor under novel illumination conditions. Our method enables plausible reconstruction of relightable dynamic scene models without a complex controlled lighting apparatus, and opens up a path towards relightable performance capture in less constrained environments and using less complex acquisition setups.
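
Once time-varying normals and diffuse albedo are recovered, relighting the diffuse component under a novel low-frequency illumination is a direct evaluation of the shading model with new SH coefficients. A minimal sketch (first-order SH only; the paper's all-frequency wavelet illumination and non-Lambertian BRDF terms are omitted):

```python
import numpy as np

def relight_diffuse(normals, albedo, new_sh):
    """Re-render the diffuse component under novel SH lighting.
    normals: (V, 3) unit normals; albedo: (V, 3) RGB; new_sh: (4, 3)."""
    nx, ny, nz = normals[:, 0], normals[:, 1], normals[:, 2]
    B = np.stack([np.full_like(nx, 0.282095),
                  0.488603 * ny, 0.488603 * nz, 0.488603 * nx], axis=1)
    return albedo * np.clip(B @ new_sh, 0.0, None)   # (V, 3) relit colors
```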

Collaboration


Dive into Chenglei Wu's collaborations.

Top Co-Authors

Marc Stamminger

University of Erlangen-Nuremberg
