Oleg Alexander
University of Southern California
Publication
Featured research published by Oleg Alexander.
ACM Transactions on Graphics | 2014
Graham Fyffe; Andrew Jones; Oleg Alexander; Ryosuke Ichikari; Paul E. Debevec
We present a process for rendering a realistic facial performance with control of viewpoint and illumination. The performance is based on one or more high-quality geometry and reflectance scans of an actor in static poses, driven by one or more video streams of a performance. We compute optical flow correspondences between neighboring video frames, and a sparse set of correspondences between static scans and video frames. The latter are made possible by leveraging the relightability of the static 3D scans to match the viewpoint(s) and appearance of the actor in videos taken in arbitrary environments. As optical flow tends to compute proper correspondence for some areas but not others, we also compute a smoothed, per-pixel confidence map for every computed flow, based on normalized cross-correlation. These flows and their confidences yield a set of weighted triangulation constraints among the static poses and the frames of a performance. Given a single artist-prepared face mesh for one static pose, we optimally combine the weighted triangulation constraints, along with a shape regularization term, into a consistent 3D geometry solution over the entire performance that is drift free by construction. In contrast to previous work, even partial correspondences contribute to drift minimization, for example, where a successful match is found in the eye region but not the mouth. Our shape regularization employs a differential shape term based on a spatially varying blend of the differential shapes of the static poses and neighboring dynamic poses, weighted by the associated flow confidences. These weights also permit dynamic reflectance maps to be produced for the performance by blending the static scan maps. Finally, as the geometry and maps are represented on a consistent artist-friendly mesh, we render the resulting high-quality animated face geometry and animated reflectance maps using standard rendering tools.
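The per-pixel flow confidence described above is based on normalized cross-correlation between a frame and its flow-warped neighbor, followed by smoothing. Below is a minimal sketch of that idea, not the authors' implementation; the window size, smoothing sigma, and helper names are illustrative assumptions.

```python
# Hedged sketch: windowed normalized cross-correlation as a smoothed per-pixel
# confidence map for a dense optical flow field. Parameters are illustrative.
import numpy as np
from scipy.ndimage import uniform_filter, gaussian_filter, map_coordinates

def warp_by_flow(target, flow):
    """Sample `target` at positions displaced by `flow` (H x W x 2, dx/dy)."""
    h, w = target.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    coords = np.stack([ys + flow[..., 1], xs + flow[..., 0]])
    return map_coordinates(target, coords, order=1, mode='nearest')

def ncc_confidence(source, target, flow, win=7, sigma=3.0):
    """Per-pixel NCC between `source` and the flow-warped `target`, smoothed."""
    warped = warp_by_flow(target, flow)
    mu_s, mu_w = uniform_filter(source, win), uniform_filter(warped, win)
    var_s = uniform_filter(source * source, win) - mu_s ** 2
    var_w = uniform_filter(warped * warped, win) - mu_w ** 2
    cov = uniform_filter(source * warped, win) - mu_s * mu_w
    ncc = cov / np.sqrt(np.maximum(var_s * var_w, 1e-8))
    return gaussian_filter(np.clip(ncc, 0.0, 1.0), sigma)  # confidence in [0, 1]
```

Low-confidence regions (for example, occluded or blurred areas) would then contribute weaker triangulation constraints than well-matched ones.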
Conference on Visual Media Production | 2009
Oleg Alexander; Mike Rogers; William Lambeth; Matt Chiang; Paul E. Debevec
The Digital Emily Project is a collaboration between facial animation company Image Metrics and the Graphics Laboratory at the University of Southern California’s Institute for Creative Technologies to achieve one of the world’s first photorealistic digital facial performances. The project leverages latest-generation techniques in high-resolution face scanning, character rigging, video-based facial animation, and compositing. An actress was first filmed on a studio set speaking emotive lines of dialog in high definition. The lighting on the set was captured as a high dynamic range light probe image. The actress’ face was then three-dimensionally scanned in thirty-three facial expressions showing different emotions and mouth and eye movements using a high-resolution facial scanning process accurate to the level of skin pores and fine wrinkles. Lighting-independent diffuse and specular reflectance maps were also acquired as part of the scanning process. Correspondences between the 3D expression scans were formed using a semi-automatic process, allowing a blendshape facial animation rig to be constructed whose expressions closely mirrored the shapes observed in the rich set of facial scans; animated eyes and teeth were also added to the model. Skin texture detail showing dynamic wrinkling was converted into multiresolution displacement maps also driven by the blendshapes. A semi-automatic video-based facial animation system was then used to animate the 3D face rig to match the performance seen in the original video, and this performance was tracked onto the facial motion in the studio video. The final face was illuminated by the captured studio illumination and shaded using the acquired reflectance maps with a skin translucency shading algorithm. Using this process, the project was able to render a synthetic facial performance which was generally accepted as being a real face.
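The corresponded expression scans are combined in a blendshape rig. As a minimal, hypothetical illustration of how such a delta rig is evaluated (not the project's actual rig or data), consider:

```python
# Hedged sketch: evaluating a delta blendshape rig built from corresponded scans.
# Vertex arrays are (V, 3); the toy shapes and weights below are placeholders.
import numpy as np

def evaluate_blendshapes(neutral, expressions, weights):
    """Neutral vertices plus the weighted sum of per-expression deltas."""
    result = neutral.astype(np.float64).copy()
    for shape, w in zip(expressions, weights):
        result += w * (shape - neutral)      # each scan contributes its delta
    return result

# Toy usage: 3 vertices, two expression scans.
neutral = np.zeros((3, 3))
smile = neutral + np.array([0.0, 0.1, 0.0])
blink = neutral + np.array([0.0, 0.0, 0.05])
posed = evaluate_blendshapes(neutral, [smile, blink], [0.8, 0.3])
```

The same per-expression weights can drive the multiresolution displacement maps, so wrinkle detail appears only when the corresponding expressions are active.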
Intelligent User Interfaces | 2014
Ron Artstein; David R. Traum; Oleg Alexander; Anton Leuski; Andrew Jones; Kallirroi Georgila; Paul E. Debevec; William R. Swartout; Heather Maio; Stephen Smith
Time-offset interaction is a new technology that allows for two-way communication with a person who is not available for conversation in real time: a large set of statements are prepared in advance, and users access these statements through natural conversation that mimics face-to-face interaction. Conversational reactions to user questions are retrieved through a statistical classifier, using technology that is similar to previous interactive systems with synthetic characters; however, all of the retrieved utterances are genuine statements by a real person. Recordings of answers, listening and idle behaviors, and blending techniques are used to create a persistent visual image of the person throughout the interaction. A proof-of-concept has been implemented using the likeness of Pinchas Gutter, a Holocaust survivor, enabling short conversations about his family, his religious views, and resistance. This proof-of-concept has been shown to dozens of people, from school children to Holocaust scholars, with many commenting on the impact of the experience and potential for this kind of interface.
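The statistical classifier maps a user question to the most appropriate prerecorded statement. As a rough stand-in for that component (the deployed system's classifier is not reproduced here), the sketch below uses a simple TF-IDF cosine-similarity lookup; the questions and clip ids are invented placeholders.

```python
# Hedged sketch: retrieving a prerecorded answer clip for a user question via
# TF-IDF cosine similarity. A stand-in for the system's statistical classifier;
# the training questions and clip ids are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

training_questions = [
    "Where were you born?",
    "Tell me about your family.",
    "What are your religious views?",
]
answer_clips = ["clip_birthplace", "clip_family", "clip_religion"]

vectorizer = TfidfVectorizer().fit(training_questions)
question_matrix = vectorizer.transform(training_questions)

def retrieve_answer(user_question, threshold=0.2):
    """Return the best-matching clip id, or a fallback for off-topic input."""
    scores = cosine_similarity(vectorizer.transform([user_question]), question_matrix)[0]
    best = scores.argmax()
    return answer_clips[best] if scores[best] >= threshold else "clip_off_topic"

print(retrieve_answer("Can you talk about your family?"))  # -> clip_family
```

When no prepared statement scores above the threshold, the character can fall back to a clarification or off-topic response, which helps keep the conversation flowing.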
International Conference on Interactive Digital Storytelling | 2015
David R. Traum; Andrew Jones; Kia Hays; Heather Maio; Oleg Alexander; Ron Artstein; Paul E. Debevec; Alesia Gainer; Kallirroi Georgila; Kathleen Haase; Karen Jungblut; Anton Leuski; Stephen Smith; William R. Swartout
We describe a digital system that allows people to have an interactive conversation with a human storyteller (a Holocaust survivor) who has recorded a number of dialogue contributions, including many compelling narratives of his experiences and thoughts. The goal is to preserve as much as possible of the experience of face-to-face interaction. The survivor’s stories, answers to common questions, and testimony are recorded in high fidelity, and then delivered interactively to an audience as responses to spoken questions. People can ask questions and receive answers on a broad range of topics including the survivor’s experiences before, during, and after the war, as well as his attitudes and philosophy. Evaluation results show that most user questions can be addressed by the system, and that audiences are highly engaged with the resulting interaction.
International Conference on Computer Graphics and Interactive Techniques | 2015
Koki Nagano; Graham Fyffe; Oleg Alexander; Jernej Barbič; Hao Li; Abhijeet Ghosh; Paul E. Debevec
We present a technique for synthesizing the effects of skin microstructure deformation by anisotropically convolving a high-resolution displacement map to match normal distribution changes in measured skin samples. We use a 10-micron resolution scanning technique to measure several in vivo skin samples as they are stretched and compressed in different directions, quantifying how stretching smooths the skin and compression makes it rougher. We tabulate the resulting surface normal distributions, and show that convolving a neutral skin microstructure displacement map with blurring and sharpening filters can mimic normal distribution changes and microstructure deformations. We implement the spatially-varying displacement map filtering on the GPU to interactively render the effects of dynamic microgeometry on animated faces obtained from high-resolution facial scans.
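Stretching smooths the microstructure and compression roughens it by filtering the neutral displacement map. The sketch below is a CPU-side illustration of that blur/sharpen idea, not the paper's spatially-varying GPU implementation; the mapping from local stretch to filter strength is an invented placeholder rather than the measured fit.

```python
# Hedged sketch: blur the neutral micro-displacement map where the skin is
# stretched and sharpen (unsharp mask) where it is compressed. The stretch-to-
# strength mapping is a placeholder, not the paper's measured relationship.
import numpy as np
from scipy.ndimage import gaussian_filter

def deform_microstructure(displacement, stretch, base_sigma=1.5, sharpen_gain=2.0):
    """displacement: (H, W) neutral micro-displacement map.
    stretch: (H, W) local stretch factor (>1 stretched, <1 compressed)."""
    blurred = gaussian_filter(displacement, base_sigma)
    high_freq = displacement - blurred
    amount = np.clip(stretch - 1.0, -1.0, 1.0)
    smoothed = displacement - np.clip(amount, 0.0, 1.0) * high_freq       # toward blur
    roughened = displacement + sharpen_gain * np.clip(-amount, 0.0, 1.0) * high_freq
    return np.where(amount >= 0.0, smoothed, roughened)
```

In the paper the equivalent filtering is evaluated per pixel on the GPU, which is what makes the dynamic microgeometry renderable interactively on animated faces.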
International Conference on Computer Graphics and Interactive Techniques | 2013
Graham Fyffe; Andrew Jones; Oleg Alexander; Ryosuke Ichikari; Paul Graham; Koki Nagano; Jay Busch; Paul E. Debevec
We present a technique for creating realistic facial animation from a set of high-resolution static scans of an actor’s face driven by passive video of the actor from one or more viewpoints. We capture high-resolution static geometry using multi-view stereo and gradient-based photometric stereo [Ghosh et al. 2011]. The scan set includes around 30 expressions largely inspired by the Facial Action Coding System (FACS). Examples of the input scan geometry can be seen in Figure 1 (a). The base topology is defined by an artist for the neutral scan of each subject. The dynamic performance can be shot under existing environmental illumination using one or more off-the-shelf HD video cameras.
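The gradient-based photometric stereo step recovers per-pixel surface normals from images lit under spherical gradient illumination. The following is a simplified sketch of that idea using complementary gradient pairs, under a Lambertian assumption and with calibration, polarization, and specular/diffuse separation omitted; it is an illustration, not the cited pipeline.

```python
# Hedged sketch: photometric normals from complementary spherical gradient pairs
# (X/X', Y/Y', Z/Z'). Under a Lambertian assumption the per-pixel difference of
# each pair is proportional to the surface normal, so normalizing recovers it.
import numpy as np

def normals_from_gradients(ix, ix_c, iy, iy_c, iz, iz_c, eps=1e-8):
    """Each argument is an (H, W) image lit by a gradient or its complement."""
    n = np.stack([ix - ix_c, iy - iy_c, iz - iz_c], axis=-1)
    norm = np.linalg.norm(n, axis=-1, keepdims=True)
    return n / np.maximum(norm, eps)     # unit normals, shape (H, W, 3)
```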
Computer Animation and Social Agents | 2016
Dan Casas; Andrew W. Feng; Oleg Alexander; Graham Fyffe; Paul E. Debevec; Ryosuke Ichikari; Hao Li; Kyle Olszewski; Evan A. Suma; Ari Shapiro
Creating and animating realistic 3D human faces is an important element of virtual reality, video games, and other areas that involve interactive 3D graphics. In this paper, we propose a system to generate photorealistic 3D blendshape-based face models automatically using only a single consumer RGB-D sensor. The capture and processing require no artistic expertise to operate; it takes 15 seconds to capture and generate a single facial expression, and approximately 1 minute of processing time per expression to transform it into a blendshape model. Our main contributions include a complete end-to-end pipeline for capturing and generating photorealistic blendshape models automatically and a registration method that solves dense correspondences between two face scans by utilizing facial landmark detection and optical flow. We demonstrate the effectiveness of the proposed method by capturing different human subjects with a variety of sensors and puppeteering their 3D faces with real-time facial performance retargeting. The rapid nature of our method allows for just-in-time construction of a digital face. To that end, we also integrated our pipeline with a virtual reality facial performance capture system that allows dynamic embodiment of the generated faces despite partial occlusion of the user’s real face by the head-mounted display.
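The registration step brings the scans into dense correspondence using detected facial landmarks and optical flow. The sketch below shows only the kind of rigid landmark alignment (Kabsch/Procrustes) such a registration might start from; the dense per-vertex solve is not reproduced here, and the function name is illustrative.

```python
# Hedged sketch: rigid alignment of two scans from corresponding 3D landmarks
# (Kabsch/Procrustes), a typical initialization before dense registration.
import numpy as np

def align_landmarks(src, dst):
    """Find R (3x3) and t (3,) minimizing ||R @ src_i + t - dst_i|| over landmarks.

    src, dst: (N, 3) arrays of corresponding 3D landmark positions.
    """
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # avoid reflections
    R = Vt.T @ D @ U.T
    t = dst_c - R @ src_c
    return R, t
```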
Interactive 3D Graphics and Games | 2015
Dan Casas; Oleg Alexander; Andrew W. Feng; Graham Fyffe; Ryosuke Ichikari; Paul E. Debevec; Ruizhe Wang; Evan A. Suma; Ari Shapiro
Creating and animating a realistic 3D human face has been an important task in computer graphics. The capability of capturing the 3D face of a human subject and reanimating it quickly will find many applications in games, training simulations, and interactive 3D graphics. In this paper, we propose a system to capture photorealistic 3D faces and generate the blendshape models automatically using only a single commodity RGB-D sensor. Our method can rapidly generate a set of expressive facial poses from a single Microsoft Kinect and requires no artistic expertise on the part of the capture subject. The system takes only a matter of seconds to capture and produce a 3D facial pose and only requires 4 minutes of processing time to transform it into a blendshape model. Our main contributions include an end-to-end pipeline for capturing and generating face blendshape models automatically, and a registration method that solves dense correspondences between two face scans by utilizing facial landmark detection and optical flow. We demonstrate the effectiveness of the proposed method by capturing 3D facial models of different human subjects and puppeteering their models in an animation system with real-time facial performance retargeting.
International Conference on Computer Graphics and Interactive Techniques | 2015
Dan Casas; Oleg Alexander; Andrew W. Feng; Graham Fyffe; Ryosuke Ichikari; Paul E. Debevec; Ruizhe Wang; Evan A. Suma; Ari Shapiro
Creating and animating a realistic 3D human face is an important task in computer graphics. The capability of capturing the 3D face of a human subject and reanimating it quickly will find many applications in games, training simulations, and interactive 3D graphics. We demonstrate a system to capture photorealistic 3D faces and generate the blendshape models automatically using only a single commodity RGB-D sensor. Our method can rapidly generate a set of expressive facial poses from a single depth sensor, such as a Microsoft Kinect version 1, and requires no artistic expertise in order to process those scans. The system takes only a matter of seconds to capture and produce a 3D facial pose and only requires a few minutes of processing time to transform it into a blendshape-compatible model. Our main contributions include an end-to-end pipeline for capturing and generating face blendshape models automatically, and a registration method that solves dense correspondences between two face scans by utilizing facial landmark detection and optical flow. We demonstrate the effectiveness of the proposed method by capturing different human subjects and puppeteering their 3D faces in an animation system with real-time facial performance retargeting.
International Conference on Computer Graphics and Interactive Techniques | 2014
Andrew Jones; Jonas Unger; Koki Nagano; Jay Busch; Xueming Yu; Hsuan-Yueh Peng; Oleg Alexander; Paul E. Debevec
We present a system for capturing and rendering life-size 3D human subjects on an automultiscopic display. Automultiscopic 3D displays allow a large number of viewers to experience 3D content simultaneously without the hassle of special glasses or head gear. Such displays are ideal for human subjects as they allow for natural personal interactions with 3D cues such as eye-gaze and complex hand gestures. In this talk, we will focus on a case study where our system was used to digitize television host Morgan Spurlock for his documentary show "Inside Man" on CNN. Automultiscopic displays work by generating many simultaneous views with high angular density over a wide field of view. The angular spacing between views must be small enough that each eye perceives a distinct and different view. As the user moves around the display, the eye smoothly transitions from one view to the next. We generate multiple views using a dense horizontal array of video projectors. As video projectors continue to shrink in size, power consumption, and cost, it is now possible to closely stack hundreds of projectors so that their lenses are almost continuous. However, this display presents a new challenge for content acquisition. It would require hundreds of cameras to directly measure every projector ray. We achieve similar quality with a new view interpolation algorithm suitable for dense automultiscopic displays.
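The view interpolation step synthesizes the many projector views from a sparser set of cameras. As a rough illustration of the basic angular blending involved (real systems also reproject by depth or flow, which this omits), the sketch below picks and blends the two captured views nearest to a requested ray angle; the names and parameters are illustrative.

```python
# Hedged sketch: blend the two captured camera views nearest to a projector
# ray's viewing angle. Purely angular blending; depth/flow reprojection omitted.
import numpy as np

def interpolate_view(camera_angles, camera_images, ray_angle):
    """camera_angles: sorted 1D array of capture angles (degrees).
    camera_images: list of (H, W, 3) float images, one per camera.
    ray_angle: desired viewing angle for one projector view."""
    idx = np.clip(np.searchsorted(camera_angles, ray_angle), 1, len(camera_angles) - 1)
    a0, a1 = camera_angles[idx - 1], camera_angles[idx]
    w = np.clip((ray_angle - a0) / max(a1 - a0, 1e-8), 0.0, 1.0)  # blend weight
    return (1.0 - w) * camera_images[idx - 1] + w * camera_images[idx]
```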