Network


Latest external collaborations at the country level.

Hotspot


Research topics in which Jizhou Gao is active.

Publication


Featured research published by Jizhou Gao.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2010

Spatial-Temporal Fusion for High Accuracy Depth Maps Using Dynamic MRFs

Jiejie Zhu; Liang Wang; Jizhou Gao; Ruigang Yang

Time-of-flight range sensors and passive stereo have complementary characteristics. To fuse them and obtain high-accuracy depth maps that vary over time, we extend traditional spatial MRFs to dynamic MRFs with temporal coherence. This new model allows both spatial and temporal relationships to be propagated among local neighbors. By efficiently finding a maximum of the posterior probability using loopy belief propagation, we show that our approach improves the accuracy and robustness of depth estimates for dynamic scenes.
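
As a rough illustration of the fusion idea, the sketch below builds a small grid MRF whose unary costs combine two noisy depth sensors plus a temporal-coherence term, and runs a few sweeps of min-sum loopy belief propagation. It is a minimal stand-in for the paper's dynamic MRF, with synthetic data and illustrative weights.

```python
# Minimal sketch (not the authors' code): fuse two noisy depth observations
# plus a temporal prior on a grid MRF, solved with min-sum loopy BP.
import numpy as np

H, W, L = 20, 20, 16                      # grid size, number of depth labels
rng = np.random.default_rng(0)
truth = np.clip((np.arange(W) * L // W)[None, :].repeat(H, 0), 0, L - 1)
tof    = np.clip(truth + rng.integers(-2, 3, (H, W)), 0, L - 1)  # noisy ToF
stereo = np.clip(truth + rng.integers(-3, 4, (H, W)), 0, L - 1)  # noisy stereo
prev   = truth                            # previous frame's depth (temporal prior)

labels = np.arange(L)
# Unary cost: truncated L1 to each sensor plus a temporal-coherence term.
unary = (np.minimum(np.abs(labels - tof[..., None]), 4.0)
         + np.minimum(np.abs(labels - stereo[..., None]), 4.0)
         + 0.5 * np.minimum(np.abs(labels - prev[..., None]), 4.0))

lam, trunc = 1.0, 3.0                     # pairwise: truncated linear smoothness
pair = lam * np.minimum(np.abs(labels[:, None] - labels[None, :]), trunc)

# Messages arriving from the 4 directions: above, below, left, right.
msgs = np.zeros((4, H, W, L))
def send(belief):
    # min-sum message: min over source labels of (belief + pairwise cost)
    return (belief[..., None] + pair).min(axis=-2)

for _ in range(10):                       # a few synchronous BP sweeps
    belief = unary + msgs.sum(0)
    new = np.zeros_like(msgs)
    new[0, 1:, :]  = send(belief - msgs[1])[:-1, :]   # from above
    new[1, :-1, :] = send(belief - msgs[0])[1:, :]    # from below
    new[2, :, 1:]  = send(belief - msgs[3])[:, :-1]   # from left
    new[3, :, :-1] = send(belief - msgs[2])[:, 1:]    # from right
    msgs = new - new.mean(-1, keepdims=True)          # normalize for stability

fused = (unary + msgs.sum(0)).argmin(-1)
print("ToF error:", np.abs(tof - truth).mean(),
      "| fused error:", np.abs(fused - truth).mean())
```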


International Conference on Computer Graphics and Interactive Techniques | 2013

Semantic decomposition and reconstruction of residential scenes from LiDAR data

Hui Lin; Jizhou Gao; Yu Zhou; Guiliang Lu; Mao Ye; Chenxi Zhang; Ligang Liu; Ruigang Yang

We present a complete system to semantically decompose and reconstruct 3D models from point clouds. Unlike previous urban modeling approaches, our system is designed for residential scenes, which consist mainly of low-rise buildings that do not exhibit the regularity and repetitiveness of high-rise buildings in downtown areas. Our system first automatically labels the input into distinct categories using supervised learning techniques. Based on the semantic labels, objects in different categories are reconstructed with domain-specific knowledge. In particular, we present a novel building modeling scheme that decomposes and fits the building point cloud into basic blocks that are block-wise symmetric and convex. This building representation and its reconstruction algorithm are flexible, efficient, and robust to missing data. We demonstrate the effectiveness of our system on various datasets and compare our building modeling scheme with other state-of-the-art reconstruction algorithms to show its advantage in terms of both quality and speed.
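
The sketch below caricatures the two stages on synthetic data: a random forest (one plausible choice of supervised classifier; the paper's exact features and classifier may differ) labels points as ground, vegetation, or building, and the building points are then fit with axis-aligned blocks per height slab, a crude stand-in for the paper's symmetric, convex block decomposition.

```python
# Hedged sketch, not the paper's pipeline: label synthetic LiDAR-like points,
# then fit the building points with simple per-layer bounding blocks.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
def make_points(n, kind):
    xy = rng.uniform(0, 10, (n, 2))
    if kind == "ground":   z = rng.normal(0.0, 0.05, n)
    if kind == "tree":     z = rng.uniform(2, 6, n); xy = rng.normal(8, 0.7, (n, 2))
    if kind == "building": xy = rng.uniform(2, 5, (n, 2)); z = rng.uniform(0, 4, n)
    return np.column_stack([xy, z])

pts = np.vstack([make_points(300, k) for k in ("ground", "tree", "building")])
y   = np.repeat([0, 1, 2], 300)          # ground / tree / building labels

def features(p):                         # height + local height variation
    d = np.linalg.norm(p[:, None, :2] - p[None, :, :2], axis=-1)
    nbr = np.argsort(d, axis=1)[:, :10]  # 10 nearest neighbors in the plane
    return np.column_stack([p[:, 2], p[:, 2][nbr].std(axis=1)])

X = features(pts)
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
labels = clf.predict(X)

bld = pts[labels == 2]
for lo in range(0, 4):                   # one bounding block per 1 m height slab
    layer = bld[(bld[:, 2] >= lo) & (bld[:, 2] < lo + 1)]
    if len(layer):
        mn, mx = layer[:, :2].min(0), layer[:, :2].max(0)
        print(f"block z=[{lo},{lo+1}): x={mn[0]:.1f}..{mx[0]:.1f}, "
              f"y={mn[1]:.1f}..{mx[1]:.1f}")
```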


IEEE Transactions on Visualization and Computer Graphics | 2012

Video Stereolization: Combining Motion Analysis with User Interaction

Miao Liao; Jizhou Gao; Ruigang Yang; Minglun Gong

We present a semiautomatic system that converts conventional videos into stereoscopic videos by combining motion analysis with user interaction, aiming to transfer as much labeling work as possible from the user to the computer. In addition to the widely used structure-from-motion (SFM) techniques, we develop two new methods that analyze optical flow to provide additional qualitative depth constraints. They remove the camera movement restriction imposed by SFM, so that general motions can be used in scene depth estimation, the central problem in mono-to-stereo conversion. With these algorithms, the user's labeling task is significantly simplified. We further develop a quadratic programming approach to incorporate both quantitative depth and qualitative depth (such as that from user scribbles) to recover dense depth maps for all frames, from which stereoscopic views can be synthesized. In addition to visual results, we present user study results showing that our approach is more intuitive and less labor intensive, while producing 3D effects comparable to those from current state-of-the-art interactive algorithms.
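
A hedged sketch of the quadratic programming step, on a toy 1D scanline rather than full frames: sparse quantitative depths (SFM-like) enter as a quadratic data term, qualitative ordering constraints ("pixel a is nearer than pixel b") enter as linear inequalities, and a smoothness term fills in dense depth. The margin, weights, and SLSQP solver are illustrative choices, not the paper's.

```python
# Toy QP: dense depth from sparse metric depths + ordering constraints.
import numpy as np
from scipy.optimize import minimize

n = 30
known = {5: 1.0, 24: 3.0}              # sparse quantitative depths (SfM-like)
nearer = [(10, 20)]                    # qualitative: pixel 10 nearer than pixel 20

def cost(d):
    smooth = np.sum(np.diff(d) ** 2)                       # quadratic smoothness
    data = sum((d[i] - z) ** 2 for i, z in known.items())  # fidelity to SfM depths
    return smooth + 10.0 * data

# inequality constraints: d[b] - d[a] >= margin (fun(d) >= 0 for SLSQP)
cons = [{"type": "ineq", "fun": (lambda d, a=a, b=b: d[b] - d[a] - 0.5)}
        for a, b in nearer]
res = minimize(cost, np.ones(n), constraints=cons, method="SLSQP")
print(np.round(res.x, 2))              # recovered dense depth along the scanline
```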


European Conference on Computer Vision | 2008

Illumination and Person-Insensitive Head Pose Estimation Using Distance Metric Learning

Xianwang Wang; Xinyu Huang; Jizhou Gao; Ruigang Yang

Head pose estimation is an important task for many face analysis applications, such as face recognition systems and human-computer interaction. In this paper, we address the pose estimation problem under challenging conditions, e.g., estimation from a single image, large pose variations, and uneven illumination. Our approach combines non-linear dimension reduction techniques with a learned distance metric transformation. The learned distance metric provides better intra-class clustering, thereby preserving a smooth low-dimensional manifold in the presence of large variations in the input images due to illumination changes. Experiments show that our method improves performance, achieving errors within 2-3 degrees for face images with varying poses and within 3-4 degrees for face images with both varying pose and illumination changes.
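
A minimal sketch of the metric learning idea on synthetic features (NCA here is one possible metric learner; the paper's choice of metric learning and dimension reduction may differ): the learned transform pulls same-pose samples together despite an illumination nuisance dimension, and pose is then regressed by k-NN in the learned space.

```python
# Sketch: learn a metric so pose clusters survive an illumination nuisance.
import numpy as np
from sklearn.neighbors import NeighborhoodComponentsAnalysis, KNeighborsRegressor

rng = np.random.default_rng(0)
n = 400
pose  = rng.uniform(-45, 45, n)        # yaw in degrees
illum = rng.uniform(-1, 1, n)          # nuisance: illumination direction
# Features: pose-driven dimensions plus strong illumination-driven dimensions.
X = np.column_stack([np.sin(np.radians(pose)), np.cos(np.radians(pose)),
                     3 * illum, 3 * illum ** 2]) + rng.normal(0, 0.05, (n, 4))

bins = np.digitize(pose, np.linspace(-45, 45, 10))   # class labels for NCA
tr, te = np.arange(300), np.arange(300, n)
nca = NeighborhoodComponentsAnalysis(n_components=2,
                                     random_state=0).fit(X[tr], bins[tr])
knn = KNeighborsRegressor(5).fit(nca.transform(X[tr]), pose[tr])
err = np.abs(knn.predict(nca.transform(X[te])) - pose[te]).mean()
print(f"mean pose error: {err:.1f} degrees")
```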


IEEE Transactions on Visualization and Computer Graphics | 2014

Personal Photograph Enhancement Using Internet Photo Collections

Chenxi Zhang; Jizhou Gao; Oliver Wang; Pierre Fite Georgel; Ruigang Yang; James Davis; Jan Michael Frahm; Marc Pollefeys

Given the growth of Internet photo collections, we now have a visual index of all major cities and tourist sites in the world. However, it is still a difficult task to capture that perfect shot with your own camera when visiting these places, especially when your camera itself has limitations, such as a limited field of view. In this paper, we propose a framework to overcome the imperfections of personal photographs of tourist sites using the rich information provided by large-scale Internet photo collections. Our method deploys state-of-the-art techniques for constructing initial 3D models from photo collections. The same techniques are then used to register personal photographs to these models, allowing us to augment personal 2D images with 3D information. This strong scene prior allows us to address a number of traditionally challenging image enhancement tasks and achieve high-quality results using simple and robust algorithms. Specifically, we demonstrate automatic foreground segmentation, mono-to-stereo conversion, field-of-view expansion, photometric enhancement, and automatic annotation with geolocation and tags. Our method clearly demonstrates some possible benefits of employing the rich information contained in online photo databases to efficiently enhance and augment one's own personal photographs.
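
One step of such a pipeline, sketched with synthetic correspondences: registering a personal photograph against an existing 3D model via PnP with RANSAC, after which the model's 3D information can be projected into the photo. OpenCV's solvePnPRansac is used here as a generic tool; the paper's registration machinery is its own.

```python
# Sketch: register a new photo to a 3D model from 2D-3D matches (PnP + RANSAC).
import numpy as np, cv2

rng = np.random.default_rng(0)
pts3d = rng.uniform(-1, 1, (100, 3)) + [0, 0, 5]   # model points in front of camera
K = np.array([[800, 0, 320], [0, 800, 240], [0, 0, 1]], float)
rvec_true = np.array([0.1, -0.2, 0.05]); tvec_true = np.array([0.2, 0.1, 0.3])

# Synthesize noisy "feature matches", including some gross outliers.
proj, _ = cv2.projectPoints(pts3d, rvec_true, tvec_true, K, None)
pts2d = proj.reshape(-1, 2) + rng.normal(0, 0.5, (100, 2))
pts2d[:10] += rng.uniform(30, 60, (10, 2))         # 10 outlier matches

ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    pts3d.astype(np.float32), pts2d.astype(np.float32), K, None,
    reprojectionError=3.0)
print("registered:", ok, "| inliers:", len(inliers),
      "| rotation error:", np.linalg.norm(rvec.ravel() - rvec_true))
```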


International Conference on Computer Vision | 2009

Unsupervised learning of high-order structural semantics from images

Jizhou Gao; Yin Hu; Jinze Liu; Ruigang Yang

Structural semantics are fundamental to understanding both natural and man-made objects, from languages to buildings. They are manifested as repeated structures or patterns and are often captured in images. Finding repeated patterns in images therefore has important applications in scene understanding, 3D reconstruction, and image retrieval, as well as image compression. Previous approaches to visual-pattern mining limited themselves to frequently co-occurring features within a small neighborhood of an image. However, the semantics of a visual pattern are typically defined by specific spatial relationships between features, regardless of spatial proximity. In this paper, semantics are represented as visual elements and the geometric relationships between them. A novel unsupervised learning algorithm finds pairwise associations of visual elements that have consistent geometric relationships sufficiently often. The algorithm is efficient: maximal matchings are determined without combinatorial search. High-order structural semantics are extracted by mining patterns composed of pairwise spatially consistent associations of visual elements. We demonstrate the effectiveness of our approach for discovering repeated visual patterns on a variety of image collections.
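
An illustrative toy version of the pairwise association idea (not the paper's mining algorithm): visual elements become labeled 2D points, and we vote over quantized relative displacements between element types. Displacements that recur sufficiently often reveal the repeated structure, independent of absolute position.

```python
# Toy mining: find consistent geometric relationships between element types.
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)
# A repeated facade-like pattern: type A on a grid, type B offset by (2, 1).
A = np.array([(x, y) for x in range(0, 20, 4) for y in range(0, 12, 4)], float)
B = A + [2, 1]
noise = rng.uniform(0, 20, (30, 2))               # unstructured clutter, type C
elements = ([("A", p) for p in A] + [("B", p) for p in B]
            + [("C", p) for p in noise])

votes = Counter()
for i, (la, pa) in enumerate(elements):
    for j, (lb, pb) in enumerate(elements):
        if i == j:
            continue
        offset = tuple(int(v) for v in np.round(pb - pa))  # quantized displacement
        votes[(la, lb, offset)] += 1

# Frequent (type, type, offset) triples expose the repeated structure.
for assoc, count in votes.most_common(5):
    print(assoc, "supported by", count, "pairs")
```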


International Conference on 3-D Digital Imaging and Modeling | 2007

Examplar-based Shape from Shading

Xinyu Huang; Jizhou Gao; Liang Wang; Ruigang Yang

Traditional shape-from-shading (SFS) techniques aim to solve an under-constrained problem: estimating a depth map from a single image. The results are usually brittle on real images containing detailed shapes. Inspired by recent advances in texture synthesis, we present an exemplar-based approach to improve the robustness and accuracy of SFS. In essence, we utilize an appearance database synthesized from known 3D models, where each image pixel is associated with its ground-truth normal. The input image is compared against the images in the database to find the most likely normals. The prior knowledge from the database is formulated as an additional cost term in an energy minimization framework used to solve for the depth map. Using a small generic database consisting of 50 spheres with different radii, our approach demonstrably improves the reconstruction quality of both synthetic and real images with different shapes, in particular those with small details.
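
A compact sketch of the exemplar lookup (illustrative only, with a much smaller database than the paper's 50 spheres): Lambertian spheres of known geometry supply a database mapping 3x3 intensity patches to ground-truth normals, and normals of a new shape are estimated by nearest-neighbor patch lookup. The paper goes further, folding this prior into an energy minimization for depth.

```python
# Sketch: exemplar database of (intensity patch -> normal) from shaded spheres.
import numpy as np

light = np.array([0.3, 0.3, 0.9]); light /= np.linalg.norm(light)

def render_sphere(r, size=41):
    ax = np.arange(size) - size // 2
    X, Y = np.meshgrid(ax, ax)
    mask = X**2 + Y**2 < r**2
    Z = np.sqrt(np.maximum(r**2 - X**2 - Y**2, 0.0))
    N = np.dstack([X, Y, Z]) / r            # unit normals on the sphere
    I = np.clip(N @ light, 0, 1) * mask     # Lambertian shading
    return I, N, mask

def patches(I, N, mask, k=1):
    P, G = [], []
    for y in range(k, I.shape[0] - k):
        for x in range(k, I.shape[1] - k):
            if mask[y-k:y+k+1, x-k:x+k+1].all():
                P.append(I[y-k:y+k+1, x-k:x+k+1].ravel())
                G.append(N[y, x])
    return np.array(P), np.array(G)

# Database from spheres of several radii; the query sphere has an unseen radius.
db_P, db_N = map(np.vstack, zip(*[patches(*render_sphere(r)) for r in (8, 12, 16)]))
q_P, q_N = patches(*render_sphere(10))
idx = np.argmin(((q_P[:, None, :] - db_P[None, :, :])**2).sum(-1), axis=1)
err = np.degrees(np.arccos(np.clip((db_N[idx] * q_N).sum(-1), -1, 1))).mean()
print(f"mean normal error: {err:.1f} degrees")
```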


Asian Conference on Computer Vision | 2007

Calibrating pan-tilt cameras with telephoto lenses

Xinyu Huang; Jizhou Gao; Ruigang Yang

Pan-tilt cameras are widely used in surveillance networks. These cameras are often equipped with telephoto lenses to capture objects at a distance. Such a camera makes full-metric calibration more difficult, since the projection through a telephoto lens is close to orthographic. This paper discusses the problems caused by pan-tilt cameras with long focal lengths and presents a method to improve calibration accuracy. Experiments show that our method reduces re-projection errors by an order of magnitude compared to popular homography-based approaches.
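
A small numerical illustration of why the problem is hard (my construction, not the paper's experiment): for a camera rotating on a pan-tilt head, points transfer between views through the homography H = K R K^-1. Doubling the focal length while (roughly) halving the pan angle produces nearly the same image motion once the lens is telephoto, so focal length and rotation become nearly indistinguishable, which is what ill-conditions homography-based calibration.

```python
# Demonstrate the focal-length / pan-angle ambiguity for telephoto lenses.
import numpy as np

def transfer(f, pan_deg, px):
    # map pixels through the pure-rotation homography H = K R K^-1
    K = np.array([[f, 0, 320], [0, f, 240], [0, 0, 1.0]])
    c, s = np.cos(np.radians(pan_deg)), np.sin(np.radians(pan_deg))
    R = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])   # pan about the y axis
    p = (K @ R @ np.linalg.inv(K) @ px.T).T
    return p[:, :2] / p[:, 2:]

px = np.array([[x, y, 1.0] for x in (0, 320, 640) for y in (0, 240, 480)])
pan1 = 2.0                                             # true pan angle (degrees)
pan2 = np.degrees(np.arctan(np.tan(np.radians(pan1)) / 2))  # half-tangent pan
for f in (500, 5000, 50000):                           # wide angle -> telephoto
    # Doubling f while halving tan(pan) keeps the central image shift equal;
    # only field-of-view-dependent second-order terms distinguish the two.
    d = np.abs(transfer(f, pan1, px) - transfer(2 * f, pan2, px)).max()
    print(f"f={f:6d}: max pixel difference between (f, pan) and (2f, ~pan/2) = {d:.4f}")
```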


Computer Vision and Pattern Recognition | 2008

Estimating pose and illumination direction for frontal face synthesis

Xinyu Huang; Xianwang Wang; Jizhou Gao; Ruigang Yang

Face pose and illumination estimation is an important pre-processing step in many face analysis problems. In this paper, we present a new method to estimate the face pose and illumination direction from a single image. The basic idea is to compare the reconstruction residuals between the input image and a small set of reference images under different poses and illumination directions. Based on the estimated pose and illumination direction, we develop a face synthesis framework to rectify the input image to the frontal view under standard illumination. Experiments show that our estimation method is both fast (less than one second per frame) and accurate (errors under three degrees), and that our face synthesis method generates visually plausible results, in particular for challenging inputs with large pose changes and poor lighting conditions. The synthesized frontal face views increase the face recognition rate significantly, from 1.5% to 62.1%.
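
A minimal sketch of residual-based estimation on synthetic vectors standing in for face images: each (pose, illumination) bin keeps a mean and a low-rank PCA basis of reference samples, and the input is assigned to the bin whose subspace reconstructs it with the smallest residual. Bin counts, basis rank, and data are all illustrative.

```python
# Sketch: pick the (pose, light) bin with the smallest reconstruction residual.
import numpy as np

rng = np.random.default_rng(0)
poses, lights, dim = range(5), range(3), 100

def sample(p, l, n=1):
    # synthetic stand-ins for face images rendered at pose p under light l
    base = (np.sin(np.arange(dim) * (p + 1) * 0.1)
            + 0.5 * np.cos(np.arange(dim) * (l + 1) * 0.23))
    return base + rng.normal(0, 0.1, (n, dim))

bases = {}
for p in poses:
    for l in lights:
        X = sample(p, l, 20)                 # small reference set per bin
        mu = X.mean(0)
        _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
        bases[(p, l)] = (mu, Vt[:3])         # mean + top-3 PCA basis

def estimate(img):
    def residual(key):
        mu, V = bases[key]
        c = img - mu
        return np.linalg.norm(c - c @ V.T @ V)   # distance off the bin's subspace
    return min(bases, key=residual)

truth = (3, 1)
print("estimated (pose, light):", estimate(sample(*truth)[0]), "| truth:", truth)
```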


Asian Conference on Computer Vision | 2009

Manifold estimation in view-based feature space for face synthesis across poses

Xinyu Huang; Jizhou Gao; Sen-ching S. Cheung; Ruigang Yang

This paper presents a new approach to synthesizing face images under different pose changes given a single input image. The approach is based on two observations: (1) a series of face images of a single person under different poses can be mapped to a smooth manifold in a unified feature space; (2) the manifolds of different faces are separated from each other by their dissimilarities. The new manifold estimation is formulated as an energy minimization problem with smoothness constraints. Experiments show that face images under different poses can be robustly synthesized from one input image, even with large pose variations.
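
A hedged sketch of the energy minimization on synthetic trajectories: a person's features across poses form a smooth curve, and from one observed pose the whole curve is estimated by minimizing a quadratic energy with a data term at the observed pose, a pull toward a mean reference manifold, and a second-difference smoothness term. All weights and data are illustrative assumptions.

```python
# Sketch: recover a person's pose manifold from one observation + smoothness.
import numpy as np

rng = np.random.default_rng(0)
P, D = 9, 6                                   # pose samples, feature dimension
t = np.linspace(0, np.pi, P)
mean_manifold = np.column_stack([np.sin(t * (k + 1)) for k in range(D)])
person = mean_manifold + 0.3 * rng.normal(0, 1, D)   # person-specific offset
obs_pose, x = 4, person[4] + rng.normal(0, 0.02, D)  # one observed image

lam, mu = 5.0, 0.2
# Build the quadratic system A y = b from the three energy terms.
S = np.zeros((P - 2, P))
for i in range(P - 2):
    S[i, i:i+3] = [1, -2, 1]                  # second differences (smoothness)
A = lam * S.T @ S + mu * np.eye(P)            # smoothness + mean-manifold prior
A[obs_pose, obs_pose] += 1.0                  # data term at the observed pose
b = mu * mean_manifold.copy()
b[obs_pose] += x
y = np.linalg.solve(A, b)                     # estimated manifold, all poses
print(f"error at far pose: {np.linalg.norm(y[0] - person[0]):.3f}")
```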

Collaboration


An overview of Jizhou Gao's collaborations.

Top Co-Authors

Xinyu Huang (North Carolina Central University)

Liang Wang (University of Kentucky)

Alade O. Tokuta (North Carolina Central University)

Hui Lin (University of Kentucky)

James Davis (University of California)

Jan Michael Frahm (University of North Carolina at Chapel Hill)