Enliang Zheng
University of North Carolina at Chapel Hill
Publications
Featured research published by Enliang Zheng.
european conference on computer vision | 2016
Johannes L. Schönberger; Enliang Zheng; Jan Michael Frahm; Marc Pollefeys
This work presents a Multi-View Stereo system for robust and efficient dense modeling from unstructured image collections. Our core contributions are the joint estimation of depth and normal information, pixelwise view selection using photometric and geometric priors, and a multi-view geometric consistency term for the simultaneous refinement and image-based depth and normal fusion. Experiments on benchmarks and large-scale Internet photo collections demonstrate state-of-the-art performance in terms of accuracy, completeness, and efficiency.
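A rough illustration of the pixelwise view selection idea, as a minimal Python/NumPy sketch (the NCC scoring, the threshold tau, and all names are illustrative assumptions, not the paper's actual photometric and geometric priors):

import numpy as np

def ncc(a, b, eps=1e-8):
    """Normalized cross-correlation of two equally sized patches."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def score_depth_hypothesis(ref_patch, src_patches, tau=0.6):
    """Aggregate photoconsistency over a per-pixel selected view subset.

    ref_patch   : (h, w) patch around the reference pixel
    src_patches : list of (h, w) patches, one per source image, sampled
                  at the location implied by the depth hypothesis
    tau         : minimum NCC for a view to count as consistent
    """
    scores = [ncc(ref_patch, p) for p in src_patches]
    selected = [s for s in scores if s > tau]   # pixelwise view selection
    if not selected:
        return -1.0                             # no view supports this depth
    return sum(selected) / len(selected)        # average over selected views

# Usage: evaluate two depth hypotheses and keep the better-supported one.
rng = np.random.default_rng(0)
ref = rng.standard_normal((7, 7))
hypos = [[ref + 0.1 * rng.standard_normal((7, 7)) for _ in range(5)],
         [rng.standard_normal((7, 7)) for _ in range(5)]]
best = max(range(len(hypos)), key=lambda i: score_depth_hypothesis(ref, hypos[i]))
print("best depth hypothesis:", best)   # the correlated set should win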
computer vision and pattern recognition | 2014
Enliang Zheng; Enrique Dunn; Vladimir Jojic; Jan Michael Frahm
We propose a multi-view depthmap estimation approach aimed at adaptively ascertaining the pixel-level data associations between a reference image and all the elements of a source image set. Namely, we address the question: what aggregation subset of the source image set should we use to estimate the depth of a particular pixel in the reference image? We pose the problem within a probabilistic framework that jointly models pixel-level view selection and depthmap estimation given the local pairwise image photoconsistency. The corresponding graphical model is solved by EM-based view-selection probability inference and PatchMatch-like depth sampling and propagation. Experimental results on standard multi-view benchmarks convey the state-of-the-art estimation accuracy afforded by mitigating spurious pixel-level data associations. Additionally, experiments on large Internet crowd-sourced data demonstrate the robustness of our approach against unstructured and heterogeneous image capture characteristics. Moreover, the linear computational and storage requirements of our formulation, as well as its inherent parallelism, enable an efficient and scalable GPU-based implementation.
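A toy sketch of the alternation described above: per-pixel view-selection probabilities are inferred EM-style from photoconsistency, and depths are updated by PatchMatch-like propagation from neighbors plus random candidates. The cost tensor below is a synthetic stand-in for real photoconsistency measurements; all sizes and names are assumptions:

import numpy as np

rng = np.random.default_rng(1)
W, V, D = 64, 5, 32                   # pixels (1D scanline), views, depth labels
true_depth = np.linspace(5, 20, W)

# photo_cost[x, v, d]: matching cost of pixel x in view v at depth label d.
depths = np.linspace(0, 30, D)
photo_cost = (np.abs(depths[None, None, :] - true_depth[:, None, None])
              + rng.exponential(2.0, (W, V, D)))       # per-view noise
photo_cost[:, 3, :] = rng.uniform(0, 30, (W, D))       # view 3 is an occluder

depth_idx = rng.integers(0, D, W)                      # random initialization
view_prob = np.full((W, V), 1.0 / V)                   # uniform view priors

for it in range(8):
    # E-step: views that explain the current depths well get more weight.
    lik = np.exp(-photo_cost[np.arange(W), :, depth_idx])
    view_prob = lik / lik.sum(axis=1, keepdims=True)
    # PatchMatch-like step: try the left neighbor's label and a random one.
    for x in range(1, W):
        cur_cost = view_prob[x] @ photo_cost[x, :, depth_idx[x]]
        for cand in (depth_idx[x - 1], rng.integers(0, D)):
            new_cost = view_prob[x] @ photo_cost[x, :, cand]
            if new_cost < cur_cost:
                depth_idx[x], cur_cost = cand, new_cost

print("mean abs depth error:", np.abs(depths[depth_idx] - true_depth).mean())
print("avg weight of occluded view:", view_prob[:, 3].mean())  # should be low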
international conference on computer vision | 2015
Enliang Zheng; Changchang Wu
This paper proposes a new incremental structure from motion (SfM) algorithm based on a novel structure-less camera resection technique. Traditional methods rely on 2D-3D correspondences to compute the pose of candidate cameras using PnP. In this work, we take the collection of already reconstructed cameras as a generalized camera and determine the absolute pose of a candidate pinhole camera from pure 2D correspondences, which we call the semi-generalized camera pose problem. We present minimal solvers for the new problem for both calibrated and partially calibrated (unknown focal length) pinhole cameras. By integrating these new algorithms into an incremental SfM system, we go beyond state-of-the-art methods with the capability of reconstructing cameras without 2D-3D correspondences. Large-scale real image experiments show that our new SfM system significantly improves the completeness of 3D reconstruction over the standard approach.
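A minimal sketch (not the paper's minimal solver) of the geometric constraint such a formulation builds on: a pixel ray of the candidate camera and the matching ray of the generalized camera (origin c_i, direction d_i, taken from an already reconstructed camera) must be coplanar, since both pass through the same 3D point. Here the pose is simply refined by nonlinear least squares; SciPy is assumed available and all names are illustrative:

import numpy as np
from scipy.optimize import least_squares

def rodrigues(w):
    """Axis-angle vector -> rotation matrix."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    k = w / theta
    K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

def residuals(params, x_rays, origins, dirs):
    """Coplanarity residual per 2D-2D correspondence.

    params        : (6,) axis-angle + center of the candidate camera;
                    R maps the candidate camera frame to the world frame
    x_rays        : (n, 3) calibrated pixel rays in the candidate camera frame
    origins, dirs : (n, 3) matching rays of the generalized camera (world)
    """
    R, C = rodrigues(params[:3]), params[3:]
    D = x_rays @ R.T                       # pixel rays rotated into the world
    return np.einsum('ij,ij->i', origins - C, np.cross(D, dirs))

# Synthetic check: points seen by the generalized camera and by a candidate
# camera at C_true with identity rotation.
rng = np.random.default_rng(2)
pts = rng.uniform(-1, 1, (20, 3)) + [0, 0, 5]
origins = rng.uniform(-1, 1, (20, 3))
dirs = pts - origins
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
C_true = np.array([0.5, -0.2, 0.0])
x_rays = pts - C_true
x_rays /= np.linalg.norm(x_rays, axis=1, keepdims=True)

sol = least_squares(residuals, np.zeros(6), args=(x_rays, origins, dirs))
print("recovered camera center:", sol.x[3:])   # should approach C_true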
international conference on computer vision | 2015
Enliang Zheng; Dinghuang Ji; Enrique Dunn; Jan Michael Frahm
We target the sparse 3D reconstruction of dynamic objects observed by multiple unsynchronized video cameras with unknown temporal overlap. To this end, we develop a framework to recover the unknown structure without sequencing information across video sequences. Our proposed compressed sensing framework poses the estimation of 3D structure as a dictionary learning problem. Specifically, we define our dictionary as the temporally varying 3D structure, while local sequencing information is captured by the sparse coefficients describing a locally linear 3D structural interpolation. Our formulation optimizes a biconvex cost function that leverages a compressed sensing formulation and enforces both structural dependency coherence across video streams and motion smoothness across estimates from common video sources. Experimental results demonstrate the effectiveness of our approach on both synthetic data and captured imagery.
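A toy sketch of the biconvex alternation pattern described above: with the dictionary fixed, sparse coefficients are fit (one ISTA proximal-gradient step of an l1 problem here); with the coefficients fixed, the dictionary is updated by least squares. Data, sizes, and the specific updates are synthetic placeholders, not the paper's cost function:

import numpy as np

rng = np.random.default_rng(3)
n, k, m = 30, 8, 200                 # observation dim, atoms, samples
D_true = rng.standard_normal((n, k))
codes_true = rng.standard_normal((k, m)) * (rng.random((k, m)) < 0.2)
Y = D_true @ codes_true + 0.01 * rng.standard_normal((n, m))

D = rng.standard_normal((n, k))      # random dictionary initialization
A = np.zeros((k, m))                 # sparse coefficients
lam = 0.05                           # l1 weight

for it in range(200):
    # Sparse-coding step: one proximal-gradient (ISTA) update on A.
    step = 1.0 / np.linalg.norm(D.T @ D, 2)
    G = A - step * (D.T @ (D @ A - Y))
    A = np.sign(G) * np.maximum(np.abs(G) - step * lam, 0.0)
    # Dictionary step: least-squares update with column renormalization.
    D = Y @ np.linalg.pinv(A)
    D /= np.maximum(np.linalg.norm(D, axis=0, keepdims=True), 1e-8)

print("relative reconstruction error:",
      np.linalg.norm(D @ A - Y) / np.linalg.norm(Y))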
international conference on 3d imaging, modeling, processing, visualization & transmission | 2011
Enliang Zheng; Rahul Raguram; Pierre Fite-Georgel; Jan Michael Frahm
In this paper, we present an efficient technique for generating multi-perspective panoramic images of long scenes. The input to our system is a video sequence captured by a moving camera navigating through a long scene, and our goal is to efficiently generate a panoramic summary of the scene. This problem has received considerable attention in recent years, leading to the development of a number of systems capable of generating high-quality panoramas. However, a significant limitation of current systems is their computational complexity: most current techniques employ computationally expensive algorithms (such as structure-from-motion and dense stereo), or require some degree of manual interaction. In turn, this limits the scalability of the algorithms as well as their ease of implementation. In contrast, the technique we present is simple, efficient, easy to implement, and produces results of comparable quality to state-of-the-art techniques, while doing so at a fraction of the computational cost. Our system operates entirely in the 2D image domain, performing robust image alignment and optical-flow-based mosaicing, in lieu of more expensive 3D pose/structure computation. We demonstrate the effectiveness of our system on a number of challenging image sequences.
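A minimal sketch of the 2D pipeline idea: consecutive frames are registered with a cheap image-space method (phase correlation here, a stand-in for the paper's robust alignment and optical flow), and a central strip of each frame is pasted at its accumulated offset. The synthetic panning sequence and all values are illustrative:

import numpy as np

def phase_shift(a, b):
    """Dominant integer translation of b relative to a via phase correlation."""
    F = np.fft.fft2(a) * np.conj(np.fft.fft2(b))
    r = np.fft.ifft2(F / (np.abs(F) + 1e-8)).real
    dy, dx = np.unravel_index(np.argmax(r), r.shape)
    if dy > a.shape[0] // 2: dy -= a.shape[0]
    if dx > a.shape[1] // 2: dx -= a.shape[1]
    return dy, dx

# Synthetic "long scene": a wide texture sampled by a camera panning right.
rng = np.random.default_rng(4)
scene = rng.random((100, 1000))
frames = [scene[:, s:s + 120] for s in range(0, 800, 15)]

pano = np.zeros((100, 1000))
x = 0
pano[:, 50:70] = frames[0][:, 50:70]
for prev, cur in zip(frames, frames[1:]):
    _, dx = phase_shift(prev, cur)       # content shift between the frames
    x += dx
    pano[:, x + 50:x + 70] = cur[:, 50:70]   # paste the central strip
print("estimated total displacement:", x, "(true: 795)")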
european conference on computer vision | 2014
Enliang Zheng; Ke Wang; Enrique Dunn; Jan Michael Frahm
We introduce the problem of joint object class sequencing and trajectory triangulation (JOST), which is defined as the reconstruction of the motion path of a class of dynamic objects through a scene from an unordered set of images. We leverage standard object detection techniques to identify object instances within a set of registered images. Each of these object detections defines a single 2D point with a corresponding viewing ray. The set of viewing rays attained from the aggregation of all detections belonging to a common object class is then used to estimate a motion path denoted as the object class trajectory. Our method jointly determines the topology of the objects' motion path and reconstructs the 3D object points corresponding to our object detections. We pose the problem as an optimization over both the unknown 3D points and the topology of the path, which is approximated by a Generalized Minimum Spanning Tree (GMST) on a multipartite graph and then refined through a continuous optimization over the 3D object points. Experiments on synthetic and real datasets demonstrate the effectiveness of our method and the feasibility of solving a previously intractable problem.
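A toy sketch of the GMST step: each detection contributes a cluster of candidate 3D points, and the path topology is approximated by picking one candidate per cluster so that the spanning tree over the picks is as short as possible. This uses brute force over a tiny instance (the paper uses a proper GMST approximation on a multipartite graph); all data are synthetic:

import numpy as np
from itertools import product

def mst_weight(pts):
    """Prim's algorithm: total edge length of the MST over pts."""
    n, in_tree, cost = len(pts), {0}, 0.0
    dist = np.linalg.norm(pts - pts[0], axis=1)
    for _ in range(n - 1):
        j = min((i for i in range(n) if i not in in_tree), key=lambda i: dist[i])
        cost += dist[j]
        in_tree.add(j)
        dist = np.minimum(dist, np.linalg.norm(pts - pts[j], axis=1))
    return cost

rng = np.random.default_rng(5)
t = np.linspace(0, 5, 6)
path = np.stack([t, np.sin(t), np.zeros(6)], axis=1)   # true object path
# Each cluster: the true point on the path plus two far-off decoy candidates.
clusters = [np.vstack([p, p + rng.uniform(2, 3, 3), p - rng.uniform(2, 3, 3)])
            for p in path]

best = min(product(*[range(3)] * len(clusters)),
           key=lambda pick: mst_weight(
               np.array([c[i] for c, i in zip(clusters, pick)])))
print("chosen candidate per cluster:", best)   # all zeros = the true points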
british machine vision conference | 2012
Enliang Zheng; Enrique Dunn; Rahul Raguram; Jan Michael Frahm
The estimation of a complete 3D model from a set of depthmaps is a data-intensive task aimed at mitigating measurement noise in the input data by leveraging the inherent redundancy in overlapping multi-view observations. In this paper we propose an efficient depthmap fusion approach that reduces the memory complexity associated with volumetric scene representations. By virtue of reducing the memory footprint, we are able to process an increased reconstruction volume at greater spatial resolution. Our approach also improves upon state-of-the-art fusion techniques by approaching the problem in an incremental online setting instead of batch-mode processing. In this way, we are able to handle an arbitrary number of input images at high pixel resolution and facilitate a streaming 3D processing pipeline. Experiments demonstrate the effectiveness of our proposal both for 3D modeling from internet-scale crowd-sourced data and for close-range 3D modeling from high-resolution video streams.
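A toy sketch of the incremental idea: instead of holding every depth map in memory, each new observation updates a running per-voxel average, so the model refines online as depth maps stream in. A truncated-signed-distance running average along a single camera ray stands in for the paper's memory-efficient volumetric representation; all values are synthetic:

import numpy as np

voxels = np.linspace(0, 10, 101)     # sample positions along one ray
tsdf = np.zeros_like(voxels)         # running fused signed distance
weight = np.zeros_like(voxels)
trunc = 0.5                          # truncation band around the surface

rng = np.random.default_rng(6)
true_depth = 6.3
for _ in range(50):                  # depth maps arrive one at a time
    d = true_depth + 0.2 * rng.standard_normal()   # noisy depth measurement
    sd = np.clip(d - voxels, -trunc, trunc)        # truncated signed distance
    mask = d - voxels > -trunc                     # skip voxels far behind
    tsdf[mask] = (tsdf[mask] * weight[mask] + sd[mask]) / (weight[mask] + 1)
    weight[mask] += 1                              # incremental running average

crossing = np.where(np.diff(np.sign(tsdf + 1e-12)))[0][0]
print("fused surface depth ~", voxels[crossing])   # should approach 6.3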
Geo-spatial Information Science | 2013
Jan Michael Frahm; Jared Heinly; Enliang Zheng; Enrique Dunn; Pierre Fite-Georgel; Marc Pollefeys
In this article we present our system for scalable, robust, and fast city-scale reconstruction from Internet photo collections (IPC), obtaining geo-registered dense 3D models. The major achievement of our system is the efficient combination of coarse appearance descriptors with strong geometric constraints to reduce the computational complexity of the image overlap search. This unique combination of recognition and geometric constraints allows our method to reduce the complexity from quadratic in the number of images to almost linear in the IPC size. Accordingly, our 3D-modeling framework is inherently more scalable than other state-of-the-art methods and is currently the only method to support modeling from millions of images. In addition, we propose a novel mechanism to overcome the inherent scale ambiguity of the reconstructed models by exploiting geo-tags of the Internet photo collection images and readily available StreetView panoramas for fully automatic geo-registration of the 3D model. Moreover, our system also exploits image appearance clustering to tackle the challenge of computing dense 3D models from an image collection that has significant variation in illumination between images along with a wide variety of sensors and their associated different radiometric camera parameters. Our algorithm exploits the redundancy of the data to suppress estimation noise through a novel depth map fusion, which simultaneously enforces surface and free-space constraints across a large number of depth maps. Cost volume compression during the fusion achieves lower memory requirements for high-resolution models. We demonstrate our system on a variety of scenes from an Internet photo collection of Berlin containing almost three million images, from which we compute dense models in less than a day on a single computer.
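A toy sketch of the overlap-search idea: a coarse global descriptor (a short binary code here, a placeholder for the system's appearance descriptors) proposes only a few candidate partners per image, so expensive geometric verification runs on O(kN) pairs rather than all O(N^2). Sizes and noise levels are illustrative assumptions:

import numpy as np

rng = np.random.default_rng(7)
N, bits, k = 1000, 64, 4
scene_id = rng.integers(0, 50, N)              # images grouped by scene
proto = rng.integers(0, 2, (50, bits))
codes = proto[scene_id] ^ (rng.random((N, bits)) < 0.05)  # noisy per-image code

verified = 0
for i in range(N):
    ham = (codes ^ codes[i]).sum(axis=1)       # Hamming distance to all codes
    ham[i] = bits + 1                          # exclude the image itself
    cand = np.argpartition(ham, k)[:k]         # k coarse candidates, not N-1
    verified += int((scene_id[cand] == scene_id[i]).sum())

print(f"geometric checks: {N * k} instead of {N * (N - 1) // 2}")
print("fraction of candidates sharing a scene:", verified / (N * k))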
international conference on acoustics, speech, and signal processing | 2009
Enliang Zheng; Qiang Chen; Xiaochao Yang; Yuncai Liu
We consider the problem of 3D modeling in environments where the colors of the foreground objects are similar to the background, which makes foreground/background classification difficult. A purely image-based algorithm is adopted in this paper, with no prior information about the foreground objects. We classify foreground and background by fusing information at the pixel and region levels to obtain a similarity probability map, followed by a Bayesian sensor fusion framework to infer the space occupancy grid. The occupancy estimate allows incremental updating once a new observation is available, and the contribution of each observation can be adjusted according to its reliability. Finally, three parameters of the algorithm are analyzed in detail, and experiments show the effectiveness of the method.
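A toy sketch of the incremental Bayesian update described above: each observation contributes evidence for a voxel being occupied, weighted by a per-observation reliability, and working in log-odds makes the update a simple sum. The probabilities and weights below are synthetic placeholders, not the paper's model:

import numpy as np

def logit(p):
    return np.log(p / (1 - p))

n_voxels = 5
log_odds = np.zeros(n_voxels)                  # prior P(occupied) = 0.5

rng = np.random.default_rng(8)
occupied = np.array([1, 1, 0, 0, 1])           # ground truth for the toy
for _ in range(20):                            # observations arrive one by one
    reliability = rng.uniform(0.3, 1.0)        # down-weight unreliable views
    p_fg = np.where(occupied, 0.7, 0.35)       # similarity-probability values
    p_obs = np.clip(p_fg + 0.1 * rng.standard_normal(n_voxels), 0.05, 0.95)
    log_odds += reliability * logit(p_obs)     # weighted incremental fusion

print("P(occupied):", 1 / (1 + np.exp(-log_odds)))   # should approach 1,1,0,0,1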
international conference on computer vision | 2015
Enliang Zheng; Ke Wang; Enrique Dunn; Jan Michael Frahm
We propose two novel minimal solvers which advance the state of the art in satellite imagery processing. Our methods are efficient and do not rely on the prior existence of complex inverse mapping functions to correlate 2D image coordinates and 3D terrain. Our first solver improves on the stereo correspondence problem for satellite imagery, in that we provide an exact image-to-object space mapping (where prior methods were inaccurate). Our second solver provides a novel mechanism for 3D point triangulation, which has improved robustness and accuracy over prior techniques. Given the usefulness and ubiquity of satellite imagery, our proposed methods allow for improved results in a variety of existing and future applications.
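A minimal sketch of multi-ray point triangulation (the classic linear midpoint method, not the paper's solver): find the 3D point minimizing the summed squared distance to each viewing ray. The rays here are synthetic stand-ins for rays cast from satellite camera models:

import numpy as np

def triangulate(origins, dirs):
    """Least-squares 3D point closest to all rays (origin + t * dir)."""
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for o, d in zip(origins, dirs):
        d = d / np.linalg.norm(d)
        P = np.eye(3) - np.outer(d, d)   # projects onto the ray's normal plane
        A += P
        b += P @ o
    return np.linalg.solve(A, b)

rng = np.random.default_rng(9)
pt = np.array([10.0, -3.0, 500.0])                       # ground point
origins = rng.uniform(-1e3, 1e3, (4, 3)) + [0, 0, 7e5]   # satellite positions
dirs = pt - origins + rng.standard_normal((4, 3))        # noisy viewing rays
print("triangulated point:", triangulate(origins, dirs)) # should approach pt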