Menghua Zhai
University of Kentucky
Publications
Featured research published by Menghua Zhai.
computer vision and pattern recognition | 2016
Menghua Zhai; Scott Workman; Nathan Jacobs
We propose a novel method for detecting horizontal vanishing points and the zenith vanishing point in man-made environments. The dominant trend in existing methods is to first find candidate vanishing points, then remove outliers by enforcing mutual orthogonality. Our method reverses this process: we propose a set of horizon line candidates and score each based on the vanishing points it contains. A key element of our approach is the use of global image context, extracted with a deep convolutional network, to constrain the set of candidates under consideration. Our method does not make a Manhattan-world assumption and can operate effectively on scenes with only a single horizontal vanishing point. We evaluate our approach on three benchmark datasets and achieve state-of-the-art performance on each. In addition, our approach is significantly faster than the previous best method.
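The candidate-scoring idea is easy to prototype. The sketch below is a toy reconstruction, not the authors' code: a Gaussian prior over (offset, angle) stands in for the CNN-derived global context, each segment's line is intersected with a horizon candidate to propose a vanishing point, and the candidate whose best vanishing point gathers the most support from the other segments wins. All function names are hypothetical.
```python
import numpy as np

def horizon_from_offset_angle(offset, angle):
    """A horizon candidate as a homogeneous line l = (a, b, c)."""
    return np.array([np.sin(angle), np.cos(angle), -offset])

def score_candidate(horizon, segments):
    """Toy score: intersect each segment's line with the horizon to propose a
    vanishing point, then count how strongly the *other* segments agree with
    it (a stand-in for the paper's scoring function)."""
    lines = [np.cross(np.append(p0, 1.0), np.append(p1, 1.0))
             for p0, p1 in segments]
    mids = [(p0 + p1) / 2 for p0, p1 in segments]
    dirs = [(p1 - p0) / np.linalg.norm(p1 - p0) for p0, p1 in segments]
    best = 0.0
    for i, l in enumerate(lines):
        vp = np.cross(l, horizon)                 # candidate vanishing point
        if abs(vp[2]) < 1e-9:
            continue                              # segment parallel to horizon
        vp = vp[:2] / vp[2]
        support = sum(abs(d @ (vp - m) / (np.linalg.norm(vp - m) + 1e-9))
                      for j, (m, d) in enumerate(zip(mids, dirs)) if j != i)
        best = max(best, support)
    return best

def detect_horizon(segments, n_candidates=500, rng=np.random.default_rng(0)):
    # In the paper the candidate distribution comes from a deep network;
    # a Gaussian prior over (offset, angle) stands in for it here.
    cands = [horizon_from_offset_angle(o, a)
             for o, a in zip(rng.normal(0, 50, n_candidates),
                             rng.normal(0, 0.1, n_candidates))]
    return max(cands, key=lambda h: score_candidate(h, segments))

segs = [(np.array([10., 100.]), np.array([200., 90.])),
        (np.array([50., 150.]), np.array([220., 120.]))]
print(detect_horizon(segs))
```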
british machine vision conference | 2016
Scott Workman; Menghua Zhai; Nathan Jacobs
The horizon line is an important contextual attribute for a wide variety of image understanding tasks. As such, many methods have been proposed to estimate its location from a single image. These methods typically require the image to contain specific cues, such as vanishing points, coplanar circles, and regular textures, thus limiting their real-world applicability. We introduce a large, realistic evaluation dataset, Horizon Lines in the Wild (HLW), containing natural images with labeled horizon lines. Using this dataset, we investigate the application of convolutional neural networks for directly estimating the horizon line, without requiring any explicit geometric constraints or other special cues. An extensive evaluation shows that using our CNNs, either in isolation or in conjunction with a previous geometric approach, we achieve state-of-the-art results on the challenging HLW dataset and two existing benchmark datasets.
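A minimal PyTorch sketch of the direct-estimation idea, assuming a (slope, offset) output parameterization and an off-the-shelf ResNet backbone; the paper's actual architecture and parameterization differ.
```python
import torch
import torch.nn as nn
import torchvision.models as models

class HorizonRegressor(nn.Module):
    """Backbone plus a small head that regresses a (slope, offset) horizon
    parameterization from a single image."""
    def __init__(self):
        super().__init__()
        backbone = models.resnet18(weights=None)   # any ImageNet-style CNN works here
        backbone.fc = nn.Identity()                # expose the 512-d pooled features
        self.backbone = backbone
        self.head = nn.Linear(512, 2)              # -> (slope, offset)

    def forward(self, x):
        return self.head(self.backbone(x))

model = HorizonRegressor()
imgs = torch.randn(4, 3, 224, 224)                 # a dummy image batch
targets = torch.randn(4, 2)                        # labeled horizons, e.g. from HLW
loss = nn.functional.huber_loss(model(imgs), targets)  # robust regression loss
loss.backward()
```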
international conference on image processing | 2015
Scott Workman; Connor Greenwell; Menghua Zhai; Ryan Baltenberger; Nathan Jacobs
Estimating the focal length of an image is an important preprocessing step for many applications. Despite this, existing methods for single-view focal length estimation are limited in that they require particular geometric calibration objects, such as orthogonal vanishing points, co-planar circles, or a calibration grid, to occur in the field of view. In this work, we explore the application of a deep convolutional neural network, trained on natural images obtained from Internet photo collections, to directly estimate the focal length using only raw pixel intensities as input features. We present quantitative results that demonstrate the ability of our technique to estimate the focal length with comparisons against several baseline methods, including an automatic method which uses orthogonal vanishing points.
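One plausible setup, assumed here for illustration rather than taken from the paper: discretize the field of view into bins, classify, and convert the predicted field of view into a focal length via f = (w / 2) / tan(fov / 2). The backbone and bin range are placeholders.
```python
import torch
import torch.nn as nn
import torchvision.models as models

FOV_BINS = torch.linspace(20.0, 120.0, 61)         # candidate fields of view (degrees)

cnn = models.resnet18(weights=None)                # placeholder backbone
cnn.fc = nn.Linear(cnn.fc.in_features, len(FOV_BINS))

def focal_from_fov(fov_deg, image_width):
    """f = (w / 2) / tan(fov / 2): focal length in pixels from the FoV."""
    return (image_width / 2.0) / torch.tan(torch.deg2rad(fov_deg) / 2.0)

logits = cnn(torch.randn(1, 3, 224, 224))          # raw pixels in, bin scores out
fov = FOV_BINS[logits.argmax(dim=1)]               # most likely field of view
print(focal_from_fov(fov, image_width=1024))
```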
computer vision and pattern recognition | 2017
Menghua Zhai; Zachary Bessinger; Scott Workman; Nathan Jacobs
We introduce a novel strategy for learning to extract semantically meaningful features from aerial imagery. Instead of manually labeling the aerial imagery, we propose to predict (noisy) semantic features automatically extracted from co-located ground imagery. Our network architecture takes an aerial image as input, extracts features using a convolutional neural network, and then applies an adaptive transformation to map these features into the ground-level perspective. We use an end-to-end learning approach to minimize the difference between the semantic segmentation extracted directly from the ground image and the semantic segmentation predicted solely based on the aerial image. We show that a model learned using this strategy, with no additional training, is already capable of rough semantic labeling of aerial imagery. Furthermore, we demonstrate that by finetuning this model we can achieve more accurate semantic segmentation than two baseline initialization strategies. We use our network to address the task of estimating the geolocation and geo-orientation of a ground image. Finally, we show how features extracted from an aerial image can be used to hallucinate a plausible ground-level panorama.
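The training signal can be sketched as follows. A fixed sampling grid stands in for the paper's adaptive, input-conditioned transformation, a tiny CNN replaces the real feature extractor, and the "noisy labels" would in practice come from an off-the-shelf ground-level segmentation network; all of these stand-ins are assumptions.
```python
import torch
import torch.nn as nn

class AerialToGround(nn.Module):
    """Extract features from an aerial image, then map them into the
    ground-level viewpoint via a sampling grid."""
    def __init__(self, n_classes=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, n_classes, 3, padding=1),
        )

    def forward(self, aerial, grid):
        f = self.features(aerial)                  # per-pixel class scores, aerial view
        # grid: (B, H, W, 2) in [-1, 1], mapping ground pixels to aerial locations
        return nn.functional.grid_sample(f, grid, align_corners=False)

model = AerialToGround()
aerial = torch.randn(2, 3, 256, 256)
grid = torch.rand(2, 128, 512, 2) * 2 - 1          # hypothetical cross-view mapping
pred = model(aerial, grid)                         # predicted ground-view scores
noisy = torch.randint(0, 4, (2, 128, 512))         # labels from a ground segmentation net
loss = nn.functional.cross_entropy(pred, noisy)    # end-to-end cross-view loss
loss.backward()
```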
workshop on applications of computer vision | 2016
Tawfiq Salem; Scott Workman; Menghua Zhai; Nathan Jacobs
Given an image, we propose to use the appearance of people in the scene to estimate when the picture was taken. There are a wide variety of cues that can be used to address this problem. Most previous work has focused on low-level image features, such as color and vignetting. Recent work on image dating has used more semantic cues, such as the appearance of automobiles and buildings. We extend this line of research by focusing on human appearance. Our approach, based on a deep convolutional neural network, allows us to more deeply explore the relationship between human appearance and time. We find that clothing, hair styles, and glasses can all be informative features. To support our analysis, we have collected a new dataset containing images of people from many high school yearbooks, covering the years 1912-2014. While not a complete solution to the problem of image dating, our results show that human appearance is strongly related to time and that semantic information can be a useful cue.
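One natural formulation, assumed here rather than confirmed from the paper, treats the capture year as a class label; taking the expectation over the predicted year distribution then yields a soft estimate of when the picture was taken.
```python
import torch
import torch.nn as nn
import torchvision.models as models

YEARS = list(range(1912, 2015))                    # label space of the yearbook data

net = models.resnet18(weights=None)                # placeholder backbone
net.fc = nn.Linear(net.fc.in_features, len(YEARS))

crop = torch.randn(1, 3, 224, 224)                 # a cropped person image
probs = net(crop).softmax(dim=1)
soft_year = (probs * torch.tensor(YEARS, dtype=torch.float)).sum()
print(float(soft_year))                            # expected capture year
```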
workshop on applications of computer vision | 2016
Ryan Baltenberger; Menghua Zhai; Connor Greenwell; Scott Workman; Nathan Jacobs
We propose the use of deep convolutional neural networks to estimate the transient attributes of a scene from a single image. Transient scene attributes describe both the objective conditions, such as the weather, time of day, and the season, and subjective properties of a scene, such as whether or not the scene seems busy. Recently, convolutional neural networks have been used to achieve state-of-the-art results for many vision problems, from object detection to scene classification, but have not previously been used for estimating transient attributes. We compare several methods for adapting an existing network architecture and present state-of-the-art results on two benchmark datasets. Our method is more accurate and significantly faster than previous methods, enabling real-world applications.
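Adapting an existing architecture can be as simple as replacing the final classification layer with one sigmoid output per attribute, since transient attributes are graded rather than mutually exclusive. A minimal sketch, with a placeholder backbone and attribute list:
```python
import torch
import torch.nn as nn
import torchvision.models as models

ATTRIBUTES = ["sunny", "cloudy", "winter", "busy"] # a few example transient attributes

net = models.resnet18(weights=None)                # stand-in for the adapted network
net.fc = nn.Linear(net.fc.in_features, len(ATTRIBUTES))

imgs = torch.randn(8, 3, 224, 224)
targets = torch.rand(8, len(ATTRIBUTES))           # annotated strengths in [0, 1]
loss = nn.functional.binary_cross_entropy_with_logits(net(imgs), targets)
loss.backward()
```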
international conference on image processing | 2014
Feiyu Shi; Menghua Zhai; Drew Duncan; Nathan Jacobs
Principal component analysis (PCA) is a widely used technique for dimensionality reduction which assumes that the input data can be represented as a collection of fixed-length vectors. Many real-world datasets, such as those constructed from Internet photo collections, do not satisfy this assumption. A natural approach to addressing this problem is to first coerce all input data to a fixed size, and then use standard PCA techniques. This approach is problematic because it either introduces artifacts when we must upsample an image, or loses information when we must downsample an image. We propose MPCA, an approach for estimating the PCA decomposition from multi-sized input data which avoids this initial resizing step. We demonstrate the effectiveness of this approach on simulated and real-world datasets.
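To make the estimation problem concrete, here is an alternating least-squares sketch for recovering a shared basis from observations seen through known per-sample downsampling maps. This illustrates the problem MPCA solves but is not the paper's algorithm; the zero-mean model and average-pooling maps are simplifications.
```python
import numpy as np

def downsample_matrix(m, d):
    """Average-pooling map from a length-d signal to length m (d % m == 0)."""
    S = np.zeros((m, d))
    r = d // m
    for a in range(m):
        S[a, a * r:(a + 1) * r] = 1.0 / r
    return S

def multisize_pca(ys, Ss, d, k, n_iters=50, rng=np.random.default_rng(0)):
    """Alternating least squares for y_i ~ S_i @ W @ z_i with a shared basis W."""
    W = rng.normal(size=(d, k))
    for _ in range(n_iters):
        # Per-sample coefficients given the current basis.
        zs = [np.linalg.lstsq(S @ W, y, rcond=None)[0] for y, S in zip(ys, Ss)]
        # Refit the basis jointly from all samples (linear in vec(W)).
        A = np.vstack([np.kron(S, z[None, :]) for S, z in zip(Ss, zs)])
        W = np.linalg.lstsq(A, np.concatenate(ys), rcond=None)[0].reshape(d, k)
    return np.linalg.qr(W)[0]                      # orthonormalized basis estimate

# Toy data: length-16 signals observed at mixed resolutions (16, 8, or 4).
rng = np.random.default_rng(1)
d, k = 16, 3
W_true = rng.normal(size=(d, k))
sizes = rng.choice([16, 8, 4], size=200)
Ss = [downsample_matrix(m, d) for m in sizes]
ys = [S @ (W_true @ rng.normal(size=k)) for S in Ss]
W_est = multisize_pca(ys, Ss, d, k)
```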
international conference on pattern recognition | 2014
Menghua Zhai; Feiyu Shi; Drew Duncan; Nathan Jacobs
Principal component analysis (PCA) is used in diverse settings for dimensionality reduction. If data elements are all the same size, there are many approaches to estimating the PCA decomposition of the dataset. However, many datasets contain elements of different sizes that must be coerced into a fixed size before analysis. This coercion introduces errors into the resulting PCA decomposition. We introduce CO-MPCA, a nonlinear method of directly estimating the PCA decomposition from datasets with elements of different sizes. We compare our method with two baseline approaches on three datasets: a synthetic vector dataset, a synthetic image dataset, and a real dataset of color histograms extracted from surveillance video. We provide quantitative and qualitative evidence that using CO-MPCA gives a more accurate estimate of the PCA basis.
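For contrast, the fixed-size baseline that this line of work compares against fits in a few lines: interpolate every element to a common length, run ordinary PCA, and measure how far the recovered subspace drifts from the truth. The toy data and error metric below are purely illustrative.
```python
import numpy as np

rng = np.random.default_rng(2)
d, k = 16, 3
W_true = np.linalg.qr(rng.normal(size=(d, k)))[0]  # ground-truth basis
xs = [W_true @ rng.normal(size=k) for _ in range(300)]
ys = [x[::s] for x, s in zip(xs, rng.choice([1, 2, 4], size=300))]

def resize_then_pca(ys, d, k):
    """The baseline: interpolate everything to length d, then standard PCA."""
    X = np.stack([np.interp(np.linspace(0, 1, d),
                            np.linspace(0, 1, len(y)), y) for y in ys])
    X = X - X.mean(axis=0)
    return np.linalg.svd(X, full_matrices=False)[2][:k].T

def largest_principal_angle(A, B):
    """How far apart two k-dimensional subspaces are, in radians."""
    s = np.linalg.svd(np.linalg.qr(A)[0].T @ np.linalg.qr(B)[0], compute_uv=False)
    return np.arccos(np.clip(s.min(), -1.0, 1.0))

print(largest_principal_angle(resize_then_pca(ys, d, k), W_true))
```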
european conference on computer vision | 2018
Samuel Schulter; Menghua Zhai; Nathan Jacobs; Manmohan Chandraker
Given a single RGB image of a complex outdoor road scene in the perspective view, we address the novel problem of estimating an occlusion-reasoned semantic scene layout in the top-view. This challenging problem not only requires an accurate understanding of both the 3D geometry and the semantics of the visible scene, but also of occluded areas. We propose a convolutional neural network that learns to predict occluded portions of the scene layout by looking around foreground objects like cars or pedestrians. But instead of hallucinating RGB values, we show that directly predicting the semantics and depths in the occluded areas enables a better transformation into the top-view. We further show that this initial top-view representation can be significantly enhanced by learning priors and rules about typical road layouts from simulated or, if available, map data. Crucially, training our model does not require costly or subjective human annotations for occluded areas or the top-view, but rather uses readily available annotations for standard semantic segmentation. We extensively evaluate and analyze our approach on the KITTI and Cityscapes data sets.
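The geometric half of the pipeline, mapping per-pixel semantics and depth into a bird's-eye grid, can be sketched directly; the CNN that in-paints `depth` and `labels` behind foreground objects is the paper's contribution and is not reproduced here. The camera parameters and grid resolution below are arbitrary assumptions.
```python
import numpy as np

def to_top_view(depth, labels, f, cx, grid=(64, 64), cell=0.5):
    """Project per-pixel semantics into a bird's-eye grid using depth.
    Later writes overwrite earlier ones; a real implementation would z-buffer."""
    h, w = depth.shape
    us = np.meshgrid(np.arange(w), np.arange(h))[0]    # pixel column indices
    x = (us - cx) * depth / f                          # lateral offset (meters)
    z = depth                                          # forward distance (meters)
    gx = np.int32(x / cell) + grid[1] // 2             # grid column
    gz = np.int32(z / cell)                            # grid row
    top = np.full(grid, -1, dtype=np.int32)            # -1 marks unobserved cells
    ok = (gx >= 0) & (gx < grid[1]) & (gz >= 0) & (gz < grid[0])
    top[gz[ok], gx[ok]] = labels[ok]
    return top

depth = np.random.uniform(2.0, 30.0, (128, 256))       # stand-in for predicted depth
labels = np.random.randint(0, 5, (128, 256))           # stand-in for predicted semantics
bev = to_top_view(depth, labels, f=220.0, cx=128.0)
```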
international conference on image processing | 2016
Menghua Zhai; Scott Workman; Nathan Jacobs
We address the problem of single-image geo-calibration, in which an estimate of the geographic location, viewing direction and field of view is sought for the camera that captured an image. The dominant approach to this problem is to match features of the query image, using color and texture, against a reference database of nearby ground imagery. However, this fails when such imagery is not available. We propose to overcome this limitation by matching against a geographic database that contains the locations of known objects, such as houses, roads and bodies of water. Since we are unable to find one-to-one correspondences between image locations and objects in our database, we model the problem probabilistically based on the geometric configuration of multiple such weak correspondences. We propose a Markov Chain Monte Carlo (MCMC) sampling approach to approximate the underlying probability distribution over the full geo-calibration of the camera.
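A generic random-walk Metropolis sampler over the calibration parameters shows the shape of the approach. The paper's actual likelihood is built from the weak correspondences with map objects and is not reproduced here; a stand-in Gaussian `log_prob` and hypothetical coordinates are used instead.
```python
import numpy as np

def metropolis_calibrate(log_prob, init, n_steps=20000, step=0.05,
                         rng=np.random.default_rng(0)):
    """Random-walk Metropolis over theta = (lat, lon, heading, fov)."""
    theta = np.asarray(init, dtype=float)
    lp = log_prob(theta)
    samples = []
    for _ in range(n_steps):
        prop = theta + rng.normal(scale=step, size=theta.shape)
        lp_prop = log_prob(prop)
        if np.log(rng.uniform()) < lp_prop - lp:       # accept w.p. min(1, ratio)
            theta, lp = prop, lp_prop
        samples.append(theta.copy())
    return np.array(samples)

# Stand-in likelihood: a Gaussian bump around a hypothetical true calibration.
true_theta = np.array([38.03, -84.50, 1.2, 0.9])
log_prob = lambda t: -0.5 * np.sum(((t - true_theta) / 0.02) ** 2)
posterior = metropolis_calibrate(log_prob, init=true_theta + 0.1)
print(posterior.mean(axis=0))                          # posterior mean calibration
```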