
Publication


Featured research published by Henry A. Rowley.


computer vision and pattern recognition | 1996

Neural network-based face detection

Henry A. Rowley; Shumeet Baluja; Takeo Kanade

We present a neural network-based face detection system. A retinally connected neural network examines small windows of an image and decides whether each window contains a face. The system arbitrates between multiple networks to improve performance over a single network. We use a bootstrap algorithm for training the networks, which adds false detections into the training set as training progresses. This eliminates the difficult task of manually selecting non-face training examples, which must be chosen to span the entire space of non-face images. Comparisons with other state-of-the-art face detection systems are presented; our system has better performance in terms of detection and false-positive rates.
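The bootstrap loop described above can be sketched in a few lines. This is an illustrative toy, not the paper's system: the "detector" is a single brightness threshold standing in for the neural network, and the windows are random vectors, but the loop of retraining and folding false detections back into the negative set is the same idea.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (illustrative only): each "window" is 16 pixels whose
# overall brightness loosely separates face from non-face, with overlap.
face_windows = rng.uniform(0.7, 1.0, size=(50, 1)) + rng.normal(0, 0.02, (50, 16))
nonface_pool = rng.uniform(0.0, 0.9, size=(500, 1)) + rng.normal(0, 0.02, (500, 16))

def train(pos, neg):
    # Trivial threshold "detector" standing in for the neural network:
    # classify by mean brightness, threshold midway between class means.
    return (pos.mean() + neg.mean()) / 2.0

def detects_face(model, window):
    return window.mean() > model

# Start from a few easy (dark) negatives; each bootstrap round then adds
# the detector's false detections on known non-face images back into the
# negative training set, so hard negatives need not be picked by hand.
order = np.argsort(nonface_pool.mean(axis=1))
negatives = nonface_pool[order[:10]]
history = []
for _ in range(4):
    model = train(face_windows, negatives)
    false_pos = nonface_pool[[detects_face(model, w) for w in nonface_pool]]
    history.append(len(false_pos))
    if len(false_pos):
        negatives = np.vstack([negatives, false_pos])

print(history)  # false detections shrink as hard negatives accumulate
```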


computer vision and pattern recognition | 1998

Rotation invariant neural network-based face detection

Henry A. Rowley; Shumeet Baluja; Takeo Kanade

The system (see Figure 2) uses a neural network, called a “router”, to analyze each window of the input before it is processed by a “detector” network. If the window contains a face, the router returns the angle of the face. The window can then be “derotated” to make the face upright. The derotated window is then passed to the detection network, which decides whether a face is present. If a non-face image is encountered, the router will return a meaningless rotation. Since a rotation of a non-face image will yield another non-face image, the detector network will still not detect a face. A rotated face, which would not have been detected by an upright face detector, will be rotated to an upright position and subsequently detected as a face. Because the detector network is only applied once at each image location, this approach is significantly faster than exhaustively trying all orientations, and will yield fewer false detections [1,3]. To speed up the above algorithm for demonstration purposes, we used several techniques. First, we use a change detection algorithm to restrict the search area. Second, we use a model of skin color (acquired online as faces are detected) to further restrict the search. Finally, we use a candidate detection network to quickly rule out some portions of the input image before examining them more carefully (and slowly) with the detection network. With these techniques, it takes about 6 seconds to process a 160×120 pixel image on an SGI O2 workstation with a 174 MHz R10000 processor.
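The router/derotate/detect pipeline can be sketched as follows. This is a heavily simplified toy: rotations are restricted to quarter turns (the paper's router estimates a continuous angle), and both "networks" are replaced by template correlation.

```python
import numpy as np

# Toy upright "face" template: two bright "eyes" near the top (illustrative)
UPRIGHT = np.zeros((4, 4))
UPRIGHT[1, 1] = UPRIGHT[1, 2] = 1.0

def router(window):
    # Estimate the rotation (in quarter turns, a simplification of the
    # paper's continuous angle) that makes the window upright.
    scores = [np.sum(np.rot90(window, -k) * UPRIGHT) for k in range(4)]
    return int(np.argmax(scores))

def detector(window):
    # Upright-only "detector": template correlation above a threshold
    return np.sum(window * UPRIGHT) > 1.5

def detect_any_rotation(window):
    k = router(window)                # one router pass
    derotated = np.rot90(window, -k)  # derotate the window
    return detector(derotated)        # one detector pass, not four

upside_down = np.rot90(UPRIGHT, 2)
print(detector(upside_down), detect_any_rotation(upside_down))
```

Note the property from the abstract: a non-face window gets a meaningless angle from the router, but a rotated non-face is still a non-face, so the detector still rejects it.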


computer vision and pattern recognition | 2008

Face tracking and recognition with visual constraints in real-world videos

Minyoung Kim; Sanjiv Kumar; Vladimir Pavlovic; Henry A. Rowley

We address the problem of tracking and recognizing faces in real-world, noisy videos. We track faces using a tracker that adaptively builds a target model reflecting changes in appearance, typical of a video setting. However, adaptive appearance trackers often suffer from drift, a gradual adaptation of the tracker to non-targets. To alleviate this problem, our tracker introduces visual constraints using a combination of generative and discriminative models in a particle filtering framework. The generative term conforms the particles to the space of generic face poses while the discriminative one ensures rejection of poorly aligned targets. This leads to a tracker that significantly improves robustness against abrupt appearance changes and occlusions, critical for the subsequent recognition phase. Identity of the tracked subject is established by fusing pose-discriminant and person-discriminant features over the duration of a video sequence. This leads to a robust video-based face recognizer with state-of-the-art recognition performance. We test the quality of tracking and face recognition on real-world noisy videos from YouTube as well as the standard Honda/UCSD database. Our approach produces successful face tracking results on over 80% of all videos without video or person-specific parameter tuning. The good tracking performance induces similarly high recognition rates: 100% on Honda/UCSD and over 70% on the YouTube set containing 35 celebrities in 1500 sequences.
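A minimal sketch of the particle-weighting step, where each candidate is scored by both a generative and a discriminative term. The 1-D state and the two scoring functions below are invented stand-ins, not the paper's appearance models:

```python
import numpy as np

rng = np.random.default_rng(1)

true_pos = 5.0  # hypothetical 1-D face location (the real tracker is richer)

def generative_score(x):
    # likelihood under a generic-face appearance model
    return np.exp(-0.5 * (x - true_pos) ** 2)

def discriminative_score(x):
    # classifier-style term that rejects poorly aligned candidates
    return 1.0 if abs(x - true_pos) < 1.0 else 0.1

# One particle-filter weighting step: propose candidates around a stale
# estimate (simulating drift), then weight each particle by BOTH terms.
particles = rng.normal(0.0, 3.0, size=200)
weights = np.array([generative_score(p) * discriminative_score(p) for p in particles])
weights /= weights.sum()
estimate = float(np.sum(weights * particles))
```

Even though the particles were proposed around a drifted estimate (0.0), the combined weighting pulls the state estimate back toward the true location.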


innovative applications of artificial intelligence | 2005

Boosting sex identification performance

Shumeet Baluja; Henry A. Rowley

This paper presents a method based on AdaBoost to identify the sex of a person from a low-resolution grayscale picture of their face. The method described here is implemented in a system that will process well over 10^9 images. The goal of this work is to create an efficient system that is both simple to implement and maintain; the methods described here are extremely fast and have straightforward implementations. We achieve 80% accuracy in sex identification with fewer than 10 pixel comparisons and 90% accuracy with fewer than 50 pixel comparisons. The best classifiers published to date use Support Vector Machines; we match their accuracies with as few as 500 comparison operations on a 20×20 pixel image. The AdaBoost-based classifiers presented here achieve over 93% accuracy; these match or surpass the accuracies of the SVM-based classifiers, and yield performance that is 50 times faster.
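Pixel-comparison weak learners can be boosted as sketched below. This is a toy reconstruction (synthetic 20×20 "faces", vanilla AdaBoost over random pixel pairs), not the paper's trained classifier, but it shows why each weak test costs only a single pixel comparison:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic 20x20 images (illustrative): "faces" have a brighter upper half
def make(n, face):
    imgs = rng.uniform(0, 1, (n, 20, 20))
    if face:
        imgs[:, :10, :] += 0.5
    return imgs.reshape(n, -1)

X = np.vstack([make(100, True), make(100, False)])
y = np.hstack([np.ones(100), -np.ones(100)])

def weak_predict(X, i, j):
    # Weak learner: is pixel i brighter than pixel j?
    return np.where(X[:, i] > X[:, j], 1.0, -1.0)

# Vanilla AdaBoost over a random pool of pixel-pair comparisons
pairs = [(int(rng.integers(400)), int(rng.integers(400))) for _ in range(300)]
w = np.full(len(y), 1.0 / len(y))
ensemble = []
for _ in range(10):
    errs = [np.sum(w * (weak_predict(X, i, j) != y)) for i, j in pairs]
    i, j = pairs[int(np.argmin(errs))]        # best pair under current weights
    pred = weak_predict(X, i, j)
    err = max(np.sum(w * (pred != y)), 1e-9)
    alpha = 0.5 * np.log((1 - err) / err)     # standard AdaBoost weight
    ensemble.append((alpha, i, j))
    w *= np.exp(-alpha * y * pred)            # re-weight toward hard examples
    w /= w.sum()

score = sum(a * weak_predict(X, i, j) for a, i, j in ensemble)
accuracy = float(np.mean(np.sign(score) == y))
```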


computer vision and pattern recognition | 2008

Large-scale manifold learning

Ameet Talwalkar; Sanjiv Kumar; Henry A. Rowley

This paper examines the problem of extracting low-dimensional manifold structure given millions of high-dimensional face images. Specifically, we address the computational challenges of nonlinear dimensionality reduction via Isomap and Laplacian Eigenmaps, using a graph containing about 18 million nodes and 65 million edges. Since most manifold learning techniques rely on spectral decomposition, we first analyze two approximate spectral decomposition techniques for large dense matrices (Nyström and column sampling), providing the first direct theoretical and empirical comparison between these techniques. We next show extensive experiments on learning low-dimensional embeddings for two large face datasets: CMU-PIE (35 thousand faces) and a web dataset (18 million faces). Our comparisons show that the Nyström approximation is superior to the column-sampling method. Furthermore, approximate Isomap tends to perform better than Laplacian Eigenmaps on both clustering and classification with the labeled CMU-PIE dataset.


workshop on applications of computer vision | 2007

Clustering Billions of Images with Large Scale Nearest Neighbor Search

Ting Liu; Charles Rosenberg; Henry A. Rowley

The proliferation of the Web and digital photography has made large-scale image collections containing billions of images a reality. Image collections on this scale make performing even the most common and simple computer vision, image processing, and machine learning tasks nontrivial. An example is nearest neighbor search, which not only serves as a fundamental subproblem in many more sophisticated algorithms, but also has direct applications, such as image retrieval and image clustering. In this paper, we address the nearest neighbor problem as the first step towards scalable image processing. We describe a scalable version of an approximate nearest neighbor search algorithm and discuss how it can be used to find near-duplicates among over a billion images.
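As a rough illustration of why approximate search makes near-duplicate detection tractable, the sketch below buckets vectors by the sign pattern of a few random projections and compares candidates only within a bucket, instead of against all n vectors. This is a generic LSH-style stand-in, not the specific algorithm the paper scales up:

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy "image features": 100 base vectors plus 20 near-duplicates (tiny noise)
base = rng.normal(size=(100, 64))
dups = base[:20] + rng.normal(scale=1e-3, size=(20, 64))
X = np.vstack([base, dups])

# Hash each vector by 12 random-projection sign bits; near-duplicates
# almost always share all bits and so land in the same bucket.
proj = rng.normal(size=(64, 12))
keys = (X @ proj > 0).astype(int) @ (1 << np.arange(12))

buckets = {}
for i, k in enumerate(keys):
    buckets.setdefault(int(k), []).append(i)

def near_duplicates(threshold=0.1):
    found = set()
    for members in buckets.values():          # compare only within buckets
        for a in range(len(members)):
            for b in range(a + 1, len(members)):
                i, j = members[a], members[b]
                if np.linalg.norm(X[i] - X[j]) < threshold:
                    found.add((min(i, j), max(i, j)))
    return found

pairs_found = near_duplicates()
```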


computer vision and pattern recognition | 2013

Learning Binary Codes for High-Dimensional Data Using Bilinear Projections

Yunchao Gong; Sanjiv Kumar; Henry A. Rowley; Svetlana Lazebnik

Recent advances in visual recognition indicate that to achieve good retrieval and classification accuracy on large-scale datasets like ImageNet, extremely high-dimensional visual descriptors, e.g., Fisher Vectors, are needed. We present a novel method for converting such descriptors to compact similarity-preserving binary codes that exploits their natural matrix structure to reduce their dimensionality using compact bilinear projections instead of a single large projection matrix. This method achieves comparable retrieval and classification accuracy to the original descriptors and to the state-of-the-art Product Quantization approach while having orders of magnitude faster code generation time and smaller memory footprint.
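The core trick, two small projections applied to the matrix form of the descriptor instead of one enormous projection matrix, can be shown directly. The dimensions below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)

# A descriptor with natural matrix structure, e.g. 64 x 128 = 8192-D
X = rng.normal(size=(64, 128))

# Two small random rotations instead of one 8192 x 8192 projection
R1 = np.linalg.qr(rng.normal(size=(64, 64)))[0]
R2 = np.linalg.qr(rng.normal(size=(128, 128)))[0]

# Binary code: sign of the bilinearly projected descriptor
code = np.sign(R1.T @ X @ R2).reshape(-1)

# Parameter counts: the bilinear form is orders of magnitude smaller
params_full = (64 * 128) ** 2          # one big projection matrix
params_bilinear = 64 ** 2 + 128 ** 2   # two small ones
```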


computer vision and pattern recognition | 2011

Image saliency: From intrinsic to extrinsic context

Meng Wang; Janusz Konrad; Prakash Ishwar; Kevin Jing; Henry A. Rowley

We propose a novel framework for automatic saliency estimation in natural images. We consider saliency to be an anomaly with respect to a given context that can be global or local. In the case of global context, we estimate saliency in the whole image relative to a large dictionary of images. Unlike in some prior methods, this dictionary is not annotated, i.e., saliency is assumed unknown. In the case of local context, we partition the image into patches and estimate saliency in each patch relative to a large dictionary of un-annotated patches from the rest of the image. We propose a unified framework that applies to both cases in three steps. First, given an input (image or patch) we extract k nearest neighbors from the dictionary. Then, we geometrically warp each neighbor to match the input. Finally, we derive the saliency map from the mean absolute error between the input and all its warped neighbors. This algorithm is not only easy to implement but also outperforms state-of-the-art methods.
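The three steps (k nearest neighbors, warping, mean absolute error) reduce to a few lines if the geometric warping step is omitted, as in this toy sketch with random "patches":

```python
import numpy as np

rng = np.random.default_rng(6)

# Un-annotated dictionary of flattened 4x4 "patches" (saliency unknown)
dictionary = rng.uniform(0, 1, size=(200, 16))
typical = dictionary[0] + rng.normal(scale=0.01, size=16)  # resembles the dictionary
salient = np.full(16, 5.0)                                 # an anomaly in this context

def saliency(patch, k=5):
    # Step 1: extract k nearest neighbors from the dictionary
    d = np.linalg.norm(dictionary - patch, axis=1)
    nn = dictionary[np.argsort(d)[:k]]
    # Step 2 (geometrically warping each neighbor) is omitted in this toy.
    # Step 3: mean absolute error between the input and its neighbors
    return float(np.mean(np.abs(nn - patch)))
```

A patch that resembles its context gets a low score; an anomalous patch, poorly explained by even its nearest neighbors, gets a high one.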


conference on image and video retrieval | 2007

Canonical image selection from the web

Yushi Jing; Shumeet Baluja; Henry A. Rowley

The vast majority of the features used in today's commercially deployed image search systems employ techniques that are largely indistinguishable from text-document search: the images returned in response to a query are based on the text of the web pages from which they are linked. Unfortunately, depending on the query type, the quality of this approach can be inconsistent. Several recent studies have demonstrated the effectiveness of using image features to refine search results. However, it is not clear whether (or how much) image-based approaches can generalize to larger samples of web queries. Also, the previously used global features often capture only a small part of the image information, which in many cases does not correspond to the distinctive characteristics of the category. This paper explores the use of local features in the concrete task of finding single canonical images for a collection of commonly searched-for products. Through large-scale user testing, the canonical images found using only local image features significantly outperformed the top results from Yahoo, Microsoft, and Google, highlighting the importance of having these image features as an integral part of future image search engines.


international conference on computer vision | 2011

Large-scale image annotation using visual synset

David Tsai; Yushi Jing; Yi Liu; Henry A. Rowley; Sergey Ioffe; James M. Rehg

We address the problem of large-scale annotation of web images. Our approach is based on the concept of a visual synset, which is an organization of images that are visually similar and semantically related. Each visual synset represents a single prototypical visual concept and has an associated set of weighted annotations. Linear SVMs are utilized to predict the visual synset membership for unseen image examples, and a weighted voting rule is used to construct a ranked list of predicted annotations from a set of visual synsets. We demonstrate that visual synsets lead to better performance than standard methods on a new annotation database containing more than 200 million images and 300 thousand annotations, which is the largest ever reported.
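A toy sketch of membership prediction plus weighted voting. The synsets, labels, and the distance-based membership score (standing in for the paper's linear SVMs) are all invented for illustration:

```python
import numpy as np

# Two hypothetical visual synsets with weighted annotations (invented data)
synsets = {
    "golden-gate": {
        "center": np.array([5.0, 0.0]),
        "annotations": {"bridge": 0.9, "san francisco": 0.6},
    },
    "suspension-bridge": {
        "center": np.array([4.0, 1.0]),
        "annotations": {"bridge": 0.8, "cables": 0.5},
    },
}

def membership(x, center):
    # Distance-based membership score, a stand-in for a per-synset linear SVM
    return float(np.exp(-np.linalg.norm(x - center)))

def annotate(x):
    # Weighted voting: each synset votes for its annotations,
    # scaled by the image's predicted membership in that synset
    votes = {}
    for s in synsets.values():
        m = membership(x, s["center"])
        for label, weight in s["annotations"].items():
            votes[label] = votes.get(label, 0.0) + m * weight
    return sorted(votes, key=votes.get, reverse=True)

ranked = annotate(np.array([4.6, 0.4]))
```

An image feature near both synsets accumulates votes for "bridge" from each, so the shared label rises to the top of the ranked annotation list.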
