Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Michael Maire is active.

Publication


Featured research published by Michael Maire.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2011

Contour Detection and Hierarchical Image Segmentation

Pablo Andrés Arbeláez; Michael Maire; Charless C. Fowlkes; Jitendra Malik

This paper investigates two fundamental problems in computer vision: contour detection and image segmentation. We present state-of-the-art algorithms for both of these tasks. Our contour detector combines multiple local cues into a globalization framework based on spectral clustering. Our segmentation algorithm consists of generic machinery for transforming the output of any contour detector into a hierarchical region tree. In this manner, we reduce the problem of image segmentation to that of contour detection. Extensive experimental evaluation demonstrates that both our contour detection and segmentation methods significantly outperform competing algorithms. The automatically generated hierarchical segmentations can be interactively refined by user-specified annotations. Computation at multiple image resolutions provides a means of coupling our system to recognition applications.
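The globalization idea can be illustrated with a small sketch: turn contour strength into pixel affinities via an intervening-contour cue, solve a normalized-cuts-style eigenproblem, and read a "spectral" contour signal off the eigenvector gradients. This is a minimal 1-D toy under assumed parameters, not the authors' released gPb implementation; all function and variable names are illustrative.

```python
import numpy as np

def intervening_contour_affinity(boundary, sigma=0.1, radius=3):
    # boundary[k] is the contour strength between pixels k and k+1;
    # affinity between pixels i < j decays with the strongest contour
    # crossed on the path between them (the intervening-contour cue).
    n = len(boundary) + 1
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(i, min(i + radius + 1, n)):
            barrier = boundary[i:j].max() if j > i else 0.0
            W[i, j] = W[j, i] = np.exp(-barrier / sigma)
    return W

boundary = np.zeros(99)
boundary[49] = 1.0                      # one strong contour between pixels 49 and 50
W = intervening_contour_affinity(boundary)
D = W.sum(axis=1)
# Normalized-cuts-style eigenproblem on the symmetrically normalized Laplacian.
L = np.diag(D) - W
L_sym = np.diag(1 / np.sqrt(D)) @ L @ np.diag(1 / np.sqrt(D))
vals, vecs = np.linalg.eigh(L_sym)
# Gradients of the low-order eigenvectors act as a global contour signal
# that reinforces boundaries consistent with the whole image.
spectral = np.abs(np.gradient(vecs[:, 1]))
print(spectral.argmax())                # peaks at the strong boundary (~49)
```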


European Conference on Computer Vision | 2014

Microsoft COCO: Common Objects in Context

Tsung-Yi Lin; Michael Maire; Serge J. Belongie; James Hays; Pietro Perona; Deva Ramanan; Piotr Dollár; C. Lawrence Zitnick

We present a new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding. This is achieved by gathering images of complex everyday scenes containing common objects in their natural context. Objects are labeled using per-instance segmentations to aid in precise object localization. Our dataset contains photos of 91 object types that would be easily recognizable by a 4-year-old. With a total of 2.5 million labeled instances in 328k images, the creation of our dataset drew upon extensive crowd worker involvement via novel user interfaces for category detection, instance spotting, and instance segmentation. We present a detailed statistical analysis of the dataset in comparison to PASCAL, ImageNet, and SUN. Finally, we provide baseline performance analysis for bounding box and segmentation detection results using a Deformable Parts Model.
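The per-instance annotations ship as JSON with a companion Python API, pycocotools; a short sketch of reading masks and boxes with it follows. The annotation-file path and category name are placeholders, and the val2017 split shown postdates the dataset version described in this paper.

```python
from pycocotools.coco import COCO

coco = COCO("annotations/instances_val2017.json")  # placeholder path

# Look up a category, then images and instance annotations for it.
cat_ids = coco.getCatIds(catNms=["dog"])
img_ids = coco.getImgIds(catIds=cat_ids)
ann_ids = coco.getAnnIds(imgIds=img_ids[:1], catIds=cat_ids)
for ann in coco.loadAnns(ann_ids):
    # Each annotation carries a polygon/RLE segmentation and a box,
    # supporting both bounding-box and segmentation-level evaluation.
    mask = coco.annToMask(ann)          # per-instance binary mask
    print(ann["bbox"], mask.sum())      # box [x, y, w, h] and mask area
```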


Computer Vision and Pattern Recognition | 2006

SVM-KNN: Discriminative Nearest Neighbor Classification for Visual Category Recognition

Hao Zhang; Alexander C. Berg; Michael Maire; Jitendra Malik

We consider visual category recognition in the framework of measuring similarities, or equivalently perceptual distances, to prototype examples of categories. This approach is quite flexible, and permits recognition based on color, texture, and particularly shape, in a homogeneous framework. While nearest neighbor classifiers are natural in this setting, they suffer from the problem of high variance (in bias-variance decomposition) in the case of limited sampling. Alternatively, one could use support vector machines, but they involve time-consuming optimization and computation of pairwise distances. We propose a hybrid of these two methods which deals naturally with the multiclass setting, has reasonable computational complexity both in training and at run time, and yields excellent results in practice. The basic idea is to find close neighbors to a query sample and train a local support vector machine that preserves the distance function on the collection of neighbors. Our method can be applied to large, multiclass data sets for which it outperforms nearest neighbor and support vector machines, and remains efficient when the problem becomes intractable for support vector machines. A wide variety of distance functions can be used and our experiments show state-of-the-art performance on a number of benchmark data sets for shape and texture classification (MNIST, USPS, CUReT) and object recognition (Caltech-101). On Caltech-101 we achieved a correct classification rate of 59.05%(±0.56%) at 15 training images per class, and 66.23%(±0.48%) at 30 training images.
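A minimal sketch of the hybrid follows, assuming scikit-learn and a Euclidean metric in place of the paper's perceptual distance functions: retrieve the K nearest training samples for each query, and train a local SVM on just those neighbors when they disagree.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import SVC

def svm_knn_predict(X_train, y_train, X_query, k=20):
    nn = NearestNeighbors(n_neighbors=k).fit(X_train)
    preds = []
    for x in X_query:
        _, idx = nn.kneighbors(x.reshape(1, -1))
        nbr_X, nbr_y = X_train[idx[0]], y_train[idx[0]]
        if len(np.unique(nbr_y)) == 1:
            # All neighbors agree: no SVM needed (the common fast path).
            preds.append(nbr_y[0])
        else:
            # Local SVM trained only on the K neighbors of this query.
            local = SVC(kernel="linear").fit(nbr_X, nbr_y)
            preds.append(local.predict(x.reshape(1, -1))[0])
    return np.array(preds)

# Toy usage: three Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(m, 0.5, (50, 2)) for m in (0, 3, 6)])
y = np.repeat([0, 1, 2], 50)
print(svm_knn_predict(X, y, np.array([[0.1, 0.2], [5.9, 6.1]])))  # [0 2]
```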


Computer Vision and Pattern Recognition | 2009

From contours to regions: An empirical evaluation

Pablo Andrés Arbeláez; Michael Maire; Charless C. Fowlkes; Jitendra Malik

We propose a generic grouping algorithm that constructs a hierarchy of regions from the output of any contour detector. Our method consists of two steps, an oriented watershed transform (OWT) to form initial regions from contours, followed by construction of an ultrametric contour map (UCM) defining a hierarchical segmentation. We provide extensive experimental evaluation to demonstrate that, when coupled to a high-performance contour detector, the OWT-UCM algorithm produces state-of-the-art image segmentations. These hierarchical segmentations can optionally be further refined by user-specified annotations.
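The nesting property at the heart of the UCM can be seen in a toy sketch with the oriented watershed step omitted: on an idealized contour map whose thresholded boundaries already form closed curves, raising the threshold only merges regions, so segmentations at increasing levels nest into a hierarchy. This shows only the nesting idea, not the full OWT-UCM algorithm; names and values are illustrative.

```python
import numpy as np
from scipy import ndimage as ndi

def hierarchy_by_thresholding(pb, levels):
    # For each level t, regions are connected components of pb < t.
    # Lower t = finer segmentation; raising t can only merge regions,
    # so each coarse region is a union of finer ones (a hierarchy).
    return {t: ndi.label(pb < t)[0] for t in levels}

# Toy contour map: two vertical boundaries of different strength.
pb = np.zeros((8, 12))
pb[:, 4], pb[:, 8] = 0.3, 0.7
h = hierarchy_by_thresholding(pb, levels=[0.2, 0.5, 0.9])
print([h[t].max() for t in [0.2, 0.5, 0.9]])   # [3, 2, 1] regions
```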


Computer Vision and Pattern Recognition | 2008

Using contours to detect and localize junctions in natural images

Michael Maire; Pablo Andrés Arbeláez; Charless C. Fowlkes; Jitendra Malik

Contours and junctions are important cues for perceptual organization and shape recognition. Detecting junctions locally has proved problematic because the image intensity surface is confusing in the neighborhood of a junction. Edge detectors also do not perform well near junctions. Current leading approaches to junction detection, such as the Harris operator, are based on 2D variation in the intensity signal. However, a drawback of this strategy is that it confuses textured regions with junctions. We believe that the right approach to junction detection should take advantage of the contours that are incident at a junction; contours themselves can be detected by processes that use more global approaches. In this paper, we develop a new high-performance contour detector using a combination of local and global cues. This contour detector provides the best performance to date (F=0.70) on the Berkeley Segmentation Dataset (BSDS) benchmark. From the resulting contours, we detect and localize candidate junctions, taking into account both contour salience and geometric configuration. We show that improvements in our contour model lead to better junctions. Our contour and junction detectors both provide state-of-the-art performance.
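The core premise, that junctions are where multiple contours meet, can be sketched as a toy branch-counting heuristic on a thinned binary contour map; the paper's actual detector additionally scores candidates by contour salience and geometric configuration, which this 4-connected toy ignores.

```python
import numpy as np
from scipy import ndimage as ndi

def candidate_junctions(skeleton):
    # skeleton: binary thinned contour map (H, W). A contour pixel with
    # three or more 4-connected contour neighbors is a candidate junction.
    kernel = np.array([[0, 1, 0],
                       [1, 0, 1],
                       [0, 1, 0]])
    neighbor_count = ndi.convolve(skeleton.astype(int), kernel,
                                  mode="constant")
    return np.argwhere(skeleton & (neighbor_count >= 3))

# Toy example: a T-junction of two contour segments.
sk = np.zeros((7, 7), dtype=bool)
sk[3, :] = True          # horizontal contour
sk[3:, 3] = True         # vertical contour meeting it
print(candidate_junctions(sk))   # -> [[3 3]], the meeting point
```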


European Conference on Computer Vision | 2016

Learning Representations for Automatic Colorization

Gustav Larsson; Michael Maire; Gregory Shakhnarovich

We develop a fully automatic image colorization system. Our approach leverages recent advances in deep networks, exploiting both low-level and semantic representations. As many scene elements naturally appear according to multimodal color distributions, we train our model to predict per-pixel color histograms. This intermediate output can be used to automatically generate a color image, or further manipulated prior to image formation. On both fully and partially automatic colorization tasks, we outperform existing methods. We also explore colorization as a vehicle for self-supervised visual representation learning.
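The histogram-valued output admits more than one decode rule; a sketch of the decode step follows, assuming a simple quantized 1-D color axis and a random stand-in for the network's per-pixel histograms (the paper's actual output space and decode are richer).

```python
import numpy as np

H, W, B = 4, 4, 16
bin_centers = np.linspace(0.0, 1.0, B)        # quantized color values (assumed)

# Stand-in for network output: random per-pixel histograms via softmax.
rng = np.random.default_rng(0)
logits = rng.normal(size=(H, W, B))
hist = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)

color_expected = hist @ bin_centers             # (H, W): expectation decode
color_mode = bin_centers[hist.argmax(axis=-1)]  # (H, W): mode decode
# Multimodal pixels (e.g. an object that could be red or blue) are where
# the two decodes disagree most.
print(np.abs(color_expected - color_mode).max())
```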


Computer Vision and Pattern Recognition | 2011

Occlusion boundary detection and figure/ground assignment from optical flow

Patrik Sundberg; Thomas Brox; Michael Maire; Pablo Andrés Arbeláez; Jitendra Malik

In this work, we propose a contour and region detector for video data that exploits motion cues and distinguishes occlusion boundaries from internal boundaries based on optical flow. This detector outperforms the state-of-the-art on the benchmark of Stein and Hebert [24], improving average precision from .58 to .72. Moreover, the optical flow on and near occlusion boundaries allows us to assign a depth ordering to the adjacent regions. To evaluate performance on this edge-based figure/ground labeling task, we introduce a new video dataset that we believe will support further research in the field by allowing quantitative comparison of computational models for occlusion boundary detection, depth ordering and segmentation in video sequences.
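The motion cue can be sketched under simplifying assumptions: sample flow on either side of a boundary pixel along its normal; a large difference marks an occlusion boundary, and the side whose flow agrees with the boundary's own motion is taken as the occluding (figure) side. Thresholds, sampling, and names here are illustrative, not the paper's learned model.

```python
import numpy as np

def classify_boundary(flow, p, normal, boundary_flow, step=2, tau=0.5):
    # flow: (H, W, 2) optical flow; p: boundary pixel (row, col);
    # normal: unit boundary normal (drow, dcol), integer for indexing.
    a = flow[p[0] + step * normal[0], p[1] + step * normal[1]]
    b = flow[p[0] - step * normal[0], p[1] - step * normal[1]]
    if np.linalg.norm(a - b) < tau:
        return "internal", None          # both sides move together
    # The occluder moves with the boundary; the occluded surface does not.
    da = np.linalg.norm(a - boundary_flow)
    db = np.linalg.norm(b - boundary_flow)
    return "occlusion", ("side_a" if da < db else "side_b")

# Toy flow: left half moves right (carrying the boundary), right half static.
flow = np.zeros((10, 10, 2))
flow[:, :5, 1] = 1.0
print(classify_boundary(flow, p=(5, 5), normal=(0, 1),
                        boundary_flow=np.array([0.0, 1.0])))
# -> ('occlusion', 'side_b'): the moving left half is figure.
```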


European Conference on Computer Vision | 2004

Recognition by probabilistic hypothesis construction

Pierre Moreels; Michael Maire; Pietro Perona

We present a probabilistic framework for recognizing objects in images of cluttered scenes. Hundreds of objects may be considered and searched in parallel. Each object is learned from a single training image and modeled by the visual appearance of a set of features, and their position with respect to a common reference frame. The recognition process computes identity and position of objects in the scene by finding the best interpretation of the scene in terms of learned objects. Features detected in an input image are either paired with database features, or marked as clutter. Each hypothesis is scored using a generative model of the image which is defined using the learned objects and a model for clutter. While the space of possible hypotheses is enormously large, one may find the best hypothesis efficiently; we explore some heuristics to do so. Our algorithm compares favorably with state-of-the-art recognition systems.
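A heavily simplified sketch of the scoring step: a hypothesis assigns each detected feature either to a learned model feature or to clutter, and is scored under a generative model that rewards close appearance/position matches and charges a flat cost per clutter feature. The Gaussian match model and clutter constant here are illustrative stand-ins for the paper's learned densities.

```python
import numpy as np

LOG_CLUTTER = np.log(1e-3)   # flat clutter density (assumed constant)

def log_match(f_img, f_model, sigma=1.0):
    # Gaussian agreement between a paired image and model feature.
    d2 = np.sum((f_img - f_model) ** 2)
    return -0.5 * d2 / sigma**2 - f_img.size * np.log(sigma * np.sqrt(2 * np.pi))

def score_hypothesis(features, assignment, model):
    # assignment[i] = index into model, or None to mark feature i as clutter.
    return sum(LOG_CLUTTER if j is None else log_match(features[i], model[j])
               for i, j in enumerate(assignment))

# Toy: two image features against a two-feature model.
feats = [np.array([0.1, 0.0]), np.array([5.0, 5.0])]
model = [np.array([0.0, 0.0]), np.array([1.0, 1.0])]
print(score_hypothesis(feats, [0, None], model))  # feature 1 as clutter: wins
print(score_hypothesis(feats, [0, 1], model))     # forced bad match: scores worse
```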


International Conference on Computer Vision | 2011

Object detection and segmentation from joint embedding of parts and pixels

Michael Maire; Stella X. Yu; Pietro Perona

We present a new framework in which image segmentation, figure/ground organization, and object detection all appear as the result of solving a single grouping problem. This framework serves as a perceptual organization stage that integrates information from low-level image cues with that of high-level part detectors. Pixels and parts each appear as nodes in a graph whose edges encode both affinity and ordering relationships. We derive a generalized eigenproblem from this graph and read off an interpretation of the image from the solution eigenvectors. Combining an off-the-shelf top-down part-based person detector with our low-level cues and grouping formulation, we demonstrate improvements to object detection and segmentation.
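A minimal sketch of the joint graph, with the ordering (figure/ground) relationships omitted for brevity: pixel nodes and a part-detector node share one affinity matrix, and a single normalized-cuts-style generalized eigenproblem is solved over all nodes, so the part lands in the same embedding as the pixels it supports. Weights and sizes are illustrative.

```python
import numpy as np
from scipy.linalg import eigh

n_pix, n_parts = 6, 1
n = n_pix + n_parts
W = np.eye(n) * 1e-6                     # tiny self-loops keep D positive
for i in range(n_pix - 1):               # chain affinities between pixels
    W[i, i + 1] = W[i + 1, i] = 1.0
W[2, 3] = W[3, 2] = 0.01                 # weak link = likely object boundary
for i in (0, 1, 2):                      # part detector supports pixels 0-2
    W[n_pix, i] = W[i, n_pix] = 1.0

D = np.diag(W.sum(axis=1))
# Generalized eigenproblem (D - W) v = lambda D v over pixels AND parts.
vals, vecs = eigh(D - W, D)
# The second eigenvector should separate the part plus its supporting
# pixels {0, 1, 2} from the rest {3, 4, 5}.
print(np.round(vecs[:, 1], 2))
```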


International Conference on Computer Vision | 2015

Direct Intrinsics: Learning Albedo-Shading Decomposition by Convolutional Regression

Takuya Narihira; Michael Maire; Stella X. Yu

We introduce a new approach to intrinsic image decomposition, the task of decomposing a single image into albedo and shading components. Our strategy, which we term direct intrinsics, is to learn a convolutional neural network (CNN) that directly predicts output albedo and shading channels from an input RGB image patch. Direct intrinsics is a departure from classical techniques for intrinsic image decomposition, which typically rely on physically-motivated priors and graph-based inference algorithms. The large-scale synthetic ground-truth of the MPI Sintel dataset plays a key role in training direct intrinsics. We demonstrate results on both the synthetic images of Sintel and the real images of the classic MIT intrinsic image dataset. On Sintel, direct intrinsics, using only RGB input, outperforms all prior work, including methods that rely on RGB+Depth input. Direct intrinsics also generalizes across modalities: our Sintel-trained CNN produces quite reasonable decompositions on the real images of the MIT dataset. Our results indicate that the marriage of CNNs with synthetic training data may be a powerful new technique for tackling classic problems in computer vision.
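A schematic PyTorch sketch of the direct-regression setup, not the paper's actual architecture: a small convolutional trunk maps an RGB patch to two per-pixel outputs, albedo and shading, which training would regress to ground truth (e.g. Sintel's synthetic decompositions). Layer shapes and the reconstruction check are assumptions.

```python
import torch
import torch.nn as nn

class DirectIntrinsics(nn.Module):
    def __init__(self):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        )
        # Two heads: 3-channel albedo, 1-channel shading.
        self.albedo = nn.Conv2d(32, 3, 1)
        self.shading = nn.Conv2d(32, 1, 1)

    def forward(self, rgb):
        h = self.trunk(rgb)
        return self.albedo(h), self.shading(h)

net = DirectIntrinsics()
rgb = torch.rand(1, 3, 64, 64)
albedo, shading = net(rgb)
# Training regresses both outputs to ground truth; the intrinsic model
# says the input should factor as albedo * shading.
recon_error = ((albedo * shading - rgb) ** 2).mean()
print(albedo.shape, shading.shape, recon_error.item())
```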

Collaboration


Dive into Michael Maire's collaborations.

Top Co-Authors

Stella X. Yu
University of California

Jitendra Malik
University of California

Pietro Perona
California Institute of Technology

Jaety Edwards
University of California

Erik G. Learned-Miller
University of Massachusetts Amherst

Gregory Shakhnarovich
Toyota Technological Institute at Chicago