Publication


Featured research published by Greg Mori.


Computer Vision and Pattern Recognition | 2004

Recovering human body configurations: combining segmentation and recognition

Greg Mori; Xiaofeng Ren; Alexei A. Efros; Jitendra Malik

The goal of this work is to detect a human figure in an image and localize its joints and limbs along with their associated pixel masks. In this work we attempt to tackle this problem in a general setting. The dataset we use is a collection of sports news photographs of baseball players, varying dramatically in pose and clothing. The approach we take is to use segmentation to guide our recognition algorithm to salient bits of the image. We use this segmentation approach to build limb and torso detectors, the outputs of which are assembled into human figures. We present quantitative results on torso localization, in addition to shortlisted full-body configurations.
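
A minimal sketch of the segmentation-then-recognition idea above, assuming scikit-image for the over-segmentation (the paper itself uses Normalized Cuts) and hypothetical scoring callables score_torso and score_limb standing in for the learned detectors.

    import numpy as np
    from skimage.measure import regionprops
    from skimage.segmentation import felzenszwalb  # stand-in for Normalized Cuts

    def candidate_parts(image, score_torso, score_limb):
        """Score every segment as a torso or half-limb candidate."""
        labels = felzenszwalb(image, scale=100) + 1  # over-segment; regionprops skips label 0
        torsos, limbs = [], []
        for region in regionprops(labels):
            mask = labels == region.label            # pixel mask for this segment
            torsos.append((score_torso(image, mask), mask))
            limbs.append((score_limb(image, mask), mask))
        # The full system assembles the best-scoring parts into complete
        # figures subject to kinematic constraints; here we just rank them.
        torsos.sort(key=lambda t: -t[0])
        limbs.sort(key=lambda t: -t[0])
        return torsos[:5], limbs[:20]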


Computer Vision and Pattern Recognition | 2007

Detecting Pedestrians by Learning Shapelet Features

Payam Sabzmeydani; Greg Mori

In this paper, we address the problem of detecting pedestrians in still images. We introduce an algorithm for learning shapelet features, a set of mid-level features. These features are focused on local regions of the image and are built from low-level gradient information that discriminates between pedestrian and non-pedestrian classes. Using AdaBoost, these shapelet features are created as a combination of oriented gradient responses. To train the final classifier, we use AdaBoost for a second time to select a subset of our learned shapelets. By first focusing locally on smaller feature sets, our algorithm attempts to harvest more useful information than by examining all the low-level features together. We present quantitative results demonstrating the effectiveness of our algorithm. In particular, we obtain an error rate 14 percentage points lower (at 10^-6 FPPW) than the previous state-of-the-art detector of Dalal and Triggs on the INRIA dataset.
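
A rough sketch of the two-stage boosting described above, assuming scikit-learn's AdaBoost (whose default weak learner is a decision stump). The callable grad_features(window, region), returning the low-level oriented-gradient responses inside one local sub-window, is a hypothetical placeholder.

    import numpy as np
    from sklearn.ensemble import AdaBoostClassifier

    def learn_shapelets(windows, labels, grad_features, n_regions=10):
        # Stage 1: boost over low-level gradient features separately in each
        # local region; each learned combination is one "shapelet".
        shapelets = []
        for region in range(n_regions):
            X = np.stack([grad_features(w, region) for w in windows])
            shapelets.append((region, AdaBoostClassifier(n_estimators=50).fit(X, labels)))
        return shapelets

    def train_detector(windows, labels, grad_features, shapelets):
        # Stage 2: boost a second time over the shapelet responses to select
        # a subset of shapelets and form the final pedestrian classifier.
        X = np.stack([
            [clf.decision_function(grad_features(w, r).reshape(1, -1))[0]
             for r, clf in shapelets]
            for w in windows])
        return AdaBoostClassifier(n_estimators=100).fit(X, labels)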


Computer Vision and Pattern Recognition | 2008

Action recognition by learning mid-level motion features

Alireza Fathi; Greg Mori

This paper presents a method for human action recognition based on patterns of motion. Previous approaches to action recognition use either local features describing small patches or large-scale features describing the entire human figure. We develop a method for constructing mid-level motion features from low-level optical flow information. These features are focused on local regions of the image sequence and are created using a variant of AdaBoost. These features are tuned to discriminate between different classes of action, and are efficient to compute at run-time. A battery of classifiers based on these mid-level features is created and used to classify input sequences. State-of-the-art results are presented on a variety of standard datasets.
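
A small sketch of the low-level motion cues those mid-level features are built from, assuming OpenCV's dense optical flow on consecutive grayscale frames; the boosting stage that combines them is analogous to the shapelet construction above and is omitted.

    import cv2
    import numpy as np

    def flow_features(prev_gray, next_gray, grid=(4, 4)):
        """Mean optical flow vector in each cell of a coarse grid."""
        flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        h, w = prev_gray.shape
        feats = []
        for i in range(grid[0]):
            for j in range(grid[1]):
                cell = flow[i * h // grid[0]:(i + 1) * h // grid[0],
                            j * w // grid[1]:(j + 1) * w // grid[1]]
                feats.extend(cell.mean(axis=(0, 1)))  # mean (dx, dy) per cell
        return np.array(feats)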


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2005

Efficient shape matching using shape contexts

Greg Mori; Serge J. Belongie; Jitendra Malik

We demonstrate that shape contexts can be used to quickly prune a search for similar shapes. We present two algorithms for rapid shape retrieval: representative shape contexts, performing comparisons based on a small number of shape contexts, and shapemes, using vector quantization in the space of shape contexts to obtain prototypical shape pieces.
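
A condensed sketch of both pruning ideas, assuming numpy and scikit-learn: a shape context is a log-polar histogram of where the other sample points lie, and "shapemes" are k-means centers in that histogram space.

    import numpy as np
    from sklearn.cluster import KMeans

    def shape_context(points, idx, n_r=5, n_theta=12):
        """Log-polar histogram of all points relative to points[idx]."""
        d = np.delete(points - points[idx], idx, axis=0)
        r = np.log1p(np.hypot(d[:, 0], d[:, 1]))
        theta = np.arctan2(d[:, 1], d[:, 0]) % (2 * np.pi)
        hist, _, _ = np.histogram2d(r, theta, bins=[n_r, n_theta],
                                    range=[[0, r.max()], [0, 2 * np.pi]])
        return hist.ravel() / len(d)

    def learn_shapemes(all_contexts, k=100):
        # Vector-quantize shape contexts so that a whole shape reduces to a
        # histogram of shapeme labels, which is cheap to compare at query time.
        return KMeans(n_clusters=k, n_init=10).fit(all_contexts)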


European Conference on Computer Vision | 2002

Estimating Human Body Configurations Using Shape Context Matching

Greg Mori; Jitendra Malik

The problem we consider in this paper is to take a single two-dimensional image containing a human body, locate the joint positions, and use these to estimate the body configuration and pose in three-dimensional space. The basic approach is to store a number of exemplar 2D views of the human body in a variety of different configurations and viewpoints with respect to the camera. On each of these stored views, the locations of the body joints (left elbow, right knee, etc.) are manually marked and labelled for future use. The test shape is then matched to each stored view, using the technique of shape context matching in conjunction with a kinematic chain-based deformation model. Assuming that there is a stored view sufficiently similar in configuration and pose, the correspondence process will succeed. The locations of the body joints are then transferred from the exemplar view to the test shape. Given the joint locations, the 3D body configuration and pose are then estimated. We can apply this technique to video by treating each frame independently - tracking just becomes repeated recognition! We present results on a variety of datasets.
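
A schematic version of the exemplar-matching step. The deformable matching itself is abstracted behind a hypothetical callable match(test, exemplar) returning a cost and a warp; the paper obtains these from shape context matching with a kinematic chain-based deformation model.

    import numpy as np

    def estimate_joints(test_shape, exemplars, match):
        """exemplars: list of (shape, joints) with hand-labelled 2D joints."""
        results = [(match(test_shape, shape), joints)
                   for shape, joints in exemplars]
        (cost, warp), joints = min(results, key=lambda r: r[0][0])
        # Transfer the labelled joints through the recovered correspondence.
        return np.array([warp(j) for j in joints])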


International Conference on Computer Vision | 2005

Guiding model search using segmentation

Greg Mori

In this paper we show how the segmentation-as-preprocessing paradigm can be used to improve the efficiency and accuracy of model search in an image. We operationalize this idea using an over-segmentation of the image into superpixels. The problem domain we explore is human body pose estimation from still images. The superpixels prove useful in two ways. First, we restrict the joint positions in our human body model to lie at centers of superpixels, which reduces the size of the model search space. In addition, accurate support masks for computing features on half-limbs of the body model are obtained by using agglomerations of superpixels as half-limb segments. We present results on a challenging dataset of people in sports news images.
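
A small sketch of how superpixels shrink the search space, assuming scikit-image's SLIC as the over-segmentation (the paper uses a Normalized Cuts-based over-segmentation): joint candidates are restricted to superpixel centers.

    import numpy as np
    from skimage.measure import regionprops
    from skimage.segmentation import slic

    def joint_candidates(image, n_segments=200):
        """Candidate joint locations = centroids of the superpixels.
        Half-limb support masks would be built by agglomerating the same
        superpixels, which this sketch leaves out."""
        labels = slic(image, n_segments=n_segments, start_label=1)
        return np.array([r.centroid for r in regionprops(labels)])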


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2006

Recovering 3D human body configurations using shape contexts

Greg Mori; Jitendra Malik

The problem we consider in this paper is to take a single two-dimensional image containing a human figure, locate the joint positions, and use these to estimate the body configuration and pose in three-dimensional space. The basic approach is to store a number of exemplar 2D views of the human body in a variety of different configurations and viewpoints with respect to the camera. On each of these stored views, the locations of the body joints (left elbow, right knee, etc.) are manually marked and labeled for future use. The input image is then matched to each stored view, using the technique of shape context matching in conjunction with a kinematic chain-based deformation model. Assuming that there is a stored view sufficiently similar in configuration and pose, the correspondence process will succeed. The locations of the body joints are then transferred from the exemplar view to the test shape. Given the 2D joint locations, the 3D body configuration and pose are then estimated using an existing algorithm. We can apply this technique to video by treating each frame independently - tracking just becomes repeated recognition. We present results on a variety of datasets.
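
The abstract leaves the 2D-to-3D step to "an existing algorithm", which is not named here; a standard choice in this line of work, sketched below, is Taylor-style lifting under scaled orthographic projection, where the foreshortening of a limb of known relative length fixes the relative depth of its endpoints up to a sign.

    import numpy as np

    def lift_limb(p1, p2, length, scale):
        """Relative depth between two joints from one projected limb.
        p1, p2: 2D joint positions; length: known 3D limb length;
        scale: scaled-orthographic camera scale (assumed known)."""
        proj_sq = np.sum((np.asarray(p1) - np.asarray(p2)) ** 2) / scale ** 2
        dz_sq = length ** 2 - proj_sq
        if dz_sq < 0:             # inconsistent length/scale; clamp to zero
            return 0.0
        return np.sqrt(dz_sq)     # sign (towards/away from camera) is ambiguous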


Computer Vision and Pattern Recognition | 2010

Recognizing human actions from still images with latent poses

Weilong Yang; Yang Wang; Greg Mori

We consider the problem of recognizing human actions from still images. We propose a novel approach that treats the pose of the person in the image as latent variables that will help with recognition. Unlike other work that learns separate systems for pose estimation and action recognition and then combines them in an ad hoc fashion, our system is trained in an integrated fashion that jointly considers poses and actions. Our learning objective is designed to directly exploit the pose information for action recognition. Our experimental results demonstrate that by inferring the latent poses, we can improve the final action recognition results.
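
A toy rendering of the latent-pose scoring rule implied above, assuming a linear model: each image/action pair is scored by its best latent pose, and the predicted action is the one whose best pose scores highest. The joint feature map phi is a hypothetical placeholder.

    import numpy as np

    def predict_action(x, actions, poses, w, phi):
        """argmax over actions of max over latent poses of w . phi(x, pose, a)."""
        def best_pose_score(a):
            return max(np.dot(w, phi(x, p, a)) for p in poses)
        return max(actions, key=best_pose_score)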


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2012

Discriminative Latent Models for Recognizing Contextual Group Activities

Tian Lan; Yang Wang; Weilong Yang; Stephen N. Robinovitch; Greg Mori

In this paper, we go beyond recognizing the actions of individuals and focus on group activities. This is motivated by the observation that human actions are rarely performed in isolation; the contextual information of what other people in the scene are doing provides a useful cue for understanding high-level activities. We propose a novel framework for recognizing group activities which jointly captures the group activity, the individual person actions, and the interactions among them. Two types of contextual information, group-person interaction and person-person interaction, are explored in a latent variable framework. In particular, we propose three different approaches to model the person-person interaction. One approach is to explore the structures of person-person interaction. Unlike most previous latent structured models, which assume a predefined structure for the hidden layer, e.g., a tree structure, we treat the structure of the hidden layer as a latent variable and implicitly infer it during learning and inference. The second approach explores person-person interaction at the feature level. We introduce a new feature representation called the action context (AC) descriptor. The AC descriptor encodes information about not only the action of an individual person in the video, but also the behavior of other people nearby. The third approach combines the above two. Our experimental results demonstrate the benefit of using contextual information for disambiguating group activities.
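
A simplified sketch of the action context (AC) descriptor described above: the focal person's per-action scores concatenated with max-pooled scores of the people nearby. The per-person classifier action_scores is a hypothetical placeholder, and the real descriptor also pools over spatio-temporal bins.

    import numpy as np

    def ac_descriptor(focal, neighbours, action_scores):
        own = action_scores(focal)  # per-action scores, shape (n_actions,)
        if neighbours:
            context = np.max([action_scores(p) for p in neighbours], axis=0)
        else:
            context = np.zeros_like(own)
        return np.concatenate([own, context])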


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2011

Hidden Part Models for Human Action Recognition: Probabilistic versus Max Margin

Yang Wang; Greg Mori

We present a discriminative part-based approach for human action recognition from video sequences using motion features. Our model is based on the recently proposed hidden conditional random field (HCRF) for object recognition. As in the HCRF for object recognition, we model a human action by a flexible constellation of parts conditioned on image observations. Unlike object recognition, our model combines both large-scale global features and local patch features to distinguish various actions. Our experimental results show that our model is comparable to other state-of-the-art approaches in action recognition. In particular, our experimental results demonstrate that combining large-scale global features and local patch features performs significantly better than directly applying HCRF on local patches alone. We also propose an alternative for learning the parameters of an HCRF model in a max-margin framework. We call this method the max-margin hidden conditional random field (MMHCRF). We demonstrate that MMHCRF outperforms HCRF in human action recognition. In addition, MMHCRF can handle a much broader range of complex hidden structures arising in various problems in computer vision.
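
A compact sketch contrasting the two training views named above, for a per-class score that decomposes over hidden part assignments: the HCRF marginalizes over the hidden parts, while the max-margin variant (MMHCRF) replaces the sum with a max. Part scores are assumed given, and the pairwise terms of a real HCRF are omitted.

    import numpy as np
    from scipy.special import logsumexp

    def class_score(part_scores, max_margin=False):
        """part_scores[i, h]: score of hidden state h for part i under one
        candidate class label (parts treated as independent here)."""
        if max_margin:  # MMHCRF: best single hidden assignment
            return part_scores.max(axis=1).sum()
        # HCRF: log-sum-exp over hidden assignments
        return logsumexp(part_scores, axis=1).sum()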

Collaboration


Dive into Greg Mori's collaborations.

Top Co-Authors

Yang Wang | University of Manitoba
Arash Vahdat | Simon Fraser University
Tian Lan | Simon Fraser University
Zhiwei Deng | Simon Fraser University
Weilong Yang | Simon Fraser University
Jitendra Malik | University of California
Mani Ranjbar | Simon Fraser University