Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Ashish Kapoor is active.

Publication


Featured research published by Ashish Kapoor.


acm multimedia | 2005

Multimodal affect recognition in learning environments

Ashish Kapoor; Rosalind W. Picard

We propose a multi-sensor affect recognition system and evaluate it on the challenging task of classifying interest (or disinterest) in children trying to solve an educational puzzle on the computer. The multimodal sensory information from facial expressions and postural shifts of the learner is combined with information about the learner's activity on the computer. We propose a unified approach, based on a mixture of Gaussian Processes, for achieving sensor fusion under the problematic conditions of missing channels and noisy labels. This approach generates separate class labels corresponding to each individual modality. The final classification is based upon a hidden random variable, which probabilistically combines the sensors. The multimodal Gaussian Process approach achieves an accuracy of over 86%, significantly outperforming classification using the individual modalities and several other combination schemes.
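The combination step described above can be sketched as a simple mixture: a hidden variable selects which modality's expert to trust, and marginalizing over it yields the fused class posterior. A minimal illustration, not the paper's actual model; the function name, class labels, and numbers are hypothetical:

```python
def fuse_experts(expert_probs, switch_prior):
    """Marginalize over a hidden switch variable that picks one modality.

    expert_probs: list of per-modality class-probability dicts
    switch_prior: P(switch = modality i); must sum to 1
    """
    classes = expert_probs[0].keys()
    return {c: sum(w * p[c] for w, p in zip(switch_prior, expert_probs))
            for c in classes}

# e.g. the face expert is fairly sure of "interest", the posture expert less so
fused = fuse_experts(
    [{"interest": 0.9, "disinterest": 0.1},
     {"interest": 0.6, "disinterest": 0.4}],
    switch_prior=[0.7, 0.3])
```

The fused posterior stays a valid distribution because the switch prior is itself a distribution over modalities.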


international conference on computer vision | 2007

Active Learning with Gaussian Processes for Object Categorization

Ashish Kapoor; Kristen Grauman; Raquel Urtasun; Trevor Darrell

Discriminative methods for visual object category recognition are typically non-probabilistic, predicting class labels but not directly providing an estimate of uncertainty. Gaussian Processes (GPs) are powerful regression techniques with explicit uncertainty models; we show here how Gaussian Processes with covariance functions defined based on a Pyramid Match Kernel (PMK) can be used for probabilistic object category recognition. The uncertainty model provided by GPs offers confidence estimates at test points, and naturally allows for an active learning paradigm in which points are optimally selected for interactive labeling. We derive a novel active category learning method based on our probabilistic regression model, and show that a significant boost in classification performance is possible, especially when the amount of training data for a category is ultimately very small.
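The active-learning loop the abstract describes queries the unlabeled point whose GP predictive variance is largest. A toy 1-D sketch with an RBF kernel standing in for the Pyramid Match Kernel; all names and data here are illustrative, not from the paper:

```python
import math

def rbf(a, b, length_scale=1.0):
    return math.exp(-(a - b) ** 2 / (2 * length_scale ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def gp_variance(x_star, X, noise=1e-6):
    """GP predictive variance at x_star given labeled inputs X."""
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    k_star = [rbf(x_star, a) for a in X]
    v = solve(K, k_star)
    return rbf(x_star, x_star) - sum(k * w for k, w in zip(k_star, v))

def pick_next(pool, X):
    # active learning: query the candidate with the largest predictive variance
    return max(pool, key=lambda x: gp_variance(x, X))
```

Points far from the labeled data get high variance, so they are selected first for interactive labeling.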


computer vision and pattern recognition | 2011

Learning a blind measure of perceptual image quality

Huixuan Tang; Neel Joshi; Ashish Kapoor

It is often desirable to evaluate an image based on its quality. For many computer vision applications, a perceptually meaningful measure is the most relevant for evaluation; however, most commonly used measures do not map well to human judgements of image quality. A further complication of many existing image measures is that they require a reference image, which is often not available in practice. In this paper, we present a “blind” image quality measure, where potentially neither the ground-truth image nor the degradation process is known. Our method uses a set of novel low-level image features in a machine learning framework to learn a mapping from these features to subjective image quality scores. The image quality features stem from natural image and texture statistics. Experiments on a standard image quality benchmark dataset show that our method outperforms the current state of the art.
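The learning step maps image features to subjective scores. As a heavily simplified stand-in for the paper's framework, a one-feature least-squares fit; the data and names below are made up:

```python
def fit_quality_model(features, scores):
    """Ordinary least squares for one scalar feature -> quality score."""
    n = len(features)
    mx, my = sum(features) / n, sum(scores) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(features, scores))
             / sum((x - mx) ** 2 for x in features))
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

# hypothetical (feature value, subjective score) training pairs
model = fit_quality_model([0.0, 1.0, 2.0], [1.0, 3.0, 5.0])
```

The real system learns from many low-level features at once; the point of the sketch is only the feature-to-score regression setup.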


International Journal of Computer Vision | 2010

Gaussian Processes for Object Categorization

Ashish Kapoor; Kristen Grauman; Raquel Urtasun; Trevor Darrell

Discriminative methods for visual object category recognition are typically non-probabilistic, predicting class labels but not directly providing an estimate of uncertainty. Gaussian Processes (GPs) provide a framework for deriving regression techniques with explicit uncertainty models; we show here how Gaussian Processes with covariance functions defined based on a Pyramid Match Kernel (PMK) can be used for probabilistic object category recognition. Our probabilistic formulation provides a principled way to learn hyperparameters, which we utilize to learn an optimal combination of multiple covariance functions. It also offers confidence estimates at test points, and naturally allows for an active learning paradigm in which points are optimally selected for interactive labeling. We show that with an appropriate combination of kernels a significant boost in classification performance is possible. Further, our experiments indicate the utility of active learning with probabilistic predictive models, especially when the number of training labels that may be sought for a category is ultimately very small.
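Learning an optimal combination of covariance functions amounts to weighting base kernels: a combination with non-negative weights of valid kernels is again a valid kernel. A minimal sketch, with illustrative base kernels and weights rather than the paper's learned ones:

```python
import math

def rbf(a, b):
    return math.exp(-(a - b) ** 2 / 2.0)

def linear(a, b):
    return a * b

def combined_kernel(kernels, alphas):
    # non-negative weights keep the combination positive semi-definite
    return lambda a, b: sum(al * k(a, b) for al, k in zip(alphas, kernels))

k = combined_kernel([rbf, linear], alphas=[0.5, 0.5])
```

In the paper the weights are hyperparameters learned from data; here they are fixed only to keep the sketch self-contained.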


human factors in computing systems | 2012

AffectAura: an intelligent system for emotional memory

Daniel McDuff; Amy K. Karlson; Ashish Kapoor; Asta Roseway; Mary Czerwinski

We present AffectAura, an emotional prosthetic that allows users to reflect on their emotional states over long periods of time. We designed a multimodal sensor set-up for continuous logging of audio, visual, physiological and contextual data, a classification scheme for predicting user affective state and an interface for user reflection. The system continuously predicts a user's valence, arousal and engagement, and correlates this with information on events, communications and data interactions. We evaluate the interface through a user study consisting of six users and over 240 hours of data, and demonstrate the utility of such a reflection tool. We show that users could reason forward and backward in time about their emotional experiences using the interface, and found this useful.


workshop on perceptive user interfaces | 2001

A real-time head nod and shake detector

Ashish Kapoor; Rosalind W. Picard

Head nods and head shakes are non-verbal gestures often used to communicate intent, emotion and to perform conversational functions. We describe a vision-based system that detects head nods and head shakes in real time and can act as a useful and basic interface to a machine. We use an infrared sensitive camera equipped with infrared LEDs to track pupils. The directions of head movements, determined using the position of pupils, are used as observations by a discrete Hidden Markov Model (HMM) based pattern analyzer to detect when a head nod/shake occurs. The system is trained and tested on natural data from ten users gathered in the presence of varied lighting and varied facial expressions. The system as described achieves a real-time recognition accuracy of 78.46% on the test dataset.
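The HMM-based pattern analyzer can be sketched as two competing discrete HMMs, one per gesture, scored with the forward algorithm; the sequence is labeled by the model with the higher likelihood. The state layout, smoothing values, and observation symbols below are illustrative assumptions, not the paper's trained models:

```python
import math

def forward_loglik(obs, start, trans, emit):
    # forward algorithm: log P(observation sequence | HMM)
    alpha = {s: start[s] * emit[s][obs[0]] for s in start}
    for o in obs[1:]:
        alpha = {s: sum(alpha[p] * trans[p][s] for p in alpha) * emit[s][o]
                 for s in start}
    return math.log(sum(alpha.values()))

def make_model(sym_a, sym_b):
    # two alternating states, each favoring one movement direction;
    # a small smoothing mass keeps every observation symbol possible
    emit_a = {sym_a: 0.85, sym_b: 0.05}
    emit_b = {sym_a: 0.05, sym_b: 0.85}
    for e in (emit_a, emit_b):
        for s in ("up", "down", "left", "right"):
            e.setdefault(s, 0.05)
    return ({"A": 0.5, "B": 0.5},
            {"A": {"A": 0.1, "B": 0.9}, "B": {"A": 0.9, "B": 0.1}},
            {"A": emit_a, "B": emit_b})

NOD = make_model("up", "down")
SHAKE = make_model("left", "right")

def classify(obs):
    return ("nod" if forward_loglik(obs, *NOD) > forward_loglik(obs, *SHAKE)
            else "shake")
```

Alternating up/down observations score far higher under the nod model's alternating states than under the shake model, and vice versa.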


human factors in computing systems | 2009

EnsembleMatrix: interactive visualization to support machine learning with multiple classifiers

Justin Talbot; Bongshin Lee; Ashish Kapoor; Desney S. Tan

Machine learning is an increasingly used computational tool within human-computer interaction research. While most researchers currently utilize an iterative approach to refining classifier models and performance, we propose that ensemble classification techniques may be a viable and even preferable alternative. In ensemble learning, algorithms combine multiple classifiers to build one that is superior to its components. In this paper, we present EnsembleMatrix, an interactive visualization system that presents a graphical view of confusion matrices to help users understand relative merits of various classifiers. EnsembleMatrix allows users to directly interact with the visualizations in order to explore and build combination models. We evaluate the efficacy of the system and the approach in a user study. Results show that users are able to quickly combine multiple classifiers operating on multiple feature sets to produce an ensemble classifier with accuracy that approaches best-reported performance classifying images in the CalTech-101 dataset.
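The confusion matrices that EnsembleMatrix visualizes are cheap to compute from predictions. A minimal sketch with hypothetical class labels; the real system layers interactive weighting and partitioning on top of views like this:

```python
def confusion_matrix(y_true, y_pred, classes):
    """Count matrix: rows are true classes, columns predicted classes."""
    m = {t: {p: 0 for p in classes} for t in classes}
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m

def accuracy(m):
    correct = sum(m[c][c] for c in m)
    total = sum(sum(row.values()) for row in m.values())
    return correct / total

cm = confusion_matrix(["cat", "cat", "dog", "dog"],
                      ["cat", "dog", "dog", "dog"],
                      ["cat", "dog"])
```

Off-diagonal cells show which classes a component classifier confuses, which is the signal users exploit when building the ensemble.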


computer vision and pattern recognition | 2009

Active learning for large multi-class problems

Prateek Jain; Ashish Kapoor

Scarcity and infeasibility of human supervision for large-scale multi-class classification problems necessitate active learning. Unfortunately, existing active learning methods for multi-class problems are inherently binary methods and do not scale up to a large number of classes. In this paper, we introduce a probabilistic variant of the K-nearest neighbor method for classification that can be seamlessly used for active learning in multi-class scenarios. Given some labeled training data, our method learns an accurate metric/kernel function over the input space that can be used for classification and similarity search. Unlike existing metric/kernel learning methods, our scheme is highly scalable for classification problems and provides a natural notion of uncertainty over class labels. Further, we use this measure of uncertainty to actively sample training examples that maximize discriminating capabilities of the model. Experiments on benchmark datasets show that the proposed method learns appropriate distance metrics that lead to state-of-the-art performance for object categorization problems. Furthermore, our active learning method effectively samples training examples, resulting in significant accuracy gains over random sampling for multi-class problems involving a large number of classes.
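The uncertainty-driven sampling can be sketched with a plain (unlearned-metric) probabilistic KNN in one dimension: neighbor votes give a class distribution, its entropy measures uncertainty, and the most uncertain pool point is queried next. Everything below is an illustrative toy, not the paper's learned metric:

```python
import math

def knn_probs(x, labeled, k=2):
    """Class probabilities from the k nearest labeled 1-D points."""
    nearest = sorted(labeled, key=lambda d: abs(d[0] - x))[:k]
    counts = {}
    for _, label in nearest:
        counts[label] = counts.get(label, 0) + 1
    total = sum(counts.values())
    return {label: c / total for label, c in counts.items()}

def entropy(probs):
    return -sum(p * math.log(p) for p in probs.values() if p > 0)

def most_uncertain(pool, labeled, k=2):
    # active sampling: query the pool point with the most uncertain label
    return max(pool, key=lambda x: entropy(knn_probs(x, labeled, k)))

LABELED = [(0.0, "a"), (0.1, "a"), (1.0, "b"), (1.1, "b")]
```

A point deep inside one cluster has zero-entropy predictions, while a point midway between the clusters maximizes entropy and gets queried.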


international soi conference | 2003

Fully automatic upper facial action recognition

Ashish Kapoor; Yuan Qi; Rosalind W. Picard

We provide a new fully automatic framework to analyze facial action units, the fundamental building blocks of facial expression enumerated in Paul Ekman's facial action coding system (FACS). The action units examined here include upper facial muscle movements such as inner eyebrow raise, eye widening, and so forth, which combine to form facial expressions. Although prior methods have obtained high recognition rates for recognizing facial action units, these methods either use manually preprocessed image sequences or require human specification of facial features; thus, they have relied on substantial human intervention. We present a fully automatic method, requiring no such human specification. The system first robustly detects the pupils using an infrared sensitive camera equipped with infrared LEDs. For each frame, the pupil positions are used to localize and normalize eye and eyebrow regions, which are analyzed using PCA to recover parameters that relate to the shape of the facial features. These parameters are used as input to classifiers based on support vector machines to recognize upper facial action units and all their possible combinations. On a completely natural dataset with many head movements, pose changes and occlusions, the new framework achieved a recognition accuracy of 69.3% for each individual AU and an accuracy of 62.5% for all possible AU combinations. This framework achieves a higher recognition accuracy on the Cohn-Kanade AU-coded facial expression database, which has been previously used to evaluate other facial action recognition systems.
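The PCA step recovers shape parameters by projecting normalized feature regions onto principal directions; the leading direction can be found by power iteration on the covariance matrix. A toy 2-D sketch with made-up data, far smaller than the image-patch vectors the paper actually uses:

```python
import math

def leading_component(data, iters=200):
    """Leading principal direction of small mean-centered data (power iteration)."""
    n, d = len(data), len(data[0])
    mean = [sum(row[i] for row in data) / n for i in range(d)]
    X = [[row[i] - mean[i] for i in range(d)] for row in data]
    # sample covariance matrix
    C = [[sum(X[r][i] * X[r][j] for r in range(n)) / n
          for j in range(d)] for i in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(C[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# variance here lies almost entirely along the first axis
pc = leading_component([[0.0, 0.0], [1.0, 0.01], [2.0, -0.01], [3.0, 0.02]])
```

Projections onto such directions are the low-dimensional shape parameters that feed the SVM classifiers.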


international conference on pattern recognition | 2004

Probabilistic combination of multiple modalities to detect interest

Ashish Kapoor; Rosalind W. Picard; Yuri A. Ivanov

This paper describes a new approach to combining multiple modalities and applies it to the problem of affect recognition. The problem is posed as a combination of classifiers in a probabilistic framework that naturally explains the concepts of experts and critics. Each channel of data has an associated expert that generates beliefs about the correct class. Probabilistic models of error and the critics, which predict the performance of the expert on the current input, are used to combine the experts' beliefs about the correct class. The method is applied to detect the affective state of interest using information from the face, posture and the task the subjects are performing. The classification using multiple modalities achieves a recognition accuracy of 67.8%, outperforming classification using individual modalities. Further, the proposed combination scheme achieves the greatest reduction in error when compared with other classifier combination methods.
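The expert-critic idea can be sketched as per-input reweighting: each critic predicts its expert's reliability on the current input, and the normalized reliabilities weight the experts' beliefs. The labels, numbers, and function name below are illustrative, not the paper's probabilistic error models:

```python
def combine_with_critics(beliefs, reliabilities):
    """Weight each expert's beliefs by its critic's predicted reliability."""
    z = sum(reliabilities)
    weights = [r / z for r in reliabilities]
    fused = {c: sum(w * b[c] for w, b in zip(weights, beliefs))
             for c in beliefs[0]}
    return max(fused, key=fused.get), fused

label, fused = combine_with_critics(
    [{"interest": 0.8, "other": 0.2},   # face expert
     {"interest": 0.3, "other": 0.7}],  # posture expert
    reliabilities=[0.9, 0.1])           # critics favor the face expert here
```

Because the weights depend on the current input, an expert whose channel is unreliable at the moment (e.g. an occluded face) is automatically discounted.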

Collaboration


Dive into Ashish Kapoor's collaborations.

Top Co-Authors

James Fogarty

University of Washington


Rosalind W. Picard

Massachusetts Institute of Technology
