Minh Hoai Nguyen
Carnegie Mellon University
Publications
Featured research published by Minh Hoai Nguyen.
Affective Computing and Intelligent Interaction | 2009
Jeffrey F. Cohn; Tomas Simon Kruez; Iain A. Matthews; Ying Yang; Minh Hoai Nguyen; Margara Tejera Padilla; Feng Zhou; Fernando De la Torre
Current methods of assessing psychopathology depend almost entirely on verbal report (clinical interview or questionnaire) from patients, their families, or caregivers. They lack systematic and efficient ways of incorporating behavioral observations that are strong indicators of psychological disorder, much of which may occur outside the awareness of either individual. We compared clinical diagnoses of major depression with automatically measured facial actions and vocal prosody in patients undergoing treatment for depression. Manual FACS coding, active appearance modeling (AAM), and pitch extraction were used to measure facial and vocal expression. The classifiers, evaluated with leave-one-out validation, were SVMs for the FACS and AAM features and logistic regression for voice. Both face and voice demonstrated moderate concurrent validity with depression. Accuracy in detecting depression was 88% for manual FACS and 79% for AAM; accuracy for vocal prosody was also 79%. These findings suggest the feasibility of automatic detection of depression, raise new issues in automated facial image analysis and machine learning, and have exciting implications for clinical theory and practice.
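The leave-one-out protocol used to evaluate the classifiers above can be sketched as follows. This is an illustrative toy, not the paper's pipeline: a nearest-centroid classifier stands in for the SVM and logistic-regression models, and the 1-D features and labels are made up.

```python
# Leave-one-out validation: hold out each sample once, train on the rest,
# and average the prediction correctness. The toy nearest-centroid
# classifier below is a stand-in for the actual SVM/logistic models.

def nearest_centroid_predict(train, test_point):
    """Predict the label whose class centroid is closest to test_point."""
    by_label = {}
    for x, y in train:
        by_label.setdefault(y, []).append(x)
    best_label, best_dist = None, float("inf")
    for label, points in by_label.items():
        centroid = sum(points) / len(points)
        dist = abs(test_point - centroid)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def leave_one_out_accuracy(data):
    """Train on all samples but one, test on the held-out sample, repeat."""
    correct = 0
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]
        if nearest_centroid_predict(train, x) == y:
            correct += 1
    return correct / len(data)

# Entirely synthetic 1-D features for two classes (0 and 1).
data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.9, 1), (1.0, 1), (1.1, 1)]
print(leave_one_out_accuracy(data))  # separable toy data -> 1.0
```

Leave-one-out is the natural choice in clinical studies like this one, where the number of patients is small and every sample is needed for training.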
Pattern Recognition | 2010
Minh Hoai Nguyen; Fernando De la Torre
Selecting relevant features for support vector machine (SVM) classifiers is important for a variety of reasons such as generalization performance, computational efficiency, and feature interpretability. Traditional SVM approaches to feature selection typically extract features and learn SVM parameters independently. Independently performing these two steps might result in a loss of information related to the classification process. This paper proposes a convex energy-based framework to jointly perform feature selection and SVM parameter learning for linear and non-linear kernels. Experiments on various databases show significant reduction of features used while maintaining classification performance.
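A minimal analogue of joint feature selection and classifier learning is an L1-penalized linear hinge-loss model, where the sparsity penalty prunes features while the margin term fits the classifier in the same optimization. This is a simplified illustration trained by subgradient descent, not the paper's convex energy-based formulation; the data and hyperparameters are made up.

```python
# L1-regularized hinge loss: the penalty drives weights of irrelevant
# features toward zero while the classifier is being learned.

def train_l1_svm(samples, labels, lam=0.1, lr=0.05, epochs=200):
    dim = len(samples[0])
    w = [0.0] * dim
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            for j in range(dim):
                # Hinge-loss subgradient, active only inside the margin.
                grad = -y * x[j] if margin < 1 else 0.0
                # L1 subgradient: shrink nonzero weights toward zero.
                grad += lam * (1 if w[j] > 0 else -1 if w[j] < 0 else 0)
                w[j] -= lr * grad
    return w

# Feature 0 determines the label; feature 1 is noise.
samples = [(1.0, 0.3), (0.9, -0.2), (-1.0, 0.25), (-1.1, -0.3)]
labels = [1, 1, -1, -1]
w = train_l1_svm(samples, labels)
print(abs(w[0]) > abs(w[1]))  # the noise feature's weight stays small
```

In this toy run the informative feature keeps a large weight while the noise feature is suppressed, which is the qualitative behavior joint selection aims for.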
International Conference on Computer Vision | 2009
Minh Hoai Nguyen; Lorenzo Torresani; Fernando De la Torre; Carsten Rother
Visual categorization problems, such as object classification or action recognition, are increasingly often approached using a detection strategy: a classifier function is first applied to candidate subwindows of the image or the video, and then the maximum classifier score is used for class decision. Traditionally, the subwindow classifiers are trained on a large collection of examples manually annotated with masks or bounding boxes. The reliance on time-consuming human labeling effectively limits the application of these methods to problems involving very few categories. Furthermore, the human selection of the masks introduces arbitrary biases (e.g. in terms of window size and location) which may be suboptimal for classification. In this paper we propose a novel method for learning a discriminative subwindow classifier from examples annotated with binary labels indicating the presence of an object or action of interest, but not its location. During training, our approach simultaneously localizes the instances of the positive class and learns a subwindow SVM to recognize them. We extend our method to classification of time series by presenting an algorithm that localizes the most discriminative set of temporal segments in the signal. We evaluate our approach on several datasets for object and action recognition and show that it achieves results similar and in many cases superior to those obtained with full supervision.
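The core inference step in this weakly supervised setting is a search for the maximum-scoring subwindow. For a time series with a per-frame score, that reduces to finding the contiguous segment with the largest total score, sketched here by brute force on made-up scores (not the paper's actual search procedure, which operates on learned SVM features).

```python
# Find the contiguous segment of a 1-D score sequence with maximum sum,
# i.e. the most discriminative temporal subwindow under a linear model.

def best_segment(scores):
    """Return (start, end, value) of the max-sum contiguous segment."""
    best = (0, 0, float("-inf"))
    for i in range(len(scores)):
        total = 0.0
        for j in range(i, len(scores)):
            total += scores[j]
            if total > best[2]:
                best = (i, j, total)
    return best

# Per-frame classifier scores: positive where the action is present.
scores = [-1.0, -0.5, 2.0, 3.0, 1.5, -2.0, -0.5]
print(best_segment(scores))  # -> (2, 4, 6.5)
```

During training, localization of this kind is alternated with classifier updates: the best subwindow under the current model selects the positive training regions for the next round.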
Computer Vision and Pattern Recognition | 2010
Tomas Simon; Minh Hoai Nguyen; Fernando De la Torre; Jeffrey F. Cohn
Automatic facial action unit (AU) detection from video is a long-standing problem in computer vision. Two main approaches have been pursued: (1) static modeling — typically posed as a discriminative classification problem in which each video frame is evaluated independently; (2) temporal modeling — frames are segmented into sequences and typically modeled with a variant of dynamic Bayesian networks. We propose a segment-based approach, kSeg-SVM, that incorporates benefits of both approaches and avoids their limitations. kSeg-SVM is a temporal extension of the spatial bag-of-words representation. kSeg-SVM is trained within a structured output SVM framework that formulates AU detection as a problem of detecting temporal events in a time series of visual features. Each segment is modeled by a variant of the BoW representation with soft assignment of the words based on similarity. Our framework has several benefits for AU detection: (1) both dependencies between features and the length of action units are modeled; (2) all possible segments of the video may be used for training; and (3) no assumptions are required about the underlying structure of the action unit events (e.g., i.i.d.). Our algorithm finds the best k-or-fewer segments that maximize the SVM score. Experimental results suggest that the proposed method outperforms state-of-the-art static methods for AU detection.
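The soft-assignment bag-of-words step can be illustrated in isolation: each feature votes for every codebook word in proportion to its similarity, rather than being hard-assigned to its nearest word. A minimal sketch with a Gaussian similarity on scalar features — the codebook, features, and bandwidth are made up, and the paper's actual features are high-dimensional visual descriptors.

```python
import math

def soft_bow(features, codebook, sigma=1.0):
    """Histogram over codebook words with similarity-weighted soft votes."""
    hist = [0.0] * len(codebook)
    for f in features:
        sims = [math.exp(-((f - w) ** 2) / (2 * sigma ** 2)) for w in codebook]
        norm = sum(sims)
        for k, s in enumerate(sims):
            hist[k] += s / norm  # each feature's votes sum to 1
    return hist

codebook = [0.0, 5.0, 10.0]
features = [0.2, 0.1, 4.8, 5.1]
hist = soft_bow(features, codebook)
print([round(h, 2) for h in hist])
```

Soft assignment makes the histogram vary smoothly with the input features, which is what allows nearby segments to receive similar BoW descriptors.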
Computer Vision and Pattern Recognition | 2008
F. De la Torre; Minh Hoai Nguyen
Parameterized appearance models (PAMs) (e.g. eigen-tracking, active appearance models, morphable models) use principal component analysis (PCA) to model the shape and appearance of objects in images. Given a new image with an unknown appearance/shape configuration, PAMs can detect and track the object by optimizing the model's parameters that best match the image. While PAMs have numerous advantages for image registration relative to alternative approaches, they suffer from two major limitations: First, PCA cannot model non-linear structure in the data. Second, learning PAMs requires precise manually labeled training data. This paper proposes parameterized kernel principal component analysis (PKPCA), an extension of PAMs that uses Kernel PCA (KPCA) for learning a non-linear appearance model invariant to rigid and/or non-rigid deformations. We demonstrate improved performance in supervised and unsupervised image registration, and present a novel application to improve the quality of manual landmarks in faces. In addition, we suggest a clean and effective matrix formulation for PKPCA.
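Plain KPCA, the building block PKPCA extends, can be sketched in a few lines: build an RBF Gram matrix, center it in feature space, and project onto the leading eigenvectors. This is a NumPy-based toy on made-up 2-D points, not the parameterized extension proposed in the paper.

```python
import numpy as np

def kernel_pca(X, n_components=2, gamma=1.0):
    """Project rows of X onto the top kernel principal components."""
    sq = np.sum(X ** 2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))  # RBF Gram
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one  # center in feature space
    vals, vecs = np.linalg.eigh(Kc)             # ascending eigenvalues
    idx = np.argsort(vals)[::-1][:n_components]
    # Scale eigenvectors so projections have unit-norm feature-space axes.
    alphas = vecs[:, idx] / np.sqrt(np.maximum(vals[idx], 1e-12))
    return Kc @ alphas  # projections of the training points

X = np.array([[0.0, 0.0], [0.1, 0.0], [2.0, 2.0], [2.1, 2.0]])
Z = kernel_pca(X, n_components=1)
# Nearby inputs receive nearby projections.
print(abs(Z[0, 0] - Z[1, 0]) < abs(Z[0, 0] - Z[2, 0]))
```

The kernel trick is what lets the appearance model capture non-linear structure that linear PCA, and hence standard PAMs, cannot.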
IEEE International Conference on Automatic Face & Gesture Recognition | 2008
Minh Hoai Nguyen; J. Perez; F. De la Torre
Automatic facial feature localization has been a long-standing challenge in computer vision. This can be explained by the large variation a face in an image can exhibit due to factors such as position, facial expression, pose, illumination, and background clutter. Support Vector Machines (SVMs) have been a popular statistical tool for facial feature detection. Traditional SVM approaches to facial feature detection typically extract features from images (e.g. multiband filters, SIFT features) and then learn the SVM parameters. Learning features and SVM parameters independently might result in a loss of information related to the classification process. This paper proposes an energy-based framework to jointly perform relevant feature weighting and SVM parameter learning. Preliminary experiments on standard face databases show a significant improvement in speed with our approach.
Computer Vision and Pattern Recognition | 2008
Minh Hoai Nguyen; F. De la Torre
Parameterized appearance models (PAMs) (e.g. eigen-tracking, active appearance models, morphable models) are commonly used to model the appearance and shape variation of objects in images. While PAMs have numerous advantages relative to alternative approaches, they have at least two drawbacks. First, they are especially prone to local minima in the fitting process. Second, often few if any of the local minima of the cost function correspond to acceptable solutions. To solve these problems, this paper proposes a method to learn a cost function whose local minima occur at, and only at, the locations corresponding to the correct fitting parameters. To the best of our knowledge, this is the first paper to address the problem of learning a cost function that explicitly models local properties of the error surface when fitting PAMs. Synthetic and real examples show improved alignment performance in comparison with traditional approaches.
Computer Graphics Forum | 2008
Minh Hoai Nguyen; Jean-François Lalonde; Alexei A. Efros; Fernando De la Torre
Many categories of objects, such as human faces, can naturally be viewed as a composition of several layers. For example, a bearded face with glasses can be decomposed into three layers: a layer for the glasses, a layer for the beard, and a layer for the other permanent facial features. While modeling such a face with a linear subspace model can be very difficult, layer separation allows certain structures to be modeled and modified easily while leaving others unchanged. In this paper, we present a method for automatic layer extraction and its applications to face synthesis and editing. Layers are automatically extracted by exploiting the differences between subspaces and are modeled separately. We show that our method can be used for tasks such as beard removal (virtual shaving), beard synthesis, and beard transfer, among others.
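The subspace-difference idea can be illustrated with a toy residual computation: given a linear subspace spanning, say, beard-free variation, the component of a new face outside that subspace approximates the extra layer. A NumPy sketch on made-up short vectors (real faces are high-dimensional images, and this is not the paper's full extraction method).

```python
import numpy as np

def extract_layer(face, basis):
    """Residual of face after least-squares projection onto basis columns."""
    coeffs, *_ = np.linalg.lstsq(basis, face, rcond=None)
    return face - basis @ coeffs

basis = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 1.0],
                  [0.0, 0.0]])  # toy span of "beard-free" variation
# A "face" = something in the subspace plus an out-of-subspace layer.
face = basis @ np.array([2.0, 3.0]) + np.array([0.0, 0.0, 0.0, 4.0])
layer = extract_layer(face, basis)
print(np.round(layer, 6))  # recovers the out-of-subspace component
```

Because the added layer here is orthogonal to the subspace, the residual recovers it exactly; with real images the separation is only approximate and needs the modeling machinery the paper describes.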
IEEE International Conference on Automatic Face & Gesture Recognition | 2008
Minh Hoai Nguyen; F. De la Torre
Active appearance models (AAMs) have been extensively used for face alignment during the last 20 years. While AAMs have numerous advantages relative to alternative approaches, they suffer from two major drawbacks: (i) AAMs are especially prone to local minima in the fitting process; (ii) few if any of the local minima of the cost function correspond to acceptable solutions. To mitigate these problems, this paper proposes a method to learn a fitting cost function whose local minima occur at, and only at, the locations corresponding to the correct fitting parameters. The paper explores two methods to parameterize the cost function: pixel weighting and subspace learning. Experiments on synthetic and real data show the effectiveness of our approach for face alignment.
Australian Joint Conference on Artificial Intelligence | 2006
Minh Hoai Nguyen; Wayne Wobcke
SharedPlans is an agent teamwork model that provides a formalization of the conditions under which a group of agents has a collaborative plan. This paper describes a general framework for implementing SharedPlans theory that addresses the computational issues of team formation, group plan elaboration and plan execution, involving coordination, communication and monitoring. The framework includes a team-oriented programming language for specifying recipes for SharedPlans, and an extension to a BDI architecture with several meta-plans for interpreting the plan language. We indicate how the formal requirements for the establishment of SharedPlans are fulfilled within the framework.