Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Munawar Hayat is active.

Publication


Featured research published by Munawar Hayat.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2015

Deep Reconstruction Models for Image Set Classification

Munawar Hayat; Mohammed Bennamoun; Senjian An

Image set classification finds application in a number of real-life scenarios, such as classification from surveillance videos, multi-view camera networks and personal albums. Compared with single-image-based classification, it offers more promise and has therefore attracted significant research attention in recent years. Unlike many existing methods, which assume the images of a set to lie on a certain geometric surface, this paper introduces a deep learning framework which makes no such prior assumption and can automatically discover the underlying geometric structure. Specifically, a Template Deep Reconstruction Model (TDRM) is defined whose parameters are initialized by performing unsupervised pre-training in a layer-wise fashion using Gaussian Restricted Boltzmann Machines (GRBMs). The initialized TDRM is then trained separately on the images of each class, and class-specific DRMs are learnt. Based on the minimum reconstruction errors from the learnt class-specific models, three different voting strategies are devised for classification. Extensive experiments demonstrate the efficacy of the proposed framework for face and object recognition from image sets, and the results show that it consistently outperforms existing state-of-the-art methods.
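
The decision rule at the heart of this framework is easy to sketch. Below is a minimal illustration of minimum-reconstruction-error voting, with PCA subspaces standing in for the paper's class-specific deep reconstruction models; the function names and toy data are illustrative, not the authors' code.

```python
import numpy as np
from sklearn.decomposition import PCA

def fit_class_models(class_images, n_components=8):
    """One reconstruction model per class; class_images: dict label -> (n, d)."""
    return {label: PCA(n_components=n_components).fit(X)
            for label, X in class_images.items()}

def classify_image_set(models, query_set):
    """Each query image votes for the class whose model reconstructs it
    with the smallest error; the majority vote labels the whole set."""
    votes = []
    for x in query_set:
        errs = {label: np.linalg.norm(x - pca.inverse_transform(
                    pca.transform(x[None, :])))
                for label, pca in models.items()}
        votes.append(min(errs, key=errs.get))
    labels, counts = np.unique(votes, return_counts=True)
    return labels[np.argmax(counts)]

# Toy usage: two classes of 64-dim "images"; the query set is from class 1.
rng = np.random.default_rng(0)
train = {0: rng.normal(0, 1, (100, 64)), 1: rng.normal(3, 1, (100, 64))}
models = fit_class_models(train)
print(classify_image_set(models, rng.normal(3, 1, (10, 64))))  # expected: 1
```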


Computer Vision and Pattern Recognition | 2014

Learning Non-linear Reconstruction Models for Image Set Classification

Munawar Hayat; Mohammed Bennamoun; Senjian An

We propose a deep learning framework for image set classification with application to face recognition. An Adaptive Deep Network Template (ADNT) is defined whose parameters are initialized by performing unsupervised pre-training in a layer-wise fashion using Gaussian Restricted Boltzmann Machines (GRBMs). The pre-initialized ADNT is then trained separately on the images of each class, and class-specific models are learnt. Based on the minimum reconstruction error from the learnt class-specific models, a majority voting strategy is used for classification. The proposed framework is extensively evaluated for image-set-based face recognition on the Honda/UCSD, CMU MoBo, YouTube Celebrities and a Kinect dataset. Our experimental results and comparisons with existing state-of-the-art methods show that the proposed method consistently achieves the best performance on all these datasets.
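
The GRBM-based layer-wise pre-training that initializes the ADNT follows a standard greedy recipe. The sketch below shows CD-1 training of a Gaussian RBM with unit-variance visible units and the stacking step; all hyperparameters are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_grbm(data, n_hidden, lr=1e-3, epochs=10, seed=0):
    """CD-1 training of a Gaussian-Bernoulli RBM (unit-variance visible units)."""
    rng = np.random.default_rng(seed)
    n, d = data.shape
    W = 0.01 * rng.standard_normal((d, n_hidden))
    b, c = np.zeros(d), np.zeros(n_hidden)
    for _ in range(epochs):
        h0 = sigmoid(data @ W + c)                 # hidden probabilities
        h0_s = (rng.random(h0.shape) < h0) * 1.0   # sampled hidden states
        v1 = h0_s @ W.T + b                        # Gaussian mean reconstruction
        h1 = sigmoid(v1 @ W + c)
        W += lr * (data.T @ h0 - v1.T @ h1) / n    # CD-1 gradient estimate
        b += lr * (data - v1).mean(axis=0)
        c += lr * (h0 - h1).mean(axis=0)
    return W, b, c

def pretrain_stack(data, layer_sizes):
    """Greedy layer-wise pre-training: each RBM's hidden probabilities
    become the training data for the next layer."""
    params, x = [], data
    for n_hidden in layer_sizes:
        W, b, c = train_grbm(x, n_hidden)
        params.append((W, b, c))
        x = sigmoid(x @ W + c)
    return params

# Toy usage: pre-train a 64-100-50 stack on random data.
stack = pretrain_stack(np.random.default_rng(1).standard_normal((200, 64)), [100, 50])
print([W.shape for W, _, _ in stack])  # [(64, 100), (100, 50)]
```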


Pattern Recognition | 2016

A Two-Phase Weighted Collaborative Representation for 3D partial face recognition with single sample

Yinjie Lei; Yulan Guo; Munawar Hayat; Mohammed Bennamoun; Xinzhi Zhou

3D face recognition with only partial data (missing parts, occlusions and data corruptions) and a single training sample available is a highly challenging task. This paper presents an efficient 3D face recognition approach to address this challenge. We represent a facial scan with a set of local Keypoint-based Multiple Triangle Statistics (KMTS), which is robust to partial facial data, large facial expressions and pose variations. To address the single-sample problem, we then propose a Two-Phase Weighted Collaborative Representation Classification (TPWCRC) framework. A class-based probability estimation is first calculated from the extracted local descriptors as prior knowledge. This estimation is then incorporated into the proposed classification framework as a locality constraint to further enhance its discriminating power. Experimental results on six challenging 3D facial datasets show that the proposed KMTS-TPWCRC framework achieves promising results for human face recognition with missing parts, occlusions, data corruptions, expressions and pose variations.

Highlights:
- Novel Keypoint-based Multiple Triangle Statistics (KMTS) are proposed for 3D face representation.
- The proposed local descriptor is robust to partial facial data and expression/pose variations.
- A Two-Phase Weighted Collaborative Representation Classification (TPWCRC) framework is used to perform face recognition.
- The proposed classification framework can effectively address the single sample problem.
- State-of-the-art performance on six challenging datasets with high efficiency is achieved.
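
The second phase builds on collaborative representation with a weighted regularizer. Below is a minimal sketch, assuming a diagonal weight derived from the phase-one class probabilities and a standard ridge-style closed-form solution; the names and toy prior are illustrative, not the paper's exact formulation.

```python
import numpy as np

def wcrc_classify(D, labels, y, prior, lam=0.01):
    """D: (d, n) dictionary with training samples as columns; labels: (n,);
    y: (d,) probe; prior: (n,) phase-one class probability per atom
    (likely atoms get a small penalty, unlikely atoms a large one)."""
    G = np.diag(1.0 - prior)                 # locality-style weight matrix
    A = D.T @ D + lam * (G.T @ G)
    a = np.linalg.solve(A, D.T @ y)          # closed-form coding vector
    residuals = {c: np.linalg.norm(y - D[:, labels == c] @ a[labels == c])
                 for c in np.unique(labels)}
    return min(residuals, key=residuals.get) # smallest class-wise residual wins

# Toy usage: a probe near a class-0 atom, with phase one favouring class 0.
rng = np.random.default_rng(2)
D = rng.standard_normal((32, 20))
labels = np.repeat([0, 1], 10)
y = D[:, 3] + 0.1 * rng.standard_normal(32)
prior = np.where(labels == 0, 0.8, 0.2)
print(wcrc_classify(D, labels, y, prior))    # expected: 0
```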


IEEE Transactions on Neural Networks and Learning Systems | 2018

Cost-Sensitive Learning of Deep Feature Representations From Imbalanced Data

Salman Hameed Khan; Munawar Hayat; Mohammed Bennamoun; Ferdous Ahmed Sohel; Roberto Togneri

Class imbalance is a common problem in real-world object detection and classification tasks. Data of some classes are abundant, making them an overrepresented majority, while data of other classes are scarce, making them an underrepresented minority. This imbalance makes it challenging for a classifier to appropriately learn the discriminating boundaries between the majority and minority classes. In this paper, we propose a cost-sensitive (CoSen) deep neural network which can automatically learn robust feature representations for both the majority and minority classes. During training, our learning procedure jointly optimizes the class-dependent costs and the neural network parameters. The proposed approach is applicable to both binary and multiclass problems without any modification. Moreover, as opposed to data-level approaches, we do not alter the original data distribution, which results in a lower computational cost during training. We report the results of our experiments on six major image classification datasets and show that the proposed approach significantly outperforms the baseline algorithms. Comparisons with popular data sampling techniques and CoSen classifiers demonstrate the superior performance of our proposed method.
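
The core idea is simple to state as a loss function: a per-class cost vector rescales each sample's cross-entropy so minority classes contribute more to the gradient. The sketch below fixes the costs to inverse class frequency purely for illustration, whereas the paper learns them jointly with the network parameters.

```python
import numpy as np

def cost_weighted_xent(logits, targets, class_costs):
    """logits: (n, k); targets: (n,) int labels; class_costs: (k,)."""
    z = logits - logits.max(axis=1, keepdims=True)        # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    nll = -log_probs[np.arange(len(targets)), targets]
    return (class_costs[targets] * nll).mean()            # cost-scaled loss

# Toy usage: class 1 is rare, so mistakes on it are penalized more heavily.
counts = np.array([900, 100])
costs = counts.sum() / (len(counts) * counts)             # inverse-frequency costs
logits = np.array([[2.0, 0.5], [0.2, 0.1]])
targets = np.array([0, 1])
print(cost_weighted_xent(logits, targets, costs))
```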


IEEE Transactions on Affective Computing | 2014

An Automatic Framework for Textured 3D Video-Based Facial Expression Recognition

Munawar Hayat; Mohammed Bennamoun

Most of the existing research on 3D facial expression recognition has been done using static 3D meshes. 3D videos of a face are believed to contain more information in terms of the facial dynamics, which are critical for expression recognition. This paper presents a fully automatic framework which exploits the dynamics of textured 3D videos to recognize six discrete facial expressions. Local video patches of variable lengths are extracted from numerous locations of the training videos and represented as points on the Grassmannian manifold. An efficient graph-based spectral clustering algorithm is used to cluster these points separately for every expression class. Using a valid Grassmannian kernel function, the resulting cluster centers are embedded into a Reproducing Kernel Hilbert Space (RKHS) where six binary SVM models are learnt. Given a query video, we extract video patches from it, represent them as points on the manifold and match these points against the learnt SVM models, followed by a voting-based strategy to decide the class of the query video. The proposed framework is also applied in parallel to 2D videos, and a score-level fusion of the 2D and 3D results is performed to improve the performance of the system. Experimental results on the BU-4DFE dataset show that the system achieves a very high classification accuracy for facial expression recognition from 3D videos.
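
To make the manifold machinery concrete, the sketch below represents each video patch as an orthonormal basis (a point on the Grassmannian) and uses the projection kernel k(X, Y) = ||X^T Y||_F^2, a standard positive-definite Grassmannian kernel, with a precomputed-kernel SVM; whether this is the paper's exact kernel is an assumption, and the random data only demonstrate the API.

```python
import numpy as np
from sklearn.svm import SVC

def to_grassmann_point(patch_matrix, k=5):
    """Orthonormal basis of the patch's top-k subspace: one manifold point."""
    U, _, _ = np.linalg.svd(patch_matrix, full_matrices=False)
    return U[:, :k]

def projection_kernel(points_a, points_b):
    """k(X, Y) = ||X^T Y||_F^2 between two lists of orthonormal bases."""
    return np.array([[np.linalg.norm(X.T @ Y) ** 2 for Y in points_b]
                     for X in points_a])

# Toy usage (random data; this only demonstrates the API): 20 "video
# patches" of 60-dim frames, then a precomputed-kernel SVM over them.
rng = np.random.default_rng(3)
pts = [to_grassmann_point(rng.standard_normal((60, 12))) for _ in range(20)]
y = np.array([0] * 10 + [1] * 10)
svm = SVC(kernel="precomputed").fit(projection_kernel(pts, pts), y)
print(svm.predict(projection_kernel(pts[:2], pts)))
```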


European Conference on Computer Vision | 2014

Reverse training: An efficient approach for image set classification

Munawar Hayat; Mohammed Bennamoun; Senjian An

This paper introduces a new approach, called reverse training, to efficiently extend binary classifiers to the task of multi-class image set classification. Unlike existing binary-to-multi-class extension strategies, which require multiple binary classifiers, the proposed approach is very efficient since it trains a single binary classifier to optimally discriminate the class of the query image set from all others. For this purpose, the classifier is trained with the images of the query set (labelled positive) and a randomly sampled subset of the training data (labelled negative). The trained classifier is then evaluated on the rest of the training images. The class whose images have the largest percentage classified as positive is predicted as the class of the query image set. The confidence level of the prediction is also computed and integrated into the proposed approach to further enhance its robustness and accuracy. Extensive experiments and comparisons with existing methods show that the proposed approach achieves state-of-the-art performance for face and object recognition on a number of datasets.
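
The procedure the abstract describes translates almost directly into code. A minimal sketch, with the negative-sampling ratio and the linear SVM as illustrative choices:

```python
import numpy as np
from sklearn.svm import LinearSVC

def reverse_train_classify(train_X, train_y, query_set, neg_frac=0.3, seed=0):
    """Train one binary classifier: query set (positive) vs. a random subset
    of the training data (negative); evaluate on the held-out remainder."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(train_X))
    n_neg = int(neg_frac * len(train_X))
    neg, rest = idx[:n_neg], idx[n_neg:]
    X = np.vstack([query_set, train_X[neg]])
    y = np.r_[np.ones(len(query_set)), np.zeros(n_neg)]
    preds = LinearSVC(max_iter=10000).fit(X, y).predict(train_X[rest])
    classes = np.unique(train_y)
    frac_pos = [preds[train_y[rest] == c].mean() for c in classes]
    return classes[int(np.argmax(frac_pos))]  # largest positive fraction wins

# Toy usage: two well-separated classes; the query set comes from class 1.
rng = np.random.default_rng(4)
train_X = np.vstack([rng.normal(0, 1, (50, 16)), rng.normal(4, 1, (50, 16))])
train_y = np.repeat([0, 1], 50)
query = rng.normal(4, 1, (8, 16))
print(reverse_train_classify(train_X, train_y, query))  # expected: 1
```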


IEEE Transactions on Image Processing | 2016

A Discriminative Representation of Convolutional Features for Indoor Scene Recognition

Salman Hameed Khan; Munawar Hayat; Mohammed Bennamoun; Roberto Togneri; Ferdous Ahmed Sohel

Indoor scene recognition is a multi-faceted and challenging problem due to the diverse intra-class variations and the confusing inter-class similarities that characterize such scenes. This paper presents a novel approach that exploits rich mid-level convolutional features to categorize indoor scenes. Traditional convolutional features retain the global spatial structure, which is a desirable property for general object recognition. We, however, argue that the structure-preserving property of the convolutional neural network activations is not of substantial help in the presence of large variations in scene layouts, e.g., in indoor scenes. We propose to transform the structured convolutional activations to another highly discriminative feature space. The representation in the transformed space not only incorporates the discriminative aspects of the target data set but also encodes the features in terms of the general object categories that are present in indoor scenes. To this end, we introduce a new large-scale data set of 1300 object categories that are commonly present in indoor scenes. Our proposed approach achieves a significant performance boost over the previous state-of-the-art approaches on five major scene classification data sets.
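
One way to read the proposed transformation is as re-encoding local convolutional descriptors against a bank of object-category prototypes. The sketch below is a loose interpretation under that assumption (cosine similarity plus max-pooling over locations), not the paper's exact transformation.

```python
import numpy as np

def encode_scene(feature_map, prototypes):
    """feature_map: (C, H, W) conv activations; prototypes: (P, C) object-
    category vectors. Returns a (P,) layout-insensitive scene descriptor."""
    C, H, W = feature_map.shape
    locs = feature_map.reshape(C, H * W).T                 # (HW, C) local descriptors
    locs = locs / (np.linalg.norm(locs, axis=1, keepdims=True) + 1e-8)
    protos = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + 1e-8)
    sims = locs @ protos.T                                 # cosine similarities
    return sims.max(axis=0)                                # max-pool over locations

# Toy usage: a 256-channel 7x7 activation map against 1300 category prototypes.
rng = np.random.default_rng(5)
desc = encode_scene(rng.standard_normal((256, 7, 7)), rng.standard_normal((1300, 256)))
print(desc.shape)  # (1300,)
```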


Neurocomputing | 2016

An RGB-D based image set classification for robust face recognition from Kinect data

Munawar Hayat; Mohammed Bennamoun; Amar A. El-Sallam

The paper proposes a method for robust face recognition from low-quality Kinect-acquired images, which exhibit a wide range of variations in head pose, illumination, facial expression, sunglass disguise and occlusion by hands. Multiple Kinect images of a person are considered as an image set, and face recognition from these images is formulated as an RGB-D image set classification problem. The raw depth data acquired by the Kinect are used for pose estimation and automatic cropping of the face region. Based upon the estimated poses, the face images of a set are divided into multiple image subsets. An efficient block-based covariance matrix representation is proposed to model the images of an image subset as points on a Riemannian manifold (Lie group). For classification, SVM models are learnt separately for each image subset on the Lie group, and a fusion strategy is introduced to combine the results from all image subsets. The proposed technique has been evaluated on a combination of three large datasets containing over 35,000 RGB-D images under challenging conditions. The proposed RGB-D based image set classification incurs a low computational cost and achieves an identification rate as high as 99.5%.
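
The block-based covariance representation can be sketched compactly: per-block pixel features are summarized by a covariance matrix, a point on the SPD manifold, which the matrix logarithm flattens into a vector usable by ordinary classifiers. The 4x4 grid and the (x, y, intensity) feature choice below are illustrative assumptions, not the paper's exact feature set.

```python
import numpy as np

def spd_log(cov):
    """Matrix logarithm of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(cov)
    return (V * np.log(w)) @ V.T

def block_covariances(image, grid=4, eps=1e-5):
    """image: (H, W) grayscale. One log-covariance vector per block, using
    (x, y, intensity) as the per-pixel feature."""
    H, W = image.shape
    ys, xs = np.mgrid[0:H, 0:W]
    vecs = []
    for by in np.array_split(np.arange(H), grid):
        for bx in np.array_split(np.arange(W), grid):
            feats = np.stack([xs[np.ix_(by, bx)].ravel(),
                              ys[np.ix_(by, bx)].ravel(),
                              image[np.ix_(by, bx)].ravel()])
            cov = np.cov(feats) + eps * np.eye(3)          # keep it positive definite
            vecs.append(spd_log(cov)[np.triu_indices(3)])  # log-Euclidean flattening
    return np.concatenate(vecs)

# Toy usage: a random 32x32 "face crop" -> 16 blocks x 6 values = 96-dim.
print(block_covariances(np.random.default_rng(6).random((32, 32))).shape)
```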


Workshop on Applications of Computer Vision | 2013

Clustering of video-patches on Grassmannian manifold for facial expression recognition from 3D videos

Munawar Hayat; Mohammed Bennamoun; Amar A. El-Sallam

This paper presents a fully automatic system which exploits the dynamics of 3D videos and is capable of recognizing six basic facial expressions. Local video patches of variable lengths are extracted from different locations of the training videos and represented as points on the Grassmannian manifold. An efficient spectral-clustering-based algorithm is used to cluster these points separately for each of the six expression classes. The resulting cluster centers are matched with the points of a test video, and a voting-based strategy is used to decide the expression class of the test video. The proposed system is tested on the largest publicly available 3D video database, BU-4DFE. The experimental results show that the system achieves a very high classification accuracy and outperforms the current state-of-the-art algorithms for facial expression recognition from 3D videos.
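
The clustering step can be sketched with an off-the-shelf spectral clusterer once a Grassmannian affinity is fixed. Using the projection similarity ||X^T Y||_F^2 as the affinity is an assumption; the paper's exact graph construction may differ.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def cluster_grassmann_points(points, n_clusters):
    """points: list of (d, k) orthonormal bases; returns cluster labels."""
    n = len(points)
    A = np.array([[np.linalg.norm(points[i].T @ points[j]) ** 2
                   for j in range(n)] for i in range(n)])  # projection affinity
    return SpectralClustering(n_clusters=n_clusters,
                              affinity="precomputed").fit_predict(A)

# Toy usage: subspaces drawn from two different 40-dim distributions.
rng = np.random.default_rng(7)
def basis(mean):  # hypothetical patch -> subspace step, for illustration
    return np.linalg.qr(rng.normal(mean, 1.0, (40, 6)))[0]
pts = [basis(0.0) for _ in range(10)] + [basis(2.0) for _ in range(10)]
print(cluster_grassmann_points(pts, 2))
```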


International Conference on Human System Interactions | 2012

Evaluation of Spatiotemporal Detectors and Descriptors for Facial Expression Recognition

Munawar Hayat; Mohammed Bennamoun; Amar A. El-Sallam

Local spatiotemporal detectors and descriptors have recently become very popular for video analysis in many applications. They do not require any preprocessing steps and are invariant to spatial and temporal scales. Despite their computational simplicity, they have not previously been evaluated for video analysis of facial data. This paper considers two space-time detectors and four descriptors and uses a bag-of-features framework for facial expression recognition on the BU-4DFE dataset. A comparison of local spatiotemporal features with other, non-spatiotemporal techniques published on the same dataset is also given. Unlike spatiotemporal features, these techniques involve time-consuming and computationally intensive preprocessing steps such as manual initialization and tracking of facial points. Our results show that, despite being fully automatic and requiring no user intervention, local space-time features provide promising performance, comparable to these techniques, for facial expression recognition on the BU-4DFE dataset.
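
The bag-of-features pipeline the evaluation builds on is standard and easy to sketch: quantize local spatiotemporal descriptors against a k-means codebook, turn each video into a normalized histogram of codeword counts, and classify with an SVM. Descriptor extraction itself is stubbed out with random data below.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def build_codebook(all_descriptors, k=50, seed=0):
    """Cluster pooled local descriptors into k visual words."""
    return KMeans(n_clusters=k, n_init=10, random_state=seed).fit(all_descriptors)

def bof_histogram(descriptors, codebook):
    """Quantize one video's descriptors and return a normalized histogram."""
    words = codebook.predict(descriptors)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / hist.sum()

# Toy usage: 20 "videos", each a bag of 100 random 32-dim descriptors.
rng = np.random.default_rng(8)
videos = [rng.normal(c, 1, (100, 32)) for c in [0, 2] for _ in range(10)]
labels = np.repeat([0, 1], 10)
cb = build_codebook(np.vstack(videos))
X = np.array([bof_histogram(v, cb) for v in videos])
clf = SVC(kernel="linear").fit(X, labels)
print(clf.score(X, labels))
```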

Collaboration


Dive into Munawar Hayat's collaborations.

Top Co-Authors


Mohammed Bennamoun
University of Western Australia

Salman Hameed Khan
University of Western Australia

Senjian An
University of Western Australia

Amar A. El-Sallam
University of Western Australia