Farhood Negin
French Institute for Research in Computer Science and Automation
Publications
Featured research published by Farhood Negin.
The Visual Computer | 2018
Saeid Agahian; Farhood Negin; Cemal Köse
Over the last few decades, human action recognition has become one of the most challenging tasks in the field of computer vision. Effortless and accurate extraction of 3D skeleton information has recently become possible thanks to economical depth sensors and state-of-the-art deep learning approaches. In this study, we introduce a novel bag-of-poses framework for action recognition using 3D skeleton data. Our assumption is that any action can be represented by a set of predefined spatiotemporal poses. The pose descriptor is composed of three parts: the first is the concatenation of the normalized coordinates of the skeleton joints; the second consists of the temporal displacement of the joints computed with a predefined temporal offset; and the third is the temporal displacement with respect to the previous frame in the sequence. To generate the key poses, we apply K-means clustering over all training pose descriptors of the dataset. An SVM classifier is trained on the generated key poses to classify each action pose, and every action in the dataset is then encoded as a key-pose histogram. An ELM classifier is used for action recognition due to its fast, accurate and reliable performance compared to other classifiers. The proposed framework is validated on five publicly available benchmark 3D action datasets, achieving state-of-the-art results on three of them and competitive results on the other two.
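As a rough illustration of this encoding pipeline, the sketch below builds the three-part pose descriptor and the key-pose histogram; the normalization scheme, temporal offset and cluster count are assumptions rather than the paper's exact settings, and nearest-centroid assignment stands in for the paper's SVM pose-classification step.

```python
# Minimal sketch of the bag-of-poses encoding (hypothetical settings).
import numpy as np
from sklearn.cluster import KMeans

def pose_descriptor(joints, t, offset=5):
    """joints: (T, J, 3) array of 3D skeleton joints for one sequence."""
    cur = joints[t]
    cur = (cur - cur.mean(axis=0)) / (np.linalg.norm(cur.std(axis=0)) + 1e-8)  # normalized coordinates
    disp_offset = joints[t] - joints[max(t - offset, 0)]  # displacement w.r.t. a fixed temporal offset
    disp_prev   = joints[t] - joints[max(t - 1, 0)]       # displacement w.r.t. the previous frame
    return np.concatenate([cur.ravel(), disp_offset.ravel(), disp_prev.ravel()])

def key_pose_histogram(seq_descriptors, kmeans):
    """Encode one action as a normalized histogram over the learned key poses."""
    labels = kmeans.predict(seq_descriptors)  # nearest-centroid stand-in for the SVM step
    hist = np.bincount(labels, minlength=kmeans.n_clusters).astype(float)
    return hist / (hist.sum() + 1e-8)

# Key poses come from K-means over all stacked training descriptors, e.g.:
# kmeans = KMeans(n_clusters=128).fit(train_descriptors)
# The resulting histograms are then fed to the final classifier (ELM in the paper).
```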
advanced video and signal based surveillance | 2016
Farhood Negin; Michal Koperski; Carlos Fernando Crispim; Francois Bremond; Serhan Cosar; Konstantinos Avgerinakis
Many supervised approaches report state-of-the-art results for recognizing short-term actions in manually clipped videos by exploiting fine body-motion information. The main downside of these approaches is that they are not applicable in real-world settings; the challenge is different for unstructured scenes and long-term videos. Unsupervised approaches have been used to model long-term activities, but their main pitfall is a limited ability to handle subtle differences between similar activities, since they mostly rely on global motion information. In this paper, we present a hybrid approach for long-term human activity recognition that recognizes activities more precisely than unsupervised approaches. It enables the processing of long-term videos by automatically clipping them and performing online recognition. The performance of our approach has been tested on two Activities of Daily Living (ADL) datasets, with promising results compared to existing approaches.
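A minimal sketch of the hybrid idea follows, assuming zone transitions serve as the automatic clipping signal; `assign_zone` and `classify_clip` are hypothetical stand-ins for the unsupervised and supervised components, not the paper's API.

```python
# Illustrative sketch: an unsupervised zone model segments the long video
# online, and a supervised classifier labels each resulting clip.
def recognize_long_term(frames, positions, assign_zone, classify_clip):
    """frames: per-frame features; positions: per-frame person positions."""
    labels, clip = [], []
    prev_zone = None
    for frame, pos in zip(frames, positions):
        zone = assign_zone(pos)                  # coarse, unsupervised scene zone
        if prev_zone is not None and zone != prev_zone and clip:
            labels.append(classify_clip(clip))   # fine-grained, supervised recognition
            clip = []
        clip.append(frame)
        prev_zone = zone
    if clip:
        labels.append(classify_clip(clip))       # flush the last open clip
    return labels
```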
Expert Systems With Applications | 2018
Farhood Negin; Pau Rodríguez; Michal Koperski; Adlen Kerboua; Jordi Gonzàlez; Jérémy Bourgeois; Emmanuelle Chapoulie; Philippe Robert; Francois Bremond
The Praxis test is a gesture-based diagnostic test which has been accepted as diagnostically indicative of cortical pathologies such as Alzheimer’s disease. Despite being simple, this test is often skipped by clinicians. In this paper, we propose a novel framework to investigate the potential of static and dynamic upper-body gestures based on the Praxis test, and their potential in a medical framework to automate the test procedures for computer-assisted cognitive assessment of older adults. To carry out gesture recognition as well as correctness assessment of the performances, we collected a novel, challenging RGB-D gesture video dataset recorded with Kinect v2, which contains 29 specific gestures suggested by clinicians and recorded from both experts and patients performing the gesture set. Moreover, we propose a framework to learn the dynamics of upper-body gestures, considering the videos as sequences of short-term clips of gestures. Our approach first uses body-part detection to extract image patches surrounding the hands; then, by means of a fine-tuned convolutional neural network (CNN) model, it learns deep hand features, which are fed to a long short-term memory (LSTM) network to capture the temporal dependencies between video frames. We report the results of four developed methods using different modalities. The experiments show the effectiveness of our deep learning based approach in gesture recognition and performance assessment tasks. Clinicians’ satisfaction with the assessment reports indicates the framework’s potential impact on diagnosis.
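The sketch below shows one plausible shape of such a CNN-LSTM pipeline in PyTorch; the ResNet-18 backbone, feature dimension and hidden size are assumptions, not the paper's configuration.

```python
# Hedged sketch of a CNN + LSTM gesture model: per-frame hand patches go
# through a (fine-tuned) CNN backbone, and an LSTM models the temporal
# dependencies between frames.
import torch.nn as nn
import torchvision.models as models

class HandGestureNet(nn.Module):
    def __init__(self, num_gestures=29, hidden=256):
        super().__init__()
        backbone = models.resnet18(weights=None)  # stand-in for the fine-tuned CNN
        self.cnn = nn.Sequential(*list(backbone.children())[:-1])  # drop the FC head
        self.lstm = nn.LSTM(input_size=512, hidden_size=hidden, batch_first=True)
        self.fc = nn.Linear(hidden, num_gestures)

    def forward(self, patches):                            # patches: (B, T, 3, H, W)
        b, t = patches.shape[:2]
        feats = self.cnn(patches.flatten(0, 1)).flatten(1) # (B*T, 512) deep hand features
        _, (h, _) = self.lstm(feats.view(b, t, -1))        # temporal dependencies
        return self.fc(h[-1])                              # gesture logits
```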
advanced video and signal based surveillance | 2017
Nguyen Thi Lan Anh; Furqan M. Khan; Farhood Negin; Francois Bremond
Appearance-based multi-object tracking (MOT) is a challenging task, especially in complex scenes where objects have similar appearance or are occluded by the background or other objects. Such factors motivate researchers to propose effective trackers that satisfy both real-time processing and object-trajectory recovery criteria. To handle both requirements, we propose a robust online multi-object tracking method that extends features and methods proposed for re-identification to MOT. The proposed tracker combines a local and a global tracker in a comprehensive two-step framework. In the local tracking step, we use frame-to-frame association to generate online object trajectories. Each object trajectory, called a tracklet, is represented by a set of multi-modal feature distributions modeled by GMMs. In the global tracking step, occlusions and mis-detections are recovered by a tracklet bipartite-association method based on learning a Mahalanobis metric between GMM components with the KISSME metric learning algorithm. Experiments on two public datasets show that our tracker performs well compared to state-of-the-art tracking algorithms.
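The following sketch illustrates the global step under stated assumptions: KISSME's closed-form metric from the covariances of similar- and dissimilar-pair differences, and a Hungarian solve of the resulting Mahalanobis cost matrix. Per-tracklet features are simplified here to single vectors rather than full GMM components.

```python
# Sketch of KISSME metric learning plus bipartite tracklet association.
import numpy as np
from scipy.optimize import linear_sum_assignment

def kissme_metric(pos_diffs, neg_diffs):
    """KISSME: M = inv(cov of similar-pair diffs) - inv(cov of dissimilar-pair diffs)."""
    sigma_pos = np.cov(pos_diffs, rowvar=False)
    sigma_neg = np.cov(neg_diffs, rowvar=False)
    return np.linalg.inv(sigma_pos) - np.linalg.inv(sigma_neg)

def associate_tracklets(feats_a, feats_b, M):
    """feats_*: (n, d) per-tracklet features; returns matched index pairs."""
    diffs = feats_a[:, None, :] - feats_b[None, :, :]
    cost = np.einsum('ijd,de,ije->ij', diffs, M, diffs)  # squared Mahalanobis distance
    rows, cols = linear_sum_assignment(cost)             # optimal bipartite matching
    return list(zip(rows, cols))
```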
Sensors | 2017
Carlos Fernando Crispim-Junior; Alvaro Gómez Uría; Carola Strumia; Michal Koperski; Alexandra König; Farhood Negin; Serhan Cosar; Anh Tuan Nghiem; Duc Phu Chau; Guillaume Charpiat; Francois Bremond
Visual activity recognition plays a fundamental role in several research fields as a way to extract semantic meaning from images and videos. Prior work has mostly focused on classification tasks, where a label is given for a video clip. However, real-life scenarios require a method to browse a continuous video stream, automatically identify relevant temporal segments and classify them according to target activities. This paper proposes a knowledge-driven event recognition framework to address this problem. The novelty of the method lies in the combination of a constraint-based ontology language for event modeling with robust algorithms to detect, track and re-identify people using color-depth sensing (Kinect® sensor). This combination makes it possible to model and recognize longer and more complex events and to incorporate domain knowledge and 3D information into the same models. Moreover, the ontology-driven approach enables human understanding of system decisions and facilitates knowledge transfer across different scenes. The proposed framework is evaluated on real-world recordings of seniors carrying out unscripted daily activities in hospital observation rooms and nursing homes. Results demonstrate that the proposed framework outperforms state-of-the-art methods across a variety of activities and datasets, and that it is robust to variable and low frame-rate recordings. Further work will investigate how to extend the proposed framework with uncertainty-management techniques to handle strong occlusion and ambiguous semantics, and how to exploit it to further support medicine in the timely diagnosis of cognitive disorders such as Alzheimer’s disease.
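As a toy illustration of constraint-based event modeling (not the paper's ontology language), a composite event can be declared as temporal constraints over sub-event intervals; all event and zone names below are invented.

```python
# Toy constraint-based event model: a composite event holds when its
# sub-events satisfy the declared temporal constraints.
from dataclasses import dataclass

@dataclass
class Interval:
    start: float
    end: float

def before(a: Interval, b: Interval) -> bool:
    return a.end <= b.start                       # Allen-style temporal constraint

def prepare_meal(in_kitchen: Interval, at_stove: Interval) -> bool:
    """Hypothetical composite event: enter the kitchen zone, then stay at the stove."""
    return before(in_kitchen, at_stove) and (at_stove.end - at_stove.start) > 30.0

# e.g. prepare_meal(Interval(0, 12), Interval(15, 60)) -> True
```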
asian conference on pattern recognition | 2015
Farhood Negin; Serhan Cosar; Francois Bremond; Michal Koperski
This paper presents an unsupervised approach for learning long-term human activities without requiring any user interaction (e.g., clipping long-term videos into short-term actions or labeling huge amounts of short-term actions, as in supervised approaches). First, important regions in the scene are learned by clustering trajectory points, and the global movement of people is represented as a sequence of primitive events. Then, using local action descriptors with a bag-of-words (BoW) approach, we represent the body motion of people inside each region. By incorporating global motion information with the action descriptors, a comprehensive representation of human activities is obtained through models that contain both the global and body motion of people. The learning of zones and the construction of primitive events are performed automatically. Once the models are learned, the approach provides an online recognition framework. We have tested the performance of our approach on recognizing activities of daily living and showed its efficiency over existing approaches.
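A minimal sketch of the unsupervised stage, assuming K-means over ground-plane trajectory points; the zone count and data layout are illustrative, not the paper's settings.

```python
# Sketch: cluster trajectory points into scene zones, then express each
# track as its sequence of visited zones (primitive events).
import numpy as np
from sklearn.cluster import KMeans

def learn_zones(trajectory_points, n_zones=8):
    """trajectory_points: (N, 2) ground-plane positions pooled over training videos."""
    return KMeans(n_clusters=n_zones).fit(trajectory_points)

def primitive_events(track, zones):
    """Collapse one track (T, 2) into the sequence of zones it visits."""
    labels = zones.predict(track)
    events = [labels[0]]
    for z in labels[1:]:
        if z != events[-1]:
            events.append(z)
    return events  # body-motion BoW descriptors are then attached per zone
```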
The 10th World Conference of Gerontechnology (ISG 2016) | 2016
Farhood Negin; Jérémy Bourgeois; Emmanuelle Chapoulie; Philippe Robert; François Brémond
Archive | 2016
Farhood Negin; Serhan Cosar; Michal Koperski; Carlos Fernando Crispim; Konstantinos Avgerinakis; François Brémond
Archive | 2016
Farhood Negin; Jérémy Bourgeois; Emmanuelle Chapoulie; Philippe Robert; François Brémond
Archive | 2015
Farhood Negin; Serhan Cosar; Michal Koperski; François Brémond