Network


Latest external collaboration at the country level.

Hotspot


Dive into the research topics where Maryam Najafian is active.

Publications


Featured research published by Maryam Najafian.


Odyssey 2016 | 2016

Identification of British English regional accents using fusion of i-vector and multi-accent phonotactic systems.

Maryam Najafian; Saeid Safavi; Phil Weber; Martin J. Russell

The para-linguistic information in a speech signal includes clues to the geographical and social background of the speaker. This paper is concerned with recognition of the 14 regional accents of British English. For Accent Identification (AID), acoustic methods exploit differences between the distributions of sounds, while phonotactic approaches exploit the sequences in which these sounds occur. We demonstrate that these methods complement each other well and use their confusion matrices for further analysis. Our relatively simple fused i-vector and phonotactic system, with a recognition accuracy of 84.87%, outperforms the i-vector fusion results reported in the literature by 4.7%. Further analysis of the distribution of British English accents is carried out on a low-dimensional representation of the i-vector AID feature space.
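The abstract does not spell out how the two subsystems are combined; a common choice is score-level fusion of the per-accent scores. The sketch below assumes simple linear (weighted-sum) fusion with normalised scores; the accent codes, scores, and fusion weight are illustrative, not taken from the paper.

# Minimal sketch of score-level fusion between an i-vector accent classifier and a
# phonotactic classifier, assuming a weighted-sum combination (hypothetical values).
import numpy as np

# Illustrative region codes for the 14 British English accents (ABI-style labels).
ACCENTS = ["brm", "crn", "ean", "eyk", "gla", "ilo", "lan", "lvp",
           "ncl", "nwa", "roi", "shl", "sse", "uls"]

def fuse_scores(ivector_scores, phonotactic_scores, alpha=0.6):
    """Return fused per-accent scores as a convex combination of two subsystems."""
    iv = np.asarray(ivector_scores, dtype=float)
    ph = np.asarray(phonotactic_scores, dtype=float)
    # Normalise each subsystem so neither dominates purely because of its score scale.
    iv = (iv - iv.mean()) / (iv.std() + 1e-9)
    ph = (ph - ph.mean()) / (ph.std() + 1e-9)
    return alpha * iv + (1.0 - alpha) * ph

# Hypothetical scores for one test utterance (one score per candidate accent).
iv_scores = np.random.randn(len(ACCENTS))
ph_scores = np.random.randn(len(ACCENTS))
fused = fuse_scores(iv_scores, ph_scores)
print("Predicted accent:", ACCENTS[int(np.argmax(fused))])

In practice the fusion weight (alpha here) would be tuned on a held-out development set rather than fixed.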


First International Workshop on Sensing, Processing and Learning for Intelligent Machines (SPLINE) | 2016

Employing speech and location information for automatic assessment of child language environments

Maryam Najafian; Dwight W. Irvin; Ying Luo; Beth Rous; John H. L. Hansen

Assessment of the language environment of children in early childhood is a challenging task for both humans and machines, and understanding the classroom environment of early learners is an essential step towards facilitating language acquisition and development. This paper explores an approach to intelligent language environment monitoring based on the duration of child-to-child and adult-to-child conversations and a child's physical location in classrooms within a childcare center. The amount of each child's communication with other children and adults was measured using an i-vector based child-adult diarization system (developed at CRSS). Furthermore, the average time spent by each child across different activity areas within the classroom was measured using a location tracking system. The proposed solution offers unique opportunities to assess speech and language interaction for children and to quantify location context, which would contribute to improved language environments.
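As a rough illustration of the kind of aggregation described, the sketch below combines diarization output (who-spoke-when) with location-tracking output (which activity area) to compute per-category talk time and time per area. The field names and example records are hypothetical, not the authors' actual data format.

# Aggregate hypothetical diarization and location records into summary durations.
from collections import defaultdict

# (start_sec, end_sec, label): label is "adult_to_child" or "child_to_child".
diarization = [(0.0, 4.2, "adult_to_child"), (4.2, 6.0, "child_to_child"),
               (10.5, 13.0, "adult_to_child")]
# (start_sec, end_sec, activity_area) from the location tracking system.
locations = [(0.0, 8.0, "blocks"), (8.0, 15.0, "reading_corner")]

talk_time = defaultdict(float)
for start, end, label in diarization:
    talk_time[label] += end - start

area_time = defaultdict(float)
for start, end, area in locations:
    area_time[area] += end - start

print(dict(talk_time))   # seconds of each conversation type
print(dict(area_time))   # seconds spent in each activity area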


First International Workshop on Sensing, Processing and Learning for Intelligent Machines (SPLINE) | 2016

Delay reduction in real-time recognition of human activity for stroke rehabilitation

Roozbeh Nabiei; Maryam Najafian; Manish Parekh; Peter Jancovic; Martin J. Russell

Assisting patients to perform activities of daily living (ADLs) is a challenging task for both humans and machines. Hence, developing a computer-based rehabilitation system to re-train patients to carry out daily activities is an essential step towards facilitating rehabilitation of stroke patients with apraxia and action disorganization syndrome (AADS). This paper presents a real-time hidden Markov model (HMM) based human activity recognizer and proposes a technique to reduce the time delay incurred during the decoding stage. Results are reported for complete tea-making trials. In this study, the input features are recorded using sensors attached to the objects involved in the tea-making task, plus hand coordinate data captured using a Kinect™ sensor. A coaster of sensors, comprising an accelerometer and three force-sensitive resistors, is packaged in a unit which can be easily attached to the base of an object. A parallel asynchronous set of detectors, each responsible for the detection of one sub-goal in the tea-making task, is used to address challenges arising from overlaps between human actions. The proposed activity recognition system with the modified HMM topology provides a practical solution to the action recognition problem and reduces the time delay by 64% with no loss in accuracy.
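The delay reduction in the paper comes from a modified HMM topology, the details of which are not reproduced here. As a minimal sketch of the underlying latency issue, the snippet below contrasts waiting for a whole trial before decoding with online forward filtering, which emits a state decision at every frame. All model parameters and observations are hypothetical.

# Online (per-frame) HMM decoding via the forward recursion, with made-up parameters.
import numpy as np

A = np.array([[0.9, 0.1], [0.2, 0.8]])   # state-transition probabilities
B = np.array([[0.7, 0.3], [0.2, 0.8]])   # P(observation | state), two discrete symbols
pi = np.array([0.5, 0.5])                # initial state distribution
obs = [0, 0, 1, 1, 1, 0]                 # hypothetical sensor observation sequence

def online_decode(obs):
    """Emit the most likely state after every frame (low latency, forward filtering)."""
    alpha = pi * B[:, obs[0]]
    decisions = [int(np.argmax(alpha))]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        alpha /= alpha.sum()             # normalise to avoid numerical underflow
        decisions.append(int(np.argmax(alpha)))
    return decisions

print(online_decode(obs))  # per-frame decisions available immediately, no end-of-trial wait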


Frontiers of Earth Science in China | 2017

Optimized extreme learning machine for urban land cover classification using hyperspectral imagery

Hongjun Su; Shufang Tian; Yue Cai; Yehua Sheng; Chen Chen; Maryam Najafian

This work presents a new urban land cover classification framework using a firefly algorithm (FA) optimized extreme learning machine (ELM). The FA is adopted to optimize the regularization coefficient C and the Gaussian kernel width σ of the kernel ELM. Additionally, the effectiveness of spectral features derived from an FA-based band selection algorithm is studied for the proposed classification task. Three hyperspectral databases recorded with different sensors, namely HYDICE, HyMap, and AVIRIS, are used for evaluation. Our study shows that the proposed method outperforms traditional classification algorithms such as SVM and significantly reduces computational cost.
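For orientation, the sketch below shows a kernel ELM with a Gaussian (RBF) kernel, the model whose hyperparameters (C, σ) the paper tunes with the firefly algorithm. The firefly search is replaced here by a plain grid search for brevity, the data are synthetic, and the evaluation is on the training set; this is an assumption-laden illustration, not the authors' pipeline.

# Kernel ELM: solve (I/C + K) beta = T for the output weights, then predict via K_test @ beta.
import numpy as np

def rbf_kernel(X, Y, sigma):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def kelm_fit(X, T, C, sigma):
    K = rbf_kernel(X, X, sigma)
    return np.linalg.solve(np.eye(len(X)) / C + K, T)

def kelm_predict(Xtr, beta, Xte, sigma):
    return rbf_kernel(Xte, Xtr, sigma) @ beta

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))                      # synthetic "spectral" features
y = (X[:, 0] + X[:, 1] > 0).astype(int)
T = np.eye(2)[y]                                    # one-hot targets

# Toy grid search over (C, sigma); the paper uses the firefly algorithm instead.
best = max(((C, s) for C in (1, 10, 100) for s in (0.5, 1.0, 2.0)),
           key=lambda p: (kelm_predict(X, kelm_fit(X, T, *p), X, p[1]).argmax(1) == y).mean())
print("selected (C, sigma):", best)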


Spoken Language Technology Workshop | 2016

Speaker independent diarization for child language environment analysis using deep neural networks

Maryam Najafian; John H. L. Hansen

Large-scale monitoring of the child language environment, through measuring the amount of speech directed to the child by other children and adults during vocal communication, is an important task. Using the audio extracted from a recording unit worn by a child within a childcare center, at each point in time our proposed diarization system can determine the content of the child's language environment by categorizing the audio into one of four major categories, namely (1) speech initiated by the child wearing the recording unit, speech originated by (2) other children or (3) adults and directed at the primary child, and (4) non-speech content. In this study, we exploit complex hidden Markov models (HMMs) with multiple states to model the temporal dependencies between different sources of acoustic variability, and estimate the HMM state output probabilities using deep neural networks as a discriminative modeling approach. The proposed system is robust against common diarization errors caused by rapid turn-taking, between-class similarities, and background noise, without the need for prior clustering techniques. The experimental results confirm that this approach outperforms state-of-the-art Gaussian mixture model based diarization without the need for bottom-up clustering and leads to a 22.24% relative error reduction.
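In hybrid DNN-HMM systems of the kind described above, the network's per-frame class posteriors are typically converted to scaled likelihoods (posterior divided by class prior) before HMM decoding. The sketch below shows that conversion with hypothetical class names, priors, and posteriors; it is not the authors' exact configuration.

# Convert hypothetical DNN posteriors into log scaled likelihoods for HMM decoding.
import numpy as np

classes = ["primary_child", "other_child", "adult", "non_speech"]
priors = np.array([0.35, 0.20, 0.25, 0.20])            # would be estimated from training alignments

def scaled_likelihoods(dnn_posteriors, priors, floor=1e-6):
    """log P(state | x) - log P(state): scaled likelihoods up to a constant."""
    post = np.clip(np.asarray(dnn_posteriors, dtype=float), floor, 1.0)
    return np.log(post) - np.log(priors)

frame_posteriors = np.array([0.10, 0.05, 0.85, 0.00])  # hypothetical DNN output for one frame
print(dict(zip(classes, scaled_likelihoods(frame_posteriors, priors))))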


International Conference on 3D Vision | 2016

Energy-Based Global Ternary Image for Action Recognition Using Sole Depth Sequences

Mengyuan Liu; Hong Liu; Chen Chen; Maryam Najafian

In order to efficiently recognize actions from depth sequences, we propose a novel feature, called the Global Ternary Image (GTI), which implicitly encodes both motion regions and motion directions between consecutive depth frames by recording the changes of depth pixels. In this study, each pixel in the GTI takes one of three possible states, namely positive, negative, and neutral, which represent increased, decreased, and unchanged depth values, respectively. Since the GTI is sensitive to the subject's speed, we obtain the energy-based GTI (E-GTI) by extracting the GTI from pairwise depth frames with equal motion energy. To incorporate temporal information among depth frames, we extract E-GTIs using multiple settings of motion energy. Noise can be effectively suppressed by describing the E-GTIs using the Radon transform (RT). The 3D action representation is formed by feeding the hierarchical combination of RTs to a Bag of Visual Words (BoVW) model. Extensive experiments on four benchmark datasets, namely MSRAction3D, DHA, MSRGesture3D, and SKIG, show that the hierarchical E-GTI outperforms existing methods in 3D action recognition. We also tested the proposed approach on the extended MSRAction3D dataset to further verify its robustness against partial occlusions, noise, and speed variation.
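A minimal sketch of the GTI idea described above: each pixel takes one of three states according to whether its depth increased, decreased, or stayed the same between two depth frames. The change threshold and the synthetic frames are illustrative assumptions, not values from the paper.

# Compute a per-pixel ternary map (+1 / -1 / 0) from two depth frames.
import numpy as np

def global_ternary_image(depth_prev, depth_next, threshold=5.0):
    """Return +1 where depth increased, -1 where it decreased, 0 where it is unchanged."""
    diff = depth_next.astype(float) - depth_prev.astype(float)
    gti = np.zeros_like(diff, dtype=np.int8)
    gti[diff > threshold] = 1
    gti[diff < -threshold] = -1
    return gti

rng = np.random.default_rng(1)
frame_a = rng.integers(500, 4000, size=(240, 320))              # synthetic depth frames (mm)
frame_b = frame_a + rng.integers(-20, 20, size=frame_a.shape)   # small simulated motion
print(np.unique(global_ternary_image(frame_a, frame_b), return_counts=True))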


Workshop on Child Computer Interaction | 2016

Automatic measurement and analysis of the child verbal communication using classroom acoustics within a child care center.

Maryam Najafian; Dwight W. Irvin; Ying Luo; Beth Rous; John H. L. Hansen

Understanding the language environment of early learners is a challenging task for both humans and machines, and it is critical in facilitating effective language development among young children. This paper presents a new application of existing diarization systems and investigates the language environment of young children using a turn-taking strategy, employing an i-vector based baseline that captures adult-to-child or child-to-child conversational turns across different classrooms in a child care center. Detecting speaker turns is necessary before more in-depth subsequent analysis of the audio, such as word counting, speech recognition, and keyword spotting, which can contribute to the design of future learning spaces specifically designed for typically developing children, or those at risk with communication limitations. Experimental results in naturalistic child-teacher classroom settings indicate that the proposed rapid child-adult speech turn-taking scheme is highly effective under noisy classroom conditions and results in a 27.3% relative error rate reduction compared to the baseline results produced by the LIUM diarization toolkit.
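As a small illustration of what "capturing conversational turns" means downstream of diarization, the sketch below counts adult-to-child and child-to-child turn exchanges from a labelled segment list. The segment data and category names are hypothetical.

# Count turn exchanges involving the primary child from hypothetical diarization segments.
segments = [("primary_child", 0.0, 2.1), ("adult", 2.1, 5.0),
            ("primary_child", 5.0, 6.4), ("other_child", 6.4, 7.2)]

def count_turns(segments):
    """Count adult-to-child and child-to-child exchanges between consecutive segments."""
    turns = {"adult_child": 0, "child_child": 0}
    for (spk_a, _, _), (spk_b, _, _) in zip(segments, segments[1:]):
        pair = {spk_a, spk_b}
        if "primary_child" in pair and "adult" in pair:
            turns["adult_child"] += 1
        elif "primary_child" in pair and "other_child" in pair:
            turns["child_child"] += 1
    return turns

print(count_turns(segments))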


International Conference on Acoustics, Speech, and Signal Processing | 2017

Environment aware speaker diarization for moving targets using parallel DNN-based recognizers

Maryam Najafian; John H. L. Hansen

Current diarization algorithms are commonly applied to the output of single, non-moving microphones. They do not explicitly identify the content of overlapped segments from multiple speakers or acoustic events. This paper presents an acoustic environment aware child-adult diarization applied to audio recorded by a single microphone attached to moving targets under realistic high-noise conditions. The proposed system exploits a parallel deep neural network and hidden Markov model based approach, which enables tracking of rapid turn changes in audio segments as well as capturing cross-talk labels for overlapped speech. It outperforms state-of-the-art diarization systems without the need for prior clustering or front-end speech activity detection.
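One way parallel recognizers can yield cross-talk labels is to run one detector per source and mark frames where more than one detector is active as overlap. The sketch below illustrates that idea with made-up per-frame detector outputs; it is an assumption about the general approach, not the paper's architecture.

# Derive per-frame labels, including "overlap", from three hypothetical parallel detectors.
import numpy as np

child = np.array([1, 1, 1, 0, 0, 0, 1, 1])   # 1 = detector active on that frame
adult = np.array([0, 0, 1, 1, 1, 0, 0, 0])
noise = np.array([0, 0, 0, 0, 1, 1, 0, 0])

stack = np.vstack([child, adult, noise])
labels = []
for frame in stack.T:
    active = int(frame.sum())
    if active == 0:
        labels.append("silence")
    elif active == 1:
        labels.append(["child", "adult", "noise"][int(np.argmax(frame))])
    else:
        labels.append("overlap")                 # more than one source active: cross-talk
print(labels)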


Conference of the International Speech Communication Association | 2014

Unsupervised Model Selection for Recognition of Regional Accented Speech

Maryam Najafian; Andrea DeMarco; Stephen J. Cox; Martin J. Russell


Conference of the International Speech Communication Association | 2012

Speaker Recognition for Children's Speech.

Saeid Safavi; Maryam Najafian; Abualsoud Hanani; Martin J. Russell; Peter Jancovic; Michael J. Carey

Collaboration


Dive into Maryam Najafian's collaboration.

Top Co-Authors

Saeid Safavi | University of Hertfordshire
John H. L. Hansen | University of Texas at Dallas
Ahmed M. Ali | Qatar Computing Research Institute
James R. Glass | Massachusetts Institute of Technology
Peter Jancovic | University of Birmingham
Sameer Khurana | Qatar Computing Research Institute
Beth Rous | University of Kentucky
Chen Chen | University of Central Florida