Yunbin Deng
BAE Systems
Publications
Featured research published by Yunbin Deng.
International Conference on Acoustics, Speech, and Signal Processing | 2009
Glen Colby; James T. Heaton; L. Donald Gilmore; Jason J. Sroka; Yunbin Deng; João B. D. Cabrera; Serge H. Roy; Carlo J. De Luca; Geoffrey S. Meltzner
The authors previously reported speaker-dependent automatic speech recognition accuracy for isolated words using eleven surface-electromyographic (sEMG) sensors in fixed recording locations on the face and neck. The original array of sensors was chosen to ensure ample coverage of the muscle groups known to contribute to articulation during speech production. In this paper we systematically analyzed speech recognition performance from sensor subsets with the goal of reducing the number of sensors needed and finding the best combination of sensor locations to achieve word recognition rates comparable to the full set. We evaluated each possible subset by its mean word recognition rate across nine speakers, using HMM modeling of MFCC and co-activation features derived from the subset of sensor signals. We show empirically that five sensors are sufficient to achieve a recognition rate within half a percentage point of that obtainable from the full set of sensors.
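The subset search the abstract describes can be sketched as an exhaustive evaluation over sensor combinations. This is an illustrative sketch only: the per-speaker scoring callables stand in for the paper's HMM decoder over MFCC and co-activation features, and the sensor weights below are made-up numbers, not measured results.

```python
from itertools import combinations

def mean_recognition_rate(subset, speaker_scorers):
    """Mean word recognition rate over speakers for one sensor subset.

    `speaker_scorers` is a list of callables, one per speaker, each
    mapping a sensor subset to that speaker's recognition rate (in the
    paper this score would come from an HMM trained on features of the
    chosen sensors; here it is stubbed)."""
    return sum(score(subset) for score in speaker_scorers) / len(speaker_scorers)

def best_subset(n_sensors, k, speaker_scorers):
    """Exhaustively evaluate every k-sensor subset and keep the best."""
    return max(combinations(range(n_sensors), k),
               key=lambda s: mean_recognition_rate(s, speaker_scorers))

# Toy illustration: 11 sensors, 3 speakers whose accuracy depends on
# which "informative" sensors are included (weights are invented).
weights = [5, 1, 4, 2, 8, 1, 6, 2, 7, 1, 3]
speaker_scorers = [
    lambda s, w=weights, b=b: min(99.0, b + sum(w[i] for i in s))
    for b in (60.0, 62.0, 58.0)
]
five = best_subset(11, 5, speaker_scorers)  # best 5-sensor combination
```

With 11 sensors there are only C(11, 5) = 462 subsets of size five, so brute force is cheap; real cost lies in training a recognizer per subset.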
International Conference on Biometrics Theory, Applications and Systems | 2015
Yu Zhong; Yunbin Deng; Geoffrey S. Meltzner
Accelerometers embedded in mobile devices have shown great potential for non-obtrusive gait biometrics by directly capturing a user's characteristic locomotion. Although gait analysis using these sensors has achieved highly accurate authentication and identification performance under controlled experimental settings, the robustness of such algorithms in the presence of assorted variations typical of real-world scenarios remains a major challenge. In this paper, we propose a novel pace-independent mobile gait biometrics algorithm that is insensitive to variability in walking speed. Our approach also exploits recent advances in invariant mobile gait representation to be independent of sensor rotation. Performance evaluations on a realistic mobile gait dataset containing 51 subjects confirm the merits of the proposed algorithm toward practical mobile gait authentication.
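A minimal sketch of the two invariances the abstract mentions, under simplifying assumptions that are not the paper's actual algorithm: pace independence via resampling each gait cycle to a fixed length, and rotation independence via the acceleration-vector magnitude (a crude stand-in for the cited invariant representation).

```python
def resample(cycle, n=100):
    """Linearly resample one gait cycle to a fixed length so that slow
    and fast walks become directly comparable -- a simple stand-in for
    pace normalization."""
    m = len(cycle)
    out = []
    for k in range(n):
        t = k * (m - 1) / (n - 1)   # position in the original cycle
        i = int(t)
        frac = t - i
        j = min(i + 1, m - 1)
        out.append(cycle[i] * (1 - frac) + cycle[j] * frac)
    return out

def rotation_invariant(sample):
    """Reduce a 3-axis accelerometer reading to its magnitude, which is
    unchanged when the device is rotated in the pocket."""
    x, y, z = sample
    return (x * x + y * y + z * z) ** 0.5

def gait_distance(cycle_a, cycle_b, n=100):
    """Mean absolute difference between two pace-normalized cycles;
    small distances suggest the same walker."""
    a, b = resample(cycle_a, n), resample(cycle_b, n)
    return sum(abs(u - v) for u, v in zip(a, b)) / n
```

The same cycle walked at two speeds (more or fewer samples) yields a small distance after normalization, while a different motion pattern yields a large one.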
Military Communications Conference | 2012
Yunbin Deng; Glen Colby; James T. Heaton; Geoffrey S. Meltzner
Military speech communication often needs to be conducted in very high noise environments. In addition, there are scenarios, such as special-ops missions, for which it is beneficial to have covert voice communications. To enable both capabilities, we have developed the MUTE (Mouthed-speech Understanding and Transcription Engine) system, which bypasses the limitations of traditional acoustic speech communication by measuring and interpreting muscle activity of the facial and neck musculature involved in silent speech production. This article details our recent progress on automatic surface electromyography (sEMG) speech activity detection, feature parameterization, multi-task sEMG corpus development, context-dependent sub-word sEMG modeling, discriminative phoneme model training, and flexible-vocabulary continuous sEMG silent speech recognition. Our current system achieved recognition accuracy at deployable levels for a pre-defined special-ops task. We further propose research directions in adaptive sEMG feature parameterization and data-driven decision question generation for context-dependent sEMG phoneme modeling.
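The speech activity detection step mentioned above can be illustrated with a toy frame-energy detector. This is not the MUTE system's detector; the frame length and threshold below are arbitrary illustration values.

```python
def semg_activity(signal, frame=160, threshold=0.02):
    """Toy speech-activity detector for an sEMG channel: split the
    signal into fixed-length frames and flag frames whose mean absolute
    amplitude exceeds a threshold (muscle activity present)."""
    flags = []
    for start in range(0, len(signal) - frame + 1, frame):
        energy = sum(abs(x) for x in signal[start:start + frame]) / frame
        flags.append(energy > threshold)
    return flags
```

In practice a detector like this would be followed by smoothing (e.g. hangover frames) so brief dips inside an utterance are not marked as silence.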
IEEE Transactions on Audio, Speech, and Language Processing | 2017
Geoffrey S. Meltzner; James T. Heaton; Yunbin Deng; Gianluca De Luca; Serge H. Roy; Joshua C. Kline
Each year thousands of individuals require surgical removal of the larynx (voice box) due to trauma or disease, and thereby require an alternative voice source or assistive device to verbally communicate. Although natural voice is lost after laryngectomy, most muscles controlling speech articulation remain intact. Surface electromyographic (sEMG) activity of speech musculature can be recorded from the neck and face, and used for automatic speech recognition to provide speech-to-text or synthesized speech as an alternative means of communication. This is true even when speech is mouthed or spoken in a silent (subvocal) manner, making it an appropriate communication platform after laryngectomy. In this study, eight individuals at least 6 months after total laryngectomy were recorded using eight sEMG sensors on their face (4) and neck (4) while reading phrases constructed from a 2500-word vocabulary. A unique set of phrases was used for training phoneme-based recognition models for each of the 39 commonly used phonemes in English, and the remaining phrases were used for testing word recognition of the models based on phoneme identification from running speech. Word error rates were on average 10.3% for the full eight-sensor set (averaging 9.5% for the top four participants), and 13.6% when reducing the sensor set to four locations per individual (n = 7). This study provides a compelling proof-of-concept for sEMG-based alaryngeal speech recognition, with the strong potential to further improve recognition performance.
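The word error rates quoted above follow the standard definition: the minimum number of substitutions, insertions, and deletions needed to turn the hypothesis into the reference, divided by the reference length. A minimal sketch of that computation:

```python
def word_error_rate(reference, hypothesis):
    """Word error rate via the classic edit-distance recursion:
    (substitutions + insertions + deletions) / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i                       # delete all i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j                       # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)
```

Note that WER can exceed 100% when the hypothesis contains many insertions, which is why it is reported alongside, not as the complement of, a recognition rate.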
Journal of Neural Engineering | 2018
Geoffrey S. Meltzner; James T. Heaton; Yunbin Deng; Gianluca De Luca; Serge H. Roy; Joshua C. Kline
OBJECTIVE Speech is among the most natural forms of human communication, thereby offering an attractive modality for human-machine interaction through automatic speech recognition (ASR). However, the limitations of ASR, including degradation in the presence of ambient noise, limited privacy, and poor accessibility for those with significant speech disorders, have motivated the need for alternative non-acoustic modalities of subvocal or silent speech recognition (SSR). APPROACH We have developed a new system of face- and neck-worn sensors and signal processing algorithms that are capable of recognizing silently mouthed words and phrases entirely from the surface electromyographic (sEMG) signals recorded from muscles of the face and neck that are involved in the production of speech. The algorithms were strategically developed by evolving speech recognition models: first for recognizing isolated words by extracting speech-related features from sEMG signals, then for recognizing sequences of words from patterns of sEMG signals using grammar models, and finally for recognizing a vocabulary of previously untrained words using phoneme-based models. The final recognition algorithms were integrated with specially designed multi-point, miniaturized sensors that can be arranged in flexible geometries to record high-fidelity sEMG signal measurements from small articulator muscles of the face and neck. MAIN RESULTS We tested the system of sensors and algorithms during a series of subvocal speech experiments involving more than 1200 phrases generated from a 2200-word vocabulary and achieved an 8.9% word error rate (91.1% recognition rate), far surpassing previous attempts in the field.
SIGNIFICANCE These results demonstrate the viability of our system as an alternative modality of communication for a multitude of applications, including persons with speech impairments following a laryngectomy, military personnel requiring hands-free covert communication, and consumers in need of privacy while speaking on a mobile phone in public.
International Conference on Biometrics Theory, Applications and Systems | 2016
Yunbin Deng
Existing studies on speaker identification have mostly been performed on telephone and microphone speech data, collected with subjects close to the sensor. For the first time, this study reports long-range standoff automatic speaker identification experiments using a laser Doppler vibrometer (LDV) sensor. The LDV sensor modality has the potential to extend the speech acquisition standoff distance far beyond microphone arrays, enabling new capabilities in automatic audio and speech intelligence, surveillance, and reconnaissance (ISR). Five LDV speech corpora, each consisting of 630 speakers, were collected from the vibrations of a glass window, a metal plate, a plastic box, a wood slate, and a concrete wall, using a Polytec model OFV-505 LDV. The distance from the LDV sensor to the vibration targets was 50 feet. State-of-the-art i-vector speaker identification experiments on these LDV speech data show great promise for this long-range acoustic sensing modality.
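The scoring stage of an i-vector identification system is commonly cosine similarity between a test utterance's i-vector and each enrolled speaker's i-vector. The sketch below shows only that closed-set scoring step; extracting the i-vectors themselves (UBM sufficient statistics projected through a total variability matrix) is outside its scope, and the vectors in the example are made up.

```python
def cosine(a, b):
    """Cosine similarity, a standard scoring rule for i-vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb)

def identify(test_ivec, enrolled):
    """Closed-set identification: return the enrolled speaker whose
    i-vector is most similar to the test utterance's i-vector.
    `enrolled` maps speaker id -> i-vector."""
    return max(enrolled, key=lambda spk: cosine(test_ivec, enrolled[spk]))
```

Production systems typically apply channel compensation (e.g. LDA followed by within-class covariance normalization, or PLDA scoring) before this comparison, which would matter for channels as unusual as LDV returns from glass or concrete.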
Conference of the International Speech Communication Association | 2009
Yunbin Deng; Rupal Patel; James T. Heaton; Glen Colby; L. Donald Gilmore; João B. D. Cabrera; Serge H. Roy; Carlo J. De Luca; Geoffrey S. Meltzner
Conference of the International Speech Communication Association | 2014
Yunbin Deng; James T. Heaton; Geoffrey S. Meltzner
International Conference of the IEEE Engineering in Medicine and Biology Society | 2011
Geoffrey S. Meltzner; Glen Colby; Yunbin Deng; James T. Heaton
Archive | 2017
Leonid Naimark; Yunbin Deng; Geoffrey S. Meltzner; Yu Zhong