
Publication


Featured research published by Nelson Morgan.


IEEE Transactions on Speech and Audio Processing | 1994

RASTA processing of speech

Hynek Hermansky; Nelson Morgan

Performance of even the best current stochastic recognizers severely degrades in an unexpected communications environment. In some cases, the environmental effect can be modeled by a set of simple transformations and, in particular, by convolution with an environmental impulse response and the addition of some environmental noise. Often, the temporal properties of these environmental effects are quite different from the temporal properties of speech. We have been experimenting with filtering approaches that attempt to exploit these differences to produce robust representations for speech recognition and enhancement and have called this class of representations relative spectra (RASTA). In this paper, we review the theoretical and experimental foundations of the method, discuss the relationship with human auditory perception, and extend the original method to combinations of additive noise and convolutional noise. We discuss the relationship between RASTA features and the nature of the recognition models that are required and the relationship of these features to delta features and to cepstral mean subtraction. Finally, we show an application of the RASTA technique to speech enhancement.
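The core of RASTA is a band-pass filter applied to the time trajectory of each log-spectral component, so that a slowly varying convolutional channel (a constant offset in the log domain) is suppressed. A minimal sketch of that idea, assuming the commonly cited coefficients (FIR numerator [2, 1, 0, -1, -2]/10 with a single pole at 0.94); the exact filter in any given implementation may differ:

```python
import numpy as np
from scipy.signal import lfilter

def rasta_filter(log_spec):
    """Band-pass filter each log-spectral trajectory over time.

    log_spec: array of shape (n_frames, n_bands), log critical-band energies.
    The filter has a zero at DC, so any constant (convolutional) offset
    is removed once the transient has decayed.
    """
    b = np.array([2.0, 1.0, 0.0, -1.0, -2.0]) / 10.0  # FIR numerator
    a = np.array([1.0, -0.94])                        # single-pole denominator
    return lfilter(b, a, log_spec, axis=0)

# A fixed channel response adds a constant to every log-spectral frame;
# after filtering, the clean and channel-distorted trajectories converge.
clean = np.random.randn(300, 20)
filtered_clean = rasta_filter(clean)
filtered_noisy = rasta_filter(clean + 5.0)  # +5.0 mimics a channel offset
```

The zero at DC is what makes the representation channel-invariant; the pole keeps some low-frequency (but nonzero) modulation, which is where most linguistic information lives.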


International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 2003

The ICSI Meeting Corpus

Adam Janin; Don Baron; Jane Edwards; Daniel P. W. Ellis; David Gelbart; Nelson Morgan; Barbara Peskin; Thilo Pfau; Elizabeth Shriberg; Andreas Stolcke; Chuck Wooters

We have collected a corpus of data from natural meetings that occurred at the International Computer Science Institute (ICSI) in Berkeley, California over the last three years. The corpus contains audio recorded simultaneously from head-worn and table-top microphones, word-level transcripts of meetings, and various metadata on participants, meetings, and hardware. Such a corpus supports work in automatic speech recognition, noise robustness, dialog modeling, prosody, rich transcription, information retrieval, and more. We present details on the contents of the corpus, as well as rationales for the decisions that led to its configuration. The corpus was delivered to the Linguistic Data Consortium (LDC).


Parallel Computing | 2009

A view of the parallel computing landscape

Krste Asanovic; Rastislav Bodik; James Demmel; Tony M. Keaveny; Kurt Keutzer; John Kubiatowicz; Nelson Morgan; David A. Patterson; Koushik Sen; John Wawrzynek; David Wessel; Katherine A. Yelick

Writing programs that scale with increasing numbers of cores should be as easy as writing programs for sequential computers.


IEEE Transactions on Speech and Audio Processing | 1994

Connectionist probability estimators in HMM speech recognition

Steve Renals; Nelson Morgan; Michael Cohen; Horacio Franco

The authors are concerned with integrating connectionist networks into a hidden Markov model (HMM) speech recognition system. This is achieved through a statistical interpretation of connectionist networks as probability estimators. They review the basis of HMM speech recognition and point out the possible benefits of incorporating connectionist networks. Issues necessary to the construction of a connectionist HMM recognition system are discussed, including choice of connectionist probability estimator. They describe the performance of such a system using a multilayer perceptron probability estimator evaluated on the speaker-independent DARPA Resource Management database. In conclusion, they show that a connectionist component improves a state-of-the-art HMM system.
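In the hybrid framework the network output is interpreted via Bayes' rule: dividing the estimated posterior P(q|x) by the class prior P(q) yields a likelihood scaled by the frame-independent P(x), which can stand in for an HMM emission density during decoding. A minimal sketch with made-up numbers; the division is the essential step:

```python
import numpy as np

def scaled_likelihoods(posteriors, priors):
    """Convert MLP state posteriors P(q|x) into scaled likelihoods
    P(x|q) / P(x) = P(q|x) / P(q).  Since P(x) is constant within a
    frame, these ratios can replace emission densities in Viterbi
    decoding without changing the best path.
    """
    return posteriors / priors

# Hypothetical 3-state example: per-frame softmax outputs from the network
posteriors = np.array([[0.7, 0.2, 0.1],
                       [0.1, 0.8, 0.1]])
priors = np.array([0.5, 0.3, 0.2])   # state priors, e.g. from training alignments
lik = scaled_likelihoods(posteriors, priors)
```

Note how the division penalizes frequent states: a posterior of 0.2 on a rare state (prior 0.1) would yield a larger scaled likelihood than 0.2 on a common one.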


Speech Communication | 1998

Robust speech recognition using the modulation spectrogram

Brian Kingsbury; Nelson Morgan; Steven Greenberg

The performance of present-day automatic speech recognition (ASR) systems is seriously compromised by levels of acoustic interference (such as additive noise and room reverberation) representative of real-world speaking conditions. Studies on the perception of speech by human listeners suggest that recognizer robustness might be improved by focusing on temporal structure in the speech signal that appears as low-frequency (below 16 Hz) amplitude modulations in subband channels following critical-band frequency analysis. A speech representation that emphasizes this temporal structure, the “modulation spectrogram”, has been developed. Visual displays of speech produced with the modulation spectrogram are relatively stable in the presence of high levels of background noise and reverberation. Using the modulation spectrogram as a front end for ASR provides a significant improvement in performance on highly reverberant speech. When the modulation spectrogram is used in combination with log-RASTA-PLP (log RelAtive SpecTrAl Perceptual Linear Predictive analysis) performance over a range of noisy and reverberant conditions is significantly improved, suggesting that the use of multiple representations is another promising method for improving the robustness of ASR systems.
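The representation rests on extracting the slow (below roughly 16 Hz) amplitude modulations of each subband. A toy sketch of one subband's modulation envelope, using rectification followed by a low-pass filter; the paper's actual analysis chain (critical-band filtering, normalization, and so on) is more involved:

```python
import numpy as np
from scipy.signal import butter, filtfilt

def modulation_envelope(subband, fs, cutoff=16.0):
    """Extract the slow amplitude modulation of one subband signal:
    half-wave rectify, then low-pass below `cutoff` Hz."""
    env = np.maximum(subband, 0.0)            # crude envelope via rectification
    b, a = butter(2, cutoff / (fs / 2.0))     # 2nd-order Butterworth low-pass
    return filtfilt(b, a, env)                # zero-phase filtering

fs = 8000
t = np.arange(fs) / fs
# A 1 kHz carrier amplitude-modulated at 4 Hz: the 4 Hz modulation survives
# the low-pass stage, while the 1 kHz carrier is removed.
sig = (1.0 + 0.5 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 1000 * t)
env = modulation_envelope(sig, fs)
```

The recovered envelope tracks the 4 Hz modulator, which is the kind of syllable-rate structure the abstract argues is robust to noise and reverberation.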


International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 1992

RASTA-PLP speech analysis technique

Hynek Hermansky; Nelson Morgan; Aruna Bayya; Phil Kohn

Most speech parameter estimation techniques are easily influenced by the frequency response of the communication channel. The authors have developed a technique that is more robust to such steady-state spectral factors in speech. The approach is conceptually simple and computationally efficient. The new method is described, and experimental results are presented that show significant advantages for the proposed method.


International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 1993

Recognition of speech in additive and convolutional noise based on RASTA spectral processing

Hynek Hermansky; Nelson Morgan; Hans-Günter Hirsch

RASTA (relative spectral) processing is studied in a spectral domain which is linear-like for small spectral values and logarithmic-like for large spectral values. Experiments with a recognizer trained on clean speech and test data degraded by both convolutional and additive noise show that doing RASTA processing in the new domain yields results comparable with those obtained by training the recognizer on known noise.
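A domain that is "linear-like for small spectral values and logarithmic-like for large spectral values" can be sketched with the mapping y = ln(1 + J·x): linear when J·x is small, logarithmic when it is large. The constant J below is an illustrative assumption, not a value taken from the paper:

```python
import numpy as np

def lin_log_compress(x, J=1.0):
    """Map spectral values into a lin-log domain: y = ln(1 + J*x).
    For J*x << 1, y ~ J*x (linear regime, where additive noise behaves
    simply); for J*x >> 1, y ~ ln(J) + ln(x) (log regime, where a
    convolutional channel becomes an additive constant)."""
    return np.log1p(J * x)
```

This is why the domain suits combined additive and convolutional noise: each kind of degradation is approximately additive in the regime where it dominates, so the same band-pass RASTA filtering can suppress both.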


International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 1990

Continuous speech recognition using multilayer perceptrons with hidden Markov models

Nelson Morgan

A phoneme based, speaker-dependent continuous-speech recognition system embedding a multilayer perceptron (MLP) (i.e. a feedforward artificial neural network) into a hidden Markov model (HMM) approach is described. Contextual information from a sliding window on the input frames is used to improve frame or phoneme classification performance over the corresponding performance for simple maximum-likelihood probabilities, or even maximum a posteriori (MAP) probabilities which are estimated without the benefit of context. Performance for a simple discrete density HMM system appears to be somewhat better when MLP methods are used to estimate the probabilities.
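The sliding window amounts to concatenating each frame with its neighbours before it reaches the MLP. A minimal sketch, assuming a symmetric window of ±4 frames with edge frames repeated at the boundaries (the window width here is illustrative):

```python
import numpy as np

def context_windows(frames, context=4):
    """Stack each frame with its +/- `context` neighbours so the MLP
    sees 2*context + 1 frames at once; sequence edges are handled by
    repeating the first/last frame."""
    padded = np.pad(frames, ((context, context), (0, 0)), mode="edge")
    n = frames.shape[0]
    width = 2 * context + 1
    return np.stack([padded[i:i + width].reshape(-1) for i in range(n)])

frames = np.random.randn(100, 13)   # e.g. 13 features per 10 ms frame
X = context_windows(frames)         # shape (100, 9 * 13)
```

Each row of `X` is one training example for the network; the centre 13 values of a row are the original frame, flanked by its temporal context.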


Electroencephalography and Clinical Neurophysiology | 1989

Event-related covariances during a bimanual visuomotor task. I. Methods and analysis of stimulus- and response-locked data

Alan Gevins; Steven L. Bressler; Nelson Morgan; Brian A. Cutillo; Richard M. White; Douglas S. Greer; Judy Illes

A new method that measures between-channel, event-related covariances (ERCs) from scalp-recorded brain signals has been developed. The method was applied to recordings of 26 EEG channels from 7 right-handed men performing a bimanual visuomotor judgment task that required fine motor control. Covariance and time-delay measures were derived from pairs of filtered, laplacian-derived, averaged wave forms, which were enhanced by rejection of outlying trials, in intervals spanning event-related potential components. Stimulus- and response-locked ERC patterns were consistent with functional neuroanatomical models of visual stimulus processing and response execution. In early post-stimulus intervals, ERC patterns differed according to the physical properties of the stimulus; in later intervals, the patterns differed according to the subjective interpretation of the stimulus. The response-locked ERC patterns suggested 4 major cortical generators for the voluntary fine motor control required by the task: motor, somesthetic, premotor and/or supplementary motor, and prefrontal. This new method may thus be an advancement toward characterizing, both spatially and temporally, functional cortical networks in the human brain responsible for perception and action.
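The covariance and time-delay measures can be illustrated on a pair of wave forms: compute the lagged cross-covariance between two channels and take the lag with the largest magnitude as the delay estimate. A simplified sketch, not the paper's exact estimator (which operates on filtered, Laplacian-derived, outlier-trimmed averages):

```python
import numpy as np

def erc_delay(chan_a, chan_b, max_lag):
    """Lagged cross-covariance between two averaged channel wave forms.
    Returns (peak covariance, lag in samples).  With this convention,
    a negative lag means chan_b trails chan_a."""
    a = chan_a - chan_a.mean()
    b = chan_b - chan_b.mean()
    lags = list(range(-max_lag, max_lag + 1))
    covs = []
    for lag in lags:
        if lag >= 0:
            covs.append(np.mean(a[lag:] * b[:len(b) - lag]))
        else:
            covs.append(np.mean(a[:lag] * b[-lag:]))
    k = int(np.argmax(np.abs(covs)))
    return covs[k], lags[k]

# Two channels carrying the same component, one delayed by 5 samples:
t = np.linspace(0, 1, 500)
a = np.sin(2 * np.pi * 6 * t)
b = np.roll(a, 5)                 # b trails a by 5 samples
cov, lag = erc_delay(a, b, max_lag=20)
```

The sign and size of the recovered lag is the kind of between-channel timing information the ERC method uses to order putative cortical generators in time.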


IEEE Transactions on Speech and Audio Processing | 1995

The challenge of spoken language systems: Research directions for the nineties

Ron Cole; L. Hirschman; L. Atlas; M. Beckman; Alan W. Biermann; M. Bush; Mark A. Clements; L. Cohen; Oscar N. Garcia; B. Hanson; Hynek Hermansky; S. Levinson; Kathleen R. McKeown; Nelson Morgan; David G. Novick; Mari Ostendorf; Sharon L. Oviatt; Patti Price; Harvey F. Silverman; J. Spiitz; Alex Waibel; Cliff Weinstein; Stephen A. Zahorian; Victor W. Zue

A spoken language system combines speech recognition, natural language processing and human interface technology. It functions by recognizing the person's words, interpreting the sequence of words to obtain a meaning in terms of the application, and providing an appropriate response back to the user. Potential applications of spoken language systems range from simple tasks, such as retrieving information from an existing database (traffic reports, airline schedules), to interactive problem solving tasks involving complex planning and reasoning (travel planning, traffic routing), to support for multilingual interactions. We examine eight key areas in which basic research is needed to produce spoken language systems: (1) robust speech recognition; (2) automatic training and adaptation; (3) spontaneous speech; (4) dialogue models; (5) natural language response generation; (6) speech synthesis and speech generation; (7) multilingual systems; and (8) interactive multimodal systems. In each area, we identify key research challenges, the infrastructure needed to support research, and the expected benefits. We conclude by reviewing the need for multidisciplinary research, for development of shared corpora and related resources, for computational support and for rapid communication among researchers. The successful development of this technology will increase accessibility of computers to a wide range of users, will facilitate multinational communication and trade, and will create new research specialties and jobs in this rapidly expanding area.

Collaboration


Dive into Nelson Morgan's collaborations.

Top Co-Authors

Ben Gold, Massachusetts Institute of Technology
Alan Gevins, Michigan State University
Chuck Wooters, International Computer Science Institute
John Wawrzynek, University of California
Krste Asanovic, University of California