
Publication


Featured research published by R. De Mori.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 1990

A cache-based natural language model for speech recognition

R. Kuhn; R. De Mori

Speech-recognition systems must often decide between competing ways of breaking up the acoustic input into strings of words. Since the possible strings may be acoustically similar, a language model is required; given a word string, the model returns its linguistic probability. Several Markov language models are discussed. A novel kind of language model is presented which reflects short-term patterns of word use by means of a cache component (analogous to cache memory in hardware terminology). The model also contains a 3-gram component of the traditional type. The combined model and a pure 3-gram model were tested on samples drawn from the Lancaster-Oslo/Bergen (LOB) corpus of English text. The relative performance of the two models is examined, and suggestions for future improvements are made.
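
The abstract does not spell out how the cache and 3-gram components are combined. A minimal sketch in Python, assuming a simple linear interpolation of a fixed-size recency cache with an externally supplied trigram probability (the trigram_prob callable, cache size, and interpolation weight are illustrative assumptions, not taken from the paper):

from collections import Counter, deque

class CacheLanguageModel:
    """Toy cache + trigram interpolation (illustrative sketch only)."""

    def __init__(self, trigram_prob, cache_size=200, cache_weight=0.1):
        self.trigram_prob = trigram_prob       # callable: P(word | w1, w2)
        self.cache = deque(maxlen=cache_size)  # most recently observed words
        self.cache_weight = cache_weight       # interpolation weight for the cache

    def prob(self, word, w1, w2):
        # Cache component: relative frequency of `word` among recent words.
        counts = Counter(self.cache)
        p_cache = counts[word] / len(self.cache) if self.cache else 0.0
        # Static trigram component supplied by the caller.
        p_trigram = self.trigram_prob(word, w1, w2)
        # Linear interpolation of the two components.
        return self.cache_weight * p_cache + (1.0 - self.cache_weight) * p_trigram

    def observe(self, word):
        # Update the short-term cache after each recognized word.
        self.cache.append(word)

Because the cache adapts to recently used words, a word that has just occurred receives a higher probability than the static trigram alone would assign, which is the short-term effect the cache component is meant to capture.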


Speech Communication | 2007

Automatic speech recognition and speech variability: A review

M. Benzeghiba; R. De Mori; Olivier Deroo; Stéphane Dupont; T. Erbes; D. Jouvet; L. Fissore; Pietro Laface; Alfred Mertins; Christophe Ris; R. Rose; V. Tyagi; C. Wellekens

Major progress is being recorded regularly on both the technology and exploitation of automatic speech recognition (ASR) and spoken language systems. However, there are still technological barriers to flexible solutions and user satisfaction under some circumstances. This is related to several factors, such as sensitivity to the environment (background noise) or the weak representation of grammatical and semantic knowledge. Current research also emphasizes deficiencies in dealing with the variation naturally present in speech. For instance, a lack of robustness to foreign accents precludes use by specific populations. Also, some applications, like directory assistance, particularly stress the core recognition technology due to the very large active vocabulary (application perplexity). Many factors affect speech realization: regional, sociolinguistic, or related to the environment or the speaker herself. These create a wide range of variations that may not be modeled correctly (speaker, gender, speaking rate, vocal effort, regional accent, speaking style, non-stationarity, etc.), especially when resources for system training are scarce. This paper outlines current advances related to these topics.


IEEE Transactions on Neural Networks | 1992

Global optimization of a neural network-hidden Markov model hybrid

Yoshua Bengio; R. De Mori; G. Flammia; R. Kompe

The integration of multilayered and recurrent artificial neural networks (ANNs) with hidden Markov models (HMMs) is addressed. ANNs are suitable for approximating functions that compute new acoustic parameters, whereas HMMs have been proven successful at modeling the temporal structure of the speech signal. In the approach described, the ANN outputs constitute the sequence of observation vectors for the HMM. An algorithm is proposed for global optimization of all the parameters. Results on speaker-independent recognition experiments using this integrated ANN-HMM system on the TIMIT continuous speech database are reported.
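
As a rough illustration of the coupling (not the authors' implementation), the sketch below pushes acoustic frames through a small feed-forward transform and scores the resulting observation vectors with the log-domain forward algorithm of a diagonal-Gaussian HMM; global optimization would then differentiate this joint log-likelihood with respect to both the network weights and the HMM parameters. All dimensions, parameter values, and data are invented.

import numpy as np

rng = np.random.default_rng(0)

def ann_forward(frames, W1, b1, W2, b2):
    """ANN stage: map raw frames (T, d_in) to observation vectors (T, d_obs)."""
    hidden = np.tanh(frames @ W1 + b1)
    return hidden @ W2 + b2

def log_gaussian(obs, means, log_vars):
    """Per-state diagonal-Gaussian log-likelihood of each frame, shape (T, S)."""
    diff = obs[:, None, :] - means[None, :, :]
    return -0.5 * np.sum(diff**2 / np.exp(log_vars) + log_vars + np.log(2 * np.pi), axis=-1)

def hmm_log_likelihood(obs, log_pi, log_A, means, log_vars):
    """HMM stage: forward algorithm in the log domain over the ANN outputs."""
    log_b = log_gaussian(obs, means, log_vars)
    alpha = log_pi + log_b[0]
    for t in range(1, obs.shape[0]):
        alpha = log_b[t] + np.logaddexp.reduce(alpha[:, None] + log_A, axis=0)
    return np.logaddexp.reduce(alpha)

# Tiny made-up setup: 30 frames, 12-dim input, 8-dim observations, 3 HMM states.
T, d_in, d_h, d_obs, S = 30, 12, 16, 8, 3
frames = rng.normal(size=(T, d_in))
W1, b1 = 0.1 * rng.normal(size=(d_in, d_h)), np.zeros(d_h)
W2, b2 = 0.1 * rng.normal(size=(d_h, d_obs)), np.zeros(d_obs)
log_pi = np.log(np.full(S, 1.0 / S))
log_A = np.log(np.full((S, S), 1.0 / S))
means, log_vars = rng.normal(size=(S, d_obs)), np.zeros((S, d_obs))

obs = ann_forward(frames, W1, b1, W2, b2)
print("joint log-likelihood:", hmm_log_likelihood(obs, log_pi, log_A, means, log_vars))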


IEEE Transactions on Pattern Analysis and Machine Intelligence | 1995

The application of semantic classification trees to natural language understanding

R. Kuhn; R. De Mori

This article describes a new method for building a natural language understanding (NLU) system in which the system's rules are learned automatically from training data. The method has been applied to the design of a speech understanding (SU) system. Designers of such systems rely increasingly on robust matchers to perform the task of extracting meaning from one or several word sequence hypotheses generated by a speech recognizer. We describe a new data structure, the semantic classification tree (SCT), that learns semantic rules from training data and can be a building block for robust matchers for NLU tasks. By reducing the need for hand-coding and debugging a large number of rules, this approach facilitates rapid construction of an NLU system. In the case of an SU system, the rules learned by an SCT are highly resistant to errors by the speaker or by the speech recognizer because they depend on a small number of words in each utterance. Our work shows that semantic rules can be learned automatically from training data, yielding successful NLU for a realistic application.
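
The SCT itself asks specialized questions about word patterns in an utterance; as a loose analogue only, the sketch below trains an ordinary decision tree on word-presence features over a tiny invented corpus, so that the learned splits play the role of automatically acquired semantic rules. The utterances, labels, and the use of scikit-learn are illustrative assumptions, not the paper's method.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.tree import DecisionTreeClassifier, export_text

# Tiny invented corpus with invented semantic labels.
utterances = [
    "show me flights from boston to denver",
    "i want to fly to denver tomorrow",
    "what is the cheapest fare to boston",
    "how much does the flight cost",
]
labels = ["FLIGHT", "FLIGHT", "FARE", "FARE"]

vectorizer = CountVectorizer(binary=True)        # word-presence features
X = vectorizer.fit_transform(utterances)

tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X, labels)

# The learned splits stand in for hand-written semantic rules.
print(export_text(tree, feature_names=list(vectorizer.get_feature_names_out())))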


IEEE Signal Processing Magazine | 2008

Spoken language understanding

R. De Mori; Frédéric Béchet; Dilek Hakkani-Tür; Michael F. McTear; Giuseppe Riccardi; Gokhan Tur

Semantics deals with the organization of meanings and the relations between sensory signs or symbols and what they denote or mean. Computational semantics performs a conceptualization of the world using computational processes for composing a meaning representation structure from available signs and their features present, for example, in words and sentences. Spoken language understanding (SLU) is the interpretation of signs conveyed by a speech signal. SLU and natural language understanding (NLU) share the goal of obtaining a conceptual representation of natural language sentences. Specific to SLU is the fact that signs to be used for interpretation are coded into signals along with other information such as speaker identity. Furthermore, spoken sentences often do not follow the grammar of a language; they exhibit self-corrections, hesitations, repetitions, and other irregular phenomena. SLU systems contain an automatic speech recognition (ASR) component and must be robust to noise due to the spontaneous nature of spoken language and the errors introduced by ASR. Moreover, ASR components output a stream of words with no structure information like punctuation and sentence boundaries. Therefore, SLU systems cannot rely on such markers and must perform text segmentation and understanding at the same time.


IEEE Transactions on Audio, Speech, and Language Processing | 2011

Comparing Stochastic Approaches to Spoken Language Understanding in Multiple Languages

Stefan Hahn; Marco Dinarelli; Christian Raymond; Fabrice Lefèvre; Patrick Lehnen; R. De Mori; Alessandro Moschitti; Hermann Ney; Giuseppe Riccardi

One of the first steps in building a spoken language understanding (SLU) module for dialogue systems is the extraction of flat concepts out of a given word sequence, usually provided by an automatic speech recognition (ASR) system. In this paper, six different modeling approaches are investigated to tackle the task of concept tagging. These methods include classical, well-known generative and discriminative methods such as Finite State Transducers (FSTs), Statistical Machine Translation (SMT), Maximum Entropy Markov Models (MEMMs), and Support Vector Machines (SVMs), as well as techniques recently applied to natural language processing such as Conditional Random Fields (CRFs) and Dynamic Bayesian Networks (DBNs). Following a detailed description of the models, experimental and comparative results are presented on three corpora in different languages and of different complexity. The French MEDIA corpus has already been exploited during an evaluation campaign, so a direct comparison with existing benchmarks is possible. Recently collected Italian and Polish corpora are used to test the robustness and portability of the modeling approaches. For all tasks, manual transcriptions as well as ASR inputs are considered. In addition to single systems, methods for system combination are investigated. The best performing model on all tasks is based on conditional random fields. On the MEDIA evaluation corpus, a concept error rate of 12.6% could be achieved. Here, in addition to attribute names, attribute values have been extracted using a combination of a rule-based and a statistical approach. Applying system combination using weighted ROVER with all six systems, the concept error rate (CER) drops to 12.0%.
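
The concept error rate quoted above is an edit-distance measure over concept sequences. A minimal sketch of such a computation on invented concept names (these do not reproduce the MEDIA annotation scheme):

def concept_error_rate(reference, hypothesis):
    """(substitutions + insertions + deletions) / number of reference concepts."""
    n, m = len(reference), len(hypothesis)
    # dp[i][j]: minimum edits turning reference[:i] into hypothesis[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i
    for j in range(m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution or match
    return dp[n][m] / n

ref = ["command-task", "hotel-city", "nb-room", "date-arrival"]
hyp = ["command-task", "hotel-city", "date-arrival"]
print(concept_error_rate(ref, hyp))  # one deletion out of four concepts -> 0.25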


IEEE Software | 1995

Reengineering user interfaces

Ettore Merlo; P.-Y. Gagne; J.F. Girard; Kostas Kontogiannis; Laurie J. Hendren; Prakash Panangaden; R. De Mori

Most developers would like to avoid redesigning a system around a new interface. But turning a character-based interface into a graphical one requires significant time and resources. The authors describe how this process can be partially automated, giving the results of their own reverse-engineering effort.


IBM Systems Journal | 1994

Investigating reverse engineering technologies for the CAS program understanding project

Erich B. Buss; R. De Mori; W. M. Gentleman; J. Henshaw; H. Johnson; Kostas Kontogiannis; Ettore Merlo; Hausi A. Müller; John Mylopoulos; S. Paul; A. Prakash; Martin Stanley; Scott R. Tilley; J. Troster; Kenny Wong

Corporations face mounting maintenance and re-engineering costs for large legacy systems. Evolving over several years, these systems embody substantial corporate knowledge, including requirements, design decisions, and business rules. Such knowledge is difficult to recover after many years of operation, evolution, and personnel change. To address the problem of program understanding, software engineers are spending an ever-growing amount of effort on reverse engineering technologies. This paper describes the scope and results of an ongoing research project on program understanding undertaken by the IBM Toronto Software Solutions Laboratory Centre for Advanced Studies (CAS). The project involves a team from CAS and five research groups working cooperatively on complementary reverse engineering approaches. All the groups are using the source code of SQL/DS™ (a multimillion-line relational database system) as the reference legacy system. Also discussed is an approach adopted to integrate the various tools under a single reverse engineering environment.


IEEE Automatic Speech Recognition and Understanding Workshop | 2007

Spoken language understanding: a survey

R. De Mori

A survey of research on spoken language understanding is presented. It covers aspects of knowledge representation, automatic interpretation strategies, semantic grammars, conceptual language models, semantic event detection, shallow semantic parsing, semantic classification, semantic confidence, and active learning.


IEEE Transactions on Acoustics, Speech, and Signal Processing | 1976

Automatic detection and description of syllabic features in continuous speech

R. De Mori; Pietro Laface; Elio Piccolo

The details of the implementation of a syntax-controlled acoustic encoder of a speech understanding system (SUS) are presented. Finite-state automata operating on artificial descriptions of suprasegmentals and global spectral features isolate syllables in continuous speech. Then a combinational algorithm tracks the formants for the voiced intervals of each syllable, and other algorithms provide a complete structural description of spectral and prosodic features for a spoken sentence. Such a description consists of a string of symbols and numerical attributes and is a representation of speech in terms of perceptually significant primitive forms. It contains all the information required to reconstruct the analyzed sentence with a formant synthesizer; it can be used directly for emitting or verifying hypotheses at the lexical level of an SUS and for automatically learning phonetic features by grammatical inference.
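
As a toy illustration of finite-state processing over symbolic feature descriptions (not the automata used in the paper), the snippet below scans an invented per-frame symbol string, with S for silence, C for low-energy consonantal frames, and V for voiced high-energy frames, and returns the frame spans of maximal voiced runs as candidate syllable nuclei; the regular expression stands in for a compiled finite automaton.

import re

def find_syllable_nuclei(frame_symbols):
    """Return (start, end) frame indices of maximal runs of voiced frames 'V'."""
    return [(m.start(), m.end() - 1) for m in re.finditer(r"V+", frame_symbols)]

# Invented frame-level labeling of a short stretch of speech.
print(find_syllable_nuclei("SSCVVVCCVVSSCVVVC"))  # -> [(3, 5), (8, 9), (13, 15)]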

Collaboration


Dive into R. De Mori's collaboration.

Top Co-Authors

Yoshua Bengio
Université de Montréal

Anna Corazza
University of Naples Federico II

Ettore Merlo
École Polytechnique de Montréal

Pietro Laface
Polytechnic University of Turin