
Publications


Featured research published by Bernard Merialdo.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 1986

Natural Language Modeling for Phoneme-to-Text Transcription

Anne-Marie Derouault; Bernard Merialdo

This paper compares different kinds of language modeling methods that can be applied to the linguistic decoding component of a speech recognition system with a very large vocabulary. These models are studied experimentally on pseudophonetic input arising from French stenotypy. We propose a model which combines the advantages of statistical modeling with information-theoretic tools, and those of a grammatical approach.


International Conference on Acoustics, Speech, and Signal Processing | 1991

Tagging text with a probabilistic model

Bernard Merialdo

Experiments on the use of a probabilistic model to tag English text, that is, to assign to each word the correct tag (part of speech) in the context of the sentence, are presented. A simple triclass Markov model is used, and the best way to estimate its parameters, depending on the kind and amount of training data provided, is investigated. Two approaches are compared: using text that has been tagged by hand and computing relative-frequency counts, and using untagged text and training the model as a hidden Markov process according to a maximum-likelihood principle. Experiments show that the best training is obtained by using as much tagged text as is available, and that maximum-likelihood training may further improve tagging accuracy.
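The first of the two estimation strategies in this abstract, relative-frequency counting over hand-tagged text, can be sketched as follows. This is an illustrative simplification: the paper uses a triclass (tag-trigram) model, while this sketch uses tag bigrams, and the toy corpus is invented.

```python
from collections import Counter

def estimate_hmm(tagged_sentences):
    """Relative-frequency estimates for a tagging HMM from hand-tagged text.

    tagged_sentences: list of [(word, tag), ...] lists.
    Returns transition probs P(tag | prev_tag) and emission probs P(word | tag).
    """
    trans, emit = Counter(), Counter()
    prev_count, tag_count = Counter(), Counter()
    for sent in tagged_sentences:
        prev = "<s>"  # sentence-start pseudo-tag
        for word, tag in sent:
            trans[(prev, tag)] += 1
            prev_count[prev] += 1
            emit[(tag, word)] += 1
            tag_count[tag] += 1
            prev = tag
    p_trans = {k: v / prev_count[k[0]] for k, v in trans.items()}
    p_emit = {k: v / tag_count[k[0]] for k, v in emit.items()}
    return p_trans, p_emit

# Toy hand-tagged corpus (invented).
corpus = [[("the", "DET"), ("dog", "N"), ("runs", "V")],
          [("the", "DET"), ("cat", "N"), ("sleeps", "V")]]
p_trans, p_emit = estimate_hmm(corpus)
# P(N | DET) = 1.0, P("dog" | N) = 0.5
```

The second strategy (unsupervised maximum-likelihood training on untagged text) would replace these closed-form counts with Baum-Welch re-estimation over the hidden tag sequence.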


International Conference on Acoustics, Speech, and Signal Processing | 1988

Phonetic recognition using hidden Markov models and maximum mutual information training

Bernard Merialdo

The application of maximum-mutual-information (MMI) training to hidden Markov models (HMMs) is studied for phonetic recognition. MMI training has been proposed as an alternative to standard maximum-likelihood (ML) training. In practice, MMI training performs better (produces models that are more accurate) than ML training. The fundamental notions of HMM, ML and MMI training are reviewed, and it is shown how MMI training can be applied easily to the case of phonetic models and phonetic recognition. Some computational heuristics are proposed to implement these computations practically. Some experiments (training and recognition) are detailed that show that the phonetic error rate decreases significantly when MMI training is used, as compared with ML training.
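The contrast between the two training criteria can be stated compactly. In standard notation (not the paper's own), with observation sequences $O_i$, correct transcriptions $w_i$, and HMM parameters $\lambda$:

```latex
% Maximum-likelihood training: maximize the likelihood of each
% observation sequence given its correct transcription.
\lambda_{\mathrm{ML}} = \arg\max_{\lambda} \sum_i \log P_{\lambda}(O_i \mid w_i)

% Maximum-mutual-information training: maximize the posterior of the
% correct transcription against all competing transcriptions v.
\lambda_{\mathrm{MMI}} = \arg\max_{\lambda} \sum_i \log
  \frac{P_{\lambda}(O_i \mid w_i)\, P(w_i)}
       {\sum_{v} P_{\lambda}(O_i \mid v)\, P(v)}
```

ML raises the score of the correct model only, whereas the MMI denominator also pushes down the scores of competing models, which is why it is called discriminative.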


International Conference on Acoustics, Speech, and Signal Processing | 1991

Automatic phonetic baseform determination

Lalit R. Bahl; Subhro Das; Peter Vincent Desouza; Mark E. Epstein; Robert L. Mercer; Bernard Merialdo; David Nahamoo; Michael Picheny; J. Powell

The authors describe a series of experiments in which the phonetic baseform is deduced automatically for new words by utilizing actual utterances of the new word in conjunction with a set of automatically derived spelling-to-sound rules. Recognition performance was evaluated on new words spoken by two different speakers when the phonetic baseforms were extracted via the above approach. The error rates on these new words were found to be comparable to or better than when the phonetic baseforms were derived by hand, thus validating the basic approach.


IBM Journal of Research and Development | 1988

Multilevel decoding for very-large-size-dictionary speech recognition

Bernard Merialdo

An important concern in the field of speech recognition is the size of the vocabulary that a recognition system is able to support. Large vocabularies introduce difficulties involving the amount of computation the system must perform and the number of ambiguities it must resolve. But, for practical applications in general and for dictation tasks in particular, large vocabularies are required, because of the difficulties and inconveniences involved in restricting the speaker to the use of a limited vocabulary. This paper describes a new organization of the recognition process, Multilevel Decoding (MLD), that allows the system to support a Very-Large-Size Dictionary (VLSD)—one comprising over 100,000 words. This significantly surpasses the capacity of previous speech-recognition systems. With MLD, the effect of dictionary size on the accuracy of recognition can be studied. In this paper, recognition experiments using 10,000- and 200,000-word dictionaries are compared. They indicate that recognition using a 200,000-word dictionary is more accurate than recognition using a 10,000-word dictionary (when unrecognized words are included in the error rate).


International Conference on Acoustics, Speech, and Signal Processing | 1987

Speech recognition with very large size dictionary

Bernard Merialdo

This paper proposes a new strategy, Multi-Level Decoding (MLD), that allows the use of a Very Large Size Dictionary (VLSD, more than 100,000 words) in speech recognition. MLD proceeds in three steps:

- a Syllable Match procedure uses an acoustic model to build a list of the most probable syllables that match the acoustic signal from a given time frame;
- from this list, a Word Match procedure uses the dictionary to build partial word hypotheses;
- a Sentence Match procedure then uses a probabilistic language model to build partial sentence hypotheses until complete sentences are found.

An original matching algorithm is proposed for the Syllable Match procedure. This strategy is tested on a dictation task of French texts. Two different dictionaries are tested: one composed of the 10,000 most frequent words, and the other composed of 200,000 words. The recognition results are given and compared. The word error rate with 10,000 words is 17.3%; if errors due to the lack of coverage are not counted, it is reduced to 10.6%. The error rate with 200,000 words is 12.7%.
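The three-stage pipeline described above can be sketched on toy data. Everything here is invented for illustration: the syllable lattice, the dictionary, the language-model scores, and the two-frame restriction are not from the paper.

```python
import math

# Stage 1: Syllable Match -- per-frame candidate syllables with
# acoustic log-scores (invented numbers).
syllable_lattice = [
    {"bon": -0.2, "pon": -1.5},   # frame 0
    {"jour": -0.3, "tour": -1.0}, # frame 1
]

# Stage 2: Word Match -- dictionary maps syllable sequences to words.
dictionary = {("bon", "jour"): "bonjour", ("pon", "tour"): "pontour"}

# Stage 3: Sentence Match -- a (here trivial, unigram) language model.
lm = {"bonjour": -1.0, "pontour": -8.0}

def decode(lattice):
    """Combine acoustic and language-model log-scores over all
    dictionary-licensed syllable pairs; return the best word."""
    best, best_score = None, -math.inf
    for s0, a0 in lattice[0].items():
        for s1, a1 in lattice[1].items():
            word = dictionary.get((s0, s1))
            if word is None:
                continue  # syllable pair not licensed by the dictionary
            score = a0 + a1 + lm[word]
            if score > best_score:
                best, best_score = word, score
    return best

print(decode(syllable_lattice))  # -> bonjour
```

The point of the layering is that the acoustic model never scores 200,000 words directly: it scores a much smaller syllable inventory, and the dictionary and language model prune from there.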


International Conference on Acoustics, Speech, and Signal Processing | 1985

Probabilistic grammar for phonetic to French transcription

Anne-Marie Derouault; Bernard Merialdo

In this paper, we study the combination of an information-theoretic tool (Markov modeling of natural language [3]) with probabilistic grammatical analysis. Continuous speech recognition for natural language raises many difficulties, both for the acoustic processing and for the linguistic decoding. Our work specifically concerns linguistic decoding techniques for a very large (140,000-entry) French dictionary and open oral discourse. The task is thus to transcribe a continuous string of pseudo-phonemes into written text; this string would ideally be the output of a perfect acoustic processor. We present a grammar designed for automatic transcription and compute probabilities for its rules. We compare its results with those obtained earlier with Markov modeling, and show that it is possible to combine the two approaches and obtain better results than with each model separately.
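The abstract does not say how the two models are combined, so as one plausible illustration, here is simple linear interpolation of the two probability estimates; the weight and the probabilities are invented.

```python
def interpolate(p_markov, p_grammar, lam=0.6):
    """Linear interpolation of two word-probability estimates.

    lam weights the Markov model; (1 - lam) weights the grammar.
    Both the weight and the input probabilities are illustrative,
    not values from the paper.
    """
    return lam * p_markov + (1 - lam) * p_grammar

# A word the Markov model finds unlikely can be rescued by the grammar:
p = interpolate(0.02, 0.10)
# 0.6 * 0.02 + 0.4 * 0.10 = 0.052
```

In practice the interpolation weight would itself be estimated on held-out data rather than fixed by hand.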


Archive | 1999

Hidden Markov Models

Marc El-Beze; Bernard Merialdo

In the previous chapters we have seen how the context of a word in a sentence is used to help identify the proper tag for that word. It is clear that such consultation of the context is necessary if we want the tagging to reach an acceptable level of correctness. It is not clear, however, that the mechanism used for this consultation should be rule-based.


International Conference on Acoustics, Speech, and Signal Processing | 1986

Phoneme classification using Markov models

Bernard Merialdo; Anne-Marie Derouault; S. Soudoplatoff

An approach for supporting large vocabulary in speech recognition is to use broad phonetic classes to reduce the search to a subset of the dictionary. In this paper, we investigate the problem of defining an optimal classification for a given speech decoder, so that these broad phonetic classes are recognized as accurately as possible from the speech signal. More precisely, given Hidden Markov Models of phonemes, we define a similarity measure of the phonetic machines, and use a standard classification algorithm to find the optimal classification. Three measures are proposed, and compared with manual classifications.
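The paper defines similarity measures on the phoneme HMMs themselves; as a heavily simplified stand-in, the sketch below reduces each phoneme to a single discrete emission distribution, measures similarity as distributional overlap, and groups phonemes greedily. The phoneme set, distributions, and threshold are all invented.

```python
def overlap(p, q):
    """Similarity of two discrete distributions: sum of pointwise minima
    (1.0 for identical distributions, 0.0 for disjoint support)."""
    keys = set(p) | set(q)
    return sum(min(p.get(k, 0.0), q.get(k, 0.0)) for k in keys)

def classify(models, threshold=0.5):
    """Greedy grouping: a phoneme joins the first class whose first
    member it overlaps with by at least `threshold`."""
    classes = []
    for name, dist in models.items():
        for cls in classes:
            if overlap(models[cls[0]], dist) >= threshold:
                cls.append(name)
                break
        else:
            classes.append([name])
    return classes

# Toy emission distributions over acoustic labels (invented numbers).
models = {
    "p": {"a1": 0.7, "a2": 0.3},
    "b": {"a1": 0.6, "a2": 0.4},   # close to "p" -> same broad class
    "s": {"a3": 0.9, "a4": 0.1},   # disjoint support -> its own class
}
print(classify(models))  # -> [['p', 'b'], ['s']]
```

A real implementation following the paper would compare whole HMMs (transitions included) and use a proper clustering algorithm, but the shape of the computation is the same: a pairwise similarity feeding a partition of the phoneme set.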


Proc. of the NATO Advanced Study Institute on Pattern Recognition Theory and Applications | 1987

Speech recognition experiment with 10,000 words dictionary

Helene Cerf-Danon; Anne-Marie Derouault; Marc El-Beze; Bernard Merialdo; Serge Soudoplatoff

Important progress has been achieved in speech recognition during the last ten years. Some small recognition tasks, such as vocal commands, can now be accomplished, and more fundamental research involves the study and design of large-vocabulary recognition. Research on automatic dictation is becoming very active, and recent systems have shown very good performance. A key factor in the development of such listening typewriters is the ability to support a Large Size Dictionary (LSD, several thousand words) or even a Very Large Size Dictionary (VLSD, several hundred thousand words), because any restriction on the vocabulary is a restriction on potential users. This is even more important for inflected languages, such as French, because of the number of different forms for each lemma (on average: 2.2 for English, 5 for German, 7 for French).
