Lori Lamel
Université Paris-Saclay
Publications
Featured research published by Lori Lamel.
Archive | 2000
Martine Adda-Decker; Lori Lamel
The lexicon plays a pivotal role in automatic speech recognition as it is the link between the acoustic-level representation and the word sequence output by the speech recognizer. The role of the lexicon can be considered as twofold: first, the lexicon specifies what words or lexical items are known by the system; second, the lexicon provides the means to build acoustic models for each entry. Lexical design thus entails two main parts: definition and selection of the vocabulary items and representation of each pronunciation entry using the basic acoustic units of the recognizer. For large vocabulary speech recognition, the vocabulary is usually selected to maximize lexical coverage for a given size lexicon, and the elementary units of choice are usually phonemes or phone-like units.
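The frequency-based vocabulary selection described above (maximizing lexical coverage for a fixed lexicon size) can be sketched in a few lines. This is an illustrative toy, not the authors' implementation; the corpus and helper names are invented for the example:

```python
from collections import Counter

def select_vocabulary(corpus_tokens, vocab_size):
    """Pick the vocab_size most frequent word types from a training corpus."""
    counts = Counter(corpus_tokens)
    return {w for w, _ in counts.most_common(vocab_size)}

def lexical_coverage(corpus_tokens, vocabulary):
    """Fraction of running words in the corpus covered by the vocabulary."""
    covered = sum(1 for w in corpus_tokens if w in vocabulary)
    return covered / len(corpus_tokens)

# Toy corpus: for a fixed vocabulary size, picking the most frequent
# word types maximizes coverage of the running text.
tokens = "the cat sat on the mat the cat ran".split()
vocab = select_vocabulary(tokens, 2)
print(sorted(vocab))                              # ['cat', 'the']
print(round(lexical_coverage(tokens, vocab), 2))  # 0.56  (5 of 9 tokens)
```

Real systems select from much larger corpora and trade coverage against lexicon size, but the selection criterion is the same.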
Procedia Computer Science | 2016
Rasa Lileikytė; Arseniy Gorin; Lori Lamel; Jean-Luc Gauvain; Thiago Fraga-Silva
Abstract This paper reports on experimental work to build a speech transcription system for Lithuanian broadcast data, relying on unsupervised and semi-supervised training methods as well as on other low-knowledge methods to compensate for missing resources. Unsupervised acoustic model training is investigated using 360 hours of untranscribed speech data. A graphemic pronunciation approach is used to simplify pronunciation model generation and therefore ease language model adaptation for the system users. Discriminative training on top of semi-supervised training is also investigated, as well as various types of acoustic features and their combinations. Experimental results are provided for each of our development steps, along with contrastive results comparing various options. Using the best system configuration, a word error rate of 18.3% is obtained on a set of development data from the Quaero program.
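The graphemic pronunciation approach mentioned above amounts to deriving each lexicon entry's pronunciation directly from its spelling, with no language-specific phonemic rules. A minimal sketch (the example words and the one-letter-per-unit treatment are illustrative assumptions; real systems also handle digraphs, casing, and diacritics):

```python
def graphemic_pronunciation(word):
    """Use the word's letter (grapheme) sequence as its pronunciation,
    avoiding the need for a hand-crafted phonemic dictionary."""
    return [ch for ch in word.lower() if ch.isalpha()]

def build_lexicon(words):
    """Map each vocabulary word to its graphemic pronunciation."""
    return {w: graphemic_pronunciation(w) for w in words}

lexicon = build_lexicon(["labas", "rytas"])
print(lexicon["labas"])   # ['l', 'a', 'b', 'a', 's']
```

Because new words get pronunciations automatically, updating the vocabulary (and hence the language model) requires no manual lexical work, which is the adaptation benefit the abstract refers to.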
Procedia Computer Science | 2016
Gilles Adda; Sebastian Stüker; Martine Adda-Decker; Odette Ambouroue; Laurent Besacier; David Blachon; Hélène Bonneau-Maynard; Pierre Godard; Fatima Hamlaoui; Dmitry Idiatov; Guy-Noël Kouarata; Lori Lamel; Emmanuel-Moselly Makasso; Annie Rialland; Mark Van de Velde; François Yvon; Sabine Zerbian
Abstract The project Breaking the Unwritten Language Barrier (BULB), which brings together linguists and computer scientists, aims at supporting linguists in documenting unwritten languages. In order to achieve this, we develop tools tailored to the needs of documentary linguists by building upon technology and expertise from the area of natural language processing, most prominently automatic speech recognition and machine translation. As a development and test bed for this we have chosen three less-resourced African languages from the Bantu family: Basaa, Myene and Embosi. Work within the project is divided into three main steps: 1) Collection of a large corpus of speech (100 h per language) at a reasonable cost. For this we use standard mobile devices and dedicated software, Lig-Aikuma. After initial recording, the data is re-spoken by a reference speaker to enhance the signal quality and orally translated into French. 2) Automatic transcription of the Bantu languages at phoneme level and the French translation at word level. The recognized Bantu phonemes and French words will then be automatically aligned. 3) Tool development. In close cooperation and discussion with the linguists, the speech and language technologists will design and implement tools that will support the linguists in their work, taking into account the linguists' needs and technology's capabilities.
Computer Speech & Language | 2018
Rasa Lileikytė; Lori Lamel; Jean-Luc Gauvain; Arseniy Gorin
Abstract The research presented in the paper addresses conversational telephone speech recognition and keyword spotting for the Lithuanian language. Lithuanian can be considered a low e-resourced language, as little transcribed audio data and, more generally, only limited linguistic resources are available electronically. Part of this research explores the impact of reducing the amount of linguistic knowledge and manual supervision when developing the transcription system. Since designing a pronunciation dictionary requires language-specific expertise, the need for manual supervision was assessed by comparing phonemic and graphemic units for acoustic modeling. Although the Lithuanian language is generally described in the linguistic literature with 56 phonemes, under low-resourced conditions some phonemes may not be sufficiently observed to be modeled. Therefore different phoneme inventories were explored to assess the effects of explicitly modeling diphthongs, affricates and soft consonants. The impact of using Web data for language modeling and additional untranscribed audio data for semi-supervised training was also measured. Out-of-vocabulary (OOV) keywords are a well-known challenge for keyword search. While word-based keyword search is quite effective for in-vocabulary words, OOV keywords are largely undetected. Morpheme-based subword units are compared with character n-gram-based units for their capacity to detect OOV keywords. Experimental results are reported for two training conditions defined in the IARPA Babel program: the full language pack and the very limited language pack, for which, respectively, 40 h and 3 h of transcribed training data are available. For both conditions, grapheme-based and phoneme-based models are shown to obtain comparable transcription and keyword spotting results. The use of Web texts for language modeling is shown to significantly improve both speech recognition and keyword spotting performance. Combining full-word and subword units leads to the best keyword spotting results.
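The character n-gram subword units compared above can be pictured as follows: an OOV keyword that a word-based system can never emit can still be matched piecewise through its overlapping character n-grams. A minimal sketch (the `#` boundary marker and n=3 are assumptions for illustration, not the paper's configuration):

```python
def char_ngrams(word, n=3):
    """Decompose a word into overlapping character n-grams with boundary
    markers, so an OOV keyword can be searched as a subword sequence."""
    padded = f"#{word}#"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

print(char_ngrams("keyword"))
# ['#ke', 'key', 'eyw', 'ywo', 'wor', 'ord', 'rd#']
```

Morpheme-based units play the same role but segment at linguistically motivated boundaries rather than fixed-length windows.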
international conference on acoustics, speech, and signal processing | 2017
Guangpu Huang; Thiago Fraga da Silva; Lori Lamel; Jean-Luc Gauvain; Arseniy Gorin; Antoine Laurent; Rasa Lileikyte; Abdel Messouadi
This paper reports on investigations using two techniques for language model text data augmentation for low-resourced automatic speech recognition and keyword search. Low-resourced languages are characterized by limited training materials, which typically results in high out-of-vocabulary (OOV) rates and poor language model estimates. One technique makes use of recurrent neural networks (RNNs) using word or subword units. Word-based RNNs keep the same system vocabulary, so they cannot reduce the OOV rate, whereas subword units can reduce the OOV rate but generate many false combinations. A complementary technique is based on automatic machine translation, which requires parallel texts and is able to add words to the vocabulary. These methods were assessed on 10 languages in the context of the Babel program and NIST OpenKWS evaluation. Although improvements vary across languages with both methods, small gains were generally observed in terms of word error rate reduction and improved keyword search performance.
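The link between limited training text and high OOV rates can be made concrete with a toy OOV-rate computation; augmentation methods that add words to the vocabulary (as the machine-translation-based technique does) directly lower it. A purely illustrative sketch with invented data:

```python
def oov_rate(test_tokens, vocabulary):
    """Percentage of running words in test data not covered by the vocabulary."""
    oov = sum(1 for w in test_tokens if w not in vocabulary)
    return 100.0 * oov / len(test_tokens)

vocab = {"the", "cat", "sat"}
test = "the dog sat on the mat".split()
print(oov_rate(test, vocab))                          # 50.0

# Adding words to the vocabulary (e.g. from augmentation texts) lowers the rate.
augmented = vocab | {"dog", "mat"}
print(round(oov_rate(test, augmented), 1))            # 16.7
```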
international conference on acoustics, speech, and signal processing | 2017
Rasa Lileikyte; Thiago Fraga-Silva; Lori Lamel; Jean-Luc Gauvain; Antoine Laurent; Guangpu Huang
In this paper we aim to enhance keyword search for conversational telephone speech under low-resourced conditions. Two techniques to improve the detection of out-of-vocabulary keywords are assessed in this study: using extra text resources to augment the lexicon and language model, and using subword units for keyword search. Two approaches for data augmentation are explored to extend the limited amount of transcribed conversational speech: using conversational-like Web data and texts generated by recurrent neural networks. Contrastive comparisons of subword-based systems are performed to evaluate the benefits of multiple subword decodings versus a single decoding. Keyword search results are reported for all the techniques, but only some improve performance. Results are reported for the Mongolian and Igbo languages using data from the 2016 Babel program.
Multilingual Speech Processing | 2006
Martine Adda-Decker; Lori Lamel
Publisher Summary: This chapter focuses on multilingual dictionaries for use in automatic speech recognition. It provides an overview of dictionary modeling and generation issues in the context of multilingual speech processing. For most automatic speech recognition systems, multilingual pronunciation dictionaries are still collections of monolingual dictionaries. For different languages, the proportion of imported words—that is, words shared with other languages—increases with vocabulary size. This chapter addresses the various steps in lexical development, including normalization, the choice of word items, the selection of a word list, and pronunciation generation. Tokenization and normalization were first addressed in the context of written sources, which often form the basis of language modeling material. Suitable units for dictionary modeling are discussed in the light of similarities and dissimilarities between languages. Dictionary generation techniques are illustrated, along with the pros and cons of automatic or semiautomatic procedures. In this chapter, only languages for which written resources are available are considered.
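As a rough illustration of the tokenization and normalization step discussed in the chapter, a minimal pass over written sources before word-list selection might look like the following. This is an assumption-laden toy, not the chapter's actual procedure; real pipelines also handle numbers, abbreviations, and language-specific casing:

```python
import re
import unicodedata

def normalize(text):
    """Toy normalization ahead of word-list selection:
    Unicode-normalize, lowercase, strip punctuation, split on whitespace."""
    text = unicodedata.normalize("NFC", text)
    text = text.lower()
    # Keep letters, digits, apostrophes and hyphens; other symbols become spaces.
    text = re.sub(r"[^\w\s'-]", " ", text)
    return text.split()

print(normalize("L'état, c'est moi!"))
# ["l'état", "c'est", 'moi']
```

Decisions made here (e.g. whether `l'état` is one token or two) directly shape the word list and hence the pronunciation dictionary, which is why the chapter treats normalization as part of lexical design.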
Archive | 2000
Lori Lamel; Jean-Luc Gauvain; Gilles Adda
Archive | 1997
Jean-Luc Gauvain; Gilles Adda; Lori Lamel; Martine Adda-Decker
Archive | 1997
Jean-Luc Gauvain; Lori Lamel; Gilles Adda; Michele Jardin