
Publication


Featured research published by Giuseppe Riccardi.


Speech Communication | 1999

Grammar fragment acquisition using syntactic and semantic clustering

Kazuhiro Arai; Jeremy H. Wright; Giuseppe Riccardi; Allen L. Gorin

A method and apparatus are provided for automatically acquiring grammar fragments for recognizing and understanding fluently spoken language. Grammar fragments representing a set of syntactically and semantically similar phrases may be generated using three probability distributions: of succeeding words, of preceding words, and of associated call-types. The similarity between phrases may be measured by applying Kullback-Leibler distance to these three probability distributions. Phrases being close in all three distances may be clustered into a grammar fragment.
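The clustering idea above can be sketched in a few lines: two phrases are candidates for the same grammar fragment when their three context distributions (preceding words, succeeding words, associated call types) are all close under a KL-based distance. This is a minimal illustration, not the paper's implementation; the epsilon smoothing and the single shared threshold are assumptions of the sketch.

```python
import math

def kl(p, q, eps=1e-9):
    """Kullback-Leibler divergence D(p || q) over dict-valued distributions,
    with a small epsilon standing in for unseen events."""
    keys = set(p) | set(q)
    return sum(p.get(k, eps) * math.log(p.get(k, eps) / q.get(k, eps))
               for k in keys)

def sym_kl(p, q):
    """Symmetrized KL, a common distance surrogate for clustering."""
    return kl(p, q) + kl(q, p)

def similar(phrase_a, phrase_b, threshold=1.0):
    """Two phrases may be clustered into one grammar fragment when all three
    context distributions are close in symmetrized KL distance."""
    return all(
        sym_kl(phrase_a[d], phrase_b[d]) < threshold
        for d in ("preceding", "succeeding", "call_types")
    )
```

Requiring closeness in all three distances at once is what keeps syntactically similar but semantically different phrases (same contexts, different call types) out of the same fragment.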


IEEE Automatic Speech Recognition and Understanding Workshop | 2001

Computing consensus translation from multiple machine translation systems

B. Bangalore; Germán Bordel; Giuseppe Riccardi

We address the problem of computing a consensus translation given the outputs from a set of machine translation (MT) systems. The translations from the MT systems are aligned with a multiple string alignment algorithm and the consensus translation is then computed. We describe the multiple string alignment algorithm and the consensus MT hypothesis computation. We report on the subjective and objective performance of the multilingual acquisition approach on a limited domain spoken language application. We evaluate five domain-independent off-the-shelf MT systems and show that the consensus-based translation performance is equal to or better than any of the given MT systems, in terms of both objective and subjective measures.
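Once the MT outputs have been aligned, the consensus step reduces to a column-wise vote. The sketch below assumes the multiple string alignment has already been done and that gaps are marked with an explicit placeholder token (the `GAP` symbol is an assumption of this sketch, not the paper's notation).

```python
from collections import Counter

GAP = "<eps>"  # hypothetical marker for an alignment gap

def consensus(aligned_hypotheses):
    """Column-wise majority vote over pre-aligned MT outputs.
    Each hypothesis is a token list of equal length, with GAP marking
    insertions/deletions introduced by the multiple string alignment."""
    out = []
    for column in zip(*aligned_hypotheses):
        token, _ = Counter(column).most_common(1)[0]
        if token != GAP:
            out.append(token)
    return " ".join(out)
```

The vote explains why the consensus can beat every individual system: an error made by one system in a column is outvoted as long as the majority agrees.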


IEEE Computer | 2002

Automated natural spoken dialog

Allen L. Gorin; Alicia Abella; Giuseppe Riccardi; Jeremy H. Wright

The next generation of voice-based interface technology will enable easy-to-use automation of new and existing communication services, making human-machine interaction more natural.


IEEE Transactions on Speech and Audio Processing | 2005

Active learning: theory and applications to automatic speech recognition

Giuseppe Riccardi; Dilek Hakkani-Tür

We are interested in the problem of adaptive learning in the context of automatic speech recognition (ASR). In this paper, we propose an active learning algorithm for ASR. Automatic speech recognition systems are trained using human supervision to provide transcriptions of speech utterances. The goal of active learning is to minimize this human supervision for training acoustic and language models and to maximize performance given the transcribed and untranscribed data. Active learning aims at reducing the number of training examples to be labeled by automatically processing the unlabeled examples and then selecting the most informative ones, with respect to a given cost function, for a human to label. In this paper we describe how to estimate a confidence score for each utterance through an on-line algorithm using the lattice output of a speech recognizer. The utterance scores are filtered through an informativeness function and an optimal subset of training samples is selected. The active learning algorithm has been applied in both batch and on-line learning schemes, and we have experimented with different selective sampling algorithms. Our experiments show that by using active learning the amount of labeled data needed for a given word accuracy can be reduced by more than 60% with respect to random sampling.
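The selection step described above can be sketched as a one-liner: rank the unlabeled pool by confidence and keep the least-confident utterances. The `confidence_of` callable here is a hypothetical stand-in for the lattice-derived score the paper estimates; the sketch only illustrates the selective-sampling shape, not the informativeness function itself.

```python
def select_informative(pool, confidence_of, k):
    """Selective sampling for active learning: rank unlabeled utterances by
    the recognizer's confidence (a caller-supplied function standing in for
    a lattice-based score) and return the k least-confident ones, i.e. the
    most informative for a human to transcribe."""
    return sorted(pool, key=confidence_of)[:k]
```

A random-sampling baseline would instead draw k utterances uniformly, which is the comparison behind the reported 60% reduction in labeling effort.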


Conference on Security, Steganography, and Watermarking of Multimedia Contents | 2005

Natural Language Watermarking

Mikhail Atallah; Srinivas Bangalore; Dilek Hakkani-Tür; Giuseppe Riccardi; Mercan Topkara; Umut Topkara

In this paper we discuss natural language watermarking, which uses the structure of the sentence constituents in natural language text in order to insert a watermark. This approach is different from techniques, collectively referred to as “text watermarking,” which embed information by modifying the appearance of text elements, such as lines, words, or characters. We provide a survey of the current state of the art in natural language watermarking and introduce terminology, techniques, and tools for text processing. We also examine the parallels and differences of the two watermarking domains and outline how techniques from the image watermarking domain may be applicable to the natural language watermarking domain.


Empirical Methods in Natural Language Processing | 2009

Convolution Kernels on Constituent, Dependency and Sequential Structures for Relation Extraction

Truc-Vien T. Nguyen; Alessandro Moschitti; Giuseppe Riccardi

This paper explores the use of innovative kernels based on syntactic and semantic structures for a target relation extraction task. Syntax is derived from constituent and dependency parse trees, whereas semantics concerns entity types and lexical sequences. We investigate the effectiveness of such representations in automated relation extraction from text. We process the above data by means of Support Vector Machines along with the syntactic tree, the partial tree and the word sequence kernels. Our study on the ACE 2004 corpus illustrates that the combination of the above kernels achieves high effectiveness and significantly improves the current state of the art.
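The kernel-combination idea rests on a simple algebraic fact: positive-weighted sums of kernels are again valid kernels, so tree and sequence similarities can be merged into one function for the SVM. The sketch below uses a toy n-gram count kernel in place of the paper's tree and word-sequence kernels; the function names and weights are illustrative assumptions.

```python
from collections import Counter

def ngram_kernel(x, y, n=2):
    """Toy word-sequence kernel: inner product of n-gram count vectors.
    A simplified stand-in for the sequence and tree kernels in the paper."""
    cx = Counter(tuple(x[i:i + n]) for i in range(len(x) - n + 1))
    cy = Counter(tuple(y[i:i + n]) for i in range(len(y) - n + 1))
    return sum(cx[g] * cy[g] for g in cx.keys() & cy.keys())

def combined_kernel(kernels, weights):
    """Positive-weighted sums of kernels are valid kernels, so syntactic and
    sequential representations can be combined and handed to an SVM as a
    single (e.g. precomputed) kernel."""
    def k(x, y):
        return sum(w * kern(x, y) for w, kern in zip(weights, kernels))
    return k
```

In practice the combined kernel would be evaluated over all training pairs to build a Gram matrix for an SVM with a precomputed kernel.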


International Conference on Acoustics, Speech, and Signal Processing | 2002

Active learning for automatic speech recognition

Dilek Hakkani-Tür; Giuseppe Riccardi; Allen L. Gorin

State-of-the-art speech recognition systems are trained using transcribed utterances, preparation of which is labor intensive and time-consuming. In this paper, we describe a new method for reducing the transcription effort for training in automatic speech recognition (ASR). Active learning aims at reducing the number of training examples to be labeled by automatically processing the unlabeled examples, and then selecting the most informative ones with respect to a given cost function for a human to label. We automatically estimate a confidence score for each word of the utterance, exploiting the lattice output of a speech recognizer, which was trained on a small set of transcribed data. We compute utterance confidence scores based on these word confidence scores, then selectively sample the utterances to be transcribed using the utterance confidence scores. In our experiments, we show that we reduce the amount of labeled data needed for a given word accuracy by 27%.
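The word-to-utterance aggregation described above can be sketched as follows. The mean is used here as one simple aggregation choice, and the threshold value is an assumption of the sketch; the paper's actual scoring derives word confidences from the recognizer's lattice.

```python
def utterance_confidence(word_scores):
    """Aggregate per-word confidence scores (e.g. lattice posteriors) into a
    single utterance score; the mean is one simple choice."""
    return sum(word_scores) / len(word_scores)

def to_transcribe(scored_utterances, threshold=0.7):
    """Selective sampling: route utterances whose confidence falls below
    the threshold to a human transcriber; keep the rest untranscribed."""
    return [u for u, scores in scored_utterances
            if utterance_confidence(scores) < threshold]
```

Lowering the threshold trades transcription effort against how much informative data reaches the human annotator.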


IEEE Signal Processing Magazine | 2008

Spoken language understanding

R. De Mori; Frédéric Béchet; Dilek Hakkani-Tür; Michael F. McTear; Giuseppe Riccardi; Gokhan Tur

Semantics deals with the organization of meanings and the relations between sensory signs or symbols and what they denote or mean. Computational semantics performs a conceptualization of the world using computational processes for composing a meaning representation structure from available signs and their features present, for example, in words and sentences. Spoken language understanding (SLU) is the interpretation of signs conveyed by a speech signal. SLU and natural language understanding (NLU) share the goal of obtaining a conceptual representation of natural language sentences. Specific to SLU is the fact that signs to be used for interpretation are coded into signals along with other information such as speaker identity. Furthermore, spoken sentences often do not follow the grammar of a language; they exhibit self-corrections, hesitations, repetitions, and other irregular phenomena. SLU systems contain an automatic speech recognition (ASR) component and must be robust to noise due to the spontaneous nature of spoken language and the errors introduced by ASR. Moreover, ASR components output a stream of words with no structure information like punctuation and sentence boundaries. Therefore, SLU systems cannot rely on such markers and must perform text segmentation and understanding at the same time.


Computer Speech & Language | 1996

Stochastic automata for language modeling

Giuseppe Riccardi; Roberto Pieraccini; Enrico Bocchieri

Stochastic language models are widely used in spoken language understanding to recognize and interpret the speech signal: the speech samples are decoded into word transcriptions by means of acoustic and syntactic models and then interpreted according to a semantic model. For both speech recognition and understanding, search algorithms use stochastic models to extract the most likely uttered sentence and its corresponding interpretation. The design of the language models has to be effective in constraining the search algorithms and efficient enough to comply with storage space limits. In this work we present the Variable N-gram Stochastic Automaton (VNSA) language model, which provides a unified formalism for building a wide class of language models. First, this approach allows for the use of accurate language models for large vocabulary speech recognition with the standard search algorithm in a one-pass Viterbi decoder. Second, the unified formalism is an effective way to incorporate different sources of information for computing the probability of word sequences. Third, VNSAs are well suited for applications where speech and language decoding cascades are implemented through weighted rational transductions. The VNSAs have been compared to standard bigram and trigram language models, and their reduced set of parameters in no way degrades performance in terms of perplexity. The design of a stochastic language model through the VNSA is described and applied to word and phrase class-based language models. The effectiveness of VNSAs has been tested within the Air Travel Information System (ATIS) task to build the language model for the speech recognition and language understanding system.
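The core n-gram machinery that the VNSA generalizes can be illustrated with a minimal bigram model that falls back to shorter histories, here via simple linear interpolation. This is a toy sketch, not the VNSA itself: the VNSA compiles variable-length histories into a weighted automaton, and the interpolation weight `alpha` below is an assumption of the sketch.

```python
from collections import Counter

class BigramLM:
    """Minimal bigram language model with fallback to the unigram estimate,
    the idea a variable n-gram automaton generalizes to histories of
    varying length."""

    def __init__(self, sentences, alpha=0.5):
        self.alpha = alpha
        self.uni = Counter(w for s in sentences for w in s)
        self.bi = Counter((s[i], s[i + 1])
                          for s in sentences for i in range(len(s) - 1))
        self.total = sum(self.uni.values())

    def prob(self, prev, word):
        # Interpolate the bigram estimate with the unigram estimate so that
        # unseen histories still receive probability mass.
        p_uni = self.uni[word] / self.total
        p_bi = self.bi[(prev, word)] / self.uni[prev] if self.uni[prev] else 0.0
        return self.alpha * p_bi + (1 - self.alpha) * p_uni
```

In an automaton view, each history is a state and each `prob(prev, word)` value becomes an arc weight, which is what lets such models plug into weighted rational transduction cascades.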


International Conference on Acoustics, Speech, and Signal Processing | 2005

The AT&T WATSON speech recognizer

Vincent Goffin; Cyril Allauzen; Enrico Bocchieri; Dilek Hakkani-Tür; Andrej Ljolje; Sarangarajan Parthasarathy; Mazin G. Rahim; Giuseppe Riccardi; Murat Saraclar

This paper describes the AT&T WATSON real-time speech recognizer, the product of several decades of research at AT&T. The recognizer handles a wide range of vocabulary sizes and is based on continuous-density hidden Markov models for acoustic modeling and finite state networks for language modeling. The recognition network is optimized for efficient search. We identify the algorithms used for high-accuracy, real-time and low-latency recognition. We present results for small and large vocabulary tasks taken from the AT&T VoiceTone® service, showing word accuracy improvement of about 5% absolute and real-time processing speed-up by a factor between 2 and 3.
