Paolo Baggia
CSELT
Publication
Featured research published by Paolo Baggia.
international conference on acoustics, speech, and signal processing | 1997
Cosmin Popovici; Paolo Baggia
Analyses language modeling in spoken dialogue systems for accessing a database. The use of several language models obtained by exploiting dialogue predictions gives better results than the use of a single model for the whole dialogue interaction. For this reason, several models have been created, each one for a specific system question, such as the request for or the confirmation of a parameter. The use of dialogue-dependent language models increases the performance both at the recognition level and at the understanding level, especially on answers to system requests. Moreover, using other methods to increase the performance, like the automatic clustering of vocabulary words or the use of better acoustic models during recognition, does not affect the improvements given by dialogue-dependent language models. The system used in our experiments is Dialogos, the Italian spoken dialogue system used for accessing railway timetable information over the telephone. The experiments were carried out on a large corpus of dialogues collected using Dialogos.
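The idea of switching language models according to the last system question can be illustrated with a minimal sketch. It is not the Dialogos implementation: the add-one smoothed unigram model, the state names (`request_city`, `confirm`), the toy English training sentences, and the `select_lm` helper are all assumptions made for illustration.

```python
import math
from collections import Counter


class UnigramLM:
    """Add-one smoothed unigram model over a shared vocabulary (illustrative only)."""

    def __init__(self, sentences, vocab):
        self.vocab = vocab
        self.counts = Counter(w for s in sentences for w in s.split())
        self.total = sum(self.counts.values())

    def logprob(self, sentence):
        v = len(self.vocab)
        return sum(
            math.log((self.counts[w] + 1) / (self.total + v))
            for w in sentence.split()
        )


# Hypothetical user answers, grouped by the system question that preceded them.
training = {
    "request_city": ["to milan", "from turin to rome", "milan"],
    "confirm": ["yes", "no", "yes please", "no to rome"],
}

vocab = {w for sents in training.values() for s in sents for w in s.split()}

# One general model for the whole dialogue, plus one model per system question.
general = UnigramLM([s for sents in training.values() for s in sents], vocab)
by_state = {state: UnigramLM(sents, vocab) for state, sents in training.items()}


def select_lm(dialogue_state):
    """Pick the model matching the last system question, else fall back to the general one."""
    return by_state.get(dialogue_state, general)
```

In this toy setting, `select_lm("confirm")` assigns a higher probability to the answer "yes" than the general model does, which is the kind of effect dialogue-dependent models rely on.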
international conference on acoustics, speech, and signal processing | 1997
Dario Albesano; Paolo Baggia; Morena Danieli; Roberto Gemello; Elisabetta Gerbino; Claudio Rullent
This paper presents Dialogos, a real-time system for human-machine spoken dialogue on the telephone in task-oriented domains. The system has been tested in a large trial with inexperienced users and has proved robust enough to allow spontaneous interactions both for users who obtain good recognition performance and for those who obtain lower scores. The robust behavior of the system has been achieved by combining the use of specific language models during the recognition phase of analysis, the tolerance toward spontaneous speech phenomena, the activity of a robust parser, and the use of pragmatic-based dialogue knowledge. This integration of the different modules makes it possible to deal with partial or total breakdowns at the different levels of analysis. We report the field trial data of the system and the evaluation results of the overall system and of the submodules.
international conference on acoustics, speech, and signal processing | 1993
Paolo Baggia; Claudio Rullent
The authors describe a robust parsing strategy where partial parsing is seen not as a back-up strategy, but as the normal mode of operation of the parser. The goal is to extract from a lattice of word hypotheses the information content of an utterance using the minimum amount of linguistic knowledge. A system for accessing a train timetable in Italian has been implemented and tested on a test set of 600 utterances. The proposed approach to partial parsing represents a compromise between the need for accurate linguistic knowledge to avoid misunderstanding and the need to reduce the amount of linguistic knowledge to be used by the system, given its high development cost and the related reduction in efficiency and robustness. The approach makes it possible to increase robustness to spontaneous speech and to reduce the effect of limitations in the syntactic/semantic coverage of the grammar. The experiments show that the method is able to correctly extract many concepts even when recognition problems make it impossible to extract all of them.
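As a rough illustration of extracting only the concepts that can be recognized and ignoring the rest, the sketch below spots concepts in a single word string. This is a much simpler stand-in for the lattice-based partial parser described in the abstract: the regular-expression patterns, the English city names, and the `extract_concepts` function are hypothetical.

```python
import re

# Hypothetical concept patterns for a train-timetable domain.
CITIES = r"(milan|turin|rome)"
PATTERNS = {
    "origin": re.compile(r"\bfrom " + CITIES + r"\b"),
    "destination": re.compile(r"\bto " + CITIES + r"\b"),
}


def extract_concepts(words):
    """Return every concept whose pattern matches; unparsable words are skipped."""
    text = " ".join(words)
    found = {}
    for concept, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            found[concept] = match.group(1)
    return found
```

An utterance such as "uh i want from turin xx to milan please" still yields both an origin and a destination, while "to rome tomorrow" yields only a destination, so the result is partial instead of a failure.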
International Journal of Speech Technology | 1997
Dario Albesano; Paolo Baggia; Morena Danieli; Roberto Gemello; Elisabetta Gerbino; Claudio Rullent
This paper presents a real-time system for human-machine spoken dialogue on the telephone in task-oriented domains. The system has been tested in a large trial with inexperienced users and it has proved robust enough to allow spontaneous interactions even for people with poor recognition performance. The robust behaviour of the system has been achieved by combining the use of specific language models during the recognition phase of analysis, the tolerance toward spontaneous speech phenomena, the activity of a robust parser, and the use of pragmatic-based dialogue knowledge. This integration of the different modules allows the system to deal with partial or total breakdowns at other levels of analysis. We report the field trial data of the system with respect to speech recognition metrics of word accuracy and sentence understanding rate, time-to-completion, time-to-acquisition of crucial parameters, and degree of success of the interactions in providing the speakers with the information they required. The evaluation data show that most of the subjects were able to interact fruitfully with the system. These results suggest that the design choices made to achieve robust behaviour are a promising way to create usable spoken language telephone systems.
conference on applied natural language processing | 1992
Paolo Baggia; Elisabetta Gerbino; Egidio P. Giachin; Claudio Rullent
This paper describes the approach followed in the development of the linguistic processor of the continuous speech dialog system implemented at our labs. The application scenario (voice-based information retrieval service over the telephone) imposes severe requirements on the system: it has to be speaker-independent, to deal with noisy and corrupted speech, and to work in real time. Coping with these types of applications requires improving both efficiency and accuracy. At present, the system accepts telephone-quality speech (utterances referring to an electronic mailbox access, recorded through a PABX) and, in the speaker-independent configuration, it correctly understands 72% of the utterances in about twice real time. Experimental results are discussed, as obtained from an implementation of the system on a Sun SparcStation 1 using the C language.
international conference on acoustics, speech, and signal processing | 1993
Elisabetta Gerbino; Paolo Baggia; Alberto Ciaramella; Claudio Rullent
The development of spoken dialogue systems (SDSs) requires the definition of evaluation metrics which can assess the performance of these systems at different levels and compare various SDSs. The authors present a first test, made with naive users, on an integrated dialogue system for telephone speech access to a remote database. They describe the system architecture as well as the goals of the test, its features, the methodology used during the evaluation, and the results obtained. The SDS is shown to be effective for providing the user with the required information. The presence of spontaneous speech phenomena is frequent with naive users. The dialogue helps the user to overcome the errors due to spontaneous speech. The use of isolated words for confirmation is useful, but partially limits the interaction friendliness.
conference of the international speech communication association | 1992
Paolo Baggia; Luciano Fissore; Elisabetta Gerbino; Egidio P. Giachin; Claudio Rullent
A parser for continuous speech has to deal with lattices where the word hypotheses of the correct sentence are not usually perfectly aligned and short function words may be missing. To cope with these problems, a two-way interaction between the recognition module and the parser, called feedback verification procedure (FVP), has been investigated. The parser generates many solutions, which are fed back to the recognizer; the recognizer realigns them against the acoustic data, finds the missing function words among the given candidates, and assigns them a new score. The best-scoring solution is finally selected by the parser. Results on a 787-word, speaker-independent, telephone-bandwidth continuous speech recognition task are presented.
International Journal of Pattern Recognition and Artificial Intelligence | 1994
Paolo Baggia; Luciano Fissore; Egidio P. Giachin; Giorgio Micca; Claudio Rullent; Pietro Laface
This paper describes a Continuous Speech Understanding System that allows information services to be accessed through the telephone line. It accepts queries within a restricted semantic domain, expressed in free but syntactically correct natural language, with a lexicon of the order of 800 words. In the implementation described here, a user can access an electronic mailbox or a train information service through a PABX telephone line. The architecture of the system is based on two main modules that represent and use different knowledge sources. A speaker-independent recognition module generates, for each utterance, a lattice of word hypotheses which is the interface to an understanding module that performs the syntactic and semantic analysis. The recognition module is based on Hidden Markov Models of subword units, and performs the acoustic decoding process according to a beam search strategy. The understanding module finds the most likely sequence of words and represents its meaning in a format which facilitates the access to a database. It makes use of a modified caseframe analysis guided by the word hypotheses scores. Experiments were performed with 600 sentences from 10 speakers on the E-Mail application task. Using 15 Gaussian mixtures per state, a word accuracy of 75.7% was obtained with a test vocabulary of 787 words and no linguistic constraints. Linguistic processing of the corresponding lattices achieved a sentence understanding rate of 82%.
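The abstract does not say how word accuracy was computed. A minimal sketch, assuming the commonly used definition (one minus the number of substitutions, deletions, and insertions divided by the number of reference words), is shown below; the function names are illustrative and the example sentences are not from the paper.

```python
def edit_distance(ref, hyp):
    """Minimum number of substitutions, deletions, and insertions turning ref into hyp."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,             # deletion of a reference word
                cur[j - 1] + 1,          # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # match or substitution
            ))
        prev = cur
    return prev[-1]


def word_accuracy(reference, hypothesis):
    """Word accuracy as a fraction, under the assumed (N - S - D - I) / N definition."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return 1.0 - edit_distance(ref, hyp) / len(ref)
```

Under this definition the value can become negative when the hypothesis contains many insertions, which is one reason it is usually reported alongside a sentence-level measure such as the understanding rate.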
Proceedings 1998 IEEE 4th Workshop Interactive Voice Technology for Telecommunications Applications. IVTTA '98 (Cat. No.98TH8376) | 1998
Paolo Baggia; G. Castagneri; M. Danieli
conference of the international speech communication association | 1991
Paolo Baggia; Alberto Ciaramella; Davide Clementino; Lorenzo Fissore; Elisabetta Gerbino; Egidio P. Giachin; Giorgio Micca; Luciano Nebbia; Roberto Pacifici; Giancarlo Pirani; Claudio Rullent