Claudio Rullent
CSELT
Publications
Featured research published by Claudio Rullent.
international conference on acoustics, speech, and signal processing | 1997
Dario Albesano; Paolo Baggia; Morena Danieli; Roberto Gemello; Elisabetta Gerbino; Claudio Rullent
This paper presents Dialogos, a real-time system for human-machine spoken dialogue on the telephone in task-oriented domains. The system has been tested in a large trial with inexperienced users and it has proved robust enough to allow spontaneous interactions both for users who obtain good recognition performance and for those who obtain lower scores. The robust behavior of the system has been achieved by combining the use of specific language models during the recognition phase of analysis, the tolerance toward spontaneous speech phenomena, the activity of a robust parser, and the use of pragmatic-based dialogue knowledge. This integration of the different modules allows the system to deal with partial or total breakdowns of the different levels of analysis. We report the field trial data of the system and the evaluation results of the overall system and of the submodules.
international conference on acoustics, speech, and signal processing | 1993
Paolo Baggia; Claudio Rullent
The authors describe a robust parsing strategy where partial parsing is seen not as a back-up strategy, but as the normal mode of operation of the parser. The goal is to extract from a lattice of word hypotheses the information content of an utterance using the minimum amount of linguistic knowledge. A system for accessing a train timetable in Italian has been implemented and tested on a test set of 600 utterances. The proposed approach to partial parsing represents a compromise between the need for accurate linguistic knowledge to avoid misunderstanding and the need to reduce the amount of linguistic knowledge to be used by the system, given its high development cost and the related reduction in efficiency and robustness. The approach makes it possible to increase robustness to spontaneous speech and to reduce the effect of limitations in the syntactic/semantic coverage of the grammar. The experiments show that the method has a good ability to extract correctly many concepts even when recognition problems do not make it possible to extract all of them.
international conference on acoustics, speech, and signal processing | 1988
Luciano Fissore; Egidio P. Giachin; Pietro Laface; Giorgio Micca; R. Pieraccini; Claudio Rullent
A continuous speech recognition and understanding system is presented that accepts queries about a restricted geographical domain, expressed in free but syntactically correct natural language, with a lexicon of the order of one thousand words. A lattice of word candidates hypothesized by the speaker-dependent recognition level is the interface to an understanding module that performs the syntactic and semantic analysis. The recognition subsystem generates word hypotheses by exploiting hidden Markov models of sub-word units. Bottom-up constraints are also introduced to restrict the set of candidate words. The understanding module determines the most likely sequence of words and represents its meaning in a parse tree suitable for accessing a database. It makes use of a modified caseframe analysis driven by the likelihood scores of the word hypotheses. The results of a set of experiments performed on 150 sentences collected from one speaker are given.
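As a rough illustration of what "determining the most likely sequence of words" from a scored word lattice can mean, the sketch below picks the best-scoring chain of time-adjacent word hypotheses by dynamic programming. It is a deliberate simplification and not the method of the paper: it ignores the syntactic and semantic (caseframe) constraints the abstract describes. The tuple layout `(word, start, end, log_score)`, the function name `best_word_sequence`, and the example scores are all assumptions made for illustration.

```python
from collections import defaultdict


def best_word_sequence(lattice, start_frame, end_frame):
    """Return (total_log_score, words) for the best path, or None.

    lattice: iterable of (word, start, end, log_score) hypotheses with
    start < end. Two hypotheses can be chained only when the first ends
    on the frame where the second starts (hypothetical convention).
    """
    by_start = defaultdict(list)
    for word, start, end, score in lattice:
        by_start[start].append((word, end, score))

    # best[f] = (score, words) of the best path from start_frame to frame f
    best = {start_frame: (0.0, [])}

    # Visiting start frames in increasing order is safe because every
    # hypothesis moves strictly forward in time.
    for frame in sorted(by_start):
        if frame not in best:
            continue  # no path reaches this frame
        total, words = best[frame]
        for word, end, score in by_start[frame]:
            candidate = (total + score, words + [word])
            if end not in best or candidate[0] > best[end][0]:
                best[end] = candidate

    return best.get(end_frame)


# Toy lattice with made-up log scores (higher is better).
lattice = [
    ("which", 0, 3, -1.0),
    ("witch", 0, 3, -2.5),
    ("rivers", 3, 8, -1.5),
    ("river", 3, 6, -1.0),
    ("is", 6, 8, -1.5),
]
result = best_word_sequence(lattice, 0, 8)
print(result)
```

A purely acoustic best path like this one can still be ungrammatical or meaningless, which is why a linguistic module that scores and filters candidate sequences is needed on top of the lattice.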
International Journal of Speech Technology | 1997
Dario Albesano; Paolo Baggia; Morena Danieli; Roberto Gemello; Elisabetta Gerbino; Claudio Rullent
This paper presents a real-time system for human-machine spoken dialogue on the telephone in task-oriented domains. The system has been tested in a large trial with inexperienced users and it has proved robust enough to allow spontaneous interactions even for people with poor recognition performance. The robust behaviour of the system has been achieved by combining the use of specific language models during the recognition phase of analysis, the tolerance toward spontaneous speech phenomena, the activity of a robust parser, and the use of pragmatic-based dialogue knowledge. This integration of the different modules allows the system to deal with partial or total breakdowns at other levels of analysis. We report the field trial data of the system with respect to speech recognition metrics of word accuracy and sentence understanding rate, time-to-completion, time-to-acquisition of crucial parameters, and degree of success of the interactions in providing the speakers with the information they required. The evaluation data show that most of the subjects were able to interact fruitfully with the system. These results suggest that the design choices made to achieve robust behaviour are a promising way to create usable spoken language telephone systems.
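The abstract does not say how word accuracy was computed. A common definition, assumed here, is one minus the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words. The minimal sketch below follows that assumption; the function name `word_accuracy` and the example sentences are hypothetical.

```python
def word_accuracy(reference, hypothesis):
    """Word accuracy = 1 - (S + D + I) / N, with N reference words.

    Assumed definition, not taken from the paper. The value can be
    negative when the hypothesis contains many insertions.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference prefix processed so far
    # and the first j hypothesis words
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
                prev[j - 1] + (ref_word != hyp_word),  # match or substitution
            ))
        prev = cur

    return 1.0 - prev[-1] / len(ref)


# One deleted word out of four reference words.
print(word_accuracy("a train to rome", "a train rome"))
```

Sentence understanding rate, by contrast, is usually measured on the extracted meaning rather than on the word string, so an utterance with a word error can still count as correctly understood.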
international conference on computational linguistics | 1988
Egidio P. Giachin; Claudio Rullent
This paper describes a technique for enabling a speech understanding system to deal with sentences for which some monosyllabic words are not recognized. Such words are supposed to act as mere syntactic markers within the system's linguistic domain. This result is achieved by combining a modified caseframe approach to linguistic knowledge representation with a parsing strategy able to integrate expectations from the language model and predictions from words. Experimental results show that the proposed technique greatly increases the proportion of corrupted sentences that can be correctly understood without appreciably decreasing parsing efficiency.
conference on applied natural language processing | 1992
Paolo Baggia; Elisabetta Gerbino; Egidio P. Giachin; Claudio Rullent
This paper describes the approach followed in the development of the linguistic processor of the continuous speech dialog system implemented at our labs. The application scenario (voice-based information retrieval service over the telephone) imposes severe requirements on the system: it has to be speaker-independent, to deal with noisy and corrupted speech, and to work in real time. Coping with these types of applications requires improving both efficiency and accuracy. At present, the system accepts telephone-quality speech (utterances referring to an electronic mailbox access, recorded through a PABX) and, in the speaker-independent configuration, it correctly understands 72% of the utterances in about twice real time. Experimental results are discussed, as obtained from an implementation of the system on a Sun SparcStation 1 using the C language.
international conference on acoustics, speech, and signal processing | 1993
Elisabetta Gerbino; Paolo Baggia; Alberto Ciaramella; Claudio Rullent
The development of spoken dialogue systems (SDSs) requires the definition of evaluation metrics which can assess the performance of these systems at different levels and compare various SDSs. The authors present a first test, made with naive users, on an integrated dialogue system for telephone speech access to a remote data base. They describe the system architecture as well as the goals of the test, its features, the methodology used during the evaluation, and the results obtained. The SDS is shown to be effective for providing the user with the required information. The presence of spontaneous speech phenomena is frequent with naive users. The dialogue helps the user to overcome the errors due to spontaneous speech. The use of isolated words for confirmation is useful, but partially limits the interaction friendliness.
Archive | 1992
Egidio P. Giachin; Claudio Rullent
The goal of a speech understanding system is to correctly identify the action to be taken as a response to a user’s voiced request. To this purpose, the system has to rely on some type of linguistic knowledge besides merely recognizing words. Several approaches have been proposed to employ language modeling in speech understanding. They include unified architectures integrating modular knowledge sources that account for every level of knowledge from acoustics to linguistics, and two-level architectures in which the separation between recognition and linguistic processing is well defined. Within the two-level approach, two main methods may be conceived: linguistic constraints are integrated into the recognizer, which decodes one string of words that is treated by a natural language interface; or the recognizer produces a scored word lattice that is subsequently processed by a suitable linguistic module. For the present study, this latter method was considered the most promising one, provided a satisfactory solution to efficient word lattice parsing could be found.
international conference on parallel architectures and languages europe | 1987
Pier Giorgio Bosco; Egidio P. Giachin; G. Giandonato; G. Martinengo; Claudio Rullent
This paper describes an architecture for rule-based interpretation of uncertain data, which is currently under development at our labs. Inference on uncertain input facts is a central topic in AI, with application, e.g., to the syntactic-semantic layers of speech understanding systems. The severe requirements of real-time applications dictate a parallel approach to this problem. The description covers the main aspects related to parallelism and communication at the three levels which have interacted in the design of this architecture: the hardware machine, a highly-parallel homogeneous structure of processing-element/memory pairs interconnected by a fast packet-switching network; the programming language, which is a dialect of Lisp augmented with asynchronous message passing primitives; the inferential algorithm, which unifies goal-driven and data-driven strategies under a score-guided search control. Rules are mapped into a set of processes which cooperate by exchanging, via the primitives and the network mentioned above, messages corresponding to succinct representations of intermediate deductions.
conference of the international speech communication association | 1992
Paolo Baggia; Luciano Fissore; Elisabetta Gerbino; Egidio P. Giachin; Claudio Rullent
A parser for continuous speech has to deal with lattices where the word hypotheses of the correct sentence are not usually perfectly aligned and short function words may be missing. To cope with these problems, a two-way interaction between the recognition module and the parser, called feedback verification procedure (FVP), has been investigated. The parser generates many solutions, which are fed back to the recognizer; the recognizer realigns them against the acoustical data, finds the missing function words among the given candidates, and assigns them a new score. The best scoring solution is finally selected by the parser. Results on a 787-word, speaker-independent, telephone-bandwidth continuous speech recognition task are presented.