Stefano Scarci | Researchain

Publication

Featured researches published by Stefano Scarci.

international conference on acoustics, speech, and signal processing | 1987

Phoneme classification for real time speech recognition of Italian

Paolo D'Orta; Marco Ferretti; Stefano Scarci

The development of large dictionary speech recognition systems requires the use of techniques aimed at limiting the search of the correct word to a subset of the vocabulary as small as possible. An approach to this problem is to create classes of equivalence among words by means of a phoneme classification. We investigate methods based on the definition of a similarity measure of Hidden Markov Models of phonemes, and on the automatic identification of broad phonetic classes via clustering algorithms. We discuss the obtained classifications, and their use in a real time speech recognition system for a 3000-word dictionary for Italian; results are compared to those achieved by knowledge based classifications.

IBM Journal of Research and Development | 1988

Large-vocabulary speech recognition: a system for the Italian language

Paolo D'Orta; Marco Ferretti; Alessandro Martelli; Sergio Melecrinis; Stefano Scarci; Giampiero Volpi

We describe a research project in automatic speech recognition which has led to the development of an experimental large-vocabulary real-time recognizer for Italian, and show how the maximum-likelihood techniques which had been employed in the development of prototype recognizers for English can be tailored to a language with substantially different characteristics.

international conference on acoustics, speech, and signal processing | 1989

Language model and acoustic model information in probabilistic speech recognition

Marco Ferretti; Giulio Maltese; Stefano Scarci

The authors propose an approach to the estimation of the performance of the language model and the acoustic model in probabilistic speech recognition that tries to take into account the interaction between the two. It consists of a new measure, called speech decoder entropy (SDE), of joint acoustic-context information. Some results are presented for a 20000-word vocabulary recognizer. The authors discuss some limitations of the work and some suggestions for future developments.

international conference on acoustics, speech, and signal processing | 1987

A speech recognition system for the Italian language

Paolo D'Orta; Marco Ferretti; A. Martelli; S. Melecrinis; Stefano Scarci; G. Volpi

A real-time speech recognition system for Italian, based on a probabilistic approach, has been developed at the IBM Rome Research Center. It handles natural language sentences, from a 3000-word dictionary, dictated with words separated by short pauses. The architecture consists of an IBM 3090 mainframe and a PC/AT equipped with signal processing hardware. Recognition experiments have been performed for several speakers, each of whom had previously trained the system by dictating a 15-minute text. The paper describes the system, gives results and outlines future developments.

Speech Communication | 1990

Measuring information provided by language model and acoustic model in probabilistic speech recognition: theory and experimental results

Marco Ferretti; Giulio Maltese; Stefano Scarci

In probabilistic speech recognition it is often interesting to evaluate the contribution of the language model and that of the acoustic model. We propose an information theoretical approach which takes into account the interaction between the two sources of information. Experimental results are presented concerning the IBM prototype real-time recognizer of the Italian language based on a 20,000-word vocabulary.

conference of the european chapter of the association for computational linguistics | 1987

An automatic speech recognition system for the Italian language

Paolo D'Orta; Marco Ferretti; Alessandro Martelli; Stefano Scarci

An automatic speech recognition system for the Italian language has been developed at the IBM Italy Scientific Center in Rome. It recognizes in real time natural-language sentences composed of words from a dictionary of 6500 items, dictated by a speaker with short pauses between words. The system is speaker dependent: before using it, the speaker has to complete a training stage by reading a predefined text 15 to 20 minutes long. It runs on an architecture composed of an IBM 3090 mainframe and a PC/AT-based workstation with signal processing equipment.

annual european computer conference | 1989

Experimenting natural-language dictation with a 20000-word speech recognizer

P. Alto; M. Brandetti; Marco Ferretti; Giulio Maltese; Stefano Scarci

The authors describe a newly developed real-time large-vocabulary speech recognizer for the Italian language and some preliminary experiments on its usage. Some of these experiments are aimed at evaluating voice versus keyboard as a means for entry and editing of texts. The experiments made use of a dictating-machine prototype for the Italian language, which recognizes in real time natural-language sentences built from a 20000-word vocabulary. A voice-activated editor was developed to allow the user to create, revise, file, and print documents. It is found that large-vocabulary speech recognition can offer a very competitive alternative to traditional text entry. It is likely to be well accepted even by users who have extensive experience in keyboard text editing. The study has already suggested possible improvements to the man-machine interface of the current speech recognizer.

Archive | 1992

Experimenting Text Creation by Natural-Language, Large-Vocabulary Speech Recognition

P. Alto; M. Brandetti; Marco Ferretti; Giulio Maltese; F. Mancini; A. Mazza; Stefano Scarci; G. Vitillaro

In recent years the probabilistic approach to speech recognition has allowed the development of high-performance large-vocabulary speech recognition systems [1] [2]. At the IBM Rome Scientific Center a speech-recognition prototype for the Italian language, based on this approach, has been built. The prototype is able to recognize in real time natural-language sentences built using a vocabulary containing up to 20000 words [4]. The user has to perform a one-time acoustic training phase (about 20 minutes long), during which he is required to utter a predefined text. Words must be uttered with small pauses (a few centiseconds) between them. The prototype architecture is based on a personal computer equipped with special hardware. The first system we developed was aimed at a business and finance lexicon. Many laboratory tests have shown the effectiveness of the prototype as a tool to create texts by voice. After a first phase during which in-house experiments were carried out [5], the need arose to test the system in real work environments and for different applications. Two applications were considered: the dictation of radiological reports and of insurance company documents. Due to their characteristics, these applications seemed to be very well suited for our purposes. Since the vocabulary of the recognizer must be predefined, we had to adapt the system to the lexicon required by the new applications. The paper describes the techniques developed to efficiently adapt the basic components of the recognizer: the acoustic and language models. The results obtained by experimenting with automatic text dictation during real work are also presented.

Lecture Notes in Computer Science | 1989

A 20000-word speech recognizer of Italian

M. Brandetti; Marco Ferretti; A. Fusi; Giulio Maltese; Stefano Scarci; G. Vitillaro

A real-time speech recognition system for Italian has been developed at the IBM Rome Scientific Center. It handles natural language sentences from a 20000-word dictionary, dictated with words separated by short pauses. The architecture consists of a PC/AT equipped with signal processing hardware. The paper describes the system, shows results of decoding tests and includes descriptions of the topics in speech recognition currently being investigated.

Archive | 1992