Sérgio Paulo
INESC-ID
Publication
Featured research published by Sérgio Paulo.
processing of the portuguese language | 2008
Sérgio Paulo; Luís C. Oliveira; Carlos Mendes; Luís Figueira; Renato Cassaca; Céu Viana; Helena Moniz
This paper describes a new generic text-to-speech synthesis system, developed in the scope of the Tecnovoz Project. Although it was primarily targeted at speech synthesis in European Portuguese, its modular architecture and flexible components allow its use for different languages. We also provide a survey of the development of the language resources needed by the TTS system.
Proceedings of 2002 IEEE Workshop on Speech Synthesis, 2002. | 2002
Sérgio Paulo; Luís C. Oliveira
The purpose of this work was the development of a set of tools to automate the multilevel annotation of speech signals while preserving the alignments between the different levels of each utterance's linguistic representation. Our goal is to build speech databases with multilevel relational annotations, using speech from non-professional speakers, that can be used for the development of concatenative text-to-speech synthesizers or for training and testing statistical models. The method is based on the linguistic analysis of the transcription of the spoken material performed by a TTS system. The predicted phone sequence is then compared with the sequence produced by the speaker. The problem of aligning these two sequences is solved in a language-independent way using Weighted Finite State Transducers. After the alignment, a re-synchronization procedure is applied to the remaining levels to bring them into agreement with the spoken utterance.
international conference natural language processing | 2004
Sérgio Paulo; Luís C. Oliveira
In this paper we propose the use of an HMM-based phonetic aligner together with a speech-synthesis-based one to improve the accuracy of the global alignment system. We also present a phone duration-independent measure to evaluate the accuracy of the automatic annotation tools. In the second part of the paper we propose and evaluate some new confidence measures for phonetic annotation.
Multimedia Tools and Applications | 2016
Aitor Álvarez; Carlos Mendes; Matteo Raffaelli; Tiago Luís; Sérgio Paulo; Nicola Piccinini; Haritz Arzelus; João Paulo Neto; Carlo Aliprandi; Arantza del Pozo
The demand for subtitling of multimedia content has grown quickly over recent years, especially after the adoption of the new European audiovisual legislation, which requires multimedia content to be made accessible to all. As a result, TV channels have been moved to produce subtitles for a high percentage of their broadcast content. Consequently, the market has been seeking subtitling alternatives that are more productive than the traditional manual process. The large effort dedicated by the research community to the development of Large Vocabulary Continuous Speech Recognition (LVCSR) over the last decade has resulted in significant improvements in multimedia transcription, making it the most powerful technology for automatic intralingual subtitling. This article contains a detailed description of the live and batch automatic subtitling applications developed by the SAVAS consortium for several European languages, based on proprietary LVCSR technology specifically tailored to subtitling needs, together with the results of their quality evaluation.
processing of the portuguese language | 2003
Sérgio Paulo; Luís C. Oliveira
The phonetic alignment of spoken utterances for speech research is commonly performed by HMM-based speech recognizers in forced-alignment mode, but training the phonetic segment models requires considerable amounts of annotated data. When no such material is available, a possible solution is to synthesize the same phonetic sequence and align the resulting speech signal with the spoken utterances. However, without a careful choice of the acoustic features used in this procedure, it can perform poorly when applied to continuous speech utterances. In this paper we propose a new method to select the best features to use in the alignment procedure for each pair of phonetic segment classes. The results show that this selection considerably reduces the segment boundary location errors.
conference of the international speech communication association | 2003
Sérgio Paulo; Luís C. Oliveira
conference of the international speech communication association | 2005
Sérgio Paulo; Luís C. Oliveira
language resources and evaluation | 2008
Luís C. Oliveira; Sérgio Paulo; Luís Figueira; Carlos Mendes; Ana Nunes; Joaquim Godinho
conference of the international speech communication association | 2007
Sérgio Paulo; Luís C. Oliveira
language resources and evaluation | 2014
Arantza del Pozo; Carlo Aliprandi; Aitor Álvarez; Carlos Mendes; João Paulo Neto; Sérgio Paulo; Nicola Piccinini; Matteo Raffaelli