Publication


Featured research published by Luís C. Oliveira.


EURASIP Journal on Advances in Signal Processing | 2009

Jitter estimation algorithms for detection of pathological voices

Dárcio G. Silva; Luís C. Oliveira; Mário Andrea

This work is focused on the evaluation of different methods to estimate the amount of jitter present in speech signals. The jitter value is a measure of the irregularity of a quasiperiodic signal and is a good indicator of the presence of pathologies in the larynx such as vocal fold nodules or a vocal fold polyp. Given the irregular nature of the speech signal, each jitter estimation algorithm relies on its own model, making a direct comparison of the results very difficult. For this reason, the evaluation of the different jitter estimation methods targeted their ability to detect pathological voices. Two databases were used for this evaluation: a subset of the MEEI database and a smaller database acquired in the scope of this work. The results showed that there were significant differences in the performance of the algorithms being evaluated. Surprisingly, in the largest database the best results were not achieved with the commonly used relative jitter, measured as a percentage of the glottal cycle, but with absolute jitter values measured in microseconds. Also, the newly proposed jitter measure, LocJitt, performed in general as well as or better than the commonly used tools MDVP and Praat.
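To make the distinction between the two measures concrete, here is a minimal sketch of the textbook definitions of absolute jitter (in microseconds) and relative jitter (as a percentage of the mean glottal cycle). It assumes the glottal period lengths have already been extracted; the function names and the sample periods are hypothetical, and this is not the LocJitt measure proposed in the paper, nor the exact MDVP or Praat implementation.

```python
from statistics import mean


def absolute_jitter_us(periods_s):
    """Mean absolute difference between consecutive glottal periods, in microseconds."""
    if len(periods_s) < 2:
        raise ValueError("need at least two glottal periods")
    diffs = [abs(a - b) for a, b in zip(periods_s, periods_s[1:])]
    return mean(diffs) * 1e6


def relative_jitter_percent(periods_s):
    """Absolute jitter divided by the mean period, expressed as a percentage."""
    mean_period_us = mean(periods_s) * 1e6
    return absolute_jitter_us(periods_s) / mean_period_us * 100.0


# Hypothetical glottal periods (seconds) for a voice near 100 Hz.
periods = [0.0100, 0.0102, 0.0099, 0.0101]
print(absolute_jitter_us(periods), relative_jitter_percent(periods))
```

Because the relative measure normalises by the mean period, two voices with the same absolute perturbation but different pitch get different relative scores, which is one plausible reason the two measures can rank voices differently.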


Processing of the Portuguese Language | 2003

Using morphossyntactic information in TTS systems: comparing strategies for European Portuguese

Ricardo Ribeiro; Luís C. Oliveira; Isabel Trancoso

To improve the quality of the speech produced by a Text-to-Speech (TTS) system, it is important to obtain the maximum amount of information from the input text that may help in this task. This covers a wide range of possibilities, from the simple conversion of non-orthographic items to more complex syntactic and semantic analysis. In this paper, we present the development of a morphosyntactic tagging system and analyze its influence on the performance of a TTS system for European Portuguese.


Processing of the Portuguese Language | 2008

DIXI - A Generic Text-to-Speech System for European Portuguese

Sérgio Paulo; Luís C. Oliveira; Carlos Mendes; Luís Figueira; Renato Cassaca; Céu Viana; Helena Moniz

This paper describes a new generic text-to-speech synthesis system, developed in the scope of the Tecnovoz Project. Although it was primarily targeted at speech synthesis in European Portuguese, its modular architecture and flexible components allow its use for different languages. We also provide a survey of the development of the language resources needed by the TTS.


Proceedings of 2002 IEEE Workshop on Speech Synthesis, 2002. | 2002

Multilevel annotation of speech signals using weighted finite state transducers

Sérgio Paulo; Luís C. Oliveira

The purpose of this work was the development of a set of tools to automate the process of multilevel annotation of speech signals, preserving the alignments between the different levels of the utterances' linguistic representation. Our goal is to build speech databases, using speech from non-professional speakers with multilevel relational annotations, that can be used for the development of concatenative text-to-speech synthesizers or for training and testing statistical models. The method is based on the linguistic analysis of the transcription of the spoken material performed by a TTS system. The predicted phone sequence is then compared with the sequence produced by the speaker. The problem of aligning these two sequences is solved in a language-independent way using Weighted Finite State Transducers. After the alignment, a re-synchronization procedure is applied to the remaining levels to put them in agreement with the spoken utterance.
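The core step, aligning the predicted phone sequence with the one the speaker actually produced, can be illustrated with a plain edit-distance alignment. This is a deliberately simpler stand-in for the weighted finite state transducers used in the paper, not the authors' method; the function name, the unit costs, and the phone symbols are assumptions made for illustration.

```python
def align(predicted, observed, sub_cost=1, indel_cost=1):
    """Return (total cost, aligned pairs) for two phone sequences.

    A pair (p, None) marks a predicted phone the speaker dropped;
    (None, o) marks a phone the speaker inserted.
    """
    n, m = len(predicted), len(observed)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * indel_cost
    for j in range(1, m + 1):
        cost[0][j] = j * indel_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = 0 if predicted[i - 1] == observed[j - 1] else sub_cost
            cost[i][j] = min(
                cost[i - 1][j - 1] + step,      # match or substitution
                cost[i - 1][j] + indel_cost,    # predicted phone deleted
                cost[i][j - 1] + indel_cost,    # extra phone inserted
            )

    # Trace back from the bottom-right corner to recover the alignment.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = 0 if predicted[i - 1] == observed[j - 1] else sub_cost
            if cost[i][j] == cost[i - 1][j - 1] + step:
                pairs.append((predicted[i - 1], observed[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + indel_cost:
            pairs.append((predicted[i - 1], None))
            i -= 1
        else:
            pairs.append((None, observed[j - 1]))
            j -= 1
    pairs.reverse()
    return cost[n][m], pairs


# Hypothetical example: the speaker drops the final vowel.
print(align(["k", "a", "z", "a"], ["k", "a", "z"]))
```

A transducer formulation generalises this idea by letting the substitution, insertion, and deletion costs be arbitrary weights composed from separate models.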


Spoken Language Technology Workshop | 2012

Intent transfer in speech-to-speech machine translation

Gopala Krishna Anumanchipalli; Luís C. Oliveira; Alan W. Black

This paper presents an approach for transfer of speaker intent in speech-to-speech machine translation (S2SMT). Specifically, we describe techniques to retain the prominence patterns of the source language utterance through the translation pipeline and impose this information during speech synthesis in the target language. We first present an analysis of word focus across languages to motivate the problem of transfer. We then propose an approach for training an appropriate transfer function for intonation on a parallel speech corpus in the two languages within which the translation is carried out. We present our analysis and experiments on English↔Portuguese and English↔German language pairs and evaluate the proposed transformation techniques through objective measures.


SSW | 2001

Prosodic Phrasing: Machine and Human Evaluation

Céu Viana; Luís C. Oliveira; Ana Isabel Mata

This paper describes a set of experiments aiming at the construction and evaluation of a new phrasing module for European Portuguese text-to-speech synthesis, using classification and regression trees learned from hand-labelled texts. Using the assessment criteria of matching boundary predictions against the corresponding labelled ones, the best solution achieves an overall performance of 91.9%, with 86.3% of correctly assigned breaks and 4.3% of false insertions. Although in absolute terms such scores may be considered surprisingly good given the size of the training set, the total number of exact matches at the sentence level is much lower (22%). This suggested a more formal experiment to test the acceptability of the predicted phrasing in the judgement of human evaluators. As the model was not trained on a labelled speech corpus but on hand-labelled texts, the reference phrasing needed also to be assessed. The evaluation experiment involved 90 participants who were asked to grade both the automatic and the reference phrasings, and also to express their opinion on where the breaks should be placed. As expected, the results showed a large variability among the subjects in their acceptance of a specific sentence partition, and criteria had to be defined to summarise the data from the different evaluators. With the adopted criteria, the performance of the automatic assignment procedure at the sentence level is better rated by human evaluators than by simple matching with the reference corpus (78% vs. 22%, respectively).
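The gap between boundary-level and sentence-level scores follows from how the two are counted, as the sketch below illustrates. It assumes each sentence is encoded as one boolean per word juncture (True for a break); the function name, the toy data, and the choice to normalise false insertions by the number of non-break junctures are assumptions, since the abstract does not state the exact denominators.

```python
def phrasing_scores(reference, predicted):
    """Compare predicted break labels with reference labels, sentence by sentence."""
    total = correct = 0
    ref_breaks = hit_breaks = 0
    ref_nonbreaks = false_insertions = 0
    exact_sentences = 0

    for ref, pred in zip(reference, predicted):
        if len(ref) != len(pred):
            raise ValueError("sentences must have the same number of junctures")
        exact_sentences += int(ref == pred)
        for r, p in zip(ref, pred):
            total += 1
            correct += int(r == p)
            if r:
                ref_breaks += 1
                hit_breaks += int(p)
            else:
                ref_nonbreaks += 1
                false_insertions += int(p)

    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "juncture_accuracy": ratio(correct, total),
        "breaks_found": ratio(hit_breaks, ref_breaks),
        "false_insertions": ratio(false_insertions, ref_nonbreaks),
        "sentence_exact_match": ratio(exact_sentences, len(reference)),
    }


# Toy data: one wrong juncture out of seven spoils one sentence out of two.
reference = [[False, True, False], [False, False, True, False]]
predicted = [[False, True, False], [False, True, True, False]]
print(phrasing_scores(reference, predicted))
```

A single mislabelled juncture barely moves the juncture-level accuracy but makes the whole sentence count as a miss, so long sentences pull the exact-match rate down quickly.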


international conference natural language processing | 2004

Automatic Phonetic Alignment and Its Confidence Measures

Sérgio Paulo; Luís C. Oliveira

In this paper we propose the use of an HMM-based phonetic aligner together with a speech-synthesis-based one to improve the accuracy of the global alignment system. We also present a phone duration-independent measure to evaluate the accuracy of the automatic annotation tools. In the second part of the paper we propose and evaluate some new confidence measures for phonetic annotation.


ACM Transactions on Accessible Computing | 2015

Measuring the Performance of a Location-Aware Text Prediction System

Luís Garcia; Luís C. Oliveira; David Martins de Matos

In recent years, some works have discussed the conception of location-aware Augmentative and Alternative Communication (AAC) systems with very positive feedback from participants. However, in most cases, complementary quantitative evaluations have not been carried out to confirm those results. To contribute to clarifying the validity of these approaches, our study quantitatively evaluated the effect of using language models with location knowledge on the efficiency of a word and sentence prediction system. Using corpora collected for three different locations (classroom, school cafeteria, home), location-specific language models were trained with sentences from each location and compared with a traditional all-purpose language model, trained on all corpora. User tests showed a modest mean improvement of 2.4% and 1.3% for Words Per Minute (WPM) and Keystroke Saving Rate (KSR), respectively, but the differences were not statistically significant. Since our text prediction system relies on the concept of sentence reuse, we ran a set of simulations with language models having different sentence knowledge levels (0%, 25%, 50%, 75%, 100%). We also introduced in the comparison a second location-aware strategy that combines the location-specific approach with the all-purpose approach (mixed approach). The mixed language models performed better under low sentence-reuse conditions (0%, 25%, 50%) with 1.0%, 1.3%, and 1.2% KSR improvements, respectively. The location-specific language models performed better under high sentence-reuse conditions (75%, 100%) with 1.7% and 1.5% KSR improvements, respectively.
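For readers unfamiliar with the two efficiency metrics, the following sketch shows how they are commonly computed. It assumes the usual text-entry conventions: KSR as the share of keystrokes saved relative to typing every character, and WPM with a word counted as five characters. The function names and the sample sentence are hypothetical, and the study's exact measurement protocol may differ.

```python
def keystroke_saving_rate(text, keystrokes):
    """Percentage of keystrokes saved relative to typing every character of `text`."""
    if not text:
        raise ValueError("text must not be empty")
    return (1 - keystrokes / len(text)) * 100.0


def words_per_minute(text, seconds, chars_per_word=5):
    """Entry speed, counting one word per `chars_per_word` characters (an assumed convention)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (len(text) / chars_per_word) / (seconds / 60.0)


# Hypothetical trial: a 21-character sentence entered with 14 key presses in 30 s.
sentence = "good morning everyone"
print(keystroke_saving_rate(sentence, 14))   # about 33.3
print(words_per_minute(sentence, 30))        # about 8.4
```

Under these definitions, a sentence retrieved whole from a prediction list costs only a few key presses, which is why KSR is sensitive to the sentence-reuse level varied in the simulations.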


International Conference on Acoustics, Speech, and Signal Processing | 1990

CELP: a candidate for GSM half-rate coding

Isabel Trancoso; C. Ribeiro; Luís B. Almeida; Luís C. Oliveira; Jorge S. Marques

A systematic study of code excited linear predictive (CELP) coders is presented. The design of an optimized version complying with the specifications for a GSM half-rate coder is described. This study is divided into three parts: selection of a configuration, assuming unquantized scale factors and filter coefficients; fine-tuning of the parameter quantization; and adaptation to the GSM specifications. A strong emphasis is placed on efficient codebook search procedures, and two approaches are briefly described: the truncated autocorrelation approach and the unity-magnitude approach. The final coder version achieves a very good combination of speech quality and implementation complexity.


Processing of the Portuguese Language | 2003

Improving the accuracy of the speech synthesis based phonetic alignment using multiple acoustic features

Sérgio Paulo; Luís C. Oliveira

The phonetic alignment of spoken utterances for speech research is commonly performed by HMM-based speech recognizers in forced alignment mode, but training the phonetic segment models requires considerable amounts of annotated data. When no such material is available, a possible solution is to synthesize the same phonetic sequence and align the resulting speech signal with the spoken utterances. However, without a careful choice of the acoustic features used in this procedure, it can perform poorly when applied to continuous speech utterances. In this paper we propose a new method to select the best features to use in the alignment procedure for each pair of phonetic segment classes. The results show that this selection considerably reduces the segment boundary location errors.

Collaboration


Dive into Luís C. Oliveira's collaborations.

Top Co-Authors

Isabel Trancoso, Instituto Superior Técnico

Alan W. Black, Carnegie Mellon University

João P. Cabral, University College Dublin

Pedro M. S. Carvalho, Technical University of Lisbon

Ana Paiva, Instituto Superior Técnico