Publication


Featured research published by Helena Moniz.


IEEE Transactions on Audio, Speech, and Language Processing | 2012

Bilingual Experiments on Automatic Recovery of Capitalization and Punctuation of Automatic Speech Transcripts

Fernando Batista; Helena Moniz; Isabel Trancoso; Nuno J. Mamede

This paper focuses on recovering capitalization and punctuation marks in texts that lack this information, such as transcripts produced by automatic speech recognition systems. These two practical rich transcription tasks were performed using the same discriminative approach, based on maximum entropy, suitable for on-the-fly usage. The reported experiments were conducted over both Portuguese and English broadcast news data. Both force-aligned and automatic transcripts were used, making it possible to measure the impact of speech recognition errors. Capitalized words and named entities are intrinsically related, and are influenced by time variation effects. For that reason, the so-called language dynamics have been addressed for the capitalization task. Language adaptation results indicate, for both languages, that the capitalization performance is affected by the temporal distance between the training and testing data. Regarding the punctuation task, this paper covers the three most frequent punctuation marks: full stop, comma, and question mark. Different methods were explored for improving the baseline results for full stop and comma. The first uses punctuation information extracted from large written corpora. The second applies different levels of linguistic structure, including lexical, prosodic, and speaker-related features. Comma detection improved significantly with the first method, indicating that it depends more on lexical features. The second method provided even better results, for both languages and both punctuation marks, with the best results achieved mainly for full stop. As for question marks, there is a small gain, but the differences are not very significant, due to the relatively small number of question marks in the corpora.
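To make the maximum entropy idea in this abstract concrete, here is a minimal sketch of a maximum entropy (multinomial logistic) classifier that predicts the punctuation mark following each word. It is an illustration under stated assumptions, not the authors' system: the feature templates in `features`, the label set, the learning rate, and the toy sentence in the demo are all hypothetical, and the paper additionally uses prosodic and speaker-related features that are not modeled here.

```python
import math
from collections import defaultdict

# Hypothetical label set: what follows each word in the transcript.
LABELS = ["none", "comma", "full_stop"]

def features(words, i):
    """Binary indicator features for the slot after words[i] (assumed templates)."""
    nxt = words[i + 1] if i + 1 < len(words) else "</s>"
    return [f"w={words[i]}", f"next={nxt}", f"bigram={words[i]}_{nxt}"]

def probs(weights, feats):
    """Softmax over summed feature weights: the maximum entropy model form."""
    scores = [sum(weights[(f, y)] for f in feats) for y in LABELS]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def train(examples, epochs=50, lr=0.5):
    """Stochastic gradient ascent on the conditional log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for feats, gold in examples:
            p = probs(weights, feats)
            for k, y in enumerate(LABELS):
                # Gradient for an active feature: observed count minus expected count.
                grad = (1.0 if y == gold else 0.0) - p[k]
                for f in feats:
                    weights[(f, y)] += lr * grad
    return weights

def predict(weights, feats):
    """Return the most probable label for one slot."""
    p = probs(weights, feats)
    return LABELS[p.index(max(p))]

if __name__ == "__main__":
    # Toy, made-up training sentence: "yes, i agree. thank you."
    words = ["yes", "i", "agree", "thank", "you"]
    gold = ["comma", "none", "full_stop", "none", "full_stop"]
    data = [(features(words, i), gold[i]) for i in range(len(words))]
    model = train(data)
    print([predict(model, f) for f, _ in data])
```

Because prediction only needs the current word and a short look-ahead, a model of this shape can run incrementally over a transcript, which is in keeping with the on-the-fly usage the abstract mentions.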


Processing of the Portuguese Language | 2008

DIXI --- A Generic Text-to-Speech System for European Portuguese

Sérgio Paulo; Luís C. Oliveira; Carlos Mendes; Luís Figueira; Renato Cassaca; Céu Viana; Helena Moniz

This paper describes a new generic text-to-speech synthesis system, developed in the scope of the Tecnovoz Project. Although it was primarily targeted at speech synthesis in European Portuguese, its modular architecture and flexible components allow its use for different languages. We also provide a survey of the development of the language resources needed by the TTS.


Speech Communication | 2014

Speaking style effects in the production of disfluencies

Helena Moniz; Fernando Batista; Ana Isabel Mata; Isabel Trancoso

This work explores speaking style effects in the production of disfluencies. University lectures and map-task dialogues are analyzed in order to evaluate if the prosodic strategies used when uttering disfluencies vary across speaking styles. Our results show that the distribution of disfluency types is not arbitrary across lectures and dialogues. Moreover, although there is a statistically significant cross-style strategy of prosodic contrast marking (pitch and energy increases) between the region to repair and the repair of fluency, this strategy is displayed differently depending on the specific speech task. The overall patterns observed in the lectures, with regularities ascribed for speaker and disfluency types, do not hold with the same strength for the dialogues, due to underlying specificities of the communicative purposes. The tempo patterns found for both speech tasks also confirm their distinct behaviour, evidencing the more dynamic tempo characteristics of dialogues. In university lectures, prosodic cues are given to the listener both for the units inside disfluent regions and between these and the adjacent contexts. This suggests a stronger prosodic contrast marking of disfluency–fluency repair when compared to dialogues, as if teachers were monitoring the different regions – the introduction to a disfluency, the disfluency itself and the beginning of the repair – demarcating them in very contrastive ways.


Intonational Grammar in Ibero-Romance: Approaches across linguistic subfields | 2016

Stylistic variation in the intonation of European Portuguese teenagers and adults

Ana Isabel Mata; Helena Moniz; Fernando Batista

Studies of emerging prosody from the word to the phrase, integrating various sources of evidence, are scarce, and our understanding of the pathways of prosodic development is still very limited. An investigation of emerging intonation and prosodic phrasing was undertaken on the basis of production data on intonation and duration patterns from the speech of two European Portuguese children between 1;00 and 2;04. The results show that the development of both intonation and phrasing preceded the onset of combinatorial speech and coincided in time with critical points in lexical development. Prosodic phrasing evolved in three steps, by the unfolding of key prosodic levels. Implications of these results are discussed in relation to early prosodic development across languages.


Proceedings of the Third COST 2102 international training school conference on Toward autonomous, adaptive, and context-aware multimodal interfaces: theoretical and practical issues | 2010

Analysis of interrogatives in different domains

Helena Moniz; Fernando Batista; Isabel Trancoso; Ana Isabel Mata

The aim of this work is twofold: to quantify the distinct interrogative types in different domains for European Portuguese, and to discuss the weight of the linguistic features that best describe these structures, in order to model interrogatives in speech. We analyzed spoken dialogue, university lectures, and broadcast news corpora, and, for the sake of comparison, newspaper texts. The statistical analysis confirms that the percentage of the different types of interrogative is highly dependent on the nature of the corpus. Experiments on the automatic detection of interrogatives for European Portuguese, using only lexical cues, show results that are strongly correlated with the detection of a specific type of interrogative (namely wh- questions). When acoustic and prosodic features (pitch, energy and duration) are added, yes/no and tag questions are then increasingly identified, showing the advantages of combining lexical, acoustic and prosodic information.


Symposium on Languages, Applications and Technologies | 2014

Expanding a Database of Portuguese Tweets

Gaspar Brogueira; Fernando Batista; João Paulo Carvalho; Helena Moniz

This paper describes an existing database of geolocated tweets that were produced in Portuguese regions and proposes an approach to further expand it. The existing database covers eight consecutive days of collected tweets, totaling about 300 thousand tweets, produced by about 11 thousand different users. A detailed analysis of the content of the messages suggests a predominance of young authors who use Twitter as a way of reaching their colleagues with their feelings, ideas and comments. In order to further characterize this community of young people, we propose a method for retrieving additional tweets produced by the same set of authors already in the database. Our goal is to further extend the knowledge about each user of this community, making it possible to automatically characterize each user by the content he/she produces, cluster users and open other possibilities in the scope of social analysis.


COST'09 Proceedings of the Second international conference on Development of Multimodal Interfaces: active Listening and Synchrony | 2009

Disfluencies and the perspective of prosodic fluency

Helena Moniz; Isabel Trancoso; Ana Isabel Mata

This work explores prosodic cues of disfluent phenomena. We have conducted a perceptual experiment to test if listeners would rate all disfluencies as disfluent events or if some of them would be rated as fluent devices in specific prosodic contexts. Results pointed out significant differences (p < 0.05) between judgments of fluency vs. disfluency. Distinct prosodic properties of these events were also significant (p < 0.05) in their characterization as fluent devices. In an attempt to discriminate which linguistic features are more salient in the classification of disfluencies, we have also used CART techniques on a corpus of 3.5 hours of spontaneous and prepared non-scripted speech. CART results pointed out 2 splits: break indices and contour shape. The first split indicates that disfluent events uttered at breaks 3 and 4 are considered felicitous. The second one indicates that these events must have plateau or ascending contours to be considered as such; otherwise they are strongly penalized. The results obtained show that there are regular trends in the production of disfluencies, namely, prosodic phrasing and contour shape.
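As a rough illustration of how a CART-style tree arrives at splits such as break index and contour shape, the sketch below scores candidate binary splits by their reduction in Gini impurity, the criterion commonly used by CART for classification. The toy rows, the field names `break_index` and `contour`, and the labels are invented for the example; they are not data from the paper, and the paper's exact CART configuration is not stated in the abstract.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def split_gain(rows, labels, predicate):
    """Impurity reduction obtained by splitting the rows with a yes/no predicate."""
    left = [y for x, y in zip(rows, labels) if predicate(x)]
    right = [y for x, y in zip(rows, labels) if not predicate(x)]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted

if __name__ == "__main__":
    # Hypothetical disfluent events with listener judgments (made-up data).
    rows = [
        {"break_index": 4, "contour": "plateau"},
        {"break_index": 3, "contour": "ascending"},
        {"break_index": 3, "contour": "descending"},
        {"break_index": 1, "contour": "plateau"},
        {"break_index": 0, "contour": "ascending"},
        {"break_index": 4, "contour": "ascending"},
    ]
    labels = ["fluent", "fluent", "disfluent", "disfluent", "disfluent", "fluent"]

    by_break = split_gain(rows, labels, lambda r: r["break_index"] >= 3)
    by_contour = split_gain(
        rows, labels, lambda r: r["contour"] in ("plateau", "ascending")
    )
    print(f"break index split gain: {by_break:.2f}")
    print(f"contour split gain:     {by_contour:.2f}")
```

A tree builder would pick the split with the larger gain first and then repeat the search inside each branch. On this toy data the break-index split scores higher, so it would become the first split.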


International Conference on Information Systems | 2014

Portuguese geolocated tweets: an overview

Gaspar Brogueira; Fernando Batista; João Paulo Carvalho; Helena Moniz

This paper describes an existing database of geolocated tweets that were produced in Portuguese regions. The existing database was collected during eight consecutive days and contains about 307K tweets, produced by about 11K different users. A detailed analysis of the content of the messages suggests a predominance of teenage and young adult authors who use Twitter as a way to communicate their feelings, ideas and comments to their colleagues. An overview of the dataset suggests that tweets have very personal content, often describing family bonds and school activities and concerns. This is a suitable source of information for a number of tasks, including sociolinguistic studies and sentiment analysis, among others.


Symposium on Languages, Applications and Technologies | 2015

Speech Features for Discriminating Stress Using Branch and Bound Wrapper Search

Mariana Julião; Jorge M. B. Silva; Ana Aguiar; Helena Moniz; Fernando Batista

Stress detection from speech is a less explored field than Automatic Emotion Recognition and it is still not clear which features are better stress discriminants. The project VOCE aims at classifying speech as stressed or not-stressed in real time, using acoustic-prosodic features only. We therefore look for the best discriminating feature subsets from a set of 6125 features extracted with the openSMILE toolkit plus 160 Teager Energy Operator (TEO) features. We use a Mutual Information (MI) filter and a branch and bound wrapper heuristic with an SVM classifier to perform feature selection. Since many feature sets are selected, we analyse them in terms of chosen features and classifier performance, also considering true positive and false positive rates. The results show that the best feature types for our application case are Audio Spectral, MFCC, PCM and TEO. We reached results as high as 70.4% for generalisation accuracy.
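The filter stage mentioned in this abstract can be sketched as follows: score each feature by its mutual information with the stressed/not-stressed label and keep the top-ranked ones for the later wrapper search. This is only an illustrative sketch. It assumes the features have already been discretized into bins, the feature names are hypothetical, and the branch and bound wrapper with the SVM classifier is not reproduced here.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information, in bits, between two equally long discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts.
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi

def rank_features(table, labels, k):
    """Return the names of the k features with the highest MI against the labels.

    `table` maps a feature name to its list of discretized values (assumed layout).
    """
    scores = {name: mutual_information(col, labels) for name, col in table.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

if __name__ == "__main__":
    # Made-up discretized features for four utterances; 1 = stressed, 0 = not stressed.
    labels = [0, 0, 1, 1]
    table = {
        "teo_band_energy_bin": [0, 0, 1, 1],  # hypothetical, tracks the label
        "mfcc_3_bin": [0, 1, 0, 1],           # hypothetical, unrelated to the label
    }
    print(rank_features(table, labels, k=1))
```

A filter like this is cheap because each feature is scored independently, which is why it is a natural first pass over thousands of candidates before a costlier wrapper method evaluates feature subsets with a classifier.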


International Conference on Advances in Speech and Language Technologies for Iberian Languages | 2016

Acoustic-Prosodic Automatic Personality Trait Assessment for Adults and Children

Rubén Solera-Ureña; Helena Moniz; Fernando Batista; Ramón Fernández Astudillo; Joana Campos; Ana Paiva; Isabel Trancoso

This paper investigates the use of heterogeneous speech corpora for automatic assessment of personality traits in terms of the Big-Five OCEAN dimensions. The motivation for this work is twofold: the need to develop methods to overcome the lack of children’s speech corpora, particularly severe when targeting personality traits, and the interest in cross-age comparisons of acoustic-prosodic features to build robust paralinguistic detectors. For this purpose, we devise an experimental setup with age mismatch utilizing the Interspeech 2012 Personality Sub-challenge, containing adult speech, as training data. As test data, we use a corpus of children’s European Portuguese speech. We investigate various feature sets such as the Sub-challenge baseline features, the recently introduced eGeMAPS features and our own knowledge-based features. The preliminary results bring insights into cross-age and -language detection of personality traits in spontaneous speech, pointing to a stable set of acoustic-prosodic features for Extraversion and Agreeableness in both adult and child speech.

Collaboration


Dive into Helena Moniz's collaborations.

Top Co-Authors

Isabel Trancoso

Instituto Superior Técnico
