Michael Tjalve
University of Washington
Publication
Featured research published by Michael Tjalve.
AI Magazine | 2014
Tatiana Josephy; Matthew Lease; Praveen Paritosh; Markus Krause; Mihai Georgescu; Michael Tjalve; Daniela Braga
The first AAAI Conference on Human Computation and Crowdsourcing (HCOMP-2013) was held November 6-9, 2013 in Palm Springs, California. Three workshops took place on Saturday, November 9th: Crowdsourcing at Scale (full day), Human and Machine Learning in Games (full day) and Scaling Speech, Language Understanding and Dialogue through Crowdsourcing (half day). This report summarizes the activities of those three events.
Text, Speech, and Dialogue | 2013
Jeanne Parson; Daniela Braga; Michael Tjalve; Jieun Oh
One of the key aspects of creating high quality synthetic speech is the validation process. Establishing validation processes that are reliable and scalable is challenging. Today, the maturity of the crowdsourcing infrastructure along with better techniques for validating the data gathered through crowdsourcing have made it possible to perform reliable speech synthesis validation at a larger scale. In this paper, we present a study of voice quality evaluation using a crowdsourcing platform. We investigate voice gender preference across eight locales for three typical TTS scenarios. We also examine the degree to which speaker adaptation can carry over certain voice qualities, such as mood, of the target speaker to the adapted TTS. Based on an existing full TTS font, adaptation is carried out on a smaller amount of speech data from a target speaker. Finally, we show how crowdsourcing contributes to objective assessment when dealing with voice preference in voice talent selection.
Processing of the Portuguese Language | 2014
Annika Hämäläinen; Hyongsil Cho; Sara Candeias; Thomas Pellegrini; Alberto Abad; Michael Tjalve; Isabel Trancoso; Miguel Sales Dias
This paper reports findings from an analysis of errors made by an automatic speech recogniser trained and tested with 3-10-year-old European Portuguese children's speech. We expected and were able to identify frequent pronunciation error patterns in the children's speech. Furthermore, we were able to correlate some of these pronunciation error patterns and automatic speech recognition errors. The findings reported in this paper are of phonetic interest but will also be useful for improving the performance of automatic speech recognisers aimed at children representing the target population of the study.
IEEE Transactions on Audio, Speech, and Language Processing | 2018
Jorge Proença; Carla Lopes; Michael Tjalve; Andreas Stolcke; Sara Candeias; Fernando Perdigão
This paper proposes an approach to automatically parse children's reading of sentences by detecting word pronunciations and extra content, and to classify words as correctly or incorrectly pronounced. This approach can be directly helpful for automatic assessment of reading level or for automatic reading tutors, where a correct reading must be identified. We propose a first segmentation stage to locate candidate word pronunciations based on allowing repetitions and false starts of a word's syllables. A decoding grammar based solely on syllables allows silence to appear during a word pronunciation. At a second stage, word candidates are classified as mispronounced or not. The feature that best classifies mispronunciations is found to be the log-likelihood ratio between a free phone loop and a word spotting model in the very close vicinity of the candidate segmentation. Additional features are combined in multifeature models to further improve classification, including: normalizations of the log-likelihood ratio, derivations from phone likelihoods, and Levenshtein distances between the correct pronunciation and recognized phonemes through two phoneme recognition approaches. Results show that most extra events were detected (close to 2% word error rate achieved) and that using automatic segmentation for mispronunciation classification approaches the performance of manual segmentation. Although the log-likelihood ratio from a spotting approach is already a good metric to classify word pronunciations, the combination of additional features provides a relative reduction of the miss rate of 18% (from 34.03% to 27.79% using manual segmentation and from 35.58% to 29.35% using automatic segmentation, at constant 5% false alarm rate).
Proceedings of the 2nd Workshop on the Use of Computational Methods in the Study of Endangered Languages | 2017
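As a rough illustration of two quantities named in the abstract above, the following Python sketch computes a Levenshtein distance between a canonical and a recognized phoneme sequence, and checks the reported relative miss-rate reduction. It is not the authors' implementation: the function names and the example phoneme sequences are hypothetical, and unit edit costs are an assumption. Only the four miss-rate figures come from the abstract.

```python
def levenshtein(ref, hyp):
    """Edit distance between two phoneme sequences.

    Assumes unit cost for insertion, deletion and substitution.
    """
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting the first i reference phonemes
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def relative_reduction(before, after):
    """Relative reduction of a rate, as a fraction of the starting value."""
    return (before - after) / before


# Hypothetical phoneme sequences, for illustration only.
canonical = ["k", "a", "z", "a"]
recognized = ["k", "a", "s"]
distance = levenshtein(canonical, recognized)

# Miss rates quoted in the abstract (percent, at 5% false alarm rate).
manual = relative_reduction(34.03, 27.79)
automatic = relative_reduction(35.58, 29.35)
print(distance, round(manual * 100, 1), round(automatic * 100, 1))
```

With the quoted figures, the relative reduction comes out at about 18.3% for manual segmentation and about 17.5% for automatic segmentation, which is consistent with the rounded 18% stated in the abstract.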
Gina-Anne Levow; Emily M. Bender; Patrick Littell; Kristen Howell; Shobhana Lakshmi Chelliah; Joshua Crowgey; Dan Garrette; Jeff Good; Sharon Hargus; David Inman; Michael Maxwell; Michael Tjalve; Fei Xia
This paper describes the use of Shared Task Evaluation Campaigns by designing tasks that are compelling to speech and natural language processing researchers while addressing technical challenges in language documentation and exploiting growing archives of endangered language data.
Processing of the Portuguese Language | 2016
Jorge Proença; Dirce Celorico; Carla Lopes; Miguel Sales Dias; Michael Tjalve; Andreas Stolcke; Sara Candeias; Fernando Perdigão
To evaluate the reading performance of children, human assessment is usually involved, where a teacher or tutor has to take time to individually estimate the performance in terms of fluency (speed, accuracy and expression). Automatic estimation of reading ability can be an important alternative or complement to the usual methods, and can improve other applications such as e-learning. Techniques must be developed to analyse audio recordings of read utterances by children and detect the deviations from the intended correct reading, i.e., disfluencies. For that goal, a database of 284 European Portuguese children from 6 to 10 years old (1st–4th grades) reading aloud amounting to 20 h was collected in private and public Portuguese schools. This paper describes the design of the reading tasks as well as the data collection procedure. The presence of different types of disfluencies is analysed as well as reading performance compared to known curricular goals.
Conference of the International Speech Communication Association | 2013
Thomas Pellegrini; Annika Hämäläinen; Philippe Boula de Mareüil; Michael Tjalve; Isabel Trancoso; Sara Candeias; Miguel Sales Dias; Daniela Braga
Conference of the International Speech Communication Association | 2005
Michael Tjalve; Mark Huckvale
Workshop on Child Computer Interaction - WOCCI 2014 | 2014
Annika Hämäläinen; Sara Candeias; Hyongsil Cho; Hugo Meinedo; Alberto Abad; Thomas Pellegrini; Michael Tjalve; Isabel Trancoso; Miguel Sales Dias
Processing of the Portuguese Language | 2014
Annika Hämäläinen; Hugo Meinedo; Michael Tjalve; Thomas Pellegrini; Isabel Trancoso; Miguel Sales Dias