Mariko Kondo
Waseda University
Publications
Featured research published by Mariko Kondo.
2009 Oriental COCOSDA International Conference on Speech Database and Assessments | 2009
Tanya Visceglia; Chiu-yu Tseng; Mariko Kondo; Helen M. Meng; Yoshinori Sagisaka
This research is part of the ongoing multinational collaboration “Asian English Speech cOrpus Project” (AESOP), whose aim is to build up an Asian English speech corpus representing the varieties of English spoken in Asia. AESOP is an international consortium of linguists, speech scientists, psychologists and educators from Japan, Taiwan, Hong Kong, China, Thailand, Indonesia and Mongolia. Its primary aim is to collect and compare Asian English speech corpora from the countries listed above in order to derive a set of core properties common to all varieties of Asian English, as well as to discover features that are particular to individual varieties. Each research team will use a common recording setup and share an experimental task set, and will develop a common, open-ended annotation system. Moreover, AESOP-collected corpora will be an open resource, available to the research community at large. The initial stage of the phonetics aspect of this project will be devoted to designing spoken-language tasks which will elicit production of a large range of English segmental and suprasegmental characteristics. These data will be used to generate a catalogue of acoustic characteristics particular to individual varieties of Asian English, which will then be compared with the data collected by other AESOP members in order to determine areas of overlap between L1 and L2 English as well as differences among varieties of Asian English.
Archive | 2005
Mariko Kondo
Vowel devoicing is a common phonological process in many languages and typically involves high vowels and schwa. High vowels and schwa are inherently short (Bell, 1978; Dauer, 1980), and the process usually occurs when the vowels are either adjacent to, or surrounded by, voiceless consonants, during which the glottis is fully open. Vowel devoicing is thought to be a consequence of articulatory undershoot of the glottal movements; this suggests that devoicing processes result from glottal gestural overlap between voiceless consonants and short vowels. The movements of the glottal muscles for the short high vowels /i/ and /u/ blend with those for the adjacent voiceless sounds or a pause (Jun, 1993; Jun and Beckman, 1994). In many languages, the process is also considered part of vowel neutralization and reduction, in which vowels are first reduced in duration and centralized in quality, typically in unaccented position, and then eventually devoiced and/or deleted in fast or casual speech (Hyman, 1975; Wheeler, 1979; Dauer, 1980; Kohler, 1990).
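The environment described above (a short high vowel flanked by voiceless consonants or a pause) can be stated as a simple rule. The Python sketch below is a minimal, hypothetical illustration of flagging candidate devoicing sites in a romanized phone sequence; the phone inventory and voiceless sets are assumptions for illustration, not materials from the study.

```python
# Minimal sketch: flag positions where short high vowels /i, u/ are
# surrounded by voiceless consonants (or a pause), the canonical
# devoicing environment described above. Phone sets are illustrative.

VOICELESS = {"k", "s", "sh", "t", "ch", "ts", "h", "f", "p"}
HIGH_VOWELS = {"i", "u"}
PAUSE = {"<pau>", None}

def devoicing_candidates(phones):
    """Return indices of high vowels in a voiceless environment."""
    candidates = []
    for i, ph in enumerate(phones):
        if ph not in HIGH_VOWELS:
            continue
        prev = phones[i - 1] if i > 0 else None
        nxt = phones[i + 1] if i + 1 < len(phones) else None
        prev_ok = prev in VOICELESS or prev in PAUSE
        next_ok = nxt in VOICELESS or nxt in PAUSE
        if prev_ok and next_ok:
            candidates.append(i)
    return candidates

# Example: /kusa/ 'grass' -> the /u/ between /k/ and /s/ is a candidate.
print(devoicing_candidates(["k", "u", "s", "a"]))  # [1]
```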
Language and Speech | 2018
Kimiko Tsukada; Mariko Kondo
This study examines the perception of Mandarin lexical tones by native speakers of Burmese who use lexical tones in their first language (L1) but are naïve to Mandarin. Unlike Mandarin tones, which are primarily cued by pitch, Burmese tones are cued by phonation type as well as pitch. The question of interest is whether Burmese listeners can utilize their L1 experience in processing unfamiliar Mandarin tones. Burmese listeners’ discrimination accuracy was compared with that of Mandarin listeners and Australian English listeners. The Australian English group was included as a control group with a non-tonal background. Accuracy of perception of six tone pairs (T1-T2, T1-T3, T1-T4, T2-T3, T2-T4, T3-T4) was assessed in a discrimination test. Our main findings are 1) Mandarin listeners were more accurate than non-native listeners in discriminating all tone pairs, 2) Australian English listeners naïve to Mandarin were more accurate than similarly naïve Burmese listeners in discriminating all tone pairs except for T2-T4, and 3) Burmese listeners had the greatest trouble discriminating T2-T3 and T1-T2. Taken together, the results suggest that merely possessing lexical tones in L1 may not necessarily facilitate the perception of non-native tones, and that the active use of phonation type in encoding L1 tones may have played a role in Burmese listeners’ less than optimal perception of Mandarin tones.
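A comparison of this kind reduces to per-group accuracy over tone pairs. The sketch below is a minimal illustration, with a hypothetical trial format rather than the authors' materials, of tabulating discrimination accuracy by listener group and Mandarin tone pair.

```python
# Minimal sketch (hypothetical data format): compute per-group
# discrimination accuracy for each Mandarin tone pair from trial records.

from collections import defaultdict

# Each trial: (listener_group, tone_pair, correct)
trials = [
    ("Mandarin", "T2-T3", True),
    ("Burmese", "T2-T3", False),
    ("AusEnglish", "T1-T2", True),
    # ... more trials
]

def accuracy_by_group_and_pair(trials):
    hits = defaultdict(int)
    totals = defaultdict(int)
    for group, pair, correct in trials:
        totals[(group, pair)] += 1
        hits[(group, pair)] += int(correct)
    return {key: hits[key] / totals[key] for key in totals}

for (group, pair), acc in sorted(accuracy_by_group_and_pair(trials).items()):
    print(f"{group:10s} {pair}: {acc:.2f}")
```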
Journal of the Acoustical Society of America | 2016
Shigeaki Amano; Kimiko Yamakawa; Mariko Kondo
Previous studies have shown that a devoiced vowel makes burst duration of a singleton stop consonant longer than a voiced vowel does. However, the effects of devoiced vowels on stop consonants including a geminate have not been fully investigated. To clarify the effects, durations of a burst and a closure in Japanese stops pronounced by 19 native Japanese speakers were analyzed by stop type (singleton or geminate) and speaking rate (fast, normal, or slow). Analysis revealed that at all speaking rates a devoiced vowel makes the burst duration of singleton and geminate stops longer than a voiced vowel does. Analysis also revealed that at normal and fast speaking rates a devoiced vowel has no effect on closure duration in singleton or geminate stops. However, at a slow speaking rate, a devoiced vowel makes the closure duration of a singleton stop longer. These results indicate that a devoiced vowel consistently lengthens the burst duration of singleton and geminate stops but that its effects on closure durat...
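The analysis described above amounts to comparing mean burst and closure durations across stop type, speaking rate, and devoicing of the following vowel. The sketch below is a minimal illustration with hypothetical column names and toy values, not the study's data.

```python
# Minimal sketch (hypothetical columns, toy values): compare mean burst
# and closure durations by stop type (singleton vs geminate), speaking
# rate, and whether the following vowel is devoiced.

import pandas as pd

df = pd.DataFrame({
    "stop_type":  ["singleton", "singleton", "geminate", "geminate"],
    "rate":       ["normal", "normal", "slow", "slow"],
    "devoiced":   [True, False, True, False],
    "burst_ms":   [28.0, 22.0, 31.0, 24.0],
    "closure_ms": [55.0, 54.0, 120.0, 118.0],
})

summary = (
    df.groupby(["stop_type", "rate", "devoiced"])[["burst_ms", "closure_ms"]]
      .mean()
)
print(summary)
```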
Journal of the Acoustical Society of America | 2016
Mariko Kondo
Japanese speakers have difficulty differentiating English /l/ and /r/ because the two are not contrastive in Japanese. In fact, variants of both /l/ and /r/ occur in Japanese speech. The most common realization of Japanese /r/ is the alveolar tap [ɾ], and all variants of both /l/ and /r/ are considered allophones of /r/. Previous work found that Japanese speakers have more problems with /l/ than with /r/ in their English production, so this study investigates why these differences occur. Analysis of Japanese speakers’ mimicry of (a) American English and (b) English-accented Japanese suggested that Japanese speakers are aware of acoustic and articulatory features of the English approximant [ɹ]. Japanese speakers overused the approximant [ɹ] and r-colored vowels in their mimicries of both (a) and (b). Further articulatory analysis of Japanese and English consonants showed that the English approximant [ɹ] is quite distinct from Japanese consonants, all of which lack lip rounding. The results of these studies sugge...
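A standard acoustic correlate of English [ɹ] and r-colored vowels is a marked drop in the third formant (F3). The sketch below is a minimal illustration, assuming a precomputed F3 track in Hz and a hypothetical threshold, of flagging r-colored tokens; it is not a procedure from the study itself.

```python
# Minimal sketch: flag r-coloring from a precomputed F3 track (Hz).
# English [ɹ] and r-colored vowels show a pronounced F3 dip; the 2000 Hz
# threshold is an illustrative assumption, not a value from the study.

def is_r_colored(f3_track_hz, threshold_hz=2000.0):
    """Return True if the F3 minimum falls below the threshold."""
    valid = [f3 for f3 in f3_track_hz if f3 is not None and f3 > 0]
    return bool(valid) and min(valid) < threshold_hz

# Toy example: an F3 track dipping toward ~1750 Hz suggests r-coloring.
print(is_r_colored([2600.0, 2300.0, 1750.0, 1900.0, 2400.0]))  # True
```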
Speech Communication | 2015
Greg Short; Keikichi Hirose; Mariko Kondo; Nobuaki Minematsu
Highlights: Vowel length in Japanese is automatically recognized while accounting for speaking rate, which had not previously been done robustly. Perceptual experiments provide new knowledge about the mechanism of human discrimination of vowel length. An algorithm motivated by these perceptual experiments accounts for speaking rate. The method is shown to outperform the standard approach and to be robust against speaking rate.

Automatic recognition of vowel length in Japanese has several applications in speech processing, such as computer-assisted language learning (CALL) systems. Standard automatic speech recognition (ASR) systems use hidden Markov models (HMMs) to carry out the recognition. However, HMMs are not particularly well suited to this problem, since classification of vowel length depends on prosodic information, and since vowel length is a relative feature affected by changes in the durations of surrounding sounds, which vary in part with speaking rate. It is not obvious how to design an algorithm to account for these contextual dependencies, since not enough is known about how humans perceive the contrast. Therefore, in this paper, we conduct perceptual experiments to further understand the mechanism of human vowel length recognition. We found that the perceptual boundary is largely affected by the vowels two prior to, one prior to, and following the vowel whose length is being recognized. Based on these results and the work of others, we propose an algorithm that post-processes alignments output by HMMs to automatically recognize vowel length. The method is composed of two levels of processing, dealing first with local dependencies and then with long-term dependencies. We test several variations of this algorithm. The method is shown to have superior recognition capability and to be robust against speaking rate differences, producing significant improvements. We test this method on three databases: a speaking rate database, a native database, and a non-native database.
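As an illustration of the kind of post-processing described above (not the authors' actual algorithm), the sketch below classifies a vowel as long or short from alignment durations by normalizing the target vowel's duration against neighboring vowels to compensate for speaking rate; the threshold, fallback, and data format are assumptions.

```python
# Illustrative sketch (not the paper's algorithm): classify vowel length
# from alignment durations, normalizing against neighboring vowels to
# reduce the effect of speaking rate. Threshold and format are assumed.

def classify_vowel_length(vowel_durations_ms, target_index, threshold=1.6):
    """Return 'long' or 'short' for the vowel at target_index.

    vowel_durations_ms: durations of successive vowels from a forced
    alignment (e.g., HMM output), in milliseconds.
    """
    target = vowel_durations_ms[target_index]
    # Context: up to two preceding vowels and one following vowel,
    # the positions reported to influence the perceptual boundary.
    context = [
        vowel_durations_ms[i]
        for i in (target_index - 2, target_index - 1, target_index + 1)
        if 0 <= i < len(vowel_durations_ms) and i != target_index
    ]
    if not context:
        return "long" if target > 150.0 else "short"  # assumed fallback
    ratio = target / (sum(context) / len(context))
    return "long" if ratio > threshold else "short"

# Toy example: a 180 ms vowel among ~80 ms neighbors looks long.
print(classify_vowel_length([80.0, 75.0, 180.0, 85.0], target_index=2))
```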
2014 17th Oriental Chapter of the International Committee for the Co-ordination and Standardization of Speech Databases and Assessment Techniques (COCOSDA) | 2014
Kimiko Yamakawa; Shigeaki Amano; Mariko Kondo
A Japanese read-word database was developed for 343 Japanese words spoken by 129 non-native speakers of Japanese (Korean, Chinese, Thai, Vietnamese, French, and English speakers) and 18 native speakers of Japanese. The Japanese words contained segmental contrasts such as single/geminate stops, normal/lengthened vowels, presence/absence of a moraic nasal, contracted sound vs. vowel /i/, and voiced/voiceless consonants, which non-native speakers of Japanese frequently mispronounce. The speech files of the Japanese words, and phoneme labels for a subset of the speech files, were registered in the database. The database will contribute to scientific research on the characteristics of Japanese speech by non-native speakers. It will also contribute to the development of a computer-aided learning system for Japanese speech for non-native speakers.
Journal of the Acoustical Society of America | 2013
Shigeaki Amano; Kimiko Yamakawa; Mariko Kondo
Twenty-nine durational variables were examined to clarify rhythmic characteristics in non-fluent Japanese utterances by non-native speakers. Discriminant analysis with these variables was performed on 343 Japanese words, each pronounced in a carrier sentence by six native Japanese speakers and 14 non-native Japanese speakers (7 Vietnamese with low Japanese proficiency and 7 Chinese with high Japanese proficiency). The results showed that a combination of two durational variables, the percentage of vowel duration and the average “Normalized Voice Onset Asynchrony” (the interval between the onsets of two successive vowels divided by the first vowel's duration), could discriminate Japanese speakers from Vietnamese speakers with a small error (8.7%, n = 4458). However, these two variables produced a large error (39.4%, n = 4458) in discriminating Japanese speakers from Chinese speakers, who had higher Japanese proficiency than the Vietnamese speakers. These results suggest that the two varia...
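The two variables named above can be computed directly from segment timings. The sketch below is a minimal illustration with a hypothetical segment format, not the study's pipeline: the percentage of vowel duration for an utterance, and the mean Normalized Voice Onset Asynchrony (interval between the onsets of successive vowels divided by the first vowel's duration).

```python
# Minimal sketch (hypothetical segment format): compute the percentage of
# vowel duration (%V) and the mean Normalized Voice Onset Asynchrony
# (NVOA), i.e., the onset-to-onset interval between successive vowels
# divided by the first vowel's duration, for one utterance.

def percent_vowel_duration(segments):
    """segments: list of (label, start_s, end_s, is_vowel)."""
    total = sum(end - start for _, start, end, _ in segments)
    vowels = sum(end - start for _, start, end, v in segments if v)
    return 100.0 * vowels / total

def mean_nvoa(segments):
    vowels = [(start, end) for _, start, end, v in segments if v]
    ratios = [
        (v2_start - v1_start) / (v1_end - v1_start)
        for (v1_start, v1_end), (v2_start, _) in zip(vowels, vowels[1:])
    ]
    return sum(ratios) / len(ratios) if ratios else float("nan")

segs = [("k", 0.00, 0.05, False), ("a", 0.05, 0.15, True),
        ("t", 0.15, 0.21, False), ("a", 0.21, 0.30, True)]
print(percent_vowel_duration(segs), mean_nvoa(segs))  # 63.3..., 1.6
```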
conference of the international speech communication association | 2009
Helen M. Meng; Chiu-yu Tseng; Mariko Kondo; Alissa M. Harrison; Tanya Visceglia
conference of the international speech communication association | 1994
Mariko Kondo