Mariko Kondo
Waseda University
Publications
Featured research published by Mariko Kondo.
2009 Oriental COCOSDA International Conference on Speech Database and Assessments | 2009
Tanya Visceglia; Chiu-yu Tseng; Mariko Kondo; Helen M. Meng; Yoshinori Sagisaka
This research is part of the ongoing multinational collaboration “Asian English Speech cOrpus Project” (AESOP), whose aim is to build up an Asian English speech corpus representing the varieties of English spoken in Asia. AESOP is an international consortium of linguists, speech scientists, psychologists and educators from Japan, Taiwan, Hong Kong, China, Thailand, Indonesia and Mongolia. Its primary aim is to collect and compare Asian English speech corpora from the countries listed above in order to derive a set of core properties common to all varieties of Asian English, as well as to discover features that are particular to individual varieties. Each research team will use a common recording setup and share an experimental task set, and will develop a common, open-ended annotation system. Moreover, AESOP-collected corpora will be an open resource, available to the research community at large. The initial stage of the phonetics aspect of this project will be devoted to designing spoken-language tasks which will elicit production of a large range of English segmental and suprasegmental characteristics. These data will be used to generate a catalogue of acoustic characteristics particular to individual varieties of Asian English, which will then be compared with the data collected by other AESOP members in order to determine areas of overlap between L1 and L2 English as well as differences among varieties of Asian English.
Archive | 2005
Mariko Kondo
Vowel devoicing is a common phonological process in many languages and typically involves high vowels and schwa. High vowels and schwa are inherently short (Bell, 1978; Dauer, 1980), and the process usually occurs when the vowels are either adjacent to, or surrounded by, voiceless consonants, during which the glottis is fully open. Vowel devoicing is thought to be a consequence of articulatory undershoot of the glottal movements; this suggests that devoicing processes result from glottal gestural overlap between voiceless consonants and short vowels. The movements of the glottal muscles for the short high vowels /i/ and /u/ blend with those for the adjacent voiceless sounds or a pause (Jun, 1993; Jun and Beckman, 1994). In many languages, the process is also considered part of vowel neutralization and reduction, in which vowels are first reduced in duration and centralized in quality, typically in unaccented position, and then eventually devoiced and/or deleted in fast or casual speech (Hyman, 1975; Wheeler, 1979; Dauer, 1980; Kohler, 1990).
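The environment described above (a short high vowel flanked by voiceless consonants or a pause) can be stated as a simple rule. The Python sketch below is a minimal, hypothetical illustration of flagging candidate devoicing sites in a romanized phone sequence; the phone inventory and voiceless sets are assumptions for illustration, not materials from the study.

```python
# Minimal sketch: flag positions where short high vowels /i, u/ are
# surrounded by voiceless consonants (or a pause), the canonical
# devoicing environment described above. Phone sets are illustrative.

VOICELESS = {"k", "s", "sh", "t", "ch", "ts", "h", "f", "p"}
HIGH_VOWELS = {"i", "u"}
PAUSE = {"<pau>", None}

def devoicing_candidates(phones):
    """Return indices of high vowels in a voiceless environment."""
    candidates = []
    for i, ph in enumerate(phones):
        if ph not in HIGH_VOWELS:
            continue
        prev = phones[i - 1] if i > 0 else None
        nxt = phones[i + 1] if i + 1 < len(phones) else None
        prev_ok = prev in VOICELESS or prev in PAUSE
        next_ok = nxt in VOICELESS or nxt in PAUSE
        if prev_ok and next_ok:
            candidates.append(i)
    return candidates

# Example: /kusa/ 'grass' -> the /u/ between /k/ and /s/ is a candidate.
print(devoicing_candidates(["k", "u", "s", "a"]))  # [1]
```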
Language and Speech | 2018
Kimiko Tsukada; Mariko Kondo
This study examines the perception of Mandarin lexical tones by native speakers of Burmese who use lexical tones in their first language (L1) but are naïve to Mandarin. Unlike Mandarin tones, which are primarily cued by pitch, Burmese tones are cued by phonation type as well as pitch. The question of interest is whether Burmese listeners can utilize their L1 experience in processing unfamiliar Mandarin tones. Burmese listeners’ discrimination accuracy was compared with that of Mandarin listeners and Australian English listeners. The Australian English group was included as a control group with a non-tonal background. Accuracy of perception of six tone pairs (T1-T2, T1-T3, T1-T4, T2-T3, T2-T4, T3-T4) was assessed in a discrimination test. Our main findings are 1) Mandarin listeners were more accurate than non-native listeners in discriminating all tone pairs, 2) Australian English listeners naïve to Mandarin were more accurate than similarly naïve Burmese listeners in discriminating all tone pairs except for T2-T4, and 3) Burmese listeners had the greatest trouble discriminating T2-T3 and T1-T2. Taken together, the results suggest that merely possessing lexical tones in L1 may not necessarily facilitate the perception of non-native tones, and that the active use of phonation type in encoding L1 tones may have played a role in Burmese listeners’ less than optimal perception of Mandarin tones.
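A comparison of this kind reduces to per-group accuracy over tone pairs. The sketch below is a minimal illustration, with a hypothetical trial format rather than the authors' materials, of tabulating discrimination accuracy by listener group and Mandarin tone pair.

```python
# Minimal sketch (hypothetical data format): compute per-group
# discrimination accuracy for each Mandarin tone pair from trial records.

from collections import defaultdict

# Each trial: (listener_group, tone_pair, correct)
trials = [
    ("Mandarin", "T2-T3", True),
    ("Burmese", "T2-T3", False),
    ("AusEnglish", "T1-T2", True),
    # ... more trials
]

def accuracy_by_group_and_pair(trials):
    hits = defaultdict(int)
    totals = defaultdict(int)
    for group, pair, correct in trials:
        totals[(group, pair)] += 1
        hits[(group, pair)] += int(correct)
    return {key: hits[key] / totals[key] for key in totals}

for (group, pair), acc in sorted(accuracy_by_group_and_pair(trials).items()):
    print(f"{group:10s} {pair}: {acc:.2f}")
```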
Journal of the Acoustical Society of America | 2016
Shigeaki Amano; Kimiko Yamakawa; Mariko Kondo
Previous studies have shown that a devoiced vowel makes burst duration of a singleton stop consonant longer than a voiced vowel does. However, the effects of devoiced vowels on stop consonants including a geminate have not been fully investigated. To clarify the effects, durations of a burst and a closure in Japanese stops pronounced by 19 native Japanese speakers were analyzed by stop type (singleton or geminate) and speaking rate (fast, normal, or slow). Analysis revealed that at all speaking rates a devoiced vowel makes the burst duration of singleton and geminate stops longer than a voiced vowel does. Analysis also revealed that at normal and fast speaking rates a devoiced vowel has no effect on closure duration in singleton or geminate stops. However, at a slow speaking rate, a devoiced vowel makes the closure duration of a singleton stop longer. These results indicate that a devoiced vowel consistently lengthens the burst duration of singleton and geminate stops but that its effects on closure durat...
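The analysis described above amounts to comparing mean burst and closure durations across stop type, speaking rate, and devoicing of the following vowel. The sketch below is a minimal illustration with hypothetical column names and toy values, not the study's data.

```python
# Minimal sketch (hypothetical columns, toy values): compare mean burst
# and closure durations by stop type (singleton vs geminate), speaking
# rate, and whether the following vowel is devoiced.

import pandas as pd

df = pd.DataFrame({
    "stop_type":  ["singleton", "singleton", "geminate", "geminate"],
    "rate":       ["normal", "normal", "slow", "slow"],
    "devoiced":   [True, False, True, False],
    "burst_ms":   [28.0, 22.0, 31.0, 24.0],
    "closure_ms": [55.0, 54.0, 120.0, 118.0],
})

summary = (
    df.groupby(["stop_type", "rate", "devoiced"])[["burst_ms", "closure_ms"]]
      .mean()
)
print(summary)
```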
Journal of the Acoustical Society of America | 2016
Mariko Kondo
Japanese speakers have difficulty differentiating English /l/ and /r/ because the two are not contrastive in Japanese. In fact, variants of both /l/ and /r/ occur in Japanese speech. The most common realization of Japanese /r/ is the alveolar tap [ɾ], and all variants of both /l/ and /r/ are considered allophones of /r/. Previous work found that Japanese speakers have more problems with /l/ than with /r/ in their English production, so this study investigates why these differences occur. Analysis of Japanese speakers’ mimicry of (a) American English and (b) English-accented Japanese suggested that Japanese speakers are aware of acoustic and articulatory features of the English approximant [ɹ]. Japanese speakers overused the approximant [ɹ] and r-colored vowels in their mimicries of both (a) and (b). Further articulatory analysis of Japanese and English consonants showed that the English approximant [ɹ] is quite distinct from Japanese consonants, all of which lack lip rounding. The results of these studies sugge...
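A standard acoustic correlate of English [ɹ] and r-colored vowels is a marked drop in the third formant (F3). The sketch below is a minimal illustration, assuming a precomputed F3 track in Hz and a hypothetical threshold, of flagging r-colored tokens; it is not a procedure from the study itself.

```python
# Minimal sketch: flag r-coloring from a precomputed F3 track (Hz).
# English [ɹ] and r-colored vowels show a pronounced F3 dip; the 2000 Hz
# threshold is an illustrative assumption, not a value from the study.

def is_r_colored(f3_track_hz, threshold_hz=2000.0):
    """Return True if the F3 minimum falls below the threshold."""
    valid = [f3 for f3 in f3_track_hz if f3 is not None and f3 > 0]
    return bool(valid) and min(valid) < threshold_hz

# Toy example: an F3 track dipping toward ~1750 Hz suggests r-coloring.
print(is_r_colored([2600.0, 2300.0, 1750.0, 1900.0, 2400.0]))  # True
```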
Speech Communication | 2015
Greg Short; Keikichi Hirose; Mariko Kondo; Nobuaki Minematsu
Highlights: Vowel length in Japanese is automatically recognized while accounting for speaking rate, which had not previously been done robustly. Perceptual experiments provide new knowledge about the mechanism of human discrimination of vowel length. An algorithm motivated by these perceptual experiments accounts for speaking rate. The method is shown to outperform the standard approach and to be robust against speaking rate.

Automatic recognition of vowel length in Japanese has several applications in speech processing, such as computer-assisted language learning (CALL) systems. Standard automatic speech recognition (ASR) systems use hidden Markov models (HMMs) to carry out the recognition. However, HMMs are not particularly well suited to this problem, since classification of vowel length depends on prosodic information, and since vowel length is a relative feature affected by changes in the durations of surrounding sounds, which vary in part with speaking rate. It is not obvious how to design an algorithm to account for these contextual dependencies, since not enough is known about how humans perceive the contrast. Therefore, in this paper, we conduct perceptual experiments to further understand the mechanism of human vowel length recognition. We found that the perceptual boundary is largely affected by the vowels two prior to, one prior to, and following the vowel whose length is being recognized. Based on these results and the work of others, we propose an algorithm that post-processes alignments output by HMMs to automatically recognize vowel length. The method is composed of two levels of processing, dealing first with local dependencies and then with long-term dependencies. We test several variations of this algorithm. The method is shown to have superior recognition capability and to be robust against speaking rate differences, producing significant improvements. We test this method on three databases: a speaking rate database, a native database, and a non-native database.
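As an illustration of the kind of post-processing described above (not the authors' actual algorithm), the sketch below classifies a vowel as long or short from alignment durations by normalizing the target vowel's duration against neighboring vowels to compensate for speaking rate; the threshold, fallback, and data format are assumptions.

```python
# Illustrative sketch (not the paper's algorithm): classify vowel length
# from alignment durations, normalizing against neighboring vowels to
# reduce the effect of speaking rate. Threshold and format are assumed.

def classify_vowel_length(vowel_durations_ms, target_index, threshold=1.6):
    """Return 'long' or 'short' for the vowel at target_index.

    vowel_durations_ms: durations of successive vowels from a forced
    alignment (e.g., HMM output), in milliseconds.
    """
    target = vowel_durations_ms[target_index]
    # Context: up to two preceding vowels and one following vowel,
    # the positions reported to influence the perceptual boundary.
    context = [
        vowel_durations_ms[i]
        for i in (target_index - 2, target_index - 1, target_index + 1)
        if 0 <= i < len(vowel_durations_ms) and i != target_index
    ]
    if not context:
        return "long" if target > 150.0 else "short"  # assumed fallback
    ratio = target / (sum(context) / len(context))
    return "long" if ratio > threshold else "short"

# Toy example: a 180 ms vowel among ~80 ms neighbors looks long.
print(classify_vowel_length([80.0, 75.0, 180.0, 85.0], target_index=2))
```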
2014 17th Oriental Chapter of the International Committee for the Co-ordination and Standardization of Speech Databases and Assessment Techniques (COCOSDA) | 2014
Kimiko Yamakawa; Shigeaki Amano; Mariko Kondo
A Japanese read-word database was developed for 343 Japanese words spoken by 129 non-native speakers of Japanese (Korean, Chinese, Thai, Vietnamese, French, and English speakers) and 18 native speakers of Japanese. The Japanese words contained segmental contrasts such as single/geminate stops, normal/lengthened vowels, presence/absence of a moraic nasal, contracted sound vs. vowel /i/, and voiced/voiceless consonants, which non-native speakers of Japanese frequently mispronounce. The speech files of the Japanese words, and phoneme labels for a subset of the speech files, were registered in the database. The database will contribute to scientific research on the characteristics of Japanese speech by non-native speakers. It will also contribute to the development of a computer-aided learning system for Japanese speech for non-native speakers.
Journal of the Acoustical Society of America | 2013
Shigeaki Amano; Kimiko Yamakawa; Mariko Kondo
Twenty-nine durational variables were examined to clarify rhythmic characteristics in non-fluent Japanese utterances by non-native speakers. Discriminant analysis with these variables was performed on 343 Japanese words, each pronounced in a carrier sentence by six native Japanese speakers and 14 non-native Japanese speakers (7 Vietnamese with low Japanese proficiency and 7 Chinese with high Japanese proficiency). The results showed that a combination of two durational variables, the percentage of vowel duration and the average “Normalized Voice Onset Asynchrony” (the interval between the onsets of two successive vowels divided by the first vowel's duration), could discriminate Japanese speakers from Vietnamese speakers with a small error (8.7%, n = 4458). However, these two variables produced a large error (39.4%, n = 4458) in discriminating Japanese speakers from Chinese speakers, who had higher Japanese proficiency than the Vietnamese speakers. These results suggest that the two varia...
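The two variables named above can be computed directly from segment timings. The sketch below is a minimal illustration with a hypothetical segment format, not the study's pipeline: the percentage of vowel duration for an utterance, and the mean Normalized Voice Onset Asynchrony (interval between the onsets of successive vowels divided by the first vowel's duration).

```python
# Minimal sketch (hypothetical segment format): compute the percentage of
# vowel duration (%V) and the mean Normalized Voice Onset Asynchrony
# (NVOA), i.e., the onset-to-onset interval between successive vowels
# divided by the first vowel's duration, for one utterance.

def percent_vowel_duration(segments):
    """segments: list of (label, start_s, end_s, is_vowel)."""
    total = sum(end - start for _, start, end, _ in segments)
    vowels = sum(end - start for _, start, end, v in segments if v)
    return 100.0 * vowels / total

def mean_nvoa(segments):
    vowels = [(start, end) for _, start, end, v in segments if v]
    ratios = [
        (v2_start - v1_start) / (v1_end - v1_start)
        for (v1_start, v1_end), (v2_start, _) in zip(vowels, vowels[1:])
    ]
    return sum(ratios) / len(ratios) if ratios else float("nan")

segs = [("k", 0.00, 0.05, False), ("a", 0.05, 0.15, True),
        ("t", 0.15, 0.21, False), ("a", 0.21, 0.30, True)]
print(percent_vowel_duration(segs), mean_nvoa(segs))  # 63.3..., 1.6
```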
conference of the international speech communication association | 2009
Helen M. Meng; Chiu-yu Tseng; Mariko Kondo; Alissa M. Harrison; Tanya Visceglia
conference of the international speech communication association | 1994
Mariko Kondo