Publication


Featured research published by Tatsuya Kitamura.


Journal of the Acoustical Society of America | 2008

Acoustic analysis of the vocal tract during vowel production by finite-difference time-domain method

Hironori Takemoto; Parham Mokhtari; Tatsuya Kitamura

The vocal tract shape is three-dimensionally complex. For accurate acoustic analysis, a finite-difference time-domain method was introduced in the present study. By this method, transfer functions of the vocal tract for the five Japanese vowels were calculated from three-dimensionally reconstructed magnetic resonance imaging (MRI) data. The calculated transfer functions were compared with those obtained from acoustic measurements of vocal tract physical models precisely constructed from the same MRI data. Calculated transfer functions agreed well with measured ones up to 10 kHz. Acoustic effects of the piriform fossae, epiglottic valleculae, and inter-dental spaces were also examined. They caused spectral changes by generating dips. The amount of change was significant for the piriform fossae, while it was almost negligible for the other two. The piriform fossae and valleculae generated spectral dips for all the vowels. The dip frequencies of the piriform fossae were almost stable, while those of the valleculae varied among vowels. The inter-dental spaces generated very small spectral dips below 2.5 kHz for the high and middle vowels. In addition, transverse resonances within the oral cavity generated small spectral dips above 4 kHz for the low vowels.
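To make the finite-difference time-domain idea concrete, the following is a minimal one-dimensional sketch, not the three-dimensional simulation of the paper. It assumes a uniform tube closed at one end and open at the other, a staggered pressure/velocity grid with leapfrog updates, and hypothetical values for the tube length, grid size, sound speed, and air density. The function names are illustrative only.

```python
import math


def simulate_closed_open_tube(length_m=0.17, n_cells=50, courant=0.5,
                              c=350.0, rho=1.2, n_steps=400):
    """1D FDTD of a uniform tube; returns (dt, pressure history at closed end).

    All default values are hypothetical and chosen only for illustration.
    """
    dx = length_m / (n_cells - 0.5)   # last pressure node sits at x = L
    dt = courant * dx / c             # courant <= 1 keeps the 1D scheme stable
    k = math.pi / (2.0 * length_m)    # wavenumber of the quarter-wave mode
    # Pressure nodes at x = (i + 0.5) * dx, initialised to the lowest mode shape.
    p = [math.cos(k * (i + 0.5) * dx) for i in range(n_cells)]
    p[-1] = 0.0                       # open end: pressure-release boundary
    # Velocity nodes at x = j * dx; u[0] stays zero (rigid wall at the closed end).
    u = [0.0] * n_cells
    history = [p[0]]
    for _ in range(n_steps):
        for j in range(1, n_cells):
            u[j] -= dt / (rho * dx) * (p[j] - p[j - 1])
        for i in range(n_cells - 1):
            p[i] -= rho * c * c * dt / dx * (u[i + 1] - u[i])
        history.append(p[0])
    return dt, history


def first_zero_crossing_time(dt, history):
    """Time of the first downward zero crossing, with linear interpolation."""
    for n in range(1, len(history)):
        if history[n - 1] > 0.0 >= history[n]:
            frac = history[n - 1] / (history[n - 1] - history[n])
            return (n - 1 + frac) * dt
    raise ValueError("no zero crossing found; increase n_steps")


dt, history = simulate_closed_open_tube()
# The first zero crossing marks a quarter period of the lowest resonance.
estimated_hz = 1.0 / (4.0 * first_zero_crossing_time(dt, history))
print(f"estimated lowest resonance: {estimated_hz:.1f} Hz")
print(f"analytic c / (4 L):         {350.0 / (4.0 * 0.17):.1f} Hz")
```

The estimate should land close to the analytic quarter-wavelength value for the assumed tube. A realistic vocal tract model would replace the uniform tube with the measured three-dimensional geometry and add wall and radiation losses.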


Journal of the Acoustical Society of America | 2006

Acoustic roles of the laryngeal cavity in vocal tract resonance

Hironori Takemoto; Seiji Adachi; Tatsuya Kitamura; Parham Mokhtari; Kiyoshi Honda

The acoustic effects of the laryngeal cavity on the vocal tract resonance were investigated by using vocal tract area functions for the five Japanese vowels obtained from an adult male speaker. Transfer functions were examined with the laryngeal cavity eliminated from the whole vocal tract, volume velocity distribution patterns were calculated, and susceptance matching analysis was performed between the laryngeal cavity and the vocal tract excluding the laryngeal cavity (vocal tract proper). It was revealed that the laryngeal cavity generates one of the formants of the vocal tract, which is the fourth in the present study. At this formant, the resonance of the laryngeal cavity (the 1/4 wavelength resonance) induces the open-tube resonance of the vocal tract proper (the 3/2 wavelength resonance). At the other formants, on the other hand, the vocal tract proper acts as a closed tube, because the laryngeal cavity has only a small contribution to generating these formants and the effective closed end of the whole vocal tract is the junction between the laryngeal cavity and the vocal tract proper.
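The two resonance types named in this abstract can be illustrated with the textbook formulas for uniform tubes. The sketch below is an idealisation under stated assumptions: both cavities are treated as uniform tubes, the sound speed is taken as 350 m/s, and the two lengths are hypothetical values picked so that the resonances coincide. None of these numbers come from the paper.

```python
def closed_open_resonance_hz(length_m, n=1, c=350.0):
    """n-th resonance of a uniform tube closed at one end, open at the other.

    n = 1 is the quarter-wavelength resonance, f = c / (4 L).
    """
    return (2 * n - 1) * c / (4.0 * length_m)


def open_open_resonance_hz(length_m, n=1, c=350.0):
    """n-th resonance of a uniform tube open at both ends, f = n c / (2 L).

    n = 3 is the 3/2-wavelength resonance.
    """
    return n * c / (2.0 * length_m)


# Hypothetical lengths, chosen only so the two resonances line up.
LARYNGEAL_CAVITY_M = 0.025
VOCAL_TRACT_PROPER_M = 0.15

cavity_hz = closed_open_resonance_hz(LARYNGEAL_CAVITY_M)
tract_hz = open_open_resonance_hz(VOCAL_TRACT_PROPER_M, n=3)
print(f"laryngeal cavity, 1/4 wavelength:   {cavity_hz:.0f} Hz")
print(f"vocal tract proper, 3/2 wavelength: {tract_hz:.0f} Hz")
```

With these assumed lengths both values fall near 3.5 kHz, which shows how a short cavity resonance can coincide with a higher open-tube mode of a much longer tube.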


Computer Methods in Biomechanics and Biomedical Engineering | 2010

Visualisation of hypopharyngeal cavities and vocal-tract acoustic modelling

Kiyoshi Honda; Tatsuya Kitamura; Hironori Takemoto; Seiji Adachi; Parham Mokhtari; Sayoko Takano; Yukiko Nota; Hiroyuki Hirata; Ichiro Fujimoto; Yasuhiro Shimada; Shinobu Masaki; Satoru Fujita; Jianwu Dang

The hypopharyngeal cavities consist of the laryngeal cavity and the bilateral piriform fossae, constituting the bottom part of the vocal tract near the larynx. Visualisation of these cavities with magnetic resonance imaging (MRI) techniques reveals that during speech, the laryngeal cavity takes the form of a long-neck flask and the piriform fossa takes the form of a goblet of varying shapes: the former diminishes greatly in whispering and the latter disappears during deep inhalation. These cavities have been shown to exert significant acoustic effects on the higher-frequency part of the spectrum. In this study, acoustic experiments were conducted for male and female mechanical vocal tracts, with the result that the acoustic effects of those cavities determine the frequency spectra above 2 kHz, giving rise to peaks and zeros. An acoustic model of vowel production was proposed with three components: voice source, hypopharyngeal cavities and vocal tract proper, which provides an effective means of controlling voice quality and expressing individual vocal characteristics.


Journal of the Acoustical Society of America | 2006

Cyclicity of laryngeal cavity resonance due to vocal fold vibration

Tatsuya Kitamura; Hironori Takemoto; Seiji Adachi; Parham Mokhtari; Kiyoshi Honda

Acoustic effects of the time-varying glottal area due to vocal fold vibration on the laryngeal cavity resonance were investigated based on vocal tract area functions and acoustic analysis. The laryngeal cavity consists of the vestibular and ventricular parts of the larynx, and gives rise to a regional acoustic resonance within the vocal tract, with this resonance imparting an extra formant to the vocal tract resonance pattern. Vocal tract transfer functions of the five Japanese vowels uttered by three male subjects were calculated under open- and closed-glottis conditions. The results revealed that the resonance appears at the frequency region from 3.0 to 3.7 kHz when the glottis is closed and disappears when it is open. Real spectra estimated from open- and closed-glottis periods of vowel sounds also showed the on-off pattern of the resonance within a pitch period. Furthermore, a time-domain acoustic analysis of vowels indicated that the resonance component could be observed as a pitch-synchronized rise-and-fall pattern of the bandpass amplitude. The cyclic nature of the resonance can be explained as the laryngeal cavity acting as a closed tube that generates the resonance during a closed-glottis period, but damps the resonance off during an open-glottis period.


Speech Communication | 2008

Single-matrix formulation of a time domain acoustic model of the vocal tract with side branches

Parham Mokhtari; Hironori Takemoto; Tatsuya Kitamura

Although it has been found that the piriform fossae play an important role in speech production and acoustics, the popular time domain articulatory synthesizer of [Maeda, S., 1982. A digital simulation method of the vocal-tract system. Speech Comm. 1 (3-4), 199-229] currently cannot include any more than one side branch to the acoustic tube that represents the main vocal tract. To overcome this limitation, in this paper we extended Maeda's (1982) simulation method, by mathematical reformulation in terms of a single-matrix equation having a system matrix that is both sparse and symmetric. Using vocal tract area functions measured by MRI, the simulation results showed that the piriform fossae suppress the energy in the higher frequencies by introducing spectral zeros around 4-5 kHz, and also tend to lower the second formant of vowels. These spectral changes agree with results produced using a well-tested frequency domain transmission-line method, thus validating our new formulation of the time domain synthesizer. The reformulation can be easily extended to accommodate any number of vocal tract side branches, thus enabling more realistic, physiologically correct acoustic simulation of speech production.


Journal of the Acoustical Society of America | 2013

Acoustic interaction between the right and left piriform fossae in generating spectral dips.

Hironori Takemoto; Seiji Adachi; Parham Mokhtari; Tatsuya Kitamura

It is known that the right and left piriform fossae generate two deep dips on speech spectra and that acoustic interaction exists in generating the dips: if only one piriform fossa is modified, both the dips change in frequency and amplitude. In the present study, using a simple geometrical model and measured vocal tract shapes, the acoustic interaction was examined by the finite-difference time-domain method. As a result, one of the two dips was lower in frequency than the two independent dips that appeared when either of the piriform fossae was occluded, and the other dip was higher in frequency than the two dips. At the lower dip frequency, the piriform fossae resonated almost in opposite phase, while at the higher dip frequency, they resonated almost in phase. These facts indicate that the piriform fossae and the lower part of the pharynx can be modeled as a coupled two-oscillator system whose two normal vibration modes generate the two spectral dips. When the piriform fossae were identical, only the higher dip appeared. This is because the lower mode is not acoustically coupled to the main vocal tract enough to generate an absorption dip.
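The coupled two-oscillator picture in this abstract can be illustrated with a toy model. The sketch below is not the paper's finite-difference simulation: it assumes two oscillators whose uncoupled frequencies stand in for the dips of each fossa alone, joined by a symmetric positive coupling term. The frequencies, the coupling strength, and the function name are all hypothetical.

```python
import math


def coupled_modes(f1_hz, f2_hz, coupling):
    """Normal modes of two oscillators with a symmetric coupling term.

    Solves the 2x2 eigenproblem [[a, k], [k, b]] v = lam * v with a = f1^2,
    b = f2^2 and k = coupling * f1 * f2, where 0 < coupling < 1 is a
    hypothetical dimensionless strength.
    Returns [(frequency_hz, (x, y)), ...] ordered from low to high.
    """
    if not 0.0 < coupling < 1.0:
        raise ValueError("coupling must lie strictly between 0 and 1")
    a, b = f1_hz ** 2, f2_hz ** 2
    k = coupling * f1_hz * f2_hz
    mean = 0.5 * (a + b)
    half_gap = math.hypot(0.5 * (a - b), k)
    modes = []
    for lam in (mean - half_gap, mean + half_gap):
        # First row of the eigenproblem: (a - lam) * x + k * y = 0,
        # so x = k, y = lam - a is an eigenvector.
        x, y = k, lam - a
        norm = math.hypot(x, y)
        modes.append((math.sqrt(lam), (x / norm, y / norm)))
    return modes


# Hypothetical uncoupled dip frequencies for the two fossae.
for freq_hz, (x, y) in coupled_modes(4000.0, 5000.0, 0.1):
    phase = "in phase" if x * y > 0 else "opposite phase"
    print(f"{freq_hz:7.1f} Hz  amplitudes ({x:+.3f}, {y:+.3f})  {phase}")
```

In this toy model the lower mode falls below both uncoupled frequencies and has amplitudes of opposite sign, while the upper mode lies above both and has amplitudes of the same sign. When the two oscillators are identical, the lower mode's amplitudes cancel exactly in sum, which loosely mirrors the abstract's observation that only the higher dip appears for identical fossae.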


Journal of the Acoustical Society of America | 2008

Integrated magnetic resonance imaging methods for speech science and technology

Shinobu Masaki; Yukiko Nota; Sayoko Takano; Hironori Takemoto; Tatsuya Kitamura; Kiyoshi Honda

This presentation introduces our integration of magnetic resonance imaging (MRI) techniques at ATR Brain Activity Imaging Center (Kyoto, Japan) toward research into speech science and technology. The first breakthrough in our application of MRI to speech research was the motion imaging of the speech organs in articulation using a cardiac cine‐MRI method. It enables us to acquire information in the time‐space domain to reconstruct successive image frames using utterance repetitions synchronized with MRI scans. This cine‐technique was further improved for high‐quality imaging and expanded into three‐dimensional (3D) visualization of articulatory movements. Using this technique, we could successfully obtain temporal changes of vocal‐tract area function during a Japanese five‐vowel sequence. This effort also contributed to developing other techniques to overcome the limitations of MRI, such as the post‐hoc inclusion of teeth images in 3D volumes or the phonation‐synchronized scan for crystal‐sharp static imag...


Journal of the Acoustical Society of America | 2007

Vocal tract length perturbation and its application to male-female vocal tract shape conversion.

Seiji Adachi; Hironori Takemoto; Tatsuya Kitamura; Parham Mokhtari; Kiyoshi Honda

An alternative and complete derivation of the vocal tract length sensitivity function, which is an equation for finding a change in formant frequency due to perturbation of the vocal tract length [Fant, Quarterly Progress and Status Rep. No. 4, Speech Transmission Laboratory, Kungliga Tekniska Högskolan, Stockholm, 1975, pp. 1-14] is presented. It is based on the adiabatic invariance of the vocal tract as an acoustic resonator and on the radiation pressure on the wall and at the exit of the vocal tract. An algorithm for tuning the vocal tract shape to match the formant frequencies to target values, such as those of a recorded speech signal, which was proposed in Story [J. Acoust. Soc. Am. 119, 715-718 (2006)], is extended so that the vocal tract length can also be changed. Numerical simulation of this extended algorithm shows that it can successfully convert between the vocal tract shapes of a male and a female for each of five Japanese vowels.


Human-Robot Interaction | 2016

Human-Robots Implicit Communication based on Dialogue between Robots using Automatic Generation of Funny Scenarios from Web

Ryo Mashimo; Tomohiro Umetani; Tatsuya Kitamura; Akiyo Nadamoto

Numerous studies have examined communication robots that communicate with people, but it is difficult for robots to communicate with people smoothly. We call the communication style based on dialogue between robots “human-robot implicit communication”. As described herein, we propose Manzai-robots, whose interaction style is human-robot implicit communication based on a scenario generated automatically from web news. Our generated Manzai scenario consists of snappy patter and misunderstanding dialogue based on four kinds of gap in the structure of funny points. Our aim is for people to feel familiarity through smooth human-robot communication using dialogue between robots based on a Manzai scenario. We conducted three kinds of experiments to assess (1) the effectiveness of automatic creation of Manzai scenarios for the robots, (2) the effectiveness of the Manzai-robots as a medium, and (3) the effectiveness of types of familiarity for Manzai-robots. Based on the results, we measured the familiarity and smooth communication of our Manzai-robots.


Information Integration and Web-based Applications & Services | 2015

Automatic generation of Japanese traditional funny scenario from web content based on web intelligence

Ryo Mashimo; Tomohiro Umetani; Tatsuya Kitamura; Akiyo Nadamoto

Today there is much information and knowledge on the internet, and many studies have examined the extraction of many kinds of knowledge from the internet. In addition, numerous studies have examined entertainment robots that communicate with people, but it is difficult for robots to communicate smoothly with people. We specifically examine communication between robots based on dialogue. Here, we aim to create dialogue-based scenarios for the robots automatically, which is difficult because the dialogue requires many kinds of knowledge. We consider the use of knowledge from the web and create scenarios automatically. As described herein, we propose a system that generates dialogue scenarios automatically from web news articles in real time. We used the Manzai metaphor, a traditional Japanese style of humorous comedy, in our system. Our generated Manzai scenario consists of snappy patter and misunderstanding dialogue based on gaps in our structure of funny points. We create communication robots to amuse people with our generated humorous robot dialogue scenarios.

Collaboration


Top Co-Authors

Hironori Takemoto
National Institute of Information and Communications Technology

Masato Akagi
Japan Advanced Institute of Science and Technology

Ichiro Fujimoto
Tokyo Medical and Dental University