Takahiro Kamai | Researchain

Publication

Featured research published by Takahiro Kamai.

Journal of the Acoustical Society of America | 2004

System and method for synthesizing multiplexed speech and text at a receiving terminal

Takahiro Kamai; Kenji Matsui; Zhu Weizhong

The reception terminal receives a code series from the communication path, and a separator splits it into a speech code series and text information. The synthesizer decodes the speech code series into a pitch period, an LSP coefficient, and code numerals, and reproduces the speech sound in the CELP system. The language analyzer converts the text information into pronunciation and accent information, to which the prosody generator adds prosody information such as phoneme time length and pitch pattern. The LSP coefficient and code numerals suitable for each phoneme are read from the segment database and, together with the pitch frequency from the prosody information, are input to the synthesizer and synthesized into speech sound.

Journal of the Acoustical Society of America | 2003

Fundamental frequency pattern generator, method and program

Yumiko Kato; Kenji Matsui; Takahiro Kamai; Noriyo Hara

In this fundamental frequency generating method, a fundamental frequency pattern is set from a database that holds a fundamental frequency pattern for each accent phrase, standardized by the phoneme time length or by the time length of the vowel and the vowel-corresponding portion. When the corresponding fundamental frequency pattern is not stored in the database, the pattern is generated by interpolating the intervals between points that serve as references for the fundamental frequency pattern. This method can generate a fundamental frequency pattern with higher naturalness than conventional methods.
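The lookup-then-interpolate idea in this abstract can be illustrated with a minimal sketch. It is not the patented method: the dictionary `database`, the accent-phrase keys, the `(time, f0)` reference points on a normalized 0..1 time axis, and the choice of linear interpolation are all assumptions made for illustration, since the abstract does not specify them.

```python
from bisect import bisect_right


def interpolate_f0(reference_points, times):
    """Linearly interpolate F0 values (Hz) between (time, f0) reference points.

    Times outside the reference range are clamped to the nearest end point.
    Linear interpolation is an assumption; the abstract only says "interpolating".
    """
    pts = sorted(reference_points)
    xs = [t for t, _ in pts]
    pattern = []
    for t in times:
        if t <= xs[0]:
            pattern.append(pts[0][1])
        elif t >= xs[-1]:
            pattern.append(pts[-1][1])
        else:
            i = bisect_right(xs, t)  # xs[i - 1] <= t < xs[i]
            (t0, f0), (t1, f1) = pts[i - 1], pts[i]
            pattern.append(f0 + (f1 - f0) * (t - t0) / (t1 - t0))
    return pattern


def f0_pattern(accent_key, database, reference_points, times):
    """Return a stored pattern if one exists, otherwise interpolate one.

    `database` is a hypothetical dict mapping an accent-phrase key to a
    time-normalized F0 pattern; a real system would also resample the
    stored pattern to `times`, which this sketch skips.
    """
    stored = database.get(accent_key)
    if stored is not None:
        return list(stored)
    return interpolate_f0(reference_points, times)


if __name__ == "__main__":
    db = {"HLL": [120.0, 150.0, 110.0]}  # hypothetical stored pattern
    print(f0_pattern("HLL", db, [], []))
    print(f0_pattern("LHH", db, [(0.0, 100.0), (1.0, 200.0)], [0.0, 0.5, 1.0]))
```

Keeping the fallback in a separate function makes it easy to swap linear interpolation for another scheme without touching the database lookup.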

Journal of the Acoustical Society of America | 2011

Speech synthesis method and speech synthesizer

Takahiro Kamai; Yumiko Kato

A language processing portion (31) analyzes a text from a dialogue processing section (20) and transforms the text to information on pronunciation and accent. A prosody generation portion (32) generates an intonation pattern according to a control signal from the dialogue processing section (20). A waveform DB (34) stores prerecorded waveform data together with pitch mark data imparted thereto. A waveform cutting portion (33) cuts desired pitch waveforms from the waveform DB (34). A phase operation portion (35) removes phase fluctuation by standardizing phase spectra of the pitch waveforms cut by the waveform cutting portion (33), and afterwards imparts phase fluctuation by diffusing only high phase components randomly according to the control signal from the dialogue processing section (20). The thus-produced pitch waveforms are placed at desired intervals and superimposed.
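The phase operation described above can be sketched roughly as follows, using only the Python standard library. Several points are assumptions rather than details from the abstract: "standardizing" is modeled as setting every phase to zero, "high phase components" is read as the bins at or above a hypothetical `cutoff_bin`, and the diffusion is a uniform random offset in `[-spread, spread]` radians. The naive DFT is for illustration only.

```python
import cmath
import math
import random


def dft(x):
    """Naive O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
            for m in range(n)]


def idft_real(spectrum):
    """Inverse DFT, returning the real part of each output sample."""
    n = len(spectrum)
    return [sum(spectrum[m] * cmath.exp(2j * math.pi * m * k / n)
                for m in range(n)).real / n
            for k in range(n)]


def phase_operation(pitch_waveform, cutoff_bin, spread, rng):
    """Zero all phases, then randomize phases of bins >= cutoff_bin.

    Magnitudes are left unchanged. Phases are mirrored with opposite sign
    so the spectrum stays conjugate-symmetric and the output stays real.
    """
    n = len(pitch_waveform)
    magnitudes = [abs(c) for c in dft(pitch_waveform)]
    phases = [0.0] * n  # assumed form of "standardized" phase
    for m in range(cutoff_bin, n // 2 + 1):
        if m == 0 or (n % 2 == 0 and m == n // 2):
            continue  # DC and Nyquist bins must stay real
        phi = rng.uniform(-spread, spread)
        phases[m] = phi
        phases[n - m] = -phi
    shaped = [mag * cmath.exp(1j * ph) for mag, ph in zip(magnitudes, phases)]
    return idft_real(shaped)


if __name__ == "__main__":
    wave = [0.0, 1.0, 0.5, -0.2, -0.7, 0.1, 0.3, -0.4]
    print(phase_operation(wave, cutoff_bin=2, spread=0.5, rng=random.Random(0)))
```

Passing `spread=0.0` yields the phase-standardized waveform with no fluctuation, which corresponds to the first half of the operation on its own.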

Journal of the Acoustical Society of America | 2009

Speech synthesis system for naturally reading incomplete sentences

Natsuki Saito; Takahiro Kamai

The aim is to provide a speech synthesis apparatus that prevents user confusion and deterioration of synthesized speech quality caused by incompleteness of the sentences to be read out, and that therefore reads out speech the user can easily understand. The speech synthesis apparatus includes: an incomplete part-of-sentence detection unit, which detects parts-of-sentences that are linguistically incomplete because of a missing character string and complements them with reference to the e-mail texts that have been received and accumulated in a mail box; a speech synthesis unit, which generates synthesized speech based on the complemented e-mail texts; an incomplete part-of-sentence obscuring unit, which obscures the acoustic clarity of the synthesized speech corresponding to the incomplete parts-of-sentences detected by the detection unit; and a speaker device, which plays back and outputs the generated synthesized speech.

Journal of the Acoustical Society of America | 2010