Publication


Featured research published by Yoshiaki Asakawa.


Journal of the Acoustical Society of America | 1993

Character voice communication system

Akira Ichikawa; Yoshiaki Asakawa; Shoichi Takeda; Nobuo Hataoka

A character voice communication system in which a high-efficiency voice coding system, for encoding and transmitting speech information efficiently, and a voice character input/output system, for converting speech information into character information or receiving character information and transmitting speech or character information, are organically integrated. A speech analyzer and a speech synthesizer are shared by the voice coding system and the voice character input/output system. Communication apparatus is also provided which allows mutual conversion between speech signals and character codes.
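The central structural idea, that a single speech analyzer and a single speech synthesizer serve both the voice-coding path and the character input/output path, can be illustrated with a minimal sketch. All class and method names below are hypothetical placeholders, not taken from the patent.

```python
# Minimal structural sketch (hypothetical names, not from the patent): one
# analyzer and one synthesizer are shared by the voice-coding path and the
# character input/output path.

class SharedAnalyzer:
    def analyze(self, speech):
        # Placeholder analysis; a real system would extract spectral-envelope
        # and excitation parameters here.
        return {"params": speech}


class SharedSynthesizer:
    def synthesize(self, analysis):
        # Placeholder synthesis; regenerates "speech" from the parameters.
        return analysis["params"]


class CharacterVoiceTerminal:
    def __init__(self):
        self.analyzer = SharedAnalyzer()        # shared by both paths
        self.synthesizer = SharedSynthesizer()  # shared by both paths

    # Voice-coding path: speech -> parameters -> speech.
    def encode_speech(self, speech):
        return self.analyzer.analyze(speech)

    def decode_speech(self, code):
        return self.synthesizer.synthesize(code)

    # Character input/output path reuses the same analyzer and synthesizer.
    def speech_to_characters(self, speech):
        analysis = self.analyzer.analyze(speech)
        return str(analysis["params"])          # stand-in for recognition

    def characters_to_speech(self, text):
        return self.synthesizer.synthesize({"params": text})


if __name__ == "__main__":
    terminal = CharacterVoiceTerminal()
    code = terminal.encode_speech([0.1, 0.2, 0.0])
    print(terminal.decode_speech(code))
    print(terminal.speech_to_characters([0.1, 0.2, 0.0]))
```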


Journal of the Acoustical Society of America | 1995

Vector quantizing apparatus and speech analysis-synthesis system using the apparatus

Yoshiaki Asakawa; Katsuya Yamasaki; Akira Ichikawa

A vector quantizing apparatus is provided that has a general vector quantization circuit and storage means for storing at least one frame of data resulting from comparison by a matching circuit. Also provided is a speech analysis-synthesis system having a spectral envelope generator for generating a spectral envelope smooth enough to avoid excessive beating, a spectral envelope vector converter for sampling the spectral envelope at equal intervals on the mel scale, a vector quantizer for quantizing the resulting vectors, and a spectral envelope reconstructor for reconstructing the spectral envelope by interpolation based on combined parabolas.
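As a rough illustration of the mel-scale sampling and vector-quantization steps described above, the following sketch resamples a spectral envelope at equal mel intervals, picks the nearest codeword, and rebuilds the envelope. It uses plain linear interpolation rather than the patent's combined-parabola reconstruction, and the codebook and frame sizes are toy assumptions.

```python
import numpy as np


def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def sample_envelope_on_mel(envelope, sample_rate, n_points=16):
    """Resample a linear-frequency spectral envelope at equal mel intervals."""
    freqs = np.linspace(0.0, sample_rate / 2.0, len(envelope))
    mel_grid = np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_points)
    return np.interp(mel_to_hz(mel_grid), freqs, envelope)


def vector_quantize(vector, codebook):
    """Return the index of the nearest codeword (Euclidean distance)."""
    return int(np.argmin(np.sum((codebook - vector) ** 2, axis=1)))


def reconstruct_envelope(codeword, sample_rate, n_bins):
    """Rebuild a linear-frequency envelope from the mel-sampled codeword
    (linear interpolation here instead of the combined-parabola scheme)."""
    mel_grid = np.linspace(0.0, hz_to_mel(sample_rate / 2.0), len(codeword))
    freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    return np.interp(hz_to_mel(freqs), mel_grid, codeword)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    envelope = np.abs(rng.standard_normal(129)).cumsum()   # toy smooth envelope
    codebook = rng.standard_normal((256, 16))              # toy codebook
    vec = sample_envelope_on_mel(envelope, sample_rate=8000)
    idx = vector_quantize(vec, codebook)
    approx = reconstruct_envelope(codebook[idx], 8000, len(envelope))
    print(idx, approx.shape)
```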


Journal of the Acoustical Society of America | 2000

Method for coding/decoding, coding/decoding device, and videoconferencing apparatus using such device

Makoto Takashima; Yoshiaki Asakawa

The object of the invention is to provide a coding/decoding method in which degradation of sound quality perceptible by the listener does not occur at a low bit rate. A shift number calculation section of the coding device divides the frequency domain into at least two sub-bands and approximates each normalized transform coefficient in any sub-band whose allocated bit value is less than a predetermined threshold using the quantized value of the transform coefficient in a predetermined sub-band other than that sub-band, so as to obtain information concerning the approximation; a multiplexer multiplexes this information with the other signals and transmits them. A de-multiplexer of the decoding device separates the code of the information concerning the approximation, and a shift number restore section restores the information from it. An approximation coefficient calculation section then assigns, based on the information concerning the approximation, the transform coefficient values in the predetermined sub-band to the normalized transform coefficients whose allocated bit value is less than the predetermined threshold.
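A toy sketch of the sub-band approximation idea follows: bands whose bit allocation falls below a threshold send no coefficients of their own, only the index of a better-coded band whose quantized coefficients are reused at the decoder. The band layout, quantizer, and threshold below are illustrative assumptions, not the patent's.

```python
import numpy as np

# Hypothetical band layout, quantizer, and threshold used only for this sketch.
N_BANDS, BAND_SIZE, BIT_THRESHOLD = 4, 8, 2


def quantize(coeffs, bits):
    """Simple uniform scalar quantizer on [-1, 1]; step size follows the bit count."""
    step = 2.0 / (2 ** bits)
    return np.clip(np.round(coeffs / step) * step, -1.0, 1.0)


def encode(norm_coeffs, bit_alloc):
    """Code well-allocated bands directly; starved bands send only the index
    of the band whose quantized coefficients approximate them best."""
    bands = norm_coeffs.reshape(N_BANDS, BAND_SIZE)
    coded, shift_info = [], []
    for b in range(N_BANDS):
        if bit_alloc[b] >= BIT_THRESHOLD:
            coded.append(quantize(bands[b], bit_alloc[b]))
            shift_info.append(None)              # band is coded directly
        else:
            candidates = [c for c in range(N_BANDS) if bit_alloc[c] >= BIT_THRESHOLD]
            errs = [np.sum((bands[b] - quantize(bands[c], bit_alloc[c])) ** 2)
                    for c in candidates]
            coded.append(None)                   # no coefficients are sent
            shift_info.append(candidates[int(np.argmin(errs))])
    return coded, shift_info


def decode(coded, shift_info):
    """Reassign the coefficients of the referenced band to each starved band."""
    bands = [coded[b] if shift_info[b] is None else coded[shift_info[b]]
             for b in range(N_BANDS)]
    return np.concatenate(bands)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    coeffs = np.clip(0.3 * rng.standard_normal(N_BANDS * BAND_SIZE), -1, 1)
    bit_alloc = [4, 1, 3, 0]                     # bands 1 and 3 are starved
    coded, shifts = encode(coeffs, bit_alloc)
    print(shifts, decode(coded, shifts).shape)
```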


Journal of the Acoustical Society of America | 1994

Speech coding system using excitation pulse train

Yoshiaki Asakawa; Akira Ichikawa; Kazuhiro Kondo; Toshiro Suzuki

A speech signal is analyzed for each frame so that it is separated into spectral envelope information and excitation information, and the excitation information is expressed by a plurality of pulses. A judgement is made as to whether the current frame is a voiced frame immediately after a transition from an unvoiced frame, a voiced frame continuative from a voiced frame, or an unvoiced frame, and excitation pulses are generated in accordance with the result. In the case of a continuative voiced frame, the excitation pulse position of the current frame is determined from the pitch period relative to the excitation pulse position of the immediately preceding voiced frame, so that the excitation pulse train is generated at a position approximating the determined position.
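The pitch-anchored pulse placement for a continuative voiced frame can be sketched as follows; the frame length, search window, and residual signal are toy placeholders, and only the position-prediction idea is shown.

```python
import numpy as np

# Toy frame length; the real coder's frame and search parameters are not given here.
FRAME_LEN = 160


def place_pulses_continuative(prev_last_pulse, pitch_period, residual, search=3):
    """For a voiced frame that continues a voiced frame: predict each pulse
    position by stepping one pitch period from the previous frame's last
    pulse, then refine it within a small window around the prediction."""
    positions = []
    pos = prev_last_pulse + pitch_period - FRAME_LEN   # carry across the frame boundary
    while pos < FRAME_LEN:
        if pos >= 0:
            lo, hi = max(0, pos - search), min(FRAME_LEN, pos + search + 1)
            best = lo + int(np.argmax(np.abs(residual[lo:hi])))  # local refinement
            positions.append(best)
            pos = best + pitch_period
        else:
            pos += pitch_period
    return positions


if __name__ == "__main__":
    rng = np.random.default_rng(2)
    residual = rng.standard_normal(FRAME_LEN)
    # Previous voiced frame ended with a pulse at sample 150; pitch period 40.
    print(place_pulses_continuative(prev_last_pulse=150, pitch_period=40,
                                    residual=residual))
```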


Journal of the Acoustical Society of America | 1993

Speech coding and decoding system with background sound reproducing function

Yoshiaki Asakawa; Toshiyuki Aritsuka

In speech decoding, a transmission code, which includes an error correcting code added to a speech code, is received, and whether or not there is a code error is detected on the basis of the error correcting code. When there is no code error, or when the detected code error has been corrected, normal speech decoding is executed. When there is a code error that cannot be corrected, artificial background sound corresponding to the decoded speech is generated from characteristic parameters indicating unvoiced sound in the decoded speech. These parameters are continuously extracted from the decoded speech, stored in a memory, and used to replace the erroneous portion of the speech code.
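A minimal sketch of this concealment behaviour is given below, assuming a toy decoder and a trivial voicing test: cleanly decoded frames update a stored background level, and an uncorrectable error triggers artificial background sound generated from that stored level.

```python
import numpy as np


class BackgroundConcealingDecoder:
    """Toy decoder: clean frames update a stored background level; a frame
    with an uncorrectable error is replaced by artificial background sound
    generated from that stored level. All signal processing is a placeholder."""

    def __init__(self, frame_len=160):
        self.frame_len = frame_len
        self.bg_gain = 0.01                      # stored background level
        self.rng = np.random.default_rng(0)

    def _decode_frame(self, code):
        # Placeholder for the normal speech decoding processing.
        return np.asarray(code, dtype=float)

    def _is_unvoiced(self, frame):
        # Placeholder voicing test: treat low-energy frames as background.
        return float(np.mean(frame ** 2)) < 1e-3

    def receive(self, code, error_uncorrectable):
        if not error_uncorrectable:
            frame = self._decode_frame(code)
            if self._is_unvoiced(frame):
                # Continuously update the stored background characteristics.
                self.bg_gain = float(np.sqrt(np.mean(frame ** 2) + 1e-12))
            return frame
        # Uncorrectable code error: substitute artificial background sound.
        return self.bg_gain * self.rng.standard_normal(self.frame_len)


if __name__ == "__main__":
    dec = BackgroundConcealingDecoder()
    quiet = 0.001 * np.ones(160)
    dec.receive(quiet, error_uncorrectable=False)     # learns the background level
    print(dec.receive(quiet, error_uncorrectable=True).shape)
```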


Journal of the Acoustical Society of America | 1992

High efficiency voice coding system

Akira Ichikawa; Yoshiaki Asakawa; Akio Komatsu; Eiji Oohira

A voice coding system separates and codes voice information into spectrum envelope information and voice source information, compressing the amount of information for efficient coding of vocal audio signals by controlling the voice source information on the basis of the fact that the spectrum envelope information and the voice source information are highly correlated.
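One simple way to exploit such a correlation, shown purely for illustration, is to predict a voice-source parameter from the spectral-envelope parameters and code only the small prediction residual; the linear predictor and synthetic data below are stand-ins, not the patent's control scheme.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic data only: envelope features X and a source parameter y that is
# strongly correlated with them (stand-in for the correlation exploited above).
X = rng.standard_normal((500, 4))
y = X @ np.array([0.8, -0.3, 0.5, 0.1]) + 0.05 * rng.standard_normal(500)

# Fit a least-squares predictor of the source parameter from the envelope.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Only the prediction residual would need to be coded; its variance is far
# smaller than that of the source parameter itself, which is where the bits
# are saved in this illustration.
residual = y - X @ w
print(f"source variance {y.var():.3f} -> residual variance {residual.var():.3f}")
```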


International Conference on Acoustics, Speech, and Signal Processing | 1985

A speech coding method using thinned-out residual

Akira Ichikawa; S. Takeda; Yoshiaki Asakawa

A new high-quality speech information compression method is developed. This method eliminates unnecessary samples of the prediction-residual pulses to obtain a thinned-out residual. First, a thinning-out procedure that minimizes the quality degradation is formulated. Next, a procedure that simplifies this thinning-out under several hypotheses is defined. Subjective evaluation using preference tests confirms that almost no quality degradation occurs. Pitch information is also utilized: adding repetitive use of the thinned-out residual to the procedure, preference tests were carried out at a bit rate of 9.6 kb/s for comparison with the newest MPE (multi-pulse excitation) coder, which includes a pitch prediction process. The results show that the proposed method produces slightly higher quality speech than the MPE method, and its number of processing steps is less than one-third that of MPE.
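The thinning-out step can be sketched as keeping only a small fraction of the residual samples and zeroing the rest. The paper formulates an optimal selection rule; the magnitude-based selection below is a simplified stand-in.

```python
import numpy as np


def thin_out_residual(residual, keep_ratio=0.25):
    """Keep only the `keep_ratio` largest-magnitude residual samples and set
    all other samples to zero (simplified selection rule)."""
    residual = np.asarray(residual, dtype=float)
    n_keep = max(1, int(len(residual) * keep_ratio))
    keep_idx = np.argsort(np.abs(residual))[-n_keep:]
    thinned = np.zeros_like(residual)
    thinned[keep_idx] = residual[keep_idx]
    return thinned


if __name__ == "__main__":
    rng = np.random.default_rng(4)
    res = rng.standard_normal(160)               # toy prediction residual
    thinned = thin_out_residual(res)
    print(f"kept {int(np.count_nonzero(thinned))} of {len(res)} residual samples")
```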


International Conference on Acoustics, Speech, and Signal Processing | 1984

Speaker-independent connected digit recognition

Nobuo Hataoka; Yoshiaki Asakawa; Akio Komatsu; Akira Ichikawa

An algorithm for speaker-independent connected digit recognition over the telephone, together with its experimental results, is described. The main features of the algorithm are the use of multiple reference templates assigned to each speaker class, a continuous DP matching process for word spotting, and partial reference templates to confirm spotted digits. The K-nearest-neighbor decision rule and pair-comparison judgement are used to obtain the final result from the spotted digit sequences. Experimental results show an average correct recognition score of 94% per Japanese digit in connected utterances over actual telephone lines.
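A compact sketch of continuous DP matching (free starting point per template) and a K-nearest-neighbor vote over multiple reference templates is given below; the features, templates, and labels are synthetic placeholders rather than the paper's telephone-speech data.

```python
import numpy as np


def continuous_dp(template, utterance):
    """Continuous DP matching: for every utterance frame, return the best
    normalized distance of a warp path that ends there and may start anywhere."""
    M, N = len(template), len(utterance)
    local = np.linalg.norm(template[:, None, :] - utterance[None, :, :], axis=2)
    D = np.full((M + 1, N + 1), np.inf)
    D[0, :] = 0.0                                    # free starting point
    for m in range(1, M + 1):
        for n in range(1, N + 1):
            D[m, n] = local[m - 1, n - 1] + min(D[m - 1, n - 1],
                                                D[m - 1, n], D[m, n - 1])
    return D[M, 1:] / M                              # score per candidate end frame


def knn_decision(scores, labels, k=3):
    """K-nearest-neighbor vote over the per-template spotting scores."""
    votes = [labels[i] for i in np.argsort(scores)[:k]]
    return max(set(votes), key=votes.count)


if __name__ == "__main__":
    rng = np.random.default_rng(5)
    utterance = rng.standard_normal((80, 12))        # toy feature frames
    templates = [rng.standard_normal((20, 12)) for _ in range(6)]
    labels = ["3", "3", "7", "7", "0", "0"]          # multiple templates per digit
    scores = np.array([continuous_dp(t, utterance).min() for t in templates])
    print(knn_decision(scores, labels))
```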


Electronics and Communications in Japan, Part III: Fundamental Electronic Science | 2000

Investigations of independence of distortion scales in objective evaluation of synthesized speech quality

Haruaki Yabuoka; Takeshi Nakayama; Yukio Kitabayashi; Yoshiaki Asakawa

Using factor analysis, we have investigated the independence of five distortion scales widely used in the objective evaluation of synthesized speech quality: differential spectrum distortion, phase distortion, waveform distortion, cepstrum distance, and amplitude distortion. We have found that these distortion scales can be constructed from two factors. The first factor is constructed from cepstrum distance, amplitude distortion, and waveform distortion, and the second factor from phase distortion and differential spectrum distortion. When a multiple linear regression model is used to predict MOS, the prediction accuracy is highest when the cepstrum distance is used as the distortion scale for the first factor and the differential spectrum distortion for the second. Investigating the correspondence between psychological quality factors and the physical distortion factors, we found that the first physical factor is related to “clarity” and the second to “sensation.” Moreover, the importance of differential spectrum distortion for future quality prediction of low-bit-rate synthesized speech is demonstrated.
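The two-scale regression model can be sketched as an ordinary least-squares fit of MOS on cepstrum distance and differential spectrum distortion; the data below are synthetic placeholders, not the paper's listening-test results.

```python
import numpy as np

rng = np.random.default_rng(6)

# Synthetic placeholders: two distortion scales and a MOS that degrades with both.
n = 200
cepstrum_dist = rng.uniform(0.5, 4.0, n)        # scale chosen for factor 1
diff_spec_dist = rng.uniform(0.1, 2.0, n)       # scale chosen for factor 2
mos = 4.5 - 0.6 * cepstrum_dist - 0.4 * diff_spec_dist + 0.2 * rng.standard_normal(n)

# Multiple linear regression with an intercept term.
X = np.column_stack([np.ones(n), cepstrum_dist, diff_spec_dist])
coef, *_ = np.linalg.lstsq(X, mos, rcond=None)

predicted = X @ coef
corr = np.corrcoef(mos, predicted)[0, 1]
print(f"coefficients: {coef.round(3)}, correlation with MOS: {corr:.3f}")
```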


International Conference on Acoustics, Speech, and Signal Processing | 1989

Speech coding method using fuzzy vector quantization

Yoshiaki Asakawa; Akira Ichikawa; Shunichi Yajima; Katsuya Yamasaki

A medium-band speech coding method that is promising for producing high-quality speech is proposed. One of its main features is the application of a PSE (power-spectrum envelope) analysis method. Another is the application of fuzzy vector quantization (FVQ), which can reduce quantization distortion below that of conventional VQ. Reduced FVQ (R-FVQ), which uses a k-nearest-neighbor technique, a code-word selection technique, and a predetermined table, is proposed to reduce both the bit rate and the quantization distortion. Using R-FVQ, speech was coded at 8 kb/s and the synthetic speech was of high quality and sounded natural.
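The fuzzy vector quantization step, with the k-nearest-neighbor reduction of R-FVQ, can be sketched as follows: the input vector is represented by membership-weighted contributions of its k nearest codewords, using the usual fuzzy-c-means membership form; the codebook and input below are toy values.

```python
import numpy as np


def reduced_fvq(x, codebook, k=4, fuzziness=1.5):
    """Represent x by its k nearest codewords weighted with fuzzy-c-means
    style memberships; returns (indices, memberships, reconstruction)."""
    d2 = np.sum((codebook - x) ** 2, axis=1) + 1e-12
    nearest = np.argsort(d2)[:k]                     # reduced candidate set
    ratios = d2[nearest][:, None] / d2[nearest][None, :]
    memberships = 1.0 / np.sum(ratios ** (1.0 / (fuzziness - 1.0)), axis=1)
    reconstruction = memberships @ codebook[nearest]
    return nearest, memberships, reconstruction


if __name__ == "__main__":
    rng = np.random.default_rng(7)
    codebook = rng.standard_normal((64, 16))         # toy PSE-vector codebook
    x = rng.standard_normal(16)
    idx, u, x_hat = reduced_fvq(x, codebook)
    print(idx, u.round(3), float(np.linalg.norm(x - x_hat)))
```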
