Masahiro Oshikiri | Researchain

Publication

Featured research published by Masahiro Oshikiri.

International Conference on Acoustics, Speech, and Signal Processing | 2004

Efficient spectrum coding for super-wideband speech and its application to 7/10/15 kHz bandwidth scalable coders

Masahiro Oshikiri; Hiroyuki Ehara; Koji Yoshida

The paper presents an efficient spectrum coding method for super-wideband (beyond 7 kHz, e.g. 10 kHz or 15 kHz bandwidth) speech signals based on a bandwidth expansion technique. Starting from a 7 kHz bandwidth speech signal, the frequency band above 7 kHz is generated by the expansion technique without violating the harmonic structure of the speech signal. The bandwidth expansion is performed by pitch filtering in the frequency domain. The 7 kHz bandwidth spectrum is used as the pitch filter state, and pitch filtering is performed toward the frequency band above 7 kHz. We adopted this pitch filtering based spectrum coding (PFSC) in our proposed 7/10/15 kHz bandwidth scalable coder. The scalable coder consists of an existing standard wideband coder as the base layer and two PFSC coders as enhancement layers. One PFSC coder encodes the 7-10 kHz band spectrum at 4.4 kbit/s and the other encodes the 10-15 kHz band spectrum at 2.2 kbit/s. When the AMR-WB coder at 15.85 kbit/s is used as the base layer, the total bitrate of the scalable coder is 22.45 kbit/s and the total algorithmic delay is 30 ms. We conducted degradation category rating (DCR) tests for both 10 kHz and 15 kHz bandwidth signals. The results show that the DMOS score of the proposed coder is better than that of the 7 kHz bandwidth original signals in both bandwidth clean speech conditions. In addition, when G.722 at 56 kbit/s is used as the base layer instead of the AMR-WB coder, the DMOS score of the scalable coder is close to that of the 7 kHz bandwidth original signals in both bandwidth audio conditions.
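The idea of using the low-band spectrum as a pitch filter state can be illustrated with a minimal sketch. It assumes a single-tap filter that generates each new bin from the bin `lag` positions below it; the function name, the `lag` and `gain` parameters, and the single-tap form are illustrative assumptions, not details taken from the paper.

```python
def extend_spectrum(low_band, lag, n_high, gain=1.0):
    """Generate n_high bins above low_band by frequency-domain pitch filtering.

    low_band: spectral coefficients of the already coded band (the filter state).
    lag:      filter lag in bins, 1 <= lag <= len(low_band) (hypothetical parameter).
    gain:     single filter coefficient (assumed; a real coder may use several taps).
    """
    if not 1 <= lag <= len(low_band):
        raise ValueError("lag must be between 1 and len(low_band)")
    spectrum = list(low_band)
    for _ in range(n_high):
        k = len(spectrum)                      # index of the bin being generated
        spectrum.append(gain * spectrum[k - lag])
    return spectrum[len(low_band):]            # only the newly generated bins


high = extend_spectrum([1.0, 2.0, 3.0, 4.0], lag=2, n_high=4)
print(high)
```

Because the filter reads from its own output once `lag` is smaller than the number of generated bins, the pattern in the low band repeats upward at a fixed spacing, which is one way the harmonic spacing could be carried into the extended band.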

Journal of the Acoustical Society of America | 1998

Speech encoding apparatus utilizing stored code data

Masami Akamine; Masahiro Oshikiri; Kimio Miseki

A learning-type speech encoding apparatus comprises an adaptive code book storing driving signal vectors, a minimum distortion searching circuit for searching the adaptive code book for an optimum driving signal vector on the basis of the input speech signal, a synthesizing filter for synthesizing a speech signal using the optimum driving signal vector retrieved, a buffer for storing the optimum driving signal vector retrieved, a training vector creating section for producing a training vector by segmenting the stored driving signal vector in units of a specified length, and a learning section for learning by constantly updating the driving signal vectors in the code book on the basis of the training vector.

Electronics and Communications in Japan (Part III: Fundamental Electronic Science) | 2000

A 2.4-kbps variable-bit-rate ADP-CELP speech coder

Masahiro Oshikiri; Masami Akamine

This paper proposes a variable-rate coder based on ADP-CELP (Adaptive Density Pulse Code Excited Linear Prediction), in which the excitation source is modeled as an adaptive density pulse train. For each frame, one of four modes with different bit rates is selected according to the nature of the short-term speech signal. The coding methods of the four modes are adapted to the nonspeech section, nonsound section, steady speech section, and nonsteady speech section. The bit rates are 0.53, 2.67, 3.17, and 4.1 kbps. To achieve high quality at a low bit rate, the system combines two techniques: a speech/nonspeech classification method that uses the variation of the spectral envelope to distinguish speech from nonspeech sections accurately even in background noise, and an interpolation vector quantization method that halves the amount of code needed for the pitch period without degrading quality. The average rate is less than 2.4 kbps at a speech activity rate of 60%, and the quality exceeds that of the half-rate standard system (3.45 kbps) for Japanese digital cellular telephones.
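The average rate of such a multi-mode coder is the usage-weighted mean of the per-mode rates. The sketch below uses the four rates quoted in the abstract; the mode usage fractions are purely hypothetical values chosen for illustration (40% of frames at the lowest rate, the rest split across the other three modes) and are not reported in the paper.

```python
# Per-mode bit rates in kbps, as listed in the abstract.
MODE_RATES_KBPS = [0.53, 2.67, 3.17, 4.1]


def average_rate(rates, usage):
    """Usage-weighted mean bit rate; usage fractions must sum to 1."""
    if len(rates) != len(usage):
        raise ValueError("rates and usage must have the same length")
    if abs(sum(usage) - 1.0) > 1e-9:
        raise ValueError("usage fractions must sum to 1")
    return sum(r * u for r, u in zip(rates, usage))


# Hypothetical fraction of frames coded in each mode (assumption, not measured data).
usage = [0.40, 0.15, 0.25, 0.20]
avg = average_rate(MODE_RATES_KBPS, usage)
print(f"{avg:.3f} kbps")
```

With these assumed fractions the mean comes out below 2.4 kbps, which shows how a coder whose highest mode runs at 4.1 kbps can still meet a much lower average.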

International Conference on Acoustics, Speech, and Signal Processing | 1998

A 2.4 kbps variable bit rate ADP-CELP speech coder

Masahiro Oshikiri; Masami Akamine

This paper presents a variable bit rate ADP-CELP (adaptive density pulse code excited linear prediction) coder that selects one of four coding structures in each frame based on short-time speech characteristics. To improve speech quality and reduce the average bit rate, we have developed a speech/non-speech classification method using spectrum envelope variation, which is robust to background noise. In addition, we propose an efficient pitch lag coding technique. The technique interpolates the pitch lags of consecutive frames and quantizes a vector of relative pitch lags, each being the difference between an estimated pitch lag and a target pitch lag, over multiple subframes. The average bit rate of the proposed coder was approximately 2.4 kbps for speech sources with an activity factor of 60%. Our subjective testing indicates that the quality of the proposed coder exceeds that of the Japanese digital cellular standard at 3.45 kbps.
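The pitch lag coding step can be sketched as follows. The sketch assumes linear interpolation between the previous and current frame lags to obtain the estimated lag for each subframe; the interpolation rule and both function names are assumptions, and the vector quantizer that would code the result is omitted.

```python
def interpolated_lags(prev_lag, curr_lag, n_subframes):
    """Estimated pitch lag per subframe, linearly interpolated (assumed rule)."""
    step = (curr_lag - prev_lag) / n_subframes
    return [prev_lag + step * (i + 1) for i in range(n_subframes)]


def relative_lags(target_lags, prev_lag, curr_lag):
    """Vector of differences between target and estimated lags, one per subframe."""
    estimates = interpolated_lags(prev_lag, curr_lag, len(target_lags))
    return [t - e for t, e in zip(target_lags, estimates)]


# Frame lags of 40 and 48 samples, four subframes with their own target lags.
residual = relative_lags([43, 44, 45, 48], prev_lag=40, curr_lag=48)
print(residual)
```

The residual vector is small when the pitch evolves smoothly, so quantizing it as a single vector should need fewer bits than coding each subframe lag on its own.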

Global Communications Conference | 1991

Improvement of ADP-CELP speech coding at 4 kbits/s

Masami Akamine; Kimio Miseki; Masahiro Oshikiri

The authors describe improvements in the ADP-CELP (adaptive-density-pulse code-excited-linear-prediction) coder at 4 kb/s. A fractional delay was introduced to the adaptive codebook. A fast exhaustive search procedure for a uniformly distributed fractional delay was studied for the case where the delay length is shorter than the codevector's dimension. A novel method to determine the long-term predictor's gain was also used. Four methods to find the density pattern of the ADP excitation were investigated. A modified adaptive postfilter was used to improve subjective speech quality.
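The case where the delay is shorter than the codevector dimension can be illustrated with a minimal sketch. It covers integer delays only and uses the common approach of periodically repeating the most recent excitation samples; this is an assumed, generic construction, not the fast search procedure or fractional-delay method the authors studied.

```python
def adaptive_codevector(past_excitation, delay, dim):
    """Build an adaptive-codebook vector of length dim for an integer delay.

    When delay < dim, samples beyond the first `delay` positions are taken
    from the vector under construction, so the last `delay` past samples
    repeat periodically (assumed handling of the short-delay case).
    """
    if not 1 <= delay <= len(past_excitation):
        raise ValueError("delay must be between 1 and len(past_excitation)")
    buf = list(past_excitation)
    start = len(buf)
    for n in range(dim):
        buf.append(buf[start + n - delay])
    return buf[start:]


vec = adaptive_codevector([1, 2, 3, 4, 5], delay=2, dim=5)
print(vec)
```

A fractional delay would additionally require interpolating between past samples, which this sketch leaves out.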

Archive | 1998