Shengbei Wang
Japan Advanced Institute of Science and Technology
Publication
Featured research published by Shengbei Wang.
intelligent information hiding and multimedia signal processing | 2013
Shengbei Wang; Masashi Unoki
We propose a speech watermarking method based on modifying the line spectral frequencies (LSFs) of the original speech. LSFs were derived from each frame by linear prediction (LP) analysis, and watermarks were embedded into them using quantization index modulation (QIM) with different quantization steps. We took into consideration the inaudibility and robustness that were influenced by minor modifications to LSFs. The proposed approach was evaluated in two kinds of experiments, one on inaudibility and one on robustness against different speech codecs and general processing. The evaluation results revealed that the proposed approach not only achieved a high bit detection rate while keeping the original sound quality undistorted but also showed good robustness against general speech processing.
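The QIM step named in the abstract above can be illustrated with a minimal sketch. It assumes a single fixed quantization step and one bit per LSF value, whereas the paper uses different steps; the function names, the LSF values, and the step size are hypothetical and chosen only for illustration.

```python
def qim_embed(value, bit, step):
    """Quantize `value` onto the lattice for `bit`.

    Bit 0 uses multiples of `step`; bit 1 uses the same lattice
    shifted by step / 2 (a common QIM construction, assumed here).
    """
    offset = bit * step / 2.0
    return round((value - offset) / step) * step + offset


def qim_detect(value, step):
    """Return the bit whose lattice has a point closest to `value`."""
    dist0 = abs(value - qim_embed(value, 0, step))
    dist1 = abs(value - qim_embed(value, 1, step))
    return 0 if dist0 <= dist1 else 1


# Hypothetical LSFs (radians) of one frame and a hypothetical payload.
lsfs = [0.31, 0.62, 1.05, 1.47, 1.98, 2.40]
bits = [1, 0, 1, 1, 0, 0]
step = 0.02

marked = [qim_embed(x, b, step) for x, b in zip(lsfs, bits)]
detected = [qim_detect(x, step) for x in marked]
```

Each embedded value moves by at most half a step, so a smaller step favors inaudibility, while a larger step tolerates larger perturbations before a bit flips. A real system would also have to keep the modified LSFs in ascending order so that the LP filter stays stable.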
international conference on acoustics, speech, and signal processing | 2016
Jianwu Dang; Shengbei Wang; Masashi Unoki
Many studies have investigated the relationship between the articulatory and auditory features for isolated speech sounds and vowels. To fully understand the mechanisms of speech production and perception, it is necessary to investigate consonants in the same way. For this reason, in this study, we investigate the manifolds of vowels and consonants from Japanese read speech using Laplacian eigenmaps. We constructed uniform articulatory and auditory spaces based on the vowels and consonants to investigate their manifolds. We found that the distribution of consonants in articulatory space could be classified into labial and lingual groups, which reflected their articulatory properties, while in auditory space their distribution was clustered according to voiced and unvoiced, plosive and fricative properties. In vowel-consonant acoustic space, the consonants were distributed in a hoe-like shape, with voiced consonants located on the blade of the hoe and fused with vowels. We defined average correlation coefficients to measure the similarity of the manifolds among three speakers. The results indicated that the vowel/consonant structures had high consistency among the three speakers.
intelligent information hiding and multimedia signal processing | 2015
Erick Christian Garcia Alvarez; Shengbei Wang; Masashi Unoki
This paper proposes the unification of the code-excited linear prediction (CELP) codec process with watermarking based on formant tuning. The serial problem in watermarking and then encoding with the CELP codec was thereby reduced by using the proposed method, which also increased the bit detection rate. We took advantage of two key properties: I) humans do not perceive alterations applied to formants and II) CELP and watermarking based on formant tuning both utilize linear prediction coefficients. We investigated the inaudibility and robustness of the proposed method by carrying out three different experiments using log-spectrum distance (LSD), the perceptual evaluation of speech quality (PESQ), and the bit detection rate (BDR). The results indicated that the proposed method satisfied the inaudibility requirement when watermarking was applied to the CELP codec, which increased the watermarking detection rate.
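Two of the measures named in the abstract above, LSD and BDR, are simple enough to sketch. The abstract does not give formulas, so this assumes the commonly used definitions: LSD as the root-mean-square difference in dB between two power spectra of one frame, and BDR as the percentage of correctly detected bits. Function names and sample values are hypothetical.

```python
import math


def log_spectral_distance(power_ref, power_test, eps=1e-12):
    """RMS difference in dB between two equal-length power spectra."""
    if len(power_ref) != len(power_test):
        raise ValueError("spectra must have the same number of bins")
    squared_db = [
        (10.0 * math.log10((p + eps) / (q + eps))) ** 2
        for p, q in zip(power_ref, power_test)
    ]
    return math.sqrt(sum(squared_db) / len(squared_db))


def bit_detection_rate(embedded, detected):
    """Percentage of detected bits that match the embedded bits."""
    if len(embedded) != len(detected) or not embedded:
        raise ValueError("bit sequences must be non-empty and equal in length")
    matches = sum(1 for a, b in zip(embedded, detected) if a == b)
    return 100.0 * matches / len(embedded)


# Hypothetical single-frame power spectra: the second is ten times the first.
lsd = log_spectral_distance([1.0, 2.0, 4.0], [10.0, 20.0, 40.0])
bdr = bit_detection_rate([1, 0, 1, 1], [1, 0, 0, 1])
```

In practice LSD is usually averaged over all frames of an utterance. PESQ is a standardized perceptual model and is not reproduced here.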
intelligent information hiding and multimedia signal processing | 2014
Shengbei Wang; Masashi Unoki
Illegal use of digital technologies has brought a series of problems in speech protection and authorization. Digital watermarking can effectively solve these problems by embedding watermarks into the host signals. This paper proposes a hybrid watermarking method for speech signals based on the concepts of formant enhancement (FE) and cochlear delay (CD). This hybrid method utilizes the source-filter model of speech production to separate the speech into the vocal tract filter (characterized by formants) and the sound source (excitation signal/residue) so that FE-based watermarking and CD-based watermarking can be applied separately. Objective evaluations of inaudibility and robustness were carried out to evaluate the proposed method. The results showed that the proposed method could satisfy inaudibility and that, by taking advantage of both methods, its robustness could be increased in comparison with either method alone. These results also verified the effectiveness of the proposed method.
IEICE Transactions on Information and Systems | 2015
Shengbei Wang; Masashi Unoki
conference of the international speech communication association | 2014
Shengbei Wang; Masashi Unoki; Nam Soo Kim
Signal Processing Conference (EUSIPCO), 2014 Proceedings of the 22nd European | 2014
Shengbei Wang; Masashi Unoki
IEICE technical report. Speech | 2016
Jessada Karnjana; Pham Hoang Bao Nhien; Shengbei Wang; Nhut Minh Ngo; Masashi Unoki
international conference on acoustics, speech, and signal processing | 2018
Shengbei Wang; Weitao Yuan; Jianming Wang; Masashi Unoki
IEICE technical report. Speech | 2016
Shengbei Wang; Nhut Minh Ngo; Masashi Unoki