Publication


Featured research published by Jianguo Wei.


Conference of the International Speech Communication Association | 2016

A New Model for Acoustic Wave Propagation and Scattering in the Vocal Tract

Jianguo Wei; Wendan Guan; Darcy Q. Hou; Dingyi Pan; Wenhuan Lu; Jianwu Dang

A new and efficient numerical model is proposed for simulating acoustic wave propagation and scattering in complex geometries. In this model, the linearized Euler equations are solved by the finite-difference time-domain (FDTD) method on an orthogonal Eulerian grid. The complex wall boundary, represented by a series of Lagrangian points, is treated numerically by the immersed boundary method (IBM). To represent the interaction between these two systems, a force field is added to the momentum equation; it is calculated on the Lagrangian points and interpolated to the nearby Eulerian points. The pressure and velocity fields are then calculated alternately using FDTD. The developed model is verified for acoustic scattering by a cylinder, for which exact solutions exist. The model is then applied to sound wave propagation in a 2D vocal tract with an area function extracted from MRI data. To show the advantage of the present model, the grid points are not aligned with the boundary. The numerical results agree well with solutions in the literature. By contrast, an FDTD calculation with the boundary condition imposed directly on the grid points closest to the wall cannot give a reasonable solution.
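The alternating pressure and velocity update mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' model: it is a plain 1D leapfrog FDTD scheme between two rigid walls, with no immersed boundary forcing, and the function name `fdtd_1d`, the grid size, the Gaussian initial pulse, and the material constants are all assumptions chosen for illustration.

```python
import math

def fdtd_1d(n_cells=200, n_steps=100, c=340.0, rho=1.2, dx=0.01, courant=0.5):
    """Leapfrog FDTD for 1D linear acoustics between two rigid walls (illustrative only)."""
    dt = courant * dx / c  # time step chosen from an assumed Courant number
    # Pressure lives at cell centres, velocity at cell faces (staggered grid).
    p = [math.exp(-((i - n_cells / 2) / 8.0) ** 2) for i in range(n_cells)]
    u = [0.0] * (n_cells + 1)  # u[0] and u[-1] are never updated: rigid walls
    for _ in range(n_steps):
        # Momentum equation: update velocity from the pressure gradient.
        for i in range(1, n_cells):
            u[i] -= dt / (rho * dx) * (p[i] - p[i - 1])
        # Continuity equation: update pressure from the velocity divergence.
        for i in range(n_cells):
            p[i] -= dt * rho * c * c / dx * (u[i + 1] - u[i])
    return p

p_initial = fdtd_1d(n_steps=0)
p_final = fdtd_1d(n_steps=100)
```

With zero wall velocity the pressure update telescopes, so the sum of the pressure samples is preserved, and the initial pulse splits into two counter-propagating pulses of roughly half the amplitude.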


Journal of the Acoustical Society of America | 2008

Vocal tract normalization in articulatory space using thin‐plate spline method

Jianguo Wei; Jianwu Dang

Inter‐subject normalization is a key issue in the group analysis of articulatory data, which aims to obtain a general description of the kinematic properties of human speech production. Multisubject articulatory studies, however, are scarce because normalization in the articulatory domain is difficult. To reduce intersubject variation in articulatory space, a simple normalization procedure was proposed using a thin‐plate spline method. The purpose of this normalization is to reduce morphological differences of the vocal tract, such as shape and size, among subjects. Nonlinear factors are taken into account in the procedure. Electromagnetic articulographic (EMMA) data for three subjects, obtained from the NTT EMMA database, were used in our experiments. A physiological articulatory model served as the template. The landmarks were defined consistently in vocal tract space over the template and all subjects. The evaluations showed that the variances over subjects were reduced by 2.1 mm for consonants and 2.3 mm for vowels, averaged over all tongue pellets.
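As a rough illustration of landmark-based thin-plate spline warping, the sketch below fits a standard 2D thin-plate spline that maps a subject's landmarks onto template landmarks. It is an assumption-laden example, not the procedure from the paper: the helper names (`solve`, `tps_kernel`, `fit_tps`) and the landmark coordinates are hypothetical, and the linear system is solved with a small hand-written Gaussian elimination to keep the example dependency-free.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def tps_kernel(r):
    """Thin-plate spline radial basis U(r) = r^2 log r, with U(0) = 0."""
    return 0.0 if r == 0.0 else r * r * math.log(r)

def fit_tps(src, dst):
    """Return a function warping 2D points so that src landmarks land on dst."""
    n = len(src)
    a = [[0.0] * (n + 3) for _ in range(n + 3)]
    for i, (xi, yi) in enumerate(src):
        for j, (xj, yj) in enumerate(src):
            a[i][j] = tps_kernel(math.hypot(xi - xj, yi - yj))
        a[i][n:] = [1.0, xi, yi]                        # affine part
        a[n][i], a[n + 1][i], a[n + 2][i] = 1.0, xi, yi  # side conditions
    # One set of coefficients per output coordinate (x, then y).
    coeffs = [solve(a, [pt[axis] for pt in dst] + [0.0, 0.0, 0.0])
              for axis in range(2)]

    def warp(x, y):
        out = []
        for c in coeffs:
            val = c[n] + c[n + 1] * x + c[n + 2] * y
            val += sum(c[i] * tps_kernel(math.hypot(x - sx, y - sy))
                       for i, (sx, sy) in enumerate(src))
            out.append(val)
        return tuple(out)

    return warp

# Hypothetical landmarks: subject (src) and template (dst), arbitrary units.
subject = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
template = [(0.0, 0.1), (1.1, 0.0), (-0.1, 1.0), (1.0, 1.2), (0.6, 0.5)]
warp = fit_tps(subject, template)
```

Calling `warp(x, y)` on any measured point, such as a tongue pellet position, moves it into the template's space; the landmarks themselves are interpolated exactly up to rounding.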


IEICE Transactions on Information and Systems | 2007

A Model-Based Learning Process for Modeling Coarticulation of Human Speech

Jianguo Wei; Xugang Lu; Jianwu Dang

Machine learning techniques have long been applied in many fields with considerable success. The purpose of a learning process is generally to obtain a set of parameters from a given data set by minimizing an objective function, so that the parameters explain the data in a maximum likelihood or minimum estimation error sense. However, most learned parameters are highly data dependent and rarely reflect the true physical mechanism behind the observed data. To obtain the inherent knowledge in the observed data, it is necessary to combine physical models with the learning process rather than only fitting the observations with a black box model. To reveal underlying properties of human speech production, we proposed a learning process based on a physiological articulatory model and a coarticulation model, both of which are derived from human mechanisms. A two-layer learning framework was designed to learn the parameters at the physiological level using the physiological articulatory model and the parameters at the motor planning level using the coarticulation model. The learning process was carried out on an articulatory database of human speech production. The learned parameters were evaluated by numerical experiments and listening tests. The phonetic targets obtained in the planning stage provided evidence for understanding the virtual targets of human speech production. As a result, the model-based learning process reveals the inherent mechanism of human speech via learned parameters that carry physical meaning.


Mathematical Problems in Engineering | 2015

Generalized Finite Difference Time Domain Method and Its Application to Acoustics

Jianguo Wei; Song Wang; Qingzhi Hou; Jianwu Dang

A meshless generalized finite difference time domain (GFDTD) method is proposed and applied to transient acoustics to overcome difficulties caused by the use of grids or meshes. Inspired by the derivation of meshless particle methods, the generalized finite difference method (GFDM) is reformulated using Taylor series expansion. This differs from the conventional derivation of GFDM, in which a weighted energy norm is minimized. The similarities and differences between GFDM and particle methods are hence conveniently examined. It is shown that GFDM performs better than the modified smoothed particle method in approximating the first- and second-order derivatives of 1D and 2D functions. To solve acoustic wave propagation problems, GFDM is used to approximate the spatial derivatives and the leap-frog scheme is used for time integration. By analogy with FDTD, the whole algorithm is referred to as GFDTD. Examples in one- and two-dimensional domains with reflecting and absorbing boundary conditions are solved, and good agreement with the FDTD reference solutions is observed, even with irregular point distributions. The developed GFDTD method has advantages in solving wave propagation in domains with irregular and moving boundaries.
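To make the idea of approximating derivatives on irregular points concrete, here is a minimal 1D sketch. It follows the conventional weighted least-squares form of GFDM that the abstract contrasts with, not the paper's Taylor-series reformulation; the function name `gfdm_derivatives`, the inverse-square weight, and the sample points are assumptions.

```python
def gfdm_derivatives(x0, f0, neighbours, values):
    """Estimate f'(x0) and f''(x0) from scattered neighbours by weighted least squares."""
    a11 = a12 = a22 = b1 = b2 = 0.0
    for xi, fi in zip(neighbours, values):
        h = xi - x0
        w = 1.0 / (h * h)  # hypothetical weight: closer points count more
        # Second-order Taylor model: fi - f0 ~ h * d1 + (h**2 / 2) * d2
        c1, c2 = h, 0.5 * h * h
        a11 += w * c1 * c1
        a12 += w * c1 * c2
        a22 += w * c2 * c2
        b1 += w * c1 * (fi - f0)
        b2 += w * c2 * (fi - f0)
    # Solve the 2x2 normal equations by Cramer's rule.
    det = a11 * a22 - a12 * a12
    d1 = (b1 * a22 - b2 * a12) / det
    d2 = (a11 * b2 - a12 * b1) / det
    return d1, d2

# Irregularly spaced points around x0 = 1, sampling f(x) = x^2.
points = [0.7, 0.85, 1.1, 1.32]
d1, d2 = gfdm_derivatives(1.0, 1.0, points, [x * x for x in points])
```

Because the Taylor model is second order, a quadratic test function is reproduced up to rounding; for other functions the estimates carry a truncation error that depends on the point spacing.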


Conference of the International Speech Communication Association | 2016

An Improved 3D Geometric Tongue Model

Qiang Fang; Yun Chen; Haibo Wang; Jianguo Wei; Jianrong Wang; Xiyu Wu; Aijun Li

This study describes an improved geometric articulatory model based on MRI and CBCT (cone beam computed tomography) data. The basic idea is to improve the coherence of the vertices of the tongue meshes so as to obtain a more accurate tongue model. This is done in two ways: i) the representative vertices of the tongue surface are described in a Cartesian coordinate system rather than in a semi-polar gridline coordinate system; ii) the tongue surface meshes are modeled with reference to anatomical landmarks. Then, guided PCA is used to extract the control components based on MRI data. The average reconstruction error is less than 1.0 mm. Both qualitative and quantitative evaluations indicate that the proposed method surpasses the conventional method based on the semi-polar gridline system.


Speech Communication | 2018

Tooth visualization in vowel production MR images for three-dimensional vocal tract modeling

Ju Zhang; Kiyoshi Honda; Jianguo Wei

Teeth are almost invisible in magnetic resonance imaging (MRI) because they lack free protons to react magnetically. In MRI-based studies of the vocal tract, the teeth must be visualized on the volume data obtained during articulation. To do so, a variety of techniques have been proposed, either covering the teeth with opaque materials or obtaining tooth images and then superimposing them. In this article, a new method is proposed to visualize the teeth in vowel production MR images for three-dimensional (3D) vocal tract modeling. The 3D upper and lower jaws with the teeth were first extracted and reconstructed from static 3D-MRI data acquired during a simple ‘tooth imaging’ posture with minimal time and effort. The extracted 3D jaws with the teeth were then superimposed onto the vowel production MRI volume in three dimensions, using the dental pulps as volume-based landmarks to minimize fitting errors due to varied head positions across the scans. The effectiveness of the proposed method was demonstrated by both subjective opinions and objective evaluation. The results show that the teeth are successfully and accurately superimposed onto the vowel production MR images. In addition, the reconstructed 3D vocal tract models show the bilateral interdental spaces after tooth superimposition. The proposed method solves the MRI-specific problem of the lack of tooth images and contributes to accurate 3D vocal tract measurement and reconstruction.


Sensors | 2018

Watermarking Based on Compressive Sensing for Digital Speech Detection and Recovery

Wenhuan Lu; Zonglei Chen; Ling Li; Xiaochun Cao; Jianguo Wei; Naixue Xiong; Jian Li; Jianwu Dang

In this paper, a novel imperceptible, fragile and blind watermark scheme is proposed for speech tampering detection and self-recovery. The embedded watermark data for content recovery is calculated from the original discrete cosine transform (DCT) coefficients of the host speech. The watermark information is shared across a group of frames instead of being stored in one frame. The scheme trades off between the data waste problem and the tampering coincidence problem. When part of a watermarked speech signal is tampered with, the tampered area can be accurately localized, and the watermark data in the unmodified area can still be extracted. Then, a compressive sensing technique is employed to retrieve the coefficients by exploiting the sparseness in the DCT domain. The smaller the tampered area, the better the quality of the recovered signal. Experimental results show that the watermarked signal is imperceptible, and the recovered signal is intelligible for high tampering rates of up to 47.6%. A deep learning-based enhancement method is also proposed and implemented to increase the SNR of the recovered speech signal.


Trust, Security and Privacy in Computing and Communications | 2016

A Study on Detection and Recovery of Speech Signal Tampering

Jian Li; Wenhuan Lu; Chen Zhang; Jianguo Wei; Xiaochun Cao; Jianwu Dang

In this paper, a watermark scheme for detecting and recovering from tampering of speech signals is proposed. We embed an approximate version of the original information into the least significant bits (LSBs). When the watermarked signal is tampered with, we can localize the tampered area, exclude it, and recover the original signal from the approximate version. Based on the embedded information, we estimate the original signal by solving a linear equation with the least squares QR factorization (LSQR) method. The result of an informal listening test shows that 83.7% of the listening material is intelligible after recovery when 20% of the signal has been tampered with.
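The LSB embedding step alone can be sketched as follows. This is a deliberately simplified illustration under stated assumptions: one watermark bit per sample, integer PCM samples, and a known reference bit pattern for the tamper check. The paper's actual payload (an approximate version of the signal) and its LSQR-based recovery are not reproduced, and the function names and sample values are hypothetical.

```python
def embed_lsb(samples, bits):
    """Write one watermark bit into the least significant bit of each sample."""
    return [(s & ~1) | b for s, b in zip(samples, bits)]

def extract_lsb(samples):
    """Read the least significant bit of each sample."""
    return [s & 1 for s in samples]

def tampered_frames(samples, expected_bits, frame_len=4):
    """Return indices of frames whose extracted bits differ from the expected bits."""
    found = extract_lsb(samples)
    bad = []
    for start in range(0, len(found), frame_len):
        if found[start:start + frame_len] != expected_bits[start:start + frame_len]:
            bad.append(start // frame_len)
    return bad

# Hypothetical 16-bit PCM samples and watermark bits.
samples = [1200, -873, 15, 32767, -32768, 404, -1, 256]
bits = [1, 0, 1, 1, 0, 0, 1, 0]
marked = embed_lsb(samples, bits)

# Simulate tampering by flipping the LSB of one sample in the second frame.
attacked = marked[:]
attacked[5] ^= 1
```

Embedding changes each sample by at most one quantization step, which is why LSB schemes are hard to hear, and a mismatch between extracted and expected bits flags the frame that was modified.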


Journal of the Acoustical Society of America | 2016

A study on transvelar coupling for non-nasalized sounds

Jianwu Dang; Jianguo Wei; Kiyoshi Honda; Takayoshi Nakai

Previous studies have found that the velum in speech production may not only serve as a binary switch with on-off states for nasal and non-nasal sounds, but also partially alter the acoustic characteristics of non-nasalized sounds. The present study investigated the unique functions of the velum in the production of non-nasalized sounds by using morphological, mechanical, and acoustical measurements. Magnetic resonance imaging movies obtained from three Japanese speakers were used to measure the behaviors of the velum and dynamic changes in the pseudo-volume of the pharyngeal cavity during utterances of voiced stops and vowels. The measurements revealed no significant enlargements in the supraglottal cavity as subjects uttered voiced stops. It is found that the velum thickness varied across utterances in a way that depended on vowels, but not on consonants. The mechanical and acoustical observations in the study suggested that the velum is actively controlled to augment the voice bars of voiced stops, and nostril-radiated sound is one of the most important sources for voice bars, just as is laryngeal wall vibration. This study also proposed a two-layer diaphragm model that simulates transvelar coupling during the production of non-nasalized speech sounds. The simulation demonstrated that the model accurately represented the basic velar functions involved in speech production.


International Conference on Acoustics, Speech, and Signal Processing | 2010

Morphological normalization of vocal tract shape

Jianguo Wei; Jianwu Dang

Collaboration


Dive into Jianguo Wei's collaborations.

Top Co-Authors

Qiang Fang

Chinese Academy of Social Sciences

Xugang Lu

National Institute of Information and Communications Technology

Xiyu Wu

Japan Advanced Institute of Science and Technology
