Wenhuan Lu
Tianjin University
Publications
Featured research published by Wenhuan Lu.
Journal of Information Science | 2008
Wenhuan Lu; Mitsuru Ikeda
Copyright issues are significant for worldwide information sharing, yet mutual understanding of the commonalities and differences among international copyright law articles is difficult because of the diversity of legal knowledge representation. The goal of our research is to propose an appropriate methodology and capture a uniform conceptual model that provides semantic-level representation for processing and modelling international legal knowledge using ontological technology. This paper proposes a preliminary intention-oriented legal knowledge model as a pivotal model that, from the viewpoint of the intention behind the law, manages and models legal knowledge derived from international law documents. We develop a domain ontology, the international copyright law ontology, which is used as a fundamental conceptual framework to maintain consistency among diverse legal knowledge representations.
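To make the idea concrete, here is a minimal sketch of how such a pivotal, intention-oriented conceptual model could be encoded with the rdflib library; the class and property names (LegalIntention, LawArticle, protectsIntention) are hypothetical illustrations, not the paper's actual ontology.

```python
from rdflib import Graph, Namespace, RDF, RDFS

# Hypothetical fragment of an intention-oriented copyright ontology.
ICL = Namespace("http://example.org/icl#")
g = Graph()
g.bind("icl", ICL)

# Concepts organized around the intention behind a legal article.
for cls in (ICL.LegalIntention, ICL.Right, ICL.LawArticle):
    g.add((cls, RDF.type, RDFS.Class))
g.add((ICL.ReproductionRight, RDFS.subClassOf, ICL.Right))

# A national article is linked to the shared (pivotal) intention it
# realizes, so articles from different jurisdictions can be compared
# against one consistent conceptual framework.
g.add((ICL.protectsIntention, RDFS.domain, ICL.LawArticle))
g.add((ICL.protectsIntention, RDFS.range, ICL.LegalIntention))

print(g.serialize(format="turtle"))
```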
Conference of the International Speech Communication Association | 2016
Jianguo Wei; Wendan Guan; Darcy Q. Hou; Dingyi Pan; Wenhuan Lu; Jianwu Dang
A new and efficient numerical model is proposed for simulating acoustic wave propagation and scattering problems arising from complex geometries. In this model, the linearized Euler equations are solved by the finite-difference time-domain (FDTD) method on an orthogonal Eulerian grid. The complex wall boundary, represented by a series of Lagrangian points, is treated numerically by the immersed boundary method (IBM). To represent the interaction between the two systems, a force field is added to the momentum equation; it is calculated on the Lagrangian points and interpolated to the nearby Eulerian points. The pressure and velocity fields are then calculated alternately using FDTD. The developed model is verified on acoustic scattering by a cylinder, for which exact solutions exist. The model is then applied to sound wave propagation in a 2D vocal tract with an area function extracted from MRI data. To show the advantage of the present model, the grid points are not aligned with the boundary. The numerical results agree well with solutions in the literature, whereas an FDTD calculation with the boundary condition imposed directly on the grid points closest to the wall cannot give a reasonable solution.
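The core update loop can be sketched as follows. This is a simplified, collocated-grid illustration of an FDTD step for the linearized Euler equations with an added immersed-boundary force term, not the paper's scheme (which spreads the force from the Lagrangian wall points with its own interpolation stencils); all grid parameters are illustrative.

```python
import numpy as np

nx = ny = 200
dx = 1e-3                           # grid spacing [m] (assumed)
c0, rho0 = 343.0, 1.2               # sound speed [m/s], air density [kg/m^3]
dt = 0.5 * dx / (c0 * np.sqrt(2))   # CFL-limited time step

p = np.zeros((nx, ny))              # acoustic pressure
u = np.zeros((nx, ny))              # particle velocity, x
v = np.zeros((nx, ny))              # particle velocity, y
fx = np.zeros((nx, ny))             # IBM force spread from Lagrangian points
fy = np.zeros((nx, ny))             # (left zero in this placeholder sketch)

for _ in range(100):
    # momentum equations, with the immersed-boundary force added
    dpdx, dpdy = np.gradient(p, dx)
    u += dt * (-dpdx / rho0 + fx)
    v += dt * (-dpdy / rho0 + fy)
    # continuity equation (pressure update)
    dudx = np.gradient(u, dx, axis=0)
    dvdy = np.gradient(v, dx, axis=1)
    p += -dt * rho0 * c0**2 * (dudx + dvdy)
```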
Multimedia Tools and Applications | 2016
Jianguo Wei; Qiang Fang; Xinyuan Zheng; Wenhuan Lu; Yuqing He; Jianwu Dang
Constructing a mapping between articulatory movements and the corresponding speech could significantly facilitate speech training and the development of speech aids for voice disorder patients. In this paper, we propose a novel deep learning framework for creating a bidirectional mapping between articulatory information and synchronized speech recorded with an ultrasound system. We created a dataset comprising six Chinese vowels and employed a Bimodal Deep Autoencoder based on the Restricted Boltzmann Machine (RBM) to learn the correlation between speech and ultrasound images of the tongue, together with the weight matrices of the resulting data representations. Speech and ultrasound images were then reconstructed from the extracted features. The reconstruction error of the ultrasound images created with our method was found to be less than that of an approach based on Principal Components Analysis (PCA). Further, the reconstructed speech approximated the original, as the mean formant error (MFE) was small. After acquiring shared representations with the RBM-based deep autoencoder, we carried out mapping between ultrasound images of the tongue and the corresponding acoustic signals in a Deep Neural Network (DNN) framework using revised Deep Denoising Autoencoders. The results indicate that the performance of our proposed method is better than that of a Gaussian Mixture Model (GMM)-based method to which it was compared.
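A minimal sketch of the bimodal topology follows: two modality encoders feed one shared code, from which both modalities are reconstructed. The paper pretrains with RBMs, whereas this sketch trains the same structure directly with backpropagation in PyTorch, and all dimensions are assumptions.

```python
import torch
import torch.nn as nn

class BimodalAE(nn.Module):
    def __init__(self, speech_dim=40, image_dim=64 * 64, shared_dim=128):
        super().__init__()
        self.enc_speech = nn.Sequential(nn.Linear(speech_dim, 256), nn.ReLU())
        self.enc_image = nn.Sequential(nn.Linear(image_dim, 256), nn.ReLU())
        self.shared = nn.Linear(512, shared_dim)   # joint representation
        self.dec_speech = nn.Linear(shared_dim, speech_dim)
        self.dec_image = nn.Linear(shared_dim, image_dim)

    def forward(self, speech, image):
        h = torch.cat([self.enc_speech(speech), self.enc_image(image)], dim=1)
        z = torch.relu(self.shared(h))             # shared code of both modalities
        return self.dec_speech(z), self.dec_image(z)

model = BimodalAE()
speech = torch.randn(8, 40)          # e.g. a batch of spectral feature frames
image = torch.randn(8, 64 * 64)      # flattened ultrasound tongue frames
rec_s, rec_i = model(speech, image)
loss = (nn.functional.mse_loss(rec_s, speech)
        + nn.functional.mse_loss(rec_i, image))   # joint reconstruction loss
loss.backward()
```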
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference | 2015
Jingshu Zhang; Jianguo Wei; Wenhuan Lu; Qiang Fang; Kiyoshi Honda; Jianwu Dang
Differences in vowel sounds among speakers are mainly caused by morphological differences and by each speaker's speaking style. Traditional vowel normalization methods in acoustic space are mainly concerned with the variance of acoustic features; no information from articulatory space is taken into account. This paper proposes an approach to normalizing vowel spectra using articulatory information, which in this study mainly reflects morphological variation among speakers. By taking articulatory information into account, the normalization method gains a clear physical meaning: it is explicit which part of the acoustic variation has been reduced. The proposed framework uses the thin-plate spline (TPS) method, which was applied to normalize the formant frequencies of three Chinese vowels and compared with traditional acoustic normalization methods for evaluation. The results show that the variances among subjects were reduced, and the vowel diagrams indicate that this method outperforms other acoustic methods in preserving speaker-specific characteristics.
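As an illustration of the normalization step, the sketch below fits a thin-plate spline that warps one speaker's (F1, F2) vowel points onto a template, using SciPy's RBFInterpolator with the thin_plate_spline kernel; the formant values are made-up placeholders, not data from the paper.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

# One speaker's mean (F1, F2) in Hz for three vowels, and the template
# speaker's corresponding points (all values invented for illustration).
speaker = np.array([[300.0, 2300.0],    # /i/
                    [700.0, 1200.0],    # /a/
                    [350.0,  800.0]])   # /u/
template = np.array([[280.0, 2250.0],
                     [750.0, 1100.0],
                     [320.0,  760.0]])

# Fit the TPS warp from the speaker's vowel space to the template's.
tps = RBFInterpolator(speaker, template, kernel='thin_plate_spline')

# Normalize an unseen vowel token from the same speaker.
print(tps(np.array([[500.0, 1500.0]])))
```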
Ambient Intelligence | 2017
Wenhuan Lu; Ju Zhang; Xinli Zhao; Jianrong Wang; Jianwu Dang
Self-localization is a fundamental requirement for autonomous mobile robots. With the rapid development of sensor technology, a robot's sensor suite provides multimodal information that naturally supports perception robustness, and multimodal sensory fusion can provide a better solution for enhancing self-localization. This paper proposes a multimodal sensory fusion method based on a Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN) for the RoboCup 3D Simulation league. The approach fuses Inertial Navigation System (INS) and vision perceptor information from different sensors at the feature level instead of at the raw-data level. The experimental results demonstrate that the proposed approach improves predictive accuracy and efficiency compared with the standard Extended Kalman Filter (EKF) and static Particle Filter (PF) methods.
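A minimal sketch of feature-level fusion, assuming PyTorch: INS and vision feature vectors are concatenated per time step and fed to an LSTM that regresses the robot pose. The feature dimensions and the (x, y, heading) pose parameterization are illustrative, not the paper's RoboCup 3D configuration.

```python
import torch
import torch.nn as nn

class FusionLocalizer(nn.Module):
    def __init__(self, ins_dim=6, vision_dim=16, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(ins_dim + vision_dim, hidden, batch_first=True)
        self.pose = nn.Linear(hidden, 3)           # x, y, heading

    def forward(self, ins, vision):
        fused = torch.cat([ins, vision], dim=-1)   # feature-level fusion
        out, _ = self.lstm(fused)
        return self.pose(out)                      # pose estimate per time step

model = FusionLocalizer()
ins = torch.randn(4, 50, 6)       # batch of 50-step gyro/accelerometer features
vision = torch.randn(4, 50, 16)   # landmark-observation features
poses = model(ins, vision)        # shape (4, 50, 3)
```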
Journal of the Acoustical Society of America | 2016
Kiyoshi Honda; Honghao Bao; Wenhuan Lu
The origins of the individual characteristics of speech sounds have been a mystery. Individual patterns in the higher spectra can be attributed to quasi-static hypopharyngeal-cavity resonance, while those in the lower spectra are puzzling because both the spectra and the vocal-tract shapes change radically during speech. A possible clue for examining articulatory idiosyncrasy may be the relation between the relative size and the mobility of the tongue in the oropharyngeal cavity. To this end, combined cine- and tagged-MRI collected from four Chinese speakers producing two-syllable words were processed. The relative tongue size was indexed by the midsagittal tongue area divided by the tongue-plus-airway area, both measured above the level of the superior genial tubercle in static MRI during /i/. The mobility of the tongue was measured by the average velocity of tag points located along the oral and pharyngeal surfaces of the tongue. In the results, the velocity monotonically decreased with the relative tongue size, suggesting that the smaller the tongue relative to the oropharyngeal cavity, the more mobile it is.
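The relative-size index reduces to a simple area ratio. Here is a sketch under the assumption of segmented binary masks from the static MRI; the mask shapes and the reference row standing in for the superior-genial-tubercle level are hypothetical.

```python
import numpy as np

def relative_tongue_size(tongue_mask, airway_mask, ref_row):
    # Count pixels above the anatomical reference level only.
    tongue = tongue_mask[:ref_row].sum()
    airway = airway_mask[:ref_row].sum()
    return tongue / (tongue + airway)   # midsagittal tongue / (tongue + airway)

# Toy segmentations (a real pipeline would segment the midsagittal MRI slice).
tongue_mask = np.zeros((256, 256), bool)
airway_mask = np.zeros((256, 256), bool)
tongue_mask[80:150, 60:180] = True
airway_mask[60:80, 60:180] = True
print(relative_tongue_size(tongue_mask, airway_mask, ref_row=160))
```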
Journal of Visual Communication and Image Representation | 2016
Jianguo Wei; Jingshu Zhang; Yan Ji; Qiang Fang; Wenhuan Lu
Minimizing morphological variances of the vocal tract across speakers is a challenge for articulatory analysis and modeling. In order to reduce morphological differences in the speech organs among speakers while retaining each speaker's speech dynamics, our study proposes a method of normalizing the vocal-tract shapes of Mandarin and Japanese speakers using a Thin-Plate Spline (TPS) method. We apply the properties of TPS in a two-dimensional space to normalize vocal-tract shapes, and we use DNN (Deep Neural Network)-based speech recognition for our evaluations. We obtained our template for normalization by measuring three speakers' palate and tongue shapes. Our results show a reduction in variances among subjects, and the similar vowel structure of pre- and post-normalization data indicates that our framework retains speaker-specific characteristics. Our results for the articulatory recognition of isolated phonemes show an improvement of 25%, and the phone error rate for continuous speech was reduced by 5.84%.
International Symposium on Chinese Spoken Language Processing | 2014
Xinyuan Zheng; Jianguo Wei; Wenhuan Lu; Qiang Fang; Jianwu Dang
Building a mapping between articulatory movements and the corresponding speech could greatly facilitate speech training and speech aids for voiceless patients. In this paper, we propose a deep learning framework for building a mapping between articulatory information and the corresponding speech, recorded with an ultrasound system. The dataset includes six Chinese vowels. We use a Bimodal Deep Autoencoder algorithm based on the RBM to learn the relationship between speech and articulation, along with the weight matrices of their representations. Speech and ultrasound images were reconstructed from the extracted features. The reconstruction error of articulation by our method is less than that of a PCA-based approach, and the reconstructed speech is similar to the original. We propose a mapping from ultrasound tongue images to the acoustic signal with a revised Denoising Autoencoder, and the results show that it is a promising approach. In contrast, another experiment was conducted to synthesize the ultrasound tongue image from the speech, but its results still need improvement.
Sensors | 2018
Wenhuan Lu; Zonglei Chen; Ling Li; Xiaochun Cao; Jianguo Wei; Naixue Xiong; Jian Li; Jianwu Dang
In this paper, a novel imperceptible, fragile and blind watermark scheme is proposed for speech tampering detection and self-recovery. The embedded watermark data for content recovery are calculated from the original discrete cosine transform (DCT) coefficients of the host speech. The watermark information is shared across a group of frames instead of being stored in one frame, so the scheme trades off between the data-waste problem and the tampering-coincidence problem. When part of a watermarked speech signal is tampered with, one can accurately localize the tampered area, and the watermark data in the unmodified area can still be extracted. A compressive sensing technique is then employed to retrieve the coefficients by exploiting their sparseness in the DCT domain. The smaller the tampered area, the better the quality of the recovered signal. Experimental results show that the watermarked signal is imperceptible and that the recovered signal is intelligible for tampering rates as high as 47.6%. A deep learning-based enhancement method is also proposed and implemented to increase the SNR of the recovered speech signal.
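To illustrate the DCT-domain embedding idea, here is a toy sketch that hides watermark bits in selected frame coefficients by forcing their quantization-index parity; the frame length, quantization step, and coefficient band are assumptions, and the paper's scheme additionally distributes recovery data across frame groups and restores it via compressive sensing.

```python
import numpy as np
from scipy.fft import dct, idct

FRAME, STEP = 160, 0.01        # frame length and quantization step (assumed)

def embed(frame, bits):
    c = dct(frame, norm='ortho')
    for i, b in enumerate(bits):                # one bit per low-index coefficient
        q = int(np.round(c[i + 1] / STEP))
        if q % 2 != b:                          # force the index parity to the bit
            q += 1
        c[i + 1] = q * STEP
    return idct(c, norm='ortho')

def extract(frame, nbits):
    c = dct(frame, norm='ortho')
    return [int(np.round(c[i + 1] / STEP)) % 2 for i in range(nbits)]

x = np.random.randn(FRAME)                      # stand-in for one speech frame
y = embed(x.copy(), [1, 0, 1, 1])
print(extract(y, 4))                            # -> [1, 0, 1, 1]
```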
Multimedia Tools and Applications | 2018
Jianguo Wei; Yan Ji; Jingshu Zhang; Qiang Fang; Wenhuan Lu; Kiyoshi Honda; Xugang Lu
In this paper, the contribution of dynamic articulatory information was evaluated using an articulatory speech recognition system. Electromagnetic articulographic (EMA) datasets are relatively small and hard to record compared with the popular speech corpora used in modern speech research. We used articulatory data to study the contribution of each observation channel of the vocal tract to speech recognition within a DNN framework, and we analyzed the recognition results for each phoneme according to speech production rules. The contribution rate of each articulator can be regarded as a measure of how crucial it is for each phoneme in speech production. Furthermore, the results indicate that the contribution of each observation point is not tied to a specific method, and the tendency of each sensor's contribution is consistent with the rules of Japanese phonology. In this work, we also evaluated the compensation effect between different channels and found that crucial points are harder to compensate for than non-crucial points. The proposed method can help identify the crucial points of each phoneme during speech, and the results can contribute to the study of speech production and articulatory-based speech recognition.
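The channel-contribution analysis can be mimicked with a simple ablation loop: given a phoneme classifier trained on EMA features, zero out one articulator channel at a time and measure the accuracy drop, taking a larger drop as a more crucial channel. The channel list, dimensions, and model below are assumptions for illustration, not the paper's configuration.

```python
import torch
import torch.nn as nn

CHANNELS = ['tongue_tip', 'tongue_body', 'tongue_dorsum', 'lips', 'jaw']
DIM_PER_CHANNEL, N_PHONEMES = 2, 20        # (x, y) per sensor, phoneme inventory

model = nn.Sequential(
    nn.Linear(len(CHANNELS) * DIM_PER_CHANNEL, 64), nn.ReLU(),
    nn.Linear(64, N_PHONEMES))
# ... train `model` on EMA frames, then:

def accuracy(model, x, y):
    return (model(x).argmax(dim=1) == y).float().mean().item()

x = torch.randn(500, len(CHANNELS) * DIM_PER_CHANNEL)   # held-out frames
y = torch.randint(0, N_PHONEMES, (500,))                # frame phoneme labels
base = accuracy(model, x, y)
for i, name in enumerate(CHANNELS):
    xa = x.clone()
    xa[:, i * DIM_PER_CHANNEL:(i + 1) * DIM_PER_CHANNEL] = 0.0  # ablate channel
    print(name, 'contribution ~', base - accuracy(model, xa, y))
```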