
Publication


Featured research published by R. Muralishankar.


Speech Communication | 2004

Modification of pitch using DCT in the source domain

R. Muralishankar; A. G. Ramakrishnan; P. Prathibha

In this paper, we propose a novel algorithm for pitch modification. The linear prediction (LP) residual is obtained from pitch-synchronous frames by inverse filtering the speech signal. The discrete cosine transform (DCT) of each residual frame is then taken. Based on the desired pitch modification factor, the DCT coefficients of the residual are truncated or zero-padded, and the inverse discrete cosine transform is applied. This period-modified residual signal is then forward filtered to obtain the pitch-modified speech. The mismatch between the positions of the harmonics of the pitch-modified signal and the LP spectrum of the original signal introduces gain variations, which are more pronounced in the case of female speech [Proc. Int. Conf. on Acoust. Speech and Signal Process. (1997) 1623]. This is minimised by modifying the radii of the poles of the filter to broaden the otherwise peaky linear predictive spectrum; the modified LP coefficients are used for both inverse and forward filtering. This pitch modification scheme is used in our concatenative speech synthesis system for Kannada. The technique has also been successfully applied to creating interrogative sentences from affirmative ones. The modified speech has been evaluated in terms of intelligibility, distortion and speaker identity. Results indicate that our scheme yields acceptable speech on all these measures for the pitch change factors required in our speech synthesis work.
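The core resampling step of the abstract above (truncate or zero-pad the DCT of a pitch-synchronous residual frame, then invert) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the pitch-factor convention, the energy rescaling, and the toy impulse frame are assumptions.

```python
import numpy as np
from scipy.fft import dct, idct

def resample_residual_dct(frame, pitch_factor):
    """Change the period of a pitch-synchronous residual frame by
    truncating (pitch_factor > 1, raising pitch) or zero-padding
    (pitch_factor < 1, lowering pitch) its DCT coefficients."""
    N = len(frame)
    M = max(1, int(round(N / pitch_factor)))      # new period length
    coeffs = dct(frame, type=2, norm='ortho')
    if M <= N:
        new_coeffs = coeffs[:M]                   # truncate -> shorter period
    else:
        new_coeffs = np.pad(coeffs, (0, M - N))   # zero-pad -> longer period
    # Rescale so the frame energy stays roughly comparable (an assumption)
    return idct(new_coeffs, type=2, norm='ortho') * np.sqrt(M / N)

# Toy residual frame: one period of an impulse-like excitation
frame = np.zeros(100)
frame[0] = 1.0
raised = resample_residual_dct(frame, 1.25)   # period shrinks to 80 samples
lowered = resample_residual_dct(frame, 0.8)   # period grows to 125 samples
```

In the full scheme these resampled frames would be forward filtered with the (pole-broadened) LP filter to produce the pitch-modified speech.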


Proceedings of the 2002 IEEE Workshop on Speech Synthesis | 2002

A complete text-to-speech synthesis system in Tamil

G.L. Jayavardhana Rama; A. G. Ramakrishnan; R. Muralishankar; R. Prathibha

We report the design and development of Thirukkural, the first text-to-speech converter for Tamil. Syllables of different lengths have been selected as units, since Tamil is a syllabic language. An automatic segmentation algorithm has been devised for segmenting syllables into consonant and vowel. The units are pitch-marked using the discrete cosine transform-spectral autocorrelation function (DCTSAF). Prosodic information is captured in tables based on extensive observation of spoken Tamil. During synthesis, DCT-based pitch modification is applied both for waveform interpolation and for modifying the pitch contour for different sentence modalities. Thirukkural is implemented in VC++ and runs on Windows 95/98/NT. Perceptual evaluation by native speakers shows that the synthesized speech is intelligible and fairly natural.


IEEE Workshop on Applications of Signal Processing to Audio and Acoustics | 2007

Bauer Method of MVDR Spectral Factorization for Pitch Modification in the Source Domain

M. Ravi Shanker; R. Muralishankar; A. G. Ramakrishnan

In our earlier work [1], we employed MVDR (minimum variance distortionless response) spectral estimation instead of the modified linear prediction method [2] for pitch modification. Here, we use the Bauer method of MVDR spectral factorization, which leads to a causal inverse filter rather than the noncausal filter setup of MVDR spectral estimation [1]. This filter is used to obtain the source (or residual) signal from pitch-synchronous speech frames. The residual signal is resampled using the DCT/IDCT, depending on the target pitch scale factor. Finally, forward filters realized from the above factorization are used to obtain the pitch-modified speech. The modified speech is evaluated subjectively by 10 listeners and mean opinion scores (MOS) are tabulated. A modified bark spectral distortion measure is also computed for objective evaluation. We find that the proposed algorithm performs better than time-domain pitch-synchronous overlap-add [3] and the modified-LP method [2], and its causal inverse and forward filter setup achieves a good MOS compared to [1].
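The Bauer method mentioned above can be illustrated in a few lines: Cholesky-factoring a large Toeplitz matrix of autocorrelation values makes the last rows of the factor converge to the coefficients of the causal, minimum-phase spectral factor. This sketch uses a toy MA(1) spectrum rather than MVDR estimates; the matrix size and the toy filter are assumptions.

```python
import numpy as np
from scipy.linalg import toeplitz, cholesky

def bauer_spectral_factor(r, order, size=200):
    """Bauer method sketch: Cholesky-factor a large symmetric Toeplitz
    matrix built from autocorrelations; the trailing entries of the last
    row of the lower factor converge to the causal spectral factor."""
    r = np.concatenate([r, np.zeros(size - len(r))])
    T = toeplitz(r)                     # positive definite for a valid spectrum
    L = cholesky(T, lower=True)
    return L[-1, -(order + 1):][::-1]   # causal filter coefficients h[0..order]

# Toy spectrum: autocorrelation of the MA(1) filter h = [1, 0.5]
# r[0] = 1 + 0.25 = 1.25, r[1] = 0.5
r = np.array([1.25, 0.5])
h_est = bauer_spectral_factor(r, order=1)   # should recover [1.0, 0.5]
```

In the paper's setting the autocorrelations would come from the MVDR spectrum, and the recovered factor provides the causal inverse and forward filters.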


Journal of Sol-Gel Science and Technology | 1997

Pseudo Complex Cepstrum Using Discrete Cosine Transform

R. Muralishankar; A. G. Ramakrishnan

Two new algorithms are proposed that obtain a pseudo complex cepstrum using the Discrete Cosine Transform (DCT); we call this the Discrete Cosine Transformed Cepstrum (DCTC). The first algorithm uses the relation between the Discrete Fourier Transform (DFT) and the DCT. Computing the complex cepstrum using the Fourier transform requires the unwrapped phase, whose calculation is difficult whenever multiple zeros and poles occur near or on the unit circle. Since the DCT is real-valued, its phase can only be 0 or π, and the phase is unwrapped by representing the negative sign by exp(−jπ) and the positive sign by exp(j0). The second algorithm obviates the need for the DFT and obtains the DCTC by representing the DCT sequence itself by magnitude and phase components, with the phase unwrapped in the same way. We have tested the DCTC on a simulated system that has multiple poles and zeros near or on the unit circle. The results show that the DCTC matches the theoretical complex cepstrum more closely than the DFT-based complex cepstrum. We have explored possible uses of the DCTC in obtaining the pitch contour of syllables, words and sentences. The spectral envelope obtained from the first few coefficients matches reasonably with the envelope of the signal spectrum, and can thus be used in applications where faithful reproduction of the spectral envelope is not critical. We also examine the utility of the DCTC as a feature set for speaker identification; the identification rate with DCTC features was higher than that with linear prediction-derived cepstral coefficients.
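The second algorithm's key idea, treating the sign of each real DCT coefficient as an already-unwrapped 0/π phase, can be sketched as below. This is a minimal reading of the abstract, not the authors' code: the inverse transform choice and the log floor are assumptions.

```python
import numpy as np
from scipy.fft import dct, idct

def dctc(x, eps=1e-12):
    """Pseudo complex cepstrum via DCT: split the real DCT sequence into
    magnitude and a 0/pi phase (from the coefficient signs), take the
    complex logarithm, and invert with an inverse DCT."""
    C = dct(x, type=2, norm='ortho')
    magnitude = np.abs(C) + eps               # floor to avoid log(0)
    phase = np.where(C < 0, np.pi, 0.0)       # sign -> unwrapped 0/pi phase
    log_spectrum = np.log(magnitude) + 1j * phase
    # Invert real and imaginary parts separately to form the pseudo cepstrum
    return (idct(log_spectrum.real, type=2, norm='ortho')
            + 1j * idct(log_spectrum.imag, type=2, norm='ortho'))

# Example: pseudo complex cepstrum of a 64-point Hamming window
cep = dctc(np.hamming(64))
```

Because the phase is already 0 or π, the troublesome phase-unwrapping step of the DFT-based complex cepstrum is avoided entirely.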


IEEE Region 10 Conference (TENCON) | 2003

Subspace and hypothesis based effective segmentation of co-articulated basic-units for concatenative speech synthesis

R. Muralishankar; Ryali Srikanth; A. G. Ramakrishnan

In this paper, we present two new methods for vowel-consonant segmentation of the co-articulated basic-units employed in our Thirukkural Tamil text-to-speech synthesis system (G. L. Jayavardhana Rama et al., IEEE Workshop on Speech Synthesis, 2002). The basic-units considered here are CV, VC, VCV, VCCV and VCCC, where C stands for a consonant and V for any vowel. The first method is a subspace-based approach using oriented principal component analysis (OPCA), in which the test feature vectors are projected onto the V and C subspaces. The crossover of the norm contours obtained by projecting the test basic-unit onto the V and C subspaces gives the segmentation points, which in turn identify the V and C durations of a test basic-unit. In the second method, we use probabilistic principal component analysis (PPCA) to obtain probability models for V and C, and then apply the Neyman-Pearson (NP) test to segment the basic-unit into V and C. Finally, we show that this hypothesis test reduces to an energy detector for V-C segmentation, similar to the first method.
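The crossover-of-norm-contours idea in the first method can be sketched with plain PCA on synthetic features (the paper uses OPCA on speech features; the toy data, feature dimensions, and plain-PCA substitution are all assumptions made here for illustration):

```python
import numpy as np

def subspace_basis(frames, k):
    """Top-k principal directions of a (num_frames x dim) feature matrix."""
    centered = frames - frames.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T                               # dim x k orthonormal basis

def segment_vc(test_frames, v_basis, c_basis):
    """Projection-norm contours onto the V and C subspaces; the first
    frame where the C-norm exceeds the V-norm is the segmentation point."""
    v_norm = np.linalg.norm(test_frames @ v_basis, axis=1)
    c_norm = np.linalg.norm(test_frames @ c_basis, axis=1)
    crossings = np.nonzero(c_norm > v_norm)[0]
    return crossings[0] if len(crossings) else len(test_frames)

# Toy data: vowel energy in the first two feature dims, consonant in the last two
rng = np.random.default_rng(0)
v_train = rng.normal(size=(50, 4)) * np.array([3, 3, 0.1, 0.1])
c_train = rng.normal(size=(50, 4)) * np.array([0.1, 0.1, 3, 3])
v_basis = subspace_basis(v_train, 2)
c_basis = subspace_basis(c_train, 2)

# A "VC" unit: 10 vowel-like frames followed by 10 consonant-like frames
test = np.vstack([v_train[:10], c_train[:10]])
boundary = segment_vc(test, v_basis, c_basis)     # crossover near frame 10
```

The boundary frame then yields the V and C durations of the basic-unit, exactly the quantity the synthesis system needs.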


International Conference on Neural Information Processing | 2004

Wavelet-Based Estimation of Hemodynamic Response Function

Ryali Srikanth; R. Muralishankar; A. G. Ramakrishnan

We present a new algorithm to estimate the hemodynamic response function (HRF) and drift component in the wavelet domain. The HRF is modeled as a Gaussian function with unknown parameters, and functional magnetic resonance imaging (fMRI) noise is modeled as fractional Brownian motion (fBm). The HRF parameters are estimated in the wavelet domain, since a wavelet transform with a sufficient number of vanishing moments decorrelates an fBm process. Owing to this decorrelating property, the noise covariance matrix in the wavelet domain can be assumed diagonal, with entries estimated using sample variance estimators at each scale. We study the influence of sampling time and of the shape assumption on estimation performance. Results are presented by adding synthetic HRFs to null fMRI data.
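A rough sketch of the estimation idea follows: fit the Gaussian HRF parameters by least squares on wavelet coefficients, weighting each scale by the inverse of its sample variance (the diagonal-covariance assumption). The Haar wavelet, the specific loss, and the toy data are assumptions of this sketch, not the paper's choices (Haar has only one vanishing moment, so the paper's decorrelation argument would favor a longer wavelet).

```python
import numpy as np
from scipy.optimize import minimize

def haar_dwt(x, levels):
    """Orthonormal Haar DWT: list of detail coefficients per scale + approx."""
    details, approx = [], np.asarray(x, float)
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2))
        approx = (even + odd) / np.sqrt(2)
    return details, approx

def gaussian_hrf(t, amp, peak, width):
    return amp * np.exp(-0.5 * ((t - peak) / width) ** 2)

def fit_hrf_wavelet(t, y, levels=3):
    """Weighted least squares on Haar wavelet coefficients; each scale is
    weighted by the inverse of its sample variance (diagonal covariance)."""
    d_y, a_y = haar_dwt(y, levels)
    weights = [1.0 / (np.var(d) + 1e-12) for d in d_y]

    def loss(params):
        d_m, a_m = haar_dwt(gaussian_hrf(t, *params), levels)
        err = sum(w * np.sum((dy - dm) ** 2)
                  for w, dy, dm in zip(weights, d_y, d_m))
        return err + np.sum((a_y - a_m) ** 2)

    return minimize(loss, x0=[1.0, t.mean(), 1.0], method='Nelder-Mead').x

# Synthetic HRF (amp 2.0, peak 6.0 s, width 1.5 s) plus white noise
t = np.linspace(0, 16, 128)
rng = np.random.default_rng(1)
y = gaussian_hrf(t, 2.0, 6.0, 1.5) + 0.05 * rng.normal(size=t.size)
amp, peak, width = fit_hrf_wavelet(t, y)
```

With fBm noise, the per-scale variance weighting is what replaces a full (non-diagonal) noise covariance in the time domain.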


International Conference on Acoustics, Speech, and Signal Processing | 2002

DCT based pseudo complex cepstrum

R. Muralishankar; A. G. Ramakrishnan


IEEE Signal Processing Workshop on Statistical Signal Processing | 2003

Subspace based vowel-consonant segmentation

R. Muralishankar; Vijaya A Krishna; A. G. Ramakrishnan


Conference of the International Speech Communication Association | 2002

Warped-LP residual resampling using DCT for pitch modification

R. Muralishankar; A. G. Ramakrishnan; P. Prathibha


Conference of the International Speech Communication Association | 2004

Time-scaling of speech using independent subspace analysis

R. Muralishankar; A. G. Ramakrishnan; Lakshmish Kaushik

Collaboration


Dive into R. Muralishankar's collaborations.

Top Co-Authors

A. G. Ramakrishnan (Indian Institute of Science)
P. Prathibha (Indian Institute of Science)
Ryali Srikanth (Indian Institute of Science)
M. Ravi Shanker (Indian Institute of Science)
R. Prathibha (Indian Institute of Science)
Lakshmish Kaushik (University of Texas at Dallas)