
Publications


Featured research published by Visarut Ahkuputra.


Asia Pacific Conference on Circuits and Systems | 1998

Thai syllable segmentation for connected speech based on energy

N. Jittiwarangkul; Somchai Jitapunkul; S. Luksaneeyanavin; Visarut Ahkuputra; Chai Wutiwiwatchai

This paper proposes a novel technique based on the local maximum and minimum energy contour to segment syllables of connected speech. The energy contour was obtained from five energy algorithms: the absolute energy, the root-mean-square energy, the square energy, the Teager energy, and the modified Teager energy. The experiment was conducted on 36 utterances from 11 speakers (7 males and 4 females) to evaluate the empirical threshold and parameter values for each energy algorithm. The proposed technique based on local maximum and minimum energy performed better than the commonly used endpoint detection technique based on a level equalizer proposed by Lamel et al. [1981].
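
A minimal sketch of the segmentation idea described above, assuming a frame-based Teager energy contour; the frame length, hop size, and threshold below are illustrative values, not the paper's settings.

```python
import numpy as np

def teager_energy(x):
    """Teager energy operator: psi[n] = x[n]^2 - x[n-1]*x[n+1]."""
    psi = np.zeros_like(x, dtype=float)
    psi[1:-1] = x[1:-1] ** 2 - x[:-2] * x[2:]
    return psi

def energy_contour(signal, frame_len=400, hop=160):
    """Frame-averaged Teager energy (e.g., 25 ms frames, 10 ms hop at 16 kHz)."""
    psi = teager_energy(np.asarray(signal, dtype=float))
    n_frames = 1 + (len(psi) - frame_len) // hop
    return np.array([psi[i * hop : i * hop + frame_len].mean()
                     for i in range(n_frames)])

def local_extrema(contour, threshold):
    """Local maxima above threshold (syllable nuclei) and the local
    minima between consecutive maxima (candidate syllable boundaries)."""
    maxima = [i for i in range(1, len(contour) - 1)
              if contour[i] > contour[i - 1]
              and contour[i] >= contour[i + 1]
              and contour[i] > threshold]
    minima = [int(np.argmin(contour[a:b])) + a
              for a, b in zip(maxima[:-1], maxima[1:])]
    return maxima, minima
```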


Asia Pacific Conference on Circuits and Systems | 1998

Recent advances of Thai speech recognition in Thailand

Somchai Jitapunkul; Sudaporn Luksaneeyanawin; Visarut Ahkuputra; Ekkarit Maneenoi; Sawit Kasuriya; Phakapong Amornkul

This correspondence presents recent advances in speech processing development for the Thai language, especially in the speech analysis and speech recognition areas. Recent research progress is presented, including isolated Thai numeral recognition, Thai connected-word and polysyllabic-word recognition, and speech segmentation techniques. The current status, technology implementation difficulties, and future research directions of Thai speech processing are also covered.


Asia Pacific Conference on Circuits and Systems | 1998

Comparison of different techniques on Thai speech recognition

Visarut Ahkuputra; Somchai Jitapunkul; Ekkarit Maneenoi; Sawit Kasuriya; P. Amornkul

This paper presents a comparison of Thai isolated-word speech recognition techniques using the hidden Markov model, the modified backpropagation neural network, and the fuzzy-neural network. Recognition was performed on the ten isolated Thai numerals, zero to nine, under the same system configuration for all approaches. A 15-state left-to-right discrete hidden Markov model combined with vector quantization was compared with a multilayer network trained with error backpropagation, with the modified backpropagation network, and with a fuzzy-neural network of the same configuration. The recognition accuracies of the hidden Markov model, neural network, modified neural network, and fuzzy-neural network approaches are 84.250, 73.030, 78.000, and 78.300 percent, respectively.
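
For reference, a hedged sketch of the HMM branch of this comparison: scoring a vector-quantized observation sequence against a 15-state left-to-right discrete HMM with the Viterbi algorithm. The 15-state topology matches the abstract; the codebook size and transition/emission values are placeholder assumptions.

```python
import numpy as np

N_STATES, N_SYMBOLS = 15, 64   # 15-state left-to-right model; 64-entry VQ codebook (assumed)

# Left-to-right transitions: each state may stay or advance one state.
A = np.zeros((N_STATES, N_STATES))
for i in range(N_STATES - 1):
    A[i, i], A[i, i + 1] = 0.6, 0.4
A[-1, -1] = 1.0

rng = np.random.default_rng(0)
B = rng.dirichlet(np.ones(N_SYMBOLS), size=N_STATES)  # toy emission probabilities

def viterbi_log_score(obs, A, B):
    """Log-likelihood of the best state path for a VQ codeword sequence."""
    logA, logB = np.log(A + 1e-12), np.log(B + 1e-12)
    delta = np.full(A.shape[0], -np.inf)
    delta[0] = logB[0, obs[0]]                      # must start in state 0
    for o in obs[1:]:
        delta = np.max(delta[:, None] + logA, axis=0) + logB[:, o]
    return delta[-1]                                # must end in the final state

# One such model is trained per digit; the digit whose model scores the
# observation sequence highest is the recognized word.
```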


Multimedia Signal Processing | 1998

Thai polysyllabic word recognition using fuzzy-neural network

Chai Wutiwiwatchai; Somchai Jitapunkul; Sudaporn Luksaneeyanawin; Visarut Ahkuputra

A fuzzy-neural network (fuzzy-NN) model is proposed for speaker-independent Thai polysyllabic word recognition. Fuzzy features converted from exact features were used as input to a multilayer perceptron (MLP) neural network. Various fuzzy membership functions based on linguistic properties were used for the fuzzy conversion and compared. Binary desired outputs were used during training. Seventy Thai words were used for system evaluation: ten numerals plus single-syllable, double-syllable, and triple-syllable words, 20 words in each group. To improve recognition accuracy, the detected number of syllables and tonal level were used to preclassify the speech. The Pi fuzzy membership function provided the best recognition accuracy among the tested functions, which also included the trapezoidal and triangular functions. Under the optimal condition, the achieved recognition error rates were 5.6% on the dependent test and 6.7% on the independent test, which were 3.3% and 3.4% lower, respectively, than those of the conventional neural network system.
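
A hedged sketch of the fuzzification step this abstract describes: converting exact feature values into membership grades before the MLP. The Pi function below is the standard Zadeh Pi built from the S-function; the centres and widths are illustrative assumptions, not the paper's parameters.

```python
import numpy as np

def s_function(x, a, c):
    """Zadeh S-function rising smoothly from 0 at a to 1 at c."""
    b = (a + c) / 2.0
    return np.where(x <= a, 0.0,
           np.where(x <= b, 2 * ((x - a) / (c - a)) ** 2,
           np.where(x <= c, 1 - 2 * ((x - c) / (c - a)) ** 2, 1.0)))

def pi_membership(x, center, width):
    """Bell-shaped Pi membership: 1 at center, 0 beyond center +/- width."""
    return np.where(x <= center,
                    s_function(x, center - width, center),
                    1 - s_function(x, center, center + width))

def triangular(x, center, width):
    """Triangular membership, one of the compared alternatives."""
    return np.clip(1 - np.abs(x - center) / width, 0.0, 1.0)

# Fuzzify one normalized feature against three assumed linguistic terms
# (low / mid / high); the resulting grades form the MLP input vector.
feature = np.array([0.42])
grades = [pi_membership(feature, c, 0.3) for c in (0.2, 0.5, 0.8)]
```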


International Journal of Computer Processing of Languages | 2003

Acoustic Modelling of Vowel Articulation on the Nine Thai Spreading Vowels

Visarut Ahkuputra; Ekkarit Maneenoi; Sudaporn Luksaneeyanawin; Somchai Jitapunkul

The Thai vowel system comprises nine short-long vowel pairs. The nine spreading vowels occur in specific locations covering the whole articulatory area, and the location of each vowel can be identified from its acoustic characteristics. Relationships between acoustic and articulatory characteristics were found in the Thai vowel system, where F1 relates to vowel height and F2 relates to vowel advancement, as in other vowel systems. The bark F1-F0 and F3-F2 differences were also found to represent vowel height and vowel advancement, respectively. Vowel height includes the high, mid, and low vowel groups, while vowel advancement includes the front, central, and back vowel groups. The previous binary classification criterion is not sufficient for a spreading vowel system. Hence, three new classification strategies are proposed for the classification of complex vowel systems: (1) classification by vowel height, (2) classification by vowel advancement, and (3) classification by combined vowel height and vowel advancement. An acoustic analysis of the Thai vowel system is conducted and each vowel is acoustically modelled by its acoustic-articulatory features. F1 and bark F1-F0 are employed in classification by vowel height; F2 and bark F3-F2 are employed in classification by vowel advancement. Both acoustic features are integrated into a single feature vector for the combined vowel height and vowel advancement classification using the Bayesian classifier. The results show that the linear frequency scale provides better classification accuracy than the bark scale in every case. The linear F1 achieves 88.52% correct classification by vowel height, while the linear F2 achieves 94.09% correct classification by vowel advancement. Together, the linear F1 and F2 achieve 86.33% correct combined classification of all the vowels. These results illustrate that the three classification strategies efficiently classify complex vowel systems using only static spectral cues. Therefore, simple linear F1 and F2 values can be used to model vowel articulation and to classify each vowel in languages with a complex vowel system such as Thai.
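
A minimal sketch of the combined-classification strategy, assuming per-vowel Gaussian models over (F1, F2) with equal priors; the real statistics come from measured formant data, and the placeholders below are assumptions.

```python
import numpy as np
from scipy.stats import multivariate_normal

def fit_models(tokens_by_vowel):
    """tokens_by_vowel: {vowel_label: (n_tokens, 2) array of (F1, F2) Hz}.
    Returns per-vowel Gaussian parameters (mean vector, covariance)."""
    return {v: (np.mean(X, axis=0), np.cov(X, rowvar=False))
            for v, X in tokens_by_vowel.items()}

def classify(f1_f2, models):
    """Equal-prior Bayesian decision: pick the vowel whose Gaussian
    gives the test token the highest log-likelihood."""
    return max(models,
               key=lambda v: multivariate_normal.logpdf(
                   f1_f2, mean=models[v][0], cov=models[v][1]))
```

The same classifier applies unchanged to the bark-difference spaces; only the feature extraction in front of it changes.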


Journal of the Acoustical Society of America | 2000

Thai monophthongs classification using CDHMM

Ekkarit Maneenoi; Somchai Jitapunkul; Visarut Ahkuputra; Sudaporn Luksaneeyanawin

This paper presents recognition of Thai monophthongal vowels. The Thai monophthongs were qualitatively recognized by a 3-state left-to-right continuous density hidden Markov model. The 18 monophthongs comprise 9 qualitatively different vowels, each with a short and a long member. LPC cepstral coefficients were used, and the temporal cepstral derivative was additionally utilized to compare the efficiency of the augmented feature with the single feature. Qualitative recognition means that short and long vowel pairs were categorized into the same model. Thai polysyllabic words were used in this research. The database consists of 2100 training phonemes from 30 speakers and 1378 testing phonemes from a different group of 20 speakers. The highest recognition rate of the single feature, obtained from 18th-order LPC cepstral coefficients, is 86.983 percent, while the recognition rate of the 16th-order LPC cepstral coefficients plus the temporal derivative is 94.580 percent. The results indicate that all the...
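
A hedged sketch of the temporal cepstral derivative appended to the static LPC cepstra, using the common regression ("delta") formula; the window size K is an assumption, not necessarily the paper's choice.

```python
import numpy as np

def delta_features(cepstra, K=2):
    """cepstra: (n_frames, n_coeffs) static coefficients.
    Regression delta: d[t] = sum_k k*(c[t+k] - c[t-k]) / (2*sum_k k^2),
    with edge frames replicated for padding."""
    padded = np.pad(cepstra, ((K, K), (0, 0)), mode="edge")
    denom = 2 * sum(k * k for k in range(1, K + 1))
    T = cepstra.shape[0]
    d = np.zeros_like(cepstra, dtype=float)
    for k in range(1, K + 1):
        d += k * (padded[K + k : K + k + T] - padded[K - k : K - k + T])
    return d / denom

# The augmented feature the abstract compares is the stack of both:
# features = np.hstack([cepstra, delta_features(cepstra)])
```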


Journal of the Acoustical Society of America | 2000

Direct classification of Thai monophthongs on two‐dimensional acoustic–phonetic feature spaces in linear, mel, bark, and bark‐difference frequency scales

Visarut Ahkuputra; Somchai Jitapunkul; Ekkarit Maneenoi; Sudaporn Luksaneeyanawin

A direct classification of the nine Thai monophthongs was conducted on two-dimensional acoustic-phonetic feature spaces. Formant frequencies of each monophthong were extracted from a stable vowel portion of the syllable nucleus. F1 and the bark-difference F1-F0 represent tongue height; F2 and the bark-difference F3-F2 represent tongue advancement. The Bayesian classifier utilizes the statistical parameters, mean and standard deviation, of each monophthong in classification. The two-dimensional feature vectors comprise two acoustic-phonetic features: F2 and F1 in the linear, mel, and bark scales, and also the bark-differences F3-F2 and F1-F0. The classification results on the F2 and F1 space are 86.3325% in linear, 84.6187% in mel, and 84.2331% in bark, with 79.3916% in bark-difference F2-F1 and F1-F0 and 79.9914% in bark-difference F3-F2 and F1-F0. Considering the confusion matrices, the high vowels (/i:, v:, u:/), front-middle vowel (/e:/), and center-middle vowel (/q:/) have higher results than other groups, which resulted from smaller ov...
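
For reference, hedged sketches of the frequency-scale conversions behind the feature spaces compared above, using standard published formulas (O'Shaughnessy's mel, Traunmüller's bark); the paper may use variant formulas.

```python
import numpy as np

def hz_to_mel(f):
    """Mel scale (O'Shaughnessy, 1987)."""
    return 2595.0 * np.log10(1.0 + f / 700.0)

def hz_to_bark(f):
    """Bark scale (Traunmüller, 1990)."""
    return 26.81 * f / (1960.0 + f) - 0.53

def bark_difference_features(f0, f1, f2, f3):
    """Bark-difference cues: F1-F0 (tongue height), F3-F2 (advancement)."""
    b = hz_to_bark(np.array([f0, f1, f2, f3], dtype=float))
    return b[1] - b[0], b[3] - b[2]

# e.g. a token with F0=120 Hz, F1=300 Hz, F2=2200 Hz, F3=3000 Hz:
height_cue, advance_cue = bark_difference_features(120, 300, 2200, 3000)
```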


Conference of the International Speech Communication Association | 2001

F0 Feature Extraction by Polynomial Regression Function for Monosyllabic Thai Tone Recognition

Patavee Charnvivit; Somchai Jitapunkul; Visarut Ahkuputra; Ekkarit Maneenoi; Boonchai Thampanitchawong
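
No abstract is available, but the title suggests fitting a polynomial regression to the syllable's F0 contour and using the coefficients as tone features. A hedged sketch under that assumption; the polynomial order, time normalization, and unvoiced-frame convention are illustrative, not the paper's settings.

```python
import numpy as np

def f0_polynomial_features(f0_contour, order=3):
    """Fit F0(t) over normalized time t in [0, 1] and return the
    polynomial coefficients as a fixed-length tone feature vector.
    Assumes unvoiced frames are marked with F0 = 0 and excluded."""
    f0 = np.asarray(f0_contour, dtype=float)
    voiced = f0 > 0
    t = np.linspace(0.0, 1.0, len(f0))
    return np.polyfit(t[voiced], f0[voiced], order)

# The coefficient vector can then feed a standard classifier over the
# five Thai lexical tones of the monosyllable.
```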


Conference of the International Speech Communication Association | 2000

Classification of Thai consonant naming using Thai tone.

Umavasee Thathong; Somchai Jitapunkul; Visarut Ahkuputra; Ekkarit Maneenoi; Boonchai Thampanitchawong


Conference of the International Speech Communication Association | 1998

A comparison of Thai speech recognition systems using hidden Markov model, neural network, and fuzzy-neural network.

Visarut Ahkuputra; Somchai Jitapunkul; Nutthacha Jittiwarangkul; Ekkarit Maneenoi; Sawit Kasuriya
