Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Nattanun Thatphithakkul is active.

Publication


Featured research published by Nattanun Thatphithakkul.


2011 International Conference on Speech Database and Assessments (Oriental COCOSDA) | 2011

Mongolian speech corpus for text-to-speech development

Chatchawarn Hansakunbuntheung; Ausdang Thangthai; Nattanun Thatphithakkul; Altangerel Chagnaa

This paper presents a first attempt to develop a Mongolian speech corpus designed for data-driven speech synthesis in Mongolia. The aim of the corpus is to support a high-quality Mongolian TTS system for blind users working with a screen reader. The corpus contains nearly 6 hours of Mongolian speech. It provides Cyrillic text transcriptions and their phonetic transcriptions with stress marking, as well as context information, including phone context, stress level, and syntactic position within word, phrase, and utterance, for modeling speech acoustics and characteristics for speech synthesis.


Mathematical Problems in Engineering | 2014

Language Recognition Using Latent Dynamic Conditional Random Field Model with Phonological Features

Sirinoot Boonsuk; Atiwong Suchato; Proadpran Punyabukkana; Chai Wutiwiwatchai; Nattanun Thatphithakkul

Spoken language recognition (SLR) has been of increasing interest in multilingual speech recognition for identifying the languages of speech utterances. Most existing SLR approaches apply statistical modeling techniques with acoustic and phonotactic features. Among the popular approaches, the acoustic approach has attracted the most interest because it does not require prior language-specific knowledge. Previous research on the acoustic approach has made little use of linguistic knowledge, employing it only as supplementary features, while the current state-of-the-art system assumes independence among features. This paper proposes an SLR system based on the latent-dynamic conditional random field (LDCRF) model using phonological features (PFs). We use PFs to represent both acoustic characteristics and linguistic knowledge. The LDCRF model is employed to capture the dynamics of PF sequences for language classification. Baseline systems were built to evaluate the features and methods, including Gaussian mixture model (GMM) systems using PFs, GMM systems using cepstral features, and a CRF model using PFs. Evaluated on the NIST LRE 2007 corpus, the proposed method showed an improvement over the baseline systems and a result comparable to an i-vector-based acoustic system. This research demonstrates that utilizing PFs can enhance performance.
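The acoustic approach described above can be illustrated with a toy sketch: score a sequence of feature vectors under per-language models and pick the language with the highest total log-likelihood. The feature dimensions, language names, and Gaussian parameters below are all hypothetical stand-ins; the paper's actual system uses an LDCRF over phonological-feature streams, which is considerably more involved than this single-Gaussian illustration.

```python
import math

# Toy per-language model: one diagonal Gaussian per language over
# (hypothetical) 3-dimensional phonological-feature vectors.
MODELS = {
    "thai":    {"mean": [0.8, 0.1, 0.3], "var": [0.2, 0.2, 0.2]},
    "english": {"mean": [0.2, 0.7, 0.5], "var": [0.2, 0.2, 0.2]},
}

def log_gauss(x, mean, var):
    """Log-density of x under a diagonal Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def recognize(frames):
    """Classify an utterance by its total per-frame log-likelihood."""
    scores = {
        lang: sum(log_gauss(f, p["mean"], p["var"]) for f in frames)
        for lang, p in MODELS.items()
    }
    return max(scores, key=scores.get)

frames = [[0.75, 0.15, 0.3], [0.85, 0.05, 0.35]]
print(recognize(frames))  # frames lie near the "thai" mean
```

A sequence model such as a CRF or LDCRF replaces the frame-independence assumption made here with transition structure over the feature sequence.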


International Symposium on Communications and Information Technologies | 2007

Tree-structured model selection and simulated-data adaptation for environmental and speaker robust speech recognition

Nattanun Thatphithakkul; Boontee Kruatrachue; Chai Wutiwiwatchai; Sanparith Marukatat; Vataya Boonpiam

This paper proposes the use of tree-structured model selection and simulated data in maximum likelihood linear regression (MLLR) adaptation for environment- and speaker-robust speech recognition. The objective of this work is to address two major problems in robust speech recognition: unknown speakers and unknown environmental noise. The proposed solution is composed of two components. The first is a tree-structured model for selecting the speaker-dependent model that best matches the input speech. The second uses simulated data to adapt the selected acoustic model to the unknown noise. The proposed technique can thus alleviate both problems simultaneously. Experimental results show that the proposed system achieves a higher recognition rate than both a system using only the input speech in adaptation and a system using a multi-conditioned acoustic model.
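The selection step can be sketched as a greedy descent of a model tree: at each node, score the input features against each child's model and descend toward the best match. Everything below is a hypothetical stand-in (node names, two-dimensional "models" reduced to mean vectors, a distance-based score in place of a real acoustic likelihood), not the paper's actual tree or models.

```python
# Minimal sketch of tree-structured model selection.
class Node:
    def __init__(self, name, mean, children=()):
        self.name = name
        self.mean = mean              # stand-in for the node's acoustic model
        self.children = list(children)

def score(mean, features):
    # Stand-in for model likelihood: negative squared distance to the mean.
    return -sum((f - m) ** 2 for f, m in zip(features, mean))

def select_model(root, features):
    """Greedily descend toward the child whose model scores the input best."""
    node = root
    while node.children:
        node = max(node.children, key=lambda c: score(c.mean, features))
    return node.name

tree = Node("root", [0.5, 0.5], [
    Node("clean", [0.2, 0.2], [Node("spk_A", [0.1, 0.2]),
                               Node("spk_B", [0.3, 0.2])]),
    Node("noisy", [0.8, 0.8], [Node("spk_C", [0.7, 0.9]),
                               Node("spk_D", [0.9, 0.7])]),
])
print(select_model(tree, [0.72, 0.88]))  # descends noisy -> spk_C
```

The selected leaf model would then be adapted to the unknown noise with MLLR on simulated data, per the paper's second component.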


International Conference on Signal Processing | 2008

Robust speech recognition using noise-cluster HMM interpolation

Nattanun Thatphithakkul; Boontee Kruatrachue; Chai Wutiwiwatchai; Sanparith Marukatat; Vataya Boonpiam

This paper proposes a novel approach called noise-cluster HMM interpolation for robust speech recognition. The approach alleviates the problem of recognizing speech in noisy environments not seen during training. In this method, a new HMM is interpolated from the existing noisy-speech HMMs that best match the input speech. This process is performed on the fly with an acceptable delay, so there is no need to prepare and store the final model in advance. Interpolation weights among the HMMs can be determined by either a direct or a tree-structured search. Evaluated on speech in unseen noisy environments, the proposed method clearly outperforms a baseline system whose acoustic model for such environments is selected from a tree structure.
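The core interpolation step can be sketched as a weighted combination of the matched models' parameters. As a simplification, the sketch below reduces each HMM to a single Gaussian mean vector and assumes the weights are already given (e.g. from normalized match scores); a real implementation would interpolate full HMM parameter sets.

```python
# Hedged sketch of noise-cluster model interpolation: the new model's
# mean is a weighted combination of the best-matching cluster models.
def interpolate_means(models, weights):
    """models: list of mean vectors; weights: match-based weights."""
    total = sum(weights)
    norm = [w / total for w in weights]      # normalize so weights sum to 1
    dim = len(models[0])
    return [sum(w * m[d] for w, m in zip(norm, models)) for d in range(dim)]

# Two hypothetical noise-cluster models matched to the input speech.
car    = [1.0, 2.0, 3.0]
babble = [3.0, 2.0, 1.0]
print(interpolate_means([car, babble], [0.75, 0.25]))  # [1.5, 2.0, 2.5]
```

Because the combination is computed from already-trained cluster models, it can run on the fly, which is what removes the need to store a pre-built model for every possible noise condition.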


IEEE Region 10 Conference | 2006

A speech sectioning tool for corpus development

Patcharika Cotsomrong; Nattanun Thatphithakkul; Kwanchiva Saykham; Vataya Boonpiam; Chai Wutiwiwatchai; Treepop Sunpetchniyom

This paper proposes a speech segmentation technique that uses the zero-crossing (ZC) rate together with the average absolute amplitude to classify speech and non-speech by analyzing silence (sil) positions and detecting end points. To segment a speech file automatically, we determine the silence and speech positions to specify the sectioning points, then compare the length of each sectioned file with the approximate phone length of that file. This method handles the variable lengths of silence (sil) and short pause (sp) within a sentence, for example, a silence shorter than a short pause or a short pause longer than a silence. The proposed method is evaluated on two types of data: 1) data with consistent silence and short-pause lengths and 2) data with varying silence and short-pause lengths. Experimental results show that the accuracy of automatic speech sectioning is acceptable.
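The per-frame speech/non-speech decision described above can be sketched directly: a frame is treated as speech if its average absolute amplitude is high, or if it is low-energy but has a high zero-crossing rate (as in unvoiced fricatives). The thresholds and sample values below are hypothetical, not taken from the paper.

```python
# Rough sketch of frame classification by zero-crossing rate and
# average absolute amplitude (hypothetical thresholds).
def zero_crossings(frame):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))

def avg_abs_amplitude(frame):
    return sum(abs(s) for s in frame) / len(frame)

def is_speech(frame, amp_thresh=0.05, zc_thresh=None):
    if zc_thresh is None:
        zc_thresh = len(frame) // 4       # assumed heuristic threshold
    return (avg_abs_amplitude(frame) > amp_thresh
            or zero_crossings(frame) > zc_thresh)

silence = [0.001, 0.002, 0.001, 0.002] * 40        # low energy, no crossings
voiced  = [0.3, 0.5, 0.4, -0.2, -0.4, -0.3] * 27   # high energy
print(is_speech(silence), is_speech(voiced))
```

Runs of non-speech frames would then be examined for length to distinguish a full silence (sil) from a short pause (sp) when placing sectioning points.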


Conference of the International Speech Communication Association | 2008

T-tilt: a modified tilt model for F0 analysis and synthesis in tonal languages.

Ausdang Thangthai; Nattanun Thatphithakkul; Chai Wutiwiwatchai; Anocha Rugchatjaroen; Sittipong Saychum


Conference of the International Speech Communication Association | 2006

A simulated-data adaptation technique for robust speech recognition.

Nattanun Thatphithakkul; Boontee Kruatrachue; Chai Wutiwiwatchai; Sanparith Marukatat; Vataya Boonpiam


Archive | 2006

Robust Speech Recognition Using PCA-Based Noise Classification

Nattanun Thatphithakkul; Boontee Kruatrachue; Chai Wutiwiwatchai; Sanparith Marukatat; Vataya Boonpiam


Conference of the International Speech Communication Association | 2009

Optimization of t-tilt F0 modeling.

Ausdang Thangthai; Anocha Rugchatjaroen; Nattanun Thatphithakkul; Ananlada Chotimongkol; Chai Wutiwiwatchai


Conference of the International Speech Communication Association | 2008

Thai named-entity recognition using class-based language modeling on multiple-sized subword units.

Kwanchiva Saykhum; Vataya Boonpiam; Nattanun Thatphithakkul; Chai Wutiwiwatchai; Cholwich Nattee

Collaboration


Dive into Nattanun Thatphithakkul's collaborations.

Top Co-Authors


Boontee Kruatrachue

King Mongkut's Institute of Technology Ladkrabang


Anocha Rugchatjaroen

Thailand National Science and Technology Development Agency


Chatchawarn Hansakunbuntheung

Thailand National Science and Technology Development Agency
