Ananlada Chotimongkol
Carnegie Mellon University
Publication
Featured research published by Ananlada Chotimongkol.
meeting of the association for computational linguistics | 1998
Surapant Meknavin; Boonserm Kijsirikul; Ananlada Chotimongkol; Cholwich Nuttee
For languages that have no explicit word boundaries, such as Thai, Chinese and Japanese, correcting words in text is harder than in English because of additional ambiguities in locating error words. The traditional method handles this by hypothesizing that every substring in the input sentence could be an error word and trying to correct all of them. In this paper, we propose the idea of reducing the scope of spelling correction by focusing only on dubious areas in the input sentence. Boundaries of these dubious areas can be obtained approximately by applying a word segmentation algorithm and finding word sequences with low probability. To generate the candidate correction words, we used a modified edit distance which reflects the characteristics of Thai OCR errors. Finally, a part-of-speech trigram model and the Winnow algorithm are combined to determine the most probable correction.
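As a rough illustration of an edit distance that reflects OCR errors, the sketch below lowers the substitution cost for character pairs that are easily confused. It is not the paper's formulation: the function names, the cost values, and the Latin-character confusion table are all hypothetical stand-ins for confusion statistics estimated from real OCR output.

```python
def weighted_edit_distance(observed, candidate, sub_cost=None, indel_cost=1.0):
    """Levenshtein distance with per-pair substitution costs.

    sub_cost maps (observed_char, intended_char) to a cost; pairs that are
    not in the table cost 1.0. The table is a hypothetical stand-in for
    OCR confusion statistics.
    """
    sub_cost = sub_cost or {}
    rows, cols = len(observed) + 1, len(candidate) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = i * indel_cost
    for j in range(1, cols):
        dist[0][j] = j * indel_cost
    for i in range(1, rows):
        for j in range(1, cols):
            a, b = observed[i - 1], candidate[j - 1]
            substitution = 0.0 if a == b else sub_cost.get((a, b), 1.0)
            dist[i][j] = min(
                dist[i - 1][j] + indel_cost,        # delete from observed
                dist[i][j - 1] + indel_cost,        # insert into observed
                dist[i - 1][j - 1] + substitution,  # match or substitute
            )
    return dist[-1][-1]


def rank_candidates(observed, lexicon, sub_cost=None):
    """Order lexicon words from closest to farthest under the weighted distance."""
    return sorted(lexicon, key=lambda w: weighted_edit_distance(observed, w, sub_cost))


# Assumed confusion costs for illustration only (not Thai, not measured).
CONFUSION = {("1", "l"): 0.2, ("0", "o"): 0.2}
ranked = rank_candidates("c0de", ["cube", "code", "cod"], CONFUSION)
```

In a full system, the top-ranked candidates would then be rescored by a language model or classifier, as the abstract describes; that stage is not shown here.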
meeting of the association for computational linguistics | 2005
Dilek Hakkani-Tür; Gokhan Tur; Ananlada Chotimongkol
In this paper, we introduce a new data representation format for language processing, the syntactic and semantic graphs (SSGs), and show its use for call classification in spoken dialog systems. For each sentence or utterance, these graphs include lexical information (words), syntactic information (such as the part of speech tags of the words and the syntactic parse of the utterance), and semantic information (such as the named entities and semantic role labels). In our experiments, we used written language as the training data while computing SSGs and tested on spoken language. In spite of this mismatch, we have shown that this is a very promising approach for classifying complex examples, and by using SSGs it is possible to reduce the call classification error rate by 4.74% relative.
empirical methods in natural language processing | 2008
Ananlada Chotimongkol; Alexander I. Rudnicky
We describe an approach for acquiring the domain-specific dialog knowledge required to configure a task-oriented dialog system that uses human-human interaction data. The key aspects of this problem are the design of a dialog information representation and a learning approach that supports capture of domain information from in-domain dialogs. To represent a dialog for a learning purpose, we based our representation, the form-based dialog structure representation, on an observable structure. We show that this representation is sufficient for modeling phenomena that occur regularly in several dissimilar task-oriented domains, including information-access and problem-solving. With the goal of ultimately reducing human annotation effort, we examine the use of unsupervised learning techniques in acquiring the components of the form-based representation (i.e. task, subtask, and concept). These techniques include statistical word clustering based on mutual information and Kullback-Leibler distance, TextTiling, HMM-based segmentation, and bisecting K-means document clustering. With some modifications to make these algorithms more suitable for inferring the structure of a spoken dialog, the unsupervised learning algorithms show promise.
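To make the Kullback-Leibler distance mentioned above concrete, here is a minimal sketch that compares two discrete distributions, for example the context-word distributions of two words being considered for the same cluster. The abstract does not say whether a symmetric variant or any smoothing is used, so both the symmetrization and the additive smoothing below are assumptions, and the function names are hypothetical.

```python
import math


def smooth(counts, alpha=1.0):
    """Turn raw counts into probabilities with additive smoothing (an assumption)."""
    total = sum(counts) + alpha * len(counts)
    return [(c + alpha) / total for c in counts]


def kl_divergence(p, q):
    """D(p || q) in nats; q must be nonzero wherever p is nonzero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def symmetric_kl(p, q):
    """Symmetrized divergence, usable as a dissimilarity score between words."""
    return kl_divergence(p, q) + kl_divergence(q, p)


# Toy context counts for two words over the same four context positions.
word_a = smooth([8, 1, 0, 3])
word_b = smooth([7, 2, 1, 2])
score = symmetric_kl(word_a, word_b)
```

A clustering procedure could merge the pair of words with the smallest score at each step; the merge strategy itself is outside what the abstract specifies.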
asia pacific conference on circuits and systems | 1998
Surapant Meknavin; Boonserm Kijsirikul; Ananlada Chotimongkol; C. Nuttee
Owing to specific characteristics of Thai, Thai OCR errors frequently depend on nearby characters. To capture this characteristic of Thai OCR errors more appropriately, we propose the idea of using a varied n-gram of the character confusion probability for scoring approximately matched words. The value of n depends on the characteristics of each character. For languages which have no explicit word boundaries, word boundary ambiguity has to be resolved before correcting errors. In this paper, a maximal matching algorithm is used instead of a more complicated word segmentation algorithm to reduce time complexity. Finally, a hybrid method which combines a part-of-speech trigram model with the Winnow algorithm is used to select the most probable correction.
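The sketch below illustrates one common reading of maximal matching: among all dictionary-based segmentations, choose one with the fewest words. The abstract does not specify which variant the authors use, so this is an assumption, as are the function name, the toy Latin-character lexicon, and the choice to treat out-of-lexicon characters as single-character tokens.

```python
def maximal_matching(text, lexicon):
    """Segment text into the fewest tokens using words from lexicon.

    Characters that cannot be covered by any lexicon word become
    single-character tokens (an assumption made for this sketch).
    """
    n = len(text)
    max_len = max((len(w) for w in lexicon), default=1)
    best_count = [0] + [float("inf")] * n  # fewest tokens covering text[:i]
    back = [0] * (n + 1)                   # start index of the last token
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            piece = text[start:end]
            if piece in lexicon or end - start == 1:
                if best_count[start] + 1 < best_count[end]:
                    best_count[end] = best_count[start] + 1
                    back[end] = start
    # Follow the back-pointers from the end of the text to recover tokens.
    tokens = []
    end = n
    while end > 0:
        start = back[end]
        tokens.append(text[start:end])
        end = start
    return tokens[::-1]


# Toy lexicon: a greedy longest-match pass would take "abc" first and need
# three tokens, while the fewest-words segmentation needs two.
segments = maximal_matching("abcde", {"ab", "abc", "cd", "cde", "d"})
```

When several segmentations tie on word count, this sketch keeps the first one found; a real system would need an explicit tie-breaking rule.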
conference of the international speech communication association | 2001
Ananlada Chotimongkol; Alexander I. Rudnicky
conference of the international speech communication association | 2000
Ananlada Chotimongkol; Alan W. Black
Archive | 2008
Ananlada Chotimongkol
Archive | 1999
Thatsanee Charoenporn; Ananlada Chotimongkol; Virach Sornlertlamvanich
conference of the international speech communication association | 2005
Rong Zhang; Ziad Al Bawab; Arthur Chan; Ananlada Chotimongkol; David Huggins-Daines; Alexander I. Rudnicky
conference of the international speech communication association | 2002
Ananlada Chotimongkol; Alexander I. Rudnicky
Collaboration
Thailand National Science and Technology Development Agency