Minh-Quoc Nghiem
Ho Chi Minh City University of Science
Publication
Featured research published by Minh-Quoc Nghiem.
Polibits | 2011
Keisuke Yokoi; Minh-Quoc Nghiem; Yuichiroh Matsubayashi; Akiko Aizawa
We found a way to use mathematical search to provide better navigation for reading papers on computers. Since the surface form of a mathematical expression is ambiguous, it is necessary to consider not only the expression itself but also the text around it. We present how to extract natural language descriptions, such as variable names or function definitions, that refer to mathematical expressions, and we report various experimental results. We first define an extraction task and construct a reference dataset of 100 Japanese scientific papers by hand. We then propose two methods for the extraction task, one based on pattern matching and one based on machine learning. Experiments on the reference set show the effectiveness of the proposed methods.
arXiv: Information Retrieval | 2014
Minh-Quoc Nghiem; Giovanni Yoko Kristianto; Goran Topić; Akiko Aizawa
Mathematical content is a valuable information source, and retrieving this content has become an important issue. This paper compares two search strategies for math expressions: presentation-based and content-based approaches. Presentation-based search uses a state-of-the-art math search system, while content-based search uses semantic enrichment to convert math expressions into their content forms and then searches over these content-based expressions. By considering the meaning of math expressions, the content-based approach improves search quality over presentation-based systems.
exploiting semantic annotations in information retrieval | 2012
Giovanni Yoko Kristianto; Goran Topić; Minh-Quoc Nghiem; Akiko Aizawa
In recent years, growing numbers of scientific papers have been published in XML format, generating a large published base of MathML-style formulas. Although these formulas can be indexed and searched based on their XML tree structures, they generally lack sufficient information for semantic interpretation. We propose an annotation design for linking mathematical formulas to natural language descriptions in the surrounding text. We also introduce potential applications for this annotation framework.
symposium on information and communication technology | 2016
Nhi-Thao Tran; Viet-Thang Luong; Ngan Luu-Thuy Nguyen; Minh-Quoc Nghiem
We propose a novel model that applies an extension of the Long Short-Term Memory neural network to the sentence compression task. Our model concentrates on only the most relevant context of each word to avoid redundant information. It is based on two models that have recently been used successfully in neural machine translation. The first is the Bidirectional model, which can be trained using all available input information from both the past and the future. The second is the Attention model, which focuses not only on the information of the whole sentence but also on the particular context of each word in the sentence. Experimental results show that our model significantly outperforms the recent state-of-the-art methods, the Bidirectional model, and the Attention model on the Google sentence compression dataset.
knowledge and systems engineering | 2015
Nhi-Thao Tran; Van-Giau Ung; An-Vinh Luong; Minh-Quoc Nghiem; Ngan Luu-Thuy Nguyen
This paper proposes an approach to sentence compression that requires only part-of-speech information. The method is based on an observation about human compression: adjacent words that form a meaning chunk are usually removed or retained together. We incorporate meaning chunks as a feature in a CRF-based sequence labeling system. Experimental results on English and Vietnamese compression datasets show that the proposed approach achieves better performance than the state-of-the-art systems.
arXiv: Digital Libraries | 2013
Minh-Quoc Nghiem; Giovanni Yoko Kristianto; Goran Topić; Akiko Aizawa
In this paper, we present a new approach to the problem of semantic enrichment of mathematical expressions. Our approach combines statistical machine translation with disambiguation that makes use of the text surrounding the mathematical expressions. We first use a Support Vector Machine classifier to disambiguate mathematical terms using both their presentation form and the surrounding text. We then use the disambiguation result to enhance the semantic enrichment performed by a statistical-machine-translation-based system. Experimental results show that our system achieves improvements over prior systems.
International Conference on Computational Social Networks | 2016
Bao-Dai Nguyen-Hoang; Quang-Vinh Ha; Minh-Quoc Nghiem
In recent years, many studies have addressed problems in sentiment analysis at different levels, and building aspect-based methods has become a central issue for deep opinion mining. However, previous studies need two separate modules: one to extract aspect-sentiment word pairs and another to predict the sentiment polarity. In this paper, we use Restricted Boltzmann Machines in combination with a Word Embedding model to build a joint model that not only extracts the aspect terms that appear and classifies them into their respective categories, but also performs the sentiment polarity prediction task. The experimental results show that our method performs better than other state-of-the-art approaches on aspect-based sentiment analysis tasks.
knowledge and systems engineering | 2015
Van-Giau Ung; An-Vinh Luong; Nhi-Thao Tran; Minh-Quoc Nghiem
The aim of multi-document summarization is to produce an abridged version that contains the important information from a set of documents on the same topic. This paper describes an approach that incorporates a set of word-level and sentence-level features to extract important sentences from the input documents for a Vietnamese news multi-document summarization system. The summaries are then evaluated automatically using the ROUGE measure. The results indicate that this approach produces good summaries and is appropriate for Vietnamese as well as for other languages with limited linguistic resources.
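As a rough illustration of the automatic evaluation mentioned in this abstract, the sketch below computes ROUGE-N recall, the fraction of reference n-grams that also occur in the candidate summary. The abstract does not say which ROUGE variant or tokenization was used, so the function name `rouge_n_recall`, the whitespace tokenization, and the sample sentences are assumptions made for illustration only.

```python
from collections import Counter


def rouge_n_recall(candidate: str, reference: str, n: int = 1) -> float:
    """ROUGE-N recall: clipped n-gram overlap divided by reference n-gram count.

    Assumption: lowercase whitespace tokenization; real evaluations
    typically use language-specific tokenization.
    """
    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference n-gram is matched at most as often as it occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


# Hypothetical sentences, not taken from the paper's dataset.
reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
print(rouge_n_recall(candidate, reference, n=1))
print(rouge_n_recall(candidate, reference, n=2))
```

Recall is used here because ROUGE was designed around coverage of the reference summary; precision and F-measure variants follow the same counting scheme with a different denominator.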
knowledge and systems engineering | 2015
An-Vinh Luong; Nhi-Thao Tran; Van-Giau Ung; Minh-Quoc Nghiem
Multi-Sentence Compression (MSC) is a task whose goal is to produce a short, single-sentence summary from a group of similar sentences. This paper presents a new re-ranking method based on frequent word extraction, along with our modifications to a word graph-based MSC approach to reduce incorrect output. Compression candidates are re-ranked according to the number of frequent words they contain to select the most relevant output. Results of automatic evaluations performed on English and Vietnamese datasets show that the proposed method markedly improves the informativity of the generated compressions.
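The re-ranking idea in this abstract can be sketched as follows: collect the most frequent words in the group of similar sentences, then order the compression candidates by how many of those words each one contains. This is a minimal sketch, not the authors' implementation. The function names `frequent_words` and `rerank`, the `top_k` cutoff, the stopword list, and the sample sentences are all hypothetical, and the word graph that generates the candidates is not modeled.

```python
from collections import Counter


def frequent_words(sentences, top_k=3, stopwords=frozenset()):
    """Return the top_k most frequent non-stopword tokens in the sentence group."""
    counts = Counter(
        token
        for sentence in sentences
        for token in sentence.lower().split()
        if token not in stopwords
    )
    return {word for word, _ in counts.most_common(top_k)}


def rerank(candidates, frequent):
    """Sort candidates by the number of distinct frequent words they contain."""
    def score(candidate):
        return len(set(candidate.lower().split()) & frequent)
    # sorted() is stable, so candidates with equal scores keep their original order.
    return sorted(candidates, key=score, reverse=True)


# Hypothetical input group and candidates, not taken from the paper's data.
group = [
    "the storm hit the coast on monday",
    "a storm hit the northern coast",
    "the storm damaged the coast",
]
stop = frozenset({"the", "a", "on"})
candidates = ["the storm damaged monday", "storm hit the coast"]

best = rerank(candidates, frequent_words(group, top_k=3, stopwords=stop))[0]
print(best)  # storm hit the coast
```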
data and knowledge engineering | 2017
Mai Duong; Minh-Quoc Nghiem; Ngan Luu-Thuy Nguyen
Proofreading, the checking of first-draft writing by native experts, is essential for professional writing by non-native speakers. Usually, proofreading experts return the corrected texts to the writer without giving the reasons for the corrections, which makes it difficult for writers to learn from their errors. Combining word alignment and classification techniques can help us analyze the original and corrected texts and use them for language learning. In this study, we explore different alignment-classification methods for this task. Our experimental results show that the best method achieved an accuracy of 71.8%. We also propose a new error taxonomy for tagging learner corpora, and present our alignment-classification results on a corpus tagged with this new tagset.
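The alignment step described in this abstract can be illustrated by aligning the tokens of an original sentence with those of its corrected version and listing the spans that differ. The abstract does not name the alignment or classification methods that were compared, so this sketch substitutes Python's standard `difflib.SequenceMatcher` for the aligner and uses its coarse edit operation (replace, insert, delete) as a stand-in for an error-type classifier. The function name `align_edits` and the sample sentences are hypothetical.

```python
from difflib import SequenceMatcher


def align_edits(original: str, corrected: str):
    """Align two sentences token by token and return the differing spans.

    Each edit is (operation, original_span, corrected_span), where the
    operation is difflib's tag: 'replace', 'insert', or 'delete'.
    Assumption: whitespace tokenization; a trained classifier would
    replace the coarse operation tag with a label from an error taxonomy.
    """
    src, tgt = original.split(), corrected.split()
    matcher = SequenceMatcher(a=src, b=tgt, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            edits.append((tag, " ".join(src[i1:i2]), " ".join(tgt[j1:j2])))
    return edits


# Hypothetical learner sentence and correction.
for edit in align_edits("He go to school yesterday", "He went to the school yesterday"):
    print(edit)
# ('replace', 'go', 'went')
# ('insert', '', 'the')
```

Pairing each edit with its original and corrected span is what lets a later classification step attach a reason, such as a verb tense or article error, to each correction.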