Publication


Featured research published by Ngoc Phuoc An Vo.


North American Chapter of the Association for Computational Linguistics | 2015

FBK-HLT: A New Framework for Semantic Textual Similarity

Ngoc Phuoc An Vo; Simone Magnolini; Octavian Popescu

This paper describes our system, FBK-HLT, and reports its performance in the SemEval 2015 Task #2 “Semantic Textual Similarity”, English subtask. We submitted three runs, each based on a different hypothesis about combining typical features (lexical similarity, string similarity, word n-grams, etc.) with syntactic structure features, resulting in different feature sets. The results on both the STS 2014 and 2015 datasets support our hypothesis that an STS system should take syntactic information into consideration. We outperform the best system on the STS 2014 datasets and achieve results very competitive with the best system on the STS 2015 datasets.


Proceedings of the Third International Workshop on Natural Language Processing for Social Media | 2015

Paraphrase Identification and Semantic Similarity in Twitter with Simple Features

Ngoc Phuoc An Vo; Simone Magnolini; Octavian Popescu

Paraphrase Identification and Semantic Similarity are two different yet closely related tasks in NLP. Both tasks have been studied extensively on structured texts. However, with the rapid growth of social media data, studying these tasks on unstructured texts, particularly social texts from Twitter, is very interesting because they may pose more complicated problems. We investigate and find a set of simple features that enables us to achieve very competitive performance on both tasks on Twitter data. Interestingly, we also confirm that word alignment techniques taken from machine translation evaluation metrics contribute significantly to overall performance on these tasks.


AI*IA 2016 Proceedings of the XV International Conference of the Italian Association for Artificial Intelligence on Advances in Artificial Intelligence - Volume 10037 | 2016

Analysis of the Impact of Machine Translation Evaluation Metrics for Semantic Textual Similarity

Simone Magnolini; Ngoc Phuoc An Vo; Octavian Popescu

We evaluate the hypothesis that automatic evaluation metrics developed for Machine Translation (MT) systems have a significant impact on predicting semantic similarity scores in the Semantic Textual Similarity (STS) task, in light of their usage for paraphrase identification. We show that different metrics may behave differently and vary in significance along the semantic scale [0-5] of the STS task. In addition, we compare several classification algorithms that use a combination of different MT metrics to build an STS system; we show that although this approach obtains remarkable results in the paraphrase identification task, it is insufficient to achieve the same results in STS. We show that this problem is due to excessive adaptation of some algorithms to the dataset domain and, finally, present a way to mitigate or avoid this issue.


North American Chapter of the Association for Computational Linguistics | 2015

A Preliminary Evaluation of the Impact of Syntactic Structure in Semantic Textual Similarity and Semantic Relatedness Tasks

Ngoc Phuoc An Vo; Octavian Popescu

The closely related tasks of evaluating Semantic Textual Similarity and Semantic Relatedness have received special attention in the NLP community. Many different approaches have been proposed, implemented and evaluated at different levels, such as lexical similarity, word/string/POS tag overlap, semantic modeling (LSA, LDA), etc. However, it is not clear how significantly syntactic structure contributes to the overall accuracy. In this paper, we make a preliminary evaluation of the impact of syntactic structure on these tasks by running several experiments and analyzing how syntactic structure contributes to solving them.


International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management | 2016

A Multi-Layer System for Semantic Textual Similarity

Ngoc Phuoc An Vo; Octavian Popescu

Building a system able to cope with the various phenomena that fall under the umbrella of semantic similarity is far from trivial. It is almost always the case that the performance of a system does not vary consistently or predictably from corpus to corpus. We analyzed the source of this variance and found that it is related to the word-pair similarity distribution among the topics in the various corpora. We then used this insight to construct a 4-module system that takes into consideration not only string and semantic word similarity, but also word alignment and sentence structure. The system consistently achieves an accuracy that is very close to the state of the art or sets a new state of the art. The system is based on a multi-layer architecture and is able to deal with heterogeneous corpora that may not have been generated by the same distribution.


Empirical Methods in Natural Language Processing | 2014

Fast and Accurate Misspelling Correction in Large Corpora

Octavian Popescu; Ngoc Phuoc An Vo

There are several NLP systems whose accuracy depends crucially on finding misspellings fast. However, the classical approach is based on a quadratic time algorithm with 80% coverage. We present a novel algorithm for misspelling detection, which runs in constant time and improves the coverage to more than 96%. We use this algorithm together with a cross-document coreference system in order to find proper name misspellings. The experiments confirmed a significant improvement over the state of the art.


North American Chapter of the Association for Computational Linguistics | 2015

FBK-HLT: An Application of Semantic Textual Similarity for Answer Selection in Community Question Answering

Ngoc Phuoc An Vo; Simone Magnolini; Octavian Popescu


International Conference on Computational Linguistics | 2014

FBK-TR: SVM for Semantic Relatedeness and Corpus Patterns for RTE

Ngoc Phuoc An Vo; Octavian Popescu; Tommaso Caselli


Language Resources and Evaluation | 2016

Corpora for Learning the Mutual Relationship between Semantic Relatedness and Textual Entailment

Ngoc Phuoc An Vo; Octavian Popescu


North American Chapter of the Association for Computational Linguistics | 2015

FBK-HLT: An Effective System for Paraphrase Identification and Semantic Similarity in Twitter

Ngoc Phuoc An Vo; Simone Magnolini; Octavian Popescu
