Vo Ngoc Phu
Duy Tan University
Publication
Featured research published by Vo Ngoc Phu.
Applied Intelligence | 2017
Vo Ngoc Phu; Nguyen Duy Dat; Vo Thi Ngoc Tran; Vo Thi Ngoc Chau; Tuan A. Nguyen
Sentiment classification plays a significant role in everyday life, in political activities, in activities relating to commodity production, and in commercial activities. Finding a solution for the accurate and timely classification of emotions is a challenging task. In this research, we propose a new model for big data sentiment classification in a parallel network environment. Our proposed model uses the Fuzzy C-Means (FCM) method for English sentiment classification with Hadoop Map (M)/Reduce (R) in Cloudera, a parallel network environment. The model can classify the sentiments of millions of English documents in this environment. Our English training data set has 60,000 English sentences, comprising 30,000 positive and 30,000 negative sentences. We tested our model on the testing data set (25,000 English reviews, 12,500 positive and 12,500 negative) and achieved 60.2% accuracy.
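The abstract above names Fuzzy C-Means but does not give its update rules. As a rough illustration only, the standard FCM iteration can be sketched in plain Python as follows. The tuple-based point representation, the function names, the fuzzifier `m=2.0`, and the fixed iteration count are all assumptions made for this sketch; it is a single-machine toy, not the authors' Hadoop Map/Reduce implementation.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Return u[i][j], the degree to which point i belongs to cluster j.

    Each row sums to 1. `m` is the fuzzifier (assumed to be 2.0 here).
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point sits exactly on a center: give it full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            memberships.append([h / sum(hits) for h in hits])
            continue
        memberships.append(
            [1.0 / sum((d_j / d_k) ** exponent for d_k in dists) for d_j in dists]
        )
    return memberships


def update_centers(points, memberships, m=2.0):
    """Recompute each center as the membership-weighted mean of all points."""
    dim = len(points[0])
    centers = []
    for j in range(len(memberships[0])):
        weights = [u[j] ** m for u in memberships]
        total = sum(weights)
        centers.append(
            tuple(sum(w * p[d] for w, p in zip(weights, points)) / total
                  for d in range(dim))
        )
    return centers


def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Alternate membership and center updates for a fixed number of rounds."""
    for _ in range(iterations):
        memberships = fcm_memberships(points, centers, m)
        centers = update_centers(points, memberships, m)
    return centers, fcm_memberships(points, centers, m)
```

In a sentiment setting, each point would be a feature vector for one document and the two clusters would stand for the positive and negative classes; how the paper actually builds those vectors is not stated in the abstract.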
International Conference on Asian Language Processing | 2014
Vo Ngoc Phu; Phan Thi Tuoi
We have explored different methods of improving the accuracy of sentiment classification. The sentiment orientation of a document can be positive (+), negative (-), or neutral (0). We combine five dictionaries from [2, 3, 4, 5, 6] into a new one with 21,137 entries. The new dictionary has many verbs, adverbs, phrases and idioms that are not in the five original dictionaries. The paper shows that our proposed method, which combines the Term-Counting method with the Enhanced Contextual Valence Shifters method, improves the accuracy of sentiment classification. The combined method achieves 68.984% accuracy on the testing data set and 69.224% on the training data set. All of these methods are implemented to classify reviews based on our new dictionary and the Internet Movie data set.
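The combination described above can be pictured with a small sketch: count dictionary terms, and let valence shifters such as negators and intensifiers modify the next sentiment term. The lexicon entries, shifter lists, weights, and the rule that a shifter applies to the next sentiment term are all hypothetical choices for illustration; they are not taken from the paper's 21,137-entry dictionary or its exact rules.

```python
# Toy lexicon and shifter lists: illustrative assumptions only.
LEXICON = {"good": 1.0, "great": 1.0, "bad": -1.0, "boring": -1.0}
NEGATORS = {"not", "never", "no"}
INTENSIFIERS = {"very": 2.0, "slightly": 0.5}


def classify(text):
    """Return '+', '-' or '0' for the sentiment orientation of `text`."""
    score = 0.0
    flip = False   # set by a negator, consumed by the next sentiment term
    scale = 1.0    # set by an intensifier, consumed by the next sentiment term
    for token in text.lower().split():
        word = token.strip(".,!?;:")
        if word in NEGATORS:
            flip = not flip
        elif word in INTENSIFIERS:
            scale *= INTENSIFIERS[word]
        elif word in LEXICON:
            value = LEXICON[word] * scale
            score += -value if flip else value
            flip, scale = False, 1.0
    if score > 0:
        return "+"
    if score < 0:
        return "-"
    return "0"
```

With this toy setup, `classify("The plot is not good")` yields `"-"` because the negator flips the polarity of "good", whereas plain term counting would have scored the sentence as positive.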
International Journal of Pattern Recognition and Artificial Intelligence | 2017
Nguyen Duy Dat; Vo Ngoc Phu; Vo Thi Ngoc Tran; Vo Thi Ngoc Chau; Tuan A. Nguyen
Sentiment classification is significant in everyday life, in political activities, in commodity production, and in commercial activities. In this research, we propose a new model for big data sentiment classification in a parallel network environment. Our new model uses the STING Algorithm (SA), from the data mining field, for English document-level sentiment classification with Hadoop Map (M)/Reduce (R), based on the 90,000 English sentences of the training data set, in a Cloudera parallel network environment, which is a distributed system. To our knowledge, no similar scientific study has been published. Our new model can classify the sentiment of millions of English documents with the shortest execution time in the parallel network environment. We test our new model on the 25,000 English documents of the testing data set and achieve 61.2% accuracy. Our English training data set includes 45,000 positive English sentences and 45,000 negative English sentences.
Artificial Intelligence Review | 2018
Vo Ngoc Phu; Vo Thi Ngoc Chau; Vo Thi Ngoc Tran; Nguyen Duy Dat
Emotion classification is used in many commercial and research applications. Semantic classification models (or sentiment classification methods) based on the vocabulary of an emotion dictionary have been studied extensively and remain widely used. In this study, we build a Vietnamese sentiment dictionary that includes Vietnamese terms (nouns, verbs, adjectives, etc.) whose valences (and polarities) are calculated using the Ochiai measure through the Google search engine, and many Vietnamese adjective phrases whose valences (and polarities) are identified based on Vietnamese language characteristics. Vietnamese adjectives often bear emotion whose values (or semantic scores) are not fixed and change when the adjectives appear in different phrase contexts. Therefore, if sentiment-bearing Vietnamese adjectives are assigned semantic values (or sentiment scores) that do not change with context, the accuracy of emotion classification is not high. We propose many rules based on Vietnamese language characteristics to determine the emotional values of sentiment-bearing Vietnamese adjective phrases in specific contexts. Our Vietnamese sentiment adjective dictionary is widely used in applications and research on Vietnamese semantic classification.
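For readers unfamiliar with the Ochiai measure mentioned above, it is the co-occurrence count of two terms divided by the geometric mean of their individual counts. The sketch below computes it from search hit counts. The hit counts are passed in as plain numbers (no search API is called), and deriving a valence as "similarity to a positive seed minus similarity to a negative seed" is an assumption made for illustration; the abstract does not state the paper's exact valence formula.

```python
import math


def ochiai(hits_a, hits_b, hits_both):
    """Ochiai coefficient from hit counts: n(a AND b) / sqrt(n(a) * n(b))."""
    if hits_a == 0 or hits_b == 0:
        return 0.0
    return hits_both / math.sqrt(hits_a * hits_b)


def valence(term_hits, pos_hits, neg_hits, with_pos_hits, with_neg_hits):
    """Assumed valence rule: closeness to a positive seed minus closeness
    to a negative seed. A result above zero suggests positive polarity."""
    return (ochiai(term_hits, pos_hits, with_pos_hits)
            - ochiai(term_hits, neg_hits, with_neg_hits))
```

For example, with hypothetical counts of 100 hits for a term, 400 for each seed, 100 co-occurrences with the positive seed and 20 with the negative seed, the two similarities are 0.5 and 0.1, so the term would receive a positive valence of about 0.4.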
Knowledge and Information Systems | 2017
Vo Ngoc Phu; Vo Thi Ngoc Chau; Nguyen Duy Dat; Vo Thi Ngoc Tran; Tuan A. Nguyen
Sentiment classification plays an important role in everyday life, in political activities, in commodity production, and in commercial activities. Finding a time-effective and highly accurate solution to the classification of emotions is challenging. Today, there are many models (or methods) to classify the sentiment of documents. Sentiment classification has been studied for many years and is used widely in many different fields. We propose a new model for English sentiment classification, called the valences-totaling model (VTM), which uses the cosine measure (CM) to classify the sentiment of English documents. In this study, CM is a measure of similarity between two words and is used to calculate the valence (and polarity) of English sentiment lexicons. We prove that CM is able to identify the sentiment valence and the sentiment polarity of the English sentiment lexicons online in combination with the Google search engine with the AND and OR operators. VTM uses many English sentiment lexicons, which are calculated online and are based on the Internet. We present a full range of English sentences; thus, the emotion expressed in the English text is classified with more precision. Our new model does not depend on a special domain or training data set; it is a domain-independent classifier. We test our new model on English Internet data. The calculated valence (and polarity) of English sentiment words in this model is based on many documents on millions of English Web sites and English social networks.
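The idea of totaling word valences derived from a cosine measure can be sketched on a toy corpus. Here a handful of in-memory "pages" stand in for the Web, set intersection stands in for the AND query, and the seed words "good" and "bad" are hypothetical choices; none of these details come from the paper. For binary occurrence vectors, the cosine reduces to the number of shared pages divided by the square root of the product of the two page counts.

```python
import math

# Toy corpus standing in for Web pages; purely illustrative.
PAGES = [
    {"good", "excellent", "film"},
    {"good", "excellent"},
    {"bad", "awful", "film"},
    {"bad", "awful"},
    {"good", "film"},
]


def pages_with(word):
    """Indices of pages containing `word` (a stand-in for a search query)."""
    return {i for i, page in enumerate(PAGES) if word in page}


def cosine(word_a, word_b):
    """Cosine similarity of the two words' binary page-occurrence vectors."""
    a, b = pages_with(word_a), pages_with(word_b)
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def valence(word, pos_seed="good", neg_seed="bad"):
    """Assumed rule: similarity to the positive seed minus the negative seed."""
    return cosine(word, pos_seed) - cosine(word, neg_seed)


def classify(document):
    """Total the valences of all words and read the polarity off the sign."""
    total = sum(valence(w) for w in document.lower().split())
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"
```

Because the valences come from co-occurrence statistics rather than labeled examples, a classifier of this shape needs no training set, which matches the domain-independence claim in the abstract.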
International Journal of Speech Technology | 2017
Vo Ngoc Phu; Vo Thi Ngoc Chau; Vo Thi Ngoc Tran
Semantic analysis has long been important and helpful for much research and many applications. The support vector machine (SVM) is a well-known algorithm used in research and applications in many different fields. In this study, we propose a new model using an SVM algorithm with Hadoop Map (M)/Reduce (R) for English document-level emotional classification in the Cloudera parallel network environment, which is also a distributed system. Our English testing data set has 25,000 English documents, including 12,500 positive reviews and 12,500 negative reviews. Our English training data set has 90,000 English sentences, including 45,000 positive sentences and 45,000 negative sentences. We test our new model on the English testing data set and achieve 63.7% sentiment classification accuracy.
International Journal of Speech Technology | 2017
Vo Ngoc Phu; Vo Thi Ngoc Tran; Vo Thi Ngoc Chau; Nguyen Duy Dat; Khanh Ly Doan Duy
Natural language processing has been studied for many years and has been applied in much research and many commercial applications. In this paper, we propose a new model for English document-level emotional classification. The model uses the ID3 decision tree algorithm to classify the semantics (positive, negative, or neutral) of English documents. The semantic classification is based on many rules generated by applying the ID3 algorithm to the 115,000 English sentences of our English training data set. We test our new model on the English testing data set of 25,000 English documents and achieve 63.6% sentiment classification accuracy.
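ID3 builds its rules by repeatedly splitting on the feature with the highest information gain. The sketch below shows only that selection step. The dictionary-per-row representation and the feature names used in the usage note are hypothetical; the abstract does not say which features the authors extract from sentences.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting `rows` on `feature`."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_feature(rows, labels):
    """The feature ID3 would split on first: the one with maximal gain."""
    return max(rows[0], key=lambda f: information_gain(rows, labels, f))
```

Given four toy sentences described by two boolean features, say `has_good` and `has_long`, where the label follows `has_good` exactly, the gain for `has_good` is 1 bit and the gain for `has_long` is 0, so ID3 would split on `has_good` and emit one rule per branch.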
International Journal of Speech Technology | 2017
Vo Ngoc Phu; Vo Thi Ngoc Chau; Vo Thi Ngoc Tran
Research on semantics (positive, negative, neutral) has been carried out for a long time and is very important for many commercial applications and scientific works. In this paper we propose a new model to calculate the emotional values (or semantic scores) of English terms (verbs, nouns, adjectives, adverbs, etc.) in three steps. First, we create our basis English emotional dictionary (called bEED) by using the Sorensen measure (Sorensen coefficient, called SM) through the Google search engine with the AND and OR operators. Second, we create many English adjective phrases, adverb phrases and verb phrases based on English grammar by combining English adverbs of degree with English adjectives, adverbs and verbs. Finally, we identify the valences of these English phrases from their specific contexts. English phrases often carry semantics whose values (or emotional scores) are not fixed and change when the phrases appear in different contexts. Therefore, the accuracy of sentiment classification is not high if emotion-bearing English phrases are assigned semantic values (or sentiment scores) that do not change with context. For those reasons, we propose many rules based on English grammar to calculate the sentiment values of emotion-bearing English phrases in their specific contexts. The results of this work are widely used in applications and research on English semantic classification.
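Two pieces of the pipeline above lend themselves to a short sketch: the Sorensen coefficient computed from hit counts, and a context rule in which an adverb of degree rescales the valence of the word it modifies. The hit counts are supplied as plain numbers, and the degree multipliers and two-word phrase format are hypothetical values chosen for illustration; the paper's actual rules are not given in the abstract.

```python
def sorensen(hits_a, hits_b, hits_both):
    """Sorensen coefficient from hit counts: 2 * n(a AND b) / (n(a) + n(b))."""
    denominator = hits_a + hits_b
    return 2.0 * hits_both / denominator if denominator else 0.0


# Hypothetical multipliers for adverbs of degree; not taken from the paper.
DEGREE = {"very": 1.5, "extremely": 2.0, "slightly": 0.5}


def phrase_valence(phrase, base_valences):
    """Valence of an '<adverb of degree> <term>' phrase, or of a bare term.

    `base_valences` maps terms to scores, playing the role of a basis
    dictionary. Unknown terms score 0.0.
    """
    words = phrase.lower().split()
    if len(words) == 2 and words[0] in DEGREE:
        return DEGREE[words[0]] * base_valences.get(words[1], 0.0)
    return base_valences.get(words[-1], 0.0)
```

With a base valence of 0.4 for "good", the phrase "very good" scores about 0.6 and "slightly good" about 0.2 under these assumed multipliers, which illustrates how the same adjective can carry different scores in different contexts.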
Evolving Systems | 2017
Vo Ngoc Phu; Vo Thi Ngoc Chau; Vo Thi Ngoc Tran; Dat Nguyen Duy; Khanh Ly Doan Duy
Evolving Systems | 2017
Vo Ngoc Phu; Vo Thi Ngoc Tran; Vo Thi Ngoc Chau; Dat Nguyen Duy; Khanh Ly Doan Duy