Tu Minh Phuong
Posts and Telecommunications Institute of Technology
Publication
Featured research published by Tu Minh Phuong.
pacific rim international conference on artificial intelligence | 2008
Nguyen Duy Phuong; Le Quang Thang; Tu Minh Phuong
Collaborative filtering and content-based filtering are the two main approaches to making recommendations in recommender systems. While each approach has its own strengths and weaknesses, combining the two can improve recommendation accuracy. In this paper, we present a graph-based method that combines content information and rating information in a natural way. The proposed method uses user ratings and content descriptions to infer user-content links, and then provides recommendations by exploiting these new links in combination with user-item links. We present experimental results showing that the proposed method performs better than pure collaborative filtering, pure content-based filtering, and a hybrid method.
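The idea of inferring user-content links and combining them with user-item links can be illustrated with a minimal sketch. This is not the authors' algorithm: the scoring rule (summing path weights), the function name `recommend`, and the toy data layout are all assumptions made for illustration.

```python
from collections import defaultdict


def recommend(ratings, item_features, user, top_n=3):
    """Rank items the user has not rated (hypothetical scoring rule).

    ratings: dict user -> dict item -> positive rating
    item_features: dict item -> set of content features
    """
    own = ratings[user]

    # Infer user-content links: a feature's weight is the sum of the
    # user's ratings on items that carry that feature.
    user_content = defaultdict(float)
    for item, rating in own.items():
        for feature in item_features.get(item, ()):
            user_content[feature] += rating

    scores = defaultdict(float)

    # Content paths: user -> feature -> unrated item.
    for item, features in item_features.items():
        if item in own:
            continue
        for feature in features:
            scores[item] += user_content.get(feature, 0.0)

    # Collaborative paths: user -> shared item -> other user -> new item.
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        overlap = sum(own[i] * other_ratings[i] for i in own if i in other_ratings)
        if overlap <= 0:
            continue
        for item, rating in other_ratings.items():
            if item not in own:
                scores[item] += overlap * rating

    # Highest score first; ties broken by item name for determinism.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]
```

In this sketch an item reachable through both a shared content feature and a like-minded user accumulates score from both kinds of link, which is the intuition behind combining the two sources of evidence.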
2008 IEEE International Conference on Research, Innovation and Vision for the Future in Computing and Communication Technologies | 2008
Nguyen Duy Phuong; Tu Minh Phuong
Collaborative filtering is a technique to predict users' interests in items by exploiting the behavior patterns of a group of users with similar preferences. This technique has been widely used for recommender systems and has a number of successful applications in e-commerce. In practice, a major challenge when applying collaborative filtering is that a typical user provides ratings for just a small number of items, so the training data is sparse with respect to the size of the domain. In this paper, we present a method to address this problem. Our method formulates the collaborative filtering problem in a multi-task learning framework by treating each user's rating prediction as a classification problem and solving multiple classification problems together. By doing this, the method allows sharing information among different classifiers and thus reduces the effect of data sparsity.
Computer Speech & Language | 2018
Ngo Xuan Bach; Nguyen Dieu Linh; Tu Minh Phuong
Part-of-speech (POS) tagging is a fundamental task in natural language processing (NLP). A robust POS tagger plays an important role in most NLP problems and applications, including syntactic parsing, semantic parsing, machine translation, and question answering. Although many efficient POS taggers have been developed for general, conventional text, little work has been done for social media text. In this paper, we present an empirical study on POS tagging for Vietnamese social media text, which poses several challenges compared with tagging general text. Social media text does not always conform to formal grammar and correct spelling. It also uses abbreviations, foreign words, and emoticons frequently. A POS tagger developed for conventional text would perform poorly on such noisy data. We address this problem by proposing a tagging model based on Conditional Random Fields (CRFs) with various kinds of features for Vietnamese social media text. We also investigate the effect of features extracted from word clusters under Brown clustering and canonical correlation analysis (CCA) based clustering in semi-supervised settings. We introduce an annotated corpus for POS tagging, which consists of more than four thousand sentences from Facebook, the most popular social network in Vietnam. Using this corpus, we performed a series of experiments to evaluate the proposed model. Our model achieved 88.26% and 88.92% tagging accuracy in supervised and semi-supervised scenarios, respectively, an improvement of nearly 12% over vnTagger, a state-of-the-art and the most widely used Vietnamese POS tagger developed for general, conventional text. In addition, the semi-supervised model outperformed, in terms of accuracy, the version of vnTagger trained on the same Facebook dataset, showing the usefulness of word cluster features.
pacific rim international conference on artificial intelligence | 2016
Nguyen Ngoc Diep; Cuong Pham; Tu Minh Phuong
Human activity recognition is important in many applications such as fitness logging, pervasive healthcare, near-emergency warning, and social networking. Using body-worn sensors, these applications detect the activities of users to understand the context and provide them with appropriate assistance. For accurate recognition, it is crucial to design an appropriate feature representation of sensor data. In this paper, we propose a new type of motion features: motion primitive forests, which are randomized ensembles of decision trees that act on original local features by clustering them to form motion primitives (or words). The bags of these features, which accumulate histograms of the resulting motion primitives over each data frame, are then used to build activity models. We experimentally validated the effectiveness of the proposed method using accelerometer data from three benchmark datasets. On all three datasets, the proposed motion primitive forests provided substantially higher accuracy than existing state-of-the-art methods, and were much faster in both training and prediction than k-means feature learning. In addition, the method showed stable results over different types of original local features, indicating the ability of random forests to select relevant local features.
knowledge and systems engineering | 2015
Ngo Xuan Bach; Tran Thi Oanh; Nguyen Trung Hai; Tu Minh Phuong
In this paper, we investigate the task of paraphrase identification in Vietnamese documents, which aims to determine whether two sentences have the same meaning. This task has been shown to be an important research direction with practical applications in natural language processing and data mining. We model the task as a classification problem and explore different types of features to represent sentences. We also introduce a paraphrase corpus for Vietnamese, vnPara, which consists of 3000 Vietnamese sentence pairs. We describe a series of experiments using various linguistic features and different machine learning algorithms, including Support Vector Machines, Maximum Entropy Model, Naive Bayes, and k-Nearest Neighbors. The results are promising, with the best model achieving up to 90% accuracy. To the best of our knowledge, this is the first attempt to solve the task of paraphrase identification for Vietnamese.
knowledge and systems engineering | 2017
Ngo Xuan Bach; Le Thi Ngoc Cham; Tran Ha Ngoc Thien; Tu Minh Phuong
This paper presents a study on analyzing questions in the legal domain for the Vietnamese language, which is an important step in building an automated question answering system for the domain. We focus on questions about transportation law, the law with arguably the largest number of violations and thus the one most asked about. Given a legal question in natural language, our goal is to extract important information such as Type of Vehicle, Action of Vehicle, Location, and Question Type. We model the question analysis task as a sequence labeling problem and present a CRF-based method to deal with it. Experimental results on a corpus consisting of 1678 Vietnamese questions show that our method can extract 16 types of information with high precision and recall.
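Treating question analysis as sequence labeling means each token receives a tag, and contiguous tagged tokens are then grouped into information fields. The sketch below shows only that grouping step, assuming a BIO tagging scheme; the tag abbreviations (`QT`, `TV`, `AV`), the function name `extract_fields`, and the English example are hypothetical and do not come from the paper or its corpus.

```python
def extract_fields(tokens, labels):
    """Group BIO-labelled tokens into (field_type, text) pairs.

    A stray I- tag that does not continue the current span is treated
    like an O tag: it closes any open span and is itself discarded.
    """
    fields = []
    current_type, current_tokens = None, []

    def close_span():
        # Emit the span collected so far, if there is one.
        if current_type is not None:
            fields.append((current_type, " ".join(current_tokens)))

    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            close_span()
            current_type, current_tokens = label[2:], [token]
        elif label.startswith("I-") and current_type == label[2:]:
            current_tokens.append(token)
        else:
            close_span()
            current_type, current_tokens = None, []
    close_span()
    return fields


# Hypothetical labelled question (labels as a tagger might output them).
tokens = ["What", "fine", "applies", "when", "a", "motorbike",
          "runs", "a", "red", "light", "?"]
labels = ["B-QT", "I-QT", "O", "O", "O", "B-TV",
          "B-AV", "I-AV", "I-AV", "I-AV", "O"]
print(extract_fields(tokens, labels))
```

The tags themselves would come from a trained sequence model such as a CRF; this sketch only turns per-token tags into extracted fields.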
knowledge and systems engineering | 2017
Cuong Pham; Nguyen Ngoc Diep; Tu Minh Phuong
Many approaches to human activity recognition, such as wearable-based or computer-vision-based ones, are obtrusive in the sense that they prevent users from performing activities in a natural way, or they might raise privacy concerns. This paper presents e-Shoes, smart shoes for unobtrusive human activity recognition. E-Shoes are shoes instrumented with tiny wireless accelerometers embedded inside the insoles. The sensors are seamless to users, making the system suitable for recognizing everyday activities. To analyze sensor signals, we propose a convolutional neural network (CNN) model that automatically learns features from sensing data and predicts the activity being performed. We verify the effectiveness of the approach with a real dataset that covers seven daily activities. The system achieved 93% accuracy on average, which is very promising, while being energy efficient and easy to use.
pacific rim international conference on artificial intelligence | 2016
Nguyen Ngoc Diep; Cuong Pham; Tu Minh Phuong
Histogram features are extracted by calculating the distribution of orientations of small fragments, or quanta, of sliding windows on the sensor's continuous acceleration data stream. Bins of the histogram are automatically computed based on clusters of similar orientations of quanta, making the method less sensitive to the parameters used in bin selection than a heuristic approach. We also present a finer representation of the sliding window by applying the above extraction method to extract local feature vectors of small data segments instead of calculating features from the whole sliding window. Extracted features are used with support vector machines trained to classify frames of data streams as containing falls or not. We evaluated the proposed method on three public datasets with acceleration data including falls and other activities of daily living. On all three datasets, the performance of the proposed method is substantially higher than that of two other fall detection methods.
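A rough sketch of an orientation histogram over one sliding window is given below. It rests on several assumptions not stated in the abstract: a quantum is taken to be the difference between two consecutive 3-axis samples, bin centers are supplied as unit vectors (in practice they would be obtained by clustering orientations on training data), and the function name `orientation_histogram` is hypothetical.

```python
import math


def orientation_histogram(window, centers):
    """Normalized histogram of quantum orientations in one window.

    window: list of (x, y, z) acceleration samples
    centers: list of unit vectors, one per histogram bin (assumed to be
             cluster centers learned beforehand)
    """
    counts = [0] * len(centers)
    for (x0, y0, z0), (x1, y1, z1) in zip(window, window[1:]):
        delta = (x1 - x0, y1 - y0, z1 - z0)
        norm = math.sqrt(sum(c * c for c in delta))
        if norm == 0:
            continue  # no change between samples, orientation undefined
        unit = tuple(c / norm for c in delta)
        # Assign the quantum to the center with the largest cosine similarity.
        best = max(
            range(len(centers)),
            key=lambda k: sum(u * c for u, c in zip(unit, centers[k])),
        )
        counts[best] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(centers)
    return [c / total for c in counts]
```

The resulting vector could serve as the feature input to a classifier such as a support vector machine; applying the same function to shorter segments of the window would give the finer, local representation the abstract mentions.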
Journal of Multimedia | 2013
Cuong Pham; Nguyen Ngoc Diep; Tu Minh Phuong
KES | 2015
Ngo Xuan Bach; Tu Minh Phuong