Tuan-Anh Hoang
Singapore Management University
Publication
Featured research published by Tuan-Anh Hoang.
social informatics | 2012
Su Mon Kywe; Tuan-Anh Hoang; Ee-Peng Lim; Feida Zhu
The Twitter network is currently overwhelmed by the massive amount of tweets generated by its users. To effectively organize and search tweets, users have to depend on appropriate hashtags inserted into tweets. We begin our research on hashtags by first analyzing a Twitter dataset generated by more than 150,000 Singapore users over a three-month period. Among several interesting findings about hashtag usage by this user community, we have found a consistent and significant use of new hashtags on a daily basis. This suggests that most hashtags have a very short life span. We further propose a novel hashtag recommendation method based on collaborative filtering; the method recommends hashtags found in the previous month's data. Our method considers both user preferences and tweet content in selecting hashtags to be recommended. Our experiments show that our method yields better performance than recommendation based only on tweet content, even when considering only the hashtags adopted by a small number (1 to 3) of users who share similar user preferences.
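The idea of combining user preferences with tweet content can be illustrated with a minimal sketch. This is not the authors' method: the function names, the cosine similarity measure, the additive scoring, and the toy data are all assumptions made for illustration. Candidate hashtags are collected from the most similar users (by past hashtag usage) and from the most similar past tweets (by shared words), then ranked by accumulated similarity.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend_hashtags(target_user, tweet_words, user_tags, past_tweets,
                       k_users=2, k_tweets=2, top_n=3):
    """Rank hashtags using similar users and similar past tweets (hypothetical scoring)."""
    scores = Counter()

    # User-preference signal: hashtags adopted by the k most similar users.
    target_vec = user_tags.get(target_user, Counter())
    user_sims = sorted(
        ((cosine(target_vec, tags), user) for user, tags in user_tags.items()
         if user != target_user),
        reverse=True,
    )[:k_users]
    for sim, user in user_sims:
        if sim > 0:
            for tag in user_tags[user]:
                scores[tag] += sim

    # Content signal: hashtags attached to the k most similar past tweets.
    words = Counter(tweet_words)
    tweet_sims = sorted(
        ((cosine(words, Counter(w)), i) for i, (w, _) in enumerate(past_tweets)),
        reverse=True,
    )[:k_tweets]
    for sim, i in tweet_sims:
        if sim > 0:
            for tag in past_tweets[i][1]:
                scores[tag] += sim

    return [tag for tag, _ in scores.most_common(top_n)]


# Toy data, invented for this example.
user_tags = {
    "alice": Counter({"#sg": 3, "#food": 1}),
    "bob": Counter({"#sg": 2, "#hawker": 2}),
    "carol": Counter({"#nba": 4}),
}
past_tweets = [
    (["best", "laksa", "in", "town"], ["#hawker", "#food"]),
    (["lakers", "win", "again"], ["#nba"]),
]
result = recommend_hashtags("alice", ["laksa", "for", "lunch"], user_tags, past_tweets)
print(result)
```

In this toy run, a hashtag supported by both a similar user and a similar tweet ranks above hashtags supported by only one signal, which is the intuition behind using both sources.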
acm transactions on management information systems | 2012
Palakorn Achananuparp; Ee-Peng Lim; Jing Jiang; Tuan-Anh Hoang
Real-time microblogging systems such as Twitter offer users an easy and lightweight means to exchange information. Instead of writing formal and lengthy messages, microbloggers prefer to frequently broadcast several short messages to be read by other users. Only when messages are interesting are they propagated further by the readers. In this article, we examine user behavior relevant to information propagation through microblogging. We specifically use retweeting activities among Twitter users to define and model originating and promoting behavior. We propose a basic model for measuring the two behaviors, a mutual dependency model, which considers the mutual relationships between the two behaviors, and a range-based model, which considers the depth and reach of users’ original tweets. Next, we compare the three behavior models and contrast them with the existing work on modeling influential Twitter users. Last, to demonstrate their applicability, we further employ the behavior models to detect interesting events from sudden changes in aggregated information propagation behavior of Twitter users. The results show that the proposed behavior models can be effectively applied to detect interesting events in the Twitter stream, compared to the baseline tweet-based approaches.
advances in social networks analysis and mining | 2013
Tuan-Anh Hoang; William W. Cohen; Ee-Peng Lim; Douglas Pierce; David P. Redlawsk
In political contexts, it is known that people act as “motivated reasoners”, i.e., information is evaluated first for emotional affect, and this emotional reaction influences later deliberative reasoning steps. As social media becomes a more and more prevalent way of receiving political information, it becomes important to understand more completely the interaction between information, emotion, social community, and information-sharing behavior. In this paper, we describe a high-precision classifier for politically-oriented tweets, and an accurate classifier of a Twitter user's political affiliation. Coupled with existing sentiment-analysis tools for microblogs, these methods enable us to systematically study the interaction of emotion and sharing in a large corpus of politically-oriented microblog messages, collected from just before the 2012 US presidential election. In particular, we seek to understand how information sharing is influenced by the political affiliation of the sender and receiver of a message, and the sentiment associated with the message.
social informatics | 2014
Tuan-Anh Hoang; Ee-Peng Lim
In this paper, we propose the Topical Communities and Personal Interest (TCPI) model for simultaneously modeling topics, topical communities, and users’ topical interests in microblogging data. TCPI considers different topical communities while differentiating users’ personal topical interests from those of topical communities, and learning the dependence of each user on the affiliated communities to generate content. This makes TCPI different from existing models that either do not consider the existence of multiple topical communities, or do not differentiate between personal and community topical interests. Our experiments on two Twitter datasets show that TCPI can effectively mine the representative topics for each topical community. We also demonstrate that TCPI significantly outperforms other state-of-the-art topic models in the task of modeling tweet generation.
siam international conference on data mining | 2014
Tuan-Anh Hoang; William W. Cohen; Ee-Peng Lim
In this paper, we propose the CBS topic model, a probabilistic graphical model, to derive the user communities in microblogging networks based on the sentiments they express in their generated content and the behaviors they adopt. As a topic model, CBS can uncover hidden topics and derive user topic distributions. In addition, our model associates topic-specific sentiments and behaviors with each user community. Notably, CBS has a general framework that accommodates multiple types of behaviors simultaneously. Our experiments on two Twitter datasets show that the CBS model can effectively mine the representative behaviors and emotional topics for each community. We also demonstrate that the CBS model performs as well as other state-of-the-art models in modeling topics, but outperforms the rest in mining user
siam international conference on data mining | 2013
Ee-Peng Lim; Tuan-Anh Hoang
When a user retweets, there are three behavioral factors that cause the action: topic virality, user virality, and user susceptibility. Topic virality captures the degree to which a topic attracts retweets by users. For each topic, user virality and susceptibility refer to the likelihood that a user attracts retweets and performs retweeting, respectively. To model a set of observed retweet data as a result of these three topic-specific factors, we first represent the retweets as a three-dimensional tensor of the tweet authors, their followers, and the tweets themselves. We then propose the V2S model, a tensor factorization model, to simultaneously derive the three sets of behavioral factors. Our experiments on a real Twitter data set show that the V2S model can effectively mine the behavioral factors of users and tweet topics during an election event. We also demonstrate that the V2S model outperforms the other topic-based models in
pacific-asia conference on knowledge discovery and data mining | 2015
Tuan-Anh Hoang
To explain why a user generates some observed content and behaviors, one has to determine the user’s topical interests as well as those of her community. Most existing works on modeling microblogging users and their communities, however, are based on either user-generated content or user behaviors, but not both. In this paper, we propose the Community and Personal Interest (CPI) model for modeling the interests of microblogging users jointly with those of their communities using both content and behaviors. The CPI model also provides a common framework to accommodate multiple types of user behaviors. Unlike the other models, CPI does not assume a hierarchical relationship between personal interest and community interest, i.e., one is determined purely based on the other. We build the CPI model based on the principle that a user’s personal interest is different from that of her community. We further develop a regularization technique to bias the model to learn more socially meaningful topics for each community. Our experiments on a Twitter dataset show that the CPI model outperforms other state-of-the-art models in topic learning and user classification tasks. We also demonstrate that the CPI model can effectively mine community interest through some representative case examples.
advances in geographic information systems | 2015
Meng-Fen Chiang; Tuan-Anh Hoang; Ee-Peng Lim
Taxi bookings are events where requests for taxis are made by passengers either over voice calls or mobile apps. As the demand for taxis changes with space and time, it is important to model both the spatial and temporal dimensions in dynamic booking data. Several applications can benefit from a good taxi booking model. These include the prediction of the number of bookings at a certain location and time of the day, and the detection of anomalous booking events. In this paper, we propose a Grid-based Gaussian Mixture Model (GGMM) with spatio-temporal dimensions that groups booking data into a number of spatio-temporal clusters by observing the bookings occurring at different times of the day in each spatial grid cell. Using a large-scale real-world dataset consisting of millions of booking records, we show that GGMM outperforms two strong baselines: a Gaussian Mixture Model (GMM) and the state-of-the-art spatio-temporal behavior model, Periodic Mobility Model (PMM), in estimating the spatio-temporal distribution of bookings at specific grid cells during specific time intervals. GGMM can achieve up to 95.8% (96.5%) reduction in perplexity compared against GMM (PMM). Further, we apply GGMM to detect anomalous bookings and successfully relate the anomalies with some known events, demonstrating GGMM's effectiveness in this task.
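For readers unfamiliar with the perplexity comparison, the sketch below shows how such a reduction could be computed. It assumes the standard definition of perplexity as the exponential of the negative mean log-likelihood of held-out observations; the abstract does not state the exact definition used, and the probabilities below are invented placeholders rather than results from the paper.

```python
from math import exp, log


def perplexity(probabilities):
    """Exponential of the negative mean log-likelihood (lower is better)."""
    return exp(-sum(log(p) for p in probabilities) / len(probabilities))


def relative_reduction(baseline, model):
    """Fraction by which `model` lowers perplexity relative to `baseline`."""
    return (baseline - model) / baseline


# Hypothetical probabilities that two models assign to the same held-out bookings.
baseline_probs = [0.01, 0.02, 0.005]
model_probs = [0.20, 0.30, 0.10]

reduction = relative_reduction(perplexity(baseline_probs), perplexity(model_probs))
print(f"perplexity reduction: {reduction:.1%}")
```

A model that assigns higher probability to the observed bookings has lower perplexity, so a reduction close to 100% means the held-out data is far less surprising to it than to the baseline.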
siam international conference on data mining | 2018
Roy Ka-Wei Lee; Tuan-Anh Hoang; Ee-Peng Lim
Finding influential users in online social networks is an important problem with many possible useful applications. HITS and other link analysis methods, in particular, have often been used to identify hub and authority users in web graphs and online social networks. These works, however, have not considered the topical aspect of links in their analysis. A straightforward approach to overcome this limitation is to first apply topic models to learn the user topics before applying the HITS algorithm. In this paper, we instead propose a novel topic model known as the Hub and Authority Topic (HAT) model to combine the two processes so as to jointly learn the hub, authority and topical interests. We evaluate HAT against several existing state-of-the-art methods in two aspects: (i) modeling of topics, and (ii) link recommendation. We conduct experiments on two real-world datasets from Twitter and Instagram. Our experimental results show that HAT is comparable to state-of-the-art topic models in learning topics and that it outperforms the state of the art in the link recommendation task.
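As background for the baseline mentioned above, here is a minimal sketch of the classic HITS iteration on a directed follow graph. It illustrates plain HITS only, not the HAT model; the function name, the fixed iteration count, and the toy edge list are assumptions for illustration.

```python
from math import sqrt


def hits(edges, iterations=50):
    """Classic HITS power iteration over a list of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}

    for _ in range(iterations):
        # Authority score: sum of hub scores of the nodes linking in.
        auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            auth[dst] += hub[src]
        norm = sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {n: v / norm for n, v in auth.items()}

        # Hub score: sum of authority scores of the nodes linked to.
        hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            hub[src] += auth[dst]
        norm = sqrt(sum(v * v for v in hub.values())) or 1.0
        hub = {n: v / norm for n, v in hub.items()}

    return hub, auth


# Toy follow graph, invented for this example: "a" follows "c" and "d"; "b" follows "c".
edges = [("a", "c"), ("b", "c"), ("a", "d")]
hub, auth = hits(edges)
print(max(hub, key=hub.get), max(auth, key=auth.get))
```

Note that these scores ignore what the links are about, which is the limitation the abstract points to: a user may be an authority on one topic and not on another.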
international conference on weblogs and social media | 2012
Tuan-Anh Hoang; Ee-Peng Lim