Haewoon Kwak | Researchain

Publication

Featured research published by Haewoon Kwak.

international world wide web conferences | 2010

What is Twitter, a social network or a news media?

Haewoon Kwak; Changhyun Lee; Hosung Park; Sue B. Moon

Twitter, a microblogging service less than three years old, commands more than 41 million users as of July 2009 and is growing fast. Twitter users tweet about any topic within the 140-character limit and follow others to receive their tweets. The goal of this paper is to study the topological characteristics of Twitter and its power as a new medium of information sharing. We have crawled the entire Twitter site and obtained 41.7 million user profiles, 1.47 billion social relations, 4,262 trending topics, and 106 million tweets. In its follower-following topology analysis we have found a non-power-law follower distribution, a short effective diameter, and low reciprocity, which all mark a deviation from known characteristics of human social networks [28]. In order to identify influentials on Twitter, we have ranked users by the number of followers and by PageRank and found the two rankings to be similar. Ranking by retweets differs from the previous two rankings, indicating a gap between influence inferred from the number of followers and that inferred from the popularity of one's tweets. We have analyzed the tweets of top trending topics and reported on their temporal behavior and user participation. We have classified the trending topics based on the active period and the tweets and show that the majority (over 85%) of topics are headline news or persistent news in nature. A closer look at retweets reveals that any retweeted tweet reaches an average of 1,000 users no matter how many followers the author of the original tweet has. Once retweeted, a tweet gets retweeted almost instantly on next hops, signifying fast diffusion of information after the 1st retweet. To the best of our knowledge this work is the first quantitative study on the entire Twittersphere and information diffusion on it.

internet measurement conference | 2007

I tube, you tube, everybody tubes: analyzing the world's largest user generated content video system

Meeyoung Cha; Haewoon Kwak; Pablo Rodriguez; Yong-Yeol Ahn; Sue B. Moon

User Generated Content (UGC) is re-shaping the way people watch video and TV, with millions of video producers and consumers. In particular, UGC sites are creating new viewing patterns and social interactions, empowering users to be more creative, and developing new business opportunities. To better understand the impact of UGC systems, we have analyzed YouTube, the world's largest UGC VoD system. Based on a large amount of data collected, we provide an in-depth study of YouTube and other similar UGC systems. In particular, we study the popularity life-cycle of videos, the intrinsic statistical properties of requests and their relationship with video age, and the level of content aliasing or of illegal content in the system. We also provide insights on the potential for more efficient UGC VoD systems (e.g. utilizing P2P techniques or making better use of caching). Finally, we discuss the opportunities to leverage the latent demand for niche videos that are not reached today due to information filtering effects or other system scarcity distortions. Overall, we believe that the results presented in this paper are crucial in understanding UGC systems and can provide valuable information to ISPs, site administrators, and content owners with major commercial and technical implications.

international world wide web conferences | 2007

Analysis of topological characteristics of huge online social networking services

Yong-Yeol Ahn; Seungyeop Han; Haewoon Kwak; Sue B. Moon; Hawoong Jeong

Social networking services are a fast-growing business in the Internet. However, it is unknown if online relationships and their growth patterns are the same as in real-life social networks. In this paper, we compare the structures of three online social networking services: Cyworld, MySpace, and orkut, each with more than 10 million users. We have access to complete data of Cyworld's ilchon (friend) relationships and analyze its degree distribution, clustering property, degree correlation, and evolution over time. We also use Cyworld data to evaluate the validity of the snowball sampling method, which we use to crawl and obtain partial network topologies of MySpace and orkut. Cyworld, the oldest of the three, demonstrates a changing scaling behavior over time in degree distribution. The latest Cyworld data's degree distribution exhibits a multi-scaling behavior, while those of MySpace and orkut have simple scaling behaviors with different exponents. Very interestingly, each of the two exponents corresponds to a different segment in Cyworld's degree distribution. Certain online social networking services encourage online activities that cannot be easily copied in real life; we show that they deviate from close-knit online social networks which show a similar degree correlation pattern to real-life social networks.

internet measurement conference | 2008

Comparison of online social relations in volume vs interaction: a case study of cyworld

Hyunwoo Chun; Haewoon Kwak; Young-Ho Eom; Yong-Yeol Ahn; Sue B. Moon; Hawoong Jeong

Online social networking services are among the most popular Internet services according to Alexa.com and have become a key feature in many Internet services. Users interact through various features of online social networking services: making friend relationships, sharing their photos, and writing comments. These friend relationships are expected to become a key to many other features in web services, such as recommendation engines, security measures, online search, and personalization issues. However, we have very limited knowledge on how much interaction actually takes place over friend relationships declared online. A friend relationship only marks the beginning of online interaction. Does the interaction between users follow the declaration of friend relationship? Does a user interact evenly or lopsidedly with friends? We venture to answer these questions in this work. We construct a network from comments written in guestbooks. A node represents a user and a directed edge a comment from one user to another. We call this network an activity network. Previous work on activity networks includes phone-call networks [34, 35] and MSN messenger networks [27]. To our best knowledge, this is the first attempt to compare the explicit friend relationship network and implicit activity network. We have analyzed structural characteristics of the activity network and compared them with the friends network. Though the activity network is weighted and directed, its structure is similar to the friend relationship network. We report that the in-degree and out-degree distributions are close to each other and the social interaction through the guestbook is highly reciprocated. When we consider only those links in the activity network that are reciprocated, the degree correlation distribution exhibits much more pronounced assortativity than the friends network and places it close to known social networks. The k-core analysis gives yet another piece of corroborating evidence that the friends network deviates from the known social network and has an unusually large number of highly connected cores. We have delved into the weighted and directed nature of the activity network, and investigated the reciprocity, disparity, and network motifs. We also have observed that peer pressure to stay active online stops building up beyond a certain number of friends. The activity network has shown topological characteristics similar to the friends network, but thanks to its directed and weighted nature, it has allowed us more in-depth analysis of user interaction.
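
The activity-network construction described in this abstract can be illustrated with a minimal Python sketch. The input format (a list of `(writer, guestbook_owner)` pairs), the function names, and the choice to drop self-comments are assumptions made for illustration; the abstract does not specify the paper's data format or its exact reciprocity measure. Here reciprocity is taken to be the fraction of directed edges whose reverse edge also exists.

```python
from collections import Counter


def build_activity_network(comments):
    """Build a weighted, directed edge map from (writer, owner) comment pairs.

    The weight of edge (u, v) is the number of comments u wrote in v's
    guestbook. Self-comments are dropped (an assumption of this sketch).
    """
    weights = Counter()
    for writer, owner in comments:
        if writer != owner:
            weights[(writer, owner)] += 1
    return weights


def edge_reciprocity(weights):
    """Fraction of directed edges (u, v) for which (v, u) also exists."""
    if not weights:
        return 0.0
    reciprocated = sum(1 for (u, v) in weights if (v, u) in weights)
    return reciprocated / len(weights)


# Hypothetical toy data: "a" and "b" comment on each other, "a" also
# comments on "c", and "c" writes in their own guestbook.
comments = [("a", "b"), ("a", "b"), ("b", "a"), ("a", "c"), ("c", "c")]
network = build_activity_network(comments)
reciprocity = edge_reciprocity(network)
```

In this toy input, two of the three directed edges are reciprocated, so the measure is 2/3. A weighted variant of reciprocity would need a definition that the abstract does not give.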

international acm sigir conference on research and development in information retrieval | 2009

The wisdom of the few: a collaborative filtering approach based on expert opinions from the web

Xavier Amatriain; Neal Lathia; Josep M. Pujol; Haewoon Kwak; Nuria Oliver

Nearest-neighbor collaborative filtering provides a successful means of generating recommendations for web users. However, this approach suffers from several shortcomings, including data sparsity and noise, the cold-start problem, and scalability. In this work, we present a novel method for recommending items to users based on expert opinions. Our method is a variation of traditional collaborative filtering: rather than applying a nearest neighbor algorithm to the user-rating data, predictions are computed using a set of expert neighbors from an independent dataset, whose opinions are weighted according to their similarity to the user. This method promises to address some of the weaknesses in traditional collaborative filtering, while maintaining comparable accuracy. We validate our approach by predicting a subset of the Netflix data set. We use ratings crawled from a web portal of expert reviews, measuring results both in terms of prediction accuracy and recommendation list precision. Finally, we explore the ability of our method to generate useful recommendations, by reporting the results of a user-study where users prefer the recommendations generated by our approach.
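
As a rough illustration of the idea in this abstract, the Python sketch below predicts a user's rating from expert ratings weighted by user-expert similarity. The similarity measure (cosine over co-rated items), the mean-centered weighted average, the `min_similarity` threshold, and all names are assumptions of this sketch; the abstract does not state the paper's actual similarity function or prediction formula.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity over items rated by both parties; 0.0 if no overlap."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict(user_ratings, experts, item, min_similarity=0.0):
    """Predict the user's rating of `item` from similarity-weighted experts.

    user_ratings: non-empty dict item -> rating for the target user.
    experts: list of dicts item -> rating, one per expert.
    Falls back to the user's mean rating when no sufficiently similar
    expert has rated the item.
    """
    user_mean = sum(user_ratings.values()) / len(user_ratings)
    numerator = 0.0
    denominator = 0.0
    for expert in experts:
        if item not in expert:
            continue
        sim = cosine_similarity(user_ratings, expert)
        if sim <= min_similarity:
            continue
        expert_mean = sum(expert.values()) / len(expert)
        # Weight the expert's deviation from their own mean by similarity.
        numerator += sim * (expert[item] - expert_mean)
        denominator += sim
    if denominator == 0:
        return user_mean
    return user_mean + numerator / denominator


# Hypothetical toy data.
user = {"m1": 4, "m2": 2}
experts = [{"m1": 4, "m2": 2, "m3": 5}]
prediction = predict(user, experts, "m3")
fallback = predict(user, experts, "m9")
```

Because the experts come from an independent dataset, the neighbor set is small and fixed, which is one way to read the abstract's claim about sparsity and scalability.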

international world wide web conferences | 2010

Finding influentials based on the temporal order of information adoption in twitter

Changhyun Lee; Haewoon Kwak; Hosung Park; Sue B. Moon

Twitter offers an explicit mechanism to facilitate information diffusion and has emerged as a new medium for communication. Many approaches to find influentials have been proposed, but they do not consider the temporal order of information adoption. In this work, we propose a novel method to find influentials by considering both the link structure and the temporal order of information adoption in Twitter. Our method finds distinct influentials who are not discovered by other methods.

internet measurement conference | 2009

Mining communities in networks: a solution for consistency and its evaluation

Haewoon Kwak; Yoon-Chan Choi; Young-Ho Eom; Hawoong Jeong; Sue B. Moon

Online social networks pose significant challenges to computer scientists, physicists, and sociologists alike, for their massive size, fast evolution, and uncharted potential for social computing. One particular problem that has interested us is community identification. Many algorithms based on various metrics have been proposed for communities in networks [18, 24], but few algorithms scale to very large networks. Three recent community identification algorithms, namely CNM [16], Wakita [59], and Louvain [10], stand out for their scalability to a few million nodes. All of them use modularity as the metric of optimization. However, all three algorithms produce inconsistent communities every time the ordering of nodes to the algorithms changes. We propose two quantitative metrics to represent the level of consistency across multiple runs of an algorithm: pairwise membership probability and consistency. Based on these two metrics, we propose a solution that improves the consistency without compromising the modularity. We demonstrate that our solution to use pairwise membership probabilities as link weights generates consistent communities within six or fewer cycles for most networks. However, our iterative, pairwise membership reinforcing approach does not deliver convergence for Flickr, Orkut, and Cyworld networks as well as for the rest of the networks. Our approach is empirically driven and is yet to be shown to produce consistent output analytically. We leave further investigation into the topological structure and its impact on the consistency as future work. In order to evaluate the quality of clustering, we have looked at 3 of the 48 communities identified in the AS graph. Surprisingly, all have either hierarchical, geographical, or topological interpretations to their groupings. Our preliminary evaluation of the quality of communities is promising. We plan to conduct more thorough evaluation of the communities and study network structures and their evolutions using our approach.
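
One plausible reading of "pairwise membership probability" is the fraction of runs in which the two endpoints of a link land in the same community. The Python sketch below computes that quantity per edge from a set of precomputed partitions. The function name, the input format, and this interpretation are assumptions; the abstract does not give the formal definition, and the separate "consistency" metric is not attempted here.

```python
def pairwise_membership_probability(edges, partitions):
    """For each edge (u, v), return the fraction of runs placing u and v together.

    edges: iterable of (u, v) node pairs.
    partitions: list of dicts node -> community label, one dict per run
        (for example, runs of the same algorithm with shuffled node order).
    """
    runs = len(partitions)
    if runs == 0:
        return {}
    probabilities = {}
    for u, v in edges:
        together = sum(1 for p in partitions if p[u] == p[v])
        probabilities[(u, v)] = together / runs
    return probabilities


# Hypothetical toy data: three runs over a three-node triangle.
edges = [("a", "b"), ("a", "c"), ("b", "c")]
partitions = [
    {"a": 0, "b": 0, "c": 1},
    {"a": 1, "b": 1, "c": 1},
    {"a": 0, "b": 1, "c": 1},
]
link_weights = pairwise_membership_probability(edges, partitions)
```

Following the abstract, these per-edge values could then be fed back as link weights into the next round of community detection, repeating until the partitions stop changing. That outer loop depends on the chosen community detection algorithm and is omitted here.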

human factors in computing systems | 2015

Exploring Cyberbullying and Other Toxic Behavior in Team Competition Online Games

Haewoon Kwak; Jeremy Blackburn; Seungyeop Han

In this work we explore cyberbullying and other toxic behavior in team competition online games. Using a dataset of over 10 million player reports on 1.46 million toxic players along with corresponding crowdsourced decisions, we test several hypotheses drawn from theories explaining toxic behavior. Besides providing a large-scale, empirically based understanding of toxic behavior, our work can be used as a basis for building systems to detect, prevent, and counteract toxic behavior.

social informatics | 2014

A First Look at Global News Coverage of Disasters by Using the GDELT Dataset

Haewoon Kwak; Jisun An

In this work, we reveal the structure of global news coverage of disasters and its determinants by using a large-scale news coverage dataset collected by the GDELT (Global Data on Events, Location, and Tone) project that monitors news media in over 100 languages from the whole world. Significant variables in our hierarchical (mixed-effect) regression model, such as population, political stability, damage, and more, are well aligned with a series of previous research. However, we find strong regionalism in news geography, highlighting the necessity of comprehensive datasets for the study of global news coverage.

social informatics | 2014

Linguistic Analysis of Toxic Behavior in an Online Video Game

Haewoon Kwak; Jeremy Blackburn

In this paper we explore the linguistic components of toxic behavior by using crowdsourced data from over 590 thousand cases of accused toxic players in a popular match-based competition game, League of Legends. We perform a series of linguistic analyses to gain a deeper understanding of the role communication plays in the expression of toxic behavior. We characterize linguistic behavior of toxic players and compare it with that of typical players in an online competition game. We also find empirical support describing how a player transitions from typical to toxic behavior. Our findings can be helpful to automatically detect and warn players who may become toxic and thus insulate potential victims from toxic playing in advance.
