Furu Wei
IBM
Publications
Featured research published by Furu Wei.
Knowledge Discovery and Data Mining | 2010
Furu Wei; Shixia Liu; Yangqiu Song; Shimei Pan; Michelle X. Zhou; Weihong Qian; Lei Shi; Li Tan; Qiang Zhang
In this paper, we present a novel exploratory visual analytic system called TIARA (Text Insight via Automated Responsive Analytics), which combines text analytics and interactive visualization to help users explore and analyze large collections of text. Given a collection of documents, TIARA first uses topic analysis techniques to summarize the documents into a set of topics, each of which is represented by a set of keywords. In addition to extracting topics, TIARA derives time-sensitive keywords to depict the content evolution of each topic over time. To help users understand the topic-based summarization results, TIARA employs several interactive text visualization techniques to explain the summarization results and seamlessly link such results to the original text. We have applied TIARA to several real-world applications, including email summarization and patient record analysis. To measure the effectiveness of TIARA, we have conducted several experiments. Our experimental results and initial user feedback suggest that TIARA is effective in aiding users in their exploratory text analytic tasks.
IEEE Pacific Visualization Symposium | 2010
Weiwei Cui; Yingcai Wu; Shixia Liu; Furu Wei; Michelle X. Zhou; Huamin Qu
In this paper, we introduce a visualization method that couples a trend chart with word clouds to illustrate temporal content evolution in a set of documents. Specifically, we use a trend chart to encode the overall semantic evolution of document content over time. In our work, semantic evolution of a document collection is modeled by the varying significance of document content, represented by a set of representative keywords, at different time points. At each time point, we also use a word cloud to depict the representative keywords. Since the words in a word cloud may vary from one cloud to the next over time (e.g., words with increased importance), we use geometry meshes and an adaptive force-directed model to lay out word clouds to highlight the word differences between any two subsequent word clouds. Our method also ensures semantic coherence and spatial stability of word clouds over time. Our work is embodied in an interactive visual analysis system that helps users to perform text analysis and derive insights from a large collection of documents. Our preliminary evaluation demonstrates the usefulness and usability of our work.
IEEE Transactions on Visualization and Computer Graphics | 2010
Yingcai Wu; Furu Wei; Shixia Liu; Norman Au; Weiwei Cui; Hong Zhou; Huamin Qu
The rapid development of Web technology has resulted in an increasing number of hotel customers sharing their opinions on the hotel services. Effective visual analysis of online customer opinions is needed, as it has a significant impact on building a successful business. In this paper, we present OpinionSeer, an interactive visualization system that could visually analyze a large collection of online hotel customer reviews. The system is built on a new visualization-centric opinion mining technique that considers uncertainty for faithfully modeling and analyzing customer opinions. A new visual representation is developed to convey customer opinions by augmenting well-established scatterplots and radial visualization. To provide multiple-level exploration, we introduce subjective logic to handle and organize subjective opinions with degrees of uncertainty. Several case studies illustrate the effectiveness and usefulness of OpinionSeer on analyzing relationships among multiple data dimensions and comparing opinions of different groups. Aside from data on hotel customer feedback, OpinionSeer could also be applied to visually analyze customer opinions on other products or services.
Human Factors in Computing Systems | 2012
Mengdie Hu; Shixia Liu; Furu Wei; Yingcai Wu; John T. Stasko; Kwan-Liu Ma
After the news of Osama Bin Laden's death leaked through Twitter, many people wondered if Twitter would fundamentally change the way we produce, spread, and consume news. In this paper we provide an in-depth analysis of how the news broke and spread on Twitter. We confirm the claim that Twitter broke the news first, and find evidence that Twitter had convinced a large number of its audience before mainstream media confirmed the news. We also discover that attention on Twitter was highly concentrated on a small number of opinion leaders and identify three groups of opinion leaders who played key roles in spreading the news: individuals affiliated with media played a large part in breaking the news, mass media brought the news to a wider audience and provided eager Twitter users with content on external sites, and celebrities helped to spread the news and stimulate conversation. Our findings suggest Twitter has great potential as a news medium.
Visual Analytics Science and Technology | 2010
Lei Shi; Furu Wei; Shixia Liu; Li Tan; Xiaoxiao Lian; Michelle X. Zhou
Text visualization has become an increasingly important research topic as the need to understand massive-scale textual information has proven imperative for many people and businesses. However, it is still very challenging to design effective visual metaphors to represent large corpora of text due to the unstructured and high-dimensional nature of text. In this paper, we propose a data model that can be used to represent most text corpora. Such a data model contains four basic types of facets: time, category, content (unstructured), and structured facets. To understand a corpus with such a data model, we develop a hybrid visualization by combining the trend graph with tag clouds. We encode the four types of data facets with four separate visual dimensions. To help people discover evolutionary and correlation patterns, we also develop several visual interaction methods that allow people to interactively analyze text by one or more facets. Finally, we present two case studies to demonstrate the effectiveness of our solution in support of multi-faceted visual analysis of text corpora.
IEEE VGTC Conference on Visualization | 2011
Yingcai Wu; Thomas Provan; Furu Wei; Shixia Liu; Kwan-Liu Ma
Word clouds are proliferating on the Internet and have received much attention in visual analytics. Although word clouds can help users understand the major content of a document collection quickly, their ability to visually compare documents is limited. This paper introduces a new method to create semantic‐preserving word clouds by leveraging tailored seam carving, a well‐established content‐aware image resizing operator. The method can optimize a word cloud layout by removing a left‐to‐right or top‐to‐bottom seam iteratively and gracefully from the layout. Each seam is a connected path of low energy regions determined by a Gaussian‐based energy function. With seam carving, we can pack the word cloud compactly and effectively, while preserving its overall semantic structure. Furthermore, we design a set of interactive visualization techniques for the created word clouds to facilitate visual text analysis and comparison. Case studies are conducted to demonstrate the effectiveness and usefulness of our techniques.
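The seam search described in this abstract can be illustrated with a minimal sketch. It assumes a small rectangular grid, one Gaussian bump per word as the energy function, and the classic dynamic-programming seam search; the function names, the grid size, and `sigma` are hypothetical, and the paper's tailored seam carving may differ in all of these details.

```python
import math

def gaussian_energy(width, height, words, sigma=1.5):
    """Build an energy grid by summing one Gaussian bump per word.

    `words` is a list of (x, y, weight) tuples (hypothetical layout data).
    """
    grid = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for wx, wy, weight in words:
                d2 = (x - wx) ** 2 + (y - wy) ** 2
                grid[y][x] += weight * math.exp(-d2 / (2 * sigma ** 2))
    return grid

def min_vertical_seam(energy):
    """Return one column index per row: a connected top-to-bottom
    seam with minimum total energy, found by dynamic programming."""
    height, width = len(energy), len(energy[0])
    cost = [row[:] for row in energy]
    # Forward pass: cheapest way to reach each cell from the row above.
    for y in range(1, height):
        for x in range(width):
            lo, hi = max(0, x - 1), min(width - 1, x + 1)
            cost[y][x] += min(cost[y - 1][lo:hi + 1])
    # Backward pass: walk up from the cheapest cell in the last row.
    x = min(range(width), key=lambda c: cost[-1][c])
    seam = [x]
    for y in range(height - 1, 0, -1):
        lo, hi = max(0, x - 1), min(width - 1, x + 1)
        x = min(range(lo, hi + 1), key=lambda c: cost[y - 1][c])
        seam.append(x)
    seam.reverse()
    return seam

# Two equally weighted words leave an empty band between them.
energy = gaussian_energy(9, 5, [(1, 2, 1.0), (7, 2, 1.0)])
print(min_vertical_seam(energy))
```

Removing the returned seam would shrink the layout by one column while leaving the high-energy regions around the words untouched, which is the intuition behind packing a word cloud compactly.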
IEEE Transactions on Knowledge and Data Engineering | 2013
Yangqiu Song; Shimei Pan; Shixia Liu; Furu Wei; Michelle X. Zhou; Weihong Qian
In this paper, we propose a novel constrained coclustering method to achieve two goals. First, we combine information-theoretic coclustering and constrained clustering to improve clustering performance. Second, we adopt both supervised and unsupervised constraints to demonstrate the effectiveness of our algorithm. The unsupervised constraints are automatically derived from existing knowledge sources, thus saving the effort and cost of using manually labeled constraints. To achieve our first goal, we develop a two-sided hidden Markov random field (HMRF) model to represent both document and word constraints. We then use an alternating expectation maximization (EM) algorithm to optimize the model. We also propose two novel methods to automatically construct and incorporate document and word constraints to support unsupervised constrained clustering: 1) automatically construct document constraints based on overlapping named entities (NE) extracted by an NE extractor; 2) automatically construct word constraints based on their semantic distance inferred from WordNet. The results of our evaluation over two benchmark data sets demonstrate the superiority of our approaches against a number of existing approaches.
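As a rough illustration of deriving document constraints from overlapping named entities, the sketch below links two documents when they share enough entities. It assumes the entities have already been extracted; the function name `must_link_pairs` and the `min_shared` threshold are hypothetical and are not taken from the paper.

```python
from itertools import combinations

def must_link_pairs(doc_entities, min_shared=2):
    """Return document pairs that share at least `min_shared` named entities.

    `doc_entities` maps a document id to the set of entities found in it.
    The threshold is an illustrative assumption.
    """
    pairs = []
    for (a, ents_a), (b, ents_b) in combinations(sorted(doc_entities.items()), 2):
        if len(ents_a & ents_b) >= min_shared:
            pairs.append((a, b))
    return pairs

docs = {
    "d1": {"IBM", "Watson", "New York"},
    "d2": {"IBM", "Watson"},
    "d3": {"Twitter"},
}
print(must_link_pairs(docs))
```

Pairs produced this way could serve as automatically derived must-link constraints, avoiding manual labeling.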
IEEE Transactions on Visualization and Computer Graphics | 2016
Mengchen Liu; Shixia Liu; Xizhou Zhu; Qinying Liao; Furu Wei; Shimei Pan
Although there has been a great deal of interest in analyzing customer opinions and breaking news in microblogs, progress has been hampered by the lack of an effective mechanism to discover and retrieve data of interest from microblogs. To address this problem, we have developed an uncertainty-aware visual analytics approach to retrieve salient posts, users, and hashtags. We extend an existing ranking technique to compute a multifaceted retrieval result: the mutual reinforcement rank of a graph node, the uncertainty of each rank, and the propagation of uncertainty among different graph nodes. To illustrate the three facets, we have also designed a composite visualization with three visual components: a graph visualization, an uncertainty glyph, and a flow map. The graph visualization with glyphs, the flow map, and the uncertainty analysis together enable analysts to effectively find the most uncertain results and interactively refine them. We have applied our approach to several Twitter datasets. Qualitative evaluation and two real-world case studies demonstrate the promise of our approach for retrieving high-quality microblog data.
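The idea of mutual reinforcement ranking can be sketched with a simple iterative scheme over a bipartite user-post graph, in the spirit of HITS. This is an assumption for illustration only: it omits hashtags and the uncertainty modeling that the abstract describes, and the function name and edge format are hypothetical.

```python
def mutual_reinforcement(edges, iterations=50):
    """Score users and posts so that each reinforces the other.

    `edges` is a list of (user, post) pairs. Returns two dicts,
    (user_scores, post_scores), each normalized to sum to 1.
    """
    users = sorted({u for u, _ in edges})
    posts = sorted({p for _, p in edges})
    user_score = {u: 1.0 / len(users) for u in users}
    post_score = {p: 1.0 / len(posts) for p in posts}
    for _ in range(iterations):
        # A post is salient if salient users are linked to it.
        new_post = {p: 0.0 for p in posts}
        for u, p in edges:
            new_post[p] += user_score[u]
        total = sum(new_post.values())
        post_score = {p: s / total for p, s in new_post.items()}
        # A user is salient if linked to salient posts.
        new_user = {u: 0.0 for u in users}
        for u, p in edges:
            new_user[u] += post_score[p]
        total = sum(new_user.values())
        user_score = {u: s / total for u, s in new_user.items()}
    return user_score, post_score

edges = [("alice", "p1"), ("bob", "p1"), ("carol", "p1"), ("carol", "p2")]
user_score, post_score = mutual_reinforcement(edges)
print(user_score, post_score)
```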
Visual Analytics Science and Technology | 2010
Lei Shi; Weihong Qian; Furu Wei; Li Tan
VisWorks is a software package for text and network visual analytics. This paper introduces its visualization, analytic process, and lessons learned in solving Mini-Challenge 1 of the VAST 2010 contest.
Proceedings of the First International Workshop on Intelligent Visual Interfaces for Text Analysis | 2010
Furu Wei; Lei Shi; Li Tan; Xiaohua Sun; Xiaoxiao Lian; Shixia Liu; Michelle X. Zhou
Correlating content from multiple data fields is one of the key challenges in text mining. In this paper, we propose a visual analytics approach that leverages both content correlation analysis and interactive visualization technologies in analyzing and understanding content correlations. We have applied our work to analyzing NHAMCS data (National Hospital Ambulatory Medical Care Survey), which helps reveal healthcare-related data patterns through the correlations between unstructured data fields (e.g., cause of injury and diagnosis) and between structured and unstructured fields (e.g., gender and cause of injury).