Matthew Hurst
Microsoft
Publication
Featured research published by Matthew Hurst.
knowledge discovery and data mining | 2005
Natalie S. Glance; Matthew Hurst; Kamal Nigam; Matthew Siegler; Robert Stockton; Takashi Tomokiyo
Weblogs and message boards provide online forums for discussion that record the voice of the public. Woven into this mass of discussion is a wide range of opinion and commentary about consumer products. This presents an opportunity for companies to understand and respond to the consumer by analyzing this unsolicited feedback. Given the volume, format and content of the data, the appropriate approach to understand this data is to use large-scale web and text data mining technologies. This paper argues that applications for mining large volumes of textual data for marketing intelligence should provide two key elements: a suite of powerful mining and visualization technologies and an interactive analysis environment which allows for rapid generation and testing of hypotheses. This paper presents such a system that gathers and annotates online discussion relating to consumer products using a wide variety of state-of-the-art techniques, including crawling, wrapping, search, text classification and computational linguistics. Marketing intelligence is derived through an interactive analysis framework uniquely configured to leverage the connectivity and content of annotated online discussion.
search in social media | 2008
Marti A. Hearst; Matthew Hurst; Susan T. Dumais
Blog search has not yet reached its full potential. In this position paper, we suggest that more could be done to accommodate the task of finding good blogs to read, especially with respect to matching a desired taste or style of writing. We propose a faceted navigation interface as a good starting point for blog and author search, and suggest that search oriented around people and their writings will lend itself well to advanced interfaces. We also argue that blog search is probably best integrated with search on other forms of timely social media for the task of determining what is currently being thought about a particular topic.
visual analytics science and technology | 2008
Danyel Fisher; Aaron Hoff; George G. Robertson; Matthew Hurst
Analyzing unstructured text streams can be challenging. One popular approach is to isolate specific themes in the text, and to visualize the connections between them. Some existing systems, like ThemeRiver, provide a temporal view of changes in themes; other systems, like In-Spire, use clustering techniques to help an analyst identify the themes at a single point in time. Narratives combines both of these techniques; it uses a temporal axis to visualize ways that concepts have changed over time, and introduces several methods to explore how those concepts relate to each other. Narratives is designed to help the user place news stories in their historical and social context by understanding how the major topics associated with them have changed over time. Users can relate articles through time by examining the topical keywords that summarize a specific news event. By tracking the attention to a news article in the form of references in social media (such as weblogs), a user both discovers important events and measures the social relevance of these stories.
international world wide web conferences | 2005
Natalie S. Glance; Matthew Hurst; Kamal Nigam; Matthew Siegler; Robert Stockton; Takashi Tomokiyo
We present a system that gathers and analyzes online discussion as it relates to consumer products. Weblogs and online message boards provide forums that record the voice of the public. Woven into this discussion is a wide range of opinion and commentary about consumer products. Given its volume, format and content, the appropriate approach to understanding this data is large-scale web and text data mining. By using a wide variety of state-of-the-art techniques including crawling, wrapping, text classification and computational linguistics, online discussion is gathered and annotated within a framework that provides for interactive analysis that yields marketing intelligence for our customers.
Computing Attitude and Affect in Text | 2006
Kamal Nigam; Matthew Hurst
This chapter describes an automated system for detecting polar expressions about a specified topic. The two elementary components of this approach are a shallow NLP polar language extraction system and a machine learning based topic classifier. These components are composed together by making a simple but accurate collocation assumption: if a topical sentence contains polar language, the polarity is associated with the topic. We evaluate our system, components and assumption on a corpus of online consumer messages.
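The collocation assumption in the abstract above can be illustrated with a minimal Python sketch. It is not the chapter's system: the keyword lexicons and the functions `is_topical`, `polarity`, and `topical_polarity` are hypothetical stand-ins for the shallow NLP polar language extractor and the machine-learned topic classifier, which the abstract does not specify in detail.

```python
import re

# Hypothetical toy lexicons (assumption); the chapter's real components are a
# shallow NLP polar-language extractor and a learned topic classifier.
POSITIVE = {"great", "love", "excellent", "reliable"}
NEGATIVE = {"terrible", "hate", "broken", "poor"}


def _words(sentence):
    """Lowercase word set for a sentence."""
    return set(re.findall(r"[a-z']+", sentence.lower()))


def is_topical(sentence, topic_terms):
    """Stand-in for the topic classifier: simple keyword overlap."""
    return bool(_words(sentence) & topic_terms)


def polarity(sentence):
    """Stand-in for the polar-language extractor: '+', '-', or None."""
    words = _words(sentence)
    pos, neg = len(words & POSITIVE), len(words & NEGATIVE)
    if pos == neg:
        return None
    return "+" if pos > neg else "-"


def topical_polarity(message, topic_terms):
    """Collocation assumption: polar language found in a topical sentence
    is attributed to the topic. Non-topical sentences are ignored."""
    results = []
    for sentence in re.split(r"(?<=[.!?])\s+", message.strip()):
        label = polarity(sentence)
        if label is not None and is_topical(sentence, topic_terms):
            results.append((sentence, label))
    return results


message = "I love the camera. The weather was terrible. The camera battery is poor."
print(topical_polarity(message, {"camera"}))
```

In this sketch the second sentence contains polar language but is not topical, so it contributes nothing; only the two sentences that mention the topic are attributed a polarity.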
international conference on data engineering | 2009
Matthew Hurst; Alexey Maykov
Weblogs, and other forms of social media, differ from traditional web content in many ways. One of the most important differences is the highly temporal nature of the content. Applications that leverage social media content must, to be effective, have access to this data with minimal publication/acquisition latency. An effective weblog crawler should satisfy the following requirements: low latency, high scalability, high data quality and appropriate network politeness. In this paper, we outline the weblog crawler implemented in the social streams project and summarize the challenges faced during development.
conference on information and knowledge management | 2009
Julian Brooke; Matthew Hurst
A qualitative examination of review texts suggests that there are consistent patterns to how topic and polarity are expressed in discourse. These patterns are visible in the text and paragraph structure, topic depth, and polarity flow. In this paper, we employ sentence-level sentiment classifiers and a hand-built tree ontology to investigate whether these patterns can be quantitatively identified in a large corpus of video game reviews. Our results indicate that the beginning and the end of major textual units (e.g. paragraphs) stand out in the flow of texts, showing a concentration of reliable opinion and key topic aspects, and that there are other important regularities in the expression of opinion and topic relevant to their ordering and the discourse markers with which they appear.
siam international conference on data mining | 2007
Jure Leskovec; Mary McGlohon; Christos Faloutsos; Natalie S. Glance; Matthew Hurst
international world wide web conferences | 2002
William W. Cohen; Matthew Hurst; Lee S. Jensen
Archive | 2003
Natalie S. Glance; Matthew Hurst; Takashi Tomokiyo