Michael Gamon | Researchain

Publication

Featured research published by Michael Gamon.

Intelligent Data Analysis | 2005

Pulse: mining customer opinions from free text

Michael Gamon; Anthony Aue; Simon Corston-Oliver; Eric K. Ringger

We present a prototype system, code-named Pulse, for mining topics and sentiment orientation jointly from free text customer feedback. We describe the application of the prototype system to a database of car reviews. Pulse enables the exploration of large quantities of customer free text. The user can examine customer opinion “at a glance” or explore the data at a finer level of detail. We describe a simple but effective technique for clustering sentences, the application of a bootstrapping approach to sentiment classification, and a novel user-interface.

International Conference on Computational Linguistics | 2004

Sentiment classification on customer feedback data: noisy data, large feature vectors, and the role of linguistic analysis

Michael Gamon

We demonstrate that it is possible to perform automatic sentiment classification in the very noisy domain of customer feedback data. We show that by using large feature vectors in combination with feature reduction, we can train linear support vector machines that achieve high classification accuracy on data that present classification challenges even for a human annotator. We also show that, surprisingly, the addition of deep linguistic analysis features to a set of surface level word n-gram features contributes consistently to classification accuracy in this domain.

Meeting of the Association for Computational Linguistics | 2006

Correcting ESL Errors Using Phrasal SMT Techniques

Chris Brockett; William B. Dolan; Michael Gamon

This paper presents a pilot study of the use of phrasal Statistical Machine Translation (SMT) techniques to identify and correct writing errors made by learners of English as a Second Language (ESL). Using examples of mass noun errors found in the Chinese Learner Error Corpus (CLEC) to guide creation of an engineered training set, we show that application of the SMT paradigm can capture errors not well addressed by widely-used proofing tools designed for native speakers. Our system was able to correct 61.81% of mistakes in a set of naturally-occurring examples of mass noun errors found on the World Wide Web, suggesting that efforts to collect alignable corpora of pre- and post-editing ESL writing samples can enable the development of SMT-based writing assistance tools capable of repairing many of the complex syntactic and lexical problems found in the writing of ESL learners.

International World Wide Web Conferences | 2011

Mark my words!: linguistic style accommodation in social media

Cristian Danescu-Niculescu-Mizil; Michael Gamon; Susan T. Dumais

The psycholinguistic theory of communication accommodation accounts for the general observation that participants in conversations tend to converge to one another's communicative behavior: they coordinate in a variety of dimensions including choice of words, syntax, utterance length, pitch and gestures. In its almost forty years of existence, this theory has been empirically supported exclusively through small-scale or controlled laboratory studies. Here we address this phenomenon in the context of Twitter conversations. Undoubtedly, this setting is unlike any other in which accommodation was observed and, thus, challenging to the theory. Its novelty comes not only from its size, but also from the non-real-time nature of conversations, from the 140-character length restriction, from the wide variety of social relation types, and from a design that was initially not geared towards conversation at all. Given such constraints, it is not clear a priori whether accommodation is robust enough to occur in this new environment. To investigate this, we develop a probabilistic framework that can model accommodation and measure its effects. We apply it to a large Twitter conversational dataset specifically developed for this task. This is the first time the hypothesis of linguistic style accommodation has been examined (and verified) in a large-scale, real-world setting. Furthermore, when investigating concepts such as stylistic influence and symmetry of accommodation, we discover a complexity of the phenomenon which was never observed before. We also explore the potential relation between stylistic influence and network features commonly associated with social status.
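The abstract does not spell out the probabilistic framework, so the following is only a minimal sketch of one plausible way to quantify accommodation on a single stylistic dimension: compare the probability that a reply exhibits the style when the initial message did with the baseline probability that a reply exhibits it at all. The function name `accommodation` and the boolean pair encoding are assumptions made for illustration, not the authors' implementation.

```python
def accommodation(pairs):
    """Estimate accommodation on one stylistic dimension.

    `pairs` is a list of (initial_has_style, reply_has_style) booleans,
    one per conversational exchange. This encoding is hypothetical.

    Returns P(reply has style | initial has style) - P(reply has style).
    A positive value suggests repliers tend to echo the style.
    """
    total = len(pairs)
    # Replies whose initial message exhibited the style.
    triggered = [reply for initial, reply in pairs if initial]
    if total == 0 or not triggered:
        raise ValueError("need at least one exchange whose initial message has the style")

    p_reply = sum(reply for _, reply in pairs) / total
    p_reply_given_initial = sum(triggered) / len(triggered)
    return p_reply_given_initial - p_reply


# Toy exchanges (made up): the style appears in replies more often
# when it was present in the initial message.
toy_pairs = [
    (True, True), (True, True), (True, False),
    (False, False), (False, True), (False, False),
    (False, False), (False, False),
]
toy_score = accommodation(toy_pairs)
```

On real conversational data each stylistic dimension (for example, a word category) would be scored separately, and the estimate would need a significance test before being read as evidence of accommodation.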

Meeting of the Association for Computational Linguistics | 2005

Automatic Identification of Sentiment Vocabulary: Exploiting Low Association with Known Sentiment Terms

Michael Gamon; Anthony Aue

We describe an extension to the technique for the automatic identification and labeling of sentiment terms described in Turney (2002) and Turney and Littman (2002). Their basic assumption is that sentiment terms of similar orientation tend to co-occur at the document level. We add a second assumption, namely that sentiment terms of opposite orientation tend not to co-occur at the sentence level. This additional assumption allows us to identify sentiment-bearing terms very reliably. We then use these newly identified terms in various scenarios for the sentiment classification of sentences. We show that our approach outperforms Turney's original approach. Combining our approach with a Naive Bayes bootstrapping method yields a further small improvement of classifier performance. We finally compare our results to precision and recall figures that can be obtained on the same data set with labeled data.
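To make the document-level co-occurrence assumption concrete, here is a minimal sketch of a semantic-orientation score in the style commonly attributed to Turney: the pointwise mutual information (PMI) of a word with positive seed terms minus its PMI with negative seed terms. It covers only that first assumption; the sentence-level extension described in the abstract is not reproduced. The function name, the seed words, the smoothing constant, and the toy documents are all assumptions for illustration.

```python
import math


def semantic_orientation(word, documents, positive_seeds, negative_seeds,
                         smoothing=0.01):
    """PMI with positive seeds minus PMI with negative seeds.

    `documents` is a list of token sets; co-occurrence is counted at the
    document level. `smoothing` (an assumed value) avoids log(0) when a
    word never co-occurs with a seed.
    """
    n = len(documents)

    def count(*terms):
        # Number of documents containing all the given terms.
        return sum(1 for doc in documents if all(t in doc for t in terms))

    def pmi(a, b):
        denom = count(a) * count(b)
        if denom == 0:
            return 0.0  # an unseen term contributes no evidence
        return math.log2((count(a, b) + smoothing) * n / denom)

    return (sum(pmi(word, s) for s in positive_seeds)
            - sum(pmi(word, s) for s in negative_seeds))


# Toy corpus (made up for illustration).
toy_docs = [
    {"excellent", "smooth", "ride"},
    {"excellent", "smooth", "engine"},
    {"poor", "noisy", "engine"},
    {"poor", "noisy", "ride"},
]
smooth_score = semantic_orientation("smooth", toy_docs, {"excellent"}, {"poor"})
noisy_score = semantic_orientation("noisy", toy_docs, {"excellent"}, {"poor"})
```

In this toy corpus "smooth" scores positive and "noisy" negative, while a word such as "engine" that co-occurs equally with both seeds scores zero.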

Empirical Methods in Natural Language Processing | 2015

Representing Text for Joint Embedding of Text and Knowledge Bases

Kristina Toutanova; Danqi Chen; Patrick Pantel; Hoifung Poon; Pallavi Choudhury; Michael Gamon

Models that learn to represent textual and knowledge base relations in the same continuous latent space are able to perform joint inferences among the two kinds of relations and obtain high accuracy on knowledge base completion (Riedel et al., 2013). In this paper we propose a model that captures the compositional structure of textual relations, and jointly optimizes entity, knowledge base, and textual relation representations. The proposed model significantly improves performance over a model that does not share parameters among textual relations with common sub-structure.

Empirical Methods in Natural Language Processing | 2014

Modeling Interestingness with Deep Neural Networks

Jianfeng Gao; Patrick Pantel; Michael Gamon; Xiaodong He; Li Deng

This paper presents a deep semantic similarity model (DSSM), a special type of deep neural network designed for text analysis, for recommending target documents likely to be of interest to a user based on a source document that she is reading. We observe, identify, and detect naturally occurring signals of interestingness in click transitions on the Web between source and target documents, which we collect from commercial Web browser logs. The DSSM is trained on millions of Web transitions, and maps source-target document pairs to feature vectors in a latent space in such a way that the distance between source documents and their corresponding interesting targets in that space is minimized. The effectiveness of the DSSM is demonstrated using two interestingness tasks: automatic highlighting and contextual entity search. The results on large-scale, real-world datasets show that the semantics of documents are important for modeling interestingness and that the DSSM leads to significant quality improvement on both tasks, outperforming not only the classic document models that do not use semantics but also state-of-the-art topic models.
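The ranking step implied by this description can be sketched as follows, assuming the source and target documents have already been mapped to latent vectors. The neural network that produces those vectors is not shown, and the use of cosine similarity as the closeness measure is an assumption here, as are the function names and the toy vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_u * norm_v)


def rank_targets(source_vec, target_vecs):
    """Rank candidate targets by similarity to the source, best first.

    `target_vecs` maps a hypothetical document name to its latent vector.
    """
    scored = [(name, cosine(source_vec, vec))
              for name, vec in target_vecs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy latent vectors (made up; a trained model would supply these).
toy_source = [1.0, 0.0, 1.0]
toy_targets = {
    "related": [0.9, 0.1, 0.8],
    "unrelated": [0.0, 1.0, 0.0],
    "partial": [1.0, 1.0, 0.0],
}
toy_ranking = rank_targets(toy_source, toy_targets)
```

Targets whose vectors lie closest to the source vector rank first, which is the behavior the training objective in the abstract is meant to encourage for interesting targets.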

Meeting of the Association for Computational Linguistics | 2001

A Machine Learning Approach to the Automatic Evaluation of Machine Translation

Simon Corston-Oliver; Michael Gamon; Chris Brockett

We present a machine learning approach to evaluating the well-formedness of output of a machine translation system, using classifiers that learn to distinguish human reference translations from machine translations. This approach can be used to evaluate an MT system, tracking improvements over time; to aid in the kind of failure analysis that can help guide system development; and to select among alternative output strings. The method presented is fully automated and independent of source language, target language and domain.

International World Wide Web Conferences | 2012

Active objects: actions for entity-centric search

Thomas Lin; Patrick Pantel; Michael Gamon; Anitha Kannan; Ariel Fuxman

We introduce an entity-centric search experience, called Active Objects, in which entity-bearing queries are paired with actions that can be performed on the entities. For example, given a query for a specific flashlight, we aim to present actions such as reading reviews, watching demo videos, and finding the best price online. In an annotation study conducted over a random sample of user query sessions, we found that a large proportion of queries in query logs involve actions on entities, calling for an automatic approach to identifying relevant actions for entity-bearing queries. In this paper, we pose the problem of finding actions that can be performed on entities as the problem of probabilistic inference in a graphical model that captures how an entity-bearing query is generated. We design models of increasing complexity that capture latent factors such as entity type and intended actions that determine how a user writes a query in a search box, and the URL that they click on. Given a large collection of real-world queries and clicks from a commercial search engine, the models are learned efficiently through maximum likelihood estimation using an EM algorithm. Given a new query, probabilistic inference enables recommendation of a set of pertinent actions and hosts. We propose an evaluation methodology for measuring the relevance of our recommended actions, and show empirical evidence of the quality and the diversity of the discovered actions.

Language Testing | 2010

The utility of article and preposition error correction systems for English language learners: Feedback and assessment

Martin Chodorow; Michael Gamon; Joel R. Tetreault

In this paper, we describe and evaluate two state-of-the-art systems for identifying and correcting writing errors involving English articles and prepositions. Criterion℠, developed by Educational Testing Service, and ESL Assistant, developed by Microsoft Research, both use machine learning techniques to build models of article and preposition usage which enable them to identify errors and suggest corrections to the writer. We evaluated the effects of these systems on users in two studies. In one, Criterion provided feedback about article errors to native and non-native speakers who were writing an essay for a college-level psychology course. The results showed a significant reduction in the number of article errors in the final essays of the non-native speakers. In the second study, ESL Assistant was used by non-native speakers who were composing email messages. The results indicated that users were selective in their choices among the system’s suggested corrections and that, as a result, they were able to increase the proportion of valid corrections by making effective use of feedback.
