Michael Gamon
Microsoft
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Michael Gamon.
intelligent data analysis | 2005
Michael Gamon; Anthony Aue; Simon Corston-Oliver; Eric K. Ringger
We present a prototype system, code-named Pulse, for mining topics and sentiment orientation jointly from free text customer feedback. We describe the application of the prototype system to a database of car reviews. Pulse enables the exploration of large quantities of customer free text. The user can examine customer opinion “at a glance” or explore the data at a finer level of detail. We describe a simple but effective technique for clustering sentences, the application of a bootstrapping approach to sentiment classification, and a novel user-interface.
international conference on computational linguistics | 2004
Michael Gamon
We demonstrate that it is possible to perform automatic sentiment classification in the very noisy domain of customer feedback data. We show that by using large feature vectors in combination with feature reduction, we can train linear support vector machines that achieve high classification accuracy on data that present classification challenges even for a human annotator. We also show that, surprisingly, the addition of deep linguistic analysis features to a set of surface level word n-gram features contributes consistently to classification accuracy in this domain.
meeting of the association for computational linguistics | 2006
Chris Brockett; William B. Dolan; Michael Gamon
This paper presents a pilot study of the use of phrasal Statistical Machine Translation (SMT) techniques to identify and correct writing errors made by learners of English as a Second Language (ESL). Using examples of mass noun errors found in the Chinese Learner Error Corpus (CLEC) to guide creation of an engineered training set, we show that application of the SMT paradigm can capture errors not well addressed by widely-used proofing tools designed for native speakers. Our system was able to correct 61.81% of mistakes in a set of naturally-occurring examples of mass noun errors found on the World Wide Web, suggesting that efforts to collect alignable corpora of pre- and post-editing ESL writing samples offer can enable the development of SMT-based writing assistance tools capable of repairing many of the complex syntactic and lexical problems found in the writing of ESL learners.
international world wide web conferences | 2011
Cristian Danescu-Niculescu-Mizil; Michael Gamon; Susan T. Dumais
The psycholinguistic theory of communication accommodation accounts for the general observation that participants in conversations tend to converge to one anothers communicative behavior: they coordinate in a variety of dimensions including choice of words, syntax, utterance length, pitch and gestures. In its almost forty years of existence, this theory has been empirically supported exclusively through small-scale or controlled laboratory studies. Here we address this phenomenon in the context of Twitter conversations. Undoubtedly, this setting is unlike any other in which accommodation was observed and, thus, challenging to the theory. Its novelty comes not only from its size, but also from the non real-time nature of conversations, from the 140 character length restriction, from the wide variety of social relation types, and from a design that was initially not geared towards conversation at all. Given such constraints, it is not clear a priori whether accommodation is robust enough to occur given the constraints of this new environment. To investigate this, we develop a probabilistic framework that can model accommodation and measure its effects. We apply it to a large Twitter conversational dataset specifically developed for this task. This is the first time the hypothesis of linguistic style accommodation has been examined (and verified) in a large scale, real world setting. Furthermore, when investigating concepts such as stylistic influence and symmetry of accommodation, we discover a complexity of the phenomenon which was never observed before. We also explore the potential relation between stylistic influence and network features commonly associated with social status.
meeting of the association for computational linguistics | 2005
Michael Gamon; Anthony Aue
We describe an extension to the technique for the automatic identification and labeling of sentiment terms described in Turney (2002) and Turney and Littman (2002). Their basic assumption is that sentiment terms of similar orientation tend to co-occur at the document level. We add a second assumption, namely that sentiment terms of opposite orientation tend not to co-occur at the sentence level. This additional assumption allows us to identify sentiment-bearing terms very reliably. We then use these newly identified terms in various scenarios for the sentiment classification of sentences. We show that our approach outperforms Turneys original approach. Combining our approach with a Naive Bayes bootstrapping method yields a further small improvement of classifier performance. We finally compare our results to precision and recall figures that can be obtained on the same data set with labeled data.
empirical methods in natural language processing | 2015
Kristina Toutanova; Danqi Chen; Patrick Pantel; Hoifung Poon; Pallavi Choudhury; Michael Gamon
Models that learn to represent textual and knowledge base relations in the same continuous latent space are able to perform joint inferences among the two kinds of relations and obtain high accuracy on knowledge base completion (Riedel et al., 2013). In this paper we propose a model that captures the compositional structure of textual relations, and jointly optimizes entity, knowledge base, and textual relation representations. The proposed model significantly improves performance over a model that does not share parameters among textual relations with common sub-structure.
empirical methods in natural language processing | 2014
Jianfeng Gao; Patrick Pantel; Michael Gamon; Xiaodong He; Li Deng
This paper presents a deep semantic similarity model (DSSM), a special type of deep neural networks designed for text analysis, for recommending target documents to be of interest to a user based on a source document that she is reading. We observe, identify, and detect naturally occurring signals of interestingness in click transitions on the Web between source and target documents, which we collect from commercial Web browser logs. The DSSM is trained on millions of Web transitions, and maps source-target document pairs to feature vectors in a latent space in such a way that the distance between source documents and their corresponding interesting targets in that space is minimized. The effectiveness of the DSSM is demonstrated using two interestingness tasks: automatic highlighting and contextual entity search. The results on large-scale, real-world datasets show that the semantics of documents are important for modeling interestingness and that the DSSM leads to significant quality improvement on both tasks, outperforming not only the classic document models that do not use semantics but also state-of-the-art topic models.
meeting of the association for computational linguistics | 2001
Simon Corston-Oliver; Michael Gamon; Chris Brockett
We present a machine learning approach to evaluating the well-formedness of output of a machine translation system, using classifiers that learn to distinguish human reference translations from machine translations. This approach can be used to evaluate an MT system, tracking improvements over time; to aid in the kind of failure analysis that can help guide system development; and to select among alternative output strings. The method presented is fully automated and independent of source language, target language and domain.
international world wide web conferences | 2012
Thomas Lin; Patrick Pantel; Michael Gamon; Anitha Kannan; Ariel Fuxman
We introduce an entity-centric search experience, called Active Objects, in which entity-bearing queries are paired with actions that can be performed on the entities. For example, given a query for a specific flashlight, we aim to present actions such as reading reviews, watching demo videos, and finding the best price online. In an annotation study conducted over a random sample of user query sessions, we found that a large proportion of queries in query logs involve actions on entities, calling for an automatic approach to identifying relevant actions for entity-bearing queries. In this paper, we pose the problem of finding actions that can be performed on entities as the problem of probabilistic inference in a graphical model that captures how an entity bearing query is generated. We design models of increasing complexity that capture latent factors such as entity type and intended actions that determine how a user writes a query in a search box, and the URL that they click on. Given a large collection of real-world queries and clicks from a commercial search engine, the models are learned efficiently through maximum likelihood estimation using an EM algorithm. Given a new query, probabilistic inference enables recommendation of a set of pertinent actions and hosts. We propose an evaluation methodology for measuring the relevance of our recommended actions, and show empirical evidence of the quality and the diversity of the discovered actions.
Language Testing | 2010
Martin Chodorow; Michael Gamon; Joel R. Tetreault
In this paper, we describe and evaluate two state-of-the-art systems for identifying and correcting writing errors involving English articles and prepositions. Criterion SM, developed by Educational Testing Service, and ESL Assistant , developed by Microsoft Research, both use machine learning techniques to build models of article and preposition usage which enable them to identify errors and suggest corrections to the writer. We evaluated the effects of these systems on users in two studies. In one, Criterion provided feedback about article errors to native and non-native speakers who were writing an essay for a college-level psychology course. The results showed a significant reduction in the number of article errors in the final essays of the non-native speakers. In the second study, ESL Assistant was used by non-native speakers who were composing email messages. The results indicated that users were selective in their choices among the system’s suggested corrections and that, as a result, they were able to increase the proportion of valid corrections by making effective use of feedback.