Is this you? Create Your Porfile

Greg Durrett

University of California, Berkeley

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Greg Durrett is active.

Explore More

Publication

Featured researches published by Greg Durrett.

international joint conference on natural language processing | 2015

Neural CRF Parsing

Greg Durrett; Daniel Klein

This paper describes a parsing model that combines the exact dynamic programming of CRF parsing with the rich nonlinear featurization of neural net approaches. Our model is structurally a CRF that factors over anchored rule productions, but instead of linear potential functions based on sparse features, we use nonlinear potentials computed via a feedforward neural network. Because potentials are still local to anchored rules, structured inference (CKY) is unchanged from the sparse case. Computing gradients during learning involves backpropagating an error signal formed from standard CRF sufficient statistics (expected rule counts). Using only dense features, our neural CRF already exceeds a strong baseline CRF model (Hall et al., 2014). In combination with sparse features, our system achieves 91.1 F1 on section 23 of the Penn Treebank, and more generally outperforms the best prior single parser results on a range of languages.

meeting of the association for computational linguistics | 2014

Less Grammar, More Features

David Leo Wright Hall; Greg Durrett; Daniel Klein

We present a parser that relies primarily on extracting information directly from surface spans rather than on propagating information through enriched grammar structure. For example, instead of creating separate grammar symbols to mark the definiteness of an NP, our parser might instead capture the same information from the first word of the NP. Moving context out of the grammar and onto surface features can greatly simplify the structural component of the parser: because so many deep syntactic cues have surface reflexes, our system can still parse accurately with context-free backbones as minimal as Xbar grammars. Keeping the structural backbone simple and moving features to the surface also allows easy adaptation to new languages and even to new tasks. On the SPMRL 2013 multilingual constituency parsing shared task (Seddah et al., 2013), our system outperforms the top single parser system of Bjorkelund et al. (2013) on a range of languages. In addition, despite being designed for syntactic analysis, our system also achieves stateof-the-art numbers on the structural sentiment task of Socher et al. (2013). Finally, we show that, in both syntactic parsing and sentiment analysis, many broad linguistic trends can be captured via surface features.

foundations of genetic algorithms | 2011

Computational complexity analysis of simple genetic programming on two problems modeling isolated program semantics

Greg Durrett; Frank Neumann; Una-May O'Reilly

Analyzing the computational complexity of evolutionary algorithms (EAs) for binary search spaces has significantly informed our understanding of EAs in general. With this paper, we start the computational complexity analysis of genetic programming (GP). We set up several simplified GP algorithms and analyze them on two separable model problems, ORDER and MAJORITY, each of which captures a relevant facet of typical GP problems. Both analyses give first rigorous insights into aspects of GP design, highlighting in particular the impact of accepting or rejecting neutral moves and the importance of a local mutation operator.

north american chapter of the association for computational linguistics | 2016

Capturing Semantic Similarity for Entity Linking with Convolutional Neural Networks

Matthew Francis-Landau; Greg Durrett; Daniel Klein

A key challenge in entity linking is making effective use of contextual information to disambiguate mentions that might refer to different entities in different contexts. We present a model that uses convolutional neural networks to capture semantic correspondence between a mentions context and a proposed target entity. These convolutional networks operate at multiple granularities to exploit various kinds of topic information, and their rich parameterization gives them the capacity to learn which n-grams characterize different topics. We combine these networks with a sparse linear model to achieve state-of-the-art performance on multiple entity linking datasets, outperforming the prior systems of Durrett and Klein (2014) and Nguyen et al. (2014).

meeting of the association for computational linguistics | 2016

Learning-Based Single-Document Summarization with Compression and Anaphoricity Constraints.

Greg Durrett; Taylor Berg-Kirkpatrick; Daniel Klein

We present a discriminative model for single-document summarization that integrally combines compression and anaphoricity constraints. Our model selects textual units to include in the summary based on a rich set of sparse features whose weights are learned on a large corpus. We allow for the deletion of content within a sentence when that deletion is licensed by compression rules; in our framework, these are implemented as dependencies between subsentential units of text. Anaphoricity constraints then improve cross-sentence coherence by guaranteeing that, for each pronoun included in the summary, the pronouns antecedent is included as well or the pronoun is rewritten as a full mention. When trained end-to-end, our final system outperforms prior work on both ROUGE as well as on human judgments of linguistic quality.

international world wide web conferences | 2017

Tools for Automated Analysis of Cybercriminal Markets

Rebecca S. Portnoff; Sadia Afroz; Greg Durrett; Jonathan K. Kummerfeld; Taylor Berg-Kirkpatrick; Damon McCoy; Kirill Levchenko; Vern Paxson

Underground forums are widely used by criminals to buy and sell a host of stolen items, datasets, resources, and criminal services. These forums contain important resources for understanding cybercrime. However, the number of forums, their size, and the domain expertise required to understand the markets makes manual exploration of these forums unscalable. In this work, we propose an automated, top-down approach for analyzing underground forums. Our approach uses natural language processing and machine learning to automatically generate high-level information about underground forums, first identifying posts related to transactions, and then extracting products and prices. We also demonstrate, via a pair of case studies, how an analyst can use these automated approaches to investigate other categories of products and transactions. We use eight distinct forums to assess our tools: Antichat, Blackhat World, Carders, Darkode, Hack Forums, Hell, L33tCrew and Nulled. Our automated approach is fast and accurate, achieving over 80% accuracy in detecting post category, product, and prices.

north american chapter of the association for computational linguistics | 2015

Disfluency Detection with a Semi-Markov Model and Prosodic Features

James Ferguson; Greg Durrett; Daniel Klein

We present a discriminative model for detecting disfluencies in spoken language transcripts. Structurally, our model is a semiMarkov conditional random field with features targeting characteristics unique to speech repairs. This gives a significant performance improvement over standard chain-structured CRFs that have been employed in past work. We then incorporate prosodic features over silences and relative word duration into our semi-CRF model, resulting in further performance gains; moreover, these features are not easily replaced by discrete prosodic indicators such as ToBI breaks. Our final system, the semi-CRF with prosodic information, achieves an F-score of 85.4, which is 1.3 F1 better than the best prior reported F-score on this dataset.

european conference on evolutionary computation in combinatorial optimization | 2010

A genetic algorithm to minimize chromatic entropy

Greg Durrett; Muriel Médard; Una-May O'Reilly

We present an algorithmic approach to solving the problem of chromatic entropy, a combinatorial optimization problem related to graph coloring. This problem is a component in algorithms for optimizing data compression when computing a function of two correlated sources at a receiver. Our genetic algorithm for minimizing chromatic entropy uses an order-based genome inspired by graph coloring genetic algorithms, as well as some problem-specific heuristics. It performs consistently well on synthetic instances, and for an expositional set of functional compression problems, the GA routinely finds a compression scheme that is 20-30% more efficient than that given by a reference compression algorithm.

empirical methods in natural language processing | 2013