Matthew Honnibal
University of Sydney
Publications
Featured research published by Matthew Honnibal.
Artificial Intelligence | 2013
Ben Hachey; Will Radford; Joel Nothman; Matthew Honnibal; James R. Curran
Named Entity Linking (NEL) grounds entity mentions to their corresponding node in a Knowledge Base (KB). Recently, a number of systems have been proposed for linking entity mentions in text to Wikipedia pages. Such systems typically search for candidate entities and then disambiguate them, returning either the best candidate or NIL. However, comparison has focused on disambiguation accuracy, making it difficult to determine how search impacts performance. Furthermore, important approaches from the literature have not been systematically compared on standard data sets. We reimplement three seminal NEL systems and present a detailed evaluation of search strategies. Our experiments find that coreference and acronym handling lead to substantial improvement, and search strategies account for much of the variation between systems. This is an interesting finding, because these aspects of the problem have often been neglected in the literature, which has focused largely on complex candidate ranking algorithms.
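The search-then-disambiguate pipeline the abstract describes can be sketched in a few lines. This is an illustrative toy, not one of the reimplemented systems: the alias table and entity priors below are invented stand-ins for the candidate-search and ranking resources a real linker would use.

```python
# Toy sketch of a search-then-disambiguate NEL pipeline.
# ALIAS_TABLE and ENTITY_PRIOR are invented illustration data, not real KB resources.
ALIAS_TABLE = {
    "sydney": ["Sydney", "University_of_Sydney", "Sydney_FC"],
    "university of sydney": ["University_of_Sydney"],
}
ENTITY_PRIOR = {"Sydney": 0.7, "University_of_Sydney": 0.2, "Sydney_FC": 0.1}

def search(mention):
    """Candidate search: look the mention up in an alias dictionary."""
    return ALIAS_TABLE.get(mention.lower(), [])

def disambiguate(candidates):
    """Rank candidates (here by a popularity prior) or return NIL."""
    if not candidates:
        return "NIL"
    return max(candidates, key=lambda e: ENTITY_PRIOR.get(e, 0.0))

def link(mention):
    """Full pipeline: search, then disambiguate, returning best candidate or NIL."""
    return disambiguate(search(mention))
```

As the abstract argues, the `search` step alone already determines much of what the linker can get right: a mention missing from the alias table is unlinkable no matter how good the ranking is.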
Empirical Methods in Natural Language Processing | 2015
Matthew Honnibal; Mark Johnson
Transition-based dependency parsers usually use transition systems that monotonically extend partial parse states until they identify a complete parse tree. Honnibal et al. (2013) showed that greedy one-best parsing accuracy can be improved by adding additional non-monotonic transitions that permit the parser to “repair” earlier parsing mistakes by “over-writing” earlier parsing decisions. This increases the size of the set of complete parse trees that each partial parse state can derive, enabling such a parser to escape the “garden paths” that can trap monotonic greedy transition-based dependency parsers. We describe a new set of non-monotonic transitions that permits a partial parse state to derive a larger set of completed parse trees than previous work, which allows our parser to escape from a larger set of garden paths. A parser with our new non-monotonic transition system has 91.85% directed attachment accuracy, an improvement of 0.6% over a comparable parser using the standard monotonic arc-eager transitions.
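For contrast with the non-monotonic system the abstract proposes, the standard monotonic arc-eager baseline it compares against can be sketched as follows. This is a minimal state machine only; the scoring model and oracle are omitted, and transitions are applied by name rather than predicted.

```python
# Minimal monotonic arc-eager transition system (the baseline discussed above).
# Transitions only ever add arcs, so an early mistake can never be repaired:
# exactly the "garden path" behaviour the non-monotonic transitions address.

class ArcEagerState:
    def __init__(self, n_words):
        self.stack = [0]                          # 0 is the artificial ROOT token
        self.buffer = list(range(1, n_words + 1)) # word indices 1..n
        self.heads = {}                           # dependent index -> head index

    def shift(self):
        """Move the front of the buffer onto the stack."""
        self.stack.append(self.buffer.pop(0))

    def left_arc(self):
        """Stack top becomes a dependent of the buffer front, and is popped."""
        dep = self.stack.pop()
        self.heads[dep] = self.buffer[0]

    def right_arc(self):
        """Buffer front becomes a dependent of the stack top, and is pushed."""
        dep = self.buffer.pop(0)
        self.heads[dep] = self.stack[-1]
        self.stack.append(dep)

    def reduce(self):
        """Pop the stack top once it already has a head."""
        self.stack.pop()
```

For a three-word sentence like “She eats fish”, the sequence shift, left_arc, right_arc, right_arc yields heads {1: 2, 2: 0, 3: 2}: “She” and “fish” attach to “eats”, which attaches to ROOT. A non-monotonic system adds transitions that can overwrite entries in `heads`, which this monotonic sketch cannot do.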
Proceedings of the 2009 Workshop on The People's Web Meets NLP: Collaboratively Constructed Semantic Resources | 2009
Matthew Honnibal; Joel Nothman; James R. Curran
The vast majority of parser evaluation is conducted on the 1989 Wall Street Journal (WSJ). In-domain evaluation of this kind is important for system development, but gives little indication about how the parser will perform on many practical problems. Wikipedia is an interesting domain for parsing that has so far been under-explored. We present statistical parsing results that for the first time provide information about what sort of performance a user parsing Wikipedia text can expect. We find that the C&C parser's standard model is 4.3% less accurate on Wikipedia text, but that a simple self-training exercise reduces the gap to 3.8%. The self-training also speeds up the parser on newswire text by 20%.
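The self-training recipe the abstract mentions is simple enough to sketch schematically: parse unlabelled target-domain text with the current model, then retrain on the original treebank plus the parser's own output. The `train_fn` and `parse_fn` callables below are hypothetical stand-ins, not the C&C training or parsing interface.

```python
# Schematic one-round self-training loop, as a sketch of the exercise described
# above. train_fn and parse_fn are hypothetical placeholders for a real
# parser's training and parsing routines.

def self_train(train_fn, parse_fn, seed_treebank, unlabelled_sentences):
    """Train a seed model, auto-parse new-domain text, retrain on the union."""
    model = train_fn(seed_treebank)                  # e.g. a WSJ-trained model
    auto_parses = [parse_fn(model, s) for s in unlabelled_sentences]
    return train_fn(seed_treebank + auto_parses)     # retrain on combined data
```

The appeal of the method is that it needs no new annotation: the only extra input is raw text from the target domain (here, Wikipedia).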
Meeting of the Association for Computational Linguistics | 2007
Matthew Honnibal; James R. Curran
The lack of a large annotated systemic functional grammar (SFG) corpus has posed a significant challenge for the development of the theory. Automating SFG annotation is challenging because the theory uses a minimal constituency model, allocating as much of the work as possible to a set of hierarchically organised features. In this paper we show that despite the unorthodox organisation of SFG, adapting existing resources remains the most practical way to create an annotated corpus. We present and analyse SFGBank, an automated conversion of the Penn Treebank into systemic functional grammar. The corpus is comparable to those available for other linguistic theories, offering many opportunities for new research.
Empirical Methods in Natural Language Processing | 2009
Matthew Honnibal; James R. Curran
We introduce an extension to CCG that allows form and function to be represented simultaneously, reducing the proliferation of modifier categories seen in standard CCG analyses. We can then remove the non-combinatory rules CCGbank uses to address this problem, producing a grammar that is fully lexicalised and far less ambiguous. There are intrinsic benefits to full lexicalisation, such as semantic transparency and simpler domain adaptation. The clearest advantage is a 52–88% improvement in parse speeds, which comes with only a small reduction in accuracy.
International World Wide Web Conferences | 2015
Will Radford; Daniel Tse; Joel Nothman; Ben Hachey; George Wright; James R. Curran; Will Cannings; Timothy O'Keefe; Matthew Honnibal; David Vadas; Candice Loxley
We report on a four year academic research project to build a natural language processing platform in support of a large media company. The Computable News platform processes news stories, producing a layer of structured data that can be used to build rich applications. We describe the underlying platform and the research tasks that we explored building it. The platform supports a wide range of prototype applications designed to support different newsroom functions. We hope that this qualitative review provides some insight into the challenges involved in this type of project.
Empirical Methods in Natural Language Processing | 2012
Timothy O'Keefe; Silvia Pareti; James R. Curran; Irena Koprinska; Matthew Honnibal
Conference on Computational Natural Language Learning | 2013
Matthew Honnibal; Yoav Goldberg; Mark Johnson
Transactions of the Association for Computational Linguistics | 2014
Matthew Honnibal; Mark Johnson
Meeting of the Association for Computational Linguistics | 2010
Matthew Honnibal; James R. Curran; Johan Bos