Gus Hahn-Powell
University of Arizona
Publication
Featured research published by Gus Hahn-Powell.
meeting of the association for computational linguistics | 2015
Marco Antonio Valenzuela-Escárcega; Gus Hahn-Powell; Mihai Surdeanu; Thomas Hicks
We describe the design, development, and API of ODIN (Open Domain INformer), a domain-independent, rule-based event extraction (EE) framework. The proposed EE approach is: simple (most events are captured with simple lexico-syntactic patterns), powerful (the language can capture complex constructs, such as events taking other events as arguments, and regular expressions over syntactic graphs), robust (to recover from syntactic parsing errors, syntactic patterns can be freely mixed with surface, token-based patterns), and fast (the runtime environment processes 110 sentences/second in a real-world domain with a grammar of over 200 rules). We used this framework to develop a grammar for the biochemical domain, which approached human performance. Our EE framework is accompanied by a web-based user interface for the rapid development of event grammars and visualization of matches. The ODIN framework and the domain-specific grammars are available as open-source code.
Database | 2018
Marco Antonio Valenzuela-Escárcega; Özgün Babur; Gus Hahn-Powell; Dane Bell; Thomas Hicks; Enrique Noriega-Atala; Xia Wang; Mihai Surdeanu; Emek Demir; Clayton T. Morrison
PubMed, a repository and search engine for biomedical literature, now indexes >1 million articles each year. This exceeds the processing capacity of human domain experts, limiting our ability to truly understand many diseases. We present Reach, a system for automated, large-scale machine reading of biomedical papers that can extract mechanistic descriptions of biological processes with relatively high precision at high throughput. We demonstrate that combining the extracted pathway fragments with existing biological data analysis algorithms that rely on curated models helps identify and explain a large number of previously unidentified mutually exclusive altered signaling pathways in seven different cancer types. This work shows that combining human-curated ‘big mechanisms’ with extracted ‘big data’ can lead to a causal, predictive understanding of cellular processes and unlock important downstream applications.
meeting of the association for computational linguistics | 2017
Gus Hahn-Powell; Marco Antonio Valenzuela-Escárcega; Mihai Surdeanu
We introduce a modular approach for literature-based discovery consisting of a machine reading and knowledge assembly component that together produce a graph of influence relations (e.g., “A promotes B”) from a collection of publications. A search engine is used to explore direct and indirect influence chains. Query results are substantiated with textual evidence, ranked according to their relevance, and presented in both a table-based view and a network graph visualization. Our approach operates in both domain-specific settings, where there are knowledge bases and ontologies available to guide reading, and in multi-domain settings where such resources are absent. We demonstrate that this deep reading and search system reduces the effort needed to uncover “undiscovered public knowledge”, and that with the aid of this tool a domain expert was able to drastically reduce her model building time from months to two days.
meeting of the association for computational linguistics | 2016
Gus Hahn-Powell; Dane Bell; Marco Antonio Valenzuela-Escárcega; Mihai Surdeanu
Causal precedence between biochemical interactions is crucial in the biomedical domain, because it transforms collections of individual interactions, e.g., bindings and phosphorylations, into the causal mechanisms needed to inform meaningful search and inference. Here, we analyze causal precedence in the biomedical domain as distinct from open-domain, temporal precedence. First, we describe a novel, hand-annotated text corpus of causal precedence in the biomedical domain. Second, we use this corpus to investigate a battery of models of precedence, covering rule-based, feature-based, and latent representation models. The highest-performing individual model achieved a micro F1 of 43 points, approaching the best performers on the simpler temporal-only precedence tasks. Feature-based and latent representation models each outperform the rule-based models, but their performance is complementary to one another. We apply a sieve-based architecture to capitalize on this lack of overlap, achieving a micro F1 score of 46 points.
language resources and evaluation | 2016
Marco Antonio Valenzuela-Escárcega; Gus Hahn-Powell; Mihai Surdeanu
arXiv: Computation and Language | 2015
Marco Antonio Valenzuela-Escárcega; Gus Hahn-Powell; Mihai Surdeanu
meeting of the association for computational linguistics | 2016
Marco Antonio Valenzuela-Escárcega; Gus Hahn-Powell; Dane Bell; Mihai Surdeanu
language resources and evaluation | 2016
Dane Bell; Gus Hahn-Powell; Marco Antonio Valenzuela-Escárcega; Mihai Surdeanu
language resources and evaluation | 2016
Dane Bell; Gus Hahn-Powell; Marco Antonio Valenzuela-Escárcega; Mihai Surdeanu
north american chapter of the association for computational linguistics | 2018
Fan Luo; Marco Antonio Valenzuela-Escárcega; Gus Hahn-Powell; Mihai Surdeanu