Adam Lally
IBM
Publications
Featured research published by Adam Lally.
Natural Language Engineering | 2004
David A. Ferrucci; Adam Lally
IBM Research has over 200 people working on Unstructured Information Management (UIM) technologies with a strong focus on Natural Language Processing (NLP). These researchers are engaged in activities ranging from natural language dialog, information retrieval, topic tracking, named-entity detection, document classification and machine translation to bioinformatics and open-domain question answering. An analysis of these activities strongly suggested that improving the organization's ability to quickly discover each other's results and rapidly combine different technologies and approaches would accelerate scientific advance. Furthermore, the ability to reuse and combine results through a common architecture and a robust software framework would accelerate the transfer of research results in NLP into IBM's product platforms. Market analyses indicating a growing need to process unstructured information, specifically multilingual, natural language text, coupled with IBM Research's investment in NLP, led to the development of a middleware architecture for processing unstructured information dubbed UIMA. At the heart of UIMA are powerful search capabilities and a data-driven framework for the development, composition and distributed deployment of analysis engines. In this paper we give a general introduction to UIMA, focusing on the design points of its analysis engine architecture, and we discuss how UIMA is helping to accelerate research and technology transfer.
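To make the analysis-engine concept concrete, here is a minimal annotator written against the Apache UIMA Java SDK. The capitalized-word rule is a toy example of our own, not from the paper; a real engine would declare a custom annotation type in its type-system descriptor.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.apache.uima.analysis_component.JCasAnnotator_ImplBase;
import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
import org.apache.uima.jcas.JCas;
import org.apache.uima.jcas.tcas.Annotation;

// Toy analysis engine: marks every capitalized word in the document
// text with a generic Annotation. Real annotators add instances of
// custom types declared in a type-system descriptor.
public class CapitalizedWordAnnotator extends JCasAnnotator_ImplBase {

  private static final Pattern CAP_WORD = Pattern.compile("\\b[A-Z][a-z]+\\b");

  @Override
  public void process(JCas jcas) throws AnalysisEngineProcessException {
    Matcher m = CAP_WORD.matcher(jcas.getDocumentText());
    while (m.find()) {
      // Annotations carry begin/end offsets into the CAS document text.
      Annotation a = new Annotation(jcas, m.start(), m.end());
      a.addToIndexes(); // makes the result visible to downstream engines
    }
  }
}
```

Because the engine only reads the CAS document text and writes annotations back into the CAS indexes, it can be composed with other engines or deployed remotely without code changes, which is the data-driven composition point the abstract emphasizes.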
IBM Systems Journal | 2004
David A. Ferrucci; Adam Lally
IBM's Unstructured Information Management Architecture (UIMA) is a software architecture for developing and deploying unstructured information management (UIM) applications. In this paper we provide a high-level overview of the architecture, introduce its basic components, and describe the set of tools that constitute a UIMA development framework. Then we take the reader through the steps involved in building a simple UIM application, thus highlighting the major UIMA concepts and techniques.
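As a rough sketch of those steps in the Apache UIMA Java SDK: instantiate an engine from its XML descriptor, feed it a document through a CAS, and read back the annotations. The descriptor file name below is a placeholder.

```java
import org.apache.uima.UIMAFramework;
import org.apache.uima.analysis_engine.AnalysisEngine;
import org.apache.uima.cas.FSIterator;
import org.apache.uima.jcas.JCas;
import org.apache.uima.jcas.tcas.Annotation;
import org.apache.uima.resource.ResourceSpecifier;
import org.apache.uima.util.XMLInputSource;

public class RunPipeline {
  public static void main(String[] args) throws Exception {
    // Parse the engine's XML descriptor and instantiate the engine.
    XMLInputSource in = new XMLInputSource("CapitalizedWordAnnotator.xml");
    ResourceSpecifier spec = UIMAFramework.getXMLParser().parseResourceSpecifier(in);
    AnalysisEngine ae = UIMAFramework.produceAnalysisEngine(spec);

    // The CAS (Common Analysis Structure) carries the document and results.
    JCas jcas = ae.newJCas();
    jcas.setDocumentText("Adam Lally works at IBM Research.");
    ae.process(jcas);

    // Iterate whatever annotations the engine added to the CAS indexes.
    FSIterator<Annotation> it = jcas.getAnnotationIndex().iterator();
    while (it.hasNext()) {
      Annotation a = it.next();
      System.out.println(a.getType().getShortName() + ": " + a.getCoveredText());
    }
    ae.destroy();
  }
}
```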
IBM Journal of Research and Development | 2012
Adam Lally; John M. Prager; Michael C. McCord; Branimir Boguraev; Siddharth Patwardhan; James Fan; Paul Fodor; Jennifer Chu-Carroll
The first stage of processing in the IBM Watson™ system is to perform a detailed analysis of the question in order to determine what it is asking for and how best to approach answering it. Question analysis uses Watson's parsing and semantic analysis capabilities: a deep Slot Grammar parser, a named entity recognizer, a co-reference resolution component, and a relation extraction component. We apply numerous detection rules and classifiers using features from this analysis to detect critical elements of the question, including: 1) the part of the question that is a reference to the answer (the focus); 2) terms in the question that indicate what type of entity is being asked for (lexical answer types); 3) a classification of the question into one or more of several broad types; and 4) elements of the question that play particular roles that may require special handling, for example, nested subquestions that must be separately answered. We describe how these elements are detected and evaluate the impact of accurate detection on our end-to-end question-answering system accuracy.
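As a purely illustrative sketch (not Watson's actual code) of the kind of pattern-based detection rule the paper describes: in Jeopardy!-style clues the focus is often a noun phrase such as "this X", and its head noun is a candidate lexical answer type. The example clue also shows why the deep parse the paper relies on beats a surface pattern.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of a LAT detection rule: grab the word
// after "this"/"these" as the lexical answer type.
public class QuestionAnalysisSketch {

  // e.g. "This country ..." -> focus head (and candidate LAT) = "country"
  private static final Pattern THIS_FOCUS =
      Pattern.compile("\\b[Tt]h(?:is|ese) ([a-z]+)\\b");

  static String detectLat(String question) {
    Matcher m = THIS_FOCUS.matcher(question);
    return m.find() ? m.group(1) : null;
  }

  public static void main(String[] args) {
    String clue = "This country singer was imprisoned for robbery "
                + "and in 1972 was pardoned by Ronald Reagan.";
    // Prints "country", but the true LAT is "singer" ("country" is a
    // modifier here) -- exactly the ambiguity that motivates using a
    // full parse rather than surface patterns.
    System.out.println("LAT: " + detectLat(clue));
  }
}
```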
IBM Journal of Research and Development | 2012
David Gondek; Adam Lally; Aditya Kalyanpur; James W. Murdock; P. A. Duboue; Lixin Zhang; Yue Pan; Z. M. Qiu; Chris Welty
The final stage in the IBM DeepQA pipeline involves ranking all candidate answers according to their evidence scores and judging the likelihood that each candidate answer is correct. In DeepQA, this is done using a machine learning framework that is phase-based, providing capabilities for manipulating the data and applying machine learning in successive applications. We show how this design can be used to implement solutions to particular challenges that arise in applying machine learning for evidence-based hypothesis evaluation. Our approach facilitates an agile development environment for DeepQA; evidence scoring strategies can be easily introduced, revised, and reconfigured without the need for error-prone manual effort to determine how to combine the various evidence scores. We describe the framework, explain the challenges, and evaluate the gain over a baseline machine learning approach.
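A hypothetical sketch of the phase-based idea, with all names our own rather than DeepQA's: each phase scores every candidate with a trained model, and that score is fed back in as a feature for the next phase, so later phases can learn how much to trust earlier ones.

```java
import java.util.List;
import java.util.Map;

// All types here are illustrative stand-ins, not DeepQA's API.
interface RankingModel {
  double score(Map<String, Double> features); // e.g. logistic regression
}

class Candidate {
  final String answer;
  final Map<String, Double> features; // evidence scores for this answer
  double confidence;
  Candidate(String answer, Map<String, Double> features) {
    this.answer = answer;
    this.features = features;
  }
}

class PhasedRanker {
  private final List<RankingModel> phases;
  PhasedRanker(List<RankingModel> phases) { this.phases = phases; }

  void rank(List<Candidate> candidates) {
    for (int i = 0; i < phases.size(); i++) {
      for (Candidate c : candidates) {
        c.confidence = phases.get(i).score(c.features);
        // The confidence from phase i becomes a feature for phase i + 1,
        // so new evidence scorers can be added without hand-tuned weights.
        c.features.put("phase" + i + "Confidence", c.confidence);
      }
    }
    // Final ranking: highest last-phase confidence first.
    candidates.sort((a, b) -> Double.compare(b.confidence, a.confidence));
  }
}
```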
IBM Journal of Research and Development | 2012
Aditya Kalyanpur; Branimir Boguraev; Siddharth Patwardhan; James W. Murdock; Adam Lally; Chris Welty; John M. Prager; B. Coppola; Achille B. Fokoue-Nkoutche; Lixin Zhang; Yue Pan; Z. M. Qiu
Although the majority of evidence analysis in DeepQA is focused on unstructured information (e.g., natural-language documents), several components in the DeepQA system use structured data (e.g., databases, knowledge bases, and ontologies) to generate potential candidate answers or find additional evidence. Structured data analytics are a natural complement to unstructured methods in that they typically cover a narrower range of questions but are more precise within that range. Moreover, structured data that has formal semantics is amenable to logical reasoning techniques that can be used to provide implicit evidence. The DeepQA system does not contain a single monolithic structured data module; instead, it allows for different components to use and integrate structured and semistructured data, with varying degrees of expressivity and formal specificity. This paper is a survey of DeepQA components that use structured data. Areas in which evidence from structured sources has the most impact include typing of answers, application of geospatial and temporal constraints, and the use of formally encoded a priori knowledge of commonly appearing entity types such as countries and U.S. presidents. We present details of appropriate components and demonstrate their end-to-end impact on the IBM Watson™ system.
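As a hypothetical illustration of answer typing against structured sources (the in-memory "knowledge base" and the specific score values are ours, not DeepQA's): look up a candidate's types and score agreement with the question's lexical answer type, treating the absence of structured data as neutral evidence.

```java
import java.util.Map;
import java.util.Set;

// Toy stand-in for typing a candidate answer against a knowledge base.
public class TypeCoercionSketch {

  private final Map<String, Set<String>> kbTypes; // entity -> known types
  TypeCoercionSketch(Map<String, Set<String>> kbTypes) { this.kbTypes = kbTypes; }

  // 1.0 on a type match, 0.0 on a known mismatch, 0.5 if the KB is silent:
  // structured evidence can argue for or against a candidate, or abstain.
  double typeMatchScore(String candidate, String lat) {
    Set<String> types = kbTypes.get(candidate);
    if (types == null) return 0.5;
    return types.contains(lat) ? 1.0 : 0.0;
  }

  public static void main(String[] args) {
    TypeCoercionSketch tc = new TypeCoercionSketch(Map.of(
        "Chile", Set.of("country"),
        "Abraham Lincoln", Set.of("president", "person")));
    System.out.println(tc.typeMatchScore("Chile", "country"));   // 1.0
    System.out.println(tc.typeMatchScore("Chile", "president")); // 0.0
  }
}
```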
IBM Journal of Research and Development | 2012
James W. Murdock; James Fan; Adam Lally; Hideki Shima; Branimir Boguraev
One useful source of evidence for evaluating a candidate answer to a question is a passage that contains the candidate answer and is relevant to the question. In the DeepQA pipeline, we retrieve passages using a novel technique that we call Supporting Evidence Retrieval, in which we perform separate search queries for each candidate answer, in parallel, and include the candidate answer as part of the query. We then score these passages using an assortment of algorithms that use different aspects and relationships of the terms in the question and passage. We provide evidence that our mechanisms for obtaining and scoring passages have a substantial impact on the ability of our question-answering system to answer questions and judge the confidence of the answers.
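A hedged sketch of the retrieval pattern described, with search() standing in for whatever passage-search engine sits behind it: one query per candidate answer, issued in parallel, with the candidate answer embedded in the query itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SupportingEvidenceRetrieval {

  // Placeholder for a real passage-search call.
  static List<String> search(String query) {
    return List.of();
  }

  static List<List<String>> retrieve(String questionQuery, List<String> candidates)
      throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(8);
    List<Future<List<String>>> futures = new ArrayList<>();
    for (String candidate : candidates) {
      // The candidate answer becomes part of the query, so the passages
      // retrieved are ones that mention it in the question's context.
      Callable<List<String>> task =
          () -> search(questionQuery + " \"" + candidate + "\"");
      futures.add(pool.submit(task)); // one search per candidate, in parallel
    }
    List<List<String>> passages = new ArrayList<>();
    for (Future<List<String>> f : futures) passages.add(f.get());
    pool.shutdown();
    return passages;
  }
}
```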
IBM Journal of Research and Development | 2012
Edward A. Epstein; Marshall I. Schor; Bhavani S. Iyer; Adam Lally; Eric W. Brown; Jaroslaw Cwiklik
IBM Watson™ is a system created to demonstrate DeepQA technology by competing against human champions in a question-answering game designed for people. The DeepQA architecture was designed to be massively parallel, with an expectation that low latency response times could be achieved by doing parallel computation on many computers. This paper describes how a large set of deep natural-language processing programs were integrated into a single application, scaled out across thousands of central processing unit cores, and optimized to run fast enough to compete in live Jeopardy!™ games.
AI Magazine | 2017
Adam Lally; Sugato Bagchi; Michael A. Barborak; David W. Buchanan; Jennifer Chu-Carroll; David A. Ferrucci; Michael R. Glass; Aditya Kalyanpur; Erik T. Mueller; J. William Murdock; Siddharth Patwardhan; John M. Prager
We present WatsonPaths, a novel system that can answer scenario-based questions, such as medical questions that present a patient summary and ask for the most likely diagnosis or most appropriate treatment. Building on the IBM Watson question-answering system, WatsonPaths breaks down the input scenario into individual pieces of information, asks relevant subquestions of Watson to conclude new information, and represents these results in a graphical model. Probabilistic inference is performed over the graph to conclude the answer. On a set of medical test preparation questions, WatsonPaths shows a significant improvement in accuracy over multiple baselines.
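As an illustrative fragment of the inference step (the real system performs probabilistic inference over a full graphical model; the noisy-OR combination here is a simplification of ours): each subquestion answer contributes an edge confidence toward a candidate conclusion, and independent weak support accumulates.

```java
import java.util.List;

public class ScenarioInferenceSketch {

  // Noisy-OR: P(answer) = 1 - prod(1 - c_i) over supporting-edge
  // confidences, so several weak, independent pieces of evidence can
  // jointly make a conclusion likely.
  static double noisyOr(List<Double> edgeConfidences) {
    double pNone = 1.0;
    for (double c : edgeConfidences) pNone *= (1.0 - c);
    return 1.0 - pNone;
  }

  public static void main(String[] args) {
    // Two subquestion results each weakly support a diagnosis;
    // together they make it fairly likely.
    System.out.println(noisyOr(List.of(0.5, 0.6))); // 0.8
  }
}
```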
North American Chapter of the Association for Computational Linguistics | 2003
David A. Ferrucci; Adam Lally
IBM Research has over 200 people working on Unstructured Information Management (UIM) technologies with a strong focus on HLT. Spread out over the globe, they are engaged in activities ranging from natural language dialog to machine translation to bioinformatics to open-domain question answering. An analysis of these activities strongly suggested that improving the organization's ability to quickly discover each other's results and rapidly combine different technologies and approaches would accelerate scientific advance. Furthermore, the ability to reuse and combine results through a common architecture and a robust software framework would accelerate the transfer of research results in HLT into IBM's product platforms. Market analyses indicating a growing need to process unstructured information, specifically multilingual, natural language text, coupled with IBM Research's investment in HLT, led to the development of a middleware architecture for processing unstructured information dubbed UIMA. At the heart of UIMA are powerful search capabilities and a data-driven framework for the development, composition and distributed deployment of analysis engines. In this paper we give a general introduction to UIMA, focusing on the design points of its analysis engine architecture, and we discuss how UIMA is helping to accelerate research and technology transfer.
Conference on Information and Knowledge Management | 2011
Aditya Kalyanpur; Siddharth Patwardhan; Branimir Boguraev; Adam Lally; Jennifer Chu-Carroll
Factoid questions often contain one or more assertions (facts) about their answers. However, existing question-answering (QA) systems have not investigated how these multiple facts may be leveraged to enhance system performance. We argue that decomposing complex factoid questions can benefit QA, as an answer candidate is more likely to be correct if multiple independent facts support it. We categorize decomposable questions as parallel or nested, depending on the processing strategy required. We present a novel decomposition framework, covering both parallel and nested questions, that can be overlaid on top of traditional QA systems. It contains decomposition rules for identifying fact sub-questions, a question-rewriting component, and a candidate re-ranker. In a particularly challenging domain for our baseline QA system, our framework shows a statistically significant improvement in end-to-end QA performance.
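A hypothetical sketch of the parallel strategy, with names and the additive scoring rule ours rather than the paper's exact method: answer each fact sub-question independently, then re-rank by pooling each candidate's support across sub-questions.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ParallelDecompositionSketch {

  // Each inner map holds one sub-question's candidate -> confidence scores.
  static Map<String, Double> rerank(List<Map<String, Double>> subQuestionScores) {
    Map<String, Double> combined = new HashMap<>();
    for (Map<String, Double> scores : subQuestionScores) {
      for (Map.Entry<String, Double> e : scores.entrySet()) {
        // Pool evidence: a candidate supported by multiple independent
        // facts outranks one supported by a single fact.
        combined.merge(e.getKey(), e.getValue(), Double::sum);
      }
    }
    return combined;
  }

  public static void main(String[] args) {
    Map<String, Double> sub1 = Map.of("Nero", 0.6, "Caligula", 0.7);
    Map<String, Double> sub2 = Map.of("Nero", 0.5, "Hadrian", 0.4);
    // "Nero" wins with 1.1: supported by both fact sub-questions.
    System.out.println(rerank(List.of(sub1, sub2)));
  }
}
```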