
Publication


Featured research published by Chris Welty.


Archive | 2011

The Semantic Web - ISWC 2011 - 10th International Semantic Web Conference, Bonn, Germany, October 23-27, 2011, Proceedings, Part I

Lora Aroyo; Chris Welty; Harith Alani; Jamie Taylor; Abraham Bernstein; Lalana Kagal; Natasha Noy; Eva Blomqvist



Lecture Notes in Computer Science | 2013

The Semantic Web - ISWC 2013

Harith Alani; Lalana Kagal; Achille Fokoue; Paul T. Groth; Chris Biemann; Josiane Xavier Parreira; Lora Aroyo; Natasha Noy; Chris Welty; Krzysztof Janowicz

As collaborative, or network, science spreads into more science, engineering, and medical fields, both the participants and their funders have expressed a very strong desire for highly functional data and information capabilities that are a) easy to use, b) integrated in a variety of ways, c) able to leverage prior investments and keep pace with rapid technical change, and d) not expensive or time-consuming to build or maintain. In response, and based on our accumulated experience over the last decade and the maturing of several key semantic web approaches, we have adapted, extended, and integrated several open-source applications and frameworks that handle major portions of the functionality for these platforms. At minimum, these functions include an object-type repository, collaboration tools, an ability to identify and manage all key entities in the platform, and an integrated portal to manage diverse content and applications, with varied access levels and privacy options.

At the same time, there is increasing attention to how researchers present and explain results based on the interpretation of increasingly diverse and heterogeneous data and information sources. With the renewed emphasis on good data practices, informatics practitioners have responded to this challenge with maturing informatics-based approaches. These approaches include, but are not limited to, use-case development; information modeling and architectures; elaborating vocabularies; mediating interfaces to data and related services on the Web; and traceable provenance.

The current era of data-intensive research presents numerous challenges to both individuals and research teams. In environmental science especially, sub-fields that were data-poor are becoming data-rich (in volume, type, and mode), while some that were largely model- or simulation-driven are now dramatically shifting to data-driven or at least to data-model assimilation approaches. These paradigm shifts make it very hard for researchers used to one mode to shift to another, let alone produce products of their work that are usable or understandable by non-specialists. However, it is exactly at these frontiers where much of the exciting environmental science needs to be performed and appreciated.
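
To make the "vocabularies plus traceable provenance" ingredient above a little more concrete, here is a minimal sketch of recording lineage for a derived data product with rdflib and the W3C PROV-O vocabulary; the project namespace, resource names, and property choices are illustrative, not taken from the platforms the abstract describes:

    # Toy provenance record for a derived data product (illustrative URIs).
    from rdflib import Graph, Literal, Namespace, RDF

    PROV = Namespace("http://www.w3.org/ns/prov#")   # W3C PROV-O terms
    EX = Namespace("http://example.org/")            # hypothetical project namespace

    g = Graph()
    g.bind("prov", PROV)
    g.bind("ex", EX)

    derived = EX["gridded-temperature-v2"]           # the data product
    source = EX["station-observations-2012"]         # what it was derived from
    agent = EX["data-team"]                          # who produced it

    g.add((derived, RDF.type, PROV.Entity))
    g.add((source, RDF.type, PROV.Entity))
    g.add((agent, RDF.type, PROV.Agent))
    g.add((derived, PROV.wasDerivedFrom, source))    # traceable lineage
    g.add((derived, PROV.wasAttributedTo, agent))
    g.add((derived, EX.note, Literal("toy record for illustration")))

    print(g.serialize(format="turtle"))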


IBM Journal of Research and Development | 2012

A framework for merging and ranking of answers in DeepQA

David Gondek; Adam Lally; Aditya Kalyanpur; James W. Murdock; P. A. Duboue; Lixin Zhang; Yue Pan; Z. M. Qiu; Chris Welty

The final stage in the IBM DeepQA pipeline involves ranking all candidate answers according to their evidence scores and judging the likelihood that each candidate answer is correct. In DeepQA, this is done using a machine learning framework that is phase-based, providing capabilities for manipulating the data and applying machine learning in successive applications. We show how this design can be used to implement solutions to particular challenges that arise in applying machine learning for evidence-based hypothesis evaluation. Our approach facilitates an agile development environment for DeepQA; evidence scoring strategies can be easily introduced, revised, and reconfigured without the need for error-prone manual effort to determine how to combine the various evidence scores. We describe the framework, explain the challenges, and evaluate the gain over a baseline machine learning approach.
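
The DeepQA framework itself is not public, but the basic shape of evidence merging followed by learned ranking can be sketched in a few lines of Python; the candidate answers, scorer names, and weights below are invented, and a fixed logistic combination stands in for the trained, phase-based models the paper describes:

    # A toy two-phase answer ranking sketch (not the DeepQA implementation):
    # phase 1 merges equivalent candidates and combines their evidence scores,
    # phase 2 turns the merged feature vector into a single confidence.
    import math
    from collections import defaultdict

    # Hypothetical candidates: (surface form, {scorer name: evidence score}).
    candidates = [
        ("Isaac Newton", {"passage_support": 0.8, "type_match": 0.9}),
        ("isaac newton", {"passage_support": 0.6, "type_match": 0.7}),
        ("Albert Einstein", {"passage_support": 0.4, "type_match": 0.9}),
    ]

    def merge(cands):
        """Phase 1: group candidates by normalized surface form, keep the max per scorer."""
        merged = defaultdict(dict)
        for surface, scores in cands:
            key = surface.lower()
            for name, value in scores.items():
                merged[key][name] = max(value, merged[key].get(name, 0.0))
        return merged

    WEIGHTS = {"passage_support": 2.0, "type_match": 1.5}  # stand-in for learned weights
    BIAS = -2.0

    def confidence(features):
        """Phase 2: logistic combination of merged evidence scores into one confidence."""
        z = BIAS + sum(WEIGHTS[k] * v for k, v in features.items())
        return 1.0 / (1.0 + math.exp(-z))

    ranked = sorted(merge(candidates).items(), key=lambda kv: confidence(kv[1]), reverse=True)
    for answer, feats in ranked:
        print(f"{answer}: {confidence(feats):.3f}")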


IBM Journal of Research and Development | 2012

Finding needles in the haystack: search and candidate generation

Jennifer Chu-Carroll; James Fan; Branimir Boguraev; David Carmel; Dafna Sheinwald; Chris Welty

A key phase in the DeepQA architecture is Hypothesis Generation, in which candidate system responses are generated for downstream scoring and ranking. In the IBM Watson™ system, these hypotheses are potential answers to Jeopardy!™ questions and are generated by two components: search and candidate generation. The search component retrieves content relevant to a given question from Watson's knowledge resources. The candidate generation component identifies potential answers to the question from the retrieved content. In this paper, we present strategies developed to use characteristics of Watson's different knowledge sources and to formulate effective search queries against those sources. We further discuss a suite of candidate generation strategies that use various kinds of metadata, such as document titles or anchor texts in hyperlinked documents. We demonstrate that a combination of these strategies brings the correct answer into the candidate answer pool for 87.17% of all the questions in a blind test set, facilitating high end-to-end question-answering performance.
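
As a rough illustration of the title- and anchor-text-based strategies mentioned above, the following sketch pulls candidate answers out of a handful of hand-made "retrieved" hits; the data structures, field names, and the capitalized-phrase fallback are assumptions for the example, not Watson's actual search interface:

    # Illustrative candidate generation from retrieved content (toy data).
    import re

    retrieved = [
        {"title": "Toronto", "text": "Toronto is the largest city in Ontario.",
         "anchors": ["Ontario", "CN Tower"]},
        {"title": "Chicago", "text": "Chicago lies on the shore of Lake Michigan.",
         "anchors": ["Lake Michigan", "Illinois"]},
    ]

    def generate_candidates(hits):
        """Collect title-based, anchor-text, and crude in-text candidates, deduplicated."""
        candidates = set()
        for hit in hits:
            candidates.add(hit["title"])           # document-title candidate
            candidates.update(hit["anchors"])      # anchor-text candidates
            # Crude stand-in for passage-based extraction: capitalized phrases.
            candidates.update(re.findall(r"[A-Z][a-z]+(?: [A-Z][a-z]+)*", hit["text"]))
        return candidates

    print(sorted(generate_candidates(retrieved)))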


IBM Journal of Research and Development | 2012

Structured data and inference in DeepQA

Aditya Kalyanpur; Branimir Boguraev; Siddharth Patwardhan; James W. Murdock; Adam Lally; Chris Welty; John M. Prager; B. Coppola; Achille B. Fokoue-Nkoutche; Lixin Zhang; Yue Pan; Z. M. Qiu

Although the majority of evidence analysis in DeepQA is focused on unstructured information (e.g., natural-language documents), several components in the DeepQA system use structured data (e.g., databases, knowledge bases, and ontologies) to generate potential candidate answers or find additional evidence. Structured data analytics are a natural complement to unstructured methods in that they typically cover a narrower range of questions but are more precise within that range. Moreover, structured data that has formal semantics is amenable to logical reasoning techniques that can be used to provide implicit evidence. The DeepQA system does not contain a single monolithic structured data module; instead, it allows for different components to use and integrate structured and semistructured data, with varying degrees of expressivity and formal specificity. This paper is a survey of DeepQA components that use structured data. Areas in which evidence from structured sources has the most impact include typing of answers, application of geospatial and temporal constraints, and the use of formally encoded a priori knowledge of commonly appearing entity types such as countries and U.S. presidents. We present details of appropriate components and demonstrate their end-to-end impact on the IBM Watson™ system.
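
A toy example of the kind of structured evidence described here, assuming a tiny hand-built fact table in place of the knowledge bases DeepQA actually consults: a temporal constraint from the question either supports a candidate, contradicts it, or simply says nothing.

    PRESIDENT_TERMS = {                      # a priori structured knowledge (toy data)
        "Abraham Lincoln": (1861, 1865),
        "Ulysses S. Grant": (1869, 1877),
        "Theodore Roosevelt": (1901, 1909),
    }

    def temporal_evidence(candidate, year):
        """1.0 if the facts say the candidate held office in `year`,
        0.0 if they contradict it, None if the source is silent."""
        term = PRESIDENT_TERMS.get(candidate)
        if term is None:
            return None                      # no structured evidence either way
        start, end = term
        return 1.0 if start <= year <= end else 0.0

    for name in ["Abraham Lincoln", "Theodore Roosevelt", "Winston Churchill"]:
        print(name, temporal_evidence(name, 1863))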


IBM Journal of Research and Development | 2012

Typing candidate answers using type coercion

James W. Murdock; Aditya Kalyanpur; Chris Welty; James Fan; David A. Ferrucci; David Gondek; Lixin Zhang; H. Kanayama

Many questions explicitly indicate the type of answer required. One popular approach to answering those questions is to develop recognizers to identify instances of common answer types (e.g., countries, animals, and food) and consider only answers on those lists. Such a strategy is poorly suited to answering questions from the Jeopardy!™ television quiz show. Jeopardy! questions have an extremely broad range of types of answers, and the most frequently occurring types cover only a small fraction of all answers. We present an alternative approach to dealing with answer types. We generate candidate answers without regard to type, and for each candidate, we employ a variety of sources and strategies to judge whether the candidate has the desired type. These sources and strategies provide a set of type coercion scores for each candidate answer. We use these scores to give preference to answers with more evidence of having the right type. Our question-answering system is significantly more accurate with type coercion than it is without type coercion; these components have a combined impact of nearly 5% on the accuracy of the IBM Watson™ question-answering system.
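
A compact sketch of the type coercion idea, with made-up sources and scores standing in for Watson's real typing resources: each source contributes a coercion score for the candidate against the desired answer type, and the scores are combined as evidence rather than used to discard candidates outright.

    TYPE_SOURCES = {
        # source name -> {candidate: {type: coercion score}}
        "wiki_categories": {"Toronto": {"city": 0.9}, "CN Tower": {"building": 0.8}},
        "gazetteer":       {"Toronto": {"city": 1.0}},
    }

    def tycor_scores(candidate, answer_type):
        """One coercion score per source; 0.0 when a source has nothing to say."""
        return {src: table.get(candidate, {}).get(answer_type, 0.0)
                for src, table in TYPE_SOURCES.items()}

    def combined(candidate, answer_type):
        scores = tycor_scores(candidate, answer_type)
        return sum(scores.values()) / len(scores)    # stand-in for learned weighting

    for cand in ["Toronto", "CN Tower"]:
        print(cand, round(combined(cand, "city"), 2))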


Ontologies for Software Engineering and Software Technology | 2006

The Object Management Group Ontology Definition Metamodel

Robert M. Colomb; Kerry Raymond; Lewis Hart; Patrick Emery; Chris Welty; Guo Tong Xie; Elisa F. Kendall

Report of a submission made to a major international software engineering standards group, the Object Management Group (OMG), which ties OMG standards together with World Wide Web Consortium (W3C) and International Organization for Standardization (ISO) standards. Major industry bodies, including IBM, are collaborating, and the submission has the support of 24 companies. OMG, W3C, and ISO standards strongly influence the industry, especially in combination. Colomb was a major contributor, responsible for 30% of the submission, and the primary author of the paper.


International Semantic Web Conference | 2012

A comparison of hard filters and soft evidence for answer typing in Watson

Chris Welty; J. William Murdock; Aditya Kalyanpur; James Fan

Questions often explicitly request a particular type of answer. One popular approach to answering natural language questions involves filtering candidate answers based on precompiled lists of instances of common answer types (e.g., countries, animals, and foods). Such a strategy is poorly suited to an open domain in which there is an extremely broad range of types of answers, and the most frequently occurring types cover only a small fraction of all answers. In this paper we present an alternative approach called TyCor, which employs soft filtering of candidates using multiple strategies and sources. We find that TyCor significantly outperforms a single-source, single-strategy hard-filtering approach, demonstrating both that a multi-source, multi-strategy approach outperforms a single-source, single-strategy one and that its fault tolerance yields significantly better performance than a hard filter.
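
The contrast between the two strategies fits in a few lines, as shown below; the type list, candidates, base scores, and the bonus weight are all invented for illustration, and in Watson the soft evidence is combined by a trained model rather than a fixed bonus.

    KNOWN_CITIES = {"Toronto", "Chicago"}                      # precompiled instance list
    candidates = {"Toronto": 0.6, "Urbana-Champaign": 0.7}     # base answer scores

    # Hard filter: anything not on the list is discarded.
    hard = {c: s for c, s in candidates.items() if c in KNOWN_CITIES}

    # Soft evidence: every candidate survives; list membership only adjusts its score.
    TYPE_BONUS = 0.3
    soft = {c: s + (TYPE_BONUS if c in KNOWN_CITIES else 0.0) for c, s in candidates.items()}

    print("hard filter :", hard)    # the unlisted candidate is gone for good
    print("soft scoring:", soft)    # it survives, just without the type bonus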


AI Magazine | 2010

Introduction to the Special Issue on Question Answering

David Gunning; Vinay K. Chaudhri; Chris Welty

This special issue of AI Magazine presents six articles on some of the most interesting question answering systems in development today. Included are articles on Project, the Semantic Research, Watson, True Knowledge, and TextRunner (the University of Washington's clever use of statistical NL techniques to answer questions across the open web).


KSII Transactions on Internet and Information Systems | 2018

Crowdsourcing Ground Truth for Medical Relation Extraction

Anca Dumitrache; Lora Aroyo; Chris Welty

Cognitive computing systems require human labeled data for evaluation and often for training. The standard practice used in gathering this data minimizes disagreement between annotators, and we have found this results in data that fails to account for the ambiguity inherent in language. We have proposed the CrowdTruth method for collecting ground truth through crowdsourcing, which reconsiders the role of people in machine learning based on the observation that disagreement between annotators provides a useful signal for phenomena such as ambiguity in the text. We report on using this method to build an annotated data set for medical relation extraction for the cause and treat relations, and how this data performed in a supervised training experiment. We demonstrate that by modeling ambiguity, labeled data gathered from crowd workers can (1) reach the level of quality of domain experts for this task while reducing the cost, and (2) provide better training data at scale than distant supervision. We further propose and validate new weighted measures for precision, recall, and F-measure, which account for ambiguity in both human and machine performance on this task.
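
A simplified illustration of the disagreement-aware evaluation idea, assuming a per-sentence crowd score equal to the fraction of workers who marked the relation; the vote counts and predictions are invented, and the exact CrowdTruth measures in the paper are defined differently, but the weighting principle is the same: ambiguous examples count fractionally instead of being forced to a binary label.

    crowd_votes = {                 # sentence id -> (votes for the relation, total workers)
        "s1": (14, 15),             # near-unanimous positive
        "s2": (8, 15),              # genuinely ambiguous sentence
        "s3": (1, 15),              # near-unanimous negative
    }
    predictions = {"s1": True, "s2": True, "s3": False}        # a machine's output

    # Per-sentence crowd score: fraction of workers who marked the relation.
    score = {sid: yes / total for sid, (yes, total) in crowd_votes.items()}

    tp = sum(score[s] for s, p in predictions.items() if p)        # weighted true positives
    fp = sum(1 - score[s] for s, p in predictions.items() if p)    # weighted false positives
    fn = sum(score[s] for s, p in predictions.items() if not p)    # weighted misses

    precision = tp / (tp + fp)
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    f1 = 2 * precision * recall / (precision + recall)
    print(f"weighted P={precision:.2f} R={recall:.2f} F1={f1:.2f}")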

Collaboration


Dive into Chris Welty's collaboration.

Top Co-Authors


Lora Aroyo

VU University Amsterdam


Oana Inel

VU University Amsterdam
