Doug Downey | Researchain

Publication

Featured research published by Doug Downey.

international world wide web conferences | 2004

Web-scale information extraction in KnowItAll: (preliminary results)

Oren Etzioni; Michael J. Cafarella; Doug Downey; Stanley Kok; Ana-Maria Popescu; Tal Shaked; Stephen Soderland; Daniel S. Weld; Alexander Yates

Manually querying search engines in order to accumulate a large body of factual information is a tedious, error-prone process of piecemeal search. Search engines retrieve and rank potentially relevant documents for human perusal, but do not extract facts, assess confidence, or fuse information from multiple documents. This paper introduces KnowItAll, a system that aims to automate the tedious process of extracting large collections of facts from the web in an autonomous, domain-independent, and scalable manner. The paper describes preliminary experiments in which an instance of KnowItAll, running for four days on a single machine, was able to automatically extract 54,753 facts. KnowItAll associates a probability with each fact enabling it to trade off precision and recall. The paper analyzes KnowItAll's architecture and reports on lessons learned for the design of large-scale information extraction systems.

conference on information and knowledge management | 2008

Understanding the relationship between searchers' queries and information goals

Doug Downey; Susan T. Dumais; Daniel J. Liebling; Eric Horvitz

We describe results from Web search log studies aimed at elucidating user behaviors associated with queries and destination URLs that appear with different frequencies. We note the diversity of information goals that searchers have and the differing ways that goals are specified. We examine rare and common information goals that are specified using rare or common queries. We identify several significant differences in user behavior depending on the rarity of the query and the destination URL. We find that searchers are more likely to be successful when the frequencies of the query and destination URL are similar. We also establish that the behavioral differences observed for queries and goals of varying rarity persist even after accounting for potential confounding variables, including query length, search engine ranking, session duration, and task difficulty. Finally, using an information-theoretic measure of search difficulty, we show that the benefits obtained by search and navigation actions depend on the frequency of the information goal.

empirical methods in natural language processing | 2005

KnowItNow: Fast, Scalable Information Extraction from the Web

Michael J. Cafarella; Doug Downey; Stephen Soderland; Oren Etzioni

Numerous NLP applications rely on search-engine queries, both to extract information from and to compute statistics over the Web corpus. But search engines often limit the number of available queries. As a result, query-intensive NLP applications such as Information Extraction (IE) distribute their query load over several days, making IE a slow, offline process. This paper introduces a novel architecture for IE that obviates queries to commercial search engines. The architecture is embodied in a system called KnowItNow that performs high-precision IE in minutes instead of days. We compare KnowItNow experimentally with the previously published KnowItAll system, and quantify the tradeoff between recall and speed. KnowItNow's extraction rate is two to three orders of magnitude higher than KnowItAll's.

international acm sigir conference on research and development in information retrieval | 2007

Heads and tails: studies of web search with common and rare queries

Doug Downey; Susan T. Dumais; Eric Horvitz

A large fraction of queries submitted to Web search engines occur very infrequently. We describe search log studies aimed at elucidating behaviors associated with rare and common queries. We present several analyses and discuss research directions.

Journal of Vocational Rehabilitation | 2013

Tablet-based video modeling and prompting in the workplace for individuals with autism

Raymond V. Burke; Keith D. Allen; Monica R. Howard; Doug Downey; Michael G. Matz; Scott L. Bowen

The current study involved a preliminary job-site testing of computer software, i.e., VideoTote, delivered via a computer tablet and designed to provide users with video modeling and prompting for use by young adults with an autism spectrum disorder (ASD) across a range of employment settings. A multiple baseline design was used to assess changes in rates of completion with a complex, 104-step shipping task by four participants diagnosed with ASD. Baseline data were collected on accuracy of task completion after exposure to typical job-training involving instruction, modeling, and practice. The intervention involved video modeling and prompting with a 13-minute video depicting an individual completing job responsibilities that entailed checking to make sure materials were in working order, replacing defective items, packing materials in a container, entering information into a computer, and attaching a label to a container. Results suggested that video modeling and prompting were effective in helping individuals with autism complete a multi-step shipping task. Participants and their parents gave the device and software high ratings as an acceptable treatment for adults with autism to use in the workplace and as an intervention that complies with universal design principles. Implications for competitive job opportunities for individuals with autism are discussed.

Computational Linguistics | 2014

Learning representations for weakly supervised natural language processing tasks

Fei Huang; Arun Ahuja; Doug Downey; Yi Yang; Yuhong Guo; Alexander Yates

Finding the right representations for words is critical for building accurate NLP systems when domain-specific labeled data for the task is scarce. This article investigates novel techniques for extracting features from n-gram models, Hidden Markov Models, and other statistical language models, including a novel Partial Lattice Markov Random Field model. Experiments on part-of-speech tagging and information extraction, among other tasks, indicate that features taken from statistical language models, in combination with more traditional features, outperform traditional representations alone, and that graphical model representations outperform n-gram models, especially on sparse and polysemous words.

international acm sigir conference on research and development in information retrieval | 2012

Explanatory semantic relatedness and explicit spatialization for exploratory search

Brent J. Hecht; Samuel H. Carton; Mahmood Quaderi; Johannes Schöning; Martin Raubal; Darren Gergle; Doug Downey

Exploratory search, in which a user investigates complex concepts, is cumbersome with today's search engines. We present a new exploratory search approach that generates interactive visualizations of query concepts using thematic cartography (e.g. choropleth maps, heat maps). We show how the approach can be applied broadly across both geographic and non-geographic contexts through explicit spatialization, a novel method that leverages any figure or diagram -- from a periodic table, to a parliamentary seating chart, to a world map -- as a spatial search environment. We enable this capability by introducing explanatory semantic relatedness measures. These measures extend frequently-used semantic relatedness measures to not only estimate the degree of relatedness between two concepts, but also generate human-readable explanations for their estimates by mining Wikipedia's text, hyperlinks, and category structure. We implement our approach in a system called Atlasify, evaluate its key components, and present several use cases.

international semantic web conference | 2015

TabEL: Entity Linking in Web Tables

Chandra Bhagavatula; Thanapon Noraset; Doug Downey

Web tables form a valuable source of relational data. The Web contains an estimated 154 million HTML tables of relational data, with Wikipedia alone containing 1.6 million high-quality tables. Extracting the semantics of Web tables to produce machine-understandable knowledge has become an active area of research. A key step in extracting the semantics of Web content is entity linking (EL): the task of mapping a phrase in text to its referent entity in a knowledge base (KB). In this paper we present TabEL, a new EL system for Web tables. TabEL differs from previous work by weakening the assumption that the semantics of a table can be mapped to pre-defined types and relations found in the target KB. Instead, TabEL enforces soft constraints in the form of a graphical model that assigns higher likelihood to sets of entities that tend to co-occur in Wikipedia documents and tables. In experiments, TabEL significantly reduces error when compared to current state-of-the-art table EL systems, including a

empirical methods in natural language processing | 2015