Deryle Lonsdale | Researchain


Publication

Featured researches published by Deryle Lonsdale.

data and knowledge engineering | 1999

Conceptual-model-based data extraction from multiple-record Web pages

David W. Embley; Douglas M. Campbell; Y. S. Jiang; Stephen W. Liddle; Deryle Lonsdale; Yiu-Kai Ng; Randy Smith

Electronically available data on the Web is exploding at an ever increasing pace. Much of this data is unstructured, which makes searching hard and traditional database querying impossible. Many Web documents, however, contain an abundance of recognizable constants that together describe the essence of a document's content. For these kinds of data-rich, multiple-record documents (e.g., advertisements, movie reviews, weather reports, travel information, sports summaries, financial statements, obituaries, and many others) we can apply a conceptual-modeling approach to extract and structure data automatically. The approach is based on an ontology – a conceptual model instance – that describes the data of interest, including relationships, lexical appearance, and context keywords. By parsing the ontology, we can automatically produce a database scheme and recognizers for constants and keywords, and then invoke routines to recognize and extract data from unstructured documents and structure it according to the generated database scheme. Experiments show that it is possible to achieve good recall and precision ratios for documents that are rich in recognizable constants and narrow in ontological breadth. Our approach is less labor-intensive than other approaches that manually or semiautomatically generate wrappers, and it is generally insensitive to changes in Web-page format.
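
The idea of deriving constant recognizers from an ontology can be illustrated with a minimal sketch. It is not the authors' system: the `ONTOLOGY` dictionary, the car-advertisement fields, the regular expressions, and the blank-line record separator are all assumptions made for illustration, and the generated database scheme is reduced here to a flat dictionary per record.

```python
import re

# Hypothetical mini-ontology: each field of interest maps to a regular
# expression that recognizes its constants (all names are illustrative).
ONTOLOGY = {
    "year": r"\b(19|20)\d{2}\b",
    "make": r"\b(Ford|Honda|Toyota)\b",
    "price": r"\$\d[\d,]*",
}

def extract_record(text, ontology=ONTOLOGY):
    """Return the first recognized constant for each field, or None if absent."""
    record = {}
    for field, pattern in ontology.items():
        match = re.search(pattern, text)
        record[field] = match.group(0) if match else None
    return record

def extract_records(page, separator="\n\n"):
    """Split a multiple-record page into chunks and extract one record per chunk.

    Assumes records are separated by blank lines, which real pages rarely
    guarantee; record boundary detection is a separate problem.
    """
    return [extract_record(chunk) for chunk in page.split(separator) if chunk.strip()]

PAGE = "1998 Honda Civic, low miles, $4,500 obo\n\n2001 Ford Focus, one owner. $6,200"
RECORDS = extract_records(PAGE)
```

Because the recognizers key on the constants themselves rather than on page markup, a sketch like this is unaffected by layout changes, which is the property the abstract highlights.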

international world wide web conferences | 2005

Towards Ontology Generation from Tables

Yuri A. Tijerino; David W. Embley; Deryle Lonsdale; Yihong Ding; George Nagy

At the heart of today's information-explosion problems are issues involving semantics, mutual understanding, concept matching, and interoperability. Ontologies and the Semantic Web are offered as a potential solution, but creating ontologies for real-world knowledge is nontrivial. If we could automate the process, we could significantly improve our chances of making the Semantic Web a reality. While understanding natural language is difficult, tables and other structured information make it easier to interpret new items and relations. In this paper we introduce an approach to generating ontologies based on table analysis. We thus call our approach TANGO (Table ANalysis for Generating Ontologies). Based on conceptual modeling extraction techniques, TANGO attempts to (i) understand a table's structure and conceptual content; (ii) discover the constraints that hold between concepts extracted from the table; (iii) match the recognized concepts with ones from a more general specification of related concepts; and (iv) merge the resulting structure with other similar knowledge representations. TANGO is thus a formalized method of processing the format and content of tables that can serve to incrementally build a relevant reusable conceptual ontology.

linguistic annotation workshop | 2007

Active Learning for Part-of-Speech Tagging: Accelerating Corpus Annotation

Eric K. Ringger; Peter McClanahan; Robbie Haertel; George Busby; Marc Carmen; James L. Carroll; Kevin D. Seppi; Deryle Lonsdale

In the construction of a part-of-speech annotated corpus, we are constrained by a fixed budget. A fully annotated corpus is required, but we can afford to label only a subset. We train a Maximum Entropy Markov Model tagger from a labeled subset and automatically tag the remainder. This paper addresses the question of where to focus our manual tagging efforts in order to deliver an annotation of highest quality. In this context, we find that active learning is always helpful. We focus on Query by Uncertainty (QBU) and Query by Committee (QBC) and report on experiments with several baselines and new variations of QBC and QBU, inspired by weaknesses particular to their use in this application. Experiments on English prose and poetry test these approaches and evaluate their robustness. The results allow us to make recommendations for both types of text and raise questions that will lead to further inquiry.
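
Query by Uncertainty can be sketched as ranking unlabeled sentences by how unsure the current tagger is about them. The abstract does not say which uncertainty measure the paper uses, so the sum of per-token entropies below is an assumption, as are the function names and the toy probability values; a real setup would obtain the distributions from the trained tagger.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of one token's predicted tag distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def sentence_uncertainty(tag_distributions):
    """Assumed sentence score: the sum of its per-token entropies."""
    return sum(entropy(dist) for dist in tag_distributions)

def query_by_uncertainty(pool, batch_size):
    """Return the ids of the batch_size most uncertain sentences in the pool.

    pool maps a sentence id to a list of per-token tag distributions.
    """
    ranked = sorted(pool, key=lambda sid: sentence_uncertainty(pool[sid]), reverse=True)
    return ranked[:batch_size]

# Toy pool with two tags per token; the values are made up for illustration.
POOL = {
    "s1": [[0.9, 0.1], [0.8, 0.2]],
    "s2": [[0.5, 0.5], [0.6, 0.4]],
    "s3": [[1.0, 0.0], [0.99, 0.01]],
}
SELECTED = query_by_uncertainty(POOL, 2)
```

The selected sentences would go to the human annotator, the tagger would be retrained on the enlarged labeled set, and the loop would repeat until the budget is spent.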

web information systems engineering | 2003

Ontology generation from tables

Yuri A. Tijerino; David W. Embley; Deryle Lonsdale; George Nagy

We often need to access and reorganize information available in multiple tables in diverse Web pages. To understand tables, we rely on acquired expertise, background information, and practice. Current computerized tools seldom consider the structure and content in the context of other tables with related information. This paper addresses the table processing issue by developing a new framework for table understanding that applies an ontology-based conceptual modeling extraction approach to: (i) understand a table's structure and conceptual content to the extent possible; (ii) discover the constraints that hold between concepts extracted from the table; (iii) match the recognized concepts with ones from a more general specification of related concepts; and (iv) merge the resulting structure with other similar knowledge representations for use in future situations. The result is a formalized method of processing the format and content of tables while incrementally building a relevant reusable conceptual ontology.

data and knowledge engineering | 2010

Reusing ontologies and language components for ontology generation

Deryle Lonsdale; David W. Embley; Yihong Ding; Li Xu; Martin Hepp

Realizing the Semantic Web involves creating ontologies, a tedious and costly challenge. Reuse can reduce the cost of ontology engineering. Semantic Web ontologies can provide useful input for ontology reuse. However, the automated reuse of such ontologies remains underexplored. This paper presents a generic architecture for automated ontology reuse. With our implementation of this architecture, we show the practicality of automating ontology generation through ontology reuse. We experimented with a large generic ontology as a basis for automatically generating domain ontologies that fit the scope of sample natural language web pages. The results were encouraging, resulting in five lessons pertinent to future automated ontology reuse study.

Intelligent Robots and Computer Vision XXIV: Algorithms, Techniques, and Active Vision | 2006

Embodying a cognitive model in a mobile robot

D. Paul Benjamin; Damian M. Lyons; Deryle Lonsdale

The ADAPT project is a collaboration of researchers in robotics, linguistics and artificial intelligence at three universities to create a cognitive architecture specifically designed to be embodied in a mobile robot. There are major respects in which existing cognitive architectures are inadequate for robot cognition. In particular, they lack support for true concurrency and for active perception. ADAPT addresses these deficiencies by modeling the world as a network of concurrent schemas, and modeling perception as problem solving. Schemas are represented using the RS (Robot Schemas) language, and are activated by spreading activation. RS provides a powerful language for distributed control of concurrent processes. Also, the formal semantics of RS provides the basis for the semantics of ADAPT's use of natural language. We have implemented the RS language in Soar, a mature cognitive architecture originally developed at CMU and used at a number of universities and companies. Soar's subgoaling and learning capabilities enable ADAPT to manage the complexity of its environment and to learn new schemas from experience. We describe the issues faced in developing an embodied cognitive architecture, and our implementation choices.

data and knowledge engineering | 2008

Assessing clinical trial eligibility with logic expression queries

Deryle Lonsdale; Clint Tustison; Craig G. Parker; David W. Embley

This paper introduces a system that processes clinical trials using a combination of natural language processing and database techniques. We process web-based clinical trial recruitment pages to extract semantic information reflecting eligibility criteria for potential participants. From this information we then formulate a query that can match criteria against medical data in patient records. The resulting system reflects a tight coupling of web-based information extraction, natural language processing, medical informatics approaches to clinical knowledge representation, and large-scale database technologies. We present an evaluation of the system and future directions for further system development.

applications of natural language to data bases | 2007

Generating ontologies via language components and ontology reuse

Yihong Ding; Deryle Lonsdale; David W. Embley; Martin Hepp; Li Xu

Realizing the Semantic Web involves creating ontologies, a tedious and costly challenge. Reuse can reduce the cost of ontology engineering. Semantic Web ontologies can provide useful input for ontology reuse. However, the automated reuse of such ontologies remains underexplored. This paper presents a generic architecture for automated ontology reuse. With our implementation of this architecture, we show the practicality of automating ontology generation through ontology reuse. We experimented with a large generic ontology as a basis for automatically generating domain ontologies that fit the scope of sample natural language web pages. The results were encouraging, resulting in five lessons pertinent to future automated ontology reuse study.

international conference on conceptual modeling | 2011

Multilingual ontologies for cross-language information extraction and semantic search

David W. Embley; Stephen W. Liddle; Deryle Lonsdale; Yuri A. Tijerino

Valuable local information is often available on the web, but encoded in a foreign language that non-local users do not understand. Can we create a system to allow a user to query in language L1 for facts in a web page written in language L2? We propose a suite of multilingual extraction ontologies as a solution to this problem. We ground extraction ontologies in each language of interest, and we map both the data and the metadata among the language-specific extraction ontologies. The mappings are through a central, language-agnostic ontology that allows new languages to be added by only having to provide one mapping rather than one for each language pair. Results from an implemented early prototype demonstrate the feasibility of cross-language information extraction and semantic search. Further, results from an experimental evaluation of ontology-based query translation and extraction accuracy are remarkably good given the complexity of the problem and the complications of its implementation.
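
The benefit of routing mappings through a central ontology can be shown with a small sketch. Everything in it is hypothetical: the concept identifiers, the `TO_HUB` table, and the three sample languages are illustrative, and the pairwise count assumes one bidirectional mapping per language pair.

```python
# Hypothetical mappings from language-specific labels to hub concept ids.
TO_HUB = {
    "en": {"price": "C_PRICE", "address": "C_ADDRESS"},
    "fr": {"prix": "C_PRICE", "adresse": "C_ADDRESS"},
    "es": {"precio": "C_PRICE", "direccion": "C_ADDRESS"},
}

def translate(term, source, target, to_hub=TO_HUB):
    """Translate a concept label by going source -> hub concept -> target."""
    concept = to_hub[source][term]
    # Invert the target language's mapping to go from hub concept to label.
    from_hub = {cid: label for label, cid in to_hub[target].items()}
    return from_hub[concept]

def mapping_counts(n_languages):
    """Return (hub mappings, pairwise mappings) needed for n_languages."""
    pairwise = n_languages * (n_languages - 1) // 2
    return n_languages, pairwise

PRICE_IN_SPANISH = translate("price", "en", "es")
COUNTS_FOR_FIVE = mapping_counts(5)
```

Adding a fourth language to `TO_HUB` requires one new entry, whereas a pairwise design would need a new mapping to each of the three existing languages.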

Handbook of Conceptual Modeling | 2011

Conceptual Modeling Foundations for a Web of Knowledge

David W. Embley; Stephen W. Liddle; Deryle Lonsdale

The semantic web purports to be a web of knowledge that can answer our questions, help us reason about everyday problems as well as scientific endeavors, and service many of our wants and needs. Researchers and others expound various views about exactly what this means. Here we propose an answer with conceptual modeling as its foundation. We define a web of knowledge as a collection of interconnected knowledge bundles superimposed over a web of documents. Knowledge bundles are conceptual model instances augmented with facilities that provide for both extensional and intensional facts, for linking between knowledge bundles yielding a web of data, and for linking to an underlying document collection providing a means of authentication. We formally define both the component parts of these augmented conceptual models and their synergistic interconnections. As for practicalities, we discuss problems regarding the potentially high cost of constructing a web of knowledge and explain how they may be mitigated. We also discuss usage issues and show how untrained users can interact with and gain benefit from a web of knowledge.
