Donia Scott | Researchain

Publication

Featured research published by Donia Scott.

Computational Linguistics | 2003

Document structure

Richard Power; Donia Scott; Nadjet Bouayad-Agha

We argue the case for abstract document structure as a separate descriptive level in the analysis and generation of written texts. The purpose of this representation is to mediate between the message of a text (i.e., its discourse structure) and its physical presentation (i.e., its organization into graphical constituents like sections, paragraphs, sentences, bulleted lists, figures, and footnotes). Abstract document structure can be seen as an extension of Nunberg's text-grammar; it is also closely related to logical markup in languages like HTML and LaTeX. We show that by using this intermediate representation, several subtasks in language generation and language understanding can be defined more cleanly.

Natural Language Engineering | 2004

Software Architecture for Language Engineering

Hamish Cunningham; Donia Scott

Every building, and every computer program, has an architecture: structural and organisational principles that underpin its design and construction. The garden shed once built by one of the authors had an ad hoc architecture, extracted (somewhat painfully) from the imagination during a slow and non-deterministic process that, luckily, resulted in a structure which keeps the rain on the outside and the mower on the inside (at least for the time being). As well as being ad hoc (i.e. not informed by analysis of similar practice or relevant science or engineering) this architecture is implicit: no explicit design was made, and no records or documentation kept of the construction process.

Journal of Verbal Learning and Verbal Behavior | 1984

Segmental phonology and the perception of syntactic structure

Donia Scott; Anne Cutler

Recent research in speech production has shown that syntactic structure is reflected in segmental phonology—the application of certain phonological rules of English (e.g., palatalization and alveolar flapping) is inhibited across phrase boundaries. We examined whether such segmental effects can be used in speech perception as cues to syntactic structure, and the relation between the use of these segmental features as syntactic markers in production and perception. Speakers of American English (a dialect in which the above segmental effects occur) could indeed use the segmental cues in syntax perception; speakers of British English (in which the effects do not occur) were unable to make use of them, while speakers of British English who were long-term residents of the United States showed intermediate performance.

Computational Linguistics | 2007

Composing Questions through Conceptual Authoring

Catalina Hallett; Donia Scott; Richard Power

This article describes a method for composing fluent and complex natural language questions, while avoiding the standard pitfalls of free text queries. The method, based on Conceptual Authoring, is targeted at question-answering systems where reliability and transparency are critical, and where users cannot be expected to undergo extensive training in question composition. This scenario is found in most corporate domains, especially in applications that are risk-averse. We present a proof-of-concept system we have developed: a question-answering interface to a large repository of medical histories in the area of cancer. We show that the method allows users to successfully and reliably compose complex queries with minimal training.

Natural Language Engineering | 2006

A Reference Architecture for Natural Language Generation Systems

Chris Mellish; Donia Scott; Lynne J. Cahill; Daniel S. Paiva; Roger Evans; Mike Reape

We present the RAGS (Reference Architecture for Generation Systems) framework: a specification of an abstract Natural Language Generation (NLG) system architecture to support sharing, re-use, comparison and evaluation of NLG technologies. We argue that the evidence from a survey of actual NLG systems calls for a different emphasis in a reference proposal from that seen in similar initiatives in information extraction and multimedia interfaces. We introduce the framework itself, in particular the two-level data model that allows us to support the complex data requirements of NLG systems in a flexible and coherent fashion, and describe our efforts to validate the framework through a range of implementations.

Natural Language Generation | 1994

Expressing procedural relationships in multilingual instructions

Judy Delin; Anthony Hartley; Cécile Paris; Donia Scott; Keith Vander Linden

In this paper we discuss a study of the expression of procedural relations in multilingual user instructions, in particular the relations of Generation and Enablement. These procedural relations are defined in terms of a plan representation model, and applied in a corpus study of English, French, and Portuguese instructions. The results of our analysis indicate specific guidelines for the tactical realisation of expressions of these relations in multilingual instructional text.

Journal of the American Medical Informatics Association | 2016

Extracting information from the text of electronic medical records to improve case detection: a systematic review

Elizabeth Ford; John A. Carroll; Helen Smith; Donia Scott; Jackie Cassell

Background: Electronic medical records (EMRs) are revolutionizing health-related research. One key issue for study quality is the accurate identification of patients with the condition of interest. Information in EMRs can be entered as structured codes or unstructured free text. The majority of research studies have used only coded parts of EMRs for case-detection, which may bias findings, miss cases, and reduce study quality. This review examines whether incorporating information from text into case-detection algorithms can improve research quality. Methods: A systematic search returned 9659 papers, 67 of which reported on the extraction of information from free text of EMRs with the stated purpose of detecting cases of a named clinical condition. Methods for extracting information from text and the technical accuracy of case-detection algorithms were reviewed. Results: Studies mainly used US hospital-based EMRs, and extracted information from text for 41 conditions using keyword searches, rule-based algorithms, and machine learning methods. There was no clear difference in case-detection algorithm accuracy between rule-based and machine learning methods of extraction. Inclusion of information from text resulted in a significant improvement in algorithm sensitivity and area under the receiver operating characteristic in comparison to codes alone (median sensitivity 78% (codes + text) vs 62% (codes), P = .03; median area under the receiver operating characteristic 95% (codes + text) vs 88% (codes), P = .025). Conclusions: Text in EMRs is accessible, especially with open source information extraction algorithms, and significantly improves case detection when combined with codes. More harmonization of reporting within EMR studies is needed, particularly standardized reporting of algorithm accuracy metrics like positive predictive value (precision) and sensitivity (recall).

Natural Language Generation | 1994

Stylistic variation in multilingual instructions

Cécile Paris; Donia Scott

Instructional texts have been the object of many studies recently, motivated by the increased need to produce manuals (especially multilingual manuals) coupled with the cost of translators and technical writers. Because these studies concentrate on aspects other than the linguistic realisation of instructions (for example, the integration of text and graphics), they all generate a sequence of steps required to achieve a task, using imperatives. Our research so far shows, however, that manuals can in fact have different styles, i.e., not all instructions are stated using a sequence of imperatives, and that, furthermore, different parts of manuals often use different styles. In this paper, we present our preliminary results from an analysis of over 30 user guides/manuals for consumer appliances and discuss some of the implications.

International Conference on Natural Language Generation | 2000

Can text structure be incompatible with rhetorical structure?

Nadjet Bouayad-Agha; Richard Power; Donia Scott

Scott and Souza (1990) have posed the problem of how a rhetorical structure (in which propositions are linked by rhetorical relations, but not yet arranged in a linear order) can be realized by a text structure (in which propositions are ordered and linked up by appropriate discourse connectives). Almost all work on this problem assumes, implicitly or explicitly, that this mapping is governed by a constraint on compatibility of structure. We show how this constraint can be stated precisely, and present some counterexamples which seem acceptable even though they violate compatibility. The examples are based on a phenomenon we call extraposition, in which complex embedded constituents of a rhetorical structure are extracted and realized separately.

Natural Language Engineering | 2004

A Reference Architecture for Generation Systems

Chris Mellish; Mike Reape; Donia Scott; Lynne J. Cahill; Roger Evans; Daniel S. Paiva

We present the RAGS (Reference Architecture for Generation Systems) framework, a specification of an abstract Natural Language Generation (NLG) system architecture to support sharing, re-use, comparison and evaluation of NLG technologies. We argue that the evidence from a survey of actual NLG systems calls for a different emphasis in a reference proposal from that seen in similar initiatives in information extraction and multimedia interfaces. We introduce the framework itself, in particular the two-level data model that allows us to support the complex data requirements of NLG systems in a flexible and coherent fashion, and describe our efforts to validate the framework through a range of implementations.
