Publication


Featured research published by Stephanie W. Haas.


ACM International Conference on Digital Libraries | 1998

Page and link classifications: connecting diverse resources

Stephanie W. Haas; Erika S. Grams

As digital libraries of all kinds increase in size and scope, they contain more and more diverse information objects. The value of any collection is drawn in part from an understanding of what is there and what relationships exist between items. We believe that classification systems for World Wide Web pages and links, and by extension for any diverse digital library, will be most effective if they are developed in tandem. Therefore, we propose integrated classification systems for Web pages and links which are based on a content analysis of 75 source pages, the almost 1,500 links they contained, and the target pages to which the links led. The consistency with which we were able to classify pages and links bodes well for the possibilities of automatic classification. The slightly lower level of consistency of the link classifications emphasizes the importance of considering context and user expectations in specifying anchors. We conclude by raising important questions about how best to design and link together diverse resources such as those found on the Web or in a digital library.


Journal of the Association for Information Science and Technology | 1995

Sublanguage terms: dictionaries, usage, and automatic classification

Robert M. Losee; Stephanie W. Haas

The use of terms from natural and social scientific titles and abstracts is studied from the perspective of sublanguages and their specialized dictionaries. Different notions of sublanguage distinctiveness are explored. Objective methods for separating hard and soft sciences are suggested based on measures of sublanguage use, dictionary characteristics, and sublanguage distinctiveness. Abstracts were automatically classified with a high degree of accuracy by using a formula that considers the degree of uniqueness of terms in each sublanguage. This may prove useful for text filtering or information retrieval systems.


Information Processing and Management | 1990

Conjunction, ellipsis, and other discontinuous constituents in the constituent object parser

Douglas P. Metzler; Stephanie W. Haas; Cynthia L. Cosic; Charlotte Weise

The Constituent Object Parser (COP) is a domain independent syntactic parser developed for use in information retrieval and similar applications. Its purpose is to extract a simple hierarchical description of a phrase or sentence that can be used in very general pattern matching procedures to determine the structural similarity of sentences or phrases that contain equivalent terms. This paper discusses the mechanisms by which COP handles the problems of conjunction, ellipsis, and discontinuous constituents. These structures are usually particularly difficult to handle in a parser that does not employ domain knowledge or even general semantic knowledge. COP's mechanisms for these structures are directly tailored for, and, in part, even made possible by, the nature of the intended uses of the outputs by the information retrieval matching procedures.


Public Health Reports | 2012

Integration of Syndromic Surveillance Data into Public Health Practice at State and Local Levels in North Carolina

Erika Samoff; Anna E. Waller; Aaron T. Fleischauer; Amy Ising; Meredith K. Davis; Mike Park; Stephanie W. Haas; Lauren M. DiBiase; Pia D.M. MacDonald

Objectives. We sought to describe the integration of syndromic surveillance data into daily surveillance practice at local health departments (LHDs) and make recommendations for the effective integration of syndromic and reportable disease data for public health use. Methods. Structured interviews were conducted with local health directors and communicable disease nursing staff from a stratified random sample of LHDs from May through September 2009. Interviews captured information on direct access to the North Carolina syndromic surveillance system and on the use of syndromic surveillance information for outbreak management, program management, and the creation of reports. We analyzed syndromic surveillance system data to assess the number of signals resulting in a public health response. Results. Syndromic surveillance data were used for outbreak investigation (19% of respondents) and program management and report writing (43% of respondents); a minority reported use of both syndromic and reportable disease data for these purposes (15% and 23%, respectively). Receiving data from frequent system users was associated with using data for these purposes (p=0.016 and p=0.033, respectively, for syndromic and reportable disease data). A small proportion of signals (<25%) resulted in a public health response. Conclusions. Use of syndromic surveillance data by North Carolina local public health authorities resulted in meaningful public health action, including both case investigation and program management. While useful, the syndromic surveillance data system was oriented toward sensitivity rather than efficiency. Successful incorporation of new surveillance data is likely to require systems that are oriented toward efficiency.


Journal of the Association for Information Science and Technology | 1989

Constituent object parsing for information retrieval and similar text processing problems

Douglas P. Metzler; Stephanie W. Haas; Cynthia L. Cosic; Leslie H. Wheeler

The architecture and functioning of the Constituent Object Parser are described. This system has been developed specifically for text processing applications such as information retrieval, which can benefit from structural comparisons between elements of text such as a query and a potentially relevant abstract. The general way in which the system performs these matches, and the ways in which this objective influenced the design of the system are described. The parsing architecture incorporates several interesting features including: (1) an unusual combination of declarative and procedural representation techniques, (2) a monotonic discipline which permits useful heuristic approaches to difficult linguistic problems such as ellipsis, conjunction, ambiguity, and ill‐formed and incomplete input, and (3) an attempt to minimize the level of syntactic detail required in both the grammar and the lexicon.


Proceedings of the ASIST Annual Meeting | 2005

Understanding Statistical Concepts and Terms in Context: The GovStat Ontology and the Statistical Interactive Glossary.

Stephanie W. Haas; Maria Cristina Pattuelli; Ron T. Brown

One of the problems that people have in using statistical information from government websites is that the level of statistical knowledge in the general population is low. People’s lack of statistical knowledge is a barrier to finding the statistics they need and understanding what the statistics mean and how to use them. We describe the Statistical Interactive Glossary (SIG), an enhanced glossary of statistical terms, and the GovStat ontology of statistical concepts which supports it. The overall goal of the glossary is to help users understand important statistical terms and concepts in the context in which they are used. We present a conceptual framework whose components articulate the different aspects of a term’s basic explanation that can be manipulated to produce a variety of presentations. Developing the general explanation for each term involves three types of information: the content of the explanation, the context in which the explanation will be displayed, and the format in which the explanation will be delivered. Taxonomic relationships between concepts in the GovStat ontology support the provision of context-specific presentations. These same relationships are also associated with explanation templates, which are patterns for defining or giving an example of a concept. We conclude by discussing evaluation of the SIG. The overarching criterion of effectiveness is whether the SIG helps users complete their statistical information tasks.


Academic Emergency Medicine | 2003

Diagnosis clusters for emergency medicine.

Debbie Travers; Stephanie W. Haas; Anna E. Waller

OBJECTIVES: Aggregated emergency department (ED) data are useful for research, ED operations, and public health surveillance. Diagnosis data are widely available as The International Classification of Diseases, version 9, Clinical Modification (ICD-9-CM) codes; however, there are over 24,000 ICD-9-CM code-descriptor pairs. Standardized groupings (clusters) of ICD-9-CM codes have been developed by other disciplines, including family medicine (FM), internal medicine (IM), inpatient care (Agency for Healthcare Research and Quality [AHRQ]), and vital statistics (NCHS). The purpose of this study was to evaluate the coverage of four existing ICD-9-CM cluster systems for emergency medicine. METHODS: In this descriptive study, four cluster systems were used to group ICD-9-CM final diagnosis data from a southeastern university tertiary referral center. Included were diagnoses for all ED visits in July 2000 and January 2001. In the comparative analysis, the authors determined the coverage in the four cluster systems, defined as the proportion of final diagnosis codes that were placed into clusters and the frequencies of diagnosis codes in each cluster. RESULTS: The final sample included 7,543 visits with 19,530 diagnoses. Coverage of the ICD-9-CM codes in the ED sample was: AHRQ, 99%; NCHS, 88%; FM, 71%; IM, 68%. Seventy-six percent of the AHRQ clusters were small, defined as grouping <1% of the diagnosis codes in the sample. CONCLUSIONS: The AHRQ system provided the best coverage of ED ICD-9-CM codes. However, most of the clusters were small and not significantly different from the raw data.
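
The coverage measure defined in this abstract (the proportion of final diagnosis codes that a cluster system places into a cluster) can be illustrated with a minimal sketch. The function name `coverage`, the dictionary representation of a cluster system, and the sample codes are all assumptions made for illustration; they are not the authors' implementation or real ICD-9-CM data.

```python
def coverage(diagnosis_codes, cluster_map):
    """Return the fraction of diagnosis codes that the cluster system places into a cluster."""
    if not diagnosis_codes:
        raise ValueError("need at least one diagnosis code")
    # A code counts as covered when the cluster system has an entry for it.
    placed = sum(1 for code in diagnosis_codes if code in cluster_map)
    return placed / len(diagnosis_codes)


# Hypothetical placeholder codes and cluster names, not real ICD-9-CM data.
sample_codes = ["code-a", "code-b", "code-c", "code-a"]
sample_clusters = {"code-a": "cluster-1", "code-b": "cluster-2"}

# Three of the four diagnosis codes have a cluster, so coverage is 3/4.
print(coverage(sample_codes, sample_clusters))
```

Counting every occurrence of a code, rather than each distinct code once, is one possible reading of the definition; the abstract does not say which was used.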


Information Processing and Management | 1994

Looking in text windows: their size and composition

Stephanie W. Haas; Robert M. Losee

A text window is a group of words appearing in contiguous positions in text. Intuitively, words in such close proximity should have something to do with each other. We can use the text window to exploit a variety of lexical, syntactic, and semantic relationships without having to analyze the text explicitly for their structure. This research supports the previously suggested idea that natural groupings of words are best treated as a unit of size 7 to 11 words, that is, plus or minus three to five words. Our text retrieval experiments varying the size of windows, both with full text and with stopwords removed, support these size ranges. The characteristics of windows that best match terms in queries are examined in detail, revealing interesting differences between those for queries with good results and those for queries with poorer results. Queries with good results tend to contain more content word phrases and fewer terms with high frequency of use in the database. Information retrieval systems may benefit from expanding thesaurus-style relationships or incorporating statistical dependencies for terms within these windows.
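
As a rough illustration of the text window idea described in this abstract, the sketch below collects the words within a fixed distance of each position. The function name `text_windows`, the `radius` parameter, and whitespace tokenization are assumptions made for this example; the paper's own procedure may differ.

```python
def text_windows(tokens, radius):
    """Return (center, window) pairs, where window holds up to `radius` words on each side."""
    windows = []
    for i, center in enumerate(tokens):
        start = max(0, i - radius)  # windows are truncated at the start of the text
        windows.append((center, tokens[start:i + radius + 1]))
    return windows


# Whitespace tokenization is a simplification chosen for this sketch.
tokens = "a text window is a group of words appearing in contiguous positions in text".split()

# radius=3 gives windows of up to 7 words (plus or minus three words).
windows = text_windows(tokens, radius=3)
center, window = windows[5]
print(center, window)
```

A radius of 3 to 5 corresponds to the 7 to 11 word window sizes mentioned in the abstract; windows near the edges of the text are shorter.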


Information Processing and Management | 1993

Toward the automatic identification of sublanguage vocabulary

Stephanie W. Haas; Shaoyi He

A sublanguage is the language used in a restricted or specialized domain or field, such as computer science. Information about the vocabulary and structure of a sublanguage is used in any domain-related natural language processing application; however, such information is very time-consuming to gather, and much of it must be found and organized manually. Additionally, information retrieval strategies using lexical information depend on finding the appropriate dictionary entry for general and technical words. The ability to automatically identify terms belonging to a sublanguage could aid in these and other applications. In this paper, a simple but effective method is developed for automatic identification of sublanguage vocabulary words as they occur in abstracts. This procedure may significantly reduce the effort required to extract sublanguage vocabulary for sublanguage analysis and other applications, such as information retrieval. First, the sublanguage vocabulary identification procedures are described using abstracts from computer science and library and information science as the sublanguage sources. The results of the experiments are evaluated using three different criteria. Finally, the practical and theoretical significance of this research is discussed along with plans for further experiments on the vocabulary and structure of sublanguages.


Social Science Computer Review | 2004

The role of metadata in the statistical knowledge network: an emerging research agenda

Carol A. Hert; Sheila O. Denn; Stephanie W. Haas

Metadata (data about data) is integral to many processes in the Statistical Knowledge Network (SKN). Recently, efforts such as the Semantic Web and others express the importance of metadata in supporting integration of information across multiple sources. Critical challenges in building the SKN are identifying what metadata is needed, and at what point in the cycle of production and use of statistical information it must be available, and establishing an architecture that supports metadata acquisition and use throughout the SKN. This article provides a research agenda in those areas, building on our current work in helping users to find the statistical information they need and understand what they find.

Collaboration

Top Co-Authors

Debbie Travers

University of North Carolina at Chapel Hill

Anna E. Waller

University of North Carolina at Chapel Hill

Gary Marchionini

University of North Carolina at Chapel Hill

Ron T. Brown

University of North Carolina at Chapel Hill

Javed Mostafa

University of North Carolina at Chapel Hill

Jesse Wilbur

University of North Carolina at Chapel Hill

Deepika Mahalingam

University of North Carolina at Chapel Hill