Donald B. Crouch
University of Minnesota
Publication
Featured research published by Donald B. Crouch.
ACM Conference on Hypertext | 1989
Donald B. Crouch; Carolyn J. Crouch; Glenn Andreas
The graph-traversal approach to hypertext information retrieval is a conceptualization of hypertext in which the structural aspects of the nodes are emphasized. A user navigates through such hypertext systems by evaluating the semantics associated with links between nodes as well as the information contained in nodes [Fris88]. In this paper we describe a hierarchical structure which effectively supports the graphical traversal of a document collection in a hypertext system. We provide an overview of an interactive browser based on cluster hierarchies. Initial results obtained from the use of the browser in an experimental hypertext retrieval system are presented.
Information Processing and Management | 2002
Carolyn J. Crouch; Donald B. Crouch; Qingyan Chen; Steven J. Holtz
This paper describes an automatic approach designed to improve the retrieval effectiveness of very short queries such as those used in web searching. The method is based on the observation that stemming, which is designed to maximize recall, often results in depressed precision. Our approach is based on pseudo-feedback and attempts to increase the number of relevant documents in the pseudo-relevant set by reranking those documents based on the presence of unstemmed query terms in the document text. The original experiments underlying this work were carried out using Smart 11.0 and the lnc.ltc weighting scheme on three sets of documents from the TREC collection with corresponding TREC (title only) topics as queries. (The average length of these queries after stoplisting ranges from 2.4 to 4.5 terms.) Results, evaluated in terms of P@20 and non-interpolated average precision, showed clearly that pseudo-feedback (PF) based on this approach was effective in increasing the number of relevant documents in the top ranks. Subsequent experiments, performed on the same data sets using Smart 13.0 and the improved Lnu.ltu weighting scheme, indicate that these results hold up even over the much higher baseline provided by the new weights. Query drift analysis presents a more detailed picture of the improvements produced by this process.
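The reranking step described in this abstract can be illustrated with a minimal sketch. It is not the authors' Smart-based implementation: the function names, the simple regex tokenizer, the scoring rule (count of distinct unstemmed query terms present), and the `top_n` cutoff are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (no stemming applied)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rerank_by_unstemmed_terms(ranked_docs, query, top_n=20):
    """Rerank the top_n documents by distinct unstemmed query terms they contain.

    ranked_docs: list of (doc_id, text) pairs in initial retrieval order.
    Documents below top_n keep their original positions.
    Ties keep their initial relative order because sorted() is stable.
    """
    query_terms = set(tokenize(query))
    head, tail = ranked_docs[:top_n], ranked_docs[top_n:]

    def match_count(item):
        _, text = item
        return len(query_terms & set(tokenize(text)))

    return sorted(head, key=match_count, reverse=True) + tail


# Hypothetical data: d1 matches only after stemming ("runner", "running"),
# so it drops below the documents containing the literal query terms.
docs = [
    ("d1", "The runner was running quickly"),
    ("d2", "A run in the park"),
    ("d3", "park run results"),
    ("d4", "unrelated"),
]
reranked_ids = [doc_id for doc_id, _ in rerank_by_unstemmed_terms(docs, "park run", top_n=3)]
```

In a pseudo-feedback setting, the reranked head of the list would then serve as the pseudo-relevant set from which expansion terms are drawn.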
International ACM SIGIR Conference on Research and Development in Information Retrieval | 1989
Carolyn J. Crouch; Donald B. Crouch; Krishna R. Nareddy
In the extended vector space model, each document vector consists of a set of subvectors representing the multiple concepts or concept classes present in the document. Typical information concepts, in addition to the usual content terms or descriptors, include author names, bibliographic links, etc. The extended vector space model is known to improve retrieval effectiveness. However, a major impediment to the use of the extended model is the construction of an extended query. In this paper, we describe a method for automatically extending a query containing only content terms (a single concept class) to a representation containing multiple concept classes. No relevance feedback is involved. Experiments using the CACM collection resulted in an average precision 34% better than that obtained using the standard single-concept term vector model.
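As a rough illustration of the subvector idea, the sketch below scores a document against a query as a weighted sum of per-concept-class cosine similarities. This is one common way to combine subvectors, not necessarily the formulation used in the paper; the concept-class names, the weights, and the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as {term: weight} dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def extended_similarity(doc, query, class_weights):
    """Weighted sum of cosine similarities, one per concept class.

    doc, query: {concept_class: {term: weight}}.
    class_weights: {concept_class: weight}; the values here are assumptions.
    A class missing from either side contributes zero.
    """
    return sum(
        weight * cosine(doc.get(cls, {}), query.get(cls, {}))
        for cls, weight in class_weights.items()
    )


# Hypothetical document and extended query with two concept classes.
doc = {"terms": {"retrieval": 1.0, "vector": 1.0}, "authors": {"salton": 1.0}}
query = {"terms": {"retrieval": 1.0}, "authors": {"salton": 1.0}}
score = extended_similarity(doc, query, {"terms": 0.5, "authors": 0.5})
```

A query holding only the `terms` subvector scores zero on every other class, which is why an extended query has to be constructed before the additional concept classes can contribute.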
International ACM SIGIR Conference on Research and Development in Information Retrieval | 1986
Donald B. Crouch
This paper gives an overview of the graphical techniques which have been used in the representation of information in a document collection environment. An assessment of the applicability of existing multivariate data graphical techniques to the vector space model is presented.
Technical Symposium on Computer Science Education | 2003
Donald B. Crouch; Leslie Schwartzman
Outcome-based learning, as embraced by the CAC criteria for accrediting computing programs, requires by its very nature the active, on-going participation of faculty in the assessment process. This paper will describe a means of involving faculty at the earliest stages of development in a comprehensive assessment plan without making undue demands of their time or fostering the anxiety that oftentimes accompanies implementation of the assessment process. The proposed process takes advantage of the flexibility of the CAC criteria.
Archive | 1990
Donald B. Crouch; Robert R. Korfhage
Information retrieval (IR) is concerned with the representation, storage, and retrieval of documents or document surrogates. The output of an IR system in response to a user’s request consists of a set of references which are intended to provide the user with information relevant to his or her information needs as expressed by a query [1]. Conventional information retrieval systems operate on large-scale computing systems in an environment where direct access to system facilities is generally limited to search intermediaries and to a few researchers who have been trained to use somewhat complex user-system interfaces. However, poor query formulations and inadequate user-system interaction may still occur even with skilled users. For example, Cleverdon [1] has noted that “if two (trained) search intermediaries search the same question on the same database on the same host, only 40 percent of the output may be common to both searches.” Since skilled users often find it difficult to formulate effective search requests and to interact usefully with document retrieval systems, less competent users may be faced with insurmountable problems. This situation is especially critical at the present time; a large number of casual users will soon obtain access to very large information resources through the combination of powerful microcomputers and optical storage technology.
INEX'09: Proceedings of Focused Retrieval and Evaluation, 8th International Conference on the Initiative for the Evaluation of XML Retrieval | 2009
Carolyn J. Crouch; Donald B. Crouch; Dinesh Bhirud; Pavan Poluri; Chaitanya Polumetla; Varun Sudhakar
This paper reports the results of our experiments to consistently produce highly ranked focused elements in response to the Focused Task of the INEX Ad Hoc Track. The results of these experiments, performed using the 2008 INEX collection, confirm that our current methodology (described herein) produces such elements for this collection. Our goal for 2009 is to apply this methodology to the new, extended 2009 INEX collection to determine its viability in this environment. (These experiments are currently underway.) Our system uses our method for dynamic element retrieval [4], working with the semi-structured text of Wikipedia [5], to produce a rank-ordered list of elements in the context of focused retrieval. It is based on the Vector Space Model [15]; basic functions are performed using the Smart experimental retrieval system [14]. Experimental results are reported for the Focused Task of both the 2008 and 2009 INEX Ad Hoc Tracks.
Focused Access to XML Documents | 2008
Carolyn J. Crouch; Donald B. Crouch; Nachiket Kamat; Vikram Malik; Aditya Mone
This paper describes the successful adaptation of our methodology for the dynamic retrieval of XML elements to a semi-structured environment. Working with text that contains both tagged and untagged elements presents particular challenges in this context. Our system is based on the Vector Space Model; basic functions are performed using the Smart experimental retrieval system. Dynamic element retrieval requires only a single indexing of the document collection at the level of the basic indexing node (i.e., the paragraph). It returns a rank-ordered list of elements identical to that produced by the same query against an all-element index of the collection. Experimental results are reported for both the 2006 and 2007 Ad-hoc tasks.
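The abstract does not say how element vectors are derived from the paragraph-level index, so the following is only a sketch of one plausible bottom-up construction, assuming raw term counts and a simple parent-to-children map. All names and the tree layout are hypothetical and are not taken from the authors' system.

```python
from collections import Counter


def element_vectors(tree, paragraph_terms):
    """Build a term-frequency vector for every element from paragraph-level data.

    tree: {element_id: [child_id, ...]} describing the document structure.
    paragraph_terms: {paragraph_id: [term, ...]} for the basic indexing nodes.
    Returns {node_id: Counter}; a non-leaf element's counts are the sum of
    the counts of its children, computed recursively.
    """
    vectors = {}

    def build(node):
        if node in vectors:
            return vectors[node]
        if node in paragraph_terms:
            vec = Counter(paragraph_terms[node])
        else:
            vec = Counter()
            for child in tree.get(node, []):
                vec.update(build(child))
        vectors[node] = vec
        return vec

    for node in list(tree) + list(paragraph_terms):
        build(node)
    return vectors


# Hypothetical article with two sections and three paragraphs.
tree = {"article": ["sec1", "sec2"], "sec1": ["p1", "p2"], "sec2": ["p3"]}
paragraph_terms = {"p1": ["xml", "retrieval"], "p2": ["xml"], "p3": ["index"]}
vectors = element_vectors(tree, paragraph_terms)
```

With raw counts, an element vector built this way matches the one obtained by indexing that element's full text directly, which is the property that lets a single paragraph-level index stand in for an all-element index in this sketch. Weighting and normalization would be applied on top of these counts.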
IEEE Transactions on Engineering Management | 1988
Carolyn J. Crouch; Donald B. Crouch
The authors explore the degree to which software engineering is actually practiced in a large computational support organization. The results reveal that practitioners generally have not adopted these techniques. This failure can be attributed primarily to the environmental conditions under which software is being developed and, in particular, to the external constraints placed on the organization. Problems created by the environment are discussed in relation to the major phases of the software life-cycle model, the sources of such problems are identified, and solutions to the problems are presented. The environmental factors impacting productivity are categorized and the findings extended to engineering support organizations in general.
International Workshop of the Initiative for the Evaluation of XML Retrieval | 2011
Carolyn J. Crouch; Donald B. Crouch; Natasha Acquilla; Radhika Banhatti; Sai Chittilla; Supraja Nagalla; Reena Narenvarapu
This paper reports briefly on the final results of experiments to produce competitive (i.e., highly ranked) focused elements in response to the various tasks of the INEX 2010 Ad Hoc Track. These experiments are based on an entirely new analysis and indexing of the INEX 2009 Wikipedia collection. Using this indexing and our basic methodology for dynamic element retrieval [5, 6], described herein, yields highly competitive results for all the tasks involved. This is important because our approach to snippet retrieval is based on the conviction that good snippets can be generated from good focused elements. Our work to date in snippet generation is described; this early work ranked 9th in the official results.