Stephen F. Weiss

University of North Carolina at Chapel Hill

Publication

Featured research published by Stephen F. Weiss.

Information Storage and Retrieval | 1974

Word segmentation by letter successor varieties

Margaret A. Hafer; Stephen F. Weiss

Abstract This paper describes a method for automatically segmenting words into their stems and affixes. The process uses certain statistical properties of a corpus (successor and predecessor letter variety counts) to indicate where words should be divided. Consequently, this process is less reliant on human intervention than are other methods for automated stemming. The segmentation system is used to construct stem dictionaries for document classification. Information retrieval experiments are then performed using documents and queries so classified. Results show not only that this method is capable of high quality word segmentation, but also that its use in information retrieval produces results that are at least as good as those obtained using the more traditional stemming processes.
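The successor-variety idea in the abstract above can be illustrated with a minimal sketch. It assumes a toy corpus, uses only successor counts (the abstract also mentions predecessor counts), and cuts at the single highest interior peak; the function names, the `TOY_CORPUS` list, and the peak rule are illustrative assumptions, not the paper's actual procedure.

```python
# Hypothetical toy corpus; any word list would do.
TOY_CORPUS = ["read", "reads", "reader", "reading", "readable",
              "red", "rope", "ripe"]


def successor_varieties(word, corpus):
    """For each prefix of `word`, count the distinct letters that
    follow that prefix among corpus words ("$" marks end of word)."""
    varieties = []
    for i in range(1, len(word) + 1):
        prefix = word[:i]
        followers = {
            w[i] if len(w) > i else "$"
            for w in corpus
            if w.startswith(prefix)
        }
        varieties.append(len(followers))
    return varieties


def segment_at_peak(word, corpus):
    """Split `word` after the prefix with the highest successor variety.

    Assumption: a single cut at the largest interior peak. This is a
    simplification chosen for illustration only.
    """
    interior = successor_varieties(word, corpus)[:-1]  # skip the full word
    if not interior:
        return word, ""
    cut = max(range(len(interior)), key=lambda i: interior[i]) + 1
    return word[:cut], word[cut:]


print(successor_varieties("readable", TOY_CORPUS))  # [3, 2, 1, 5, 1, 1, 1, 1]
print(segment_at_peak("readable", TOY_CORPUS))      # ('read', 'able')
```

The count jumps after "read" because many different letters can follow that prefix in the corpus, which is the statistical signal used to propose a stem boundary.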

Information Storage and Retrieval | 1973

Learning to disambiguate

Stephen F. Weiss

Abstract Ambiguous words are a problem in information retrieval and natural language analysis in general. This paper investigates automatic resolution of ambiguities from natural language text. Two classes of experiments are presented. In the first, disambiguation is performed with the aid of a manually defined set of resolution rules. In the second, the need for these hand-made rules is eliminated by means of a learning process. By learning from a series of inputs and its own errors, the system is able to construct resolution rules automatically. Results show that both processes are highly successful in disambiguation and, by use of the learning technique, disambiguation may be achieved with virtually no a priori decisions.

ACM Conference on Hypertext | 1987

A hypertext writing environment and its cognitive basis (panel session)

John B. Smith; Stephen F. Weiss; Gordon J. Ferguson

WE is a hypertext writing environment that can be used to create both electronic and printed documents. It is intended for professionals who work within a computer network of professional workstations. Since writing is a complex mental activity that uses many different kinds of thinking, WE was designed in accord with an explicit cognitive model for writing. That model raises several important questions for both electronic and printed documents. The paper includes a discussion of the underlying cognitive model, a description of WE as it currently exists and as it will be extended in the near future, as well as a brief outline of experiments being conducted to evaluate both the model and the system. It concludes by re-examining some of the issues raised by the cognitive model in light of WE, especially the role of constraints in hypertext systems.

AFIPS | 1986

WE: A Writing Environment for Professionals

John B. Smith; Stephen F. Weiss; Gordon J. Ferguson; Jay David Bolter; Marcy Lansman

Abstract Technical and scientific professionals are writers. Regardless of title or job description, they write. We are developing a comprehensive Writing Environment (WE) for this application. In describing this system, we will emphasize five key concepts: the system is based on a cognitive model for written communication; the system is highly visual; the system was prototyped in Smalltalk and then ported to Objective C; the system will be used in a series of cognitive experiments; and the system can be extended to other applications. The emphasis placed on cognitive aspects in this description probably needs more explanation. WE is one instance of an increasingly important kind of software that provides users with an environment in which to think or with functions that supplement human cognitive skills. To be successful, these intelligence augmenting systems must reflect the cognitive processes of the people using them.

Information Systems | 1982

Tree structures for high dimensionality nearest neighbor searching

Caroline M. Eastman; Stephen F. Weiss

Abstract A nearest neighbor searching algorithm which is an extension of the multidimensional binary tree (k-d tree) for high dimensional spaces is discussed. A model of its behavior, which is applicable under restricted conditions, shows that the search time required is bounded by $O((\log_2 N)^{\alpha})$, where N is the number of records and α is a system-dependent parameter. Experiments with a document collection show that the model provides a reasonable guide to performance, and that some savings over a sequential search can be achieved in this type of application. A probabilistic version of the algorithm is presented which provides significantly faster searching with little degradation in retrieval quality.
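For orientation, here is a minimal sketch of nearest neighbor search in an ordinary k-d tree with backtracking. It is the textbook structure that the abstract says the authors extend, not their high-dimensional extension or their probabilistic variant; the dictionary node layout and the function names are assumptions made for illustration.

```python
import math


def build_kd_tree(points, depth=0):
    """Build a k-d tree by splitting on the median, cycling through axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kd_tree(points[:mid], depth + 1),
        "right": build_kd_tree(points[mid + 1:], depth + 1),
    }


def nearest(node, query, best=None):
    """Return (distance, point) for the exact nearest neighbor of `query`."""
    if node is None:
        return best
    dist = math.dist(node["point"], query)
    if best is None or dist < best[0]:
        best = (dist, node["point"])
    diff = query[node["axis"]] - node["point"][node["axis"]]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    # Descend into the more promising branch first.
    best = nearest(near, query, best)
    # Backtrack into the rejected branch only if it could hold a closer point.
    if abs(diff) < best[0]:
        best = nearest(far, query, best)
    return best


points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kd_tree(points)
print(nearest(tree, (9, 2))[1])  # (8, 1)
```

The pruning test on `abs(diff)` is what lets the search skip whole subtrees. In high dimensional spaces that test succeeds less often, which is the difficulty the abstract's extension is meant to address.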

International ACM SIGIR Conference on Research and Development in Information Retrieval | 1989

On hypertext

M. Frisse; Maristella Agosti; Marie-France Bruandet; Udo Hahn; Stephen F. Weiss

This panel will employ two different interpretations of the phrase “growing up” to address areas of common interest between hypertext and information retrieval researchers. First, the panelists will question whether or not hypertext is “growing up” as a scientific discipline; they will discuss characteristics that separate hypertext research from other related disciplines. Second, the panelists will discuss the problems encountered when a hypertext system “grows up” in size and complexity; they will discuss the very real problems expected when representing and integrating large knowledge bases, accommodating multiple users, and distributing single logical hypertexts across multiple physical sites. The panelists will not lecture, but they will advance a number of themes including “the Myth of Modularity” (Frisse), “New Architectures Employing Hyperconcept Databases” (Agosti), “Hypertext in Software Engineering” (Bruandet), “Automatic Hypertext Generation” (Hahn), and “Large-Scale Hypertexts” (Weiss).

International ACM SIGIR Conference on Research and Development in Information Retrieval | 1978

A tree algorithm for nearest neighbor searching in document retrieval systems

Caroline M. Eastman; Stephen F. Weiss

The problem of finding nearest neighbors to a query in a document collection is a special case of associative retrieval, in which searches are performed using more than one key. A nearest neighbors associative retrieval algorithm, suitable for document retrieval using similarity matching, is described. The basic structure used is a binary tree; at each node a set of keys (concepts) is tested to select the most promising branch. Backtracking to initially rejected branches is allowed and often necessary. Under certain conditions, the search time required by this algorithm is $O((\log_2 N)^{k})$, where N is the number of documents and k is a system-dependent parameter. A series of experiments with a small collection confirms the predictions made using the analytic model; k is approximately 4 in this situation. This algorithm is compared with two other searching algorithms: sequential search and clustered search. For large collections, the average search time for this algorithm is less than that for a sequential search and greater than that for a clustered search. However, the clustered search, unlike the sequential search and this algorithm, does not guarantee that the near neighbors found are actually the nearest neighbors.

SIGACT News | 1982

A pumping theorem for regular languages

Donald F. Stanat; Stephen F. Weiss

The present result establishes another characterization of the regular languages. The original pumping lemma can be viewed as asserting that identification of a substring v of x as pumpable can be done on the basis of L and x. Jaffe's result can be viewed as asserting that identification of a substring v of x as pumpable can be done on the basis of L and the prefix of z that includes v, and that this property characterizes the regular languages. The following result has as a consequence that identification of some substrings v of z as pumpable can be done on the basis of L and a sufficiently long substring uv, where u need not be the entire prefix. In other words, a sufficiently long substring is an adequate basis for identifying some pumpable substrings for a given regular language L. As with Jaffe's theorem, this property is a characterization of the regular languages. This is formally stated in the following theorem.
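For reference, the original pumping lemma mentioned above is usually stated in the following standard textbook form. This is not the theorem of the paper, whose statement is not reproduced in this excerpt, and the symbols p, u, v, w are the conventional ones rather than the paper's notation.

\[
L \text{ regular} \;\Longrightarrow\; \exists\, p \ge 1 \;\; \forall x \in L,\ |x| \ge p:\quad
x = uvw,\ \ |uv| \le p,\ \ |v| \ge 1,\ \ \text{and}\ \ u v^{i} w \in L \ \text{ for all } i \ge 0 .
\]

In this form the lemma gives only a necessary condition for regularity, which is why results that characterize the regular languages, such as those described above, are of separate interest.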

International ACM SIGIR Conference on Research and Development in Information Retrieval | 1987

An advanced full-text retrieval and analysis system

John B. Smith; Stephen F. Weiss; Gordon J. Ferguson

MICROARRAS is an advanced full-text retrieval and analysis system. It supports fast, efficient browsing of a document's vocabulary as well as its text, recursive analytic categories, Boolean search with flexible context specifications, evaluation of arithmetic expressions, and graphical display of various numeric distributions. The system is designed to work with large textbases stored on remote mainframes or on a local store for a micro-computer or workstation. The description covers system architecture, design principles, as well as user functions.

Software - Practice and Experience | 1987

Formatting texts accessed randomly

John B. Smith; Stephen F. Weiss

Full‐text systems that access text randomly cannot normally determine the format operations in effect for a given target location. The problem can be solved by viewing the format marks as the non‐terminals in a format grammar. A formatted text can then be parsed using the grammar to build a data structure that serves both as a parse tree and as a search tree. While processing a retrieved segment, a full‐text system can follow the search tree from root to leaf, collecting the format marks encountered at each node to derive the sequence of commands active for that segment. The approach also supports the notion of a ‘well formatted’ document and provides a means for verifying the well‐formedness of a given text. To illustrate the approach, a sample set of format marks and a sample grammar are given suitable for formatting and parsing the article as a sample text.
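The root-to-leaf lookup described above can be sketched as follows. The tree is built by hand here rather than by parsing with a format grammar, and the `FormatNode` class, the offsets, and the mark names such as `font=roman` are all hypothetical, chosen only to show how marks accumulate along the path.

```python
from dataclasses import dataclass, field


@dataclass
class FormatNode:
    """A node covering the text span [start, end) with its format marks."""
    start: int
    end: int
    marks: list = field(default_factory=list)
    children: list = field(default_factory=list)


def active_marks(root, offset):
    """Collect the format marks in effect at `offset`, root to leaf."""
    if not (root.start <= offset < root.end):
        raise ValueError("offset lies outside the document")
    collected = []
    node = root
    while node is not None:
        collected.extend(node.marks)
        # Follow the child whose span contains the offset, if any.
        node = next(
            (c for c in node.children if c.start <= offset < c.end),
            None,
        )
    return collected


# Hypothetical document: 100 characters, two sections, one italic run.
doc = FormatNode(0, 100, ["font=roman"], [
    FormatNode(0, 40, ["indent=0"]),
    FormatNode(40, 100, ["indent=4"], [
        FormatNode(50, 60, ["italic"]),
    ]),
])

print(active_marks(doc, 55))  # ['font=roman', 'indent=4', 'italic']
print(active_marks(doc, 10))  # ['font=roman', 'indent=0']
```

Because each lookup touches only one path through the tree, a randomly retrieved segment can be formatted without scanning the text that precedes it.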
