Frank Wm. Tompa | Researchain

Publication

Featured research published by Frank Wm. Tompa.

ACM Transactions on Information Systems | 1989

A data model for flexible hypertext database systems

Frank Wm. Tompa

Hypertext and other page-oriented databases cannot be schematized in the same manner as record-oriented databases. As a result, most hypertext databases implicitly employ a data model based on a simple, unrestricted graph. This paper presents a hypergraph model for maintaining page-oriented databases in such a way that some of the functionality traditionally provided by database schemes can be available to hypertext databases. In particular, the model formalizes identification of commonality in the structure, set-at-a-time database access, and definition of user-specific views. An efficient implementation of the model is also discussed.

Document Engineering | 2001

Requirements for XML document database systems

Airi Salminen; Frank Wm. Tompa

The shift from SGML to XML has created new demands for managing structured documents. Many XML documents will be transient representations for the purpose of data exchange between different types of applications, but there will also be a need for effective means to manage persistent XML data as a database. In this paper we explore requirements for an XML database management system. The purpose of the paper is not to suggest a single type of system covering all necessary features. Instead the purpose is to initiate discussion of the requirements arising from document collections, to offer a context in which to evaluate current and future solutions, and to encourage the development of proper models and systems for XML database management. Our discussion addresses issues arising from data modelling, data definition, and data manipulation.

International Conference on Applications of Databases | 1994

Text / relational database management systems: Harmonizing SQL and SGML

G. E. Blake; Mariano P. Consens; P. Kilpeläinen; P. Å. Larson; T. Snider; Frank Wm. Tompa

Combined text and relational database support is increasingly recognized as an emerging need of industry, spanning applications requiring text fields as parts of their data (e.g., for customer support) to those augmenting primary text resources by conventional relational data (e.g., for publication control). In this paper, we propose extensions to SQL that provide flexible and efficient access to structured text described by SGML. We also propose an architecture to support a text/relational database management system as a federated database environment, where component databases are accessed via “agents”: SQL agents that translate standard or extended SQL queries into vendor-specific dialects, and text agents that process text sub-queries on full-text search engines.

ACM Transactions on Database Systems | 1981

An improved third normal form for relational databases

Tok Wang Ling; Frank Wm. Tompa; Tiko Kameda

In this paper, we show that some Codd third normal form relations may contain “superfluous” attributes because the definitions of transitive dependency and prime attribute are inadequate when applied to sets of relations. To correct this, an improved third normal form is defined and an algorithm is given to construct a set of relations from a given set of functional dependencies in such a way that the superfluous attributes are guaranteed to be removed. This new normal form is compared with other existing definitions of third normal form, and the deletion normalization method proposed is shown to subsume the decomposition method of normalization.

International Conference on Management of Data | 2007

Optimal top-down join enumeration

David DeHaan; Frank Wm. Tompa

Most contemporary database systems perform cost-based join enumeration using some variant of System R's bottom-up dynamic programming method. The notable exceptions are systems based on the top-down transformational search of Volcano/Cascades. As recent work has demonstrated, bottom-up dynamic programming can attain optimality with respect to the shape of the join graph; no comparable results have been published for transformational search. However, transformational systems leverage benefits of top-down search not available to bottom-up methods. In this paper we describe a top-down join enumeration algorithm that is optimal with respect to the join graph. We present performance results demonstrating that a combination of optimal enumeration with search strategies such as branch-and-bound yields an algorithm significantly faster than those previously described in the literature. Although our algorithm enumerates the search space top-down, it does not rely on transformations and thus retains much of the architecture of traditional dynamic programming. As such, this work provides a migration path for existing bottom-up optimizers to exploit top-down search without drastically changing to the transformational paradigm.

Information Systems | 1988

Maintaining materialized views without accessing base data

Frank Wm. Tompa; José A. Blakeley

Access to a database through a user view can be serviced quickly when the view is materialized, i.e. the transformed data is explicitly stored. In the presence of database updates, however, the materialized view can become costly to maintain; often it must be completely rederived from the base data using the view definition. Under some conditions the view can be updated directly given only the view definition, the current contents of the materialized view, and the update operation (still expressed against the base data), without accessing the base data itself. In this paper, we consider relational views defined by projection, selection, and join. We present necessary and sufficient conditions on the view definition, contents, and update operations for insertions and deletions to be reflected in the view without reference to base data. Because the possibility of such view-based updating is dependent on the current contents of the view, we call the update conditionally autonomously computable.
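
As a rough illustration of the idea in this abstract, the sketch below maintains a materialized view using only the view definition, the view contents, and the update itself. It covers only the simplest case, a selection-only view that keeps every attribute of a single base relation under set semantics; it is not the paper's algorithm, and it does not attempt the projection and join conditions the paper analyzes. The class name `SelectionView`, its methods, and the sample rows are hypothetical.

```python
from typing import Callable, Iterable, Set, Tuple

Row = Tuple  # a base-relation tuple; the schema is left abstract


class SelectionView:
    """Materialized view V = select_pred(R), keeping all attributes of R.

    Assumption: R is a set of distinct tuples, so each tuple appears
    at most once in R and at most once in V.
    """

    def __init__(self, predicate: Callable[[Row], bool],
                 rows: Iterable[Row] = ()) -> None:
        self.predicate = predicate
        # Initial materialization: the only step that reads base rows.
        self.rows: Set[Row] = {r for r in rows if predicate(r)}

    def apply_insert(self, row: Row) -> None:
        # Only the view definition (the predicate) and the inserted
        # tuple are consulted; the base relation is never read.
        if self.predicate(row):
            self.rows.add(row)

    def apply_delete(self, row: Row) -> None:
        # The view keeps whole tuples, so a deleted base tuple is
        # either present in the view or irrelevant to it.
        self.rows.discard(row)


# Hypothetical rows of the form (id, dept, salary).
view = SelectionView(lambda r: r[2] >= 100, [(1, "db", 120), (2, "ir", 80)])
view.apply_insert((3, "db", 150))   # satisfies the predicate, enters the view
view.apply_insert((4, "ir", 10))    # fails the predicate, view unchanged
view.apply_delete((1, "db", 120))   # removed from the view directly
```

Once a view projects away attributes or joins several relations, whether an update can be handled this way may depend on what the view currently contains, which is the conditional case the abstract refers to.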

Machine Learning | 2000

Stochastic Grammatical Inference of Text Database Structure

Matthew Young-Lai; Frank Wm. Tompa

For a document collection in which structural elements are identified with markup, it is often necessary to construct a grammar retrospectively that constrains element nesting and ordering. This has been addressed by others as an application of grammatical inference. We describe an approach based on stochastic grammatical inference which scales more naturally to large data sets and produces models with richer semantics. We adopt an algorithm that produces stochastic finite automata and describe modifications that enable better interactive control of results. Our experimental evaluation uses four document collections with varying structure.

International ACM SIGIR Conference on Research and Development in Information Retrieval | 2013

Retrieving documents with mathematical content

Shahab Kamali; Frank Wm. Tompa

Many documents with mathematical content are published on the Web, but conventional search engines that rely only on keyword search cannot fully exploit their mathematical information. In particular, keyword search is insufficient when expressions in a document are not annotated with natural keywords or the user cannot describe her query with keywords. Retrieving documents by querying their mathematical content directly is very appealing in various domains such as education, digital libraries, engineering, patent documents, and medical sciences. Capturing the relevance of mathematical expressions also greatly enhances document classification in such domains. Unlike text retrieval, where keywords carry enough semantics to distinguish text documents and rank them, math symbols do not contain much semantic information on their own. In fact, mathematical expressions typically consist of few alphabetical symbols organized in rather complex structures. Hence, the structure of an expression, which describes the way such symbols are combined, should also be considered. Unfortunately, there is no standard testbed with which to evaluate the effectiveness of a mathematics retrieval algorithm. In this paper we study the fundamental and challenging problems in mathematics retrieval, that is, how to capture the relevance of mathematical expressions, how to query them, and how to evaluate the results. We describe various search paradigms and propose retrieval systems accordingly. We discuss the benefits and drawbacks of each approach, and further compare them through an extensive empirical study.

Computer Standards & Interfaces | 1996

From data representation to data model: meta-semantic issues in the evolution of SGML

Darrell R. Raymond; Frank Wm. Tompa; Derick Wood

SGML provides standard representations for documents, but as documents become more fluid, we will need standard semantics for them as well. The ability to manage change is a fundamental capability of any system that supports document semantics. We look at three areas important in change management: equivalence, redundancy, and operators. We show how these areas are implicitly addressed in SGML and SGML-based standards, and argue that more explicit consideration would be useful both for evaluating current standards, and for developing new systems for document semantics.

Very Large Data Bases | 1985

Understanding the implications of view update policies

Claudia Bauzer Medeiros; Frank Wm. Tompa

Database views are traditionally described as unmaterialized queries, which may be coincidentally updatable according to some fixed criteria. One of the problems in updating through views lies in determining whether a given view modification can be correctly translated by the system. To define an updatable view, a view designer must be aware of how an update request in the view will be mapped into updates of the underlying relations. Furthermore, because of side effects, the view designer must also be made aware of the effects of isolated updates back into the view. To address this problem, we present a general algorithm that predicts the effects of arbitrary mapping policies. Given an update policy, this algorithm indicates whether a desired update will, in fact, occur in the view and describes all possible side effects it may have, documenting the conditions under which they occur. The algorithm subsumes the results obtained by other view design tools, and generalizes their use to encompass a larger class of views.
