Jan Flokstra | Researchain

Publication

Featured research published by Jan Flokstra.

IEEE Transactions on Knowledge and Data Engineering | 1992

PRISMA/DB: a parallel, main memory relational DBMS

Peter M. G. Apers; Carel A. van den Berg; Jan Flokstra; Paul W. P. J. Grefen; Martin L. Kersten; Annita N. Wilschut

PRISMA/DB, a full-fledged parallel, main memory relational database management system (DBMS), is described. PRISMA/DB's high performance is obtained by the use of parallelism for query processing and main memory storage of the entire database. A flexible architecture for experimenting with functionality and performance is obtained using a modular implementation of the system in an object-oriented programming language. The design and implementation of PRISMA/DB are described in detail. A performance evaluation of the system shows that the system is comparable to other state-of-the-art database machines. The prototype implementation of the system runs on a 100-node parallel multiprocessor.

international conference on management of data | 1995

Parallel evaluation of multi-join queries

Annita N. Wilschut; Jan Flokstra; Peter M. G. Apers

A number of execution strategies for parallel evaluation of multi-join queries have been proposed in the literature; their performance was evaluated by simulation. In this paper we give a comparative performance evaluation of four execution strategies by implementing all of them on the same parallel database system, PRISMA/DB. Experiments were run on up to 80 processors. The basic strategy is to first determine an execution schedule with minimum total cost and then parallelize this schedule with one of the four execution strategies. These strategies, taken from the literature, are named Sequential Parallel, Synchronous Execution, Segmented Right-Deep, and Full Parallel. Based on the experiments, clear guidelines are given on when to use which strategy.

advances in geographic information systems | 1998

Road collapse in Magnum

Annita N. Wilschut; Roelof van Zwol; Jan Flokstra; Nick Brasa; Wilko Quak

This paper describes the implementation of a triangulation-based collapse algorithm in the general-purpose object-oriented DBMS Magnum. The contribution of the paper is twofold. First, we show that true integration of complex spatial functionality in a DBMS can be achieved. Second, we worked out a collapse algorithm to be used in the complex area of map generalization.

Proceedings of the PRISMA workshop on Parallel database systems | 1991

Parallel query execution in PRISMA/DB

Annita N. Wilschut; Peter M. G. Apers; Jan Flokstra

1 Introduction. In the PRISMA project, a large multi-processor system has been built to study the performance gains from parallelism. A parallel, main-memory relational database system (PRISMA/DB) runs on this so-called POOMA-machine. This paper studies the possibilities of using parallelism to improve the performance of relational database management systems. Because the equi-join is an important and time-consuming operation, queries consisting of a number of equi-joins are used to describe how different forms of parallelism can speed up the execution of such queries. This paper is organized as follows: First, a brief introduction to PRISMA and the DBMS running on it is given. After that, different forms of parallelism are described and the ways in which they can be used are identified. Using this knowledge, the possible parallelism in the execution of join queries is discussed. Special attention is paid to pipelining. It is shown that pipelining needs a new hash-join algorithm and that using this algorithm may yield effective parallelism over a pipeline of join operations. Finally, we discuss the implications of using pipelining as a source of parallelism for the optimization of join queries. The paper is concluded with our plans for future work. The PaRallel Inference and Storage MAchine PRISMA is a highly parallel machine for data and knowledge processing. The PRISMA machine contains 100 nodes that each contain a data processor, a communication processor and 16 Mbyte of local memory. 50 nodes have a disk and some nodes have an ethernet card that provides an interface with a host computer. Each communication processor connects a node to 4 other nodes. In this way a fast, high-bandwidth network is provided. This hardware can be classified as a shared-nothing multi-processor system. The machine is designed to support a relational main memory database management system, PRISMA/DB.
An extensive introduction to this system can be found in [Kers87] and in [Wils89]. Here, only the features that are important for this paper are summarized. PRISMA/DB stores the entire database in main memory. Disks are used for backup only. To gain performance and to make storage in main memory feasible, the tuples belonging to one relation are fragmented over more than one node. A fragment is a set of tuples that belong to the same relation and that reside on the same node. A relation does not necessarily use all available nodes. …
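
The excerpt above states that pipelining needs a hash-join algorithm that differs from the classic build-then-probe join, but it does not spell the algorithm out. As a rough illustration only, the sketch below shows a symmetric hash join, one well-known way to make an equi-join pipeline-friendly: each arriving tuple is inserted into its own side's hash table and immediately probed against the other side's table, so results can flow downstream before either input is complete. That this matches the paper's algorithm in detail is an assumption; the function name `symmetric_hash_join` and the tagged-stream input format are hypothetical.

```python
from collections import defaultdict


def symmetric_hash_join(stream, left_key, right_key):
    """Join two interleaved inputs, emitting matches as early as possible.

    `stream` yields ("L", tuple) or ("R", tuple) in arrival order
    (a hypothetical input format chosen for this sketch).
    """
    left_table = defaultdict(list)   # join key -> left tuples seen so far
    right_table = defaultdict(list)  # join key -> right tuples seen so far
    for side, tup in stream:
        if side == "L":
            key = left_key(tup)
            left_table[key].append(tup)
            # Probe the tuples already received from the other operand.
            for match in right_table.get(key, ()):
                yield (tup, match)
        else:
            key = right_key(tup)
            right_table[key].append(tup)
            for match in left_table.get(key, ()):
                yield (match, tup)


arrivals = [
    ("L", (1, "a")),
    ("R", (1, "x")),
    ("R", (2, "y")),
    ("L", (2, "b")),
    ("L", (1, "c")),
]
results = list(symmetric_hash_join(arrivals, lambda t: t[0], lambda t: t[0]))
```

Because neither operand has to be fully hashed before probing starts, a join of this shape can consume the output of an upstream join tuple by tuple, which is the property a pipeline of join operations relies on.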

extending database technology | 2006

MonetDB/XQuery—Consistent and efficient updates on the pre/post plane

Peter A. Boncz; Jan Flokstra; Torsten Grust; Maurice van Keulen; Stefan Manegold; K. Sjoerd Mullender; Jan Rittinger; Jens Teubner

Relational XQuery processors aim at leveraging mature relational DBMS query processing technology to provide scalability and efficiency. To achieve this goal, various storage schemes have been proposed to encode the tree structure of XML documents in flat relational tables. Basically, two classes can be identified: (1) encodings using fixed-length surrogates, like the preorder ranks in the pre/post encoding [5] or the equivalent pre/size/level encoding [8], and (2) encodings using variable-length surrogates, like, e.g., ORDPATH [9] or P-PBiTree [12]. Recent research [1] showed a clear advantage of the former for efficient evaluation of XPath location steps, exploiting techniques like cheap node order tests, positional lookup, and node skipping in staircase join [7]. However, once updates are involved, variable-length surrogates are often considered the better choice, mainly as a straightforward implementation of structural XML updates using fixed-length surrogates faces two performance bottlenecks: (i) high physical cost (the preorder ranks of all nodes following the update position must be modified—on average 50% of the document), and (ii) low transaction concurrency (updating the size of all ancestor nodes causes lock contention on the document root).
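
To make the two bottlenecks named above concrete, here is a minimal sketch of a pre/size/level encoding with a deliberately naive leaf insertion. It is not the MonetDB/XQuery implementation; the names `Node`, `encode`, and `insert_leaf` and the in-memory list-of-rows layout are assumptions made for illustration. The insertion returns how many rows had their preorder rank shifted and how many ancestor rows had their size changed.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    tag: str
    children: List["Node"] = field(default_factory=list)


def encode(root):
    """Return rows [pre, size, level, tag] in document order."""
    rows = []

    def visit(node, level):
        pre = len(rows)
        rows.append([pre, 0, level, node.tag])
        for child in node.children:
            visit(child, level + 1)
        rows[pre][1] = len(rows) - pre - 1  # number of descendants

    visit(root, 0)
    return rows


def insert_leaf(rows, parent_pre, tag):
    """Naively append a leaf as the last child of the node at parent_pre.

    Returns (shifted, resized): rows whose pre rank moved, and
    ancestor-or-self rows whose size grew.
    """
    parent = rows[parent_pre]
    new_pre = parent_pre + parent[1] + 1
    resized = 0
    for row in rows[: parent_pre + 1]:
        if row[0] + row[1] >= parent_pre:  # row is the parent or an ancestor
            row[1] += 1
            resized += 1
    shifted = 0
    for row in rows[new_pre:]:  # every node after the update position
        row[0] += 1
        shifted += 1
    rows.insert(new_pre, [new_pre, 0, parent[2] + 1, tag])
    return shifted, resized


doc = Node("a", [Node("b", [Node("c")]), Node("d")])
table = encode(doc)
shifted, resized = insert_leaf(table, 1, "x")
```

In this small document the insertion under `b` shifts one following row and resizes two rows, one of which is the root. The root is resized by every structural update, which is where the lock contention described above comes from.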

international conference on management of data | 1994

The IMPRESS DDT: a database design toolbox based on a formal specification language

Jan Flokstra; Maurice van Keulen; J. Skowronek

The Database Design Tool prototype is being developed in the IMPRESS project (Esprit project 6355). The IMPRESS project started in May 1992 and aims at creating a low-level storage manager tailored for multimedia applications, together with a library of efficient operators, a programming environment, high-level design tools and methodology. The DDT is part of this last effort. The project focuses on the field of Technical Information Systems, where there is a need for tools supporting modeling of complex objects. Designers in this field usually use incremental design or step by step prototyping, because this seems to be best suited for users coping with complexity and uncertainty about their own needs or requirements. The IMPRESS DDT aims at supporting the database design part of this process.

database and expert systems applications | 2003

Moa and the Multi-model Architecture: A New Perspective on NF2

M. van Keulen; Jochem Vonk; A.P. de Vries; Jan Flokstra; Henk Ernst Blok

Advanced non-traditional application domains such as geographic information systems and digital library systems demand advanced data management support. In an effort to cope with this demand, we present the concept of a novel multi-model DBMS architecture which provides evaluation of queries on complexly structured data without sacrificing efficiency. A vital role in this architecture is played by the Moa language, featuring a nested relational data model based on XNF2, in which we take renewed interest. Furthermore, extensibility in Moa avoids optimization obstacles due to black-box treatment of ADTs. The combination of a mapping of queries on complexly structured data to an efficient physical algebra expression via a nested relational algebra, extensibility open to optimization, and the consequently better integration of domain-specific algorithms enables the Moa system to handle complex queries from non-traditional application domains efficiently and effectively.

Distributed and Parallel Databases | 1996

Extending a multi-set relational algebra to a parallel environment

Paul W. P. J. Grefen; Jan Flokstra

Parallel database systems will very probably be the future for high-performance data-intensive applications. In the past decade, many parallel database systems have been developed, together with many languages and approaches to specify operations in these systems. A common background is still missing, however. This paper proposes an extended relational algebra for this purpose, based on the well-known standard relational algebra. The extended algebra provides both complete database manipulation language features, and data distribution and process allocation primitives to describe parallelism. It is defined in terms of multi-sets of tuples to allow handling of duplicates and to obtain a close connection to the world of high-performance data processing. Due to its algebraic nature, the language is well suited for optimization and parallelization through expression rewriting. The proposed language can be used as a database manipulation language on its own, as has been done in the PRISMA parallel database project, or as a formal basis for other languages, like SQL.
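
As a small illustration of why multi-set semantics and data distribution primitives fit together, the sketch below models a relation as a `collections.Counter` that maps each tuple to its multiplicity. It is a toy under stated assumptions, not the algebra defined in the paper: the operator names `project`, `fragment`, and `union_all` are hypothetical.

```python
from collections import Counter


def project(relation, columns):
    """Multi-set projection: duplicates are kept, not eliminated."""
    out = Counter()
    for tup, count in relation.items():
        out[tuple(tup[c] for c in columns)] += count
    return out


def fragment(relation, column, n):
    """Hash-partition a relation into n fragments on one column."""
    fragments = [Counter() for _ in range(n)]
    for tup, count in relation.items():
        fragments[hash(tup[column]) % n][tup] += count
    return fragments


def union_all(relations):
    """Additive multi-set union: multiplicities are summed."""
    total = Counter()
    for relation in relations:
        total += relation
    return total


emp = Counter({("ann", "db"): 1, ("bob", "db"): 1, ("cy", "os"): 2})
parts = fragment(emp, 0, 3)
whole = project(emp, [1])
per_fragment = union_all(project(part, [1]) for part in parts)
```

Here `whole` and `per_fragment` are equal: because projection keeps duplicates, it can be pushed down to each fragment and the partial results merged with an additive union. This is one example of the kind of expression rewriting an algebraic language makes available for parallelization.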

database and expert systems applications | 1992

Performance Evaluation of Integrity Control in a Parallel Main-Memory Database System

Paul W. P. J. Grefen; Jan Flokstra; Peter M. G. Apers

Integrity control is an important task of modern database management systems. One of the key problems impeding its general use in real-world applications is the high processing cost associated with integrity constraint enforcement. Notwithstanding this observation, little attention has been paid in the literature to the performance evaluation of integrity control mechanisms. This paper addresses this issue and has a threefold message. Firstly, it shows that integrity control can easily be integrated in a parallel, main-memory database system. Secondly, it demonstrates that parallelism and main-memory data storage are effective ways to deal with costly constraint enforcement. Thirdly, the overhead of constraint enforcement is shown to be acceptable compared to the execution of transactions without integrity control. The conclusion is drawn that integrity control is feasible in high-performance database systems.

database and expert systems applications | 2015

Incremental Data Uncertainty Handling Using Evidence Combination: A Case Study on Maritime Data Reasoning

Mena Badieh Habib; Brend Wanders; Jan Flokstra; Maurice van Keulen

Semantic incompatibility is a conflict that occurs in the meanings of data. In this paper, we propose an approach for data cleaning by resolving semantic incompatibility. Our approach applies a dynamic and incremental enhancement of data quality. It checks the coherency or conflict of newly recorded facts and relations against the existing ones. It reasons over the existing information and derives newly discovered facts and relations. We choose maritime data cleaning as a validation scenario.
