Seppo Sippu
Aalto University
Publication
Featured research published by Seppo Sippu.
very large data bases | 2005
Ibrahim Jaluta; Seppo Sippu; Eljas Soisalon-Soininen
In this paper, we present new concurrent and recoverable B-link-tree algorithms. Unlike previous algorithms, ours maintain the balance of the B-link tree at all times, so that a logarithmic time bound for a search or an update operation is guaranteed under arbitrary sequences of record insertions and deletions. A database transaction can contain any number of operations of the form “fetch the first (or next) matching record”, “insert a record”, or “delete a record”, where database records are identified by their primary keys. Repeatable-read-level isolation for transactions is guaranteed by key-range locking. The algorithms apply the write-ahead logging (WAL) protocol and the steal and no-force buffering policies for index and data pages. Record inserts and deletes on leaf pages of a B-link tree are logged using physiological redo-undo log records. Each structure modification such as a page split or merge is made an atomic action by keeping the pages involved in the modification latched for the (short) duration of the modification and the logging of that modification; at most two B-link-tree pages are kept X-latched at a time. Each structure modification brings the B-link tree into a structurally consistent and balanced state whenever the tree was structurally consistent and balanced initially. Each structure modification is logged using a single physiological redo-only log record. Thus, a structure modification will never be undone even if the transaction that gave rise to it eventually aborts. In restart recovery, the redo pass of our ARIES-based recovery protocol will always produce a structurally consistent and balanced B-link tree, on which the database updates by backward-rolling transactions can always be undone logically, when a physical (page-oriented) undo is no longer possible.
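For readers unfamiliar with B-link trees, the following is a minimal sketch of the classic right-link search that such trees rely on. It is not the algorithm of the paper: there is no latching, logging, or balancing, and the names `Page`, `high_key`, `right`, and `find_leaf` are hypothetical. It only shows how a search can still reach the correct leaf when a page split has not yet been posted to the parent.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Page:
    keys: List[int]
    children: List["Page"] = field(default_factory=list)  # empty list => leaf page
    high_key: Optional[int] = None  # largest key this page may hold; None = +infinity
    right: Optional["Page"] = None  # sideways link to the right sibling


def find_leaf(root: Page, key: int) -> Page:
    """Descend to the leaf responsible for `key`, following right links
    whenever `key` exceeds the high key of the current page."""
    page = root
    while True:
        # A concurrent split may have moved the key range to a right sibling.
        # Assumed invariant: a page with a finite high key has a right sibling.
        while page.high_key is not None and key > page.high_key:
            page = page.right
        if not page.children:
            return page
        # Assumed layout: children[i] covers keys <= keys[i];
        # the last child covers everything larger.
        page = page.children[bisect_left(page.keys, key)]


# A split of the left leaf that the parent does not know about yet.
right_leaf = Page(keys=[5, 7])
left_leaf = Page(keys=[1, 3], high_key=3, right=right_leaf)
root = Page(keys=[], children=[left_leaf])
leaf_for_5 = find_leaf(root, 5)  # reaches right_leaf via the sideways link
```

In this toy tree the parent still points only at the left leaf, yet the search for key 5 ends on the right leaf because the high key of the left leaf redirects it sideways.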
ACM Transactions on Database Systems | 2006
Ibrahim Jaluta; Seppo Sippu; Eljas Soisalon-Soininen
We develop new algorithms for the management of transactions in a page-shipping client-server database system in which the physical database is organized as a sparse B-tree index. Our starvation-free fine-grained locking protocol combines adaptive callbacks with key-range locking and guarantees repeatable-read-level isolation (i.e., serializability) for transactions containing any number of record insertions, record deletions, and key-range scans. Partial and total rollbacks of client transactions are performed by the client. Each structure modification such as a page split or merge is defined as an atomic action that affects only two levels of the B-tree and is logged using a single redo-only log record, so that the modification never needs to be undone during transaction rollback or restart recovery. The steal-and-no-force buffering policy is applied by the server when flushing updated pages onto disk and by the clients when shipping updated data pages to the server, while pages involved in a structure modification are forced to the server when the modification is finished. The server performs the restart recovery from client and system failures using an ARIES/CSA-based recovery protocol. Our algorithms avoid accessing stale data but allow a data page to be updated by one client transaction and read by many other client transactions simultaneously, and updates may migrate from a data page to another in structure modifications caused by other transactions while the updating transaction is still active.
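The key-range locking mentioned in the abstract can be illustrated with a textbook-style next-key locking sketch. This is a single-threaded simplification under stated assumptions, not the protocol of the paper: there are no callbacks, no waiting, and no client-server page shipping, and `LockTable`, `scan`, and `insert` are hypothetical names. A scan S-locks every key it returns plus the first key beyond the range; an insert must briefly X-lock the next larger key, so it is refused if a reader holds that key.

```python
import bisect

INF = float("inf")  # sentinel key standing for "end of index"


class LockTable:
    def __init__(self):
        self.held = {}  # key -> {transaction id: "S" or "X"}

    def try_lock(self, txn, key, mode):
        """Grant the lock unless another transaction holds a conflicting one."""
        owners = self.held.setdefault(key, {})
        for other, other_mode in owners.items():
            if other != txn and (mode == "X" or other_mode == "X"):
                return False
        if owners.get(txn) != "X":  # never downgrade an X lock
            owners[txn] = mode
        return True


def next_key(keys, k):
    """Smallest stored key strictly greater than k, or INF."""
    i = bisect.bisect_right(keys, k)
    return keys[i] if i < len(keys) else INF


def scan(locks, txn, keys, lo, hi):
    """Return the keys in [lo, hi], S-locking each one and the first key after hi.
    Returns None if a lock cannot be granted."""
    i = bisect.bisect_left(keys, lo)
    result = []
    while True:
        k = keys[i] if i < len(keys) else INF
        if not locks.try_lock(txn, k, "S"):
            return None
        if k > hi:
            return result
        result.append(k)
        i += 1


def insert(locks, txn, keys, k):
    """Insert k if the next key and k itself can be X-locked."""
    nk = next_key(keys, k)
    already_held = txn in locks.held.get(nk, {})
    if not locks.try_lock(txn, nk, "X"):
        return False  # a reader protects the gap that k would fall into
    ok = locks.try_lock(txn, k, "X")
    if ok:
        bisect.insort(keys, k)
    if not already_held:
        del locks.held[nk][txn]  # the next-key lock is short-duration
    return ok
```

With keys 10, 20, 30 and a scan of the range 12 to 25 by one transaction, an insert of 15 by another transaction is refused because key 20 is S-locked. An insert of 27 is refused as well, since the lock on key 30 covers the whole gap above 20; this conservatism is the price of locking keys rather than exact ranges.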
extending database technology | 2009
Tuukka Haapasalo; Ibrahim Jaluta; Bernhard Seeger; Seppo Sippu; Eljas Soisalon-Soininen
The multiversion B+-tree (MVBT) by Becker et al. assumes a single-data-item update model in which each new version created for a data item is given a timestamp that is unique across the entire MVBT. In this paper, we extend the MVBT model with multi-action transactions such that all (final) data-item versions created by a transaction are given the same timestamp. We show that the MVBT algorithms can be modified to work in a setting in which multiple read-only transactions and a single updating transaction operate concurrently in snapshot isolation on the MVBT, without compromising the asymptotically optimal time complexity of key inserts, key deletes, and key-range scans on any version. The structural consistency and balance of the MVBT are guaranteed by short-duration latching of pages, redo-only logging of structure modifications (version splits, key splits and page merges), and redo-undo logging of key insertions and deletions. The redo pass of our ARIES-based restart-recovery algorithm always produces a structurally consistent and balanced MVBT on which any undo action by a backward-rolling updating transaction can be performed logically if a physical undo is not possible. The standard steal-and-no-force buffering policy is assumed.
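The update model described above, in which all final versions written by one transaction share a single timestamp, can be sketched with a flat list of versioned records. This is only an illustration of the versioning semantics, assuming half-open life spans `[start, end)`; it has none of the MVBT page structure or its complexity guarantees, and `VersionStore`, `commit`, and `read` are hypothetical names.

```python
from dataclasses import dataclass

INF = float("inf")


@dataclass
class Version:
    key: str
    value: str
    start: int        # commit timestamp of the transaction that created it
    end: float = INF  # commit timestamp of the transaction that superseded it


class VersionStore:
    def __init__(self):
        self.versions = []
        self.clock = 0

    def commit(self, writes):
        """Apply one transaction's final writes (key -> value, None = delete).
        Every version created here receives the same commit timestamp."""
        self.clock += 1
        ts = self.clock
        for key, value in writes.items():
            for v in self.versions:
                if v.key == key and v.end == INF:
                    v.end = ts  # close the life span of the live version
            if value is not None:
                self.versions.append(Version(key, value, ts))
        return ts

    def read(self, key, snapshot):
        """Return the value of `key` as of version `snapshot`, or None."""
        for v in self.versions:
            if v.key == key and v.start <= snapshot < v.end:
                return v.value
        return None


store = VersionStore()
t1 = store.commit({"a": "1", "b": "2"})
t2 = store.commit({"a": "3", "b": None})  # update a, delete b
old_a = store.read("a", t1)  # a snapshot reader at t1 still sees the old value
```

A read-only transaction that took its snapshot at `t1` keeps seeing the state committed by the first transaction, regardless of later commits.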
IEEE Transactions on Knowledge and Data Engineering | 2013
Tuukka Haapasalo; Ibrahim Jaluta; Seppo Sippu; Eljas Soisalon-Soininen
We consider the recoverability of traditional R-tree index structures under concurrent updating transactions, an important issue that is neglected or treated inadequately in many proposals of R-tree concurrency control. We present two solutions to ARIES-based recovery of transactions on R-trees. These assume a standard fine-grained single-version update model with physiological write-ahead logging and steal-and-no-force buffering where records with uncommitted updates by a transaction may migrate from their original page to another page due to structure modifications caused by other transactions. Both solutions guarantee that an R-tree will remain in a consistent and balanced state in the presence of any number of concurrent forward-rolling and (totally or partially) backward-rolling multiaction transactions and in the event of process failures and system crashes. One solution maintains the R-tree in a strictly consistent state in which the bounding rectangles of pages are as tight as possible, while in the other solution this requirement is relaxed. In both solutions only a small constant number of simultaneous exclusive latches (write latches) are needed, and in the solution that only maintains relaxed consistency also the number of simultaneous nonexclusive latches is similarly limited. In both solutions, deletions are handled uniformly with insertions, and a logarithmic insertion-path length is maintained under all circumstances.
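The difference between the two consistency notions can be made concrete with a small check on bounding rectangles. This sketch assumes that "relaxed" means the stored rectangle of a page merely contains all of its entries, while "strict" means it equals their minimum bounding rectangle; that reading, the tuple layout `(xmin, ymin, xmax, ymax)`, and all function names are assumptions for illustration.

```python
def mbr(rects):
    """Minimum bounding rectangle of a non-empty list of rectangles."""
    xmins, ymins, xmaxs, ymaxs = zip(*rects)
    return (min(xmins), min(ymins), max(xmaxs), max(ymaxs))


def contains(outer, inner):
    """True if `outer` fully covers `inner`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def strictly_consistent(stored, entries):
    """The stored rectangle is as tight as possible."""
    return stored == mbr(entries)


def relaxed_consistent(stored, entries):
    """The stored rectangle covers every entry but may be loose."""
    return all(contains(stored, e) for e in entries)


entries = [(0, 0, 2, 2), (1, 1, 4, 3)]
stored = mbr(entries)            # (0, 0, 4, 3)
after_delete = [(0, 0, 2, 2)]    # the second entry was deleted
still_covering = relaxed_consistent(stored, after_delete)
still_tight = strictly_consistent(stored, after_delete)
```

After the deletion the parent rectangle still covers the remaining entry but is no longer tight, so only the relaxed condition holds until the rectangle is shrunk.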
symposium on principles of database systems | 1987
Gösta Grahne; Seppo Sippu; Eljas Soisalon-Soininen
Well-known results on graph traversal are used to develop a practical, efficient algorithm for evaluating regularly and linearly recursive queries in databases that contain only binary relations. Transformations are given that reduce a subset of regular and linear queries involving n-ary relations (n > 2) to queries involving only binary relations.
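As a minimal illustration of the graph-traversal idea, a linearly recursive query over one binary relation (for example, "all descendants of a given person") can be answered by a breadth-first search from the selected constant instead of materializing the full closure. This is a generic sketch, not the algorithm of the paper; the relation contents and the name `reachable` are hypothetical.

```python
from collections import defaultdict, deque


def reachable(edges, source):
    """All nodes y such that (source, y) is in the transitive closure of `edges`."""
    successors = defaultdict(list)
    for a, b in edges:
        successors[a].append(b)
    seen = set()
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in successors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen  # contains `source` itself only if it lies on a cycle


parent_of = [("ann", "bob"), ("bob", "cat"), ("cat", "dan"), ("eve", "dan")]
descendants_of_ann = reachable(parent_of, "ann")
```

The traversal visits each node and each tuple of the relation at most once, so the work is linear in the part of the relation reachable from the selected constant.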
international conference on database theory | 2001
Seppo Sippu; Eljas Soisalon-Soininen
We consider transactions running on a database that consists of records with unique totally-ordered keys and is organized as a sparse primary search tree such as a B-tree index on disk storage. We extend the classical read-write model of transactions by considering inserts, deletes and key-range scans and by distinguishing between four types of transaction states: forward-rolling, committed, backward-rolling, and rolled-back transactions. A search-tree transaction is modelled as a two-level transaction containing structure modifications as open nested subtransactions that can commit even though the parent transaction aborts. Isolation conditions are defined for search-tree transactions with nested structure modifications that guarantee the structural consistency of the search tree, a required isolation level (including phantom prevention) for database operations, and recoverability for structure modifications and database operations.
international database engineering and applications symposium | 2009
Tuukka Haapasalo; Seppo Sippu; Ibrahim Jaluta; Eljas Soisalon-Soininen
Modern database applications increasingly often require access to historical versions of the database. Storing such multiversion data in a single-version B+-tree database index is inefficient, especially for key-range queries. In this article, we present an index structure called the concurrent multiversion B+-tree (CMVBT) for efficiently storing and querying multiversion data. The CMVBT structure uses an asymptotically optimal transactional multiversion B+-tree (TMVBT) index as the main data storage, and a separate B+-tree index called the versioned B+-tree (VBT) to hold the updates of active transactions. The updates of committed transactions are moved, one transaction at a time, from the VBT into the TMVBT. This organization of two separate index structures allows us to maintain the asymptotic optimality guarantees of the TMVBT even in the presence of concurrent updating transactions. We provide concurrent algorithms for updating and reading the CMVBT structure. Our CMVBT algorithms can be used with the standard snapshot isolation concurrency-control and ARIES-based recovery algorithms to allow multiple read-only and updating transactions to operate concurrently on the structure. Transaction rollback is also supported for all updating transactions, either entirely or up to a preset savepoint.
international conference on data engineering | 2007
Timo Lilja; Riku Saikkonen; Seppo Sippu; Eljas Soisalon-Soininen
We consider online bulk-delete operations on a large database table organized as a primary (sparse) B+-tree index on a multi-attribute key. Using the natural range partitions induced by prefixes of the key, we define a multi-granular key-range locking protocol in which a bulk operation locks a small number of logical fragments of the table covering the target of the operation. We also present an efficient and recoverable bulk-delete algorithm that minimizes the work needed in B-tree rebalancing and in transaction rollback. All the locks needed for a bulk-delete operation are acquired during a scan of the leaf pages covering the target key range; in this scan the records qualifying for deletion are only marked as deleted. The records are physically deleted in a rebalance phase that avoids visiting subtrees in which all records qualify for deletion, thus saving considerably on the number of rebalancing operations.
symposium on principles of database systems | 1988
Seppo Sippu; Eljas Soisalon-Soininen
We augment relational algebra with a generalized transitive closure operator that allows for the efficient evaluation of a subclass of recursive queries. The operator is based on a composition operator which is as general as possible when the operator is required to be associative and when only relational algebra operators are used in its definition. The closure of such a composition can be computed using the well-known efficient algorithms designed for the computation of the usual transitive closure. Besides the case in which complete materialization of recursive relations is required, our strategy also yields an efficient solution in the case in which a selection is applied to the closure.
Journal of the ACM | 1996
Seppo Sippu; Eljas Soisalon-Soininen
We analyze the optimization effect of the “magic sets” rewriting technique for datalog queries and present some supplementary or alternative techniques that avoid many shortcomings of the basic technique. Given a magic sets rewritten query, the set of facts generated for the original, nonmagic predicates by the seminaive bottom-up evaluation is characterized precisely. It is shown that, because of the additional magic facts, magic sets processing may result in generating an order of magnitude more facts than the straightforward naive evaluation. A refinement of magic sets, called factorized magic sets, is defined. These magic sets retain most of the efficiency of original magic sets with regard to the number of nonmagic facts generated and have the property that a linear-time bound with respect to seminaive evaluation is guaranteed in all cases. An alternative technique for magic sets, called envelopes, which has several desirable properties over magic sets, is introduced. Envelope predicates are never recursive with the original predicates; thus, envelopes can be computed as a preprocessing task. Envelopes also allow the utilization of multiple sideways information passing strategies (sips) for a rule. An envelope-transformed program may be “readorned” according to another choice of sips and reoptimized by magic sets (or envelopes), thus making possible an optimization effect that cannot be achieved by magic sets based on a particular choice of sips.
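The abstract measures magic sets against naive and seminaive bottom-up evaluation. As background, the sketch below contrasts those two baselines on the standard transitive-closure program `path(x,y) :- edge(x,y).` and `path(x,z) :- path(x,y), edge(y,z).` It does not implement magic sets, factorized magic sets, or envelopes; the function names and the choice to count rule firings as "derivations" are assumptions made for illustration.

```python
def join(paths, edges):
    """All (x, z) with (x, y) in paths and (y, z) in edges, duplicates kept."""
    return [(x, z) for (x, y) in paths for (y2, z) in edges if y == y2]


def naive_closure(edges):
    """Re-apply both rules to all known facts until nothing changes."""
    edges = set(edges)
    total = set()
    derivations = 0
    while True:
        derived = list(edges) + join(total, edges)
        derivations += len(derived)
        new_total = set(derived)
        if new_total == total:
            return total, derivations
        total = new_total


def seminaive_closure(edges):
    """Apply the recursive rule only to facts that were new in the last round."""
    edges = set(edges)
    total = set(edges)
    delta = set(edges)
    derivations = len(edges)
    while delta:
        derived = join(delta, edges)
        derivations += len(derived)
        delta = set(derived) - total
        total |= delta
    return total, derivations


chain = [(1, 2), (2, 3), (3, 4)]
naive_facts, naive_work = naive_closure(chain)
semi_facts, semi_work = seminaive_closure(chain)
```

Both functions compute the same set of `path` facts; the seminaive version avoids rederiving facts from earlier rounds, which is why it serves as the reference point for counting generated facts.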