Jef Wijsen | Researchain

Publication

Featured research published by Jef Wijsen.

ACM Transactions on Database Systems | 2005

Database repairing using updates

Jef Wijsen

Repairing a database means bringing the database in accordance with a given set of integrity constraints by applying some minimal change. If a database can be repaired in more than one way, then the consistent answer to a query is defined as the intersection of the query answers on all repaired versions of the database. Earlier approaches have confined the repair work to deletions and insertions of entire tuples. We propose a theoretical framework that also covers updates as a repair primitive. Update-based repairing is interesting in that it allows rectifying an error within a tuple without deleting the tuple, thereby preserving consistent values in the tuple. Another novel idea is the construct of nucleus: a single database that yields consistent answers to a class of queries, without the need for query rewriting. We show the construction of nuclei for full dependencies and conjunctive queries. Consistent query answering and constructing nuclei is generally intractable under update-based repairing. Nevertheless, we also show some tractable cases of practical interest.

International Conference on Database Theory | 2003

Condensed Representation of Database Repairs for Consistent Query Answering

Jef Wijsen

Repairing a database means bringing the database in accordance with a given set of integrity constraints by applying modifications that are as small as possible. In the seminal work of Arenas et al. on query answering in the presence of inconsistency, the possible modifications considered are deletions and insertions of tuples. Unlike earlier work, we also allow tuple updates as a repair primitive. Update-based repairing is advantageous, because it allows rectifying an error within a tuple without deleting the tuple, thereby preserving other consistent values in the tuple. At the center of the paper is the problem of query answering in the presence of inconsistency relative to this refined repair notion. Given a query, a trustable answer is obtained by intersecting the query answers on all repaired versions of the database. The problem arising is that, in general, a database can be repaired in infinitely many ways. A positive result is that for conjunctive queries and full dependencies, there exists a condensed representation of all repairs that permits computing trustable query answers.

ACM Transactions on Database Systems | 1999

Temporal FDs on complex objects

Jef Wijsen

Temporal functional dependencies (TFD) are defined for temporal databases that include object identity. It is argued that object identity can overcome certain semantic difficulties with existing temporal relational data models. Practical applications of TFDs in object bases are discussed. Reasoning about TFDs is at the center of this paper. It turns out that the distinction between acyclic and cyclic schemas is significant. For acyclic schemas, a complete axiomatization for finite implication is given and an algorithm for deciding finite implication provided. The same axiomatization is proven complete for unrestricted implication in unrestricted schemas, which can be cyclic. An interesting result is that there are cyclic schemas for which unrestricted and finite implication do not coincide. TFDs relate and extend some earlier work on dependency theory in temporal databases. Throughout this paper, the construct of TFD is compared with the notion of temporal FD introduced by Wang et al. (1997). A comparison with other related work is provided at the end of the article.

Data Mining and Knowledge Discovery | 1998

On the Complexity of Mining Quantitative Association Rules

Jef Wijsen; Robert Meersman

The discovery of quantitative association rules in large databases is considered an interesting and important research problem. Recently, different aspects of the problem have been studied, and several algorithms have been presented in the literature, among others in (Srikant and Agrawal, 1996; Fukuda et al., 1996a; Fukuda et al., 1996b; Yoda et al., 1997; Miller and Yang, 1997). An aspect of the problem that has so far been ignored, is its computational complexity. In this paper, we study the computational complexity of mining quantitative association rules.

ACM Transactions on Database Systems | 2012

Determining the Currency of Data

Wenfei Fan; Floris Geerts; Jef Wijsen

Data in real-life databases become obsolete rapidly. One often finds that multiple values of the same entity reside in a database. While all of these values were once correct, most of them may have become stale and inaccurate. Worse still, the values often do not carry reliable timestamps. With this comes the need for studying data currency, to identify the current value of an entity in a database and to answer queries with the current values, in the absence of reliable timestamps. This article investigates the currency of data. (1) We propose a model that specifies partial currency orders in terms of simple constraints. The model also allows us to express what values are copied from other data sources, bearing currency orders in those sources, in terms of copy functions defined on correlated attributes. (2) We study fundamental problems for data currency, to determine whether a specification is consistent, whether a value is more current than another, and whether a query answer is certain no matter how partial currency orders are completed. (3) Moreover, we identify several problems associated with copy functions, to decide whether a copy function imports sufficient current data to answer a query, whether a copy function can be extended to import necessary current data for a query while respecting the constraints, and whether it suffices to copy data of a bounded size. (4) We establish upper and lower bounds of these problems, all matching, for combined complexity and data complexity, and for a variety of query languages. We also identify special cases that warrant lower complexity.

Symposium on Principles of Database Systems | 2010

On the first-order expressibility of computing certain answers to conjunctive queries over uncertain databases

Jef Wijsen

A natural way for capturing uncertainty in the relational data model is by having relations that violate their primary key constraint, that is, relations in which distinct tuples agree on the primary key. A repair (or possible world) of a database is then obtained by selecting a maximal number of tuples without ever selecting two distinct tuples that have the same primary key value. For a Boolean query q, CERTAINTY(q) is the problem that takes as input a database db and asks whether q evaluates to true on every repair of db. We are interested in determining queries q for which CERTAINTY(q) is first-order expressible (and hence in the low complexity class AC0). For queries q in the class of conjunctive queries without self-join, we provide a necessary syntactic condition for first-order expressibility of CERTAINTY(q). For acyclic queries, this necessary condition is also a sufficient condition. So we obtain a decision procedure for first-order expressibility of CERTAINTY(q) when q is acyclic and without self-join. We also show that if CERTAINTY(q) is first-order expressible, its first-order definition, commonly called (certain) first-order rewriting, can be constructed in a rather straightforward way.
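The definitions in this abstract can be illustrated with a minimal brute-force sketch. It is not the paper's method: it enumerates every repair of a single relation that violates its primary key and checks whether a Boolean query holds in all of them. The function names `repairs` and `certainty`, the restriction to one relation, and the sample `Emp(name, city)` data are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import product


def repairs(relation, key_len):
    """Yield every repair of `relation` (a set of tuples).

    The first `key_len` positions of each tuple are assumed to form the
    primary key. A repair keeps exactly one tuple per key value.
    """
    blocks = defaultdict(list)
    for t in sorted(set(relation)):
        blocks[t[:key_len]].append(t)
    # One choice per block; the Cartesian product gives all repairs.
    for choice in product(*blocks.values()):
        yield set(choice)


def certainty(q, relation, key_len):
    """Return True if the Boolean query `q` holds in every repair."""
    return all(q(r) for r in repairs(relation, key_len))


# Hypothetical relation Emp(name, city) with primary key `name`.
emp = {("ann", "Mons"), ("ann", "Paris"), ("bob", "Mons")}


def someone_in_mons(r):
    return any(city == "Mons" for _, city in r)


def ann_in_mons(r):
    return ("ann", "Mons") in r


print(certainty(someone_in_mons, emp, 1))
print(certainty(ann_in_mons, emp, 1))
```

The enumeration grows exponentially with the number of key values that have conflicting tuples. A first-order rewriting, when one exists, avoids this cost by evaluating a single query directly on the inconsistent database.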

ACM Transactions on Database Systems | 2002

Searching for dependencies at multiple abstraction levels

Toon Calders; Raymond T. Ng; Jef Wijsen

The notion of roll-up dependency (RUD) extends functional dependencies with generalization hierarchies. RUDs can be applied in OLAP and database design. The problem of discovering RUDs in large databases is at the center of this paper. An algorithm is provided that relies on a number of theoretical results. The algorithm has been implemented; results on two real-life datasets are given. The extension of functional dependency (FD) with roll-ups turns out to capture meaningful rules that are outside the scope of classical FD mining. Performance figures show that RUDs can be discovered in linear time in the number of tuples of the input dataset.
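As a rough illustration of what it means for a roll-up dependency to hold, the sketch below checks one candidate dependency in a single pass over the tuples. It is not the discovery algorithm of the paper. The function name `rud_holds`, the encoding of a dependency as lists of (attribute, roll-up function) pairs, and the sample sales data are assumptions made for illustration.

```python
def rud_holds(rows, lhs, rhs):
    """Check whether a candidate roll-up dependency holds on `rows`.

    `rows` is a list of dicts. `lhs` and `rhs` are lists of
    (attribute, rollup_function) pairs. The dependency holds if rows that
    agree on all rolled-up left-hand values also agree on all rolled-up
    right-hand values.
    """
    seen = {}
    for row in rows:
        left = tuple(f(row[a]) for a, f in lhs)
        right = tuple(f(row[a]) for a, f in rhs)
        # Remember the first right-hand value seen for this left-hand value.
        if seen.setdefault(left, right) != right:
            return False
    return True


# Hypothetical generalization hierarchies: day -> month, store -> region.
REGION = {"s1": "north", "s2": "north"}


def to_month(day):
    return day[:7]


def to_region(store):
    return REGION[store]


def identity(value):
    return value


sales = [
    {"day": "2001-03-14", "store": "s1", "price": 10},
    {"day": "2001-03-20", "store": "s2", "price": 10},
    {"day": "2001-04-02", "store": "s1", "price": 12},
]

lhs = [("day", to_month), ("store", to_region)]
rhs = [("price", identity)]

print(rud_holds(sales, lhs, rhs))
```

With dictionary lookups taking expected constant time, the check visits each tuple once. A classical functional dependency is the special case in which every roll-up function is `identity`.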

Symposium on Principles of Database Systems | 2011

Determining the currency of data

Wenfei Fan; Floris Geerts; Jef Wijsen

Data in real-life databases become obsolete rapidly. One often finds that multiple values of the same entity reside in a database. While all of these values were once correct, most of them may have become stale and inaccurate. Worse still, the values often do not carry reliable timestamps. With this comes the need for studying data currency, to identify the current value of an entity in a database and to answer queries with the current values, in the absence of timestamps. This paper investigates the currency of data. (1) We propose a model that specifies partial currency orders in terms of simple constraints. The model also allows us to express what values are copied from other data sources, bearing currency orders in those sources, in terms of copy functions defined on correlated attributes. (2) We study fundamental problems for data currency, to determine whether a specification is consistent, whether a value is more current than another, and whether a query answer is certain no matter how partial currency orders are completed. (3) Moreover, we identify several problems associated with copy functions, to decide whether a copy function imports sufficient current data to answer a query, whether such a function copies redundant data, whether a copy function can be extended to import necessary current data for a query while respecting the constraints, and whether it suffices to copy data of a bounded size. (4) We establish upper and lower bounds of these problems, all matching, for combined complexity and data complexity, and for a variety of query languages. We also identify special cases that warrant lower complexity.

Information Processing Letters | 2010

A remark on the complexity of consistent conjunctive query answering under primary key violations

Jef Wijsen

A natural way for capturing uncertainty in the relational data model is by allowing relations that violate their primary key. A repair of such relation is obtained by selecting a maximal number of tuples without ever selecting two tuples that agree on their primary key. Given a Boolean query q, CERTAINTY(q) is the problem that takes as input a relational database and asks whether q evaluates to true on every repair of that database. In recent years, CERTAINTY(q) has been studied primarily for conjunctive queries. Conditions have been determined under which CERTAINTY(q) is coNP-complete, first-order expressible, or not first-order expressible. A remaining open question was whether there exist conjunctive queries q without self-join such that CERTAINTY(q) is in PTIME but not first-order expressible. We answer this question affirmatively.

International Conference on Database Theory | 2009

Consistent query answering under primary keys: a characterization of tractable queries

Jef Wijsen

This article deals with consistent query answering to conjunctive queries under primary key constraints. The repairs of an inconsistent database db are obtained by selecting a maximum number of tuples from db without ever selecting two tuples that agree on their primary key. For a Boolean conjunctive query q, we are interested in the following question: does there exist a Boolean first-order query φ such that for every database db, φ evaluates to true on db if and only if q evaluates to true on every repair of db? We address this problem for acyclic conjunctive queries in which no relation name occurs more than once. Our results improve previous solutions that are based on Fuxman-Miller join graphs.
