Seokki Lee

Illinois Institute of Technology

Publication

Featured research published by Seokki Lee.

International Conference on Data Engineering | 2017

A SQL-Middleware Unifying Why and Why-Not Provenance for First-Order Queries

Seokki Lee; Sven Köhler; Bertram Ludäscher; Boris Glavic

Explaining why an answer is in the result of a query or why it is missing from the result is important for many applications including auditing, debugging data and queries, and answering hypothetical questions about data. Both types of questions, i.e., why and why-not provenance, have been studied extensively. In this work, we present the first practical approach for answering such questions for queries with negation (first-order queries). Our approach is based on a rewriting of Datalog rules (called firing rules) that captures successful rule derivations within the context of a Datalog query. We extend this rewriting to support negation and to capture failed derivations that explain missing answers. Given a (why or why-not) provenance question, we compute an explanation, i.e., the part of the provenance that is relevant to answer the question. We introduce optimizations that prune parts of a provenance graph early on if we can determine that they will not be part of the explanation for a given question. We present an implementation that runs on top of a relational database using SQL to compute explanations. Our experiments demonstrate that our approach scales to large instances and significantly outperforms an earlier approach which instantiates the full provenance to compute explanations.

International Provenance and Annotation Workshop | 2016

Implementing Unified Why- and Why-Not Provenance Through Games

Seokki Lee; Sven Köhler; Bertram Ludäscher; Boris Glavic

Using provenance to explain why a query returns a result or why a result is missing has been studied extensively. However, the two types of questions have been approached independently of each other. We present an efficient technique for answering both types of questions for Datalog queries based on a game-theoretic model of provenance called provenance games. Our approach compiles provenance requests into Datalog and translates the resulting query into SQL to execute it on a relational database backend. We apply several novel optimizations to limit the computation to provenance relevant to a given user question.

Very Large Data Bases | 2018

Provenance Summaries for Answers and Non-Answers

Seokki Lee; Bertram Ludäscher; Boris Glavic

Explaining why an answer is (not) in the result of a query has proven to be of immense importance for many applications. However, why-not provenance, and to a lesser degree also why-provenance, can be very large, even for small input datasets. The resulting scalability and usability issues have limited the applicability of provenance. We present PUG, a system for why and why-not provenance that applies a range of novel techniques to overcome these challenges. Specifically, PUG limits provenance capture to what is relevant to explain a (missing) result of interest and uses an efficient sampling-based summarization method to produce compact explanations for (missing) answers. Using two real-world datasets, we demonstrate how a user can draw meaningful insights from explanations produced by PUG.

PVLDB Reference Format: Seokki Lee, Bertram Ludäscher, Boris Glavic. Providing Provenance Summaries as Explanations for Answers and Non-Answers. PVLDB, 11 (12): 1954-1957, 2018. DOI: https://doi.org/10.14778/3229863.3236233

The VLDB Journal | 2018

PUG: a framework and practical implementation for why and why-not provenance

Seokki Lee; Bertram Ludäscher; Boris Glavic

Explaining why an answer is (or is not) returned by a query is important for many applications including auditing, debugging data and queries, and answering hypothetical questions about data. In this work, we present the first practical approach for answering such questions for queries with negation (first-order queries). Specifically, we introduce a graph-based provenance model that, while syntactic in nature, supports reverse reasoning and is proven to encode a wide range of provenance models from the literature. The implementation of this model in our PUG (Provenance Unification through Graphs) system takes a provenance question and Datalog query as an input and generates a Datalog program that computes an explanation, i.e., the part of the provenance that is relevant to answer the question. Furthermore, we demonstrate how a desirable factorization of provenance can be achieved by rewriting an input query. We experimentally evaluate our approach demonstrating its efficiency.

Very Large Data Bases | 2017

Debugging transactions and tracking their provenance with reenactment

Xing Niu; Bahareh Sadat Arab; Seokki Lee; Su Feng; Xun Zou; Dieter Gawlick; Vasudha Krishnaswamy; Zhen Hua Liu; Boris Glavic

Debugging transactions and understanding their execution are of immense importance for developing OLAP applications, tracing causes of errors in production systems, and auditing the operations of a database. However, debugging transactions is hard for several reasons: 1) after the execution of a transaction, its input is no longer available for debugging, 2) internal states of a transaction are typically not accessible, and 3) the execution of a transaction may be affected by concurrently running transactions. We present a debugger for transactions that enables non-invasive, postmortem debugging of transactions with provenance tracking and supports what-if scenarios (changes to transaction code or data). Using reenactment, a declarative replay technique we have developed, a transaction is replayed over the state of the DB seen by its original execution including all its interactions with concurrently executed transactions from the history. Importantly, our approach uses the temporal database and audit logging capabilities available in many DBMS and does not require any modifications to the underlying database system or the transactional workload.

Conference on Innovative Data Systems Research | 2017

Adaptive Schema Databases

William Spoth; Bahareh Sadat Arab; Eric S. Chan; Dieter Gawlick; Adel Ghoneimy; Boris Glavic; Beda Christoph Hammerschmidt; Oliver Kennedy; Seokki Lee; Zhen Hua Liu; Xing Niu; Ying Yang

TaPP | 2017