Maximilian Dylla
Max Planck Society
Publications
Featured research published by Maximilian Dylla.
Science of Computer Programming | 2012
Wolfgang Ahrendt; Maximilian Dylla
We present a semantics, calculus, and system for compositional verification of Creol, an object-oriented modelling language for concurrent distributed applications. The system is an instance of KeY, a framework for object-oriented software verification, which has so far been applied foremost to sequential Java. Building on KeY's characteristic concepts, such as dynamic logic, sequent calculus, symbolic execution via explicit substitutions, and the taclet rule language, the presented system addresses functional correctness of Creol models featuring local cooperative thread parallelism and global communication via asynchronous method calls. The calculus heavily operates on communication histories specified by the interfaces of Creol units. Two example scenarios demonstrate the usage of the system. This article extends the conference paper of Ahrendt and Dylla (2009) [5] with a denotational semantics of Creol and an assumption-commitment style semantics of the logic.
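The notion of a communication history can be illustrated with a small, language-agnostic sketch. The check below is written in Python rather than Creol or KeY's taclet language, and it verifies only the simple interface property that every reply event in a history is preceded by a matching invocation; the event encoding and names are hypothetical, not the paper's calculus.

```python
# Illustrative only: a toy well-formedness check on a communication history,
# not Creol syntax or a KeY calculus rule. The event encoding is an assumption.

def history_wellformed(history):
    """history: list of ('invoke' | 'reply', caller, callee, method) events."""
    pending = set()
    for kind, caller, callee, method in history:
        call = (caller, callee, method)
        if kind == "invoke":
            pending.add(call)
        elif kind == "reply":
            if call not in pending:
                return False          # a reply without a matching earlier invocation
            pending.discard(call)
    return True

print(history_wellformed([("invoke", "client", "server", "get"),
                          ("reply", "client", "server", "get")]))   # True
print(history_wellformed([("reply", "client", "server", "get")]))   # False
```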
International Conference on Data Engineering | 2013
Maximilian Dylla; Iris Miliaraki; Martin Theobald
We investigate a novel approach of computing confidence bounds for top-k ranking queries in probabilistic databases with non-materialized views. Unlike related approaches, we present an exact pruning algorithm for finding the top-ranked query answers according to their marginal probabilities without the need to first materialize all answer candidates via the views. Specifically, we consider conjunctive queries over multiple levels of select-project-join views, the latter of which are cast into Datalog rules which we ground in a top-down fashion directly at query processing time. To our knowledge, this work is the first to address integrated data and confidence computations for intensional query evaluations in the context of probabilistic databases by considering confidence bounds over first-order lineage formulas. We extend our query processing techniques by a tool-suite of scheduling strategies based on selectivity estimation and the expected impact on confidence bounds. Further extensions to our query processing strategies include improved top-k bounds in the case when sorted relations are available as input, as well as the consideration of recursive rules. Experiments with large datasets demonstrate significant runtime improvements of our approach compared to both exact and sampling-based top-k methods over probabilistic data.
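The pruning idea can be sketched in a few lines: keep a lower/upper bound interval on each candidate's marginal probability, discard candidates whose upper bound falls below the k-th best lower bound, and tighten intervals until the top-k is certain. The sketch below is a simplification under assumed interfaces (the refine callback and the bound representation are hypothetical), not the paper's algorithm; in the actual system, tightening a bound roughly corresponds to further grounding of the first-order lineage formula behind a candidate.

```python
# A minimal sketch of bound-based top-k pruning over answer candidates.
# bounds maps each answer to [lo, hi] on its marginal probability (assumed interface).

def top_k_by_bounds(bounds, k, refine):
    """refine(a) is a caller-supplied callback that tightens answer a's interval in place."""
    while True:
        ranked = sorted(bounds, key=lambda a: bounds[a][0], reverse=True)
        threshold = bounds[ranked[k - 1]][0] if len(ranked) >= k else 0.0
        for a in list(bounds):                      # prune: cannot reach the top-k anymore
            if bounds[a][1] < threshold:
                del bounds[a]
        ranked = sorted(bounds, key=lambda a: bounds[a][0], reverse=True)
        top, rest = ranked[:k], ranked[k:]
        if all(bounds[a][1] <= threshold for a in rest):
            return top                              # ranking is certain without exact probabilities
        widest = max(bounds, key=lambda a: bounds[a][1] - bounds[a][0])
        refine(widest)                              # tighten the loosest interval and iterate
```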
Very Large Data Bases | 2013
Maximilian Dylla; Iris Miliaraki; Martin Theobald
Temporal annotations of facts are a key component both for building a high-accuracy knowledge base and for answering queries over the resulting temporal knowledge base with high precision and recall. In this paper, we present a temporal-probabilistic database model for cleaning uncertain temporal facts obtained from information extraction methods. Specifically, we consider a combination of temporal deduction rules, temporal consistency constraints and probabilistic inference based on the common possible-worlds semantics with data lineage, and we study the theoretical properties of this data model. We further develop a query engine which is capable of scaling to very large temporal knowledge bases, with nearly interactive query response times over millions of uncertain facts and hundreds of thousands of grounded rules. Our experiments over two real-world datasets demonstrate the increased robustness of our approach compared to related techniques based on constraint solving via Integer Linear Programming (ILP) and probabilistic inference via Markov Logic Networks (MLNs). We are also able to show that our runtime performance is more than competitive with current ILP solvers and with the fastest available probabilistic (but non-temporal) database engines.
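As a small illustration of what a temporal consistency constraint over uncertain facts can look like, the sketch below flags pairs of facts that assign a subject two different values of a functional relation over overlapping validity intervals. The fact encoding and the example relation playsFor are hypothetical, and the actual model resolves such conflicts probabilistically via lineage rather than by merely reporting them.

```python
# A minimal sketch, assuming uncertain temporal facts carry a half-open validity
# interval [begin, end) and a confidence; the constraint below is illustrative,
# not the paper's actual rule syntax.

from collections import namedtuple
from itertools import combinations

Fact = namedtuple("Fact", "subj pred obj begin end prob")

def overlaps(f, g):
    return f.begin < g.end and g.begin < f.end

def temporally_inconsistent(facts, functional_pred):
    """Return pairs of facts violating a 'one value at a time' constraint."""
    relevant = [f for f in facts if f.pred == functional_pred]
    return [(f, g) for f, g in combinations(relevant, 2)
            if f.subj == g.subj and f.obj != g.obj and overlaps(f, g)]

facts = [
    Fact("player_a", "playsFor", "club_x", 2004, 2008, 0.9),
    Fact("player_a", "playsFor", "club_y", 2007, 2010, 0.6),
]
print(temporally_inconsistent(facts, "playsFor"))  # the two facts overlap in 2007
```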
International Conference on Formal Engineering Methods | 2009
Wolfgang Ahrendt; Maximilian Dylla
We present a verification system for Creol, an object-oriented modeling language for concurrent distributed applications. The system is an instance of KeY, a framework for object-oriented software verification, which has so far been applied foremost to sequential Java. Building on KeY's characteristic concepts, such as dynamic logic, sequent calculus, explicit substitutions, and the taclet rule language, the system presented in this paper addresses functional correctness of Creol models featuring local cooperative thread parallelism and global communication via asynchronous method calls. The calculus heavily operates on communication histories which describe the interfaces of Creol units. Two example scenarios demonstrate the usage of the system.
Conference on Information and Knowledge Management | 2011
Timm Meiser; Maximilian Dylla; Martin Theobald
Recent advances in Web-based information extraction have allowed for the automatic construction of large, semantic knowledge bases, which are typically captured in RDF format. The very nature of the applied extraction techniques however entails that the resulting RDF knowledge bases may face a significant amount of incorrect, incomplete, or even inconsistent (i.e., uncertain) factual knowledge, which makes query answering over this kind of data a challenge. Our reasoner, coined URDF, supports SPARQL queries along with rule-based, first-order predicate logic to infer new facts and to resolve data uncertainty over millions of RDF triples directly at query time. We demonstrate a fully interactive reasoning engine, combining a Java-based reasoning backend and a Flash-based visualization frontend in a dynamic client-server architecture. Our visualization frontend provides interactive access to the reasoning backend, including tasks like exploring the knowledge base, rule-based and statistical reasoning, faceted browsing of large query graphs, and explaining answers through lineage.
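To make the idea of rule-based derivation over uncertain triples concrete, here is a toy sketch: one hard-coded Horn rule is grounded against a small set of weighted triples, and a derived fact inherits the minimum confidence of its body. The rule format and the min-combination are simplifying assumptions for illustration, not URDF's actual grounding or scoring procedure.

```python
# Illustrative sketch of grounding one rule over uncertain RDF triples;
# confidences are combined with min() purely as an example heuristic.

triples = {
    ("alice", "bornIn", "berlin"): 0.8,
    ("berlin", "locatedIn", "germany"): 0.95,
}

# rule: bornIn(X, Y) AND locatedIn(Y, Z) => citizenOf(X, Z)
def apply_rule(kb):
    derived = {}
    for (x, p1, y), c1 in kb.items():
        for (y2, p2, z), c2 in kb.items():
            if p1 == "bornIn" and p2 == "locatedIn" and y == y2:
                derived[(x, "citizenOf", z)] = min(c1, c2)
    return derived

print(apply_rule(triples))  # {('alice', 'citizenOf', 'germany'): 0.8}
```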
Advances in Databases and Information Systems | 2013
Martin Theobald; Luc De Raedt; Maximilian Dylla; Angelika Kimmig; Iris Miliaraki
Over the past decade, the two research areas of probabilistic databases and probabilistic programming have intensively studied the problem of making structured probabilistic inference scalable, but so far both areas have developed almost independently of one another. While probabilistic databases have focused on describing tractable query classes based on the structure of query plans and data lineage, probabilistic programming has contributed sophisticated inference techniques based on knowledge compilation and lifted first-order inference. Both fields have developed their own (exact and approximate) variants of top-k algorithms for query evaluation, and both investigate query optimization techniques known from SQL, Datalog, and Prolog, all of which calls for a closer study of the commonalities and the integration of the two fields. Moreover, we believe that natural-language processing and information extraction will remain a driving factor, and in fact a longstanding challenge, for developing expressive representation models that can be combined with structured probabilistic inference for decades to come.
Conference on Information and Knowledge Management | 2012
Yafang Wang; Maximilian Dylla; Zhaochun Ren; Marc Spaniol; Gerhard Weikum
Acquiring high-quality (temporal) facts for knowledge bases is a labor-intensive process. Although there has been recent progress in the area of semi-supervised fact extraction, these approaches still have limitations, including a restricted corpus, a fixed set of relations to be extracted, or a lack of assessment capabilities. In this paper we introduce PRAVDA-live, a framework that overcomes these limitations and supports the entire pipeline of interactive knowledge harvesting. To this end, our demo covers fact extraction from ad-hoc corpus creation, via relation specification, labeling, and assessment, all the way to ready-to-use RDF exports.
Database and Expert Systems Applications | 2016
Yafang Wang; Zhaochun Ren; Martin Theobald; Maximilian Dylla; Gerard de Melo
Recent advances in knowledge harvesting have enabled us to collect large amounts of facts about entities from Web sources. A good portion of these facts have a temporal scope that, for example, allows us to concisely capture a person's biography. However, raw sets of facts are not well suited for presentation to human end users. This paper develops a novel abstraction-based method to summarize a set of facts into natural-language sentences. Our method distills temporal knowledge from Web documents and generates a concise summary according to a particular user's interest, such as, for example, a soccer player's career. Our experiments are conducted on biography-style Wikipedia pages, and the results demonstrate the good performance of our system in comparison to existing text-summarization methods.
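As a rough intuition for turning temporal facts into text, the toy sketch below filters facts by a user interest, orders them chronologically, and renders each one with a per-relation template. The templates, relation names, and interest filter are hypothetical and far simpler than the abstraction-based method described above.

```python
# A toy sketch: temporal facts to sentences via per-relation templates (assumed names).

templates = {
    "playsFor": "{subj} played for {obj} from {begin} to {end}.",
    "wonAward": "{subj} won the {obj} in {begin}.",
}

def summarize(facts, interest):
    relevant = sorted((f for f in facts if f["pred"] in interest),
                      key=lambda f: f["begin"])            # chronological order
    return " ".join(templates[f["pred"]].format(**f) for f in relevant)

facts = [
    {"subj": "Player A", "pred": "wonAward", "obj": "Golden Boot", "begin": 2009, "end": 2009},
    {"subj": "Player A", "pred": "playsFor", "obj": "Club X", "begin": 2004, "end": 2008},
]
print(summarize(facts, {"playsFor", "wonAward"}))
```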
Reasoning Web International Summer School | 2014
Maximilian Dylla; Martin Theobald; Iris Miliaraki
Probabilistic Databases (PDBs) lie at the expressive intersection of databases, first-order logic, and probability theory. PDBs employ logical deduction rules to process Select-Project-Join (SPJ) queries, which form the basis for a variety of declarative query languages such as Datalog, Relational Algebra, and SQL. They employ logical consistency constraints to resolve data inconsistencies, and they represent query answers via logical lineage formulas (a.k.a. "data provenance") to trace the dependencies between these answers and the input tuples that led to their derivation. While the literature on PDBs spans more than 25 years of research, only fairly recently was the key role of lineage recognized for establishing a closed and complete representation model of relational operations over this kind of probabilistic data. Although PDBs benefit from efficient and scalable database infrastructures for data storage and indexing, they couple data computation with probabilistic inference, and the latter remains a #P-hard problem in the context of PDBs as well.
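The #P-hardness mentioned above is easy to see from the naive possible-worlds computation of an answer's probability from its lineage: enumerate every truth assignment to the base tuples, weight it by the tuple probabilities, and sum the weights of the worlds in which the lineage formula holds. The formula encoding below is an assumption made for illustration, and real PDB engines avoid this enumeration whenever the lineage structure allows.

```python
# A minimal sketch of possible-worlds evaluation of a propositional lineage
# formula over independent base tuples; exponential in the number of tuples.

from itertools import product

def evaluate(formula, world):
    kind = formula[0]
    if kind == "var":
        return world[formula[1]]
    if kind == "not":
        return not evaluate(formula[1], world)
    if kind == "and":
        return all(evaluate(f, world) for f in formula[1:])
    if kind == "or":
        return any(evaluate(f, world) for f in formula[1:])
    raise ValueError(kind)

def probability(formula, probs):
    names = list(probs)
    total = 0.0
    for bits in product([True, False], repeat=len(names)):
        world = dict(zip(names, bits))
        weight = 1.0
        for n in names:
            weight *= probs[n] if world[n] else 1.0 - probs[n]
        if evaluate(formula, world):
            total += weight
    return total

# lineage of an answer derived from tuples t1, t2, t3: (t1 AND t2) OR t3
print(probability(("or", ("and", ("var", "t1"), ("var", "t2")), ("var", "t3")),
                  {"t1": 0.7, "t2": 0.5, "t3": 0.2}))  # 0.48
```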
Archive | 2014
Maximilian Dylla
Probabilistic databases store, query, and manage large amounts of uncertain information. This thesis advances the state of the art in probabilistic databases in three ways: 1. We present a closed and complete data model for temporal probabilistic databases and analyze its complexity. Queries are posed via temporal deduction rules which induce lineage formulas capturing both time and uncertainty. 2. We devise a methodology for computing the top-k most probable query answers. It is based on first-order lineage formulas representing sets of answer candidates. Theoretically derived probability bounds on these formulas enable pruning low-probability answers. 3. We introduce the problem of learning tuple probabilities, which allows updating and cleaning of probabilistic databases. We study its complexity, characterize its solutions, cast it into an optimization problem, and devise an approximation algorithm based on stochastic gradient descent. All of the above contributions support consistency constraints and are evaluated experimentally.
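A toy sketch of the core idea behind the third contribution, under the strong simplifying assumption that each query answer has purely conjunctive lineage over independent tuples: adjust the tuple probabilities by stochastic gradient descent so that the predicted answer probabilities approach given target probabilities. All names are hypothetical, and the actual method handles general lineage formulas and consistency constraints.

```python
# A toy SGD sketch for learning tuple probabilities; answers are assumed to
# have conjunctive lineage, so an answer's probability is the product of its
# tuples' probabilities (a strong simplification of the thesis's setting).

import random

def sgd_learn(lineages, targets, tuples, lr=0.1, epochs=2000):
    """lineages: answer -> list of tuple ids; targets: answer -> desired probability;
    tuples: tuple id -> initial probability (updated in place)."""
    for _ in range(epochs):
        answer = random.choice(list(lineages))
        body = lineages[answer]
        pred = 1.0
        for t in body:                                   # conjunctive lineage
            pred *= tuples[t]
        err = pred - targets[answer]
        for t in body:
            others = 1.0
            for s in body:
                if s != t:
                    others *= tuples[s]
            tuples[t] -= lr * 2 * err * others           # gradient of the squared error
            tuples[t] = min(1.0, max(0.0, tuples[t]))    # keep it a valid probability
    return tuples

learned = sgd_learn({"a1": ["t1", "t2"], "a2": ["t2"]},
                    {"a1": 0.3, "a2": 0.6},
                    {"t1": 0.9, "t2": 0.9})
print(learned)  # roughly {'t1': 0.5, 't2': 0.6}
```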