Sebastian Bächle
Kaiserslautern University of Technology
Publication
Featured research published by Sebastian Bächle.
Computer Science - Research and Development | 2015
Christian Mathis; Theo Härder; Karsten Schmidt; Sebastian Bächle
XML Indexing and Storage (XMIS) techniques are crucial for the functionality and the overall performance of an XML database management system (XDBMS). Because of the complexity of XQuery and performance demands of XML query processing, efficient path processing operators—including those for tree-pattern queries (so-called twigs)—are urgently needed for which tailor-made indexes and their flexible use are indispensable. Although XML indexing and storage are standard problems and, of course, manifold approaches have been proposed in the last decade, adaptive and broad-enough solutions for satisfactory query evaluation support of all path processing operators are missing in the XDBMS context. Therefore, we think that it is worthwhile to take a step back and look at the complete picture to derive a salient and holistic solution. To do so, we first compile an XMIS wish list containing what—in our opinion—are essential functional storage and indexing requirements in a modern XDBMS. With these desiderata in mind, we then develop a new XMIS scheme, which—by reconsidering previous work—can be seen as a practical and general approach to XML storage and indexing. Interestingly, by working on both problems at the same time, we can make the storage and index managers live in a kind of symbiotic partnership, because the document store re-uses ideas originally proposed by the indexing community and vice versa. The XMIS scheme is implemented in XTC, an XDBMS used for empirical tests.
advances in databases and information systems | 2013
Caetano Sauer; Sebastian Bächle; Theo Härder
The MapReduce (MR) framework has become a standard tool for performing large batch computations, usually of aggregative nature, in parallel over a cluster of commodity machines. A significant share of typical MR jobs involves standard database-style queries, where it becomes cumbersome to specify map and reduce functions from scratch. To overcome this burden, higher-level languages such as HiveQL, PigLatin, and JAQL have been proposed to allow the automatic generation of MR jobs from declarative queries. We identify two major problems of these existing solutions: (i) they introduce new query languages and implement systems from scratch for the sole purpose of expressing MR jobs; and (ii) despite solving some of the major limitations of SQL, they still lack the flexibility required by big data applications. We propose BrackitMR, an approach based on the XQuery language with extended JSON support. XQuery not only is an established query language, but also has a more expressive data model and more powerful language constructs, enabling a much greater degree of flexibility. From a system design perspective, we extend an existing single-node query processor, Brackit, adding MR as a distributed coordination layer. Such heavy reuse of the standard query processor not only provides performance, but also allows for a more elegant design which transparently integrates MR processing into a generic query engine.
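To illustrate the burden the abstract refers to, the following minimal sketch hand-codes a simple "sum per group" query as map and reduce functions and runs them in a single process. It is not BrackitMR or any real MR framework; the names `map_fn`, `reduce_fn`, `run_mapreduce`, and the `orders` records are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def map_fn(record):
    # Map phase: emit one (key, value) pair per input record.
    yield record["category"], record["amount"]


def reduce_fn(key, values):
    # Reduce phase: aggregate all values that share a key.
    return key, sum(values)


def run_mapreduce(records, map_fn, reduce_fn):
    # In-process stand-in for the shuffle step: group mapped values by key.
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    # Apply the reducer once per key.
    return dict(reduce_fn(key, values) for key, values in sorted(groups.items()))


# Hypothetical input data.
orders = [
    {"category": "books", "amount": 10},
    {"category": "toys", "amount": 5},
    {"category": "books", "amount": 7},
]

totals = run_mapreduce(orders, map_fn, reduce_fn)
print(totals)
```

Even this small aggregation needs two hand-written functions plus grouping logic, which is the kind of boilerplate a declarative query language is meant to generate automatically.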
Datenbank-spektrum | 2014
Sebastian Bächle; Caetano Sauer
The XQuery language was initially developed as an SQL equivalent for XML data, but its roots in functional programming also make it a perfect choice for processing almost any kind of structured and semi-structured data. Apart from standard XML processing, however, advanced language features make it hard to efficiently implement the complete language for large data volumes. This work proposes a novel compilation strategy that provides both flexibility and efficiency to unleash XQuery’s potential as a data programming language. It combines the simplicity and versatility of a storage-independent data abstraction with the scalability advantages of set-oriented processing. Expensive iterative sections in a query are unrolled to a pipeline of relational-style operators, which is open for optimized join processing, index use, and parallelization. The remaining aspects of the language are processed in a standard fashion, yet can be compiled anytime to more efficient native operations of the actual runtime environment. This hybrid compilation mechanism yields an efficient and highly flexible query engine that is able to drive any computation from simple XML transformation to complex data analysis, even on non-XML data. Experiments with our prototype and state-of-the-art competitors in classic XML query processing and business analytics over relational data attest to the generality and efficiency of the design.
business intelligence for the real-time enterprises | 2015
Karsten Schmidt; Sebastian Bächle; Philipp Scholl; Georg Nold
Identifying and exploring relevant content in growing document collections is a challenge for researchers, users, and system providers alike. Supporting this is crucial for companies offering knowledge in the form of documents as their core product. Our demo shows an intelligent way of doing guided research in big text collections, using the collection of the major scientific publisher Springer SBM as an example data set. We use the SAP HANA platform for flexible text analysis, ad-hoc calculations and data linkage, in order to enhance the experience of users navigating and exploring publications. We integrate unstructured data (textual documents) and structured data (document metadata and web server logs), and provide interactive filters in order to enable a responsive user experience while searching for relevant content. With HANA, we are able to implement this functionality over big data on a single machine by making use of HANA’s SQL data store and the built-in application server.
web age information management | 2013
Caetano Sauer; Sebastian Bächle; Theo Härder
We present BrackitMR, a framework that executes XQuery programs over distributed data using MapReduce. The main goal is to provide flexible MapReduce-based data processing with minimal performance penalties. Based on the Brackit query engine, a generic query compilation and optimization infrastructure, our system allows for a transparent integration of multiple data sources, such as XML, JSON, and CSV files, as well as relational databases, NoSQL stores, and lower-level record APIs such as BerkeleyDB.
Computer Science - Research and Development | 2012
Karsten Schmidt; Sebastian Bächle
Effective I/O buffering is a performance-critical task in database management systems. Accordingly, systems usually employ various special-purpose buffers to align, e.g., device speed, page size, and replacement policies with the actual data and workload. However, such partitioning of available buffer memory results in complex optimization problems for database administrators and also in fragile configurations which quickly deteriorate on workload shifts. Reliable forecasts of I/O costs enable a system to evaluate alternative configurations and to continuously optimize its buffer memory allocation at runtime. So far, all techniques proposed for the prediction of buffer performance focus solely on hit ratio gains for increased buffer sizes to identify buffers which promise the greatest benefit. These approaches, however, assume that their forecasts can also be extrapolated to the effect of buffer downsizing. As we will show, this carries a severe risk of wrong tuning decisions, which may heavily impact system performance. Thus, we emphasize the importance of reliably forecasting the penalty to expect when shrinking buffers in favor of others. We explore the use of lightweight extensions for widely used buffer algorithms to perform on-the-fly simulation of buffer performance for smaller and larger buffer sizes simultaneously. Furthermore, we present a simple cost model and demonstrate how to compose these concepts into a self-tuning component for dynamic buffer reallocation.
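The idea of evaluating several buffer sizes at once can be illustrated with the classic stack-distance technique for LRU, sketched below. This is an assumption made for illustration and not necessarily the extension used in the paper: because an LRU buffer of size n always holds a subset of the pages held by a buffer of size n+1, a single pass over a page reference trace yields the hit counts for all sizes. The function name `lru_hits_by_size` and the sample trace are hypothetical.

```python
def lru_hits_by_size(trace, sizes):
    """Count LRU hits for every buffer size in `sizes` in one pass over `trace`."""
    stack = []  # LRU stack: most recently used page at index 0
    hits = {size: 0 for size in sizes}
    for page in trace:
        if page in stack:
            # Stack distance: position of the page in the LRU stack (1 = most recent).
            depth = stack.index(page) + 1
            # The reference is a hit in every buffer large enough to still hold the page.
            for size in sizes:
                if depth <= size:
                    hits[size] += 1
            stack.remove(page)
        # Move (or insert) the page to the top of the stack.
        stack.insert(0, page)
    return hits


# Hypothetical page reference trace.
trace = ["A", "B", "A", "C", "B", "D", "A"]
hits = lru_hits_by_size(trace, sizes=[2, 3, 4])
print(hits)  # {2: 1, 3: 2, 4: 3}
```

Comparing the counts for a smaller and a larger size with the current one gives both the expected gain from growing a buffer and the expected penalty from shrinking it. The list-based stack keeps the sketch short; a production version would bound the stack depth and use a cheaper lookup structure.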
Datenbank-spektrum | 2011
Sebastian Bächle; Theo Härder; Volker Höfner; Joachim Klein; Yi Ou; Steffen Reithermann; Daniel Schall; Karsten Schmidt; Andreas M. Weiner
Since 1985, the GI division Databases and Information Systems (DBIS) has held its conference Datenbanksysteme für Business, Technologie und Web (BTW) every two years; this year it took place in Kaiserslautern from February 28 to March 4. After 1991, when the 4th conference of this series was held at TU Kaiserslautern, the DBIS research group of the university's Department of Computer Science thus had the opportunity to host the largest scientific event of the German-speaking database community for the second time. It was organized by Prof. Dr.-Ing. Dr. h.c. Theo Härder (local organization) and two alumni of TU Kaiserslautern, Prof. Dr.-Ing. habil. Bernhard Mitschang (scientific program committee, Universität Stuttgart) and Dr.-Ing. Harald Schöning (industrial program committee, Software AG, Darmstadt). Jürgen Bittner, chairman of the board of SQL Projekt AG, Dresden, and Prof. Dr. Hans-Jürgen Schek, professor emeritus of ETH Zürich, were invited as guests of honor of the BTW. Both had given the keynote talks at the 4th BTW conference in 1991, titled “Die Architekturkonzeption eines DBMS aus pragmatischer Sicht” (The architectural design of a DBMS from a pragmatic point of view) and “Erweiterbarkeit, Kooperation, Föderation von Datenbanksystemen” (Extensibility, cooperation, federation of database systems). The technical program comprised more than 40 talks in the scientific and industrial programs and 12 demonstrations of new computer science applications. All contributions were selected in an anonymous review process by more than 50 experts from the DBIS field. The
networked digital technologies | 2010
Sebastian Bächle; Theo Härder
This paper focuses on an aspect that is widely neglected in native XML database management systems: support for concurrent transactional access. We analyze the isolation requirements of the XQuery Update language and disclose typical sources of anomalies in various query processing strategies. We also present extensions to our proven XML lock protocol, which allow us to exploit dynamic schema information for query processing and protect us against XML-specific “schema phantoms”. All concepts shown were implemented in XTC, which we developed as a research vehicle over the past six years. The outcome of this long-term project is a rather complete XML DBMS, which is used as an experimental testbed for XML-related research and also as a scalable framework for serializable XQuery.
database and expert systems applications | 2009
Sebastian Bächle; Theo Härder
conference on current trends in theory and practice of informatics | 2009
Theo Härder; Christian Mathis; Sebastian Bächle; Karsten Schmidt; Andreas M. Weiner