Lukas Blunschi | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Lukas Blunschi is active.

Explore More

Publication

Featured researches published by Lukas Blunschi.

symposium on large spatial databases | 2009

Indexing Moving Objects Using Short-Lived Throwaway Indexes

Jens Dittrich; Lukas Blunschi; Marcos Antonio Vaz Salles

With the exponential growth of moving objects data to the Gigabyte range, it has become critical to develop effective techniques for indexing, updating, and querying these massive data sets. To meet the high update rate as well as low query response time requirements of moving object applications, this paper takes a novel approach in moving object indexing. In our approach we do not require a sophisticated index structure that needs to be adjusted for each incoming update. Rather we construct conceptually simple short-lived throwaway indexes which we only keep for a very short period of time (sub-seconds) in main memory. As a consequence, the resulting technique MOVIES supports at the same time high query rates and high update rates and trades this for query result staleness. Moreover, MOVIES is the first main memory method supporting time-parameterized predictive queries. To support this feature we present two algorithms: non-predictive MOVIES and predictive MOVIES. We obtain the surprising result that a predictive indexing approach -- considered state-of-the-art in an external-memory scenario -- does not scale well in a main memory environment. In fact our results show that MOVIES outperforms state-of-the-art moving object indexes like a main-memory adapted B x -tree by orders of magnitude w.r.t. update rates and query rates. Finally, our experimental evaluation uses a workload unmatched by any previous work. We index the complete road network of Germany consisting of 40,000,000 road segments and 38,000,000 nodes. We scale our workload up to 100,000,000 moving objects, 58,000,000 updates per second and 10,000 queries per second which is unmatched by any previous work.

very large data bases | 2012

SODA: generating SQL for business users

Lukas Blunschi; Claudio Jossen; Donald Kossmann; Magdalini Mori; Kurt Stockinger

The purpose of data warehouses is to enable business analysts to make better decisions. Over the years the technology has matured and data warehouses have become extremely successful. As a consequence, more and more data has been added to the data warehouses and their schemas have become increasingly complex. These systems still work great in order to generate pre-canned reports. However, with their current complexity, they tend to be a poor match for non tech-savvy business analysts who need answers to ad-hoc queries that were not anticipated. This paper describes the design, implementation, and experience of the SODA system (Search over DAta Warehouse). SODA bridges the gap between the business needs of analysts and the technical complexity of current data warehouses. SODA enables a Google-like search experience for data warehouses by taking keyword queries of business users and automatically generating executable SQL. The key idea is to use a graph pattern matching algorithm that uses the metadata model of the data warehouse. Our results with real data from a global player in the financial services industry show that SODA produces queries with high precision and recall, and makes it much easier for business users to interactively explore highly-complex data warehouses.

international conference on data engineering | 2010

Intensional associations in dataspaces

Marcos Antonio Vaz Salles; Jens Dittrich; Lukas Blunschi

Dataspace applications necessitate the creation of associations among data items over time. For example, once information about people is extracted from sources on the Web, associations among them may emerge as a consequence of different criteria, such as their city of origin or their elected hobbies. In this paper, we advocate a declarative approach to specifying these associations. We propose that each set of associations be defined by an association trail. An association trail is a query-based definition of how items are connected by intensional (i.e., virtual) association edges to other items in the dataspace. We study the problem of processing neighborhood queries over such intensional association graphs. The naive approach to neighborhood query processing over intensional graphs is to materialize the whole graph and then apply previous work on dataspace graph indexing to answer queries. We present in this paper a novel indexing technique, the grouping-compressed index (GCI), that has better worst-case indexing cost than the naive approach. In our experiments, GCI is shown to provide an order of magnitude gain in indexing cost over the naive approach, while remaining competitive in query processing time.

very large data bases | 2008

Dwarfs in the rearview mirror: how big are they really?

Jens Dittrich; Lukas Blunschi; Marcos Antonio Vaz Salles

Online-Analytical Processing (OLAP) has been a field of competing technologies for the past ten years. One of the still unsolved challenges of OLAP is how to provide quick response times on any Terabyte-sized business data problem. Recently, a very clever multi-dimensional index structure termed Dwarf [26] has been proposed offering excellent query response times as well as unmatched index compression rates. The proposed index seems to scale well for both large data sets as well as high dimensions. Motivated by these surprisingly excellent results, we take a look into the rearview mirror. We have re-implemented the Dwarf index from scratch and make three contributions. First, we successfully repeat several of the experiments of the original paper. Second, we substantially correct some of the experimental results reported by the inventors. Some of our results differ by orders of magnitude. To better understand these differences, we provide additional experiments that better explain the behavior of the Dwarf index. Third, we provide missing experiments comparing Dwarf to baseline query processing strategies. This should give practitioners a better guideline to understand for which cases Dwarf indexes could be useful in practice.

conference on information and knowledge management | 2011

Data-thirsty business analysts need SODA: search over data warehouse

Lukas Blunschi; Claudio Jossen; Donald Kossmann; Magdalini Mori; Kurt Stockinger

Querying large data warehouses is very hard for non-tech savvy business users. Deep technical knowledge of both SQL as well as the schema of the database is required in order to build correct queries and to come up with new business insights. In this paper we introduce a novel system called SODA (Search Over DAta Warehouse) that bridges the gap between the business world and the IT world by enabling extended keyword search in a data warehouse. SODA uses metadata information, DBpedia entries as well as base data to generate SQL to allow intuitive exploration of the data. The process of query classification, query graph generation and SQL generation is visualized to provide the analysts with information on how the query results are produced. Experiments with real data of a global financial institution comprising around 300 tables showed promising results.

international conference on data engineering | 2012

The Credit Suisse Meta-data Warehouse

Claudio Jossen; Lukas Blunschi; Magdalini Mori; Donald Kossmann; Kurt Stockinger

This paper describes the meta-data warehouse of Credit Suisse that is productive since 2009. Like most other large organizations, Credit Suisse has a complex application landscape and several data warehouses in order to meet the information needs of its users. The problem addressed by the meta-data warehouse is to increase the agility and flexibility of the organization with regards to changes such as the development of a new business process, a new business analytics report, or the implementation of a new regulatory requirement. The meta-data warehouse supports these changes by providing services to search for information items in the data warehouses and to extract the lineage of information items. One difficulty in the design of such a meta-data warehouse is that there is no standard or well-known meta-data model that can be used to support such search services. Instead, the meta-data structures need to be flexible themselves and evolve with the changing IT landscape. This paper describes the current data structures and implementation of the Credit Suisse meta-data warehouse and shows how its services help to increase the flexibility of the whole organization. A series of example meta-data structures, use cases, and screenshots are given in order to illustrate the concepts used and the lessons learned based on feedback of real business and IT users within Credit Suisse.

international conference on data engineering | 2008

Adding structure to web search with itrails [position paper]

M.A. Vaz Salles; Jens Dittrich; Lukas Blunschi

We would like to discuss with workshop participants the iTrails framework for pay-as-you-go information integration, which was recently presented at VLDB 2007 (M. A. V. Salles et al., 2007). iTrails allows users to provide mini-mappings on their data that sharply increase the quality of search results. The core idea is to extend the semantics of a standard graphical search engine such that the quality of search results approaches the quality of a full-blown information integration system. In contrast to (M. A. V. Salles et al., 2007), this paper shows how iTrails can be used to tackle the challenges of adding structured information support to web search engines. We will show how iTrails enriches a web search engine with a powerful query rewriting mechanism enabling this engine to perform not only search but also integration of structured information.

very large data bases | 2007