Stratos Papadomanolakis

Publication

Featured research published by Stratos Papadomanolakis.

statistical and scientific database management | 2004

AutoPart: automating schema design for large scientific databases using data partitioning

Stratos Papadomanolakis; Anastassia Ailamaki

Database applications that use multi-terabyte datasets are becoming increasingly important for scientific fields such as astronomy and biology. Scientific databases are particularly suited for the application of automated physical design techniques, because of their data volume and the complexity of the scientific workloads. Current automated physical design tools focus on the selection of indexes and materialized views. In large-scale scientific databases, however, the data volume and the continuous insertion of new data allow for only limited indexes and materialized views. By contrast, data partitioning does not replicate data, thereby reducing space requirements and minimizing update overhead. In this paper we present AutoPart, an algorithm that automatically partitions database tables to optimize sequential access, assuming prior knowledge of a representative workload. The resulting schema is indexed using a fraction of the space required for indexing the original schema. To evaluate AutoPart we built an automated schema design tool that interfaces to commercial database systems. We experiment with AutoPart in the context of the Sloan Digital Sky Survey database, a real-world astronomical database, running on SQL Server 2000. Our experiments demonstrate the benefits of partitioning for large-scale systems: partitioning alone improves query execution performance by a factor of two on average. Combined with indexes, the new schema also outperforms the indexed original schema by 20% (for queries) and a factor of five (for updates), while using only half the original index space.
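The intuition that partitioning reduces the data read by sequential scans can be sketched with a toy cost model. This is only an illustration, not the AutoPart algorithm: the column names, the byte widths, the fragment layout, and the function `bytes_per_row` are all hypothetical, and the row key needed to rejoin fragments is left out for simplicity.

```python
# Hypothetical column widths in bytes (not taken from the SDSS schema).
COLUMN_WIDTHS = {"ra": 8, "dec": 8, "mag_r": 4, "flags": 4, "spectrum": 4000}

# One fragment holding every column models the original, unpartitioned table.
UNPARTITIONED = [["ra", "dec", "mag_r", "flags", "spectrum"]]

# A hypothetical vertical partitioning into three disjoint column groups.
PARTITIONED = [["ra", "dec"], ["mag_r", "flags"], ["spectrum"]]


def bytes_per_row(query_columns, fragments):
    """Bytes a sequential scan reads per row.

    A fragment must be read in full if it contains at least one
    column that the query references.
    """
    needed = set(query_columns)
    total = 0
    for fragment in fragments:
        if needed & set(fragment):
            total += sum(COLUMN_WIDTHS[col] for col in fragment)
    return total


if __name__ == "__main__":
    query = ["ra", "dec"]  # a positional query that never touches the spectrum
    print(bytes_per_row(query, UNPARTITIONED))
    print(bytes_per_row(query, PARTITIONED))
```

Under these assumed widths, a query that touches only the positional columns reads a small fragment instead of the full row. A query that spans several fragments pays for each fragment it touches, which is why the choice of fragments depends on the workload.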

international conference on management of data | 2006

Efficient query processing on unstructured tetrahedral meshes

Stratos Papadomanolakis; Anastassia Ailamaki; Julio Lopez; Tiankai Tu; David R. O'Hallaron; Gerd Heber

Modern scientific applications such as fluid dynamics and earthquake modeling depend heavily on massive volumes of data produced by computer simulations. Such applications require new data management capabilities in order to scale to terabyte-scale data volumes. The most common way to discretize the application domain is to decompose it into pyramids, forming an unstructured tetrahedral mesh. Modern simulations generate meshes of high resolution and precision, to be queried by a visualization or analysis tool. Tetrahedral meshes are extremely flexible and therefore vital to accurately model complex geometries, but are also difficult to index. To reduce query execution time, applications either use only subsets of the data or rely on different (less flexible) structures, thereby trading accuracy for speed. This paper presents efficient indexing techniques for common spatial (point and range) queries on tetrahedral meshes. Because the prevailing multidimensional indexing techniques attempt to approximate the tetrahedra using simpler shapes (primarily rectangles), the query performance deteriorates significantly as a function of the mesh's geometric complexity. We develop Directed Local Search (DLS), an efficient indexing algorithm based on mesh topology information that is practically insensitive to the geometric properties of meshes. We show how DLS can be easily and efficiently implemented within modern DBMSs without requiring new exotic index structures or complex preprocessing. Finally, we present a new data layout approach for tetrahedral mesh datasets that provides better performance for scientific applications compared to the traditional space-filling curves. In our PostgreSQL implementation DLS reduces the number of disk page accesses by 26% to 4x, and improves the overall query execution time by 25% to 4x.
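The general idea of answering a point query by following mesh connectivity, rather than by approximating elements with bounding rectangles, can be sketched as a greedy walk over neighboring vertices. This is a simplified 2D illustration and not the DLS algorithm from the paper: the mesh, the names `VERTICES`, `NEIGHBORS`, and `greedy_walk`, and the distance-based stepping rule are all assumptions made for the example.

```python
import math

# Hypothetical tiny 2D mesh: vertex id -> (x, y) coordinates.
VERTICES = {
    0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0),
    3: (0.0, 1.0), 4: (1.0, 1.0), 5: (2.0, 1.0),
}

# Hypothetical connectivity: vertex id -> ids of vertices sharing an edge.
NEIGHBORS = {
    0: [1, 3, 4], 1: [0, 2, 4, 5], 2: [1, 5],
    3: [0, 4], 4: [0, 1, 3, 5], 5: [1, 2, 4],
}


def greedy_walk(start, target):
    """Walk from `start` toward `target` using only adjacency information.

    At each step, move to the neighbor closest to the target and stop
    when no neighbor is strictly closer than the current vertex.
    Returns the list of visited vertex ids.
    """
    current = start
    path = [current]
    while True:
        best = min(NEIGHBORS[current],
                   key=lambda v: math.dist(VERTICES[v], target))
        if math.dist(VERTICES[best], target) >= math.dist(VERTICES[current], target):
            return path
        current = best
        path.append(current)


if __name__ == "__main__":
    print(greedy_walk(0, (1.9, 0.9)))
```

A purely greedy walk like this one can stall at a local minimum on non-convex meshes, so it should be read only as a picture of topology-driven search, not as a robust point-location method.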

international conference on data engineering | 2007

An Integer Linear Programming Approach to Database Design

Stratos Papadomanolakis; Anastassia Ailamaki

Existing index selection tools rely on heuristics to efficiently search within the large space of alternative solutions and to minimize the overhead of using the query optimizer for cost estimation. Index selection heuristics, despite being practical, are hard to analyze, and it is difficult to formally compute how close they get to the optimal solution. In this paper we propose a model for index selection based on integer linear programming (ILP). The ILP formulation enables a wealth of combinatorial optimization techniques for providing quality guarantees, approximate solutions and even for computing optimal solutions. We present a system architecture for ILP-based index selection, in the context of commercial database systems. Our ILP-based approach offers higher solution quality, efficiency and scalability without sacrificing any of the precision offered by existing index selection tools.
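To make the idea of index selection as a 0/1 optimization problem concrete, the following sketch enumerates every subset of candidate indexes on a toy instance and keeps the cheapest one that fits a storage budget. It is not the paper's formulation: the index names, sizes, query costs, budget, and the assumption that each query uses at most one index are hypothetical, and exhaustive enumeration stands in for a real ILP solver.

```python
from itertools import product

# Hypothetical candidate indexes and their storage sizes (arbitrary units).
INDEX_SIZE = {"idx_a": 4, "idx_b": 3, "idx_c": 5}

# Hypothetical cost of each query with no index available.
BASE_COST = {"q1": 100, "q2": 80}

# Hypothetical cost of each query when it uses a given index.
COST_WITH_INDEX = {
    "q1": {"idx_a": 40, "idx_c": 30},
    "q2": {"idx_b": 50, "idx_c": 60},
}

BUDGET = 8  # hypothetical storage budget


def workload_cost(chosen):
    """Total workload cost when each query picks its cheapest available plan."""
    total = 0
    for query, base in BASE_COST.items():
        options = [base] + [cost for idx, cost in COST_WITH_INDEX[query].items()
                            if idx in chosen]
        total += min(options)
    return total


def best_configuration():
    """Try every 0/1 assignment of the index variables within the budget."""
    names = sorted(INDEX_SIZE)
    best = None
    for bits in product((0, 1), repeat=len(names)):
        chosen = {name for name, bit in zip(names, bits) if bit}
        if sum(INDEX_SIZE[name] for name in chosen) > BUDGET:
            continue  # violates the storage constraint
        cost = workload_cost(chosen)
        if best is None or cost < best[0]:
            best = (cost, chosen)
    return best


if __name__ == "__main__":
    print(best_configuration())
```

Enumeration is exponential in the number of candidate indexes, which is the practical reason to hand a problem of this shape to a dedicated solver once the instance grows beyond a handful of candidates.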

database systems for advanced applications | 2007

A workload-driven unit of cache replacement for mid-tier database caching

Xiaodan Wang; Tanu Malik; Randal C. Burns; Stratos Papadomanolakis; Anastassia Ailamaki

Making multi-terabyte scientific databases publicly accessible over the Internet is increasingly important in disciplines such as Biology and Astronomy. However, contention at a centralized, backend database is a major performance bottleneck, limiting the scalability of Internet-based database applications. Mid-tier caching reduces contention at the backend database by distributing database operations to the cache. To improve the performance of mid-tier caches, we propose the caching of query prototypes, a workload-driven unit of cache replacement in which the cache object is chosen from various classes of queries in the workload. In existing mid-tier caching systems, the storage organization in the cache is statically defined. Our approach adapts cache storage to workload changes, requires no prior knowledge about the workload, and is transparent to the application. Experiments over a one-month, 1.4 million query Astronomy workload demonstrate up to a 70% reduction in network traffic and a reduction in query response time by up to a factor of three when compared with alternative units of cache replacement.

international conference on data engineering | 2007

MultiMap: Preserving disk locality for multidimensional datasets

Minglong Shao; Steven W. Schlosser; Stratos Papadomanolakis; Jiri Schindler; Anastassia Ailamaki; Gregory R. Ganger

MultiMap is an algorithm for mapping multidimensional datasets so as to preserve the data's spatial locality on disks. Without revealing disk-specific details to applications, MultiMap exploits modern disk characteristics to provide full streaming bandwidth for one (primary) dimension and maximally efficient non-sequential access (i.e., minimal seek and no rotational latency) for the other dimensions. This is in contrast to existing approaches, which either severely penalize non-primary dimensions or fail to provide full streaming bandwidth for any dimension. Experimental evaluation of a prototype implementation demonstrates MultiMap's superior performance for range and beam queries. On average, MultiMap reduces total I/O time by over 50% when compared to traditional linearized layouts and by over 30% when compared to space-filling curve approaches such as Z-ordering and Hilbert curves. For scans of the primary dimension, MultiMap and traditional linearized layouts provide almost two orders of magnitude higher throughput than space-filling curve approaches.
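The two baseline layouts named in the abstract can be compared with a small sketch. It shows a row-major (linearized) mapping and a 2D Z-order (Morton) mapping, and does not attempt to reproduce MultiMap itself, whose mapping depends on disk characteristics not described here. The function names and the 4x4 grid are assumptions made for the example.

```python
def row_major(x, y, width):
    """Linearized layout: cells along x (the primary dimension) are contiguous."""
    return y * width + x


def morton_2d(x, y, bits=16):
    """Z-order layout: interleave the bits of x (even positions) and y (odd positions)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


if __name__ == "__main__":
    width = 4
    # Scan along the primary dimension (x) at y = 0.
    print([row_major(x, 0, width) for x in range(width)])
    print([morton_2d(x, 0) for x in range(width)])
    # Scan along the other dimension (y) at x = 0.
    print([row_major(0, y, width) for y in range(width)])
    print([morton_2d(0, y) for y in range(width)])
```

In the row-major layout a scan along x touches consecutive addresses while a scan along y jumps by a full row each step. The Z-order layout spreads the jumps across both dimensions, so neither scan is fully sequential.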

file and storage technologies | 2005

On multidimensional data and modern disks

Steven W. Schlosser; Jiri Schindler; Stratos Papadomanolakis; Minglong Shao; Anastassia Ailamaki; Christos Faloutsos; Gregory R. Ganger

very large data bases | 2007

Efficient use of the query optimizer for automated physical design

Stratos Papadomanolakis; Debabrata Dash; Anastassia Ailamaki

very large data bases | 2007

Efficient Use of the Query Optimizer for Automated Database Design

Stratos Papadomanolakis; Debabrata Dash; Anastassia Ailamaki

international conference on management of data | 2009

Real application testing with database replay

Yujun Wang; Supiti Buranawatanachoke; Romain Colle; Karl Dias; Leonidas Galanis; Stratos Papadomanolakis; Uri Shaft

international conference on management of data | 2008

Oracle real application testing

Peter Belknap; Supiti Buranawatanachoke; Romain Colle; Benoit Dageville; Karl Dias; Leonidas Galanis; Shantanu Joshi; Jonathan D. Klein; Stratos Papadomanolakis; Uri Shaft; Leng Seow Tan; Venkateshwaran Venkataramani; Yujun Wang; Graham Wood; Khaled Yagoub; Hailing Yu
