Roman Schmidt

École Polytechnique Fédérale de Lausanne

Publication

Featured research published by Roman Schmidt.

international conference on management of data | 2003

P-Grid: a self-organizing structured P2P system

Karl Aberer; Philippe Cudré-Mauroux; Anwitaman Datta; Zoran Despotovic; Manfred Hauswirth; Magdalena Punceva; Roman Schmidt

1 Self-organizing Structured P2P Systems

In the P2P community a fundamental distinction is made between unstructured and structured P2P systems for resource location. In unstructured P2P systems, peers are in principle unaware of the resources that neighboring peers in the overlay network maintain. Typically they resolve search requests by flooding techniques. Gnutella [9] is the most prominent example of this class. In contrast, in structured P2P systems peers maintain information about what resources neighboring peers offer. Thus queries can be directed and in consequence substantially fewer messages are needed. This comes at the cost of increased maintenance effort during changes in the overlay network as a result of peers joining or leaving. The most prominent class of approaches to structured P2P systems are distributed hash tables (DHT), for example Chord [17]. Unstructured P2P systems have generated substantial interest because of emergent global-scale phenomena. For example, the Gnutella overlay network exhibits the following characteristics [15]:

1. The network has a small diameter, which ensures that a message flooding approach for search works with a relatively low time-to-live (approximately 7).
2. The node degrees of the overlay network follow a power-law distribution. Thus few peers have a large number of incoming links whereas most peers have a very low number of such links.

These properties result from the way Gnutella performs network maintenance: each peer maintains a fixed number of active links. Using the network maintenance protocol a peer discovers new peers in the network by flooding discovery

IEEE Internet Computing | 2002

Improving data access in P2P systems

Karl Aberer; Magdalena Punceva; Manfred Hauswirth; Roman Schmidt

The authors present Gridella, a Gnutella-compatible P2P system. Gridella is based on the Peer-Grid (P-Grid) approach, which draws on research in distributed and cooperative information systems to provide a decentralized, scalable data access structure. Gridella improves the highly chaotic and inefficient Gnutella infrastructure with directed search and advanced concepts, thus enhancing efficiency and providing a model for further analysis and research.

international conference on peer-to-peer computing | 2005

Range queries in trie-structured overlays

Anwitaman Datta; Manfred Hauswirth; Renault John; Roman Schmidt; Karl Aberer

Among the open problems in P2P systems, support for nontrivial search predicates, standardized query languages, distributed query processing, query load balancing, and quality of query results have been identified as some of the most relevant issues. This paper describes how range queries as an important nontrivial search predicate can be supported in a structured overlay network that provides O(log n) search complexity on top of a trie abstraction. We provide analytical results that show that the proposed approach is efficient, supports arbitrary granularity of ranges, and demonstrate that its algorithmic complexity in terms of messages is independent of the size of the queried ranges and only depends on the size of the result set. In contrast to other systems which provide evaluation results only through simulations, we validate the theoretical analysis of the algorithms with large-scale experiments on the PlanetLab infrastructure using a fully-fledged implementation of our approach.

international conference on data engineering | 2007

UniStore: Querying a DHT-based Universal Storage

Marcel Karnstedt; Kai-Uwe Sattler; Martin Richtarsky; Jessica Muller; Manfred Hauswirth; Roman Schmidt; Renault John

The idea of collecting and combining large public data sets and services has become increasingly popular. The special characteristics of such systems and the requirements of the participants demand strictly decentralized solutions. However, this comes with several ambitious challenges that a corresponding system has to overcome. In this demonstration paper, we present a lightweight distributed universal storage capable of dealing with those challenges and providing a powerful and flexible way of building Internet-scale public data management systems. We introduce our approach, which uses a triple storage on top of a distributed hash table (DHT) overlay system and builds on the ideas of a universal relation model and the resource description framework (RDF), and outline solved challenges as well as open issues.

database and expert systems applications | 2005

An Overlay Network for Resource Discovery in Grids

Manfred Hauswirth; Roman Schmidt

As grids try to achieve optimal and balanced utilization of unused resources in a distributed system, fast and efficient discovery of resource states is a key requirement. For small- to medium-scale grids, solutions such as the approach in Globus work fine. However, for large, up to global-scale grids, this approach is not efficient and does not scale. Additionally, even for smaller grids, a centralized solution is always a performance bottleneck and a single point of failure. In this paper we investigate the applicability of a structured peer-to-peer system (overlay network) for the discovery of grid resources. Each node in the grid becomes a peer in the overlay network, which provides a distributed directory service that allows the participants to discover resources and maintain resource states. Overlay networks implicitly balance load, scale well to very large numbers in terms of nodes and data, and meet the partial failure property of distributed systems, i.e., the system remains operational despite partial failures. We outline a proof-of-concept implementation based on our P-Grid overlay network, present experimental results from a large-scale deployment on PlanetLab, and discuss the pros and cons of overlay networks in the context of grids.

cluster computing and the grid | 2007

Query-load balancing in structured overlays

Anwitaman Datta; Roman Schmidt; Karl Aberer

Query-load (forwarding and answering) balancing in structured overlays is one of the most critical and least studied problems. It has been assumed that caching heuristics can take care of it. We show that caching, while necessary, is not in itself sufficient. We then provide simple and effective load-aware variants of the standard greedy routing used in overlays, exploiting routing redundancy originally needed for fault tolerance, to achieve very good query-load balancing.

international conference on data engineering | 2006

Similarity Queries on Structured Data in Structured Overlays

Marcel Karnstedt; Kai-Uwe Sattler; Manfred Hauswirth; Roman Schmidt

Structured P2P systems based on distributed hash tables are a popular choice for building large-scale data management systems. Generally, they only support exact match queries, but data heterogeneities often demand more complex query types, particularly similarity queries. In this work, we suggest a vertical data organization, which allows for efficient processing of similarity queries on instance as well as on schema level, and we introduce corresponding physical similarity operators. Our novel approach is shown to be suitable in conjunction with P-Grid, as an example of robust, large-scale and self-organizing P2P systems.

international database engineering and applications symposium | 2008

A DHT-based infrastructure for ad-hoc integration and querying of semantic data

Marcel Karnstedt; Kai-Uwe Sattler; Manfred Hauswirth; Roman Schmidt

A crucial prerequisite for the deployment and success of Peer-to-Peer data management applications is the availability of metadata in a way that makes it easy to access and combine data from different sources and domains. In this paper, we argue for a unified and distributed infrastructure providing a repository for semantic data by offering location transparency and advanced query services. After discussing the challenges of such an approach, we present our solution which applies extended SPARQL-like query features for dealing with large and possibly heterogeneous data sets. We focus on the integration into efficient distributed query processing and evaluate our approach in a series of experiments.

conference on information and knowledge management | 2008

Estimating the number of answers with guarantees for structured queries in p2p databases

Marcel Karnstedt; Kai-Uwe Sattler; Michael Haß; Manfred Hauswirth; Brahmananda Sapkota; Roman Schmidt

Structured P2P overlays supporting standard database functionalities are a popular choice for building large-scale distributed data management systems. In such systems, estimating the number of answers for structured queries can help approximate query completeness, but is especially challenging. In this paper, we propose to use routing graphs in order to achieve this. We introduce the general approach and briefly discuss further aspects like overhead and guarantees.

international conference on peer-to-peer computing | 2006

Cost-Aware Processing of Similarity Queries in Structured Overlays

Marcel Karnstedt; Kai-Uwe Sattler; Manfred Hauswirth; Roman Schmidt

Large-scale distributed data management with P2P systems requires similarity operators for queries, as we cannot assume that all users agree on exactly the same schema and value representations, and data quality problems arise from spelling errors and typos. In this paper, we present an approach for efficient processing of similarity selections and joins in a structured overlay. We show that there are several possible strategies exploiting DHT features to a different extent (i.e., key organization, routing, multicasting), and thus the choice of the best operator implementation in a given situation (selectivity, data distribution, load) should be based on cost information allowing the system to estimate the computation and communication costs of query execution plans. Hence, we present a cost model for similarity operations on structured data in a DHT and demonstrate the efficiency of our proposal by experimental results from a large-scale PlanetLab deployment.
