
Fabius Klemm

École Polytechnique Fédérale de Lausanne

Publication

Featured research published by Fabius Klemm.

international conference on data engineering | 2007

Scalable Peer-to-Peer Web Retrieval with Highly Discriminative Keys

Ivana Podnar; Martin Rajman; Toan Luu; Fabius Klemm; Karl Aberer

The suitability of peer-to-peer (P2P) approaches for full-text Web retrieval has recently been questioned because of the claimed unacceptable bandwidth consumption induced by retrieval from very large document collections. In this contribution we formalize a novel indexing/retrieval model that achieves high-performance, cost-efficient retrieval by indexing with highly discriminative keys (HDKs) stored in a distributed global index maintained in a structured P2P network. HDKs correspond to carefully selected terms and term sets appearing in a small number of collection documents. We provide a theoretical analysis of the scalability of our retrieval model and report experimental results obtained with our HDK-based P2P retrieval engine. These results show that, despite increased indexing costs, the total traffic generated with the HDK approach is significantly smaller than that obtained with distributed single-term indexing strategies. Furthermore, our experiments show that the retrieval performance obtained with a random set of real queries is comparable to that of a centralized, single-term solution using the best state-of-the-art BM25 relevance computation scheme. Finally, our scalability analysis demonstrates that the HDK approach can scale to large networks of peers indexing Web-size document collections, thus opening the way towards viable, truly-decentralized Web retrieval.
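The idea of indexing with keys that appear in only a few documents can be illustrated with a small sketch. It is not the engine described in the abstract: the function name, the `df_max` and `max_size` parameters, the whitespace tokenization, and the rule that a term set is skipped when one of its subsets is already discriminative are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def proper_subsets(key):
    """Yield every non-empty proper subset of a key (a frozenset of terms)."""
    for size in range(1, len(key)):
        for combo in combinations(sorted(key), size):
            yield frozenset(combo)


def select_discriminative_keys(docs, df_max, max_size=2):
    """Return {key: doc ids} for term sets found in at most df_max documents.

    docs maps a document id to its text. A key is a frozenset of terms.
    Assumption: a term set is dropped if one of its subsets was already
    selected, so only minimal discriminative keys are kept.
    """
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        terms = sorted(set(text.lower().split()))
        for size in range(1, max_size + 1):
            for combo in combinations(terms, size):
                postings[frozenset(combo)].add(doc_id)

    selected = {}
    # Visit smaller keys first so that subsets are decided before supersets.
    for key in sorted(postings, key=len):
        if len(postings[key]) > df_max:
            continue  # too frequent: not discriminative
        if any(sub in selected for sub in proper_subsets(key)):
            continue  # a smaller key already covers these documents
        selected[key] = postings[key]
    return selected


docs = {
    1: "peer network index",
    2: "peer network search",
    3: "peer index search",
}
keys = select_discriminative_keys(docs, df_max=1)
```

With `df_max=1`, no single term in this toy collection is rare enough, so only two-term keys such as `{network, index}` are selected. Raising `df_max` shifts the selection towards single terms and shrinks the number of keys.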

ad hoc networks | 2005

Improving TCP performance in ad hoc networks using signal strength based link management

Fabius Klemm; Zhenqiang Ye; Srikanth V. Krishnamurthy; Satish K. Tripathi

Mobility in ad hoc networks causes frequent link failures, which in turn cause packet losses. TCP attributes these packet losses to congestion. This incorrect inference results in frequent TCP retransmission timeouts and therefore a degradation in TCP performance even at light loads. We propose mechanisms that are based on signal strength measurements to alleviate such packet losses due to mobility. Our key ideas are (a) if the signal strength measurements indicate that a link failure is most likely due to a neighbor moving out of range, in reaction, facilitate the use of temporary higher transmission power to keep the link alive and, (b) if the signal strength measurements indicate that a link is likely to fail, initiate a route re-discovery proactively before the link actually fails. We make changes at the MAC and the routing layers to predict link failures and estimate if a link failure is due to mobility. We also propose a simple mechanism at the MAC layer that can help alleviate false link failures, which occur due to congestion when the IEEE 802.11 MAC protocol is used. We compare the above proactive and reactive schemes and also demonstrate the benefits of using them together and along with our MAC layer extension. We show that, in high mobility, the goodput of a TCP session can be improved by as much as 75% at light loads (when there is only one TCP session in the network) when our methods are incorporated. When the network is heavily loaded (i.e., there are multiple TCP sessions in the network), the proposed schemes can improve the aggregate goodput of the TCP sessions by about 14-30%, on average.
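The two ideas (a) and (b) can be read as a small decision procedure over recent signal strength samples from a neighbor. The sketch below is only one possible reading. The class name, the dBm thresholds, the window size, and the action labels are hypothetical and are not taken from the paper.

```python
from collections import deque


class LinkMonitor:
    """Tracks recent received signal strength (dBm) for one neighbor.

    All thresholds are illustrative assumptions, not values from the paper.
    """

    def __init__(self, rx_threshold_dbm=-85.0, warn_margin_db=5.0, window=5):
        self.warn_level = rx_threshold_dbm + warn_margin_db
        self.samples = deque(maxlen=window)

    def record(self, dbm):
        self.samples.append(dbm)

    def is_weak(self):
        """True if the latest sample is close to the receive threshold."""
        return bool(self.samples) and self.samples[-1] < self.warn_level

    def likely_to_fail(self):
        """True if the signal is weak and has dropped over the window."""
        return (
            len(self.samples) >= 2
            and self.is_weak()
            and self.samples[-1] < self.samples[0]
        )


def react(monitor, link_failed):
    """Pick an action label for a link, given its monitor and failure state."""
    if link_failed:
        if monitor.is_weak():
            # Idea (a): failure attributed to mobility, try higher power.
            return "boost_tx_power"
        # Strong signal before the failure: probably congestion.
        return "treat_as_false_link_failure"
    if monitor.likely_to_fail():
        # Idea (b): look for a new route before the link breaks.
        return "proactive_route_discovery"
    return "no_action"


fading = LinkMonitor()
for dbm in (-70, -74, -78, -81, -83):
    fading.record(dbm)
action = react(fading, link_failed=False)
```

Here a steadily fading neighbor triggers the proactive branch, while a failure reported on a link with strong recent samples is treated as a false link failure rather than a sign of mobility.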

information retrieval in peer to peer networks | 2006

ALVIS peers: a scalable full-text peer-to-peer retrieval engine

Toan Luu; Fabius Klemm; Ivana Podnar; Martin Rajman; Karl Aberer

We present Alvis peers, a full-text P2P retrieval engine designed to offer retrieval performance comparable to centralized solutions while scaling to a very large number of peers. It is the result of our research efforts within the Alvis project (European FP6 STREP project ALVIS, http://www.alvis.info/), which aims at building a truly-distributed semantic search engine. To cope with the problem of unscalable bandwidth consumption in the P2P network, the engine implements a novel retrieval model that indexes highly-discriminative keys (HDKs), i.e., terms and term sets appearing in a limited number of collection documents. Our prototype is a fully-functional retrieval engine built over a structured P2P network. It includes a component for HDK-based indexing and retrieval, and a distributed content-based ranking module. Such an integrated system represents a substantial contribution to the design and development of realistic P2P retrieval systems.

very large data bases | 2008

AlvisP2P: scalable peer-to-peer text retrieval in a structured P2P network

Toan Luu; Gleb Skobeltsyn; Fabius Klemm; Maroje Puh; Ivana Podnar Žarko; Martin Rajman; Karl Aberer

In this paper we present the AlvisP2P IR engine, which enables efficient retrieval with multi-keyword queries from a global document collection available in a P2P network. In such a network, each peer publishes its local index and invests a part of its local computing resources (storage, CPU, bandwidth) to maintain a fraction of a global P2P index. This investment is rewarded by the network-wide accessibility of the local documents via the global search facility. The AlvisP2P engine uses an optimized overlay network and relies on novel indexing/retrieval mechanisms that ensure low bandwidth consumption, thus enabling unlimited network growth. Our demonstration shows how an easy-to-install AlvisP2P client can be used to join an existing P2P network, index local (text or even multimedia) documents with collection-specific indexing mechanisms, and control access rights to them.

european conference on research and advanced technology for digital libraries | 2006

A peer-to-peer architecture for information retrieval across digital library collections

Ivana Podnar; Toan Luu; Martin Rajman; Fabius Klemm; Karl Aberer

Peer-to-peer networks have been identified as a promising architectural concept for developing search scenarios across digital library collections. Digital libraries typically offer sophisticated search over their local content; however, search methods involving a network of such stand-alone components are currently quite limited. We present an architecture for highly-efficient search over digital library collections based on structured P2P networks. As the standard single-term indexing strategy faces significant scalability limitations in distributed environments, we propose a novel indexing strategy: key-based indexing. The keys are term sets that appear in a restricted number of collection documents. Thus, they are discriminative with respect to the global document collection, and ensure scalable search costs. Moreover, key-based indexing computes posting list joins during indexing time, which significantly improves query performance. As search-efficient solutions usually imply costly indexing procedures, we present experimental results that show acceptable indexing costs while the retrieval performance is comparable to the standard centralized solutions with TF-IDF ranking.

network computing and applications | 2006

Congestion Control for Distributed Hash Tables

Fabius Klemm; J.-Y. Le Boudec; Karl Aberer

Distributed hash tables (DHTs) provide a scalable mechanism for mapping identifiers to socket addresses. As each peer in the network can initiate lookup requests, a DHT has to process concurrently a potentially very large number of requests. In this paper, we look at congestion control for DHTs. Our goal is to control the flow of lookup requests that are routed in the overlay network. We first show that congestion control is essential for certain applications with high lookup rates. We then present two congestion control mechanisms for DHTs and compare their performance in different network conditions.

international conference on peer-to-peer computing | 2007

On Routing in Distributed Hash Tables

Fabius Klemm; Sarunas Girdzijauskas; J.-Y. Le Boudec; Karl Aberer

There have been many proposals for constructing routing tables for distributed hash tables (DHTs). They can be classified into two groups: A) those that assume that the peers are uniformly randomly distributed in the identifier space, and B) those that allow order-preserving hash functions that lead to a skewed peer distribution in the identifier space. Good solutions for group A have been known for many years. However, DHTs in group A are limited to randomized hashing, and queries over whole identifier ranges therefore do not scale. Group B can handle such queries easily. However, it is more difficult to connect the peers such that the resulting topology provides efficient routing, small routing tables, and balanced routing load. We present an elegant new solution to construct an efficient DHT for group B. Our main idea is to decouple the identifier space from the routing topology. In consequence, our DHT allows arbitrarily skewed peer distributions in the identifier space and does not require the overhead of sampling. Furthermore, the table construction is cheap and does not require active replacement of lost routing entries. To evaluate the performance of routing cost and table construction under high churn, we built an efficient simulator. Using the right data structures, we can easily process the state of over one million peers in RAM.

The recent desktop versions of Windows (XP SP2 and Vista) include a Peer-to-Peer (P2P) infrastructure that simplifies the development and deployment of true P2P applications. We have ported the latest version of this infrastructure to Windows Embedded CE 5.0, which is the underlying OS for Windows Mobile 5.0. To our knowledge, this is the first native implementation of a P2P infrastructure for Windows Mobile. This paper provides a short overview of the infrastructure and design considerations when running P2P applications on mobile phones. For demonstration purposes, we have developed a community-based photo sharing and chat application that solely uses the P2P infrastructure for communication. We will demonstrate the ease of creating P2P communities in an ad-hoc manner, and the interoperability between Windows Mobile and Windows Vista.

extending database technology | 2004

A query-adaptive partial distributed hash table for peer-to-peer systems

Fabius Klemm; Anwitaman Datta; Karl Aberer

The two main approaches to find data in peer-to-peer (P2P) systems are unstructured networks using flooding and structured networks using a distributed index. A distributed index is usually built over all keys that are stored in the network, whether they are queried or not. Indexing all keys is no longer feasible when indexing metadata, as the key space becomes very large. Here we need a query-adaptive approach that indexes only keys worth indexing, i.e., keys that are queried at least with a certain frequency. In this paper we study the cost of indexing and propose a query-adaptive partial distributed hash table (PDHT) that does not keep all keys in the index. We model and analyze a scenario to show that query-adaptive partial indexing outperforms pure flooding and “index-everything” strategies. Furthermore, our scheme is able to automatically adjust the index to changing query frequencies and distributions.
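A query-adaptive partial index can be sketched as a cache in front of a flooding search, where only keys that keep being queried stay indexed. The sketch below assumes a time-to-live (TTL) policy as the way to express "queried at least with a certain frequency". The class name, the `flood` callback, and the logical clock passed as `now` are hypothetical, and the index is a local dictionary standing in for a distributed one.

```python
class PartialIndex:
    """Index only keys that are queried at least once per `ttl` time units.

    `flood` is a hypothetical callback standing in for an expensive
    broadcast search; it returns the value for a key or None.
    """

    def __init__(self, ttl, flood):
        self.ttl = ttl
        self.flood = flood
        self.entries = {}  # key -> (value, expiry time)
        self.flood_count = 0

    def lookup(self, key, now):
        entry = self.entries.get(key)
        if entry is not None and entry[1] > now:
            value = entry[0]  # index hit: no flooding needed
        else:
            value = self.flood(key)  # index miss: fall back to flooding
            self.flood_count += 1
        if value is not None:
            # Insert or refresh, so frequently queried keys stay indexed.
            self.entries[key] = (value, now + self.ttl)
        return value

    def expire(self, now):
        """Drop entries that were not queried within the last ttl."""
        stale = [k for k, (_, expiry) in self.entries.items() if expiry <= now]
        for key in stale:
            del self.entries[key]


store = {"a": 1, "b": 2}
index = PartialIndex(ttl=10, flood=store.get)
first = index.lookup("a", now=0)    # miss: flooded, then indexed
second = index.lookup("a", now=5)   # hit: served from the index
```

A key queried more often than once per `ttl` is flooded only once, while a rarely queried key is flooded on every lookup and never occupies the index for long. Changing query patterns are followed automatically because entries expire on their own.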

Lecture Notes in Computer Science | 2003

Alleviating Effects of Mobility on TCP Performance in Ad Hoc Networks Using Signal Strength Based Link Management

Fabius Klemm; Srikanth V. Krishnamurthy; Satish K. Tripathi

Mobility in ad hoc networks causes link failures, which in turn result in packet losses. TCP attributes these losses to congestion. This results in frequent TCP retransmission timeouts and degradation in TCP performance even at light loads. We propose mechanisms that are based on signal strength measurements to alleviate such packet losses due to mobility at light loads. Our key ideas are (a) if the signal strength measurements indicate that a link failure is most likely due to a neighbor moving out of range, in reaction, facilitate the use of temporary high power to re-establish the link and, (b) if the signal strength measurements indicate that a link is likely to fail, initiate a route re-discovery proactively before the link actually fails. We make changes at the MAC and the routing layers to predict link failures and estimate if a link failure is due to mobility. We also propose a simple mechanism that can help alleviate false link failures that occur due to congestion when the IEEE 802.11 MAC protocol is used. We compare the above proactive and reactive schemes and also demonstrate the benefits of using them together. We show that, in high mobility, the performance of a TCP session can increase by as much as 45 percent when our methods are incorporated.

databases information systems and peer to peer computing | 2005

Aggregation of a term vocabulary for P2P-IR: a DHT stress test

Fabius Klemm; Karl Aberer

There has been an increasing research interest in developing full-text retrieval based on peer-to-peer (P2P) technology. So far, these research efforts have largely concentrated on efficiently distributing an index. However, ranking of the results retrieved from the index is a crucial part in information retrieval. To determine the relevance of a document to a query, ranking algorithms use collection-wide statistics. Term frequency - inverse document frequency (TF-IDF), for example, is based on frequencies of documents containing a given term in the whole collection. Such global frequencies are not readily available in a distributed system. In this paper, we study the feasibility of aggregating global frequencies for a large term vocabulary in a P2P setting. We use a distributed hash table (DHT) for our analysis. Traditional applications of DHTs, such as file sharing, index keys in the order of tens of thousands. Aggregation of a vocabulary consisting of millions of terms poses extreme requirements to a DHT implementation. We study different aggregation strategies and propose optimizations to DHTs to efficiently process large numbers of keys.
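Aggregating global document frequencies with a DHT can be pictured as follows: each peer counts document frequencies locally, sends each count to the peer responsible for that term, and the responsible peer sums what it receives. The sketch below simulates this in a single process. The hash-modulo placement, the per-destination batching, and all function names are assumptions for illustration and are not the strategies evaluated in the paper.

```python
import hashlib
import math
from collections import Counter, defaultdict


def responsible_peer(term, num_peers):
    """Map a term to a peer index with a stable hash (simplified placement)."""
    digest = hashlib.sha1(term.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_peers


def aggregate_document_frequencies(local_collections):
    """Simulate aggregation; return (per-peer global counts, message count).

    local_collections[i] is the list of document texts held by peer i.
    Assumption: counts bound for the same peer travel in one batched message.
    """
    num_peers = len(local_collections)
    global_df = [Counter() for _ in range(num_peers)]
    messages = 0
    for docs in local_collections:
        local_df = Counter()
        for doc in docs:
            local_df.update(set(doc.lower().split()))  # one count per document
        batches = defaultdict(dict)
        for term, df in local_df.items():
            batches[responsible_peer(term, num_peers)][term] = df
        for dest, batch in batches.items():
            global_df[dest].update(batch)  # responsible peer adds the counts
            messages += 1
    return global_df, messages


def lookup_df(term, global_df):
    """Ask the responsible peer for the global document frequency of a term."""
    return global_df[responsible_peer(term, len(global_df))][term]


def idf(term, global_df, total_docs):
    """Inverse document frequency from the aggregated counts."""
    df = lookup_df(term, global_df)
    return math.log(total_docs / df) if df else 0.0


collections = [["peer index", "peer search"], ["index search", "peer"]]
global_df, messages = aggregate_document_frequencies(collections)
peer_df = lookup_df("peer", global_df)
```

With per-destination batching, the number of messages is bounded by the square of the number of peers regardless of vocabulary size, which is one reason batching matters once the vocabulary reaches millions of terms.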
