Christian Zimmer | Researchain

Publication

Featured researches published by Christian Zimmer.

international acm sigir conference on research and development in information retrieval | 2005

Improving collection selection with overlap awareness in P2P search engines

Matthias Bender; Sebastian Michel; Peter Triantafillou; Gerhard Weikum; Christian Zimmer

Collection selection has been a research issue for years. Typically, in related work, precomputed statistics are employed in order to estimate the expected result quality of each collection, and subsequently the collections are ranked accordingly. Our thesis is that this simple approach is insufficient for several applications in which the collections typically overlap. This is the case, for example, for the collections built by autonomous peers crawling the web. We argue for the extension of existing quality measures using estimators of mutual overlap among collections and present experiments in which this combination outperforms CORI, a popular approach based on quality estimation. We outline our prototype implementation of a P2P web search engine, coined MINERVA, that allows handling large amounts of data in a distributed and self-organizing manner. We conduct experiments which show that taking overlap into account during collection selection can drastically decrease the number of collections that have to be contacted in order to reach a satisfactory level of recall, which is a great step toward the feasibility of distributed web search.
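The idea of combining a quality score with an overlap estimate can be illustrated with a small greedy sketch. This is not the MINERVA implementation: the function name `select_collections`, the use of plain document-id sets in place of compact overlap estimators, and the multiplicative "quality times novelty" score are all assumptions made for illustration.

```python
def select_collections(collections, quality, k):
    """Greedily pick up to k collections, weighting quality by estimated novelty.

    collections: dict name -> set of document ids (hypothetical stand-in for
                 whatever overlap estimator a real system would use)
    quality:     dict name -> quality score for the query (for example, from
                 a CORI-style ranking)
    """
    covered = set()          # documents already reachable via chosen collections
    chosen = []
    remaining = set(collections)

    def gain(name):
        docs = collections[name]
        # Fraction of this collection that is not yet covered.
        novelty = len(docs - covered) / len(docs) if docs else 0.0
        return quality[name] * novelty

    for _ in range(min(k, len(collections))):
        # sorted() makes tie-breaking deterministic.
        best = max(sorted(remaining), key=gain)
        chosen.append(best)
        covered |= collections[best]
        remaining.remove(best)
    return chosen


collections = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 3, 4, 5},   # almost entirely overlaps with A
    "C": {6, 7, 8},
}
quality = {"A": 0.9, "B": 0.85, "C": 0.5}
print(select_collections(collections, quality, 2))
```

A ranking by quality alone would contact A and B, which share most of their documents. The overlap-aware variant picks A and then C, because B adds only one new document once A is covered.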

conference on information and knowledge management | 2006

Discovering and exploiting keyword and attribute-value co-occurrences to improve P2P routing indices

Sebastian Michel; Matthias Bender; Nikos Ntarmos; Peter Triantafillou; Gerhard Weikum; Christian Zimmer

Peer-to-Peer (P2P) search requires intelligent decisions for query routing: selecting the best peers to which a given query, initiated at some peer, should be forwarded for retrieving additional search results. These decisions are based on statistical summaries for each peer, which are usually organized on a per-keyword basis and managed in a distributed directory of routing indices. Such architectures disregard the possible correlations among keywords. Together with the coarse granularity of per-peer summaries, which are mandated for scalability, this limitation may lead to poor search result quality. This paper develops and evaluates two solutions to this problem, sk-STAT based on single-key statistics only, and mk-STAT based on additional multi-key statistics. For both cases, hash sketch synopses are used to compactly represent a peer's data items and are efficiently disseminated in the P2P network to form a decentralized directory. Experimental studies with Gnutella and Web data demonstrate the viability and the trade-offs of the approaches.
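A hash sketch in the Flajolet-Martin style is one way to summarize a set of items in a few machine words. The sketch below is a generic illustration, not the sk-STAT or mk-STAT code: the class name `HashSketch`, the choice of SHA-1, and the parameters `num_bitmaps` and `bits` are assumptions.

```python
import hashlib

PHI = 0.77351  # Flajolet-Martin correction constant


class HashSketch:
    """Compact synopsis of a set, supporting union and distinct-count estimates."""

    def __init__(self, num_bitmaps=64, bits=32):
        self.m = num_bitmaps
        self.bits = bits
        self.bitmaps = [0] * num_bitmaps

    def add(self, item):
        h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
        bucket = h % self.m          # which bitmap this item updates
        rest = h // self.m           # remaining hash bits
        # Index of the lowest set bit in the remaining bits.
        rho = (rest & -rest).bit_length() - 1 if rest else self.bits - 1
        self.bitmaps[bucket] |= 1 << min(rho, self.bits - 1)

    def union(self, other):
        # The sketch of a union is the bitwise OR of the two sketches.
        merged = HashSketch(self.m, self.bits)
        merged.bitmaps = [a | b for a, b in zip(self.bitmaps, other.bitmaps)]
        return merged

    def estimate(self):
        # Average the position of the lowest zero bit over all bitmaps.
        total = 0
        for bitmap in self.bitmaps:
            r = 0
            while bitmap & (1 << r):
                r += 1
            total += r
        return self.m / PHI * 2 ** (total / self.m)


peer_a, peer_b, both = HashSketch(), HashSketch(), HashSketch()
for doc_id in range(0, 3000):
    peer_a.add(doc_id)
    both.add(doc_id)
for doc_id in range(2000, 5000):
    peer_b.add(doc_id)
    both.add(doc_id)

merged = peer_a.union(peer_b)
print(round(merged.estimate()))  # rough estimate of the 5000 distinct items
```

Because sketches combine with a bitwise OR, peers can exchange them cheaply, and an overlap estimate follows from inclusion-exclusion: |A ∩ B| ≈ |A| + |B| - |A ∪ B|. The estimates are approximate, with an error that shrinks as the number of bitmaps grows.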

very large data bases | 2004

COMPASS: a concept-based web search engine for HTML, XML, and deep web data

Jens Graupmann; Michael Biwer; Christian Zimmer; Patrick Zimmer; Matthias Bender; Martin Theobald; Gerhard Weikum

This chapter introduces a concept-based Web search engine for HTML, XML, and deep Web data—Context-Oriented Multi-Format Portal-Aware Search System (COMPASS). It also presents the features and architectures of COMPASS. The internal query language of COMPASS resembles a highly simplified version of mainstream languages such as SQL, XPath, or XQuery. Search conditions refer to concepts and values that correspond to element names and contents in an XML setting, and attribute names and values in a SQL setting. COMPASS uses a centralized data index for efficient search evaluation. All data and also the relationships among documents are represented in a relational database. All data formats are transformed into XML by using heuristics as well as external annotation tools such as GATE.

web information systems engineering | 2008

Approximate Information Filtering in Peer-to-Peer Networks

Christian Zimmer; Christos Tryfonopoulos; Klaus Berberich; Manolis Koubarakis; Gerhard Weikum

Most approaches to information filtering taken so far have the underlying hypothesis of potentially delivering notifications from every information producer to subscribers. This exact publish/subscribe model creates an efficiency and scalability bottleneck, and might not even be desirable in certain applications. The work presented here puts forward MAPS, a novel approach to support approximate information filtering in a peer-to-peer environment. In MAPS, a user subscribes to and monitors only carefully selected data sources, and receives notifications about interesting events from these sources only. This way, scalability is enhanced by trading recall for lower message traffic. We define the protocols of a peer-to-peer architecture especially designed for approximate information filtering, and introduce new node selection strategies based on time series analysis techniques to improve data source selection. Our experimental evaluation shows that MAPS is scalable; it achieves high recall by monitoring only a few data sources.
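Selecting data sources by predicted future behavior can be sketched with a standard time-series technique. The abstract does not name the method used, so the example below assumes double exponential smoothing (Holt's linear method). The function names, the smoothing parameters `alpha` and `beta`, and the sample publisher histories are all hypothetical.

```python
def predict_next(series, alpha=0.5, beta=0.5):
    """One-step-ahead forecast using double exponential smoothing."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[1]
    trend = series[1] - series[0]
    for x in series[2:]:
        prev_level = level
        level = alpha * x + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + trend


def select_publishers(history, k):
    """Pick the k publishers with the highest predicted publishing rate."""
    ranked = sorted(history, key=lambda p: (-predict_next(history[p]), p))
    return ranked[:k]


# Matching documents published per time window (made-up numbers).
history = {
    "p1": [10, 8, 6, 4],   # declining
    "p2": [1, 2, 3, 4],    # rising
    "p3": [3, 3, 3, 3],    # steady
}
print(select_publishers(history, 2))
```

All three publishers end on similar recent counts, but the forecast favors the rising publisher p2 over the declining p1, which is the kind of distinction a selection based only on past totals would miss.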

european conference on research and advanced technology for digital libraries | 2007

MinervaDL: an architecture for information retrieval and filtering in distributed digital libraries

Christian Zimmer; Christos Tryfonopoulos; Gerhard Weikum

We present MinervaDL, a digital library architecture that supports approximate information retrieval and filtering functionality under a single unifying framework. The architecture of MinervaDL is based on the peer-to-peer search engine Minerva, and is able to handle huge amounts of data provided by digital libraries in a distributed and self-organizing way. The two-tier architecture and the use of a distributed hash table as the routing substrate provide an infrastructure for creating large networks of digital libraries with minimal administration costs. We discuss the main components of this architecture, present the protocols that regulate node interactions, and experimentally evaluate our approach.

database systems for advanced applications | 2008

Flood little, cache more: effective result-reuse in P2P IR systems

Christian Zimmer; Srikanta J. Bedathur; Gerhard Weikum

State-of-the-art Peer-to-Peer Information Retrieval (P2P IR) systems suffer from a lack of response-time guarantees, especially at scale. To address this issue, a number of techniques for caching multi-term inverted list intersections and query results have been proposed recently. Although these enable speedy query evaluation with low network overhead, they fail to consider the potential impact of caching on result quality. In this paper, we propose a cache-aware query routing scheme that not only reduces the response delay for a query, but also presents an opportunity to improve result quality while keeping network usage low. We make three contributions. First, we develop a cache-aware, multi-round query routing strategy that balances query efficiency and result quality. Next, we propose to aggressively reuse the cached results of even subsets of a query, yielding an approximate caching technique that can drastically reduce bandwidth overhead, and study the conditions under which such a scheme retains good result quality. Finally, we empirically evaluate these techniques on a fully functional P2P IR system, using a large-scale Wikipedia benchmark and both synthetic and real-world query workloads. Our results show that combining result caching with multi-round, cache-aware query routing can reduce network traffic by more than half while doubling the result quality.
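Reusing cached results for subsets of a query can be sketched as a lookup that returns the largest cached term subset together with the terms it leaves uncovered. This is a simplified illustration, not the paper's routing scheme: the class `SubsetResultCache`, its methods, and the "largest subset wins" policy are assumptions.

```python
class SubsetResultCache:
    """Cache of query results keyed by the set of query terms."""

    def __init__(self):
        self._cache = {}

    def put(self, terms, results):
        self._cache[frozenset(terms)] = list(results)

    def lookup(self, terms):
        """Return (cached results or None, terms not covered by the cache hit)."""
        query = frozenset(terms)
        best = None
        for key in self._cache:
            # key <= query tests whether the cached term set is a subset.
            if key <= query and (best is None or len(key) > len(best)):
                best = key
        if best is None:
            return None, query
        return self._cache[best], query - best


cache = SubsetResultCache()
cache.put(["p2p"], ["d3"])
cache.put(["p2p", "search"], ["d1", "d2"])

results, uncovered = cache.lookup(["p2p", "search", "cache"])
print(results, sorted(uncovered))
```

A router built on this could answer approximately from the cached results and contact further peers only for the uncovered terms, trading some result quality for lower bandwidth.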

DELOS'04 Proceedings of the 6th Thematic conference on Peer-to-Peer, Grid, and Service-Orientation in Digital Library Architectures | 2004

The MINERVA project: towards collaborative search in digital libraries using peer-to-peer technology

Matthias Bender; Sebastian Michel; Christian Zimmer; Gerhard Weikum

We consider the problem of collaborative search across a large number of digital libraries and query routing strategies in a peer-to-peer (P2P) environment. Both digital libraries and users are equally viewed as peers and, thus, as part of the P2P network. Our system provides a versatile platform for a scalable search engine combining local index structures of autonomous peers with a global directory based on a distributed hash table (DHT) as an overlay network. Experiments with the MINERVA prototype testbed study the benefits and costs of P2P search for keyword queries.
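A global directory on top of a DHT can be pictured as a mapping from each term to the node responsible for it, where peers post their per-term statistics. The toy below runs in a single process and only mimics the key-to-node assignment with a hash ring. `ToyDirectory`, its method names, and the statistics fields are hypothetical and not taken from MINERVA.

```python
import bisect
import hashlib


def _hash(key):
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")


class ToyDirectory:
    """Single-process stand-in for a DHT-based directory of per-term peer statistics."""

    def __init__(self, node_names):
        self._ring = sorted((_hash(name), name) for name in node_names)
        self._keys = [h for h, _ in self._ring]
        self._store = {name: {} for name in node_names}

    def responsible_node(self, term):
        # First node clockwise from the term's hash, wrapping around the ring.
        i = bisect.bisect_right(self._keys, _hash(term)) % len(self._ring)
        return self._ring[i][1]

    def post(self, term, peer, stats):
        node = self.responsible_node(term)
        self._store[node].setdefault(term, {})[peer] = stats

    def peer_list(self, term):
        node = self.responsible_node(term)
        return dict(self._store[node].get(term, {}))


directory = ToyDirectory(["node-1", "node-2", "node-3"])
directory.post("retrieval", "peerA", {"doc_freq": 120})
directory.post("retrieval", "peerB", {"doc_freq": 45})
print(directory.peer_list("retrieval"))
```

A query initiator would fetch the peer list for each query term from the responsible node and then rank the listed peers locally before forwarding the query.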

international conference on peer-to-peer computing | 2008

P2P Information Retrieval and Filtering with MAPS

Christian Zimmer; Johannes Heinz; Christos Tryfonopoulos; Gerhard Weikum

In this demonstration paper we present MAPS, a novel system that combines approximate information retrieval and filtering functionality in a peer-to-peer setting. In MAPS, a user is able to submit one-time and continuous queries, and receive matching resources and notifications from selected information sources. The selection of these sources in the retrieval case is based on well-known resource selection techniques for peer-to-peer query routing, while in the filtering case a combination of resource selection and novel behavior prediction techniques using time-series analysis of publisher statistics is used. The integration of the two functionalities is done in a seamless way utilizing the same machinery: a conceptually global, but physically distributed directory of statistics about information sources based on distributed hash tables.

acm international conference on digital libraries | 2007

Efficient search and approximate information filtering in a distributed peer-to-peer environment of digital libraries

Christian Zimmer; Christos Tryfonopoulos; Gerhard Weikum

We present a new architecture for efficient search and approximate information filtering in a distributed Peer-to-Peer (P2P) environment of Digital Libraries. The MinervaLight search system uses P2P techniques over a structured overlay network to distribute and maintain a directory of peer statistics. Based on the same directory, the MAPS information filtering system provides an approximate publish/subscribe functionality by monitoring the most promising digital libraries for publishing appropriate documents regarding a continuous query. In this paper, we discuss our system architecture that combines searching and information filtering abilities. We show the system components of MinervaLight and explain the different facets of an approximate pub/sub system for subscriptions that is highly scalable, efficient, and notifies the subscribers about the most interesting publications in the P2P network of digital libraries. We also compare both approaches in terms of common properties and differences to give an overview of search and pub/sub using the same infrastructure.

very large data bases | 2005