Kathrin Peter | Researchain

Publication

Featured researches published by Kathrin Peter.

network computing and applications | 2006

Comparison of Redundancy Schemes for Distributed Storage Systems

Peter Sobe; Kathrin Peter

Reliable distributed data storage systems have to employ redundancy codes to tolerate the loss of storage devices. Many appropriate codes and algorithms can be found in the literature, but efficient schemes for tolerating several storage failures and their embedding in a distributed system are still research issues. In this paper, a variety of redundancy schemes that were implemented in a distributed storage system are compared. All schemes are based on parity and Reed/Solomon codes and are integrated in the storage system NetRAID. This system allows several user-specified layouts to be configured. A performance and reliability analysis of several data and redundancy layouts is presented that combines analytical and experimental results. In detail, we present performance results for an optimized Reed/Solomon implementation and give an outline for speeding up encoding and recovery by reconfigurable hardware employed in the distributed storage system.

Future Generation Computer Systems | 2009

Generalizing the data management of three community grids

Stefan Plantikow; Kathrin Peter; Mikael Högqvist; Christian Grimme; Alexander Papaspyrou

Implementing efficient data management is a key challenge of grid computing. Due to seemingly different domain specific requirements, data management solutions have been developed separately for each community grid using a selection of low-level tools and APIs. This has led to unnecessarily complex and overspecialized systems. We describe three D-Grid community grid projects, AstroGrid-D, C3Grid and MediGRID, and analyze to what degree they share the same data management requirements. As a result, we derive the viewpoint that data management systems should provide applications with data access based on declarative and logical addressing, while ensuring the required quality of service (QoS). As a possible approach for this, we describe a conceptual data management system architecture that separates application, community, and resource concerns, using three layers of addressing, thus providing a highly adaptable architecture for different community grids. Additionally, we discuss approaches for the integration of legacy applications and grid scheduling with the proposed architecture.

international workshop on data intensive distributed computing | 2012

Consistency and fault tolerance for erasure-coded distributed storage systems

Kathrin Peter; Alexander Reinefeld

One challenge in applying erasure codes (or error-correcting codes) to distributed storage systems is to maintain consistency between data and redundancy blocks in the face of crashing servers. We present two access protocols that provide sequential consistency and maximum distance separable fault tolerance at the same time. The protocols use sequence numbers to recover a consistent version in the presence of failures or partial writes. The first (pessimistic) PSW protocol uses a master per stripe to execute updates in sequence. The second (optimistic) OCW protocol allows concurrent writes to blocks in the same stripe to happen in parallel at the cost of additional buffer space. We present empirical performance results for PSW and OCW and compare them to other protocols. Our results show that OCW is as fast as simple replication while providing better fault tolerance and/or reduced storage overhead. This demonstrates that erasure coding can be used as a space-efficient alternative to replication in distributed storage systems.

Journal of Computational Science | 2012

Solutions for biomedical grid computing—Case studies from the D-Grid project Services@MediGRID

Frank Dickmann; Jürgen Falkner; Wilfried Gunia; Jochen Hampe; Michael Hausmann; Alexander M. Herrmann; Nick Kepper; Tobias A. Knoch; Svenja Lauterbach; Jörg Lippert; Kathrin Peter; Eberhard Schmitt; Ulrich Schwardmann; Juri Solodenko; Dietmar Sommerfeld; Thomas Steinke; Anette Weisbecker; Ulrich Sax

Since 2008, the Services@MediGRID project consortium has established a tool set of grid-based biomedical services. The services are related to genetic analysis, genome data visualization, and pharmacokinetic modeling. Furthermore, business concepts for these services, which are supported by an accounting and billing service, have been examined. While the tools cover a whole service chain for biomedicine, the business concepts are rather heterogeneous. However, the target market areas addressed overall show promising potential. In addition, a structured coaching process reduces friction in the technology transfer from grid computing to biomedicine. This should be considered for similar future endeavors.

network computing and applications | 2008

Flexible Parameterization of XOR based Codes for Distributed Storage

Peter Sobe; Kathrin Peter

Distributed storage systems apply erasure-tolerant codes to guarantee reliable access to data despite failures of storage resources. While many codes can be mapped to XOR operations and efficiently implemented on common microprocessors, a given system usually implements only a small number of codes out of the wide variety available. The ability to include new codes easily, to exchange codes and finally to select codes for several types of data is desirable. To provide this flexibility, a parameterization is used which allows the definition of different XOR based codes and, beyond that, different styles of en- and decoding. The parameters include (i) the assignment of data and redundancy elements to the storage resources and (ii) a description of en- and decoding algorithms with XOR based equations. The parameters of a certain code can be changed and in addition a wide variety of codes can be described and included in a storage system implementation. The proposed parameterization adopts the ability of codes like EVENODD, Cauchy-R/S and HoVer codes to map to distributed resources. Furthermore, en- and decoding algorithms can be described differently, either for minimal coding cost or for minimal coding time on parallel systems.
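The two parameter groups named in this abstract, (i) placement of elements on storage resources and (ii) XOR based equations, can be illustrated with a minimal sketch. It is not the parameterization from the paper: the names PLACEMENT, EQUATIONS, encode and recover are hypothetical, and the code shown is a plain single-parity code chosen only because it is the simplest XOR based code.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical parameters (i): which storage resource holds which element.
PLACEMENT = {"d0": "node0", "d1": "node1", "d2": "node2", "p0": "node3"}

# Hypothetical parameters (ii): each redundancy element is the XOR of the
# listed data elements. Here a single parity equation, p0 = d0 ^ d1 ^ d2.
EQUATIONS = {"p0": ["d0", "d1", "d2"]}


def encode(data, equations):
    """Compute every redundancy element from its XOR equation."""
    return {
        target: xor_blocks([data[name] for name in sources])
        for target, sources in equations.items()
    }


def recover(missing, available, target, sources):
    """Solve one XOR equation for a single missing element."""
    names = [name for name in sources + [target] if name != missing]
    return xor_blocks([available[name] for name in names])


data = {"d0": b"\x01\x02", "d1": b"\x10\x20", "d2": b"\xff\x00"}
stored = {**data, **encode(data, EQUATIONS)}

# Simulate the failure of node1: drop every element placed on it.
failed_node = "node1"
missing = [name for name, node in PLACEMENT.items() if node == failed_node]
for name in missing:
    del stored[name]

# Rebuild the single lost element from the surviving ones.
rebuilt = recover(missing[0], stored, "p0", EQUATIONS["p0"])
```

Because the code is described only by data (the two dictionaries), swapping in another XOR based code in this sketch would mean changing the parameters rather than the encode and recover routines.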

grid computing | 2010

Performance Analysis of Diffusion Tensor Imaging in an Academic Production Grid

Dagmar Krefting; Ralf Luetzkendorf; Kathrin Peter; Johannes Bernarding

Analysis of diffusion weighted magnetic resonance images serves increasingly for non-invasive tracking of nerve fibers in the human brain, both in clinical diagnosis and basic research. Diffusion-tensor imaging (DTI) enables in-vivo research on the internal structure of the central nervous system, an estimation of the interconnection of functional areas and diagnosis of brain tumors and de-myelinating diseases. But modeling the local diffusion parameters is computationally expensive, and on standard desktop computers runtimes of up to days are common. A workflow based grid implementation of the algorithm with slice-based parallelization has shown significant speedup. However, in production use, the implementation was frequently delayed and even failed, discouraging the medical collaborators from taking up the management of the data processing themselves. Therefore a comprehensive analysis of possible sources of errors and delays as well as their real impact in the respective infrastructure is vital to enable clinical researchers to fully exploit the benefits of the Healthgrid application. In this manuscript, we tested different implementations of the DTI analysis with respect to robustness and runtime. Based on the results, concrete application improvements as well as general suggestions for the layout and maintenance of Healthgrids are derived.

international parallel and distributed processing symposium | 2007

Combining Compression, Encryption and Fault-tolerant Coding for Distributed Storage

Peter Sobe; Kathrin Peter

Storing data in distributed systems aims to offer higher bandwidth and scalability than storing locally. However, several disadvantages must be taken into account, such as unreliability caused by faults, temporary downtimes and malicious attacks. To improve dependability, redundancy codes like parity can be used as well as more sophisticated codes such as Reed/Solomon. Another issue, security requirements, arises when data is kept in untrusted units in a network. To encrypt data, it is common to use security algorithms like AES. For efficient transfer and storage, the amount of data can be reduced by compression algorithms. All these techniques (data distribution, fault-tolerant coding, encryption and compression) can be employed together using independent algorithms, but in a proper combination. A superposition of these techniques exploiting synergies is still an issue for research. Thus, in this paper we study proper technique combinations applied to distributed storage. The combinations are classified and examined with respect to their potential benefit and limitations. For our model, performance parameters from the distributed storage system NetRAID are used.

international parallel and distributed processing symposium | 2006

Construction of efficient OR-based deletion-tolerant coding schemes

Peter Sobe; Kathrin Peter

Fault-tolerant data layouts for storage systems are based on the principle of adding redundancy to groups of data blocks and storing them in different fault regions. Commonly, XOR-based codes are used with an optimal redundancy overhead but with the disadvantage of relatively high calculation costs. We present a scheme that encodes input data in a highly redundant code and exploits that redundancy for a fault tolerance scheme. It allows missing bits to be recalculated in fewer steps than needed for XOR-based schemes. This simple and efficient en- and decoding requires an appropriate hardware architecture or a highly parallel microprocessor architecture. In particular, disjunctions over many input bits must be calculated, e.g. by wide OR-gates or busses that are driven by multiple logic input lines. The highly redundant encoding is combined with data compression for separated data streams, each stream dedicated to a storage device. The compression not only eliminates the redundancy introduced by the code, it also eliminates redundancy in the input data.

parallel, distributed and network-based processing | 2011

Reliability Study of Coding Schemes for Wide-Area Distributed Storage Systems

Kathrin Peter

Distributed storage systems comprise a large number of commodity hardware components distributed across several data centers. Even in the presence of permanent failures, the system should provide reliable storage. While replication has advantages because of its simplicity, there exist coding techniques, e.g. MDS (maximum distance separable) erasure codes, that provide adaptable reliability properties with an optimal redundancy ratio at the same time. The coding and distribution scheme influences the prospective storage reliability. In this paper we present reliability models for erasure coding and replication techniques, especially for their application in wide-area storage systems. Furthermore, we utilize these models to quantify the reliability properties of concrete data storage scenarios.
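The trade-off between replication and MDS erasure codes mentioned in this abstract can be made concrete with a deliberately simplified model. It is not one of the reliability models from the paper: it assumes that every storage node fails independently with the same probability p during the period considered and that no repair takes place. Under that assumption an (n, k) MDS code loses data only when more than n - k of its n nodes fail, and r-way replication is the special case n = r, k = 1. The function names are hypothetical.

```python
from math import comb


def loss_probability_mds(n, k, p):
    """Probability that more than n - k of n independent nodes fail.

    Assumes each node fails independently with probability p and that
    an (n, k) MDS code tolerates any n - k failures.
    """
    tolerated = n - k
    return sum(
        comb(n, failures) * p**failures * (1 - p) ** (n - failures)
        for failures in range(tolerated + 1, n + 1)
    )


def loss_probability_replication(copies, p):
    """Probability that all copies are lost (the MDS case n = copies, k = 1)."""
    return p**copies


def storage_overhead(n, k):
    """Raw storage consumed per unit of user data."""
    return n / k


p = 0.01  # assumed per-node failure probability, for illustration only
mds_loss = loss_probability_mds(6, 4, p)           # overhead 1.5
mirror_loss = loss_probability_replication(2, p)   # overhead 2.0
```

With these illustrative parameters the (6, 4) code has both a lower storage overhead and a lower loss probability than two-way replication, which is the kind of comparison such models are meant to support. Correlated failures and repair times, which matter in wide-area systems, are outside this sketch.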

network computing and applications | 2012

Application of Regenerating Codes for Fault Tolerance in Distributed Storage Systems

Kathrin Peter; Peter Sobe

Recently, regenerating codes, a special network coding technique, were discovered for fault-tolerant storage systems with the promising advantage of efficient data recovery in the case of a single node failure and replacement (regeneration case). From the perspective of coding theory, regenerating codes are extensively studied, but there exists no reference on how to implement these codes in storage systems. We provide a comparison of Reed-Solomon codes and regenerating codes from an implementation point of view. The comparison includes the experimental evaluation of the encoding and the regeneration data throughput.
