Network


Latest external collaboration at the country level. Dive into the details by clicking on the dots.

Hotspot


Dive into the research topics where Mikael Högqvist is active.

Publication


Featured research published by Mikael Högqvist.


storage network architecture and parallel i/os | 2010

BabuDB: Fast and Efficient File System Metadata Storage

Jan Stender; Björn Kolbeck; Mikael Högqvist; Felix Hupfeld

Today's distributed file system architectures scale well to large amounts of data. Their performance, however, is often limited by their metadata server. In this paper, we reconsider the database backend of the metadata server and propose a design that simplifies implementation and enhances performance. In particular, we argue that the concept of log-structured merge (LSM) trees is a better foundation for the storage layer of a metadata server than the traditionally used B-trees. We present BabuDB, a database that relies on LSM-tree-like index structures, and describe how it stores file system metadata. We show that our solution offers better scalability and performance than equivalent ext4 and Berkeley DB-based metadata server implementations. Our experiments include real-world metadata traces from a Linux kernel build and an IMAP mail server. Results show that BabuDB is up to twice as fast as the ext4-based backend and outperforms a Berkeley DB setup by an order of magnitude.
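The LSM-tree-like structure the abstract argues for can be illustrated with a minimal sketch (a deliberate simplification, not BabuDB's actual code; the class and parameter names are illustrative): writes go to a mutable in-memory table, which is periodically frozen into immutable sorted runs, and lookups scan from newest to oldest.

```python
class LSMIndex:
    """Minimal sketch of an LSM-tree-like index (illustrative, not BabuDB)."""

    def __init__(self, memtable_limit=4):
        self.memtable = {}              # mutable in-memory writes
        self.runs = []                  # immutable sorted runs, newest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            # freeze the full memtable into a sorted, immutable run
            self.runs.insert(0, sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:        # freshest data wins
            return self.memtable[key]
        for run in self.runs:           # then scan runs, newest first
            for k, v in run:
                if k == key:
                    return v
        return None
```

Sequential writes and immutable runs are what make this layout friendlier to metadata workloads than in-place B-tree updates; a real implementation would also merge runs in the background and persist them as on-disk files.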


extending database technology | 2012

SFA: a symbolic fourier approximation and index for similarity search in high dimensional datasets

Patrick Schäfer; Mikael Högqvist

Time series analysis, as an application for high-dimensional data mining, is a common task in biochemistry, meteorology, climate research, bio-medicine and marketing. Similarity search in data with increasing dimensionality results in an exponential growth of the search space, referred to as the Curse of Dimensionality. A common approach to postpone this effect is to reduce the dimensionality of the original data by approximation prior to indexing. However, approximation involves a loss of information, which also leads to an exponential growth of the search space. Therefore, indexing an approximation with a high dimensionality, i.e., high quality, is desirable. We introduce Symbolic Fourier Approximation (SFA) and the SFA trie, which allow for indexing not only large datasets but also high-dimensional approximations. This is done by exploiting the trade-off between the quality of the approximation and the degeneration of the index, using a variable number of dimensions to represent each approximation. Our experiments show that SFA combined with the SFA trie can scale to 5--10 times more indexed dimensions than previous approaches. For exact similarity search on real-world and synthetic data, it reduces page accesses and CPU costs by factors of 2--25 and 2--11, respectively.
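The core SFA idea can be sketched in a few lines: keep only the first Fourier coefficients of a series and quantize each real/imaginary part into a symbol. This is a hedged toy version; the paper learns the quantization bins from data, whereas the fixed equi-width bins over [-1, 1], the function name, and the alphabet below are all illustrative assumptions.

```python
import cmath

def sfa_word(series, n_coeffs=2, alphabet="abcd"):
    """Toy Symbolic Fourier Approximation: truncate the DFT, then
    quantize each coefficient part into a discrete symbol."""
    n = len(series)
    parts = []
    for k in range(n_coeffs):
        # direct DFT of the k-th (lowest-frequency) coefficient
        c = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(series)) / n
        parts.extend([c.real, c.imag])
    # map each value into one of len(alphabet) equi-width bins on [-1, 1]
    word = ""
    for v in parts:
        clamped = max(-1.0, min(1.0, v))
        idx = min(int((clamped + 1.0) / 2.0 * len(alphabet)), len(alphabet) - 1)
        word += alphabet[idx]
    return word
```

Because low-frequency coefficients carry most of the signal's shape, two similar series tend to share a word prefix, which is what makes the symbolic words amenable to trie-based indexing.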


international parallel and distributed processing symposium | 2011

Flease - Lease Coordination Without a Lock Server

Björn Kolbeck; Mikael Högqvist; Jan Stender; Felix Hupfeld

Large-scale distributed systems often require scalable and fault-tolerant mechanisms to coordinate exclusive access to shared resources such as files, replicas or the primary role. The best-known algorithms for implementing distributed mutual exclusion with leases, such as Multipaxos, are complex, difficult to implement, and rely on stable storage to persist lease information. In this paper we present FLEASE, an algorithm for fault-tolerant lease coordination in distributed systems that is simpler than Multipaxos and does not rely on stable storage. The evaluation shows that FLEASE can be used to implement scalable, decentralized lease coordination that outperforms a central lock service implementation by an order of magnitude.
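The lease semantics underlying this line of work can be sketched as follows. This is not the FLEASE protocol itself (which runs a Paxos-style round among replicas to agree on the lease); it only illustrates the timeout-based safety idea: a lease names a holder and an expiry time, and another process may acquire it only after expiry padded by a bound `eps` on the skew between loosely synchronized clocks. All names are illustrative.

```python
class Lease:
    """Toy timeout-based lease (illustrative; not the FLEASE protocol)."""

    def __init__(self, eps=0.05):
        self.holder = None      # current lease holder, or None
        self.expires = 0.0      # expiry timestamp of the current lease
        self.eps = eps          # assumed bound on clock skew

    def try_acquire(self, pid, duration, now):
        # the holder may always renew; others must wait until the old
        # lease has expired plus the clock-skew safety margin
        if (self.holder == pid or self.holder is None
                or now > self.expires + self.eps):
            self.holder = pid
            self.expires = now + duration
            return True
        return False
```

The `eps` padding is what lets the scheme tolerate unsynchronized clocks without stable storage: even if the old holder's clock runs slow, its lease has certainly lapsed before anyone else takes over.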


Future Generation Computer Systems | 2009

Generalizing the data management of three community grids

Stefan Plantikow; Kathrin Peter; Mikael Högqvist; Christian Grimme; Alexander Papaspyrou

Implementing efficient data management is a key challenge of grid computing. Due to seemingly different domain-specific requirements, data management solutions have been developed separately for each community grid using a selection of low-level tools and APIs. This has led to unnecessarily complex and overspecialized systems. We describe three D-Grid community grid projects, AstroGrid-D, C3Grid and MediGRID, and analyze to what degree they share the same data management requirements. As a result, we derive the viewpoint that data management systems should provide applications with data access based on declarative and logical addressing, while ensuring the required quality of service (QoS). As a possible approach, we describe a conceptual data management system architecture that separates application, community, and resource concerns using three layers of addressing, thus providing a highly adaptable architecture for different community grids. Additionally, we discuss approaches for integrating legacy applications and grid scheduling with the proposed architecture.


New Astronomy | 2011

AstroGrid-D: Grid technology for astronomical science

Harry Enke; Matthias Steinmetz; Hans-Martin Adorf; Alexander Beck-Ratzka; Frank Breitling; Thomas Brüsemeister; Arthur Carlson; Torsten A. Ensslin; Mikael Högqvist; Iliya Nickelt; Thomas Radke; Alexander Reinefeld; Angelika Reiser; Tobias Scholl; Rainer Spurzem; J. Steinacker; W. Voges; Joachim Wambsganß; Steve White

We present the status and results of AstroGrid-D, a joint effort of astrophysicists and computer scientists to employ grid technology for scientific applications. AstroGrid-D provides access to a network of distributed machines through a set of commands as well as software interfaces. It allows simple use of compute and storage facilities and makes it possible to schedule or monitor compute tasks and data management. It is based on the Globus Toolkit middleware (GT4). Chapter 1 describes the context that led to the demand for advanced software solutions in astrophysics, and states the goals of the project. We then present characteristic astrophysical applications that have been implemented on AstroGrid-D in Chapter 2. We describe simulations of different complexity, compute-intensive calculations running on multiple sites (Section 2.1), and advanced applications for specific scientific purposes (Section 2.2), such as a connection to robotic telescopes (Section 2.2.3). These examples show how grid execution improves, e.g., the scientific workflow. Chapter 3 explains the software tools and services that we adapted or newly developed. Section 3.1 focuses on the administrative aspects of the infrastructure: managing users and monitoring activity. Section 3.2 characterises the central components of our architecture: the AstroGrid-D information service to collect and store metadata, a file management system, the data management system, and a job manager for automatic submission of compute tasks. We summarise the successfully established infrastructure in Chapter 4, concluding with our future plans to establish AstroGrid-D as a platform of modern e-Astronomy.


international performance, computing, and communications conference | 2010

Loosely time-synchronized snapshots in object-based file systems

Jan Stender; Mikael Högqvist; Björn Kolbeck

A file system snapshot is a stable image of all files and directories in a well-defined state. Local file systems offer point-in-time consistency of snapshots, which guarantees that all files are frozen in the state they were in at the same point in time. However, this cannot be achieved in a distributed file system without global clocks or synchronous snapshot operations. We present an algorithm for distributed file system snapshots that overcomes this problem by relaxing the point-in-time consistency of local file system snapshots to a time-span-based consistency. Built on loosely synchronized server clocks, it makes snapshots available within milliseconds, without any kind of locking or synchronization. Our evaluation demonstrates that enabling and accessing snapshots involves a read/write throughput penalty of no more than 1% under normal conditions.
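The time-span relaxation can be sketched as a simple version-selection rule (a hypothetical simplification, not the paper's algorithm): each server timestamps versions with its own loosely synchronized clock, and a snapshot at time T contains, per file, the newest version not after T. Versions on different servers may then differ by up to the clock skew, giving a span rather than a point of consistency.

```python
def snapshot_view(versions, snap_time):
    """Toy time-span snapshot: versions maps file name -> list of
    (timestamp, value) pairs, timestamps from loosely synced clocks.
    Returns each file's newest version at or before snap_time."""
    view = {}
    for name, history in versions.items():
        eligible = [(ts, v) for ts, v in history if ts <= snap_time]
        if eligible:
            # newest eligible version wins; files created later are absent
            view[name] = max(eligible)[1]
    return view
```

Because the rule is purely local to each server's version history, no locking or cross-server synchronization is needed to materialize the snapshot, which is what makes millisecond availability plausible.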


self-adaptive and self-organizing systems | 2008

Using Global Information for Load Balancing in DHTs

Mikael Högqvist; Seif Haridi; Nico Kruber; Alexander Reinefeld; Thorsten Schütt

Distributed hash tables (DHT) with order-preserving hash functions require load balancing to ensure an even item-load over all nodes. While previous item-balancing algorithms only improve the load imbalance, we argue that due to the cost of moving items, the competing goal of minimizing the used network traffic must be addressed as well. We aim to improve on existing algorithms by augmenting them with approximations of global knowledge, which can be distributed in a DHT with low cost using gossip mechanisms. In this paper we present initial simulation-based results from a decentralized balancing scheme extended with knowledge about the average node load. In addition, we discuss future work including a centralized auction-based algorithm that will be used as a benchmark.


ieee/acm international symposium cluster, cloud and grid computing | 2011

The Benefits of Estimated Global Information in DHT Load Balancing

Nico Kruber; Mikael Högqvist; Thorsten Schütt

Distributed hash tables (DHT) often rely on uniform hashing for balancing the load among their nodes. However, the most overloaded node may still have a load up to O(log N) times higher than the average load. DHTs with support for range queries cannot rely on hashing to fairly balance the system's load, since hashing destroys the order of the stored items. Ensuring a fair load distribution is vital to avoid individual nodes becoming overloaded, potentially leading to node crashes or an incentive not to participate in the system. In both scenarios explicit load balancing schemes can help to spread the load more evenly. In this paper, we improve on existing algorithms for item-based active load balancing by relying on approximations of global properties. We show that the algorithms can be made more efficient by incorporating estimates of properties such as the average load and the standard deviation. Our algorithms reduce the network traffic induced by load balancing while achieving a better load balance than standard algorithms. We also show that these improvements can be applied to passive load balancing algorithms. Compared to DHTs without explicit load balancing, both variants are able to reduce the total maintenance traffic, i.e. item movements due to churn and load balancing, by up to 18%, while simultaneously achieving a better load distribution.
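The way a global estimate throttles item movement can be sketched in one small decision rule (a hedged illustration, not the paper's algorithm; the function name, the slack factor, and the choice of estimator are assumptions): a node only sheds items when its load exceeds the gossip-estimated average by a slack margin, and then sheds only the surplus.

```python
def items_to_shed(node_load, est_avg, slack=1.25):
    """Toy estimate-driven balancing decision: shed nothing while the
    node is within `slack` of the estimated global average (e.g. a
    gossip-based estimate), otherwise shed only the surplus items."""
    if node_load <= slack * est_avg:
        return 0                      # balanced enough; avoid traffic
    return int(node_load - est_avg)   # move just the excess
```

The slack band is the point of the technique: without it, nodes hovering near the average would keep trading items back and forth, and the balancing traffic would dominate the savings.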


grid computing | 2011

Infrastructure Federation Through Virtualized Delegation of Resources and Services

Georg Birkenheuer; André Brinkmann; Mikael Högqvist; Alexander Papaspyrou; Bernhard Schott; Dietmar Sommerfeld; Wolfgang Ziegler

Infrastructure federation is becoming an increasingly important issue for modern Distributed Computing Infrastructures (DCIs): dynamic elasticity of quasi-static Grid environments, incorporation of special-purpose resources into commoditized Cloud infrastructures, cross-community collaboration for increasingly diverging areas of modern e-Science, and Cloud bursting pose major challenges on the technical level for many resource and middleware providers. Especially with respect to the increasing costs of operating data centers, the intelligent yet automated and secure sharing of resources is a key factor for success. With the D-Grid Scheduler Interoperability (DGSI) project within the German D-Grid Initiative, we provide a strategic technology for the automatically negotiated, SLA-secured, dynamically provisioned federation of resources and services for Grid- and Cloud-type infrastructures. This goal is achieved by complementing current DCI schedulers with the ability to federate infrastructure for the temporary leasing of resources and rechanneling of workloads. In this work, we describe the overall architecture and SLA-secured negotiation protocols within DGSI and depict an advanced mechanism for resource delegation by means of dynamically provisioned, virtualized middleware. Through this methodology, we provide the technological foundation for intelligent capacity planning and workload management in a cross-infrastructure fashion.


self-adaptive and self-organizing systems | 2010

Towards Explicit Data Placement in Scalable Key/Value Stores

Mikael Högqvist; Stefan Plantikow

Distributed key/value stores are a key component of many large-scale applications. Traditionally, they have been designed using Distributed Hash Tables (DHTs). DHTs, however, set up a tight coupling between the naming of nodes and the assignment of keys to nodes, which limits application control over data placement. We propose using small amounts of shared state in a semi-centralized architecture for more flexible data placement, introducing an explicit mapping between keys and nodes via an indirection layer (blockspace). Our design is based on a membership layer that provides O(1) routing, thereby targeting interactive applications. We evaluate a centralized and a decentralized approach, showing that both have relatively low overhead and provide efficient load balancing.
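The blockspace indirection can be sketched as two mappings (a hypothetical illustration of the idea, not the paper's system; class name, block count, and the round-robin initial assignment are assumptions): keys hash into a fixed set of blocks, and a small shared table maps each block to a node. Remapping a block moves its keys to another node without renaming any node, which is the explicit placement control a plain DHT lacks.

```python
import hashlib

class Blockspace:
    """Toy key -> block -> node indirection layer."""

    def __init__(self, nodes, n_blocks=16):
        self.n_blocks = n_blocks
        # illustrative initial placement: round-robin blocks over nodes
        self.block_to_node = {b: nodes[b % len(nodes)]
                              for b in range(n_blocks)}

    def block_of(self, key):
        # stable hash of the key into a fixed blockspace
        h = int(hashlib.sha1(key.encode()).hexdigest(), 16)
        return h % self.n_blocks

    def node_of(self, key):
        return self.block_to_node[self.block_of(key)]

    def remap(self, block, node):
        # explicit placement: reassign one block (and all its keys)
        self.block_to_node[block] = node
```

Because lookups only consult the small shared block table, routing is O(1), and the application can rebalance or co-locate data simply by editing that table.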

Collaboration


Dive into Mikael Högqvist's collaboration.

Top Co-Authors


Alexander Papaspyrou

Technical University of Dortmund
