Steffen Viken Valvåg
University of Tromsø
Publications
Featured research published by Steffen Viken Valvåg.
High Performance Computing and Communications | 2008
Steffen Viken Valvåg; Dag Johansen
The complexity of implementing large-scale distributed computations has motivated new programming models. Google's MapReduce model has gained widespread use and aims to hide the complex details of data partitioning and distribution, scheduling, synchronization, and fault tolerance. However, our experiences from the enterprise search business indicate that many real-life applications must be implemented as a collection of related MapReduce programs. Since the execution of these programs must be monitored and coordinated externally, several issues concerning scheduling, synchronization, and fault tolerance resurface. To address these limitations, we introduce Oivos, a high-level declarative programming model and its underlying runtime. We show how Oivos programs may specify computations that span multiple heterogeneous and interdependent data sets, how the programs are compiled and optimized, and how our runtime orchestrates and monitors their distributed execution. Our experimental evaluation reveals that Oivos programs do less I/O and execute significantly faster than equivalent sequences of MapReduce passes.
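The declarative coordination the abstract describes can be illustrated with a minimal sketch: data sets declare their dependencies, and a runtime derives the execution order instead of the programmer coordinating passes by hand. The names `Dataset` and `schedule` below are illustrative, not the actual Oivos API.

```python
# Hypothetical sketch of declarative multi-pass dataflow in the spirit of
# Oivos: each data set declares what it depends on, and the scheduler
# topologically orders the underlying passes.
from graphlib import TopologicalSorter

class Dataset:
    def __init__(self, name, deps=(), transform=None):
        self.name = name
        self.deps = list(deps)          # upstream data sets
        self.transform = transform      # pure function producing this data set

def schedule(datasets):
    """Return an execution order in which dependencies always run first."""
    graph = {d.name: {dep.name for dep in d.deps} for d in datasets}
    return list(TopologicalSorter(graph).static_order())

# Declare a small pipeline: raw documents -> tokens -> inverted index.
raw = Dataset("raw")
tokens = Dataset("tokens", deps=[raw], transform=str.split)
index = Dataset("index", deps=[tokens])

order = schedule([index, tokens, raw])
print(order)  # dependencies precede dependents regardless of declaration order
```

A runtime built this way can also fuse or reorder passes before execution, which is where the paper's I/O savings over externally coordinated MapReduce sequences would come from.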
Network and Parallel Computing | 2009
Steffen Viken Valvåg; Dag Johansen
MapReduce has become a popular paradigm for parallel data processing, both for ad hoc, schema-less processing using a simple functional interface, and as a building block for higher-level abstractions. Much subsequent work has layered additional functionality on top of MapReduce or similar infrastructures, building powerful software stacks for distributed applications. In this paper, we present Cogset, the result of rethinking the original MapReduce architecture that sits at the bottom of the stack. We observe that the traditional loose coupling between the distributed file system and the MapReduce processing engine leads to poor data locality for many applications. Accordingly, Cogset offers both reliable storage and parallel data processing, fusing the two components into a single system that ensures good data locality. We also take a new approach to data shuffling, relying on highly efficient static routing, and devise new mechanisms for fault tolerance, load balancing and ensuring consistency. We evaluate Cogset using a suite of benchmark applications, comparing it to Hadoop with very favorable results. For example, on a 12-node cluster, an inverted index that takes 80 minutes to build using Hadoop can be constructed using Cogset in less than 35 minutes.
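The static routing mentioned above can be sketched as deterministic hash partitioning: every node computes the same record-to-partition mapping locally, so no dynamic shuffle negotiation is needed. This is a minimal sketch of the idea, not Cogset's actual routing code.

```python
# Minimal sketch of static routing: record keys are deterministically
# hashed to a fixed set of partitions, so any node can compute a record's
# destination without consulting a central shuffle coordinator.
import hashlib

NUM_PARTITIONS = 8  # fixed at configuration time (an assumption here)

def route(key: str) -> int:
    """Map a record key to its partition; identical on every node."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_PARTITIONS

# Two independent nodes routing the same key always agree:
assert route("example-key") == route("example-key")
print(sorted({route(k) for k in ("a", "b", "c", "d")}))
```

Because the mapping is fixed, storage and processing can be co-located per partition, which is the data-locality benefit the abstract claims.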
Concurrency and Computation: Practice and Experience | 2013
Steffen Viken Valvåg; Dag Johansen; Åge Kvalnes
Cogset is a generic and efficient engine for reliable storage and parallel processing of distributed data sets. It supports a number of high-level programming interfaces, including a MapReduce interface compatible with Hadoop. In this paper, we present Cogset's architecture and evaluate its performance as a MapReduce engine, comparing it with Hadoop. Our results show that Cogset generally outperforms Hadoop by a significant margin. We investigate the underlying causes of this difference in performance and demonstrate some relatively minor modifications that markedly improve Hadoop's performance, closing some of the gap.
IEEE International Conference on Cloud Computing Technology and Science | 2010
Steffen Viken Valvåg; Dag Johansen; Åge Kvalnes
Cogset is an efficient and generic engine for reliable storage and parallel processing of data. It supports a number of high-level programming interfaces, including a MapReduce interface compatible with Hadoop. In this paper, we evaluate Cogset's performance as a MapReduce engine, comparing it to Hadoop. Our results show that Cogset generally outperforms Hadoop by a significant margin. We investigate the causes of this gap in performance and demonstrate some relatively minor modifications that markedly improve Hadoop's performance, closing some of the gap.
Mobile Cloud Computing & Services | 2014
Robert Pettersen; Steffen Viken Valvåg; Åge Kvalnes; Dag Johansen
Cloud database services are a convenient building block for emerging mobile cloud applications. A central database can simplify application architectures by serving both as a reliable point of contact and as a repository for critical state. Meanwhile, the issues of availability and scalability can be delegated to the cloud service provider. The convenience of this approach is balanced by associated costs, both in terms of latency and financial expenses. Hence, an attractive middle ground is to employ caching of data in a layer between applications and the cloud, to reduce the load imposed on the cloud database service. This paper presents Jovaku, a generic caching layer for cloud database services that can yield significant performance improvements and cost savings. Jovaku demonstrates the viability of a truly global caching infrastructure by building on the existing DNS infrastructure. Database operations are relayed through the DNS protocol, allowing results to be cached in DNS servers close to client devices. This greatly simplifies deployment and offers high availability, allowing devices anywhere to benefit from database caching. Our evaluation shows that the latency to access Amazon DynamoDB is significantly reduced for requests that hit the cache, and that applications can benefit from caching with hit rates as low as 5%.
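The claim that caching pays off even at a 5% hit rate can be checked with simple expected-latency arithmetic. The latency figures below are illustrative assumptions, not measurements from the paper.

```python
# Back-of-the-envelope expected latency with a DNS-side cache in front of
# a cloud database. All latency figures are assumed for illustration.
def expected_latency(hit_rate, cache_ms, cloud_ms, relay_overhead_ms=0.0):
    """Expected per-request latency given a cache hit rate in [0, 1]."""
    hit_cost = hit_rate * cache_ms
    miss_cost = (1 - hit_rate) * (cloud_ms + relay_overhead_ms)
    return hit_cost + miss_cost

cloud_ms = 100.0   # assumed round trip from a device to the cloud database
cache_ms = 5.0     # assumed round trip to a nearby DNS cache

for rate in (0.0, 0.05, 0.5):
    print(f"hit rate {rate:.0%}: {expected_latency(rate, cache_ms, cloud_ms):.2f} ms")
```

As long as relaying through DNS adds little overhead on misses, even a 5% hit rate lowers the average, since each hit replaces a full cloud round trip with a nearby cache lookup.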
Proceedings of the 2013 International Workshop on Hot Topics in Cloud Services | 2013
Steffen Viken Valvåg; Dag Johansen; Åge Kvalnes
Cloud services traditionally have a centralized architecture, where all clients communicate individually with the central service, and not directly with each other. Data is primarily stored in the cloud, and computations that touch data are performed in the cloud. We present Rusta, a platform that allows cloud services to deploy in a more flexible and decentralized manner, potentially involving the client machines at the edge of the cloud both for storage and processing of data. This can reduce operational costs both by leveraging freely available client resources, and by reducing data traffic to and from the cloud. Rusta includes a group abstraction to delineate webs of trusted peers, a lightweight process abstraction based on asynchronous message passing, and a distributed data storage layer. For elasticity, processes may migrate freely among the clients of a group, and can be replicated in a transparent manner. A central hub service executes in the cloud and maintains critical system state, while delegating work to clients as appropriate. This paper describes the design and current implementation of Rusta, its high-level programming model, and some of its potential applications, in particular as a foundation for highly elastic computations at the edge of the cloud.
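A process abstraction based on asynchronous message passing, as described above, can be sketched in a few lines: processes interact only through mailboxes, never shared state, which is what makes transparent migration and replication plausible. This sketch uses Python's asyncio queues as stand-in mailboxes; it is not the Rusta API.

```python
# Tiny sketch of message-passing processes in the spirit of Rusta: each
# process reacts to messages from its inbox and replies through an outbox,
# so the runtime is free to relocate it between messages.
import asyncio

async def process(name, inbox, outbox=None):
    """A process loop: consume messages until a shutdown sentinel arrives."""
    while True:
        msg = await inbox.get()
        if msg is None:          # shutdown sentinel
            break
        if outbox is not None:
            await outbox.put(f"{name} handled {msg}")

async def main():
    inbox, results = asyncio.Queue(), asyncio.Queue()
    worker = asyncio.create_task(process("worker", inbox, results))
    await inbox.put("task-1")
    reply = await results.get()
    await inbox.put(None)        # ask the process to shut down
    await worker
    return reply

print(asyncio.run(main()))
```

Because the process holds no references beyond its queues, a hub service could in principle reattach those queues to a replica on another client without the sender noticing.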
International Conference on Cloud Computing and Services Science | 2015
Robert Pettersen; Steffen Viken Valvåg; Åge Kvalnes; Dag Johansen
We demonstrate a practical way to reduce latency for mobile .NET applications that interact with cloud database services. We provide a programming abstraction for location-independent code, which has the potential to execute either locally or at a satellite execution environment in the cloud, in close proximity to the database service. This preserves a programmatic style of database access, and maintains a simple deployment model, but allows applications to offload latency-sensitive code to the cloud. Our evaluation shows that this approach can significantly improve the response time for applications that execute dependent queries, and that the required cloud-side resources are modest.
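Why offloading helps is easy to see with a rough latency model: a chain of dependent queries executed on the device pays one wide-area round trip per query, while the offloaded version pays a single wide-area round trip plus cheap hops next to the database. The figures below are illustrative assumptions, not the paper's measurements.

```python
# Rough latency model for a chain of n dependent queries (each query needs
# the previous result). All latency figures are assumed for illustration.
def local_chain_ms(n_queries, client_to_cloud_ms):
    """Each dependent query pays a full client-to-cloud round trip."""
    return n_queries * client_to_cloud_ms

def offloaded_chain_ms(n_queries, client_to_cloud_ms, near_db_ms):
    """One round trip to ship the code, then queries run next to the DB."""
    return client_to_cloud_ms + n_queries * near_db_ms

n, wan, lan = 5, 80.0, 2.0
print(f"local: {local_chain_ms(n, wan):.0f} ms")
print(f"offloaded: {offloaded_chain_ms(n, wan, lan):.0f} ms")
```

The advantage grows linearly with the length of the dependency chain, which matches the abstract's focus on applications that execute dependent queries.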
Networking, Architecture, and Storage | 2009
Steffen Viken Valvåg; Dag Johansen
Key/value databases are popular abstractions for applications that require synchronous single-key look-ups. However, such databases invariably have a random I/O access pattern, which is inefficient on traditional storage media. To maximize throughput, an alternative is to rely on asynchronous batch processing of requests. As applications evolve, changing requirements with regard to scale or load may thus lead to a redesign to increase the use of batch processing. We present a new abstraction that we have found useful in making such transitions: the update map. It aims to combine the convenience of a key/value database with the performance of a batch-oriented approach. The interface resembles that of an ordinary key/value database, but its implementation can rely on batch processing and sequential I/O, for improved throughput. We evaluate our new abstraction by comparing three different implementations and their performance trade-offs. Specifically, we identify some conditions under which update maps significantly outperform commonly deployed key/value databases. Finally, we discuss ways to improve generic batch processing systems like MapReduce as well as traditional key/value databases based on our findings.
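The shape of the update map interface can be sketched in a few lines: it looks like a key/value store, but updates are buffered and applied as one sorted batch, trading read freshness for batch-friendly sequential I/O. This is an illustrative in-memory sketch, not one of the paper's three implementations.

```python
# Illustrative sketch of an update map: key/value interface on the outside,
# buffered batch application on the inside.
class UpdateMap:
    def __init__(self):
        self._store = {}     # the base data (on disk in a real system)
        self._pending = []   # buffered updates awaiting a batch flush

    def update(self, key, value):
        self._pending.append((key, value))   # cheap append, no random I/O

    def flush(self):
        # Sorting pending updates by key is what would let a real
        # implementation merge them into on-disk runs with sequential I/O.
        for key, value in sorted(self._pending):
            self._store[key] = value
        self._pending.clear()

    def get(self, key):
        return self._store.get(key)          # state as of the last flush

m = UpdateMap()
m.update("b", 2)
m.update("a", 1)
print(m.get("a"))   # None: updates are not visible until the batch flush
m.flush()
print(m.get("a"))   # 1
```

The deferred visibility of `get` is the trade-off the abstract alludes to: workloads that tolerate it gain throughput, while workloads needing synchronous look-ups keep using an ordinary key/value database.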
International Conference on Cloud Computing and Services Science | 2016
Steffen Viken Valvåg; Robert Pettersen; Håvard D. Johansen; Dag Johansen
Distributed applications that span mobile devices, computing clusters, and the cloud require robust and flexible mechanisms for dynamically loading code. This paper describes LADY, a system that augments the .NET platform with a highly reliable mechanism for resolving and loading assemblies and arranges for safe execution of partially trusted code. Key benefits of LADY are the low latency and high availability achieved through its novel integration with DNS.
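The DNS-backed resolution idea can be sketched as a lookup table standing in for DNS records: an assembly name maps to a download location, so resolution inherits DNS's caching and availability, with a fallback origin for unlisted assemblies. All names and URLs below are hypothetical; this is not the LADY API.

```python
# Hypothetical sketch of DNS-backed assembly resolution in the spirit of
# LADY. The dict stands in for TXT-style DNS records; a real system would
# issue an actual DNS query and benefit from resolver-side caching.
DNS_RECORDS = {
    "Acme.Analytics": "https://cdn.example.com/assemblies/Acme.Analytics.dll",
}

def resolve_assembly(name, fallback="https://origin.example.com/{name}.dll"):
    """Return a location for the named assembly, preferring the DNS record."""
    location = DNS_RECORDS.get(name)
    return location if location is not None else fallback.format(name=name)

print(resolve_assembly("Acme.Analytics"))  # resolved via the "DNS" record
print(resolve_assembly("Unknown.Lib"))     # falls back to the origin server
```

Because DNS resolvers cache aggressively and are deployed everywhere, hits avoid contacting the origin at all, which is where the low latency and high availability claims come from.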
International Conference on Cloud Computing and Services Science | 2016
Robert Pettersen; Håvard D. Johansen; Steffen Viken Valvåg; Dag Johansen
Distributed applications that span mobile devices, computing clusters, and cloud services require robust and flexible mechanisms for dynamically loading code. This paper describes LADY: a system that augments the .NET platform with a highly reliable mechanism for resolving and loading assemblies, and arranges for safe execution of partially trusted code. Key benefits of LADY are the low latency and high availability achieved through its novel integration with DNS.