Ymir Vigfusson | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Ymir Vigfusson is active.

Explore More

Publication

Featured researches published by Ymir Vigfusson.

Proceedings of the 2nd Workshop on Large-Scale Distributed Systems and Middleware | 2008

Dr. Multicast: Rx for data center communication scalability

Ymir Vigfusson; Hussam Abu-Libdeh; Mahesh Balakrishnan; Kenneth P. Birman; Yoav Tock

Data centers avoid IP Multicast (IPMC) because of a series of problems with the technology. We introduce Dr. Multicast (MCMD), a system that maps IPMC operations to a combination of point-to-point unicast and traditional IPMC transmissions. MCMD optimizes the use of IPMC addresses within a data center, while simultaneously respecting an administrator-specified acceptable-use policy. We argue that with the resulting range of options, IPMC no longer represents a threat and can therefore be used much more widely.

european conference on computer systems | 2010

Dr. multicast: Rx for data center communication scalability

Ymir Vigfusson; Hussam Abu-Libdeh; Mahesh Balakrishnan; Kenneth P. Birman; Robert Burgess; Haoyuan Li; Yoav Tock

IP Multicast (IPMC) in data centers becomes disruptive when the technology is used by a large number of groups, a capability desired by event notification systems. We trace the problem to root causes, and introduce Dr. Multicast (MCMD), a system that eliminates the issue by mapping IPMC operations to a combination of point-to-point unicast and traditional IPMC transmissions guaranteed to be safe. MCMD optimizes the use of IPMC addresses within a data center by merging similar multicast groups in a principled fashion, while simultaneously respecting hardware limits expressed through administrator-controlled policies. The system is fully transparent, making it backward-compatible with commodity hardware and software found in modern data centers. Experimental evaluation shows that MCMD allows a large number of IPMC groups to be used without disruption, restoring a powerful group communication primitive to its traditional role.

distributed event-based systems | 2010

Magnet: practical subscription clustering for Internet-scale publish/subscribe

Sarunas Girdzijauskas; Ymir Vigfusson; Yoav Tock; Roie Melamed

An effective means for building Internet-scale distributed applications, and in particular those involving group-based information sharing, is to deploy peer-to-peer overlay networks. The key pre-requisite for supporting these types of applications on top of the overlays is efficient distribution of messages to multiple subscribers dispersed across numerous multicast groups. In this paper, we introduce Magnet: a peer-to-peer publish/subscribe system which achieves efficient message distribution by dynamically organizing peers with similar subscriptions into dissemination structures which preserve locality in the subscription space. Magnet is able to significantly reduce the message propagation costs by taking advantage of subscription correlations present in many large-scale group-based applications. We evaluate Magnet by comparing its performance against a strawman pub/sub system which does not cluster similar subscriptions by simulation. We find that Magnet outperforms the strawman by a substantial margin on clustered subscription workloads produced using both generative models and real application traces.

symposium on cloud computing | 2013

Dynamic performance profiling of cloud caches

Hjörtur Björnsson; Trausti Sæmundsson; Ymir Vigfusson

In-memory object caches, such as memcached, are critical to the success of popular web sites, such as Facebook [3], by reducing database load and improving scalability [2]. The prominence of caches implies that configuring their ideal memory size has the potential for significant savings on computation resources and energy costs, but unfortunately cache configuration is poorly understood. The modern practice of manually tweaking live caching systems takes significant effort and may both increase the variance for client request latencies and impose high load on the database backend.

Proceedings of the 4th International Workshop on Large Scale Distributed Systems and Middleware | 2010

Data caching as a cloud service

Guy Laden; Ymir Vigfusson

We discuss the challenges of devising a useful shared data cache service as a part of the cloud platform. Outside the cloud, such a service appeals to developers for two main reasons. Most importantly, data caches reduce the response latency experienced by users. For example, rendering a content page with various personalized boxes as part of a user web session often involves numerous database lookups. Therefore, if the content generation involves cheap memory accesses to a data cache instead of actual database queries, the user experiences will improve. Moreover, data caches are simple to use: they normally expose a simple get/set interface akin to key/value stores and a rudimentary mechanism to expire values [2], thus allowing result-based caching to be seamlessly integrated with existing database-driven code.

Ibm Journal of Research and Development | 2011

Design and implementation of caching services in the cloud

Guy Laden; Ymir Vigfusson

Data caching is a key paradigm for improving the performance of web services in terms of both end-user latency and database load. Such caching is becoming an essential component of any application or service designed for the cloud platform. In order to allow hosted applications to benefit from caching capabilities while avoiding dependence on explicit implementations and idiosyncrasies of internal caches, the caching services should be offered by a cloud provider as an integral part of its platform-as-a-service portfolio. We highlight various challenges associated with supporting cloud-based caching services, such as identifying the appropriate metering and service models, performance management, and resource sharing across cloud tenants. We also describe how these challenges were addressed by our prototype implementation, which is called Simple Cache for Cloud (SC2). We demonstrate the effectiveness of these techniques by experimentally evaluating our prototype on a synthetic multitenant workload.

hot topics in networks | 2014

Characterizing Load Imbalance in Real-World Networked Caches

Qi Huang; Helga Gudmundsdottir; Ymir Vigfusson; Daniel A. Freedman; Kenneth P. Birman; Robbert van Renesse

Modern Web services rely extensively upon a tier of in-memory caches to reduce request latencies and alleviate load on backend servers. Within a given cache, items are typically partitioned across cache servers via consistent hashing, with the goal of balancing the number of items maintained by each cache server. Effects of consistent hashing vary by associated hashing function and partitioning ratio. Most real-world workloads are also skewed, with some items significantly more popular than others. Inefficiency in addressing both issues can create an imbalance in cache-server loads. We analyze the degree of observed load imbalance, focusing on read-only traffic against Facebooks graph cache tier in TAO. We investigate the principal causes of load imbalance, including data co-location, non-ideal hashing scenarios, and hot-spot temporal effects. We also employ trace-drive analytics to study the benefits and limitations of current load-balancing methods, suggesting areas for future research.

IEEE Transactions on Computers | 2009

Slicing Distributed Systems

Vincent Gramoli; Ymir Vigfusson; Kenneth P. Birman; Anne-Marie Kermarrec; R. van Renesse

Peer-to-peer (P2P) architectures are popular for tasks such as collaborative download, VoIP telephony, and backup. To maximize performance in the face of widely variable storage capacities and bandwidths, such systems typically need to shift work from poor nodes to richer ones. Similar requirements are seen in todays large data centers, where machines may have widely variable configurations, loads, and performance. In this paper, we consider the slicing problem, which involves partitioning the participating nodes into k subsets using a one-dimensional attribute, and updating the partition as the set of nodes and their associated attributes change. The mechanism thus facilitates the development of adaptive systems. We begin by motivating this problem statement and reviewing prior work. Existing algorithms are shown to have problems with convergence, manifesting as inaccurate slice assignments, and to adapt slowly as conditions change. Our protocol, Sliver, has provably rapid convergence, is robust under stress and is simple to implement. We present both theoretical and experimental evaluations of the protocol.

Proceedings of the 4th International Workshop on Large Scale Distributed Systems and Middleware | 2010

Sources of instability in data center multicast

Dmitry Basin; Kenneth P. Birman; Idit Keidar; Ymir Vigfusson

When replicating data in a cloud computing setting, it is common to send updates using reliable dissemination mechanisms such as network overlay trees. We show that as data centers scale up, such multicast schemes manifest various performance and stability problems.

very large data bases | 2009

Adaptively parallelizing distributed range queries

Ymir Vigfusson; Adam Silberstein; Brian F. Cooper; Rodrigo Fonseca

We consider the problem of how to best parallelize range queries in a massive scale distributed database. In traditional systems the focus has been on maximizing parallelism, for example by laying out data to achieve the highest throughput. However, in a massive scale database such as our PNUTS system [11] or BigTable [10], maximizing parallelism is not necessarily the best strategy: the system has more than enough servers to saturate a single client by returning results faster than the client can consume them, and when there are multiple concurrent queries, maximizing parallelism for all of them will cause disk contention, reducing everybodys performance. How can we find the right parallelism level for each query in order to achieve high, consistent throughput for all queries? We propose an adaptive approach with two aspects. First, we adaptively determine the ideal parallelism for a single query execution, which is the minimum number of parallel scanning servers needed to satisfy the client, depending on query selectivity, client load, client-server bandwidth, and so on. Second, we adaptively schedule which servers will be assigned to different query executions, to minimize disk contention on servers and ensure that all queries receive good performance. Our scheduler can be tuned based on different policies, such as favoring short versus long queries or high versus low priority queries. An experimental study demonstrates the effectiveness of our techniques in the PNUTS system.

Explore More