Is this you? Create Your Porfile

Michael A. Sevilla

University of California, Santa Cruz

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Michael A. Sevilla is active.

Explore More

Publication

Featured researches published by Michael A. Sevilla.

ieee international conference on high performance computing data and analytics | 2015

Mantle: a programmable metadata load balancer for the ceph file system

Michael A. Sevilla; Noah Watkins; Carlos Maltzahn; Ike Nassi; Scott A. Brandt; Sage A. Weil; Greg Farnum; Sam Palo Alto Fineberg

Migrating resources is a useful tool for balancing load in a distributed system, but it is difficult to determine when to move resources, where to move resources, and how much of them to move. We look at resource migration for file system metadata and show how CephFSs dynamic subtree partitioning approach can exploit varying degrees of locality and balance because it can partition the namespace into variable sized units. Unfortunately, the current metadata balancer is complicated and difficult to control because it struggles to address many of the general resource migration challenges inherent to the metadata management problem. To help decouple policy from mechanism, we introduce a programmable storage system that lets the designer inject custom balancing logic. We show the flexibility and transparency of this approach by replicating the strategy of a state-of-the-art metadata balancer and conclude by comparing this strategy to other custom balancers on the same system.

Proceedings of the 2013 International Workshop on Data-Intensive Scalable Computing Systems | 2013

A framework for an in-depth comparison of scale-up and scale-out

Michael A. Sevilla; Ike Nassi; Kleoni Ioannidou; Scott A. Brandt; Carlos Maltzahn

When data grows too large, we scale to larger systems, either by scaling out or up. It is understood that scale-out and scale-up have different complexities and bottlenecks but a thorough comparison of the two architectures is challenging because of the diversity of their programming interfaces, their significantly different system environments, and their sensitivity to workload specifics. In this paper, we propose a novel comparison framework based on MapReduce that accounts for the application, its requirements, and its input size by considering input, software, and hardware parameters. Part of this framework requires implementing scale-out properties on scale-up and we discuss the complex trade-offs, interactions, and dependencies of these properties for two specific case studies (word count and sort). This work lays the foundation for future work in quantifying design decisions and in building a system that automatically compares architectures and selects the best one.

international parallel and distributed processing symposium | 2017

The Popper Convention: Making Reproducible Systems Evaluation Practical

Ivo Jimenez; Michael A. Sevilla; Noah Watkins; Carlos Maltzahn; Jay F. Lofstead; Kathryn Mohror; Andrea C. Arpaci-Dusseau; Remzi H. Arpaci-Dusseau

Independent validation of experimental results in the field of systems research is a challenging task, mainly due to differences in software and hardware in computational environments. Recreating an environment that resembles the original is difficult and time-consuming. In this paper we introduce _Popper_, a convention based on a set of modern open source software (OSS) development principles for generating reproducible scientific publications. Concretely, we make the case for treating an article as an OSS project following a DevOps approach and applying software engineering best-practices to manage its associated artifacts and maintain the reproducibility of its findings. Popper leverages existing cloud-computing infrastructure and DevOps tools to produce academic articles that are easy to validate and extend. We present a use case that illustrates the usefulness of this approach. We show how, by following the _Popper_ convention, reviewers and researchers can quickly get to the point of getting results without relying on the original authors intervention.

international parallel and distributed processing symposium | 2014

SupMR: Circumventing Disk and Memory Bandwidth Bottlenecks for Scale-up MapReduce

Michael A. Sevilla; Ike Nassi; Kleoni Ioannidou; Scott A. Brandt; Carlos Maltzahn

Reading input from primary storage (i.e. the ingest phase) and aggregating results (i.e. the merge phase) are important pre- and post-processing steps in large batch computations. Unfortunately, todays data sets are so large that the ingest and merge job phases are now performance bottlenecks. In this paper, we mitigate the ingest and merge bottlenecks by leveraging the scale-up MapReduce model. We introduce an ingest chunk pipeline and a merge optimization that increases CPU utilization (50-100%) and job phase speedups (1.16× - 3.13×) for the ingest and merge phases. Our techniques are based on well-known algorithms and scale-out MapReduce optimizations, but applying them to a scale-up computation framework to mitigate the ingest and merge bottlenecks is novel.

european conference on computer systems | 2017

Malacology: A Programmable Storage System

Michael A. Sevilla; Noah Watkins; Ivo Jimenez; Peter Alvaro; Shel Finkelstein; Jeff LeFevre; Carlos Maltzahn

Storage systems need to support high-performance for special-purpose data processing applications that run on an evolving storage device technology landscape. This puts tremendous pressure on storage systems to support rapid change both in terms of their interfaces and their performance. But adapting storage systems can be difficult because unprincipled changes might jeopardize years of code-hardening and performance optimization efforts that were necessary for users to entrust their data to the storage system. We introduce the programmable storage approach, which exposes internal services and abstractions of the storage stack as building blocks for higher-level services. We also build a prototype to explore how existing abstractions of common storage system services can be leveraged to adapt to the needs of new data processing systems and the increasing variety of storage devices. We illustrate the advantages and challenges of this approach by composing existing internal abstractions into two new higher-level services: a file system metadata load balancer and a high-performance distributed shared-log. The evaluation demonstrates that our services inherit desirable qualities of the back-end storage system, including the ability to balance load, efficiently propagate service metadata, recover from failure, and navigate trade-offs between latency and throughput using leases.

international conference on performance engineering | 2018

quiho : Automated Performance Regression Testing Using Inferred Resource Utilization Profiles

Ivo Jimenez; Noah Watkins; Michael A. Sevilla; Jay F. Lofstead; Carlos Maltzahn

We introduce quiho, a framework for profiling application performance that can be used in automated performance regression tests. quiho profiles an application by applying sensitivity analysis, in particular statistical regression analysis (SRA), using application-independent performance feature vectors that characterize the performance of machines. The result of the SRA, feature importance specifically, is used as a proxy to identify hardware and low-level system software behavior. The relative importance of these features serve as a performance profile of an application (termed inferred resource utilization profile or IRUP), which is used to automatically validate performance behavior across multiple revisions of an application»s code base without having to instrument code or obtain performance counters. We demonstrate that quiho can successfully discover performance regressions by showing its effectiveness in profiling application performance for synthetically introduced regressions as well as those found in real-world applications.

international parallel and distributed processing symposium | 2018