Publication


Featured research published by Benjamin Herta.


Very Large Data Bases | 2012

M3R: increased performance for in-memory Hadoop jobs

Avraham Shinnar; David Cunningham; Vijay A. Saraswat; Benjamin Herta

Main Memory Map Reduce (M3R) is a new implementation of the Hadoop Map Reduce (HMR) API targeted at online analytics on high mean-time-to-failure clusters. It does not support resilience, and supports only those workloads which can fit into cluster memory. In return, it can run HMR jobs unchanged -- including jobs produced by compilers for higher-level languages such as Pig, Jaql, and SystemML and interactive front-ends like IBM BigSheets -- while providing significantly better performance than the Hadoop engine on several workloads (e.g. 45x on some input sizes for sparse matrix vector multiply). M3R also supports extensions to the HMR API which can enable Map Reduce jobs to run faster on the M3R engine, while not affecting their performance under the Hadoop engine.
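The core idea is easy to sketch: keep the shuffled key/value groups in ordinary in-memory data structures instead of spilling them to disk between phases, while leaving the user's mapper and reducer untouched. A minimal Python illustration of that execution model (not the actual M3R engine, which implements the Hadoop Java API on X10):

```python
from collections import defaultdict

def in_memory_map_reduce(records, mapper, reducer):
    """Run a MapReduce job entirely in memory: no disk spill and no
    serialization between the map and reduce phases (the idea behind
    M3R's speedups, in miniature)."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):   # map phase
            groups[key].append(value)       # in-memory "shuffle"
    return {key: reducer(key, values) for key, values in groups.items()}

# The same mapper/reducer pair could run unchanged on a disk-based
# engine; only the framework underneath changes.
def wc_mapper(line):
    for word in line.split():
        yield word, 1

def wc_reducer(word, counts):
    return sum(counts)

print(in_memory_map_reduce(["to be or not to be"], wc_mapper, wc_reducer))
# {'to': 2, 'be': 2, 'or': 1, 'not': 1}
```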


ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming | 2014

Resilient X10: efficient failure-aware programming

David Cunningham; David Grove; Benjamin Herta; Arun Iyengar; Kiyokuni Kawachiya; Hiroki Murata; Vijay A. Saraswat; Mikio Takeuchi; Olivier Tardieu

Scale-out programs run on multiple processes in a cluster. In scale-out systems, processes can fail. Computations using traditional libraries such as MPI fail when any component process fails. The advent of Map Reduce, Resilient Data Sets and MillWheel has shown dramatic improvements in productivity are possible when a high-level programming framework handles scale-out and resilience automatically. We are concerned with the development of general-purpose languages that support resilient programming. In this paper we show how the X10 language and implementation can be extended to support resilience. In Resilient X10, places may fail asynchronously, causing loss of the data and tasks at the failed place. Failure is exposed through exceptions. We identify a Happens Before Invariance Principle and require the runtime to automatically repair the global control structure of the program to maintain this principle. We show this reduces much of the burden of resilient programming. The programmer is only responsible for continuing execution with fewer computational resources and the loss of part of the heap, and can do so while taking advantage of domain knowledge. We build a complete implementation of the language, capable of executing benchmark applications on hundreds of nodes. We describe the algorithms required to make the language runtime resilient. We then give three applications, each with a different approach to fault tolerance (replay, decimation, and domain-level checkpointing). These can be executed at scale and survive node failure. We show that for these programs the overhead of resilience is a small fraction of overall runtime by comparing to equivalent non-resilient X10 programs. On one program we show end-to-end performance of Resilient X10 is ~100x faster than Hadoop.
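The replay style of fault tolerance mentioned above can be sketched in a few lines: a failed place surfaces as an exception, and the application resubmits the lost work to a survivor rather than aborting. A hedged Python illustration (the exception name and `run_at` helper are simplifications of the X10 mechanism, not its API):

```python
class DeadPlaceException(Exception):
    """Stand-in for how Resilient X10 surfaces place failure as an
    exception rather than a hang or a whole-job abort."""

def run_at(place, chunk, failed):
    if place in failed:
        raise DeadPlaceException(place)
    return sum(chunk)  # stand-in for a real per-place computation

def resilient_sum(data, places, failed):
    """Replay-style fault tolerance: if a place is dead, resubmit its
    chunk to a surviving place and continue with fewer resources."""
    survivors = [p for p in places if p not in failed]  # assumed non-empty
    chunks = [data[i::len(places)] for i in range(len(places))]
    total = 0
    for place, chunk in zip(places, chunks):
        try:
            total += run_at(place, chunk, failed)
        except DeadPlaceException:
            total += run_at(survivors[0], chunk, failed)  # replay on a survivor
    return total

print(resilient_sum(list(range(10)), places=[0, 1, 2, 3], failed={2}))  # 45
```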


ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming | 2014

GLB: lifeline-based global load balancing library in X10

Wei Zhang; Olivier Tardieu; David Grove; Benjamin Herta; Tomio Kamada; Vijay A. Saraswat; Mikio Takeuchi

We present GLB, a programming model and an associated implementation that can handle a wide range of irregular parallel programming problems running over large-scale distributed systems. GLB is applicable both to problems that are easily load-balanced via static scheduling and to problems that are hard to statically load balance. GLB hides the intricate synchronizations (e.g., inter-node communication, initialization and startup, load balancing, termination and result collection) from the users. GLB internally uses a version of the lifeline graph based work-stealing algorithm proposed by Saraswat et al. [25]. Users of GLB are simply required to write several pieces of sequential code that comply with the GLB interface. GLB then schedules and orchestrates the parallel execution of the code correctly and efficiently at scale. We have applied GLB to two representative benchmarks: Betweenness Centrality (BC) and Unbalanced Tree Search (UTS). Among them, BC can be statically load-balanced whereas UTS cannot. In either case, GLB scales well -- achieving nearly linear speedup on different computer architectures (Power, Blue Gene/Q, and K) -- up to 16K cores.
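The lifeline scheme is worth a small sketch: an idle worker first tries a few random steals, and only then falls back to a fixed "lifeline" neighbor. Below is a deliberately tiny single-process simulation of that policy on a UTS-style task tree; the function names and round-robin scheduling are illustrative assumptions, not the GLB API (which distributes this across X10 places):

```python
import random
from collections import deque

def glb_run(initial_tasks, expand, result, workers=4, steal_attempts=2, seed=0):
    """Miniature simulation of lifeline-based work stealing: idle
    workers try a few random steals, then turn to a fixed lifeline
    neighbor that donates work when it has surplus."""
    rng = random.Random(seed)
    queues = [deque() for _ in range(workers)]
    queues[0].extend(initial_tasks)          # all work starts at one worker
    total = 0
    while any(queues):
        for w in range(workers):
            q = queues[w]
            if not q:
                for _ in range(steal_attempts):      # random steal phase
                    victim = queues[rng.randrange(workers)]
                    if len(victim) > 1:
                        q.append(victim.pop())
                        break
                else:                                # lifeline phase
                    lifeline = queues[(w + 1) % workers]
                    if len(lifeline) > 1:
                        q.append(lifeline.pop())
            if q:
                task = q.popleft()
                total += result(task)
                q.extend(expand(task))   # tasks spawn children (UTS-style)
    return total

def expand(n):
    # Toy tree: node n spawns two children of depth n-1 (2^(n+1)-1 nodes).
    return [n - 1, n - 1] if n > 0 else []

print(glb_run([5], expand, result=lambda t: 1))  # 63 nodes visited
```

The node count is independent of the stealing schedule, which is exactly the property that lets the framework rebalance freely behind a sequential-looking user interface.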


IBM Journal of Research and Development | 2016

META: Middleware for Events, Transactions, and Analytics

Matthew Arnold; David Grove; Benjamin Herta; Michael Hind; Martin Hirzel; Arun Iyengar; Louis Mandel; Vijay A. Saraswat; Avraham Shinnar; Jérôme Siméon; Mikio Takeuchi; Olivier Tardieu; Wei Zhang

Businesses that receive events in the form of messages and react to them quickly can take advantage of opportunities and avoid risks as they occur. Since quick reactions are important, event processing middleware is a core technology in many businesses. However, the need to act quickly must be balanced against the need to act profitably, and the best action often depends on more context than just the latest event. Unfortunately, the context is often too large to analyze in the time allotted to processing an event. Instead, out-of-band analytics can train an analytical model, against which an event can be quickly scored. We built middleware that combines transactional event processing with analytics, using a data store to bridge between the two. Since the integration happens in the middleware, solution developers need not integrate technologies for events and analytics by hand. At the surface, our Middleware for Events, Transactions, and Analytics (META) offers a unified rule-based programming model. Internally, META uses the X10 distributed programming language. A core technical challenge involved ensuring that the solutions are highly available on unreliable commodity hardware, and continuously available through updates. This paper describes the programming model of META, its architecture, and its distributed runtime system.
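The split between the fast transactional path and out-of-band analytics can be made concrete with a toy sketch. Here a shared store bridges the two: a background step fits a simple anomaly model over accumulated history, and each incoming event is scored against that pre-trained model without scanning the history. All names and the mean/standard-deviation "model" are illustrative assumptions, not META's programming model:

```python
from statistics import mean, pstdev

class Store:
    """Stand-in for the data store bridging events and analytics."""
    def __init__(self):
        self.history, self.model = [], None

def train(store):
    # Out-of-band analytics: periodically fit a model over the full context.
    mu = mean(store.history)
    sigma = pstdev(store.history) or 1.0
    store.model = (mu, sigma)

def on_event(store, value, threshold=3.0):
    # Transactional path: record the event, then score it against the
    # pre-trained model -- fast, because it never scans the history.
    store.history.append(value)
    if store.model is None:
        return "no-model"
    mu, sigma = store.model
    return "alert" if abs(value - mu) / sigma > threshold else "ok"

store = Store()
for v in [10, 11, 9, 10, 12, 10, 9, 11]:
    on_event(store, v)
train(store)                  # analytics runs out of band
print(on_event(store, 10))    # ok
print(on_event(store, 80))    # alert
```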


Theory and Applications of Satisfiability Testing | 2012

SatX10: a scalable plug&play parallel SAT framework

Bard Bloom; David Grove; Benjamin Herta; Ashish Sabharwal; Horst Samulowitz; Vijay A. Saraswat

We propose a framework for SAT researchers to conveniently try out new ideas in the context of parallel SAT solving without the burden of dealing with all the underlying system issues that arise when implementing a massively parallel algorithm. The framework is based on the parallel execution language X10, and allows the parallel solver to easily run on both a single machine with multiple cores and across multiple machines, sharing information such as learned clauses.
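The "plug & play" idea is that a solver author only implements a small interface (a search step plus hooks to import clauses shared by peers), and the framework owns scheduling and communication. A hedged, single-process Python caricature of that division of labor (brute-force "solvers" and round-robin scheduling stand in for real SAT engines and X10 places; none of these names are SatX10's API):

```python
import itertools

def satisfies(assignment, clauses):
    """CNF check: a clause is a list of ints; literal i > 0 means
    variable i is true, i < 0 means variable i is false."""
    return all(any((lit > 0) == assignment[abs(lit)] for lit in clause)
               for clause in clauses)

class BruteForceSolver:
    """A deliberately naive solver that plugs into the framework: it
    only exposes a bounded search step and a clause-import hook."""
    def __init__(self, clauses, n_vars, order):
        self.clauses = list(clauses)
        self.order = order                     # each solver's own search order
        self.space = itertools.product([False, True], repeat=n_vars)
        self.done = False

    def import_clause(self, clause):
        self.clauses.append(clause)            # prune with a peer's clause

    def step(self, budget):
        chunk = list(itertools.islice(self.space, budget))
        if not chunk:
            self.done = True
            return None
        for bits in chunk:
            assignment = {v: bits[i] for i, v in enumerate(self.order)}
            if satisfies(assignment, self.clauses):
                return assignment
        return None

def portfolio_solve(clauses, n_vars, shared=()):
    """Framework role: run differently configured solvers side by side
    and broadcast shared (e.g. learned) clauses to all of them."""
    orders = [list(range(1, n_vars + 1)), list(range(n_vars, 0, -1))]
    solvers = [BruteForceSolver(clauses, n_vars, o) for o in orders]
    for s in solvers:
        for clause in shared:
            s.import_clause(clause)
    while not all(s.done for s in solvers):
        for s in solvers:
            model = s.step(budget=2)
            if model is not None:
                return model
    return None  # search space exhausted: unsatisfiable

cnf = [[1, 2], [-1, 3], [-2, -3]]   # (x1 v x2) & (!x1 v x3) & (!x2 v !x3)
model = portfolio_solve(cnf, n_vars=3)
print(model is not None and satisfies(model, cnf))  # True
```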


Proceedings of the 6th ACM SIGPLAN Workshop on X10 | 2016

Resilient X10 over MPI user level failure mitigation

Sara S. Hamouda; Benjamin Herta; Josh Milthorpe; David Grove; Olivier Tardieu

Many PGAS languages and libraries rely on high performance transport layers such as GASNet and MPI to achieve low communication latency, portability and scalability. As systems increase in scale, failures are expected to become normal events rather than exceptions. Unfortunately, GASNet and standard MPI do not provide fault tolerance capabilities. This limitation hinders PGAS languages and other high-level programming models from supporting resilience at scale. For this reason, Resilient X10 has previously been supported over sockets only, not over MPI. This paper describes the use of a fault tolerant MPI implementation, called ULFM (User Level Failure Mitigation), as a transport layer for Resilient X10. By providing fault tolerant collective and agreement algorithms, on demand failure propagation, and support for InfiniBand, ULFM provides the required infrastructure to create a high performance transport layer for Resilient X10. We show that replacing X10’s emulated collectives with ULFM’s blocking collectives results in significant performance improvements. For three iterative SPMD-style applications running on 1000 X10 places, the improvement ranged between 30% and 51%. The per-step overhead for resilience was less than 9%. A proposal for adding ULFM to the coming MPI-4 standard is currently under assessment by the MPI Forum. Our results show that adding user-level fault tolerance support in MPI makes it a suitable base for resilience in high-level programming models.
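ULFM's basic contract can be sketched without MPI: a collective involving a failed process returns an error instead of hanging, the survivors shrink the communicator, and the operation is retried. A toy sequential Python model of that shrink-and-retry loop (the exception class and `contribute` callback are illustrative; the shrink step plays the role of ULFM's `MPI_Comm_shrink`):

```python
class RankFailure(Exception):
    """Stand-in for the process-failure errors ULFM reports from
    collectives instead of deadlocking."""

def allreduce_sum(ranks, contribute, failed):
    """Fault-tolerant collective: on failure, survivors shrink the
    communicator and retry the reduction."""
    comm = list(ranks)
    while True:
        try:
            total = 0
            for r in comm:
                if r in failed:
                    raise RankFailure(r)   # failure surfaces as an error
                total += contribute(r)
            return total, comm
        except RankFailure:
            comm = [r for r in comm if r not in failed]  # shrink and retry

total, survivors = allreduce_sum(range(4), contribute=lambda r: r + 1, failed={2})
print(total, survivors)  # 7 [0, 1, 3]
```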


Parallel Computing | 2016

X10 and APGAS at Petascale

Olivier Tardieu; Benjamin Herta; David Cunningham; David Grove; Prabhanjan Kambadur; Vijay A. Saraswat; Avraham Shinnar; Mikio Takeuchi; Mandana Vaziri; Wei Zhang

X10 is a high-performance, high-productivity programming language aimed at large-scale distributed and shared-memory parallel applications. It is based on the Asynchronous Partitioned Global Address Space (APGAS) programming model, supporting the same fine-grained concurrency mechanisms within and across shared-memory nodes. We demonstrate that X10 delivers solid performance at petascale by running (weak scaling) eight application kernels on an IBM Power 775 supercomputer utilizing up to 55,680 Power7 cores (for 1.7Pflop/s of theoretical peak performance). For the four HPC Class 2 Challenge benchmarks, X10 achieves 41% to 87% of the system’s potential at scale (as measured by IBM’s HPCC Class 1 optimized runs). We also implement K-Means, Smith-Waterman, Betweenness Centrality, and Unbalanced Tree Search (UTS) for geometric trees. Our UTS implementation is the first to scale to petaflop systems. We describe the advances in distributed termination detection, distributed load balancing, and use of high-performance interconnects that enable X10 to scale out to tens of thousands of cores. We discuss how this work is driving the evolution of the X10 language, core class libraries, and runtime systems.


Archive | 2012

Transparent efficiency for in-memory execution of map reduce job sequences

David Cunningham; Benjamin Herta; Vijay A. Saraswat; Avraham Shinnar


IEEE Intelligent Vehicles Symposium (IV) | 2011

A fast and robust intelligent headlight controller for vehicles

Jonathan H. Connell; Benjamin Herta; Sharathchandra U. Pankanti; Holger Hess; Sebastian Pliefke


Archive | 2009

Software application cluster layout pattern

Benjamin Herta
