Publication


Featured research published by Lavanya Ramakrishnan.


IEEE International Conference on Cloud Computing Technology and Science | 2010

Performance Analysis of High Performance Computing Applications on the Amazon Web Services Cloud

Keith Jackson; Lavanya Ramakrishnan; Krishna Muriki; Shane Canon; Shreyas Cholia; John Shalf; Harvey Wasserman; Nicholas J. Wright

Cloud computing has seen tremendous growth, particularly for commercial web applications. The on-demand, pay-as-you-go model creates a flexible and cost-effective means to access compute resources. For these reasons, the scientific computing community has shown increasing interest in exploring cloud computing. However, the underlying implementation and performance of clouds are very different from those at traditional supercomputing centers. It is therefore critical to evaluate the performance of HPC applications in today’s cloud environments to understand the tradeoffs inherent in migrating to the cloud. This work represents the most comprehensive evaluation to date comparing conventional HPC platforms to Amazon EC2, using real applications representative of the workload at a typical supercomputing center. Overall results indicate that EC2 is six times slower than a typical mid-range Linux cluster, and twenty times slower than a modern HPC system. The interconnect on the EC2 cloud platform severely limits performance and causes significant variability.


Cluster Computing | 2002

Programming the Grid: Distributed Software Components, P2P and Grid Web Services for Scientific Applications

Dennis Gannon; Randall Bramley; Geoffrey C. Fox; Shava Smallen; Al Rossi; Rachana Ananthakrishnan; Felipe Bertrand; Kenneth Chiu; Matt Farrellee; Madhusudhan Govindaraju; Sriram Krishnan; Lavanya Ramakrishnan; Yogesh Simmhan; Aleksander Slominski; Yu Ma; Caroline Olariu; Nicolas Rey-Cenvaz

Computational Grids [17,25] have become an important asset in large-scale scientific and engineering research. By providing a set of services that allow a widely distributed collection of resources to be tied together into a relatively seamless computing framework, teams of researchers can collaborate to solve problems that they could not have attempted before. Unfortunately, the task of building Grid applications remains extremely difficult because there are few tools available to support developers. To build reliable and reusable Grid applications, programmers must be equipped with a programming framework that hides the details of most Grid services and gives the developer a consistent, uncomplicated model in which applications can be composed from well-tested, reliable sub-units. This paper describes experiences with using a software component framework for building Grid applications. The framework, which is based on the DOE Common Component Architecture (CCA) [1,2,3,8], allows individual components to export function/service interfaces that can be remotely invoked by other components. The framework also provides a simple messaging/event system for asynchronous notification between application components. The paper also describes how the emerging Web-services [52] model fits with a component-oriented application design philosophy. To illustrate the connection between Web services and Grid application programming, we describe a simple design pattern for application factory services that can be used to simplify the task of building reliable Grid programs. Finally, we address several issues of Grid programming that are better understood from the perspective of Peer-to-Peer (P2P) systems. In particular, we describe how models for collaboration and resource sharing fit well with many Grid application scenarios.
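The application-factory design pattern mentioned in the abstract can be sketched roughly as follows. This is an illustrative Python sketch only, not the CCA or Globus API: the class and method names (`ApplicationFactory`, `create_instance`, `invoke`) are hypothetical stand-ins for the remotely invocable component interfaces the paper describes.

```python
import uuid

class ApplicationInstance:
    """A running application component exposing invocable operations."""
    def __init__(self, app_name, config):
        self.app_name = app_name
        self.config = config
        self.handle = str(uuid.uuid4())  # opaque handle a client would hold

    def invoke(self, operation, *args):
        # Stand-in for a remote method invocation on the component.
        return f"{self.app_name}.{operation}{args}"

class ApplicationFactory:
    """Factory service: clients ask it to start well-tested application
    sub-units instead of assembling Grid services by hand."""
    def __init__(self):
        self._instances = {}

    def create_instance(self, app_name, config=None):
        inst = ApplicationInstance(app_name, config or {})
        self._instances[inst.handle] = inst
        return inst.handle  # only the opaque handle crosses the "wire"

    def lookup(self, handle):
        return self._instances[handle]

# A client obtains a handle from the factory, then invokes operations on it.
factory = ApplicationFactory()
h = factory.create_instance("weather-sim", {"grid": "coarse"})
result = factory.lookup(h).invoke("run", 10)
```

The point of the pattern is that the client never constructs the component itself; it only ever receives a handle from a service that knows how to instantiate a reliable, preconfigured sub-unit.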


Computing in Science and Engineering | 2005

Service-Oriented Environments for Dynamically Interacting with Mesoscale Weather

Kelvin K. Droegemeier; Dennis Gannon; Daniel A. Reed; Beth Plale; Jay Alameda; Tom Baltzer; Keith Brewster; Richard D. Clark; Ben Domenico; Sara J. Graves; Everette Joseph; Donald Murray; Mohan Ramamurthy; Lavanya Ramakrishnan; John A. Rushing; Daniel B. Weber; Robert B. Wilhelmson; Anne Wilson; Ming Xue; Sepideh Yalda

Within a decade after John von Neumann and colleagues conducted the first experimental weather forecast on the ENIAC computer in the late 1940s, numerical models of the atmosphere became the foundation of modern-day weather forecasting and one of the driving application areas in computer science. This article describes research that is enabling a major shift toward dynamically adaptive responses to rapidly changing environmental conditions.


Scientific Cloud Computing | 2013

Performance evaluation of a MongoDB and Hadoop platform for scientific data analysis

Elif Dede; Madhusudhan Govindaraju; Daniel K. Gunter; Richard Shane Canon; Lavanya Ramakrishnan

Scientific facilities such as the Advanced Light Source (ALS) and the Joint Genome Institute, and projects such as the Materials Project, have an increasing need to capture, store, and analyze dynamic semi-structured data and metadata. A similar growth of semi-structured data within large Internet service providers has led to the creation of NoSQL data stores for scalable indexing and of MapReduce for scalable parallel analysis. MapReduce and NoSQL stores have been applied to scientific data. Hadoop, the most popular open-source implementation of MapReduce, has been evaluated, utilized, and modified to address the needs of different scientific analysis problems. ALS and the Materials Project are using MongoDB, a document-oriented NoSQL store. However, there is a limited understanding of the performance trade-offs of using these two technologies together. In this paper we evaluate the performance, scalability, and fault tolerance of using MongoDB with Hadoop, toward the goal of identifying the right software environment for scientific data analysis.
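The combination the abstract evaluates pairs MongoDB-style semi-structured documents with MapReduce-style analysis. A minimal sketch of that pairing, in plain Python with hypothetical records (no Hadoop or MongoDB dependency, and not the paper's actual benchmark), looks like this:

```python
from collections import defaultdict

# Illustrative semi-structured records, like documents in a MongoDB collection.
docs = [
    {"facility": "ALS", "beamline": "8.3.2", "size_mb": 120},
    {"facility": "ALS", "beamline": "5.0.2", "size_mb": 300},
    {"facility": "JGI", "beamline": None,    "size_mb": 950},
]

def map_phase(doc):
    # Emit (key, value) pairs, as a Hadoop mapper would.
    yield doc["facility"], doc["size_mb"]

def reduce_phase(key, values):
    # Aggregate all values for one key, as a Hadoop reducer would.
    return key, sum(values)

# Shuffle/sort stage: group mapper output by key.
groups = defaultdict(list)
for doc in docs:
    for k, v in map_phase(doc):
        groups[k].append(v)

totals = dict(reduce_phase(k, vs) for k, vs in sorted(groups.items()))
# totals == {"ALS": 420, "JGI": 950}
```

In the real deployment the documents would live in MongoDB and the map/shuffle/reduce stages would run on a Hadoop cluster; the performance question the paper studies is what that layering costs.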


Proceedings of the Second International Workshop on Data-Intensive Computing in the Clouds | 2011

I/O performance of virtualized cloud environments

Devarshi Ghoshal; Richard Shane Canon; Lavanya Ramakrishnan

The scientific community is exploring the suitability of cloud infrastructure for High Performance Computing (HPC) applications. The goal of Magellan, a project funded through DOE ASCR, is to investigate the potential role of cloud computing in addressing the computing needs of the Department of Energy's Office of Science, especially for mid-range computing and data-intensive applications that are not served through existing DOE centers today. Prior work has shown that applications with significant communication or I/O tend to perform poorly in virtualized cloud environments. However, there is a limited understanding of the I/O characteristics of virtualized cloud environments. This paper presents our results from benchmarking I/O performance on different cloud and HPC platforms to identify the major bottlenecks in existing infrastructure. We compare I/O performance using the IOR benchmark on two cloud platforms: Amazon and the Magellan cloud testbed. We analyze the performance of the different storage options available and of different instance types in multiple availability zones. Finally, we perform large-scale tests to analyze the variability in I/O patterns over time and region. Our results highlight the overhead and variability in I/O performance on both public and private cloud solutions, and will help applications choose effectively among the different storage options.
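The kind of measurement behind results like these can be sketched in miniature. The snippet below is a single-process streaming write/read timer in the spirit of an IOR-style test; it is an illustrative stand-in, not IOR itself (which adds MPI parallelism, configurable access patterns, and much more careful cache control).

```python
import os
import tempfile
import time

def io_throughput(path, size_mb=16, block_kb=1024):
    """Sequential write then read throughput in MB/s for one file."""
    block = b"x" * (block_kb * 1024)
    nblocks = size_mb * 1024 // block_kb

    t0 = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(nblocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # force data to the storage layer before timing stops
    write_mbps = size_mb / (time.perf_counter() - t0)

    t0 = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(block_kb * 1024):
            pass
    read_mbps = size_mb / (time.perf_counter() - t0)
    return write_mbps, read_mbps

with tempfile.TemporaryDirectory() as d:
    w, r = io_throughput(os.path.join(d, "testfile"), size_mb=4)
    print(f"write: {w:.1f} MB/s  read: {r:.1f} MB/s")
```

Repeating such a measurement across instance types, storage backends, and times of day is what surfaces the variability the paper reports; the read number here is also optimistic because the just-written file is likely still in the page cache.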


Scientific Cloud Computing | 2011

Magellan: experiences from a science cloud

Lavanya Ramakrishnan; Piotr T. Zbiegel; Scott Campbell; Rick Bradshaw; Richard Shane Canon; Susan Coghlan; Iwona Sakrejda; Narayan Desai; Tina Declerck; Anping Liu

Cloud resources promise to be an avenue for addressing new categories of scientific applications, including data-intensive science applications, on-demand/surge computing, and applications that require customized software environments. However, there is a limited understanding of how to operate and use clouds for scientific applications. Magellan, a project funded through the Department of Energy's (DOE) Advanced Scientific Computing Research (ASCR) program, is investigating the use of cloud computing for science at the Argonne Leadership Computing Facility (ALCF) and the National Energy Research Scientific Computing Facility (NERSC). In this paper, we detail the experiences to date at both sites and identify the gaps and open challenges from both a resource-provider and an application perspective.


Symposium on Cloud Computing | 2010

Defining future platform requirements for e-Science clouds

Lavanya Ramakrishnan; Keith Jackson; Shane Canon; Shreyas Cholia; John Shalf

Cloud computing has evolved in the commercial space to support highly asynchronous Web 2.0 applications. Scientific computing has traditionally been supported by centralized, federally funded supercomputing centers and grid resources, with a focus on bulk-synchronous compute- and data-intensive applications. The scientific computing community has shown increasing interest in exploring cloud computing to serve e-Science applications, with the idea of taking advantage of features such as customizable environments and on-demand resources. Magellan, a recently funded cloud computing project, is investigating how cloud computing can serve the needs of mid-range computing and future data-intensive scientific workloads. This paper summarizes the application requirements and the business model needed to support both existing and emerging science applications, as learned from early experiences on Magellan and commercial cloud environments. We provide an overview of the capabilities of leading cloud offerings and identify the existing gaps and challenges. Finally, we discuss how the existing cloud software stack may be evolved to better meet e-Science needs, along with the implications for resource providers and middleware developers.


Conference on High Performance Computing (Supercomputing) | 2006

Toward a doctrine of containment: grid hosting with adaptive resource control

Lavanya Ramakrishnan; David E. Irwin; Laura E. Grit; Aydan R. Yumerefendi; Adriana Iamnitchi; Jeffrey S. Chase

Grid computing environments need secure resource control and predictable service quality in order to be sustainable. We propose a grid hosting model in which independent, self-contained grid deployments run within isolated containers on shared resource provider sites. Sites and hosted grids interact via an underlying resource control plane to manage a dynamic binding of computational resources to containers. We present a prototype grid hosting system in which a set of independent Globus grids share a network of cluster sites. Each grid instance runs a coordinator that leases and configures cluster resources for its grid on demand. Experiments demonstrate adaptive provisioning of cluster resources and contrast job-level and container-level resource management in the context of two grid application managers.
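The lease-based interaction between a site and a per-grid coordinator can be sketched as a simple control loop. This is a hypothetical Python illustration of the idea, not the prototype's actual resource control plane; the class names and the one-lease-per-growth policy are assumptions made for brevity.

```python
import itertools

class ClusterSite:
    """Resource provider: hands out node leases to hosted grid containers."""
    def __init__(self, total_nodes):
        self.free = total_nodes
        self._ids = itertools.count(1)
        self.leases = {}  # lease id -> node count

    def lease(self, nodes):
        if nodes > self.free:
            raise RuntimeError("insufficient free nodes")
        self.free -= nodes
        lid = next(self._ids)
        self.leases[lid] = nodes
        return lid

    def release(self, lid):
        self.free += self.leases.pop(lid)

class GridCoordinator:
    """Per-grid coordinator: grows or shrinks its container as demand changes."""
    def __init__(self, site):
        self.site = site
        self.held = {}  # lease id -> node count

    def adapt(self, queued_jobs, nodes_per_job=1):
        want = queued_jobs * nodes_per_job
        have = sum(self.held.values())
        if want > have:
            lid = self.site.lease(want - have)   # grow the container
            self.held[lid] = want - have
        elif want < have and self.held:
            lid, _ = self.held.popitem()         # shrink by returning one lease
            self.site.release(lid)

site = ClusterSite(total_nodes=16)
grid_a = GridCoordinator(site)
grid_a.adapt(queued_jobs=6)  # container grows to hold 6 nodes
```

The key property the paper argues for is visible even in this toy: the site never inspects the grid's jobs, and the grid never touches raw nodes outside its leases, so the two can be administered independently.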


Journal of Parallel and Distributed Computing | 2015

Performance and energy efficiency of big data applications in cloud environments

Eugen Feller; Lavanya Ramakrishnan; Christine Morin

The exponential growth of scientific and business data has driven the evolution of cloud computing environments and the MapReduce parallel programming model. The focus of cloud computing is increased utilization and power savings through consolidation, while MapReduce enables large-scale data analysis. Hadoop, an open-source implementation of MapReduce, has gained popularity in the last few years. In this paper, we evaluate Hadoop performance both in the traditional model of collocated data and compute services and with the services separated. Separating the data and compute services provides more flexibility in environments where data locality might not have a considerable impact, such as virtualized environments and clusters with advanced networks. We also conduct an energy-efficiency evaluation of Hadoop on physical and virtual clusters in different configurations. Our extensive evaluation shows that: (1) coexisting virtual machines on servers decrease the disk throughput; (2) performance on physical clusters is significantly better than on virtual clusters; (3) performance degradation due to separation of the services depends on the data-to-compute ratio; and (4) application completion progress correlates with power consumption, and power consumption is heavily application specific. Finally, we present a discussion of the implications of using cloud environments for big data analyses. Highlights: coexisting VMs decrease disk throughput and thus application performance; Hadoop on VMs shows a significant performance decrease at larger data scales; separating the data and compute layers increases energy consumption; power profiles are application specific and correlate with the map/reduce phases.


International Conference on Cloud Computing | 2012

Evaluating Hadoop for Data-Intensive Scientific Operations

Zacharia Fadika; Madhusudhan Govindaraju; Richard Shane Canon; Lavanya Ramakrishnan

Emerging sensor networks, more capable instruments, and ever-increasing simulation scales are generating data at a rate that exceeds our ability to effectively manage, curate, analyze, and share it. Data-intensive computing is expected to revolutionize the next-generation software stack. Hadoop, an open-source implementation of the MapReduce model, provides a way for large data volumes to be seamlessly processed on large numbers of commodity computers. The inherent parallelization, synchronization, and fault tolerance the model offers make it ideal for highly parallel data-intensive applications. MapReduce and Hadoop have traditionally been used for web data processing and only recently for scientific applications. There is a limited understanding of the performance characteristics that data-intensive scientific applications can obtain from MapReduce and Hadoop. Thus, it is important to evaluate Hadoop specifically for data-intensive scientific operations -- filter, merge, and reorder -- to understand its various design considerations and performance trade-offs. In this paper, we evaluate Hadoop for these data operations in the context of High Performance Computing (HPC) environments to understand the impact of the file system, network, and programming models on performance.
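Two of the operations named above, filter and reorder, map naturally onto MapReduce's stages. The sketch below shows the idea in plain Python with made-up sensor records; it is illustrative only, not the paper's Hadoop benchmark code.

```python
# MapReduce-style "filter" and "reorder" over measurement records; plain
# Python standing in for Hadoop's map, shuffle/sort, and reduce stages.
records = [("sensor-a", 7.1), ("sensor-b", 0.4), ("sensor-a", 3.3),
           ("sensor-c", 9.8), ("sensor-b", 5.6)]

def filter_map(rec):
    # Filter: mappers emit only records at or above a threshold.
    sensor, value = rec
    if value >= 1.0:
        yield sensor, value

# Shuffle/sort stage: group mapper output by key. Because the framework
# delivers keys in sorted order, a global "reorder" falls out for free.
shuffled = {}
for rec in records:
    for k, v in filter_map(rec):
        shuffled.setdefault(k, []).append(v)

# Identity-style reduce: collect values per key, keys in sorted order.
result = [(k, sorted(shuffled[k])) for k in sorted(shuffled)]
# [('sensor-a', [3.3, 7.1]), ('sensor-b', [5.6]), ('sensor-c', [9.8])]
```

In Hadoop, the filter runs in the mappers (shrinking the data before it crosses the network), while the reorder is done by the shuffle/sort machinery, which is exactly where the file system and network effects the paper measures come into play.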

Collaboration


Dive into Lavanya Ramakrishnan's collaboration.

Top Co-Authors

Devarshi Ghoshal, Indiana University Bloomington
Elif Dede, Binghamton University
Valerie Hendrix, Lawrence Berkeley National Laboratory
Daniel A. Reed, Renaissance Computing Institute
Yogesh Simmhan, Indian Institute of Science
Beth Plale, Indiana University Bloomington
Deborah A. Agarwal, Lawrence Berkeley National Laboratory
Dennis Gannon, Indiana University Bloomington
Keith Jackson, Lawrence Berkeley National Laboratory