
Publication


Featured research published by Raghul Gunasekaran.


IEEE International Conference on High Performance Computing, Data, and Analytics | 2014

Best practices and lessons learned from deploying and operating large-scale data-centric parallel file systems

Sarp Oral; James A Simmons; Jason J Hill; Dustin B Leverman; Feiyi Wang; Matt Ezell; Ross Miller; Douglas Fuller; Raghul Gunasekaran; Young-Jae Kim; Saurabh Gupta; Devesh Tiwari; Sudharshan S. Vazhkudai; James H. Rogers; David A Dillow; Galen M. Shipman; Arthur S. Bland

The Oak Ridge Leadership Computing Facility (OLCF) has deployed multiple large-scale parallel file systems (PFS) to support its operations. During this process, OLCF acquired significant expertise in large-scale storage system design, file system software development, technology evaluation, benchmarking, procurement, deployment, and operational practices. Based on the lessons learned from each new PFS deployment, OLCF improved its operating procedures and strategies. This paper provides an account of our experience and lessons learned in acquiring, deploying, and operating large-scale parallel file systems. We believe that these lessons will be useful to the wider HPC community.


Petascale Data Storage Workshop | 2015

Comparative I/O workload characterization of two leadership class storage clusters

Raghul Gunasekaran; Sarp Oral; Jason J Hill; Ross Miller; Feiyi Wang; Dustin B Leverman

The Oak Ridge Leadership Computing Facility (OLCF) is a leader in large-scale parallel file system development, design, deployment, and continuous operation. Over the last decade, the OLCF has designed and deployed two large center-wide parallel file systems. The first instantiation, Spider 1, served the Jaguar supercomputer; its successor, Spider 2, now serves the Titan supercomputer, among many other OLCF computational resources. The OLCF has been rigorously collecting file and storage system statistics from these Spider systems since their transition to production. In this paper we present the collected I/O workload statistics from the Spider 2 system and compare them to the Spider 1 data. Our analysis shows that the Spider 2 workload is more write-heavy than Spider 1's (75% vs. 60% writes, respectively). The data also show that OLCF storage policies, such as periodic purges, are effectively managing Spider 2's capacity. Furthermore, due to improvements in the tdm_multipath and ib_srp software, we are utilizing the Spider 2 system's bandwidth and latency resources more effectively. The Spider 2 bandwidth usage statistics show that the system is operating within its design specifications. However, it is also evident that our scientific applications could be served more effectively by a burst buffer storage layer. All the data has been collected by monitoring tools developed for the Spider ecosystem. We believe the observed data set and insights will help us better design the next-generation Spider file and storage system, and will also help the larger community build more effective large-scale file and storage systems.
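The headline statistic in the abstract above, the share of write traffic in an I/O workload, can be computed from periodic server-side counters. The sketch below is purely illustrative (it is not the OLCF's actual tooling, and the sample values are hypothetical): it sums read/write byte deltas across samples and reports the write fraction.

```python
# Illustrative sketch: computing the write fraction of an I/O
# workload from server-side (read_bytes, written_bytes) counter
# deltas. Sample values are hypothetical.

def write_fraction(samples):
    """samples: iterable of (read_bytes, written_bytes) counter deltas."""
    total_read = sum(r for r, _ in samples)
    total_written = sum(w for _, w in samples)
    total = total_read + total_written
    return total_written / total if total else 0.0

# A toy workload resembling Spider 2's reported 75% write share:
samples = [(10_000, 30_000), (15_000, 45_000)]
print(f"{write_fraction(samples):.0%} writes")  # -> 75% writes
```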


International Conference on Big Data | 2016

Constellation: A science graph network for scalable data and knowledge discovery in extreme-scale scientific collaborations

Sudharshan S. Vazhkudai; John Harney; Raghul Gunasekaran; Dale Stansberry; Seung-Hwan Lim; Thomas E Barron; Andrew W Nash; Arvind Ramanathan

Constellation's overarching goal is the federation of information from resources within an extreme-scale scientific collaboration to enable the scalable discovery of data and new knowledge pathways. The resource fabric comprises petascale supercomputers and storage systems, users, jobs, datasets, and lifecycle artifacts. For an extreme-scale supercomputing center, normal operations can generate hundreds of millions of data products and metadata entries describing the resource fabric. Constellation federates the information extracted from the resources using a custom, transformative science graph network; constructs rich metadata indexes and higher-order derived metadata from the extracted information; and conducts scalable graph analytics to unravel hidden data pathways. Our implementation and deployment for a production supercomputing facility shows that the graph can scale to more than 750 million vertices, its domain-agnostic indexing can answer interesting science queries, and its analytics can aid in structural, topological, and temporal analysis to identify usage hotspots.
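The federation idea behind Constellation can be illustrated with a toy graph: users, jobs, and datasets become vertices, relationships become edges, and traversal exposes data pathways. The vertex naming scheme below is hypothetical, not the paper's actual model, and a production graph of 750 million vertices would of course use a real graph store rather than in-memory dicts.

```python
# Toy sketch of a science graph: vertices for users, jobs, and
# datasets, undirected edges for their relationships. Vertex ids
# and schema are hypothetical.
from collections import defaultdict

edges = defaultdict(set)

def link(a, b):
    """Add an undirected edge between vertices a and b."""
    edges[a].add(b)
    edges[b].add(a)

# user -> job -> dataset lineage
link("user:alice", "job:1234")
link("job:1234", "dataset:/proj/sim/run42")

def neighbors_of_type(vertex, prefix):
    """All vertices adjacent to `vertex` whose id starts with `prefix`."""
    return sorted(v for v in edges[vertex] if v.startswith(prefix))

print(neighbors_of_type("job:1234", "dataset:"))  # -> ['dataset:/proj/sim/run42']
```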


IEEE International Conference on High Performance Computing, Data, and Analytics | 2017

GUIDE: a scalable information directory service to collect, federate, and analyze logs for operational insights into a leadership HPC facility

Sudharshan S. Vazhkudai; Ross Miller; Devesh Tiwari; Christopher Zimmer; Feiyi Wang; Sarp Oral; Raghul Gunasekaran; Deryl Steinert

In this paper, we describe the GUIDE framework used to collect, federate, and analyze log data from the Oak Ridge Leadership Computing Facility (OLCF), and how we use that data to derive insights into facility operations. We collect system logs and extract monitoring data at every level of the various OLCF subsystems, and have developed a suite of pre-processing tools to make the raw data consumable. The cleansed logs are then ingested and federated into a central, scalable data warehouse, Splunk, that offers storage, indexing, querying, and visualization capabilities. We have further developed and deployed a set of tools to analyze these multiple disparate log streams in concert and derive operational insights. We describe our experience from developing and deploying the GUIDE infrastructure, and deriving valuable insights on the various subsystems, based on two years of operations in the production OLCF environment.
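The pre-processing step described above, turning raw log streams into records consumable by a data warehouse such as Splunk, can be sketched as a small normalizer. The line format and field names here are hypothetical assumptions for illustration, not GUIDE's actual formats.

```python
# Minimal sketch of log normalization before warehouse ingest:
# parse a hypothetical '<epoch> <subsystem> <message>' line into
# a uniform record, or reject lines that don't match.
import re
from datetime import datetime, timezone

LINE_RE = re.compile(r"^(?P<ts>\d{10})\s+(?P<subsystem>\w+)\s+(?P<msg>.*)$")

def normalize(raw_line):
    """Return a dict for one raw log line, or None if it doesn't parse."""
    m = LINE_RE.match(raw_line)
    if not m:
        return None
    return {
        "time": datetime.fromtimestamp(int(m["ts"]), tz=timezone.utc).isoformat(),
        "subsystem": m["subsystem"],
        "message": m["msg"].strip(),
    }

rec = normalize("1400000000 lustre OST0003 write latency spike")
print(rec["subsystem"])  # -> lustre
```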


USENIX Conference on File and Storage Technologies | 2014

Automatic identification of application I/O signatures from noisy server-side traces

Yang Liu; Raghul Gunasekaran; Xiaosong Ma; Sudharshan S. Vazhkudai


USENIX Conference on Hot Topics in Cloud Computing | 2012

Big data platforms as a service: challenges and approach

James Horey; Edmon Begoli; Raghul Gunasekaran; Seung-Hwan Lim; James J. Nutaro


Archive | 2010

Monitoring Tools for Large Scale Systems

Ross Miller; Jason J Hill; David A Dillow; Raghul Gunasekaran; Don Maxwell


Archive | 2012

A Next-Generation Parallel File System Environment for the OLCF

David A Dillow; Douglas Fuller; Raghul Gunasekaran; Young-Jae Kim; H Sarp Oral; Doug M Reitz; James A Simmons; Feiyi Wang; Galen M. Shipman; Jason J Hill


IEEE International Conference on High Performance Computing, Data, and Analytics | 2016

Server-side log data analytics for I/O workload characterization and coordination on large shared storage systems

Yang Liu; Raghul Gunasekaran; Xiaosong Ma; Sudharshan S. Vazhkudai


9th International Workshop on Feedback Computing (Feedback Computing '14) | 2014

Feedback Computing in Leadership Compute Systems

Raghul Gunasekaran; Young-Jae Kim

Collaboration


Explore Raghul Gunasekaran's collaboration network.

Top Co-Authors

David A Dillow (Oak Ridge National Laboratory)
Galen M. Shipman (Oak Ridge National Laboratory)
Jason J Hill (Oak Ridge National Laboratory)
Byung H. Park (Oak Ridge National Laboratory)
Feiyi Wang (Oak Ridge National Laboratory)
Ross Miller (Oak Ridge National Laboratory)
Sarp Oral (Oak Ridge National Laboratory)
Seung-Hwan Lim (Oak Ridge National Laboratory)