Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Junmin Gu is active.

Publication


Featured research published by Junmin Gu.


MSS | 2005

Storage resource managers: Middleware components for grid storage

Arie Shoshani; Alex Sim; Junmin Gu

The amount of scientific data generated by simulations or collected from large-scale experiments has reached levels that cannot be stored in the researcher's workstation or even in his/her local computer center. Such data are vital to large scientific collaborations dispersed over wide-area networks. In the past, the concept of a Grid infrastructure [1] mainly emphasized the computational aspect of supporting large distributed computational tasks, and optimizing the use of the network by using bandwidth reservation techniques. In this paper we discuss the concept of Storage Resource Managers (SRMs) as components that complement this with support for the storage management of large distributed datasets. Access to data is becoming the main bottleneck in such data-intensive applications because the data cannot be replicated at all sites. SRMs can be used to dynamically optimize the use of storage resources to help unclog this bottleneck.


Lawrence Berkeley National Laboratory | 2009

FastBit: interactively searching massive data

Kesheng Wu; Sean Ahern; Edward W Bethel; Jacqueline H. Chen; Hank Childs; E. Cormier-Michel; Cameron Geddes; Junmin Gu; Hans Hagen; Bernd Hamann; Wendy S. Koegler; Jerome Lauret; Jeremy S. Meredith; Peter Messmer; Ekow J. Otoo; V Perevoztchikov; A. M. Poskanzer; Prabhat; Oliver Rübel; Arie Shoshani; Alexander Sim; Kurt Stockinger; Gunther H. Weber; W. M. Zhang

As scientific instruments and computer simulations produce more and more data, the task of locating the essential information to gain insight becomes increasingly difficult. FastBit is an efficient software tool to address this challenge. In this article, we present a summary of the key underlying technologies, namely bitmap compression, encoding, and binning. Together these techniques enable FastBit to answer structured (SQL) queries orders of magnitude faster than popular database systems. To illustrate how FastBit is used in applications, we present three examples involving a high-energy physics experiment, a combustion simulation, and an accelerator simulation. In each case, FastBit significantly reduces the response time and enables interactive exploration on terabytes of data.
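
To make the binning-and-bitmap idea concrete, here is a minimal Python sketch of a binned bitmap index. It illustrates the general technique only; it is not FastBit's actual code or API, and the bin edges and sample values are invented for the example.

from bisect import bisect_right

class BinnedBitmapIndex:
    """Toy binned bitmap index: one bitmap (stored as a Python int) per bin."""
    def __init__(self, values, bin_edges):
        self.bin_edges = bin_edges                 # sorted upper edges of the bins
        self.bitmaps = [0] * (len(bin_edges) + 1)  # extra bin catches overflow values
        for row, v in enumerate(values):
            b = bisect_right(bin_edges, v)         # index of the bin holding v
            self.bitmaps[b] |= 1 << row            # set this row's bit in that bin

    def rows_below(self, threshold):
        """Rows whose value is < threshold, when threshold is a bin edge."""
        result = 0
        for b, edge in enumerate(self.bin_edges):
            if edge <= threshold:                  # whole bin satisfies v < threshold
                result |= self.bitmaps[b]          # range query = bitwise OR of bins
        return result

values = [0.1, 5.2, 3.3, 9.7, 1.8]
idx = BinnedBitmapIndex(values, bin_edges=[2.0, 4.0, 6.0, 10.0])
hits = idx.rows_below(4.0)
print([row for row in range(len(values)) if hits >> row & 1])   # -> [0, 2, 4]

In a real index each bitmap is compressed (FastBit uses word-aligned hybrid compression), which is what keeps high-cardinality columns compact while preserving fast bitwise operations.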


Journal of Physics: Conference Series | 2008

Storage Resource Manager Version 2.2: design, implementation, and testing experience

Flavia Donno; Lana Abadie; Paolo Badino; Jean Philippe Baud; Ezio Corso; Shaun De Witt; Patrick Fuhrmann; Junmin Gu; B. Koblitz; Sophie Lemaitre; Maarten Litmaath; Dimitry Litvintsev; Giuseppe Lo Presti; L. Magnoni; Gavin McCance; Tigran Mkrtchan; Rémi Mollon; Vijaya Natarajan; Timur Perelmutov; D. Petravick; Arie Shoshani; Alex Sim; David Smith; Paolo Tedesco; Riccardo Zappi

Storage services are crucial components of the Worldwide LHC Computing Grid infrastructure, which spans more than 200 sites and serves computing and storage resources to the High Energy Physics LHC communities. Up to tens of petabytes of data are collected every year by the four LHC experiments at CERN. To process these large data volumes it is important to establish a protocol and a very efficient interface to the various storage solutions adopted by the WLCG sites. In this work we report on the experience acquired during the definition of the Storage Resource Manager v2.2 protocol. In particular, we focus on the study performed to enhance the interface and make it suitable for use by the WLCG communities. At the moment, five different storage solutions implement the SRM v2.2 interface: BeStMan (LBNL), CASTOR (CERN and RAL), dCache (DESY and FNAL), DPM (CERN), and StoRM (INFN and ICTP). After a detailed review of the protocol, various test suites were written to identify the most effective set of tests: the S2 test suite from CERN and the SRM-Tester test suite from LBNL. These test suites have helped verify the consistency and coherence of the proposed protocol and validate existing implementations. We conclude by describing the results achieved.
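
As a rough illustration of what such interoperability testing involves, the Python sketch below runs one operation sequence against several SRM endpoints and flags divergent return codes. It is a hypothetical stand-in, not the S2 or SRM-Tester code; the endpoint URLs and the fake_call stub are invented, though the operation names are real SRM v2.2 functions.

from collections import defaultdict

# Real SRM v2.2 operation names; everything else below is invented.
TEST_SEQUENCE = ["srmPing", "srmReserveSpace", "srmPrepareToPut", "srmRm"]

def fake_call(endpoint, operation):
    # Stand-in for a real SRM client call; a real harness would issue the
    # web-service request and return the status code from the response.
    return "SRM_SUCCESS"

def check_consistency(endpoints):
    disagreements = defaultdict(dict)
    for op in TEST_SEQUENCE:
        results = {name: fake_call(url, op) for name, url in endpoints.items()}
        if len(set(results.values())) > 1:        # implementations disagree
            disagreements[op] = results
    return dict(disagreements)

endpoints = {                                     # invented example endpoints
    "BeStMan": "srm://bestman.example.org:8443",
    "dCache":  "srm://dcache.example.org:8443",
}
print(check_consistency(endpoints) or "all implementations agree")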


IEEE Conference on Mass Storage Systems and Technologies | 2007

Storage Resource Managers: Recent International Experience on Requirements and Multiple Co-Operating Implementations

Lana Abadie; Paolo Badino; J.-P. Baud; Ezio Corso; M. Crawford; S. De Witt; Flavia Donno; A. Forti; Ákos Frohner; Patrick Fuhrmann; G. Grosdidier; Junmin Gu; Jens Jensen; B. Koblitz; Sophie Lemaitre; Maarten Litmaath; D. Litvinsev; G. Lo Presti; L. Magnoni; T. Mkrtchan; Alexander Moibenko; Rémi Mollon; Vijaya Natarajan; Gene Oleynik; Timur Perelmutov; D. Petravick; Arie Shoshani; Alex Sim; David Smith; M. Sponza

Storage management is one of the most important enabling technologies for large-scale scientific investigations. Having to deal with multiple heterogeneous storage and file systems is one of the major bottlenecks in managing, replicating, and accessing files in distributed environments. Storage resource managers (SRMs), named after their Web services control protocol, provide the technology needed to manage the rapidly growing distributed data volumes that result from faster and larger computational facilities. SRMs are grid storage services providing interfaces to storage resources, as well as advanced functionality such as dynamic space allocation and file management on shared storage systems. They call on transport services to bring files into their space transparently and provide effective sharing of files. SRMs are based on a common specification that emerged over time and evolved into an international collaboration. This approach of an open specification that can be used by various institutions to adapt to their own storage systems has proven to be a remarkable success: the challenge has been to provide a consistent homogeneous interface to the grid, while allowing sites to have diverse infrastructures. In particular, supporting optional features while preserving interoperability is one of the main challenges we describe in this paper. We also describe the use of SRMs in a large international high energy physics collaboration, called WLCG, to prepare for the large volume of data expected when the Large Hadron Collider (LHC) goes online at CERN. This intense collaboration led to refinements and additional functionality in the SRM specification, and to the development of multiple interoperating implementations of SRM for various complex multi-component storage systems.
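
The "homogeneous interface over diverse infrastructures" point can be pictured with a small sketch: one grid-facing operation dispatches to site-specific backends, and optional capabilities are queried rather than assumed. All class and method names below are invented for illustration; this is not SRM code.

from abc import ABC, abstractmethod

class StorageBackend(ABC):
    """Common interface every site-specific backend must implement."""
    @abstractmethod
    def stage_in(self, file_id: str) -> str: ...
    @abstractmethod
    def supports_tape(self) -> bool: ...

class DiskOnlyBackend(StorageBackend):
    def stage_in(self, file_id):
        return f"/pool/{file_id}"          # disk files are always online
    def supports_tape(self):
        return False

class TapeBackedBackend(StorageBackend):
    def stage_in(self, file_id):
        return f"/cache/{file_id}"         # recalled from tape to a disk cache
    def supports_tape(self):
        return True

def bring_online(backend: StorageBackend, file_id: str) -> str:
    # The grid side issues one uniform operation; optional capabilities
    # are queried per backend, never assumed.
    kind = "tape-backed" if backend.supports_tape() else "disk-only"
    return f"{file_id} online at {backend.stage_in(file_id)} ({kind} site)"

print(bring_online(DiskOnlyBackend(), "run42.root"))
print(bring_online(TapeBackedBackend(), "run42.root"))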


Archive | 2004

Storage Resource Managers

Arie Shoshani; Alexander Sim; Junmin Gu

Storage Resource Managers (SRMs) are middleware components whose function is to provide dynamic space allocation and file management of shared storage components on the Grid. They complement Compute Resource Managers and Network Resource Managers in providing storage reservation and dynamic information on storage availability for the planning and execution of a Grid job. SRMs manage two types of resources: space and files. When managing space, SRMs negotiate space allocation with the requesting client and/or assign default space quotas. When managing files, SRMs allocate space for files, invoke file transfer services to move files into the space, pin files for a certain lifetime, release files upon the client's request, and use file-replacement policies to optimize the use of the shared space. SRMs can be designed to provide effective sharing of files by monitoring the activity of shared files and making dynamic decisions on which files to replace when space is needed. In addition, SRMs perform automatic garbage collection of unused files, removing selected files whose lifetime has expired when space is needed. In this chapter we discuss the design considerations for SRMs, their functionality, and their interfaces. We demonstrate the use of SRMs with several examples of real implementations that are in use today in a routine fashion or in prototype form.
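
A minimal sketch, assuming a much-simplified model, of the pinning, lifetime, and garbage-collection behavior described above: files are pinned for a lifetime, a release drops the pin, and when space is needed only unpinned or expired files are evicted. The names and numbers are invented; this is not an actual SRM implementation.

import time

class SpaceManager:
    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.files = {}   # name -> (size, pin_expiry_timestamp)

    def put(self, name, size, pin_lifetime_s):
        if self.used + size > self.capacity:
            self._garbage_collect(needed=self.used + size - self.capacity)
        if self.used + size > self.capacity:
            raise RuntimeError("space exhausted even after garbage collection")
        self.files[name] = (size, time.time() + pin_lifetime_s)
        self.used += size

    def release(self, name):
        size, _ = self.files[name]
        self.files[name] = (size, 0.0)      # pin dropped; file becomes evictable

    def _garbage_collect(self, needed):
        now = time.time()
        for name, (size, expiry) in list(self.files.items()):
            if needed <= 0:
                break
            if expiry <= now:               # unpinned, or lifetime expired
                del self.files[name]
                self.used -= size
                needed -= size

mgr = SpaceManager(capacity_bytes=100)
mgr.put("a.dat", 60, pin_lifetime_s=3600)
mgr.release("a.dat")                        # a.dat may now be replaced
mgr.put("b.dat", 80, pin_lifetime_s=3600)   # triggers eviction of a.dat
print(sorted(mgr.files))                    # -> ['b.dat']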


International Conference on Computing in High Energy and Nuclear Physics, CHEP 2010 | 2011

StorNet: Integrated Dynamic Storage and Network Resource Provisioning and Management for Automated Data Transfers

Junmin Gu; Dimitrios Katramatos; Xin Liu; Vijaya Natarajan; Arie Shoshani; Alex Sim; Dantong Yu; Scott Bradley; Shawn Patrick McKee

StorNet is a joint project of Brookhaven National Laboratory (BNL) and Lawrence Berkeley National Laboratory (LBNL) to research, design, and develop an integrated end-to-end resource provisioning and management framework for high-performance data transfers. The StorNet framework leverages heterogeneous network protocols and storage types in a federated computing environment to provide the capability of predictable, efficient delivery of high-bandwidth data transfers for data-intensive applications. The framework incorporates functional modules to perform such data transfers through storage and network bandwidth co-scheduling, storage and network resource provisioning, and performance monitoring, and is based on LBNL's BeStMan/SRM, BNL's TeraPaths, and ESnet's OSCARS systems.
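
The co-scheduling idea can be sketched as a search for the earliest window in which both resources suffice: enough network bandwidth to finish the transfer within the window, and enough destination storage to hold it. The function and the per-hour availability numbers below are invented illustrations, not StorNet's actual interfaces.

def earliest_window(transfer_gb, windows):
    """windows: list of (start_hour, bandwidth_gbps, free_storage_gb),
    each one hour long. Pick the first window in which the transfer both
    fits in the available storage and completes within the window."""
    for start, bw_gbps, free_gb in windows:
        duration_h = transfer_gb * 8 / (bw_gbps * 3600)   # GB -> gigabits -> hours
        if free_gb >= transfer_gb and duration_h <= 1.0:
            return start, round(duration_h, 2)
    return None

# Invented per-hour availability, as a network scheduler (OSCARS-like) and
# a storage scheduler (BeStMan-like) might advertise it.
windows = [
    (0, 1.0, 2000),    # hour 0: only 1 Gbps; the transfer would take >3 hours
    (1, 10.0, 2000),   # hour 1: 10 Gbps and enough free storage
]
print(earliest_window(transfer_gb=1500, windows=windows))  # -> (1, 0.33)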


International Parallel and Distributed Processing Symposium | 2016

Visualization and Analysis for Near-Real-Time Decision Making in Distributed Workflows

David Pugmire; James Kress; Jong Choi; Scott Klasky; Tahsin M. Kurç; R.M. Churchill; Matthew Wolf; Greg Eisenhower; Hank Childs; Kesheng Wu; Alexander Sim; Junmin Gu; Jonathan Low

Data-driven science is becoming increasingly common and complex, and it is placing tremendous stress on visualization and analysis frameworks. Data sources producing 10 GB per second (and more) are becoming increasingly commonplace in simulation, sensor, and experimental sciences. These data sources, which are often distributed around the world, must be analyzed by teams of scientists that are also distributed. Enabling scientists to view, query, and interact with such large volumes of data in near-real-time requires a rich fusion of visualization and analysis techniques, middleware, and workflow systems. This paper discusses initial research into visualization and analysis of distributed data workflows that enables scientists to make near-real-time decisions about large volumes of time-varying data.


Journal of Physics: Conference Series 119:072019 (CHEP 07, Victoria, BC, Canada, 2-7 Sep 2007) | 2008

Grid data access on widely distributed worker nodes using scalla and SRM

Pavel Jakl; Jerome Lauret; Andrew Hanushevsky; Arie Shoshani; Alex Sim; Junmin Gu

Facing the reality of storage economics, NP experiments such as RHIC/STAR have been engaged in a shift of their analysis model and now rely heavily on cheap disks attached to processing nodes, as such a model is extremely beneficial compared to expensive centralized storage. However, exploiting storage aggregates with enhanced distributed computing capabilities, such as dynamic space allocation (lifetime of spaces), file management on shared storage (lifetime of files, file pinning), storage policies, or uniform access to heterogeneous storage solutions, is not an easy task. The Xrootd/Scalla system allows for storage aggregation. We will present an overview of the largest deployment of Scalla (Structured Cluster Architecture for Low Latency Access) in the world, spanning over 1,000 CPUs co-sharing 350 TB of Storage Elements, and the experience of making such a model work in the RHIC/STAR standard analysis framework. We will explain the key features and our approach to making access to mass storage (HPSS) possible in such a large deployment context. Furthermore, we will give an overview of a fully gridified solution using the plug-and-play features of the Scalla architecture, replacing standard storage access with grid middleware SRM (Storage Resource Manager) components designed for space management, and will compare this solution with the standard Scalla approach in use in STAR for the past two years. Integration details, future plans, and development status will be explained in the areas of the best transfer strategy between multiple-choice data pools and the best placement with respect to load balancing and interoperability with other SRM-aware tools and implementations.
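
As a toy illustration of the "best transfer strategy between multiple-choice data pools" question, the sketch below prefers the least-loaded disk pool holding a replica and falls back to a tape recall only when no disk replica exists. Pool names and load figures are invented; this is not Scalla or SRM code.

def choose_source(replica_pools, tape_copy_available):
    """replica_pools: {pool_name: current_load in [0, 1]} for pools holding
    a disk replica. Prefer the least-loaded disk pool; fall back to tape."""
    if replica_pools:
        load, pool = min((load, pool) for pool, load in replica_pools.items())
        return f"disk pool {pool} (load {load:.2f})"
    if tape_copy_available:
        return "recall from mass storage (HPSS)"
    raise FileNotFoundError("no replica available anywhere")

pools = {"node07": 0.82, "node13": 0.35, "node21": 0.61}
print(choose_source(pools, tape_copy_available=True))   # -> disk pool node13 (load 0.35)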


Grid resource management | 2004

Storage resource managers: essential components for the Grid

Arie Shoshani; Alexander Sim; Junmin Gu


Very Large Data Bases | 2000

OLAP++: Powerful and Easy-to-Use Federations of OLAP and Object Databases

Junmin Gu; Torben Bach Pedersen; Arie Shoshani

Collaboration


Dive into Junmin Gu's collaborations.

Top Co-Authors

Arie Shoshani, Lawrence Berkeley National Laboratory
Alex Sim, Lawrence Berkeley National Laboratory
Alexander Sim, Lawrence Berkeley National Laboratory
Kesheng Wu, Lawrence Berkeley National Laboratory
Vijaya Natarajan, Lawrence Berkeley National Laboratory
Jerome Lauret, Brookhaven National Laboratory
A. M. Poskanzer, Lawrence Berkeley National Laboratory