Stephen C. Simms
Indiana University Bloomington
Publications
Featured research published by Stephen C. Simms.
SIGUCCS: User Services Conference | 2010
Craig A. Stewart; Stephen C. Simms; Beth Plale; Matthew R. Link; David Y. Hancock; Geoffrey C. Fox
Cyberinfrastructure is a word commonly used but lacking a single, precise definition. One recognizes intuitively the analogy with infrastructure, and the use of cyber to refer to thinking or computing -- but what exactly is cyberinfrastructure as opposed to information technology infrastructure? Indiana University has developed one of the more widely cited definitions of cyberinfrastructure: "Cyberinfrastructure consists of computing systems, data storage systems, advanced instruments and data repositories, visualization environments, and people, all linked together by software and high performance networks to improve research productivity and enable breakthroughs not otherwise possible." A second definition, more inclusive of scholarship generally and educational activities, has also been published and is useful in describing cyberinfrastructure: "Cyberinfrastructure consists of computational systems, data and information management, advanced instruments, visualization environments, and people, all linked together by software and advanced networks to improve scholarly productivity and enable knowledge breakthroughs and discoveries not otherwise possible." In this paper, we describe the origin of the term cyberinfrastructure based on the history of the root word infrastructure, discuss several terms related to cyberinfrastructure, and provide several examples of cyberinfrastructure.
IEEE International Conference on High Performance Computing, Data, and Analytics | 2012
Robert Henschel; Stephen C. Simms; David Y. Hancock; Scott Michael; Tom Johnson; Nathan Heald; Thomas William; Donald K. Berry; Matthew Allen; Richard Knepper; Matt Davy; Matthew R. Link; Craig A. Stewart
As part of the SCinet Research Sandbox at the Supercomputing 2011 conference, Indiana University (IU) demonstrated use of the Lustre high performance parallel file system over a dedicated 100 Gbps wide area network (WAN) spanning more than 3,500 km (2,175 mi). This demonstration functioned as a proof of concept and provided an opportunity to study Lustre's performance over a 100 Gbps WAN. To characterize the performance of the network and file system, low-level iperf network tests, file system tests with the IOR benchmark, and a suite of real-world applications reading and writing to the file system were run over a latency of 50.5 ms. In this article we describe the configuration and constraints of the demonstration and outline key findings.
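For context on why this configuration is demanding, the short sketch below works out the bandwidth-delay product implied by the figures quoted in the abstract (a 100 Gbps link and 50.5 ms of latency). The calculation is only an illustration of the scale involved, not part of the demonstration itself.

```python
# Rough bandwidth-delay product estimate for the link described above:
# a 100 Gbps WAN with 50.5 ms round-trip latency. Illustrative only; the
# actual demonstration used iperf and IOR rather than this calculation.

LINK_GBPS = 100          # nominal link speed in gigabits per second
RTT_S = 0.0505           # round-trip time in seconds (50.5 ms)

bdp_bits = LINK_GBPS * 1e9 * RTT_S
bdp_bytes = bdp_bits / 8

print(f"Bandwidth-delay product: {bdp_bytes / 2**20:.0f} MiB")
# ~602 MiB must be "in flight" at any moment to keep the pipe full, which is
# why TCP window sizes and file system concurrency need aggressive tuning.
```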
Proceedings of the 2007 Workshop on Service-Oriented Computing Performance: Aspects, Issues, and Approaches | 2007
Stephen C. Simms; Gregory G. Pike; Scott Teige; Bret Hammond; Yu Ma; Larry L. Simms; C. Westneat; Douglas A. Balog
The Indiana University Data Capacitor is a 535 TB distributed parallel filesystem constructed for short- to mid-term storage of large research data sets. Spanning multiple, geographically distributed compute, storage, and visualization resources and showing unprecedented performance across the wide area network, the Data Capacitor's Lustre filesystem can be used as a powerful tool to accommodate loosely coupled, service-oriented computing. In this paper we demonstrate single file/single client write performance from Oak Ridge National Laboratory to Indiana University in excess of 750 MB/s. We evaluate client parameters that will allow widely distributed services to achieve data transfer rates closely matching those of local services. Finally, we outline the tuning strategy used to maximize performance, and present the results of this tuning.
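A simple model often used when reasoning about single-client throughput over a WAN is that streaming bandwidth is bounded by roughly (RPCs in flight x RPC size) / round-trip time per server connection. The sketch below illustrates that relationship; the RTT and parameter values are hypothetical placeholders, not figures taken from the paper.

```python
# Back-of-the-envelope model of single-client parallel file system write
# throughput over a WAN. All numeric inputs below are assumptions for
# illustration; the real ORNL-to-IU latency and tuning values may differ.

def client_throughput_mb_s(rpcs_in_flight, rpc_size_mb, rtt_s, num_servers=1):
    """Upper-bound estimate of streaming throughput in MB/s."""
    return num_servers * rpcs_in_flight * rpc_size_mb / rtt_s

rtt_s = 0.025  # assumed ~25 ms round trip

# A stock client versus a tuned client striping its file over several servers.
print(client_throughput_mb_s(rpcs_in_flight=8, rpc_size_mb=1, rtt_s=rtt_s))
# ~320 MB/s
print(client_throughput_mb_s(rpcs_in_flight=32, rpc_size_mb=1, rtt_s=rtt_s,
                             num_servers=4))
# ~5120 MB/s: raising concurrency and striping hides the WAN latency
```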
TeraGrid Conference | 2010
Joshua Walgenbach; Stephen C. Simms; Kit Westneat; Justin P. Miller
The Indiana University Data Capacitor wide area Lustre file system provides over 350 TB of short- to mid-term storage of large research data sets. It spans multiple geographically distributed compute, storage, and visualization resources. In order to effectively harness the power of these resources from various institutions, it has been necessary to develop software to keep ownership and permission data consistent across many client mounts. This paper describes the Data Capacitor's Lustre WAN service and the history, development, and implementation of IU's UID mapping scheme that enables Lustre WAN on the TeraGrid.
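The core idea behind UID mapping is that each client institution's numeric UIDs are translated into a single canonical UID space on the file system servers, so the same person owns the same files no matter where they mount from. The sketch below shows that translation in miniature; the table contents, site names, and function are hypothetical illustrations, not IU's actual implementation.

```python
# Minimal sketch of UID mapping for a wide-area file system mount: files are
# owned under the server's canonical UID space, and each remote client's
# local UIDs are translated through a per-institution table. All names and
# numbers here are invented for illustration.

UID_MAP = {
    # (institution, remote_uid) -> canonical_uid on the central file system
    ("site_a", 1001): 520344,
    ("site_b", 1001): 520871,   # same numeric UID, different person
    ("site_b", 2044): 520344,   # same person, different numeric UID
}

def to_canonical_uid(institution: str, remote_uid: int) -> int:
    """Translate a client-side UID into the file system's canonical UID."""
    try:
        return UID_MAP[(institution, remote_uid)]
    except KeyError:
        raise PermissionError(f"no mapping for uid {remote_uid} at {institution}")

print(to_canonical_uid("site_b", 2044))   # 520344
```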
International Parallel and Distributed Processing Symposium | 2004
Peng Wang; George Turner; Daniel A. Lauer; Matthew Allen; Stephen C. Simms; David Hart; Mary Papakhian; Craig A. Stewart
As the first geographically distributed supercomputer on the TOP500 list, the AVIDD facility of Indiana University ranked 50th in June 2003. It achieved 1.169 teraflops running the LINPACK benchmark. Here, our work of improving LINPACK performance is reported, and the impact of the math kernel, LINPACK problem size, and network tuning is analyzed based on the performance model of LINPACK.
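To make the tuning levers concrete, the sketch below shows the standard sizing arithmetic for an HPL run: problem size is chosen to fill most of aggregate memory, and efficiency is the ratio of measured to peak rate. The memory figure and peak rate are hypothetical placeholders; only the 1.169 teraflop measurement comes from the abstract.

```python
# Illustrative HPL (LINPACK) sizing arithmetic. The memory total and peak
# rate are assumptions for illustration, not values from the paper.
import math

total_memory_bytes = 1.5e12     # assumed aggregate cluster memory
usable_fraction = 0.8           # leave headroom for OS and MPI buffers

# HPL factors a dense N x N matrix of 8-byte doubles, so memory ~ 8 * N^2.
n = int(math.sqrt(usable_fraction * total_memory_bytes / 8))
print(f"suggested problem size N ~ {n:,}")

r_max = 1.169e12                # measured LINPACK rate, from the abstract
r_peak = 2.0e12                 # hypothetical theoretical peak
print(f"efficiency ~ {r_max / r_peak:.1%}")
```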
Future Generation Computer Systems | 2013
Michael Kluge; Stephen C. Simms; Thomas William; Robert Henschel; Andy Georgi; Christian Meyer; Matthias S. Mueller; Craig A. Stewart; Wolfgang Wünsch; Wolfgang E. Nagel
Digital instruments and simulations are creating an ever-increasing amount of data. The need for institutions to acquire these data and transfer them for analysis, visualization, and archiving is growing as well. In parallel, networking technology is evolving, but at a much slower rate than our ability to create and store data. Single-fiber 100 Gbps networking solutions have recently been deployed as national infrastructure. This article describes our experiences with data movement and video conferencing across a networking testbed, using the first commercially available single-fiber 100 Gbps technology. The testbed is unique in its ability to be configured for a total length of 60, 200, or 400 km, allowing for tests with varying network latency. We performed low-level TCP tests and were able to use more than 99.9% of the theoretical available bandwidth with minimal tuning efforts. We used the Lustre file system to simulate how end users would interact with a remote file system over such a high performance link. We were able to use 94.4% of the theoretical available bandwidth with a standard file system benchmark, essentially saturating the wide area network. Finally, we performed tests with H.323 video conferencing hardware and quality of service (QoS) settings, showing that the link can reliably carry a full high-definition stream. Overall, we demonstrated the practicality of 100 Gbps networking and Lustre as excellent tools for data management.
Highlights: The need for institutions to acquire and transfer data is growing. We tested data transfer on the first commercial single-fiber 100 Gbps network. We used Lustre to simulate user interaction with a remote file system. We were able to use more than 94.4% of the theoretical available bandwidth. 100 Gbps networking and Lustre are excellent tools for data management.
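As a rough guide to the latencies the 60, 200, and 400 km configurations introduce, the sketch below estimates round-trip time from fiber length and the resulting amount of data that must be in flight at 100 Gbps. These are idealized propagation figures; the testbed's measured latencies may differ slightly.

```python
# Approximate propagation delay for the three testbed lengths. Light in
# optical fiber travels at roughly c / 1.47, about 5 microseconds per
# kilometre one way. Idealized values, not measurements from the article.

C_KM_PER_S = 299_792            # speed of light in vacuum, km/s
FIBER_INDEX = 1.47              # approximate refractive index of fiber

for length_km in (60, 200, 400):
    one_way_s = length_km * FIBER_INDEX / C_KM_PER_S
    rtt_ms = 2 * one_way_s * 1e3
    bdp_mib = 100e9 * (rtt_ms / 1e3) / 8 / 2**20   # at 100 Gbps
    print(f"{length_km:3d} km: RTT ~ {rtt_ms:5.2f} ms, "
          f"bandwidth-delay product ~ {bdp_mib:6.1f} MiB")
```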
High Performance Distributed Computing | 2010
Robert Henschel; Scott Michael; Stephen C. Simms
Astrophysical simulations of protoplanetary disks and gas giant planet formation are being performed with a variety of numerical methods. Some of the codes in use today have been producing scientifically significant results for several years, or even decades. Each must simulate millions of resolution elements for millions of time steps, capture and store output data, and rapidly and efficiently analyze this data. To do this effectively, a parallel code is needed that scales to tens or hundreds of processors. Furthermore, an efficient workflow for the transport, analysis, and interpretation of the output data is needed to achieve scientifically meaningful results. Since such simulations are usually performed on moderate to large parallel systems, the compute system is generally located at a remote institution. However, analysis of results is typically performed interactively, and because most supercomputing centers do not offer dedicated interactive nodes, the transfer of simulation output data to local resources becomes necessary. Even if interactive resources were available, typical network latencies make X-forwarded displays nearly impossible to work with. Since data sets can be quite large and traditional transfer mechanisms such as scp and sftp offer relatively low throughput, this transfer of data sets becomes a bottleneck in the research workflow. In this article we measure the scalability of the Computational HYdrodynamics with MultiplE Radiation Algorithms (CHYMERA) code on the SGI Altix architecture. We find that it scales well up to 64 threads for moderate and large sized problems. We also present a novel approach to enable rapid transfer and analysis of simulation data via the Data Capacitor (DC) and Lustre WAN (Wide Area Network) [17]. The usage of a WAN file system to tie batch-system-operated compute resources and interactive analysis and visualization resources together is of general interest and can be applied broadly.
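The scaling claim is the kind of statement usually backed by strong-scaling measurements: run the same problem on more threads and track speedup and parallel efficiency. The sketch below shows that bookkeeping; the wall-clock times are invented for illustration and are not measurements from the article.

```python
# Simple strong-scaling bookkeeping of the kind used to judge whether a code
# "scales well up to 64 threads". All timings below are hypothetical.

timings = {1: 6400.0, 8: 830.0, 16: 430.0, 32: 225.0, 64: 120.0}  # seconds

t1 = timings[1]                              # single-thread baseline
for threads, t in sorted(timings.items()):
    speedup = t1 / t
    efficiency = speedup / threads
    print(f"{threads:3d} threads: speedup {speedup:5.1f}x, "
          f"efficiency {efficiency:5.1%}")
```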
Conference on High Performance Computing (Supercomputing) | 2006
Stephen C. Simms; Matt Davy; Bret Hammond; Matthew R. Link; Craig A. Stewart; Randall Bramley; Beth Plale; Dennis Gannon; Mu-Hyun Baik; Scott Teige; John C. Huffman; Rick McMullen; Doug Balog; Greg Pike
Indiana University provides powerful compute, storage, and network resources to a diverse local and national research community every day. IU's facilities have been used to support data-intensive applications ranging from digital humanities to computational biology. For this year's bandwidth challenge, several IU researchers will conduct experiments from the exhibit floor utilizing the resources that University Information Technology Services currently provides. Using IU's newly constructed 535 TB Data Capacitor and an additional component installed on the exhibit floor, we will use Lustre across the wide area network to simultaneously facilitate dynamic weather modeling, protein analysis, instrument data capture, and the production, storage, and analysis of simulation data.
International Workshop on Data Intensive Distributed Computing | 2012
Scott Michael; Liang Zhen; Robert Henschel; Stephen C. Simms; Eric Barton; Matthew R. Link
As part of the SCinet Research Sandbox at the 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC11), Indiana University utilized a dedicated 100 Gbps wide area network (WAN) link spanning more than 3,500 km (2,175 mi) to demonstrate the capabilities of the Lustre high performance parallel file system in a high bandwidth, high latency WAN environment. This demonstration functioned as a proof of concept and provided an opportunity to study Lustre's performance over a 100 Gbps WAN. To characterize the performance of the network and file system, a series of benchmarks and tests were undertaken. These included low-level iperf network tests, Lustre networking (LNET) tests, file system tests with the IOR benchmark, and a suite of real-world applications reading and writing to the file system. All of the benchmarks were run over the WAN link with a latency of 50.5 ms. In this article, we describe the configuration and constraints of the demonstration, and focus on the key findings made regarding the Lustre networking layer for this extremely high bandwidth, high latency connection. Of particular interest is the relationship between the peer_credits and max_rpcs_in_flight settings when considering LNET performance.
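The intuition behind why peer_credits and max_rpcs_in_flight matter on such a link is that enough RPCs must be outstanding to cover the bandwidth-delay product. The sketch below puts numbers on that; the 1 MiB bulk RPC size is a traditional Lustre default assumed here for illustration, not a figure quoted from the article.

```python
# Hedged sketch: how many RPCs must be in flight to saturate a 100 Gbps link
# with 50.5 ms of latency, assuming 1 MiB bulk RPCs (an assumption, not a
# value from the paper).

LINK_BPS = 100e9
RTT_S = 0.0505
RPC_BYTES = 1 * 2**20            # assumed 1 MiB bulk RPC

bdp_bytes = LINK_BPS * RTT_S / 8
rpcs_needed = bdp_bytes / RPC_BYTES
print(f"~{rpcs_needed:.0f} RPCs must be in flight to fill the pipe")
# With only a handful of RPCs allowed per client/server pair by default,
# either many pairs or raised credit settings are needed to reach line rate.
```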
TeraGrid Conference | 2010
Scott Michael; Stephen C. Simms; W. B. Breckenridge III; Roger Smith; Matthew R. Link
In this article we explore the utility of a centralized filesystem provided by the TeraGrid to both TeraGrid and non-TeraGrid sites. We highlight several common cases in which such a filesystem would be useful in obtaining scientific insight. We present results from a test case using Indiana University's Data Capacitor over the wide area network as a central filesystem for simulation data generated at multiple TeraGrid sites and analyzed at Mississippi State University. Statistical analysis of the I/O patterns and rates, via detailed trace records generated with VampirTrace, for both the Data Capacitor and a local Lustre filesystem is provided. The benefits of a centralized filesystem and potential hurdles in adopting such a system for both TeraGrid and non-TeraGrid sites are discussed.
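The sketch below shows, in miniature, the kind of post-processing applied when comparing I/O rates between a wide-area and a local file system from trace records. The record format and numbers are invented for illustration; VampirTrace's actual trace format (OTF) is far richer than this.

```python
# Aggregate per-file-system I/O rates from a list of (file_system, operation,
# bytes, duration) records. Hypothetical data, illustrative only.
from collections import defaultdict

records = [
    ("data_capacitor", "write", 256 * 2**20, 0.41),
    ("data_capacitor", "write", 256 * 2**20, 0.39),
    ("local_lustre",   "write", 256 * 2**20, 0.22),
    ("local_lustre",   "read",  128 * 2**20, 0.10),
]

totals = defaultdict(lambda: [0, 0.0])          # (fs, op) -> [bytes, seconds]
for fs, op, nbytes, seconds in records:
    totals[(fs, op)][0] += nbytes
    totals[(fs, op)][1] += seconds

for (fs, op), (nbytes, seconds) in sorted(totals.items()):
    print(f"{fs:15s} {op:5s} {nbytes / 2**20 / seconds:7.1f} MiB/s aggregate")
```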