
Publication


Featured research published by Craig A. Stewart.


Archive | 2006

Euro-Par 2006 Parallel Processing

Wolfgang Lehner; Norbert Meyer; Achim Streit; Craig A. Stewart

A network monitoring system is a vital component of a Grid; however, its scalability is a challenge. We propose a network monitoring approach that combines passive monitoring, a domain-oriented overlay network, and support for demand-driven monitoring sessions. To address the demand for extreme scalability, we introduce a solution to two problems inherent in the proposed approach: security and group membership maintenance.
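As an illustration of the overlay idea only, and not the authors' implementation, the Python sketch below models domains that register passive monitors and a consumer that opens a demand-driven monitoring session between two domains; all class and method names (Domain, Overlay, start_session) are invented for this sketch.

```python
# Illustrative sketch only -- not the paper's implementation.
# Models a domain-oriented overlay in which each domain registers passive
# monitors and a consumer starts a demand-driven monitoring session.
from dataclasses import dataclass, field


@dataclass
class Domain:
    name: str
    monitors: set = field(default_factory=set)   # passive probes in this domain

    def register_monitor(self, monitor_id: str) -> None:
        self.monitors.add(monitor_id)


class Overlay:
    """Keeps per-domain group membership and opens sessions on demand."""

    def __init__(self):
        self.domains = {}  # domain name -> Domain

    def join(self, domain_name: str, monitor_id: str) -> None:
        # Group membership maintenance: a monitor joins only its own domain,
        # so membership state never has to be replicated globally.
        self.domains.setdefault(domain_name, Domain(domain_name)).register_monitor(monitor_id)

    def start_session(self, src_domain: str, dst_domain: str) -> dict:
        # Demand-driven: a session is created only when a consumer asks for
        # a measurement between two domains, instead of monitoring all pairs.
        src = self.domains[src_domain]
        dst = self.domains[dst_domain]
        return {"path": (src.name, dst.name),
                "monitors": sorted(src.monitors | dst.monitors)}


overlay = Overlay()
overlay.join("domain-a", "probe-1")
overlay.join("domain-b", "probe-2")
print(overlay.start_session("domain-a", "domain-b"))
```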


Conference on High Performance Computing (Supercomputing) | 2001

Parallel Implementation and Performance of FastDNAml — A Program for Maximum Likelihood Phylogenetic Inference

Craig A. Stewart; David Hart; Donald K. Berry; Gary J. Olsen; Eric A. Wernert; William Fischer

This paper describes the parallel implementation of fastDNAml, a program for the maximum likelihood inference of phylogenetic trees from DNA sequence data. Mathematical means of inferring phylogenetic trees have been made possible by the wealth of DNA data now available. Maximum likelihood analysis of phylogenetic trees is extremely computationally intensive. Availability of computer resources is a key factor limiting use of such analyses. fastDNAml is implemented in serial, PVM, and MPI versions, and may be modified to use other message passing libraries in the future. We have developed a viewer for comparing phylogenies. We tested the scaling behavior of fastDNAml on an IBM RS/6000 SP up to 64 processors. The parallel version of fastDNAml is one of very few computational phylogenetics codes that scale well. fastDNAml is available for download as source code or compiled for Linux or AIX.
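To make the parallel structure concrete, here is a minimal master/worker sketch in Python. It is not the fastDNAml code: it uses Python's multiprocessing module rather than PVM or MPI, and the scoring function is a dummy stand-in for a real maximum likelihood evaluation.

```python
# Minimal master/worker sketch of parallel tree evaluation, in the spirit of
# the approach described above. NOT the fastDNAml code: uses Python's
# multiprocessing instead of PVM/MPI, and score_tree() is a dummy stand-in
# for a real maximum likelihood computation over aligned DNA sequences.
from multiprocessing import Pool


def score_tree(tree: tuple) -> tuple:
    """Return (log_likelihood, tree); the score here is a deterministic fake."""
    fake_log_likelihood = -sum((i + 1) * ord(taxon[0]) for i, taxon in enumerate(tree)) / 100.0
    return fake_log_likelihood, tree


def best_rearrangement(candidate_trees: list) -> tuple:
    # The master farms candidate topologies out to workers and keeps the tree
    # with the highest log likelihood -- the expensive scoring step is what
    # parallelizes well.
    with Pool(processes=4) as pool:
        scored = pool.map(score_tree, candidate_trees)
    return max(scored)


if __name__ == "__main__":
    candidates = [("human", "chimp", "gorilla"),
                  ("human", "gorilla", "chimp"),
                  ("chimp", "gorilla", "human")]
    log_l, tree = best_rearrangement(candidates)
    print(f"best tree {tree} with log likelihood {log_l:.2f}")
```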


Extreme Science and Engineering Discovery Environment | 2015

Jetstream: a self-provisioned, scalable science and engineering cloud environment

Craig A. Stewart; Timothy Cockerill; Ian T. Foster; David Y. Hancock; Nirav Merchant; Edwin Skidmore; Dan Stanzione; James Taylor; Steven Tuecke; George Turner; Matthew W. Vaughn; Niall Gaffney

Jetstream will be the first production cloud resource supporting general science and engineering research within the XD ecosystem. In this report we describe the motivation for proposing Jetstream, the configuration of the Jetstream system as funded by the NSF, the team that is implementing Jetstream, and the communities we expect to use this new system. Our hope and plan is that Jetstream, which will become available for production use in 2016, will aid thousands of researchers who need modest amounts of computing power interactively. The implementation of Jetstream should increase the size and disciplinary diversity of the US research community that makes use of the resources of the XD ecosystem.


SIGUCCS: User Services Conference | 2010

What is cyberinfrastructure

Craig A. Stewart; Stephen C. Simms; Beth Plale; Matthew R. Link; David Y. Hancock; Geoffrey C. Fox

Cyberinfrastructure is a word commonly used but lacking a single, precise definition. One intuitively recognizes the analogy with infrastructure, and the use of "cyber" to refer to thinking or computing, but what exactly is cyberinfrastructure as opposed to information technology infrastructure? Indiana University has developed one of the more widely cited definitions of cyberinfrastructure: "Cyberinfrastructure consists of computing systems, data storage systems, advanced instruments and data repositories, visualization environments, and people, all linked together by software and high performance networks to improve research productivity and enable breakthroughs not otherwise possible." A second definition, more inclusive of scholarship generally and of educational activities, has also been published and is useful in describing cyberinfrastructure: "Cyberinfrastructure consists of computational systems, data and information management, advanced instruments, visualization environments, and people, all linked together by software and advanced networks to improve scholarly productivity and enable knowledge breakthroughs and discoveries not otherwise possible." In this paper, we describe the origin of the term cyberinfrastructure based on the history of the root word infrastructure, discuss several terms related to cyberinfrastructure, and provide several examples of cyberinfrastructure.


IEEE International Conference on High Performance Computing, Data and Analytics | 2012

Demonstrating Lustre over a 100 Gbps wide area network of 3,500 km

Robert Henschel; Stephen C. Simms; David Y. Hancock; Scott Michael; Tom Johnson; Nathan Heald; Thomas William; Donald K. Berry; Matthew Allen; Richard Knepper; Matt Davy; Matthew R. Link; Craig A. Stewart

As part of the SCinet Research Sandbox at the Supercomputing 2011 conference, Indiana University (IU) demonstrated use of the Lustre high performance parallel file system over a dedicated 100 Gbps wide area network (WAN) spanning more than 3,500 km (2,175 mi). This demonstration functioned as a proof of concept and provided an opportunity to study Lustre's performance over a 100 Gbps WAN. To characterize the performance of the network and file system, low-level iperf network tests, file system tests with the IOR benchmark, and a suite of real-world applications reading and writing to the file system were run over a latency of 50.5 ms. In this article we describe the configuration and constraints of the demonstration and outline key findings.
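To put the numbers above in perspective, a quick back-of-the-envelope calculation (not taken from the paper, and treating the quoted 50.5 ms as a round-trip time, which is an assumption): the bandwidth-delay product tells you how much data must be in flight at once to keep a 100 Gbps path busy.

```python
# Back-of-the-envelope bandwidth-delay product for a link like the one above.
# Assumption (not stated in the abstract): the quoted 50.5 ms is treated here
# as the round-trip time.
LINK_GBPS = 100          # nominal link rate, gigabits per second
RTT_S = 50.5e-3          # assumed round-trip time in seconds

bdp_bits = LINK_GBPS * 1e9 * RTT_S
bdp_bytes = bdp_bits / 8

# To keep the path saturated, roughly this much data must be in flight at any
# moment, which is why large socket buffers and parallel streams matter on
# long, high-rate paths.
print(f"bandwidth-delay product: {bdp_bytes / 2**20:.0f} MiB in flight")
# -> roughly 600 MiB
```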


Alcohol | 2010

Implementation of a shared data repository and common data dictionary for fetal alcohol spectrum disorders research.

Andrew Arenson; Ludmila N. Bakhireva; Christina D. Chambers; Christina Deximo; Tatiana Foroud; Joseph L. Jacobson; Sandra W. Jacobson; Kenneth Lyons Jones; Sarah N. Mattson; Philip A. May; Elizabeth S. Moore; Kimberly Ogle; Edward P. Riley; Luther K. Robinson; Jeffrey Rogers; Ann P. Streissguth; Michel Tavares; Joseph Urbanski; Yelena Yezerets; Radha Surya; Craig A. Stewart; William K. Barnett

Many previous attempts by fetal alcohol spectrum disorders researchers to compare data across multiple prospective and retrospective human studies have failed because of both structural differences in the collected data and difficulty in coming to agreement on the precise meaning of the terminology used to describe the collected data. Although some groups of researchers have an established track record of successfully integrating data, attempts to integrate data more broadly among different groups of researchers have generally faltered. Lack of tools to help researchers share and integrate data has also hampered data analysis. This situation has delayed improving diagnosis, intervention, and treatment before and after birth. We worked with various researchers and research programs in the Collaborative Initiative on Fetal Alcohol Spectrum Disorders (CI-FASD) to develop a set of common data dictionaries to describe the data to be collected, including definitions of terms and specification of allowable values. The resulting data dictionaries were the basis for creating a central data repository (CI-FASD Central Repository) and software tools to input and query data. Data entry restrictions ensure that only data that conform to the data dictionaries reach the CI-FASD Central Repository. The result is an effective system for centralized and unified management of the data collected and analyzed by the initiative, including a secure, long-term data repository. CI-FASD researchers are able to integrate and analyze data of different types, using multiple methods, and collected from multiple populations, and data are retained for future reuse in a secure, robust repository.
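As a purely illustrative sketch of the data-dictionary idea, not the CI-FASD software itself, the following Python fragment validates a record against a small dictionary that specifies types and allowable values, mirroring the entry restrictions described above; all field names and codes are invented.

```python
# Illustrative only: a tiny data dictionary with allowable values, and a
# validator that rejects records which do not conform -- the same idea as the
# entry restrictions described above. Field names and codes are invented,
# not taken from the CI-FASD dictionaries.
DATA_DICTIONARY = {
    "maternal_alcohol_use": {"type": str, "allowed": {"none", "moderate", "heavy", "unknown"}},
    "child_age_months":     {"type": int, "range": (0, 216)},
    "study_site":           {"type": str, "allowed": {"site_a", "site_b", "site_c"}},
}


def validate(record: dict) -> list:
    """Return a list of error messages; an empty list means the record conforms."""
    errors = []
    for field, spec in DATA_DICTIONARY.items():
        if field not in record:
            errors.append(f"missing field: {field}")
            continue
        value = record[field]
        if not isinstance(value, spec["type"]):
            errors.append(f"{field}: expected {spec['type'].__name__}")
        elif "allowed" in spec and value not in spec["allowed"]:
            errors.append(f"{field}: {value!r} not in allowed values")
        elif "range" in spec and not (spec["range"][0] <= value <= spec["range"][1]):
            errors.append(f"{field}: {value} outside {spec['range']}")
    return errors


print(validate({"maternal_alcohol_use": "heavy", "child_age_months": 30, "study_site": "site_b"}))  # []
print(validate({"maternal_alcohol_use": "daily", "child_age_months": 500}))  # three errors
```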


International Parallel and Distributed Processing Symposium | 2004

LINPACK performance on a geographically distributed Linux cluster

Peng Wang; George Turner; Daniel A. Lauer; Matthew Allen; Stephen C. Simms; David Hart; Mary Papakhian; Craig A. Stewart

As the first geographically distributed supercomputer on the Top 500 list, the AVIDD facility of Indiana University ranked 50th in June 2003, achieving 1.169 teraflops on the LINPACK benchmark. Here we report our work on improving LINPACK performance and analyze the impact of the math kernel, the LINPACK problem size, and network tuning, based on a performance model of LINPACK.
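For readers unfamiliar with LINPACK tuning, the arithmetic behind problem-size and efficiency choices can be sketched as follows; the node count, memory size, and peak rate below are hypothetical and are not the AVIDD configuration.

```python
# Illustrative LINPACK sizing and efficiency arithmetic -- the node count and
# memory figures below are made up, not the actual AVIDD configuration.
import math

nodes = 128
mem_per_node_gb = 2.0                 # hypothetical
usable_fraction = 0.8                 # leave room for OS and HPL buffers

# The HPL matrix is N x N doubles (8 bytes each); choose N so it fills about
# usable_fraction of aggregate memory.
total_bytes = nodes * mem_per_node_gb * 2**30 * usable_fraction
n = int(math.sqrt(total_bytes / 8))
print(f"suggested problem size N ~ {n:,}")

# Reported efficiency is simply achieved rate over theoretical peak.
rmax_tflops = 1.169                   # from the abstract
rpeak_tflops = 2.0                    # hypothetical peak, for illustration only
print(f"efficiency = {rmax_tflops / rpeak_tflops:.1%}")
```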


Extreme Science and Engineering Discovery Environment | 2014

Methods For Creating XSEDE Compatible Clusters

Jeremy Fischer; Richard Knepper; Matthew Standish; Craig A. Stewart; Resa Alvord; David Lifka; Barbara Hallock; Victor Hazlewood

The Extreme Science and Engineering Discovery Environment has created a suite of software that is collectively known as the basic XSEDE-compatible cluster build. It has been distributed as a Rocks roll for some time. It is now available as individual RPM packages, so that it can be downloaded and installed in portions as appropriate on existing, working clusters. In this paper, we explain the concept of the XSEDE-compatible cluster and describe how to install individual components as RPMs using Puppet and the XSEDE-compatible cluster YUM repository.


Proceedings of the 1st Workshop on The Science of Cyberinfrastructure | 2015

Jetstream: A Distributed Cloud Infrastructure for Underresourced Higher Education Communities

Jeremy Fischer; Steven Tuecke; Ian T. Foster; Craig A. Stewart

The US National Science Foundation (NSF) in 2015 awarded funding for a first-of-its-kind distributed cyberinfrastructure (DCI) system called Jetstream. Jetstream will be the NSF's first production cloud for general-purpose science and engineering research and education. Jetstream, scheduled for production in January 2016, will be based on the OpenStack cloud environment software, with a menu-driven interface that makes it easy for users to select a pre-composed virtual machine (VM) to perform a particular discipline-specific analysis. Jetstream will use the Atmosphere user interface developed as part of iPlant, providing a low barrier to use by practicing scientists, engineers, educators, and students, and Globus services from the University of Chicago for seamless integration into the national cyberinfrastructure fabric. The team implementing Jetstream has as its primary mission extending the reach of the NSF's eXtreme Digital (XD) program to researchers, educators, and research students who have not previously used NSF XD program resources, including those in communities and at institutions that traditionally lack significant cyberinfrastructure resources. We will, for example, use virtual Linux desktops to deliver DCI capabilities supporting research and research education at small colleges and universities, including Historically Black Colleges and Universities (HBCUs), Minority Serving Institutions (MSIs), Tribal colleges, and higher education institutions in states designated by the NSF as eligible for funding via the Experimental Program to Stimulate Competitive Research (EPSCoR). Jetstream will be a novel distributed cyberinfrastructure, with production components in Indiana and Texas. In particular, Jetstream will deliver virtual Linux desktops to tablet devices and PDAs with reasonable responsiveness over cellular networks. This paper discusses design and application plans for Jetstream as a novel distributed cyberinfrastructure system for research education.


Future Generation Computer Systems | 2013

Performance and quality of service of data and video movement over a 100 Gbps testbed

Michael Kluge; Stephen C. Simms; Thomas William; Robert Henschel; Andy Georgi; Christian Meyer; Matthias S. Mueller; Craig A. Stewart; Wolfgang Wünsch; Wolfgang E. Nagel

Digital instruments and simulations are creating an ever-increasing amount of data. The need for institutions to acquire these data and transfer them for analysis, visualization, and archiving is growing as well. In parallel, networking technology is evolving, but at a much slower rate than our ability to create and store data. Single fiber 100 Gbps networking solutions have recently been deployed as national infrastructure. This article describes our experiences with data movement and video conferencing across a networking testbed, using the first commercially available single fiber 100 Gbps technology. The testbed is unique in its ability to be configured for a total length of 60, 200, or 400 km, allowing for tests with varying network latency. We performed low-level TCP tests and were able to use more than 99.9% of the theoretical available bandwidth with minimal tuning efforts. We used the Lustre file system to simulate how end users would interact with a remote file system over such a high performance link. We were able to use 94.4% of the theoretical available bandwidth with a standard file system benchmark, essentially saturating the wide area network. Finally, we performed tests with H.323 video conferencing hardware and quality of service (QoS) settings, showing that the link can reliably carry a full high-definition stream. Overall, we demonstrated the practicality of 100 Gbps networking and Lustre as excellent tools for data management.
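To put the configurable 60, 200, and 400 km testbed lengths in context, here is a rough estimate of propagation latency alone (assuming signal speed in fiber of roughly two-thirds the speed of light; actual testbed latency also includes equipment and protocol delay):

```python
# Rough propagation-delay estimate for the three testbed lengths mentioned
# above. Assumption: signal speed in fiber ~ 2/3 of c, about 200,000 km/s;
# real paths add equipment and routing delay on top of this.
SPEED_IN_FIBER_KM_PER_S = 200_000.0

for length_km in (60, 200, 400):
    one_way_ms = length_km / SPEED_IN_FIBER_KM_PER_S * 1000
    print(f"{length_km:>4} km: ~{one_way_ms:.1f} ms one way, ~{2 * one_way_ms:.1f} ms RTT")
```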

Collaboration


Top co-authors of Craig A. Stewart (all Indiana University Bloomington):

David Y. Hancock
Matthew R. Link
Therese Miller
William K. Barnett
George Turner
Stephen C. Simms