Network


Latest external collaborations at the country level. Dive into the details by clicking on the dots.

Hotspot


Dive into the research topics where Eli Dart is active.

Publication


Featured research published by Eli Dart.


IEEE International Conference on High Performance Computing, Data, and Analytics | 2013

The Science DMZ: a network design pattern for data-intensive science

Eli Dart; Lauren Rotman; Brian Tierney; Mary Hester; Jason M. Zurawski

The ever-increasing scale of scientific data has become a significant challenge for researchers who rely on networks to interact with remote computing systems and transfer results to collaborators worldwide. Despite the availability of high-capacity connections, scientists struggle with inadequate cyberinfrastructure that cripples data transfer performance and impedes scientific progress. The Science DMZ paradigm comprises a proven set of network design patterns that collectively address these problems for scientists. We explain the Science DMZ model, including network architecture, system configuration, cybersecurity, and performance tools, that creates an optimized network environment for science. We describe use cases from universities, supercomputing centers, and research laboratories, highlighting the effectiveness of the Science DMZ model in diverse operational settings. In all, the Science DMZ model is a solid platform that supports any science workflow and flexibly accommodates emerging network technologies. As a result, the Science DMZ vastly improves collaboration, accelerating scientific discovery.
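
The pattern relies on dedicated, well-tuned data transfer nodes (DTNs) and routine active performance measurement. As a rough illustration of the measurement piece, the sketch below drives iperf3 between two DTNs and reports the achieved rate; the hostname, port, and stream count are placeholders, it assumes iperf3 is installed with a server already listening on the remote host, and it is not code from the paper.

```python
# Minimal sketch (not from the paper): verify achievable throughput between two
# data transfer nodes (DTNs) by driving iperf3 and parsing its JSON output.
# Hostname and port are placeholders; assumes iperf3 is installed and an iperf3
# server is already running on the remote DTN.
import json
import subprocess

REMOTE_DTN = "dtn.example.org"   # placeholder hostname
PORT = 5201                      # default iperf3 port
PARALLEL_STREAMS = 4             # multiple TCP streams, as is common for WAN tests

def measure_throughput_gbps(host: str, port: int, streams: int, seconds: int = 10) -> float:
    """Run an iperf3 client test and return the sender throughput in Gb/s."""
    result = subprocess.run(
        ["iperf3", "-c", host, "-p", str(port), "-P", str(streams),
         "-t", str(seconds), "--json"],
        capture_output=True, text=True, check=True,
    )
    report = json.loads(result.stdout)
    bits_per_second = report["end"]["sum_sent"]["bits_per_second"]
    return bits_per_second / 1e9

if __name__ == "__main__":
    gbps = measure_throughput_gbps(REMOTE_DTN, PORT, PARALLEL_STREAMS)
    print(f"Sender throughput to {REMOTE_DTN}: {gbps:.2f} Gb/s")
```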


Conference on High Performance Computing (Supercomputing) | 2006

Detecting distributed scans using high-performance query-driven visualization

Kurt Stockinger; E. Bethel; Scott Campbell; Eli Dart; Kesheng Wu

Modern forensic analytics applications, like network traffic analysis, perform high-performance hypothesis testing, knowledge discovery, and data mining on very large datasets. One essential strategy to reduce the time required for these operations is to select only the most relevant data records for a given computation. In this paper, we present a set of parallel algorithms that demonstrate how an efficient selection mechanism - bitmap indexing - significantly speeds up a common analysis task, namely, computing conditional histograms on very large datasets. We present a thorough study of the performance characteristics of the parallel conditional histogram algorithms. As a case study, we compute conditional histograms for detecting distributed scans hidden in a dataset consisting of approximately 2.5 billion network connection records. We show that these conditional histograms can be computed on an interactive time scale (i.e., in seconds). We also show how to progressively modify the selection criteria to narrow the analysis and find the sources of the distributed scans.
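
The analysis task at the heart of this work is a conditional histogram: count records grouped by one attribute, restricted to records that satisfy a predicate on other attributes. The toy sketch below shows that idea on synthetic connection records, using NumPy boolean masks in place of the paper's bitmap indexes and parallel algorithms; the field names and thresholds are made up for illustration.

```python
# Illustrative sketch only: a conditional histogram over synthetic "connection
# records", using NumPy boolean masks as a stand-in for the bitmap indexes and
# parallel algorithms described in the paper.
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000                                   # synthetic records, not 2.5 billion
dest_port = rng.integers(0, 65536, size=n)      # destination port per connection
src_net   = rng.integers(0, 256, size=n)        # pretend source-network identifier
duration  = rng.exponential(1.0, size=n)        # connection duration in seconds

# Condition: short-lived connections to low-numbered ports, the kind of
# predicate one might use when hunting for scan traffic.
mask = (duration < 0.1) & (dest_port < 1024)

# Conditional histogram: for records satisfying the condition, count hits per
# source network. Peaks in this histogram point at likely scan sources.
hist = np.bincount(src_net[mask], minlength=256)

top = np.argsort(hist)[::-1][:5]
for net in top:
    print(f"source network {net:3d}: {hist[net]} suspicious connections")
```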


High Performance Distributed Computing | 2010

Lessons learned from moving earth system grid data sets over a 20 Gbps wide-area network

Rajkumar Kettimuthu; Alex Sim; Dan Gunter; Bill Allcock; Peer-Timo Bremer; John Bresnahan; Andrew Cherry; Lisa Childers; Eli Dart; Ian T. Foster; Kevin Harms; Jason Hick; Jason Lee; Michael Link; Jeff Long; Keith Miller; Vijaya Natarajan; Valerio Pascucci; Ken Raffenetti; David Ressman; Dean N. Williams; Loren Wilson; Linda Winkler

In preparation for the Intergovernmental Panel on Climate Change (IPCC) Fifth Assessment Report, the climate community will run the Coupled Model Intercomparison Project phase 5 (CMIP-5) experiments, which are designed to answer crucial questions about future regional climate change and the results of carbon feedback for different mitigation scenarios. The CMIP-5 experiments will generate petabytes of data that must be replicated seamlessly, reliably, and quickly to hundreds of research teams around the globe. As an end-to-end test of the technologies that will be used to perform this task, a multi-disciplinary team of researchers moved a small portion (10 TB) of the multimodel Coupled Model Intercomparison Project, Phase 3 data set used in the IPCC Fourth Assessment Report from three sources---the Argonne Leadership Computing Facility (ALCF), Lawrence Livermore National Laboratory (LLNL), and the National Energy Research Scientific Computing Center (NERSC)---to the 2009 Supercomputing conference (SC09) show floor in Portland, Oregon, over circuits provided by DOE's ESnet. The team achieved a sustained data rate of 15 Gb/s on a 20 Gb/s network. More important, this effort provided critical feedback on how to deploy, tune, and monitor the middleware that will be used to replicate the upcoming petascale climate datasets. We report on obstacles overcome and the key lessons learned from this successful bandwidth challenge effort.
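
A quick back-of-envelope check, assuming 1 TB = 10^12 bytes and ignoring protocol overhead and restarts, shows what the reported sustained rate means for the 10 TB data set:

```python
# Back-of-envelope check (not from the paper): how long a 10 TB data set takes
# at the sustained 15 Gb/s the team reported, assuming 1 TB = 10**12 bytes and
# ignoring protocol overhead.
data_bytes = 10 * 10**12          # 10 TB
rate_bits_per_s = 15 * 10**9      # 15 Gb/s sustained

seconds = data_bytes * 8 / rate_bits_per_s
print(f"{seconds:.0f} s  (~{seconds / 3600:.1f} hours)")   # ~5333 s, roughly 1.5 hours
```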


Journal of the American Medical Informatics Association | 2016

The Medical Science DMZ

Sean Peisert; William K. Barnett; Eli Dart; James Cuff; Robert L. Grossman; Edward Balas; Ari E. Berman; Anurag Shankar; Brian Tierney

Objective: We describe use cases and an institutional reference architecture for maintaining high-capacity, data-intensive network flows (e.g., 10, 40, 100 Gbps+) in a scientific, medical context while still adhering to security and privacy laws and regulations.

Materials and Methods: High-end networking, packet filter firewalls, network intrusion detection systems.

Results: We describe a “Medical Science DMZ” concept as an option for secure, high-volume transport of large, sensitive data sets between research institutions over national research networks.

Discussion: The exponentially increasing amounts of “omics” data, the rapid increase of high-quality imaging, and other rapidly growing clinical data sets have resulted in the rise of biomedical research “big data.” The storage, analysis, and network resources required to process these data and integrate them into patient diagnoses and treatments have grown to scales that strain the capabilities of academic health centers. Some data are not generated locally and cannot be sustained locally, and shared data repositories such as those provided by the National Library of Medicine, the National Cancer Institute, and international partners such as the European Bioinformatics Institute are rapidly growing. The ability to store and compute using these data must therefore be addressed by a combination of local, national, and industry resources that exchange large data sets. Maintaining data-intensive flows that comply with HIPAA and other regulations presents a new challenge for biomedical research. Recognizing this, we describe a strategy that marries performance and security by borrowing from and redefining the concept of a “Science DMZ”—a framework that is used in physical sciences and engineering research to manage high-capacity data flows.

Conclusion: By implementing a Medical Science DMZ architecture, biomedical researchers can leverage the scale provided by high-performance computer and cloud storage facilities and national high-speed research networks while preserving privacy and meeting regulatory requirements.


12th International Conference on Synchrotron Radiation Instrumentation (SRI), July 6-10, 2015, New York, NY | 2016

Real-time data-intensive computing

Dilworth Y. Parkinson; Keith Beattie; Xian Chen; Joaquin Correa; Eli Dart; Benedikt J. Daurer; Jack Deslippe; Alexander Hexemer; Harinarayan Krishnan; Alastair A. MacDowell; Filipe R. N. C. Maia; Stefano Marchesini; Howard A. Padmore; Simon J. Patton; Talita Perciano; James A. Sethian; David Shapiro; Rune Stromsness; Nobumichi Tamura; Brian Tierney; Craig E. Tull; Daniela Ushizima

Today users visit synchrotrons as sources of understanding and discovery—not as sources of just light, and not as sources of data. To achieve this, the synchrotron facilities frequently provide not just light but often the entire end station and, increasingly, advanced computational facilities that can reduce terabytes of data into a form that can reveal a new key insight. The Advanced Light Source (ALS) has partnered with high performance computing, fast networking, and applied mathematics groups to create a “super-facility”, giving users simultaneous access to the experimental, computational, and algorithmic resources to make this possible. This combination forms an efficient closed loop, where data—despite its high rate and volume—is transferred and processed immediately and automatically on appropriate computing resources, and results are extracted, visualized, and presented to users or to the experimental control system, both to provide immediate insight and to guide decisions about subsequent experiments.
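
The closed loop described above amounts to an automatic trigger: as data files appear at the beamline, they are shipped to a compute facility, processed, and the results are fed back. The sketch below shows only the shape of that loop, with a simple polling watcher and a placeholder transfer/processing step; it is not the ALS or NERSC production software, and the directory path and file pattern are hypothetical.

```python
# Illustrative sketch only (not the production pipeline): the basic shape of
# the closed loop described above -- watch for newly written data files, hand
# each one to a transfer/processing step as soon as it appears, and report the
# result back for the user or the experiment control system.
import time
from pathlib import Path

WATCH_DIR = Path("/data/beamline/raw")     # placeholder acquisition directory

def transfer_and_process(path: Path) -> str:
    """Placeholder for the real steps: move the file to a compute facility
    (e.g. over a Science DMZ) and launch reconstruction/analysis there."""
    return f"processed {path.name}"

def watch_loop(poll_seconds: float = 2.0) -> None:
    seen: set[Path] = set()
    while True:
        for path in WATCH_DIR.glob("*.h5"):   # hypothetical file pattern
            if path not in seen:
                seen.add(path)
                print(transfer_and_process(path))   # feed the result back immediately
        time.sleep(poll_seconds)

if __name__ == "__main__":
    watch_loop()
```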


Lawrence Berkeley National Laboratory | 2011

BER Science Network Requirements

Eli Dart; Brian Tierney

The Energy Sciences Network (ESnet) is the primary provider of network connectivity for the U.S. Department of Energy (DOE) Office of Science (SC), the single largest supporter of basic research in the physical sciences in the United States. To support SC programs, ESnet regularly updates and refreshes its understanding of the networking requirements of the instruments, facilities, scientists, and science programs it serves. This focus has helped ESnet to be a highly successful enabler of scientific discovery for over 20 years. In August 2011, ESnet and the Office of Nuclear Physics (NP), of the DOE SC, organized a workshop to characterize the networking requirements of the programs funded by NP. The requirements identified at the workshop are summarized in the Findings section, and are described in more detail in the body of the report.


Published 20 Apr 2001 | 2001

Building and measuring a high performance network architecture

William Kramer; Timothy Toole; Chuck Fisher; Jon Dugan; David R. Wheeler; William R. Wing; William Nickless; Gregory Goddard; Steven Corbato; E. Paul Love; Paul Daspit; Hal Edwards; Linden Mercer; David Koester; Basil Decina; Eli Dart; Paul Reisinger; Riki Kurihara; Matthew J. Zekauskas; Eric Plesset; Julie Wulf; Douglas Luce; James Rogers; Rex Duncan; Jeffery Mauth

Once a year, the SC conferences present a unique opportunity to create and build one of the most complex and highest performance networks in the world. At SC2000, large-scale and complex local and wide area networking connections were demonstrated, including large-scale distributed applications running on different architectures. This project used the unique opportunity presented at SC2000 to create a testbed network environment and then used that network to demonstrate and evaluate high performance computational and communication applications. The testbed incorporated many interoperable systems and services and was designed for measurement from the very beginning. The end results were key insights into how to use novel, high performance networking technologies, along with accumulated measurements that shed light on the networks of the future.


Lawrence Berkeley National Laboratory | 2010

Efficient Bulk Data Replication for the Earth System Grid

Alex Sim; Dan Gunter; Vijaya Natarajan; Arie Shoshani; Dean N. Williams; Jeff Long; Jason Hick; Jason D. Lee; Eli Dart

The Earth System Grid (ESG) community faces the difficult challenge of managing the distribution of massive data sets to thousands of scientists around the world. To move data replicas efficiently, the ESG has developed a data transfer management tool called the Bulk Data Mover (BDM). We describe the performance results of the current system and plans towards extending the techniques developed so far for the upcoming project, in which the ESG will employ advanced networks to move multi-TB datasets with the ultimate goal of helping researchers understand climate change and its potential impacts on world ecology and society.
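
One technique a tool like the BDM depends on is keeping many file transfers in flight at once, so the wide-area link stays full even when individual transfers stall. The sketch below illustrates that idea with a simple thread pool and a placeholder per-file transfer function; it is not the BDM implementation, and the example URLs and paths are hypothetical.

```python
# Hedged sketch, not the actual Bulk Data Mover: replicate a list of files by
# handing them to a pool of concurrent transfer workers drawn from one queue.
from concurrent.futures import ThreadPoolExecutor, as_completed

def transfer_one(src: str, dst: str) -> str:
    """Placeholder for a single file transfer (checksummed copy with retries)."""
    # ... perform the copy here ...
    return f"{src} -> {dst}"

def replicate(file_list: list[tuple[str, str]], concurrency: int = 8) -> None:
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(transfer_one, src, dst) for src, dst in file_list]
        for future in as_completed(futures):
            print("done:", future.result())

if __name__ == "__main__":
    # Hypothetical replication list: (source URL, destination path) pairs.
    replicate([("gsiftp://esg.example.org/cmip/f1.nc", "/archive/cmip/f1.nc"),
               ("gsiftp://esg.example.org/cmip/f2.nc", "/archive/cmip/f2.nc")])
```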


Archive | 2014

DOE High Performance Computing Operational Review (HPCOR): Enabling Data-Driven Scientific Discovery at HPC Facilities

Richard A. Gerber; William Allcock; Chris Beggio; Stuart Campbell; Andrew Cherry; Shreyas Cholia; Eli Dart; Clay England; Tim J. Fahey; Fernanda Foertter; Robin J. Goldstone; Jason Hick; David Karelitz; Kaki Kelly; Laura Monroe; Prabhat; David Skinner; Julia White

U.S. Department of Energy (DOE) High Performance Computing (HPC) facilities are on the verge of a paradigm shift in the way they deliver systems and services to science and engineering teams. Research projects are producing a wide variety of data at unprecedented scale and level of complexity, with community-specific services that are part of the data collection and analysis workflow. On June 18-19, 2014, representatives from six DOE HPC centers met in Oakland, CA, at the DOE High Performance Computing Operational Review (HPCOR) to discuss how they can best provide facilities and services to enable large-scale data-driven scientific discovery at the DOE national laboratories. The report contains findings from that review.


PeerJ | 2018

The Modern Research Data Portal: a design pattern for networked, data-intensive science

Kyle Chard; Eli Dart; Ian T. Foster; David J. Shifflett; Steven Tuecke; Jason Williams


Collaboration


Dive into Eli Dart's collaborations.

Top Co-Authors

Brian Tierney, Lawrence Berkeley National Laboratory
Kurt Stockinger, Lawrence Berkeley National Laboratory
Scott Campbell, Lawrence Berkeley National Laboratory
E. Wes Bethel, Lawrence Berkeley National Laboratory
Joe Metzger, University of Delaware
Lauren Rotman, Lawrence Berkeley National Laboratory
William E. Johnston, Lawrence Berkeley National Laboratory
Chin Guok, Lawrence Berkeley National Laboratory
Craig E. Tull, Lawrence Berkeley National Laboratory
David Skinner, Lawrence Berkeley National Laboratory