Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Joseph P. White is active.

Publication


Featured research published by Joseph P. White.


Physics of Fluids | 2005

Three-dimensional instabilities of liquid-lined elastic tubes: A thin-film fluid-structure interaction model

Joseph P. White; Matthias Heil

We develop a theoretical model of surface-tension-driven, three-dimensional instabilities of liquid-lined elastic tubes—a model for pulmonary airway closure. The model is based on large-displacement shell theory, coupled to the equations of lubrication theory, modified to ensure the exact representation of the system’s equilibrium configurations. The liquid film that lines the initially uniform, axisymmetric tube can become unstable to a surface-tension-driven instability. We show that, if the surface tension of the liquid lining is sufficiently large (relative to the tube’s bending stiffness), the axisymmetric redistribution of fluid by this instability can increase the wall compression to such an extent that the system becomes unstable to a secondary, nonaxisymmetric instability which causes the tube wall to buckle. We establish the conditions for the occurrence of the nonaxisymmetric instability by a linear stability analysis and use finite element simulations to explore the system’s subsequent evoluti...
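
For orientation, the lubrication-theory part of such a model reduces to an evolution equation for the film thickness driven by gradients of the surface-tension-induced pressure. The schematic form below is only the generic statement of that balance, not the paper's exact formulation (which couples the film to large-displacement shell theory for the deforming wall):

    \[
      \frac{\partial h}{\partial t} = -\nabla \cdot \mathbf{q},
      \qquad
      \mathbf{q} = -\frac{h^{3}}{3\mu}\,\nabla p,
      \qquad
      p = -\sigma \kappa,
    \]

where h is the film thickness, \mu the liquid viscosity, \sigma the surface tension and \kappa the mean curvature of the air-liquid interface.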


International Conference on Cluster Computing | 2015

Analysis of XDMoD/SUPReMM Data Using Machine Learning Techniques

Steven M. Gallo; Joseph P. White; Robert L. DeLeon; Thomas R. Furlani; Helen Ngo; Abani K. Patra; Matthew D. Jones; Jeffrey T. Palmer; Nikolay Simakov; Jeanette M. Sperhac; Martins Innus; Thomas Yearke; Ryan Rathsam

Machine learning techniques were applied to job accounting and performance data for application classification. Job data were accumulated using the XDMoD monitoring technology named SUPReMM; they consist of job accounting information, application information from Lariat/XALT, and job performance data from TACC_Stats. The results clearly demonstrate that community applications have characteristic signatures which can be exploited for job classification. We conclude that machine learning can assist in classifying jobs of unknown application, in characterizing the job mixture, and in harnessing the variation in node and time dependence for further analysis.
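
As a rough illustration of the approach, a classifier can be trained on per-job accounting and performance metrics to predict the application; the sketch below (Python with scikit-learn) uses synthetic data and invented feature names, not the actual SUPReMM/Lariat/XALT/TACC_Stats feature set.

    # Sketch of application classification from per-job metrics.
    # Feature names, labels and data are illustrative only.
    import numpy as np
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n_jobs = 500
    jobs = pd.DataFrame({
        "cores": rng.choice([16, 32, 64, 128], n_jobs),       # accounting info
        "wallclock_hours": rng.uniform(0.1, 48.0, n_jobs),
        "cpu_user_frac": rng.uniform(0.2, 1.0, n_jobs),       # performance metrics
        "flops_per_core": rng.lognormal(20.0, 1.0, n_jobs),
        "mem_bw_gbs": rng.uniform(1.0, 80.0, n_jobs),
        "ib_bytes_per_core": rng.lognormal(15.0, 2.0, n_jobs),
    })
    # The application label would normally come from Lariat/XALT; synthetic here.
    labels = rng.choice(["namd", "wrf", "enzo", "unknown"], n_jobs)

    X_train, X_test, y_train, y_test = train_test_split(
        jobs, labels, test_size=0.25, random_state=0)
    clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
    print("held-out accuracy:", clf.score(X_test, y_test))
    print(dict(zip(jobs.columns, clf.feature_importances_.round(3))))

With real data, the feature importances indicate which characteristic signatures separate the community applications.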


Extreme Science and Engineering Discovery Environment | 2014

An Analysis of Node Sharing on HPC Clusters using XDMoD/TACC_Stats

Joseph P. White; Robert L. DeLeon; Thomas R. Furlani; Steven M. Gallo; Matthew D. Jones; Amin Ghadersohi; Cynthia D. Cornelius; Abani K. Patra; James C. Browne; William L. Barth; John Hammond

When a user requests less than a full node for a job on XSEDE's large resources, Stampede and Lonestar4 (that is, fewer than 16 cores on Stampede or 12 cores on Lonestar4), they are assigned a full node by policy. Although the actual CPU hours consumed by these jobs are small compared to the total CPU hours delivered by these resources, they represent a substantial fraction of the total number of jobs (~18% for Stampede and ~15% for Lonestar4 between January and February 2014). Academic HPC centers, such as the Center for Computational Research (CCR) at the University at Buffalo, SUNY, typically have a much larger proportion of small jobs than the large XSEDE systems. For CCR's production cluster, Rush, the decision was made to allow the allocation of simultaneous jobs on the same node. This greatly increases overall throughput but also raises the question of whether jobs that share a node interfere with one another. We present here an analysis that explores this issue using data from Rush, Stampede and Lonestar4. Analysis of usage data indicates little interference.


Scopus | 2014

Comprehensive, open-source resource usage measurement and analysis for HPC systems

James C. Browne; Robert L. DeLeon; Abani K. Patra; William L. Barth; John Hammond; Jones; Tom Furlani; Barry I. Schneider; Steven M. Gallo; Amin Ghadersohi; Ryan J. Gentner; Jeffrey T. Palmer; Nikolay Simakov; Martins Innus; Andrew E. Bruno; Joseph P. White; Cynthia D. Cornelius; Thomas Yearke; Kyle Marcus; G. Von Laszewski; Fugang Wang

The important role high-performance computing (HPC) resources play in science and engineering research, coupled with their high cost (capital, power and manpower), short life and oversubscription, requires us to optimize their usage, an outcome that is only possible if adequate analytical data are collected and used to drive systems management at different granularities: job, application, user and system. This paper presents a method for comprehensive job-, application- and system-level resource-use measurement and analysis, and its implementation. The steps in the method are: system-wide collection of comprehensive resource-use and performance statistics at the job and node levels in a uniform format across all resources, and mapping and storage of the resultant job-wise data to a relational database, which enables subsequent transformation of the data into the formats required by specific statistical and analytical algorithms. Analyses can be carried out at different levels of granularity: job, user, application or system-wide. Measurements are based on a new lightweight job-centric measurement tool, 'TACC_Stats', which gathers a comprehensive set of resource-use metrics on all compute nodes, and on data logged by the system scheduler. The data mapping and analysis tools are an extension of the XDMoD project. The method is illustrated with analyses of resource use for the Texas Advanced Computing Center's Lonestar4, Ranger and Stampede supercomputers and the HPC cluster at the Center for Computational Research. The illustrations focus on resource use at the system, job and application levels and reveal many interesting insights into system usage patterns, as well as anomalous behavior due to failure or misuse. The method can be applied to any system that runs the TACC_Stats measurement tool and a tool to extract job execution environment data from the system scheduler.
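
The central data-handling step described here, reducing node-level samples to one record per job in a relational store, might look roughly like the following sketch (Python/pandas with SQLite); the column names and schema are hypothetical and do not reproduce the real TACC_Stats record format or the Open XDMoD schema.

    # Sketch: map node-level samples to job-wise rows and store them relationally.
    import pandas as pd
    import sqlite3

    # One row per (job, node, timestamp) sample, as a node-level collector might emit.
    samples = pd.DataFrame({
        "job_id":   [101, 101, 101, 102, 102],
        "node":     ["c1", "c2", "c1", "c3", "c3"],
        "cpu_user": [0.91, 0.88, 0.93, 0.40, 0.42],
        "mem_gb":   [45.0, 44.1, 46.2, 12.3, 12.8],
    })

    # Reduce to one row per job: counts, means and maxima over nodes and samples.
    jobwise = samples.groupby("job_id").agg(
        nodes=("node", "nunique"),
        avg_cpu_user=("cpu_user", "mean"),
        max_mem_gb=("mem_gb", "max"),
    ).reset_index()

    # Store job-wise records for later job/user/application/system-level analysis.
    with sqlite3.connect(":memory:") as con:
        jobwise.to_sql("job_summary", con, index=False)
        print(pd.read_sql("SELECT * FROM job_summary", con))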


Proceedings of the XSEDE16 Conference on Diversity, Big Data, and Science at Scale | 2016

A Quantitative Analysis of Node Sharing on HPC Clusters Using XDMoD Application Kernels

Nikolay Simakov; Robert L. DeLeon; Joseph P. White; Thomas R. Furlani; Martins Innus; Steven M. Gallo; Matthew D. Jones; Abani K. Patra; Benjamin D. Plessinger; Jeanette M. Sperhac; Thomas Yearke; Ryan Rathsam; Jeffrey T. Palmer

In this investigation, we study how application performance is affected when jobs are permitted to share compute nodes. A series of application kernels consisting of a diverse set of benchmark calculations were run in both exclusive and node-sharing modes on the Center for Computational Research's high-performance computing (HPC) cluster. Very little increase in runtime was observed due to job contention among application kernel jobs run on shared nodes. The small differences in runtime were quantitatively modeled in order to characterize the resource contention and to attempt to determine the circumstances under which it would or would not be important. A machine learning regression model applied to the runtime data successfully fitted the small differences between the exclusive and shared-node runtime data; it also provided insight into the contention for node resources that occurs when jobs are allowed to share nodes. Analysis of a representative job mix shows that the runtime of shared jobs is affected primarily by the memory subsystem, in particular by the reduction in effective cache size due to sharing, which leads to higher utilization of DRAM. Insights such as these are crucial when formulating policies that propose node sharing as a mechanism for improving HPC utilization.
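
The regression step can be sketched as follows (Python with scikit-learn); the predictors, the synthetic slowdown response and the choice of a gradient-boosted model are illustrative assumptions, not the paper's actual model or application-kernel metrics.

    # Sketch: regress the shared-vs-exclusive slowdown on co-located jobs' resource pressure.
    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(1)
    n = 400
    # Hypothetical descriptors of the jobs sharing the node with the application kernel.
    neighbor_mem_bw   = rng.uniform(0.0, 60.0, n)   # memory bandwidth used (GB/s)
    neighbor_llc_miss = rng.uniform(0.0, 1.0, n)    # last-level-cache miss fraction
    neighbor_cores    = rng.integers(1, 12, n)      # cores taken by neighbours

    X = np.column_stack([neighbor_mem_bw, neighbor_llc_miss, neighbor_cores])
    # Synthetic response dominated by memory-subsystem pressure, mimicking the
    # finding that cache/DRAM contention drives the (small) slowdown.
    slowdown = 0.002 * neighbor_mem_bw + 0.05 * neighbor_llc_miss + rng.normal(0.0, 0.01, n)

    model = GradientBoostingRegressor(random_state=0)
    scores = cross_val_score(model, X, slowdown, cv=5, scoring="r2")
    print("cross-validated R^2:", scores.mean().round(3))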


Scopus | 2015

Application kernels: HPC resources performance monitoring and variance analysis

Nikolay Simakov; Joseph P. White; Robert L. DeLeon; Amin Ghadersohi; Tom Furlani; Jones; Steven M. Gallo; Abani K. Patra

Application kernels are computationally lightweight benchmarks or applications run repeatedly on high performance computing (HPC) clusters in order to track the Quality of Service (QoS) provided to the users. They have been successful in detecting a variety of hardware and software issues, some severe, that have subsequently been corrected, resulting in improved system performance and throughput. In this work, the application kernels performance-monitoring module of eXtreme Data Metrics on Demand (XDMoD) is described. Through the XDMoD framework, the application kernels have been run repetitively on the Texas Advanced Computing Center's Stampede and Lonestar4 clusters for a total of over 14,000 jobs. This provides a body of data on the HPC clusters' operation that can be used to statistically analyze how application performance, as measured by metrics such as execution time and communication bandwidth, is affected by the clusters' workload. We discuss metric distributions, carry out regression and correlation analyses, and use a principal component analysis (PCA) to describe the variance and relate it to factors such as the spatial distribution of the application in the cluster. Ultimately, these types of analyses can be used to improve the application kernel mechanism, which in turn results in improved QoS of the HPC infrastructure delivered to the end users.
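
A minimal version of the PCA step, on synthetic kernel metrics with invented names (the real XDMoD application-kernel metrics are not reproduced here), could look like this:

    # Sketch: PCA over repeated application-kernel runs to see where the variance lies.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(2)
    n_runs = 300
    metrics = np.column_stack([
        rng.normal(120.0, 10.0, n_runs),  # wall time (s)
        rng.normal(5.0, 0.5, n_runs),     # MPI bandwidth (GB/s)
        rng.normal(0.8, 0.1, n_runs),     # parallel efficiency
        rng.normal(30.0, 5.0, n_runs),    # memory bandwidth (GB/s)
    ])

    pca = PCA(n_components=2).fit(StandardScaler().fit_transform(metrics))
    print("explained variance ratio:", pca.explained_variance_ratio_.round(2))
    print("leading component loadings:", pca.components_[0].round(2))

On real runs, the loadings of the leading components can then be compared against placement factors such as how the job was spread across the cluster.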


International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems | 2017

A Slurm Simulator: Implementation and Parametric Analysis

Nikolay Simakov; Martins Innus; Matthew D. Jones; Robert L. DeLeon; Joseph P. White; Steven M. Gallo; Abani K. Patra; Thomas R. Furlani

Slurm is an open-source resource manager for HPC that provides high configurability for inhomogeneous resources and job scheduling. Various Slurm parameter settings can significantly influence HPC resource utilization and job wait time; however, in many cases it is hard to judge how these options will affect overall HPC resource performance. A Slurm simulator can be a very helpful tool to aid parameter selection for a particular HPC resource. Here, we report our implementation of a Slurm simulator and the impact of parameter choice on HPC resource performance. The simulator is based on a real Slurm instance, with modifications to allow simulation of historical jobs and to improve simulation speed. The simulator speed depends heavily on job composition, HPC resource size and Slurm configuration. For an 8,000-core heterogeneous cluster, we achieve roughly 100-fold acceleration; for example, 20 days of workload can be simulated in 5 hours. Several parameters affecting job placement were studied. Disabling node sharing on our 8,000-core cluster showed a 45% increase in the time needed to complete the same workload. For a large system (>6,000 nodes) composed of two distinct sub-clusters, using two separate Slurm controllers and adding node sharing can cut waiting times nearly in half.


Proceedings of the Practice and Experience on Advanced Research Computing | 2018

Automatic Characterization of HPC Job Parallel Filesystem I/O Patterns

Joseph P. White; Alexander D. Kofke; Robert L. DeLeon; Martins Innus; Matthew D. Jones; Thomas R. Furlani

As part of the NSF-funded XMS project, we are actively researching automatic detection of poorly performing HPC jobs. To aid the analysis, we have generated a taxonomy of the temporal I/O patterns of HPC jobs. In this paper we describe the design of temporal-pattern characterization algorithms for HPC job I/O. We have implemented these algorithms in the Open XDMoD job analysis framework. The I/O classifications include periodic patterns and a variety of characteristic non-periodic patterns. We present an analysis of the I/O patterns observed on the /scratch filesystem of an academic HPC cluster. This type of analysis can be extended to other HPC usage data such as memory, CPU and interconnect usage. Ultimately this analysis will be used to improve HPC throughput and efficiency by, for example, automatically identifying anomalous HPC jobs.
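
One simple way to flag the periodic class, a stand-in for (not a reproduction of) the characterization algorithms described here, is an autocorrelation test on a job's I/O time series:

    # Sketch: detect periodic I/O in a job's write-rate time series via autocorrelation.
    import numpy as np

    def dominant_period(series, min_lag=2):
        """Return (lag, strength) of the strongest autocorrelation peak."""
        x = np.asarray(series, dtype=float)
        x = x - x.mean()
        acf = np.correlate(x, x, mode="full")[len(x) - 1:]
        acf = acf / acf[0]                     # normalise so acf[0] == 1
        lag = min_lag + int(np.argmax(acf[min_lag:len(x) // 2]))
        return lag, acf[lag]

    # Synthetic /scratch write-rate samples: a burst every 10 intervals plus noise.
    rng = np.random.default_rng(3)
    t = np.arange(600)
    io_rate = 50.0 * (t % 10 == 0) + rng.normal(0.0, 1.0, t.size)

    lag, strength = dominant_period(io_rate)
    print(f"candidate period: {lag} samples, autocorrelation {strength:.2f}")
    if strength > 0.5:
        print("classified as periodic I/O")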


Proceedings of the Practice and Experience on Advanced Research Computing | 2018

Slurm Simulator: Improving Slurm Scheduler Performance on Large HPC systems by Utilization of Multiple Controllers and Node Sharing

Nikolay Simakov; Robert L. DeLeon; Martins Innus; Matthew D. Jones; Joseph P. White; Steven M. Gallo; Abani K. Patra; Thomas R. Furlani

A Slurm simulator was used to study the potential benefits of using multiple Slurm controllers and node sharing on the TACC Stampede 2 system. Splitting a large cluster into smaller sub-clusters with separate Slurm controllers can offer better scheduling performance and responsiveness, because the increased computational capability available to each controller improves the efficiency of the backfill scheduler. The disadvantages are additional hardware, more maintenance and the inability to run jobs across the sub-clusters. Node sharing can increase system throughput by allowing several sub-node jobs to execute on the same node; however, node sharing is more computationally demanding and might not be advantageous on larger systems. The Slurm simulator allows an estimation of the potential benefits of these configurations and provides information on the advantages to be expected from deploying them. In this work, multiple Slurm controllers and node sharing were tested on a TACC Stampede 2 system consisting of two distinct node types: 4,200 Intel Xeon Phi Knights Landing (KNL) nodes and 1,736 Intel Xeon Skylake-X (SLX) nodes. For this system, using separate controllers for the KNL and SLX nodes, with node sharing allowed on the SLX nodes, resulted in a 40% reduction in waiting times for jobs executed on the SLX nodes. This improvement can be attributed to the better performance of the backfill scheduler, which scheduled 30% more SLX jobs, showed a 30% reduction in the fraction of scheduling cycles that hit the time limit, and nearly doubled the number of job scheduling attempts.


Proceedings of the Practice and Experience in Advanced Research Computing 2017 on Sustainability, Success and Impact | 2017

Challenges of Workload Analysis on Large HPC Systems: A Case Study on NCSA Blue Waters

Joseph P. White; Martins Innus; Matthew D. Jones; Robert L. DeLeon; Nikolay Simakov; Jeffrey T. Palmer; Steven M. Gallo; Thomas R. Furlani; Michael T. Showerman; Robert J. Brunner; Andry Kot; Gregory H. Bauer; Brett Bode; Jeremy Enos; William T. Kramer

Blue Waters [4] is a petascale supercomputer whose mission is to greatly accelerate insight into the most challenging computational and data analysis problems. We performed a detailed workload analysis of Blue Waters [8] using Open XDMoD [10]. The analysis used approximately 35,000 node hours to process roughly 95 TB of input data from over 4.5M jobs that ran on Blue Waters during the period studied (April 1, 2013 to September 30, 2016). This paper describes the work that was done to collate, process and analyze the data collected on Blue Waters, the design decisions that were made, the tools that we created, and the various software engineering problems that we encountered and solved. In particular, we describe the challenges to data processing unique to Blue Waters engendered by the extremely large jobs that it typically executed.

Collaboration


Dive into Joseph P. White's collaborations.

Top Co-Authors

Jeffrey T. Palmer

State University of New York System

Thomas Yearke

State University of New York System
