W. Michael Brown
National Center for Computational Sciences
Publication
Featured research published by W. Michael Brown.
Computer Physics Communications | 2011
W. Michael Brown; Peng Wang; Steven J. Plimpton; Arnold N. Tharrington
The use of accelerators such as graphics processing units (GPUs) has become popular in scientific computing applications due to their low cost, impressive floating-point capabilities, high memory bandwidth, and low electrical power requirements. Hybrid high-performance computers, machines with more than one type of floating-point processor, are now becoming more prevalent due to these advantages. In this work, we discuss several important issues in porting a large molecular dynamics code for use on parallel hybrid machines: (1) choosing a hybrid parallel decomposition that works on central processing units (CPUs) with distributed memory and accelerator cores with shared memory, (2) minimizing the amount of code that must be ported for efficient acceleration, (3) utilizing the available processing power from both multi-core CPUs and accelerators, and (4) choosing a programming model for acceleration. We present our solution to each of these issues for short-range force calculation in the molecular dynamics package LAMMPS; however, the methods can be applied in many molecular dynamics codes. Specifically, we describe algorithms for efficient short-range force calculation on hybrid high-performance machines. We describe an approach for dynamic load balancing of work between CPU and accelerator cores. We describe the Geryon library that allows a single code to compile with both CUDA and OpenCL for use on a variety of accelerators. Finally, we present results on a parallel test cluster containing 32 Fermi GPUs and 180 CPU cores.
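The dynamic CPU/accelerator load balancing mentioned in the abstract can be illustrated with a minimal sketch: the fraction of short-range work offloaded to the GPU is adjusted each timestep so that measured CPU and accelerator times equalize. The class and variable names below are illustrative assumptions, not the LAMMPS or Geryon API.

```cpp
// Minimal sketch of dynamic CPU/accelerator load balancing for a pair-force
// loop: the fraction of particles offloaded to the GPU is adjusted each step
// so that measured CPU and GPU times converge. Names are illustrative.
#include <algorithm>
#include <cstdio>

struct LoadBalancer {
  double split = 0.5;   // fraction of particles assigned to the accelerator
  double alpha = 0.25;  // damping factor for the update

  // t_gpu and t_cpu are wall-clock times (s) for the last force evaluation.
  void update(double t_gpu, double t_cpu) {
    // Processing rates are proportional to (work done) / (time taken);
    // the target split equalizes the two times.
    double rate_gpu = split / t_gpu;
    double rate_cpu = (1.0 - split) / t_cpu;
    double target = rate_gpu / (rate_gpu + rate_cpu);
    split = (1.0 - alpha) * split + alpha * target;  // damped update
    split = std::min(0.95, std::max(0.05, split));   // keep both devices busy
  }
};

int main() {
  LoadBalancer lb;
  // Example: a GPU ~4x faster on this kernel; split should drift toward ~0.8.
  for (int step = 0; step < 20; ++step) {
    double t_gpu = lb.split / 4.0;
    double t_cpu = (1.0 - lb.split) / 1.0;
    lb.update(t_gpu, t_cpu);
  }
  std::printf("converged split (fraction on GPU): %.2f\n", lb.split);
  return 0;
}
```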
Computer Physics Communications | 2012
W. Michael Brown; Axel Kohlmeyer; Steven J. Plimpton; Arnold N. Tharrington
The use of accelerators such as graphics processing units (GPUs) has become popular in scientific computing applications due to their low cost, impressive floating-point capabilities, high memory bandwidth, and low electrical power requirements. Hybrid high-performance computers, machines with nodes containing more than one type of floating-point processor (e.g. CPU and GPU), are now becoming more prevalent due to these advantages. In this paper, we present a continuation of previous work implementing accelerator algorithms in the LAMMPS molecular dynamics software for distributed-memory parallel hybrid machines. In our previous work, we focused on acceleration for short-range models with an approach intended to harness the processing power of both the accelerator and (multi-core) CPUs. To augment the existing implementations, we present an efficient implementation of long-range electrostatic force calculation for molecular dynamics. Specifically, we present an implementation of the particle–particle particle-mesh method based on the work by Harvey and De Fabritiis. We present benchmark results on the Keeneland InfiniBand GPU cluster. We provide a performance comparison of the same kernels compiled with both CUDA and OpenCL. We discuss limitations to parallel efficiency and future directions for improving performance on hybrid or heterogeneous computers.
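For reference, the particle–particle particle-mesh method builds on the standard Ewald splitting of the Coulomb kernel into a cutoff-limited real-space part and a smooth part evaluated on a mesh with FFTs. The form below (Gaussian units) is the textbook decomposition, not an equation reproduced from the paper.

```latex
% Standard Ewald-style splitting underlying P3M: the 1/r kernel separates into
% a short-range part evaluated in real space with a cutoff and a smooth
% long-range part evaluated on a mesh in reciprocal space.
\[
\frac{1}{r} \;=\;
\underbrace{\frac{\operatorname{erfc}(\alpha r)}{r}}_{\text{particle--particle (real space)}}
\;+\;
\underbrace{\frac{\operatorname{erf}(\alpha r)}{r}}_{\text{particle--mesh (reciprocal space)}}
\]
\[
U_{\text{real}} = \frac{1}{2}\sum_{i\neq j} q_i q_j\,\frac{\operatorname{erfc}(\alpha r_{ij})}{r_{ij}},
\qquad
U_{\text{recip}} = \frac{1}{2V}\sum_{\mathbf{k}\neq 0}\frac{4\pi}{k^{2}}\,
e^{-k^{2}/4\alpha^{2}}\,\Bigl|\sum_{j} q_j\, e^{\,i\mathbf{k}\cdot\mathbf{r}_j}\Bigr|^{2}.
\]
```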
Physical Chemistry Chemical Physics | 2013
Jan-Michael Y. Carrillo; Rajeev Kumar; Monojoy Goswami; Bobby G. Sumpter; W. Michael Brown
Organic photovoltaics (OPVs) are a topic of extensive research because of their potential application in solar cells. Recent work has led to the development of a coarse-grained model for studying poly(3-hexylthiophene) (P3HT) and [6,6]-phenyl-C61-butyric acid methyl ester (PCBM) blends using molecular simulations. Here we provide further validation of the force field and use it to study the thermal annealing process of P3HT:PCBM blends. A key finding of our study is that, in contrast to a previous report, the annealing process does not converge at the short time scales reported. Rather, we find that the self-assembly of the blends is characterized by three rate dependent stages that require much longer simulations to approach convergence. Using state-of-the-art high performance computing, we are able to study annealing at length and time scales commensurate with devices used in experiments. Our simulations show different phase segregated morphologies dependent on the P3HT chain length and PCBM volume fraction in the blend. For short chain lengths, we observed a smectic morphology containing alternate P3HT and PCBM domains. In contrast, a phase segregated morphology containing domains of P3HT and PCBM distributed randomly in space is found for longer chain lengths. Theoretical arguments justifying stabilization of these morphologies due to shape anisotropy of P3HT (rod-like) and PCBM (sphere-like) are presented. Furthermore, results on the structure factor, miscibility of P3HT and PCBM, domain spacing and kinetics of phase segregation in the blends are presented in detail. Qualitative comparison of these results with published small-angle neutron scattering experiments in the literature is presented and an excellent agreement is found.
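For reference, the structure factor reported for such blends is conventionally computed from the particle coordinates as follows (standard definition, not specific to this paper):

```latex
\[
S(\mathbf{q}) \;=\; \frac{1}{N}\left\langle \Bigl|\sum_{j=1}^{N} e^{\,i\mathbf{q}\cdot\mathbf{r}_j}\Bigr|^{2} \right\rangle ,
\]
```

where the angle brackets denote an average over configurations and, for isotropic systems, over wavevectors of equal magnitude q.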
Computer Physics Communications | 2013
W. Michael Brown; Masako Yamada
The use of coprocessors or accelerators such as graphics processing units (GPUs) has become popular in scientific computing applications due to their low cost, impressive floating-point capabilities, high memory bandwidth, and low electrical power requirements. Hybrid high-performance computers, defined as machines with nodes containing more than one type of floating-point processor (e.g. CPU and GPU), are now becoming more prevalent due to these advantages. Although there has been extensive research into methods to use accelerators efficiently to improve the performance of molecular dynamics (MD) codes employing pairwise potential energy models, little is reported in the literature for models that include many-body effects. 3-body terms are required for many popular potentials such as MEAM, Tersoff, REBO, AIREBO, Stillinger–Weber, Bond-Order Potentials, and others. Because the per-atom simulation times are much higher for models incorporating 3-body terms, there is a clear need for efficient algorithms usable on hybrid high performance computers. Here, we report a shared-memory force-decomposition for 3-body potentials that avoids memory conflicts to allow for a deterministic code with substantial performance improvements on hybrid machines. We describe modifications necessary for use in distributed memory MD codes and show results for the simulation of water with Stillinger–Weber on the hybrid Titan supercomputer. We compare performance of the 3-body model to the SPC/E water model when using accelerators. Finally, we demonstrate that our approach can attain a speedup of 5.1 with acceleration on Titan for production simulations to study water droplet freezing on a surface.
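The shared-memory force decomposition for 3-body terms can be sketched as follows: each thread owns one atom and accumulates only that atom's force, recomputing each triple's contribution for every member atom rather than scattering partial forces with atomic operations, which keeps the summation order fixed and the result deterministic. The toy potential and dense triple loop below are illustrative assumptions only; this is not the Stillinger–Weber kernel used in the paper.

```cpp
// Sketch of a shared-memory force decomposition for 3-body terms: the loop
// body for atom a writes only to f[a], recomputing every triple that atom a
// belongs to. No atomics are needed and the accumulation order is fixed, so
// the result is deterministic. Toy potential and dense O(N^3) enumeration;
// a real MD code would use neighbor lists and one GPU thread per atom.
#include <array>
#include <cstdio>
#include <vector>

using Vec3 = std::array<double, 3>;

static void acc(Vec3& f, const Vec3& df) {
  for (int d = 0; d < 3; ++d) f[d] += df[d];
}

// Toy 3-body energy per triple (i; j,k):  E = 0.5 * kappa * (r_i - r_j).(r_i - r_k)
static Vec3 force_on_center(const Vec3& ri, const Vec3& rj, const Vec3& rk,
                            double kappa) {
  Vec3 f{};
  for (int d = 0; d < 3; ++d) f[d] = -0.5 * kappa * (2.0 * ri[d] - rj[d] - rk[d]);
  return f;
}

// Force on satellite atom j (or k); "rother" is the other satellite's position.
static Vec3 force_on_satellite(const Vec3& ri, const Vec3& rother, double kappa) {
  Vec3 f{};
  for (int d = 0; d < 3; ++d) f[d] = 0.5 * kappa * (ri[d] - rother[d]);
  return f;
}

int main() {
  const double kappa = 1.0;
  std::vector<Vec3> x = {{0, 0, 0}, {1, 0, 0}, {0, 1, 0}, {0, 0, 1}};
  const int n = static_cast<int>(x.size());
  std::vector<Vec3> f(n, Vec3{});

  // Force decomposition: with OpenMP or one GPU thread per atom, this outer
  // loop runs in parallel and each iteration touches only f[a].
  for (int a = 0; a < n; ++a) {
    for (int i = 0; i < n; ++i) {
      for (int j = 0; j < n; ++j) {
        for (int k = j + 1; k < n; ++k) {
          if (j == i || k == i) continue;           // triple = central i, pair j<k
          if (a != i && a != j && a != k) continue; // skip triples not touching a
          if (a == i)
            acc(f[a], force_on_center(x[i], x[j], x[k], kappa));
          else
            acc(f[a], force_on_satellite(x[i], x[a == j ? k : j], kappa));
        }
      }
    }
  }
  for (int a = 0; a < n; ++a)
    std::printf("f[%d] = (% .3f, % .3f, % .3f)\n", a, f[a][0], f[a][1], f[a][2]);
  return 0;
}
```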
International Conference on Computational Science | 2012
W. Michael Brown; Trung Dac Nguyen; Miguel Fuentes-Cabrera; Jason D. Fowlkes; Philip D. Rack; Mark Berger; Arthur S. Bland
For many years, the drive towards computational physics studies that match the size and time-scales of experiment has been fueled by increases in processor and interconnect performance that could be exploited with relatively little modification to existing codes. Engineering and electrical power constraints have disrupted this trend, requiring more drastic changes to both hardware and software solutions. Here, we present details of the Cray XK6 architecture that achieves increased performance with the use of GPU accelerators. We review software development efforts in the LAMMPS molecular dynamics package that have been implemented in order to utilize hybrid high performance computers. We present benchmark results for solid-state, biological, and mesoscopic systems and discuss some challenges for utilizing hybrid systems. We present some early work in improving application performance on the XK6 and performance results for the simulation of liquid copper nanostructures with the embedded atom method.
Journal of Chemical Theory and Computation | 2013
Trung Dac Nguyen; Jan-Michael Y. Carrillo; Andrey V. Dobrynin; W. Michael Brown
Numerous issues have disrupted the trend for increasing computational performance with faster CPU clock frequencies. In order to exploit the potential performance of new computers, it is becoming increasingly desirable to re-evaluate computational physics methods and models with an eye toward approaches that allow for increased concurrency and data locality. The evaluation of long-range Coulombic interactions is a common bottleneck for molecular dynamics simulations. Enhanced truncation approaches have been proposed as an alternative method and are particularly well-suited for many-core architectures and GPUs due to the inherent fine-grain parallelism that can be exploited. In this paper, we compare efficient truncation-based approximations to evaluation of electrostatic forces with the more traditional particle-particle particle-mesh (P3M) method for the molecular dynamics simulation of polyelectrolyte brush layers. We show that with the use of GPU accelerators, large parallel simulations using P3M can be greater than 3 times faster due to a reduction in the mesh size required. Alternatively, using a truncation-based scheme can improve performance even further. This approach can be up to 3.9 times faster than GPU-accelerated P3M for many polymer systems and results in accurate calculation of shear velocities and disjoining pressures for brush layers. For configurations with highly nonuniform charge distributions, however, we find that it is more efficient to use P3M; for these systems, computationally efficient parametrizations of the truncation-based approach do not produce accurate counterion density profiles or brush morphologies.
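As an example of the class of truncation-based approximations compared against P3M here, the damped, shifted-force form of Fennell and Gezelter brings the pair energy and force smoothly to zero at the cutoff, so no reciprocal-space (mesh) work is required. This is a representative form, not necessarily the exact parametrization used in the paper.

```latex
% Damped shifted-force (DSF) Coulomb approximation: pairwise only, smoothly
% zero at the cutoff R_c, hence well suited to fine-grained GPU parallelism.
\[
U_{\mathrm{DSF}}(r) = q_i q_j\!\left[
\frac{\operatorname{erfc}(\alpha r)}{r}
- \frac{\operatorname{erfc}(\alpha R_c)}{R_c}
+ \left(\frac{\operatorname{erfc}(\alpha R_c)}{R_c^{2}}
+ \frac{2\alpha}{\sqrt{\pi}}\,\frac{e^{-\alpha^{2}R_c^{2}}}{R_c}\right)(r - R_c)
\right],
\qquad r \le R_c .
\]
```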
ACS Macro Letters | 2013
Suk-kyun Ahn; Deanna L. Pickel; W. Michael Kochemba; Jihua Chen; David Uhrig; Juan Pablo Hinestrosa; Jan-Michael Y. Carrillo; Ming Shao; Changwoo Do; Jamie M. Messman; W. Michael Brown; Bobby G. Sumpter; S. Michael Kilbey
Macromolecules | 2012
Jan-Michael Y. Carrillo; W. Michael Brown; Andrey V. Dobrynin
Nanoscale | 2014
Trung Dac Nguyen; Jan-Michael Y. Carrillo; Michael A. Matheson; W. Michael Brown
Bulletin of the American Physical Society | 2014
Jan-Michael Y. Carrillo; Rajeev Kumar; Monojoy Goswami; S. Michael Kilbey; Bobby G. Sumpter; W. Michael Brown