Eugene D. Brooks
Lawrence Livermore National Laboratory
Publication
Featured research published by Eugene D. Brooks.
International Journal of Parallel Programming | 1986
Eugene D. Brooks
We describe an algorithm for barrier synchronization that requires only reads and writes to shared store. The algorithm is faster than the traditional locked-counter approach for two processors and has an attractive log2 N time scaling for larger N. The algorithm is free of hot spots and critical regions and requires a shared memory bandwidth which grows linearly with N, the number of participating processors. We verify the technique using both a real shared memory multiprocessor, for numbers of processors up to 30, and a shared memory multiprocessor simulator, for numbers of processors up to 256.
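The idea in this abstract can be illustrated with a minimal sketch of a butterfly-style barrier. This is not the paper's code: the class name `ButterflyBarrier`, the demo function `run_demo`, and the use of a monotonically increasing episode counter in place of the paper's flag protocol are all assumptions made for illustration. The sketch also assumes a power-of-two thread count and relies on CPython's atomic list-element reads and writes; real hardware would need attention to memory ordering. Each flag has a single writer and a single reader, so no locks or shared counters are involved, and a thread finishes after log2 N pairwise rounds.

```python
import threading
import time


class ButterflyBarrier:
    """Barrier built from plain reads and writes (illustrative sketch)."""

    def __init__(self, n):
        if n < 1 or n & (n - 1):
            raise ValueError("n must be a power of two")
        self.n = n
        self.rounds = n.bit_length() - 1  # log2(n) pairwise rounds
        # flags[r][i] is written only by thread i and read only by
        # thread i's partner in round r.
        self.flags = [[0] * n for _ in range(self.rounds)]
        # episode[i] is private to thread i; it counts barrier uses.
        self.episode = [0] * n

    def wait(self, me):
        self.episode[me] += 1
        e = self.episode[me]
        for r in range(self.rounds):
            partner = me ^ (1 << r)
            self.flags[r][me] = e  # announce arrival in round r
            while self.flags[r][partner] < e:
                time.sleep(0)  # spin, yielding to other threads


def run_demo(n=4, episodes=20):
    """Return the list of observed barrier violations (expected empty)."""
    barrier = ButterflyBarrier(n)
    arrived = [0] * n
    violations = []

    def worker(me):
        for ep in range(1, episodes + 1):
            arrived[me] = ep
            barrier.wait(me)
            # After the barrier every thread must have reached episode ep.
            if any(a < ep for a in arrived):
                violations.append((me, ep))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return violations
```

Calling `run_demo(4, 20)` runs four threads through twenty barrier episodes and returns any cases where a thread passed the barrier before all others had arrived. Because every thread writes one flag per round, total memory traffic grows in proportion to N per round, while each thread's own wait is bounded by the log2 N rounds.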
Journal of Computational Physics | 1989
Eugene D. Brooks
We introduce a new implicit Monte Carlo technique for solving time dependent radiation transport problems involving spontaneous emission. In the usual implicit Monte Carlo procedure an effective scattering term is dictated by the requirement of self-consistency between the transport and implicitly differenced atomic populations equations. The effective scattering term, a source of inefficiency for optically thick problems, becomes an impasse for problems with gain where its sign is negative. In our new technique the effective scattering term does not occur and the execution time for the Monte Carlo portion of the algorithm is independent of opacity. We compare the performance and accuracy of the new symbolic implicit Monte Carlo technique to the usual effective scattering technique for the time dependent description of a two-level system in slab geometry. We also examine the possibility of effectively exploiting multiprocessors on the algorithm, obtaining supercomputer performance using shared memory multiprocessors based on cheap commodity microprocessor technology.
Journal of Computational Physics | 2003
Michael Scott McKinley; Eugene D. Brooks; Abraham Szöke
We compare the implicit Monte Carlo (IMC) technique to the symbolic IMC (SIMC) technique, with and without weight vectors in frequency space, for time-dependent line transport in the presence of collisional pumping. We examine the efficiency and accuracy of the IMC and SIMC methods for test problems involving the evolution of a collisionally pumped trapping problem to its steady state, the surface heating of a cold medium by a beam, and the diffusion of energy from a localized region that is collisionally pumped. The importance of spatial biasing and teleportation for problems involving high opacity is demonstrated. Our numerical solution, along with its associated teleportation error, is checked against theoretical calculations for the last example.
Scientific Programming | 1992
Eugene D. Brooks; Brent C. Gorda; Karen H. Warren
We describe a parallel extension of the C programming language designed for multiprocessors that provide a facility for sharing memory between processors. The programming model was initially developed on conventional shared memory machines with small processor counts such as the Sequent Balance and Alliant FX/8, but has more recently been used on a scalable massively parallel machine, the BBN TC2000. The programming model is split-join rather than fork-join. Concurrency is exploited to use a fixed number of processors more efficiently rather than to exploit more processors as in the fork-join model. Team splitting, a mechanism to split the team of processors executing a code into subteams to handle parallel subtasks, is used to provide an efficient mechanism to exploit nested concurrency. We have found the split-join programming model to have an inherent implementation advantage, compared to the fork-join model, when the number of processors in a machine becomes large.
Parallel Computing | 1988
Eugene D. Brooks
We investigate the use of a hypercube packet switching network as a shared memory server for vector multiprocessors. Using the generalization of a high performance switch node introduced in an earlier paper, we develop a packet switched memory server capable of providing high bandwidth vector access to a shared memory. The network exhibits adaptive behavior, absorbing conflicts as a vector operation proceeds, and delivers full vector bandwidth to all processors simultaneously. In addition to its vector performance, the hypercube has another feature that makes it attractive as a shared memory server. The memory words are not equidistant from the processors. A hierarchy of distances occurs. By taking advantage of this one can provide segments of fast access memory within the global shared memory environment. This makes the shared memory hypercube very promising as a general purpose parallel computer.
Parallel Computing | 1987
Eugene D. Brooks
A fundamental hurdle impeding the development of large N common memory multiprocessors is the performance limitation incurred in the switch connecting the processors to the memory modules. Multistage networks currently considered for this connection have a memory latency which grows like α log2 N. For scientific computing, it is natural to look for a multiprocessor architecture that will enable the use of vector operations to mask memory latency. The problem to be overcome here is the chaotic behavior introduced by conflicts occurring in the switch. In this paper we examine the performance of the butterfly or indirect binary n-cube network in a vector processing environment. We describe a simple modification of the standard 2 × 2 switch node used in such networks. This local modification to the switch node endows the network with a surprising global property. It adaptively removes chaotic behavior during a vector operation.
Journal of Computational Physics | 2006
Eugene D. Brooks; Abraham Szöke; J. L. Peterson
We describe a Monte Carlo solution for time dependent photon transport, in the difference formulation with the material in local thermodynamic equilibrium, that is piecewise linear in its treatment of the material state variable. Our method employs a Galerkin solution for the material energy equation while using Symbolic Implicit Monte Carlo to solve the transport equation. In constructing the scheme, one has the freedom to choose between expanding the material temperature, or the equivalent black body radiation energy density at the material temperature, in terms of finite element basis functions. The former provides a linear treatment of the material energy while the latter provides a linear treatment of the radiative coupling between zones. Subject to the conditional use of a lumped material energy in the vicinity of strong gradients, possible with a linear treatment of the material energy, our approach provides a robust solution for time dependent transport of thermally emitted radiation that can address a wide range of problems. It produces accurate results in thick media.
Scientific Programming | 1992
L. H. Yang; Eugene D. Brooks; J. Belak
A molecular dynamics algorithm for performing large-scale simulations using the Parallel C Preprocessor (PCP) programming paradigm on the BBN TC2000, a massively parallel computer, is discussed. The algorithm uses a linked-cell data structure to obtain the near neighbors of each atom as time evolves. Each processor is assigned to a geometric domain containing many subcells and the storage for that domain is private to the processor. Within this scheme, the interdomain (i.e., interprocessor) communication is minimized.
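The linked-cell idea mentioned in this abstract can be sketched in a few lines. The following is a serial, two-dimensional illustration only, not the PCP implementation: the function names `build_cells` and `neighbor_pairs`, the square non-periodic box, and the dictionary-based cell storage are all assumptions. Atoms are binned into cells at least as wide as the cutoff, so every pair within the cutoff lies in the same cell or in adjacent cells.

```python
from itertools import product


def build_cells(positions, box, cutoff):
    """Bin 2D points into square cells whose side is at least `cutoff`."""
    ncell = max(1, int(box // cutoff))
    size = box / ncell
    cells = {}
    for idx, (x, y) in enumerate(positions):
        # Clamp so that a point exactly on the upper edge stays in range.
        cx = min(ncell - 1, int(x / size))
        cy = min(ncell - 1, int(y / size))
        cells.setdefault((cx, cy), []).append(idx)
    return cells


def neighbor_pairs(positions, box, cutoff):
    """Return sorted index pairs (i, j), i < j, closer than `cutoff`."""
    cells = build_cells(positions, box, cutoff)
    limit = cutoff * cutoff
    pairs = []
    for (cx, cy), members in cells.items():
        # Only the cell itself and its eight neighbors need checking.
        for dx, dy in product((-1, 0, 1), repeat=2):
            others = cells.get((cx + dx, cy + dy), [])
            for i in members:
                for j in others:
                    if i < j:
                        ddx = positions[i][0] - positions[j][0]
                        ddy = positions[i][1] - positions[j][1]
                        if ddx * ddx + ddy * ddy <= limit:
                            pairs.append((i, j))
    return sorted(pairs)
```

For example, `neighbor_pairs([(0.5, 0.5), (1.0, 0.5), (9.0, 9.0), (1.9, 0.5)], 10.0, 1.0)` examines only nearby cells rather than all pairs. In a domain-decomposed version along the lines the abstract describes, each processor would own a block of such cells, and only cells on a block boundary would require data from another processor.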
Conference on High Performance Computing (Supercomputing) | 1997
Eugene D. Brooks; Karen H. Warren
We examine the use of a shared memory programming model to address the problem of portability between distributed memory and shared memory architectures. We conduct this evaluation by extending an existing programming model, the Parallel C Preprocessor, with a type qualifier interpretation of the data sharing keywords borrowed from the Split-C and AC compilers. We evaluate the performance of the resulting programming model on a wide range of shared memory and distributed memory computing platforms using several numerical algorithms as benchmarks. We find the type-qualifier-based programming model capable of efficient execution on distributed memory and shared memory architectures.
Parallel Computing | 1988
Eugene D. Brooks
In an earlier paper we introduced an indirect binary n-cube memory server network which has adaptive properties making it useful in a parallel vector processing environment. The memory server network, due to a special choice in the design of the basic switch node, has the property that N vector processors issuing vector fetches with similar strides are forced into lock step after an initial startup investment. In this paper we extend this work to the case of the indirect k-ary n-cube. As this network has a more favorable memory latency scaling of log_k N, one expects that the short vector performance will be improved as k is increased for a given N. We find this to be the case. We also find that the cost of the memory server system scales in a manner which prefers modest values of k above 2.
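The latency scaling quoted in this abstract can be made concrete by counting switch stages. The sketch below assumes that a network of k × k switch nodes needs the smallest n with k^n ≥ N stages to connect N ports; the function name `stages` is hypothetical, and the paper's cost model is not reproduced here.

```python
def stages(num_ports, radix):
    """Smallest n such that radix**n >= num_ports (integer arithmetic)."""
    if num_ports < 1 or radix < 2:
        raise ValueError("need num_ports >= 1 and radix >= 2")
    n, reach = 0, 1
    while reach < num_ports:
        reach *= radix
        n += 1
    return n


if __name__ == "__main__":
    # Stage count for a fixed number of ports as the switch radix grows.
    for k in (2, 4, 8, 16):
        print(k, stages(4096, k))
```

For N = 4096 the stage count falls from 12 at k = 2 to 6 at k = 4, 4 at k = 8, and 3 at k = 16, which is the log_k N behavior the abstract refers to. Each further increase in k buys a smaller reduction in stages.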