Ivy Bo Peng
Royal Institute of Technology
Publications
Featured research published by Ivy Bo Peng.
Journal of Geophysical Research | 2016
Gabor Zsolt Toth; Xianzhe Jia; Stefano Markidis; Ivy Bo Peng; Yuxi Chen; L. K. S. Daldorff; Valeriy M. Tenishev; Dmitry Borovikov; John D. Haiducek; Tamas I. Gombosi; Alex Glocer; J. C. Dorelli
We have recently developed a new modeling capability to embed the implicit particle-in-cell (PIC) model iPIC3D into the Block-Adaptive-Tree-Solarwind-Roe-Upwind-Scheme magnetohydrodynamic (MHD) model. The MHD with embedded PIC domains (MHD-EPIC) algorithm is a two-way coupled kinetic-fluid model. As one of the very first applications of the MHD-EPIC algorithm, we simulate the interaction between Jupiter's magnetospheric plasma and Ganymede's magnetosphere. We compare the MHD-EPIC simulations with pure Hall MHD simulations and compare both sets of model results with Galileo observations to assess the importance of kinetic effects in controlling the configuration and dynamics of Ganymede's magnetosphere. We find that the Hall MHD and MHD-EPIC solutions are qualitatively similar, but there are significant quantitative differences. In particular, the density and pressure inside the magnetosphere show different distributions. For our baseline grid resolution, the PIC solution is more dynamic than the Hall MHD simulation, and it compares significantly better with the Galileo magnetic measurements than the Hall MHD solution does. The power spectra of the observed and simulated magnetic field fluctuations agree extremely well for the MHD-EPIC model. The MHD-EPIC simulation also produced a few flux transfer events (FTEs) with magnetic signatures very similar to an observed event. The simulation shows that the FTEs often exhibit complex 3-D structures, with their orientations changing substantially between the equatorial plane and the Galileo trajectory, which explains the magnetic signatures observed during the magnetopause crossings. The computational cost of the MHD-EPIC simulation was only about 4 times more than that of the Hall MHD simulation.
International Conference on Conceptual Structures | 2015
Ivy Bo Peng; Stefano Markidis; Andris Vaivads; Juris Vencels; Jorge Amaya; Andrey Divin; Erwin Laure; Giovanni Lapenta
We demonstrate the improvements to an implicit Particle-in-Cell code, iPic3D, using the example of a dipolar magnetic field immersed in a plasma flow, and show the formation of a magnetosphere. ...
Journal of Plasma Physics | 2015
Ivy Bo Peng; Juris Vencels; Giovanni Lapenta; Andrey Divin; Andris Vaivads; Erwin Laure; Stefano Markidis
We carried out a 3D fully kinetic simulation of Earth's magnetotail magnetic reconnection to study the dynamics of energetic particles. We developed and implemented a new relativistic particle move ...
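The truncated abstract mentions a new relativistic particle mover but gives no detail. For context only, the sketch below shows a standard relativistic Boris push, the usual baseline for such movers; it is not the authors' implementation, and the units (c = 1, qm = q/m) and the structure definition are illustrative assumptions.

```c
/* Illustrative sketch of a standard relativistic Boris push (not the
 * authors' mover): half electric kick, magnetic rotation, second half
 * kick, advancing the normalized momentum u = gamma*v in units c = 1. */
#include <math.h>

typedef struct { double x[3], u[3]; } Particle;   /* u = gamma * v */

void boris_push(Particle *p, const double E[3], const double B[3],
                double qm, double dt)             /* qm = q/m */
{
    double um[3], t[3], s[3], up[3];

    /* first half acceleration by E */
    for (int i = 0; i < 3; i++)
        um[i] = p->u[i] + 0.5 * qm * dt * E[i];

    /* rotation around B, using the Lorentz factor at the half step */
    double gamma = sqrt(1.0 + um[0]*um[0] + um[1]*um[1] + um[2]*um[2]);
    for (int i = 0; i < 3; i++)
        t[i] = 0.5 * qm * dt * B[i] / gamma;
    double t2 = t[0]*t[0] + t[1]*t[1] + t[2]*t[2];
    for (int i = 0; i < 3; i++)
        s[i] = 2.0 * t[i] / (1.0 + t2);

    double uxt[3] = { um[1]*t[2] - um[2]*t[1],
                      um[2]*t[0] - um[0]*t[2],
                      um[0]*t[1] - um[1]*t[0] };
    double uprime[3];
    for (int i = 0; i < 3; i++)
        uprime[i] = um[i] + uxt[i];
    double uxs[3] = { uprime[1]*s[2] - uprime[2]*s[1],
                      uprime[2]*s[0] - uprime[0]*s[2],
                      uprime[0]*s[1] - uprime[1]*s[0] };
    for (int i = 0; i < 3; i++)
        up[i] = um[i] + uxs[i];

    /* second half acceleration by E, then position update */
    for (int i = 0; i < 3; i++)
        p->u[i] = up[i] + 0.5 * qm * dt * E[i];
    gamma = sqrt(1.0 + p->u[0]*p->u[0] + p->u[1]*p->u[1] + p->u[2]*p->u[2]);
    for (int i = 0; i < 3; i++)
        p->x[i] += dt * p->u[i] / gamma;
}
```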
Proceedings of the 3rd Workshop on Exascale MPI | 2015
Ivy Bo Peng; Stefano Markidis; Erwin Laure; Daniel J. Holmes; Mark Bull
The data streaming model is an effective way to tackle the challenge of data-intensive applications. As traditional HPC applications generate large volumes of data and more data-intensive applications move to HPC infrastructures, it is necessary to investigate the feasibility of combining message-passing and streaming programming models. MPI, the de facto standard for programming on HPC systems, cannot intuitively express the communication pattern and the functional operations required in streaming models. In this work, we designed and implemented MPIStream, a data streaming library atop MPI, to allocate data producers and consumers, to stream data continuously or irregularly, and to process data at run time. In the same spirit as the STREAM benchmark, we developed a parallel stream benchmark to measure the data processing rate. The performance of the library largely depends on the size of the stream element, the number of data producers and consumers, and the computational intensity of processing one stream element. With 2,048 data producers and 2,048 data consumers in the parallel benchmark, MPIStream achieved a 200 GB/s processing rate on a Blue Gene/Q supercomputer. We illustrate that a streaming library for HPC applications can effectively enable irregular parallel I/O, application monitoring, and threshold collective operations.
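The abstract does not show the MPIStream API, so the sketch below illustrates only the underlying pattern such a library wraps: half the ranks act as producers, the other half as consumers that process elements as they arrive, using plain MPI point-to-point calls. The tag, element size, and rank split are illustrative assumptions, not part of MPIStream.

```c
/* Minimal producer/consumer stream over plain MPI, illustrating the
 * pattern a streaming library such as MPIStream abstracts away.
 * Assumes an even number of ranks; constants are illustrative. */
#include <mpi.h>
#include <stdlib.h>

#define STREAM_TAG   77
#define ELEM_DOUBLES 1024     /* doubles per stream element */
#define NUM_ELEMS    100

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int half = size / 2;      /* first half: producers; second half: consumers */
    double *elem = malloc(ELEM_DOUBLES * sizeof(double));

    if (rank < half) {        /* producer: emit elements to a fixed consumer */
        int consumer = half + rank;
        for (int i = 0; i < NUM_ELEMS; i++) {
            for (int j = 0; j < ELEM_DOUBLES; j++) elem[j] = (double)(i + j);
            MPI_Send(elem, ELEM_DOUBLES, MPI_DOUBLE, consumer,
                     STREAM_TAG, MPI_COMM_WORLD);
        }
    } else if (rank < 2 * half) {   /* consumer: receive and process at run time */
        double sum = 0.0;
        for (int i = 0; i < NUM_ELEMS; i++) {
            MPI_Recv(elem, ELEM_DOUBLES, MPI_DOUBLE, rank - half,
                     STREAM_TAG, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            for (int j = 0; j < ELEM_DOUBLES; j++) sum += elem[j];
        }
        (void)sum;
    }

    free(elem);
    MPI_Finalize();
    return 0;
}
```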
International Parallel and Distributed Processing Symposium | 2017
Ivy Bo Peng; Roberto Gioiosa; Gokcen Kestor; Pietro Cicotti; Erwin Laure; Stefano Markidis
Hardware accelerators have become a de facto standard for achieving high performance on current supercomputers, and there are indications that this trend will continue in the future. Modern accelerators feature high-bandwidth memory next to the computing cores. For example, the Intel Knights Landing (KNL) processor is equipped with 16 GB of high-bandwidth memory (HBM) that works alongside conventional DRAM. Theoretically, HBM can provide ∼4× higher bandwidth than conventional DRAM. However, many factors impact the effective performance achieved by applications, including the application memory access pattern, the problem size, the threading level, and the actual memory configuration. In this paper, we analyze the Intel KNL system and quantify the impact of the most important factors on application performance using a set of applications that are representative of scientific and data-analytics workloads. Our results show that applications with regular memory access patterns benefit from MCDRAM, achieving up to 3× the performance obtained using only DRAM. In contrast, applications with random memory access patterns are latency-bound and may suffer performance degradation when using only MCDRAM. For those applications, the use of additional hardware threads may help hide latency and achieve higher aggregated bandwidth when using HBM.
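For readers unfamiliar with explicit MCDRAM placement on KNL, one common mechanism is the memkind hbwmalloc interface; the sketch below shows a bandwidth-bound array placed in HBM with a DRAM fallback. The paper does not state which placement mechanism its experiments used, so this is context rather than the authors' method; the array size is an arbitrary example.

```c
/* Sketch: placing a bandwidth-bound array in KNL MCDRAM via memkind's
 * hbwmalloc interface, with a DRAM fallback. Link with -lmemkind. */
#include <hbwmalloc.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const size_t n = 1 << 26;             /* ~512 MB of doubles (example size) */
    double *a;

    if (hbw_check_available() == 0) {      /* 0 means HBM is present */
        a = hbw_malloc(n * sizeof(double));
        printf("array placed in MCDRAM\n");
    } else {
        a = malloc(n * sizeof(double));    /* fall back to DRAM */
        printf("no HBM found, using DRAM\n");
    }
    if (!a) return 1;

    for (size_t i = 0; i < n; i++)         /* regular, streaming access pattern */
        a[i] = 2.0 * (double)i;

    if (hbw_check_available() == 0) hbw_free(a); else free(a);
    return 0;
}
```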
International Symposium on Memory Management | 2017
Ivy Bo Peng; Roberto Gioiosa; Gokcen Kestor; Pietro Cicotti; Erwin Laure; Stefano Markidis
Traditional scientific and emerging data-analytics applications require fast, power-efficient, large, and persistent memories. Combining all these characteristics within a single memory technology is expensive, and hence future supercomputers will feature different memory technologies side by side. However, programming hybrid-memory systems and identifying the best object-to-memory mapping is a complex task. We envision that programmers will probably resort to using default configurations that require only minimal intervention in the application code or system settings. In this work, we argue that intelligent, fine-grained data placement can achieve higher performance than default setups. We present an algorithm for data placement on hybrid-memory systems. Our algorithm is based on a set of single-object allocation rules and global data placement decisions. We also present RTHMS, a tool that implements our algorithm and provides recommendations about the object-to-memory mapping. Our experiments on a hybrid-memory system, an Intel Knights Landing processor with DRAM and HBM, show that RTHMS is able to achieve higher performance than the default configuration. We believe that RTHMS will be a valuable tool for programmers working on complex hybrid-memory systems.
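To make the idea of single-object allocation rules plus a global constraint concrete, here is a hypothetical placement rule of the kind such a tool might apply. The struct fields, thresholds, and decision order are illustrative assumptions and do not reproduce RTHMS internals.

```c
/* Hypothetical per-object placement rule in the spirit of the algorithm
 * described above; fields and thresholds are illustrative assumptions. */
#include <stdbool.h>
#include <stddef.h>

typedef enum { MEM_DRAM, MEM_HBM } mem_kind;

typedef struct {
    size_t size_bytes;      /* object footprint */
    double bw_intensity;    /* fraction of accesses that are streaming */
    bool   latency_bound;   /* dominated by dependent (random) loads */
} obj_profile;

mem_kind place_object(const obj_profile *o, size_t hbm_bytes_left)
{
    if (o->size_bytes > hbm_bytes_left)   /* global constraint: HBM capacity is small */
        return MEM_DRAM;
    if (o->latency_bound)                 /* HBM offers no latency advantage */
        return MEM_DRAM;
    if (o->bw_intensity > 0.5)            /* bandwidth-bound objects benefit from HBM */
        return MEM_HBM;
    return MEM_DRAM;
}
```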
The Astrophysical Journal | 2016
Vyacheslav Olshevsky; Jan Deca; Andrey Divin; Ivy Bo Peng; Stefano Markidis; Maria Elena Innocenti; Emanuele Cazzola; Giovanni Lapenta
We present a systematic attempt to study magnetic null points and the associated magnetic energy conversion in kinetic Particle-in-Cell simulations of various plasma configurations. We address three-dimensional simulations performed with the semi-implicit kinetic electromagnetic code iPic3D in different setups: variations of a Harris current sheet, dipolar and quadrupolar magnetospheres interacting with the solar wind, and a relaxing turbulent configuration with multiple null points. Spiral nulls are more likely to be created in space plasmas: in all our simulations except the lunar magnetic anomaly and the quadrupolar mini-magnetosphere, the number of spiral nulls exceeds the number of radial nulls by a factor of 3-9. We show that magnetic nulls often do not indicate regions of intense energy dissipation. Energy dissipation events caused by topological bifurcations at radial nulls are rather rare and short-lived. The so-called X-lines formed by the radial nulls in the Harris current sheet and lunar magnetic anomaly simulations are rather stable and do not exhibit any energy dissipation. Energy dissipation is more powerful in the vicinity of spiral nulls enclosed by magnetic flux ropes with strong currents at their axes (their cross sections resemble 2D magnetic islands). These null lines, reminiscent of Z-pinches, efficiently dissipate magnetic energy due to secondary instabilities such as the two-stream or kinking instability, accompanied by changes in magnetic topology. Current enhancements accompanied by spiral nulls may signal magnetic energy conversion sites in observational data.
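The abstract distinguishes spiral and radial nulls without defining them. In the standard classification of 3-D nulls, which is assumed here since the paper's exact criterion is not quoted, the type follows from the eigenvalues of the magnetic field Jacobian at the null:

```latex
% Standard null-point classification (assumed; the abstract does not state
% the criterion used). Linearizing the field about the null position r_0:
\[
  \mathbf{B}(\mathbf{r}) \;\simeq\; \nabla\mathbf{B}\big|_{\mathbf{r}_0}\cdot(\mathbf{r}-\mathbf{r}_0),
  \qquad \lambda_1+\lambda_2+\lambda_3 = 0 \quad (\text{from } \nabla\cdot\mathbf{B}=0).
\]
% All three eigenvalues of the Jacobian real          -> radial null;
% one real eigenvalue plus a complex-conjugate pair   -> spiral null
% (field lines spiral about the axis defined by the real eigenvector).
```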
International Conference on Conceptual Structures | 2015
Juris Vencels; Gian Luca Delzanno; Alec Johnson; Ivy Bo Peng; Erwin Laure; Stefano Markidis
A spectral method for kinetic plasma simulations based on the expansion of the velocity distribution function in a variable number of Hermite polynomials is presented. The method is based on a set of non-linear equations that is solved to determine the coefficients of the Hermite expansion satisfying the Vlasov and Poisson equations. In this paper, we first show that this technique combines the fluid and kinetic approaches into one framework. Second, we present an adaptive strategy to increase and decrease the number of Hermite functions dynamically during the simulation. The technique is applied to the Landau damping and two-stream instability test problems. Performance results show 21% and 47% savings in total simulation time for the Landau damping and two-stream instability test cases, respectively.
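One common form of the Hermite expansion used in such spectral Vlasov solvers is shown below; the normalization and the shift/scale parameters are a typical choice and not necessarily the ones used in the paper.

```latex
% Typical Hermite expansion of the velocity distribution (assumed form):
\[
  f(x, v, t) \;=\; \sum_{n=0}^{N} C_n(x, t)\, \Psi_n(\xi), \qquad
  \xi = \frac{v - u}{\alpha}, \qquad
  \Psi_n(\xi) = \frac{H_n(\xi)\, e^{-\xi^2}}{\sqrt{\pi\, 2^n\, n!}},
\]
% where H_n are Hermite polynomials and u, alpha set the velocity shift and
% scale. The lowest-order coefficients correspond to fluid moments (density,
% momentum, energy), while higher orders add kinetic detail; adapting N
% during the run is what the adaptive strategy above exploits.
```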
International Conference on Cluster Computing | 2015
Ivy Bo Peng; Stefano Markidis; Erwin Laure
Synchronization in message-passing systems is achieved by communication among processes. System and architectural noise and different workloads cause processes to become imbalanced and to reach synchronization points at different times. Thus, both communication and imbalance impact synchronization performance. In this paper, we study the algorithmic properties that allow the communication in synchronization to absorb the initial imbalance among processes. We quantify the imbalance-absorption properties of different barrier algorithms using a LogP Monte Carlo simulator. We found that linear and f-way tournament barriers can absorb up to 95% of a random exponential imbalance whose standard deviation equals the communication time for one message. Dissemination, butterfly, and pairwise-exchange barriers, on the other hand, do not absorb imbalance but can effectively bound the post-barrier imbalance. We identify that synchronization transitions from communication-dominated to imbalance-dominated when the standard deviation of the imbalance distribution is more than twice the communication time for one message. In our study, f-way tournament barriers provided the best imbalance absorption rate together with a convenient communication time.
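For reference, one of the barrier algorithms compared above, the dissemination barrier, can be expressed over MPI point-to-point messages as in the sketch below. This plain version only illustrates the round structure; it ignores the LogP modeling and imbalance injection done in the study.

```c
/* Dissemination barrier sketch: in round k each rank signals the rank
 * 2^k positions ahead and waits for the rank 2^k positions behind,
 * completing after ceil(log2 P) rounds for any process count P. */
#include <mpi.h>

void dissemination_barrier(MPI_Comm comm)
{
    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);

    for (int dist = 1; dist < size; dist <<= 1) {
        int to   = (rank + dist) % size;
        int from = (rank - dist + size) % size;
        char out = 0, in = 0;    /* empty notification tokens */
        MPI_Sendrecv(&out, 1, MPI_CHAR, to,   0,
                     &in,  1, MPI_CHAR, from, 0,
                     comm, MPI_STATUS_IGNORE);
    }
}
```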
Journal of Geophysical Research | 2017
Gabor Zsolt Toth; Yuxi Chen; Tamas I. Gombosi; P. A. Cassak; Stefano Markidis; Ivy Bo Peng
We investigate the use of artificially increased ion and electron kinetic scales in global plasma simulations. We argue that as long as the global and ion inertial scales remain well separated, (1) the overall global solution is not strongly sensitive to the value of the ion inertial scale, (2) the ion-inertial-scale dynamics remain similar to those of the original system but occur at a larger spatial scale, and (3) structures at intermediate scales, such as magnetic islands, grow in a self-similar manner. To investigate the validity and limitations of our scaling hypotheses, we carry out many simulations of a two-dimensional magnetosphere with the magnetohydrodynamics with embedded particle-in-cell (MHD-EPIC) model. The PIC model covers the dayside reconnection site. The simulation results confirm that the hypotheses hold as long as the increased ion inertial length remains less than about 5% of the magnetopause standoff distance. Since the theoretical arguments are general, we expect these results to carry over to three dimensions. The computational cost is reduced by the third and fourth powers of the scaling factor in two- and three-dimensional simulations, respectively, which can amount to many orders of magnitude. The present results suggest that global simulations that resolve the kinetic scales relevant to reconnection are feasible. This is a crucial step toward applications to the magnetospheres of Earth, Saturn, and Jupiter and to the solar corona.
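A back-of-the-envelope accounting, consistent with the cost reduction quoted above, shows where the third and fourth powers come from: enlarging the ion inertial length by a factor f coarsens both the PIC grid spacing and the time step by the same factor.

```latex
% Illustrative cost scaling (assumed simple accounting, consistent with the
% abstract). With d_i = c/\omega_{pi} increased by a factor f, the grid
% spacing \Delta x and time step \Delta t grow proportionally, so
\[
  \text{cost} \;\propto\;
  \underbrace{\left(\frac{L}{\Delta x}\right)^{\!D}}_{\text{cells}}
  \times
  \underbrace{\frac{T}{\Delta t}}_{\text{steps}}
  \;\propto\; f^{-(D+1)},
\]
% i.e. the cost drops by a factor f^3 in two dimensions (D = 2) and f^4 in
% three dimensions (D = 3).
```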