Masaharu Matsumoto | Researchain

Publication

Featured research published by Masaharu Matsumoto.

Physics of Plasmas | 2012

Momentum transfer of solar wind plasma in a kinetic scale magnetosphere

Toseo Moritaka; Yoshihiro Kajimura; Hideyuki Usui; Masaharu Matsumoto; Tatsuki Matsui; I. Shinohara

Solar wind interaction with a kinetic scale magnetosphere and the resulting momentum transfer process are investigated by 2.5-dimensional full kinetic particle-in-cell simulations. The spatial scale of the considered magnetosphere is less than or comparable to the ion inertial length and is relevant for magnetized asteroids or spacecraft with mini-magnetosphere plasma propulsion. Momentum transfer is evaluated by studying the Lorentz force between solar wind plasma and a hypothetical coil current density that creates the magnetosphere. In the zero interplanetary magnetic field (IMF) limit, solar wind interaction goes into a steady state with constant Lorentz force. The dominant Lorentz force acting on the coil current density is applied by the thin electron current layer at the wind-filled front of the magnetosphere. Dynamic pressure of the solar wind balances the magnetic pressure in this region via electrostatic deceleration of ions. The resulting Lorentz force is characterized as a function of the scal...

2014 IEEE 8th International Symposium on Embedded Multicore/Manycore SoCs | 2014

Auto-tuning of Computation Kernels from an FDM Code with ppOpen-AT

Takahiro Katagiri; Satoshi Ohshima; Masaharu Matsumoto

In this paper, we propose an auto-tuning (AT) function with an AT language for a dedicated numerical library targeting supercomputers in operation. The AT function is based on well-known loop transformation techniques, such as loop split, fusion, and re-ordering of statements. However, loop split with copies or increased computation, and loop fusion of the split loops, are also taken into account by utilizing user knowledge.

2014 IEEE 8th International Symposium on Embedded Multicore/Manycore SoCs | 2014

Performance Optimization of SpMV Using CRS Format by Considering OpenMP Scheduling on CPUs and MIC

Satoshi Ohshima; Takahiro Katagiri; Masaharu Matsumoto

In this study, we evaluate the performance of sparse matrix-vector multiplication (SpMV) using the compressed row storage (CRS) format on CPUs and MIC. We focus on the relationship between OpenMP scheduling and performance. The performance of SpMV is measured using various OpenMP scheduling settings and the results are analyzed, which show that OpenMP scheduling has a considerable effect on the performance of SpMV. We confirm that some scheduling settings resulted in performance improvements compared with default scheduling for particular matrices. The results of the evaluation show that the performance of SpMV is improved by up to 1.57 times compared with SPARC64 IXfx, 2.47 times compared with Xeon Ivy Bridge-EP, and 2.26 times compared with Knights Corner. Next, we modify the SpMV function of OpenATLib, an auto-tuned numerical library, to consider the scheduling of optimization as an additional SpMV implementation. We measure the performance of the GMRES solver and obtain performance improvements of up to 11.4%. These results will help to improve the performance of various numerical calculation applications.
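
As a rough illustration of the kernel discussed above, the following minimal Python sketch multiplies a matrix stored in CRS form by a vector. It is not the authors' code: the function names `spmv_crs` and `nonzeros_per_thread` and the 4x4 example matrix are assumptions made for illustration. The second helper only counts how many nonzeros each thread would receive if rows were cut into equal contiguous blocks (a static-style split), which hints at why the choice of OpenMP schedule can matter when rows differ in length.

```python
from typing import List, Sequence


def spmv_crs(values: Sequence[float], col_idx: Sequence[int],
             row_ptr: Sequence[int], x: Sequence[float]) -> List[float]:
    """Compute y = A @ x for a matrix A stored in CRS form."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):  # the row loop is the one a parallel version would split
        acc = 0.0
        # Nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i + 1] - 1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def nonzeros_per_thread(row_ptr: Sequence[int], n_threads: int) -> List[int]:
    """Hypothetical helper: nonzeros per thread for equal contiguous row blocks."""
    n_rows = len(row_ptr) - 1
    chunk = -(-n_rows // n_threads)  # ceiling division
    work = []
    for t in range(n_threads):
        lo = min(t * chunk, n_rows)
        hi = min(lo + chunk, n_rows)
        work.append(row_ptr[hi] - row_ptr[lo])
    return work


# Example matrix (assumed for illustration):
# [[4, 0, 0, 1],
#  [0, 3, 0, 0],
#  [2, 0, 5, 6],
#  [0, 0, 0, 7]]
values = [4.0, 1.0, 3.0, 2.0, 5.0, 6.0, 7.0]
col_idx = [0, 3, 1, 0, 2, 3, 3]
row_ptr = [0, 2, 3, 6, 7]
x = [1.0, 2.0, 3.0, 4.0]

y = spmv_crs(values, col_idx, row_ptr, x)
work = nonzeros_per_thread(row_ptr, 2)
print(y)     # [8.0, 6.0, 41.0, 28.0]
print(work)  # [3, 4]
```

Even in this tiny example the two row blocks carry different numbers of nonzeros. For matrices with strongly varying row lengths the imbalance can be much larger, which is the kind of situation in which a different scheduling setting might pay off.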

International Parallel and Distributed Processing Symposium | 2015

Directive-Based Auto-Tuning for the Finite Difference Method on the Xeon Phi

Takahiro Katagiri; Satoshi Ohshima; Masaharu Matsumoto

In this paper, we present a directive-based auto-tuning (AT) framework, called ppOpen-AT, and demonstrate its effect using simulation code based on the Finite Difference Method (FDM). The framework utilizes well-known loop transformation techniques. However, the codes used are carefully designed to minimize the software stack in order to meet the requirements of a many-core architecture currently in operation. The results of evaluations conducted using ppOpen-AT indicate that maximum speedup factors greater than 550% are obtained when it is applied in eight nodes of the Intel Xeon Phi. Further, in the AT for data packing and unpacking, a 49% speedup factor for the whole application is achieved. By using it with strong scaling on 32 nodes in a cluster of the Xeon Phi, we also obtain 24% speedups for the overall execution.

International Conference on Conceptual Structures | 2014

Development of a Computational Framework for Block-based AMR Simulations

Hideyuki Usui; Akihide Nagara; Masanori Nunami; Masaharu Matsumoto

The adaptive mesh refinement (AMR) technique can provide efficient numerical calculation by adapting fine cells to regions where higher numerical resolution is required. However, it is generally difficult for users to implement the AMR technique in their generic simulation programs, which use uniform cells. To carry out numerical simulations that include the AMR technique, we developed a framework for block-structured AMR simulation with which a generic uniform-cell simulation program can easily be converted to one with AMR treatment. In this paper, we describe the developed framework and show the implementation of a simulation program in the framework, taking a two-dimensional advection simulation as an example.

IEEE International Conference on High Performance Computing Data and Analytics | 2014

Performance Optimization of the 3D FDM Simulation of Seismic Wave Propagation on the Intel Xeon Phi Coprocessor Using the ppOpen-APPL/FDM Library

Futoshi Mori; Masaharu Matsumoto; Takashi Furumura

We evaluate the performance of a parallel 3D finite-difference method (FDM) simulation of seismic wave propagation using the Intel Xeon Phi coprocessor. Since a continued decrease in the byte/flop ratio of future machines is forecast, program optimization with a decreased byte/flop ratio was applied by fusing the original major kernel and omitting the storing and loading of intermediate variables. We confirm that 1) MPI/OpenMP hybrid parallel computing with hyper-threading is more efficient than pure MPI parallel computing and 2) the FDM simulation with a splitting of triple DO loops is 1.3 times faster than the modified code with triple DO loops, while no performance acceleration is achieved with a fused double DO-loop calculation. We consider that loop distribution optimization is effective for prefetching and for the thread parallelization of each loop through the use and reuse of cached data.

IEEE Transactions on Plasma Science | 2010

Full Particle-in-Cell Simulation Study on Magnetic Inflation Around a Magneto Plasma Sail

Toseo Moritaka; Hideyuki Usui; Masanori Nunami; Yoshihiro Kajimura; Masao Nakamura; Masaharu Matsumoto

In order to consider a next-generation space propulsion system referred to as the “magneto plasma sail,” the magnetic inflation mechanism of a small artificial magnetosphere is investigated. We carry out a two-and-a-half-dimensional full particle-in-cell simulation, and magnetic inflation mediated by the gyration motion of injected ions is observed. As a result of the gyration motion, an ion-rich region is formed near the direction-reversal position of the injected ions. Magnetic inflation takes place due to the flow of electrons toward the ion-rich region, which carries the field lines of the original magnetosphere. This inflation process is effective for a magnetosphere with a scale comparable to the gyration radius of the injected ions. If the original magnetosphere is much smaller than this, background electrons flow into the ion-rich region outside the magnetosphere, and the inflated magnetosphere is confined to a smaller region. In addition, the thermal effects of background electrons have a similar impact on the inflation process, even if the direction-reversal position is located inside the magnetosphere.

Computer Physics Communications | 2012

Application of a total variation diminishing scheme to electromagnetic hybrid particle-in-cell plasma simulation

Masaharu Matsumoto; Yoshihiro Kajimura; Hideyuki Usui; Ikkoh Funaki; I. Shinohara

A discretization procedure for a total variation diminishing (TVD) scheme is introduced to an electromagnetic hybrid particle-in-cell (PIC) plasma simulation code in order to improve the numerical stability and resolution when calculating the plasma flow field in which magnetic field discontinuities (for example, Rankine–Hugoniot jump conditions for shock waves) are generated. In the hybrid PIC code used in this study, ions are treated as particles and electrons are assumed to be an inertia-less (mass-less) fluid. In the numerical results of one-dimensional test simulations, the TVD scheme significantly prevents non-physical, numerical oscillations, which would ordinarily be produced in the solution when the convection term of the magnetic induction equation in the hybrid PIC code is discretized by central difference schemes at magnetic field discontinuities. Furthermore, a two-dimensional simulation of the global structure of a collision-less bow shock, which is suitable for practical use, makes it possible to clearly capture the bow shock by using the hybrid PIC code with the TVD scheme.
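
The effect described above can be illustrated on a much simpler problem. The sketch below is not the discretization used in the paper: it assumes scalar linear advection with positive speed on a periodic grid, and it swaps in a minmod slope limiter as one common way to obtain a TVD scheme. The function names, the square-pulse initial condition, and the CFL number 0.5 are all assumptions chosen for illustration. The unlimited variant is the Lax-Wendroff scheme, a central-type discretization that over- and undershoots at discontinuities.

```python
from typing import List


def minmod(a: float, b: float) -> float:
    """Return the smaller-magnitude argument if both share a sign, else 0."""
    if a * b <= 0.0:
        return 0.0
    return a if abs(a) < abs(b) else b


def total_variation(u: List[float]) -> float:
    """Sum of absolute jumps between neighbouring cells (periodic grid)."""
    n = len(u)
    return sum(abs(u[(i + 1) % n] - u[i]) for i in range(n))


def step_limited(u: List[float], c: float) -> List[float]:
    """One time step with minmod-limited slopes; c is the CFL number, 0 < c <= 1."""
    n = len(u)
    # u[i - 1] wraps around for i = 0, which gives the periodic boundary.
    slope = [minmod(u[i] - u[i - 1], u[(i + 1) % n] - u[i]) for i in range(n)]
    flux = [u[i] + 0.5 * (1.0 - c) * slope[i] for i in range(n)]
    return [u[i] - c * (flux[i] - flux[i - 1]) for i in range(n)]


def step_unlimited(u: List[float], c: float) -> List[float]:
    """Same update with the raw forward difference as slope (Lax-Wendroff)."""
    n = len(u)
    flux = [u[i] + 0.5 * (1.0 - c) * (u[(i + 1) % n] - u[i]) for i in range(n)]
    return [u[i] - c * (flux[i] - flux[i - 1]) for i in range(n)]


def square_pulse(n: int) -> List[float]:
    """Discontinuous initial profile: 1 on cells 20..39, 0 elsewhere."""
    return [1.0 if 20 <= i < 40 else 0.0 for i in range(n)]


u_lim = square_pulse(100)
u_raw = square_pulse(100)
for _ in range(40):
    u_lim = step_limited(u_lim, 0.5)
    u_raw = step_unlimited(u_raw, 0.5)

print("limited:   TV =", total_variation(u_lim), "min =", min(u_lim))
print("unlimited: TV =", total_variation(u_raw), "min =", min(u_raw))
```

In this toy setting the limited update keeps the solution between 0 and 1 and does not increase the total variation, whereas the unlimited update produces values below 0 and above 1 next to the jumps. This is the same qualitative contrast the abstract reports for central differencing at magnetic field discontinuities.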

38th Plasmadynamics and Lasers Conference | 2007

Numerical Study of Pulsed Heat Source MHD Electrical Power Generation

Shota Kajihara; Masaharu Matsumoto; Tomoyuki Murakami; Yoshihiro Okuno

4K pulsed-likely within short time (1µsec) in a stagnant energy input volume, and the energy of high temperature inert gas is converted to electricity through the medium of pure inert gas plasma without seeding. The numerical simulation results show that an enthalpy extraction ratio (= electrical output energy / pulsed heat energy) of several tens of percent can be achieved, which is the same level as the conventional seeded nonequilibrium plasma MHD generator. Although there still exist many phenomena to be clarified and many problems to be overcome in order to realize the system, the pulsed heat source high temperature inert gas MHD generator is surely worth examining in more detail.

International Parallel and Distributed Processing Symposium | 2016

Utilization and Expansion of ppOpen-AT for OpenACC

Satoshi Ohshima; Takahiro Katagiri; Masaharu Matsumoto

For application programmers, reducing the effort required to optimize programs is an important issue. Our solution to this issue is an auto-tuning (AT) technique. We are developing an AT language named ppOpen-AT and have shown that it is useful for multi- and many-core parallel programming. Today, OpenACC is attracting attention as an easy and useful programming environment for graphics processing units (GPUs). While OpenACC is one possible parallel programming environment, users still have to spend time and energy optimizing OpenACC programs. In this study, we investigate the usability of ppOpen-AT for OpenACC programs and propose expanding ppOpen-AT for further optimization of OpenACC.
