Ikuo Miyoshi
Fujitsu
Publication
Featured research published by Ikuo Miyoshi.
IEEE International Conference on High Performance Computing, Data, and Analytics | 2011
Yukihiro Hasegawa; Jun-Ichi Iwata; Miwako Tsuji; Daisuke Takahashi; Atsushi Oshiyama; Kazuo Minami; Taisuke Boku; Fumiyoshi Shoji; Atsuya Uno; Motoyoshi Kurokawa; Hikaru Inoue; Ikuo Miyoshi; Mitsuo Yokokawa
Real space DFT (RSDFT) is a simulation technique most suitable for massively-parallel architectures to perform first-principles electronic-structure calculations based on density functional theory. We here report unprecedented simulations on the electron states of silicon nanowires with up to 107,292 atoms carried out during the initial performance evaluation phase of the K computer being developed at RIKEN. The RSDFT code has been parallelized and optimized so as to make effective use of the various capabilities of the K computer. Simulation results for the self-consistent electron states of a silicon nanowire with 10,000 atoms were obtained in a run lasting about 24 hours and using 6,144 cores of the K computer. A 3.08 peta-flops sustained performance was measured for one iteration of the SCF calculation in a 107,292-atom Si nanowire calculation using 442,368 cores, which is 43.63% of the peak performance of 7.07 peta-flops.
IEEE International Conference on High Performance Computing, Data, and Analytics | 2015
Yuichi Inadomi; Tapasya Patki; Koji Inoue; Mutsumi Aoyagi; Barry Rountree; Martin Schulz; David K. Lowenthal; Yasutaka Wada; Keiichiro Fukazawa; Masatsugu Ueda; Masaaki Kondo; Ikuo Miyoshi
A key challenge in next-generation supercomputing is to effectively schedule limited power resources. Modern processors suffer from increasingly large power variations due to the chip manufacturing process. These variations lead to power inhomogeneity in current systems and manifest as performance inhomogeneity in power-constrained environments, drastically limiting supercomputing performance. We present a first-of-its-kind study on manufacturing variability on four production HPC systems spanning four microarchitectures, analyze its impact on HPC applications, and propose a novel variation-aware power budgeting scheme to maximize effective application performance. Our low-cost and scalable budgeting algorithm strives to achieve performance homogeneity under a power constraint by deriving application-specific, module-level power allocations. Experimental results using a 1,920-socket system show up to 5.4X speedup, with an average speedup of 1.8X across all benchmarks when compared to a variation-unaware power allocation scheme.
IEEE International Conference on High Performance Computing, Data, and Analytics | 2014
Yukihiro Hasegawa; Jun-Ichi Iwata; Miwako Tsuji; Daisuke Takahashi; Atsushi Oshiyama; Kazuo Minami; Taisuke Boku; Hikaru Inoue; Yoshito Kitazawa; Ikuo Miyoshi; Mitsuo Yokokawa
Silicon nanowires are potentially useful in next-generation field-effect transistors, and it is important to clarify the electron states of silicon nanowires to know the behavior of new devices. Computer simulations are promising tools for calculating electron states. Real-space density functional theory (RSDFT) code performs first-principles electronic structure calculations. To obtain higher performance, we applied various optimization techniques to the code: multi-level parallelization, load balance management, sub-mesh/torus allocation, and a message-passing interface library tuned for the K computer. We measured and evaluated the performance of the modified RSDFT code on the K computer. A 5.48 petaflops (PFLOPS) sustained performance was measured for an iteration of a self-consistent field calculation for a 107,292-atom Si nanowire simulation using 82,944 compute nodes, which is 51.67% of the K computer’s peak performance of 10.62 PFLOPS. This scale of simulation enables analysis of the behavior of a silicon nanowire with a diameter of 10–20 nm.
IEEE International Conference on High Performance Computing, Data, and Analytics | 2012
Yasuhiro Idomura; Motoki Nakata; Susumu Yamada; Masahiko Machida; Toshiyuki Imamura; T.-H. Watanabe; Masanori Nunami; Hikaru Inoue; Shigenobu Tsutsumi; Ikuo Miyoshi; Naoyuki Shida
Plasma turbulence research based on 5D gyrokinetic simulations is one of the most critical and demanding issues in fusion science. To pioneer new physics regimes in both problem size and time scale, improved strong scaling is essential. Overlapping computation and communication is a promising approach to improving strong scaling, but it often fails in practical applications with conventional MPI libraries. In this work, this classical issue is revisited and resolved by communication overlap techniques that work even with conventional MPI libraries. These techniques dramatically improve the parallel efficiency of the gyrokinetic Eulerian code GT5D on the K computer and Helios, which are based on dedicated and commodity networks, respectively. On the K computer, excellent strong scaling is confirmed beyond 100k cores while maintaining a peak ratio of ~10% (~307 TFlops using 196,608 cores), and simulations for ITER-size fusion devices are significantly accelerated.
Archive | 1998
Tsunehisa Doi; Ikuo Miyoshi; Takeshi Sekine; Tatsuya Shindo
Archive | 2008
Ikuo Miyoshi
IEICE Transactions on Information and Systems | 2011
Hideki Miwa; Ryutaro Susukita; Hidetomo Shibamura; Tomoya Hirao; Jun Maki; Makoto Yoshida; Takayuki Kando; Yuichiro Ajima; Ikuo Miyoshi; Toshiyuki Shimizu; Yuji Oinaga; Hisashige Ando; Yuichi Inadomi; Koji Inoue; Mutsumi Aoyagi; Kazuaki Murakami
Archive | 2011
Ikuo Miyoshi
Archive | 2017
Ikuo Miyoshi
Archive | 2010
Ikuo Miyoshi