Hidetoshi Iwashita
Fujitsu
Publications
Featured research published by Hidetoshi Iwashita.
Concurrency and Computation: Practice and Experience | 2002
Yoshiki Seo; Hidetoshi Iwashita; Hiroshi Ohta; Hitoshi Sakagami
This paper presents a set of extensions to High Performance Fortran (HPF) that make it more usable for parallelizing real‐world production codes. HPF has been effective for programs that a compiler can automatically optimize efficiently; when the compiler cannot, however, users have had no way to parallelize or optimize their programs explicitly. To resolve this situation, we have developed a set of HPF extensions (HPF/JA) that give users more control over sophisticated parallelization and communication optimizations. They include parallelization of loops with complicated reductions, asynchronous communication, a user‐controllable shadow, and communication pattern reuse for irregular remote data accesses. Preliminary experiments have shown that the extensions are effective at increasing HPF's usability.
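The user-controllable shadow mentioned above corresponds to the familiar idea of ghost (halo) cells around each processor's block of a distributed array. The following Python sketch illustrates only that concept; the function name and block-distribution details are illustrative, not taken from the HPF/JA implementation.

```python
# Minimal sketch of a "shadow" (ghost/halo) region, the concept behind a
# user-controllable SHADOW: each block of a block-distributed array keeps
# `width` extra cells copied from its neighbours. Illustrative only.

def split_with_shadow(data, nprocs, width):
    """Block-distribute `data` over `nprocs` owners, each block extended by
    up to `width` shadow cells on each side (boundary blocks get fewer)."""
    n = len(data)
    size = n // nprocs                        # assume nprocs divides n
    blocks = []
    for p in range(nprocs):
        lo = max(0, p * size - width)         # left shadow
        hi = min(n, (p + 1) * size + width)   # right shadow
        blocks.append(data[lo:hi])
    return blocks

blocks = split_with_shadow(list(range(8)), nprocs=2, width=1)
print(blocks)  # [[0, 1, 2, 3, 4], [3, 4, 5, 6, 7]]
```

A wider shadow trades extra memory and refresh traffic for fewer communication events, which is why making the width user-controllable matters for tuning.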
International Conference on Supercomputing | 1995
Tatsuya Shindo; Hidetoshi Iwashita; Tsunehisa Doi; Junichi Hagiwara; Shaun Kaneshiro
High Performance Fortran (HPF) is a candidate for a standard programming language for distributed memory parallel computers. This paper presents the design and implementation of an HPF compiler for the Fujitsu AP1000 parallel computers. There are two novel features implemented in the compiler. The first is a machine-independent optimization based on the intermediate format. The second is a code generation technique utilizing the direct remote data access (DRDA) mechanism and stride data transfer supported by the AP1000 hardware. With the results of experiments on the AP1000, this paper shows the effects of the optimization and code generation techniques.
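A stride data transfer moves the elements of a non-contiguous array section (for example, a column of a row-major array) as a single described unit instead of element by element. The pack/unpack sketch below is purely conceptual; on the AP1000 the equivalent gather/scatter is hardware-assisted, and these function names are illustrative.

```python
# Conceptual sketch of a stride data transfer: elements at a fixed stride
# are gathered into one message on the sender and scattered back on the
# receiver. Illustrative only; the AP1000 does this in hardware.

def pack_strided(buf, start, stride, count):
    """Gather `count` elements of `buf` starting at `start`, `stride` apart."""
    return [buf[start + k * stride] for k in range(count)]

def unpack_strided(buf, start, stride, values):
    """Scatter `values` into `buf` at the given start and stride."""
    for k, v in enumerate(values):
        buf[start + k * stride] = v

src = list(range(10))
msg = pack_strided(src, start=1, stride=3, count=3)
dst = [0] * 10
unpack_strided(dst, start=0, stride=2, values=msg)
print(msg, dst)  # [1, 4, 7] [1, 0, 4, 0, 7, 0, 0, 0, 0, 0]
```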
International Conference on Supercomputing | 1994
Tatsuya Shindo; Hidetoshi Iwashita; Shaun Kaneshiro; Tsunehisa Doi; Junichi Hagiwara
This paper proposes twisted data layout as a novel and efficient data layout technique for distributed memory parallel processors (DMPP). Data layout is an important aspect in efficiently executing a parallel program on DMPPs. The optimal data layout pattern for an array may differ throughout the program. Twisted data layout can be used to resolve the conflicts among the optimal array distributions in a special case. Experimental results on the AP1000 multicomputer measure the performance of the twisted data layout scheme.
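The conflict the twisted layout resolves arises when one phase of a program wants an array distributed by rows and another by columns. Skewing the ownership so that element (i, j) belongs to processor (i + j) mod P spreads every row and every column across all P processors at once. The mapping below shows that general skewing idea; the paper's exact twisted mapping may differ.

```python
# Illustrative skewed-cyclic ("twisted") ownership: element (i, j) of an
# N x N array is owned by processor (i + j) mod P, so each row AND each
# column is spread over all P processors. Illustrative of the idea only.

def twisted_owner(i, j, p):
    """Owner processor of element (i, j) under a skewed-cyclic layout."""
    return (i + j) % p

P, N = 4, 4
for i in range(N):
    print([twisted_owner(i, j, P) for j in range(N)])
# Each printed row is a rotation of [0, 1, 2, 3]: rows and columns
# both touch every processor, so row-wise and column-wise parallel
# phases are each load-balanced.
```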
IEEE International Conference on High Performance Computing, Data, and Analytics | 2005
Hidetoshi Iwashita; Masaki Aoki
We propose a mapping normalization technique that reduces the wide variety of HPF data and computation mapping representations to a standard form. The reduction is based on a set of equivalent transformations of an HPF program, using composition of alignments and affine transformations of data and loop indices. The mapping normalization technique was implemented in the fhpf HPF compiler, where it considerably simplified the subsequent phases, such as local access detection and SPMD conversion. Measurements show that the performance of the MPI code generated by the fhpf compiler is comparable to that of code written by a skilled MPI programmer.
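The key algebraic fact behind composing alignments is that one-dimensional HPF alignments are affine maps on indices, and affine maps are closed under composition, so a chain of alignments collapses into a single map in standard form. A minimal sketch, with illustrative alignments not taken from the paper:

```python
# An ALIGN like "A(i) WITH T(2*i + 1)" is the affine map i -> 2*i + 1,
# represented here as the pair (a, b) meaning i -> a*i + b. Composing two
# such maps yields another affine map, which is why a chain of alignments
# can be normalized into one standard-form mapping. Illustrative sketch.

def compose(f, g):
    """Return h = f . g for affine maps (a, b): h(i) = f(g(i))."""
    (a1, b1), (a2, b2) = f, g
    return (a1 * a2, a1 * b2 + b1)  # a1*(a2*i + b2) + b1

align_a_to_t = (2, 1)   # hypothetical: A(i) aligned with T(2*i + 1)
align_t_to_u = (3, 0)   # hypothetical: T(k) aligned with U(3*k)
a_to_u = compose(align_t_to_u, align_a_to_t)
print(a_to_u)  # (6, 3): A(i) is ultimately aligned with U(6*i + 3)
```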
Concurrency and Computation: Practice and Experience | 2002
Hidetoshi Iwashita; Naoki Sueyasu; Sachio Kamiya; G. Matthijs van Waveren
VPP Fortran is a data parallel language that was designed for the VPP series of supercomputers. In addition to pure data parallelism, it contains certain low‐level features designed to extract high performance from user programs. A comparison of VPP Fortran and High‐Performance Fortran (HPF) 2.0 shows that these low‐level features are not available in HPF 2.0. The features include asynchronous inter‐processor communication, explicit shadow, and the LOCAL directive. They proved very useful in VPP Fortran for handling real‐world applications, and they have been included in the HPF/JA extensions. They are described in the paper. The HPF/JA Language Specification Version 1.0 is an extension of HPF 2.0 that achieves practical performance for real‐world applications and is a result of collaboration in the Japan Association for HPF (JAHPF). Some practical programming and tuning procedures with the HPF/JA Language Specification are described, using the NAS Parallel Benchmark BT as an example.
2015 9th International Conference on Partitioned Global Address Space Programming Models | 2015
Hidetoshi Iwashita; Masahiro Nakao; Mitsuhisa Sato
XcalableMP (XMP) is a PGAS language for distributed memory environments. It employs Coarray Fortran (CAF) features as its local-view programming model. We implemented the main part of CAF in the form of a translator, i.e., a source-to-source compiler, as part of the Omni XMP compiler. The compiler uses GASNet and the Fujitsu RDMA interface to allocate static and allocatable coarrays and to get and put coindexed objects while avoiding ill effects in the backend Fortran compiler. Evaluation with the Himeno benchmark shows that ported CAF programs compiled with the Omni compiler achieve performance on par with the original message passing interface (MPI) program, despite having 32% fewer lines of source code.
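The essence of the translation is that a coindexed assignment such as `a(i)[q] = v` becomes a runtime put call, and a coindexed reference becomes a get. The toy runtime below is an in-process stand-in for the GASNet/RDMA layer, purely to illustrate the shape of the generated calls; all names are hypothetical.

```python
# Toy model of what the source-to-source CAF translation produces:
# coindexed writes become put() calls and coindexed reads become get()
# calls against a runtime that owns one memory segment per image.
# Illustrative only; real targets are GASNet or Fujitsu RDMA.

class ToyRuntime:
    def __init__(self, nimages, size):
        # one segment per image, as for a static coarray of `size` elements
        self.mem = [[0] * size for _ in range(nimages)]

    def put(self, image, index, value):
        """Translated form of:  a(index)[image] = value"""
        self.mem[image][index] = value

    def get(self, image, index):
        """Translated form of:  value = a(index)[image]"""
        return self.mem[image][index]

rt = ToyRuntime(nimages=2, size=4)
rt.put(1, 0, 42)        # e.g. the translation of a coindexed assignment
print(rt.get(1, 0))     # 42
```

Keeping the coarray storage behind runtime calls like these is also what lets the translator avoid ill effects in the backend Fortran compiler: the backend never sees remote accesses, only ordinary calls.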
International Workshop on OpenMP | 2010
Yuanyuan Zhang; Hidetoshi Iwashita; Kuninori Ishii; Masanori Kaneko; Tomotake Nakamura; Kohichiro Hotta
The process-thread hybrid programming paradigm is commonly employed in SMP clusters. XPFortran, a parallel programming language that specifies a set of compiler directives and library routines, can be used to realize process-level parallelism in distributed memory systems. In this paper, we introduce hybrid parallel programming with XPFortran for SMP clusters, in which thread-level parallelism is realized by OpenMP. We present the language support and compiler implementation of OpenMP directives in XPFortran, and share some of our experiences with XPFortran-OpenMP hybrid programming. For nested loops parallelized with the process-thread hybrid approach, the conventional choice is process parallelization for the outer loops and thread parallelization for the inner ones. However, we have found that in some cases an XPFortran-OpenMP hybrid program can be written the other way around, i.e., OpenMP outside, XPFortran inside. Our evaluation results show that this programming style sometimes delivers better performance than the traditional one. We therefore recommend applying hybrid parallelization flexibly.
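The two nestings can be made concrete with plain index partitioning, with no real processes or threads involved. In the usual style the outer loop index is split across processes and the inner across threads; the "reverse" style swaps them. Worker counts and loop bounds below are illustrative.

```python
# Sketch of the two hybrid nestings: partitioning a 2-D iteration space
# (i, j) either process-on-i / thread-on-j (usual) or thread-on-i /
# process-on-j (reverse). Both cover every iteration exactly once.

def partition(n, nworkers, w):
    """Block of iterations 0..n-1 assigned to worker w of nworkers."""
    size = (n + nworkers - 1) // nworkers
    return range(w * size, min(n, (w + 1) * size))

NI, NJ = 4, 6
procs, threads = 2, 3

usual = {(p, t): [(i, j) for i in partition(NI, procs, p)
                         for j in partition(NJ, threads, t)]
         for p in range(procs) for t in range(threads)}

reverse = {(p, t): [(i, j) for i in partition(NI, threads, t)
                           for j in partition(NJ, procs, p)]
           for p in range(procs) for t in range(threads)}

all_iters = {(i, j) for i in range(NI) for j in range(NJ)}
print(set().union(*usual.values()) == all_iters,
      set().union(*reverse.values()) == all_iters)  # True True
```

Since both decompositions are correct, the choice between them is purely a performance question, which is the point of the paper's "reverse" experiments.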
IEEE International Conference on High Performance Computing, Data, and Analytics | 2009
Hidetoshi Iwashita; Kohichiro Hotta; Sachio Kamiya; G. Matthijs van Waveren
The UXP/V HPF compiler, which was developed for the VPP series of vector-parallel supercomputers, extracts the highest performance from the hardware. However, it has become difficult for developers to concentrate on a single hardware platform. This paper describes a method of developing an HPF compiler for multiple platforms without losing performance. Advantage is taken of existing technology: the code generator and runtime system of VPP Fortran are reused for high-end computers, while MPI is employed for general distributed environments, such as PC clusters. Following a performance estimation on different systems, we discuss the effectiveness of the method and open issues.
IEEE International Conference on High Performance Computing, Data, and Analytics | 2003
Hidetoshi Iwashita; Masanori Kaneko; Masaki Aoki; Kohichiro Hotta; G. Matthijs van Waveren
The OpenMP Architecture Review Board released version 2.0 of the OpenMP Fortran language specification in November 2000, and version 2.0 of the OpenMP C/C++ language specification in March 2002. This paper discusses the implementation of the OpenMP Fortran 2.0 WORKSHARE construct, NUM_THREADS clause, COPYPRIVATE clause, and array REDUCTION clause in the Parallelnavi software package. We focus on the WORKSHARE construct and discuss how we attain parallelization with loop fusion.
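Conceptually, a WORKSHARE implementation turns each whole-array assignment into a loop, fuses conforming loops, and divides the fused iteration space among threads. The Python sketch below models only the fusion and the per-thread division of work, with no real threading; it is not the Parallelnavi implementation.

```python
# Sketch of WORKSHARE with loop fusion: the two Fortran array statements
#   C = A + B
#   D = 2 * C
# become one fused loop whose iterations are divided among threads.
# Illustrative model only (sequential execution of each thread's share).

def fused_workshare(a, b, nthreads):
    n = len(a)
    c, d = [0] * n, [0] * n
    chunk = (n + nthreads - 1) // nthreads
    for t in range(nthreads):                      # each thread's share
        for i in range(t * chunk, min(n, (t + 1) * chunk)):
            c[i] = a[i] + b[i]                     # fused statement 1
            d[i] = 2 * c[i]                        # fused statement 2
    return c, d

c, d = fused_workshare([1, 2, 3, 4], [10, 20, 30, 40], nthreads=2)
print(c, d)  # [11, 22, 33, 44] [22, 44, 66, 88]
```

Fusion is what makes this profitable: each `c[i]` is consumed immediately after it is produced, saving a second pass over the arrays and a synchronization between the two statements.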
Concurrency and Computation: Practice and Experience | 2002
G. Matthijs van Waveren; Cliff Addison; Peter Harrison; Dave Orange; Norman Brown; Hidetoshi Iwashita
One of the language features of the core language of HPF 2.0 (High Performance Fortran) is the HPF Library, which consists of 55 generic functions. Implementing this library presents the challenge that all data types, data kinds, array ranks and input distributions need to be supported. For instance, more than 2 billion separate functions are required to support COPY_SCATTER fully. The efficient support of these billions of specific functions is one of the outstanding problems of HPF. We have solved this problem by developing a library generator which utilizes the mechanism of parameterized templates. This mechanism allows the procedures to be instantiated at compile time for arguments with a specific type, kind, rank and distribution over a specific processor array. We describe the algorithms used in the different library functions. The implementation makes it easy to generate a large number of library routines from a single template, and the templates can be extended with special code for specific combinations of the input arguments. We describe in detail the implementation and performance of the matrix multiplication template for the Fujitsu VPP5000 platform.
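The billions-of-functions figure comes from simple combinatorics: a naive implementation needs one specific routine per combination of type/kind, per rank of each array argument, and per distribution of each dimension. The counts below are assumed for illustration and are not the paper's exact figures.

```python
# Back-of-envelope sketch of why naive HPF Library specialization explodes.
# All counts here are illustrative assumptions, not the paper's numbers.

types = 10      # assumed number of type/kind combinations
max_rank = 7    # Fortran's maximum array rank
dists = 4       # assumed distributions per dimension (BLOCK, CYCLIC, ...)

# Variants of one array argument: choose a rank, then a distribution
# for each of its dimensions.
per_array = sum(dists ** r for r in range(1, max_rank + 1))

# A routine like COPY_SCATTER takes several array arguments; assume three.
variants = types * per_array ** 3
print(f"{variants:,}")  # far beyond 2 billion even with these small counts
```

This is why instantiating a parameterized template at compile time, only for the combinations a program actually uses, is the practical alternative to pre-generating every specific function.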