Kentaro Inenaga
Kyushu University
Publications
Featured research published by Kentaro Inenaga.
IEEE International Conference on High Performance Computing, Data, and Analytics | 1997
Kentaro Inenaga; Shigeru Kusakabe; Tetsuro Morimoto; Makoto Amamiya
Dataflow-based non-strict functional programming languages have attractive features for writing concise programs. To avoid performance penalties on non-dataflow stock machines, we speculatively use a stack frame instead of a heap frame for a fine-grain function instance, which may require dynamic scheduling. As a static approach, we introduce a merging policy into a thread partitioning algorithm in order to find functions with a potentially strict call interface. To complement this static analysis, we provide a hybrid runtime mechanism that can dynamically change a suspended stack frame into a heap frame. The performance evaluation results indicate that we can reduce superfluous heap frame management and achieve practical performance even on stock machines.
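The hybrid mechanism described in the abstract can be illustrated with a minimal sketch. All names here (`Frame`, `Runtime`, `suspend`) are illustrative assumptions, not the paper's actual runtime: a function instance speculatively runs in a stack frame, and only when it suspends is the frame promoted to the heap so the stack can unwind.

```python
# Minimal sketch of speculative stack-frame allocation with promotion
# to a heap frame on suspension (hypothetical names, not the authors'
# actual runtime).

class Frame:
    def __init__(self, locals_):
        self.locals = dict(locals_)
        self.on_heap = False

class Runtime:
    def __init__(self):
        self.stack = []          # speculatively stack-allocated frames
        self.suspended = []      # heap frames awaiting rescheduling

    def call(self, locals_):
        # Optimistically give the new function instance a stack frame.
        frame = Frame(locals_)
        self.stack.append(frame)
        return frame

    def suspend(self, frame):
        # The instance blocked (e.g. on a non-strict data access):
        # move its frame off the stack and onto the heap.
        assert self.stack and self.stack[-1] is frame
        self.stack.pop()
        frame.on_heap = True
        self.suspended.append(frame)
        return frame

rt = Runtime()
f = rt.call({"x": 1})
rt.suspend(f)    # frame now lives on the heap; stack is free to unwind
```

In the common case where the function never suspends, the frame stays on the stack and no heap management cost is paid, which is the source of the savings the abstract reports.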
Hawaii International Conference on System Sciences | 1997
Shigeru Kusakabe; Taku Nagai; Kentaro Inenaga; Makoto Amamiya
In order to show the feasibility of a fine-grain dataflow computation scheme, we are implementing a fine-grain dataflow language on off-the-shelf computers, using a fine-grain multithread approach. Fine-grain parallel data-structures such as I-structures provide high-level abstraction for easily writing programs with potentially high parallelism. The results of preliminary experiments on a distributed-memory parallel machine indicate that the performance inefficiency related to fine-grain parallel data-structures in the naive implementation is mainly caused by the calculation of local addresses for distributed data and by frequent fine-grain data access using message passing. To reduce the addressing overhead, we introduce a two-level table addressing technique. We employ a caching mechanism and a grouping mechanism for fine-grain data access. The preliminary performance evaluation results indicate that these techniques are effective in improving performance.
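The I-structures mentioned above are single-assignment arrays whose reads implicitly synchronize with writes: a read of an empty slot is deferred until the producer writes it. The following is a minimal sketch of that semantics under illustrative names; it is not the paper's distributed implementation.

```python
# Sketch of I-structure semantics: write-once slots with deferred reads.
# Names (IStructure, read, write) are illustrative.

class IStructure:
    _EMPTY = object()

    def __init__(self, n):
        self.slots = [self._EMPTY] * n
        self.deferred = {}                    # slot index -> waiting readers

    def read(self, i, consumer):
        if self.slots[i] is self._EMPTY:
            # Reader arrived before the producer: suspend it.
            self.deferred.setdefault(i, []).append(consumer)
        else:
            consumer(self.slots[i])

    def write(self, i, value):
        # Single-assignment: a second write to the same slot is an error.
        assert self.slots[i] is self._EMPTY, "I-structure slots are write-once"
        self.slots[i] = value
        for consumer in self.deferred.pop(i, []):
            consumer(value)                   # resume deferred readers

out = []
a = IStructure(2)
a.read(0, out.append)   # reader is deferred: slot 0 is still empty
a.write(0, 42)          # the write releases the deferred reader
```

This deferral is exactly the fine-grain dynamic scheduling whose message-passing cost the caching and grouping mechanisms in the abstract aim to reduce.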
International Parallel Processing Symposium | 1999
Shigeru Kusakabe; Kentaro Inenaga; Makoto Amamiya; Xinan Tang; Andres Marquez; Guang R. Gao
The combination of a language with fine-grain implicit parallelism and a dataflow evaluation scheme is suitable for high-level programming on massively parallel architectures. We are developing a compiler of V, a non-strict functional programming language, for EARTH (Efficient Architecture for Running THreads). Our compiler generates code in Threaded-C, a lower-level programming language for EARTH. We have developed translation rules and integrated them into the compiler. Since overhead caused by fine-grain processing may degrade performance for programs with little parallelism, we have adopted a thread merging rule. The preliminary performance results are encouraging. Although further improvement is required for non-strict data-structures, some code generated from V programs by our compiler achieved performance comparable to that of hand-written Threaded-C code.
Atmospheric Environment | 1999
Kentaro Inenaga; Shigeru Kusakabe; Makoto Amamiya
Fine-grain non-strict data structures such as I-structures provide high-level abstraction for easily writing programs with potentially high parallelism, owing to the eager evaluation of non-strict functions and non-strict structured-data. However, non-strict data structures require frequent dynamic scheduling at a fine-grain level, which offsets the gains of latency hiding and asynchronous access to structured-data, and causes heavy overhead on commodity machines. To solve these problems for fine-grain non-strict structured-data, we employ a method that analyzes dependencies between the structured-data and schedules their producers and consumers. The performance evaluation results indicate that this scheduling technique is effective in improving the performance of fine-grain non-strict programs on commodity machines.
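Once the dependencies between structured-data producers and consumers are known statically, running producers before their consumers avoids the suspensions that make non-strict access expensive. A plain topological sort captures this ordering idea; the sketch below is illustrative and is not the paper's analysis or scheduler.

```python
# Sketch: schedule producers ahead of consumers once the dependence
# graph between structured-data tasks is known. Task names are
# hypothetical; a topological sort stands in for the paper's scheduler.
from graphlib import TopologicalSorter

# Map each task to the set of tasks it depends on.
deps = {
    "consume_A": {"produce_A"},
    "consume_B": {"produce_B", "consume_A"},
}
order = list(TopologicalSorter(deps).static_order())
# Every producer now appears before each of its consumers, so a
# consumer never has to suspend waiting for an unfilled element.
```

When such an order exists for a region of the program, the runtime can execute that region without any fine-grain suspension machinery, which is where the reported speedup on commodity machines comes from.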
Innovative Architecture for Future Generation High-Performance Processors and Systems | 1998
Shigeru Kusakabe; Kentaro Inenaga; Makoto Amamiya; Xinan Tang; Andres Marquez; Guang R. Gao
The combination of a language with fine-grain implicit parallelism and a dataflow evaluation scheme is suitable for high-level programming on massively parallel architectures. We are developing a compiler of V, a non-strict functional programming language, for EARTH (Efficient Architecture for Running THreads). Our compiler generates code in Threaded-C, a lower-level programming language for EARTH. We have developed translation rules and integrated them into the compiler. While EARTH directly supports fine-grain thread execution, thread-level optimization by the compiler is also effective on EARTH. The preliminary performance results are encouraging, although further improvement is required for non-strict data-structures. Some code generated from V programs by our compiler achieved performance comparable to that of hand-written Threaded-C code.
Lecture Notes in Computer Science | 1997
Shigeru Kusakabe; Kentaro Inenaga; Makoto Amamiya
We tried to efficiently implement array-comprehensions on commodity machines. As runtime support, we provided a hybrid runtime mechanism that can dynamically change a suspended stack frame into a heap frame when a filling function suspends and requires dynamic scheduling, for example on a non-strict data-structure access. The results of the preliminary performance evaluation indicated that array-comprehensions can be implemented with practical performance even on stock machines.
International Conference on Parallel Architectures and Compilation Techniques | 1996
Shigeru Kusakabe; Taku Nagai; Kentaro Inenaga; Makoto Amamiya
Dataflow-based fine-grain parallel data-structures provide high-level abstraction for easily writing programs with potentially high parallelism. In order to show the feasibility of a fine-grain dataflow paradigm, we implement a non-strict dataflow language on off-the-shelf computers, including a distributed-memory parallel machine. The results of preliminary experiments indicate that the inefficiency related to fine-grain parallel arrays in the naive distributed-memory implementation is mainly caused by address generation for distributed data. To reduce this overhead, we introduce a two-level table addressing technique that can generate addresses efficiently. The performance evaluation results indicate that this technique improves performance to a practical level even on off-the-shelf computers.
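The two-level scheme can be pictured as splitting a global array index into a block number and an in-block offset, with a first-level table mapping each block to its owning node and local base address. The sketch below is a minimal illustration under assumed names and an assumed block-cyclic layout; it is not the paper's implementation.

```python
# Sketch of two-level table addressing for a block-distributed array.
# Assumptions (not from the paper): 8 blocks of 4 elements, laid out
# block-cyclically over 2 nodes; `table` and `locate` are hypothetical.

BLOCK = 4                                    # elements per block

# First level: block number -> (owning node, base offset in local store).
# Even blocks go to node 0, odd blocks to node 1.
table = {b: (b % 2, (b // 2) * BLOCK) for b in range(8)}

def locate(global_index):
    # Split the global index into (block, offset), then resolve the
    # block through the table: second level is a simple add.
    block, offset = divmod(global_index, BLOCK)
    node, base = table[block]
    return node, base + offset

# Global index 5 lies in block 1, which node 1 holds at base 0:
# locate(5) -> (1, 1)
```

The point of the table is that the distribution-specific arithmetic is done once per block when the table is built, so each element access reduces to one lookup and one addition instead of repeated address calculation.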
ASIAN '96 Proceedings of the Second Asian Computing Science Conference on Concurrency and Parallelism, Programming, Networking, and Security | 1996
Shigeru Kusakabe; Kentaro Inenaga; Kiyotoshi Nishimura; Makoto Amamiya
Research Reports on Information Science and Electrical Engineering of Kyushu University | 2002
Kentaro Inenaga; Shigeru Kusakabe; Makoto Amamiya
IPSJ Transactions on High Performance Computing Systems (情報処理学会論文誌: ハイパフォーマンスコンピューティングシステム) | 2000
Kentaro Inenaga; Shigeru Kusakabe; Makoto Amamiya