Network


Latest external collaborations at the country level. Dive into the details by clicking on the dots.

Hotspot


Dive into the research topics where Tomio Kamada is active.

Publication


Featured research published by Tomio Kamada.


ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming | 2014

GLB: lifeline-based global load balancing library in X10

Wei Zhang; Olivier Tardieu; David Grove; Benjamin Herta; Tomio Kamada; Vijay A. Saraswat; Mikio Takeuchi

We present GLB, a programming model and an associated implementation that can handle a wide range of irregular parallel programming problems running over large-scale distributed systems. GLB is applicable both to problems that are easily load-balanced via static scheduling and to problems that are hard to statically load balance. GLB hides the intricate synchronizations (e.g., inter-node communication, initialization and startup, load balancing, termination and result collection) from the users. GLB internally uses a version of the lifeline graph based work-stealing algorithm proposed by Saraswat et al [25]. Users of GLB are simply required to write several pieces of sequential code that comply with the GLB interface. GLB then schedules and orchestrates the parallel execution of the code correctly and efficiently at scale. We have applied GLB to two representative benchmarks: Betweenness Centrality (BC) and Unbalanced Tree Search (UTS). Among them, BC can be statically load-balanced whereas UTS cannot. In either case, GLB scales well -- achieving nearly linear speedup on different computer architectures (Power, Blue Gene/Q, and K) -- up to 16K cores.
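
The abstract's key usage claim is that users write only a few pieces of sequential code conforming to the GLB interface, while the library handles distribution, work stealing, and termination. The following Java interface is a minimal sketch of what such a contract can look like; the names (TaskBag, process, split, merge, result) are illustrative assumptions, not the actual X10 GLB API.

```java
// Illustrative sketch of a GLB-style user contract (hypothetical names, not the real X10 API).
// The user supplies only sequential task-processing code; the runtime drives work stealing,
// lifeline re-activation, termination detection, and the final result reduction.
public interface TaskBag<R> {
    /** Process at most n tasks from the local bag; return false once the bag is empty. */
    boolean process(long n);

    /** Hand roughly half of the remaining tasks to a thief, or return null if too few remain. */
    TaskBag<R> split();

    /** Absorb tasks stolen from a victim or pushed in by a lifeline buddy. */
    void merge(TaskBag<R> loot);

    /** Partial result of the tasks processed so far; the runtime reduces these across places. */
    R result();
}
```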


Information Integration and Web-based Applications & Services | 2008

Application framework with demand-driven mashup for selective browsing

Sohei Ikeda; Takakazu Nagamine; Tomio Kamada

This paper proposes a mashup framework for creating flexible mashup applications in which the user can selectively browse through mashup items. Our framework provides a data management engine for on-demand data generation, and GUI components called widgets that can be used to browse through mashed-up data selectively. The application developer only has to prepare a mashup relation specifying the web service combinations and widget configurations specifying how to display the mashed-up data. On the basis of these configurations, widgets monitor user interactions and request data from the data management engine, which performs the demand-driven creation of mashed-up data. To enable selective browsing, a table widget, for instance, allows selection of the columns to be displayed and provides a limited view with scroll bars and filtering facilities. Our framework also offers a mechanism for widget coordination in which a widget can change its display target according to the states or events of other widgets. We introduce a sample application for tour planning using five cooperative widgets, and discuss the usability and performance advantages of our framework.
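
The architecture described above pairs widgets that track user interaction with a data management engine that materializes mashed-up data only on demand. The sketch below illustrates that flow in Java with hypothetical names (MashupEngine, Row, TableWidget); the framework's actual configuration format and API are not shown in this listing.

```java
// Sketch of the demand-driven widget/engine interplay described in the abstract
// (hypothetical names; not the framework's actual API).
import java.util.List;

interface Row { Object column(String name); }

interface MashupEngine {
    /** Materialize only the rows of the named mashup relation that a widget currently needs,
        invoking the underlying web services on demand for rows not yet generated. */
    List<Row> fetch(String relation, int offset, int limit);
}

class TableWidget {
    private final MashupEngine engine;
    private final String relation;

    TableWidget(MashupEngine engine, String relation) {
        this.engine = engine;
        this.relation = relation;
    }

    /** Called when the user scrolls: request just the visible window of rows. */
    List<Row> onScroll(int firstVisibleRow, int visibleRows) {
        return engine.fetch(relation, firstVisibleRow, visibleRows);
    }
}
```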


Journal of Information Processing | 2016

Introducing a Multithread and Multistage Mechanism for the Global Load Balancing Library of X10

Kento Yamashita; Tomio Kamada

Load balancing is a major concern in massively parallel computing. X10 is a partitioned global address space language for scale-out computing and provides a global load balancing (GLB) library that shows high scalability over ten thousand CPU cores. This study proposes a multistage mechanism for GLB that assigns execution stages to tasks, and introduces a multithread design into GLB to allow efficient data sharing between CPU cores. The system gives high priority to tasks assigned to earlier stages and then proceeds with subsequent-stage tasks. When a computing node runs out of tasks at the earliest stage, it requests tasks at that stage from other nodes and, while awaiting responses, processes subsequent-stage tasks. When the system detects task termination at a certain stage, it executes a reduction operation over the nodes. Programmers can define their own reduction operations to gather or exchange the results of completed tasks. This study describes the implementation of the extended library and evaluates its runtime overhead on the K computer using up to 256 nodes.
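
The scheduling rule in the abstract is simple: always prefer tasks of the earliest stage, and fall back to later-stage tasks while a request for earliest-stage work is outstanding. Below is a minimal Java sketch of that priority rule with hypothetical types; the actual library is multithreaded and distributed over X10 places.

```java
// Sketch of the stage-priority rule described in the abstract (hypothetical types).
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class StagedWorker<T> {
    // stageQueues.get(0) holds the earliest stage, which is always preferred.
    private final List<Deque<T>> stageQueues = new ArrayList<>();

    StagedWorker(int stages) {
        for (int i = 0; i < stages; i++) stageQueues.add(new ArrayDeque<>());
    }

    void push(int stage, T task) { stageQueues.get(stage).offer(task); }

    /** Take a task from the earliest non-empty stage. A real worker would also issue a
        remote request for earliest-stage tasks and keep processing later stages while
        waiting for the response; global termination of a stage triggers a user-defined
        reduction over the nodes. */
    T next() {
        for (Deque<T> q : stageQueues) {
            T t = q.poll();
            if (t != null) return t;
        }
        return null; // all stages locally empty
    }
}
```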


Artificial Life and Robotics | 2017

Platform design for large-scale artificial market simulation and preliminary evaluation on the K computer

Takuma Torii; Tomio Kamada; Kiyoshi Izumi; K. Yamada

Artificial market simulations have the potential to be a strong tool for studying rapid and large market fluctuations and for designing financial regulations. High-frequency traders, who exchange multiple assets simultaneously within a millisecond, are said to be a cause of rapid and large market fluctuations. For such a large-scale problem, this paper proposes a software or computing platform for large-scale and high-frequency artificial market simulations (Plham: /plʌm/). The computing platform, Plham, enables modeling financial markets composed of various brands of assets and a large number of agents trading on a short timescale. The design feature of Plham is the separation of artificial market models (simulation models) from their execution (execution models). This allows users to define their simulation models without parallel computing expertise and to choose one of the execution models they need. This computing platform provides a prototype execution model for parallel simulations, which exploits the variety in trading frequency among traders, that is, the fact that some traders do not require up-to-date information of markets changing in millisecond order. We evaluated a prototype implementation on the K computer using up to 256 computing nodes.
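
The design point emphasized in the abstract is the separation of the simulation model (markets and agents) from the execution model (how agents are actually run, sequentially or in parallel). The following Java sketch illustrates that separation; the interface names (Agent, MarketView, ExecutionModel) are illustrative assumptions, not Plham's actual API.

```java
// Sketch of the simulation-model / execution-model separation described in the abstract
// (hypothetical interfaces; Plham's actual API is richer).
interface MarketView {
    double lastPrice(String assetId);
}

interface Agent {
    /** Decide and submit orders from a (possibly slightly stale) market snapshot.
        Low-frequency traders tolerate older snapshots, which a parallel execution
        model can exploit to reduce synchronization between computing nodes. */
    void onStep(MarketView view);
}

interface ExecutionModel {
    /** Run the same simulation model: a sequential model loops over agents on one node,
        while a parallel model distributes agents and markets over many nodes. */
    void run(Iterable<Agent> agents, long steps);
}
```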


Document Engineering | 2005

A programming environment for demand-driven processing of network XML data and its performance evaluation

Masakazu Yamanaka; Kenji Niimura; Tomio Kamada

This paper proposes a programming environment for Java that processes network XML data in a demand-driven manner to return quick initial responses. Our system provides a data binding tool and a tree operation package, and the programmer can easily handle network XML data as tree-based operations using these facilities. For efficiency, demand-driven data binding allows the application to start the processing of a network XML document before the arrival of the whole data, and our tree operators are also designed to start the calculation using the initially accessible part of the input data. Our system uses multithread technology for implementation with optimization techniques to reduce runtime overheads. It can return initial responses quickly, and often shortens the total execution time due to the effects of latency hiding and the reduction of memory usage. Compared with an ordinary tree-based approach, our system shows a highly improved response and a 1-28% reduction of total execution time on the benchmark programs. It only needs 1-4% runtime overheads against the event-driven programs.
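
The abstract's central idea is that tree navigation may begin before the whole XML document has arrived, blocking only when it touches a part of the stream that is not yet available. Below is a small Java sketch of that demand-driven access pattern with hypothetical names (LazyNode, firstChild); the paper's actual data-binding tool and tree operation package are not reproduced here.

```java
// Sketch of demand-driven access to a network XML document (hypothetical API).
// Navigation blocks (or yields to another thread) only until the requested part
// of the document has been parsed from the incoming stream.
interface LazyNode {
    String name();

    /** Return the first matching child, waiting only until that child has arrived. */
    LazyNode firstChild(String childName);

    /** Return the element's text content, waiting only until it is available. */
    String text();
}

class FirstItemPrinter {
    /** Emit an initial response as soon as the first <item> element arrives,
        before the rest of the document has been downloaded. */
    static void printFirstTitle(LazyNode feedRoot) {
        LazyNode firstItem = feedRoot.firstChild("item");
        System.out.println(firstItem.firstChild("title").text());
    }
}
```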


International Workshop on Theory and Practice of Parallel Programming | 1994

An algorithm for efficient global garbage collection on massively parallel computers (extended abstract)

Tomio Kamada; Satoshi Matsuoka; Akinori Yonezawa

Several distributed GC algorithms have been proposed in the past, but realistic applications to commercially available, large-scale MPPs have not been entirely successful. This is because algorithms (1) incurred excessive message traffic, (2) involved prohibitive runtime overhead, (3) did not properly scale to larger numbers of processors, and/or (4) have not been implemented to test their validity/efficiency. In general, GC algorithms for distributed-memory architectures can be roughly classified into two types:


International Journal of Distributed Sensor Networks | 2018

Efficient and reliable packet transfer protocol for wireless multihop bidirectional communications

Yumi Takaki; Makoto Ando; Keisuke Maesako; Keisuke Fujita; Tomio Kamada; Chikara Ohta


Annual ACIS International Conference on Computer and Information Science | 2016

Redistribution mechanism for associative distributed collections of objects

Daisuke Fujishima; Tomio Kamada



Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing | 2012

Joined View Editor for Mashups of Web Data Stores

Yoshio Kumagai; Masaya Senba; Takakazu Nagamine; Tomio Kamada



Database Systems for Advanced Applications | 2010

Application developments in mashup framework for selective browsing

Takakazu Nagamine; Tomio Kamada


Collaboration


Dive into Tomio Kamada's collaboration.

Top Co-Authors

Satoshi Matsuoka

Tokyo Institute of Technology

Itsuki Noda

National Institute of Advanced Industrial Science and Technology
