Yusuke Tanimura

National Institute of Advanced Industrial Science and Technology

Publication

Featured research published by Yusuke Tanimura.

IEEE Systems Journal | 2008

Design Principles and IT Overviews of the GEO Grid

Satoshi Sekiguchi; Yoshio Tanaka; Isao Kojima; Naotaka Yamamoto; Shohei Yokoyama; Yusuke Tanimura; Ryosuke Nakamura; Koki Iwao; Satoshi Tsuchida

As the Earth's ecosystem is a spatially and temporally complex system by nature, it is not sufficient to observe such events and phenomena locally; problems must be solved on a global scale. Therefore, the accumulation of knowledge about the Earth in various forms and a scientifically correct understanding of the Earth are necessary. The authors have been leading the "Global Earth Observation (GEO) Grid" project since 2005, which is primarily aimed at providing an e-Science infrastructure for the worldwide Earth sciences community. In the community, there is a wide variety of existing datasets, including satellite imagery, geological data, and ground-sensed data, for which each data owner insists on its own licensing policy. Also, there are many related projects that will be configured as a virtual organization (VO) enabled by Grid technology. The GEO Grid is designed to integrate all of the relevant data virtually, again enabled by Grid technology, and is accessible as a set of services. In this paper, we first describe the design principles of the GEO Grid, which are determined by accommodating users' requirements for publishing, managing, and using data. Second, the software architecture and its preliminary implementations are specified, where we take Grid computing and Web service technologies as the core components that comply with a standard set of technologies and protocols. In addition, the GEO Grid has been recognized as contributing to GEO or the Global Earth Observation System of Systems (GEOSS) as part of the Japanese government's commitment.

Cluster Computing and the Grid | 2006

Deploying Scientific Applications to the PRAGMA Grid Testbed: Strategies and Lessons

David Abramson; Amanda H. Lynch; Hiroshi Takemiya; Yusuke Tanimura; Susumu Date; Haruki Nakamura; Karpjoo Jeong; Suntae Hwang; Ji Zhu; Zhonghua Lu; Celine Amoreira; Kim K. Baldridge; Chi-wei Wang; Horng-liang Shih; Tomas E. Molina; Wilfred W. Li; Peter W. Arzberger

Recent advances in grid infrastructure and middleware development have enabled various types of applications in science and engineering to be deployed on the grid. The characteristics of these applications and the diverse infrastructure and middleware solutions developed, utilized, or adapted by PRAGMA member institutes are summarized. The applications include those for climate modeling, computational chemistry, bioinformatics and computational genomics, remote control of instruments, and distributed databases. Many of the applications are deployed to the PRAGMA grid testbed in routine-basis experiments. Strategies for deploying applications without modifications, and those taking advantage of new programming models on the grid, are explored, and valuable lessons learned are reported. Comprehensive end-to-end solutions from PRAGMA member institutes that provide important grid middleware components and generalized models of integrating applications and instruments on the grid are also described.

International Conference on Move to Meaningful Internet Systems | 2011

ADERIS: an adaptive query processor for joining federated SPARQL endpoints

Steven J. Lynden; Isao Kojima; Akiyoshi Matono; Yusuke Tanimura

Integrating distributed RDF data is facilitated by Linked Data and shared ontologies; however, joins over distributed SPARQL services can be costly, time-consuming operations. This paper describes the design and implementation of ADERIS, a query processing system for efficiently joining data from multiple distributed SPARQL endpoints. ADERIS decomposes federated SPARQL queries into multiple source queries and integrates the results utilising two techniques: adaptive join reordering, for which a cost model is defined, and the optimisation of subsequent queries to data sources to retrieve further data. The benefit of the approach in terms of minimising response time is illustrated by sample queries containing common SPARQL join patterns.
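The idea of re-deciding the join order as intermediate results arrive can be illustrated with a minimal Python sketch. It is not the ADERIS implementation: the abstract does not give the cost model, so `estimated_cost` below is a hypothetical stand-in (candidate table size when a variable is shared, cross-product size otherwise), and each table is simply a list of variable-binding dictionaries as might be returned by a source query.

```python
def join(left, right):
    """Hash join of two lists of binding dicts on their shared variables."""
    if not left or not right:
        return []
    shared = sorted(set(left[0]) & set(right[0]))
    index = {}
    for row in right:
        index.setdefault(tuple(row[v] for v in shared), []).append(row)
    # With no shared variables every key is (), which yields a cross product.
    return [{**l, **r}
            for l in left
            for r in index.get(tuple(l[v] for v in shared), [])]


def estimated_cost(current, candidate):
    """Hypothetical cost model (an assumption, not the one from the paper)."""
    if not current or not candidate:
        return 0
    if set(current[0]) & set(candidate[0]):
        return len(candidate)                 # joinable: cost ~ rows to probe
    return len(current) * len(candidate)      # no shared variable: cross product


def adaptive_join(tables):
    """Join all tables, choosing the next table after every join step."""
    remaining = dict(tables)
    first = min(remaining, key=lambda name: len(remaining[name]))
    result, order = remaining.pop(first), [first]
    while remaining:
        nxt = min(remaining, key=lambda name: estimated_cost(result, remaining[name]))
        result = join(result, remaining.pop(nxt))
        order.append(nxt)
    return result, order


tables = {
    "big": [{"p": "a", "k": i} for i in range(5)],
    "people": [{"p": "a", "c": "x"}, {"p": "b", "c": "y"}],
    "cities": [{"c": "x", "n": "X"}],
}
rows, order = adaptive_join(tables)
print(order)  # the order is chosen from the actual intermediate results
```

Because the next table is selected only after the previous join has produced its real output, a poor initial estimate affects one step rather than the whole plan.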

Cluster Computing and the Grid | 2006

The PRAGMA Testbed - Building a Multi-Application International Grid

Cindy Zheng; David Abramson; Peter W. Arzberger; Shahaan Ayyub; Colin Enticott; Slavisa Garic; Mason J. Katz; Jae-Hyuck Kwak; Bu-Sung Lee; Philip M. Papadopoulos; Sugree Phatanapherom; Somsak Sriprayoonsakul; Yoshio Tanaka; Yusuke Tanimura; Osamu Tatebe; Putchong Uthayopas

This practices and experience paper describes the coordination, design, implementation, availability, and performance of the Pacific Rim Applications and Grid Middleware Assembly (PRAGMA) Grid Testbed. Applications in high-energy physics, genome annotation, quantum computational chemistry, wildfire simulation, and protein sequence alignment have driven the middleware requirements, and the testbed provides a mechanism for international users to share software beyond the essential, de facto standard Globus core. In this paper, we describe how human factors, resource availability and performance issues have affected the middleware, applications and the testbed design. We also describe how middleware components in grid monitoring, grid accounting, grid Remote Procedure Calls, grid-aware file systems, and grid-based optimization have dealt with some of the major characteristics of our testbed. We also briefly describe a number of mechanisms that we have employed to make software more easily available to testbed administrators.

Databases in Networked Information Systems | 2010

Adaptive integration of distributed semantic web data

Steven J. Lynden; Isao Kojima; Akiyoshi Matono; Yusuke Tanimura

The use of RDF (Resource Description Framework) data is a cornerstone of the Semantic Web. RDF data embedded in Web pages may be indexed using semantic search engines; however, RDF data is often stored in databases accessible via Web Services using the SPARQL query language for RDF. These databases form part of the Deep Web, which is not accessible using search engines. This paper addresses the problem of effectively integrating RDF data stored in separate Web-accessible databases. An approach based on distributed query processing is described, where data from multiple repositories are used to construct partitioned tables that are integrated using an adaptive query processing technique supporting join reordering. This limits any reliance on statistics and metadata about SPARQL endpoints; such information is often inaccurate or unavailable, yet is required by existing systems supporting federated SPARQL queries. The approach presented extends existing approaches in this area by allowing tables to be added to the query plan while it is executing, and shows how an approach currently used within relational query processing can be applied to distributed SPARQL query processing. The approach is evaluated using a prototype implementation and potential applications are discussed.

International Conference on Data Engineering | 2010

Extensions to the Pig data processing platform for scalable RDF data processing using Hadoop

Yusuke Tanimura; Akiyoshi Matono; Steven J. Lynden; Isao Kojima

In order to effectively handle the growing amount of available RDF data, a scalable and flexible RDF data processing framework is needed. We previously proposed a Hadoop-based framework, which takes advantage of scalable and fault-tolerant distributed processing technologies originally proposed as Google's distributed file system and MapReduce parallel model. In this paper, we present a method extending the Pig data processing platform on top of the Hadoop infrastructure. Pig compiles programs written in a high-level language, called Pig Latin, into MapReduce programs that can be executed by Hadoop. In order to support RDF, Pig was extended with the ability to load and store RDF data efficiently. Furthermore, as reasoning is an important requirement for most systems storing RDF data, support for inferring new triples using entailment rules was also added. In this paper, we describe these extensions and present an evaluation of their performance.
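What "inferring new triples using entailment rules" means can be shown with a small single-process Python sketch. It is only an illustration: the paper's extensions run on Pig and Hadoop, and the abstract does not say which rules are supported, so the choice of the two RDFS rules below (rdfs11, subclass transitivity, and rdfs9, type inheritance) is an assumption, as are the `ex:` example terms.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def infer(triples):
    """Apply rdfs9 and rdfs11 repeatedly until no new triple is produced."""
    closure = set(triples)
    while True:
        subclass_pairs = {(s, o) for s, p, o in closure if p == SUBCLASS}
        new = set()
        # rdfs11: (A subClassOf B), (B subClassOf C) => (A subClassOf C)
        for a, b in subclass_pairs:
            for c, d in subclass_pairs:
                if b == c:
                    new.add((a, SUBCLASS, d))
        # rdfs9: (x type A), (A subClassOf B) => (x type B)
        for s, p, o in closure:
            if p == TYPE:
                for c, d in subclass_pairs:
                    if o == c:
                        new.add((s, TYPE, d))
        if new <= closure:          # fixed point reached
            return closure
        closure |= new


data = {
    ("ex:Dog", SUBCLASS, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS, "ex:Animal"),
    ("ex:rex", TYPE, "ex:Dog"),
}
inferred = infer(data)
print(sorted(inferred - data))
```

Each rule is essentially a self-join over the triple set, which is why such reasoning maps naturally onto join-oriented platforms.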

Journal of Grid Computing | 2006

Implementation of Fault-Tolerant GridRPC Applications

Yusuke Tanimura; Tsutomu Ikegami; Hidemoto Nakada; Yoshio Tanaka; Satoshi Sekiguchi

A task-parallel application was implemented with Ninf-G, a GridRPC system, and a series of experiments was conducted on the Grid testbed in the Asia-Pacific region over three months. Through tens of long executions, typical fault patterns were collected, and instability of the network throughput was determined to be a major cause of the faults. Several points are stressed for avoiding a decline in task throughput due to fault-recovery operations: timeout minimization for fault detection, background recovery, redundant task assignments, and so on. This study also offers guidance for the design of an automated fault-tolerant mechanism in an upper layer of the GridRPC framework.
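Two of the points named above, a timeout for fault detection and redundant task assignment, can be sketched in a few lines of Python. This is not Ninf-G or the GridRPC API: plain Python callables stand in for remote servers, and `call_redundantly` and the three example servers are hypothetical names introduced only for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from concurrent.futures import TimeoutError as FuturesTimeout


def call_redundantly(servers, arg, timeout):
    """Send the same task to every server and return the first good answer."""
    pool = ThreadPoolExecutor(max_workers=len(servers))
    futures = [pool.submit(server, arg) for server in servers]
    try:
        for future in as_completed(futures, timeout=timeout):
            if future.exception() is None:     # skip servers that failed
                return future.result()
        raise RuntimeError("all servers failed")
    except FuturesTimeout:
        raise RuntimeError("no server answered within the timeout") from None
    finally:
        pool.shutdown(wait=False)              # do not block on slow servers


# Stand-ins for remote servers (assumptions for the example).
def healthy(x):
    return x * x


def faulty(x):
    raise ConnectionError("link dropped")


def slow(x):
    time.sleep(0.3)
    return x * x


print(call_redundantly([faulty, healthy], 7, timeout=1.0))
```

A shorter timeout detects a stalled server sooner but risks discarding work that would have finished, which is the trade-off behind the abstract's emphasis on timeout minimization.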

Grid Computing | 2010

A distributed storage system allowing application users to reserve I/O performance in advance for achieving SLA

Yusuke Tanimura; Koie Hidetaka; Tomohiro Kudoh; Isao Kojima; Yoshio Tanaka

Performance assurance has become an important aspect of grid and cloud computing, which provide services over the Internet, and Service Level Agreements (SLAs) are frequently contracted between users and service providers. However, the I/O performance of the storage or data access service is still provided on a best-effort basis. Some distributed storage systems implement performance reservation, but the reservation is implemented inside the storage and works in an adaptive manner. In order to promise performance guarantees to users, we propose a distributed storage system that allows application users to explicitly make an advance, time-based reservation for I/O access and storage space. The requested performance is thus guaranteed during the reserved time. This paper describes our proposed concept and the design architecture of the storage system, including the reservation interface, resource management, and I/O control frameworks. It then explains our prototype, which implements a simple resource allocation strategy and I/O control of the storage network in line with the design. Experimental results using the prototype are also shown. They indicate that the reservation cost entailed only a low performance impact on users, and that the requested performance was achieved by the reservation feature.
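An advance, time-based reservation implies some form of admission control: a request is accepted only if the promised throughput fits within capacity for the whole requested interval. The Python sketch below shows one simple way to do this. It is an assumption made for illustration, not the allocation strategy of the prototype, and `ReservationTable`, its units, and its half-open intervals are all hypothetical.

```python
class ReservationTable:
    """Admission control for I/O throughput over half-open intervals [start, end)."""

    def __init__(self, capacity_mbps):
        self.capacity = capacity_mbps
        self.reservations = []  # list of (start, end, mbps)

    def peak_load(self, start, end):
        """Highest already-reserved throughput at any time in [start, end)."""
        # Load only rises when a reservation begins, so it is enough to check
        # `start` and every reservation start that falls inside the interval.
        points = [start] + [s for s, _, _ in self.reservations if start < s < end]
        return max(
            sum(m for s, e, m in self.reservations if s <= t < e)
            for t in points
        )

    def reserve(self, start, end, mbps):
        """Accept the request only if capacity is never exceeded."""
        if start >= end or self.peak_load(start, end) + mbps > self.capacity:
            return False
        self.reservations.append((start, end, mbps))
        return True


table = ReservationTable(capacity_mbps=100)
print(table.reserve(0, 10, 60))   # accepted: nothing else is reserved
print(table.reserve(5, 15, 50))   # rejected: 60 + 50 exceeds 100 at t = 5
```

Rejecting a request at reservation time, rather than degrading service later, is what allows the accepted reservations to be treated as guarantees.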

European Conference on Parallel Processing | 2014

Applying Selectively Parallel I/O Compression to Parallel Storage Systems

Rosa Filgueira; Malcolm P. Atkinson; Yusuke Tanimura; Isao Kojima

This paper presents a new I/O technique called Selectively Parallel I/O Compression (SPIOC) for providing high-speed storage and access to data in QoS-enabled parallel storage systems. SPIOC reduces the time of I/O operations by applying transparent compression between the computing and the storage systems. SPIOC can predict at runtime whether or not to compress, allowing parallel or sequential compression techniques, guaranteeing QoS, and allowing partial and full reads by decompressing only the minimum part of the file. SPIOC maximises the measured efficiency of data movement by applying compression customised at run time before storing data in the Papio storage system.
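The runtime decision of whether to compress can be illustrated with a short Python sketch. The abstract does not describe how SPIOC makes its prediction, so the heuristic here (compress a small sample with `zlib` and compare the ratio to a threshold) is an assumption, as are the function names and the one-byte marker used to record the choice.

```python
import os
import zlib


def maybe_compress(data, sample_size=4096, threshold=0.9):
    """Compress `data` only if a leading sample suggests it is worthwhile."""
    sample = data[:sample_size]
    if sample and len(zlib.compress(sample)) / len(sample) < threshold:
        return b"Z" + zlib.compress(data)   # marker byte: compressed
    return b"R" + data                      # marker byte: stored raw


def restore(blob):
    """Invert maybe_compress using the marker byte."""
    return zlib.decompress(blob[1:]) if blob[:1] == b"Z" else blob[1:]


text = b"grid " * 10000          # highly repetitive, compresses well
noise = os.urandom(8192)         # random bytes, effectively incompressible

for payload in (text, noise):
    blob = maybe_compress(payload)
    print(blob[:1], len(payload), len(blob))
```

Skipping compression for data that will not shrink avoids spending CPU time for no reduction in bytes written, which is the motivation for making the step selective.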

Computer and Information Technology | 2006

Design and Implementation of Distributed Task Sequencing on GridRPC

Yusuke Tanimura; Hidemoto Nakada; Yoshio Tanaka; Satoshi Sekiguchi

In the framework of GridRPC, a new function that allows direct data transfer between RPC servers is implemented for efficient execution of a Task Sequencing job in a grid environment. In Task Sequencing, RPCs have dependencies between their input and output parameters: the output of one RPC becomes the input of the next. In this study, the direct transfer of data is implemented using the grid filesystem, without breaking the GridRPC programming model and without changing many parts of the existing Ninf-G implementation. Our Task Sequencing API library analyzes RPC arguments to detect intermediate data after task submissions, and reports the information to GridRPC servers so that the intermediate data is created on the grid filesystem. Through our performance evaluation on a LAN and on the Japan-US grid environment, it was verified that the function achieved a performance improvement in distributed Task Sequencing.
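Detecting intermediate data by analysing RPC arguments can be sketched as follows in Python. This is not the Task Sequencing API library: representing each call as a `(name, inputs, outputs)` tuple and the function `find_intermediates` are assumptions made for the example.

```python
def find_intermediates(calls):
    """Map each intermediate argument to (producer, [consumers]).

    An argument is intermediate when one call writes it and a later call
    reads it, so it never needs to travel back to the client.
    """
    produced = {}        # argument name -> call that most recently wrote it
    intermediates = {}
    for name, inputs, outputs in calls:
        for arg in inputs:
            if arg in produced:
                entry = intermediates.setdefault(arg, (produced[arg], []))
                entry[1].append(name)
        for arg in outputs:
            produced[arg] = name
    return intermediates


# A hypothetical three-step sequence submitted by a client.
calls = [
    ("filter", ["raw"], ["clean"]),
    ("transform", ["clean"], ["features"]),
    ("summarize", ["features", "raw"], ["report"]),
]
print(find_intermediates(calls))
```

In this example `clean` and `features` are intermediate, while `raw` (client input) and `report` (final output) are not; only the former pair are candidates for staying on server-side storage between calls.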
