Itsujiro Arita | Researchain

Publication

Featured research published by Itsujiro Arita.

International Conference on Supercomputing | 2000

Table size reduction for data value predictors by exploiting narrow width values

Toshinori Sato; Itsujiro Arita

Speculation in resolving data dependences has recently been studied as a means of extracting more instruction-level parallelism (ILP). The outcome of an instruction is predicted by a value predictor, so the instruction and its dependent instructions can be executed simultaneously, thereby exploiting ILP aggressively. One of the serious hurdles to realizing data speculation is the huge hardware budget of the predictors. In this paper, we propose a technique that reduces the budget by exploiting narrow-width values. The hardware budget of value predictors is reduced by up to 45.1%. Simulation results show that the technique, called the 2-mode scheme, maintains processor performance with only a slight decrease in value prediction accuracy.

International Conference on Parallel Processing | 2001

In search of efficient reliable processor design

Toshinori Sato; Itsujiro Arita

In this paper, we investigate an efficient, reliable processor that can detect and recover from transient faults. There are two driving forces behind the study of fault-tolerance techniques for microprocessors. One is deep submicron fabrication technology: future semiconductor technologies could become more susceptible to alpha particles and other cosmic radiation. The other is the increasing popularity of mobile platforms. Cell phones are now used for applications that are critical to our financial security, such as flight ticket reservation, mobile banking, and mobile trading. In such applications, computer systems are expected to always work correctly. Based on these observations, we have proposed a mechanism that utilizes time redundancy and is based on the instruction reissue technique used to recover from incorrect data speculation. To mitigate the overhead of the fault-tolerance facility, we evaluate some alternative designs and find that speculatively updating branch predictors and removing redundant memory accesses are very effective.
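The time-redundancy idea in this abstract can be illustrated with a minimal sketch: each operation is executed twice on the same unit, the two results are compared, and the operation is reissued when they disagree. The names `FlakyALU`, `OPS`, and `execute_with_time_redundancy`, the retry limit, and the single-bit fault model are all assumptions made for illustration; the abstract does not describe the actual hardware mechanism at this level.

```python
import operator

# Hypothetical operation set for the toy ALU.
OPS = {"add": operator.add, "sub": operator.sub, "and": operator.and_}


class FlakyALU:
    """Toy ALU that flips the lowest result bit on selected calls.

    `faulty_calls` holds the zero-based call numbers that suffer a
    transient fault (an assumed, deterministic fault model).
    """

    def __init__(self, faulty_calls=()):
        self.faulty_calls = set(faulty_calls)
        self.calls = 0

    def __call__(self, op, a, b):
        result = OPS[op](a, b)
        if self.calls in self.faulty_calls:
            result ^= 1  # inject a single-bit transient fault
        self.calls += 1
        return result


def execute_with_time_redundancy(alu, op, a, b, max_reissues=3):
    """Run the operation twice; reissue it if the two results differ."""
    for _ in range(max_reissues + 1):
        first = alu(op, a, b)
        second = alu(op, a, b)  # redundant execution, later in time
        if first == second:
            return first
    raise RuntimeError("results kept disagreeing; fault may not be transient")


alu = FlakyALU(faulty_calls={1})
print(execute_with_time_redundancy(alu, "add", 2, 3))
```

In this sketch a fault on the second execution causes a mismatch, so the pair is reissued and the fault-free pair agrees. Note that comparing two runs only detects faults that affect the runs differently, which is why the model is limited to transient faults.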

International Conference on Parallel Processing | 2000

Partial resolution in data value predictors

Toshinori Sato; Itsujiro Arita

Speculation in resolving data dependences has recently been studied as a means of extracting more instruction-level parallelism (ILP). The outcome of an instruction is predicted by a value predictor, so the instruction and its dependent instructions can be executed simultaneously, thereby exploiting ILP aggressively. One of the serious hurdles to realizing data speculation is the huge hardware budget of the predictors. In this paper, we investigate a technique that reduces the budget by employing partial resolution, that is, using fewer tag address bits than are necessary to uniquely identify every instruction. Simulation results show that only two tag bits are enough to achieve a performance improvement comparable to full resolution, substantially reducing the hardware budget of value predictors.
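As a rough illustration of partial resolution, the sketch below models a direct-mapped last-value predictor whose entries keep only the low `tag_bits` bits of the tag. The class name `PartialTagPredictor`, the table organization, and the choice of a last-value predictor are assumptions for illustration, not the configuration evaluated in the paper.

```python
class PartialTagPredictor:
    """Direct-mapped last-value predictor with a truncated (partial) tag."""

    def __init__(self, index_bits, tag_bits):
        self.index_bits = index_bits
        self.index_mask = (1 << index_bits) - 1
        self.tag_mask = (1 << tag_bits) - 1
        # Each entry is None or a (partial_tag, last_value) pair.
        self.table = [None] * (1 << index_bits)

    def _split(self, pc):
        index = pc & self.index_mask
        # Keep only the low tag_bits of the remaining address bits.
        tag = (pc >> self.index_bits) & self.tag_mask
        return index, tag

    def predict(self, pc):
        """Return the stored value on a (partial) tag match, else None."""
        index, tag = self._split(pc)
        entry = self.table[index]
        if entry is not None and entry[0] == tag:
            return entry[1]
        return None

    def update(self, pc, value):
        """Record the actual outcome of the instruction at pc."""
        index, tag = self._split(pc)
        self.table[index] = (tag, value)


p = PartialTagPredictor(index_bits=4, tag_bits=2)
p.update(0x10, 7)
print(p.predict(0x10))  # same instruction: tag matches
print(p.predict(0x20))  # different partial tag: no prediction
print(p.predict(0x50))  # aliases with 0x10 under a 2-bit tag
```

The last lookup shows the cost of shortening the tag: addresses 0x10 and 0x50 share both index and partial tag, so the predictor returns a value that belongs to another instruction. Such a false hit is harmless for correctness because every value prediction is verified, but it can lower accuracy.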

IEEE International Conference on High Performance Computing Data and Analytics | 2009

Low-Cost Value Predictors Using Frequent Value Locality

Toshinori Sato; Itsujiro Arita

Speculation in resolving data dependences has recently been studied as a means of extracting more instruction-level parallelism (ILP). Each instruction's outcome is predicted by a value predictor, so the instruction and its dependent instructions can be executed in parallel, thereby exploiting ILP aggressively. One of the serious hurdles to realizing data speculation is the huge hardware budget required by the predictors. In this paper, we propose techniques that exploit frequent value locality, resulting in a significant budget reduction. Based on these proposals, we evaluate two value predictors, named the zero-value predictor and the 0/1-value predictor. The zero-value predictor generates only the value 0. Similarly, the 0/1-value predictor generates only the values 0 and 1. Simulation results show that the proposed predictors outperform the last-value predictor, which requires a hardware budget twice as large as that of the proposed predictors. Therefore, the zero-value and 0/1-value predictors are promising candidates for cost-effective, practical value predictors that can be implemented in real microprocessors.
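The two predictors named in this abstract can be sketched as follows. Because they only ever produce 0 (or 0 and 1), an entry needs no full-width value field. The saturating confidence counters, the threshold, the table size, and the class names are assumptions made for illustration; the abstract does not specify how the predictors decide when to predict.

```python
class ZeroValuePredictor:
    """Predicts the value 0, gated by a per-entry confidence counter."""

    def __init__(self, index_bits=10, threshold=2, max_count=3):
        self.mask = (1 << index_bits) - 1
        self.threshold = threshold
        self.max_count = max_count
        self.counters = [0] * (1 << index_bits)

    def predict(self, pc):
        """Return 0 when confident, otherwise None (no prediction)."""
        if self.counters[pc & self.mask] >= self.threshold:
            return 0
        return None

    def update(self, pc, actual):
        i = pc & self.mask
        if actual == 0:
            self.counters[i] = min(self.counters[i] + 1, self.max_count)
        else:
            self.counters[i] = 0  # reset confidence on a non-zero outcome


class ZeroOneValuePredictor:
    """Predicts 0 or 1 using one value bit plus a confidence counter."""

    def __init__(self, index_bits=10, threshold=2, max_count=3):
        self.mask = (1 << index_bits) - 1
        self.threshold = threshold
        self.max_count = max_count
        self.bits = [0] * (1 << index_bits)
        self.counters = [0] * (1 << index_bits)

    def predict(self, pc):
        i = pc & self.mask
        if self.counters[i] >= self.threshold:
            return self.bits[i]
        return None

    def update(self, pc, actual):
        i = pc & self.mask
        if actual in (0, 1) and actual == self.bits[i]:
            self.counters[i] = min(self.counters[i] + 1, self.max_count)
        else:
            self.counters[i] = 0
            if actual in (0, 1):
                self.bits[i] = actual  # remember the new candidate value


z = ZeroValuePredictor()
for outcome in (0, 0):
    z.update(0x40, outcome)
print(z.predict(0x40))
```

Under these assumptions each zero-value entry costs only a small counter, and each 0/1 entry costs one extra bit, which illustrates why such predictors can be much smaller than a last-value predictor that stores a full data word per entry.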

Compilers and Operating Systems for Low Power | 2003

Constructive timing violation for improving energy efficiency

Toshinori Sato; Itsujiro Arita

A novel technique for improving the energy efficiency of microprocessors is presented. The method relies on a fault-tolerance mechanism for timing violations, based on a speculative execution technique. Since power decreases quadratically with supply voltage, reducing the supply voltage can yield substantial power savings. However, a lower supply voltage also lengthens gate delay, so the clock frequency must be reduced to keep the timing constraints of critical paths from being violated. If a fault-tolerance mechanism is provided for timing faults, however, it is not necessary to maintain these constraints. Based on these observations, we propose a fault-tolerance technique for timing violations that efficiently utilizes the speculative execution mechanism and reduces power consumption. We call the technique constructive timing violation. We evaluate the proposal using a cycle-by-cycle simulator and determine the technique's efficiency in terms of energy consumption.
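The voltage, power, and delay relationship that this abstract relies on is usually written with the first-order CMOS model below. This is the standard textbook approximation rather than a formula taken from the paper, and the symbols are assumptions introduced here: $a$ is the switching activity factor, $C$ the switched capacitance, $V_{dd}$ the supply voltage, $f$ the clock frequency, $V_{t}$ the threshold voltage, and $\gamma$ a technology-dependent exponent, typically between 1 and 2.

```latex
\[
  P_{\mathrm{dyn}} \approx a \, C \, V_{dd}^{2} \, f,
  \qquad
  t_{d} \propto \frac{V_{dd}}{\left(V_{dd} - V_{t}\right)^{\gamma}}
\]
```

For example, lowering $V_{dd}$ from 1.0 V to 0.8 V at a fixed clock frequency scales dynamic power by $0.8^{2} = 0.64$, a 36% reduction. The gate delay $t_{d}$ grows at the same time, which is the source of the timing violations the technique is designed to tolerate.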

Pacific Rim International Symposium on Dependable Computing | 2001

Evaluating low-cost fault-tolerance mechanism for microprocessors on multimedia applications

Toshinori Sato; Itsujiro Arita

Using multimedia applications, we evaluate a low-cost fault-tolerance mechanism for microprocessors that can detect and recover from transient faults. There are two driving forces behind the study of fault-tolerance techniques for microprocessors. One is deep submicron fabrication technology: future semiconductor technologies could become more susceptible to alpha particles and other cosmic radiation. The other is the increasing popularity of mobile platforms. Cell phones are now used for applications that are critical to our financial security, such as flight ticket reservation, mobile banking, and mobile trading. In such applications, computer systems are expected to always work correctly. Based on these observations, we propose a mechanism that utilizes time redundancy and is based on the instruction reissue technique used to recover from incorrect data speculation. Unfortunately, we found a significant performance loss when we evaluated the proposal using the SPEC2000 benchmark suite. Here we evaluate it using MediaBench, which contains more practical mobile applications than SPEC2000.

Systems and Computers in Japan | 2003

Combining variable latency pipeline with instruction reuse for execution latency reduction

Toshinori Sato; Itsujiro Arita

Operand bypass logic is likely to be one of the critical structures for future microprocessors to achieve high clock speed. The logic delay causes the execution time budget to be reduced significantly, so that the execution stage is divided into several stages. The variable latency pipeline (VLP) structure has the advantages of pipelining and pseudo-asynchronous design. According to the source operands delivered to arithmetic units, VLP changes execution latency and thus achieves both high speed and low latency for most operands. In this paper we evaluate VLP with dynamically scheduled superscalar processors, using a cycle-by-cycle simulator. Our experimental results show that VLP successfully reduces the effective execution time, and thus relaxes the constraints on the operand bypass logic. We also evaluate the instruction reuse technique in order to support VLP.

Annual Conference on Computers | 1992

Experiments of a reconfigurable multiprocessor simulation on a distributed environment

B.O. Apduhan; T. Sueyoshi; Y. Namiuchi; T. Tezuka; Itsujiro Arita

The experiments and analysis of a reconfigurable multiprocessor simulation on a cluster of workstations connected by Ethernet are presented. The system model and simulation environment are described. The monitoring/debugging tool and the concept of SPP, a proposed parallel programming paradigm that can effectively reduce synchronization operations, are described. The structure of the modules that make up the system software model is also described. The sequential and parallel versions of a computationally intensive program were executed on different network topologies, and their speedup ratios are analyzed and discussed. The crucial issues in realizing reconfigurable multiprocessor simulation in a distributed environment are considered.

Journal of Multimedia | 2008

Navilite: a lightweight indoor location-aware mobile navigation service for the handicapped and the elderly

Toshihiro Uchibayashi; Bernady O. Apduhan; Itsujiro Arita

In visual secret sharing (VSS) schemes, a secret image can be visually revealed by overlapping shadow images without additional computation. However, much of the contrast of the reconstructed image is lost. Employing a reversing operation to reverse black and white pixels, together with increasing the number of encoding runs, is an effective way to improve the contrast. A novel VSS scheme with reversing is presented in this paper. It achieves ideal contrast within only [m/h] encoding runs (where m and h are the total number of columns and the number of whole-white columns in the basis matrix used to encode white pixels, respectively), and no pixel expansion occurs. It encodes the secret image block by block. A block consists of m pixels, which means that m pixels together take part in each encoding step. The scheme is suitable for all access structures and can be applied to encrypt black-and-white, gray-scale, and chromatic images. The experimental results, analyses, and comparisons show that the proposed scheme is optimal among schemes with reversing in terms of encoding runs, pixel expansion, complexity, system capacity, and encoding efficiency.

ACM SIGARCH Computer Architecture News | 2003

A trace-level value predictor for Contrail processors

Takenori Koushiro; Toshinori Sato; Itsujiro Arita

Contrail processors utilize multithreading to improve energy efficiency. In Contrail, the execution of an application is divided into two streams. One is called the speculation stream. It consists of the main part of the execution and is dispatched to the fast functional units. However, several regions of the execution are skipped by utilizing trace-level value prediction. The other stream is called the verification stream. It supports the speculation stream by verifying each data prediction, and is dispatched to the slow units. The key idea is that trace-level value prediction translates each critical path into a non-critical one and moves it from the speculation stream into the verification stream, so that the non-critical instructions are executed on the slow units. In this paper, we investigate a trace-level value predictor for Contrail processors.
