Kenichiro Anjo | Researchain

Publication

Featured research published by Kenichiro Anjo.

field-programmable technology | 2004

Stream applications on the dynamically reconfigurable processor

Masayasu Suzuki; Yohei Hasegawa; Yutaka Yamada; Naoto Kaneko; Katsuaki Deguchi; Hideharu Amano; Kenichiro Anjo; Masato Motomura; Kazutoshi Wakabayashi; Takao Toi; Toru Awashima

The Dynamically Reconfigurable Processor (DRP) developed by NEC Electronics is a coarse-grain reconfigurable processor that selects a data path from an on-chip repository of sixteen circuit configurations, or contexts, to implement different logic on a single DRP chip. Several stream applications have been implemented on the DRP-1, the first prototype chip, and evaluation results are presented. By pipelining execution, the DRP-1 outperformed the Pentium III/4, the embedded CPU MIPS64, and the Texas Instruments DSP TMS320C6713 in some stream application examples. We also present programming techniques applicable to dynamically reconfigurable processors and discuss their feasibility in boosting system performance.

field-programmable logic and applications | 2003

Reducing the configuration loading time of a coarse grain multicontext reconfigurable device

Toshiro Kitaoka; Hideharu Amano; Kenichiro Anjo

High-speed, low-cost configuration loading methods for a coarse-grain multicontext reconfigurable device, the DRP (Dynamically Reconfigurable Processor), are proposed and implemented. In these methods, the configuration data is compressed on the host computer before loading and decoded at load time by circuits implemented on part of the device's logic. Unlike in conventional reconfigurable devices, in multicontext reconfigurable devices the logic for the decoder circuits is switched out for the application circuits immediately after loading. Thus, the decoder does not occupy chip real estate during execution. Two compression methods, LZSS-ARC and Selective coding, are implemented and evaluated. LZSS-ARC achieves a better compression ratio, while Selective coding can work at the same frequency as the data loading.

IEEE Transactions on Parallel and Distributed Systems | 2006

A Simple Data Transfer Technique Using Local Address for Networks-on-Chips

Michihiro Koibuchi; Kenichiro Anjo; Yutaka Yamada; Akiya Jouraku; Hideharu Amano

Networks-on-chips (NoCs) have been studied as a way to connect a number of modules in a chip by introducing a network structure similar to that of parallel computers. Since embedded streaming applications usually generate predictable small-sized data traffic, the network structure can be customized to the target traffic. Accordingly, we develop a data transfer technique for simplifying routers for predictable small-sized communication in simple tile-based architectures. A data structure is split into single-flit packets, and a label is attached to each of them in order to route them independently. The label is transferred on dedicated wires beside the data lines in a channel, taking advantage of the relaxed pin count limitations of an on-chip channel. To reduce the wiring area for the label, the label is locally assigned according to a preanalysis of the required communication pairs of nodes. Analysis results show that a 3-bit local label is sufficient to route all data of the evaluated streaming applications in the case of a 16-node 2D torus. The required amount of hardware for a router is reduced by 37 percent compared with that for a wormhole packet router with the same number of routing table entries.

international interconnect technology conference | 2000

Clock distribution networks with on-chip transmission lines

Masayuki Mizuno; Kenichiro Anjo; Yoshikazu Sumi; Muneo Fukaishi; Hitoshi Wakabayashi; Tohru Mogami; Tadahiko Horiuchi; Masakazu Yamashina

Today's fabrication process scaling enables on-chip lossy transmission lines to be used for long interconnects and high-speed clocking. Advantages and design tradeoffs of on-chip transmission lines are discussed, and a 100-mm² 5-GHz clocking chip using on-chip transmission lines is introduced.

field-programmable custom computing machines | 2004

Implementing and evaluating stream applications on the dynamically reconfigurable processor

Noriaki Suzuki; Shunsuke Kurotaki; Masayasu Suzuki; Naoto Kaneko; Yutaka Yamada; Katsuaki Deguchi; Yohei Hasegawa; Hideharu Amano; Kenichiro Anjo; Masato Motomura; Kazutoshi Wakabayashi; Takao Toi; Toru Awashima

The Dynamically Reconfigurable Processor (DRP) developed by NEC Electronics is a coarse-grain reconfigurable processor that selects a data path from an on-chip repository of sixteen circuit configurations, or contexts, to implement different logic on a single DRP chip. Several stream applications have been implemented on the DRP-1, the first prototype chip, and evaluation results are presented. By computing in parallel using the processing elements (PEs) and distributed memory modules, the DRP-1 outperformed the Pentium III/4 and the embedded CPU MIPS64 in some stream application examples. We also present programming techniques applicable to reconfigurable processors and discuss their feasibility in boosting system performance.

field-programmable logic and applications | 2003

A Dynamically Adaptive Switching Fabric on a Multicontext Reconfigurable Device

Hideharu Amano; Akiya Jouraku; Kenichiro Anjo

A framework for dynamically adaptive hardware mechanisms on multicontext reconfigurable devices is proposed, and as an example, an adaptive switching fabric is implemented on NEC's novel reconfigurable device, the DRP (Dynamically Reconfigurable Processor).

embedded and ubiquitous computing | 2004

Folded Fat H-Tree: An Interconnection Topology for Dynamically Reconfigurable Processor Array

Yutaka Yamada; Hideharu Amano; Michihiro Koibuchi; Akiya Jouraku; Kenichiro Anjo; Katsunobu Nishimura

Fat H-Tree is a novel on-chip network topology for a dynamically reconfigurable processor array. It includes both fat tree and torus structures, and is suitable for mapping tasks in stream processing. For on-chip implementation, a folded layout is also proposed. Evaluation results show that Fat H-Tree reduces the distance relative to H-Tree by 13% to 55%, and increases the throughput almost threefold.

symposium on frontiers of massively parallel computation | 1999

The preliminary evaluation of MBP-light with two protocol policies for a massively parallel processor-JUMP-1

Hiroaki Inoue; Kenichiro Anjo; Junji Yamamoto; Jun Tanabe; Masaki Wakabayashi; Mitsuru Sato; Hideharu Amano; Kei Hiraki

A massively parallel processor called JUMP-1 has been developed to build an efficient cache-coherent distributed shared memory (DSM) on a large system with more than 1000 processors. Here, the dedicated processor called MBP (Memory Based Processor)-light, which manages the DSM of JUMP-1, is introduced, and its preliminary performance with two protocol policies (update and invalidate) is evaluated. Simulation results indicate that, under both policies, simple operations such as the tag check and the collection and generation of acknowledgment packets are mostly processed by the hardware mechanisms in MBP-light without the aid of the core processor. Also, the buffer-register architecture adopted by the core processor in MBP-light is exploited well enough to process a protocol transaction under both policies.

asia and south pacific design automation conference | 1997

The RDT network router chip

Hiroaki Nishi; Hideharu Amano; Katsunobu Nishimura; Kenichiro Anjo; Tomohiro Kudoh

The RDT network router chip is a versatile router for the massively parallel computer prototype JUMP-1. The major goal of this project is to establish techniques for building an efficient distributed shared memory on a massively parallel processor. For this purpose, the reduced hierarchical bit-map directory (RHBD) schemes are used for efficient cache management of the distributed shared memory. In order to implement the RHBD schemes efficiently, we proposed a novel interconnection network, RDT (recursive diagonal torus), and developed a sophisticated router chip for the RDT that is equipped with a deadlock-free hierarchical multicast mechanism and an acknowledge-combining mechanism. By using 0.5-μm BiCMOS SOG technology, it can transfer all packets synchronized with a unique CPU clock (60 MHz). Long coaxial cables are directly driven by the ECL interface of this chip. The mixed design approach with schematics and VHDL permitted the development of this complicated chip, with 90,522 gates, in a year.

Archive | 2003