Publication


Featured research published by Yasuhiro Take.


Asia and South Pacific Design Automation Conference | 2013

A case for wireless 3D NoCs for CMPs

Hiroki Matsutani; Paul Bogdan; Radu Marculescu; Yasuhiro Take; Daisuke Sasaki; Hao Zhang; Michihiro Koibuchi; Tadahiro Kuroda; Hideharu Amano

Inductive coupling is yet another 3D integration technique that can be used to stack more than three known-good-dies in a SiP without wire connections. We present a topology-agnostic 3D CMP architecture using inductive coupling that offers great flexibility in customizing the number of processor chips, SRAM chips, and DRAM chips in a SiP after the chips have been fabricated. In this paper, first, we propose a routing protocol that exchanges network information between all chips in a given SiP to establish efficient deadlock-free routing paths. Second, we propose an optimization technique that analyzes the application traffic patterns and selects spanning tree roots that minimize the average hop count and improve application performance.
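The root-selection idea in the abstract above can be illustrated with a small sketch. It is not the authors' algorithm: it assumes packets travel only along spanning-tree edges (a simplification of spanning-tree-based deadlock-free routing), builds the tree by breadth-first search, and scores each candidate root by traffic-weighted average hop count. The names `bfs_tree`, `tree_hops`, and `best_root` are hypothetical.

```python
from collections import deque


def bfs_tree(adj, root):
    """Return the parent map of a BFS spanning tree rooted at `root`."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return parent


def tree_hops(parent, src, dst):
    """Hop count from src to dst when packets follow tree edges only (assumption)."""
    def path_to_root(node):
        path = []
        while node is not None:
            path.append(node)
            node = parent[node]
        return path

    up = path_to_root(src)
    depth_from_dst = {node: i for i, node in enumerate(path_to_root(dst))}
    for i, node in enumerate(up):
        if node in depth_from_dst:  # first common ancestor
            return i + depth_from_dst[node]
    raise ValueError("nodes are not in the same tree")


def best_root(adj, traffic):
    """Return (root, average hops) minimizing the traffic-weighted hop count."""
    total = sum(traffic.values())
    best = None
    for root in sorted(adj):
        parent = bfs_tree(adj, root)
        avg = sum(w * tree_hops(parent, s, d) for (s, d), w in traffic.items()) / total
        if best is None or avg < best[1]:
            best = (root, avg)
    return best


# Hypothetical 4-router ring with heavy traffic between routers 2 and 3.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
traffic = {(2, 3): 10, (0, 1): 1}
print(best_root(ring, traffic))
```

In this toy ring, a tree rooted at router 0 leaves out the 2-3 link, so the heavy flow takes three hops; a root that keeps that link in the tree brings it down to one hop.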


IEEE Transactions on Computers | 2014

3D NoC with Inductive-Coupling Links for Building-Block SiPs

Yasuhiro Take; Hiroki Matsutani; Daisuke Sasaki; Michihiro Koibuchi; Tadahiro Kuroda; Hideharu Amano

A wireless 3D NoC architecture is described for building-block SiPs, in which the number of hardware components (or chips) in a package can be changed after chips have been fabricated. The architecture uses inductive-coupling links that can connect more than two examined dies without wire connections. Each chip has data transceivers for the uplink and downlink in order to communicate with its neighboring chips in the package. These chips form a vertical unidirectional ring network so as to fully exploit the flexibility of the wireless approach that enables us to add, remove, and swap the chips in the ring. To avoid protocol and structural deadlocks in the ring, we use bubble flow control, which does not rely on the conventional VC-based deadlock avoidance mechanism. In addition, we propose a bidirectional communication scheme to form a bidirectional ring network by using the inductive-coupling transceivers that can dynamically change the communication modes, such as TX, RX, and Idle modes. This paper illustrates the inductive-coupling transceiver circuits, which can carry high data transfer rates of up to 8 Gbps per channel, for the wireless 3D NoC. It also illustrates an implementation of a wireless 3D NoC that has on-chip routers and transceivers implemented with a 65 nm process in order to show the feasibility of our proposal. The vertical bubble flow control and conventional VC-based approach on the uni- and bidirectional ring networks are compared with the vertical broadcast bus in terms of throughput, hardware amount, and application performance using a full system multiprocessor simulator. The results show that the proposed bidirectional communication scheme efficiently improves application performance without adding any inductive-coupling transceivers. In addition, the proposed vertical bubble flow network outperforms the conventional VC-based approach by 7.9-12.5 percent with a 33.5 percent smaller router area for building-block SiPs connecting up to eight chips.
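As a rough illustration of bubble flow control on a unidirectional ring, the sketch below uses the textbook form of the rule: a packet already in the ring may advance when the next buffer has one free packet slot, while a newly injected packet needs two free slots so that at least one "bubble" always remains. Whether the paper applies the rule in exactly this form is an assumption, and the buffer model (one packet-sized slot counter per router) and function names are hypothetical.

```python
def try_inject(occupancy, capacity, node):
    """Inject a packet at `node` only if two slots are free, leaving a bubble."""
    if capacity - occupancy[node] >= 2:
        occupancy[node] += 1
        return True
    return False


def try_advance(occupancy, capacity, node):
    """Move one packet from `node` to the next router if one slot is free there."""
    nxt = (node + 1) % len(occupancy)
    if occupancy[node] > 0 and capacity - occupancy[nxt] >= 1:
        occupancy[node] -= 1
        occupancy[nxt] += 1
        return True
    return False


# Hypothetical ring of four routers, each buffering up to two packets.
ring = [0, 0, 0, 0]
for node in range(len(ring)):
    while try_inject(ring, 2, node):
        pass
print(ring)                      # every buffer keeps one free slot
print(try_advance(ring, 2, 0))   # so in-ring packets can still move
```

Because an injection always leaves one slot free and advancing a packet does not change the total occupancy, the ring in this model can never become completely full, which is the property that avoids ring deadlock without extra virtual channels.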


Design, Automation, and Test in Europe | 2014

Low-latency wireless 3D NoCs via randomized shortcut chips

Hiroki Matsutani; Michihiro Koibuchi; Ikki Fujiwara; Takahiro Kagami; Yasuhiro Take; Tadahiro Kuroda; Paul Bogdan; Radu Marculescu; Hideharu Amano

In this paper, we demonstrate that we can reduce the communication latency significantly by inserting a fraction of randomness into a wireless 3D NoC (where CMOS wireless links are used for vertical inter-chip communication) when considering the physical constraints of the 3D design space. To this end, we consider two cases, namely 1) replacing existing horizontal 2D links in a wireless 3D NoC with randomized shortcut NoC links and 2) enabling full connectivity by adding a randomized NoC layer to a wireless 3D platform with partial or no horizontal connectivity. Consequently, the packet routing is optimized by exploiting both the existing and the newly added random NoC. At the same time, by adding randomly wired shortcut NoCs to a wireless 3D platform, a good balance can be established between the modularity of the design and the minimum randomness needed to achieve low latency. Experimental results show that adding a random NoC chip to wireless 3D CMPs without built-in horizontal connectivity reduces the communication latency by as much as 26.2% compared to adding a 2D mesh NoC. The application execution time and average flit transfer energy also improve accordingly.
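To see why a few random long-range links can cut latency, the following sketch compares the average hop count of a plain 2D mesh with the same mesh plus randomly placed shortcut links. It is only an illustration of the general effect, not the paper's design or evaluation: the 4x4 size, the number of shortcuts, the fixed seed, and all function names are assumptions.

```python
import random
from collections import deque
from itertools import product


def mesh_links(n):
    """Undirected links of an n x n 2D mesh, as pairs of (x, y) nodes."""
    links = set()
    for x, y in product(range(n), repeat=2):
        if x + 1 < n:
            links.add(((x, y), (x + 1, y)))
        if y + 1 < n:
            links.add(((x, y), (x, y + 1)))
    return links


def average_hops(nodes, links):
    """Average shortest-path hop count over all reachable ordered node pairs."""
    adj = {v: [] for v in nodes}
    for a, b in links:
        adj[a].append(b)
        adj[b].append(a)
    total = pairs = 0
    for src in nodes:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


def add_random_shortcuts(nodes, links, count, seed=0):
    """Return the link set extended with `count` new random links (seeded)."""
    rng = random.Random(seed)
    result = set(links)
    while len(result) < len(links) + count:
        a, b = rng.sample(nodes, 2)
        if (a, b) not in result and (b, a) not in result:
            result.add((a, b))
    return result


nodes = list(product(range(4), repeat=2))
mesh = mesh_links(4)
shortcut = add_random_shortcuts(nodes, mesh, count=8)
print(average_hops(nodes, mesh), average_hops(nodes, shortcut))
```

Adding links can never lengthen a shortest path, so the second figure is at most the first; how much it drops depends on where the random links land.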


Networks on Chips | 2011

A vertical bubble flow network using inductive-coupling for 3-D CMPs

Hiroki Matsutani; Yasuhiro Take; Daisuke Sasaki; Masayuki Kimura; Yuki Ono; Yukinori Nishiyama; Michihiro Koibuchi; Tadahiro Kuroda; Hideharu Amano

A wireless 3-D NoC architecture for CMPs is proposed in which the number of processor and cache chips stacked in a package can be changed after chip fabrication. It uses inductive-coupling technology that can connect more than two known-good-dies without wire connections. Each chip has data transceivers for uplink and downlink in order to communicate with its neighboring chips in the package. These chips form a single vertical ring network so as to fully exploit the flexibility of the wireless approach that enables us to add, remove, and swap the chips in the ring. To avoid protocol and structural deadlocks in the ring network, we use bubble flow control, which is more flexible and efficient than conventional VC-based deadlock avoidance. We implemented a real 3-D chip that has on-chip routers and inductive-coupling data transceivers using a 65nm process in order to show the feasibility of our proposal. The vertical bubble flow control is compared with the conventional VC-based approach and a vertical bus in terms of throughput, hardware amount, and application performance using a full system CMP simulator. The results show that the proposed vertical bubble flow network outperforms the VC-based approach by 7.9%-12.5% with a 33.5% smaller router area.


IEEE Micro | 2013

A Scalable 3D Heterogeneous Multicore with an Inductive ThruChip Interface

Noriyuki Miura; Yusuke Koizumi; Yasuhiro Take; Hiroki Matsutani; Tadahiro Kuroda; Hideharu Amano; Ryuichi Sakamoto; Mitaro Namiki; Kimiyoshi Usami; Masaaki Kondo; Hiroshi Nakamura

The authors developed a scalable heterogeneous multicore processor. 3D heterogeneous chip stacking of a general-purpose CPU and reconfigurable multicore accelerators enables various trade-offs between performance and energy consumption. The stacked chips interconnect through a scalable 3D network on a chip (NoC). By simply changing the number of stacked accelerator chips, processor parallelism can be widely scaled. No design change is needed, and hence, no additional nonrecurring engineering (NRE) cost is required. An inductive-coupling ThruChip Interface (TCI) is applied to stacked-chip communications, forming a low-cost and robust high-speed 3D NoC. The authors developed a prototype system called Cube-1 with 65-nm CMOS test chips, and confirmed successful system operations, including 10 hours of continuous Linux OS operation. Simple filters and a streaming application were implemented on Cube-1 and performance acceleration up to about three times was achieved.


Symposium on VLSI Circuits | 2010

Simultaneous 6Gb/s data and 10mW power transmission using nested clover coils for non-contact memory card

Yuxiang Yuan; Andrzej Radecki; Noriyuki Miura; Iori Aikawa; Yasuhiro Take; Hiroki Ishikuro; Tadahiro Kuroda

This paper presents a non-contact memory card and a host employing simultaneous data and power transmission through inductive coupling. Nested clover-shaped data coils are proposed for reducing interference from the power link. The host wirelessly tracks the current consumption of the card and adjusts the transmit power to improve power transfer efficiency. The prototype is implemented in 65nm CMOS. It achieves a 6Gb/s data rate and almost 10% power transfer efficiency over a load range of 100 Ω to 2 kΩ.


IEEE Micro | 2013

A scalable 3D heterogeneous multi-core processor with inductive-coupling thruchip interface

Noriyuki Miura; Yusuke Koizumi; Eiichi Sasaki; Yasuhiro Take; Hiroki Matsutani; Tadahiro Kuroda; Hideharu Amano; Ryuichi Sakamoto; Mitaro Namiki; Kimiyoshi Usami; Masaaki Kondo; Hiroshi Nakamura

A scalable heterogeneous multi-core processor is developed. 3D heterogeneous chip stacking of a general-purpose CPU and reconfigurable multi-core accelerators improves computational energy efficiency through proper task assignment and massively parallel computing. The stacked chips interconnect through a scalable 3D Network on Chip (NoC). By simply changing the number of stacked accelerator chips, processor parallelism can be widely scaled. In combination with Dynamic Voltage and Frequency Scaling (DVFS), the energy efficiency can be optimized for various performance requirements. No design change is needed, and hence no additional Non-Recurring Engineering (NRE) cost is incurred. An inductive-coupling ThruChip Interface (TCI) is applied to stacked-chip communications, forming a low-cost and robust high-speed 3D NoC. A prototype demonstration system has been developed with 65nm CMOS test chips. Successful system operations, including 10 hours of continuous Linux OS operation, are confirmed for the first time.


International Solid-State Circuits Conference | 2011

A 2.7Gb/s/mm² 0.9pJ/b/chip 1coil/channel ThruChip interface with coupled-resonator-based CDR for NAND Flash memory stacking

Noriyuki Miura; Yasuhiro Take; Mitsuko Saito; Yoichi Yoshida; Tadahiro Kuroda

This paper presents an inductive-coupling interface for NAND Flash memory stacking whose bandwidth per unit area is 2.7Gb/s/mm² and whose energy consumption per chip is 0.9pJ/b/chip. The bandwidth is increased by 10× (in other words, the layout area is reduced to 1/10 for the same data rate), and the energy consumption is reduced by half, both compared to the latest research results [1]. A relayed transmission scheme using one coil is proposed to reduce the number of coils in a data link. Coupled resonance is utilized for clock and data recovery (CDR) for the first time, eliminating the source-synchronous clock link. As a result, the total number of coils needed to form a channel is reduced from 6 to 1, yielding a significant improvement in data rate, layout area, and energy consumption.


IEEE Journal of Solid-State Circuits | 2011

A 30 Gb/s/Link 2.2 Tb/s/mm²

Yasuhiro Take; Noriyuki Miura; Tadahiro Kuroda

This paper presents a 30 Gb/s/link 2.2 Tb/s/mm² inductive-coupling link for a high-speed DRAM interface. The data rate per layout area is the highest among DRAM interfaces reported up to now. The proposed interface employs a high-speed injection-locking CDR technique that utilizes the derivative property of inductive coupling. Compared to conventional injection-locking CDR based on an XOR edge detector, the proposed technique doubles the operation speed and increases the data rate to 30 Gb/s/link. As a result, the data rate per layout area is increased to 2.2 Tb/s/mm², which is 2× that of the state-of-the-art inductive-coupling link and 22× that of the state-of-the-art wired link.
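The figures quoted in the abstract above can be related by simple arithmetic. The values below are derived only from the stated numbers (30 Gb/s/link, 2.2 Tb/s/mm², and the 2× and 22× ratios); they are implied estimates, not figures reported by the authors, and the symbols are introduced here for illustration.

```latex
\begin{align*}
A_{\text{link}} &= \frac{30\ \text{Gb/s}}{2.2\ \text{Tb/s/mm}^2}
  = \frac{30}{2200}\ \text{mm}^2 \approx 0.0136\ \text{mm}^2 \\
D_{\text{inductive, prior}} &\approx \frac{2.2}{2}\ \text{Tb/s/mm}^2 = 1.1\ \text{Tb/s/mm}^2 \\
D_{\text{wired, prior}} &\approx \frac{2.2}{22}\ \text{Tb/s/mm}^2 = 0.1\ \text{Tb/s/mm}^2
\end{align*}
```

Here A_link is the layout area implied per link, and the two D values are the data-rate densities implied for the earlier inductive-coupling and wired links that the abstract compares against.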


IEEE Transactions on Very Large Scale Integration Systems | 2016

^{2}

Takahiro Kagami; Hiroki Matsutani; Michihiro Koibuchi; Yasuhiro Take; Tadahiro Kuroda; Hideharu Amano

Wireless 3-D networks-on-chip (NoCs) with inductive-coupling ThruChip interfaces provide a large degree of flexibility for customizing the number of arbitrary chips in a package after the chips have been fabricated. To simplify the vertical communication interfaces, static time division multiple access (TDMA) is used for the vertical broadcast buses, while arbitrary or customized topologies can be used for the intrachip network. This paper proposes two techniques to overcome the limitations of simple static TDMA-based vertical buses while maintaining a simple communication interface. The first technique is headfirst sliding (HS) routing, which reduces the waiting time for acquiring a communication time slot. HS routing selects the best vertical bus based on the current time, taking advantage of static TDMA. The second technique extends carrier sense multiple access with collision detection (CSMA/CD) to vertical broadcast buses. We introduce a packet collision detection technique for inductive-coupling buses and propose two retransmission strategies to reduce the waiting time for packet retransmissions caused by collisions. Network simulation results show that HS routing reduces the communication latency by 39.1% compared with the conventional static TDMA bus-based 3-D NoC that uses shortest path routing. The proposed CSMA/CD bus also improves the latency by 52.5% and the throughput by 34.1%. The full-system simulation results show that HS routing and the proposed CSMA/CD technique reduce the application execution time accordingly while keeping the average flit transfer energy overhead modest.
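The idea of choosing a vertical bus by the current time, as described above for HS routing, can be sketched under a simple model. The model is an assumption made here for illustration and is not taken from the paper: each hop costs one cycle, the owner of a bus at cycle t is `(t + offset) % num_chips` with a per-bus `offset`, and a packet picks the bus that minimizes horizontal hops plus slot waiting time. The function names and the tuple layout are hypothetical.

```python
def slot_wait(time, chip, num_chips, offset):
    """Cycles from `time` until `chip` next owns a bus with the given offset."""
    return (chip - offset - time) % num_chips


def pick_bus(now, chip, num_chips, buses):
    """Choose the bus with the lowest hops-plus-wait cost.

    `buses` holds (name, hops_to_bus, hops_from_bus_to_destination, offset).
    Returns (name, total_cycles).
    """
    best = None
    for name, hops_in, hops_out, offset in buses:
        arrival = now + hops_in  # cycle at which the packet reaches the bus
        total = hops_in + slot_wait(arrival, chip, num_chips, offset) + hops_out
        if best is None or total < best[1]:
            best = (name, total)
    return best


# Hypothetical case: chip 2 of 4, two candidate buses with different schedules.
buses = [("near", 1, 1, 2), ("far", 2, 1, 0)]
print(pick_bus(now=0, chip=2, num_chips=4, buses=buses))
```

In this example a pure shortest-path choice would take the nearer bus and then wait three cycles for its slot, while the time-aware choice reaches the farther bus exactly when the slot opens.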

Collaboration


Dive into Yasuhiro Take's collaborations.

Top Co-Authors

Michihiro Koibuchi

National Institute of Informatics
