Seidai Takeda
University of Tokyo
Publication
Featured research published by Seidai Takeda.
international conference on computer design | 2008
Naomi Seki; Lei Zhao; Jo Kei; Daisuke Ikebuchi; Yu Kojima; Yohei Hasegawa; Hideharu Amano; Toshihiro Kashima; Seidai Takeda; Toshiaki Shirai; Mitsutaka Nakata; Kimiyoshi Usami; Tetsuya Sunata; Jun Kanai; Mitaro Namiki; Masaaki Kondo; Hiroshi Nakamura
Fine-grain dynamic power gating is proposed to reduce the leakage power of a MIPS R3000 through sleep control and is applied to a processor pipeline. The execution unit is divided into four small units: multiplier, divider, shifter, and other (CLU). The power to each unit is cut off dynamically, based on the operation. We taped out the prototype chip Geyser-0, which provides an R3000 core with the power reduction technique, 16 KB caches, and a translation lookaside buffer (TLB) in 90 nm CMOS technology. Evaluation results for four benchmark programs for embedded applications show that leakage power is reduced by 47% on average with a 41% area overhead.
asia and south pacific design automation conference | 2010
Daisuke Ikebuchi; Naomi Seki; Yu Kojima; M. Kamata; Lei Zhao; Hideharu Amano; Toshiaki Shirai; Satoshi Koyama; Tatsunori Hashida; Y. Umahashi; Hiroki Masuda; Kimiyoshi Usami; Seidai Takeda; Hiroshi Nakamura; Mitaro Namiki; Masaaki Kondo
Geyser-1 is a MIPS CPU that provides fine-grained run-time power gating (PG) controlled by instructions. Unlike traditional PG, it uses special standard cells in which the virtual ground (VGND) is separated from the real ground, and a certain number of sleep transistors are inserted for quick power shut-down and wake-up. In Geyser-1, the fine-grained run-time PG is applied to computational modules in the execution stage. Power shut-down and wake-up are controlled at the architectural and software levels. This implementation is the first available CPU with this type of run-time PG technique. Geyser-1 has PG that is fine-grained in both time and space, and it works well on a real chip.
asian solid state circuits conference | 2009
Daisuke Ikebuchi; Naomi Seki; Yu Kojima; M. Kamata; Lei Zhao; Hideharu Amano; Toshiaki Shirai; Satoshi Koyama; Tatsunori Hashida; Y. Umahashi; Hiroki Masuda; Kimiyoshi Usami; Seidai Takeda; Hiroshi Nakamura; Mitaro Namiki; Masaaki Kondo
Geyser-1, a prototype MIPS R3000 CPU with fine-grain runtime power gating (PG) for the major computational components in the execution stage, is available. Function units such as the CLU, shifter, multiplier, and divider are power-gated and controlled at runtime so that only the function unit to be used is powered on, minimizing leakage power. Evaluation results on the real chip reveal that the fine-grain runtime PG mechanism works without electrical problems. It reduces leakage power by 7% at 25 °C and by 24% at 80 °C. Evaluation results using benchmark programs show that power consumption can be reduced by 3% at 25 °C and by 30% at 80 °C.
international conference on vlsi design | 2009
Kimiyoshi Usami; Toshiaki Shirai; Tatsunori Hashida; Hiroki Masuda; Seidai Takeda; Mitsutaka Nakata; Naomi Seki; Hideharu Amano; Mitaro Namiki; Masashi Imai; Masaaki Kondo; Hiroshi Nakamura
This paper describes a design and implementation methodology for fine-grain power gating. Since sleep-in and wakeup are controlled at a fine granularity at run time, shortening the transition time between the sleep and active states is strongly required. In particular, shortening the wakeup time is essential because it affects the execution time and hence the performance. However, this requirement makes suppression of the ground bounce more difficult. We propose a novel technique that skews the wakeup timings of fine-grain local power domains to suppress the ground bounce. The delay of the buffers driving the power switches is skewed in the buffer tree by selectively downsizing them. We designed a MIPS R3000 based CPU core in a 90 nm CMOS technology and applied our technique to the internal function units. Simulation results showed that our technique reduces the rush current to 47% of that in the case where the power switches are turned on simultaneously. This suppressed the ground bounce to 53 mV with a 3.3 ns wakeup time. Simulation results from running benchmark programs showed that the total power dissipation of the function units was reduced by up to 15% at 25 °C and by 62% at 100 °C. The effectiveness of the power savings is discussed from the viewpoint of the temperature-dependent break-even points and the consecutive idle time in the program.
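The effect of skewing wakeup timings can be illustrated with a toy model. The sketch below is not the paper's circuit model: it assumes each local power domain draws a rectangular current pulse of fixed width when its switch turns on, and it works in integer time steps. The names `peak_rush_current`, `wakeup_time`, `pulse_width`, and `pulse_amp` are hypothetical.

```python
def peak_rush_current(turn_on_times, pulse_width, pulse_amp=1.0):
    """Peak of the summed current when each domain draws a rectangular pulse.

    Assumption: turn_on_times is a non-empty list of non-negative integers,
    one turn-on time per local power domain, in arbitrary time steps.
    """
    end = max(turn_on_times) + pulse_width
    peak = 0.0
    for t in range(end):
        # Sum the pulses of all domains that are still charging at time t.
        total = sum(pulse_amp for t0 in turn_on_times if t0 <= t < t0 + pulse_width)
        peak = max(peak, total)
    return peak


def wakeup_time(turn_on_times, pulse_width):
    """Time at which the last domain has finished charging."""
    return max(turn_on_times) + pulse_width


simultaneous = [0, 0, 0, 0]   # all switches turned on together
skewed = [0, 3, 6, 9]         # turn-on delayed domain by domain

print(peak_rush_current(simultaneous, 3), wakeup_time(simultaneous, 3))
print(peak_rush_current(skewed, 3), wakeup_time(skewed, 3))
```

In this toy model, skewing lowers the peak current at the cost of a longer wakeup time, which is the trade-off the abstract describes.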
field-programmable technology | 2008
Yoshiki Saito; Tomoaki Shirai; Takuro Nakamura; Takashi Nishimura; Yohei Hasegawa; Satoshi Tsutsumi; Toshihiro Kashima; Mitsutaka Nakata; Seidai Takeda; Kimiyoshi Usami; Hideharu Amano
One of the benefits of a coarse-grained dynamically reconfigurable processor array (DRPA) is its low dynamic power consumption, achieved by operating a number of processing elements (PEs) in parallel at a low clock frequency. However, in future advanced processes, leakage power will occupy a considerable part of the total power consumption, and it may degrade the advantage of DRPAs. In order to reduce the leakage power, fine-grained power gating (PG) is applied to a DRPA, MuCCRA-2.32b, and the leakage power and area overhead are measured. We evaluated the effect of two control modes, Pair and Unit Individual, based on layout design and real applications. The results indicate that by applying PG to the ALUs and SMUs in the PEs individually, leakage power can be reduced by 48% with a 9.0% area overhead.
international conference on ic design and technology | 2009
Kimiyoshi Usami; Mitsutaka Nakata; Toshiaki Shirai; Seidai Takeda; Naomi Seki; Hideharu Amano; Hiroshi Nakamura
In a 32b×32b multiplier, when the bit width of both operands is less than 16 bits, the upper array of the multiplier, which computes the upper bits of the product, does not need to operate and hence wastes leakage energy. We propose a technique to control run-time power gating (RTPG) for the upper array by dynamically detecting the operand width. Since RTPG suffers from the energy overhead of turning the power switches on and off, the sleep time at each sleep event should be longer than the break-even time (BET) to yield energy savings. Using an analytical model we built, we show that the BET decreases exponentially with higher temperature. Since the chip temperature goes up during operation, the sleep time becomes more likely to exceed the shortened BET, leading to increased energy savings. We evaluated our technique by designing a 32b×32b multiplier and implementing it in a commercial 90 nm CMOS technology. Post-layout simulation results showed that the BET decreases from 32 cycles at 25 °C to 10 cycles at 65 °C and to 3 cycles at 100 °C at 100 MHz. We also simulated energy dissipation by incorporating our multiplier into a MIPS R3000 based CPU and running a JPEG encoding program. Results showed that our technique reduces energy by 5% at 65 °C and by 39% at 100 °C over the PG-disabled case, even counting the overhead. In contrast, energy increased by 36% at 25 °C. The ground bounce at wakeup was effectively suppressed to 91 mV by using delay-skewed buffering for the power switches, while achieving a wakeup time of 1.44 ns.
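The two ideas in this abstract, operand-width detection and the break-even time, can be sketched as follows. This is an illustration under stated assumptions, not the authors' analytical model: operands are treated as unsigned, and the energy overhead of one sleep event is assumed to equal `bet` cycles of saved leakage, so the net saving of a sleep of `n` cycles is `n - bet` in those units. The function names are hypothetical; the BET values are the ones quoted in the abstract.

```python
# BET in cycles at 100 MHz, keyed by temperature in °C (values from the abstract).
BET_CYCLES = {25: 32, 65: 10, 100: 3}


def can_gate_upper_array(a: int, b: int) -> bool:
    """True when both operands fit in 16 bits (unsigned operands assumed)."""
    return 0 <= a < (1 << 16) and 0 <= b < (1 << 16)


def net_saving(sleep_lengths, bet):
    """Net saving over a list of sleep events, in cycles of saved leakage.

    Assumed linear model: each event of n cycles saves n units of leakage
    and costs bet units of overhead, so it pays off only when n > bet.
    """
    return sum(n - bet for n in sleep_lengths)


sleeps = [20, 20]  # two hypothetical sleep events of 20 cycles each
for temp, bet in BET_CYCLES.items():
    print(temp, net_saving(sleeps, bet))
```

With these hypothetical 20-cycle sleeps, the model gives a net loss at 25 °C and a net gain at 65 °C and 100 °C, the same qualitative trend as the simulation results reported above.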
asia and south pacific design automation conference | 2011
Lei Zhao; Daisuke Ikebuchi; Yoshiki Saito; M. Kamata; Naomi Seki; Yu Kojima; Hideharu Amano; Satoshi Koyama; Tatsunori Hashida; Y. Umahashi; D. Masuda; Kimiyoshi Usami; Keiji Kimura; Mitaro Namiki; Seidai Takeda; Hiroshi Nakamura; Masaaki Kondo
Geyser-2 is the second prototype MIPS CPU that provides fine-grained run-time power gating (PG) controlled by instructions. Geyser-1 [1], the first prototype, provides only the fine-grained run-time PG core. Although it demonstrated leakage power reduction on a real chip, its operating frequency is limited to 60 MHz because of the limited I/O speed. Geyser-2, with cache and TLB mechanisms, is implemented to show that (1) run-time PG works at a clock frequency of at least 200 MHz, which is commonly used in embedded systems, and (2) it is also effective in an environment running real application programs on an operating system.
international symposium on quality electronic design | 2012
Seidai Takeda; Shinobu Miwa; Kimiyoshi Usami; Hiroshi Nakamura
Power Gating (PG) and Body Biasing (BB) are effective schemes for saving leakage power during standby. At run time, however, the large energy overhead and latency of their sleep control prevent the circuit from saving power in short idle times. To reduce those overheads, advanced PG and BB using a shallow sleep mode have been studied. Those circuits achieve leakage saving even in short idle times. The depth of the sleep mode trades off the overheads against the amount of leakage power saved; hence, deciding the depth of a shallow sleep is an important issue in maximizing the total leakage saving. However, the depth that achieves the best leakage saving depends heavily on run-time factors, such as application behavior and temperature. Thus, the conventional circuit has multiple shallow sleep modes and chooses an adequate mode at run time. However, this causes a large power overhead because of the additional voltage generators needed for the shallow sleep modes. In this paper, we propose a sleep control scheme named Opt-static for run-time leakage saving. Our scheme uses only one shallow sleep mode, but its depth is reconfigurable. It successfully achieves leakage saving by adapting its depth to run-time factors. In addition, our scheme needs only one active voltage generator; hence, the power overhead associated with voltage generators is smaller than in the conventional circuit, which has multiple shallow sleep modes. Experimental results show that our scheme applied to Multi-mode PG achieves higher leakage saving than the conventional Multi-mode PG with two shallow sleep modes, even though the comparison does not take into account the power overhead of the voltage generators.
great lakes symposium on vlsi | 2012
Kyundong Kim; Seidai Takeda; Shinobu Miwa; Hiroshi Nakamura
Caches are among the most leakage-consuming components in modern processors because of their massive number of transistors. To reduce the leakage power of caches, several techniques using power gating (PG) have been proposed. Despite its high leakage saving, a side effect of PG for caches is the loss of data during sleep. If useful data is lost in sleep mode, it must be fetched again from a lower-level memory. This consumes a considerable amount of energy, which offsets the leakage saving. This paper proposes a new PG scheme that considers the data retentiveness of SRAM. After entering sleep mode, the data in an SRAM cell is not lost immediately and remains usable if its validity is checked. Therefore, we utilize the data retentiveness of SRAM to avoid the energy overhead of data recovery, which creates further opportunities for leakage saving. To check availability, we introduce simple hardware whose overhead is negligible. We also examined the leakage saving potential of our approach. For both L1 data and instruction caches, our scheme reduces leakage energy by more than a factor of two compared with the conventional PG scheme.
great lakes symposium on vlsi | 2012
Seidai Takeda; Shinobu Miwa; Kimiyoshi Usami; Hiroshi Nakamura
Recently, run-time sleep control schemes using multiple sleep modes have been studied. In those studies, each sleep mode has its own sleep depth. A deeper sleep mode provides higher leakage saving but incurs a larger energy overhead. Using multiple modes helps save more leakage if an appropriate mode is selected, but the best mode depends on the idle period, whose length cannot be known in advance. Although implementations that realize different sleep depths have been well studied, little attention has been paid to how to select the best sleep depth dynamically during execution. This paper proposes a simple but novel sleep control scheme, called stepwise sleep depth control, which aims to select the best depth among the multiple sleep depths provided. Our scheme automatically applies deeper depths in a step-by-step manner after an idle state starts. It successfully reduces leakage energy while requiring only a small modification to the circuit implementation. This paper also proposes a methodology for optimizing the control parameters of our sleep control scheme according to program behavior and temperature. Experimental results show that stepwise sleep depth control applied to a body biasing circuit improves net leakage saving by up to 43% for FPAlu at 1.0 GHz and 75 °C compared to conventional reverse body biasing.
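The step-by-step deepening described above can be sketched with a simple cycle-level energy model. This is a hedged illustration, not the paper's controller or parameter optimization: the `Step` fields, the per-cycle leakage values (normalized so that an awake idle unit leaks 1.0 per cycle), and the entry costs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    start_cycle: int       # idle cycles elapsed before this depth is applied
    leak_per_cycle: float  # leakage per cycle at this depth (awake = 1.0)
    entry_cost: float      # overhead energy paid once when entering this depth


def idle_energy(idle_cycles, steps):
    """Energy spent over one idle period under a stepwise depth schedule."""
    energy = 0.0
    leak = 1.0  # before any step applies, the unit leaks at the awake rate
    pending = sorted(steps, key=lambda s: s.start_cycle)
    idx = 0
    for cycle in range(idle_cycles):
        # Move to the next (deeper) depth once its start cycle is reached.
        while idx < len(pending) and pending[idx].start_cycle <= cycle:
            leak = pending[idx].leak_per_cycle
            energy += pending[idx].entry_cost
            idx += 1
        energy += leak
    return energy


stepwise = [Step(0, 0.5, 1.0), Step(10, 0.25, 4.0)]  # shallow first, deep later
deep_only = [Step(0, 0.25, 4.0)]                     # deep sleep immediately

for n in (4, 20):
    print(n, idle_energy(n, stepwise), idle_energy(n, deep_only))
```

With these made-up parameters, the stepwise schedule spends less energy than immediate deep sleep on the short idle period, because it avoids paying the larger entry cost for an idle period that ends early.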