Publication


Featured research published by Hideho Arakida.


international solid-state circuits conference | 1998

A 60 mW MPEG4 video codec using clustered voltage scaling with variable supply-voltage scheme

Masafumi Takahashi; Mototsugu Hamada; Tsuyoshi Nishikawa; Hideho Arakida; Yoshiro Tsuboi; Tetsuya Fujita; Fumitoshi Hatori; Shinji Mita; Kojiro Suzuki; Akihiko Chiba; Toshihiro Terazawa; Fumihiko Sano; Y. Watanabe; Hiroshi Momose; Kimiyoshi Usami; Mutsunori Igarashi; Takashi Ishikawa; Masahiro Kanazawa; Tadahiro Kuroda; Tohru Furuyama

This MPEG4 video codec implements essential functions in the MPEG4 committee draft. It consumes 60 mW at 30 MHz, 30% of the power dissipation of a conventional CMOS design. Measured power dissipation is summarized. The 70% power reduction is achieved by low-power techniques at the circuit and architectural levels. A 16b RISC processor provides software programmability. Binary shape decoding uses 20% of the computation power of the RISC processor at a 30 MHz clock, with a negligible increase in chip power dissipation. Three-step hierarchical motion estimation reduces power dissipation.


custom integrated circuits conference | 1998

A top-down low power design technique using clustered voltage scaling with variable supply-voltage scheme

Mototsugu Hamada; Masafumi Takahashi; Hideho Arakida; Akihiko Chiba; Toshihiro Terazawa; Takashi Ishikawa; Masahiro Kanazawa; Mutsunori Igarashi; Kimiyoshi Usami; Tadahiro Kuroda

A novel design technique, the VS-CVS scheme, is presented; it combines a variable supply-voltage scheme with clustered voltage scaling. A theory for choosing the optimum supply voltages in the VS-CVS scheme is discussed, which enables chip design to be performed in a top-down fashion. Level-shifting flip-flops are developed which reduce power, delay and area penalties significantly. Application of this technique to an MPEG4 video codec saves 55% of the power dissipation without degrading circuit performance compared to a conventional CMOS design.


international solid-state circuits conference | 2000

A 60 MHz 240 mW MPEG-4 video-phone LSI with 16 Mb embedded DRAM

Tsuyoshi Nishikawa; Masafumi Takahashi; Mototsugu Hamada; Toshinari Takayanagi; Hideho Arakida; Noriaki Machida; Hideaki Yamamoto; Toshihide Fujiyoshi; Osamu Yamagishi; T. Samata; Atsushi Asano; Toshihiro Terazawa; Kenji Ohmori; Junya Shirakura; Y. Watanabe; Hiroki Nakamura; Shigenobu Minami; Tadahiro Kuroda; Tohru Furuyama

A 240 mW single-chip MPEG-4 video-phone LSI with a 16 Mb embedded DRAM is fabricated in a 0.25 μm CMOS, triple-well, quad-metal technology. The chip integrates a 16 Mb DRAM and three dedicated 16 b RISC processors with dedicated hardware accelerators that serve as an MPEG-4 video codec, a speech codec, and a multiplexer. It also integrates camera, display, and audio interfaces required for a video-phone system. It consumes 240 mW at 60 MHz operation, which is only 22% of the power dissipation of a conventional design. A variable threshold voltage CMOS (VTCMOS) technology is employed to reduce standby leakage current to 26 μA, which is only 17% of the conventional CMOS design.


design automation conference | 1998

Design methodology of ultra low-power MPEG4 codec core exploiting voltage scaling techniques

Kimiyoshi Usami; Mutsunori Igarashi; Takashi Ishikawa; Masahiro Kanazawa; Masafumi Takahashi; Mototsugu Hamada; Hideho Arakida; Toshihiro Terazawa; Tadahiro Kuroda

This paper describes a fully automated low-power design methodology in which three different voltage-scaling techniques are combined. Supply voltage is scaled globally, selectively, and adaptively while maintaining performance. This methodology enabled us to design an MPEG4 codec core with 58% less power than the original in a three-week turnaround time.


international solid-state circuits conference | 2005

A 63-mW H.264/MPEG-4 audio/visual codec LSI with module-wise dynamic Voltage/frequency scaling

Toshihide Fujiyoshi; Shinichiro Shiratake; Shuou Nomura; Tsuyoshi Nishikawa; Yoshiyuki Kitasho; Hideho Arakida; Yuji Okuda; Yoshiro Tsuboi; Mototsugu Hamada; Hiroyuki Hara; Tetsuya Fujita; Fumitoshi Hatori; Takayoshi Shimazawa; Kunihiko Yahagi; Hideki Takeda; Masami Murakata; Fumihiro Minami; Naoyuki Kawabe; Takeshi Kitahara; Katsuhiro Seta; Masafumi Takahashi; Yukihito Oowaki; Tohru Furuyama

A single-chip H.264 and MPEG-4 audio-visual LSI for mobile applications, including terrestrial digital broadcasting systems (ISDB-T / DVB-H), with a module-wise dynamic voltage/frequency scaling architecture is presented for the first time. This LSI can keep operating even during the voltage/frequency transition, so there is no performance overhead. This is realized through a dynamic deskewing system and an on-chip voltage regulator with slew rate control. Combined with traditional low-power techniques such as embedded DRAM and clock gating, it consumes only 63 mW when decoding QVGA H.264 video at 15 frames/sec and MPEG-4 AAC LC audio simultaneously.


international solid-state circuits conference | 2003

A 160 mW, 80 nA standby, MPEG-4 audiovisual LSI with 16 Mb embedded DRAM and a 5 GOPS adaptive post filter

Hideho Arakida; Masafumi Takahashi; Yoshiro Tsuboi; Tsuyoshi Nishikawa; Hideaki Yamamoto; Toshihide Fujiyoshi; Yoshiyuki Kitasho; Yoshihiro Ueda; Manabu Watanabe; Tetsuya Fujita; Toshihiro Terazawa; K. Ohmori; M. Koana; H. Nakamura; E. Watanabe; H. Ando; T. Aikawa; Tohru Furuyama

A single-chip MPEG-4 audiovisual LSI in a 0.13 μm 5M CMOS technology with 16 Mb embedded DRAM is presented. Four 16 b RISC processors and dedicated hardware accelerators including a 5 GOPS post filtering engine are integrated on the IC. The chip consumes 160 mW at 125 MHz and uses 80 nA in the standby mode. This LSI handles MPEG-4 CIF video encoding at 15 frames/s and audio encoding simultaneously.


IEEE Journal of Solid-state Circuits | 2011

A 40 nm 222 mW H.264 Full-HD Decoding, 25 Power Domains, 14-Core Application Processor With x512b Stacked DRAM

Yu Kikuchi; Makoto Takahashi; Tomohisa Maeda; Masatoshi Fukuda; Yasuhiro Koshio; Hiroyuki Hara; Hideho Arakida; Hideaki Yamamoto; Yousuke Hagiwara; Tetsuya Fujita; Manabu Watanabe; Hirokazu Ezawa; Takayoshi Shimazawa; Yasuo Ohara; Takashi Miyamori; Mototsugu Hamada; Masafumi Takahashi; Yukihito Oowaki

In this paper we introduce a 14-core application processor for multimedia mobile applications, implemented in 40 nm, with a 222 mW H.264 full high-definition (full-HD) video engine, a 124 mW 40 M-polygons/s 3D/2D graphics engine, and a video/audio multiprocessor for various codecs and image processing. The application processor has 25 power domains to achieve coarse-grain power gating, adjusting to the required performance of a wide range of multimedia applications. The simple on-chip power switch circuits switch in less than 1 μs while reducing rush current. Furthermore, the Stacked Chip SoC (SCS) technology enables rewiring to the DRAM chip during the assembly/packaging phase using electroplated wires with a 10 μm minimum pitch on the Re-Distribution Layer (RDL). The peak memory bandwidth is 10.6 GB/s with an x512b SCS-DRAM interface, and the power consumption of this interface is 3.9 mW at a 2.4 GB/s workload.
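The interface figures quoted in the abstract above can be turned into two derived quantities with simple arithmetic. The sketch below is illustrative only: it assumes "GB" means 10^9 bytes and that the quoted 3.9 mW applies uniformly to the 2.4 GB/s workload. The constant and function names are hypothetical, and the results are not figures reported by the paper.

```python
# Figures quoted in the abstract (GB taken as 1e9 bytes, an assumption).
BUS_WIDTH_BITS = 512
PEAK_BANDWIDTH_BYTES_PER_S = 10.6e9
WORKLOAD_BYTES_PER_S = 2.4e9
INTERFACE_POWER_W = 3.9e-3


def transfers_per_second(bandwidth_bytes_per_s, bus_width_bits):
    """Bus-wide transfers per second needed to sustain the given bandwidth."""
    bytes_per_transfer = bus_width_bits // 8  # 512 bits -> 64 bytes
    return bandwidth_bytes_per_s / bytes_per_transfer


def energy_per_bit_joules(power_w, throughput_bytes_per_s):
    """Average interface energy per transferred bit at the given workload."""
    return power_w / (throughput_bytes_per_s * 8)


peak_transfers = transfers_per_second(PEAK_BANDWIDTH_BYTES_PER_S, BUS_WIDTH_BITS)
energy_pj = energy_per_bit_joules(INTERFACE_POWER_W, WORKLOAD_BYTES_PER_S) * 1e12

print(f"{peak_transfers / 1e6:.1f} M transfers/s at peak bandwidth")
print(f"{energy_pj:.3f} pJ/bit at the 2.4 GB/s workload")
```

Under these assumptions the wide interface needs only about 166 million transfers per second at peak, and the interface spends roughly 0.2 pJ per bit. The transfer rate is not a clock frequency, since the abstract does not state whether the interface is single or double data rate.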


international symposium on circuits and systems | 2000

A scalable MPEG-4 video codec architecture for IMT-2000 multimedia applications

Masafumi Takahashi; Tsuyoshi Nishikawa; Hideho Arakida; Noriaki Machida; Hideaki Yamamoto; Toshihide Fujiyoshi; Yoko Matsumoto; Osamu Yamagishi; T. Samata; Atsushi Asano; Toshihiro Terazawa; Kenji Ohmori; Junya Shirakura; Yoshinori Watanabe; Hiroki Nakamura; Shigenobu Minami; Tohru Furuyama

A scalable MPEG-4 video codec architecture is proposed to achieve low power consumption and high cost-effectiveness for IMT-2000 multimedia applications. The MPEG-4 video codec consists of a 16-bit multimedia-extended RISC processor and dedicated hardware accelerators, which bring about both low power consumption and programmability. The proposed architecture is extended and applied to the development of two MPEG-4 LSIs. One is an MPEG-4 video codec LSI, which performs MPEG-4 video encoding and decoding at 15 frames per second in quarter common intermediate format. The other is an MPEG-4 audiovisual LSI that contains three 16-bit RISC processors and a 16-Mbit embedded DRAM and executes the major functions of 3GPP 3G-324M video telephony for IMT-2000 applications. By optimizing the embedded DRAM configuration and introducing clock gating and low-power motion estimation, the MPEG-4 audiovisual LSI consumes only 240 mW when it runs the MPEG-4 video SP@L1 codec, the AMR speech codec, and the H.223 annex B multiplex at a 60 MHz clock rate.


international solid-state circuits conference | 2010

A 222mW H.264 Full-HD decoding application processor with x512b stacked DRAM in 40nm

Yu Kikuchi; Makoto Takahashi; Tomohisa Maeda; Hiroyuki Hara; Hideho Arakida; Hideaki Yamamoto; Yousuke Hagiwara; Tetsuya Fujita; Manabu Watanabe; Takayoshi Shimazawa; Yasuo Ohara; Takashi Miyamori; Mototsugu Hamada; Masafumi Takahashi; Yukihito Oowaki

Today's multimedia mobile devices must support a wide range of multimedia applications in addition to full high-definition (Full-HD) video processing. Conventional hardware engine approaches [1-3] cannot handle new applications that may be required once the chips are fabricated. We report an application processor with a hybrid architecture that combines a software solution with a multi-core processor [4] for various applications and a hardware solution with hardware engines for low-power and specific high-performance tasks such as Full-HD video and 3D graphics. Another issue faced in multimedia mobile devices is achieving high memory bandwidth with low power consumption. DDR memory connections in System-in-Package (SiP) technologies need a large number of I/Os or a high interface frequency at the expense of high power consumption. A Chip-on-Chip (CoC) connection using micro-bumps [5] is a power-efficient technology to achieve high memory bandwidth and low power. However, in the case of the conventional CoC technique, customized DRAM chips are necessary, because wiring between a logic chip and a DRAM chip is implemented on the metal layers in the DRAM chip. To use a DRAM chip for multiple logic LSIs, the Stacked-Chip SoC (SCS) technology used for this application processor enables rewiring at the assembly/packaging phase using minimum 5 µm-pitch metal wiring on the Re-Distribution Layer (RDL). We also report an on-chip power switch with a simple structure that inhibits rush currents. The application processor has 25 power domains and controls these domains finely to optimize for various ranges of performance requirements.


design, automation, and test in europe | 2009

Design and implementation of scalable, transparent threads for multi-core media processor

Takeshi Kodaka; Shunsuke Sasaki; Takahiro Tokuyoshi; Ryuichiro Ohyama; Nobuhiro Nonogaki; Koji Kitayama; Tatsuya Mori; Yasuyuki Ueda; Hideho Arakida; Yuji Okuda; Toshiki Kizu; Yoshiro Tsuboi; Nobu Matsumoto

In this paper, we propose a scalable and transparent parallelization scheme using threads for multi-core processors. The performance achieved by our scheme scales with the number of cores, and the application program is not affected by the actual number of cores. For performance efficiency, we designed the threads so that they do not suspend and do not start executing until the data they need are available. We implemented our design using three modules: the dependency controller, which controls dependencies among threads; the thread pool, which manages the ready threads; and the thread dispatcher, which fetches threads from the pool and executes them on the cores. Our design and implementation provide efficient thread scheduling with low overhead. Moreover, by hiding the actual number of cores, it realizes transparency. We confirmed the transparency and scalability of our scheme by applying it to an H.264 decoder program. With this scheme, the application program does not need to be modified even if the number of cores changes due to disparate requirements. This feature shortens development time and contributes to reducing development cost.
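The three-module structure described in this abstract can be illustrated with a small single-process simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the class and function names, the round-based dispatch model, and the four-task graph are all hypothetical, and every task is assumed to be registered before dispatch begins.

```python
from collections import deque


class DependencyController:
    """Releases a task to the ready pool once all of its prerequisites finish."""

    def __init__(self, pool):
        self.pool = pool       # shared ready queue (the "thread pool")
        self.pending = {}      # task name -> count of unfinished prerequisites
        self.dependents = {}   # task name -> tasks waiting on it
        self.work = {}         # task name -> callable to run

    def add(self, name, func, deps=()):
        # Assumption: all tasks are registered before dispatch starts.
        self.work[name] = func
        self.pending[name] = len(deps)
        for dep in deps:
            self.dependents.setdefault(dep, []).append(name)
        if not deps:
            self.pool.append(name)

    def finished(self, name):
        for child in self.dependents.get(name, []):
            self.pending[child] -= 1
            if self.pending[child] == 0:
                self.pool.append(child)


def dispatch(controller, pool, num_cores):
    """Runs ready tasks to completion, up to num_cores per simulated round."""
    rounds = 0
    while pool:
        batch = [pool.popleft() for _ in range(min(num_cores, len(pool)))]
        for name in batch:
            controller.work[name]()    # tasks never suspend once started
        for name in batch:
            controller.finished(name)  # may make dependent tasks ready
        rounds += 1
    return rounds


def run_demo(num_cores):
    """Builds a toy decode-like task graph; only num_cores varies."""
    pool = deque()
    ctrl = DependencyController(pool)
    results = {}
    ctrl.add("parse", lambda: results.update(parse=1))
    ctrl.add("mb0", lambda: results.update(mb0=results["parse"] + 1), deps=["parse"])
    ctrl.add("mb1", lambda: results.update(mb1=results["parse"] + 2), deps=["parse"])
    ctrl.add("filter",
             lambda: results.update(filter=results["mb0"] + results["mb1"]),
             deps=["mb0", "mb1"])
    rounds = dispatch(ctrl, pool, num_cores)
    return results, rounds
```

Calling `run_demo(1)` and `run_demo(2)` yields identical `results` while the simulated round count drops from 4 to 3. That mirrors the transparency property in miniature: the task definitions are unchanged, and only the core count passed to the dispatcher differs.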
