Publication


Featured research published by Atsuhiro Suga.


IEEE Micro | 2000

Introducing the FR500 embedded microprocessor

Atsuhiro Suga; Kunihiko Matsunami

Because conventional RISC processors have insufficient processing power to support the continuing development of digital consumer products, we need a new high-performance processor for multimedia applications. Processing multimedia video images requires more than 10 times the currently available performance. At Fujitsu, we provide this higher performance in software to attain a high degree of flexibility. We developed the FR500 microprocessor with a novel embedded VLIW (very long instruction word) architecture for use in such digital consumer products. The FR500 is the first product in the FR-V line, Fujitsu's generic name for VLIW architecture microprocessors. The FR-V line offers the flexibility to develop new products optimized for a wide variety of digital consumer products. In this paper, we describe the FR-V architecture, which includes our variable-length VLIW and instruction set architectures, speculative execution control, and conditional execution control. We also evaluate its performance.


International Solid-State Circuits Conference | 2005

A 51.2 GOPS 1.0 GB/s-DMA single-chip multi-processor integrating quadruple 8-way VLIW processors

Tetsuyoshi Shiota; Kenichi Kawasaki; Yukihito Kawabe; Wataru Shibamoto; Atsushi Sato; Tetsutaro Hashimoto; Fumihiko Hayakawa; Shin-ichirou Tago; Hiroshi Okano; Yasuki Nakamura; Hideo Miyake; Atsuhiro Suga; Hiromasa Takahashi

A 51.2-GOPS chip multi-processor integrates four 8-way VLIW embedded processors with 1.0 GB/s local-bus direct memory access. This IC completes MPEG2 MP@HL video-stream decoding at 68% of its processor capability without dedicated hardware. The 11.9 mm × 10.3 mm chip is fabricated in a 90 nm 9M CMOS process and consumes 5 W at 533 MHz.


International Solid-State Circuits Conference | 1999

A 2.5-GFLOPS, 6.5 million polygons per second, four-way VLIW geometry processor with SIMD instructions and a software bypass mechanism

Hajime Kubosawa; Naoshi Higaki; Satoshi Ando; Hiromasa Takahashi; Yoshimi Asada; Hideaki Anbutsu; Tomio Sato; Masato Sakate; Atsuhiro Suga; Michihide Kimura; Hideo Miyake; Hiroshi Okano; Akira Asato; Yasunori Kimura; Hiroshi Nakayama; Masayoshi Kimoto; Katsuji Hirochi; Hideki Saito; Norio Kaido; Yukihiro Nakagawa; T. Shimada

A 4-way VLIW geometry processor runs at 312 MHz and contains a PCI/AGP bus bridge in a three-layer-metal CMOS process with 0.21 μm design rules at 2.5 V. It features: (1) VLIW and SIMD instruction sets, (2) a software bypass mechanism, (3) special condition-code registers and a branch condition generator for clipping, and (4) automatic clock delay tuning. The result is performance of 2.5 GFLOPS and 6.5 Mpolygons/s in a 3D geometry processor. This chip can be added to conventional graphics systems without requiring additional LSIs.


International Solid-State Circuits Conference | 2000

A 4-way VLIW embedded multimedia processor

Atsuhiro Suga; T. Sukemura; H. Takahashi; K. Wada; H. Miyake; Y. Nakamura; Y. Takebe; I. Azegami; Y. Hirose; M. Kimura; T. Okano; T. Shiota; M. Saito; S. Wakayama; T. Ozawa; T. Satoh; A. Sakurai; T. Katayama; K. Abe; K. Kuwano

Performance requirements are soaring for embedded processors, which are in greater demand than ever for multimedia processing. Some DSPs and media processors meet these requirements by means of a VLIW architecture. For embedded processors, however, small code size, low power, and a small die are mandatory. These requirements make a 4-way superscalar embedded processor impractical. This embedded processor uses a 4-way VLIW architecture characterized by: (1) parallel execution by VLIW, (2) generic CPU functions combined with media processing functions to enhance multimedia processing ability, (3) NOP instruction suppression by packing flags, for compatibility among different levels of parallelism, and (4) two parallel execution mechanisms, ILP and SIMD.
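Point (3) can be pictured with a minimal sketch. It assumes, purely for illustration, that each instruction carries a one-bit packing flag that is set on the last instruction of a bundle; the abstract does not specify the actual encoding, and the names `split_bundles` and `nops_saved` are hypothetical.

```python
from typing import List, Tuple

# (mnemonic, packing flag). Assumption: the flag is True on the last
# instruction of a bundle; the real encoding is not given in the abstract.
Instr = Tuple[str, bool]


def split_bundles(stream: List[Instr], width: int = 4) -> List[List[str]]:
    """Group a flat instruction stream into bundles of at most `width` slots."""
    bundles: List[List[str]] = []
    current: List[str] = []
    for mnemonic, is_last in stream:
        current.append(mnemonic)
        if len(current) > width:
            raise ValueError("bundle exceeds issue width")
        if is_last:
            bundles.append(current)
            current = []
    if current:
        raise ValueError("stream ends inside an unterminated bundle")
    return bundles


def nops_saved(bundles: List[List[str]], width: int = 4) -> int:
    """Count the NOP slots a fixed-width encoding would need as padding."""
    return sum(width - len(bundle) for bundle in bundles)


# Hypothetical instruction stream: no NOPs are stored, only packing flags.
stream: List[Instr] = [
    ("add", False), ("mul", True),
    ("ld", True),
    ("add", False), ("sub", False), ("st", False), ("br", True),
]
bundles = split_bundles(stream)
print(bundles)
print(nops_saved(bundles))
```

In this example the stream splits into bundles of 2, 1, and 4 instructions, so a fixed 4-slot format would have needed 5 padding NOPs that the flagged encoding never stores.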


International Conference on Computer Design | 1990

A 64-bit floating-point processing unit with a horizontal instruction code for parallel operations

Akira Katsuno; Hiromasa Takahashi; Hajime Kubosawa; Tomio Sato; Atsuhiro Suga; Gensuke Goto

A full 64-bit floating-point processing unit (FPU) with a long horizontal instruction code for parallel operations without pipeline interlock is described. The FPU is implemented on a 1.0-μm CMOS chip containing 300 K transistors and operating at 25 MHz. It runs at a peak rate of 50 MFLOPS and a sustained rate of 15.4 MFLOPS. The register-to-register latencies of double- and single-precision addition, subtraction, and multiplication are 120 ns each. The latency of double-precision division is 640 ns and that of square root is 880 ns.
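The quoted latencies can be restated in clock cycles with simple arithmetic. The sketch below assumes the clock runs at exactly the stated 25 MHz; the cycle counts are derived here and are not given in the abstract, and `ns_to_cycles` is a hypothetical helper name.

```python
CLOCK_MHZ = 25  # operating frequency stated in the abstract


def ns_to_cycles(latency_ns: float, clock_mhz: float = CLOCK_MHZ) -> float:
    """Convert a latency in nanoseconds to clock cycles at `clock_mhz`."""
    return latency_ns * clock_mhz / 1000.0


# Latencies quoted in the abstract, in nanoseconds.
latencies_ns = {
    "add/sub/mul": 120,
    "divide (double)": 640,
    "square root": 880,
}
cycles = {op: ns_to_cycles(ns) for op, ns in latencies_ns.items()}

# Throughput per cycle implied by the quoted MFLOPS figures.
peak_flops_per_cycle = 50 / CLOCK_MHZ
sustained_flops_per_cycle = 15.4 / CLOCK_MHZ

for op, n in cycles.items():
    print(f"{op}: {n:g} cycles")
print(f"peak: {peak_flops_per_cycle:g} FLOPs/cycle")
print(f"sustained: {sustained_flops_per_cycle:.3f} FLOPs/cycle")
```

Under that assumption the latencies correspond to 3, 16, and 22 cycles, and the peak rate to 2 floating-point operations per cycle.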


International Solid-State Circuits Conference | 1998

A 1.2 W 2.16 GOPS/720 MFLOPS embedded superscalar microprocessor for multimedia applications

Hajime Kubosawa; Hiromasa Takahashi; Satoshi Ando; Yoshimi Asada; Akira Asato; Atsuhiro Suga; Michihide Kimura; Naoshi Higaki; Hideo Miyake; Tomio Sato; Hideaki Anbutsu; Toshitaka Tsuda; Tetsuo Yoshimura; Isao Amano; Mutsuaki Kai; Shin Mitarai

A microprocessor with a single instruction, multiple data (SIMD) architecture and as many as 170 media instructions for multimedia embedded systems meets all requirements of embedded systems, including (a) MPEG2 (MP@ML) decoding and 3DCG image processing capabilities, (b) programming flexibility, and (c) low power dissipation and low cost. It also works as a general-purpose microprocessor with mid-range performance. The microprocessor uses 0.21 μm CMOS technology, and the chip achieves 2.16 GOPS/720 MFLOPS at 180 MHz with 1.2 W dissipation.
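The headline figures can be related to each other with a short calculation. The sketch below only divides the numbers quoted in the abstract; the per-cycle and per-watt values are derived here rather than stated by the authors, and `per_cycle` is a hypothetical helper name.

```python
def per_cycle(rate_millions_per_s: float, clock_mhz: float) -> float:
    """Operations per clock cycle implied by a rate in millions per second."""
    return rate_millions_per_s / clock_mhz


CLOCK_MHZ = 180     # operating frequency from the abstract
PEAK_MOPS = 2160    # 2.16 GOPS
PEAK_MFLOPS = 720   # 720 MFLOPS
POWER_W = 1.2       # power dissipation from the abstract

ops_per_cycle = per_cycle(PEAK_MOPS, CLOCK_MHZ)
flops_per_cycle = per_cycle(PEAK_MFLOPS, CLOCK_MHZ)
gops_per_watt = (PEAK_MOPS / 1000) / POWER_W

print(f"{ops_per_cycle:g} operations per cycle")
print(f"{flops_per_cycle:g} floating-point operations per cycle")
print(f"{gops_per_watt:.2f} GOPS per watt")
```

Read this way, the quoted peak rates correspond to 12 operations and 4 floating-point operations per cycle, at about 1.8 GOPS per watt.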


Proceedings. IEEE Asia-Pacific Conference on ASIC | 2002

An 8-way VLIW embedded multimedia processor with advanced cache mechanism

Fumihiko Hayakawa; Hiroshi Okano; Atsuhiro Suga

An 8-way VLIW embedded multimedia processor is developed in 0.11 μm 7-layer Cu/Al metal CMOS process technology. The processor achieved a peak performance of 2132 MIPS/2.1 GFLOPS/4.26 GOPS at 533 MHz. It is equipped with 4-way integer and 4-way floating/media pipelines. Each media pipeline can execute a 4-parallel SIMD instruction, so 16 operations can be executed per cycle. It also has data and instruction caches, each 32 KB in size. The data cache has a non-aligned data access mechanism and an enhanced store operation mechanism, and a 9% performance improvement was verified. The instruction cache has a reconfigurable associativity mechanism, which reduced power consumption by more than 12%. As a result, this processor achieved more than twice the performance of our previous processor. This processor enables MPEG2 MP@ML 2-stream decoding.


Proceedings Euro ASIC '92 | 1992

A 64-bit floating point processing unit for a RISC microprocessor

Hajime Kubosawa; Akira Katsuno; Hiromasa Takahashi; Tomio Sato; Atsuhiro Suga; Gensuke Goto

Describes the architecture, layout, and simulation methodology of a high-performance 64-bit floating point processing unit (FPU) which is applicable to a RISC microprocessor. The FPU contains a floating point execution unit and a floating point controller for the SPARC S-25 microprocessor. The FPU supports SPARC floating point instructions based on the IEEE Standard for Binary Floating Point Arithmetic (ANSI/IEEE Std. 754-1985). The operating frequency is 25 MHz and peak floating point computing performance is 12.5 MFLOPS when it is used with the S-25 SPARC microprocessor. The chip was designed using 0.8 μm CMOS standard cell technology. The chip size is 16.4 × 16.4 mm, and it is packaged in a 179-pin PGA. The total transistor count is approximately 330,000.


International Symposium on Computing and Networking | 2014

ARM Based Platform SoC for Embedded Applications

Fumihiko Hayakawa; Atsuhiro Suga

Embedded systems today need open source software (OSS) and open frameworks such as OpenGL and OpenCL for better development efficiency. This paper describes a high-performance SoC architecture for embedded systems. The SoC shows lower power consumption in standby and better performance for some applications. By reusing this hardware and software platform, custom SoCs can be developed easily in a short period.


Asia and South Pacific Design Automation Conference | 2006

Single-chip multi-processor integrating quadruple 8-way VLIW processors with interface timing analysis considering power supply noise

Satoshi Imai; Atsuki Inoue; Motoaki Matsumura; Kenichi Kawasaki; Atsuhiro Suga

This paper introduces a 51.2 GOPS, 1.0 GB/s-DMA single-chip multiprocessor integrating quadruple cores and proposes a power integrity analysis. Our multiprocessor is designed to decode MP@HL streams without any dedicated circuits. To achieve such high performance, data throughput is as important as processing capability, which requires a large number of high-speed I/Os. However, this leads to a high level of power supply noise. We therefore applied an interface timing margin analysis tool that took power supply noise into account, and succeeded in putting reasonable restrictions on the design of the LSI as well as that of the printed circuit board. As a result, we succeeded in operating the processor at 533 MHz with the 2-channel 64-bit main memory interface at 266 MHz and the 64-bit system bus at 178 MHz.
