Nobu Matsumoto | Researchain

Publication

Featured research published by Nobu Matsumoto.

IEEE Journal of Solid-State Circuits | 2003

A single-chip MPEG-2 codec based on customizable media embedded processor

Shunichi Ishiwata; Tomoo Yamakage; Yoshiro Tsuboi; Takayoshi Shimazawa; Tomoko Kitazawa; Shuji Michinaka; Kunihiko Yahagi; Hideki Takeda; Akihiro Oue; Tomoya Kodama; Nobu Matsumoto; Takayuki Kamei; Mitsuo Saito; Takashi Miyamori; Goichi Ootomo; Masataka Matsui

A single-chip MPEG-2 MP@ML codec, integrating 3.8M gates on a 72-mm² die, is described. The codec employs a heterogeneous multiprocessor architecture in which six microprocessors with the same instruction set but different customization execute specific tasks such as video and audio concurrently. The microprocessor, developed for digital media processing, provides various extensions such as a very-long-instruction-word coprocessor, digital signal processor instructions, and hardware engines. Making full use of the extensions and optimizing the architecture of each microprocessor based upon the nature of specific tasks, the chip can execute not only MPEG-2 MP@ML video/audio/system encoding and decoding concurrently, but also MPEG-2 MP@HL decoding in real time.

Design Automation Conference | 2001

A new verification methodology for complex pipeline behavior

Kazuyoshi Kohno; Nobu Matsumoto

A new test program generation tool, MVpGen, has been developed for verifying the pipeline design of microprocessors. The only inputs MVpGen requires are pipeline-behavior specifications: it first generates test cases from those specifications automatically and then generates test programs corresponding to the test cases. Test programs are generated for verifying complex pipeline behavior, such as a hazard combined with a branch or a hazard combined with an exception. MVpGen has been integrated into a verification system for the RTL descriptions of a real microprocessor design, where it detected complex bugs that had remained hidden in those descriptions.

Design Automation Conference | 1993

A Compaction Method for Full Chip VLSI Layouts

Joseph Dao; Nobu Matsumoto; Tsuneo Hamai; Chusei Ogawa; Shojiro Mori

An algorithm-independent layout compaction method for full-chip layouts is proposed. The partitioning compaction method cuts a large layout into blocks, compacts each block independently, and then merges the blocks to give the final compacted layout. A 16-bit CPU core layout (28.8K transistors) was compacted on a standard workstation using this method. Both computer memory usage and processing time were reduced. Parallel processing is possible to speed up the computation further.
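The cut, compact, and merge steps described in this abstract can be illustrated with a toy one-dimensional sketch. Everything in it is an assumption made for illustration and is not taken from the paper: cells are modeled as `(name, left edge, width)` tuples on a single axis, no cell straddles a cut line, and `pack_left` stands in for whatever compaction algorithm is plugged in (the method is described as algorithm-independent).

```python
from typing import Callable, List, Sequence, Tuple

Cell = Tuple[str, float, float]  # (name, left edge x, width) -- hypothetical model
Compactor = Callable[[List[Cell], float], List[Cell]]


def pack_left(cells: List[Cell], spacing: float) -> List[Cell]:
    """Toy 1-D compactor: keep left-to-right order, shrink every gap to `spacing`."""
    out: List[Cell] = []
    cursor = 0.0
    for name, _, width in sorted(cells, key=lambda c: c[1]):
        out.append((name, cursor, width))
        cursor += width + spacing
    return out


def partition_compact(cells: Sequence[Cell], cuts: Sequence[float],
                      compact: Compactor, spacing: float) -> List[Cell]:
    """Cut the layout at `cuts`, compact each block alone, then merge the blocks."""
    bounds = sorted(cuts)
    # 1. Cut: a cell goes to the block whose x-range contains its left edge.
    blocks: List[List[Cell]] = [[] for _ in range(len(bounds) + 1)]
    for cell in cells:
        blocks[sum(cell[1] >= b for b in bounds)].append(cell)
    # 2. Compact: each block is independent, so this step could run in parallel.
    compacted = [compact(block, spacing) for block in blocks]
    # 3. Merge: abut the compacted blocks left to right, `spacing` apart.
    merged: List[Cell] = []
    offset = 0.0
    for block in compacted:
        if not block:
            continue
        merged.extend((name, x + offset, width) for name, x, width in block)
        offset += max(x + width for _, x, width in block) + spacing
    return merged


layout = [("a", 0, 2), ("b", 5, 1), ("c", 10, 3), ("d", 20, 1)]
result = partition_compact(layout, cuts=[8], compact=pack_left, spacing=1)
```

In this toy case the partitioned result matches compacting the whole layout in one pass, while each call to the compactor only sees one block at a time, which is where the memory and run-time savings would come from.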

IEEE Journal of Solid-state Circuits | 1991

Hierarchical symbolic design methodology for large-scale data paths

Kimiyoshi Usami; Yukio Sugeno; Nobu Matsumoto; Shojiro Mori

A symbolic layout methodology for large-scale data paths is proposed. A gate-level symbolic expression, the logic transformation diagram, is adopted as a layout input. A mask layout is automatically generated from the symbolic expression. A hierarchical design method is used in combination with a bit-slice regular structure and a performance-determining irregular structure. A 1-b field of the bit-slice structure is designed symbolically and then compacted, and finally the entire data path is generated. Performance-determining irregular macrocells, such as adders with carry-look-ahead (CLA) circuits, are handcrafted independently and combined with the entire data path at the final step. To achieve high density and high performance, effort is focused on optimizing the layout of the 1-b field. Iteration of the editing and compaction loop can be executed in a short turnaround time (TAT). By the proposed methodology, a data path containing 21K transistors in a 32-b microprocessor has been successfully produced. Design productivity has been increased tenfold, achieving a layout density equivalent to that of the handcrafted design.

Custom Integrated Circuits Conference | 2002

A single-chip MPEG-2 codec based on customizable media microprocessor

A single-chip MPEG-2 MP@ML codec, integrating 3.8M gates on a 72-mm² die, is described. It has a heterogeneous multiprocessor architecture in which six microprocessors with the same instruction set but different customization execute specific tasks, such as video and audio, concurrently. The microprocessor, developed for digital media processing, provides various extensions inherent in its architecture, such as VLIW and DSP extensions. Making full use of the extensions, the chip executes encoding and decoding of video, audio, and system streams concurrently in real time.

Design, Automation, and Test in Europe | 2009

Design and implementation of scalable, transparent threads for multi-core media processor

Takeshi Kodaka; Shunsuke Sasaki; Takahiro Tokuyoshi; Ryuichiro Ohyama; Nobuhiro Nonogaki; Koji Kitayama; Tatsuya Mori; Yasuyuki Ueda; Hideho Arakida; Yuji Okuda; Toshiki Kizu; Yoshiro Tsuboi; Nobu Matsumoto

In this paper, we propose a scalable and transparent parallelization scheme using threads for multi-core processors. The performance achieved by our scheme scales with the number of cores, and the application program is not affected by the actual number of cores. For efficiency, we designed the threads so that they never suspend and do not start executing until the data they need are available. We implemented our design using three modules: the dependency controller, which controls dependencies among threads; the thread pool, which manages the ready threads; and the thread dispatcher, which fetches threads from the pool and executes them on a core. Our design and implementation provide efficient thread scheduling with low overhead. Moreover, by hiding the actual number of cores, the scheme achieves transparency. We confirmed the transparency and scalability of our scheme by applying it to an H.264 decoder program. With this scheme, the application program does not need to be modified even if the number of cores changes because of differing requirements. This feature shortens development time and helps reduce development cost.
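The three modules named in this abstract can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not the authors' implementation: `DependencyController`, `run_graph`, and the example task graph are hypothetical names, a `queue.Queue` plays the role of the thread pool, and ordinary Python threads stand in for cores.

```python
import queue
import threading


class DependencyController:
    """Counts unmet dependencies and moves tasks to the ready pool (hypothetical)."""

    def __init__(self, tasks, deps, pool, num_workers):
        self.pool = pool
        self.num_workers = num_workers
        self.pending = {name: len(deps.get(name, ())) for name in tasks}
        self.children = {name: [] for name in tasks}
        for name, needed in deps.items():
            for dep in needed:
                self.children[dep].append(name)
        self.remaining = len(tasks)
        self.results = {}
        self.lock = threading.Lock()

    def start(self):
        # Tasks with no dependencies are ready immediately.
        for name, count in self.pending.items():
            if count == 0:
                self.pool.put(name)

    def finished(self, name, value):
        with self.lock:
            self.results[name] = value
            self.remaining -= 1
            for child in self.children[name]:
                self.pending[child] -= 1
                if self.pending[child] == 0:
                    self.pool.put(child)      # all inputs available: now ready
            if self.remaining == 0:
                for _ in range(self.num_workers):
                    self.pool.put(None)       # sentinel: tell each dispatcher to stop


def run_graph(tasks, deps, num_cores):
    """Run `tasks` (name -> callable(results)) respecting `deps` (name -> names)."""
    if not tasks:
        return {}
    pool = queue.Queue()                      # ready-thread pool
    ctrl = DependencyController(tasks, deps, pool, num_cores)

    def dispatcher():
        # Fetch a ready task and run it to completion; tasks never suspend.
        while True:
            name = pool.get()
            if name is None:
                return
            ctrl.finished(name, tasks[name](ctrl.results))

    workers = [threading.Thread(target=dispatcher) for _ in range(num_cores)]
    ctrl.start()
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return ctrl.results


# Hypothetical task graph: b and c need a, d needs both b and c.
example_tasks = {
    "a": lambda r: 1,
    "b": lambda r: r["a"] + 1,
    "c": lambda r: r["a"] + 2,
    "d": lambda r: r["b"] * r["c"],
}
example_deps = {"b": ["a"], "c": ["a"], "d": ["b", "c"]}
```

The task graph itself never mentions the worker count, so calling `run_graph(example_tasks, example_deps, n)` with different values of `n` gives the same results, which is the kind of transparency the abstract describes.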

Design Automation Conference | 1990

Datapath generator based on gate-level symbolic layout

Nobu Matsumoto; Y. Watanabe; Kimiyoshi Usami; Yukio Sugeno; Hiroshi Hatada; Shojiro Mori

This paper describes a new datapath generator that generates high-density mask layouts equivalent to hand-crafted ones. The input to the generator is a hierarchical symbolic layout at the gate level. The bit-and-row-slicing technique is the key feature that enables large, high-density datapath generation. A 21K-transistor datapath was generated using 1-μm CMOS technology; its density is 5.64 KTr/mm², greater than the 5.38 KTr/mm² of a hand-crafted datapath.

Custom Integrated Circuits Conference | 1988

Symbolic design methodology for high-density macro-cell

Nobu Matsumoto; Yob Watanabe; Shojiro Mori

A design methodology is proposed for macro-cell design. Macro-cell layout is optimized at the symbolic design level. So-called versatile cells are used to make the best use of symbolic layout features. Unnecessary jogs are eliminated by fitting versatile cells into surrounding circuits. If a macro cell is composed of small circuit components, eliminating jogs greatly reduces the area needed to connect circuit components. The stick-diagram-generating system KIMERA was implemented to realize the methodology.

Asia and South Pacific Design Automation Conference | 2010

A new compilation technique for SIMD code generation across basic block boundaries

Hiroaki Tanaka; Yutaka Ota; Nobu Matsumoto; Takuji Hieda; Yoshinori Takeuchi; Masaharu Imai

Although SIMD instructions are effective for many digital signal processing applications, current compilers cannot take full advantage of them. One factor inhibiting SIMD code generation is control flow structure: the target scope of SIMD code generation is currently limited to a single basic block or a loop that consists of a single basic block. SIMD instructions typically cannot be mapped across basic block boundaries even if the basic blocks inside the control structure have enough parallelism. In this paper, a new compilation technique that generates SIMD code without modifying the control flow structure is proposed. The data dependency between basic blocks is exploited to generate SIMD instructions. A packing cost is introduced for effective vectorization while maintaining data dependency across basic block boundaries. Experimental results show that the new SIMD code generation technique reduced the dynamic execution cycles of inter prediction in an H.264 decoder by 67%.

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences | 2007

Generation of Pack Instruction Sequence for Media Processors Using Multi-Valued Decision Diagram

Hiroaki Tanaka; Yoshinori Takeuchi; Keishi Sakanushi; Masaharu Imai; Hiroki Tagawa; Yutaka Ota; Nobu Matsumoto

SIMD instructions are often implemented in modern multimedia-oriented processors. Although SIMD instructions are useful for many digital signal processing applications, most compilers do not exploit them. The difficulty in using SIMD instructions stems from data parallelism in registers: in assembly code generation, the positions of data in registers must be tracked. A technique for generating pack instructions, which pack or reorder data in registers, is essential for exploiting SIMD instructions. This paper presents a code generation technique for SIMD instructions with pack instructions. SIMD instructions are generated by finding and grouping the same operations in programs. After the SIMD instructions are generated, pack instructions are generated. In pack instruction generation, a Multi-valued Decision Diagram (MDD) is introduced to represent and manipulate sets of packed data. Experimental results show that the proposed code generation technique can generate assembly code with SIMD and pack instructions that repack 8 packed data items in registers for a RISC processor with a dual-issue coprocessor that supports SIMD and pack instructions. The proposed method achieved a speedup ratio of up to about 8.5 through the SIMD instructions and the multiple-issue mechanism of the target processor.
