Junichi Miyakoshi
Kobe University
Publication
Featured research published by Junichi Miyakoshi.
Symposium on VLSI Circuits | 2006
Yasuhiro Morita; Hidehiro Fujiwara; Hiroki Noguchi; Kentaro Kawakami; Junichi Miyakoshi; Shinji Mikami; Koji Nii; Hiroshi Kawaguchi; Masahiko Yoshimoto
This paper proposes a voltage-control scheme for an SRAM that lowers the minimum operating voltage to 0.3 V, even on a future memory-rich SoC. A self-aligned timing control guarantees stable operation over a wide range of Vdd in a DVS environment. Measurements of a 64-kb SRAM in a 90-nm process technology show a 30% power reduction at 100 MHz. The area overhead is only 5.6%.
IEICE Transactions on Electronics | 2005
Junichi Miyakoshi; Yuichiro Murachi; Koji Hamano; Tetsuro Matsuno; Masayuki Miyama; Masahiko Yoshimoto
This paper proposes a low-power systolic array architecture for a block-matching motion estimation processor IP for portable and high-resolution video applications. The architecture features a ring-connected processing element (PE) array that reduces computation cycles and memory access cycles at the same time, which lowers power consumption. The low number of memory access cycles allows a half-pel processing unit to operate concurrently with no extra cache. Furthermore, the architecture allows various summation schemes for absolute difference values. For that reason, it is applicable to various video coding modes such as the adaptive field/frame mode in MPEG2 and the multiple macroblock mode in H.264. When the architecture is applied to the design of an MPEG2 MP@HL motion estimation processor VLSI, the power consumption of the VLSI is reduced by 45–73% in comparison to cases with conventional
Asia and South Pacific Design Automation Conference | 2004
Yuki Kuroda; Junichi Miyakoshi; Masayuki Miyama; Kousuke Imamura; Hideo Hashimoto; Masahiko Yoshimoto
This paper describes a sub-mW motion estimation processor core for MPEG-4 video encoding. It features a Gradient Descent Search algorithm whose computation power is only 7% of that of the conventional 1:4-subsampling search, while producing higher picture quality. Another feature is an optimized SIMD datapath architecture that lowers the clock frequency and the operating voltage. It has been fabricated in a 5-metal 0.18-µm CMOS technology. The measured power consumption to process a QCIF 15 fps video is 0.4 mW under 0.85 [email protected] V.
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences | 2005
Yuichiro Murachi; Koji Hamano; Tetsuro Matsuno; Junichi Miyakoshi; Masayuki Miyama; Masahiko Yoshimoto
This paper describes a 95-mW MPEG2 MP@HL motion estimation processor core for portable, high-resolution video applications such as HD camcorders. It features a novel hierarchical algorithm and a low-power ring-connected systolic array architecture. It supports frame/field and bidirectional prediction with half-pel precision for 1920×1080@30fps video. The search range is ±128 × ±64. The ME core integrates 2.25M transistors in 3.1 mm × 3.1 mm using 0.18-micron technology.
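For orientation, the sketch below shows plain full-search block matching with a sum-of-absolute-differences (SAD) cost, the generic baseline that hierarchical schemes aim to avoid. It is not the paper's algorithm. The function names, the test image, and the assumption that the search range is inclusive on both sides are illustrative only.

```python
def sad(ref, block, x0, y0):
    """Sum of absolute differences between `block` and the area of `ref` at (x0, y0)."""
    return sum(
        abs(block[j][i] - ref[y0 + j][x0 + i])
        for j in range(len(block))
        for i in range(len(block[0]))
    )


def full_search(ref, block, bx, by, range_x, range_y):
    """Try every displacement in [-range, +range]; return (best_sad, mx, my)."""
    best = None
    for my in range(-range_y, range_y + 1):
        for mx in range(-range_x, range_x + 1):
            cost = sad(ref, block, bx + mx, by + my)
            if best is None or cost < best[0]:
                best = (cost, mx, my)
    return best


def candidate_count(range_x, range_y):
    """Number of displacements a full search visits (inclusive range assumed)."""
    return (2 * range_x + 1) * (2 * range_y + 1)


if __name__ == "__main__":
    # Hypothetical test image with no repeated blocks.
    ref = [[x * x + 3 * y * y for x in range(48)] for y in range(48)]
    # 8x8 block taken from ref at displacement (+5, -3) relative to (16, 16).
    block = [[ref[16 - 3 + j][16 + 5 + i] for i in range(8)] for j in range(8)]
    print(full_search(ref, block, 16, 16, 8, 8))  # (0, 5, -3)
    print(candidate_count(128, 64))               # 33153
```

Under the inclusive-range assumption, a ±128 × ±64 window holds 33,153 integer-pel candidates per block, which illustrates why an exhaustive search is costly at this resolution.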
IEICE Transactions on Electronics | 2006
Noriyuki Minegishi; Junichi Miyakoshi; Yuki Kuroda; Tadayoshi Katagiri; Yuki Fukuyama; Ryo Yamamoto; Masayuki Miyama; Kousuke Imamura; Hideo Hashimoto; Masahiko Yoshimoto
An optical flow processor architecture is proposed. It offers accuracy and image-size scalability for video segmentation extraction. The Hierarchical Optical flow Estimation (HOE) algorithm [1] is optimized to provide an appropriate bit-length and iteration number for VLSI implementation. The proposed processor architecture provides the following features. First, an algorithm-oriented data-path is introduced to execute all necessary processes of optical flow derivation, minimizing hardware cost. The data-path is designed using a 4-SIMD architecture, which enables high-throughput operation. It thereby achieves real-time optical flow derivation with 100% pixel density. Second, it has a scalable architecture for higher accuracy and higher resolution. A third feature is the CMOS-process-compatible on-chip 2-port DRAM for die-area reduction. The proposed processor handles CIF video at 30 fr/s with a 189-MHz clock frequency. Its estimated core size is 6.02 × 5.33 mm² in a six-metal 90-nm CMOS technology.
IEICE Transactions on Electronics | 2008
Yuichiro Murachi; Yuki Fukuyama; Ryo Yamamoto; Junichi Miyakoshi; Hiroshi Kawaguchi; Hajime Ishihara; Masayuki Miyama; Yoshio Matsuda; Masahiko Yoshimoto
This paper describes an optical-flow processor core for real-time video recognition. The processor is based on the Pyramidal Lucas and Kanade (PLK) algorithm. It features a smaller chip area, higher pixel rate, and higher accuracy than conventional optical-flow processors. Introducing search range limitation and the Carman filter to the original PLK algorithm improves the optical-flow accuracy and reduces the processor hardware cost. Furthermore, window interleaving and window overlap methods reduce the necessary clock frequency of the processor by 70%, which lowers power consumption. We first verified the PLK algorithm and architecture with a prototype FPGA implementation. Then, we designed a VLSI processor that can handle a VGA 30-fps image sequence at a clock frequency of 332 MHz. The core size and power consumption are estimated at 3.50 × 3.00 mm² and 600 mW, respectively, in a 90-nm process
Symposium on VLSI Circuits | 2005
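As background for the PLK algorithm named in this abstract, here is a minimal sketch of a single-level, single-window Lucas-Kanade step. It omits the pyramid, the iterations, and the paper's search range limitation, filtering, and window interleaving. The function name, the averaging of gradients over both frames, and the synthetic test frames are assumptions made for illustration.

```python
def lucas_kanade_window(frame0, frame1, cx, cy, radius):
    """Least-squares flow (u, v) for one window centred at (cx, cy).

    Solves  [sxx sxy; sxy syy] [u; v] = -[sxt; syt]  built from image gradients.
    Returns None when the 2x2 system is (near-)singular (aperture problem).
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            # Central-difference spatial gradients, averaged over both frames.
            ix = (frame0[y][x + 1] - frame0[y][x - 1]
                  + frame1[y][x + 1] - frame1[y][x - 1]) / 4.0
            iy = (frame0[y + 1][x] - frame0[y - 1][x]
                  + frame1[y + 1][x] - frame1[y - 1][x]) / 4.0
            # Temporal gradient.
            it = frame1[y][x] - frame0[y][x]
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        return None
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


if __name__ == "__main__":
    # Synthetic smooth frames: frame1 is frame0 shifted by (u, v) = (0.5, -0.25).
    frame0 = [[x * x + y * y for x in range(16)] for y in range(16)]
    frame1 = [[(x - 0.5) ** 2 + (y + 0.25) ** 2 for x in range(16)] for y in range(16)]
    print(lucas_kanade_window(frame0, frame1, 8, 8, 3))
```

A pyramidal variant would run this step from the coarsest image level to the finest, warping by the flow found at the previous level.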
Yuichiro Murachi; Tetsuro Matsuno; Koji Hamano; Junichi Miyakoshi; Masayuki Miyama; Masahiko Yoshimoto
This paper describes a 95-mW MPEG2 MP@HL motion estimation processor core for portable, high-resolution video applications such as HD camcorders. It features a novel hierarchical algorithm and a low-power ring-connected systolic array architecture. It supports frame/field and bidirectional prediction with half-pel precision for 1920×1080@30fps video. The search range is ±128 × ±64. The ME core integrates 2.25M transistors in 3.1 mm × 3.1 mm using 0.18-micron technology.
International Symposium on VLSI Design, Automation and Test | 2008
Yuichiro Murachi; Tetsuya Kamino; Junichi Miyakoshi; Hiroshi Kawaguchi; Masahiko Yoshimoto
This paper describes a unique SRAM architecture for super-parallel video processing. It features segmentation-free, one-cycle functional access to rectangular image data (n × m pixels). To achieve this accessibility, a local word-line select scheme and a merged X-decoder method are newly introduced, eliminating the extra X-decoder required when a conventional divided SRAM macro is used. The proposed SRAM has been adopted as the search window buffer of an H.264 motion estimation processor for HDTV-resolution video. As a result, the power and area of the search window buffer are reduced by 49% and 48%, respectively. Furthermore, the proposed SRAM is shown to be more efficient for super-HDTV-resolution video, which requires more parallelism.
IEICE Transactions on Electronics | 2008
Yuichiro Murachi; Junichi Miyakoshi; Masaki Hamamoto; Takahiro Iinuma; Tomokazu Ishihara; Fang Yin; Jangchung Lee; Hiroshi Kawaguchi; Masahiko Yoshimoto
We describe a sub-100-mW H.264 [email protected] integer-pel motion estimation processor core for low-power video encoders. It supports macroblock-adaptive frame/field (MBAFF) encoding and bidirectional prediction for a resolution of 1920×1080 pixels at 30 fps. The proposed processor features a novel hierarchical algorithm, a reconfigurable ring-connected systolic array architecture, and a segmentation-free, rectangle-access search window buffer. The hierarchical algorithm consists of a coarse search and a fine search. A complementary recursive cross search is newly introduced in the coarse search. The fine search is carried out adaptively, based on an image analysis result obtained by the coarse search. The proposed systolic array architecture minimizes the amount of transferred data and lowers the computation cycles for the coarse and fine searches. In addition, we propose a novel search window buffer SRAM that provides instantaneous access to a rectangular area at an arbitrary location. The processor core has been designed with a 90-nm CMOS design rule. The core size is 2.5 × 2.5 mm². One core supports a one-reference-frame search and dissipates 48 mW at 1 V. A two-core configuration consumes 96 mW for a two-reference-frame search.
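The coarse-then-fine structure described in this abstract can be illustrated with a generic two-stage block search. This sketch is not the complementary recursive cross search or the adaptive fine search of the paper. It simply samples the window on a sparse grid and then refines around the best coarse hit. All names, the SAD cost, and the `step` parameter are hypothetical.

```python
def sad(ref, block, x0, y0):
    """Sum of absolute differences between `block` and the area of `ref` at (x0, y0)."""
    return sum(
        abs(block[j][i] - ref[y0 + j][x0 + i])
        for j in range(len(block))
        for i in range(len(block[0]))
    )


def coarse_to_fine_search(ref, block, bx, by, search_range, step):
    """Two-stage search; returns (best_sad, mx, my).

    Stage 1 visits every `step`-th displacement in [-search_range, +search_range].
    Stage 2 visits all displacements within step-1 of the stage-1 winner.
    """
    best = None
    # Coarse stage: sparse grid over the whole window.
    for my in range(-search_range, search_range + 1, step):
        for mx in range(-search_range, search_range + 1, step):
            cost = sad(ref, block, bx + mx, by + my)
            if best is None or cost < best[0]:
                best = (cost, mx, my)
    _, cx, cy = best
    # Fine stage: dense neighbourhood of the coarse winner, clipped to the window.
    for my in range(cy - step + 1, cy + step):
        for mx in range(cx - step + 1, cx + step):
            if abs(mx) > search_range or abs(my) > search_range:
                continue
            cost = sad(ref, block, bx + mx, by + my)
            if cost < best[0]:
                best = (cost, mx, my)
    return best


if __name__ == "__main__":
    # Hypothetical test image with no repeated blocks.
    ref = [[x * x + 3 * y * y for x in range(48)] for y in range(48)]
    block = [[ref[16 - 4 + j][16 + 4 + i] for i in range(8)] for j in range(8)]
    print(coarse_to_fine_search(ref, block, 16, 16, 8, 4))  # (0, 4, -4)
```

With a ±8 window and a step of 4, this visits 25 coarse candidates plus at most 49 fine ones instead of 289. The trade-off is that the coarse grid can miss the true minimum on textured content.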
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences | 2006
Yasuhiro Morita; Hidehiro Fujiwara; Hiroki Noguchi; Kentaro Kawakami; Junichi Miyakoshi; Shinji Mikami; Koji Nii; Hiroshi Kawaguchi; Masahiko Yoshimoto
We propose a voltage control scheme for 6T SRAM cells that lowers the minimum operating voltage to 0.3 V in a DVS environment. The supply voltage to the memory cells and wordline drivers, the bitline voltage, and the body bias voltage of the load pMOSFETs are controlled according to read and write operations, which secures operating margins even at a low operating voltage. A self-aligned timing control with a dummy wordline and its feedback is also introduced to guarantee stable operation over a wide range of supply voltages. Measurements of a 64-kb SRAM in a 90-nm process technology show that a power reduction of 30% can be achieved at 100 MHz. In a 65-nm 64-Mb SRAM, a 74% power saving is expected at 1/6 of the maximum operating frequency. The performance penalty of the proposed scheme is less than 1%, and the area overhead is 5.6%.
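To see why a lower minimum operating voltage matters under DVS, the sketch below uses the standard first-order CMOS dynamic power relation, P ∝ C·Vdd²·f. It ignores leakage and the extra control circuitry, so it does not reproduce the measured 30% or projected 74% figures above. The function name and the example operating points are assumptions.

```python
def dynamic_power_ratio(vdd, freq, vdd_ref, freq_ref):
    """Dynamic power at (vdd, freq) relative to (vdd_ref, freq_ref).

    First-order model: P_dyn = alpha * C * Vdd^2 * f, with the activity factor
    alpha and the switched capacitance C assumed unchanged between the two points.
    """
    return (vdd / vdd_ref) ** 2 * (freq / freq_ref)


if __name__ == "__main__":
    # Hypothetical operating points, not taken from the paper.
    print(dynamic_power_ratio(0.5, 100e6, 1.0, 100e6))  # 0.25
    print(dynamic_power_ratio(0.5, 50e6, 1.0, 100e6))   # 0.125
```

Because the voltage term is quadratic, each step down in Vdd that the memory can tolerate saves more than an equal relative step down in frequency.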