Publication


Featured research published by Ichiro Kuroda.


International Conference on Acoustics, Speech, and Signal Processing | 1999

Radix-4 FFT implementation using SIMD multimedia instructions

Kouhei Nadehara; Takashi Miyazaki; Ichiro Kuroda

A fast radix-4 complex FFT implementation using 4-parallel SIMD instructions is presented. Four radix-4 butterflies are calculated in parallel at all stages by loading four consecutive elements into a register. At the last stage, every four elements are packed into a register and calculated in parallel. This regular data flow enables higher parallelism and reduces the overhead of data format conversion. Implemented on the V830R processor, which has a 4-parallel SIMD-type multimedia instruction set, it achieves practical performance quite competitive with high-end parallel DSPs. Multiply-accumulate instructions with symmetrical rounding, introduced in the V830R processor, are effective in maintaining FFT accuracy.
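As a rough illustration of the radix-4 butterfly named in this abstract, here is a minimal scalar sketch in Python. It is not the paper's implementation: it uses floating-point complex numbers and recursion, and it does not model the V830R's SIMD packing, fixed-point arithmetic, or rounding. The function name `fft_radix4` is hypothetical.

```python
import cmath

def fft_radix4(x):
    """Recursive radix-4 decimation-in-time FFT; len(x) must be a power of 4."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    if n == 0 or n % 4 != 0:
        raise ValueError("length must be a power of 4")
    # Four interleaved sub-transforms of length n/4.
    subs = [fft_radix4(x[r::4]) for r in range(4)]
    q = n // 4
    out = [0j] * n
    for k in range(q):
        # Apply twiddle factors to the three non-trivial butterfly inputs.
        a = subs[0][k]
        b = subs[1][k] * cmath.exp(-2j * cmath.pi * k / n)
        c = subs[2][k] * cmath.exp(-2j * cmath.pi * 2 * k / n)
        d = subs[3][k] * cmath.exp(-2j * cmath.pi * 3 * k / n)
        # Radix-4 butterfly: the only remaining factors are 1, -1, j, and -j.
        out[k] = a + b + c + d
        out[k + q] = a - 1j * b - c + 1j * d
        out[k + 2 * q] = a - b + c - d
        out[k + 3 * q] = a + 1j * b - c - 1j * d
    return out
```

Within the butterfly itself the factors are only 1, -1, j, and -j, which amount to sign changes and real/imaginary swaps. That is part of what makes a radix-4 structure attractive for packed arithmetic: the four outputs of each butterfly follow the same add/subtract pattern.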


International Conference on Acoustics, Speech, and Signal Processing | 1998

H.263 mobile video codec based on a low power consumption digital signal processor

Yukihiro Naito; Ichiro Kuroda

This paper describes an H.263 video codec implementation based on a low-power general-purpose DSP. Fast algorithms, such as a fast motion estimation algorithm and a low-complexity noise reduction filter, are proposed to implement the video codec on a single DSP chip while maintaining sufficient picture quality. Using a 50 MIPS, 100 mW DSP, the developed codec encodes and decodes 7.5 QCIF frames per second, which is sufficient performance for low bit-rate video compression, typically below 64 kbps.


International Conference on Acoustics, Speech, and Signal Processing | 2001

A low-power programmable DSP core architecture for 3G mobile terminals

Takahiro Kumura; Daiji Ishii; Masao Ikekawa; Ichiro Kuroda; Makoto Yoshida

We have developed a new-generation, general-purpose digital signal processor (DSP) core with low power dissipation for use in third-generation (3G) mobile terminals. The DSP core employs a 4-way VLIW (very long instruction word) approach, as well as a dual-multiply-accumulate (dual-MAC) architecture with good orthogonality. It is able to run both video and speech codecs for 3G wireless communications at 384 kbit/s with a power consumption of approximately 50 mW. This paper presents an overview of both the DSP core architecture and the DSP instruction set, and it also gives some application benchmarks.


International Conference on Acoustics, Speech, and Signal Processing | 1997

Code positioning to reduce instruction cache misses in signal processing applications on multimedia RISC processors

Hans-Joachim Stolberg; Masao Ikekawa; Ichiro Kuroda

Real-time operation of signal processing applications on multimedia RISC processors is often limited by high instruction cache miss rates of direct-mapped caches. In this paper, a heuristic approach is presented which reduces high instruction cache miss rates in direct-mapped caches by code positioning. The proposed algorithm rearranges functions in memory based on trace data so as to minimize cache line conflicts. Moreover, a new method to extract potential cache misses from trace data is introduced which enables accurate cache behavior analysis and greatly enhances code positioning efficiency. Application of code positioning to an MPEG-1 video decoder implementation on the V830 multimedia RISC processor reduced instruction cache refill cycles by 66-98%. The proposed code positioning algorithm does not require hardware modifications; it can easily be integrated in an object linker to automate the optimization process.
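To make the idea of cache line conflicts in a direct-mapped cache concrete, the following Python sketch counts the cache lines shared by functions that alternate in a trace, for a given function ordering. It is only an illustration and not the paper's heuristic. The cache geometry (32-byte lines, 128 lines), the function sizes, and the names `cache_lines` and `layout_conflicts` are all assumptions.

```python
def cache_lines(start, size, line_size=32, num_lines=128):
    """Direct-mapped cache line indices covered by the bytes [start, start + size)."""
    first = start // line_size
    last = (start + size - 1) // line_size
    return {block % num_lines for block in range(first, last + 1)}

def layout_conflicts(order, sizes, hot_pairs, line_size=32, num_lines=128):
    """Count cache lines shared by function pairs that alternate in the trace."""
    addr, lines = 0, {}
    for name in order:
        # Functions are placed back to back in the given order.
        lines[name] = cache_lines(addr, sizes[name], line_size, num_lines)
        addr += sizes[name]
    return sum(len(lines[a] & lines[b]) for a, b in hot_pairs)

# Hypothetical program: f and g call each other often, cold is rarely executed.
sizes = {"f": 1024, "cold": 3072, "g": 1024}
hot_pairs = [("f", "g")]
bad = layout_conflicts(["f", "cold", "g"], sizes, hot_pairs)   # g lands exactly one cache size after f
good = layout_conflicts(["f", "g", "cold"], sizes, hot_pairs)  # f and g occupy disjoint lines
```

With these assumed numbers, the first ordering maps `f` and `g` onto the same 32 lines, so each call between them evicts the other, while the second ordering removes the overlap without changing any code. A linker-integrated tool would search over orderings with a cost function of this general kind.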


International Symposium on Circuits and Systems | 1998

Low-energy programmable finite field data path architectures

Leilei Song; Keshab K. Parhi; Ichiro Kuroda; Takao Nishitani

This paper considers the implementation of finite field multiplication data paths in a domain-specific programmable digital signal processor (DS-PDSP), where special hardware units and corresponding instructions are assumed to be used to program the finite field multiplication operation. These multiplication data paths are designed to accommodate programmability with respect to the primitive polynomial as well as the field order. Three types of multipliers are considered: semi-systolic array (in both least-significant-bit-first and most-significant-bit-first modes), fully parallel, and the proposed approach, where polynomial multiplication and polynomial modulo operations are implemented separately and two instructions, MAC and DEGRED, are assigned to them, respectively. Two approaches are considered for achieving programmability with respect to the field order: either special control circuitry or pre- and post-logical shifting operations. It is concluded that the one-level pipelined fully parallel multiplier without control circuitry consumes the least energy at the component level when only one multiplication is considered. However, at the system level, when vector-vector multiplications, common in most DSP algorithms, are considered, the proposed approach is able to achieve a 70% energy reduction at the expense of increasing the total instruction count by one.
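The split between polynomial multiplication and polynomial modulo reduction can be sketched in software as follows. This is a minimal Python model, not the paper's data path: the function names are hypothetical, field elements are held as integer bit masks, and the default polynomial 0x11D (x^8 + x^4 + x^3 + x^2 + 1) is an assumed example of a primitive polynomial for GF(2^8). Accumulating unreduced products and reducing once at the end is one plausible reading of why the vector-vector case costs only one extra instruction; the abstract does not spell this out.

```python
def poly_mul(a, b):
    """Carry-less multiply of two GF(2) polynomials held as integer bit masks."""
    result = 0
    while b:
        if b & 1:
            result ^= a  # addition in GF(2) is XOR
        a <<= 1
        b >>= 1
    return result

def poly_reduce(p, prim_poly, m):
    """Reduce p modulo a degree-m polynomial (bit m of prim_poly must be set)."""
    for bit in range(p.bit_length() - 1, m - 1, -1):
        if p & (1 << bit):
            p ^= prim_poly << (bit - m)  # cancel the term of degree `bit`
    return p

def gf_mul(a, b, prim_poly=0x11D, m=8):
    """Single field multiplication: multiply, then reduce."""
    return poly_reduce(poly_mul(a, b), prim_poly, m)

def gf_dot(xs, ys, prim_poly=0x11D, m=8):
    """Vector-vector product: accumulate unreduced products, reduce once."""
    acc = 0
    for x, y in zip(xs, ys):
        acc ^= poly_mul(x, y)
    return poly_reduce(acc, prim_poly, m)
```

Because reduction is linear over GF(2), reducing the accumulated sum gives the same result as reducing every product separately. Passing `prim_poly` and `m` as parameters mirrors the programmability with respect to the primitive polynomial and the field order that the abstract describes.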


International Conference on Acoustics, Speech, and Signal Processing | 1998

Low-energy heterogeneous digit-serial Reed-Solomon codecs

Leilei Song; Keshab K. Parhi; Ichiro Kuroda; Takao Nishitani

Reed-Solomon (RS) codecs are used for error control coding in many applications such as digital audio, digital TV, software radio, CD players, and wireless and satellite communications. This paper considers software-based implementation of RS codecs where special instructions are assumed to be used to program finite field multiplication datapaths inside a domain-specific programmable digital signal processor (DS-PDSP). A heterogeneous digit-serial approach is presented, where the heterogeneity corresponds to the use of different digit sizes in the multiply-accumulate (MAC, for polynomial multiplication) and degree reduction (DEGRED, for the polynomial modulo operation) subarrays. The salient feature of this digit-serial approach is that only the digit cells are implemented in hardware; the finite field multiplications are performed digit-serially in software by dynamically scheduling the internal digit-level operations in RS encoders and decoders. It is concluded that, for 2-error-correcting RS(n,k) codec implementations over the finite field GF(2^8), a parallel MAC unit (of digit size 8) and a DEGRED unit with digit size 2 form the best datapath with respect to energy consumption and energy-delay product. With this datapath architecture and appropriate digit-serial scheduling strategies, more than 60% energy reduction and more than 1/3 energy-delay reduction can be achieved compared with the approach based on a parallel multiplication datapath.


IEEE Transactions on Acoustics, Speech, and Signal Processing | 1987

A CCITT standard 32 kbit/s ADPCM LSI codec

Takao Nishitani; Ichiro Kuroda; Masao Satoh; Tadaharu Katoh; Yasuo Aoki

An LSI ADPCM codec, which is based on the CCITT standard 32 kbit/s algorithm, has been developed. After thoroughly investigating the complex CCITT specifications for arithmetic operations, a software-controllable custom LSI processor approach was chosen to reduce the amount of hardware and the power dissipation. The processor architecture is optimized for the CCITT algorithm. A reconfigurable pipeline multiplier-normalizer-accumulator circuit is effectively utilized for floating-point multiply-and-add operations in the predictor output calculation, nonlinear PCM to/from linear PCM code conversions, and power-of-2 multiplications used in adaptation logic units. A microinstruction set was chosen to perform efficient binary tree search processing for ADPCM quantization, in addition to performing parallel processing in independent processor resources. The LSI chip, implemented in 2.5 μm CMOS technology, has an 8.2 × 7.4 mm die size and dissipates only 90 mW.


International Conference on Acoustics, Speech, and Signal Processing | 1985

A CCITT standard 32 kbps ADPCM LSI codec

Takao Nishitani; Ichiro Kuroda; M. Satoh; T. Katoh; R. Fukuda; Y. Aoki

An LSI ADPCM codec, which is based on the CCITT standard 32 kbps algorithm, has been developed. The LSI chip has been designed as a software-controllable signal processor whose architecture is optimized for the CCITT algorithm. A reconfigurable pipeline multiplier-normalizer-accumulator circuit is effectively utilized for realizing the complex ADPCM specifications. The LSI chip, implemented in 2.5 µm CMOS technology, dissipates only 90 mW.


IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences | 2006

Special Section on Papers Selected from the 20th Symposium on Signal Processing

Ichiro Kuroda

Vinylidene fluoride polymer ultrafiltration membranes are prepared by casting a sheet of said polymer dissolved in a mixture of a specified solvent and a specified non-solvent, on a smooth substrate, evaporating a portion of the solvent from the sheet, immersing said sheet in a gelation liquid therefore, and optionally, stabilizing the gelled sheet by heat treatment thereof. A porous vinylidene fluoride polymer membrane having smooth, unwrinkled surfaces can be prepared in accordance with the above described process without restraining the membrane during the evaporation and gelation steps by utilizing triethyl phosphate as the solvent.


International Conference on Multimedia and Expo | 2001

Multimedia signal processor for mobile applications

Masao Ikekawa; Masatsugu Hori; Kouhei Nadehara; Takahiro Kumura; Makoto Yoshida; Ichiro Kuroda; Takao Nishitani

This paper describes an efficient architecture enhancement for video codecs on a new-generation, general-purpose digital signal processor (DSP) core called SPXK5, developed for handheld devices. With the high-performance features of SPXK5's base architecture, an MPEG-4 video codec can be implemented efficiently. In addition, only a few SIMD-type instructions accelerate the MPEG-4 video codec implementation by 20% with only a 2.5% hardware increase. By reducing the cycle count, the DSP's power consumption can be reduced. Both video and speech codecs for 3G mobile service at 384 kbps can be realized with a power consumption of less than 50 mW.
