An-Nan Suen

National Cheng Kung University

Publication

Featured research published by An-Nan Suen.

international symposium on circuits and systems | 1993

A high throughput-rate architecture for 8×8 2D DCT

Ming-Hwa Sheu; Jau-Yien Lee; Jhing-Fa Wang; An-Nan Suen; Liang-Ying Liu

A new architecture for VLSI implementation of an 8 × 8 2D discrete cosine transform (DCT) is proposed. The main merits of this architecture are: (1) the multipliers are replaced by memory look-up tables; (2) no input registers are required to save a column of input data; (3) the chip performance is independent of data width; and (4) the latency (the largest delay path) is short.
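The first merit, replacing multipliers with look-up tables, can be illustrated with a minimal sketch. It is not the paper's architecture: it assumes 8-bit unsigned input samples, shows only a 1D 8-point DCT-II, and stores floating-point products where hardware would store fixed-point words. The names `PRODUCT_ROM`, `dct8_lookup`, and `dct8_direct` are hypothetical.

```python
import math

N = 8  # transform length


def dct_coefficient(k, n):
    """Orthonormal DCT-II coefficient for output index k and input index n."""
    scale = math.sqrt(0.5) if k == 0 else 1.0
    return 0.5 * scale * math.cos((2 * n + 1) * k * math.pi / (2 * N))


# Assumption: inputs are 8-bit samples, so every constant-coefficient product
# can be precomputed. PRODUCT_ROM[k][n][v] holds dct_coefficient(k, n) * v.
PRODUCT_ROM = [
    [[dct_coefficient(k, n) * v for v in range(256)] for n in range(N)]
    for k in range(N)
]


def dct8_lookup(samples):
    """1D 8-point DCT using only table reads and additions."""
    return [
        sum(PRODUCT_ROM[k][n][samples[n]] for n in range(N))
        for k in range(N)
    ]


def dct8_direct(samples):
    """Reference 1D 8-point DCT using explicit multiplications."""
    return [
        sum(dct_coefficient(k, n) * samples[n] for n in range(N))
        for k in range(N)
    ]
```

A 2D transform could apply such a 1D stage along rows and then columns, although the abstract does not say whether the proposed design is organized that way.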

IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing | 1992

A data-reuse architecture for gray-scale morphologic operations

Ming-Hwa Sheu; Jhing-Fa Wang; Jer-Sheng Chen; An-Nan Suen; Yuan-Long Jeang; Jau-Yien Lee

Presents an efficient pipeline architecture to perform gray-scale morphologic operations. The features of the architecture are 1) lower hardware cost, 2) faster operation time in processing an image, 3) lower data access times from the image memory, 4) shorter latency, 5) suitability for VLSI implementation, and 6) adaptability for N × N morphologic operations.
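For readers unfamiliar with the operations being accelerated, the following is a minimal software sketch of N × N gray-scale dilation and erosion. It assumes a flat (all-zero) structuring element, an odd N, and windows clipped at the image border; the abstract does not state these details, and the sketch says nothing about the pipeline or data-reuse scheme itself. All function names are hypothetical.

```python
def _window_op(image, n, reduce_fn):
    """Apply reduce_fn (max or min) over an n x n window centred on each pixel."""
    rows, cols = len(image), len(image[0])
    r = n // 2  # window radius; n is assumed odd
    out = []
    for y in range(rows):
        out_row = []
        for x in range(cols):
            # Collect the window, clipped at the image border (an assumption).
            window = [
                image[yy][xx]
                for yy in range(max(0, y - r), min(rows, y + r + 1))
                for xx in range(max(0, x - r), min(cols, x + r + 1))
            ]
            out_row.append(reduce_fn(window))
        out.append(out_row)
    return out


def gray_dilate(image, n):
    """Flat gray-scale dilation: each output pixel is the window maximum."""
    return _window_op(image, n, max)


def gray_erode(image, n):
    """Flat gray-scale erosion: each output pixel is the window minimum."""
    return _window_op(image, n, min)
```

Neighbouring windows overlap in all but one column, which is the kind of redundancy a data-reuse design can exploit to reduce image-memory accesses.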

international symposium on circuits and systems | 1995

A cepstrum chip: architecture and implementation

An-Nan Suen; Jhing-Fa Wang; Yuen-Lin Chiang

The cepstrum coefficients have been widely used for speech signal representation and play a very important role in recognition accuracies. We present a low cost architecture for VLSI implementation of LPC-based cepstrum algorithm. The circuit performs the cepstrum operation for each frame of the speech data. A pipelining architecture leads to high speed performance up to speech recognition rate. The cepstrum chip is fabricated in 1.2 μm double-metal CMOS technology after the physical design and circuit verification. On the whole, the chip can process 18.3 MHz sampled data and it contains about 24000 transistors which occupy 227.5 × 213.3 mil² area. It has been shown to be fully functional and is the first working cepstrum chip.
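The abstract does not spell out the LPC-based cepstrum computation. A minimal sketch of the standard textbook recursion is given below, assuming the all-pole model H(z) = 1 / (1 - Σ a_k z^-k); other sign conventions for the predictor coefficients change the signs in the recursion. The function name `lpc_to_cepstrum` is hypothetical, and the gain term c_0 is omitted.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Convert LPC coefficients a[0..p-1] (a_1..a_p) to cepstrum c_1..c_n_ceps.

    Assumes H(z) = 1 / (1 - sum_k a_k z^-k). Recursion:
        c_n = a_n + sum_{k=1}^{n-1} (k/n) * c_k * a_{n-k},  with a_n = 0 for n > p.
    """
    p = len(a)
    c = [0.0] * (n_ceps + 1)  # c[0] is unused (gain term omitted)
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        # Only terms with 1 <= n - k <= p contribute.
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]
```

For a first-order predictor with a_1 = a, the recursion reproduces the closed form c_n = aⁿ / n, which gives a quick sanity check.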

international conference on acoustics speech and signal processing | 1996

A programmable application-specific CELP processor with parallel architectures

An-Nan Suen; Jhing-Fa Wang; Bor-Yueh Liu

The code excited linear predictive (CELP) coder has been widely used as the most effective technique among various linear predictive coding methods for speech compression. However, it is computationally intensive and general-purpose DSP chips are usually not powerful enough to handle such coding algorithms. The CELP processor architecture and a VLSI implementation are presented. A programmable application-specific single chip design for the CELP algorithm will drastically reduce the cost and achieve real-time performance. The CELP processor is programmable and contains a specific modular design for the codebook searches. On the whole, the chip can process 40 MHz sampled speech data. The FS1016 CELP coder was implemented on this processor; that is, we can encode the speech data at 4.8 kbps in real time using this single chip. Fabricated in 0.8 μm double-metal CMOS technology, the chip size is 6.3 × 6.1 mm² and is the first chip designed for CELP.

Integration | 1997

VLSI architecture and implementation for FS1016 CELP decoder with reduced power and memory requirements

An-Nan Suen; Jhing-Fa Wang; Jia-Lang Lin

The code-excited linear predictive (CELP) coder is the most effective technique among various linear predictive coding methods for speech compression. Hence, designing a relatively low-cost and low-power CELP decoder chip for the portable systems and wireless digital communication environment becomes increasingly important. This paper presents the VLSI architecture and chip implementation for the FS1016 CELP decoder with reduced power and memory requirements. A single-chip implementation of the CELP decoder drastically reduces the cost and the size of many CELP vocoder systems. The decoder chip can achieve the following: (1) excellent accuracy results due to the accuracy studies for the finite word length, (2) power savings and high-speed operations owing to the combined advantages of pipeline and concurrent processing structures, (3) table size reducing by applying the memoryless realization for stochastic codebook and partial sums technique and (4) specification satisfying the FS1016 CELP coder. Fabricated with 0.8 μm double-metal CMOS technology, the chip contains approximately 13 000 transistors occupying a 6.1 × 6.2 mm² area. It has been tested to be fully functional at IMS XL-60 tester and is the first working FS1016 CELP 4.8 K decoder chip. In particular, the fixed-point accuracy studies for the CELP decoder are also addressed herein.

international conference on consumer electronics | 1997

VLSI Implementation Of 3-D Sound Generator

An-Nan Suen; Jhing-Fa Wang; Jia-Ching Wang

We propose a VLSI architecture for the 3-D sound generator. The main feature of the proposed 3-D sound architecture is its high performance for sound source localization. We maintain strong localization cues by using a long head related transfer function (HRTF). Though a short HRTF can reduce the computation load, the localization cues will degrade. A fast algorithm and an efficient architecture are proposed to generate 3-D sound with the long HRTF at a 44.1 kHz sampling rate. In addition, a crossfading effect is used for smoothly moving sound sources.
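The two ideas in this abstract, HRTF filtering and crossfading for a moving source, can be sketched in software as follows. This is a direct-form illustration only, not the paper's fast algorithm: it assumes the HRTF is available as time-domain impulse responses (one tap list per ear), uses a linear crossfade over one block, and does not carry filter state across blocks. All names are hypothetical.

```python
def fir_filter(signal, taps):
    """Direct-form FIR convolution, truncated to the input length."""
    out = [0.0] * len(signal)
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k < 0:
                break  # no samples before the start of the block
            acc += h * signal[n - k]
        out[n] = acc
    return out


def crossfade(old, new):
    """Linear fade from `old` to `new` across a block of equal-length signals."""
    n = len(old)
    if n <= 1:
        return list(new)
    return [((n - 1 - i) * o + i * w) / (n - 1)
            for i, (o, w) in enumerate(zip(old, new))]


def render_moving_block(block, hrir_old, hrir_new):
    """Render one mono block to (left, right) while the source moves.

    hrir_old and hrir_new are (left_taps, right_taps) pairs for the previous
    and the new source position; both are assumptions of this sketch.
    """
    left = crossfade(fir_filter(block, hrir_old[0]), fir_filter(block, hrir_new[0]))
    right = crossfade(fir_filter(block, hrir_old[1]), fir_filter(block, hrir_new[1]))
    return left, right
```

The cost of `fir_filter` grows with the number of taps, which is why a long HRTF at 44.1 kHz motivates a faster algorithm and dedicated hardware.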

international conference on electronics circuits and systems | 2001

A programmable application-specific VLSI architecture for speech recognition

Jia-Ching Wang; Jhing-Fa Wang; An-Nan Suen; Yu-Sheng Weng

In this paper, we present an efficient VLSI architecture for the stand-alone application of a speech recognition system. With the analysis of the computation complexity, mel frequency cepstrum extraction and Bayesian neural network operations are the most time consuming computation tasks in the recognition algorithm. The specific recognition core to deal with them is proposed based on a much improved algorithm. The construction of the special logarithm look-up table saves on computation time and drastically reduces the memory size. Moreover, the cost efficient programmable architecture is designed for other non computation-intensive operations. The best aspects of both programmable and application specific architectures including the performance, design complexity, and flexibility are incorporated in the proposed VLSI speech recognizer.

international symposium on circuits and systems | 1997

On the fixed-point error analysis and VLSI architecture for FS1016 CELP decoder

An-Nan Suen; Jhing-Fa Wang; Horng-Jei Chang

In this paper, the fixed-point accuracy analysis and VLSI architecture of the FS1016 CELP decoder are presented. The code excited linear predictive (CELP) coder is the most effective technique among various linear predictive coding methods for speech compression. Hence, designing a low-cost and low-power CELP decoder chip for portable systems and the wireless digital communication environment becomes increasingly important. The decoder VLSI architecture can achieve (1) excellent accuracy results due to the accuracy studies for the finite word length, (2) power saving and high speed operations resulting from the combined advantages of pipeline and concurrent processing for LSEs interpolating and cosine operation, (3) table size reducing by applying the memoryless realization for stochastic codebook and partial sums technique, and (4) specification satisfying the FS1016 CELP coder.

international symposium on neural networks | 1995

A Bayesian neural network chip design for speech recognition system

Jhing-Fa Wang; An-Nan Suen; Jia-Ru Lee; Chung-Hsien Wu

The Bayesian neural network (BNN) has been widely used as speech recognition template which combines the merits of the dynamic programming (DP) and hidden Markov model (HMM) methods. However, it is computationally intensive and very costly to implement using DSP component. A single chip implementation of the BNN will drastically reduce the cost and the size of many speech recognition systems. It will also make low cost implementation of real-time speech recognition system possible. In this paper, the implementation of single BNN chip for the real-time speech recognizer is presented. Fabricated in 0.8 μm double-metal CMOS technology, the chip contains approximately 13000 transistors which occupy a 3.1 × 3.2 mm² area and has been tested to be fully functional at IMS XL-60 tester.

international conference on consumer electronics | 1998

1.2 kbps FBLPC Vocoder with Applications in Phone-to-Phone over Internet

Jhing-Fa Wang; Jia-Ching Wang; An-Nan Suen; Yun-Fei Chao; Chao-Yong Wang

Since the Mixed-Excitation Linear Predictive (MELP) vocoder was selected as the new 2.4 kbps Federal Standard speech coder in 1996 [1], very low bit rate speech coding has become a new research trend. This paper presents a new LPC vocoder model (FBLPC) based on forward-backward waveform prediction [2] for very low bit rate speech coding. We also use this technology to build an application system named phone-to-phone over internet.

1.2 kbps FBLPC Vocoder

The basic structure of the FBLPC vocoder is based on traditional LPC. The most annoying property of the output speech of the basic LPC vocoder is its strong buzzy quality, which arises because the vocoder cannot produce suitable excitation. At low bit rates the quality of synthesized voiced speech is superior to that of unvoiced speech because there is no important long-term correlation for unvoiced speech. Voiced speech coding technology evidently becomes more important. The FBLPC vocoder uses mixtures of pulse and noise excitation for voiced speech. In addition, we apply a smoothed excitation filter to both voiced and unvoiced excitation; it smooths the excitation and the synthesized speech. The block diagram of the FBLPC is shown in Fig. 1.

The forward-backward waveform prediction method is adopted to reduce the bit rate. The basic idea of the algorithm is to encode and transmit only some representative waveforms (RW) using the LPC model. The other waveforms, which are not encoded and transmitted, are reconstructed in the synthesis stage by an interpolating method called forward-backward waveform prediction. This interpolating method reduces the bit rate sharply, and the vocoder can still produce intelligible speech. A 1.2 kbps FBLPC vocoder based on this model has been designed and implemented in real time.

In the encoder, the RW is encoded and transmitted. The 10th-order LPC coefficients are determined by the autocorrelation technique with a Hamming window and coded using scalar quantization of the LSP parameters, which have better characteristics in interpolation and quantization. The pitch is estimated by detecting 0.6 of the maximum amplitude. In the decoder, if the size of the prediction waveform (PW) is too large, the synthetic speech quality will degrade markedly. However, if it is too small, a high compression rate is not achieved. Based on experiments, we choose 15 ms (120 samples) for a PW frame. The parameters and bit allocation of the 1.2 kbps FBLPC vocoder are shown in Table 1 and Table 2. This 1.2 kbps FBLPC vocoder is implemented in the Visual C++ language and runs on an IBM compatible PC or any computer running Win95 or WinNT. The mean opinion score (MOS) obtained from an informal listening test is shown in Table 3.

Phone-to-phone over internet

The phone-to-phone over internet system is a phone-to-phone system that uses internet resources. The diagram is shown in Fig. 2. For example, if a user in Taipei wants to call another user in Chicago by telephone, he can first call the server in Taipei. Then the server in Taipei connects to the server in Chicago over the internet. Finally, the server in Chicago calls the user in Chicago. Thus, they can talk by telephone over the internet. The phone-to-phone over internet system adopts the 1.2 kbps FBLPC vocoder, and there are several advantages: one is
