
Ta-Wen Kuan

National Cheng Kung University

Publication

Featured research published by Ta-Wen Kuan.

IEEE Transactions on Very Large Scale Integration Systems | 2012

VLSI Design of an SVM Learning Core on Sequential Minimal Optimization Algorithm

Ta-Wen Kuan; Jhing-Fa Wang; Jia-Ching Wang; Po-Chuan Lin; Gaung-Hui Gu

The sequential minimal optimization (SMO) algorithm has been extensively employed to train the support vector machine (SVM). This work presents an efficient application-specific integrated circuit (ASIC) chip design for sequential minimal optimization. The chip is implemented as an intellectual property core, suitable for use in an SVM-based recognition system on a chip. The proposed SMO chip was tested and found to be fully functional, using a prototype system based on the Altera DE2 board with a Cyclone II 2C70 field-programmable gate array.
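For readers unfamiliar with SMO, the core of the algorithm is an analytic update of two Lagrange multipliers at a time. The sketch below follows the textbook two-variable step in software; it is an illustration only and is not taken from the chip design described in the abstract. The function name `smo_pair_update` and its parameters are hypothetical.

```python
def smo_pair_update(alpha_i, alpha_j, y_i, y_j, e_i, e_j, k_ii, k_jj, k_ij, c):
    """One analytic SMO step on a pair of multipliers (textbook form).

    y_i, y_j are labels in {-1, +1}; e_i, e_j are prediction errors f(x) - y;
    k_* are kernel values; c is the box constraint. Returns (new_i, new_j).
    """
    # Bounds on alpha_j that keep both multipliers in [0, c] while
    # preserving the equality constraint sum(alpha * y) = const.
    if y_i != y_j:
        low = max(0.0, alpha_j - alpha_i)
        high = min(c, c + alpha_j - alpha_i)
    else:
        low = max(0.0, alpha_i + alpha_j - c)
        high = min(c, alpha_i + alpha_j)

    eta = k_ii + k_jj - 2.0 * k_ij  # curvature along the constraint line
    if low >= high or eta <= 0.0:
        return alpha_i, alpha_j  # no progress possible on this pair

    # Unconstrained optimum for alpha_j, then clip to [low, high].
    new_j = alpha_j + y_j * (e_i - e_j) / eta
    new_j = min(high, max(low, new_j))
    # Move alpha_i in the opposite direction to keep the constraint.
    new_i = alpha_i + y_i * y_j * (alpha_j - new_j)
    return new_i, new_j


# Two 1-D points x_i = 1 (label +1) and x_j = -1 (label -1), linear kernel.
print(smo_pair_update(0.0, 0.0, 1, -1, -1.0, 1.0, 1.0, 1.0, -1.0, 1.0))
# prints (0.5, 0.5)
```

A full trainer repeats this step over pairs that violate the optimality conditions until none remain; the pair-selection heuristics are omitted here.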

systems, man and cybernetics | 2011

Hardware/software co-design for fast-trainable speaker identification system based on SMO

Jhing-Fa Wang; Jr-Shiang Peng; Jia-Ching Wang; Po-Chuan Lin; Ta-Wen Kuan

Embedded speaker identification is a popular research topic, but most current systems cannot be trained quickly. Because computational capability is limited in embedded environments, long waiting times usually make the human-machine interface unfriendly. This paper presents a hardware/software (HW/SW) co-design solution for a fast-trainable speaker identification system. Fast training gives the embedded system high flexibility and makes it more convenient for a wide range of real-world applications. The proposed system consists of a training phase and a multiclass identification phase. The sequential minimal optimization (SMO) training algorithm carries the heaviest computational load and is realized as a dedicated VLSI module, i.e., the hardware component. The other processes, such as speech preprocessing, speech feature extraction, and the SVM voting strategy, are implemented in software. Moreover, a data-packed mechanism is presented to improve bandwidth utilization. Compared with embedded C code running on an ARM processor, the system reduces training time by 90% and achieves an 89.9% identification rate on the NIST 2010 speaker recognition database. The proposed system was tested and found to be fully functional on a Socle CDK prototype system with an AMBA-based Xilinx FPGA and an ARM926EJ processor.

international conference on orange technologies | 2013

A new approach of image inpainting based on PSO algorithm

Shu-Chiang Chung; Ta-Wen Kuan; Chuan-Pin Lu; Hsin-Yi Lin

In this paper, an efficient approach is developed that combines the exemplar-based image inpainting method with the minimum error boundary cut technique to produce more natural-looking inpainting results with high performance. The approach is based on particle swarm optimization (PSO). It has several advantages. First, texture regions containing linear structures are given the highest priority during inpainting; because linear structures are inpainted first, the method preserves their integrity. In addition, the minimum error boundary cut technique effectively reduces unnatural artifacts where the gaps of the damaged image are filled. Finally, searching for similar blocks during inpainting is extremely slow, so the particle swarm optimization algorithm is incorporated into the search, and the results show that it indeed improves the efficiency of image inpainting.
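As background, particle swarm optimization replaces an exhaustive search with a population of candidate solutions that move toward the best positions found so far. The following is a generic, minimal PSO sketch for minimizing an arbitrary cost function; it does not reproduce the paper's block-matching cost or parameters. The function name `pso_minimize` and the default values (inertia `w`, acceleration constants `c1` and `c2`) are assumptions chosen for illustration.

```python
import random


def pso_minimize(cost, bounds, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize cost(position) inside box bounds with a basic global-best PSO.

    Returns (best_position, best_cost, history), where history holds the
    best cost after initialization and after each iteration.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # personal best positions
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]  # global best
    history = [gbest_cost]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            value = cost(pos[i])
            if value < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], value
                if value < gbest_cost:
                    gbest, gbest_cost = pos[i][:], value
        history.append(gbest_cost)
    return gbest, gbest_cost, history


# Example: minimize the 2-D sphere function, whose minimum is at the origin.
best, best_cost, _ = pso_minimize(lambda p: sum(x * x for x in p),
                                  [(-5.0, 5.0), (-5.0, 5.0)])
print(best, best_cost)
```

In an inpainting setting the position would presumably encode a candidate patch location and the cost a patch dissimilarity measure, but that mapping is an assumption and is not specified in the abstract.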

international conference on orange technologies | 2014

Frowning expression detection based on SOBEL filter for negative emotion recognition

Shu-Chiang Chung; Shovan Barma; Ta-Wen Kuan; Ting-Wei Lin

This paper proposes a novel method to improve happiness by detecting negative emotional status from frowning lines on the face, using a new term called the facial expression factor (FEF). The FEF correlates frowning with emotional status. The frowning lines are detected with a Sobel filter, and FEF values are calculated from selected frowning lines to determine the actual emotional status. Negative emotional states detected in this way could help to promote happiness. The experiment is conducted on 10 participants, using 40 images in total (20 neutral and 20 frowning expressions). The results show that the emotional status of 8 of the 10 participants is recognized correctly, and that the wrong recognition results are corrected by tuning the threshold. Hence, the results show a recognition accuracy of up to 80%. The proposed work relies on simple training, which also reduces the training time cost. Furthermore, the proposed method is able to detect more complex facial expressions (e.g., a forced smile) using the FEF, and tuning the threshold makes the method more effective. These results show the effectiveness of detecting negative emotional states to promote happiness.
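The Sobel filter mentioned above estimates horizontal and vertical intensity gradients with two 3×3 kernels. The sketch below computes the gradient magnitude on a small grayscale image stored as nested lists. It illustrates only the edge-detection step; the FEF computation is not defined in the abstract and is therefore not attempted. The function name `sobel_magnitude` is hypothetical.

```python
def sobel_magnitude(image):
    """Return the Sobel gradient magnitude of a grayscale image.

    image is a list of equal-length rows of numbers. Border pixels, which
    lack a full 3x3 neighbourhood, are left at 0.0.
    """
    gx_kernel = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient
    gy_kernel = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = gy = 0
            for dr in range(3):
                for dc in range(3):
                    pixel = image[r + dr - 1][c + dc - 1]
                    gx += gx_kernel[dr][dc] * pixel
                    gy += gy_kernel[dr][dc] * pixel
            out[r][c] = (gx * gx + gy * gy) ** 0.5
    return out


# A 4x4 image with a vertical edge between columns 1 and 2.
edge = [[0, 0, 1, 1] for _ in range(4)]
print(sobel_magnitude(edge)[1])  # prints [0.0, 4.0, 4.0, 0.0]
```

Thresholding the magnitude map would give candidate line pixels; how those are selected and scored as frowning lines is specific to the paper.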

international conference on orange technologies | 2014

A happiness-oriented home care system for elderly daily living

Yang-Yen Ou; Po-Yi Shih; Ta-Wen Kuan; Shao-Hsien Shih; Jhing-Fa Wang; Jaw-Shyang Wu

Current home-care systems emphasize bio-signal measurement, security surveillance, and health care; however, most of these functions work independently. In this paper, a new warm-care framework for the elderly is proposed that not only provides the aforementioned services but also includes remote monitoring, web camera management, emergency calls for help, behavior recognition and feedback, and remote-control entertainment services, forming a comprehensive humanistic care system. The framework is motivated by the individual letters of “HAPPINESS”, which are redefined and interpreted as “Health”, “Ability”, “Protection”, “Personalization”, “Interaction”, “Nursing”, “Entertainment”, “Succor” and “Smile”. Three main services are highlighted to achieve these goals. The Web-based Central Camera Management Service (WCCMS) is a real-time remote monitoring function that lets a caregiver look after the elderly anytime and anywhere through web services; the Multimodal Human-Machine Interaction Service (MHMIS) provides audio-visual cognitive functions for interacting with the elderly; and the Web-based User Management Service (WUMS) gives users a smart HMI including bio-signal measurement, a help button, remote control, and hospital appointment scheduling. To evaluate the usability of the proposed framework, the mean opinion score (MOS) is applied; an average MOS of 4.2 is obtained, which indicates that the proposed system meets expectations.

ubiquitous intelligence and computing | 2008

Ubiquitous and Robust Text-Independent Speaker Recognition for Home Automation Digital Life

Jhing-Fa Wang; Ta-Wen Kuan; Jia-Chang Wang; Gaung-Hui Gu

This paper presents a ubiquitous and robust text-independent speaker recognition architecture for home automation digital life. In this architecture, a multiple-microphone configuration is adopted to receive the pervasive speech signals. The multi-channel speech signals are then added together with a mixer. In a ubiquitous computing environment, the received speech signal is usually heavily corrupted by background noise. An SNR-aware subspace speech enhancement approach is used as a pre-processing step to enhance the mixed signal. For text-independent speaker recognition, this paper applies a multi-class support vector machine (SVM)[10][11] instead of conventional Gaussian mixture models (GMMs)[12]. In our experiments, the speaker recognition rate reaches 97.2% on average with the proposed ubiquitous speaker recognition architecture.

IEEE Transactions on Very Large Scale Integration Systems | 2014

REC-STA: Reconfigurable and Efficient Chip Design With SMO-Based Training Accelerator

Chih-Hsiang Peng; Bo-Wei Chen; Ta-Wen Kuan; Po-Chuan Lin; Jhing-Fa Wang; Nai-Sheng Shih

Sequential minimal optimization (SMO) and the Karush-Kuhn-Tucker conditions are often used to solve learning problems in support vector machines. However, during hardware implementation of the SMO algorithm, enhancing chip performance without excessively increasing chip area is often a crucial issue. The solution proposed in this paper is a novel reconfigurable and efficient chip design with SMO-based training accelerator (REC-STA). Two novel methods used in the proposed REC-STA are trimode coarse-grained reconfigurable architecture (TCRA) and triple finite-state-machine with dynamic scheduling. The first method modifies the baseline SMO design by developing trimode reconfigurable architectures with parallel and pipeline computing capabilities. The second method provides a schedule for efficient reconfiguration of the TCRA. These methods make it possible to remove the kernel cache design. For chip manufacturing, the implementation of the REC-STA is synthesized, placed, and routed using the TSMC 0.18-μm technology library. The core size is 2.94 mm × 2.94 mm and the power consumption is 77.3 mW. Compared with the baseline design, the FPGA simulation results show that the proposed architecture requires 50% less memory and a 31% lower gate count but provides a 16-fold improvement in training performance. The experimental results confirm the efficacy of the proposed architecture and methods.

Expert Systems With Applications | 2012

A new hybrid and dynamic fusion of multiple experts for intelligent porch system

Ta-Wen Kuan; Hsin-Chun Tsai; Jhing-Fa Wang; Jia-Ching Wang; Bo-Wei Chen; Zong-You Lin

Highlights:
• We proposed two novel fusion approaches for intelligent porch development.
• We designed an inexpensive and unobtrusive multi-modal expert system for smart home.
• We proposed an intelligent indoor and outdoor interactive system for home users.
• The audio-visual experts include speaker, speech, face and height experts.
• The contribution developed an identity recognition system in a natural way.

Intelligent porch research is an important issue in smart home development; however, this field has rarely been investigated in the literature. This investigation proposes a new hybrid and dynamic fusion of multiple experts for the intelligent porch system. First, a new hybrid priority tree with decision fusion (HPTD-fusion) is proposed to eliminate the problems of tag-based authentication outdoors. The HPTD-fusion first verifies the vocal entrance code (VEC), and the remaining experts are then combined by AND, OR, or majority voting for decision fusion. Second, the post-mapping dynamic weighted fusion (PMDW-fusion) scheme is presented to adapt to the indoor porch audio-visual environment. The PMDW-fusion dynamically assigns higher weights to experts with higher performance, and then sums all participating experts for score fusion. The experimental results demonstrate that the false rejection rate (FRR) and false acceptance rate (FAR) reach 0.18 and 0.19, respectively, when the system is tested in the outdoor environment. Furthermore, the indoor recognition accuracy can be increased to 86.1% using the proposed fusion scheme. The experiments have verified the effectiveness and feasibility of the proposed system. In summary, the contribution of this work is a novel intelligent porch system incorporating a natural and unobtrusive method for identity recognition. The proposed system has been installed and tested in a real-world environment in the Technologies for Ubiquitous Computing and Humanity (TOUCH) Center at National Cheng Kung University.
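The two fusion styles named in the abstract, decision fusion by AND, OR, or majority voting and score fusion by a performance-weighted sum, can be sketched in a few lines. This is a generic illustration, not the paper's HPTD-fusion or PMDW-fusion: the function names are hypothetical, and weighting each expert in proportion to a supplied accuracy value is an assumption about how "higher weight to experts with higher performance" might be realized.

```python
def decision_fusion(votes, rule="majority"):
    """Combine boolean accept/reject decisions from several experts."""
    if rule == "and":
        return all(votes)
    if rule == "or":
        return any(votes)
    if rule == "majority":
        return sum(votes) * 2 > len(votes)  # strictly more than half accept
    raise ValueError(f"unknown rule: {rule}")


def weighted_score_fusion(scores, accuracies):
    """Fuse expert scores with weights proportional to expert accuracy.

    scores are assumed to be already mapped to a common range; accuracies
    are non-negative performance estimates, one per expert (assumption).
    """
    total = sum(accuracies)
    if total <= 0:
        raise ValueError("at least one expert needs a positive accuracy")
    weights = [a / total for a in accuracies]
    return sum(w * s for w, s in zip(weights, scores))


print(decision_fusion([True, True, False]))           # prints True
print(weighted_score_fusion([1.0, 0.0], [3.0, 1.0]))  # prints 0.75
```

Recomputing the accuracies per environment (for example, lowering the face expert's value in poor lighting) is what would make such a scheme dynamic.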

Artificial Intelligence Research | 2016

A robust BFCC feature extraction for ASR system

Ta-Wen Kuan; An-Chao Tsai; Po-Hsun Sung; Jhing-Fa Wang; Hsien-Shun Kuo

An auditory-based feature extraction algorithm named the Basilar-membrane Frequency-band Cepstral Coefficient (BFCC) is proposed to increase the robustness of automatic speech recognition. In contrast to the Mel-Frequency Cepstral Coefficient (MFCC) method, which is based on the Fourier spectrogram, the proposed BFCC method uses an auditory spectrogram based on a gammachirp wavelet transform to simulate the auditory response of the human inner ear and improve noise immunity. In addition, a Hidden Markov Model (HMM) is used to evaluate the proposed BFCC in training and testing on the AURORA-2 corpus with datasets at different signal-to-noise ratios (SNRs). The experimental results indicate that the proposed BFCC, compared with MFCC, Gammatone Wavelet Cepstral Coefficient (GWCC), and Gammatone Frequency Cepstral Coefficient (GFCC), improves the speech recognition rate by 13%, 17%, and 0.5%, respectively, on average for speech samples with SNRs ranging from -5 to 20 dB.

IET Computers and Digital Techniques | 2015

Memory-efficient buffering method and enhanced reference template for embedded automatic speech recognition system

Chih-Hung Chou; Ta-Wen Kuan; Po-Chuan Lin; Bo-Wei Chen; Jhing-Fa Wang

This work realises a memory-efficient embedded automatic speech recognition (ASR) system on a resource-constrained platform. A buffering method called ultra-low queue-accumulator buffering is presented to efficiently use the constrained memory to extract the linear prediction cepstral coefficient (LPCC) feature in the embedded ASR system. The optimal order of the LPCC is evaluated to balance the recognition accuracy and the computational cost. In the decoding part, the proposed enhanced cross-words reference templates (CWRTs) method is incorporated into the template matching method to reach the speaker-independent characteristic of ASR tasks without the large memory burden of the conventional CWRTs method. The proposed techniques are implemented on a 16-bit microprocessor GPCE063A platform with a 49.152 MHz clock, using a sampling rate of 8 kHz. Experimental results demonstrate that recognition accuracy reaches 95.22% in a 30-sentence speaker-independent embedded ASR task, using only 0.75 kB RAM.
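For context, LPCC features are usually derived from linear prediction coefficients with a short recursion rather than an explicit Fourier transform, which is one reason they suit small microcontrollers. The sketch below shows that standard recursion; it does not reproduce the paper's buffering method or its chosen LPCC order. The function name `lpc_to_cepstrum` is hypothetical, and the sign convention assumes the all-pole model H(z) = 1 / (1 - Σ a_k z^-k).

```python
def lpc_to_cepstrum(a, n_ceps):
    """Convert LPC predictor coefficients to cepstral coefficients.

    a is [a_1, ..., a_p] for H(z) = 1 / (1 - sum_k a_k z^-k) (assumed
    convention). Returns [c_1, ..., c_n_ceps]; the gain term c_0 is omitted.
    """
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        # Direct term exists only while n is within the predictor order.
        acc = a[n - 1] if n <= p else 0.0
        # Recursive term: sum over k of (k / n) * c_k * a_(n-k).
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k - 1] * a[n - k - 1]
        c.append(acc)
    return c


# For a first-order model with a_1 = 0.5 the cepstrum is c_n = 0.5**n / n.
print(lpc_to_cepstrum([0.5], 3))
```

The first-order case gives a convenient sanity check because the cepstrum has the closed form c_n = a_1^n / n.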
