Jhing-Fa Wang | Researchain

Publication

Featured research published by Jhing-Fa Wang.

IEEE Transactions on Circuits and Systems for Video Technology | 1993

Dynamic search-window adjustment and interlaced search for block-matching algorithm

Liang-Wei Lee; Jhing-Fa Wang; Jau-Yien Lee; Jung-Dar Shie

A technique called dynamic search-window adjustment is proposed to improve the performance of the three-step search (TSS) and to prevent the search direction from being easily misdirected by insufficient information. An interlaced-search technique is presented to reduce the number of search positions. A fast search algorithm using both techniques is proposed. It is shown that the average displaced frame difference and the number of search positions of the proposed algorithm are about 1-7% and 24-44% lower than those of TSS, respectively.
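For orientation, the baseline that the abstract compares against can be sketched as follows. This is a minimal, illustrative version of the classic three-step search with a sum-of-absolute-differences cost, not the proposed dynamic search-window adjustment or interlaced search. The function names, the 8 x 8 block size, and the starting step of 4 are assumptions made for the example.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def three_step_search(cur, ref, bx, by, n=8, step=4):
    """Classic TSS: test the 8 neighbours of the current best vector at
    distance `step`, move to the cheapest one, halve the step, and repeat."""
    h, w = len(ref), len(ref[0])
    best = (0, 0)
    best_cost = sad(cur, ref, bx, by, 0, 0, n)
    while step >= 1:
        cx, cy = best  # centre of this round, fixed while neighbours are tested
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                if dx == 0 and dy == 0:
                    continue  # centre cost is already known
                mx, my = cx + dx, cy + dy
                # Skip candidates whose block would fall outside the reference frame.
                if bx + mx < 0 or by + my < 0 or bx + mx + n > w or by + my + n > h:
                    continue
                cost = sad(cur, ref, bx, by, mx, my, n)
                if cost < best_cost:
                    best_cost, best = cost, (mx, my)
        step //= 2
    return best, best_cost
```

With a starting step of 4 the search covers displacements of up to 7 pixels in each direction and evaluates at most 25 positions per block. Because each round commits to the cheapest of only nine coarse candidates, an early wrong choice cannot be undone later, which is the kind of misdirection the abstract refers to.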

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2000

Segmentation of single- or multiple-touching handwritten numeral string using background and foreground analysis

Yi-Kai Chen; Jhing-Fa Wang

An approach for segmenting a single- or multiple-touching handwritten numeral string (two digits) is proposed. Most algorithms for segmenting connected digits focus mainly on the analysis of foreground pixels. Some concentrate on the analysis of background pixels only, and others are based on a recognizer. We combine background and foreground analysis to segment single- or multiple-touching handwritten numeral strings. Thinning of both foreground and background regions is first performed on the image of connected numeral strings, and the feature points on the foreground and background skeletons are extracted. Several possible segmentation paths are then constructed and useless strokes are removed. Finally, the parameters of the geometric properties of each possible segmentation path are determined, and these parameters are analyzed by a mixture Gaussian probability function to decide the best segmentation path or reject it. Experimental results on NIST special database 19 (an update of NIST special database 3) and some other images collected by ourselves show that our algorithm achieves a correct rate of 96 percent with a rejection rate of 7.8 percent, which compares favorably with those reported in the literature.

IEEE Transactions on Circuits and Systems for Video Technology | 2007

A Fast Mode Decision Algorithm and Its VLSI Design for H.264/AVC Intra-Prediction

Jia-Ching Wang; Jhing-Fa Wang; Jar-Ferr Yang; Jang-Ting Chen

In this paper, we present a fast mode decision algorithm and design its VLSI architecture for H.264 intra-prediction. A regular spatial domain filtering technique is proposed to compute the dominant edge strength (DES) to reduce the possible predictive modes. Experimental results revealed that the proposed fast intra-algorithm reduces computation by 40% with slight peak signal-to-noise ratio (PSNR) degradation. The designed DES VLSI engine comprises a zigzag converter, a DES finite-state machine (FSM), and a DES core. The former two units handle memory allocation and control flow while the last performs pseudoblock computation, edge filtering, and dominant edge strength extraction. With a semicustom design fabricated in 0.18 μm CMOS single-poly-six-metal technology, the realized die size is roughly 0.15 × 0.15 mm² and the chip can be operated at 66 MHz.

IEEE Transactions on Multimedia | 2009

A Novel Video Summarization Based on Mining the Story-Structure and Semantic Relations Among Concept Entities

Bo-Wei Chen; Jia-Ching Wang; Jhing-Fa Wang

Video summarization techniques have been proposed for years to offer people a comprehensive understanding of the whole story in a video. Roughly speaking, existing approaches can be classified into two types: static storyboards and dynamic skimming. However, although these traditional methods give users brief summaries, they still do not provide a concept-organized and systematic view. In this paper, we present a structural video content browsing system and a novel summarization method that uses four kinds of entities (who, what, where, and when) to establish the framework of the video contents. With the assistance of this indexed information, the structure of the story can be built up according to the characters, the things, the places, and the time. Therefore, users can not only browse the video efficiently but also focus on what they are interested in via the browsing interface. To construct the fundamental system, we employ the maximum entropy criterion to integrate visual and text features extracted from video frames and speech transcripts, generating high-level concept entities. A novel concept expansion method is introduced to explore the associations among these entities. After constructing the relational graph, we exploit a graph entropy model to detect meaningful shots and relations, which serve as the indices for users. The results demonstrate that our system can achieve better performance and information coverage.

IEEE Transactions on Circuits and Systems for Video Technology | 2008

Intensity Gradient Technique for Efficient Intra-Prediction in H.264/AVC

An-Chao Tsai; Anand Paul; Jia-Ching Wang; Jhing-Fa Wang

This study presents an intensity gradient approach for intra-prediction in an H.264 encoding system, which enhances the performance and efficiency of previous fast algorithms. We propose a preprocessing stage in which eight orientation features are extracted from a macroblock by the intensity gradient filters. The orientation features are utilized to select a subset of prediction modes to be involved in the rate-distortion calculation so that the encoding time can be reduced. The simulation results indicate that the intensity gradient based algorithm for intra-prediction achieves a better tradeoff between rate-distortion performance and encoding complexity than the previous algorithms. Compared to the H.264 reference software, the proposed algorithm introduces slight PSNR degradation and bit rate increase but saves around 76% of the total encoding time with all intra-frame coding.

IEEE Transactions on Automation Science and Engineering | 2008

Robust Environmental Sound Recognition for Home Automation

Jia-Ching Wang; Hsiao Ping Lee; Jhing-Fa Wang; Cai-Bei Lin

This work presents a robust environmental sound recognition system for home automation. Specific home automation services can be activated based on identified sound classes. Additionally, when the sound category is human speech, such speech can be recognized for detecting human intentions as in conventional research on home automation. To attain this ambitious goal, this study uses two key techniques: signal-to-noise ratio-aware subspace-based signal enhancement, and sound recognition with independent component analysis mel-frequency cepstral coefficients and frame-based multiclass support vector machines. Simulations and an experiment in a real-world environment are given to illustrate the performance of the proposed robust sound recognition system.

Expert Systems With Applications | 2009

Improving the generalization performance of RBF neural networks using a linear regression technique

C. L. Lin; Jhing-Fa Wang; Chen-Yuan Chen; Cheng-Wu Chen; Chen-Wen Yen

In this paper we present a method for improving the generalization performance of a radial basis function (RBF) neural network. The method uses a statistical linear regression technique which is based on the orthogonal least squares (OLS) algorithm. We first discuss a modified way to determine the center and width of the hidden layer neurons. Then, substituting a QR algorithm for the traditional Gram-Schmidt algorithm, we find the connection weights of the hidden layer neurons. Cross-validation is utilized to determine the criterion for stopping training. The generalization performance of the network is further improved using a bootstrap technique. Finally, the method is applied to a simulated problem and a real problem. The results demonstrate the improved generalization performance of our algorithm over the existing methods.

IEEE Transactions on Audio, Speech, and Language Processing | 2006

Voice conversion using duration-embedded bi-HMMs for expressive speech synthesis

Chung-Hsien Wu; Chi-Chun Hsia; Te-Hsien Liu; Jhing-Fa Wang

This paper presents an expressive voice conversion model (DeBi-HMM) as a post-processing stage of a text-to-speech (TTS) system for expressive speech synthesis. DeBi-HMM is named for the duration-embedded characteristic of the two HMMs that model the source and target speech signals, respectively. Joint estimation of source and target HMMs is exploited for spectrum conversion from neutral to expressive speech. A gamma distribution is embedded as the duration model for each state in the source and target HMMs. The expressive style-dependent decision trees achieve prosodic conversion. The STRAIGHT algorithm is adopted for the analysis and synthesis process. A set of small-sized speech databases for each expressive style is designed and collected to train the DeBi-HMM voice conversion models. Several experiments with statistical hypothesis testing are conducted to evaluate the quality of synthetic speech as perceived by human subjects. Compared with previous voice conversion methods, the proposed method exhibits encouraging potential in expressive speech synthesis.

IEEE Transactions on Circuits and Systems for Video Technology | 2008

Effective Subblock-Based and Pixel-Based Fast Direction Detections for H.264 Intra Prediction

An-Chao Tsai; Jhing-Fa Wang; Jar-Ferr Yang; Wei-Guang Lin

In H.264/AVC intra-frame coding, rate-distortion optimization (RDO) is employed to select the optimal coding mode to achieve the minimum rate-distortion cost. Due to the large number of combinations of coding modes, the computational burden becomes extremely high in intra-prediction. In this paper, we propose two fast, efficient, and reliable direction detection algorithms that compute subblock and pixel direction differences. Both proposed methods effectively estimate the edge direction inside the block to narrow down the predictive modes and reduce the RDO computation. Experimental results show that the proposed methods can reduce the encoding time by about 60% with negligible loss of coding performance. For hardware realization, a fast mode decision VLSI circuit for intra-prediction with a silicon core size of 0.12 × 0.12 mm² in 0.18-μm CMOS technology is implemented. The fast mode decision VLSI, in a three-stage pipelined architecture operated at 173 MHz, can encode 30 fps real-time videos up to Level 5.

Signal Processing Systems | 2004

Speech Enhancement Using Perceptual Wavelet Packet Decomposition and Teager Energy Operator

Shi-Huang Chen; Jhing-Fa Wang

It has been shown in the literature that the perceptual wavelet packet decomposition (PWPD) and the Teager energy operator (TEO) are useful for various speech processing systems and speech enhancement applications, respectively. Using the PWPD and the TEO, this paper presents an improved wavelet-based speech enhancement method. The main advantage of the proposed method is that the over-thresholding of speech segments, which usually occurs in conventional wavelet-based speech enhancement schemes, can be avoided. As a consequence, the enhanced speech quality of the proposed method is substantially higher than that of conventional approaches. In addition, the proposed method does not require a complicated estimation of the noise level or any knowledge of the SNR. Using speech signals corrupted by additive and real noises, experimental results demonstrate that the speech enhancement method presented in this paper is capable of outperforming conventional noise cancellation schemes.
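As background for the TEO mentioned in the abstract, the standard discrete-time operator is psi[n] = x[n]^2 - x[n-1] * x[n+1]. The sketch below shows only this definition. How the paper combines it with the PWPD coefficients is not described in the abstract and is not reproduced here. The function name is hypothetical.

```python
def teager_energy(x):
    """Discrete Teager energy psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Only interior samples have both neighbours, so the result has
    len(x) - 2 values (empty for inputs shorter than 3 samples).
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]
```

For a pure tone A*cos(w*n + phi) the operator returns the constant A^2 * sin(w)^2, so it tracks both the amplitude and the frequency of a component.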
