Wing W. Y. Ng
Harbin Institute of Technology
Publications
Featured research published by Wing W. Y. Ng.
IEEE Transactions on Neural Networks | 2007
Daniel S. Yeung; Wing W. Y. Ng; Defeng Wang; Eric C. C. Tsang; Xi-Zhao Wang
The generalization error bounds found by current error models using the number of effective parameters of a classifier and the number of training samples are usually very loose. These bounds are intended for the entire input space. However, the support vector machine (SVM), radial basis function neural network (RBFNN), and multilayer perceptron neural network (MLPNN) are local learning machines and treat unseen samples near the training samples as more important. In this paper, we propose a localized generalization error model which bounds the generalization error from above within a neighborhood of the training samples, using a stochastic sensitivity measure. It is then used to develop an architecture selection technique that maximizes a classifier's coverage of unseen samples subject to a specified generalization error threshold. Experiments using 17 University of California at Irvine (UCI) data sets show that, in comparison with cross validation (CV), sequential learning, and two other ad hoc methods, our technique consistently yields the best testing classification accuracy with fewer hidden neurons and less training time.
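The bound can be sketched concretely. Below is a minimal numpy sketch, assuming a model exposed as a callable `f(X)` returning per-sample outputs and the commonly cited form of the bound, (sqrt(R_emp) + sqrt(E_S) + A)^2 + eps, where E_S is the stochastic sensitivity, estimated here by Monte Carlo perturbation within a Q-neighborhood; the names and the choice of `A` (output range) and `eps` are illustrative, not the paper's exact formulation.

```python
import numpy as np

def stochastic_sensitivity(model, X, q, n_perturb=100, rng=None):
    """Monte Carlo estimate of E[(f(x + dx) - f(x))^2], dx uniform in [-q, q]^d."""
    rng = np.random.default_rng(rng)
    base = model(X)                              # outputs on the training samples
    total = 0.0
    for _ in range(n_perturb):
        dx = rng.uniform(-q, q, size=X.shape)    # perturbation inside the Q-neighborhood
        total += np.mean((model(X + dx) - base) ** 2)
    return total / n_perturb

def lgem_bound(model, X, y, q, A, eps, **kw):
    """Localized generalization error bound: (sqrt(R_emp) + sqrt(E_S) + A)^2 + eps."""
    r_emp = np.mean((model(X) - y) ** 2)         # empirical MSE on training samples
    e_s = stochastic_sensitivity(model, X, q, **kw)
    return (np.sqrt(r_emp) + np.sqrt(e_s) + A) ** 2 + eps
```

Architecture selection then amounts to training candidate networks with increasing numbers of hidden neurons and keeping the smallest one whose bound stays below the chosen error threshold.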
Archive | 2009
Daniel S. Yeung; Ian Cloete; Daming Shi; Wing W. Y. Ng
Artificial neural networks are used to model systems that receive inputs and produce outputs. The relationships between the inputs and outputs and the representation parameters are critical issues in the design of related engineering systems, and sensitivity analysis concerns methods for analyzing these relationships. Perturbations of neural networks are caused by machine imprecision, and they can be simulated by embedding disturbances in the original inputs or connection weights, allowing us to study the characteristics of a function under small perturbations of its parameters. This is the first book to present a systematic description of sensitivity analysis methods for artificial neural networks. It covers sensitivity analysis of multilayer perceptron neural networks and radial basis function neural networks, two widely used models in the machine learning field. The authors examine the applications of such analysis in tasks such as feature selection, sample reduction, and network optimization. The book will be useful for engineers applying neural network sensitivity analysis to solve practical problems, and for researchers interested in foundational problems in neural networks.
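As a toy illustration of the kind of analysis the book covers, the sketch below embeds Gaussian disturbances in the inputs of a small fixed MLP and measures the spread of its output; the weights, perturbation scale, and function names are arbitrary stand-ins for a trained network, not anything from the book.

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny fixed one-hidden-layer tanh MLP, standing in for a trained network.
W1, b1 = rng.normal(size=(8, 4)), rng.normal(size=8)
W2, b2 = rng.normal(size=8), 0.1

def mlp(x):
    return np.tanh(W1 @ x + b1) @ W2 + b2

def output_sensitivity(f, x, sigma=0.01, n=1000):
    """Std. dev. of the output under Gaussian input perturbations of scale sigma."""
    ys = np.array([f(x + rng.normal(scale=sigma, size=x.shape)) for _ in range(n)])
    return ys.std()

x0 = rng.normal(size=4)
print(output_sensitivity(mlp, x0))   # small value => locally insensitive network
```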
international conference on machine learning and cybernetics | 2003
Wing W. Y. Ng; Rocky K. C. Chang; Daniel S. Yeung
In this paper, we present a feature importance ranking methodology based on the stochastic radial basis function neural network output sensitivity measure and show, for the 10% training set of the DARPA network intrusion detection data set prepared by MIT Lincoln Labs, that 33 out of 41 features (more than 80% dimensionality reduction) can be removed without causing great harm to the classification accuracy for denial of service (DoS) attacks and normal packets (false positives rise from 0.7% to 0.93%). The reduced feature subset leads to a more general and less complex model for classifying DoS and normal traffic. Exploratory discussions on the relevance of the selected features to the DoS attack types are also presented.
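A generic version of such sensitivity-based feature ranking can be sketched as follows; this captures the general idea rather than the paper's exact stochastic RBFNN measure, and `model` is assumed to be any callable returning per-sample outputs.

```python
import numpy as np

def feature_sensitivity_ranking(model, X, sigma=0.1, n_perturb=50, rng=None):
    """Rank features by how much perturbing each one alone changes the output."""
    rng = np.random.default_rng(rng)
    base = model(X)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_perturb):
            Xp = X.copy()
            Xp[:, j] += rng.normal(scale=sigma, size=X.shape[0])
            scores[j] += np.mean((model(Xp) - base) ** 2)
    scores /= n_perturb
    return np.argsort(scores)      # ascending: low-sensitivity features removed first
```

Features with the lowest scores barely move the model output under perturbation and are the first candidates for removal.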
systems, man and cybernetics | 2008
Binbin Sun; Wing W. Y. Ng; Daniel S. Yeung; Jun Wang
Content-based image auto-annotation has become a hot research topic owing to the development of image retrieval systems and multimedia storage technology, and it is a key step in most such image processing applications. In this work, we apply active learning to image annotation to reduce the number of labeled images required by the supervised learning procedure. Localized Generalization Error Model (L-GEM) based active learning uses the localized generalization error bound as the sample selection criterion. In each round, the most informative sample selected from the set of unlabeled samples by L-GEM based active learning is labeled and added to the training dataset. A heuristic selection method and a Q-value selection improvement method are introduced in this paper. The experimental results show that the proposed active learning method efficiently reduces the number of labeled training samples. Moreover, the improvement method improves performance in both testing accuracy and training time, which are both essential in image annotation applications.
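A pool-based loop built around this criterion might look like the following sketch, where `train`, `bound` (an L-GEM-style score for the neighborhood of a candidate sample), and `y_oracle` (the human labeler) are all placeholder callables introduced for illustration, not APIs from the paper.

```python
import numpy as np

def lgem_active_learning(train, X_lab, y_lab, X_pool, y_oracle, bound, n_rounds=10):
    """Greedy pool-based active learning: query the unlabeled sample whose
    neighborhood has the largest localized generalization error bound."""
    for _ in range(n_rounds):
        model = train(X_lab, y_lab)
        # Score every candidate by its localized generalization error; the
        # most 'uncertain' neighborhood is the most informative to label.
        scores = np.array([bound(model, x) for x in X_pool])
        i = int(np.argmax(scores))
        X_lab = np.vstack([X_lab, X_pool[i]])
        y_lab = np.append(y_lab, y_oracle(X_pool[i]))   # ask the oracle/labeler
        X_pool = np.delete(X_pool, i, axis=0)
    return train(X_lab, y_lab)
```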
systems, man and cybernetics | 2005
Patrick P. K. Chan; Wing W. Y. Ng; Daniel S. Yeung
In classification problems, the learning process can be more efficient if informative samples are selected actively based on the knowledge of the classifier; this is the active learning problem. Most existing active learning methods do not relate directly to the generalization error of classifiers, and some require high computational time or rely on strict assumptions. This paper describes a new active learning strategy using the concept of the localized generalization error of the candidate samples. The sample which yields the largest localized generalization error is chosen for querying. The method can be applied to different kinds of classifiers, and its complexity is low. Experimental results demonstrate that the prediction accuracy of the classifier can be improved by this selection method, and that fewer training samples suffice for the same prediction accuracy.
systems, man and cybernetics | 2003
Wing W. Y. Ng; Daniel S. Yeung; Ian Cloete
Large data sets containing irrelevant or redundant input samples reduce learning performance and increase storage and labeling costs. This work compares several sample selection and active learning techniques and proposes a novel sample selection method based on the stochastic radial basis function neural network sensitivity measure (SM). The experimental results for the UCI IRIS data set show that we can remove 99% of the data while keeping 95% of the classification accuracy when applying both sensitivity-based feature and sample selection methods. We propose a single, consistent method that is robust enough to handle both feature and sample selection for a supervised RBFNN classification system, using the same neural network architecture for both selection and classification tasks.
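One plausible reading of sensitivity-based sample selection is sketched below: score each training sample by how strongly the trained model's output reacts to small input perturbations around it, and keep only the most sensitive fraction. The `model` callable, perturbation width, and `keep` fraction (0.01 echoes the 99% removal above) are illustrative assumptions.

```python
import numpy as np

def select_informative_samples(model, X, y, q=0.1, keep=0.01, n_perturb=50, rng=None):
    """Keep the fraction of training samples whose outputs are most sensitive
    to small input perturbations; insensitive samples are treated as redundant."""
    rng = np.random.default_rng(rng)
    base = model(X)
    sens = np.zeros(len(X))
    for _ in range(n_perturb):
        dx = rng.uniform(-q, q, size=X.shape)
        sens += (model(X + dx) - base) ** 2       # per-sample squared output change
    idx = np.argsort(sens)[::-1][: max(1, int(keep * len(X)))]
    return X[idx], y[idx]
```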
International Journal of Wavelets, Multiresolution and Information Processing | 2013
Binbin Sun; Wing W. Y. Ng; Daniel S. Yeung; Patrick P. K. Chan
Sparse LS-SVM yields better generalization capability and reduces prediction time in comparison to full dense LS-SVM. However, both methods require careful selection of hyper-parameters (HPS) to achieve high generalization capability. Leave-One-Out Cross Validation (LOO-CV) and k-fold Cross Validation (k-CV) are the two most widely used hyper-parameter selection methods for LS-SVMs, but both fail to select good hyper-parameters for sparse LS-SVM. In this paper we propose a new hyper-parameter selection method, LGEM-HPS, for LS-SVM via minimization of the Localized Generalization Error (L-GEM). The L-GEM consists of two major components: the empirical mean square error and a sensitivity measure. A new sensitivity measure is derived for LS-SVM to enable LGEM-HPS to select hyper-parameters yielding an LS-SVM with smaller training error and minimum sensitivity to minor changes in inputs. Experiments on eleven UCI data sets show the effectiveness of the proposed method for selecting hyper-parameters for sparse LS-SVM.
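The selection criterion can be sketched as a grid search that replaces cross validation with an L-GEM-style score. Here `train` stands in for an LS-SVM trainer returning a callable predictor (with `gamma` a regularization constant and `sigma` an RBF-kernel width), and the sensitivity term is a crude Monte Carlo estimate rather than the paper's closed-form measure; all names are assumptions.

```python
import numpy as np
from itertools import product

def lgem_hyperparameter_search(train, X, y, gammas, sigmas, q=0.1,
                               n_perturb=50, rng=None):
    """Grid search scoring each (gamma, sigma) pair by an L-GEM-style
    criterion, (sqrt(training MSE) + sqrt(sensitivity))^2, instead of CV."""
    rng = np.random.default_rng(rng)
    best, best_score = None, np.inf
    for gamma, sigma in product(gammas, sigmas):
        model = train(X, y, gamma=gamma, sigma=sigma)   # returns a callable f(X)
        mse = np.mean((model(X) - y) ** 2)
        sens = np.mean([np.mean((model(X + rng.uniform(-q, q, X.shape)) -
                                 model(X)) ** 2) for _ in range(n_perturb)])
        score = (np.sqrt(mse) + np.sqrt(sens)) ** 2
        if score < best_score:
            best, best_score = (gamma, sigma), score
    return best, best_score
```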
international conference on machine learning and cybernetics | 2002
Wing W. Y. Ng; Daniel S. Yeung
The curse of dimensionality is always problematic in pattern classification problems. We provide a brief comparison of the major methodologies for reducing input dimensionality and summarize them in three categories: correlation among features, transformation, and neural network sensitivity analysis. Furthermore, we propose a method for reducing input dimensionality that uses a stochastic RBFNN sensitivity measure. The experimental results for our method are promising.
systems, man and cybernetics | 2005
Wing W. Y. Ng; Aki P. F. Chan; Daniel S. Yeung; Eric C. C. Tsang
Multiple classifier systems (MCS) have been one of the hot research topics in the machine learning field. A MCS merges an ensemble of classifiers of the same or different types to enhance problem solving performance. However, the number of classifiers and the fusion method are usually chosen ad hoc. In this paper, we propose a novel quantitative measure of the generalization error for a MCS. The localized generalization error model bounds from above the mean square error (MSE) of a MCS for unseen samples located within a neighborhood of the training samples. The relationship between the proposed model and classification accuracy is also discussed. The model quantitatively measures how well the MCS approximates the unknown input-output mapping hidden in the training dataset. The localized generalization error model is applied to select a MCS, among different choices of the number of classifiers and fusion methods, for a given classification problem. Experimental results on three real world datasets show promising results.
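Selection among candidate ensembles could then be sketched as below, scoring each candidate MCS by an L-GEM-style criterion over perturbed training inputs; the dictionary of candidates and the exact score form are illustrative assumptions, not the paper's procedure.

```python
import numpy as np

def select_mcs(candidates, X, y, q=0.1, n_perturb=50, rng=None):
    """Pick the multiple-classifier system whose fused output minimizes an
    L-GEM-style score over a Q-neighborhood of the training samples.
    `candidates` maps a label, e.g. ('bagging-5', 'mean'), to a callable f(X)."""
    rng = np.random.default_rng(rng)
    scores = {}
    for name, mcs in candidates.items():
        mse = np.mean((mcs(X) - y) ** 2)
        sens = np.mean([np.mean((mcs(X + rng.uniform(-q, q, X.shape)) -
                                 mcs(X)) ** 2) for _ in range(n_perturb)])
        scores[name] = (np.sqrt(mse) + np.sqrt(sens)) ** 2
    return min(scores, key=scores.get), scores
```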
international conference on machine learning and cybernetics | 2008
Jun Wang; Wing W. Y. Ng; Eric C. C. Tsang; Tao Zhu; Binbin Sun; Daniel S. Yeung
MPEG-7 provides a set of descriptors to describe the content of an image. However, how to select or combine descriptors for a specific image classification problem is still an open problem. Currently, descriptors are usually selected by human experts, and selecting the same set of descriptors for different classes of images may not be reasonable. In this work we propose an MPEG-7 descriptor selection method which selects different MPEG-7 descriptors for different image classes in an image classification problem. The proposed method, L-GEMIM, combines the Localized Generalization Error Model (L-GEM) and Mutual Information (MI) to assess the relevance of MPEG-7 descriptors for a particular image class. L-GEMIM assesses relevance based on the generalization capability of an MPEG-7 descriptor using L-GEM, and prevents redundant descriptors from being selected using MI. Experimental results using 4,000 images in 4 classes show that L-GEMIM selects a better set of MPEG-7 descriptors, yielding higher testing accuracy in image classification.
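A greedy, mRMR-style approximation of such relevance-minus-redundancy selection is sketched below; the per-descriptor `relevance` scores are assumed to be precomputed (e.g. from an L-GEM bound), each descriptor is reduced to a 1-D summary value per image for the MI estimate, and none of this reflects the exact L-GEMIM formulation.

```python
import numpy as np
from sklearn.metrics import mutual_info_score

def discretize(v, bins=16):
    """Quantile-bin a 1-D feature so MI can be estimated on discrete labels."""
    edges = np.quantile(v, np.linspace(0, 1, bins + 1)[1:-1])
    return np.digitize(v, edges)

def select_descriptors(relevance, D, n_select=3, bins=16):
    """Greedy selection: `relevance[name]` is a per-class generalization score
    for that descriptor; D maps descriptor name -> 1-D summary per image.
    Each step picks the descriptor with the best relevance minus redundancy."""
    names = list(D)
    disc = {k: discretize(D[k], bins) for k in names}
    selected = []
    while len(selected) < min(n_select, len(names)):
        def score(k):
            # Redundancy: strongest MI with any already-selected descriptor.
            red = max((mutual_info_score(disc[k], disc[s]) for s in selected),
                      default=0.0)
            return relevance[k] - red
        best = max((k for k in names if k not in selected), key=score)
        selected.append(best)
    return selected
```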