Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Binbin Sun is active.

Publication


Featured research published by Binbin Sun.


Systems, Man and Cybernetics | 2008

Localized generalization error based active learning for image annotation

Binbin Sun; Wing W. Y. Ng; Daniel S. Yeung; Jun Wang

Content-based image auto-annotation has become a popular research topic owing to the development of image retrieval systems and multimedia storage technology, and it is a key step in many image processing applications. In this work, we apply active learning to image annotation to reduce the number of labeled images required for the supervised learning procedure. Localized Generalization Error Model (L-GEM) based active learning uses the localized generalization error bound as its sample selection criterion. In each round, the most informative sample, selected from a set of unlabeled samples by L-GEM based active learning, is labeled and added to the training dataset. A heuristic method and a Q-value selection improvement are introduced in this paper. The experimental results show that the proposed active learning efficiently reduces the number of labeled training samples. Moreover, the improvement method raises performance in both testing accuracy and training time, both of which are essential in image annotation applications.
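The selection loop described above is an instance of pool-based active learning. The sketch below is illustrative only: it uses a nearest-centroid classifier and a distance-to-decision-boundary criterion as stand-ins for the paper's RBFNN and L-GEM bound, and all data and names are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D two-class problem: class 0 around -2, class 1 around +2.
labeled_X = np.array([-2.0, 2.0])
labeled_y = np.array([0, 1])
pool_X = rng.normal(0, 2, size=20)          # unlabeled pool

def predict(x, X, y):
    # nearest-centroid classifier standing in for the paper's RBFNN
    c0, c1 = X[y == 0].mean(), X[y == 1].mean()
    return 0 if abs(x - c0) < abs(x - c1) else 1

def informativeness(x, X, y):
    # stand-in criterion: samples nearest the decision midpoint are most
    # informative (the paper uses the localized generalization error bound)
    c0, c1 = X[y == 0].mean(), X[y == 1].mean()
    return -abs(abs(x - c0) - abs(x - c1))

for _ in range(5):                           # five active-learning rounds
    i = max(range(len(pool_X)),
            key=lambda j: informativeness(pool_X[j], labeled_X, labeled_y))
    x = pool_X[i]
    y_true = int(x > 0)                      # oracle: the human annotator
    labeled_X = np.append(labeled_X, x)
    labeled_y = np.append(labeled_y, y_true)
    pool_X = np.delete(pool_X, i)

# classify what remains in the pool with the enlarged training set
preds = [predict(x, labeled_X, labeled_y) for x in pool_X]
```

Each round spends one annotation on the sample the current model is least sure about, which is the mechanism by which active learning reduces labeling cost.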


International Journal of Wavelets, Multiresolution and Information Processing | 2013

Hyper-parameter selection for sparse LS-SVM via minimization of its localized generalization error

Binbin Sun; Wing W. Y. Ng; Daniel S. Yeung; Patrick P. K. Chan

Sparse LS-SVM yields better generalization capability and reduces prediction time in comparison to the full dense LS-SVM. However, both methods require careful selection of hyper-parameters (HPS) to achieve high generalization capability. Leave-One-Out Cross Validation (LOO-CV) and k-fold Cross Validation (k-CV) are the two most widely used hyper-parameter selection methods for LS-SVMs, but both fail to select good hyper-parameters for sparse LS-SVM. In this paper we propose a new hyper-parameter selection method, LGEM-HPS, for LS-SVM via minimization of the Localized Generalization Error (L-GEM). The L-GEM consists of two major components: the empirical mean square error and a sensitivity measure. A new sensitivity measure is derived for LS-SVM to enable LGEM-HPS to select hyper-parameters yielding an LS-SVM with smaller training error and minimum sensitivity to minor changes in the inputs. Experiments on eleven UCI data sets show the effectiveness of the proposed method for selecting hyper-parameters for sparse LS-SVMs.
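As a rough illustration of selecting hyper-parameters by minimizing an error-plus-sensitivity objective, the sketch below trains a toy LS-SVM (its standard dual linear system) and scores each hyper-parameter pair with empirical MSE plus a perturbation-based sensitivity estimate. The scoring function is a hedged stand-in: the paper derives an analytical sensitivity measure for LS-SVM, whereas this sketch estimates sensitivity numerically on made-up data.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    # pairwise squared distances -> Gaussian kernel matrix
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def train_lssvm(X, y, gamma, C):
    # standard LS-SVM dual: solve [[0, 1^T], [1, K + I/C]] [b; a] = [0; y]
    n = len(y)
    K = rbf_kernel(X, X, gamma)
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = 1.0
    M[1:, 0] = 1.0
    M[1:, 1:] = K + np.eye(n) / C
    sol = np.linalg.solve(M, np.concatenate(([0.0], y)))
    b, a = sol[0], sol[1:]
    return lambda Z: rbf_kernel(Z, X, gamma) @ a + b

def error_plus_sensitivity(f, X, y, eps=0.1, rounds=20):
    # hedged stand-in for L-GEM: empirical MSE plus a perturbation-based
    # sensitivity estimate (the paper derives the sensitivity analytically)
    rng = np.random.default_rng(0)
    base = f(X)
    mse = np.mean((base - y) ** 2)
    sens = np.mean([np.mean((f(X + rng.uniform(-eps, eps, X.shape)) - base) ** 2)
                    for _ in range(rounds)])
    return mse + sens

# toy two-class data with labels in {-1, +1}
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 2))
y = np.sign(X[:, 0] + X[:, 1])

# grid search over (gamma, C), keeping the pair with the lowest score
grid = [(g, C) for g in (0.1, 1.0, 10.0) for C in (0.1, 1.0, 10.0)]
best = min(grid, key=lambda gc: error_plus_sensitivity(train_lssvm(X, y, *gc), X, y))
```

The sensitivity term penalizes models whose outputs swing sharply under small input perturbations, which is the intuition behind preferring a low localized generalization error over training error alone.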


International Conference on Machine Learning and Cybernetics | 2008

MPEG-7 descriptor selection using Localized Generalization Error Model with mutual information

Jun Wang; Wing W. Y. Ng; Eric C. C. Tsang; Tao Zhu; Binbin Sun; Daniel S. Yeung

MPEG-7 provides a set of descriptors to describe the content of an image. However, how to select or combine descriptors for a specific image classification problem is still an open problem; currently, descriptors are usually selected by human experts. Moreover, selecting the same set of descriptors for different classes of images may not be reasonable. In this work we propose an MPEG-7 descriptor selection method which selects different MPEG-7 descriptors for different image classes in an image classification problem. The proposed method, L-GEMIM, combines the Localized Generalization Error Model (L-GEM) and Mutual Information (MI) to assess the relevance of MPEG-7 descriptors for a particular image class. L-GEMIM assesses relevance based on the generalization capability of an MPEG-7 descriptor using L-GEM, and prevents redundant descriptors from being selected by using MI. Experimental results using 4,000 images in 4 classes show that L-GEMIM selects a better set of MPEG-7 descriptors, yielding higher testing accuracy in image classification.
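The relevance-minus-redundancy idea can be illustrated with a small greedy selector. This sketch scores relevance with discrete mutual information rather than L-GEM (the paper's choice), so it is an mRMR-style approximation of the method; the toy "descriptors" and their values are made up.

```python
import numpy as np
from collections import Counter

def mutual_info(a, b):
    # discrete mutual information estimated from empirical joint counts
    n = len(a)
    pa, pb = Counter(a), Counter(b)
    pab = Counter(zip(a, b))
    return sum((c / n) * np.log((c / n) / ((pa[x] / n) * (pb[z] / n)))
               for (x, z), c in pab.items())

def greedy_select(features, labels, k):
    # greedy relevance-minus-redundancy selection; the paper scores
    # relevance with L-GEM, so using MI for both terms makes this an
    # mRMR-style approximation rather than L-GEMIM itself
    chosen, remaining = [], list(features)
    for _ in range(k):
        def score(f):
            rel = mutual_info(features[f], labels)
            red = (np.mean([mutual_info(features[f], features[g]) for g in chosen])
                   if chosen else 0.0)
            return rel - red
        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return chosen

# toy descriptors: f1 predicts the label, f2 duplicates f1, f3 is noise
labels = [0, 0, 1, 1, 0, 1, 0, 1]
features = {"f1": list(labels), "f2": list(labels),
            "f3": [0, 1, 0, 1, 1, 0, 0, 1]}
picked = greedy_select(features, labels, 2)
```

After f1 is chosen, the duplicate f2's relevance is fully cancelled by its redundancy with f1, which is the effect MI is used for in L-GEMIM.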


Systems, Man and Cybernetics | 2008

Quantitative study on candlestick pattern for Shenzhen Stock Market

Huili Li; Wing W. Y. Ng; John W. T. Lee; Binbin Sun; Daniel S. Yeung

The Shenzhen stock market has grown rapidly, yet it is still a young market compared with the Hong Kong, New York and London markets, and its daily turnover reaches billions of US dollars. A good prediction of stock prices would bring a substantial pecuniary reward. Technical analysis is a widely adopted financial prediction tool in stock markets worldwide, and the candlestick pattern is one of the most efficient methods in technical analysis. However, does candlestick pattern prediction work for Shenzhen stocks? And since candlestick patterns are usually defined in fuzzy terms, can we give these patterns a quantitative definition? We perform a quantitative study of these two research problems in this paper. We study the morning star pattern, and the method presented here can easily be extended to other patterns. We propose adopting radial basis function neural networks trained with the localized generalization error model to predict whether or not the stock price will increase after the pattern appears. We then extract the patterns from the neural network to provide a quantitative definition of the morning star pattern for a particular stock. Experimental results show that our modification to morning star pattern prediction prevents up to 69% of false predictions of the pattern. We also provide a quantitative measure of the morning star patterns for two of the Shenzhen stocks.
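For readers unfamiliar with the pattern, a conventional rule-of-thumb morning star test on three (open, close) pairs might look like the sketch below. The thresholds are illustrative only; they are exactly the kind of fuzzy hand-set values that the paper replaces with a quantitative, per-stock definition extracted from the trained network.

```python
def is_morning_star(c1, c2, c3, long_body=0.5):
    """Rule-of-thumb morning star test on three (open, close) pairs.
    The thresholds here are illustrative, not the quantitative
    definition extracted in the paper."""
    o1, cl1 = c1
    o2, cl2 = c2
    o3, cl3 = c3
    long_bearish = (o1 - cl1) > long_body            # day 1: long down candle
    small_body = abs(cl2 - o2) < 0.3 * (o1 - cl1)    # day 2: small-bodied star
    long_bullish = (cl3 - o3) > long_body            # day 3: long up candle
    recovers = cl3 > (o1 + cl1) / 2                  # closes past day-1 midpoint
    return long_bearish and small_body and long_bullish and recovers

# a textbook-shaped example: sharp drop, small star, strong recovery
found = is_morning_star((10.0, 8.0), (7.9, 8.1), (8.0, 9.5))
```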


International Conference on Wavelet Analysis and Pattern Recognition | 2008

L-gem based co-training for CBIR with relevance feedback

Tao Zhu; Wing W. Y. Ng; John W. T. Lee; Binbin Sun; Jun Wang; Daniel S. Yeung

Relevance feedback has been developed for several years and has become an effective method for capturing users' concepts to improve the performance of content-based image retrieval (CBIR). In contrast to the fully labeled training datasets of supervised learning, semi-supervised learning and active learning deal with training datasets in which only a small portion of the samples is labeled. This is more realistic because one can easily find thousands of unlabeled images on the Internet, and how to make use of such unlabeled resources is an important research topic. Co-training expands the number of labeled samples in semi-supervised learning by swapping training samples between two classifiers. In this work, we propose to apply the localized generalization error model (L-GEM) to co-training. Two radial basis function neural networks (RBFNNs) with different feature splits are adopted in the co-training, and the unlabeled samples with the lowest L-GEM values are added to the training set in the next iteration. In the CBIR system, we output the positive images with the lowest L-GEM values as the most confident results and present the images with the highest L-GEM values to the user for labeling: the higher the L-GEM value of a sample, the less confident the classifier is in predicting its image class. Experimental results show that the proposed method effectively improves the image retrieval results.
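The co-training loop itself is simple to sketch. Below, two centroid classifiers on two synthetic "views" stand in for the paper's two RBFNNs on different feature splits, and a margin-based confidence stands in for the L-GEM criterion (where a lower bound means a more trustworthy prediction); the data is made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two "views" of 60 samples (the paper splits image features between two
# RBFNNs; here each view is a noisy 1-D feature correlated with the class)
n = 60
y_all = np.tile([0, 1], n // 2)
view1 = y_all * 4.0 - 2.0 + rng.normal(0, 1.0, n)
view2 = y_all * 4.0 - 2.0 + rng.normal(0, 1.0, n)

labels = {i: int(y_all[i]) for i in range(6)}   # small labeled seed set
unlabeled = set(range(6, n))

def centroids(view):
    c0 = np.mean([view[i] for i, l in labels.items() if l == 0])
    c1 = np.mean([view[i] for i, l in labels.items() if l == 1])
    return c0, c1

def margin_and_pred(view, i):
    # margin-based confidence standing in for the paper's L-GEM criterion
    c0, c1 = centroids(view)
    d0, d1 = abs(view[i] - c0), abs(view[i] - c1)
    return abs(d0 - d1), (0 if d0 < d1 else 1)

for _ in range(20):                      # co-training rounds
    for view in (view1, view2):          # each classifier labels for the other
        i = max(unlabeled, key=lambda j: margin_and_pred(view, j)[0])
        labels[i] = margin_and_pred(view, i)[1]
        unlabeled.discard(i)
```

Each round, each view pseudo-labels its single most confident unlabeled sample, gradually growing the shared labeled set both classifiers train on.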


International Conference on Machine Learning and Cybernetics | 2009

L-GEM based MCS aided candlestick pattern investment strategy in the Shenzhen stock market

Wei Xiao; Wing W. Y. Ng; Michael Firth; Daniel S. Yeung; Gao-Yang Cai; Jin-Cheng Li; Binbin Sun

An integral part of China's economic reforms is the privatization of state-owned enterprises (SOEs) and the listing of their profitable units on the stock market. The two stock exchanges, in Shanghai and Shenzhen, opened nearly twenty years ago. The Shenzhen stock exchange is young and energetic; moreover, it practices a T+1 settlement rule instead of the real-time trading used in Hong Kong and other exchanges. One important research question is whether there are identifiable patterns in stock prices that can be used to develop profitable investment strategies; if such strategies can be found, this represents a violation of the efficient market hypothesis (EMH). In this work, we propose an investment strategy using Radial Basis Function Neural Networks (RBFNNs) trained by the Localized Generalization Error Model (L-GEM) and four stock price candlestick patterns. Each base RBFNN in the Multiple Classifier System (MCS) recognizes the occurrence of a particular candlestick pattern, and the MCS combines the opinions of the four base RBFNNs by a weighted sum to provide a final prediction. If the MCS predicts an increase for the next day, the strategy buys the stock and sells it within three days whenever the opening price is higher than the buy-in price, or else after three days have passed. Experimental results with stocks in the Shenzhen market show that our investment strategy statistically significantly outperforms a random investment, i.e. the EMH is invalid in this case.
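The buy/sell rule in the strategy is mechanical and easy to state in code. The sketch below implements only the exit rule described above (sell at the first of the next three opening prices that exceeds the buy-in price, otherwise at the third day's open); the price series is made up.

```python
def trade_profit(opens, buy_day):
    """Exit rule from the strategy: buy at the open of buy_day, then sell
    at the first of the next three opens that exceeds the buy-in price,
    otherwise sell at the third day's open."""
    buy = opens[buy_day]
    for d in range(buy_day + 1, buy_day + 4):
        if opens[d] > buy:
            return opens[d] - buy          # early exit at a profit
    return opens[buy_day + 3] - buy        # forced exit after three days

# toy series of daily opening prices
opens = [10.0, 9.8, 9.9, 10.3, 10.1]
profit = trade_profit(opens, 0)            # exits on day 3 at 10.3
```

Selling no earlier than the day after buying is consistent with the T+1 settlement rule mentioned in the abstract.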


International Journal of Machine Learning and Cybernetics | 2017

Improved sparse LSSVMs based on the localized generalization error model

Binbin Sun; Wing W. Y. Ng; Patrick P. K. Chan

The least squares support vector machine (LSSVM) is computationally efficient because it converts the quadratic programming problem in the training of an SVM into solving a linear system. Sparse LSSVMs have been proposed to improve prediction speed and generalization capability. In this paper, two sparse LSSVM algorithms, the SMRLSSVM and the RQRLSSVM, are proposed based on the Localized Generalization Error of the LSSVM. Experimental results show that the RQRLSSVM yields both better generalization capability and better sparseness in comparison to other sparse LSSVM algorithms.


Systems, Man and Cybernetics | 2008

Information extraction based on information fusion from multiple news sources from the web

Yang Lv; Wing W. Y. Ng; John W. T. Lee; Binbin Sun; Daniel S. Yeung

Traditional information extraction tools have been developed for years, but their extraction accuracy is not very satisfactory, especially for named entity extraction. In this work, we analyze the reasons for this and propose a novel method to improve the accuracy. Existing methods extract information from text collected from a single source, which makes it very difficult to extract the exact information we need. On the Internet, however, one can easily find tens of sources for the same information (e.g. a particular news story). We therefore propose to combine information extracted from multiple sources using majority voting. We use a change of CEO as an example and extract the new CEO, the original CEO and the company name for the event. An off-the-shelf named entity extraction tool is adopted, and our major contribution is the fusion of the extraction results. Without our method, a single news article yields many person and company names, so one cannot tell who the new CEO is; with our method, we provide the two CEO names and one company name. Experimental results show that our method has high accuracy in finding the exact information.
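The fusion step described above amounts to a per-slot majority vote over the candidates extracted from each source. A minimal sketch, with hypothetical per-article NER outputs (names and field layout are made up for illustration):

```python
from collections import Counter

def fuse(extractions):
    # majority vote per slot over the per-article extraction results
    fused = {}
    for slot in ("new_ceo", "old_ceo", "company"):
        votes = Counter(e[slot] for e in extractions if e.get(slot))
        fused[slot] = votes.most_common(1)[0][0]
    return fused

# hypothetical per-article outputs of an off-the-shelf NER tool
articles = [
    {"new_ceo": "A. Lee", "old_ceo": "B. Chan", "company": "Acme Corp"},
    {"new_ceo": "A. Lee", "old_ceo": "C. Wong", "company": "Acme Corp"},
    {"new_ceo": "A. Lee", "old_ceo": "B. Chan", "company": "Acme Corp"},
]
result = fuse(articles)
```

A single noisy article can report the wrong name in one slot (as the second article does here), but the vote across sources recovers the consensus answer.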


International Conference on Machine Learning and Cybernetics | 2009

Image classification using L-GEM based RBFNN with local feature keypoints and MPEG-7 descriptors

Qian-Cheng Wang; Daniel S. Yeung; Wing W. Y. Ng; Cheng-Hu Lin; Binbin Sun; Jin-Cheng Li

Images represented only by MPEG-7 descriptors as features may lose local details. In this work, we combine MPEG-7 descriptors with local feature keypoints to cover both global and local image characteristics. Images are classified by a Radial Basis Function Neural Network (RBFNN) trained via minimization of the Localized Generalization Error Model (L-GEM). We extract local feature keypoints with the Scale Invariant Feature Transform (SIFT), along with four color and three texture MPEG-7 descriptors. Experimental results show that the introduction of local feature keypoints effectively improves the testing accuracy of image classification.


International Conference on Machine Learning and Cybernetics | 2012

Sparse LS-SVM two-steps model selection method

Binbin Sun; Daniel S. Yeung

Least Squares Support Vector Machine (LS-SVM) converts the hinge loss of the SVM into a least squares loss, which simplifies the original quadratic programming training problem to solving a linear system; a sparse LS-SVM is then obtained with a pruning procedure. The performance of a sparse LS-SVM depends on the selection of hyper-parameters (i.e. kernel and penalty parameters). Currently, CV and LOO are the most common methods for selecting hyper-parameters for LS-SVM. However, CV is computationally expensive, while LOO yields a high variance of validation error, which may mislead the selection of hyper-parameters, and selecting both kernel and penalty parameters simultaneously requires searching a high-dimensional parameter space. In this work, we propose a new two-step hyper-parameter selection method. The Distance Between Two Classes (DBTC) method is adopted to select the kernel parameters by maximizing the between-class separation of the projected samples in the feature space. The data distribution, however, is not helpful for penalty parameter selection, so we propose to select the penalty parameter via minimization of a Localized Generalization Error to enhance the generalization capability of the LS-SVM. Experimental results show that, compared to existing methods, the proposed two-step method yields better LS-SVMs in terms of average testing accuracy.
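The first step, kernel parameter selection by between-class separation, can be sketched directly: the squared distance between the two class means in the kernel-induced feature space has a closed form in kernel evaluations. The code below is a toy illustration with an RBF kernel on made-up data; the second step (choosing the penalty parameter by minimizing L-GEM) is only indicated in a comment, since the paper's error model is not reproduced here.

```python
import numpy as np

def rbf(A, B, gamma):
    # pairwise squared distances -> Gaussian kernel matrix
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def between_class_separation(X, y, gamma):
    # squared distance between class means in the kernel feature space:
    # ||m0 - m1||^2 = mean K(A,A) + mean K(B,B) - 2 mean K(A,B)
    A, B = X[y == 0], X[y == 1]
    return (rbf(A, A, gamma).mean() + rbf(B, B, gamma).mean()
            - 2.0 * rbf(A, B, gamma).mean())

# Step 1: pick the kernel width that maximizes between-class separation
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 0.5, (20, 2)), rng.normal(1, 0.5, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
gammas = [0.01, 0.1, 1.0, 10.0]
best_gamma = max(gammas, key=lambda g: between_class_separation(X, y, g))
# Step 2 (not shown): fix best_gamma and choose the penalty parameter C
# by minimizing the localized generalization error, per the paper.
```

Fixing the kernel parameter first reduces the remaining search to a one-dimensional sweep over the penalty parameter, which is the point of the two-step design.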

Collaboration


Dive into Binbin Sun's collaborations.

Top Co-Authors

Daniel S. Yeung (Harbin Institute of Technology)
Wing W. Y. Ng (Harbin Institute of Technology)
Jin-Cheng Li (South China University of Technology)
Jun Wang (Harbin Institute of Technology)
Patrick P. K. Chan (South China University of Technology)
John W. T. Lee (Hong Kong Polytechnic University)
Tao Zhu (Harbin Institute of Technology)
Cheng-Hu Lin (South China University of Technology)
Gao-Yang Cai (South China University of Technology)
Guo-Li Ye (Harbin Institute of Technology)