Publication


Featured research published by Man-Wai Mak.


IEEE Transactions on Neural Networks | 2000

Estimation of elliptical basis function parameters by the EM algorithm with application to speaker verification

Man-Wai Mak; Sun-Yuan Kung

This paper proposes to incorporate full covariance matrices into radial basis function (RBF) networks and to use the expectation-maximization (EM) algorithm to estimate the basis function parameters. The resulting networks, referred to as elliptical basis function (EBF) networks, are evaluated through a series of text-independent speaker verification experiments involving 258 speakers from a phonetically balanced, continuous speech corpus (TIMIT). We propose a verification procedure using RBF and EBF networks as speaker models and show that the networks are readily applicable to verifying speakers using LP-derived cepstral coefficients as features. Experimental results show that small EBF networks with basis function parameters estimated by the EM algorithm outperform large RBF networks trained with the conventional approach. The results also show that the equal error rate achieved by the EBF networks is about two-thirds of that achieved by the vector quantization (VQ)-based speaker models.
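
The difference between a radial and an elliptical basis function can be sketched as follows. This is only an illustration, assuming a Gaussian-shaped basis function and 2-D inputs; the function names, the 2-D restriction, and the exact functional form are assumptions made here and are not taken from the paper. Parameter estimation by EM is not shown.

```python
import math

def rbf_activation(x, mean, width):
    """Radial basis: a single width, so equal-activation contours are circles."""
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return math.exp(-0.5 * sq_dist / width ** 2)

def ebf_activation(x, mean, cov):
    """Elliptical basis: a full 2x2 covariance, so contours are ellipses."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if a <= 0 or det <= 0:
        raise ValueError("covariance must be positive definite")
    # Closed-form inverse of a 2x2 matrix.
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx = (x[0] - mean[0], x[1] - mean[1])
    # Squared Mahalanobis distance: dx^T * inv(cov) * dx
    maha = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
            + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    return math.exp(-0.5 * maha)
```

With an identity covariance matrix the elliptical function reduces to the radial one with unit width, which is a convenient sanity check.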


Journal of Theoretical Biology | 2013

GOASVM: A subcellular location predictor by incorporating term-frequency gene ontology into the general form of Chou's pseudo-amino acid composition

Shibiao Wan; Man-Wai Mak; Sun-Yuan Kung

Prediction of protein subcellular localization is an important yet challenging problem. Recently, several computational methods based on Gene Ontology (GO) have been proposed to tackle this problem and have demonstrated superiority over methods based on other features. Existing GO-based methods, however, do not fully use the GO information. This paper proposes an efficient GO method called GOASVM that exploits the information from the GO term frequencies and distant homologs to represent a protein in the general form of Chou's pseudo-amino acid composition. The method first selects a subset of relevant GO terms to form a GO vector space. Then, for each protein, the method uses the accession number (AC) of the protein or the ACs of its homologs to find the number of occurrences of the selected GO terms in the Gene Ontology annotation (GOA) database as a means to construct GO vectors for support vector machine (SVM) classification. With the advantages of GO term frequencies and a new strategy to incorporate useful homologous information, GOASVM can achieve a prediction accuracy of 72.2% on a new independent test set comprising novel proteins that were added to Swiss-Prot six years later than the creation date of the training set. GOASVM and Supplementary materials are available online at http://bioinfo.eie.polyu.edu.hk/mGoaSvmServer/GOASVM.html.
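
The term-frequency representation described in this abstract can be sketched roughly as below. This is a minimal illustration, not the GOASVM implementation: the helper name `go_vector` and the example identifier lists are assumptions, and the database lookup that would produce the retrieved terms is not shown.

```python
from collections import Counter

def go_vector(annotated_terms, selected_terms):
    """Count how often each selected GO term occurs among the retrieved annotations."""
    counts = Counter(annotated_terms)
    # Counter returns 0 for terms that were never retrieved.
    return [counts[term] for term in selected_terms]

# Illustrative GO terms retrieved for one protein (or its homologs).
retrieved = ["GO:0005634", "GO:0005737", "GO:0005634", "GO:0016020"]
# Illustrative subset of relevant GO terms that spans the vector space.
selected = ["GO:0005634", "GO:0005737", "GO:0005739"]

vector = go_vector(retrieved, selected)  # [2, 1, 0]
```

The resulting vectors would then be passed to an SVM classifier; terms outside the selected subset (here `GO:0016020`) are simply ignored.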


IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2008

PairProSVM: Protein Subcellular Localization Based on Local Pairwise Profile Alignment and SVM

Man-Wai Mak; Jian Guo; Sun-Yuan Kung

The subcellular locations of proteins are important functional annotations. An effective and reliable subcellular localization method is necessary for proteomics research. This paper introduces a new method, PairProSVM, to automatically predict the subcellular locations of proteins. The profiles of all protein sequences in the training set are constructed by PSI-BLAST, and the pairwise profile alignment scores are used to form feature vectors for training a support vector machine (SVM) classifier. It was found that PairProSVM outperforms the methods that are based on sequence alignment and amino acid compositions even if most of the homologous sequences have been removed. PairProSVM was evaluated on Huang and Li's and Gardy et al.'s protein data sets. The overall accuracies on these data sets reach 75.3 percent and 91.9 percent, respectively, which are higher than or comparable to those obtained by sequence alignment and composition-based methods.
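
The idea of turning pairwise scores into feature vectors can be sketched as follows. This is a hedged illustration only: `toy_score` is a deliberately crude stand-in (a count of matching positions) and is NOT profile alignment, and the sequences are made up. In the method described above, the scores would come from aligning PSI-BLAST profiles.

```python
def pairwise_feature_vectors(items, score):
    """Row i holds the scores of item i against every training item."""
    return [[score(a, b) for b in items] for a in items]

def toy_score(a, b):
    """Stand-in similarity: number of matching positions (not a profile alignment)."""
    return sum(1 for x, y in zip(a, b) if x == y)

# Made-up sequences used purely for illustration.
training = ["MKTAY", "MKSAY", "GLSDG"]
features = pairwise_feature_vectors(training, toy_score)
```

Each row has one dimension per training sequence, so the feature dimension grows with the size of the training set.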


BMC Bioinformatics | 2012

mGOASVM: Multi-label protein subcellular localization based on gene ontology and support vector machines

Shibiao Wan; Man-Wai Mak; Sun-Yuan Kung

Background: Although many computational methods have been developed to predict protein subcellular localization, most of the methods are limited to the prediction of single-location proteins. Multi-location proteins are either not considered or assumed not to exist. However, proteins with multiple locations are particularly interesting because they may have special biological functions, which are essential to both basic research and drug discovery.

Results: This paper proposes an efficient multi-label predictor, namely mGOASVM, for predicting the subcellular localization of multi-location proteins. Given a protein, the accession numbers of its homologs are obtained via BLAST search. Then, the original accession number and the homologous accession numbers of the protein are used as keys to search against the Gene Ontology (GO) annotation database to obtain a set of GO terms. Given a set of training proteins, a set of T relevant GO terms is obtained by finding all of the GO terms in the GO annotation database that are relevant to the training proteins. These relevant GO terms then form the basis of a T-dimensional Euclidean space on which the GO vectors lie. A support vector machine (SVM) classifier with a new decision scheme is proposed to classify the multi-label GO vectors. The mGOASVM predictor has the following advantages: (1) it uses the frequency of occurrences of GO terms for feature representation; (2) it selects the relevant GO subspace, which can substantially speed up the prediction without compromising performance; and (3) it adopts an efficient multi-label SVM classifier which significantly outperforms other predictors. Briefly, on two recently published virus and plant datasets, mGOASVM achieves an actual accuracy of 88.9% and 87.4%, respectively, which are significantly higher than those achieved by state-of-the-art predictors such as iLoc-Virus (74.8%) and iLoc-Plant (68.1%).

Conclusions: mGOASVM can efficiently predict the subcellular locations of multi-label proteins. The mGOASVM predictor is available online at http://bioinfo.eie.polyu.edu.hk/mGoaSvmServer/mGOASVM.html.
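
The abstract does not spell out the decision scheme, so the sketch below shows one plausible multi-label rule rather than the authors' exact method: every class whose SVM score is positive is predicted, and if no score is positive the single highest-scoring class is used. "Actual accuracy" is taken here to mean exact agreement between the predicted and true label sets, which is also an assumption. All names are hypothetical.

```python
def predict_labels(scores):
    """Return every class with a positive score; fall back to the top-scoring class."""
    positive = {label for label, s in scores.items() if s > 0}
    if positive:
        return positive
    return {max(scores, key=scores.get)}

def exact_match_accuracy(predicted_sets, true_sets):
    """Fraction of proteins whose predicted label set equals the true set exactly."""
    hits = sum(1 for p, t in zip(predicted_sets, true_sets) if p == t)
    return hits / len(true_sets)
```

Under this rule a protein can receive several locations at once, and a partially correct prediction counts as a miss in the exact-match measure.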


IEEE Transactions on Evolutionary Computation | 2000

A study of the Lamarckian evolution of recurrent neural networks

Kim-wing C. Ku; Man-Wai Mak; Wan-Chi Siu

Training neural networks by evolutionary search can require a long computation time. In certain situations, using Lamarckian evolution, local search and evolutionary search can complement each other to yield a better training algorithm. This paper demonstrates the potential of this evolutionary-learning synergy by applying it to train recurrent neural networks in an attempt to resolve a long-term dependency problem and the inverted pendulum problem. This work also aims at investigating the interaction between local search and evolutionary search when they are combined; it is found that the combinations are particularly efficient when the local search is simple. In the case where no teacher signal is available for the local search to learn the desired task directly, the paper proposes a related local task for the local search to learn, and finds that this approach is able to reduce the training time considerably.


Computer Speech & Language | 2014

A study of voice activity detection techniques for NIST speaker recognition evaluations

Man-Wai Mak; Hon-Bill Yu

Since 2008, interview-style speech has become an important part of the NIST speaker recognition evaluations (SREs). Unlike telephone speech, interview speech has a lower signal-to-noise ratio, which necessitates robust voice activity detectors (VADs). This paper highlights the characteristics of interview speech files in NIST SREs and discusses the difficulties in performing speech/non-speech segmentation in these files. To overcome these difficulties, this paper proposes using speech enhancement techniques as a pre-processing step for enhancing the reliability of energy-based and statistical-model-based VADs. A decision strategy is also proposed to overcome the undesirable effects caused by impulsive signals and sinusoidal background signals. The proposed VAD is compared with the ASR transcripts provided by NIST, the VAD in the ETSI-AMR Option 2 coder, a statistical-model (SM) based VAD, and a Gaussian mixture model (GMM) based VAD. Experimental results based on the NIST 2010 SRE dataset suggest that the proposed VAD outperforms these conventional ones whenever interview-style speech is involved. This study also demonstrates that (1) noise reduction is vital for energy-based VAD under low SNR; (2) the ASR transcripts and ETSI-AMR speech coder do not produce accurate speech and non-speech segmentations; and (3) spectral subtraction makes better use of background spectra than the likelihood-ratio tests in the SM-based VAD. The segmentation files produced by the proposed VAD can be found at http://bioinfo.eie.polyu.edu.hk/ssvad.
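
For orientation, a bare-bones energy-based VAD might look like the sketch below. It is not the VAD proposed in the paper: there is no speech enhancement step and no special handling of impulsive or sinusoidal background signals. The frame length, the decibel margin, and the rule of thresholding relative to the quietest frame are all assumptions chosen for illustration.

```python
import math

def frame_energies_db(samples, frame_len):
    """Log energy (dB) of consecutive non-overlapping frames."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(s * s for s in frame) / frame_len
        # Small constant avoids log10(0) on all-zero frames.
        energies.append(10.0 * math.log10(power + 1e-12))
    return energies

def energy_vad(samples, frame_len=160, margin_db=10.0):
    """Mark a frame as speech when it exceeds the quietest frame by margin_db."""
    energies = frame_energies_db(samples, frame_len)
    if not energies:
        return []
    threshold = min(energies) + margin_db
    return [e > threshold for e in energies]
```

A detector this simple degrades quickly as the noise floor rises toward the speech level, which is consistent with the abstract's point that noise reduction matters for energy-based VAD under low SNR.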


IEEE International Conference on Evolutionary Computation | 1997

Exploring the effects of Lamarckian and Baldwinian learning in evolving recurrent neural networks

Kim-wing C. Ku; Man-Wai Mak

A drawback of using genetic algorithms (GAs) to train recurrent neural networks is that it takes a large number of generations to evolve the networks into an optimal solution. In order to reduce the number of generations taken, the Lamarckian learning mechanism and the Baldwinian learning mechanism are embedded into a cellular GA. This paper investigates the effects of these two learning mechanisms on the convergence performance of the cellular GA. The criteria that make learning useful to GAs are also discussed. The results show that the Lamarckian mechanism is able to assist the cellular GA, while the Baldwinian mechanism fails to do so. In addition to reducing the number of generations taken, we have found that it is also possible to reduce the time taken by embedding learning into the cellular GA in an appropriate manner.
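
The distinction between the two learning mechanisms can be sketched in a few lines. This is a toy illustration under strong assumptions: the genotype is a single number rather than a set of network weights, the local search is a simple hill climber, and the function names are hypothetical. It does not reproduce the cellular GA used in the paper.

```python
def local_search(genotype, fitness, step=0.1, iters=5):
    """Simple hill climbing: try a step either way and keep any improvement."""
    best = genotype
    for _ in range(iters):
        for candidate in (best - step, best + step):
            if fitness(candidate) > fitness(best):
                best = candidate
    return best

def evaluate(genotype, fitness, mode):
    """Return (genotype passed on to reproduction, fitness used for selection)."""
    learned = local_search(genotype, fitness)
    if mode == "lamarckian":
        # Learned changes are written back into the genotype.
        return learned, fitness(learned)
    if mode == "baldwinian":
        # Learning only influences the fitness; the genotype is unchanged.
        return genotype, fitness(learned)
    raise ValueError("mode must be 'lamarckian' or 'baldwinian'")
```

In both modes selection sees the post-learning fitness; only the Lamarckian mode lets offspring inherit what was learned.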


Neurocomputing | 1999

On the improvement of the real time recurrent learning algorithm for recurrent neural networks

Man-Wai Mak; Kim-wing C. Ku; Yee-Ling Lu

This paper reviews different approaches to improving the real time recurrent learning (RTRL) algorithm and attempts to group them into common frameworks. The characteristics of sub-grouping strategy, mode exchange RTRL, and cellular genetic algorithms are discussed. The relationships between these algorithms are highlighted and their time complexities and convergence capability are compared. The learning algorithms are applied to train recurrent neural networks in an attempt to solve a long-term dependency problem, to model the Henon map, and to predict the chaotic intensity pulsations of an NH3 laser. The results show that the original RTRL algorithm achieves the lowest error among the gradient-based algorithms, but it requires the longest training time; whereas the sub-grouping strategy uses the shortest training time but its convergence capability is the poorest. The results also demonstrate that the cellular genetic algorithm is an alternative means of training recurrent neural networks when the gradient-based methods fail to find an acceptable solution.
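
As a point of reference for the Henon map benchmark mentioned above, the series a network would be trained to model can be generated as follows. The parameter values a = 1.4 and b = 0.3 are the classic choice and are an assumption here; the abstract does not say which values or initial conditions were used.

```python
def henon_series(n, a=1.4, b=0.3, x0=0.0, y0=0.0):
    """Generate n values of x from x[t+1] = 1 - a*x[t]^2 + y[t], y[t+1] = b*x[t]."""
    xs = []
    x, y = x0, y0
    for _ in range(n):
        # Tuple assignment uses the old x on the right-hand side for both updates.
        x, y = 1.0 - a * x * x + y, b * x
        xs.append(x)
    return xs
```

A one-step-ahead prediction task would pair each value in the series with the value that follows it.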


IEEE Transactions on Audio, Speech, and Language Processing | 2013

Boosting the Performance of I-Vector Based Speaker Verification via Utterance Partitioning

Wei Rao; Man-Wai Mak

The success of the recent i-vector approach to speaker verification relies on the capability of i-vectors to capture speaker characteristics and the subsequent channel compensation methods to suppress channel variability. Typically, given an utterance, an i-vector is determined from the utterance regardless of its length. This paper investigates how the utterance length affects the discriminative power of i-vectors and demonstrates that the discriminative power of i-vectors reaches a plateau quickly when the utterance length increases. This observation suggests that it is possible to make the best use of a long conversation by partitioning it into a number of sub-utterances so that more i-vectors can be produced for each conversation. To increase the number of sub-utterances without sacrificing the representation power of the corresponding i-vectors, repeated applications of frame-index randomization and utterance partitioning are performed. Results on the NIST 2010 speaker recognition evaluation (SRE) suggest that (1) using more i-vectors per conversation can help to find more robust linear discriminant analysis (LDA) and within-class covariance normalization (WCCN) transformation matrices, especially when the number of conversations per training speaker is limited; and (2) increasing the number of i-vectors per target speaker helps the i-vector based support vector machines (SVMs) to find better decision boundaries, thus making SVM scoring outperform cosine distance scoring by 19% and 9% in terms of minimum normalized DCF and EER, respectively.
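
The combination of frame-index randomization and utterance partitioning can be sketched roughly as below. This is a minimal illustration under stated assumptions: `frames` stands for any per-frame feature sequence, the function and parameter names are hypothetical, and i-vector extraction from each sub-utterance is not shown.

```python
import random

def partition_utterance(frames, num_parts, num_repeats, seed=0):
    """Shuffle frame indices, split them into sub-utterances, and repeat."""
    rng = random.Random(seed)
    n = len(frames)
    # Boundaries that split n frames into num_parts nearly equal chunks.
    bounds = [k * n // num_parts for k in range(num_parts + 1)]
    sub_utterances = []
    for _ in range(num_repeats):
        indices = list(range(n))
        rng.shuffle(indices)  # frame-index randomization
        for p in range(num_parts):
            part = indices[bounds[p]:bounds[p + 1]]
            sub_utterances.append([frames[i] for i in part])
    return sub_utterances
```

Each repeat yields `num_parts` sub-utterances that together cover every frame once, so `num_repeats` rounds give `num_parts * num_repeats` sub-utterances per conversation. Shuffling discards frame order, which is harmless when the downstream statistics do not depend on it.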


IEEE Transactions on Neural Networks | 1999

Adding learning to cellular genetic algorithms for training recurrent neural networks

Kim-wing C. Ku; Man-Wai Mak; Wan-Chi Siu

This paper proposes a hybrid optimization algorithm which combines the efforts of local search (individual learning) and cellular genetic algorithms (GAs) for training recurrent neural networks (RNNs). Each weight of an RNN is encoded as a floating point number, and a concatenation of the numbers forms a chromosome. Reproduction takes place locally in a square grid with each grid point representing a chromosome. Two approaches, Lamarckian and Baldwinian mechanisms, for combining cellular GAs and learning have been compared. Different hill-climbing algorithms are incorporated into the cellular GAs as learning methods. These include the real-time recurrent learning (RTRL) and its simplified versions, and the delta rule. The RTRL algorithm has been successively simplified by freezing some of the weights to form simplified versions. The delta rule, which is the simplest form of learning, has been implemented by considering the RNNs as feedforward networks during learning. The hybrid algorithms are used to train the RNNs to solve a long-term dependency problem. The results show that Baldwinian learning is inefficient in assisting the cellular GA. It is conjectured that the more difficult it is for genetic operations to produce the genotypic changes that match the phenotypic changes due to learning, the poorer is the convergence of Baldwinian learning. Most of the combinations using the Lamarckian mechanism show an improvement in reducing the number of generations required for an optimum network; however, only a few can reduce the actual time taken. Embedding the delta rule in the cellular GAs has been found to be the fastest method. It is also concluded that learning should not be too extensive if the hybrid algorithm is to benefit from learning.
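
The local reproduction on a square grid described above can be sketched as follows. Only the floating-point chromosome and the grid layout come from the abstract; the four-neighbor wrap-around neighborhood, the choice of the fittest neighbor as mate, uniform crossover, Gaussian mutation, and replace-if-better are all assumptions made for illustration, and the learning step is omitted.

```python
import random

def neighbors(row, col, size):
    """Four-neighbor (von Neumann) neighborhood on a wrap-around size x size grid."""
    return [((row - 1) % size, col), ((row + 1) % size, col),
            (row, (col - 1) % size), (row, (col + 1) % size)]

def step(grid, fitness, rng, sigma=0.1):
    """One synchronous generation: each cell mates with its fittest neighbor."""
    size = len(grid)
    new_grid = [row[:] for row in grid]
    for r in range(size):
        for c in range(size):
            parent = grid[r][c]
            mate = max((grid[i][j] for i, j in neighbors(r, c, size)), key=fitness)
            # Uniform crossover of floating-point genes, then Gaussian mutation.
            child = [rng.choice(pair) + rng.gauss(0.0, sigma)
                     for pair in zip(parent, mate)]
            # Replace the cell only if the child is fitter than the current occupant.
            if fitness(child) > fitness(parent):
                new_grid[r][c] = child
    return new_grid
```

Because mating is restricted to adjacent cells, good chromosomes spread gradually across the grid instead of taking over the whole population in one generation.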

Collaboration


Dive into Man-Wai Mak's collaborations.

Top Co-Authors

Shibiao Wan

Hong Kong Polytechnic University

Kwok-Kwong Yiu

Hong Kong Polytechnic University

Wei Rao

Hong Kong Polytechnic University

Ming-Cheung Cheung

Hong Kong Polytechnic University

Shi-Xiong Zhang

Hong Kong Polytechnic University

Jen-Tzung Chien

National Chiao Tung University

Chi-Kwong Li

Hong Kong Polytechnic University

Helen M. Meng

The Chinese University of Hong Kong

Na Li

Hong Kong Polytechnic University
