Vojislav Kecman
Virginia Commonwealth University
Publications
Featured research published by Vojislav Kecman.
International Conference on Machine Learning and Applications | 2010
Qi Li; Vojislav Kecman; Raied Salman
Calculating the Euclidean distance matrix is a data-intensive operation that becomes computationally prohibitive for large datasets. Recent developments in Graphics Processing Units (GPUs) have delivered excellent performance on scientific computing problems through massively parallel processing cores. However, due to the limited size of device memory, many GPU-based algorithms struggle to handle large datasets. In this paper, a chunking method is proposed to calculate the Euclidean distance matrix on large datasets. It is designed not only for scalability in multi-GPU environments but also to maximize the computational capability of each individual GPU device. We first implement a fast GPU algorithm suited to calculating submatrices of the Euclidean distance matrix. We then use a MapReduce-like framework to split the full distance matrix calculation into many small independent jobs, each computing a partial distance matrix that our GPU algorithm solves efficiently. The framework also dynamically allocates GPU resources to these independent jobs for maximum performance. The experimental results show a speedup of 15x on datasets containing more than half a million data points.
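The chunking idea in this abstract can be illustrated in a few lines of NumPy. This is a simplified CPU sketch (the paper dispatches each block to a GPU): the distance matrix is built block by block via the identity ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x·y, so no intermediate array larger than one chunk is ever materialized. The function name and chunk size are illustrative, not from the paper.

```python
import numpy as np

def chunked_euclidean_distances(X, Y, chunk=1024):
    """Euclidean distance matrix between rows of X (n x d) and Y (m x d),
    computed in row blocks of at most `chunk` so memory stays bounded.
    Illustrative CPU sketch of the chunking scheme described in the paper."""
    n = X.shape[0]
    y_sq = np.sum(Y ** 2, axis=1)                  # ||y||^2 for every row of Y
    D = np.empty((n, Y.shape[0]))
    for start in range(0, n, chunk):
        stop = min(start + chunk, n)
        Xb = X[start:stop]
        x_sq = np.sum(Xb ** 2, axis=1)[:, None]    # ||x||^2 for this block
        # ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y
        sq = x_sq + y_sq[None, :] - 2.0 * Xb @ Y.T
        np.maximum(sq, 0.0, out=sq)                # clip tiny negatives from rounding
        D[start:stop] = np.sqrt(sq)
    return D
```

Because the blocks are independent, they map naturally onto the MapReduce-like job splitting the abstract describes: each block is one job, and a scheduler can hand jobs to whichever GPU is free.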
Expert Systems With Applications | 2011
Tao Yang; Vojislav Kecman; Longbing Cao; Chengqi Zhang; Joshua Zhexue Huang
Recognition of protein folding patterns is an important step in protein structure and function prediction. The traditional sequence-similarity-based approach fails to yield convincing predictions when proteins have low sequence identity, while the taxonometric approach is a reliable alternative. From a pattern recognition perspective, protein fold recognition involves a large number of classes with only a small number of training samples, and multiple heterogeneous feature groups derived from different propensities of amino acids. This raises the need for a classification method that can handle such data complexity with high prediction accuracy in practical applications. To this end, a novel ensemble classifier, called MarFold, is proposed in this paper; it combines three margin-based classifiers for protein fold recognition. The effectiveness of our method is demonstrated on the benchmark D-B dataset with 27 classes. The overall prediction accuracy obtained by MarFold is 71.7%, which surpasses existing fold recognition methods by 3.1-15.7%. Moreover, one component classifier of MarFold, called ALH, obtains a prediction accuracy of 65.5%, which is 4.7-9.5% higher than the accuracies of published single-classifier methods. Additionally, the feature set of pairwise frequency information about the amino acids, which is adopted by MarFold, is found to be important for discriminating folding patterns. These results imply that the MarFold method and its operation engine ALH may become useful vehicles for protein fold recognition, as well as for other bioinformatics tasks. The MarFold method and the datasets can be obtained from: (http://www-staff.it.uts.edu.au/~lbcao/publication/MarFold.7z).
IEEE Transactions on Neural Networks | 2010
Johnny Wei-Hsun Kao; Stevan M. Berber; Vojislav Kecman
The algorithm and results of a blind multiuser detector using a machine learning technique called the support vector machine (SVM), applied to a chaos-based code-division multiple-access system, are presented in this paper. Simulation results show that the performance achieved by the SVM is comparable to that of the existing minimum mean square error (MMSE) detector under both additive white Gaussian noise (AWGN) and Rayleigh fading conditions. However, unlike the MMSE detector, the SVM detector requires neither knowledge of the spreading codes of other users in the system nor an estimate of the channel noise variance. The optimization of this algorithm is considered in this paper and its complexity is compared with that of the MMSE detector. The SVM detector is much better suited to the forward link than the MMSE detector. In addition, original theoretical bit-error-rate expressions for the SVM detector under both AWGN and Rayleigh fading are derived to verify the simulation results.
Journal of Parallel and Distributed Computing | 2013
Qi Li; Raied Salman; Erik Test; Robert Strack; Vojislav Kecman
The Support Vector Machine (SVM) is an efficient machine learning tool with high predictive accuracy. However, to achieve the best accuracy, n-fold cross validation is commonly used to identify the best hyperparameters for the SVM. This becomes a weak point of the SVM due to the extremely long training time across the many hyperparameter settings of different kernel functions. In this paper, a novel parallel SVM training implementation is proposed to accelerate the cross validation procedure by running multiple training tasks simultaneously on a Graphics Processing Unit (GPU). All tasks, each with different hyperparameters, share the same cache memory, which stores the kernel matrix of the support vectors. This greatly reduces redundant computation of kernel values across the training tasks. Since kernel value computation is the most time-consuming operation in SVM training, the total cost of the cross validation procedure decreases significantly. The experimental tests indicate that the time cost of the multitask cross validation training is very close to that of the slowest task trained alone. Comparison tests show that the proposed method is 10 to 100 times faster than the state-of-the-art LIBSVM tool.
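The shared-kernel-cache idea can be sketched serially on a CPU with scikit-learn's precomputed-kernel interface: the RBF kernel matrix is computed once and every (C, fold) task reuses the relevant rows, instead of each task recomputing kernel values. This is a minimal single-threaded illustration of the caching principle, not the paper's concurrent GPU implementation; the function name and grid are assumptions.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import StratifiedKFold

def cv_shared_kernel(X, y, C_grid, gamma, n_folds=3):
    """Cross-validate several values of C while computing the RBF kernel
    matrix only once and sharing it across all tasks. A serial sketch of
    the shared kernel cache; the paper runs the tasks concurrently on a GPU."""
    sq = np.sum(X ** 2, axis=1)
    # Full RBF kernel, computed a single time and reused by every task.
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2.0 * X @ X.T))
    skf = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=0)
    scores = {}
    for C in C_grid:
        accs = []
        for tr, te in skf.split(X, y):
            clf = SVC(C=C, kernel="precomputed")
            clf.fit(K[np.ix_(tr, tr)], y[tr])       # reuse cached kernel rows
            accs.append(clf.score(K[np.ix_(te, tr)], y[te]))
        scores[C] = float(np.mean(accs))
    return scores
```

In the serial sketch the saving comes purely from not recomputing K per task; the paper additionally overlaps the tasks on the GPU, which is why its multitask time approaches that of the slowest single task.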
Neurocomputing | 2013
Robert Strack; Vojislav Kecman; Beata Strack; Qi Li
This paper introduces the Sphere Support Vector Machine (SphereSVM), a new fast classification algorithm that combines a minimal enclosing ball approach, state-of-the-art nearest-point-problem solvers, and probabilistic techniques. The blend of the three significantly speeds up the training phase of SVMs while attaining practically the same accuracy as other classification models over several large real datasets, within the strict validation frame of a double (nested) cross-validation. The results promote SphereSVM as an outstanding alternative for handling large and ultra-large datasets in reasonable time, without resorting to the various parallelization schemes recently proposed for SVM algorithms.
Computational Intelligence in Bioinformatics and Computational Biology | 2009
Vojislav Kecman; Tao Yang
The protein fold recognition task is important for understanding the biological functions of proteins. The adaptive local hyperplane (ALH) algorithm has been shown to outperform many other renowned classifiers, including support vector machines, K-nearest neighbor, linear discriminant analysis, K-local hyperplane distance nearest neighbor algorithms, and decision trees, on a variety of datasets. In this paper, we apply the ALH algorithm to the well-known protein fold recognition datasets without sequence similarity from Ding and Dubchak (2001). The results demonstrate that the ALH algorithm outperforms all seven other well-established benchmark classifiers applied to the same datasets.
Central European Journal of Computer Science | 2011
Qi Li; Raied Salman; Erik Test; Robert Strack; Vojislav Kecman
GPUSVM (Graphics Processing Unit Support Vector Machine) is a Compute Unified Device Architecture (CUDA) based Support Vector Machine (SVM) package. It is designed to offer end-users a fully functional and user-friendly SVM tool that harnesses the power of GPUs. The core package includes an efficient cross validation tool, a fast training tool, and a prediction tool. In this article, we first introduce the background theory of how we built our parallel SVM solver using the CUDA programming model. We then compare our GPUSVM package with the popular state-of-the-art Libsvm package on several well-known datasets. The preliminary results show a one to two orders of magnitude speed improvement over Libsvm in both the training and prediction phases on our Tesla server.
Pattern Analysis and Applications | 2010
Tao Yang; Vojislav Kecman
The paper introduces a novel adaptive local hyperplane (ALH) classifier and demonstrates its superior performance on face recognition tasks. Four different feature extraction methods (2DPCA, (2D)2PCA, 2DLDA and (2D)2LDA) have been used in combination with five classifiers: K-nearest neighbor (KNN), support vector machine (SVM), nearest feature line (NFL), nearest neighbor line (NNL), and ALH. All the classifiers and feature extraction methods have been applied to renowned benchmark face databases, the Cambridge ORL database and the Yale database, and the ALH classifier with an LDA-based extractor outperforms all the other methods on both. The ALH algorithm is very promising on these two databases, but further study on larger databases is needed to show all the advantages of the proposed algorithm.
Information Sciences | 2017
Gabriella Melki; Alberto Cano; Vojislav Kecman; Sebastián Ventura
Multi-target regression is a challenging task that consists of creating predictive models for problems with multiple continuous target outputs. Despite the increasing attention given to multi-label classification, there are fewer studies concerning multi-target (MT) regression. The current leading MT models are based on ensembles of regressor chains, where random, differently ordered chains of the target variables are created and used to build separate regression models, using the previous target predictions in the chain. The challenges of building MT models stem from trying to capture and exploit possible correlations among the target variables during training. This paper presents three multi-target support vector regression models. The first builds independent, single-target Support Vector Regression (SVR) models for each output variable. The second builds an ensemble of random chains using the first method as a base model. The third calculates the targets' correlations and forms a maximum correlation chain, which is used to build a single chained support vector regression model, improving prediction performance while reducing computational complexity. The experimental study evaluates and compares the performance of the three approaches with seven other state-of-the-art multi-target regressors on 24 multi-target datasets. The experimental results are then analyzed using non-parametric statistical tests. The results show that the maximum correlation SVR approach improves on the performance of ensembles of random chains.
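A chained SVR of the kind described above can be sketched with scikit-learn: each target's model sees the original features plus the target columns earlier in the chain. The greedy correlation-based ordering below is a simplified stand-in for the paper's maximum correlation chain construction (the exact ordering rule is not given in the abstract), and the function name is illustrative.

```python
import numpy as np
from sklearn.svm import SVR

def correlation_chain_svr(X, Y, C=1.0):
    """Fit one SVR per target along a chain ordered by target correlation,
    appending each earlier target's training values to the feature matrix
    for later models. Simplified sketch of a chained multi-target SVR."""
    corr = np.abs(np.corrcoef(Y.T))
    np.fill_diagonal(corr, 0.0)
    # Greedy chain: start at the overall most-correlated target, then keep
    # appending the target most correlated with the previous link.
    order = [int(np.argmax(corr.sum(axis=1)))]
    while len(order) < Y.shape[1]:
        remaining = [t for t in range(Y.shape[1]) if t not in order]
        order.append(max(remaining, key=lambda t: corr[order[-1], t]))
    models, Z = [], X
    for t in order:
        models.append((t, SVR(C=C).fit(Z, Y[:, t])))
        Z = np.hstack([Z, Y[:, [t]]])   # chain: add this target as a feature
    return order, models
```

At prediction time the chain is walked in the same order, feeding each model's prediction (rather than the true target) to the models after it, which is where the inter-target correlations are exploited.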
International Symposium on Neural Networks | 2010
Vojislav Kecman; J. Paul Brooks
The paper introduces various local models for solving machine learning (i.e., data mining) problems. In particular, owing to their superior results, it focuses on a novel design of locally linear support vector machine classifiers, presenting them as powerful alternatives to global (over the whole input space) nonlinear classifiers. The locally linear support vector machine (LL SVM) maximizes the margin in the original input feature space and never performs a nonlinear mapping to a kernel-induced feature space. In doing so it uses only the K points closest to the query point q, and in this way it captures the local decision function better than the standard global SVM does. This is shown to be a powerful approach when data are unevenly distributed in the input space and when a suitable decision function possesses different nonlinear characteristics in various parts of the input space. Experiments on eleven benchmark datasets demonstrate both the superior performance of LL SVMs and the strong performance of other classic locally linear classifiers. In addition, this is the first paper to prove stability bounds for local SVMs, and it shows that they are tighter than those for the traditional global SVM. LL SVM is a natural classifier for multiclass problems and can easily be adapted to solve regression tasks.
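The core LL SVM idea, training a linear maximum-margin classifier only on the K neighbors of the query, can be sketched in a few lines with scikit-learn. This is a minimal per-query illustration under assumed names and defaults, not the paper's implementation; it also takes the common shortcut of returning the majority label directly when the neighborhood contains a single class.

```python
import numpy as np
from sklearn.svm import LinearSVC

def ll_svm_predict(X, y, q, k=20, C=1.0):
    """Classify query point q by training a linear SVM only on its k nearest
    training points, i.e. maximize the margin locally in the original input
    space. Minimal sketch of the locally linear SVM idea."""
    d = np.linalg.norm(X - q, axis=1)
    idx = np.argsort(d)[:k]                 # the k points closest to q
    labels = np.unique(y[idx])
    if labels.size == 1:                    # pure neighborhood: no SVM needed
        return labels[0]
    clf = LinearSVC(C=C).fit(X[idx], y[idx])
    return clf.predict(q.reshape(1, -1))[0]
```

Because a fresh linear model is fit per query, the decision boundary bends with the local structure of the data, which is exactly the behavior the paper exploits on unevenly distributed datasets.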