
Publications


Featured research published by Vilen Jumutc.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2014

Multi-Class Supervised Novelty Detection

Vilen Jumutc; Johan A. K. Suykens

In this paper we study the problem of finding the support of unknown high-dimensional distributions in the presence of labeling information, which we call Supervised Novelty Detection (SND). The One-Class Support Vector Machine (SVM) is a widely used kernel-based technique for this problem; however, with that approach it is difficult to model a mixture of distributions from which the support might be constituted. We address this issue by presenting a new class of SVM-like algorithms which approach multi-class classification and novelty detection from a new perspective. We introduce a new coupling term between classes which balances the goals of finding a good decision boundary and preserving the compactness of the support under the l2-norm penalty. First, we present our optimization objective in the primal and derive a dual QP formulation of the problem. Next, we propose a least-squares formulation which results in a linear system and drastically reduces computational costs. Finally, we derive a Pegasos-based formulation which can effectively cope with large datasets that cannot be handled by many existing QP solvers. We conclude with experiments that validate the usefulness and practical importance of the proposed methods in both classification and novelty detection settings.
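
The coupled SND objective itself is not reproduced here, but a minimal baseline helps situate it: the sketch below (with illustrative names and parameters) trains one independent One-Class SVM per class, which is the starting point that the paper's coupling term extends. It is not the authors' algorithm.

```python
# Per-class One-Class SVM baseline (no SND coupling term): one
# independent support estimate per class, assuming integer labels.
import numpy as np
from sklearn.svm import OneClassSVM

def fit_per_class_ocsvm(X, y, nu=0.1, gamma="scale"):
    """Fit one One-Class SVM on the samples of each class."""
    return {label: OneClassSVM(nu=nu, gamma=gamma).fit(X[y == label])
            for label in np.unique(y)}

def predict_with_novelty(models, X):
    """Assign the class with the highest decision score; flag a point
    as novel (-1) if it falls outside every estimated support."""
    labels = np.array(list(models.keys()))
    scores = np.column_stack([m.decision_function(X) for m in models.values()])
    pred = labels[scores.argmax(axis=1)]
    pred[(scores < 0).all(axis=1)] = -1
    return pred
```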


International Conference on Big Data | 2014

Representative subsets for big data learning using k-NN graphs

Raghvendra Mall; Vilen Jumutc; Rocco Langone; Johan A. K. Suykens

In this paper we propose a deterministic method to obtain subsets of big data that are good representatives of the inherent structure in the data. We first convert the large-scale dataset into a sparse undirected k-NN graph using a distributed network generation framework that we propose in this paper. After obtaining the k-NN graph, we exploit the fast and unique representative subset (FURS) selection method [1], [2] to deterministically obtain a subset of this big data network. The FURS selection technique selects nodes from different dense regions in the graph, retaining the natural community structure. We then locate the points in the original big data corresponding to the selected nodes and compare the obtained subset with subsets acquired from state-of-the-art subset selection techniques. We evaluate the quality of the selected subset on several synthetic and real-life datasets for different learning tasks, including big data classification and big data clustering.
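
FURS itself is specified in [1], [2]; as a rough single-machine illustration of the pipeline (sparse k-NN graph, then greedy selection from dense regions), consider the sketch below. The deactivation-based greedy step only approximates FURS, and the paper's distributed graph-generation framework is replaced by scikit-learn's in-memory routine.

```python
# Simplified sketch: build a sparse symmetric k-NN graph, then greedily
# pick high-degree nodes while deactivating their neighbours, so the
# selection spreads over dense regions (a rough stand-in for FURS).
import numpy as np
from sklearn.neighbors import kneighbors_graph

def knn_graph(X, k=10):
    A = kneighbors_graph(X, n_neighbors=k, mode="connectivity")
    return ((A + A.T) > 0).astype(int)  # symmetrize

def greedy_subset(A, m):
    degree = np.asarray(A.sum(axis=1)).ravel().astype(float)
    active = np.ones(len(degree), dtype=bool)
    chosen = []
    for _ in range(m):
        if not active.any():  # all blocked: reactivate non-chosen nodes
            active = ~np.isin(np.arange(len(degree)), chosen)
        node = int(np.where(active, degree, -1.0).argmax())
        chosen.append(node)
        active[node] = False
        active[A[node].indices] = False  # deactivate its neighbours
    return chosen
```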


International Symposium on Neural Networks | 2013

Fixed-size Pegasos for hinge and pinball loss SVM

Vilen Jumutc; Xiaolin Huang; Johan A. K. Suykens

Pegasos has become a widely acknowledged algorithm for learning linear Support Vector Machines. It exploits properties of the hinge loss and the theory of strongly convex optimization problems to achieve fast convergence rates and low computational and memory costs. In this paper we adopt the recently proposed pinball loss for the Pegasos algorithm and show the advantages of using it in a variety of classification problems. First, we present the newly derived Pegasos optimization objective with respect to the pinball loss and analyze its properties and convergence rates. Additionally, we present extensions of the Pegasos algorithm to kernel-induced and Nyström-approximated feature maps, which introduce non-linearity in the input space; this is done using a Fixed-Size kernel method approach. Second, we give experimental results on publicly available UCI datasets to demonstrate the advantages and importance of the pinball loss for achieving better classification accuracy and greater numerical stability in the partially or fully stochastic setting. Finally, we conclude with a brief discussion of the applicability of the pinball loss to real-life problems.
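
As a hedged illustration of the stochastic step, the sketch below applies the standard Pegasos update with the pinball-loss subgradient in the linear (primal) case; setting tau = 0 recovers the usual hinge-loss Pegasos. The paper's Fixed-Size kernel extension via Nyström feature maps is not shown, and all parameter names are illustrative.

```python
# Linear Pegasos with the pinball loss L_tau(v) = v if v >= 0,
# else -tau * v, evaluated at v = 1 - y * <w, x>. Illustrative sketch.
import numpy as np

def pegasos_pinball(X, y, lam=0.01, tau=0.1, epochs=5, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            t += 1
            eta = 1.0 / (lam * t)       # Pegasos step size
            v = 1.0 - y[i] * (w @ X[i])
            w *= 1.0 - eta * lam        # shrinkage from the l2 term
            if v > 0:
                w += eta * y[i] * X[i]          # hinge-side subgradient
            elif v < 0:
                w -= eta * tau * y[i] * X[i]    # pinball side: penalize v < 0
    return w
```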


International Conference on Big Data | 2015

Regularized and sparse stochastic k-means for distributed large-scale clustering

Vilen Jumutc; Rocco Langone; Johan A. K. Suykens

In this paper we present a novel clustering approach based on the stochastic learning paradigm and l1-norm regularization. Our approach is an extension of the widely acknowledged K-Means algorithm. We introduce a simple regularized dual averaging scheme for learning prototype vectors (centroids) with l1-norms in a stochastic mode. In our approach we distribute the learning of the individual prototype vectors for each cluster, and the re-assignment of cluster memberships is performed only for a fixed number of outer iterations. The latter step is exactly the same as in the original K-Means algorithm and aims at re-shuffling the pool of samples per cluster according to the learned centroids. We report an extended evaluation and comparison of our approach with respect to various clustering techniques, such as randomized K-Means and Proximal Plane Clustering. Our experimental studies indicate the usefulness of the proposed methods for obtaining better prototype vectors and corresponding cluster memberships, while being able to perform feature selection by l1-norm minimization.
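
A compact sketch of the per-cluster centroid update under one simple reading of the scheme: l1-regularized dual averaging with soft-thresholding of the averaged subgradients of the squared distance. The distributed execution and the paper's exact parameterization are omitted.

```python
# One cluster's prototype via l1-regularized dual averaging (RDA):
# average the subgradients of 0.5 * ||w - x||^2 and soft-threshold.
# A hedged sketch, not the paper's exact update.
import numpy as np

def rda_centroid(points, lam=0.1, gamma=1.0, epochs=3, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(points.shape[1])
    gbar = np.zeros_like(w)     # running average of subgradients
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(len(points)):
            t += 1
            gbar += ((w - points[i]) - gbar) / t
            # closed-form l1-RDA step: w = -(sqrt(t)/gamma) * soft(gbar, lam)
            w = -(np.sqrt(t) / gamma) * np.sign(gbar) * np.maximum(np.abs(gbar) - lam, 0.0)
    return w
```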


Computational Intelligence and Data Mining | 2013

Supervised Novelty Detection

Vilen Jumutc; Johan A. K. Suykens

In this paper we present a novel approach and a new machine learning problem, called Supervised Novelty Detection (SND). This problem extends the One-Class Support Vector Machine setting to binary classification while keeping the nice properties of the novelty detection problem at hand. To tackle this, we approach binary classification from a new perspective using two different estimators and a coupled regularization term; this involves optimization over a different objective and a doubled set of Lagrange multipliers. One might consider our approach as a joint estimation of the support of a different probability distribution per class, where the ultimate goal is to separate the classes with the largest possible angle between the normal vectors of the decision hyperplanes in the feature space. Given the novelty of our problem, we report and compare results along the lines of the standard C-SVM, LS-SVM and One-Class SVM. Experiments demonstrate promising results that validate the usefulness of the proposed method.
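
The geometric intuition behind the coupling term can be stated compactly; the following is an illustration rather than the paper's exact objective. For the normal vectors \(w_1\) and \(w_2\) of the two decision hyperplanes,

\[ \cos\theta = \frac{\langle w_1, w_2 \rangle}{\|w_1\|\,\|w_2\|}, \]

so with the norms already controlled by the usual regularization terms, penalizing the inner product \(\langle w_1, w_2 \rangle\) pushes the angle \(\theta\) between the hyperplanes towards its maximum.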


Neurocomputing | 2016

Reweighted stochastic learning

Vilen Jumutc; Johan A. K. Suykens

Recent advances in stochastic learning, such as dual averaging schemes for proximal subgradient-based methods and simple but theoretically well-grounded solvers for linear Support Vector Machines (SVMs), reveal an ongoing interest in making these approaches consistent, robust and tailored towards sparsity-inducing norms. In this paper we study reweighted schemes for stochastic learning (specifically in the context of classification problems) based on linear SVMs and dual averaging methods with primal-dual iterate updates. All these methods exploit the properties of a convex and composite optimization objective, consisting of a convex regularization term and a loss function with Lipschitz continuous subgradients, e.g. the l1-norm ball together with the hinge loss. Some of these approaches approximate an l0-type penalty in the limit. In our analysis we focus on regret and convergence criteria of such an approximation. We derive our results in terms of a sequence of convex and strongly convex optimization objectives, obtained via the smoothing of a generic subdifferentiable and possibly non-smooth composite function by the global proximal operator. We report an extended evaluation and comparison of the reweighted schemes against different state-of-the-art techniques and solvers for linear SVMs. Our experimental study indicates the usefulness of the proposed methods for obtaining sparser and better solutions, and shows that reweighted schemes can outperform traditional state-of-the-art approaches in terms of generalization error as well.
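
The l0-type approximation mentioned above is commonly realized by iterative reweighting; as an illustration of the mechanism (not necessarily the authors' exact scheme), with \(w^{(k)}\) the previous iterate and a small \(\epsilon > 0\),

\[ \sum_j \frac{|w_j|}{|w_j^{(k)}| + \epsilon} \;\approx\; \|w\|_0 \quad \text{near } w = w^{(k)}, \]

since coordinates that stay large contribute roughly one each, while coordinates near zero are penalized increasingly strongly and are driven exactly to zero.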


Archive | 2017

Large-Scale Clustering Algorithms

Rocco Langone; Vilen Jumutc; Johan A. K. Suykens

Computational tools in modern data analysis must be scalable to satisfy business and research time constraints. In this regard, two alternatives are possible: (i) adapt available algorithms, or design new approaches, such that they can run in a distributed computing environment; (ii) develop model-based learning techniques that can be trained efficiently on a small subset of the data and make reliable predictions. In this chapter two recent algorithms following these different directions are reviewed. In the first part, a scalable in-memory spectral clustering algorithm is described. This technique relies on a kernel-based formulation of the spectral clustering problem, also known as kernel spectral clustering. More precisely, a finite-dimensional approximation of the feature map via the Nyström method is used to solve the primal optimization problem, which decreases the computational time from cubic to linear. In the second part, a distributed clustering approach with a fixed computational budget is illustrated. This method extends the k-means algorithm by applying regularization at the level of the prototype vectors. An optimal stochastic gradient descent scheme for learning with \(l_1\) and \(l_2\) norms is utilized, which makes the approach less sensitive to the influence of outliers while computing the prototype vectors.
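
As a rough single-machine illustration of the first direction (a Nyström-approximated feature map followed by a solve in the primal), the sketch below uses off-the-shelf scikit-learn components; the chapter's actual kernel spectral clustering formulation is not reproduced.

```python
# Nystrom feature map + clustering in the approximate feature space:
# avoids eigendecomposing the full n-by-n kernel matrix (cubic cost),
# keeping the cost linear in n for a fixed number of components.
import numpy as np
from sklearn.kernel_approximation import Nystroem
from sklearn.cluster import KMeans

def nystrom_cluster(X, n_clusters=3, n_components=200, gamma=0.5, seed=0):
    feature_map = Nystroem(kernel="rbf", gamma=gamma,
                           n_components=min(n_components, len(X)),
                           random_state=seed)
    Z = feature_map.fit_transform(X)  # finite-dimensional feature map
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return km.fit_predict(Z)
```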


International Symposium on Neural Networks | 2014

Reweighted L2-regularized dual averaging approach for highly sparse stochastic learning

Vilen Jumutc; Johan A. K. Suykens

Recent advances in dual averaging schemes for primal-dual subgradient methods and stochastic learning reveal an ongoing and growing interest in making stochastic and online approaches consistent and tailored towards sparsity-inducing norms. In this paper we focus on the reweighting scheme in the \(l_2\)-Regularized Dual Averaging approach, which enjoys the properties of a strongly convex optimization objective while approximating an \(l_0\)-type penalty in the limit. In our analysis we focus on regret and convergence criteria of such an approximation. We derive our results in terms of a sequence of strongly convex optimization objectives, obtained via the smoothing of a subdifferentiable and non-smooth loss function, e.g. the hinge loss. We report an empirical evaluation of the convergence in terms of the cumulative training error and the stability of the selected set of features. The experimental evaluation shows some improvements over the \(l_1\)-RDA method in the generalization error as well.
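
One plausible reading of the reweighted \(l_2\)-RDA step, sketched below: per-coordinate weights \(\lambda_j = \lambda / (|w_j| + \epsilon)\) shrink small coordinates harder, mimicking an \(l_0\)-type penalty, and the \(l_2\)-regularized dual averaging step has a closed form. Parameter names and the exact damping are illustrative, not the paper's.

```python
# Reweighted l2-RDA sketch: replay a sequence of subgradients.
# Each step solves argmin_w <gbar, w> + 0.5 * sum_j lam_j * w_j**2
#                  + (beta_t / (2 * t)) * ||w||**2 in closed form.
import numpy as np

def reweighted_l2_rda(grads, lam=0.1, gamma=1.0, eps=1e-3):
    w = np.zeros_like(grads[0])
    gbar = np.zeros_like(w)
    for t, g in enumerate(grads, start=1):
        gbar += (g - gbar) / t                  # running subgradient average
        lam_j = lam / (np.abs(w) + eps)         # reweighting step
        beta_t = gamma * np.sqrt(t)             # strongly convex damping
        w = -t * gbar / (t * lam_j + beta_t)    # closed-form minimizer
    return w
```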


Computational Intelligence and Data Mining | 2014

New bilinear formulation to semi-supervised classification based on Kernel Spectral Clustering

Vilen Jumutc; Johan A. K. Suykens

In this paper we present a novel semi-supervised classification approach with a bilinear formulation for non-parallel binary classifiers, based upon Kernel Spectral Clustering. The cornerstone of our approach is a bilinear term introduced into the primal formulation of the semi-supervised classification problem. In addition, we perform separate manifold regularization for each individual classifier. The latter relates to the unsupervised Kernel Spectral Clustering counterpart and helps to obtain more precise and generalizable classification boundaries. We derive the dual problem, which can be effectively translated into a linear system of equations and then solved without introducing extra costs. In our experiments we show the usefulness of the approach and report considerable improvements in performance with respect to other semi-supervised approaches, such as Laplacian SVMs and other KSC-based models.
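
The remark about a linear system follows the usual LS-SVM pattern: with equality constraints and squared penalties, the KKT conditions are linear in the dual variables. For reference, the generic LS-SVM classifier system has the form

\[ \begin{bmatrix} 0 & y^{\top} \\ y & \Omega + \gamma^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ \mathbf{1} \end{bmatrix}, \qquad \Omega_{kl} = y_k y_l K(x_k, x_l); \]

the paper's system differs (it additionally carries the bilinear and manifold terms), but the mechanism for avoiding a QP solver is the same.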


Pattern Recognition and Machine Intelligence | 2013

Weighted Coordinate-Wise Pegasos

Vilen Jumutc; Johan A. K. Suykens

Pegasos is a popular and reliable machine learning algorithm for making linear Support Vector Machines solvable at larger scale. It benefits from a strongly convex optimization objective, fast convergence rates and low computational and memory costs. In this paper we devise a new weighted formulation of the Pegasos algorithm which benefits from different coordinate-wise \(\lambda_i\) regularization parameters. Together with the proposed extension, we give a brief theoretical justification of its convergence to an optimal solution and analyze its computational costs at a glance. We conclude with numerical results obtained on UCI datasets, demonstrating the merits and importance of our approach for achieving better classification accuracy and convergence rates in the partially or fully stochastic setting.
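
As a hedged reading of what the coordinate-wise parameters change in the update (not the paper's exact derivation): with per-coordinate \(\lambda_j\) and a step size \(\eta_t \propto 1/t\), the Pegasos shrinkage becomes coordinate-wise,

\[ w_j \leftarrow (1 - \eta_t \lambda_j)\, w_j + \eta_t \, \mathbb{1}\!\left[ y_i \langle w, x_i \rangle < 1 \right] y_i \, x_{ij}, \]

so each coordinate is regularized at its own rate \(\lambda_j\) instead of a single global \(\lambda\).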

Collaboration


Dive into Vilen Jumutc's collaborations.

Top Co-Authors

Johan A. K. Suykens, Katholieke Universiteit Leuven
Rocco Langone, Katholieke Universiteit Leuven
Raghvendra Mall, Katholieke Universiteit Leuven
Xiaolin Huang, Shanghai Jiao Tong University
Arkady Borisov, Riga Technical University