Rocco Langone
Katholieke Universiteit Leuven
Publication
Featured research published by Rocco Langone.
Entropy | 2013
Raghvendra Mall; Rocco Langone; Johan A. K. Suykens
This paper shows the feasibility of using the Kernel Spectral Clustering (KSC) method for community detection in big data networks. KSC employs a primal-dual framework to construct a model, which gives it the powerful property of effectively inferring community affiliation for out-of-sample nodes. Because the original large kernel matrix cannot fit into memory, we select a smaller subgraph that preserves the overall community structure to construct the model, and then use the out-of-sample extension property to infer the community membership of the unseen nodes. We provide a novel memory- and computationally efficient model selection procedure based on angular similarity in the eigenspace. We demonstrate the effectiveness of KSC on large scale synthetic networks and real world networks like the YouTube network, a road network of California and the Livejournal network. These networks contain millions of nodes and several million edges.
PLOS ONE | 2014
Raghvendra Mall; Rocco Langone; Johan A. K. Suykens
Kernel spectral clustering corresponds to a weighted kernel principal component analysis problem in a constrained optimization framework. The primal formulation leads to an eigen-decomposition of a centered Laplacian matrix at the dual level. The dual formulation allows a model to be built on a representative subgraph of the large scale network in the training phase, and the model parameters are estimated in the validation stage. The KSC model has a powerful out-of-sample extension property which allows cluster affiliation to be inferred for the unseen nodes of the big data network. In this paper we exploit the structure of the projections in the eigenspace during the validation stage to automatically determine a set of increasing distance thresholds. We use these distance thresholds in the test phase to obtain multiple levels of hierarchy for the large scale network. The hierarchical structure in the network is determined in a bottom-up fashion. We show empirically that real-world networks have a multilevel hierarchical organization which cannot be detected efficiently by several state-of-the-art large scale hierarchical community detection techniques like the Louvain, OSLOM and Infomap methods. We show that a major advantage of our proposed approach is the ability to locate good quality clusters at both the finer and coarser levels of hierarchy, as measured by internal cluster quality metrics on 7 real-life networks.
Engineering Applications of Artificial Intelligence | 2015
Rocco Langone; Carlos Alzate; Bart De Ketelaere; Jonas Vlasselaer; Wannes Meert; Johan A. K. Suykens
Accurate prediction of forthcoming faults in modern industrial machines plays a key role in reducing production arrest, increasing the safety of plant operations, and optimizing manufacturing costs. The most effective condition monitoring techniques are based on the analysis of historical process data. In this paper we show how Least Squares Support Vector Machines (LS-SVMs) can be used effectively for early fault detection in an online fashion. Although LS-SVMs are existing artificial intelligence methods, the novelty of this paper lies in their successful application to a complex industrial use case, where other approaches are commonly used in practice. In particular, in the first part we present an unsupervised approach that uses Kernel Spectral Clustering (KSC) on the sensor data coming from a vertical form seal and fill (VFFS) machine, in order to distinguish between normal operating conditions and abnormal situations. We describe how KSC is able to detect in advance the need for maintenance actions in the analysed machine, due to the degradation of the sealing jaws. In the second part we illustrate a nonlinear auto-regressive (NAR) model, thus a supervised learning technique, in the LS-SVM framework. We show that we succeed in appropriately modelling the degradation process affecting the machine, and that we are able to accurately predict the evolution of dirt accumulation in the sealing jaws.
IEEE Transactions on Neural Networks | 2015
Siamak Mehrkanoon; Carlos Alzate; Raghvendra Mall; Rocco Langone; Johan A. K. Suykens
This paper proposes a multiclass semisupervised learning algorithm by using kernel spectral clustering (KSC) as a core model. A regularized KSC is formulated to estimate the class memberships of data points in a semisupervised setting using the one-versus-all strategy while both labeled and unlabeled data points are present in the learning process. The propagation of the labels to a large amount of unlabeled data points is achieved by adding the regularization terms to the cost function of the KSC formulation. In other words, imposing the regularization term enforces certain desired memberships. The model is then obtained by solving a linear system in the dual. Furthermore, the optimal embedding dimension is designed for semisupervised clustering. This plays a key role when one deals with a large number of clusters.
International Conference on Big Data | 2013
Raghvendra Mall; Rocco Langone; Johan A. K. Suykens
We propose a parameter-free kernel spectral clustering model for large scale complex networks. The kernel spectral clustering (KSC) method works by creating a model on a subgraph of the complex network. The model requires a kernel function, which can have parameters, and the number of communities k has to be detected in the large scale network. We exploit the structure of the projections in the eigenspace to automatically identify the number of clusters. We use the concept of entropy and balanced clusters for this purpose. We show the effectiveness of the proposed approach by comparing the cluster memberships with those of several large scale community detection techniques like the Louvain, Infomap and Bigclam methods. We conducted experiments on several synthetic networks of varying size and mixing parameter, along with large scale real world experiments, to show the efficiency of the proposed approach.
International Conference on Big Data | 2014
Raghvendra Mall; Vilen Jumutc; Rocco Langone; Johan A. K. Suykens
In this paper we propose a deterministic method to obtain subsets from big data that are good representatives of the inherent structure in the data. We first convert the large scale dataset into a sparse undirected k-NN graph using a distributed network generation framework that we propose in this paper. After obtaining the k-NN graph we exploit the fast and unique representative subset (FURS) selection method [1], [2] to deterministically obtain a subset for this big data network. The FURS selection technique selects nodes from different dense regions in the graph, retaining the natural community structure. We then locate the points in the original big data corresponding to the selected nodes and compare the obtained subset with subsets acquired from state-of-the-art subset selection techniques. We evaluate the quality of the selected subset on several synthetic and real-life datasets for different learning tasks including big data classification and big data clustering.
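The first step described above, turning a dataset into an undirected k-NN graph, can be illustrated on a toy scale. The sketch below is a brute-force, single-machine version and not the distributed framework of the paper; the function name `knn_graph`, the Euclidean metric, and the symmetrization rule are assumptions made for illustration.

```python
import math

def knn_graph(points, k):
    """Return the undirected edges (i, j), i < j, of a brute-force k-NN graph."""
    edges = set()
    for i, p in enumerate(points):
        # Euclidean distance from point i to every other point (O(n^2) overall).
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        for _, j in dists[:k]:
            # Storing (min, max) symmetrizes the graph: an edge exists if
            # either endpoint lists the other among its k nearest neighbours.
            edges.add((min(i, j), max(i, j)))
    return edges

# Two well separated groups of three points each (toy data).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
edges = knn_graph(points, k=2)
```

With k = 2 each toy group becomes a triangle and no edge crosses between the groups, which is the kind of sparse structure a subset selection method can then exploit.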
International Symposium on Neural Networks | 2013
Rocco Langone; Raghvendra Mall; Johan A. K. Suykens
In this paper we propose an algorithm for soft (or fuzzy) clustering. In soft clustering each point is not assigned to a single cluster (as in hard clustering), but can belong to every cluster with a different degree of membership. Generally speaking, this property is desirable in order to improve the interpretability of the results. Our starting point is a state-of-the-art technique called kernel spectral clustering (KSC). Instead of using the hard assignment method present therein, we suggest a fuzzy assignment based on the cosine distance from the cluster prototypes. We call the new method soft kernel spectral clustering (SKSC). We also introduce a related model selection technique, called average membership strength criterion, which solves the drawbacks of the previously proposed method (namely balanced linefit). We apply the new algorithm to synthetic and real datasets, for image segmentation and community detection on networks. We show that in many cases SKSC outperforms KSC.
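A fuzzy assignment based on cosine distance from cluster prototypes can be sketched as follows. The abstract does not state how distances are turned into memberships, so the normalization used here (membership in a cluster proportional to the product of the distances to all other prototypes) is an assumption, as are the function names and the toy prototypes.

```python
import math

def cosine_distance(u, v):
    """1 - cos(angle between u and v); assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def soft_memberships(point, prototypes):
    """Membership degrees of `point` in each cluster; the degrees sum to 1.

    Assumed rule: the weight of cluster p is the product of the cosine
    distances to all OTHER prototypes, so being far from the other
    prototypes means a high membership in p. Assumes the point is aligned
    with at most one prototype (otherwise the total weight is zero).
    """
    d = [cosine_distance(point, proto) for proto in prototypes]
    weights = [
        math.prod(d[j] for j in range(len(d)) if j != p)
        for p in range(len(d))
    ]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical prototypes in a two-dimensional projection space.
prototypes = [(1.0, 0.0), (0.0, 1.0)]
on_prototype = soft_memberships((1.0, 0.0), prototypes)  # lies on prototype 0
in_between = soft_memberships((1.0, 1.0), prototypes)    # equidistant in angle
```

A point lying exactly on a prototype direction gets full membership in that cluster, while a point halfway between two prototypes is shared equally between them.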
International Symposium on Neural Networks | 2012
Rocco Langone; Carlos Alzate; Johan A. K. Suykens
This paper addresses community detection in complex networks. We show the use of kernel spectral clustering for the analysis of unweighted networks. We employ the primal-dual framework and make use of the out-of-sample extension, in which the assignment rule for new nodes is based on a model learned in the training phase. We propose a method to extract from a network a small subgraph that is representative of its overall community structure. We use a model selection procedure based on the modularity statistic, which is novel because modularity is commonly used only at the training level. We demonstrate the effectiveness of our model on synthetic networks and benchmark data from real networks (power grid network and protein interaction network of yeast). Finally, we compare our model with the Nyström method, showing that our approach is better in terms of the quality of the discovered partitions and needs less computation time.
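For readers unfamiliar with the modularity statistic mentioned above, the following sketch computes Newman's modularity Q for an unweighted, undirected graph and a given partition. It only illustrates the statistic itself, not the paper's model selection procedure; the function name and the toy graph are assumptions.

```python
def modularity(edges, community):
    """Newman modularity of a partition of an unweighted, undirected graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to its community label.
    Uses Q = sum over communities c of  e_c / m - (d_c / (2 m))^2,
    where m is the number of edges, e_c the number of edges inside c,
    and d_c the sum of the degrees of the nodes in c.
    """
    m = len(edges)
    internal = {}    # e_c
    degree_sum = {}  # d_c
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )

# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "a" for node in range(6)}

q_split = modularity(edges, two_groups)   # 5/14, about 0.357
q_single = modularity(edges, one_group)   # 0.0
```

Splitting the toy graph into its two triangles scores higher than putting all nodes in a single community, which is the behaviour a modularity-based selection criterion relies on.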
arXiv: Learning | 2016
Rocco Langone; Raghvendra Mall; Carlos Alzate; Johan A. K. Suykens
In this chapter we review the main literature related to kernel spectral clustering (KSC), an approach to clustering cast within a kernel-based optimization setting. KSC represents a least-squares support vector machine-based formulation of spectral clustering described by a weighted kernel PCA objective. Just as in the classifier case, the binary clustering model is expressed by a hyperplane in a high dimensional space induced by a kernel. In addition, the multi-way clustering can be obtained by combining a set of binary decision functions via an Error Correcting Output Codes (ECOC) encoding scheme. Because of its model-based nature, the KSC method encompasses three main steps: training, validation, testing. In the validation stage model selection is performed to obtain tuning parameters, like the number of clusters present in the data. This is a major advantage compared to classical spectral clustering where the determination of the clustering parameters is unclear and relies on heuristics. Once a KSC model is trained on a small subset of the entire data, it is able to generalize well to unseen test points. Beyond the basic formulation, sparse KSC algorithms based on the Incomplete Cholesky Decomposition (ICD) and \(L_{0}\), \(L_{1}\), \(L_{0} + L_{1}\), and Group Lasso regularization are reviewed. In that respect, we show how it is possible to handle large-scale data. Also, two possible ways to perform hierarchical clustering and a soft clustering method are presented. Finally, real-world applications such as image segmentation, power load time-series clustering, document clustering, and big data learning are considered.
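The idea of combining binary decision functions through an ECOC scheme can be sketched in a few lines: the signs of the decision values form a codeword, and the point is assigned to the cluster whose codeword is closest in Hamming distance. The codebook below is a made-up example (in practice it would be derived from the training data), and the function names are assumptions.

```python
def sign_codeword(scores):
    """Binarize real-valued decision function outputs into a +1/-1 codeword."""
    return tuple(1 if s >= 0 else -1 for s in scores)

def hamming(a, b):
    """Number of positions in which two codewords differ."""
    return sum(x != y for x, y in zip(a, b))

def ecoc_assign(scores, codebook):
    """Index of the cluster whose codeword is nearest to sign(scores).

    Ties are broken in favour of the lowest cluster index.
    """
    word = sign_codeword(scores)
    return min(range(len(codebook)), key=lambda c: hamming(word, codebook[c]))

# Hypothetical codebook for 3 clusters encoded by 2 binary decision functions.
codebook = [(1, 1), (1, -1), (-1, 1)]

exact_match = ecoc_assign([0.7, -0.2], codebook)   # codeword (1, -1)
other_match = ecoc_assign([-0.5, 0.9], codebook)   # codeword (-1, 1)
```

Decoding by nearest codeword means a test point is still assigned to a cluster when its sign pattern does not appear in the codebook, which is what makes the encoding error correcting.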
International Symposium on Neural Networks | 2013
Diego Hernán Peluffo-Ordóñez; Sergio García-Vega; Rocco Langone; Johan A. K. Suykens; Germán Castellanos-Domínguez
In this paper we propose a kernel spectral clustering-based technique to capture the different regimes experienced by a time-varying system. Our method is based on a multiple kernel learning approach, which uses a linear combination of kernels. The linear combination coefficients are calculated by determining a ranking vector that quantifies the overall dynamical behavior of the analyzed data sequence over time. This vector can be calculated from the eigenvectors provided by the solution of the kernel spectral clustering problem. We apply the proposed technique to a trial from the Graphics Lab Motion Capture Database from Carnegie Mellon University, as well as to a synthetic example, namely three moving Gaussian clouds. For comparison purposes, some conventional spectral clustering techniques are also considered, namely kernel k-means and min-cuts, as well as standard k-means. The normalized mutual information and adjusted Rand index metrics are used to quantify the clustering performance. Results show the usefulness of the proposed technique for tracking dynamic data, even being able to detect hidden objects.
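The two evaluation metrics named above can be computed from the contingency counts of two labelings. The sketch below is a plain-Python illustration; the geometric-mean normalization of the mutual information is one common convention and an assumption here, since the abstract does not say which variant was used.

```python
import math
from collections import Counter

def _counts(a, b):
    """Joint and marginal label counts of two labelings of the same points."""
    return Counter(zip(a, b)), Counter(a), Counter(b)

def adjusted_rand_index(a, b):
    """Adjusted Rand index; assumes at least two points."""
    joint, ca, cb = _counts(a, b)
    sum_joint = sum(math.comb(c, 2) for c in joint.values())
    sum_a = sum(math.comb(c, 2) for c in ca.values())
    sum_b = sum(math.comb(c, 2) for c in cb.values())
    expected = sum_a * sum_b / math.comb(len(a), 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # degenerate case, e.g. a single cluster in both
        return 1.0
    return (sum_joint - expected) / (max_index - expected)

def normalized_mutual_information(a, b):
    """Mutual information divided by sqrt(H(a) * H(b)) (assumed convention)."""
    joint, ca, cb = _counts(a, b)
    n = len(a)
    mi = sum(
        c / n * math.log(c * n / (ca[x] * cb[y]))
        for (x, y), c in joint.items()
    )
    ha = -sum(c / n * math.log(c / n) for c in ca.values())
    hb = -sum(c / n * math.log(c / n) for c in cb.values())
    if ha == 0 or hb == 0:  # at least one labeling has a single cluster
        return 1.0 if ha == hb else 0.0
    return mi / math.sqrt(ha * hb)

truth = [0, 0, 1, 1]
relabeled = [1, 1, 0, 0]   # same partition, different label names
crossed = [0, 1, 0, 1]     # splits every true cluster in half
```

Both metrics ignore the label names themselves, so `truth` and `relabeled` score as a perfect match, while `crossed` scores at or below chance level.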