Constructing the F-Graph with a Symmetric Constraint for Subspace Clustering
Corresponding author: Xiao-Jun Wu ([email protected])
Kai Xu, Xiao-Jun Wu*, Wen-Bo Hu
School of IoT Engineering, Jiangnan University, Wuxi 214122, China
Abstract: Based on further study of the low-rank subspace clustering (LRSC) and L2-graph subspace clustering algorithms, we propose an F-graph subspace clustering algorithm with a symmetric constraint (FSSC), which constructs a new objective function with a symmetric constraint based on the F-norm. Its most significant advantage is that it yields a closed-form solution for the coefficient matrix. FSSC then takes the absolute value of each element of the coefficient matrix, retains the k largest coefficients per column, and sets the other elements to 0 to obtain a new coefficient matrix. Finally, FSSC performs spectral clustering over the new coefficient matrix. Experimental results on face clustering and motion segmentation show that the FSSC algorithm not only markedly reduces the running time but also achieves higher accuracy than state-of-the-art representation-based subspace clustering algorithms, which verifies that FSSC is efficacious and feasible.
Keywords: subspace clustering, coefficient representation, symmetric constraint, F-norm

1. INTRODUCTION
High-dimensional datasets have become increasingly ubiquitous with the rapid development of information technology over the past few decades, which not only increases the demand for memory and running time but also adversely affects the performance of algorithms [1-4] due to the "curse of dimensionality". To reduce the dimensionality of original images, many subspace learning methods have been presented [5-7]. In many application fields, high-dimensional data in the same class or category can be well represented by low-dimensional subspaces. In recent years, subspace clustering based on sparse representation has received widespread attention due to its wide application in signal processing [8-9].

Subspace clustering algorithms are based on the idea that the intrinsic dimension of high-dimensional datasets is often much smaller than the dimension of the ambient space. Their task is to segment all data points into their respective subspaces, with numerous applications in computer vision, image processing, and systems theory. Existing subspace clustering algorithms can be divided into four main categories: iterative methods [10, 11], algebraic methods [12, 13], statistical methods [14, 15], and spectral clustering-based methods [16-18]. Spectral clustering-based methods perform excellently in several computer vision applications [19], such as image segmentation [20], face clustering [21-23], and motion segmentation [24, 25].

The core of a spectral clustering algorithm [26, 27] is building an affinity matrix whose elements measure the similarity of the corresponding data points. Pairwise distance-based methods build the affinity matrix by computing the distance between two data points, e.g., the Euclidean distance, which captures only the local structure of the dataset and is sensitive to noise and outliers. Representation coefficient-based methods [28-32], by contrast, are robust to noise and outliers, because the value of each coefficient depends not only on the two connected data points but also on the other data points. Recently, several works have shown that representation coefficient-based subspace clustering algorithms, such as robust subspace segmentation by low-rank representation (LRR) [18], sparse subspace clustering (SSC) [21], and the L2-graph for subspace learning and subspace clustering (L2-graph) [32], perform better than those based on pairwise distances.
Although the LRR and SSC algorithms achieve desirable clustering quality, they obtain an approximate solution of the coefficient matrix by iteration and lack a more appropriate method for constructing the affinity matrix, so there is considerable room for improvement in both running time and clustering accuracy. The affinity matrix is the core of a spectral clustering algorithm, whose elements measure the degree of similarity between corresponding data points, and constructing a suitable affinity matrix can effectively improve accuracy. The L2-graph subspace clustering algorithm therefore takes the absolute value of each element of the coefficient matrix, retains the k largest coefficients per column, and sets the other elements to 0 to obtain a new coefficient matrix, which is then used to construct a sparse affinity matrix. Moreover, the L2-graph algorithm obtains a closed-form coefficient matrix, but it needs to encode each sample separately, so there is still much room for improvement. The proposed FSSC algorithm constructs a new objective function with a symmetric constraint based on the F-norm, which directly yields the closed-form solution of the coefficient matrix by a single matrix calculation. Moreover, FSSC imposes a symmetry constraint on the coefficient matrix, which more faithfully reflects the similarity between samples. Experimental results on face clustering and motion segmentation show that the FSSC algorithm not only markedly reduces the running time but also achieves higher accuracy.

The rest of the article is organized as follows: Section 2 presents related work on representation-based subspace clustering algorithms. Section 3 proposes the FSSC algorithm by constructing a new objective function with a symmetric constraint based on the F-norm and provides a detailed derivation of its closed-form solution. Section 4 reports a series of experiments examining the effectiveness of the algorithm on face clustering and motion segmentation. Finally, Section 5 concludes this work.

2. RELATED WORK
Exploring the structure of a data space is a challenging task in a diverse set of fields and often reduces to a rank-minimization problem. Low-Rank Subspace Clustering (LRSC) [22] solves the following optimization problem to get the coefficient matrix:

    min_C ||C||_* + (τ/2)||Y - YC||_F^2,   (1)

where ||·||_* denotes the nuclear norm, i.e., the sum of the singular values of a matrix, and C is the low-rank representation of the dataset Y = [y_1, y_2, ..., y_n]. The procedure of the low-rank subspace clustering algorithm is described in Algorithm 1.

Algorithm 1 LRSC [22]
Input: a set of points Y = [y_1, y_2, ..., y_n] and the number of clusters u.
1: Solve the nuclear norm minimization problem (1) to get C.
2: Form an affinity matrix W = |C| + |C^T|.
3: Apply spectral clustering to the affinity matrix W.
Output: the cluster assignments of Y.
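As an illustration of steps 2 and 3, the minimal Python/NumPy sketch below (ours, not the authors' Matlab code) forms the affinity matrix from a given coefficient matrix C and clusters it; the |C| + |C^T| construction is the usual choice for representation-based methods and is assumed here.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def cluster_from_coefficients(C, u):
    # Step 2 of Algorithm 1: symmetric, nonnegative affinity from the coefficients.
    W = np.abs(C) + np.abs(C.T)
    # Step 3: spectral clustering on the precomputed affinity matrix.
    return SpectralClustering(n_clusters=u, affinity="precomputed").fit_predict(W)
```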
It has been shown that collaborative representation-based subspace clustering can achieve better clustering quality than low-rank representation-based subspace clustering [32]. For each data point y_i, the L2-graph algorithm solves the following problem:

    min_{c_i} ||y_i - Y c_i||_2^2 + λ||c_i||_2^2,   s.t. e_i^T c_i = 0,   (2)

where C = [c_1, c_2, ..., c_n] is the collaborative representation of the dataset Y and e_i is the ith standard basis vector. The closed-form solution of (2) can be obtained easily by the Lagrange multiplier method. The implementation of the L2-graph algorithm is described in Algorithm 2.

Algorithm 2 L2-graph [32]
Input: a set of points Y = [y_1, y_2, ..., y_n], the number of clusters u, and the number of reserved coefficients k per column.
1: Solve the L2-norm minimization problem (2) to get each c_i and normalize it to unit L2 norm.
2: Form the coefficient matrix C = [c_1, c_2, ..., c_n].
3: Take the absolute value of each element of the coefficient matrix, retain the k largest coefficients per column, and set the other elements to 0 to get a new coefficient matrix C'.
4: Form an affinity matrix W = C' + C'^T.
5: Apply spectral clustering to the affinity matrix W.
Output: the cluster assignments of Y.
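A hedged Python sketch of step 1 of Algorithm 2 follows. Instead of the Lagrange multiplier form, it realizes the constraint e_i^T c_i = 0 by simply dropping y_i from its own dictionary, which yields an equivalent ridge-regression closed form; `lam` is the regularization weight.

```python
import numpy as np

def l2_graph_coefficients(Y, lam):
    """Ridge-code each point over the remaining points (c_ii = 0),
    then normalize each coefficient vector to unit L2 norm."""
    n = Y.shape[1]
    C = np.zeros((n, n))
    for i in range(n):
        idx = np.delete(np.arange(n), i)   # exclude y_i from its own dictionary
        D = Y[:, idx]
        # Closed-form ridge solution: c = (D^T D + lam*I)^{-1} D^T y_i.
        C[idx, i] = np.linalg.solve(D.T @ D + lam * np.eye(n - 1), D.T @ Y[:, i])
    norms = np.maximum(np.linalg.norm(C, axis=0), 1e-12)
    return C / norms                       # unit L2 norm per column
```

Note that this loop performs one (n-1)x(n-1) solve per sample, which is the per-sample encoding cost discussed again in Section 4.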
3. Constructing the F-Graph with a Symmetric Constraint for Subspace Clustering

Before describing the FSSC algorithm in detail, a lemma [22] is introduced as follows.

Lemma 1. For any real-valued, symmetric positive definite n×n matrices X and Z,

    tr(XZ) ≥ Σ_{i=1}^{n} σ_i(X) σ_{n-i+1}(Z),

where σ_1(X) ≥ σ_2(X) ≥ ... ≥ σ_n(X) and σ_1(Z) ≥ σ_2(Z) ≥ ... ≥ σ_n(Z) are the descending singular values of X and Z, respectively. The case of equality occurs if and only if it is possible to find a unitary matrix U that simultaneously singular value-decomposes X and Z in the sense that

    X = U Σ_X U^T  and  Z = U P Σ_Z P^T U^T,

where Σ_X and Σ_Z denote the n×n diagonal matrices with the singular values of X and Z, respectively, down the diagonal in descending order, and P is a permutation matrix such that P Σ_Z P^T contains the singular values of Z in the diagonal in ascending order.
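The following small NumPy check illustrates the lower bound in Lemma 1 numerically (a sanity check only, on random symmetric positive definite matrices):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n)); X = A @ A.T + n * np.eye(n)  # symmetric positive definite
B = rng.standard_normal((n, n)); Z = B @ B.T + n * np.eye(n)

sx = np.linalg.svd(X, compute_uv=False)   # descending singular values of X
sz = np.linalg.svd(Z, compute_uv=False)   # descending singular values of Z
lower = np.sum(sx * sz[::-1])             # pair descending order with ascending order
print(np.trace(X @ Z) >= lower - 1e-9)    # True
```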
On the basis of further study of the LRSC and L2-graph algorithms in Section 2, we construct a new objective function as follows:

    min_C (τ/2)||Y - YC||_F^2 + (1/2)||C||_F^2,   s.t. C = C^T,   (3)

where C is the representation matrix of the dataset Y = [y_1, y_2, ..., y_n]. The detailed derivation of the closed-form solution, referring to [32], is described as follows.

Let Y = UΣV^T be the full SVD of Y and C = QΛQ^T be the eigenvalue decomposition (EVD) of C, which guarantees that C is symmetric. The cost function of (3) reduces to

    (τ/2) tr(Σ^TΣ D(I - Λ)^2 D^T) + (1/2) tr(Λ^2),

where D = V^T Q. In order to minimize this cost function, we first take the first term of the cost function into consideration, i.e.,

    tr(Σ^TΣ D(I - Λ)^2 D^T).

Applying Lemma 1 to X = Σ^TΣ and Z = D(I - Λ)^2 D^T, we can obtain D = I and Q = V, and there is then a new cost function as follows:

    Σ_i [(τ/2) σ_i^2 (1 - λ_i)^2 + (1/2) λ_i^2].

Let the ith largest elements in the diagonals of Σ^TΣ and Λ be σ_i^2 and λ_i, respectively. Then we can independently solve for each λ_i to find the optimal λ_i* as

    λ_i* = argmin_{λ_i} (τ/2) σ_i^2 (1 - λ_i)^2 + (1/2) λ_i^2.

The closed-form solution to this problem can be obtained as

    λ_i* = τσ_i^2 / (1 + τσ_i^2),

which can be written in matrix form as Λ* = I - (I + τΣ^TΣ)^{-1}. Therefore,

    C = V (I - (I + τΣ^TΣ)^{-1}) V^T,

where V = [V_1, V_2] is partitioned according to the two sets {i : σ_i > 0} and {i : σ_i = 0}. Because λ_i* = 0 whenever σ_i = 0 and V_2 therefore contributes nothing, the optimal C is equivalent to

    C = V_1 (I - (I + τΣ_1^TΣ_1)^{-1}) V_1^T.
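A minimal NumPy sketch of this closed form is given below; it also checks, on random data, that the SVD form agrees with the equivalent ridge form C = (Y^T Y + I/τ)^{-1} Y^T Y obtained by setting the gradient of (3) to zero (the unconstrained minimizer is already symmetric, so the constraint is satisfied automatically). Function and variable names here are ours, not from the paper.

```python
import numpy as np

def fssc_coefficients(Y, tau):
    """Closed-form solution of (3) via the SVD of Y (a sketch of the derivation above)."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    nz = s > 1e-12                            # partition: nonzero vs. zero singular values
    V1, s1 = Vt[nz].T, s[nz]
    lam = tau * s1**2 / (1.0 + tau * s1**2)   # optimal eigenvalues lambda_i*
    return (V1 * lam) @ V1.T                  # C = V1 diag(lambda*) V1^T, symmetric by construction

# Sanity check against the equivalent ridge form.
rng = np.random.default_rng(1)
Y, tau = rng.standard_normal((30, 50)), 10.0
C = fssc_coefficients(Y, tau)
C_ridge = np.linalg.solve(Y.T @ Y + np.eye(50) / tau, Y.T @ Y)
print(np.allclose(C, C_ridge), np.allclose(C, C.T))  # True True
```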
The symmetric constraint criterion can preserve the subspace structure of high-dimensional data and guarantee weight consistency for each pair of data points, so that highly correlated data points of the same subspace are represented together [28]. However, the closed-form solution C is not sparse and contains a large number of redundant connections, which reduces the accuracy of the algorithm. Therefore, after obtaining the coefficient matrix of the dataset, referring to [32], we take the absolute value of each element of the coefficient matrix, retain the k largest coefficients per column, and set the other elements to 0 to get a new coefficient matrix C'. Then, FSSC performs spectral clustering over the new coefficient matrix, as described in Algorithm 3.

Algorithm 3 FSSC
Input: a set of points Y = [y_1, y_2, ..., y_n], the number of clusters u, and the number of reserved coefficients k per column.
1: Solve the F-norm minimization problem (3) to get C.
2: Take the absolute value of each element of the coefficient matrix, retain the k largest coefficients per column, and set the other elements to 0 to get a new coefficient matrix C'.
3: Form an affinity matrix W = C' + C'^T.
4: Apply spectral clustering to the affinity matrix W.
Output: the cluster assignments of Y.
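Putting the pieces together, a hedged end-to-end sketch of Algorithm 3 follows (reusing fssc_coefficients from the previous snippet, with scikit-learn for the spectral clustering step):

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def fssc(Y, u, tau, k):
    C = fssc_coefficients(Y, tau)           # step 1: closed-form solution of (3)
    A = np.abs(C)                           # step 2: absolute values ...
    kth_largest = np.sort(A, axis=0)[-k]    # k-th largest entry in each column
    A[A < kth_largest] = 0.0                # ... keep only the k largest per column
    W = A + A.T                             # step 3: affinity matrix
    return SpectralClustering(n_clusters=u, affinity="precomputed").fit_predict(W)  # step 4
```

For instance, fssc(Y, u=38, tau=3, k=6) would correspond to the Extended Yale B setting reported in Section 4.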
4. Experiments
In this section, we use subspace clustering Accuracy, Normalized Mutual Information (NMI), and Running Time to evaluate the performance of the FSSC algorithm on two computer vision tasks: face clustering and motion segmentation. We choose state-of-the-art subspace clustering algorithms as baselines: robust subspace segmentation by low-rank representation (LRR) [18], sparse subspace clustering (SSC) [21], and the L2-graph for subspace learning and subspace clustering (L2-graph) [32]. All experiments are implemented in Matlab R2013a and run on a personal computer with an Intel Core i3-3240 CPU and 8 GB of memory.

Datasets: We evaluate the performance of the algorithms for face clustering using three publicly accessible image datasets, i.e., AR [33], Extended Yale B (ExYaleB) [34], and Multiple PIE (MPIE) [35]. An overview of these databases is provided in TABLE I. The ExYaleB dataset contains the frontal face images of 38 individuals, with about 64 images per subject. The AR dataset consists of over 4000 face images of 126 individuals (70 male and 56 female), with 26 images per subject (14 clean images, 6 images with sunglasses, and 6 images with scarves) acquired under various expressions and lighting conditions. As in [36], we randomly select a subset containing 1400 clean face images from 50 male and 50 female subjects. The MPIE dataset contains the facial images of 337 subjects captured under 15 viewpoints and 19 illumination conditions in up to four recording sessions.
TABLE I
DATA SETS USED IN THE EXPERIMENTS
Database   Samples   Original size   Cropped size   Feature dim.   Classes
AR         1400      192×168         55×40          167            100
ExYaleB    2414      165×120         55×40          114            38
MPIE-S2    2030      100×82          50×41          115            203
MPIE-S3    1640      100×82          50×41          115            164
MPIE-S4    1760      100×82          50×41          115            176
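For reference, the two clustering-quality measures used below can be computed as in this Python sketch (our helper, not part of the compared codes): clustering Accuracy matches predicted clusters to true clusters with the Hungarian algorithm, and NMI comes directly from scikit-learn. Label vectors are assumed to be 0-based integer arrays.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import normalized_mutual_info_score

def clustering_accuracy(true, pred):
    """Accuracy under the best one-to-one matching of cluster labels."""
    D = max(true.max(), pred.max()) + 1
    counts = np.zeros((D, D), dtype=int)
    for t, p in zip(true, pred):
        counts[t, p] += 1                        # co-occurrence of true/predicted labels
    rows, cols = linear_sum_assignment(-counts)  # Hungarian matching, maximizing agreement
    return counts[rows, cols].sum() / len(true)

def clustering_nmi(true, pred):
    return normalized_mutual_info_score(true, pred)
```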
The FSSC algorithm contains two parameters: the balance parameter τ and the number k of reserved coefficients per column. This section discusses the influence of these two parameters on the AR and Extended Yale B datasets in detail.

Fig. 1 shows the evaluation results of FSSC for different values of the two parameters. On the AR dataset, when τ ranges from 0.01 to 10, the accuracy of FSSC varies from 37.21% to 85.14% and its NMI varies from 66.31% to 93.40%; when τ ranges from 10 to 70, the Accuracy and NMI remain almost stable. When k ranges from 3 to 8, the Accuracy varies from 62.64% to 87.57% and the NMI varies from 80.77% to 94.55%; as k increases further, the Accuracy and NMI tend to decline. On the Extended Yale B dataset, we find the same tendency: the performance of FSSC is strong and stable as τ increases, as on the AR dataset, and FSSC achieves its best result when k equals 6. Based on these experimental results, we use τ=45 and k=8 for the AR dataset, and τ=3 and k=6 for the Extended Yale B dataset.
Fig. 1 The influence of parameters τ and k on the FSSC algorithm
Due to space limitations, we directly give the parameter settings for the MPIE and Hopkins 155 datasets. The parameters are set as τ=30 and k=13 for the MPIE dataset. For the Hopkins 155 dataset, we use τ=26 and k=5 for two motions, and τ=34 and k=5 for three motions.
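For convenience, the (τ, k) settings stated above can be collected in a small mapping (ours, for use with the earlier sketches; the key names are our own):

```python
# (tau, k) per dataset, as reported in this section.
FSSC_PARAMS = {
    "AR": (45, 8),
    "ExYaleB": (3, 6),
    "MPIE": (30, 13),
    "Hopkins155-2motions": (26, 5),
    "Hopkins155-3motions": (34, 5),
}
```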
To evaluate the performance of the FSSC algorithm, we compare it with the L2-graph, LRR, and SSC subspace clustering algorithms in three respects: clustering Accuracy, NMI, and running time. For the baseline algorithms, we use the codes provided by their authors and set the parameters to be optimal. The results in Tables II and III are means over 20 independent runs, and bold entries mark the best performance.
TABLE II
CLUSTERING QUALITY (ACCURACY (AC, %) AND THE CORRESPONDING NORMALIZED MUTUAL INFORMATION (NMI, %)) OF DIFFERENT ALGORITHMS
Databases   FSSC        L2-graph [32]   LRR [18]    SSC [21]
            AC    NMI   AC    NMI       AC    NMI   AC    NMI
AR
Table II shows the clustering results of the various approaches on the different datasets. The proposed FSSC algorithm has a distinct advantage in clustering quality over the other three subspace clustering algorithms. The key to this performance is constructing the new objective function with a closed-form solution and then sparsifying the result: taking the absolute value of each element of the coefficient matrix, retaining the k largest coefficients per column, and setting the other elements to 0 to obtain a new coefficient matrix, from which a sparse affinity matrix is built. All four subspace clustering algorithms achieve good clustering quality on the MPIE dataset, where FSSC and L2-graph have almost the same clustering accuracy, about 6% higher than that of the LRR algorithm. On the AR and ExYaleB datasets, the accuracy of FSSC is about 3.51% and 1.58% higher than that of the L2-graph algorithm, respectively. The accuracies of the LRR and SSC algorithms on these two datasets are similar, but at least 10% lower than that of FSSC.

TABLE III
AVERAGE RUNNING TIME (s) OF DIFFERENT ALGORITHMS
Databases   FSSC   L2-graph [32]   LRR [18]   SSC [21]
AR          70     78              73         138
ExYaleB     45     79              63         283
MPIE-S2     213    231             233        375
MPIE-S3     135    152             142        239
MPIE-S4     135    158             153        271
TABLE III reports the time costs obtained by averaging the elapsed CPU time over 10 independent experiments for each algorithm. The FSSC algorithm achieves not only the best clustering quality but also the shortest running time: about 9.7%, 42.85%, and 11.18% less than that of L2-graph on the AR, ExYaleB, and MPIE databases, respectively. Moreover, the L2-graph and LRR algorithms take almost the same running time. The reason is that FSSC directly obtains the closed-form solution of the coefficient matrix by a single matrix calculation. The L2-graph algorithm also has a closed-form coefficient matrix, but it needs to encode each sample separately, while the LRR algorithm obtains an approximate solution of the coefficient matrix by iteration. SSC likewise computes an approximate solution iteratively, which costs about twice the running time of FSSC.
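The gap has a simple computational explanation: FSSC performs one n×n solve for the whole coefficient matrix, whereas an L2-graph-style coder performs one (n-1)×(n-1) solve per sample. The toy timing below (ours; absolute times depend on the machine) makes the difference visible:

```python
import time
import numpy as np

rng = np.random.default_rng(1)
Y, tau, lam = rng.standard_normal((50, 300)), 10.0, 0.1
G = Y.T @ Y

t0 = time.perf_counter()
C = np.linalg.solve(G + np.eye(300) / tau, G)   # FSSC: one closed-form solve
t1 = time.perf_counter()
for i in range(300):                            # L2-graph style: one solve per sample
    idx = np.delete(np.arange(300), i)
    np.linalg.solve(G[np.ix_(idx, idx)] + lam * np.eye(299), Y[:, idx].T @ Y[:, i])
t2 = time.perf_counter()
print(f"one-shot: {t1 - t0:.3f}s, per-sample: {t2 - t1:.3f}s")
```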
Motion segmentation [22] refers to the problem of clustering a set of 2D feature points extracted from a video sequence into groups corresponding to different rigid-body motions. Here, the data matrix Y has dimension 2F × N, where N is the number of 2D feature trajectories and F is the number of frames in the video.

The Hopkins 155 dataset, which consists of 120 video sequences with two motions and 35 sequences with three motions, is used to compare the performance of the FSSC algorithm with that of the other algorithms. Because all four subspace clustering algorithms perform well on the Hopkins 155 dataset, this section uses the subspace clustering error to evaluate their performance more discriminatively.

TABLE IV
CLUSTERING ERROR (%) OF THE EVALUATED ALGORITHMS ON THE HOPKINS 155 RAW DATA
Algorithms      2 Motions        3 Motions        All
                Mean    Median   Mean    Median   Mean    Median   Run Time
FSSC            1.54
L2-graph [32]   2.30
Table IV shows the results of applying the different subspace clustering algorithms to the original 2F-dimensional feature trajectories. By analyzing the results, we can observe that the FSSC algorithm is the best in both clustering quality and running time; in particular, its running time is about 23.8%, 84.9%, and 84% lower than that of the L2-graph, LRR, and SSC algorithms, respectively.
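For completeness, a small helper (ours; the frame-wise x/y row ordering is one common convention, not prescribed by the paper) that stacks tracked feature points into the 2F × N matrix Y described above:

```python
import numpy as np

def trajectory_matrix(tracks):
    """tracks: array of shape (F, N, 2) with the (x, y) image position of
    each of the N tracked features in each of the F frames."""
    F, N, _ = tracks.shape
    # Rows interleave the x- and y-coordinates frame by frame: x(1), y(1), x(2), y(2), ...
    return tracks.transpose(0, 2, 1).reshape(2 * F, N)
```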
5. CONCLUSION

The construction of the objective function and the affinity matrix is the core of representation-based subspace clustering. On the basis of further study of the LRSC and L2-graph algorithms, we have designed an F-graph subspace clustering algorithm with a symmetric constraint, whose closed-form solution not only markedly reduces the running time but also achieves higher accuracy. Several aspects remain worth further study, such as the selection of the number k of reserved coefficients per column of the coefficient matrix, and the relationship between k and the intrinsic dimensionality of a subspace.
ACKNOWLEDGMENT

REFERENCES
[1] Hartigan J A, Wong M A. Algorithm AS 136: A k-means clustering algorithm. Applied Statistics, 1979: 100-108.
[2] Kanungo T, Mount D M, Netanyahu N S, et al. An efficient k-means clustering algorithm: analysis and implementation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002, 24(7): 881-892.
[3] Pal N R, Bezdek J C. On cluster validity for the fuzzy c-means model. IEEE Transactions on Fuzzy Systems, 1995, 3(3): 370-379.
[4] Pal N R, Pal K, Keller J M, et al. A possibilistic fuzzy c-means clustering algorithm. IEEE Transactions on Fuzzy Systems, 2005, 13(4): 517-530.
[5] Xiao-Jun W, Kittler J, Jing-Yu Y, et al. A new direct LDA (D-LDA) algorithm for feature extraction in face recognition. In: Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004). IEEE, 2004, 4: 545-548.
[6] Zheng Y J, Yang J Y, Yang J, et al. Nearest neighbour line nonparametric discriminant analysis for feature extraction. Electronics Letters, 2006, 42(12): 679-680.
[7] Zheng Y, Yang J, Yang J, et al. A reformative kernel Fisher discriminant algorithm and its application to face recognition. Neurocomputing, 2006, 69(13-15): 1806-1810.
[8] Dong W, Wu X, Kittler J. Sparse subspace clustering via smoothed ℓp minimization. Pattern Recognition Letters, 2019, 125: 206-211.
[9] Dong W, Wu X J, Kittler J, et al. Sparse subspace clustering via nonconvex approximation. Pattern Analysis and Applications, 2019, 22(1): 165-176.
[10] Vidal R, Ma Y, Sastry S. Generalized principal component analysis (GPCA). IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005, 27(12): 1945-1959.
[11] Vidal R, Ma Y, Piazzi J. A new GPCA algorithm for clustering subspaces by fitting, differentiating and dividing polynomials. In: Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004). IEEE, 2004, 1: 510-517.
[12] Zhang T, Szlam A, Lerman G. Median k-flats for hybrid linear modeling with many outliers. In: 2009 IEEE 12th International Conference on Computer Vision Workshops. Kyoto: IEEE, 2009: 234-241.
[13] Lu L, Vidal R. Combined central and subspace clustering for computer vision applications. In: Proceedings of the 23rd International Conference on Machine Learning. New York: ACM, 2006: 593-600.
[14] Rao S, Tron R, Vidal R, et al. Motion segmentation in the presence of outlying, incomplete, or corrupted trajectories. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010, 32(10): 1832-1845.
[15] Rao S R, Tron R, Vidal R, et al. Motion segmentation via robust subspace separation in the presence of outlying, incomplete, or corrupted trajectories. In: 2008 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2008: 1-8.
[16] Favaro P, Vidal R, Ravichandran A. A closed form solution to robust subspace estimation and clustering. In: 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Providence: IEEE, 2011: 1801-1807.
[17] Elhamifar E, Vidal R. Clustering disjoint subspaces via sparse representation. In: 2010 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Dallas: IEEE, 2010: 1926-1929.