Artur Starczewski
Częstochowa University of Technology
Publications
Featured research published by Artur Starczewski.
Pattern Analysis and Applications | 2017
Artur Starczewski
In this paper, a new cluster validity index, which can be considered a measure of the accuracy of the partitioning of data sets, is proposed. The new index, called the STR index, is defined as the product of two components which capture changes in the compactness and separability of clusters during a clustering process. The maximum value of this index identifies the best clustering scheme. Three popular algorithms have been applied as underlying clustering techniques, namely the complete-linkage, expectation maximization and K-means algorithms. The performance of the new index is demonstrated for several artificial and real-life data sets. Moreover, the new index has been compared with other well-known indices, i.e., the Dunn, Davies-Bouldin, PBM and Silhouette indices, taking the number of clusters in a data set as the comparison criterion. The results show that the new index outperforms the above-mentioned indices.
International Conference on Artificial Intelligence and Soft Computing | 2016
Artur Starczewski; Adam Krzyzak
In this paper, a modification of the well-known Silhouette validity index is proposed. This index, which can be considered a measure of the accuracy of data set partitioning, enjoys significant popularity and is often used by researchers. The proposed modification adds a component to the original index. This approach improves the performance of the index and provides better results during a clustering process, especially when changes in cluster separability are large. The new version of the index is called the SILA index, and its maximum value identifies the best clustering scheme. The performance of the new index is demonstrated for several data sets, where a popular algorithm, namely the complete-linkage algorithm, has been applied as the underlying clustering technique. The results show that the new approach outperforms the original Silhouette validity index.
International Conference on Artificial Intelligence and Soft Computing | 2017
Artur Starczewski; Adam Krzyzak
In this paper, a detailed analysis of an improvement of the Silhouette validity index is presented. The proposed approach is based on an additional component which improves cluster validity assessment and provides better results during a clustering process, especially when the naturally existing groups in a data set are located at very different distances from one another. The performance of the modified index is demonstrated for several data sets, where the complete-linkage method has been applied as the underlying clustering technique. The results show that the new approach outperforms other methods.
International Conference on Artificial Intelligence and Soft Computing | 2015
Tomasz Galkowski; Artur Starczewski; Xiuju Fu
Big data sets and the variety of data types lead to new types of problems in modern intelligent data analysis. This requires the development of new techniques and models. One of the important tasks is to reveal and indicate the heterogeneity of non-trivial features of a large database. Original techniques of modelling, data mining, pattern recognition and machine learning have recently been investigated in fields such as the commercial behaviour of Internet users, social network analysis, and the management and investigation of various databases in static or dynamic states. Many techniques for discovering hidden structures in a data set, such as clustering and the projection of data from high-dimensional spaces, have been developed. In this paper, we propose a model for multiple-view unsupervised clustering based on the Kohonen self-organizing map method.
International Conference on Artificial Intelligence and Soft Computing | 2013
Artur Starczewski
This paper describes a new method for determining the optimal number of well-separable clusters in data sets. Determining this parameter is necessary for many clustering algorithms to define the naturally existing clusters correctly. The presented method uses the idea of agglomerative hierarchical clustering and applies a modified RS cluster validity index. In the first phase of the method, clusters are created according to the idea of hierarchical clustering. Then, the k-means algorithm is run for the optimal number of clusters. The method has been used for multidimensional data, and the results obtained confirm the very good performance of the proposed method.
International Conference on Artificial Intelligence and Soft Computing | 2012
Artur Starczewski
In this paper, a new hierarchical clustering technique is presented. This approach is similar to two popular hierarchical clustering algorithms, i.e., single-link and complete-link. These hierarchical methods play an important role in clustering data and make it possible to create well-separable clusters whenever such clusters exist. The proposed method has been used to cluster artificial and real data sets. The results obtained confirm the very good performance of the method.
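For orientation, the two standard linkage criteria mentioned in this abstract can be sketched as follows. This is a naive illustration of ordinary single-link and complete-link agglomerative clustering, not the new technique proposed in the paper; the function name `agglomerate`, the Euclidean distance, and the stopping rule (a fixed number of clusters) are assumptions made for the example.

```python
import math


def agglomerate(points, n_clusters, linkage="complete"):
    """Naive agglomerative clustering (illustrative, roughly cubic time).

    Returns a list of clusters, each a list of indices into `points`.
    linkage="single"   -> cluster distance is the smallest pairwise distance
    linkage="complete" -> cluster distance is the largest pairwise distance
    """
    if linkage not in ("single", "complete"):
        raise ValueError("linkage must be 'single' or 'complete'")
    pick = min if linkage == "single" else max

    # Start with every point in its own cluster.
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None  # (distance, index of cluster a, index of cluster b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = pick(
                    math.dist(points[i], points[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        # Merge the closest pair of clusters under the chosen linkage.
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical toy data: two visually separate groups.
toy_points = [(0, 0), (0, 1), (10, 0), (10, 1)]
print(agglomerate(toy_points, 2, linkage="single"))
print(agglomerate(toy_points, 2, linkage="complete"))
```

The only difference between the two variants is whether the inter-cluster distance is taken as the minimum or the maximum over point pairs, which is why a method can sit between them.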
International Conference on Artificial Intelligence and Soft Computing | 2015
Artur Starczewski; Adam Krzyzak
This article provides a performance evaluation of the Silhouette index, which is based on the so-called silhouette width. The index can be calculated in two ways: the first approach uses the mean of the per-cluster mean silhouettes, whereas the second averages the silhouettes over the whole data set. These two variants of the index have a significant influence on indicating the proper number of clusters in a data set. To study the performance of the index, two popular hierarchical methods were applied as the underlying clustering algorithms, namely the complete-linkage and the single-linkage algorithms. These methods have been used for artificial and real-life data sets, and the results confirm the very good performance of the index and also allow the best approach to be chosen.
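The two ways of computing the index can be illustrated with a short sketch. It uses the standard silhouette width s(i) = (b(i) - a(i)) / max(a(i), b(i)); the Euclidean distance, the convention that a point in a singleton cluster gets s(i) = 0, and all function names are assumptions made for this example rather than details taken from the article.

```python
import math
from collections import defaultdict


def silhouette_widths(points, labels):
    """Return the silhouette width s(i) of every point (Euclidean distance)."""
    clusters = defaultdict(list)
    for idx, lab in enumerate(labels):
        clusters[lab].append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    widths = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            widths.append(0.0)  # assumed convention for singleton clusters
            continue
        # a(i): mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b(i): smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return widths


def silhouette_over_points(points, labels):
    """Variant 2: average the silhouette widths over the whole data set."""
    widths = silhouette_widths(points, labels)
    return sum(widths) / len(widths)


def silhouette_over_clusters(points, labels):
    """Variant 1: mean of the per-cluster mean silhouette widths."""
    per_cluster = defaultdict(list)
    for w, lab in zip(silhouette_widths(points, labels), labels):
        per_cluster[lab].append(w)
    means = [sum(ws) / len(ws) for ws in per_cluster.values()]
    return sum(means) / len(means)


# Hypothetical toy data with clusters of unequal size.
toy_points = [(0, 0), (0, 1), (10, 0), (10, 1), (10, 2)]
toy_labels = [0, 0, 1, 1, 1]
print(silhouette_over_points(toy_points, toy_labels))
print(silhouette_over_clusters(toy_points, toy_labels))
```

When all clusters have the same size the two variants coincide; they diverge as cluster sizes become unbalanced, because the per-cluster variant gives small clusters the same weight as large ones.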
International Conference on Artificial Intelligence and Soft Computing | 2012
Artur Starczewski
This paper describes a new cluster validity index for well-separable clusters in data sets. Validity indices are necessary for many clustering algorithms to identify the naturally existing clusters correctly. In the presented method, the new cluster validity index is used to determine the optimal number of clusters in data sets. It has been applied to the complete-link hierarchical clustering algorithm. The new cluster validity index is based on finding large increments of the intercluster and intracluster distances while the clustering algorithm runs. The maximum value of the index determines the optimal number of clusters in the given data set. The results obtained confirm the very good performance of the proposed approach.
International Conference on Artificial Intelligence and Soft Computing | 2012
Tomasz Galkowski; Artur Starczewski
In various data mining applications, extracting information from large databases is a serious problem that occurs in many fields, e.g., bioinformatics, the commercial behaviour of Internet users, social network analysis, and the management and investigation of various databases in static or dynamic states. In recent years, many techniques for discovering hidden structures in a data set, such as clustering and the projection of data from high-dimensional spaces, have been developed. In this paper, we propose a model for multiple-view unsupervised clustering based on the Kohonen self-organizing map algorithm. The results of simulations in two-dimensional space using three views of training sets with different statistical properties are presented.
International Conference on Artificial Intelligence and Soft Computing | 2006
Artur Starczewski
This paper presents a new approach to creating a fuzzy system based exclusively on clustering algorithms, which determine the values of the necessary parameters. The applied multisegment fuzzy system functions as a classifier. Each segment is an independent fuzzy system with a defined knowledge base and uses singleton fuzzification, as well as fuzzy inference with the product operation as the Cartesian product and well-matched membership functions. No defuzzification method is used. Only the rule-firing level must be analysed, and its value suffices to determine the class. The use of clustering algorithms has made it possible to determine the number of rules in the fuzzy rule base of each independent segment, as well as the centers of the fuzzy sets used in the given rules. The calculated parameters have proved precise, so no additional methods have been applied to correct their values. This procedure greatly simplifies the creation of a fuzzy system. The constructed fuzzy system has been tested on medical data obtained from the Internet. In the future, such systems may help doctors with their everyday work.
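The classification step described in this abstract (singleton fuzzification, product inference, no defuzzification, class chosen from the rule-firing level) might look roughly like the sketch below. Several details are assumptions made purely for illustration and are not stated in the abstract: one segment per class, Gaussian membership functions with a single shared width, and taking the strongest rule within a segment as that segment's firing level. The names `firing_level`, `classify`, and `toy_segments` are hypothetical.

```python
import math


def firing_level(x, center, width):
    """Firing level of one rule for a crisp input x.

    With singleton fuzzification the level is the product of the membership
    values of each input coordinate (product operation as the Cartesian
    product). Gaussian membership functions are an assumption of this sketch.
    """
    level = 1.0
    for xi, ci in zip(x, center):
        level *= math.exp(-((xi - ci) / width) ** 2)
    return level


def classify(x, segments, width=1.0):
    """Pick the class whose segment fires most strongly; no defuzzification.

    segments: dict mapping a class label to a list of rule centers, which
    in the described approach would come from a clustering algorithm.
    """
    best_label, best_level = None, -1.0
    for label, centers in segments.items():
        # Assumed aggregation: the strongest rule represents the segment.
        level = max(firing_level(x, c, width) for c in centers)
        if level > best_level:
            best_label, best_level = label, level
    return best_label, best_level


# Hypothetical rule centers, standing in for cluster centers found per class.
toy_segments = {
    "class_a": [(0.0, 0.0), (1.0, 1.0)],
    "class_b": [(5.0, 5.0)],
}
print(classify((0.9, 1.2), toy_segments))
print(classify((4.0, 5.0), toy_segments))
```

Because the decision only compares firing levels, the number of rules per segment and the rule centers are the only parameters that have to be supplied, which matches the role the abstract gives to the clustering algorithms.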