Agnes Vathy-Fogarassy
University of Pannonia
Publication
Featured research published by Agnes Vathy-Fogarassy.
Foundations of Information and Knowledge Systems | 2006
Agnes Vathy-Fogarassy; Attila Kiss; János Abonyi
Clustering is an important tool for exploring the hidden structure of large databases. There are several algorithms based on different approaches (hierarchical, partitional, density-based, model-based, etc.). Most of these algorithms have shortcomings: for example, they cannot detect clusters with non-convex shapes, they require the number of clusters to be known a priori, or they suffer from numerical problems such as sensitivity to initialization. In this paper we introduce a new clustering algorithm based on the synergistic combination of hierarchical, graph-theoretic minimal spanning tree based clustering and partitional Gaussian mixture model-based clustering. The aim of this hybridization is to increase the robustness and consistency of the clustering results and to decrease the number of heuristically defined parameters, thereby reducing the influence of the user on the clustering results. As the examples used to illustrate the operation of the new algorithm show, the proposed algorithm can detect clusters of arbitrary shape and does not suffer from the numerical problems of Gaussian mixture based clustering algorithms.
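As a rough illustration of the minimal spanning tree half of such a hybrid, the following Python sketch builds a Euclidean MST and cuts its longest edges. It is a minimal sketch under stated assumptions, not the algorithm of the paper: the function names are hypothetical, the Gaussian mixture refinement is omitted, and the fixed `n_clusters` argument stands in for the paper's own cutting criterion, which the abstract does not describe.

```python
import math

def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns (weight, i, j) edges."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection cost to the growing tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the cheapest vertex that is not yet part of the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges

def mst_clusters(points, n_clusters):
    """Cut the n_clusters - 1 longest MST edges and label the connected components.

    Assumption: a fixed cluster count replaces the paper's cutting criterion.
    """
    edges = sorted(mst_edges(points))
    kept = edges[: len(edges) - (n_clusters - 1)]
    root = list(range(len(points)))  # union-find forest

    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]
            i = root[i]
        return i

    for _, i, j in kept:
        root[find(i)] = find(j)
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(points))]
```

Because the cut follows the tree rather than a distance to a centroid, elongated or curved groups stay together as long as their internal gaps are smaller than the gaps between groups.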
Information Sciences | 2009
Agnes Vathy-Fogarassy; János Abonyi
As data analysis tasks often have to deal with complex data structures, nonlinear dimensionality reduction methods play an important role in exploratory data analysis. A number of nonlinear dimensionality reduction techniques have been proposed in the literature (e.g. Sammon mapping, Locally Linear Embedding). These techniques attempt to preserve either the local or the global geometry of the original data, and they perform metric or non-metric dimensionality reduction. Nevertheless, it is difficult to apply most of them to large data sets. There is a need for new algorithms that combine vector quantisation and mapping methods in order to visualise the data structure in a low-dimensional vector space. In this paper we define a new class of algorithms to quantify and disclose the data structure; these algorithms are based on topology representing networks and apply different mapping methods for the low-dimensional visualisation. Not only are existing methods combined for that purpose, but a novel group of mapping methods (Topology Representing Network Map) is also introduced as part of this class. Topology Representing Network Maps utilise the main benefits of topology representing networks and of multidimensional scaling methods to disclose the real structure of the data set under study. To determine the main properties of the topology representing network based mapping methods, a detailed analysis of classical benchmark examples (the Wine and Optical Recognition of Handwritten Digits data sets) is presented.
Transactions on Computational Science | 2008
Agnes Vathy-Fogarassy; Attila Kiss; János Abonyi
In practical data mining problems high-dimensional data has to be analyzed. In most of these cases it is very informative to map and visualize the hidden structure of a complex data set in a low-dimensional space. The aim of this paper is to propose a new mapping algorithm based on both the topology and the metric of the data. The utilized Topology Representing Network (TRN) combines neural gas vector quantization and the competitive Hebbian learning rule in such a way that the hidden data structure is approximated by a compact graph representation. TRN is able to define a low-dimensional manifold in the high-dimensional feature space. If such a manifold exists, multidimensional scaling and/or Sammon mapping of the graph distances can be used to form the map of the TRN (TRNMap). The systematic analysis of the algorithms that can be used for data visualization and the numerical examples presented in this paper demonstrate that the resulting map gives a good representation of the topology and the metric of complex data sets, and that the component plane representation of TRNMap is useful for exploring the hidden relations among the features.
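The graph-building step described above can be sketched in Python as follows. This is a simplified illustration under assumptions, not the TRN or TRNMap implementation: the codebook vectors are taken as given (the neural gas adaptation and edge ageing are omitted), the function names are hypothetical, and only the competitive Hebbian linking and the graph distances that a later MDS or Sammon step would consume are shown.

```python
import heapq
import math

def hebbian_edges(data, codebook):
    """Competitive Hebbian learning: link the two codebook vectors closest to each sample."""
    edges = set()
    for x in data:
        order = sorted(range(len(codebook)), key=lambda i: math.dist(x, codebook[i]))
        a, b = order[0], order[1]
        edges.add((min(a, b), max(a, b)))
    return edges

def graph_distances(codebook, edges):
    """All-pairs shortest path lengths (Dijkstra) with Euclidean edge weights."""
    n = len(codebook)
    adj = {i: [] for i in range(n)}
    for a, b in edges:
        w = math.dist(codebook[a], codebook[b])
        adj[a].append((b, w))
        adj[b].append((a, w))
    dist = [[math.inf] * n for _ in range(n)]
    for s in range(n):
        dist[s][s] = 0.0
        heap = [(0.0, s)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[s][u]:
                continue  # stale heap entry
            for v, w in adj[u]:
                if d + w < dist[s][v]:
                    dist[s][v] = d + w
                    heapq.heappush(heap, (d + w, v))
    return dist
```

The resulting distance matrix measures separation along the graph rather than straight through the feature space, which is what lets a subsequent mapping step unfold a curved manifold.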
Scientific Reports | 2017
Daniel Leitold; Agnes Vathy-Fogarassy; János Abonyi
Network theory based controllability and observability analyses have become widely used techniques. We realized that most applications are not related to dynamical systems, and that mainly the physical topologies of the systems are analysed without deeper consideration. Here, we draw attention to the importance of dynamics inside and between state variables by adding edges defined by functional relationships to the original topology. The resulting networks differ from the physical topologies of the systems and describe the dynamics of the conservation of mass, momentum and energy more accurately. We define the typical connection types and highlight how the reinterpreted topologies change the number of necessary sensors and actuators in benchmark networks widely studied in the literature. Additionally, we offer a workflow for network science-based dynamical system analysis, and we also introduce a method for generating the minimum number of necessary actuator and sensor points in the system.
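For context, the count of necessary actuators in this kind of analysis is commonly obtained from a maximum matching of the directed network, using the structural controllability result N_D = max(N - |M|, 1). The Python sketch below shows that standard calculation only; it is not the method introduced in the paper, and the function name and edge-list input format are assumptions made for illustration.

```python
def min_driver_nodes(n, edges):
    """Number of driver nodes for a directed network with nodes 0..n-1.

    Uses the maximum-matching result from structural controllability:
    N_D = max(n - |M|, 1), where M is a maximum matching of the bipartite
    graph formed by the out-copy and in-copy of every node.
    """
    succ = {u: [] for u in range(n)}
    for u, v in edges:
        succ[u].append(v)
    matched_by = {}  # in-copy v -> out-copy u currently matched to it

    def try_match(u, seen):
        # Augmenting-path search (Kuhn's algorithm) starting from out-copy u.
        for v in succ[u]:
            if v in seen:
                continue
            seen.add(v)
            if v not in matched_by or try_match(matched_by[v], seen):
                matched_by[v] = u
                return True
        return False

    matching = sum(try_match(u, set()) for u in range(n))
    return max(n - matching, 1)
```

For example, a directed chain needs a single driver node, while a star whose hub points at three leaves needs three. For linear systems, the dual sensor count can be obtained by running the same function on the edge-reversed network. Adding edges for functional relationships between state variables, as the abstract proposes, changes the matching and therefore the count.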
Fuzzy Sets and Systems | 2016
András Király; Agnes Vathy-Fogarassy; János Abonyi
Clustering high dimensional data and identifying central nodes in a graph are complex and computationally expensive tasks. We utilize the k-nn graph of high dimensional data as an efficient representation of the hidden structure of the clustering problem. Initial cluster centers are determined by graph centrality measures. Cluster centers are fine-tuned by minimizing fuzzy-weighted geodesic distances. The shortest-path based representation is parallel to the concept of transitive closure. Therefore, our algorithm is capable of clustering networks or even more complex and abstract objects based on their partially known pairwise similarities. The algorithm is proven to be effective in identifying senior researchers in a co-author network, central cities in topographical data, and clusters of documents represented by high dimensional feature vectors.
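The first two steps described above (a k-nn graph and a centrality-based choice of a center) can be illustrated with a short Python sketch. It rests on several assumptions: the abstract does not name the centrality measure, so the smallest total geodesic distance (a closeness-style criterion) is used here as a stand-in; the graph is assumed to be connected; the fuzzy fine-tuning step is omitted; and all function names are hypothetical.

```python
import heapq
import math

def knn_graph(points, k):
    """Symmetric k-nearest-neighbour graph as {node: {neighbour: distance}}."""
    n = len(points)
    adj = {i: {} for i in range(n)}
    for i in range(n):
        nearest = sorted((j for j in range(n) if j != i),
                         key=lambda j: math.dist(points[i], points[j]))[:k]
        for j in nearest:
            d = math.dist(points[i], points[j])
            adj[i][j] = d
            adj[j][i] = d
    return adj

def geodesic_from(adj, source):
    """Shortest-path (geodesic) distances from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in adj[u].items():
            if d + w < dist.get(v, math.inf):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def most_central(adj):
    """Node with the smallest total geodesic distance to all others.

    Assumption: the graph is connected, so every sum covers the same nodes.
    """
    return min(adj, key=lambda s: sum(geodesic_from(adj, s).values()))
```

Because only graph distances are used after the first step, the same functions apply when the input is a network of partially known pairwise similarities rather than a set of vectors.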
Information Systems | 2017
Agnes Vathy-Fogarassy; Tamás Hugyák
Integration of data stored in heterogeneous database systems is a very challenging task and it may hide several difficulties. As NoSQL databases are growing in popularity, the integration of different NoSQL systems and the interoperability of NoSQL systems with SQL databases are becoming increasingly important issues. In this paper, we propose a novel data integration methodology to query data individually from different relational and NoSQL database systems. The suggested solution does not support joins and aggregates across data sources; it only collects data from different separated database management systems according to the filtering options and migrates them. The proposed method is based on a metamodel approach and it covers the structural, semantic and syntactic heterogeneities of source systems. To demonstrate the applicability of the proposed methodology, we developed a web-based application, which convincingly confirms the usefulness of the novel method.
Intelligent Data Engineering and Automated Learning | 2007
Agnes Vathy-Fogarassy; Ágnes Werner-Stark; Balazs Gal; János Abonyi
As data analysis tasks often involve huge and complex data sets, there is a need for new algorithms that combine vector quantization and mapping methods to visualize the hidden data structure in a low-dimensional vector space. In this paper a new class of algorithms is defined. Topology representing networks are applied to quantify and disclose the data structure, and different nonlinear mapping algorithms are applied to map the quantized data for low-dimensional visualization. To evaluate the main properties of the resulting topology representing network based mapping methods, a detailed analysis based on the wine benchmark example is given.
Journal of Mathematical Modelling and Algorithms | 2008
Agnes Vathy-Fogarassy; Ágnes Werner-Stark; János Abonyi
In practical data mining tasks, high-dimensional data has to be analyzed. In most of the cases it is very informative to map and visualize the hidden structure of a complex data set in a low-dimensional space. In this paper a new class of mapping algorithms is defined. These algorithms combine topology representing networks and different nonlinear mapping algorithms. While the former methods aim to quantify the data and disclose the real structure of the objects, the nonlinear mapping algorithms are able to visualize the quantized data in the low-dimensional vector space. In this paper, techniques based on these methods are gathered and the results of a detailed analysis performed on them are shown. The primary aim of this analysis is to examine the preservation of distances and neighborhood relations of the objects. Preservation of neighborhood relations was analyzed both in local and global environments. To evaluate the main properties of the examined methods we show the outcome of the analysis based both on synthetic and real benchmark examples.
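One simple way to check whether a mapping preserves local neighborhood relations is to compare each object's k nearest neighbors before and after the mapping. The Python sketch below computes the fraction of neighbors that survive; it is an assumed, simplified stand-in for illustration and not necessarily one of the measures used in the paper, and the function names are hypothetical.

```python
import math

def knn_indices(points, k):
    """For every point, the index set of its k nearest neighbours (self excluded)."""
    result = []
    for i, p in enumerate(points):
        order = sorted((j for j in range(len(points)) if j != i),
                       key=lambda j: math.dist(p, points[j]))
        result.append(set(order[:k]))
    return result

def neighborhood_preservation(high, low, k):
    """Average fraction of each point's k nearest neighbours kept by the mapping.

    `high` holds the original vectors and `low` their mapped counterparts,
    in the same order. A value of 1.0 means every k-neighbourhood is intact.
    """
    hi_nn = knn_indices(high, k)
    lo_nn = knn_indices(low, k)
    shared = sum(len(a & b) for a, b in zip(hi_nn, lo_nn))
    return shared / (k * len(high))
```

Small values of `k` probe the local environment of each object, while values of `k` close to the number of objects make the score reflect global structure.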
International Workshop on Fuzzy Logic and Applications | 2007
Agnes Vathy-Fogarassy; Attila Kiss; János Abonyi
Different clustering algorithms are based on different similarity or distance measures (e.g. Euclidean distance, Minkowski distance, Jaccard coefficient, etc.). The Jarvis-Patrick clustering method uses the number of common neighbors among the k-nearest neighbors of objects to disclose the clusters. The main drawback of this algorithm is that its parameters determine too crisp a cutting criterion, hence it is difficult to determine a good parameter set. In this paper we give an extension of the similarity measure of the Jarvis-Patrick algorithm. This extension is carried out in the following two ways: (i) fuzzification of one of the parameters, and (ii) widening the scope of the other parameter. The suggested fuzzy similarity measure can be applied in various forms, in different clustering and visualization techniques (e.g. hierarchical clustering, MDS, VAT). In this paper we give some application examples to illustrate the efficiency of the proposed fuzzy similarity measure in clustering. These examples show that clustering techniques based on the proposed fuzzy similarity measure are able to detect clusters with different sizes, shapes and densities. It is also shown that outliers can be detected by the proposed measure.
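The contrast between a crisp shared-neighbor rule and a graded similarity can be sketched in Python as follows. The crisp rule follows the usual description of Jarvis-Patrick clustering (mutual k-nearest neighbors that share at least `l` neighbors). The graded version is an assumption made for illustration: it uses the ratio of shared to combined neighbors and is not necessarily the exact measure proposed in the paper. Neighborhoods here exclude the point itself, and the function names are hypothetical.

```python
import math

def knn_sets(points, k):
    """For every point, the index set of its k nearest neighbours (self excluded)."""
    sets = []
    for i, p in enumerate(points):
        order = sorted((j for j in range(len(points)) if j != i),
                       key=lambda j: math.dist(p, points[j]))
        sets.append(set(order[:k]))
    return sets

def crisp_jp_link(nn, i, j, l):
    """Crisp rule: i and j are mutual neighbours and share at least l neighbours."""
    return i in nn[j] and j in nn[i] and len(nn[i] & nn[j]) >= l

def graded_similarity(nn, i, j):
    """Assumed graded variant: shared neighbours divided by combined neighbours."""
    union = nn[i] | nn[j]
    return len(nn[i] & nn[j]) / len(union) if union else 0.0
```

A graded value in [0, 1] can be passed to hierarchical clustering or MDS as a similarity, so the result no longer hinges on a single hard threshold `l`.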
Archive | 2013
Agnes Vathy-Fogarassy; János Abonyi
Graph-based clustering algorithms use graphs to partition data in many different ways. In this chapter, two approaches are presented. The first, a hierarchical clustering algorithm, combines minimal spanning trees and Gath-Geva fuzzy clustering. The second algorithm utilizes a neighborhood-based fuzzy similarity measure to improve k-nearest neighbor graph based Jarvis-Patrick clustering.