Lukáš Vojáček
Technical University of Ostrava
Publication
Featured research published by Lukáš Vojáček.
web intelligence | 2011
Kateřina Slaninová; Jan Martinovič; Tomáš Novosád; Pavla Dráždilová; Lukáš Vojáček; Václav Snášel
Web site community analysis is one of the most valuable tools for user segmentation in web marketing. User segmentation is used successfully in campaign analysis, in web, product, and service recommendation, and in web usage optimization. This type of analysis can also help with web performance analysis, web usability, and accessibility. Various software is available for analyzing user behavior or user interaction with a web site. However, most of these tools base user segmentation only on statistical measurements such as click-through rates or the identification of popular paths. This paper presents a web site community analysis oriented toward user segmentation. The analysis is based on similar user behavior on the web site. To identify similar behavioral patterns, the paper proposes an algorithm based on a sequential pattern mining method combined with clustering using a generalized suffix tree data structure.
computer information systems and industrial management applications | 2013
Lukáš Vojáček; Jiří Dvorský
The paper deals with the problem of clustering high-dimensional data. One possible way to cluster this kind of data is based on Artificial Neural Networks (ANN) such as Self Organizing Maps (SOM) or Growing Neural Gas (GNG). The main drawback of this approach to data clustering is the learning phase of the ANN, which is time-consuming, especially for large high-dimensional datasets. The paper presents a parallel modification of Growing Neural Gas and its implementation on an HPC cluster. Some experimental results are also presented.
Neural Network World | 2013
Jan Martinovič; Kateřina Slaninová; Lukáš Vojáček; Pavla Dráždilová; Jiří Dvorský; Ivo Vondrák
With increasing opportunities for analyzing large data sources, we have noticed a lack of effective processing in data mining tasks that work with large, sparse, high-dimensional datasets. This work focuses on this issue and on effective clustering using models of artificial intelligence. The authors of this article propose an effective clustering algorithm that exploits the features of neural networks, especially Self Organizing Maps (SOM), to reduce data dimensionality. The issue of computational complexity is resolved by parallelizing the standard SOM algorithm. The authors have focused on accelerating the presented algorithm using a version suitable for data collections with a certain level of sparsity. Effective acceleration is achieved by improving the phase that finds the winning neuron and the phase that updates the weights. The output presented here demonstrates sufficient acceleration of the standard SOM algorithm while preserving appropriate accuracy.
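One common way to speed up the winner search for sparse inputs can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the usual squared Euclidean distance and uses the identity ||x - w||^2 = ||x||^2 - 2 x·w + ||w||^2, so that only the non-zero components of the input are visited. The names `SparseVector`, `squared_norm`, and `find_winner` are hypothetical.

```python
from typing import Dict, List

# Hypothetical representation: index -> non-zero value of the input vector.
SparseVector = Dict[int, float]


def squared_norm(dense: List[float]) -> float:
    """Squared Euclidean norm of a dense weight vector."""
    return sum(v * v for v in dense)


def find_winner(x: SparseVector,
                weights: List[List[float]],
                weight_norms: List[float]) -> int:
    """Return the index of the neuron closest to the sparse input x.

    ||x||^2 is the same for every neuron, so it is dropped; the ranking
    depends only on ||w||^2 - 2 * (x . w), and the dot product touches
    only the non-zero entries of x.
    """
    best_index, best_score = -1, float("inf")
    for i, (w, w_norm) in enumerate(zip(weights, weight_norms)):
        dot = sum(value * w[j] for j, value in x.items())
        score = w_norm - 2.0 * dot
        if score < best_score:
            best_index, best_score = i, score
    return best_index


# Three neurons in a 4-dimensional input space (illustrative values).
weights = [[0.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 2.0], [0.0, 3.0, 0.0, 0.0]]
weight_norms = [squared_norm(w) for w in weights]  # would be cached and refreshed after weight updates
x = {0: 1.0, 3: 2.0}  # only two non-zero components
winner = find_winner(x, weights, weight_norms)
```

The cost of each comparison is proportional to the number of non-zero input components rather than to the full dimension, provided the squared weight norms are kept up to date.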
computer information systems and industrial management applications | 2011
Lukáš Vojáček; Jan Martinovič; Jiří Dvorský; Kateřina Slaninová; Ivo Vondrák
Self organizing maps (also called Kohonen maps) are known for their capability of projecting high-dimensional space into lower dimensions. Commonly discussed problems include rapidly increasing computational complexity and the specific representation of similarity in high-dimensional space. The paper proposes an effective clustering algorithm based on the self organizing map, with the main purpose of reducing the high dimension of the input dataset. The problem of computational complexity is solved using parallelization; the proposed algorithm is further accelerated using a version suitable for data collections with a certain level of sparsity.
computer information systems and industrial management applications | 2015
Lukáš Vojáček; Pavla Dráždilová; Jiří Dvorský
The paper deals with Self Organizing Maps (SOM). The SOM is a standard tool for clustering and visualization of high-dimensional data. The learning phase of the SOM is time-consuming, especially for large datasets. There are two main bottlenecks in the learning phase of the SOM: finding the winner of the competitive learning process and updating the neurons' weights. The paper focuses on the second problem. There are two extreme update strategies. With the first strategy, all necessary updates are done immediately after processing one input vector. The other extreme is used in Batch SOM, where updates are processed at the end of the whole epoch. In this paper we study update strategies that lie between these two extremes. Learning of the SOM with delayed updates is proposed in the paper. The proposed strategies are also evaluated experimentally.
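The range between the two extreme strategies can be illustrated with a small sketch. It is not the strategy proposed in the paper: it assumes a 1-D map, a Gaussian neighbourhood, and a hypothetical `delay` parameter that controls how many input vectors are processed before the accumulated weight changes are applied. With `delay=1` it behaves like the immediate update; with `delay=len(data)` all changes are applied once per epoch (the actual Batch SOM uses a different, averaging update formula).

```python
import math
from typing import List

Vector = List[float]


def find_winner(weights: List[Vector], x: Vector) -> int:
    """Index of the neuron with the smallest squared Euclidean distance to x."""
    return min(range(len(weights)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(weights[i], x)))


def train_epoch(weights: List[Vector], data: List[Vector], delay: int,
                learning_rate: float = 0.1, radius: float = 1.0) -> int:
    """Run one epoch over data, modifying weights in place.

    Weight changes are accumulated in `pending` and applied after every
    `delay` input vectors (and once more at the end of the epoch if any
    inputs remain). Returns the number of times the changes were applied.
    """
    dim = len(weights[0])
    pending = [[0.0] * dim for _ in weights]
    flushes = 0
    for step, x in enumerate(data, start=1):
        winner = find_winner(weights, x)
        for i, w in enumerate(weights):
            # Gaussian neighbourhood on a 1-D map: grid distance is |i - winner|.
            h = math.exp(-((i - winner) ** 2) / (2.0 * radius ** 2))
            for d in range(dim):
                pending[i][d] += learning_rate * h * (x[d] - w[d])
        if step % delay == 0 or step == len(data):
            for i, w in enumerate(weights):
                for d in range(dim):
                    w[d] += pending[i][d]
                    pending[i][d] = 0.0
            flushes += 1
    return flushes


data = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
som = [[0.0], [1.0]]
applied = train_epoch(som, data, delay=4)  # changes applied after inputs 4 and 6
```

Between two flushes the winner search sees stale weights, which is the trade-off such strategies have to balance. With long delays and a large learning rate the accumulated changes can overshoot, so the parameters above are illustrative only.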
computer information systems and industrial management applications | 2014
Lukáš Vojáček; Pavla Dráždilová; Jiří Dvorský
The paper deals with the problem of clustering high-dimensional data. One possible way to cluster this kind of data is based on Artificial Neural Networks (ANN) such as Growing Neural Gas (GNG) or Self Organizing Maps (SOM). The main drawback of this approach to data clustering is the learning phase of the ANN, which is time-consuming, especially for large high-dimensional datasets. The paper presents a parallel modification, Growing Neural Gas with pre-processing by Self Organizing Maps, and its implementation on an HPC cluster. Some experimental results are also presented.
intelligent systems design and applications | 2013
Lukáš Vojáček; Jiří Dvorský; Kateřina Slaninová; Jan Martinovič
Extracting social networks from log files and analyzing them requires data mining methods from areas such as data clustering or pattern mining. Our research focuses on log files in which one attribute identifies the originator of the recorded activity and the originator is a person. Hence, based on the similar attributes of people, we are able to construct models which explain certain aspects of a person's behaviour. Moreover, we can extract user profiles based on a person's behaviour in web applications. When working with large user profiles, usually acquired from web log files, the dimension reduction from the original high-dimensional space to 2D space can be done using the Kohonen SOM. The SOM also provides clusters of similar web profiles of particular users. For learning a large SOM, it is appropriate to use a parallel computing environment. Our version of a scalable parallel SOM learning algorithm and an experiment with web user profiles are presented in this paper.
computer information systems and industrial management applications | 2012
Jiří Dvorský; Zbyněk Janoška; Lukáš Vojáček
Membrane computing is an emergent branch of natural computing, taking inspiration from the structure and functioning of a living cell. P systems, the computing devices of this paradigm, are parallel, distributed, and non-deterministic computing models which aim to capture processes taking place in a living cell and represent them as a computation. In the last decade, a great variety of extensions of the model, introduced by Păun in 1998, have been presented. In this paper, we focus on modelling traffic flow by means of P systems. P systems enable a mesoscopic representation of traffic flow with individual modelling of each car's behaviour. A theoretical model is presented together with an XML scheme to store the output of the model.
computer information systems and industrial management applications | 2018
Lukáš Vojáček; Pavla Dráždilová; Jiří Dvorský
The size, complexity and dimensionality of data collections have been increasing ever since the beginning of the computer era. Clustering methods based on unsupervised learning, such as Growing Neural Gas (GNG) [10], are used to reveal structures and to reduce large amounts of raw data. The growth in the computational complexity of such clustering methods, caused by growing data dimensionality and by the specific similarity measurement in a high-dimensional space, reduces their effectiveness in many real applications. This growth can be partially addressed using parallel computation facilities, such as a High Performance Computing (HPC) cluster with MPI. An effective parallel implementation of GNG is discussed in this paper, with the main focus on minimizing interprocess communication, which depends on the number of neurons and on the edges among neurons in the neural network. A new algorithm for adding neurons depending on data density is proposed in the paper.
computer information systems and industrial management applications | 2017
Lukáš Vojáček; Pavla Dráždilová; Jiří Dvorský
The size, complexity and dimensionality of data collections have been increasing ever since the beginning of the computer era. Clustering is used to reveal structures and to reduce large amounts of raw data. There are two main issues when clustering based on unsupervised learning, such as Growing Neural Gas (GNG) [9], is performed on a vast high-dimensional data collection: the fast growth of computational complexity with respect to growing data dimensionality, and the specific similarity measurement in a high-dimensional space. These two factors reduce the effectiveness of clustering algorithms in many real applications. The growth of computational complexity can be partially addressed using parallel computation facilities, such as a High Performance Computing (HPC) cluster with MPI. An effective parallel implementation of GNG is discussed in this paper, with the main focus on minimizing interprocess communication. The achieved speed-up was better than that of the previous approach, and the results from the standard and parallel versions of GNG are the same.
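The abstract does not describe the communication scheme, so the following is only a sketch of one way a GNG step could be split across processes while exchanging little data. GNG needs the nearest and second-nearest neuron for each input; here each simulated worker reports just its two local candidates, and the merge yields the same pair as a serial search. The workers are simulated with list slices rather than MPI, and the names `local_top2` and `global_top2` are hypothetical.

```python
import heapq
from typing import List, Tuple

Vector = List[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def local_top2(neurons: List[Vector], offset: int,
               x: Vector) -> List[Tuple[float, int]]:
    """Two nearest neurons in one worker's slice, as (distance, global index)."""
    scored = [(squared_distance(w, x), offset + i) for i, w in enumerate(neurons)]
    return heapq.nsmallest(2, scored)


def global_top2(neurons: List[Vector], x: Vector, workers: int) -> List[int]:
    """Indices of the nearest and second-nearest neuron to x.

    Each simulated worker contributes at most two candidates, so the data
    exchanged per input is independent of the number of neurons it holds.
    """
    chunk = (len(neurons) + workers - 1) // workers  # slice size per worker
    candidates: List[Tuple[float, int]] = []
    for start in range(0, len(neurons), chunk):
        candidates.extend(local_top2(neurons[start:start + chunk], start, x))
    return [index for _, index in heapq.nsmallest(2, candidates)]


neurons = [[0.0, 0.0], [5.0, 5.0], [1.0, 1.0], [9.0, 9.0], [0.9, 1.2]]
x = [1.0, 1.0]
serial = global_top2(neurons, x, workers=1)
parallel = global_top2(neurons, x, workers=3)
```

In a real MPI program the merge would typically be a collective operation across processes; that part is omitted here.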