Andreas Janecek
University of Vienna
Publication
Featured research published by Andreas Janecek.
congress on evolutionary computation | 2013
Shaoqiu Zheng; Andreas Janecek; Ying Tan
In this paper, we present an improved version of the recently developed Fireworks Algorithm (FWA) based on several modifications. A comprehensive study of the operators of conventional FWA revealed that the algorithm works surprisingly well on benchmark functions which have their optimum at the origin of the search space. However, when applied to shifted functions, the quality of the results of conventional FWA deteriorates severely and worsens with increasing shift values, i.e., with increasing distance between the function optimum and the origin of the search space. Moreover, compared to other metaheuristic optimization algorithms, FWA has a high computational cost per iteration. In order to tackle these limitations, we present five major improvements of FWA: (i) a new minimal explosion amplitude check, (ii) a new operator for generating explosion sparks, (iii) a new mapping strategy for sparks which are out of the search space, (iv) a new operator for generating Gaussian sparks, and (v) a new operator for selecting the population for the next iteration. The resulting algorithm is called Enhanced Fireworks Algorithm (EFWA). Experimental evaluation on twelve benchmark functions with different shift values shows that EFWA outperforms conventional FWA in terms of convergence capabilities, while reducing the runtime significantly.
international conference on swarm intelligence | 2011
Andreas Janecek; Ying Tan
The nonnegative matrix factorization (NMF) is a bound-constrained low-rank approximation technique for nonnegative multivariate data. NMF has been studied extensively over the last years, but an important aspect that has received little attention so far is a proper initialization of the NMF factors in order to achieve a faster error reduction. Since the NMF objective function is usually non-differentiable, discontinuous, and may possess many local minima, heuristic search algorithms are a promising choice as initialization enhancers for NMF. In this paper we investigate the application of five population-based algorithms (genetic algorithms, particle swarm optimization, fish school search, differential evolution, and fireworks algorithm) as new initialization variants for NMF. Experimental evaluation shows that some of them are well suited as initialization enhancers and can reduce the number of NMF iterations needed to achieve a given accuracy. Moreover, we compare the general applicability of these five optimization algorithms for continuous optimization problems, such as the NMF objective function.
International Journal of Swarm Intelligence Research | 2011
Andreas Janecek; Ying Tan
The Non-negative Matrix Factorization (NMF) is a special low-rank approximation which allows for an additive parts-based and interpretable representation of the data. This article presents efforts to improve the convergence, approximation quality, and classification accuracy of NMF using five different meta-heuristics based on swarm intelligence. Several properties of the NMF objective function motivate the utilization of meta-heuristics: this function is non-convex, discontinuous, and may possess many local minima. The proposed optimization strategies are two-fold: on the one hand, a new initialization strategy for NMF is presented in order to initialize the NMF factors prior to the factorization; on the other hand, an iterative update strategy is proposed, which improves the accuracy per runtime for the multiplicative update NMF algorithm. The success of the proposed optimization strategies is shown by applying them to synthetic data and to data sets from the area of spam filtering/email classification, and by evaluating them in their application context. Experimental results show that both optimization strategies are able to improve NMF in terms of faster convergence, lower approximation error, and better classification accuracy. The initialization strategy in particular leads to significant reductions of the runtime per accuracy ratio for both the NMF approximation and the classification results achieved with NMF.
international conference on natural computation | 2011
Andreas Janecek; Ying Tan
Low-rank approximations of data (e.g., based on the Singular Value Decomposition) have proven very useful in various data mining applications. The Non-negative Matrix Factorization (NMF) leads to special low-rank approximations which satisfy non-negativity constraints. The Multiplicative Update (MU) algorithm is one of the two original NMF algorithms and is still one of the fastest NMF algorithms per iteration. Nevertheless, MU requires a rather large number of iterations in order to provide an accurate approximation of the original data. In this paper we present a new iterative update strategy for the MU algorithm based on nature-inspired optimization algorithms. The goal is to achieve a better accuracy per runtime compared to the standard version of MU. Several properties of the NMF objective function underlying the MU algorithm motivate the utilization of heuristic search algorithms. Indeed, this function is usually non-differentiable, discontinuous, and may possess many local minima. Experimental results show that our new iterative update strategy for the MU algorithm achieves the same approximation error as the standard version in significantly fewer iterations and in faster overall runtime.
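As a rough illustration of the standard Multiplicative Update scheme referred to in the abstract above (not the authors' nature-inspired update strategy), the following minimal Python sketch applies the well-known Lee and Seung update rules for the Frobenius-norm objective. The function name nmf_multiplicative_update, the helper functions, the random initialization, and the small constant eps are assumptions made for this example only.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - W*H."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rwh in zip(v, wh) for x, y in zip(rv, rwh)) ** 0.5


def nmf_multiplicative_update(v, rank, iterations=200, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix V (n x m) by W (n x rank) times H (rank x m).

    Random nonnegative initialization is an assumption of this sketch;
    eps guards against division by zero.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    errors = [frobenius_error(v, w, h)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
        errors.append(frobenius_error(v, w, h))
    return w, h, errors
```

Because every update multiplies a nonnegative entry by a nonnegative ratio, the factors stay nonnegative without any explicit projection step. The returned error history makes it easy to see how many iterations a given accuracy requires, which is the quantity the abstract aims to reduce.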
ubiquitous computing | 2012
Andreas Janecek; Karin Anna Hummel; Danilo Valerio; Fabio Ricciato; Helmut Hlavacs
Road traffic can be monitored by means of static sensors and derived from floating car data, i.e., reports from a subset of vehicles. These approaches suffer from a number of technical and economic limitations. Alternatively, we propose to leverage the mobile cellular network as a ubiquitous mobility sensor. We show how vehicle travel times and road congestion can be inferred from anonymized signaling data collected from a cellular mobile network. While previous studies have considered data only from active devices, e.g., those engaged in voice calls, our approach also exploits data from idle users, resulting in an enormous gain in coverage and estimation accuracy. By validating our approach against four different traffic monitoring datasets collected on a sample highway over one month, we show that our method can detect congestion very accurately and in a timely manner.
Archive | 2008
Wilfried N. Gansterer; Andreas Janecek; Robert Neumayer
In this paper, a study on the classification performance of a vector space model (VSM) and of latent semantic indexing (LSI) applied to the task of spam filtering is summarized. Based on a feature set used in the extremely widespread, de-facto standard spam filtering system SpamAssassin, a vector space model and latent semantic indexing are applied for classifying e-mail messages as spam or not spam. The test data sets used are partly from the official TREC 2005 data set and partly
availability, reliability and security | 2008
Andreas Janecek; Wilfried N. Gansterer; K.A. Kumar
We present the idea and implementation details of a highly effective and reliable e-mail filtering technique. At the core of the component-based architecture is a novel combination of an enhanced self-learning variant of greylisting with a reputation-based trust mechanism. These strategies provide separate feature extraction and classification components with the opportunity of utilizing the time between two delivery attempts of an e-mail message. The approach presented features a very high spam blocking rate and also minimizes the workload on the client side, as no responsibility for messages classified as spam is taken. The reputation-based trust mechanism decreases the delay in the transfer process of e-mail messages from reliable senders and also reduces the number of erroneously blocked legitimate messages.
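For readers unfamiliar with greylisting, the following minimal Python sketch shows only the basic, widely used idea: the first delivery attempt for an unknown (client IP, sender, recipient) triplet is temporarily deferred, and a retry after a minimum delay is accepted. It does not model the enhanced self-learning variant or the reputation-based trust mechanism described in the abstract above; the class name Greylist, the method check, the return values, and the default delay of 300 seconds are assumptions made for this example.

```python
class Greylist:
    """Basic greylisting sketch (hypothetical names, not the authors' system)."""

    def __init__(self, min_delay=300.0):
        # Minimum number of seconds that must pass before a retry is accepted.
        self.min_delay = min_delay
        # Maps each triplet to the timestamp of its first delivery attempt.
        self.first_seen = {}

    def check(self, client_ip, sender, recipient, now):
        """Return "defer" for a temporary rejection or "accept" for delivery."""
        triplet = (client_ip, sender, recipient)
        first = self.first_seen.get(triplet)
        if first is None:
            # Unknown triplet: remember it and ask the sending server to retry.
            self.first_seen[triplet] = now
            return "defer"
        if now - first < self.min_delay:
            # Retry arrived too early: keep deferring.
            return "defer"
        return "accept"
```

The interval between the deferred first attempt and the accepted retry is the time window that, according to the abstract, the feature extraction and classification components can use.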
IEEE Transactions on Intelligent Transportation Systems | 2015
Andreas Janecek; Danilo Valerio; Karin Anna Hummel; Fabio Ricciato; Helmut Hlavacs
Mobile cellular networks can serve as ubiquitous sensors for physical mobility. We propose a method to infer vehicle travel times on highways and to detect road congestion in real time, based solely on anonymized signaling data collected from a mobile cellular network. Most previous studies have considered data generated from mobile devices active in calls, namely Call Detail Records (CDR), an approach that limits the number of observable devices to a small fraction of the whole population. Our approach overcomes this drawback by exploiting the whole set of signaling events generated by both idle and active devices. While idle devices contribute a large volume of spatially coarse-grained mobility data, active devices provide finer-grained spatial accuracy for a limited subset of devices. The combined use of data from idle and active devices improves congestion detection performance in terms of coverage, accuracy, and timeliness. We apply our method to real mobile signaling data obtained from an operational network during a one-month period on a sample highway segment in the proximity of a European city, and present an extensive validation study based on ground truth obtained from a rich set of reference data sources: road sensor data, toll data, taxi floating car data, and radio broadcast messages.
Current Computer-Aided Drug Design | 2008
Michael A. Demel; Andreas Janecek; Khac-Minh Thai; Gerhard F. Ecker; Wilfried N. Gansterer
Since the advent of QSAR (quantitative structure activity relationship) modeling, quantitative representations of molecular structures have been encoded in terms of information-preserving descriptor values. Nowadays, a nearly infinite variety of potential descriptors is available and descriptor selection is no longer a task which can be done manually. There is an increasing need for automation in order to reduce the dimensionality of the descriptor space. Classical feature selection (FS) and dimensionality reduction (DR) methods like principal component analysis, which relies on the selection of those descriptors that contribute most to the variance of a data set, often fail to provide the best classification result. More sophisticated methods like genetic algorithms, self-organizing maps and stepwise linear discriminant analysis have proven to be useful techniques in the process of selecting descriptors with a significant discriminative power. The topic of FS and DR becomes even more important when building predictive models which should describe the QSAR of highly promiscuous target proteins. The ABC-transporter family, the cardiac hERG potassium channel, and the hepatic cytochrome P450 family are classical representatives of such poly-specific proteins. In this case the interaction pattern is rather complex, and thus the selection of the most predictive descriptors requires advanced methods. This review surveys FS and DR methods that have recently been successfully applied to classify ligands of poly-specific target proteins.
mobile data management | 2008
Julien Gossa; Andreas Janecek; Karin Anna Hummel; Wilfried N. Gansterer; Jean-Marc Pierson
In mobile distributed computing scenarios, data replication management can be enhanced by considering client mobility. This work introduces a mobility model for mobile clients which enables proactive replica placement based on mobility predictions to increase the responsiveness of data access for mobile end users and to reduce the network load in the system. The concept is applied to a flexible replica placement algorithm and is evaluated using GPS traces generated by taxis in the city of Vienna, Austria. We show results demonstrating that replica placement can be improved by adding mobility prediction and discuss the influence of prediction accuracy on the accuracy of the placement strategy.