Network


Latest external collaborations at the country level. Dive into the details by clicking on the dots.

Hotspot


Dive into the research topics where Emil Eirola is active.

Publication


Featured research published by Emil Eirola.


Neurocomputing | 2016

Extreme learning machine for missing data using multiple imputations

Dušan Sovilj; Emil Eirola; Yoan Miche; Kaj-Mikael Björk; Rui Nian; Anton Akusok; Amaury Lendasse

In this paper, we examine the general regression problem under the missing-data scenario. To provide reliable estimates of the regression function (approximation), a novel methodology based on the Gaussian Mixture Model and the Extreme Learning Machine is developed. The Gaussian Mixture Model, adapted to handle missing values, is used to model the data distribution, while the Extreme Learning Machine enables us to devise a multiple imputation strategy for the final estimation. With multiple imputation and an ensemble approach over many Extreme Learning Machines, the final estimation is improved over mean imputation performed only once to complete the data. The proposed methodology has longer running times than simple methods, but the overall increase in accuracy justifies this trade-off.
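The multiple-imputation idea described above can be sketched in a few lines. The sketch below is a deliberate simplification: a basic random-feature ELM stands in for the paper's ELM variant, and noisy per-column mean imputation stands in for conditional draws from the fitted Gaussian Mixture Model; all names, sizes, and parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_fit(X, y, n_hidden=50):
    """Basic ELM: random hidden layer + least-squares output weights."""
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    beta, *_ = np.linalg.lstsq(np.tanh(X @ W + b), y, rcond=None)
    return W, b, beta

def elm_predict(model, X):
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta

# Toy regression data with 10% of entries missing
X = rng.uniform(-1, 1, size=(200, 3))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]
mask = rng.random(X.shape) < 0.1
X_miss = np.where(mask, np.nan, X)

# Multiple imputation: draw several plausible completions (noisy mean
# imputation here stands in for GMM conditional draws), train one ELM
# per completion, and average the ensemble's predictions.
col_mean = np.nanmean(X_miss, axis=0)
col_std = np.nanstd(X_miss, axis=0)
preds = []
for _ in range(10):
    X_imp = np.where(mask, col_mean + col_std * rng.normal(size=X.shape), X_miss)
    preds.append(elm_predict(elm_fit(X_imp, y), X_imp))
y_hat = np.mean(preds, axis=0)   # ensemble estimate over imputations
```

Averaging over completions is what distinguishes this from a single mean imputation: the variability of the missing entries propagates into the ensemble instead of being ignored.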


International Work-Conference on Artificial and Natural Neural Networks | 2015

Extreme Learning Machines for Multiclass Classification: Refining Predictions with Gaussian Mixture Models

Emil Eirola; Andrey Gritsenko; Anton Akusok; Kaj-Mikael Björk; Yoan Miche; Dušan Sovilj; Rui Nian; Bo He; Amaury Lendasse

This paper presents an extension of the well-known Extreme Learning Machines (ELMs). The main goal is to provide probabilities as outputs for multiclass classification problems; such information is more useful in practice than traditional crisp classification outputs. In summary, Gaussian Mixture Models are used as post-processing for ELMs. In that context, the proposed global methodology retains the advantages of ELMs (low computational time and state-of-the-art performance) and the ability of Gaussian Mixture Models to deal with probabilities. The methodology is tested on three toy examples and three real datasets. As a result, the global performance of ELMs is slightly improved, and the probability outputs are shown to be accurate and useful in practice.
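As a rough illustration of the post-processing idea, the sketch below trains a basic ELM on one-hot targets, then models each class's ELM outputs with a single Gaussian (a one-component stand-in for the paper's Gaussian Mixture Models) and converts the raw outputs to posterior probabilities via Bayes' rule. The data and all parameters are synthetic and illustrative, not the authors' exact method.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy two-class data and one-hot targets
X = np.vstack([rng.normal(-1, 1, (100, 2)), rng.normal(1, 1, (100, 2))])
y = np.repeat([0, 1], 100)
T = np.eye(2)[y]

# Basic ELM: random hidden layer, least-squares output weights
W = rng.normal(size=(2, 30))
b = rng.normal(size=30)
H = np.tanh(X @ W + b)
beta, *_ = np.linalg.lstsq(H, T, rcond=None)
Z = H @ beta                           # raw ELM outputs, not probabilities

def gauss_logpdf(Z, mu, var):
    """Log-density of a diagonal Gaussian, evaluated row-wise."""
    return -0.5 * np.sum((Z - mu) ** 2 / var + np.log(2 * np.pi * var), axis=1)

# Fit one Gaussian per class in ELM-output space, then apply Bayes' rule
log_lik = np.column_stack([
    gauss_logpdf(Z, Z[y == k].mean(axis=0), Z[y == k].var(axis=0) + 1e-6)
    for k in (0, 1)
])
log_post = log_lik - log_lik.max(axis=1, keepdims=True)
proba = np.exp(log_post) / np.exp(log_post).sum(axis=1, keepdims=True)
```

The rows of `proba` sum to one, which is exactly the calibrated, probabilistic output that crisp ELM classification lacks.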


Computers & Security | 2017

A pragmatic android malware detection procedure

Paolo Palumbo; Luiza Sayfullina; Dmitriy Komashinskiy; Emil Eirola; Juha Karhunen

The academic security research community has studied the Android malware detection problem extensively. Machine learning methods proposed in previous work typically achieve high reported detection performance on fixed datasets, and some also report reasonably fast prediction times. However, most are not suitable for real-world deployment because the requirements for malware detection go beyond these figures of merit. In this paper, we introduce several important requirements for deploying Android malware detection systems in the real world. One such requirement is that candidate approaches should be tested against a stream of continuously evolving data. Such streams represent the continuous flow of unknown file objects received for categorization, and provide a more reliable and realistic estimate of detection performance once deployed in a production environment. As a case study, we designed and implemented an ensemble approach for automatic Android malware detection that meets the real-world requirements we identified. Atomic Naive Bayes classifiers, used as inputs for the Support Vector Machine ensemble, are based on different APK feature categories, providing high speed and, through diversification, additional robustness against attackers. Our case study with several malware families showed that different families are detected by different atomic classifiers. To the best of our knowledge, our work contains the first publicly available results generated against evolving data streams of nearly 1 million samples, with a model trained over a massive sample set of 120,000 samples.


Archive | 2016

Probabilistic Methods for Multiclass Classification Problems

Andrey Gritsenko; Emil Eirola; Daniel Schupp; Edward Ratner; Amaury Lendasse

In this paper, two approaches for probability-based class prediction are presented. In the first approach, the output of the Extreme Learning Machine (ELM) algorithm is used as input for Gaussian Mixture Models; in this case, the ELM serves as a dimensionality reduction technique. The second approach is based on the ELM and a newly proposed Histogram Probability method. A detailed description and analysis of these methods are presented. To evaluate them, five datasets from the UCI Machine Learning Repository are used.


Archive | 2018

Predicting Huntington’s Disease: Extreme Learning Machine with Missing Values

Emil Eirola; Anton Akusok; Kaj-Mikael Björk; Hans J. Johnson; Amaury Lendasse

Problems with incomplete data and missing values are common and important in real-world machine learning scenarios, yet often underrepresented in the research field. Particularly data related to healthcare tends to feature missing values which must be handled properly, and ignoring any incomplete samples is not an acceptable solution. The Extreme Learning Machine has demonstrated excellent performance in a variety of machine learning tasks, including situations with missing values. In this paper, we present an application to predict the onset of Huntington’s disease several years in advance based on data from MRI brain scans. Experimental results show that such prediction is indeed realistic with reasonable accuracy, provided the missing values are handled with care. In particular, Multiple Imputation ELM achieves exceptional prediction accuracy.


Hybrid Artificial Intelligence Systems | 2017

Solve Classification Tasks with Probabilities. Statistically-Modeled Outputs

Andrey Gritsenko; Emil Eirola; Daniel Schupp; Edward Ratner; Amaury Lendasse

In this paper, an approach for probability-based class prediction is presented. It combines a newly proposed Histogram Probability (HP) method with any classification algorithm (here, results are presented for combinations with Extreme Learning Machines (ELM) and Support Vector Machines (SVM)). The Extreme Learning Machine is a method for training a single-hidden-layer neural network. The paper contains a detailed description and analysis of the HP method, using the Iris dataset as an example. Eight datasets, four of which represent computer vision classification problems and are derived from the Caltech-256 image database, are used to compare the HP method with another probability-output classifier [11, 18].
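The HP method itself is defined in the paper; as a generic stand-in, the sketch below turns raw classifier scores into probabilities by binning the scores into a histogram and using per-bin class frequencies as the probability estimate. It conveys the histogram-based flavor without claiming to reproduce the authors' algorithm; all data and bin settings are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)

# Toy classifier scores for a binary problem (higher = more likely class 1)
y = rng.integers(0, 2, 1000)
s = rng.normal(loc=y.astype(float), scale=1.0)

# Histogram-based probability estimate: bin the scores, then use the
# class-1 frequency inside each bin as P(class 1 | score)
edges = np.linspace(s.min(), s.max(), 21)
bins = np.clip(np.digitize(s, edges) - 1, 0, 19)
p1 = np.array([
    y[bins == b].mean() if np.any(bins == b) else 0.5
    for b in range(20)
])
proba = p1[bins]   # calibrated probability for each sample
```

The appeal of such a scheme is that it is classifier-agnostic: any score-producing model (ELM, SVM, ...) can be calibrated this way after training.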


International Conference on Machine Learning and Applications | 2016

Android Malware Detection: Building Useful Representations

Luiza Sayfullina; Emil Eirola; Dmitry Komashinsky; Paolo Palumbo; Juha Karhunen

The problem of proactively detecting Android malware has proven to be a challenging one. The challenges stem from a variety of issues, and recent literature has shown that the task is hard to solve with high accuracy when only a restricted set of features, such as permissions or similar fixed feature sets, is used. The opposite approach of including all available features is also problematic, as it causes the feature space to grow beyond a reasonable size. In this paper we focus on finding an efficient way to select a representative feature space that preserves its discriminative power on unseen data. We go beyond traditional approaches like Principal Component Analysis, which is too heavy for large-scale problems with millions of features. In particular, we show that many feature groups that can be extracted from Android application packages, such as features from the manifest file or strings from the Dalvik Executable (DEX), should be filtered and used in classification separately. Our proposed dimensionality reduction scheme is applied to each group separately and consists of raw string preprocessing, feature selection via log-odds, and finally random projections. With the size of the feature space growing exponentially as a function of the training set size, our approach decreases the size of the feature space by several orders of magnitude; this in turn makes accurate classification possible in a real-world scenario. After reducing the dimensionality, we use the feature groups in a lightweight ensemble of logistic classifiers. We evaluated the proposed classification scheme on real malware data provided by an antivirus vendor and achieved a state-of-the-art 88.24% true positive rate and a reasonably low 0.04% false positive rate with a significantly compressed feature space on a balanced test set of 10,000 samples.
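The two core steps of the reduction scheme, log-odds feature selection followed by random projection, can be sketched as follows on synthetic binary features. The thresholds, dimensions, and sign-based projection matrix below are illustrative choices, not the paper's exact settings.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy sparse binary "string" features for one feature group
n, d = 500, 2000
y = rng.integers(0, 2, n)
X = (rng.random((n, d)) < 0.02).astype(float)
X[:, :30] += rng.random((n, 30)) < 0.3 * y[:, None]   # 30 informative features
X = np.minimum(X, 1.0)

# Step 1: feature selection via log-odds of smoothed per-class occurrence rates
p1 = (X[y == 1].sum(axis=0) + 1) / ((y == 1).sum() + 2)
p0 = (X[y == 0].sum(axis=0) + 1) / ((y == 0).sum() + 2)
log_odds = np.abs(np.log(p1 / (1 - p1)) - np.log(p0 / (1 - p0)))
keep = np.argsort(log_odds)[-200:]                    # keep the top 200 features

# Step 2: random projection of the selected features to a small dense space
k = 50
R = rng.choice([-1.0, 1.0], size=(keep.size, k)) / np.sqrt(k)
X_small = X[:, keep] @ R
```

After these two steps, each sample in this toy group shrinks from 2000 sparse binary features to 50 dense values, which is the kind of compression that makes a lightweight per-group classifier practical.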


Archive | 2019

Extreme Learning Tree

Anton Akusok; Emil Eirola; Kaj-Mikael Björk; Amaury Lendasse

The paper proposes a new variant of a decision tree, called an Extreme Learning Tree. It consists of an extremely random tree with non-linear data transformation, and a linear observer that provides predictions based on the leaf index where the data samples fall. The proposed method outperforms linear models on a benchmark dataset, and may be a building block for a future variant of Random Forest.
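A simplified sketch of this idea: random axis-aligned splits assign each sample a leaf index (here one shared random split per tree level, a simplification of a true per-node extremely random tree), and a least-squares "linear observer" on the one-hot leaf encoding produces the predictions. All details below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)

def random_tree_leaves(X, depth=6):
    """Assign each sample a leaf index via random axis-aligned splits
    (one shared random split per level, for brevity)."""
    leaf = np.zeros(len(X), dtype=int)
    for _ in range(depth):
        f = rng.integers(X.shape[1])                    # random feature
        t = rng.uniform(X[:, f].min(), X[:, f].max())   # random threshold
        leaf = 2 * leaf + (X[:, f] > t)
    return leaf

# Toy nonlinear regression data
X = rng.uniform(-1, 1, (300, 2))
y = np.sin(3 * X[:, 0]) * X[:, 1]

# Linear observer: least squares on the one-hot leaf membership matrix
leaf = random_tree_leaves(X)
ids, idx = np.unique(leaf, return_inverse=True)
L = np.eye(len(ids))[idx]
w, *_ = np.linalg.lstsq(L, y, rcond=None)
y_hat = L @ w
```

Because the leaf-indicator columns are orthogonal, the least-squares observer reduces to a per-leaf mean, so the random partition alone determines how much structure the model can capture.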


Archive | 2019

Distance Estimation for Incomplete Data by Extreme Learning Machine

Emil Eirola; Anton Akusok; Kaj-Mikael Björk; Amaury Lendasse

Data with missing values are very common in practice, yet many machine learning models are not designed to handle incomplete data. As most machine learning approaches can be formulated in terms of distances between samples, estimating these distances on data with missing values provides an effective way to use such models. This paper presents a procedure for estimating the distances using the Extreme Learning Machine. Experimental comparison shows that the proposed approach achieves accuracy competitive with other methods on standard benchmark datasets.
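One standard formulation of the expected squared distance under missing values is sketched below: each missing coordinate is treated as a random variable, so the expectation adds a variance term for every missing entry. Simple per-feature means and variances stand in for the paper's model-based (ELM) estimates; the function name and data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)

def expected_sq_dist(a, b, mu, var):
    """Expected squared Euclidean distance between samples with NaNs.

    Missing coordinates are modeled with the given per-feature mean and
    variance, so E[(a_i - b_i)^2] gains one variance term per missing
    value (two if the coordinate is missing in both samples).
    """
    a_m, b_m = np.isnan(a), np.isnan(b)
    a_hat = np.where(a_m, mu, a)
    b_hat = np.where(b_m, mu, b)
    return np.sum((a_hat - b_hat) ** 2 + var * a_m + var * b_m)

X = rng.normal(size=(100, 4))
mu, var = X.mean(axis=0), X.var(axis=0)   # stand-in for model-based estimates
a, b = X[0].copy(), X[1].copy()
a[2] = np.nan                             # one coordinate goes missing
d = expected_sq_dist(a, b, mu, var)
```

When no values are missing, the variance terms vanish and the estimate reduces to the ordinary squared Euclidean distance, so the same function can feed any distance-based model on complete or incomplete rows alike.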


International Symposium on Extreme Learning Machine | 2018

Incremental ELMVIS for Unsupervised Learning

Anton Akusok; Emil Eirola; Yoan Miche; Ian Oliver; Kaj-Mikael Björk; Andrey Gritsenko; Stephen Baek; Amaury Lendasse

An incremental version of the ELMVIS+ method is proposed in this paper. It iteratively selects the few best-fitting data samples from a large pool and adds them to the model. The method keeps the high speed of ELMVIS+ while allowing much larger sample pools due to lower memory requirements. The extension is useful for reaching a better local optimum with the greedy optimization of ELMVIS, and the data structure can be specified in semi-supervised optimization. The major new application of incremental ELMVIS is not visualization but general dataset processing: the method is capable of learning dependencies from non-organized unsupervised data, either reconstructing a shuffled dataset or learning dependencies in a complex high-dimensional space. The results are interesting and promising, although there is room for improvement.

Collaboration


Dive into Emil Eirola's collaborations.

Top Co-Authors


Amaury Lendasse

Nanyang Technological University

Kaj-Mikael Björk

Arcada University of Applied Sciences
