Eric Malmi | Researchain

Publication

Featured research published by Eric Malmi.

knowledge discovery and data mining | 2016

DopeLearning: A Computational Approach to Rap Lyrics Generation

Eric Malmi; Pyry Takala; Hannu Toivonen; Tapani Raiko; Aristides Gionis

Writing rap lyrics requires both creativity to construct a meaningful, interesting story and lyrical skills to produce complex rhyme patterns, which form the cornerstone of good flow. We present a rap lyrics generation method that captures both of these aspects. First, we develop a prediction model to identify the next line of existing lyrics from a set of candidate next lines. This model is based on two machine-learning techniques: the RankSVM algorithm and a deep neural network model with a novel structure. Results show that the prediction model can identify the true next line among 299 randomly selected lines with an accuracy of 17%, i.e., over 50 times more often than random selection would. Second, we employ the prediction model to combine lines from existing songs, producing lyrics with rhyme and meaning. An evaluation of the produced lyrics shows that in terms of quantitative rhyme density, the method outperforms the best human rappers by 21%. The rap lyrics generator has been deployed as an online tool called DeepBeat, and the performance of the tool has been assessed by analyzing its usage logs. This analysis shows that machine-learned rankings correlate with user preferences.
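The "over 50 times" figure in the abstract above can be checked with a few lines of arithmetic. The sketch below assumes that the candidate pool holds the true next line plus the 299 random lines, i.e. 300 candidates in total; the abstract does not spell this out, and the variable names are illustrative only.

```python
# Chance baseline for picking the true next line from a candidate pool.
n_random = 299                 # randomly selected lines (from the abstract)
n_candidates = n_random + 1    # assumption: pool = random lines + the true line
chance_accuracy = 1 / n_candidates

model_accuracy = 0.17          # accuracy reported in the abstract
improvement = model_accuracy / chance_accuracy

print(f"chance accuracy: {chance_accuracy:.4f}")
print(f"improvement over chance: {improvement:.0f}x")
```

If the pool instead held 299 candidates in total, the ratio would be 0.17 × 299 ≈ 50.8, so the "over 50 times" reading holds either way.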

international symposium on neural networks | 2012

Semi-supervised detection of collective anomalies with an application in high energy particle physics

Tommi Vatanen; Mikael Kuusela; Eric Malmi; Tapani Raiko; Timo Aaltonen; Yoshikazu Nagai

We study a novel type of semi-supervised anomaly detection problem where the anomalies occur collectively among a background of normal data. Such a problem arises in experimental high energy physics when one is trying to discover deviations from known Standard Model physics. We solve the problem by first fitting a mixture of Gaussians to a labeled background sample. We then fit a mixture of this background model and a number of additional Gaussians to an unlabeled sample containing both background and anomalies. This way we not only detect but also perform pattern recognition of anomalies. Such a mixture model allows us to perform classification of anomalies vs. background, estimate the proportion of anomalies in the sample and study the statistical significance of the anomalous contribution. We first verify the performance of the method using artificial data and then demonstrate its real-life applicability using a data set related to the search for the Higgs boson at the Tevatron collider.
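The two-stage fit described in the abstract above can be illustrated with a heavily simplified sketch. It is not the authors' implementation: it assumes one-dimensional data, a single background Gaussian instead of a mixture, and a single additional anomaly Gaussian, and it uses a plain EM loop in which the background parameters stay fixed. All names (`fit_anomaly_component`, `lam`, the synthetic data) are hypothetical.

```python
import numpy as np

def normal_pdf(x, mean, std):
    """Density of a univariate Gaussian."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def fit_anomaly_component(x, bg_mean, bg_std, n_iter=200):
    """EM for p(x) = (1 - lam) * N(bg_mean, bg_std) + lam * N(mu, sigma).

    The background Gaussian is kept fixed; only the anomaly weight lam
    and the anomaly parameters mu and sigma are updated.
    """
    bg_density = normal_pdf(x, bg_mean, bg_std)
    lam = 0.5
    mu = x[np.argmin(bg_density)]   # start at the point the background explains worst
    sigma = np.std(x)
    for _ in range(n_iter):
        # E-step: posterior probability that each point is anomalous
        an_density = lam * normal_pdf(x, mu, sigma)
        resp = an_density / (an_density + (1.0 - lam) * bg_density)
        # M-step: update the anomaly component only
        lam = resp.mean()
        mu = np.sum(resp * x) / resp.sum()
        sigma = max(np.sqrt(np.sum(resp * (x - mu) ** 2) / resp.sum()), 1e-3)
    # Final responsibilities under the fitted parameters
    an_density = lam * normal_pdf(x, mu, sigma)
    resp = an_density / (an_density + (1.0 - lam) * bg_density)
    return lam, mu, sigma, resp

# Synthetic data (assumed values, for illustration only)
rng = np.random.default_rng(0)
background_labeled = rng.normal(0.0, 1.0, size=5000)
unlabeled = np.concatenate([
    rng.normal(0.0, 1.0, size=1800),   # background events
    rng.normal(5.0, 0.5, size=200),    # collective anomaly, 10% of the sample
])

# Stage 1: fit the background model to the labeled background sample
bg_mean, bg_std = background_labeled.mean(), background_labeled.std()

# Stage 2: fit background + anomaly mixture to the unlabeled sample
lam, mu, sigma, resp = fit_anomaly_component(unlabeled, bg_mean, bg_std)
is_anomaly = resp > 0.5

print(f"estimated anomaly proportion: {lam:.3f}")
print(f"anomaly component: mean={mu:.2f}, std={sigma:.2f}")
print(f"points classified as anomalous: {is_anomaly.sum()}")
```

Here `lam` plays the role of the estimated proportion of anomalies and `resp` gives a per-point classification of anomaly vs. background. Assessing the statistical significance of the anomalous contribution, which the abstract also mentions, is outside the scope of this sketch.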

Data Mining and Knowledge Discovery | 2017

Lagrangian relaxations for multiple network alignment

Eric Malmi; Sanjay Chawla; Aristides Gionis

We propose a principled approach for the problem of aligning multiple partially overlapping networks. The objective is to map multiple graphs into a single graph while preserving vertex and edge similarities. The problem is inspired by the task of integrating partial views of a family tree (genealogical network) into one unified network, but it also has applications, for example, in social and biological networks. Our approach, called Flan, introduces the idea of generalizing the facility location problem by adding a non-linear term to capture edge similarities and to infer the underlying entity network. The problem is solved using an alternating optimization procedure with a Lagrangian relaxation. Flan has the advantage of being able to leverage prior information on the number of entities, so that when this information is available, Flan is shown to work robustly without the need to use any ground truth data for fine-tuning method parameters. Additionally, we present three multiple-network extensions to an existing state-of-the-art pairwise alignment method called Natalie. Extensive experiments on synthetic, as well as real-world datasets on social networks and genealogical networks, attest to the effectiveness of the proposed approaches which clearly outperform a popular multiple network alignment method called IsoRankN.

arXiv: Computer Vision and Pattern Recognition | 2017

Domain Adaptation for Resume Classification Using Convolutional Neural Networks

Luiza Sayfullina; Eric Malmi; Yiping Liao; Alexander Jung

We propose a novel method for classifying resume data of job applicants into 27 different job categories using convolutional neural networks. Since resume data is costly and hard to obtain due to its sensitive nature, we use domain adaptation. In particular, we train a classifier on a large number of freely available job description snippets and then use it to classify resume data. We empirically verify a reasonable classification performance of our approach despite having only a small amount of labeled resume data available.

european conference on machine learning | 2015

The Blind Leading the Blind: Network-Based Location Estimation Under Uncertainty

Eric Malmi; Arno Solin; Aristides Gionis

We propose a probabilistic method for inferring the geographical locations of linked objects, such as users in a social network. Unlike existing methods, our model does not assume that the exact locations of any subset of the linked objects, like neighbors in a social network, are known. The method efficiently leverages prior knowledge on the locations, resulting in high geolocation accuracies even if none of the locations are initially known. Experiments are conducted for three scenarios: geolocating users of a location-based social network, geotagging historical church records, and geotagging Flickr photos. In each experiment, the proposed method outperforms two state-of-the-art network-based methods. Furthermore, the last experiment shows that the method can be applied not only to network-based but also to content-based location estimation.

ubiquitous computing | 2014

Quality matters: usage-based app popularity prediction

Eric Malmi

In recent years, the mobile application (app) economy has grown into a huge market, but only the top apps are able to turn this boom into significant revenues. In this paper, we study how the quality of an app, as reflected in how people start to use it, is linked to the popularity of the app. We show that features extracted from the Device Analyzer dataset, describing the aggregate usage of the app, can be used to predict its popularity. We also look at the connection between app popularity and the past popularity of other apps from the same publisher and find a surprisingly small correlation between the two.

international world wide web conferences | 2017

AncestryAI: A Tool for Exploring Computationally Inferred Family Trees

Eric Malmi; Marko Rasa; Aristides Gionis

Many people are excited to discover their ancestors and thus decide to take up genealogy. However, the process of finding the ancestors is often very laborious since it involves comparing a large number of historical birth records and trying to manually match the people mentioned in them. We have developed AncestryAI, an open-source tool for automatically linking historical records and exploring the resulting family trees. We introduce a record-linkage method for computing the probabilities of the candidate matches, which allows the users to either directly identify the next ancestor or narrow down the search. We also propose an efficient layout algorithm for drawing and navigating genealogical graphs. The tool is additionally used to crowdsource training and evaluation data so as to improve the matching algorithm. Our objective is to build a large genealogical graph, which could be used to resolve various interesting questions in the areas of computational social science, genetics, and evolutionary studies. The tool is openly available at: http://emalmi.kapsi.fi/ancestryai/.

Data Mining and Knowledge Discovery | 2015

Beyond rankings: comparing directed acyclic graphs

Eric Malmi; Nikolaj Tatti; Aristides Gionis

Defining appropriate distance measures among rankings is a classic area of study which has led to many useful applications. In this paper, we propose a more general abstraction of preference data, namely directed acyclic graphs (DAGs), and introduce a measure for comparing DAGs, given that a vertex correspondence between the DAGs is known. We study the properties of this measure and use it to aggregate and cluster a set of DAGs. We show that these problems are

international world wide web conferences | 2018