Donato Malerba | Researchain

Publication

Featured research published by Donato Malerba.

Pattern Recognition | 2017

A novel spectral-spatial co-training algorithm for the transductive classification of hyperspectral imagery data

Annalisa Appice; Pietro Guccione; Donato Malerba

The automatic classification of hyperspectral data is made complex by several factors, such as the high cost of true sample labeling coupled with the high number of spectral bands, as well as the spatial correlation of the spectral signature. In this paper, a transductive collective classifier is proposed for dealing with all these factors in hyperspectral image classification. The transductive inference paradigm allows us to reduce the inference error for the given set of unlabeled data, as sparsely labeled pixels are learned by accounting for both labeled and unlabeled information. The collective inference paradigm allows us to manage the spatial correlation between spectral responses of neighboring pixels, as interacting pixels are labeled simultaneously. In particular, the innovative contribution of this study includes: (1) the design of an application-specific co-training schema to use both spectral information and spatial information, iteratively extracted at the object (set of pixels) level via collective inference; (2) the formulation of a spatial-aware example selection schema that accounts for the spatial correlation of predicted labels to augment training sets during iterative learning; and (3) the investigation of a diversity class criterion that allows us to speed up co-training classification. Experimental results validate the accuracy and efficiency of the proposed spectral-spatial, collective, co-training strategy.

NFMCP'13 Proceedings of the 2nd International Conference on New Frontiers in Mining Complex Patterns | 2013

Process mining to forecast the future of running cases

Sonja Pravilovic; Annalisa Appice; Donato Malerba

Processes are everywhere in our daily lives. More and more information about executions of processes is recorded in event logs by several information systems. Process mining techniques are used to analyze historic information hidden in event logs and to provide surprising insights for managers, system developers, auditors, and end users. While existing process mining techniques mainly analyze full process instances (cases), this paper extends the analysis to running cases, which have not yet completed. For running cases, process mining can be used to predict future events. This forecasting ability can provide insights for conformance checking and support decision making. This paper details a process mining approach, which uses predictive clustering to equip an execution scenario with a prediction model. This model accounts for recent events of running cases to predict the characteristics of future events. Several tests with benchmark logs investigate the viability of the proposed approach.

Archive | 2012

An Intelligent System for Real Time Fault Detection in PV Plants

Anna Ciampi; Annalisa Appice; Donato Malerba; Angelo Muolo

The rising need for energy to improve the quality of life has paved the way for the development and incentivization of different kinds of renewable energy technologies. In particular, the recent increase in the number of installed PhotoVoltaic (PV) plants has boosted the marketing of new monitoring systems designed to keep the energy production of PV plants under control. In this paper, we present an intelligent monitoring system, called SUNInspector, which resorts to spatio-temporal data mining techniques in order to monitor the energy production of PV plants and detect possible plant faults in real time. SUNInspector uses spatio-temporal patterns, called trend clusters, to model the trends according to which the energy production of a PV plant varies depending on the region where it is installed (spatial dependence) and the period of the year of the measurements (temporal dependence). Each time a PV plant transmits its energy production measurement, the risk of a plant fault is measured by evaluating the persistence of a high difference between the real production and the expected production. A case study with PV plants distributed over the South of Italy is illustrated.
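The persistence-based risk measure described above can be sketched as follows. This is a minimal illustration only: the abstract does not say how persistence is quantified, so the function name `fault_risk`, the `threshold` and `window` parameters, and the choice of scoring risk as the fraction of recent readings with a large gap are all assumptions, not SUNInspector's actual method.

```python
from typing import Sequence


def fault_risk(actual: Sequence[float], expected: Sequence[float],
               threshold: float, window: int) -> float:
    """Hypothetical risk score in [0, 1].

    Returns the fraction of the last `window` measurements whose absolute
    gap between expected and actual production exceeds `threshold`.
    A gap that persists over many recent readings yields a high score.
    """
    if len(actual) != len(expected):
        raise ValueError("actual and expected must have the same length")
    recent = list(zip(actual, expected))[-window:]
    if not recent:
        return 0.0
    high_gaps = sum(1 for a, e in recent if abs(e - a) > threshold)
    return high_gaps / len(recent)


# Production drops and stays low over the last three readings.
risk = fault_risk(actual=[10, 9, 4, 3, 2],
                  expected=[10, 10, 10, 10, 10],
                  threshold=3, window=4)
```

In this sketch the expected production would come from the trend cluster that the plant belongs to; here it is passed in directly as a list.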

Revised Selected Papers of the 4th International Workshop on New Frontiers in Mining Complex Patterns - Volume 9607 | 2015

Discovering and Tracking Organizational Structures in Event Logs

Annalisa Appice; Marco Di Pietro; Claudio Greco; Donato Malerba

The goal of process mining is to extract process-related information by observing events recorded in event logs. An event is an activity initiated or completed by a resource at a certain time point. Organizational mining is a subfield of process mining that focuses on the organizational perspective of a business process. It considers the resource attribute and derives a profile that characterizes the behavior of a resource in a specific business process. By relating resources associated with correlated profiles, it is possible to define a social network. This paper focuses on the idea of performing organizational mining of event logs via social network mining. It presents a framework that resorts to a stream representation of an event log. It adapts the time-based window model to process this stream, so that window-based social resource networks can be constructed, in order to represent interactions between resources operating at the data window level. Finally, it integrates specific algorithms, in order to discover overlapping communities of resources and track the evolution of these communities over consecutive windows. This paper applies the defined framework to two real event logs.
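The window-based construction of resource networks can be illustrated with a small sketch. It assumes that events are `(timestamp, resource, activity)` tuples with integer timestamps, that a resource profile is its per-window activity count vector, and that two resources are linked when the Pearson correlation of their profiles reaches `min_corr`. These choices, and the names `window_networks` and `pearson`, are hypothetical; the abstract does not specify the profile or the correlation measure, and community discovery and tracking are not covered here.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import sqrt


def pearson(u, v):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su and sv else 0.0


def window_networks(events, width, min_corr):
    """Map each time window index to a set of (resource, resource) edges."""
    activities = sorted({act for _, _, act in events})
    # profiles[window][resource] counts how often the resource ran each activity
    profiles = defaultdict(lambda: defaultdict(Counter))
    for ts, res, act in events:
        profiles[ts // width][res][act] += 1
    networks = {}
    for w, by_res in sorted(profiles.items()):
        edges = set()
        for r1, r2 in combinations(sorted(by_res), 2):
            u = [by_res[r1][a] for a in activities]
            v = [by_res[r2][a] for a in activities]
            if pearson(u, v) >= min_corr:
                edges.add((r1, r2))
        networks[w] = edges
    return networks


events = [(0, "ann", "review"), (1, "bob", "review"), (2, "cat", "ship"),
          (10, "ann", "ship"), (11, "cat", "ship"), (12, "bob", "review")]
nets = window_networks(events, width=10, min_corr=0.9)
```

In this toy log, ann works like bob in the first window and like cat in the second, so the edge moves between windows. That kind of change is what tracking communities over consecutive windows would pick up.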

MSM'10/MUSE'10 Proceedings of the 2010 international conference on Analysis of social media and ubiquitous data | 2010

Online and offline trend cluster discovery in spatially distributed data streams

Anna Ciampi; Annalisa Appice; Donato Malerba

Emerging real-life applications, such as environmental compliance, ecological studies and meteorology, are characterized by real-time data acquisition through remote sensor networks. The most important aspect of the sensor readings is that they comprise a space dimension and a time dimension which are both information bearing. Additionally, they usually arrive at a rapid rate in a continuous, unbounded stream. Streaming prevents us from storing all readings and performing multiple scans of the entire data set. The drift of data distribution poses the additional problem of mining patterns which may change over time. We address these challenges for trend cluster discovery, that is, the discovery of clusters of spatially close sensors which transmit readings whose temporal variation, called trend polyline, is similar along the time horizon of a window. We present a stream framework which segments the stream into equally-sized windows, computes online intra-window trend clusters and stores these trend clusters in a database. Trend clusters are queried offline at any time, to determine trend clusters along larger windows (i.e. windows of windows). Experiments with several streams demonstrate the effectiveness of the proposed framework in discovering trend clusters that are accurate and relevant to humans.
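The online step described above (equally-sized windows, then clusters of nearby sensors with similar trend polylines) can be sketched roughly as below. This is a simplified greedy grouping written for illustration, not the authors' algorithm: the function names, the `radius` and `delta` parameters, and the rule of comparing each sensor to the first member of a cluster are assumptions.

```python
from math import dist


def split_windows(snapshots, size):
    """Cut the stream into consecutive windows of `size` snapshots each.

    A trailing partial window is dropped.
    """
    return [snapshots[i:i + size]
            for i in range(0, len(snapshots) - size + 1, size)]


def trend_clusters(positions, window, radius, delta):
    """Greedily group sensors that are spatially close and trend alike.

    positions: sensor id -> (x, y) coordinates
    window:    list of snapshots, each mapping sensor id -> reading
    A sensor joins the first cluster that has a member within `radius`
    and whose seed polyline differs from its own by at most `delta`
    at every time point of the window.
    """
    polylines = {s: [snap[s] for snap in window] for s in positions}
    clusters = []
    for sensor in sorted(positions):
        for cluster in clusters:
            seed = cluster[0]
            near = any(dist(positions[sensor], positions[m]) <= radius
                       for m in cluster)
            similar = all(abs(a - b) <= delta
                          for a, b in zip(polylines[sensor], polylines[seed]))
            if near and similar:
                cluster.append(sensor)
                break
        else:
            clusters.append([sensor])
    return clusters


positions = {"a": (0, 0), "b": (1, 0), "c": (10, 0)}
stream = [{"a": 1.0, "b": 1.1, "c": 1.0}, {"a": 2.0, "b": 2.1, "c": 2.0},
          {"a": 5.0, "b": 9.0, "c": 5.0}, {"a": 6.0, "b": 9.5, "c": 6.0}]
wins = split_windows(stream, size=2)
first = trend_clusters(positions, wins[0], radius=2, delta=0.5)
second = trend_clusters(positions, wins[1], radius=2, delta=0.5)
```

The result of the greedy pass depends on the order in which sensors are visited. Storing one cluster list per window, as here, is what would let an offline query later combine them over larger windows.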

Information Sciences | 2018

Active learning via collective inference in network regression problems

Annalisa Appice; Corrado Loglisci; Donato Malerba

Active learning is a promising machine learning paradigm for querying oracles and obtaining actual labels for particular examples. Its goal is to decrease the number of labels needed, in order to learn a predictive model able to achieve a high level of accuracy. It may turn out to be advantageous in several regression problems where scarce labels can be acquired. A novel active learning algorithm for regression problems in network data is defined. This algorithm performs active learning by explicitly taking into account the correlation property of network data, which makes the labels of linked nodes related to each other. Specifically, it resorts to collective inference, in order to accommodate the data correlation in the active selection of the network nodes labeled by oracles. The empirical study shows that the proposed combination of active learning and collective inference can actually boost regression performance in various network domains.

A Comprehensive Guide Through the Italian Database Research | 2018

Relational Data Mining in the Era of Big Data

Annalisa Appice; Michelangelo Ceci; Donato Malerba

The aim of this article is to synthetically describe a sample of distinct approaches and applications of Relational Data Mining, which address the issue of managing complex, and possibly big, amounts of data. Specifically, we report a brief review of the literature on Relational Data Mining in the fields of Spatial Data Mining, Process Mining, Network Data Analysis and Stream Data Mining, with an emphasis on the Italian research. For each field, we describe the milestones that have been reached, as well as the future research trends that are fuelled by the emergent ubiquity of Big Data.

International Workshop on New Frontiers in Mining Complex Patterns | 2016

Mining Spatio-Temporal Patterns of Periodic Changes in Climate Data

Corrado Loglisci; Michelangelo Ceci; Angelo Impedovo; Donato Malerba

Climate change has always attracted interest because it may have a great impact on life on Earth and on living beings. Computational solutions may be useful both for the prediction of climate changes and for their characterization, perhaps in association with other phenomena. Due to the cyclic and seasonal nature of many climate processes, studying their repeatability may be relevant and, in many cases, decisive. In this paper, we investigate the task of determining changes of the weather conditions, which are periodically repeated over time and space. We introduce the spatio-temporal patterns of periodic changes and propose a computational solution to discover them. These patterns allow us to represent spatial regions with the same periodic changes. The method works on a grid-based data representation and relies on a time-windows analysis model to detect periodic changes in the grid cells. Then, the cells with the same changes are selected to form a spatial region of interest. The usefulness of the method is demonstrated on a real-world dataset collecting weather conditions.

MSM/MUSE | 2012

Using Geographic Cost Functions to Discover Vessel Itineraries from AIS Messages

Annalisa Appice; Donato Malerba; Antonietta Lanza

With the development of AIS (Automatic Identification System), more and more vessels are equipped with AIS technology. Vessels’ reports (e.g. position in geodetic coordinates, speed, course), periodically transmitted by AIS, have become an abundant and inexpensive source of ubiquitous motion information for maritime surveillance. In this study, we investigate the problem of processing the ubiquitous data, which are enclosed in the AIS messages of a vessel, in order to display an interpolation of the itinerary of the vessel. We define a graph-aware itinerary mining strategy, which uses spatio-temporal knowledge enclosed in each AIS message to constrain the itinerary search. Experiments investigate the impact of the proposed spatio-temporal data mining algorithm on the accuracy and efficiency of the itinerary interpolation process, also when reducing the amount of AIS messages processed per vessel.

Archive | 2018

Handling Multi-scale Data via Multi-target Learning for Wind Speed Forecasting

Annalisa Appice; Antonietta Lanza; Donato Malerba

Wind speed forecasting is particularly important for wind farms due to cost-related issues, dispatch planning, and energy market operations. This paper presents a multi-target learning method, in order to model historical wind speed data and yield accurate forecasts of the wind speed on the day-ahead (24 h) horizon. The proposed method is based on the analysis of historical data, which are represented at multiple scales in both space and time. Handling multi-scale data allows us to leverage the knowledge hidden in both the spatial and temporal variability of the shared information, in order to identify spatio-temporal aided patterns that contribute to yielding accurate wind speed forecasts. The viability of the presented method is evaluated by considering benchmark data. Specifically, the empirical study shows that learning from multi-scale historical data allows us to determine accurate wind speed forecasts.
