Indre Zliobaite | Researchain

Publication

Featured researches published by Indre Zliobaite.

IEEE Transactions on Neural Networks | 2014

Active Learning With Drifting Streaming Data

Indre Zliobaite; Albert Bifet; Bernhard Pfahringer; Geoffrey Holmes

In learning to classify streaming data, obtaining true labels may require major effort and may incur excessive cost. Active learning focuses on carefully selecting as few labeled instances as possible for learning an accurate predictive model. Streaming data poses additional challenges for active learning, since the data distribution may change over time (concept drift) and models need to adapt. Conventional active learning strategies concentrate on querying the most uncertain instances, which are typically concentrated around the decision boundary. Changes occurring further from the boundary may be missed, and models may fail to adapt. This paper presents a theoretically supported framework for active learning from drifting data streams and develops three active learning strategies for streaming data that explicitly handle concept drift. They are based on uncertainty, dynamic allocation of labeling efforts over time, and randomization of the search space. We empirically demonstrate that these strategies react well to changes that can occur anywhere in the instance space and unexpectedly.
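
The three ingredients named in the abstract (uncertainty, allocation of labeling effort over time, and randomization) can be illustrated with a small sketch. This is a hedged illustration, not the paper's algorithms: the class name `RandomizedUncertaintySampler`, the parameters `budget`, `step`, and `noise_std`, and the multiplicative threshold update are all assumptions chosen for the example.

```python
import random


class RandomizedUncertaintySampler:
    """Decide, one instance at a time, whether to request a true label.

    Illustrative sketch only; all parameter names and defaults are assumptions.
    """

    def __init__(self, budget=0.1, step=0.01, noise_std=1.0, seed=0):
        self.budget = budget        # maximum fraction of instances to label
        self.step = step            # relative threshold adjustment per decision
        self.noise_std = noise_std  # spread of the random threshold multiplier
        self.threshold = 1.0        # current confidence threshold
        self.seen = 0               # instances observed so far
        self.labeled = 0            # labels requested so far
        self._rng = random.Random(seed)

    def should_query(self, confidence):
        """confidence: the model's highest class probability for the instance."""
        self.seen += 1
        # Budget check: stop querying while the labeling fraction is used up.
        if self.labeled / self.seen >= self.budget:
            return False
        # Randomize the threshold so that instances far from the decision
        # boundary are still queried occasionally.
        randomized = self.threshold * self._rng.gauss(1.0, self.noise_std)
        if confidence < randomized:
            self.labeled += 1
            self.threshold *= 1.0 - self.step  # queried: lower the threshold
            return True
        self.threshold *= 1.0 + self.step      # not queried: raise the threshold
        return False
```

In use, a stream loop would call `should_query` with the classifier's confidence for each arriving instance and update the model only when it returns `True`. The random multiplier is what lets a change far from the decision boundary eventually draw a label.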

IEEE Transactions on Neural Networks | 2014

Dealing With Concept Drifts in Process Mining

R. P. Jagadeesh Chandra Bose; Wil M. P. van der Aalst; Indre Zliobaite; Mykola Pechenizkiy

Although most business processes change over time, contemporary process mining techniques tend to analyze these processes as if they were in a steady state. Processes may change suddenly or gradually. The drift may be periodic (e.g., because of seasonal influences) or one-of-a-kind (e.g., the effects of new legislation). For process management, it is crucial to discover and understand such concept drifts in processes. This paper presents a generic framework and specific techniques to detect when a process changes and to localize the parts of the process that have changed. Different features are proposed to characterize relationships among activities. These features are used to discover differences between successive populations. The approach has been implemented as a plug-in of the ProM process mining framework and has been evaluated using both simulated event data exhibiting controlled concept drifts and real-life event data from a Dutch municipality.
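
The idea of comparing successive populations of feature values can be sketched as follows. The abstract does not specify the features or the statistical tests, so this example makes assumptions: it takes a hypothetical numeric feature computed per trace, slides two adjacent windows over it, and uses a two-sample Kolmogorov-Smirnov statistic as a stand-in comparison. The function names `ks_statistic` and `locate_change` are illustrative.

```python
def ks_statistic(a, b):
    """Largest gap between the empirical CDFs of samples a and b."""
    points = sorted(set(a) | set(b))
    return max(
        abs(sum(x <= p for x in a) / len(a) - sum(x <= p for x in b) / len(b))
        for p in points
    )


def locate_change(series, window):
    """Return (index, statistic) where two adjacent windows differ the most.

    series: one feature value per trace, in order of occurrence (assumed input).
    window: number of traces in each of the two compared populations.
    """
    best_index, best_stat = None, -1.0
    for t in range(window, len(series) - window + 1):
        left = series[t - window:t]    # population just before position t
        right = series[t:t + window]   # population just after position t
        stat = ks_statistic(left, right)
        if stat > best_stat:
            best_index, best_stat = t, stat
    return best_index, best_stat
```

A real detector would also need a significance threshold to decide whether the largest gap indicates a drift at all; this sketch only localizes the most likely change point.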

IEEE Transactions on Knowledge and Data Engineering | 2014

Adaptive Preprocessing for Streaming Data

Indre Zliobaite; Bogdan Gabrys

Many supervised learning approaches that adapt to changes in data distribution over time (e.g., concept drift) have been developed. The majority of them assume that the data comes already preprocessed or that preprocessing is an integral part of a learning algorithm. In real application tasks, data that comes from, e.g., sensor readings is typically noisy and contains missing values and redundant features, and a very large part of model development effort is devoted to data preprocessing. As data is evolving over time, learning models need to be able to adapt to changes automatically. From a practical perspective, automating a predictor makes little sense if preprocessing requires manual adjustment over time. Nevertheless, adaptation of preprocessing has been largely overlooked in research. In this paper, we introduce and address the problem of adaptive preprocessing. We analyze when and under what circumstances it is beneficial to handle adaptivity of preprocessing and adaptivity of the learning model separately. We present three scenarios where handling adaptive preprocessing separately benefits the final prediction accuracy and illustrate them using computational examples. As a result of our analysis, we construct a prototype approach for combining adaptive preprocessing with an adaptive predictor online. Our case study with real sensory data from a production process demonstrates that decoupling the adaptivity of preprocessing and the predictor contributes to improving the prediction accuracy. The developed reference framework and our experimental findings are intended to serve as a starting point in systematic research of adaptive preprocessing mechanisms for adaptive learning with evolving data.

IEEE Transactions on Knowledge and Data Engineering | 2013

Predictive Handling of Asynchronous Concept Drifts in Distributed Environments

Hock Hee Ang; Vivekanand Gopalkrishnan; Indre Zliobaite; Mykola Pechenizkiy; Steven C. H. Hoi

In a distributed computing environment, peers collaboratively learn to classify concepts of interest from each other. When external changes happen and their concepts drift, the peers should adapt to avoid an increase in misclassification errors. The problem of adaptation becomes more difficult when the changes are asynchronous, i.e., when peers experience drifts at different times. We address this problem by developing an ensemble approach, PINE, that combines reactive adaptation via drift detection with proactive handling of upcoming changes via early warning and adaptation across the peers. In an empirical study on simulated and real-world data sets, we show that PINE handles asynchronous concept drifts better and faster than current state-of-the-art approaches, which have been designed to work in less challenging environments. In addition, PINE is parameter insensitive and incurs less communication cost while achieving better accuracy.

Knowledge and Information Systems | 2013

Quantifying explainable discrimination and removing illegal discrimination in automated decision making

Faisal Kamiran; Indre Zliobaite; Toon Calders

Recently, the following discrimination-aware classification problem was introduced. Historical data used for supervised learning may contain discrimination, for instance, with respect to gender. The question addressed by discrimination-aware techniques is, given a sensitive attribute, how to train discrimination-free classifiers on historical data that is discriminatory with respect to that attribute. Existing techniques that deal with this problem aim at removing all discrimination and do not take into account that part of the discrimination may be explainable by other attributes. For example, in a job application, the education level of a job candidate could be such an explainable attribute. If the data contain many highly educated male candidates and only a few highly educated women, a difference in acceptance rates between women and men does not necessarily reflect gender discrimination, as it could be explained by the different levels of education. Even though selecting on education level would result in more males being accepted, a difference with respect to such a criterion would not be considered undesirable or illegal. Current state-of-the-art techniques, however, do not take such gender-neutral explanations into account and tend to overreact and actually start reverse discriminating, as we will show in this paper. Therefore, we introduce and analyze the refined notion of conditional non-discrimination in classifier design. We show that some of the differences in decisions across the sensitive groups can be explainable and are hence tolerable. Therefore, we develop a methodology for quantifying the explainable discrimination and algorithmic techniques for removing the illegal discrimination when one or more attributes are considered as explanatory. 
Experimental evaluation on synthetic and real-world classification datasets demonstrates that the new techniques are superior to the old ones in this new context, as they succeed in removing almost exclusively the undesirable discrimination, while leaving the explainable differences unchanged, allowing for differences in decisions as long as they are explainable.
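
The job application example can be made concrete with a small sketch that splits the overall difference in acceptance rates into a part explained by one explanatory attribute and a remainder. This is one plausible formulation under stated assumptions, not necessarily the paper's exact definition: the function name `discrimination_breakdown` is hypothetical, and taking the per-stratum acceptance rate as the average over the two groups is an assumption of this example.

```python
from collections import Counter


def discrimination_breakdown(records, favored, other):
    """Split the acceptance-rate gap into explained and unexplained parts.

    records: iterable of (group, explanatory_value, accepted) tuples, where
    every group is either `favored` or `other` (assumed input format).
    Returns (total_gap, explained_gap, remaining_gap).
    """
    total = Counter()     # (group, value) -> number of records
    positive = Counter()  # (group, value) -> number of accepted records
    for group, value, accepted in records:
        total[(group, value)] += 1
        positive[(group, value)] += int(accepted)

    def size(group):
        return sum(n for (g, _), n in total.items() if g == group)

    def acceptance(group):
        return sum(n for (g, _), n in positive.items() if g == group) / size(group)

    d_all = acceptance(favored) - acceptance(other)

    d_explained = 0.0
    for value in {v for (_, v) in total}:
        # Acceptance rate inside this stratum, averaged over the groups present.
        rates = [positive[(g, value)] / total[(g, value)]
                 for g in (favored, other) if total[(g, value)] > 0]
        stratum_rate = sum(rates) / len(rates)
        # How much more common this stratum is in the favored group.
        share_gap = (total[(favored, value)] / size(favored)
                     - total[(other, value)] / size(other))
        d_explained += share_gap * stratum_rate

    return d_all, d_explained, d_all - d_explained
```

When both groups are accepted at the same rate within every education level, the remaining gap is zero even if the overall rates differ. When the education distributions match but the rates differ, the whole gap remains unexplained.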

Computer-Based Medical Systems | 2010

Heart failure hospitalization prediction in remote patient management systems

Mykola Pechenizkiy; Ekaterina Vasilyeva; Indre Zliobaite; Aleksandra Tesanovic; Goran Manev

Healthcare systems are shifting from patient care in hospitals to monitored care at home. This shift is expected to improve the quality of care without exploding the costs. Remote patient management (RPM) systems offer great potential in monitoring patients with chronic diseases, like heart failure or diabetes. Patient modeling in RPM systems opens opportunities in two broad directions: personalizing information services, and alerting medical personnel about the changing conditions of a patient. In this study we focus on heart failure hospitalization (HFH) prediction, which is a particular problem of patient modeling for alerting. We formulate a short-term HFH prediction problem and show how to address it with a data mining approach. We emphasize challenges related to the heterogeneity, different types and periodicity of the data available in RPM systems. We present an experimental study on HFH prediction, the results of which lay a foundation for further studies and implementation of alerting and personalization services in RPM systems.

Evolving Systems | 2013

Introduction to the special issue on handling concept drift in adaptive information systems

Mykola Pechenizkiy; Indre Zliobaite

Modern information systems collect data from multiple sources, process it, extract information and use it for decision support or decision making. Predictive modeling is an important component of an information system that makes the system intelligent. This special issue focuses on adaptive information systems that can adjust their behavior relying on additional mechanisms continuously monitoring the operational setting and/or the performance of predictive models. Adaptive information systems have become ubiquitous in various application areas, including online businesses, personal information access, industry, medicine, education, and defence, in which predictive analytics is an important decision making or decision support component. In the real world, data is often non-stationary. In predictive analytics, machine learning and data mining, the phenomenon of unexpected change in underlying data over time is known as concept drift. Changes in underlying data may occur due to changing personal interests, changes in population, or adversary activities, or they can be attributed to the complex nature of the environment. When data drifts, predictions may become less accurate as time passes or opportunities to improve the accuracy may be missed. Thus, the learning models need to be able to adapt automatically to changes over time. The problem of concept drift is of increasing importance in machine learning and data mining since more and more data is organized in the form of data streams rather than static databases, and it is rather unusual that concepts and data distributions stay stable over long periods of time. It is not surprising that the problem of concept drift has been studied in several research communities including but not limited to machine learning and data mining, data streams, information retrieval, and recommender systems. 
Different approaches for detecting and handling concept drift have been proposed in the literature, and many of them have already proved their potential in a wide range of application domains, e.g. fraud detection, adaptive system control, user modeling, information retrieval, text mining, biomedicine. Moreover, a fast growing scope of applications that rely on data arriving in real time, where very often the problem of data drift is observed and recognized to be important, helped to identify and shape a number of new important research challenges that have not yet been well studied in the research communities. This special issue includes selected contributions from the first and the second International Workshops on Handling Concept Drift in Adaptive Information Systems: HaCDAIS at ECML PKDD 2010 and HaCDAIS at IEEE ICDM 2011. The papers address both methodological issues and practical challenges for handling concept drift, such as (a) label availability, (b) recurring concepts, (c) systematic handling of event detection, and (d) mining changes in customer profiling and medical anesthesia domains. We hope you will find the following papers interesting to read. The first paper, "Drift Detection Using Uncertainty Distribution Divergence" by Patrick Lindstrom, Brian Mac Namee and Sarah Jane Delany, addresses the problem of ...

International Conference on Innovative Techniques and Applications of Artificial Intelligence | 2012

Predicting Multi-class Customer Profiles Based on Transactions: a Case Study in Food Sales

Edward Apeh; Indre Zliobaite; Mykola Pechenizkiy; Bogdan Gabrys

Predicting the class of customer profiles is a key task in marketing, which enables businesses to approach customers in the right way to satisfy their evolving needs. However, due to costs, privacy and/or data protection, only the business's own transactional data is typically available for constructing customer profiles. We present a new approach that is designed to efficiently and accurately handle the multi-class classification of customer profiles built using sparse and skewed transactional data. Our approach first bins the customer profiles on the basis of the number of items transacted. The discovered bins are then partitioned, and prototypes within each of the discovered bins are selected to build the multi-class classifier models. The results obtained from using four multi-class classifiers on real-world transactional data consistently show the critical numbers of items at which the predictive performance of customer profiles can be substantially improved.

Computer-Based Medical Systems | 2010

Handling concept drift in medical applications: Importance, challenges and solutions

Mykola Pechenizkiy; Indre Zliobaite

In the real world, data is often non-stationary. In supervised learning, concept drift means that the statistical properties of the target variable, which the model aims to predict, change over time unexpectedly. This causes problems because the predictions might become less accurate as time passes or opportunities to improve the accuracy might be missed. With the proposed tutorial we intend to reach the following goals: 1) highlight the importance of concept drift handling mechanisms in medical applications; 2) overview existing approaches for handling different types of drift in supervised learning, emphasizing the underlying assumptions that these approaches implicitly or explicitly make about the nature and causes of changes; 3) discuss practical aspects of applying drift handling mechanisms to a wide range of medical applications and present a foreseen development in this field.

International Symposium on Neural Networks | 2017

BLPA: Bayesian learn-predict-adjust method for online detection of recurrent changepoints

Alexandr V. Maslov; Mykola Pechenizkiy; Yulong Pei; Indre Zliobaite; Alexander Shklyaev; Tommi Kärkkäinen; Jaakko Hollmén

Online changepoint detection is an important task for machine learning in changing environments, as it signals when the learning model needs to be updated. The presence of noise that can be mistaken for real changes makes it difficult to develop an effective approach that has a low false alarm rate while detecting all the changes with minimal delay. In this paper we study how the performance of popular Bayesian online detectors can be improved in the case of recurrent changes. Modelling recurrence allows us to anticipate future changepoints and predict their locations in time. We propose an approach for inducing and integrating recurrence information in streaming settings, and demonstrate its effectiveness on synthetic and real-world human activity datasets.
