Milan Vukicevic | Researchain

Publication

Featured research published by Milan Vukicevic.

International Journal of Computational Intelligence Systems | 2012

Using data mining on student behavior and cognitive style data for improving e-learning systems: a case study

Milos Jovanovic; Milan Vukicevic; Miloš Milovanović; Miroslav Minović

In this research we applied classification models to predict students’ performance, and cluster models to group students by their cognitive styles in an e-learning environment. The classification models described in this paper should help teachers, students, and business people engage early with students who are likely to excel in a selected topic. Clustering students based on cognitive styles and their overall performance should enable better adaptation of the learning materials to their learning styles. The approach is tested using well-established data mining algorithms and evaluated by several evaluation measures. The model building process included data preprocessing, parameter optimization, and attribute selection steps, which enhanced the overall performance. Additionally, we propose a Moodle module that allows automatic extraction of the data needed for educational data mining analysis and deploys the models developed in this study.

Artificial Intelligence Review | 2009

Reusable components for partitioning clustering algorithms

Boris Delibasic; Kathrin Kirchner; Johannes Ruhland; Milos Jovanovic; Milan Vukicevic

Clustering algorithms are well-established and widely used for solving data-mining tasks. Every clustering algorithm is composed of several solutions for specific sub-problems in the clustering process. These solutions are linked together in a clustering algorithm, and they define the process and the structure of the algorithm. Frequently, many of these solutions occur in more than one clustering algorithm. New clustering algorithms mostly include frequently occurring solutions to typical sub-problems from clustering, as well as from other machine-learning algorithms. The problem is that these solutions are usually integrated into their algorithms, and that the original algorithms are not designed to share solutions to sub-problems easily outside the original algorithm. We propose a way of designing cluster algorithms, and of improving existing ones, based on reusable components. Reusable components are well-documented, frequently occurring solutions to specific sub-problems in a specific area. Thus we first identify reusable components as solutions to characteristic sub-problems in partitioning cluster algorithms, and then identify a generic structure for the design of partitioning cluster algorithms. We analyze some partitioning algorithms (K-means, X-means, MPCK-means, and Kohonen SOM) and identify reusable components in them. We give examples of how new cluster algorithms can be designed based on them.
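As a rough illustration of the idea, the following Python sketch builds a K-means-like partitioning loop from three swappable parts. The component names (`init_first_k`, `squared_euclidean`, `manhattan`, `update_mean`) and the function `partition_cluster` are hypothetical and are not taken from the paper; the paper's actual component set and generic structure may differ.

```python
from typing import Callable, List, Sequence, Tuple

Point = Sequence[float]


# --- Hypothetical reusable components (illustrative names only) ---

def init_first_k(points: Sequence[Point], k: int) -> List[List[float]]:
    """Initialization component: take the first k points as representatives."""
    return [list(p) for p in points[:k]]


def squared_euclidean(a: Point, b: Point) -> float:
    """Distance component: squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def manhattan(a: Point, b: Point) -> float:
    """Distance component: Manhattan (city-block) distance."""
    return sum(abs(x - y) for x, y in zip(a, b))


def update_mean(members: Sequence[Point]) -> List[float]:
    """Update component: the new representative is the coordinate-wise mean."""
    n = len(members)
    return [sum(col) / n for col in zip(*members)]


# --- Generic structure that links the components together ---

def partition_cluster(
    points: Sequence[Point],
    k: int,
    initialize: Callable[[Sequence[Point], int], List[List[float]]],
    distance: Callable[[Point, Point], float],
    update: Callable[[Sequence[Point]], List[float]],
    max_iter: int = 100,
) -> Tuple[List[int], List[List[float]]]:
    centers = initialize(points, k)
    labels: List[int] = []
    for _ in range(max_iter):
        # Assign each point to its nearest representative.
        new_labels = [
            min(range(k), key=lambda c: distance(p, centers[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so stop
        labels = new_labels
        # Recompute each representative from its members (empty clusters keep theirs).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = update(members)
    return labels, centers


points = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels, centers = partition_cluster(points, 2, init_first_k, squared_euclidean, update_mean)
print(labels, centers)
```

Passing `manhattan` instead of `squared_euclidean` changes one component without touching the loop, which is the kind of exchange the abstract describes.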

PLOS ONE | 2016

Scalable Predictive Analysis in Critically Ill Patients Using a Visual Open Data Analysis Platform

Sven Van Poucke; Zhongheng Zhang; Martin Schmitz; Milan Vukicevic; Margot Vander Laenen; Leo Anthony Celi; Cathy De Deyne

With the accumulation of large amounts of health-related data, predictive analytics could stimulate the transformation of reactive medicine towards Predictive, Preventive and Personalized Medicine (PPPM), ultimately affecting both cost and quality of care. However, the high dimensionality and high complexity of the data involved prevent data-driven methods from easy translation into clinically relevant models. Additionally, the application of cutting-edge predictive methods and data manipulation requires substantial programming skills, limiting its direct exploitation by medical domain experts. This leaves a gap between potential and actual data usage. In this study, the authors address this problem by focusing on open, visual environments suited to be applied by the medical community. Moreover, we review code-free applications of big data technologies. As a showcase, a framework was developed for the meaningful use of data from critical care patients by integrating the MIMIC-II database in a data mining environment (RapidMiner) supporting scalable predictive analytics using visual tools (RapidMiner’s Radoop extension). Guided by the CRoss-Industry Standard Process for Data Mining (CRISP-DM), the ETL process (Extract, Transform, Load) was initiated by retrieving data from the MIMIC-II tables of interest. As a use case, the correlation of platelet count and ICU survival was quantitatively assessed. Using visual tools for ETL on Hadoop and predictive modeling in RapidMiner, we developed robust processes for automatic building, parameter optimization, and evaluation of various predictive models under different feature selection schemes. Because these processes can be easily adopted in other projects, this environment is attractive for scalable predictive analytics in health research.

Journal of Medical Internet Research | 2016

Are Randomized Controlled Trials the (G)old Standard? From Clinical Intelligence to Prescriptive Analytics

Sven Van Poucke; Michiel Thomeer; John Heath; Milan Vukicevic

Despite the accelerating pace of scientific discovery, the current clinical research enterprise does not sufficiently address pressing clinical questions. Given the constraints on clinical trials, for a majority of clinical questions, the only relevant data available to aid in decision making are based on observation and experience. Our purpose here is 3-fold. First, we describe the classic context of medical research guided by Popper’s scientific epistemology of “falsificationism.” Second, we discuss challenges and shortcomings of randomized controlled trials and present the potential of observational studies based on big data. Third, we cover several obstacles related to the use of observational (retrospective) data in clinical studies. We conclude that randomized controlled trials are not at risk of extinction, but innovations in statistics, machine learning, and big data analytics may generate a completely new ecosystem for exploration and validation.

Data and Knowledge Engineering | 2012

An architecture for component-based design of representative-based clustering algorithms

Boris Delibasic; Milan Vukicevic; Milos Jovanovic; Kathrin Kirchner; Johannes Ruhland; Milija Suknovic

We propose an architecture for the design of representative-based clustering algorithms based on reusable components. These components were derived from K-means-like algorithms and their extensions. With the suggested clustering design architecture, it is possible to reconstruct popular algorithms, but also to build new algorithms by exchanging components from original algorithms and their improvements. In this way, the design of a myriad of representative-based clustering algorithms and their fair comparison and evaluation are possible. In addition to the architecture, we show the usefulness of the proposed approach by providing experimental evaluation.

Artificial Intelligence in Medicine | 2016

Building interpretable predictive models for pediatric hospital readmission using Tree-Lasso logistic regression.

Milos Jovanovic; Sandro Radovanovic; Milan Vukicevic; Sven Van Poucke; Boris Delibasic

OBJECTIVES: Quantification and early identification of unplanned readmission risk have the potential to improve the quality of care during hospitalization and after discharge. However, high dimensionality, sparsity, and class imbalance of electronic health data and the complexity of risk quantification challenge the development of accurate predictive models. Predictive models require a certain level of interpretability in order to be applicable in real settings and create actionable insights. This paper aims to develop accurate and interpretable predictive models for readmission in a general pediatric patient population by integrating a data-driven model (sparse logistic regression) and domain knowledge based on the International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM) hierarchy of diseases. Additionally, we propose a way to quantify the interpretability of a model and inspect the stability of alternative solutions.

MATERIALS AND METHODS: The analysis was conducted on >66,000 pediatric hospital discharge records from California, State Inpatient Databases, Healthcare Cost and Utilization Project between 2009 and 2011. We incorporated domain knowledge based on the ICD-9-CM hierarchy in a data-driven, Tree-Lasso regularized logistic regression model, providing the framework for model interpretation. This approach was compared with traditional Lasso logistic regression and resulted in models that are easier to interpret, using fewer high-level diagnoses, with comparable prediction accuracy.

RESULTS: The results revealed that the Tree-Lasso model was competitive with traditional Lasso logistic regression in terms of accuracy (measured by area under the receiver operating characteristic curve, AUC), while integration with the ICD-9-CM hierarchy of diseases provided more interpretable models in terms of high-level diagnoses. Additionally, interpretations of the models are in accordance with existing medical understanding of pediatric readmission. The best performing models have similar performance, reaching AUC values of 0.783 and 0.779 for traditional Lasso and Tree-Lasso, respectively. However, the information loss of the Lasso models is 0.35 bits higher than that of the Tree-Lasso model.

CONCLUSIONS: We propose a method for building predictive models applicable to the detection of readmission risk based on electronic health records. Integration of domain knowledge (in the form of the ICD-9-CM taxonomy) and a data-driven, sparse predictive algorithm (Tree-Lasso logistic regression) resulted in an increase in the interpretability of the resulting model. The models are interpreted for the readmission prediction problem in the general pediatric population in California, as well as in several important subpopulations, and the interpretations of the models comply with existing medical understanding of pediatric readmission. Finally, a quantitative assessment of the interpretability of the models is given that goes beyond simple counts of selected low-level features.
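For readers unfamiliar with the accuracy measure quoted above, the sketch below computes AUC from its pairwise definition: the fraction of (readmitted, not readmitted) pairs in which the readmitted case receives the higher score, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are illustrative assumptions; they are not the paper's data or code.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    # Count a win as 1, a tie as 0.5, a loss as 0 over all positive/negative pairs.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example: 1 = readmitted, 0 = not readmitted (made-up values).
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.1]
print(roc_auc(labels, scores))  # 3 of 4 pairs are ranked correctly -> 0.75
```

This pairwise form is quadratic in the number of records, so it is meant for illustration rather than for data sets of the size used in the study.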

Artificial Intelligence in Medicine in Europe | 2015

Domain knowledge Based Hierarchical Feature Selection for 30-Day Hospital Readmission Prediction

Sandro Radovanovic; Milan Vukicevic; Ana Kovacevic; Gregor Stiglic; Zoran Obradovic

Many studies fail to provide models for 30-day hospital readmission prediction with satisfactory performance due to high dimensionality and sparsity. Efficient feature selection techniques allow better generalization of predictive models and improved interpretability, which is a very important property for applications in health care. We propose a feature selection method that exploits hierarchical domain knowledge together with data. The new method is evaluated on predicting 30-day hospital readmission for pediatric patients from California and provides evidence that a knowledge-based approach outperforms traditional methods and that the newly proposed method is competitive with state-of-the-art methods.

Knowledge and Information Systems | 2013

Finding best algorithmic components for clustering microarray data

Milan Vukicevic; Kathrin Kirchner; Boris Delibasic; Milos Jovanovic; Johannes Ruhland; Milija Suknovic

The analysis of microarray data is fundamental to microbiology. Although clustering has long been recognized as central to the discovery of gene functions and disease diagnostics, researchers have found the construction of good algorithms a surprisingly difficult task. In this paper, we address this problem by using a component-based approach to clustering algorithm design for class retrieval from microarray data. The idea is to break up existing algorithms into independent building blocks for typical sub-problems, which are in turn reassembled in new ways to generate as yet unexplored methods. As a test, 432 algorithms were generated and evaluated on published microarray data sets. We found their top performers to be better than the original, component-providing ancestors and also competitive with a set of recently proposed algorithms. Finally, we identified components that showed consistently good performance for clustering microarray data and that should be considered in further development of clustering algorithms.
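The reassembly step can be pictured as taking the Cartesian product of the available choices for each sub-problem. The sketch below uses a small, hypothetical component catalogue (the sub-problem names and options are assumptions, not the paper's), so it yields 12 combinations rather than the 432 reported.

```python
from itertools import product
from typing import Dict, List

# Hypothetical catalogue: each sub-problem maps to its interchangeable components.
COMPONENTS: Dict[str, List[str]] = {
    "initialize": ["random", "first_k"],
    "distance": ["euclidean", "manhattan", "cosine"],
    "update": ["mean", "median"],
}


def enumerate_algorithms(components: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Return one configuration per combination of component choices."""
    names = list(components)
    return [
        dict(zip(names, choice))
        for choice in product(*(components[name] for name in names))
    ]


algorithms = enumerate_algorithms(COMPONENTS)
print(len(algorithms))  # 2 * 3 * 2 = 12 candidate algorithms
print(algorithms[0])
```

Each resulting configuration would then be run and scored on the data sets, which is how consistently good components can be singled out.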

The Scientific World Journal | 2014

Cloud based metalearning system for predictive modeling of biomedical data.

Milan Vukicevic; Sandro Radovanovic; Miloš Milovanović; Miroslav Minović

Rapid growth and storage of biomedical data have enabled many opportunities for predictive modeling and improvement of healthcare processes. On the other hand, analysis of such large amounts of data is a difficult and computationally intensive task for most existing data mining algorithms. This problem is addressed by proposing a cloud-based system that integrates a metalearning framework for ranking and selecting the best predictive algorithms for the data at hand with open source big data technologies for analysis of biomedical data.

Bioinformatics and Biomedicine | 2011

Internal Evaluation Measures as Proxies for External Indices in Clustering Gene Expression Data

Milan Vukicevic; Boris Delibasic; Milos Jovanovic; Milija Suknovic; Zoran Obradovic

Several external indices that use information not present in the dataset were shown to be useful for evaluation of representative-based clustering algorithms. However, such supervised measures are not directly useful for construction of better clustering algorithms when class labels are not provided. We propose a method for identifying internal cluster evaluation measures that use only information present in the dataset and are related to given external indices. We utilize these internal measures for the construction of representative-based clustering algorithms. Both the identification and utilization steps of the proposed method are enabled by use of a component-based clustering algorithm design. Experiments on 432 algorithms using gene expression data sets provide evidence that some internal measures could be used as surrogates for external indices proposed in the literature. Moreover, the obtained results suggest that internal measures correlated to selected external indices can guide the algorithms toward significantly better cluster models.
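To make the distinction concrete, the sketch below computes one external index (the Rand index, which needs true class labels) and one internal measure (within-cluster sum of squared errors, which needs only the data) for two candidate clusterings. These two measures and the toy data are chosen here for illustration; the abstract does not say which measures the paper uses.

```python
from itertools import combinations
from typing import List, Sequence

Point = Sequence[float]


def rand_index(truth: Sequence[int], predicted: Sequence[int]) -> float:
    """External index: share of point pairs on which the two labelings agree."""
    pairs = list(combinations(range(len(truth)), 2))
    agree = sum(
        (truth[i] == truth[j]) == (predicted[i] == predicted[j]) for i, j in pairs
    )
    return agree / len(pairs)


def within_cluster_sse(points: Sequence[Point], labels: Sequence[int]) -> float:
    """Internal measure: squared distance of each point to its cluster centroid."""
    total = 0.0
    for cluster in set(labels):
        members: List[Point] = [p for p, lab in zip(points, labels) if lab == cluster]
        centroid = [sum(col) / len(members) for col in zip(*members)]
        total += sum(
            sum((x - m) ** 2 for x, m in zip(p, centroid)) for p in members
        )
    return total


points = [(0, 0), (0, 1), (10, 10), (10, 11)]
truth = [0, 0, 1, 1]       # class labels, normally unavailable when clustering
good = [0, 0, 1, 1]        # candidate clustering A
bad = [0, 1, 0, 1]         # candidate clustering B

for name, labels in (("A", good), ("B", bad)):
    print(name, rand_index(truth, labels), within_cluster_sse(points, labels))
```

On this toy data the clustering with the lower internal error also has the higher Rand index, which is the kind of relationship that would let an internal measure stand in for an external one when labels are missing.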
