Miloš Radovanović | Researchain

Publication

Featured research published by Miloš Radovanović.

IEEE Transactions on Knowledge and Data Engineering | 2014

The Role of Hubness in Clustering High-Dimensional Data

Nenad Tomašev; Miloš Radovanović; Dunja Mladenic; Mirjana Ivanović

High-dimensional data arise naturally in many domains, and have regularly presented a great challenge for traditional data mining techniques, both in terms of effectiveness and efficiency. Clustering becomes difficult due to the increasing sparsity of such data, as well as the increasing difficulty in distinguishing distances between data points. In this paper, we take a novel perspective on the problem of clustering high-dimensional data. Instead of attempting to avoid the curse of dimensionality by observing a lower dimensional feature subspace, we embrace dimensionality by taking advantage of inherently high-dimensional phenomena. More specifically, we show that hubness, i.e., the tendency of high-dimensional data to contain points (hubs) that frequently occur in k-nearest-neighbor lists of other points, can be successfully exploited in clustering. We validate our hypothesis by demonstrating that hubness is a good measure of point centrality within a high-dimensional data cluster, and by proposing several hubness-based clustering algorithms, showing that major hubs can be used effectively as cluster prototypes or as guides during the search for centroid-based cluster configurations. Experimental results demonstrate good performance of our algorithms in multiple settings, particularly in the presence of large quantities of noise. The proposed methods are tailored mostly for detecting approximately hyperspherical clusters and need to be extended to properly handle clusters of arbitrary shapes.

international conference on machine learning | 2009

Nearest neighbors in high-dimensional data: the emergence and influence of hubs

Miloš Radovanović; Alexandros Nanopoulos; Mirjana Ivanović

High dimensionality can pose severe difficulties, widely recognized as different aspects of the curse of dimensionality. In this paper we study a new aspect of the curse pertaining to the distribution of k-occurrences, i.e., the number of times a point appears among the k nearest neighbors of other points in a data set. We show that, as dimensionality increases, this distribution becomes considerably skewed and hub points emerge (points with very high k-occurrences). We examine the origin of this phenomenon, showing that it is an inherent property of high-dimensional vector space, and explore its influence on applications based on measuring distances in vector spaces, notably classification, clustering, and information retrieval.
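As a rough illustration of the k-occurrence count described in this abstract, the following Python sketch computes, for every point, how many times it appears in the k-nearest-neighbor lists of the other points, and then measures how skewed that distribution is. The function names, the use of Euclidean distance, and the Gaussian test data are assumptions made for illustration only; they are not taken from the paper.

```python
import math
import random


def k_occurrences(points, k):
    """Count how often each point appears in the k-NN lists of the other points."""
    n = len(points)
    counts = [0] * n
    for i in range(n):
        # Distances from point i to every other point (the point itself is excluded).
        dists = [(math.dist(points[i], points[j]), j) for j in range(n) if j != i]
        dists.sort()
        for _, j in dists[:k]:
            counts[j] += 1
    return counts


def skewness(values):
    """Standardized third moment; a positive value indicates a long right tail."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0:
        return 0.0
    return sum((v - mean) ** 3 for v in values) / n / var ** 1.5


def random_points(n, dim, seed=0):
    """Hypothetical test data: n points with i.i.d. standard normal coordinates."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]


if __name__ == "__main__":
    # Compare the k-occurrence distribution in a low- and a high-dimensional space.
    for dim in (3, 100):
        counts = k_occurrences(random_points(300, dim), k=5)
        print(f"dim={dim} max k-occurrence={max(counts)} skewness={skewness(counts):.2f}")
```

Every point contributes exactly k entries to the neighbor lists, so the counts always sum to n·k; hubness concerns how unevenly that fixed total is spread across points.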

knowledge discovery and data mining | 2011

The role of hubness in clustering high-dimensional data

Nenad Tomašev; Miloš Radovanović; Dunja Mladenic; Mirjana Ivanović

Scientometrics | 2014

The structure and evolution of scientific collaboration in Serbian mathematical journals

Miloš Savić; Mirjana Ivanović; Miloš Radovanović; Zoran Ognjanović; Aleksandar Pejović; Tatjana Jakšić Krüger

Digital preservation of scientific papers enables their wider accessibility, but also provides a valuable source of information that can be used in a longitudinal scientometric study. The Electronic Library of the Mathematical Institute of the Serbian Academy of Sciences and Arts (eLib) digitizes the most prominent mathematical journals printed in Serbia. In this paper, we study a co-authorship network which represents collaborations among authors who published their papers in the eLib journals over an 80-year period (from 1932 to 2011). Such a study enables us to identify patterns and long-term trends in scientific collaborations that are characteristic of a community that mainly consists of Serbian (Yugoslav) mathematicians. Analysis of connected components of the network reveals a topological diversity in the network structure: the network contains a large number of components whose sizes obey a power law; the majority of components are isolated authors or small trivial components, but there is also a small number of relatively large, non-trivial components of connected authors. Our evolutionary analysis shows that the evolution of the network can be divided into six periods that are characterized by different intensity and type of collaborative behavior among eLib authors. Analysis of author metrics shows that betweenness centrality is a better indicator of author productivity and long-term presence in the eLib journals than degree centrality. Moreover, the strength of correlation between productivity metrics and betweenness centrality increases as the network evolves, suggesting that an even stronger correlation can be expected in the future.

Knowledge Based Systems | 2014

The influence of global constraints on similarity measures for time-series databases

Vladimir Kurbalija; Miloš Radovanović; Zoltan Geler; Mirjana Ivanović

A time series consists of a series of values or events obtained over repeated measurements in time. Analysis of time series represents an important tool in many application areas, such as stock-market analysis, process and quality control, observation of natural phenomena, and medical diagnosis. A vital component in many types of time-series analyses is the choice of an appropriate distance/similarity measure. Numerous measures have been proposed to date, with the most successful ones based on dynamic programming. Because these measures have quadratic time complexity, however, global constraints are often employed to limit the search space in the matrix during the dynamic programming procedure, in order to speed up computation. Furthermore, it has been reported that such constrained measures can also achieve better accuracy. In this paper, we investigate four representative time-series distance/similarity measures based on dynamic programming, namely Dynamic Time Warping (DTW), Longest Common Subsequence (LCS), Edit distance with Real Penalty (ERP) and Edit Distance on Real sequence (EDR), and the effects of global constraints on them when applied via the Sakoe-Chiba band. To better understand the influence of global constraints and provide deeper insight into their advantages and limitations, we explore the change of the 1-nearest neighbor graph with respect to the change of the constraint size. We also examine how these changes reflect on the classes of the nearest neighbors of time series, and evaluate the performance of the 1-nearest neighbor classifier with respect to different distance measures and constraints. Since we determine that constraints introduce qualitative differences in all considered measures, and that different measures are affected by constraints in various ways, we expect our results to aid researchers and practitioners in selecting and tuning appropriate time-series similarity measures for their respective tasks.
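To make the idea of a global constraint concrete, here is a minimal Python sketch of Dynamic Time Warping restricted to a Sakoe-Chiba band, where only matrix cells with |i - j| <= window are filled in. The function name, the absolute-difference local cost, and the widening of the band for sequences of unequal length are assumptions made for this sketch and are not taken from the paper.

```python
import math


def dtw_sakoe_chiba(a, b, window):
    """DTW distance between numeric sequences a and b, with warping limited to |i - j| <= window."""
    n, m = len(a), len(b)
    # Assumption: widen the band to at least |n - m| so the final cell stays reachable.
    w = max(window, abs(n - m))
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        # Only cells inside the band are computed; all others keep an infinite cost.
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = abs(a[i - 1] - b[j - 1])  # local cost (assumed: absolute difference)
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


if __name__ == "__main__":
    x = [0, 0, 1, 2, 1, 0]
    y = [0, 1, 2, 1, 0, 0]
    # window=0 allows only the diagonal; a wider band lets the shifted peak align.
    for window in (0, 1, 3):
        print(window, dtw_sakoe_chiba(x, y, window))
```

With a band of width zero the measure degenerates to a point-by-point comparison, while the work per series pair shrinks from the full n×m matrix to roughly n·(2·window + 1) cells, which is the speed-up that motivates global constraints.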

Advances in Web Intelligence and Data Mining | 2006

CatS: A Classification-Powered Meta-Search Engine

Miloš Radovanović; Mirjana Ivanović

CatS is a meta-search engine that utilizes text classification techniques to improve the presentation of search results. After posting a query, the user is offered an opportunity to refine the results by browsing through a category tree derived from the dmoz Open Directory topic hierarchy. This paper describes some key aspects of the system (including HTML parsing, classification and displaying of results), outlines the text categorization experiments performed in order to choose the right parameters for classification, and puts the system into the context of related work on (meta-)search engines. The approach of using a separate category tree represents an extension of the standard relevance list, and provides a way to refine the search on need, offering the user a non-imposing, but potentially powerful tool for locating needed information quickly and efficiently. The current implementation of CatS may be considered a baseline, on top of which many enhancements are possible.

conference on recommender systems | 2009

How does high dimensionality affect collaborative filtering?

Alexandros Nanopoulos; Miloš Radovanović; Mirjana Ivanović

A crucial operation in memory-based collaborative filtering (CF) is determining nearest neighbors (NNs) of users/items. This paper addresses two phenomena that emerge when CF algorithms perform NN search in high-dimensional spaces that are typical in CF applications. The first is similarity concentration and the second is the appearance of hubs (i.e., points which appear in k-NN lists of many other points). Through theoretical analysis and experimental evaluation we show that these phenomena are inherent properties of high-dimensional space, unrelated to other data properties like sparsity, and that they can impact CF algorithms by questioning the meaning and representativeness of discovered NNs. Moreover, we show that it is not easy to mitigate the phenomena using dimensionality reduction. Studying these phenomena aims to provide a better understanding of the limitations of memory-based CF and motivate the development of new algorithms that would overcome them.

artificial intelligence methodology systems applications | 2010

A framework for time-series analysis

Vladimir Kurbalija; Miloš Radovanović; Zoltan Geler; Mirjana Ivanović

The popularity of time-series databases in many applications has created an increasing demand for performing data-mining tasks (classification, clustering, outlier detection, etc.) on time-series data. Currently, however, no single system or library exists that specializes in providing efficient implementations of data-mining techniques for time-series data, supports the necessary concepts of representations, similarity measures and preprocessing tasks, and is at the same time freely available. For these reasons we have designed a multi-purpose, multifunctional, extendable system, FAP (Framework for Analysis and Prediction), which supports the aforementioned concepts and techniques for mining time-series data. This paper describes the architecture of FAP and the current version of its Java implementation, which focuses on time-series similarity measures and nearest-neighbor classification. The correctness of the implementation is verified through a battery of experiments which involve diverse time-series data sets from the UCR repository.

international test conference | 2011

CHARACTERISTICS OF CLASS COLLABORATION NETWORKS IN LARGE JAVA SOFTWARE PROJECTS

Miloš Savić; Mirjana Ivanović; Miloš Radovanović

data warehousing and knowledge discovery | 2006

Document representations for classification of short web-page descriptions

Miloš Radovanović; Mirjana Ivanović
