Vlastislav Dohnal
Masaryk University
Publication
Featured research published by Vlastislav Dohnal.
Multimedia Tools and Applications | 2003
Vlastislav Dohnal; Claudio Gennaro; Pasquale Savino; Pavel Zezula
To speed up retrieval in large collections of data, index structures partition the data into subsets so that query requests can be evaluated without examining the entire collection. As the complexity of modern data types grows, metric spaces have become a popular paradigm for similarity retrieval. We propose a new index structure, called D-Index, that combines a novel clustering technique and the pivot-based distance searching strategy to speed up execution of similarity range and nearest neighbor queries for large files with objects stored in disk memories. We have qualitatively analyzed D-Index and verified its properties on an actual implementation. We have also compared D-Index with other index structures and demonstrated its superiority on several real-life data sets. Contrary to tree organizations, the D-Index structure is suitable for dynamic environments with a high rate of delete/insert operations.
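The pivot-based strategy mentioned in this abstract can be illustrated with a minimal sketch. It is not the D-Index implementation; it only shows the general idea of discarding objects through the triangle inequality using precomputed object-to-pivot distances. The function names `build_pivot_table` and `range_query` and the toy one-dimensional data are assumptions made for illustration.

```python
def build_pivot_table(objects, pivots, dist):
    """Precompute the distance from every object to every pivot."""
    return [[dist(o, p) for p in pivots] for o in objects]


def range_query(query, radius, objects, pivots, table, dist):
    """Return objects within `radius` of `query`, plus the number of
    object distances that actually had to be computed."""
    q_to_pivots = [dist(query, p) for p in pivots]
    result = []
    computed = 0
    for obj, row in zip(objects, table):
        # Triangle inequality: |d(q, p) - d(o, p)| <= d(q, o).
        # If the lower bound already exceeds the radius, skip the object.
        if any(abs(qp - op) > radius for qp, op in zip(q_to_pivots, row)):
            continue
        computed += 1
        if dist(query, obj) <= radius:
            result.append(obj)
    return result, computed


# Toy metric space: real numbers with absolute difference (hypothetical data).
dist = lambda a, b: abs(a - b)
objects = [0.0, 1.0, 5.0, 9.0, 10.0]
pivots = [0.0]
table = build_pivot_table(objects, pivots, dist)
hits, computed = range_query(9.5, 1.0, objects, pivots, table, dist)
```

In this toy run only two of the five object distances are evaluated; the other three objects are excluded by the pivot bound alone.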
Database and Expert Systems Applications | 2003
Vlastislav Dohnal; Claudio Gennaro; Pavel Zezula
Similarity join in distance spaces constrained by the metric postulates is the necessary complement of the more famous similarity range and nearest neighbor search primitives. However, the quadratic computational complexity of similarity joins prevents their application to large data collections. We present the eD-Index, an extension of D-Index, and we study an application of the eD-Index to implement two algorithms for similarity self joins, i.e. the range query join and the overloading join. Although these approaches also cannot eliminate the intrinsic quadratic complexity of similarity joins, significant performance improvements are confirmed by experiments.
European Conference on Information Retrieval | 2003
Vlastislav Dohnal; Claudio Gennaro; Pasquale Savino; Pavel Zezula
Similarity join in distance spaces constrained by the metric postulates is the necessary complement of the more famous similarity range and nearest neighbor search primitives. However, the quadratic computational complexity of similarity joins prevents their application to large data collections. We first study the underlying principles of such joins and suggest three categories of implementation strategies based on filtering, partitioning, or similarity range searching. Then we study an application of the D-Index to implement the most promising alternative of range searching. Although this approach also cannot eliminate the intrinsic quadratic complexity of similarity joins, significant performance improvements are confirmed by experiments.
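The range-searching strategy named in this abstract can be sketched as follows: a similarity self join with threshold mu is answered by issuing one range query of radius mu per object. This is a minimal sketch, not the paper's algorithm. The linear-scan `range_search` is a hypothetical stand-in for an index such as the D-Index, and all names and data are assumptions.

```python
def range_search(query, radius, objects, dist):
    """Stand-in for an index lookup: indices of objects within `radius`."""
    return [i for i, o in enumerate(objects) if dist(query, o) <= radius]


def similarity_self_join(objects, mu, dist, search=range_search):
    """Return index pairs (i, j), i < j, whose distance is at most `mu`.

    One range query is issued per object; an index plugged in as `search`
    reduces the cost of each query, but the join stays quadratic in the
    worst case.
    """
    pairs = []
    for i, obj in enumerate(objects):
        for j in search(obj, mu, objects, dist):
            if i < j:  # report each unordered pair once, skip (i, i)
                pairs.append((i, j))
    return pairs


# Toy metric space (hypothetical data).
dist = lambda a, b: abs(a - b)
pairs = similarity_self_join([0.0, 1.0, 3.0, 3.5], 1.0, dist)
```

Passing the search routine as a parameter keeps the join logic independent of whichever index answers the range queries.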
International Symposium on Multimedia | 2008
Jan Sedmidubsky; Vlastislav Dohnal; Stanislav Barton; Pavel Zezula
We propose a self-organized and self-adapting system for content-based search in multimedia data. In particular, we build a semantic overlay over an existing peer-to-peer network. The self-organization of the overlay is obtained by using the social-network paradigm. The connections between peers are formed on the basis of a query-answer principle. The knowledge about answers to previous queries is exploited to route queries efficiently. At the same time, a randomized mechanism is used to explore new and unvisited parts of the network. In this way, a self-adaptable and robust system is built. Moreover, the metric space data model is used to achieve extensibility. The proposed concepts are verified on a network consisting of 2,000 peers and indexing 10 million images.
International Conference on Data Engineering | 2008
Jan Sedmidubsky; Stanislav Barton; Vlastislav Dohnal; Pavel Zezula
Exploiting the concepts of social networking represents a novel approach to approximate similarity query processing. We present a metric social network where relations between peers that give similar results are established on a per-query basis. Based on the universal law of generalization, a new query forwarding algorithm is proposed. The same principle is used to manage query histories of individual peers, with the possibility to tune the tradeoff between the extent of the history and the level of the query-answer approximation. All algorithms are tested on real data and a real network of computers.
Similarity Search and Applications | 2009
Michal Batko; Vlastislav Dohnal; David Novak; Jan Sedmidubsky
It has become customary that practically any information can be in a digital form. However, searching for relevant information can be complicated because of: (1) the diversity of ways in which specific data can be sorted, compared, related, or classified, and (2) the exponentially increasing amount of digital data. Accordingly, a successful search engine should address problems of extensibility and scalability. The Multi-Feature Indexing Network (MUFIN) is a general purpose search engine that satisfies these requirements. The extensibility is ensured by adopting the metric space to model the similarity, so MUFIN can evaluate queries over a wide variety of data domains compared by metric distance functions. The scalability is achieved by utilizing the paradigm of structured peer-to-peer networks, where the computational workload of query execution is distributed over multiple independent peers which can work in parallel. We demonstrate these unique capabilities of MUFIN on a database of 100 million images indexed according to a combination of five MPEG-7 descriptors.
Similarity Search and Applications | 2009
Vlastislav Dohnal; Jan Sedmidubsky
We analyze routing mechanisms of a self-organizing semantic overlay for content-based search in multimedia data. This overlay operates over any existing P2P network based on the metric space approach. In particular, we replace the previous design of routing mechanisms in the Metric Semantic Overlay (MSO) with a new adaptive query-routing algorithm. Its advantage lies in the automatic tuning of the confusability of queries, which is used to select peers during query evaluation. These improvements are experimentally evaluated on a real-life and a synthetic dataset.
Extending Database Technology | 2004
Vlastislav Dohnal
Similarity retrieval is an important paradigm for searching in environments where exact match has little meaning. Moreover, in order to enlarge the set of data types for which the similarity search can efficiently be performed, the mathematical notion of metric space provides a useful abstraction of similarity. In this paper, we present a novel access structure for similarity search in arbitrary metric spaces, called D-Index. D-Index supports easy insertions and deletions and bounded search costs for range queries with radius up to ρ. D-Index also supports disk memories; thus, it is able to deal with large archives. However, the partitioning principles employed in the D-Index are not optimal, since they produce a high number of empty partitions. We propose several strategies of partitioning and, finally, compare them.
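The role of the radius bound ρ can be illustrated with a sketch of an excluded-middle ball partitioning, the kind of split function described in the D-Index literature. This is a simplified, single-pivot illustration under stated assumptions, not the structure from the paper: the names `rho_split` and `partition`, the use of `None` for the exclusion zone, and the toy data are all hypothetical.

```python
def rho_split(x, pivot, d_median, rho, dist):
    """Excluded-middle ball partitioning around one pivot.

    Returns 0 for the inner set, 1 for the outer set, and None for the
    exclusion zone of width 2 * rho around the median distance.
    """
    d = dist(x, pivot)
    if d <= d_median - rho:
        return 0
    if d > d_median + rho:
        return 1
    return None


def partition(objects, pivot, d_median, rho, dist):
    """Distribute objects into the two separable sets and the exclusion set."""
    buckets = {0: [], 1: [], None: []}
    for obj in objects:
        buckets[rho_split(obj, pivot, d_median, rho, dist)].append(obj)
    return buckets


# Toy metric space (hypothetical data).
dist = lambda a, b: abs(a - b)
buckets = partition([1, 2, 4, 5, 6, 8, 9], pivot=0, d_median=5, rho=1, dist=dist)
```

Any inner object and any outer object are more than 2ρ apart, so a range query with radius at most ρ cannot have answers in both separable sets; it needs to inspect at most one of them plus the exclusion set.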
Web Information and Data Management | 2009
Tomáš Skopal; Vlastislav Dohnal; Michal Batko; Pavel Zezula
As the volume of multimedia data available on the internet is increasing tremendously, content-based similarity search is becoming a popular approach to multimedia retrieval. The most popular retrieval concept is the k nearest neighbor (kNN) search. For a long time, kNN queries provided effective retrieval in multimedia databases. However, as today's multimedia databases available on the web grow to massive volumes, the classic kNN query quickly loses its descriptive power. In this paper, we introduce a new similarity query type, the k distinct nearest neighbors (kDNN), which aims to generalize the classic kNN query to be more robust with respect to the database size. In addition to retrieving just objects similar to the query example, the kDNN further ensures that the objects within the result are sufficiently distinct, i.e. it excludes near duplicates.
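The idea of excluding near duplicates from a nearest neighbor result can be sketched with a simple greedy filter. The abstract does not give the formal kDNN definition, so the following is only one plausible reading and not the paper's method: the separation threshold `delta`, the greedy selection order, and the function name `k_distinct_nn` are assumptions.

```python
def k_distinct_nn(query, objects, k, delta, dist):
    """Greedy sketch: scan objects by increasing distance to `query` and
    keep an object only if it is farther than `delta` from every object
    already kept. Stops once `k` objects are selected."""
    result = []
    for obj in sorted(objects, key=lambda o: dist(query, o)):
        if all(dist(obj, chosen) > delta for chosen in result):
            result.append(obj)
            if len(result) == k:
                break
    return result


# Toy metric space with two clusters of near duplicates (hypothetical data).
dist = lambda a, b: abs(a - b)
objects = [1.0, 1.1, 1.2, 3.0, 3.05, 7.0]
distinct = k_distinct_nn(0.0, objects, k=3, delta=0.5, dist=dist)
plain = sorted(objects, key=lambda o: dist(0.0, o))[:3]
```

On this data a plain 3NN query returns three objects from the same cluster, while the greedy filter returns one representative from each region.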
International Conference on Data Engineering | 2008
Vlastislav Dohnal; Jan Sedmidubsky; Pavel Zezula; David Novak
Due to the exponential growth of digital data and its complexity, we need a technique which allows us to search such collections efficiently. A suitable solution seems to be based on the peer-to-peer (P2P) network paradigm and the metric-space model of similarity. During the building phase of the distributed structure, the peers often split as new peers join the network. During a peer split, the local data is halved and one half is migrated to the new peer. In this paper, we study the problem of efficient splits of metric data locally organized by an M-tree and we propose a novel algorithm for speeding the splits up. In particular, we focus on the metric-based structured P2P network called the M-Chord. In experimental evaluation, we compare the proposed algorithm with several straightforward solutions on a real network organizing 10 million images. Our algorithm provides a significant performance boost.