Tomáš Skopal | Researchain

Publication

Featured research published by Tomáš Skopal.

extending database technology | 2006

On fast non-metric similarity search by metric access methods

Tomáš Skopal

The retrieval of objects from a multimedia database employs a measure which defines a similarity score for every pair of objects. The measure should effectively follow the nature of similarity; hence, it should not be limited by the triangular inequality, regarded as a restriction in similarity modeling. On the other hand, the retrieval should be as efficient (or fast) as possible. The measure is thus often restricted to a metric, because then the search can be handled by metric access methods (MAMs). In this paper we propose a general method of non-metric search by MAMs. We show that the triangular inequality can be enforced for any semimetric (reflexive, non-negative and symmetric measure), resulting in a metric that preserves the original similarity orderings (retrieval effectiveness). We propose the TriGen algorithm for turning any black-box semimetric into an (approximated) metric, using only the distance distribution in a fraction of the database. The algorithm finds a metric for which the retrieval efficiency is maximized, considering any MAM.
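
The idea of enforcing the triangular inequality while keeping similarity orderings can be illustrated with a small sketch. This is not the TriGen algorithm itself: it assumes a fixed power modifier f(x) = x^w with a hand-picked exponent, and it uses squared Euclidean distance as a stand-in for a black-box semimetric. All function names are hypothetical.

```python
import itertools

def squared_l2(a, b):
    """Squared Euclidean distance: a semimetric that violates the triangle inequality."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def modified(dist, weight):
    """Wrap dist with the increasing concave modifier f(x) = x ** weight (0 < weight <= 1)."""
    return lambda a, b: dist(a, b) ** weight

def triangle_violation_rate(points, dist, eps=1e-12):
    """Fraction of ordered triplets (a, b, c) with d(a, c) > d(a, b) + d(b, c)."""
    triplets = list(itertools.permutations(points, 3))
    violations = sum(
        1 for a, b, c in triplets if dist(a, c) > dist(a, b) + dist(b, c) + eps
    )
    return violations / len(triplets)

def ranking(query, points, dist):
    """Objects ordered by increasing distance from the query."""
    return sorted(points, key=lambda p: dist(query, p))

sample = [(0.0,), (1.0,), (2.0,), (3.5,)]
concave = modified(squared_l2, 0.5)  # assumed exponent, chosen by hand

raw_rate = triangle_violation_rate(sample, squared_l2)
fixed_rate = triangle_violation_rate(sample, concave)
same_order = ranking(sample[0], sample, squared_l2) == ranking(sample[0], sample, concave)
```

Because the modifier is strictly increasing, the ranking of objects by distance is unchanged, while the concavity removes the triangle violations in this sample. Choosing the modifier automatically from sampled distances is the part this sketch leaves out.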

advances in databases and information systems | 2003

Revisiting M-Tree Building Principles

Tomáš Skopal; Jaroslav Pokorný; Michal Krátký; Václav Snášel

The M-tree is a dynamic data structure designed to index metric datasets. In this paper we introduce two dynamic techniques of building the M-tree. The first one incorporates a multi-way object insertion while the second one exploits the generalized slim-down algorithm. Usage of these techniques or even combination of them significantly increases the querying performance of the M-tree. We also present comparative experimental results on large datasets showing that the new techniques outperform by far even the static bulk loading algorithm.

database systems for advanced applications | 2005

Nearest neighbours search using the PM-Tree

Tomáš Skopal; Jaroslav Pokorný; Václav Snášel

We introduce a method of searching the k nearest neighbours (k-NN) using the PM-tree. The PM-tree is a metric access method for similarity search in large multimedia databases. As an extension of the M-tree, the structure of the PM-tree exploits local dynamic pivots (as the M-tree does) as well as global static pivots (used by LAESA-like methods). While in the M-tree a metric region is represented by a hyper-sphere, in the PM-tree the "volume" of the metric region is further reduced by a set of hyper-rings. As a consequence, the shape of the PM-tree's metric region bounds the indexed objects more tightly which, in turn, improves the overall search efficiency. Besides the description of the PM-tree, we propose an optimal k-NN search algorithm. Finally, the efficiency of k-NN search is experimentally evaluated on large synthetic as well as real-world datasets.
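
The hyper-ring pruning described above can be sketched as follows. The example is a simplification under stated assumptions: it keeps one (min, max) distance interval per global pivot for a single region, uses Euclidean distance, and does not reproduce the PM-tree node layout or its k-NN algorithm. The function names are hypothetical.

```python
import math

def build_rings(objects, pivots, dist):
    """For each global pivot, record the (min, max) distance to the region's objects."""
    rings = []
    for p in pivots:
        distances = [dist(p, o) for o in objects]
        rings.append((min(distances), max(distances)))
    return rings

def region_may_contain(query, radius, pivots, rings, dist):
    """Return False only if some ring proves the query ball misses the region.

    By the triangle inequality, any object o with d(query, o) <= radius satisfies
    d(query, p) - radius <= d(p, o) <= d(query, p) + radius for every pivot p.
    """
    for p, (lo, hi) in zip(pivots, rings):
        d_qp = dist(query, p)
        if d_qp - radius > hi or d_qp + radius < lo:
            return False  # query ball lies entirely outside this hyper-ring
    return True

region = [(10.0, 10.0), (11.0, 10.0), (10.0, 12.0)]
pivots = [(0.0, 0.0), (20.0, 0.0)]
rings = build_rings(region, pivots, math.dist)

far_query_pruned = not region_may_contain((0.0, 0.0), 1.0, pivots, rings, math.dist)
near_query_kept = region_may_contain((10.0, 11.0), 1.0, pivots, rings, math.dist)
```

A region that survives this test still has to be checked against its hyper-sphere and, eventually, its objects; the rings only add a cheap extra filter.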

international conference on multimedia retrieval | 2011

Indexing the signature quadratic form distance for efficient content-based multimedia retrieval

Christian Beecks; Jakub Lokoč; Thomas Seidl; Tomáš Skopal

The Signature Quadratic Form Distance has been introduced as an adaptive similarity measure coping with flexible content representations of various multimedia data. Although the Signature Quadratic Form Distance has shown good retrieval performance with respect to its qualities of effectiveness and efficiency, its applicability to index structures remains a challenging issue due to its dynamic nature. In this paper, we investigate the indexability of the Signature Quadratic Form Distance regarding metric access methods. We show how the distance's inherent parameters determine the indexability and analyze the relationship between effectiveness and efficiency on numerous image databases.

multimedia information retrieval | 2006

Dynamic similarity search in multi-metric spaces

Benjamin Bustos; Tomáš Skopal

An important research issue in multimedia databases is the retrieval of similar objects. For most applications in multimedia databases, an exact search is not meaningful. Thus, much effort has been devoted to developing efficient and effective similarity search techniques. A recent approach that has been shown to improve the effectiveness of similarity search in multimedia databases resorts to combinations of metrics where the desirable contribution (weight) of each metric is chosen at query time. This paper presents the Multi-Metric M-tree (M3-tree), a metric access method that supports similarity queries with dynamic combinations of metric functions. The M3-tree, an extension of the M-tree, stores partial distances to better estimate the weighted distances between routing/ground entries and each query, where a single distance function is used to build the whole index. An experimental evaluation shows that the M3-tree may be as efficient as having multiple M-trees (one for each).
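
One way stored partial distances can help with query-time weights is sketched below. It assumes the combined distance is a non-negative weighted sum of metrics, and it derives a lower bound per metric from the triangle inequality. The metrics (L1 and L-infinity), the weights, and the function names are illustrative assumptions, not the M3-tree's actual estimation rule.

```python
def l1(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def linf(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

METRICS = (l1, linf)  # assumed component metrics for the illustration

def partial_distances(a, b, metrics=METRICS):
    """One distance per component metric; this is what would be stored in the index."""
    return tuple(m(a, b) for m in metrics)

def weighted_distance(partials, weights):
    """Combined distance for weights chosen at query time."""
    return sum(w * d for w, d in zip(weights, partials))

def weighted_lower_bound(query_to_routing, object_to_routing, weights):
    """Lower bound on the weighted query-object distance.

    Each component metric satisfies |d_i(q, r) - d_i(o, r)| <= d_i(q, o),
    so the weighted sum of these terms bounds the weighted distance from below.
    """
    return sum(
        w * abs(dq - do)
        for w, dq, do in zip(weights, query_to_routing, object_to_routing)
    )

routing, obj, query = (0, 0), (2, 5), (4, 1)
stored = partial_distances(obj, routing)  # computed once, at indexing time
weights = (0.25, 0.75)                    # chosen at query time

bound = weighted_lower_bound(partial_distances(query, routing), stored, weights)
exact = weighted_distance(partial_distances(query, obj), weights)
```

If `bound` already exceeds the query radius, the exact weighted distance to `obj` never needs to be computed, whatever weights the query supplies.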

similarity search and applications | 2011

Ptolemaic indexing of the signature quadratic form distance

Jakub Lokoč; Magnus Lie Hetland; Tomáš Skopal; Christian Beecks

The signature quadratic form distance has been introduced as an adaptive similarity measure coping with flexible content representations of multimedia data. While this distance has shown high retrieval quality, its high computational complexity underscores the need for efficient search methods. Recent research has shown that a huge improvement in search efficiency is achieved when using metric indexing. In this paper, we analyze the applicability of Ptolemaic indexing to the signature quadratic form distance. We show that it is a Ptolemaic metric and present an application of Ptolemaic pivot tables to image databases, resolving queries nearly four times as fast as the state-of-the-art metric solution, and up to 300 times as fast as sequential scan.
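
The pivot-based filtering that Ptolemaic indexing relies on follows from Ptolemy's inequality: for a query q, an object o, and two pivots p and s, d(q, o) >= |d(q, p) d(o, s) - d(q, s) d(o, p)| / d(p, s). The sketch below evaluates that bound with Euclidean distance, which is a Ptolemaic metric, as a stand-in for the signature quadratic form distance. The function names and data are hypothetical.

```python
import itertools
import math

def ptolemaic_lower_bound(d_qp, d_qs, d_op, d_os, d_ps):
    """Lower bound on d(q, o) from distances to two distinct pivots p and s."""
    return abs(d_qp * d_os - d_qs * d_op) / d_ps

def best_lower_bound(query, obj, pivots, dist):
    """Tightest Ptolemaic bound over all pivot pairs.

    In a pivot table the object-to-pivot and pivot-to-pivot distances would be
    precomputed; they are computed on the fly here to keep the sketch short.
    """
    best = 0.0
    for p, s in itertools.combinations(pivots, 2):
        bound = ptolemaic_lower_bound(
            dist(query, p), dist(query, s), dist(obj, p), dist(obj, s), dist(p, s)
        )
        best = max(best, bound)
    return best

pivots = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
query, obj = (1.0, 2.0), (4.0, 6.0)

bound = best_lower_bound(query, obj, pivots, math.dist)
true_distance = math.dist(query, obj)
```

An object can be discarded from a range query without computing its distance whenever `bound` exceeds the query radius.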

conference on multimedia modeling | 2014

Signature-Based Video Browser

Jakub Lokoč; Adam Blažek; Tomáš Skopal

In this paper, we present a new signature-based video browser tool relying on the natural human ability to perceive and memorize visual stimuli of color regions in video frames. The tool utilizes feature signatures based on color and position extracted from the key frames in the preprocessing phase. Such content representation helps users draw simple query sketches and also enables effective and efficient processing of the query sketches. Besides simple user-drawn sketches of desired scenes, the tool also supports several additional automatic content-based analysis techniques enabling restrictions to various concepts like faces or shapes.

conference on multimedia modeling | 2015

Enhanced Signature-Based Video Browser

Adam Blažek; Jakub Lokoč; Filip Matzner; Tomáš Skopal

The success of our Signature-Based Video Browser presented last year at Video Browser Showdown 2014 (now renamed to Video Search Showcase) was mainly based on effective filtering using position-color feature signatures, while browsing in the results comprising matched keyframes was based just on a simple sequential search approach. Since the results can consist of highly similar keyframes (e.g., news studio scenes), making the browsing more difficult, we have enhanced our tool with more advanced browsing techniques that also consider homogeneous result sets obtained after the filtering phase. Furthermore, we have utilized improved search models based on feature signatures to make the filtering phase more effective.

IEEE Transactions on Knowledge and Data Engineering | 2012

D-Cache: Universal Distance Cache for Metric Access Methods

Tomáš Skopal; Jakub Lokoč; Benjamin Bustos

The caching of accessed disk pages has been successfully used for decades in database technology, resulting in effective amortization of I/O operations needed within a stream of query or update requests. However, in modern complex databases, like multimedia databases, the I/O cost becomes a minor performance factor. In particular, metric access methods (MAMs), used for similarity search in complex unstructured data, have been designed to minimize the number of distance computations rather than I/O cost (when indexing or querying). Inspired by I/O caching in traditional databases, in this paper we introduce the idea of distance caching for usage with MAMs, a novel approach to streamline similarity search. As a result, we present the D-cache, a main-memory data structure which can be easily integrated into any MAM, in order to spare the distance computations spent by queries/updates. In particular, we have modified two state-of-the-art MAMs to make use of the D-cache: the M-tree and Pivot tables. Moreover, we present the D-file, an index-free MAM based on simple sequential search augmented by the D-cache. The experimental evaluation shows that the performance gain achieved due to the D-cache is significant for all the MAMs, especially for the D-file.
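
The basic effect of caching distances across a stream of queries can be sketched with a bounded memo table around the distance function. This is only an illustration of the caching idea: the class name, the least-recently-used eviction policy, and the capacity are assumptions, and the D-cache as published may store and exploit distances differently.

```python
from collections import OrderedDict
import math

class CachedDistance:
    """Wraps a symmetric distance function with a bounded LRU cache (an assumed policy)."""

    def __init__(self, dist, capacity=1024):
        self.dist = dist
        self.capacity = capacity
        self.cache = OrderedDict()
        self.computations = 0  # number of real distance evaluations

    def __call__(self, a, b):
        key = frozenset((a, b))  # symmetric key: d(a, b) and d(b, a) share one entry
        if key in self.cache:
            self.cache.move_to_end(key)  # mark as most recently used
            return self.cache[key]
        self.computations += 1
        value = self.dist(a, b)
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used entry
        return value

def range_query(database, query, radius, dist):
    """Index-free sequential scan; every candidate costs one distance lookup."""
    return [o for o in database if dist(query, o) <= radius]

database = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
cached = CachedDistance(math.dist)

first = range_query(database, (0.0, 0.0), 5.5, cached)
second = range_query(database, (0.0, 0.0), 5.5, cached)  # answered from the cache
```

After both scans, `cached.computations` equals the database size rather than twice that, which is the saving a distance cache targets when the distance function dominates query cost.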

Journal of Discrete Algorithms | 2009

New dynamic construction techniques for M-tree

Tomáš Skopal; Jakub Lokoč

Since its introduction in 1997, the M-tree has become a respected metric access method (MAM), while remaining, together with its descendants, still the only database-friendly MAM, that is, a dynamic structure persistent in a paged index. Although there have been many other MAMs developed over the last decade, most of them require either static or expensive indexing. By contrast, the dynamic M-tree construction allows us to index very large databases in subquadratic time, and simultaneously the index can be kept up to date (i.e., it supports arbitrary insertions/deletions). In this article we propose two new techniques improving dynamic insertions in the M-tree: forced reinsertion strategies and the so-called hybrid-way leaf selection. Both techniques preserve the logarithmic asymptotic complexity of a single insertion, while they aim to produce more compact M-tree hierarchies (which leads to faster query processing). In particular, the former technique reuses the well-known principle of forced reinsertions, where the new insertion algorithm tries to re-insert the content of an M-tree leaf that is about to split in order to avoid that split. The latter technique constitutes an efficiency-scalable selection of a suitable leaf node into which a new object has to be inserted. In the experiments we show that the proposed techniques bring a clear improvement (speeding up both indexing and query processing) and also provide a tuning tool for the indexing vs. querying efficiency trade-off. Moreover, a combination of the new techniques exhibits a synergistic effect resulting in the best strategy for dynamic M-tree construction proposed so far.
