Vassilis J. Tsotras | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Vassilis J. Tsotras is active.

Explore More

Publication

Featured researches published by Vassilis J. Tsotras.

symposium on principles of database systems | 1999

On indexing mobile objects

George Kollios; Dimitrios Gunopulos; Vassilis J. Tsotras

We show how to index mobile objects in one and two dimensions using efficient dynamic external memory data structures. The problem is motivated by real life applications in traffic monitoring, intelligent navigation and mobile communications domains. For the l-dimensional case, we give (i) a dynamic, external memory algorithm with guaranteed worst case performance and linear space and (ii) a practical approximation algorithm also in the dynamic, external memory setting, which has linear space and expected logarithmic query time. We also give an algorithm with guaranteed logarithmic query time for a restricted version of the problem. We present extensions of our techniques to two dimensions. In addition we give a lower bound on the number of I/O’s needed to answer the d-dimensional problem. Initial experimental results and comparisons to traditional indexing approaches are also included.

ACM Computing Surveys | 1999

Comparison of access methods for time-evolving data

Betty Salzberg; Vassilis J. Tsotras

This paper compares different indexing techniques proposed for supporting efficient access to temporal data. The comparison is based on a collection of important performance criteria, including the space consumed, update processing, and query time for representative queries. The comparison is based on worst-case analysis, hence no assumptions on data distribution or query frequencies are made. When a number of methods have the same asymptotic worst-case behavior, features in the methods that affect average case behavior are discussed. Additional criteria examined are the pagination of an index, the ability to cluster related data together, and the ability to efficiently separate old from current data (so that larger archival storage media such as write-once optical disks can be used). The purpose of the paper is to identify the difficult problems in accessing temporal data and describe how the different methods aim to solve them. A general lower bound for answering basic temporal queries is also introduced.

very large data bases | 2002

Efficient structural joins on indexed XML documents

Shu-Yao Chien; Zografoula Vagena; Donghui Zhang; Vassilis J. Tsotras; Carlo Zaniolo

Queries on XML documents typically combine selections on element contents, and, via path expressions, the structural relationships between tagged elements. Structural joins are used to find all pairs of elements satisfying the primitive structural relationships specified in the query, namely, parent-child and ancestor-descendant relationships. Efficient support for structural joins is thus the key to efficient implementations of XML queries. Recently proposed node numbering schemes enable the capturing of the XML document structure using traditional indices (such as B+-trees or R-trees). This paper proposes efficient structural join algorithms in the presence of tag indices. We first concentrate on using B+- trees and show how to expedite a structural join by avoiding collections of elements that do not participate in the join. We then introduce an enhancement (based on sibling pointers) that further improves performance. Such sibling pointers are easily implemented and dynamically maintainable. We also present a structural join algorithm that utilizes R-trees. An extensive experimental comparison shows that the B+-tree structural joins are more robust. Furthermore, they provide drastic improvement gains over the current state of the art.

international conference on management of data | 2000

Approximating multi-dimensional aggregate range queries over real attributes

Dimitrios Gunopulos; George Kollios; Vassilis J. Tsotras; Carlotta Domeniconi

Finding approximate answers to multi-dimensional range queries over real valued attributes has significant applications in data exploration and database query optimization. In this paper we consider the following problem: given a table of d attributes whose domain is the real numbers, and a query that specifies a range in each dimension, find a good approximation of the number of records in the table that satisfy the query. We present a new histogram technique that is designed to approximate the density of multi-dimensional datasets with real attributes. Our technique finds buckets of variable size, and allows the buckets to overlap. Overlapping buckets allow more efficient approximation of the density. The size of the cells is based on the local density of the data. This technique leads to a faster and more compact approximation of the data distribution. We also show how to generalize kernel density estimators, and how to apply them on the multi-dimensional query approximation problem. Finally, we compare the accuracy of the proposed techniques with existing techniques using real and synthetic datasets.

extending database technology | 2002

Efficient Indexing of Spatiotemporal Objects

Marios Hadjieleftheriou; George Kollios; Vassilis J. Tsotras; Dimitrios Gunopulos

Spatiotemporal objects i.e., objects which change their position and/or extent over time, appear in many applications. This paper addresses the problem of indexing large volumes of such data. We consider general object movements and extent changes. We further concentrate on snapshot as well as small interval historical queries on the gathered data. The obvious approach that approximates spatiotemporal objects with MBRs and uses a traditional multidimensional access method to index them is inefficient. Objects that live for long time intervals have large MBRs which introduce a lot of empty space. Clustering long intervals has been dealt in temporal databases by the use of partially persistent indices. What differentiates this problem from traditional temporal indexing is that objects are allowed to move/change during their lifetime. Better methods are thus needed to approximate general spatiotemporal objects. One obvious solution is to introduce artificial splits: the lifetime of a long-lived object is split into smaller consecutive pieces. This decreases the empty space but increases the number of indexed MBRs. We first introduce two algorithms for splitting a given spatiotemporal object. Then, given an upper bound on the total number of possible splits, we present three algorithms that decide how the splits should be distributed among the objects so that the total empty space is minimized.

very large data bases | 2002

Efficient schemes for managing multiversionXML documents

Shu-Yao Chien; Vassilis J. Tsotras; Carlo Zaniolo

Abstract. Multiversion support for XML documents is needed in many critical applications, such as software configuration control, cooperative authoring, web information warehouses, and ”e-permanence” of web documents. In this paper, we introduce efficient and robust techniques for: (i) storing and retrieving; (ii) viewing and exchanging; and (iii) querying multiversion XML documents. We first discuss the limitations of traditional version control methods, such as RCS and SCCS, and then propose novel techniques that overcome their limitations. Initially, we focus on the problem of managing secondary storage efficiently, and introduce an edit-based versioning scheme that enhances RCS with an effective clustering policy based on the concept of page-usefulness. The new scheme drastically improves version retrieval at the expense of a small (linear) space overhead. However, the edit-based approach falls short of achieving objectives (ii) and (iii). Therefore, we introduce and investigate a second scheme, which is reference-based and preserves the structure of the original document. In the reference-based approach, a multiversion document can be represented as yet another XML document, which can be easily exchanged and viewed on the web; furthermore, simple queries are also expressed and supported well under this representation. To achieve objective (i), we extend the page-usefulness clustering technique to the reference-based scheme. After characterizing the asymptotic behavior of the new techniques proposed, the paper presents the results of an experimental study evaluating and comparing their performance.

web information systems engineering | 2001

Storing and querying multiversion XML documents using durable node numbers

Shu-Yao Chien; Vassilis J. Tsotras; Carlo Zaniolo; Donghui Zhang

Managing multiple versions of XML documents represents an important problem for many traditional applications, such as software configuration control, as well as new ones, such as link permanence of web documents. Research on managing multiversion XML documents seeks to provide efficient and robust techniques for storing, retrieving and querying such documents. In this paper we present a novel approach to version management that achieves these objectives by a scheme based on Durable Node Numbers and timestamps for the elements of XML documents. We first present efficient storage and retrieval techniques for multiversion documents. Then, we explore the indexing and clustering strategies needed to assure efficient support for complex queries on content and on document evolution.

international conference on management of data | 2001

XML document versioning

Shu Yao Chien; Vassilis J. Tsotras; Carlo Zaniolo

Managing multiple versions of XML documents represents an important problem, because of many applications ranging from traditional ones, such as software configuration control, to new ones, such as link permanence of web documents. Research on managing multiversion XML documents seeks to provide efficient and robust techniques for (i) storing and retrieving, (ii) exchanging, and (iii) querying such documents. In this paper, we first show that traditional version control methods, such as RCS, and SCCS, fall short from satisfying these three requirements, and discuss alternative solutions. First, we enhance RCS with a temporal page clustering policy to achieve objective (i). Then, we discuss a reference-based versioning scheme that achieves both objectives (i) and (ii) and is also effective at supporting simple queries. The topic of supporting complex queries, including temporal ones, meshes with the burgeoning interest of database researchers in XML as a database description language, and in XML query languages. In this context, the XML versioning problems are akin to those of transaction time management for databases of objects and semistructured information. Nevertheless, the need to preserve the natural ordering of XML documents frequently requires different techniques.

Distributed and Parallel Databases | 2011

ASTERIX: towards a scalable, semistructured data platform for evolving-world models

Alexander Behm; Vinayak R. Borkar; Michael J. Carey; Raman Grover; Chen Li; Nicola Onose; Rares Vernica; Alin Deutsch; Yannis Papakonstantinou; Vassilis J. Tsotras

ASTERIX is a new data-intensive storage and computing platform project spanning UC Irvine, UC Riverside, and UC San Diego. In this paper we provide an overview of the ASTERIX project, starting with its main goal—the storage and analysis of data pertaining to evolving-world models. We describe the requirements and associated challenges, and explain how the project is addressing them. We provide a technical overview of ASTERIX, covering its architecture, its user model for data and queries, and its approach to scalable query processing and data management. ASTERIX utilizes a new scalable runtime computational platform called Hyracks that is also discussed at an overview level; we have recently made Hyracks available in open source for use by other interested parties. We also relate our work on ASTERIX to the current state of the art and describe the research challenges that we are currently tackling as well as those that lie ahead.

symposium on large spatial databases | 2003

On-Line Discovery of Dense Areas in Spatio-temporal Databases

Marios Hadjieleftheriou; George Kollios; Dimitrios Gunopulos; Vassilis J. Tsotras

Moving object databases have received considerable attention recently. Previous work has concentrated mainly on modeling and indexing problems, as well as query selectivity estimation. Here we introduce a novel problem, that of addressing density-based queries in the spatio-temporal domain. For example: “Find all regions that will contain more than 500 objects, ten minutes from now”. The user may also be interested in finding the time period (interval) that the query answer remains valid. We formally define a new class of density-based queries and give approximate, on-line techniques that answer them efficiently. Typically the threshold above which a region is considered to be dense is part of the query. The difficulty of the problem lies in the fact that the spatial and temporal predicates are not specified by the query. The techniques we introduce find all candidate dense regions at any time in the future. To make them more scalable we subdivide the spatial universe using a grid and limit queries within a pre-specified time horizon. Finally, we validate our approaches with a thorough experimental evaluation.

Explore More