Publication


Featured research published by Stefanos Souldatos.


international world wide web conferences | 2008

Efficient evaluation of generalized path pattern queries on XML data

Xiaoying Wu; Stefanos Souldatos; Dimitri Theodoratos; Theodore Dalamagas; Timos K. Sellis

Finding the occurrences of structural patterns in XML data is a key operation in XML query processing. Existing algorithms for this operation focus almost exclusively on path-patterns or tree-patterns. Requirements in flexible querying of XML data have recently motivated the introduction of query languages that allow a partial specification of path-patterns in a query. In this paper, we focus on the efficient evaluation of partial path queries, a generalization of path pattern queries. Our approach explicitly deals with repeated labels (that is, multiple occurrences of the same label in a query). We show that partial path queries can be represented as rooted dags for which a topological ordering of the nodes exists. We present three algorithms for the efficient evaluation of these queries under the indexed streaming evaluation model. The first one exploits a structural summary of data to generate a set of path-patterns that together are equivalent to a partial path query. To evaluate these path-patterns, we extend PathStack so that it can work on path-patterns with repeated labels. The second one extracts a spanning tree from the query dag, uses a stack-based algorithm to find the matches of the root-to-leaf paths in the tree, and merge-joins the matches to compute the answer. Finally, the third one exploits multiple pointers of stack entries and a topological ordering of the query dag to apply a stack-based holistic technique. An analysis of the algorithms and an extensive experimental evaluation show that the holistic algorithm outperforms the other ones.
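The abstract above relies on a query dag having a topological ordering of its nodes. As a rough illustration only, the sketch below computes one such ordering with Kahn's algorithm, a standard textbook technique and not the evaluation algorithm of the paper. The function name and the small query dag (a root `r`, two nodes `a1` and `a2` standing for a repeated label, and a node `b`) are hypothetical.

```python
from collections import deque


def topological_order(nodes, edges):
    """Return one topological ordering of a dag (Kahn's algorithm).

    Raises ValueError if the graph contains a cycle.
    """
    successors = {n: [] for n in nodes}
    indegree = {n: 0 for n in nodes}
    for parent, child in edges:
        successors[parent].append(child)
        indegree[child] += 1

    # Nodes without incoming edges can be emitted immediately.
    ready = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for child in successors[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)

    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle, no topological order exists")
    return order


# Hypothetical query dag: a1 and a2 stand for two occurrences of the same label.
QUERY_NODES = ["r", "a1", "a2", "b"]
QUERY_EDGES = [("r", "a1"), ("r", "a2"), ("a1", "b"), ("a2", "b")]
ORDER = topological_order(QUERY_NODES, QUERY_EDGES)
print(ORDER)
```

Every edge of the dag points forward in the returned list, which is the property a holistic algorithm can exploit to process query nodes in a fixed order.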


conference on information and knowledge management | 2007

Evaluation of partial path queries on XML data

Stefanos Souldatos; Xiaoying Wu; Dimitri Theodoratos; Theodore Dalamagas; Timos K. Sellis

XML query languages typically allow the specification of structural patterns of elements. Finding the occurrences of such patterns in an XML tree is the key operation in XML query processing. Many algorithms have been presented for this operation. These algorithms focus mainly on the evaluation of path-pattern or tree-pattern queries. In this paper, we define a partial path-pattern query language, and we address the problem of its efficient evaluation on XML data. In order to process partial path-pattern queries, we introduce a set of sound and complete inference rules to characterize structural relationship derivation. We provide necessary and sufficient conditions for detecting query unsatisfiability and node redundancy. We show how partial path-pattern queries can be equivalently put in a canonical directed acyclic graph form. We develop two stack-based algorithms for the evaluation of partial path-pattern queries, PartialMJ and PartialPathStack. PartialMJ computes answers to the query by merge-joining the results of the root-to-leaf paths of a spanning tree of the query. PartialPathStack exploits a topological order of the nodes of the query graph to match the query pattern as a whole to the XML tree. The experimental evaluation of our algorithms shows that PartialPathStack is independent of intermediate results and largely outperforms PartialMJ.
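The decomposition step described for PartialMJ (take a spanning tree of the query graph, then work on its root-to-leaf paths) can be pictured with a small sketch. This is only an illustration under assumptions: the function names and the sample query graph are hypothetical, the spanning tree chosen here is simply the first one found by a stack-based traversal, and the stack-based matching and merge-join phases of the paper are not reproduced.

```python
def spanning_tree(root, edges):
    """Build a spanning tree of a rooted dag as a dict: node -> list of tree children."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)

    tree = {}
    seen = {root}
    stack = [root]
    while stack:
        node = stack.pop()
        tree.setdefault(node, [])
        for child in children.get(node, []):
            if child not in seen:  # keep only the first edge that reaches a node
                seen.add(child)
                tree[node].append(child)
                stack.append(child)
    return tree


def root_to_leaf_paths(root, tree):
    """List every root-to-leaf path of the spanning tree."""
    paths = []

    def walk(node, prefix):
        prefix = prefix + [node]
        if not tree.get(node):
            paths.append(prefix)
        for child in tree.get(node, []):
            walk(child, prefix)

    walk(root, [])
    return paths


def dropped_edges(edges, tree):
    """Query edges left out of the spanning tree; a join phase would still have to enforce them."""
    return [(p, c) for p, c in edges if c not in tree.get(p, [])]


# Hypothetical query dag in which node c has two parents.
EDGES = [("r", "a"), ("r", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
TREE = spanning_tree("r", EDGES)
PATHS = root_to_leaf_paths("r", TREE)
DROPPED = dropped_edges(EDGES, TREE)
print(PATHS, DROPPED)
```

The paths can be matched independently, while the dropped edges are the structural constraints that only the join of the partial results can check, which is one way to see why intermediate results may grow in a decomposition-based approach.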


IEEE Transactions on Knowledge and Data Engineering | 2012

Processing and Evaluating Partial Tree Pattern Queries on XML Data

Xiaoying Wu; Stefanos Souldatos; Dimitri Theodoratos; Theodore Dalamagas; Yannis Vassiliou; Timos K. Sellis

XML query languages typically allow the specification of structural patterns using XPath. Usually, these structural patterns are in the form of trees (tree-pattern queries, TPQs). Finding the occurrences of such patterns in an XML tree is a key operation in XML query evaluation. The many algorithms previously presented for this operation focus mainly on the evaluation of tree-pattern queries. Recently, requirements for flexible querying of XML data have motivated the consideration of query classes that are more expressive and flexible than TPQs and for which efficient non-main-memory evaluation algorithms are not known. In this paper, we consider a class of queries, called Partial Tree-Pattern Queries (PTPQs), which generalize and strictly contain TPQs. PTPQs represent a broad fragment of XPath which is very useful in practice. In order to process PTPQs, we introduce a set of sound and complete inference rules to characterize structural relationship derivation. We provide necessary and sufficient conditions for detecting query unsatisfiability and node redundancy. We also show that PTPQs can be represented as directed acyclic graphs augmented with “same-path” constraints. In order to leverage existing efficient evaluation algorithms for less expressive classes of queries, we design two approaches that evaluate a PTPQ by decomposing it into a set of simpler queries. Algorithm IndexTPQGen exploits a structural summary of the XML data and evaluates a PTPQ by generating an equivalent set of TPQs and unioning their answers. Algorithm PartialPathJoin decomposes the PTPQ into partial-path queries and merge-joins their solutions. We also develop PartialTreeStack, an original polynomial time holistic algorithm for PTPQs. To the best of our knowledge, this is the first algorithm to support the evaluation of such a broad structural fragment of XPath in the inverted lists evaluation model. We provide a theoretical analysis of our algorithm and identify cases where it is asymptotically optimal. An extensive experimental evaluation shows that it is more efficient, robust, and stable than the other two and that it outperforms a state-of-the-art XQuery engine on PTPQs.
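To make the idea of a structural summary more concrete, the sketch below builds a very simple one: the set of distinct root-to-element label paths of a document. It then uses that set to expand a partially specified ancestor-descendant requirement into fully specified paths. This is an assumption-laden illustration, not IndexTPQGen itself; the summary used in the paper may be defined differently, and the function names and sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def label_paths(xml_text):
    """Collect the distinct root-to-element label paths of a document as tuples."""
    root = ET.fromstring(xml_text)
    summary = set()

    def visit(element, prefix):
        path = prefix + (element.tag,)
        summary.add(path)
        for child in element:
            visit(child, path)

    visit(root, ())
    return summary


def expand_descendant(summary, ancestor, descendant):
    """Label paths that end at `descendant` and pass through `ancestor` earlier on."""
    return sorted(
        path
        for path in summary
        if path[-1] == descendant and ancestor in path[:-1]
    )


# Hypothetical sample document.
DOCUMENT = (
    "<bib>"
    "<book><title/><author><name/></author></book>"
    "<article><author><name/></author></article>"
    "</bib>"
)
SUMMARY = label_paths(DOCUMENT)
EXPANDED = expand_descendant(SUMMARY, "bib", "name")
print(EXPANDED)
```

Here the loose requirement "a `name` somewhere below `bib`" is replaced by the two concrete paths that actually occur in the data, each of which could then be handed to an algorithm for fully specified patterns.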


international world wide web conferences | 2010

Evaluation Techniques for Generalized Path Pattern Queries on XML Data

Xiaoying Wu; Dimitri Theodoratos; Stefanos Souldatos; Theodore Dalamagas; Timos K. Sellis

Finding the occurrences of structural patterns in XML data is a key operation in XML query processing. Existing algorithms for this operation focus almost exclusively on path patterns or tree patterns. Current applications of XML require querying of data whose structure is complex or is not fully known to the user, or integrating XML data sources with different structures. These applications have recently motivated the introduction of query languages that allow a partial specification of path patterns in a query. In this paper, we consider partial path queries, a generalization of path pattern queries, and we focus on their efficient evaluation under the indexed streaming evaluation model. Our approach explicitly deals with repeated labels (that is, multiple occurrences of the same label in a query). We show that partial path queries can be represented as rooted dags for which a topological ordering of the nodes exists. We present three algorithms for the efficient evaluation of these queries. The first one exploits a structural summary of data to generate a set of path patterns that together are equivalent to a partial path query. To evaluate these path patterns, we extend a previous algorithm for path-pattern queries so that it can work on path patterns with repeated labels. The second one extracts a spanning tree from the query dag, uses a stack-based algorithm to find the matches of the root-to-leaf paths in the tree, and merge-joins the matches to compute the answer. Finally, the third one exploits multiple pointers of stack entries and a topological ordering of the query dag to apply a stack-based holistic technique. We analyze our algorithms and perform extensive experimental evaluations. Our experimental results show that the holistic algorithm outperforms the other ones. Our approaches are the first ones to efficiently evaluate this class of queries in the indexed streaming model.


statistical and scientific database management | 2006

Containment of Partially Specified Tree-Pattern Queries

Dimitri Theodoratos; Theodore Dalamagas; Pawel Placek; Stefanos Souldatos; Timos K. Sellis

Nowadays, huge volumes of data, including scientific data, are organized or exported in tree-structured form. Querying capabilities are provided through tree-pattern queries. The need for integrating multiple data sources with different tree structures has recently driven the suggestion of query languages that relax the complete specification of a tree pattern. In this paper we adopt a query language with partially specified tree-pattern queries. A central feature of this type of query is that the structure can be specified fully, partially, or not at all in a query. Important issues in query optimization require solving the query containment problem. We study the containment problem for partially specified tree-pattern queries. To support the evaluation of such queries, we use semantically rich constructs, called dimension graphs, which abstract structural information of the tree-structured data. We address the problem of query containment in the absence (absolute query containment) and in the presence (relative query containment) of dimension graphs, and we provide necessary and sufficient conditions for each type of query containment. We suggest a technique for relative query containment checking based on structural information extracted in advance from the dimension graph. Our approach is implemented and validated through extensive experimental evaluation.


very large data bases | 2009

Containment of partially specified tree-pattern queries in the presence of dimension graphs

Dimitri Theodoratos; Pawel Placek; Theodore Dalamagas; Stefanos Souldatos; Timos K. Sellis

Nowadays, huge volumes of data are organized or exported in tree-structured form. Querying capabilities are provided through tree-pattern queries. The need for querying tree-structured data sources when their structure is not fully known, and the need to integrate multiple data sources with different tree structures, have recently driven the suggestion of query languages that relax the complete specification of a tree pattern. In this paper, we consider a query language that allows the partial specification of a tree pattern. Queries in this language range from structureless keyword-based queries to completely specified tree patterns. To support the evaluation of partially specified queries, we use semantically rich constructs, called dimension graphs, which abstract structural information of the tree-structured data. We address the problem of query containment in the presence of dimension graphs and we provide necessary and sufficient conditions for query containment. As checking query containment can be expensive, we suggest two heuristic approaches for query containment in the presence of dimension graphs. Our approaches are based on extracting structural information from the dimension graph that can be added to the queries while preserving equivalence with respect to the dimension graph. We consider both cases: extracting and storing different types of structural information in advance, and extracting information on-the-fly (at query time). Both approaches are implemented, validated, and compared through experimental evaluation.


conference on information and knowledge management | 2008

A heuristic approach for checking containment of generalized tree-pattern queries

Pawel Placek; Dimitri Theodoratos; Stefanos Souldatos; Theodore Dalamagas; Timos K. Sellis

Query processing techniques for XML data have focused mainly on tree-pattern queries (TPQs). However, the need for querying XML data sources whose structure is very complex or not fully known to the user, and the need to integrate multiple XML data sources with different structures, have recently driven the suggestion of query languages that relax the complete specification of a tree pattern. In order to implement the processing of such languages in current DBMSs, their containment problem has to be efficiently solved. In this paper, we consider a query language which generalizes TPQs by allowing the partial specification of a tree pattern. Partial tree-pattern queries (PTPQs) constitute a large fragment of XPath that flexibly permits the specification of a broad range of queries from keyword queries without structure, to queries with partial specification of the structure, to complete TPQs. We address the containment problem for PTPQs. This problem becomes more complex in the context of PTPQs because the partial specification of the structure allows new, non-trivial, structural expressions to be inferred from those explicitly specified in a query. We show that the containment problem cannot be characterized by homomorphisms between PTPQs, even when PTPQs are put in a canonical form that comprises all derived structural expressions. We provide necessary and sufficient conditions for this problem in terms of homomorphisms between PTPQs and (a possibly exponential number of) TPQs. To cope with the high complexity of PTPQ containment, we suggest a heuristic approach for this problem that trades accuracy for speed. An extensive experimental evaluation shows that our heuristic approach can be efficiently implemented in a query optimizer.
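As background for the homomorphisms mentioned above, the following sketch tests whether a homomorphism exists between two ordinary tree-pattern queries with child and descendant edges. It covers plain TPQs only and is not the PTPQ condition or the heuristic of the paper; it rests on the well-known fact that a homomorphism from a pattern Q2 to a pattern Q1 implies that Q1 is contained in Q2. The class, function names and sample patterns are hypothetical, and wildcards are left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class PatternNode:
    label: str
    # Each entry is (axis, node), where axis is "child" or "desc".
    edges: list = field(default_factory=list)


def proper_descendants(node):
    """Yield every node strictly below `node` in its pattern."""
    for _, child in node.edges:
        yield child
        yield from proper_descendants(child)


def maps_to(source, target):
    """True if the subtree rooted at `source` maps homomorphically into `target`'s subtree."""
    if source.label != target.label:
        return False
    for axis, source_child in source.edges:
        if axis == "child":
            # A child edge must be mapped to a child edge.
            candidates = [t for a, t in target.edges if a == "child"]
        else:
            # A descendant edge may be mapped to any downward path.
            candidates = proper_descendants(target)
        if not any(maps_to(source_child, t) for t in candidates):
            return False
    return True


def homomorphism_exists(q_from, q_to):
    """Root-preserving homomorphism test from pattern q_from to pattern q_to."""
    return maps_to(q_from, q_to)


# Q1 = a/b/c (child edges only), Q2 = a//c (one descendant edge).
Q1 = PatternNode("a", [("child", PatternNode("b", [("child", PatternNode("c"))]))])
Q2 = PatternNode("a", [("desc", PatternNode("c"))])

Q2_INTO_Q1 = homomorphism_exists(Q2, Q1)  # implies Q1 is contained in Q2
Q1_INTO_Q2 = homomorphism_exists(Q1, Q2)
print(Q2_INTO_Q1, Q1_INTO_Q2)
```

Because the source pattern is a tree, each child can be mapped independently, so the recursive test is sufficient for this restricted setting. The abstract's point is that such a test between PTPQs alone does not characterize containment.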


conference on information and knowledge management | 2006

Heuristic containment check of partial tree-pattern queries in the presence of index graphs

Dimitri Theodoratos; Stefanos Souldatos; Theodore Dalamagas; Pawel Placek; Timos K. Sellis


Informatica (Lithuanian Academy of Sciences) | 2006

Captain Nemo: A metasearch engine with personalized hierarchical search space

Stefanos Souldatos; Theodore Dalamagas; Timos K. Sellis


international conference on machine learning | 2005

Sailing the web with Captain Nemo: a personalized metasearch engine

Stefanos Souldatos; Theodore Dalamagas; Timos K. Sellis

Collaboration


Top Co-Authors

Theodore Dalamagas

Institute for the Management of Information Systems

Timos K. Sellis

Swinburne University of Technology

Dimitri Theodoratos

New Jersey Institute of Technology

Pawel Placek

New Jersey Institute of Technology

Yannis Vassiliou

National Technical University of Athens
