
Publication


Featured research published by Vasilis Vassalos.


Next Generation Information Technologies and Systems | 1997

The TSIMMIS Approach to Mediation: Data Models and Languages

Hector Garcia-Molina; Yannis Papakonstantinou; Dallan Quass; Anand Rajaraman; Yehoshua Sagiv; Jeffrey D. Ullman; Vasilis Vassalos; Jennifer Widom

TSIMMIS—The Stanford-IBM Manager of Multiple Information Sources—is a system for integrating information. It offers a data model and a common query language that are designed to support the combining of information from many different sources. It also offers tools for automatically generating the components that are needed to build systems for integrating information. In this paper we shall discuss the principal architectural features and their rationale.


International Conference on Management of Data | 1997

Template-based wrappers in the TSIMMIS system

Joachim Hammer; Hector Garcia-Molina; Svetlozar Nestorov; Ramana Yerneni; Markus M. Breunig; Vasilis Vassalos

In order to access information from a variety of heterogeneous information sources, one has to be able to translate queries and data from one data model into another. This functionality is provided by so-called (source) wrappers [4,8] which convert queries into one or more commands/queries understandable by the underlying source and transform the native results into a format understood by the application. As part of the TSIMMIS project [1, 6] we have developed hard-coded wrappers for a variety of sources (e.g., Sybase DBMS, WWW pages, etc.) including legacy systems (Folio). However, anyone who has built a wrapper before can attest that a lot of effort goes into developing and writing such a wrapper. In situations where it is important or desirable to gain access to new sources quickly, this is a major drawback. Furthermore, we have also observed that only a relatively small part of the code deals with the specific access details of the source. The rest of the code is either common among wrappers or implements query and data transformation that could be expressed in a high-level, declarative fashion.
Based on these observations, we have developed a wrapper implementation toolkit [7] for quickly building wrappers. The toolkit contains a library for commonly used functions, such as for receiving queries from the application and packaging results. It also contains a facility for translating queries into source-specific commands, and for translating results into a model useful to the application. The philosophy behind our “template-based” translation methodology is as follows. The wrapper implementor specifies a set of templates (rules) written in a high-level declarative language that describe the queries accepted by the wrapper as well as the objects that it returns. If an application query matches a template, an implementor-provided action associated with the template is executed to provide the native query for the underlying source. 
When the source returns the result of the query, the wrapper transforms the answer which is represented in the data model of the source into a representation that is used by the application. Using this toolkit one can quickly design a simple wrapper with a few templates that cover some of the desired functionality, probably the one that is most urgently needed. However, templates can be added gradually as more functionality is required later on. Another important use of wrappers is in extending the query capabilities of a source. For instance, some sources may not be capable of answering queries that have multiple predicates. In such cases, it is necessary to pose a native query to such a source using only predicates that the source is capable of handling. The rest of the predicates are automatically separated from the user query and form a filter query. When the wrapper receives the results, a post-processing engine applies the filter query. This engine supports a set of built-in predicates based on the comparison operators =,≠,<,>, etc. In addition, the engine supports more complex predicates that can be specified as part of the filter query. The postprocessing engine is common to wrappers of all sources and is part of the wrapper toolkit. Note that because of postprocessing, the wrapper can handle a much larger class of queries than those that exactly match the templates it has been given. Figure 1 shows an overview of the wrapper architecture as it is currently implemented in our TSIMMIS testbed. Shaded components are provided by the toolkit, the white component is source-specific and must be generated by the implementor. 
The driver component controls the translation process and invokes the following services: the parser, which parses the templates, the native schema, and the incoming queries into internal data structures; the matcher, which matches a query against the set of templates and creates a filter query for postprocessing if necessary; the native component, which submits the generated action string to the source and extracts the data from the native result using the information given in the source schema; and the engine, which transforms and packages the result and applies a postprocessing filter if one has been created by the matcher. We now describe the sequence of events that occur at the wrapper during the translation of a query and its result, using an example from our prototype system. The queries are formulated using a rule-based language called MSL that has been developed as a template specification and query language for the TSIMMIS project. Data is represented using our Object Exchange Model (OEM). We will briefly describe MSL and OEM in the next section. Details on MSL can be found in [5]; a full introduction to OEM is given in [1].
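The template-matching and post-filtering flow described in this abstract can be sketched in a few lines. This is a toy illustration only, not MSL or the TSIMMIS toolkit: the template and query representations (predicate sets, an `action` callable) are invented for the example.

```python
# Toy sketch of template-based query translation: NOT the actual
# MSL/TSIMMIS implementation; template and query shapes are invented.

def match_template(query, templates):
    """Return (native_query, filter_predicates) for the first template
    whose predicates are a subset of the query's predicates."""
    for tmpl in templates:
        if tmpl["predicates"] <= query["predicates"]:
            native = tmpl["action"](query)           # implementor-provided action
            residue = query["predicates"] - tmpl["predicates"]
            return native, residue                   # residue becomes the filter query
    raise ValueError("no template matches the query")

# A template that supports only an equality lookup on 'author';
# any extra predicates are left for wrapper-side postprocessing.
templates = [{
    "predicates": {"author"},
    "action": lambda q: f"SELECT * FROM pubs WHERE author = '{q['bindings']['author']}'",
}]

query = {"predicates": {"author", "year"},
         "bindings": {"author": "Vassalos", "year": 1997}}

native, filter_preds = match_template(query, templates)
print(native)        # SELECT * FROM pubs WHERE author = 'Vassalos'
print(filter_preds)  # {'year'}
```

Here the residual predicate on `year` would be applied by the wrapper's postprocessing engine after the source returns its results, which is why the wrapper can answer a larger class of queries than the templates alone cover.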


International Conference on Management of Data | 1999

Query rewriting for semistructured data

Yannis Papakonstantinou; Vasilis Vassalos

We address the problem of query rewriting for TSL, a language for querying semistructured data. We develop and present an algorithm that, given a semistructured query q and a set of semistructured views V, finds rewriting queries, i.e., queries that access the views and produce the same result as q. Our algorithm is based on appropriately generalizing containment mappings, the chase, and query composition — techniques that were developed for structured, relational data. We also develop an algorithm for equivalence checking of TSL queries. We show that the algorithm is sound and complete for TSL, i.e., it always finds every non-trivial TSL rewriting query of q, and we discuss its complexity. We extend the rewriting algorithm to use some forms of structural constraints (such as DTDs) and find more opportunities for query rewriting.
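For flat conjunctive queries, the containment mappings that this paper generalizes to TSL can be checked by brute force. The sketch below is a toy under stated assumptions (relational atoms only, exhaustive search over variable images), not the paper's algorithm for semistructured data.

```python
# Toy containment-mapping check for conjunctive queries over flat
# relational atoms; the paper generalizes this notion to TSL.
from itertools import product

def contained_in(q1, q2):
    """True iff q1 ⊆ q2, witnessed by a containment mapping from q2 to q1.
    A query is (head_vars, body) with body a set of (predicate, args)."""
    head1, body1 = q1
    head2, body2 = q2
    vars2 = sorted({v for _, args in body2 for v in args})
    terms1 = sorted({v for _, args in body1 for v in args})
    for image in product(terms1, repeat=len(vars2)):
        m = dict(zip(vars2, image))
        maps_head = [m[v] for v in head2] == list(head1)
        maps_body = all((p, tuple(m[v] for v in args)) in body1
                        for p, args in body2)
        if maps_head and maps_body:
            return True
    return False

# q1(x) :- edge(x,y), edge(y,z)      q2(x) :- edge(x,y)
q1 = (("x",), {("edge", ("x", "y")), ("edge", ("y", "z"))})
q2 = (("x",), {("edge", ("x", "y"))})
print(contained_in(q1, q2))  # True: a path of length 2 implies one of length 1
print(contained_in(q2, q1))  # False
```

The exhaustive search over variable images makes the NP-hardness of containment visible; the paper's contribution is making such reasoning sound and complete for a semistructured language.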


International Conference on Management of Data | 1998

Capability based mediation in TSIMMIS

Chen Li; Ramana Yerneni; Vasilis Vassalos; Hector Garcia-Molina; Yannis Papakonstantinou; Jeffrey D. Ullman; Murty Valiveti

Conventional mediators focus their attention on the contents of the sources and their relationship to the integrated views provided to the users. They do not take into account the capabilities of sources to answer queries. This may lead them to generate plans involving source queries that cannot be answered by the sources. In the TSIMMIS system, we have developed a source-capability-sensitive plan generation module that constructs feasible plans for user queries in the presence of limited source capabilities.
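The idea of capability-sensitive planning can be illustrated with a toy mediator. Everything here (the capability model as sets of bindable attributes, the source names) is invented for the example; the actual TSIMMIS planner and capability descriptions are far richer.

```python
# Toy capability-sensitive planner, NOT the TSIMMIS module: each
# source advertises which attributes it can accept bound in a query;
# the mediator pushes supported predicates to a source and keeps the
# rest for mediator-side post-filtering.

def plan(query_bindings, sources):
    """Return (source, pushed_predicates, local_filter) or None if no
    source can answer any part of the query."""
    for name, caps in sources.items():
        supported = query_bindings & caps
        if supported:                        # feasible: push what we can
            return name, supported, query_bindings - supported
    return None                              # infeasible: no source fits

# Hypothetical sources and their supported bound attributes.
sources = {"bib_source": {"author"}, "full_text": {"title", "author"}}

print(plan({"author", "year"}, sources))
# ('bib_source', {'author'}, {'year'})
```

A capability-blind mediator would happily send the `year` predicate to `bib_source` and get an error back; checking capabilities first is what makes the generated plan feasible.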


International Conference on Management of Data | 2002

QURSED: querying and reporting semistructured data

Yannis Papakonstantinou; Michalis Petropoulos; Vasilis Vassalos

QURSED enables the development of web-based query forms and reports (QFRs) that query and report semistructured XML data, i.e., data that are characterized by nesting, irregularities and structural variance. The query aspects of a QFR are captured by its query set specification, which formally encodes multiple parameterized condition fragments and can describe large numbers of queries. The run-time component of QURSED produces XQuery-compliant queries by synthesizing fragments from the query set specification that have been activated during the interaction of the end-user with the QFR. The design-time component of QURSED, called QURSED Editor, semi-automates the development of the query set specification and its association with the visual components of the QFR by translating visual actions into appropriate query set specifications. We describe QURSED and illustrate how it accommodates the intricacies that the semistructured nature of the underlying database introduces. We specifically focus on the formal model of the query set specification, its generation via the QURSED Editor and its coupling with the visual aspects of the web-based form and report.


Journal of Logic Programming | 2000

Expressive Capabilities Description Languages and Query Rewriting Algorithms

Vasilis Vassalos; Yannis Papakonstantinou

Information integration systems have to cope with a wide variety of different information sources, which support query interfaces with very varied capabilities. To deal with this problem, the integration systems need descriptions of the query capabilities of each source, i.e., the set of queries supported by each source. Moreover, the integration systems need algorithms for deciding how a query can be answered given the capabilities of the sources. Finally, they need to translate a query into the format that the source understands. We present two languages suitable for descriptions of query capabilities of sources and compare their expressive power. We also use one of the languages to automatically derive the capabilities description of the integration system itself, in terms of the capabilities of the sources it integrates. We describe algorithms for deciding whether a query “matches” the description and show their application to the problem of translating user queries into source-specific queries and commands. We propose new, improved algorithms for the problem of answering queries using these descriptions. Finally, we identify an interesting class of source capability descriptions, for which our algorithms are much more efficient.


International World Wide Web Conferences | 2001

XML query forms (XQForms): declarative specification of XML query interfaces

Michalis Petropoulos; Vasilis Vassalos; Yannis Papakonstantinou

XQForms is the first generator of Web-based query forms and reports for XML data. XQForms takes as input (i) XML Schemas that model the data to be queried and presented, (ii) declarative specifications, called annotations, of the logic of the query forms and reports that will be generated, and (iii) a set of template presentation libraries. The output is a set of query forms and reports that provide automated query construction and report formatting so that end users can query and browse the underlying XML data. XQForms thus separates content (given by the XML Schema of the data), query form logic (specified by the annotations), and presentation of the forms and reports. The system architecture is modular and consists of four main components: (a) a collection of query form controls that incorporate query capabilities and allow parameter passing from the end users via the form page (a set of query form controls makes up a query form); (b) an annotation scheme for binding these controls to data elements of the XML Schema and for specifying their properties; (c) a compiler for creating the HTML representation of the query forms; and (d) a runtime engine that constructs and executes the queries against the XML data and renders the query results to create the reports.


International Conference on Data Engineering | 2011

Semi-Streamed Index Join for near-real time execution of ETL transformations

Mihaela A. Bornea; Antonios Deligiannakis; Yannis Kotidis; Vasilis Vassalos

Active data warehouses have emerged as a new business intelligence paradigm where data in the integrated repository is refreshed in near real-time. This shift of practices achieves higher consistency between the stored information and the latest updates, which in turn influences crucially the output of decision making processes. In this paper we focus on the changes required in the implementation of Extract Transform Load (ETL) operations which now need to be executed in an online fashion. In particular, the ETL transformations frequently include the join between an incoming stream of updates and a disk-resident table of historical data or metadata. In this context we propose a novel Semi-Streaming Index Join (SSIJ) algorithm that maximizes the throughput of the join by buffering stream tuples and then judiciously selecting how to best amortize expensive disk seeks for blocks of the stored relation among a large number of stream tuples. The relation blocks required for joining with the stream are loaded from disk based on an optimal plan. In order to maximize the utilization of the available memory space for performing the join, our technique incorporates a simple but effective cache replacement policy for managing the retrieved blocks of the relation. Moreover, SSIJ is able to adapt to changing characteristics of the stream (i.e. arrival rate, data distribution) by dynamically adjusting the allocated memory between the cached relation blocks and the stream. Our experiments with a variety of synthetic and real data sets demonstrate that SSIJ consistently outperforms the state-of-the-art algorithm in terms of the maximum sustainable throughput of the join while being also able to accommodate deadlines on stream tuple processing.
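A drastically simplified version of the batching-plus-caching idea can be sketched as follows. This is not the paper's SSIJ algorithm (no optimal block-fetch plan, no deadline handling, no adaptive memory split between cache and stream buffer); the block layout by `key % nblocks` and all sizes are invented for the illustration.

```python
# Toy semi-streamed index join in the spirit of SSIJ (NOT the paper's
# algorithm): buffer stream tuples, then join a whole batch against
# the disk relation through a small LRU block cache, so each expensive
# block fetch is amortized over many stream tuples.
from collections import OrderedDict

class SemiStreamedJoin:
    def __init__(self, relation_blocks, batch_size=4, cache_blocks=2):
        self.blocks = relation_blocks       # block_id -> {key: row}; simulates disk
        self.batch, self.batch_size = [], batch_size
        self.cache = OrderedDict()          # LRU cache of fetched blocks
        self.cache_blocks = cache_blocks
        self.fetches = 0                    # count simulated disk seeks

    def _fetch(self, block_id):
        if block_id in self.cache:
            self.cache.move_to_end(block_id)
        else:
            self.fetches += 1               # "disk seek"
            self.cache[block_id] = self.blocks[block_id]
            if len(self.cache) > self.cache_blocks:
                self.cache.popitem(last=False)  # evict least recently used
        return self.cache[block_id]

    def feed(self, tup):
        """Buffer a stream tuple; flush a full batch, returning join results."""
        self.batch.append(tup)
        return self.flush() if len(self.batch) >= self.batch_size else []

    def flush(self):
        out = []
        # Group buffered tuples by the block holding their join key
        # (assumed layout: block i stores keys with key % nblocks == i),
        # so every needed block is read at most once per batch.
        by_block = {}
        for key, payload in self.batch:
            by_block.setdefault(key % len(self.blocks), []).append((key, payload))
        for block_id, tuples in by_block.items():
            block = self._fetch(block_id)
            out += [(payload, block[key]) for key, payload in tuples if key in block]
        self.batch = []
        return out

blocks = {0: {0: "r0", 2: "r2"}, 1: {1: "r1", 3: "r3"}}
j = SemiStreamedJoin(blocks, batch_size=4)
results = []
for t in [(0, "a"), (2, "b"), (1, "c"), (0, "d")]:
    results += j.feed(t)
print(results)    # [('a', 'r0'), ('b', 'r2'), ('d', 'r0'), ('c', 'r1')]
print(j.fetches)  # 2 — one seek per distinct block, shared by the whole batch
```

Joining the four tuples one at a time could cost up to four seeks; batching reduces this to one seek per distinct block, which is the amortization SSIJ maximizes.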


Data and Knowledge Engineering | 2003

XML queries and algebra in the Enosys integration platform

Yannis Papakonstantinou; Vinayak R. Borkar; Maxim Orgiyan; Konstantinos Stathatos; Lucian Suta; Vasilis Vassalos; Pavel Velikhov

We describe the Enosys XML integration platform, focusing on the query language, algebra, and architecture of its query processor. The platform enables the development of eBusiness applications in customer relationship management, e-commerce, supply chain management, and decision support. These applications often require that data be integrated dynamically from multiple information sources. The Enosys platform allows one to build (virtual and/or materialized) integrated XML views of multiple sources, using XML queries as view definitions. During run-time, the application issues XML queries against the views. Queries and views are translated into the XCQL algebra and are combined into a single algebra expression/plan. Query plan composition and query plan decomposition challenges are faced in this process. Finally, the query processor lazily evaluates the result, using an appropriate adaptation of relational database iterator models to XML. The paper describes the platform architecture and components, the supported XML query language and the query processor architecture. It focuses on the underlying XML query algebra, which differs from the algebras that have been considered by W3C in that it is particularly tuned to semistructured data and to optimization and efficient evaluation in a system that follows the conventional architecture of database systems.


International Conference on Management of Data | 2010

TACO: tunable approximate computation of outliers in wireless sensor networks

Nikos Giatrakos; Yannis Kotidis; Antonios Deligiannakis; Vasilis Vassalos; Yannis Theodoridis

Wireless sensor networks are becoming increasingly popular for a variety of applications. Users are frequently faced with the surprising discovery that readings produced by the sensing elements of their motes are often contaminated with outliers. Outlier readings can severely affect applications that rely on timely and reliable sensory data in order to provide the desired functionality. As a consequence, there is a recent trend to explore how techniques that identify outlier values can be applied to sensory data cleaning. Unfortunately, most of these approaches incur an overwhelming communication overhead, which limits their practicality. In this paper we introduce an in-network outlier detection framework, based on locality sensitive hashing, extended with a novel boosting process as well as efficient load balancing and comparison pruning mechanisms. Our method trades off bandwidth for accuracy in a straightforward manner and supports many intuitive similarity metrics.
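The core locality-sensitive-hashing trade-off can be illustrated with random-hyperplane signatures. The sketch below is not the TACO protocol (no in-network boosting, load balancing, or comparison pruning); the idea shown is only that motes can exchange short bit signatures instead of raw reading windows, with signature length as the knob trading bandwidth for accuracy.

```python
# Toy random-hyperplane LSH, illustrating the bandwidth/accuracy
# trade-off behind TACO; NOT the paper's protocol. All data invented.
import random, math

def signature(vec, hyperplanes):
    """One bit per hyperplane: which side of it the vector falls on."""
    return [int(sum(h_i * v_i for h_i, v_i in zip(h, vec)) >= 0)
            for h in hyperplanes]

def estimated_cosine(sig_a, sig_b):
    """Hamming agreement between signatures estimates the angle."""
    agree = sum(a == b for a, b in zip(sig_a, sig_b))
    return math.cos(math.pi * (1 - agree / len(sig_a)))

random.seed(7)
dim, bits = 8, 256                        # more bits: more accuracy, more bandwidth
planes = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(bits)]

normal  = [20.0 + 0.1 * i for i in range(dim)]    # smooth temperature window
similar = [20.1 + 0.1 * i for i in range(dim)]    # neighboring mote, no outliers
outlier = [20.0, 35.0, 19.9, 36.2, 20.1, 34.8, 20.2, 35.5]  # spiky readings

s = signature(normal, planes)
print(estimated_cosine(s, signature(similar, planes)))  # close to 1
print(estimated_cosine(s, signature(outlier, planes)))  # noticeably lower
```

A mote transmitting the 256-bit signature instead of eight floating-point readings sends less data per comparison, and shrinking `bits` reduces bandwidth further at the cost of a noisier similarity estimate.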

Collaboration


Dive into Vasilis Vassalos's collaborations.

Top Co-Authors

Yannis Kotidis (Athens University of Economics and Business)
Tassos Venetis (Athens University of Economics and Business)
Haris Georgiadis (Athens University of Economics and Business)
Alin Deutsch (University of California)
Nantia Makrynioti (Athens University of Economics and Business)
Vassilis Stoumpos (National and Kapodistrian University of Athens)
Alex Delis (National and Kapodistrian University of Athens)