Michalis Petropoulos | Researchain

Publication

Featured research published by Michalis Petropoulos.

international conference on management of data | 2004

Industrial-strength schema matching

Philip A. Bernstein; Sergey Melnik; Michalis Petropoulos; Christoph Quix

Schema matching identifies elements of two given schemas that correspond to each other. Although there are many algorithms for schema matching, little has been written about building a system that can be used in practice. We describe our initial experience building such a system, a customizable schema matcher called Protoplasm.

international conference on management of data | 2002

QURSED: querying and reporting semistructured data

Yannis Papakonstantinou; Michalis Petropoulos; Vasilis Vassalos

QURSED enables the development of web-based query forms and reports (QFRs) that query and report semistructured XML data, i.e., data that are characterized by nesting, irregularities and structural variance. The query aspects of a QFR are captured by its query set specification, which formally encodes multiple parameterized condition fragments and can describe large numbers of queries. The run-time component of QURSED produces XQuery-compliant queries by synthesizing fragments from the query set specification that have been activated during the interaction of the end-user with the QFR. The design-time component of QURSED, called QURSED Editor, semi-automates the development of the query set specification and its association with the visual components of the QFR by translating visual actions into appropriate query set specifications. We describe QURSED and illustrate how it accommodates the intricacies that the semistructured nature of the underlying database introduces. We specifically focus on the formal model of the query set specification, its generation via the QURSED Editor and its coupling with the visual aspects of the web-based form and report.
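The run-time idea described above, assembling a query from only those condition fragments the end-user has activated, can be illustrated with a minimal sketch. This is not QURSED's query set specification model: the fragment templates, the `books.xml` document, the element names, and the `synthesize` function are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical parameterized condition fragments (not QURSED's actual format).
# A fragment is "activated" only when the user supplies its parameter.
FRAGMENTS: Dict[str, str] = {
    "max_price": "$b/price <= {value}",
    "min_year": "$b/year >= {value}",
}


def synthesize(params: Dict[str, int]) -> str:
    """Build an XQuery FLWOR expression from the activated fragments."""
    conditions: List[str] = [
        # int() keeps this sketch to numeric parameters only.
        FRAGMENTS[name].format(value=int(value))
        for name, value in sorted(params.items())
        if name in FRAGMENTS
    ]
    query = 'for $b in doc("books.xml")//book'
    if conditions:
        query += " where " + " and ".join(conditions)
    return query + " return $b"


only_price = synthesize({"max_price": 50})
both = synthesize({"min_year": 2000, "max_price": 50})
no_filter = synthesize({})
```

Fragments whose parameters were left empty simply do not appear in the generated query, which is how a single specification can stand for many distinct queries.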

conference on information and knowledge management | 2010

FACeTOR: cost-driven exploration of faceted query results

Abhijith Kashyap; Vagelis Hristidis; Michalis Petropoulos

Faceted navigation is increasingly employed as an effective technique for exploring large query results on structured databases. This technique for mitigating information overload leverages metadata of the query results to provide users with facet conditions that can be used to progressively refine the user's query and filter the query results. However, the number of facet conditions can be quite large, thereby increasing the burden on the user. We present the FACeTOR system, which proposes a cost-based approach to faceted navigation. At each step of the navigation, the user is presented with a subset of all possible facet conditions that are selected such that the overall expected navigation cost is minimized and every result is guaranteed to be reachable by a facet condition. We prove that the problem of selecting the optimal facet conditions at each navigation step is NP-Hard, and subsequently present two intuitive heuristics employed by FACeTOR. Our user study on Amazon Mechanical Turk shows that FACeTOR reduces the user navigation time compared to cutting-edge commercial and academic faceted search algorithms. The user study also confirms the validity of our cost model. We also present the results of an extensive experimental evaluation of the performance of the proposed approach using two real datasets. FACeTOR is available at http://db.cse.buffalo.edu/facetor/.
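The reachability requirement in this abstract (every result must be reachable through at least one displayed facet condition) resembles a set-cover constraint. The sketch below is a plain greedy cover, not either of the two FACeTOR heuristics, and it ignores the navigation cost model entirely; the function name, facet conditions, and result identifiers are hypothetical.

```python
from typing import Dict, List, Set


def greedy_facet_cover(conditions: Dict[str, Set[int]], results: Set[int]) -> List[str]:
    """Pick facet conditions until every result is covered by at least one."""
    uncovered = set(results)
    chosen: List[str] = []
    while uncovered:
        # Take the condition covering the most still-uncovered results;
        # iterating in sorted order makes tie-breaking deterministic.
        best = max(sorted(conditions), key=lambda c: len(conditions[c] & uncovered))
        gained = conditions[best] & uncovered
        if not gained:
            raise ValueError("some results are not reachable by any facet condition")
        chosen.append(best)
        uncovered -= gained
    return chosen


# Hypothetical query result (ids 1..5) and the results each condition matches.
conditions = {
    "brand=Acme": {1, 2, 3},
    "color=red": {3, 4},
    "price<100": {4, 5},
    "color=blue": {5},
}
shown = greedy_facet_cover(conditions, {1, 2, 3, 4, 5})
```

Here two of the four conditions suffice to keep all five results reachable. A cost-driven selection, as the abstract describes, would additionally weigh how expensive each choice is expected to be for the user.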

international conference on management of data | 2014

Orca: a modular query optimizer architecture for big data

Mohamed A. Soliman; Lyublena Antova; Venkatesh Raghavan; Amr El-Helw; Zhongxian Gu; Entong Shen; George Constantin Caragea; Carlos Garcia-Alvarado; Foyzur Rahman; Michalis Petropoulos; Florian Waas; Sivaramakrishnan Narayanan; Konstantinos Krikellas; Rhonda Baldwin

The performance of analytical query processing in data management systems depends primarily on the capabilities of the system's query optimizer. Increased data volumes and heightened interest in processing complex analytical queries have prompted Pivotal to build a new query optimizer. In this paper we present the architecture of Orca, the new query optimizer for all Pivotal data management products, including Pivotal Greenplum Database and Pivotal HAWQ. Orca is a comprehensive development uniting state-of-the-art query optimization technology with our own original research, resulting in a modular and portable optimizer architecture. In addition to describing the overall architecture, we highlight several unique features and present performance comparisons against other systems.

international world wide web conferences | 2001

XML query forms (XQForms): declarative specification of XML query interfaces

Michalis Petropoulos; Vasilis Vassalos; Yannis Papakonstantinou

XQForms is the first generator of Web-based query forms and reports for XML data. XQForms takes as input (i) XML Schemas that model the data to be queried and presented, (ii) declarative specifications, called annotations, of the logic of the query forms and reports that will be generated, and (iii) a set of template presentation libraries. The output is a set of query forms and reports that provide automated query construction and report formatting so that end users can query and browse the underlying XML data. Thus XQForms separates content (given by the XML Schema of the data), query form logic (specified by the annotations) and presentation of the forms and reports. The system architecture is modular and consists of four main components: (a) a collection of query form controls that incorporate query capabilities and allow parameter passing from the end users via the form page (a set of query form controls makes up a query form), (b) an annotation scheme for binding these controls to data elements of the XML Schema and for specifying their properties, (c) a compiler for creating the HTML representation of the query forms, and (d) a runtime engine that constructs and executes the queries against the XML data and renders the query results to create the reports. General Terms: Design, Standardization, Languages.

ACM Transactions on Internet Technology | 2005

Graphical query interfaces for semistructured data: the QURSED system

Michalis Petropoulos; Yannis Papakonstantinou; Vasilis Vassalos

We describe the QURSED system for the declarative specification and automatic generation of Web-based query forms and reports (QFRs) for semistructured XML data. In QURSED, a QFR is formally described by its query set specification (QSS) which captures the complex query and reporting capabilities of the QFR and the associations of the query set specification with visual elements that implement these capabilities on a Web page. The design-time component of QURSED, called QURSED Editor, semi-automates the development of the query set specification and its association with visual elements by translating intuitive visual actions taken by a developer into appropriate specification fragments. The run-time component of QURSED produces XQuery statements by synthesizing fragments from the query set specification that have been activated during the interaction of the end-user with the QFR and renders the query results in interactive reports as specified by the QSS. We describe the techniques and algorithms employed by QURSED with emphasis on how it accommodates the intricacies introduced by the semistructured nature of the underlying data. We present the formal model of the query set specification, as well as its generation via the QURSED Editor, and focus on the techniques and heuristics the Editor employs for translating visual designer input into meaningful specifications. We also present the algorithms QURSED employs for query generation and report generation. An online demonstration of the system is available at http://www.db.ucsd.edu/qursed/.

ACM Transactions on Database Systems | 2007

Exporting and interactively querying Web service-accessed sources: The CLIDE System

Michalis Petropoulos; Alin Deutsch; Yannis Papakonstantinou; Yannis Katsis

The CLIDE System assists the owners of sources that participate in Web service-based data publishing systems to publish a restricted set of parameterized queries over the schema of their sources and package them as WSDL services. The sources may be relational databases, which naturally have a schema, or ad hoc information/application systems for which the owner publishes a virtual schema. CLIDE allows information clients to pose queries over the published schema and utilizes prior work on answering queries using views to answer queries that can be processed by combining and processing the results of one or more Web service calls. These queries are called feasible. Contrary to prior work, where infeasible queries are rejected without explanatory feedback, leading the user into a frustrating trial-and-error cycle, CLIDE features a query formulation interface, which extends the QBE-like query builder of Microsoft's SQL Server with a color scheme that guides the user toward formulating feasible queries. CLIDE guarantees that the suggested query edit actions are complete (i.e., each feasible query can be built by following only suggestions), rapidly convergent (the suggestions are tuned to lead to the closest feasible completions of the query), and suitably summarized (at each interaction step, only a minimal number of actions needed to preserve completeness are suggested). We present the algorithms, implementation, and performance evaluation showing that CLIDE is a viable on-line tool.

international conference on data engineering | 2009

BioNav: Effective Navigation on Query Results of Biomedical Databases

Abhijith Kashyap; Vagelis Hristidis; Michalis Petropoulos; Sotiria Tavoulari

Search queries on biomedical databases like PubMed often return a large number of results, only a small subset of which is relevant to the user. Ranking and categorization, which can also be combined, have been proposed to alleviate this information overload problem. Results categorization for biomedical databases is the focus of this work. A natural way to organize biomedical citations is according to their MeSH annotations, a comprehensive concept hierarchy used by PubMed. In this paper, we present the BioNav system, a novel search interface that enables the user to navigate a large number of query results by organizing them using the MeSH concept hierarchy. First, the query results are organized into a navigation tree. Previous works expand the hierarchy in a predefined static manner. In contrast, BioNav uses an intuitive navigation cost model to decide what concepts to display at each step. Another difference from previous works is that the hierarchy is not strictly displayed level-by-level.

IEEE Transactions on Knowledge and Data Engineering | 2011

Effective Navigation of Query Results Based on Concept Hierarchies

Abhijith Kashyap; Vagelis Hristidis; Michalis Petropoulos; Sotiria Tavoulari

Search queries on biomedical databases, such as PubMed, often return a large number of results, only a small subset of which is relevant to the user. Ranking and categorization, which can also be combined, have been proposed to alleviate this information overload problem. Results categorization for biomedical databases is the focus of this work. A natural way to organize biomedical citations is according to their MeSH annotations. MeSH is a comprehensive concept hierarchy used by PubMed. In this paper, we present the BioNav system, a novel search interface that enables the user to navigate a large number of query results by organizing them using the MeSH concept hierarchy. First, the query results are organized into a navigation tree. At each node expansion step, BioNav reveals only a small subset of the concept nodes, selected such that the expected user navigation cost is minimized. In contrast, previous works expand the hierarchy in a predefined static manner, without navigation cost modeling. We show that the problem of selecting the best concepts to reveal at each node expansion is NP-complete and propose an efficient heuristic as well as a feasible optimal algorithm for relatively small trees. We show experimentally that BioNav outperforms state-of-the-art categorization systems by up to an order of magnitude, with respect to the user navigation cost. BioNav for the MEDLINE database is available at http://db.cse.buffalo.edu/bionav.

international conference on management of data | 2014

Optimizing queries over partitioned tables in MPP systems

Lyublena Antova; Amr El-Helw; Mohamed A. Soliman; Zhongxian Gu; Michalis Petropoulos; Florian Waas

Partitioning of tables based on value ranges provides a powerful mechanism to organize tables in database systems. In the context of data warehousing and large-scale data analysis, partitioned tables are of particular interest, as the nature of queries favors scanning large swaths of data. In this scenario, eliminating from a query plan those partitions that contain data not relevant to answering a given query can yield substantial performance improvements. Dealing with partitioned tables in query optimization has attracted significant attention recently, yet a number of challenges unique to Massively Parallel Processing (MPP) databases and their distributed nature remain unresolved. In this paper, we present optimization techniques for queries over partitioned tables as implemented in Pivotal Greenplum Database. We present a concise and unified representation for partitioned tables and devise optimization techniques to generate query plans that can defer decisions on accessing certain partitions to query run-time. We demonstrate that the resulting query plans distinctly outperform conventional query plans in a variety of scenarios.
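Partition elimination over range-partitioned tables can be illustrated with a small sketch. It covers only the simplest static case, where the predicate bounds are known at planning time, and does not model the deferred run-time partition selection the abstract describes; the `Partition` type, the `prune` function, and the quarterly layout are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Partition:
    name: str
    low: int   # inclusive lower bound of the partition key
    high: int  # exclusive upper bound of the partition key


def prune(partitions: List[Partition], lo: int, hi: int) -> List[str]:
    """Keep only partitions whose range overlaps the predicate lo <= key < hi."""
    return [p.name for p in partitions if p.low < hi and lo < p.high]


# Hypothetical table partitioned by month number into quarters.
quarters = [
    Partition("q1", 1, 4),
    Partition("q2", 4, 7),
    Partition("q3", 7, 10),
    Partition("q4", 10, 13),
]
# Predicate: 5 <= month < 8, so only q2 and q3 need to be scanned.
to_scan = prune(quarters, 5, 8)
```

When the bounds come from a join or a parameter rather than a constant, this decision cannot be made at planning time, which is the case the paper's run-time deferral addresses.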
