Publication


Featured research published by Medha Atre.


international world wide web conferences | 2010

Matrix "Bit" loaded: a scalable lightweight join query processor for RDF data

Medha Atre; Vineet Chaoji; Mohammed Javeed Zaki; James A. Hendler

The Semantic Web community has, until now, used traditional database systems for the storage and querying of RDF data. The SPARQL query language also closely follows SQL syntax. As a natural consequence, most SPARQL query processing techniques are based on database query processing and optimization techniques. For SPARQL join query optimization, previous works such as RDF-3X and Hexastore have proposed using 6-way indexes on the RDF data. Although these indexes speed up merge-joins by orders of magnitude, the scalability of the query processor remains a challenge for complex join queries that generate large intermediate join results. In this paper, we introduce (i) BitMat - a compressed bit-matrix structure for storing huge RDF graphs, and (ii) a novel, lightweight SPARQL join query processing method that employs an initial pruning technique, followed by a variable-binding-matching algorithm on BitMats to produce the final results. Our query processing method does not build intermediate join tables and works directly on the compressed data. We have demonstrated our method against RDF graphs of up to 1.33 billion triples - the largest among results published until now (single-node, non-parallel systems), and have compared our method with the state-of-the-art RDF stores - RDF-3X and MonetDB. Our results show that the competing methods are most effective with highly selective queries. On the other hand, BitMat can deliver 2-3 orders of magnitude better performance on complex, low-selectivity queries over massive data.
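As a rough illustration of the idea of pruning on bit vectors before producing any join results, the sketch below stores each predicate's triples as per-subject bit rows and intersects subject masks for two patterns that share a subject variable. It is a toy under stated assumptions, not the BitMat implementation: the integer IDs, the uncompressed Python-integer bit vectors, and the function names `build_bit_rows`, `subject_mask`, and `prune_on_shared_subject` are all hypothetical.

```python
from collections import defaultdict


def build_bit_rows(triples, predicate):
    """Map each subject ID to an int whose set bits mark its object IDs."""
    rows = defaultdict(int)
    for s, p, o in triples:
        if p == predicate:
            rows[s] |= 1 << o
    return dict(rows)


def subject_mask(rows):
    """Fold the subjects that have at least one triple into one bit vector."""
    mask = 0
    for s in rows:
        mask |= 1 << s
    return mask


def prune_on_shared_subject(rows_a, rows_b):
    """Drop subjects that cannot join in (?s p1 ?x . ?s p2 ?y)."""
    common = subject_mask(rows_a) & subject_mask(rows_b)

    def keep(rows):
        return {s: bits for s, bits in rows.items() if (common >> s) & 1}

    return keep(rows_a), keep(rows_b)


# Hypothetical data: (subject ID, predicate, object ID).
triples = [
    (0, "knows", 1),
    (0, "worksAt", 5),
    (1, "knows", 2),
    (2, "worksAt", 5),
    (3, "worksAt", 6),
]

knows = build_bit_rows(triples, "knows")
works_at = build_bit_rows(triples, "worksAt")
pruned_knows, pruned_works_at = prune_on_shared_subject(knows, works_at)
```

Only subject 0 appears in both patterns, so every other row is discarded with a single bitwise AND and no intermediate join table is materialized.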


international conference on management of data | 2015

Left Bit Right: For SPARQL Join Queries with OPTIONAL Patterns (Left-outer-joins)

Medha Atre

SPARQL basic graph pattern (BGP) (a.k.a. SQL inner-join) query optimization is a well-researched area. However, optimization of OPTIONAL pattern queries (a.k.a. SQL left-outer-joins) poses additional challenges, due to the restrictions on the reordering of left-outer-joins. Such queries can account for as much as 50% of the total queries (e.g., in DBpedia query logs). In this paper, we present Left Bit Right (LBR), a technique for well-designed nested BGP and OPTIONAL pattern queries. Through LBR, we propose a novel method to represent such queries using a graph of supernodes, which is used to aggressively prune the RDF triples with the help of compressed indexes. We also propose novel optimization strategies -- first of a kind, to the best of our knowledge -- that combine the characteristics of acyclicity of queries, minimality, and the nullification and best-match operators. In this paper, we focus on OPTIONAL patterns without UNIONs or FILTERs, but we also show how UNIONs and FILTERs can be handled with our technique using a query rewrite. Our evaluation on RDF graphs ranging up to more than one billion triples, on a commodity laptop with 8 GB memory, shows that LBR can process well-designed low-selectivity complex queries up to 11 times faster than state-of-the-art RDF column-stores such as Virtuoso and MonetDB, and that for highly selective queries LBR is on par with them.
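For readers unfamiliar with why OPTIONAL corresponds to a left-outer-join, the following minimal sketch shows the semantics over lists of variable bindings. It illustrates only the operator itself, not the LBR technique; the dictionary representation of bindings and the names `compatible` and `left_outer_join` are assumptions made for this example.

```python
def compatible(a, b):
    """Two bindings are compatible if they agree on every shared variable."""
    return all(b[k] == v for k, v in a.items() if k in b)


def left_outer_join(left, right):
    """Extend each left binding with matching right bindings.

    A left binding with no compatible partner is kept unchanged, which is
    what distinguishes OPTIONAL from an inner join.
    """
    out = []
    for l in left:
        matches = [{**l, **r} for r in right if compatible(l, r)]
        out.extend(matches if matches else [l])
    return out


# Hypothetical bindings for: ?s a Person . OPTIONAL { ?s mail ?mail }
people = [{"s": "alice"}, {"s": "bob"}]
mails = [{"s": "alice", "mail": "a@example.org"}]

result = left_outer_join(people, mails)
```

Here `bob` survives without a `mail` binding. Because unmatched left rows are preserved, swapping the operands changes the answer, which is the source of the reordering restrictions mentioned in the abstract.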


international conference on data engineering | 2010

Visualizing large-scale RDF data using Subsets, Summaries, and Sampling in Oracle

Seema Sundara; Medha Atre; Vladimir Kolovski; Souripriya Das; Zhe Wu; Eugene Inseok Chong; Jagannathan Srinivasan

The paper addresses the problem of visualizing large-scale RDF data via a 3-S approach, namely, by using 1) Subsets: to present only relevant data for visualization; both static and dynamic subsets can be specified, 2) Summaries: to capture the essence of the RDF data being viewed; summarized data can be expanded on demand, thereby allowing users to create hybrid (summary-detail) fisheye views of RDF data, and 3) Sampling: to further optimize visualization of large-scale data where a representative sample suffices. The visualization scheme works with both asserted and inferred triples (generated using RDF(S) and OWL semantics). This scheme is implemented in Oracle by developing a plug-in for the Cytoscape graph visualization tool, which uses functions defined in an Oracle PL/SQL package to provide fast and optimized access to the Oracle Semantic Store containing RDF data. Interactive visualization of a synthesized RDF data set (LUBM, 1 million triples), two native RDF datasets (Wikipedia, 47 million triples, and UniProt, 700 million triples), and an OWL ontology (eClassOwl, with a large class hierarchy including over 25,000 OWL classes, 5,000 properties, and 400,000 class-properties) demonstrates the effectiveness of our visualization scheme.


international world wide web conferences | 2016

For the DISTINCT Clause of SPARQL Queries

Medha Atre

Evaluating SPARQL queries with the DISTINCT clause may become memory intensive due to the requirement of additional auxiliary data structures, like hash-maps, to discard the duplicates. DISTINCT queries make up as much as 16% of all queries (e.g., on DBpedia), and thus are non-negligible. In this poster, we propose a novel method for such queries on acyclic basic graph patterns (BGPs) that works just by manipulating the compressed bit-vector indexes called BitMats.
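To give an intuition for how bit vectors can remove duplicates without a hash-map, the sketch below ORs together per-subject bit rows and reads the distinct object IDs off the set bits. This is a simplified illustration under assumptions, not the method proposed in the poster: it uses uncompressed Python integers, handles a single projected variable only, and `distinct_objects` is a hypothetical name.

```python
def distinct_objects(rows):
    """Return the sorted distinct object IDs across all bit rows.

    Duplicates collapse automatically because OR-ing a bit that is
    already set leaves the vector unchanged.
    """
    merged = 0
    for bits in rows.values():
        merged |= bits

    ids = []
    position = 0
    while merged:
        if merged & 1:
            ids.append(position)
        merged >>= 1
        position += 1
    return ids


# Hypothetical rows: subject ID -> bit vector of object IDs.
rows = {0: 0b0110, 1: 0b0100, 2: 0b1000}
objects = distinct_objects(rows)
```

Object 2 is reachable from two subjects but is reported once, and the only auxiliary state is one bit vector the width of the object ID space.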


Journal of Web Semantics | 2010

Invited paper: Scalable reduction of large datasets to interesting subsets

Gregory Todd Williams; Jesse Weaver; Medha Atre; James A. Hendler


international semantic web conference | 2008

BitMat: a main-memory bit matrix of RDF triples for conjunctive triple pattern queries

Medha Atre; Jagannathan Srinivasan; James A. Hendler


arXiv: Databases | 2013

A technique for SPARQL OPTIONAL (left-outer-join) queries

Medha Atre


Archive | 2011

Bit-by-bit: indexing and querying RDF data using compressed bit-vectors

James A. Hendler; Medha Atre


international conference data science and management | 2018

Efficient RDF dictionaries with B+ trees

Gurkirat Singh; Dhawal Upadhyay; Medha Atre


arXiv: Databases | 2018

Algorithms and Analysis for the SPARQL Constructs

Medha Atre

Collaboration


Top Co-Authors

James A. Hendler
Rensselaer Polytechnic Institute

Mohammed Javeed Zaki
Rensselaer Polytechnic Institute

Vineet Chaoji
Rensselaer Polytechnic Institute

Gregory Todd Williams
Rensselaer Polytechnic Institute

Jesse Weaver
Pacific Northwest National Laboratory