Adina Crainiceanu | Researchain

Publication

Featured research published by Adina Crainiceanu.

international workshop on the web and databases | 2004

Querying peer-to-peer networks using P-trees

Adina Crainiceanu; Prakash Linga; Johannes Gehrke; Jayavel Shanmugasundaram

We propose a new distributed, fault-tolerant peer-to-peer index structure called the P-tree. P-trees efficiently evaluate range queries in addition to equality queries.

international conference on management of data | 2007

P-ring: an efficient and robust P2P range index structure

Adina Crainiceanu; Prakash Linga; Ashwin Machanavajjhala; Johannes Gehrke; Jayavel Shanmugasundaram

Peer-to-peer systems have emerged as a robust, scalable and decentralized way to share and publish data. In this paper, we propose P-Ring, a new P2P index structure that supports both equality and range queries. P-Ring is fault-tolerant, provides logarithmic search performance even for highly skewed data distributions and efficiently supports large sets of data items per peer. We experimentally evaluate P-Ring using both simulations and a real distributed deployment on PlanetLab, and we compare its performance with Skip Graphs, Online Balancing and Chord.

Proceedings of the 1st International Workshop on Cloud Intelligence | 2012

Rya: a scalable RDF triple store for the clouds

Roshan Punnoose; Adina Crainiceanu; David Rapp

Resource Description Framework (RDF) was designed with the initial goal of developing metadata for the Internet. While the Internet is a conglomeration of many interconnected networks and computers, most of today's best RDF storage solutions are confined to a single node. Working on a single node has significant scalability issues, especially considering the magnitude of modern-day data. In this paper we introduce a scalable RDF data management system that uses Accumulo, a Google Bigtable variant. We introduce storage methods, indexing schemes, and query processing techniques that scale to billions of triples across multiple nodes, while providing fast and easy access to the data through conventional query mechanisms such as SPARQL. Our performance evaluation shows that in most cases, our system outperforms existing distributed RDF solutions, even systems much more complex than ours.

international world wide web conferences | 2004

P-tree: a p2p index for resource discovery applications

Adina Crainiceanu; Prakash Linga; Johannes Gehrke; Jayavel Shanmugasundaram

We propose a new distributed, fault-tolerant Peer-to-Peer index structure for resource discovery applications called the P-tree. P-trees efficiently support range queries in addition to equality queries.

international conference on management of data | 2005

Guaranteeing correctness and availability in P2P range indices

Prakash Linga; Adina Crainiceanu; Johannes Gehrke

New and emerging P2P applications require sophisticated range query capability and also have strict requirements on query correctness, system availability and item availability. While there has been recent work on developing new P2P range indices, none of these indices guarantee correctness and availability. In this paper, we develop new techniques that can provably guarantee the correctness and availability of P2P range indices. We develop our techniques in the context of a general P2P indexing framework that can be instantiated with most P2P index structures from the literature. As a specific instantiation, we implement P-Ring, an existing P2P range index, and show how it can be extended to guarantee correctness and availability. We quantitatively evaluate our techniques using a real distributed implementation.

Information Systems | 2015

SPARQL in the cloud using Rya

Roshan Punnoose; Adina Crainiceanu; David Rapp

SPARQL is the standard query language for Resource Description Framework (RDF) data. RDF was designed with the initial goal of developing metadata for the Internet. While the number and the size of the generated RDF datasets are continually increasing, most of today's best RDF storage solutions are confined to a single node. Working on a single node has significant scalability issues, especially considering the magnitude of modern-day data. In this paper we introduce Rya, a scalable RDF data management system that efficiently supports SPARQL queries. We introduce storage methods, indexing schemes, and query processing techniques that scale to billions of triples across multiple nodes, while providing fast and easy access to the data through conventional query mechanisms such as SPARQL. Our performance evaluation shows that in most cases, our system outperforms existing distributed RDF solutions, even systems much more complex than ours.

Highlights: We build a scalable RDF data management system in a cloud environment. Rya is based on the Accumulo columnar store and the OpenRDF Sesame framework. We used 3 indexed tables, SPO, POS, and OSP, with triple data stored in the row ID. We implemented performance enhancements to scale to billions of triples and millisecond query times for most queries. Rya provides fast and easy access to the data through SPARQL.
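The three-table layout mentioned in the highlights can be illustrated with a small in-memory sketch. This is not Rya's code or API: the class name TripleIndex, the NUL separator, and the use of sorted Python lists in place of Accumulo tables are all assumptions made for illustration. The point it shows is that with the SPO, POS, and OSP orderings, any combination of bound subject, predicate, and object forms a key prefix in at least one table, so every triple pattern becomes a prefix scan over sorted row IDs.

```python
from bisect import bisect_left

SEP = "\x00"  # assumed separator; terms are assumed not to contain it


class TripleIndex:
    """Toy stand-in for three sorted tables keyed by SPO, POS, and OSP."""

    def __init__(self):
        # Each "table" is a sorted list of row IDs; the whole triple lives in the key.
        self.tables = {"spo": [], "pos": [], "osp": []}

    def add(self, s, p, o):
        terms = {"s": s, "p": p, "o": o}
        for order, rows in self.tables.items():
            key = "".join(terms[c] + SEP for c in order)
            i = bisect_left(rows, key)
            if i == len(rows) or rows[i] != key:
                rows.insert(i, key)

    def _best_order(self, bound):
        # Pick the table whose key order starts with the most bound terms.
        def prefix_len(order):
            n = 0
            for c in order:
                if bound[c] is None:
                    break
                n += 1
            return n

        order = max(self.tables, key=prefix_len)
        return order, prefix_len(order)

    def match(self, s=None, p=None, o=None):
        """Return all (s, p, o) triples matching the pattern; None is a wildcard."""
        bound = {"s": s, "p": p, "o": o}
        order, n = self._best_order(bound)
        prefix = "".join(bound[c] + SEP for c in order[:n])
        rows = self.tables[order]
        out = []
        i = bisect_left(rows, prefix)
        # Range scan: all keys sharing the prefix are contiguous in sorted order.
        while i < len(rows) and rows[i].startswith(prefix):
            parts = dict(zip(order, rows[i].split(SEP)))
            out.append((parts["s"], parts["p"], parts["o"]))
            i += 1
        return out
```

For example, a pattern with only the predicate bound is served from the POS table, while a pattern with subject and object bound is served from OSP, where the object and subject together form the key prefix.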

ACM Transactions on Internet Technology | 2011

Load Balancing and Range Queries in P2P Systems Using P-Ring

Adina Crainiceanu; Prakash Linga; Ashwin Machanavajjhala; Johannes Gehrke; Jayavel Shanmugasundaram

In peer-to-peer (P2P) systems, computers from around the globe share data and can participate in distributed computation. P2P became famous, and infamous, due to file-sharing systems like Napster. However, the scalability and robustness of these systems make them appealing to a wide range of applications. This article introduces P-Ring, a new peer-to-peer index structure. P-Ring is fully distributed, fault tolerant, and provides load balancing and logarithmic search performance while supporting both equality and range queries. Our theoretical analysis as well as experimental results, obtained both in a simulated environment and on PlanetLab, show the performance of our system.

international conference on management of data | 2004

An indexing framework for peer-to-peer systems

Adina Crainiceanu; Prakash Linga; Ashwin Machanavajjhala; Johannes Gehrke; Jayavel Shanmugasundaram

Current peer-to-peer (P2P) indices are monolithic pieces of software that address only a subset of the desired functionality for P2P databases. For instance, Chord [6] provides reliability and scalability, but only supports equality queries. Skip Graphs [1] support equality and range queries, but only for one data item per peer. PePeR [4] supports equality and range queries over multiple data items per peer, but does not provide any search or reliability guarantees in the face of multiple failures. Galanis et al. [5] describe an index structure for locating XML documents, but this index does not provide any provable guarantees on size and performance. In a P2P database system, all of the above functionality is required, but none of the existing systems supports it. We devise a modularized indexing framework that cleanly separates different functional components. This allows us to reuse existing algorithms rather than implement everything anew and to experiment with different implementations for the same component so that we can clearly evaluate and quantify the benefits of a particular implementation. Our indexing framework has the following components:
1. Fault-tolerant Torus: Provides fault-tolerant connectivity among peers.
2. Data Store: Stores actual data and provides methods for reliably exchanging data items between peers.
3. Replication Manager: Ensures data items are stored reliably even in the face of peer failures.
4. Content Router: Allows efficient location of data items.

Proceedings of the 2nd International Workshop on Cloud Intelligence | 2013

Bloofi: a hierarchical Bloom filter index with applications to distributed data provenance

Adina Crainiceanu

Bloom filters are probabilistic data structures that have been successfully used for approximate membership problems in many areas of Computer Science (networking, distributed systems, databases, etc.). With the huge increase in data size and distribution of data, problems arise where a large number of Bloom filters are available, and all the Bloom filters need to be searched for potential matches. As an example, in a federated cloud environment, with hundreds of geographically distributed clouds participating in the federation, information needs to be shared by the semi-autonomous cloud providers. Each cloud provider could encode the information using Bloom filters and share the Bloom filters with a central coordinator. The problem of interest is not only whether a given object is in any of the sets represented by the Bloom filters, but which of the existing sets contain the given object. This problem cannot be solved by just constructing a Bloom filter on the union of all the sets. We propose Bloofi, a hierarchical index structure for Bloom filters that speeds up the search process and can be efficiently constructed and maintained. We apply our index structure to the problem of determining the complete data provenance graph in a geographically distributed setting. Our theoretical and experimental results show that Bloofi provides a scalable and efficient solution for searching through a large number of Bloom filters.
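The hierarchical idea in this abstract can be sketched in a few lines. The code below is an illustration under stated assumptions, not the published Bloofi implementation: the filter size M, hash count K, the SHA-256 based hashing, the fixed fanout, and the one-shot bottom-up build are all choices made here, and the insert, delete, and update maintenance the paper refers to is omitted. Each internal node holds the bitwise OR of its children, so a subtree is skipped whenever its node filter rejects the item.

```python
import hashlib

M = 256  # bits per filter (assumed)
K = 3    # hash functions per item (assumed)


def bit_positions(item):
    """Derive K bit positions for an item from one SHA-256 digest."""
    digest = hashlib.sha256(item.encode()).digest()
    return [int.from_bytes(digest[4 * i:4 * i + 4], "big") % M for i in range(K)]


def make_filter(items):
    """Build a Bloom filter, stored as a Python int used as a bitmask."""
    bits = 0
    for item in items:
        for pos in bit_positions(item):
            bits |= 1 << pos
    return bits


def might_contain(bits, item):
    return all((bits >> pos) & 1 for pos in bit_positions(item))


class Node:
    def __init__(self, bits, children=(), set_id=None):
        self.bits = bits
        self.children = list(children)
        self.set_id = set_id  # only leaves carry a set identifier


def build(filters, fanout=2):
    """Build the tree bottom-up from a non-empty {set_id: filter_bits} mapping."""
    level = [Node(bits, set_id=sid) for sid, bits in filters.items()]
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), fanout):
            group = level[i:i + fanout]
            union = 0
            for child in group:
                union |= child.bits  # parent filter is the OR of its children
            parents.append(Node(union, children=group))
        level = parents
    return level[0]


def search(node, item):
    """Return IDs of all leaf sets whose filter may contain the item."""
    if not might_contain(node.bits, item):
        return []  # prune the whole subtree
    if node.set_id is not None:
        return [node.set_id]
    found = []
    for child in node.children:
        found.extend(search(child, item))
    return found
```

As with any Bloom filter, the result can include false positives but never misses a set that actually contains the item.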

Information Systems | 2015

Bloofi: Multidimensional Bloom Filters

Adina Crainiceanu; Daniel Lemire

Bloom filters are probabilistic data structures commonly used for approximate membership problems in many areas of Computer Science (networking, distributed systems, databases, etc.). With the increase in data size and distribution of data, problems arise where a large number of Bloom filters are available, and all of them need to be searched for potential matches. As an example, in a federated cloud environment, each cloud provider could encode the information using Bloom filters and share the Bloom filters with a central coordinator. The problem of interest is not only whether a given element is in any of the sets represented by the Bloom filters, but which of the existing sets contain the given element. This problem cannot be solved by just constructing a Bloom filter on the union of all the sets. Instead, we effectively have a multidimensional Bloom filter problem: given an element, we wish to receive a list of candidate sets where the element might be. To solve this problem, we consider 3 alternatives. Firstly, we can naively check many Bloom filters. Secondly, we propose to organize the Bloom filters in a hierarchical index structure akin to a B+ tree, which we call Bloofi. Finally, we propose another data structure that packs the Bloom filters in such a way as to exploit bit-level parallelism, which we call Flat-Bloofi. Our theoretical and experimental results show that Bloofi and Flat-Bloofi provide scalable and efficient solutions for searching through a large number of Bloom filters.
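One way to read the bit-level parallelism mentioned for Flat-Bloofi is a bit-sliced layout, sketched below. This is an interpretation for illustration, not the authors' code: the class name FlatIndex, the parameters M and K, the SHA-256 based hashing, and the use of arbitrary-size Python integers in place of fixed-width machine words are all assumptions. For every bit position the index keeps one integer whose bit j records whether filter j has that position set, so ANDing K such integers tests all filters at once.

```python
import hashlib

M = 256  # bits per filter (assumed)
K = 3    # hash functions per item (assumed)


def bit_positions(item):
    """Derive K bit positions for an item from one SHA-256 digest."""
    digest = hashlib.sha256(item.encode()).digest()
    return [int.from_bytes(digest[4 * i:4 * i + 4], "big") % M for i in range(K)]


class FlatIndex:
    """Bit-sliced collection of Bloom filters."""

    def __init__(self):
        # columns[pos] has bit j set iff filter j has bit `pos` set.
        self.columns = [0] * M
        self.ids = []

    def add_set(self, set_id, items):
        j = len(self.ids)  # slot of the new filter
        self.ids.append(set_id)
        for item in items:
            for pos in bit_positions(item):
                self.columns[pos] |= 1 << j

    def search(self, item):
        """Return IDs of all sets whose filter may contain the item."""
        hits = (1 << len(self.ids)) - 1  # start with every filter as a candidate
        for pos in bit_positions(item):
            hits &= self.columns[pos]    # one AND checks this bit in all filters
        return [sid for j, sid in enumerate(self.ids) if (hits >> j) & 1]
```

A query touches only K integers regardless of how many filters are stored, at the cost of making it more work to remove or resize an individual filter.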
