Akon Dey | Researchain

Publication

Featured research published by Akon Dey.

International Conference on Data Engineering | 2014

YCSB+T: Benchmarking web-scale transactional databases

Akon Dey; Alan Fekete; Raghunath Nambiar; Uwe Röhm

Database system benchmarks like TPC-C and TPC-E focus on emulating database applications to compare different DBMS implementations. These benchmarks use carefully constructed queries executed within the context of transactions to exercise specific RDBMS features, and measure the throughput achieved. Cloud services benchmark frameworks like YCSB, on the other hand, are designed for performance evaluation of distributed NoSQL key-value stores, early examples of which did not support transactions, and so the benchmarks use single operations that are not inside transactions. Recent implementations of web-scale distributed NoSQL systems, like Spanner and Percolator, offer transaction features to cater to new web-scale applications. This has exposed a gap in standard benchmarks. We identify the issues that need to be addressed when evaluating transaction support in NoSQL databases. We describe YCSB+T, an extension of YCSB that wraps database operations within transactions. In this framework, we include a validation stage to detect and quantify database anomalies resulting from any workload, and we gather metrics that measure transactional overhead. We have designed a specific workload called Closed Economy Workload (CEW), which can run within the YCSB+T framework. We share our experience with using CEW to evaluate some NoSQL systems.
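
As a rough illustration of the validation idea, the Python sketch below models a closed economy in which transfers only move money between accounts, so the total balance should never change. Everything in it is an assumption made for illustration: the function names, the in-memory dict standing in for a data store, the `lose_update` flag that simulates a non-transactional failure, and the score formula (drift in the total divided by the number of operations) are not taken from the YCSB+T code base.

```python
import random


def make_accounts(n, initial_balance):
    """Create n accounts, each holding the same starting balance."""
    return {f"user{i}": initial_balance for i in range(n)}


def transfer(accounts, src, dst, amount, lose_update=False):
    """Move `amount` from src to dst.

    lose_update=True simulates a store without transactions that applies
    the debit but drops the credit (an assumed failure mode).
    """
    if accounts[src] < amount:
        return False
    accounts[src] -= amount
    if not lose_update:
        accounts[dst] += amount
    return True


def run_transfers(accounts, operations, anomaly_rate, seed=0):
    """Run random one-unit transfers, corrupting a fraction of them."""
    rng = random.Random(seed)
    keys = list(accounts)
    for _ in range(operations):
        src, dst = rng.sample(keys, 2)
        transfer(accounts, src, dst, 1, lose_update=rng.random() < anomaly_rate)


def anomaly_score(expected_total, accounts, operations):
    """Validation stage: drift of the total balance per executed operation."""
    return abs(expected_total - sum(accounts.values())) / operations


if __name__ == "__main__":
    accounts = make_accounts(10, 100)
    run_transfers(accounts, operations=1000, anomaly_rate=0.05)
    print("anomaly score:", anomaly_score(1000, accounts, 1000))
```

A score of zero means the invariant held for the whole run; any positive value indicates that some operations were not applied atomically.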

Technology Conference on Performance Evaluation and Benchmarking | 2014

Introducing TPCx-HS: The First Industry Standard for Benchmarking Big Data Systems

Raghunath Nambiar; Meikel Poess; Akon Dey; Paul Cao; Tariq Magdon-Ismail; Da Qi Ren; Andrew Bond

The designation Big Data has become a mainstream buzz phrase across many industries as well as research circles. Today many companies are making performance claims that are not easily verifiable and comparable in the absence of a neutral industry benchmark. Instead, one of the test suites used to compare the performance of Hadoop-based Big Data systems is TeraSort. While it nicely defines the data set and tasks to measure Big Data Hadoop systems, it lacks a formal specification and enforcement rules that enable the comparison of results across systems. In this paper we introduce TPCx-HS, the industry’s first standard benchmark, designed to stress both hardware and software that is based on Apache HDFS API compatible distributions. TPCx-HS extends the workload defined in TeraSort with formal rules for implementation, execution, metric, result verification, publication and pricing. It can be used to assess a broad range of system topologies and implementation methodologies of Big Data Hadoop systems in a technically rigorous, directly comparable, and vendor-neutral manner.

Very Large Data Bases | 2013

Scalable transactions across heterogeneous NoSQL key-value data stores

Akon Dey; Alan Fekete; Uwe Röhm

Many cloud systems provide data stores with limited features; in particular, they may not provide transactions, or else restrict transactions to a single item. We propose an approach that gives multi-item transactions across heterogeneous data stores, using only a minimal set of features from each store, such as single-item consistency, conditional update, and the ability to include extra metadata within a value. We offer a client-coordinated transaction protocol that does not need a central coordinating infrastructure. A prototype implementation has been built as a Java library and measured with an extension of the YCSB benchmark to exercise multi-item transactions.
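
The conditional-update feature mentioned above can be pictured as a versioned compare-and-set. The Python sketch below is not the paper's transaction protocol: it uses a hypothetical in-memory `VersionedStore` class (all names are assumptions) and shows only how a client might keep a version tag as metadata next to each value and have a write rejected when the item changed after it was read.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class Record:
    value: Any
    version: int  # metadata stored alongside the value


class VersionedStore:
    """Hypothetical in-memory stand-in for a store with conditional update."""

    def __init__(self) -> None:
        self._data: Dict[str, Record] = {}

    def get(self, key: str) -> Optional[Record]:
        """Return a copy of the record, or None if the key is absent."""
        rec = self._data.get(key)
        return None if rec is None else Record(rec.value, rec.version)

    def put_if_version(self, key: str, value: Any, expected_version: int) -> bool:
        """Write only if the stored version matches; version 0 means 'absent'."""
        current = self._data.get(key)
        current_version = 0 if current is None else current.version
        if current_version != expected_version:
            return False  # someone else updated the item first
        self._data[key] = Record(value, current_version + 1)
        return True


def increment(store: VersionedStore, key: str, retries: int = 5) -> bool:
    """Client-side read-modify-write loop built on the conditional update."""
    for _ in range(retries):
        rec = store.get(key)
        value, version = (0, 0) if rec is None else (rec.value, rec.version)
        if store.put_if_version(key, value + 1, version):
            return True
    return False
```

In this sketch the client, not the store, decides when to retry, which mirrors the client-coordinated flavor of the approach in a very reduced form.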

Technology Conference on Performance Evaluation and Benchmarking | 2014

Towards an Extensible Middleware for Database Benchmarking

David Bermbach; Jörn Kuhlenkamp; Akon Dey; Sherif Sakr; Raghunath Nambiar

Today’s database benchmarks are designed to evaluate a particular type of database. Furthermore, popular benchmarks, like those from TPC, come without a ready-to-use implementation, requiring database benchmark users to implement the benchmarking tool from scratch. As a result, there is no single framework that can be used to compare arbitrary database systems. The primary reason for this, among others, is the complexity of designing and implementing distributed benchmarking tools.

International Conference on Data Engineering | 2015

Metadata-as-a-Service

Akon Dey; Gajanan S. Chinchwadkar; Alan Fekete

We present a vision of a technology- and domain-agnostic service that will store metadata describing properties of the diverse data sets in an enterprise (or across several enterprises) that are spread among heterogeneous stores, such as relational databases, data warehouses, NoSQL or NewSQL cloud storage platforms, etc. The Metadata-as-a-Service will allow search over the metadata, so users and applications can find useful data sets, whether those are raw data or derived data. We make a preliminary proposal for the high-level architecture and API of such a service.

International Conference on Data Engineering | 2015

Scalable distributed transactions across heterogeneous stores

Akon Dey; Alan Fekete; Uwe Röhm

Typical cloud computing systems provide highly scalable and fault-tolerant data stores that may sacrifice other features like general multi-item transaction support. Recent techniques to implement multi-item transactions in these types of systems have focused on transactions across homogeneous data stores. Since applications access data in heterogeneous storage systems for legacy or interoperability reasons, we propose an approach that enables multi-item transactions with snapshot isolation across multiple heterogeneous data stores using only a minimal set of commonly implemented features, such as single-item consistency, conditional updates, and the ability to store additional metadata. We define a client-coordinated transaction commitment protocol that does not rely on a central coordinating infrastructure. The application can take advantage of the scalability and fault-tolerance characteristics of modern key-value stores and access existing data in them, and also have multi-item transactional access guarantees with little performance impact. We have implemented our design in a Java library called Cherry Garcia (CG) that supports data store abstractions to Windows Azure Storage (WAS), Google Cloud Storage (GCS) and our own high-performance key-value store called Tora.

IEEE International Conference on Cloud Engineering | 2015

REST+T: Scalable Transactions over HTTP

Akon Dey; Alan Fekete; Uwe Röhm

RESTful APIs are widely adopted in designing components that are combined to form web information systems. The use of REST is growing with the inclusion of smart devices and the Internet of Things within the scope of web information systems, along with large-scale distributed NoSQL data stores and other web-based and cloud-hosted services. There is an important subclass of web information systems and distributed applications which would benefit from stronger transactional support, as typically found in traditional enterprise systems. In this paper, we propose REST+T (REST with Transactions), a transactional RESTful data access protocol and API that extends HTTP to provide multi-item transactional access to data and state information across heterogeneous systems. We describe a case study called Tora, where we provide access through REST+T to an existing key-value store (WiredTiger) that was intended for embedded operation.

International Conference on Data Engineering | 2014

Curracurrong cloud: Stream processing in the cloud

Vasvi Kakkad; Akon Dey; Alan Fekete; Bernhard Scholz

The dominant model for computing with large-scale data in cloud environments has been founded on batch processing, including the Map-Reduce model. Important use-cases such as monitoring and alerting in the cloud require instead the incremental and continual handling of new data. Thus, recent systems such as Storm, Samza and S4 have adapted ideas from stream processing to the cloud environment. We describe a novel system, Curracurrong Cloud, that, for the first time, allows the computation and data origins to share a cloud-hosted cluster, offers a lightweight algebraic-style description of the processing pipeline, and supports automated placement of computation among compute resources.

International Conference on Service Oriented Computing | 2017

BenchFoundry: A Benchmarking Framework for Cloud Storage Services

David Bermbach; Jörn Kuhlenkamp; Akon Dey; Alan Fekete; Stefan Tai

Understanding quality of services in general, and of cloud storage services in particular, is often crucial. Previous proposals to benchmark storage services are too restricted to cover the full variety of NoSQL stores, or else too simplistic to capture properties of use by realistic applications; they also typically measure only one facet of the complex tradeoffs between different qualities of service. In this paper, we present BenchFoundry, which is not a benchmark itself but rather a benchmarking framework that can execute arbitrary application-driven benchmark workloads in a distributed deployment while measuring multiple qualities at the same time. BenchFoundry can be used or extended for every kind of storage service. Specifically, BenchFoundry is the first system where workload specifications become mere configuration files instead of code. In our design, we have put special emphasis on ease of use and deterministic repeatability of benchmark runs, which is achieved through a trace-based workload model.

Archive | 2015