Anil Kumar Goel | Researchain

Publication

Featured researches published by Anil Kumar Goel.

international conference on data engineering | 2007

SQL Anywhere: A Holistic Approach to Database Self-management

Ivan T. Bowman; Peter Bumbulis; Dan Farrar; Anil Kumar Goel; Brendan Lucier; Anisoara Nica; G. N. Paulley; John Smirnios; Matthew Young-Lai

In this paper we present an overview of the self-management features of SQL Anywhere, a full-function relational database system designed for frontline business environments with minimal administration. SQL Anywhere can serve as a high-performance workgroup server, an embedded database that is installed along with an application, or as a mobile database installed on a handheld device that provides full database services, including two-way synchronization, to applications when the device is disconnected from the corporate intranet. We illustrate how the various self-management features work in concert to provide a robust data management solution in zero-administration environments.

very large data bases | 2013

SAP HANA: the evolution from a modern main-memory data platform to an enterprise application platform

Vishal Sikka; Franz Färber; Anil Kumar Goel; Wolfgang Lehner

SAP HANA is a pioneering data platform, and one of the best performing, designed from the ground up to heavily exploit modern hardware capabilities, including SIMD, and large memory and CPU footprints. As a comprehensive data management solution, SAP HANA supports the complete data life cycle encompassing modeling, provisioning, and consumption. This extended abstract outlines the vision and planned next step of the SAP HANA evolution, growing from a core data platform into an innovative enterprise application platform as the foundation for current as well as novel business applications in both on-premise and on-demand scenarios. We argue that only a holistic system design rigorously applying co-design at different levels may yield a highly optimized and sustainable platform for modern enterprise applications.

very large data bases | 2015

Towards scalable real-time analytics: an architecture for scale-out of OLxP workloads

Anil Kumar Goel; Jeffrey Pound; Nathan Auch; Peter Bumbulis; Scott Maclean; Franz Färber; Francis Gropengiesser; Christian Mathis; Thomas Bodner; Wolfgang Lehner

We present an overview of our work on the SAP HANA Scale-out Extension, a novel distributed database architecture designed to support large scale analytics over real-time data. This platform permits high performance OLAP with massive scale-out capabilities, while concurrently allowing OLTP workloads. This dual capability enables analytics over real-time changing data and allows fine grained user-specified service level agreements (SLAs) on data freshness. We advocate the decoupling of core database components such as query processing, concurrency control, and persistence, a design choice made possible by advances in high-throughput low-latency networks and storage devices. We provide full ACID guarantees and build on a logical timestamp mechanism to provide MVCC-based snapshot isolation, while not requiring synchronous updates of replicas. Instead, we use asynchronous update propagation guaranteeing consistency with timestamp validation. We provide a view into the design and development of a large scale data management platform for real-time analytics, driven by the needs of modern enterprise customers.
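The combination of MVCC snapshot reads and asynchronously updated replicas described above can be illustrated with a minimal sketch. It is not the SAP HANA Scale-out Extension protocol: the class `Replica`, the exception `StaleReplicaError`, and the rule "serve a read only if the replica has applied updates up to the snapshot timestamp" are assumptions chosen to show one way timestamp validation can keep asynchronous replicas consistent.

```python
from bisect import bisect_right


class StaleReplicaError(Exception):
    """Raised when a replica has not yet applied updates up to the snapshot."""


class Replica:
    """Hypothetical replica holding multiple versions per key (MVCC)."""

    def __init__(self):
        self.versions = {}   # key -> (commit timestamps, values), both ascending
        self.applied_ts = 0  # highest commit timestamp applied so far

    def apply(self, commit_ts, key, value):
        # Assumption: updates are propagated in commit-timestamp order.
        timestamps, values = self.versions.setdefault(key, ([], []))
        timestamps.append(commit_ts)
        values.append(value)
        self.applied_ts = commit_ts

    def read(self, key, snapshot_ts):
        # Timestamp validation: refuse to answer for a snapshot this
        # replica has not caught up to yet.
        if snapshot_ts > self.applied_ts:
            raise StaleReplicaError(f"applied {self.applied_ts} < {snapshot_ts}")
        timestamps, values = self.versions.get(key, ([], []))
        # Latest version whose commit timestamp is <= the snapshot timestamp.
        i = bisect_right(timestamps, snapshot_ts)
        return values[i - 1] if i else None
```

A reader that can tolerate slightly older data would pick an older snapshot timestamp, which is one way to picture a user-specified freshness SLA.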

mobile data management | 2014

Supporting Location-Based Services in a Main-Memory Database

Suprio Ray; Rolando Blanco; Anil Kumar Goel

With the proliferation of mobile devices and explosive growth of spatio-temporal data, Location-Based Services (LBS) have become an indispensable technology in our daily lives. The key characteristics of the LBS applications include a high rate of time-stamped location updates, and many concurrent historical, present and predictive queries. The commercial providers of LBS must support all three kinds of queries and address the high update rates. While they employ relational databases for this purpose, traditional databases are unable to cope with the growing demands of many LBS systems. Support for spatio-temporal indexes within these databases is limited to R-tree based approaches. Although a number of advanced spatio-temporal indexes have been proposed by the research community, only a few of them support historical queries. These indexing techniques, with support for historical queries, are unable to sustain the high update and query throughput typical in LBS. Technological trends involving increasingly large main memory and core footprints offer opportunities to address some of these issues. We present several key ideas to support high performance commercial LBS by exploiting in-memory database techniques. Taking advantage of the very large memory available in modern machines, our system maintains the location data and index for the past N days in memory. Older data and index are kept on disk. We propose an in-memory storage organization for high insert performance. We also introduce a novel spatio-temporal index that maintains partial temporal indexes in a versioned-grid structure. The partial temporal indexes are organized as compressed bitmaps. With extensive evaluation, we demonstrate that our system supports high insert and query throughputs and that it outperforms the leading LBS system by a significant margin.

international conference on data engineering | 2007

Towards Adaptive Costing of Database Access Methods

Ye Qin; Kenneth Salem; Anil Kumar Goel

Most database query optimizers use cost models to identify good query execution plans. Inaccuracies in the cost models can cause query optimizers to select poor plans. In this paper, we consider the problem of accurately estimating the I/O costs of database access methods, such as index scans. We present some experimental results which show that existing analytical I/O cost models can be very inaccurate. We also present a simple analysis which shows that larger cost estimation errors can cause the query optimizer to make larger mistakes in plan selection. We propose the use of an adaptive black-box statistical cost estimation methodology to achieve better estimates.
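As an illustration of what a black-box statistical cost estimator can look like, the sketch below fits a straight line to observed (pages accessed, measured I/O cost) pairs and updates the fit as new observations arrive. The abstract does not say which statistical model the authors use; ordinary least squares, the class name `AdaptiveCostModel`, and the single input feature are assumptions made for this example.

```python
class AdaptiveCostModel:
    """Hypothetical estimator: cost ~ intercept + slope * pages."""

    def __init__(self):
        # Running sums are enough to refit after every observation.
        self.n = 0
        self.sum_x = self.sum_y = self.sum_xx = self.sum_xy = 0.0

    def observe(self, pages, measured_cost):
        """Record the measured I/O cost of one executed access."""
        self.n += 1
        self.sum_x += pages
        self.sum_y += measured_cost
        self.sum_xx += pages * pages
        self.sum_xy += pages * measured_cost

    def estimate(self, pages):
        """Predict the I/O cost of an access that touches `pages` pages."""
        if self.n == 0:
            raise ValueError("no observations yet")
        denom = self.n * self.sum_xx - self.sum_x ** 2
        if denom == 0:
            # All observations share one x value: fall back to the mean cost.
            return self.sum_y / self.n
        slope = (self.n * self.sum_xy - self.sum_x * self.sum_y) / denom
        intercept = (self.sum_y - slope * self.sum_x) / self.n
        return intercept + slope * pages
```

The estimator treats the access method as a black box: it never models buffer pools or disk geometry, only the relationship between inputs and measured costs.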

international workshop on geostreaming | 2013

Enhanced database support for location-based services

Suprio Ray; Rolando Blanco; Anil Kumar Goel

The ubiquity of GPS-enabled mobile devices and sensors has led to the explosive growth of time-stamped location data. Consequently, Location-Based Services (LBS) have become a popular technology impacting various aspects of our lives. LBS applications are characterized by a very high rate of location record updates, and many concurrent historic, present and predictive queries. Commercial LBS providers rely on relational databases to manage their data. However, traditional relational databases do not provide adequate support to meet the growing demands of many LBS systems. Moreover, existing indexing techniques that support historical queries are unable to sustain the high update and query throughput required by many LBS applications. To address this, we propose to exploit in-memory database techniques and present a few key ideas to support high performance commercial LBS. We also introduce a novel in-memory spatio-temporal index in which the spatial domain is organized as grid cells and, for each grid cell, partial temporal indexes are maintained for moving objects that visited the cell. The partial temporal indexes are implemented as compressed bitmaps. Using fast bitmap operations and utilizing the parallelism offered by multi-core systems, our system offers significantly better performance than traditional relational databases.
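The grid-plus-bitmap idea can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' index: Python integers stand in for the compressed bitmaps, time is discretized into integer slots, and the names `GridBitmapIndex`, `insert`, and `query` are hypothetical.

```python
from collections import defaultdict


class GridBitmapIndex:
    """Hypothetical index: grid cell -> object id -> bitmap of time slots."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        # Bit t of the bitmap is set if the object was seen in the cell at slot t.
        self.cells = defaultdict(lambda: defaultdict(int))

    def _cell(self, x, y):
        return (int(x // self.cell_size), int(y // self.cell_size))

    def insert(self, obj_id, x, y, t):
        """Record that obj_id was at (x, y) during time slot t."""
        self.cells[self._cell(x, y)][obj_id] |= 1 << t

    def query(self, x, y, t_start, t_end):
        """Objects seen in the cell containing (x, y) during [t_start, t_end]."""
        # Mask with ones in bit positions t_start..t_end inclusive.
        mask = ((1 << (t_end - t_start + 1)) - 1) << t_start
        cell = self.cells.get(self._cell(x, y), {})
        # One bitwise AND per object answers the temporal predicate.
        return sorted(o for o, bits in cell.items() if bits & mask)
```

Because each cell only indexes the objects that visited it, a historical query touches the bitmaps of one cell rather than the full history of every object.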

Geoinformatica | 2017

High performance location-based services in a main-memory database

Suprio Ray; Rolando Blanco; Anil Kumar Goel

With the widespread adoption of mobile devices and explosive growth of spatio-temporal data, Location-Based Services (LBS) have become an indispensable technology in our daily lives. The key characteristics of the LBS applications include a high rate of time-stamped location updates, and many concurrent historical, present and predictive queries. The commercial providers of LBS must support all three kinds of queries and address the high update rates. While they employ relational databases for this purpose, traditional databases are unable to cope with the growing demands of many LBS systems. Support for spatio-temporal indexes within these databases is limited to R-tree based approaches. Although a number of advanced spatio-temporal indexes have been proposed by the research community, only a few of them support historical queries. These indexing techniques, with support for historical queries, are unable to sustain the high update and query throughput typical in LBS. Technological trends involving increasingly large main memory and growing processing core count offer opportunities to address some of these issues. We present several key ideas to support high performance commercial LBS by exploiting in-memory database techniques. Taking advantage of the very large memory available in modern machines, our system maintains the location data and index for the past N days in memory. Older data and index are kept on disk. We propose an in-memory storage organization for high insert performance. We also introduce a novel spatio-temporal index that maintains partial temporal indexes in a versioned grid structure. The partial temporal indexes are organized as compressed bitmaps. With extensive evaluation, we demonstrate that our system supports high insert and query throughputs and that it outperforms the leading LBS system by a significant margin.

international conference on management of data | 2016

Page As You Go: Piecewise Columnar Access In SAP HANA

Reza Sherkat; Colin Florendo; Mihnea Andrei; Anil Kumar Goel; Anisoara Nica; Peter Bumbulis; Ivan Schreter; Günter Radestock; Christian Bensberg; Daniel Booss; Heiko Gerwens

In-memory columnar databases such as SAP HANA achieve extreme performance by means of vector processing over logical units of main memory resident columns. The core in-memory algorithms can be challenged when the working set of an application does not fit into main memory. To deal with memory pressure, most in-memory columnar databases evict candidate columns (or tables) using a set of heuristics gleaned from recent workload. As an alternative approach, we propose to reduce the unit of load and eviction from a column to a contiguous portion of the in-memory columnar representation, which we call a page. In this paper, we adapt the core algorithms to be able to operate with partially loaded columns while preserving the performance benefits of vector processing. Our approach has two key advantages. First, partial column loading reduces the mandatory memory footprint for each column, making more memory available for other purposes. Second, partial eviction extends the in-memory lifetime of a partially loaded column. We present a new in-memory columnar implementation for our approach, which we term page loadable column. We design a new persistency layout and access algorithms for the encoded data vector of the column, the order-preserving dictionary, and the inverted index. We compare the performance attributes of page loadable columns with those of regular in-memory columns and present a use-case for page loadable columns for cold data in data aging scenarios. Page loadable columns are completely integrated in SAP HANA, and we present extensive experimental results that quantify the performance overhead and the resource consumption when these columns are deployed.
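The shift from column-level to page-level loading and eviction can be pictured with a small sketch. It covers only a plain data vector and uses least-recently-used eviction; both choices, along with the names `PagedColumn` and `page_loads`, are assumptions for illustration. The paper's persistency layout for the encoded data vector, dictionary, and inverted index is not reproduced here.

```python
from collections import OrderedDict


class PagedColumn:
    """Hypothetical column whose values are loaded one page at a time."""

    def __init__(self, persisted_values, page_size, max_resident_pages):
        self._persisted = list(persisted_values)  # stands in for the persisted column
        self.page_size = page_size
        self.max_resident_pages = max_resident_pages
        self._resident = OrderedDict()  # page number -> values, in LRU order
        self.page_loads = 0             # how many pages were read from persistence

    def _load_page(self, page_no):
        start = page_no * self.page_size
        self.page_loads += 1
        return self._persisted[start:start + self.page_size]

    def get(self, row):
        """Return the value at `row`, loading only the page that contains it."""
        if not 0 <= row < len(self._persisted):
            raise IndexError(row)
        page_no, offset = divmod(row, self.page_size)
        if page_no in self._resident:
            self._resident.move_to_end(page_no)     # mark as most recently used
        else:
            if len(self._resident) >= self.max_resident_pages:
                self._resident.popitem(last=False)  # evict least recently used page
            self._resident[page_no] = self._load_page(page_no)
        return self._resident[page_no][offset]

    def resident_pages(self):
        return list(self._resident)
```

With a budget of two resident pages, reading a third page evicts only the coldest page instead of the whole column, which is the memory-footprint benefit the abstract describes.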

international conference on big data | 2015

Parallel in-memory trajectory-based spatiotemporal topological join

Suprio Ray; Angela Demke Brown; Nick Koudas; Rolando Blanco; Anil Kumar Goel

The rapid growth of spatiotemporal Big Data is fueling the emergence and growth of many applications. Many of these applications are characterized by complex spatiotemporal queries. An important category of such queries is the trajectory-based spatiotemporal topological join, which combines a trajectory dataset and a spatial objects dataset based on spatiotemporal predicates. Although these queries have many important use-cases, they have not received much attention from the research community. We systematically evaluate several feasible in-memory spatiotemporal topological join algorithms, using an existing trajectory index (TB-tree) and spatial index (STR). We show that even the best among these algorithms is long running and not scalable. To address the performance problems of these algorithms we introduce PISTON, a parallel in-memory indexing system targeted for spatiotemporal topological join. With extensive evaluations, we demonstrate that even the single-threaded performance of PISTON is significantly better than the feasible approaches that use existing trajectory and spatial indexes. Moreover, the parallel performance of PISTON is orders of magnitude better than these approaches.

symposium on cloud computing | 2017

Efficient and consistent replication for distributed logs

Hua Fan; Jeffrey Pound; Peter Bumbulis; Nathan Auch; Scott Maclean; Eric Garber; Anil Kumar Goel

Distributed shared logs are a powerful building block for distributed systems. Because such a log provides fault-tolerant persistence and strong ordering guarantees, applications can use it to reliably communicate a stream of events between processes. This can be used, for example, to replicate application state or to build a reliable publish/subscribe system. The log itself must also replicate data in order to provide availability and fault-tolerance. Key to the design of a distributed shared log is the choice of replication algorithm, which will determine many properties of the system. We propose an algorithm for consistent replication of log data, quorum-replication with meta-data exchange (QMX), that is linearizable while allowing writes to be successful with only a single round-trip to a quorum of replicas and allowing reads to generally be serviced by any single replica, or read-one/write-quorum. This is achieved by coupling the reads with an asynchronous message exchange algorithm that continuously runs amongst the replicas. The message exchange algorithm allows replicas to infer the global state of writes across the cluster, in order to deduce which writes have been successfully quorum replicated and which have not. This metadata allows any single replica to directly answer reads in many cases, though in the worst case a read must wait for the message passing round to complete before being serviced, which requires a majority quorum of servers to be responsive.
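The core intuition, that a replica can answer a read alone once exchanged metadata shows a write reached a quorum, can be sketched as follows. This is not the QMX algorithm and makes no attempt to reproduce its linearizability argument: the class `Replica`, the metadata format (a map from log position to the replicas known to hold it), and the choice to return `None` when a read cannot yet be answered are all assumptions for illustration.

```python
class Replica:
    """Hypothetical log replica that tracks which peers hold each position."""

    def __init__(self, replica_id, cluster_size):
        self.id = replica_id
        self.quorum = cluster_size // 2 + 1  # majority quorum
        self.log = {}      # log position -> record stored locally
        self.holders = {}  # log position -> ids of replicas known to hold it

    def store(self, position, record):
        """Accept a write sent by a client as part of its quorum round-trip."""
        self.log[position] = record
        self.holders.setdefault(position, set()).add(self.id)

    def metadata(self):
        """Snapshot of what this replica knows, sent to peers asynchronously."""
        return {pos: set(ids) for pos, ids in self.holders.items()}

    def receive_metadata(self, other):
        """Merge a peer's knowledge into the local view."""
        for pos, ids in other.items():
            self.holders.setdefault(pos, set()).update(ids)

    def is_quorum_replicated(self, position):
        return len(self.holders.get(position, ())) >= self.quorum

    def read(self, position):
        """Answer locally only if the write is known to be quorum replicated."""
        if position in self.log and self.is_quorum_replicated(position):
            return self.log[position]
        return None  # caller must wait for a further exchange round
```

In a three-replica cluster, a write stored on two replicas becomes readable from either one as soon as a single metadata message passes between them.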
