Manoj Syamala
Microsoft
Publication
Featured research published by Manoj Syamala.
very large data bases | 2004
Sanjay Agrawal; Surajit Chaudhuri; Lubor J. Kollar; Arunprasad P. Marathe; Vivek R. Narasayya; Manoj Syamala
Publisher Summary: This chapter provides an overview of the Database Tuning Advisor's (DTA's) novel functionality and the rationale for its architecture, and demonstrates the DTA's quality and scalability on large customer workloads. The DTA is part of Microsoft SQL Server 2005. It is an automated physical database design tool that significantly advances the state of the art in several ways. First, the DTA is capable of providing an integrated physical design recommendation for horizontal partitioning, indexes, and materialized views. Second, unlike today's physical design tools that focus solely on performance, the DTA also supports the capability for a database administrator (DBA) to specify manageability requirements while optimizing for performance. Third, the DTA is able to scale to large databases and workloads using several novel techniques, including workload compression, reduced statistics creation, and exploiting a test server to reduce load on the production server. Finally, the DTA greatly enhances scriptability and customization through the use of a public XML schema for input and output.
international conference on data engineering | 2009
Arjun Dasgupta; Vivek R. Narasayya; Manoj Syamala
Database developers today use data access APIs such as ADO.NET to execute SQL queries from their applications. These applications often have security problems such as SQL injection vulnerabilities and performance problems such as poorly written SQL queries. However, today's compilers have little or no understanding of data access APIs or the DBMS, and hence the above problems can go undetected until much later in the application lifecycle. We present a framework that adapts traditional program analysis by leveraging understanding of data access APIs in order to identify such problems early on during application development. Our framework can analyze database application binaries that use ADO.NET data access APIs. We show how our framework can be used for a variety of analysis tasks such as SQL injection detection, workload extraction, identifying performance problems, and verifying data integrity constraints in the application.
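To make the class of problem concrete, the following is a minimal sketch of a SQL injection vulnerability and its parameterized fix. It is not the framework described in the abstract, which analyzes ADO.NET binaries; Python's standard sqlite3 module stands in for the data access API, and the table, function names, and payload are hypothetical.

```python
import sqlite3

# Hypothetical in-memory table used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a1"), ("bob", "b2")],
)


def find_unsafe(name: str) -> list:
    """Builds the query by string concatenation, so user input can alter the SQL."""
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query).fetchall()]


def find_safe(name: str) -> list:
    """Passes user input as a bound parameter, so it is treated only as data."""
    query = "SELECT secret FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,)).fetchall()]


# A classic payload that turns the WHERE clause into a tautology.
payload = "x' OR '1'='1"
leaked = find_unsafe(payload)   # matches every row
protected = find_safe(payload)  # matches no row
```

A compiler sees only a string and a method call in both functions, which is why an analysis that understands the data access API is needed to tell them apart.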
very large data bases | 2011
Hideaki Kimura; Vivek R. Narasayya; Manoj Syamala
Modern RDBMSs support the ability to compress data using methods such as null suppression and dictionary encoding. Data compression offers the promise of significantly reducing storage requirements and improving I/O performance for decision support queries. However, compression can also slow down update and query performance due to the CPU costs of compression and decompression. In this paper, we study how data compression affects choice of appropriate physical database design, such as indexes, for a given workload. We observe that approaches that decouple the decision of whether or not to choose an index from whether or not to compress the index can result in poor solutions. Thus, we focus on the novel problem of integrating compression into physical database design in a scalable manner. We have implemented our techniques by modifying Microsoft SQL Server and the Database Engine Tuning Advisor (DTA) physical design tool. Our techniques are general and are potentially applicable to DBMSs that support other compression methods. Our experimental results on real world as well as TPC-H benchmark workloads demonstrate the effectiveness of our techniques.
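As a rough illustration of the two compression methods named in the abstract, the sketch below shows dictionary encoding of a column and null suppression of a fixed-width integer. It is a simplified, assumed rendering of the general ideas, not the on-disk format of Microsoft SQL Server or any other DBMS; all function names are hypothetical.

```python
from typing import Hashable, List, Tuple


def dictionary_encode(column: List[Hashable]) -> Tuple[list, List[int]]:
    """Replace each value with a small integer code into a list of distinct values."""
    dictionary: list = []
    index: dict = {}
    codes: List[int] = []
    for value in column:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes


def dictionary_decode(dictionary: list, codes: List[int]) -> list:
    """Rebuild the original column from the dictionary and the codes."""
    return [dictionary[code] for code in codes]


def null_suppress(value: int, width: int = 4) -> bytes:
    """Drop leading zero bytes of a fixed-width unsigned integer.

    The output is one length byte followed by the significant bytes.
    """
    stripped = value.to_bytes(width, "big").lstrip(b"\x00")
    return bytes([len(stripped)]) + stripped


def null_restore(encoded: bytes) -> int:
    """Invert null_suppress by reading the length byte and the payload."""
    length = encoded[0]
    return int.from_bytes(encoded[1:1 + length], "big")


column = ["DE", "US", "DE", "DE", "US"]
dictionary, codes = dictionary_encode(column)
packed = null_suppress(300)  # 4-byte value stored in 3 bytes
```

Both methods trade CPU work at read and write time for smaller storage, which is the tension the abstract describes.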
very large data bases | 2015
Vivek R. Narasayya; Ishai Menache; Mohit Singh; Feng Li; Manoj Syamala; Surajit Chaudhuri
Relational database-as-a-service (DaaS) providers need to rely on multi-tenancy and resource sharing among tenants, since statically reserving resources for a tenant is not cost effective. A major consequence of resource sharing is that the performance of one tenant can be adversely affected by resource demands of other co-located tenants. One such resource that is essential for good performance of a tenant's workload is buffer pool memory. In this paper, we study the problem of how to effectively share buffer pool memory in multi-tenant relational DaaS. We first develop an SLA framework that defines and enforces accountability of the service provider to the tenant even when buffer pool memory is not statically reserved on behalf of the tenant. Next, we present a novel buffer pool page replacement algorithm (MT-LRU) that builds upon theoretical concepts from weighted online caching, and is designed for multi-tenant scenarios involving SLAs and overbooking. MT-LRU generalizes the LRU-K algorithm which is commonly used in relational database systems. We have prototyped our techniques inside a commercial DaaS engine and extensive experiments demonstrate the effectiveness of our solution.
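For readers unfamiliar with the baseline that MT-LRU generalizes, here is a minimal sketch of LRU-K page replacement. It is not MT-LRU and has no notion of tenants, SLAs, or weights. It also simplifies LRU-K itself: history is discarded on eviction and there is no correlated reference period. The class and method names are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, Hashable, Tuple


class LRUKCache:
    """Evicts the page whose K-th most recent access is the oldest."""

    def __init__(self, capacity: int, k: int = 2) -> None:
        self.capacity = capacity
        self.k = k
        self.clock = 0  # logical time, advanced on every access
        self.history: Dict[Hashable, Deque[int]] = {}

    def access(self, page: Hashable) -> bool:
        """Reference a page; return True on a hit and False on a miss."""
        self.clock += 1
        hit = page in self.history
        if not hit:
            if len(self.history) >= self.capacity:
                self._evict()
            # Keep only the last k access times for each cached page.
            self.history[page] = deque(maxlen=self.k)
        self.history[page].append(self.clock)
        return hit

    def _evict(self) -> None:
        def priority(page: Hashable) -> Tuple[int, int]:
            times = self.history[page]
            # Pages with fewer than k accesses count as infinitely old.
            kth = times[0] if len(times) == self.k else -1
            # Ties are broken by the least recent last access.
            return (kth, times[-1])

        victim = min(self.history, key=priority)
        del self.history[victim]


cache = LRUKCache(capacity=2, k=2)
hits = [cache.access(page) for page in ["a", "a", "b", "c"]]
```

In this trace plain LRU would evict "a" when "c" arrives, because "a" was touched least recently. LRU-2 evicts "b" instead, since "b" has been referenced only once.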
international conference on management of data | 2016
Feng Li; Sudipto Das; Manoj Syamala; Vivek R. Narasayya
Memory is a crucial resource in relational databases (RDBMSs). When there is insufficient memory, RDBMSs are forced to use slower media such as SSDs or HDDs, which can significantly degrade workload performance. Cloud database services are deployed in data centers where network adapters supporting remote direct memory access (RDMA) at low latency and high bandwidth are becoming prevalent. We study the novel problem of how a Symmetric Multi-Processing (SMP) RDBMS, whose memory demands exceed locally-available memory, can leverage available remote memory in the cluster accessed via RDMA to improve query performance. We expose available memory on remote servers using a lightweight file API that allows an SMP RDBMS to leverage the benefits of remote memory with modest changes. We identify and implement several novel scenarios to demonstrate these benefits, and address design challenges that are crucial for efficient implementation. We implemented the scenarios in Microsoft SQL Server engine and present the first end-to-end study to demonstrate benefits of remote memory for a variety of micro-benchmarks and industry-standard benchmarks. Compared to using disks when memory is insufficient, we improve the throughput and latency of queries with short reads and writes by 3X to 10X, while improving the latency of multiple TPC-H and TPC-DS queries by 2X to 100X.
international conference on data engineering | 2010
Vivek R. Narasayya; Manoj Syamala
Decision support queries that scan large indexes can suffer significant degradation in I/O performance due to index fragmentation. DBAs rely on rules of thumb that use index size and fragmentation information to accomplish the task of deciding which indexes to defragment. However, there are two fundamental limitations that make this task challenging. First, database engines offer little support to help estimate the impact of defragmenting an index on the I/O performance of a query. Second, defragmentation is supported only at the granularity of an entire B+-Tree, which can be too restrictive since defragmentation is an expensive operation. This paper describes techniques for addressing the above limitations. We also study the problem of selecting the appropriate indexes to defragment for a given workload. We have implemented our techniques in Microsoft SQL Server and developed a tool that can provide appropriate index defragmentation recommendations to DBAs. We evaluate the effectiveness of the proposed techniques on several real and synthetic databases.
international conference on management of data | 2009
Surajit Chaudhuri; Vivek R. Narasayya; Manoj Syamala
Relational database management systems (RDBMSs) today serve as the backend for many real-world data intensive applications. Database developers use data access APIs such as ADO.NET to execute SQL queries and access data. While modern program analysis and code profilers are extensively used during the software development life cycle, there is a significant gap in these technologies for database applications because these tools have little or no understanding of data access APIs or the DBMS. We have developed tools that: (a) Enhance traditional static analysis of programs by leveraging understanding of database APIs to help developers identify security, correctness and performance problems in the application. This enables such problems to be detected early in the application lifecycle. (b) Extend the existing DBMS and application profiling infrastructure to enable correlation of application events with DBMS events. This allows profiling across application, data access and DBMS layers. We demonstrate how our tools enable a rich class of analysis, tuning and profiling tasks that are otherwise not possible today.
international conference on data engineering | 2013
Tao Cheng; Kaushik Chakrabarti; Surajit Chaudhuri; Vivek R. Narasayya; Manoj Syamala
Retail is increasingly moving online. There are only a few big e-tailers but there is a long tail of small-sized e-tailers. The big e-tailers are able to collect significant data on user activities at their websites. They use these assets to derive insights about their products and to provide superior experiences for their users. On the other hand, small e-tailers do not possess such user data and hence cannot match the rich user experiences offered by big e-tailers. Our key insight is that web search engines possess significant data on user behaviors that can be used to help smaller e-tailers mine the same signals that big e-tailers derive from their proprietary user data assets. These signals can be exposed as data services in the cloud; e-tailers can leverage them to enable similar user experiences as the big e-tailers. We present three such data services in the paper: entity synonym data service, query-to-entity data service and entity tagging data service. The entity synonym service is an in-production data service that is currently available while the other two are data services currently in development at Microsoft. Our experiments on product datasets show (i) these data services have high quality and (ii) they have significant impact on user experiences on e-tailer websites. To the best of our knowledge, this is the first paper to explore the potential of using search engine data assets for e-tailers.
international conference on management of data | 2018
Adam Dziedzic; Jingjing Wang; Sudipto Das; Bolin Ding; Vivek R. Narasayya; Manoj Syamala
Commercial DBMSs, such as Microsoft SQL Server, cater to diverse workloads including transaction processing, decision support, and operational analytics. They also support variety in physical design structures such as B+ tree and columnstore. The benefits of B+ tree for OLTP workloads and columnstore for decision support workloads are well-understood. However, the importance of hybrid physical designs, consisting of both columnstore and B+ tree indexes on the same database, is not well-studied, and is a focus of this paper. We first quantify the trade-offs using carefully-crafted micro-benchmarks. This micro-benchmarking indicates that hybrid physical designs can result in orders of magnitude better performance depending on the workload. For complex real-world applications, choosing an appropriate combination of columnstore and B+ tree indexes for a database workload is challenging. We extend the Database Engine Tuning Advisor for Microsoft SQL Server to recommend a suitable combination of B+ tree and columnstore indexes for a given workload. Through extensive experiments using industry-standard benchmarks and several real-world customer workloads, we quantify how a physical design tool capable of recommending hybrid physical designs can result in orders of magnitude better execution costs compared to approaches that rely either on columnstore-only or B+ tree-only designs.
international conference on management of data | 2005
Sanjay Agrawal; Surajit Chaudhuri; Lubor J. Kollar; Arunprasad P. Marathe; Vivek R. Narasayya; Manoj Syamala