Santonu Sarkar
Birla Institute of Technology and Science
Publications
Featured research published by Santonu Sarkar.
ACM Computing Surveys | 2015
Nidhi Tiwari; Santonu Sarkar; Umesh Bellur; Maria Indrawan
A MapReduce scheduling algorithm plays a critical role in managing large clusters of hardware nodes and meeting multiple quality requirements by controlling the order and distribution of user, job, and task execution. A comprehensive and structured survey of the scheduling algorithms proposed so far is presented here, using a novel multidimensional classification framework. These dimensions are (i) meeting quality requirements, (ii) scheduling entities, and (iii) adapting to dynamic environments; each dimension has its own taxonomy. An empirical evaluation framework for these algorithms is recommended. This survey identifies various open issues and directions for future research.
IEEE/ACM International Conference on Utility and Cloud Computing | 2014
Antonio Pecchia; Domenico Cotroneo; Rajeshwari Ganesan; Santonu Sarkar
Security alerts collected under real workload conditions represent a goldmine of information to protect the integrity and confidentiality of a production Cloud. Nevertheless, the volume of runtime alerts overwhelms operations teams and makes forensics hard and time consuming. This paper investigates the use of different text weighting schemes to filter an average volume of 1,000 alerts/day produced by a security information and event management (SIEM) tool in a production SaaS Cloud. As a result, a filtering approach based on the log.entropy scheme has been developed to pinpoint relevant information within the daily volume of textual alerts. The proposed filter is valuable in supporting operations teams and allowed the identification of real incidents that affected several nodes and required manual response.
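The log-entropy weighting that underpins such a filter can be sketched as follows (a minimal illustration of the standard log-entropy scheme from the text-mining literature; the paper's tokenization, thresholds, and filtering pipeline are not shown):

```python
import math
from collections import Counter

def log_entropy_weights(docs):
    """Compute log-entropy term weights for a list of tokenized alerts.

    Terms spread uniformly across all alerts get a global weight near 0
    (uninformative noise); terms concentrated in few alerts get a
    weight near 1, so the alerts containing them stand out.
    """
    n = len(docs)
    counts = [Counter(d) for d in docs]
    totals = Counter()                     # corpus-wide term frequencies
    for c in counts:
        totals.update(c)
    g = {}                                 # global entropy-based weight
    for term, tot in totals.items():
        h = sum((c[term] / tot) * math.log(c[term] / tot)
                for c in counts if c[term])
        g[term] = 1.0 + h / math.log(n) if n > 1 else 1.0
    # local log weight times global weight, per alert
    return [{t: math.log(1 + f) * g[t] for t, f in c.items()} for c in counts]
```

Ranking alerts by the sum of their term weights then surfaces the unusual ones while a term like "error", present in every alert, contributes nothing.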
ACM Computing Surveys | 2015
Arpan Roy; Santonu Sarkar; Rajeshwari Ganesan; Geetika Goel
In response to the revival of virtualized technology by Rosenblum and Garfinkel [2005], NIST defined cloud computing, a new paradigm in service computing infrastructures. In cloud environments, the basic security mechanism is ingrained in virtualization—that is, the execution of instructions at different privilege levels. Despite its obvious benefits, the caveat is that a crashed virtual machine (VM) is much harder to recover than a crashed workstation. When crashed, a VM is nothing but a giant corrupt binary file and quite unrecoverable by standard disk-based forensics. Therefore, VM crashes should be avoided at all costs. Security is one of the major contributors to such VM crashes. This includes compromising the hypervisor, cloud storage, images of VMs used infrequently, and the remote cloud client used by the customer, as well as threats from malicious insiders. Although using secure infrastructures such as private clouds alleviates several of these security problems, most cloud users end up using cheaper options such as third-party infrastructures (i.e., public clouds), so a thorough discussion of all known security issues is pertinent. Hence, in this article, we discuss ongoing research in cloud security in order of the attack scenarios exploited most often in the cloud environment. We explore attack scenarios that call for securing the hypervisor, exploiting co-residency of VMs, VM image management, mitigating insider threats, securing storage in clouds, abusing lightweight software-as-a-service clients, and protecting data propagation in clouds. Wearing a practitioner's glasses, we explore the relevance of each attack scenario to a service company like Infosys. At the same time, we draw parallels between cloud security research and the implementation of security solutions in the form of enterprise security suites for the cloud.
We discuss the state of practice in the form of enterprise security suites that include cryptographic solutions, access control policies in the cloud, new techniques for attack detection, and security quality assurance in clouds.
India Software Engineering Conference | 2015
Santonu Sarkar; Sayantan Mitra
GPUs offer a powerful bulk-synchronous programming model for exploiting data parallelism; however, branch divergence among executing warps can lead to serious performance degradation due to execution serialization. We propose a novel profile-guided approach to optimize branch divergence while transforming a serial program to a data-parallel program for GPUs. Our approach is based on the observation that branches inside some data-parallel loops, although divergent, exhibit repetitive regular patterns of outcomes. By exploiting such patterns, loop iterations can be aligned so that the corresponding iterations traverse the same branch path. These aligned iterations, when executed as a warp on a GPU, become convergent. We propose a new metric based on the repetitive pattern characteristics that indicates whether a data-parallel loop is worth restructuring. When we tested our approach on the well-known Rodinia benchmark, we found that it is possible to achieve up to 48% performance improvement through the loop restructuring suggested by the patterns and our metric.
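The core alignment idea can be illustrated with a small sketch (hypothetical helper names; real profiling and GPU code generation are far more involved). Given profiled branch outcomes per iteration, grouping same-outcome iterations makes every warp-sized chunk convergent:

```python
def align_iterations(predicates):
    """Reorder loop-iteration indices so that iterations taking the
    same branch path are grouped together; consecutive indices then
    execute convergently when mapped to one GPU warp.

    predicates[i] is the profiled branch outcome of iteration i.
    Returns the permuted iteration order.
    """
    taken = [i for i, p in enumerate(predicates) if p]
    not_taken = [i for i, p in enumerate(predicates) if not p]
    return taken + not_taken

def divergent_warps(order, predicates, warp_size=32):
    """Count warps whose grouped iterations disagree on the branch
    outcome (such warps serialize both paths on a real GPU)."""
    count = 0
    for w in range(0, len(order), warp_size):
        outcomes = {predicates[i] for i in order[w:w + warp_size]}
        if len(outcomes) > 1:
            count += 1
    return count
```

For a strictly alternating branch pattern every warp is divergent in program order, while the aligned order yields zero divergent warps; this is the regular-pattern case the paper's metric is designed to detect.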
International Parallel and Distributed Processing Symposium | 2016
Nidhi Tiwari; Umesh Bellur; Santonu Sarkar; Maria Indrawan
Energy efficiency is an important concern for data centers today. Most of these data centers use MapReduce frameworks for big data processing. These frameworks and modern hardware provide flexibility, in the form of parameters, to manage the performance and energy consumption of the system. However, tuning these parameters so that they reduce energy consumption without impacting performance is challenging because (1) there are a large number of parameters across the layers of the frameworks, (2) the impact of the parameters differs based on workload characteristics, (3) the same parameter may have conflicting impacts on performance and energy, and (4) parameters may have interaction effects. To streamline parameter tuning, we present a systematic design of experiments to study the effects of different parameters on performance and energy consumption, with a view to identifying the most influential ones quickly and efficiently. The final goal is to use the identified parameters to build predictive models for tuning the environment. We perform a detailed analysis of the main and interaction effects of rationally selected parameters on performance and energy consumption for typical MapReduce workloads. Based on a relatively small number of experiments, we ascertain that replication-factor has the highest impact and, surprisingly, compression has the least impact on the energy efficiency of MapReduce systems. Furthermore, from the results of the factorial design we infer that the two-way interactions between the block-size, Map-slots, and CPU-frequency parameters of the Hadoop platform have a high impact on the energy efficiency of all types of workloads, owing to its distributed, parallel, pipelined design.
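The main and two-way interaction effects that a full 2^k factorial design yields can be computed as in this sketch (a textbook contrast computation; the paper's actual experimental harness and factor choices are not reproduced here):

```python
from itertools import product

def factorial_effects(responses, k):
    """Main and two-way interaction effects from a full 2^k factorial
    experiment.

    responses maps each tuple of k factor levels (-1 or +1) to the
    measured response, e.g. energy consumed by a MapReduce job.
    """
    runs = list(product((-1, 1), repeat=k))
    half = len(runs) / 2
    # main effect: contrast of factor i's level column with the responses
    main = [sum(r[i] * responses[r] for r in runs) / half for i in range(k)]
    # two-way interaction: contrast of the product of two level columns
    inter = {(i, j): sum(r[i] * r[j] * responses[r] for r in runs) / half
             for i in range(k) for j in range(i + 1, k)}
    return main, inter
```

With only 2^k runs this separates the influential factors (large effects) from the negligible ones, which is exactly what makes a screening design cheaper than exhaustive tuning.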
IEEE Transactions on Reliability | 2017
Catello Di Martino; Santonu Sarkar; Rajeshwari Ganesan; Zbigniew Kalbarczyk; Ravishankar K. Iyer
A software-as-a-service (SaaS) platform needs to provide its intended service as per its stated service-level agreements (SLAs). While SLA violations in SaaS platforms have been reported, not much work has been done to empirically characterize failures of SaaS. In this paper, we study SLA violations of a production SaaS platform, diagnose the causes, unearth several critical failure modes, and then suggest various solution approaches to increase the availability of the platform as perceived by the end user. Our approach combines field failure data analysis (FFDA) and fault injection. Our study is based on 283 days of operational logs of the platform. During this time, the platform received business workload from 42 customers spread over 22 countries. We first developed a set of home-grown FFDA tools to analyze the logs, then implemented a fault injector to automatically inject several runtime errors into the application code written in .NET/C#, and collated the injection results. We summarize our findings as follows: first, system failures caused 93% of all SLA violations; second, our fault injector was able to recreate a few cases of bursts of SLA violations that could not be diagnosed from the logs; and third, the fault injection mechanism could recreate several error propagation paths leading to data corruptions that the failure data analysis could not reveal. Finally, the paper presents some system-level implications of this study and how the joint use of fault injection and log analysis may help in improving the reliability of the measured platform.
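The fault-injection side of such a study can be shown in miniature (the paper's injector targets .NET/C# code; the decorator below is a hypothetical Python analogue of injecting runtime errors at a call site):

```python
import functools
import random

def inject_fault(error=RuntimeError, rate=0.1, rng=None):
    """Wrap a function so that each call raises `error` with
    probability `rate`, emulating a runtime error at that call site.

    Passing a seeded rng makes an injection campaign repeatable, which
    is essential when trying to recreate a specific failure burst.
    """
    rng = rng or random.Random()
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            if rng.random() < rate:
                raise error(f"injected fault in {fn.__name__}")
            return fn(*args, **kwargs)
        return wrapper
    return decorate
```

Running the workload with injections enabled and correlating the resulting SLA violations against the operational logs is the kind of joint analysis the paper describes.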
IEEE Transactions on Big Data | 2016
Subhajit Datta; Santonu Sarkar; A. S. M. Sajeev
We all want to be associated with long-lasting ideas, as originators or, at least, expositors. For a tyro researcher or a seasoned veteran, knowing how long an idea will remain interesting to the community is critical in choosing and pursuing research threads. In the physical sciences, the notion of half-life is often invoked to quantify decaying intensity. In this paper, we study a corpus of 19,000+ papers written by 21,000+ authors across 16 software engineering publication venues from 1975 to 2010 to empirically determine the half-life of software engineering research topics. In the absence of any consistent and well-accepted methodology for associating research topics with a publication, we have used natural language processing techniques to semi-automatically identify and associate a set of topics with each paper. We adapted measures of half-life already existing in the bibliometric context for our study, and also defined a new measure based on publication and citation counts. We find evidence that some of the identified research topics show a mean half-life of close to 15 years, and that there are topics with sustained interest in the community. We report the methodology of our study in this paper, as well as the implications and utility of our results.
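One simple publication-count notion of half-life can be sketched as follows (an illustrative proxy only; the paper adapts existing bibliometric measures and defines its own combined publication/citation measure):

```python
def half_life(yearly_counts):
    """Years from a topic's peak annual paper count until the count
    first drops to half the peak, a simple proxy for decaying interest.

    yearly_counts maps year -> number of papers on the topic.
    Returns None if interest never halves within the observed window
    (a topic with sustained community interest).
    """
    peak_year = max(yearly_counts, key=yearly_counts.get)
    peak = yearly_counts[peak_year]
    for year in sorted(y for y in yearly_counts if y > peak_year):
        if yearly_counts[year] <= peak / 2:
            return year - peak_year
    return None
```

A topic whose counts never fall to half the peak within the corpus window is exactly the "sustained interest" case the abstract mentions.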
International Symposium on Software Reliability Engineering | 2014
Flavio Frattini; Santonu Sarkar; Jyotiska Nath Khasnabish; Stefano Russo
Invariants represent properties of a system that are expected to hold when everything goes well. Thus, the violation of an invariant most likely corresponds to the occurrence of an anomaly in the system. In this paper, we discuss the accuracy and the completeness of an anomaly detection system based on invariants. The case study we have taken is a back-end operation of a SaaS platform. Results show the soundness of the approach; we also discuss the impact of the invariant mining strategy on the detection capabilities, both in terms of accuracy and of time to reveal violations.
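A minimal flavour of invariant-based detection can be given with range invariants mined from failure-free data (an illustrative sketch; the paper's mining strategies and invariant types are richer):

```python
def mine_range_invariants(training):
    """Mine per-metric min/max range invariants from failure-free
    training samples.  Each sample is a dict metric -> value."""
    inv = {}
    for sample in training:
        for metric, value in sample.items():
            lo, hi = inv.get(metric, (value, value))
            inv[metric] = (min(lo, value), max(hi, value))
    return inv

def violations(inv, sample):
    """Return the metrics in `sample` that fall outside their mined
    range; a non-empty result flags the sample as anomalous."""
    return [m for m, v in sample.items()
            if m in inv and not (inv[m][0] <= v <= inv[m][1])]
```

The trade-off the paper studies follows directly: a wider training window yields looser invariants (fewer false alarms, lower completeness), a narrower one the opposite.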
Automated Technology for Verification and Analysis | 2017
Soumyadip Bandyopadhyay; Santonu Sarkar; Dipankar Sarkar; Chittaranjan A. Mandal
An application program can go through significant optimizing and parallelizing transformations, both automated and human-guided, before being mapped to an architecture. Formal verification of these transformations is crucial to ensure that they preserve the original behavioural specification. The PRES+ model (Petri net based Representation of Embedded Systems), which encompasses data processing, is used to model parallel behaviours more naturally. This paper presents a translation validation tool for verifying optimizing and parallelizing code transformations by checking equivalence between two PRES+ models, one representing the source code and the other representing its optimized and/or parallelized version.
International Conference on Parallel and Distributed Systems | 2016
Nidhi Tiwari; Umesh Bellur; Santonu Sarkar; Maria Indrawan
Energy efficiency is a major concern in today's data centers, which house large-scale distributed processing systems such as data-parallel MapReduce clusters. Modern power-aware systems utilize the dynamic voltage and frequency scaling mechanism available in processors to manage energy consumption. In this paper, we initially characterize the energy efficiency of MapReduce jobs with respect to built-in power governors. Our analysis indicates that while a built-in power governor provides the best energy efficiency for a job that is both CPU and IO intensive, a common CPU frequency across the cluster provides the best energy efficiency for other types of jobs. In order to identify this optimal frequency setting, we derive energy and performance models for MapReduce jobs on an HPC cluster and validate these models experimentally on different platforms. We demonstrate how these models can be used to improve the energy efficiency of machine-learning MapReduce applications running on the Yarn platform. The execution of jobs at their optimal frequencies improves energy efficiency by 25% on average over the default governor setting. In the case of mixed workloads, energy efficiency improves by up to 10% when we use an optimal CPU frequency across the cluster.
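A first-order sketch of picking an energy-optimal frequency follows (the cubic dynamic-power model and the CPU/IO time split are illustrative textbook assumptions, not the paper's derived MapReduce-specific models):

```python
def optimal_frequency(freqs, cpu_work, io_time, p_dyn, p_static):
    """Pick the CPU frequency (from the discrete set `freqs`) that
    minimizes energy for a job with `cpu_work` cycles of CPU-bound
    work and `io_time` seconds of frequency-independent IO.

    Assumes dynamic power scales as p_dyn * f**3 and static power
    p_static is constant: running faster shortens the job (less
    static energy) but burns more dynamic power, so an interior
    frequency can win.
    """
    def energy(f):
        t = cpu_work / f + io_time   # runtime at frequency f
        return (p_dyn * f ** 3 + p_static) * t
    return min(freqs, key=energy)
```

With non-trivial static power, neither the lowest nor the highest available frequency is optimal, which is why a tuned common cluster frequency can beat a default governor.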