Haricharan Ramachandra
Publication
Featured research published by Haricharan Ramachandra.
International Conference on Performance Engineering | 2015
Zhenyun Zhuang; Haricharan Ramachandra; Cuong Tran; Subbu Subramaniam; Chavdar Botev; Chaoyue Xiong; Badri Sridharan
Internet companies like LinkedIn handle a large amount of incoming web traffic. Events generated in response to user input or actions are stored in a source database. These database events feature the typical characteristics of Big Data: high volume, high velocity and high variability. Database events are replicated to isolate the source database and to form a consistent view across data centers. Ensuring a low replication latency of database events is critical to business value. Given the inherent characteristics of Big Data, minimizing the replication latency is a challenging task. In this work we study the problem of taming the database replication latency through effective capacity planning. Based on our observations of LinkedIn's production traffic and the various components involved, we develop a practical and effective model to answer a set of business-critical capacity-planning questions: future traffic rate forecasting, replication latency prediction, replication capacity determination, replication headroom determination and SLA determination.
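To make the capacity-planning idea concrete, here is a minimal sketch of one such calculation, assuming a simple linear traffic-growth forecast; the parameter names and numbers are illustrative and are not taken from the paper.

```java
// Hypothetical sketch of a replication headroom calculation.
// The linear-growth assumption and all parameters are illustrative.
public class ReplicationHeadroom {

    /**
     * Estimates how many days of headroom remain before forecast traffic
     * exceeds the measured replication capacity, assuming linear growth.
     *
     * @param currentEventsPerSec  current peak replication traffic
     * @param growthPerDay         forecast increase in events/sec per day
     * @param capacityEventsPerSec maximum sustainable replication throughput
     */
    static double headroomDays(double currentEventsPerSec,
                               double growthPerDay,
                               double capacityEventsPerSec) {
        if (currentEventsPerSec >= capacityEventsPerSec) {
            return 0.0; // already at or over capacity
        }
        return (capacityEventsPerSec - currentEventsPerSec) / growthPerDay;
    }

    public static void main(String[] args) {
        // Example: 50k events/sec today, growing by 500 events/sec per day,
        // with a replication pipeline that saturates at 80k events/sec.
        System.out.printf("Headroom: %.0f days%n",
                headroomDays(50_000, 500, 80_000));
    }
}
```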
International Conference on Big Data | 2015
Tao Feng; Zhenyun Zhuang; Yi Pan; Haricharan Ramachandra
Data quality is essential in the big data paradigm, as poor data can have serious consequences when dealing with large volumes of data. While it is trivial to spot poor data for small-scale and offline use cases, it is challenging to detect and fix data inconsistency in large-scale and online (real-time or near-real-time) big data contexts. An example of such a scenario is spotting and fixing poor data using Apache Samza, a stream processing framework that has been increasingly adopted to process near-real-time data at LinkedIn. To optimize the deployment of Samza processing and reduce business cost, in this work we propose a memory capacity model for Apache Samza that allows denser deployments of high-performing data-filtering applications built on Samza. The model can be used to provision just-enough memory resources to applications by tightening the bounds on the memory allocations. We apply our memory capacity model to LinkedIn's real production use cases, which significantly increases deployment density and saves business costs. We share key learnings in this paper.
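As a rough illustration of what such a capacity model might look like, the sketch below sums a few hypothetical per-container memory components and adds a safety margin; the component names and the margin are assumptions, not the model from the paper.

```java
// Hypothetical per-container memory sizing sketch for a stream-processing job.
public class ContainerMemoryEstimate {

    static long estimateContainerMemoryMb(long jvmHeapMb,
                                          long jvmOffHeapMb,      // metaspace, code cache, thread stacks
                                          long localStateCacheMb, // e.g. key-value store cache
                                          double safetyMargin) {  // headroom for bursts, e.g. 0.15
        long base = jvmHeapMb + jvmOffHeapMb + localStateCacheMb;
        return Math.round(base * (1.0 + safetyMargin));
    }

    public static void main(String[] args) {
        // Example: 2 GiB heap, 512 MiB off-heap, 256 MiB state cache, 15% margin.
        System.out.println("Container memory (MiB): "
                + estimateContainerMemoryMb(2048, 512, 256, 0.15));
    }
}
```

Tightening each of these bounds (rather than over-provisioning every component) is what allows denser packing of containers on shared hosts.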
International Conference on Cloud Computing | 2014
Zhenyun Zhuang; Cuong Tran; Haricharan Ramachandra; Badri Sridharan
Cloud computing promises a cost-effective and administration-effective solution to the traditional needs of computing resources. While bringing efficiency to users thanks to shared hardware and software, the multi-tenancy characteristics also bring unique challenges to the backend cloud platforms. In particular, the JVM mechanisms used by Java applications, coupled with OS-level features, give rise to a set of problems that are not present in other deployment scenarios. In this work, we consider the problem of ensuring high performance for mission-critical Java applications in multi-tenant cloud environments. Based on our experience with LinkedIn's platforms, we identify and solve a set of problems caused by multi-tenancy. We share the lessons and knowledge we learned along the way.
2017 International Conference on Computing, Networking and Communications (ICNC) | 2017
Zhenyun Zhuang; Cuong Tran; Jerry Weng; Haricharan Ramachandra; Badri Sridharan
The Linux kernel feature of cgroups (control groups) is being increasingly adopted for running applications in multi-tenant environments. Many projects (e.g., Docker) rely on cgroups to isolate resources such as CPU and memory, so it is critical to ensure high performance for such deployments. At LinkedIn, we have been using cgroups and investigating their performance. This work presents our findings about memory-related performance issues of cgroups in certain scenarios. These issues can significantly affect the performance of applications running in cgroups. Specifically, (1) memory is not reserved for a cgroup (as it is for virtual machines); (2) both anonymous memory and page cache count toward the memory limit, and the former can evict the latter; (3) the OS can steal page cache from any cgroup; (4) the OS can swap out any cgroup. We provide a set of recommendations for addressing these issues.
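One practical way to observe issue (2) is to watch how much of a cgroup's limit is consumed by anonymous memory versus page cache. The sketch below reads the cgroup v1 accounting files for a hypothetical group named my-app; it is an illustrative monitoring aid, not the paper's tooling.

```java
// Read anonymous memory (rss) and page cache (cache) usage for one cgroup,
// so evictions of page cache by growing anonymous memory can be spotted.
// Assumes cgroup v1 paths; the group name "my-app" is hypothetical.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CgroupMemoryCheck {
    public static void main(String[] args) throws IOException {
        Path group = Paths.get("/sys/fs/cgroup/memory/my-app");

        long limit = Long.parseLong(
                Files.readAllLines(group.resolve("memory.limit_in_bytes")).get(0).trim());

        long rss = 0, cache = 0;
        for (String line : Files.readAllLines(group.resolve("memory.stat"))) {
            String[] parts = line.split(" ");
            if (parts[0].equals("rss"))   rss = Long.parseLong(parts[1]);
            if (parts[0].equals("cache")) cache = Long.parseLong(parts[1]);
        }

        // Both anonymous memory and page cache count against the limit,
        // so a growing rss squeezes out the page cache the app relies on.
        System.out.printf("limit=%dMB rss=%dMB cache=%dMB%n",
                limit >> 20, rss >> 20, cache >> 20);
    }
}
```

Note that cgroup v2 exposes different file names, so the paths above would need adjusting on newer systems.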
International Congress on Big Data | 2016
Zhenyun Zhuang; Tao Feng; Yi Pan; Haricharan Ramachandra; Badri Sridharan
The increasing adoption of Big Data in business environments has driven the need for stream joining in a real-time fashion. Multi-stream joining is an important stream processing pattern in today's Internet companies, and it has been used to generate higher-quality data in business pipelines. Multi-stream joining can be performed with two models: (1) All-In-One (AIO) joining and (2) Step-By-Step (SBS) joining. Both models have advantages and disadvantages with regard to memory footprint, joining latency, deployment complexity, etc. In this work, we analyze the performance tradeoffs associated with these two models using Apache Samza.
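The sketch below contrasts the two models on three streams using plain in-memory buffers, purely to illustrate their structure; it is not the Apache Samza code used in the study.

```java
// Illustrative sketch of All-In-One vs. Step-By-Step joining of three
// streams A, B, C keyed by a shared join key (structure only).
import java.util.HashMap;
import java.util.Map;

public class MultiStreamJoin {

    // All-In-One (AIO): one joiner buffers partial records from all streams
    // and emits once every stream has contributed a value for a key.
    static class AioJoiner {
        private final Map<String, String[]> buffer = new HashMap<>();

        void onEvent(String key, int streamIndex, String value) {
            String[] parts = buffer.computeIfAbsent(key, k -> new String[3]);
            parts[streamIndex] = value;
            if (parts[0] != null && parts[1] != null && parts[2] != null) {
                System.out.println("AIO joined " + key + ": "
                        + parts[0] + "|" + parts[1] + "|" + parts[2]);
                buffer.remove(key);
            }
        }
    }

    // Step-By-Step (SBS): stage 1 joins A with B; its output is joined with C
    // in stage 2. Each stage buffers only two inputs, at the cost of an extra
    // hop (and extra latency) in the pipeline.
    static class SbsStage {
        private final Map<String, String> left = new HashMap<>();
        private final java.util.function.BiConsumer<String, String> downstream;

        SbsStage(java.util.function.BiConsumer<String, String> downstream) {
            this.downstream = downstream;
        }

        void onLeft(String key, String value)  { left.put(key, value); }
        void onRight(String key, String value) {
            String l = left.remove(key);
            if (l != null) downstream.accept(key, l + "|" + value);
        }
    }

    public static void main(String[] args) {
        AioJoiner aio = new AioJoiner();
        aio.onEvent("k1", 0, "a"); aio.onEvent("k1", 1, "b"); aio.onEvent("k1", 2, "c");

        SbsStage stage2 = new SbsStage((k, v) -> System.out.println("SBS joined " + k + ": " + v));
        SbsStage stage1 = new SbsStage(stage2::onLeft);
        stage1.onLeft("k1", "a"); stage1.onRight("k1", "b"); stage2.onRight("k1", "c");
    }
}
```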
International Symposium on Performance Analysis of Systems and Software | 2017
Susie Xia; Zhenyun Zhuang; Anant Rao; Haricharan Ramachandra; Yi Feng; Ramya Pasumarti
Accurate capacity measurement of Internet services is critical to ensuring high-performing production computing environments. In this work, we present our solution for performing accurate capacity measurement. Referred to as “LiveRedliner”, it uses live traffic in production environments to drive the measurement, thereby avoiding many of the pitfalls that prevent capacity measurement from obtaining accurate values in synthetic lab environments.
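Conceptually, redlining with live traffic can be pictured as the loop sketched below: route progressively more live traffic to one target instance until a health metric breaches its threshold, and take the last healthy level as that instance's capacity. The interfaces and thresholds are placeholders, not LinkedIn's LiveRedliner implementation.

```java
// Structural sketch of a live-traffic redlining loop.
public class RedlineSketch {

    // Placeholder hooks for traffic routing and health metrics.
    interface Target {
        void setTrafficQps(int qps);     // route this many queries/sec of live traffic
        double p99LatencyMs();           // observed 99th-percentile latency
        double errorRate();              // observed error fraction
    }

    static int measureCapacityQps(Target target, int startQps, int stepQps,
                                  double maxLatencyMs, double maxErrorRate)
            throws InterruptedException {
        int lastHealthy = 0;
        for (int qps = startQps; ; qps += stepQps) {
            target.setTrafficQps(qps);
            Thread.sleep(60_000);        // let metrics stabilize at this traffic level
            if (target.p99LatencyMs() > maxLatencyMs || target.errorRate() > maxErrorRate) {
                target.setTrafficQps(lastHealthy);   // back off to the last safe level
                return lastHealthy;
            }
            lastHealthy = qps;
        }
    }
}
```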
IEEE International Conference on Semantic Computing | 2017
Ramya Pasumarti; Rushin Barot; Susie Xia; Ang Xu; Haricharan Ramachandra
Large-scale web services like LinkedIn serve millions of users across the globe. The user experience depends on high service availability and performance. In such a scenario, capacity measurement is critical for these cloud services. Resources should be provisioned such that the service can easily handle peak traffic without experiencing bottlenecks or compromising on latency. In addition, an accurate understanding of service capacity leads to systematic provisioning of resources, saving millions of dollars in capital investment and yielding better energy savings. Stateful services like NoSQL databases are among the most expensive and critical components in a cloud stack. A clear understanding of the capacity limits of a stateful service leads to better availability and performance across the stack. However, based on our experience, accurately measuring the capacity of NoSQL databases is much more challenging than for regular stateless services. In this work, we present various approaches to accurately measure the capacity of stateful NoSQL services, along with their benefits and costs, and discuss in detail the solution we prefer to use.
International Conference on Cloud Computing | 2016
Zhenyun Zhuang; Cuong Tran; Haricharan Ramachandra; Badri Sridharan
For PaaS-deployed (Platform as a Service) customer-facing applications (e.g., online gaming and online chatting), ensuring low latencies is not just a preferred feature, but a must-have feature. Given the popularity and power of Java platforms, a significant portion of today's PaaS platforms run Java. The JVM (Java Virtual Machine) manages a heap space to hold application objects. The heap space can be frequently GC-ed (garbage collected), and applications can occasionally be stopped for long periods during some GC and JVM activities. In this work, we investigated the JVM pause problem. We found that some (and large) JVM STW (stop-the-world) pauses cannot be explained by application-level activities or JVM activities during GC; instead, they are caused by OS mechanisms. We successfully reproduced such problems and identified their root causes. The findings can be used to enhance JVM implementations. We also propose a set of solutions to mitigate and eliminate these large STW pauses. We share the knowledge and experiences in this paper.
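One generic way to catch such pauses is a jHiccup-style detector: a thread that sleeps for a fixed interval and reports large overshoots, which can then be cross-checked against GC logs. The sketch below illustrates this technique; it is not the instrumentation used in the paper.

```java
// Generic pause detector: large sleep overshoots that do not line up with
// GC log entries point at stalls originating outside the JVM (e.g. the OS).
public class PauseDetector {
    public static void main(String[] args) throws InterruptedException {
        final long intervalMs = 10;
        final long reportThresholdMs = 100;

        while (true) {
            long before = System.nanoTime();
            Thread.sleep(intervalMs);
            long overshootMs = (System.nanoTime() - before) / 1_000_000 - intervalMs;
            if (overshootMs > reportThresholdMs) {
                // Cross-check this timestamp against GC logs; if no GC ran,
                // the stall likely came from the OS (paging, scheduling, etc.).
                System.out.println(System.currentTimeMillis()
                        + " detected stall of ~" + overshootMs + " ms");
            }
        }
    }
}
```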
International Conference on Big Data | 2016
Zhenyun Zhuang; Haricharan Ramachandra; Badri Sridharan; Brandon Duncan; Kishore Gopalakrishna; Jean-Francois Im
Today's applications increasingly use memory-mapped files to manage large volumes of data, hoping to enjoy the performance benefits of memory mapping compared with traditional file IO. Memory-mapped files use the OS page caching mechanism to save expensive system calls and copying. However, as we found, naive usage of memory-mapped files can cause severe performance problems due to ineffective use of physical memory. We propose a solution called SmartCache to address the performance issue. SmartCache maintains an application-layer caching space to use physical memory more effectively. SmartCache can be implemented inside an application or as an independent library for applications to use.
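The idea of an application-layer caching space in front of a memory-mapped file can be sketched with a small bounded LRU cache, as shown below; the class and method names are illustrative, and this is not the SmartCache implementation from the paper.

```java
// Application-layer cache in front of a memory-mapped file: hot pages are
// served from a bounded, application-controlled space rather than relying
// solely on the OS page cache.
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.LinkedHashMap;
import java.util.Map;

public class MappedFileCache {
    private static final int PAGE_SIZE = 4096;

    private final MappedByteBuffer mapped;
    private final Map<Integer, byte[]> lru;   // page index -> page contents

    public MappedFileCache(String path, final int maxCachedPages) throws IOException {
        FileChannel ch = FileChannel.open(Paths.get(path), StandardOpenOption.READ);
        this.mapped = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        // An access-ordered LinkedHashMap gives a simple bounded LRU cache.
        this.lru = new LinkedHashMap<Integer, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, byte[]> eldest) {
                return size() > maxCachedPages;
            }
        };
    }

    /** Returns one page of the file, serving repeated reads from the application-layer cache. */
    public byte[] readPage(int pageIndex) {
        return lru.computeIfAbsent(pageIndex, idx -> {
            byte[] page = new byte[PAGE_SIZE];
            ByteBuffer view = mapped.duplicate();   // independent cursor over the mapping
            view.position(idx * PAGE_SIZE);
            view.get(page, 0, Math.min(PAGE_SIZE, view.remaining()));
            return page;
        });
    }
}
```

This sketch is single-threaded for brevity; a production-grade cache would also need concurrency control and an eviction policy tuned to the access pattern.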
Computer Software and Applications Conference | 2016
Zhenyun Zhuang; Sergiy Zhuk; Haricharan Ramachandra; Badri Sridharan
SSD (Solid State Drive) is being increasingly adopted to alleviate the IO performance bottlenecks of applications. Numerous measurement results have been published to showcase the performance improvement brought by SSD compared to HDD (Hard Disk Drive). However, in most deployment scenarios, SSD is simply treated as a “faster HDD”, hence its potential is not fully utilized. Though applications gain better performance when using SSD as storage, the gains are mainly attributed to the higher IOPS and bandwidth of SSD. Naive adoption of SSD does not utilize it to its full potential. In other words, the application performance improvements brought by SSD could be more significant if applications were designed to be SSD-friendly. In this work, we propose a set of 9 SSD-friendly design changes at the application layer. Applying these design changes can result in three types of benefits: (1) improved application performance; (2) increased SSD IO efficiency; and (3) longer SSD life.