Istemi Ekin Akkus
Max Planck Society
Publications
Featured research published by Istemi Ekin Akkus.
computer and communications security | 2012
Istemi Ekin Akkus; Ruichuan Chen; Michaela Hardt; Paul Francis; Johannes Gehrke
Today, websites commonly use third-party web analytics services to obtain aggregate information about users that visit their sites. This information includes demographics and visits to other sites as well as user behavior within their own sites. Unfortunately, to obtain this aggregate information, web analytics services track individual user browsing behavior across the web. This violation of user privacy has been strongly criticized, resulting in tools that block such tracking as well as anti-tracking legislation and standards such as Do-Not-Track. These efforts, while improving user privacy, degrade the quality of web analytics. This paper presents the first design of a system that provides web analytics without tracking. The system gives users differential privacy guarantees, can provide better quality analytics than current services, requires no new organizational players, and is practical to deploy. This paper describes and analyzes the design, gives performance benchmarks, and presents our implementation and deployment across several hundred users.
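To make the "differential privacy guarantees" mentioned in this abstract concrete, the following is a minimal sketch of the textbook Laplace mechanism applied to a counting query. It is not the protocol from the paper, which is not described here; the function name `noisy_count` and the parameter `epsilon` are illustrative assumptions.

```python
import random


def noisy_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Return a count perturbed with Laplace noise (textbook mechanism).

    Assumption: each user changes the count by at most 1 (sensitivity 1),
    so the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = 1.0 / epsilon
    # The difference of two independent exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical aggregate: 100 users matched some query.
    print(noisy_count(100, epsilon=1.0, rng=rng))
```

A smaller `epsilon` gives stronger privacy and a noisier answer; the noise averages out only over many independent queries, which is why such systems typically budget how often the same data may be queried.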
acm special interest group on data communication | 2013
Ruichuan Chen; Istemi Ekin Akkus; Paul Francis
There is a growing body of research on mechanisms for preserving online user privacy while still allowing aggregate queries over private user data. A common approach is to store user data at users' devices, and to query the data in such a way that a differentially private noisy result is produced without exposing individual user data to any system component. A particular challenge is to design a system that scales well while limiting how much malicious users can distort the result. This paper presents SplitX, a high-performance analytics system for making differentially private queries over distributed user data. SplitX is typically two to three orders of magnitude more efficient in bandwidth, and from three to five orders of magnitude more efficient in computation than previous comparable systems, while operating under a similar trust model. SplitX accomplishes this performance by replacing public-key operations with exclusive-or operations. This paper presents the design of SplitX, analyzes its security and performance, and describes its implementation and deployment across 416 users.
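The abstract states that SplitX replaces public-key operations with exclusive-or operations. As a rough illustration of why XOR can stand in for encryption, here is a minimal sketch of generic XOR secret splitting: a message is split into random-looking shares, and only the XOR of all shares recovers it. This is a standard technique, not the SplitX protocol itself; the names `split` and `combine` are hypothetical.

```python
import secrets
from typing import List


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split(message: bytes, parts: int = 2) -> List[bytes]:
    """Split `message` into `parts` shares; any subset short of all
    shares is indistinguishable from random bytes."""
    if parts < 2:
        raise ValueError("need at least two shares")
    shares = [secrets.token_bytes(len(message)) for _ in range(parts - 1)]
    last = message
    for share in shares:
        last = xor_bytes(last, share)
    return shares + [last]


def combine(shares: List[bytes]) -> bytes:
    """Recover the original message by XOR-ing all shares together."""
    result = bytes(len(shares[0]))
    for share in shares:
        result = xor_bytes(result, share)
    return result


if __name__ == "__main__":
    shares = split(b"answer=1", parts=2)
    print(combine(shares))
```

Each share would go to a different non-colluding party in such a design; XOR costs a few CPU cycles per byte, which is consistent with the large computational savings the abstract reports relative to public-key operations.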
Proceedings of the 18th ACM/IFIP/USENIX Middleware Conference | 2017
Jörg Thalheim; Antonio Rodrigues; Istemi Ekin Akkus; Pramod Bhatotia; Ruichuan Chen; Bimal Viswanath; Lei Jiao; Christof Fetzer
Major cloud computing operators provide powerful monitoring tools to understand the current (and prior) state of the distributed systems deployed in their infrastructure. While such tools provide a detailed monitoring mechanism at scale, they also pose a significant challenge for application developers and operators: transforming the huge space of monitored metrics into useful insights. These insights are essential to build effective management tools for improving the efficiency, resiliency, and dependability of distributed systems. This paper reports on our experience with building and deploying Sieve, a platform to derive actionable insights from monitored metrics in distributed systems. Sieve builds on two core components: a metrics reduction framework and a metrics dependency extractor. More specifically, Sieve first reduces the dimensionality of metrics by automatically filtering out unimportant metrics by observing their signal over time. Afterwards, Sieve infers metrics dependencies between distributed components of the system using a predictive-causality model by testing for Granger Causality. We implemented Sieve as a generic platform and deployed it for two microservices-based distributed systems: OpenStack and ShareLatex. Our experience shows that (1) Sieve can reduce the number of metrics by at least an order of magnitude (10-100×), while preserving the statistical equivalence to the total number of monitored metrics; (2) Sieve can dramatically improve existing monitoring infrastructures by reducing the associated overheads over the entire system stack (CPU: 80%, storage: 90%, and network: 50%); (3) Sieve can effectively support a wide range of workflows in distributed systems; we showcase two such workflows: orchestration of autoscaling and Root Cause Analysis (RCA).
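The first Sieve step described above, filtering out unimportant metrics by observing their signal over time, can be illustrated with a simple variance filter. This is only a sketch under the assumption that "unimportant" means "nearly constant"; the abstract does not specify the actual criterion, and the function name `drop_flat_metrics` and the threshold are hypothetical. The Granger Causality step is not covered here.

```python
from statistics import pvariance
from typing import Dict, List


def drop_flat_metrics(
    series: Dict[str, List[float]], min_variance: float = 1e-6
) -> Dict[str, List[float]]:
    """Keep only metrics whose values vary over time.

    Assumption: a metric with (near-)zero variance carries no signal
    and can be dropped before any dependency analysis.
    """
    return {
        name: values
        for name, values in series.items()
        if len(values) > 1 and pvariance(values) > min_variance
    }


if __name__ == "__main__":
    # Hypothetical time series collected from two components.
    metrics = {
        "api.cpu_usage": [0.10, 0.50, 0.90, 0.40],
        "api.open_ports": [3, 3, 3, 3],
        "db.queue_length": [12, 40, 7, 22],
    }
    print(sorted(drop_flat_metrics(metrics)))
```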
symposium on cloud computing | 2018
Do Le Quoc; Istemi Ekin Akkus; Pramod Bhatotia; Spyros Blanas; Ruichuan Chen; Christof Fetzer; Thorsten Strufe
A distributed join is a fundamental operation for processing massive datasets in parallel. Unfortunately, computing an equi-join over such datasets is very resource-intensive, even when done in parallel. Given this cost, the equi-join operator becomes a natural candidate for optimization using approximation techniques, which allow users to trade accuracy for latency. Finding the right approximation technique for joins, however, is a challenging task. Sampling, in particular, cannot be directly used in joins; naïvely performing a join over a sample of the dataset will not preserve statistical properties of the query result. To address this problem, we introduce ApproxJoin. We interweave Bloom filter sketching and stratified sampling with the join computation in a new operator that preserves statistical properties of an aggregation over the join output. ApproxJoin leverages Bloom filters to avoid shuffling non-joinable data items around the network, and then applies stratified sampling to obtain a representative sample of the join output. We implemented ApproxJoin in Apache Spark, and evaluated it using microbenchmarks and real-world workloads. Our evaluation shows that ApproxJoin scales well and significantly reduces data movement, without sacrificing tight error bounds on the accuracy of the final results. ApproxJoin achieves a speedup of up to 9x over unmodified Spark-based joins with the same sampling ratio. Furthermore, the speedup is accompanied by a significant reduction in the shuffled data volume, which is up to 82x less than unmodified Spark-based joins.
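The two ideas named in this abstract, a Bloom filter to discard non-joinable items and stratified sampling over the join output, can be sketched in a few lines of single-machine Python. This is a simplified illustration, not the ApproxJoin operator: it runs the steps one after another and materializes each stratum before sampling, whereas the abstract describes interweaving them with the join computation in Spark. All names (`BloomFilter`, `approx_join`, `fraction`) are hypothetical.

```python
import hashlib
import random
from collections import defaultdict


class BloomFilter:
    """Tiny Bloom filter; uses one byte per bit for readability."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def __contains__(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


def approx_join(left, right, fraction, rng):
    """Join (key, value) lists and return a per-key (stratified) sample."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # Step 1: summarize the join keys of `right` in a Bloom filter.
    bloom = BloomFilter()
    right_by_key = defaultdict(list)
    for key, value in right:
        bloom.add(key)
        right_by_key[key].append(value)
    # Step 2: drop `left` items whose key cannot join. False positives
    # may pass the filter but produce no output rows in step 3.
    candidates = [(k, v) for k, v in left if k in bloom]
    # Step 3: build one stratum of joined rows per join key.
    strata = defaultdict(list)
    for key, lval in candidates:
        for rval in right_by_key.get(key, []):
            strata[key].append((key, lval, rval))
    # Step 4: sample the same fraction from every stratum (at least one
    # row), so that rare keys stay represented in the result.
    sample = []
    for rows in strata.values():
        n = max(1, round(fraction * len(rows)))
        sample.extend(rng.sample(rows, n))
    return sample


if __name__ == "__main__":
    left = [(1, "a"), (1, "b"), (2, "c"), (3, "d")]
    right = [(1, "x"), (2, "y"), (2, "z")]
    print(approx_join(left, right, fraction=0.5, rng=random.Random(0)))
```

In a distributed setting the benefit of step 2 is that filtered items never cross the network, which matches the reduction in shuffled data volume reported in the abstract.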
conference on emerging network experiment and technology | 2017
Ruichuan Chen; Istemi Ekin Akkus; Bimal Viswanath; Ivica Rimac; Volker Hilt
A common practice to increase the reliability of a cloud application is to deploy redundant instances. Unfortunately, such redundancy efforts can be undermined if the application's instances share common dependencies. This paper presents ReCloud, a novel system that can efficiently find a reliable deployment plan for cloud applications. ReCloud considers and avoids common dependencies shared across application instances that may lead to correlated failures, and works even with applications that have complex internal structures. ReCloud utilizes various pieces of available dependency information (e.g., hardware, software and/or network dependencies) about the cloud infrastructure to quantitatively assess the reliability of the application's deployment plan with rigorous error bounds. This assessment further enables ReCloud to find a deployment plan that balances reliability against other criteria such as application performance and resource utilization. We implemented a fully functional system. The experimental results show that, even in a large cloud environment with more than 27K hosts, ReCloud needs only 30 seconds to find a deployment plan that is one order of magnitude more reliable than the common practice.
ieee international conference on cloud computing technology and science | 2011
Pramod Bhatotia; Alexander Wieder; Istemi Ekin Akkus; Rodrigo Rodrigues; Umut A. Acar
arXiv: Distributed, Parallel, and Cluster Computing | 2018
Do Le Quoc; Istemi Ekin Akkus; Pramod Bhatotia; Spyros Blanas; Ruichuan Chen; Christof Fetzer; Thorsten Strufe
usenix annual technical conference | 2018
Istemi Ekin Akkus; Ruichuan Chen; Ivica Rimac; Manuel Stein; Klaus Satzke; Andre Beck; Paarijaat Aditya; Volker Hilt
arXiv: Distributed, Parallel, and Cluster Computing | 2017
Jörg Thalheim; Antonio Wendell De Oliveira Rodrigues; Istemi Ekin Akkus; Pramod Bhatotia; Ruichuan Chen; Bimal Viswanath; Lei Jiao; Christof Fetzer
Archive | 2017
Istemi Ekin Akkus; Ivica Rimac; Ruichuan Chen; Bimal Viswanath; Volker Hilt