
Publication


Featured research published by Gauri Joshi.


IEEE Journal on Selected Areas in Communications | 2014

On the Delay-Storage Trade-Off in Content Download from Coded Distributed Storage Systems

Gauri Joshi; Yanpei Liu; Emina Soljanin

We study how coding in distributed storage reduces expected download time, in addition to providing reliability against disk failures. The expected download time is reduced because when a content file is encoded with redundancy and distributed across multiple disks, reading only a subset of the disks is sufficient for content reconstruction. For the same total storage used, coding exploits the diversity in storage better than simple replication, and hence gives faster download. We use a novel fork-join queueing framework to model multiple users requesting the content simultaneously, and derive bounds on the expected download time. Our system model and results are a novel generalization of the fork-join system studied in the queueing-theory literature. Our results demonstrate the fundamental trade-off between the expected download time and the amount of storage space. This trade-off can be used to choose the amount of redundancy required to meet delay constraints on content delivery.
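As a toy illustration of why coding helps (this is not the paper's fork-join queueing analysis), the sketch below assumes i.i.d. Exp(mu) disk-read times and compares reading any k of n coded chunks against reading k fixed replicas; all function names are illustrative:

```python
import random

def harmonic(m):
    """Harmonic number H_m = 1 + 1/2 + ... + 1/m."""
    return sum(1.0 / i for i in range(1, m + 1))

def mean_download_time(n, k, mu=1.0, trials=20000, seed=0):
    """Monte Carlo mean of the k-th order statistic of n i.i.d. Exp(mu)
    disk-read times: with an (n, k) code, the download completes as soon
    as any k of the n disks have been read."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += sorted(rng.expovariate(mu) for _ in range(n))[k - 1]
    return total / trials

# Closed form for exponentials: E[X_(k:n)] = (H_n - H_{n-k}) / mu
coded = mean_download_time(10, 5)    # any 5 of 10 coded chunks
uncoded = mean_download_time(5, 5)   # all 5 of 5 fixed disks
```

With an (n, k) = (10, 5) code the download finishes at the 5th fastest of 10 reads, which for exponentials has mean (H_10 - H_5)/mu, roughly 0.65/mu, versus H_5/mu, roughly 2.28/mu, when 5 specific disks must all be read.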


Allerton Conference on Communication, Control, and Computing | 2012

Coding for fast content download

Gauri Joshi; Yanpei Liu; Emina Soljanin

We study the fundamental trade-off between storage and content download time. We show that the download time can be significantly reduced by dividing the content into chunks, encoding it to add redundancy, and then distributing it across multiple disks. We determine the download time for two content access models: the fountain model, which involves simultaneous content access, and the fork-join model, in which enqueued user requests are served individually. For the fountain model we explicitly characterize the download time, while for the fork-join model we derive upper and lower bounds. Our results show that coding reduces download time, through the diversity of distributing the data across more disks, even for the same total storage used.


Measurement and Modeling of Computer Systems | 2014

Efficient task replication for fast response times in parallel computation

Da Wang; Gauri Joshi; Gregory W. Wornell

Large-scale distributed computing systems divide a job into many independent tasks and run them in parallel on different machines. A challenge in such parallel computing is that the time taken by a machine to execute a task is inherently variable, and thus the slowest machine becomes the bottleneck in the completion of the job. One way to combat the variability in machine response is to replicate tasks on multiple machines and wait for the machine that finishes first. While task replication reduces response time, it generally increases resource usage. In this work, we propose a theoretical framework to analyze the trade-off between response time and resource usage. Given an execution time distribution for machines, our analysis gives insight into when and why replication helps. We also propose efficient scheduling algorithms for large-scale distributed computing systems.
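A minimal sketch of this response-time versus resource-usage trade-off, assuming a hypothetical shifted-exponential execution-time model (a constant startup cost delta plus Exp(mu) variability) and cancellation of the losing replicas the instant the first copy finishes:

```python
import random

def replication_tradeoff(r, delta=1.0, mu=1.0, trials=20000, seed=0):
    """Run a task on r machines, keep the first to finish and cancel the
    rest. Execution time per machine is delta + Exp(mu), an assumed toy
    model. Returns (mean response time, mean machine-time used)."""
    rng = random.Random(seed)
    resp = 0.0
    for _ in range(trials):
        resp += delta + min(rng.expovariate(mu) for _ in range(r))
    resp /= trials
    # All r machines run until the first finish, so cost = r * response.
    return resp, r * resp

t1, c1 = replication_tradeoff(1)   # E[response] = delta + 1/mu = 2.0
t2, c2 = replication_tradeoff(2)   # E[response] = delta + 1/(2*mu) = 1.5
```

Two replicas cut the mean response time from 2.0 to 1.5 here, but the machine-time spent per task rises from 2.0 to 3.0, because the constant startup cost delta is paid on every replica.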


Measurement and Modeling of Computer Systems | 2015

Using Straggler Replication to Reduce Latency in Large-scale Parallel Computing

Da Wang; Gauri Joshi; Gregory W. Wornell

In cloud computing jobs consisting of many parallel tasks, the tasks on the slowest machines (straggling tasks) become the bottleneck in the completion of the job. One way to combat the variability in machine response time is to add replicas of straggling tasks and wait for the earliest copy to finish. Using the theory of extreme order statistics, we analyze how task replication reduces latency, and its impact on the cost of computing resources. We also propose a heuristic algorithm to search for the best replication strategies when the empirical behavior of task execution time is difficult to model and the proposed analysis techniques cannot be applied. Evaluation of the heuristic policies on Google Trace data shows a significant latency reduction compared to the replication strategy used in MapReduce.
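The effect of straggler replication can be sketched with a heavy-tailed toy model (Pareto task times are an assumption here, not the paper's model): a job of n parallel tasks finishes at its slowest task, and replicating each task twice and keeping the faster copy sharply trims that tail:

```python
import random

def job_latency(n, replicas, alpha=2.0, trials=5000, seed=0):
    """Latency of a job of n parallel tasks, i.e. the max over tasks.
    Each task runs `replicas` copies and finishes when the fastest copy
    does. Task times are Pareto(alpha), a heavy-tailed assumed model."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += max(
            min(rng.paretovariate(alpha) for _ in range(replicas))
            for _ in range(n)
        )
    return total / trials

base = job_latency(n=100, replicas=1)   # straggler-dominated
repl = job_latency(n=100, replicas=2)   # min of 2 copies per task
```

Taking the minimum of two Pareto(2) copies squares the tail decay (survival drops from x^-2 to x^-4), so the expected job latency falls by several times, at the cost of doubling the machine-time spent.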


International Symposium on Information Theory | 2012

On playback delay in streaming communication

Gauri Joshi; Yuval Kochman; Gregory W. Wornell

We consider the problem of minimizing playback delay in streaming over a packet erasure channel with fixed bandwidth. When packets have to be played in order, the expected delay inherently grows with time. We analyze two cases, namely no feedback and instantaneous feedback. We find that in both cases the delay grows logarithmically with the time elapsed since the start of transmission, and we evaluate the growth constant, i.e., the pre-log term, as a function of the transmission bandwidth (relative to the source bandwidth). The growth constant with feedback is strictly better than the one without, but they have the same asymptotic value in the limit of infinite bandwidth.


Measurement and Modeling of Computer Systems | 2015

Queues with Redundancy: Latency-Cost Analysis

Gauri Joshi; Emina Soljanin; Gregory W. Wornell

A major advantage of cloud computing and storage is the large-scale sharing of resources, which provides scalability and flexibility. But resource sharing causes variability in the latency experienced by the user, due to several factors such as virtualization, server outages, and network congestion. This problem is further aggravated when a job consists of several parallel tasks, because the task running on the slowest machine becomes the latency bottleneck. A promising method to reduce latency is to assign a task to multiple machines and wait for the earliest to finish. Similarly, in cloud storage systems, requests to download content can be assigned to multiple replicas, such that it is sufficient to download any one replica. Although actively studied in systems in the past few years, there is little work on rigorous analysis of how redundancy affects latency. The effect of redundancy in queueing systems was first analyzed only recently in [2, 3, 6], assuming exponential service time. General service time distributions, in particular the effect of their tails, are considered in [7, 8]. This work analyzes the trade-off between latency and the cost of computing resources in queues with redundancy, without assuming exponential service time. We study a generalized fork-join queueing model where finishing any k out of n tasks is sufficient to complete a job. The redundant tasks can be canceled when any k tasks finish, or earlier, when any k tasks start service. For the k = 1 case, we get an elegant latency and cost analysis by identifying equivalences between the systems without and with early redundancy cancellation and the M/G/1 and M/G/n queues, respectively. For general k, we derive bounds on the latency and cost. Please see [4] for an extended version of this work.
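The k = 1 equivalence can be checked in simulation for the special case of exponential service (the abstract's analysis covers general service times; exponential is an assumption here): forking each arriving job to all n servers and canceling at the first finish behaves like a single queue whose service time is the minimum of n copies, which for Exp(mu) copies is Exp(n*mu):

```python
import random

def forked_queue_latency(n, lam=0.5, mu=1.0, jobs=100000, seed=0):
    """Poisson(lam) arrivals; each job is forked to all n servers and
    departs when its first replica finishes (the rest are canceled).
    With Exp(mu) service this collapses to an M/M/1 queue with service
    rate n*mu, simulated via the Lindley waiting-time recursion."""
    rng = random.Random(seed)
    wait, total = 0.0, 0.0
    for _ in range(jobs):
        service = min(rng.expovariate(mu) for _ in range(n))  # ~ Exp(n*mu)
        total += wait + service                 # latency = wait + service
        wait = max(0.0, wait + service - rng.expovariate(lam))
    return total / jobs

sim = forked_queue_latency(n=2)   # M/M/1 theory: 1/(n*mu - lam) = 1/1.5
```

The simulated mean latency matches the M/M/1 closed form 1/(n*mu - lam), which is the exponential-service instance of the M/G/1 equivalence stated above.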


International Conference on Computer Communications | 2014

The Effect of Block-wise Feedback on the Throughput-Delay Trade-off in Streaming

Gauri Joshi; Yuval Kochman; Gregory W. Wornell

Unlike traditional file transfer where only total delay matters, streaming applications impose delay constraints on each packet and require them to be in order. To achieve fast in-order packet decoding, we have to compromise on the throughput. We study this trade-off between throughput and in-order decoding delay, and in particular how it is affected by the frequency of block-wise feedback, whereby the source receives full channel state feedback at periodic intervals. Our analysis shows that for the same throughput, having more frequent feedback significantly reduces the in-order decoding delay. For any given block-wise feedback delay, we present a spectrum of coding schemes that span different throughput-delay tradeoffs. One can choose an appropriate coding scheme from these, depending upon the delay-sensitivity and bandwidth limitations of the application.


International Symposium on Information Theory | 2013

Round-robin overlapping generations coding for fast content download

Gauri Joshi; Emina Soljanin

We analyze the download time of a large file, divided into chunks called generations, and transmitted over an erasure channel without feedback. For non-overlapping generations, we derive how the download time scales with the number of generations, for the round-robin and random scheduling policies. We then analyze coding with overlapping generations and show that the optimal overlap size is small compared to the number of generations, which implies that the download time can be reduced with only a small increase in computational complexity. Further, for a given overlap size, we propose overlap structures that have low complexity and are easy to implement, but still achieve file download as fast as the best previously proposed structures.
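The scaling gap between the two scheduling policies can be illustrated with an erasure-free toy model (a simplifying assumption, not the paper's setting) in which each generation needs a single packet: random scheduling then reduces to the coupon-collector problem with mean g*H_g slots, versus exactly g slots for round-robin:

```python
import random

def random_schedule_slots(g, trials=5000, seed=0):
    """Toy model: a file split into g generations, one needed packet
    each, no erasures. Each slot the server picks a generation uniformly
    at random; the file is done when every generation has been received
    at least once (coupon collector, E[T] = g * H_g)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        seen, slots = set(), 0
        while len(seen) < g:
            seen.add(rng.randrange(g))
            slots += 1
        total += slots
    return total / trials

g = 20
rand_mean = random_schedule_slots(g)   # theory: 20 * H_20, about 72 slots
# Round-robin needs exactly g = 20 slots in this erasure-free toy model.
```

The g*H_g versus g gap is the logarithmic penalty of uncoordinated scheduling that overlapping-generation codes are designed to close.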


Allerton Conference on Communication, Control, and Computing | 2015

Efficient replication of queued tasks for latency reduction in cloud systems

Gauri Joshi; Emina Soljanin; Gregory W. Wornell

In cloud computing systems, assigning a job to multiple servers and waiting for the earliest copy to finish is an effective method to combat the variability in response time of individual servers. Although adding redundant replicas always reduces service time, the total computing time spent per job may be higher, thus increasing waiting time in queue. The total time spent per job is also proportional to the cost of computing resources. We analyze how different redundancy strategies, e.g., the number of replicas and the times when they are issued and canceled, affect the latency and computing cost. We get the insight that the log-concavity of the service time distribution is a key factor in determining whether adding redundancy reduces latency and cost. If the service distribution is log-convex, then adding maximum redundancy reduces both latency and cost. If instead it is log-concave, then having fewer replicas and canceling the redundant requests early is more effective.
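The log-concavity insight can be checked numerically. In the sketch below (the distribution parameters are illustrative choices), launching two replicas and canceling the loser cuts latency to E[min(X1, X2)], while the machine-time cost becomes 2*E[min(X1, X2)]: larger than the single-copy cost E[X] for a log-concave shifted exponential, smaller than E[X] for a log-convex hyperexponential mixture:

```python
import random

def mean_min2_vs_single(draw, trials=200000, seed=0):
    """Compare one run of a task against two replicas where the slower
    copy is canceled when the faster finishes.
    Returns (E[X], E[min(X1, X2)]) estimated by Monte Carlo."""
    rng = random.Random(seed)
    s1 = s2 = 0.0
    for _ in range(trials):
        a, b = draw(rng), draw(rng)
        s1 += a
        s2 += min(a, b)
    return s1 / trials, s2 / trials

# Log-concave example: shifted exponential 1 + Exp(1).
shifted = lambda rng: 1.0 + rng.expovariate(1.0)
# Log-convex example: hyperexponential (mostly fast, occasionally slow).
hyper = lambda rng: (rng.expovariate(10.0) if rng.random() < 0.9
                     else rng.expovariate(0.1))

ex_s, emin_s = mean_min2_vs_single(shifted)
ex_h, emin_h = mean_min2_vs_single(hyper)
# Machine-time cost: 2 * E[min] with two replicas vs E[X] with one.
```

For the shifted exponential, 2*E[min] = 3.0 exceeds E[X] = 2.0, so full redundancy is costly; for the hyperexponential, 2*E[min] is far below E[X], so redundancy reduces both latency and cost, matching the dichotomy stated in the abstract.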


International Symposium on Information Theory | 2015

Playback delay in on-demand streaming communication with feedback

Kaveh Mahdaviani; Ashish Khisti; Gauri Joshi; Gregory W. Wornell

We consider a streaming communication system where the source packets must be played back sequentially at the destination and study the associated average playback delay. We assume that all the source packets are available before the start of transmission at the transmitter and consider the case of an i.i.d. erasure channel with perfect feedback. We first consider the case when the receiver buffer can be arbitrarily large, and show that the average playback delay remains bounded in the length of the stream provided that the channel bandwidth is greater than a critical threshold. Our analysis involves the application of martingale theory to study the transient behavior of a one-dimensional random walk with drift. Conversely, when the channel bandwidth is smaller than the above threshold, the average playback delay increases linearly with the stream length. We also consider the finite buffer case and analyze the playback delay of a greedy dynamic bandwidth scheme. We further show through simulations that the achievable delay with a finite receiver buffer is close to the infinite buffer case for moderately large buffer values.
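The bandwidth threshold for bounded playback delay can be seen in a simplified slotted model (an assumption; the martingale analysis is not reproduced here): with c transmission attempts per slot, erasure probability e, perfect feedback, and packet i due at slot i, the average delay stays small when the delivery rate c*(1 - e) exceeds the playback rate of 1 packet per slot, and grows with the stream length when it falls below:

```python
import random

def avg_playback_delay(c, erasure, npackets, seed=0):
    """Slotted i.i.d. erasure channel with perfect feedback: each slot
    the sender gets c attempts, each delivering the next needed packet
    with prob (1 - erasure). Packet i is due at slot i; its playback
    delay is how many slots past i it becomes available."""
    rng = random.Random(seed)
    received, slot, total = 0, 0, 0
    for i in range(1, npackets + 1):
        while received < i:
            slot += 1
            received += sum(rng.random() < 1 - erasure for _ in range(c))
        total += max(0, slot - i)
    return total / npackets

# Delivery rate c*(1-e): 1.5 (above threshold) vs 0.9 (below threshold).
fast = avg_playback_delay(c=3, erasure=0.5, npackets=4000)
slow = avg_playback_delay(c=3, erasure=0.7, npackets=4000)
```

Above the threshold the receive process outpaces playback and the average delay is a small constant; below it the deficit random walk drifts away and the average delay grows roughly linearly in the stream length, mirroring the dichotomy in the abstract.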

Collaboration


An overview of Gauri Joshi's top co-authors and their affiliations.

Top Co-Authors


Gregory W. Wornell

Massachusetts Institute of Technology


Da Wang

Massachusetts Institute of Technology


Yuval Kochman

Hebrew University of Jerusalem


Yanpei Liu

University of Wisconsin-Madison
