Harsha V. Madhyastha
University of Michigan
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Harsha V. Madhyastha.
internet measurement conference | 2009
Rupa Krishnan; Harsha V. Madhyastha; Sridhar Srinivasan; Sushant Jain; Arvind Krishnamurthy; Thomas E. Anderson; Jie Gao
Replicating content across a geographically distributed set of servers and redirecting clients to the closest server in terms of latency has emerged as a common paradigm for improving client performance. In this paper, we analyze latencies measured from servers in Googles content distribution network (CDN) to clients all across the Internet to study the effectiveness of latency-based server selection. Our main result is that redirecting every client to the server with least latency does not suffice to optimize client latencies. First, even though most clients are served by a geographically nearby CDN node, a sizeable fraction of experience latencies several tens of milliseconds higher than other in the same region. Second, we find that queueing delays often override the benefits of a client interacting with a nearby server. To help the administrators of Googles CDN cope with these problems, we have built a system called WhyHigh. First, WhyHigh measures client latencies across all nodes in the CDN and correlates measurements to identify the prefixes affected by inflated latencies. Second, since clients in several thousand prefixes have poor latencies, WhyHigh prioritizes problems based on the impact that solving them would have, e.g., by identifying either an AS path common to several inflated prefixes or a CDN node where path inflation is widespread. Finally, WhyHigh diagnoses the causes for inflated latencies using active measurements such as traceroutes and pings, in combination with datasets such as BGP paths and flow records. Typical causes discovered include lack of peering, routing misconfigurations, and side-effects of traffic engineering. We have used WhyHigh to diagnose several instances of inflated latencies, and our efforts over the course of a year have significantly helped improve the performance offered to clients by Googles CDN.
passive and active network measurement | 2013
Curtis Yu; Cristian Lumezanu; Yueping Zhang; Vishal Singh; Guofei Jiang; Harsha V. Madhyastha
Flow-based programmable networks must continuously monitor performance metrics, such as link utilization, in order to quickly adapt forwarding rules in response to changes in workload. However, existing monitoring solutions either require special instrumentation of the network or impose significant measurement overhead. In this paper, we propose a push-based approach to performance monitoring in flow-based networks, where we let the network inform us of performance changes, rather than query it ourselves on demand. Our key insight is that control messages sent by switches to the controller carry information that allows us to estimate performance. In OpenFlow networks, PacketIn and FlowRemoved messages--sent by switches to the controller upon the arrival of a new flow or upon the expiration of a flow entry, respectively--enable us to compute the utilization of links between switches. We conduct a) experiments on a real testbed, and b) simulations with real enterprise traces, to show accuracy, and that it can refresh utilization information frequently (e.g., at most every few seconds) given a constant stream of control messages. Since the number of control messages may be limited by the properties of traffic (e.g., long flows trigger sparse FlowRemoveds) or by the choices made by operators (e.g., proactive or wildcard rules eliminate or limit PacketIns), we discuss how our proposed passive approach can be combined with active approaches with low overhead.
symposium on operating systems principles | 2013
Zhe Wu; Michael Butkiewicz; Dorian Perkins; Ethan Katz-Bassett; Harsha V. Madhyastha
By offering storage services in several geographically distributed data centers, cloud computing platforms enable applications to offer low latency access to user data. However, application developers are left to deal with the complexities associated with choosing the storage services at which any object is replicated and maintaining consistency across these replicas. In this paper, we present SPANStore, a key-value store that exports a unified view of storage services in geographically distributed data centers. To minimize an application providers cost, we combine three key principles. First, SPANStore spans multiple cloud providers to increase the geographical density of data centers and to minimize cost by exploiting pricing discrepancies across providers. Second, by estimating application workload at the right granularity, SPANStore judiciously trades off greater geo-distributed replication necessary to satisfy latency goals with the higher storage and data propagation costs this entails in order to satisfy fault tolerance and consistency requirements. Finally, SPANStore minimizes the use of compute resources to implement tasks such as two-phase locking and data propagation, which are necessary to offer a global view of the storage services that it builds upon. Our evaluation of SPANStore shows that it can lower costs by over 10x in several scenarios, in comparison with alternative solutions that either use a single storage provider or replicate every object to every data center from which it is accessed.
internet measurement conference | 2006
Harsha V. Madhyastha; Thomas E. Anderson; Arvind Krishnamurthy; Neil Spring; Arun Venkataramani
Several models have been recently proposed for predicting the latency of end to end Internet paths. These models treat the Internet as a black-box, ignoring its internal structure. While these models are simple, they can often fail systematically; for example, the most widely used models use metric embeddings that predict no benefit to detour routes even though half of all Internet routes can benefit from detours.In this paper, we adopt a structural approach that predicts path latency based on measurements of the Internets routing topology, PoP connectivity, and routing policy. We find that our approach outperforms Vivaldi, the most widely used black-box model. Furthermore, unlike metric embeddings, our approach successfully predicts 65% of detour routes in the Internet. The number of measurements used in our approach is comparable with that required by black box techniques, but using traceroutes instead of pings.
acm special interest group on data communication | 2013
Zhe Wu; Michael Butkiewicz; Dorian Perkins; Ethan Katz-Bassett; Harsha V. Madhyastha
Existing cloud computing platforms leave it up to applications to deal with the complexities associated with data replication and propagation across data centers. In our work, we propose the CSPAN key-value store to instead export a unified view of storage services in several geographically distributed data centers. To minimize the cost incurred by application providers, we combine two principles. First, CSPAN spans the data centers of multiple cloud providers. Second, CSPAN judiciously trades off the lower latencies and the higher storage and data propagation costs based on an applications anticipated workload, latency goals, and consistency requirements.
ieee symposium on security and privacy | 2012
Masoud Akhoondi; Curtis Yu; Harsha V. Madhyastha
The widely used Tor anonymity network is designed to enable low-latency anonymous communication. However, in practice, interactive communication on Tor-which accounts for over 90% of connections in the Tor network [1]-incurs latencies over 5x greater than on the direct Internet path. In addition, since path selection to establish a circuit in Tor is oblivious to Internet routing, anonymity guarantees can breakdown in cases where an autonomous system (AS) can correlate traffic across the entry and exit segments of a circuit. In this paper, we show that both of these shortcomings in Tor can be addressed with only client-side modifications, i.e., without requiring a revamp of the entire Tor architecture. To this end, we design and implement a new Tor client, LASTor. First, we show that LASTor can deliver significant latency gains over the default Tor client by simply accounting for the inferred locations of Tor relays while choosing paths. Second, since the preference for low latency paths reduces the entropy of path selection, we design LASTors path selection algorithm to be tunable. A user can choose an appropriate tradeoff between latency and anonymity by specifying a value between 0 (lowest latency) and 1 (highest anonymity) for a single parameter. Lastly, we develop an efficient and accurate algorithm to identify paths on which an AS can correlate traffic between the entry and exit segments. This algorithm enables LASTor to avoid such paths and improve a users anonymity, while the low runtime of the algorithm ensures that the impact on end-to-end latency of communication is low. By applying our techniques to measurements of real Internet paths and by using LASTor to visit the top 200 websites from several geographically-distributed end-hosts, we show that, in comparison to the default Tor client, LASTor reduces median latencies by 25% while also reducing the false negative rate of not detecting a potential snooping AS from 57% to 11%.
acm special interest group on data communication | 2012
Ethan Katz-Bassett; Colin Scott; David R. Choffnes; Ítalo Cunha; Vytautas Valancius; Nick Feamster; Harsha V. Madhyastha; Thomas E. Anderson; Arvind Krishnamurthy
The Internet was designed to always find a route if there is a policy-compliant path. However, in many cases, connectivity is disrupted despite the existence of an underlying valid path. The research community has focused on short-term outages that occur during route convergence. There has been less progress on addressing avoidable long-lasting outages. Our measurements show that long-lasting events contribute significantly to overall unavailability. To address these problems, we develop LIFEGUARD, a system for automatic failure localization and remediation. LIFEGUARD uses active measurements and a historical path atlas to locate faults, even in the presence of asymmetric paths and failures. Given the ability to locate faults, we argue that the Internet protocols should allow edge ISPs to steer traffic to them around failures, without requiring the involvement of the network causing the failure. Although the Internet does not explicitly support this functionality today, we show how to approximate it using carefully crafted BGP messages. LIFEGUARD employs a set of techniques to reroute around failures with low impact on working routes. Deploying LIFEGUARD on the Internet, we find that it can effectively route traffic around an AS without causing widespread disruption.
IEEE ACM Transactions on Networking | 2014
Masoud Akhoondi; Curtis Yu; Harsha V. Madhyastha
Though the widely used Tor anonymity network is designed to enable low-latency anonymous communication, interactive communications on Tor incur latencies over 5 greater than on the direct Internet path, and in many cases, autonomous systems (ASs) can compromise anonymity via correlations of network traffic. In this paper, we develop LASTor, a new Tor client that addresses these shortcomings in Tor with only client-side modifications. First, LASTor improves communication latencies by accounting for the inferred locations of Tor relays while choosing paths. Since the preference for shorter paths reduces the entropy of path selection, we design LASTor so that a user can choose an appropriate tradeoff between latency and anonymity. Second, we develop an efficient and accurate algorithm to identify paths on which an AS can compromise anonymity by traffic correlation. LASTor avoids such paths to improve a users anonymity, and the low run-time of the algorithm ensures that the impact on end-to-end communication latencies is low. Our results show that, in comparison to the default Tor client, LASTor reduces median latencies by 25% while also reducing the false negative rate of not detecting a potential snooping AS from 57% to 11%.Though the widely used Tor anonymity network is designed to enable low-latency anonymous communication, interactive communications on Tor incur latencies over 5 × greater than on the direct Internet path, and in many cases, autonomous systems (ASs) can compromise anonymity via correlations of network traffic. In this paper, we develop LASTor, a new Tor client that addresses these shortcomings in Tor with only client-side modifications. First, LASTor improves communication latencies by accounting for the inferred locations of Tor relays while choosing paths. Since the preference for shorter paths reduces the entropy of path selection, we design LASTor so that a user can choose an appropriate tradeoff between latency and anonymity. Second, we develop an efficient and accurate algorithm to identify paths on which an AS can compromise anonymity by traffic correlation. LASTor avoids such paths to improve a users anonymity, and the low runtime of the algorithm ensures that the impact on end-to-end communication latencies is low. Our results show that, in comparison to the default Tor client, LASTor reduces median latencies by 25% while also reducing the false negative rate of not detecting a potential snooping AS from 57% to 11%.
internet measurement conference | 2010
Justine Sherry; Ethan Katz-Bassett; Mary Pimenova; Harsha V. Madhyastha; Thomas E. Anderson; Arvind Krishnamurthy
Operators and researchers want accurate router-level views of the Internet for purposes including troubleshooting and modeling. However, tools such as traceroute return IP addresses. Because routers may have dozens of IP addresses, or aliases, multiple measurements may return different addresses, obscuring whether they represent the same machine. While many techniques exist to address this issue by identifying some IP aliases, these techniques, even in combination, find only a subset of alias pairs. To improve this state, we design and evaluate a new alias resolution technique using the IP prespecified timestamp option. This option allows a sender to request timestamp val- ues from multiple IP addresses in the same probe. By careful arrangement of these IP addresses, we show that we can infer aliases in many cases. In this paper, we conduct a measurement study of how many routers support IP timestamps, demonstrating that enough honor the option to base our technique on it. Using our technique, and compared to the most accurate alias information available, we find that 94.7% of the aliases identified by our technique are true positives. Further, we show that our IP timestamp-based technique complements existing alias resolution techniques, providing significant gains by discovering previously unidentifiable aliases.
conference on emerging network experiment and technology | 2012
Mustafa Y. Arslan; Indrajeet Singh; Shailendra Singh; Harsha V. Madhyastha; Karthikeyan Sundaresan; Srikanth V. Krishnamurthy
Every night, a large number of idle smartphones are plugged into a power source for recharging the battery. Given the increasing computing capabilities of smartphones, these idle phones constitute a sizeable computing infrastructure. Therefore, for an enterprise which supplies its employees with smartphones, we argue that a computing infrastructure that leverages idle smartphones being charged overnight is an energy-efficient and cost-effective alternative to running tasks on traditional server infrastructure. While parallel execution and scheduling models exist for servers (e.g., MapReduce), smartphones present a unique set of technical challenges due to the heterogeneity in CPU clock speed, variability in network bandwidth, and lower availability compared to servers. In this paper, we address many of these challenges to develop CWC---a distributed computing infrastructure using smartphones. Specifically, our contributions are: (i) we profile the charging behaviors of real phone owners to show the viability of our approach, (ii) we enable programmers to execute parallelizable tasks on smartphones with little effort, (iii) we develop a simple task migration model to resume interrupted task executions, and (iv) we implement and evaluate a prototype of CWC (with 18 Android smartphones) that employs an underlying novel scheduling algorithm to minimize the makespan of a set of tasks. Our extensive evaluations demonstrate that the performance of our approach makes our vision viable. Further, we explicitly evaluate the performance of CWCs scheduling component to demonstrate its efficacy compared to other possible approaches.