Phillipa Gill
Stony Brook University
Publications
Featured research published by Phillipa Gill.
Internet Measurement Conference | 2007
Phillipa Gill; Martin F. Arlitt; Zongpeng Li; Anirban Mahanti
This paper presents a traffic characterization study of the popular video sharing service, YouTube. Over a three-month period we observed almost 25 million transactions between users on an edge network and YouTube, including more than 600,000 video downloads. We also monitored the globally popular videos over this period. In the paper we examine usage patterns, file properties, popularity and referencing characteristics, and transfer behaviors of YouTube, and compare them to traditional Web and media streaming workload characteristics. We conclude with a discussion of the implications of the observed characteristics. For example, we find that, as with the traditional Web, caching could improve the end-user experience, reduce network bandwidth consumption, and reduce the load on YouTube's core server infrastructure. Unlike traditional Web caching, Web 2.0 provides additional metadata that should be exploited to improve the effectiveness of strategies like caching.
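The closing point about metadata-aware caching can be sketched concretely. Below is a minimal illustration, not the paper's implementation: an LRU edge cache that records the related-video IDs served alongside each download as prefetch candidates. The class and helper names are hypothetical.

```python
from collections import OrderedDict

def fetch_from_origin(video_id):
    """Hypothetical stand-in for a fetch from YouTube's servers."""
    return f"<content of {video_id}>"

class MetadataAwareCache:
    """LRU cache that treats related-video metadata as prefetch hints."""

    def __init__(self, capacity=1000):
        self.capacity = capacity
        self.store = OrderedDict()   # video_id -> content, in LRU order
        self.prefetch_queue = []     # related videos worth fetching off-peak

    def get(self, video_id, related_ids=()):
        if video_id in self.store:
            self.store.move_to_end(video_id)   # cache hit: refresh LRU position
            return self.store[video_id]
        content = fetch_from_origin(video_id)  # cache miss: go to origin
        self.store[video_id] = content
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)     # evict least recently used
        # Web 2.0 metadata: related-video lists hint at likely future requests.
        self.prefetch_queue.extend(r for r in related_ids
                                   if r not in self.store)
        return content
```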
Workshop on Online Social Networks | 2008
Balachander Krishnamurthy; Phillipa Gill; Martin F. Arlitt
Web 2.0 has brought about several new applications that have enabled arbitrary subsets of users to communicate with each other on a social basis. Such communication increasingly happens not just on Facebook and MySpace but on several smaller network applications such as Twitter and Dodgeball. We present a detailed characterization of Twitter, an application that allows users to send short messages. We gathered three datasets (covering nearly 100,000 users) including constrained crawls of the Twitter network using two different methodologies, and a sampled collection from the publicly available timeline. We identify distinct classes of Twitter users and their behaviors, geographic growth patterns and current size of the network, and compare crawl results obtained under rate limiting constraints.
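For readers unfamiliar with constrained crawling, a minimal sketch of a rate-limited breadth-first crawl of a follower graph follows. The fetch_followers() stub and the sleep interval are assumptions; the actual crawls used the site's API under its own rate limits.

```python
import time
from collections import deque

RATE_LIMIT_SLEEP = 1.0  # seconds between requests; tune to the API's limits

def fetch_followers(user_id):
    """Hypothetical stub standing in for a real API call."""
    return []

def crawl(seed_users, max_users=100_000):
    """Breadth-first crawl outward from seed users, collecting edges."""
    seen, queue = set(seed_users), deque(seed_users)
    edges = []
    while queue and len(seen) < max_users:
        user = queue.popleft()
        for follower in fetch_followers(user):
            edges.append((follower, user))
            if follower not in seen:
                seen.add(follower)
                queue.append(follower)
        time.sleep(RATE_LIMIT_SLEEP)  # stay under the rate limit
    return seen, edges
```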
ACM SIGCOMM | 2011
Phillipa Gill; Navendu Jain; Nachiappan Nagappan
We present the first large-scale analysis of failures in a data center network. Through our analysis, we seek to answer several fundamental questions: which devices/links are most unreliable, what causes failures, how do failures impact network traffic, and how effective is network redundancy? We answer these questions using multiple data sources commonly collected by network operators. The key findings of our study are that (1) data center networks show high reliability, (2) commodity switches such as ToRs and AggS are highly reliable, (3) load balancers dominate in terms of failure occurrences, with many short-lived software-related faults, (4) failures have the potential to cause loss of many small packets such as keep-alive messages and ACKs, and (5) network redundancy is only 40% effective in reducing the median impact of failure.
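Finding (5) rests on comparing traffic carried during a failure to traffic before it, at both the individual-device and redundancy-group level. A minimal sketch of that style of computation, with made-up numbers (the study works from operators' own data sources):

```python
from statistics import median

def normalized_traffic(events):
    """events: list of (traffic_during_failure, traffic_before_failure) pairs.
    Returns the median ratio; 1.0 means failures had no traffic impact."""
    return median(during / before for during, before in events if before > 0)

# If redundancy were perfect, group-level traffic would stay near 1.0 even
# when individual links drop well below it.
link_events  = [(0.2, 1.0), (0.5, 1.0), (0.3, 1.0)]  # illustrative only
group_events = [(0.9, 1.0), (0.7, 1.0), (0.8, 1.0)]
print("per-link median:", normalized_traffic(link_events))
print("per-group median:", normalized_traffic(group_events))
```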
Passive and Active Network Measurement | 2008
Phillipa Gill; Martin F. Arlitt; Zongpeng Li; Anirban Mahanti
In this paper we collect and analyze traceroute measurements to show that large content providers (e.g., Google, Microsoft, Yahoo!) are deploying their own wide-area networks, bringing their networks closer to users, and bypassing Tier-1 ISPs on many paths. This trend, should it continue and be adopted by more content providers, could flatten the Internet topology, and may result in numerous other consequences to users, Internet Service Providers (ISPs), content providers, and network researchers.
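A minimal sketch of the underlying measurement idea, with an illustrative (not exhaustive) Tier-1 list and an assumed IP-to-AS mapping: map each traceroute hop to an origin AS and check whether any Tier-1 network appears on the path.

```python
TIER1_ASNS = {174, 701, 1239, 3257, 3356}  # illustrative, not exhaustive

def bypasses_tier1(hops, ip_to_asn):
    """
    hops: router IPs from a traceroute toward a content provider.
    ip_to_asn: IP -> origin ASN mapping (e.g., built from BGP routing tables).
    Returns True if no hop on the path belongs to a Tier-1 network.
    """
    path_asns = {ip_to_asn.get(ip) for ip in hops}
    return path_asns.isdisjoint(TIER1_ASNS)

# Illustrative path that stays on a regional ISP and the provider's own WAN:
mapping = {"198.51.100.1": 64501, "203.0.113.7": 15169}
print(bypasses_tier1(["198.51.100.1", "203.0.113.7"], mapping))  # True
```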
ACM Transactions on Storage | 2010
Bianca Schroeder; Sotirios Damouras; Phillipa Gill
Latent sector errors (LSEs) refer to the situation where particular sectors on a drive become inaccessible. LSEs are a critical factor in data reliability, since a single LSE can lead to data loss when encountered during RAID reconstruction after a disk failure, or in systems without redundancy. LSEs happen at a significant rate in the field [Bairavasundaram et al. 2007], and are expected to grow more frequent with new drive technologies and increasing drive capacities. While two approaches, data scrubbing and intra-disk redundancy, have been proposed to reduce data loss due to LSEs, neither approach has been evaluated on real field data. This article makes two contributions. We provide an extended statistical analysis of latent sector errors in the field, specifically from the viewpoint of how to protect against LSEs. In addition to providing interesting insights into LSEs, we hope the results (including parameters for models we fit to the data) will help researchers and practitioners without access to field data to drive their simulations or analyses of LSEs. Our second contribution is an evaluation of five different scrubbing policies and five different intra-disk redundancy schemes and their potential for protecting against LSEs. Our study includes schemes and policies that have been suggested before but never evaluated on field data, as well as new policies that we propose based on our analysis of LSEs in the field.
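To make the policy-evaluation idea concrete, here is a minimal sketch, with illustrative timescales, of scoring a fixed-interval scrubbing policy by the latency between when an LSE arises and when the next scrub pass would detect it:

```python
import math

def mean_detection_latency(error_times, scrub_interval):
    """Each LSE is detected at the next multiple of scrub_interval after it
    occurs; return the average wait until detection."""
    latencies = [math.ceil(t / scrub_interval) * scrub_interval - t
                 for t in error_times]
    return sum(latencies) / len(latencies)

errors = [3.0, 10.5, 14.2, 40.0]   # hours at which LSEs appear (made up)
for interval in (6, 12, 24):       # scrub every N hours
    print(f"scrub every {interval}h:",
          f"{mean_detection_latency(errors, interval):.1f}h mean latency")
```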
Security and Privacy in Smartphones and Mobile Devices | 2011
Kathy Wain Yee Au; Yi Fan Zhou; Zhen Huang; Phillipa Gill; David Lie
Many smartphone operating systems implement strong sandboxing for third-party application software. As part of this sandboxing, they feature a permission system, which conveys to users what sensitive resources an application will access and allows users to grant or deny permission to access those resources. In this paper we survey the permission systems of several popular smartphone operating systems and taxonomize them by the amount of control they give users, the amount of information they convey to users, and the level of interactivity they require from users. We discuss the problem of permission overdeclaration and devise a set of goals that security researchers should aim for, as well as propose directions through which we hope the research community can attain those goals.
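The overdeclaration problem can be sketched as a simple set comparison between the permissions an app declares and those its observed API calls actually require; the API-to-permission mapping below is an illustrative assumption, not the paper's data.

```python
API_PERMISSION_MAP = {  # illustrative subset of an API-to-permission mapping
    "getLastKnownLocation": "ACCESS_FINE_LOCATION",
    "sendTextMessage": "SEND_SMS",
}

def overdeclared(declared, api_calls):
    """Permissions declared but not required by any observed API call."""
    needed = {API_PERMISSION_MAP[c] for c in api_calls
              if c in API_PERMISSION_MAP}
    return set(declared) - needed

# READ_CONTACTS and ACCESS_FINE_LOCATION are declared but never exercised:
print(overdeclared({"SEND_SMS", "ACCESS_FINE_LOCATION", "READ_CONTACTS"},
                   ["sendTextMessage"]))
```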
ACM SIGCOMM | 2013
Phillipa Gill; Michael Schapira; Sharon Goldberg
Researchers studying the inter-domain routing system typically rely on models to fill in the gaps created by the lack of information about the business relationships and routing policies used by individual autonomous systems. To shed light on this unknown information, we asked 100 network operators about their routing policies, billing models, and thoughts on routing security. This short paper reports the survey's results and discusses their implications.
Internet Measurement Conference | 2013
Phillipa Gill; Vijay Erramilli; Augustin Chaintreau; Balachander Krishnamurthy; Konstantina Papagiannaki; Pablo Rodriguez
The large-scale collection and exploitation of personal information to drive targeted online advertisements has raised privacy concerns. As a step towards understanding these concerns, we study the relationship between how much information is collected and how valuable it is for advertising. We use HTTP traces covering millions of users to aid our study, and also present the first comparative study between aggregators. We develop a simple model that captures the various parameters of today's advertising revenues, whose values are estimated via the traces. Our results show that per-aggregator revenue is highly skewed (5% of aggregators account for 90% of revenues), while the contribution of users to advertising revenue is less skewed (20% of users account for 80% of revenue). Google is dominant in terms of both revenue and reach (present on 80% of publishers). We also show that if the top 5% of users by revenue were to install privacy protection, with no corresponding reaction from publishers, revenue could drop by 30%.
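The skew figures boil down to a simple computation over estimated per-aggregator (or per-user) revenues. A minimal sketch with made-up numbers:

```python
def top_share(revenues, top_fraction):
    """Fraction of total revenue held by the top `top_fraction` of entries."""
    ordered = sorted(revenues, reverse=True)
    k = max(1, int(len(ordered) * top_fraction))
    return sum(ordered[:k]) / sum(ordered)

revenues = [1000, 500, 40, 30, 20, 10, 5, 5, 2, 1]  # illustrative only
print(f"top 10% share: {top_share(revenues, 0.10):.0%}")
```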
ACM Transactions on the Web | 2011
Phillipa Gill; Martin F. Arlitt; Niklas Carlsson; Anirban Mahanti; Carey L. Williamson
Today’s Web provides many different functionalities, including communication, entertainment, social networking, and information retrieval. In this article, we analyze traces of HTTP activity from a large enterprise and from a large university to identify and characterize Web-based service usage. Our work provides an initial methodology for the analysis of Web-based services. While it is nontrivial to identify the classes, instances, and providers for each transaction, our results show that most of the traffic comes from a small subset of providers, which can be classified manually. Furthermore, we assess both qualitatively and quantitatively how the Web has evolved over the past decade, and discuss the implications of these changes.
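A minimal sketch of the manual provider classification the article describes, keyed on HTTP Host headers; the rules and categories below are illustrative assumptions, not the article's taxonomy.

```python
PROVIDER_RULES = {  # keyword in hostname -> (provider, service class)
    "google": ("Google", "information retrieval"),
    "facebook": ("Facebook", "social networking"),
    "youtube": ("Google", "entertainment"),
}

def classify(host):
    """Return (provider, service class) for a Host header, or None."""
    for keyword, label in PROVIDER_RULES.items():
        if keyword in host.lower():
            return label
    return None  # long tail left unclassified

print(classify("www.youtube.com"))  # ('Google', 'entertainment')
```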
Passive and Active Network Measurement | 2013
Rahul Hiran; Niklas Carlsson; Phillipa Gill
China Telecom's hijack of approximately 50,000 IP prefixes in April 2010 highlights the potential for traffic interception on the Internet. Indeed, the sensitive nature of the hijacked prefixes, which included US government agencies, garnered a great deal of attention and highlights the importance of being able to characterize such incidents after they occur. We use the China Telecom incident as a case study to understand (1) what can be learned about large-scale routing anomalies using public data sets, and (2) what types of data should be collected to diagnose routing anomalies in the future. We develop a methodology for inferring which prefixes may be impacted by traffic interception using only control-plane data, and validate our technique using data-plane traces. The key findings of our study of the China Telecom incident are: (1) the geographic distribution of announced prefixes is similar to the global distribution, with a tendency towards prefixes registered in the Asia-Pacific region, (2) there is little evidence for subprefix hijacking, which supports the hypothesis that this incident was likely a leak of existing routes, and (3) by preferring customer routes, providers inadvertently enabled interception of their customers' traffic.
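One control-plane heuristic consistent with the methodology described here is to flag prefixes announced from a previously unseen origin AS. A minimal sketch with hypothetical data structures (the study itself draws on public BGP collector data and validates against data-plane traces):

```python
from collections import defaultdict

def find_origin_anomalies(updates, history):
    """
    updates: iterable of (prefix, origin_asn) seen during the incident window.
    history: dict mapping prefix -> set of origin ASNs seen beforehand.
    Returns prefixes announced from a previously unseen origin AS.
    """
    suspects = defaultdict(set)
    for prefix, origin in updates:
        if origin not in history.get(prefix, set()):
            suspects[prefix].add(origin)
    return suspects

history = {"203.0.113.0/24": {64500}}               # illustrative baseline
updates = [("203.0.113.0/24", 4134),                # new origin: flagged
           ("203.0.113.0/24", 64500)]               # known origin: ignored
print(find_origin_anomalies(updates, history))
```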