Craig E. Wills | Researchain

Publication

Featured research published by Craig E. Wills.

acm special interest group on data communication | 2001

On the use and performance of content distribution networks

Balachander Krishnamurthy; Craig E. Wills; Yin Zhang

Content distribution networks (CDNs) are a mechanism to deliver content to end users on behalf of origin Web sites. Content distribution offloads work from origin servers by serving some or all of the contents of Web pages. We found an order of magnitude increase in the number and percentage of popular origin sites using CDNs between November 1999 and December 2000. In this paper we discuss how CDNs are commonly used on the Web and define a methodology to study how well they perform. A performance study was conducted over a period of months on a set of CDN companies employing the techniques of DNS redirection and URL rewriting to balance load among their servers. Some CDNs generally provide better results than others when we examine results from a set of clients. The performance of one CDN company clearly improved between the two testing periods in our study due to a dramatic increase in the number of distinct servers employed in its network. More generally, the results indicate that use of a DNS lookup in the critical path of a resource retrieval does not generally result in better server choices being made relative to client response time in either average or worst case situations.

workshop on online social networks | 2008

Characterizing privacy in online social networks

Balachander Krishnamurthy; Craig E. Wills

Online social networks (OSNs) with half a billion users have dramatically raised concerns on privacy leakage. Users, often willingly, share personal identifying information about themselves, but do not have a clear idea of who accesses their private information or what portion of it really needs to be accessed. In this study we examine popular OSNs from a viewpoint of characterizing potential privacy leakage. Our study identifies what bits of information are currently being shared, how widely, and what users can do to prevent such sharing. We also examine the role of third-party sites that track OSN users and compare with privacy leakage on popular traditional Web sites. Our long term goal is to identify the narrow set of private information that users really need to share to accomplish specific interactions on OSNs.

international world wide web conferences | 1998

Piggyback server invalidation for proxy cache coherency

Balachander Krishnamurthy; Craig E. Wills

We present a piggyback server invalidation (PSI) mechanism for maintaining stronger cache coherency in Web proxy caches while reducing overall costs. The basic idea is for servers to piggyback, on a reply to a proxy client, the list of resources that have changed since the last access by the client. The proxy client invalidates cached entries on the list and can extend the lifetime of entries not on the list. This continues our prior work on piggyback cache validation (PCV) where we focused on piggybacking validation requests from the proxy cache to the server. Trace-driven simulation of PSI on two large, independent proxy log data sets, augmented with data from several server logs, shows PSI provides close to strong cache coherency while reducing the request traffic compared to existing cache coherency techniques. The best overall performance is obtained when the PSI and PCV techniques are combined. Compared to the best TTL-based policy, this hybrid policy reduces the average cost (considering response latency, request messages and bandwidth) by 7–9% and reduces the staleness ratio by 82–86%, yielding a staleness ratio of 0.001.
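
The basic PSI exchange described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Server` and `Proxy` classes, the reply dictionary with its `invalidate` field, the fixed `ttl`, and the single-server setup are all assumptions made for illustration, and the sketch leaves out the PCV side and the cost model entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical origin server that tracks when each resource last changed."""
    modified_at: dict = field(default_factory=dict)   # url -> time of last change
    last_contact: dict = field(default_factory=dict)  # proxy id -> time of last reply

    def update(self, url, now):
        """Record that a resource changed at time `now`."""
        self.modified_at[url] = now

    def handle_request(self, proxy_id, url, now):
        """Reply with the resource plus a piggybacked list of resources
        that changed since this proxy's previous contact."""
        since = self.last_contact.get(proxy_id, float("-inf"))
        changed = sorted(
            u for u, t in self.modified_at.items() if t > since and u != url
        )
        self.last_contact[proxy_id] = now
        body = f"content of {url} at {self.modified_at[url]}"
        return {"url": url, "body": body, "invalidate": changed}


@dataclass
class Proxy:
    """Hypothetical proxy cache in front of a single server (an assumption)."""
    proxy_id: str
    ttl: float
    cache: dict = field(default_factory=dict)  # url -> (body, expires_at)

    def get(self, server, url, now):
        entry = self.cache.get(url)
        if entry is not None and entry[1] > now:
            return entry[0]  # still considered fresh: no server contact

        reply = server.handle_request(self.proxy_id, url, now)

        # Invalidate cached entries that appear on the piggybacked list.
        for stale_url in reply["invalidate"]:
            self.cache.pop(stale_url, None)

        # Entries not on the list are known to be unchanged: extend their lifetime.
        for cached_url, (body, _) in list(self.cache.items()):
            self.cache[cached_url] = (body, now + self.ttl)

        self.cache[url] = (reply["body"], now + self.ttl)
        return reply["body"]
```

In this sketch, a proxy that has cached `/a` and later asks the server for `/b` learns from the piggybacked list whether `/a` changed in the meantime, without sending a separate validation request for `/a`.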

international world wide web conferences | 2000

Analyzing factors that influence end-to-end Web performance

Balachander Krishnamurthy; Craig E. Wills

Web performance impacts the popularity of a particular Web site or service as well as the load on the network, but there have been no publicly available end-to-end measurements that have focused on a large number of popular Web servers examining the components of delay or the effectiveness of the recent changes to the HTTP protocol. In this paper we report on an extensive study carried out from many client sites geographically distributed around the world to a collection of over 700 servers to which a majority of Web traffic is directed. Our results show that the HTTP/1.1 protocol, particularly with pipelining, is indeed an improvement over existing practice, but that servers serving a small number of objects or closing a persistent connection without explicit notification can reduce or eliminate any performance improvement. Similarly, use of caching and multi-server content distribution can also improve performance if done effectively.

internet measurement conference | 2006

Generating a privacy footprint on the internet

Balachander Krishnamurthy; Craig E. Wills

As a follow up to characterizing traffic deemed as unwanted by Web clients such as advertisements, we examine how information related to individual users is aggregated as a result of browsing seemingly unrelated Web sites. We examine the privacy diffusion on the Internet, hidden transactions, and the potential for a few sites to be able to construct a profile of individual users. We define and generate a privacy footprint allowing us to assess and compare the diffusion of privacy information across a wide variety of sites. We examine the effectiveness of existing and new techniques to reduce this diffusion. Our results show that the size of the privacy footprint is a legitimate cause for concern across the sets of sites that we study.

international world wide web conferences | 1999

Towards a better understanding of Web resources and server responses for improved caching

Craig E. Wills; Mikhail Mikhailov

This work focuses on characterizing information about Web resources and server responses that is relevant to Web caching. The approach is to study a set of URLs at a variety of sites and gather statistics about the rate and nature of changes compared with the resource type. In addition, we gather response header information reported by the servers with each retrieved resource. Results from the work indicate that there is potential to reuse more cached resources than is currently being realized due to inaccurate and nonexistent cache directives. In terms of implications for caching, the relationships between resources used to compose a page must be considered. Embedded images are often reused, even in pages that change frequently. This result both points to the need to cache such images and to discard them when they are no longer included as part of any page. Finally, while the results show that HTML resources frequently change, these changes can be in a predictable and localized manner. Separating out the dynamic portions of a page into their own resources allows relatively static portions to be cached, while retrieval of the dynamic resources can trigger retrieval of new resources along with any invalidation of already cached resources.

symposium on usable privacy and security | 2007

Measuring privacy loss and the impact of privacy protection in web browsing

Balachander Krishnamurthy; Delfina Malandrino; Craig E. Wills

Various bits of information about users accessing Web sites, some of which are private, have been gathered since the inception of the Web. Increasingly the gathering, aggregation, and processing has been outsourced to third parties. The goal of this work is to examine the effectiveness of specific techniques to limit this diffusion of private information to third parties. We also examine the impact of these privacy protection techniques on the usability and quality of the Web pages returned. Using objective measures for privacy protection and page quality we examine their tradeoffs for different privacy protection techniques applied to a collection of popular Web sites as well as a focused set of sites with significant privacy concerns. We study privacy protection both at a browser and at a proxy.

Computer Communications | 2001

Studying the impact of more complete server information on Web caching

Craig E. Wills; Mikhail Mikhailov

Caching of objects in the World Wide Web is a widely used technique to reduce end-user latencies, network and server load. Currently deployed heuristic-based approaches to caching result in a large number of unnecessary validations, and prior results show potential for better reuse of cached Web content. This work studies a more deterministic approach to caching of Web objects. The idea is to view HTML pages as containers, holding distinct objects with heterogeneous type and change characteristics. Servers compile information about relationships between containers and embedded objects and piggyback it onto existing request/response traffic. Our results indicate that these techniques significantly improve existing cache management strategies.

international conference on distributed computing systems | 1999

Proxy cache coherency and replacement-towards a more complete picture

Balachander Krishnamurthy; Craig E. Wills

This work studies the interaction of Web proxy cache coherency and replacement policies using trace-driven simulations. We specifically examine the relative importance of each type of policy in affecting the overall costs, the potential of incorporating coherency issues in cache replacement and the inclusion of additional factors such as frequency of resource use in replacement and coherency policies. The results show that the cache replacement policy in use is the primary cost determinant for relatively small caches, while the cache coherency policy is the determinant for larger caches. Incorporating cache coherency issues in cache replacement policies yields little improvement in overall performance. The use of access frequency in cache replacement, along with temporal locality and size information, results in a simple and better performing policy than found in previously published work. Combining this new replacement policy with the best piggyback-based cache coherency policy results in a 4.5% decrease in costs and 89% reduction in staleness ratio when compared to policy combinations in current use. Preliminary work indicates that cache replacement and coherency policies continue to affect costs in the presence of HTTP protocol enhancements such as persistent connections.

international world wide web conferences | 2001

N for the price of 1: bundling web objects for more efficient content delivery

Craig E. Wills; Mikhail Mikhailov; Hao Shang

Persistent connections address inefficiencies associated with multiple concurrent connections. They can improve response time when successfully used with pipelining to retrieve a set of objects from a Web server. In practice, however, there is inconsistent support for persistent connections, particularly with pipelining, from Web servers, user agents, and intermediaries. Web browsers continue to open multiple concurrent TCP connections to the same server. This paper proposes a new idea of packaging the set of objects embedded on a Web page into a single bundle object for retrieval by clients. Our analysis indicates that if embedded objects on a Web page are delivered to clients as a single bundle, the response time experienced by the clients is as good as or better than that provided by currently deployed mechanisms. We also show that, relative to the currently used retrieval methods, our approach reduces the load on the network and servers. The key contribution of our work is a mechanism that gives Web servers better control over the number and duration of TCP connections they support. Implementation of the mechanism requires no changes to the HTTP protocol.
