Publication


Featured research published by Martin F. Arlitt.


Internet Measurement Conference | 2007

Youtube traffic characterization: a view from the edge

Phillipa Gill; Martin F. Arlitt; Zongpeng Li; Anirban Mahanti

This paper presents a traffic characterization study of the popular video sharing service, YouTube. Over a three-month period we observed almost 25 million transactions between users on an edge network and YouTube, including more than 600,000 video downloads. We also monitored the globally popular videos over this period of time. In the paper we examine usage patterns, file properties, popularity and referencing characteristics, and transfer behaviors of YouTube, and compare them to traditional Web and media streaming workload characteristics. We conclude the paper with a discussion of the implications of the observed characteristics. For example, we find that as with the traditional Web, caching could improve the end user experience, reduce network bandwidth consumption, and reduce the load on YouTube's core server infrastructure. Unlike traditional Web caching, Web 2.0 provides additional meta-data that should be exploited to improve the effectiveness of strategies like caching.
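The closing point about exploiting Web 2.0 metadata for caching can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes each request arrives with a (hypothetical) list of related-video IDs taken from the page metadata and uses them only as prefetch hints.

```python
from collections import OrderedDict

class MetadataAwareCache:
    """Illustrative LRU video cache that also queues related videos for prefetch.

    A sketch of the abstract's suggestion, assuming related-video IDs are
    available as metadata alongside each request (hypothetical input).
    """

    def __init__(self, capacity_items=1000):
        self.capacity = capacity_items
        self.store = OrderedDict()     # video_id -> cached payload (placeholder)
        self.prefetch_queue = []       # related videos to fetch in the background

    def request(self, video_id, related_ids=()):
        hit = video_id in self.store
        if hit:
            self.store.move_to_end(video_id)     # refresh LRU position
        else:
            self._insert(video_id)
        # Web 2.0 metadata: treat related videos as likely future requests.
        for rid in related_ids:
            if rid not in self.store:
                self.prefetch_queue.append(rid)
        return hit

    def _insert(self, video_id):
        self.store[video_id] = object()          # stand-in for the video bytes
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)       # evict least recently used

# Example: a related-video hint turns a later miss into a hit once prefetched.
cache = MetadataAwareCache(capacity_items=2)
cache.request("v1", related_ids=["v2"])
for rid in list(cache.prefetch_queue):           # simulate background prefetching
    cache._insert(rid)
print(cache.request("v2"))                       # True: served from cache
```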


Workshop on Online Social Networks | 2008

A few chirps about twitter

Balachander Krishnamurthy; Phillipa Gill; Martin F. Arlitt

Web 2.0 has brought about several new applications that have enabled arbitrary subsets of users to communicate with each other on a social basis. Such communication increasingly happens not just on Facebook and MySpace but on several smaller network applications such as Twitter and Dodgeball. We present a detailed characterization of Twitter, an application that allows users to send short messages. We gathered three datasets (covering nearly 100,000 users) including constrained crawls of the Twitter network using two different methodologies, and a sampled collection from the publicly available timeline. We identify distinct classes of Twitter users and their behaviors, geographic growth patterns and current size of the network, and compare crawl results obtained under rate limiting constraints.
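The data-gathering method described (constrained crawls under rate limiting) can be sketched roughly as a rate-limited breadth-first traversal of the follower graph. The `fetch_followers` function below is a hypothetical stand-in for whatever API or scraper supplies follower lists; the paper's actual crawl methodologies are not reproduced.

```python
import time
from collections import deque

def fetch_followers(user_id):
    """Hypothetical stand-in for a data source that returns follower IDs."""
    return []   # replace with a real API client or scraper

def rate_limited_bfs(seed_users, max_users=100_000, requests_per_hour=100):
    """Breadth-first crawl of a follower graph under a simple rate limit."""
    delay = 3600.0 / requests_per_hour
    seen = set(seed_users)
    queue = deque(seed_users)
    edges = []
    while queue and len(seen) < max_users:
        user = queue.popleft()
        for follower in fetch_followers(user):
            edges.append((follower, user))
            if follower not in seen:
                seen.add(follower)
                queue.append(follower)
        time.sleep(delay)   # stay within the service's rate limit
    return seen, edges
```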


ACM Special Interest Group on Data Communication | 2006

Traffic classification using clustering algorithms

Jeffrey Erman; Martin F. Arlitt; Anirban Mahanti

Classification of network traffic using port-based or payload-based analysis is becoming increasingly difficult with many peer-to-peer (P2P) applications using dynamic port numbers, masquerading techniques, and encryption to avoid detection. An alternative approach is to classify traffic by exploiting the distinctive characteristics of applications when they communicate on a network. We pursue this latter approach and demonstrate how cluster analysis can be used to effectively identify groups of traffic that are similar using only transport layer statistics. Our work considers two unsupervised clustering algorithms, namely K-Means and DBSCAN, that have previously not been used for network traffic classification. We evaluate these two algorithms and compare them to the previously used AutoClass algorithm, using empirical Internet traces. The experimental results show that both K-Means and DBSCAN work very well and much more quickly than AutoClass. Our results indicate that although DBSCAN has lower accuracy compared to K-Means and AutoClass, DBSCAN produces better clusters.
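A minimal sketch of the general approach, using scikit-learn's K-Means and DBSCAN on transport-layer flow statistics. The feature set, parameters, and placeholder data here are illustrative assumptions; the paper's exact features and settings are not reproduced.

```python
import numpy as np
from sklearn.cluster import KMeans, DBSCAN
from sklearn.preprocessing import StandardScaler

# Each row describes one flow using transport-layer statistics only
# (illustrative features: total bytes, total packets, mean packet size,
# mean inter-arrival time, duration). Real traces would supply these.
rng = np.random.default_rng(0)
flows = rng.lognormal(mean=5.0, sigma=2.0, size=(500, 5))   # placeholder data

X = StandardScaler().fit_transform(np.log1p(flows))         # tame skewed features

kmeans_labels = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(X)
dbscan_labels = DBSCAN(eps=0.5, min_samples=10).fit_predict(X)   # -1 marks noise

print("K-Means cluster sizes:", np.bincount(kmeans_labels))
print("DBSCAN clusters found:", len(set(dbscan_labels) - {-1}))
```

Flows falling in the same cluster would then be labeled by inspecting a few known members, which is how cluster analysis turns into traffic classification.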


Measurement and Modeling of Computer Systems | 2000

Evaluating content management techniques for Web proxy caches

Martin F. Arlitt; Ludmila Cherkasova; John Dilley; Richard J. Friedrich; Tai Jin

The continued growth of the World-Wide Web and the emergence of new end-user technologies such as cable modems necessitate the use of proxy caches to reduce latency, network traffic and Web server loads. Current Web proxy caches utilize simple replacement policies to determine which files to retain in the cache. We utilize a trace of client requests to a busy Web proxy in an ISP environment to evaluate the performance of several existing replacement policies and of two new, parameterless replacement policies that we introduce in this paper. Finally, we introduce Virtual Caches, an approach for improving the performance of the cache for multiple metrics simultaneously.
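The evaluation style described here, replaying a proxy request trace against a fixed-size cache and measuring the outcome, can be sketched as follows. Plain LRU is used as a stand-in policy; the paper's new parameterless policies and Virtual Caches are not reproduced.

```python
from collections import OrderedDict

def replay_trace(trace, cache_bytes):
    """Replay (url, size) requests through an LRU cache of fixed byte capacity.

    Returns (hit rate, byte hit rate). LRU is only a stand-in policy here.
    """
    cache = OrderedDict()               # url -> size, ordered by recency
    used = 0
    hits = requests = 0
    hit_bytes = total_bytes = 0
    for url, size in trace:
        requests += 1
        total_bytes += size
        if url in cache:
            hits += 1
            hit_bytes += size
            cache.move_to_end(url)
        elif size <= cache_bytes:       # objects larger than the cache are never stored
            while used + size > cache_bytes:
                _, evicted_size = cache.popitem(last=False)
                used -= evicted_size
            cache[url] = size
            used += size
    return hits / requests, hit_bytes / total_bytes

trace = [("/a", 100), ("/b", 5000), ("/a", 100), ("/c", 200), ("/a", 100)]
print(replay_trace(trace, cache_bytes=4096))
```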


ACM Transactions on Internet Technology | 2001

Characterizing the Scalability of a Large Web-Based Shopping System

Martin F. Arlitt; Diwakar Krishnamurthy; Jerry Rolia

This article presents an analysis of five days of workload data from a large Web-based shopping system. The multitier environment of this Web-based shopping system includes Web servers, application servers, database servers, and an assortment of load-balancing and firewall appliances. We characterize user requests and sessions and determine their impact on system performance scalability. The purpose of our study is to assess scalability and support capacity planning exercises for the multitier system. We find that horizontal scalability is not always an adequate mechanism for supporting increased workloads and that personalization and robots can have a significant impact on system scalability.


International World Wide Web Conference | 2008

A comparative analysis of web and peer-to-peer traffic

Naimul Basher; Aniket Mahanti; Anirban Mahanti; Carey L. Williamson; Martin F. Arlitt

Peer-to-Peer (P2P) applications continue to grow in popularity, and have reportedly overtaken Web applications as the single largest contributor to Internet traffic. Using traces collected from a large edge network, we conduct an extensive analysis of P2P traffic, compare P2P traffic with Web traffic, and discuss the implications of increased P2P traffic. In addition to studying the aggregate P2P traffic, we also analyze and compare the two main constituents of P2P traffic in our data, namely BitTorrent and Gnutella. The results presented in the paper may be used for generating synthetic workloads, gaining insights into the functioning of P2P applications, and developing network management strategies. For example, our results suggest that new models are necessary for Internet traffic. As a first step, we present flow-level distributional models for Web and P2P traffic that may be used in network simulation and emulation experiments.
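The flow-level distributional modelling mentioned at the end can be sketched with scipy: fit a candidate distribution to observed flow sizes, check the fit, and draw synthetic flows for simulation. The placeholder data and the choice of a lognormal here are illustrative assumptions, not the paper's fitted models.

```python
import numpy as np
from scipy import stats

# Placeholder flow sizes (bytes); in practice these come from the measured traces.
rng = np.random.default_rng(1)
web_flow_sizes = rng.lognormal(mean=8.5, sigma=1.8, size=10_000)

# Fit a lognormal with location pinned at zero, then check the fit.
shape, loc, scale = stats.lognorm.fit(web_flow_sizes, floc=0)
ks_stat, p_value = stats.kstest(web_flow_sizes, "lognorm", args=(shape, loc, scale))
print(f"fitted sigma={shape:.2f}, median={scale:.0f} bytes, KS statistic={ks_stat:.3f}")

# Draw synthetic flow sizes from the fitted model, e.g. for emulation workloads.
synthetic = stats.lognorm.rvs(shape, loc=loc, scale=scale, size=1_000, random_state=2)
```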


Passive and Active Network Measurement | 2008

The flattening internet topology: natural evolution, unsightly barnacles or contrived collapse?

Phillipa Gill; Martin F. Arlitt; Zongpeng Li; Anirban Mahanti

In this paper we collect and analyze traceroute measurements to show that large content providers (e.g., Google, Microsoft, Yahoo!) are deploying their own wide-area networks, bringing their networks closer to users, and bypassing Tier-1 ISPs on many paths. This trend, should it continue and be adopted by more content providers, could flatten the Internet topology, and may result in numerous other consequences to users, Internet Service Providers (ISPs), content providers, and network researchers.
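A rough sketch of this kind of measurement: run the system traceroute toward a content provider and check whether any hop reverse-resolves to a large transit carrier's domain. The hostname fragments below are an assumption for illustration only; the paper's AS-level methodology is more careful than this.

```python
import socket
import subprocess

# Illustrative (assumed) hostname fragments for a few large transit carriers;
# real AS-level classification would use BGP or whois data instead.
TIER1_HINTS = ("level3", "lumen", "gtt", "ntt", "telia", "tata")

def traceroute_hops(host):
    """Return the hop IP addresses reported by the system traceroute."""
    out = subprocess.run(["traceroute", "-n", "-q", "1", host],
                         capture_output=True, text=True, check=False).stdout
    hops = []
    for line in out.splitlines()[1:]:          # skip the header line
        fields = line.split()
        if len(fields) >= 2 and fields[1] != "*":
            hops.append(fields[1])
    return hops

def crosses_tier1(host):
    """Crude check: does any hop's reverse-DNS name match a carrier hint?"""
    for ip in traceroute_hops(host):
        try:
            name = socket.gethostbyaddr(ip)[0].lower()
        except OSError:
            continue
        if any(hint in name for hint in TIER1_HINTS):
            return True
    return False

print(crosses_tier1("www.google.com"))
```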


Performance Evaluation | 2000

Performance evaluation of Web proxy cache replacement policies

Martin F. Arlitt; Rich Friedrich; Tai Jin

The continued growth of the World-Wide Web and the emergence of new end-user technologies such as cable modems necessitate the use of proxy caches to reduce latency, network traffic and Web server loads. In this paper we analyze the importance of different Web proxy workload characteristics in making good cache replacement decisions. We evaluate workload characteristics such as object size, recency of reference, frequency of reference, and turnover in the active set of objects. Trace-driven simulation is used to evaluate the effectiveness of various replacement policies for Web proxy caches. The extended duration of the trace (117 million requests collected over 5 months) allows long term side effects of replacement policies to be identified and quantified. Our results indicate that higher cache hit rates are achieved using size-based replacement policies. These policies store a large number of small objects in the cache, thus increasing the probability of an object being in the cache when requested. To achieve higher byte hit rates a few larger files must be retained in the cache. We found frequency-based policies to work best for this metric, as they keep the most popular files, regardless of size, in the cache. With either approach it is important that inactive objects be removed from the cache to prevent performance degradation due to pollution.
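The hit-rate versus byte-hit-rate trade-off described above comes down to the eviction rule. Two illustrative rules are sketched below (evict the largest object versus the least frequently referenced one); they could plug into a trace-replay loop like the one sketched earlier for the proxy-cache entry, and neither is the exact policy evaluated in the paper.

```python
def evict_largest(cache):
    """Size-based rule: favour many small objects, which boosts hit rate."""
    victim = max(cache, key=lambda url: cache[url]["size"])
    return cache.pop(victim)

def evict_least_frequent(cache):
    """Frequency-based rule: keep popular objects regardless of size,
    which favours byte hit rate."""
    victim = min(cache, key=lambda url: cache[url]["refs"])
    return cache.pop(victim)

# cache maps url -> {"size": bytes, "refs": reference count}; a replay loop
# would increment "refs" on hits and call one of these rules on overflow.
cache = {"/big": {"size": 10_000, "refs": 5}, "/small": {"size": 200, "refs": 1}}
print(evict_largest(dict(cache)))         # removes /big
print(evict_least_frequent(dict(cache)))  # removes /small
```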


Measurement and Modeling of Computer Systems | 1999

Workload characterization of a Web proxy in a cable modem environment

Martin F. Arlitt; Rich Friedrich; Tai Jin

This paper presents a detailed workload characterization study of a World-Wide Web proxy. Measurements from a proxy within an Internet Service Provider (ISP) environment were collected. This ISP allows clients to access the Web using high-speed cable modems rather than traditional dialup modems. By examining this site we are able to evaluate the effects that cable modems have on proxy workloads. This paper focuses on workload characteristics such as file type distribution, file size distribution, file referencing behaviour and turnover in the active set of files. We find that when presented with faster access speeds users are willing to download extremely large files. A widespread increase in the transfer of these large files would have a significant impact on the Web. This behaviour increases the importance of caching for ensuring the scalability of the Web.
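The characteristics listed (file type mix, file size distribution) are the kind of summary a short log-processing pass produces. A minimal sketch, assuming a simplified log with one "content_type size_bytes url" record per line; the real proxy log format is not reproduced here.

```python
from collections import Counter
import statistics

def characterize(log_lines):
    """Summarize file-type mix and transfer-size distribution from a simplified
    access log with 'content_type size_bytes url' per line (assumed format)."""
    types = Counter()
    sizes = []
    for line in log_lines:
        content_type, size, _url = line.split(maxsplit=2)
        types[content_type] += 1
        sizes.append(int(size))
    sizes.sort()
    return {
        "type_mix": types.most_common(),
        "median_bytes": statistics.median(sizes),
        "p99_bytes": sizes[int(0.99 * (len(sizes) - 1))],
        "max_bytes": sizes[-1],
    }

log = [
    "image/gif 4200 /banner.gif",
    "text/html 12000 /index.html",
    "video/mpeg 52000000 /movie.mpg",   # the kind of large transfer cable modems enable
]
print(characterize(log))
```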


Workshop on Software and Performance | 2005

A capacity management service for resource pools

Jerry Rolia; Ludmila Cherkasova; Martin F. Arlitt; Artur Andrzejak

Resource pools are computing environments that offer virtualized access to shared resources. When used effectively they can align the use of capacity with business needs (flexibility), lower infrastructure costs (via resource sharing), and lower operating costs (via automation). This paper describes the Quartermaster capacity manager service for managing such pools. It implements a trace-based technique that models workload (e.g., application) resource demands, their corresponding resource allocations, and resource access quality of service. The primary advantages of the technique are its accuracy, generality, support for resource access qualities of service, and optimizing search method. We pose general capacity management questions for resource pools and explain how the capacity manager helps to address them in an automated manner. A case study demonstrates and validates the method on empirical data from an enterprise application. We show that the technique exploits much of the resource savings to be achieved from resource sharing and is significantly more accurate at estimating per-server required capacity than a benchmark method used in practice to manage a resource pool. Finally, we explain how the problems relate to other practices regarding enterprise capacity management and software performance engineering.
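A minimal sketch of the trace-based idea, under the assumption that each workload is represented by a per-interval CPU-demand trace: aggregate the traces, size the shared pool by a high percentile of the aggregate, and compare that with the sum of per-workload peaks to see the savings from sharing. The percentile and numbers are illustrative, not Quartermaster's actual model.

```python
import numpy as np

def required_capacity(demand_traces, percentile=99.0):
    """Estimate capacity from per-interval demand traces (rows = workloads).

    Sizing to a high percentile of the aggregate demand, rather than the sum
    of per-workload peaks, captures the savings from statistical multiplexing.
    """
    aggregate = demand_traces.sum(axis=0)            # total demand per interval
    shared = np.percentile(aggregate, percentile)
    dedicated = demand_traces.max(axis=1).sum()      # one server sized per workload peak
    return shared, dedicated

# Illustrative demand traces: 20 workloads, 7 days of 5-minute CPU-share samples.
rng = np.random.default_rng(3)
traces = rng.gamma(shape=2.0, scale=0.1, size=(20, 7 * 24 * 12))

shared, dedicated = required_capacity(traces)
print(f"shared pool needs ~{shared:.1f} CPUs vs {dedicated:.1f} CPUs unshared")
```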
