Geoffrey M. Voelker | Researchain

Publication

Featured research published by Geoffrey M. Voelker.

ACM Transactions on Computer Systems | 2006

Inferring Internet denial-of-service activity

David Moore; Colleen Shannon; Douglas J. Brown; Geoffrey M. Voelker; Stefan Savage

In this article, we seek to address a simple question: “How prevalent are denial-of-service attacks in the Internet?” Our motivation is to quantitatively understand the nature of the current threat as well as to enable longer-term analyses of trends and recurring patterns of attacks. We present a new technique, called “backscatter analysis,” that provides a conservative estimate of worldwide denial-of-service activity. We use this approach on 22 traces (each covering a week or more) gathered over three years from 2001 through 2004. Across this corpus we quantitatively assess the number, duration, and focus of attacks, and qualitatively characterize their behavior. In total, we observed over 68,000 attacks directed at over 34,000 distinct victim IP addresses, ranging from well-known e-commerce companies such as Amazon and Hotmail to small foreign ISPs and dial-up connections. We believe our technique is the first to provide quantitative estimates of Internet-wide denial-of-service activity and that this article describes the most comprehensive public measurements of such activity to date.

Measurement and Modeling of Computer Systems | 2002

Characterizing user behavior and network performance in a public wireless LAN

Anand Balachandran; Geoffrey M. Voelker; Paramvir Bahl; P. Venkat Rangan

This paper presents and analyzes user behavior and network performance in a public-area wireless network using a workload captured at a well-attended ACM conference. The goals of our study are: (1) to extend our understanding of wireless user behavior and wireless network performance; (2) to characterize wireless users in terms of a parameterized model for use with analytic and simulation studies involving wireless LAN traffic; and (3) to apply our workload analysis results to issues in wireless network deployment, such as capacity planning, and potential network optimizations, such as algorithms for load balancing across multiple access points (APs) in a wireless network.

Knowledge Discovery and Data Mining | 2009

Beyond blacklists: learning to detect malicious web sites from suspicious URLs

Justin Ma; Lawrence K. Saul; Stefan Savage; Geoffrey M. Voelker

Malicious Web sites are a cornerstone of Internet criminal activities. As a result, there has been broad interest in developing systems to prevent the end user from visiting such sites. In this paper, we describe an approach to this problem based on automated URL classification, using statistical methods to discover the tell-tale lexical and host-based properties of malicious Web site URLs. These methods are able to learn highly predictive models by extracting and automatically analyzing tens of thousands of features potentially indicative of suspicious URLs. The resulting classifiers obtain 95-99% accuracy, detecting large numbers of malicious Web sites from their URLs, with only modest false positives.
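As a rough illustration of the lexical side of this approach, the sketch below turns a URL into a sparse bag-of-words feature dictionary. The function name `lexical_features`, the feature-name prefixes, and the choice of delimiters are assumptions made for this example and are not taken from the paper; host-based features (which would need DNS or WHOIS lookups) are left out.

```python
import re
from urllib.parse import urlsplit


def lexical_features(url):
    """Map a URL to a sparse feature dict (hypothetical feature names)."""
    # urlsplit only fills in the hostname when a scheme is present.
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = parts.hostname or ""
    path = parts.path or ""

    features = {
        "len:url": float(len(url)),
        "len:host": float(len(host)),
        "count:dots": float(url.count(".")),
    }
    # Bag-of-words tokens, with hostname and path kept in separate namespaces
    # so that "login" in the host and "login" in the path are distinct features.
    for token in filter(None, host.split(".")):
        features["host:" + token] = 1.0
    for token in filter(None, re.split(r"[/?.=\-_&]", path)):
        features["path:" + token] = 1.0
    return features


if __name__ == "__main__":
    for name, value in sorted(lexical_features("http://login.example.com/secure/update.php").items()):
        print(name, value)
```

A dictionary like this can be fed to any linear classifier that accepts sparse input; with many distinct tokens across a large URL corpus, the feature space quickly grows to the tens of thousands of dimensions the abstract mentions.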

Mobile Computing and Communications Review | 2003

Access and mobility of wireless PDA users

Marvin McNett; Geoffrey M. Voelker

In this paper, we analyze the mobility patterns of users of wireless hand-held PDAs in a campus wireless network using an eleven-week trace of wireless network activity. Our study has two goals. First, we characterize the high-level mobility and access patterns of hand-held PDA users and compare these characteristics to previous workload mobility studies focused on laptop users. Second, we develop two wireless network topology models for use in wireless mobility studies: an evolutionary topology model based on user proximity and a campus waypoint model that serves as a trace-based complement to the random waypoint model. We use our evolutionary topology model as a case study for preliminary evaluation of three ad hoc routing algorithms on the network topologies created by the access and mobility patterns of users of modern wireless PDAs. Based upon the mobility characteristics of our trace-based campus waypoint model, we find that commonly parameterized synthetic mobility models have overly aggressive mobility characteristics for scenarios where user movement is limited to walking. Mobility characteristics based on realistic models can have significant implications for evaluating systems designed for mobility. When evaluated using our evolutionary topology model, for example, popular ad hoc routing protocols were very successful at adapting to user mobility, and user mobility was not a key factor in their performance.

Communications of the ACM | 2010

Difference engine: harnessing memory redundancy in virtual machines

Diwaker Gupta; Sangmin Lee; Michael Vrable; Stefan Savage; Alex C. Snoeren; George Varghese; Geoffrey M. Voelker; Amin Vahdat

Virtual machine monitors (VMMs) are a popular platform for Internet hosting centers and cloud-based compute services. By multiplexing hardware resources among virtual machines (VMs) running commodity operating systems, VMMs decrease both the capital outlay and management overhead of hosting centers. Appropriate placement and migration policies can take advantage of statistical multiplexing to effectively utilize available processors. However, main memory is not amenable to such multiplexing and is often the primary bottleneck in achieving higher degrees of consolidation. Previous efforts have shown that content-based page sharing provides modest decreases in the memory footprint of VMs running similar operating systems and applications. Our studies show that significant additional gains can be had by leveraging both subpage-level sharing (through page patching) and in-core memory compression. We build Difference Engine, an extension to the Xen VMM, to support each of these, in addition to standard copy-on-write full-page sharing, and demonstrate substantial savings across VMs running disparate workloads (up to 65%). In head-to-head memory-savings comparisons, Difference Engine outperforms VMware ESX server by a factor of 1.6 to 2.5 for heterogeneous workloads. In all cases, the performance overhead of Difference Engine is less than 7%.
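To make the memory-saving mechanisms concrete, here is a minimal user-space sketch of two of them: content-based sharing of identical pages and compression of the remaining unique pages. It is only an illustration under stated assumptions and not the Difference Engine implementation: the 4096-byte page size, the use of SHA-256 to detect identical pages, the use of `zlib` as the compressor, and the name `memory_footprint` are all choices made for this example, and subpage patching is not modeled.

```python
import hashlib
import zlib

PAGE_SIZE = 4096  # assumed page size in bytes


def memory_footprint(pages):
    """Return (original, after_sharing, after_sharing_and_compression) in bytes."""
    # Full-page sharing: keep one copy of each distinct page content.
    unique = {}
    for page in pages:
        unique.setdefault(hashlib.sha256(page).digest(), page)

    original = sum(len(page) for page in pages)
    shared = sum(len(page) for page in unique.values())
    # Compression: store a page compressed only when that is actually smaller.
    compressed = sum(min(len(page), len(zlib.compress(page))) for page in unique.values())
    return original, shared, compressed


if __name__ == "__main__":
    zero_page = b"\x00" * PAGE_SIZE
    pattern_page = bytes(range(256)) * (PAGE_SIZE // 256)
    print(memory_footprint([zero_page, zero_page, zero_page, pattern_page]))
```

A real VMM would also have to mark shared pages copy-on-write and decide which pages are cold enough to compress; those policy questions are outside the scope of this sketch.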

Communications of the ACM | 2011

Sora: high-performance software radio using general-purpose multi-core processors

Kun Tan; He Liu; Jiansong Zhang; Yongguang Zhang; Ji Fang; Geoffrey M. Voelker

This paper presents Sora, a fully programmable software radio platform on commodity PC architectures. Sora combines the performance and fidelity of hardware SDR platforms with the programmability and flexibility of general-purpose processor (GPP) SDR platforms. Sora uses both hardware and software techniques to address the challenges of using PC architectures for high-speed SDR. The Sora hardware components consist of a radio front-end for reception and transmission, and a radio control board for high-throughput, low-latency data transfer between radio and host memories. Sora makes extensive use of features of contemporary processor architectures to accelerate wireless protocol processing and satisfy protocol timing requirements, including using dedicated CPU cores, large low-latency caches to store lookup tables, and SIMD processor extensions for highly efficient physical layer processing on GPPs. Using the Sora platform, we have developed a demonstration radio system called SoftWiFi. SoftWiFi seamlessly interoperates with commercial 802.11a/b/g NICs, and achieves performance equivalent to commercial NICs at each modulation.

International Conference on Machine Learning | 2009

Identifying suspicious URLs: an application of large-scale online learning

Justin Ma; Lawrence K. Saul; Stefan Savage; Geoffrey M. Voelker

This paper explores online learning approaches for detecting malicious Web sites (those involved in criminal scams) using lexical and host-based features of the associated URLs. We show that this application is particularly appropriate for online algorithms as the size of the training data is larger than can be efficiently processed in batch and because the distribution of features that typify malicious URLs is changing continuously. Using a real-time system we developed for gathering URL features, combined with a real-time source of labeled URLs from a large Web mail provider, we demonstrate that recently developed online algorithms can be as accurate as batch techniques, achieving classification accuracies up to 99% over a balanced data set.
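The appeal of online learning here is that each labeled URL is processed once, as it arrives, and the model is adjusted only when it makes a mistake. The sketch below shows that pattern with a plain perceptron over sparse feature dictionaries. The perceptron is chosen only because it is the simplest online learner; the abstract does not name the algorithms evaluated, and the class name `OnlinePerceptron` and the feature names in the example stream are hypothetical.

```python
from collections import defaultdict


class OnlinePerceptron:
    """Mistake-driven linear classifier over sparse features (illustrative only)."""

    def __init__(self):
        self.weights = defaultdict(float)
        self.bias = 0.0

    def score(self, features):
        # .get avoids inserting unseen feature names into the weight table.
        return self.bias + sum(self.weights.get(name, 0.0) * value
                               for name, value in features.items())

    def predict(self, features):
        return 1 if self.score(features) > 0 else -1

    def update(self, features, label):
        """Process one example (label +1 = malicious, -1 = benign).

        Returns True if the model made a mistake and was updated.
        """
        if self.predict(features) == label:
            return False
        for name, value in features.items():
            self.weights[name] += label * value
        self.bias += label
        return True


if __name__ == "__main__":
    stream = [
        ({"host:paypal-verify": 1.0, "path:login": 1.0}, 1),
        ({"host:example": 1.0, "path:index": 1.0}, -1),
    ]
    model = OnlinePerceptron()
    mistakes = sum(model.update(features, label) for features, label in stream)
    print("mistakes on first pass:", mistakes)
```

Because new feature names are added to the weight table only when they first appear in a mistaken example, the model can keep absorbing previously unseen tokens as the URL stream drifts, which a fixed-vocabulary batch model cannot do without retraining.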

ACM Special Interest Group on Data Communication | 2006

Jigsaw: solving the puzzle of enterprise 802.11 analysis

Yu-Chung Cheng; John Bellardo; Péter Benkö; Alex C. Snoeren; Geoffrey M. Voelker; Stefan Savage

The combination of unlicensed spectrum, cheap wireless interfaces and the inherent convenience of untethered computing has made 802.11-based networks ubiquitous in the enterprise. Modern universities, corporate campuses and government offices routinely deploy scores of access points to blanket their sites with wireless Internet access. However, while the fine-grained behavior of the 802.11 protocol itself has been well studied, our understanding of how large 802.11 networks behave in their full empirical complexity is surprisingly limited. In this paper, we present a system called Jigsaw that uses multiple monitors to provide a single unified view of all physical, link, network and transport-layer activity on an 802.11 network. To drive this analysis, we have deployed an infrastructure of over 150 radio monitors that simultaneously capture all 802.11b and 802.11g activity in a large university building (1M+ cubic feet). We describe the challenges posed by both the scale and ambiguity inherent in such an architecture, and explain the algorithms and inference techniques we developed to address them. Finally, using a 24-hour distributed trace containing more than 1.5 billion events, we use Jigsaw's global cross-layer viewpoint to isolate performance artifacts, both explicit, such as management inefficiencies, and implicit, such as co-channel interference. We believe this is the first analysis combining this scale and level of detail for a production 802.11 network.

Symposium on Operating Systems Principles | 2005

Scalability, fidelity, and containment in the Potemkin virtual honeyfarm

Michael Vrable; Justin Ma; Jay Chen; David Moore; Erik Vandekieft; Alex C. Snoeren; Geoffrey M. Voelker; Stefan Savage

The rapid evolution of large-scale worms, viruses and botnets has made Internet malware a pressing concern. Such infections are at the root of modern scourges including DDoS extortion, on-line identity theft, spam, phishing, and piracy. However, the most widely used tools for gathering intelligence on new malware, network honeypots, have forced investigators to choose between monitoring activity at a large scale or capturing behavior with high fidelity. In this paper, we describe an approach to minimize this tension and improve honeypot scalability by up to six orders of magnitude while still closely emulating the execution behavior of individual Internet hosts. We have built a prototype honeyfarm system, called Potemkin, that exploits virtual machines, aggressive memory sharing, and late binding of resources to achieve this goal. While still an immature implementation, Potemkin has emulated over 64,000 Internet honeypots in live test runs, using only a handful of physical servers.

Internet Measurement Conference | 2013

A fistful of bitcoins: characterizing payments among men with no names

Sarah Meiklejohn; Marjori Pomarole; Grant Jordan; Kirill Levchenko; Damon McCoy; Geoffrey M. Voelker; Stefan Savage

Bitcoin is a purely online virtual currency, unbacked by either physical commodities or sovereign obligation; instead, it relies on a combination of cryptographic protection and a peer-to-peer protocol for witnessing settlements. Consequently, Bitcoin has the unintuitive property that while the ownership of money is implicitly anonymous, its flow is globally visible. In this paper we explore this unique characteristic further, using heuristic clustering to group Bitcoin wallets based on evidence of shared authority, and then using re-identification attacks (i.e., empirical purchasing of goods and services) to classify the operators of those clusters. From this analysis, we characterize longitudinal changes in the Bitcoin market, the stresses these changes are placing on the system, and the challenges for those seeking to use Bitcoin for criminal or fraudulent purposes at scale.
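One widely used form of "evidence of shared authority" is the shared-input heuristic: addresses that appear together as inputs to a single transaction are assumed to be controlled by the same party. The abstract does not spell out which heuristics the authors use, so the sketch below should be read as an illustration of that general idea only. It groups addresses with a union-find structure; the function name `cluster_addresses` and the representation of a transaction as a plain list of input addresses are assumptions for this example.

```python
def cluster_addresses(transactions):
    """Group addresses that are ever spent together as inputs.

    `transactions` is an iterable of lists of input addresses (assumed format).
    Returns a sorted list of sorted address lists, one per cluster.
    """
    parent = {}

    def find(address):
        parent.setdefault(address, address)
        while parent[address] != address:
            parent[address] = parent[parent[address]]  # path halving
            address = parent[address]
        return address

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for inputs in transactions:
        if not inputs:
            continue
        first = inputs[0]
        find(first)  # register single-input transactions too
        for other in inputs[1:]:
            union(first, other)

    clusters = {}
    for address in list(parent):
        clusters.setdefault(find(address), set()).add(address)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    print(cluster_addresses([["A", "B"], ["B", "C"], ["D"], ["E", "F"]]))
```

Clusters built this way are transitive: A and C end up together even though they never co-occur, because each was spent alongside B. Labeling a single address in a cluster, for instance through a purchase, then labels every address in it.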
