Yinglian Xie | Researchain

Publication

Featured researches published by Yinglian Xie.

international conference on computer communications | 2002

Locality in search engine queries and its implications for caching

Yinglian Xie; David R. O'Hallaron

Caching is a popular technique for reducing both server load and user response time in distributed systems. We consider the question of whether caching might be effective for search engines as well. We study two real search engine traces by examining query locality and its implications for caching. Our trace analysis produced three results. The first result shows that queries have significant locality, with query frequency following a Zipf distribution. Very popular queries are shared among different users and can be cached at servers or proxies, while 16% to 22% of the queries are from the same users and should be cached at the user side. Multiple-word queries are shared less and should be cached mainly at the user side. The second result shows that if caching is to be done at the user side, short-term caching for hours is enough to cover query temporal locality, while server/proxy caching should use longer periods, such as days. The third result shows that most users have small lexicons when submitting queries. Frequent users who submit many search requests tend to reuse a small subset of words to form queries. Thus, with proxy or user-side caching, prefetching based on the user lexicon looks promising.
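
As a rough illustration of the kind of locality measurement this abstract describes, the sketch below ranks queries by frequency and splits repeated queries into those repeated by the same user and those first issued by a different user. The trace format (a list of `(user, query)` pairs), the function name `locality_stats`, and the sample data are assumptions made for illustration; this is not the paper's analysis code.

```python
from collections import Counter

def locality_stats(trace):
    """trace: list of (user, query) pairs in submission order (hypothetical format)."""
    freq = Counter(query for _, query in trace)
    seen_by_user = set()  # (user, query) pairs already observed
    seen_any = set()      # queries already observed from any user
    same_user = shared = 0
    for user, query in trace:
        if (user, query) in seen_by_user:
            same_user += 1   # repeat by the same user: candidate for user-side caching
        elif query in seen_any:
            shared += 1      # repeat across users: candidate for server/proxy caching
        seen_by_user.add((user, query))
        seen_any.add(query)
    n = len(trace)
    return {
        "ranked": freq.most_common(),  # rank/frequency pairs, e.g. for a Zipf plot
        "same_user_repeat": same_user / n if n else 0.0,
        "shared_repeat": shared / n if n else 0.0,
    }

# Made-up sample trace, for illustration only.
trace = [("u1", "weather"), ("u2", "weather"), ("u1", "weather"),
         ("u3", "maps"), ("u1", "news")]
stats = locality_stats(trace)
```

Plotting the `ranked` frequencies against rank on log-log axes is one common way to check whether a trace looks Zipf-like.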

acm special interest group on data communication | 2001

Early measurements of a cluster-based architecture for P2P systems

Balachander Krishnamurthy; Jia Wang; Yinglian Xie

Peer-to-peer applications such as Napster [4], Freenet [1], and Gnutella [2], [7] have gained much attention recently. These applications are mainly designed and used for large-scale sharing of MP3 files. In such systems, end-hosts self-organize into an overlay network and share content with each other. Compared to the traditional client-server model, files are served in a distributed manner and replicated across the network on demand. Since hosts participating in peer-to-peer (P2P) networks also devote some computing resources, such systems scale with the number of hosts in terms of hardware, bandwidth, and disk space. With the wide deployment of P2P applications, P2P traffic is becoming a growing portion of Internet traffic. There has been very little examination of P2P traffic patterns and how they differ from traditional service models. Studying and understanding P2P traffic is thus important to provide efficient application-level content location and routing within the network. The existing applications each use their own approach to content location and routing, and none of them is scalable. Napster uses a centralized server to locate content, while Gnutella clients broadcast queries to all their neighbors. [8] discusses the query locality observed in Gnutella traces and suggests caching as a short-term approach to increase Gnutella's scalability. Recent designs such as CAN [5], Chord [9], Pastry [6], and Tapestry [10] propose distributed indexing schemes based on hashing to locate content. These systems assume a flat content delivery mesh. Each object's location is stored at one or more nodes selected deterministically by a uniform hash function; queries for the object will be routed incrementally to the node. Although hash functions can help locate content deterministically, they lack the flexibility of keyword searching, a useful operation for finding content without prior knowledge of exact object names.
There is no real deployment at present and thus no measurement information is available for understanding the usability and scalability of

ieee symposium on security and privacy | 2005

Worm origin identification using random moonwalks

Yinglian Xie; Vyas Sekar; David A. Maltz; Michael K. Reiter; Hui Zhang

We propose a novel technique that can determine both the host responsible for originating a propagating worm attack and the set of attack flows that make up the initial stages of the attack tree via which the worm infected successive generations of victims. We argue that knowledge of both is important for combating worms: knowledge of the origin supports law enforcement, and knowledge of the causal flows that advance the attack supports diagnosis of how network defenses were breached. Our technique exploits the wide tree shape of a worm propagation emanating from the source by performing random moonwalks backward in time along paths of flows. Correlating the repeated walks reveals the initial causal flows, thereby aiding in identifying the source. Using analysis, simulation, and experiments with real-world traces, we show how the technique works against both today's fast-propagating worms and stealthy worms that attempt to hide their attack flows among background traffic.
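
The backward random walk described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the flow record format `(src, dst, time)`, the function name `random_moonwalks`, the parameters `max_hops` and `delta_t`, and the toy flow set are all hypothetical. Each walk starts at a randomly chosen flow and repeatedly steps to a flow that arrived at the current flow's source host shortly beforehand; flows near the root of a wide propagation tree are traversed most often.

```python
import random
from collections import Counter

def random_moonwalks(flows, num_walks, max_hops, delta_t, seed=0):
    """flows: list of (src, dst, time) tuples (hypothetical record format).
    Returns a Counter of how many times each flow was traversed."""
    rng = random.Random(seed)
    # Index flows by destination so predecessors of a host are easy to find.
    incoming = {}
    for flow in flows:
        incoming.setdefault(flow[1], []).append(flow)
    counts = Counter()
    for _ in range(num_walks):
        current = rng.choice(flows)
        counts[current] += 1
        for _hop in range(max_hops - 1):
            src, _dst, t = current
            # Candidate predecessors: flows that reached `src` within delta_t before t.
            candidates = [f for f in incoming.get(src, [])
                          if t - delta_t <= f[2] < t]
            if not candidates:
                break
            current = rng.choice(candidates)
            counts[current] += 1
    return counts

# Toy propagation tree rooted at host A (made-up data).
root = ("A", "B", 1)
flows = [root, ("B", "C", 2), ("B", "D", 2),
         ("C", "E", 3), ("C", "F", 3), ("D", "G", 3), ("D", "H", 3)]
counts = random_moonwalks(flows, num_walks=200, max_hops=5, delta_t=1)
```

In this toy tree every walk ends at the root flow, so it accumulates the highest count. Real traces also contain background traffic, which this sketch does not model.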

dependable systems and networks | 2006

A Multi-Resolution Approach for Worm Detection and Containment

Vyas Sekar; Yinglian Xie; Michael K. Reiter; Hui Zhang

Despite the proliferation of detection and containment techniques in the worm defense literature, simple threshold-based methods remain the most widely deployed and most popular approach among practitioners. This popularity arises from their simple appeal, ease of use, and independence from attack-specific properties such as scanning strategies and signatures. However, such approaches have known limitations: they either fail to detect low-rate attacks or incur very high false positive rates. We propose a multi-resolution approach to enhance the power of threshold-based detection and rate-limiting techniques. Using such an approach we can not only detect fast attacks with low latency, but also discover low-rate attacks, several orders of magnitude less aggressive than today's fast-propagating attacks, with low false positive rates. We also outline a multi-resolution rate-limiting mechanism for throttling the number of new connections a host can make, to contain the spread of worms. Our trace analysis and simulation experiments demonstrate the benefits of a multi-resolution approach for worm defense.
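
One way to picture threshold-based detection at several time resolutions is the sketch below, which counts the distinct destinations a host contacted within each of several windows and compares each count to a per-window threshold. The log format, the function name `multi_resolution_alarm`, and the window and threshold values are assumptions chosen for illustration; the abstract does not specify them.

```python
def multi_resolution_alarm(conn_log, now, windows):
    """conn_log: list of (time, destination) pairs for one host (hypothetical format).
    windows: list of (window_seconds, max_distinct_destinations) pairs.
    Returns the window sizes whose threshold is exceeded at time `now`."""
    exceeded = []
    for window, threshold in windows:
        distinct = {dst for t, dst in conn_log if now - window < t <= now}
        if len(distinct) > threshold:
            exceeded.append(window)
    return exceeded

# Hypothetical thresholds that grow more slowly than the window size,
# so that a slow scanner eventually trips the longest window.
windows = [(10, 5), (100, 20), (1000, 40)]

# Slow scanner: one new destination every 20 seconds (made-up data).
slow = [(20 * i, f"10.0.0.{i}") for i in range(1, 51)]
# Fast scanner: ten new destinations within the last five seconds (made-up data).
fast = [(1000 - 0.5 * i, f"10.0.1.{i}") for i in range(10)]

slow_alarm = multi_resolution_alarm(slow, 1000, windows)
fast_alarm = multi_resolution_alarm(fast, 1000, windows)
```

Here the fast scanner exceeds only the short-window threshold and the slow scanner exceeds only the long-window threshold, which is the intuition behind checking several resolutions rather than a single one.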

recent advances in intrusion detection | 2004

Seurat: A Pointillist Approach to Anomaly Detection

Yinglian Xie; Hyang-Ah Kim; David R. O'Hallaron; Michael K. Reiter; Hui Zhang

This paper proposes a new approach to detecting aggregated anomalous events by correlating host file system changes across space and time. Our approach is based on a key observation that many host state transitions of interest have both temporal and spatial locality. Abnormal state changes, which may be hard to detect in isolation, become apparent when they are correlated with similar changes on other hosts. Based on this intuition, we have developed a method to detect similar, coincident changes to the patterns of file updates that are shared across multiple hosts. We have implemented this approach in a prototype system called Seurat and demonstrated its effectiveness using a combination of real workstation cluster traces, simulated attacks, and a manually launched Linux worm.

international conference on network protocols | 2006

Forensic Analysis for Epidemic Attacks in Federated Networks

Yinglian Xie; Vyas Sekar; Michael K. Reiter; Hui Zhang

We present the design of a Network Forensic Alliance (NFA) to allow multiple administrative domains (ADs) to jointly locate the origin of epidemic-spreading attacks. ADs in the NFA collaborate in a distributed protocol for post-mortem analysis of worm-like attacks. Information exchange between any two participating ADs is limited to traffic records that are known to both sides, maintaining the privacy of participants. Such an architecture is incentive-compatible: participants benefit by gaining better local investigative capabilities, even with partial deployment. Further, we show that by sharing local investigation results, ADs can achieve global investigative capabilities that are comparable to a centralized implementation with access to global traffic records. Our evaluation demonstrates that it is feasible for large-scale attack investigation to be incrementally deployed in an Internet-like federation.

annual computer security applications conference | 2006

Protecting Privacy in Key-Value Search Systems

Yinglian Xie; Michael K. Reiter; David R. O'Hallaron

This paper investigates the general problem of efficiently performing key-value search at untrusted servers without loss of user privacy. Given key-value pairs from multiple owners that are stored across untrusted servers, how can a client efficiently search these pairs such that no server, on its own, can reconstruct the key-value pairs? We propose a system, called Peekaboo, that is applicable and practical to any type of key-value search while protecting both data owner privacy and client privacy. The main idea is to separate the key-value pairs across different servers. Supported by access control and user authentication, Peekaboo allows search to be performed by only authorized clients without reducing the level of user privacy.

acm special interest group on data communication | 2006

Virtual disk based centralized management for enterprise networks

Yuezhi Zhou; Yaoxue Zhang; Yinglian Xie

The rapid advances in hardware, software, and networks have made the management of enterprise network systems an increasingly challenging task. Due to the tight coupling between hardware, software, and data, every one of the hundreds or thousands of PCs that are connected in an enterprise environment has to be administered individually, leading to high Total Cost of Ownership (TCO). We argue that centralized management with distributed, diskless clients and centralized repositories of all software and data can reduce management complexity through reduced software maintenance time, improved system availability, and enhanced security. We instantiate such a paradigm with a diskless, thick-client-based system that supports heterogeneous OSes including Windows, the dominant commodity OS in the current market. The prototype requires little or no OS modification and no application modification. Our initial deployment and experiment results demonstrate that our approach is a feasible and efficient solution for managing enterprise network systems.

high performance distributed computing | 2002

A secure distributed search system

Yinglian Xie; David R. O'Hallaron; Michael K. Reiter

This paper presents the design, implementation, and evaluation of Mingle, a secure distributed search system. Each participating host runs a Mingle server, which maintains an inverted index of the local file system. Users initiate peer-to-peer keyword searches by typing keywords into lightweight Mingle clients. Central to Mingle are its access control mechanisms and its insistence on user convenience. For access control, we introduce the idea of access-right mapping, which provides a convenient way for file owners to specify access permissions. Access control is supported through a single sign-on mechanism that allows users to conveniently establish their identity to Mingle servers, such that subsequent authentication occurs automatically, with minimal manual involvement. Preliminary performance evaluation suggests that Mingle is both feasible and scalable.

Archive | 2004