An Zhu | Researchain

Publication

Featured researches published by An Zhu.

international conference on database theory | 2005

Anonymizing tables

Gagan Aggarwal; Tomás Feder; Krishnaram Kenthapadi; Rajeev Motwani; Rina Panigrahy; Dilys Thomas; An Zhu

We consider the problem of releasing tables from a relational database containing personal records, while ensuring individual privacy and maintaining data integrity to the extent possible. One of the techniques proposed in the literature is k-anonymization. A release is considered k-anonymous if the information for each person contained in the release cannot be distinguished from that of at least k−1 other persons whose information also appears in the release. In the k-Anonymity problem the objective is to minimally suppress cells in the table so as to ensure that the released version is k-anonymous. We show that the k-Anonymity problem is NP-hard even when the attribute values are ternary. On the positive side, we provide an O(k)-approximation algorithm for the problem. This improves upon the previous best-known O(k log k)-approximation. We also give improved positive results for the interesting cases with specific values of k; in particular, we give a 1.5-approximation algorithm for the special case of 2-Anonymity, and a 2-approximation algorithm for 3-Anonymity.
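
As an illustration of the definition above (not the paper's approximation algorithm), the following minimal Python sketch checks whether a released table is k-anonymous and counts how many cells were suppressed. The `*` suppression marker, the function names, and the sample table are assumptions made for this example only.

```python
from collections import Counter

SUPPRESSED = "*"  # hypothetical marker for a suppressed cell


def is_k_anonymous(rows, k):
    """Return True if every row is identical to at least k-1 other rows."""
    counts = Counter(tuple(row) for row in rows)
    return all(count >= k for count in counts.values())


def suppression_cost(original, released):
    """Count the cells that were replaced by the suppression marker."""
    return sum(
        1
        for o_row, r_row in zip(original, released)
        for o, r in zip(o_row, r_row)
        if r == SUPPRESSED and o != SUPPRESSED
    )


# Ternary attribute values, as in the hardness result mentioned above.
original = [
    ("0", "1", "2"),
    ("0", "1", "0"),
    ("1", "2", "2"),
    ("1", "0", "2"),
]
# One possible 2-anonymous release obtained by suppressing four cells.
released = [
    ("0", "1", "*"),
    ("0", "1", "*"),
    ("1", "*", "2"),
    ("1", "*", "2"),
]

print(is_k_anonymous(original, 2))
print(is_k_anonymous(released, 2))
print(suppression_cost(original, released))
```

The k-Anonymity problem asks for a release that passes this check while minimizing the suppression count; the sketch only verifies a given release and makes no attempt to find the best one.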

mobile ad hoc networking and computing | 2001

Geometric spanner for routing in mobile networks

Jie Gao; Leonidas J. Guibas; John Hershberger; Li Zhang; An Zhu

We propose a new routing graph, the restricted Delaunay graph (RDG), for mobile ad hoc networks. Combined with a node clustering algorithm, the RDG can be used as an underlying graph for geographic routing protocols. This graph has the following attractive properties: 1) it is planar; 2) between any two graph nodes there exists a path whose length, whether measured in terms of topological or Euclidean distance, is only a constant times the minimum length possible; and 3) the graph can be maintained efficiently in a distributed manner when the nodes move around. Furthermore, each node only needs constant time to make routing decisions. We show by simulation that the RDG outperforms previously proposed routing graphs in the context of the Greedy perimeter stateless routing (GPSR) protocol. Finally, we investigate theoretical bounds on the quality of paths discovered using GPSR.

symposium on computational geometry | 2001

Discrete mobile centers

Jie Gao; Leonidas J. Guibas; John Hershberger; Li Zhang; An Zhu

We propose a new randomized algorithm for maintaining a set of clusters among moving nodes in the plane. Given a specified cluster radius, our algorithm selects and maintains a variable subset of the nodes as cluster centers. This subset has the property that (1) balls of the given radius centered at the chosen nodes cover all the others and (2) the number of centers selected is a constant-factor approximation of the minimum possible. As the nodes move, an event-based kinetic data structure updates the clustering as necessary. This kinetic data structure is shown to be responsive, efficient, local, and compact. The produced cover is also smooth, in the sense that wholesale cluster re-arrangements are avoided. The algorithm can be implemented without exact knowledge of the node positions, if each node is able to sense its distance to other nodes up to the cluster radius. Such a kinetic clustering can be used in numerous applications where mobile devices must be interconnected into an ad-hoc network to collaboratively perform some task.
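
To make the covering property concrete, here is a minimal static sketch in Python. It is a simple greedy selection, not the randomized kinetic algorithm of the paper: a node becomes a center only if no already chosen center lies within the cluster radius. The function names and sample points are assumptions for illustration, and node motion is not modeled.

```python
import math


def greedy_centers(points, radius):
    """Pick centers among the points so that every point is within
    `radius` of some center (simple greedy, static setting only)."""
    centers = []
    for p in points:
        # p becomes a center only if no existing center already covers it.
        if all(math.dist(p, c) > radius for c in centers):
            centers.append(p)
    return centers


def covers(points, centers, radius):
    """Check property (1): balls around the centers cover all points."""
    return all(
        any(math.dist(p, c) <= radius for c in centers) for p in points
    )


points = [(0.0, 0.0), (0.5, 0.0), (2.0, 0.0), (2.5, 0.2), (5.0, 5.0)]
centers = greedy_centers(points, radius=1.0)
print(centers)
print(covers(points, centers, radius=1.0))
```

Because the chosen centers are pairwise more than one radius apart, the result is always a valid cover; maintaining such a cover efficiently as nodes move is the part addressed by the kinetic data structure described in the abstract.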

international colloquium on automata, languages and programming | 2004

Algorithms for Multi-product Pricing

Gagan Aggarwal; Tomás Feder; Rajeev Motwani; An Zhu

In the information age, the availability of data on consumer profiles has opened new possibilities for companies to increase their revenue via data mining techniques. One approach has been to strategically set prices of various products, taking into account the profiles of consumers. We study algorithms for the multi-product pricing problem, where, given consumer preferences among products, their budgets, and the costs of production, the goal is to set prices of multiple products from a single company, so as to maximize the overall revenue of the company. We present approximation algorithms as well as negative results for several variants of the multi-product pricing problem, modeling different purchasing patterns and market assumptions.

symposium on the theory of computing | 2001

Algorithms for minimizing weighted flow time

Chandra Chekuri; Sanjeev Khanna; An Zhu

We study the problem of minimizing weighted flow time on a single machine in the preemptive setting. We present an O(log² P)-competitive semi-online algorithm where P is the ratio of the maximum and minimum processing times of jobs in the system. In the offline setting we show that a (2+ε)-approximation is achievable in quasi-polynomial time. These are the first non-trivial results for the weighted versions of minimizing flow time. For multiple machines we show that no competitive randomized online algorithm exists for weighted flow time. We also present an improved online algorithm for minimizing total stretch (a special case of weighted flow time) on multiple machines.

Discrete and Computational Geometry | 2003

Discrete Mobile Centers

Jie Gao; Leonidas J. Guibas; John Hershberger; Li Zhang; An Zhu

We propose a new randomized algorithm for maintaining a set of clusters among moving nodes in the plane. Given a specified cluster radius, our algorithm selects and maintains a variable subset of the nodes as cluster centers. This subset has the property that (1) balls of the given radius centered at the chosen nodes cover all the others and (2) the number of centers selected is a constant-factor approximation of the minimum possible. As the nodes move, an event-based kinetic data structure updates the clustering as necessary. This kinetic data structure is shown to be responsive, efficient, local, and compact. The produced cover is also smooth, in the sense that wholesale cluster re-arrangements are avoided. This clustering algorithm is distributed in nature and can enable numerous applications in ad hoc wireless networks, where mobile devices must be interconnected to perform various tasks collaboratively.

acm symposium on parallel algorithms and architectures | 2003

The load rebalancing problem

Gagan Aggarwal; Rajeev Motwani; An Zhu

In the classical load balancing or multiprocessor scheduling problem, we are given a sequence of jobs of varying sizes and are asked to assign each job to one of the m empty processors. A typical objective is to minimize makespan, the load on the heaviest loaded processor. Since in most real world scenarios the load is a dynamic measure, the initial assignment may not remain optimal with time. Motivated by such considerations in a variety of systems, we formulate the problem of load rebalancing: given a possibly suboptimal assignment of jobs to processors, relocate a set of the jobs so as to decrease the makespan. Specifically, the goal is to achieve the best possible makespan under the constraint that no more than k jobs are relocated. We also consider a generalization of this problem where there is an arbitrary cost function associated with each job relocation. Since the problem is clearly NP-hard, we focus on approximation algorithms. We construct a sophisticated algorithm which achieves a 1.5-approximation, with near linear running time. We also show that the problem has a PTAS, resolving the complexity issue. Finally, we investigate the approximability of several extensions of the rebalancing model.
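
The problem statement above can be sketched in a few lines of Python. The heuristic below is a naive greedy written for illustration only; it is not the 1.5-approximation algorithm from the paper and carries no approximation guarantee. The function names and the sample assignment are assumptions.

```python
def makespan(assignment):
    """Load on the heaviest loaded processor."""
    return max((sum(jobs) for jobs in assignment), default=0)


def greedy_rebalance(assignment, k):
    """Relocate at most k jobs, each time moving one job from the most
    loaded processor to the least loaded one if that lowers the makespan."""
    procs = [list(jobs) for jobs in assignment]
    moves = 0
    if not procs:
        return procs, moves
    while moves < k:
        loads = [sum(p) for p in procs]
        src = max(range(len(procs)), key=loads.__getitem__)
        dst = min(range(len(procs)), key=loads.__getitem__)
        if src == dst or not procs[src]:
            break
        # Pick the job whose move best balances the two processors.
        job = min(
            procs[src],
            key=lambda j: max(loads[src] - j, loads[dst] + j),
        )
        if max(loads[src] - job, loads[dst] + job) >= loads[src]:
            break  # no single move from src improves the makespan
        procs[src].remove(job)
        procs[dst].append(job)
        moves += 1
    return procs, moves


initial = [[5, 4, 3], [1], []]
print(makespan(initial))
one_move, _ = greedy_rebalance(initial, k=1)
print(makespan(one_move))
balanced, used = greedy_rebalance(initial, k=3)
print(makespan(balanced), used)
```

The budget k caps the number of relocations; the heuristic stops early once no single move off the heaviest processor helps.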

Computer Networks | 2004

Modeling correlations in web traces and implications for designing replacement policies

Konstantinos Psounis; An Zhu; Balaji Prabhakar; Rajeev Motwani

A number of web cache-related algorithms, such as replacement and prefetching policies, rely on specific characteristics present in the sequence of requests for efficient performance. Further, there is an increasing need to synthetically generate long traces of web requests for studying the performance of algorithms and systems related to the web. These reasons motivate us to obtain a simple and accurate model of web request traces. Our Markovian model precisely captures the degrees to which temporal correlations and document popularity influence web trace requests. We describe a mathematical procedure to extract the model parameters from real traces and generate synthetic traces using these parameters. This procedure is verified by standard statistical analysis. We also validate the model by comparing the hit ratios for real traces and their synthetic counterparts under various caching algorithms. As an important by-product, the model provides guidelines for designing efficient replacement algorithms. We obtain optimal algorithms given the parameters of the model. We also introduce a spectrum of practicable, high-performance algorithms that adapt to the degree of temporal correlation present in the request sequence, and discuss related implementation concerns.

foundations of computer science | 2003

Switch scheduling via randomized edge coloring

Gagan Aggarwal; Rajeev Motwani; Devavrat Shah; An Zhu

The essence of an Internet router is an n × n switch which routes packets from input to output ports. Such a switch can be viewed as a bipartite graph with the input and output ports as the two vertex sets. Packets arriving at input port i and destined for output port j can be modeled as an edge from i to j. Current switch scheduling algorithms view the routing of packets at each time step as a selection of a bipartite matching. We take the view that the switch scheduling problem across a sequence of time-steps is an instance of the edge coloring problem for a bipartite multigraph. Implementation considerations lead us to seek edge coloring algorithms for bipartite multigraphs that are fast, decentralized, and online. We present a randomized algorithm which has the desired properties, and uses only a near-optimal Δ + o(Δ) colors on dense bipartite graphs arising in the context of switch scheduling. This algorithm extends to non-bipartite graphs as well. It leads to a novel switch scheduling algorithm which, for stochastic online edge arrivals, is stable, i.e. the queue length at each input port is bounded at all times. We note that this is the first decentralized switch scheduling algorithm that is also guaranteed to be stable.
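
The modeling step described above, packets as edges and time steps as colors, can be illustrated with a short Python sketch. It uses a plain first-fit greedy coloring, which is centralized and may need up to 2Δ − 1 colors; it is not the randomized decentralized algorithm of the paper. The function name and the sample packet list are assumptions.

```python
def greedy_edge_coloring(edges):
    """Assign each packet (input_port, output_port) the smallest time step
    (color) not yet used at its input port or at its output port."""
    used_at_input = {}
    used_at_output = {}
    colors = []
    for i, j in edges:
        at_in = used_at_input.setdefault(i, set())
        at_out = used_at_output.setdefault(j, set())
        color = 0
        while color in at_in or color in at_out:
            color += 1
        at_in.add(color)
        at_out.add(color)
        colors.append(color)
    return colors


# Packets as (input port, output port); repeated pairs make a multigraph.
packets = [(0, 0), (0, 1), (1, 0), (0, 0)]
schedule = greedy_edge_coloring(packets)
print(schedule)
```

Packets that share a color form a matching, so they can all cross the switch in the same time step without any port conflict. In this example the maximum port degree Δ is 3 and the sketch happens to use exactly 3 colors.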

Theoretical Computer Science | 2004

Combining request scheduling with web caching

Tomás Feder; Rajeev Motwani; Rina Panigrahy; Steven S. Seiden; Rob van Stee; An Zhu

We extend the classic paging model by allowing reordering of requests under the constraint that a request is delayed by no longer than a predetermined number of time steps. We first give a dynamic programming algorithm to solve the offline case. Then we give tight bounds on competitive ratios for the online case. For caches of size k, we obtain bounds of k + O(1) for deterministic algorithms and Θ(log k) for randomized algorithms. We also give bounds for the case where either the online or the offline algorithm can reorder the requests, but not both. Finally, we extend our analysis to the case where pages have different sizes.
