Publication


Featured research published by Joel L. Wolf.


international conference on management of data | 1999

Fast algorithms for projected clustering

Charu C. Aggarwal; Joel L. Wolf; Philip S. Yu; Cecilia M. Procopiuc; Jong Soo Park

The clustering problem is well known in the database literature for its numerous applications in problems such as customer segmentation, classification and trend analysis. Unfortunately, all known algorithms tend to break down in high dimensional spaces because of the inherent sparsity of the points. In such high dimensional spaces not all dimensions may be relevant to a given cluster. One way of handling this is to pick the closely correlated dimensions and find clusters in the corresponding subspace. Traditional feature selection algorithms attempt to achieve this. The weakness of this approach is that in typical high dimensional data mining applications different sets of points may cluster better for different subsets of dimensions. The number of dimensions in each such cluster-specific subspace may also vary. Hence, it may be impossible to find a single small subset of dimensions for all the clusters. We therefore discuss a generalization of the clustering problem, referred to as the projected clustering problem, in which the subsets of dimensions selected are specific to the clusters themselves. We develop an algorithmic framework for solving the projected clustering problem, and test its performance on synthetic data.
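
The heart of the framework, selecting a per-cluster subset of dimensions and then measuring distance only in that subspace, can be sketched in a few lines. This is a minimal illustration with hypothetical helper names, not the paper's actual algorithm:

    def pick_dimensions(points, medoid, k_dims):
        # Rank dimensions by the average absolute deviation of the
        # cluster's points from its medoid; keep the k_dims tightest,
        # which define this cluster's own subspace.
        n_dims = len(medoid)
        spread = [sum(abs(p[d] - medoid[d]) for p in points) / len(points)
                  for d in range(n_dims)]
        return sorted(range(n_dims), key=spread.__getitem__)[:k_dims]

    def assign(points, medoids, dims_per_cluster):
        # Assign each point to the medoid nearest in that cluster's own
        # subspace (Manhattan distance averaged over its dimensions).
        labels = []
        for p in points:
            costs = [sum(abs(p[d] - m[d]) for d in dims) / len(dims)
                     for m, dims in zip(medoids, dims_per_cluster)]
            labels.append(min(range(len(costs)), key=costs.__getitem__))
        return labels

    points = [(1.0, 9.0, 2.0), (1.1, 0.5, 2.1), (5.0, 0.4, 7.0)]
    medoids = [(1.0, 5.0, 2.0), (5.0, 0.5, 7.0)]
    dims = [pick_dimensions(points, m, k_dims=2) for m in medoids]
    print(dims, assign(points, medoids, dims))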


knowledge discovery and data mining | 1999

Horting hatches an egg: a new graph-theoretic approach to collaborative filtering

Charu C. Aggarwal; Joel L. Wolf; Kun-Lung Wu; Philip S. Yu

This paper introduces a novel approach to rating-based collaborative filtering. The new technique is most appropriate for e-commerce merchants offering one or more groups of relatively homogeneous items such as compact disks, videos, books, software and the like. In contrast with other known collaborative filtering techniques, the new algorithm is graph-theoretic, based on the twin new concepts of horting and predictability. As is demonstrated in this paper, the technique is fast, scalable, accurate, and requires only a modest learning curve. It makes use of a hierarchical classification scheme in order to introduce context into the rating process, and uses so-called creative links in order to find surprising and atypical items to recommend, perhaps even items which cross the group boundaries. The new technique is one of the key engines of the Intelligent Recommendation Algorithm (IRA) project, now being developed at IBM Research. In addition to several other recommendation engines, IRA contains a situation analyzer to determine the most appropriate mix of engines for a particular e-commerce merchant, as well as an engine for optimizing the placement of advertisements.
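
A toy sketch of the two graph concepts follows. The threshold and data are illustrative, and the constant-shift prediction is a simplified stand-in for the paper's family of linear rating transformations and its pair of overlap conditions:

    # Toy ratings: user -> {item: rating}.
    ratings = {
        "alice": {"cd1": 5, "cd2": 3, "cd3": 4},
        "bob":   {"cd1": 4, "cd2": 2, "cd3": 3, "cd4": 5},
    }

    def horts(u, v, min_overlap=3):
        # u horts v if they rated enough items in common; such pairs
        # become edges of the directed graph the algorithm searches.
        return len(set(ratings[u]) & set(ratings[v])) >= min_overlap

    def predict(u, v, item):
        # Predict u's rating of an item v has rated, via the best
        # constant shift r_u ~ r_v + b fitted on their common items.
        common = set(ratings[u]) & set(ratings[v])
        b = sum(ratings[u][i] - ratings[v][i] for i in common) / len(common)
        return ratings[v][item] + b

    if horts("alice", "bob"):
        # Toy numbers; real use would clamp to the rating scale.
        print(predict("alice", "bob", "cd4"))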


IEEE Transactions on Knowledge and Data Engineering | 1999

Caching on the World Wide Web

Charu C. Aggarwal; Joel L. Wolf; Philip S. Yu

With the recent explosion in usage of the World Wide Web, the problem of caching Web objects has gained considerable importance. Caching on the Web differs from traditional caching in several ways. The nonhomogeneity of the object sizes is probably the most important such difference. In this paper, we give an overview of caching policies designed specifically for Web objects and provide a new algorithm of our own. This new algorithm can be regarded as a generalization of the standard LRU algorithm. We examine the performance of this and other Web caching algorithms via event- and trace-driven simulation.
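
A minimal sketch of the size-aware generalization, using only the standard library; the paper's own algorithm also folds object size into the replacement value itself rather than just the fit check:

    from collections import OrderedDict

    class SizeAwareLRU:
        # LRU generalized to variable-sized Web objects: evict least
        # recently used entries until the incoming object fits.
        def __init__(self, capacity_bytes):
            self.capacity = capacity_bytes
            self.used = 0
            self.cache = OrderedDict()      # key -> size, LRU order first

        def get(self, key):
            if key not in self.cache:
                return False                # miss
            self.cache.move_to_end(key)     # mark most recently used
            return True

        def put(self, key, size):
            if size > self.capacity:
                return                      # can never fit; don't cache
            if key in self.cache:
                self.used -= self.cache.pop(key)
            while self.used + size > self.capacity:
                _, old_size = self.cache.popitem(last=False)   # evict LRU
                self.used -= old_size
            self.cache[key] = size
            self.used += size

    cache = SizeAwareLRU(capacity_bytes=100)
    cache.put("/index.html", 60)
    cache.put("/logo.png", 50)
    print(cache.get("/index.html"))   # False: evicted to make room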


international world wide web conferences | 2001

Segment-based proxy caching of multimedia streams

Kun-Lung Wu; Philip S. Yu; Joel L. Wolf

As streaming video and audio over the Internet becomes popular, proper proxy caching of large multimedia objects has become increasingly important. For a large media object, such as a 2-hour video, treating the whole video as a single web object for caching is not appropriate. In this paper, we present and evaluate a segment-based buffer management approach to proxy caching of large media streams. Blocks of a media stream received by a proxy server are grouped into variable-sized segments. The cache admission and replacement policies then attach different caching values to different segments, taking into account the segment distance from the start of the media. These caching policies give preferential treatment to the beginning segments. As such, users can quickly play back the media objects without much delay. Event-driven simulations are conducted to evaluate this segment-based proxy caching approach. The results show that (1) segment-based caching is effective not only in increasing byte-hit ratio (or reducing total traffic) but also in lowering the number of requests that require delayed starts; (2) segment-based caching is especially advantageous when the cache size is limited, when the set of hot media objects changes over time, when the media file size is large, and when many users may stop playing the media after only a few initial blocks.
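
A sketch of the two ingredients, variable-sized segments and distance-sensitive caching values. Exponentially doubling segments are one plausible grouping, and the value formula is illustrative rather than the paper's exact policy:

    import math

    def segment_of(block_index):
        # Group blocks into segments whose sizes double: segment i
        # holds blocks 2**i - 1 .. 2**(i+1) - 2.
        return int(math.log2(block_index + 1))

    def caching_value(segment_index, access_frequency):
        # Later segments get smaller values, so replacement evicts
        # them first; the beginning of each stream stays cached and
        # playback can start without delay.
        return access_frequency / (segment_index + 1)

    print([segment_of(b) for b in range(7)])   # -> [0, 1, 1, 2, 2, 2, 2]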


international conference on multimedia computing and systems | 1996

A permutation-based pyramid broadcasting scheme for video-on-demand systems

Charu C. Aggarwal; Joel L. Wolf; Philip S. Yu

Periodic broadcasting can be used to support near video on demand for popular videos. For a given bandwidth allocation, pyramid broadcasting schemes substantially reduce the viewer latency (waiting) time as compared with conventional broadcasting schemes. Nevertheless, such pyramid schemes typically have substantial storage requirements at the client end, and this results in set top boxes needing disks with high transfer rate capabilities. We present a permutation based pyramid scheme in which the storage requirements and disk transfer rates are greatly reduced, and yet the viewer latency is smaller as well. Under the proposed approach, each video is partitioned into contiguous segments of geometrically increasing sizes and each segment is further divided into blocks, where a block is the basic unit of transmission. As in the original pyramid scheme, frequencies of transmission for the different segments of a video vary in a manner inversely proportional to their size. Instead of transmitting the block in each segment in sequential order, the proposed scheme transmits these blocks in a prespecified cyclic permutation to save on storage requirements in the client end. Performance analyses are provided to quantify the benefits of the new scheme.
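
The geometric partitioning and inverse-frequency rule can be sketched directly; alpha and the segment count here are illustrative parameters:

    def pyramid_segments(video_len, alpha=2.0, n_segments=5):
        # Contiguous segments whose sizes grow geometrically by a
        # factor alpha; the sizes sum to the full video length.
        unit = video_len * (alpha - 1) / (alpha ** n_segments - 1)
        return [unit * alpha ** i for i in range(n_segments)]

    # Each segment is rebroadcast with frequency inversely proportional
    # to its size, so the short first segment airs most often and
    # bounds the viewer's start-up latency.
    for i, size in enumerate(pyramid_segments(7200.0)):   # 2-hour video, in seconds
        print(f"segment {i}: {size:7.1f}s, relative frequency {1 / size:.5f}")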


international conference on multimedia computing and systems | 1996

On optimal batching policies for video-on-demand storage servers

Charu C. Aggarwal; Joel L. Wolf; Philip S. Yu

In a video-on-demand environment, batching of video requests is often used to reduce I/O demand and improve throughput. Since viewers may defect if they experience long waits, a good video scheduling policy needs to consider not only the batch size but also the viewer defection probabilities and wait times. Two conventional scheduling policies for batching are first-come-first-served (FCFS) and maximum queue length (MQL). Neither of these policies leads to entirely satisfactory results. MQL tends to be too aggressive in scheduling popular videos by only considering the queue length to maximize batch size, while FCFS has the opposite effect. We introduce the notion of factored queue length and propose a batching policy that schedules the video with the maximum factored queue length. We refer to this as the MFQ policy. The factored queue length is obtained by weighting each video queue length with a factor which is biased against the more popular videos. An optimization problem is formulated to solve for the best weighting factors for the various videos. A simulation is developed to compare the proposed MFQ policy with FCFS and MQL. Our study shows that MFQ yields excellent empirical results in terms of standard performance measures such as average latency time, defection rates and fairness.
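
A sketch of the selection rule. The weighting factors below are hand-picked for illustration; the paper obtains them by solving an optimization problem over the video popularities:

    def mfq_pick(queues, weights):
        # Maximum Factored Queue length: weight each video's queue
        # length by a factor biased against popular videos, then
        # schedule the video with the largest factored length.
        scores = {v: len(q) * weights[v] for v, q in queues.items()}
        return max(scores, key=scores.get)

    queues = {"videoA": list(range(12)), "videoB": list(range(5))}
    weights = {"videoA": 0.2, "videoB": 0.9}   # videoA is the popular one
    print(mfq_pick(queues, weights))           # videoB wins despite a shorter queue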


electronic commerce | 2001

On maximizing service-level-agreement profits

Zhen Liu; Mark S. Squillante; Joel L. Wolf

We present a methodology for maximizing profits in a general class of e-commerce environments. The cost model is based on revenues that are generated when Quality-of-Service (QoS) guarantees are satisfied and on penalties that are incurred otherwise. The corresponding QoS criteria are derived from multiclass Service-Level-Agreements (SLAs) between service providers and their clients, which include the tail distributions of the per-class delays in addition to more standard QoS metrics such as throughput and mean delays. Our approach consists of formulating the optimization problem as a network flow model with a separable set of concave objective functions based on queueing-theoretic formulas, where the SLA classes are taken into account in both the constraints and the objective function. This problem is then solved via a fixed-point iteration. Numerous experiments illustrate the benefits of our approach.
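
A sketch of the kind of objective involved, using the M/M/1 delay-tail formula and a greedy marginal allocation as a simple stand-in for the paper's network-flow and fixed-point machinery. All parameter values are illustrative:

    import math

    def tail_prob(lam, mu, d):
        # M/M/1 response-time tail: P(D > d) = exp(-(mu - lam) * d)
        # when mu > lam; the queueing formula behind the objective.
        return math.exp(-(mu - lam) * d) if mu > lam else 1.0

    def profit(lam, mu, d, revenue, penalty):
        # Revenue accrues when the SLA delay bound d is met;
        # a penalty is incurred otherwise.
        p_miss = tail_prob(lam, mu, d)
        return revenue * (1.0 - p_miss) - penalty * p_miss

    def allocate(classes, total_capacity, step=0.1):
        # Give each class just enough capacity for stability, then hand
        # out the remainder in slices to whichever class gains the most
        # profit; valid because the per-class objectives are concave.
        mu = [c["lam"] + 1e-6 for c in classes]
        budget = total_capacity - sum(mu)
        while budget >= step:
            gains = [profit(c["lam"], m + step, c["d"], c["r"], c["p"]) -
                     profit(c["lam"], m, c["d"], c["r"], c["p"])
                     for c, m in zip(classes, mu)]
            best = max(range(len(classes)), key=gains.__getitem__)
            mu[best] += step
            budget -= step
        return mu

    classes = [{"lam": 2.0, "d": 1.0, "r": 5.0, "p": 3.0},
               {"lam": 1.0, "d": 0.5, "r": 8.0, "p": 6.0}]
    print(allocate(classes, total_capacity=5.0))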


IEEE Transactions on Computers | 1992

Optimal partitioning of cache memory

Harold S. Stone; John Turek; Joel L. Wolf

A model for studying the optimal allocation of cache memory among two or more competing processes is developed and used to show that, for the examples studied, the least recently used (LRU) replacement strategy produces cache allocations that are very close to optimal. It is also shown that when program behavior changes, LRU replacement moves quickly toward the steady-state allocation if it is far from optimal, but converges slowly as the allocation approaches the steady-state allocation. An efficient combinatorial algorithm for determining the optimal steady-state allocation, which, in theory, could be used to reduce the length of the transient, is described. The algorithm generalizes to multilevel cache memories. For multiprogrammed systems, a cache-replacement policy better than LRU replacement is given. The policy increases the memory available to the running process until the allocation reaches a threshold time beyond which the replacement policy does not increase the cache memory allocated to the running process.
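
The combinatorial flavor of the steady-state computation can be sketched with a greedy marginal-gain allocation over per-process miss-rate curves. The curves below are hypothetical; when they are convex, the greedy allocation coincides with the optimum:

    def partition_cache(miss_curves, total_blocks):
        # Hand out cache blocks one at a time to the process whose
        # miss rate drops the most for one more block.
        alloc = [0] * len(miss_curves)
        for _ in range(total_blocks):
            gains = [curve(alloc[i]) - curve(alloc[i] + 1)
                     for i, curve in enumerate(miss_curves)]
            best = max(range(len(gains)), key=gains.__getitem__)
            alloc[best] += 1
        return alloc

    # Two hypothetical processes with geometrically decaying miss rates.
    curves = [lambda n: 0.80 ** n, lambda n: 0.95 ** n]
    print(partition_cache(curves, 64))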


acm symposium on parallel algorithms and architectures | 1992

Approximate algorithms for scheduling parallelizable tasks

John Turek; Joel L. Wolf; Philip S. Yu

A parallelizable task is one that can be run on an arbitrary number of processors with a running time that depends on the number of processors allotted to it. Consider a parallel system having m identical processors and n independent parallelizable tasks to be scheduled on those processors. The goal is to find (1) for each task j, an allotment of processors βj, and (2) overall, a nonpreemptive schedule assigning the tasks to the processors which minimizes the makespan, or latest task completion time. This multiprocessor scheduling problem is known to be NP-complete in the strong sense. We therefore concentrate on providing a heuristic that has polynomial running time with provable worst case bounds on the suboptimality of the solution. In particular, we give an algorithm that selects a family of (up to n(m - 1) + 1) candidate allotments of processors to tasks, thereby allowing us to use as a subroutine any algorithm A that "solves" the simpler multiprocessor scheduling problem in which the number of processors allotted to a task is fixed. Our algorithm has the property that for a large class of previously studied algorithms our extension will match the same worst case bounds on the suboptimality of the solution while increasing the computational complexity of A by at most a factor of O(nm). As consequences we get polynomial time algorithms for several previously studied special cases of the problem.
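
A brute-force sketch of the allotment-then-schedule idea, enumerating every allotment for clarity; the paper's contribution is precisely avoiding this blow-up by trying only up to n(m - 1) + 1 candidate allotments:

    from itertools import product

    def schedule_bound(times, allot, m):
        # Standard lower bound for a fixed allotment: total work
        # spread over m processors, or the longest single task.
        work = sum(times[j][allot[j] - 1] * allot[j] for j in range(len(times)))
        longest = max(times[j][allot[j] - 1] for j in range(len(times)))
        return max(work / m, longest)

    def best_allotment(times, m):
        # times[j][p-1] = running time of task j on p processors.
        # Keep the allotment with the best bound; a fixed-allotment
        # scheduler A would then build the nonpreemptive schedule.
        best, best_allot = float("inf"), None
        for a in product(range(1, m + 1), repeat=len(times)):
            b = schedule_bound(times, a, m)
            if b < best:
                best, best_allot = b, a
        return best_allot, best

    times = [[9.0, 5.0, 4.0], [6.0, 4.0, 3.5]]   # 2 tasks on up to 3 processors
    print(best_allotment(times, m=3))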


international world wide web conferences | 2002

Optimal crawling strategies for web search engines

Joel L. Wolf; Mark S. Squillante; Philip S. Yu; Jay Sethuraman; L. Ozsen


Collaboration


Dive into Joel L. Wolf's collaboration.

Top Co-Authors

Deepak Rajan
Lawrence Livermore National Laboratory

Charu C. Aggarwal
Massachusetts Institute of Technology

Hadas Shachnai
Technion – Israel Institute of Technology