Khaled Alsabti | Researchain

Publication

Featured research published by Khaled Alsabti.

symposium on frontiers of massively parallel computation | 1995

Many-to-many personalized communication with bounded traffic

Sanjay Ranka; Ravi V. Shankar; Khaled Alsabti

This paper presents solutions for the problem of many-to-many personalized communication, with bounded incoming and outgoing traffic, on a distributed memory parallel machine. We present a two-stage algorithm that decomposes the many-to-many communication with possibly high variance in message size into two communications with low message size variance. The algorithm is deterministic and takes time $2t\mu$ (+ lower order terms) when $t \ge O(p^2 + p\tau/\mu)$. Here $t$ is the maximum outgoing or incoming traffic at any processor, $\tau$ is the startup overhead and $\mu$ is the inverse of the data transfer rate. Optimality is achieved when the traffic is large, a condition that is usually satisfied in practice on coarse-grained architectures. The algorithm was implemented on the Connection Machine CM-5. The implementation used the low latency communication primitives (active messages) available on the CM-5, but the algorithm as such is architecture-independent. An alternate single-stage algorithm using distributed random scheduling for the CM-5 was implemented and the performance of the two algorithms was compared.
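The idea of splitting one high-variance exchange into two low-variance ones can be illustrated with a generic two-phase routing sketch. This is not the paper's algorithm: the round-robin spreading rule, the function name `two_stage_sizes`, and the matrix layout are all assumptions made for illustration. Each sender spreads its outgoing items evenly over all processors acting as intermediates, and each intermediate then forwards what it holds to the final destinations.

```python
def two_stage_sizes(msg):
    """Return the message-size matrices of a generic two-phase exchange.

    msg[i][j] is the number of items processor i wants to send to processor j.
    Hypothetical sketch: items are spread round-robin over intermediates.
    """
    p = len(msg)
    stage1 = [[0] * p for _ in range(p)]  # stage1[i][k]: items i sends to intermediate k
    stage2 = [[0] * p for _ in range(p)]  # stage2[k][j]: items intermediate k forwards to j
    for i in range(p):
        slot = i  # start at a different intermediate per sender to avoid bias
        for j in range(p):
            for _ in range(msg[i][j]):
                k = slot % p
                stage1[i][k] += 1
                stage2[k][j] += 1
                slot += 1
    return stage1, stage2


# Two senders both target processor 1, with very different message sizes.
msg = [
    [0, 8, 0, 0],
    [0, 0, 0, 0],
    [0, 4, 0, 0],
    [0, 0, 0, 0],
]
stage1, stage2 = two_stage_sizes(msg)
print(stage1[0])                    # sizes of processor 0's stage-1 messages
print([row[1] for row in stage2])   # sizes of the stage-2 messages into processor 1
```

In this example every stage-1 message from a given sender has the same size, and the stage-2 messages into the hot destination are equal as well, even though the original exchange was concentrated on a single destination.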

merged international parallel processing symposium and symposium on parallel and distributed processing | 1998

An efficient parallel algorithm for high dimensional similarity join

Khaled Alsabti; Sanjay Ranka; Vineet Singh

Multidimensional similarity join finds pairs of multidimensional points that are within some small distance of each other. The $\epsilon$-k-d-B tree has been proposed as a data structure that scales better as the number of dimensions increases compared to previous data structures. We present a cost model of the $\epsilon$-k-d-B tree and use it to optimize the leaf size. We present novel parallel algorithms for the similarity join using the $\epsilon$-k-d-B tree. A load balancing strategy based on equi-depth histograms is shown to work well for uniform or low-skew situations, whereas another based on weighted equi-depth histograms works far better for high-skew datasets. The latter strategy is only slightly slower than the former strategy for low-skew datasets. Further, its cost is proportional to the overall cost of the similarity join.
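To make the operation itself concrete, here is a brute-force reference for a similarity self-join. It is only a sketch of what the join computes, not the tree-based or parallel algorithm from the paper. The function name `similarity_join` and the choice of Euclidean distance are assumptions; the abstract does not say which distance metric is used.

```python
import math
from itertools import combinations


def similarity_join(points, eps):
    """Return index pairs (i, j), i < j, whose points lie within eps of each other.

    Brute force over all pairs: quadratic in the number of points, which is
    the cost that index structures aim to avoid. Euclidean distance is an
    assumption made for this sketch.
    """
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(points), 2)
        if math.dist(a, b) <= eps
    ]


points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0)]
print(similarity_join(points, eps=0.2))
```

Only the first two points are close enough to be paired here; the third is far from both.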

ieee international conference on high performance computing data and analytics | 1998

Skew-insensitive parallel algorithms for relational join

Khaled Alsabti; Sanjay Ranka

Join is the most important and expensive operation in relational databases. The parallel join operation is very sensitive to the presence of data skew. In this paper we present two new parallel join algorithms for coarse-grained machines which work optimally in the presence of an arbitrary amount of data skew. The first algorithm is sort-based and the second is hash-based. Both of these algorithms employ a preprocessing phase to equally partition the work among the processors. These algorithms are shown to be theoretically as well as practically scalable.

ieee international conference on high performance computing, data, and analytics | 1997

Integer sorting algorithms for coarse-grained parallel machines

Khaled Alsabti; Sanjay Ranka

Integer sorting is a subclass of the sorting problem where the elements have integer values and the largest element is polynomially bounded in the number of elements to be sorted. It is useful for applications in which the maximum value of an element to be sorted is bounded. In this paper, we present a new distributed radix-sort algorithm for integer sorting. The structure of our algorithm is similar to radix sort except that it typically requires fewer communication phases. We present experimental results for our algorithm on two distributed memory multiprocessors, the Intel Paragon and the Thinking Machines CM-5. These results are compared with two other well-known practical parallel sorting algorithms based on radix sort and sample sort. The experimental results show that the distributed radix-sort is competitive with the other two algorithms.
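For readers unfamiliar with the baseline, the following is a minimal sequential least-significant-digit radix sort. It is not the distributed algorithm described in the abstract. The function name `radix_sort` and the parameter `bits_per_pass` are hypothetical, and the sketch assumes non-negative integers. In a distributed setting each pass would correspond roughly to a redistribution of data, which is why the number of passes matters.

```python
def radix_sort(values, bits_per_pass=8):
    """Sort non-negative integers with a sequential LSD radix sort.

    Each pass stably distributes the values into 2**bits_per_pass buckets
    keyed by one digit, starting from the least significant digit.
    """
    out = list(values)
    if not out:
        return out
    mask = (1 << bits_per_pass) - 1
    max_value = max(out)
    shift = 0
    # Stop once every remaining digit of the largest value is zero.
    while (max_value >> shift) > 0:
        buckets = [[] for _ in range(mask + 1)]
        for v in out:
            buckets[(v >> shift) & mask].append(v)
        # Concatenating buckets in order keeps the sort stable.
        out = [v for bucket in buckets for v in bucket]
        shift += bits_per_pass
    return out


data = [170, 45, 75, 90, 802, 24, 2, 66]
print(radix_sort(data))
```

With the default of 8 bits per pass, the largest value in this example (802) needs two passes; a bounded maximum value therefore bounds the number of passes.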

Archive | 1997

An efficient k-means clustering algorithm

Khaled Alsabti; Sanjay Ranka; Vineet Singh

knowledge discovery and data mining | 1998

CLOUDS: a decision tree classifier for large datasets

Khaled Alsabti; Sanjay Ranka; Vineet Singh

very large data bases | 1997

A One-Pass Algorithm for Accurately Estimating Quantiles for Disk-Resident Data

Khaled Alsabti; Sanjay Ranka; Vineet Singh

Archive | 1998

CLOUDS: Classification for large or out-of-core datasets

Khaled Alsabti; Sanjay Ranka; Vineet Singh

Archive | 1994

The Transportation Primitive

Ravi V. Shankar; Khaled Alsabti; Sanjay Ranka

Archive | 1998

Efficient algorithms for data mining

Sanjay Ranka; Khaled Alsabti

Collaboration

Dive into Khaled Alsabti's collaborations.

Top Co-Authors

Sanjay Ranka

Syracuse University

Vineet Singh

Hitachi

Ravi V. Shankar

Syracuse University
