Publication


Featured research published by Andrew Rau-Chaplin.


Symposium on Computational Geometry | 1993

Scalable parallel geometric algorithms for coarse grained multicomputers

Frank K. H. A. Dehne; Andreas Fabri; Andrew Rau-Chaplin

Whereas most of the literature assumes that the number of processors p is a function of the problem size n, in scalable algorithms p becomes a parameter of the time complexity. This is a more realistic model of real parallel machines and yields optimal algorithms for the case that n ≥ f(p), where f is a function depending on the architecture of the interconnection network. In this paper we present scalable algorithms for a number of geometric problems, namely lower envelope of line segments, 2D-nearest neighbour, 3D-maxima, 2D-weighted dominance counting, area of the union of rectangles, and 2D-convex hull. The main idea of these algorithms is to decompose the problem into p subproblems of size O(F(n,p) + f(p)), with f(p) ≤ F(n,p), which can be solved independently using optimal sequential algorithms. For each problem we present a spatial decomposition scheme based on geometric observations. The decomposition schemes have in common that they can be computed by globally sorting the entire data set at most twice. The data redundancy of f(p) duplicates of data elements per processor does not increase the asymptotic time complexity and ranges, for the algorithms presented in this paper, from p to p². The algorithms do not depend on a specific architecture; they are easy to implement and efficient in practice, as experiments show.
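
To make the decomposition idea concrete, here is a minimal single-process sketch (ours, not the paper's code) of the pattern on a toy 2D-maxima problem: one global sort, p independent sequential subproblems, and a combine step that exchanges only one scalar per block.

```cpp
// Illustration of the coarse-grained pattern described in the abstract,
// simulated in one process on a toy 2D-maxima instance. A point is maximal
// if no other point has both larger x and larger y.
#include <algorithm>
#include <iostream>
#include <vector>

struct Point { double x, y; };

int main() {
    // Toy input; in the paper's setting each processor holds O(n/p) points.
    std::vector<Point> pts = {{1,5},{2,3},{4,4},{5,1},{3,6},{6,2}};
    const int p = 3;  // simulated processor count

    // Step 1: one global sort by x, descending (the only "communication").
    std::sort(pts.begin(), pts.end(),
              [](const Point& a, const Point& b){ return a.x > b.x; });

    // Step 2: split into p contiguous blocks of size O(n/p); each block runs
    // the optimal sequential sweep, recording local maxima and its max y.
    const size_t n = pts.size(), block = (n + p - 1) / p;
    std::vector<std::vector<Point>> localMax(p);
    std::vector<double> blockMaxY(p, -1e300);
    for (int i = 0; i < p; ++i) {
        double best = -1e300;
        for (size_t j = i * block; j < std::min(n, (i + 1) * block); ++j)
            if (pts[j].y > best) { best = pts[j].y; localMax[i].push_back(pts[j]); }
        blockMaxY[i] = best;
    }

    // Step 3: combine. Each block needs only one scalar from the blocks
    // before it (which hold larger x): the prefix maximum of blockMaxY.
    double prefix = -1e300;
    for (int i = 0; i < p; ++i) {
        for (const Point& q : localMax[i])
            if (q.y > prefix) std::cout << "(" << q.x << "," << q.y << ") ";
        prefix = std::max(prefix, blockMaxY[i]);
    }
    std::cout << "\n";  // prints (6,2) (4,4) (3,6)
}
```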


International Journal of Computational Geometry and Applications | 1996

SCALABLE PARALLEL COMPUTATIONAL GEOMETRY FOR COARSE GRAINED MULTICOMPUTERS

Frank K. H. A. Dehne; Andreas Fabri; Andrew Rau-Chaplin

We study scalable parallel computational geometry algorithms for the coarse grained multicomputer model: p processors solving a problem on n data items, where each processor has O(n/p) ≫ O(1) local memory and all processors are connected via some arbitrary interconnection network (e.g. mesh, hypercube, fat tree). We present O(T_sequential/p + T_s(n, p)) time scalable parallel algorithms for several computational geometry problems, where T_s(n, p) refers to the time of a global sort operation. Our results are independent of the multicomputer's interconnection network. Their time complexities become optimal when T_sequential/p dominates T_s(n, p) or when T_s(n, p) is optimal. This is the case for several standard architectures, including meshes and hypercubes, and a wide range of ratios n/p that include many of the currently available machine configurations. Our methods also have some important practical advantages: for interprocessor communication they use only a small fixed number of calls to a single global routing operation, global sort, and all other programming is in the sequential domain. Furthermore, our algorithms use only a small number of very large messages, which greatly reduces the overhead for the communication protocol between processors. (Note, however, that our time complexities account for the lengths of messages.) Experiments show that our methods are easy to implement and give good timing results.
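
As a worked instance of the time bound (our illustration, not from the paper), consider a problem with Θ(n log n) sequential time, such as 2D-convex hull:

```latex
T_{\text{par}}(n,p) = O\!\Big(\tfrac{T_{\text{seq}}(n)}{p} + T_s(n,p)\Big)
                    = O\!\Big(\tfrac{n\log n}{p} + T_s(n,p)\Big),
\qquad
T_s(n,p) = O\!\Big(\tfrac{n\log n}{p}\Big)
\;\Longrightarrow\;
T_{\text{par}}(n,p) = O\!\Big(\tfrac{n\log n}{p}\Big).
```

That is, whenever the global sort itself runs with optimal speedup on the target network, the whole algorithm inherits optimal speedup, which is exactly the optimality condition stated above.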


Journal of Computer and System Sciences | 2003

Solving large FPT problems on coarse-grained parallel machines

James Cheetham; Frank K. H. A. Dehne; Andrew Rau-Chaplin; Ulrike Stege; Peter J. Taillon

Fixed-parameter tractability (FPT) techniques have recently been successful in solving NP-complete problem instances of practical importance which were too large to be solved with previous methods. In this paper, we show how to enhance this approach through the addition of parallelism, thereby allowing even larger problem instances to be solved in practice. More precisely, we demonstrate the potential of parallelism when applied to the bounded-tree search phase of FPT algorithms. We apply our methodology to the k-VERTEX COVER problem which has important applications in, for example, the analysis of multiple sequence alignments for computational biochemistry. We have implemented our parallel FPT method for the k-VERTEX COVER problem using C and the MPI communication library, and tested it on a 32-node Beowulf cluster. This is the first experimental examination of parallel FPT techniques. As part of our experiments, we solved larger instances of k-VERTEX COVER than in any previously reported implementations. For example, our code can solve problem instances with k≥400 in less than 1.5 h.
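
The bounded search tree that this parallelization targets has a simple sequential core; below is a minimal sketch (ours, not the authors' C/MPI implementation) of the classic O(2^k) branching for k-VERTEX COVER.

```cpp
// Minimal sequential sketch of the O(2^k) bounded search tree for
// k-VERTEX COVER. The paper parallelizes this search phase across MPI
// ranks; this illustration is ours, not the authors' code.
#include <iostream>
#include <utility>
#include <vector>

using Edge = std::pair<int, int>;

// Can all edges be covered by at most k vertices?
bool coverable(std::vector<Edge> edges, int k) {
    if (edges.empty()) return true;   // nothing left to cover
    if (k == 0) return false;         // edges remain but the budget is spent
    auto [u, v] = edges.front();      // for any uncovered edge, u or v must be chosen
    auto erase = [&](int w) {         // drop all edges incident to w
        std::vector<Edge> rest;
        for (const Edge& e : edges)
            if (e.first != w && e.second != w) rest.push_back(e);
        return rest;
    };
    // Two branches, each shrinking k by one: tree depth is at most k.
    return coverable(erase(u), k - 1) || coverable(erase(v), k - 1);
}

int main() {
    std::vector<Edge> g = {{0,1},{1,2},{2,3},{3,0},{0,2}};
    for (int k = 0; k <= 3; ++k)
        std::cout << "k=" << k << ": " << (coverable(g, k) ? "yes" : "no") << "\n";
}
```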


Distributed and Parallel Databases | 2002

Parallelizing the Data Cube

Frank K. H. A. Dehne; Todd Eavis; Susanne E. Hambrusch; Andrew Rau-Chaplin

This paper presents a general methodology for the efficient parallelization of existing data cube construction algorithms. We describe two different partitioning strategies, one for top-down and one for bottom-up cube algorithms. Both partitioning strategies assign subcubes to individual processors in such a way that the loads assigned to the processors are balanced. Our methods reduce interprocessor communication overhead by partitioning the load in advance instead of computing each individual group-by in parallel. Our partitioning strategies create a small number of coarse tasks. This allows for sharing of prefixes and sort orders between different group-by computations. Our methods enable code reuse by permitting the use of existing sequential (external memory) data cube algorithms for the subcube computations on each processor. This supports the transfer of optimized sequential data cube code to a parallel setting.

The bottom-up partitioning strategy balances the number of single-attribute external memory sorts made by each processor. The top-down strategy partitions a weighted tree in which weights reflect algorithm-specific cost measures such as estimated group-by sizes. Both partitioning approaches can be implemented on any shared-disk type parallel machine composed of p processors connected via an interconnection fabric and with access to a shared parallel disk array.

We have implemented our parallel top-down data cube construction method in C++ with the MPI message passing library for communication and the LEDA library for the required graph algorithms. We tested our code on an eight-processor cluster, using a variety of different data sets with a range of sizes, dimensions, density, and skew. Comparison tests were performed on a SunFire 6800. The tests show that our partitioning strategies generate a close to optimal load balance between processors. The actual run times observed show an optimal speedup of p.
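
To give a feel for the load-balancing goal, here is a deliberately simplified sketch (ours; the paper partitions a weighted schedule tree rather than a flat task list) that greedily assigns weighted group-by tasks to the least loaded processor.

```cpp
// Illustrative greedy balancing of weighted group-by tasks across p
// processors. The weights here stand in for estimated group-by sizes;
// the paper's actual strategies partition a weighted tree.
#include <algorithm>
#include <functional>
#include <iostream>
#include <queue>
#include <utility>
#include <vector>

int main() {
    std::vector<long> taskWeight = {90, 70, 50, 40, 30, 20, 10};  // est. sizes
    const int p = 3;
    std::sort(taskWeight.rbegin(), taskWeight.rend());  // heaviest first (LPT)

    // Min-heap of (current load, processor id): the next heaviest task
    // always goes to the currently least loaded processor.
    using Slot = std::pair<long, int>;
    std::priority_queue<Slot, std::vector<Slot>, std::greater<Slot>> load;
    for (int i = 0; i < p; ++i) load.push({0, i});

    std::vector<long> total(p, 0);
    for (long w : taskWeight) {
        auto [l, id] = load.top(); load.pop();
        total[id] = l + w;
        load.push({l + w, id});
    }
    for (int i = 0; i < p; ++i)
        std::cout << "proc " << i << " load " << total[i] << "\n";
}
```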


Evolutionary Bioinformatics | 2008

SPR Distance Computation for Unrooted Trees

Glenn Hickey; Frank K. H. A. Dehne; Andrew Rau-Chaplin; Christian Blouin

The subtree prune and regraft distance (dSPR) between phylogenetic trees is important both as a general means of comparing phylogenetic tree topologies and as a measure of lateral gene transfer (LGT). Although there has been extensive study on the computation of dSPR and similar metrics between rooted trees, much less is known about SPR distances for unrooted trees, which often arise in practice when the root is unresolved. We show that unrooted SPR distance computation is NP-hard and verify which techniques from related work can and cannot be applied. We then present an efficient heuristic algorithm for this problem and benchmark it on a variety of synthetic datasets. Our algorithm computes the exact SPR distance between unrooted trees; the heuristic element affects only the algorithm's computation time. Our method is a heuristic version of a fixed parameter tractability (FPT) approach, and our experiments indicate that the running time behaves similarly to that of FPT algorithms. For real data sets, our algorithm was able to quickly compute dSPR for the majority of trees that were part of a study of LGT in 144 prokaryotic genomes. Our analysis of its performance, especially with respect to searching and reduction rules, is applicable to computing many related distance measures.
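
Why heuristics can change only the running time, never the reported distance, is easiest to see from the shape of an FPT-style driver; a sketch with a stubbed bounded search (`existsSprSequence` is our hypothetical stand-in for the expensive test, not the paper's routine).

```cpp
// Sketch of why an FPT-style search returns the exact dSPR: try
// k = 0, 1, 2, ... and return the first k for which the bounded search
// succeeds. Heuristics may reorder or prune work inside the bounded
// search, affecting time but never the returned value.
#include <iostream>

// Stub standing in for "is there a sequence of at most k SPR moves
// turning T1 into T2?". Here we fake a true distance of 3 purely so
// the driver runs; in reality this test is the expensive part.
bool existsSprSequence(int k) { return k >= 3; }

int exactSprDistance() {
    for (int k = 0; ; ++k)
        if (existsSprSequence(k)) return k;  // first success is the minimum
}

int main() { std::cout << "dSPR = " << exactSprDistance() << "\n"; }
```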


International Parallel and Distributed Processing Symposium | 2003

Parallel ROLAP data cube construction on shared-nothing multiprocessors

Ying Chen; Frank K. H. A. Dehne; Todd Eavis; Andrew Rau-Chaplin

The pre-computation of data cubes is critical to improving the response time of On-Line Analytical Processing (OLAP) systems and can be instrumental in accelerating data mining tasks in large data warehouses. In order to meet the need for improved performance created by growing data sizes, parallel solutions for generating the data cube are becoming increasingly important. This paper presents a parallel method for generating data cubes on a shared-nothing multiprocessor. Since no (expensive) shared disk is required, our method can be used on low-cost Beowulf-style clusters consisting of standard PCs with local disks connected via a data switch. Our approach uses a ROLAP representation of the data cube, where views are stored as relational tables. This allows for tight integration with current relational database technology.

We have implemented our parallel shared-nothing data cube generation method and evaluated it on a PC cluster, exploring relative speedup, local vs. global schedule trees, data skew, cardinality of dimensions, data dimensionality, and balance tradeoffs. For an input data set of 2,000,000 rows (72 Megabytes), our parallel data cube generation method achieves close to optimal speedup, generating a full data cube of ≈227 million rows (5.6 Gigabytes) on a 16-processor cluster in under 6 minutes. For an input data set of 10,000,000 rows (360 Megabytes), our parallel method, running on a 16-processor PC cluster, created a data cube consisting of ≈846 million rows (21.7 Gigabytes) in under 47 minutes.
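
In the ROLAP representation each view is just a relational table, so the sequential kernel on every node reduces to sort-plus-scan aggregation over its local partition; a toy single-dimension sketch (our simplification, not the paper's pipeline) follows.

```cpp
// Illustration of materializing one ROLAP view (a single-attribute
// group-by with SUM) as a relational table via sort + scan, the kind of
// sequential kernel each shared-nothing node would run on its local rows.
#include <algorithm>
#include <iostream>
#include <vector>

struct Row { int dim; long measure; };

int main() {
    std::vector<Row> fact = {{2,5},{1,3},{2,7},{3,1},{1,4}};  // local partition
    std::sort(fact.begin(), fact.end(),
              [](const Row& a, const Row& b){ return a.dim < b.dim; });

    // One scan over the sorted run emits the aggregated view tuples.
    for (size_t i = 0; i < fact.size(); ) {
        long sum = 0; const int d = fact[i].dim;
        for (; i < fact.size() && fact[i].dim == d; ++i) sum += fact[i].measure;
        std::cout << d << " -> " << sum << "\n";  // view tuple (dim, SUM)
    }
}
```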


Distributed and Parallel Databases | 2006

The cgmCUBE project: Optimizing parallel data cube generation for ROLAP

Frank K. H. A. Dehne; Todd Eavis; Andrew Rau-Chaplin

On-line Analytical Processing (OLAP) has become one of the most powerful and prominent technologies for knowledge discovery in VLDB (Very Large Database) environments. Central to the OLAP paradigm is the data cube, a multi-dimensional hierarchy of aggregate values that provides a rich analytical model for decision support. Various sequential algorithms for the efficient generation of the data cube have appeared in the literature. However, given the size of contemporary data warehousing repositories, multi-processor solutions are crucial for the massive computational demands of current and future OLAP systems.

In this paper we discuss the cgmCUBE Project, a multi-year effort to design and implement a multi-processor platform for data cube generation that targets the relational database model (ROLAP). More specifically, we discuss new algorithmic and system optimizations relating to (1) a thorough optimization of the underlying sequential cube construction method and (2) a detailed and carefully engineered cost model for improved parallel load balancing and faster sequential cube construction. These optimizations were key in allowing us to build a prototype that is able to produce data cube output at a rate of over one terabyte per hour.


Information Processing Letters | 2008

Compact Hilbert indices: Space-filling curves for domains with unequal side lengths

Chris H. Hamilton; Andrew Rau-Chaplin

In this paper we define a new compact Hilbert index which, while maintaining all of the advantages of the standard Hilbert curve, permits spaces with unequal dimension cardinalities. The compact Hilbert index can be used in any application that would have previously relied on Hilbert curves but, in the case of unequal side lengths, provides a more memory efficient representation. This advantage is particularly important in distributed applications (Parallel, P2P and Grid), in which not only is memory space saved but communication volume is significantly reduced.
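
The memory saving is easy to quantify. For side lengths 2^{m_i}, a standard Hilbert index must pad every dimension to the largest bit width, costing d · max(m_i) bits, while the compact index needs only the sum of the m_i; a back-of-envelope comparison with hypothetical widths is sketched below.

```cpp
// Back-of-envelope size comparison (an illustration consistent with the
// abstract, not the paper's construction): standard Hilbert indices pad
// all d dimensions to the widest one, d * max(m_i) bits, whereas the
// compact index needs sum(m_i) bits for side lengths 2^{m_i}.
#include <algorithm>
#include <iostream>
#include <numeric>
#include <vector>

int main() {
    std::vector<int> bits = {5, 10, 20};  // hypothetical per-dimension widths
    const int standard = bits.size() * *std::max_element(bits.begin(), bits.end());
    const int compact  = std::accumulate(bits.begin(), bits.end(), 0);
    std::cout << "standard: " << standard << " bits, "
              << "compact: "  << compact  << " bits\n";  // 60 vs 35
}
```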


Journal of Parallel and Distributed Computing | 1990

Implementing data structures on a hypercube multiprocessor, and applications in parallel computational geometry

Frank K. H. A. Dehne; Andrew Rau-Chaplin

In this paper, the authors study the problem of implementing standard data structures on a hypercube multiprocessor. They present a technique for efficiently executing multiple independent search processes on a class of graphs called ordered h-level graphs. They show how this technique can be utilized to implement a segment tree on a hypercube, thereby obtaining O(log² n) time algorithms for solving the next element search problem, the trapezoidal composition problem, and the triangulation problem.


IEEE International Conference on High Performance Computing, Data, and Analytics | 2012

Parallel Simulations for Analysing Portfolios of Catastrophic Event Risk

Aman Kumar Bahl; Oliver Baltzer; Andrew Rau-Chaplin; Blesson Varghese

At the heart of the analytical pipeline of a modern quantitative insurance/reinsurance company is a stochastic simulation technique for portfolio risk analysis and pricing referred to as Aggregate Analysis. Aggregate Analysis supports the computation of risk measures, including Probable Maximum Loss (PML) and Tail Value at Risk (TVaR), for a variety of complex property catastrophe insurance contracts, including Cat eXcess of Loss (XL), Per-Occurrence XL, and Aggregate XL, as well as contracts that combine these terms. In this paper, we explore parallel methods for aggregate risk analysis. We propose a parallel aggregate risk analysis algorithm and an engine based on it. This engine is implemented in C and OpenMP for multi-core CPUs and in C and CUDA for many-core GPUs. Performance analysis of the algorithm indicates that GPUs offer a cost-effective alternative HPC solution for aggregate risk analysis. The optimised algorithm on the GPU performs a one-million-trial aggregate simulation with 1,000 catastrophic events per trial on a typical exposure set and contract structure in just over 20 seconds, which is approximately 15x faster than the sequential counterpart. This can sufficiently support the real-time pricing scenario in which an underwriter analyses different contractual terms and pricing while discussing a deal with a client over the phone.
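
The shape of one aggregate-analysis trial (sample event losses, apply per-occurrence XL terms to each event, then aggregate XL terms to the trial total) might look as follows; a sequential C++ sketch with invented contract parameters, not the paper's C/OpenMP/CUDA engine.

```cpp
// Illustrative aggregate-analysis kernel with hypothetical contract
// parameters. The paper's engine runs the trial loop in parallel
// (OpenMP on CPUs, CUDA on GPUs); trials are independent, so the loop
// parallelizes trivially.
#include <algorithm>
#include <functional>
#include <iostream>
#include <random>
#include <vector>

int main() {
    const int trials = 10000, eventsPerTrial = 1000;
    const double occAttach = 50.0,  occLimit = 200.0;   // per-occurrence XL
    const double aggAttach = 500.0, aggLimit = 5000.0;  // aggregate XL
    std::mt19937 rng(42);
    std::lognormal_distribution<double> loss(3.0, 1.0);

    std::vector<double> trialLoss(trials);
    for (int t = 0; t < trials; ++t) {
        double agg = 0.0;
        for (int e = 0; e < eventsPerTrial; ++e) {
            const double gross = loss(rng);
            // Layer recovery: min(limit, max(0, gross - attachment)).
            agg += std::min(occLimit, std::max(0.0, gross - occAttach));
        }
        trialLoss[t] = std::min(aggLimit, std::max(0.0, agg - aggAttach));
    }

    // A PML-style statistic: the loss exceeded in roughly 1% of trials.
    std::nth_element(trialLoss.begin(), trialLoss.begin() + trials / 100,
                     trialLoss.end(), std::greater<double>());
    std::cout << "approx 1-in-100 loss: " << trialLoss[trials / 100] << "\n";
}
```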

Collaboration


Dive into Andrew Rau-Chaplin's collaboration.

Top Co-Authors

Michael Lawrence
University of British Columbia