Bing Bing Zhou
Australian National University
Publication
Featured research published by Bing Bing Zhou.
Journal of Parallel and Distributed Computing | 1997
Bing Bing Zhou; Richard P. Brent
In this paper we give evidence to show that in one-sided Jacobi SVD computation the sorting of column norms in each sweep is very important. An efficient parallel ring Jacobi ordering for computing singular value decomposition is described. This ordering can generate n(n-1)/2 different index pairs and sort column norms at the same time. The one-sided Jacobi SVD algorithm using this parallel ordering converges in about the same number of sweeps as the sequential cyclic Jacobi algorithm. The issue of equivalence of orderings for one-sided Jacobi is also discussed. We show how an ordering which does not sort column norms into order may still perform efficiently as long as it can generate the same index pairs at the same step as one which does sorting. Some experimental results on a Fujitsu AP1000 are presented.
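The role of column-norm sorting can be illustrated with a minimal sequential sketch of one-sided (Hestenes) Jacobi. This is not the parallel ring ordering from the paper: it uses a plain cyclic sweep, and the function name, tolerance, and sweep limit are assumptions chosen for illustration. Before each rotation the pair is swapped if needed so that the column with the larger norm sits on the left, which is one simple way to sort norms during a sweep.

```python
import numpy as np

def one_sided_jacobi_singular_values(A, tol=1e-12, max_sweeps=60):
    """Singular values of A (descending) by one-sided Jacobi, cyclic ordering.

    Illustrative sketch only; assumes A has at least as many rows as columns.
    """
    U = np.array(A, dtype=float)  # work on a copy
    n = U.shape[1]
    for _ in range(max_sweeps):
        changed = False
        for i in range(n - 1):
            for j in range(i + 1, n):
                alpha = U[:, i] @ U[:, i]   # squared norm of column i
                beta = U[:, j] @ U[:, j]    # squared norm of column j
                gamma = U[:, i] @ U[:, j]   # inner product of the pair
                # Sorting step: keep the larger-norm column on the left.
                if alpha < beta:
                    U[:, [i, j]] = U[:, [j, i]]
                    alpha, beta = beta, alpha
                    changed = True
                # Skip pairs that are already numerically orthogonal.
                if abs(gamma) <= tol * np.sqrt(alpha * beta):
                    continue
                # Plane rotation that makes columns i and j orthogonal.
                zeta = (beta - alpha) / (2.0 * gamma)
                sign = 1.0 if zeta >= 0 else -1.0
                t = sign / (abs(zeta) + np.sqrt(1.0 + zeta * zeta))
                c = 1.0 / np.sqrt(1.0 + t * t)
                s = c * t
                ui = U[:, i].copy()
                U[:, i] = c * ui - s * U[:, j]
                U[:, j] = s * ui + c * U[:, j]
                changed = True
        if not changed:
            break
    # At convergence the column norms are the singular values.
    return np.sort(np.linalg.norm(U, axis=0))[::-1]
```

On a small test matrix the result can be compared against `np.linalg.svd(A, compute_uv=False)`.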
Job Scheduling Strategies for Parallel Processing | 1998
Bing Bing Zhou; Richard P. Brent; David Walsh; Kuniyasu Suzaki
In this paper we first introduce the concepts of utilisation ratio and effective speedup and their relations to the system performance. We then describe a two-level scheduling scheme which can be used to achieve good performance for parallel jobs and good response for interactive sequential jobs and also to balance both parallel and sequential workloads. The two-level scheduling can be implemented by introducing on each processor a registration office. We also introduce a loose gang scheduling scheme. This scheme is scalable and has many advantages over existing explicit and implicit coscheduling schemes for scheduling parallel jobs under a time sharing environment.
Euromicro Workshop on Parallel and Distributed Processing | 1995
Bing Bing Zhou; Richard P. Brent
In this paper we give evidence to show that in one-sided Jacobi SVD computation the sorting of column norms in each sweep is very important. Two parallel Jacobi orderings are described. These orderings can generate n(n-1)/2 different index pairs and sort column norms at the same time. The one-sided Jacobi SVD algorithm using these parallel orderings converges in about the same number of sweeps as the sequential cyclic Jacobi algorithm. Some experimental results on a Fujitsu AP1000 are presented. The issue of equivalence of orderings is also discussed.
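To make the n(n-1)/2 count concrete, here is a minimal sketch of the textbook round-robin (circle-method) ordering. It is not either of the two orderings described in the paper and it does no norm sorting. It only shows how n-1 parallel steps of n/2 disjoint pairs cover every index pair exactly once. The function name is hypothetical.

```python
def round_robin_pairs(n):
    """Return n-1 steps, each a list of n/2 disjoint index pairs (n even)."""
    if n % 2:
        raise ValueError("n must be even")
    idx = list(range(n))
    steps = []
    for _ in range(n - 1):
        # Pair the k-th index from the front with the k-th from the back.
        steps.append(
            [tuple(sorted((idx[k], idx[n - 1 - k]))) for k in range(n // 2)]
        )
        # Keep idx[0] fixed and rotate the remaining indices by one place.
        idx = [idx[0], idx[-1]] + idx[1:-1]
    return steps
```

Each step can be executed in parallel because its pairs share no index. Taken together, the steps form one sweep.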
Euromicro Workshop on Parallel and Distributed Processing | 1998
Bing Bing Zhou; Xun Qu; Richard P. Brent
We describe a two-level scheduling scheme for mixed parallel and sequential workloads on scalable parallel machines. The design of this scheduling system is based on two principles. First, parallel programs should be scheduled in a coordinated manner so that they do not severely interfere with each other and the performance of parallel computing becomes predictable. Second, parallel programs may time-share resources with sequential programs so that the efficiency of processor utilisation can be greatly enhanced and good response to interactive clients can be maintained. We also discuss the organisation of a registration office through which the two-level scheduling is realised.
Job Scheduling Strategies for Parallel Processing | 1999
Bing Bing Zhou; Richard P. Brent; Chris Johnson; David Walsh
This paper presents some ideas for efficiently allocating resources to enhance the performance of gang scheduling. We first introduce a job re-packing scheme. In this scheme we try to rearrange the order of job execution on their originally allocated processors in a scheduling round to combine small fragments of available processors from different time slots together to form a larger and more useful one in a single time slot. We then describe an efficient resource allocation scheme based on job re-packing. Using this allocation scheme we are able to decrease the cost for detecting available resources when allocating processors and time to each given job, to reduce the average number of time slots per scheduling round and also to balance the workload across the processors.
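The re-packing idea can be sketched with a simplified greedy pass. This is an illustration under stated assumptions, not the allocation scheme from the paper. A scheduling round is modelled as a list of time slots, each slot maps a job name to the set of processors it occupies, and a job may move to an earlier slot only if its own processors are idle there. Processors never change, so no migration is involved. The function name and data layout are hypothetical.

```python
def repack(slots):
    """Greedily move jobs to earlier time slots on their original processors.

    slots: list of dicts mapping job name -> frozenset of processor ids.
    Returns a new list of slots with emptied slots removed.
    """
    slots = [dict(s) for s in slots]  # do not mutate the caller's data
    for t in range(1, len(slots)):
        for job, procs in list(slots[t].items()):
            for earlier in slots[:t]:
                busy = set().union(*earlier.values())
                if busy.isdisjoint(procs):
                    # The job's processors are idle in the earlier slot.
                    earlier[job] = procs
                    del slots[t][job]
                    break
    return [s for s in slots if s]
```

For example, with job A on processors {0, 1} in slot 0, B on {1, 2} in slot 1, and C on {3} in slot 2, the pass moves C into slot 0 and the round shrinks from three slots to two.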
ICWC 99. IEEE Computer Society International Workshop on Cluster Computing | 1999
Bing Bing Zhou; Paul Mackerras; Chris Johnson; David Walsh; Richard P. Brent
Gang scheduling is currently the most popular scheduling scheme for parallel processing in a time-shared environment. One major drawback of gang scheduling is the problem of fragmentation. The conventional method to alleviate this problem is to allow jobs to run in multiple time slots. However, our experimental results show that applying this method alone cannot solve the problem of fragmentation; on the contrary, it may eventually degrade the efficiency of system resource utilisation. In this paper we introduce an efficient resource allocation scheme which effectively incorporates the ideas of re-packing jobs, running jobs in multiple slots and minimising time slots into the buddy allocation system to significantly improve the system and job performance. Because there is no process migration involved in job re-packing, this scheme is particularly suitable for clustered parallel computing systems.
International Conference on Algorithms and Architectures for Parallel Processing | 1995
Bing Bing Zhou; Richard P. Brent; Margaret Kahn
A method which uses one-sided Jacobi to solve the singular value decomposition and the symmetric eigenvalue problem in parallel is presented. We describe a parallel ring ordering for one-sided Jacobi computation. One distinctive feature of this ordering is that it can sort column norms in each sweep, which is very important to achieve fast convergence. Experimental results on both the Fujitsu AP1000 and the Fujitsu VPP500 are reported.
International Conference on Parallel Processing | 1996
Bing Bing Zhou; Richard P. Brent
In this paper, we introduce a method for designing efficient Jacobi-like algorithms for eigenvalue decomposition of a real normal matrix. The algorithms use only real arithmetic and achieve ultimate quadratic convergence. A theoretical analysis is conducted and some experimental results are presented.
International Conference on Parallel Processing | 1993
Bing Bing Zhou; Richard P. Brent
We describe a new Jacobi ordering for parallel computation of SVD problems. The ordering uses the high bandwidth of a perfect binary fat-tree to minimise global interprocessor communication costs. It can thus be implemented efficiently on fat-tree architectures.
International Conference on Acoustics, Speech, and Signal Processing | 1988
Bing Bing Zhou; Richard P. Brent
The authors introduce a high-throughput systolic implementation of the direct-form second-order recursive filter. The systolic structure has the advantage of regularity over implementations of the block-state-variable form. Since communication is very expensive in VLSI implementations in terms of area, as well as time, this regular structure is considered better for VLSI than those based on block-state-variable filter descriptions.
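For reference, the direct-form second-order recursive filter computes y[n] = b0 x[n] + b1 x[n-1] + b2 x[n-2] - a1 y[n-1] - a2 y[n-2]. The sketch below is a plain sequential evaluation of that difference equation, not the systolic implementation from the paper. The function name, the sign convention for the feedback coefficients, and the zero initial state are assumptions.

```python
def second_order_direct_form(x, b, a):
    """Filter the sequence x with a direct-form second-order recursive filter.

    b = (b0, b1, b2) are feed-forward coefficients; a = (a1, a2) are feedback
    coefficients, subtracted in the recurrence (assumed sign convention).
    """
    b0, b1, b2 = b
    a1, a2 = a
    x1 = x2 = y1 = y2 = 0.0  # delayed inputs and outputs, initially zero
    out = []
    for xn in x:
        yn = b0 * xn + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(yn)
        # Shift the delay lines by one sample.
        x2, x1 = x1, xn
        y2, y1 = y1, yn
    return out
```

Each output depends on the two previous outputs, and this feedback dependency is what makes high-throughput implementations of recursive filters non-trivial.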