Alexandros V. Gerbessiotis

Publication

Featured research published by Alexandros V. Gerbessiotis.

Journal of Parallel and Distributed Computing | 1994

Direct bulk-synchronous parallel algorithms

Alexandros V. Gerbessiotis; Leslie G. Valiant

We describe a methodology for constructing parallel algorithms that are transportable among parallel computers having different numbers of processors, different bandwidths of interprocessor communication and different periodicity of global synchronisation. We do this for the bulk-synchronous parallel (BSP) model, which abstracts the characteristics of a parallel machine into three numerical parameters p, g, and L, corresponding to processors, bandwidth, and periodicity respectively. The model differentiates memory that is local to a processor from that which is not, but, for the sake of universality, does not differentiate network proximity. The advantages of this model in supporting shared memory or PRAM style programming have been treated elsewhere. Here we emphasise the viability of an alternative direct style of programming where, for the sake of efficiency, the programmer retains control of memory allocation. We show that optimality to within a multiplicative factor close to one can be achieved for the problems of Gauss-Jordan elimination and sorting, by transportable algorithms that can be applied for a wide range of values of the parameters p, g, and L. We also give some simulation results for PRAMs on the BSP to identify the level of slack at which corresponding efficiencies can be approached by shared memory simulations, provided the bandwidth parameter g is good enough.
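As an illustration of how the three parameters p, g, and L enter a cost estimate, the following is a minimal sketch of a BSP cost calculator. It assumes the additive convention often used in BSP texts (local work plus g times the largest message volume h, plus L per superstep); the paper's own charging rule may differ, and the names `BSPMachine`, `superstep_cost`, and `program_cost` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BSPMachine:
    """Hypothetical container for the three BSP parameters."""
    p: int    # number of processors
    g: float  # bandwidth parameter: time units per word communicated
    L: float  # periodicity: time units charged per global synchronisation


def superstep_cost(machine, work, h):
    """Cost of one superstep under the additive convention (an assumption).

    work: maximum local computation on any processor, in time units
    h:    maximum number of words sent or received by any processor
    """
    return work + machine.g * h + machine.L


def program_cost(machine, supersteps):
    """Total cost of a sequence of (work, h) supersteps."""
    return sum(superstep_cost(machine, w, h) for w, h in supersteps)


if __name__ == "__main__":
    machine = BSPMachine(p=4, g=2.0, L=100.0)
    print(program_cost(machine, [(1000, 50), (500, 0)]))
```

Because the algorithm description only refers to p, g, and L, the same estimate can be re-evaluated for a different machine by changing the three numbers, which is the sense in which such algorithms are transportable.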

ACM Symposium on Parallel Algorithms and Architectures | 1996

Deterministic sorting and randomized median finding on the BSP model

Alexandros V. Gerbessiotis; Constantinos J. Siniolakis

We present new BSP algorithms for deterministic sorting and randomized median finding. We sort n general keys by using a partitioning scheme that achieves the requirements of efficiency (one-optimality) and insensitivity against data skew (the accuracy of the splitting keys depends solely on the step distance, which can be adapted to meet the worst-case requirements of our application). Although we employ sampling in order to realize efficiency, we can give a precise worst-case estimation of the maximum imbalance which might occur. We also investigate optimal randomized BSP algorithms for the problem of finding the median of n elements that require, with high probability, 3n/(2p) + o(n/p) comparisons, for a wide range of values of n and p. Experimental results for the two algorithms are also presented.

Parallel Computing | 2004

Architecture independent parallel binomial tree option price valuations

Alexandros V. Gerbessiotis

We introduce an architecture independent approach in describing how computations such as those involved in American or European-style option price valuations can be performed in parallel under the binomial tree model. We describe a latency-tolerant parallel algorithm for the multiplicative binomial tree option pricing model. The algorithm is described and analyzed in an architecture independent setting and performance characteristics are expressed in terms of problem size n, the time horizon, and the parameters p, L and g of the bulk-synchronous parallel model of computation. The algorithm achieves optimal theoretical speedup and is within a 1 + o(1) multiplicative factor of the corresponding sequential method. An experimental study of an implementation of the algorithm on a cluster of PC workstations is also undertaken to examine the latency-tolerance of our approach. With only a recompilation of the same source code, the implementation works under two diverse parallel programming libraries, namely MPI and BSPlib, making it independent not only of the architecture but also of the communication library.
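For orientation, the sequential computation that such a parallel algorithm is measured against can be sketched as a backward sweep over a multiplicative binomial tree. The sketch below is not the paper's parallel algorithm; it assumes the widely used Cox-Ross-Rubinstein choice of up and down factors, and the function name and parameters are hypothetical.

```python
import math


def binomial_option_price(s0, strike, rate, sigma, maturity, n,
                          kind="call", american=False):
    """Sequential binomial-tree valuation (Cox-Ross-Rubinstein factors assumed)."""
    if kind not in ("call", "put"):
        raise ValueError("kind must be 'call' or 'put'")
    dt = maturity / n
    u = math.exp(sigma * math.sqrt(dt))      # multiplicative up move
    d = 1.0 / u                              # multiplicative down move
    disc = math.exp(-rate * dt)              # one-step discount factor
    q = (math.exp(rate * dt) - d) / (u - d)  # risk-neutral up probability
    sign = 1.0 if kind == "call" else -1.0

    # Payoffs at the n + 1 leaves; leaf j has seen j up moves.
    values = [max(sign * (s0 * u ** j * d ** (n - j) - strike), 0.0)
              for j in range(n + 1)]

    # Backward induction: level `step` has step + 1 nodes.
    for step in range(n - 1, -1, -1):
        for j in range(step + 1):
            value = disc * (q * values[j + 1] + (1.0 - q) * values[j])
            if american:
                # Early exercise: compare with the immediate payoff.
                spot = s0 * u ** j * d ** (step - j)
                value = max(value, sign * (spot - strike))
            values[j] = value
    return values[0]


if __name__ == "__main__":
    print(binomial_option_price(100.0, 100.0, 0.05, 0.2, 1.0, 500))
```

Each node depends only on its two children in the next level, so the sweep performs on the order of n^2 node updates for an n-step tree; this is the work that a parallel version has to divide among p processors.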

IEEE Transactions on Computers | 2009

A Markovian Dependability Model with Cascading Failures

Srinivasan M. Iyer; Marvin K. Nakayama; Alexandros V. Gerbessiotis

We develop a continuous-time Markov chain model of a dependability system operating in a randomly changing environment and subject to probabilistic cascading failures. A cascading failure can be thought of as a rooted tree. The root is the component whose failure triggers the cascade, its children are those components that the root's failure immediately caused, the next generation are those components whose failures were immediately caused by the failures of the root's children, and so on. The amount of cascading is unlimited. We consider probabilistic cascading in the sense that the failure of a component of type i causes a component of type j to fail simultaneously with a given probability, with all failures in a cascade being mutually independent. Computing the infinitesimal generator matrix of the Markov chain poses significant challenges because of the exponential growth in the number of trees one needs to consider as the number of components failing in the cascade increases. We provide a recursive algorithm generating all possible trees corresponding to a given transition, along with an experimental study of an implementation of the algorithm on two examples. The numerical results highlight the effects of cascading on the dependability of the models.

Scandinavian Workshop on Algorithm Theory | 1992

Direct Bulk-Synchronous Parallel Algorithms

Alexandros V. Gerbessiotis; Leslie G. Valiant

Integration | 2007

Coprocessor design to support MPI primitives in configurable multiprocessors

Sotirios G. Ziavras; Alexandros V. Gerbessiotis; Rohan Bafna

The Message Passing Interface (MPI) is a widely used standard for interprocessor communications in parallel computers and PC clusters. Its functions are normally implemented in software due to their enormity and complexity, thus resulting in large communication latencies. Limited hardware support for MPI is sometimes available in expensive systems. Reconfigurable computing has recently reached rewarding levels that enable the embedding of programmable parallel systems of respectable size inside one or more Field-Programmable Gate Arrays (FPGAs). Nevertheless, specialized components must be built to support interprocessor communications in these FPGA-based designs, and the resulting code may be difficult to port to other reconfigurable platforms. In addition, performance comparison with conventional parallel computers and PC clusters is very cumbersome or impossible since the latter often employ MPI or similar communication libraries. The introduction of a hardware design that directly implements MPI primitives in configurable multiprocessor computing creates a framework for efficient parallel code development involving data exchanges independently of the underlying hardware implementation. This process also supports the portability of MPI-based code developed for more conventional platforms. This paper takes advantage of the effectiveness and efficiency of one-sided Remote Memory Access (RMA) communications, and presents the design and evaluation of a coprocessor that implements a set of MPI primitives for RMA. These primitives form a universal and orthogonal set that can be used to implement any other MPI function. To evaluate the coprocessor, a router of low latency was designed as well to enable the direct interconnection of several coprocessors in cluster-on-a-chip systems. Experimental results justify the implementation of the MPI primitives in hardware to support parallel programming in reconfigurable computing. Under continuous traffic, results for a Xilinx XC2V6000 FPGA show that the average transmission time per 32-bit word is about 1.35 clock cycles. Although other computing platforms, such as PC clusters, could benefit as well from our design methodology, our focus is exclusively reconfigurable multiprocessing, which has recently received tremendous attention in academia and industry.

Parallel Processing Letters | 1999

Efficient deterministic sorting on the BSP model

Alexandros V. Gerbessiotis; Constantinos J. Siniolakis

We present a new algorithm for deterministic sorting on the Bulk-Synchronous Parallel (BSP) model of computation. We sort n keys using a partitioning scheme that achieves the requirements of efficiency (one-optimality) and insensitivity against initial key distribution. Although we employ sampling to realize efficiency, we give a precise worst-case estimation of the maximum imbalance which might occur. The algorithm is one-optimal for a wide range of the BSP parameters in the sense that its speedup on p processors is asymptotically (1 - o(1))p.
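The general idea of partitioning by deterministic sampling can be sketched with a single-process simulation of sorting by regular sampling. This is a textbook variant named here as a stand-in, not the paper's exact scheme (which controls sample accuracy and proves a worst-case imbalance bound); the function name, the number of samples per block, and the fallback for small inputs are all assumptions.

```python
from bisect import bisect_right


def regular_sample_sort(keys, p):
    """Simulate sorting by regular sampling on p virtual processors."""
    n = len(keys)
    if p <= 1 or n < p * p:
        return sorted(keys)  # too few keys for sampling to be meaningful

    # Step 1: block-distribute the keys; each processor sorts its block.
    chunk = -(-n // p)  # ceiling of n / p
    blocks = [sorted(keys[i * chunk:(i + 1) * chunk]) for i in range(p)]

    # Step 2: each processor contributes up to p evenly spaced sample keys.
    samples = []
    for block in blocks:
        if block:
            step = max(1, len(block) // p)
            samples.extend(block[::step][:p])
    samples.sort()

    # Step 3: pick p - 1 splitters at regular positions in the sorted sample.
    stride = len(samples) // p
    splitters = [samples[(i + 1) * stride] for i in range(p - 1)]

    # Step 4: route every key to the bucket its splitter interval selects
    # (the communication superstep in a real BSP implementation).
    buckets = [[] for _ in range(p)]
    for block in blocks:
        for key in block:
            buckets[bisect_right(splitters, key)].append(key)

    # Step 5: each processor sorts its bucket; a real implementation
    # would merge the sorted runs it received instead of re-sorting.
    return [key for bucket in buckets for key in sorted(bucket)]


if __name__ == "__main__":
    data = [(i * 7919) % 1009 for i in range(1000)]
    print(regular_sample_sort(data, 8) == sorted(data))
```

No random choices are made at any step, so the bucket sizes for a given input are fixed, which is what makes a worst-case imbalance analysis possible for schemes of this kind.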

International Parallel Processing Symposium | 1997

A randomized sorting algorithm on the BSP model

Alexandros V. Gerbessiotis; Constantinos J. Siniolakis

The authors present a new randomized sorting algorithm on the bulk-synchronous parallel (BSP) model. The algorithm improves upon the parallel slack of previous algorithms to achieve optimality. Tighter probabilistic bounds are also established. It uses sample sorting and utilizes recently introduced search algorithms for a class of data structures on the BSP model. Moreover the methods are within a 1+o(1) multiplicative factor of the respective sequential methods in terms of speedup for a wide range of the BSP parameters.

Theoretical Computer Science | 2003

Architecture independent parallel selection with applications to parallel priority queues

Alexandros V. Gerbessiotis; Constantinos J. Siniolakis

We present a randomized selection algorithm whose performance is analyzed in an architecture independent way on the bulk-synchronous parallel (BSP) model of computation along with an application of this algorithm to dynamic data structures, namely parallel priority queues. We show that our algorithms improve upon previous results in both the communication requirements and the amount of parallel slack required to achieve optimal performance. We also establish that optimality to within small multiplicative constant factors can be achieved for a wide range of parallel machines. While these algorithms are fairly simple themselves, descriptions of their performance in terms of the BSP parameters are somewhat involved; the main reward of quantifying these complications is that it allows transportable software to be written for parallel machines that fit the model.

European Conference on Parallel Processing | 1996

Communication Efficient Data Structures on the BSP Model with Applications in Computational Geometry

Alexandros V. Gerbessiotis; Constantinos J. Siniolakis

The implementation of data structures on distributed memory models, like the Bulk-Synchronous Parallel (BSP), rather than shared memory ones, like the PRAM, offers a serious challenge. In this paper we undertake the architecture independent study of the communication and synchronization requirements of searching “ordered h-level graphs”, which include most of the standard data structures. We propose n-way search as a general tool for the design, analysis, and implementation of BSP algorithms. This technique allows elegant high-level design and analysis of algorithms, using data structures similar to those of sequential models. Our methods are within a 1 + o(1) factor of the respective sequential methods. An application to computational geometry is also presented.
