Alexey L. Lastovetsky

Publication

Featured research published by Alexey L. Lastovetsky.

Journal of Parallel and Distributed Computing | 2001

Heterogeneous Distribution of Computations Solving Linear Algebra Problems on Networks of Heterogeneous Computers

Alexey Kalinov; Alexey L. Lastovetsky

This paper presents and analyzes two different strategies of heterogeneous distribution of computations solving dense linear algebra problems on heterogeneous networks of computers. The first strategy is based on heterogeneous distribution of processes over processors and homogeneous block cyclic distribution of data over the processes. The second is based on homogeneous distribution of processes over processors and heterogeneous block cyclic distribution of data over the processes. Both strategies were implemented in the mpC language, a dedicated parallel extension of ANSI C for efficient and portable programming of heterogeneous networks of computers. The first strategy was implemented using calls to ScaLAPACK; the second strategy was implemented with calls to LAPACK and BLAS. Cholesky factorization on a heterogeneous network of workstations is used to demonstrate that the heterogeneous distributions have an advantage over the traditional homogeneous distribution.

IEEE International Conference on High Performance Computing Data and Analytics | 1999

Heterogeneous Distribution of Computations While Solving Linear Algebra Problems on Networks of Heterogeneous Computers

Alexey Kalinov; Alexey L. Lastovetsky

The paper presents a heterogeneous distribution of computations for solving dense linear algebra problems on heterogeneous networks of computers. The distribution is based on a heterogeneous block cyclic distribution, which is an extension of the traditional homogeneous block cyclic distribution that takes into account differences in processor performance. The mpC language, specially designed for parallel programming of heterogeneous networks, is briefly introduced. An mpC application carrying out Cholesky factorization on a heterogeneous network of workstations is used to demonstrate that the heterogeneous distribution has an essential advantage over the traditional homogeneous distribution.
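The idea of giving faster processors proportionally more blocks can be illustrated with a small sketch. This is not the distribution from the paper (which is two-dimensional and implemented in mpC); it is a simplified one-dimensional illustration, and the function names `proportional_blocks`, `owner_pattern`, and `owner` are hypothetical.

```python
def proportional_blocks(n_blocks, speeds):
    """Split n_blocks among processors in proportion to their relative speeds.

    Assumption: each speed is a single positive number (a constant model).
    """
    total = sum(speeds)
    ideal = [n_blocks * s / total for s in speeds]
    counts = [int(x) for x in ideal]
    # Hand the leftover blocks to the processors with the largest remainders.
    leftover = n_blocks - sum(counts)
    order = sorted(range(len(speeds)),
                   key=lambda i: ideal[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


def owner_pattern(counts):
    """Build one period of the cyclic pattern: processor p appears counts[p] times."""
    return [p for p, c in enumerate(counts) for _ in range(c)]


def owner(block_index, pattern):
    """Return the processor owning a block when the pattern repeats cyclically."""
    return pattern[block_index % len(pattern)]


if __name__ == "__main__":
    counts = proportional_blocks(5, [1.0, 2.0, 2.0])
    pattern = owner_pattern(counts)
    print(counts, pattern, [owner(b, pattern) for b in range(12)])
```

With equal speeds the pattern reduces to the traditional homogeneous block cyclic distribution, which is why the heterogeneous variant can be seen as its extension.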

Journal of Parallel and Distributed Computing | 2006

HeteroMPI: Towards a message-passing library for heterogeneous networks of computers

Alexey L. Lastovetsky; Ravi Reddy

The paper presents Heterogeneous MPI (HeteroMPI), an extension of MPI for programming high-performance computations on heterogeneous networks of computers. It allows the application programmer to describe the performance model of the implemented algorithm in a generic form. This model allows the specification of all the main features of the underlying parallel algorithm, which have an impact on its execution performance. These features include the total number of parallel processes, the total volume of computations to be performed by each process, the total volume of data to be transferred between each pair of the processes, and how exactly the processes interact during the execution of the algorithm. Given a description of the performance model, HeteroMPI tries to create a group of processes that executes the algorithm faster than any other group. The principal extensions to MPI are presented. We demonstrate the features of the library by performing experiments with parallel simulation of the interaction of electric and magnetic fields and parallel matrix multiplication.

IEEE International Conference on High Performance Computing Data and Analytics | 2007

Data Partitioning with a Functional Performance Model of Heterogeneous Processors

Alexey L. Lastovetsky; Ravi Reddy

In this paper, we address the problem of optimal distribution of computational tasks on a network of heterogeneous computers when one or more tasks do not fit into the main memory of the processors and when relative speeds vary with the problem size. We propose a functional performance model of heterogeneous processors that integrates many essential features of a network of heterogeneous computers having a major impact on its performance such as the processor heterogeneity, the heterogeneity of memory structure, and the effects of paging. Under this model, the speed of each processor is represented by a continuous function of the size of the problem whereas traditional models use single numbers to represent the speeds of the processors. We formulate a problem of partitioning of an n-element set over p heterogeneous processors using this model and design an algorithm of the complexity O(p × log₂ n) solving the problem.
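To make the partitioning problem concrete, here is a minimal sketch in which each processor's speed is a function of problem size and the set is split so that all processors finish at roughly the same time. It uses a plain bisection on the common execution time, which is a swapped-in technique and not the algorithm from the paper, and it does not claim the stated complexity. It assumes that the execution time x / speed(x) is non-decreasing in x; the names `max_load` and `partition` are hypothetical.

```python
def max_load(speed, t, n):
    """Largest integer x in [0, n] with x / speed(x) <= t.

    Assumption: x / speed(x) is non-decreasing in x, so binary search applies.
    """
    lo, hi = 0, n
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid / speed(mid) <= t:
            lo = mid
        else:
            hi = mid - 1
    return lo


def partition(n, speeds, iterations=100):
    """Split n >= 1 elements over processors whose speeds depend on problem size."""
    # One processor alone can finish everything within hi_t, so hi_t is feasible.
    lo_t, hi_t = 0.0, min(n / s(n) for s in speeds)
    for _ in range(iterations):
        mid = (lo_t + hi_t) / 2
        if sum(max_load(s, mid, n) for s in speeds) >= n:
            hi_t = mid  # all n elements fit within time mid
        else:
            lo_t = mid
    loads = [max_load(s, hi_t, n) for s in speeds]
    # Trim any surplus from the processor that currently finishes last.
    while sum(loads) > n:
        busy = [i for i, x in enumerate(loads) if x > 0]
        worst = max(busy, key=lambda i: loads[i] / speeds[i](loads[i]))
        loads[worst] -= 1
    return loads


if __name__ == "__main__":
    # Hypothetical speed functions: the second processor slows down sharply
    # once the problem no longer fits in memory (a crude stand-in for paging).
    fast = lambda x: 300.0
    paging = lambda x: 100.0 if x <= 1000 else 10.0
    print(partition(5000, [fast, paging]))
```

With constant speed functions the result matches the traditional single-number model; the size-dependent case shows how a processor that starts paging receives a smaller share than its peak speed would suggest.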

Parallel Computing | 2002

Adaptive parallel computing on heterogeneous networks with mpC

Alexey L. Lastovetsky

The paper presents a new advanced version of the mpC parallel language. The language was designed specially for programming high-performance parallel computations on heterogeneous networks of computers. The advanced version allows the programmer to define at runtime all the main features of the underlying parallel algorithm that have an impact on the application execution performance, namely, the total number of participating parallel processes, the total volume of computations to be performed by each of the processes, the total volume of data to be transferred between each pair of the processes, and how exactly the processes interact during the execution of the algorithm. Such an abstraction of a parallel algorithm is called a network type in mpC. Given a network type, the programmer can define a network object of this type and describe in detail all the computations and communications to be performed on the network object. The mpC programming system uses the information extracted from the network-type definition, together with information about the actual performance of the executing network, to map the processes of the parallel program to this network in a way that improves its execution time. In addition, the programmer can use a special operator, timeof, which predicts the total time of the algorithm execution on the underlying hardware without actually executing it. That feature allows the programmer to write a parallel program that can follow different parallel algorithms to solve the same problem, making the choice at runtime depending on the particular executing network and its actual performance. The paper describes both the language model of a parallel algorithm and the model of the executing parallel environment used by the mpC programming system. It also discusses principles of the implementation of mapping mpC network objects to the computing network.

Parallel Computing | 2004

On performance analysis of heterogeneous parallel algorithms

Alexey L. Lastovetsky; Ravi Reddy

The paper presents an approach to performance analysis of heterogeneous parallel algorithms. As a typical heterogeneous parallel algorithm is just a modification of some homogeneous one, the idea is to compare the heterogeneous algorithm with its homogeneous prototype, and to assess the heterogeneous modification rather than analyse the algorithm as an isolated entity. A criterion of optimality of heterogeneous parallel algorithms is suggested. A parallel algorithm of matrix multiplication on heterogeneous clusters is used to illustrate the proposed approach.

International Parallel and Distributed Processing Symposium | 2004

Data partitioning with a realistic performance model of networks of heterogeneous computers

Alexey L. Lastovetsky; Ravi Reddy

The article presents a performance model of a network of heterogeneous computers that takes account of the heterogeneity of memory structure and other architectural differences. Under this model, the speed of each processor is represented by a function of the size of the problem whereas standard models use single numbers to represent the speeds of the processors. We prove that this model is more realistic than the standard ones when the network includes computers with significantly different memory structure. We formulate a problem of partitioning of an n-element set over p heterogeneous processors using this advanced performance model and give its efficient solution of the complexity O(p² × log₂ n).

International Conference on Cluster Computing | 2012

Data Partitioning on Heterogeneous Multicore and Multi-GPU Systems Using Functional Performance Models of Data-Parallel Applications

Ziming Zhong; Vladimir Rychkov; Alexey L. Lastovetsky

Transition to hybrid CPU/GPU platforms in high performance computing is challenging in the aspect of efficient utilisation of the heterogeneous hardware and existing optimised software. During recent years, scientific software has been ported to multicore and GPU architectures and now should be reused on hybrid platforms. In this paper, we model the performance of such scientific applications in order to execute them efficiently on hybrid platforms. We consider a hybrid platform as a heterogeneous distributed-memory system and apply the approach of functional performance models, which was originally designed for uniprocessor machines. The functional performance model (FPM) represents the processor speed by a function of problem size and integrates many important features characterising the performance of the architecture and the application. We demonstrate that FPMs facilitate performance evaluation of scientific applications on hybrid platforms. FPM-based data partitioning algorithms have been proved to be accurate for load balancing on heterogeneous networks of uniprocessor computers. We apply FPM-based data partitioning to balance the load between cores and GPUs in the hybrid architecture. In our experiments with parallel matrix multiplication, we couple the existing software optimised for multicores and GPUs and achieve high performance of the whole hybrid system.

International Conference on Parallel and Distributed Systems | 2006

An accurate communication model of a heterogeneous cluster based on a switch-enabled Ethernet network

Alexey L. Lastovetsky; Is-Haka Mkwawa; Maureen O'Flynn

The paper presents a communication model of a set of heterogeneous processors interconnected via a switch-enabled Ethernet network. The goal of the model is to accurately predict the contribution of communication operations to the total execution time of parallel applications running on the platform. The presented model takes into account the impact of the heterogeneity of processors on the performance of communication operations. In this paper, we give analytical models for a single point-to-point communication, multiple independent point-to-point communications, multiple one-to-many point-to-point communications, and for a broadcast. Experimental results are presented demonstrating the accuracy of the analytical models.

Archive | 2009

High Performance Heterogeneous Computing

Jack J. Dongarra; Alexey L. Lastovetsky

An analytical overview of the state of the art, open problems, and future trends in heterogeneous parallel and distributed computing. This book provides an overview of the ongoing academic research, development, and uses of heterogeneous parallel and distributed computing in the context of scientific computing. Presenting the state of the art in this challenging and rapidly evolving area, the book is organized in five distinct parts: (1) Heterogeneous Platforms: Taxonomy, Typical Uses, and Programming Issues; (2) Performance Models of Heterogeneous Platforms and Design of Heterogeneous Algorithms; (3) Performance: Implementation and Software; (4) Applications; (5) Future Trends. High Performance Heterogeneous Computing is a valuable reference for researchers and practitioners in the area of high performance heterogeneous computing. It also serves as an excellent supplemental text for graduate and postgraduate courses in related areas.
