Sunil R. Tiyyagura
University of Stuttgart
Publications
Featured research published by Sunil R. Tiyyagura.
Journal of Computer and System Sciences | 2008
Subhash Saini; Robert Ciotti; Brian T. N. Gunney; Thomas E. Spelce; Alice Koniges; Don Dossa; Panagiotis Adamidis; Rolf Rabenseifner; Sunil R. Tiyyagura; Matthias S. Mueller
The HPC Challenge (HPCC) benchmark suite and the Intel MPI Benchmark (IMB) are used to compare and evaluate the combined performance of the processor, memory subsystem, and interconnect fabric of five leading supercomputers: SGI Altix BX2, Cray X1, Cray Opteron Cluster, Dell Xeon Cluster, and NEC SX-8. These five systems use five different networks (SGI NUMALINK4, Cray network, Myrinet, InfiniBand, and NEC IXS). The complete set of HPCC benchmarks is run on each of these systems. Additionally, we present Intel MPI Benchmarks results to study the performance of 11 MPI communication functions on these systems.
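For illustration, the sketch below shows the kind of point-to-point measurement the IMB PingPong test performs, written as a minimal, self-contained C/MPI program; the message size, repetition count, and reporting are assumptions here, and this is not the IMB or HPCC source code.

/* Minimal ping-pong sketch in the spirit of the IMB PingPong test
 * (illustrative only).  Rank 0 and rank 1 bounce a fixed-size message
 * back and forth; half the round-trip time gives the latency and the
 * transferred bytes per round trip give the bandwidth. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int reps   = 1000;
    const int nbytes = 1 << 20;              /* 1 MiB message (assumed) */
    char *buf = malloc((size_t)nbytes);
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size < 2) MPI_Abort(MPI_COMM_WORLD, 1);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < reps; ++i) {
        if (rank == 0) {
            MPI_Send(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0) {
        double rtt = (t1 - t0) / reps;                /* mean round-trip time */
        printf("latency  : %.2f us\n", 0.5 * rtt * 1e6);
        printf("bandwidth: %.2f MB/s\n", 2.0 * nbytes / rtt / 1e6);
    }
    free(buf);
    MPI_Finalize();
    return 0;
}

The real suites sweep over many message sizes and process placements; the single fixed size above is only meant to show the measurement loop.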
IEEE International Conference on High Performance Computing Data and Analytics | 2006
Malte Neumann; Sunil R. Tiyyagura; Wolfgang A. Wall; Ekkehard Ramm
For the numerical simulation of large-scale CFD and fluid-structure interaction (FSI) problems, efficiency and robustness of the algorithms are two key requirements. In this paper we describe a very simple concept that significantly increases the performance of the element calculations on an arbitrary unstructured finite element mesh on vector computers. By grouping computationally similar elements together, the length of the innermost loops, and hence the vector length, can be controlled. In addition, the effect of different programming languages and different array management techniques is investigated. A numerical CFD simulation shows the improvement in the overall time-to-solution on vector computers as well as on other architectures. For FSI simulations the robustness of the algorithm is equally important. For the transient interaction of incompressible viscous flows and nonlinear flexible structures, commonly used sequential staggered coupling schemes exhibit weak instabilities. The best remedy is to invoke subiterations that guarantee kinematic and dynamic continuity across the fluid-structure interface. To ensure the efficiency of these iterative substructuring schemes, two robust and problem-independent acceleration methods are proposed.
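A minimal sketch of the grouping idea follows, under assumed data structures (ElemGroup, element_kernel, and the node-major layout are hypothetical, not the authors' code): elements with identical topology are stored contiguously, so the innermost loop runs over a whole group and its trip count, i.e. the vector length, is controlled by the group size.

/* Sketch of element grouping for vector machines (assumed layout). */
#include <stddef.h>

#define NODES_PER_ELEM 8            /* e.g. hexahedral elements (assumption) */

typedef struct {
    size_t  nelem;                  /* number of elements in this group          */
    double *coords;                 /* [NODES_PER_ELEM][nelem][3] node-major     */
    double *result;                 /* [nelem] per-element quantity              */
} ElemGroup;

/* Toy element kernel: the physics is a placeholder, the point is the
 * loop structure.  The e-loop is long and free of data dependencies,
 * so a vectorizing compiler can use the full hardware vector length. */
void element_kernel(ElemGroup *g)
{
    for (int n = 0; n < NODES_PER_ELEM; ++n) {     /* short outer loop        */
        for (size_t e = 0; e < g->nelem; ++e) {    /* long, vectorizable loop */
            const double *x = &g->coords[(n * g->nelem + e) * 3];
            g->result[e] += x[0] + x[1] + x[2];    /* placeholder work        */
        }
    }
}

Without grouping, the innermost loop would typically run over the handful of nodes of a single element, which is far too short to fill a vector pipeline.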
International Parallel and Distributed Processing Symposium | 2006
Subhash Saini; Robert Ciotti; Brian T. N. Gunney; Thomas E. Spelce; Alice Koniges; Don Dossa; Panagiotis Adamidis; Rolf Rabenseifner; Sunil R. Tiyyagura; Matthias S. Mueller; Rod Fatoohi
The HPC Challenge (HPCC) benchmark suite and the Intel MPI Benchmark (IMB) are used to compare and evaluate the combined performance of the processor, memory subsystem, and interconnect fabric of five leading supercomputers: SGI Altix BX2, Cray X1, Cray Opteron Cluster, Dell Xeon Cluster, and NEC SX-8. These five systems use five different networks (SGI NUMALINK4, Cray network, Myrinet, InfiniBand, and NEC IXS). The complete set of HPCC benchmarks is run on each of these systems. Additionally, we present Intel MPI Benchmarks (IMB) results to study the performance of 11 MPI communication functions on these systems.
Lecture Notes in Computer Science | 2005
Rolf Rabenseifner; Sunil R. Tiyyagura; Matthias S. Müller
The HPC Challenge benchmark suite (HPCC) was released to analyze the performance of high-performance computing architectures using several kernels that measure different memory and hardware access patterns, comprising latency-based measurements, memory streaming, inter-process communication, and floating-point computation. HPCC defines a set of benchmarks augmenting the High Performance Linpack used in the Top500 list. This paper describes the inter-process communication benchmarks of this suite. Based on the effective bandwidth benchmark, a special parallel random and natural ring communication benchmark has been developed for HPCC. Ping-pong benchmarks on a set of process pairs can be used for further characterization of a system. This paper analyzes first results achieved with HPCC. The focus of this paper is on the balance between computational speed, memory bandwidth, and inter-node communication.
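As an illustration of the natural ring measurement, the following minimal C/MPI sketch (assumed message size and timing loop; not the HPCC or b_eff source) has every process exchange a message with its left and right neighbours in MPI_COMM_WORLD rank order and reports a per-process bandwidth.

/* Minimal "natural ring" bandwidth sketch (illustrative only). */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int reps   = 100;
    const int nbytes = 1 << 20;                   /* message size (assumed) */
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    char *sbuf = malloc((size_t)nbytes);
    char *rbuf = malloc((size_t)nbytes);
    int right = (rank + 1) % size;
    int left  = (rank - 1 + size) % size;

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < reps; ++i)
        MPI_Sendrecv(sbuf, nbytes, MPI_CHAR, right, 0,
                     rbuf, nbytes, MPI_CHAR, left,  0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    double t1 = MPI_Wtime();

    /* Each step sends one message and receives one message per process. */
    double bw = 2.0 * nbytes * reps / (t1 - t0) / 1e6;
    if (rank == 0)
        printf("natural ring bandwidth per process: %.1f MB/s\n", bw);

    free(sbuf); free(rbuf);
    MPI_Finalize();
    return 0;
}

The random-ring variant uses the same exchange pattern but with a randomly permuted rank order, so messages are more likely to cross node boundaries and stress the inter-node network.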
International Conference on Computational Science | 2006
Sunil R. Tiyyagura; Uwe Küster; Stefan Borowski
Many applications based on finite element and finite difference methods include the solution of large sparse linear systems using preconditioned iterative methods. Matrix-vector multiplication is one of the key operations that has a significant impact on the performance of any iterative solver. In this paper, recent developments in sparse storage formats on vector machines are reviewed. Then, several improvements to memory access in the sparse matrix-vector product are suggested. In particular, algorithms based on dense blocks are discussed and the reasons for their superior performance are explained. Finally, the performance gain from the presented modifications is demonstrated.
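A minimal sketch of a sparse matrix-vector product on dense blocks (block CSR with fixed 2x2 blocks; the BcsrMatrix layout and names are assumptions, not the paper's implementation) illustrates why such formats help: the work inside each block uses unit-stride, dependence-free loops instead of purely indirect addressing.

/* Block CSR (BCSR) sparse matrix-vector product sketch, y = A*x. */
#include <stddef.h>

#define B 2                           /* block size (assumption) */

typedef struct {
    size_t        nbrows;             /* number of block rows                     */
    const size_t *brow_ptr;           /* [nbrows+1] start of each block row       */
    const size_t *bcol_idx;           /* [nblocks]  block-column indices          */
    const double *val;                /* [nblocks*B*B] dense blocks, row-major    */
} BcsrMatrix;

void bcsr_spmv(const BcsrMatrix *A, const double *x, double *y)
{
    for (size_t i = 0; i < A->nbrows; ++i) {
        double acc[B] = {0.0, 0.0};
        for (size_t k = A->brow_ptr[i]; k < A->brow_ptr[i + 1]; ++k) {
            const double *blk = &A->val[k * B * B];       /* dense B x B block   */
            const double *xj  = &x[A->bcol_idx[k] * B];   /* one gather per block */
            for (int r = 0; r < B; ++r)
                for (int c = 0; c < B; ++c)               /* unit-stride accesses */
                    acc[r] += blk[r * B + c] * xj[c];
        }
        for (int r = 0; r < B; ++r)
            y[i * B + r] = acc[r];
    }
}

Compared with scalar CSR, only one column index is loaded per B x B block, so indirect addressing and index storage are reduced while the arithmetic per memory access increases.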
IEEE International Conference on High Performance Computing Data and Analytics | 2006
Malte Neumann; U. Küttler; Sunil R. Tiyyagura; Wolfgang A. Wall; Ekkehard Ramm
In this paper we address various efficiency aspects of finite element (FE) simulations on vector computers. Especially for the numerical simulation of large-scale Computational Fluid Dynamics (CFD) and Fluid-Structure Interaction (FSI) problems, efficiency and robustness of the algorithms are two key requirements.
IEEE International Conference on High Performance Computing Data and Analytics | 2008
Sunil R. Tiyyagura; Panagiotis Adamidis; Rolf Rabenseifner; Peter Lammers; Stefan Borowski; F. Lippold; F. Svensson; Olaf Marxen; Stefan Haberhauer; Ari P. Seitsonen; J. Furthmüller; Katharina Benkert; Martin Galle; Thomas Bönisch; Uwe Küster; Michael M. Resch
This paper provides a comprehensive performance evaluation of the NEC SX-8 system at the High Performance Computing Center Stuttgart, which has been in operation since July 2005. It describes the installed hardware together with its performance for several synthetic benchmarks and five real-world applications. All of the applications achieved sustained Tflop/s performance. Additionally, the measurements presented show the ability of the system to solve not only large problems with very high performance, but also medium-sized problems with high efficiency using a large number of processors.
Archive | 2008
Sunil R. Tiyyagura; Malte von Scheven
This paper addresses the algorithmic and implementation issues associated with fluid-structure interaction simulations, especially on vector architectures. First, the fluid-structure coupling algorithm is presented; then a newly developed parallel sparse linear solver is introduced and its performance is discussed.
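As an illustration of a subiteration-based coupling loop, the sketch below uses Aitken relaxation of the interface displacement, a common problem-independent acceleration; this is an assumption about the kind of scheme involved, not the paper's algorithm, and solve_fluid/solve_structure are hypothetical stand-ins for the actual field solvers.

/* One FSI time step with subiterations and Aitken relaxation (sketch). */
#include <math.h>
#include <stdlib.h>

/* Hypothetical field solvers operating on the interface arrays: */
void solve_fluid(const double *d_interface, double *traction, size_t n);
void solve_structure(const double *traction, double *d_new, size_t n);

void fsi_time_step(double *d, double *traction, size_t n,
                   double tol, int max_iter)
{
    double *d_tilde = malloc(n * sizeof *d_tilde);
    double *r_old   = calloc(n, sizeof *r_old);     /* previous residual */
    double omega = 0.5;                             /* initial relaxation factor */

    for (int it = 0; it < max_iter; ++it) {
        solve_fluid(d, traction, n);                /* F: interface motion -> traction   */
        solve_structure(traction, d_tilde, n);      /* S: traction -> predicted motion   */

        /* Interface residual r = d_tilde - d and Aitken update of omega. */
        double num = 0.0, den = 0.0, rnorm = 0.0;
        for (size_t i = 0; i < n; ++i) {
            double r  = d_tilde[i] - d[i];
            double dr = r - r_old[i];
            num   += r_old[i] * dr;
            den   += dr * dr;
            rnorm += r * r;
            r_old[i] = r;
        }
        if (it > 0 && den > 0.0)
            omega = -omega * num / den;

        for (size_t i = 0; i < n; ++i)              /* relaxed interface update */
            d[i] += omega * r_old[i];

        if (sqrt(rnorm) < tol)                      /* kinematic continuity reached */
            break;
    }
    free(d_tilde);
    free(r_old);
}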
International Conference on Conceptual Structures | 2007
Sunil R. Tiyyagura; Uwe Küster
This paper addresses the efficiency issues in solving large sparse linear systems in parallel on scalar and vector architectures. Linear systems arise in numerous applications that need to solve PDEs on complex domains. The major time-consuming part of a large-scale implicit Finite Element (FE) or Finite Volume (FV) simulation is solving the assembled global system of equations. First, the performance of widely used public-domain solvers, which target performance on scalar machines, is analyzed on a typical vector machine. Then, a newly developed parallel sparse iterative solver (Block-based Linear Iterative Solver, BLIS) targeting performance on both scalar and vector systems is introduced, and the time needed for solving linear systems is compared on different architectures. Finally, the reasons behind the scaling behaviour of parallel iterative solvers are analysed.
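To make the cost structure of such solvers concrete, the following skeleton of a Jacobi-preconditioned conjugate gradient iteration (an illustrative sketch, not the BLIS implementation; cg_solve and spmv_fn are names assumed here) shows that almost all work lies in the sparse matrix-vector product plus a few dot products and vector updates; on distributed machines, the global reductions hidden in the dot products are what typically limits scaling.

/* Jacobi-preconditioned CG skeleton (sketch); returns iterations used. */
#include <math.h>
#include <stddef.h>
#include <stdlib.h>

typedef void (*spmv_fn)(const void *A, const double *x, double *y, size_t n);

static double dot(const double *a, const double *b, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i) s += a[i] * b[i];
    return s;                 /* in parallel: followed by a global reduction */
}

int cg_solve(const void *A, spmv_fn spmv, const double *diag,
             const double *b, double *x, size_t n, double tol, int max_iter)
{
    double *r = malloc(n * sizeof *r), *z = malloc(n * sizeof *z);
    double *p = malloc(n * sizeof *p), *q = malloc(n * sizeof *q);

    spmv(A, x, q, n);                                    /* r = b - A*x */
    for (size_t i = 0; i < n; ++i) r[i] = b[i] - q[i];
    for (size_t i = 0; i < n; ++i) p[i] = z[i] = r[i] / diag[i];
    double rho = dot(r, z, n);

    int it;
    for (it = 0; it < max_iter && sqrt(dot(r, r, n)) > tol; ++it) {
        spmv(A, p, q, n);                                /* dominant kernel */
        double alpha = rho / dot(p, q, n);
        for (size_t i = 0; i < n; ++i) { x[i] += alpha * p[i]; r[i] -= alpha * q[i]; }
        for (size_t i = 0; i < n; ++i) z[i] = r[i] / diag[i];   /* Jacobi precond. */
        double rho_new = dot(r, z, n);
        for (size_t i = 0; i < n; ++i) p[i] = z[i] + (rho_new / rho) * p[i];
        rho = rho_new;
    }
    free(r); free(z); free(p); free(q);
    return it;
}

A block-based matrix format plugs in through the spmv callback, which is where the vector-architecture optimizations discussed above take effect.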
Archive | 2006
Subhash Saini; Rolf Rabenseifner; Brian T. N. Gunney; Thomas E. Spelce; Alice Koniges; Don Dossa; Panagiotis Adamidis; Robert Ciotti; Sunil R. Tiyyagura; Matthias S. Müller; Rod Fatoohi