Debashis Basak
Ohio State University
Publication
Featured research published by Debashis Basak.
IEEE Transactions on Parallel and Distributed Systems | 1996
Debashis Basak; Dhabaleswar K. Panda
Clustered or hierarchical interconnections have advantages when designing large scale multiprocessor systems. Earlier studies have either focused on only flat interconnections or proposed hierarchical/clustered interconnections with limited packaging and demanded performance constraints. Large systems require several levels of packaging. Packaging technologies impose various physical constraints on bisection bandwidth and channel width of a system. Pinout technologies and the capacity of packaging modules have been ignored in earlier studies, often leading to configurations that are not design-feasible. Similarly, the impact of processor and interconnect technologies on demanded performance has not been considered. We propose a new supply-demand framework for multiprocessor system design by considering packaging, processor, and interconnect technologies in an integrated manner. The elegance of this framework lies in its parameterised representation of different technologies. For a given set of technological parameters the framework derives the best configuration while considering practical design aspects like maximum board area, maximum available pinout, fixed channel width, and scalability. In order to build a scalable parallel system with a given number of processors, the framework explores the design space of flat k-ary n-cube topologies and their clustered variations (k-ary n-cube cluster-c) to derive design-feasible configurations with best system performance.
winter simulation conference | 1997
Dhabaleswar K. Panda; Debashis Basak; Donglai Dai; Ram Kesavan; Rajeev Sivaram; Mohammad Banikazemi; Vijay Moorthy
Components of modern parallel systems are becoming quite complex, with many features and variations. Integrated modeling of these components (interconnection network, messaging layer, programming model, and computation-communication characteristics of applications) is essential to derive design guidelines for next-generation parallel systems. Most current simulation-based modeling platforms do not support such integrated modeling. This paper presents our effort at The Ohio State University towards integrated modeling of parallel systems. Basic features of our CSIM-based Wormhole-routed Multiprocessor Simulator (WORMulSim) are outlined. A set of techniques used in our simulator to model different network components (such as switches, links, wormhole/cut-through switching techniques, routing protocols, network interfaces), the messaging layer with basic communication primitives, the distributed shared memory programming model, and computation-communication characteristics of applications is presented. Some sample performance measures of our simulator on current-generation workstations are reported to demonstrate the feasibility of integrated modeling with low computational overhead.
IEEE Transactions on Parallel and Distributed Systems | 1998
Debashis Basak; Dhabaleswar K. Panda
This paper identifies performance degradation in wormhole-routed k-ary n-cube networks due to a limited number of router-to-processor consumption channels at each node. Much recent research in wormhole routing has advocated the advantages of adaptive routing and virtual channel flow control schemes to deliver better network performance. This paper indicates that the advantages associated with these schemes cannot be realized with limited consumption capacity. To alleviate such performance bottlenecks, a new network interface design using multiple consumption channels is proposed. To match virtual multiplexing on network channels, we also propose that each consumption channel support multiple virtual consumption channels. The impact of message arrival rate at a node on the required number of consumption channels is studied analytically. It is shown that wormhole networks with higher routing adaptivity, dimensionality, degree of hot-spot traffic, and number of virtual lanes have to take advantage of multiple consumption channels to deliver better performance. The interplay between system topology, routing algorithm, number of virtual lanes, messaging overheads, and communication traffic is studied through simulation to derive the effective number of consumption channels required in a system. Using the ongoing technological trend, it is shown that wormhole-routed systems can use two to four consumption channels per node to deliver better system performance.
international conference on parallel processing | 1994
Debashis Basak; Dhabaleswar K. Panda
A general framework for architectural design of large hierarchical multiprocessor systems under rapidly changing packaging, processor, and interconnection technologies is presented. In recent years processor boards with larger area (A) and greater pinouts have become feasible. Board interconnection technology has advanced from peripheral connections, O(\sqrt{A}), to elastomeric surface connections, O(A). As processor and interconnection technologies advance, the demand on the interconnection network of the system varies. The proposed framework is capable of taking into account all these changes in technologies and, depending on a given set of technological parameters, deriving the optimal topology. The framework is illustrated by considering the design problem of the currently popular class of k-ary n-cube cluster-c scalable architectures.
international parallel and distributed processing symposium | 1993
Debashis Basak; Dhabaleswar K. Panda
Recent advancements in VLSI and packaging technologies make it attractive to build scalable parallel systems using clustered configurations while exploiting communication locality. Clustered architectures using buses or MINs as the inter-cluster interconnection do not satisfy both of the above objectives. This paper proposes a new class of k-ary n-cube cluster-c scalable architectures by combining the scalability of k-ary n-cube wormhole-routed networks with the cost-effectiveness of processor cluster designs. This paper focuses on direct cluster interconnection. The interplay between various system parameters and routing schemes is analyzed to determine optimal configurations under the constant bisection bandwidth constraint. Our analysis indicates that small-sized clusters with a ring intra-cluster topology and a 2D/3D/4D inter-cluster network connecting these clusters offer the best system performance.
international conference on parallel processing | 1996
Debashis Basak; Dhabaleswar K. Panda; Mohammad Banikazemi
Advances in multiprocessor interconnect technology are leading to high-performance networks. However, software overheads associated with message passing prevent processors from getting maximum performance from these networks, leading to under-utilization of network resources. Though processor clusters are being used in some systems in an ad hoc manner to alleviate this problem, there is no formal analysis in the literature showing when and how processor clusters help in designing high-performance and scalable systems. In this paper we analyze and solve this problem by considering processor clustering, messaging overheads, and network performance in an integrated manner. Our analysis establishes the following three design guidelines. Compared to a base system, under high messaging overheads, processor clustering can be used to build a) an equal-sized system with a smaller network or b) a larger system with an equal-sized network. Under low messaging overheads, a combination of processor clustering and wider channels can be used to build a range of larger-sized systems. All these guidelines lead to designing cost-effective and scalable parallel systems while delivering high performance.
international conference on parallel processing | 1996
Debashis Basak; Dhabaleswar K. Panda
Past research on designing processor-cluster based parallel systems has focused mainly on studying the packaging technologies affecting the inter-cluster network. To make such a design approach more attractive, there is a strong need to understand the details of the topology inside the cluster, its memory organization, and the impact of this organization on system performance. In this paper we analyze the communication costs for accessing inter-cluster and intra-cluster memories under different cluster organizations. The merits of these organizations are evaluated based on the performance of a commonly used U-mesh broadcast algorithm. Our results indicate that tightly coupled cluster organizations with shared access to memory offer faster intra-cluster communication. This leads such organizations to outperform loosely coupled cluster organizations. We also demonstrate that such faster intra-cluster access in clustered systems can be exploited to design better collective communication algorithms. We propose a new broadcasting algorithm on clustered meshes, clus-mesh, which outperforms the existing U-mesh on clustered systems by up to 20%.
international conference on computer communications and networks | 1995
Debashis Basak; Abhijit K. Choudhury; Ellen L. Hahne