Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Malolan Chetlur is active.

Publication


Featured research published by Malolan Chetlur.


Lecture Notes in Computer Science | 1998

An Object-Oriented Time Warp Simulation Kernel

Radharamanan Radhakrishnan; Dale E. Martin; Malolan Chetlur; Dhananjai Madhava Rao; Philip A. Wilsey

The design of a Time Warp simulation kernel is made difficult by the inherent complexity of the paradigm. Hence it becomes critical that the design of such complex simulation kernels follow established design principles such as object-oriented design so that the implementation is simple to modify and extend. In this paper, we present a compendium of our efforts in the design and development of an object-oriented Time Warp simulation kernel, called warped. warped is a publicly available Time Warp simulation kernel for experimentation and application development. The kernel defines a standard interface to the application developer and is designed to provide a highly configurable environment for the integration of Time Warp optimizations. It is written in C++, uses the MPI message passing standard for communication, and executes on a variety of platforms including a network of SUN workstations, a SUN SMP workstation, the IBM SP1/SP2 multiprocessors, the Cray T3E, the Intel Paragon, and IBM-compatible PCs running Linux.
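The "standard interface to the application developer" described above can be pictured as an abstract base class that every simulation object implements, with hooks the kernel invokes for event execution and for the state saving/restoration that rollback requires. The sketch below is illustrative only (names are hypothetical, not the actual warped C++ API):

```python
from abc import ABC, abstractmethod

class SimulationObject(ABC):
    """Hypothetical base class sketching the kind of standard interface
    a Time Warp kernel exposes to application developers."""

    @abstractmethod
    def initialize(self):
        """Called once by the kernel before the simulation starts."""

    @abstractmethod
    def execute_event(self, event):
        """Process one event; may schedule new events via the kernel."""

    @abstractmethod
    def save_state(self):
        """Return a snapshot the kernel can later restore on rollback."""

    @abstractmethod
    def restore_state(self, state):
        """Reinstall a previously saved snapshot after a rollback."""
```

An application object subclasses this interface; the kernel can then checkpoint it before speculative execution and roll it back when a straggler message arrives.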


workshop on parallel and distributed simulation | 1998

Optimizing communication in time-warp simulators

Malolan Chetlur; Nael B. Abu-Ghazaleh; Radharamanan Radhakrishnan; Philip A. Wilsey

In message passing environments, the message send time is dominated by overheads that are relatively independent of the message size. Therefore, fine grained applications (such as Time Warp simulators) suffer high overheads because of frequent communication. We investigate the optimization of the communication subsystem of Time Warp simulators using dynamic message aggregation. Under this scheme, Time Warp messages with the same destination LP, occurring in close temporal proximity are dynamically aggregated and sent as a single physical message. Several aggregation strategies that attempt to minimize the communication overhead without harming the progress of the simulation (because of messages being delayed) are developed. The performance of the strategies is evaluated for a network of workstations, and an SMP, using a number of applications that have different communication behavior.
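The core idea of dynamic message aggregation can be sketched as a thin layer over the physical send: buffer messages per destination and flush them as one physical message once the buffer is full or the oldest buffered message grows too old. This is a minimal illustration of one possible strategy, not the paper's evaluated implementations (names and thresholds are hypothetical):

```python
import time

class AggregatingChannel:
    """Sketch of dynamic message aggregation: buffer per-destination
    messages and emit one physical send per batch. Age is only checked
    on the send path here; a real layer would also flush on a timer."""

    def __init__(self, send_fn, max_batch=8, max_age=0.001):
        self.send_fn = send_fn      # underlying physical send
        self.max_batch = max_batch  # flush when this many are buffered
        self.max_age = max_age      # flush when oldest message is this old (s)
        self.buffers = {}           # dest -> (first_enqueue_time, [messages])

    def send(self, dest, msg):
        first, msgs = self.buffers.setdefault(dest, (time.monotonic(), []))
        msgs.append(msg)
        if len(msgs) >= self.max_batch or time.monotonic() - first >= self.max_age:
            self.flush(dest)

    def flush(self, dest):
        _, msgs = self.buffers.pop(dest, (None, []))
        if msgs:
            self.send_fn(dest, msgs)  # one physical message carries the batch
```

The trade-off the paper's strategies navigate is visible in the two knobs: a larger batch or age limit amortizes more per-message overhead, but delays events and can slow simulation progress.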


Journal of Parallel and Distributed Computing | 2002

Analysis and Simulation of Mixed-Technology VLSI Systems

Dale E. Martin; Radharamanan Radhakrishnan; Dhananjai Madhava Rao; Malolan Chetlur; Krishnan Subramani; Philip A. Wilsey

Circuit simulation has proven to be one of the most important computer aided design (CAD) methods for verification and analysis of integrated circuit designs. A popular approach to modeling circuits for simulation purposes is to use a hardware description language such as VHDL. VHDL has had a tremendous impact in fostering and accelerating CAD systems development in the digital arena. Similar efforts have also been carried out in the analog domain, which have resulted in tools such as SPICE. However, with the growing trend of hardware designs that contain both analog and digital components, comprehensive design environments that seamlessly integrate analog and digital circuitry are needed. Simulation of digital or analog circuits is, however, exacerbated by high-resource (CPU and memory) demands that increase when analog and digital models are integrated in a mixed-mode (analog and digital) simulation. A cost-effective solution to this problem is the application of parallel discrete-event simulation (PDES) algorithms on a distributed memory platform such as a cluster of workstations. In this paper, we detail our efforts in architecting an analysis and simulation environment for mixed-technology VLSI systems. In addition, we describe the design issues faced in the application of PDES algorithms to mixed-technology VLSI system simulation.


annual simulation symposium | 1996

A comparative analysis of various Time Warp algorithms implemented in the WARPED simulation kernel

Radharamanan Radhakrishnan; Timothy J. McBrayer; Krishnan Subramani; Malolan Chetlur; Vijay Balakrishnan; Philip A. Wilsey

The Time Warp mechanism conceptually has the potential to speed up discrete event simulations on parallel platforms. However, practical implementations of the optimistic mechanism have been hindered by several drawbacks, such as large memory usage, excessive rollbacks (instability), and wasted lookahead computation. Several optimizations and variations to the original Time Warp algorithm have been presented in the literature to optimistically synchronize parallel discrete event simulation. This paper uses a common simulation environment to present comparative performance results of several Time Warp optimizations in two different application domains, namely queuing model simulation and digital system simulation. The particular optimizations considered are: lowest-timestamp-first (LTSF) scheduling, periodic (fixed period) checkpointing, dynamic checkpointing, lazy cancellation, and dynamic cancellation.
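Of the optimizations compared above, LTSF scheduling is the easiest to state concretely: always execute next the logical process whose earliest unprocessed event has the smallest timestamp, which keeps LPs close together in virtual time and reduces rollbacks. A toy sketch of that scheduling order (names are illustrative, not the WARPED kernel's API):

```python
import heapq

def ltsf_schedule(lps):
    """Lowest-timestamp-first scheduling sketch: repeatedly pick the
    logical process whose next pending event has the smallest timestamp.
    `lps` maps an LP name to its pending event timestamps, each list
    already in increasing order. Returns the (lp, timestamp) execution
    order a serial LTSF scheduler would produce."""
    heap = [(ts[0], name) for name, ts in lps.items() if ts]
    heapq.heapify(heap)
    order = []
    while heap:
        ts, name = heapq.heappop(heap)
        order.append((name, ts))
        rest = lps[name][1:]
        lps[name] = rest
        if rest:  # re-enter the heap with the LP's next event
            heapq.heappush(heap, (rest[0], name))
    return order
```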


workshop on parallel and distributed simulation | 2001

Causality representation and cancellation mechanism in time warp simulations

Malolan Chetlur; Philip A. Wilsey

The Time Warp synchronization protocol allows causality errors and then recovers from them with the assistance of a cancellation mechanism. Cancellation can cause the rollback of several other simulation objects that may trigger a cascading rollback situation where the rollback cycles back to the original simulation object. These cycles of rollback can cause the simulation to enter an unstable (or thrashing) state where little real forward simulation progress is achieved. To address this problem, knowledge of causal relations between events can be used during cancellation to avoid cascading rollbacks and to initiate early recovery operations from causality errors. In this paper we describe a logical time representation for Time Warp simulations that is used to disseminate causality information. The new timestamp representation, called Total Clocks, has two components: (i) a virtual time component, and (ii) a vector of event counters similar to Vector clocks. The virtual time component provides a one dimensional global simulation time and the vector of event counters records event processing rates by the simulation objects. This time representation allows us to disseminate causality information during event execution that can be used to allow early recovery during cancellation. We propose a cancellation mechanism using Total Clocks that avoids cascading rollbacks in Time Warp simulations that have FIFO communication channels.
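The two-component timestamp described above pairs a scalar virtual time with a vector of event counters, and causality between two events can be tested on the counter component with the classic vector-clock comparison. The sketch below shows only that standard comparison; the paper's actual Total Clocks rules are richer than this:

```python
def happened_before(tc1, tc2):
    """Causality check on Total Clocks-style timestamps, each a pair
    (virtual_time, event_counter_vector). tc1 causally precedes tc2 if
    every counter in tc1 is <= the corresponding counter in tc2 and at
    least one is strictly smaller (the usual vector-clock ordering).
    If neither precedes the other, the events are concurrent."""
    v1, v2 = tc1[1], tc2[1]
    return (all(a <= b for a, b in zip(v1, v2))
            and any(a < b for a, b in zip(v1, v2)))
```

It is exactly this ability to distinguish causally dependent events from concurrent ones that lets a cancellation mechanism roll back only what genuinely depends on an erroneous event.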


international conference on parallel processing | 1998

On-line configuration of a time warp parallel discrete event simulator

Radharamanan Radhakrishnan; Nael B. Abu-Ghazaleh; Malolan Chetlur; Philip A. Wilsey

In time warp simulations, the overheads associated with rollbacks, state-saving and the communication induced by rollbacks are the chief contributors to the cost of the simulation; thus, these aspects of the simulation have been primary targets for optimizations. Unfortunately, the behavior of the time warp simulation is highly dynamic and greatly influenced by the application being simulated. Thus, the suggested optimizations are only effective for certain intervals of the simulation. This paper argues that the performance of time warp simulators benefits from a dynamic on-line decision process that selects and configures the sub-algorithms implementing the different aspects of the simulator to best match the current behavior of the simulation. In particular we study control strategies to dynamically: (i) adjust the checkpointing (or state-saving) interval, (ii) select the cancellation strategy (lazy or aggressive), and (iii) determine the policy for aggregating the application messages (an optimization that significantly improves the performance in message passing environments). The strategies have been implemented in the WARPED time warp simulation kernel and the performance obtained via the dynamically controlled optimizations is shown to surpass that of their best performing static counterparts.
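The first of the three control strategies, on-line adjustment of the checkpointing interval, can be illustrated with a toy feedback rule: checkpoint more often when rollbacks are frequent (so coast-forward after a rollback is cheap) and less often when they are rare (so state-saving overhead is low). The thresholds and doubling/halving policy below are illustrative assumptions, not the controller studied in the paper:

```python
def adjust_checkpoint_interval(interval, rollback_rate,
                               high=0.2, low=0.05,
                               min_iv=1, max_iv=64):
    """Toy on-line control rule for the state-saving interval.
    `interval` is the current number of events between checkpoints;
    `rollback_rate` is the observed fraction of executed events that
    were later rolled back. Halve the interval under heavy rollback,
    double it when rollbacks are rare, clamped to [min_iv, max_iv]."""
    if rollback_rate > high:
        return max(min_iv, interval // 2)
    if rollback_rate < low:
        return min(max_iv, interval * 2)
    return interval
```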


international parallel processing symposium | 1999

Addressing Communication Latency Issues on Clusters for Fine Grained Asynchronous Applications - A Case Study

Umesh Kumar V. Rajasekaran; Malolan Chetlur; Philip A. Wilsey

With the advent of cheap and powerful hardware for workstations and networks, a new cluster-based architecture for parallel processing applications has been envisioned. However, fine-grained asynchronous applications that communicate frequently are not the ideal candidates for such architectures because of their high latency communication costs. Hence, designers of fine-grained parallel applications on clusters are faced with the problem of reducing the high communication latency in such architectures. Depending on what kind of resources are available, the communication latency can be improved along the following dimensions: (a) reducing network latency by employing a higher performance network hardware (i.e., Fast Ethernet versus Myrinet); (b) reducing communication software overhead by developing more efficient communication libraries (MPICH versus TCPMPL (our TCP/IP based message passing layer) versus MPI-BIP); (c) rewriting/restructuring the application code for less frequent communication; and (d) exploiting application characteristics by deploying communication optimizations that exploit the application’s inherent communication characteristics. This paper discusses our experiences with building a communication subsystem on a cluster of workstations for a fine-grained asynchronous application (a Time Warp synchronized discrete-event simulator). Specifically, our efforts in reducing the communication latency along three of the four aforementioned dimensions are detailed and discussed. In addition, performance results from an in-depth empirical evaluation of the communication subsystem are reported in the paper.


winter simulation conference | 2006

Causality Information and Fossil Collection in Time Warp Simulations

Malolan Chetlur; Philip A. Wilsey

This paper presents a time warp fossil collection mechanism that functions without the need for a GVT estimation algorithm. Effectively each logical process (LP) collects causality information during normal event execution and then each LP utilizes this information to identify fossils. In this mechanism, LPs use constant size vectors (that are independent of the total number of parallel simulation objects) as timestamps called Plausible Total Clocks to disseminate causality information. For proper operation, this mechanism requires that the communication layer preserves a FIFO ordering on messages. A detailed description of this new fossil collection mechanism and its proof of correctness is presented in this paper.
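The flavor of GVT-free fossil identification can be conveyed with a much-simplified stand-in: if an LP tracks, per peer, how far in virtual time that peer is known to have progressed (information piggybacked on messages arriving over FIFO channels), then any saved item older than the minimum of that knowledge can never be rolled back and may be reclaimed. This min-based cut is only an illustration; the paper's Plausible Total Clocks carry richer causality information than a single per-peer time:

```python
def collectable_fossils(history, last_seen):
    """Simplified fossil-identification sketch. `history` is this LP's
    saved event timestamps; `last_seen` maps each peer LP to the latest
    virtual time this LP knows that peer has reached. Anything strictly
    older than the minimum known progress is safe to reclaim."""
    horizon = min(last_seen.values())
    return [e for e in history if e < horizon]
```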


european conference on parallel processing | 2001

Event List Management in Distributed Simulation

Jörgen Dahl; Malolan Chetlur; Philip A. Wilsey

Efficient management of event lists is important in optimizing discrete event simulation performance. This is especially true in distributed simulation systems. The performance of simulators is directly dependent on the event list management operations such as insertion, deletion, and search. Several factors such as scheduling, checkpointing, and state management influence the organization of data structures to manage events efficiently in a distributed simulator. In this paper, we present a new organization for input event queues, called append-queues, for an optimistically synchronized parallel discrete-event simulator. Append-queues exploit the fact that events exchanged between the distributed simulators are generated in sequences with monotonically increasing time orders. A comparison of append-queues with an existing multi-list organization is developed that uses both analytical and experimental analysis to show the event management cost of different configurations. The comparison shows performance improvements ranging from 3% to 47% for the applications studied.
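The property append-queues exploit is that each sender's events arrive in increasing timestamp order, so the receiver can append to a per-sender run in O(1) instead of performing a sorted insert, and merge across senders only when events are consumed. A minimal sketch of that idea (class and method names are illustrative, not the paper's implementation):

```python
import heapq

class AppendQueue:
    """Sketch of an append-queue: keep one already-sorted run per
    sender, append in O(1), and merge the runs lazily on dequeue."""

    def __init__(self):
        self.sequences = {}   # sender -> list of timestamps, kept sorted

    def append(self, sender, ts):
        seq = self.sequences.setdefault(sender, [])
        # Each sender emits monotonically increasing timestamps,
        # so a plain append keeps the run sorted.
        assert not seq or ts >= seq[-1]
        seq.append(ts)

    def drain_in_order(self):
        """Merge the per-sender runs into one time-ordered stream."""
        return list(heapq.merge(*self.sequences.values()))
```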


european pvm mpi users group meeting on recent advances in parallel virtual machine and message passing interface | 1998

An Active Layer Extension to MPI

Malolan Chetlur; Girindra D. Sharma; Nael B. Abu-Ghazaleh; Umesh Kumar V. Rajasekaran; Philip A. Wilsey

Communication costs represent a significant portion of the execution time of most distributed applications. Thus, it is important to optimize the communication behavior of the algorithm to match the capabilities of the underlying communication fabric. Traditionally, optimizations to the communication behavior have been carried out statically and at the application level (optimizing partitioning, using the most appropriate communication protocols, etc). This paper introduces a new class of optimizations to communication: active run-time matching between the application communication behavior and the communication layer. We propose an active layer extension to the Message Passing Interface (MPI) that dynamically reduces the average communication overhead associated with message sends and receives. The active layer uses dynamic message aggregation to reduce the send overheads and infrequent polling to reduce the receive overhead of messages. The performance of the active layer is evaluated using a number of applications.
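On the receive side, "infrequent polling" trades responsiveness for lower overhead: rather than probing the network after every event, the layer polls less often when polls come up empty. A toy back-off rule makes the idea concrete; the actual active-layer policy is not specified in the abstract, and this rule is purely an illustrative assumption:

```python
def next_poll_interval(current, found_messages, min_iv=1, max_iv=32):
    """Toy adaptive polling rule: if the last poll found messages,
    poll again soon (traffic is flowing); after an empty poll, back
    off exponentially up to max_iv events between polls."""
    if found_messages:
        return min_iv
    return min(max_iv, current * 2)
```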

Collaboration


Dive into Malolan Chetlur's collaborations.

Top Co-Authors

Dale E. Martin

University of Cincinnati
