Sunil Thulasidasan | Researchain

Publication

Featured research published by Sunil Thulasidasan.

global communications conference | 2002

GREEN: proactive queue management over a best-effort network

Wu-chun Feng; Apu Kapadia; Sunil Thulasidasan

We present a proactive queue-management (PQM) algorithm called GREEN (generalized random early evasion network) that applies knowledge of the steady-state behavior of TCP connections to drop packets intelligently and proactively, thus preventing congestion from ever occurring and ensuring a higher degree of fairness between flows. This congestion-prevention approach is in contrast to the congestion-avoidance approach of traditional active queue-management (AQM) schemes where congestion is actively detected early and then reacted to. In addition to enhancing fairness, GREEN keeps packet-queue lengths relatively low and reduces bandwidth and latency jitter. These characteristics are particularly beneficial to real-time multimedia applications. Further, GREEN achieves the above while maintaining high link utilization and low packet loss.
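The abstract does not state how the drop rate is chosen, so the following is only a rough sketch of the idea of using steady-state TCP behavior. It assumes the widely used approximation throughput ≈ c · MSS / (RTT · √p) and inverts it to find the drop probability p at which a flow would settle at its fair share of the link. The function name, the constant c = 1.22, and the example numbers are illustrative assumptions, not values taken from the paper.

```python
def drop_probability(link_bps, n_flows, mss_bytes, rtt_s, c=1.22):
    """Drop probability at which a steady-state TCP flow gets link_bps / n_flows.

    Assumes throughput = c * MSS / (RTT * sqrt(p)) and solves for p.
    """
    fair_share_bps = link_bps / n_flows          # equal split of the link
    mss_bits = mss_bytes * 8
    p = (c * mss_bits / (rtt_s * fair_share_bps)) ** 2
    return min(1.0, p)                           # a probability cannot exceed 1


# Hypothetical scenario: 100 Mb/s link, 50 flows, 1460-byte MSS, 100 ms RTT.
p = drop_probability(link_bps=100e6, n_flows=50, mss_bytes=1460, rtt_s=0.1)
print(f"per-flow drop probability: {p:.6f}")
```

Under this model, doubling the number of flows halves the fair share and therefore quadruples the required drop probability, as long as the result stays below 1.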

Computer Networks | 2010

Criticality analysis of Internet infrastructure

Guanhua Yan; Stephan Eidenbenz; Sunil Thulasidasan; Pallab Datta; Venkatesh Ramaswamy

The Internet has evolved into an indispensable component of our daily lives, and protecting its critical infrastructure has thus become a crucial task. In this work, we present and compare different methods to assess the criticality of individual facilities of the Internet infrastructure at a national level: graph-theoretical analysis, route-based analysis, traffic-based analysis, and consequence-based analysis. Our key observations are: (1) The geographical topology, which is derived from a national-level IP backbone network, has a power-law degree distribution and is a small-world network. (2) A few locations appear much more frequently than others among all paths in the IP backbone topology, and they also witness a high percentage of US Internet traffic. (3) The relative ranking of Internet facility locations from traffic-based analysis differs significantly from the rankings derived from graph-theoretical analysis and route-based analysis, suggesting that a comprehensive, high-fidelity Internet model is necessary to assess critical Internet infrastructure facilities. (4) Consequence-based analysis, although computationally intense, cannot be replaced by other rankings, including traffic-based analysis. Conclusions drawn from this work extend our knowledge regarding the Internet and also shed light on which critical Internet infrastructure facilities should be protected with limited resources.

high performance distributed computing | 2003

Optimizing GridFTP through dynamic right-sizing

Sunil Thulasidasan; Wu-chun Feng; Mark K. Gardner

In this paper, we describe the integration of dynamic right-sizing, an automatic and scalable buffer management technique for enhancing TCP (Transmission Control Protocol) performance, into GridFTP, a subsystem of the Globus Toolkit for managing bulk data transfers across computational Grids. Such Grids are often characterized by networks with large bandwidth-delay products. Unfortunately, many of today's Grid applications use only a small fraction of available bandwidth because the default buffer sizes in TCP are tuned for yesterday's WAN (wide-area network) speeds. Buffer sizes can be manually tuned to allow TCP flow control to adapt to high-speed WAN environments, but this is a tedious process. Although recent work has shown how to automatically tune system buffers during connection set-up, these values may not be appropriate for the connection's lifetime due to varying network delay and throughput. We show how using the technique of dynamic right-sizing (DRS) in GridFTP helps us optimize memory usage while maintaining high throughput over the lifetime of the connection. We also show how DRS enhances important GridFTP features such as striped and third-party data transfers in a scalable way. The technique is implemented entirely in user space so that end users do not have to modify the kernel.
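To make the buffer-sizing problem concrete, here is a small arithmetic sketch of the bandwidth-delay product (BDP) and of the throughput ceiling that a fixed window imposes. It is not the DRS algorithm itself. The function names and the example link (1 Gb/s, 100 ms round-trip time, 64 KiB buffer) are illustrative assumptions rather than figures from the paper.

```python
def bdp_bytes(bandwidth_bps, rtt_s):
    """Bytes that must be in flight to keep a path of this bandwidth and RTT full."""
    return bandwidth_bps * rtt_s / 8


def max_throughput_bps(window_bytes, rtt_s):
    """Upper bound on throughput when at most window_bytes are sent per round trip."""
    return window_bytes * 8 / rtt_s


# Hypothetical wide-area path: 1 Gb/s with a 100 ms round-trip time.
needed = bdp_bytes(1e9, 0.1)
# Throughput ceiling with a fixed 64 KiB buffer on the same path.
ceiling = max_throughput_bps(64 * 1024, 0.1)

print(f"buffer needed to fill the path: {needed / 1e6:.1f} MB")
print(f"ceiling with a 64 KiB buffer:   {ceiling / 1e6:.2f} Mb/s")
```

In this example the fixed buffer caps throughput at well under one percent of the link rate, which is the kind of gap that motivates sizing buffers to the path, and resizing them as delay and throughput change.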

international parallel and distributed processing symposium | 2014

Efficient Multi-GPU Computation of All-Pairs Shortest Paths

Hristo Djidjev; Sunil Thulasidasan; Guillaume Chapuis; Rumen Andonov; Dominique Lavenier

We describe a new algorithm for solving the all-pairs shortest-path (APSP) problem for planar graphs and graphs with small separators that exploits the massive on-chip parallelism available in today's Graphics Processing Units (GPUs). Our algorithm, based on the Floyd-Warshall algorithm, has near-optimal complexity in terms of the total number of operations, while its matrix-based structure is regular enough to allow for efficient parallel implementation on GPUs. By applying a divide-and-conquer approach, we are able to make use of multi-node GPU clusters, resulting in more than an order of magnitude speedup over the fastest known Dijkstra-based GPU implementation and a two-fold speedup over a parallel Dijkstra-based CPU implementation.
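For reference, the textbook Floyd-Warshall recurrence that the abstract builds on can be sketched in a few lines. This is the plain sequential version on a dense matrix, not the authors' partitioned multi-GPU algorithm. The function name and the four-vertex example graph are illustrative assumptions.

```python
import math


def floyd_warshall(weights):
    """All-pairs shortest-path distances for a dense weight matrix.

    weights[i][j] is the edge weight from i to j, math.inf if there is no edge,
    and 0 on the diagonal. Assumes no negative cycles.
    """
    n = len(weights)
    dist = [row[:] for row in weights]           # do not mutate the input
    for k in range(n):                           # allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist


INF = math.inf
example = [
    [0, 3, INF, 7],
    [8, 0, 2, INF],
    [5, INF, 0, 1],
    [2, INF, INF, 0],
]
distances = floyd_warshall(example)
for row in distances:
    print(row)
```

For a fixed k, every (i, j) update reads only row k and column k of the matrix, so the two inner loops are independent of each other. That regular, matrix-like access pattern is what makes the method a natural fit for GPUs.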

ieee international conference on high performance computing, data, and analytics | 2009

Designing systems for large-scale, discrete-event simulations: Experiences with the FastTrans parallel microsimulator

Sunil Thulasidasan; Shiva Prasad Kasiviswanathan; Stephan Eidenbenz; Emanuele Galli; Susan M. Mniszewski; Philip Romero

We describe the various aspects involved in building FastTrans, a scalable, parallel microsimulator for transportation networks that can simulate and route tens of millions of vehicles on real-world road networks in a fraction of real time. Vehicular trips are generated using agent-based simulations that provide realistic, daily activity schedules for a synthetic population of millions of intelligent agents. We use parallel discrete-event simulation techniques and distributed-memory algorithms to scale these simulations to over one thousand compute nodes. We present various optimizations for speeding up simulation execution times, including (i) a set of routing algorithms such as variations of Dijkstra's shortest path algorithm and heuristic-based A⋆ search, and (ii) a number of different partitioning schemes for load balancing, including geographic partitioning (which assigns simulation entities that are geographically close by to the same processor) and scattering (which assigns geographically close by entities to different processors). Our main findings include: (i) A⋆ significantly outperforms other routing algorithms while computing near-optimal paths; (ii) surprisingly, scattering outperforms more sophisticated partitioning schemes by achieving near-perfect load-balancing. With optimized routing and partitioning, FastTrans is able to simulate a full 24-hour work-day in New York, involving over one million road links and approximately 25 million vehicular trips, in less than one hour of wall-clock time on a 512-node cluster.
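As an illustration of heuristic-based routing, the sketch below implements a generic A⋆ search that uses straight-line distance to the destination as its heuristic. The abstract does not specify which heuristic FastTrans uses, so this is an assumption. Because the heuristic here never overestimates, this version returns optimal paths rather than near-optimal ones. The function name, the node coordinates, and the tiny road graph are all hypothetical.

```python
import heapq
import math


def a_star(graph, coords, start, goal):
    """Return (cost, path) from start to goal.

    graph maps node -> list of (neighbor, link_length);
    coords maps node -> (x, y), used for the straight-line heuristic.
    """
    def h(node):
        (x1, y1), (x2, y2) = coords[node], coords[goal]
        return math.hypot(x1 - x2, y1 - y2)

    best = {start: 0.0}                  # cheapest known cost to each node
    parent = {start: None}
    frontier = [(h(start), start)]       # ordered by cost so far + heuristic
    while frontier:
        _, node = heapq.heappop(frontier)
        if node == goal:
            path = []
            while node is not None:      # walk parents back to the start
                path.append(node)
                node = parent[node]
            return best[goal], path[::-1]
        for nbr, length in graph[node]:
            cand = best[node] + length
            if cand < best.get(nbr, math.inf):
                best[nbr] = cand
                parent[nbr] = node
                heapq.heappush(frontier, (cand + h(nbr), nbr))
    return math.inf, []                  # goal unreachable


# Hypothetical four-intersection network; link lengths are at least
# the straight-line distance between their endpoints.
coords = {"A": (0, 0), "B": (1, 0), "C": (1, 1), "D": (2, 1)}
graph = {
    "A": [("B", 1.0), ("C", 1.5)],
    "B": [("D", 2.0)],
    "C": [("D", 1.0)],
    "D": [],
}
cost, path = a_star(graph, coords, "A", "D")
print(cost, path)
```

Setting the heuristic to zero turns this into Dijkstra's algorithm. The heuristic's role is to steer the search toward the destination so that fewer nodes are expanded.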

Computer Communications | 2004

User-space auto-tuning for TCP flow control in computational grids

Mark K. Gardner; Sunil Thulasidasan; Wu-chun Feng

With the advent of computational grids, networking performance over the wide-area network (WAN) has become a critical component in the grid infrastructure. Unfortunately, many high-performance grid applications only use a small fraction of their available bandwidth because operating systems and their associated protocol stacks are still tuned for yesterday's network speeds. As a result, network gurus undertake the tedious process of manually tuning system buffers to allow TCP flow control to scale to today's WAN environments. And although recent research has shown how to set the size of these system buffers automatically at connection set-up, the buffer sizes are only appropriate at the beginning of the connection's lifetime. To address these problems, we describe an automated and lightweight technique called Dynamic Right-Sizing (DRS) that can improve throughput by as much as an order of magnitude while still abiding by TCP semantics. We show the performance of two user-space implementations of DRS: drsFTP and DRS-enabled GridFTP.

Journal of Parallel and Distributed Computing | 2015

All-Pairs Shortest Path algorithms for planar graph for GPU-accelerated clusters

Hristo Djidjev; Guillaume Chapuis; Rumen Andonov; Sunil Thulasidasan; Dominique Lavenier

We present a new approach for solving the All-Pairs Shortest-Path (APSP) problem for planar graphs that exploits the massive on-chip parallelism available in today's Graphics Processing Units (GPUs). We describe two new algorithms based on our approach. Both algorithms use the Floyd-Warshall method and have near-optimal complexity in terms of the total number of operations, while their matrix-based structure is regular enough to allow for efficient parallel implementation on GPUs. By applying a divide-and-conquer approach, we are able to make use of multi-node GPU clusters, resulting in more than an order of magnitude speedup over the fastest known Dijkstra-based GPU implementation and a two-fold speedup over a parallel Dijkstra-based CPU implementation.

We develop a new approach for the All-Pairs Shortest Path problem in planar graphs. We target execution on large CPU-GPU clusters and graphs with millions of vertices. We design a centralized (master/slave) and a decentralized (distributed) version. Our algorithms are work-efficient and allow a high degree of parallelism. Our algorithms are significantly faster than the previous ones.

winter simulation conference | 2009

Accelerating traffic microsimulations: a parallel discrete-event queue-based approach for speed and scale

Sunil Thulasidasan; Stephan Eidenbenz

We present FastTrans, a parallel, distributed-memory simulator for transportation networks that uses a queue-based event-driven approach to traffic microsimulation. Queue-based simulation models have been shown to be significantly faster than cellular-automata type approaches, sacrificing spatial granularity for speed, while preserving link and intersection dynamics with high fidelity. Significant advances over previous work include the size of the simulated network, support for dynamic responses to congestion, and the absence of precomputed routes: all routing calculations are executed online. We present initial results from a scalability study using a real-world network from the North-East region of the United States comprising over 1.5 million network elements and over 25 million vehicular trips. Simulation of an entire day's worth of realistic vehicular itineraries involving approximately five billion simulated events executes in less than an hour of wall-clock time on a distributed computing cluster. Initial results suggest almost linear speed-ups with cluster size.
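The queue-based idea can be illustrated with a deliberately tiny, single-link sketch. Each vehicle is represented only by its entry event; it leaves the link no earlier than its free-flow travel time allows and no sooner than a fixed headway after the vehicle ahead. This is an assumed, simplified model for illustration and not FastTrans's actual link or intersection logic. The function name and parameters are hypothetical.

```python
import heapq


def simulate_link(arrivals, free_flow_time, headway):
    """Return the exit time of each vehicle on one first-in, first-out link.

    arrivals: entry time per vehicle (index = vehicle id).
    free_flow_time: minimum time needed to traverse the link.
    headway: minimum gap between consecutive exits (outflow capacity).
    """
    events = [(t, vid) for vid, t in enumerate(arrivals)]
    heapq.heapify(events)                # process entry events in time order
    exits = {}
    next_free = 0.0                      # earliest time the exit is available
    while events:
        t_enter, vid = heapq.heappop(events)
        t_exit = max(t_enter + free_flow_time, next_free)
        exits[vid] = t_exit
        next_free = t_exit + headway     # the next vehicle must wait at least this long
    return [exits[v] for v in range(len(arrivals))]


# Three vehicles enter one second apart; the exit releases one vehicle every two seconds.
exit_times = simulate_link([0.0, 1.0, 2.0], free_flow_time=10.0, headway=2.0)
print(exit_times)
```

The model tracks only when a vehicle enters and leaves a link, not where it is along the link. That is the loss of spatial granularity the abstract mentions, and it is why one event per vehicle per link is enough.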

international ifip-tc networking conference | 2006

Semantic compression of TCP traces

Gabriel Istrate; Anders A. Hansson; Sunil Thulasidasan; Madhav V. Marathe; Christopher L. Barrett

We propose a new methodology, Restored, for model-based storage and regeneration of TCP traces. Restored provides significant data compression by exploiting semantics of TCP. Experiments show that Restored can achieve over 10,000-fold compression ratios for some really large input connections, while still being able to recover several structural and QoS measures.

Simulation | 2016

Discrete event performance prediction of speculatively parallel temperature-accelerated dynamics

Richard J. Zamora; Arthur F. Voter; Danny Perez; Nandakishore Santhi; Susan M. Mniszewski; Sunil Thulasidasan; Stephan Eidenbenz

Due to its unrivaled ability to predict the dynamical evolution of interacting atoms, molecular dynamics (MD) is a widely used computational method in theoretical chemistry, physics, biology, and engineering. Despite its success, MD is only capable of modeling timescales within several orders of magnitude of thermal vibrations, leaving out many important phenomena that occur at slower rates. The temperature-accelerated dynamics (TAD) method overcomes this limitation by thermally accelerating the state-to-state evolution captured by MD. Due to the algorithmically complex nature of the serial TAD procedure, implementations have yet to improve performance by parallelizing the concurrent exploration of multiple states. Here we utilize a discrete-event-based application simulator to introduce and explore a new speculatively parallel TAD (SpecTAD) method. We investigate the SpecTAD algorithm, without a full-scale implementation, by constructing an application simulator proxy (SpecTADSim). Following this method, we discover that a non-trivial relationship exists between the optimal SpecTAD parameter set and the number of CPU cores available at run-time. Furthermore, we find that a majority of the available SpecTAD boost can be achieved within an existing TAD application using relatively simple algorithm modifications.
