John A. Chandy | Researchain

Publication

Featured research published by John A. Chandy.

IEEE Computer | 1995

The Paradigm compiler for distributed-memory multicomputers

Prithviraj Banerjee; John A. Chandy; Manish Gupta; E.W. Hodges IV; John G. Holm; Antonio Lain; Daniel J. Palermo; Shankar Ramaswamy; Ernesto Su

To harness the computational power of massively parallel distributed-memory multicomputers, users must write efficient software. This process is laborious because of the absence of global address space. The programmer must manually distribute computations and data across processors and explicitly manage communication. The Paradigm (PARAllelizing compiler for DIstributed-memory, General-purpose Multicomputers) project at the University of Illinois addresses this problem by developing automatic methods for the efficient parallelization of sequential programs. A unified approach efficiently supports regular and irregular computations using data and functional parallelism.

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 1997

An evaluation of parallel simulated annealing strategies with application to standard cell placement

John A. Chandy; Sungho Kim; Balkrishna Ramkumar; Steven Parkes; Prithviraj Banerjee

Simulated annealing, a methodology for solving combinatorial optimization problems, is a very computationally expensive algorithm and, as such, numerous researchers have undertaken efforts to parallelize it. In this paper, we investigate three of these parallel simulated annealing strategies when applied to standard cell placement, specifically the TimberWolfSC placement tool. We have examined a parallel moves strategy, as well as two new approaches to parallel cell placement: multiple Markov chains and speculative computation. These algorithms have been implemented in ProperPLACE, our parallel cell placement application, as part of the ProperCAD II project. We have constructed ProperPLACE so that it is portable across a wide range of parallel architectures. Our parallel moves algorithm uses novel approaches to dynamic message sizing, message prioritization, and error control. We show that parallel moves and multiple Markov chains are effective approaches to parallel simulated annealing when applied to TimberWolfSC, yet speculative computation is wholly inadequate.
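The multiple Markov chains idea named in this abstract can be illustrated with a small sketch. This is not ProperPLACE or TimberWolfSC: the one-dimensional placement, the `wirelength` cost, the swap move, the cooling schedule, and all function names are assumptions chosen for illustration, and the chains run one after another here rather than on separate processors. Each chain anneals independently for a fixed number of moves, then all chains restart from the best placement found so far.

```python
import math
import random


def wirelength(order, nets):
    """Sum over nets of the span between the leftmost and rightmost cell."""
    pos = {cell: i for i, cell in enumerate(order)}
    return sum(max(pos[c] for c in net) - min(pos[c] for c in net) for net in nets)


def anneal_segment(order, nets, temp, steps, rng):
    """Run one Markov chain for `steps` swap moves at a fixed temperature."""
    order = list(order)
    cost = wirelength(order, nets)
    best, best_cost = list(order), cost
    for _ in range(steps):
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
        new_cost = wirelength(order, nets)
        delta = new_cost - cost
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost  # accept the move
            if cost < best_cost:
                best, best_cost = list(order), cost
        else:
            order[i], order[j] = order[j], order[i]  # reject: undo the swap
    return best, best_cost


def multiple_markov_chains(cells, nets, chains=4, rounds=20, steps=50,
                           temp=5.0, cooling=0.85, seed=0):
    """Hypothetical driver: independent chains with periodic best-state exchange."""
    rngs = [random.Random(seed + k) for k in range(chains)]
    best, best_cost = list(cells), wirelength(cells, nets)
    for _ in range(rounds):
        # Each chain would run on its own processor; here they run in turn.
        results = [anneal_segment(best, nets, temp, steps, rng) for rng in rngs]
        round_best, round_cost = min(results, key=lambda r: r[1])
        if round_cost < best_cost:
            best, best_cost = round_best, round_cost
        temp *= cooling
    return best, best_cost
```

Restarting every chain from the best placement is one simple exchange policy; other policies (for example, exchanging current rather than best states) are equally plausible readings of the technique.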

ACM Journal on Emerging Technologies in Computing Systems | 2016

A Survey on Chip to System Reverse Engineering

Shahed E. Quadir; Junlin Chen; Domenic Forte; Navid Asadizanjani; Sina Shahbazmohamadi; Lei Wang; John A. Chandy; Mark Tehranipoor

The reverse engineering (RE) of electronic chips and systems can be used with honest and dishonest intentions. To inhibit RE for those with dishonest intentions (e.g., piracy and counterfeiting), it is important that the community is aware of the state-of-the-art capabilities available to attackers today. In this article, we will be presenting a survey of RE and anti-RE techniques on the chip, board, and system levels. We also highlight the current challenges and limitations of anti-RE and the research needed to overcome them. This survey should be of interest to both governmental and industrial bodies whose critical systems and intellectual property (IP) require protection from foreign enemies and counterfeiters who possess advanced RE capabilities.

International Conference on Parallel Processing | 1994

Communication Optimizations Used in the Paradigm Compiler for Distributed-Memory Multicomputers

Daniel J. Palermo; Ernesto Su; John A. Chandy; Prithviraj Banerjee

The PARADIGM (PARAllelizing compiler for DIstributed-memory General-purpose Multicomputers) project at the University of Illinois provides a fully automated means to parallelize programs, written in a serial programming model, for execution on distributed-memory multicomputers. To provide efficient execution, PARADIGM automatically performs various optimizations to reduce the overhead and idle time caused by interprocessor communication. Optimizations studied in this paper include message coalescing, message vectorization, message aggregation, and coarse grain pipelining. To separate the optimization algorithms from machine-specific details, parameterized models are used to estimate communication and computation costs for a given machine. The models are also used in coarse grain pipelining to automatically select a task granularity that balances the available parallelism with the costs of communication. To determine the applicability of the optimizations on different machines, we analyzed their performance on an Intel iPSC/860, an Intel iPSC/2, and a Thinking Machines CM-5.
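The role of a parameterized cost model in choosing a pipelining granularity can be sketched as follows. The linear message cost (fixed startup plus per-element cost), the pipeline time formula, and every function and parameter name below are assumptions for illustration; the abstract does not give PARADIGM's actual models.

```python
import math


def message_cost(n_elements, startup, per_element):
    """Assumed linear model: fixed startup plus a per-element transfer cost."""
    return startup + per_element * n_elements


def pipeline_time(n_iters, n_procs, block, t_comp, startup, per_element):
    """Estimated time when each processor computes `block` iterations and
    then forwards one message to the next pipeline stage."""
    stages = math.ceil(n_iters / block) + n_procs - 1
    return stages * (block * t_comp + message_cost(block, startup, per_element))


def choose_granularity(n_iters, n_procs, t_comp, startup, per_element):
    """Pick the block size with the lowest estimated time (brute force)."""
    return min(range(1, n_iters + 1),
               key=lambda b: pipeline_time(n_iters, n_procs, b, t_comp,
                                           startup, per_element))
```

Under this model, a machine with no message startup cost favors the finest granularity, while a high startup cost pushes the choice toward larger blocks. The same startup term is why sending one vectorized message of many elements is estimated to be cheaper than sending each element separately.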

Field-Programmable Custom Computing Machines | 2004

FPGA based network intrusion detection using content addressable memories

Long Bu; John A. Chandy

In this paper, we introduce a novel architecture for a hardware-based network intrusion detection system (NIDS). Current software-based NIDS are too compute-intensive and cannot meet the bandwidth requirements of a modern network. Thus, hardware techniques are desired to speed up network processing. This paper introduces an FPGA-based keyword match processor that can serve as the core of a hardware-based NIDS. The keyword match processor's key feature is a cellular processor architecture that allows content addressable memory (CAM) to process variable-sized keys. These CAMs allow us to perform intrusion detection signature lookup at line speed at rates well past 2 Gbps.
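As a rough software analogy for CAM-based keyword matching with variable-sized keys, the sketch below compares every stored keyword against the payload at each byte offset. It models only the lookup behavior, not the cellular FPGA architecture from the paper; the function name `cam_match` and the sample keywords are hypothetical.

```python
def cam_match(payload, keywords):
    """Return (offset, keyword_index) for every keyword found in payload."""
    hits = []
    for offset in range(len(payload)):
        # In hardware, every CAM entry compares in parallel; this inner
        # loop stands in for that parallel comparison.
        for index, key in enumerate(keywords):
            if key and payload.startswith(key, offset):
                hits.append((offset, index))
    return hits
```

For example, `cam_match(b"GET /etc/passwd HTTP", [b"/etc/passwd", b"cmd.exe", b"etc"])` reports the first keyword at offset 4 and the third at offset 5. The software loop costs time proportional to the number of keywords per offset, which is the cost the parallel hardware comparison avoids.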

Field-Programmable Custom Computing Machines | 2005

A signature match processor architecture for network intrusion detection

Janardhan Singaraju; Long Bu; John A. Chandy

In this paper, we introduce a novel architecture for a hardware-based network intrusion detection system (NIDS). NIDSs are becoming critical components of the network infrastructure as they serve as a key line of defense in network protection. However, current methods are much too compute-intensive and cannot begin to meet the bandwidth requirements of a moderate-sized corporate network. Thus, hardware techniques are desired to speed up network processing. This paper introduces an FPGA-based signature match processor that can serve as the core of a hardware-based NIDS. The signature match processor's key feature is a CAM-based cellular processor architecture that can match strings in an area-efficient manner. Using a unique binary tree structure, we are also able to generate priority encoded addresses corresponding to multiple signature matches.
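The abstract does not describe its binary tree structure, so the following is only a generic tree-style priority encoder, given as an assumption-laden sketch of how a vector of match bits can be reduced to addresses. Lower indices are assumed to have higher priority, and both function names are hypothetical.

```python
def priority_encode(bits):
    """Return the lowest index whose bit is set, or None if no bit is set.
    `bits` must be non-empty."""
    if len(bits) == 1:
        return 0 if bits[0] else None
    half = len(bits) // 2
    left = priority_encode(bits[:half])
    if left is not None:
        return left  # the left subtree has priority
    right = priority_encode(bits[half:])
    return None if right is None else half + right


def all_matches(bits):
    """List every matching address in priority order by clearing each winner."""
    bits = list(bits)
    found = []
    while (addr := priority_encode(bits)) is not None:
        found.append(addr)
        bits[addr] = 0
    return found
```

Each level of the recursion corresponds to one level of a tree of two-input selectors, so the depth grows with the logarithm of the number of match lines.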

International Conference on Cluster Computing | 2008

Active storage using object-based devices

Tina Miriam John; Anuradharthi Thiruvenkata Ramani; John A. Chandy

The increasing performance and decreasing cost of processors and memory are causing system intelligence to move from the CPU to peripherals such as disk drives. Storage system designers are using this trend toward excessive computation capability to perform more complex processing and optimizations directly inside the storage devices. Such optimizations have been performed only at low levels of the storage protocol. Another factor to consider is the current trends in storage density, mechanics, and electronics, which are eliminating the bottleneck encountered while moving data off the media, and putting pressure on interconnects and host processors to move data more efficiently. Previous work on active storage has taken advantage of the extra processing power on individual disk drives to run application-level code. This idea of moving portions of an application's processing to run directly at disk drives can dramatically reduce data traffic and take advantage of the parallel storage already present in large systems today. This paper aims at demonstrating active storage on an iSCSI OSD standards-based, object-oriented framework.

International Conference on VLSI Design | 1996

Parallel simulated annealing strategies for VLSI cell placement

John A. Chandy; Prithviraj Banerjee

Simulated annealing based standard cell placement for VLSI designs has long been acknowledged as a compute-intensive process, and as a result several research efforts have been undertaken to parallelize this algorithm. Most previous parallel approaches to cell placement annealing have used a parallel moves approach. In this paper we investigate two new approaches that have been proposed for generalized parallel simulated annealing but have not been applied to the cell placement problem. Results are presented on the effectiveness of implementations of these algorithms when applied to the cell placement problem. We find that the first, multiple Markov chains, appears to be promising since it uses parallelism to obtain near-linear speedups with no loss in quality. The second, speculative computation, maintains quality but is not suitable, since no speedups are achieved due to the specific nature of the cell placement problem. The two algorithms are compared with the parallel moves approach to parallel cell placement.

International Conference on Distributed Computing Systems | 1993

Failure evaluation of disk array organizations

John A. Chandy; A.L.N. Reddy

The authors present an evaluation of some of the disk array organizations proposed in the literature. They evaluate three alternatives for sparing (hot sparing, distributed sparing, and parity sparing) and two options for data layout (regular RAID5 and block designs), as well as systems based on combinations of these data layout and sparing alternatives. The performance of these organizations is evaluated with different reconstruction strategies. It is shown that parity sparing and distributed sparing have better performance and shorter reconstruction times than hot sparing. It is shown that both block designs as a data layout policy and distributed sparing as a sparing policy reduce the reconstruction time after a failure. The impact of reconstruction strategies is studied, and it is shown that, at higher workloads, choice of reconstruction strategy has a significant impact on the performance of the systems.
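For readers unfamiliar with what reconstruction after a failure involves in a RAID5-style layout, the sketch below shows the standard XOR parity relation: any one lost block in a stripe equals the XOR of the surviving blocks. It does not model the sparing policies or block designs evaluated in the paper, and the function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(stripe, failed):
    """Rebuild the block at index `failed` from the surviving blocks."""
    survivors = [blk for i, blk in enumerate(stripe) if i != failed]
    return xor_blocks(survivors)
```

Because rebuilding one block requires reading every surviving block in the stripe, where the rebuilt data is written (a dedicated hot spare versus spare space spread across the array) affects how the reconstruction load is distributed.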

Conference on High Performance Computing (Supercomputing) | 1994

A library-based approach to portable, parallel, object-oriented programming: interface, implementation, and application

Steven Parkes; John A. Chandy; Prithviraj Banerjee

The use of parallel platforms, despite their increasing availability, remains largely restricted to well-structured, numeric applications. We address the issue of facilitating the use of parallel platforms on unstructured problems through object-oriented design techniques and the actor model of concurrent computation. We present a multi-level approach to expressing parallelism for unstructured applications: a high-level interface based on the actor model of concurrent object-oriented programming and a low-level interface which provides an object-oriented interface to system services across a wide range of parallel architectures. The high- and low-level interfaces are implemented as part of the ProperCAD II C++ class library which supports shared memory, message-passing, and hybrid architectures. We demonstrate our approach through a detailed examination of the parallelization process for an existing unstructured serial application, viz. a state-of-the-art VLSI computer-aided design application. We compare and contrast the library-based actor approach to other methods for expressing parallelism in C++ on a number of applications and kernels.
