Venkatesan Muthukumar

Publication

Featured research published by Venkatesan Muthukumar.

digital systems design | 2004

Image processing algorithms on reconfigurable architecture using HandelC

Venkatesan Muthukumar; Daggu Venkateshwar Rao

Computer manipulation of images is generally defined as digital image processing (DIP). DIP is employed in a variety of applications, including video surveillance, target recognition, and image enhancement. Some of the algorithms used in image processing include convolution, edge detection, and contrast enhancement. These are usually implemented in software but may also be implemented in special-purpose hardware to increase speed. In this work, the Canny edge detection architecture [A computational approach to edge detection] has been developed using a reconfigurable architecture, with the hardware modeled in a C-like hardware language called Handel-C. The proposed architecture is capable of producing one edge pixel every clock cycle. The modeled hardware was implemented using the DK2 IDE tool on the RC1000 Xilinx Virtex FPGA-based board. The algorithm was tested on standard image processing benchmarks, and the significance of the results is discussed.
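
As a rough software illustration of the convolution stage that an edge detector of this kind relies on, the sketch below computes a Sobel gradient magnitude over a small image. It is not the Handel-C architecture from the paper: the function name `sobel_magnitude`, the list-of-lists image format, and the |Gx| + |Gy| magnitude approximation are assumptions chosen for brevity, and the smoothing, non-maximum suppression, and hysteresis stages of Canny are omitted.

```python
def sobel_magnitude(image):
    """Return the |Gx| + |Gy| gradient magnitude of a 2D list of pixel values.

    Border pixels are left at 0 because the 3x3 window does not fit there.
    """
    gx_kernel = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient
    gy_kernel = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0
            # Slide the 3x3 window centred on (r, c) over both kernels.
            for dr in range(3):
                for dc in range(3):
                    pixel = image[r + dr - 1][c + dc - 1]
                    gx += gx_kernel[dr][dc] * pixel
                    gy += gy_kernel[dr][dc] * pixel
            out[r][c] = abs(gx) + abs(gy)
    return out


# A vertical step edge between columns 1 and 2 (hypothetical test image).
step_image = [[0, 0, 10, 10] for _ in range(4)]
edges = sobel_magnitude(step_image)
```

In hardware, the same window is typically fed from line buffers so that one output pixel can be produced per clock; the nested loops here only mirror the arithmetic, not that pipelining.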

international conference on information technology: new generations | 2009

Traffic Aware Scheduling Algorithm for Network on Chip

Ashwini Raina; Venkatesan Muthukumar

Steady advancements in semiconductor technology over the past few decades have led researchers to propose Network-on-Chip (NoC) as the on-chip communication model. An efficient NoC design methodology is based upon several key design choices, such as network topology selection, a good routing policy, and efficient application-to-NoC mapping. In this paper, a novel off-line, non-preemptive, static Traffic Aware Scheduling (TAS) policy is proposed for hard NoC platforms. The proposed scheduling policy maps the application onto the NoC architecture while keeping track of the network traffic that is generated with every resource and communication path allocation. The main contribution of the proposed algorithm is that the application mapping and scheduling processes are handled together, and inter-process communication latencies are dynamically calculated based on the application mapping and PE interaction. Our TAS algorithm has been evaluated for various design metrics such as application completion time, resource utilization, and task throughput. Simulation results show significant improvements over traditional approaches.

Journal of Systems Architecture | 2007

An efficient variable partitioning approach for functional decomposition of circuits

Venkatesan Muthukumar; Robert J. Bignall; Henry Selvaraj

Functional decomposition is the process of splitting a complex circuit into smaller sub-circuits. There are two major strategies in decomposition, namely serial and parallel decomposition. In serial decomposition, the complex function is represented as a truth table over its support set variables, which are partitioned into free and bound set variables. The minterms corresponding to the bound set variables are represented as an equivalent function called the predecessor function. Equivalent minterms of the bound set variables are assigned an output code. The assigned output codes and the free set variable minterms are represented as the successor function. Serial decomposition is further categorized into disjoint and non-disjoint decomposition, according to whether the free and bound set variables are disjoint or non-disjoint. This paper deals with the problem of determining the set of best free and bound variables (the variable partitioning problem) for disjoint serial decomposition. Variable partitioning is the first step in the decomposition process. An efficient variable partitioning algorithm is one that determines the set of all free and bound set variables that satisfy the decomposition theorem in minimal time and by exploring the search space effectively. This allows the decomposition algorithm to determine the best variable partition of a function, one that results in smaller decomposed functions with the maximum number of don't-cares in these functions. Classical approaches to determining the best free and bound sets use exhaustive search methods. The time and memory requirements of such approaches are exponential or super-exponential. A novel heuristic search approach is proposed to determine the set of good variable partitions in minimal time by minimally exploring the search space.
Two heuristics are employed in the proposed search approach: (1) an r-admissibility based heuristic, or pruned breadth first search (PBFS) approach, and (2) an information relation based heuristic, or improved pruned breadth first search (IPBFS) approach. The r-admissibility based heuristic is based on r-partition characteristics of the free and bound set variables. The information relation and measure based heuristic is based on the information relationships of free and bound set variables, which are expressed as r-partition heuristics. The proposed variable partition search approach has been successfully implemented and tested with MCNC and Espresso benchmarks, and the results indicate that the time complexity is comparable to that of the r-admissible heuristic algorithm and the quality of solution is comparable to that of the exact variable partitioning algorithm. A comparison of the PBFS and IPBFS heuristics for certain benchmarks is also discussed in this paper.
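
To make the variable partitioning problem concrete, the sketch below evaluates one candidate bound set using the classical decomposition-chart test: the columns of the chart are indexed by bound set assignments, and the number of distinct columns (the column multiplicity) determines how many output codes the predecessor function needs. This is the textbook check for completely specified functions, not the PBFS or IPBFS heuristics of the paper; the names `column_multiplicity` and `predecessor_outputs` and the example function are assumptions for illustration.

```python
from itertools import product
from math import ceil, log2


def column_multiplicity(func, num_vars, bound_set):
    """Count distinct decomposition-chart columns for a given bound set.

    func takes num_vars binary arguments; bound_set lists variable indices.
    """
    free_set = [i for i in range(num_vars) if i not in bound_set]
    columns = set()
    for bound_bits in product((0, 1), repeat=len(bound_set)):
        column = []
        for free_bits in product((0, 1), repeat=len(free_set)):
            inputs = [0] * num_vars
            for idx, bit in zip(bound_set, bound_bits):
                inputs[idx] = bit
            for idx, bit in zip(free_set, free_bits):
                inputs[idx] = bit
            column.append(func(*inputs))
        columns.add(tuple(column))   # equal columns collapse to one code
    return len(columns)


def predecessor_outputs(multiplicity):
    """Number of predecessor outputs needed to encode the distinct columns."""
    return max(1, ceil(log2(multiplicity)))


def example(a, b, c):
    # Hypothetical 3-input function: (a XOR b) AND c.
    return (a ^ b) & c


good = column_multiplicity(example, 3, [0, 1])   # bound set {a, b}
poor = column_multiplicity(example, 3, [0, 2])   # bound set {a, c}
```

A partition is useful only when the predecessor needs fewer outputs than it has bound inputs. Here the bound set {a, b} needs one output for two inputs, while {a, c} needs two outputs for two inputs and gains nothing. An exhaustive search repeats this check for every candidate bound set, which is the cost the heuristics aim to avoid.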

digital systems design | 2004

Hybrid greedy/face routing for ad-hoc sensor network

J. Li; Laxmi Gewali; Henry Selvaraj; Venkatesan Muthukumar

Constructing a route connecting a source node to a destination node is one of the central problems in sensor networks and mobile computing. In this paper we consider the application of a data structure called ExtDCEL that can be very effective in implementing several location-based routing algorithms that include face routing, greedy most forward routing, and hybrid routing. The ExtDCEL data structure is very convenient for representing unit disk graphs as well as planar components. Many network properties can be computed locally when the graph is represented in ExtDCEL. In particular, we show how the popular hybrid greedy face routing algorithm can be implemented efficiently by using ExtDCEL.
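
The greedy half of a hybrid greedy/face scheme can be sketched in a few lines. The code below assumes a unit disk graph built from node coordinates and forwards each packet to the neighbor closest to the destination, reporting failure when it reaches a local minimum (the point where a hybrid scheme would switch to face routing). It does not use the ExtDCEL data structure from the paper, and the function names, the dictionary-of-coordinates input, and the distance-based forwarding rule are assumptions.

```python
from math import dist


def unit_disk_neighbors(positions, radius=1.0):
    """Map each node to the nodes within transmission radius of it."""
    return {
        u: [v for v in positions
            if v != u and dist(positions[u], positions[v]) <= radius]
        for u in positions
    }


def greedy_route(positions, source, target, radius=1.0):
    """Return (path, delivered) for greedy geographic forwarding."""
    neighbors = unit_disk_neighbors(positions, radius)
    path = [source]
    current = source
    while current != target:
        best = min(
            neighbors[current],
            key=lambda v: dist(positions[v], positions[target]),
            default=None,
        )
        # Stop at a local minimum: no neighbor is strictly closer to the target.
        if best is None or (dist(positions[best], positions[target])
                            >= dist(positions[current], positions[target])):
            return path, False
        path.append(best)
        current = best
    return path, True


# Hypothetical node layouts: a connected chain, and a void next to the source.
chain = {"s": (0.0, 0.0), "a": (0.8, 0.0), "b": (1.6, 0.0), "t": (2.4, 0.0)}
void = {"s": (0.0, 0.0), "a": (-0.9, 0.0), "t": (3.0, 0.0)}
chain_result = greedy_route(chain, "s", "t")
void_result = greedy_route(void, "s", "t")
```

Because every hop strictly reduces the distance to the destination, the loop always terminates; the failure case is exactly where face routing on a planar subgraph would take over.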

international conference on information technology: new generations | 2012

Loopback Virtual Channel Router Architecture for Network on Chip

Jaya Suseela; Venkatesan Muthukumar

Low-level design parameters - such as router micro-architecture, switching techniques and packet sizes - have a huge impact on the performance and cost of Network on Chip (NoC) implementations. This work proposes a router micro-architecture that has a mechanism for buffer structure, allocation, and arbitration, which minimizes latency, area overhead of the router, and power consumption. The proposed router micro-architecture can be adapted to various switching techniques used in current NoC implementations, and is independent of the topology. The architecture was developed, simulated, and synthesized using a hardware description language (HDL). The performance of the architecture was evaluated for hotspot congestion scenarios and compared to classical router micro-architectures. Compared to classical router micro-architectures, this architecture achieves better performance in area, latency, and power.

international conference on information technology: new generations | 2011

Efficient Scheduling Algorithms for MpSoC Systems

Bisrat Tafesse; Ashwini Raina; Jaya Suseela; Venkatesan Muthukumar

MpSoCs have been proposed as a viable solution for present-day applications, which place growing demands on data processing and real-time processing under power constraints. Scheduling and mapping are two important steps in the MpSoC design process that facilitate an efficient MpSoC system. This work presents two scheduling algorithms: 1) the Performance Driven Scheduling (PDS) algorithm (for bus-based MpSoC) and 2) the Traffic Aware Scheduling (TAS) algorithm (for NoC-based MpSoC). The algorithms are tested on synthetic task graphs, and their performance is evaluated and discussed.

Vlsi Design | 2013

Framework for simulation of heterogeneous MpSoC for design space exploration

Bisrat Tafesse; Venkatesan Muthukumar

Due to the ever-growing requirements in high performance data computation, multiprocessor systems have been proposed to solve the bottlenecks in uniprocessor systems. Developing efficient multiprocessor systems requires effective exploration of design choices like application scheduling, mapping, and architecture design. Also, fault tolerance in multiprocessors needs to be addressed. With the advent of nanometer-process technology for chip manufacturing, realization of multiprocessors on SoC (MpSoC) is an active field of research. Developing efficient low-power, fault-tolerant task scheduling and mapping techniques for MpSoCs requires optimized algorithms that consider the various scenarios inherent in multiprocessor environments. Therefore, there is a need to develop a simulation framework to explore and evaluate new algorithms on multiprocessor systems. This work proposes a modular framework for the exploration and evaluation of various design algorithms for MpSoC systems. This work also proposes new multiprocessor task scheduling and mapping algorithms for MpSoCs. These algorithms are evaluated using the developed simulation framework. The paper also proposes a dynamic fault-tolerant (FT) scheduling and mapping algorithm for robust application processing. The proposed algorithms consider optimizing the power as one of the design constraints. The framework for heterogeneous multiprocessor simulation was developed using the SystemC/C++ language. Various design variations were implemented and evaluated using standard task graphs. Performance metrics are evaluated and discussed for various design scenarios.

international conference on information technology: new generations | 2014

A CUDA Based Implementation of Locally-and Feature-Adaptive Diffusion Based Image Denoising Algorithm

Ali Pour Yazdanpanah; Ajay K. Mandava; Emma E. Regentova; Venkatesan Muthukumar; George Bebis

In this paper we introduce a parallel implementation of the locally- and feature-adaptive diffusion based (LFAD) method for image denoising using the NVIDIA CUDA framework and graphics processing units (GPUs). LFAD is a novel method for removing additive white Gaussian (AWG) noise in images and is reported to yield high quality denoised images [1]. It approaches each image region separately and uses a different number of nonlinear anisotropic diffusion iterations for each region to attain the best peak signal to noise ratio (PSNR). The inverse difference moment (IDM) feature is embedded into a modified diffusion function. The method has attained the highest performance in the class of advanced diffusion based methods and is competitive with state-of-the-art methods; however, it is computationally intensive when executed on a general purpose CPU. To improve the performance, we implemented it using the CUDA computational framework. In order to minimize GPU kernel access to the global memory, we use the shared memory and the texture memory per multiprocessor. The performance of the GPU implementation of LFAD has been tested on standard benchmark images. We demonstrate that with a single NVIDIA Tesla C2050 GPU we can speed up the sequential CPU implementation by a factor of 13 to 20 in most cases.
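
Since the method selects the number of diffusion iterations per region by PSNR, a minimal sketch of that metric may help. The code below uses the standard definition, PSNR = 10 log10(peak^2 / MSE), for images stored as lists of rows; the function name `psnr`, the default peak value of 255 (8-bit images), and the sample data are assumptions, and nothing here reproduces the LFAD diffusion itself or its CUDA kernels.

```python
from math import log10


def psnr(reference, denoised, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    squared_error = 0.0
    count = 0
    for ref_row, den_row in zip(reference, denoised):
        for ref_pixel, den_pixel in zip(ref_row, den_row):
            squared_error += (ref_pixel - den_pixel) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return float("inf")   # identical images
    return 10.0 * log10(peak ** 2 / mse)


# Hypothetical 2x2 images: every pixel is off by 25.5 grey levels.
clean = [[100.0, 100.0], [100.0, 100.0]]
noisy = [[125.5, 74.5], [125.5, 74.5]]
score = psnr(clean, noisy)
```

With an error of 25.5 on every pixel the mean squared error is one hundredth of the squared peak, so the score works out to about 20 dB.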

international conference on information technology | 2007

Cell-based Distributed Addressing Technique Using Clustered Backbone Approach

Ashwini Raina; Venkatesan Muthukumar; Laxmi Gewali

Wireless sensor networks are characterized by their low data rates, autonomous functioning, and longevity. Low power consumption in such energy-constrained networks is a central issue, and we target this design parameter by proposing and implementing a novel cell-based distributed addressing technique (C-DAT) over a k-power level sensor network. This paper introduces a clustered backbone approach for data forwarding and shows how data aggregation can be achieved by implementing a MIN query over the network. The proposed scheme is evaluated for power consumption during addressing and various network dynamics using a Java based network simulator. The results obtained highlight some key variable power-level model parameters and power consumption trends.
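
In-network aggregation of a MIN query can be sketched as a recursion over a forwarding tree: each node combines its own reading with the minima reported by its children and sends a single value upward. The tree layout, node names, and readings below are hypothetical, and the sketch does not model C-DAT addressing, cells, or power levels.

```python
def aggregate_min(tree, readings, node):
    """Return the minimum reading in the subtree rooted at node.

    tree maps a node to its list of children; leaves may be absent from it.
    """
    value = readings[node]
    for child in tree.get(node, []):
        # Each child reports one aggregated value instead of all raw readings.
        value = min(value, aggregate_min(tree, readings, child))
    return value


# Hypothetical backbone: a sink, two cluster heads, and three member nodes.
backbone = {"sink": ["h1", "h2"], "h1": ["n1", "n2"], "h2": ["n3"]}
readings = {"sink": 30, "h1": 22, "h2": 27, "n1": 25, "n2": 19, "n3": 31}
network_min = aggregate_min(backbone, readings, "sink")
```

The saving comes from message count: every non-root node transmits exactly one value, regardless of how many readings lie below it.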

international conference on information technology coding and computing | 2005

Construction of power-aware diameter-reduced broadcast trees

S. Veeravalli; Laxmi Gewali; Venkatesan Muthukumar

Constructing a power-aware broadcast tree is a central problem in sensor networks. The problem of constructing a power-minimized broadcast tree in a sensor network is known to be NP-hard. Known approximation algorithms for constructing power-reduced broadcast trees may produce trees with large diameter, which are not suitable for reducing transmission error. In this paper, a modification of the standard BIP algorithm, called P-BIP, is presented that can be used to generate power-reduced broadcast trees with small diameter. The proposed algorithm has been implemented and tested using a Java based simulation tool.
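
For context, the standard BIP (Broadcast Incremental Power) heuristic that P-BIP modifies can be sketched as a Prim-like loop: at each step, add the uncovered node that can be reached with the smallest increase in some tree node's transmission power. The sketch below covers only this baseline, not the diameter-reducing modification of the paper; the function name, the distance^alpha power model with alpha = 2, and the node coordinates are assumptions.

```python
from math import dist


def bip_broadcast_tree(positions, source, alpha=2.0):
    """Build a broadcast tree with the standard BIP heuristic.

    Returns (parent, power): each node's parent and transmission power.
    """
    power = {node: 0.0 for node in positions}
    parent = {source: None}
    while len(parent) < len(positions):
        best = None
        for u in parent:                      # nodes already in the tree
            for v in positions:
                if v in parent:
                    continue
                needed = dist(positions[u], positions[v]) ** alpha
                extra = needed - power[u]     # incremental power for u
                if best is None or extra < best[0]:
                    best = (extra, u, v, needed)
        _, u, v, needed = best
        parent[v] = u
        power[u] = max(power[u], needed)
    return parent, power


# Hypothetical layout: four nodes on a line, one unit apart.
nodes = {"s": (0.0, 0.0), "a": (1.0, 0.0), "b": (2.0, 0.0), "c": (3.0, 0.0)}
tree_parent, tree_power = bip_broadcast_tree(nodes, "s")
total_power = sum(tree_power.values())
```

On this layout BIP relays hop by hop for a total power of 3, versus 9 for a single direct transmission from the source, but the resulting tree is a chain of maximal depth. That trade-off between power and diameter is the one the abstract describes.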
