Publication


Featured research published by V. K. Prasanna Kumar.


Journal of the ACM | 1984

Information Transfer in Distributed Computing with Applications to VLSI

Joseph JáJá; V. K. Prasanna Kumar

Simple general lower bound techniques are developed for measuring the amount of interprocessor communication required in distributed computing. Optimal bounds are shown for many problems, such as integer multiplication, integer division, matrix squaring, matrix inversion, solving a linear system of equations, and computing square roots. Using these techniques, one can unify and strengthen the area-time trade-off results known in the literature. Many new trade-off results are also shown in several of the existing models. Categories and Subject Descriptors: B.7.1 [Integrated Circuits]: Types and Design Styles--VLSI (very-large-scale integration); F.2.3 [Analysis of Algorithms and Problem Complexity]: Trade-offs among Complexity Measures. General Terms: Algorithms, Theory.


Graphical Models / Graphical Models and Image Processing / Computer Vision, Graphics, and Image Processing | 1990

Efficient histogramming on hypercube SIMD machines

Wei Ming Lin; V. K. Prasanna Kumar

This paper considers the histogramming problem on a hypercube. An N-PE hypercube is used to process an N^(1/2) × N^(1/2) digitized image in which each pixel has a gray-level value between 0 and M − 1. In general, M, the range of gray-level values, is much smaller than N, the number of pixels being processed. Our algorithm generates the histogram of the image in O(log M · log N) time using radix sort and efficient data movement operations. This technique can be implemented on butterfly, shuffle-exchange, and fat pyramid organizations.
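The sort-then-count idea behind this abstract can be illustrated sequentially. The sketch below is only an assumption-laden illustration, not the paper's hypercube algorithm: it uses Python's built-in sort rather than a parallel radix sort, and the names `histogram_by_sorting`, `image`, `flat`, and `hist` are hypothetical. Once the pixels are sorted, equal gray levels sit next to each other, so each histogram bin is just the length of one run.

```python
from itertools import groupby


def histogram_by_sorting(pixels, num_levels):
    """Return counts of gray levels 0..num_levels-1 by sorting, then measuring runs.

    Sequential illustration only: the built-in sort stands in for the
    parallel radix sort mentioned in the abstract.
    """
    counts = [0] * num_levels
    # After sorting, identical gray levels are contiguous, so groupby
    # yields exactly one run per distinct level.
    for level, run in groupby(sorted(pixels)):
        counts[level] = sum(1 for _ in run)
    return counts


# A 4 x 4 image (N = 16 pixels) with M = 4 gray levels.
image = [
    [0, 1, 1, 3],
    [2, 1, 0, 3],
    [3, 3, 1, 0],
    [2, 2, 1, 1],
]
flat = [pixel for row in image for pixel in row]
hist = histogram_by_sorting(flat, 4)
print(hist)
```

In a parallel setting the run lengths would be obtained from the positions of run boundaries rather than by a sequential scan; that step is not shown here.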


Journal of Parallel and Distributed Computing | 1989

Designing linear systolic arrays

V. K. Prasanna Kumar; Yu-Chen Tsai

We develop a simple mapping technique to design linear systolic arrays. The basic idea of our technique is to map the computations of a certain class of two-dimensional systolic arrays onto one-dimensional arrays. Using this technique, systolic algorithms are derived for problems such as matrix multiplication and transitive closure on linearly connected arrays of PEs with constant I/O bandwidth. Compared to known designs in the literature, our technique leads to modular systolic arrays with constant hardware in each PE, few control lines, lexicographic data input/output, and improved delay time. The unidirectional flow of control and data in our design assures implementation of the linear array in the known fault models of Wafer Scale Integration.


CVGIP: Image Understanding | 1991

Orthogonal multiprocessor sharing memory with an enhanced mesh for integrated image understanding

Kai Hwang; Hussein M. Alnuweiri; V. K. Prasanna Kumar; Dongseung Kim

This paper proposes a new parallel architecture, which has the potential to support low-level image processing as well as intermediate and high-level vision analysis tasks efficiently. The integrated architecture consists of an SIMD mesh of processors enhanced with multiple broadcast buses, an MIMD multiprocessor with orthogonal access buses, and a two-dimensional shared memory array. Low-level image processing is performed on the mesh processor, while intermediate and high-level vision analysis is performed on the orthogonal multiprocessor. The interaction between the two levels is supported by a common shared memory. Concurrent computations and I/O are made possible by partitioning the memory into disjoint spaces so that each processor system can access a different memory space. To illustrate the power of such a two-level system, we present efficient parallel algorithms for a variety of problems from low-level image processing to high-level vision. Representative problems include matrix-based computations, histogramming and key counting operations, image component labeling, pyramid computations, Hough transform, pattern clustering, and scene labeling. Through computational complexity analysis, we show that the integrated architecture meets the processing requirements of most image understanding tasks.


IEEE Transactions on Computers | 1991

Optimal VLSI sorting with reduced number of processors

Hussein M. Alnuweiri; V. K. Prasanna Kumar

A new parallel architecture is presented which has p processors and N = n^2 memory locations, each consisting of 2s bits. The proposed organization can sort N s-bit numbers, where s = O((1 + ε) log N), ε > 0, in time t = O((N log N)/p), for p in the range 1 to √N log √N. This result is optimal in the sense that the product of the number of processors and the parallel sorting time is equal to the sequential complexity of sorting. Also, the constant factors involved in the algorithm complexity are relatively small. When p = √N log √N, the time required for sorting N numbers on the proposed organization is O(√N), which is the same time required by a two-dimensional mesh array, a mesh of trees organization, or a pyramid computer, all with O(N) processors, to sort N numbers.
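The two complexity claims in this abstract follow from short arithmetic, worked out below as a sketch. It assumes that the "sequential complexity of sorting" refers to the O(N log N) bound for comparison-based sorting; the abstract does not state this explicitly. The symbols p, t, and N are those of the abstract.

```latex
\begin{align*}
p \cdot t &= p \cdot O\!\left(\frac{N \log N}{p}\right) = O(N \log N), \\
t\,\Big|_{\,p = \sqrt{N}\log\sqrt{N}}
  &= O\!\left(\frac{N \log N}{\sqrt{N}\,\log\sqrt{N}}\right)
   = O\!\left(\frac{N \log N}{\tfrac{1}{2}\sqrt{N}\,\log N}\right)
   = O\!\left(\sqrt{N}\right).
\end{align*}
```

The second line uses the identity log √N = (1/2) log N, so the logarithmic factors cancel up to a constant.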


Parallel Computing | 1989

An efficient VLSI architecture with applications to geometric problems

Hussein M. Alnuweiri; V. K. Prasanna Kumar

We present a parallel organization with a reduced number of processors and special communication features for efficient solutions to problems in computational geometry. The organization has n processors operating in synchronous mode with row and column access to an n × n array of memory modules. The organization has a simple regular structure and can be implemented in VLSI on a single chip or using a limited chip set. We develop fast parallel algorithms for computing several geometric properties of a set of n^2 points in the plane. We present O(n log n) time parallel algorithms to compute the convex hull of n^2 points in the plane, to compute the intersection of two convex polygons each having n^2 edges, and to compute the diameter and a smallest enclosing box of a set of n^2 points. All these problems require O(n^2 log n) sequential time. Thus, all our solutions are optimal in the sense that their processor-time product is equal to the sequential complexity of these problems. We also consider the problem of computing nearest neighbors when the n^2 points belong to an n × n digitized image. We show that this problem can be solved on the proposed parallel organization in O(n) time using n PEs, which is the same time taken by a two-dimensional mesh-connected computer with n^2 processors to solve the same problem.


Computer Vision and Pattern Recognition | 1991

Parallel algorithms and architectures for discrete relaxation technique

Wei Ming Lin; V. K. Prasanna Kumar

Three parallel implementations based on three alternate sequential approaches are presented. The first design is a systolic array based on the known sequential method. An execution time of O(n^2 m^2) is achieved with nm processing elements (PEs), with each PE composed of simple logic elements. The second design employs a broadcast bus feature to speed up the execution of an alternate sequential method. Linear speedup is achieved by using nm processing elements. The sequential method has an execution time of O(n^2 m^2) and the proposed parallel design runs in O(nm) time. The third design is a modified approach which is well suited for implementation on general-purpose machines. These designs achieve superior performance compared with the existing designs in terms of their simplicity, execution time, and domain of applications. Using the proposed designs, an efficient parallel implementation of stereo matching based on linear segments as primitives is derived.


Discrete Applied Mathematics | 1992

Perfect latin squares

Katherine Heinrich; Kichul Kim; V. K. Prasanna Kumar

We introduce new latin squares called perfect latin squares which have desirable properties for parallel array access. These squares provide conflict-free access to various subsets of an n^2 × n^2 array using n^2 memory modules. We present a general construction method for building perfect latin squares of order n^2 for all n. Some useful properties of the latin squares built by our construction method for parallel array access are also identified.
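To make "conflict-free access" concrete, the sketch below checks a candidate square rather than constructing one. It rests on an assumption the abstract does not spell out: that a perfect latin square of order n^2 must contain every symbol exactly once in each row, each column, both main diagonals, and each of the n × n main subsquares. Each symbol is read as the memory module holding that array element, so distinct symbols within a subset mean the subset can be fetched without two requests hitting the same module. The function name and both example squares are hypothetical.

```python
def is_perfect_latin_square(square, n):
    """Check the assumed perfect-latin-square conditions for order n*n.

    Every row, column, main diagonal, anti-diagonal, and aligned n x n
    subsquare must contain each of the symbols 0..n*n-1 exactly once.
    """
    size = n * n
    symbols = set(range(size))
    if len(square) != size or any(len(row) != size for row in square):
        return False

    groups = [list(row) for row in square]                                   # rows
    groups += [[square[r][c] for r in range(size)] for c in range(size)]     # columns
    groups.append([square[i][i] for i in range(size)])                       # main diagonal
    groups.append([square[i][size - 1 - i] for i in range(size)])            # anti-diagonal
    for top in range(0, size, n):                                            # main subsquares
        for left in range(0, size, n):
            groups.append(
                [square[top + r][left + c] for r in range(n) for c in range(n)]
            )
    # Each group has exactly `size` cells, so covering all symbols
    # implies that no symbol repeats.
    return all(set(group) == symbols for group in groups)


# Hand-built order-4 example (n = 2) that satisfies every condition.
example = [
    [0, 1, 2, 3],
    [2, 3, 0, 1],
    [3, 2, 1, 0],
    [1, 0, 3, 2],
]

# The cyclic square (i + j) mod 4 is latin but repeats symbols on the diagonal.
cyclic = [[(i + j) % 4 for j in range(4)] for i in range(4)]

print(is_perfect_latin_square(example, 2))
print(is_perfect_latin_square(cyclic, 2))
```

The cyclic square shows why the ordinary latin property is not enough: its rows and columns are conflict-free, but a diagonal access would send two requests to the same module.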


Algorithmica | 1991

Processor-time optimal parallel algorithms for digitized images on mesh-connected processor arrays

Hussein M. Alnuweiri; V. K. Prasanna Kumar

We present processor-time optimal parallel algorithms for several problems on n × n digitized image arrays, on a mesh-connected array having p processors and a memory of size O(n^2) words. The number of processors p can vary over the range [1, n^(3/2)] while providing optimal speedup for these problems. The class of image problems considered here includes labeling the connected components of an image; computing the convex hull, the diameter, and a smallest enclosing box of each component; and computing all closest neighbors. Such problems arise in medium-level vision and require global operations on image pixels. To achieve optimal performance, several efficient data-movement and reduction techniques are developed for the proposed organization.


Computer Vision and Pattern Recognition | 1988

Optimal geometric algorithms on fixed-size linear arrays and scan line arrays

Hussein M. Alnuweiri; V. K. Prasanna Kumar

Optimal parallel solutions are presented to several geometric problems on an n × n image on a fixed-size linear array with p processors, where 1 ≤ p ≤ n. The array model considered here is an abstraction of several linearly connected parallel computers that have been constructed recently. The authors present O(n^2/p) time solutions to several geometric problems which require global transfer of information such as labeling connected regions, computing the convexity and intersections of multiple regions, and computing several distance functions. All the solutions are optimal in the sense that their processor-time product is equal to the sequential complexity of the problems. Limitations of linear arrays in image computations are also discussed by showing that there are certain image problems which can be solved sequentially in O(n^2) time, but require Omega (n/sup 2/3) time on a linear array, irrespective of the number of processors used and the way in which the input image is partitioned among the processors. The authors also show alternate fixed-size array organizations with p processors which can solve the above problems in O(n^2/p) time, for 1 ≤ p ≤ n.

Collaboration


Top Co-Authors

Hussein M. Alnuweiri, University of Southern California
Russ Miller, Hauptman-Woodward Medical Research Institute
Dionisios I. Reisis, National Technical University of Athens
Cauligi S. Raghavendra, University of Southern California
Dongseung Kim, University of Southern California
Kai Hwang, University of Southern California
Manavendra Misra, University of Southern California
Mehrnoosh Mary Eshaghian, University of Southern California
Wei Ming Lin, University of Texas at San Antonio