Shubhabrata Sengupta | Researchain

Publication

Featured research published by Shubhabrata Sengupta.

International Conference on Computer Graphics and Interactive Techniques | 2007

Scan primitives for GPU computing

Shubhabrata Sengupta; Mark J. Harris; Yao Zhang; John D. Owens

The scan primitives are powerful, general-purpose data-parallel primitives that are building blocks for a broad range of applications. We describe GPU implementations of these primitives, specifically an efficient formulation and implementation of segmented scan, on NVIDIA GPUs using the CUDA API. Using the scan primitives, we show novel GPU implementations of quicksort and sparse matrix-vector multiply, and analyze the performance of the scan primitives, several sort algorithms that use the scan primitives, and a graphical shallow-water fluid simulation using the scan framework for a tridiagonal matrix solver.
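To make the primitives concrete, here is a minimal sequential reference in Python for an exclusive scan and a segmented inclusive scan, both using addition. It only pins down what the operations compute; it is not the parallel CUDA formulation from the paper, and the function names and the head-flag encoding of segments are assumptions made for illustration.

```python
from typing import List


def exclusive_scan(values: List[int]) -> List[int]:
    """Exclusive prefix sum: out[i] is the sum of values[0..i-1]."""
    out = []
    running = 0
    for v in values:
        out.append(running)
        running += v
    return out


def segmented_inclusive_scan(values: List[int], head_flags: List[int]) -> List[int]:
    """Inclusive prefix sum that restarts wherever head_flags[i] is 1.

    head_flags marks the first element of each segment (an assumed encoding).
    """
    out = []
    running = 0
    for v, flag in zip(values, head_flags):
        if flag:
            running = 0  # a new segment begins here
        running += v
        out.append(running)
    return out


data = [3, 1, 7, 0, 4, 1, 6, 3]
flags = [1, 0, 0, 1, 0, 0, 1, 0]
print(exclusive_scan(data))
print(segmented_inclusive_scan(data, flags))
```

The last value of each segment is that segment's total, which is how a segmented scan can sum the per-row products in a sparse matrix-vector multiply.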

Computer Graphics Forum | 2009

Fast BVH Construction on GPUs

Christian Lauterbach; Michael Garland; Shubhabrata Sengupta; David Luebke; Dinesh Manocha

We present two novel parallel algorithms for rapidly constructing bounding volume hierarchies on manycore GPUs. The first uses a linear ordering derived from spatial Morton codes to build hierarchies extremely quickly and with high parallel scalability. The second is a top‐down approach that uses the surface area heuristic (SAH) to build hierarchies optimized for fast ray tracing. Both algorithms are combined into a hybrid algorithm that removes existing bottlenecks in GPU construction performance and scalability, leading to significantly decreased build time. The resulting hierarchies are close in quality to optimized SAH hierarchies, but the construction process is substantially faster, leading to a significant net benefit when both construction and traversal cost are accounted for. Our preliminary results show that current GPU architectures can compete with CPU implementations of hierarchy construction running on multicore systems. In practice, we can construct hierarchies of models with up to several million triangles and use them for fast ray tracing or other applications.
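The linear ordering in the first algorithm can be illustrated with a small Python sketch that interleaves the bits of quantized coordinates into a 3D Morton code and sorts primitives by that code. The 10-bit quantization, the unit-cube input range, and the names `morton3d`, `quantize`, and `morton_order` are assumptions for illustration; the abstract does not specify these details.

```python
from typing import List, Sequence, Tuple


def morton3d(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the low `bits` bits of x, y, z (x takes the highest position)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i + 2)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i)
    return code


def quantize(coord: float, bits: int = 10) -> int:
    """Map a coordinate in [0, 1) to an integer grid cell, clamping out-of-range input."""
    scaled = int(coord * (1 << bits))
    return min(max(scaled, 0), (1 << bits) - 1)


def morton_order(centroids: Sequence[Tuple[float, float, float]], bits: int = 10) -> List[int]:
    """Return primitive indices sorted by the Morton code of their centroid."""
    keys = [
        morton3d(quantize(x, bits), quantize(y, bits), quantize(z, bits), bits)
        for x, y, z in centroids
    ]
    return sorted(range(len(centroids)), key=keys.__getitem__)


centroids = [(0.9, 0.9, 0.9), (0.1, 0.1, 0.1), (0.6, 0.1, 0.1)]
print(morton_order(centroids))
```

Primitives that are close in space tend to be close in this ordering, so contiguous ranges of the sorted list can serve as nodes of a hierarchy.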

ACM Transactions on Graphics | 2006

Glift: Generic, efficient, random-access GPU data structures

Aaron E. Lefohn; Shubhabrata Sengupta; Joe Kniss; Robert Strzodka; John D. Owens

This article presents Glift, an abstraction and generic template library for defining complex, random-access graphics processor (GPU) data structures. Like modern CPU data structure libraries, Glift enables GPU programmers to separate algorithms from data structure definitions, thereby greatly simplifying algorithmic development and enabling reusable and interchangeable data structures. We characterize a large body of previously published GPU data structures in terms of our abstraction and present several new GPU data structures. The structures, a stack, quadtree, and octree, are explained using simple Glift concepts and implemented using reusable Glift components. We also describe two applications of these structures not previously demonstrated on GPUs: adaptive shadow maps and octree three-dimensional paint. Last, we show that our example Glift data structures perform comparably to handwritten implementations while requiring only a fraction of the programming effort.

International Conference on Computer Graphics and Interactive Techniques | 2009

Real-time parallel hashing on the GPU

Dan A. Alcantara; Andrei Sharf; Fatemeh Abbasinejad; Shubhabrata Sengupta; Michael Mitzenmacher; John D. Owens; Nina Amenta

We demonstrate an efficient data-parallel algorithm for building large hash tables of millions of elements in real-time. We consider two parallel algorithms for the construction: a classical sparse perfect hashing approach, and cuckoo hashing, which packs elements densely by allowing an element to be stored in one of multiple possible locations. Our construction is a hybrid approach that uses both algorithms. We measure the construction time, access time, and memory usage of our implementations and demonstrate real-time performance on large datasets: for 5 million key-value pairs, we construct a hash table in 35.7 ms using 1.42 times as much memory as the input data itself, and we can access all the elements in that hash table in 15.3 ms. For comparison, sorting the same data requires 36.6 ms, but accessing all the elements via binary search requires 79.5 ms. Furthermore, we show how our hashing methods can be applied to two graphics applications: 3D surface intersection for moving data and geometric hashing for image matching.
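The cuckoo hashing idea, where each key has more than one candidate slot and an insertion may evict an existing entry, can be sketched sequentially in Python. This is not the paper's parallel hybrid construction: the two-subtable layout, the linear hash functions, the eviction limit, the rebuild-on-failure loop, and all names below are assumptions for illustration, and keys are assumed to be distinct integers.

```python
import random
from typing import List, Optional, Tuple

PRIME = 4294967291  # a prime just below 2**32, used by the assumed hash functions


class CuckooTable:
    def __init__(self, size: int, rng: random.Random, max_evictions: int = 64):
        self.size = size
        self.max_evictions = max_evictions
        self.tables: List[List[Optional[Tuple[int, int]]]] = [[None] * size, [None] * size]
        # One (a, b) pair per subtable for h(k) = ((a*k + b) mod PRIME) mod size.
        self.consts = [(rng.randrange(1, PRIME), rng.randrange(PRIME)) for _ in range(2)]

    def _slot(self, which: int, key: int) -> int:
        a, b = self.consts[which]
        return ((a * key + b) % PRIME) % self.size

    def try_insert(self, key: int, value: int) -> bool:
        """Insert, evicting occupants to their other subtable; False if we give up."""
        entry = (key, value)
        which = 0
        for _ in range(self.max_evictions):
            slot = self._slot(which, entry[0])
            evicted = self.tables[which][slot]
            self.tables[which][slot] = entry
            if evicted is None:
                return True
            entry = evicted
            which = 1 - which  # the evicted entry tries its slot in the other subtable
        return False

    def lookup(self, key: int) -> Optional[int]:
        """At most two probes: one candidate slot per subtable."""
        for which in range(2):
            entry = self.tables[which][self._slot(which, key)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return None


def build(pairs: List[Tuple[int, int]], size: int, seed: int = 0) -> CuckooTable:
    """Retry with fresh hash constants until every pair is placed."""
    rng = random.Random(seed)
    while True:
        table = CuckooTable(size, rng)
        if all(table.try_insert(k, v) for k, v in pairs):
            return table


pairs = [(k, k * k) for k in range(0, 1000, 7)]
table = build(pairs, size=256)
print(table.lookup(49), table.lookup(3))
```

Because a key can only live in one of its candidate slots, a lookup touches a bounded number of locations regardless of table size.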

ACM Transactions on Graphics | 2007

Resolution-matched shadow maps

Aaron E. Lefohn; Shubhabrata Sengupta; John D. Owens

This article presents resolution-matched shadow maps (RMSM), a modified adaptive shadow map (ASM) algorithm, that is practical for interactive rendering of dynamic scenes. Adaptive shadow maps, which build a quadtree of shadow samples to match the projected resolution of each shadow texel in eye space, offer a robust solution to projective and perspective aliasing in shadow maps. However, their use for interactive dynamic scenes is plagued by an expensive iterative edge-finding algorithm that takes a highly variable amount of time per frame and is not guaranteed to converge to a correct solution. This article introduces a simplified algorithm that is up to ten times faster than ASMs, has more predictable performance, and delivers more accurate shadows. Our main contribution is the observation that it is more efficient to forgo the iterative refinement analysis in favor of generating all shadow texels requested by the pixels in the eye-space image. The practicality of this approach is based on the insight that, for surfaces continuously visible from the eye, adjacent eye-space pixels map to adjacent shadow texels in quadtree shadow space. This means that the number of contiguous regions of shadow texels (which can be efficiently generated with a rasterizer) is proportional to the number of continuously visible surfaces in the scene. Moreover, these regions can be coalesced to further reduce the number of render passes required to shadow an image. The secondary contribution of this paper is demonstrating the design and use of data-parallel algorithms inseparably mixed with traditional graphics programming to implement a novel interactive rendering algorithm. For the scenes described in this paper, we achieve 60-80 frames per second on static scenes and 20-60 frames per second on dynamic scenes for 512^2 and 1024^2 images with a maximum effective shadow resolution of 32,768^2 texels.

Computer Graphics Forum | 2009

Out-of-core Data Management for Path Tracing on Hybrid Resources

Brian Budge; Tony Bernardin; Jeff A. Stuart; Shubhabrata Sengupta; Kenneth I. Joy; John D. Owens

We present a software system that enables path‐traced rendering of complex scenes. The system consists of two primary components: an application layer that implements the basic rendering algorithm, and an out‐of‐core scheduling and data‐management layer designed to assist the application layer in exploiting hybrid computational resources (e.g., CPUs and GPUs) simultaneously. We describe the basic system architecture, discuss design decisions of the system's data‐management layer, and outline an efficient implementation of a path tracer application, where GPUs perform functions such as ray tracing, shadow tracing, importance‐driven light sampling, and surface shading. The use of GPUs speeds up the runtime of these components by factors ranging from two to twenty, resulting in a substantial overall increase in rendering speed. The path tracer scales well with respect to CPUs, GPUs and memory per node as well as scaling with the number of nodes. The result is a system that can render large complex scenes with strong performance and scalability.

International Conference on Computer Graphics and Interactive Techniques | 2005

Dynamic adaptive shadow maps on graphics hardware

Aaron E. Lefohn; Shubhabrata Sengupta; Joe Kniss; Robert Strzodka; John D. Owens

We present a novel implementation of adaptive shadow maps (ASMs) that performs all shadow lookups and scene analysis on the GPU, enabling interactive rendering with ASMs while moving both the light and camera. Adaptive shadow maps offer a rigorous solution to projective and perspective shadow map aliasing while maintaining the simplicity of a purely image-based technique. The complexity of the ASM data structure, however, has prevented full GPU-based implementations until now. Our approach uses an entirely GPU-based data structure and a blend of graphics and GPU stream programming. We support shadow map effective resolutions up to 131,072 x 131,072 and, unlike previous implementations, provide smooth transitions between resolution levels by trilinearly filtering (mipmapping) the shadow lookups.

GPU Computing Gems Jade Edition | 2012

Building an Efficient Hash Table on the GPU

Dan A. Alcantara; Vasily Volkov; Shubhabrata Sengupta; Michael Mitzenmacher; John D. Owens; Nina Amenta

This chapter describes a straightforward algorithm for parallel hash table construction on the graphics processing unit (GPU). It constructs the table in global memory and uses atomic operations to detect and resolve collisions. Construction and retrieval performance are limited almost entirely by the time required for these uncoalesced memory accesses, which is linear in the total number of accesses, so the design goal is to minimize the average number of accesses per insertion or lookup. In fact, it guarantees a constant worst-case bound on the number of accesses per lookup. Further, one alternative to using a hash table is to store the data in a sorted array and access it via binary search. Sorted arrays can be built very quickly using radix sort because the memory access pattern of radix sort is very localized, allowing the GPU to coalesce many memory accesses and reduce their cost significantly. However, binary search, which incurs as many as lg(N) probes in the worst case, is much less efficient than hash table lookup. GPU hash tables are useful for interactive graphics applications, where they are used to store sparse spatial data, usually 3D models that are voxelized on a uniform grid. Rather than store the entire voxel grid, which is mostly empty, a hash table is built to hold just the occupied voxels.
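The sparse voxel use case in the last two sentences can be sketched in a few lines of Python, with a built-in `dict` standing in for the GPU hash table. The function name `voxelize`, the point-list input, and the choice to store a per-voxel point count are assumptions for illustration, not details from the chapter.

```python
from typing import Dict, Iterable, Tuple

Voxel = Tuple[int, int, int]


def voxelize(points: Iterable[Tuple[float, float, float]], cell_size: float) -> Dict[Voxel, int]:
    """Map each point to its grid cell and keep only the occupied cells."""
    occupied: Dict[Voxel, int] = {}
    for x, y, z in points:
        key = (int(x // cell_size), int(y // cell_size), int(z // cell_size))
        occupied[key] = occupied.get(key, 0) + 1  # count points per occupied voxel
    return occupied


points = [(0.1, 0.1, 0.1), (0.2, 0.3, 0.1), (5.5, 0.0, 2.9)]
print(voxelize(points, cell_size=1.0))
```

Only the occupied cells consume memory here, whereas a dense grid covering the same bounding box would also allocate every empty cell.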

International Conference on Computer Graphics and Interactive Techniques | 2005

Octree textures on graphics hardware

Joe Kniss; Aaron E. Lefohn; Robert Strzodka; Shubhabrata Sengupta; John D. Owens

We implement an interactive 3D painting application that stores paint in an octree-like GPU-based adaptive data structure. Interactive painting of complex or unparameterized surfaces is an important problem in the digital film community. Many models used in production environments are either difficult to parameterize or are unparameterized implicit surfaces. We address this problem with a system that allows interactive 3D painting of complex, unparameterized models. The included movie demonstrates interactive painting of an 817k-polygon model with effective paint resolutions varying between 64^3 and 2048^3. Our implementation differs from previous work in two important ways: first, it uses an adaptive data structure implemented entirely on the GPU, and second, it enables interactive performance with high quality by supporting quadlinear (mipmapped) filtering and fast, constant-time data accesses.

Archive | 2008