Paul A. Navrátil | Researchain

Publication

Featured research published by Paul A. Navrátil.

2007 IEEE Symposium on Interactive Ray Tracing | 2007

Dynamic Ray Scheduling to Improve Ray Coherence and Bandwidth Utilization

Paul A. Navrátil; Donald S. Fussell; Calvin Lin; William R. Mark

The performance of full-featured ray tracers has historically been limited by the hardware's floating-point computational power. However, next-generation multi-threaded multi-core architectures promise to provide sufficient CPU throughput to support real-time frame rates. In such systems, limited memory system performance in terms of both on-chip cache and DRAM-to-cache bandwidth is likely to bound overall system performance. This paper presents a novel ray tracing algorithm that both improves cache utilization and reduces DRAM-to-cache bandwidth usage. The key insight is to view ray traversal as a scheduling problem, which allows our algorithm to match ray traversal computations and intersection computations with available system resources. Using a detailed simulator, we show that our algorithm significantly reduces the amount of data brought into the cache in exchange for the small overhead of maintaining the ray schedule. Moreover, our algorithm creates units of work that are more amenable to parallelization than traditional Whitted-style ray tracers.
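The scheduling idea in this abstract can be illustrated with a toy model. The sketch below is not the paper's algorithm or simulator: it assumes each ray is reduced to a hypothetical ordered list of scene-block ids it must visit, and that the cache holds a single block. It only counts how many times a block is loaded when rays are traced one at a time versus when rays are queued per block and the fullest queue is processed first.

```python
from collections import defaultdict

def depth_first_loads(ray_paths):
    """Trace each ray to completion; the (assumed) cache holds one block."""
    loads, cached = 0, None
    for path in ray_paths:
        for block in path:
            if block != cached:
                loads += 1
                cached = block
    return loads

def scheduled_loads(ray_paths):
    """Queue rays by the block they need next; process the fullest queue."""
    queues = defaultdict(list)  # block id -> [(ray index, step in path)]
    for r, path in enumerate(ray_paths):
        if path:
            queues[path[0]].append((r, 0))
    loads = 0
    while queues:
        block = max(queues, key=lambda b: len(queues[b]))
        rays = queues.pop(block)
        loads += 1  # the block is loaded once for every ray waiting on it
        for r, step in rays:
            nxt = step + 1
            if nxt < len(ray_paths[r]):
                queues[ray_paths[r][nxt]].append((r, nxt))
    return loads

# Hypothetical workload: four rays that all cross block "A" and then "B".
paths = [["A", "B"]] * 4
naive = depth_first_loads(paths)
scheduled = scheduled_loads(paths)
```

In this toy workload the per-ray order reloads both blocks for every ray, while the queued order loads each block once. Real traversal orders are data dependent, so the benefit in practice depends on how much the rays overlap.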

IEEE Transactions on Visualization and Computer Graphics | 2007

Visualization of Cosmological Particle-Based Datasets

Paul A. Navrátil; Jarrett L. Johnson; Volker Bromm

We describe our visualization process for a particle-based simulation of the formation of the first stars and their impact on cosmic history. The dataset consists of several hundred time-steps of point simulation data, with each time-step containing approximately two million point particles. For each time-step, we interpolate the point data onto a regular grid using a method taken from the radiance estimate of photon mapping [21]. We import the resulting regular grid representation into ParaView [24], with which we extract isosurfaces across multiple variables. Our images provide insights into the evolution of the early universe, tracing the cosmic transition from an initially homogeneous state to one of increasing complexity. Specifically, our visualizations capture the build-up of regions of ionized gas around the first stars, their evolution, and their complex interactions with the surrounding matter. These observations will guide the upcoming James Webb Space Telescope, the key astronomy mission of the next decade.
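A minimal sketch of the resampling step, assuming a k-nearest-neighbor density estimate in the spirit of the photon-mapping radiance estimate: sum a quantity over the k nearest particles and divide by the volume of the sphere that just encloses them. The abstract does not give the exact kernel, so the spherical-volume form, the function names, and the particle layout below are assumptions for illustration. A brute-force neighbor search is used for clarity; a spatial index would be needed at the scale of millions of particles.

```python
import math

def knn_density(grid_point, particles, k):
    """Estimate density at grid_point from its k nearest particles.

    particles: list of ((x, y, z), mass) tuples (assumed layout).
    """
    dists = sorted((math.dist(grid_point, pos), mass) for pos, mass in particles)
    nearest = dists[:k]
    if not nearest:
        return 0.0
    radius = nearest[-1][0]  # distance to the k-th nearest particle
    if radius == 0.0:
        return 0.0  # degenerate case: all neighbors sit on the grid point
    volume = (4.0 / 3.0) * math.pi * radius ** 3
    return sum(mass for _, mass in nearest) / volume

def resample_to_grid(particles, origin, spacing, dims, k):
    """Evaluate knn_density at every node of a regular grid."""
    grid = {}
    for i in range(dims[0]):
        for j in range(dims[1]):
            for l in range(dims[2]):
                point = (origin[0] + i * spacing,
                         origin[1] + j * spacing,
                         origin[2] + l * spacing)
                grid[(i, j, l)] = knn_density(point, particles, k)
    return grid

# Hypothetical particles with unit mass.
particles = [((1.0, 0.0, 0.0), 1.0), ((0.0, 2.0, 0.0), 1.0), ((0.0, 0.0, 5.0), 1.0)]
density = knn_density((0.0, 0.0, 0.0), particles, k=2)
grid = resample_to_grid(particles, (0.0, 0.0, 0.0), 1.0, (2, 1, 1), k=2)
```

Because the search radius adapts to the local particle spacing, dense regions are resolved finely and sparse regions are smoothed, which is the usual motivation for this kind of estimate.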

IEEE Transactions on Visualization and Computer Graphics | 2017

OSPRay - A CPU Ray Tracing Framework for Scientific Visualization

Ingo Wald; Gregory P. Johnson; Jefferson Amstutz; Carson Brownlee; Aaron Knoll; J. Jeffers; J. Gunther; Paul A. Navrátil

Scientific data is continually increasing in complexity, variety and size, making efficient visualization and specifically rendering an ongoing challenge. Traditional rasterization-based visualization approaches encounter performance and quality limitations, particularly in HPC environments without dedicated rendering hardware. In this paper, we present OSPRay, a turn-key CPU ray tracing framework oriented towards production-use scientific visualization which can utilize varying SIMD widths and multiple device backends found across diverse HPC resources. This framework provides a high-quality, efficient CPU-based solution for typical visualization workloads, which has already been integrated into several prevalent visualization packages. We show that this system delivers the performance, high-level API simplicity, and modular device support needed to provide a compelling new rendering framework for implementing efficient scientific visualization workflows.

IEEE Pacific Visualization Symposium | 2015

Ray tracing within a data parallel framework

Matthew Larsen; Jeremy S. Meredith; Paul A. Navrátil; Hank Childs

Current architectural trends on supercomputers have dramatic increases in the number of cores and available computational power per die, but this power is increasingly difficult for programmers to harness effectively. High-level language constructs can simplify programming many-core devices, but this ease comes with a potential loss of processing power, particularly for cross-platform constructs. Recently, scientific visualization packages have embraced language constructs centering around data parallelism, with familiar operators such as map, reduce, gather, and scatter. Complete adoption of data parallelism will require that central visualization algorithms be revisited, and expressed in this new paradigm while preserving both functionality and performance. This investment has a large potential payoff: portable performance in software bases that can span over the many architectures that scientific visualization applications run on. With this work, we present a method for ray tracing consisting entirely of data parallel primitives. Given the extreme computational power on nodes now prevalent on supercomputers, we believe that ray tracing can supplant rasterization as the work-horse graphics solution for scientific visualization. Our ray tracing method is relatively efficient, and we describe its performance with a series of tests, and also compare to leading-edge ray tracers that are optimized for specific platforms. We find that our data parallel approach leads to results that are acceptable for many scientific visualization use cases, with the key benefit of providing a single code base that can run on many architectures.
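To make the four operators named in this abstract concrete, here is a small sequential sketch of one ray-tracing-like step written only in terms of map, reduce, gather, and scatter. It is not the paper's method: the per-ray hit distances and the 1/t "shading" are hypothetical placeholders, and a real data-parallel library would run each primitive in parallel on the target device.

```python
from functools import reduce

def gather(values, indices):
    """Read values[i] for each i in indices."""
    return [values[i] for i in indices]

def scatter(values, indices, size, fill=None):
    """Write values[k] to position indices[k] of a new array of length size."""
    out = [fill] * size
    for v, i in zip(values, indices):
        out[i] = v
    return out

# Hypothetical per-ray hit distances; None marks a ray that missed.
hit_t = [4.0, None, 2.5, None, 7.0]

# map: flag the rays that hit geometry
hit_flags = list(map(lambda t: t is not None, hit_t))
# reduce: count the hits
num_hits = reduce(lambda acc, flag: acc + int(flag), hit_flags, 0)
# compaction: keep only the indices of active rays, then gather their data
active = [i for i, flag in enumerate(hit_flags) if flag]
active_t = gather(hit_t, active)
# map: placeholder shading that brightens nearer hits
shaded = list(map(lambda t: 1.0 / t, active_t))
# scatter: write shaded values back to their pixel slots
image = scatter(shaded, active, len(hit_t), fill=0.0)
```

Expressing each stage as a primitive over whole arrays is what allows one code base to be retargeted: only the implementations of the primitives change per architecture.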

IEEE International Conference on High Performance Computing Data and Analytics | 2013

Ray tracing and volume rendering large molecular data on multi-core and many-core architectures

Aaron Knoll; Ingo Wald; Paul A. Navrátil; Michael E. Papka; Kelly P. Gaither

Visualizing large molecular data requires efficient means of rendering millions of data elements that combine glyphs, geometry and volumetric techniques. The geometric and volumetric loads challenge traditional rasterization-based vis methods. Ray casting presents a scalable and memory-efficient alternative, but modern techniques typically rely on GPU-based acceleration to achieve interactive rendering rates. In this paper, we present bnsView, a molecular visualization ray tracing framework that delivers fast volume rendering and ball-and-stick ray casting on both multi-core CPUs and many-core Intel® Xeon Phi™ co-processors, implemented in a SPMD language that generates efficient SIMD vector code for multiple platforms without source modification. We show that our approach running on co-processors is competitive with similar techniques running on GPU accelerators, and we demonstrate large-scale parallel remote visualization from TACC's Stampede supercomputer to large-format display walls using this system.

Eurographics | 2014

RBF Volume Ray Casting on Multicore and Manycore CPUs

Aaron Knoll; Ingo Wald; Paul A. Navrátil; Anne Bowen; Khairi Reda; Michael E. Papka; Kelly P. Gaither

Modern supercomputers enable increasingly large N‐body simulations using unstructured point data. The structures implied by these points can be reconstructed implicitly. Direct volume rendering of radial basis function (RBF) kernels in domain‐space offers flexible classification and robust feature reconstruction, but achieving performant RBF volume rendering remains a challenge for existing methods on both CPUs and accelerators. In this paper, we present a fast CPU method for direct volume rendering of particle data with RBF kernels. We propose a novel two‐pass algorithm: first sampling the RBF field using coherent bounding hierarchy traversal, then subsequently integrating samples along ray segments. Our approach performs interactively for a range of data sets from molecular dynamics and astrophysics up to 82 million particles. It does not rely on level of detail or subsampling, and offers better reconstruction quality than structured volume rendering of the same data, exhibiting comparable performance and requiring no additional preprocessing or memory footprint other than the BVH. Lastly, our technique enables multi‐field, multi‐material classification of particle data, providing better insight and analysis.

Eurographics Workshop on Parallel Graphics and Visualization | 2015

Volume rendering via data-parallel primitives

Matthew Larsen; Stephanie Labasan; Paul A. Navrátil; Jeremy S. Meredith; Hank Childs

Supercomputing designs have recently evolved to include architectures beyond the standard CPU. In response, visualization software must be developed in a manner that obviates the need for porting all visualization algorithms to all architectures. Recent research results indicate that building visualization software on a foundation of data-parallel primitives can meet this goal, providing portability over many architectures, and doing it in a performant way. With this work, we introduce an unstructured data volume rendering algorithm which is composed entirely of data-parallel primitives. We compare the algorithm to community standards, and show that the performance we achieve is similar. That is, although our algorithm is hardware-agnostic, we demonstrate that our performance on GPUs is comparable to code that was written for and optimized for the GPU, and our performance on CPUs is comparable to code written for and optimized for the CPU. The main contribution of this work is in realizing the benefits of data-parallel primitives (portable performance, longevity, and programmability) for volume rendering. A secondary contribution is in providing further evidence of the merits of the data-parallel primitives approach itself.

Eurographics Workshop on Parallel Graphics and Visualization | 2012

Dynamic Scheduling for Large-Scale Distributed-Memory Ray Tracing

Paul A. Navrátil; Donald S. Fussell; Calvin Lin; Hank Childs

Ray tracing is an attractive technique for visualizing scientific data because it can produce high quality images that faithfully represent physically-based phenomena. Its embarrassingly parallel reputation makes it a natural candidate for visualizing large data sets on distributed memory clusters, especially for machines without specialized graphics hardware. Unfortunately, the traditional recursive ray tracing algorithm is exceptionally memory inefficient on large data, especially when using a shading model that generates incoherent secondary rays. As visualization moves through the petascale to the exascale, disk and memory efficiency will become increasingly important for performance, and traditional methods are inadequate. This paper presents a dynamic ray scheduling algorithm that effectively manages both ray state and data accesses. Our algorithm can render datasets that are larger than aggregate system memory, which existing statically scheduled ray tracers cannot render. For example, using 1024 cores of a supercomputing cluster, our unoptimized algorithm ray traces a 650GB dataset from an N-Body simulation with shadows and reflections, at about 1100 seconds per frame. For smaller problems that fit in aggregate memory, but are larger than typical shared memory, our algorithm is competitive with the best static scheduling algorithm.

IEEE Transactions on Visualization and Computer Graphics | 2014

Exploring the Spectrum of Dynamic Scheduling Algorithms for Scalable Distributed-Memory Ray Tracing

Paul A. Navrátil; Hank Childs; Donald S. Fussell; Calvin Lin

This paper extends and evaluates a family of dynamic ray scheduling algorithms that can be performed in-situ on large distributed memory parallel computers. The key idea is to consider both ray state and data accesses when scheduling ray computations. We compare three instances of this family of algorithms against two traditional statically scheduled schemes. We show that our dynamic scheduling approach can render data sets that are larger than aggregate system memory and that cannot be rendered by existing statically scheduled ray tracers. For smaller problems that fit in aggregate memory but are larger than typical shared memory, our dynamic approach is competitive with the best static scheduling algorithm.

Visualization and Data Analysis | 2012

Configurable data prefetching scheme for interactive visualization of large-scale volume data

Byungil Jeong; Paul A. Navrátil; Kelly P. Gaither; Gregory D. Abram; Gregory P. Johnson

This paper presents a novel data prefetching and memory management scheme to support interactive visualization of large-scale volume datasets using GPU-based isosurface extraction. Our dynamic in-core approach uses a span-space lattice data structure to predict and prefetch the portions of a dataset that are required by isosurface queries, to manage an application-level volume data cache, and to ensure load-balancing for parallel execution. We also present a GPU memory management scheme that enhances isosurface extraction and rendering performance. With these techniques, we achieve rendering performance superior to other in-core algorithms while using dramatically fewer resources.
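As background for the span-space lattice mentioned in this abstract, the sketch below shows a generic version of the idea: each cell is a point (min, max) in span space, cells are bucketed on a coarse lattice, and an isovalue query touches only buckets whose range can contain it. The abstract does not describe the paper's lattice layout, prediction policy, or cache management, so the bucket scheme and all names here are assumptions for illustration.

```python
from collections import defaultdict

def build_lattice(cell_ranges, lo, hi, n):
    """Bucket cells by (min, max) into an n x n lattice over [lo, hi]."""
    width = (hi - lo) / n

    def bucket(value):
        # Clamp so that value == hi falls into the last bucket.
        return min(int((value - lo) / width), n - 1)

    lattice = defaultdict(list)
    for cell_id, (vmin, vmax) in enumerate(cell_ranges):
        lattice[(bucket(vmin), bucket(vmax))].append(cell_id)
    return lattice, bucket

def candidate_buckets(lattice, bucket, iso):
    """Buckets that may hold active cells; these are what a prefetcher would load."""
    b = bucket(iso)
    return sorted(key for key in lattice if key[0] <= b <= key[1])

def active_cells(lattice, bucket, cell_ranges, iso):
    """Cells whose value range actually contains the isovalue."""
    found = []
    for key in candidate_buckets(lattice, bucket, iso):
        for cell_id in lattice[key]:
            vmin, vmax = cell_ranges[cell_id]
            if vmin <= iso <= vmax:
                found.append(cell_id)
    return sorted(found)

# Hypothetical (min, max) scalar ranges for four cells.
ranges = [(0.0, 0.3), (0.2, 0.9), (0.6, 1.0), (0.45, 0.55)]
lattice, bucket = build_lattice(ranges, lo=0.0, hi=1.0, n=4)
buckets = candidate_buckets(lattice, bucket, iso=0.5)
cells = active_cells(lattice, bucket, ranges, iso=0.5)
```

Since nearby isovalues map to the same or adjacent lattice buckets, the candidate set for the next query can be predicted from the current one, which is what makes such a structure useful for prefetching.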
