Timo Aila
Nvidia
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Timo Aila.
high performance graphics | 2009
Timo Aila; Samuli Laine
We discuss the mapping of elementary ray tracing operations---acceleration structure traversal and primitive intersection---onto wide SIMD/SIMT machines. Our focus is on NVIDIA GPUs, but some of the observations should be valid for other wide machines as well. While several fast GPU tracing methods have been published, very little is actually understood about their performance. Nobody knows whether the methods are anywhere near the theoretically obtainable limits, and if not, what might be causing the discrepancy. We study this question by comparing the measurements against a simulator that tells the upper bound of performance for a given kernel. We observe that previously known methods are a factor of 1.5--2.5X off from theoretical optimum, and most of the gap is not explained by memory bandwidth, but rather by previously unidentified inefficiencies in hardware work distribution. We then propose a simple solution that significantly narrows the gap between simulation and measurement. This results in the fastest GPU ray tracer to date. We provide results for primary, ambient occlusion and diffuse interreflection rays.
eurographics symposium on rendering techniques | 2007
Samuli Laine; Hannu Saransaari; Janne Kontkanen; Jaakko Lehtinen; Timo Aila
We present a method for rendering single-bounce indirect illumination in real time on currently available graphics hardware. The method is based on the instant radiosity algorithm, where virtual point lights (VPLs) are generated by casting rays from the primary light source. Hardware shadow maps are then employed for determining the indirect illumination from the VPLs. Our main contribution is an algorithm for reusing the VPLs and incrementally maintaining their good distribution. As a result, only a few shadow maps need to be rendered per frame as long as the motion of the primary light source is reasonably smooth. This yields real-time frame rates even when hundreds of VPLs are used.
eurographics symposium on rendering techniques | 2004
Timo Aila; Samuli Laine
In this paper we abandon the regular structure of shadow maps. Instead, we transform the visible pixels P(x, y, z) from screen space to the image plane of a light source P′(x′, y′, z′). The (x′, y′) are then used as sampling points when the geometry is rasterized into the shadow map. This eliminates the resolution issues that have plagued shadow maps for decades, e.g., jagged shadow boundaries. Incorrect self-shadowing is also greatly reduced, and semi-transparent shadow casters and receivers can be supported. A hierarchical software implementation is outlined
international conference on computer graphics and interactive techniques | 2011
Jaakko Lehtinen; Timo Aila; Jiawen Chen; Samuli Laine
Traditionally, effects that require evaluating multidimensional integrals for each pixel, such as motion blur, depth of field, and soft shadows, suffer from noise due to the variance of the high-dimensional integrand. In this paper, we describe a general reconstruction technique that exploits the anisotropy in the temporal light field and permits efficient reuse of samples between pixels, multiplying the effective sampling rate by a large factor. We show that our technique can be applied in situations that are challenging or impossible for previous anisotropic reconstruction methods, and that it can yield good results with very sparse inputs. We demonstrate our method for simultaneous motion blur, depth of field, and soft shadows.
eurographics symposium on rendering techniques | 2004
Jan Kautz; Jaakko Lehtinen; Timo Aila
We present a method for interactive rendering of dynamic models with self-shadows due to time-varying, lowfrequency lighting environments. In contrast to previous techniques, the method is not limited to static or preanimated models. Our main contribution is a hemispherical rasterizer, which rapidly computes visibility by rendering blocker geometry into a 2D occlusion mask with correct occluder fusion. The response of an object to the lighting is found by integrating the visibility function at each of the vertices against the spherical harmonic functions and the BRDF. This yields transfer coefficients that are then multiplied by the lighting coefficients to obtain the final, shadowed exitant radiance. No precomputation is necessary and memory requirements are modest. The method supports both diffuse and glossy BRDFs.
high performance graphics | 2013
Tero Karras; Timo Aila
We propose a new massively parallel algorithm for constructing high-quality bounding volume hierarchies (BVHs) for ray tracing. The algorithm is based on modifying an existing BVH to improve its quality, and executes in linear time at a rate of almost 40M triangles/sec on NVIDIA GTX Titan. We also propose an improved approach for parallel splitting of triangles prior to tree construction. Averaged over 20 test scenes, the resulting trees offer over 90% of the ray tracing performance of the best offline construction method (SBVH), while previous fast GPU algorithms offer only about 50%. Compared to state-of-the-art, our method offers a significant improvement in the majority of practical workloads that need to construct the BVH for each frame. On the average, it gives the best overall performance when tracing between 7 million and 60 billion rays per frame. This covers most interactive applications, product and architectural design, and even movie rendering.
international conference on computer graphics and interactive techniques | 2005
Samuli Laine; Timo Aila; Ulf Assarsson; Jaakko Lehtinen; Tomas Akenine-Möller
We present a new, fast algorithm for rendering physically-based soft shadows in ray tracing-based renderers. Our method replaces the hundreds of shadow rays commonly used in stochastic ray tracers with a single shadow ray and a local reconstruction of the visibility function. Compared to tracing the shadow rays. our algorithm produces exactly the same image while executing one to two orders of magnitude faster in the test scenes used. Our first contribution is a two-stage method for quickly determining the silhouette edges that overlap an area light source, as seen from the point to be shaded. Secondly, we show that these partial silhouettes of occluders, along with a single shadow ray, are sufficient for reconstructing the visibility function between the point and the light source.
high performance graphics | 2013
Samuli Laine; Tero Karras; Timo Aila
When programming for GPUs, simply porting a large CPU program into an equally large GPU kernel is generally not a good approach. Due to SIMT execution model on GPUs, divergence in control flow carries substantial performance penalties, as does high register us-age that lessens the latency-hiding capability that is essential for the high-latency, high-bandwidth memory system of a GPU. In this paper, we implement a path tracer on a GPU using a wavefront formulation, avoiding these pitfalls that can be especially prominent when using materials that are expensive to evaluate. We compare our performance against the traditional megakernel approach, and demonstrate that the wavefront formulation is much better suited for real-world use cases where multiple complex materials are present in the scene.
international conference on computer graphics and interactive techniques | 2012
Jaakko Lehtinen; Timo Aila; Samuli Laine
Stochastic techniques for rendering indirect illumination suffer from noise due to the variance in the integrand. In this paper, we describe a general reconstruction technique that exploits anisotropy in the light field and permits efficient reuse of input samples between pixels or world-space locations, multiplying the effective sampling rate by a large factor. Our technique introduces visibility-aware anisotropic reconstruction to indirect illumination, ambient occlusion and glossy reflections. It operates on point samples without knowledge of the scene, and can thus be seen as an advanced image filter. Our results show dramatic improvement in image quality while using very sparse input samplings.
international conference on computer graphics and interactive techniques | 2011
Samuli Laine; Timo Aila; Tero Karras; Jaakko Lehtinen
We present a novel method for increasing the efficiency of stochastic rasterization of motion and defocus blur. Contrary to earlier approaches, our method is efficient even with the low sampling densities commonly encountered in realtime rendering, while allowing the use of arbitrary sampling patterns for maximal image quality. Our clipless dual-space formulation avoids problems with triangles that cross the camera plane during the shutter interval. The method is also simple to plug into existing rendering systems.