
Ulf Assarsson

Chalmers University of Technology


Publications

Featured publications by Ulf Assarsson.

Journal of Parallel and Distributed Computing | 2008

Fast parallel GPU-sorting using a hybrid algorithm

Erik Sintorn; Ulf Assarsson

This paper presents an algorithm for fast sorting of large lists using modern GPUs. The method achieves high speed by efficiently utilizing the parallelism of the GPU throughout the whole algorithm. Initially, GPU-based bucketsort or quicksort splits the list into enough sublists, which are then sorted in parallel using merge sort. The algorithm has complexity n log n, and for lists of 8 M elements on a single GeForce 8800 GTS-512, it is 2.5 times as fast as the bitonic sort algorithms, with standard complexity n (log n)^2, which for a long time were considered the fastest for GPU sorting. It is 6 times faster than single-CPU quicksort and 10% faster than the recent GPU-based radix sort. Finally, the algorithm is further parallelized to utilize two graphics cards, resulting in a further 1.8 times speedup.
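As a rough illustration of the split-then-merge idea in this abstract, the following is a minimal sequential sketch, not the authors' GPU implementation. It assumes numeric keys and a uniform-width bucket split; the names `hybrid_sort`, `merge_sort`, and `num_buckets` are hypothetical. On a GPU each bucket would be sorted by an independent group of threads, whereas here the buckets are simply processed in a loop.

```python
import heapq
from typing import List


def merge_sort(values: List[float]) -> List[float]:
    """Plain top-down merge sort on one sublist."""
    if len(values) <= 1:
        return list(values)
    mid = len(values) // 2
    left = merge_sort(values[:mid])
    right = merge_sort(values[mid:])
    # heapq.merge lazily merges two already sorted sequences.
    return list(heapq.merge(left, right))


def hybrid_sort(values: List[float], num_buckets: int = 4) -> List[float]:
    """Split into value-range buckets, then merge-sort each bucket.

    Assumption: uniform-width buckets between min and max. Every key in
    bucket i is <= every key in bucket i + 1, so the sorted buckets can
    be concatenated without a final merge.
    """
    if not values:
        return []
    lo, hi = min(values), max(values)
    if lo == hi:
        return list(values)  # all keys equal, nothing to sort
    width = (hi - lo) / num_buckets
    buckets: List[List[float]] = [[] for _ in range(num_buckets)]
    for v in values:
        # Clamp so that v == hi lands in the last bucket.
        idx = min(int((v - lo) / width), num_buckets - 1)
        buckets[idx].append(v)
    result: List[float] = []
    for bucket in buckets:  # each bucket is independent work
        result.extend(merge_sort(bucket))
    return result
```

Because the buckets partition the key range, the per-bucket sorts do not depend on each other, which is what makes this step easy to run in parallel.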

International Conference on Computer Graphics and Interactive Techniques | 2003

A geometry-based soft shadow volume algorithm using graphics hardware

Ulf Assarsson; Tomas Akenine-Möller

Most previous soft shadow algorithms have either suffered from aliasing, been too slow, or been limited to a restricted set of shadow casters and/or receivers. Therefore, we present a strengthened soft shadow volume algorithm that deals with these problems. Our critical improvements include robust penumbra wedge construction, geometry-based visibility computation, and simplified computation through a four-dimensional texture lookup. This enables us to implement the algorithm using programmable graphics hardware, and it results in images that most often are indistinguishable from images created as the average of 1024 hard shadow images. Furthermore, our algorithm can use both arbitrary shadow casters and receivers. Also, one version of our algorithm completely avoids sampling artifacts, which is rare for soft shadow algorithms. As a bonus, the four-dimensional texture lookup allows for small textured light sources, and even video textures can be used as light sources. Our algorithm has been implemented in pure software, and also using the GeForce FX emulator with pixel shaders. Our software implementation renders soft shadows at 0.5-5 frames per second for the images in this paper. With actual hardware, we expect that our algorithm will render soft shadows in real time. An important performance measure is bandwidth usage. For the same image quality, an algorithm using the accumulated hard shadow images uses almost two orders of magnitude more bandwidth than our algorithm.

Journal of Graphics Tools | 2000

Optimized view frustum culling algorithms for bounding boxes

Ulf Assarsson; Tomas Möller

This paper presents optimizations for faster view frustum culling (VFC) for axis-aligned bounding box (AABB) and oriented bounding box (OBB) hierarchies. We exploit frame-to-frame coherency by caching and by comparing against previous distances and rotation angles. By using an octant test, we potentially halve the number of plane tests needed, and we also evaluate masking, which is a well-known technique. The optimizations can be used for arbitrary bounding volumes, but we present only results for AABBs and OBBs. In particular, we provide solutions which are 2-11 times faster than other VFC algorithms for AABBs and OBBs, depending on the circumstances.
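To make the plane test and the masking idea concrete, here is a minimal sketch of a generic AABB-versus-planes classification with a plane mask. It is a common textbook formulation and not the paper's optimized code; the octant test and the frame-to-frame caching are not shown. The plane convention (a point is inside when dot(n, p) + d >= 0) and the names `classify_aabb` and `inside_mask` are assumptions made for this example.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]
Plane = Tuple[Vec3, float]  # (normal n, offset d); inside where dot(n, p) + d >= 0

OUTSIDE, INTERSECTING, INSIDE = "outside", "intersecting", "inside"


def signed_distance(plane: Plane, point: Vec3) -> float:
    (nx, ny, nz), d = plane
    return nx * point[0] + ny * point[1] + nz * point[2] + d


def classify_aabb(planes: Sequence[Plane], box_min: Vec3, box_max: Vec3,
                  inside_mask: int = 0) -> Tuple[str, int]:
    """Classify a box against the planes and return (status, updated mask).

    Bit i of inside_mask means an ancestor box was already fully inside
    plane i, so this box (contained in that ancestor) can skip the plane.
    """
    status = INSIDE
    for i, plane in enumerate(planes):
        if inside_mask & (1 << i):
            continue  # masked out: known to be inside this plane
        normal = plane[0]
        # p-vertex: box corner farthest along the normal; n-vertex: the opposite corner.
        p_vertex = tuple(box_max[k] if normal[k] >= 0 else box_min[k] for k in range(3))
        n_vertex = tuple(box_min[k] if normal[k] >= 0 else box_max[k] for k in range(3))
        if signed_distance(plane, p_vertex) < 0:
            return OUTSIDE, inside_mask  # even the farthest corner is outside
        if signed_distance(plane, n_vertex) >= 0:
            inside_mask |= 1 << i  # fully inside: descendants may skip this plane
        else:
            status = INTERSECTING
    return status, inside_mask
```

When traversing a bounding volume hierarchy, the mask returned for a node would be passed down to its children, so a child only tests the planes its parent straddled.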

High Performance Graphics | 2009

Efficient stream compaction on wide SIMD many-core architectures

Markus Billeter; Ola Olsson; Ulf Assarsson

Stream compaction is a common parallel primitive used to remove unwanted elements in sparse data. This allows highly parallel algorithms to maintain performance over several processing steps and reduces overall memory usage. For wide SIMD many-core architectures, we present a novel stream compaction algorithm and explore several variations thereof. Our algorithm is designed to maximize concurrent execution, with minimal use of synchronization. Bandwidth and auxiliary storage requirements are reduced significantly, which allows for substantially better performance. We have tested our algorithms using CUDA on a PC with an NVIDIA GeForce GTX 280 GPU. On this hardware, our reference implementation provides a 3x speedup over previously published algorithms.
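For readers unfamiliar with the primitive itself, the sketch below shows the classic scan-and-scatter formulation of stream compaction, written sequentially. It is not the algorithm from this paper, which targets wide SIMD hardware and reduces bandwidth and synchronization; the function name `compact` and the predicate parameter `keep` are hypothetical.

```python
from itertools import accumulate
from typing import Callable, List, TypeVar

T = TypeVar("T")


def compact(items: List[T], keep: Callable[[T], bool]) -> List[T]:
    """Remove unwanted elements while preserving order.

    Step 1: flag each element (1 = keep, 0 = discard).
    Step 2: an exclusive prefix sum of the flags gives each kept
            element its index in the output.
    Step 3: scatter the kept elements to those indices.
    On parallel hardware, steps 1 and 3 are per-element operations and
    step 2 is a parallel scan.
    """
    flags = [1 if keep(x) else 0 for x in items]
    inclusive = list(accumulate(flags))
    # Exclusive scan: the number of kept elements strictly before position i.
    offsets = [total - flag for total, flag in zip(inclusive, flags)]
    kept_count = inclusive[-1] if inclusive else 0
    output: List[T] = [None] * kept_count  # type: ignore[list-item]
    for i, x in enumerate(items):
        if flags[i]:
            output[offsets[i]] = x
    return output
```

For example, `compact([3, 0, 0, 7, 0, 5], lambda v: v != 0)` keeps only the non-zero entries in their original order.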

Eurographics | 2002

Approximate soft shadows on arbitrary surfaces using penumbra wedges

Tomas Akenine-Möller; Ulf Assarsson

Shadow generation has been subject to serious investigation in computer graphics, and many clever algorithms have been suggested. However, previous algorithms cannot render high quality soft shadows onto arbitrary, animated objects in real time. Pursuing this goal, we present a new soft shadow algorithm that extends the standard shadow volume algorithm by replacing each shadow quadrilateral with a new primitive, called the penumbra wedge. For each silhouette edge as seen from the light source, a penumbra wedge is created that approximately models the penumbra volume that this edge gives rise to. Together the penumbra wedges can render images that often are remarkably close to more precisely rendered soft shadows. Furthermore, our new primitive is designed so that it can be rasterized efficiently. Many real-time algorithms can only use planes as shadow receivers, while ours can handle arbitrary shadow receivers. The proposed algorithm can be of great value to, e.g., 3D computer games, especially since it is highly likely that this algorithm can be implemented on programmable graphics hardware coming out within the next year, and because games often prefer perceptually convincing shadows.

IEEE Computer Graphics and Applications | 2001

A Benchmark for Animated Ray Tracing

Jonas Lext; Ulf Assarsson; Tomas Möller

We saw the need for our Benchmark for Animated Ray Tracing (BART) because no benchmark exists in this area and because at least two groups have been ray tracing fairly complex and realistic scenes at interactive speeds, that is, at rates above one frame per second. Another reason is that acceleration data structures for animated ray tracing have not been studied much but probably will be in the future. BART's main contributions are three parametrically animated test scenes that we designed to stress ray-tracing algorithms and a set of reliable performance measurements that let BART users compare the performance of different ray-tracing algorithms. For approximating algorithms (i.e., algorithms that may produce approximate pixel values), we also define how to measure the quality of the approximated images. Our suite of test scenes is thus designed to accurately measure and objectively compare the performance and quality of ray-traced animated scenes.

Real-Time Shadows 1st | 2011

Real-Time Shadows

Elmar Eisemann; Michael Schwarz; Ulf Assarsson; Michael Wimmer

Important elements of games, movies, and other computer-generated content, shadows are crucial for enhancing realism and providing important visual cues. In recent years, there have been notable improvements in visual quality and speed, making high-quality realistic real-time shadows a reachable goal. Real-Time Shadows is a comprehensive guide to the theory and practice of real-time shadow techniques. It covers a large variety of different effects, including hard, soft, volumetric, and semi-transparent shadows. The book explains the basics as well as many advanced aspects related to the domain of shadow computation. It presents interactive solutions and practical details on shadow computation. The authors compare various algorithms for creating real-time shadows and illustrate how they are used in different situations. They explore the limitations and failure cases, advantages and disadvantages, and suitability of the algorithms in several applications. Source code, videos, tutorials, and more are available on the book's website.

Eurographics | 2008

Sample based visibility for soft shadows using alias-free shadow maps

Erik Sintorn; Elmar Eisemann; Ulf Assarsson

This paper introduces an accurate real-time soft shadow algorithm that uses sample based visibility. Initially, we present a GPU-based alias-free hard shadow map algorithm that typically requires only a single render pass from the light, in contrast to using depth peeling and one pass per layer. For closed objects, we also suppress the need for a bias. The method is extended to soft shadow sampling for an arbitrarily shaped area-/volumetric light source using 128-1024 light samples per screen pixel. The alias-free shadow map guarantees that the visibility is accurately sampled per screen-space pixel, even for arbitrarily shaped (e.g. non-planar) surfaces or solid objects. Another contribution is a smooth coherent shading model to avoid common light leakage near shadow borders due to normal interpolation.

SIGGRAPH Eurographics Conference on Graphics Hardware | 2003

An optimized soft shadow volume algorithm with real-time performance

Ulf Assarsson; Michael A. Dougherty; Michael Sean Mounier; Tomas Akenine-Möller

In this paper, we present several optimizations to our previously presented soft shadow volume algorithm. Our optimizations include tighter wedges, heavily optimized pixel shader code for both rectangular and spherical light sources, a frame buffer blending technique to overcome the limitation of 8-bit frame buffers, and a simple culling algorithm. These together give real-time performance, and for simple models we get frame rates of over 150 fps. For more complex models 50 fps is normal. In addition to optimizations, two simple techniques for improving the visual quality are also presented.

Interactive 3D Graphics and Games | 2009

Hair self shadowing and transparency depth ordering using occupancy maps

Erik Sintorn; Ulf Assarsson

This paper presents a method for quickly constructing a high-quality approximate visibility function for high-frequency semi-transparent geometry such as hair. We can then reconstruct the visibility for any fragment without the expensive compression needed by Deep Shadow Maps and with a quality that is much better than what is attainable at similar frame rates using Opacity Maps or Deep Opacity Maps. The memory footprint of our method is also considerably lower than that of previous methods. We then use a similar method to achieve back-to-front sorted alpha blending of the fragments, with results that are virtually indistinguishable from depth peeling while being an order of magnitude faster.
