Sebastian Schwarz | Researchain

Publication

Featured research published by Sebastian Schwarz.

picture coding symposium | 2012

Adaptive depth filtering for HEVC 3D video coding

Sebastian Schwarz; Roger Olsson; Mårten Sjöström; Sylvain Tourancheau

Consumer interest in 3D television (3DTV) is growing steadily, but currently available 3D displays still need additional eye-wear and suffer from the limitation of a single stereo view pair. So it can be assumed that autostereoscopic multiview displays are the next step in 3D-at-home entertainment, since these displays can utilize the Multiview Video plus Depth (MVD) format to synthesize numerous viewing angles from only a small set of given input views. This motivates efficient MVD compression as an important keystone for commercial success of 3DTV. In this paper we concentrate on the compression of depth information in an MVD scenario. There have been several publications suggesting depth down- and upsampling to increase coding efficiency. We follow this path, using our recently introduced Edge Weighted Optimization Concept (EWOC) for depth upscaling. EWOC uses edge information from the video frame in the upscaling process and allows the use of sparse, non-uniformly distributed depth values. We exploit this fact to expand the depth down-/upsampling idea with an adaptive low-pass filter, reducing high energy parts in the original depth map prior to subsampling and compression. Objective results show the viability of our approach for depth map compression with up-to-date High-Efficiency Video Coding (HEVC). For the same Y-PSNR in synthesized views we achieve up to 18.5% bit rate decrease compared to full-scale depth and around 10% compared to competing depth down-/upsampling solutions. These results were confirmed by a subjective quality assessment, showing a statistically significant preference for 87.5% of the test cases.

IEEE Transactions on Image Processing | 2014

A Weighted Optimization Approach to Time-of-Flight Sensor Fusion

Sebastian Schwarz; Mårten Sjöström; Roger Olsson

Acquiring scenery depth is a fundamental task in computer vision, with many applications in manufacturing, surveillance, or robotics relying on accurate scenery information. Time-of-flight cameras can provide depth information in real-time and overcome shortcomings of traditional stereo analysis. However, they provide limited spatial resolution and sophisticated upscaling algorithms are sought after. In this paper, we present a sensor fusion approach to time-of-flight super resolution, based on the combination of depth and texture sources. Unlike other texture guided approaches, we interpret the depth upscaling process as a weighted energy optimization problem. Three different weights are introduced, employing different available sensor data. The individual weights address object boundaries in depth, depth sensor noise, and temporal consistency. Applied in consecutive order, they form three weighting strategies for time-of-flight super resolution. Objective evaluations show advantages in depth accuracy and for depth image based rendering compared with state-of-the-art depth upscaling. Subjective view synthesis evaluation shows a significant increase in viewer preference by a factor of four in stereoscopic viewing conditions. To the best of our knowledge, this is the first extensive subjective test performed on time-of-flight depth upscaling. Objective and subjective results prove the suitability of our time-of-flight super resolution approach for depth scenery capture.

Proceedings of SPIE | 2012

Depth Map Upscaling Through Edge Weighted Optimization

Sebastian Schwarz; Mårten Sjöström; Roger Olsson

Accurate depth maps are a pre-requisite in three-dimensional television, e.g. for high quality view synthesis, but this information is not always easily obtained. Depth information gained by correspondence matching from two or more views suffers from disocclusions and low-texturized regions, leading to erroneous depth maps. These errors can be avoided by using depth from dedicated range sensors, e.g. time-of-flight sensors. Because these sensors only have restricted resolution, the resulting depth data need to be adjusted to the resolution of the appropriate texture frame. Standard upscaling methods provide only limited quality results. This paper proposes a solution for upscaling low resolution depth data to match high resolution texture data. We introduce the Edge Weighted Optimization Concept (EWOC) for fusing low resolution depth maps with corresponding high resolution video frames by solving an overdetermined linear equation system. Similar to other approaches, we take information from the high resolution texture, but additionally validate this information with the low resolution depth to accentuate correlated data. Objective tests show an improvement in depth map quality in comparison to other upscaling approaches. This improvement is subjectively confirmed in the resulting view synthesis.
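The idea of upscaling sparse depth by solving an overdetermined, edge-weighted linear system can be illustrated with a minimal one-dimensional sketch. This is not the authors' EWOC implementation: the function name `upscale_depth_1d`, the binary edge weights, and the plain least-squares solver are assumptions chosen only to show how texture edges can stop depth from being smoothed across object boundaries.

```python
import numpy as np

def upscale_depth_1d(sample_pos, sample_depth, edge_weight, n):
    """Fill a length-n depth signal from sparse samples by least squares.

    sample_pos, sample_depth: positions and values of the known depth readings.
    edge_weight[i]: weight of the smoothness equation between pixel i and i+1
    (hypothetical convention: 0 across a texture edge, 1 elsewhere).
    """
    rows, rhs = [], []
    # Data equations: the solution should reproduce every known depth sample.
    for p, d in zip(sample_pos, sample_depth):
        r = np.zeros(n)
        r[p] = 1.0
        rows.append(r)
        rhs.append(d)
    # Smoothness equations: neighbouring pixels should have equal depth,
    # scaled by the edge weight so that edges relax the constraint.
    for i in range(n - 1):
        r = np.zeros(n)
        r[i] = edge_weight[i]
        r[i + 1] = -edge_weight[i]
        rows.append(r)
        rhs.append(0.0)
    A = np.vstack(rows)   # more equations than unknowns: overdetermined
    b = np.asarray(rhs)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x

# Two depth samples on either side of a texture edge between pixels 3 and 4.
n = 8
edges = np.ones(n - 1)
edges[3] = 0.0
with_edges = upscale_depth_1d([1, 6], [1.0, 5.0], edges, n)
without_edges = upscale_depth_1d([1, 6], [1.0, 5.0], np.ones(n - 1), n)
print(np.round(with_edges, 3))
print(np.round(without_edges, 3))
```

With the edge weight set to zero at the boundary, the depth stays flat on each side of the edge; with uniform weights, the same samples produce a gradual ramp that blurs the boundary.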

IEEE MultiMedia | 2013

Depth Sensing for 3DTV: A Survey

Sebastian Schwarz; Roger Olsson; Mårten Sjöström

In the context of 3D video systems, depth information could be used to render a scene from additional viewpoints. Although there have been many recent advances in this area, including the introduction of the Microsoft Kinect sensor, the robust acquisition of such information continues to be a challenge. This article reviews three depth-sensing approaches for 3DTV. The authors discuss several approaches for acquiring depth information and provide a comparative analysis of their characteristics.

Proceedings of SPIE | 2014

Temporal consistent depth map upscaling for 3DTV

Sebastian Schwarz; Mårten Sjöström; Roger Olsson

The ongoing success of three-dimensional (3D) cinema fuels increasing efforts to spread the commercial success of 3D to new markets. The possibility of a convincing 3D experience at home, such as three-dimensional television (3DTV), has generated a great deal of interest within the research and standardization community. A central issue for 3DTV is the creation and representation of 3D content. Acquiring scene depth information is a fundamental task in computer vision, yet complex and error-prone. Dedicated range sensors, such as the Time-of-Flight (ToF) camera, can simplify the scene depth capture process and overcome shortcomings of traditional solutions, such as active or passive stereo analysis. Admittedly, currently available ToF sensors deliver only a limited spatial resolution. However, sophisticated depth upscaling approaches use texture information to match depth and video resolution. At Electronic Imaging 2012 we proposed an upscaling routine based on error energy minimization, weighted with edge information from an accompanying video source. In this article we develop our algorithm further. By adding temporal consistency constraints to the upscaling process, we reduce disturbing depth jumps and flickering artifacts in the final 3DTV content. Temporal consistency in depth maps enhances the 3D experience, leading to a wider acceptance of 3D media content. More content in better quality can boost the commercial success of 3DTV.

3dtv-conference: the true vision - capture, transmission and display of 3d video | 2014

Time-of-flight sensor fusion with depth measurement reliability weighting

Sebastian Schwarz; Mårten Sjöström; Roger Olsson

Accurate scene depth capture is essential for the success of three-dimensional television (3DTV), e.g. for high quality view synthesis in autostereoscopic multiview displays. Unfortunately, scene depth is not easily obtained and often of limited quality. Dedicated Time-of-Flight (ToF) sensors can deliver reliable depth readings where traditional methods, such as stereovision analysis, fail. However, since ToF sensors provide only limited spatial resolution and suffer from sensor noise, sophisticated upsampling methods are sought after. A multitude of ToF solutions have been proposed in recent years. Most of them achieve ToF super-resolution (TSR) by sensor fusion between ToF and additional sources, e.g. video. We recently proposed a weighted error energy minimization approach for ToF super-resolution, incorporating texture, sensor noise and temporal information. For this article, we take a closer look at the sensor noise weighting related to the Time-of-Flight active brightness signal. We determine a depth measurement reliability function based on optimizing free parameters to test data and verifying it with independent test cases. In the presented double-weighted TSR proposal, depth readings are weighted into the upsampling process with regard to their reliability, removing erroneous influences in the final result. Our evaluations prove the desired effect of depth measurement reliability weighting, decreasing the depth upsampling error by almost 40% in comparison to competing proposals.
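The effect of weighting depth readings by their reliability can be sketched as a small weighted least-squares problem. The sketch below is an illustration under stated assumptions, not the reliability function from the paper: `reliability` is a hypothetical linear ramp on the active brightness amplitude, and its thresholds `a_low` and `a_high` are made-up values rather than fitted parameters.

```python
import numpy as np

def reliability(amplitude, a_low=50.0, a_high=200.0):
    """Hypothetical reliability in [0, 1] derived from ToF active brightness.

    Readings at or below a_low are treated as unreliable (0), readings at or
    above a_high as fully reliable (1), with a linear ramp in between.
    """
    a = np.asarray(amplitude, dtype=float)
    return np.clip((a - a_low) / (a_high - a_low), 0.0, 1.0)

def weighted_depth_fit(depth, amplitude, smooth=1.0):
    """Estimate depth from noisy readings, scaling each data equation
    by the reliability of its reading and adding smoothness equations."""
    depth = np.asarray(depth, dtype=float)
    n = depth.size
    w = reliability(amplitude)
    data = np.diag(w)                                    # w_i * x_i = w_i * d_i
    diff = smooth * (np.eye(n - 1, n) - np.eye(n - 1, n, k=1))  # x_i - x_{i+1} = 0
    A = np.vstack([data, diff])
    b = np.concatenate([w * depth, np.zeros(n - 1)])
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x

# A flat surface at depth 2 with one erroneous reading of low active brightness.
depth = [2.0, 2.0, 9.0, 2.0, 2.0]
weighted = weighted_depth_fit(depth, [220, 220, 10, 220, 220])
unweighted = weighted_depth_fit(depth, [220, 220, 220, 220, 220])
print(np.round(weighted, 3))
print(np.round(unweighted, 3))
```

When the outlier carries zero reliability, its data equation drops out and the smoothness terms fill the gap from the neighbouring readings; when all readings are trusted equally, the outlier pulls the estimate away from the surface.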

3dtv-conference: the true vision - capture, transmission and display of 3d video | 2012

Incremental depth upscaling using an edge weighted optimization concept

Sebastian Schwarz; Mårten Sjöström; Roger Olsson

Precise scene depth information is a pre-requisite in three-dimensional television (3DTV), e.g. for high quality view synthesis in autostereoscopic multiview displays. Unfortunately, this information is not easily obtained and often of limited quality. Dedicated range sensors, such as time-of-flight (ToF) cameras, can deliver reliable depth information where (stereo-)matching fails. Nonetheless, since these sensors provide only restricted spatial resolution, sophisticated upscaling methods are sought after to match depth information to corresponding texture frames. Where traditional upscaling fails, novel approaches have been proposed, utilizing additional information from the texture for the depth upscaling process. We recently proposed the Edge Weighted Optimization Concept (EWOC) for ToF upscaling, using texture edges for accurate depth boundaries. In this paper we propose an important update to EWOC, dividing it into smaller incremental upscaling steps. We predict two major improvements from this. Firstly, processing time should be decreased by dividing one big calculation into several smaller steps. Secondly, we assume an increase in quality for the upscaled depth map, due to a more coherent edge detection on the video frame. In our evaluations we can show the desired effect on processing time, cutting the calculation time by more than half. We can also show an increase in visual quality, based on objective quality metrics, compared to the original implementation as well as competing proposals.

picture coding symposium | 2016

An application of unified reference picture list for motion-compensated video compression

Sebastian Schwarz; Marta Mrak

Modern video compression standards rely on motion-compensated prediction from multiple reference pictures. In past standardisation efforts following the introduction of bidirectional prediction, reference pictures were typically signalled in two independent lists, used for forward- and backward-prediction. Modern video coding standards, such as H.264/MPEG-4 Advanced Video Coding or High Efficiency Video Coding, extend this concept by removing the separation between preceding and succeeding reference pictures, and allow reference frames to be included in the two lists regardless of their temporal index. Therefore, the principle of using two separate lists can be considered obsolete. This paper evaluates an alternative concept, using a single, unified reference picture list. This alternative list, called “list unified” or LU, reduces reference picture signalling, while all functionalities for motion-compensated prediction using multiple references are preserved. Evaluation shows competitive video coding efficiency compared to the usage of two lists, with the advantage of simplified bitstream parsing and much improved reference picture flexibility.

Proceedings of SPIE | 2012

Converting conventional stereo pairs to multiview sequences using morphing

Roger Olsson; Vamsi Kiran Adhikarla; Sebastian Schwarz; Mårten Sjöström

Autostereoscopic multiview displays require multiple views of a scene to provide motion parallax. When an observer changes viewing angle, different stereoscopic pairs are perceived. This allows new perspectives of the scene to be seen, giving a more realistic 3D experience. However, capturing an arbitrary number of views is at best cumbersome, and on some occasions impossible. Conventional stereo video (CSV) operates on two video signals captured using two cameras at two different perspectives. Generation and transmission of two views is more feasible than that of multiple views. It would be more efficient if the multiple views required by an autostereoscopic display could be synthesized from this sparse set of views. This paper addresses the conversion of stereoscopic video to multiview video using the video effect morphing. Different morphing algorithms are implemented and evaluated. Contrary to traditional conversion methods, these algorithms disregard the physical depth explicitly and instead generate intermediate views using sparse sets of correspondence features and image morphing. A novel morphing algorithm is also presented that uses scale invariant feature transform (SIFT) and segmentation to construct robust correspondence features and qualitative intermediate views. All algorithms are evaluated on a subjective and objective basis and the comparison results are presented.

international conference on systems signals and image processing | 2012