Sergio Goma
Qualcomm
Publication
Featured research published by Sergio Goma.
Proceedings of SPIE | 2013
Todor Georgiev Georgiev; Zhan Yu; Andrew Lumsdaine; Sergio Goma
The Lytro camera is the first implementation of a plenoptic camera for the consumer market. We consider it a successful example of the miniaturization, aided by the increase in computational power, that characterizes mobile computational photography. The plenoptic camera approach to radiance capture uses a microlens array as an imaging system focused on the focal plane of the main camera lens. This paper analyzes the performance of the Lytro camera from a system-level perspective, treating the Lytro camera as a black box and using our interpretation of the image data saved by the camera. We present our findings based on our interpretation of the Lytro camera's file structure, image calibration, and image rendering; in this context, artifacts and final image resolution are discussed.
IEEE Transactions on Consumer Electronics | 2014
James Nightingale; Qi Wang; Christos Grecos; Sergio Goma
Users of modern portable consumer devices (smartphones, tablets etc.) expect ubiquitous delivery of high quality services, which fully utilise the capabilities of their devices. Video streaming is one of the most widely used yet challenging services for operators to deliver with assured service levels. This challenge is more apparent in wireless networks where bandwidth constraints and packet loss are common. The lower bandwidth requirements of High Efficiency Video Coding (HEVC) provide the potential to enable service providers to deliver high quality video streams in low-bandwidth networks; however, packet loss may result in greater damage to perceived quality given the higher compression ratio. This work considers the delivery of HEVC encoded video streams in impaired network environments and quantifies the effects of network impairment on HEVC video streaming from the perspective of the end user. HEVC encoded streams were transmitted over a test network with both wired and wireless segments that had imperfect communication channels subject to packet loss. Two different error concealment methods were employed to mitigate packet loss and overcome reference decoder robustness issues. The perceptual quality of received video was subjectively assessed by a panel of viewers. Existing subjective studies of HEVC quality have not considered the implications of network impairments. Analysis of results has quantified the effects of packet loss in HEVC on perceptual quality and provided valuable insight into the relative importance of the main factors observed to influence user perception in HEVC streaming. The outputs from this study show the relative importance of, and relationship between, those factors that affect human perception of quality in impaired HEVC encoded video streams. The subjective analysis is supported by comparison with commonly used objective quality measurement techniques. Outputs from this work may be used in the development of quality of experience (QoE) oriented streaming applications for HEVC in loss prone networks.
Studies in Regional Science | 2009
Todor G. Georgiev; Andrew Lumsdaine; Sergio Goma
We demonstrate high dynamic range (HDR) imaging with the Plenoptic 2.0 camera. Multiple exposure capture is achieved with a single shot using microimages created by a microlens array that has an interleaved set of different apertures.
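As a rough illustration of the single-shot multiple-exposure idea, the sketch below merges a co-located short/long exposure pair (as produced by two interleaved apertures) into a linear HDR value. The function name, the saturation threshold, and the fallback merge rule are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def merge_exposures(short, long, ratio, sat=4000):
    """Merge a short- and a long-exposure microimage into linear HDR.

    `ratio` is the exposure ratio between the two apertures (an assumed
    parameter). Saturated long-exposure pixels fall back to the short
    exposure, rescaled to the same radiometric scale.
    """
    long = np.asarray(long, dtype=np.float64)
    short = np.asarray(short, dtype=np.float64) * ratio  # bring to same scale
    return np.where(long >= sat, short, long)

# A clipped highlight (4095) is replaced by the scaled short exposure.
hdr = merge_exposures([500, 12], [4095, 100], ratio=8)
```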
IEEE Communications Magazine | 2014
James Nightingale; Qi Wang; Christos Grecos; Sergio Goma
Video and multimedia streaming services continue to grow in popularity and are rapidly becoming the largest consumers of network capacity in both fixed and mobile networks. In this article we discuss the latest advances in video compression technology and demonstrate their potential to improve service quality for consumers while reducing bandwidth consumption. Our study focuses on the adaptation of scalable, highly compressed video streams to meet the resource constraints of a wide range of portable consumer devices in mobile environments. Exploring SHVC, the scalable extension to the recently standardized High Efficiency Video Coding scheme, we show the bandwidth savings that can be achieved over current encoding schemes and highlight the challenges that lie ahead in realizing a deployable and user-centric system.
Proceedings of SPIE | 2011
Kalin Mitkov Atanassov; Sergio Goma; Vikas Ramachandra; Todor G. Georgiev
Depth estimation in the focused plenoptic camera is a critical step for most applications of this technology and poses interesting challenges, as the estimation is content based. We present an iterative, content-adaptive algorithm that exploits the redundancy found in images captured with a focused plenoptic camera. Our algorithm determines, for each point, its depth along with a measure of reliability, allowing subsequent enhancement of the spatial resolution of the depth map. We note that the spatial resolution of the recovered depth corresponds to discrete values of depth in the captured scene, which we refer to as slices. Moreover, each slice has a different depth and will allow extraction of a different spatial resolution of depth, depending on the scene content present in that slice along with occluding areas. Interestingly, as the focused plenoptic camera is not theoretically limited in spatial resolution, we show that the recovered spatial resolution is depth related and, as such, rendering of a focused plenoptic image is content dependent.
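The redundancy the abstract refers to comes from the same scene point repeating in adjacent microimages, with a shift that encodes depth. A minimal sketch of that matching step, with a simple reliability score, is below; the function name, the SAD cost, and the best/second-best ratio are assumptions for illustration, while the actual algorithm is iterative and content adaptive.

```python
import numpy as np

def microimage_disparity(patch, neighbour, max_shift):
    """Match a patch from one microimage against the adjacent microimage.

    Returns the best integer shift (a proxy for depth) and a reliability
    score: the ratio of the second-best to the best SAD cost, so values
    well above 1 indicate an unambiguous, trustworthy match.
    """
    costs = []
    for s in range(max_shift + 1):
        window = neighbour[:, s:s + patch.shape[1]]
        costs.append(int(np.abs(patch.astype(int) - window.astype(int)).sum()))
    order = np.argsort(costs)
    reliability = costs[order[1]] / max(costs[order[0]], 1)
    return int(order[0]), reliability

# The patch reappears 2 pixels to the right in the neighbouring microimage.
patch = np.array([[1, 2, 3], [4, 5, 6]])
neighbour = np.zeros((2, 6), dtype=int)
neighbour[:, 2:5] = patch
shift, rel = microimage_disparity(patch, neighbour, max_shift=3)
```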
Proceedings of SPIE | 2014
James Nightingale; Qi Wang; Christos Grecos; Sergio Goma
As network service providers seek to improve customer satisfaction and retention levels, they are increasingly moving from traditional quality of service (QoS) driven delivery models to customer-centred quality of experience (QoE) delivery models. QoS models consider only metrics derived from the network; QoE models, however, also consider metrics derived from within the video sequence itself. Various spatial and temporal characteristics of a video sequence have been proposed, both individually and in combination, to derive methods of classifying video content either on a continuous scale or as a set of discrete classes. QoE models can be divided into three broad categories: full-reference, reduced-reference, and no-reference models. Due to the need to have the original video available at the client for comparison, full-reference metrics are of limited practical value in adaptive real-time video applications. Reduced-reference metrics often require metadata to be transmitted with the bitstream, while no-reference metrics typically operate in the decompressed domain at the client side and require significant processing to extract spatial and temporal features. This paper proposes a heuristic, no-reference approach to video content classification that is specific to HEVC-encoded bitstreams. The HEVC encoder already makes use of spatial characteristics to determine the partitioning of coding units and temporal characteristics to determine the splitting of prediction units. We derive a function that approximates the spatio-temporal characteristics of the video sequence by using the weighted averages of the depth at which the coding-unit quadtree is split and the prediction mode decision made by the encoder to estimate spatial and temporal characteristics, respectively. Since the video content type of a sequence is determined using high-level information parsed from the video stream, spatio-temporal characteristics are identified without the need for full decoding and can be used in a timely manner to aid decision making in QoE-oriented adaptive real-time streaming.
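The weighted-average idea above can be sketched as follows. The inputs would be parsed from the HEVC bitstream (CU split depths, CU areas, and whether each CU was intra coded); the function name, the area weighting, and the thresholds are illustrative assumptions, not the function derived in the paper.

```python
def classify_content(cu_depths, cu_areas, intra_flags,
                     spatial_thresh=1.5, temporal_thresh=0.3):
    """Heuristic spatio-temporal content classifier in the spirit of the paper.

    cu_depths   : quadtree split depth of each coding unit (0..3)
    cu_areas    : pixel area of each CU, used as the averaging weight
    intra_flags : True where the encoder chose intra prediction
    """
    total = sum(cu_areas)
    # Deeply split CUs indicate spatial detail.
    spatial = sum(d * a for d, a in zip(cu_depths, cu_areas)) / total
    # A large share of intra-coded area suggests temporal activity.
    temporal = sum(a for f, a in zip(intra_flags, cu_areas) if f) / total
    label = ("complex" if spatial > spatial_thresh else "simple",
             "dynamic" if temporal > temporal_thresh else "static")
    return spatial, temporal, label
```

Because these statistics come from entropy-decoded syntax only, no pixel reconstruction is needed, which is what makes the approach usable for real-time stream adaptation.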
Proceedings of SPIE | 2011
Katherine L. Bouman; Vikas Ramachandra; Kalin Mitkov Atanassov; Mickey Aleksic; Sergio Goma
The MIPI standard has adopted DPCM compression for RAW data images streamed from mobile cameras. This DPCM is line based and uses either a simple 1- or 2-pixel predictor. In this paper, we analyze DPCM compression performance as MTF degradation. To test this scheme's performance, we generated Siemens star images and binarized them to 2-level images. These two intensity values were chosen such that their intensity difference corresponds to the pixel differences that result in the largest relative errors in the DPCM compressor (e.g., a pixel transition from 0 to 4095 corresponds to an error of 6 between the DPCM-compressed value and the original pixel value). The DPCM scheme introduces different amounts of error based on the pixel difference. We passed these modified Siemens star chart images to the compressor and compared the compressed images with the original images using IT3 MTF response plots for slanted edges. Further, we discuss the influence of the PSF on DPCM error and its propagation through the image processing pipe.
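To illustrate why the largest errors occur at hard edges, here is a toy line-based DPCM with a 1-pixel predictor and a coarsely quantized residual. The real MIPI DPCM modes use variable-length residual codes rather than this fixed bit-drop quantizer, so the exact error values differ; the sketch only shows the mechanism by which large pixel differences produce the largest reconstruction errors.

```python
import numpy as np

def dpcm_line(line, shift=3):
    """Toy 1-pixel-predictor DPCM for one 12-bit RAW line.

    The previous reconstructed pixel predicts the current one; the
    prediction residual is quantized by dropping `shift` low bits,
    which is where the compression error comes from.
    """
    recon = np.empty_like(line)
    recon[0] = line[0]                       # first pixel sent verbatim
    pred = int(line[0])
    for i in range(1, len(line)):
        diff = int(line[i]) - pred
        qdiff = (diff >> shift) << shift     # quantize the residual
        recon[i] = np.clip(pred + qdiff, 0, 4095)
        pred = int(recon[i])                 # predictor tracks decoder state
    return recon

# A hard 0 -> 4095 edge shows the worst-case residual error;
# flat regions reconstruct exactly.
line = np.array([0, 0, 4095, 4095], dtype=np.uint16)
err = np.abs(line.astype(int) - dpcm_line(line).astype(int))
```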
Proceedings of SPIE | 2011
Kalin Mitkov Atanassov; Vikas Ramachandra; Sergio Goma; Milivoje Aleksic
Putting high-quality and easy-to-use 3D technology into the hands of regular consumers has become a recent challenge as interest in 3D technology has grown. Making 3D technology appealing to the average user requires that it be made fully automatic and foolproof. Designing a fully automatic 3D capture and display system requires: 1) identifying critical 3D technology issues such as camera positioning, disparity control rationale, and screen geometry dependency, and 2) designing a methodology to automatically control them. Implementing 3D capture functionality on phone cameras necessitates designing algorithms that fit within the processing capabilities of the device. Various constraints, such as sensor position tolerances, sensor 3A tolerances, post-processing, and 3D video resolution and frame rate, should be carefully considered for their influence on the 3D experience. Issues with migrating functions such as zoom and pan from the 2D usage model (both during capture and display) to 3D need to be resolved to ensure the highest level of user experience. It is also very important that the 3D usage scenario (including interactions between the user and the capture/display device) is carefully considered. Finally, both the processing power of the device and the practicality of the scheme need to be taken into account while designing the calibration and processing methodology.
Proceedings of SPIE | 2014
James Nightingale; Qi Wang; Christos Grecos; Sergio Goma
High Efficiency Video Coding (HEVC), the latest video compression standard (also known as H.265), can deliver video streams of comparable quality to the current H.264 Advanced Video Coding (H.264/AVC) standard with a 50% reduction in bandwidth. Research into SHVC, the scalable extension to the HEVC standard, is still in its infancy. One important area for investigation is whether, given the greater compression ratio of HEVC (and SHVC), the loss of packets containing video content will have a greater impact on the quality of delivered video than is the case with H.264/AVC or its scalable extension H.264/SVC. In this work we empirically evaluate the layer-based, in-network adaptation of video streams encoded using SHVC in situations where dynamically changing bandwidths and datagram loss ratios require the real-time adaptation of video streams. Through extensive experimentation, we establish a comprehensive set of benchmarks for SHVC-based high-definition video streaming in loss-prone network environments such as those commonly found in mobile networks. Among other results, we highlight that packet losses of only 1% can lead to a substantial reduction in PSNR of over 3 dB and to error propagation in over 130 pictures following the one in which the loss occurred. This work is one of the earliest studies in this cutting-edge area to report benchmark evaluation results for the effects of datagram loss on SHVC picture quality, and it offers empirical and analytical insights into SHVC adaptation to lossy, mobile networking conditions.
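The 3 dB figure above is measured with the standard PSNR metric, which for completeness can be computed per frame as follows (a minimal implementation, assuming 8-bit frames so the peak value is 255):

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """PSNR in dB between a reference frame and a decoded frame."""
    ref = np.asarray(ref, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    mse = np.mean((ref - test) ** 2)          # mean squared error
    if mse == 0:
        return float("inf")                   # identical frames
    return 10.0 * np.log10(peak * peak / mse)
```

A 3 dB drop corresponds to roughly doubling the mean squared error of the decoded pictures, which is why a 1% loss ratio is a substantial impairment.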
Proceedings of SPIE | 2012
Kalin Mitkov Atanassov; Vikas Ramachandra; James Wilson Nash; Sergio Goma
With the rapid growth of 3D technology, 3D image capture has become a critical part of the 3D feature set on mobile phones. 3D image quality is affected by the scene geometry as well as on-the-device processing. An automatic 3D system usually assumes known camera poses accomplished by factory calibration using a special chart. In real life settings, pose parameters estimated by factory calibration can be negatively impacted by movements of the lens barrel due to shaking, focusing, or camera drop. If any of these factors displaces the optical axes of either or both cameras, vertical disparity might exceed the maximum tolerable margin and the 3D user may experience eye strain or headaches. To make 3D capture more practical, one needs to consider unassisted (on arbitrary scenes) calibration. In this paper, we propose an algorithm that relies on detection and matching of keypoints between left and right images. Frames containing erroneous matches, along with frames with insufficiently rich keypoint constellations, are detected and discarded. Roll, pitch yaw , and scale differences between left and right frames are then estimated. The algorithm performance is evaluated in terms of the remaining vertical disparity as compared to the maximum tolerable vertical disparity.