Vikas Ramachandra
Qualcomm
Publications
Featured research published by Vikas Ramachandra.
3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video | 2008
Vikas Ramachandra; Matthias Zwicker; Truong Q. Nguyen
A procedure for obtaining high dynamic range (HDR) video from multiple differently exposed image sequences captured by a camera array is explored. Using information along both the view-space (camera) and temporal axes is observed to yield a good estimate of the HDR image sequence. Images captured at longer exposures are subject to motion blur, so a novel motion deblurring scheme is applied prior to the actual HDR mapping process. This scheme is a multiscale directional structure preservation procedure that uses information from adjacent views along the camera axis and adjacent frames along the time axis, and it works in spite of illumination variations between images.
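As an illustration of the HDR mapping step that the deblurring precedes, the sketch below merges linearized, differently exposed frames into a radiance estimate using a mid-tone-weighted average; the weighting function and exposure handling are generic assumptions for illustration, not the paper's procedure.

import numpy as np

def merge_exposures(frames, exposure_times):
    # Merge linearized frames (values in [0, 1]) captured at the given
    # exposure times into an HDR radiance map. Mid-tone pixels are
    # trusted most; under- and over-exposed pixels get low weight.
    acc = np.zeros(frames[0].shape, dtype=np.float64)
    wsum = np.zeros_like(acc)
    for img, t in zip(frames, exposure_times):
        w = 1.0 - np.abs(2.0 * img - 1.0)   # hat-shaped confidence weight
        acc += w * (img / t)                # per-frame radiance estimate
        wsum += w
    return acc / np.maximum(wsum, 1e-6)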
International Conference on Acoustics, Speech, and Signal Processing | 2009
Vikas Ramachandra; Matthias Zwicker; Truong Q. Nguyen
Multiview 3D displays must multiplex a set of views onto a single LCD panel. Because of this, each view has to be downsampled considerably, leading to a loss of detail. In this paper, we extend the seam carving technique for adaptive resizing of images: depth information is used along with the image pixel intensity values to guide resizing, which yields better resized multiview images. The results presented show that object structure is maintained with the proposed method compared to vanilla seam carving.
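A minimal sketch of one way to fold depth into the seam-carving energy; the weighting and the meaning of the depth values are assumptions, not the paper's exact formulation.

import numpy as np

def seam_energy(gray, depth, alpha=0.5):
    # Combine gradient-magnitude energy with a depth term so seams
    # avoid near, salient objects; `depth` is assumed larger for
    # closer points and `alpha` is an illustrative weight.
    gy, gx = np.gradient(gray.astype(np.float64))
    return np.abs(gx) + np.abs(gy) + alpha * depth.astype(np.float64)

def cumulative_seam_cost(energy):
    # Standard vertical-seam dynamic program from classic seam carving.
    M = energy.copy()
    for i in range(1, M.shape[0]):
        left = np.roll(M[i - 1], 1);  left[0] = np.inf
        right = np.roll(M[i - 1], -1); right[-1] = np.inf
        M[i] += np.minimum(np.minimum(left, M[i - 1]), right)
    return M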
3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video | 2008
Vikas Ramachandra; Matthias Zwicker; Truong Q. Nguyen
Two new attacks on multiview videos in the view space are investigated, which are also applicable to stereo and free-viewpoint television. The first attack generates new views from different viewpoints; the second changes the region of focus (the display plane, or zero-disparity plane) in the multiview images. A scale-invariant feature transform (SIFT) based fingerprinting mechanism that can identify such attacks is developed: an online verification system matches the SIFT descriptors of the original video (stored in a central database) against those of the attacked video. Results show that the method detects such attacks well and is useful for copy detection on the internet.
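A hedged sketch of the verification idea, using OpenCV's SIFT implementation and a standard ratio test; the threshold and the match-count decision rule are illustrative assumptions, not the paper's verification system.

import cv2

def sift_match_count(original_gray, suspect_gray, ratio=0.75):
    # Count SIFT descriptor matches between the stored original frame
    # and a suspect frame; a low count suggests the content was altered.
    sift = cv2.SIFT_create()
    _, d1 = sift.detectAndCompute(original_gray, None)
    _, d2 = sift.detectAndCompute(suspect_gray, None)
    if d1 is None or d2 is None:
        return 0
    matches = cv2.BFMatcher().knnMatch(d1, d2, k=2)
    return sum(1 for pair in matches
               if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance)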
IEEE Transactions on Visualization and Computer Graphics | 2011
Vikas Ramachandra; Keigo Hirakawa; Matthias Zwicker; Truong Q. Nguyen
In this paper, we analyze the reproduction of light fields on multiview 3D displays. A three-way interaction between the input light field signal (which is often aliased), the joint spatioangular sampling grids of multiview 3D displays, and the inter-view light leakage in modern multiview 3D displays is characterized in the joint spatioangular frequency domain. Reconstruction of light fields by all physical 3D displays is prone to light leakage, which means that the reconstruction low-pass filter implemented by the display is too broad in the angular domain. As a result, 3D displays excessively attenuate high angular frequencies, and our analysis shows that this reduces the sharpness of the images shown on them. In this paper, stereoscopic image recovery is recast as a problem of joint spatioangular signal reconstruction. The combination of the 3D display point spread function and the human visual system provides the narrow-band low-pass filter that removes spectral replicas in the reconstructed light field on the multiview display; the nonideality of this filter is corrected with the proposed prefiltering. The proposed light field reconstruction method performs light field antialiasing as well as angular sharpening to compensate for the nonideal response of the 3D display. The union-of-cosets approach, used earlier by others, is employed here to model the nonrectangular spatioangular sampling grids of a multiview display in a generic fashion. We confirm the effectiveness of our approach in simulation and on physical hardware, and demonstrate improvement over existing techniques.
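The sketch below gives a much-simplified flavor of angular prefiltering; it is not the paper's union-of-cosets formulation. It assumes the display's inter-view leakage can be modeled as a Gaussian blur along the view axis and compensates it with a regularized inverse filter in the angular frequency domain; the display model and all parameters are assumptions.

import numpy as np

def angular_prefilter(light_field, display_sigma=1.0, eps=0.05):
    # light_field: array of shape (num_views, H, W).
    # Model inter-view leakage as a Gaussian along the view axis and
    # boost the attenuated angular frequencies with a Wiener-style
    # regularized inverse, so views look sharper after leakage.
    n = light_field.shape[0]
    f = np.fft.fftfreq(n)                                     # angular frequencies
    display_response = np.exp(-2.0 * (np.pi * display_sigma * f) ** 2)
    boost = display_response / (display_response ** 2 + eps)  # regularized inverse
    spectrum = np.fft.fft(light_field, axis=0)
    return np.real(np.fft.ifft(spectrum * boost[:, None, None], axis=0))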
International Conference on Image Processing | 2008
Vikas Ramachandra; Matthias Zwicker; Truong Q. Nguyen
A method to adaptively code multiview videos is proposed that uses the depth characteristics of automultiscopic multiview displays. For the 3D scene seen on such displays, regions appearing at large depths are rendered blurry. The proposed method identifies these regions and codes them with fewer bits, while regions that appear sharp on the 3D display receive more bits. The overall quality is better than that of regular H.264/AVC. For compatibility with the scalable multiview coding framework, we introduce depth scalability, which ensures that the hierarchically layered encoded bitstream can be displayed optimally on various displays with different depth properties.
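One simple way to realize such depth-driven bit allocation is a per-block quantization-parameter (QP) offset map; the block size, offset range, and depth normalization below are illustrative assumptions, not the paper's scheme.

import numpy as np

def depth_to_qp_offsets(depth_from_display_plane, block=16, max_offset=6):
    # Regions far from the zero-disparity plane are rendered blurry on
    # the display, so they receive a positive QP offset (fewer bits);
    # sharp regions near the plane keep the base QP.
    d = np.abs(depth_from_display_plane).astype(np.float64)
    norm = d.max() + 1e-6
    offsets = np.zeros((d.shape[0] // block, d.shape[1] // block), dtype=np.int32)
    for i in range(offsets.shape[0]):
        for j in range(offsets.shape[1]):
            blk = d[i * block:(i + 1) * block, j * block:(j + 1) * block]
            offsets[i, j] = int(round(max_offset * blk.mean() / norm))
    return offsets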
Proceedings of SPIE | 2011
Kalin Mitkov Atanassov; Sergio Goma; Vikas Ramachandra; Todor G. Georgiev
Depth estimation in a focused plenoptic camera is a critical step for most applications of this technology and poses interesting challenges, as the estimation is content based. We present an iterative, content-adaptive algorithm that exploits the redundancy found in images captured by a focused plenoptic camera. For each point, our algorithm determines its depth along with a measure of reliability, allowing subsequent enhancement of the spatial resolution of the depth map. We note that the spatial resolution of the recovered depth corresponds to discrete depth values in the captured scene, which we refer to as slices. Each slice has a different depth and allows extraction of a different spatial resolution of depth, depending on the scene content present in that slice along with occluding areas. Interestingly, since the focused plenoptic camera is not theoretically limited in spatial resolution, we show that the recovered spatial resolution is depth related and, as such, rendering of a focused plenoptic image is content dependent.
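A toy sketch of the slice idea: adjacent microimages see the same scene patch displaced by a depth-dependent amount, so scanning a discrete set of shifts gives a per-point slice together with a confidence measure. The matching cost, shift range, and reliability ratio are assumptions for illustration, not the paper's iterative algorithm.

import numpy as np

def best_slice(micro_a, micro_b, max_shift=8):
    # micro_a, micro_b: neighboring microimages (2D arrays).
    # Try each candidate horizontal shift (one per depth slice) and
    # keep the best-matching one plus a reliability ratio (values
    # well above 1 mean the minimum is distinct from the runner-up).
    costs = []
    for s in range(1, max_shift + 1):
        a = micro_a[:, s:].astype(np.float64)
        b = micro_b[:, :-s].astype(np.float64)
        costs.append(np.mean((a - b) ** 2))
    costs = np.array(costs)
    best = int(np.argmin(costs))
    runner_up = np.partition(costs, 1)[1]
    reliability = runner_up / (costs[best] + 1e-6)
    return best + 1, reliability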
Proceedings of SPIE | 2011
Katherine L. Bouman; Vikas Ramachandra; Kalin Mitkov Atanassov; Mickey Aleksic; Sergio Goma
The MIPI standard has adopted DPCM compression for RAW image data streamed from mobile cameras. This DPCM is line based and uses a simple one- or two-pixel predictor. In this paper, we analyze the DPCM compression performance as MTF degradation. To test the scheme's performance, we generated Siemens star images and binarized them to two-level images. The two intensity values were chosen such that their difference corresponds to the pixel differences that produce the largest relative errors in the DPCM compressor (e.g., a pixel transition from 0 to 4095 corresponds to an error of 6 between the DPCM-compressed value and the original pixel value); the DPCM scheme introduces different amounts of error depending on the pixel difference. We passed these modified Siemens star chart images through the compressor and compared the compressed images with the originals using IT3 MTF response plots for slanted edges. Further, we discuss the influence of the PSF on the DPCM error and its propagation through the image processing pipeline.
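A generic line-based DPCM round trip with a one-pixel predictor and coarsely quantized residuals is sketched below to show where the error comes from; it does not reproduce the exact MIPI CSI-2 codeword tables, and the residual quantization step is an assumption.

import numpy as np

def dpcm_line_roundtrip(line, shift=3):
    # line: one row of 12-bit RAW samples (values 0..4095).
    # The residual against the previous reconstructed pixel is quantized
    # by dropping `shift` LSBs, so large transitions (e.g. the 0 -> 4095
    # edges of a binarized Siemens star) carry the largest errors, which
    # is what shows up as MTF degradation.
    recon = np.empty_like(line)
    recon[0] = line[0]                       # first pixel sent verbatim
    pred = int(line[0])
    for i in range(1, len(line)):
        q = (int(line[i]) - pred) >> shift   # quantized residual
        pred = int(np.clip(pred + (q << shift), 0, 4095))
        recon[i] = pred
    return recon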
Proceedings of SPIE | 2011
Kalin Mitkov Atanassov; Vikas Ramachandra; Sergio Goma; Milivoje Aleksic
Putting high-quality, easy-to-use 3D technology into the hands of regular consumers has become a recent challenge as interest in 3D technology has grown. Making 3D technology appealing to the average user requires that it be fully automatic and foolproof. Designing a fully automatic 3D capture and display system requires 1) identifying critical 3D technology issues such as camera positioning, the disparity control rationale, and screen geometry dependency, and 2) designing a methodology to control them automatically. Implementing 3D capture on phone cameras requires algorithms that fit within the processing capabilities of the device. Constraints such as sensor position tolerances, sensor 3A tolerances, post-processing, and 3D video resolution and frame rate should be carefully considered for their influence on the 3D experience. Issues with migrating functions such as zoom and pan from the 2D usage model (during both capture and display) to 3D need to be resolved to ensure the highest level of user experience. It is also very important that the 3D usage scenario (including interactions between the user and the capture/display device) be carefully considered. Finally, both the processing power of the device and the practicality of the scheme need to be taken into account when designing the calibration and processing methodology.
Proceedings of SPIE | 2012
Kalin Mitkov Atanassov; Vikas Ramachandra; James Wilson Nash; Sergio Goma
With the rapid growth of 3D technology, 3D image capture has become a critical part of the 3D feature set on mobile phones. 3D image quality is affected by the scene geometry as well as by on-device processing. An automatic 3D system usually assumes known camera poses, established by factory calibration with a special chart. In real-life settings, the pose parameters estimated by factory calibration can be negatively affected by movements of the lens barrel due to shaking, focusing, or a camera drop. If any of these factors displaces the optical axis of either or both cameras, the vertical disparity may exceed the maximum tolerable margin and the 3D viewer may experience eye strain or headaches. To make 3D capture more practical, one needs to consider unassisted calibration on arbitrary scenes. In this paper, we propose an algorithm that relies on the detection and matching of keypoints between the left and right images. Frames containing erroneous matches, along with frames with insufficiently rich keypoint constellations, are detected and discarded. Roll, pitch, yaw, and scale differences between the left and right frames are then estimated. The algorithm's performance is evaluated in terms of the remaining vertical disparity compared to the maximum tolerable vertical disparity.
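A hedged sketch of the matching and frame-rejection step, reporting the residual vertical disparity that the estimated correction should drive below the tolerable margin. ORB is used here as a stand-in keypoint detector; the paper's detector, thresholds, and the roll/pitch/yaw/scale estimation itself are not reproduced.

import cv2
import numpy as np

def vertical_disparity_stats(left_gray, right_gray, min_matches=20):
    # Match keypoints between the left and right frames, discard the
    # pair if the constellation is too sparse, and report the median
    # and 95th-percentile vertical disparity in pixels.
    orb = cv2.ORB_create(nfeatures=1000)
    kl, dl = orb.detectAndCompute(left_gray, None)
    kr, dr = orb.detectAndCompute(right_gray, None)
    if dl is None or dr is None:
        return None
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(dl, dr)
    if len(matches) < min_matches:
        return None                          # insufficiently rich constellation
    dy = np.array([kr[m.trainIdx].pt[1] - kl[m.queryIdx].pt[1] for m in matches])
    return float(np.median(dy)), float(np.percentile(np.abs(dy), 95))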
Proceedings of SPIE | 2014
Vikas Ramachandra; James Wilson Nash; Kalin Mitkov Atanassov; Sergio Goma
A structured-light system for depth estimation is a type of active 3D sensor consisting of a projector that casts an illumination pattern on the scene (e.g., a mask with vertical stripes) and a camera that captures the illuminated scene. Based on the received patterns, the depths of different regions in the scene can be inferred. In this paper, we use side information in the form of image structure to enhance the depth map; this side information is obtained from the received light-pattern image reflected by the scene itself. The processing steps run in real time. This post-processing stage, depth map enhancement, can be used for better hand gesture recognition, as illustrated in this paper.
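One common way to use such image-structure side information is a joint bilateral filter on the depth map guided by the captured image; this is an assumption about the flavor of the post-processing, not the paper's algorithm, and the plain-Python loop below is far from real time.

import numpy as np

def joint_bilateral_depth(depth, guide, radius=4, sigma_s=3.0, sigma_r=0.1):
    # Smooth the depth map with weights that combine spatial distance
    # and similarity in the guide image (assumed normalized to [0, 1]),
    # so depth edges stay aligned with structure in the captured image.
    pad_d = np.pad(depth.astype(np.float64), radius, mode='edge')
    pad_g = np.pad(guide.astype(np.float64), radius, mode='edge')
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(xs ** 2 + ys ** 2) / (2.0 * sigma_s ** 2))
    out = np.zeros_like(depth, dtype=np.float64)
    for i in range(depth.shape[0]):
        for j in range(depth.shape[1]):
            dwin = pad_d[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            gwin = pad_g[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            rng = np.exp(-((gwin - pad_g[i + radius, j + radius]) ** 2)
                         / (2.0 * sigma_r ** 2))
            w = spatial * rng
            out[i, j] = np.sum(w * dwin) / np.sum(w)
    return out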