Publication


Featured research published by Sheila S. Hemami.


Computer Vision and Pattern Recognition | 2009

Frequency-tuned salient region detection

Radhakrishna Achanta; Sheila S. Hemami; Francisco J. Estrada; Sabine Süsstrunk

Detection of visually salient image regions is useful for applications like object segmentation, adaptive compression, and object recognition. In this paper, we introduce a method for salient region detection that outputs full resolution saliency maps with well-defined boundaries of salient objects. These boundaries are preserved by retaining substantially more frequency content from the original image than other existing techniques. Our method exploits features of color and luminance, is simple to implement, and is computationally efficient. We compare our algorithm to five state-of-the-art salient region detection methods with a frequency domain analysis, ground truth, and a salient object segmentation application. Our method outperforms the five algorithms both on the ground-truth evaluation and on the segmentation task by achieving both higher precision and better recall.
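The core of the frequency-tuned approach can be sketched in a few lines: saliency at each pixel is the distance between the image's mean feature vector and a slightly smoothed version of the image. The sketch below is a minimal single-channel illustration under assumptions not in the abstract (the paper operates on the three CIELAB channels, and the exact blur kernel is a detail of the method); the `sigma` value here is arbitrary.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def frequency_tuned_saliency(image):
    """Single-channel sketch of frequency-tuned saliency: per-pixel
    distance between the image's global mean feature and a lightly
    blurred copy of the image (blurring suppresses the very highest
    frequencies, i.e. noise and texture detail)."""
    img = np.asarray(image, dtype=np.float64)
    mu = img.mean()                            # mean feature of the whole image
    blurred = gaussian_filter(img, sigma=1.0)  # small Gaussian smoothing
    return np.abs(mu - blurred)                # distance (scalar case of the Lab norm)
```

A uniform image yields zero saliency everywhere, while a region whose smoothed value differs strongly from the global mean scores high, which is the behaviour the full-resolution saliency maps rely on.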


IEEE Transactions on Image Processing | 2007

VSNR: A Wavelet-Based Visual Signal-to-Noise Ratio for Natural Images

Damon M. Chandler; Sheila S. Hemami

This paper presents an efficient metric for quantifying the visual fidelity of natural images based on near-threshold and suprathreshold properties of human vision. The proposed metric, the visual signal-to-noise ratio (VSNR), operates via a two-stage approach. In the first stage, contrast thresholds for detection of distortions in the presence of natural images are computed via wavelet-based models of visual masking and visual summation in order to determine whether the distortions in the distorted image are visible. If the distortions are below the threshold of detection, the distorted image is deemed to be of perfect visual fidelity (VSNR = ∞) and no further analysis is required. If the distortions are suprathreshold, a second stage is applied which operates based on the low-level visual property of perceived contrast, and the mid-level visual property of global precedence. These two properties are modeled as Euclidean distances in distortion-contrast space of a multiscale wavelet decomposition, and VSNR is computed based on a simple linear sum of these distances. The proposed VSNR metric is generally competitive with current metrics of visual fidelity; it is efficient both in terms of its low computational complexity and in terms of its low memory requirements; and it operates based on physical luminances and visual angle (rather than on digital pixel values and pixel-based dimensions) to accommodate different viewing conditions.


IEEE Transactions on Image Processing | 1995

Transform coded image reconstruction exploiting interblock correlation

Sheila S. Hemami; Teresa H. Meng

Transmission of still images and video over lossy packet networks presents a reconstruction problem at the decoder. Specifically, in the case of block-based transform coded images, loss of one or more packets due to network congestion or transmission errors can result in errant or entirely lost blocks in the decoded image. This article proposes a computationally efficient technique for reconstruction of lost transform coefficients at the decoder that takes advantage of the correlation between transformed blocks of the image. Lost coefficients are linearly interpolated from the same coefficients in adjacent blocks subject to a squared edge error criterion, and the resulting reconstructed coefficients minimize blocking artifacts in the image while providing visually pleasing reconstructions. The required computational expense at the decoder per reconstructed block is less than 1.2 times a non-recursive DCT, and as such this technique is useful for low power, low complexity applications that require good visual performance.
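The interblock idea can be sketched as follows: each coefficient of a lost transform block is linearly interpolated from the same coefficient position in the neighbouring blocks. The sketch below uses a plain average of the available 4-neighbours as the simplest form of linear interpolation; the paper additionally weights the interpolation to minimize a squared edge error criterion, which is omitted here. The array layout (`rows, cols, 8, 8`) is an assumption for illustration.

```python
import numpy as np

def reconstruct_lost_block(coeff_blocks, i, j):
    """Sketch of interblock reconstruction: every coefficient of the
    lost 8x8 transform block (i, j) is estimated from the same
    coefficient in the available adjacent blocks.
    coeff_blocks: array of shape (rows, cols, 8, 8) holding the
    DCT blocks of the image."""
    rows, cols = coeff_blocks.shape[:2]
    neighbours = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
    available = [coeff_blocks[r, c] for r, c in neighbours
                 if 0 <= r < rows and 0 <= c < cols]
    return np.mean(available, axis=0)  # coefficient-wise linear interpolation
```

Because the estimate is formed coefficient by coefficient from adjacent blocks, block boundaries stay consistent with their surroundings, which is what suppresses blocking artifacts in the reconstruction.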


Signal Processing-image Communication | 2010

No-reference image and video quality estimation: Applications and human-motivated design

Sheila S. Hemami; Amy R. Reibman

This paper reviews the basic background knowledge necessary to design effective no-reference (NR) quality estimators (QEs) for images and video. We describe a three-stage framework for NR QE that encompasses the range of potential use scenarios for the NR QE and allows knowledge of the human visual system to be incorporated throughout. We survey the measurement stage of the framework, considering methods that rely on bitstream, pixels, or both. By exploring both the accuracy requirements of potential uses as well as evaluation criteria to stress-test a QE, we set the stage for our community to make substantial future improvements to the challenging problem of NR quality estimation.


IEEE Transactions on Circuits and Systems for Video Technology | 2006

A scalable wavelet-based video distortion metric and applications

Mark A. Masry; Sheila S. Hemami; Yegnaswamy Sermadevi

Video distortion metrics based on models of the human visual system have traditionally used comparisons between the distorted signal and a reference signal to calculate distortions objectively. In video coding applications, this is not prohibitive. In quality monitoring applications, however, access to the reference signal is often limited. This paper presents a computationally efficient video distortion metric that can operate in full- or reduced-reference mode as required. The metric is based on a model of the human visual system implemented using the wavelet transform and separable filters. The visual model is parameterized using a set of video frames and the associated quality scores. The visual model's hierarchical structure, as well as the limited impact of fine-scale distortions on quality judgments of severely impaired video, are exploited to build a framework for scaling the bitrate required to represent the reference signal. Two applications of the metric are also presented. In the first, the metric is used as the distortion measure in a rate-distortion optimized rate control algorithm for MPEG-2 video compression. The resulting compressed video sequences demonstrate significant improvements in visual quality over compressed sequences with allocations determined by the TM5 rate control algorithm operating with MPEG-2 at the same rate. In the second, the metric is used to estimate time series of objective quality scores for distorted video sequences using reference bitrates as low as 10 kb/s. The resulting quality scores more accurately model subjective quality recordings than do those estimated using the mean squared error as a distortion metric, while requiring a fraction of the bitrate used to represent the reference signal. The reduced-reference metric's performance is comparable to that of the full-reference metrics tested in the first Video Quality Experts Group evaluation.


IEEE Transactions on Image Processing | 2005

Dynamic contrast-based quantization for lossy wavelet image compression

Damon M. Chandler; Sheila S. Hemami

This paper presents a contrast-based quantization strategy for use in lossy wavelet image compression that attempts to preserve visual quality at any bit rate. Based on the results of recent psychophysical experiments using near-threshold and suprathreshold wavelet subband quantization distortions presented against natural-image backgrounds, subbands are quantized such that the distortions in the reconstructed image exhibit root-mean-squared contrasts selected based on image, subband, and display characteristics and on a measure of total visual distortion so as to preserve the visual system's ability to integrate edge structure across scale space. Within a single, unified framework, the proposed contrast-based strategy yields images which are competitive in visual quality with results from current visually lossless approaches at high bit rates and which demonstrate improved visual quality over current visually lossy approaches at low bit rates. This strategy operates in the context of both nonembedded and embedded quantization, the latter of which yields a highly scalable codestream which attempts to maintain visual quality at all bit rates; a specific application of the proposed algorithm to JPEG-2000 is presented.
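The link between a quantization step size and the RMS contrast of the resulting distortion can be illustrated with the standard uniform-quantizer noise model: the RMS quantization error is approximately step/√12, so a step can be chosen to make the distortion's RMS contrast (RMS error over mean luminance) hit a target. This sketch shows only that principle; the paper's actual targets additionally depend on subband, display characteristics, and a total-distortion measure, none of which are modeled here.

```python
import numpy as np

def step_for_target_contrast(target_rms_contrast, mean_luminance):
    """Choose a uniform quantizer step so that the quantization
    distortion has a given RMS contrast. Uses the standard model
    RMS error ~= step / sqrt(12) for a uniform scalar quantizer."""
    target_rms_error = target_rms_contrast * mean_luminance
    return target_rms_error * np.sqrt(12.0)

def rms_contrast(distortion, mean_luminance):
    """RMS contrast of a distortion signal against a mean luminance."""
    return np.sqrt(np.mean(distortion ** 2)) / mean_luminance
```

Quantizing a subband with the returned step and measuring the contrast of the resulting error should land close to the requested target, which is the mechanism the contrast-based strategy exploits per subband.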


International Conference on Image Processing | 1997

Regularity-preserving image interpolation

W. Knox Carey; Daniel B. Chuang; Sheila S. Hemami

Common image interpolation methods assume that the underlying signal is continuous and may require that it possess one or more continuous derivatives. These assumptions are not generally true of natural images, most of which have instantaneous luminance transitions at the boundaries between objects. Continuity requirements on the interpolating function produce interpolated images with oversmoothed edges. To avoid this effect, a wavelet-based interpolation method that imposes no continuity constraints is introduced. The algorithm estimates the regularity of edges by measuring the decay of wavelet transform coefficients across scales and attempts to preserve the underlying regularity by extrapolating a new subband to be used in image resynthesis. The algorithm produces noticeably sharper edges than traditional techniques and exhibits an average PSNR improvement of 2.5 dB over bilinear and bicubic techniques.
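The regularity estimate at the heart of the method can be sketched as follows: for a Lipschitz-regular edge, wavelet coefficient magnitudes decay geometrically from coarse to fine scales, so the decay ratio measured over the known scales can be extrapolated to predict the magnitude of one additional, finer subband. This is a minimal scalar sketch of that extrapolation; the actual algorithm does this per edge location and then synthesizes a full new subband, which is omitted here.

```python
import numpy as np

def extrapolate_fine_scale(mags):
    """Sketch of regularity-based extrapolation: given coefficient
    magnitudes at successive scales (ordered coarse -> fine), estimate
    the geometric decay ratio across scales and predict the magnitude
    at the next finer scale."""
    mags = np.asarray(mags, dtype=np.float64)
    ratios = mags[1:] / mags[:-1]           # per-scale decay ratios
    rho = np.exp(np.mean(np.log(ratios)))   # geometric-mean decay ratio
    return mags[-1] * rho                   # predicted finer-scale magnitude
```

For example, magnitudes of 8, 4, 2 across three scales imply a decay ratio of 0.5 and a predicted finer-scale magnitude of 1; preserving that decay when synthesizing the new subband is what keeps interpolated edges sharp.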


IEEE Transactions on Image Processing | 1997

Subband-coded image reconstruction for lossy packet networks

Sheila S. Hemami; Robert M. Gray

Transmission of digital subband-coded images over lossy packet networks presents a reconstruction problem at the decoder. This paper presents two techniques for reconstruction of lost subband coefficients, one for low-frequency coefficients and one for high-frequency coefficients. The low-frequency reconstruction algorithm is based on inherent properties of the hierarchical subband decomposition. To maintain smoothness and exploit the high intraband correlation, a cubic interpolative surface is fit to known coefficients to interpolate lost coefficients. Accurate edge placement, crucial for visual quality, is achieved by adapting the interpolation grid in both the horizontal and vertical directions as determined by the edges present. An edge model is used to characterize the adaptation, and a quantitative analysis of this model demonstrates that edges can be identified by simply examining the high-frequency bands, without requiring any additional processing of the low-frequency band. High-frequency reconstruction is performed using linear interpolation, which provides good visual performance as well as maintains properties required for edge placement in the low-frequency reconstruction algorithm. The complete algorithm performs well on loss of single coefficients, vectors, and small blocks, and is therefore applicable to a variety of source coding techniques.
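The smooth-interpolation principle behind the low-frequency reconstruction can be sketched in one dimension: fit a cubic through the known coefficients surrounding a lost one and evaluate it at the lost position. The paper fits a full cubic surface and adapts the interpolation grid to detected edges; this 1-D version, with a hypothetical helper name and neighbour layout, only illustrates the principle.

```python
import numpy as np

def interpolate_lost_lf(row, k):
    """1-D sketch of low-frequency reconstruction: fit a cubic through
    the two known coefficients on each side of the lost coefficient at
    index k, then evaluate the cubic at k."""
    xs = np.array([k - 2, k - 1, k + 1, k + 2], dtype=np.float64)
    ys = row[xs.astype(int)]
    coeffs = np.polyfit(xs, ys, 3)   # cubic through the four neighbours
    return np.polyval(coeffs, float(k))
```

Because four points determine a cubic exactly, any coefficient slice that is locally cubic is recovered without error, which is the smoothness property the full surface fit maintains in the low-frequency band.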


Journal of The Optical Society of America A-optics Image Science and Vision | 2003

Effects of natural images on the detectability of simple and compound wavelet subband quantization distortions

Damon M. Chandler; Sheila S. Hemami

Quantization of the coefficients within a discrete wavelet transform subband gives rise to distortions in the reconstructed image that are localized in spatial frequency and orientation and are spatially correlated with the image. We investigated the detectability of these distortions: Contrast thresholds were measured for both simple and compound distortions presented in the unmasked paradigm and against two natural-image maskers. Simple and compound distortions were generated through uniform scalar quantization of one or two subbands. Unmasked detection thresholds for simple distortions yielded contrast sensitivity functions similar to those reported for 1-octave Gabor patches. Detection thresholds for simple distortions presented against two natural-image backgrounds revealed that thresholds were elevated across the frequency range of 1.15-18.4 cycles per degree with the greatest elevation for low-frequency distortions. Unmasked thresholds for compound distortions revealed relative sensitivities of 1.1-1.2, suggesting that summation of responses to wavelet distortions is similar to summation of responses to gratings. Masked thresholds for compound distortions revealed relative sensitivities of 1.5-1.7, suggesting greater summation when distortions are masked by natural images.


International Conference on Image Processing | 2008

Understanding and simplifying the structural similarity metric

David M. Rouse; Sheila S. Hemami

The structural similarity (SSIM) metric and its multi-scale extension (MS-SSIM) evaluate visual quality with a modified local measure of spatial correlation consisting of three components: mean, variance, and cross-correlation. This paper investigates how the SSIM components contribute to its quality evaluation of common image artifacts. The predictive performance of the individual components and pairwise component products is assessed using the LIVE image database. After a nonlinear mapping, the product of the variance and cross-correlation components yields nearly identical linear correlation with subjective ratings as the complete SSIM and MS-SSIM computations. A computationally simple alternative to SSIM (cf. Eq. (6)) that ignores the mean component and sets the local average patch values to 128 exhibits only a 1% decrease in linear correlation with subjective ratings (to 0.934) relative to the complete SSIM evaluation, with an over 20% reduction in the number of multiplications.
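The simplification studied here can be sketched directly: drop SSIM's mean (luminance) component and keep the product of the variance and cross-correlation components, which with the usual SSIM constants collapses to (2·cov + C2) / (var_x + var_y + C2). The sketch below computes this over one global window rather than SSIM's sliding 11×11 local windows, and uses the paper's fixed local average of 128 in place of measured local means; both are simplifying assumptions for illustration.

```python
import numpy as np

def simplified_ssim(x, y, c2=(0.03 * 255) ** 2):
    """Global-window sketch of the simplified SSIM: the mean component
    is dropped and second moments are taken about a fixed value of 128
    instead of the measured local means."""
    xd = np.asarray(x, dtype=np.float64) - 128.0  # fixed "local average"
    yd = np.asarray(y, dtype=np.float64) - 128.0
    var_x = np.mean(xd ** 2)                      # second moment about 128
    var_y = np.mean(yd ** 2)
    cov = np.mean(xd * yd)                        # cross term about 128
    return (2.0 * cov + c2) / (var_x + var_y + c2)
```

Identical inputs score exactly 1, and the score drops as the cross term weakens relative to the variances; avoiding the per-window mean computation is where the reported reduction in multiplications comes from.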
