P. Le Callet
University of Nantes
Publications
Featured research published by P. Le Callet.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2006
O. Le Meur; P. Le Callet; Dominique Barba; Dominique Thoreau
Visual attention is a mechanism which filters out redundant visual information and detects the most relevant parts of our visual field. Automatic determination of the most visually relevant areas would be useful in many applications such as image and video coding, watermarking, video browsing, and quality assessment. Many research groups are currently investigating computational modeling of the visual attention system. The first published computational models were based on some basic and well-understood human visual system (HVS) properties. These models feature a single perceptual layer that simulates only one aspect of the visual system. More recent models integrate complex features of the HVS and simulate hierarchical perceptual representation of the visual input. The bottom-up mechanism is the most common feature found in modern models. This mechanism refers to involuntary attention (i.e., salient spatial visual features that effortlessly or involuntarily attract our attention). This paper presents a coherent computational approach to the modeling of bottom-up visual attention. The model is mainly based on the current understanding of HVS behavior. Contrast sensitivity functions, perceptual decomposition, visual masking, and center-surround interactions are some of the features implemented in this model. The performance of this algorithm is assessed by using natural images and experimental measurements from an eye-tracking system. Two well-known metrics (correlation coefficient and Kullback-Leibler divergence) are used to validate this model. A further metric is also defined. The results from this model are finally compared to those from a reference bottom-up model.
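A minimal sketch (not the authors' code) of the two validation metrics named in the abstract, Pearson correlation coefficient and Kullback-Leibler divergence, computed between a model saliency map and a fixation density map; the random maps stand in for real eye-tracking data.

```python
import numpy as np

def correlation_coefficient(saliency, fixation_density):
    """Pearson correlation coefficient between two maps of identical shape."""
    s = (saliency - saliency.mean()) / (saliency.std() + 1e-12)
    f = (fixation_density - fixation_density.mean()) / (fixation_density.std() + 1e-12)
    return float(np.mean(s * f))

def kl_divergence(saliency, fixation_density, eps=1e-12):
    """KL divergence after normalizing both maps to probability distributions."""
    p = fixation_density / (fixation_density.sum() + eps)
    q = saliency / (saliency.sum() + eps)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

# Example with synthetic maps standing in for real data
rng = np.random.default_rng(0)
model_map = rng.random((64, 64))
human_map = rng.random((64, 64))
print(correlation_coefficient(model_map, human_map), kl_divergence(model_map, human_map))
```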
IEEE Journal of Selected Topics in Signal Processing | 2009
A. Ninassi; O. Le Meur; P. Le Callet; D. Barba
Temporal distortions such as flickering, jerkiness, and mosquito noise play a fundamental part in video quality assessment. A temporal distortion is commonly defined as the temporal evolution, or fluctuation, of the spatial distortion on a particular area which corresponds to the image of a specific object in the scene. Perception of spatial distortions over time can be largely modified by their temporal changes, such as increases or decreases in the distortions, or periodic changes in the distortions. In this paper, we have designed a perceptual full-reference video quality assessment metric by focusing on the temporal evolution of the spatial distortions. As the perception of temporal distortions is closely linked to the visual attention mechanisms, we have chosen to first evaluate the temporal distortion at the eye fixation level. In this short-term temporal pooling, the video sequence is divided into spatio-temporal segments in which the spatio-temporal distortions are evaluated, resulting in spatio-temporal distortion maps. Afterwards, the global quality score of the whole video sequence is obtained by a long-term temporal pooling in which the spatio-temporal maps are spatially and temporally pooled. Consistent improvement over existing objective video quality assessment methods is observed. Our validation has been carried out with a dataset built from video sequences of various contents.
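A hedged sketch of the two-stage pooling structure described above, not the published metric: per-frame spatial distortion maps are averaged over fixed-length temporal segments (short-term pooling), then each segment map is spatially and temporally pooled into a single score. The segment length and pooling operators are illustrative assumptions.

```python
import numpy as np

def short_term_pooling(distortion_maps, segment_len=10):
    """Average per-frame distortion maps over fixed-length temporal segments."""
    segments = [distortion_maps[i:i + segment_len]
                for i in range(0, len(distortion_maps), segment_len)]
    return [np.mean(seg, axis=0) for seg in segments]  # one map per segment

def long_term_pooling(st_maps, spatial_percentile=95):
    """Spatially pool each segment map (high percentile emphasizes the worst areas),
    then temporally average into a single degradation score."""
    per_segment = [np.percentile(m, spatial_percentile) for m in st_maps]
    return float(np.mean(per_segment))

rng = np.random.default_rng(1)
frames = [rng.random((36, 64)) for _ in range(60)]   # stand-in distortion maps
print(long_term_pooling(short_term_pooling(frames)))
```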
international conference on image processing | 2003
M. Carnec; P. Le Callet; D. Barba
This paper presents a new method to evaluate the quality of distorted images. This method is based on a comparison between the structural information extracted from the distorted image and from the original image. The interest of our method is that it uses reduced references containing perceptual structural information. First, a quick overview of image quality evaluation methods is given. Then the implementation of our human visual system (HVS) model is detailed. Finally, results are given for quality evaluation of JPEG and JPEG2000 coded images. They show that our method provides results which are highly correlated with human judgments (mean opinion score). This method has been implemented in an application available on the Internet.
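An illustrative sketch of the reduced-reference idea only, not the authors' HVS model: a small structural signature (here a gradient-orientation histogram, chosen purely for demonstration) is extracted from both images and compared, so only the compact signature needs to travel with the reference.

```python
import numpy as np

def structural_signature(image, bins=16):
    """Coarse structural descriptor: magnitude-weighted histogram of gradient orientations."""
    gy, gx = np.gradient(image.astype(float))
    angles = np.arctan2(gy, gx)
    magnitude = np.hypot(gx, gy)
    hist, _ = np.histogram(angles, bins=bins, range=(-np.pi, np.pi), weights=magnitude)
    return hist / (hist.sum() + 1e-12)

def rr_quality_score(reference, distorted):
    """Similarity of the two reduced references (1 = identical structure)."""
    r = structural_signature(reference)
    d = structural_signature(distorted)
    return float(1.0 - 0.5 * np.abs(r - d).sum())

rng = np.random.default_rng(2)
ref = rng.random((128, 128))
dist = ref + 0.05 * rng.standard_normal((128, 128))
print(rr_quality_score(ref, dist))
```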
IEEE Journal of Selected Topics in Signal Processing | 2011
Emilie Bosc; Romuald Pepion; P. Le Callet; Martin Köppel; Patrick Ndjiki-Nya; Muriel Pressigout; Luce Morin
3DTV technology has brought out new challenges such as the question of synthesized view evaluation. Synthesized views are generated through a depth image-based rendering (DIBR) process. This process induces new types of artifacts whose impact on visual quality has to be identified considering various contexts of use. While visual quality assessment has been the subject of many studies in the last 20 years, there are still some unanswered questions regarding new technological improvements. DIBR brings new challenges mainly because it deals with geometric distortions. This paper considers the DIBR-based synthesized view evaluation problem. Different experiments have been carried out. They question the protocols of subjective assessment and the reliability of objective quality metrics in the context of 3DTV under these specific conditions (DIBR-based synthesized views), and they consist in assessing seven different view synthesis algorithms through subjective and objective measurements. Results show that usual metrics are not sufficient for assessing 3-D synthesized views, since they do not reliably reflect human judgment. Synthesized views contain specific artifacts located around the disoccluded areas, but usual metrics seem unable to express the degree of annoyance perceived in the whole image. This study provides hints for a new objective measure. Two approaches are proposed: the first is based on the analysis of the shifts of the contours of the synthesized view; the second is based on the computation of a mean SSIM score of the disoccluded areas.
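A hedged sketch of the second proposed approach, averaging an SSIM map over the disoccluded areas only. It assumes scikit-image is available and that a disocclusion mask is already known (e.g., from the DIBR warping step); it is not the authors' exact implementation.

```python
import numpy as np
from skimage.metrics import structural_similarity as ssim

def disoccluded_ssim(reference, synthesized, disocclusion_mask):
    """Mean SSIM restricted to pixels flagged as disoccluded."""
    _, ssim_map = ssim(reference, synthesized, full=True, data_range=1.0)
    return float(ssim_map[disocclusion_mask].mean())

rng = np.random.default_rng(3)
ref = rng.random((96, 96))
syn = ref.copy()
mask = np.zeros_like(ref, dtype=bool)
mask[40:60, 40:60] = True                      # stand-in disoccluded region
syn[mask] += 0.2 * rng.standard_normal(mask.sum())
print(disoccluded_ssim(ref, np.clip(syn, 0, 1), mask))
```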
IEEE Transactions on Broadcasting | 2011
Quan Huynh-Thu; Marcus Barkowsky; P. Le Callet
Three-dimensional video content has attracted much attention in both the cinema and television industries, because 3D is considered to be the next key feature that can significantly enhance the visual experience of viewers. However, one of the major challenges is the difficulty in providing high quality images that are comfortable to view and that also meet signal transmission requirements over a limited bandwidth for display on television screens. The different processing steps that are necessary in a 3D-TV delivery chain can all introduce artifacts that may create problems in terms of human visual perception. In this paper, we highlight the importance of considering 3D visual attention when addressing 3D human factors issues. We provide a review of the field of 3D visual attention, discuss the challenges in both the understanding and modeling of 3D visual attention, and provide guidance to researchers in this field. Finally, we identify perceptual issues generated during the various steps in a typical 3D-TV broadcasting delivery chain, review them and explain how consideration of 3D visual attention modeling can help improve the overall 3D viewing experience.
international conference on image processing | 2008
A. Benoit; P. Le Callet; Patrizio Campisi; R. Cousseau
3DTV has been widely studied in recent years from a technical point of view, but evaluations of the related perceived quality have not kept pace. This paper first reviews quality assessment issues for 3DTV. Compared to 2D quality measurement, depth information adds several new problems to quality assessment. Nevertheless, efforts made for 2D content quality estimation can be used as a basis for an extension to 3D. In a second part, this paper proposes an adaptation of 2D metrics to 3D in the context of coding artifacts and stereoscopic images. Disparity distortion is introduced to improve conventional 2D metrics. Performance has been evaluated using subjective tests.
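A minimal sketch of the general idea, not the paper's exact metric: a 2D measure (PSNR here, as an illustrative stand-in) applied to the left and right views, combined with a disparity-distortion penalty. The linear combination and its weight are assumptions.

```python
import numpy as np

def psnr(ref, img, peak=1.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((ref - img) ** 2)
    return 10 * np.log10(peak ** 2 / (mse + 1e-12))

def stereo_quality(ref_l, img_l, ref_r, img_r, ref_disp, img_disp, alpha=0.5):
    """Average 2D quality of the two views, penalized by mean absolute disparity error."""
    quality_2d = 0.5 * (psnr(ref_l, img_l) + psnr(ref_r, img_r))
    disparity_error = np.mean(np.abs(ref_disp - img_disp))
    return quality_2d - alpha * disparity_error

rng = np.random.default_rng(8)
L, R = rng.random((64, 64)), rng.random((64, 64))
D = rng.normal(0, 2, (64, 64))                         # stand-in disparity map (pixels)
print(stereo_quality(L, np.clip(L + 0.02 * rng.standard_normal(L.shape), 0, 1),
                     R, np.clip(R + 0.02 * rng.standard_normal(R.shape), 0, 1),
                     D, D + 0.5 * rng.standard_normal(D.shape)))
```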
Signal Processing-image Communication | 2010
O. Le Meur; A. Ninassi; P. Le Callet; Dominique Barba
The aim of this study is to understand how people watch a video sequence during free-viewing and quality assessment tasks. To this end, two eye tracking experiments were carried out. The video dataset is composed of 10 original video sequences and 50 impaired video sequences (five levels of impairment obtained by H.264 video compression). The first experiment consisted in recording eye movements in a free-viewing task using the 10 original video sequences. The second experiment was an eye tracking experiment in the context of subjective quality assessment. Eye movements were recorded while observers judged the quality of the 50 impaired video sequences. The comparison between gaze allocations indicates that the quality task has a moderate impact on the deployment of visual attention. This impact increases with the number of presentations of impaired video sequences. The locations of regions of interest remain highly similar after several presentations of the same video sequence, suggesting that eye movements are still driven by low-level visual features after several viewings. In addition, the level of distortion does not significantly alter oculomotor behavior. Finally, we modified the pooling of an objective full-reference video quality metric by adjusting the weight applied to the distortions. This adjustment depends on the visual importance deduced from the eye tracking experiment carried out on the impaired video sequences. We observe that a saliency-based distortion pooling does not significantly improve the performance of the video quality metric.
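A short sketch of saliency-weighted distortion pooling as described above; the normalization of the weights is an assumption, not the exact pooling used in the study.

```python
import numpy as np

def saliency_weighted_pooling(distortion_map, saliency_map, eps=1e-12):
    """Pool a distortion map with weights proportional to visual importance."""
    w = saliency_map / (saliency_map.sum() + eps)
    return float(np.sum(w * distortion_map))

rng = np.random.default_rng(4)
distortion = rng.random((48, 64))
saliency = rng.random((48, 64))   # stand-in visual importance map from eye tracking
print(saliency_weighted_pooling(distortion, saliency))
```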
IEEE Journal of Selected Topics in Signal Processing | 2012
Pierre R. Lebreton; Alexander Raake; Marcus Barkowsky; P. Le Callet
3D video quality of experience (QoE) is a multidimensional problem; many factors contribute to the global rating, such as image quality, depth perception, and visual discomfort. Due to this multidimensionality, it is proposed in this paper that, as a complement to assessing the quality degradation due to coding or transmission, the appropriateness of the non-distorted signal should be addressed. One important factor here is the depth information provided by the source sequences. From an application perspective, the depth characteristics of source content are relevant for pre-validating whether the content is suitable for 3D video services. In addition, assessing the interplay between binocular and monocular depth features and depth perception is a relevant topic for 3D video perception research. To achieve the evaluation of the suitability of 3D content, this paper describes both a subjective experiment and a new objective indicator to evaluate depth as one of the added values of 3D video.
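A purely illustrative sketch, not the paper's actual indicator: simple statistics of per-frame disparity maps used as a coarse measure of how much depth a source sequence offers before it is admitted into a 3D service.

```python
import numpy as np

def depth_indicator(disparity_maps):
    """Summarize disparity spread over a sequence of per-frame disparity maps."""
    all_disp = np.concatenate([d.ravel() for d in disparity_maps])
    return {
        "mean_abs_disparity": float(np.mean(np.abs(all_disp))),
        "disparity_spread": float(np.percentile(all_disp, 95) - np.percentile(all_disp, 5)),
    }

rng = np.random.default_rng(5)
sequence = [rng.normal(0, 3, size=(72, 128)) for _ in range(30)]  # stand-in disparities (pixels)
print(depth_indicator(sequence))
```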
international conference on image processing | 2005
O. Le Meur; Dominique Thoreau; P. Le Callet; Dominique Barba
A new spatio-temporal model for simulating bottom-up visual attention is proposed. It has been built from numerous important properties of the human visual system (HVS). This paper focuses both on the architecture of the model and on its performance. Given that the spatial model of bottom-up visual attention has already been defined [O. Le Meur et al., 2004], the temporal dimension is described in more detail. A qualitative and quantitative comparison with human fixations collected from an eye tracking apparatus is undertaken. From the former, the quality of the prediction is deemed very good, whereas the latter illustrates that the best predictor of human fixations consists of the sum of all visual features (achromatic, chromatic and motion).
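A minimal sketch of the fusion finding reported above: the saliency prediction is taken as the sum of achromatic, chromatic and motion conspicuity maps. The per-map rescaling to [0, 1] before summation is an assumption for illustration.

```python
import numpy as np

def fuse_feature_maps(achromatic, chromatic, motion):
    """Sum feature maps after rescaling each to [0, 1]."""
    def norm(m):
        m = m - m.min()
        return m / (m.max() + 1e-12)
    return norm(achromatic) + norm(chromatic) + norm(motion)

rng = np.random.default_rng(6)
saliency = fuse_feature_maps(rng.random((64, 64)), rng.random((64, 64)), rng.random((64, 64)))
print(saliency.shape, float(saliency.max()))
```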
IEEE Journal of Selected Topics in Signal Processing | 2012
Margaret H. Pinson; Lucjan Janowski; Romuald Pepion; Quan Huynh-Thu; C. Schmidmer; Philip J. Corriveau; Audrey C. Younkin; P. Le Callet; Marcus Barkowsky; William Ingram
Traditionally, audio quality and video quality are evaluated separately in subjective tests. Best practices within the quality assessment community were developed before many modern mobile audiovisual devices and services came into use, such as internet video, smartphones, tablets and connected televisions. These devices and services raise unique questions that require jointly evaluating both the audio and the video within a subjective test. However, audiovisual subjective testing is a relatively under-explored field. In this paper, we address the question of determining the most suitable way to conduct audiovisual subjective testing over a wide range of audiovisual quality. Six laboratories from four countries conducted a systematic study of audiovisual subjective testing. The stimuli and scale were held constant across experiments and labs; only the environment of the subjective test was varied. Some subjective tests were conducted in controlled environments and some in public environments (a cafeteria, patio or hallway). The audiovisual stimuli spanned a wide range of quality. Results show that these audiovisual subjective tests were highly repeatable from one laboratory and environment to the next. The number of subjects was the most important factor. Based on this experiment, 24 or more subjects are recommended for Absolute Category Rating (ACR) tests. In public environments, 35 subjects were required to obtain the same Student's t-test sensitivity. The second most important variable was individual differences between subjects. Other environmental factors had minimal impact, such as language, country, lighting, background noise, wall color, and monitor calibration. Analyses indicate that Mean Opinion Scores (MOS) are relative rather than absolute. Our analyses show that the results of experiments done in pristine laboratory environments are highly representative of results obtained with those devices in actual use, in a typical user environment.
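A hedged illustration of the kind of analysis mentioned above: computing Mean Opinion Scores with 95% confidence intervals from ACR ratings, and a Student's t-test between two test environments. The ratings below are synthetic stand-ins, and the subject counts simply echo the values reported in the abstract.

```python
import numpy as np
from scipy import stats

def mos_with_ci(ratings, confidence=0.95):
    """MOS and confidence-interval half-width from per-subject ACR ratings (1-5)."""
    ratings = np.asarray(ratings, dtype=float)
    mos = ratings.mean()
    sem = ratings.std(ddof=1) / np.sqrt(len(ratings))
    half_width = stats.t.ppf(0.5 + confidence / 2, df=len(ratings) - 1) * sem
    return mos, half_width

rng = np.random.default_rng(7)
lab_ratings = np.clip(np.round(rng.normal(3.6, 0.8, size=24)), 1, 5)     # 24 subjects, lab
public_ratings = np.clip(np.round(rng.normal(3.5, 1.0, size=35)), 1, 5)  # 35 subjects, public
print(mos_with_ci(lab_ratings))
print(stats.ttest_ind(lab_ratings, public_ratings))                      # Student's t-test
```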