Thomas Maugey
French Institute for Research in Computer Science and Automation
Publication
Featured research published by Thomas Maugey.
IEEE Transactions on Image Processing | 2015
Thomas Maugey; Antonio Ortega; Pascal Frossard
In this paper, we propose a new geometry representation method for multiview image sets. Our approach relies on graphs to describe the multiview geometry information in a compact and controllable way. The links of the graph connect pixels in different images and describe the proximity between pixels in 3D space. These connections depend on the geometry of the scene and provide the amount of information that is necessary for coding and reconstructing multiple views. Our multiview image representation is very compact and adapts the transmitted geometry information as a function of the complexity of the prediction performed at the decoder side. To achieve this, our graph-based representation (GBR) carefully selects the amount of geometry information needed before coding. This is in contrast with depth coding, which applies lossy compression directly to the original geometry signal, thus making it difficult to quantify the impact of coding errors on geometry-based interpolation. We present the principles of this GBR and we build an efficient coding algorithm to represent it. We compare our GBR approach to classical depth compression methods in terms of their respective view synthesis quality as a function of the compactness of the geometry description. We show that GBR can achieve significant gains in geometry coding rate over depth-based schemes operating at similar quality. Experimental results demonstrate the potential of this new representation.
Journal of Visual Communication and Image Representation | 2015
Ana De Abreu; Laura Toni; Nikolaos Thomos; Thomas Maugey; Fernando Pereira; Pascal Frossard
We consider an interactive multiview video streaming (IMVS) system where clients select their preferred viewpoint in a given navigation window. To provide high-quality IMVS, many high-quality views should be transmitted to the clients. However, this is not always possible due to the limited and heterogeneous capabilities of the clients. In this paper, we propose a novel adaptive IMVS solution based on a layered multiview representation where camera views are organized into layered subsets to match the different clients' constraints. We formulate an optimization problem for the joint selection of the view subsets and their encoding rates. Then, we propose an optimal algorithm and a greedy algorithm of reduced computational complexity, both based on dynamic programming. Simulation results show the good performance of our novel algorithms compared to a baseline algorithm, showing that an effective adaptive IMVS solution should consider the scene content, the client capabilities, and the clients' navigation preferences.
IEEE Transactions on Multimedia | 2015
Laura Toni; Thomas Maugey; Pascal Frossard
We study coding and transmission strategies in multicamera systems, where correlated sources send data through a bottleneck channel to a central server, which eventually transmits views to different interactive users. We propose a dynamic, navigation-path-aware packet scheduling optimization under delay, bandwidth, and interactivity constraints, aimed at optimizing the quality of experience of interactive users. In particular, the scene distortion is minimized jointly with the distortion variations along the most likely navigation paths. The optimization relies both on a novel rate-distortion model, which captures the importance of each view in the scene reconstruction, and on an objective function that optimizes resources based on a client navigation model. The latter takes into account the distortion experienced by interactive clients as well as the distortion variations that might be observed by clients during multiview navigation. We solve the scheduling problem with a novel trellis-based solution, which makes it possible to formally decompose the multivariate optimization problem, thereby significantly reducing the computational complexity. Simulation results show the PSNR quality gain offered by the proposed algorithm compared to baseline scheduling policies. Finally, we show that the best scheduling policy consistently adapts to the most likely user navigation path and that it minimizes distortion variations that can be very disturbing for users in traditional navigation systems.
European Signal Processing Conference | 2016
Mira Rizkallah; Thomas Maugey; Charles Yaacoub; Christine Guillemot
Light Fields, which capture all light rays at every point in space and in all directions, contain very rich information about the scene. This rich description of the scene enables advanced image creation capabilities, such as refocusing or extended depth of field from a single capture. However, it yields a very high volume of data, which needs compression. This paper studies the impact of Light Field compression on two key functionalities: refocusing and extended focus. The sub-aperture images forming the Light Field are compressed as a video sequence with HEVC. A focus stack and the scene depth map are computed from the compressed Light Field and are used to render an image with an extended depth of field (called the extended focus image). It is first observed that the Light Field can be compressed by a factor of up to 700 without significantly affecting the visual quality of either the refocused or the extended focus images. To further analyze the compression effect, a dedicated quality evaluation method based on contrast and gradient measurements is considered to differentiate the natural geometrical blur from the blur resulting from compression. In a second set of experiments, it is shown that the texture distortion of the in-focus regions in the focus stacks is the main cause of the quality degradation in the extended focus image, and that the depth errors do not impact the extended focus quality unless the Light Field is significantly distorted, with a compression ratio of around 2000:1.
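As a rough illustration of how an extended focus image can be rendered from a focus stack, the sketch below keeps, for every pixel, the value from the slice with the strongest local gradient. This is a generic sharpness-based fusion written under simplifying assumptions (grayscale slices stored as nested lists, no depth map, no smoothing), not the rendering pipeline used in the paper. The names `sharpness` and `extended_focus` are hypothetical.

```python
def sharpness(img, y, x):
    """Squared gradient magnitude at (y, x), with neighbours clamped at the borders."""
    h, w = len(img), len(img[0])
    gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
    gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
    return gx * gx + gy * gy


def extended_focus(stack):
    """Fuse a focus stack (list of equally sized grayscale images).

    Returns (fused_image, index_map), where index_map[y][x] is the slice
    judged sharpest at that pixel. Ties go to the earliest slice.
    """
    h, w = len(stack[0]), len(stack[0][0])
    fused = [[0] * w for _ in range(h)]
    index_map = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            best = max(range(len(stack)), key=lambda s: sharpness(stack[s], y, x))
            index_map[y][x] = best
            fused[y][x] = stack[best][y][x]
    return fused, index_map


# Toy stack: slice 0 has detail on the left, slice 1 has detail on the right.
near = [[0, 10, 5, 5]]
far = [[5, 5, 0, 10]]
fused, index_map = extended_focus([near, far])
```

In this toy case the leftmost pixel is taken from the first slice and the rightmost pixel from the second. Compression artifacts that flatten or distort gradients in the in-focus regions would directly disturb this kind of selection, which is consistent with the texture-distortion observation above.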
IEEE Transactions on Image Processing | 2016
Thomas Maugey; Giovanni Petrazzuoli; Pascal Frossard; Marco Cagnazzo; Béatrice Pesquet-Popescu
Augmented reality, interactive navigation in 3D scenes, multiview video, and other emerging multimedia applications require large sets of images, hence larger data volumes and increased resources compared with traditional video services. The significant increase in the number of images in multiview systems leads to new challenging problems in data representation and data transmission to provide high quality of experience in resource-constrained environments. In order to reduce the size of the data, different multiview video compression strategies have been proposed recently. Most of them use the concept of reference or key views that are used to estimate other images when there is high correlation in the data set. In such coding schemes, the two following questions become fundamental: 1) how many reference views have to be chosen to keep a good reconstruction quality under coding cost constraints? And 2) where should these key views be placed in the multiview data set? As these questions are largely overlooked in the literature, we study the reference view selection problem and propose an algorithm for the optimal selection of reference views in multiview coding systems. Based on a novel metric that measures the similarity between the views, we formulate an optimization problem for the positioning of the reference views, such that both the distortion of the view reconstruction and the coding rate cost are minimized. We solve this new problem with a shortest path algorithm that determines both the optimal number of reference views and their positions in the image set. We experimentally validate our solution in a practical multiview distributed coding system and in the standardized 3D-HEVC multiview coding scheme. We show that considering the 3D scene geometry in the reference view positioning problem brings significant rate-distortion improvements and outperforms the traditional coding strategy that simply selects key frames based on the distance between cameras.
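To make the shortest-path idea concrete, here is a minimal sketch under strong assumptions that are not taken from the paper: views lie on a 1D camera line, the first and last views are always references, every reference costs a fixed rate `ref_cost`, and each non-reference view costs its dissimilarity to the closer of its two enclosing references. The function name `select_reference_views` and the dissimilarity matrix are hypothetical stand-ins for the paper's similarity metric and rate-distortion model.

```python
def select_reference_views(dissim, ref_cost):
    """Pick reference views by a shortest path over view indices.

    dissim[i][k]: assumed dissimilarity between views i and k.
    ref_cost:     assumed fixed cost of coding one reference view.
    Returns (sorted reference indices, total cost).
    """
    n = len(dissim)

    def segment_cost(i, j):
        # Cost of making j the next reference after i: one new reference
        # plus the prediction cost of the views strictly between them.
        between = sum(min(dissim[i][k], dissim[j][k]) for k in range(i + 1, j))
        return ref_cost + between

    best = [float("inf")] * n   # best[j]: cheapest cost with j as a reference
    prev = [None] * n           # back-pointers for path recovery
    best[0] = ref_cost          # view 0 is assumed to be a reference
    for j in range(1, n):
        for i in range(j):
            cost = best[i] + segment_cost(i, j)
            if cost < best[j]:
                best[j], prev[j] = cost, i

    refs, j = [], n - 1
    while j is not None:
        refs.append(j)
        j = prev[j]
    return refs[::-1], best[n - 1]


# Five equally spaced views; dissimilarity grows with camera distance.
dissim = [[abs(i - k) for k in range(5)] for i in range(5)]
expensive_refs = select_reference_views(dissim, 10)   # ([0, 4], 24)
cheap_refs = select_reference_views(dissim, 0.5)      # ([0, 1, 2, 3, 4], 2.5)
```

Because the graph is acyclic (edges only go from a reference to a later one), a single forward pass finds the shortest path, and both the number of references and their positions fall out of the same optimization. Replacing the distance-based `dissim` with a geometry-aware measure is where a content-dependent metric such as the one described above would enter.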
International Conference on Image Processing | 2015
Aline Roumy; Thomas Maugey
We consider the problem of video compression with free viewpoint interactivity. It is widely believed that allowing users to choose their view incurs some loss in terms of compression efficiency. Here we derive the complete rate-storage region for universal lossless coding under the constraint of choosing the view at the receiver. This leads to a counterintuitive result: freely choosing the view at the receiver incurs a loss in terms of storage only and not in the transmission rate. The gain of the optimal scheme with respect to interactive schemes proposed so far is derived and a practical scheme that achieves this gain is proposed.
Multimedia Signal Processing | 2017
Thomas Maugey; Olivier Le Meur; Zhi Liu
Omnidirectional images describe the color information at a given position from all directions. Affordable 360° cameras have recently been developed, leading to an explosion of the 360° data shared on social networks. However, an omnidirectional image does not contain interesting content everywhere. Some parts of the image are indeed more likely to be looked at by users than others. Knowing these regions of interest might be useful for 360° image compression, streaming, retargeting or even editing. In this paper, we aim at modelling the user navigation within a 360° image, and detecting which parts of an omnidirectional content might draw users' attention. In particular, the paper proposes to aggregate and analyze 2D saliency detectors in different map projections, and also proposes a smooth navigation through the image to maximize saliency.
International Conference on Image Processing | 2016
Xin Su; Thomas Maugey; Christine Guillemot
Graph-Based Representation (GBR) has recently been proposed for rectified multiview datasets. The core idea of GBR is to use graphs for describing the color and geometry information of a multiview dataset. The color information is represented by the vertices of the graph while the scene geometry is represented by the edges of the graph. In this paper, we generalize the GBR to multiview images with complex camera configurations. Compared with previous work, the GBR representation introduced in this paper can handle not only horizontal displacements of the cameras but also forward/backward displacements, rotations, etc. In order to have a sparse (i.e., easy to code) graph structure, we further propose to use a distortion metric to select the most meaningful connections. For the graph transmission, each selected connection is then replaced by a disparity-based quantity. The experiments show that the proposed GBR achieves high reconstruction quality at a lower or comparable coding rate compared with traditional depth-based representations, which directly compress the depth signal without considering the rendering task.
International Conference on Image Processing | 2015
Thomas Maugey; Pascal Frossard; Christine Guillemot
In this paper, we propose a new guided inpainting algorithm based on the exemplar-based approach in order to effectively fill in holes in image synthesis applications. Guided inpainting techniques can be very useful in settings where one has access to the ground truth information, as in most multiview coding applications. We propose a new auxiliary information based on patch clustering, which is used to refine the candidate exemplar set in the inpainting. For that purpose, a new recursive clustering method based on locally linear embedding (LLE) is introduced. We then design the guided inpainting solution based on LLE with clustered patches, which constrains the reconstruction to operate in one patch cluster only. The index of the appropriate cluster is considered as auxiliary information. Experimental results show that our clustering algorithm provides clusters that are well suited to the inpainting problem. They also show that the auxiliary information significantly improves the quality of the inpainted image for a small coding cost. This work is the first study to show that effective inpainting can be performed when the auxiliary information is properly adapted to the characteristics of both the hole and the known texture.
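For readers unfamiliar with LLE, the sketch below shows the standard weight computation it relies on: a patch is approximated as a weighted sum of neighbouring patches, with weights that sum to one and minimize the reconstruction error. It is the textbook LLE step only, not the recursive clustering or the guided inpainting scheme of the paper. The helper names `solve` and `lle_weights` and the regularization value `reg` are assumptions made for this example.

```python
def solve(a, b):
    """Solve a . x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def lle_weights(patch, neighbors, reg=1e-3):
    """Sum-to-one weights that best reconstruct `patch` from `neighbors`.

    patch:     flat list of pixel values.
    neighbors: list of flat patches of the same length.
    reg:       assumed Tikhonov term keeping the Gram matrix invertible.
    """
    k = len(neighbors)
    diffs = [[n_v - p_v for n_v, p_v in zip(nb, patch)] for nb in neighbors]
    gram = [[sum(x * y for x, y in zip(diffs[i], diffs[j])) for j in range(k)]
            for i in range(k)]
    trace = sum(gram[i][i] for i in range(k))
    for i in range(k):
        gram[i][i] += reg * (trace or 1.0)
    w = solve(gram, [1.0] * k)       # solve G w = 1 ...
    total = sum(w)
    return [v / total for v in w]    # ... then normalize so the weights sum to 1


# A patch lying midway between its two neighbours gets equal weights.
weights = lle_weights([1.0, 1.0], [[0.0, 0.0], [2.0, 2.0]])
```

Restricting `neighbors` to the patches of a single cluster is one way to picture how a transmitted cluster index could constrain the candidate set, though the paper's actual procedure may differ.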
IEEE Transactions on Multimedia | 2018
Rui Ma; Thomas Maugey; Pascal Frossard
In contrast to traditional media streaming services, where the same media content is delivered to different users, interactive multiview navigation applications enable users to choose their own viewpoints and freely navigate in a three-dimensional scene. The interactivity brings new challenges in addition to the classical rate-distortion tradeoff, which considers only the compression performance and viewing quality. On the one hand, interactivity necessitates sufficient viewpoints for richer navigation; on the other hand, it requires low bandwidth and delay costs for smooth navigation during view transitions. In this paper, we formally describe the novel tradeoffs posed by the navigation interactivity and the classical rate-distortion criterion. Based on an original formulation, we look for the optimal design of the data representation by introducing novel rate and distortion models and practical solving algorithms. Experiments show that the proposed data representation method outperforms the baseline solution by providing lower resource consumption and higher visual quality in all navigation configurations, which confirms the potential of the proposed data representation in practical interactive navigation systems.