Publication


Featured research published by Linhao Dong.


International Conference on Image Processing | 2016

Mouse calibration aided real-time gaze estimation based on boost Gaussian Bayesian learning

Nanyang Ye; Xiaoming Tao; Linhao Dong; Ning Ge

In this paper, we propose a novel gaze estimation method to evaluate the attention span of users on on-screen content via a single webcam. Our method is based on the supervised descent method for eye region-of-interest (ROI) extraction. Boosted Gaussian Bayesian regressors are then applied to learn a robust mapping from the input eye ROI to gaze coordinates. To obtain enough training samples, we embed our scheme as a browser plug-in that collects data from users without disturbing them. To improve accuracy, we also use mouse clicks to help train the regressors. Experimental results show that our method outperforms the existing method and can provide gaze estimation data for user behaviour analysis in a real-time implementation.
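
For a feel of the regression step, the minimal sketch below maps eye-ROI features to screen coordinates using mouse-click positions as supervision. It is not the paper's implementation: a gradient-boosted regressor stands in for the boosted Gaussian Bayesian regressors, and all data shapes, screen dimensions, and parameter values are illustrative assumptions.

```python
# Minimal sketch (not the paper's implementation): map eye-ROI features to
# screen coordinates using mouse-click positions as weak supervision.
# A gradient-boosted regressor stands in for the boosted Gaussian Bayesian
# regressors described in the paper; all shapes and names are illustrative.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.multioutput import MultiOutputRegressor

rng = np.random.default_rng(0)

# Toy training data: each row is a flattened eye-ROI descriptor, each target
# is the (x, y) screen position of a mouse click recorded at fitting time.
n_samples, n_features = 500, 64
eye_roi_features = rng.normal(size=(n_samples, n_features))
click_positions = rng.uniform(0, [1920, 1080], size=(n_samples, 2))

# One boosted regressor per output coordinate.
gaze_model = MultiOutputRegressor(GradientBoostingRegressor(n_estimators=100))
gaze_model.fit(eye_roi_features, click_positions)

# Estimate gaze for a new eye-ROI descriptor.
new_roi = rng.normal(size=(1, n_features))
print("estimated gaze (px):", gaze_model.predict(new_roi)[0])
```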


IEEE Transactions on Geoscience and Remote Sensing | 2017

Bayesian Hyperspectral and Multispectral Image Fusions via Double Matrix Factorization

Baihong Lin; Xiaoming Tao; Mai Xu; Linhao Dong; Jianhua Lu

This paper focuses on fusing hyperspectral and multispectral images with an unknown, arbitrary point spread function (PSF). Instead of obtaining the fused image from an estimate of the PSF, a novel model is proposed that does not involve the PSF: under a Bayesian framework, the fused image is decomposed into double subspace-constrained matrix-factorization-based components and residuals. On the basis of this model, the fusion problem is cast as minimum mean-square-error estimation of three factor matrices. Then, to approximate the posterior distribution of the unknowns efficiently, an estimation approach is developed based on variational Bayesian inference. Unlike most previous works, the PSF is not required in the proposed model and is not assumed to be spatially invariant. Hence, the proposed approach is unaffected by PSF estimation errors and has potential computational benefits when extended to spatially variant imaging systems. Moreover, the model parameters in our approach depend only weakly on the input data sets, and most of them can be learned automatically without manual intervention. Exhaustive experiments on three data sets verify that our approach shows excellent performance and greater robustness to noise, with acceptable computational complexity, compared with other state-of-the-art methods.
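
The sketch below illustrates only the coupled-factorization idea behind such fusion methods, not the paper's PSF-free Bayesian model: it recovers a fused image Z ≈ E A from a spatially degraded hyperspectral cube and a spectrally degraded multispectral image by alternating least squares. The known averaging operator S and spectral response R are simplifying assumptions that the paper deliberately avoids, and all dimensions are made up.

```python
# Generic coupled-factorization sketch, NOT the PSF-free Bayesian model of the
# paper: recover a fused image Z ~= E @ A from a spatially degraded
# hyperspectral cube Yh and a spectrally degraded multispectral image Ym.
import numpy as np

rng = np.random.default_rng(0)
bands, ms_bands, pixels, rank, block = 50, 4, 1024, 3, 8

E_true = rng.random((bands, rank))            # spectral signatures
A_true = rng.random((rank, pixels))           # abundance maps
Z_true = E_true @ A_true                      # target fused image (bands x pixels)

S = np.kron(np.eye(pixels // block), np.ones((block, 1)) / block)  # spatial averaging (assumed known)
R = rng.random((ms_bands, bands))             # spectral response of the MS sensor (assumed known)

Yh = Z_true @ S                               # low spatial res, full spectral res
Ym = R @ Z_true                               # full spatial res, few bands

E = rng.random((bands, rank))
A = rng.random((rank, pixels))
for _ in range(100):
    # Given E, fit the abundances to the high-resolution multispectral image.
    A, *_ = np.linalg.lstsq(R @ E, Ym, rcond=None)
    # Given A, fit the spectral basis to the low-resolution hyperspectral cube.
    E, *_ = np.linalg.lstsq((A @ S).T, Yh.T, rcond=None)
    E = E.T

Z_hat = E @ A
print("relative fusion error:", np.linalg.norm(Z_hat - Z_true) / np.linalg.norm(Z_true))
```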


International Conference on Computer Information and Telecommunication Systems | 2016

Chunk-wise face model based gaze correction in conversational videos with single camera

Jichuan Lu; Xiaoming Tao; Linhao Dong; Ning Ge

Eye contact is one of the critical aspects of video conferencing. In current video-conference systems, an important problem is the lack of eye contact, caused by the direction disparity between the normal of the camera's focal plane and the interlocutor's gaze. The current state-of-the-art method performs gaze correction well with only a single webcam; however, it produces unsatisfying results on sequences in which the face is partially occluded. In this paper, we propose a novel model-based gaze correction scheme that builds the eye models, the nose model, and the mouth model independently, so that the occlusion problem can be handled effectively. Specifically, by rotating the aforementioned models by a pre-detected displacement, the desired degree of gaze correction can be obtained. Facial regions that are occluded are not rotated, which prevents the occluding object from being distorted. Experimental results show that, with different parts of the face occluded, our method achieves better gaze correction than the current method in terms of perceived quality in two scenes that commonly appear in conversational videos.
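
As a toy illustration of per-part correction (not the paper's chunk-wise face model), the snippet below rotates eye, nose, and mouth landmark groups independently by a small pre-set angle while leaving a group flagged as occluded untouched. The landmark coordinates, group membership, and rotation angle are all invented for the example.

```python
# Toy per-part correction: rotate each facial landmark group independently,
# skipping groups marked as occluded. All values are made up.
import numpy as np

def pitch_rotation(deg):
    """Rotation matrix about the x-axis by the given angle in degrees."""
    t = np.radians(deg)
    return np.array([[1, 0, 0],
                     [0, np.cos(t), -np.sin(t)],
                     [0, np.sin(t),  np.cos(t)]])

# Part name -> (N, 3) array of 3D landmark points (toy values).
landmarks = {
    "eyes":  np.array([[-30.0, 10.0, 0.0], [30.0, 10.0, 0.0]]),
    "nose":  np.array([[0.0, 0.0, 15.0]]),
    "mouth": np.array([[-20.0, -30.0, 0.0], [20.0, -30.0, 0.0]]),
}
occluded = {"mouth"}          # e.g. a hand detected in front of the mouth

Rx = pitch_rotation(5.0)      # pre-detected gaze displacement, here 5 degrees
corrected = {
    part: pts if part in occluded else pts @ Rx.T
    for part, pts in landmarks.items()
}
print(corrected["eyes"])
```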


IEEE Transactions on Multimedia | 2016

HEMS: Hierarchical Exemplar-Based Matching-Synthesis for Object-Aware Image Reconstruction

Yipeng Sun; Xiaoming Tao; Yang Li; Linhao Dong; Jianhua Lu

Motivated by the attention drawn by salient objects, conventional region-of-interest (ROI)-based image coding approaches assign more bits to ROIs and fewer bits to other regions. The perceptual quality of salient object regions is thus improved at the cost of unpleasant artifacts in non-ROI regions. To address this issue, we concentrate on the efficient compression of object-centered images by encoding salient objects and background features separately. To fully recover both the object and the background, we propose a hierarchical exemplar-based matching-synthesis (HEMS) approach that reconstructs the image from exemplars. In the proposed framework, once the salient object regions are encoded, only the quantized color features and local descriptors of the background are kept, achieving a bit-rate reduction. To make background reconstruction practical, the hierarchical framework is designed in three layers: relevant image search, patch candidate matching, and distortion-optimized image synthesis. First, an image search over an external database returns relevant images, limiting the search space to a feasible number of patch candidates. Second, patches are matched by color features to select appropriate candidates. Finally, distortion-optimized image synthesis automatically chooses the most suitable texture sample and seamlessly reconstructs the image. Compared with conventional ROI-based image coding schemes, the proposed approach achieves better visual quality in both ROI and background regions.
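
A high-level sketch of the three-layer pipeline is given below, with toy data and plain nearest-neighbour matching standing in for the paper's relevant-image search, patch matching, and distortion-optimized synthesis; the database size, patch size, and colour features are illustrative assumptions.

```python
# Three-stage exemplar pipeline in the spirit of HEMS, on toy data.
import numpy as np

rng = np.random.default_rng(0)
patch = 8

# Toy "external database" of candidate images and the transmitted background
# descriptor (here simply the quantized mean colour of one missing patch).
database = rng.random((20, 64, 64, 3))              # 20 candidate images
transmitted_mean_color = np.array([0.6, 0.4, 0.3])  # feature kept for one patch

# Layer 1: relevant-image search by global colour statistics.
global_means = database.mean(axis=(1, 2))           # (20, 3) per-image mean colour
order = np.argsort(np.linalg.norm(global_means - transmitted_mean_color, axis=1))
relevant = database[order[:5]]

# Layer 2: split the relevant images into 8x8 patch candidates and match them
# to the transmitted colour feature.
blocks = relevant.reshape(5, 64 // patch, patch, 64 // patch, patch, 3)
candidates = blocks.transpose(0, 1, 3, 2, 4, 5).reshape(-1, patch, patch, 3)
candidate_means = candidates.mean(axis=(1, 2))

# Layer 3: pick the candidate with the smallest colour distortion and use it
# to synthesize the missing background patch.
best = candidates[np.argmin(np.linalg.norm(candidate_means - transmitted_mean_color, axis=1))]
print("synthesized patch shape:", best.shape)
```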


International Conference on Image Processing | 2015

The THU multi-view face database for videoconferences

Linhao Dong; Xiaoming Tao; Yang Li; Jichuan Lu; Zizhuo Zhang; Jingwen Cheng; Jianhua Lu

In this paper, we present a face video database that contains 31,500 videos of 100 individual volunteers. The primary purpose of building this database is to provide standardized test video sequences for research related to video conferencing. Each volunteer was filmed by 9 groups of synchronized webcams under 7 illumination conditions and was asked to complete a series of designated actions. Thus, face variations in lip shape, occlusion, illumination, pose, and expression are present in each video clip. Compared with existing databases, the THU face database provides multi-view video sequences with strict temporal synchronization, enabling evaluations of gaze-correction methods. In addition, three well-known methods were tested on our database, demonstrating their numerical performance under different circumstances. Free samples of this database can be downloaded at www.facedbv.com.


China Communications | 2015

Improving low bitrate video coding via computation incorporating a priori information

Xiaoming Tao; Linhao Dong; Shaoyang Li; Yang Li; Ning Ge; Jianhua Lu

The growing number of mobile users and the diversification of service types have led to increasing demands on wireless network bandwidth in recent years. Although evolving transmission techniques can enlarge network capacity to some degree, they still cannot satisfy the requirements of mobile users. Meanwhile, following Moore's Law, the data processing capabilities of mobile terminals are continuously improving. In this paper, we explore possible ways of trading the strong computational power of wireless terminals for transmission efficiency. For the specific scenario of wireless video conversation, we propose a model-based video coding scheme that learns the structures in multimedia content. Benefiting from both strong computing capability and pre-learned model priors, only low-dimensional parameters need to be transmitted, and the intact multimedia content can be reconstructed at the receiver in real time. Experimental results indicate that, compared with conventional video codecs, the proposed scheme significantly reduces the data rate with the aid of the computational capability of wireless terminals.
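
The snippet below sketches the "transmit model parameters, not pixels" idea using PCA as a stand-in for the paper's learned model: both terminals share a model trained offline, the sender transmits a handful of coefficients, and the receiver reconstructs the frame. The frame size, number of components, and random training data are illustrative assumptions.

```python
# Minimal sketch of model-based coding with PCA standing in for the paper's
# pre-learned model priors; all sizes and data are illustrative.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
h, w, n_train, n_components = 64, 64, 300, 32

# Offline: both terminals learn (or download) the same model from prior frames.
training_frames = rng.random((n_train, h * w))
model = PCA(n_components=n_components).fit(training_frames)

# Sender: encode a new frame as a few coefficients instead of h*w pixels.
frame = rng.random((1, h * w))
coefficients = model.transform(frame)            # 32 numbers to transmit

# Receiver: reconstruct the frame from the shared model and the coefficients.
reconstruction = model.inverse_transform(coefficients)
print("transmitted values:", coefficients.size, "instead of", frame.size)
```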


Vehicular Technology Conference | 2017

Online Bayesian Learning for Remote-Sensing Imagery Compression

Zizhuo Zhang; Shaoyang Li; Xiaoming Tao; Linhao Dong; Jianhua Lu

This work investigates a statistical technique for high-performance remote-sensing imagery compression. By exploiting existing remote-sensing data sets, useful structural and textural prior information can be learned. The main methodologies are Bayesian dictionary learning and stochastic approximation. A Bayesian network simulating the generation mechanism of remote-sensing images is modeled, the whole compression scheme is established, and the corresponding inference algorithm using Gibbs sampling is given, with the inference realized in an online fashion. The performance of the proposed compression scheme is evaluated on a high-resolution remote-sensing image data set captured by TH-1 series satellites. Experimental results show that our compression scheme outperforms JPEG 2000 by 3 dB on average at the same bits-per-pixel, and that Bayesian learning can provide a dictionary with high expressiveness for remote-sensing images. In addition, with online learning, the proposed compression scheme can scale to very large training sets.
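
For intuition, the sketch below runs online (mini-batch) dictionary learning on image patches and encodes them sparsely. scikit-learn's MiniBatchDictionaryLearning is a non-Bayesian stand-in for the Gibbs-sampled dictionary described in the paper, and the patch size, dictionary size, and random training data are illustrative.

```python
# Online dictionary learning for patch-based compression (non-Bayesian stand-in).
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)
n_patches, patch_dim, n_atoms = 2000, 64, 128     # 8x8 patches, 128 atoms

# "Existing remote-sensing data sets": here just random patches.
training_patches = rng.random((n_patches, patch_dim))

# Online learning: the dictionary is updated mini-batch by mini-batch, so it
# can scale to training sets that do not fit in memory.
dico = MiniBatchDictionaryLearning(n_components=n_atoms, batch_size=64,
                                   transform_algorithm="omp",
                                   transform_n_nonzero_coefs=8)
dico.fit(training_patches)

# Compression amounts to storing a few sparse coefficients per patch.
codes = dico.transform(training_patches[:5])
print("non-zeros per patch:", np.count_nonzero(codes, axis=1))
```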


Neurocomputing | 2016

The THU multi-view face database for videoconferences and baseline evaluations

Xiaoming Tao; Linhao Dong; Yang Li; Jianhua Lu

In this paper, we present a face video database and its acquisition. The database contains 31,500 video clips of 100 individuals from 20 countries. The primary purpose of building this database is to provide standardized test video sequences for research related to videoconferencing, such as gaze correction and model-based face reconstruction. Specifically, each subject was filmed by 9 groups of synchronized webcams under 7 illumination conditions and was asked to complete a series of designated actions. Thus, face variations including lip shape, occlusion, illumination, pose, and expression are well represented in each video clip. Compared with existing databases, the proposed THU face database provides multi-view video sequences with strict temporal synchronization, which enables evaluations of current and future gaze-correction methods for conversational video communications. In addition, we discuss an evaluation protocol for gaze correction based on our database, under which three well-known methods were tested. Experimental results show that, under this protocol, performance can be compared numerically in terms of peak signal-to-noise ratio (PSNR), demonstrating the strengths and weaknesses of these methods under different circumstances.
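
The PSNR measure used in the evaluation protocol is standard; a minimal implementation for 8-bit frames is sketched below, with the toy frames being the only invented part.

```python
# PSNR between two same-sized 8-bit images; the formula is standard.
import numpy as np

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(480, 640), dtype=np.uint8)
corrected = np.clip(original + rng.normal(0, 5, original.shape), 0, 255).astype(np.uint8)
print(f"PSNR: {psnr(original, corrected):.2f} dB")
```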


IEEE Global Conference on Signal and Information Processing | 2015

A nonparametric Bayesian approach to joint multiple dictionary learning with separate image sources

Shaoyang Li; Xiaoming Tao; Linhao Dong; Jianhua Lu

Nonparametric Bayesian approaches have been considered for learning appropriate dictionaries for sparse image representations. However, for images from multiple separate sources, existing methods have two issues that potentially limit their practical use: first, learning one unified dictionary is not optimal for representing image samples lying in different subspaces; second, the required number of dictionaries and their correlations are unknown in advance. In this paper, we address these issues by: 1) modeling multiple dictionaries using a Dirichlet process, which can automatically infer the latent number of dictionaries needed to fit the image data; and 2) placing a hierarchical Beta process prior to capture the dictionary correlations. To make inference in the Bayesian model tractable, we further derive a combination of a collapsed Gibbs sampler and an auxiliary-variable-based slice sampler. Experimental results demonstrate that our proposed approach can achieve an optimized set of dictionaries for multiple source images, while exhibiting performance improvements in the context of image compressive sensing reconstruction.
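
The snippet below illustrates only the Dirichlet-process ingredient, not the paper's hierarchical Beta-process dictionary model: with a DP prior, the effective number of mixture components is inferred from the data rather than fixed in advance. scikit-learn's BayesianGaussianMixture is used as an analogous stand-in, and the clustered 2-D data are invented.

```python
# Dirichlet-process prior inferring the number of active components from data
# (analogous stand-in for the paper's DP over dictionaries).
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
# Three "sources" of 2-D samples; the model is allowed up to 10 components.
data = np.vstack([rng.normal(loc, 0.3, size=(200, 2))
                  for loc in ([0, 0], [4, 0], [0, 4])])

dpgmm = BayesianGaussianMixture(
    n_components=10, weight_concentration_prior_type="dirichlet_process",
    random_state=0).fit(data)

# Components with non-negligible weight are the ones the DP actually "uses".
print("active components:", np.sum(dpgmm.weights_ > 0.01))
```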


International Conference on Acoustics, Speech, and Signal Processing | 2016

Variational Bayesian image fusion based on combined sparse representations

Baihong Lin; Xiaoming Tao; Shaoyang Li; Linhao Dong; Jianhua Lu

Collaboration


Dive into Linhao Dong's collaborations.

Top Co-Authors


Xiaoming Tao

Hong Kong Polytechnic University
