Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Baojun Zhao is active.

Publication


Featured research published by Baojun Zhao.


IEEE Transactions on Geoscience and Remote Sensing | 2015

Compressed-Domain Ship Detection on Spaceborne Optical Image Using Deep Neural Network and Extreme Learning Machine

Jiexiong Tang; Chenwei Deng; Guang-Bin Huang; Baojun Zhao

Ship detection in spaceborne imagery has attracted great interest for applications in maritime security and traffic control. Optical images stand out from other remote sensing modalities for object detection because of their higher resolution and more interpretable content. However, most popular techniques for ship detection in optical spaceborne images have two shortcomings: 1) compared with infrared and synthetic aperture radar images, their results are more affected by weather conditions, such as clouds and ocean waves, and 2) the higher resolution yields a larger data volume, which makes processing more difficult. Most previous works focus on the first problem, improving segmentation or classification with complicated algorithms, and these methods struggle to balance performance and complexity efficiently. In this paper, we propose a ship detection approach that addresses both issues using wavelet coefficients extracted from the JPEG2000 compressed domain combined with a deep neural network (DNN) and an extreme learning machine (ELM). The compressed domain is adopted for fast ship-candidate extraction, the DNN is exploited for high-level feature representation and classification, and the ELM is used for efficient feature pooling and decision making. Extensive experiments demonstrate that, compared with relevant state-of-the-art approaches, the proposed method requires less detection time and achieves higher detection accuracy.
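The ELM used here for pooling and decision making trains in closed form: hidden weights are random and fixed, and only the output weights are solved by least squares. A minimal sketch on hypothetical toy data (all names and sizes are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_train(X, T, n_hidden=64):
    """Train a basic single-hidden-layer ELM.

    Input weights W and biases b are random and never updated;
    only the output weights beta are solved, via a pseudoinverse.
    """
    n_features = X.shape[1]
    W = rng.normal(size=(n_features, n_hidden))   # random input weights
    b = rng.normal(size=n_hidden)                 # random biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))        # sigmoid hidden layer
    beta = np.linalg.pinv(H) @ T                  # closed-form output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta

# Toy two-class problem: label is the sign of the feature sum.
X = rng.normal(size=(200, 5))
T = (X.sum(axis=1) > 0).astype(float).reshape(-1, 1)
W, b, beta = elm_train(X, T)
pred = (elm_predict(X, W, b, beta) > 0.5).astype(float)
accuracy = (pred == T).mean()
```

The absence of iterative backpropagation is what gives the ELM its speed advantage for the pooling stage.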


IEEE Transactions on Systems, Man, and Cybernetics | 2017

NMF-Based Image Quality Assessment Using Extreme Learning Machine

Shuigen Wang; Chenwei Deng; Weisi Lin; Guang-Bin Huang; Baojun Zhao

Numerous state-of-the-art perceptual image quality assessment (IQA) algorithms share a common two-stage process: distortion description followed by distortion-effects pooling. In the first stage, the distortion descriptors or measurements should effectively represent human visual responses to degradations, while the second stage should capture the relationship between the quality descriptors and perceived visual quality. However, most existing quality descriptors (e.g., luminance, contrast, and gradient) are not well aligned with human perception, and effects pooling is often done in ad hoc ways. In this paper, we propose a novel full-reference IQA metric. It applies non-negative matrix factorization (NMF) to measure image degradations, making use of the parts-based representation that NMF provides. In addition, a machine learning technique, the extreme learning machine (ELM), is employed to address the limitations of existing pooling techniques. Compared with neural networks and support vector regression, the ELM can achieve higher learning accuracy with faster learning speed. Extensive experimental results demonstrate that the proposed metric offers better performance and lower computational complexity than the relevant state-of-the-art approaches.
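The parts-based representation comes from factoring a non-negative matrix V into non-negative factors W and H. A minimal sketch of NMF using the classic Lee–Seung multiplicative updates (a generic formulation, not the paper's specific variant; the matrix here is synthetic):

```python
import numpy as np

rng = np.random.default_rng(1)

def nmf(V, rank, n_iter=200, eps=1e-9):
    """Factor V ~= W @ H with non-negative W, H via
    multiplicative updates minimizing the Frobenius error."""
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update coefficients
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update basis "parts"
    return W, H

# A small non-negative matrix of exact rank 2, standing in for image data.
V = rng.random((20, 2)) @ rng.random((2, 30))
W, H = nmf(V, rank=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

The non-negativity constraint forces W to hold additive "parts" of the input, which is the property the metric exploits to describe degradations.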


Neurocomputing | 2016

Gradient-based no-reference image blur assessment using extreme learning machine

Shuigen Wang; Chenwei Deng; Baojun Zhao; Guang-Bin Huang; Baoxian Wang

The increasing number of demanding consumer digital multimedia applications has boosted interest in no-reference (NR) image quality assessment (IQA). In this paper, we propose a perceptual NR blur evaluation method using a machine learning technique, the extreme learning machine (ELM). The proposed metric, the Blind Image Blur quality Evaluator (BIBE), exploits scene statistics of gradient magnitudes to model the properties of blurred images; the underlying blur features are then derived by fitting the gradient-magnitude distribution. The resulting features are finally mapped to a quality score using the ELM. Because subjective human evaluation scores are integrated into training, machine learning techniques can predict image quality more accurately than traditional methods. Compared with other learning techniques such as the support vector machine (SVM), the ELM offers better learning performance and faster learning speed. Experimental results on public databases show that the proposed BIBE correlates well with human-perceived blurriness and outperforms both state-of-the-art dedicated NR blur evaluation metrics and generic NR IQA methods. Moreover, an application to automatic focusing in digital cameras further confirms the capability of BIBE.
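The statistical intuition behind gradient-based blur features is that blurring narrows the gradient-magnitude distribution. A minimal sketch on a synthetic image (the box blur and the spread statistic are illustrative stand-ins, not BIBE's actual fitted model):

```python
import numpy as np

rng = np.random.default_rng(2)

def gradient_magnitudes(img):
    """Finite-difference gradient magnitudes of a 2-D image."""
    gx = np.diff(img, axis=1)[:-1, :]   # horizontal differences
    gy = np.diff(img, axis=0)[:, :-1]   # vertical differences
    return np.sqrt(gx**2 + gy**2)

def box_blur(img, k=5):
    """Simple separable box blur, standing in for defocus blur."""
    kernel = np.ones(k) / k
    out = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="same"), 1, img)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="same"), 0, out)

img = rng.random((64, 64))                       # synthetic sharp image
sharp_spread = gradient_magnitudes(img).std()    # wide distribution
blur_spread = gradient_magnitudes(box_blur(img)).std()  # narrowed by blur
```

A regressor such as the ELM then learns the mapping from such distribution statistics to subjective quality scores.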


IEEE Geoscience and Remote Sensing Letters | 2016

Fast and Accurate Spatiotemporal Fusion Based Upon Extreme Learning Machine

Xun Liu; Chenwei Deng; Shuigen Wang; Guang-Bin Huang; Baojun Zhao; Paula Lauren

Spatiotemporal fusion is important for providing high-spatial-resolution earth observations with a dense time series, and learning-based fusion methods have recently attracted broad interest. These algorithms project image patches onto a feature space and enforce a simple mapping to predict fine-resolution patches from the corresponding coarse ones. However, the sophisticated projection, e.g., sparse representation, is computationally complex and difficult to apply to large patches, while small patches cannot capture enough local structural information in the coarse images. To address these issues, a novel spatiotemporal fusion method is proposed in this letter, using a powerful learning technique, the extreme learning machine (ELM). Unlike traditional approaches, we learn a mapping function on difference images directly, rather than a sophisticated feature representation followed by a simple mapping. Characterized by good generalization and fast speed, the ELM achieves accurate and fast prediction of fine patches. The proposed algorithm is evaluated on five actual datasets of Landsat Enhanced Thematic Mapper Plus (ETM+) and Moderate Resolution Imaging Spectroradiometer (MODIS) acquisitions, and experimental results show that our method obtains better fusion results while running much faster.
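The core idea, learning a direct mapping from coarse-difference patches to fine-difference patches with an ELM regressor, can be sketched as follows. The data here are synthetic stand-ins (a hypothetical nonlinear relation between coarse and fine differences), not Landsat/MODIS patches:

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic "difference images": fine differences are a fixed nonlinear
# function of coarse differences, which the ELM regressor must recover.
coarse = rng.normal(size=(300, 16))                    # coarse-patch vectors
Wt = 0.25 * rng.normal(size=(16, 16))                  # hidden ground truth
fine = np.tanh(coarse @ Wt)                            # fine-patch vectors

# ELM regression: random hidden layer, closed-form output weights.
W = 0.25 * rng.normal(size=(16, 200))                  # random input weights
b = rng.normal(size=200)                               # random biases
H = np.tanh(coarse @ W + b)                            # hidden features
beta = np.linalg.lstsq(H, fine, rcond=None)[0]         # least-squares fit

pred = np.tanh(coarse @ W + b) @ beta
rel_err = np.linalg.norm(pred - fine) / np.linalg.norm(fine)
```

Because the fit is a single least-squares solve rather than iterative dictionary learning, the mapping trains far faster than sparse-representation pipelines.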


international conference on image processing | 2013

A novel SVD-based image quality assessment metric

Shuigen Wang; Chenwei Deng; Weisi Lin; Baojun Zhao; Jie Chen

Image distortion can be categorized into two aspects: content-dependent degradation and content-independent degradation. Existing full-reference image quality assessment (IQA) metrics do not deal well with these two different impacts. Singular value decomposition (SVD), a useful mathematical tool, has been used in various image processing applications. In this paper, SVD is employed to separate the structural (content-dependent) component from the content-independent component. For each component, we design a specific assessment model tailored to its distortion properties. The proposed models are then fused to obtain the final quality score. Experimental results on the TID database demonstrate that the proposed metric achieves better performance than the relevant state-of-the-art quality metrics.
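One straightforward way to realize such an SVD split, shown here as a hedged sketch rather than the paper's exact construction, is to keep the leading singular components as the structural part and treat the remainder as the content-independent part:

```python
import numpy as np

rng = np.random.default_rng(3)

img = rng.random((32, 32))                   # synthetic grayscale image
U, s, Vt = np.linalg.svd(img, full_matrices=False)

k = 4                                        # leading components kept (illustrative)
structural = (U[:, :k] * s[:k]) @ Vt[:k, :]  # content-dependent part
residual = img - structural                  # content-independent remainder

# The two parts sum exactly back to the original image.
recon_err = np.abs(structural + residual - img).max()
```

Each part can then be scored by its own distortion model before the two scores are fused.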


international geoscience and remote sensing symposium | 2016

Spatiotemporal reflectance fusion based on location regularized sparse representation

Xun Liu; Chenwei Deng; Baojun Zhao

Spatiotemporal reflectance fusion plays an important role in providing earth observations with both high spatial and high temporal resolution, and sparse representation is one of the popular strategies for implementing spatiotemporal fusion. However, existing methods generally suffer from instability of the sparse representation for the fine and coarse image pairs. In this paper, we demonstrate that this instability can be addressed by exploiting spatial correlations among neighboring fine images, which we formulate mathematically as a location-regularized term. A fast iterative shrinkage-thresholding algorithm (FISTA) is then employed to find the optimal solution. Experimental results show that the proposed method outperforms other relevant state-of-the-art fusion approaches.
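FISTA solves L1-regularized least-squares problems with an accelerated proximal-gradient iteration. A minimal sketch of the plain (unregularized-by-location) FISTA on a synthetic sparse-recovery problem; the location-regularized objective of the paper would add its extra term to the gradient step:

```python
import numpy as np

rng = np.random.default_rng(4)

def fista(A, b, lam, n_iter=300):
    """FISTA for min_x 0.5*||Ax - b||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    y, t = x.copy(), 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ y - b)
        z = y - grad / L
        # Soft-thresholding: the proximal operator of the L1 norm.
        x_new = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        y = x_new + ((t - 1) / t_new) * (x_new - x)   # momentum step
        x, t = x_new, t_new
    return x

# Recover a 3-sparse code from noiseless random measurements.
A = rng.normal(size=(60, 100))
x_true = np.zeros(100)
x_true[[3, 40, 77]] = [1.5, -2.0, 1.0]
b = A @ x_true
x_hat = fista(A, b, lam=0.01)
err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
```

The momentum sequence t is what gives FISTA its O(1/k^2) convergence over plain ISTA.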


Applied Optics and Photonics China (AOPC2015) | 2015

Parallax handling of image stitching using dominant-plane homography

Zhaofeng Pang; Cheng Li; Baojun Zhao; Linbo Tang

In this paper, we present a novel image stitching method that handles parallax in practical applications. For images with a significant amount of parallax, a more effective approach is to roughly align the overlapping regions globally and then apply a seam-cutting method to composite a naturally stitched image. It is well known that, under non-ideal imaging conditions, an image can be modeled by multiple planes resulting from projective parallax. A dominant-plane homography has the important advantages of warping an image globally while avoiding some local distortions. The proposed method addresses the large-parallax problem in two steps: (1) selecting matching point pairs located on the dominant plane, by clustering matching correspondences and then measuring the cost of each cluster; and (2) to obtain a plausible seam, modifying the standard seam-cutting method to incorporate edge maps of the overlapping area. Extensive experimental comparisons with state-of-the-art methods demonstrate that our approach handles parallax reliably.
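Once the dominant-plane point pairs are selected, the homography itself is typically estimated with the direct linear transform (DLT). A minimal sketch on synthetic correspondences (the ground-truth matrix and point counts are illustrative):

```python
import numpy as np

rng = np.random.default_rng(8)

def dlt_homography(src, dst):
    """Estimate a 3x3 homography from point pairs via the DLT.

    Each correspondence contributes two rows to A; the homography
    is the null vector of A, found as the last right singular vector.
    """
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]                 # fix the arbitrary scale

# Ground-truth homography for a hypothetical dominant plane.
H_true = np.array([[1.1, 0.02, 5.0],
                   [0.01, 0.95, -3.0],
                   [1e-4, 2e-4, 1.0]])
src = rng.uniform(0, 100, size=(8, 2))
src_h = np.hstack([src, np.ones((8, 1))])
dst_h = src_h @ H_true.T
dst = dst_h[:, :2] / dst_h[:, 2:3]     # perspective divide

H_est = dlt_homography(src, dst)
err = np.abs(H_est - H_true).max()
```

Restricting the correspondences to one plane is what keeps this single global warp consistent; off-plane points are left to the seam-cutting stage.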


Applied Optics and Photonics China (AOPC2015) | 2015

Coarse-to-fine wavelet-based airport detection

Cheng Li; Shuigen Wang; Zhaofeng Pang; Baojun Zhao

Airport detection in optical remote sensing images has attracted great interest for applications in military reconnaissance and traffic control. However, most popular techniques for airport detection in optical remote sensing images have three weaknesses: 1) due to the characteristics of optical images, detection results are often affected by imaging conditions, such as weather and imaging distortion; 2) optical images contain comprehensive information about targets, making it difficult to extract robust features (e.g., intensity and textural information) to represent airport areas; and 3) the high resolution results in a large data volume, which limits real-time processing. Most previous works focus on solving only one of these problems and thus cannot balance performance and complexity. In this paper, we propose a novel coarse-to-fine airport detection framework that addresses all three issues using wavelet coefficients. The framework comprises two stages: 1) efficient wavelet-based feature extraction is adopted for multi-scale textural feature representation, and a support vector machine (SVM) is exploited to classify and coarsely decide airport candidate regions; and then 2) refined line-segment detection is used to locate the runway and landing field of the airport. Finally, airport recognition is achieved by applying the fine runway positioning to the candidate regions. Experimental results show that the proposed approach outperforms existing algorithms in terms of detection accuracy and processing efficiency.
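The kind of multi-scale textural feature the coarse stage relies on can be illustrated with one level of the 2-D Haar wavelet transform, using subband energies as a simple texture descriptor (a generic sketch, not the paper's exact wavelet or feature set):

```python
import numpy as np

rng = np.random.default_rng(6)

def haar2d(img):
    """One level of the 2-D Haar wavelet transform.

    Returns the approximation (LL) and detail (LH, HL, HH) subbands,
    each at half the input resolution.
    """
    lo = (img[:, 0::2] + img[:, 1::2]) / 2   # row-wise averages
    hi = (img[:, 0::2] - img[:, 1::2]) / 2   # row-wise differences
    LL = (lo[0::2, :] + lo[1::2, :]) / 2
    LH = (lo[0::2, :] - lo[1::2, :]) / 2
    HL = (hi[0::2, :] + hi[1::2, :]) / 2
    HH = (hi[0::2, :] - hi[1::2, :]) / 2
    return LL, LH, HL, HH

img = rng.random((64, 64))                   # synthetic image tile
LL, LH, HL, HH = haar2d(img)
# Subband energies form a compact multi-scale texture feature vector.
feature = np.array([np.mean(b**2) for b in (LL, LH, HL, HH)])
```

Such per-tile feature vectors would then be fed to an SVM to flag candidate airport regions.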


Sensors | 2018

Deep Spatial-Temporal Joint Feature Representation for Video Object Detection

Baojun Zhao; Boya Zhao; Linbo Tang; Yuqi Han; Wenzheng Wang

With the development of deep neural networks, many object detection frameworks have shown great success in smart surveillance, self-driving cars, and facial recognition. However, the data sources are usually videos, while most object detection frameworks are built on still images and use only spatial information; feature consistency across frames cannot be ensured because the training procedure discards temporal information. To address these problems, we propose a single, fully convolutional neural network-based object detection framework that incorporates temporal information through Siamese networks. In the training procedure, the prediction network first combines multiscale feature maps to handle objects of various sizes. Second, we introduce a correlation loss computed by the Siamese network, which provides features from neighboring frames. This correlation loss represents object co-occurrences across time and aids consistent feature generation. Since the correlation loss requires track-ID and detection-label information, our video object detection network is evaluated on the large-scale ImageNet VID dataset, where it achieves a 69.5% mean average precision (mAP).
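One plausible form of such a correlation loss, sketched here with normalized cross-correlation on feature vectors (an assumption for illustration; the paper's exact loss may differ), rewards high correlation between features of the same object in neighboring frames:

```python
import numpy as np

rng = np.random.default_rng(5)

def correlation_loss(f_t, f_next):
    """1 minus the normalized cross-correlation of two feature vectors.

    Near 0 when the two features are strongly correlated (same object
    across frames), near 1 when they are unrelated.
    """
    a = (f_t - f_t.mean()) / (f_t.std() + 1e-8)
    b = (f_next - f_next.mean()) / (f_next.std() + 1e-8)
    ncc = (a * b).mean()
    return 1.0 - ncc

f = rng.normal(size=256)                              # frame-t object feature
loss_same = correlation_loss(f, f + 0.01 * rng.normal(size=256))
loss_diff = correlation_loss(f, rng.normal(size=256))
```

Minimizing such a loss for features sharing a track ID pushes the network toward temporally consistent representations.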


Cognitive Computation | 2018

Conditional Random Mapping for Effective ELM Feature Representation

Cheng Li; Chenwei Deng; Shichao Zhou; Baojun Zhao; Guang-Bin Huang

The extreme learning machine (ELM) has been extensively studied owing to its fast training and good generalization. Unfortunately, existing ELM-based feature representation methods are uncompetitive with state-of-the-art deep neural networks (DNNs) on complex visual recognition tasks. This weakness is mainly caused by two critical defects: (1) random feature mapping (RFM) drawn from an ad hoc probability distribution cannot project diverse input data into discriminative feature spaces; and (2) in ELM-based hierarchical architectures, features from the previous layer are scattered by the RFM in the current layer, which makes abstracting higher-level features ineffective. To address these issues, we take advantage of label information to optimize the random mapping in the ELM, using an efficient label-alignment metric to learn a conditional random feature mapping (CRFM) in a supervised manner. Moreover, we propose a new CRFM-based single-layer ELM (CELM) and extend it to a supervised multi-layer learning architecture (ML-CELM). Extensive experiments on widely used datasets demonstrate that our approach is more effective than the original ELM-based methods and existing DNN feature representation methods, while retaining rapid training and testing. The proposed CELM and ML-CELM achieve discriminative and robust feature representations and show superiority in various simulations in terms of generalization and speed.

Collaboration


Dive into Baojun Zhao's collaborations.

Top Co-Authors

Chenwei Deng, Beijing Institute of Technology
Shuigen Wang, Beijing Institute of Technology
Cheng Li, Beijing Institute of Technology
Guang-Bin Huang, Nanyang Technological University
Xun Liu, Beijing Institute of Technology
Yuqi Han, Beijing Institute of Technology
Zengshuo Zhang, Beijing Institute of Technology
Zhaofeng Pang, Beijing Institute of Technology
Weisi Lin, Nanyang Technological University
Jiatong Li, Beijing Institute of Technology