Electrical Engineering And Systems Science Image And Video Processing - Researchain

Featured Researches

Identification of particle mixtures using machine-learning-assisted laser diffraction analysis

We demonstrate a smart laser-diffraction analysis technique for particle mixture identification. We retrieve information about the size, geometry, and ratio concentration of two-component heterogeneous particle mixtures with an efficiency above 92%. In contrast to commonly used laser diffraction schemes, in which a large number of detectors is needed, our machine-learning-assisted protocol makes use of a single far-field diffraction pattern, contained within a small angle (∼0.26°) around the light propagation axis. Because of its reliability and ease of implementation, our work may pave the way towards the development of novel smart identification technologies for sample classification and particle contamination monitoring in industrial manufacturing processes.

Image And Video Processing

Image Compression with Encoder-Decoder Matched Semantic Segmentation

In recent years, layered image compression has been shown to be a promising direction: it encodes a compact representation of the input image and applies an up-sampling network to reconstruct the image. To further improve the quality of the reconstructed image, some works transmit the semantic segment together with the compressed image data. Consequently, the compression ratio decreases because extra bits are required for transmitting the semantic segment. To solve this problem, we propose a new layered image compression framework with encoder-decoder matched semantic segmentation (EDMS). After the semantic segmentation, a special convolutional neural network is used to enhance the inaccurate semantic segment. As a result, an accurate semantic segment can be obtained in the decoder without requiring extra bits. The experimental results show that the proposed EDMS framework achieves up to 35.31% BD-rate reduction over the HEVC-based (BPG) codec, and 5% bitrate and 24% encoding time savings compared to the state-of-the-art semantic-based image codec.

Image And Video Processing

Image Restoration by Solving IVP

Recent research on image restoration has achieved great success with the aid of deep learning technologies, but many methods are limited in dealing with super-resolution (SR) under realistic settings. To alleviate this problem, we introduce a new formulation for image super-resolution that handles arbitrary scale factors. Based on the proposed SR formulation, we can not only super-resolve images at multiple scales, but also find a new way to analyze the performance of the super-resolving process. We demonstrate that the proposed method can generate high-quality images, unlike conventional SR methods.

Image And Video Processing

Image Splicing Detection, Localization and Attribution via JPEG Primary Quantization Matrix Estimation and Clustering

Detection of inconsistencies of double JPEG artefacts across different image regions is often used to detect local image manipulations, like image splicing, and to localize them. In this paper, we move one step further, proposing an end-to-end system that, in addition to detecting and localizing spliced regions, can also distinguish regions coming from different donor images. We assume that both the spliced regions and the background image have undergone a double JPEG compression, and use a local estimate of the primary quantization matrix to distinguish between spliced regions taken from different sources. To do so, we cluster the image blocks according to the estimated primary quantization matrix and refine the result by means of morphological reconstruction. The proposed method can work in a wide variety of settings including aligned and non-aligned double JPEG compression, and regardless of whether the second compression is stronger or weaker than the first one. We validated the proposed approach by means of extensive experiments showing its superior performance with respect to baseline methods working in similar conditions.

Image And Video Processing

Image quality enhancement in wireless capsule endoscopy with adaptive fraction gamma transformation and unsharp masking filter

Wireless Capsule Endoscopy (WCE) was introduced in 2001 as one of the key approaches for observing the entire gastrointestinal (GI) tract, generally the small bowel. It has been used to detect diseases in the gastrointestinal tract. Endoscopic image analysis is still a field with many open problems. Many of the images WCE produces are of poor quality due to the nature of the imaging system, which makes assessment difficult for both physicians and computer-aided diagnosis. In this paper, a novel technique is proposed to improve the quality of images captured by WCE. More specifically, it enhances brightness and contrast and preserves color information while reducing computational complexity. Furthermore, the experimental PSNR and SSIM results confirm that the error introduced by this method is negligible. Moreover, the proposed method improves intensity restricted average local entropy (IRMLE) by 22% and color enhancement factor (CEF) by 10%, and effectively preserves the lightness of the image. Our method achieves better visual quality and objective assessment scores than state-of-the-art methods.

Image And Video Processing

ImageCHD: A 3D Computed Tomography Image Dataset for Classification of Congenital Heart Disease

Congenital heart disease (CHD) is the most common type of birth defect, occurring in 1 of every 110 births in the United States. CHD usually comes with severe variations in heart structure and great artery connections that can be classified into many types. Thus, highly specialized domain knowledge and a time-consuming manual process are needed to analyze the associated medical images. On the other hand, due to the complexity of CHD and the lack of datasets, little has been explored on the automatic diagnosis (classification) of CHDs. In this paper, we present ImageCHD, the first medical image dataset for CHD classification. ImageCHD contains 110 3D Computed Tomography (CT) images covering most types of CHD, which is of decent size compared with existing medical imaging datasets. Classification of CHDs requires the identification of large structural changes without any local tissue changes, with limited data. It is an example of a larger class of problems that are quite difficult for current machine-learning-based vision methods to solve. To demonstrate this, we further present a baseline framework for the automatic classification of CHD, based on a state-of-the-art CHD segmentation method. Experimental results show that the baseline framework can only achieve a classification accuracy of 82.0% under a selective prediction scheme with 88.4% coverage, leaving considerable room for further improvement. We hope that ImageCHD can stimulate further research and lead to innovative and generic solutions that would have an impact in multiple domains. Our dataset is released to the public.
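To make the "accuracy under a selective prediction scheme with a given coverage" figure concrete, here is a minimal Python sketch. It assumes the classifier abstains whenever its confidence falls below a threshold; the abstract does not say how ImageCHD's baseline decides to abstain, so the function name `selective_metrics`, the threshold rule, and the toy data are all illustrative assumptions.

```python
def selective_metrics(confidences, predictions, labels, threshold):
    """Return (coverage, accuracy on the covered samples).

    Assumption: a sample is "covered" when its confidence is at least
    `threshold`; otherwise the classifier abstains on it.
    """
    kept = [(p, y) for c, p, y in zip(confidences, predictions, labels)
            if c >= threshold]
    coverage = len(kept) / len(labels)
    if not kept:
        # No sample was covered, so selective accuracy is undefined.
        return coverage, float("nan")
    accuracy = sum(1 for p, y in kept if p == y) / len(kept)
    return coverage, accuracy


# Toy data (hypothetical): five scans, three classes.
confidences = [0.9, 0.8, 0.4, 0.95, 0.6]
predictions = [1, 0, 1, 2, 2]
labels = [1, 0, 0, 2, 1]

coverage, accuracy = selective_metrics(confidences, predictions, labels, 0.5)
print(coverage, accuracy)  # 0.8 0.75
```

Raising the threshold typically trades coverage for accuracy, which is why the two numbers are reported together.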

Image And Video Processing

Impact of lung segmentation on the diagnosis and explanation of COVID-19 in chest X-ray images

The COVID-19 pandemic is undoubtedly one of the biggest public health crises our society has ever faced in recent history. One of the main complications caused by COVID-19 is pneumonia, which is diagnosed using imaging exams, such as chest X-ray (CXR) and computed tomography (CT) scan. The CT scan is more precise than the CXR. However, CXR is suitable in particular situations because it is cheaper, faster, more widespread, and exposes the patient to less radiation. This study aims to demonstrate the impact of lung segmentation in COVID-19 identification using CXR images and evaluate which contents of the image decisively contribute to its identification. We performed the lung segmentation using a U-Net CNN architecture, and the classification using three well-known CNN architectures: VGG, ResNet, and Inception. To estimate the impact of lung segmentation, we applied some Explainable Artificial Intelligence (XAI) techniques, specifically LIME and Grad-CAM. To empirically evaluate our approach, we composed a database with three classes: lung opacity (pneumonia), COVID-19, and normal. The segmentation achieved a Jaccard distance of 0.034 and a Dice coefficient of 0.982. The classification using segmented lung achieved an F1-Score of 0.88 for the multi-class setup and 0.83 for COVID-19 identification. Further testing and XAI techniques suggest that segmented CXR images represent a much more realistic and less biased performance. To the best of our knowledge, no other work tried to estimate the impact of lung segmentation in COVID-19 identification using comprehensive XAI techniques.
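The two segmentation metrics quoted above are linked: for a single pair of masks, Dice = 2·IoU / (1 + IoU), where IoU = 1 − Jaccard distance. The following Python sketch shows this relation and checks that the reported values are roughly consistent with it. The function names and toy masks are hypothetical, and metrics averaged over many images need not satisfy the identity exactly.

```python
def dice_from_jaccard_distance(jaccard_distance):
    """Convert a Jaccard distance (1 - IoU) into a Dice coefficient."""
    iou = 1.0 - jaccard_distance
    return 2.0 * iou / (1.0 + iou)


def jaccard_distance_and_dice(pred, truth):
    """Compute (Jaccard distance, Dice) for two flat binary masks (0/1 lists)."""
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    if union == 0:
        # Both masks empty: treated here as a perfect match by convention.
        return 0.0, 1.0
    iou = intersection / union
    dice = 2.0 * intersection / (sum(pred) + sum(truth))
    return 1.0 - iou, dice


# Toy masks (hypothetical): 2 shared pixels, 3 pixels in the union.
distance, dice = jaccard_distance_and_dice([1, 1, 1, 0], [1, 1, 0, 0])
print(dice)  # 0.8

# The abstract's Jaccard distance of 0.034 implies a Dice of about 0.983,
# close to the reported 0.982.
implied_dice = dice_from_jaccard_distance(0.034)
```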

Image And Video Processing

Improved Brain Age Estimation with Slice-based Set Networks

Deep Learning for neuroimaging data is a promising but challenging direction. The high dimensionality of 3D MRI scans makes this endeavor compute and data-intensive. Most conventional 3D neuroimaging methods use 3D-CNN-based architectures with a large number of parameters and require more time and data to train. Recently, 2D-slice-based models have received increasing attention as they have fewer parameters and may require fewer samples to achieve comparable performance. In this paper, we propose a new architecture for BrainAGE prediction. The proposed architecture works by encoding each 2D slice in an MRI with a deep 2D-CNN model. Next, it combines the information from these 2D-slice encodings using set networks or permutation invariant layers. Experiments on the BrainAGE prediction problem, using the UK Biobank dataset, showed that the model with the permutation invariant layers trains faster and provides better predictions compared to other state-of-the-art approaches.
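The idea of combining per-slice encodings with a permutation-invariant layer can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not specify which set network is used, so plain mean pooling stands in for it, the 2D-CNN encoder is omitted, and the function name `mean_pool` and the toy vectors are hypothetical.

```python
import math


def mean_pool(slice_encodings):
    """Average a list of equal-length slice encodings into one vector.

    Mean pooling is permutation invariant: reordering the slices does not
    change the result. math.fsum returns a correctly rounded sum, so the
    output does not depend on summation order either.
    """
    n = len(slice_encodings)
    dim = len(slice_encodings[0])
    return [math.fsum(enc[d] for enc in slice_encodings) / n
            for d in range(dim)]


# Toy encodings (hypothetical): three slices, two features each.
encodings = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]
pooled = mean_pool(encodings)
shuffled = mean_pool([encodings[2], encodings[0], encodings[1]])
print(pooled)              # [3.0, 5.0]
print(pooled == shuffled)  # True
```

In a full model, the pooled vector would feed a regression head that outputs the predicted age; that part is left out here.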

Image And Video Processing

Improved gradient descent-based chroma subsampling method for color images in VVC

Prior to encoding color images for RGB full-color, Bayer color filter array (CFA), and digital time delay integration (DTDI) CFA images, performing chroma subsampling on their converted chroma images is necessary and important. In this paper, we propose an effective general gradient descent-based chroma subsampling method for the above three kinds of color images, achieving substantial quality and quality-bitrate tradeoff improvement of the reconstructed color images when compared with the related methods. First, a bilinear interpolation based 2×2 t (t ∈ {RGB, Bayer, DTDI}) color block-distortion function is proposed at the server side, and then, in the real domain, we prove that our general 2×2 t color block-distortion function is a convex function. Furthermore, a general closed form is derived to determine the initially subsampled chroma pair for each 2×2 chroma block. Finally, an effective iterative method is developed to improve the initially subsampled (U, V)-pair. Based on the Kodak and IMAX datasets, the comprehensive experimental results demonstrate that on the newly released versatile video coding (VVC) platform VTM-8.0, for the above three kinds of color images, our chroma subsampling method clearly outperforms the existing chroma subsampling methods.

Image And Video Processing

Improving Automated COVID-19 Grading with Convolutional Neural Networks in Computed Tomography Scans: An Ablation Study

Amidst the ongoing pandemic, several studies have shown that COVID-19 classification and grading using computed tomography (CT) images can be automated with convolutional neural networks (CNNs). Many of these studies focused on reporting initial results of algorithms that were assembled from commonly used components. The choice of these components was often pragmatic rather than systematic. For instance, several studies used 2D CNNs even though these might not be optimal for handling 3D CT volumes. This paper identifies a variety of components that increase the performance of CNN-based algorithms for COVID-19 grading from CT images. We investigated the effectiveness of using a 3D CNN instead of a 2D CNN, of using transfer learning to initialize the network, of providing automatically computed lesion maps as additional network input, and of predicting a continuous instead of a categorical output. A 3D CNN with these components achieved an area under the ROC curve (AUC) of 0.934 on our test set of 105 CT scans and an AUC of 0.923 on a publicly available set of 742 CT scans, a substantial improvement in comparison with a previously published 2D CNN. An ablation study demonstrated that, in addition to using a 3D CNN instead of a 2D CNN, transfer learning contributed the most and continuous output contributed the least to improving the model performance.
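For readers unfamiliar with the AUC figures quoted above, the following Python sketch computes the area under the ROC curve from continuous scores using the pairwise (Mann-Whitney) formulation: the fraction of positive/negative pairs in which the positive case gets the higher score, with ties counted as half. This is a generic illustration rather than the paper's evaluation code; the function name `roc_auc` and the toy data are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve for binary labels (1 = positive, 0 = negative).

    Uses the pairwise definition, which is O(n_pos * n_neg); fine for a
    sketch, though rank-based implementations are faster on large sets.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example (hypothetical): three of the four positive/negative pairs
# are ranked correctly.
print(roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```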
