Electrical Engineering And Systems Science Image And Video Processing - Researchain

Featured Researches

Analyzing Epistemic and Aleatoric Uncertainty for Drusen Segmentation in Optical Coherence Tomography Images

Age-related macular degeneration (AMD) is one of the leading causes of permanent vision loss in people aged over 60 years. Accurate segmentation of biomarkers such as drusen that points to the early stages of AMD is crucial in preventing further vision impairment. However, segmenting drusen is extremely challenging due to their varied sizes and appearances, low contrast and noise resemblance. Most existing literature, therefore, have focused on size estimation of drusen using classification, leaving the challenge of accurate segmentation less tackled. Additionally, obtaining the pixel-wise annotations is extremely costly and such labels can often be noisy, suffering from inter-observer and intra-observer variability. Quantification of uncertainty associated with segmentation tasks offers principled measures to inspect the segmentation output. Realizing its utility in identifying erroneous segmentation and the potential applications in clinical decision making, here we develop a U-Net based drusen segmentation model and quantify the segmentation uncertainty. We investigate epistemic and aleatoric uncertainty capturing model confidence and data uncertainty respectively. We present segmentation results and show how uncertainty can help formulate robust evaluation strategies. We visually inspect the pixel-wise uncertainty and segmentation results on test images. We finally analyze the correlation between segmentation uncertainty and accuracy. Our results demonstrate the utility of leveraging uncertainties in developing and explaining segmentation models for medical image analysis.

Image And Video Processing

Anisotropic 3D Multi-Stream CNN for Accurate Prostate Segmentation from Multi-Planar MRI

Background and Objective: Accurate and reliable segmentation of the prostate gland in MR images can support the clinical assessment of prostate cancer, as well as the planning and monitoring of focal and loco-regional therapeutic interventions. Despite the availability of multi-planar MR scans due to standardized protocols, the majority of segmentation approaches presented in the literature consider the axial scans only. Methods: We propose an anisotropic 3D multi-stream CNN architecture, which processes additional scan directions to produce a higher-resolution isotropic prostate segmentation. We investigate two variants of our architecture, which work on two (dual-plane) and three (triple-plane) image orientations, respectively. We compare them with the standard baseline (single-plane) used in literature, i.e., plain axial segmentation. To realize a fair comparison, we employ a hyperparameter optimization strategy to select optimal configurations for the individual approaches. Results: Training and evaluation on two datasets spanning multiple sites obtain statistical significant improvement over the plain axial segmentation ( p<0.05 on the Dice similarity coefficient). The improvement can be observed especially at the base ( 0.898 single-plane vs. 0.906 triple-plane) and apex ( 0.888 single-plane vs. 0.901 dual-plane). Conclusion: This study indicates that models employing two or three scan directions are superior to plain axial segmentation. The knowledge of precise boundaries of the prostate is crucial for the conservation of risk structures. Thus, the proposed models have the potential to improve the outcome of prostate cancer diagnosis and therapies.

Image And Video Processing

Anisotropic field-of-views in radial imaging

Radial imaging techniques, such as projection-reconstruction (PR), are used in magnetic resonance imaging (MRI) for dynamic imaging, angiography, and short-imaging. They are robust to flow and motion, have diffuse aliasing patterns, and support short readouts and echo times. One drawback is that standard implementations do not support anisotropic field-of-view (FOV) shapes, which are used to match the imaging parameters to the object or region-of-interest. A set of fast, simple algorithms for 2-D and 3-D PR, and 3-D cones acquisitions are introduced that match the sampling density in frequency space to the desired FOV shape. Tailoring the acquisitions allows for reduction of aliasing artifacts in undersampled applications or scan time reductions without introducing aliasing in fully-sampled applications. It also makes possible new radial imaging applications that were previously unsuitable, such as imaging elongated regions or thin slabs. 2-D PR longitudinal leg images and thin-slab, single breath-hold 3-D PR abdomen images, both with isotropic resolution, demonstrate these new possibilities. No scan time to volume efficiency is lost by using anisotropic FOVs. The acquisition trajectories can be computed on a scan by scan basis.

Image And Video Processing

Anonymization of labeled TOF-MRA images for brain vessel segmentation using generative adversarial networks

Anonymization and data sharing are crucial for privacy protection and acquisition of large datasets for medical image analysis. This is a big challenge, especially for neuroimaging. Here, the brain's unique structure allows for re-identification and thus requires non-conventional anonymization. Generative adversarial networks (GANs) have the potential to provide anonymous images while preserving predictive properties. Analyzing brain vessel segmentation, we trained 3 GANs on time-of-flight (TOF) magnetic resonance angiography (MRA) patches for image-label generation: 1) Deep convolutional GAN, 2) Wasserstein-GAN with gradient penalty (WGAN-GP) and 3) WGAN-GP with spectral normalization (WGAN-GP-SN). The generated image-labels from each GAN were used to train a U-net for segmentation and tested on real data. Moreover, we applied our synthetic patches using transfer learning on a second dataset. For an increasing number of up to 15 patients we evaluated the model performance on real data with and without pre-training. The performance for all models was assessed by the Dice Similarity Coefficient (DSC) and the 95th percentile of the Hausdorff Distance (95HD). Comparing the 3 GANs, the U-net trained on synthetic data generated by the WGAN-GP-SN showed the highest performance to predict vessels (DSC/95HD 0.82/28.97) benchmarked by the U-net trained on real data (0.89/26.61). The transfer learning approach showed superior performance for the same GAN compared to no pre-training, especially for one patient only (0.91/25.68 vs. 0.85/27.36). In this work, synthetic image-label pairs retained generalizable information and showed good performance for vessel segmentation. Besides, we showed that synthetic patches can be used in a transfer learning approach with independent data. This paves the way to overcome the challenges of scarce data and anonymization in medical imaging.

Image And Video Processing

Applications of Deep Learning in Fundus Images: A Review

The use of fundus images for the early screening of eye diseases is of great clinical importance. Due to its powerful performance, deep learning is becoming more and more popular in related applications, such as lesion segmentation, biomarkers segmentation, disease diagnosis and image synthesis. Therefore, it is very necessary to summarize the recent developments in deep learning for fundus images with a review paper. In this review, we introduce 143 application papers with a carefully designed hierarchy. Moreover, 33 publicly available datasets are presented. Summaries and analyses are provided for each task. Finally, limitations common to all tasks are revealed and possible solutions are given. We will also release and regularly update the state-of-the-art results and newly-released datasets at this https URL Review to adapt to the rapid development of this field.

Image And Video Processing

Artefact removal in ground truth and noise model deficient sub-cellular nanoscopy images using auto-encoder deep learning

Image denoising or artefact removal using deep learning is possible in the availability of supervised training dataset acquired in real experiments or synthesized using known noise models. Neither of the conditions can be fulfilled for nanoscopy (super-resolution optical microscopy) images that are generated from microscopy videos through statistical analysis techniques. Due to several physical constraints, supervised dataset cannot be measured. Due to non-linear spatio-temporal mixing of data and valuable statistics of fluctuations from fluorescent molecules which compete with noise statistics, noise or artefact models in nanoscopy images cannot be explicitly learnt. Therefore, such problem poses unprecedented challenges to deep learning. Here, we propose a robust and versatile simulation-supervised training approach of deep learning auto-encoder architectures for the highly challenging nanoscopy images of sub-cellular structures inside biological samples. We show the proof of concept for one nanoscopy method and investigate the scope of generalizability across structures, noise models, and nanoscopy algorithms not included during simulation-supervised training. We also investigate a variety of loss functions and learning models and discuss the limitation of existing performance metrics for nanoscopy images. We generate valuable insights for this highly challenging and unsolved problem in nanoscopy, and set the foundation for application of deep learning problems in nanoscopy for life sciences.

Image And Video Processing

Attention-Based Neural Networks for Chroma Intra Prediction in Video Coding

Neural networks can be successfully used to improve several modules of advanced video coding schemes. In particular, compression of colour components was shown to greatly benefit from usage of machine learning models, thanks to the design of appropriate attention-based architectures that allow the prediction to exploit specific samples in the reference region. However, such architectures tend to be complex and computationally intense, and may be difficult to deploy in a practical video coding pipeline. This work focuses on reducing the complexity of such methodologies, to design a set of simplified and cost-effective attention-based architectures for chroma intra-prediction. A novel size-agnostic multi-model approach is proposed to reduce the complexity of the inference process. The resulting simplified architecture is still capable of outperforming state-of-the-art methods. Moreover, a collection of simplifications is presented in this paper, to further reduce the complexity overhead of the proposed prediction architecture. Thanks to these simplifications, a reduction in the number of parameters of around 90% is achieved with respect to the original attention-based methodologies. Simplifications include a framework for reducing the overhead of the convolutional operations, a simplified cross-component processing model integrated into the original architecture, and a methodology to perform integer-precision approximations with the aim to obtain fast and hardware-aware implementations. The proposed schemes are integrated into the Versatile Video Coding (VVC) prediction pipeline, retaining compression efficiency of state-of-the-art chroma intra-prediction methods based on neural networks, while offering different directions for significantly reducing coding complexity.

Image And Video Processing

Attention-gated convolutional neural networks for off-resonance correction of spiral real-time MRI

Spiral acquisitions are preferred in real-time MRI because of their efficiency, which has made it possible to capture vocal tract dynamics during natural speech. A fundamental limitation of spirals is blurring and signal loss due to off-resonance, which degrades image quality at air-tissue boundaries. Here, we present a new CNN-based off-resonance correction method that incorporates an attention-gate mechanism. This leverages spatial and channel relationships of filtered outputs and improves the expressiveness of the networks. We demonstrate improved performance with the attention-gate, on 1.5 Tesla spiral speech RT-MRI, compared to existing off-resonance correction methods.

Image And Video Processing

Audiovisual Saliency Prediction in Uncategorized Video Sequences based on Audio-Video Correlation

Substantial research has been done in saliency modeling to develop intelligent machines that can perceive and interpret their surroundings. But existing models treat videos as merely image sequences excluding any audio information, unable to cope with inherently varying content. Based on the hypothesis that an audiovisual saliency model will be an improvement over traditional saliency models for natural uncategorized videos, this work aims to provide a generic audio/video saliency model augmenting a visual saliency map with an audio saliency map computed by synchronizing low-level audio and visual features. The proposed model was evaluated using different criteria against eye fixations data for a publicly available DIEM video dataset. The results show that the model outperformed two state-of-the-art visual saliency models.

Image And Video Processing

Automated Deep Learning Analysis of Angiography Video Sequences for Coronary Artery Disease

The evaluation of obstructions (stenosis) in coronary arteries is currently done by a physician's visual assessment of coronary angiography video sequences. It is laborious, and can be susceptible to interobserver variation. Prior studies have attempted to automate this process, but few have demonstrated an integrated suite of algorithms for the end-to-end analysis of angiograms. We report an automated analysis pipeline based on deep learning to rapidly and objectively assess coronary angiograms, highlight coronary vessels of interest, and quantify potential stenosis. We propose a 3-stage automated analysis method consisting of key frame extraction, vessel segmentation, and stenosis measurement. We combined powerful deep learning approaches such as ResNet and U-Net with traditional image processing and geometrical analysis. We trained and tested our algorithms on the Left Anterior Oblique (LAO) view of the right coronary artery (RCA) using anonymized angiograms obtained from a tertiary cardiac institution, then tested the generalizability of our technique to the Right Anterior Oblique (RAO) view. We demonstrated an overall improvement on previous work, with key frame extraction top-5 precision of 98.4%, vessel segmentation F1-Score of 0.891 and stenosis measurement 20.7% Type I Error rate.

Ready to get started?

Join us today

Archive Your Research