Electrical Engineering And Systems Science Image And Video Processing - Researchain

Featured Researches

Adversarial attacks on deep learning models for fatty liver disease classification by modification of ultrasound image reconstruction method

Convolutional neural networks (CNNs) have achieved remarkable success in medical image analysis tasks. In ultrasound (US) imaging, CNNs have been applied to object classification, image reconstruction and tissue characterization. However, CNNs can be vulnerable to adversarial attacks, even small perturbations applied to input data may significantly affect model performance and result in wrong output. In this work, we devise a novel adversarial attack, specific to ultrasound (US) imaging. US images are reconstructed based on radio-frequency signals. Since the appearance of US images depends on the applied image reconstruction method, we explore the possibility of fooling deep learning model by perturbing US B-mode image reconstruction method. We apply zeroth order optimization to find small perturbations of image reconstruction parameters, related to attenuation compensation and amplitude compression, which can result in wrong output. We illustrate our approach using a deep learning model developed for fatty liver disease diagnosis, where the proposed adversarial attack achieved success rate of 48%.

Image And Video Processing

Adversarial cycle-consistent synthesis of cerebral microbleeds for data augmentation

We propose a novel framework for controllable pathological image synthesis for data augmentation. Inspired by CycleGAN, we perform cycle-consistent image-to-image translation between two domains: healthy and pathological. Guided by a semantic mask, an adversarially trained generator synthesizes pathology on a healthy image in the specified location. We demonstrate our approach on an institutional dataset of cerebral microbleeds in traumatic brain injury patients. We utilize synthetic images generated with our method for data augmentation in cerebral microbleeds detection. Enriching the training dataset with synthetic images exhibits the potential to increase detection performance for cerebral microbleeds in traumatic brain injury patients.

Image And Video Processing

Age-Net: An MRI-Based Iterative Framework for Brain Biological Age Estimation

The concept of biological age (BA), although important in clinical practice, is hard to grasp mainly due to the lack of a clearly defined reference standard. For specific applications, especially in pediatrics, medical image data are used for BA estimation in a routine clinical context. Beyond this young age group, BA estimation is mostly restricted to whole-body assessment using non-imaging indicators such as blood biomarkers, genetic and cellular data. However, various organ systems may exhibit different aging characteristics due to lifestyle and genetic factors. Thus, a whole-body assessment of the BA does not reflect the deviations of aging behavior between organs. To this end, we propose a new imaging-based framework for organ-specific BA estimation. In this initial study, we focus mainly on brain MRI. As a first step, we introduce a chronological age (CA) estimation framework using deep convolutional neural networks (Age-Net). We quantitatively assess the performance of this framework in comparison to existing state-of-the-art CA estimation approaches. Furthermore, we expand upon Age-Net with a novel iterative data-cleaning algorithm to segregate atypical-aging patients (BA ≈/ CA) from the given population. We hypothesize that the remaining population should approximate the true BA behavior. We apply the proposed methodology on a brain magnetic resonance image (MRI) dataset containing healthy individuals as well as Alzheimer's patients with different dementia ratings. We demonstrate the correlation between the predicted BAs and the expected cognitive deterioration in Alzheimer's patients. A statistical and visualization-based analysis has provided evidence regarding the potential and current challenges of the proposed methodology.

Image And Video Processing

An Adaptive Video Acquisition Scheme for Object Tracking and its Performance Optimization

We present a novel adaptive host-chip modular architecture for video acquisition to optimize an overall objective task constrained under a given bit rate. The chip is a high resolution imaging sensor such as gigapixel focal plane array (FPA) with low computational power deployed on the field remotely, while the host is a server with high computational power. The communication channel data bandwidth between the chip and host is constrained to accommodate transfer of all captured data from the chip. The host performs objective task specific computations and also intelligently guides the chip to optimize (compress) the data sent to host. This proposed system is modular and highly versatile in terms of flexibility in re-orienting the objective task. In this work, object tracking is the objective task. While our architecture supports any form of compression/distortion, in this paper we use quadtree (QT)-segmented video frames. We use Viterbi (Dynamic Programming) algorithm to minimize the area normalized weighted rate-distortion allocation of resources. The host receives only these degraded frames for analysis. An object detector is used to detect objects, and a Kalman Filter based tracker is used to track those objects. Evaluation of system performance is done in terms of Multiple Object Tracking Accuracy (MOTA) metric. In this proposed novel architecture, performance gains in MOTA is obtained by twice training the object detector with different system generated distortions as a novel 2-step process. Additionally, object detector is assisted by tracker to upscore the region proposals in the detector to further improve the performance.

Image And Video Processing

An Attention Mechanism with Multiple Knowledge Sources for COVID-19 Detection from CT Images

Until now, Coronavirus SARS-CoV-2 has caused more than 850,000 deaths and infected more than 27 million individuals in over 120 countries. Besides principal polymerase chain reaction (PCR) tests, automatically identifying positive samples based on computed tomography (CT) scans can present a promising option in the early diagnosis of COVID-19. Recently, there have been increasing efforts to utilize deep networks for COVID-19 diagnosis based on CT scans. While these approaches mostly focus on introducing novel architectures, transfer learning techniques, or construction large scale data, we propose a novel strategy to improve the performance of several baselines by leveraging multiple useful information sources relevant to doctors' judgments. Specifically, infected regions and heat maps extracted from learned networks are integrated with the global image via an attention mechanism during the learning process. This procedure not only makes our system more robust to noise but also guides the network focusing on local lesion areas. Extensive experiments illustrate the superior performance of our approach compared to recent baselines. Furthermore, our learned network guidance presents an explainable feature to doctors as we can understand the connection between input and output in a grey-box model.

Image And Video Processing

An Interpretation of Regularization by Denoising and its Application with the Back-Projected Fidelity Term

The vast majority of image recovery tasks are ill-posed problems. As such, methods that are based on optimization use cost functions that consist of both fidelity and prior (regularization) terms. A recent line of works imposes the prior by the Regularization by Denoising (RED) approach, which exploits the good performance of existing image denoising engines. Yet, the relation of RED to explicit prior terms is still not well understood, as previous work requires too strong assumptions on the denoisers. In this paper, we make two contributions. First, we show that the RED gradient can be seen as a (sub)gradient of a prior function--but taken at a denoised version of the point. As RED is typically applied with a relatively small noise level, this interpretation indicates a similarity between RED and traditional gradients. This leads to our second contribution: We propose to combine RED with the Back-Projection (BP) fidelity term rather than the common Least Squares (LS) term that is used in previous works. We show that the advantages of BP over LS for image deblurring and super-resolution, which have been demonstrated for traditional gradients, carry on to the RED approach.

Image And Video Processing

An Ultra Fast Low Power Convolutional Neural Network Image Sensor with Pixel-level Computing

The separation of the data capture and analysis in modern vision systems has led to a massive amount of data transfer between the end devices and cloud computers, resulting in long latency, slow response, and high power consumption. Efficient hardware architectures are under focused development to enable Artificial Intelligence (AI) at the resource-limited end sensing devices. This paper proposes a Processing-In-Pixel (PIP) CMOS sensor architecture, which allows convolution operation before the column readout circuit to significantly improve the image reading speed with much lower power consumption. The simulation results show that the proposed architecture enables convolution operation (kernel size=3*3, stride=2, input channel=3, output channel=64) in a 1080P image sensor array with only 22.62 mW power consumption. In other words, the computational efficiency is 4.75 TOPS/w, which is about 3.6 times as higher as the state-of-the-art.

Image And Video Processing

Analysis of Dilation in Children and its Impact on Iris Recognition

The dilation of the pupil and it's variation between a mated pair of irides has been found to be an important factor in the performance of iris recognition systems. Studies on adult irides indicated significant impact of dilation on iris recognition performance at different ages. However, the results of adults may not necessarily translate to children. This study analyzes dilation as a factor of age and over time in children, from data collected from same 209 subjects in the age group of four to 11 years at enrollment, longitudinally over three years spaced by six months. The performance of iris recognition is also analyzed in presence of dilation variation.

Image And Video Processing

Analysis of Hand-Crafted and Automatic-Learned Features for Glaucoma Detection Through Raw Circmpapillary OCT Images

Taking into account that glaucoma is the leading cause of blindness worldwide, we propose in this paper three different learning methodologies for glaucoma detection in order to elucidate that traditional machine-learning techniques could outperform deep-learning algorithms, especially when the image data set is small. The experiments were performed on a private database composed of 194 glaucomatous and 198 normal B-scans diagnosed by expert ophthalmologists. As a novelty, we only considered raw circumpapillary OCT images to build the predictive models, without using other expensive tests such as visual field and intraocular pressure measures. The results ratify that the proposed hand-driven learning model, based on novel descriptors, outperforms the automatic learning. Additionally, the hybrid approach consisting of a combination of both strategies reports the best performance, with an area under the ROC curve of 0.85 and an accuracy of 0.82 during the prediction stage.

Image And Video Processing

Analysis of skin lesion images with deep learning

Skin cancer is the most common cancer worldwide, with melanoma being the deadliest form. Dermoscopy is a skin imaging modality that has shown an improvement in the diagnosis of skin cancer compared to visual examination without support. We evaluate the current state of the art in the classification of dermoscopic images based on the ISIC-2019 Challenge for the classification of skin lesions and current literature. Various deep neural network architectures pre-trained on the ImageNet data set are adapted to a combined training data set comprised of publicly available dermoscopic and clinical images of skin lesions using transfer learning and model fine-tuning. The performance and applicability of these models for the detection of eight classes of skin lesions are examined. Real-time data augmentation, which uses random rotation, translation, shear, and zoom within specified bounds is used to increase the number of available training samples. Model predictions are multiplied by inverse class frequencies and normalized to better approximate actual probability distributions. Overall prediction accuracy is further increased by using the arithmetic mean of the predictions of several independently trained models. The best single model has been published as a web service.

Ready to get started?

Join us today

Archive Your Research