Adversarial attacks on deep learning models for fatty liver disease classification by modification of ultrasound image reconstruction method

Michał Byra*†, Grzegorz Styczynski‡, Cezary Szmigielski‡, Piotr Kalinowski§, Lukasz Michalowski¶, Rafal Paluszkiewicz§, Bogna Ziarkiewicz-Wroblewska¶, Krzysztof Zieniewicz§, Andrzej Nowicki†

† Department of Ultrasound, Institute of Fundamental Technological Research, Polish Academy of Sciences, Warsaw, Poland
‡ Department of Internal Medicine, Hypertension and Vascular Diseases, Medical University of Warsaw, Poland
§ Department of General, Transplant and Liver Surgery, Medical University of Warsaw, Poland
¶ Department of Pathology, Center for Biostructure Research, Medical University of Warsaw, Poland
* Corresponding author, e-mail: [email protected]
Abstract—Convolutional neural networks (CNNs) have achieved remarkable success in medical image analysis tasks. In ultrasound (US) imaging, CNNs have been applied to object classification, image reconstruction and tissue characterization. However, CNNs can be vulnerable to adversarial attacks: even small perturbations applied to input data may significantly affect model performance and result in wrong output. In this work, we devise a novel adversarial attack specific to US imaging. US images are reconstructed from radio-frequency signals, and their appearance depends on the applied image reconstruction method. We therefore explore the possibility of fooling a deep learning model by perturbing the B-mode US image reconstruction method. We apply zeroth order optimization to find small perturbations of image reconstruction parameters, related to attenuation compensation and amplitude compression, that result in wrong output. We illustrate our approach using a deep learning model developed for fatty liver disease diagnosis, where the proposed adversarial attack achieved a success rate of 48%.
Index Terms—adversarial attacks, deep learning, fatty liver, transfer learning, ultrasound imaging
I. INTRODUCTION
Convolutional neural networks (CNNs) have achieved remarkable success in medical image analysis tasks, such as image classification, object detection and semantic segmentation. However, CNNs can be vulnerable to adversarial attacks: even small perturbations applied to input data may significantly affect model performance and result in wrong output [1]–[3]. Such perturbations may occur accidentally or can be intentionally designed to fool the model, for example by direct modification of the input image pixel intensities. The existence of adversarial examples raises questions about the robustness of deep learning models, which is especially important in medical imaging. In ultrasound (US) imaging, the appearance of tissues depends on the applied image reconstruction method. US image pixel intensities depend on the attenuation compensation technique and on the algorithms used to process radio-frequency (RF) US signals. Nonlinear compression of US echoes may enhance the visibility of tissue interfaces, but also remove speckle patterns specific to particular tissues. In practice, radiologists and physicians use different scanner settings to obtain the desired US image quality. However, as presented in previous studies, modifications of US image pixel intensities may affect the extraction of texture features and lower the performance of US based machine learning models [4]–[6]. For other medical imaging modalities, the vulnerability of deep learning models to adversarial attacks has been presented, among others, using dermoscopy images [7].

In this work, we devise a novel approach to adversarial attacks which is specific to US imaging. In comparison to methods from computer vision that directly modify image pixel intensities, we explore the possibility of fooling a deep learning model by perturbing the B-mode US image reconstruction method [2].
We investigate whether a modification of the reconstruction method may change the distribution of US image pixel intensities and consequently result in wrong output. We illustrate the proposed approach using a deep learning model developed for the diagnosis of fatty liver disease, which is an important medical problem [8], [9]. First, we use transfer learning to develop a deep learning model for the classification of liver US images. Second, we apply zeroth order optimization (ZOO) to find a perturbation in the space of image reconstruction parameters which results in wrong classification of the reconstructed US image. To the best of our knowledge, while several groups have proposed deep learning models for fatty liver disease diagnosis, the robustness of such methods to adversarial attacks has not been investigated yet [10]–[12].

II. METHODS
A. Dataset
To develop the deep learning model and to assess the proposed adversarial attack, we used the following datasets:

1) 178 US images collected from 178 patients with the GE Vivid E9 System (GE Healthcare INC, Horten, Norway).
2) 33 RF data frames (post-beamformed, before US image reconstruction) collected from 33 patients with a Siemens system (Siemens, Issaquah, Wash).

The data were collected from the liver/kidney view from patients admitted for bariatric surgery. The scanning was performed with convex transducers operating at an imaging frequency of around 2.5 MHz. Fatty liver disease was diagnosed based on liver biopsy (more than 5% of hepatocytes with steatosis). For each dataset, approximately 65% of patients had fatty liver disease. Several US images from each dataset are presented in Fig. 1.

B. Image reconstruction
Commonly, the reconstruction method includes several procedures, e.g. attenuation compensation, compression of RF signals, data interpolation and resizing. In our case, the reconstruction of US images based on RF signals included the following steps:

1) Amplitude calculation with the Hilbert transform.
2) Attenuation compensation based on a fixed attenuation coefficient β.
3) Interpolation from the coordinate space of the convex transducer to correct spatial dimensions.
4) Logarithmic compression of amplitude samples and thresholding to a specific decibel range specified by upper (α_u) and lower (α_l) threshold levels (e.g. amplitude samples below α_l were set to α_l).
5) Mapping of compressed and thresholded amplitude samples to US image pixel intensities (8 bits).

In this work, we performed the adversarial attack based on the modification of the attenuation coefficient β and the compression threshold levels α_u and α_l. We selected these parameters because any modification of them directly affects US image pixel intensities, and consequently changes the appearance of edges and speckle patterns in the US image. For the experiments, we also selected the following initial reconstruction parameters: β = 0.9 dB/(cm·MHz), α_l = 10 dB, α_u = 55 dB. These parameters were used as a starting point for the search of the adversarial perturbation. The selection of the attenuation coefficient was motivated by previous studies on fatty liver disease assessment with quantitative US [13], [14]. The compression threshold levels were selected based on subjective visual assessment of differently reconstructed US images.

C. Deep learning model
We used the InceptionResNetV2 CNN pre-trained on the ImageNet dataset for fatty liver disease diagnosis [15], [16]. The last dense layer was replaced with a randomly initialized dense layer suitable for binary classification. The network was trained using the first dataset and the images from the second dataset reconstructed with the initial parameters. The stochastic gradient descent algorithm was applied to minimize the binary cross-entropy loss. Dropout regularization and early stopping were applied to address the over-fitting problem. The loss function was weighted to take into account class imbalance. Moreover, the reconstructed gray-scale US images were resized to (299, 299), the original input size of the InceptionResNetV2 CNN, and duplicated across all color channels to imitate RGB images. These two steps may also be considered as part of the US input image reconstruction method.

Fig. 1. US images acquired with the two scanners and used in our study to investigate the proposed adversarial attack.
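As a rough illustration, the reconstruction chain of Sec. II-B together with the resizing and channel duplication described above can be sketched as follows. This is a minimal sketch, not the authors' code: the function names, the nearest-neighbour resize, and the absolute dB scale of the compression are simplifying assumptions, and the scan conversion step (3) is omitted.

```python
import numpy as np

def envelope(rf):
    """Envelope detection via the Hilbert transform (FFT implementation),
    applied along the axial (first) axis of a 2-D RF frame."""
    n = rf.shape[0]
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[1:n // 2] = 2.0
        h[n // 2] = 1.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(np.fft.fft(rf, axis=0) * h[:, None], axis=0))

def reconstruct_bmode(rf, sample_depth_cm, beta=0.9, alpha_l=10.0,
                      alpha_u=55.0, freq_mhz=2.5, out_size=299):
    # 1) amplitude calculation with the Hilbert transform
    env = envelope(rf)
    # 2) attenuation compensation with a fixed coefficient beta, dB/(cm*MHz);
    #    the factor 2 accounts for the two-way travel of the pulse
    depth_cm = np.arange(rf.shape[0]) * sample_depth_cm
    env = env * 10.0 ** (2.0 * beta * freq_mhz * depth_cm[:, None] / 20.0)
    # 3) scan conversion to the convex-probe geometry is omitted in this sketch
    # 4) logarithmic compression and thresholding to [alpha_l, alpha_u] dB
    db = np.clip(20.0 * np.log10(env + 1e-12), alpha_l, alpha_u)
    # 5) mapping to 8-bit pixel intensities
    img = ((db - alpha_l) / (alpha_u - alpha_l) * 255.0).astype(np.uint8)
    # resizing to the CNN input size and duplication across color channels
    rows = np.arange(out_size) * img.shape[0] // out_size
    cols = np.arange(out_size) * img.shape[1] // out_size
    return np.stack([img[np.ix_(rows, cols)]] * 3, axis=-1)
```

Under this sketch, a frame `rf` of shape (axial samples, scan lines) is mapped to a (299, 299, 3) uint8 array ready for the CNN input.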
D. Adversarial attack
The proposed method works in a black-box setting: it only requires access to the output of the network to calculate the loss function (binary cross-entropy in our case) [17], [18]. The attack is performed for each RF data frame separately. We reconstruct the first US image using the initial parameters, and then, at each step of the procedure, we determine a perturbation resulting in a performance decrease. While there is no explicit formula for the gradient of the loss function with respect to the reconstruction parameters, we can approximate the gradient with respect to, for example, the attenuation coefficient β with the following formula:

∂J/∂β ≈ [J(β + δ_β) − J(β − δ_β)] / (2δ_β),   (1)

where J(·) stands for the binary cross-entropy loss, which depends on the input image, the image label, the network parameters and the reconstruction parameters β, α_l, α_u, and δ_β is the step size set for the β parameter. Similarly, this formula can be used to calculate the gradients with respect to the α_l and α_u parameters. Given the gradients, we can apply sign coordinate gradient descent and update the reconstruction parameters so as to maximize the loss function and undermine the model. For example, to update the β parameter we can apply the following formula:

β_{i+1} = β_i + ε_β sign(∂J/∂β),   (2)

where i stands for the i-th iteration step of the procedure and ε_β is the learning rate for the β parameter.

Fig. 2. Pipeline of the investigated adversarial attack on the US based deep learning model. The reconstruction parameters β, α_l, α_u are updated using zeroth order optimization to find a set of parameters resulting in wrong classification.

Fig. 3. US image presenting fatty liver (reconstruction parameters β = 0.9 dB/(cm·MHz), α_l = 10 dB, α_u = 55 dB), the corresponding adversarial example (reconstruction parameters β = 1.05 dB/(cm·MHz), α_l = 11.5 dB, α_u = 54.5 dB), and the difference between the two images.
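One iteration of the attack, i.e. the two-sided finite-difference estimate of Eq. (1) followed by the sign update of Eq. (2), can be sketched as below. The step sizes, learning rates and parameter bounds are taken from this paper; the dict-based interface and the function name are illustrative assumptions, and `loss_fn` stands for the black-box oracle that reconstructs the image with the given parameters and returns the network's loss.

```python
import numpy as np

# Step sizes, learning rates and min/max bounds reported in the paper.
STEPS = {"beta": 0.05, "alpha_l": 0.1, "alpha_u": 0.1}
LRS = {"beta": 0.05, "alpha_l": 0.5, "alpha_u": 0.5}
BOUNDS = {"beta": (0.5, 1.3), "alpha_l": (5.0, 15.0), "alpha_u": (50.0, 60.0)}

def zoo_sign_step(loss_fn, params):
    """One coordinate-wise sign update: estimate each gradient with a
    two-sided finite difference, then ascend the loss and clip to bounds."""
    new = dict(params)
    for name in params:
        up, down = dict(params), dict(params)
        up[name] += STEPS[name]
        down[name] -= STEPS[name]
        # finite-difference approximation of the gradient (Eq. 1)
        grad = (loss_fn(up) - loss_fn(down)) / (2.0 * STEPS[name])
        # sign update in the direction that increases the loss (Eq. 2)
        lo, hi = BOUNDS[name]
        new[name] = float(np.clip(params[name] + LRS[name] * np.sign(grad),
                                  lo, hi))
    return new
```

Starting from the initial parameters, the step would be repeated until the reconstructed image is misclassified or the parameter bounds are reached.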
The remaining reconstruction parameters, α_l and α_u, can be updated in a similar way.

In this study, we determined the steps in Eq. (1) and the learning rates in Eq. (2) experimentally. We found that steps equal to 0.05, 0.1, 0.1 and learning rates equal to 0.05, 0.5, 0.5 for β, α_l, α_u, respectively, performed well in our case. The pipeline of the proposed adversarial attack is illustrated in Fig. 2. To ensure relatively small perturbations, we limited the min/max ranges of the possible reconstruction parameters to (0.5, 1.3), (5, 15) and (50, 60) for the β, α_l and α_u parameters, respectively. The procedure was stopped after obtaining a perturbation resulting in wrong classification or after reaching the min/max parameters. The cut-off for the classification was set to 0.5. All computations in this work were performed in Python; the deep learning model was implemented in TensorFlow [19].

III. RESULTS
The area under the receiver operating characteristic curve (AUC) and the accuracy in the case of the second dataset were equal to 0.84 and 0.82 (27/33), respectively. The correctly classified cases were used to assess the proposed adversarial attack. We were able to perform a successful attack (resulting in misclassification) on 13 out of 27 cases, a success rate of 48%. We found that all reconstruction parameters were used by the optimizer to maximize the loss and perturb the data. In the case of the unsuccessful attacks, the procedure reached the parameter bounds each time, but the change in the network output was too small to result in misclassification. Fig. 3 presents an example of a successful adversarial attack on a US image presenting fatty liver.

IV. DISCUSSION
In this work, we showed that adversarial attacks based on the modification of the US image reconstruction method are feasible. Our approach was demonstrated with a deep model developed for fatty liver disease diagnosis. As presented in Fig. 3, even a small change of the parameters related to the reconstruction method may significantly decrease the classification performance of the deep model and result in wrong output. This is probably due to the complex behavior of deep models [1], [2]. In comparison to adversarial attacks from computer vision, which directly modify input image pixels, our approach is specific to US imaging: in our case, the modification of the US image pixels results from the change of the image reconstruction method. Moreover, while we designed the perturbations, accidental changes of the reconstruction method may also arise in practice due to, for example, modification of the scanner settings. The network used in our study was trained with the US images from the second dataset reconstructed using the same initial parameters. Taking this into account, our study suggests that to develop more robust US based deep learning models it might be necessary to augment the training data with differently reconstructed US images. Our study also suggests that RF data might serve as a better data type for training deep models, because training based on RF data does not require US image reconstruction and therefore should be more robust.

In the future, we would like to apply the ZOO technique to study the robustness of different machine learning models. The proposed approach to robustness assessment is general; it can be applied to examine machine learning models based on handcrafted texture features and standard classifiers (e.g. support vector machines, random forests). We would also like to expand our approach by taking into account other parameters related to US image reconstruction, for example those related to image scaling and filtration.
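The training-set augmentation suggested above, exposing the network to differently reconstructed versions of the same RF data, could be sketched as follows. The sampling ranges reuse the attack bounds from Sec. II-D as an illustrative (not validated) augmentation policy, and `reconstruct` is a placeholder for a reconstruction routine such as the one described in Sec. II-B.

```python
import numpy as np

def sample_reconstruction_params(rng):
    """Draw randomized reconstruction parameters for training-time
    augmentation; the ranges reuse the attack bounds and are an
    illustrative choice, not a validated augmentation policy."""
    return {
        "beta": rng.uniform(0.5, 1.3),       # dB/(cm*MHz)
        "alpha_l": rng.uniform(5.0, 15.0),   # dB, lower threshold
        "alpha_u": rng.uniform(50.0, 60.0),  # dB, upper threshold
    }

def augment_frames(rf_frames, reconstruct, rng):
    """Reconstruct each RF frame with freshly sampled parameters, so the
    network sees differently reconstructed versions of the same data."""
    return [reconstruct(rf, **sample_reconstruction_params(rng))
            for rf in rf_frames]
```

Applied once per training epoch, such a scheme would make the reconstruction parameters part of the data augmentation pipeline rather than a fixed preprocessing choice.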
In this work, we applied a relatively simple optimization method to find the adversarial perturbations; in the future it would be interesting to also investigate other methods. Moreover, we did not examine potential defences against the attack. For example, augmenting the training set with differently reconstructed images would probably result in better performance and a deep model more robust to adversarial attacks.

CONFLICT OF INTEREST

The authors do not have any conflicts of interest.
REFERENCES

[1] I. J. Goodfellow, J. Shlens, and C. Szegedy, "Explaining and harnessing adversarial examples," arXiv preprint arXiv:1412.6572, 2014.
[2] N. Akhtar and A. Mian, "Threat of adversarial attacks on deep learning in computer vision: A survey," IEEE Access, vol. 6, pp. 14410–14430, 2018.
[3] S. Qiu, Q. Liu, S. Zhou, and C. Wu, "Review of artificial intelligence adversarial attack and defense technologies," Applied Sciences, vol. 9, no. 5, p. 909, 2019.
[4] M. Byra, L. Wan, J. H. Wong, J. Du, S. B. Shah, M. P. Andre, and E. Y. Chang, "Quantitative ultrasound and B-mode image texture features correlate with collagen and myelin content in human ulnar nerve fascicles," Ultrasound in Medicine & Biology, vol. 45, no. 7, pp. 1830–1840, 2019.
[5] M. Byra, T. Sznajder, D. Korzinek, H. Piotrzkowska-Wroblewska, K. Dobruch-Sobczak, A. Nowicki, and K. Marasek, "Impact of ultrasound image reconstruction method on breast lesion classification with deep learning," in Iberian Conference on Pattern Recognition and Image Analysis. Springer, 2019, pp. 41–52.
[6] W. Gómez-Flores and J. Hernández-López, "Assessment of the invariance and discriminant power of morphological features under geometric transformations for breast tumor classification," Computer Methods and Programs in Biomedicine, vol. 185, p. 105173, 2020.
[7] S. G. Finlayson, H. W. Chung, I. S. Kohane, and A. L. Beam, "Adversarial attacks against medical deep learning systems," arXiv preprint arXiv:1804.05296, 2018.
[8] S. Beeman and J. Garbow, "Imaging and metabolism," 2018.
[9] V. W.-S. Wong, L. A. Adams, V. de Lédinghen, G. L.-H. Wong, and S. Sookoian, "Noninvasive biomarkers in NAFLD and NASH: current progress and future promise," Nature Reviews Gastroenterology & Hepatology, vol. 15, no. 8, pp. 461–478, 2018.
[10] M. Byra, G. Styczynski, C. Szmigielski, P. Kalinowski, Ł. Michałowski, R. Paluszkiewicz, B. Ziarkiewicz-Wróblewska, K. Zieniewicz, P. Sobieraj, and A. Nowicki, "Transfer learning with deep convolutional neural network for liver steatosis assessment in ultrasound images," International Journal of Computer Assisted Radiology and Surgery, vol. 13, no. 12, pp. 1895–1903, 2018.
[11] A. Han, M. Byra, E. Heba, M. P. Andre, J. W. Erdman Jr, R. Loomba, C. B. Sirlin, and W. D. O'Brien Jr, "Noninvasive diagnosis of nonalcoholic fatty liver disease and quantification of liver fat with radiofrequency ultrasound data using one-dimensional convolutional neural networks," Radiology, vol. 295, no. 2, pp. 342–350, 2020.
[12] W. Cao, X. An, L. Cong, C. Lyu, Q. Zhou, and R. Guo, "Application of deep learning in quantitative analysis of 2-dimensional ultrasound imaging of nonalcoholic fatty liver disease," Journal of Ultrasound in Medicine, vol. 39, no. 1, pp. 51–59, 2020.
[13] A. Han, Y. N. Zhang, A. S. Boehringer, M. P. Andre, J. W. Erdman, R. Loomba, C. B. Sirlin, and W. D. O'Brien, "Inter-platform reproducibility of ultrasonic attenuation and backscatter coefficients in assessing NAFLD," European Radiology, vol. 29, no. 9, pp. 4699–4708, 2019.
[14] A. Han, Y. N. Zhang, A. S. Boehringer, V. Montes, M. P. Andre, J. W. Erdman Jr, R. Loomba, M. A. Valasek, C. B. Sirlin, and W. D. O'Brien Jr, "Assessment of hepatic steatosis in nonalcoholic fatty liver disease by using quantitative US," Radiology, vol. 295, no. 1, pp. 106–113, 2020.
[15] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei, "ImageNet Large Scale Visual Recognition Challenge," International Journal of Computer Vision (IJCV), vol. 115, no. 3, pp. 211–252, 2015.
[16] C. Szegedy, S. Ioffe, V. Vanhoucke, and A. A. Alemi, "Inception-v4, Inception-ResNet and the impact of residual connections on learning," in Thirty-First AAAI Conference on Artificial Intelligence, 2017.
[17] P.-Y. Chen, H. Zhang, Y. Sharma, J. Yi, and C.-J. Hsieh, "ZOO: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models," in Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, 2017, pp. 15–26.
[18] S. Liu, P.-Y. Chen, X. Chen, and M. Hong, "signSGD via zeroth-order oracle," in International Conference on Learning Representations, 2018.
[19] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard et al., "TensorFlow: A system for large-scale machine learning," in 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), 2016, pp. 265–283.