Transfer Learning in Magnetic Resonance Brain Imaging: a Systematic Review
Juan Miguel Valverde, Vandad Imani, Ali Abdollahzadeh, Riccardo De Feo, Mithilesh Prakash, Robert Ciszek, Jussi Tohka
Review
A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio, Finland; {juanmiguel.valverde, vandad.imani, ali.abdollahzadeh, riccardo.defeo, mithilesh.prakash, robert.ciszek, jussi.tohka}@uef.fi
* Correspondence: jussi.tohka@uef.fi
† The order of these authors was randomized.
Version February 3, 2021, submitted to J. Imaging
Abstract: Background:
Transfer learning refers to machine learning techniques that focus on acquiring knowledge from related tasks to improve generalization in the tasks of interest. In magnetic resonance imaging (MRI), transfer learning is important for developing strategies that address the variation in MR images from different imaging protocols or scanners. Additionally, transfer learning is beneficial to re-utilize machine learning models that were trained to solve different (but related) tasks to the task of interest. The aim of this review is to identify research directions, gaps of knowledge, applications, and widely used strategies among the transfer learning approaches applied in MR brain imaging.
Methods:
We performed a systematic literature search for articles that applied transfer learning to MR brain imaging tasks. We screened 433 studies for their relevance, and we categorized and extracted relevant information, including task type, application, availability of labels, and machine learning methods. Furthermore, we closely examined brain MRI-specific transfer learning approaches and other methods that tackled issues relevant to medical imaging, including privacy, unseen target domains, and unlabeled data.
Results:
We found 129 articles that applied transfer learning to MR brain imaging tasks. The most frequent applications were dementia-related classification tasks and brain tumor segmentation. The majority of articles utilized transfer learning techniques based on convolutional neural networks (CNNs). Only a few approaches utilized clearly brain MRI-specific methodology, considered privacy issues, unseen target domains, or unlabeled data. We proposed a new categorization to group specific, widely-used approaches such as pre-training and fine-tuning CNNs.
Discussion:
There is an increasing interest in transfer learning within brain MRI. Well-known public datasets have clearly contributed to the popularity of Alzheimer's diagnostics/prognostics and tumor segmentation as applications. Likewise, the availability of pretrained CNNs has promoted their utilization. Finally, the majority of the surveyed studies did not examine in detail the interpretation of their strategies after applying transfer learning, and did not compare their approach with other transfer learning approaches.
Keywords:
Transfer learning; Magnetic resonance imaging; Brain; Systematic review; Survey; Machine learning; Artificial intelligence; Convolutional neural networks
1. Introduction
Magnetic resonance imaging (MRI) is a non-invasive imaging technology that produces three-dimensional images of living tissue. MRI measures radio-frequency signals emitted from hydrogen atoms
after the application of electromagnetic (radio-frequency) waves, localizing the signal using spatially varying magnetic gradients, and is capable of measuring various properties of the tissue depending on the particular pulse sequence applied for the measurement [1]. The use of MRI is increasing rapidly, not only for clinical purposes but also for brain research and the development of drugs and treatments. This has called for machine learning (ML) algorithms for automating the steps necessary for the analysis of these images. Common tasks for machine learning include tumor segmentation [2], registration [3], and diagnostics/prognostics [4].

However, the variability in, for instance, image resolution, contrast, signal-to-noise ratio, or acquisition hardware leads to distributional differences that limit the applicability of ML algorithms in research and clinical settings alike [5,6]. In other words, an ML model trained for a task in one dataset may not necessarily be applied to the same task in another dataset because of the distributional differences. This difficulty emerges in large datasets that combine MR images from multiple studies and acquisition centers, since different imaging protocols and scanner hardware are used, and it also hampers the clinical applicability of ML techniques as the algorithms would need to be re-trained in a new environment. Additionally, partly because of the versatility of MRI, an ML model trained for a task could be useful in another, related task. For instance, an ML model trained for brain extraction (skull-stripping) might be useful when training an ML model for tumor segmentation. Therefore, developing strategies to address the variation in data distributions within large heterogeneous datasets is important.
This review focuses on transfer learning, a strategy in ML to produce a model for a target task by leveraging the knowledge acquired from a different but related source domain [7].

Transfer learning (or knowledge transfer) reutilizes knowledge from source problems to solve target tasks. This strategy, inspired by psychology [8], aims to exploit common features between related tasks and domains. For instance, an MRI expert can specialize in computed tomography (CT) imaging faster than someone with no knowledge of either MRI or CT. There exist several surveys [7,9,10] of transfer learning in the machine learning literature, including in medical imaging [11], but to our knowledge, no surveys have focused on its use in brain imaging or MRI applications. Pan and Yang [9] presented one of the earliest surveys in transfer learning, which focused on tasks with the same feature space (i.e., homogeneous transfer learning). Pan and Yang [9] proposed two transfer learning categorizations that are in wide use. One categorization divided approaches based on the availability of labels, and the other based on which knowledge is transferred (e.g., features, parameters). Day and Khoshgoftaar [7] surveyed heterogeneous transfer learning applications, i.e., tasks with different feature spaces, and, also based on the availability of labels, divided transfer learning approaches into 38 categories. More recently, another survey [10] expanded the number of categories to over 40. These categorizations can be useful in a general context as they were derived from approaches from diverse areas. However, the large number of proposed categories can lead to their underutilization, and to the classification of similar strategies differently. Therefore, such categorizations can be inappropriate in specific fields, such as MR brain imaging.

Systematic reviews methodologically analyze articles from specific areas of science.
Recent systematic reviews in biomedical applications of machine learning have covered such topics as predicting stroke [12], detection and classification of transposable elements [13], and infant pain prediction [14]. Here, we present the first systematic review of transfer learning in MR brain imaging applications. The aim of this review is to identify research trends, and to find popular strategies and methods specifically designed for brain applications. We classify transfer learning strategies following the criteria described in [9], and we introduce new subcategories that describe how transfer learning was applied, revealing widely used strategies within the MR brain imaging community. We highlight brain-specific methods and strategies addressing data privacy, unseen target domains, and unlabeled data, topics especially relevant in medical imaging. Finally, we discuss the research directions and knowledge gaps we found in the literature, and suggest certain practices that can enhance methods' interpretability.
2. Transfer Learning
According to [9], we define a domain in transfer learning as D = {X, P(X)}, where X is the feature space and P(X), with X = {x_1, ..., x_n} ⊂ X, is a marginal probability distribution. For example, X could include all possible images derived from a particular MRI protocol, acquisition parameters, and scanner hardware, and P(X) could depend on, for instance, subject groups, such as adolescents or elderly people. Tasks comprise a label space Y and a decision function f, i.e., T = {Y, f}. The decision function is to be learned from the training data (X, Y). Tasks in MR brain imaging can be, for instance, survival rate prediction of cancer patients, where f is the function that predicts the survival rate, and Y is the set of all possible outcomes. Given a source domain D_S and task T_S, and a target domain D_T and task T_T, transfer learning reutilizes the knowledge acquired in D_S and T_S to improve the generalization of f_T in D_T [9]. Importantly, D_S must be related to D_T, and T_S must be related to T_T [15]; otherwise, transfer learning can worsen the accuracy on the target domain. This phenomenon, called negative transfer, has been recently formalized in [16] and studied in the context of MR brain imaging [17].

Transfer learning approaches can be categorized based on the availability of labels in the source and/or target domains during the optimization [9]: unsupervised transfer learning (unlabeled data), transductive (labels available only in the source domain), and inductive approaches (labels available in the target domains and, optionally, in the source domains). Table 1 illustrates these three types with examples in MR brain imaging applications.
Table 1. Types of transfer learning and examples in MR brain imaging. ∼ indicates "different but related". The subscripts S and T indicate source and target, respectively.

Type          Properties                Approach example
Unsupervised  D_S ∼ D_T, T_S = T_T      Transforming T1- and T2-weighted images into the same feature space with adversarial training.
Transductive  D_S ∼ D_T, T_S = T_T      Learning a feature mapping from T1- to T2-weighted images while optimizing to segment tumors in T2-weighted images.
Inductive     D_S ∼ D_T, T_S ∼ T_T      Optimizing a classifier on a natural images dataset, and fine-tuning certain parameters for tumor segmentation.
              D_S ∼ D_T, T_S = T_T      Optimizing a lesion segmentation algorithm in T2-weighted images, and re-optimizing certain parameters on FLAIR images.
              D_S = D_T, T_S ∼ T_T      Optimizing a lesion segmentation algorithm in T2-weighted images, and re-optimizing certain parameters in the same images for anatomical segmentation.

Transfer learning approaches can also be grouped into four categories based on the knowledge transferred [9].
Instance-based approaches estimate and assign weights to images to balance their importance during optimization.
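To make the idea concrete, the following minimal, hypothetical sketch (not taken from any surveyed article; data and parameters are invented) derives instance weights for source-domain images from a logistic-regression domain classifier: source images that resemble the target domain receive larger importance weights.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def domain_classifier(x, d, lr=0.5, steps=400):
    """Logistic regression predicting p(target | x); d: 0 = source, 1 = target."""
    w = np.zeros(x.shape[1])
    for _ in range(steps):
        p = sigmoid(x @ w)
        w -= lr * x.T @ (p - d) / len(d)
    return w

x_source = rng.normal(0.0, 1.0, (300, 2))  # source-domain image features
x_target = rng.normal(1.0, 1.0, (300, 2))  # shifted target-domain features

w = domain_classifier(np.vstack([x_source, x_target]),
                      np.concatenate([np.zeros(300), np.ones(300)]))

# Importance weight of each source image: p(target | x) / p(source | x)
p_target = sigmoid(x_source @ w)
weights = p_target / (1.0 - p_target)

# Source images resembling the target domain receive larger weights
near = weights[x_source.sum(1) > 1.0].mean()
far = weights[x_source.sum(1) < -1.0].mean()
print(near > far)  # True
```

Such weights would then rescale each image's contribution to the task-specific loss.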
Feature-based approaches seek a shared feature space across tasks and/or domains. These approaches can be further divided into asymmetric (transforming target domain features into the source domain feature space) and symmetric (finding a common intermediate feature representation).
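A common symmetric criterion, used by several of the surveyed works, is the maximum mean discrepancy (MMD) between source and target feature representations. The self-contained sketch below (with made-up data, not from any surveyed study) estimates the squared MMD with an RBF kernel; minimizing such a quantity with respect to a feature extractor drives the two representations together.

```python
import numpy as np

def rbf_kernel(a, b, gamma=0.5):
    """Pairwise RBF kernel matrix between the rows of a and b."""
    sq = np.sum(a ** 2, 1)[:, None] + np.sum(b ** 2, 1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * sq)

def mmd2(source, target, gamma=0.5):
    """(Biased) estimate of the squared maximum mean discrepancy."""
    return (rbf_kernel(source, source, gamma).mean()
            + rbf_kernel(target, target, gamma).mean()
            - 2.0 * rbf_kernel(source, target, gamma).mean())

rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, (200, 2))          # source-domain features
tgt_shifted = rng.normal(1.0, 1.0, (200, 2))  # target features before alignment
tgt_aligned = rng.normal(0.0, 1.0, (200, 2))  # target features after alignment

print(mmd2(src, tgt_shifted) > mmd2(src, tgt_aligned))  # True: the shift is detected
```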
Parameter-based approaches find shared priors or parameters between source and target tasks/domains. Parameter-based approaches assume that such parameters or priors share functionality and are compatible across domains, such as a domain-invariant image border detector. Finally, relational-based approaches aim to exploit common knowledge across relational domains.
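The parameter-based recipe of pretraining on a source task and then fine-tuning only some parameters on the target data can be sketched as follows. This is a toy numpy example with invented data and a two-layer network standing in for a pretrained CNN; only the output layer is re-optimized while the shared feature extractor stays frozen.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(x, w1, w2):
    h = np.tanh(x @ w1)  # shared feature extractor
    return h @ w2, h     # linear head on top of the features

def train(x, y, w1, w2, lr=0.05, steps=800, freeze_w1=False):
    """Gradient descent on mean squared error; optionally freeze the extractor."""
    for _ in range(steps):
        pred, h = forward(x, w1, w2)
        err = pred - y
        g2 = h.T @ err / len(x)
        if not freeze_w1:
            g1 = x.T @ ((err @ w2.T) * (1.0 - h ** 2)) / len(x)
            w1 = w1 - lr * g1
        w2 = w2 - lr * g2
    return w1, w2

# Hypothetical source/target tasks: same inputs, differently scaled outputs
x_s = rng.normal(size=(256, 4)); y_s = np.sin(x_s @ np.ones((4, 1)))
x_t = rng.normal(size=(64, 4));  y_t = 2.0 * np.sin(x_t @ np.ones((4, 1)))

w1 = rng.normal(scale=0.5, size=(4, 16)); w2 = rng.normal(scale=0.5, size=(16, 1))
w1, w2 = train(x_s, y_s, w1, w2)                           # "pretraining" on the source
_, w2_ft = train(x_t, y_t, w1, w2.copy(), freeze_w1=True)  # fine-tune only the head

mse_before = np.mean((forward(x_t, w1, w2)[0] - y_t) ** 2)
mse_after = np.mean((forward(x_t, w1, w2_ft)[0] - y_t) ** 2)
print(mse_after < mse_before)  # True: the fine-tuned head adapts to the target task
```

Fine-tuning all parameters instead (freeze_w1=False, starting from the pretrained weights) corresponds to the prior-sharing variant discussed in Section 4.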
There exist various strategies to improve ML generalization in addition to transfer learning. In multi-task learning, ML algorithms are optimized for multiple tasks simultaneously. For instance, Weninger et al. [18] proposed a multi-task autoencoder-like convolutional neural network with three decoders, one per task, to segment and reconstruct MR images of brains with tumors. Zhou et al. [19] presented an autoencoder with three branches for coarse, refined, and detailed tumor segmentation. Multi-task learning differs from transfer learning in that transfer learning focuses on target tasks/domains whereas multi-task learning tackles multiple tasks/domains simultaneously, thus considering source and target tasks/domains equally important [9]. Data augmentation, which adds slightly transformed copies of the training data to the training set, can also enhance algorithms' extrapolability [20,21]. Data augmentation is useful when images with certain properties (i.e., a specific contrast) are scarce, and it can complement other techniques, such as multi-task or transfer learning. However, in contrast to multi-task or transfer learning, data augmentation ignores whether the knowledge acquired from similar tasks can be reutilized. Additionally, increasing the training data also increases computational costs, and finding advantageous data transformations is non-trivial. Finally, meta-learning aims to find parameters or hyper-parameters of machine learning models that generalize well across different tasks. In contrast to transfer learning, meta-learning does not focus on a specific target task or domain, but on all possible domains, including unseen domains. Liu et al. [22] showed that the model-agnostic meta-learning strategy [23] yields state-of-the-art segmentations in MR prostate images.
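As a minimal illustration of data augmentation, simple flips and intensity perturbations, typical low-cost transforms for MRI slices, can be written as follows (the transforms and parameters are illustrative, not from the cited works):

```python
import numpy as np

rng = np.random.default_rng(42)

def augment(slice2d, rng):
    """Return a randomly transformed copy of a 2D image slice."""
    out = slice2d.copy()
    if rng.random() < 0.5:
        out = np.fliplr(out)                       # left-right flip
    out = out * rng.uniform(0.9, 1.1)              # small global intensity scaling
    out = out + rng.normal(0.0, 0.01, out.shape)   # additive Gaussian noise
    return out

slice2d = rng.random((128, 128))                   # stand-in for an MRI slice
augmented = [augment(slice2d, rng) for _ in range(4)]
print(len(augmented), augmented[0].shape)          # 4 (128, 128)
```

Real MRI pipelines often add domain-specific transforms such as elastic deformations or simulated bias fields.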
3. Methods
We followed the PRISMA statement [24] as a guideline, and we included the PRISMA checklist in the Supplementary Materials.
We searched for articles about transfer learning applied to MR images in the Scopus database. We chose Scopus since it retrieves only peer-reviewed articles (as opposed to Google Scholar), and it has a broad coverage of engineering and computer science conferences (as opposed to Web of Science). In addition, we searched for relevant articles from the most recent Medical Image Computing and Computer Assisted Interventions (MICCAI) conference on the Springer website because they were unavailable in Scopus at the time when the search was performed. To ensure that we retrieved relevant results, we focused on finding keywords in either the title, abstract, or article keywords. We also searched for "knowledge transfer" and "domain adaptation" as these are alternative names or subclasses of transfer learning. Similarly, we searched for different terms related to MRI, including magnetic resonance and diffusion imaging. Table 2 shows the exact search terms used. Note that "brain" was not one of the keywords, as we observed that including it would have led to the omission of several relevant articles.

https://link.springer.com/search?facet-conf-event-id=miccai2020&facet-content-type=Chapter
Table 2.
Search terms in Scopus and Springer website.
Scopus:
TITLE-ABS-KEY ( ( "transfer learning" OR "knowledge transfer" OR "domain adaptation" ) AND ( mri OR "magnetic resonance" OR "diffusion imaging" OR "diffusion weighted imaging" OR "arterial spin labeling" OR "susceptibility mapping" OR bold OR "blood oxygenation level dependent" OR "blood oxygen level dependent" ) )

Springer website:
( "transfer learning" OR "knowledge transfer" OR "domain adaptation" ) AND ( mri OR "magnetic resonance" OR "diffusion imaging" OR "diffusion weighted imaging" OR "arterial spin labeling" OR "susceptibility mapping" OR "T1" OR "T2" )
We excluded non-article records retrieved (e.g., entire conference proceedings). We reviewed the abstracts of all candidate articles to discard studies that were not about MR brain imaging or did not apply transfer learning. For this, each abstract was reviewed by two co-authors. In more detail, we used the following procedure to assign the abstracts to co-authors. JMV assigned the abstracts randomly to six labels so that each label corresponded to one co-author (all co-authors except JMV). JT associated the labels with co-authors so that JMV was unaware of the true identity of each reviewer, to guarantee anonymity and to reduce reviewer bias during the screening. Furthermore, to ensure that the same criteria were applied during this review by each co-author, review guidelines written by JT and JMV and commented on by the other co-authors were distributed among the co-authors. Reviewers annotated which studies were relevant and, to support their decision, reviewers also included extra information from the screened abstracts, including the studied organs (e.g., brain, heart), MRI type (e.g., structural, functional), and comments. Disagreements between two reviewers were finally resolved by JMV.
4. Results
Figure 1 summarizes the screening process and the number of articles after each step. First, we retrieved 399 records from Scopus (19th October 2020) and 34 from Springer (30th October 2020). We excluded 42 records from Scopus: 41 entire proceedings of conferences and one corrigendum, leaving 391 journal and conference articles. After the abstract review, we excluded 220 articles that were either not about transfer learning or not about MR brain imaging. We reviewed the remaining 171 articles based on the full paper, and 44 articles were discarded: 26 studies mixed data from the same subject in their training and testing sets, 8 were unrelated to transfer learning in MR brain imaging, 7 were unclear, 2 were inaccessible, and 1 was a poster. Finally, we added two other relevant articles, resulting in 129 articles that were included in this survey.
Figure 1. Flowchart of the screening process.
Figure 2. Number of articles according to the ICD-11 category. 19 articles considered only healthy controls.

The 26 studies excluded for mixing data typically performed ML tasks on slices of MRI instead of MRI volumes, and both training and testing sets appeared to contain data from the same volumetric MRI.
Table 3.
Tasks and applications in the surveyed papers.
Task (total) % of studies Application
Classification (68)   52.71%   Alzheimer's diagnostics/prognostics (31), Tumor (10), fMRI decoding (6), Autism spectrum disorder (5), Injected cells (2), Parkinson (2), Schizophrenia (2), Sex (2), Aneurysm [25], Attention deficit hyperactivity disorder [26], Bipolar disorder [27], Embryonic neurodevelopmental disorders [28], Epilepsy [29], IDH mutation [30], Multiple sclerosis [31], Quality control [32]
Segmentation (45)     34.88%   Tumor (16), Anatomical (15), Lesion (14)
Regression (12)       9.30%    Age (8), Alzheimer's disease progression [33], Autism symptom severity [34], Brain connectivity in Alzheimer's disease [35], Tumor cell density [36]
Others (15)           11.63%   Reconstruction (5), Registration (4), Image translation (3), CBIR (2), Image fusion [37]

We found 29 applications that we considered distinct. Table 3 summarizes the distinct tasks and the applications within these tasks. Note that some articles studied more than one task/application. Figure 2 shows the number of articles that addressed different brain diseases according to the 11th International Classification of Diseases (ICD-11).

Classification tasks were the most widely studied and, among these, dementia-related (neurocognitive impairment, Fig. 2) and tumor-related (neoplasms, Fig. 2) applications accounted for 45.59% and 14.71% of all classification tasks, respectively. Other applications included autism spectrum disorder diagnostics, and functional MRI (fMRI) decoding, that is, classification of, for instance, stimulus or cognitive state based on observed fMRI [38]. Segmentation was the second most popular task, studied in one-third of the articles. Segmentation applications included anatomical, lesion, and tumor segmentation, with each application studied in approximately one-third of the segmentation articles.
Regression problems were considerably less common than classification and segmentation, and, within these, age prediction was predominant. Among the other applications, image reconstruction and registration were the most widely studied.

The majority of the surveyed articles (98) utilized anatomical MR images, including T1-weighted, T2-weighted, FLAIR, and other contrasts. The number of studies that utilized fMRI and diffusion MRI (dMRI) data was 18 and 6, respectively. Finally, 8 studies were multimodal, combining MRI with positron emission tomography (PET) or CT.

Figure 3 (left) illustrates the number of articles that utilized specific machine learning and statistical methods. Convolutional neural networks (CNNs) were applied in the majority of articles (68.22%), followed by kernel methods (including support vector machines (SVMs) and support vector regression), multilayer perceptrons, decision trees (including random forests), Bayesian methods, clustering methods, elastic net, long short-term memory, and deep neural networks without convolution layers. Several other methods (Fig. 3, left, "Others") appeared only in a single article: deformable models, manifold alignment, graph neural networks, Fisher's linear discriminant, principal component analysis, independent component analysis, joint distribution alignment, singular value decomposition, Pearson correlation, and Adaboost. Figure 3 (right) shows the number of articles that utilized CNNs, kernel methods, and other approaches across years. Since 2014, the number of articles applying transfer learning to MR brain imaging applications and the use of CNNs have grown exponentially. In the last two years, 80% of the articles (68) utilized CNNs, and, during this period, their use and the total number of articles have started to converge.

https://icd.who.int/en
Figure 3. Left: Number of articles according to the ML method studied (methods enumerated in Section 4.2). Right: Distribution of articles according to the publication year.
Table 4. Transfer learning strategies categorization. Bold font highlights the proposed categories.

Type (% of studies)      Subtype             Subsubtype        Approaches
Instance (16, 11.63%)    Fixed
                         Optimized           Unsupervised
                                             Supervised
Feature (38, 29.46%)     Symmetric           Direct            17 (13.18%)
                                             Indirect          14 (10.85%)
                         Asymmetric                            7 (5.43%)
Parameter (87, 65.89%)   Prior sharing                         50 (38.76%)
                         Parameter sharing   One model         21 (16.28%)
                                             Multiple models   16 (12.40%)
Table 4 summarizes the transfer learning strategies found in the surveyed papers. Note that certain articles applied multiple strategies. We classified these strategies into instance, feature representation ("feature" for short), parameter, and relational knowledge [9]. Afterwards, we divided feature-based approaches into symmetric and asymmetric. Since the categorization described in [9] is general, we propose new subcategories, described below. These new subcategories, based on the strategies found during our survey, aim to reduce ambiguity and to facilitate the identification of popular transfer learning methods in the context of MR brain imaging.

We divided instance-based approaches into fixed and optimized sub-categories.
Fixed weights are those weights assigned following certain preset criteria, such as assigning higher weights to target domain images. On the other hand, optimized weights are weights estimated by solving an optimization problem. Furthermore, we separated optimized-weights approaches into supervised and unsupervised based on whether the optimization problem required labels, a requirement not always feasible in medical imaging.

We divided symmetric feature-based approaches that find a common feature space between source and target domains into direct and indirect subcategories.
Direct approaches are methods that operate directly on the feature representations of source and target domain data, thus requiring these data simultaneously. We considered such a requirement an important distinction since it can limit the applicability of the approach. More precisely, approaches that require source and target domain data simultaneously may need more memory and, importantly, may not be applicable if source and target domain data cannot be shared due to privacy issues. As an example of this category, consider an approach that minimizes the distance between the feature representations of source and target domain images, aiming to transform the data into the same feature space. In contrast, indirect approaches do not operate directly on the feature representations and do not need simultaneous access to source and target domain data.

Parameter-based approaches consisted of two steps: first, finding the shared priors or parameters, and second, fine-tuning all or certain parameters on the target domain data. Since the approaches that fine-tuned all parameters assumed that their previous parameters were closer to the solution than a random initialization, we considered these approaches as prior sharing. Although sharing parameters could also be considered as sharing priors, separating these two approaches based on whether certain or all model parameters were fine-tuned revealed the popularity of different strategies. Furthermore, we propose to divide parameter-sharing approaches into two categories: approaches that utilize only one model, and approaches in which the shared parameters correspond to a feature-extracting model for optimizing separate models, thus comprising multiple models.

4.3.1. Instance-based approaches

We found 16 approaches (11.63% of the studies) that applied transfer learning by weighing images. We propose to divide these approaches based on whether images' weights were fixed or optimized, and, in the latter case, to distinguish between supervised and unsupervised optimization.

Fixed-weights strategies included sample selection (i.e., binary weights), such as in [39], where the authors discarded certain images in datasets biased with more subjects with a given pathology. Likewise, [40,41] trained a classifier to perform sample selection based on the probability that the images belong to the source domain. Similar probabilities and non-binary fixed weights were applied in [42] and [43,44], respectively, allowing the contribution of all images in the studied task.

Images' weights that were optimized via unsupervised strategies relied on data probability density functions (PDFs). The surveyed articles that applied these strategies optimized images' weights separately before tackling the studied task (e.g., Alzheimer's diagnostics/prognostics). Images' weights were derived by minimizing the distance (Kullback-Leibler divergence, Bhattacharyya, squared Euclidean, and maximum mean discrepancy) between the PDFs in the source and target domains [45–48]. On the other hand, supervised strategies optimized images' weights and the ML model simultaneously by minimizing the corresponding task-specific loss [43,49–52]. Notably, supervised and unsupervised strategies can be combined, and approaches can also incorporate extra information unused in the main task. For instance, Wachinger et al. [53] also considered age and sex during the unsupervised optimization of PDFs for Alzheimer's diagnostics/prognostics.

4.3.2. Feature-based approaches

We found 38 approaches (29.46% of the studies) that applied transfer learning by finding a common feature space between source and target domains/tasks. These approaches were divided into symmetric and asymmetric (see Section 2).
Additionally, we propose to subdivide symmetric approaches based on whether the common feature space was achieved by directly operating on the source and target feature representations (e.g., by minimizing their distance), or indirectly.

We found 7 asymmetric approaches that transformed target domain features into source domain features via generative adversarial networks [54,55], Bregman divergence minimization [56], probabilistic models [57], median [58], and nearest neighbors [59]. Contrarily, Qin et al. [60] transformed source domain features into target domain features via dictionary-based interpolation to optimize a model on the target domain.

Among the 31 surveyed symmetric approaches, direct approaches operated on the feature representations across domains by minimizing their differences (via mutual information [61], maximum mean discrepancy [44,47,62], Euclidean distance [63–69], Wasserstein distance [70], and average likelihood [71]), maximizing their correlation [72,73] or covariance [34], and introducing sparsity with L1/L2 norms [40,74]. On the other hand, indirect approaches were applied via adversarial training [26,39,52,75–83] and knowledge distillation [84].

4.3.3. Parameter-based approaches

We found 87 approaches (65.89% of the studies) that applied transfer learning by sharing priors or parameters. Parameter-sharing approaches were further subdivided based on whether one model or multiple models were optimized.

The most common approach to apply a prior-sharing strategy, and, in general, transfer learning, was fine-tuning all the parameters of a pretrained CNN [27,29–31,33,37,69,85–117] (80% of all prior-sharing methods). Other approaches utilized Bayesian graphical models [35,36,118,119], graph neural networks [120], kernel methods [62,121], multilayer perceptrons [122], and Pearson-correlation methods [123]. Additionally, Sato et al. [25] proposed a general framework to inhibit negative transfer. Within the prior-sharing group, 20 approaches utilized a parameter initialization derived from pretraining on natural images (i.e., ImageNet [124]) whereas 26 approaches pretrained on medical images.

The second most popular strategy to apply transfer learning was fine-tuning certain parameters of a pretrained CNN [32,125–144].
The remaining approaches first optimized a feature extractor (typically a CNN or an SVM), and then trained a separate model (SVMs [28,43,145–147], long short-term memory networks [148,149], clustering methods [146,150], random forests [68,151], multilayer perceptrons [152], logistic regression [146], elastic net [153], CNNs [154]). Additionally, Yang et al. [155] ensembled CNNs and fine-tuned their individual contribution. Within the parameter-sharing group, 17 approaches utilized an ImageNet-pretrained CNN, and 15 others pretrained on medical images.

We found 40 studies that utilized publicly-available CNN architectures. The most popular were VGG [156] (23), ResNet [157] (15), and Inception [158–160] (11). Nearly three-fourths of these studies fine-tuned the ImageNet-pretrained version of the CNNs whereas one-third pretrained the networks on medical datasets. Finally, among these 40 studies, 13 articles compared the performance of multiple architectures, and VGG (4) and Inception (4) usually provided the highest accuracy.

We closely examined transfer learning approaches that were inherently unique to brain imaging. Additionally, we included strategies that considered data privacy, unseen target domains, and unlabeled data, as these topics are especially relevant to the medical domain.

Brain MRI specific

Aside from employing brain MRI-specific input features (e.g., brain connectivity matrices in fMRI, diffusion values in dMRI) or pre-processing (e.g., skull-stripping), we found only two transfer learning strategies unique to brain imaging. Moradi et al. [34] considered cortical thickness measures of each neuroanatomical structure separately, yielding multiple domain-invariant feature spaces, one per brain region. This approach diverges from the other surveyed feature-based strategies that sought a single domain-invariant feature space for all the available input features. Cheng et al. [40] extracted the anatomical volumes of 93 regions of interest in the gray matter tissue of MRI and PET images. Afterwards, the authors applied a feature-based transfer learning approach with sparse logistic regression separately to MRI and PET images, yielding informative brain regions for each modality. Finally, the authors combined these features with cerebrospinal fluid (CSF) biomarkers, and applied an instance-based approach to find informative subjects.

Privacy

We found two frameworks that considered privacy. Li et al. [52] built a federated-learning framework that optimized a global model by transferring the parameters of site-specific local models. This framework requires no database sharing, and the transferred parameters are slightly perturbed to ensure differential privacy [161]. Sato et al. [25] proposed a framework that applies online transfer learning and only requires the output, not the images, of the models optimized on the source domain data. Notably, this framework also tackles negative transfer. Besides these two frameworks, parameter-based approaches that were fine-tuned exclusively on target domain data also protected subjects' privacy since they required no access to source domain images during the fine-tuning.

Unseen target domains

We found four studies that considered unseen target domains. van Tulder and de Bruijne [68] and Shen and Gao [76] utilized multimodal images and proposed a symmetric feature-based approach that drops a random image modality during the optimization, avoiding the specialization to any specific domain. Hu et al. [36] and Varsavsky et al. [82] considered that each image belongs to a different target domain, and presented a strategy that adapts to each image. Note that these approaches also protected data privacy as no access to source domain images was required to adapt to the target domain.

Unlabeled data

We revisited transfer learning strategies that handled unlabeled data in the target domain and, optionally, in the source domain. Van Opbroek et al.
[45] presented an instance-based approach that relied exclusively on the difference between data distributions, thus requiring no labels. Goetz et al. [42] expanded this idea and also incorporated source domain labels to optimize a model for estimating images' weights. On the other hand, feature-based approaches can include source domain labels while finding appropriate feature transformations. Following this idea, Li et al. [56] minimized the Bregman distance between the source and target domain features while optimizing a task-specific classifier. Similarly, Ackaouy et al. [70] and Orbes-Arteaga et al. [80] sought a shared feature space while minimizing a Dice loss and a consistency loss, respectively, with source domain labels. Additionally, various indirect symmetric feature-based approaches jointly optimized an adversarial loss and a task-specific loss on the source domain images [77,78,81]. Finally, Orbes-Arteainst et al. [84] used knowledge distillation, training a teacher model on the labeled source domain and optimizing a student network on the probabilistic maps that the teacher model derived from the source and target domain images.
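The distillation pattern used in [84]—a teacher fitted on the labeled source domain whose probabilistic outputs supervise a student on both domains—can be sketched in a few lines. The sketch below is a minimal, generic illustration with numpy: the softmax-regression models, the 2-D synthetic features, and the temperature value are assumptions for the example, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Hypothetical 2-D features: a labeled source domain and a shifted,
# unlabeled target domain.
Xs = rng.normal(0.0, 1.0, (200, 2))
ys = (Xs[:, 0] + Xs[:, 1] > 0).astype(int)
Xt = rng.normal(0.5, 1.0, (200, 2))

# "Teacher": softmax regression fitted on the labeled source domain only.
Wt = np.zeros((2, 2))
Y = np.eye(2)[ys]
for _ in range(500):
    P = softmax(Xs @ Wt)
    Wt -= 0.1 * Xs.T @ (P - Y) / len(Xs)   # cross-entropy gradient step

# Soft targets from the teacher on BOTH domains; a temperature T > 1
# softens the probabilistic maps, as is common in distillation.
T = 2.0
X_all = np.vstack([Xs, Xt])
soft = softmax((X_all @ Wt) / T)

# "Student": trained only on the teacher's probabilistic outputs,
# never on the source labels themselves.
Ws = np.zeros((2, 2))
for _ in range(500):
    P = softmax(X_all @ Ws)
    Ws -= 0.1 * X_all.T @ (P - soft) / len(X_all)

# The student should reproduce the teacher's decisions on the target domain.
agree = (softmax(Xt @ Ws).argmax(1) == softmax(Xt @ Wt).argmax(1)).mean()
```

Because the student is supervised only by the teacher's outputs, the same recipe applies when target domain labels are entirely unavailable, which is what makes it attractive for the unlabeled-data setting discussed above.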
The application of transfer learning has a few potential detrimental consequences that only two studies included in the survey have investigated. Kollia et al. [134] considered source and target domain images jointly during the optimization to avoid catastrophic forgetting—lower performance on the source domain after applying transfer learning to the target domain. Sato et al. [25] proposed an algorithm to detect aneurysms that directly inhibits negative transfer [16]—worse results on the target domain than if no transfer learning is applied.
5. Discussion
We surveyed 129 articles on transfer learning in brain MRI, and our results indicate an increased interest in the field in recent years. Alzheimer's diagnostics/prognostics, tumor classification, and tumor segmentation were the most studied applications. The popularity of these applications is likely linked to the existence of well-known, publicly or easily available databases, such as ADNI [162] and the Brain Tumor Segmentation challenge datasets [2,163,164]. We would like to point out that there are also other large MRI databases available to researchers (e.g., ABIDE [165,166], Human Connectome Project [167]), but the number of articles in this review utilizing these other databases was considerably lower than for ADNI or BraTS. CNNs were the most used machine learning method, utilized in 68.22% of all the reviewed papers, and in 80% of those from the last two years. The demonstrated outperformance of CNNs over other methods [168], and the availability of CNNs trained on ImageNet [124], such as AlexNet [169], VGG [156], and Inception [158], probably explain CNNs' popularity.

We classified transfer learning approaches into instance-, feature representation-, parameter-, and relational knowledge-based [9]. We noticed that this classification was too coarse, hindering the identification of popular solutions within the MR brain imaging community. As Pan and Yang [9] indicated, this categorization is based on "what to transfer" rather than "how to transfer". Therefore, based on the surveyed articles, we refined this categorization by introducing subcategories that define transfer learning approaches more precisely (see Section 4.3 and Table 4). Our categorization divided instance-based approaches based on whether images' weights were fixed or optimized, and in the latter case, subdivided them to separate supervised and unsupervised optimization. Symmetric feature-based approaches were split depending on whether the strategy operated directly or indirectly on the feature representations between domains.
Parameter-sharing-based approaches were divided based on whether one or multiple models were optimized. With our categorization, we found that most of the strategies pre-trained a model, or utilized a pre-trained model, and subsequently fine-tuned all the parameters, certain parameters, or a separate model on the target domain data. Among these studies, a similar number of approaches either utilized ImageNet-pre-trained CNNs or pre-trained on medical images.
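Among the direct symmetric feature-based approaches in this categorization, several surveyed methods minimize the maximum mean discrepancy (MMD) between domains [44,47,62]. As a concrete illustration, the following numpy sketch computes a biased estimate of the squared MMD with an RBF kernel; the synthetic feature arrays, dimensionality, and bandwidth are illustrative assumptions, not values from any surveyed study.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Pairwise RBF kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

def mmd2(Xs, Xt, gamma=1.0):
    """Biased estimate of the squared maximum mean discrepancy."""
    return (rbf_kernel(Xs, Xs, gamma).mean()
            - 2 * rbf_kernel(Xs, Xt, gamma).mean()
            + rbf_kernel(Xt, Xt, gamma).mean())

rng = np.random.default_rng(0)
Xs = rng.normal(0.0, 1.0, (300, 5))       # source-domain features
Xt_far = rng.normal(2.0, 1.0, (300, 5))   # shifted target domain
Xt_near = rng.normal(0.0, 1.0, (300, 5))  # well-aligned target domain

# A feature transformation that aligns the domains should shrink this value;
# direct symmetric approaches use it (or a similar discrepancy) as a loss term.
gap_before = mmd2(Xs, Xt_far)
gap_after = mmd2(Xs, Xt_near)
```

In the surveyed methods this quantity is not just reported but minimized with respect to the parameters of a learned feature transformation, alongside the task-specific loss.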
Beyond showing accuracy gains, the surveyed articles rarely examined other approach-specific details. Only a few studies that optimized images' weights [47,52,53] showed their post-optimization distribution. Interestingly, several of these weights became zero or close to zero, indicating that the contribution of their corresponding images to tackling the studied task was negligible. The incorporation of sparsity-inducing regularizers, as in [40], and a closer look at these weights could lead to intelligent sample selection strategies and advances in curriculum learning [170]. Regarding feature-based transfer learning approaches, various studies [26,39,44,52,56,64,68,69,74] illustrated, typically with t-SNE [171], that the source and target domain images lay closer in the feature space after the application of their method. However, we found no articles that compared and illustrated the feature spaces obtained with different strategies. This also raises the question of how to properly quantify that one feature space distribution is better than another in the context of transfer learning.

With a limited amount of training data in the target domain, fine-tuning ImageNet-pretrained CNNs has been demonstrated to yield better results than training such CNNs from scratch, even in the medical domain [172]. In agreement with this observation, nearly half of the parameter-based approaches followed this practice. Fine-tuning all parameters (prior-sharing) and fine-tuning certain parameters (parameter-sharing) were both widely used methods, although in the latter case we rarely found justifications for choosing which parameters to fine-tune. Since the first layers of CNNs capture low-level information, such as borders and corners, various studies [32,132,133,135,143] have considered that those parameters can be shared across domains. Besides, as adapting pretrained CNNs to the target domain data requires, at least, replacing the last layer of these models, researchers have likely turned fine-tuning only this randomly-initialized layer into common practice, although we found no empirical studies that supported such practice. Four surveyed articles studied different fine-tuning strategies with CNNs pretrained on ImageNet [94,132] and on medical images [127,128]. The approaches that utilized ImageNet-pretrained CNNs [94,132] reported that fine-tuning more layers led to higher accuracy, suggesting that the first layers of ImageNet-pretrained networks—which detect low-level image characteristics, such as corners and borders—may not be adequate for medical images. Furthermore, Bashyam et al. [99] reported that Inceptionv2 [159] pretrained on medical images outperformed its ImageNet-pretrained version in Alzheimer's, mild cognitive impairment, and schizophrenia classification. Finally, a recent study [128] showed that fine-tuning the first layers of a CNN yielded better results than the traditional approach of exclusively fine-tuning the last layers, suggesting that the first layers encapsulate more domain-dependent information.

The number of parameters that require fine-tuning, and which ones, could depend on the target domain/task and on the training set size, as large models require more training data. Likewise, the required training set size in the target domain could be task-dependent. Only a few studies investigated the application of transfer learning with different training set sizes [97,99,126–128,133,145,148].
Among these, most articles reported that, with a sufficiently large training set, models trained from scratch achieved similar or even better [97,133] results than models trained with transfer learning.
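The "fine-tune only certain parameters" pattern discussed above—freeze the pretrained feature extractor and update only the randomly initialized last layer—can be sketched with a toy two-layer model in numpy. The architecture, the random "pretrained" weights, and the synthetic target-domain data are illustrative assumptions, not any surveyed method.

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# "Pretrained" first layer: in practice these weights come from a source
# task; here fixed random projections stand in for them.
W1 = rng.normal(0, 1, (4, 8))           # frozen parameters

# Small labeled target-domain set (hypothetical features and labels).
X = rng.normal(0, 1, (150, 4))
y = (X[:, 0] - X[:, 2] > 0).astype(float)

H = np.tanh(X @ W1)                     # frozen feature extraction

# Fine-tune ONLY the new last layer: logistic regression on the frozen
# features H. W1 receives no gradient updates.
w2 = np.zeros(8)
for _ in range(2000):
    p = sigmoid(H @ w2)
    w2 -= 0.1 * H.T @ (p - y) / len(H)  # cross-entropy gradient step

acc = ((sigmoid(H @ w2) > 0.5) == (y > 0.5)).mean()
```

Fine-tuning all parameters would additionally propagate the gradient into W1; the trade-off between the two options depends, as noted above, on the target task and on how much target-domain training data is available.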
A limitation of this survey arises from the subjectivity and difficulty of classifying transfer learning methods into the sub-categories. Also, a few surveyed articles combined multiple transfer learning approaches, hindering their identification and classification. Occasionally, determining the source and target tasks/domains, and whether labels were used exclusively in the transfer learning approach, was challenging. Thus, the numbers of approaches belonging to specific categories might not be exact. Another limitation is that, especially with CNNs, it is sometimes ambiguous to decide whether an approach is a transfer learning approach. We based our literature search largely on the authors' opinions of whether their article was about transfer learning. Also, when rejecting articles from the review, we typically trusted the authors' judgements: we accepted an article if its authors indicated that their approach was transfer learning. Thus, borderline articles whose authors did not categorize them as transfer learning might have escaped our literature search even though a similar approach was included in this review.
6. Conclusions
The growing number of transfer learning approaches for brain MRI applications during recent years signals the perceived importance of transfer learning in brain MRI. The need for transfer learning emerges in large datasets that combine MR images from multiple studies and acquisition centers, since different imaging protocols and scanner hardware are used. Transfer learning can also increase the clinical applicability of machine learning techniques by, at the very least, simplifying the training in a new environment. Well-known datasets that are easily available to researchers have clearly boosted certain applications, such as Alzheimer's diagnostics/prognostics and tumor segmentation.

Similarly, the availability of pretrained CNNs has contributed to the wide use of CNNs for transfer learning. Indeed, we found that pretraining a CNN, or utilizing a pretrained CNN, and subsequently fine-tuning it on target domain data was the most widely used approach to apply transfer learning. Additionally, we noticed that the studies that investigated different fine-tuning strategies in ImageNet-pretrained CNNs reported higher accuracy after fine-tuning all the parameters rather than just the last layers, as many other approaches did. Although we found various studies tackling issues relevant to the medical imaging community, such as privacy, we only found two brain-specific approaches, which, coincidentally, did not utilize CNNs.

Finally, the surveyed studies rarely examined their solutions in depth after applying transfer learning. For example, instance-based approaches seldom interpreted images' weights; feature-based approaches did not compare their feature spaces with those derived by other methods; and the majority of parameter-based approaches relied on assumptions to decide which parameters to share.
Funding:
The work of J.M. Valverde was funded by the European Union's Horizon 2020 Framework Programme (Marie Skłodowska-Curie grant agreement
Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.
Author Contributions: conceptualization, JT; methodology, JMV, JT; abstract and paper reviews and analyses, JMV, VI, AA, RDF, MP, RC, JT; writing–original draft preparation, JMV; writing–review and editing, VI, AA, RDF, MP, RC, JT.
References
1. Lerch, J.P.; van der Kouwe, A.J.; Raznahan, A.; Paus, T.; Johansen-Berg, H.; Miller, K.L.; Smith, S.M.; Fischl, B.; Sotiropoulos, S.N. Studying neuroanatomy using MRI. Nature neuroscience, 314–326.
2. Menze, B.H.; Jakab, A.; Bauer, S.; Kalpathy-Cramer, J.; Farahani, K.; Kirby, J.; Burren, Y.; Porz, N.; Slotboom, J.; Wiest, R.; others. The multimodal brain tumor image segmentation benchmark (BRATS). IEEE transactions on medical imaging, 1993–2024.
3. Cao, X.; Fan, J.; Dong, P.; Ahmad, S.; Yap, P.T.; Shen, D. Image registration using machine and deep learning. In Handbook of Medical Image Computing and Computer Assisted Intervention; Elsevier, 2020; pp. 319–342.
4. Falahati, F.; Westman, E.; Simmons, A. Multivariate data analysis and machine learning in Alzheimer's disease with a focus on structural magnetic resonance imaging. Journal of Alzheimer's disease, 685–708.
5. Jovicich, J.; Czanner, S.; Greve, D.; Haley, E.; van Der Kouwe, A.; Gollub, R.; Kennedy, D.; Schmitt, F.; Brown, G.; MacFall, J.; others. Reliability in multi-site structural MRI studies: effects of gradient non-linearity correction on phantom and human data. Neuroimage, 436–443.
6. Chen, J.; Liu, J.; Calhoun, V.D.; Arias-Vasquez, A.; Zwiers, M.P.; Gupta, C.N.; Franke, B.; Turner, J.A. Exploration of scanning effects in multi-site structural MRI studies. Journal of neuroscience methods, 37–50.
7. Day, O.; Khoshgoftaar, T.M. A survey on heterogeneous transfer learning. Journal of Big Data, 29.
8. Woodworth, R.S.; Thorndike, E. The influence of improvement in one mental function upon the efficiency of other functions. (I). Psychological review, 247.
9. Pan, S.J.; Yang, Q. A survey on transfer learning. IEEE Transactions on knowledge and data engineering, 1345–1359.
10. Zhuang, F.; Qi, Z.; Duan, K.; Xi, D.; Zhu, Y.; Zhu, H.; Xiong, H.; He, Q. A comprehensive survey on transfer learning. Proceedings of the IEEE, 43–76.
11. Cheplygina, V.; de Bruijne, M.; Pluim, J.P. Not-so-supervised: a survey of semi-supervised, multi-instance, and transfer learning in medical image analysis. Medical image analysis, 280–296.
12. Wang, W.; Kiik, M.; Peek, N.; Curcin, V.; Marshall, I.J.; Rudd, A.G.; Wang, Y.; Douiri, A.; Wolfe, C.D.; Bray, B. A systematic review of machine learning models for predicting outcomes of stroke with structured data. Plos one, e0234722.
13. Orozco-Arias, S.; Isaza, G.; Guyot, R.; Tabares-Soto, R. A systematic review of the application of machine learning in the detection and classification of transposable elements. PeerJ, e8311.
14. Cheng, D.; Liu, D.; Philpotts, L.L.; Turner, D.P.; Houle, T.T.; Chen, L.; Zhang, M.; Yang, J.; Zhang, W.; Deng, H. Current state of science in machine learning methods for automatic infant pain evaluation using facial expression information: study protocol of a systematic review and meta-analysis. BMJ open.
15. Ge, L.; Gao, J.; Ngo, H.; Li, K.; Zhang, A. On handling negative transfer and imbalanced distributions in multiple source transfer learning. Statistical Analysis and Data Mining: The ASA Data Science Journal, 254–271.
16. Wang, Z.; Dai, Z.; Póczos, B.; Carbonell, J. Characterizing and avoiding negative transfer. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 11293–11302.
17. Leen, G.; Peltonen, J.; Kaski, S. Focused multi-task learning in a Gaussian process framework. Machine learning, 157–182.
18. Weninger, L.; Liu, Q.; Merhof, D. Multi-task Learning for Brain Tumor Segmentation. International MICCAI Brainlesion Workshop. Springer, 2019, pp. 327–337.
19. Zhou, C.; Ding, C.; Wang, X.; Lu, Z.; Tao, D. One-pass multi-task networks with cross-task guided attention for brain tumor segmentation. IEEE Transactions on Image Processing, 4516–4529.
20. Nalepa, J.; Marcinkiewicz, M.; Kawulok, M. Data augmentation for brain-tumor segmentation: a review. Frontiers in computational neuroscience, 83.
21. Zhao, A.; Balakrishnan, G.; Durand, F.; Guttag, J.V.; Dalca, A.V. Data augmentation using learned transformations for one-shot medical image segmentation. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 8543–8553.
22. Liu, Q.; Dou, Q.; Heng, P.A. Shape-aware Meta-learning for Generalizing Prostate MRI Segmentation to Unseen Domains. International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2020, pp. 475–485.
23. Finn, C.; Abbeel, P.; Levine, S. Model-agnostic meta-learning for fast adaptation of deep networks. arXiv preprint arXiv:1703.03400.
24. Moher, D.; Liberati, A.; Tetzlaff, J.; Altman, D.G.; Group, P.; others. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS med, e1000097.
25. Sato, I.; Nomura, Y.; Hanaoka, S.; Miki, S.; Hayashi, N.; Abe, O.; Masutani, Y. Managing Computer-Assisted Detection System Based on Transfer Learning with Negative Transfer Inhibition. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018, pp. 695–704.
26. Huang, Y.L.; Hsieh, W.T.; Yang, H.C.; Lee, C.C. Conditional Domain Adversarial Transfer for Robust Cross-Site ADHD Classification Using Functional MRI. ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2020, pp. 1190–1194.
27. Martyn, P.; McPhilemy, G.; Nabulsi, L.; Martyn, F.; McDonald, C.; Cannon, D.; Schukat, M. Using Magnetic Resonance Imaging to Distinguish a Healthy Brain from a Bipolar Brain: A Transfer Learning Approach. AICS, 2019.
28. Attallah, O.; Sharkas, M.A.; Gadelkarim, H. Deep Learning Techniques for Automatic Detection of Embryonic Neurodevelopmental Disorders. Diagnostics, 27.
29. Si, X.; Zhang, X.; Zhou, Y.; Sun, Y.; Jin, W.; Yin, S.; Zhao, X.; Li, Q.; Ming, D. Automated Detection of Juvenile Myoclonic Epilepsy using CNN based Transfer Learning in Diffusion MRI. 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC). IEEE, 2020, pp. 1679–1682.
30. Chougule, T.; Shinde, S.; Santosh, V.; Saini, J.; Ingalhalikar, M. On Validating Multimodal MRI Based Stratification of IDH Genotype in High Grade Gliomas Using CNNs and Its Comparison to Radiomics. International Workshop on Radiomics and Radiogenomics in Neuro-oncology. Springer, 2019, pp. 53–60.
31. Eitel, F.; Soehler, E.; Bellmann-Strobl, J.; Brandt, A.U.; Ruprecht, K.; Giess, R.M.; Kuchling, J.; Asseyer, S.; Weygandt, M.; Haynes, J.D.; others. Uncovering convolutional neural network decisions for diagnosing multiple sclerosis on conventional MRI using layer-wise relevance propagation. NeuroImage: Clinical, 102003.
32. Samani, Z.R.; Alappatt, J.A.; Parker, D.; Ismail, A.A.O.; Verma, R. QC-Automator: Deep learning-based automated quality control for diffusion mr images. Frontiers in Neuroscience.
33. Dong, Q.; Zhang, J.; Li, Q.; Wang, J.; Leporé, N.; Thompson, P.M.; Caselli, R.J.; Ye, J.; Wang, Y.; Initiative, A.D.N.; others. Integrating Convolutional Neural Networks and Multi-Task Dictionary Learning for Cognitive Decline Prediction with Longitudinal Images. Journal of Alzheimer's Disease, 1–22.
34. Moradi, E.; Khundrakpam, B.; Lewis, J.D.; Evans, A.C.; Tohka, J. Predicting symptom severity in autism spectrum disorder based on cortical thickness measures in agglomerative data. Neuroimage, 128–141.
35. Huang, S.; Li, J.; Chen, K.; Wu, T.; Ye, J.; Wu, X.; Yao, L. A transfer learning approach for network modeling. IIE transactions, 915–931.
36. Hu, L.S.; Yoon, H.; Eschbacher, J.M.; Baxter, L.C.; Dueck, A.C.; Nespodzany, A.; Smith, K.A.; Nakaji, P.; Xu, Y.; Wang, L.; others. Accurate patient-specific machine learning models of glioblastoma invasion using transfer learning. American Journal of Neuroradiology, 418–425.
37. Hermessi, H.; Mourali, O.; Zagrouba, E. Convolutional neural network-based multimodal image fusion via similarity learning in the shearlet domain. Neural Computing and Applications, 2029–2045.
38. Naselaris, T.; Kay, K.N.; Nishimoto, S.; Gallant, J.L. Encoding and decoding in fMRI. Neuroimage, 400–410.
39. Dinsdale, N.K.; Jenkinson, M.; Namburete, A.I. Unlearning Scanner Bias for MRI Harmonisation. International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2020, pp. 369–378.
40. Cheng, B.; Liu, M.; Zhang, D.; Munsell, B.C.; Shen, D. Domain transfer learning for MCI conversion prediction. IEEE Transactions on Biomedical Engineering, 1805–1817.
41. Cheng, B.; Liu, M.; Suk, H.I.; Shen, D.; Zhang, D.; Initiative, A.D.N.; others. Multimodal manifold-regularized transfer learning for MCI conversion prediction. Brain imaging and behavior, 913–926.
42. Goetz, M.; Weber, C.; Binczyk, F.; Polanska, J.; Tarnawski, R.; Bobek-Billewicz, B.; Koethe, U.; Kleesiek, J.; Stieltjes, B.; Maier-Hein, K.H. DALSA: domain adaptation for supervised learning from sparsely annotated MR images. IEEE transactions on medical imaging, 184–196.
43. Van Opbroek, A.; Ikram, M.A.; Vernooij, M.W.; De Bruijne, M. Transfer learning improves supervised image segmentation across imaging protocols. IEEE transactions on medical imaging, 1018–1030.
44. Wang, B.; Li, W.; Fan, W.; Chen, X.; Wu, D. Alzheimer's Disease Brain Network Classification Using Improved Transfer Feature Learning with Joint Distribution Adaptation. 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). IEEE, 2019, pp. 2959–2963.
45. Van Opbroek, A.; Ikram, M.A.; Vernooij, M.W.; De Bruijne, M. A transfer-learning approach to image segmentation across scanners by maximizing distribution similarity. International Workshop on Machine Learning in Medical Imaging. Springer, 2013, pp. 49–56.
46. van Opbroek, A.; Vernooij, M.W.; Ikram, M.A.; de Bruijne, M. Weighting training images by maximizing distribution similarity for supervised segmentation across scanners. Medical image analysis, 245–254.
47. Van Opbroek, A.; Achterberg, H.C.; Vernooij, M.W.; De Bruijne, M. Transfer learning for image segmentation by combining image weighting and kernel learning. IEEE transactions on medical imaging, 213–224.
48. Wang, B.; Prastawa, M.; Saha, A.; Awate, S.P.; Irimia, A.; Chambers, M.C.; Vespa, P.M.; Van Horn, J.D.; Pascucci, V.; Gerig, G. Modeling 4D changes in pathological anatomy using domain adaptation: Analysis of TBI imaging using a tumor database. International Workshop on Multimodal Brain Image Analysis. Springer, 2013, pp. 31–39.
49. van Opbroek, A.; Ikram, M.A.; Vernooij, M.W.; de Bruijne, M. Supervised image segmentation across scanner protocols: A transfer learning approach. International Workshop on Machine Learning in Medical Imaging. Springer, 2012, pp. 160–167.
50. Tan, X.; Liu, Y.; Li, Y.; Wang, P.; Zeng, X.; Yan, F.; Li, X. Localized instance fusion of MRI data of Alzheimer's disease for classification based on instance transfer ensemble learning. Biomedical engineering online, 49.
51. Zhou, K.; He, W.; Xu, Y.; Xiong, G.; Cai, J. Feature selection and transfer learning for Alzheimer's disease clinical diagnosis. Applied Sciences, 1372.
52. Li, X.; Gu, Y.; Dvornek, N.; Staib, L.; Ventola, P.; Duncan, J.S. Multi-site fmri analysis using privacy-preserving federated learning and domain adaptation: Abide results. arXiv preprint arXiv:2001.05647.
53. Wachinger, C.; Reuter, M.; Initiative, A.D.N.; others. Domain adaptation for Alzheimer's disease diagnostics. Neuroimage, 470–479.
54. Ali, M.B.; Gu, I.Y.H.; Berger, M.S.; Pallud, J.; Southwell, D.; Widhalm, G.; Roux, A.; Vecchio, T.G.; Jakola, A.S. Domain Mapping and Deep Learning from Multiple MRI Clinical Datasets for Prediction of Molecular Subtypes in Low Grade Gliomas. Brain Sciences, 463.
55. Tokuoka, Y.; Suzuki, S.; Sugawara, Y. An Inductive Transfer Learning Approach using Cycle-consistent Adversarial Domain Adaptation with Application to Brain Tumor Segmentation. Proceedings of the 2019 6th International Conference on Biomedical and Bioinformatics Engineering, 2019, pp. 44–48.
56. Li, W.; Zhao, Y.; Chen, X.; Xiao, Y.; Qin, Y. Detecting Alzheimer's disease on small dataset: A knowledge transfer perspective. IEEE journal of biomedical and health informatics, 1234–1242.
57. Wang, B.; Prastawa, M.; Irimia, A.; Saha, A.; Liu, W.; Goh, S.M.; Vespa, P.M.; Van Horn, J.D.; Gerig, G. Modeling 4D pathological changes by leveraging normative models. Computer Vision and Image Understanding, 3–13.
58. van Opbroek, A.; Achterberg, H.C.; de Bruijne, M. Feature-space transformation improves supervised segmentation across scanners. Medical Learning Meets Medical Imaging. Springer, 2015, pp. 85–93.
59. van Opbroek, A.; Achterberg, H.C.; Vernooij, M.W.; Ikram, M.A.; de Bruijne, M.; Initiative, A.D.N.; others. Transfer learning by feature-space transformation: A method for Hippocampus segmentation across scanners. Neuroimage: Clinical, 466–475.
60. Qin, Y.; Li, Y.; Liu, Z.; Ye, C. Knowledge Transfer Between Datasets for Learning-Based Tissue Microstructure Estimation. 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI). IEEE, 2020, pp. 1530–1533.
61. Mansoor, A.; Linguraru, M.G. Communal Domain Learning for Registration in Drifted Image Spaces. International Workshop on Machine Learning in Medical Imaging. Springer, 2019, pp. 479–488.
62. Zhou, S.; Cox, C.R.; Lu, H. Improving whole-brain neural decoding of fmri with domain adaptation. International Workshop on Machine Learning in Medical Imaging. Springer, 2019, pp. 265–273.
63. van Tulder, G.; de Bruijne, M. Representation learning for cross-modality classification. In Medical Computer Vision and Bayesian and Graphical Models for Biomedical Imaging; Springer, 2016; pp. 126–136.
64. Gao, Y.; Zhang, Y.; Cao, Z.; Guo, X.; Zhang, J. Decoding Brain States from fMRI Signals by using Unsupervised Domain Adaptation. IEEE Journal of Biomedical and Health Informatics.
65. Zhang, J.; Wan, P.; Zhang, D. Transport-Based Joint Distribution Alignment for Multi-site Autism Spectrum Disorder Diagnosis Using Resting-State fMRI. International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2020, pp. 444–453.
66. Cheng, B.; Liu, M.; Shen, D.; Li, Z.; Zhang, D.; Initiative, A.D.N.; others. Multi-domain transfer learning for early diagnosis of Alzheimer's disease. Neuroinformatics, 115–132.
67. Cheng, B.; Liu, M.; Zhang, D.; Shen, D.; Initiative, A.D.N.; others. Robust multi-label transfer feature learning for early diagnosis of Alzheimer's disease. Brain imaging and behavior, 138–153.
68. van Tulder, G.; de Bruijne, M. Learning cross-modality representations from multi-modal images. IEEE transactions on medical imaging, 638–648.
69. Li, Z.; Ogino, M. Augmented Radiology: Patient-Wise Feature Transfer Model for Glioma Grading. In Domain Adaptation and Representation Transfer, and Distributed and Collaborative Learning; Springer, 2020; pp. 23–30.
70. Ackaouy, A.; Courty, N.; Vallée, E.; Commowick, O.; Barillot, C.; Galassi, F. Unsupervised domain adaptation with optimal transport in multi-site segmentation of multiple sclerosis lesions from MRI data. Frontiers in computational neuroscience, 19.
71. Hofer, C.; Kwitt, R.; Höller, Y.; Trinka, E.; Uhl, A. Simple domain adaptation for cross-dataset analyses of brain MRI data. 2017 IEEE 14th International Symposium on Biomedical Imaging (ISBI 2017). IEEE, 2017, pp. 441–445.
72. Cai, X.L.; Xie, D.J.; Madsen, K.H.; Wang, Y.M.; Bögemann, S.A.; Cheung, E.F.; Møller, A.; Chan, R.C. Generalizability of machine learning for classification of schizophrenia based on resting-state functional MRI data. Human Brain Mapping, 172–184.
73. Guerrero, R.; Ledig, C.; Rueckert, D. Manifold alignment and transfer learning for classification of Alzheimer’sdisease. International Workshop on Machine Learning in Medical Imaging. Springer, 2014, pp. 77–84.74. Wang, M.; Zhang, D.; Huang, J.; Yap, P.T.; Shen, D.; Liu, M. Identifying autism spectrum disorder withmulti-site fMRI via low-rank domain adaptation.
IEEE Transactions on Medical Imaging, 644–655.
75. Zhang, J.; Liu, M.; Pan, Y.; Shen, D. Unsupervised Conditional Consensus Adversarial Network for Brain Disease Identification with Structural MRI. International Workshop on Machine Learning in Medical Imaging. Springer, 2019, pp. 391–399.
76. Shen, Y.; Gao, M. Brain tumor segmentation on MRI with missing modalities. International Conference on Information Processing in Medical Imaging. Springer, 2019, pp. 417–428.
77. Robinson, R.; Dou, Q.; de Castro, D.C.; Kamnitsas, K.; de Groot, M.; Summers, R.M.; Rueckert, D.; Glocker, B. Image-level Harmonization of Multi-Site Data using Image-and-Spatial Transformer Networks. International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2020, pp. 710–719.
78. Guan, H.; Yang, E.; Yap, P.T.; Shen, D.; Liu, M. Attention-Guided Deep Domain Adaptation for Brain Dementia Identification with Multi-site Neuroimaging Data. In Domain Adaptation and Representation Transfer, and Distributed and Collaborative Learning; Springer, 2020; pp. 31–40.
79. Mahapatra, D.; Ge, Z. Training data independent image registration using generative adversarial networks and domain adaptation. Pattern Recognition, 107109.
80. Orbes-Arteaga, M.; Varsavsky, T.; Sudre, C.H.; Eaton-Rosen, Z.; Haddow, L.J.; Sørensen, L.; Nielsen, M.; Pai, A.; Ourselin, S.; Modat, M.; others. Multi-domain adaptation in brain MRI through paired consistency and adversarial learning. In Domain Adaptation and Representation Transfer and Medical Image Learning with Less Labels and Imperfect Data; Springer, 2019; pp. 54–62.
81. Kamnitsas, K.; Baumgartner, C.; Ledig, C.; Newcombe, V.; Simpson, J.; Kane, A.; Menon, D.; Nori, A.; Criminisi, A.; Rueckert, D.; others. Unsupervised domain adaptation in brain lesion segmentation with adversarial networks. International Conference on Information Processing in Medical Imaging. Springer, 2017, pp. 597–609.
82. Varsavsky, T.; Orbes-Arteaga, M.; Sudre, C.H.; Graham, M.S.; Nachev, P.; Cardoso, M.J. Test-time unsupervised domain adaptation. International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2020, pp. 428–436.
83. Shanis, Z.; Gerber, S.; Gao, M.; Enquobahrie, A. Intramodality Domain Adaptation Using Self Ensembling and Adversarial Training. In Domain Adaptation and Representation Transfer and Medical Image Learning with Less Labels and Imperfect Data; Springer, 2019; pp. 28–36.
84. Orbes-Arteaga, M.; Cardoso, J.; Sørensen, L.; Igel, C.; Ourselin, S.; Modat, M.; Nielsen, M.; Pai, A. Knowledge distillation for semi-supervised domain adaptation. In OR 2.0 Context-Aware Operating Theaters and Machine Learning in Clinical Neuroimaging; Springer, 2019; pp. 68–76.
85. Grimm, F.; Edl, F.; Kerscher, S.R.; Nieselt, K.; Gugel, I.; Schuhmann, M.U. Semantic segmentation of cerebrospinal fluid and brain volume with a convolutional neural network in pediatric hydrocephalus—transfer learning from existing algorithms. Acta Neurochirurgica, 2463–2474.
86. Chen, C.L.; Hsu, Y.C.; Yang, L.Y.; Tung, Y.H.; Luo, W.B.; Liu, C.M.; Hwang, T.J.; Hwu, H.G.; Tseng, W.Y.I. Generalization of diffusion magnetic resonance imaging–based brain age prediction model through transfer learning. NeuroImage, p. 116831.
87. Oh, K.; Chung, Y.C.; Kim, K.W.; Kim, W.S.; Oh, I.S. Classification and visualization of Alzheimer's disease using volumetric convolutional neural network and transfer learning. Scientific Reports, 1–16.
88. Abrol, A.; Bhattarai, M.; Fedorov, A.; Du, Y.; Plis, S.; Calhoun, V.; Initiative, A.D.N.; others. Deep residual learning for neuroimaging: An application to predict progression to Alzheimer's disease. Journal of Neuroscience Methods, p. 108701.
89. Alex, V.; Vaidhya, K.; Thirunavukkarasu, S.; Kesavadas, C.; Krishnamurthi, G. Semisupervised learning using denoising autoencoders for brain lesion detection and segmentation. Journal of Medical Imaging, 041311.
90. Cui, S.; Mao, L.; Jiang, J.; Liu, C.; Xiong, S. Automatic semantic segmentation of brain gliomas from MRI images using a deep cascaded neural network. Journal of Healthcare Engineering.
91. Gao, F.; Yoon, H.; Xu, Y.; Goradia, D.; Luo, J.; Wu, T.; Su, Y.; Initiative, A.D.N.; others. AD-NET: Age-adjust neural network for improved MCI to AD conversion prediction. NeuroImage: Clinical, p. 102290.
92. Han, Y.; Yoo, J.; Kim, H.H.; Shin, H.J.; Sung, K.; Ye, J.C. Deep learning with domain adaptation for accelerated projection-reconstruction MR. Magnetic Resonance in Medicine, 1189–1205.
93. Amin, J.; Sharif, M.; Yasmin, M.; Saba, T.; Anjum, M.A.; Fernandes, S.L. A new approach for brain tumor segmentation and classification based on score level fusion using transfer learning. Journal of Medical Systems, 326.
94. Swati, Z.N.K.; Zhao, Q.; Kabir, M.; Ali, F.; Ali, Z.; Ahmed, S.; Lu, J. Content-based brain tumor retrieval for MR images using transfer learning. IEEE Access, 17809–17822.
95. Xu, Y.; Géraud, T.; Bloch, I. From neonatal to adult brain MR image segmentation in a few seconds using 3D-like fully convolutional network and transfer learning. 2017 IEEE International Conference on Image Processing (ICIP). IEEE, 2017, pp. 4417–4421.
96. Ladefoged, C.N.; Hansen, A.E.; Henriksen, O.M.; Bruun, F.J.; Eikenes, L.; Øen, S.K.; Karlberg, A.; Højgaard, L.; Law, I.; Andersen, F.L. AI-driven attenuation correction for brain PET/MRI: Clinical evaluation of a dementia cohort and importance of the training group size. NeuroImage, 117221.
97. Dar, S.U.H.; Özbey, M.; Çatlı, A.B.; Çukur, T. A Transfer-Learning Approach for Accelerated MRI Using Deep Neural Networks. Magnetic Resonance in Medicine, 663–685.
98. Aderghal, K.; Khvostikov, A.; Krylov, A.; Benois-Pineau, J.; Afdel, K.; Catheline, G. Classification of Alzheimer disease on imaging modalities with deep CNNs using cross-modal transfer learning. 2018 IEEE 31st International Symposium on Computer-Based Medical Systems (CBMS). IEEE, 2018, pp. 345–350.
99. Bashyam, V.M.; Erus, G.; Doshi, J.; Habes, M.; Nasralah, I.; Truelove-Hill, M.; Srinivasan, D.; Mamourian, L.; Pomponio, R.; Fan, Y.; others. MRI signatures of brain age and disease over the lifespan based on a deep brain network and 14 468 individuals worldwide. Brain, 2312–2324.
100. Xu, Y.; Géraud, T.; Puybareau, É.; Bloch, I.; Chazalon, J. White matter hyperintensities segmentation in a few seconds using fully convolutional network and transfer learning. International MICCAI Brainlesion Workshop. Springer, 2017, pp. 501–514.
101. Han, X. MR-based synthetic CT generation using a deep convolutional neural network method. Medical Physics, 1408–1419.
102. Deepak, S.; Ameer, P. Retrieval of brain MRI with tumor using contrastive loss based similarity on GoogLeNet encodings. Computers in Biology and Medicine, 103993.
103. Li, H.; Parikh, N.A.; He, L. A novel transfer learning approach to enhance deep neural network classification of brain functional connectomes. Frontiers in Neuroscience, 491.
104. Alkassar, S.; Abdullah, M.A.; Jebur, B.A. Automatic Brain Tumour Segmentation using Fully Convolution Network and Transfer Learning. 2019 2nd International Conference on Electrical, Communication, Computer, Power and Control Engineering (ICECCPCE). IEEE, 2019, pp. 188–192.
105. Afridi, M.J.; Ross, A.; Shapiro, E.M. L-CNN: exploiting labeling latency in a CNN learning framework. 2016 23rd International Conference on Pattern Recognition (ICPR). IEEE, 2016, pp. 2156–2161.
106. Chen, K.T.; Schürer, M.; Ouyang, J.; Koran, M.E.I.; Davidzon, G.; Mormino, E.; Tiepolt, S.; Hoffmann, K.T.; Sabri, O.; Zaharchuk, G.; others. Generalization of deep learning models for ultra-low-count amyloid PET/MRI using transfer learning. European Journal of Nuclear Medicine and Molecular Imaging, pp. 1–10.
107. Mahmood, U.; Rahman, M.M.; Fedorov, A.; Lewis, N.; Fu, Z.; Calhoun, V.D.; Plis, S.M. Whole MILC: generalizing learned dynamics across tasks, datasets, and populations. International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2020, pp. 407–417.
108. Yang, Y.; Yan, L.F.; Zhang, X.; Han, Y.; Nan, H.Y.; Hu, Y.C.; Hu, B.; Yan, S.L.; Zhang, J.; Cheng, D.L.; others. Glioma grading on conventional MR images: a deep learning study with transfer learning. Frontiers in Neuroscience, 804.
109. Pravitasari, A.A.; Iriawan, N.; Almuhayar, M.; Azmi, T.; Fithriasari, K.; Purnami, S.W.; Ferriastuti, W.; others. UNet-VGG16 with transfer learning for MRI-based brain tumor segmentation. Telkomnika, 1310–1318.
Computers in Biology and Medicine, p. 103804.
111. Zhao, X.; Zhang, H.; Zhou, Y.; Bian, W.; Zhang, T.; Zou, X. Gibbs-ringing artifact suppression with knowledge transfer from natural images to MR images. Multimedia Tools and Applications, 33711–33733.
112. Coupé, P.; Mansencal, B.; Clément, M.; Giraud, R.; de Senneville, B.D.; Ta, V.T.; Lepetit, V.; Manjon, J.V. AssemblyNet: A novel deep decision-making process for whole brain MRI segmentation. International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2019, pp. 466–474.
113. Zhou, Z.; Sodha, V.; Siddiquee, M.M.R.; Feng, R.; Tajbakhsh, N.; Gotway, M.B.; Liang, J. Models Genesis: Generic autodidactic models for 3D medical image analysis. International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2019, pp. 384–393.
114. Tao, X.; Li, Y.; Zhou, W.; Ma, K.; Zheng, Y. Revisiting Rubik's cube: self-supervised learning with volume-wise transformation for 3D medical image segmentation. International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2020, pp. 238–248.
115. Liu, Y.; Pan, Y.; Yang, W.; Ning, Z.; Yue, L.; Liu, M.; Shen, D. Joint Neuroimage Synthesis and Representation Learning for Conversion Prediction of Subjective Cognitive Decline. International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2020, pp. 583–592.
116. Ataloglou, D.; Dimou, A.; Zarpalas, D.; Daras, P. Fast and precise hippocampus segmentation through deep convolutional neural network ensembles and transfer learning. Neuroinformatics, 563–582.
117. Afridi, M.J.; Ross, A.; Shapiro, E.M. On automated source selection for transfer learning in convolutional neural networks. Pattern Recognition, 65–75.
118. Kouw, W.M.; Ørting, S.N.; Petersen, J.; Pedersen, K.S.; de Bruijne, M. A cross-center smoothness prior for variational Bayesian brain tissue segmentation. International Conference on Information Processing in Medical Imaging. Springer, 2019, pp. 360–371.
119. Kuzina, A.; Egorov, E.; Burnaev, E. Bayesian generative models for knowledge transfer in MRI semantic segmentation problems. Frontiers in Neuroscience, 844.
120. Wee, C.Y.; Liu, C.; Lee, A.; Poh, J.S.; Ji, H.; Qiu, A.; Initiative, A.D.N.; others. Cortical graph neural network for AD and MCI diagnosis and transfer learning across populations. NeuroImage: Clinical, 101929.
121. Fei, X.; Wang, J.; Ying, S.; Hu, Z.; Shi, J. Projective parameter transfer based sparse multiple empirical kernel learning machine for diagnosis of brain disease. Neurocomputing, 271–283.
122. Velioglu, B.; Vural, F.T.Y. Transfer learning for brain decoding using deep architectures. 2017 IEEE 16th International Conference on Cognitive Informatics & Cognitive Computing (ICCI*CC). IEEE, 2017, pp. 65–70.
123. Li, W.; Zhang, L.; Qiao, L.; Shen, D. Toward a Better Estimation of Functional Brain Network for Mild Cognitive Impairment Identification: A Transfer Learning View. IEEE Journal of Biomedical and Health Informatics, 1160–1168.
124. Deng, J.; Dong, W.; Socher, R.; Li, L.J.; Li, K.; Fei-Fei, L. ImageNet: A large-scale hierarchical image database. 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2009, pp. 248–255.
125. Jónsson, B.A.; Bjornsdottir, G.; Thorgeirsson, T.; Ellingsen, L.M.; Walters, G.B.; Gudbjartsson, D.; Stefansson, H.; Stefansson, K.; Ulfarsson, M. Brain age prediction using deep learning uncovers associated sequence variants. Nature Communications, 1–10.
126. Ghafoorian, M.; Mehrtash, A.; Kapur, T.; Karssemeijer, N.; Marchiori, E.; Pesteie, M.; Guttmann, C.R.; de Leeuw, F.E.; Tempany, C.M.; Van Ginneken, B.; others. Transfer learning for domain adaptation in MRI: Application in brain lesion segmentation. International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2017, pp. 516–524.
127. Valverde, S.; Salem, M.; Cabezas, M.; Pareto, D.; Vilanova, J.C.; Ramió-Torrentà, L.; Rovira, À.; Salvi, J.; Oliver, A.; Lladó, X. One-shot domain adaptation in multiple sclerosis lesion segmentation using convolutional neural networks. NeuroImage: Clinical, 101638.
Domain Adaptation and Representation Transfer, and Distributed and Collaborative Learning; Springer, 2020; pp. 117–126.
129. Gao, Y.; Zhang, Y.; Wang, H.; Guo, X.; Zhang, J. Decoding Behavior Tasks From Brain Activity Using Deep Transfer Learning. IEEE Access, 43222–43232.
130. Naser, M.A.; Deen, M.J. Brain tumor segmentation and grading of lower-grade glioma using deep learning in MRI images. Computers in Biology and Medicine, p. 103758.
131. Vakli, P.; Deák-Meszlényi, R.J.; Hermann, P.; Vidnyánszky, Z. Transfer learning improves resting-state functional connectivity pattern analysis using convolutional neural networks. GigaScience, giy130.
132. Jiang, H.; Guo, J.; Du, H.; Xu, J.; Qiu, B. Transfer learning on T1-weighted images for brain age estimation. Mathematical Biosciences and Engineering: MBE, 4382–4398.
133. Kushibar, K.; Valverde, S.; González-Villà, S.; Bernal, J.; Cabezas, M.; Oliver, A.; Lladó, X. Supervised domain adaptation for automatic sub-cortical brain structure segmentation with minimal user interaction. Scientific Reports, 1–15.
134. Kollia, I.; Stafylopatis, A.G.; Kollias, S. Predicting Parkinson's disease using latent information extracted from deep neural networks. 2019 International Joint Conference on Neural Networks (IJCNN). IEEE, 2019, pp. 1–8.
135. Menikdiwela, M.; Nguyen, C.; Shaw, M. Deep Learning on Brain Cortical Thickness Data for Disease Classification. 2018 Digital Image Computing: Techniques and Applications (DICTA). IEEE, 2018, pp. 1–5.
136. Kaur, B.; Lemaître, P.; Mehta, R.; Sepahvand, N.M.; Precup, D.; Arnold, D.; Arbel, T. Improving Pathological Structure Segmentation via Transfer Learning Across Diseases. In Domain Adaptation and Representation Transfer and Medical Image Learning with Less Labels and Imperfect Data; Springer, 2019; pp. 90–98.
137. Liu, R.; Hall, L.O.; Goldgof, D.B.; Zhou, M.; Gatenby, R.A.; Ahmed, K.B. Exploring deep features from brain tumor magnetic resonance images via transfer learning. 2016 International Joint Conference on Neural Networks (IJCNN). IEEE, 2016, pp. 235–242.
138. Stawiaski, J. A pretrained DenseNet encoder for brain tumor segmentation. International MICCAI Brainlesion Workshop. Springer, 2018, pp. 105–115.
139. Mahapatra, D.; Ge, Z. Training data independent image registration with GANs using transfer learning and segmentation information. 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019). IEEE, 2019, pp. 709–713.
140. Wang, L.; Li, S.; Meng, M.; Chen, G.; Zhu, M.; Bian, Z.; Lyu, Q.; Zeng, D.; Ma, J. Task-oriented Deep Network for Ischemic Stroke Segmentation in Unenhanced CT Imaging. 2019 IEEE Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC). IEEE, 2019, pp. 1–3.
141. Wang, S.; Shen, Y.; Chen, W.; Xiao, T.; Hu, J. Automatic recognition of mild cognitive impairment from MRI images using expedited convolutional neural networks. International Conference on Artificial Neural Networks. Springer, 2017, pp. 373–380.
142. Guy-Fernand, K.N.; Zhao, J.; Sabuni, F.M.; Wang, J. Classification of Brain Tumor Leveraging Goal-Driven Visual Attention with the Support of Transfer Learning. 2020 Information Communication Technologies Conference (ICTC). IEEE, 2020, pp. 328–332.
143. Khan, N.M.; Abraham, N.; Hon, M. Transfer learning with intelligent training data selection for prediction of Alzheimer's disease. IEEE Access, 72726–72735.
144. Tufail, A.B.; Ma, Y.; Zhang, Q.N. Multiclass classification of initial stages of Alzheimer's disease through neuroimaging modalities and convolutional neural networks. 2020 IEEE 5th Information Technology and Mechatronics Engineering Conference (ITOEC). IEEE, 2020, pp. 51–56.
145. Castro, A.P.; Fernandez-Blanco, E.; Pazos, A.; Munteanu, C.R. Automatic assessment of Alzheimer's disease diagnosis based on deep learning techniques. Computers in Biology and Medicine, p. 103764.
146. Bodapati, J.D.; Vijay, A.; Veeranjaneyulu, N. Brain tumor detection using deep features in the latent space. Ingénierie des Systèmes d'Information (http://iieta.org/journals/isi), 259–265.
147. Kang, L.; Jiang, J.; Huang, J.; Zhang, T. Identifying early mild cognitive impairment by multi-modality MRI-based deep learning. Frontiers in Aging Neuroscience.
OR 2.0 Context-Aware Operating Theaters and Machine Learning in Clinical Neuroimaging; Springer, 2019; pp. 59–67.
149. Ebrahimi-Ghahnavieh, A.; Luo, S.; Chiong, R. Transfer Learning for Alzheimer's Disease Detection on MRI Images. 2019 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT). IEEE, 2019, pp. 133–138.
150. Zheng, J.; Xia, K.; Zheng, Q.; Qian, P. A smart brain MR image completion method guided by synthetic-CT-based multimodal registration. Journal of Ambient Intelligence and Humanized Computing, pp. 1–10.
151. Svanera, M.; Savardi, M.; Benini, S.; Signoroni, A.; Raz, G.; Hendler, T.; Muckli, L.; Goebel, R.; Valente, G. Transfer learning of deep neural network representations for fMRI decoding. Journal of Neuroscience Methods, 108319.
152. Ren, Y.; Luo, Q.; Gong, W.; Lu, W. Transfer Learning Models on Brain Age Prediction. Proceedings of the Third International Symposium on Image Computing and Digital Medicine, 2019, pp. 278–282.
153. Han, W.; Qin, L.; Bay, C.; Chen, X.; Yu, K.H.; Miskin, N.; Li, A.; Xu, X.; Young, G. Deep Transfer Learning and Radiomics Feature Prediction of Survival of Patients with High-Grade Gliomas. American Journal of Neuroradiology, 40–48.
154. He, Y.; Carass, A.; Zuo, L.; Dewey, B.E.; Prince, J.L. Self domain adapted network. International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2020, pp. 437–446.
155. Yang, Y.; Li, X.; Wang, P.; Xia, Y.; Ye, Q. Multi-Source Transfer Learning via Ensemble Approach for Initial Diagnosis of Alzheimer's Disease. IEEE Journal of Translational Engineering in Health and Medicine, 1–10.
156. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
157. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
158. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 1–9.
159. Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the inception architecture for computer vision. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2818–2826.
160. Szegedy, C.; Ioffe, S.; Vanhoucke, V.; Alemi, A. Inception-v4, Inception-ResNet and the impact of residual connections on learning. Proceedings of the AAAI Conference on Artificial Intelligence, 2017, Vol. 31.
161. Dwork, C.; Roth, A.; others. The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical Computer Science, 211–407.
162. Petersen, R.C.; Aisen, P.; Beckett, L.A.; Donohue, M.; Gamst, A.; Harvey, D.J.; Jack, C.; Jagust, W.; Shaw, L.; Toga, A.; others. Alzheimer's Disease Neuroimaging Initiative (ADNI): clinical characterization. Neurology, 201–209.
163. Bakas, S.; Akbari, H.; Sotiras, A.; Bilello, M.; Rozycki, M.; Kirby, J.S.; Freymann, J.B.; Farahani, K.; Davatzikos, C. Advancing the cancer genome atlas glioma MRI collections with expert segmentation labels and radiomic features. Scientific Data, 1–13.
164. Bakas, S.; Reyes, M.; Jakab, A.; Bauer, S.; Rempfler, M.; Crimi, A.; Shinohara, R.T.; Berger, C.; Ha, S.M.; Rozycki, M.; others. Identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the BRATS challenge. arXiv preprint arXiv:1811.02629.
165. Di Martino, A.; Yan, C.G.; Li, Q.; Denio, E.; Castellanos, F.X.; Alaerts, K.; Anderson, J.S.; Assaf, M.; Bookheimer, S.Y.; Dapretto, M.; others. The autism brain imaging data exchange: towards a large-scale evaluation of the intrinsic brain architecture in autism. Molecular Psychiatry, 659–667.
166. Di Martino, A.; O'connor, D.; Chen, B.; Alaerts, K.; Anderson, J.S.; Assaf, M.; Balsters, J.H.; Baxter, L.; Beggiato, A.; Bernaerts, S.; others. Enhancing studies of the connectome in autism using the autism brain imaging data exchange II. Scientific Data, 1–15.
167. Van Essen, D.C.; Smith, S.M.; Barch, D.M.; Behrens, T.E.; Yacoub, E.; Ugurbil, K.; Consortium, W.M.H.; others. The WU-Minn human connectome project: an overview. NeuroImage, 62–79.