Combining unsupervised and supervised learning for predicting the final stroke lesion

Abstract

Predicting the final ischaemic stroke lesion provides crucial information regarding the volume of salvageable hypoperfused tissue, which helps physicians in the difficult decision-making process of treatment planning and intervention. Treatment selection is influenced by clinical diagnosis, which requires delineating the stroke lesion, as well as characterising cerebral blood flow dynamics using neuroimaging acquisitions. Nonetheless, predicting the final stroke lesion is an intricate task, due to the variability in lesion size, shape, location and the underlying cerebral haemodynamic processes that occur after the ischaemic stroke takes place. Moreover, since elapsed time between stroke and treatment is related to the loss of brain tissue, assessing and predicting the final stroke lesion needs to be performed in a short period of time, which makes the task even more complex. Therefore, there is a need for automatic methods that predict the final stroke lesion and support physicians in the treatment decision process. We propose a fully automatic deep learning method based on unsupervised and supervised learning to predict the final stroke lesion after 90 days. Our aim is to predict the final stroke lesion location and extent, taking into account the underlying cerebral blood flow dynamics that can influence the prediction. To achieve this, we propose a two-branch Restricted Boltzmann Machine, which provides specialized data-driven features from different sets of standard parametric Magnetic Resonance Imaging maps. These data-driven feature maps are then combined with the parametric Magnetic Resonance Imaging maps, and fed to a Convolutional and Recurrent Neural Network architecture. We evaluated our proposal on the publicly available ISLES 2017 testing dataset, reaching a Dice score of 0.38, Hausdorff Distance of 29.21 mm, and Average Symmetric Surface Distance of 5.52 mm.


Adriano Pinto a,b,∗, Sérgio Pereira a,b, Raphael Meier d, Roland Wiest d, Victor Alves b, Mauricio Reyes c, Carlos A. Silva a
a Center MEMS of University of Minho, Campus of Azurém, 4800-058 Guimarães, Portugal
b Center Algoritmi, University of Minho, Braga, Portugal
c Healthcare Imaging A.I., Insel Data Science Center, Bern University Hospital, Switzerland
d Support Center for Advanced Neuroimaging, University Institute for Diagnostic and Interventional Neuroradiology, Bern University Hospital, Switzerland


Keywords:

Deep Learning, Image Prediction, Magnetic Resonance Imaging, Stroke

1. Introduction

Stroke is the second leading cause of death worldwide (World Health Organization et al., 2014), being classified in two types: ischaemic and haemorrhagic (Grysiewicz et al., 2008). Ischaemic stroke is the most common type, resulting from an occlusion of a vessel, which can be caused by thrombosis, haemodynamic factors, or embolic causes (Grysiewicz et al., 2008). Due to vessel occlusion, the insufficient supply of oxygenated blood to brain cells leads to hypoperfused brain tissue, triggering cellular mechanisms to preserve the integrity of the cell. The hypoperfused area consists of tissue at risk that can be salvaged, designated the penumbra. As time passes, in the absence of flow restoration or sufficient collateral blood flow supply, the hypoperfused tissue eventually reaches a non-salvageable state designated core or infarct tissue (Memezawa et al., 1992).

∗ Corresponding author: Department of Industrial Electronics, Campus Azurém, Guimarães, Portugal. Email addresses: [email protected] (Adriano Pinto), [email protected] (Carlos A. Silva)
Preprint submitted to Elsevier, January 5, 2021

Diagnosis and treatment of ischaemic stroke rely on neuroimaging acquisitions, where Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) are the preferred imaging modalities (Gonzalez et al., 2007). CT imaging remains the most used acquisition due to its rapidity and availability (Gonzalez et al., 2007). However, multi-parametric MRI provides a higher sensitivity in detecting early ischaemic stroke and assessing the penumbra region (Gonzalez et al., 2007). Treatment consists of restoring tissue perfusion levels, also known as reperfusion, by performing mechanical thrombectomy or thrombolysis. Since ischaemic stroke is a dynamic process that evolves over time, treatment is only possible up to 24 hours, while viable neurones still persist (El Tawil and Muir, 2017; Zivelonghi and Tamburin, 2018). Therefore, expert physicians must evaluate the benefits and risks of mechanical thrombectomy before an intervention, since it may cause haemorrhage, vascular injury, and other complications (Powers et al., 2018).
If performed, the success of the intervention is assessed radiologically via angiography imaging and scored by a qualitative expert-generated scale designated the standardized Thrombolysis in Cerebral Infarction (TICI) scale (Higashida et al., 2003). During the decision-making process, the physician needs to assess the nature and location of the lesion alongside pathophysiological factors such as age, presence of comorbidities, and collateral circulation (Liebeskind, 2003). The latter is of utmost importance in ischaemic stroke. The presence of collateral circulation, where a secondary network of vessels is responsible for supplying cerebral blood flow to the lesioned tissue, increases the chances of a successful reperfusion (Liebeskind, 2003). Assessing the potential efficacy of treatment can be time-consuming and prone to inter- and intra-rater variability among physicians, which is further aggravated in a clinical emergency environment (Coutts et al., 2003). Moreover, since time is critical, MRI acquisitions are optimized for speed, which is accomplished by reducing the resolution (González et al., 2011), making the prediction of the final stroke lesion an intricate task. Thus, automatic prediction of a stroke lesion at a given time since stroke has great potential to guide physicians in this time-critical decision-making process.

We propose a novel automatic method based on unsupervised and supervised deep learning. We utilize Restricted Boltzmann Machines (RBMs) to jointly characterise the lesion and blood flow information through a two-pathway architecture, trained with two subsets of standard parametric MRI maps. One subset encompasses the Time-To-Peak (TTP), Mean Transit Time (MTT), Time-to-Maximum (Tmax), and Apparent Diffusion Coefficient (ADC). The second subset contains the ADC, the relative Cerebral Blood Volume (rCBV), and the relative Cerebral Blood Flow (rCBF).
In a second stage, the feature maps computed by the RBMs are combined with the standard parametric MRI maps to form the input of a supervised deep learning architecture composed of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). The proposed architecture was evaluated using the publicly available ISLES 2017 dataset.

In acute ischaemic stroke, the clinical evaluation of the standard parametric maps (e.g. ADC and Tmax) can identify infarct tissue and tissue that will infarct in the absence of therapeutic intervention. In this analysis, the infarct tissue is identified by the hypointense regions of the ADC map, which characterize tissue with limited diffusion (Butcher and Emery, 2010a). Hypoperfused tissue, i.e. tissue that will infarct, is identified by hyperintense regions of the Tmax map, indicating an increased arrival time of contrast agent (Butcher and Emery, 2010b). However, to correctly predict the final ischaemic stroke lesion, besides considering the complex time-evolving transformation of hypoperfused tissue into infarcted tissue, it is also necessary to appraise the impact of the clinical intervention, thrombectomy, on the underlying brain perfusion and diffusion.

A successful thrombectomy should restore the perfusion levels, recovering the hypoperfused tissue. However, several factors may affect the reperfusion, limiting the degree of success of the intervention. To better understand the nuances of the clinical intervention, consider the cases presented in Figure 1. In the first case, Figure 1a, the ADC does not present any hypointense region, so no infarct tissue can be identified, and we should expect a complete recovery of the hypoperfused tissue indicated by Tmax; however, the follow-up delineation obtained after thrombectomy presents a large final lesion, which is explained by an unsuccessful intervention.
In the second case, Figure 1b, we observe a final infarct lesion that is smaller than the hypointense region present in the ADC (Figure 1b arrow). This indicates reversible diffusion restriction, which is a rare case (Labeyrie et al., 2012) and was only possible to identify through a follow-up T2-weighted acquisition. So, an automatic method for predicting the final stroke lesion has not only to capture the time-evolving process of diffusion and perfusion, but also to consider, directly or indirectly, the degree of success of the thrombectomy, which may condition the final lesion either to be confined to the hypointense region of the ADC map, or to grow into brain tissue areas that are hyperintense in the Tmax. Due to the time-evolving process of diffusion and perfusion, the complexity of predicting the lesion increases as we move from a target window of some days to several months.

Figure 1: ADC and Tmax parametric maps of two patient cases from the ISLES 2017 training set, and the final lesion delineated at a 90-day follow-up, overlaid on the onset ADC: patient 0036 (Figure 1a) with an unsuccessful reperfusion, and patient 0006 (Figure 1b), where the clinical intervention was successful.

The complexity of the evaluation process may also be observed in the inter-rater agreement of expert radiologists in the ISLES 2017 dataset, which obtained a Dice score of 0.[…] ± 0.20 on delineating the lesion using a 90-day follow-up T2-weighted acquisition (Winzeck et al., 2018).

Contrary to stroke lesion segmentation, where several methods have already been proposed (Rekik et al., 2012; Maier et al., 2017), the complexity of predicting the final stroke lesion has only recently attracted attention in the medical imaging community. For predicting the final stroke lesion, several methods have been proposed based on multivariate linear regression models (Scalzo et al., 2012; Rose et al., 2001; Kemmling et al., 2015), decision trees (McKinley et al., 2017; Bauer et al., 2014), and CNNs (Choi et al., 2016). Furthermore, with the release of the Ischaemic Stroke LEsion Segmentation (ISLES) Challenge in 2016 and 2017, new methods have been proposed. These aim to predict at a 90-day time-window.

Rose et al. (2001) proposed a two-stage method based on parametric perfusion and diffusion MRI maps. In the first stage, the method defines a region of interest (ROI) based on the intensity signal of the standard parametric maps, the MTT, Cerebral Blood Flow (CBF), Cerebral Blood Volume (CBV), and Diffusion-Weighted Imaging (DWI). The second stage performs stroke tissue prediction using Gaussian mixture models trained on different sets of parametric maps. Bauer et al. (2014) used Random Forests to segment or predict the final stroke lesion depending on whether acute stroke imaging or three-month follow-up imaging was available, respectively. McKinley et al. (2017) also used a two-stage classification approach, as in Rose et al. (2001), for lesion characterisation and lesion prediction, where each stage consists of two sets of Random Forest (RF) classifiers. The first stage aims to define a ROI that encompasses the hypoperfused region. In the first set, each classifier is trained with features extracted from different sets of MRI parametric maps. Having defined the location and extent of the lesion, a second set of two RFs performs stroke tissue prediction. Such classifiers were trained on different sets of patients, stratified by the TICI score.
One classifier is trained on patients with unsuccessful reperfusion interventions, whereas a second classifier is trained on patients with successful reperfusion. The final prediction is obtained by combining the results of both classifiers, using a logistic regression model. Scalzo et al. (2012) proposed a framework for stroke tissue prediction, which characterises the state of the lesion four days after clinical intervention (thrombectomy). From the Fluid Attenuation Inversion Recovery (FLAIR) MRI sequence, and the ADC and Tmax MRI maps, the method applies a regression model that learns the behaviour of neighbouring voxels within a cuboid. Kemmling et al. (2015) proposed a multi-modality approach based on CT and MRI maps with non-imaging clinical meta-data, namely the TICI score and the time to treatment of each patient, to perform stroke tissue prediction.

In another line of research, authors have investigated the use of deep learning (Choi et al., 2016; Nielsen et al., 2018; Robben et al., 2020) for stroke tissue prediction. Choi et al. (2016), the winning approach at the ISLES 2016 Challenge, proposed an ensemble of twelve CNN architectures, grouped into two sets of networks. The first group comprises four 3D U-Nets (Ronneberger et al., 2015) performing voxel-wise tissue prediction. The second group of networks uses two-pathway Fully Connected Networks (FCNs) performing two types of patch-wise classification. One set of FCNs classifies a patch as lesion if it includes any lesion voxel. The other set of FCNs classifies a patch as lesion if the central voxel is a lesion voxel. After merging the two-pathway FCNs, the method incorporates meta-data by adding a dense layer of clinical predictors merged with the imaging output of each network. The final stroke lesion prediction results from a weighted merging of all models. Mok and Chung (2017) applied deep adversarial training for stroke tissue prediction in an ensemble of U-Nets.
Monteiro and Oliveira (2017) proposed a method based on the V-Net architecture (Milletari et al., 2016). The training was conducted with a custom loss function that applies a weighted sum of Dice score and cross entropy. Lucas and Heinrich (2017) proposed the use of a U-Net architecture, which combines patches from the MRI maps in the same slice with patches from 3 neighbouring slices and 2 hemispheric flips. In the expanding path of the U-Net, each level computes a Dice loss for the healthy tissue and for the final lesion, after the softmax activation. Afterwards, all losses are summed, with the losses of the lesion and healthy tissue weighted according to a prior probability (Winzeck et al., 2018). Robben and Suetens (2017) employed a CNN-based architecture inspired by Kamnitsas et al. (2017). The authors proposed combining the MRI inputs with clinical meta-data, before feeding them to each branch of a two-pathway 3D network. In the first branch the input is kept at the original resolution, while in the second branch the input resolution is lowered by a factor of 3. The output of each branch is transformed to the same scale and merged by two fully connected layers. The network is trained with four different sets of hyper-parameters. These four networks are used as an ensemble, whose prediction is obtained by averaging the output of each one. Similarly, Niu et al. (2018) used multiple scales of overlapping 3D patches to capture local and global spatial information. In the review paper of Winzeck et al. (2018), Rivera et al. also built on the work of Kamnitsas et al. (2017) and Milletari et al. (2016), proposing a scheme to extract different patch resolutions, independent of each other, that are fed into four different paths. Afterwards, a fully connected layer combines all the outputs to perform stroke tissue prediction. Pisov et al. (2017) employed an ensemble strategy by combining different CNN-based architectures to overcome the strong anisotropy of the data. As summarized by Winzeck et al. (2018), Yoon et al. proposed a two-stage gated CNN. In the first stage, the authors perform lesion detection and delineation. Afterwards, based on the probability maps of the first stage, a second CNN architecture processes the regions where the probability maps of healthy tissue and lesion are close to each other. Pinto et al. (2018b) made use of temporal perfusion imaging, the Dynamic Susceptibility Contrast-MRI, in a U-Net architecture. This architecture aims to temporally process and extract deep features, which are then combined with a second feature step of another U-Net network, trained on the standard parametric maps. Using a large CT dataset, Robben et al. (2020) predicted the final infarct stroke lesion with a temporal window ranging from 24 hours to 5 days. The authors considered spatio-temporal CT perfusion as input to a deep neural network inspired by the architecture proposed by Kamnitsas et al. (2017). Additionally, the model combines CT neuroimaging with clinical meta-data. Nielsen et al. (2018) proposed a method based on the SegNet architecture (Badrinarayanan et al., 2015), predicting on a 30-day follow-up acquisition based on a private dataset.

Principal and collateral blood flow has been considered either directly by modelling the temporal perfusion imaging (Pinto et al., 2018b), or indirectly by perfusion and diffusion parametric maps (Choi et al., 2016; Maier et al., 2017; Scalzo et al., 2012), or through clinical information that characterises the success of the revascularization (McKinley et al., 2017). We hypothesize that modelling the haemodynamics of the brain when artery occlusion occurs can be beneficial for predicting the final stroke lesion. So, in this work, we investigate the representation of the haemodynamics through an unsupervised learning model.
Contrary to previous approaches, we propose grouping the input maps according to their underlying physical meaning and encoding each group separately with an RBM. As groups, we investigated the time-resolved perfusion maps (Tmax, TTP, MTT), and the blood-flow-dynamic related maps (rCBF, rCBV) (Butcher and Emery, 2010a,b). Our proposal of combining features obtained in an unsupervised and a supervised manner was motivated by the knowledge that unsupervised models learn structural features of the original image, while supervised models learn features conditioned on the label, so there is potential for obtaining richer and more discriminative features by joining both types of models.

This work presents an automatic approach for predicting the final stroke lesion, using onset neuroimaging data. The main contributions are:
- The use of unsupervised models for extracting structural features of time-resolved perfusion and blood-flow-dynamic related MRI maps for predicting the stroke lesion.
- The use of local and long spatial context provided by gated recurrent neural networks for relating structural features and image information when learning features conditioned on the label in a supervised model.
- The proposal of a competitive system which outperforms state-of-the-art methods to predict the final infarct stroke lesion, in the ISLES 2017 Challenge dataset.

The remainder of the paper is organized as follows. Section 2 describes the fundamental components of the proposed method. Section 3 describes the dataset, the evaluation procedure and the setup. The results and the discussion are addressed in Section 4. Finally, in Section 5 we present the main conclusions.

2. Methods

In this work, predicting the final infarct stroke lesion consists of delineating the lesion's spatial extent at a 90-day follow-up time-point, using multi-parametric MRI, namely the ADC, MTT, TTP, Tmax, rCBF, and rCBV maps, which are acquired at the onset time-point. The architecture of the proposed system and its main components are described in the following subsections.

The overall architecture of the proposed method can be divided into two functional blocks, as shown in Figure 2.

The first functional block performs unsupervised representation learning using two unsupervised models, namely RBMs. This unsupervised block provides new features that represent structural information that complements the standard parametric MRI maps, enhancing the capacity of our model to predict the final infarct lesion. In our approach, we aim to model the clinical procedure, which first locates and delineates the lesion at the current time, and then considers the blood flow haemodynamics that might influence the final stroke lesion prediction. This procedure is encoded in our two-path RBM. The first RBM, referred to as RBM_Lesion, is responsible for capturing information on lesion location and extent. The second RBM, RBM_Haemo, aims to capture blood flow haemodynamics information (e.g. collateral circulation), which has been identified as a key factor by physicians when assessing the final infarct lesion in clinical reports (Berkhemer et al., 2016; Menon et al., 2015). On one hand, to locate the onset ischaemic stroke lesion, RBM_Lesion considers standard parametric maps that characterise the arrival times and mean transit times of the contrast agent. In the presence of an ischaemic lesion, the occluded vessel can decrease or interrupt the normal brain perfusion, translating into hyperintense regions on time-related parametric maps (Butcher and Emery, 2010b). On the other hand, RBM_Haemo considers standard parametric maps that characterise the amount of blood being delivered per unit of time, which correlates with the cerebral blood flow haemodynamics (Butcher and Emery, 2010b). Thus, RBM_Lesion considers the MTT, TTP and Tmax perfusion maps, while RBM_Haemo considers the rCBV and rCBF perfusion maps. The ADC standard diffusion map is present in both RBM_Lesion and RBM_Haemo, since it provides higher brain structural information and allows the identification of tissue that is already infarcted. This separation of the input imaging allows each RBM to learn specific feature sets, which may enable the method to analyse difficult cases where information concerning the blood flow can have a favourable impact on the lesion prediction.

Figure 2: Overview of the proposed method for predicting the final stroke lesion. In the supervised learning block, the input data dimensions are defined for each operation.

The second functional block consists of a deep learning architecture that comprises 2D convolutional blocks in a U-Net-based structure, alongside recurrent blocks. As imaging input data, we combine the standard parametric maps with feature maps from each RBM, totalling 18 input feature maps.

The RBM is an undirected graphical model constituted by two layers of nodes: a visible layer and a hidden layer (Rumelhart and McClelland, 1986). Each node has a weighted connection to all nodes in the other layer (Rumelhart and McClelland, 1986). However, there are no connections among nodes of the same layer. Originally, Rumelhart and McClelland (1986) proposed the RBM to learn from binary data on both layers. However, this does not represent continuous real-valued input data well, which is the case of MRI data. Therefore, we model the visible nodes as linear units with independent Gaussian noise. The hidden nodes are modelled as Noisy Rectified Linear Units (NReLU), since they have been reported to be suitable for feature extraction (Hinton, 2012). This kind of RBM was previously used in segmentation tasks, such as in Pereira et al. (2019). Mapping the input data into a feature vector is performed through the interaction of states between the visible and hidden units, which is learned by minimizing an energy function.

The complete pipeline of the unsupervised block is shown in Figure 3 and detailed in Section 3.4. RBM_Lesion and RBM_Haemo function as feature generators that output two complementary sets of feature maps, $\mathcal{N}_1$ and $\mathcal{N}_2$. These features characterise the structure of the images; however, we are interested only in the most distinctive details. So, after training the RBMs, we perform feature selection to reduce the generated feature space, obtaining smaller but representative feature sets $\mathcal{M}_1$ and $\mathcal{M}_2$, such that $|\mathcal{M}_i| \ll |\mathcal{N}_i|$ for $i \in [1, 2]$, where $|\cdot|$ denotes the cardinality of a set. In the feature selection, we would like to select the features from the RBM that encode the MRI maps, but that also correlate with the stroke prediction. Since the RBM is an unsupervised method, we compute the Normalized Mutual Information to quantify the statistical dependence between each generated feature and the respective input MRI map, as defined by Equation 1 (Vinh et al., 2010):

$$\mathrm{NMI}_{\mathrm{sum}}(\mathrm{MRI}_x, \mathrm{Feat}_y) = \frac{\mathrm{MI}(\mathrm{MRI}_x, \mathrm{Feat}_y)}{H(\mathrm{MRI}_x) + H(\mathrm{Feat}_y)}, \tag{1}$$

where $\mathrm{MI}(\cdot)$ is the mutual information between an MRI parametric map, $\mathrm{MRI}_x$, and an output feature, $\mathrm{Feat}_y$; $H(\cdot)$ defines the entropy of a map, namely, $\mathrm{MRI}_x$ and $\mathrm{Feat}_y$. To relate the features of the RBM with the class label, we could use a classifier trained in a supervised manner. Since the neural network is trained iteratively, we use an RF classifier trained with the Mean Decrease Impurity (MDI) as a surrogate to make the feature selection tractable. Afterwards, we compute MI_RBM and MDI_RF, normalize MI_RBM by its maximum value, add both ranks and sort in decreasing order. The best set will be the first $|\mathcal{M}_i|$ features. Our selection method was inspired by the work of Pereira et al. (2018); however, their method cannot be directly applied, since it would generate too many features for our problem.

Figure 3: Overview of the proposed unsupervised learning block. For each RBM of the unsupervised learning block, the selected features were $|\mathcal{M}_1| = |\mathcal{M}_2| =$ […].

Our supervised functional block is based on the U-Net architecture as proposed by Ronneberger et al.
(2015). The input of the U-Net is the concatenation of the standard parametric maps with the sets of feature maps extracted from the unsupervised block. In the first level of our encoder architecture we use four 2D convolutional blocks with kernel size of 3 × […] (e.g. time-series). To be applicable to 2D data, we developed an online 2D Partition layer that transforms a grid-structured input (e.g. an image) into a one-dimensional sequence. Inspired by Visin et al. (2016), the 2D Partition layer was predefined with a neighbourhood of 2 × 2, where each time-step is characterised by a feature space of four voxels. Afterwards, two Bidirectional LSTM layers are employed along the left-right and frontal-dorsal directions, followed by an up-sampling layer. These four layers, referred to as the Gated Recurrent block, are shown in Figure 2. In our supervised functional block, two Gated Recurrent blocks were used, where the Bidirectional LSTMs have 64 and 32 hidden layers, respectively. The impact of the main components is evaluated in an ablation study in the experiments.
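The feature-selection step of the unsupervised block can be illustrated with a minimal sketch, under several assumptions: the maps are already discretised into integer bins and flattened into lists, the normalised mutual information is computed as Equation 1 is printed (mutual information divided by the sum of the two entropies), and "add both ranks" is read as summing the max-normalised NMI with the MDI score. The function names and the toy MDI values are hypothetical; the latter stand in for the importances a trained RF would provide. This is not the authors' implementation.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in nats) of a discrete sequence."""
    n = len(values)
    return -sum((c / n) * math.log(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """MI(X, Y) = H(X) + H(Y) - H(X, Y) for two aligned discrete sequences."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def nmi_sum(x, y):
    """Equation 1 as printed: MI(X, Y) / (H(X) + H(Y))."""
    denom = entropy(x) + entropy(y)
    return mutual_information(x, y) / denom if denom > 0 else 0.0


def rank_features(nmi_scores, mdi_scores, m):
    """Normalise NMI by its maximum, add the MDI score, keep the top m indices."""
    top = max(nmi_scores)
    scale = top if top > 0 else 1.0
    combined = [n / scale + d for n, d in zip(nmi_scores, mdi_scores)]
    order = sorted(range(len(combined)), key=lambda i: combined[i], reverse=True)
    return order[:m]


# Toy data: one discretised "MRI map" and three candidate feature maps.
mri = [0, 0, 1, 1, 2, 2, 3, 3]
features = [
    [0, 0, 1, 1, 2, 2, 3, 3],  # identical to the input map
    [0, 0, 0, 0, 1, 1, 1, 1],  # coarser version of the input map
    [0, 1, 0, 1, 0, 1, 0, 1],  # statistically independent of the input map
]
nmi = [nmi_sum(mri, f) for f in features]
mdi = [0.5, 0.3, 0.2]  # hypothetical RF importances (assumption)
selected = rank_features(nmi, mdi, m=2)
```

In this toy case the identical and the coarser feature maps are retained, while the independent one is discarded because its mutual information with the input map is zero.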
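As an illustration of the 2D Partition layer, the sketch below turns an image into a sequence in which each time-step holds the four values of one 2 × 2 neighbourhood. It assumes non-overlapping blocks scanned in row-major order, which the text does not state; `partition_2d` is a hypothetical helper written in plain Python, not the layer used in this work.

```python
def partition_2d(image, size=2):
    """Split a 2D grid into non-overlapping size x size blocks.

    Returns a list of time-steps; each time-step is the flattened block
    (size * size values), with blocks visited in row-major order.
    """
    rows, cols = len(image), len(image[0])
    if rows % size or cols % size:
        raise ValueError("image dimensions must be divisible by the block size")
    sequence = []
    for r in range(0, rows, size):
        for c in range(0, cols, size):
            block = [image[r + dr][c + dc]
                     for dr in range(size)
                     for dc in range(size)]
            sequence.append(block)
    return sequence


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
steps = partition_2d(image)  # 4 time-steps, each with 4 features
```

A sequence of this form is what a recurrent layer such as a Bidirectional LSTM would consume, one neighbourhood per time-step.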

3. Experimental Setup

We evaluated the proposed approach on the publicly available ISLES 2017 dataset and on a private dataset. ISLES has an online benchmark platform (Kistler et al., 2013) that performs automatic evaluation (SMIR Online Platform, 2017). In this section we describe the datasets, the training and evaluation, and the main hyper-parameters of our method.

The ISLES 2017 dataset encompasses 75 ischaemic stroke patients, which are separated into two sets: training (n = 43) and testing (n = […]). […] diffusion ADC map, and perfusion rCBF, rCBV, TTP, MTT and Tmax maps. In addition to the standard parametric maps, each case is also characterised by a manual delineation of the lesion. This refers to the 90-day stroke lesion delineated with access to the follow-up T2-weighted acquisition. However, the manual delineation is only available for the training set, while the follow-up T2-weighted imaging is not disclosed for any set. All parametric MRI maps are already co-registered and skull-stripped (Winzeck et al., 2018). Figure 4 (top row) shows an example of the MRI maps of a patient, alongside the manual lesion delineation, the Ground Truth (GT).

The private dataset comprises 23 acute ischaemic stroke patients that underwent clinical therapy, acquired at Bern University Hospital in Switzerland. As in ISLES 2017, each patient is characterized by the same six parametric maps, with the final lesion manually delineated on the 90-day follow-up T2. The parametric maps were co-registered, followed by skull-stripping with FSL BET2 on the co-registered follow-up T2 image (Jenkinson et al., 2012).

We evaluated our proposal with five metrics, which are the same ones computed by the online ISLES 2017 benchmark platform: Dice Similarity Score, Hausdorff Distance (HD), Average Symmetric Surface Distance (ASSD), Precision, and Recall (Kistler et al., 2013). The Dice score measures the spatial overlap between two volumes. HD corresponds to the largest distance between surface points of different volumes, which characterises spatial outliers in the prediction. ASSD quantifies the average distance between the volumes' surfaces. Precision quantifies the proportion of correctly classified cases within a class, while Recall corresponds to the proportion of positive cases correctly identified as such.

Since the MRI acquisitions were acquired at different centers and with different configurations (Winzeck et al., 2018), for each patient we resized all maps to a common volume of dimension 256 × […] × 32. Afterwards, the ADC maps were clipped to [0, […] × 10^(−[…])] mm²/s and the Tmax maps were clipped to [0, […]] s, since values beyond these ranges are known to be biologically meaningless (McKinley et al., 2017). Finally, a linear scaling was applied across all maps, to the range [0, […]].

Data augmentation can be used to increase the number of training samples and reduce over-fitting (Krizhevsky et al., 2012). Due to the relatively small size of the training dataset, we employed artificial data augmentation in the supervised portion of our proposal. For each sample, we applied rotations of 90°, 180°, and 270°.

The unsupervised functional block was trained by minimizing the negative log-likelihood of the data. However, since computing the gradient is generally intractable, we performed the training by approximating the gradient with Contrastive Divergence with one step of alternating Gibbs sampling (Hinton, 2012). The training process of an RBM can be difficult if one tries to learn the parameter σ_i of the energy function, which corresponds to the standard deviation of the Gaussian noise of a visible node i (Hinton, 2012). Following Hinton (2012), we normalize each component of the data to zero mean and unit variance, and define σ_i = 1. In Table 1, we present the settings used for training the unsupervised model.

For training each RBM, we randomly extract 3D patches of shape 7 × […] C. Then, the 3D patches are reshaped into a 1D vector and fed into the visible layer of the RBM, having an input of size m = […] × […] × […] × |C|, as shown in Figure 3. After training, we extract features from the noise-free activations of the NReLU units. These units exhibit intensity equivariance when the bias is zero and the units are noise-free (Nair and Hinton, 2010). Due to the large number of extracted feature maps (|N_1| = |N_2| = […]) […] |M_1| = |M_2| = 6. The most appropriate cardinality of M is discussed in Section 4.1.1.

Table 1: Model training parameters for the unsupervised and supervised functional blocks.

Functional Block Parameter DescriptionUnsupervised Optimizer SGD with momentum ( lr = × − )Patch shape 7 × × lr = × − )Patch shape 84 × Supervised functional block.

As for the supervised func-tional block, the complete settings of the training aregiven in Table 1. For each subject, 350 patches were ran-domly sampled. The training comprehended 36 subjects,while the remaining 7 subjects were used for validation.The settings were optimized through cross-validation in aprevious work (Pinto et al., 2018a). For training, we usedsoft Dice loss function (Milletari et al., 2016). It is deﬁnedas: Soft Dice loss = (cid:80) | V | i p i g i (cid:80) | V | i p i + (cid:80) | V | i g i . (2)In the soft dice loss, the sum occurs over the set V of voxels belonging to the predicted output patch, where p i ∈ P denotes the probability of a voxel i in the outputpatch and g i ∈ G corresponds to the respective ground-truth label voxel.The method was implemented using Keras with Ten-sorﬂow backend, in a workstation equipped with a GTX1080 Ti 11 GB. Prediction time takes around 20s per pa-tient.
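As a minimal sketch of the soft Dice loss described above, the snippet below evaluates it on a flattened patch in plain Python. The factor 2 in the numerator, the `1 -` that turns the overlap into a quantity to minimise, and the smoothing term `eps` are common conventions assumed here rather than details confirmed by Eq. (2); the function names are illustrative. In a Keras/TensorFlow training loop the same expression would be written with tensor operations.

```python
from typing import Sequence


def soft_dice(p: Sequence[float], g: Sequence[float], eps: float = 1e-7) -> float:
    """Soft Dice overlap between predicted probabilities p and labels g.

    Assumption: uses the common 2 * sum(p*g) / (sum(p) + sum(g)) form;
    eps only guards against division by zero on empty patches.
    """
    if len(p) != len(g):
        raise ValueError("p and g must cover the same voxels")
    intersection = sum(pi * gi for pi, gi in zip(p, g))
    return (2.0 * intersection + eps) / (sum(p) + sum(g) + eps)


def soft_dice_loss(p: Sequence[float], g: Sequence[float], eps: float = 1e-7) -> float:
    """Loss to minimise: 1 - soft Dice (assumed convention)."""
    return 1.0 - soft_dice(p, g, eps)


# Toy patch of four voxels, two of which are lesion in the ground truth.
probs = [0.9, 0.8, 0.1, 0.0]
labels = [1, 1, 0, 0]
print(round(soft_dice_loss(probs, labels), 4))
```

Because the loss is computed from probabilities rather than thresholded labels, it stays differentiable, which is what allows it to be used directly as a training objective.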

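The rotation-based augmentation used for the supervised block can be illustrated with a short sketch. It assumes that the rotations are applied in-plane to 2D patches and that the ground-truth patch is rotated together with the input maps; the text only states the angles, so both points, and the helper names, are assumptions.

```python
from typing import List, Tuple

Patch = List[List[float]]  # one 2D patch, indexed [row][col]


def rot90(patch: Patch) -> Patch:
    """Rotate a 2D patch by 90 degrees counter-clockwise."""
    # Transpose the patch, then reverse the row order.
    return [list(row) for row in zip(*patch)][::-1]


def augment(patch: Patch, label: Patch) -> List[Tuple[Patch, Patch]]:
    """Return the original (patch, label) pair plus its 90/180/270 degree rotations."""
    samples = [(patch, label)]
    for _ in range(3):
        # Rotate input and label together so they stay aligned.
        patch, label = rot90(patch), rot90(label)
        samples.append((patch, label))
    return samples


x = [[1, 2], [3, 4]]
y = [[0, 1], [0, 0]]
print(len(augment(x, y)))  # one original plus three rotated copies
```

Each sampled patch therefore contributes four training samples under these assumptions.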
4. Results and Discussion

In this section, we discuss the impact of the main contributions, namely, the combination of unsupervised with supervised learning and the Gated Recurrent blocks. Then, we compare our method with the state of the art in the ISLES 2017 Challenge. Finally, we examine the difficulty of predicting the final infarct stroke lesion.

The ablation study aims to gradually measure the importance of the main components and, consequently, to assess the contribution of each component to the overall performance. Thus, we start by evaluating the importance of the unsupervised feature generator and the proposed input grouping. Afterwards, we focus on the use of the Gated Recurrent block and the choice of the dimensionality of the spatial context.

We hypothesize that grouping the parametric MRI maps according to their physical meaning and encoding each group with an RBM has the potential to extract better features to characterise the stroke lesion and the blood haemodynamics. We performed several experiments to corroborate this working hypothesis. In all experiments, the [...]

Additional details of the settings and model training are provided in the supplementary material. Also, the source code for reproducing the segmentations, the models' weights and the segmentations can be found at: https://github.com/apinto92/stroke_prediction.git.

Grouping all parametric MRI maps in a single group.

We considered, first, the effect of encoding all parametric maps using a single RBM. We varied the number of selected features from the RBM, observing that in all cases the average Dice score is equal to or lower than that obtained using only the parametric maps as input to the supervised block. Also, using 12 features yielded the lowest average Dice score. Using 3 or 6 features obtained the same average Dice score, with the latter having a lower average Hausdorff distance. So, based on the metrics, we may conclude that there is no clear gain in using the features generated by the RBM, at least when we encode all the parametric maps with a single RBM.

Since the selection of 6 features also includes the previous top 3 features, we compared the normalized mutual information between them. As shown in Figure 4, the top 3 features have low values of normalized mutual information in relation to the additional 3 features, which indicates that the latter carry additional information. For this reason, we chose 6 as the number of features in the subsequent experiments.

Grouping parametric MRI maps according to the underlying physical meaning.

In this experiment, we grouped the parametric maps according to their underlying physical meaning, together with the ADC map in each group. Each group was encoded with an RBM. Comparing the use of each group of features in isolation, we verify that RBM_Lesion had a higher average Dice score compared to using only the parametric maps as input to the supervised block. The increase in the average Dice score was obtained through a higher average Recall. Also, we observe an improvement in all distance metrics. The experiment using RBM_Haemo presented the lowest average Dice and Recall, as well as higher average distance metrics. However, RBM_Haemo presented a higher average Precision, contrary to RBM_Lesion, which motivated the study of the combination of features from RBM_Lesion with RBM_Haemo, besides the parametric maps. The results of this experiment are presented in Table 2. We may observe that this combination obtained the highest average Dice and Precision, as well as the lowest average distance metrics.

However, this improvement could have originated either from grouping the maps according to a specific common property, the underlying physical meaning of the parametric maps in each group, or from reducing the number of maps from 6 to 3 in each group, since this reduction could have allowed a better training of the RBM. So, we performed two complementary experiments. In the first experiment, we formed two groups of similar size, but we randomly chose the parametric maps to include in each group. In the second experiment (Three-RBMs), we changed the groups of MRI maps encoded in RBM_Lesion and RBM_Haemo by removing the ADC map from each one. These two new groups were encoded in RBM_Lesion/Less and RBM_Haemo/Less, respectively. The ADC map was separately encoded in RBM_ADC. As presented in Table 2, the first experiment presented the lowest average Dice score and higher average distance metrics, while the second experiment attained the second highest average Dice score, thus showing the importance of splitting the parametric perfusion maps and including the ADC map in both RBM_Haemo and RBM_Lesion.

Table 2: Results obtained with different configurations of the unsupervised feature generator block in the ISLES 2017 testing set. Each metric represents the mean ± standard deviation. Underlined values correspond to the highest mean.

Unsupervised Block | Supervised Block: FCN | Supervised Block: G-RNN | Dice | HD | ASSD | Precision | Recall
– | U-Net | LSTM | 0.30 ± [...] | [...] | [...] | [...] | [...]
[...]
RBM_Lesion + RBM_Haemo | U-Net | LSTM | 0.38 ± [...] | [...] | [...] | [...] | [...]
[...]
RBM_Haemo/Less + RBM_Lesion/Less + RBM_ADC | [...]

Figure 4: Onset parametric maps of patient case 0011 in the ISLES 2017 training set, alongside the final stroke lesion, at a 90-day follow-up, over the onset ADC map. The subsequent rows show the RBM features selected from RBM_Lesion, RBM_Haemo and RBM_Single, respectively. The last column shows the normalized mutual information, across the whole dataset, among features of the same RBM.

Considering these experiments together, we may draw some conclusions. First, although CNNs are very effective in generating features from raw data, they can generate even better features if rich and complementary information is provided. A similar conclusion was reached by Oliveira et al. (2018), who observed an improvement when wavelet coefficients were added as input in the problem of retinal vessel segmentation. Here, we observe a similar effect, but using the encoding provided by an RBM trained in an unsupervised manner for the problem of stroke lesion prediction. Second, at least for the problem of stroke lesion prediction, when we have data with different latent factors and we are able to group it according to those factors, there is potential to extract complementary information from each group, whereas mixing them all together can be detrimental.

In medical image segmentation, which is similar to our problem of inferring the extent of the lesion 90 days ahead, the use of a cascade of convolutional layers to elaborate the features is the prevalent practice. However, as discussed previously, Gated-RNN layers are able to capture long-distance spatial relations among input voxels, so we performed some experiments to evaluate their contribution. The results are presented in Table 3.

Analysing Table 3, we verify that when we had only the parametric maps as input to the supervised block, adding an LSTM layer increased the average Precision, but the average Recall decreased, resulting in the same average Dice score. A different behaviour is observed when we added the features computed from the RBMs. In this scenario, we verify that using only CNN layers improved over having just the parametric maps, which came from a higher average Precision.
However, when we add the LSTM, we obtain an even larger improvement, which is observed in a larger increase in the average Precision and a decrease in the average distance metrics.

Based on these experiments, we may conclude that the CNN layers were able to extract additional information from the RBM features; however, at least for the problem of inferring the extent of the lesion months ahead, the long- and short-range spatial relations among input voxels captured by the Gated RNN were critical to reduce the detection of false positives, increasing the average Dice score by 6 percentage points (from 0.32 to 0.38).

MRI images are 3D by nature, so the use of 3D filters would allow capturing more context, which has the potential to provide better predictions. Since 2D filters are confined to a plane, unnatural discontinuous contours may occur along the perpendicular axis. However, as presented previously, the resolution of the MRI images in the ISLES dataset is not equal along all axes, being coarser along the axial axis. So, we studied the effect of the spatial context in our architecture. As we have two blocks, unsupervised and supervised, the effect on each one was evaluated separately. The results are presented in Table 4. Considering the results, we observe that using 2D patches in both blocks yields a lower average Dice score than using only the parametric maps as input (baseline), because the increase in the average Precision was not enough to compensate for the drop in the average Recall. Using 3D patches for both blocks had the same performance as our baseline. However, when we used 3D patches for the RBM but 2D patches for the U-Net block, we improved over our baseline. This is the model with the highest average Dice score without LSTM.
So, we may conclude that, for our architecture, the larger context provided by 3D patches was more effective for encoding features in the unsupervised block, while 2D patches were better suited for encoding features in the supervised U-Net-based block.

Table 3: Results obtained when considering the Gated Recurrent block with and without the unsupervised learning block in the ISLES 2017 testing set. Each metric represents the mean ± standard deviation. Underlined values correspond to the highest mean.

Unsupervised Block | Supervised Block: FCN | Supervised Block: G-RNN | Dice | HD | ASSD | Precision | Recall
– | U-Net | – | 0.30 ± [...] | [...] | [...] | [...] | [...]
[...]
RBM_Lesion + RBM_Haemo [3D] | U-Net | – | 0.32 ± [...] | [...] | [...] | [...] | [...]
[...]

Table 4: Evaluation metrics obtained with different spatial context configurations in the unsupervised and supervised learning blocks in the ISLES 2017 testing set. Each metric represents the mean ± standard deviation. Underlined values correspond to the highest mean.

Unsupervised Block | Supervised Block: FCN | Supervised Block: G-RNN | Dice | HD | ASSD | Precision | Recall
RBM_Lesion + RBM_Haemo [2D] | U-Net [2D] | – | 0.27 ± [...] | [...] | [...] | [...] | [...]
RBM_Lesion + RBM_Haemo [3D] | U-Net [2D] | – | 0.32 ± [...] | [...] | [...] | [...] | [...]
[...]

To further evaluate the generalization capacity of our proposal, we tested it on a private dataset and compared it with the baseline method. Table 5 presents the results obtained by the two methods.

Overall, our proposal was capable of surpassing the baseline model, improving the average Dice, Precision and distance metrics, with statistically significant differences (Wilcoxon signed-rank test with p-value < [...] different acquisition protocols or the differences in the preprocessing step.

The results of published methods for final infarct stroke lesion prediction using the ISLES 2017 testing set (Winzeck et al., 2018), together with our baseline and proposed methods, are presented in Table 6.
The metrics were computed by the online platform, since the ground-truth data, which were manually delineated based on follow-up T2 MRI acquisitions, are not disclosed for public access.

Considering the results, we observe that our baseline is competitive in terms of average Dice, being among the top 3 methods and surpassing the ensemble methods of Pisov et al. (2017) and Robben and Suetens (2017). Our proposed method presented the lowest distance metrics among all methods, especially for the Hausdorff distance. It obtained the second-best average Precision score, being surpassed by Robben and Suetens (2017). These authors proposed the integration of meta-data information, using a two-pathway 3D network in an ensemble; however, our experiments did not indicate any improvement from using 3D patches for the U-Net, at least for our architecture. So, their improvement could have come from a combination of the effect of the ensemble and the meta-data. But we note that their method presented a lower average Recall, which explains their lower average Dice score. Regarding the average Recall score, our method was fourth, but when we consider the top 3 methods, especially Pinto et al. (2018a), we conclude that their Recall was obtained with a much lower average Precision, which means that, to increase the true positive detections, they had to substantially increase the false positives. So, compared with the state of the art, our method presented a better balance between Precision and Recall, which is reflected in a higher average Dice score.

Based on the results, we may conclude that the use of complementary features provided by the RBMs and the use of LSTM for a larger context allowed our baseline to surpass current state-of-the-art methods.

Table 5: Results obtained by our proposal and the baseline method in the private dataset. Each metric is represented by the mean ± standard deviation. Underlined values correspond to the highest mean, while bold values represent statistically significant values (p-value < [...]

Unsupervised Block | Supervised Block: FCN | Supervised Block: G-RNN | Dice | HD | ASSD | Precision | Recall
– | U-Net [2D] | – | 0.32 ± [...] | [...] | [...] | [...] | [...]
RBM_Lesion + RBM_Haemo [3D] | U-Net [2D] | LSTM | [...] | [...] | [...] | [...] | [...]

Table 6: Published methods in the ISLES 2017 testing dataset and our proposal. Each metric is represented by the mean ± standard deviation. Underlined values correspond to the highest mean.

Group | Method | Dice | HD | ASSD | Precision | Recall
Ensemble | Mok et al.* | 0.32 ± [...] | [...] | [...] | [...] | [...]
Ensemble | [...] et al.* | 0.31 ± [...] | [...] | [...] | [...] | [...]
Ensemble | [...] et al.* | 0.27 ± [...] | [...] | [...] | [...] | [...]
Ensemble | [...] et al.* | 0.27 ± [...] | [...] | [...] | [...] | [...]
Single Model | Monteiro et al.* | 0.30 ± [...] | [...] | [...] | [...] | [...]
Single Model | [...] | [...] | [...] | [...] | [...] | [...]
Single Model | [...] et al.* | 0.29 ± [...] | [...] | [...] | [...] | [...]
Single Model | [...] et al.* | 0.28 ± [...] | [...] | [...] | [...] | [...]
Single Model | [...] et al.* | 0.26 ± [...] | [...] | [...] | [...] | [...]
Single Model | [...] et al.* | 0.20 ± [...] | [...] | [...] | [...] | [...]
Single Model | [...] et al.* | 0.19 ± [...] | [...] | [...] | [...] | [...]
Single Model | [...] et al.* | 0.19 ± [...] | [...] | [...] | [...] | [...]
Single Model | [...] et al.* | 0.18 ± [...] | [...] | [...] | [...] | [...]
Single Model | [...] et al.* | 0.17 ± [...] | [...] | [...] | [...] | [...]
[...]

* Methods presented in Winzeck et al. (2018), whose results were retrieved from SMIR Online Platform (2017).
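The link between Precision, Recall and Dice used in this discussion can be made concrete for a single case: with voxel-wise counts of true positives (TP), false positives (FP) and false negatives (FN), the Dice score is the harmonic mean of Precision and Recall. The sketch below checks this on a toy binary mask; the function name and the data are illustrative, not taken from the benchmark. The identity holds per case, so it does not carry over exactly to the averages over cases reported in Table 6.

```python
from typing import Sequence, Tuple


def overlap_metrics(pred: Sequence[int], truth: Sequence[int]) -> Tuple[float, float, float]:
    """Voxel-wise Precision, Recall and Dice for two flat binary masks."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Two empty masks are treated as perfect agreement (a convention of this sketch).
    dice = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 1.0
    return precision, recall, dice


# Toy case: over-segmentation gives a high Recall but a low Precision.
truth = [1, 1, 0, 0, 0, 0, 0, 0]
pred = [1, 1, 1, 1, 1, 1, 0, 0]
precision, recall, dice = overlap_metrics(pred, truth)
harmonic = 2 * precision * recall / (precision + recall)
print(precision, recall, dice, harmonic)
```

In this toy case the prediction covers the whole lesion, yet the false positives pull the Dice score down to the harmonic mean of the two, which is the trade-off discussed above.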

Results from ChallengeR Benchmark.

The SMIR platform of ISLES 2017 provides a weekly benchmark report of the current top-10 methods in the testing set, according to the average Dice score. Some of these methods may not be published and lack a description of their implementation and, for this reason, were not included in the previous discussion.

Figure 5 presents the boxplots of each method considered in the report. We observe that the top-10 methods failed to predict the lesion of one or more cases (lowest outliers), which may indicate the degree of complexity of predicting the infarct stroke lesion 90 days ahead in the ISLES 2017 Challenge dataset. But we verify that our method is the only one with the first quartile above 0.20 in the Dice score.

Figure 5: Boxplot of the top-10 ranking methods ordered by average Dice score in the ISLES 2017 testing set.
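The first-quartile comparison read off the boxplots can be reproduced from per-case Dice scores. The sketch below uses `statistics.quantiles` from the Python standard library; the scores are made-up illustrative values, not results from the benchmark, and plotting software may use a slightly different quartile convention than the `inclusive` method chosen here.

```python
from statistics import quantiles
from typing import Sequence, Tuple


def dice_quartiles(scores: Sequence[float]) -> Tuple[float, float, float]:
    """Return (Q1, median, Q3) of per-case Dice scores."""
    q1, q2, q3 = quantiles(scores, n=4, method="inclusive")
    return q1, q2, q3


# Hypothetical per-case Dice scores for one method (illustration only).
example_scores = [0.0, 0.21, 0.25, 0.33, 0.41, 0.47, 0.52, 0.60, 0.71]
q1, median, q3 = dice_quartiles(example_scores)
print(f"Q1={q1:.2f} median={median:.2f} Q3={q3:.2f}")
```

A first quartile above 0.20 means that at least three quarters of the cases score above that value, even when a few cases fail completely, as the 0.0 entry does here.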

In Figure 6 we have the significance maps of the pairwise significance test with the one-sided Wilcoxon signed-rank test (p-value = [...] podium plot of each method for each case in the testing set, and its ranking. We observe that our proposal is the method that ranked first most often, as well as second and third. Also, when we consider the positions below fourth, our method is in general among those with the lowest counts. Analysing the cases individually, we note two trends: for some cases all methods presented similar performance, while for others we find a large variation from the first to the other methods. The first trend may be found in the most difficult case, where all methods had a Dice score of zero or close to it. In the second trend, we observe that our method is ranked first in most of the cases.

Figure 6: One-sided Wilcoxon signed-rank test in the ISLES 2017 testing set. Statistically significant tests are marked in blue, while red designates statistically non-significant tests.

Based on the results of the benchmark, we may infer that our method is competitive with the current state of the art, presenting the highest average Dice score and the lowest average distance score. Considering the ablation study, this performance was attained through the combination of the extra features obtained by encoding the parametric maps with RBMs, according to their underlying physical meaning, and the elaboration provided by the long context of the LSTM layers.
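The podium analysis amounts to ranking the methods by Dice score within each test case and counting how often each method reaches each position. A minimal sketch follows, assuming that ties are broken by the order in which the methods are listed; the method names and scores are hypothetical and serve only as an illustration.

```python
from collections import Counter
from typing import Dict, List


def podium_counts(dice: Dict[str, List[float]]) -> Dict[str, Counter]:
    """Count, per method, how many cases it finished at each rank (1 = best).

    `dice` maps a method name to its per-case Dice scores; all lists must
    cover the same cases in the same order.
    """
    methods = list(dice)
    n_cases = len(dice[methods[0]])
    counts = {m: Counter() for m in methods}
    for case in range(n_cases):
        # Sort methods by decreasing Dice for this case. The sort is stable,
        # so tied methods keep their listing order (an assumption of this sketch).
        ranked = sorted(methods, key=lambda m: dice[m][case], reverse=True)
        for rank, method in enumerate(ranked, start=1):
            counts[method][rank] += 1
    return counts


# Hypothetical scores for three methods on four cases (illustration only).
scores = {
    "method_a": [0.40, 0.10, 0.55, 0.00],
    "method_b": [0.35, 0.30, 0.50, 0.00],
    "method_c": [0.20, 0.25, 0.60, 0.00],
}
for name, ranks in podium_counts(scores).items():
    print(name, dict(sorted(ranks.items())))
```

The last hypothetical case, where every method scores zero, mirrors the first trend noted above: the ranking there is decided only by the tie-breaking rule and carries no information about the methods.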

5. Conclusions

In this work, we present a deep learning approach for predicting the final stroke lesion, based on unsupervised and supervised learning. We proposed to group the input maps according to the underlying physical principle behind their creation, namely, the time-resolved perfusion maps (Tmax, TTP, MTT) and the blood-flow-dynamics-related maps (rCBF, rCBV). Each group was encoded using an unsupervised model to obtain structural features specific to its underlying physical principle. These structural features, together with the standard parametric maps, were fed to a supervised model to learn features conditioned on the label, which in our problem means conditioning on the result of the medical intervention, that is, the lesion at the 90-day follow-up. We also investigated the use of Gated [...] different revascularization scenarios. So, as future work, we aim to study how such meta-data (i.e. TICI score) could be incorporated in our architecture, to consolidate the impact of the clinical intervention and to further improve the 90-day lesion prediction.

Figure 7: Podium plot of each testing case in ISLES 2017. For each ISLES 2017 testing subject, defined by a coloured line with circles, the podium plot orders decreasingly the Dice score obtained by each of the top 10 methods, which are represented by coloured circles.

Acknowledgement

Adriano Pinto was supported by a scholarship from the Fundação para a Ciência e Tecnologia (FCT), Portugal (scholarship number PD/BD/[...]

References

Badrinarayanan, V., Kendall, A., Cipolla, R., 2015. Segnet: A deep convolutional encoder-decoder architecture for image segmentation. arXiv preprint arXiv:1511.00561.

Bauer, S., Gratz, P.P., Gralla, J., Reyes, M., Wiest, R., 2014. Towards automatic MRI volumetry for treatment selection in acute ischemic stroke patients, in: Engineering in Medicine and Biology Society (EMBC), 2014 36th Annual International Conference of the IEEE, IEEE. pp. 1521–1524.

Berkhemer, O.A., Jansen, I.G., Beumer, D., Fransen, P.S., Van Den Berg, L.A., Yoo, A.J., Lingsma, H.F., Sprengers, M.E., Jenniskens, S.F., Lycklama à Nijeholt, G.J., et al., 2016. Collateral status on baseline computed tomographic angiography and intra-arterial treatment effect in patients with proximal anterior circulation stroke. Stroke 47, 768–776.

Butcher, K., Emery, D., 2010a. Acute stroke imaging part i: Fundamentals. Canadian Journal of Neurological Sciences 37, 4–16.

Butcher, K., Emery, D., 2010b. Acute stroke imaging part ii: the ischemic penumbra. Canadian Journal of Neurological Sciences 37, 17–27.

Choi, Y., Kwon, Y., Lee, H., Kim, B.J., Paik, M.C., Won, J.H., 2016. Ensemble of deep convolutional neural networks for prognosis of Ischemic Stroke, in: International Workshop on Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries, Springer. pp. 231–243.

Coutts, S.B., Simon, J.E., Tomanek, A.I., Barber, P.A., Chan, J., Hudon, M.E., Mitchell, J.R., Frayne, R., Eliasziw, M., Buchan, A.M., et al., 2003. Reliability of assessing percentage of diffusion-perfusion mismatch. Stroke 34, 1681–1683.

El Tawil, S., Muir, K.W., 2017. Thrombolysis and thrombectomy for acute ischaemic stroke. Clinical Medicine 17, 161–165.

Gonzalez, R., Hirsch, J., Koroshetz, W., Lev, M., Schaefer, P., 2007. Acute ischemic stroke: imaging and intervention. American Journal of Neuroradiology 28, 1622.

González, R.G., Hirsch, J.A., Koroshetz, W., Lev, M.H., Schaefer, P.W., 2011. Acute ischemic stroke. Springer.

Grysiewicz, R.A., Thomas, K., Pandey, D.K., 2008. Epidemiology of ischemic and hemorrhagic stroke: incidence, prevalence, mortality, and risk factors. Neurologic clinics 26, 871–895.

Higashida, R.T., Furlan, A.J., Roberts, H., Tomsick, T., Connors, B., Barr, J., Dillon, W., Warach, S., Broderick, J., Tilley, B., et al., 2003. Trial design and reporting standards for intraarterial cerebral thrombolysis for acute ischemic stroke. Journal of Vascular and Interventional Radiology 14, E1–E31.

Hinton, G.E., 2012. A practical guide to training restricted Boltzmann machines, in: Neural networks: Tricks of the trade. Springer, pp. 599–619.

Hochreiter, S., Schmidhuber, J., 1997. Long short-term memory. Neural computation 9, 1735–1780.

Jenkinson, M., Beckmann, C.F., Behrens, T.E., Woolrich, M.W., Smith, S.M., 2012. Fsl. Neuroimage 62, 782–790.

Kamnitsas, K., Ledig, C., Newcombe, V.F., Simpson, J.P., Kane, A.D., Menon, D.K., Rueckert, D., Glocker, B., 2017. Efficient multi-scale 3d CNN with fully connected CRF for accurate brain lesion segmentation. Medical image analysis 36, 61–78.

Kemmling, A., Flottmann, F., Forkert, N.D., Minnerup, J., Heindel, W., Thomalla, G., Eckert, B., Knauth, M., Psychogios, M., Langner, S., Fiehler, J., 2015. Multivariate dynamic prediction of ischemic infarction and tissue salvage as a function of time and degree of recanalization. Journal of Cerebral Blood Flow & Metabolism 35, 1397–1405.

Kistler, M., Bonaretti, S., Pfahrer, M., Niklaus, R., Büchler, P., 2013. The virtual skeleton database: an open access repository for biomedical research and collaboration. Journal of medical Internet research 15.

Krizhevsky, A., Sutskever, I., Hinton, G.E., 2012. Imagenet classification with deep convolutional neural networks, in: Advances in neural information processing systems, pp. 1097–1105.

Labeyrie, M.A., Turc, G., Hess, A., Hervo, P., Mas, J.L., Meder, J.F., Baron, J.C., Touzé, E., Oppenheim, C., 2012. Diffusion lesion reversal after thrombolysis: a mr correlate of early neurological improvement. Stroke 43, 2986–2991.

Liebeskind, D.S., 2003. Collateral circulation. Stroke 34, 2279–2284.

Lucas, C., Heinrich, M.P., 2017. 2d multi-scale res-net for stroke segmentation.

Maier, O., Menze, B.H., von der Gablentz, J., Häni, L., Heinrich, M.P., Liebrand, M., Winzeck, S., Basit, A., Bentley, P., Chen, L., et al., 2017. ISLES 2015-a public evaluation benchmark for ischemic stroke lesion segmentation from multispectral MRI. Medical image analysis 35, 250–269.

McKinley, R., Häni, L., Gralla, J., El-Koussy, M., Bauer, S., Arnold, M., Fischer, U., Jung, S., Mattmann, K., Reyes, M., et al., 2017. Fully automated stroke tissue estimation using random forest classifiers (faster). Journal of Cerebral Blood Flow & Metabolism 37, 2728–2741.

Memezawa, H., Smith, M.L., Siesjö, B.K., 1992. Penumbral tissues salvaged by reperfusion following middle cerebral artery occlusion in rats. Stroke 23, 552–559.

Menon, B.K., Qazi, E., Nambiar, V., Foster, L.D., Yeatts, S.D., Liebeskind, D., Jovin, T.G., Goyal, M., Hill, M.D., Tomsick, T.A., et al., 2015. Differential effect of baseline computed tomographic angiography collaterals on clinical outcome in patients enrolled in the interventional management of stroke iii trial. Stroke 46, 1239–1244.

Milletari, F., Navab, N., Ahmadi, S.A., 2016. V-net: Fully convolutional neural networks for volumetric medical image segmentation, in: 3D Vision (3DV), 2016 Fourth International Conference on, IEEE. pp. 565–571.

Mok, T.C., Chung, A.C., 2017. Deep adversarial networks for stroke lesion segmentation.

Monteiro, M., Oliveira, A.L., 2017. Fully convolutional neural network for 3d stroke lesion segmentation.

Nair, V., Hinton, G.E., 2010. Rectified linear units improve restricted Boltzmann machines, in: Proceedings of the 27th international conference on machine learning (ICML-10), pp. 807–814.

Nielsen, A., Hansen, M.B., Tietze, A., Mouridsen, K., 2018. Prediction of tissue outcome and assessment of treatment effect in acute ischemic stroke using deep learning. Stroke 49, 1394–1401.

Niu, Y., Gong, E., Xu, J., Pauly, J., Zaharchuk, G., 2018. Improved prediction of the final infarct from acute stroke neuroimaging using deep learning, in: STROKE.

Oliveira, A., Pereira, S., Silva, C.A., 2018. Retinal vessel segmentation based on fully convolutional neural networks. Expert Systems with Applications 112, 229–242.

Pereira, S., Meier, R., McKinley, R., Wiest, R., Alves, V., Silva, C.A., Reyes, M., 2018. Enhancing interpretability of automatically extracted machine learning features: application to a RBM-Random Forest system on brain lesion segmentation. Medical image analysis 44, 228–244.

Pereira, S., Pinto, A., Amorim, J., Ribeiro, A., Alves, V., Silva, C.A., 2019. Adaptive feature recombination and recalibration for semantic segmentation with fully convolutional networks. IEEE Transactions on Medical Imaging 38, 2914–2925.

Pinto, A., McKinley, R., Alves, V., Wiest, R., Silva, C.A., Reyes, M., et al., 2018a. Stroke lesion outcome prediction based on MRI imaging combined with clinical information. Frontiers in Neurology 9, 1060.

Pinto, A., Pereira, S., Meier, R., Alves, V., Wiest, R., Silva, C.A., Reyes, M., 2018b. Enhancing clinical MRI perfusion maps with data-driven maps of complementary nature for lesion outcome prediction, in: Medical Image Computing and Computer Assisted Intervention – MICCAI 2018, pp. 107–115.

Pisov, M., Belyaev, M., Krivov, E., 2017. Neural networks ensembles for ischemic stroke lesion segmentation.

Powers, W.J., Rabinstein, A.A., Ackerson, T., Adeoye, O.M., Bambakidis, N.C., Becker, K., Biller, J., Brown, M., Demaerschalk, B.M., Hoh, B., et al., 2018. 2018 guidelines for the early management of patients with acute ischemic stroke: a guideline for healthcare professionals from the American Heart Association/American Stroke Association. Stroke 49, e46–e99.

Rekik, I., Allassonnière, S., Carpenter, T.K., Wardlaw, J.M., 2012. Medical image analysis methods in MR/CT-imaged acute-subacute ischemic stroke lesion: Segmentation, prediction and insights into dynamic evolution simulation models. a critical appraisal. NeuroImage: Clinical 1, 164–178.

Robben, D., Boers, A.M., Marquering, H.A., Langezaal, L.L., Roos, Y.B., van Oostenbrugge, R.J., van Zwam, W.H., Dippel, D.W., Majoie, C.B., van der Lugt, A., et al., 2020. Prediction of final infarct volume from native ct perfusion and treatment parameters using deep learning. Medical image analysis 59, 101589.

Robben, D., Suetens, P., 2017. Dual-scale fully convolutional neural network for final infarct prediction, in: Ischemic stroke lesion segmentation-ISLES challenge 2017, held in conjunction with MICCAI 2017, Date: 2017/[...]/10, Location: Quebec City, Quebec, Canada.

Ronneberger, O., Fischer, P., Brox, T., 2015. U-net: Convolutional networks for biomedical image segmentation, in: International Conference on Medical image computing and computer-assisted intervention, Springer. pp. 234–241.

Rose, S.E., Chalk, J.B., Griffin, M.P., Janke, A.L., Chen, F., McLachan, G.J., Peel, D., Zelaya, F.O., Markus, H.S., Jones, D.K., et al., 2001. MRI based diffusion and perfusion predictive model to estimate stroke evolution. Magnetic resonance imaging 19, 1043–1053.

Rumelhart, D.E., McClelland, J.L., 1986. Parallel distributed processing: explorations in the microstructure of cognition. volume 1. foundations.

Scalzo, F., Hao, Q., Alger, J.R., Hu, X., Liebeskind, D.S., 2012. Regional prediction of tissue fate in acute ischemic stroke. Annals of Biomedical Engineering 40, 2177–2187.

SMIR Online Platform, 2017. Ischemic stroke lesion segmentation 2017.