Characterization of anomalous diffusion classical statistics powered by deep learning (CONDOR)
Alessia Gentili, Giorgio Volpe
Department of Chemistry, University College London, 20 Gordon Street, London WC1H 0AJ, UK
E-mail: [email protected]
Abstract.
Diffusion processes are important in several physical, chemical, biological and human phenomena. Examples include molecular encounters in reactions, cellular signalling, the foraging of animals, the spread of diseases, as well as trends in financial markets and climate records. Deviations from Brownian diffusion, known as anomalous diffusion, can often be observed in these processes, when the growth of the mean square displacement in time is not linear. An ever-increasing number of methods has thus appeared to characterize anomalous diffusion trajectories based on classical statistics or machine learning approaches. Yet, the characterization of anomalous diffusion remains challenging to date, as testified by the launch of the Anomalous Diffusion (AnDi) Challenge in March 2020 to assess and compare new and pre-existing methods on three different aspects of the problem: the inference of the anomalous diffusion exponent, the classification of the diffusion model, and the segmentation of trajectories. Here, we introduce a novel method which combines feature engineering based on classical statistics with supervised deep learning to efficiently identify the underlying anomalous diffusion model with high accuracy and to infer its exponent with a small mean absolute error in single 1D, 2D and 3D trajectories corrupted by localization noise.

Keywords: anomalous diffusion, single trajectory characterization, classical statistics analysis, deep learning, neural networks
1. Introduction
Anomalous diffusion refers to diffusion phenomena characterized by a mean square displacement (MSD) that grows in time with an exponent α that is either smaller (subdiffusion) or greater (superdiffusion) than one (standard Brownian diffusion) [1]. Typically, this is represented by a nonlinear power-law scaling (MSD(t) ∼ t^α) [1]. The growing interest in anomalous diffusion shown by the physics community lies in the fact that it extends the concept of Brownian diffusion and helps describe a broad range of phenomena of both natural and human origin [2]. Examples include molecular encounters in reactions [3], cellular signalling [4, 5], and the foraging of animals [6].
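To make the scaling MSD(t) ∼ t^α concrete, a minimal, self-contained illustration (not part of CONDOR; all names below are ours) estimates α from the log-log slope of the time-averaged MSD of a single 1D trajectory:

```python
import numpy as np

def tamsd(x, max_lag):
    """Time-averaged MSD of a 1D trajectory x for lags 1..max_lag."""
    return np.array([np.mean((x[m:] - x[:-m]) ** 2) for m in range(1, max_lag + 1)])

def estimate_alpha(x, max_lag=10, dt=1.0):
    """Estimate alpha as the slope of log MSD versus log t (MSD ~ t**alpha)."""
    lags = np.arange(1, max_lag + 1) * dt
    slope, _ = np.polyfit(np.log(lags), np.log(tamsd(x, max_lag)), 1)
    return slope

# Standard Brownian motion should give an exponent close to 1.
rng = np.random.default_rng(0)
x = np.cumsum(rng.normal(size=1000))
print(f"estimated alpha: {estimate_alpha(x):.2f}")
```

Simple MSD fits of this kind become unreliable for short, noisy trajectories, which is precisely the regime addressed by the methods discussed below.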
[...]

The goal of the AnDi Challenge was to push the methodological frontier in the characterization of three different aspects of anomalous diffusion for 1D, 2D and 3D noise-corrupted trajectories [40]: (1) the inference of the anomalous diffusion exponent α; (2) the classification of the diffusion model; and (3) the segmentation of trajectories characterized by a change in anomalous diffusion exponent and/or model at an unspecified time t.

Here, in response to the AnDi Challenge, we describe a novel method (CONDOR: Classifier Of aNomalous DiffusiOn tRajectories) for the characterization of single anomalous diffusion trajectories based on the combination of classical statistics analysis tools and supervised deep learning. Specifically, our method applies a deep feed-forward neural network to cluster parameters extracted from the statistical features of individual trajectories. By combining advantages from both approaches (classical statistics analysis and machine learning), CONDOR is highly performant and parameter-free. It allows us to identify the underlying anomalous diffusion model with high accuracy and to infer its exponent α with a small mean absolute error in single 1D, 2D and 3D trajectories corrupted by localization noise. CONDOR was always among the top performant methods in the AnDi Challenge for both the inference of the α-exponent and the model classification of 1D, 2D and 3D anomalous diffusion trajectories. In particular, CONDOR was the leading method for the inference task in 2D and 3D and for the classification task in 3D.

Figure 1. Representative anomalous diffusion trajectories. (a-b) Examples of single 1D anomalous diffusion trajectories in the absence of corrupting noise for (a) different models of standard diffusion (anomalous diffusion exponent α = 1) and for (b) the same anomalous diffusion model (sbm) with different values of α. The models considered here are the annealed transient time motion (attm), the continuous-time random walk (ctrw), the fractional Brownian motion (fbm), the Lévy walk (lw) and the scaled Brownian motion (sbm). (c) Same trajectories as in (a) corrupted with Gaussian noise (of zero mean and variance σ_n = 1) associated to a finite localization precision. For better comparison, each sequence of positions x_n taken at times t_n is represented after subtracting its average and after rescaling its displacements to their standard deviation to give the normalized sequence x̂_n [35]. The grey-shaded areas highlight a small portion (10 data points) of each trajectory.
2. Methods
CONDOR combines classical statistics analysis of single anomalous diffusion trajectories with supervised deep learning. The trajectories used in this work are generated numerically (with the andi-dataset Python package [41]) among the 5 models considered in the AnDi Challenge: the annealed transient time motion (attm) [42], the continuous-time random walk (ctrw) [43], the fractional Brownian motion (fbm) [44], the Lévy walk (lw) [45] and the scaled Brownian motion (sbm) [46]. Briefly, AnDi Challenge datasets contain 1D, 2D or 3D trajectories from these 5 models sampled at regular time steps (t_n = nΔt, with n a positive integer and Δt a unitary time interval). The anomalous exponent α varies in the range [0.05, 2] with steps of 0.05, as allowed by the respective model [40]. The shortest trajectories have at least 10 samples per dimension (T_min) and can be corrupted with additive Gaussian noise (with zero mean and variance σ_n) associated to a finite localization precision to better simulate experimental endeavours [40, 41].
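As a minimal, self-contained sketch of this kind of data (it does not call the andi-dataset package; all functions and parameters here are illustrative), one can simulate a 1D scaled Brownian motion trajectory, whose independent Gaussian increments have variance proportional to t_{n+1}^α − t_n^α, and corrupt it with Gaussian localization noise of zero mean and variance σ_n:

```python
import numpy as np

rng = np.random.default_rng(1)

def sbm_trajectory(T=200, alpha=0.7, D=1.0, dt=1.0):
    """1D scaled Brownian motion sampled at t_n = n*dt: independent Gaussian
    increments with variance 2*D*(t_{n+1}**alpha - t_n**alpha)."""
    t = np.arange(T + 1) * dt
    increments = rng.normal(0.0, np.sqrt(2.0 * D * np.diff(t ** alpha)))
    return np.concatenate(([0.0], np.cumsum(increments)))

x = sbm_trajectory(T=200, alpha=0.7)

# Additive Gaussian noise of zero mean and variance sigma_n, mimicking the
# finite localization precision of the AnDi Challenge datasets.
sigma_n = 1.0
x_noisy = x + rng.normal(0.0, np.sqrt(sigma_n), size=x.size)
```

The other four models require different generators, which is why the AnDi Challenge relies on the dedicated andi-dataset package [41] for its official datasets.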
Figure 1 shows examples of such trajectories without noise (Figure 1a-b) and with the addition of corrupting noise (Figure 1c). By visual inspection, these examples show how differences between single trajectories are not always explicit, thus making the characterization of anomalous diffusion challenging. Already in the absence of noise, the same diffusion process (e.g. for α = 1 in Figure 1a) can be described by different models, and the same model (e.g. sbm in Figure 1b) can assume different values of the exponent α. These occurrences are not always easy to set apart (e.g. attm, fbm and sbm in Figure 1a). The presence of corrupting noise (Figure 1c) and/or short trajectories (shaded areas in Figure 1a-c) adds extra complexity.

In this section, we provide a detailed description of our method (CONDOR) to characterise such single anomalous diffusion trajectories. The fundamental intuition behind our approach is that different trajectories produced by the same diffusion model should share similar statistical features that set them apart from other models [1]. Nonetheless, in the presence of noise or with few data points, these similarities and differences need not be explicit in single trajectories (Figure 1). Thus, CONDOR first manipulates single trajectories to extract this hidden statistical information and then employs the power of deep learning to analyse and cluster them. The CONDOR workflow is shown in Figure 2. We will first discuss the feature engineering step used to preprocess the trajectories and extract a set of N relevant features primarily based on classical statistics analysis (Section 2.1). These features become the inputs for the deep-learning part of the characterization algorithm based on deep feed-forward neural networks (Figure 3), which, in order, classifies the diffusion model and infers the anomalous diffusion exponent α (Figure 2). These two steps are performed by multiple neural networks, whose architecture and connection are explained in Section 2.2. Finally, Section 2.3 discusses the training of CONDOR neural networks.

Figure 2. CONDOR workflow. The CONDOR workflow is divided in three main parts: preprocessing (Section 2.1), and classification and inference (Section 2.2). In the preprocessing step, the vector of positions x associated to a single trajectory is used to extract a vector I of N = 1 + 92d inputs primarily based on classical statistics (with d = 1, 2, 3 being the trajectory dimension). The vector I is the basic input for each neural network in the classification and inference steps. The output of the classification step is determined by the output of up to three deep feed-forward neural networks and is the predicted model M̃ among five possible categories: the annealed transient time motion (attm), the continuous-time random walk (ctrw), the fractional Brownian motion (fbm), the Lévy walk (lw) and the scaled Brownian motion (sbm). Finally, the inference step predicts the value of the exponent α ∈ [0.05, 2], basing its decision on I and the predicted M̃ for each trajectory. In particular, the predicted α̃ is the arithmetic average of two distinct predictions, α̃₁ and α̃₂, for which M̃ is used as a selection element or as an additional input for the deep learning algorithm, respectively.

2.1. Preprocessing

Before the classification and inference steps (Figure 2), we analysed each trajectory to extract N = 1 + 92d inputs (with d = 1, 2, 3 being the trajectory dimension) primarily based on classical statistics (Table 1). The inputs used in this work were ultimately selected based on the final performance of the deep learning step. Independently of the dimensionality of the problem, the first input (ln T_max) is always connected to the duration T_max of each trajectory to enable the deep learning algorithm to properly weight different durations. To extract the remaining 92d inputs for every trajectory in 1D, 2D and 3D, we analysed the vector of the positions along each direction, x^i ≡ (x^i_1, x^i_2, ..., x^i_{T_max/Δt}) (with i = 1, ..., d), independently,
after first subtracting its average and rescaling its displacements w^i_j = x^i_{j+1} − x^i_j (with j = 1, ..., T_max/Δt − 1) to their standard deviation to obtain the normalised displacements ŵ^i_j [35]. Practically, for each dimension, this step produces new time sequences x̂^i ≡ (x̂^i_1, x̂^i_2, ..., x̂^i_{T_max/Δt}) of positions reconstructed from the cumulative sum of the normalised displacements of the original trajectories and with ⟨x̂^i⟩ = 0. Importantly, this step allows for a better comparison between trajectories of different dimensions and durations while keeping the performance of the deep learning algorithm optimal [35].

The vectors of the normalised positions x̂^i and of the normalised displacements ŵ^i ≡ (ŵ^i_1, ŵ^i_2, ..., ŵ^i_{T_max/Δt − 1}) are the starting point to calculate the statistical information (i.e. the deep learning inputs) associated to each trajectory. In particular, for each dimension, we calculated the mean (µ^i), the median (µ^i_{1/2}), the standard deviation (σ^i), the skewness (β^i_1), the kurtosis (β^i_2) and the approximate entropy (ApEn^i) associated to the following (Table 1): the normalised displacements ŵ^i and the vector of their absolute values Ŵ^i ≡ (|ŵ^i_1|, |ŵ^i_2|, ..., |ŵ^i_{T_max/Δt − 1}|); the vector Δ^i(m) of the absolute displacements Δ^i_j(m) = |x̂^i_{j+m} − x̂^i_j| taken at time lags τ = mΔt (with m = 2, 3, ..., T_min/Δt − 1); the vector r^i of the ratios r^i_j = ŵ^i_{j+1}/ŵ^i_j; the absolute values |F^i{ŵ^i}| of the normalized Fourier transform of ŵ^i; the vector P^i_δ of the time series of the normalised powers of ŵ^i segments selected with a non-overlapping moving window of width δΔt (with δ = 3); the vector ΔMSD^i of the absolute values of the normalised variations ΔMSD^i_j = |(MSD^i_{j+1} − MSD^i_j)/(jΔt)| of the time-averaged mean square displacement (MSD); and the absolute values |W^i_u{ŵ^i}| of the wavelet transform of ŵ^i at the first two translations uΔt, with u = 1, 2.

To further improve the performance of the deep learning algorithm, the following extra four inputs were also added for each dimension to extract additional information about regular and irregular time patterns in the trajectories (correlations, power variations and abrupt changes): the integral of the normalized autocorrelation function (ACF) of ŵ^i over a time window of width δΔt to the right of the autocorrelation peak; the level of frequency asymmetry in the normalized power spectral density of ŵ^i, calculated as (Δt/T_max) Σ_k |F^i{ŵ^i}_k| sgn(k − F_N), with F_N being the Nyquist frequency; the level of time asymmetry in P^i_δ, calculated as Σ_k P^i_{δ,k} sgn(k − ⌈T_max/(2δΔt)⌉); and the occurrence of abrupt changes in the variance of the elements of x̂^i [47].
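The sketch below (with hypothetical helper names, and reproducing only a small subset of the 1 + 92d inputs described above) illustrates the normalisation step and a few of the statistics of Table 1 in Python:

```python
import numpy as np
from scipy.stats import skew, kurtosis

def normalise(x):
    """Normalised displacements w_hat and rebuilt zero-mean positions x_hat,
    following the preprocessing described in the text (illustrative only)."""
    w = np.diff(x)
    w_hat = w / np.std(w)
    x_hat = np.cumsum(w_hat)
    return w_hat, x_hat - x_hat.mean()

def basic_stats(v):
    """Mean, median, standard deviation, skewness and kurtosis of a vector,
    i.e. a subset of the statistics listed in Table 1."""
    return [np.mean(v), np.median(v), np.std(v), skew(v), kurtosis(v)]

def features(x, T_min=10):
    """A (partial) feature vector for a single 1D trajectory."""
    w_hat, x_hat = normalise(x)
    feats = [np.log(len(x))]                    # ln T_max, the first input
    feats += basic_stats(w_hat)                 # statistics of w_hat ...
    feats += basic_stats(np.abs(w_hat))         # ... and of |w_hat|
    for m in range(2, T_min):                   # lagged absolute displacements Delta(m)
        feats += basic_stats(np.abs(x_hat[m:] - x_hat[:-m]))
    return np.asarray(feats)

# Example (x_noisy as generated in the previous sketch):
# feature_vector = features(x_noisy)
```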
Table 1. Statistical information extracted from each dimension of every trajectory as inputs for the deep learning algorithm. The approximate entropy is introduced as a way to quantify the amount of regularity and unpredictability of fluctuations over the considered time series. Here m = 2, 3, ..., T_min/Δt − 1 and u = 1, 2. Note that ApEn^i is not defined for Δ^i(m) when m = T_min/Δt − 1 and m = T_min/Δt − 2.
[Table of checkmarks indicating which of the statistics µ^i, µ^i_{1/2}, σ^i, β^i_1, β^i_2 and ApEn^i are computed for each of the vectors ŵ^i, Ŵ^i, Δ^i(m), r^i, |F^i{ŵ^i}|, P^i_δ, ΔMSD^i and |W^i_u{ŵ^i}|.]

2.2. Classification and inference

In the model classification task (Figure 2), the N inputs calculated in the previous step are first fed to a deep three-layer feed-forward neural network (architecture in Figure 3) with two sigmoid hidden layers of twenty neurones each and one softmax output layer of five neurones to predict the model M̃ of anomalous diffusion among the five possible categories (attm, ctrw, fbm, lw and sbm). We settled for this specific architecture with two dense layers because it is deep enough to efficiently classify arbitrary noisy input vectors, while avoiding the slower training associated with an excessively complex architecture. The output of this network is then refined by two additional networks with similar architectures (but with three output neurones and two output neurones, respectively) to improve the classification among trajectories respectively classified as attm, fbm and sbm, and as attm and ctrw (Figure 2), as these models can be more easily confused. Indeed, in the attm model, the diffusion coefficient can vary in space or time (in the AnDi Challenge only time variations were considered [40, 42]). This feature can be easily confused with features of other processes: namely, the time-dependent diffusivity of the sbm [46], the time irregularities between consecutive steps in the ctrw [43] and the correlations among the increments in the fbm [44].

Figure 3. Basic neural network architecture in CONDOR. Architecture of the deep three-layer feed-forward neural network chosen for both classification and inference tasks. Inputs I_n (with n = 1, 2, ..., N) are processed by two hidden layers and an output layer to yield outputs O_q (with q = 1, 2, ..., Q). The hidden layers have K = 20 neurons each with a sigmoidal activation function. The output layer has Q neurons with a softmax activation function (σ). Each neuron has a summer (Σ) that gathers its weighted inputs and bias to return a scalar output. For each neuron, the weights v and bias b are optimised during the training of the network.
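For orientation only, a comparable classifier (two 20-neuron logistic hidden layers feeding a five-class softmax output) can be sketched in Python with scikit-learn. This is not the authors' MATLAB patternnet implementation, and scikit-learn trains with Adam or L-BFGS rather than scaled conjugate gradient; the feature matrix X (one row of N inputs per trajectory) and the model labels y are assumed to come from the preprocessing step, and are replaced by random stand-ins here.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Hypothetical stand-in data: N = 1 + 92*d inputs per trajectory (d = 1 here)
# and one of five labels (0: attm, 1: ctrw, 2: fbm, 3: lw, 4: sbm).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 93))
y = rng.integers(0, 5, size=1000)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.15, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(20, 20),   # two sigmoid (logistic) hidden layers
                    activation="logistic",
                    max_iter=2000,
                    random_state=0)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```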
Once the model is identified, the next step is to infer the anomalous diffusion exponent α. The estimated value α̃ is obtained as the arithmetic average of the outputs of two different approaches based on deep learning (Figure 2), as the two approaches lead to predicting an underestimated α value (α̃₁) and an overestimated α value (α̃₂), respectively. In the first approach (Figure 2), the output M̃ of the classification is used to cluster the trajectories into five categories based on the identified model. Each category is analysed independently by a different set of neural networks (architectures as in Figure 3). The neural networks for attm and ctrw have five output categories predicting α̃₁ in the range 0.05 to 1 [40, 41]. The neural network for lw has five output categories predicting α̃₁ in the range 1 to 2 [40, 41]. Finally, the prediction for fbm and sbm is based on a 1×2 directed rooted tree of neural networks with the root network having two equally spaced output categories predicting α̃₁ in the range 0.05 to 2; each category is then refined by a second neural network with five equally spaced output categories in the corresponding subrange identified by the root network. In the second approach (Figure 2), the output M̃ of the model classification is used as an additional input together with the list of inputs identified in Section 2.1. This new set of inputs feeds a 1×4 directed rooted tree of neural networks, each with the same overall architecture as in Figure 3. The root neural network has four equally spaced output categories predicting α̃₂ in the range 0.05 to 2; each of these categories is then refined by a second neural network with five equally spaced output categories in the corresponding α subrange identified by the root network.
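A schematic of how such rooted-tree outputs can be turned into a numerical α̃ (our illustrative reading of the scheme above, not CONDOR's exact mapping) is sketched below: the root softmax selects one of four equally spaced subranges of [0.05, 2], the refinement softmax selects one of five categories inside it, and the final estimate averages the two approaches.

```python
import numpy as np

ALPHA_MIN, ALPHA_MAX = 0.05, 2.0

def bin_centres(lo, hi, n_bins):
    """Centres of n_bins equally spaced categories covering [lo, hi]."""
    edges = np.linspace(lo, hi, n_bins + 1)
    return 0.5 * (edges[:-1] + edges[1:])

def alpha_from_tree(root_probs, leaf_probs):
    """Map the root softmax (4 coarse subranges) and the selected refinement
    softmax (5 fine categories) to a single alpha value (illustrative)."""
    coarse_edges = np.linspace(ALPHA_MIN, ALPHA_MAX, 5)
    k = int(np.argmax(root_probs))                       # coarse subrange
    fine = bin_centres(coarse_edges[k], coarse_edges[k + 1], 5)
    return fine[int(np.argmax(leaf_probs))]              # centre of the fine category

alpha_1 = 0.62                                           # hypothetical output of approach 1
alpha_2 = alpha_from_tree([0.1, 0.7, 0.1, 0.1],          # hypothetical softmax outputs
                          [0.2, 0.5, 0.1, 0.1, 0.1])
alpha_tilde = 0.5 * (alpha_1 + alpha_2)                  # arithmetic average, as in Figure 2
print(f"alpha_tilde = {alpha_tilde:.3f}")
```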
2.3. Training

After defining their architecture, we trained each network of CONDOR independently on a set of trajectories for which the ground-truth values of the model and of α were known. Each training was performed using the scaled conjugate gradient backpropagation (scg) algorithm (for speed) on mixed datasets with up to three hundred thousand trajectories from different models and values of α, with different durations and levels of corrupting noise. In particular, to simulate the presence of localization error, Gaussian noise with zero mean and a random variance σ_n was added to each trajectory [41]. Datasets were randomly split among training (70%), validation (15%) and test (15%) trajectories. Overall, training with scg is very efficient on a standard desktop computer (processor Intel(R) Core(TM) i7-4790 CPU @ 3.60 GHz), with the size of the training dataset being the main limiting factor. Training the three consecutive networks used for classification (Figure 2) took at most an hour and a half, while training the fourteen networks for the inference task took at most 3 hours. Training time can be reduced by up to 2 orders of magnitude by employing a GPU. After training, CONDOR is very computationally efficient when classifying new trajectories on a standard desktop computer.

Figure 4. Evolution of CONDOR performance with the size of the training dataset. (a-c) F1 score calculated on the total 1D, 2D and 3D training datasets of 300k trajectories as a function of the number of trajectories (N_t) employed to train the three consecutive networks of the classification task (Figure 2). N_t is normalized to the size of a single AnDi Challenge dataset (N_c = 10k trajectories). The F1 score is given at the output of (a) the first network, (b) the second network and (c) the third network used to refine the model predictions in the classification task (Figure 2). At each of these steps, only the best performing network (black squares) is carried forward to the next step. (d-e) MAE calculated on the total 1D, 2D and 3D training datasets of 300k trajectories each as a function of N_t/N_c for (d) α̃₁ (dashed lines) and α̃₂ (dotted lines) and (e) for α̃ (solid lines), as defined in Figure 2. For each N_t/N_c, the arithmetic average α̃ is calculated using the best performing networks used to infer α̃₁ and α̃₂ with the same number of training trajectories. The networks with the lowest MAE for α̃₁ and α̃₂ are also shown in (d) (black circles). The shaded areas correspond to one standard deviation around the mean values (calculated from the performance of 5 distinct training attempts).

Figure 4 shows the performance of the classification and inference tasks as a function of the size of the training dataset. We evaluated the performance of the classification task with the F1 score:

F1 = TP / (TP + (FP + FN)/2),
where TP, FP and FN are the true positive, false positive and false negative rates, respectively. In particular, we computed the micro average of the F1 score with the Scikit-learn Python library. F1 can vary between 0 (no trajectory classified correctly) and 1 (all trajectories classified correctly). To evaluate the performance of the inference of the α-exponent, we used the mean absolute error (MAE) instead:

MAE = (1/n) Σ_{i=1}^{n} |α̃_i − α_i|,

where n is the number of trajectories in the dataset, and α̃_i and α_i are the predicted and the ground-truth values of the anomalous diffusion exponent of the i-th trajectory, respectively. MAE has a lower limit of 0, which is attained when the exponents of all n trajectories are predicted exactly.
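Both metrics are straightforward to reproduce; a short sketch with hypothetical ground-truth and predicted values uses scikit-learn's micro-averaged F1, as in the paper, and numpy for the MAE:

```python
import numpy as np
from sklearn.metrics import f1_score

# Hypothetical ground-truth / predicted model labels and exponents.
y_true = np.array([0, 1, 2, 3, 4, 2])
y_pred = np.array([0, 1, 2, 3, 4, 4])
alpha_true = np.array([0.50, 1.20, 0.85])
alpha_pred = np.array([0.55, 1.10, 0.90])

f1 = f1_score(y_true, y_pred, average="micro")    # micro-averaged F1 score
mae = np.mean(np.abs(alpha_pred - alpha_true))    # mean absolute error on alpha
print(f"F1 = {f1:.3f}, MAE = {mae:.3f}")
```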
Figure 4a-c shows the F1 score measured on the training dataset at the output of the three consecutive networks used for classification (Figure 2) for the 1D, 2D and 3D cases. For each step, we increased the number of trajectories used for training progressively until the F1 score reached near saturation. Past this point, any gain in F1 score is relatively small and by far outbalanced by the increased computational cost associated to longer training times. The performance improves with the dimension d of the trajectories and, interestingly, at each step, saturation of the F1 score for higher d appears to take place with smaller training datasets, as each trajectory intrinsically contains d times more information. Only the best performing network at each step is selected (black squares in Figure 4a-c) and carried forward to the next step. Independently of the dimensionality of the problem, the first network produces the highest increase of F1 score with the size of the training dataset (Figure 4a): for example, going from 10k to 100k trajectories, the F1 score increases approximately by 4% in 1D, 5% in 2D and 6% in 3D. The combined contribution of the following two networks to the overall F1 score is less pronounced, especially for higher dimensions (Figure 4b-c); nonetheless, they play a key role in refining the classification among attm, fbm and sbm (Figure 4b) or among attm and ctrw (Figure 4c), thus boosting the final F1 score approximately by 4% in 1D, 3% in 2D and 1% in 3D.

Similarly, Figure 4d-e shows the MAE associated to the inference of the α-exponent according to α̃₁, α̃₂ and their arithmetic average α̃ (as defined in Figure 2) for the 1D, 2D and 3D cases. As for F1, we report the MAE of each method as a function of the size of the training dataset progressively until it reaches near saturation. In general, the MAE lowers with the number of trajectories used for training and with their dimension d. As noted earlier, independently of the dimensionality of the problem, the arithmetic average α̃ provides a better estimate of the α exponent than both α̃₁ and α̃₂, leading to an overall MAE at saturation that is better than those obtained with the individual methods by at least 0.01 in all the dimensions.

Figure 5. CONDOR performance on test datasets. (a) F1 score and (b) MAE obtained using CONDOR on the 1D, 2D and 3D datasets used in the classification and inference tasks of the AnDi Challenge (10k trajectories each). In (a), the stacked bar charts show the contribution of the three consecutive networks used for classification (Figure 2) to the total F1 score. In (b), the grouped bar charts show the MAE associated to the inference of the α-exponent according to α̃₁, α̃₂ and their arithmetic average α̃, which provides the best estimate of the α exponent (Figure 2).

We have implemented all neural networks using the patternnet function in the MATLAB-based Deep Learning Toolbox. We provide the MATLAB implementations of the key functionalities of CONDOR []. Nonetheless, our approach is independent of the deep-learning framework used for its implementation.
Moreover, it is also possible to exchange neural network models between the Deep Learning Toolbox and other platforms, such as those based on TensorFlow.
3. Results
The best performing networks identified during the training step (black symbols in Figure 4) were carried forward and used in the prediction step, where we classified the anomalous diffusion model and inferred the α exponent of the same datasets used in the AnDi Challenge (with 10k trajectories each) for testing purposes [40]. Figure 5a shows the overall F1 score obtained on these test datasets for the 1D, 2D and 3D cases, and Figure 5b
shows the corresponding overall MAE. These very respectable values enabled CONDOR to be one of the top performant methods in the AnDi Challenge [40].
Classification task

Figure 6. Detailed analysis of CONDOR performance for the classification task. CONDOR classification performance (F1 score) on the 1D, 2D and 3D datasets of the AnDi Challenge used for the classification task (as in Figure 5a) (a) by model, (b) by strength of the localization error (Gaussian noise with zero mean and variance σ_n), and (c-e) by trajectory length for the (c) 1D, (d) 2D and (e) 3D case, respectively. In (c-e), the trends are fitted to a power law (y = c − a x^(−b), with a, b, c > 0) for ease of visualization. The shaded areas highlight the values of the F1 score above the average value (Figure 5a).

Figure 6 provides further detail on CONDOR performance in the classification task by model (Figure 6a), by strength of the localization noise (Figure 6b) and by trajectory length (Figure 6c-e). These data largely support the robustness of CONDOR performance across the parameter space.

Consistently across the dimensions (1D, 2D and 3D), CONDOR recognises two models (ctrw and lw) extremely well (Figure 6a). This is understandable, as both models show typical features that set them apart from the remaining models and are relatively robust to the presence of localization noise (Figure 1): i.e. long ballistic steps in lw trajectories and long waiting times in ctrw trajectories [1]. For the remaining three models (attm, fbm and sbm), CONDOR performs progressively better moving from 1D to 3D. As noticed earlier (Figure 1), the distinctive features among these models are more subtle and can be better evaluated for higher dimensions as these are intrinsically richer in information content. While CONDOR performs comparably well with fbm and sbm, it struggles more with attm, as this model can be more easily confused with others (Figure 1). Nonetheless, CONDOR performance on attm trajectories improves markedly with their dimension from 1D to 3D.

The performance of CONDOR with different strengths of Gaussian noise further proves the robustness of our method (Figure 6b). The classification in 2D and 3D is largely unaffected by the amount of localization noise added to the trajectories, with an essentially constant F1 score for all values of σ_n. Due to the relative lack of information contained within individual trajectories, the 1D case is more exposed to a negative impact of large noise variances. In fact, while the performances in 1D and 2D are comparable at the lower noise levels, the highest level of localization noise leads to a visibly reduced F1 value for the 1D case.

Finally, Figure 6c-e shows CONDOR classification performance as a function of the duration of the trajectory. Due to the richer information content, classification is increasingly better for longer trajectories and for higher dimensions. The trends in Figure 6c-e are well fitted by power-law functions (dashed lines), which show how, above a certain threshold in duration, CONDOR performs above the average F1 scores (delimited by the shaded areas), reaching F1 values of up to 0.96 in 1D, 0.97 in 2D and 0.99 in 3D. Moreover, this threshold decreases when increasing the dimension from 1D to 3D.

α-exponent inference task

Figure 7 provides further detail on CONDOR performance in the α-exponent inference task by value of α (Figure 7a-c), by model (Figure 7d), by strength of the localization noise (Figure 7e) and by trajectory length (Figure 7f-h). These data largely confirm the observations on CONDOR performance for the classification task, thus confirming its robustness across the parameter space.

Figure 7. Detailed analysis of CONDOR performance for the α-exponent inference task. CONDOR α-exponent inference performance (MAE) on the 1D, 2D and 3D datasets of the AnDi Challenge used for the inference task (as in Figure 5b) (a-c) by value of α for the (a) 1D, (b) 2D and (c) 3D case, (d) by model, (e) by strength of the localization error (Gaussian noise with zero mean and variance σ_n), and (f-h) by trajectory length for the (f) 1D, (g) 2D and (h) 3D case, respectively. In (a-c), each bar corresponds to an increment in α of 0.05 and the dotted lines represent the overall MAE as in Figure 5b. In (f-h), the trends are fitted to a power law (y = a x^(−b) + c, with a, b, c > 0) for ease of visualization. The dashed areas highlight the values of the MAE below the average value (Figure 5b).

In Figure 7a-c, we can see how CONDOR performs in a relatively consistent way across the α values, with the exception of small α values in the 1D and 2D cases. This is primarily due to a relatively large proportion of 1D and 2D attm trajectories with these small α values being assigned to other α values: for example, for these trajectories, a sizeable fraction of the predicted values of α is greater than 1, against only a small fraction for the corresponding 3D attm trajectories.

As already noted for the model classification (Figure 6a), CONDOR can also infer the α-exponent very well and consistently across the dimensions for two models (ctrw and lw) (Figure 7d). For the three remaining models (attm, fbm and sbm), CONDOR again performs progressively better moving from 1D to 3D. The best results across the dimensions are obtained for the fbm, followed by the sbm. Once again, as in the classification task, CONDOR struggles more with attm.

Figure 7e confirms CONDOR robustness to the addition of localization error. As noted for the classification, the inference of the α exponent in 2D and 3D is also largely unaffected by the amount of localization noise added to the trajectories. However, in 1D, the MAE increases with the noise strength, from 0.140 at the lowest noise level to 0.193 for σ_n = 1.

Finally, Figure 7f-h shows CONDOR performance in inferring the α exponent as a function of the duration of the trajectory. Similar to the classification task, CONDOR performance improves for higher dimensions and for longer trajectories, as the MAE becomes increasingly lower, in agreement with prior observations [35, 37].
The trends in Figure 7f-h are well fitted by power-law functions (dashed lines), which show how, above a certain threshold in duration, CONDOR performance is significantly below the average MAE (delimited by the shaded areas), reaching values as small as 0.083 in 1D, 0.085 in 2D and 0.065 in 3D. As for the classification task, this threshold decreases when increasing the dimension from 1D to 3D.
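The power-law fits used for the trends in Figures 6c-e and 7f-h can be reproduced with a standard least-squares routine; the sketch below uses hypothetical trend values purely to illustrate the fit of y = a x^(−b) + c.

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(x, a, b, c):
    """y = a * x**(-b) + c, the form used for the MAE-versus-length trends."""
    return a * x ** (-b) + c

# Hypothetical MAE values as a function of trajectory length T (illustration only).
T = np.array([10.0, 20.0, 50.0, 100.0, 200.0, 500.0, 1000.0])
mae = np.array([0.30, 0.25, 0.19, 0.16, 0.14, 0.12, 0.11])

(a, b, c), _ = curve_fit(power_law, T, mae, p0=(1.0, 0.5, 0.1))
print(f"fit: MAE ≈ {a:.2f} * T^(-{b:.2f}) + {c:.2f}")
```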
4. Conclusion
In conclusion, we have introduced a new method (CONDOR) for the characterisation of single anomalous diffusion trajectories in response to the AnDi Challenge [40]. Our method combines tools from classical statistics analysis with deep learning techniques. CONDOR can identify the anomalous diffusion model and its exponent with high accuracy, in relatively short times and without the need of any prior information. Moreover, CONDOR performance is robust to the addition of localization noise. As a consequence, CONDOR was among the top performant methods in the AnDi Challenge [40]. At the expense of an increased computational cost, performance can be further improved by expanding the set of inputs fed to the deep learning algorithm and by employing more advanced network architectures (e.g. recurrent neural networks or convolutional neural networks). While most advanced [...]
Acknowledgements
The authors would like to thank the organisers of the AnDi Challenge for providing a stimulating scientific environment around anomalous diffusion and its characterization. The authors acknowledge sponsorship for this work by the U.S. Office of Naval Research Global (Award No. N62909-18-1-2170).
References

[1] Ralf Metzler, Jae-Hyung Jeon, Andrey G Cherstvy, and Eli Barkai. Anomalous diffusion models and their properties: non-stationarity, non-ergodicity, and ageing at the centenary of single particle tracking. Phys. Chem. Chem. Phys., 16(44):24128–24164, 2014.
[2] Joseph Klafter and Igor M Sokolov. Anomalous diffusion spreads its wings. Phys. World, 18(8):29, 2005.
[3] Eli Barkai, Yuval Garini, and Ralf Metzler. Strange kinetics of single molecules in living cells. Phys. Today, 65(8):29, 2012.
[4] Felix Höfling and Thomas Franosch. Anomalous transport in the crowded world of biological cells. Rep. Prog. Phys., 76(4):046602, 2013.
[5] Diego Krapf and Ralf Metzler. Strange interfacial molecular dynamics. Phys. Today, 72(9):48–55, 2019.
[6] Gandhimohan M Viswanathan, Marcos GE Da Luz, Ernesto P Raposo, and H Eugene Stanley. The physics of foraging: an introduction to random searches and biological encounters. Cambridge University Press, 2011.
[7] Giorgio Volpe and Giovanni Volpe. The topography of the environment alters the optimal search strategy for active particles. Proc. Natl. Acad. Sci. U.S.A., 114(43):11350–11355, 2017.
[8] Dirk Brockmann, Lars Hufnagel, and Theo Geisel. The scaling laws of human travel. Nature, 439(7075):462–465, 2006.
[9] Marta C Gonzalez, Cesar A Hidalgo, and Albert-Laszlo Barabasi. Understanding individual human mobility patterns. Nature, 453(7196):779–782, 2008.
[10] H Zhang and GH Li. Anomalous epidemic spreading in heterogeneous networks. Phys. Rev. E, 102(1):012315, 2020.
[11] Ugur Tirnakli and Constantino Tsallis. Epidemiological model with anomalous kinetics: the COVID-19 pandemics. medRxiv, 2020.
[12] Vasiliki Plerou, Parameswaran Gopikrishnan, Luís A Nunes Amaral, Xavier Gabaix, and H Eugene Stanley. Economic fluctuations and anomalous diffusion. Phys. Rev. E, 62(3):R3023, 2000.
[13] Fredrick Michael and MD Johnson. Financial market dynamics. Physica A, 320:525–534, 2003.
[14] Mozhdeh Massah and Holger Kantz. Confidence intervals for time averages in the presence of long-range correlations, a case study on Earth surface temperature anomalies. Geophys. Res. Lett., 43(17):9243–9249, 2016.
[15] Yasmine Meroz and Igor M Sokolov. A toolbox for determining subdiffusive mechanisms. Phys. Rep., 573:1–29, 2015.
[16] Ido Golding and Edward C Cox. Physical nature of bacterial cytoplasm. Phys. Rev. Lett., 96(9):098102, 2006.
[17] Avi Caspi, Rony Granek, and Michael Elbaum. Enhanced diffusion in active intracellular transport. Phys. Rev. Lett., 85(26):5655, 2000.
[18] Eldad Kepten, Aleksander Weron, Grzegorz Sikora, Krzysztof Burnecki, and Yuval Garini. Guidelines for the fitting of anomalous diffusion mean square displacement graphs from single particle tracking experiments. PLOS One, 10(2):e0117722, 2015.
[19] Xavier Michalet. Mean square displacement analysis of single-particle trajectories with localization error: Brownian motion in an isotropic medium. Phys. Rev. E, 82(4):041914, 2010.
[20] Stephanie C Weber, Andrew J Spakowitz, and Julie A Theriot. Bacterial chromosomal loci move subdiffusively through a viscoelastic cytoplasm. Phys. Rev. Lett., 104(23):238102, 2010.
[21] Diego Krapf, Enzo Marinari, Ralf Metzler, Gleb Oshanin, Xinran Xu, and Alessio Squarcini. Power spectral density of a single Brownian trajectory: what one can and cannot learn from it. New J. Phys., 20(2):023029, 2018.
[22] Diego Krapf, Nils Lukat, Enzo Marinari, Ralf Metzler, Gleb Oshanin, Christine Selhuber-Unkel, Alessio Squarcini, Lorenz Stadler, Matthias Weiss, and Xinran Xu. Spectral content of a single non-Brownian trajectory. Phys. Rev. X, 9(1):011019, 2019.
[23] Vincent Tejedor, Olivier Bénichou, Raphael Voituriez, Ralf Jungmann, Friedrich Simmel, Christine Selhuber-Unkel, Lene B Oddershede, and Ralf Metzler. Quantitative analysis of single particle trajectories: mean maximal excursion method. Biophys. J., 98(7):1364–1372, 2010.
[24] Jae-Hyung Jeon, Natascha Leijnse, Lene B Oddershede, and Ralf Metzler. Anomalous diffusion and power-law relaxation of the time averaged mean squared displacement in worm-like micellar solutions. New J. Phys., 15(4):045011, 2013.
[25] Krzysztof Burnecki, Eldad Kepten, Yuval Garini, Grzegorz Sikora, and Aleksander Weron. Estimating the anomalous diffusion exponent for single particle tracking data with measurement errors - an alternative approach. Sci. Rep., 5(1):1–11, 2015.
[26] Samudrajit Thapa, Agnieszka Wylomanska, Gregor Sikora, Caroline Wagner, Diego Krapf, Holger Kantz, Alexei Chechkin, and Ralf Metzler. Leveraging large-deviation statistics to decipher the stochastic properties of measured trajectories. New J. Phys., 2020.
[27] Konrad Hinsen and Gerald R. Kneller. Communication: A multiscale Bayesian inference approach to analyzing subdiffusion in particle trajectories. J. Chem. Phys., 145(15):151101, 2016.
[28] Natallia Makarava, Sabah Benmehdi, and Matthias Holschneider. Bayesian estimation of self-similarity exponent. Phys. Rev. E, 84(2):021109, 2011.
[29] Nilah Monnier, Syuan-Ming Guo, Masashi Mori, Jun He, Péter Lénárt, and Mark Bathe. Bayesian approach to MSD-based analysis of particle motion in live cells. Biophys. J., 103(3):616–626, 2012.
[30] Samudrajit Thapa, Michael A Lomholt, Jens Krog, Andrey G Cherstvy, and Ralf Metzler. Bayesian analysis of single-particle tracking data using the nested-sampling algorithm: maximum-likelihood model selection applied to stochastic-diffusivity data. Phys. Chem. Chem. Phys., 20(46):29018–29037, 2018.
[31] Andrey G Cherstvy, Samudrajit Thapa, Caroline E Wagner, and Ralf Metzler. Non-Gaussian, non-ergodic, and non-Fickian diffusion of tracers in mucin hydrogels. Soft Matter, 15(12):2526–2551, 2019.
[32] Alexander S Serov, François Laurent, Charlotte Floderer, Karen Perronet, Cyril Favard, Delphine Muriaux, Nathalie Westbrook, Christian L Vestergaard, and Jean-Baptiste Masson. Statistical tests for force inference in heterogeneous environments. Sci. Rep., 10(1):1–12, 2020.
[33] Marcin Magdziarz, Aleksander Weron, Krzysztof Burnecki, and Joseph Klafter. Fractional Brownian motion versus the continuous-time random walk: A simple test for subdiffusive dynamics. Phys. Rev. Lett., 103(18):180602, 2009.
[34] Aleksander Weron, Joanna Janczura, Ewa Boryczka, Titiwat Sungkaworn, and Davide Calebiro. Statistical testing approach for fractional anomalous diffusion classification. Phys. Rev. E, 99(4):042149, 2019.
[35] Gorka Muñoz-Gil, Miguel Angel Garcia-March, Carlo Manzo, José D Martín-Guerrero, and Maciej Lewenstein. Single trajectory characterization via machine learning. New J. Phys., 22(1):013010, 2020.
[36] Joanna Janczura, Patrycja Kowalek, Hanna Loch-Olszewska, Janusz Szwabiński, and Aleksander Weron. Classification of particle trajectories in living cells: Machine learning versus statistical testing hypothesis for fractional anomalous diffusion. Phys. Rev. E, 102(3):032402, 2020.
[37] Stefano Bo, Falko Schmidt, Ralf Eichhorn, and Giovanni Volpe. Measurement of anomalous diffusion using recurrent neural networks. Phys. Rev. E, 100(1):010102, 2019.
[38] Patrycja Kowalek, Hanna Loch-Olszewska, and Janusz Szwabiński. Classification of diffusion modes in single-particle tracking data: Feature-based versus deep-learning approach. Phys. Rev. E, 100(3):032410, 2019.
[39] Naor Granik, Lucien E Weiss, Elias Nehme, Maayan Levin, Michael Chein, Eran Perlson, Yael Roichman, and Yoav Shechtman. Single-particle diffusion characterization by deep learning. Biophys. J., 117(2):185–192, 2019.
[40] Gorka Muñoz-Gil, Giovanni Volpe, Miguel Angel Garcia-March, Ralf Metzler, Maciej Lewenstein, and Carlo Manzo. AnDi: The anomalous diffusion challenge. arXiv:2003.12036, 2020.
[41] Gorka Muñoz-Gil, Carlo Manzo, Giovanni Volpe, Miguel Angel Garcia-March, Ralf Metzler, and Maciej Lewenstein. The anomalous diffusion challenge dataset. doi:10.5281/zenodo.370770, March 2020.
[42] P Massignan, C Manzo, JA Torreno-Pina, MF García-Parajo, M Lewenstein, and GJ Lapeyre Jr. Nonergodic subdiffusion from Brownian motion in an inhomogeneous medium. Phys. Rev. Lett., 112(15):150603, 2014.
[43] Harvey Scher and Elliott W Montroll. Anomalous transit-time dispersion in amorphous solids. Phys. Rev. B, 12(6):2455, 1975.
[44] Benoit B Mandelbrot and John W Van Ness. Fractional Brownian motions, fractional noises and applications. SIAM Rev., 10(4):422–437, 1968.
[45] J Klafter and G Zumofen. Lévy statistics in a Hamiltonian system. Phys. Rev. E, 49(6):4873, 1994.
[46] SC Lim and SV Muniandy. Self-similar Gaussian processes for modeling anomalous diffusion. Phys. Rev. E, 66(2):021114, 2002.
[47] R. Killick, P. Fearnhead, and I. A. Eckley. Optimal detection of changepoints with a linear computational cost.