Characterization of anomalous diffusion classical statistics powered by deep learning (CONDOR)
Alessia Gentili, Giorgio Volpe
Department of Chemistry, University College London, 20 Gordon Street, London WC1H 0AJ, UK
E-mail: [email protected]
Abstract.
Diffusion processes are important in several physical, chemical, biological and human phenomena. Examples include molecular encounters in reactions, cellular signalling, the foraging of animals, the spread of diseases, as well as trends in financial markets and climate records. Deviations from Brownian diffusion, known as anomalous diffusion, can often be observed in these processes, when the growth of the mean square displacement in time is not linear. An ever-increasing number of methods has thus appeared to characterize anomalous diffusion trajectories based on classical statistics or machine learning approaches. Yet, the characterization of anomalous diffusion remains challenging to date, as testified by the launch of the Anomalous Diffusion (AnDi) Challenge in March 2020 to assess and compare new and pre-existing methods on three different aspects of the problem: the inference of the anomalous diffusion exponent, the classification of the diffusion model, and the segmentation of trajectories. Here, we introduce a novel method which combines feature engineering based on classical statistics with supervised deep learning to efficiently identify the underlying anomalous diffusion model with high accuracy and to infer its exponent with a small mean absolute error in single 1D, 2D and 3D trajectories corrupted by localization noise.

Keywords: anomalous diffusion, single trajectory characterization, classical statistics analysis, deep learning, neural networks
1. Introduction
Anomalous diffusion refers to diffusion phenomena characterized by a mean square displacement (MSD) that grows in time with an exponent α that is either smaller (subdiffusion) or greater (superdiffusion) than one (standard Brownian diffusion) [1]. Typically, this is represented by a nonlinear power-law scaling (MSD(t) ∼ t^α) [1]. The growing interest in anomalous diffusion shown by the physics community lies in the fact that it extends the concept of Brownian diffusion and helps describe a broad range of phenomena of both natural and human origin [2]. Examples include molecular encounters in reactions [3], cellular signalling [4, 5], and the foraging of animals [6].
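To make the scaling MSD(t) ∼ t^α concrete, a minimal, self-contained illustration (not part of CONDOR; all names below are ours) estimates α from the log-log slope of the time-averaged MSD of a single 1D trajectory:

```python
import numpy as np

def tamsd(x, max_lag):
    """Time-averaged MSD of a 1D trajectory x for lags 1..max_lag."""
    return np.array([np.mean((x[m:] - x[:-m]) ** 2) for m in range(1, max_lag + 1)])

def estimate_alpha(x, max_lag=10, dt=1.0):
    """Estimate alpha as the slope of log MSD versus log t (MSD ~ t**alpha)."""
    lags = np.arange(1, max_lag + 1) * dt
    slope, _ = np.polyfit(np.log(lags), np.log(tamsd(x, max_lag)), 1)
    return slope

# Standard Brownian motion should give an exponent close to 1.
rng = np.random.default_rng(0)
x = np.cumsum(rng.normal(size=1000))
print(f"estimated alpha: {estimate_alpha(x):.2f}")
```

Simple MSD fits of this kind become unreliable for short, noisy trajectories, which is precisely the regime addressed by the methods discussed below.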
[...]

The goal of the AnDi Challenge was to push the methodological frontier in the characterization of three different aspects of anomalous diffusion for 1D, 2D and 3D noise-corrupted trajectories [40]: (1) the inference of the anomalous diffusion exponent α; (2) the classification of the diffusion model; and (3) the segmentation of trajectories characterized by a change in anomalous diffusion exponent and/or model at an unspecified time t.

Here, in response to the AnDi Challenge, we describe a novel method (CONDOR: Classifier Of aNomalous DiffusiOn tRajectories) for the characterization of single anomalous diffusion trajectories based on the combination of classical statistics analysis tools and supervised deep learning. Specifically, our method applies a deep feed-forward neural network to cluster parameters extracted from the statistical features of individual trajectories. By combining advantages from both approaches (classical statistics analysis and machine learning), CONDOR is highly performant and parameter-free. It allows us to identify the underlying anomalous diffusion model with high accuracy and to infer its exponent α with a small mean absolute error in single 1D, 2D and 3D trajectories corrupted by localization noise. CONDOR was always among the top performant methods in the AnDi Challenge for both the inference of the α-exponent and the model classification of 1D, 2D and 3D anomalous diffusion trajectories. In particular, CONDOR was the leading method for the inference task in 2D and 3D and for the classification task in 3D.

Figure 1. Representative anomalous diffusion trajectories. (a-b) Examples of single 1D anomalous diffusion trajectories in the absence of corrupting noise for (a) different models of standard diffusion (anomalous diffusion exponent α = 1) and for (b) the same anomalous diffusion model (sbm) with different values of α. The models considered here are the annealed transient time motion (attm), the continuous-time random walk (ctrw), the fractional Brownian motion (fbm), the Lévy walk (lw) and the scaled Brownian motion (sbm). (c) Same trajectories as in (a) corrupted with Gaussian noise (of zero mean and variance σ_n = 1) associated to a finite localization precision. For better comparison, each sequence of positions x_n taken at times t_n is represented after subtracting its average and after rescaling its displacements to their standard deviation to give the normalized sequence x̂_n [35]. The grey-shaded areas highlight a small portion (10 data points) of each trajectory.
2. Methods
CONDOR combines classical statistics analysis of single anomalous diffusion trajectories with supervised deep learning. The trajectories used in this work are generated numerically (with the andi-dataset Python package [41]) among the 5 models considered in the AnDi Challenge: the annealed transient time motion (attm) [42], the continuous-time random walk (ctrw) [43], the fractional Brownian motion (fbm) [44], the Lévy walk (lw) [45] and the scaled Brownian motion (sbm) [46]. Briefly, AnDi Challenge datasets contain 1D, 2D or 3D trajectories from these 5 models sampled at regular time steps (t_n = nΔt, with n a positive integer and Δt a unitary time interval). The anomalous exponent α varies in the range [0.05, 2] with steps of 0.05, as allowed by the respective model [40]. The shortest trajectories have at least 10 samples per dimension (T_min) and can be corrupted with additive Gaussian noise (with zero mean and variance σ_n) associated to a finite localization precision to better simulate experimental endeavours [40, 41].
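As a minimal, self-contained sketch of this kind of data (it does not call the andi-dataset package; all functions and parameters here are illustrative), one can simulate a 1D scaled Brownian motion trajectory, whose independent Gaussian increments have variance proportional to t_{n+1}^α − t_n^α, and corrupt it with Gaussian localization noise of zero mean and variance σ_n:

```python
import numpy as np

rng = np.random.default_rng(1)

def sbm_trajectory(T=200, alpha=0.7, D=1.0, dt=1.0):
    """1D scaled Brownian motion sampled at t_n = n*dt: independent Gaussian
    increments with variance 2*D*(t_{n+1}**alpha - t_n**alpha)."""
    t = np.arange(T + 1) * dt
    increments = rng.normal(0.0, np.sqrt(2.0 * D * np.diff(t ** alpha)))
    return np.concatenate(([0.0], np.cumsum(increments)))

x = sbm_trajectory(T=200, alpha=0.7)

# Additive Gaussian noise of zero mean and variance sigma_n, mimicking the
# finite localization precision of the AnDi Challenge datasets.
sigma_n = 1.0
x_noisy = x + rng.normal(0.0, np.sqrt(sigma_n), size=x.size)
```

The other four models require different generators, which is why the AnDi Challenge relies on the dedicated andi-dataset package [41] for its official datasets.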
Figure 1 shows examples of such trajectories without noise (Figure 1a-b) and with the addition of corrupting noise (Figure 1c). By visual inspection, these examples show how differences between single trajectories are not always explicit, thus making the characterization of anomalous diffusion challenging. Already in the absence of noise, the same diffusion process (e.g. for α = 1 in Figure 1a) can be described by different models, and the same model (e.g. sbm in Figure 1b) can assume different values of the exponent α. These occurrences are not always easy to set apart (e.g. attm, fbm and sbm in Figure 1a). The presence of corrupting noise (Figure 1c) and/or short trajectories (shaded areas in Figure 1a-c) adds extra complexity.

In this section, we provide a detailed description of our method (CONDOR) to characterise such single anomalous diffusion trajectories. The fundamental intuition behind our approach is that different trajectories produced by the same diffusion model should share similar statistical features that set them apart from other models [1]. Nonetheless, in the presence of noise or with few data points, these similarities and differences need not be explicit in single trajectories (Figure 1). Thus, CONDOR first manipulates single trajectories to extract this hidden statistical information and then employs the power of deep learning to analyse and cluster them. The CONDOR workflow is shown in Figure 2. We will first discuss the feature engineering step used to preprocess the trajectories and extract a set of N relevant features primarily based on classical statistics analysis (Section 2.1). These features become the inputs for the deep-learning part of the characterization algorithm based on deep feed-forward neural networks (Figure 3), which, in order, classifies the diffusion model and infers the anomalous diffusion exponent α (Figure 2). These two steps are performed by multiple neural networks, whose architecture and connection are explained in Section 2.2. Finally, Section 2.3 discusses the training of CONDOR neural networks.

Figure 2. CONDOR workflow. The CONDOR workflow is divided in three main parts: preprocessing (Section 2.1), and classification and inference (Section 2.2). In the preprocessing step, the vector of positions x associated to a single trajectory is used to extract a vector I of N = 1 + 92d inputs primarily based on classical statistics (with d = 1, 2, 3 being the trajectory dimension). The vector I is the basic input for each neural network in the classification and inference steps. The output of the classification step is determined by the output of up to three deep feed-forward neural networks and is the predicted model M̃ among five possible categories: the annealed transient time motion (attm), the continuous-time random walk (ctrw), the fractional Brownian motion (fbm), the Lévy walk (lw) and the scaled Brownian motion (sbm). Finally, the inference step predicts the value of the exponent α ∈ [0.05, 2], basing its decision on I and the predicted M̃ for each trajectory. In particular, the predicted α̃ is the arithmetic average of two distinct predictions, α̃₁ and α̃₂, for which M̃ is used as a selection element or as an additional input for the deep learning algorithm, respectively.

2.1. Preprocessing

Before the classification and inference steps (Figure 2), we analysed each trajectory to extract N = 1 + 92d inputs (with d = 1, 2, 3 being the trajectory dimension) primarily based on classical statistics (Table 1). The inputs used in this work were ultimately selected based on the final performance of the deep learning step. Independently of the dimensionality of the problem, the first input (ln T_max) is always connected to the duration T_max of each trajectory to enable the deep learning algorithm to properly weight different durations. To extract the remaining 92d inputs for every trajectory in 1D, 2D and 3D, we analysed the vector of the positions along each direction, x^i ≡ (x^i_1, x^i_2, ..., x^i_{T_max/Δt}) (with i = 1, ..., d), independently,
after first subtracting its average and rescaling its displacements w^i_j = x^i_{j+1} − x^i_j (with j = 1, ..., T_max/Δt − 1) to their standard deviation to obtain the normalised displacements ŵ^i_j [35]. Practically, for each dimension, this step produces new time sequences x̂^i ≡ (x̂^i_1, x̂^i_2, ..., x̂^i_{T_max/Δt}) of positions reconstructed from the cumulative sum of the normalised displacements of the original trajectories and with ⟨x̂^i⟩ = 0. Importantly, this step allows for a better comparison between trajectories of different dimensions and durations while keeping the performance of the deep learning algorithm optimal [35].

The vectors of the normalised positions x̂^i and of the normalised displacements ŵ^i ≡ (ŵ^i_1, ŵ^i_2, ..., ŵ^i_{T_max/Δt − 1}) are the starting point to calculate the statistical information (i.e. the deep learning inputs) associated to each trajectory. In particular, for each dimension, we calculated the mean (µ^i), the median (µ^i_{1/2}), the standard deviation (σ^i), the skewness (β^i_1), the kurtosis (β^i_2) and the approximate entropy (ApEn^i) associated to the following (Table 1): the normalised displacements ŵ^i and the vector of their absolute values Ŵ^i ≡ (|ŵ^i_1|, |ŵ^i_2|, ..., |ŵ^i_{T_max/Δt − 1}|); the vector Δ^i(m) of the absolute displacements Δ^i_j(m) = |x̂^i_{j+m} − x̂^i_j| taken at time lags τ = mΔt (with m = 2, 3, ..., T_min/Δt − 1); the vector r^i of the ratios r^i_j = ŵ^i_{j+1}/ŵ^i_j; the absolute values |F^i{ŵ^i}| of the normalized Fourier transform of ŵ^i; the vector P^i_δ of the time series of the normalised powers of ŵ^i segments selected with a non-overlapping moving window of width δΔt (with δ = 3); the vector ΔMSD^i of the absolute values of the normalised variations ΔMSD^i_j = |(MSD^i_{j+1} − MSD^i_j)/(jΔt)| of the time-averaged mean square displacement (MSD); and the absolute values |W^i_u{ŵ^i}| of the wavelet transform of ŵ^i at the first two translations uΔt, with u = 1, 2.

To further improve the performance of the deep learning algorithm, the following extra four inputs were also added for each dimension to extract additional information about regular and irregular time patterns in the trajectories (correlations, power variations and abrupt changes): the integral of the normalized autocorrelation function (ACF) of ŵ^i over a time window of width δΔt to the right of the autocorrelation peak; the level of frequency asymmetry in the normalized power spectral density of ŵ^i, calculated as (Δt/T_max) Σ_k |F^i{ŵ^i}_k| sgn(k − F_N), with F_N being the Nyquist frequency; the level of time asymmetry in P^i_δ, calculated as Σ_k P^i_{δ,k} sgn(k − ⌈T_max/(2δΔt)⌉); and the occurrence of abrupt changes in the variance of the elements of x̂^i [47].
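The sketch below (with hypothetical helper names, and reproducing only a small subset of the 1 + 92d inputs described above) illustrates the normalisation step and a few of the statistics of Table 1 in Python:

```python
import numpy as np
from scipy.stats import skew, kurtosis

def normalise(x):
    """Normalised displacements w_hat and rebuilt zero-mean positions x_hat,
    following the preprocessing described in the text (illustrative only)."""
    w = np.diff(x)
    w_hat = w / np.std(w)
    x_hat = np.cumsum(w_hat)
    return w_hat, x_hat - x_hat.mean()

def basic_stats(v):
    """Mean, median, standard deviation, skewness and kurtosis of a vector,
    i.e. a subset of the statistics listed in Table 1."""
    return [np.mean(v), np.median(v), np.std(v), skew(v), kurtosis(v)]

def features(x, T_min=10):
    """A (partial) feature vector for a single 1D trajectory."""
    w_hat, x_hat = normalise(x)
    feats = [np.log(len(x))]                    # ln T_max, the first input
    feats += basic_stats(w_hat)                 # statistics of w_hat ...
    feats += basic_stats(np.abs(w_hat))         # ... and of |w_hat|
    for m in range(2, T_min):                   # lagged absolute displacements Delta(m)
        feats += basic_stats(np.abs(x_hat[m:] - x_hat[:-m]))
    return np.asarray(feats)

# Example (x_noisy as generated in the previous sketch):
# feature_vector = features(x_noisy)
```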
Table 1. Statistical information extracted from each dimension of every trajectory as inputs for the deep learning algorithm. The approximate entropy is introduced as a way to quantify the amount of regularity and unpredictability of fluctuations over the considered time series. Here m = 2, 3, ..., T_min/Δt − 1 and u = 1, 2. Note that ApEn^i is not defined for Δ^i(m) when m = T_min/Δt − 1 and m = T_min/Δt − 2.
[Table of checkmarks indicating which of the statistics µ^i, µ^i_{1/2}, σ^i, β^i_1, β^i_2 and ApEn^i are computed for each of the vectors ŵ^i, Ŵ^i, Δ^i(m), r^i, |F^i{ŵ^i}|, P^i_δ, ΔMSD^i and |W^i_u{ŵ^i}|.]

2.2. Classification and inference

In the model classification task (Figure 2), the N inputs calculated in the previous step are first fed to a deep three-layer feed-forward neural network (architecture in Figure 3) with two sigmoid hidden layers of twenty neurones each and one softmax output layer of five neurones to predict the model M̃ of anomalous diffusion among the five possible categories (attm, ctrw, fbm, lw and sbm). We settled for this specific architecture with two dense layers because it is deep enough to efficiently classify arbitrary noisy input vectors, while avoiding the slower training associated with an excessively complex architecture. The output of this network is then refined by two additional networks with similar architectures (but with three output neurones and two output neurones, respectively) to improve the classification among trajectories respectively classified as attm, fbm and sbm, and as attm and ctrw (Figure 2), as these models can be more easily confused. Indeed, in the attm model, the diffusion coefficient can vary in space or time (in the AnDi Challenge only time variations were considered [40, 42]). This feature can be easily confused with features of other processes: namely, the time-dependent diffusivity of the sbm [46], the time irregularities between consecutive steps in the ctrw [43] and the correlations among the increments in the fbm [44].

Figure 3. Basic neural network architecture in CONDOR. Architecture of the deep three-layer feed-forward neural network chosen for both classification and inference tasks. Inputs I_n (with n = 1, 2, ..., N) are processed by two hidden layers and an output layer to yield outputs O_q (with q = 1, 2, ..., Q). The hidden layers have K = 20 neurons each with a sigmoidal activation function. The output layer has Q neurons with a softmax activation function (σ). Each neuron has a summer (Σ) that gathers its weighted inputs and bias to return a scalar output. For each neuron, the weights v and bias b are optimised during the training of the network.
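For orientation only, a comparable classifier (two 20-neuron logistic hidden layers feeding a five-class softmax output) can be sketched in Python with scikit-learn. This is not the authors' MATLAB patternnet implementation, and scikit-learn trains with Adam or L-BFGS rather than scaled conjugate gradient; the feature matrix X (one row of N inputs per trajectory) and the model labels y are assumed to come from the preprocessing step, and are replaced by random stand-ins here.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Hypothetical stand-in data: N = 1 + 92*d inputs per trajectory (d = 1 here)
# and one of five labels (0: attm, 1: ctrw, 2: fbm, 3: lw, 4: sbm).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 93))
y = rng.integers(0, 5, size=1000)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.15, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(20, 20),   # two sigmoid (logistic) hidden layers
                    activation="logistic",
                    max_iter=2000,
                    random_state=0)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```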
Once the model is identified, the next step is to infer the anomalous diffusion exponent α. The estimated value α̃ is obtained as the arithmetic average of the outputs of two different approaches based on deep learning (Figure 2), as the two approaches lead to predicting an underestimated α value (α̃₁) and an overestimated α value (α̃₂), respectively. In the first approach (Figure 2), the output M̃ of the classification is used to cluster the trajectories into five categories based on the identified model. Each category is analysed independently by a different set of neural networks (architectures as in Figure 3). The neural networks for attm and ctrw have five output categories predicting α̃₁ in the range 0.05 to 1 [40, 41]. The neural network for lw has five output categories predicting α̃₁ in the range 1 to 2 [40, 41]. Finally, the prediction for fbm and sbm is based on a 1×2 directed rooted tree of neural networks with the root network having two equally spaced output categories predicting α̃₁ in the range 0.05 to 2; each category is then refined by a second neural network with five equally spaced output categories in the corresponding subrange identified by the root network. In the second approach (Figure 2), the output M̃ of the model classification is used as an additional input together with the list of inputs identified in Section 2.1. This new set of inputs feeds a 1×4 directed rooted tree of neural networks, each with the same overall architecture as in Figure 3. The root neural network has four equally spaced output categories predicting α̃₂ in the range 0.05 to 2; each of these categories is then refined by a second neural network with five equally spaced output categories in the corresponding α subrange identified by the root network.
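A schematic of how such rooted-tree outputs can be turned into a numerical α̃ (our illustrative reading of the scheme above, not CONDOR's exact mapping) is sketched below: the root softmax selects one of four equally spaced subranges of [0.05, 2], the refinement softmax selects one of five categories inside it, and the final estimate averages the two approaches.

```python
import numpy as np

ALPHA_MIN, ALPHA_MAX = 0.05, 2.0

def bin_centres(lo, hi, n_bins):
    """Centres of n_bins equally spaced categories covering [lo, hi]."""
    edges = np.linspace(lo, hi, n_bins + 1)
    return 0.5 * (edges[:-1] + edges[1:])

def alpha_from_tree(root_probs, leaf_probs):
    """Map the root softmax (4 coarse subranges) and the selected refinement
    softmax (5 fine categories) to a single alpha value (illustrative)."""
    coarse_edges = np.linspace(ALPHA_MIN, ALPHA_MAX, 5)
    k = int(np.argmax(root_probs))                       # coarse subrange
    fine = bin_centres(coarse_edges[k], coarse_edges[k + 1], 5)
    return fine[int(np.argmax(leaf_probs))]              # centre of the fine category

alpha_1 = 0.62                                           # hypothetical output of approach 1
alpha_2 = alpha_from_tree([0.1, 0.7, 0.1, 0.1],          # hypothetical softmax outputs
                          [0.2, 0.5, 0.1, 0.1, 0.1])
alpha_tilde = 0.5 * (alpha_1 + alpha_2)                  # arithmetic average, as in Figure 2
print(f"alpha_tilde = {alpha_tilde:.3f}")
```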
2.3. Training

After defining their architecture, we trained each network of CONDOR independently on a set of trajectories for which the ground-truth values of the model and of α were known. Each training was performed using the scaled conjugate gradient backpropagation (scg) algorithm (for speed) on mixed datasets with up to three hundred thousand trajectories from different models and values of α, with different durations and levels of corrupting noise. In particular, to simulate the presence of localization error, Gaussian noise with zero mean and a random variance σ_n was added to each trajectory [41]. Datasets were randomly split among training (70%), validation (15%) and test (15%) trajectories. Overall, training with scg is very efficient on a standard desktop computer (processor Intel(R) Core(TM) i7-4790 CPU @ 3.60 GHz), with the size of the training dataset being the main limiting factor. Training the three consecutive networks used for classification (Figure 2) took at most an hour and a half, while training the fourteen networks for the inference task took at most 3 hours. Training time can be reduced by up to 2 orders of magnitude by employing a GPU. After training, CONDOR is very computationally efficient when classifying new trajectories on a standard desktop computer.

Figure 4. Evolution of CONDOR performance with the size of the training dataset. (a-c) F1 score calculated on the total 1D, 2D and 3D training datasets of 300k trajectories as a function of the number of trajectories (N_t) employed to train the three consecutive networks of the classification task (Figure 2). N_t is normalized to the size of a single AnDi Challenge dataset (N_c = 10k trajectories). The F1 score is given at the output of (a) the first network, (b) the second network and (c) the third network used to refine the model predictions in the classification task (Figure 2). At each of these steps, only the best performing network (black squares) is carried forward to the next step. (d-e) MAE calculated on the total 1D, 2D and 3D training datasets of 300k trajectories each as a function of N_t/N_c for (d) α̃₁ (dashed lines) and α̃₂ (dotted lines) and (e) for α̃ (solid lines), as defined in Figure 2. For each N_t/N_c, the arithmetic average α̃ is calculated using the best performing networks used to infer α̃₁ and α̃₂ with the same number of training trajectories. The networks with the lowest MAE for α̃₁ and α̃₂ are also shown in (d) (black circles). The shaded areas correspond to one standard deviation around the mean values (calculated from the performance of 5 distinct training attempts).

Figure 4 shows the performance of the classification and inference tasks as a function of the size of the training dataset. We evaluated the performance of the classification task with the F1 score:

F1 = TP / (TP + (FP + FN)/2),
where TP, FP and FN are the true positive, false positive and false negative rates, respectively. In particular, we computed the micro average of the F1 score with the Scikit-learn Python library. F1 can vary between 0 (no trajectory classified correctly) and 1 (all trajectories classified correctly). To evaluate the performance of the inference of the α-exponent, we used the mean absolute error (MAE) instead:

MAE = (1/n) Σ_{i=1}^{n} |α̃_i − α_i|,

where n is the number of trajectories in the dataset, and α̃_i and α_i are the predicted and the ground-truth values of the anomalous diffusion exponent of the i-th trajectory, respectively. MAE has a lower limit of 0, which is attained when the exponents of all n trajectories are predicted exactly.
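Both metrics are straightforward to reproduce; a short sketch with hypothetical ground-truth and predicted values uses scikit-learn's micro-averaged F1, as in the paper, and numpy for the MAE:

```python
import numpy as np
from sklearn.metrics import f1_score

# Hypothetical ground-truth / predicted model labels and exponents.
y_true = np.array([0, 1, 2, 3, 4, 2])
y_pred = np.array([0, 1, 2, 3, 4, 4])
alpha_true = np.array([0.50, 1.20, 0.85])
alpha_pred = np.array([0.55, 1.10, 0.90])

f1 = f1_score(y_true, y_pred, average="micro")    # micro-averaged F1 score
mae = np.mean(np.abs(alpha_pred - alpha_true))    # mean absolute error on alpha
print(f"F1 = {f1:.3f}, MAE = {mae:.3f}")
```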
Figure 4a-c shows the F1 score measured on the training dataset at the output of the three consecutive networks used for classification (Figure 2) for the 1D, 2D and 3D cases. For each step, we increased the number of trajectories used for training progressively until the F1 score reached near saturation. Past this point, any gain in F1 score is relatively small and by far outbalanced by the increased computational cost associated to longer training times. The performance improves with the dimension d of the trajectories and, interestingly, at each step, saturation of the F1 score for higher d appears to take place with smaller training datasets, as each trajectory intrinsically contains d times more information. Only the best performing network at each step is selected (black squares in Figure 4a-c) and carried forward to the next step. Independently of the dimensionality of the problem, the first network produces the highest increase of F1 score with the size of the training dataset (Figure 4a): for example, going from 10k to 100k trajectories, the F1 score increases approximately by 4% in 1D, 5% in 2D and 6% in 3D. The combined contribution of the following two networks to the overall F1 score is less pronounced, especially for higher dimensions (Figure 4b-c); nonetheless, they play a key role in refining the classification among attm, fbm and sbm (Figure 4b) or among attm and ctrw (Figure 4c), thus boosting the final F1 score approximately by 4% in 1D, 3% in 2D and 1% in 3D.

Similarly, Figure 4d-e shows the MAE associated to the inference of the α-exponent according to α̃₁, α̃₂ and their arithmetic average α̃ (as defined in Figure 2) for the 1D, 2D and 3D cases. As for F1, we report the MAE of each method as a function of the size of the training dataset progressively until it reaches near saturation. In general, the MAE lowers with the number of trajectories used for training and with their dimension d. As noted earlier, independently of the dimensionality of the problem, the arithmetic average α̃ provides a better estimate of the α exponent than both α̃₁ and α̃₂, leading to an overall MAE at saturation that is better than those obtained with the individual methods by at least 0.01 in all the dimensions.

Figure 5. CONDOR performance on test datasets. (a) F1 score and (b) MAE obtained using CONDOR on the 1D, 2D and 3D datasets used in the classification and inference tasks of the AnDi Challenge (10k trajectories each). In (a), the stacked bar charts show the contribution of the three consecutive networks used for classification (Figure 2) to the total F1 score. In (b), the grouped bar charts show the MAE associated to the inference of the α-exponent according to α̃₁, α̃₂ and their arithmetic average α̃, which provides the best estimate of the α exponent (Figure 2).

We have implemented all neural networks using the patternnet function in the MATLAB-based Deep Learning Toolbox. We provide the MATLAB implementations of the key functionalities of CONDOR []. Nonetheless, our approach is independent of the deep-learning framework used for its implementation.
Moreover, it is also possible to exchange neural network models between the Deep Learning Toolbox and other platforms, such as those based on TensorFlow.
3. Results
The best performing networks identified during the training step (black symbols in Figure 4) were carried forward and used in the prediction step, where we classified the anomalous diffusion model and inferred the α exponent of the same datasets used in the AnDi Challenge (with 10k trajectories each) for testing purposes [40]. Figure 5a shows the overall F1 score obtained on these test datasets for the 1D, 2D and 3D cases, and Figure 5b
shows the corresponding overall MAE. These very respectable values enabled CONDOR to be one of the top performant methods in the AnDi Challenge [40].
Classification task

Figure 6. Detailed analysis of CONDOR performance for the classification task. CONDOR classification performance (F1 score) on the 1D, 2D and 3D datasets of the AnDi Challenge used for the classification task (as in Figure 5a) (a) by model, (b) by strength of the localization error (Gaussian noise with zero mean and variance σ_n), and (c-e) by trajectory length for the (c) 1D, (d) 2D and (e) 3D case, respectively. In (c-e), the trends are fitted to a power law (y = c − a x^(−b), with a, b, c > 0) for ease of visualization. The shaded areas highlight the values of the F1 score above the average value (Figure 5a).

Figure 6 provides further detail on CONDOR performance in the classification task by model (Figure 6a), by strength of the localization noise (Figure 6b) and by trajectory length (Figure 6c-e). These data largely support the robustness of CONDOR performance across the parameter space.

Consistently across the dimensions (1D, 2D and 3D), CONDOR recognises two models (ctrw and lw) extremely well (Figure 6a). This is understandable, as both models show typical features that set them apart from the remaining models and are relatively robust to the presence of localization noise (Figure 1): i.e. long ballistic steps in lw trajectories and long waiting times in ctrw trajectories [1]. For the remaining three models (attm, fbm and sbm), CONDOR performs progressively better moving from 1D to 3D. As noticed earlier (Figure 1), the distinctive features among these models are more subtle and can be better evaluated for higher dimensions as these are intrinsically richer in information content. While CONDOR performs comparably well with fbm and sbm, it struggles more with attm, as this model can be more easily confused with others (Figure 1). Nonetheless, CONDOR performance on attm trajectories improves markedly with their dimension from 1D to 3D.

The performance of CONDOR with different strengths of Gaussian noise further proves the robustness of our method (Figure 6b). The classification in 2D and 3D is largely unaffected by the amount of localization noise added to the trajectories, with an essentially constant F1 score for all values of σ_n. Due to the relative lack of information contained within individual trajectories, the 1D case is more exposed to a negative impact of large noise variances. In fact, while the performances in 1D and 2D are comparable at the lower noise levels, the highest level of localization noise leads to a visibly reduced F1 value for the 1D case.

Finally, Figure 6c-e shows CONDOR classification performance as a function of the duration of the trajectory. Due to the richer information content, classification is increasingly better for longer trajectories and for higher dimensions. The trends in Figure 6c-e are well fitted by power-law functions (dashed lines), which show how, above a certain threshold in duration, CONDOR performs above the average F1 scores (delimited by the shaded areas), reaching F1 values of up to 0.96 in 1D, 0.97 in 2D and 0.99 in 3D. Moreover, this threshold decreases when increasing the dimension from 1D to 3D.

α-exponent inference task

Figure 7 provides further detail on CONDOR performance in the α-exponent inference task by value of α (Figure 7a-c), by model (Figure 7d), by strength of the localization noise (Figure 7e) and by trajectory length (Figure 7f-h). These data largely confirm the observations on CONDOR performance for the classification task, thus confirming its robustness across the parameter space.

Figure 7. Detailed analysis of CONDOR performance for the α-exponent inference task. CONDOR α-exponent inference performance (MAE) on the 1D, 2D and 3D datasets of the AnDi Challenge used for the inference task (as in Figure 5b) (a-c) by value of α for the (a) 1D, (b) 2D and (c) 3D case, (d) by model, (e) by strength of the localization error (Gaussian noise with zero mean and variance σ_n), and (f-h) by trajectory length for the (f) 1D, (g) 2D and (h) 3D case, respectively. In (a-c), each bar corresponds to an increment in α of 0.05 and the dotted lines represent the overall MAE as in Figure 5b. In (f-h), the trends are fitted to a power law (y = a x^(−b) + c, with a, b, c > 0) for ease of visualization. The dashed areas highlight the values of the MAE below the average value (Figure 5b).

In Figure 7a-c, we can see how CONDOR performs in a relatively consistent way across the α values, with the exception of small α values in the 1D and 2D cases. This is primarily due to a relatively large proportion of 1D and 2D attm trajectories with these small α values being assigned to other α values: for example, for these trajectories, a sizeable fraction of the predicted values of α is greater than 1, against only a small fraction for the corresponding 3D attm trajectories.

As already noted for the model classification (Figure 6a), CONDOR can also infer the α-exponent very well and consistently across the dimensions for two models (ctrw and lw) (Figure 7d). For the three remaining models (attm, fbm and sbm), CONDOR again performs progressively better moving from 1D to 3D. The best results across the dimensions are obtained for the fbm, followed by the sbm. Once again, as in the classification task, CONDOR struggles more with attm.

Figure 7e confirms CONDOR robustness to the addition of localization error. As noted for the classification, the inference of the α exponent in 2D and 3D is also largely unaffected by the amount of localization noise added to the trajectories. However, in 1D, the MAE increases with the noise strength, from 0.140 at the lowest noise level to 0.193 for σ_n = 1.

Finally, Figure 7f-h shows CONDOR performance in inferring the α exponent as a function of the duration of the trajectory. Similar to the classification task, CONDOR performance improves for higher dimensions and for longer trajectories, as the MAE becomes increasingly lower, in agreement with prior observations [35, 37].
The trends in Figure 7f-h are well fitted by power-law functions (dashed lines), which show how, above a certain threshold in duration, CONDOR performance is significantly below the average MAE (delimited by the shaded areas), reaching values as small as 0.083 in 1D, 0.085 in 2D and 0.065 in 3D. As for the classification task, this threshold decreases when increasing the dimension from 1D to 3D.
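The power-law fits used for the trends in Figures 6c-e and 7f-h can be reproduced with a standard least-squares routine; the sketch below uses hypothetical trend values purely to illustrate the fit of y = a x^(−b) + c.

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(x, a, b, c):
    """y = a * x**(-b) + c, the form used for the MAE-versus-length trends."""
    return a * x ** (-b) + c

# Hypothetical MAE values as a function of trajectory length T (illustration only).
T = np.array([10.0, 20.0, 50.0, 100.0, 200.0, 500.0, 1000.0])
mae = np.array([0.30, 0.25, 0.19, 0.16, 0.14, 0.12, 0.11])

(a, b, c), _ = curve_fit(power_law, T, mae, p0=(1.0, 0.5, 0.1))
print(f"fit: MAE ≈ {a:.2f} * T^(-{b:.2f}) + {c:.2f}")
```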
4. Conclusion
In conclusion, we have introduced a new method (CONDOR) for the characterisation of single anomalous diffusion trajectories in response to the AnDi Challenge [40]. Our method combines tools from classical statistics analysis with deep learning techniques. CONDOR can identify the anomalous diffusion model and its exponent with high accuracy, in relatively short times and without the need of any prior information. Moreover, CONDOR performance is robust to the addition of localization noise. As a consequence, CONDOR was among the top performant methods in the AnDi Challenge [40]. At the expense of an increased computational cost, performance can be further improved by expanding the set of inputs fed to the deep learning algorithm and by employing more advanced network architectures (e.g. recurrent neural networks or convolutional neural networks). While most advanced [...]
Acknowledgements
The authors would like to thank the organisers of the AnDi Challenge for providing a stimulating scientific environment around anomalous diffusion and its characterization. The authors acknowledge sponsorship for this work by the U.S. Office of Naval Research Global (Award No. N62909-18-1-2170).
References

[1] Ralf Metzler, Jae-Hyung Jeon, Andrey G Cherstvy, and Eli Barkai. Anomalous diffusion models and their properties: non-stationarity, non-ergodicity, and ageing at the centenary of single particle tracking. Phys. Chem. Chem. Phys., 16(44):24128–24164, 2014.
[2] Joseph Klafter and Igor M Sokolov. Anomalous diffusion spreads its wings. Phys. World, 18(8):29, 2005.
[3] Eli Barkai, Yuval Garini, and Ralf Metzler. Strange kinetics of single molecules in living cells. Phys. Today, 65(8):29, 2012.
[4] Felix Höfling and Thomas Franosch. Anomalous transport in the crowded world of biological cells. Rep. Prog. Phys., 76(4):046602, 2013.
[5] Diego Krapf and Ralf Metzler. Strange interfacial molecular dynamics. Phys. Today, 72(9):48–55, 2019.
[6] Gandhimohan M Viswanathan, Marcos GE Da Luz, Ernesto P Raposo, and H Eugene Stanley. The physics of foraging: an introduction to random searches and biological encounters. Cambridge University Press, 2011.
[7] Giorgio Volpe and Giovanni Volpe. The topography of the environment alters the optimal search strategy for active particles. Proc. Natl. Acad. Sci. U.S.A., 114(43):11350–11355, 2017.
[8] Dirk Brockmann, Lars Hufnagel, and Theo Geisel. The scaling laws of human travel. Nature, 439(7075):462–465, 2006.
[9] Marta C Gonzalez, Cesar A Hidalgo, and Albert-Laszlo Barabasi. Understanding individual human mobility patterns. Nature, 453(7196):779–782, 2008.
[10] H Zhang and GH Li. Anomalous epidemic spreading in heterogeneous networks. Phys. Rev. E, 102(1):012315, 2020.
[11] Ugur Tirnakli and Constantino Tsallis. Epidemiological model with anomalous kinetics: the COVID-19 pandemics. medRxiv, 2020.
[12] Vasiliki Plerou, Parameswaran Gopikrishnan, Luís A Nunes Amaral, Xavier Gabaix, and H Eugene Stanley. Economic fluctuations and anomalous diffusion. Phys. Rev. E, 62(3):R3023, 2000.
[13] Fredrick Michael and MD Johnson. Financial market dynamics. Physica A, 320:525–534, 2003.
[14] Mozhdeh Massah and Holger Kantz. Confidence intervals for time averages in the presence of long-range correlations, a case study on Earth surface temperature anomalies. Geophys. Res. Lett., 43(17):9243–9249, 2016.
[15] Yasmine Meroz and Igor M Sokolov. A toolbox for determining subdiffusive mechanisms. Phys. Rep., 573:1–29, 2015.
[16] Ido Golding and Edward C Cox. Physical nature of bacterial cytoplasm. Phys. Rev. Lett., 96(9):098102, 2006.
[17] Avi Caspi, Rony Granek, and Michael Elbaum. Enhanced diffusion in active intracellular transport. Phys. Rev. Lett., 85(26):5655, 2000.
[18] Eldad Kepten, Aleksander Weron, Grzegorz Sikora, Krzysztof Burnecki, and Yuval Garini. Guidelines for the fitting of anomalous diffusion mean square displacement graphs from single particle tracking experiments. PLOS One, 10(2):e0117722, 2015.
[19] Xavier Michalet. Mean square displacement analysis of single-particle trajectories with localization error: Brownian motion in an isotropic medium. Phys. Rev. E, 82(4):041914, 2010.
[20] Stephanie C Weber, Andrew J Spakowitz, and Julie A Theriot. Bacterial chromosomal loci move subdiffusively through a viscoelastic cytoplasm. Phys. Rev. Lett., 104(23):238102, 2010.
[21] Diego Krapf, Enzo Marinari, Ralf Metzler, Gleb Oshanin, Xinran Xu, and Alessio Squarcini. Power spectral density of a single Brownian trajectory: what one can and cannot learn from it. New J. Phys., 20(2):023029, 2018.
[22] Diego Krapf, Nils Lukat, Enzo Marinari, Ralf Metzler, Gleb Oshanin, Christine Selhuber-Unkel, Alessio Squarcini, Lorenz Stadler, Matthias Weiss, and Xinran Xu. Spectral content of a single non-Brownian trajectory. Phys. Rev. X, 9(1):011019, 2019.
[23] Vincent Tejedor, Olivier Bénichou, Raphael Voituriez, Ralf Jungmann, Friedrich Simmel, Christine Selhuber-Unkel, Lene B Oddershede, and Ralf Metzler. Quantitative analysis of single particle trajectories: mean maximal excursion method. Biophys. J., 98(7):1364–1372, 2010.
[24] Jae-Hyung Jeon, Natascha Leijnse, Lene B Oddershede, and Ralf Metzler. Anomalous diffusion and power-law relaxation of the time averaged mean squared displacement in worm-like micellar solutions. New J. Phys., 15(4):045011, 2013.
[25] Krzysztof Burnecki, Eldad Kepten, Yuval Garini, Grzegorz Sikora, and Aleksander Weron. Estimating the anomalous diffusion exponent for single particle tracking data with measurement errors - an alternative approach. Sci. Rep., 5(1):1–11, 2015.
[26] Samudrajit Thapa, Agnieszka Wylomanska, Gregor Sikora, Caroline Wagner, Diego Krapf, Holger Kantz, Alexei Chechkin, and Ralf Metzler. Leveraging large-deviation statistics to decipher the stochastic properties of measured trajectories. New J. Phys., 2020.
[27] Konrad Hinsen and Gerald R. Kneller. Communication: A multiscale Bayesian inference approach to analyzing subdiffusion in particle trajectories. J. Chem. Phys., 145(15):151101, 2016.
[28] Natallia Makarava, Sabah Benmehdi, and Matthias Holschneider. Bayesian estimation of self-similarity exponent. Phys. Rev. E, 84(2):021109, 2011.
[29] Nilah Monnier, Syuan-Ming Guo, Masashi Mori, Jun He, Péter Lénárt, and Mark Bathe. Bayesian approach to MSD-based analysis of particle motion in live cells. Biophys. J., 103(3):616–626, 2012.
[30] Samudrajit Thapa, Michael A Lomholt, Jens Krog, Andrey G Cherstvy, and Ralf Metzler. Bayesian analysis of single-particle tracking data using the nested-sampling algorithm: maximum-likelihood model selection applied to stochastic-diffusivity data. Phys. Chem. Chem. Phys., 20(46):29018–29037, 2018.
[31] Andrey G Cherstvy, Samudrajit Thapa, Caroline E Wagner, and Ralf Metzler. Non-Gaussian, non-ergodic, and non-Fickian diffusion of tracers in mucin hydrogels. Soft Matter, 15(12):2526–2551, 2019.
[32] Alexander S Serov, François Laurent, Charlotte Floderer, Karen Perronet, Cyril Favard, Delphine Muriaux, Nathalie Westbrook, Christian L Vestergaard, and Jean-Baptiste Masson. Statistical tests for force inference in heterogeneous environments. Sci. Rep., 10(1):1–12, 2020.
[33] Marcin Magdziarz, Aleksander Weron, Krzysztof Burnecki, and Joseph Klafter. Fractional Brownian motion versus the continuous-time random walk: A simple test for subdiffusive dynamics. Phys. Rev. Lett., 103(18):180602, 2009.
[34] Aleksander Weron, Joanna Janczura, Ewa Boryczka, Titiwat Sungkaworn, and Davide Calebiro. Statistical testing approach for fractional anomalous diffusion classification. Phys. Rev. E, 99(4):042149, 2019.
[35] Gorka Muñoz-Gil, Miguel Angel Garcia-March, Carlo Manzo, José D Martín-Guerrero, and Maciej Lewenstein. Single trajectory characterization via machine learning. New J. Phys., 22(1):013010, 2020.
[36] Joanna Janczura, Patrycja Kowalek, Hanna Loch-Olszewska, Janusz Szwabiński, and Aleksander Weron. Classification of particle trajectories in living cells: Machine learning versus statistical testing hypothesis for fractional anomalous diffusion. Phys. Rev. E, 102(3):032402, 2020.
[37] Stefano Bo, Falko Schmidt, Ralf Eichhorn, and Giovanni Volpe. Measurement of anomalous diffusion using recurrent neural networks. Phys. Rev. E, 100(1):010102, 2019.
[38] Patrycja Kowalek, Hanna Loch-Olszewska, and Janusz Szwabiński. Classification of diffusion modes in single-particle tracking data: Feature-based versus deep-learning approach. Phys. Rev. E, 100(3):032410, 2019.
[39] Naor Granik, Lucien E Weiss, Elias Nehme, Maayan Levin, Michael Chein, Eran Perlson, Yael Roichman, and Yoav Shechtman. Single-particle diffusion characterization by deep learning. Biophys. J., 117(2):185–192, 2019.
[40] Gorka Muñoz-Gil, Giovanni Volpe, Miguel Angel Garcia-March, Ralf Metzler, Maciej Lewenstein, and Carlo Manzo. AnDi: The anomalous diffusion challenge. arXiv:2003.12036, 2020.
[41] Gorka Muñoz-Gil, Carlo Manzo, Giovanni Volpe, Miguel Angel Garcia-March, Ralf Metzler, and Maciej Lewenstein. The anomalous diffusion challenge dataset. doi:10.5281/zenodo.370770, March 2020.
[42] P Massignan, C Manzo, JA Torreno-Pina, MF García-Parajo, M Lewenstein, and GJ Lapeyre Jr. Nonergodic subdiffusion from Brownian motion in an inhomogeneous medium. Phys. Rev. Lett., 112(15):150603, 2014.
[43] Harvey Scher and Elliott W Montroll. Anomalous transit-time dispersion in amorphous solids. Phys. Rev. B, 12(6):2455, 1975.
[44] Benoit B Mandelbrot and John W Van Ness. Fractional Brownian motions, fractional noises and applications. SIAM Rev., 10(4):422–437, 1968.
[45] J Klafter and G Zumofen. Lévy statistics in a Hamiltonian system. Phys. Rev. E, 49(6):4873, 1994.
[46] SC Lim and SV Muniandy. Self-similar Gaussian processes for modeling anomalous diffusion. Phys. Rev. E, 66(2):021114, 2002.
[47] R. Killick, P. Fearnhead, and I. A. Eckley. Optimal detection of changepoints with a linear computational cost.