Correcting for Bias of Molecular Confinement Parameters Induced by Small Time Series Sample Sizes in Single-Molecule Trajectories Containing Measurement Noise
CCorrecting for Bias of Molecular Confinement Parameters Induced by Small-Time-Series SampleSizes in Single-Molecule Trajectories Containing Measurement Noise
Christopher P. Calderon †∗†
Numerica Corporation, 4850 Hahns Peak Drive, Loveland, Colorado, 80538 (Dated: November 9, 2018)Several single-molecule studies aim to reliably extract parameters characterizing molecular con-finement or transient kinetic trapping from experimental observations. Pioneering works from singleparticle tracking (SPT) in membrane diffusion studies [Kusumi et al.,
Biophysical J. , (1993)] ap-pealed to Mean Square Displacement (MSD) tools for extracting diffusivity and other parametersquantifying the degree of confinement. More recently, the practical utility of systematically treatingmultiple noise sources (including noise induced by random photon counts) through likelihood tech-niques have been more broadly realized in the SPT community. However, bias induced by finite timeseries sample sizes (unavoidable in practice) has not received great attention. Mitigating parameterbias induced by finite sampling is important to any scientific endeavor aiming for high accuracy, butcorrecting for bias is also often an important step in the construction of optimal parameter estimates.In this article, it is demonstrated how a popular model of confinement can be corrected for finitesample bias in situations where the underlying data exhibits Brownian diffusion and observationsare measured with non-negligible experimental noise (e.g., noise induced by finite photon counts).The work of Tang and Chen [ J. Econometrics , (2009)] is extended to correct for bias in theestimated “corral radius” (a parameter commonly used to quantify confinement in SPT studies) inthe presence of measurement noise. It is shown that the approach presented is capable of reliablyextracting the corral radius using only hundreds of discretely sampled observations in situationswhere other methods (including MSD and Bayesian techniques) would encounter serious difficulties.The ability to accurately statistically characterize transient confinement suggests new techniquesfor quantifying confined and/or hop diffusion in complex environments. PACS numbers: 87.80.Nj, 87.10.Mn, 05.40Jc, 2.50.Tt, 5.45.Tp
I. INTRODUCTION
In many live cell applications, large scale cellularstructures impose complex constraints on the motion ofsmaller biomolecules [1–13]. Quantifying these effectsfrom in vivo observations is the goal of numerous exper-iments. Fortunately, recent advances in microscopy andother single-molecule probes have substantially improvedresolution in both time and space, so various complex ki-netic constraints can be more quantitatively measured.Fluorescence microscopy can be used to extract ki-netic information from a sequence of point spread func-tion (PSF) measurements [14, 15]. Pioneering efforts[16, 17] aiming to quantify transient “corralling” param-eters characterizing confinement induced by cytoskeletaland transmembrane protein structures [3] appealed toMean Square Displacement (MSD) analyses. Recently,the utility of statistically motivated time series analysishave become more popular for analyzing single-moleculedata. These tools offer several advantages over tradi-tional MSD-based techniques. For example, likelihoodand Bayesian-based statistical analysis methods permitone more flexibility in terms of inference decisions char-acterizing noisy systems, and these schemes also providemore efficient estimation strategies [8, 13, 18–23].Many of the first works utilizing likelihood-based anal-ysis methods to analyze single particle tracking (SPT) ∗ [email protected] data ignored the effects of measurement noise (also re-ferred to as “localization precision” [13, 23, 24]), butthe importance of modeling this noise source has beendemonstrated in various works focused on analysis single-molecule data where measurement noise induced by theexperimental apparatus is not negligible relative to ther-mal fluctuations inherent to single-molecule measure-ments [22, 23, 25, 26]. Ref. [23] provides a discussion onissues associated with simultaneously quantifying mea-surement and molecular diffusion in SPT applications,but the focus of Ref. [23] is on optimal parameter esti-mation. It is well-known that the maximum likelihoodestimator is asymptotically unbiased [18] and achievesthe Cramer-Rao lower bound when the assumed underly-ing model precisely matches the data generating mecha-nism producing observations [19]. In applications wheretracking molecules for a long time is complicated dueto crowding, photobleaching, and/or emitter “blinking”[14, 15], it is difficult to collect a large number of mea-surements (hence the asymptotic sampling regime is notencountered). In PSF modeling, the appropriate para-metric models have been more broadly agreed upon [19],but the “correct” stochastic model consistent with exper-imental single-molecule observations is a more delicateissue [26, 27]. Furthermore, even if observations are con-sistent with the assumed stochastic model, correcting forsystematic bias introduced by finite sample sizes whereobservations contain both diffusive noise and measure-ment noise has not received great attention in the SPTliterature. Therefore estimators accurately quantifying a r X i v : . [ c ond - m a t . m e s - h a ll ] J un finite sample bias (as opposed to asymptotically mini-mizing parameter variance or bias) are desirable whenanalyzing experimental trajectories.This work introduces a bias correction scheme for ex-tracting the “corral radius” [16, 17]. This quantity iscommonly used to characterize confinement in biophys-ical applications [7, 11, 13]. Examples characteristic ofsampling regimes encountered in fluorescence microscopyare presented, but the approach can be readily general-ized to other time and length scales. The bias correctionremoves systematic errors induced by observing a shortfinite time series (enabling estimation in situations wherean MSD curve is deemed too noisy) in contrast to remov-ing artifacts of motion blur [5, 22]. However the analysispresented explicitly shows how to map estimated param-eters to MSD curves, so previously proposed motion blurcorrections for confinement [5, 22] can be used to aug-ment the tools presented.Likelihood-based techniques [21, 26, 28–32] are em-ployed throughout this article; such methods enable oneto consider numerous time series analysis tools in phys-ical and life science applications. The author has foundadopting statistically rigorous time series analysis toolsfrom econometrics and computational finance helpful instatistically analyzing data from microscopic simulations[33–35] and single-molecule force manipulation exper-imental data where measurement noise is commensu-rate with thermal noise [25, 26, 36]. In this article,the relevance of recent likelihood-based tools [30–32] toSPT modeling is demonstrated. Section II presents thestochastic differential equation (SDE) model considered,relates parameters extracted from these models to tra-ditional MSD analyses, and introduces the basic toolsutilized throughout. The first figure and tables in Sec.III present the main results; the remaining results explainand justify how a theory originally developed for estimat-ing SDEs observed without measurement noise [30] canbe modified and extended to handle the situation wheremeasurement noise contaminates time series data. II. METHODS
The underlying position of a molecule will be denotedby x and the noisy experimental observations will be de-noted by ψ ; the motion models considered take the fol-lowing form: dx t = − ∇ V ( x t ) dt + σdB t (1) ψ i = x i + (cid:15) i ; (cid:15) i ∼ N (0 , R ) . (2)The above is an SDE model [37] with a constant diffusioncoefficient driven by a standard Brownian motion process B t (the subscripts denote a continuous time model) hav-ing a drift function determined by a potential V ( x ). Themeasurements, ψ i , in Eqn. 2 are contaminated by noise, (cid:15) i , modeled as draws from a Normal distribution withmean zero and variance R (denoted by N (0 , R )). In SPT applications, the effective measurement noise (i.e., local-ization precision) is often quantified by R / . The mea-surement noise is typically assumed to be an independentand identically distributed (i.i.d.) random number se-quence, and the variance R is assumed unknown a priori (the model also assumes statistical independence of x and (cid:15) ). The integer subscript i denotes that trajectory obser-vations are made at discrete times and t i +1 − t i = ∆ t ∀ i .Since typical SPT calculations assume independence be-tween spatial coordinates [5, 11, 17, 23], we will restrictattention to analyzing the 1D version of Eqn. 1; hencethe diffusion coefficient is D := σ .For V ( x ), two different functional forms will be con-sidered: (i) V ( x ) = 0 for | x | < L/ V ( x ) = ∞ for | x | ≥ L/ L, σ, R ) and (ii) V ( x ) = κx which we label as the Ornstein-Uhlenbeck(OU) process (also known as the Vasicek model [30]);parameters requiring estimation in this case are ( κ, σ, R ).Both potentials mentioned above have been considered inconfined membrane diffusion studies [5, 13, 17]. Kusumi et al. demonstrated how the MSD asymptotically ap-proaches L in the RBM model; extraction of L fromdata is still a common technique for quantifying confine-ment in SPT studies [5, 11, 13] (the 1D corral radiusis defined by (cid:113) L ). The MSD corresponding to an er-godic OU process observed with infinite time for δ timeunits between adjacent observations can (see Appendix)be shown to be σ κ (cid:0) − e − κδ (cid:1) .If the two models under consideration have identicaldiffusion coefficients, then setting κ = σ L is one wayto match asymptotic MSD parameters; in the confinedregime, this relation also allows one to map κ of theOU model onto the corresponding L parameter in theRBM model. The Appendix displays representativetrajectories and also compares the entire MSD for OUand RBM models driven by the same Brownian noiserealizations. In the measurement noise free case ( R = 0)with large samples, the RBM and OU processes areeasy to qualitatively and quantitatively distinguish.When measurement noise is present ( R >
Advantages Afforded by the OU Model
The discrete time analog of Eqn. 1 for the OU modelis: x i = F x i − + η i − ; η i − ∼ N (0 , Q ) (3) ψ i = x i + (cid:15) i ; (cid:15) i ∼ N (0 , R ) , (4)where F ≡ e − κ ∆ t and Q ≡ σ κ (1 − e − κ ∆ t ) [30]. Thisrelation allows one to readily use the Kalman filter esti-mation framework [28]. Maximum likelihood estimation(MLE) of the parameters completely characterizing thestationary OU process can be computed from the ob-servable measurements { ψ i } Ni =0 [25, 28, 36]. This per-mits efficient estimation in situations where sample sizesfor an MSD analysis are difficult to reliably extract andstatistically characterize (see Appendix Fig. 6). TheGaussian structure of the OU process also enables one toexploit a variety of other powerful tools that can be usedto analyze this type of stochastic process [28], includ-ing goodness-of-fit testing (checking model assumptionsagainst data directly [25–27]), exact rate of convergenceanalysis under stationary and non-stationary sampling[29], and bias correction. For example, Tang and Chen[30] demonstrate how to remove bias from MLEs com-puted using finite sample sizes in the case where the x i ’sare directly observed (i.,e., R = 0). In the stationarycase ( κ > E [ˆ κ ] = κ + (5)1 N ∆ t (cid:0)
52 + e κ ∆ t + 12 e κ ∆ t (cid:1) + O ( 1 N ) , where E [ˆ κ ] denotes the expectation of the MLE of κ (theMLE is denoted by ˆ κ ). The other terms quantify the ex-pected bias induced by finite N . ˆ κ is often the dominantsource of bias when the relation L = (cid:113) σ κ is used toextract the corral radius from OU parameter estimatesin the sampling regimes studied (e.g., results obtained byplugging in the corrections to σ reported in Ref. [30] didnot affect results).Before moving onto the case where R >
0, it is worthreviewing a classic first order autoregressive time seriesmodel [28, 29] where x i = F x i − + η i − where η i − ∼N (0 , Q ); the interest is in estimating F ( Q is consideredfrozen and to be nuisance parameter). For notationalsimplicity set Q = 1 and x = 0. In this case, for givensample of size N (also referred to as the “track length”)the standard likelihood equation is: p F ( x , x , . . . , x N ) = (2 π ) − N exp (cid:0) − N (cid:88) i =1 ( x i − F x i − ) (cid:1) (6) Taking the logarithm, expanding the quadratic terms,and setting the derivative of the expression above withrespect to F equal to zero provides the following estima-tor [29]: ˆ F = N (cid:80) i =1 x i x i − N (cid:80) i =1 x i − (7)In the presence of measurement noise, the above suggestsa naive suboptimal (denoted by a tilde) estimator:˜ F = N (cid:80) i =1 ( x i + (cid:15) i )( x i − + (cid:15) i − ) (cid:0) N (cid:80) i =1 ( x i − + (cid:15) i − ) (cid:1) − N ˜ R (8)where ˜ R is an independent estimate of the measurementnoise variance. Recall that the measurement noise is as-sumed i.i.d., so if ˜ R is asymptotically consistent and Q isfixed, ˜ F is asymptotically consistent since the cross-termsums involving (cid:15) and x tend to zero and become insignifi-cant relative to the other non-zero sums in the κ > R (this can alter-natively come from a prior, but this will likely introducebias which is hard to quantify). Furthermore, if one usesestimators ignoring confinement effects, new systematicbiases (on top of inherent finite sample bias associatedwith estimating κ ) can be introduced. This phenomenonis demonstrated by example in the Results.In the Kalman filter framework considered, the inno-vation likelihood (Appendix Eqn. 10) has an approx-imate autoregressive [28] form if the filter covariancereaches steady state quickly. If a stationary OU process isdeemed adequate to describe experimental observationsand the Kalman filter covariance sequences reaches itssteady state value rapidly, then analysis in Ref. [30] canbe applied to study the expected finite sample bias of ˆ κ .When one jointly estimates the MLE parameters associ-ated with Eqn. 1 by optimizing the innovation likelihood,one effectively returns to the situation in Eqn. 7 wherethe estimates of F can be extracted without knowledgeof the value of the constant noise parameters. In the Re-sults (Fig. 4), the convergence of matrices characterizingthe Kalman filter are demonstrated.In what follows, it is shown how plugging the MLE’s(obtained by maximizing Eqn. 10) into Eqn. 5 cansignificantly reduce bias from parameter estimates ob-tained with small N in situations of relevance to SPTtracking (the approach avoids specifying the “lag param-eter” plaguing MSD-based analyses [23]). The approachis demonstrated to accurately infer both κ and the ef-fective L (corral radius) if data is generated using eitherthe OU model (correct model specification) or the RBM(model misspecification). III. RESULTS
Figure 1 presents a histogram of the raw estimate ofthe corral radius obtained via the relation ˆ L = (cid:113) σ ˆ κ for the case where 1000 Monte Carlo simulations with L = 400 nm, D = 0 . µm / , R / = 50 nm , ∆ t = 25 ms and N = 100 observations of ψ were used to generatedata. From this data, the parameter estimates charac-terizing the model in Eqn. 1 were extracted. Corralradius parameters are inferred using the MLE and thebias corrected parameter estimates for two different datagenerating processes. In the top panel, the OU processgenerates data; in the bottom panel, the RBM processgenerates data (here there is model mismatch). The biasinduced by only observing 100 time series is effectivelyremoved in both cases. Appendix Fig. 6 displays rep-resentative trajectories of x , ψ , and the empirical MSDassociated with these trajectories. (a)(b) FIG. 1. (Color online) Raw MLE and bias corrected corral radiusestimate, ˆ L , obtained by extracting the parameters from time se-ries of length 100 sampled every 25 ms (this was repeated for 1000Monte Carlo trials; the histogram displays 1000 ˆ L ’s). In (a), theOU process (with known parameters) generates observations. In(b), the same Brownian motion paths used to generate OU trajec-tories are used to construct RBM paths. In both cases, a singlemeasurement noise random number stream was added to each tra-jectory. The use of the same underlying Brownian path and mea-surement noise sequence was used to reduce random variation andfacilitate quantifying systematic errors. TABLE I. Corral radius estimates with innovation MLEand bias corrected MLE (below referred to as “Classic In-nov.”,“Bias Cor.”, respectively). The columns labeled withˆ L contain the average parameter estimate obtained by ana-lyzing 200 Monte Carlo trajectories each containing 400 ob-servations spaced by ∆ t = 25 ms (the number in parenthesisreports standard deviation). The column labeled error re-ports the mean minus known true corral radius. In this table D = 0 . µm /s and R / = 25 nm .OU RBMEstimator ˆ L [ nm ] Error ˆ L [ nm ] Error L = 250 nm Classic Innov. 169.31 (16.07) -80.69 168.19 (12.56) -81.81Bias Cor. 241.40 (22.54 ) -8.60 240.24 (17.56 ) -9.76 L = 400 nm Classic Innov. 277.09 (18.38) -122.91 275.69 (10.63) -124.31Bias Cor. 399.09 (26.89 ) -0.91 398.21 (15.39 ) -1.79 L = 500 nm Classic Innov. 344.25 (25.75) -155.75 343.70 (14.83) -156.30Bias Cor. 500.35 (38.69 ) 0.35 501.73 (22.54 ) 1.73TABLE II. Same as Table I except D = 0 . µm /s .OU RBMEstimator ˆ L [ nm ] Error ˆ L [ nm ] Error L = 250 nm Classic Innov. 169.43 (19.17) -80.57 170.70 (12.24) -79.30Bias Cor. 253.54 (31.84) 3.54 258.49 (21.06) 8.49 L = 400 nm Classic Innov. 253.84 (46.88) -146.16 257.78 (34.79) -142.22Bias Cor. 418.74 (118.61) 18.74 437.48 (90.10) 37.48 L = 500 nm Classic Innov. 310.85 (61.59) -189.15 308.72 (57.53) -191.28Bias Cor. 582.40 (220.27) 82.40 603.12 (244.42) 103.12
Tables I-II present similar results, but vary the systemand sampling parameters. The parameters explored weremotivated by SPT studies. Even for N = 400, substantialbias exists in the asymptotically efficient MLE. Bias re-duction comes at the cost of variation as can be observedby the reported standard deviations. However, using theOU model structure allows one to use a wealth of quan-titative tools for understanding experimental data anal-ysis. The main results have now been presented, whatfollows expands on technical details and on the domainof applicability of the bias removal approach.Figure 2 presents the distribution of the estimated κ for a variety of estimators. The focus is on κ since this isoften the major source of variation in L estimated fromshort time series [30]. The top panel displays three esti-mators; (i) the raw MLE associated with the OU processwhere R is assumed zero [30]; (ii) the MLE obtained byjointly extracting the κ, σ , and R that minimize the in-novation likelihood [36] (see Eqn. 10) and; (iii) usingthe bias correction of Tang and Chen [30] applied to theoutput of (ii). The average of the ˆ κ distributions for thethree cases are 24.3, 18.2, and 16.0 s , respectively (thetrue value is 15 s ). The difference may seem small, butrecall that the estimated corral radius depends nonlin-early on ˆ κ (hence the amplified difference in ˆ L ).The bottom panel in Fig. 2 uses “other” suboptimalestimators of κ . In one case R is assumed to be knownaccurately a priori ; here ˜ R / = 0 . R / = 40 nm wasused along with Eqn. 8 to estimate κ . Since accurate apriori knowledge of R can be a questionable assumptionin SPT studies, we also show results of applying Eqn. 8in conjunction with the estimator reported in Ref. [22]to extract ˜ R from the data; here R is biased because theestimator in Ref. [22] was designed for the case where noforces or confinement constraints affect particle dynamics(the average of the estimates of R assuming the model inRef. [22] was 72.4 nm for this data set; note that Refs.[22, 23] warn that the estimator is not valid if constraintforces are present).The average MLE (without bias correction) for thediffusion and measurement noise was (0.23 µm /s , 41.9 nm ), that using the estimator from Ref. [22] was (0.07 µm /s , 71.3 nm ), and the true value for the OU data gen-erating process was (0.20 µm /s , 50.0 nm ). Note thatthe arguments appealed to in this paper to explain thevalidity of the bias correction of Ref. [30] in conjunctionwith the Kalman filter’s innovation likelihood [26, 28](relevant expressions shown in Eqn. 10) are not directlyapplicable to the bias correction of the other parametersreported in [30] when R >
0. Analysis of the bias andvariance of parameters σ and R are more involved due toiterations introduced by the Kalman filter’s update andforecast steps; this analysis is beyond the scope of thiswork.Application of various estimators of OU parameters todata generated by both the OU and RBM (a misspec-ified model) processes, was carried out for two reasons:(i) to emphasize that certain estimators can induce sub-tle systematic biases and (ii) to stress that likelihood-based inference permits other analysis tools beyond esti-mation. Bias correction is possible in addition to othertechniques. For example, detecting confinement from ob-servations using visual inspection of the short trajectoriesis problematic (Fig. 6 shows how even in the R = 0 case,distinguishing RBM from the OU process is difficult with N = 100). However, goodness-of-fit testing can be em-ployed [27, 38]. Applying the technique of Hong and Li[38] (more specifically computing the M (1 ,
1) test statis-tic) allows one to reject ≈
20% of the trajectories assum-ing the so-called directed diffusion model (i.e., constantdiffusion, measurement noise and velocity [11, 24], but κ = 0) even with N = 100. There is overwhelming sta-tistical evidence for larger N cases (the average p − valueobtained assuming the directed diffusion plus measure- ment noise model was < × − for all N = 400 casesconsidered). This demonstrates that the test has powerto detect kinetic signatures of confinement in the pres-ence of diffusive plus measurement noise in regimes ofinterest to SPT studies (if the model was not rejectedone can entertain using models involving fewer parame-ters e.g., see Refs. [22, 23]). The case where we assumedan OU model, but an RBM model actually generated thedata (model misspecification), was statistically indistin-guishable using tests in Ref. [38] from the case where theOU model generated data. Since there is no evidence inthe raw observational data favoring one model over theother, and both models produce similar estimates of thequantity of interest (the corral radius), it is attractive touse the OU modeling viewpoint since a substantial bodyof literature exists for analyzing data generated by thistype of stochastic process [21, 26, 28–32].Figure 3 provides another example of analysis toolsthat are made available from likelihood-based analyses.Here κ is plotted against the truncated bias expansionsin Eqn. 5 (taken from Tang and Chen [30]) for a fixed ∆ t and two sample sizes N . Note how as κ decreases, thefraction of bias increases rapidly. Also note, that as κ decreases, there is a higher likelihood of an MLE param-eter estimate being near or less than zero (even for a trulystationary process where the underlying data has κ > κ suggest weak “corralling” since L is in-versely related to κ . The inverse dependence also causesthe inflated standard deviation for L = 500 nm since asmall fraction of estimated ˆ κ are near zero (also note thatthe median ˆ L ’s corresponding to the L = 500 nm row ofTab. II were 540.3 and 582.4; this suggest that these esti-mates in the tail of the estimated parameter distributionsubstantially influenced the observed mean). Althoughone can remove expected bias, if the fraction of bias islarge relative to the signal then other factors can compli-cate bias correction. For example, (i) higher order termsin the expected bias expansion can become more impor-tant; (ii) inherent parameter uncertainty in the point es-timate substantially affects the expected bias. Therefore,plots like Fig. 3 allow researchers to quantitatively deter-mine when other factors influencing the bias correctionscheme need to be considered. Conditions Required for Bias Removal
The Kalman filter’s constant noise assumption (i.e.,the covariance of the innovation sequence, S i , in Eqn.10) needs to be tested in order for the analysis of Tangand Chen [30] to be accurate for the expected bias in ˆ κ .Figure 4 illustrates that this is indeed the case for the pa-rameter regimes under study. Note that we intentionallyignored the estimation of the mean of the OU process(the mean was set to zero), this simplifies analyzing theeffect of κ on the autoregressive parameter ( F = e − κ ∆ t )under the assumption of a constant innovation covari-ance. The mean zero OU process does not restrict utility(with extra effort one can analyze the joint mean and κ estimates and in practice one can simply demean the ψ series using the empirically average of { ψ i } Ni =0 ). (a)(b) FIG. 2. (Color online) Distribution of ˆ κ obtained when the OUprocess with measurement noise generated observations (resultscorrespond to L histogram shown in Fig. 1). Fixed true valueof κ denoted by vertical dashed line. Two different classes of esti-mators were used: (a) MLE-based estimators that use a likelihoodwith the correct model and (b) “Other” estimators use an assumedprior input for ˜ R (the true R / is 50 nm, but ˜ R / is set to 40nm since knowledge of the precise effective measurement noise isdifficult to accurately quantify [22] in SPT applications) and a sub-optimal estimator in Eqn. 8 (for ˜ R we plug-in the noise estimateobtained using code associated with Ref. [23]). In both cases,the Innovation MLE without bias correction (labeled as “Innov.”)serves as the reference histogram. Beyond testing constancy of the covariance of the in-novation, one should verify that the bias in the estimated R and σ is small in relation to the bias in the estimated κ . This can be achieved via simulation if need be. Ifbias in R and σ is determined to be significant, newanalytical or numerical bias removal schemes should beconsidered. Bootstrap techniques can be leveraged toquantify variance and bias in more complex SDE models[42] (e.g., bootstrap techniques can check if the samplesize is deemed too small for using a particular estimator).When ∆ t is decreased with R fixed, measurementnoise often becomes a more dominant part of the single-molecule signal and the bias in R is relatively small[25, 31]. In the OU model, the influence of κ on Q is O (∆ t ) since a Taylor expansion in ∆ t shows Q = σ κ (cid:0) κ ∆ t + O (∆ t ) (cid:1) . In the small ∆ t limit, one can lever-age existing nonparametric tools for estimation and infer- (cid:103) [1/s] B i a s (cid:103) N=100 Bias ( (cid:54) t = 25ms)N=400 Bias ( (cid:54) t = 25ms)
FIG. 3. (Color online) Bias vs. κ for two different sample sizes N using Eqn. 5 (∆ t = 25 ms and measurement noise magnitude25 nm ). ence [32]. However, the parameter regime explored hereis one in which κ ’s influence is not small relative to ∆ t (otherwise, the approach in Ref. [23] would predict moreaccurate R estimates). In this study, it was empiricallydemonstrated that the bias connected to the innovationMLE of R is small relative to that of κ in several pa-rameter regimes of relevance to SPT modeling (no biascorrection was applied to ˆ R ).Note that the bias correction scheme presented also de-pends heavily on the stationarity assumption of both thestate and on the innovation sequence. If stationarity of x is questionable, computing a corral radius should be re-considered. More formally, unit root tests [28] (adjustedto account for measurement noise) can be used to checkthe Brownian motion vs. stationary OU models. To moregenerally test stationarity of the mean or covariance ofthe observed measurements, other testing procedures canalso be considered, e.g. [43]. If the (bias corrected) es-timate along with the associated parameter uncertaintysuggest ˆ κ is near zero, the suitability of a confined diffu-sion model needs to be carefully reevaluated.Finally, if all conditions mentioned in this subsectionare met and the Kalman filter corresponding to the OUprocess is an adequate model of the observations (an as-sumption tested here with time series hypothesis testing[38]), then the state and innovation noise residuals havemean zero (these residuals make up stationary processunder the conditions above). The filter and measure-ment noise sources make the innovation sequence differ-ent than classic order one autoregressive process, but theadditional noise terms do not substantially influence thefirst order expansion of the expected bias of ˆ κ . The ef-fects of the additional noise terms are lowest when thescalar gain, K i (see Eqn. 10), is close to one (a standardautoregressive process generates the data when K i = 1).In small time series sample sizes, even when K i < ,the bias correction can be shown to be accurate since theeffects of parameter uncertainty tend to dominate the ad-ditional noise associated with the filtered state estimates.Furthermore, the MLE parameters are found by jointlyoptimizing the Eqn. 10 given data, but when the innova-tion covariance quickly reaches steady state, the analysisof Tang and Chen [30] is relevant to understanding theexpected bias in ˆ κ . FIG. 4. (Color online) Innovation covariance (see Eqn. 10) con-vergence rates. Plots were obtained by plugging in exact data gen-erating parameters studied in Tables I-II. The plot illustrates thatthe filter quickly reaches steady state relative to the sample size N . IV. CONCLUSIONS
Simulations were used to demonstrate that kinetic con-finement parameters could be accurately extracted fromrelatively short time series (100 ≤ N ≤ ad hoc sampling pa-rameters such as a “time-lag” cut-off (this is a commonproblem in MSD-based analysis [23]; MSD-based analy-ses are still quite popular in the SPT community).The ability to accurately extract kinetic parametersand correct for biases induced by small time series samplesizes (while also accounting for measurement and thermalnoise in a statistically rigorous fashion [22, 26]) showsgreat promise studies where the underlying molecule ex-periences random forces whose distribution changes inboth time and space due to complex interactions in ahighly heterogeneous environment. For example, if onecan both reliably determine when molecules leaves a“picket fence” [3] in the plasma membrane via changepoint detection algorithms [44] and can track trajecto-ries with high temporal resolution (perhaps at the costof spatial accuracy), one can utilize the tools presentedhere to accurately map out both the diffusion coefficientand the corral radii explored by molecules in the plasmamembrane or in the cytoplasm [24]. This presents an at-tractive physically interpretable modeling alternative tosub-diffusion or continuous time random walk type mod-els, but such a study is left to future work. The methodintroduced was shown to be useful in parameter regimescommonly encountered in fluorescence-based SPT exper-imental studies, but the approach is general and can beused to probe other length and time scales. V. ACKNOWLEDGEMENTS
The author would like to thank Randy Paffenroth(Numerica Corp.) for comments on an earlier draft.
VI. APPENDIXA. MSD of the Stationary ( κ > ) OU Process The MSD associated with δ time units between obser-vations is defined by N (cid:104) N (cid:80) t =1 ( x t + δ − x t ) (cid:105) ; in the previousexpression (cid:104)·(cid:105) denotes ensemble averaging [9, 11]. Let M ( δ ) denote the MSD multiplied by N at a given lag δ ;then plugging in the solution to the mean zero station-ary OU process (variance = σ κ [37]) and exploiting otherstandard properties of SDEs driven by Brownian motion[45] yields: M ( δ ) = (cid:104) N (cid:88) t =1 (cid:0) x t e − κδ + σ δ (cid:90) e − κ ( δ − s ) dW s (cid:1) − x t ) (cid:105) = (cid:104) (cid:88) x t e − κδ + x t + σ κ (1 − e − κδ ) − x t e − κδ (cid:105) = (cid:104) (cid:88) x t (cid:0) e − κδ − e − κδ (cid:1) + σ κ (1 − e − κδ ) (cid:105) = (cid:88) (cid:104) x t (cid:105) (cid:0) e − κδ − e − κδ (cid:1) + σ κ (1 − e − κδ )= (cid:88) σ κ (cid:0) e − κδ − e − κδ (cid:1) + σ κ (1 − e − κδ )= N (cid:88) t =1 σ κ (cid:0) − e − κδ (cid:1) (9) (a) x [ µ m ] Time [ s ] (b) ψ [ µ m ] Time [ s ] (c) FIG. 5. (Color online) Long time sample (hence large N ) trajec-tory for OU and RBM process without (a) and with (b) measure-ment noise. Panel (c) shows that for large N , the empirical MSDcurve matches the theoretical (i.e., infinite sample limit) MSD limit.Here L = 400 nm, D = 0 . µm / , R = 50 nm , and ∆ t = 25 ms . To account for i.i.d. Gaussian measurement noise (i.e.,one carries out an MSD on ψ ) in the above expression,simply add 2 × R × N to the MSD expression above [31]. B. Representative Trajectories and MSDs
In this section, the reflected Brownian motion and thecorresponding OU process (found using Eqn. 5) are plot-ted with and without measurement noise. The MSDs ofthe measurement noise free and measurement noise caseare shown for both large and small sample sizes. (a) x [ µ m ] Time [ s ] (b) ψ [ µ m ] Time [ s ] (c) FIG. 6. (Color online) Same as Fig. 5, except the sample sizewas reduced to N = 100 observations. The small sample size com-plicates reliably using an MSD-based analysis. With short tracklengths (small N ), well-known issues associated with selecting thelag truncation to use in computations, statistical dependence com-monly introduced when computing MSDs, etc. [23] are even morepronounced. The likelihood-based bias correction scheme intro-duced is able to reliably extract system parameters even with thesesmall samples sizes. C. MLE of the Innovation Sequence
In the main text, mappings between the OU parame-ters and those of the classic Kalman filter [28] were pre-sented. Here the equations defining the innovation MLEand the associated likelihood [25, 28] relevant to the sce-nario studied are presented (the reader is referred to Ref.[28] for full details). Note that the “observation matrix” H is the identity matrix and that ˆ ψ i | i − ≡ H ˆ x i | i − (=ˆ x i | i − in the case considered). (10)( ˆ R, ˆ F , ˆ Q ) = argmax L ( R, F, Q ) ≡ p ( ψ , ψ , . . . , ψ N ; R, F, Q ) p ( ψ , ψ , . . . , ψ N ; R, F, Q ) = N (cid:89) i =1 √ πS i exp (cid:0) − ( ψ i − F ˆ ψ i − | i − ) S i (cid:1) ˆ ψ i | i = ˆ ψ i | i − + K i ( ψ i − ˆ ψ i | i − ) K i = P i | i − P i | i − + RS i = P i | i − + RP i | i − = F P i − | i − F + QP i | i = P i | i − − P i | i − P i | i − + R For the stationary OU process, the recursion above(processing the observation sequence) was started usingˆ x | = 0 and P | = σ κ . The Nelder-Mead algorithmwas used to find the parameter optimizing Eqn. 10. Goodness-of-fit testing [27, 38] was used to both checkthe consistency of model assumptions against data andto ensure that a local minimum was not encountered inthe optimization. [1] Schlessinger, J., Elszon, E. L., Webb, W. W., Yahara,I., Rutishauser, U., and Edelman, G. M. Proceedings ofthe National Academy of Sciences of the United States ofAmerica (3), 1110–4 March (1977).[2] Sako, Y. The Journal of Cell Biology (6), 1251–1264June (1994).[3] Kusumi, A., Nakada, C., Ritchie, K., Murase, K., Suzuki,K., Murakoshi, H., Kasai, R. S., Kondo, J., and Fujiwara,T.
Annual review of biophysics and biomolecular struc-ture , 351–78 January (2005).[4] Golding, I. and Cox, E. Physical Review Letters (9),14–17 March (2006).[5] Destainville, N. and Salom´e, L. Biophysical journal (2), L17–9 January (2006).[6] Rohatgi, R., Milenkovic, L., and Scott, M. P. Science(New York, N.Y.) (5836), 372–6 July (2007).[7] Saxton, M. J.
Biophysical journal (4), 1178–91 Febru-ary (2007).[8] Masson, J., Casanova, D., Turkcan, S., Voisinne, G.,Popoff, M., Vergassola, M., and Alexandrou, A. Phys-ical review letters (4), 48103 (2009).[9] Magdziarz, M. and Klafter, J.
Physical Review E (1),1–7 July (2010).[10] Nachury, M. V., Seeley, E. S., and Jin, H. Annual reviewof cell and developmental biology , 59–87 November(2010).[11] Park, H. Y., Buxbaum, A. R., and Singer, R. H. Methodsin enzymology (chapter 18) (10), 387–406 (2010).[12] Weigel, A. V., Simon, B., Tamkun, M. M., and Krapf, D.
Proceedings of the National Academy of Sciences of theUnited States of America (16), 6438–43 April (2011).[13] T¨urkcan, S., Alexandrou, A., and Masson, J.-B.
Bio-physical journal (10), 2288–98 May (2012).[14] Kim, S. Y., Gitai, Z., Kinkhabwala, A., Shapiro, L., andMoerner, W. E.
Proceedings of the National Academy ofSciences of the United States of America (29), 10929–34 July (2006).[15] Manley, S., Gillette, J. M., Patterson, G. H., Shroff, H.,Hess, H. F., Betzig, E., and Lippincott-Schwartz, J.
Na-ture methods (2), 155–7 February (2008).[16] Qian, H., Sheetz, M. P., and Elson, E. L. BiophysicalJournal (4), 910–21 October (1991).[17] Kusumi, A., Sako, Y., and Yamamoto, M. Biophysicaljournal (5), 2021–40 November (1993).[18] van der Vaart, A. Asymptotic Statistics . Cambridge Uni-versity Press, (1998).[19] Ober, R. J., Ram, S., and Ward, E. S.
Biophysical J. (2), 1185–200 February (2004).[20] Montiel, D., Cang, H., and Yang, H. Journal of PhysicalChemistry B (40), 19763–70 October (2006).[21] Calderon, C. P.
Multiscale Model. Simul. , 656–687(2007). [22] Berglund, A. J. Physical Review. E (1), 011917 July(2010).[23] Michalet, X. and Berglund, A. Physical Review E (6),061916 June (2012).[24] Thompson, M. A., Casolari, J. M., Badieirostami, M.,Brown, P. O., and Moerner, W. E. Proceedings of theNational Academy of Sciences of the United States ofAmerica (42), 17864–71 October (2010).[25] Calderon, C. P., Chen, W., Harris, N., Lin, K., andKiang, C.
J. Phys.: Condens. Matter , 034114 (2009).[26] Calderon, C. P., Harris, N., Kiang, C., and Cox, D. J.Phys. Chem. B , 138 (2009).[27] Calderon, C. P.
J Phys Chem B , 3242–3253 (2010).[28] Hamilton, J.
Time Series Analysis . Princeton UniversityPress, Princeton, NJ, (1994).[29] Shiriaev, A. and Spokoiny, Y.
Statistical Experiments andDecisions: Asymptotic Theory . World Scientific Publish-ing Company, Singapore, (1999).[30] Tang, C. Y. and Chen, S. X.
Journal of Econometrics (1), 65–81 April (2009).[31] Zhang, L., Mykland, P. A., and Ait-Sahalia, Y.
Journalof the American Statistical Association , 1394–1411(2005).[32] A¨ıt-Sahalia, Y., Fan, J., and Xiu, D.
Journal of theAmerican Statistical Association (492), 1504–1517December (2010).[33] Calderon, C. P.
J. Chem. Phys. , 084106 (2007).[34] Calderon, C. P. and Arora, K.
J. Chem. Theory Comput. , 47 (2009).[35] Calderon, C. P., Martinez, J., Carroll, R., and Sorensen,D. Multiscale Model. Simul. , 1562–1580 (2010).[36] Calderon, C. P., Harris, N., Kiang, C., and Cox, D. J.Mol. Recognit. , 356 (2009).[37] Risken, H. The Fokker-Planck Equation . Springer-Verlag,(1996).[38] Hong, Y. and Li, H.
Rev. Fin. Studies , 37–84 (2005).[39] Lubelski, A., Sokolov, I. M., and Klafter, J. PhysicalReview Letters (25), 250602 (2008).[40] Yokoyama, R.
Probability Theory and Related Fields ,45–57 (1980).[41] Billingsley, P. Convergence of probability measures . Wi-ley, (1968).[42] Davison, A. and Hinkley, D.
Bootstrap Methods andTheir Application . Cambridge Series in Statistical andProbabilistic Mathematics. Cambridge University Press,(1997).[43] Koutris, A., Heracleous, M. S., and Spanos, A.
Econo-metric Reviews (4-6), 363–384 May (2008).[44] Poor, H. V. and Hadjiliadis, O. Quickest Detection . Cam-bridge University Press, (2008).[45] Kloeden, P. and Platen, E.