Heterogeneous Idealization of Ion Channel Recordings -- Open Channel Noise
11 Heterogeneous Idealization of Ion ChannelRecordings - Open Channel Noise
Florian Pein, Annika Bartsch, Claudia Steinem and Axel Munk
Abstract
We propose a new model-free segmentation method for idealizing ion channel recordings. This method is designedto deal with heterogeneity of measurement errors. This in particular applies to open channel noise which, in general,is particularly difficult to cope with for model-free approaches. Our methodology is able to deal with lowpass filtereddata which provides a further computational challenge. To this end we propose a multiresolution testing approach,combined with local deconvolution to resolve the lowpass filter. Simulations and statistical theory confirm that theproposed idealization recovers the underlying signal very accurately at presence of heterogeneous noise, even whenevents are shorter than the filter length. The method is compared to existing approaches in computer experiments andon real data. We find that it is the only one which allows to identify openings of the PorB porine at two differenttemporal scales. An implementation is available as an R package. Index Terms
Deconvolution, dynamic programming, flickering, heterogeneous noise, m -dependency, model-free, non-stationarynoise, peak detection, planar patch clamp, PorB, robustness, statistical multiresolution criterion I. I
NTRODUCTION
The voltage patch clamp technique is a major tool to quantify the electrophysiological dynamics of ion channels inthe cell membrane (Neher and Sakmann, 1976; Sakmann and Neher, 1995). It allows to record the conductance trace
Florian Pein is with the Statistical Laboratory of the Department of Pure Mathematics and Mathematical Statistics (DPMMS) at the Universityof Cambridge, Wilberforce Road, Cambridge, CB3 0WB, United Kingdom.Annika Bartsch and Claudia Steinem are with the Institute of Organic and Biomolecular Chemistry, Georg-August University of Goettingen,Tammannstr. 2, 37077 G¨ottingen, Germany.Axel Munk is with the Institute for Mathematical Stochastics, Georg-August-University of Goettingen, Goldschmidtstr. 7, 37077 G¨ottingen,Germany. Axel Munk is also with the Max Planck Institute for Biophysical Chemistry, Am Fassberg 11, 37077 G¨ottingen, Germany, and withthe Felix Bernstein Institute for Mathematical Statistics in the Biosciences, Goldschmidtstr. 7, 37077 G¨ottingen, Germany.Support of DFG (CRC803, projects Z02 and A01, Cluster of excellence 2067 MBExC Multiscale Bioimaging: From Molecular Machinesto Networks of Excitable Cells), the Volkswagen Foundation (FBMS) and of EPSRC (EP/N031938/1 - Statscale programme) is gratefullyacknowledged. We thank Timo Aspelmeier, Manuel Diehn, Benjamin Eltzner, Ingo P. Mey, Ole M. Sch¨utte, Ivo Siekmann, Inder Tecuapetla-G´omez and Frank Werner for fruitful discussions. a r X i v : . [ s t a t . M E ] A ug (i.e., the recorded current trace divided by the applied voltage) of a single ion channel in time, which is for instanceimportant in medical research for the development of new drugs (Kass, 2005; Overington et al., 2006). Importantchannel characteristics such as amplitudes and dwell times can be obtained provided the conductance changes ofthe traces are idealized (underlying signal is reconstructed) from these recordings (data points) (Colquhoun, 1987;Sakmann and Neher, 1995; Hotz et al., 2013; Pein et al., 2018). To obtain such an idealization an extensive amountof methodology is available nowadays, a selective review is given below. Open channel noise:
In this paper, we focus on recordings that are affected by open channel noise , i.e.,have larger noise on segments with a larger conductance. The name open channel noise refers to the fact that alarger conductance results from an open pore. The additional noise when the channel is open can for instance beexplained by current interruptions lasting approximately 1 microsecond (Sigworth, 1985, 1986; Sigworth et al.,1987; Heinemann and Sigworth, 1988, 1990, 1991). We analyze these recordings in a ’model-free’ manner, i.e.,without assuming a hidden Markov or related model, as a time series which is obtained by equidistant samplingfrom the convolution of a piecewise constant signal contaminated by white noise with the kernel of a lowpassfilter. The white noise is scaled by an unknown piecewise constant standard deviation function to allow varianceheterogeneity caused by open channel noise. We stress, that this modeling is rather general and also allows to dealwith heterogeneity of measurement errors, not necessarily due to open channel noise. It will be explained in fulldetail in Section II.
Data: the outer membrane porin PorB:
Figure 1 shows exemplarily a conductance trace of the outer membraneporin PorB from
Neisseria meningitidis , a pathogenic bacterium in the human nose and throat region (Virji, 2009).PorB is a trimeric porin and the second most abundant protein in the outer membrane of
Neisseria meningitidis .It is for instance relevant for the transport of antibiotics into the cell and hence of current interest to understandantibiotic resistance better. Recordings are obtained by the patch clamp technique using solvent-free bilayers. InFigure 1 it is clearly visible that the two conductance levels around .
04 nS and .
36 nS are affected from openchannel noise as the variance of observations around .
36 nS is much larger than of the ones around .
04 nS . Suchrecordings were manually analyzed in (Bartsch et al., 2019) (Figure 1 and its explanation). They have been a majormotivation for our work as they show distinct heterogeneous noise but also short event times, which we could nottackle satisfactorily by existing idealization methods, but also not by a manual analysis. In fact, in (Bartsch et al.,2019) only the conductance levels were investigated but not the full gating dynamics, since events on short timescales could not be idealized.
Methods for open channel noise:
Idealization methodology can be divided into so called model-free methods(Colquhoun, 1987; VanDongen, 1996; Hotz et al., 2013; Gnanasambandam et al., 2017; Pein et al., 2018) whichdo not rely on a specific model for the gating dynamics, to methodology based on hidden Markov models (HMM)(Ball and Rice, 1992; Venkataramanan et al., 2000; Qin et al., 2000; de Gunst et al., 2001; Siekmann et al., 2011;
Fig. 1: From seconds to microseconds: Ion channel recordings (grey points) displayed at a level of seconds (toppanel), of milliseconds (middle panel) and of microseconds (bottom panels). Data points result from a representativeconductance recording of PorB by the patch clamp technique using solvent-free bilayers at
20 mV .Diehn et al., 2019) and to current distribution fitting (Yellen, 1984; Heinemann and Sigworth, 1991; Schroeder,2015; Hartel et al., 2019). The latter often assume a hidden Markov model as well but focus on parameter estimationdirectly. An idealization can be obtained by the Viterbi algorithmus (Viterbi, 1967) as soon as the parameters aredetermined.Most HMM methods can deal with heterogeneous noise. Moreover, they allow to extrapolate information from larger (observable) to smaller (not observable) time scales and hence can provide a good idealization on small temporalscales. However, they rely heavily on the correctness of the assumed model assumptions. Up to few exceptions,see (Fuli´nski et al., 1998; Goychuk et al., 2005; Mercik and Weron, 2001; Shelley et al., 2010), a Markov modelis a reasonable assumption for the underlying ion channel dynamics. However, artifacts in the data observed, forinstance base line fluctuations, occur frequently in ion channel recordings and require elaborate data cleaning beforea HMM can be fitted. Base line fluctuations are for instance caused by small defects in the membrane, which isunavoidable in the recordings. There might be also periodic oscillations, resulting from the electronics or frombuilding vibrations (although damped). The PorB measurements display in Figure 1 show several artifacts of thistype (see for instance the waviness of the observations or the conductance increase around . , which severelyhinders straightforward fitting by a HMM: We tried to fit this data set with in total four different hidden Markovmodel approaches. We achieved the best results when we assumed three states, but with the assumption that twostates (with small conductivity) share the same expectation and variance. More details, also on parameter choices,are given in Section X in the supplement. The obtained idealization is shown in Figure 2. It fits long events well,but misses very short events, see for instance the lower left panel. Fitting such events well requires to take intoaccount the filtering which is computationally very demanding for HMMs. We will discuss such an approach inSection X in the supplement as well. In summary, in addition, to the low robustness against artifacts, the choice ofa specific Markov model, especially the determination of the number of states, can be a demanding task and ofteninvolves subjective choices by the analyst.Contrary, model-free approaches can deal way more flexible with artifacts as they act rather locally on the timeseries without the assumption of a global model. Hence they are more robust than HMMs to model violations.Therefore, they complement HMMs well, e.g., as a preprocessing step. For example, model-free methods can beused to select or verify a specific Markov model, in particular to determine the number of states and possibletransitions, as they explore and potentially remove artifacts in a model-free manner. See also (Pein et al., 2018) fora more extensive discussion of further aspects of the different approaches.To the best of our knowledge, all existing model-free approaches assume (implicitly or explicitly) homogeneousnoise and hence produce unreliable results when open channel noise is present. Among the first methods which fallinto this category is TRANSIT (VanDongen, 1996). An idealization by this approach, details of its limitations inour setting and further discussions can be found in Section X in the supplement. In Figure 3 we display
JULES (Pein et al., 2018), a novel multiscale approach that also falls into this type of methods. It detects many small eventson segments with a larger conductance and variance, but none on ones with a smaller conductance and variance.These additional events are most likely artifacts caused by open channel noise. Indeed, in Section IV-D we foundthat the rates of a simulated hidden Markov model with parameters similar to them underlying the observations inFigure 1 could not be recovered when we used
JULES to idealize the underlying signal. This effect is even more
Fig. 2: Idealization (red) of the data in Figure 1 assuming a HMM displayed on three different temporal scales.The model consists of three states, whereby two states are assumed to have the same expectation and variance. Inthe lower panels we also show the convolution of the idealization with the lowpass filter (blue). It fits most part ofthe data well, but misses short events, for instance the event displayed in lower left panel.severe when the variance heterogeneity is larger.Recently, there has been made some progress to adjust for heterogeneous noise in the context of model-freemethods. However, they are not dedicated to idealize ion channel recordings, which means in particular that they donot incorporate lowpass filtering. Obviously, ignoring the filtering will deteriorate results. For illustration purposes
Fig. 3: Idealization (red) of the data in Figure 1 by
JULES displayed on three different temporal scales. In thelower panels we also show the convolution of the idealization with the lowpass filter (blue). It detects short events,but finds many small events (which are most likely false positives) at parts of high conductance and high variance(see for instance the idealization of the observations around .
36 nS in the middle panel). These detections hinderthe decovolution (see for instance the lower left panel) and make the idealization unreliable.we display the heterogeneous multiscale approach
HSMUCE (Pein et al., 2017) in Figure 4. We found that itprovides reasonable results on larger temporal scales. However, due to filtering
HSMUCE misses shorter events,see for instance the missed peaks around . (lower left panel), . or . . Also for this type of methods we provide further examples and discussions in Section X in the supplement. Fig. 4: Idealization (red) of the data in Figure 1 by
HSMUCE displayed on three different temporal scales. In thelower panels we also show the convolution of the idealization with the lowpass filter (blue). It detects events onlarger temporal scales well (middle panel, lower right panel), but misses short events, see for instance at around . (lower left panel), . or . .The occurrence of short events is often called flickering . Missing them does not only potentially disturb theanalysis of the general channel behavior, the analysis and hence the idealization of flickering events is also ofits own interest in many applications, since flickering has often its own dynamics and can result from different molecular processes. Typical examples are conformational changes of the ion channel (Grosse et al., 2014) or thepassage of larger molecules blocking the ions pathway through the channel (Raj Singh et al., 2012; Bartsch et al.,2019). Hence, one main goal of this paper will be to idealize and detect such events as well.To this end, we introduce in Section II a statistical model which resembles all features (open channel noise, eventson a large range of scales, filtering) of such complex data as in the previous example. In summary, we then askfor a model-free idealization method that adapts automatically to heterogeneous noise, hence is able to detect andidealize events on a large range of relevant scales accurately, but in particular also events shorter than the filterlength. Furthermore, we aim to provide theoretical justification for the detected events (controlling false positives)and for a computationally efficient method to deal with large data sets. HILDE:
To address these tasks, we propose in this paper a new method called H eterogeneous I dealizationby L ocal testing and DE convolution, HILDE . This method combines multiscale regression and deconvolution asit takes into account the convolution of the signal with the lowpass filter explicitly for detecting events that areshort in time. Before we will explain our (quite involved) methodology further, we discuss firstly the generalchallenges: A major difficulty for any such method due to the presence of heterogeneous noise is to distinguishbetween small jumps in the signal and random fluctuations caused by the noise of unknown level. However,simultaneously estimating the signal and the noise level locally is notoriously difficult in general (Pein et al.,2017) and further hampered in our situation since the unknown signal and noise are both smoothed by the filterand hence deconvolution is required when shorter temporal scales are considered. We solve this by means of amultiresolution approach in combination with a local deconvolution to idealize events on all relevant temporalscales accurately. Whereas statistical multiresolution idealization that ignores the deconvolution can be computedefficiently by dynamic programming, see for instance (Hotz et al., 2013; Frick et al., 2014; Pein et al., 2017),combining multiresolution procedures with deconvolution is algorithmically difficult, since due to the coupling ofall observations in the idealization, dynamic programming is not applicable without further ado. We will overcomethis burden by focusing firstly on larger temporal scales and then improving the idealization on smaller temporalscales. More precisely,
HILDE consists of the following three steps: a) detection of long events, b) detection ofshort events and c) parameter estimation by deconvolution. A summary about all three steps is given in Algorithm1 (see Section III). a) Detection of long events:
We will obtain an idealization by multiresolution regression that covers allimportant features on larger temporal scales (for the data set analyzed here large means events of length . atleast, i.e., ≥ sampling points). This step is discussed in Section III-A and technical details are given in SectionVII in the supplement. b) Detection of short events: Our data set contains several short events that will be missed by the previousstep, see for instance in Figure 1 at around . (lower left panel), . , . and . . To detect such events, we test locally whether additional events on smaller temporal scales have to be incorporated. This is impaired bythe lowpass filter and the resulting convolution has to be taken into account explicitly. To this end, we assume thatsignal and noise left and right of the interval on which we test are given by the idealization from the previousmultiresolution step. These tests are detailed in Section III-B, while technical details are postponed to Section VIIIin the supplement.Steps a) and b) determine the number of events and their rough locations. The final idealization in Figure 5 (seee.g. the lower left panel) confirms that step b) is indeed able to detect short events (up to . , corresponding toonly two subsequent observations). c) Parameter estimation by deconvolution: Finally, the precise locations of the events and the conductancelevels have to be obtained. This will done in an additional deconvolution step, as the recordings are filtered. Tothis end, we use the local deconvolution approach from Pein et al. (2018) with minor modifications. This step isdiscussed in Section III-C and technical details are explained in Section IX in the supplement.Figure 5 shows the final idealization by
HILDE of the observations in Figure 1. Despite distinct heterogeneousnoise, the idealization covers all main features on all relevant scales, in particular also short events, while at the sametime it does not include systematically additional artificial changes. The zooms into single peaks (lower panels)show that
HILDE fits the observations well down to a scale of microseconds, which is also a confirmation of ourapproach, including the modeling. We stress that
HILDE is not only robust against heterogeneous noise but hastypically also a larger detection power for event detection than
JULES (even when the noise is homogeneous),since it takes into account the convolution explicitly for detection. This is discussed in more detail in Section VI-B,where we also outline a version of
HILDE that assumes homogeneous noise to improve detection power evenfurther if the homogeneous noise assumption is justified.While the first and the third step are mostly useful modifications of existing methodologies, we want to stressthat this is not true for the second step. To the best of our knowledge, no other model-free ion channel idealizationmethod is able to take the convolution explicitly into account when detecting events. As discussed before, this ishowever indispensable to detect short events when filtering and heterogeneous noise are present.
Implementation and run time:
Each step of
HILDE can be computed separately. Hence,
HILDE can be appliedand modified in modular fashion. This allows for instance to skip the second step if a data set contains only longerevents, hence saving computation time. Another usage might be to modify the local tests in the second step, forinstance to increase the detection power in a data set with small conductance changes but large difference in thenoise levels, without modifying the first or third step. Such modifications are discussed in Section VI-A.The first multiresolution regression step can be computed by a pruned dynamic program. The computation of thelocal tests in the second step is straightforward and the deconvolution in the third step can be computed by aniterative grid search. These steps are detailed in Section III-D and summarized in Algorithm 1. An implementation Fig. 5: Idealization (purple) of the data in Figure 1 by
HILDE displayed on three different temporal scales. Inlower panels we also show the convolution of the idealization with the lowpass filter (orange).
HILDE idealizesevents on all relevant temporal scales well.is available by the function hilde in the R package clampSeg accompanying this paper. The package is available onrequest and has been submitted parallel to CRAN (Pein et al., 2019b).The worst case computational complexity is quadratic in the number of observations, but in most ion channelrecordings conductance changes occur frequently which reduces the complexity to linear in the number of obser-vations. For instance the
600 000 observations in Figure 1 can be idealized in a few minutes on a standard laptop. A detailed discussion of the computational complexity is given in Section III-D.
Simulations:
In Section IV we investigate the performance of
HILDE in Monte-Carlo simulations whichresemble the characteristics of the data in the application in Section V. Based on this we confirm that
HILDE works very well for data sets like the one shown in Figure 1. In more detail, it can detect events which last . , corresponding to only two subsequent observations and being less than one fifth of the filter length long,with probability almost one. Furthermore, all parameters (conductance levels and the locations of the changes) areestimated very accurately, see Section IV-B for more details. Moreover, two subsequent events can be separatedreliably as soon as the distance between them is larger than five times the filter length, see Section IV-C. In SectionIV-D we simulate data from a hidden Markov model. Our method is not assuming a HMM, but such a modelis still illustrative to simulate as it is a standard assumption for the analysis of ion channel recordings. We willalso see in Section V-C that a Markov model is reasonable for the PorB recordings. We find in Section IV-Dthat HILDE recovers all parameters with high precision. Those parameters are chosen similar to those which wehave estimated in Section V-C. Finally, we investigate robustness issues against f - and /f -noise in Section IV-E.We omit most of the time a systematic comparison with other approaches, since, as discussed in the introductionbefore, to the best of our knowledge all existing approaches assume a more restrictive model which hinders a faircomparison. However, we include JULES , HSMUCE and a
HMM based approach in the simulations in SectionIV-D to illustrate the shortcomings (and benefits) of these approaches further.
Application to PorB recordings:
Our analysis of single channel recordings of PorB in Section V confirmsall major results from (Bartsch et al., 2019) about this data set. Moreover, a novel finding of our analysis is thatthe dwell times do not fit a single exponential distribution, but suggests that two different regimes for the dwelltimes are underlying: very short openings of .
31 ms estimated average duration and longer openings of .
62 ms estimated average duration. To best of our knowledge, fast and slow gating at the same time was not observed forPorB before, but for another porine OmpG (Grosse et al., 2014). We stress that all results obtained by
HILDE could be confirmed by at least one other approach. However, none of the other methods were able to reproduce allresults obtained by
HILDE .In summary, in this work we proposed with
HILDE the first fully automatic model-free method for the analysis of ionchannel recordings affected from open channel noise, i.e., to the best of our knowledge no other existing methodologyis able to estimate a piecewise constant function in a model-free manner when filtering and heterogeneous noise arepresent at the same time. Simulations confirm that
HILDE deals efficiently with heterogeneous noise and filtereddata at the same time and idealizes events on various time scales efficiently. More precisely, to obtain a goodidealization events have to be only at least two subsequent observations long but separated from each other by atleast five times the filter length (at signal and noise ratio and filtering as in the analyzed data). This allowed us toobtain novel findings for the PorB channel, e.g., that it can have shorter and longer opening processes at the same time. II. M ODELING
We assume that the recordings result from equidistant sampling from the convolution of an unknown piecewiseconstant signal corrupted by Gaussian white noise with the (known) kernel of a lowpass filter. We stress howeverthat our methodology can be extended to an unknown filter by using the methodology of (Tecuapetla-G´omez andMunk, 2017). To incorporate heterogeneity, the white noise is scaled by an unknown piecewise constant function toallow a larger variance on segments on which the conductance is larger. We only allow potential variance changeswhen the conductance changes, since variance changes also depend on gating events of the channel. More precisely,we model the conductivity and the standard deviation by piecewise constant signals f and σ , f ( t ) = K (cid:88) j =0 c j l [ τ j ,τ j +1 ) ( t ) and σ ( t ) = K (cid:88) k =0 s k l [ τ k ,τ k +1 ) ( t ) , (II.1)where t denotes physical time. The (unknown) conductance levels are denoted as c , . . . , c K , the (unknown) standarddeviations as s , . . . , s K > , the (unknown) number of changes as K and the (unknown) locations as −∞ =: τ <τ < · · · < τ K < τ K +1 := τ end . The indicator function l A ( t ) is one if t ∈ A and zero otherwise. The signals areextended to τ = −∞ to define the convolution correctly but we will see at the end of this section that only a veryshort time period before recordings started, i.e., before t = 0 , will be relevant. We assume c k (cid:54) = c k +1 to define thenumber of changes unambiguously, i.e., to obtain an identifiable model. But we allow s k = s k +1 , i.e., the standarddeviation does not have to change between different events and in particular homogeneous noise is still part of themodel ( σ ≡ s ). We stress that the class of signals in (II.1) is very flexible as potentially any arbitrary numberof changes at arbitrary conductance levels and arbitrary standard deviations can be imposed, see Figure 5 for anexample.We assume further that the recorded data points Y , . . . , Y n (the measured conductivity at time points t i = i/f s , i =1 , . . . , n , equidistantly sampled at rate f s ) result from convolving the signal f perturbed by Gaussian white noise η scaled by the standard deviation function σ with an analogue lowpass filter, with (truncated) kernel F m , anddigitization at sampling rate f s = n/τ end , i.e., Y i = (cid:0) F m ∗ ( f + ση ) (cid:1) ( i/f s ) = ( F m ∗ f )( i/f s ) + (cid:15) i , i = 1 , . . . , n, (II.2)with ∗ the convolution operator. Here, n denotes the total number of data points (typically several hundred thousandsup to few millions). Like in (Hotz et al., 2013; Pein et al., 2018) we truncate (and rescale) the kernel of the lowpassfilter and the covariance function at m/f s to simplify our model. This is implemented in the R function lowpassFilter (Pein et al., 2019b). As a working rule, we choose m such that the autocorrelation function of the untruncatedanalogue lowpass filter is below − afterwards. For the later analyzed PorB traces, which are filtered by a 4-pole lowpass Bessel filter with cut-off frequency and sampled at
10 kHz , this choice leads to m = 11 . Hence, theresulting errors (cid:15) , . . . , (cid:15) n are Gaussian and centered, E [ (cid:15) i ] = 0 , and have covariance Cov (cid:2) Y i , Y i + j (cid:3) = (cid:80) Kk =0 s k (cid:2) A m ( i/f s − τ k , j/f s ) − A m ( i/f s − τ k +1 , j/f s ) (cid:3) for | j | = 0 , . . . , m, for | j | > m, (II.3)with A m ( t, l ) := (cid:90) t F m ( s ) F m ( s + l ) ds. (II.4)Note that we have in (II.3) an unknown non-stationary covariance structure. However, the covariance can bedecomposed into a known stationary autocorrelation given by the lowpass filter and an unknown non-stationaryvariance, which is modeled by a piecewise constant function that shares its change-points with the mean function.An analytic expression of A m ( t, l ) is implemented in the R function lowpassFilter . Hence, (II.3) can be computedexactly and efficiently.The major aim will be now to idealize (reconstruct) the unknown signal f taking into account the convolution, theheterogeneous noise given by (II.3) and the specific structure of f in (II.1). This will be done fully automatically andwith statistically error control. By fully automatic we mean that no user action is required during the idealizationprocess, only certain errors levels α = α + α , the maximal scale on which local tests are performed l max andtwo filter specific parameters have to be selected in advance, see Section III-E.III. M ETHODOLOGY : HILDEIn this section we detail the three steps of our H eterogeneous I dealization by L ocal testing and DE convolution( HILDE ) approach. A summary of these steps is given in the Meta-algorithm 1.
Data Y , . . . , Y n , error levels α = α + α , maximal scale l max ,regularization parameter γ , filter with kernel F m truncated at m/f s Detection of long events ( > l max ):Multiresolution regression at error level α Detection of short events : ( ≤ l max ):Local tests that take into account the convolution explixitly, at joint level α Parameter estimation :Local deconvolution, regularized by γ Idealization ˆ f , i.e., all event times and conductance levels Algorithm 1:
Steps of
HILDE . A. Detection of long events
To detect events on larger temporal scales, we use a modification of the H eterogeneous S imulataneous MU ltiscale C hange-point E stimator, HSMUCE from (Pein et al., 2017), which is a multiresolution procedure that is robustagainst heterogeneous noise. To avoid false positives due to the filter, we omit on each interval the first m observations and do not test on very short intervals. Since we truncated the filter, the signal and the convolutionof the signal with the lowpass filter differ only at the beginning of each segment. More precisely, if the signal isconstant on an interval [ i/f s , j/f s ] with conductance level c ij and the first m observations Y i , . . . , Y i + m − areignored, all other observations Y i + m , . . . , Y j have constant expectation equal to the conductance level c ij . Hence,we take into account only intervals longer than m/f s and ignore the first m observations of each interval.This leads to an estimator that detects change-points at presence of heterogeneous noise and filtering, i.e., when theheterogeneous ion channel model from Section II is assumed, while at the same time the probability to overestimatethe number of events is controlled, i.e., a false positive is only added with probability at most equal to the tuningparameter α , see Theorem VII.1 in Section VII in the supplement. A detailed definition of this estimator is givenin Section VII in the supplement. Note that it does not take into account the convolution explicitly but still hasgood detection properties if events are long enough, but almost no detection power on small scales. Simulations(not displayed) show that for our data set events with of length at least . , corresponding to sampling points,are detected reliably.In the following two sections we will present a refinement of this idealization to detect and idealize events onsmaller time scales, too, which proves to be relevant for our data example. Note that in this section and in the nextsection (a refinement will be provided in the local deconvolution step) we restrict all changes to the grid on whichthe observations are given, in other words, we assume that f s τ i are integers. B. Detection of short events
To detect short events, we test on all intervals containing l = 1 , . . . , l max (to be defined later) observations whetherthe previous idealization is the underlying signal or whether the inclusion of an additional event on the consideredinterval is significantly better. More precisely, let [ τ L , τ R ] = [ i/f s , j/f s ] be the interval on which we test. Andassume for the moment that τ is the only change in [( i − m + 1) /f s , ( j + m − /f s ] with conductance levels c L before and c R afterwards. Note that this also includes the scenario of no change in [( i − m + 1) /f s , ( j + m − /f s ] by setting c L = c R . Then, we decide whether an additional event on [ τ L , τ R ] is required by testing the hypothesis f ( t ) = c L if t < τ,c R if t ≥ τ (III.1) against the alternative f ( c )( t ) = c L if t < τ L ,c if τ L ≤ t < τ R ,c R if t ≥ τ R , (III.2)with c ∈ R arbitrary. The same structure is assumed for standard deviation functions σ and σ with values s L , s and s R . The precise hypotheses and alternatives, i.e., the values for τ, c L , c R , s L and s R , are determined by theprevious idealization step. If more than one change is contained in [( i − m + 1) /f s , ( j + m − /f s ] , no local testwill be performed on this interval. The reasoning behind this and how to obtain τ, c L , c R , s L and s R exactly areexplained in the paragraph ’ Obtaining the hypotheses and alternatives ’ in Section VIII in the supplement. All testsare performed at simultaneous error level α > .The form of these hypotheses allows us to construct a test statistic that takes into account the convolution explicitly.Moreover, information provided by potential variance changes can be used as well. We provide details of thecorresponding test in the paragraph ’ Local testing ’ in Section VIII in the supplement. All choices there are motivatedby a trade-off between a good detection power for events in the measurements in Section V, see Figure 1, and areasonable computational complexity.If a hypothesis is rejected, we replace the single change-point at τ by a short peak. Temporary locations will beplaced at τ L = i/f s and τ R = j/sr , but exact locations and the conductance level c will be obtained in theupcoming deconvolution step. However, note that usually one event in the data causes rejections of multiple tests.Therefore, we only consider the event with the largest test statistic among all rejections on intervals that intersector adjoin each other. More details are provided in the paragraph ’ Multiple dependent rejections ’ in Section VIII inthe supplement.
C. Parameter estimation by local deconvolution
The final idealization is obtained by local deconvolution as described in Section 3.2 of (Pein et al., 2018) withtwo adjustments, which will be discussed in Section IX in the supplement. This means in particular that we stilluse the likelihood function of observations with homogeneous noise, although heterogeneous noise is assumed.Simulations show, see Section IV, that this works reasonably well for the recordings we analyze in Section V.Alternatives for recordings with more pronounced noise heterogeneity are discussed in Section VI-A.
D. Computation and run time
The multiresolution regression step in Section III-A can be computed by a pruned dynamic program as describedin Section A.1 in the supplement of (Pein et al., 2017). For related ideas, see also (Killick et al., 2012; Frick et al.,2014; Li et al., 2016; Maidstone et al., 2017) and the references given there. The implementation of the local tests in Section III-B is straightforward. The local deconvolution in Section III-C can be computed by an iterativegrid search as described in Section 3.2 of (Pein et al., 2018). An implementation of HILDE is available by the R function hilde in the package clampSeg . The package is available on request and has been submitted parallel toCRAN (Pein et al., 2019b). All run time critical parts are written in C++ and are interfaced by the R code.The worse case computation complexity of the dynamic program is quadratic in the number of observations n .However, in most ion channel recordings conductance changes occur frequently which reduces the complexity tobe linear O ( n ) , see Section A.3 in the supplement of Pein et al. (2017). The local tests in Section III-B are ofcomplexity O ( l n ) , since for each of the l max scales , . . . , l max roughly n tests have to be performed and thecomplexity to compute a single test is at most of order O ( l max ) . The computation time of the local deconvolution isdominated by the iterative grid search to deconvolve a single event. The deconvolution of a single event is constant inthe number of observations, since the number of involved observations and the grid sizes do not increase. Moreover,the number of involved observations is small and the covariance matrix is a band matrix, with band size equal to m , which allows fast computation. Hence, the complexity of the deconvolution increases linearly in the number ofevents which increases for ion channel recordings typically linearly in the number of observations. In summary, fora typical channel trace the complexity to compute HILDE increases only linearly in the number of observations.This is confirmed by a run time of less than five minutes for idealizing the
600 000 observations in Figure 1 ona Dell Latitude E6530 with Intel(R) Core(TM) i5-3340M CPU 2.70GHz processor. Similar run times are obtainedfor the traces generated in Section IV-D. Thus, the theoretical considerations as well as the empirical run timesconfirm that
HILDE can be computed efficiently, which is important since large data sets have to be analyzed.
E. Parameter choices
HILDE can be tuned by the parameters α , α , l max and γ , see Algorithm 1 and the referenced sections for adefinition. The probability to overestimate the number of conductance changes is approximately controlled by thesum of the error levels α = α + α . Hence, if such an overestimation control is desired, α should be chosen small.As a default choice we suggest α = 0 . . Increasing α yields to a larger detection power (at the price of includingmore false positives). Hence, one may ’screen’ for different α if important events are difficult to detect. The levels α and α allocates the power between the multiresolution test for detecting events on large scales ( > l max ) andthe local tests to detect events on small scales ( ≤ l max ). We have chosen α = 0 . and α = 0 . in our dataanalysis, since our focus was on detecting short events primarily, while events on larger scales were easier to detect.More weight can be put on α if either short events are of less interest or if long events are difficult to detect aswell, e.g. since they have a smaller jump size than the short events. The latter is often called subgating and wasfor instance studied in (Hotz et al., 2013). The tuning parameter l max , the largest scale on which local tests areperformed to find short events, should be chosen such that all events on larger scales are detected by the previous multiresolution test. This can for instance be determined by Monte-Carlo simulations. In our setting, we choose l max = 65 , since simulations (not displayed) showed that the multiresolution step in Section III-A is able to detectevents which contain more than observations with probability almost one. The correlation matrix is regularizedwith parameter γ = 1 , further details can be found in Section III B in (Pein et al., 2018). And, as mentionedbefore, we truncate the kernel and autocorrelation function of the filter at m = 11 as the autocorrelation functionis below − afterwards. All of these choices are the default parameters of the function hilde and are used in thesimulations in Section IV and in the real data application in Section V.IV. S IMULATIONS
In this section we examine the performance of
HILDE in Monte-Carlo simulations. Since to our best knowledgeno other model-free method is known that takes into account heterogeneous noise and filtering explicitly, it isdifficult to compare
HILDE with other methods. Most similar in spirit are
JULES (Pein et al., 2018),
HSMUCE and an HMM based approach (Diehn, 2017). These have been included in a simulation in Section IV-D for purposeof comparison. The simulation study consists of four parts. First of all, we investigate the detection and idealizationof isolated peaks. Secondly, we identify the minimal distance at which
HILDE is able to separate two consecutivepeaks. Thirdly, although
HILDE does not rely on a hidden Markov model assumption, we examine its ability torecover the parameters of a Markov model, since a hidden Markov model is a common assumption for ion channelrecordings. Finally, we investigate its robustness against violations of the model in Section II, in particular againstadditional f and /f noise. A. Data generation
We generate all signals and observations accordingly to the heterogeneous ion channel model we described inSection II and such that they are in line with the measured data we analyze in Section V. This means in particularthat amplitudes, dwell times and noise levels of the generated observations are chosen such that they are similarto those of the analyzed datasets. We also simulate a 4-pole Bessel filter with cut-off frequency and samplethe observation at
10 kHz .The expectation of the observations, given by the convolution of the signal with the truncated kernel F m ofthe lowpass Bessel filter, can be computed explicitly. For the errors we oversample by a factor of , i.e., wegenerate times as many independent Gaussian observations, discretize the filter accordingly, compute a discreteconvolution and rescale the observations such that they have the desired standard deviation. B. Isolated peak
In this simulation with observations we examine the detection and idealization of a single isolated peak.More precisely, in accordance with the model in Section II and with the estimated values in Section V for the observations in Figure 1, we choose conductance levels c = c = 0 , c = 0 . , variances s = s = 6 . · − and varying variance s ∈ { · − , · − , − , · − , · − } to examine the influence of different noiselevels. Note that s = 10 − is roughly the noise level in the measurements in Section V. Moreover, we simulatechanges at τ = 2000 /f s and τ = (2000 + (cid:96) ) /f s , c.f. (II.1), and are interested in how well HILDE detects thepeak and idealizes the locations τ and τ and the level l as a function of (cid:96) , the length (relative to the samplingrate f s ) of the peak. For (cid:96) = 5 , Figure 6 shows an example of the simulated data as well as the idealizations by HILDE and their convolutions with the Bessel filter in a neighborhood of the peak. Tables I-III summarize ourresults based on
10 000 repetitions for (cid:96) = 2 , , .To this end, we count how often the signal is correctly identified , i.e., only the peak and no other change is detected.More precisely, we define the peak as detected if there exists a j such that | ˆ τ j − τ | < m/f s and | ˆ τ j +1 − τ | < m/f s as a peak is shifted at most m/f s by the filter. If only one change but not a peak is within these boundaries wedo not count it as a true detection, but also not as a false positive , whereas all other changes are counted as falsepositives . For the estimated locations and the level we only consider cases where the peak is detected and reportthe mean square error, the bias and the standard deviation.Fig. 6: Simulated observations (grey points), true block signal f ( ) and its convolution ( ), HILDE s idealization ( ) and itsconvolution ( ) with the lowpass 4-pole Bessel filter.
HILDE provides very accurate idealization.
In most scenarios,
HILDE has a good detection power and detects almost no false positives, see Table I. Onlyfor a five times larger variance than in the real data and when l ≤ few events are missed. In Tables II and III wefound that idealization of the locations τ and τ and the conductance value c works well for variances similarto the real data, but has some issues when the variance of the peak is larger, in particular in the scenario of afive times larger variance. For such observations it might be desirable to take into account the heterogeneous noisein the deconvolution step, see Section VI-A for more details. For smaller variances the results for estimating thelocations are better when the peak is longer, but for larger variances results are even worse when the peak is longer. TABLE I:
Performance of
HILDE in idealizing a signal with an isolated peak in different settings. They differ in the amount of open channelnoise and the length of the peak. More precisely, it has changes at τ = 0 . and τ = τ + (cid:96)/f s , (cid:96) = 2 , , , conductance levels c = c = 0 , c = 0 . , variances s = s = 6 . · − and varying variance s ∈ { · − , · − , − , · − , · − } . Results are based on
10 000 pseudo samples. An example, s = 10 − and (cid:96) = 5 , is given in Figure 6. Setting Length ( (cid:96) ) Correctlyidentified ( % ) Detected ( % ) False positive(Mean) s = 2 · − s = 5 · − s = 10 − s = 2 · − s = 5 · − s = 2 · − s = 5 · − s = 10 − s = 2 · − s = 5 · − s = 2 · − s = 5 · − s = 10 − s = 2 · − s = 5 · − An explanation might be two effects with opposite influences. The conductance change provide more informationwhen the peak is longer, but then also the overall variance of the observations is larger which reduces estimationaccuracy. Estimation of the level c is always more accurate when the peak is longer. It seems that here the firsteffect dominates.All in all, these simulations confirm that HILDE performs very well for observations comparable to them in SectionV.
C. Separation of two consecutive peaks
To examine how well
HILDE separates two consecutive peaks we perform the same simulations as in Section4.3 in (Pein et al., 2018), since results are identical for homogeneous and heterogeneous noise as separation dependsTABLE II:
Performance of
HILDE in idealizing a signal with an isolated peak in different settings. They differ in the amount of openchannel noise and the length of the peak. More precisely, it has changes at τ = 0 . and τ = τ + (cid:96)/f s , (cid:96) = 2 , , , conductance levels c = c = 0 , c = 0 . , variances s = s = 6 . · − and varying variance s ∈ { · − , · − , − , · − , · − } . Resultsare based on
10 000 pseudo samples and are given as multiples of the sampling rate f s = 10 . An example, s = 10 − and (cid:96) = 5 , is givenin Figure 6. Setting Length ( (cid:96) ) f s MSE (ˆ τ ) f s BIAS (ˆ τ ) f s SD (ˆ τ ) f s MSE (ˆ τ ) f s BIAS (ˆ τ ) f s SD (ˆ τ ) s = 2 · − s = 5 · − s = 10 − s = 2 · − s = 5 · − s = 2 · − s = 5 · − s = 10 − s = 2 · − s = 5 · − s = 2 · − s = 5 · − s = 10 − s = 2 · − s = 5 · − TABLE III:
Performance of
HILDE in idealizing a signal with an isolated peak in different settings. They differ in the amount of openchannel noise and the length of the peak. More precisely, it has changes at τ = 0 . and τ = τ + (cid:96)/f s , (cid:96) = 2 , , , conductance levels c = c = 0 , c = 0 . , variances s = s = 6 . · − and varying variance s ∈ { · − , · − , − , · − , · − } . Resultsare based on
10 000 pseudo samples. An example, s = 10 − and (cid:96) = 5 , is given in Figure 6. Setting Length ( (cid:96) ) MSE (ˆ c ) BIAS (ˆ c ) SD (ˆ c ) s = 2 · − s = 5 · − s = 10 − s = 2 · − s = 5 · − s = 2 · − s = 5 · − s = 10 − s = 2 · − s = 5 · − s = 2 · − s = 5 · − s = 10 − s = 2 · − s = 5 · − on the method and distance between the peaks but not on the noise level. More precisely, we consider a signal f with changes at τ = 2 000 /f s , τ = τ + 5 /f s , τ = τ + d and τ = τ + 5 /f s , , with τ = 0 and τ end = 4 000 /f s and levels l = l = l = 40 pS and l = l = 20 pS . Hence d is the distance between the two peaks. Wedistinguish between perfect separation, i.e., the detection step of HILDE identifies the two peaks (4 changes) andthe local deconvolution yields idealizations for the four levels (illustrated in Figure 7c). Secondly, separation fails inthe detection step, i.e., the multiresolution reconstruction recognizes only 2 changes and identifies one peak whoselevel can be further deconvolved (illustrated in Figure 7a). Finally, separation fails in the deconvolution step, i.e.,
HILDE identifies two peaks but the distance is so small that the deconvolution step cannot separate them, in otherwords, no long segment is in between (illustrated in Figure 7b). (a)
No separation in the detection step, d = 3 (b) No separation in the deconvolution, d =20 (c) Perfect separation, d = 35 Fig. 7:
Data y i = F ∗ ( f ( i/n ) + σ (cid:15) i ) (grey points), where σ = 1 . , (cid:15) i is gaussian white noise and the signal f has two consecutive peakscomprised of the levels l = l = l = 40 , l = l = 20 and change-points τ = 2000 /f s , τ = τ + 5 /f s , τ = τ + d and τ = τ + 5 /f s .True signal ( ) and JULES idealization ( ). Idealization coincides with reconstruction from the detection step if separation fails in thedeconvolution step. Figure 8 shows the frequency at which each scenario occurred as a function of d , the distance between the twopeaks, in
10 000 simulations for each value of d = { , , . . . , } . We found that the two peaks are detected if d > , but separation in the detection step and hence an appropriate idealization requires d ≥ . Hence, in SectionV events have to be separated by more than . to be idealized appropriately. In comparison, we found thatevents are on average separated by / (2 .
67 + 4 .
50) s ≈ .
14 s which shows that this limitation is not an issue forthe analyzed PorB recordings.
D. Hidden Markov model
In this section we simulate data from a three state hidden Markov model. Since hidden Markov models are oftenassumed for ion channel recordings, it is instructive to investigate the methods in such a scenario. We simulateobservations that resemble the PorB data we analyze in Section V. More precisely, we have expectations , and .
32 nS as well as standard deviations . , . and . , i.e. the variances are . · − (nS) and − (nS) . The dwell times in the first, second and third state are exponentially distributed with rates
20 Hz ,
400 Hz and , respectively. The process always jumps from the first or second state to the third state, i.e., notransitions between the first and second state are allowed. And it jumps from the third state with probability / tothe first state and with probability / to the second state. We generate five time series with
600 000 observations,each. Each trace looks similar to the observations in Figure 1 and hence we refrain from showing an example.We analyze these data sets with
HILDE and for purpose of comparison with
JULES (Pein et al., 2018),
HSMUCE
Fig. 8:
Results for
HILDE assuming heterogeneous noise in idealizing two consecutive peaks separated by distance d . Its frequencies for noseparation in the detection step (green), for successful detection, but no separation in the deconvolution step (red) and for successful detectionand deconvolution (blue). Results are based on
10 000 simulations for each value of d . (Pein et al., 2017) and an HMM based approach which assumes the true three state model, i.e. three states, wherebytwo have the same expectations and variances and no transitions are allowed between them. We used µ = (0 , , . T , s = (0 . , . , . T , P = .
999 0 0 . .
99 0 . .
02 0 .
03 0 . . as starting values for the Baum-Welch algorithm. Those standard deviations were determined by taking the empiricalstandard deviation of all observations below and above . , respectively. Idealizations are obtained by using a Viterbialgorithm.The Baum-Welch algorithm estimated the following parameters µ = (0 , , . T , s = (0 . , . , . T , P = . . . . . . . . We will discuss the estimated transition matrix later in comparison with the other approaches and when we alsodiscuss the results using the Viterbi algorithm. The estimated expectations and standard deviations are accurate. Forthe other approaches we show in Figure 9 histograms of the estimated amplitudes of all events with an amplitudebetween . and . . (a) HILDE (b)
JULES (c)
HSMUCE
Fig. 9: Histograms of the estimated amplitudes using various idealization approaches. The black line (for
HILDE hidden by the red line) indicates the true amplitude of . and the red line the estimated amplitudes of . , . and . by the half sample mode using the idealizations from HILDE , JULES and
HSMUCE ,respectively.We found in Figure 9 that all approaches estimate the amplitude accurately. The estimated amplitudes of
HSMUCE are skewed, but the final estimation is still decent.We continue with an analysis of the dwell times. To this end, we consider from now on all events with estimated conductance level between − .
05 nS and .
05 nS as a closed event and between .
15 nS and as an open event,while all other events are considered as artifacts and are ignored. Figure 10 shows histograms of the dwell timesin the closed state for various approaches. (a)
HILDE (b)
JULES (c)
HSMUCE (d)
HMM
Fig. 10: Histograms of the dwell times in the closed state with exponential fits (red). We rescaled all lines suchthat the area under them are standardized to one to make them comparable to the histograms.We see that with the exception of
HSMUCE (it misses short events) none on the histograms look exponentiallydistributed, since we have a mixture of short and long events. Hence, in Figures 11 and 12 we will analyze shortand long events separately. To this end, we say an event is short if its dwell time is between . and andlong if its dwell time is between
20 ms and
200 ms . To estimate the rates, we apply a missed event correction likein (Pein et al., 2018). (a)
HILDE (b)
JULES (c)
HSMUCE (d)
HMM
Fig. 11: Histograms of the dwell times in the closed state with a length between . and to analyze shortevents, together with the true exponential distribution (black line) and exponential fits (red) that are corrected formissed events. We rescaled all lines such that the area under them are standardized to one to make them comparableto the histograms. This explains why the true distribution does not look always the same. The blue line in the HMM plot indicates an exponential fit when the same missed event correction is used that is applied to the results obtainedby
HILDE .We found from Figures 11 and 12 that
HILDE recovers in both cases the exponential distribution very well andestimates both rates of
400 Hz and
20 Hz with .
59 Hz and . accurately. In comparison, JULES is notable to deconvolve all events due to the detection of additional spurious events, compare Figure 3. The rate for (a) HILDE (b)
JULES (c)
HSMUCE (d)
HMM
Fig. 12: Histograms of the dwell times in the closed state with a length between
20 ms and
200 ms to analyze longevents, together with the true exponential distribution (black line) and exponential fits (red) that are corrected formissed events. We rescaled all lines such that the area under them are standardized to one to make them comparableto the histograms. This explains why the true distribution does not look always the same.the short events is with .
16 Hz still accurately, but the rate for the long events is with . significantlyunderestimated. Notably the dwell times are still (almost) exponentially distributed. HSMUCE misses short events,in total it has detected only short events. Hence, a rate for the short events cannot be estimated. The rate forthe long events is with . underestimated as well. The hidden Markov approach estimated with .
44 Hz and . both rates accurately. However, since this approach misses very short events, for the rate for theshort events we had to apply a stricter missed event correction that takes into account only events with a length ofat least .
75 ms . Hence, at least in the used form the hidden Markov approach is less favorable to analyze shortevents (since its corrected estimate is based on less event and hence will have a larger variance). This is remarkably,since the idealization on very short temporal scales is considered to be a strength of hidden Markov approaches.Finally, the estimated exit probabilities by the Baum-Welch algorithmus of . and . correspondsto estimated rates of .
71 Hz and .
44 Hz which is much worse than the rates estimated using the idealizationsobtained by the Viterbi algorithm.We are now analyzing how often closing events occur. To this end, we analyze the dwell times in the open stateor in other words the distance between two closing events. Moreover, we analyze the proportions of short andlong events. Therefore, we divide the number of detected events by the estimated probability that such an event isdetected assuming an exponential distribution for the dwell times.We found from Figure 13 that
HILDE , HSMUCE and the hidden Markov approach recover the exponentialdistribution very well and estimate the rate of with . , . and . accurately. Only JULES underestimates the rate because of previously explained reasons with . a bit. HILDE , JULES and
HMM estimated with . , . and . , respectively, the proportion of short events decently, recall thatthe truth is / . This number could not be determined using HSMUCE , since it misses almost all short events.Once again, the Baum-Welch provides with .
06 Hz and . much worse results. (a) HILDE (b)
JULES (c)
HSMUCE (d)
HMM
Fig. 13: Histograms of the dwell times in the open state, together with the true exponential distribution (black line)and exponential fits (red) that are corrected for missed events. We rescaled all lines such that the area under themare standardized to one to make them comparable to the histograms. This explains why the true distribution doesnot look always the same.All in all, we found that
HILDE was indeed able to recover all parameters very well. All other model-freeidealization methods had at least one massive problem. The hidden Markov approach might be usable, but requiresa more restrictive missed event correction and is also more complicated to apply. One should also keep in mindthat we used the true parametric model class as prior knowledge.
E. Robustness
The model we proposed in Section II is a good assumption for ion channel recording at presence of openchannel noise. However, in some patch clamp recordings additional high frequency f (violet) and long tailed /f (pink) noise components have been observed, for a more detailed discussion see (Neher and Sakmann, 1976;Venkataramanan et al., 1998b; Levis and Rae, 1993) and the references therein. Thereto, in this section we examinehow robust HILDE is against such noise components. To this end, we revisit the simulation setting from SectionIV-B with s = 10 − only.For the violet noise we use as suggested by (Venkataramanan et al., 1998a) a moving average process with coeffi-cients . and − . . For the pink noise we use the algorithm available on https://github.com/Stenzel/newshadeofpink .We assume that the pink noise is globally present. More precisely, we reduce the previously present noise by afactor of / and add pink noise which is scaled such that its standard deviation is equal to / √ . · − (halfof the standard deviation in the background in Section IV-B). For the high frequency violet noise we consider thesetting that the new noise component is state-dependent as well. In other words, we generated errors from such amoving average process and convolved them with the kernel of the lowpass filter instead of assuming white noiseerrors.We found in Table IV that HILDE is very robust against the additional f but effected by /f noise. At presenceof the latter noise, the standard deviation estimation on the long segments is wrong which causes the detection TABLE IV:
Robustness of
HILDE against additional noise components in idealizing a signal with an isolated peak having changes at τ = 0 . and τ = τ + (cid:96)/f s , (cid:96) = 2 , , , conductance levels c = c = 0 , c = 0 . , variances s = s = 6 . · − and s = 10 − .Results are based on
10 000 pseudo samples.
Noise type Length ( (cid:96) ) Correctlyidentified ( % ) Detected ( % ) False positive(Mean)White noise 2 99.94 99.98 0.0010 f noise 2 99.94 99.98 0.0010 /f noise 2 75.04 99.28 0.4351White noise 3 99.97 100.00 0.0006 f noise 3 99.97 100.00 0.0006 /f noise 3 75.95 99.32 0.4452White noise 5 99.95 100.00 0.0010 f noise 5 99.94 100.00 0.0012 /f noise 5 76.65 99.54 0.4448 of false positives in roughly a quarter of the cases. Note, that false positives are caused by the underestimatedstandard deviation but also by the long range dependency itself. However, the false positives have a small amplitudeand therefore do not influence the analysis severely or can be removed by postfiltering. Parameter estimation (notdisplayed) is slightly effected by /f noise (estimation of the change-point locations is slightly worse, but estimationof the size of the change is even improved), but not affected by presence of f noise.V. D ATA ANALYSIS
A. Measurements
We analyze single channel recordings of PorB from
Neisseria meningitidis (recall the last paragraph in theintroduction). In the following we analyze six traces, each of them is one minute long and consists of
600 000 observations. An example is shown in Figure 1, which shows distinct heterogeneous noise.Measurements were performed on solvent-free planar bilayers using the Port-a-Patch (Nanion Technologies). Giantunilamellar vesicles (GUVs) composed of 1,2-diphytanoyl- sn -glycero-3-phosphocholine (DPhPC)/cholesterol (9:1)were prepared by electroformation (AC, U = 3 V, peak-to-peak, f = 5 Hz, t = 2 h) in the presence of 1 M sucroseat ◦ C . Spreading of a GUV in M KCl,
10 m M HEPES, pH 7.5 on an aperture (d = 1-5 µ m) in a borosilicatechip by applying 10-40 mbar negative pressure resulted in a solvent-free membrane with a resistance in the G Ω range. Once the membrane with a G Ω seal was formed, varying amounts of a PorB stock solution (2.2 µ M in200 mM NaCl, 20 mM Tris, 0.1% (w/w) LDAO, pH 7.5) were added to the buffer solution (50 µ L) at an appliedDC potential of +40 mV. Current traces were recorded at a sampling rate of
10 kHz and filtered with a low-passfour-pole Bessel filter of using an Axopatch 200B amplifier (Axon Instruments). For digitalization, an A/Dconverter (Digidata 1322; Axon Instruments) was used. B. Idealization
Idealizations are obtained by
HILDE with parameter choices as in Section III-E. Moreover, an illustrativecomparison with other approaches was discussed in the introduction (recall Figures 19-4). In Figure 1 we seethat the channel switches frequently between two conductance levels, roughly between .
04 nS and .
36 nS , thevariance is roughly . · − (nS) in the closed state and − (nS) in the open state. Moreover, several artifactsseem to be present, see for instance the fluctuating conductance in the open state in the first ten seconds. We stressthat such artifacts heavily disturb any idealization that assumes a HMM, confer Figure 19. Contrarily, the model-free idealization by HILDE (Figure 5) recovers all visible features on small as well as on large temporal scalesaccurately. In particular, the zooms into single peaks (Figure 5, lower panels) shows that
HILDE fits the observationswell which is also a confirmation of our model. Since PorB forms three pores, four different conductance levels arepossible. However, in this measurement we see only two different conductance levels. Such a cooperative openingand closing was observed before, see for instance (Song et al., 1998).
C. Analysis of flickering dynamics
We now use the obtained idealizations to analyze the gating dynamics in a similar fashion as the simulated datain Section IV-D. We will focus in this section on
HILDE , but we will compare it in Section XI in the supplementwith analyses based on
JULES , HSMUCE and
HMM . We say a channel opens (a gating event from the lowerconductance level to the higher conductance level) if the idealized level is between .
25 nS and and the previouslevel is between and . . To study the amplitude, we consider the conductance difference of all such events.Figure 14 shows a histogram of the so obtained amplitudes between . and . . All other events are eitherclosing events or are considered as artifacts. Such artifacts can for instance be base line fluctuations as discussedin the introduction. We stress that an analysis of the closing events leads to very similar results.The histogram in Figure 14 shows only one mode. Hence, all events have the same amplitude up to measurementsand idealization errors. This means especially that also the flickering events are full-sized. An amplitude of . is estimated by the half sample mode (Robertson and Cryer, 1974), computed in R by using the modeest package.Note that other mode estimators or Gaussian mean estimation lead to similar results. This amplitude coincides withthe one obtained by a manual analysis using the pClamp 10.2 software package (Axon Instruments), see (Bartschet al., 2019).We now analyze the dwell time in the open state and how frequently the channel opens. We take into accountevents with an amplitude between . and . and with a dwell time between . and
200 ms , sinceshorter events cannot be detected reliably and longer events are rare and often interrupted by artifacts. Histogramsof the dwell time in the open state are shown in Figure 15 together with an exponential fit using a missed eventcorrection like in (Pein et al., 2018). Fig. 14: Histograms of the amplitudes between . and . . The vertical red line indicates the estimatedamplitude of . by the half sample mode. (a) All events between . and
200 ms . (b) Short events between . and . (c) Long events between
20 ms and
200 ms . Fig. 15: Histograms of the dwell times in the open state of all opening events with amplitude between . and . together with exponential fits using a missed event correction (red line).Interestingly, the dwell times do not fit a single exponential distribution, but when we split the events in short(shorter than ) and long (longer than
20 ms ) ones, both fit exponential distributions very well, with an estimatedaverage duration of .
62 ms and .
31 ms , respectively. Note, that these estimations are approximations, since anexponential distribution with a large / small rate generates with a small probability a long / short event, but sincethe average dwell times are very different this error is negligible. To best of our knowledge, fast and slow gatingat the same time was not observed for PorB before. However, Grosse et al. (2014) showed that the loop withinthe pore structure of OmpG leads to fast flickering (fast time constant). If the loop is removed, there is still gatingobserved but less frequent (slower time constant). Even though this is not the same protein, in PorB we have a loopL3 which is also localized in the pore and forms an α -helix in its center, which constricts the pore to its narrowest point. Hence, our findings support that similar dynamics might occur for PorB as well.We are now analyzing the distance between two opening events. This is shown in Figure 16. Moreover, we analyzehow many of the openings are short or long. Once again we apply a correction for missed events.Fig. 16: Histograms of the distances between two opening events with amplitude between . and . togetherwith exponential fits using a missed event correction (red line).The distance between two events seem to be exponentially distributed and the estimated rate is .
75 Hz . We foundthat . of all opening events were short events. Moreover, we found in Section XI in the supplement that allresults obtained by HILDE could be confirmed by at least one other approach, but none of the other methods wasable to reproduce all results obtained by
HILDE .VI. D
ISCUSSION AND O UTLOOK
In this paper we proposed a new model-free idealization method for ion channel recordings, called
HILDE .In comparison to existing approaches,
HILDE provides still reasonable idealizations under heterogeneous noise,for instance caused by open channel noise. Moreover, it detects and idealizes flickering events reliable, is fully-automatic and can be computed efficiently. It offers great flexibility in adapting to the needs of a specific dataanalysis by modifying the error probabilities α = α + α and the scale l max that distinguishes short and longevents. Its precise idealization is confirmed by simulations and a real data application to PorB recordings. We foundthat these recordings contain opening events of significantly different length.We stress that HILDE is modular, i.e., single components like the choice of the test statistics and functionals tooptimize can be changed without further modifications. This can be used to adapt
HILDE to specific challenges in the measurements. We will discuss several such possibilities in the following. Some of them are implementedin the clampSeg package and just require to choose different parameters, for others few lines of code have to bemodified. A. Alternative approaches
A different underlying interval set can be used for the multiresolution test. The set of all intervals of dyadic lengthprovides in general a good compromise between detection power and computation time. But, if a larger detectionpower is required, the set of all intervals can be used at the price of a larger computational complexity. The otherway around, if faster computation is demanded, a smaller interval set, for instance the dyadic partition like in(Pein et al., 2017), can be used. This might be particularly beneficial in situations in which the multiresolution testdetects almost no events which results in a large computation time. Interesting alternatives are also the approachesin (Chan and Walther, 2013; Kov´acs et al., 2020) which require only a slightly larger computational effort than theuse of all intervals of dyadic length but detects change-points in a certain sense statistically optimally. A differentway to increase the detection power is to use likelihood ratio tests, again at computational expenses. We found insimulations (not displayed) that the likelihood ratio test statistic is slightly more powerful on small scales, but muchslower to compute. However, a slightly worse detection power on small scales should not be a big concern, since arefinement by local tests will be done in the next step. Also for detecting events on small scales by local tests, seeSection III-B, different statistics can be used to increase the detection power. For instance the likelihood ratio testor maximum likelihood estimators for the parameters ( c, s ) can be considered. However, they are computationallyvery demanding, since the likelihood function involves the inverse and the determinant of the covariance matrixgiven by (VIII.5).Finally, our deconvolution approach assumes still homogeneous noise which we found in simulations works stillwell at presence of open channel noise, see Tables II and III. Taking into account the heterogeneous noise might bebeneficial, in particular if the noise level differences are large, but difficult, maybe even impossible, since avoidingan ill-conditioned matrix by regularization and keeping the variance levels might be impossible to achieve at thesame time. B. Homogeneous noise
We designed
HILDE particularly to deal with heterogeneous noise. However, taking into account the convolutionexplicitly when detecting changes is also beneficial if the noise is homogeneous, i.e., a constant variance is assumed.In this situation, its detection power can be further improved by small modifications that utilize the assumption of aconstant variance, they are explained in Section XII in the supplement. We found that
HILDE has a better detectionpower than
JULES (Pein et al., 2018), but at the price of worse separation properties and a larger computation time. More precisely, in the simulations in Section IV-B in (Pein et al., 2018) we found that JULES is able todetect an isolated peak of length . /f s with probability almost one. In comparison, HILDE requires only . /f s if homogeneous noise is assumed and . /f s if heterogeneous noise is assumed (see Section 3.9.3. in (Pein, 2017)).Remarkably, the detection power of HILDE is even larger than the one of
JULES if HILDE does not use theassumption of homogeneous noise which illustrates how much detection power is lost by not taking into accountthe convolution.
C. Idealizing the variance
Our focus was on idealizing the conductance while the unknown variance was considered as a nuisance parameter.However, as a byproduct
HILDE can easily be extended to an idealization of the variance which offers for instancea residual analysis of the noise to validate a given model. To this end, we use
HILDE to estimate the change-pointlocation of the conductance and assume that these are the change-points of the variance as well. Note that themodel of Section II allows the variance to stay constant at such a location but precludes further variance changes.With the definitions from before (see Section III), if a segment is long, the square of the estimator in (VIII.1) canbe used. Afterwards, the variance on short segments can be estimated by the estimator in (VIII.8). The resultingfunction will be an idealization of the variance. R
EFERENCES
Ball, F. G. and Rice, J. A. (1992). Stochastic models for ion channels: introduction and bibliography.
Math. Biosci. ,112(2):189–206.Bartsch, A., Llabr´es, S., Pein, F., Kattner, C., Sch¨on, M., Diehn, M., Tanabe, M., Munk, A., Zachariae, U., andSteinem, C. (2019). High-resolution experimental and computational electrophysiology reveals weak β -lactambinding events in the porin porB. Sci. Rep. , 9(1):1264.Chan, H. P. and Walther, G. (2013). Detection with the scan and the average likelihood ratio.
Stat. Sin. , pages409–428.Colquhoun, D. (1987).
Practical analysis of single channel records . Microelectrode techiques. The Plymouthworkshop handbook. Cambridge: Company of Biologists.de Gunst, M. C. M., K¨unsch, H. R., and Schouten, J. G. (2001). Statistical analysis of ion channel data usinghidden Markov models with correlated state-dependent noise and filtering.
J. Am. Stat. Assoc. , 96(455):805–815.Diehn, M. (2017).
Inference in Inhomogeneous Hidden Markov Models with Application to Ion Channel Data . PhDthesis, Georg-August-Universit¨at G¨ottingen. http://hdl.handle.net/11858/00-1735-0000-0023-3FB4-2.Diehn, M., Munk, A., and Rudolf, D. (2019). Maximum likelihood estimation in hidden markov Models withinhomogeneous noise.
ESAIM: P&S , 23:492–523. Enikeeva, F., Munk, A., and Werner, F. (2018). Bump detection in heterogeneous Gaussian regression.
Bernoulli ,24(2):1266–1306.Frick, K., Munk, A., and Sieling, H. (2014). Multiscale change point inference (with discussion and rejoinder bythe authors).
J. R. Statist. Soc. B , 76(3):495–580.Fuli´nski, A., Grzywna, Z., Mellor, I., Siwy, Z., and Usherwood, P. N. R. (1998). Non-Markovian character of ioniccurrent fluctuations in membrane channels.
Phys. Rev. E , 58(1):919–924.Gnanasambandam, R., Nielsen, M. S., Nicolai, C., Sachs, F., Hofgaard, J. P., and Dreyer, J. K. (2017). Unsupervisedidealization of ion channel recordings by minimum description length: Application to human PIEZO1-channels.
Front. Neuroinform. , 11.Goychuk, I., H¨anggi, P., Vega, J. L., and Miret-Art´es, S. (2005). Non-Markovian stochastic resonance: Three-statemodel of ion channel gating.
Phys. Rev. E , 71(6):061906.Grosse, W., Psakis, G., Mertins, B., Reiss, P., Windisch, D., Brademann, F., B¨urck, J., Ulrich, A., Koert, U., andEssen, L.-O. (2014). Structure-based engineering of a minimal porin reveals loop-independent channel closure.
Biochemistry , 53(29):4826–4838.Hartel, A. J. W., Shekar, S., Ong, P., Schroeder, I., Thiel, G., and Shepard, K. L. (2019). High bandwidth approachesin nanopore and ion channel recordings–A tutorial review.
Anal. Chim. Acta .Heinemann, S. H. and Sigworth, F. J. (1988). Open channel noise. IV. Estimation of rapid kinetics of formamideblock in gramicidin A channels.
Biophys. J. , 54(4):757–764.Heinemann, S. H. and Sigworth, F. J. (1990). Open channel noise. V. Fluctuating barriers to ion entry in gramicidinA channels.
Biophys. J. , 57(3):499–514.Heinemann, S. H. and Sigworth, F. J. (1991). Open channel noise. VI. Analysis of amplitude histograms to determinerapid kinetic parameters.
Biophys. J. , 60(3):577–587.Hotz, T., Sch¨utte, O. M., Sieling, H., Polupanow, T., Diederichsen, U., Steinem, C., and Munk, A. (2013). Idealizingion channel recordings by a jump segmentation multiresolution filter.
IEEE Trans. Nanobioscience , 12(4):376–386.Kass, R. S. (2005). The channelopathies: novel insights into molecular and genetic mechanisms of human disease.
J. Clin. Invest. , 115(8):1986–1989.Killick, R., Fearnhead, P., and Eckley, I. A. (2012). Optimal detection of changepoints with a linear computationalcost.
J. Am. Stat. Assoc. , 107(500):1590–1598.Kov´acs, S., Li, H., B¨uhlmann, P., and Munk, A. (2020). Seeded Binary Segmentation: A general methodology forfast and optimal change point detection. arXiv preprint arXiv:2002.06633 .Levis, R. A. and Rae, J. L. (1993). The use of quartz patch pipettes for low noise single channel recording.
Biophys.J. , 65(4):1666–1677. Li, H., Munk, A., and Sieling, H. (2016). FDR-control in multiscale change-point segmentation.
Electron. J. Stat. ,10(1):918–959.Maidstone, R., Hocking, T., Rigaill, G., and Fearnhead, P. (2017). On optimal multiple changepoint algorithms forlarge data.
Stat. Comput. , 27(2):519–533.Mercik, S. and Weron, K. (2001). Stochastic origins of the long-range correlations of ionic current fluctuations inmembrane channels.
Phys. Rev. E , 63(5):051910.Neher, E. and Sakmann, B. (1976). Single-channel currents recorded from membrane of denervated frog musclefibers.
Nature , 260(5554):799–802.Overington, J. P., Al-Lazikani, B., and Hopkins, A. L. (2006). How many drug targets are there?
Nat. Rev. Drug.Discov. , 5(12):993–996.Pein, F. (2017).
Heterogeneous Multiscale Change-Point Inference and its Application to Ion Channel Recordings .PhD thesis, Georg-August-Universit¨at G¨ottingen. http://hdl.handle.net/11858/00-1735-0000-002E-E34A-7.Pein, F., Hotz, T., Sieling, H., and Aspelmeier, T. (2019a). stepR: Multiscale change-point inference . R packageversion 2.1-0.Pein, F., Hotz, T., Tecuapetla-G´omez, I., and Aspelmeier, T. (2019b). clampSeg: Idealisation of Patch ClampRecordings . R package version 1.1-0.Pein, F., Sieling, H., and Munk, A. (2017). Heterogeneous change point inference.
J. Roy. Statist. Soc. Ser. B ,79(4):1207–1227.Pein, F., Tecuapetla-G´omez, I., Sch¨utte, O. M., Steinem, C., and Munk, A. (2018). Fully-automatic multiresolutionidealization for filtered ion channel recordings: flickering event detection.
IEEE Trans. Nanobioscience ,17(3):300–320.Qin, F., Auerbach, A., and Sachs, F. (2000). Hidden Markov modeling for single channel kinetics with filteringand correlated noise.
Biophys. J. , 79(4):1928–1944.Raj Singh, P., Ceccarelli, M., Lovelle, M., Winterhalter, M., and Mahendran, K. R. (2012). Antibiotic permeationacross the OmpF channel: modulation of the affinity site in the presence of magnesium.
J. Phys. Chem. B ,116(15):4433–4438.Robertson, T. and Cryer, J. D. (1974). An iterative procedure for estimating the mode.
J. Am. Stat. Assoc. ,69(348):1012–1016.Sakmann, B. and Neher, E. (1995).
Single-Channel Recording . Springer, 2nd. edition.Schroeder, I. (2015). How to resolve microsecond current fluctuations in single ion channels: The power of betadistributions.
Channels , 9(5):262–280.Shelley, C., Niu, X., Geng, Y., and Magleby, K. L. (2010). Coupling and cooperativity in voltage activation of alimited-state BK channel gating in saturating Ca2+.
J. Gen. Physiol. , 135(5):461–480. Siekmann, I., Wagner, L. E., Yule, D., Fox, C., Bryant, D., Crampin, E. J., and Sneyd, J. (2011). MCMC estimationof Markov models for ion channels.
Biophys. J. , 100(8):1919–1929.Sigworth, F. J. (1985). Open channel noise. I. Noise in acetylcholine receptor currents suggests conformationalfluctuations.
Biophys. J. , 47(5):709–720.Sigworth, F. J. (1986). Open channel noise. II. A test for coupling between current fluctuations and conformationaltransitions in the acetylcholine receptor.
Biophys. J. , 49(5):1041–1046.Sigworth, F. J., Urry, D. W., and Prasad, K. U. (1987). Open channel noise. III. High-resolution recordings showrapid current fluctuations in gramicidin A and four chemical analogues.
Biophys. J. , 52(6):1055–1064.Song, J., Minetti, C. A. S. A., Blake, M. S., and Colombini, M. (1998). Successful recovery of the normalelectrophysiological properties of PorB (class 3) porin from Neisseria meningitidis after expression in Escherichiacoli and renaturation.
BBA Biomembranes , 1370(2):289–298.Tecuapetla-G´omez, I. and Munk, A. (2017). Autocovariance estimation in regression with a discontinuous signaland m-dependent errors: A difference-based approach.
Scand. J. Stat. , 44(2):346–368.VanDongen, A. M. (1996). A new algorithm for idealizing single ion channel data containing multiple unknownconductance levels.
Biophys. J. , 70(3):1303–1315.Venkataramanan, L., Kuc, R., and Sigworth, F. J. (1998a). Identification of hidden Markov models for ion channelcurrents. II. State-dependent excess noise.
IEEE Trans. Signal Process. , 46(7):1916–1929.Venkataramanan, L., Kuc, R., and Sigworth, F. J. (2000). Identification of hidden Markov models for ion channelcurrents. III. Bandlimited, sampled data.
IEEE Trans. Signal Process. , 48(2):376–385.Venkataramanan, L., Walsh, J. L., Kuc, R., and Sigworth, F. J. (1998b). Identification of hidden Markov modelsfor ion channel currents. I. Colored background noise.
IEEE Trans. Signal Process. , 46(7):1901–1915.Venkatraman, E. S., Olshen, A. B., Lucito, R., and Wigler, M. (2004). Circular binary segmentation for the analysisof array-based DNA copy number data.
Biostatistics , 5(4):557–572.Virji, M. (2009). Pathogenic neisseriae: surface modulation, pathogenesis and infection control.
Nat. Rev. Microbiol. ,7(4):274.Viterbi, A. (1967). Error bounds for convolutional codes and an asymptotically optimum decoding algorithm.
IEEETrans. Inf. Theory , 13(2):260–269.Yellen, G. (1984). Ionic permeation and blockade in Ca2+-activated K+ channels of bovine chromaffin cells.
J.Gen. Physiol. , 84(2):157–186. S UPPLEMENT TO H ETEROGENEOUS I DEALIZATION OF I ON C HANNEL R ECORDINGS - O
PEN C HANNEL N OISE
VII. L
ARGE SCALES
Following the ideas of
HSMUCE (Pein et al., 2017) we propose the idealization ˆ f by ˆ f := ˆ K (cid:88) k =0 ˆ c k l [ˆ τ k , ˆ τ k +1 ) := argmin ˜ f ∈F ˆ K ,M ( ˜ f ) ≤ n (cid:88) i =1 (cid:0) Y i − ˜ f ( i/n ) (cid:1) ˆ σ i , (VII.1)with ˆ σ i = (cid:80) j ∈ I i ( Y j − ˜ f ( i/n )) / | I i | , I i = { j ∈ { , . . . , n } : ˜ f ( k/n ) = ˜ f ( i/n ) ∀ k = j, . . . , i } . In other words, ˆ f is the maximum likelihood estimator restricted to all solutions of the optimization problem ˜ f ∈ F ˆ K such that M ( ˜ f ) ≤ . (VII.2)Thereby, F k is the set of all candidate signals in (II.1) with K = k changes, k ≥ . ˆ K is the estimated number ofchanges defined by the minimal number k for which there exits an ˜ f ∈ F k such that M ( ˜ f ) ≤ . Finally, M ( ˜ f ) denotes the multiresolution test statistic M ( ˜ f ) := max ≤ i ≤ j ≤ n ˜ f ([ i/f s ,j/f s ]) const. (cid:110) T i,j ( ˜ f ) − q | j − i +1 | (cid:111) . (VII.3)This tests simultaneously on all scales (resolution levels) whether ˜ f fits the data well. If it does not, the local teststatistic T i,j will be larger than the scale dependent critical value q | j − i +1 | , exact definitions are given below, and ˜ f will not be considered as a potential idealization. Such a method has many favorable properties, for more detailssee (Frick et al., 2014; Pein et al., 2017, 2018). Most importantly, the number of false positives is controlled ifthe scale dependent critical values are defined appropriately (one possible choice is outlined below), see TheoremVII.1.We use as test statistic the statistic in (1.5) in (Pein et al., 2017) without taking into account the first m observations( JSMURF principle ), i.e., T i,j ( ˜ f ) := ( j − i + 1 − m ) ( Y i + m,j − ˜ f ij ) s i + m,j , (VII.4)with ˜ f ij the conductance level of ˜ f on the interval [ i/n, j/n ] , Y i + m,j = ( j − i + 1 − m ) − (cid:80) jl = i + m Y l and ˆ s i + m,j = ( j − i − m ) − (cid:80) jl = i + m ( Y l − Y i + m,j ) . This statistic estimates the variance locally and is large if themean of the observations differ significantly from the conductance level ˜ f ij . The scale dependent critical values q m +1 , . . . , q n are obtained in a universal manner by Monte Carlo simulations as described in Section 2 of (Peinet al., 2017) such that (VII.3) is a level α -test and different scales are balanced by weights. Here, we use thedefault choice of uniform weights. The error level α ∈ (0 , has to be fixed in advance (by the experimenter) to control the number of false positives of the estimate ˆ f as stated in the following Theorem VII.1. See Section III-Efor a discussion how to choose α . Theorem VII.1.
Assume the heterogeneous ion channel model from Section II and let ˆ f be defined as in (VII.1) with error level α . Then, for any f and σ in (II.1) the probability that ˆ f overestimates the number of changes isbounded by α , i.e. P (cid:16) ˆ K > K (cid:17) ≤ α . (VII.5)To keep the method computationally feasible, we only evaluate the maximum in (VII.3) over the system ofall intervals that contain a dyadic number of observations, i.e., the maximum in (VII.3) is only taken over all ≤ i ≤ j ≤ n such that j − i + 1 = 2 l for some l ∈ N . This reduces the complexity of the system from O ( n ) to O ( n log( n )) intervals. This speeds up the pruned dynamic program and the simulations in Section IVshow that this interval set is large enough to allow a good performance. More details and a discussion of the runtime are given in Section III-D. Note that (Hotz et al., 2013) performed tests on a slightly different interval set.They required that j − i + 1 − m is a dyadic number. Although depending on the true signal, their choice leads ingeneral to a better detection power on scales slightly larger than the filter length m/f s , but its computation lastsmuch longer. And missed short events can be detected by the upcoming local tests. The very long computationtime was one major criticism in (Gnanasambandam et al., 2017). To be fair, both implementations differ in otherpoints, too, and a fast implementation of their interval set might be possible as well, but our approach was easierto integrate in the dynamic programming framework of the stepR package (Pein et al., 2019a). Finally, we remarkthat the restricted maximum likelihood estimator ignores the convolution and hence the locations of the detectedchanges are typically a little bit shifted to the right which will be corrected in the upcoming deconvolution stepin Section III-C. Different to this, (Hotz et al., 2013) suggested to move all locations by a constant factor t , onlydepending on the filter, to the left, but our deconvolution step will (usually) be more precise.VIII. S MALL SCALES
The upcoming three paragraphs describe precisely how the local tests are performed to detect short events thatare missed in the idealization obtained in Section III-A.
Obtaining the hypotheses and alternatives:
In this paragraph we give details for the construction of thehypothesis (III.1) and alternative (III.2). Assuming the model from Section II, we see that the expectation ofthe observations Y i , . . . , Y j is determined by the signal on the interval [( i − m ) /f s , j/f s ] . The other way around,information about the underlying signal on an interval [ i/f s , j/f s ] is provided by the observations Y i +1 , . . . , Y j + m − and the signal on [( i − m + 1) /f s , ( j + m − /f s ] effects the expectation of these observations. Hence, for a local test on an interval [ i/f s , j/f s ] we distinguish few scenarios depending on how many changes the previousidealization has in [( i − m + 1) /f s , ( j + m − /f s ] .If no change is contained, we test a constant signal against the alternative of an additional event on [ i/f s , j/f s ] with an arbitrary conductance level. If one change is contained, we test a signal with one change, exact details arediscussed below, against the alternative of an additional event on [ i/f s , j/f s ] with an arbitrary conductance level.If the test rejects, the single change is replaced by two and the exact locations and the conductance level betweenthese two changes are obtained in the upcoming deconvolution step in Section III-C. In the rare situation that twoor more changes are present no local test is performed to save computation time, since the parameters of more thantwo changes can anyway not be estimated in the upcoming deconvolution step. Moreover, we only test on intervalswith start and end point at the observation grid. Both limitations can be narrowed as discussed in Section VI-A, atthe price of a larger computation time. But, we found that our choices are sufficient for the data we analyze.We now describe how we obtain the parameters τ, c L , c R , s L and s R in the hypotheses in (III.1) and alternativesin (III.2). To this end, note that changes in the idealization from Section III-A are typically slightly shifted to theright, since the convolution is ignored and the lowpass filter acts only in the past. Hence, if we simply obtain theparameters from the previous idealization, many hypotheses will be wrongly rejected, even if the true underlyingsignal has only one change in [( i − m + 1) /f s , ( j + m − /f s ] . To correct for this, we reestimate the locationsof all isolated changes by deconvolution. This is performed locally as described in the upcoming Section III-C.Since we assume for testing that all changes are on the observation grid, we perform the deconvolution only atthe observation grid, without any refinement at finer scales. This includes a reestimation of the conductance levelson long segments by medians. In other words, as the hypothesis we assume the signal which will be obtained bydeconvolution if no test rejects, up to refinements using finer grids. At the same time, estimation of the conductancelevels on long segments by the median guarantees that they are not too badly estimated even if few short peaks aremissed.On long segments, in addition to the expectation, the standard deviation s is estimated by ˆ s ( Y a , . . . , Y b ) = IQR ( Y a + m − Y a , . . . , Y b − Y b − m )2Φ − (0 . (cid:112) F ∗ F )(0) , (VIII.1)using the same observations as used for estimating the expectation. Here, Φ − denotes the quantile of the standardnormal distribution.Finally, we recommend to choose l max such that events on all larger scales are already detected by the previousidealization (or have such a small jump size that they are also not detectable by the tests in this step). We foundin simulations (not displayed) that l max = 65 is a suitable default choice. Local testing:
In this paragraph we propose a test that provides a good trade-off between detection powerfor events in the measurements in Section V and computational complexity. We start with estimating the unknown parameters c and s under the alternative.To estimate c we use the least squares estimator ˆ c := argmin c ∈ R j + m − (cid:88) l = i +1 (cid:0) Y l − E [ Y l ] (cid:1) = argmin c ∈ R j + m − (cid:88) l = i +1 (cid:0) Y l − v l c − c LR,l (cid:1) = (cid:80) j + m − l = i +1 v l ( Y l − c LR,l ) (cid:80) j + m − l = i +1 v l , (VIII.2)where v l := F m ( l/f s − τ L ) − F m ( l/f s − τ R ) (VIII.3)and c LR,l := c L [1 − F m ( l/f s − τ L )] + c R F m ( l/f s − τ R ) , (VIII.4)with F m the antiderivative (step function) of the truncated filter kernel. Moreover, it follows from (II.3) that underthe alternative (III.2) the covariance is given by Cov (cid:2) Y l , Y l + r (cid:3) = w l,r s + s LR,l,r for | r | = 0 , . . . , m, for | r | > m, (VIII.5)with w l,r := A m ( l/f s − τ L , r/f s ) − A m ( l/f s − τ R , r/f s ) (VIII.6)and s LR,l,r := s L (cid:2) A m ( ∞ , r/f s ) − A m ( l/f s − τ L , r/f s ) (cid:3) + s R A m ( l/f s − τ R , r/f s ) . (VIII.7)Hence, for estimating the variance s we use the weighted estimator ˆ s := max (cid:32) (cid:80) j + m − l = i +1 w l, (cid:0) Y l − v l ˆ c − c LR,l (cid:1) − B ( s L , s R ) A , (cid:33) , (VIII.8)with A and B ( s L , s R ) such that E (cid:34) j + m − (cid:88) l = i +1 w l, (cid:0) Y l − v l ˆ c − c LR,l (cid:1) (cid:35) =: As + B ( s L , s R ) . (VIII.9)Note that the random variable of which we take the expectation in (VIII.9) can be written as a quadratic form ( Y i +1 ,j + m − − E [ Y i +1 ,j + m − ]) t C ( Y i +1 ,j + m − − E [ Y i +1 ,j + m − ]) , where Y i,j = ( Y i , . . . , Y j ) t and all entries of thematrix C are non-negative and depend only on v l and w l, , l = i + 1 , . . . , j + m − . This combined with (VIII.5)confirms the proposed structure in (VIII.9) follows and allows to computed A and B ( s L , s R ) explicitly.Note that the estimator ˆ c is unbiased, while for ˆ s this would be true without the projection of negative values tozero in (VIII.8), which however reduces the mean square error. Using these estimators, under the alternative the observation Y l has estimated expectation ˆ c ,l := v l ˆ c + c LR,l and estimated variance ˆ s ,l := w l, ˆ s + s LR,l, . Under the null hypothesis the observation Y l has expectation c ,l := c L (cid:2) −F m ( t − τ ) (cid:3) + c R F m ( t − τ ) and variance s ,l := s L (cid:2) A m ( ∞ , −A m ( l/f s − τ ) (cid:3) + s R A m ( l/f s − τ, .Finally, using these estimators we propose the test statistic T ji := j + m − (cid:88) l = i +1 log (cid:32) s ,l ˆ s ,l (cid:33) + ( Y l − c ,l ) s ,l − ( Y l − ˆ c ,l ) ˆ s ,l . (VIII.10)We are aware that this test statistic and its underlying estimators might be improvable with respect to efficiencyof the estimators and the power of the resulting test for its various alternatives, for a more detailed discussion andpotential alternatives see Section VI. But, as stressed before, we aimed for a test that has at least a good power forthe recordings in Section V and can be computed efficiently. This will be confirmed by the simulations in SectionIV.Moreover, note that this test uses information provided by potential standard deviation changes as well. This isdifferent to the multiresolution test we used in Section III-A to detect events on large scales and to HSMUCE .Note that, different to similar ideas in these settings, it can be computed efficiently, since only testing is requiredand not regression based on these tests. This is another gain of the three step procedure we propose in this paper.The test problem is also of a different type than the one in (Enikeeva et al., 2018), since we allow the standarddeviation to be constant ( s = s L or s = s R ) when the conductance changes. Critical values:
It remains to choose critical values that balance the different tests appropriately. To this end,we obtain again scale depend critical values by using the approach from Section 2 in (Pein et al., 2017). We applyit with significance level α and equal weights β , . . . , β l max = 1 /l max . By this we aim to control the overallprobability of detecting an false positive by α := α + α . While we showed in Theorem VII.1 such a control forthe multiresolution procedure in Section III-A, we are not able to prove such a bound for the local tests as well,since the previous idealization might not be exactly the true signal up to events on shorter temporal scales and hencethe observations are not generated exactly according to the hypotheses (III.1) and alternatives (III.2). Moreover, tospeed up the required Monte-Carlo simulations we use the following simplification. When computing the test staticsin the Monte-Carlo simulations, we ignore the previous idealization step and assume instead a constant signal. Sincethe idealization by JSMURF leads with probability at least − α to a constant idealization, this error is negligible.All in all, we found in simulations that the local tests keep the error level α well. Multiple dependent rejections:
Usually one event in the data causes rejections of multiple tests. Hence, weonly add the event that corresponds to the rejection with the largest test statistic among all rejections on intervalsthat intersect or adjoin each other. More precisely, two rejections on intervals [ i /f s , j /f s ] and [ i /f s , j /f s ] are only considered as two separated events if the intervals are disjoint and (w.l.o.g. let j < i ) there exists an l ∈ { j + 1 , . . . , i − } such that all tests on intervals containing l/f s accept the hypothesis. The choice to consider the rejection with the largest test statistic is a natural choice for all tests on intervals of the same length, sincethey share the same distribution (under their respective null hypotheses and alternatives). For tests on intervals ofdifferent lengths this is not exactly true. Nonetheless, we found that considering the rejection with the largest teststatistic works very well in practice. That is because usually the test statistics are much larger when their alternativeis true than when their null hypothesis is true, which outweighs the (slightly) different distributions (under theirrespective null hypotheses and alternatives). Also note that a slight missestimation of a location does not have anoticeable effect, since the final estimation of them is obtained in the upcoming deconvolution step.IX. L OCAL DECONVOLUTION
In this section we describe two minor modifications on the local deconvolution approach of (Pein et al., 2018)which we made to adapt to the heterogeneous noise setting. As summarized at the end of the introduction thelocal deconvolution approach of (Pein et al., 2018) requires that two short events are separated by at least one longevent. Parameters on long events can be estimated without deconvolution. In our setting we have to estimate meanand standard deviation instead of only the mean as in (Pein et al., 2018). Hence, our first modification is that werequire at least instead of ten observations in the definition of a long segment to guarantee a reasonable wellparameter estimation. Secondly, we adapt the choice of the grids. For the idealization in Section III-A and for thedetection step of JULES (Pein et al., 2018) the locations of the estimated changes are shifted to the right, since theconvolution was ignored. Hence, we use for a change ˆ τ detected in Section III-A (and not replaced by two detectedchanges by the local tests) still the grid { ˆ τ − m/f s , ˆ τ − ( m + 1) /f s , . . . , ˆ τ } . However, for a change ˆ τ detectedby the local tests in Section III-B we use instead { ˆ τ − (cid:100) m/ (cid:101) /f s , . . . , ˆ τ + (cid:100) m/ (cid:101) /f s } , since the locations are notestimated precisely, but also not systematically shifted to one side. The reestimation of the conductance levels onlong segments is adapted in the same way. Everything else is performed in the same way as explained in Section3.2 of (Pein et al., 2018). X. I DEALIZATIONS BY EXISTING APPROACHES
In this section we discuss in more detail than in the introduction the idealization of the observations in Figure 1by various approaches. We begin with approach that assume a hidden Markov model. We observed two differentconductance levels and hence started with two states. We used the following starting values µ = (0 . , . T , s = (0 . , . T , P = .
95 0 . .
05 0 . for mean, standard deviation and and transition matrix, respectively. The standard deviations were determined bytaking the standard deviation of all observations below and above . , respectively. Fig. 17: Idealization (red) of the data in Figure 1 assuming a HMM with two states displayed on three differenttemporal scales. In the lower panels we also show the convolution of the idealization with the lowpass filter (blue).It fits most part of the data well, but misses short events, for instance the event displayed in lower left panel.Its idealization, obtained by the Viterbi algorithm and displayed in Figure 17, looks well on all larger temporalscales, but misses short events, for instance the event displayed in the lower left panel. To fit such events well, we added a third state and used the starting values µ = (0 . , . , . T , s = (0 . , . , . T , P = . . .
999 0 . .
02 0 .
03 0 . . (X.1) Fig. 18: Idealization (red) of the data in Figure 1 assuming a HMM with three states displayed on three differenttemporal scales. In the lower panels we also show the convolution of the idealization with the lowpass filter (blue).It fits most part of the data well, but misses short events, for instance the event displayed in lower left panel.
However, the resulting idealization, displayed in Figure 18, did not change much. In fact, the new state isattained only two times, i.e., fits artifacts instead of the short events. We varied the starting values but withoutmuch success. To improve results we decided to assume the same expectation and variance for those two statesand we used again the starting values in (X.1). Note that this model-class was motivated by the results we obtainedfrom using our model-free idealization approach
HILDE which illustrates nicely how model-free approaches canbe used to support HMMs. The resulting idealization, displayed in Figure 2 in the introduction, however still missesshort events. To detect such events we think that it requires to take into account the filtering explicitly. This canbe done by introducing so called meta-states (Venkataramanan et al., 1998b,a; de Gunst et al., 2001). We usedthe implementation in (Diehn, 2017), but found it way to slow to run it for three states. Hence, we only assumedtwo states but with filtering. We determined the conductance and variance levels as well as a starting value for thetransition matrix by fitting an unfiltered HMM. Secondly, we applied an Baum-Welch algorithm, which takes intoaccount a discretized filter, to estimate the transition matrix. It estimates a transition probability from the closed tothe open state of . and of . for a transition from the open to the closed state. Finally, the observationsare idealized by a Viterbi algorithm assuming discrete filtering as well. This idealization is displayed in Figure 19.It detects short events very well, for instance the events between
10 s and .
15 s could be true events that aremissed by other approaches. However, it also detects a huge amount of most likely false positives events at parts ofthe data with low conductivity, e.g., before . (see for instance the lower right panel) and between . and .
75 s . It appears likely that these events are false positives, since there is not any visible indication of an eventin the data at these locations. False positives can for instance be explained by missestimated parameters due to atoo small number of states or also because of the previously highlighted artifacts in the data set.Secondly, we discussed in Figure 3 in the introduction an idealization by
JULES (Pein et al., 2018). More precisely,we used its implementation in R given by the function jules in the CRAN package clampSeg (Pein et al., 2019b)with its default parameter. This method is designed to take into account the filtering explicitly to idealize short eventswell, but assumes homogeneous noise. Hence, we found that it detects many small false positives on segments with alarger conductance and variance. Another model-free ion channel idealization approach is
TRANSIT (VanDongen,1996). Here, we used its implementation in R given by the function transit in the CRAN package stepR (Pein et al.,2019a) with its default parameter. Its idealization is shown in Figure 21. Like
JULES it detects many small falsepositives on segments with a larger conductance and variance. Such false positives are not a specific flaw of thesemethods, they will occur for any reasonable idealization method that ignores the heterogeneous noise.Finally, we discussed in the introduction in Figure 4 an idealization by
HSMUCE , which serves as an examplefor a method that takes into account the heterogeneous noise, but ignores the filtering. We used the R function stepFit of the CRAN package stepR (Pein et al., 2019a) with the parametric family ”hsmuce” and a conservativesignificance level of α = 0 . to avoid overfitting. As discussed in the introduction, it works well on larger time Fig. 19: Idealization (red) of the data in Figure 1 assuming a HMM with filtering (Diehn, 2017) displayed on threedifferent temporal scales. In the lower panels we also show the convolution of the idealization with the lowpassfilter (blue). It detects very short events but finds a huge number of (most likely false positive) events at parts oflow conductance (see for instance the lower right panel).scales, but is not able to detect shorter events. If we increase α , this effect reduces, but at the precise of additionalfalse positives, see Figure 22, where we used α = 0 . . This effect is even more pronounced when we use CBS (Venkatraman et al., 2004), see Figure 23, a method that like
HSMUCE takes into account heterogeneous noise,but ignores the filtering. However, in comparison to
HSMUCE it puts less emphasize on avoiding false positives, but shows generally a higher detection power for very short events. In fact, we found that CBS is able to find shortevents, but at the price of a massive overfit. A last alternative could be subsampling to mitigate the filtering effects.However, this obviously also does not allow to detect short events and we did not display such results to avoidfurther lengthening of the paper.In summary, none of the existing methods was able to idealize the data set in the middle panel of Figure 1 reliably.In some cases, data cleaning or postfiltering might improve results, but this not only a huge amount of work thatis required for every new data set again, it is also highly subjective. In comparison,
HILDE provided in Figure 5a reasonable idealization without that anything like this was required.XI. A
NALYSIS OF FLICKERING DYNAMICS USING IDEALIZATIONS OF EXISTING APPROACHES
In Section V-C we analyzed the gating dynamics of the PorB traces using the idealizations obtained by
HILDE .In this section we will examine whether we can obtain the same results when we use other idealization methodsinstead. Given the results from the previous section, we restrict ourself to
JULES , HSMUCE and the hiddenMarkov approach which assumes three states but with shared expectation and standard deviation for two states (inthe following denoted by
HMM ). Figure 24 shows histograms of the amplitudes.We note that all model-free approaches estimate roughly the same amplitude. However, when we used the Baum-Welch algorithm to fit the hidden Markov model we obtained an amplitude of . which is far off and mostlikely caused by artifacts. Next we consider in Figures 25-27 the estimated dwell times in the open state.We found in Figures 25-27 that all comparisons are qualitatively the same as for the hidden Markov modelsimulations in Section IV-D. Once again HSMUCE is not able to detect short events and
HMM requires a strictermissed event analysis to analyze them well. Using
HILDE , JULES and
HMM we obtained an average duration of .
31 ms , .
72 ms and .
05 ms for the short events and using
HILDE , JULES , HSMUCE and
HMM we obtained anaverage duration of .
62 ms , .
37 ms , .
85 ms and .
96 ms for the long events. Hence, the estimates obtainedby
HILDE are roughly confirmed. Finally, we are now analyzing the dwell times in the closed state (Figure 28)and the proportions of short and long events.We found in Figure 28 that all four methods indentified an exponential distrubtion and also the estimate ratesare rather similar: Using
HILDE , JULES , HSMUCE and
HMM we obtained a frequency of .
75 Hz , .
80 Hz , .
66 Hz and .
74 Hz , respectively. And using
HILDE , JULES , and
HMM we estimated the proportion of shortevents to be . , . and . , respectively. In summary, all results obtained by using HILDE could beconfirmed by at least one other approach. However, one should also note that none of the other methods was ableto reproduce all results obtained by
HILDE . This is confirmed by simulations in Section IV-D, where we simulateddata from a hidden Markov similar to the one we estimated from the PorB data. Fig. 20: Idealization (red) of the data in Figure 1 by
TRANSIT displayed on three different temporal scales. In thelower panels we also show the convolution of the idealization with the lowpass filter (blue). It detects short events,but finds many small events (which are most likely false positives) at parts of high conductance and high variance(see for instance the idealization of the observations around .
36 nS in the middle panel). These detections hinderthe decovolution (see for instance the lower left panel) and make the idealization unreliable.Fig. 21: Idealization by
TRANSIT (red) of the signal underlying the observations in Figure 1 and its convolutionwith the lowpass filter (darkred). It detects a huge amount of false positives at segments with high conductance andhigh variance. Fig. 22: Idealization (red) of the data in Figure 1 by
HSMUCE using a large significance of α = 0 . displayedon three different temporal scales. In the lower panels we also show the convolution of the idealization with thelowpass filter (blue). It still misses short events (for instance the one displayed in the lower left panel), but alsoincludes several false positives. XII. H OMOGENEOUS NOISE
This section details how
HILDE can be adapted to the assumption of homogeneous noise. A constant variancecan be preestimated as in (6) in (Pein et al., 2017). The first multiresolution step to detect events on larger scalescan be performed by
JSMURF as defined in (Hotz et al., 2013). And for the local tests to detect events on smaller Fig. 23: Idealization (red) of the data in Figure 1 by
CBS displayed on three different temporal scales. In the lowerpanels we also show the convolution of the idealization with the lowpass filter (blue). It detects a huge amount offalse positives at segments with high conductance and high variance.scales in the second step we suggest the (regularized) likelihood ratio test statistic T ji := ( Y i +1 ,j + m − − ( c ) i +1 ,j + m − ) t Σ − i +1 ,j + m − ( Y i +1 ,j + m − − ( c ) i +1 ,j + m − ) − ( Y i +1 ,j + m − − ( c (ˆ c )) i +1 ,j + m − ) t Σ − i +1 ,j + m − ( Y i +1 ,j + m − − ( c (ˆ c )) i +1 ,j + m − ) , (a) HILDE (b)
JULES (c)
HSMUCE
Fig. 24: Histograms of the estimated amplitudes using various idealization approaches. The the red line indicatesthe estimated amplitudes of . , . and . by the half sample mode using the idealizationsfrom HILDE , JULES and
HSMUCE , respectively. (a)
HILDE (b)
JULES (c)
HSMUCE (d)
HMM
Fig. 25: Histograms of the dwell times of opening events with amplitude between . and . together withexponential fits using a missed event correction (red line). We consider all dwell times between . and
200 ms . (a) HILDE (b)
JULES (c)
HSMUCE (d)
HMM
Fig. 26: Histograms of the dwell times of opening events with amplitude between . and . together withexponential fits using a missed event correction (red line). We consider short dwell times between . and .with ˆ c := argmax c ∈ R ( Y i +1 ,j + m − − ( c ( c )) i +1 ,j + m − ) t Σ − i +1 ,j + m − ( Y i +1 ,j + m − − ( c ( c )) i +1 ,j + m − ) . (a) HILDE (b)
JULES (c)
HSMUCE (d)
HMM
Fig. 27: Histograms of the dwell times of opening events with amplitude between . and . together withexponential fits using a missed event correction (red line). We consider long dwell times between
20 ms and
200 ms . (a) HILDE (b)
JULES (c)
HSMUCE (d)
HMM
Fig. 28: Histograms of the dwell times in the closed state, together with exponential fits (red) that are corrected formissed events.Here, ( c ) i +1 ,j + m − and ( c ( c )) i +1 ,j + m − are the vectors ( c ) i +1 ,j + m − = (cid:0) ( F m ∗ f )(( i + 1) /f s ) , . . . , ( F m ∗ f )(( j + m − /f s ) (cid:1) t , ( c ( c )) i +1 ,j + m − = (cid:0) ( F m ∗ f ( c ))(( i + 1) /f s ) , . . . , ( F m ∗ f ( c ))(( j + m − /f s ) (cid:1) t and Σ i +1 ,j + m − the covariance matrix of the observations Y i +1 , . . . , Y j + m − given by (VIII.5) and regularizedby Tikhonov regularization with parameter γ = σ . And, recall, Y i,j = ( Y i , . . . , Y j ) t and f and f1