State-space solutions to the dynamic magnetoencephalography inverse problem using high performance computing
Christopher J. Long, Patrick L. Purdon, Simona Temereanca, Neil U. Desai, Matti S. Hämäläinen, Emery N. Brown
The Annals of Applied Statistics © Institute of Mathematical Statistics, 2011
Imperial College London, GlaxoSmithKline, Harvard Medical School, Massachusetts General Hospital, Massachusetts Institute of Technology (MIT), and Harvard/MIT Division of Health Sciences & Technology
Determining the magnitude and location of the neural sources within the brain that are responsible for generating magnetoencephalography (MEG) signals measured on the surface of the head is a challenging problem in functional neuroimaging. The number of potential sources within the brain exceeds the number of recording sites by an order of magnitude. As a consequence, the estimates of the magnitude and location of the neural sources will be ill-conditioned because of the underdetermined nature of the problem. One well-known technique designed to address this imbalance is the minimum norm estimator (MNE). This approach imposes an L2 regularization constraint that serves to stabilize and condition the source parameter estimates. However, these classes of regularizers are static in time and do not consider the temporal constraints inherent to the biophysics of the MEG experiment. In this paper we propose a dynamic state-space model that accounts for both spatial and temporal correlations within and across candidate intracortical sources. In our model, the observation model is derived from the steady-state solution to Maxwell's equations, while the latent model representing the neural dynamics is given by a random walk process. We show that the Kalman filter (KF) and the Kalman smoother [also known as

Received September 2010; revised April 2011. Supported in part by the National Science Foundation through TeraGrid resources provided by the Texas Advanced Computing Center (TACC). This work was also supported by NIH Grants NIBIB R01 EB0522 (E.N.B.), DP2-OD006454 (P.L.P.), K25-NS05780 (P.L.P.), DP1-OD003646 (E.N.B.), NCRR P41-RR14075, R01-EB006385 (E.N.B., P.L.P., M.S.H.) at Massachusetts General Hospital.
Key words and phrases. Magnetoencephalography, source localization, Kalman filter, fixed-interval smoother.
This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in
The Annals of Applied Statistics, 2011, Vol. 5, No. 2B, 1207–1228. This reprint differs from the original in pagination and typographic detail.
the fixed-interval smoother (FIS)] may be used to solve the ensuing high-dimensional state-estimation problem. Using a well-known relationship between Bayesian estimation and Kalman filtering, we show that the MNE estimates carry a significant zero bias. Calculating these high-dimensional state estimates is a computationally challenging task that requires High Performance Computing (HPC) resources. To this end, we employ the NSF TeraGrid Supercomputing Network to compute the source estimates. We demonstrate improvement in performance of the state-space algorithm relative to MNE in analyses of simulated and actual somatosensory MEG experiments. Our findings establish the benefits of high-dimensional state-space modeling as an effective means to solve the MEG source localization problem.
1. Introduction.
Electromagnetic source imaging is a neuroimaging technique that permits the study of neural events on a millisecond timescale. This type of imaging reveals brain dynamics that cannot be seen with imaging modalities such as functional magnetic resonance imaging (fMRI) or positron emission tomography (PET), which capture brain activity on a much slower timescale. Two of the most important examples of electromagnetic imaging are electroencephalography (EEG) and magnetoencephalography (MEG). These modalities acquire multiple time series recorded at locations exterior to the skull that are generated, respectively, by the electric and magnetic fields of neuronal currents within the cortex of the brain [Hämäläinen et al. (1993), Nunez (1995)]. In these experiments subjects execute a task that putatively activates multiple brain areas. In the case of EEG recordings, the mode of acquisition involves affixing an array of electrodes to the surface of the scalp and measuring the potential differences relative to a reference node. In contrast, MEG records extremely weak magnetic fields emanating from the brain, on the order of femtoTesla, that is, 10^{-15} Tesla [Nunez (1995)]. To put these quantities into context, the magnetic field of the earth is around 5 × 10^{-5} Tesla, while that of a present-day magnetic resonance scanner used for medical diagnostic imaging is on the order of 1.5 to 3 Tesla.

The accuracy of the inverse solution depends critically on two main features of the problem, specifically, the precision of the forward model and the choice of inverse solution algorithm. The MEG forward model is designed to describe how activity propagates from sources in the cortex to the SQUID (superconducting quantum interference device) detectors. These forward models are normally constructed by combining information about head and brain geometry with the electromagnetic properties of the skull, dura and scalp to yield a numerical approximation to Maxwell's quasistatic equations; see, for example, Hämäläinen et al. (1993).
Under this forward model the strength of the magnetic field recorded at a given SQUID detector near the surface of the head is approximately proportional to the reciprocal of its squared distance from a given cortical source. The second consideration we face relates to the choice of source localization algorithm. Solving this localization problem is one of the most technically challenging tasks in MEG imaging research and has been an active area of methods development for both EEG and MEG over the past two decades. Historically, two of the most popular methods for solving the EEG/MEG inverse problem have been the dipole-based and (linear) distributed-source methods.

Dipole-based methods model the underlying neuronal sources as discrete current dipoles whose location, orientation and amplitude are unknown quantities to be estimated [Mosher, Lewis and Leahy (1992), Uutela, Hämäläinen and Salmelin (1998)]. Choosing the number of dipoles to include in the model is problematic. Eigenvalue decomposition methods have been developed to address this model selection issue [Mosher, Lewis and Leahy (1992)]. However, these approaches still require subjective interpretation when the eigenvalue distribution does not yield an obvious distinction between the signal and noise subspaces. In addition, finding the best-fitting parameters for the multidipole model requires nonlinear optimization. Since the measured fields depend nonlinearly on the dipole position parameters, typical minimization routines may not yield the globally optimal estimates for these parameters. Proposed solutions to this problem include the MUSIC (MUltiple SIgnal Classification) algorithm [Mosher and Leahy (1998)], global optimization algorithms and partly heuristic model constructions that use interactive software tools [Uutela, Hämäläinen and Salmelin (1998)].
However, none of these methods utilizes the temporal sequence of the signals; that is, if the original data points are first permuted and the inverse permutation is applied to the source estimates, the results are the same as without permutation. Finally, in many important clinical and neuroscientific applications, such as epilepsy, sleep or general anesthesia, the dipole model is inappropriate, since the generating currents are most likely distributed across large areas of the cortex.

In the distributed-source models, each location in the brain represents a possible site of a current source. Since the number of unknown sources vastly outnumbers the number of EEG or MEG data recordings, constraints
are required to obtain a unique solution. The minimum-norm estimator (MNE) employs an L2-norm data "fidelity" term to quantify the relationship between the observed magnetic field recordings and the estimated source predictions and an L2-norm regularization or penalty term on the magnitude of those solutions. While the minimum-norm estimate yields the solution with the smallest energy (sum of squares) across the solution space [Hämäläinen et al. (1993)], it tends to consistently favor low-amplitude solutions that are located close to the scalp surface. The relative contribution of the data term and the source term can be controlled or tuned through a regularization parameter. In MNE, the data term is specifically the L2-norm of the spatially whitened data, while the source term is the L2-norm of the current amplitudes. A well-known variant of this approach is LORETA (LOw-Resolution Electromagnetic Tomography Algorithm), where the L2-norm source term specified in MNE is replaced by a weighted norm that approximates the spatial Laplacian of the currents [Pascual-Marqui, Michel and Lehmann (1994)]. In the case of minimum-current estimates (MCE) [Uutela, Hämäläinen and Somersalo (1999)], the source term is taken to be the L1-norm of the source current amplitudes. A consequence of this choice is that the estimates will be sparse as opposed to diffuse, leading to families of solutions that may closely resemble those uncovered through a dipolar analysis.

Beamformer approaches, originally inspired by problems in radar and sonar array signal processing [Krim and Viberg (1996)], seek to localize EEG/MEG activity by specifying a series of spatial filters, each tuned to maximize sensitivity to a particular spatial location within the brain [Van Veen et al. (1997)]. These methods assume, however, that EEG/MEG sources are spatially uncorrelated, and are limited by interference crosstalk from sources outside the focus of the beamformer spatial filter.
A more recent related approach by Zumer et al. (2007) features a graphical model framework to characterize the probabilistic relationships among latent and observable temporal sources for averaged ERP data. Once a low-dimensional statistical data subspace is estimated, the algorithm computes full posterior distributions of the parameters and hyperparameters with the primary objective of alleviating temporal correlation between spatially distinct sources (source separation). In later work [Zumer et al. (2008)] this approach is extended somewhat to include explicit temporal modeling by drawing on estimated weightings from a predefined family of smooth basis functions. Wipf et al. (2010) explore an alternative procedure for empirical Bayesian source modeling that attempts to simultaneously capture and model correlated source-space dynamics in the presence of unknown dipole orientation and background interference.

Bayesian approaches to MEG source localization have included Phillips et al. (2005), Mattout et al. (2006) and Friston et al. (2008). Collectively, these techniques construct the source localization problem within an instantaneous empirical Bayes framework. These approaches specify a nested and hierarchical series of spatial linear models to be subsequently solved under a penalized Restricted Maximum Likelihood (ReML) procedure. The multiresolutional nature of this regime carries with it a measure of spatial adaptation for the estimated hyperparameters, since the spatial clusters of activated dipoles are amalgamated across several (orthogonal) scales. At the same time, the most likely set of spatial priors for each competing model is chosen using information-based model ranking procedures. The latter approach [Friston et al. (2008)] extends the use of ReML to engender a framework aiming to move from parametric spatial priors to larger sets of imputed sources with localized but not necessarily continuous support.
The principal advantage of this extension is its intrinsic ability to capture source solutions ranging from sparse dipole models at one end of the localization spectrum to dense but smooth distributed configurations of spatial dipoles at the other.

State-space methods featuring latent temporal models have recently been proposed. For example, studies by Galka, Yamashita and Ozaki (2004) and Yamashita et al. (2004) have used a random-walk dynamical model with Laplacian spatial constraints to represent the dynamics of EEG source currents, employing the Kalman filter and recursive least-squares frameworks to perform source localization. Daunizeau and Friston (2007) used a state-space model to capture latent neuronal dynamics by placing a first-order autoregressive (AR) model within a full spatiotemporal variational Bayes procedure. Each ensemble of sources is characterized by establishing an effective region of influence in which the hidden neural state dynamics are assumed constant. This leads to a temporal representation of each region in terms of a single average dynamic.

Ou, Hämäläinen and Golland (2009) have developed a mixed L1/L2-norm penalized model to estimate MEG/EEG sources. This methodology allows for spatially sparse solutions while ensuring physiologically plausible and numerically stable time course reconstructions through the use of basis functions. Since estimation of this kind of mixed-norm regularized problem represents a computational challenge, convex-cone optimization methods are implemented to efficiently compute the MEG/EEG source estimates. These procedures are shown to yield significant improvement over conventional MNE approaches in both simulated and actual MEG data.

In the majority of MEG stimulus-response experiments, the objective is to estimate the activity in a given brain region based on multiple repetitions of the same stimulus. The time interval of the MEG recordings after the onset of the stimulus is usually around 1,000 msec.
Consequently, the optimal estimate of the source activity at a given time point should be based on all the data recorded in this time window, as opposed to only the recordings within a small instantaneous subinterval, as is the case in the MNE algorithm.
Thus, in the current work, we model the spatiotemporal structure of the MEG source localization problem as a full-rank state-space estimation procedure. In contrast to previous related approaches, for example, Galka, Yamashita and Ozaki (2004), Yamashita et al. (2004) and Ou, Hämäläinen and Golland (2009), we recover the underlying neural state dynamics across the entire solution space, using all information available in the MEG measurements to compute a new source estimate at each location and time point. We do not restrict our state-covariance structure to operate either through reduced-rank approximations or through "small-volume" state-space models that operate within localized spatial neighborhoods [Galka, Yamashita and Ozaki (2004), Yamashita et al. (2004)]. The analysis combines activity at a particular source with activity at (potentially all) neighboring sources in both preceding and subsequent time periods (spatiotemporal correlation). We find that solution of the resulting high-dimensional state-space model requires the application of high performance computing (HPC) resources.

We show that two solutions that are naturally suited to this type of dynamic inverse problem are the Kalman filter and the fixed-interval smoother [Kitagawa and Gersch (1996), Kay (1993)]. These algorithms are based on a series of mathematical relationships that employ principles from electromagnetic field theory to relate the observed MEG measurements to the underlying dynamics of the current source dipoles. In contrast to popular source localization methods such as MNE, where the regularization constraints are formulated in terms of a fixed error covariance matrix, the state-space estimators feature time-varying error covariances and propagate past (and future) information into the current update.

We describe how these algorithms may be used to perform this state estimation by conducting the computation on the NSF TeraGrid Supercomputing Network.
In Section 3 we detail the supercomputing methods needed to implement the high-dimensional state-space estimation. We demonstrate the improvement in performance of the state-space algorithm relative to MNE in analyses of a simulated MEG experiment and an actual somatosensory MEG experiment.
2. Theory.
We assume that in an MEG experiment a stimulus is applied T times and that at each application, for a (short) window of time following the stimulus, MEG activity is measured simultaneously at S SQUID recording sites. Let Y_{s,r}(k) denote the measurement at time k at location s on repetition r, where k = 1, ..., N, s = 1, ..., S and r = 1, ..., T. We take Y_s(k) = T^{-1} Σ_{r=1}^{T} Y_{s,r}(k) and we define Y_k = (Y_1(k), ..., Y_S(k))′ to be the S × 1 vector of averaged measurements at time k. We assume that there are P current sources in the brain and that the relation between the MEG recordings and the sources is defined by

Y_k = H J_k + ε_k,    (2.1)

where J_k = (J′_{k,1}, ..., J′_{k,P})′ is the 3P × 1 vector of source currents at time k, each J_{k,i} is a 3 × 1 current dipole moment, and H is the S × 3P lead field matrix, derived as described in Hämäläinen et al. (1993). This matrix is approximated in practice using the known conductivity profile of an MRI-derived T1 image to derive a one-layer head model using the boundary element method [Hämäläinen and Sarvas (1989)]. Furthermore, ε_k is a zero-mean Gaussian noise with covariance matrix Σ_ε. To build on an idea originally suggested in the EEG literature [Galka, Yamashita and Ozaki (2004), Yamashita et al. (2004)], we assume that the J_k behave like a stochastic first-order process with a potentially weighted neighborhood constraint (six direct neighbors in the case below), defined at voxel p by

J_{k,p} = J_{k−1,p} + (1/6) Σ_{ℓ ∈ M(p)} m_ℓ J_{k−1,ℓ} + v_k,    (2.2)

where M(p) is some small neighborhood around voxel p, m_ℓ is a mask of (estimated) autoregressive weights, and v_k is a 3P × 1 zero-mean Gaussian noise vector with covariance matrix Σ_w. The linear structure in equation (2.2) means that we can generalize this structure into matrix format such that all sources in the brain (i.e., not only local relationships) are encompassed in the model. That is,

J_k = F J_{k−1} + v_k.    (2.3)

If F is an identity matrix, the state dynamics reduce to a random walk process.
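Equations (2.1) and (2.3) together specify a linear-Gaussian generative model, which is straightforward to simulate. The sketch below uses toy dimensions and a random stand-in for the lead field H; all names and sizes here are illustrative, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (illustrative only; a real problem has S ~ 306 sensors
# and P ~ 10^4 source components).
S, P, N = 8, 20, 50
H = rng.standard_normal((S, P))   # stand-in lead field matrix
F = np.eye(P)                     # F = I gives the random-walk dynamics
sigma_w, sigma_eps = 0.1, 0.05

J = np.zeros((N, P))              # latent source currents J_k
Y = np.zeros((N, S))              # averaged MEG observations Y_k
for k in range(1, N):
    J[k] = F @ J[k - 1] + sigma_w * rng.standard_normal(P)    # eq. (2.3)
    Y[k] = H @ J[k] + sigma_eps * rng.standard_normal(S)      # eq. (2.1)
```

With F = I the state equation is exactly the random walk of the text; a neighborhood-weighted F as in equation (2.2) would simply replace the identity.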
Equations (2.1) and (2.3) define the observation and latent time-series equations, respectively, for a state-space model formulation that can be used to provide a dynamic description of the MEG source localization problem.

2.1. Standard MEG inverse solution.
The standard approach to the MEG source localization problem is to solve an L2-regularized least-squares problem at each time k, that is, to minimize

‖Y_k − H J_k‖²_{Σ_ε} + ‖J_k − μ‖²_C,    (2.4)

where ‖y − x‖²_Q denotes the squared Mahalanobis distance between the vectors y and x with error covariance matrix Q, μ is an offset parameter (prior mean) and C is the regularization covariance matrix [Hämäläinen and Ilmoniemi (1994)]. If we take μ = 0 and define the source covariance matrix as C = λR, where R is a diagonal, scaled matrix normally computed in advance, and if λ >
0, then at each time k an instantaneous MEG source estimate (the MNE solution) is given by

J^{MNE}_k = λ R H′ (λ H R H′ + Σ_ε)^{−1} Y_k    (2.5)

for k = 1, ..., N. The MNE estimate is a local Bayes estimate because it uses in its computation only the data Y_k at time k and the Gaussian prior distribution with mean μ = 0 and covariance matrix λR. Hence, the MNE estimation procedure imposes no temporal constraint on the sequence of solutions.

2.2. Kalman filter solution.
Given the state-space formulation of the MEG observation process in equations (2.1) and (2.3), it follows that the optimal estimate of the current source at time k using the data Y_1, ..., Y_k up through time k is given by the Kalman filter [Kitagawa and Gersch (1996)]:

J_{k|k−1} = F J_{k−1|k−1},    (2.6)
W_{k|k−1} = F W_{k−1|k−1} F′ + Σ_w,    (2.7)
K_k = W_{k|k−1} H′ [H W_{k|k−1} H′ + Σ_ε]^{−1},    (2.8)
J_{k|k} = W_{k|k−1} H′ [H W_{k|k−1} H′ + Σ_ε]^{−1} Y_k + [I − K_k H] J_{k|k−1},    (2.9)

which simplifies to

J_{k|k} = J_{k|k−1} + K_k [Y_k − H J_{k|k−1}],    (2.10)
W_{k|k} = [I − K_k H] W_{k|k−1}    (2.11)

for k = 1, ..., N, given initial conditions J_0 ∼ N(0, W_{0|0}) and Σ_w = W_{0|0}. At each time k the Kalman filter computes p(J_k | Y_1, ..., Y_k), which is the Gaussian distribution with mean J_{k|k} and covariance matrix W_{k|k}. In terms of the regularization criterion function in equation (2.4), the Kalman filter solution is equivalent to choosing, at time k, μ = [I − K_k H] F J_{k−1|k−1} and C = W_{k|k−1}, where we have used the matrix inversion lemma to re-express the Kalman filter update in equation (2.9) so that it can be compared directly with the MNE solution in equation (2.5). From this comparison we see that the Kalman solution improves upon the MNE solution in two important ways. First, in the Kalman solution, the offset parameter and the regularization matrix are different at each time k and are given, respectively, by μ and the one-step prediction error covariance matrix W_{k|k−1}. The MNE solution at each time k has a fixed prior mean μ = 0 and a (temporally) fixed regularization matrix C = λR. Because of this choice of regularization constraint, equation (2.9) shows that the Kalman filter estimate at time k is a linear
1, and Y k ,the observations at time k . In contrast, the MNE solution at each time k isa linear combination of the observed data and a fixed prior mean of µ = 0. Inthis regard, the MNE estimate biases the solution at each time k toward 0.Taken together, these observations show that the stochastic continuity as-sumption in the Kalman state model results in a time-varying constraintupon the fluctuating source estimates.2.3. Fixed-interval smoothing algorithm solution.
Because the MEG time series are often fixed-length recordings, we can go beyond the estimate J_{k|k} provided by the Kalman update and compute the posterior density p(J_k | Y_1, ..., Y_N) at time k given all the data in the experiment. To do so, we combine the Kalman filter with the fixed-interval smoothing algorithm (FIS) [Kitagawa and Gersch (1996), Kay (1993)], which may be computed as follows:

A_k = W_{k|k} F′ W^{−1}_{k+1|k},    (2.12)
J_{k|N} = J_{k|k} + A_k [J_{k+1|N} − J_{k+1|k}],    (2.13)
W_{k|N} = W_{k|k} + A_k [W_{k+1|N} − W_{k+1|k}] A′_k    (2.14)

for k = N − 1, ..., 1, with J_{N|N} and W_{N|N} computed from the last step of the Kalman filter at time N. It is well known that p(J_k | Y_1, ..., Y_N) is a Gaussian distribution with mean J_{k|N} and covariance matrix W_{k|N} [Kitagawa and Gersch (1996), Kay (1993)].

2.4. Practical considerations.
Implementation of the Kalman filter first requires estimates of the initial state, the state covariance matrix Σ_w and the error covariance matrix Σ_ε. We estimated the noise covariance matrix Σ_ε from measurements taken in the scanner in the absence of a subject. We next estimated the initial state as the MNE solution J^{MNE}_0, and we estimated the state covariance matrix by first computing the MNE source estimates J^{MNE}_1, ..., J^{MNE}_N and subsequently deriving Σ_w from the sample covariance matrix of a differenced sequence of these static estimates.
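Sections 2.1 through 2.4 together describe a pipeline: static MNE estimates (equation (2.5)) supply the initial state and, through a differenced sequence, Σ_w; the Kalman recursions (2.6)–(2.11) then run forward, and the FIS pass (2.12)–(2.14) runs backward. Below is a minimal NumPy sketch of that pipeline on toy dimensions; all sizes and variable names are ours, and this is not the parallel implementation described in Section 3:

```python
import numpy as np

def mne(Y, H, R, Sigma_eps, lam):
    """Static MNE, eq. (2.5), applied independently at each time point."""
    M = lam * R @ H.T @ np.linalg.inv(lam * H @ R @ H.T + Sigma_eps)
    return Y @ M.T                                   # N x P estimates

def kalman_smoother(Y, H, F, Sigma_w, Sigma_eps, J0, W0):
    """Kalman filter, eqs. (2.6)-(2.11), then FIS backward pass, (2.12)-(2.14)."""
    N, P = Y.shape[0], F.shape[0]
    I = np.eye(P)
    Jp = np.zeros((N, P)); Wp = np.zeros((N, P, P))  # one-step predictions
    Jf = np.zeros((N, P)); Wf = np.zeros((N, P, P))  # filtered estimates
    j, w = J0, W0
    for k in range(N):
        jp, wp = F @ j, F @ w @ F.T + Sigma_w                    # (2.6)-(2.7)
        K = wp @ H.T @ np.linalg.inv(H @ wp @ H.T + Sigma_eps)   # (2.8)
        j, w = jp + K @ (Y[k] - H @ jp), (I - K @ H) @ wp        # (2.10)-(2.11)
        Jp[k], Wp[k], Jf[k], Wf[k] = jp, wp, j, w
    Js, Ws = Jf.copy(), Wf.copy()
    for k in range(N - 2, -1, -1):
        A = Wf[k] @ F.T @ np.linalg.inv(Wp[k + 1])               # (2.12)
        Js[k] = Jf[k] + A @ (Js[k + 1] - Jp[k + 1])              # (2.13)
        Ws[k] = Wf[k] + A @ (Ws[k + 1] - Wp[k + 1]) @ A.T        # (2.14)
    return Jf, Js

# Toy problem: random lead field, random-walk sources (F = I).
rng = np.random.default_rng(0)
S, P, N = 6, 12, 40
H = rng.standard_normal((S, P))
J_true = np.cumsum(0.1 * rng.standard_normal((N, P)), axis=0)
Y = J_true @ H.T + 0.05 * rng.standard_normal((N, S))

# Section 2.4: seed the filter from the static MNE estimates.
J_mne = mne(Y, H, np.eye(P), 0.05 ** 2 * np.eye(S), lam=1.0)
Sigma_w = np.cov(np.diff(J_mne, axis=0), rowvar=False)  # differenced MNE sequence
J_kf, J_fis = kalman_smoother(Y, H, np.eye(P), Sigma_w,
                              0.05 ** 2 * np.eye(S), J_mne[0], Sigma_w)
```

On data of this size the whole pipeline runs almost instantly; the P × P covariance products inside the loop are exactly the operations that force the HPC treatment of Section 3 when P grows to cortical source counts.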
3. Supercomputer implementation.
Historically, the Kalman filter has found widespread use in several high-dimensional modeling domains, including weather forecasting [Farrell and Ioannou (2001)] and oceanography [Fukumori et al. (1993), Fukumori and Malanotte-Rizzoli (1995)]. Kalman filters and fixed-interval smoothers are advantageous in these scenarios because, under assumptions of linearity and normality, they are (near) optimal estimators. In addition, their desirable properties hold across a wide variety of time-varying linear (and nonlinear) models. However, in its standard form the Kalman filter is computationally prohibitive for these classes of problems. In the example applications listed above, the numerical calculations are often carried out on systems of very large state dimension N, with state covariance matrices of size N × N [Farrell and Ioannou (2001)]. Computationally, the most intense aspects of the Kalman algorithm stem from the prediction update of the covariance matrices, which requires costly linear algebraic operations at each time step. Since the dynamical error structure of these systems is often well understood, many numerical solutions to these kinds of paradigms, for example, Farrell and Ioannou (2001) and Fukumori and Malanotte-Rizzoli (1995), employ a range of model-reduction techniques in their formulations to achieve computational tractability.

In the case of the MEG and EEG inverse problem, the solution space is of dimension P ~ O(10^3–10^4), leading to error covariance matrices with on the order of 10^6–10^8 entries. Thus, the computational problem is not quite so intensive as in the forecasting applications, meaning that we can feasibly employ full-rank state-estimation methods to compute the solution on an HPC system. When performing Kalman filtering, the computations dominating each time step involve three high-dimensional full-rank matrix multiplications, leading to a total approximate computational cost of order 3P^3 (excluding auxiliary lower-dimensional linear operations).
When ancillary variables are taken into account, the amount of memory required is at least 24 gigabytes for each update operation. In addition, the FIS requires one additional such multiplication at each time step followed by a full-rank P × P matrix inversion (a further O(P^3) cost). The FIS also requires approximately three gigabytes of storage space per time point in order to save the prediction and update covariances (in 32-bit storage format) estimated during the forward pass of the Kalman filter. The scale of these computations renders them infeasible on nearly any state-of-the-art standalone computing resource. To address this limitation, we parallelized the Kalman and FIS computations so that the data-intensive parts of the algorithms were distributed across multiple nodes of a High Performance Computing system [Blackford et al. (1997)]. For this purpose we utilized the NSF TeraGrid resource at the TACC (Texas Advanced Computing Center). This resource comprised a 1024-processor Cray/Dell Xeon-based Linux cluster with a total computing capacity of 6.4 Teraflops.
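Simple arithmetic recovers the scale of the per-time-point storage quoted above. This is a back-of-the-envelope sketch assuming a full source dimension of roughly P = 20,000; that value is our assumption for illustration, not a figure stated in the text:

```python
# Dense covariance bookkeeping for a P x P state covariance matrix,
# assuming (hypothetically) P = 20,000 source components.
P = 20_000

doubles = P * P * 8            # 64-bit entries, in bytes
singles = P * P * 4            # 32-bit entries, the FIS storage format
matmul_flops = 2 * P ** 3      # one dense P x P matrix product

print(f"P x P matrix, 64-bit: {doubles / 1e9:.1f} GB")
print(f"P x P matrix, 32-bit: {singles / 1e9:.1f} GB")
print(f"one matrix multiply:  {matmul_flops:.1e} flops")
```

Under this assumption, storing both the prediction and update covariances in 32-bit format costs about 3.2 GB per time point, in line with the roughly three gigabytes per time point cited for the FIS forward pass.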
4. Results.
4.1. Simulated MEG experiment.
We designed a set of simulation studies to compare the performance of the dynamic localization methods against MNE. Prior to constructing the dynamic simulation, we first computed the spatial conductance profile across the head and specified the spatial resolution of the discretized solution space (source locations). To restrict this source space to the cortical surface, we employed anatomical MRI data
obtained from a single subject with a high-resolution T1-weighted 3D sequence (TR/TE/flip = 2,530 ms/3.49 ms/7°, partition thickness = 1.33 mm, matrix = 256 × 256, field of view = 21 cm) in a 3-T MRI scanner (Siemens Medical Solutions, Erlangen, Germany). The geometry of the gray–white matter surface was subsequently computed using an automatic segmentation algorithm to yield a triangulated model with approximately 340,000 vertices [Dale, Fischl and Sereno (1999)]. Finally, we utilized the topology of a recursively subdivided icosahedron with approximately 5 mm spacing between the source nodes to give a cortical sampling of approximately 10,000 locations across both hemispheres.

Using MNE, we chose as the region of interest a section of the left hemisphere over the primary somatosensory and motor cortices. We computed a single-layer homogeneous forward model, or lead field matrix H, over all the sampled voxels in the left and right hemispheres. The region of interest contained 125 active voxels from the more than 10,000 gray matter voxels that could be potential sources for an observed magnetic field under this parcellation. For this simulation, therefore, the number of active sources was P_active = 125, and the number of measurement channels was S = 306. We next constructed H with dimension 306 × 5,120, modeling only the source component normal to the cortical surface (z-direction) as opposed to estimating the triplet of x, y and z contributions, from which both magnitude and directional information could be resolved. In each of the chosen 125 voxels in the source space, we generated an "activation" signal consisting of a mixture of 10 and 20 Hz frequencies modulated by a 0.4 Hz envelope to simulate typical signal patterns commonly observed in the motor and somatosensory cortices; see, for example, Lounasmaa et al. (1995). Next we added Gaussian noise to all P = 5,120 vertices to generate an image-wide signal-to-noise ratio (SNR) that ranged between 0.1 and 2 (in line with typical SNRs encountered in MEG studies). Specifically, we defined SNR as

SNR = ‖HJ‖² / (S P σ²).    (4.1)

When computing the source reconstructions we set the observation covariance matrix to the empty-room covariance, that is, the covariance generated from background noise measurements taken from the system while the subject is absent from the scanner. We computed three inverse solutions: the MNE solution, the KF solution and the FIS solution. In this simulation study, we recovered the source estimate for each method and for each value of SNR using three different choices of the tuning parameter (λ = 0.5, 1 and 3).
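The SNR definition of equation (4.1) translates directly into code; the sketch below follows our reading of that equation (the stacking of J over time points and the exact normalization are assumptions), together with its inverse for picking a noise level that achieves a target SNR, as in the sweep over SNR values:

```python
import numpy as np

def image_snr(H, J, sigma):
    """SNR of eq. (4.1): ||H J||^2 / (S P sigma^2), with H of size S x P
    and J stacking the source currents over time as a P x N matrix."""
    S, P = H.shape
    return np.linalg.norm(H @ J) ** 2 / (S * P * sigma ** 2)

def noise_sigma_for_snr(H, J, target_snr):
    """Invert eq. (4.1) to find the noise standard deviation that
    yields a desired image-wide SNR for a given simulated signal."""
    S, P = H.shape
    return np.sqrt(np.linalg.norm(H @ J) ** 2 / (S * P * target_snr))
```

The second helper is how one would generate the equispaced SNR sweep: fix the simulated signal HJ and solve for σ at each target value.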
To examine the benefits of the dynamic state-space procedures in relation to the MNE source localization algorithm, we computed the Kalman and fixed-interval smoothing filters, generating estimates of the source locations and amplitudes within the entire cortex. When applied to the 200 ms window of MEG data, this analysis averaged about one hour for each simulated record when distributed across 16 of the Cray/Dell cores. For each choice of SNR (through a sweep of 20 SNR choices equispaced between 0.1 and 2), the FIS algorithm took around two hours on 24 Cray/Dell computational nodes, leading to a total CPU time of around 1,280 hours for the whole simulation. Storage of the prediction and update covariances for each SNR required approximately 110 gigabytes of disk space, resulting in a total of 2 Terabytes of storage for each of the three choices of tuning parameter (λ), that is, 6 Terabytes in total.

Figure 1 illustrates the spatial extent of the simulated signal (a) taken as a snapshot at the peak of the damped sinusoidal signal. Yellow-to-red color map values indicate progressively smaller current sources (J) representing ground truth, panel (a), or the reconstructed estimates, panels (b)–(d). Figure 1, panels (b)–(d), shows the estimates as computed with (b) the minimum norm estimate, (c) the Kalman filter and (d) the fixed-interval smoother. For each method, the SNR as defined by equation (4.1) and the tuning parameter governing the strength of the estimate smoothness were set to SNR = λ = 1. Figure 2 depicts temporal reconstructions for the three methods with the choice of SNR = λ = 1. For each region A, B or C, the time series at the voxel corresponding to the peak MNE response was plotted and overlaid onto the "ground-truth" simulated signal.
The state-space reconstructions show considerably less variability than the static MNE time course; however, the KF approach displays elevated bias compared to the FIS method. Furthermore, the KF method contains a temporal shift in relation to the simulated signal that is not apparent in MNE and FIS. These fundamental differences in the temporal behavior of the three methods are consistent in space, across SNR, and are invariant to the choice of λ.

Figures 3 and 4 illustrate the behavior of the MSE (mean-squared error) between the true signal and the estimated signal for each estimation method as a function of SNR for the three choices of tuning parameter: panel (a) λ = 0.5, panel (b) λ = 1, and panel (c) λ = 3. In Figure 3 the MSE is calculated at all vertices in the reconstructed cortical surface and averaged spatially for each choice of SNR. The fixed-interval smoother shows the lowest MSE, which indicates that the estimated source curves are closer to the ground-truth simulations than those of the alternative methods. By contrast, the MNE approximations show the least favorable recovery of the simulated sources. In Figure 4 we computed the MSE at each vertex for each source reconstruction method in the same manner but calculated the average MSE only within the three activated regions. These plots, therefore, examine the behavior

Fig. 1.
Activation maps in three regions, A, B and C at the peak response time of 126 msfor: (a)
True signal; and source estimates computed by (b)
MNE; (c) the Kalman filter;and (d) the Fixed Interval Smoother. These images were computed using a regularizationparameter ( λ = 1 ). of the source reconstructions by zooming into areas that contain signal,thus decoupling the model fits to areas of true signal from those verticescontaining background noise. The three methods show similar behavior asin Figure 3 in the activated regions, with the fixed-interval smoother againshowing superior performance to the other two methods. As a consequenceof the relative MSE errors exhibiting significantly smaller values in thesesignal-rich regions, we display their effects on a log linear scale to betterdelineate the salient differences between methods.
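The error summaries plotted in Figures 3 and 4 reduce to a per-vertex MSE that is averaged either over the whole cortical surface (Figure 3) or only over the activated regions (Figure 4). A minimal sketch of that computation follows; the function and array names are illustrative assumptions, not the authors' code.

```python
import numpy as np

def average_mse(j_true, j_est, active_mask=None):
    """Average MSE between true and estimated source time courses.

    j_true, j_est : arrays of shape (n_vertices, n_times)
    active_mask   : optional boolean vector selecting vertices inside
                    the activated regions (as in Figure 4); if None,
                    average over the whole surface (as in Figure 3).
    """
    per_vertex = np.mean((j_true - j_est) ** 2, axis=1)  # MSE over time
    if active_mask is not None:
        per_vertex = per_vertex[active_mask]
    return float(np.mean(per_vertex))
```

Restricting the average to the active vertices is what decouples fit quality in signal-rich areas from the background-noise vertices.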
5. Somatosensory MEG experiment.
To illustrate the state-space and MNE methods on an actual MEG experiment, we estimated the sources of the alpha (8-13 Hz) and mu (10 Hz and 20 Hz) spontaneous brain activity using anatomically constrained whole-head MEG. Synchronous cortical rhythmic activity of large populations of neurons is associated with distinct brain states and has been the subject of extensive investigation using unit electrophysiology, as well as EEG, MEG and fMRI techniques [Lounasmaa et al. (1995), Salmelin and Hari (1994)]. The alpha rhythm is recorded over the posterior parts of the brain, being strongest when the subject's eyes are closed and reduced when they are open. Sources of the alpha rhythm have been identified with MEG and EEG to lie in the parietooccipital sulcus and the calcarine fissure [Freeman, Ahlfors and Menon (2009)]. The mu rhythm comprises activity close to 10 and 20 Hz in the somatomotor cortex, and is known to reflect resting states characterized by a lack of movement and somatosensory input. The 20 Hz component tends to be localized in the motor cortex, anterior to the location of the 10 Hz component [Lounasmaa et al. (1995)], being specifically suppressed or elevated at topographic locations representing moving or resting individual limbs, respectively. By comparison, the 10 Hz component arises in the somatosensory cortex, and is thought to reflect the absence of sensory input from the upper limbs [Liljeström et al. (2005), Salmelin et al. (2005)].

Fig. 2. Time course of source estimates from the simulated MEG experiment for the maximally responding voxel relative to the MNE solution, where each panel (from top to bottom) corresponds directly to each of the three regions, A, B and C, denoted in Figure 1(a). Green denotes the MNE solution, red the KF, dark blue the FIS solution, while cyan represents the true signal.

Fig. 3. Average percentage relative change in MSE computed as a function of SNR from the simulated MEG experiment for all voxels on the cortical surface for each localization method: MNE (light gray), Kalman filter (dark gray), and fixed-interval smoother (charcoal). (a) λ = 0.5, (b) λ = 1, (c) λ = 3.

Fig. 4. Average percentage relative change in MSE computed as a function of SNR in the simulated MEG experiment across all active sources for each localization method: MNE (light gray), Kalman filter (dark gray), and fixed-interval smoother (charcoal). (a) λ = 0.5, (b) λ = 1, (c) λ = 3.

Following approval by the Massachusetts General Hospital Human Research Committee, the MEG signals were acquired in a healthy male subject in a magnetically and electrically shielded room at the Martinos Center for Biomedical Imaging at the Massachusetts General Hospital.
MEG signals were recorded from the entire head using a 306-channel dc-SQUID Neuromag Vectorview system (Elekta Neuromag, Elekta Oy, Helsinki, Finland) while the subject sat with his head inside the helmet-shaped dewar containing the sensors. The magnetic fields were recorded simultaneously at 102 locations, each with two planar gradiometers and one magnetometer, at a sampling rate of 600 Hz, minimally filtered to 0.1-200 Hz. The positions of the electrodes, in addition to fiduciary points such as the nose, nasion and preauricular points, were digitized using the 3Space Isotrak II System for subsequent precise co-registration with MRI images. The position of the head with respect to the helium-cooled dewar containing the measurement SQUIDs was determined by digitizing the positions of four coils attached to the head. These four coils are subsequently employed by the MEG sensors to assist in co-registration.

Ongoing magnetic activity was recorded under the following three conditions: (1) at rest with eyes closed; (2) at rest with eyes open; and (3) during sustained finger movement with eyes open. As described in previous studies, examination of the raw data revealed strong alpha activity when the subject had his eyes closed, and dampened or no alpha rhythm when his eyes were open (i.e., at rest or during the sustained finger movement condition). The mu activity was strongest at rest (i.e., with eyes open or closed) and suppressed during the sustained finger movement condition. Periods of raw data 2 s long containing either large-amplitude alpha waves (e.g., during eyes-closed conditions) or large-amplitude mu waves (e.g., during hand/finger resting conditions) were used as the input signal for MEG source localization.
In addition, the empty MEG room noise was recorded just before the start of the experiment and subsequently used as the measurement noise in the estimation of the alpha and mu activity generators.

Figure 5 depicts the net estimated current distributions, characterized in each panel by the length of each current triplet. In the inverse calculations, the orientation of the estimated currents was fixed to lie perpendicular to the cortical mesh. Each panel reveals the relative effect of the state-space approaches compared with the static MNE technique as the dynamics of the mu-rhythm time course unfold (Figure 6). In all cases, we observe regions in the central sulcus showing strong activation, but the FIS and KF solutions show stronger activations than those obtained through the MNE method. Moreover, Figure 6 shows that the reconstructed time courses of the KF and FIS solutions are much less variable than those of the MNE solution.

In summary, analysis of the somatosensory experiment showed approximate agreement between MNE and the state-space models. As in the simulated MEG experiment, we note that the magnitudes of the estimated sources were greater for the state-space models (Figure 5). The temporal dynamics of the FIS agreed more closely with the MNE than with the KF (Figure 5). Not surprisingly, because the state-space models imposed a temporal constraint, their estimates were smoother than the MNE estimates (Figure 6). Also, as in the simulated MEG experiment, the KF estimates of the source amplitudes were larger than those computed by either MNE or the FIS.
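One standard way to turn an empty-room recording into a measurement noise covariance for an inverse solver is the sample covariance with diagonal loading for numerical stability. The sketch below illustrates this idea; the function name and the loading scheme are our illustrative assumptions, not a description of the authors' actual pipeline.

```python
import numpy as np

def empty_room_noise_cov(noise_data, loading=1e-3):
    """Sample covariance of an empty-room recording (channels x samples),
    with diagonal loading so the matrix stays well conditioned when it is
    inverted inside the MNE or Kalman update equations."""
    x = noise_data - noise_data.mean(axis=1, keepdims=True)
    C = (x @ x.T) / (x.shape[1] - 1)
    # regularize: add a small multiple of the average sensor variance
    C += loading * np.mean(np.diag(C)) * np.eye(C.shape[0])
    return C
```

The loading term matters in practice because gradiometer and magnetometer channels have very different scales, which can make the raw sample covariance nearly singular.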
6. Discussion.
We have developed and implemented a state-space model solution to the MEG inverse problem and examined its performance relative to the MNE algorithm on simulated and actual MEG experimental data. A simple analytic argument showed that the state-space approach offers two important improvements over MNE. First, the KF/FIS estimate [equation (2.9)] uses a dynamic L2-norm regularization, with the mean [equation (2.6)] and covariance matrix [equation (2.7)] updated at each step, whereas the MNE estimate [equation (2.5)] uses a static L2-norm regularization with a zero mean and a static covariance matrix to compute its instantaneous estimate. Making the prior mean zero at each update biases every solution toward zero, which makes it difficult to identify sources that are nonzero but weak (low amplitude). In contrast, the KF/FIS estimate is more likely to identify a weak source: if the source estimate at the previous time point, that is, 1 msec before, was similarly weak, it is combined with the current observation to identify what would most likely be an attenuated source at the current time point. Second, the KF source estimates use all of the observations from the start of the experiment to the current time, and the FIS [equation (2.13)] then uses the KF source estimates to compute new estimates based on all of the experimental observations. In contrast, the MNE source estimate uses only the observations recorded at the current time. Given that the MEG updates are computed every millisecond, the solutions at adjacent time points are likely to be strongly correlated. Our state-space approach imposes a dynamic L2-norm regularization constraint to exploit this temporal information, whereas MNE imposes the same static L2-norm regularization constraint at each time point and thus does not consider the temporal structure.

Our simulation studies demonstrated that the KF and FIS estimates are more accurate in terms of MSE (Figures 3 and 4) than MNE. Although the spatial maps of the state-space and MNE solutions had comparable spatial extent, the KF and FIS estimates captured the magnitude of the activations more accurately (Figure 1). This accounts for the lower MSE of the FIS and the KF relative to MNE, despite a small number of erroneous activations outside the areas of actual activation in the case of the state-space models (Figure 1). In addition, the two state-space solutions were considerably smoother and in closer agreement with the simulated signal than the MNE solution. The KF, which computes its estimates based exclusively on data up to the current time, had a temporal lag with respect to the true signal (Figure 2); that is, the locations of the signal peaks and troughs were offset with respect to their true temporal locations. In addition, the KF estimates overestimated the true signal amplitude (Figure 2).

Fig. 5. Source activation maps showing instantaneous reconstructions for the sensorimotor task. Each column shows the temporal evolution at the five peak modes of the mu-rhythm for each of the three methods: minimum norm estimator (MNE); Kalman filter (KF); and fixed-interval smoother (FIS). The red and blue color maps correspond to inward and outward current flow, respectively. This reconstruction was carried out using a regularization parameter value of λ = 1.

Fig. 6. Representative time course reconstructions extracted from locations in the somatosensory cortex at 492 ms in Figure 5: MNE (blue), Kalman filter (green) and fixed-interval smoother (red).
The backward pass of the FIS corrected both of these problems, yielding closer agreement with the true signal and a lower MSE (Figure 2). Although in the analysis of the actual MEG experiment the MNE and state-space source estimates were in approximate agreement (Figure 5), the magnitudes of the FIS estimates were more consistent with those of the KF estimates, whereas the temporal dynamics of the FIS estimates followed more closely those of the MNE estimates (Figure 5). The KF estimates of the source amplitudes were larger than those computed by either MNE or the FIS (Figure 6).

Although our findings establish the feasibility of using high-dimensional state-space models for solving the MEG inverse problem, they also suggest several extensions. In our analysis we computed the KF and FIS estimates by testing different values of the regularization parameter. This parameter could instead be estimated in an empirical Bayesian [Lamus et al. (2007)] or a fully Bayesian framework. The spatial component of the model can be improved by using the structure of the particular experimental design to propose specific forms of the state-transition matrix F. A broad range of techniques has been developed to accelerate computations in high-dimensional state-space models. The forward model can be extended to include subcortical sources. Finally, because MEG and EEG are usually recorded simultaneously, our state-space paradigm can be applied to estimating the sources from these two modalities jointly. These extensions will be the topics of future reports.

REFERENCES

Blackford, L. S., Choi, J., Cleary, A., D'Azevedo, E., Demmel, J., Dhillon, I., Dongarra, J., Hammarling, S., Henry, G., Petitet, A., Stanley, K., Walker, D. and Whaley, R. C. (1997). ScaLAPACK Users' Guide. SIAM, Philadelphia, PA.
Dale, A. M., Fischl, B. and Sereno, M. I. (1999). Cortical surface-based analysis. I. Segmentation and surface reconstruction. NeuroImage.
Daunizeau, J. and Friston, K. J. (2007). A mesostate-space model for EEG and MEG. NeuroImage.
Farrell, B. F. and Ioannou, P. J. (2001). State estimation using a reduced order Kalman filter. J. Atmospheric Sci.
Freeman, W., Ahlfors, S. and Menon, V. (2009). Combining fMRI with EEG and MEG in order to relate patterns of brain activity to cognition. Int. J. Psychophysiol.
Friston, K. J., Harrison, L., Daunizeau, J., Kiebel, S., Phillips, C., Trujillo-Barreto, N., Henson, R., Flandin, G. and Mattout, J. (2008). Multiple sparse priors for the M/EEG inverse problem. NeuroImage.
Fukumori, I. and Malanotte-Rizzoli, P. (1995). An approximate Kalman filter for ocean data assimilation: An example with an idealized Gulf stream model. J. Geophys. Res.
Fukumori, I., Benvensite, J., Wunsch, C. and Haidvogel, B. D. (1993). Assimilation of sea surface topography into an ocean circulation model using a steady-state smoother. J. Phys. Oceanogr.
Galka, A., Yamashita, O., Ozaki, T. et al. (2004). A solution to the dynamical inverse problem of EEG generation using spatiotemporal Kalman filtering. NeuroImage.
Hämäläinen, M. S. and Ilmoniemi, R. J. (1994). Interpreting magnetic fields of the brain: Minimum norm estimates. Med. Biol. Eng. Comput.
Hämäläinen, M. S. and Sarvas, J. (1989). Realistic conductivity geometry model of the human head for interpretation of neuromagnetic data. IEEE Trans. Biomed. Eng.
Hämäläinen, M. S., Hari, R., Ilmoniemi, R. J., Knuutila, J. and Lounasmaa, O. V. (1993). Magnetoencephalography: Theory, instrumentation, and application to noninvasive studies of the working human brain. Rev. Modern Phys.
Kay, S. (1993). Fundamentals of Statistical Signal Processing, Vol. I: Estimation Theory. Prentice Hall, Upper Saddle River, NJ.
Kitagawa, G. and Gersch, W. (1996). Smoothness Priors Analysis of Time Series. Lecture Notes in Statistics. Springer, New York. MR1441074
Krim, H. and Viberg, M. (1996). Two decades of array signal processing research: The parametric approach. IEEE Signal Processing Magazine.
Lamus, C., Long, C. J., Hämäläinen, M. S., Brown, E. N. and Purdon, P. L. (2007). Parameter estimation and dynamic source localization for the magnetoencephalography (MEG) inverse problem. In IEEE International Symposium on Biomedical Imaging, Washington, DC, April 12-15, 2007.
Liljeström, M., Kujala, J., Jensen, O. and Salmelin, R. (2005). Neuromagnetic localization of rhythmic activity in the human brain: A comparison of three methods. NeuroImage.
Lounasmaa, O., Hämäläinen, M., Hari, R. and Salmelin, R. (1995). Information processing in the human brain: Magnetoencephalographic approach. Proc. Natl. Acad. Sci. USA.
Mattout, J., Phillips, C., Penny, W. D., Rugg, M. D. and Friston, K. J. (2006). MEG source localization under multiple constraints: An extended Bayesian framework. NeuroImage.
Mosher, J. C. and Leahy, R. M. (1998). Recursive MUSIC: A framework for EEG and MEG source localization. IEEE Trans. Biomed. Eng.
Mosher, J. C., Lewis, P. S. and Leahy, R. M. (1992). Multiple dipole modeling and localization from spatio-temporal MEG data. IEEE Trans. Biomed. Eng.
Nunez, P. L. (1995). Neocortical Dynamics and Human EEG Rhythms. Oxford Univ. Press, New York.
Ou, W., Hämäläinen, M. S. and Golland, P. (2009). A distributed spatio-temporal EEG/MEG inverse solver. NeuroImage.
Pascual-Marqui, R. D., Michel, C. M. and Lehmann, D. (1994). Low resolution electromagnetic tomography: A new method for localizing electrical activity in the brain. Int. J. Psychophysiol.
Phillips, C., Mattout, J., Rugg, M. D., Maquet, P. and Friston, K. J. (2005). An empirical Bayesian solution to the source reconstruction problem in EEG. NeuroImage.
Salmelin, R. and Hari, R. (1994). Characterization of spontaneous MEG rhythms in healthy adults. Electroencephalogr. Clin. Neurophysiol.
Salmelin, R., Forss, N., Knuutila, J. and Hari, R. (2005). Bilateral activation of the human somatomotor cortex by distal hand movements. Electroencephalogr. Clin. Neurophysiol.
Uutela, K., Hämäläinen, M. and Salmelin, R. (1998). Global optimization in the localization of neuromagnetic sources. IEEE Trans. Biomed. Eng.
Uutela, K., Hämäläinen, M. and Somersalo, S. (1999). Visualization of magnetoencephalographic data using minimum current estimates. NeuroImage.
Van Veen, B. D., van Drongelen, W., Yuchtman, M. and Suzuki, A. (1997). Localization of brain electrical activity via linearly constrained minimum variance spatial filtering. IEEE Trans. Biomed. Eng.
Wipf, D. P., Owen, J. P., Attias, H. T., Sekihara, K. and Nagarajan, S. S. (2010). Robust Bayesian estimation of the location, orientation, and time course of multiple correlated neural sources using MEG. NeuroImage.
Yamashita, O., Galka, A., Ozaki, T., Biscay, R. and Valdes-Sosa, P. (2004). Recursive penalized least squares solution for dynamical inverse problems of EEG generation. Hum. Brain Mapp.
Zumer, J. M., Attias, H. T., Sekihara, K. and Nagarajan, S. S. (2007). A probabilistic algorithm for integrating source localization and noise suppression for MEG and EEG data. NeuroImage.
Zumer, J. M., Attias, H. T., Sekihara, K. and Nagarajan, S. S. (2008). Probabilistic algorithms for MEG/EEG source reconstruction using temporal basis functions learned from data. NeuroImage.

C. J. Long
GlaxoSmithKline Clinical Imaging Center
Hammersmith Hospital
London W12 0NN
UK
E-mail: [email protected]