# RIGOLETTO -- RIemannian GeOmetry LEarning: applicaTion To cOnnectivity. A contribution to the Clinical BCI Challenge -- WCCI2020

Marie-Constance Corsi, Florian Yger, Sylvain Chevallier, Camille Noûs

Marie-Constance Corsi
Inria Paris, Aramis project-team, Paris Brain Institute
Paris

Florian Yger
LAMSADE, PSL, Univ. Paris-Dauphine
Paris, France

Sylvain Chevallier
LISV, Univ. Paris-Saclay
Versailles

Camille Noûs
Cogitamus, CNRS
Paris

Abstract: This short technical report describes the approach submitted to the Clinical BCI Challenge-WCCI2020. This submission aims to classify motor imagery tasks from EEG signals and relies on Riemannian geometry, with a twist: instead of using only the classical covariance matrices, we also rely on measures of functional connectivity. Our approach ranked 1st on task 1 of the competition.

Index Terms: Riemannian geometry, functional connectivity, ensemble learning, BCI

I. INTRODUCTION

Using a brain-computer interface (BCI) is a learned skill that requires time to reach high performance [1]. Despite its clinical applications [2], [3], one of its main drawbacks is the high inter-subject variability in performance. This is sometimes referred to in the literature as the "BCI inefficiency" phenomenon [4], [5], and it affects the usability of BCI. Among the approaches adopted to tackle this issue are the search for neuromarkers that potentially better capture the neurophysiological mechanisms underlying BCI performance [6], [7], and the optimization of classification pipelines [8] that could be robust enough to be applied to any subject.

In this work, we propose an original approach that combines functional connectivity estimators, Riemannian geometry and ensemble learning to ensure a robust classification. This article not only describes the proposed approach but also presents the methodology and the results obtained for our submission.

II. RIEMANNIAN GEOMETRY

As pointed out in [8], the use of Riemannian geometry for Motor Imagery BCI is one of the breakthroughs of the last years of research in BCI and is now the gold standard.

The approach consists in extracting Symmetric Positive Definite (SPD) matrices, that is, symmetric matrices with strictly positive eigenvalues (usually the covariance matrices among sensors), for each epoch and then in considering this space as a curved (i.e. Riemannian) space. As illustrated in Fig. 1 for 2 × 2 matrices, the space of SPD matrices could be considered as a Euclidean space (as a subspace of the symmetric matrices) but several drawbacks occur (e.g. the swelling effect, see [9]). Those drawbacks are alleviated when the Riemannian geometry is used and the distance between two SPD matrices $A$ and $B$ is expressed as:

$$\delta_R(A, B) = \left\| \log\left( A^{-1/2} B A^{-1/2} \right) \right\|_F \tag{1}$$

with $\log(\cdot)$ the matrix logarithm and $\|\cdot\|_F$ the Frobenius norm.

Fig. 1. Comparison of Euclidean and Riemannian geometries for 2 × 2 SPD matrices.

However, in practice, we will favor another Riemannian geometry with similar properties but faster to compute, the LogEuclidean distance (the relationship between those geometries is developed in [10]):

$$\delta_{LE}(A, B) = \left\| \log(A) - \log(B) \right\|_F \tag{2}$$

In both geometries, the Karcher average of a set of matrices $\{X_1, \cdots, X_n\}$ is defined as:

$$\bar{X} = \arg\min_{X} \sum_{i=1}^{n} \delta^2(X_i, X) \tag{3}$$

A closed-form solution exists for $\delta_{LE}$ (but not for $\delta_R$):

$$\bar{X}_{LE} = \exp\left( \frac{1}{n} \sum_{i=1}^{n} \log(X_i) \right) \tag{4}$$

A simple yet efficient classifier for SPD matrices consists in computing the Karcher average of each class and then in predicting, for a given test sample, the class whose average is the closest (using $\delta$). The interested reader can refer to [9], [11] for more details.

Until now, Riemannian geometry was applied to SPD matrices extracted from covariances among sensors, but other characteristics extracted from the EEG signal could produce SPD matrices. The next section describes an alternative way to obtain SPD matrices based on functional connectivity.

Fig. 2. Features extraction (panel columns: Correlation, Coherence, Phase Locking Value; panel rows: Left, Right). Each line corresponds to a given condition (left or right) and each column is associated with a given estimator: Correlation, Coherence and Phase Locking Value. We obtained these figures from subject 6 by averaging the estimators over the trials of the training set. In each case, the threshold Th was obtained as follows: Th = Min + 0. ∗ (Max − Min), where Min and Max refer respectively to the minimum and the maximum values obtained for a given case (i.e. metric and experimental condition).

III. FUNCTIONAL CONNECTIVITY

Functional connectivity (FC), which consists of assessing the interaction between different brain areas [12], [13], can be a valuable tool to provide alternative features to discriminate subjects' mental states [14] and to study neural mechanisms underlying BCI learning [15].

Here, as an exploratory study, we considered complementary undirected FC estimators to assess which of them, associated with Riemannian geometry, could best classify the data. For a given FC estimator, we took into account a time window of [3, 7.5] s and we averaged the FC values within the alpha-beta band [8, 30] Hz. Computations were made using the Brainstorm toolbox [16]. In the following subsections, we define the metrics computed between two given signals, referred to as $s_1(t)$ and $s_2(t)$, recorded from two EEG sensors. An illustrative example is presented in Fig. 2.

A. Spectral estimation

We computed two spectral estimators: the coherence (Coh) and the imaginary coherence (ICoh). Coh and ICoh are both computed from the coherency, defined as the normalized cross-spectral density obtained from two given signals. More specifically, they are obtained as follows:

$$\mathrm{Coh}(f) = \frac{|S_{12}(f)|^2}{S_{11}(f)\, S_{22}(f)} \tag{5}$$

$$\mathrm{ICoh}(f) = \frac{\Im\left(S_{12}(f)\right)}{\sqrt{S_{11}(f)\, S_{22}(f)}} \tag{6}$$

with $S_{12}(f)$ the cross-spectral density and $S_{11}(f)$, $S_{22}(f)$ the auto-spectral densities.

The advantage of ICoh is its reduced sensitivity to signal leakage and volume conduction effects [17], [18].

B. Phase estimation

As a phase estimation method, we worked with the Phase Locking Value (PLV), which assesses phase synchrony between two signals in a specific frequency band [19]–[21]. More specifically, it corresponds to the absolute value of the mean phase difference between $s_1$ and $s_2$, defined as follows:

$$\mathrm{PLV} = \left| \overline{e^{i \Delta\phi(t)}} \right| \tag{7}$$

where the overline denotes the mean and

$$\Delta\phi(t) = \arg\left( \frac{z_1(t)\, z_2^*(t)}{|z_1(t)|\, |z_2(t)|} \right)$$

represents the relative phase computed between the two signals, with $z(t) = s(t) + i\, h(s(t))$ the analytic signal obtained by applying the Hilbert transform $h$ to the signal $s(t)$.

Fig. 3. Classification pipeline (EEG → Cov, Coh, PLV → one FgMDM per feature → ridge classifier → decision): coherence, phase locking value and spatial covariances are estimated from the EEG signal. A first level of classification was performed by FgMDM classifiers, which yielded output decision probabilities to train a second-level classifier, a ridge regression classifier, which provided the final decision.

C. Amplitude coupling method
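Both the PLV of the previous subsection and the AEC described in this subsection are computed from the analytic signal. The following is a minimal NumPy sketch, not the Brainstorm implementation used for the submission. It assumes the signals are already band-pass filtered, that the mean in the PLV is taken over time, and that the AEC is a plain Pearson correlation of the envelopes; the function names and the toy signals are hypothetical.

```python
import numpy as np


def analytic_signal(signals):
    """Analytic signal z(t) = s(t) + i*h(s(t)) along the last axis, via the FFT."""
    signals = np.asarray(signals, dtype=float)
    n_times = signals.shape[-1]
    # Keep DC (and Nyquist, if present) once, double the positive
    # frequencies and zero out the negative ones.
    weights = np.zeros(n_times)
    weights[0] = 1.0
    if n_times % 2 == 0:
        weights[n_times // 2] = 1.0
        weights[1:n_times // 2] = 2.0
    else:
        weights[1:(n_times + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(signals, axis=-1) * weights, axis=-1)


def plv_matrix(signals):
    """PLV between every pair of channels; signals: (n_channels, n_times)."""
    z = analytic_signal(signals)
    unit = z / np.abs(z)  # e^{i*phi(t)} per channel (assumes |z| > 0)
    # Entry (j, k) is |mean over t of exp(i * (phi_j(t) - phi_k(t)))|.
    return np.abs(unit @ unit.conj().T) / z.shape[-1]


def aec_matrix(signals):
    """Pearson correlation between the amplitude envelopes |z(t)|."""
    envelopes = np.abs(analytic_signal(signals))
    return np.corrcoef(envelopes)


# Toy signals (hypothetical): 4 s sampled at 250 Hz.
sfreq = 250.0
times = np.arange(1000) / sfreq
env_cos = 1.0 + 0.5 * np.cos(2 * np.pi * 1.0 * times)
env_sin = 1.0 + 0.5 * np.sin(2 * np.pi * 1.0 * times)
toy = np.vstack([
    env_cos * np.sin(2 * np.pi * 10.0 * times),              # 10 Hz
    env_cos * np.sin(2 * np.pi * 10.0 * times + np.pi / 4),  # 10 Hz, lagged
    env_sin * np.sin(2 * np.pi * 13.0 * times),              # 13 Hz
])
plv = plv_matrix(toy)
aec = aec_matrix(toy)
```

In this toy example the first two channels share a carrier frequency and an envelope, so both their PLV and their AEC are close to 1, whereas the third channel is phase-unlocked and has an uncorrelated envelope.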

We computed the Amplitude Envelope Correlation (AEC) [18], [22], [23], which relies on the linear correlation of the envelopes of the band-pass filtered signals obtained from the Hilbert transform.

For the sake of completeness, we report the results of both AEC and ICoh, although those features were not used in the final submission. The generated matrices were not SPD and we had to pre-process them heavily in order to be able to apply Riemannian geometry. This may explain the poor results of those features in our setup.

IV. PROPOSED APPROACH: RIGOLETTO

The novelty of our approach consists of combining Riemannian classifiers trained on SPD matrices coming from both measures of FC and covariance estimation.
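As an illustration of the Riemannian building blocks these classifiers rely on, the sketch below implements the LogEuclidean distance of Eq. (2), the closed-form mean of Eq. (4) and a minimum-distance-to-mean classifier. It is a NumPy-only sketch under simplifying assumptions: the geodesic filtering step of FgMDM is left out, and the class name, function names and toy matrices are hypothetical rather than taken from the submission code.

```python
import numpy as np


def _apply_to_eigenvalues(mat, func):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    return (eigvecs * func(eigvals)) @ eigvecs.T


def logm_spd(mat):
    """Matrix logarithm of an SPD matrix."""
    return _apply_to_eigenvalues(mat, np.log)


def expm_sym(mat):
    """Matrix exponential of a symmetric matrix."""
    return _apply_to_eigenvalues(mat, np.exp)


def distance_logeuclid(a, b):
    """LogEuclidean distance between two SPD matrices, Eq. (2)."""
    return np.linalg.norm(logm_spd(a) - logm_spd(b), ord="fro")


def mean_logeuclid(mats):
    """Closed-form LogEuclidean mean of a list of SPD matrices, Eq. (4)."""
    return expm_sym(np.mean([logm_spd(m) for m in mats], axis=0))


class LogEuclideanMDM:
    """Minimum distance to mean: predict the class whose mean is closest."""

    def fit(self, mats, labels):
        labels = np.asarray(labels)
        self.classes_ = np.unique(labels)
        self.means_ = [
            mean_logeuclid([m for m, y in zip(mats, labels) if y == c])
            for c in self.classes_
        ]
        return self

    def transform(self, mats):
        """Distances of each matrix to each class mean."""
        return np.array([
            [distance_logeuclid(m, mu) for mu in self.means_] for m in mats
        ])

    def predict(self, mats):
        return self.classes_[np.argmin(self.transform(mats), axis=1)]


# Toy 2 x 2 SPD matrices (hypothetical data, two well-separated classes).
train = [np.diag([1.0, 2.0]), np.diag([1.1, 2.1]),
         np.diag([5.0, 0.5]), np.diag([5.5, 0.4])]
train_labels = ["left", "left", "right", "right"]
clf = LogEuclideanMDM().fit(train, train_labels)
predicted = clf.predict([np.diag([1.05, 2.0]), np.diag([5.2, 0.45])])
```

The distances returned by `transform` are the quantities that a softmax can turn into per-class probabilities for a second-level classifier.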

A. Task 1: within-subject classification

To estimate FC features, we used the computation detailed in the previous section as implemented in the Brainstorm software [16], and the sample covariance estimator of Matlab. Covariance estimators often include regularization using a shrinkage approach to avoid ill-conditioned matrices. For FC, no shrinkage estimator has been defined yet. Thus, we used a simple algorithm to project FC matrices on the manifold of PSD matrices [24].

To classify FC and covariance matrices, we relied on Python [25], [26] and its libraries: Numpy [27], [28], Scipy [29], Pandas [30], Scikit-Learn [31], MNE [32], Jupyter [33], Matplotlib [34], Seaborn [35]. We applied the FgMDM algorithm [36], which computes filters from a Fisher Geodesic Discriminant Analysis before using a Minimum Distance to Mean (MDM) classifier. We used the LogEuclidean distance and its associated mean in the MDM for its robustness and its efficiency. To take into account the shift between training and test sets, the test data were transported onto the mean of the training set, as described in [10], [37].

Each FgMDM classifier predicted a probability for the output classes using the softmax function on the distance to the nearest mean. These probabilities were used to train a stacked classifier [38]. We tried several classifiers and we chose the ridge classifier for its robustness. This classifier made the final decision for the prediction, as shown in Fig. 3.

The performance of our submission was estimated on training data and compared to a baseline, the Linear Discriminant Analysis (LDA) with CSP spatial filters. The results are shown in Fig. 4, indicating the Kappa score estimated with repeated 5-fold cross-validation for each subject. The performance of each level-1 classifier (FgMDM-Coh, FgMDM-PLV, FgMDM-Cov) is provided, along with that of the ensemble classifier.

We also tested several other classifiers and feature combinations for the stacked classifier. The obtained results are summarized in Fig. 5.
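Two small steps of the pipeline described above can be sketched in a few lines: the projection of an FC matrix onto the cone of positive semidefinite matrices, and the conversion of distances to class means into probabilities. This is a sketch under stated assumptions, not the submitted code. The projection follows the eigenvalue-clipping result of [24], with a small floor `eps` added here (an assumption) so that the matrix logarithm stays defined; the softmax is applied to negated distances, which is one plausible reading of the text. Function names and the toy matrix are hypothetical.

```python
import numpy as np


def project_to_spd(mat, eps=1e-6):
    """Nearest symmetric matrix (Frobenius norm) with eigenvalues >= eps."""
    sym = (mat + mat.T) / 2.0                # symmetric part
    eigvals, eigvecs = np.linalg.eigh(sym)
    clipped = np.maximum(eigvals, eps)       # clip negative/small eigenvalues
    return (eigvecs * clipped) @ eigvecs.T


def softmax_from_distances(dists):
    """Turn distances to class means into probabilities (closer = likelier)."""
    scores = -np.asarray(dists, dtype=float)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    expo = np.exp(scores)
    return expo / expo.sum(axis=-1, keepdims=True)


# Toy connectivity-like matrix (hypothetical): symmetric, unit diagonal,
# but with one negative eigenvalue, hence not SPD.
indefinite = np.array([[1.0, 0.9, 0.1],
                       [0.9, 1.0, 0.9],
                       [0.1, 0.9, 1.0]])
projected = project_to_spd(indefinite)

# Distances of one trial to the two class means (hypothetical values).
probas = softmax_from_distances([0.5, 2.0])
```

In a stacking scheme, the probabilities produced by each first-level classifier would be concatenated into a feature vector for the second-level classifier.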
The FgMDM classifiers trained on ICoh and AEC features yielded very low kappa scores. We also used a popular RG classifier, an SVM trained on the tangent space (TS-SVM), and we tested the CSP-SVM. In both cases, the SVM was parametrized through a grid search on the parameter space. Fig. 5 also displays the score of the chosen system and of stacked classifiers trained on different features: one ensemble classifier trained on all FC features (Coh, ICoh, PLV, AEC, Cov) and one trained on all possible level-one classifiers. For more details on Task 1, the reader can refer to [39].

B. Task 2: across-subject decoding

We computed the Karcher average for the data of each subject, that is, on the training data for subjects 1 to 8 and on the test data for subjects 9 and 10. We then chose the ensemble classifier described above from the subject with the closest mean in the sense of $\delta_R$ to predict the unknown labels of subjects 9 and 10 respectively. We also built a voting classifier (data not shown), which combined the output of all the classifiers trained on each subject, weighted with the inverse of the distance between Karcher averages. The results were less stable for subjects 1 to 8, thus we favored the simpler but more effective scheme of selecting only the classifier trained on the subject with the closest Karcher average.

Footnote (on the transport of the test data onto the mean of the training set in Task 1): this transductive setup was allowed by the rules of the competition, but in a real-life scenario, the mean of the test data could be estimated with unlabelled data during the calibration.

Fig. 4. Per subject Kappa score, comparing the separate pipelines, i.e. FgMDM estimated on covariance, spectral coherence and PLV, a CSP+LDA for the baseline and the submitted ensemble classifier.

Fig. 5. Average kappa score for all subjects, for the different tested estimators.

Table IV-C.

| | Kappa CV | Accuracy | Kappa | Rank |
|---|---|---|---|---|
| Within-subject (task 1) | 0.68 | 78.44% | 0.57 | 1 |
| Cross-subject (task 2) | 0.55 | 25.00% | -0.50 | 13 |
| Overall | - | - | - | 4 |

To validate our system, we estimated the score with a leave-one-out scheme, that is, training on all but one subject and computing the score on the left-out subject. Fig. 6 shows the Kappa score estimated when training on a given source subject and making predictions for a given target subject.
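The source-selection rule used for Task 2 (pick the classifier of the subject whose average matrix is closest to that of the target subject in the sense of $\delta_R$) can be sketched as follows. This is a minimal NumPy illustration of Eq. (1) and of the selection step only; the function names, subject labels and mean matrices are hypothetical, and the per-subject averages are assumed to have been computed beforehand.

```python
import numpy as np


def distance_riemann(a, b):
    """Affine-invariant distance of Eq. (1): ||log(A^{-1/2} B A^{-1/2})||_F."""
    eigvals, eigvecs = np.linalg.eigh(a)
    a_inv_sqrt = (eigvecs / np.sqrt(eigvals)) @ eigvecs.T
    # The Frobenius norm of the matrix log equals the root of the sum of
    # squared log-eigenvalues of the whitened matrix.
    whitened_eigvals = np.linalg.eigvalsh(a_inv_sqrt @ b @ a_inv_sqrt)
    return np.sqrt(np.sum(np.log(whitened_eigvals) ** 2))


def closest_source(target_mean, source_means):
    """Return the source subject whose mean is closest to the target mean.

    source_means maps a subject identifier to its average SPD matrix.
    """
    dists = {
        subject: distance_riemann(mean, target_mean)
        for subject, mean in source_means.items()
    }
    return min(dists, key=dists.get), dists


# Toy example (hypothetical means for two source subjects and one target).
source_means = {
    "source_1": np.diag([1.0, 1.0]),
    "source_2": np.diag([4.0, 4.0]),
}
target_mean = np.diag([3.5, 4.5])
best_subject, all_dists = closest_source(target_mean, source_means)
```

The classifier trained on `best_subject` would then be applied to the target subject. The voting variant mentioned in the text would instead weight every source classifier by the inverse of its entry in `all_dists`.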

C. Aftermath

There were 14 submissions to the competition from 12 different institutions around the world, across 9 different countries spread over 3 continents. At the end of the competition (for more details, the reader can refer to the website of the competition: https://sites.google.com/view/bci-comp-wcci/), our submission ranked as described in Table IV-C.

We mainly focused on task 1 and our approach got the first position on this task with a substantial margin, the following teams having respectively kappa scores of . and . and accuracies of . and . . The kappa score obtained on validation is close to 0.68, the value obtained on training data with a 5-fold cross-validation, indicated in the Kappa CV column.

On the second task, we submitted a very simple approach and it appears to have performed worse than random. This could mean that our approach matched patients having opposite patterns, hence leading to this result. Despite this inversion of class prediction, the absolute value of kappa obtained on the validation set is close to the one obtained on the training data (and the inverse classifier would have reached 75% accuracy). This very curious phenomenon will be investigated in the near future. The proposed approach could be refined by matching the means of each class (for known subjects) to a k-means (with k = 2 and using Karcher means) or by adding weights for each subject in the spirit of [40].

Rank aggregation has been the focus of many researchers in the field of computational social choice. The competition organizers chose to make the overall ranking using a weighted sum of the kappa values of the two tasks (the weight for a given competitor on a task being (15 − r), with r its rank). However, this rank aggregation technique is quite peculiar and unknown in the literature of computational social choice (where Kemeny optimal aggregation would be the classical way to go).
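To make the aggregation rule concrete, the sketch below writes it out as code. It assumes the overall score is exactly the sum over tasks of (15 − r) × kappa, as described above; the organizers' actual implementation is not available here. The teams A, B and C and their kappa values are hypothetical and serve only to illustrate how rank-dependent weights behave; ties are not handled.

```python
def weighted_score(kappas, ranks, n_ranks=15):
    """Sum over tasks of (n_ranks - rank) * kappa for one competitor."""
    return sum((n_ranks - rank) * kappa for kappa, rank in zip(kappas, ranks))


def aggregate(kappas_by_team, n_ranks=15):
    """Rank teams on each task (1 = best kappa), then apply weighted_score."""
    teams = list(kappas_by_team)
    n_tasks = len(next(iter(kappas_by_team.values())))
    scores = dict.fromkeys(teams, 0.0)
    for task in range(n_tasks):
        ordered = sorted(teams, key=lambda t: kappas_by_team[t][task],
                         reverse=True)
        for rank, team in enumerate(ordered, start=1):
            scores[team] += (n_ranks - rank) * kappas_by_team[team][task]
    return scores


# Our submission: kappa 0.57 (rank 1) on task 1, -0.50 (rank 13) on task 2.
our_score = weighted_score([0.57, -0.50], [1, 13])

# Hypothetical teams: only the task-1 kappa of team C differs between the
# two scenarios, yet the relative order of A and B changes.
weak_c = aggregate({"A": [0.60, 0.30], "B": [0.50, 0.44], "C": [0.10, 0.10]})
strong_c = aggregate({"A": [0.60, 0.30], "B": [0.50, 0.44], "C": [0.55, 0.10]})
a_beats_b_with_weak_c = weak_c["A"] > weak_c["B"]
a_beats_b_with_strong_c = strong_c["A"] > strong_c["B"]
```

Under this reading of the rule, our submission scores 14 × 0.57 − 2 × 0.50 = 6.98. In the hypothetical three-team example, the comparison between A and B flips when only the performance of C changes.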
First, the ranking is not really performed according to a weighted sum of the performances (as the weights are specific to each user) and it uses the ranking as weights (so that comparing two candidates could depend on the performance of a third one). Moreover, the rank aggregation method was selected by the organizers a posteriori and, knowing it, some competitors would perhaps have changed the focus of their submission. Note that in our case, the voters would be the tasks and the candidates would be the submissions. Note also that we do not dispute our overall rank, as our approach performed quite differently on each task, but we would rather take this opportunity to discuss the ranking process for the sake of good order, this discussion being meant as feedback to competition organizers.

Fig. 6. Top: Kappa score for each target subject, comparing our system and CSP-LDA trained on each source subject. Bottom: Distance $\delta_R$ from the average covariance matrix of source subject $\bar{X}_{src}$ to the average matrix of target subject $\bar{X}_{tgt}$.

Fig. 7. Distance between the Karcher average of each target subject and that of each source subject.

For a full description of our pipeline and the requirements to use it, the reader can refer to the RIGOLETTO GitHub repository at https://github.com/sylvchev/wcci-rgcon.

As previously explained, our main objective was to optimize the classification accuracy in the within-subject category. Using FC estimators associated with an ensemble classifier makes it possible to take into account the users' specificity.

After participating in the competition, we identified different approaches to improve our method depending on the adopted perspective: theory behind RG, feature extraction and transfer learning. In the first case, further investigation should be done regarding the follow-up to the MDM [41] and dimensionality reduction [42] (for other, higher-dimensional datasets). Regarding feature extraction, we plan to improve in particular the selection of the frequency band of interest [43]. Another promising lead would be to extract for each epoch several PSD matrices, each on a different frequency band, and to consider this set as a trajectory on the manifold, in the spirit of [44]. Other items, such as the agreement and variability among covariance and connectivity and the non-stationarity of connectivity features [45], will be considered. Finally, we plan to study the impact of the centering operation on transfer learning tasks [46].

Participating in the WCCI-Clinical BCI Competition has been the occasion to propose a novel approach and to start bridging the gap between Riemannian geometry and connectivity features. This resulted in a very practical algorithm and it brought promising results as well as intriguing failures. Nevertheless, it motivates the need to study in more depth the connectivity features under the lens of Riemannian geometry. For instance, Fig. 2 showed that covariance and connectivity features seem to produce similar average patterns, but this raises as well the question of their individual variability. We plan to study those questions in the near future.

V. AUTHORS CONTRIBUTIONS

Camille Noûs is a collective individual and contributed to the collegial construction of the standards of science, by developing the methodological framework and the state of the art, and by ensuring post-publication follow-up. This co-authorship symbolizes the collaborative nature of this work. All the authors contributed equally to this work, taking advantage of their complementary skills. MCC was in charge of the data exploration and of the feature extraction from functional connectivity. FY and SC were in charge of applying Riemannian geometry to FC estimators and designing the classification pipeline. All the authors wrote, revised and approved the submission. Finally, MCC was chosen as team leader and managed the submission of our approach and the communication with the organizers.

VI. ENVIRONMENTAL IMPACT

The approach taken in this submission does not require lengthy computation on GPU clusters or HPC, in order to reduce its environmental impact. This submission generated the equivalent of 62 gCO2, which is comparable to watching the whole "Lord of the Rings" trilogy on an HD streaming service.

Training and experimenting with the models involved the equivalent of 6 h of full-load CPU computation. They were executed on a desktop computer with a 600 W PSU that consumes 0.4 kWh per hour of computation, as measured with a wattmeter, and operated in France where the carbon footprint is 4.56 gCO2/kWh [47]. The whole computation generated the equivalent of 10.94 gCO2 (6 h × 0.4 kWh/h × 4.56 gCO2/kWh).

The team members relied mainly on Slack, git and Overleaf to communicate. As there is no direct estimation of the footprint of these services, we use the email scenario of The Shift Project report [48] as a surrogate. The digital action of sending an email is characterized by the 5-minute use of a terminal plus 1 MB of transmitted data; it generates 0.3 gCO2 according to The Shift Project. We estimate that our submission required the equivalent of 170 emails following this scenario. The estimated footprint is thus 51 gCO2.

This submission therefore generated the equivalent of 62 gCO2 in total (10.94 + 51). The Shift Project made a contested estimation of the environmental impact of watching a video in HD on a streaming service [49]. While this is still debated, the cost is estimated to be circa 1 gCO2 for 10 minutes of HD video. Our submission is thus somewhere between watching the theater-released version and the extended version of the Lord of the Rings trilogy on streaming.

VII. ACKNOWLEDGEMENTS

The authors would like to thank the organizers of the WCCI-Clinical BCI competition for giving them the opportunity to kickstart this promising collaboration. FY would like to thank Fabien Lotte, who suggested several years ago to investigate the use of connectivity features in lieu of the usual covariance estimator in Riemannian classifiers.

FY acknowledges the support of the ANR as part of the "Investissements d'avenir" program, reference ANR-19-P3IA-0001 (PRAIRIE 3IA Institute). SC acknowledges that this work could have been supported by ANR or IDEX but is only supported by the recurrent funding of the UVSQ.

REFERENCES

[1] J. R. Wolpaw, N. Birbaumer, D. J. McFarland, G. Pfurtscheller, and T. M. Vaughan, “Brain–computer interfaces for communication and control,” Clinical Neurophysiology, vol. 113, no. 6, pp. 767–791, Jun. 2002.

[2] F. Pichiorri, G. Morone, M. Petti, J. Toppi, I. Pisotta, M. Molinari, S. Paolucci, M. Inghilleri, L. Astolfi, F. Cincotti, and D. Mattia, “Brain-computer interface boosts motor imagery practice during stroke recovery,” Annals of Neurology, vol. 77, no. 5, pp. 851–865, May 2015.

[3] C. E. King, P. T. Wang, L. A. Chui, A. H. Do, and Z. Nenadic, “Operation of a brain-computer interface walking simulator for individuals with spinal cord injury,” Journal of NeuroEngineering and Rehabilitation, vol. 10, no. 1, p. 77, 2013.

[4] B. Z. Allison and C. Neuper, “Could Anyone Use a BCI?” in Brain-Computer Interfaces, ser. Human-Computer Interaction Series, D. S. Tan and A. Nijholt, Eds. Springer London, 2010, pp. 35–54.

[5] M. C. Thompson, “Critiquing the Concept of BCI Illiteracy,” Science and Engineering Ethics, Aug. 2018.

[6] B. Blankertz, C. Sannelli, S. Halder, E. M. Hammer, A. Kübler, K.-R. Müller, G. Curio, and T. Dickhaus, “Neurophysiological predictor of SMR-based BCI performance,” NeuroImage, vol. 51, no. 4, pp. 1303–1309, Jul. 2010.

[7] M. Ahn, H. Cho, S. Ahn, and S. C. Jun, “High theta and low alpha powers may be indicative of BCI-illiteracy in motor imagery,” PLoS ONE, vol. 8, no. 11, p. e80886, 2013.

[8] F. Lotte, L. Bougrain, A. Cichocki, M. Clerc, M. Congedo, A. Rakotomamonjy, and F. Yger, “A Review of Classification Algorithms for EEG-based Brain-Computer Interfaces: A 10-year Update,” Journal of Neural Engineering, Feb. 2018.

[9] F. Yger, M. Berar, and F. Lotte, “Riemannian approaches in brain-computer interfaces: a review,” IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 25, no. 10, pp. 1753–1762, 2016.

[10] F. Yger and M. Sugiyama, “Supervised logeuclidean metric learning for symmetric positive definite matrices,” arXiv preprint arXiv:1502.03505, 2015.

[11] M. Congedo, A. Barachant, and R. Bhatia, “Riemannian geometry for EEG-based brain-computer interfaces; a primer and a review,” Brain-Computer Interfaces, vol. 4, no. 3, pp. 155–174, 2017.

[12] F. de Vico Fallani, J. Richiardi, M. Chavez, and S. Achard, “Graph analysis of functional brain networks: practical issues in translational neuroscience,” Philosophical Transactions of the Royal Society B: Biological Sciences, vol. 369, no. 1653, p. 20130521, 2014.

[13] A. M. Bastos and J.-M. Schoffelen, “A Tutorial Review of Functional Connectivity Analysis Methods and Their Interpretational Pitfalls,” Front. Syst. Neurosci., vol. 9, 2016, publisher: Frontiers.

[14] T. Cattai, S. Colonnese, M.-C. Corsi, D. S. Bassett, G. Scarano, and F. D. V. Fallani, “Phase/amplitude synchronization of brain signals during motor imagery BCI tasks,” arXiv:1912.02745, Dec. 2019.

[15] M.-C. Corsi, M. Chavez, D. Schwartz, N. George, L. Hugueville, A. E. Kahn, S. Dupont, D. S. Bassett, and F. De Vico Fallani, “Functional disconnection of associative cortical areas predicts performance during BCI training,” NeuroImage, vol. 209, p. 116500, Apr. 2020.

[16] F. Tadel, S. Baillet, J. Mosher, D. Pantazis, and R. Leahy, “Brainstorm: A User-Friendly Application for MEG/EEG Analysis,” Computational Intelligence and Neuroscience, vol. 2011, Jan. 2011.

[17] G. Nolte, O. Bai, L. Wheaton, Z. Mari, S. Vorbach, and M. Hallett, “Identifying true brain interaction from EEG data using the imaginary part of coherency,” Clinical Neurophysiology, vol. 115, no. 10, pp. 2292–2307, Oct. 2004.

[18] G. L. Colclough, M. W. Woolrich, P. K. Tewarie, M. J. Brookes, A. J. Quinn, and S. M. Smith, “How reliable are MEG resting-state connectivity metrics?” NeuroImage, 2016.

[19] J.-P. Lachaux, E. Rodriguez, J. Martinerie, and F. J. Varela, “Measuring phase synchrony in brain signals,” Human Brain Mapping, vol. 8, no. 4, 1999.

[20] P. Tass, M. Rosenblum, J. Weule, J. Kurths, A. Pikovsky, J. Volkmann, A. Schnitzler, and H. Freund, “Detection of phase locking from noisy data: application to magnetoencephalography,” Physical Review Letters, vol. 81, no. 15, pp. 3291–3294, 1998.

[21] S. Aydore, D. Pantazis, and R. M. Leahy, “A note on the phase locking value and its properties,” NeuroImage, vol. 74, pp. 231–244, Jul. 2013.

[22] J. F. Hipp, D. J. Hawellek, M. Corbetta, M. Siegel, and A. K. Engel, “Large-scale cortical correlation structure of spontaneous oscillatory activity,” Nature Neuroscience, vol. 15, no. 6, pp. 884–890, Jun. 2012.

[23] M. J. Brookes, J. R. Hale, J. M. Zumer, C. M. Stevenson, S. T. Francis, G. R. Barnes, J. P. Owen, P. G. Morris, and S. S. Nagarajan, “Measuring functional connectivity using MEG: Methodology and comparison with fcMRI,” NeuroImage, vol. 56, no. 3, pp. 1082–1104, Jun. 2011.

[24] N. J. Higham, “Computing a nearest symmetric positive semidefinite matrix,” Linear Algebra and its Applications, vol. 103, pp. 103–118, 1988.

[25] G. Van Rossum and F. L. Drake Jr, Python reference manual. Centrum voor Wiskunde en Informatica Amsterdam, 1995.

[26] T. E. Oliphant, “Python for scientific computing,” Computing in Science & Engineering, vol. 9, no. 3, pp. 10–20, 2007, publisher: IEEE.

[27] T. E. Oliphant, A guide to NumPy. Trelgol Publishing USA, 2006, vol. 1.

[28] S. v. d. Walt, S. C. Colbert, and G. Varoquaux, “The NumPy array: a structure for efficient numerical computation,” Computing in Science & Engineering, vol. 13, no. 2, pp. 22–30, 2011, publisher: IEEE Computer Society.

[29] P. Virtanen, R. Gommers, T. E. Oliphant, M. Haberland, T. Reddy, D. Cournapeau, E. Burovski, P. Peterson, W. Weckesser, J. Bright, and others, “SciPy 1.0: fundamental algorithms for scientific computing in Python,” Nature Methods, vol. 17, no. 3, pp. 261–272, 2020, publisher: Nature Publishing Group.

[30] W. McKinney, Python for data analysis: Data wrangling with Pandas, NumPy, and IPython. O'Reilly Media, Inc., 2012.

[31] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay, “Scikit-learn: Machine learning in Python,” Journal of Machine Learning Research, vol. 12, pp. 2825–2830, 2011.

[32] A. Gramfort, M. Luessi, E. Larson, D. A. Engemann, D. Strohmeier, C. Brodbeck, R. Goj, M. Jas, T. Brooks, L. Parkkonen, and others, “MEG and EEG data analysis with MNE-Python,” Frontiers in Neuroscience, vol. 7, p. 267, 2013, publisher: Frontiers.

[33] T. Kluyver, B. Ragan-Kelley, F. Pérez, B. E. Granger, M. Bussonnier, J. Frederic, K. Kelley, J. B. Hamrick, J. Grout, S. Corlay, and others, “Jupyter Notebooks: a publishing format for reproducible computational workflows,” in ELPUB, 2016, pp. 87–90.

[34] J. D. Hunter, “Matplotlib: A 2D graphics environment,” Computing in Science & Engineering, vol. 9, no. 3, pp. 90–95, 2007, publisher: IEEE Computer Society.

[35] M. Waskom, O. Botvinnik, D. OKane, P. Hobson, J. Ostblom, S. Lukauskas, D. C. Gemperline, T. Augspurger, Y. Halchenko, J. B. Cole, et al., “Seaborn v0.9.0,” M. Waskom, Tech. Rep., Jul. 2018, publisher: Zenodo.

[36] A. Barachant, S. Bonnet, M. Congedo, and C. Jutten, “Riemannian geometry applied to BCI classification,” in International Conference on Latent Variable Analysis and Signal Separation. Springer, 2010, pp. 629–636.

[37] A. Barachant, S. Bonnet, M. Congedo, and C. Jutten, “Multiclass brain–computer interface classification by Riemannian geometry,” IEEE Transactions on Biomedical Engineering, vol. 59, no. 4, pp. 920–928, 2011.

[38] D. H. Wolpert, “Stacked generalization,” Neural Networks, vol. 5, no. 2, pp. 241–259, 1992.

[39] M.-C. Corsi, F. Yger, S. Chevallier, and C. Noûs, “Riemannian geometry on connectivity for clinical BCI,” in IEEE ICASSP 2021, 2021.

[40] E. K. Kalunga, S. Chevallier, and Q. Barthélemy, “Transfer learning for SSVEP-based BCI using Riemannian similarities between users,” in . IEEE, 2018, pp. 1685–1689.

[41] M. Congedo, P. Rodrigues, and C. Jutten, “The Riemannian minimum distance to means field classifier,” in , 2019.

[42] I. Horev, F. Yger, and M. Sugiyama, “Geometry-aware principal component analysis for symmetric positive definite matrices,” in Asian Conference on Machine Learning, 2016, pp. 1–16.

[43] W. Klimesch, “EEG alpha and theta oscillations reflect cognitive and memory performance: a review and analysis,” Brain Research Reviews, vol. 29, no. 2, pp. 169–195, Apr. 1999.

[44] Y. Li, K. M. Wong, and H. de Bruin, “Electroencephalogram signals classification for sleep-state decision: a Riemannian geometry approach,” IET Signal Processing, vol. 6, no. 4, pp. 288–299, 2012.

[45] A. Balzi, F. Yger, and M. Sugiyama, “Importance-weighted covariance estimation for robust common spatial pattern,” Pattern Recognition Letters, vol. 68, pp. 139–145, 2015.

[46] P. L. C. Rodrigues, C. Jutten, and M. Congedo, “Riemannian procrustes analysis: transfer learning for brain–computer interfaces,”