Scalable Bayesian Functional Connectivity Inference for Multi-Electrode Array Recordings
Yun Zhao, Richard Jiang, Zhenni Xu, Elmer Guzman, Paul K. Hansma, Linda Petzold
University of California, Santa Barbara, Santa Barbara, California
[email protected]
ABSTRACT
Multi-electrode arrays (MEAs) can record extracellular action potentials (also known as 'spikes') from hundreds or thousands of neurons simultaneously. Inference of a functional network from spike trains is a fundamental and formidable computational task in neuroscience. With the advancement of MEA technology, it has become increasingly crucial to develop statistical tools for analyzing the activity of many neurons as a network. In this paper, we propose a scalable Bayesian framework for inference of functional networks from MEA data. Our framework makes use of the hierarchical structure of networks of neurons. We split the large-scale recordings into smaller local networks for network inference, which not only eases the computational burden of Bayesian sampling but also provides useful insights on regional connections in organoids and brains. We speed up the expensive Bayesian sampling process by using parallel computing. Experiments on both synthetic datasets and large-scale real-world MEA recordings show the effectiveness and efficiency of the scalable Bayesian framework. Inference of networks from controlled experiments exposing neural cultures to cadmium yields clearly distinguishable results and further confirms the utility of our framework.
CCS CONCEPTS
• Applied computing → Life and medical sciences; Computational biology; Biological networks.

KEYWORDS
Functional connectivity; Multi-electrode array; Bayesian inference
ACM Reference Format:
Yun Zhao, Richard Jiang, Zhenni Xu, Elmer Guzman, Paul K. Hansma, Linda Petzold. 2020. Scalable Bayesian Functional Connectivity Inference for Multi-Electrode Array Recordings. In San Diego ’20: 19th International Workshop on Data Mining in Bioinformatics, August 24, 2020, San Diego, CA. ACM, New York, NY, USA, 10 pages.
1 INTRODUCTION
Neuroscience deals with how networks of neurons are organized and how they function [2]. Understanding connectivity between
neurons and within the brain is a fundamental problem in neurobiology [11]. Functional connectivity, defined as the statistical dependencies between different brain regions with similar activity patterns, is widely used in various neural tasks [1, 13]. For instance, functional connectivity magnetic resonance imaging (MRI) is crucial for diagnosing and comprehending autism spectrum disorders [20]. MEAs [24] can record extracellular action potentials from hundreds or thousands of neurons and provide insights on neuronal connectivity [17]. When neurons are grown on planar MEAs, action potentials can be non-invasively monitored for hours or weeks [14]. Further, there is a trend towards increasing the density of the arrays [26] to better understand neuronal connectivity. MEA recordings give researchers opportunities to understand neuronal activity in many regions such as the brain, retina, and heart [18]. However, the analysis of this data is challenging, in part because of its high dimensionality.

Summary statistics can be used to measure the connection weights between electrodes. These include, for example, Pearson correlation [12], the cross correlogram (CCG), the maximal information coefficient (MIC) [25], as well as biophysically-inspired metrics [4]. However, a data-generative model is required to understand the underlying structure and to make full use of domain expert knowledge [16]. Furthermore, these summary-statistic methods provide different functional connectivity results for the same recording, since they are all deterministic metrics that present fixed connection weights between every pair of electrodes instead of a probabilistic estimate.

Bayesian inference can address these requirements, since it provides distributions for parameters using probabilistic models and observation data.
In contrast to deterministic optimization procedures that give point estimates of the unknown functional connectivity, computing a Bayesian posterior yields probability distributions for the neuronal network functional connectivity. Bayesian inference has been combined with the generalized linear model (GLM), with graph-based priors, to infer neuron connectivity patterns [15]. However, there is a lack of scalable Bayesian techniques for inference of network structure, a lack that is particularly acute for inference from high-density recordings. A considerable challenge for Bayesian techniques is the rapid growth of computation time with the increasing scale of the network.

In this paper, we propose a scalable framework for Bayesian inference, inspired by the hierarchical structure of networks of neurons. Experiments on both synthetic datasets and large-scale real-world MEA recordings show that our framework provides accurate and insightful results. Furthermore, we apply the proposed framework to a controlled cadmium dataset, and the results confirm its utility.
Figure 1: The workflow for Bayesian functional connectivity inference: neurons on the MEA, individual electrode recordings, detected spikes, and the inferred functional connectivity. We consider only negative deflections that exceeded 6 times the standard deviation of the median noise level as spikes.
The key contributions of this paper include:
1) We propose a scalable functional connectivity inference framework, shown in Fig. 1, for MEA recording data. We speed up the expensive Bayesian sampling process through the use of parallel computing.
2) We infer the network by splitting the large-scale recordings into smaller local networks. The splitting strategy decreases the average sampling time quadratically with the number of smaller local networks. We also provide a strategy for inferring the regional connectivity between local networks. This not only eases the computational burden of sampling but also provides useful insights on regional connections in organoids and brains.
3) Experiments on both synthetic datasets and large-scale real-world MEA recordings show the effectiveness and efficiency of the Bayesian framework. Inference of the network structure of cadmium-exposed neuron cultures further demonstrates the usefulness of our framework.
The remainder of this paper is organized as follows. Section 2 describes the MEA data collection. We delineate the probabilistic models in Section 3 and demonstrate the Bayesian inference details in Section 4. Section 5 describes the hierarchical setup. Results for both synthetic and real data are provided in Sections 6 and 7, respectively. Related work is described in Section 8. Section 9 is the Discussion.
2 MEA DATA COLLECTION
We prepared hippocampal neurons from postnatal day 0 (P0) mice with C57BL/6 genetic background using a previously described protocol [28]. Cleaned and sterilized MEAs (120MEA100/30iR-ITO arrays; Multi Channel Systems) were incubated with poly-L-lysine (0.1 mg/ml) for at least one hour at 37 °C, rinsed 3 times with sterile deionized water, and allowed to air dry before cell plating. Glial cultures were maintained in separate T-75 flasks. 100,000 to 125,000 dissociated glial cells were used for the first plating of MEAs to obtain a confluent glial culture over the surface of the electrodes. Once glia were confluent, hippocampi dissected from the brain and manually dissociated were plated at 250,000 cells in the MEA chamber. Cultures were grown in a tissue culture incubator (37 °C, 5% CO2) in a medium made with minimum essential medium + Earle's salts (Thermo Scientific), which is buffered for CO2 when the plate is in the CO2 incubator.

Extracellular voltage recordings of neuronal cultures were performed using an MEA 2100-System (Multichannel Systems, Reutlingen, Germany). Each MEA contained 120 electrodes with a 100 µm inter-electrode distance. All data were acquired at a 20 kHz sampling rate. All recordings were performed in culture media. The head stage temperature was set to 30 °C with an external temperature controller, and the MEAs were equilibrated for 5 min on the head stage before data acquisition or after any pharmacological or temperature manipulation. Recording duration was 3 minutes. Only cultures at 14 days in vitro (DIV) or older were used for pharmacological experiments.
Raw data was converted to HDF5 file format and processed offline. Spike detection was done with the Matlab tool Waveclus [7]. Note that we did not apply spike sorting, since it may introduce considerable noise due to the unsupervised clustering methods used to obtain the neuron (or unit) information [8], and the many different spike sorting algorithms [7, 10] give different outputs. Extracellular voltage recordings were bandpass filtered using cutoff frequencies of 200 Hz and 4000 Hz. Only negative deflections in the voltage records were labelled as spikes, when the amplitude exceeded 6 times the standard deviation of the median noise level. Spike times and amplitudes were recorded and used for downstream analysis.
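The amplitude criterion above can be sketched as follows. This is an illustrative reimplementation, not the Waveclus code: it uses the common median-based noise estimate σ = median(|x|)/0.6745 as a stand-in for the paper's "standard deviation of the median noise level", and the function and argument names are our own.

```python
import numpy as np

def detect_spikes(trace, fs=20000.0, k=6.0):
    """Threshold-based spike detection on one electrode's trace.

    A spike is a negative deflection exceeding k times a robust
    (median-based) estimate of the noise standard deviation.
    Returns spike times in seconds and spike amplitudes.
    """
    sigma = np.median(np.abs(trace)) / 0.6745  # robust noise estimate
    threshold = -k * sigma
    below = trace < threshold
    # rising edges of the boolean mask -> one event per crossing
    onsets = np.flatnonzero(below & ~np.roll(below, 1))
    return onsets / fs, trace[onsets]

# toy example: Gaussian noise with two injected negative spikes
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 20000)
x[5000] = -15.0
x[12000] = -20.0
t, a = detect_spikes(x)
```

On the toy trace, both injected deflections fall well below the 6σ threshold, so both are recovered, while Gaussian noise alone essentially never crosses it.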
3 PROBABILISTIC MODEL
In this section, we briefly review the probabilistic model of neuronal spike trains introduced in [15], along with our choice of parameterization. Table 1 summarizes some common notations that we will use in this paper. At a high level, the model describes how the underlying connectivity network affects the activation propensity of each electrode over time, producing the observed spike firing pattern measured over the entire MEA. Specifically, the model is composed of three parts: a network model specifying the underlying connectivity of the electrodes, an activation propensity model detailing how a network along with past spike history affects the probability of a spike at a time bin, and a spiking observation model mapping the activation propensity to the observed binary spike trains. Note that an electrode can fire no more than once in one time bin because of the refractory period in neurons. A probabilistic graphical model of this is shown in Fig. 2.

Table 1: Notations

Notation     Description
X_{t,n}      The observed spike at time bin t for electrode n
A            Adjacency matrix
W            Weight matrix
b_n          The baseline activation of electrode n
N            Number of electrodes
T            Autoregressive window of influence
ψ_{t,n}      The activation of electrode n at time bin t
N_o          Number of overlapped electrodes when split
W_o          Weight matrix of the overlapped region
N_s          Sample number of Bayesian inference
ρ            The prior for connection probability
µ_{w_n}      Mean for the n-th row of W
µ_b          Mean of the bias vector
S_{w_n}      Covariance for the n-th row of W
S_b          Covariance of the bias vector
Figure 2: Probabilistic graphical model for MEA. The model describes how the underlying connectivity of a network of electrodes (A and W) can lead to the observed spike trains X_{t,n}.

The network model aims to capture the key properties of the underlying functional network of the electrode population. Specifically, it seeks to represent that connections are potentially directional and differ in strength. To accommodate this, a weighted directed graph is used, where the edge weights represent the strength of the connection between two electrodes. This is incorporated as two matrix-valued latent variables, A ∈ {0, 1}^{N×N} and W ∈ R^{N×N}, corresponding to a binary adjacency matrix and a real-valued weight matrix, respectively.

A neuron can either fire spontaneously or as a response to communications (spikes) it receives from incoming, connected neurons. Given a particular realization of the electrode network, A and W, the instantaneous activation of electrode n at time bin t, ψ_{t,n}, is modeled as a linear, autoregressive function of the lagged spikes from neighboring electrodes:

ψ_{t,n} = b_n + Σ_{m=1}^{N} Σ_{Δt=1}^{T} A_{m→n} W_{m→n} e^{−Δt/τ} X_{t−Δt,m},   (1)

Here, b_n represents the baseline activation rate for electrode n in the absence of influence from any other electrode. A_{m→n} ∈ {0, 1} is a binary variable indicating whether or not there exists a directed connection from electrode m to electrode n. The weight W_{m→n} is the connection strength from electrode m to electrode n. The activation rate ψ is linearly adjusted by the lagged spikes from neighboring electrodes.
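Equation (1), together with the Bernoulli observation model, can be sketched in NumPy. This is an illustrative implementation under assumed values of T and τ (the paper fixes τ, but the value is not reproduced here), and the function and argument names are our own.

```python
import numpy as np

def activation(X, A, W, b, T=10, tau=5.0):
    """Eq. (1): psi[t, n] = b[n] + sum over sources m and lags
    dt = 1..T of A[m, n] * W[m, n] * exp(-dt / tau) * X[t - dt, m]."""
    n_bins, _ = X.shape
    psi = np.tile(b.astype(float), (n_bins, 1))
    AW = A * W  # effective signed connection strengths
    for dt in range(1, T + 1):
        # spikes dt bins in the past, discounted exponentially
        psi[dt:] += np.exp(-dt / tau) * (X[:-dt] @ AW)
    return psi

def simulate(A, W, b, n_bins=500, T=10, tau=5.0, seed=0):
    """Forward-sample binary spike trains: X[t, n] ~ Bernoulli(sigmoid(psi))."""
    rng = np.random.default_rng(seed)
    X = np.zeros((n_bins, len(b)), dtype=int)
    AW = A * W
    for t in range(n_bins):
        psi = b.astype(float)
        for dt in range(1, min(t, T) + 1):
            psi = psi + np.exp(-dt / tau) * (X[t - dt] @ AW)
        X[t] = rng.random(len(b)) < 1.0 / (1.0 + np.exp(-psi))
    return X
```

A negative weight on an active connection lowers ψ and hence the firing probability, matching the excitatory/inhibitory interpretation below.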
The strength of the lagged spike is weighted by the strength of the connection to the neighbor and by an exponentially decreasing function of time, inspired by the synapse connectivity measurement in [4], with a fixed time constant τ. This prioritizes recent spikes from strongly connected neighbors. We consider both positive and negative W_{m→n}, which captures that neuronal connections may be excitatory or inhibitory in nature. It is possible for a spike to decrease the propensity of firing when a weight is negative.

Our spike train data consists of binary observations X_{t,n} of whether electrode n fired at time bin t. This is modeled as a Bernoulli random variable with probability dependent on the activation propensity, σ(ψ_{t,n}) = e^{ψ_{t,n}} (1 + e^{ψ_{t,n}})^{−1}, where σ is the logistic function that maps the propensity to a probability.

Based on the previous sections, the full model, including the priors, is as follows:

A_{i,j} ∼ Bernoulli(ρ)
(µ_{w_n}, S_{w_n}), (µ_b, S_b) ∼ Normal-Inverse-Wishart(µ₀, κ₀, I, ν₀)
w_n | µ_{w_n}, S_{w_n} ∼ Normal(µ_{w_n}, S_{w_n})
b | µ_b, S_b ∼ Normal(µ_b, S_b)
ψ_{t,n} = b_n + Σ_{m=1}^{N} Σ_{Δt=1}^{T} A_{m→n} W_{m→n} e^{−Δt/τ} X_{t−Δt,m}
X_{t,n} ∼ Bernoulli(σ(ψ_{t,n})),   (2)

where w_n denotes the n-th row of the matrix W, and µ₀, κ₀, and ν₀ are fixed Normal-Inverse-Wishart hyperparameters. The hyperparameter ρ affects the prior over the connectivity matrix A.

4 BAYESIAN INFERENCE
We apply an efficient Gibbs sampler, derived from [15], which exhibits scalable parallelism. Sampling the posterior over the discrete adjacency matrix A is the most challenging step. Due to conjugacy, we can integrate over W and sample A from its collapsed conditional distribution. We update A and W by collapsing out W to directly sample each of A's elements. We iterate over each A_{m→n} and sample it from its conditional distribution.
After that, we sample W from its conditional Gaussian likelihood. We employ a novel splitting strategy to make the inference method amenable to high-density arrays. We present and verify the methodology in the following sections.

5 HIERARCHICAL INFERENCE
We have 120 electrodes in the MEA device in this paper. However, there are 26,400 electrodes in the MaxWell complementary metal oxide semiconductor (CMOS) MEA device. As we can see in Fig. 3, the average time for one sample increases drastically with the size of the array. That is because the dimensions of the parameters to be estimated (A and W) exhibit quadratic growth with the number of electrodes. The time reported in Fig. 3 is the sampling time, computed in parallel with 24 CPU cores. To deal with this time complexity challenge, we propose a hierarchical inference procedure. As shown in Fig. 4, we split the whole large array into a fixed number of regions (for example, 4). In the first level of the algorithm, we perform Bayesian inference individually on each of the smaller regions. In the second level, we treat each whole region as one group by taking the mean of all the spike trains in the region. We apply the same probabilistic model to infer the regional connectivity, which is the connection strength between each pair of regions. This heuristic splitting strategy is inspired by the biological phenomenon of regional connections in brains. The splitting strategy decreases the average sampling time quadratically with the number of sub-regions. To get a better understanding of the connections on the border of any two regions, we further propose an overlapping split mechanism, as shown in Fig. 5.

6 SYNTHETIC DATA RESULTS
Due to the lack of ground-truth functional connectivity in an MEA, we first use synthetic data to check the effectiveness of the probabilistic model and our splitting strategies. We use the ground-truth model to generate synthetic data and compare the functional connectivity inferred from the synthetic data with the ground truth.
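The two levels of the hierarchy can be sketched numerically. This is an illustrative sketch, with our own helper names and region layout: the first function shows the quadratic cost argument for splitting, and the second shows the second-level averaging of spike trains into one train per region.

```python
import numpy as np

def per_region_cost(n_electrodes, k_regions):
    """Each local network has N/k electrodes, so its parameter count
    (A, W) and per-sample cost scale as (N/k)^2: a quadratic (1/k^2)
    reduction relative to sampling the whole array at once."""
    return (n_electrodes / k_regions) ** 2 / n_electrodes ** 2

def regional_trains(X, regions):
    """Second level: average each region's spike trains (columns of X)
    into a single train per region; `regions` lists electrode indices."""
    return np.column_stack([X[:, idx].mean(axis=1) for idx in regions])
```

For the 120-electrode array split into 4 regions, each local network's sampling cost drops to 1/16 of the full-array cost, and the regional trains feed the same probabilistic model at the second level.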
Figure 3: Average time for one Bayesian sample exhibits quadratic growth with the number of electrodes. The time reported is the sampling time computed in parallel with 24 central processing unit (CPU) cores.
Figure 4: Non-overlapping split. We split the large array into four regions. For the first level, we infer individually on the smaller regions. For the second level, we treat each whole region as one hidden super node and infer the regional connection strength between each pair of sub-regions.

For all the experiments in this section and the next section, we use spike trains with shape of 180,000 * 120. We apply parallel sampling with 24 CPUs. The Gibbs sampler was run for 1000 iterations. The first 500 samples were discarded to account for burn-in. We verify that the chain has reached a steady state by observing that the parameter traces and the log-likelihood have converged.

In Fig. 6, the inferred posterior mean is almost the same as the ground-truth connectivity matrices.
Figure 5: Overlapping split. To better capture the connections on the border of the regions, an overlapping split is used.

Cosine similarity measures the similarity between two matrices of an inner-product space and is widely used in high-dimensional spaces [19]. The inferred functional network A and W both achieved a high cosine similarity of 0.99 compared with the ground truth in this case. Similarly, in Fig. 7, we obtain cosine similarities of 0.95 and 0.99 for A and W when the number of electrodes increases to 10.

We also use synthetic datasets to verify the effectiveness of our splitting strategies. Here, we assume all the electrodes lie in a line. For the non-overlapping split, we use N electrodes, split them into two sub-regions, "front" and "back", of equal size, and apply the Bayesian inference on each region. We compared the separate split results with the Bayesian inference results without a split and report the cosine similarities in Tab. 2. The cosine similarities are consistently high across all parameter settings, which validates the effectiveness of our splitting strategy. Similarly, for the overlapping split, we test the results from a split with N_o overlapped electrodes. In Tab. 3, N_o indicates the number of overlapped electrodes and W_o is the weight matrix for the overlapped region. The high cosine similarities in Tab. 3 confirm the effectiveness of the overlapping split strategy. From both Tab. 2 and Tab. 3, the network prior does affect the inference of A and W, but only indirectly, and its effect is small compared to that of the data.

Fig. 8 shows the effectiveness of regional connection inference after the split. Each element in the inferred regional connectivity matrix W summarizes the elements of the corresponding regions in the ground-truth W altogether. Regional inferences after the split precisely indicate the strength and the nature of the connections between regions.

7 REAL DATA RESULTS
In this section, we apply our framework to two sets of real MEA recordings.
Throughout this section, we apply the non-overlapping split with 4 regions of equal size, as in Fig. 4. As we mentioned in Section 2.2, we use 3-minute recordings, which present as spike trains with shape of 180,000 * 120. For the Bayesian inference, we adjusted the prior hyper-parameters and verified that the results stayed consistent.

Figure 6: Comparison of the network inferred from synthetic data with 4 electrodes and the ground truth. Zero indicates no connection. Minus one indicates an inhibitory connection, and one indicates an excitatory connection. Cosine similarity between the two A's is 0.99. Cosine similarity between the two W's is 0.99.

Table 2: Non-overlapping Results: Cosine Similarity

N    S_w  µ_b  ρ    W_front  A_front  W_back  A_back
10   1    0    0.5  0.93     0.99     0.94    0.98
10   1    0    1    0.99     0.99     0.98    0.99
10   1    5    0.5  0.99     0.99     0.98    0.99
10   2    0    0.5  0.95     0.98     0.88    0.97
20   1    0    0.5  0.91     0.99     0.95    0.99
30   1    0    0.5  0.94     0.99     0.93    0.99
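The cosine-similarity scores reported in Tables 2 and 3 can be computed by flattening each matrix to a vector before taking the inner product; a minimal sketch (helper name ours):

```python
import numpy as np

def cosine_similarity(M1, M2):
    """Cosine similarity of two matrices, flattened to vectors:
    <u, v> / (||u|| * ||v||), the score used to compare inferred
    and ground-truth A and W."""
    u = np.ravel(M1).astype(float)
    v = np.ravel(M2).astype(float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
```

Identical matrices score 1, and matrices with disjoint support score 0, so values near 0.99 indicate near-exact recovery of the connectivity pattern.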
For the real dataset, since we don't have the ground truth, we adopted neuroscience experts' labelling as our reference. The neuroscience experts labelled the connections by watching MEA viewer [6]. We compare our Bayesian analysis with the expert-labeled result. As shown in Fig. 9, 93.33% of all the connections detected by the neuroscience experts are detected by our inference. We can see that the real recording has some regional patterns, which matches the intuition behind our splitting strategy. Because of the difficulty of detecting connections in neuroscience experiments, our neuroscience experts can find some connections that they are confident in, but cannot guarantee finding all the connections. In our Bayesian inference, by contrast, we give all possible connections based on our probabilistic model. Thus we found more connections using Bayesian inference than the neuroscience experts did. We also calculated the latency between electrodes using CCG, based on the neuroscience experts' labeled ground-truth connectivity. The latency shows a similar pattern as our inferred weights.

Figure 7: Comparison of the network inferred from synthetic data with 10 electrodes and the ground truth. Cosine similarity between the two A's is 0.95. Cosine similarity between the two W's is 0.99.

Table 3: Overlapping Results: Cosine Similarity

N    N_o  S_w  µ_b  ρ    W_front  A_front  W_back  A_back  W_o
10   2    1    0    0.5  0.97     0.9      0.95    0.99    0.98
10   2    1    0    1    0.99     0.99     0.99    0.99    0.99
10   2    1    5    0.5  0.99     0.99     0.99    0.99    0.99
10   4    2    0    0.5  0.92     0.98     0.95    0.99    0.93
20   4    1    0    0.5  0.94     0.99     0.97    0.99    0.96
30   4    1    0    0.5  0.90     0.99     0.93    0.99    0.97

Figure 8: Regional connections in the non-overlapping split. "Front" indicates the region composed of electrodes 0 and 1 in the ground-truth figure, while "back" indicates the region of electrodes 2 and 3. Each element in the inferred regional connectivity matrix W reflects the elements of the corresponding regions in the ground-truth W altogether. Regional inferences after the split precisely reveal the overall connectivity between regions.

Cadmium is a toxic heavy metal that accumulates in living systems and is currently one of the most important occupational and environmental pollutants [5]. Cells have a calcium signalling toolkit whose components can be mixed and matched to create various spatial and temporal signals [3].
Cadmium can change the intracellular concentration of calcium, which is a universal and versatile intracellular messenger [9].

We reduced the release probability of presynaptic vesicles by titrating in increasing amounts of cadmium, in order to decrease the influx of calcium current into the presynaptic terminal. Baseline recordings were performed 5 minutes prior to any cadmium addition. Following the baseline recording, the cadmium concentration was brought to 1 µM and the solution was mixed gently. The MEA was placed on the recording headstage and allowed to equilibrate for 3 minutes, followed by a 3-minute recording. The same procedure was performed for 5 µM cadmium in the same MEA.

Figure 9: Comparison between our Bayesian inference and the ground truth detected by neuroscience experts. Green represents the connections found by Bayesian inference, while red denotes the connections not found. 93.33% of the connections detected by the neuroscience experts are detected by Bayesian inference.

To analyze how cadmium affects the structure of neuronal networks, we compared the obtained posterior matrices A and W for a set of MEA recordings, comprised of one control with no cadmium introduced and two others with 1 µM and 5 µM of cadmium. We used a threshold of 0.05 (chosen based on the outliers of the weight distribution) to eliminate tiny weights in the inferred weight matrix, which are biological noise. To understand the topology, we look at the following graph metrics for each posterior sample: number of connections, average clustering coefficient, and average path length. Overall, we find that cadmium does change the topology of the estimated networks, with statistical significance in the posterior. As shown in Fig. 10, the mean connection counts decrease with the introduction of cadmium to the culture.
Higher concentrations of cadmium lead to a drastic decrease in the number of connections in the functional networks.

The clustering coefficient is the number of edges between an electrode's immediate neighbors divided by the number of all possible connections that could exist among them. It measures the level of local connectivity between electrodes. In Fig. 11, the average clustering coefficient decreases with statistical significance when cadmium is in the culture, indicating that cadmium cultures are less likely to have connections within tightly connected groups or communities.

The average path length is the average shortest path length over all possible pairs of electrodes. Fig. 12 shows that the average path length increases significantly with the increase of cadmium concentration. That validates the idea that cadmium can impede neuronal signal transmission by changing the intracellular concentration of calcium.
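The clustering-coefficient definition above can be computed directly from an adjacency matrix. This is an illustrative implementation; symmetrising directed connections before counting is our own choice, not stated in the paper.

```python
import numpy as np
from itertools import combinations

def avg_clustering(adj):
    """Average clustering coefficient of a 0/1 adjacency matrix:
    for each node, edges among its neighbours divided by the number
    of possible neighbour pairs, averaged over all nodes."""
    A = np.maximum(adj, adj.T).copy()  # symmetrise directed edges
    np.fill_diagonal(A, 0)
    coeffs = []
    for n in range(A.shape[0]):
        nbrs = np.flatnonzero(A[n])
        if len(nbrs) < 2:
            coeffs.append(0.0)
            continue
        links = sum(A[i, j] for i, j in combinations(nbrs, 2))
        coeffs.append(2.0 * links / (len(nbrs) * (len(nbrs) - 1)))
    return float(np.mean(coeffs))
```

A fully connected triangle scores 1.0, while a chain with no closed triangles scores 0.0, matching the intuition that cadmium-exposed cultures with fewer closed neighbourhoods score lower.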
Figure 10: Posterior of the number of connections versus cadmium concentration. Cd is short for cadmium. The mean connection counts decrease with the introduction of cadmium to the culture. Higher concentrations of cadmium lead to a further decrease in the number of connections in the functional networks.
Figure 11: Posterior of the average clustering coefficient versus cadmium concentration. The average clustering coefficient decreases with statistical significance upon the introduction of cadmium into the culture.
In Fig. 13, we compare the indegree of each electrode for the control and the 5 µM cadmium concentration. We can see that the overall connection pattern is maintained but the indegree decreases prominently with the introduction of cadmium into the culture. A similar phenomenon can be seen for the outdegree of each electrode for the control and the 5 µM cadmium concentration in Fig. 14.
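The in- and outdegrees plotted in Figs. 13 and 14 come directly from the binary adjacency matrix; a minimal sketch (helper name ours), using the convention A[m, n] = 1 for a directed connection m → n as in the model section:

```python
import numpy as np

def degrees(A):
    """Column sums count incoming connections (indegree);
    row sums count outgoing connections (outdegree)."""
    return A.sum(axis=0), A.sum(axis=1)
```

Comparing these per-electrode counts between the control and cadmium conditions gives the node sizes used in the two figures.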
Figure 12: Posterior of the average path length distribution versus cadmium concentration. The average path length increases significantly with the increase of cadmium concentration.
8 RELATED WORK
MEAs provide opportunities for researchers to understand neuronal connectivity by recording spike trains, but they also present severe data analysis challenges. Inferring functional connectivity networks is critical to many applications in neuroscience. To analyze high-dimensional spike trains, there exist several methods to measure connection weights. For example, CCG [12, 27] is a metric built from two electrodes' spike trains. It presents the probability of an electrode firing in response to a spike in a τ-millisecond time window before or after another electrode fires. MIC has been widely used to identify known and novel relationships in data sets from gene expression and global health [25]. MIC is a two-variable dependence measure that captures functional relationships by providing a score that roughly equals the coefficient of determination of the data relative to the regression function. Biophysically-inspired metrics extract a directed functional connectivity matrix from the spike trains [4], using the exponential decay property of axon potential signals to quantify the connection coefficients. However, despite the popularity of applying those metrics to MEA recordings, we cannot rely on them for reliable functional networks, because they are deterministic and there is no model for the data.

The GLM, a commonly used modeling framework [8, 23], can model the binary fire/not-fire (1/0) spike train recordings. We can use the logit link function to model the binomial distribution. GLMs are often fit by maximum a posteriori (MAP) estimation [21, 29]. However, when it comes to high-density recordings, MAP cannot fully convey the information in the posterior with a point estimate. Bayesian inference is a process using Bayes' theorem based on prior beliefs about the probabilistic data generation process [22]. It updates the posterior for the parameters of the data generation model as more data becomes available.
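For concreteness, a simplified CCG-style score can be computed as the fraction of one electrode's spikes that have a spike from the other electrode within ±τ. This is an illustrative sketch, not the exact definition used in [12, 27]; the naming and normalization are our own.

```python
import numpy as np

def ccg_weight(t_a, t_b, window=0.005):
    """Fraction of electrode-A spike times (seconds) with an
    electrode-B spike within +/- `window` seconds."""
    if len(t_a) == 0:
        return 0.0
    t_b = np.sort(t_b)
    idx = np.searchsorted(t_b, t_a)
    hits = 0
    for i, t in zip(idx, t_a):
        gaps = []
        if i < len(t_b):
            gaps.append(abs(t_b[i] - t))      # nearest B spike at/after t
        if i > 0:
            gaps.append(abs(t_b[i - 1] - t))  # nearest B spike before t
        if gaps and min(gaps) <= window:
            hits += 1
    return hits / len(t_a)
```

Being a fixed function of the two spike trains, such a score illustrates the determinism discussed above: it assigns one number per electrode pair, with no posterior uncertainty.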
GLM and graph-based models have been combined to infer neuron connectivity patterns in a Bayesian way [15]. However, the Bayesian techniques have a considerable downside of increased computation time, especially when the number of electrodes is large. In this paper, we introduced a scalable Bayesian inference framework for large-scale MEA recordings. The functional connectivities are inferred from a joint probabilistic model of a GLM and a network.

Figure 13: Incoming connections for control and 5 µM cadmium. Nodes are sized and colored according to the number of incoming connections. The overall connection pattern remains but the indegree decreases prominently with the introduction of cadmium into the culture.

9 DISCUSSION
In this paper we have focused on inferring functional neuronal network connections using MEA recordings. Specifically, our goal has been to infer the functional connectivity for MEAs with large numbers of probes. Along this line, we proposed a scalable Bayesian inference framework. Our framework makes use of the hierarchical structure of networks of neurons and splits the whole array into
Control Outdegree
Cd 5 M Outdegree
Figure 14: Outgoing connections for control and Cadmium 5 µM (panels: Control Outdegree; Cd 5 µM Outdegree). Nodes are sized and colored according to the number of outgoing connections. The overall connection pattern remains, but the outdegree decreases prominently with the introduction of Cadmium into the culture.

smaller local networks for network inference. The splitting strategy decreases the average sampling time quadratically with the number of sub-regions. We also provide a strategy for inferring the connectivity between local networks. By comparing with ground truth for both synthetic data and real-world, human-expert-labeled MEA recordings, our experimental results provide compelling evidence that our framework can infer the underlying functional connectivity. Furthermore, we applied the proposed framework to a controlled cadmium dataset, and the results confirm its utility. As the density of MEAs continues to increase, our method will become increasingly valuable for inferring neuronal network structure efficiently.

Our experimental results demonstrate the usefulness of this framework. Here we suggest some avenues for future work. First, the model could be extended to account for more detailed biological knowledge. For example, there are different types of neurons, such as motor neurons and interneurons, which exhibit different behaviors in response to incoming signals. The auto-regressive propensity model, which is currently linear, could also be improved to incorporate nonlinear effects or other mechanistic firing models. Second, developing a method to determine when and how to split would allow the framework to be used more robustly. Moreover, using the framework to study factors other than cadmium, such as the presence of certain genes, may produce a greater understanding of these biologically complex systems.

REFERENCES

[1] Antonello Baldassarre, Christopher M Lewis, Giorgia Committeri, Abraham Z Snyder, Gian Luca Romani, and Maurizio Corbetta. 2012.
Individual variability in functional connectivity predicts performance of a perceptual task. Proceedings of the National Academy of Sciences.
[2] Neuroscience. Vol. 2. Lippincott Williams & Wilkins.
[3] Michael J Berridge, Peter Lipp, and Martin D Bootman. 2000. The versatility and universality of calcium signalling. Nature Reviews Molecular Cell Biology 1, 1 (2000), 11–21.
[4] Yazan N Billeh, Michael T Schaub, Costas A Anastassiou, Mauricio Barahona, and Christof Koch. 2014. Revealing cell assemblies at multiple levels of granularity. Journal of Neuroscience Methods 236 (2014), 92–106.
[5] Jacopo Junio Valerio Branca, Gabriele Morucci, and Alessandra Pacini. 2018. Cadmium-induced neurotoxicity: still much ado. Neural Regeneration Research.
[6] PloS One 13, 2 (2018), e0192477.
[7] Fernando J Chaure, Hernan G Rey, and Rodrigo Quian Quiroga. 2018. A novel and fully automatic spike-sorting implementation with variable number of features. Journal of Neurophysiology.
[8] Computational Intelligence and Neuroscience.
[9] Chemico-Biological Interactions.
[10] Neuron 95, 6 (2017), 1381–1394.
[11] Karl J Friston. 1994. Functional and effective connectivity in neuroimaging: a synthesis. Human Brain Mapping 2, 1-2 (1994), 56–78.
[12] Matteo Garofalo, Thierry Nieus, Paolo Massobrio, and Sergio Martinoia. 2009. Evaluation of the performance of information theory-based methods and cross-correlation to estimate the functional connectivity in cortical networks. PloS One 4, 8 (2009), e6482.
[13] Adam Gazzaley, Jesse Rissman, and Mark D'Esposito. 2004. Functional connectivity during working memory maintenance. Cognitive, Affective, & Behavioral Neuroscience 4, 4 (2004), 580–599.
[14] Christopher M Lewis, Conrado A Bosman, and Pascal Fries. 2015. Recording of brain activity across spatial scales. Current Opinion in Neurobiology 32 (2015), 68–77.
[15] Scott Linderman, Ryan P Adams, and Jonathan W Pillow. 2016. Bayesian latent structure discovery from multi-neuron recordings. In Advances in Neural Information Processing Systems. 2002–2010.
[16] Honglei Liu and Bian Wu. 2017. Active learning of functional networks from spike trains. In Proceedings of the 2017 SIAM International Conference on Data Mining. SIAM, 81–89.
[17] Ming-Gang Liu, Xue-Feng Chen, Ting He, Zhen Li, and Jun Chen. 2012. Use of multi-electrode array recordings in studies of network synaptic plasticity in both time and space. Neuroscience Bulletin 28, 4 (2012), 409–422.
[18] Ming-Gang Liu, Xue-Feng Chen, Ting He, Zhen Li, and Jun Chen. 2012. Use of multi-electrode array recordings in studies of network synaptic plasticity in both time and space. Neuroscience Bulletin 28, 4 (2012), 409–422.
[19] Chunjie Luo, Jianfeng Zhan, Xiaohe Xue, Lei Wang, Rui Ren, and Qiang Yang. 2018. Cosine normalization: Using cosine similarity instead of dot product in neural networks. In International Conference on Artificial Neural Networks. Springer, 382–391.
[20] Ralph-Axel Müller, Patricia Shih, Brandon Keehn, Janae R Deyoe, Kelly M Leyden, and Dinesh K Shukla. 2011. Underconnected, but how? A survey of functional connectivity MRI studies in autism spectrum disorders. Cerebral Cortex 21, 10 (2011), 2233–2243.
[21] Liam Paninski. 2004. Maximum likelihood estimation of cascade point-process neural encoding models. Network: Computation in Neural Systems 15, 4 (2004), 243–262.
[22] Thomas Parr, Geraint Rees, and Karl J Friston. 2018. Computational neuropsychology and Bayesian inference. Frontiers in Human Neuroscience 12 (2018), 61.
[23] Jonathan W Pillow, Jonathon Shlens, Liam Paninski, Alexander Sher, Alan M Litke, EJ Chichilnisky, and Eero P Simoncelli. 2008. Spatio-temporal correlations and visual signalling in a complete neuronal population. Nature.
[24] Nature Methods 11, 7 (2014), 727.
[25] David N Reshef, Yakir A Reshef, Hilary K Finucane, Sharon R Grossman, Gilean McVean, Peter J Turnbaugh, Eric S Lander, Michael Mitzenmacher, and Pardis C Sabeti. 2011. Detecting novel associations in large data sets. Science.
[26] Journal of Neuroscience Methods 233 (2014), 115–128.
[27] Emilio Salinas and Terrence J Sejnowski. 2001. Correlated neuronal activity and the flow of neural information. Nature Reviews Neuroscience 2, 8 (2001), 539–550.
[28] Kenneth R Tovar, Daniel C Bridges, Bian Wu, Connor Randall, Morgane Audouard, Jiwon Jang, Paul K Hansma, and Kenneth S Kosik. 2018. Action potential propagation recorded from single axonal arbors using multielectrode arrays. Journal of Neurophysiology.
[29] Journal of Neurophysiology 93, 2 (2005), 1074–1089.
Table 4: Hyper-parameters for synthetic experiments

T      µ_w^n   S_b^n
100    1       1
Table 5: Hyper-parameters for real data experiments

T      µ_w^n   S_b^n   S_w^n   µ_b   ρ
100    1       1       1       -2    0.1
A APPENDIX FOR REPRODUCIBILITY
To support the reproducibility of the results in this paper, we provide the details of our experiments. We use 24 CPUs (Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz). We perform parallel sampling with OpenMP, and the average sampling time scales with the number of CPUs [15]. The network prior does affect the inference of A and W, but only indirectly, and its effect is small compared to the effect of the data. For all the results in this paper, we ran the Gibbs sampler for 1000 iterations and verified that the results stayed consistent. The hyper-parameters we used for synthetic and real data are listed in Tab. 4 and Tab. 5, respectively.

A.1 Hyper-parameters for synthetic experiments
The hyper-parameters in Tab. 4 are fixed for the synthetic experimental results in Fig. 6, Fig. 7, Fig. 8, Tab. 2, and Tab. 3.
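The splitting strategy described in the paper can be sketched as follows. This is an illustrative Python sketch, not the authors' OpenMP code: `sample_region` is a hypothetical placeholder for the per-region Gibbs sweep, and the serial loop marks where the parallelism goes.

```python
import numpy as np

def split_regions(n_electrodes, n_regions):
    # Partition electrode indices into contiguous sub-regions.
    return np.array_split(np.arange(n_electrodes), n_regions)

def sample_region(idx, seed):
    # Hypothetical stand-in for a per-region Gibbs sweep: it returns a
    # placeholder weight matrix for one sub-region. A real sweep costs
    # roughly O(k^2) for k electrodes, so splitting the array into m
    # regions cuts the per-region cost by about a factor of m^2, which
    # matches the quadratic speedup reported in the paper.
    rng = np.random.default_rng(seed)
    k = len(idx)
    return rng.normal(size=(k, k))

regions = split_regions(120, 4)
A = np.zeros((120, 120))
# Regions are independent, so these iterations can run in parallel
# (the paper uses OpenMP); a serial loop keeps the sketch simple.
for seed, idx in enumerate(regions):
    A[np.ix_(idx, idx)] = sample_region(idx, seed)
```

Note that this block-diagonal assembly leaves cross-region connections empty; the paper's separate strategy for inferring connectivity between local networks would fill in the off-diagonal blocks.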