[PDF] A metabolomic measure of energy metabolism moderates how an inflammatory miRNA relates to rs-fMRI network and motor control in football athletes

Abstract

Collision sports athletes experience many head acceleration events (HAEs) per season. The effects of these subconcussive events are largely understudied since HAEs may produce no overt symptoms, and are likely to diffusely manifest across multiple scales of study (e.g., molecular, cellular network, and behavior). This study integrated resting-state fMRI with metabolome, transcriptome and computational virtual reality (VR) behavior measures to assess the effects of exposure to HAEs on players in a collegiate American football team. Permutation-based mediation and moderation analysis was used to investigate relationships between network fingerprint, changes in omic measures and VR metrics over the season. Change in an energy cycle fatty acid, tridecenedioate, moderated the relationship between 1) miR-505 and DMN fingerprint and 2) the relationship between DMN fingerprint and worsening VR Balance measures (all p less than or equal to 0.05). In addition, the similarity in DMN over the season was negatively related to cumulative number of HAEs above 80G, and DMN fingerprint was less similar across the season in athletes relative to age-matched non-athletes. miR-505 was also positively related to average number of HAEs above 25G per session. It is important to note that tridecenedioate has a double bond making it a candidate for ROS scavenging. These findings between a candidate ROS-related metabolite, inflammatory miRNA, altered brain imaging and diminished behavioral performance suggests that impact athletes may experience chronic neuroinflammation. The rigorous permutation-based mediation/moderation may provide a methodology for investigating complex multi-scale biological data within humans alone and thus assist study of other functional brain problems.

Full PDF

1 A metabolomic measure of energy metabolism moderates how an inflammatory miRNA relates to rs-fMRI network and motor control in football athletes Running Title: Tridecenedioate, miR-505, rsfMRI, and balance Sumra Bari , Nicole L. Vike , Khrystyna Stetsiv , Linda Papa , Eric A. Nauman , Thomas M. Talavage , Semyon Slobounov , Hans C. Breiter Department of Psychiatry and Behavioral Sciences, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA Department of Emergency Medicine, Orlando Regional Medical Center, Orlando, FL, USA Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN, USA School of Mechanical Engineering, Purdue University, West Lafayette, IN, USA Department of Basic Medical Sciences, Purdue University, West Lafayette, IN, USA School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA Department of Kinesiology, Pennsylvania State University, University Park, PA, USA; Laboratory of Neuroimaging and Genetics, Department of Psychiatry, Massachusetts General Hospital and Harvard School of Medicine, Boston, MA, USA indicate co-equal authorship * Corresponding Authors: For project design, management and data collection: Semyon Slobounov ([email protected]) For hypotheses and conceptual framework, data analysis and paper development: Hans Breiter ([email protected]) 2

Abstract

Collision sports athletes experience many head acceleration events (HAEs) per season. The effects of these subconcussive events are largely understudied since HAEs may produce no overt symptoms, and are likely to diffusely manifest across multiple scales of study (e.g., molecular, cellular network, and behavior). This study integrated resting-state fMRI with metabolome, transcriptome and computational virtual reality (VR) behavior measures to assess the effects of exposure to HAEs on players in a collegiate American football team. Permutation-based mediation and moderation analysis was used to investigate relationships between network fingerprint, changes in omic measures and VR metrics over the season. Change in an energy cycle fatty acid,  tridecenedioate, moderated the relationship between 1)  miR-505 and DMN fingerprint and 2) the relationship between DMN fingerprint and worsening  VR Balance measures (all 𝑝 𝐹𝑝𝑒𝑟𝑚 , 𝑝 𝛽 𝑝𝑒𝑟𝑚 ≤ ) . In addition, the similarity in DMN over the season was negatively related to cumulative number of HAEs above 80G, and DMN fingerprint was less similar across the season in athletes relative to age-matched non-athletes.  miR-505 was also positively related to average number of HAEs above 25G per session. It is important to note that tridecenedioate has a double bond making it a candidate for ROS scavenging. These findings between a candidate ROS-related metabolite, inflammatory miRNA, altered brain imaging and diminished behavioral performance suggests that impact athletes may experience chronic neuroinflammation. The rigorous permutation-based mediation/moderation may provide a methodology for investigating complex multi-scale biological data within humans alone and thus assist study of other functional brain problems. Introduction

Athletes in impact sports may experience hundreds of so-called “subconcussive” events during a season that do not lead to overt concussive symptoms [1–4], but may lead to subtle alterations in brain structure and function [5–10]. Such events involve a direct blow to the head or an indirect acceleration or whiplash movement due to an impact elsewhere on the body, and are referred to as head acceleration events (HAE) [2]. Neuroimaging studies using functional MRI (fMRI), diffusion weighted imaging, perfusion imaging, and magnetic resonance spectroscopy (MRS) have shown alterations in functional connectivity, working memory, cerebro-vascular reactivity, white matter integrity, regional cerebral blood flow, and neurometabolite concentrations in asymptomatic collision sport athletes due to repeated HAE exposure [8,9,11–19] supporting multiple etiologic hypotheses for injury. Many etiological hypotheses about the mechanism of injury have been proposed, including: (1) neurovascular decoupling [20–27] (2) neuroinflammation [28–34] and (3) diffuse axonal injury (DAI) [35–42]. Although these three models are independently supported by strong evidence, vascular function is a fundamental part of the initiation phase of inflammatory response, and its longer-term resolution phase [43,44], suggesting two of these three models have some overlap. The neuroinflammation model was the focus of a recent study on micro-RNA (miRNA) levels before and over the course of the football season for miRNAs related to inflammatory responses, such as miR-505, miR-30d, and miR-92a [45]. The panel of miRNAs studied had previously been shown to be elevated in subjects visiting emergency rooms for mild to severe traumatic brain injury (TBI), and were correlated with abnormal clinical readings on CT scans [46]. miRNAs represent a dynamic measure of gene function and are part of the transcriptome [47], targeting up to hundreds of mRNAs, and thus being involved in a wide variety of cellular 4 processes, including many of those that occur after the initial physical impact in head injuries, such as the initiation of inflammation, or its longer-term resolution [48,49]. Inflammation is energy intensive and also leads to alterations in metabolomic profiles (i.e., individualized biochemistry) in athletes, as suggested by several recent MRS studies of mild traumatic brain injury (mTBI) [11,18,50–59]. In parallel, the DAI model suggests there may be an array of shock-wave abnormalities to axons and their structure, and given axon structure depends on a broad array of fatty acids (FAs), FAs might be an important focus for metabolomic study of HAE. One of the fundamental strengths of studying miRNA associated with neuroinflammation, or metabolomic measures of molecules important for brain structural integrity or energetics, is that they can be readily sampled in peripheral blood [45,46]. This strength is coupled, in turn, with the parallel requirement that transcriptomic and metabolomic measures be connected to brain measures and the computational behavior mediated by the brain that is relevant to HAEs and concussion. In recent work, we found such connections between omic measures, brain imaging and computational behavior could be quite powerful, specifically with an observation that miR-30d and miR-92a mediated more than 70% of the relationship between brain imaging and a virtual reality (VR) measure of motor control in football athletes [60]. Core features of the diagnosis of concussion and accumulated HAEs are disturbances of motor control in the form of coordination, balance, or the capacity to navigate, features that were a focus of research by Alexander Luria [61–63]. He noted that head impacts without clinical signs of trauma may be associated with chronic neurocognitive deficits in (a) spatial orientation and accuracy of navigation, (b) processing of visual-spatial information (sensory-motor reactivity), or (c) related coordination functions such as whole-body postural control or balance [61–63]. Recent work with virtual reality (VR) technology has allowed these deficits to be tested using a validated 5 methodology [64,65], producing computational behavior measures of (i) spatial navigation accuracy (“Spatial Memory”), (ii) sensory-motor reactivity and efficiency of visual-spatial processing (“Reaction Time”), and (iii) postural stability during equilibrium changes (“Balance”), along with (iv) an integrative metric of all three measures together (“Comprehensive” score). Although a number of computer-based neuropsychological tests exist to assess aspects of Luria’s triad [i.e., (a) – (c) above], VR can detect residual abnormalities in the absence of self-reported symptoms by HAE recipients [66]. Connecting alterations in measures of motor control to in vivo brain alterations requires use of brain imaging such as resting-state fMRI (rs-fMRI), which measures the spontaneous neural activity in the brain and determines the default functional connectivity between brain regions. rs-fMRI has gained wide-spread attention and is used to investigate brain functional connectivity in the normal healthy brain [67–71] as well as in many clinical populations including mTBI [9,12,13,72–74]. A functional connectome (FC) is a symmetric square matrix that estimates the level of functional coupling of blood oxygenation level dependent (BOLD) signal between pairs of brain regions. Robust individual differences in FCs have been termed a “fingerprint”, and are demonstrated by the self-identification of subjects by correlating repeat visits of the same subject [75–81]. Use of an association between individual differences in FCs (i.e., fingerprints) and motor control behavior would provide a fundamental way to assure that alterations in transcriptomics and metabolomics were related to HAEs, and not just inflammation based on orthopedic injury. Given this background, this study evaluated the hypothesis that omic measures from the transcriptome (miRNA) and metabolome (individualized biochemistry) would mediate or moderate the relationship of rs-fMRI measures to motor control behavior. We hypothesized these effects would be observed through three-way associations with computational behavior measures 6 across a season (i.e., pre- vs. post-season) of a football team. We predicted omic measures would show mediation/moderation relationships with imaging and behavioral measures, in line with our prior experience with imaging omics [60]. For the miRNA measures, we focused on a panel of miRNAs that had previously been shown to be abnormal relative to emergency room controls in the pre-season, and across season for football players, and that were known to be involved in the control of inflammation [45]. For metabolomics, we focused only on compounds significantly altered over the season (with FDR correction and/or random forest assessment) that were (i) fatty acids and/or compounds involved with energy metabolism, (ii) compounds involved with stress/inflammatory responses, or (iii) exogenous compounds related to consumption. For rs-fMRI, a network fingerprint approach was used to compare rs-fMRI measures across the season. The broad focus of this imaging-omics study was to determine if identified metabolomic compounds and inflammatory miRNAs mediated or moderated the relationship of rs-fMRI measures to motor control behaviors. Such a finding would go beyond the idea of an intermediate phenotype in “imaging genetics” [82], wherein brain imaging acts as a common, overlapping node between two associations, and point to the quantitative integration of measures across multiple scales, using omics as opposed to genetic measures (that cannot be easily manipulated for clinical purposes). Omic measures such as metabolomics can be clinically manipulated, and their linkage to imaging and computational behavior suggests the possibility for mechanistic insight. To date, we do not believe that imaging omics using metabolomic measures has been described in the literature. To do this, given the large number of associations needing testing for this study, we integrated permutation-based statistics with mediation/moderation analyses to directly address the issue of multiple comparisons. 7

Methods

Participants and data collection

Twenty-three (23) male collegiate varsity student football athletes (mean  standard deviation = 21  . Blood samples were taken prior to any contact practices (

Pre ) and within one week of the last game (

Post ). None of the subjects had a diagnosed concussion in the 9 months preceding preseason data collection. Blood samples were prepared and sent out for miRNA quantification and metabolomic analysis. Concurrent with blood collection, players also underwent virtual reality (VR) testing and MR imaging sessions. HAE measures were collected over the season (i.e., between

Pre and

Post blood, imaging, and VR sampling) as described below.

Head Acceleration Events (HAEs) monitoring

Head acceleration events (HAEs) were monitored at all contact practice sessions (max = 53; no games were monitored) using the BodiTrak sensor system from The Head Health Network. Sensors were mounted in each active player’s helmet prior to contact. Sensor outputs included peak translational acceleration (PTA; G-units) and impact location. Two G-unit thresholds i.e. 25G and 80G were selected based on previous reports of impacts related to brain health and injury [83]. For each i -th athlete the HAEs were quantified as cumulative number of hits exceeding the threshold Th = 25G and 80G (cHAE and cHAE ): 8 𝑐𝐻𝐴𝐸 𝑇ℎ,𝑖 = ∑ 𝑢(𝑃𝑇𝐴 𝑘,𝑖 − 𝑇ℎ)

𝑁𝑘=1 𝑤ℎ𝑒𝑟𝑒 𝑢(𝑥) = { 1 𝑖𝑓 𝑥 > 00 𝑖𝑓 𝑥 ≤ 0

The average number of hits exceeding 25G and 80G per session for i -th athlete (aHAE and aHAE ) is given as 𝑎𝐻𝐴𝐸 𝑇ℎ,𝑖 = 𝑐𝐻𝐴𝐸

𝑇ℎ,𝑖 𝑠𝑒𝑠𝑠𝑖𝑜𝑛𝑠 𝑖 Serum Extraction

Five mL of blood were drawn from each participant at

Pre and

Post sessions. Samples were placed in a serum separator tube, allowed to clot at room temperature, and then centrifuged. Serum was extracted from each tube and pipetted into bar-coded aliquot tubes. Serum samples were stored at -70 ° C until they were transported to 1) a central laboratory for blinded miRNA batch analysis [45] and 2) Metabolon (Morrisville, NC, USA) for blinded metabolite analysis. miRNA quantification

Serum samples collected at

Pre and

Post sessions were used to isolate and quantify levels of RNA. 100 𝜇 L of serum was aliquoted and RNA was isolated using a serum/plasma isolation kit (Qiagen 9 Inc., Venlo, Netherlands) as per the manufacturer’s protocol. RNA was eluted in 20 𝜇 L of DNAse/RNAse-free water and stored at -80 ° C until further use. Droplet digital PCR (ddPCR; Bio-Rad Inc., Hercules, CA, USA) was used to quantify absolute levels of nine miRNA (miR-20a, miR-505, miR-3623p, miR-30d, miR-92a, miR-486, miR-195, miR-93p, miR-151-5p) [45]. Prior to ddPCR analysis, RNA was checked for quality using a bioanalyzer assay with a small RNA assay. After quality confirmation, 10 ng of RNA was reverse transcribed using specific miRNA TaqMan assays as per the manufacturer’s protocol (Thermo Fisher Scientific Inc., Waltham, MA, USA). Protocol details can be found in [45]. The final PCR product was analyzed using a droplet reader (Bio-Rad Inc., Hercules, CA, USA). Total positive and negative droplets were quantified, and from this, the concentration of miRNA/ 𝜇 L of the PCR reaction was reported. All reactions were performed in duplicate.

Metabolomic Analysis

The remaining serum was sent to Metabolon (Morrisville, NC, USA) for metabolomic quantification. Upon arrival, samples were assigned a unique identifier via an automated laboratory system and stored at -80 ° C. Samples were prepared for subsequent analyses using an automated MicroLab STAR ® system (Hamilton Company, Reno, NV, USA). Proteins were precipitated out of each sample using methanol and a shaker (Glen Mills GenoGrinder 2000), and then centrifuged. The resulting extract was then divided into five fractions for various analyses: 1) two fractions for analysis by two separate reverse phase (RP)/UPLC-MS/MS methods with positive ion mode electrospray ionization (ESI), 2) one for analysis by RP/UPLC-MS/MS with negative ion mode ESI, 3) one for analysis by HILIC/UPLC-MS/MS with negative ion mode ESI, 10 and 4) one reserved for backup. To remove organic solvent, samples were briefly placed on a TurboVap ® (Zymark); samples were then stored under nitrogen overnight prior to analyses. Serum metabolites were quantified using Ultrahigh Performance Liquid Chromatography-Tandem Mass Spectroscopy (UPLC-MS/MS). All methods utilized a Waters ACQUITY UPLC and a Thermo Scientific Q-Exactive high resolution/accurate mass spectrometer interfaced with a heated electrospray ionization (HESI-II) source and Orbitrap mass analyzer operated at 35,000 mass resolution. The sample extract was dried and reconstituted in solvents compatible to each of the listed analyses. Each reconstitution solvent contained a series of standards at fixed concentrations to ensure injection and chromatographic consistency. One aliquot was analyzed using acidic positive ion conditions, chromatographically optimized for more hydrophilic compounds. In this method, the extract was gradient eluted from a C18 column (Waters UPLC BEH C18-2.1x100 mm, 1.7 µm) using water and methanol, containing 0.05% perfluoropentanoic acid (PFPA) and 0.1% formic acid (FA). Another aliquot was also analyzed using acidic positive ion conditions, however it was chromatographically optimized for more hydrophobic compounds. In this method, the extract was gradient eluted from the same afore mentioned C18 column using methanol, acetonitrile, water, 0.05% PFPA and 0.01% FA and was operated at an overall higher organic content. Another aliquot was analyzed using basic negative ion optimized conditions using a separate dedicated C18 column. The basic extracts were gradient eluted from the column using methanol and water, however with 6.5mM Ammonium Bicarbonate at pH 8. The fourth aliquot was analyzed via negative ionization following elution from a HILIC column (Waters UPLC BEH Amide 2.1x150 mm, 1.7 µm) using a gradient consisting of water and acetonitrile with 10mM Ammonium Formate, pH 10.8. The MS analysis alternated between MS and data-dependent MS n

11 scans using dynamic exclusion. The scan range varied slighted between methods but covered 70-1000 m/z. Peak analysis was conducted using a bioinformatics system which consisted of four major components: 1) the Laboratory Information Management System (LIMS - a system used to automate sample accession and preparation, instrumental analysis and reporting, and data analysis), 2) the data extraction and peak-identification software, 3) data processing tools for quality control and compound identification, and 4) a collection of information interpretation and visualization tools. Raw data were extracted, peak-identified, and QC processed using Metabolon’s hardware and software. Compounds were identified by comparison to library entries of purified standards. Biochemical identifications were based on three criteria: 1) retention index (RI) within a narrow RI window of the proposed identification, 2) accurate mass match to the library (+/- 10 ppm), and 3) the MS/MS forward and reverse scores between the experimental data and authentic standards. The MS/MS scores were based on a comparison of the ions present in the experimental spectrum to the ions present in the library spectrum. While there may have be similarities between these molecules based on one of these factors, the use of all three data points was utilized to distinguish and differentiate more than 3,300 registered biochemicals. Peaks were quantified using area-under-the-curve. A data normalization step was performed to correct variation resulting from instrument inter-day tuning differences (i.e. variation between pre and postseason analyses). Specifically, each compound was corrected in run-day blocks by registering the medians to equal one (1.00) and normalizing each data point proportionately. Data were then log-transformed. Of the 3,300+ potential biochemicals, 968 metabolites were analyzed at

Pre and

Post sessions. Of these, 161 showed significant, FDR- 12 corrected, increases or decreases between the

Pre and

Post sessions ( q -value < 0.05). Of those 161 metabolites, 40 were selected based on the following criteria: 1) they appeared in the random forest plot (20/40) (see Vike et al., 2020 [84]) and 2) they were hypothesized to change following repetitive HAEs. Based on these criteria, six macromolecule categories were defined: lipids (19/40), energy-related metabolites (5/40), xenobiotics (10/40), amino acids (3/40), carbohydrates (2/40), and nucleotides (1/40) [84]. These 40 metabolites were used in all subsequent analyses. Virtual Reality (VR) testing

Athletes completed a previously validated Virtual Reality (VR) neurocognitive testing with a 3D TV system (HeadRehab.com) and a head mounted accelerometer [8,60,65,66] for

Pre and

Post sessions. The test included three modules: Spatial Memory, sensory-motor reactivity or whole body Reaction Time and Balance accuracy. These tasks were based on findings from [61–63] for spatial navigation problems following head injuries in veterans. For the Spatial Memory module participants were shown a randomized virtual pathway including multiple turns to a door along with the return trip. Athletes were instructed to repeat the pathway from their memory using a joystick. The Spatial Memory score was based on correct responses versus the errors. For the Reaction Time module the participants stood feet shoulder width apart with hands on their hips. They were instructed to move their body in the same direction as the virtual room’s movements, and the accelerometer measured response time latency. For the Balance module, athletes were instructed to hold a modified tandem Romberg position for all trials. The first trial was baseline measure where the virtual room was still. Athletes were scored for the subsequent six trials where the virtual room moved in various directions, and the deviance of individual’s alignment with the virtual room was quantified via an accelerometer. 13 To facilitate interpretation, all scores were scaled such that higher score represents better performance. In addition to the individual scores from each module, an overall Comprehensive score was calculated by combining the three module scores into a ten-point scale (0 worst and 10 best) [66]. resting-state fMRI

Twenty (20) athletes participated in two MRI sessions consisting of a 10-min eyes-closed rs-fMRI scan with echo-planar imaging and the following parameters: time of echo (TE) = 35.8 ms, time of repetition (TR) = 2000 ms, flip angle = 90°, 72 contiguous 2-mm axial slices in an interleaved order, voxel resolution = 2 mm × 2 mm × 2 mm, matrix size = 104 × 104 and 300 total volumes. One high-resolution T scan using 3D magnetization prepared rapid acquisition gradient recalled echo (3D MPRAGE) sequence was acquired for registration and tissue segmentation purposes with the following parameters: TE = 1.77 ms, time of inversion (TI) = 850ms, TR = 1700ms, flip angle = 9°, matrix size = 320×260×176, voxel size=1mm×1mm×1mm, receiver bandwidth = 300 Hz/pixel, and parallel acceleration factor = 2. rs-fMRI data were processed using functions from AFNI [85] and FSL [86,87] using in-house MATLAB code explained in detail in [81]. rs-fMRI BOLD timeseries were processed in the subject’s native space and the first four volumes were discarded to remove spin history effects. Structural T images were denoised and segmented into gray matter (GM), white matter (WM) and cerebrospinal fluid (CSF) tissue masks. The 4D BOLD timeseries was then passed through outlier detection, despiking, slice timing correction, volume registration, aligned to the T structural scan, 14 voxel-wise spatial smoothing within tissue masks, scaled to a maximum (absolute value) of 200, and the data were censored to remove outlier timepoints (with the censoring criteria as in [81]). The timeseries were then detrended using no global signal regression with the following common regressors: (1) very low frequency fluctuations as derived from a bandpass [0.002–0.01 Hz] filter, the six motion parameters and their derivatives, and the voxel-wise local neighborhood (40mm) mean WM timeseries. For connectivity analysis on a regional basis, the grey matter brain atlas from [88] was warped to each subject’s native space by linear and non-linear registration. This brain parcellation consists of 278 regions of interest (ROIs). Note that data from the cerebellum (comprising a total of 30 ROIs) were discarded, because the acquired data did not completely cover this structure for all subjects. This resulted in a final GM partition of 248 ROIs. A functional connectivity matrix (namely the functional connectome; FC) was computed for each rs-fMRI scan through correlation of the mean time series from each of the 248 ROIs. The resulting square, symmetric FC matrices were not thresholded or binarized. Each FC matrix was ordered into seven cortical sub-networks, as proposed by Yeo et al. [89], and an additional eighth sub-network comprising sub-cortical regions was added [90]. These networks were: Visual (VIS), Somato-Motor (SM), Dorsal Attention (DA), Ventral Attention (VA), Limbic System (L), Fronto-Parietal (FP), Default Mode Network (DMN) and subcortical regions (SUBC). The fingerprint for FC and these eight networks was calculated by correlating the submatrix corresponding to each of the networks from Pre and

Post sessions [75,76,81]. The network fingerprint represents the similarity between repeat visits of the same participant. 15

Statistical Analysis

To assess changes across the season, “  ” values were calculated for all the aforementioned measures by subtracting Pre- session measures from

Post . For rs-fMRI networks, the fingerprint—representing the similarity between repeat visits of the same participant—was used instead of  values. After quality checks and removing missing data seventeen (17) subjects had both Pre and

Post data for all five measurements. All statistical analyses used the software package R [91]. Analyses involved identification of two-way associations, discovery of their overlap as three-way associations, and mediation/moderation testing within a permutation-based framework.

Two-way associations

Two-way associations were checked between nine network fingerprints (FC, Vis, SM, DA, VA, L, FP, DMN and SUBC), four  VR scores (Balance, Reaction Time, Spatial Memory and Comprehensive), nine  miRNA (miR-20a, miR-505, miR-3623p, miR-30d, miR-92a, miR-486, miR-92a, miR-93p, and miR-151-5p), 40  metabolites (complete list as mentioned in [84] and five HAE metrics (sessions, cHAE , aHAE , cHAE , and aHAE ). Linear regression was run between two variables and outliers were removed based on Cook’s distance [92], a robust approach to remove outliers. After outlier removal, linear regressions were re-run and all two-way associations with 𝑝 ≤ 𝑝 ≤ 𝑝 -value,  -coefficient and adjusted R (R adj2 ) . Three-way associations

All two-way associations with 𝑝 ≤ 𝑋 𝑡 , 𝑌 𝑡 , 𝑍 𝑡 defined below schematize the variables used, such as the network fingerprint, VR scores, miRNAs and metabolites. 𝑋 𝑡 = [𝑥 𝑡,1,1 ⋮𝑥 𝑡,1,𝑆 … 𝑥 𝑡,𝑁,1 ⋮𝑥 𝑡,𝑁,𝑆 ] , 𝑌 𝑡 = [𝑦 𝑡,1,1 ⋮𝑦 𝑡,1,𝑆 … 𝑦 𝑡,𝑀,1 ⋮𝑦 𝑡,𝑀,𝑆 ] , 𝑍 𝑡 = [𝑧 𝑡,1,1 ⋮𝑧 𝑡,1,𝑆 … 𝑧 𝑡,𝑃,1 ⋮𝑧 𝑡,𝑃,𝑆 ] where N is the total number of variables in matrix 𝑋 𝑡 , M and P are the total number of variables for matrices 𝑌 𝑡 and 𝑍 𝑡 respectively. S is the number of participants and the matrices were measured at two time points 𝑡 = 1 represents Pre and 𝑡 = 2 represents

Post- season measurements. Across season measures for VR scores, miRNAs and metabolites were calculated as ∆𝑋 = 𝑋 − 𝑋 ∆𝑌 = 𝑌 − 𝑌 ∆𝑍 = 𝑍 − 𝑍 Note that, for rs-fMRI instead of difference measures network fingerprint was computed by correlating the data at two time-points. In what follows, ∆𝒚 𝒋 ~∆𝒙 𝒊 will be used to denote the significant two-way association between any two variables (following the procedure described above). To quantify three-way associations, the following steps ( A – C , below) were performed

17 First, two-way associations were performed between all variables ∆𝒚 𝒋 and ∆𝒙 𝒊 from matrices ∆𝑌 and ∆𝑋 : Step A : ∆𝒚 𝒋 ~∆𝒙 𝒊 {∀ 𝑖 ∈ {1, ⋯ , 𝑁}∀ 𝑗 ∈ {1, ⋯ 𝑀} Second, two-way associations were performed between all variables ∆𝒛 𝒌 and ∆𝒚 𝒋 from matrices ∆𝑍 and ∆𝑌 : Step B : ∆𝒛 𝒌 ~∆𝒚 𝒋 {∀ 𝑗 ∈ {1, ⋯ 𝑀}∀ 𝑘 ∈ {1, ⋯ 𝑃} Third, two-way associations were performed between all variables ∆𝒛 𝒌 and ∆𝒙 𝒊 from matrices ∆𝑍 and ∆𝑋 : Step C : ∆𝒛 𝒌 ~∆𝒙 𝒊 {∀ 𝑖 ∈ {1, ⋯ , 𝑁}∀ 𝑘 ∈ {1, ⋯ 𝑃} Finally, three-way associations between any three variables were formed if all three above steps resulted in significant two-way associations for the common variables, leading to the relation: ∆𝒙 𝒊 ~∆𝒚 𝒋 ~∆𝒛 𝒌 ~∆𝒙 𝒊 Mediation Analysis

All three-way associations were tested for mediations with  VR scores as the dependent variable (DV). This prior hypothesis was observed in previous studies [60]. The independent variable (IV) and mediator (M) were chosen from network fingerprints,  miRNAs and 18  metabolites. Please see the Supplemental material for a detailed description of the mediation procedure. Moderation Analysis

All three-way associations were also tested for moderation. The moderation model proposes that the strength and direction of the relationship between independent variable (IV) and dependent variable (DV) is controlled by the moderator variable (M). The IV, DV and M were chosen from network fingerprints,  miRNAs,  metabolites and  VR scores. The moderation is characterized by the interaction term between IV and M in the linear regression equation as given below:

𝐷𝑉 = 𝛽 + 𝛽 𝐼𝑉 + 𝛽 𝑀 + 𝛽 (𝐼𝑉 ∗ 𝑀) + 𝜖 Moderation is significant if 𝑝 𝛽 ≤ 0.05 and 𝑝 𝐹 ≤ 0.05 , where 𝑝 𝛽 ≤ 0.05 indicates that 𝛽 is significantly different than zero using a t-test and 𝑝 𝐹 is the 𝑝 -value associated with the overall F-test for the regression equation suggesting that the overall linear relationship is significant. Permutation Testing

To control for the occurrence of false positives due to multiple hypotheses testing, permutation-based moderation analysis was conducted. Permutation tests re-sample observations from the original data multiple times to build empirical estimates of the null distribution for the test statistic being studied [93,94]. Permutation-based tests are especially well-suited for studies with small sample sizes as they estimate the statistical significance directly from the data being 19 analyzed rather than making assumptions about the underlying distribution. First, the test statistic is obtained from the original data set, then the data is randomly permuted multiple (Q) times and the test statistic is computed on each permutated data set. The statistical significance is computed by counting (K) the number of times the statistic value obtained in the original data set was more extreme than the statistic value obtained from the permuted data sets, and dividing that value by the number of random permutations (K/Q) [93]. For this study, permutation-based moderation analysis was performed for all three-way associations following the steps listed below: 1.

Moderation analysis was performed by assigning the original data variables ∆𝒙 𝒊 , ∆𝒚 𝒋 , ∆𝒛 𝒌 as IV, DV and M to obtain reference test-statistics: 𝑡 and 𝐹 . Only variables that formed three-way associations were considered. 2. Data permutation: values were randomly selected from 𝒙 and 𝒙 to assign to 𝒙 and 𝒙 . 3. Across season measures were computed from the permuted dataset ∆𝒙 𝒊′ = 𝒙 − 𝒙 . Similarly, ∆𝒚 𝒋′ and ∆𝒛 𝒌′ were computed. 4. Moderation analysis was performed on the permuted dataset ∆𝒙 𝒊′ , ∆𝒚 𝒋′ , ∆𝒛 𝒌′ by assigning as IV, DV and M and the test statistics: 𝑡 𝑞′ and 𝐹 𝑞′ were obtained. 5. The counter variable 𝐾 was incremented by one if absolute value of 𝑡 was greater than absolute value of 𝑡 𝑞′ . 𝐾 was incremented by one if absolute value of 𝐹 was greater than absolute value of 𝐹 𝑞′ . Steps 2-6 were repeated: 𝑞 = 1,2, ⋯ , 𝑄 times. Here,

𝑄 = 100,000 . 20 8.

Permutation-based p -value 𝑝 𝛽 𝑝𝑒𝑟𝑚 was calculated as the proportion of the 𝑡 𝑞′ values that are as extreme or more extreme than 𝑡 i.e. 𝐾 /𝑄 . Permutation-based p -value 𝑝 𝐹𝑝𝑒𝑟𝑚 was computed from 𝐹 and 𝐹 𝑞′ i.e. 𝐾 /𝑄. Moderation analysis was considered significant if 𝑝 𝛽 𝑝𝑒𝑟𝑚 ≤ 0.05 and 𝑝 𝐹𝑝𝑒𝑟𝑚 ≤ 0.05 . 21

Results

This analysis integrated four types of measure at two time-points

Pre and

Post from a competition season of American collegiate football using a permutation-based mediation/moderation framework. The four measures integrated in mediation/moderation were: network fingerprint (reflecting changes in brain connectivity), change in metabolomics, changes in miRNA and changes in VR-based motor control. For such a mediation/moderation framework, two-way associations were identified using Cook’s distance assessments for outlier removal, and then three-way associations were identified through the overlap of two-way associations. Mediation testing was done in a directed manner, with motor control behavior always being the dependent variable following prior methods (e.g., as seen by Chen et al., 2019 [60]). Moderation testing looked at all possibilities for the dependent variable. All mediation/moderation testing used a permutation-based framework to provide protection against false positives due to multiple comparisons. Secondary analyses done after permutation-based mediation/moderation included (i) assessment of the brain imaging measures in athletes relative to age-matched controls, and (ii) testing of association between the four measures integrated in mediation/moderation against HAE measures.

Two-way associations

Linear regression following Cook’s outlier removal was used to assess significant ( 𝑝 ≤  VR scores (Balance, Reaction Time, Spatial Memory and Comprehensive), nine  miRNA (miR-20a, miR-505, miR-3623p, miR-30d, miR-92a, miR-486, miR-92a, miR-93p, and miR-151-5p), 40  metabolites (complete list as mentioned 22 in [84]. Secondary analyses included assessments of these four categories of measure against the five HAE metrics (sessions, cHAE , aHAE , cHAE , and aHAE ). Supplemental Table 1 (A-J) report the significant negative or positive pairwise associations between any two variables with number of Cook’s outliers removed, 𝑝 -value,  coefficient and adjusted R (R adj2 ) for the regression. The common significant pairwise associations were then used to build three-way associations. Three-way associations

Three-way associations were formed from the common significant pairwise associations between any three variables ∆𝒙 𝒊 , ∆𝒚 𝒋 and ∆𝒛 𝒌 where these could be any of the network fingerprints,  VR scores,  miRNAs and/or  metabolites (see Methods for details). Table 1 (A-C) lists all three-way associations that resulted from the common pairwise associations as described in Methods, and were used for directed mediation and moderation analysis. Mediation and Moderation results

The three-way associations were tested for directed mediation analysis with  VR as the dependent variable. The independent variable and mediator were chosen from network fingerprint,  miRNA and/or  metabolite. None of the three-way associations resulted in a significant mediation. The three-way associations identified and listed in Table 1 (A-C) were then tested for moderation analysis across network fingerprint,  VR,  metabolite and  miRNA. Table 1 (A-C) 23 lists the significant moderation result ( 𝑝 𝐹𝑝𝑒𝑟𝑚 ≤ 𝑝 𝛽 𝑝𝑒𝑟𝑚 ≤ (A) (B) (C) Table 1 . Table (A), (B) and (C) lists all triangulations. 𝛽 𝐴 , 𝑝 𝐴 are the nominal interaction results between X and Y variables; 𝛽 𝐵 , 𝑝 𝐵 are the nominal interaction results between Y and Z variables and 𝛽 𝐶 , 𝑝 𝐶 are the nominal interaction results between X and Z variables. Cook’s outliers ( k / n ) lists the number of outliers k , based on Cook’s distance, removed out of the total n samples for moderation analysis. Significant moderation results are marked in bold and italics. Table (A) lists the results between  Metabolite, Network fingerprint and  VR tasks. Table (B) lists the results between  miRNA,  Metabolite and Network fingerprint. Table (C) lists the results between  miRNA, Network fingerprint and  VR tasks

One significant moderation result is shown in Figure 1(A-C) between  VR Balance (dependent variable), and DMN fingerprint,  tridecenedioate (independent variable, moderator) with 𝑝 𝐹𝑝𝑒𝑟𝑚 = 0.03, 𝑝 𝛽 𝑝𝑒𝑟𝑚 = 0.05. The regression slope (  ), 𝑝 -value and adjusted R ( R adj2 ) depicted on each arm of the triangle corresponds to the pairwise interaction results, after Cook’s distance outliers removed, between those two variables with 𝑝 ≤ 0.05 . Figure 1(B) shows the pairwise interactions between the three variables  VR Balance, DMN fingerprint and  tridecenedioate. There was a negative relationship between DMN fingerprint and  tridecenedioate, a negative relationship between  tridecenedioate and  VR Balance and a positive relationship between DMN fingerprint and  VR Balance. Figure 1(C) plots the relationship between DMN fingerprint and 24  VR Balance with  tridecenedioate as the moderator. Three lines corresponds to low (mean - standard deviation), medium (mean) and high (mean + standard deviation)  tridecenedioate values. A second significant moderation result is shown in Figure 1(D-F) between DMN fingerprint (dependent variable), and  miR-505,  tridecenedioate (independent variable, moderator) with 𝑝 𝐹𝑝𝑒𝑟𝑚 = 0.02, 𝑝 𝛽 𝑝𝑒𝑟𝑚 = 0.02. The regression slope (  ), 𝑝 -value and adjusted R ( R adj2 ) depicted on each arm of the triangle corresponds to the pairwise interaction results, after Cook’s distance outliers removed, between those two variables with 𝑝 ≤ 0.05 . Figure 1(E) plots the pairwise interactions between these three variables. There was a positive relationship between  miR-505 and  tridecenedioate, a negative relationship between DMN fingerprint and  tridecenedioate and a negative relationship between DMN fingerprint and  miR-505. Figure 1(F) depicts the relationship between  miR-505 and DMN fingerprint with  tridecenedioate as the moderator. Three lines corresponds to low (mean - standard deviation), medium (mean) and high (mean + standard deviation)  tridecenedioate values. 25 Figure 1 . (A-C) Significant moderation analysis result. (A)  VR Balance was the dependent variable, DMN fingerprint and  tridecenedioate were independent variable and moderator ( 𝑝 𝐹𝑝𝑒𝑟𝑚 = 0.05, 𝑝 𝛽 𝑝𝑒𝑟𝑚 = 0.03 ). The regression slope (  ), 𝑝 -value and adjusted R ( R adj2 ) depicted on each arm of the triangle corresponds to the significant interaction results, after Cook’s distance outliers removed, between those two variables with 𝑝 -values reported at a significance level of 0.05. (B) Interaction plots corresponding to the moderation analysis. There was negative relationship between DMN fingerprint and  tridecenedioate; there was negative relationship between  tridecenedioate and  VR Balance; there was positive relationship between DMN fingerprint and  VR Balance. (C ) The plot depicts the relationship between DMN fingerprint and  VR Balance with  tridecenedioate as the moderator. Three lines corresponds to low (mean - standard deviation), medium (mean) and high (mean + standard deviation)  tridecenedioate values. (D-F) Significant moderation analysis result. (D)

DMN fingerprint was the dependent variable, and  miR-505 and  tridecenedioate were independent variable and moderator ( 𝑝 𝐹𝑝𝑒𝑟𝑚 = 0.02, 𝑝 𝛽 𝑝𝑒𝑟𝑚 =0.02 ). The regression slope (  ), 𝑝 -value and adjusted R ( R adj2 ) depicted arm of the triangle corresponds to the significant interaction results, after Cook’s distance outliers removed, between those two variables with 𝑝 -values reported at a significance level of 0.05. (E) Interaction plots corresponding to the moderation analysis. There was positive relationship between  tridecenedioate and  miR-505; there was negative relationship between  tridecenedioate and DMN fingerprint; there was negative relationship between DMN fingerprint and  miR-505. (F) The plot depicts the relationship between DMN fingerprint and  miR-505 with  tridecenedioate as the moderator. Three lines corresponds to low (mean - standard deviation), medium (mean) and high (mean + standard deviation)  tridecenedioate values. Given the same metabolomic measure moderated both results, Figure 2 integrates the two moderation analyses results along with HAE metrics. Figure 2(A) shows that  VR Balance was the dependent variable and DMN fingerprint acted as the independent variable for one moderation and dependent variable for the other.  tridecenedioate acted as the moderator for the two moderation analyses. In addition, there was a positive relationship between  miR-505 and aHAE as depicted in Figure 2(B). Figure 2(C) shows the negative relationship between DMN fingerprint and cHAE . 27 Figure 2 . This figure combines the moderation analyses results with HAEs. (A)  VR Balance was the dependent variables and  miR-505 was the independent variable. DMN fingerprint acted as independent variable for one moderation and as dependent variable for the other one.  tridecenedioate acted as the moderator for the two analyses. The regression slope (  ), 𝑝 -value and adjusted R ( R adj2 ) depicted arm of the diamond corresponds to the significant interaction results, after Cook’s distance outliers removed, between those two variables with 𝑝 -values reported at a significance level of 0.05. This figure depicts the integration of resting state fMRI with metabolic profiling, transcriptomics and computational virtual reality (VR) behavior task along with Head Acceleration Events (HAEs). (B) Interaction plots between Network fingerprint,  miRNA with Head Acceleration Events (HAEs). The regression slope (  ), 𝑝 -values and correlation coefficient (r) depicted on top of the plots corresponds to the interaction results, after Cook’s distance outliers removed, between two variables with 𝑝 -values reported at a significance level of 0.05. There was a positive relationship between  miR-505 and average number of HAEs above 25G per session (aHAE ). (C) There was a negative relationship between DMN fingerprint and cumulative number of HAEs above 80G (cHAE ). rs-fMRI between Age-Matched Non-Athletes (NAth) and Football Athletes (Ath) To facilitate interpretation of the permutation-based mediation/moderation analyses, network fingerprint strength was compared between age-match non-athletes (NAth) and football athletes (Ath) in this study. 28 Figure 3(A) shows the brain rendering of average network fingerprint strength across the cohort for NAth and Ath. The NAth exhibited stronger fingerprint for each of the resting-state fMRI network: Vis, SM, DA, VA, L, FP, DMN and SUBC. Figure 3(B) shows the boxplots for all the network fingerprints for age-match non-athletes and football athletes. Age-matched non-athletes exhibited significantly higher network fingerprint values, using two-sample t-test following Bonferroni correction, for all networks except the Limbic system (L), as compared to the football athletes.

Figure 3.

Network fingerprint strength analysis for each of Yeo’s resting state functional networks between football athletes and age-matched non-athletes. (A)

Brain rendering of average network fingerprint across football athletes and age-matched non-athletes. The strength per brain region depicted in the figure is the network fingerprint of the cohort to which the brain region belongs. (B)

Boxplots of network fingerprint for each of Yeo’s resting state functional networks between football athletes and age-matched non-athletes. Note that football athletes have significantly lower network fingerprint than the age-matched non-athletes for all networks except Limbic system (L) using two-sample t-test followed with Bonferroni correction. Yeo’s resting functional networks [89]: Visual (VIS), Somato-Motor (SM), Dorsal Attention (DA), Ventral Attention (VA), Limbic system (L), Fronto-Parietal (FP), Default Mode Network (DMN), and subcortical regions (SUBC). Discussion

This study evaluated the hypothesis that omic measures from the transcriptome (miRNA) and metabolome (individualized biochemistry) would mediate or moderate the relationship of rs-fMRI measures to motor control behavior collected

Pre and

Post season in collegiate American football athletes. Using a rigorous outlier removal procedure and permutation analyses for multiple comparisons correction, this hypothesis was confirmed for one metabolomic measure showing multiple moderation effects. Specifically, a fatty acid, tridecenedioate, was found to moderate the relationship of the DMN fingerprint from rs-fMRI to motor control (balance) behavior. The DMN network was shown to have less self-similarity over the course of the season in football players relative to age-matched controls, and it showed a negative relationship with HAEs accumulated over the season (i.e., more HAEs meant lower similarity in DMN across the season). Further, HAEs showed a positive association with changes in miR-505 levels, indicating that increasing HAEs were associated with increasing levels of a neuroinflammatory-associated miRNA. Lastly, tridecenedioate moderated the relationship of miR-505 to the DMN network. Tridecenedioate is a long-chain monounsaturated (i.e., one double bond) dicarboxylic fatty acid. Such compounds are hypothesized to act as reactive oxygen species (ROS) scavengers given the double bond stability and reactivity. Interestingly, similar monounsaturated compounds, as observed [84], were also decreased (e.g., heptenedioate, hexedecenedioate). Long-chain polyunsaturated fatty acids (PUFAs) have anti-inflammatory properties via the reduction of ROS and direct superoxide scavenging [95], while monounsaturated fatty acids (MUFAs) display neuroprotective effects and ameliorate brain injury in animal models [96]. Unsaturated FAs are also shown to efficiently antagonize the toxic action of saturated non-esterified fatty acids [97], and the presence of the double bond corresponds to increased antioxidant and neuroprotective 31 effects [98]. Thus, the association between across-season decreases in tridecenedioate with inflammatory miR-505 (see Supplemental Figure1) implicate its role in ROS scavenging and inflammation resolution. Overall, the selected miRNAs have been implicated in cancer, systemic inflammation, and central nervous system disorders [99–107]. Specifically, miR-505 is negatively correlated with progression in various cancers, inhibiting tumorigenesis and acting as a tumor suppressor in malignancies such as glioblastoma, endometrial carcinoma, hepatocellular carcinoma and prostate cancer [108–111]. In osteosarcoma tissues, for instance, the reduced expression of miR-505 is significantly associated with poorer clinical prognosis [112]. miR-505 has been reported to inhibit osteosarcoma and hepatoma cell proliferation, migration and invasion by regulating the high-mobility group box 1 (HMGB1), which functions as a damage-associated molecular pattern that propagates infection- or injury-elicited inflammatory responses [112,113]. In addition to its oncogenic associations, miR-505 was found to be significantly decreased in the blood of Ulcerative Colitis patients, thus playing a critical role in chronic inflammatory bowel disease [114]. Furthermore, miR-505 has been implicated in Alzheimer’s disease [115] and identified as a predictive biomarker for Parkinson’s disease (PD) [116]. In PD, miR-505 had a potential functional role via inverse modulation of neural proliferation differentiation and control 1 (NPDC1), a down-regulator of neural proliferation [117]. miR-505 was also shown to be upregulated in both mild to moderate and severe TBI in human subjects [46]. Here, involvement of miR-505 suggests a state of neuroinflammation in asymptomatic football athletes due to repetitive HAE exposure. miR-505 was not only associated with decreased tridecenedioate, but also with decreased self-identifiability in the DMN. 32 Default Mode Network (DMN) is the resting state fMRI network that appeared in both moderation results. Alterations in this network have been associated with numerous neurological disorders. DMN, an ensemble of cortical regions, typically including parts of the anterior and the posterior cingulate cortices, is a task-negative resting state network that is active while at rest and deactivated during a task. Apart from being active during rest, DMN persists in passive sensory processing tasks with minimal cognitive demands [70,118,119] but switches off during externally cued tasks with high cognitive demands. DMN is essential for basic, and possibly subconscious, processing necessary for calibrating affective and autonomic states of the brain [70]. DMN is also critical for retrieval of episodic memory [70,120–123] and altered connectivity has been observed in patients with Alzheimer’s disease [70,124–128]. Altered DMN connectivity, and the subsequent inability to deactivate it for cognitive tasks, has also been observed in patients with schizophrenia [129,130], depression [131–133], epilepsy [134–136], multiple sclerosis [137], Autism Spectrum Disorder [138] and contributes to executive deficits in Parkinson Disease (PD) patients [139] such as PD-associated saccadic hypometria [140] and altered resistance to passive movement/motor performance in PD patients [141,142]. Altered DMN connectivity has also been observed in concussed and TBI patients [12,143–151]. Despite these profound network changes, relating complex neuroimaging such as rs-fMRI to observable behavior remains elusive. Core features of the diagnosis of concussion and accumulated HAEs over a season, are disturbances of motor control in the form of coordination, balance, or the capacity to navigate as investigated by Alexander Luria [61–63]. Numerous studies have attempted to uncover behavioral changes in contact sport athletes, but many tests are insensitive to subtle changes in behavior [152–156]. Recent advances in virtual reality (VR) has prompted the investigation of these deficits in concussed athletes with the use of a validated methodology [64,65]. In order to observe subtle 33 changes in contact sports athletes with or without concussion, it is critical to study associations between motor control behavior and other measures, such as metabolomics and brain imaging. In this study, changes in VR Balance scores over the season were found to be associated with DMN fingerprint and  tridecenedioate. Better performance in the VR Balance task was related to higher DMN similarity (i.e., higher fingerprint values) over the season and lower levels of the ROS scavenging compound, tridecenedioate. This implied that improved motor control across the season of play was associated with smaller changes in the brain’s executive network and more efficient neuroinflammation resolution. Combining complex computational VR tasks with other indicators of neurological dysfunction (i.e., metabolomics and neuroimaging) proved to be a powerful method to uncover subtle, and unobservable changes in brain function related to repetitive HAEs. DMN fingerprint and  miR-505 were also found to be correlated with the HAE metrics. First, the DMN fingerprint, which depicts similarity between repeat visits for a specific participant, was found to be significantly lower in football athletes as compared to age-matched non-athletes. This implies that the DMN network, which is critical for executive functions, is altered in football athletes. Specifically, the fingerprint was negatively associated with the cumulative number of high magnitudes hits; namely, the higher the number of cHAE , the lower the similarity in DMN across the season. Second,  miR-505 was found to be positively associated with aHAE ; therefore, elevated levels of an inflammatory miRNA were positively associated with a higher number of average HAEs over the season. One interpretation of these results is that limiting HAE exposure over the season could potentially mitigate the alterations observed in these asymptomatic athletes. 34 The primary limitation of the study was the modest sample size for which all five measures were available: transcriptome, metabolome, MR imaging, VR behavior, and HAE data. It is critical to note that with the highly rigorous mathematical approach used in this study to analyze the dataset, results are deemed highly reproducible despite this potential limitation. Another potential limitation of the study was that the HAE metrics were only allowed to be collected at practices and not during competitive play, which resulted in a limited picture of the true exposure to “subconcussive” events. To observe the transient changes that football athletes go through throughout a competition season, these metrics would need to be collected over more impact events in future studies. Future studies might also include comparison of the aforementioned metrics collected for football athletes to age-matched, non-contact sports athletes. Conclusion

This study evaluated the associations between metabolome, transcriptome, brain imaging, and behavior measures, collected in collegiate football athletes before and after the season, as a function of mechanical accelerations to the head. It was found that  tridecenedioate moderated the relationship between 1)  miR-505 and DMN fingerprint and 2) the relationship between DMN fingerprint and  VR Balance measures. Metabolomics provides a viable method for investigating metabolic fluctuations and biomarkers of brain injury [157,158], but often lack a relationship to neurological function or dysfunction (observed with brain imaging or behavioral testing) and other markers of neuroinflammation (i.e. miRNAs). Here, a metabolomic measure was found to be associated with the transcriptome, brain imaging, and behavior, elucidating it as a potential HAE-associated blood biomarker of neurological dysfunction. The integration of these multi-scale biological measures through a permutation-based approach uncovered a mechanism of putative 35 brain injury due to HAEs as summarized in Figure 4 and described in detail in [84]. The findings of this study are synergistic with the findings of Vike and colleagues in that markers of mitochondrial distress were linked to HAEs, and metabolites related to fatty acid oxidation (e.g. sebacate) and the TCA cycle (e.g. citrate) were observed to mediate the relationship between miRNAs and behavior [84]. Mitochondrial dysfunction is suspected to increase reactive oxygen species production, which is supported by the alterations we observed with  tridecenedioate, which was in turn linked to chronic elevations in inflammatory miRNAs and alterations in brain imaging and behavior. The rigorous permutation-based mediation/moderation that integrated omic measures with brain imaging and behavior suggests a viable methodology for investigating complex multi-scale biological data. Altogether, this study suggests 1) a potential mechanism of brain injury due to repetitive HAEs and 2) a permutation-based mediation/ moderation approach to integrate multi-scale data that can be applied broadly. This last point is of importance in the context of current controversies about the relevance of animal models of functional brain illness, such as with mental illness, and suggests one possible route forward for developing mechanistic hypotheses from the study of humans alone [159–165]. 36 Figure 4.

Synopsis figure summarizing the interplay between moderation variables. Repetitive head accelerate events (HAEs) in football athletes have been related to changes in brain homeostasis, as evidenced by neuroimaging studies [9,11–16,19]. Additionally, a study using the same collegiate football dataset [84] pointed to mitochondrial distress and subsequent energy imbalance, as evidenced by increased medium-chain fatty acids and decreased TCA metabolites. Furthermore, the metabolites were observed to mediate the relationship between elevated miRNAs and Luria behavior. In the present study, significant moderation effects were observed with default mode network (DMN) fingerprint, miR-505, VR balance task performance, and tridecenedioate. Tridecenedioate, a dicarboxylic fatty acid with one double bond, significantly decreased from pre- to post-season. The double bond in this metabolite is hypothesized to act as a reactive oxygen species (ROS) scavenger. Indeed, previous research supports an increase in reactive oxygen species (ROS) following repetitive exposure to head impacts [166]; therefore, a depletion of tridecenedioate would support its role as an ROS scavenger. Even with scavenging, ROS can still remain elevated and cause damage to neurophysiological systems [166]. This may ultimately result in damage to neuronal connectivity as observed by decreased similarity in DMN and interactions with VR balance task performance. Together, these complex relationships may explain why behavioral changes in subconcussed athletes are not consistently observed, but how repetitive, long-term exposure to HAEs, chronic increases of inflammatory-miRNAs, and acute changes to resting state networks could result in behavioral disturbances later in life Supplementary Material

Age Matched Non-Athletes (NAth)

A cohort of sixteen undergraduate and graduate students (age: mean ± st. deviation = 22.8 ± scan and a ten minutes eyes-open resting-state fMRI scan. The imaging parameters for rs-fMRI consisted of blipped echo-planar imaging: TR/TE = 2000/26 msec; flip angle = 35  ; 34 slices; acceleration factor = 2; Field of View = 20 cm; voxel size = 3.125 x 3.125 x 3.80 mm and 294 volumes. The details of this imaging data are described in Bari et al., 2019 [81]. This dataset was processed using the steps listed in the Methods section and as described in [81]. Network fingerprint comparisons were conducted between age-matched non-athletes and football athletes using two-sample t-test following Bonferroni correction, Permutation-based Mediation Analysis

Mediation seeks to clarify the causal relationship between the independent variable (IV) and dependent variable (DV) with the inclusion of a third variable mediator (M). Mediation model proposes that instead of a direct causal relationship between IV and DV, the IV influences M which then influences the DV. Beta coefficients ( β ) and standard error (se) terms from the following linear regression equations are used to calculate the Sobel p- value and mediation effect percentage (T eff ) using the following steps: 38 𝑆𝑡𝑒𝑝 1 (𝑃𝑎𝑡ℎ

𝐴): 𝑀 = 𝛽 + 𝛽 (𝐼𝑉) + 𝜖 𝐴 𝑆𝑡𝑒𝑝 (𝑃𝑎𝑡ℎ 𝐵): 𝐷𝑉 = 𝛽 + 𝛽 (𝑀) + 𝜖 𝐵 𝑆𝑡𝑒𝑝 (𝑃𝑎𝑡ℎ 𝐶, 𝑚𝑜𝑑𝑒𝑙 𝐷𝑉 = 𝛽 + 𝛽 (𝐼𝑉) + 𝜖 𝑆𝑡𝑒𝑝 (𝑃𝑎𝑡ℎ 𝐶, 𝑚𝑜𝑑𝑒𝑙 𝐷𝑉 = 𝛽 + 𝛽 (𝐼𝑉) + 𝛽 (𝑀) + 𝜖 Sobel’s test is used to test if 𝛽 was significantly lower than 𝛽 using the following equation: (3) 𝑆𝑜𝑏𝑒𝑙 𝑧 − 𝑠𝑐𝑜𝑟𝑒 = (𝛽 − 𝛽 )√[(𝛽 ) (𝜖 ) ] + [(𝛽 ) (𝜖 ) ] Using a standard 2-tail z-score table, the Sobel p -value is determined from Sobel z-score. Mediation effect percentage T eff is calculated using the following equation: (4) 𝑇 𝑒𝑓𝑓 = 100 ∗ (𝛽 ∗ 𝛽 )(𝛽 ∗ 𝛽 ) + [𝛽 − (𝛽 ∗ 𝛽 )] For this study, permutation-based mediation analysis was performed for all three-way associations following the steps listed below: 39 1.

Mediation analysis was performed by assigning the original data variables ∆𝒙 𝒊 , ∆𝒚 𝒋 , ∆𝒛 𝒌 as IV, DV and M to obtain reference Sobel z-score: 𝑧 and 𝑇 𝑒𝑓𝑓 . Only variables that formed three-way associations were considered. 2. Data permutation: values were randomly selected from 𝒙 and 𝒙 to assign to 𝒙 and 𝒙 . 3. Across season measures were computed from the permuted dataset ∆𝒙 𝒊′ = 𝒙 − 𝒙 . Similarly, ∆𝒚 𝒋′ and ∆𝒛 𝒌′ were computed. 4. Mediation analysis was performed on the permuted dataset ∆𝒙 𝒊′ , ∆𝒚 𝒋′ , ∆𝒛 𝒌′ by assigning as IV, DV and M and the test statistics: 𝑧 𝑞′ was obtained. 5. The counter variable 𝐾 was incremented by one if absolute value of 𝑧 was greater than absolute value of 𝑧 𝑞′ . Steps 2-5 were repeated: 𝑞 = 1,2, ⋯ , 𝑄 times. 7.

Permutation-based p -value 𝑝 𝑆𝑜𝑏𝑒𝑙𝑝𝑒𝑟𝑚 was calculated as the proportion of the 𝑧 𝑞′ values that are as extreme or more extreme than 𝑧 i.e. 𝐾/𝑄 . Mediation analysis was considered significant if 𝑝 𝑆𝑜𝑏𝑒𝑙𝑝𝑒𝑟𝑚 ≤ 0.05 and 𝑇 𝑒𝑓𝑓 > 50% . Results

Supplemental Figure 1 . Boxplots comparing the Pre- and Post-season distributions of (A) tridecenedioate (B) VR Balance (C) miR-505 and (D) DMN connectivity. (A) (B) (C) (D) (E) (F) (G)

44 (H) (I) (J)

Supplemental Table 1 . Tables (A-J) lists all pairwise nominal interactions at a significance level of 0.05 between the variables: Network fingerprint,  Metabolite,  miRNA,  VR tasks and HAE metrics. Cook’s outliers ( k / n ) lists the number of outliers k , based on Cook’s distance, removed out of the total n samples. The linear regression slopes  , p- values and Adjusted R (R ) are reported after the outlier removal. Reference: