Deep Learning Head Model for Real-time Estimation of Entire Brain Deformation in Concussion
Xianghao Zhan, Yuzhe Liu, Samuel J. Raymond, Hossein Vahid Alizadeh, August G. Domel, Olivier Gevaert, Michael Zeineh, Gerald Grant, David B. Camarillo
11 Deep Learning Head Model for Real-timeEstimation of Entire Brain Deformation inConcussion
Xianghao Zhan, Yuzhe Liu, Samuel J. Raymond,
Member, IEEE,
Hossein Vahid Alizadeh,
Member, IEEE,
August G. Domel, Olivier Gevaert, Michael Zeineh, Gerald Grant, and David B. Camarillo,
Member, IEEE
Abstract —Objective: Many recent studies have suggested thatbrain deformation resulting from a head impact is linked to thecorresponding clinical outcome, such as mild traumatic braininjury (mTBI). Even though several finite element (FE) headmodels have been developed and validated to calculate braindeformation based on impact kinematics, the clinical applicationof these FE head models is limited due to the time-consumingnature of FE simulations. This work aims to accelerate theprocess of brain deformation calculation and thus improve thepotential for clinical applications. Methods: We propose a deeplearning head model with a five-layer deep neural network andfeature engineering, and trained and tested the model on 1803total head impacts from a combination of head model simulationsand on-field college football and mixed martial arts impacts.Results: The proposed deep learning head model can calculatethe maximum principal strain for every element in the entirebrain in less than 0.001s (with an average root mean squarederror of 0.025, and with a standard deviation of 0.002 over twentyrepeats with random data partition and model initialization). Thecontributions of various features to the predictive power of themodel were investigated, and it was noted that the features basedon angular acceleration were found to be more predictive than thefeatures based on angular velocity. Conclusion: Trained using thedataset of 1803 head impacts, this model can be applied to varioussports in the calculation of brain strain with accuracy, and itsapplicability can even further be extended by incorporating datafrom other types of head impacts. Significance: In addition tothe potential clinical application in real-time brain deformationmonitoring, this model will help researchers estimate the brainstrain from a large number of head impacts more efficiently thanusing FE models.
Index Terms —mild traumatic brain injury & whole-brainstrain & finite element & deep learning
I. I
NTRODUCTION
Affecting over 40 million children and adults world-wide, mild traumatic brain injury (mTBI, Table 1 listed theacronyms/abbreviations used in this paper and their respective
X. Zhan, Y. Liu, S. Raymond, H. Vahid Alizadeh, A. Domel, and D.Camarillo are with the Department of Bioengineering, Stanford University,Stanford, 94305, USA. X. Zhan and Y. Liu contributed equally to this work.(Corresponding Author: D. Camarillo e-mail: [email protected])O. Gevaert is with the Department of Biomedical Informatics, StanfordUniversity, Stanford, 94305, USAM. Zeineh is with the Department of Radiology, Stanford University,Stanford, 94305, USAG. Grant is with the Department of Neurosurgery, Stanford University,Stanford, 94305, USAThis work has been submitted to the IEEE for possible publication.Copyright may be transferred without notice, after which this version maybeno longer be accessible. meanings), has become a serious global health challenge [1].Evidence has suggested that mTBI can lead to unconscious-ness immediately after a head impact and can further resultin post-concussive symptoms including cognitive deficits andemotional challenges [2], [3] and even increase the risk oflong-term neurodegenerative diseases such as Parkinson’s andAlzheimer’s diseases [4], [5]. mTBI is not simply limited tovehicle accidents, but also occurs frequently in sports such asfootball [6], [7] and mixed martial arts (MMA) [8]. In sports-related mTBI, brain damage can accumulate with repetitiveinjuries, leading to more severe consequences [9]–[14].The fast diagnosis and early warning approaches of mTBI[15] are crucial to helping prevent repetitive sport-relatedmTBI since the in-time intervention after early detectioncan attenuate the injury to a significant extent [16], [17].As suggested in the previous work [18], brain strain is apromising mechanical parameter to predict the risk of mTBIand can be useful information for medical professionals inthe diagnosis of mTBI. Currently, several finite element (FE)models have been validated to calculate brain strain [11], [19]–[23]; However, expertise of biomechanics is needed to performthe FE simulations, and the simulations usually take at leastseveral hours or even days to run with a personal computer.Limited by the time-consuming nature of the FE simulationsand the lack of access to computational resources, the brainstrain is not widely referenced by medical professionals in thediagnosis of mTBI. Therefore, a fast and easy-to-use approachto calculate brain strain is needed for more practical clinicalapplications.As the input necessary to calculate the brain strain resultingfrom a head impact, the head kinematics of the impact aretraditionally measured by wearable sensor systems like thehead impact telemetry system (HITS) [24], Xpatch [25] andinstrumented mouthguard [26]–[30]. Recently, the brain straincalculated based on mouthguard-measured head kinematicswas found to correlate well with resulting mTBI [13], [31],[32], which suggests the feasibility to using instrumentedmouthguards to help diagnose mTBI with brain strain. There-fore, a promising approach to detect mTBI is to combine thewearable sensor systems with a fast method to calculate brainstrain, and use the brain strain to predict mTBI. This approachwill provide the calculated brain strain in real-time to medicalprofessionals to assist in diagnosis and therapeutic options.Researchers have recently made efforts to develop fast andaccurate methods to calculate brain strain [32]–[36]. One a r X i v : . [ q - b i o . T O ] O c t solution is to simplify the mechanical characteristics of thebrain as reduced-order brain models [32], [35], [36]. As atrade-off, the reduced-order models are limited in accuracyand can only predict a single value for the brain deformationthroughout the entire brain. Another potential solution toaccelerate the calculation is to use machine learning tools[33]. Over the last decade, machine learning has shown tobe very effective in applications such as biomedical signalprocessing leading to improved decision making and physics-based modeling of biological and biomedical systems [33],[37]–[40]. Recently, as the first step to address the problem ofmTBI prediction with machine learning, Wu et al. [33] applieda convolutional neural network (CNN) to predict the th percentile maximum principal strain (95% MPS) of the entirebrain and the corpus callosum, as well as the th percentilefiber strain of the corpus callosum. This work presented a newidea: the head angular velocity profiles can be translated intoimages, which were the input to the CNN model to predict theresulting brain strain. The promising accuracy showed that theCNN model could be effectively applied to analyze on-fieldhead impact for the prediction of 95% MPS, 95% MPS and95% fiber strain at corpus callosum. However, only three strainvalues were provided to indicate the severity of deformation,while the distribution of the deformation severity over thewhole brain, which would be helpful for diagnosis, was notpredicted.To address the costly computation of FE simulations andalso provide the MPS of the entire brain, we developed adeep learning head model to calculate the peak MPS of everyelement of the brain. We used the kinematic data (angularacceleration and angular velocity) of 1422 impacts generatedby simulations of FE model of hybrid III anthropomorphictest dummy (ATD) impacts and 381 impacts collected fromon-field football and MMA games with a previously validatedinstrumented mouthguard. The KTH head model was used,which is a validated and widely used FE model [13], [31],[41]–[43], to calculate the true brain MPS. Engineered featuresrepresenting the time-domain information of the kinematicdata were extracted, and a deep neural network (DNN) wasdeveloped to predict the MPS at each brain element. To obtaindatasets that were large enough to train the model, the 1422impacts from simulations were used as the basis dataset. Then,the basis dataset was fused with the on-field impacts data toapply our model to different sports. We trained and tested ourmodels on each of the datasets and finalized a model trainedon the mixture of all datasets. Examining several metrics, theprediction of entire brain MPS was shown to be accurate.Furthermore, we found that the 95% MPS calculated basedon the entire brain MPS was comparable to the previouslyreported models [33], [36], which showed the applicability ofmachine learning as a novel approach in the field of fast braindeformation calculation.II. M ETHODS
A. Data description
To broaden the applicability of the model, we used the dataof the head impacts collected on-field in college football and
TABLE IT
HE ACRONYMS AND ABBREVIATIONS USED IN THIS PAPER AND THEIRRESPECTIVE MEANINGS .Acronym/Abbreviation MeaningmTBI mild traumatic brain injuryMMA mixed martial artsFE finite elementHITS head impact telemetry systemCNN convolutional neural networkMPS maximum principal strainATD anthropomorphic test dummyCF1 college football dataset 1CF2 college football dataset 2HM ATD head model simulated datasetDNN deep neural networkReLu rectified linear unitAdam adaptive moment estimationMAE mean absolute errorRMSE root mean squared errorSpearman Corr. Spearman coefficient of correlationSTD standard deviationCI confidence intervalRNN recurrent neural network
MMA. Instrumented mouthguards, which have been validatedto measure the rigid-body movement of the head accurately,were used to collect the kinematics data [29], [30]. All theimpacts were video confirmed. Specifically, we included 184head impacts in college football games [31] collected bythe original version of Stanford instrumented mouthguard[29] (dataset college football 1, CF1), and 118 head impactscollected in college football games by the updated Stanfordinstrumented mouthguard [30] (dataset college football 2,CF2). We also included 79 head impacts in MMA collected bythe updated Stanford instrumented mouthguard [13] (datasetMMA). To enlarge the training dataset, we performed sim-ulations of head impacts using a validated FE head modelof the hybrid III ATD [44] (dataset head model, HM). Thebare dummy head was impacted at different locations withthe velocities ranging from 2 m/s to 8 m/s, and in total 1065head impact kinematics were obtained. To further enlarge thetraining dataset, the kinematics in the Y and Z axes of the HMhead kinematics were switched.The KTH head model [41], a validated FE model for mTBI-level head impacts, was used to calculate the brain peak MPSbased on the input head kinematics. In the simulations, theskull of the model was rigid and moved according to theinput kinematics. The peaks of angular velocity magnitude(Fig. 1A) and angular acceleration magnitude (Fig. 1B) wereplotted against the resulting 95% MPS for each dataset. All thesimulations using the on-field data succeeded. However, only1422 head impacts in HM dataset succeeded because some ofthe simulations failed due to the high-frequency componentsin the simulated kinematics. Therefore, 1803 head impactsin total were used in this study. For reference, among thefour datasets, the highest 95% MPS were: HM: 0.4423, CF1:0.5093, CF2: 0.4184, MMA: 0.7051.
B. Feature engineering and data preprocessing
For each impact, features describing the characteristics ofrotational kinematics were extracted as the input to the model.
Fig. 1.
Head kinematics and 95% MPS in the datasets. (A) 95%MPS and peak of resultant angular (Ang.) velocity (vel.). (B) 95% MPSand peak of resultant angular (Ang.) Acceleration (acc.). Different colorswithin the scatter plots indicate the different datasets (HM: hybrid III ATDimpact simulations; CF1: college football impacts from the original versionof instrumented mouthguard; CF2: college football impacts from the updatedversion of instrumented mouthguard; MMA: MMA head impacts from theupdated version of instrumtented mouthguard). The curves at the top and theright are kernel density plots (Top of (A) of angular velocity. The top of panel(B) is for angular acceleration, and right of panel (B) is for 95% MPS).
Angular accelerations ( α x , α y , α z ) were calculated by 5-pointsstencil derivative on angular velocities ( ω x , ω y , ω z ). Thenthe magnitudes ( | α | , | ω | ) and components in each direction( x, y, z ) of angular accelerations and angular velocities weredefined as 8 channels (denoted as c i ( t ) , i = 1 , , ..., ). Foreach channel, six types of time-domain features were extractedbased on the rationale of including the information of thesignal intensity and time history:1) Maximum value (1 feature/channel): max( c i ( t )) , i =1 , , ...,
2) Minimum value (1 feature/channel): min( c i ( t )) , i =1 , , ...,
3) Integral of the time-signal (1 feature/channel): (cid:82) c i ( t ) dt, i = 1 , , ...,
4) Integral of the absolute values of time-signal (1 fea-ture/channel): (cid:82) | c i ( t ) | dt, i = 1 , , ...,
5) Maximum and minimum of the exponential movingaverage of the signal derivative (6 features/channel): E a,i = { min( y i ( n )); max( y i ( n )) } , n = 0 , , , ... (1) E a,i was extracted to take into account the variationof kinematic signals because it describes the generaltrend of a curve without being biased by extreme val-ues caused by noise. The discrete exponential movingaverage of derivative y i ( n ) was defined as: y i (0) = ac i (0) (2) y i ( n ) = (1 − a ) y i ( n −
1) + a ( c i ( n ) − c i ( n − , n ≥ (3)Where a is the smoothing coefficient chosen to be SR , ∗ SR , ∗ SR to account for different smoothingeffects [39] and SR is the sampling frequency (1kHz). E a,i contains the minimum and maximum values of theexponential averages of each of the derivatives of the 8channels, and reflects the largest positive slope and thelargest negative slope in the signal.6) Information of extrema except max and min (10 fea-tures/channel): the total number of positive extrema, the total number of negative extrema, the values ofthe second-largest to the fifth-largest positive extrema,the values of the second-smallest to the fifth-smallestnegative extrema. This type of feature was extractedbecause oscillatory impacts might have an accumulatingeffect on the brain deformation.After feature engineering, the kinematics of each impactwas transformed into a numerical matrix with 160 columns (8channels ×
20 features/channel), while each column denotedone feature and each row denoted one impact. The featurematrix was then used to train the deep neural network afterstandardization and data augmentation. Data standardizationwas performed based on the following formula to eliminatethe bias caused by different units and imbalanced weights offeature values: ¯ f ( x, i ) = f ( x, i ) − mean x ( f eature ( x, i ))std x ( f eature ( x, i )) (4)Here, f ( x, i ) denotes the i th feature of the x th impact. mean x () and std x () calculates the mean value and the standard de-viation over the variable x , respectively. The standardizationparameters were calculated based on the training set in thevalidation process, and on both the training and validation setsin the model evaluation process (See details in Section 2D).Furthermore, to optimize model performances, twocommonly-used data transformations of the numerical valuesof labels (MPS) were done :1) Logarithmic transformation was firstly performed tostabilize the error variance and prevent negative predictions ofMPS. In the training process, each MPS value was transformedby taking its logarithm. After prediction, the predicted valueswere inverse-transformed as MPS by exponentiating.2) The logarithmic-transformed labels were then whitenedto address the problem of potential imbalanced weights givento different MPS values and remove the underlying correlationamong features to improve accuracy. The whitening processwas based on the following formula: z whitened ( x, i ) = z raw ( x, i ) − mean x ( z raw ( x, i ))std x ( z raw ( x, i )) (5)Where z raw ( x, i ) denotes the MPS value of the i th element for the x th impact. The mean x ( z raw ( x, i )) and std x ( z raw ( x, i )) for each of the 4124 elements were calculatedon the training set in the model validation process, and onboth the training and validation sets in the model evaluationprocess (See details in Section 2C, 2D). mean x ( z raw ( x, i )) and std x ( z raw ( x, i )) were recorded to inverse-transform theprediction of MPS. The assumption was that the dataset usedto calculate mean x ( z raw ( x, i )) and std x ( z raw ( x, i )) repre-sented a general distribution.The feature engineering was done on MATLAB R2020a(Austin, TX, USA). The data standardization and transformswere done on Python 3.7 with scikit-learn packages [45]. C. Deep learning model development
The goal of this work was to use a deep neural network(DNN) to act as a function approximator to learn the relation- ship between the measured input kinematics and the calculatedMPS from FE simulations. The DNN would act as a proxy fortraditional FE simulations in order to speed-up workflow whencalculating MPS based on input head kinematics. To do this,the dataset must be partitioned into a training set (to train theDNN-based model), validation set (to tune hyperparameters)and test set (to evaluate model performance). Once trained,this network would take features for a new impact that wasnot present in the training set and the validation set and predictthe MPS.Our deep neural network consisted of five layers in additionto the input layer with 160 units and the output layer of4124 units: 1) Hidden layer 1: 300 neurons with the rectifiedlinear unit (ReLU) as the activation; 2) Dropout layer 1 with adropout rate of 0.5 and no activation; 3) Hidden layer 2: 100neurons with ReLU as the activation; 4) Dropout layer 2 witha dropout rate of 0.5 and no activation; 5) Hidden layer 3:20 neurons with ReLU as the activation. The general designrationale was to condense kinematics information into a deep,low-dimension hidden pattern and then connect to the 4124output units. Furthermore, dropout was used as a form of reg-ularization to boost the model robustness and generalizability.Random initialization was used to break network symmetry.L2 regularization was also added to penalize large weightsto improve generalizability of the model. Mean squared errorwas used as the loss function, and adaptive moment estimation(Adam) optimizer was used as the optimizer [46] with a batchsize of 128 to boost the training efficiency.On the validation set, the hyperparameters tuned in themodeling included the number of training epochs, the learningrate, and the strength of the L2 regularization. The optimalhyperparameters were chosen based on the learning curves ofthe training loss and the validation loss on the validation set.Before training, in an attempt to further boost the gener-alizability of the model, the training set was augmented byadding Gaussian noise into the training samples. A normaldistribution with a mean of zero and standard deviations of0.01 and 0.02 times the standard deviation of the originaldata were added. Ultimately, after the completion of this dataaugmentation, the number of impacts in the training set wastripled. Data augmentation and deep learning were performedin Python 3.7, using the Keras package with the TensorFlow2.0 backend [47].
D. Model assessment
The model training and assessment processes are shownin Fig. 2. Three types of tasks were performed to assess themodel. Firstly, the model was trained and tested solely on theHM dataset because of its large number of impacts (Task 1).Secondly, to apply the model to specific sports, the HM datasetwas used as the basis of the training dataset. The on-fielddata was partitioned into two parts. One part was fused intothe training set and the validation set, and the rest was usedfor the testing (Task 2). Finally, the HM and all the on-fielddatasets were combined into one mixed training/validation/testset to test the model’s applicability to both football and MMAimpacts as a whole (Task 3). The partition of the datasets inthese three tasks was the following:
80% On-field Data
1) Basis/Mixture [20 Repeats] ① Dataset Split
Training Set
HM 1422
Training Set (70%) Validation Set (15%) Test Set (15%) ② Hyperparameter Tuning
Training Set (70%) Validation Set (15%) Test Set (15%)
Data Augmentation ③ Model Evaluation
Training Set (70%) Validation Set (15%) Test Set (15%)
Data Augmentation
2) On-field (CF/MMA) [5 Folds] ① Dataset Split ② Hyperparameter Tuning
Training Set80% of 80% On-field
Validation Set
20% of 80% On-field
Test Set
20% On-field
Data Augmentation ③ Model Evaluation
Training Set80% of 80% On-field Test Set20% On-field
Data AugmentationData Augmentation
Data Augmentation
Training SetHM 1422 Validation Set20% of 80% On-field
Fig. 2.
The description of the dataset partition process and the functionof each dataset.
In each of the three tasks (basis, on-field, and mixture),different training sets, validation sets and test sets were built. The validationsets were used to tune the hyperparameters while the test sets were usedto evaluate model performances in unseen data. Data augmentation withGaussian noise was used to improve the model generalizability. To testreproducible performances and model robustness, in the basis and mixturetasks, the experiment was done within 20 repeats with random data partitionsand random model initialization, and, in on-field tasks, five-fold tests weredone with randomly shuffled on-field data and random model initialization. R ), and Spearman coefficient of correlation were calculated by comparing the predicted MPSand the true MPS at every element. MAE and RMSE showthe average error over all elements and they are in the sameunit as strain. RMSE was also calculated because it is moresensitive to large errors and was thus used for hyperparametertuning. R shows the linear correlation between the predictedand the true MPS, and the Spearman coefficient of correlationrepresents the consistency of the rank and the non-linearcorrelation.Furthermore, since 95% MPS over all elements were usedto indicate the most severe brain deformation during the headimpact [32], [33], [35], [36], R and RMSE of 95% MPS werecalculated over the impacts in the test set. The RMSE of 95%MPS was in the same unit as strain.The accuracy metrics mentioned above (mean values ofMAE, RMSE, R , Spearman coefficient of correlation, 95%MPS R and 95% MPS RMSE) were calculated by comparingthe predicted and the true (calculated via the KTH model) MPSat every element for every impact in the test set. As a result,we obtained a group of metrics describing the performanceof the model in each repeat or fold. The summary statistics(mean, median, STD, 95% CI) over repeats and folds werecalculated in the following manner: • MAE Mean = mean r (mean i (MAE e (MPS))) • MAE Median = median r (mean i (MAE e (MPS))) • MAE STD = std r (mean i (MAE e (MPS))) • MAE 95%CI = 95%CI r (mean i (MAE e (MPS))) • = mean r (R i (95% (MPS)) • = mean r (RMSE i (95% (MPS))For these equations, the subscripts represent that the func-tion is calculated over that variable; for example, the subscript r denotes repeats or folds, the subscript i denotes impacts inthe test set in one repeat or fold, and the subscript e denotesthe 4124 brain elements in one impact. E. Feature analysis
To interpret the relative importance of the various engi-neered features, we input each type of feature individually andclassify the 160 features according to their physical meanings(and regard them as subsets of features based on their clas-sification). The predictive power of each subset was assessedby RMSE of the predictions when only that specific subsetwas fed to the deep learning model. Firstly, to investigate thepredictive power of various methods to engineer features, wecompared the RMSE of the predictions based on maximumvalues, minimum values, integral values, integral of absolutesignal values, maximum and minimum of the exponentialaverage of derivative, and information of extrema except maxand min. Secondly, the predictive power contributed by theangular acceleration features and the angular velocity featureswere compared. Thirdly, we investigated the predictive powercontributed by the features related to signal values (maximum,minimum, extrema) versus features related to the time history(integral, integral of absolute signal values, and maximum andminimum of the exponential moving average of derivative).Finally, we studied the contributions of the components ofthe kinematics at different directions and the magnitude of the kinematics to the predictive power. The analysis wasperformed on the mixture datasets with the same partitiondescribed in Section 2D. Wilcoxon signed-rank test was doneto determine the statistical significance of the RMSE differencebetween every pair of the feature subsets because the Shapiro-Wilk test rejected the assumption of data normality at asignificance level of 0.05 on the RMSE of some featuresubsets. III. R
ESULTS
A. Model performances on different datasets
The model performances based on the accuracy metricsmentioned in Section 2D are listed in Fig. 3. Meanwhile, thenumber of impacts used for training, validation and testing, aswell as the optimal hyperparameters used in the deep neuralnetwork, are also shown in Fig. 3. Furthermore, the absoluteerror of MPS at every element were averaged over impactsand repeats (folds for Task 2). The kernel density distributionis plotted in Fig. 3.It should be noted that
MAE Mean and
MAE Median inall tasks were smaller than 0.03, which were smaller than thedifferences of brain strain seen in injury and non-injury cases[41]. Furthermore, the R of 95% MPS were comparable toa recently reported kinematics-based injury criteria [32], [33],suggesting that this DNN-based model is potentially accurateenough for clinical application.The performance of the model in different tasks is comparedin Fig. 4. The predictions were more accurate on the HM andthe mixture tasks than the on-field task, particularly on theMMA dataset. However, the values of mean absolute errorwere smaller than 0.03 for all tasks. The values of R of 95%MPS in all tasks were around or higher than 0.70 and thevalues of the Spearman coefficient of correlation in all taskswere around or higher than 0.75, which indicated good modelaccuracy in the prediction of MPS.In Fig. 5A-F, to visualize the prediction accuracy, 3 caseswere selected as examples. The 3 cases were selected as theimpacts with the 30th, 60th, 90th percentile highest true 95%MPS over the test impacts of all repeats. It should be noted thatthe brain regions with large deformation were similar in thepredicted and true results. This indicated that the current modelcould be used to diagnose the region of the injury, instead ofonly giving a scalar metric predicting the risk of concussion.The MPS of each element plotted in Fig. 5A-F were comparedin Fig. 5G-I. All scatters indicating MPS lied closely to thereference line y = x , which showed the predictions of MPSwere accurate at every element of the brain. B. Feature analysis
Since, by nature, deep learning approaches tend to be ablack box, we wanted to better understand what engineeredfeatures played a role in the prediction. To do so, the featureswere classified into several subsets according to how theywere extracted from the signals (Max: maximum values, Min:minimum values, Int.: integral values, Abs. int.: integral ofabsolute values, EMA der.: maximum and minimum of theexponential moving average of derivative, Extr.: information
Basis (HM 1422)
Dataset Details Accuracy Metrics over 20 Experiments
Total Cases 1422 MAE Mean 0.015 RMSE Mean 0.022Training Cases 994 MAE Median 0.015 RMSE Median 0.022Validation Cases 214 MAE STD 0.001 RMSE STD 0.002Testing Cases 214 MAE 95%CI 0.001 RMSE 95%CI 0.001
Optimal Hyperparameters
R2 Mean 0.867 Spearman Corr. Mean 0.931Learning Rate 0.001 R2 Median 0.868 Spearman Corr. Median 0.934Epoch 4000 R2 STD 0.023 95%MPS R2 Mean 0.906L2 Regularization 0.0075 R2 95%CI 0.010 95% MPS RMSE Mean 0.029
On-field (CF1 184)Dataset Details Accuracy Metrics over 5-fold Experiments
Total Cases 1606 MAE Mean 0.013 RMSE Mean 0.023
Training Cases 1554 MAE Median 0.013 RMSE Median 0.022Validation Cases 33 MAE STD 0.002 RMSE STD 0.003Testing Cases 19 MAE 95%CI 0.001 RMSE 95%CI 0.003
Optimal Hyperparameters
R2 Mean 0.755 Spearman Corr. Mean 0.838Learning Rate 0.0002 R2 Median 0.797 Spearman Corr. Median 0.841Epoch 500 R2 STD 0.064 95%MPS R2 Mean 0.864L2 Regularization 0.01 R2 95%CI 0.056 95% MPS RMSE Mean 0.027
On-field (CF2 118)Dataset Details Accuracy Metrics over 5-fold Experiments
Total Cases 1530 MAE Mean 0.014 RMSE Mean 0.022Training Cases 1487 MAE Median 0.013 RMSE Median 0.021
Validation Cases 19 MAE STD 0.002 RMSE STD 0.004
Testing Cases 24 MAE 95%CI 0.002 RMSE 95%CI 0.004
Optimal Hyperparameters
R2 Mean 0.670 Spearman Corr. Mean 0.850Learning Rate 0.00005 R2 Median 0.669 Spearman Corr. Median 0.853Epoch 4000 R2 STD 0.074 95%MPS R2 Mean 0.726L2 Regularization 0.01 R2 95%CI 0.065 95% MPS RMSE Mean 0.031
On-field (MMA 79)Dataset Details Accuracy Metrics over 5-fold Experiments
Total Cases 1501 MAE Mean 0.023 RMSE Mean 0.040Training Cases 1472 MAE Median 0.023 RMSE Median 0.036Validation Cases 13 MAE STD 0.003 RMSE STD 0.008Testing Cases 16 MAE 95%CI 0.002 RMSE 95%CI 0.005
Optimal Hyperparameters
R2 Mean 0.648 Spearman Corr. Mean 0.764Learning Rate 0.0002 R2 Median 0.662 Spearman Corr. Median 0.767Epoch 3000 R2 STD 0.144 95%MPS R2 Mean 0.813L2 Regularization 0.01 R2 95%CI 0.089 95% MPS RMSE Mean 0.045
Mixture (Alldata 1803)Dataset Details Accuracy Metrics over 20 Experiments
Total Cases 1803 MAE Mean 0.015 RMSE Mean 0.025
Training Cases 1261 MAE Median 0.016 RMSE Median 0.025
Validation Cases 271 MAE STD 0.001 RMSE STD 0.002Testing Cases 271 MAE 95%CI 0.001 RMSE 95%CI 0.001
Optimal Hyperparameters
R2 Mean 0.837 Spearman Corr. Mean 0.902Learning Rate 0.0005 R2 Median 0.832 Spearman Corr. Median 0.902Epoch 2500 R2 STD 0.023 95%MPS R2 Mean 0.897L2 Regularization 0.01 R2 95%CI 0.010 95% MPS RMSE Mean 0.032 AB CD E Fig. 3.
The distribution of absolute error between predicted MPS and true MPS, and the model information and the accuracy metrics on differentprediction tasks.
Each of the subplots shows the distribution of 4124 absolute error values calculated by averaging absolute error over test impacts and overrepeats/folds for all brain elements. Each of the tables shows the data, the optimal hyperparameters used in modeling, and the model performances based onthe accuracy metrics. In these tables, summary statistics over 20 repeats/5 folds are given.
A B CD E F
Fig. 4.
Box plots of multiple accuracy metrics of models on different tasks.
Each of the subplots shows one of the accuracy metrics of models on eachof the different tasks, comparing DNN predicted MPS vs. FE model calculated MPS. of extrema except the max and min). Then, the features werealso classified into subsets by the types of the kinematics thatthey were extracted from (Ang. acc.: angular acceleration,Ang. vel.: angular velocity); if they were extracted directlyfrom the signal values (Value) or from the time history ofthe signals (His.); and if the features were calculated bythe kinematics components at each direction (Comp.) or themagnitude (Mag.). The comparisons among subsets were madeon the mixture dataset, and the results are shown as box plotsin Fig. 6. Smaller RMSE indicates higher predictive power.The statistical significance in the RMSE difference of eachpair of feature sets is shown in the four tables of Fig. 6 basedon the Wilcoxon signed-rank test.As seen in the results in Fig. 6, in the analysis of howthe features were extracted from the signals, as is expected,the model with all 160 features was the most accurate. Thefeatures from the exponential moving average of derivativeexhibited significantly higher predictive power compared withother types of features( p ≤ . ). The maximum values, theintegral values, and the integral of absolute values were foundwith similar predictive power ( p ≥ . ) and were found to beinferior to that of the EMA derivative ( p ≤ . ). The lowestpredictive power was found in the feature subsets of minimumvalues and the information of extrema except max and min. Interms of the features extracted from the different kinematics,angular acceleration features showed slightly higher predictivepower than the angular velocity feature ( p ≤ . ). This factindicated that angular acceleration could be more predictive forMPS calculation than angular velocity based on the current feature engineering. In the comparison of features based onthe signal values and the time history, signal-value relatedfeatures (maximum, minimum, and extrema) were significantlyless predictive when compared with the time-history relatedfeatures (integral, EMA derivative, etc.) ( p ≤ . ), whichindicated the importance of the time history in predictingthe MPS. Finally, the predictive power in the features oncomponents of kinematics is higher than the magnitude ofkinematics ( p ≤ . ), which agreed with the knowledge thatthe directional information of kinematics should be includedin predicting brain deformation [32].IV. D ISCUSSION
The main goal of this work is to significantly reduce thecomputational cost of brain strain calculation compared withconventional FE modeling. The proposed deep learning modelprovides users with a fast and accurate tool to calculate brainMPS in head impacts. With accuracy in MPS calculation, themajor advantage of the deep learning model is the substantiallysmaller time consumption in the calculation: it takes < . to calculate brain MPS using a PC (Intel Core i5-6300U).This computational time enables the real-time calculation ofthe brain MPS, which has promising potential to monitor thestrain on the brain in real-time. As the model was trained andtested on a variety of datasets, the model can be applied todifferent on-field sports such as football and MMA and canbe further extended to other sports with new datasets.Previously, researchers have made great efforts to developmodels to quickly estimate the brain deformation [33]–[36], Fig. 5.
The visualization of three typical predicted and true MPS in the mixture task based on 95% MPS. (A-F): The cloud plots of true MPS (A, C,E) and the predicted MPS (B, D, F) in the brain. From left to right, (A, B) show the 30th percentile (true 95% MPS) case, (C, D) show the 60th percentile(true 95% MPS) case, (E, F) show the 90th percentile (true 95% MPS) case. From top to bottom, (A1) is the whole brain, (A2) is the posterior half of thebrain, (A3) is the right half of the brain, and (A4) is the bottom half of the brain. The orientation of the brain is given in (A3). The color bar of MPS for(A-F) is given at the bottom of cloud plot, and the ranges of MPS for each case are given. The maximum and the minimum of the range are the highest MPSand half of the highest MPS, respectively. For clarity in showing the regional correlation between the true and predicted values, the elements where MPS isbelow the color bar range are plotted as transparent.(G-J): The comparisons between the true (KTH) MPS and predicted MPS. Each data point denotes oneelement in the brain. (G) is the 30th percentile case in (A, B), (H) is the 60th percentile case in (C, D), and (I) is the 90th percentile case in (E, F). The blacksolid line is y=x . The colors of the scatter points represent the absolute value of the difference between true (KTH) MPS and predicted MPS. The RMSE isgiven at the right corner of each plot.
P All Max Min Int. Abs. int. EMA der. Extr.All - <0.001 <0.001 <0.001 <0.001 0.011 <0.001
Max <0.001 - <0.001 0.852 0.575 0.001 0.095
Min <0.001 <0.001 - <0.001 <0.001 <0.001 0.022
Int. <0.001 0.852 <0.001 - 0.391 <0.001 0.037
Abs. int. <0.001 0.575 <0.001 0.391 - <0.001 0.040
EMA der.
Extr. <0.001 0.095 0.022 0.037 0.040 <0.001 -
P All Ang. acc. Ang. vel.
All - 0.030 0.012
Ang. acc.
Ang. vel.
P All Value His.All - 0.002 0.573
Value
His.
P All Comp. Mag.All - 0.173 <0.001
Comp.
Mag. <0.001 0.002 -
Legend
AB CDE
Fig. 6.
The RMSE box plots of models with subsets of features and the statistical significance in the comparison between the various featuresbased on Wilcoxon signed rank tests. (A) The RMSE of models with different features. The dashed vertical lines separating the various box plots denote:1) the all-feature model performance; 2) comparison among different types of engineered features; 3) comparison between angular acceleration features andangular velocity features; 4) comparison between features based on signal values and features based on signal time history; 5) comparison between kinematiccomponents and the kinematic magnitudes. (B, C, D) The statistical significance in comparison between pairs of features. and these works can be classified into two categories: reduced-order models and pre-computed models. In the reduced-orderbrain models [35], [36], researchers modeled brain defor-mation using a system with mass, spring and dashpot anddetermined the parameters for models based on analyses ofthe brain response to idealized impulses. In the pre-computedmodels, large datasets of head impact kinematics were used todevelop the atlas [34] and train a CNN model [18], [33]. Theproposed deep learning model in our study is in the categoryof the pre-computed models, and the major novel developmentwe present is that this model can predict the strain at everyelement of the entire brain instead of several summary values,as previously done in Wu et al., to indicate the severity of thebrain deformation. The R value between the predicted andthe true 95% MPS of the entire brain was 0.8970 (Mixturetask, Fig. 3E), which is close to but lower than the state-of-the-art models predicting the single 95% MPS (0.966 for theCNN model [33] and 0.968 for the DAMAGE model [36]).However, because the prediction types are different and theFE models used to generate the training dataset are different,the direct benchmarking against the previous models is notfeasible. The performances of the models vary with datasets, asshown in Fig. 3 and also in [33]. Therefore, the type of datasets (sports, measuring instrument, etc.) should be considered wheninterpreting the accuracy of these models.Although the proposed deep learning head model predictedthe 95% MPS with close but slightly lower R values, thisdeep learning model predicted the brain strain of the entirebrain with high accuracy. Firstly, the deep learning head modelcould predict the high-strain regions accurately based on boththe visualized results and the large values of the Spearmancoefficient of correlation. Because the Spearman coefficientof correlation measures the difference between the ranking ofthe predicted MPS and the ranking of the true MPS, the largevalues of the Spearman coefficient of correlation indicated thatgenerally, the true high MPS brain elements were accuratelypredicted with high MPS values by the deep learning model.Secondly, the predicted MPS value on each brain element wasaccurate. The MAE was generally smaller than 0.03, whichwas smaller than the MPS difference between injury cases andnon-injury cases when using the same KTH model to calculatebrain strain [41].One limitation of the pre-computed models is the depen-dence on the size of the training dataset. To overcome this,in a previous study, the impact kinematics data is augmentedby switching the kinematics in different directions for the previously published CNN-based brain model [33]. However,the rotational responses of the neck to head impacts aredifferent in sagittal and coronal planes [48]. So the augmenteddataset may not accurately represent the characteristics ofthe head impact kinematics in football. In this study, anotherapproach was proposed to address the limitation of the datasetsize. A large dataset was generated by FE simulations ofATD head impacts (HM: 1422 impacts), and it was combinedwith on-field datasets (CF1: 184 impacts, CF2: 118 impacts;MMA: 79 impacts). In this method of data fusion, the deeplearning model benefitted from the large number of impactsin the HM dataset. This was done by learning the mechanicalcharacteristics from the brain model and also capturing thecharacteristics of the on-field kinematics in specific sports suchas football and MMA. It should be noted that the FE modelused in the ATD simulated impacts was originally developedfor football. Since the ATD was not helmeted, the head kine-matics were higher but close to helmeted head impacts becausethe impact impulse and neck rotation determine the headrotation. Therefore, the HM dataset might represent footballstyle impacts better. The relatively inferior performance of themodel on the MMA dataset (Fig. 3D) can be explained by this.Additionally, according to the results, it was shown that themodel performed better on the HM dataset than on the on-fielddata. This can be explained by the lower signal-to-noise-ratioin the on-field data.Besides the new method to fuse the dataset, we also designand optimize the model structure for predicting the brainstrain. The structure of the deep learning neural networkapplies the fully connected layers and dropout layers to extractmeaningful information for the MPS prediction. The dropoutlayers and L2 regularization were used to avoid overfitting. Asan important factor in impacting accuracy, the hyperparameterswere carefully tuned based on a validation set to find theoptimal hyperparameters for an accurate deep learning modelwith generalizability. The model performance was then testedon a separate test set with completely unseen impacts, whichshowed the model accuracy in predicting unseen data withoutoverfitting to the training and validation set. To stabilize thevariance and prevent large errors with large prediction values,this deep learning method applied the logarithmic transforma-tion and data whitening, which improved the accuracy andensured all predictions were positive because only positiveMPS values have physical meanings. Additionally, the deeplearning model can be more scalable when new datasets can beincorporated into the current training dataset, and re-trainingcan be done to improve the model performance. This cannotbe easily done in the less scalable pre-computed brain injurymodels [34].The feature analysis shows the contribution of each typeof feature in terms of the predictive power of the deeplearning model. Firstly, angular acceleration features showhigher predictive power than angular velocity features. Thisfinding contradicts the results that angular velocity magnitudewas better correlated to MPS than angular acceleration in arecent study [32]. However, it should be noted that the angularacceleration is the derivative of the angular velocity, and thefeatures included the information of derivative and integration. So angular velocity and angular acceleration subsets mighthave some overlapping features. The potential reason may bethat our features incorporate the time history of the signals(the integral and the integral of absolute signal values, theexponential moving average of the derivative of the signal)instead of only the maximum values, and the time historyleads to the higher predictive power of angular acceleration.Further study is needed to investigate the causal relationshipbetween angular acceleration and MPS, as well as betweenangular velocity and MPS. Secondly, the time history featuresare more predictive in general than the value features. Thisfinding indicates the necessity of further machine learningwork to make use of the temporal relationship in the datamining of kinematic signals, and also suggests the head impactmeasurement should focus on time history instead of just thethe maximum values.The deep learning model can be applied to evaluate therisk of a head impact. FE simulation, as the conventionalapproach to calculate the brain MPS, takes hours or even daysin the calculation. The complicated FE models also requiretrained engineers to perform the simulation. Using the deeplearning model, medical professionals can monitor the brainMPS in a real-time manner. Furthermore, the deep learningmodel is simple enough to be built into a small device. Witha combination of sensor signal recordings and processing inwearable devices, the risk of a head impact can be evaluatedby the device, and the MPS of the entire brain can be sharedwith medical professionals to help diagnosis in real-time.As more data become available and as more is understoodregarding the correlation of MPS and brain injury, the morerealistic this becomes. Evidence suggests that the pathologyoccurs at the regions of high MPS [13], so the deep learningmodel can allow medical inspections (such as CT or MRIscans) to be made as soon as the calculated MPS indicatesserious brain injury threats in certain brain regions. In addition,better precaution and protection decisions and plans for theseplayers could also be put in place with this information. In thefuture, additional metrics like maximum principal strain rate,maximum shearing stress can also be considered to diversifythe output information related to brain injury for medicalanalysis.Furthermore, this model benefits more than just clinicians;it can also benefit researchers working in the field of TBI andmTBI. This model provides researchers with a faster methodto calculate brain MPS, which is recognized as a parameterindicating the risk of mTBI. Using the deep learning model,one head impact can be modeled in less than a second andresearchers can easily process a large amount of head impactskinematics without having to spend days and weeks workingthrough FE simulations.While this work provides a novel and powerful method formTBI monitoring, there are some limitations beyond those pre-viously mentioned. Firstly, this model used feature engineeringto extract information from the time-domain kinematics, butthe overall predictive power of the features set an upper boundon model performance. As there is no widely acknowledgedstandardized feature sets to characterize brain deformation,the engineered features in this work may also not be the best features to drive the proposed deep learning models. Itis possible that, without artificially engineering the features,other deep learning architectures such as a convolutional neu-ral network (CNN) and recurrent neural network (RNN), mayextract features from the longitudinal signals automatically onthe premise of large quantities of data. However, ideally therewould be much more data in order to enact these type ofmodels. This was shown in Wu et al. [33] where a CNNwas applied to estimate the MPS for the whole brain andreported accurate results. More work in this area is warrantedand will continue to add better fidelity and accuracy to thisnew approach. Another limitation is that the deep learningmodel does not predict the fiber strain because the version ofKTH head model used in this study does not incorporate theaxonal fibers.Furthermore, although in this work high accuracy wasshown when the deep learning head model was applied to dif-ferent datasets, the accuracy in real-world application remainsto be validated considering the different characteristics ofdifferent types of head impacts. For instance, the head impactsin high school football games and those in college footballgames may have different characteristics, considering the ageand the level of competitiveness. Therefore, the accuracy needsto be further tested on different types of head impacts andeven more features such as the age of players, can be addedto improve the model accuracy.Finally, although this study endeavors to interpret themodels with different input feature subsets, the informationextracted in the hidden states of the model still remain to befurther studied. For instance, the last hidden layer with onlytwenty neurons might have already condensed the informationof brain deformation. As is shown in the applications inDeepFace and DeepPatient [50], [51] where the information ofthe deep hidden-layer neurons can be extracted for downstreampattern recognition tasks such as facial recognition and clinicaloutcome prediction, whether the data-driven low-dimensionrepresentation could be directly used as brain injury criteria re-mains an open and interesting research topic. It is possible thatthe information of the twenty-neuron layer can be extractedand used in prediction tasks such as concussion prediction.V. C ONCLUSION
This study proposes a novel deep learning head model forfast whole-brain strain calculation. Based on the simulatedhead impact data and the on-field football and MMA datameasured via instrumented mouthguard, with data fusion, thedeep learning head model showed high accuracy, which wasmanifested in multiple metrics in the calculation of maximumprincipal strain. This model remarkably accelerates the cal-culation process when compared with the conventional FEsimulations. The results also showed that the deep learninghead model was accurate in predicting the severity of brain de-formation as well as the specific regions of the brain sufferingfrom the highest strain. Feature analysis of the model showedthat certain types and groups of engineered features weremore predictive in MPS prediction and may be inspirationalfor researchers to extract effective features and understand what factors drive MPS in future studies. This model enablesresearchers to calculate brain strain in a much faster andeven real-time manner without performing computationallyand time expensive FE simulations. Additionally, the complete,fast, and accurate brain strain visualization of the proposedmodel provides trainers with a reliable method to monitor theforces on the brain and the resulting brain deformation fromhead impacts, such that medical professionals can intervene inreal-time to prevent brain damage accumulation and assist inmTBI diagnosis. VI. A
CKNOWLEDGEMENT
This work was supported by the Department of Bioengi-neering, Stanford University and the Office of Naval ResearchYoung Investigator Program (N00014-16-1-2949).R
EFERENCES[1] Cassidy, J. David, et al. ”Incidence, risk factors and prevention of mildtraumatic brain injury: results of the WHO Collaborating Centre TaskForce on Mild Traumatic Brain Injury.” Journal of rehabilitation medicine36.0 (2004): 28-60.[2] Wallace, Tracey, and John Morris. ”Development and Testing of a Tech-nology Enhanced Intervention to Support Emotion Regulation in MilitarymTBI with PTSD.” Archives of Physical Medicine and Rehabilitation100.7 (2019): e5.[3] Shlosberg, Dan, et al. ”Blood–brain barrier breakdown as a therapeutictarget in traumatic brain injury.” Nature Reviews Neurology 6.7 (2010):393-403.[4] Guskiewicz, Kevin M., et al. ”Recurrent concussion and risk of depressionin retired professional football players.” Medicine & Science in Sports &Exercise 39.6 (2007): 903-909.[5] Doherty, Colin P., et al. ”Blood–brain barrier dysfunction as a hallmarkpathology in chronic traumatic encephalopathy.” Journal of Neuropathol-ogy & Experimental Neurology 75.7 (2016): 656-662.[6] Montenigro, Philip H., et al. ”Cumulative head impact exposure pre-dicts later-life depression, apathy, executive dysfunction, and cognitiveimpairment in former high school and college football players.” Journalof neurotrauma 34.2 (2017): 328-340.[7] Mez, Jesse, et al. ”Duration of American football play and chronictraumatic encephalopathy.” Annals of neurology 87.1 (2020): 116-131.[8] Tiernan, Stephen, et al. ”Concussion and the severity of head impactsin mixed martial arts.” Proceedings of the Institution of Mechani-cal Engineers, Part H: Journal of Engineering in Medicine (2020):0954411920947850.[9] Schneider, Daniel K., et al. ”Diffusion tensor imaging in athletes sustain-ing repetitive head impacts: a systematic review of prospective studies.”Journal of neurotrauma 36.20 (2019): 2831-2849.[10] Tagge, Chad A., et al. ”Concussion, microvascular injury, and earlytauopathy in young athletes after impact head injury and an impactconcussion mouse model.” Brain 141.2 (2018): 422-458.[11] Saboori, Parisa, and Graham Walker. ”Brain injury and impact charac-teristics.” Annals of biomedical engineering 47.9 (2019): 1982-1992.[12] Hardy, Warren N., et al. Investigation of head injury mechanisms usingneutral density technology and high-speed biplanar X-ray. No. 2001-22-0016. SAE Technical Paper, 2001.[13] O’Keeffe, Eoin, et al. ”Dynamic Blood–Brain Barrier Regulation in MildTraumatic Brain Injury.” Journal of neurotrauma 37.2 (2020): 347-356.[14] Guskiewicz, Kevin M., et al. ”Cumulative effects associated with re-current concussion in collegiate football players: the NCAA ConcussionStudy.” Jama 290.19 (2003): 2549-2555.[15] Beckwith, Jonathan G., et al. ”Timing of concussion diagnosis is relatedto head impact exposure prior to injury.” Medicine and science in sportsand exercise 45.4 (2013): 747.[16] Ponsford, Jennie, et al. ”Impact of early intervention on outcome aftermild traumatic brain injury in children.” Pediatrics 108.6 (2001): 1297-1303.[17] G¨uiza, Fabian, et al. ”Early detection of increased intracranial pressureepisodes in traumatic brain injury: External validation in an adult and ina pediatric cohort.” Critical care medicine 45.3 (2017): e316-e320.2