In Silico Prediction of Cell Traction Forces
Nicolas Pielawski, Jianjiang Hu, Staffan Strömblad, Carolina Wählby
Uppsala University, Dept. of Information Technology, Lägerhyddsvägen 2, 752 37 Uppsala
Karolinska Institutet, Department of Biosciences and Nutrition, Hälsovägen 7C, 141 57 Huddinge
ABSTRACT
Traction Force Microscopy (TFM) is a technique used to determine the tensions that a biological cell conveys to the underlying surface. Typically, TFM requires culturing cells on gels with fluorescent beads, followed by bead displacement calculations. We present a new method that predicts those forces from a regular fluorescent image of the cell. Using Deep Learning, we trained a Bayesian Neural Network adapted for pixel regression of the forces and show that it generalises to different cells of the same strain. The predicted forces are computed along with an approximate uncertainty, which shows whether the prediction is trustworthy or not. The proposed method could help estimate forces when bead displacements are non-trivial to calculate, and can also free one of the fluorescent channels of the microscope. Code is available at https://github.com/wahlby-lab/InSilicoTFM.

Index Terms — Traction Force Microscopy, Deep Learning, Regression, Uncertainty, Bayesian Neural Network
1. INTRODUCTION
In 1999, Dembo et al. developed a new technique named Traction Force Microscopy and made possible the visualisation of cellular forces by placing cells on a soft gel containing randomly placed fluorescent beads [1]. Using this technique, they could retrieve the displacement of the beads and generate a vector field representing the local forces, with their magnitude and direction. Traction forces are often used in studies of cell migration patterns; it has been hypothesised that cancer cells exhibiting high traction forces are more invasive than cells with lower activity [2].

In 2010, Lemmon et al. [3] – on the basis that the shape and size of a cell correlates with the amount of force it exerts on the substrate – described a method allowing the prediction of traction forces based solely on the cell geometry.

In recent years, neural networks gained popularity in biology research, mainly due to their successful application to various problems as well as the availability of computing capacity to train them. In 2015, Ronneberger et al. [4] proposed a new type of neural network architecture: the U-Net. It contains a down-sampling path working on feature maps of decreasing resolution, followed by an up-sampling path that constructs the output image. Both paths are connected with skip-connections that allow information to flow through the network. In 2017, Jégou et al. built upon the U-Net architecture to create the Tiramisu neural network [5]. Adding various improvements, such as the dense blocks from the DenseNet neural network by Huang et al. [6], they improved the performance of the U-Net architecture while substantially reducing the number of trainable weights.

The insight from the method by Lemmon et al. [3], combined with a neural network able to use the structural information of a cell at different scales, such as the Tiramisu network, offers an opportunity to translate a fluorescent image of a cell into an image estimating traction forces. To the best of our knowledge, this paper describes the first attempt at using deep neural networks to predict cell traction forces.

This paper describes the process that leads to the creation of the data, the construction of the deep neural network to predict the cellular forces, and the measurement of the uncertainty around the prediction of the neural network.

Thanks to the Swedish Foundation for Strategic Research for funding (grant SB16-0046).
2. MATERIAL AND METHODS

2.1. Image data
Immortalised human fibrosarcoma cell line HT1080 stably expressing a FRET-based RhoA biosensor [7] was used in this study. The cells were seeded on a collagen type I-coated polyacrylamide gel (6.9 kPa) containing red fluorescent beads (Invitrogen, F8801) three hours before the imaging started. An environmental chamber equipped with a Nikon A1 confocal microscope with a 60x oil objective (NA 1.4) was used to image single-cell migration and displacements of the fluorescent beads at a resolution of 200 nm/pixel and a time interval of 30 seconds for 1.5 hours. 457 nm and 561 nm lasers were used to excite the FRET biosensor and the red fluorescent beads respectively, while 525/50 and 595/50 emission filters were used to collect the signals. After the time-lapse imaging, cells were trypsinized and single snapshots of the fluorescent beads were collected to get the positions of the beads in the released state. MATLAB R2014b with the traction force microscopy package from the Danuser Lab [8] was used to calculate the traction force based on the bead displacements.

Fig. 1. Example raw images fed to the neural network (top row), the ground truth (middle row), and the predicted forces (bottom row) of the test cell over a time series (t = 0 s to t = 5400 s). The orange crosses represent the pixel chosen for generating figure 3. The full sequence – the ground-truth force overlapped with the prediction of the test cell – is available at https://youtu.be/QhzNmrA42T4.

Two datasets were generated in this fashion at two different occasions and are available on Zenodo [9]. The first dataset (12 cells) was used for training and the second one (11 cells, and 1 test cell) for validation purposes. The last cell from the validation dataset was taken out to generate the figures and will be called the test cell.
We modified a Tiramisu segmentation neural network [5] – a U-Net architecture made of dense blocks from the DenseNet architecture – in order to predict forces and their uncertainties.

The last layer was replaced by two fully connected layers that predict the mean and the variance (aleatoric uncertainty) of the estimated forces. The mean forces have no activation function (linear activation), while the variance has a squared softplus activation, ln(1 + exp(x))². The weights of the neural network are initialised with a Glorot normal distribution, except for the variance layer, whose kernels are initialised with zeros and whose biases are initialised with ln(e − 1), so that the predicted variance is 1 at the beginning of training.

The drop-out layers were modified so that they are active during training as well as during inference, in order to predict the epistemic uncertainty. The drop-out rate is 20% throughout.

The architecture consists of six dense blocks, with five layers per block and a growth rate of 16. The initial filter has a size of 3x3 and a depth of 48. In the expansive path of the neural network, at each level, the feature maps are up-sampled by a factor of two without interpolation, followed by 2-D convolutions with a depth of 128 filters. The Adam optimiser is used with L-2 weight decay and gradient clipping of the L-2 norm.

The model was trained on four Titan Xp graphics cards over 200 epochs, with 50 steps per epoch and a batch size of 8, so that each GPU deals with two images at a time.

The Kullback-Leibler (KL) divergence measures the relative entropy between two probability distributions. Training a neural network by Maximum Likelihood Estimation (MLE) is analogous to minimising the KL divergence.
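The variance head described above can be sketched in a few lines of NumPy. This is an illustrative check, not the authors' implementation; `softplus`, `variance_head` and the layer sizes are our own names and assumptions. It verifies that a zero kernel and a bias of ln(e − 1) make the squared-softplus output exactly 1 at initialisation.

```python
import numpy as np

def softplus(x):
    # softplus(x) = ln(1 + exp(x)); numerically fine for this small check
    return np.log1p(np.exp(x))

def variance_head(features, kernel, bias):
    """Aleatoric-variance head: a linear layer followed by a squared
    softplus activation, as described in the text."""
    return softplus(features @ kernel + bias) ** 2

# Initialisation described in the text: zero kernel, bias = ln(e - 1).
feats = np.random.randn(4, 128)      # 4 pixels, 128 features (illustrative sizes)
kernel0 = np.zeros((128, 1))
bias0 = np.log(np.e - 1.0)

var0 = variance_head(feats, kernel0, bias0)
# softplus(ln(e - 1)) = ln(1 + (e - 1)) = ln(e) = 1, so var0 is all ones.
```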
Because the log of the data is normally distributed, we used the KL divergence between two log-Normal distributions, which yields the following loss function [10]:

\[ \mathcal{L}_{\mathrm{MSE}}(\mu, \hat{\mu}, \hat{\sigma}) = D_{\mathrm{KL}}\big(\ln\mathcal{N}(\mu, 0) \,\big\|\, \ln\mathcal{N}(\hat{\mu}, I_n\hat{\sigma})\big) \propto \frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{2\hat{\sigma}_i}\,\|\mu_i - \hat{\mu}_i\|^2 + \frac{1}{2}\ln\hat{\sigma}_i\right) \quad (1) \]

with $n$ the number of data points, $\mu$ the ground truth, $\hat{\mu}$ the prediction, and $\hat{\sigma}$ the predicted uncertainty, assuming we have no uncertainty on the ground truth. This loss function is the mean squared error loss that would be derived from the KL divergence between two normal distributions. Due to the properties of the log-normal distribution, the ground-truth data needs to be log-transformed, and the trained neural network yields log-forces.

Construction of the forces based on the bead displacements creates high-intensity artefacts close to the borders. To cope with this unwanted behaviour, all images of forces were masked with a 2-dimensional 10% cosine-tapered (Tukey) window [11].
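Reading $\hat{\sigma}_i$ in eq. (1) as the predicted variance of the log-forces, the loss can be sketched in NumPy as follows. This is a sketch under that assumption, not the training code; the function name is ours.

```python
import numpy as np

def lognormal_kl_loss(mu, mu_hat, sigma_hat):
    """Loss proportional to eq. (1): squared error between the log-force
    ground truth `mu` and prediction `mu_hat`, weighted by the predicted
    variance `sigma_hat`, plus a penalty term that keeps `sigma_hat`
    from growing without bound. Averaged over the n pixels."""
    per_pixel = (mu - mu_hat) ** 2 / (2.0 * sigma_hat) + 0.5 * np.log(sigma_hat)
    return per_pixel.mean()

# With sigma_hat = 1 everywhere, the penalty vanishes and the loss
# reduces to half the mean squared error of the log-forces.
loss = lognormal_kl_loss(np.array([1.0, 2.0]), np.array([0.0, 2.0]), np.ones(2))
```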
[Fig. 2 panel titles, left to right: raw data, force data, prediction, std. dev. of the force, model uncertainty, data uncertainty, coefficient of variation (σ/µ), entropy (bits).]
Fig. 2. The test cell at time 4320 s. The raw data represents the data that was fed to the neural network; the remaining images have been generated from the output of the neural network. The full time sequence is available at: https://youtu.be/U9-Tn9ojXAU.

The construction of the batches follows a pipeline with three distinct steps. First, because the images consist mostly of background, we extracted the cells from the images by thresholding them. A morphological closing and opening were applied sequentially with a box kernel of size 5x5, and a bounding box was then fitted around the biggest blob available. In the second step, the images were flipped horizontally and vertically, each with a probability of 50%. The resulting images were randomly rotated with bi-cubic interpolation and randomly cropped to create uniform batches of size 256x256. Salt noise was added over 1% of the pixels of 50% of the input images, with a random intensity sampled from a uniform distribution (a = 0, b = 2000). This increased the robustness of the neural network towards sparse noise of potentially high intensity. The final step consisted of log-transforming the resulting images with the clipped log function: ln(max(1, x)).

Measuring uncertainty is an important aspect of this study, as it adds another dimension to the understanding of the output of the neural network. Indeed, allowing users to know whether a prediction can be trusted, and to which extent, can be useful for further research. For instance, it becomes possible to select only images that reach a specific certainty, or even to perform statistical testing.
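The salt-noise and clipped-log steps of the batch pipeline above can be sketched as follows. This is a minimal NumPy illustration; the function names and fixed seed are our own choices, while the constants come from the text (1% of pixels, intensities uniform in [0, 2000], and ln(max(1, x))).

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility (our choice)

def add_salt_noise(img, fraction=0.01, low=0.0, high=2000.0):
    """Salt noise over roughly `fraction` of the pixels, with intensities
    drawn from a uniform distribution on [low, high]."""
    out = img.copy()
    mask = rng.random(img.shape) < fraction
    out[mask] = rng.uniform(low, high, size=int(mask.sum()))
    return out

def clipped_log(img):
    """Final preprocessing step: ln(max(1, x)), which maps background
    values <= 1 to exactly 0."""
    return np.log(np.maximum(1.0, img))

noisy = add_salt_noise(np.zeros((256, 256)))      # roughly 1% of pixels set
logged = clipped_log(np.array([0.5, 1.0, np.e]))  # -> [0, 0, 1]
```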
The aleatoric and the epistemic are two different ways to communicate about uncertainties. The former can be quantified in closed form and provides an uncertainty related to the lack of information in the input data, whereas the latter needs to be approximated and yields information about the lack of agreement within the model. The approximation of the epistemic uncertainty was achieved using the Monte Carlo drop-out method as described by Kendall et al. in [12], with some modifications to accommodate the log-normal distribution.

Fig. 3. Ground-truth and predicted force of an arbitrary pixel (displayed as an orange cross in figure 1). Blue regions represent the 68%, 90% and 95% prediction confidence intervals, and the entropy of the prediction (in bits) is displayed as a dashed line.

Given that a sampled neural network $t$ formulates a prediction $\mu_t$ and aleatoric uncertainty $\sigma_t$, the force can be computed as follows:

\[ \mathbb{E}[\hat{y}_i \mid x^*_i, X] \approx \frac{1}{T}\sum_{t=1}^{T} \exp(\mu_t + \sigma_t/2) \quad (2) \]

for a given pixel $x^*_i$ and input image $X$, with $T$ sampled neural networks whose weights are drawn from a drop-out distribution.

Accordingly, the full prediction variance is derived as:

\[ \mathbb{V}[\hat{y}_i \mid x^*_i, X] \approx \frac{1}{T}\sum_{t=1}^{T} \exp(2\mu_t + \sigma_t)\big(\exp(\sigma_t) - 1\big) + \frac{1}{T}\sum_{t=1}^{T} \exp(\mu_t + \sigma_t/2)^2 - \mathbb{E}[\hat{y}_i \mid x^*_i, X]^2 \quad (3) \]

This variance is a sum of the aleatoric and the epistemic uncertainties, respectively.

The coefficient of variation is derived by dividing the standard deviation by the mean:

\[ \mathrm{CV}[\hat{y}_i \mid x^*_i, X] \approx \frac{1}{T}\sum_{t=1}^{T} \sqrt{\exp(\sigma_t) - 1} \quad (4) \]

and gives information about the amount of uncertainty in relation to the intensity of the force, even though the formula does not take the parameter $\mu$ into account.
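Assuming the $T$ drop-out samples for one pixel are collected into arrays `mu_t` and `sigma_t` (with `sigma_t` read as the per-sample variance of the log-force), eqs. (2)-(4) can be sketched as below. This is a sketch with our own function names, not the authors' code.

```python
import numpy as np

def predicted_force(mu_t, sigma_t):
    """Eq. (2): average of the per-sample log-normal means."""
    return np.mean(np.exp(mu_t + sigma_t / 2))

def predicted_variance(mu_t, sigma_t):
    """Eq. (3): mean per-sample (aleatoric) log-normal variance plus the
    spread of the per-sample means (epistemic)."""
    aleatoric = np.mean(np.exp(2 * mu_t + sigma_t) * (np.exp(sigma_t) - 1))
    second_moment = np.mean(np.exp(mu_t + sigma_t / 2) ** 2)
    return aleatoric + second_moment - predicted_force(mu_t, sigma_t) ** 2

def coefficient_of_variation(sigma_t):
    """Eq. (4): mean log-normal coefficient of variation; note that it
    depends only on sigma_t, not on mu_t."""
    return np.mean(np.sqrt(np.exp(sigma_t) - 1))
```

With a single sample (T = 1) the epistemic term vanishes and eq. (3) reduces to the variance of one log-normal distribution; conversely, with sigma_t = 0 only the disagreement between the sampled networks remains.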
The entropy of a log-normal distribution is defined in Kvalseth [13], and can be approximated in the following way:

\[ H[\hat{y}_i \mid x^*_i, X] \approx \frac{1}{T}\sum_{t=1}^{T} \log_2\!\big(\sqrt{2\pi\sigma_t}\,\exp(\mu_t + \tfrac{1}{2})\big) \quad (5) \]

which yields the predictive entropy of the neural network. This method, used by Nair et al., "is a measure of how much information is in the model predictive density function at each [pixel] i" [14]. This measure reveals the relative number of bits missing from each individual log-normally distributed pixel.
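Eq. (5) can be sketched in the same fashion (again with our own function name, and sigma_t read as the per-sample variance of the log-force):

```python
import numpy as np

def predictive_entropy_bits(mu_t, sigma_t):
    """Eq. (5): mean differential entropy, in bits, of the T per-sample
    log-normal predictions: log2(sqrt(2*pi*sigma_t) * exp(mu_t + 1/2))."""
    return np.mean(np.log2(np.sqrt(2 * np.pi * sigma_t) * np.exp(mu_t + 0.5)))
```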
3. RESULTS
The neural network was evaluated on the validation set; the test set was used for illustration purposes only. The validation set was not used for hyper-parameter optimisation, in order to avoid a possible human bias.

Figure 1 shows the predicted forces of the test cell over time. A small orange cross represents a pixel chosen arbitrarily to generate Figure 3.
Fig. 4. Mean Absolute Error (MAE) of the training, validation and test sets for each individual frame. The standard deviation represents the spread around the mean for the training set (12 cells) and the validation set (11 cells).

Figure 3 represents the prediction of the forces compared to the ground truth. The log-normal percent-point (quantile) function was used to generate the confidence intervals.

Figure 4 displays the Mean Absolute Error over the time sequence of the cells (181 frames) for each of the sets. The sets were not augmented, but the 10% Tukey mask was still applied to the forces. The mean MAE of the training set reached . ± . (1 standard deviation), . ± . for the validation set, and . on the test cell. The figure shows some signs of over-fitting, as the training set MAE outperforms the validation and test sets. Globally, the error remains stable over time, even though the i.i.d. assumption on the data was not respected.
4. DISCUSSION AND CONCLUSION
We presented a novel method: using Deep Learning to predict cellular traction forces, even when information directly related to the forces is not available. Indeed, despite the neural network not having access to the beads or to fluorescent channels linked to proteins correlated with the force, it successfully made use of the cell geometry to accurately infer cell forces.

Adding channels representing fluorescent proteins related to cellular traction forces – e.g. actin or integrin – could bring dramatic improvements to the accuracy of the Deep Learning model. Moreover, relying on the intensity of a fluorescent protein could increase generalisation and stability when varying the softness of the gel, or when using a glass medium. In addition, our method has so far been applied to one cell line, and generalisation to other cell lines will require further testing and development.
5. REFERENCES

[1] Micah Dembo and Yu-Li Wang, "Stresses at the cell-to-substrate interface during locomotion of fibroblasts," Biophysical Journal, vol. 76, no. 4, pp. 2307–2316, 1999.
[2] Thorsten M Koch, Stefan Münster, Navid Bonakdar, James P Butler, and Ben Fabry, "3D traction forces in cancer cell invasion," PLoS ONE, vol. 7, no. 3, p. e33476, 2012.
[3] Christopher A Lemmon and Lewis H Romer, "A predictive model of cell traction forces based on cell geometry," Biophysical Journal, vol. 99, no. 9, pp. L78–L80, 2010.
[4] Olaf Ronneberger, Philipp Fischer, and Thomas Brox, "U-Net: Convolutional networks for biomedical image segmentation," in International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2015, pp. 234–241.
[5] Simon Jégou, Michal Drozdzal, David Vazquez, Adriana Romero, and Yoshua Bengio, "The one hundred layers tiramisu: Fully convolutional DenseNets for semantic segmentation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017, pp. 11–19.
[6] Gao Huang, Zhuang Liu, Laurens Van Der Maaten, and Kilian Q Weinberger, "Densely connected convolutional networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 4700–4708.
[7] R. D. Fritz, M. Letzelter, A. Reimann, K. Martin, L. Fusco, L. Ritsma, B. Ponsioen, E. Fluri, S. Schulte-Merker, J. van Rheenen, and O. Pertz, "A versatile toolkit to produce sensitive FRET biosensors to visualize signaling in time and space," Science Signaling, vol. 6, no. 285, p. rs12, 2013.
[8] S. J. Han, Y. Oak, A. Groisman, and G. Danuser, "Traction microscopy to identify force modulation in subresolution adhesions," Nature Methods, vol. 12, no. 7, pp. 653–656, 2015.
[9] Jianjiang Hu, "Traction force microscopy dataset," Oct 2019, doi: 10.5281/zenodo.3484797.
[10] Manuel Gil, Fady Alajaji, and Tamas Linder, "Rényi divergence measures for commonly used univariate continuous distributions," Information Sciences, vol. 249, pp. 124–131, 2013.
[11] Fredric J Harris, "On the use of windows for harmonic analysis with the discrete Fourier transform," Proceedings of the IEEE, vol. 66, no. 1, pp. 51–83, 1978.
[12] Alex Kendall and Yarin Gal, "What uncertainties do we need in Bayesian deep learning for computer vision?," in Advances in Neural Information Processing Systems, 2017, pp. 5574–5584.
[13] T Kvalseth, "Some informational properties of the log-normal distribution (corresp.)," IEEE Transactions on Information Theory, vol. 28, no. 6, pp. 963–966, 1982.
[14] Tanya Nair, Doina Precup, Douglas L Arnold, and Tal Arbel, "Exploring uncertainty measures in deep networks for multiple sclerosis lesion detection and segmentation."