[PDF] Sarc-Graph: Automated segmentation, tracking, and analysis of sarcomeres in hiPSC-derived cardiomyocytes

Abstract

A better fundamental understanding of human induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CMs) has the potential to advance applications ranging from drug discovery to cardiac repair. Automated quantitative analysis of beating hiPSC-CMs is an important and fast developing component of the hiPSC-CM research pipeline. Here we introduce "Sarc-Graph," a computational framework to segment, track, and analyze sarcomeres in fluorescently tagged hiPSC-CMs. Our framework includes functions to segment z-discs and sarcomeres, track z-discs and sarcomeres in beating cells, and perform automated spatiotemporal analysis and data visualization. In addition to reporting good performance for sarcomere segmentation and tracking with little to no parameter tuning and a short runtime, we introduce two novel analysis approaches. First, we construct spatial graphs where z-discs correspond to nodes and sarcomeres correspond to edges. This makes measuring the network distance between each sarcomere (i.e., the number of connecting sarcomeres separating each sarcomere pair) straightforward. Second, we treat tracked and segmented components as fiducial markers and use them to compute the approximate deformation gradient of the entire tracked population. This represents a new quantitative descriptor of hiPSC-CM function. We showcase and validate our approach with both synthetic and experimental movies of beating hiPSC-CMs. By publishing Sarc-Graph, we aim to make automated quantitative analysis of hiPSC-CM behavior more accessible to the broader research community.

Full PDF

aa r X i v : . [ q - b i o . Q M ] F e b S AR C -G RAPH : A

UTOMATED SEGMENTATION , TR AC KING , ANDANALYSIS OF SAR C OMERES IN HI

PSC-

DERIVEDC AR DIOMYOCYTES

Bill Zhao

Department of Mechanical Engineering, Boston UniversityBoston, MA 02215, USA [email protected]

Kehan Zhang

Department of Biomedical Engineering, Boston UniversityThe Wyss Institute for Biologically Inspired Engineering, Harvard UniversityBoston, MA, 02115, USA [email protected]

Christopher S. Chen

Department of Biomedical Engineering, Boston UniversityThe Wyss Institute for Biologically Inspired Engineering, Harvard UniversityBoston, MA, 02115, USA [email protected]

Emma Lejeune

Department of Mechanical Engineering, Boston UniversityBoston, MA 02215, USA [email protected]

February 5, 2021 A BSTRACT

A better fundamental understanding of human induced pluripotent stem cell-derived cardiomyocytes(hiPSC-CMs) has the potential to advance applications ranging from drug discovery to cardiac re-pair. Automated quantitative analysis of beating hiPSC-CMs is an important and fast developingcomponent of the hiPSC-CM research pipeline. Here we introduce “Sarc-Graph,” a computationalframework to segment, track, and analyze sarcomeres in ﬂuorescently tagged hiPSC-CMs. Ourframework includes functions to segment z-discs and sarcomeres, track z-discs and sarcomeres inbeating cells, and perform automated spatiotemporal analysis and data visualization. In additionto reporting good performance for sarcomere segmentation and tracking with little to no parame-ter tuning and a short runtime, we introduce two novel analysis approaches. First, we constructspatial graphs where z-discs correspond to nodes and sarcomeres correspond to edges. This makesmeasuring the network distance between each sarcomere (i.e., the number of connecting sarcomeresseparating each sarcomere pair) straightforward. Second, we treat tracked and segmented compo-nents as ﬁducial markers and use them to compute the approximate deformation gradient of theentire tracked population. This represents a new quantitative descriptor of hiPSC-CM function. Weshowcase and validate our approach with both synthetic and experimental movies of beating hiPSC-CMs. By publishing Sarc-Graph, we aim to make automated quantitative analysis of hiPSC-CMbehavior more accessible to the broader research community.

PREPRINT - F

EBRUARY

5, 2021

Author summary (general audience)

Heart disease is the leading cause of death worldwide. Because of this, many researchers are studying heart cells in thelab and trying to create artiﬁcial heart tissue. Recently, there has been a growing focus on human induced pluripotentstem cell-derived cardiomyocytes (hiPSC-CMs). These are cells that are safely sampled from living humans, forexample from the blood or skin, that are then transformed into human heart muscle cells. One active research goal isto use these cells to repair the damaged heart. Another active research goal is to test new drugs on these cells beforetesting them in animals and humans. However, one major challenge is that hiPSC-CMs often have an irregular internalstructure that is difﬁcult to analyze. At present, their behavior is far from fully understood. To address this, we havecreated software to automatically analyze movies of beating hiPSC-CMs. With our software, it is possible to quantifyproperties such as the amount and direction of beating cell contraction, and the variation in behavior across differentparts of each cell. These tools will enable further quantitative analysis of hiPSC-CMs. With these tools, it will beeasier to understand, control, and optimize artiﬁcial heart tissue created with hiPSC-CMs, and quantify the effects ofdrugs on hiPSC-CM behavior.

Quantitative analysis of movies of beating cardiomyocytes is a compelling approach for connecting cell morphologyto dynamic cell function [1, 2]. In particular, connecting structure and function is a crucial step towards a betterfundamental understanding of human induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CMs) [3, 4]. Instark contrast to sarcomeres in mature cardiomyocytes which have a highly ordered regular structure [5], sarcomeres inhiPSC-CMs are typically immature and disordered [6]. This irregular structure, combined with large variation betweencells, makes quantitative analysis both more difﬁcult and more pressing [7]. Robust quantitative analysis frameworksfor analyzing beating hiPSC-CMs will help enable technological advances in drug discovery, genetic cardiac disease,and cardiac repair [8, 9, 10]. In this work, we aim to make automated quantitative analysis of hiPSC-CM contractilebehavior more accessible to the broader research community. a) b) c)d)

Figure 1: Analyzing movies of contracting human induced pluripotent stem cell-derived cardiomyoctyes (hiPSC-CMs): (a) Still image of a beating hiPSC-CM movie; (b) Movie still with a schematic illustration of deformationgradient F avg and tracked sarcomeres overlayed, sarcomere color corresponds to contraction level with red indicatingthe highest level of contraction; (c) Magnitudes of average principal stretches ( λ , λ ) computed from F avg withrespect to the movie frame number; (d) Average normalized sarcomere length with respect to the movie frame number.We note that the deﬁnitions of F avg , λ , and λ are introduced in this text.Fig. 1a shows a hiPSC-CM with ﬂourescently labeled z-discs. Movies of these ﬂourescently labeled hiPSC-CMscontracting are incredibly information rich [10]. Critically, with dynamic data it is possible to not only measurestructure but also measure aspects of how the structural components functionally interact [11, 12, 13, 14, 4]. Thefocus of this work is on extracting structural and functional information from movies of beating hiPSC-CMs withﬂourescently labeled z-discs. To extract biologically relevant quantitative information from these movies, we focusprimarily on segmenting, tracking, and analyzing individual sarcomeres. For context, each movie will typically haveover frames, and each frame will contain s of sarcomeres. Therefore, it is infeasible to perform segmentationand tracking by hand at scale. In the ﬁrst component of our computational framework, we introduce a procedure forautomated sarcomere segmentation and tracking. Example results from segmentation and tracking are show in Fig.1b.Building on effective segmentation and tracking, the second contribution of our computational framework is toolsfor automated analysis of hiPSC-CM contractile behavior. To accomplish this, we ﬁrst draw inspiration from recent2 PREPRINT - F

EBRUARY

5, 2021literature on deﬁning summary metrics for describing cardiomyocyte structure and cytoskeletal structure [15, 16] andmechanical deformation in multiple cell types [17, 18]. Speciﬁcally, we treat the tracked sarcomeres as ﬁducial markersand quantify the principal directions and magnitudes of average cell contraction for each movie frame. An example ofapplying this approach is shown in Fig. 1c, with Fig. 1d for comparison. Then, we move beyond average metrics andextend our framework to readily perform spatiotemporal analysis of sub-cellular sarcomere contraction. Speciﬁcally,we treat the structurally disordered and complex hiPSC-CMs as spatial graphs where z-discs are represented as nodesand sarcomeres are represented as edges. With a spatial graph deﬁned, performing spatial statistics based analysis isstraightforward. For example, it is possible to examine the correlation between individual sarcomere contraction timeseries using both euclidean and network distances.In conjunction with segmentation, tracking, and analysis, our computational framework – Sarc-Graph –also contains multiple visualization tools. The remainder of this paper is focused on describing keycomponents of the framework. We note that all code is available free and open source on GitHub: https://github.com/elejeune11/Sarc-Graph . The code is designed in a modular way such that making majoralterations to one component or performing only a subset of the segmentation, tracking, and analysis steps should bestraightforward when appropriate. Looking forward, we anticipate that this work is a starting point for a major effortin quantiﬁcation and statistical analysis of hiPSC-CM contractile behavior.

In general, it is infeasible to perform segmentation and tracking by hand at scale for movies of hiPSC-CMs withﬂourescently labeled z-discs. To address the lack of “ground truth” data we invest signiﬁcant effort in realistic syntheticdata generation to validate the segmentation and tracking components of our computational framework. In addition,we show the results of applying our framework to a selection of real experimental movies.

In Fig. 2, we show the critical components of the synthetic data generation pipeline. The ﬁrst step is illustrated in Fig.2a where we show an example of generating the three-dimensional (3D) skeleton of z-disc and sarcomere geometryfor each movie frame. First, a baseline geometry is deﬁned. Then, the baseline geometry is deformed in space suchthat each sarcomere is associated with a ground truth function for contraction (shown here as fractional change inlength) with respect to time. By starting with this 3D skeleton geometry, it is straightforward to include examples withinhomogeneous contraction, out of plane orientation and deformation, and sarcomere chains that appear to intersectwhen projected into two-dimensional (2D) space.Given the 3D skeleton geometry in Fig. 2a, we then render each frame as illustrated in Fig. 2b-d. First, each z-disc is treated as an oriented cylinder with an associated height and radius. We note that with this approach it isstraightforward to include multiple z-disc sizes in the same image. Then, the domain is converted into a voxel arraysuch that the resolution is representative of the experimental images of interest. The voxel array, a 3D matrix, is thensliced around the simulated focal plane and projected into 2D. Both image noise and blur can be added at the voxelarray step, 2D image step, or both. In the representative examples shown here, we add Gaussian blur to the voxel array[19] and Perlin noise to the 2D image [20]. We note that the entire synthetic data generation pipeline is available onGitHub. All code is written in Python with heavy use of the numpy and matplotlib packages [21, 22].

The ﬁrst two examples of experimental data included with this paper (E1 and E2) were previously made available in apublication by Toepfer et al. [10]. The protocol to introduce green ﬂuorescent protein (GFP) onto z-disc protein titin(TTN-GFP) in hiPSC-CMs is reported in [24]. Video imaging was conducted on small clusters of hiPSC-CMs usinga

X objective of a ﬂuorescent microscope with a minimum acquisition rate of 30 frames per second. The threeadditional examples included with this publication were selected to showcase the general applicability of Sarc-Graph.Cells in E3, E4, and E5 were ﬂuorescently labeled using lentiviral constructs lenti-GFP-actinin-2 or lenti-mApple-actinin-2 as described in [23], and were electrically stimulated at 1Hz using the IonOptix C-Pace EP Culture Pacer(IonOptix) during imaging. Time-lapse videos of the cells contraction were acquired at 30 frames per second using a40x objective on a Nikon Eclipse Ti (Nikon Instruments, Inc.) with an Evolve EMCCD Camera (Photometrics) or ona Zeiss Axiovert 200M inverted spinning disk microscope with an ORCA-100 Camera (Hamamatsu), equipped witha temperature and CO2 equilibrated environmental chamber. Key details are summarized in Table 1. We note that,though not included as examples with this publication, our computational framework has also been tested on additional3

PREPRINT - F

EBRUARY

5, 2021Figure 2: Creating synthetic data to test the image analysis code: (a) Deﬁne the geometry of z-discs (points) andsarcomeres (line segments) in three-dimensional (3D) space and prescribe the ground truth deformation for eachsarcomere; (b) Approximate each z-disc as an oriented cylinder in 3D space; (c) Convert the 3D geometry into a voxelarray with higher resolution in the x and y directions than the z direction; (d) Slice and project the voxel array to obtaina two-dimensional (2D) image. Image noise and blur can be added to the voxel array, 2D image, or both.key information on the experimental data key Sarc-Graph analysis challengesE1 hiPSC-CMs cultured on a ﬁbronectin coated glass substrate, ver-tically aligned ﬁbers. large deformationE2 hiPSC-CMs cultured on a ﬁbronectin coated glass substrate, tan-gentially aligned ﬁbers. large deformationE3 Single paced hiPSC-CM constrained on a micropatterned island( µm ) with ﬁbronectin coating on top of soft polyacry-lamide hydrogel ( . kPa). The method for making the hydrogelis described in [23]. low spatial resolution, large deformationE4 Monolayer of paced hiPSC-CMs cultured on a ﬁbronectin-coated glass substrate. substantial background signalE5 Single paced hiPSC-CMs cultured on a ﬁbronectin-coated glasssubstrate. low spatial resolution, small deformationTable 1: Summary of experimental examples included in this paper (E1, E2, E3, E4, and E5). We note that ExamplesE1 and E2 have already been published and made publicly available [10].beating hiPSC-CM movies obtained by different researchers with subtly different experimental set ups. We also notethat for all examples shown here, both experimental and synthetic, our framework does not require any parametertuning. All input parameters are identical across all cases shown in this paper. Sarc-Graph is divided into seven modular components: file_pre_processing , segmentation , tracking , time_series , spatial_graph , analysis_tools , and functional_metrics . The ﬁrst, file_pre_processing ,is simply used to convert each movie into a series of 2D numpy arrays [21], one for each frame of the movie convertedto grayscale. The code is designed such that the only new step required to support a previously unsupported movie ﬁleformat is a function to convert each frame into a 2D numpy array. The remainder of the code can loosely be grouped as segmentation which is necessary for extracting spatial data, tracking and time_series which are necessary for4 PREPRINT - F

EBRUARY

5, 2021extracting temporal data, and spatial_graph , analysis_tools , and functional_metrics which are necessaryfor synthesizing and interpreting the spatial and temporal data. a) raw image b) Laplacian c) segmented z-disks d) segmented sarcomeres e) trackingf) algorithm to segment sarcomeres Figure 3: Segmenting and tracking the z-discs and sarcomeres: (a) An example of a raw two-dimensional (2D) inputimage; (b) The Laplacian of the input image is used to detect the high gradients present at the edge of every z-disc[25]; (c) z-discs are identiﬁed as closed contours where the value of the Laplacian exceeds the threshold computedwith Otsu’s method [26]; (d) Sarcomeres are procedurally identiﬁed from the segmented z-discs; (e) Sarcomeresand z-discs are tracked independently between frames with the Python trackpy package [27]; (f) The algorithm tosegment sarcomeres is based on linking the approximately parallel z-discs to their closest neighbors in the directionperpendicular to the z-disc; (g) Sarcomere properties are computed from the pair of associated z-discs; (h) Trackingeach sarcomere leads to multiple spatially resolved time series.Within the segmentation script, the function segmentation_all(folder_name, gaussian_filter_size) willsegment all z-discs and all sarcomeres from each frame. The ﬁrst parameter, “ folder_name ” speciﬁes the folder thatthe data is in. The second parameter, “ gaussian_filter_size ” speciﬁes the standard deviation for the Gaussiankernel applied to the image during z-disc segmentation [28]. The default value for gaussian_filter_size = 1 andthat is the value for all data shown in this paper. In our experience, a value of may be more appropriate for moviesof hiPSC-CMs that have high levels of irregular ﬂuctuatig background signal, and thus it is left as a potential tunableparameter with a default value of . The key steps of segmentation are illustrated in Fig. 3. z-disc segmentationis conducted by computing the Laplacian of the image [25], applying a Gaussian ﬁlter to the Laplacian [28], andthen using the scikit-image “ measure.find_contours() ” function [29] to identify contours with a level matchinga scalar threshold computed via Otsu’s method [26]. We note brieﬂy that a global threshold determined with Otsu’smethod is sufﬁcient (i.e., an adaptive threshold is not required) because thresholding is performed on the Laplacianrather than the original image. As illustrated in Fig. 3c, each contour represents a segmented z-disc and the propertiesof each z-disc are computed algorithmically from the contours. Of note, contour position is computed as the meanposition of each pixel in the contour, contour length is computed as the maximum possible distance between two pixelsin the contour, and contour endpoints are the locations of the pixels that correspond to the maximum distance.Once z-discs are segmented, sarcomeres are procedurally identiﬁed from the segmented z-discs with a custom algo-rithm schematically illustrated in Fig. 3f. In brief, the steps of the algorithm are as follows:5 PREPRINT - F

EBRUARY

5, 20211. Compute the distance between the center of each z-discs and the center of it’s nearest neighbor.2. Deﬁne median_neigh as the median distance between a z-disc and its nearest neighbor.3. For each z-disc, deﬁne a line that goes through the center of the z-disc and is perpendicular to the axis deﬁnedby the contour endpoints. Then, deﬁne the two points on the line that are median_neigh / distance fromthe z-disc center. These points are referred to as “ ghost_points .” This is illustrated in Fig. 3f.4. Find the nearest neighbor for each point in ghost_points .5. If there is a two-way nearest neighbor match between a pair in ghost_points , the pair of z-discs willcorrespond to a segmented sarcomere.As illustrated in Fig. 3f, this algorithm links approximately parallel z-discs. In order to ﬂexibly accommodate potentialimage artifacts, the criteria for “approximately parallel” is not strictly deﬁned. However, in most cases, there will notbe a persistent two way nearest neighbor match unless z-discs bound a convex quadrilateral, ideally a trapezoid. Ifnecessary, additional match rejection criteria could be integrated into the algorithm. Once a sarcomere is identiﬁed,its properties are computed from the properties of the corresponding paired z-discs. This is illustrated in Fig. 3g. Ofnote, sarcomere width is computed as the mean length of the two z-discs, sarcomere length is computed as the distancebetween the two z-disc centers, and sarcomere angle is computed as the angle of the vector connecting the two z-disccenters.For the examples shown in this paper, the segmentation step takes order of s of seconds to a few minutes to runon a single laptop. We also note that users working with still images rather than movies can still segment z-discs andsarcomeres from a single frame, and can still create a spatial graph with their data and perform several aspects of thespatial analysis described in this paper. For generality, all length data is reported in units of pixels. The tracking component of Sarc-Graph is for independently tracking all z-discs and all sarcomeres. Tracking thesarcomeres is necessary for reporting the relative length change from frame to frame. Without tracking, it is onlypossible to report population average relative length change which is substantially less information rich because thesarcomeres do not necessarily contract in perfect synchrony, and are not necessarily of identical lengths. We track thez-discs and sarcomeres independently because we use this information to construct a spatial graph, described later inthe text. Tracking is performed by calling the “ run_all_tracking(folder_name,tp_depth) ” function. As statedpreviously, “ folder_name ” speciﬁes the folder that the data is in. The second parameter, “ tp_depth ” speciﬁes thefarthest distance in pixels that a particle can travel between frames [27]. The default value for tp_depth = 4 and thatis the value for all data shown in this paper. In some cases, particularly for movies with lower spatial resolution, it isnecessary to change to tp_depth = 3 . Particle tracking is performed with the Python package trackpy which is basedon the Crocker–Grier algorithm [27, 30]. We note that we do not use trackpy for feature detection, instead we converteach segmented z-disc and sarcomere to a pandas DataFrame for compatibility with the trackpy framework [31]. Theoutcome of running run_all_tracking() is a unique ID for each particle that is tracked for over of the movie.With this information, it is possible to locate each tracked particle in space across multiple frames, as illustrated in Fig.3e where the position of each sarcomere across all frames is plotted.The time_series component of our computational framework processes the results from tracking so that there istime series data describing the length, normalized length ( L − ¯ L ) / ¯ L , width, angle, and position for each tracked sar-comere. This is accomplished by running the function “ timeseries_all(folder_name, keep_thresh) ” where keep_thresh represents the fraction of movie frames that a tracked sarcomere must be present in to get processed.We recommend keep_thresh = 0 . for quantitative analysis of time series data and keep_thresh > /k for visual-ization, where k is the approximate number of times the cell contracts during one movie. To process the results from tracking , Gaussian process regression is used to interpolate data across frames which temporarily lose signal [32].Gaussian process regression is implemented through python scikit-learn with a radial basis function (RBF) and whitenoise kernel [33]. In addition to interpolating between lost frames, Gaussian process regression has the added beneﬁtof adaptively smoothing the data without a need for additional dataset-speciﬁc input parameters. The main result ofrunning the timeseries_all function is illustrated in Fig. 3h where normalized sarcomere length is plotted withrespect to frame number. Even with this synthetic data example, the individual recovered time series deviate from theground truth. This is for two main reasons. First, for the ground truth geometry that we speciﬁed, some sarcomeresare angled out of the image z-plane and are thus not perfectly captured by a 2D image. Second, the limited resolutionof the images introduces artifacts where lengths less than 1 pixel cannot be perfectly captured. However, the mean ofall time series curves illustrated in Fig. 3h is a near perfect ﬁt to the ground truth.For the examples shown in this paper, the tracking step takes order of s of seconds to a minute to run on a singlelaptop. The time_series step takes on the order of s of seconds to a few minutes to run. The time_series PREPRINT - F

EBRUARY

5, 2021step will output several user friendly text ﬁles describing sarcomere properties with respect to time where each rowcorresponds to a tracked sarcomere and each column corresponds to a movie frame. The normalized length data canbe automatically plotted with the function plot_normalized_tracked_timeseries() from the analysis_tools ﬁle. For generality, all time data is reported in units of frames. b) construct a spatial graph

Figure 4: Data analysis from segmentation and tracking: (a) Tracked elements are treated as ﬁducial markers and usedto construct an average deformation gradient F (see eqns. 1-4); (b) z-discs and sarcomeres are tracked independentlyand used to construct a spatial graph (z-discs are “nodes” and sarcomeres are “edges”, edge color corresponds tosarcomere orientation, node color corresponds to correlation in sarcomere orientation) to measure correlations basedon network distance; (c) Automated analysis of time series data is used to extract constants such as mean contractiontime and number of peaks. F avg ) Developing methods to describe data rich images and movies of cells with a tractable set of output parameters is anactive area of research [34, 16]. In this paper speciﬁcally, we draw inspiration from recent work on summarizing datafrom traction force microscopy experiments [18] and agent based model simulations [35, 36] and propose a method tocompute the mean deformation gradient of each domain with respect to time. First, we deﬁne the standard continuummechanics deformation gradient F as follows: F d X = d x (1)where d X is a vector in the initial conﬁguration, d x is the vector in the deformed conﬁguration, and F is the mappingbetween them [37]. In the context of analyzing movies of contracting hiPSC-CMs, we ﬁrst deﬁne a set of n vectors v that connect each potential pair of tracked ﬁducial markers, as illustrated in Fig. 4a. The direction and magnitude ofeach vector v will change as the ﬁducial markers move between movie frames. With this deﬁnition, we can set up theover-determined system of equations: F avg Λ = Λ where Λ = [ v , v , ..., v n ] and Λ = [ v , v , ..., v n ] (2)where F avg is a × matrix, Λ is a × n matrix of vectors in the initial (reference) conﬁguration movie frame, and Λ is a × n matrix of vectors in the current (deformed) conﬁguration movie frame. Note that when the initial frameand the current frame are identical, Λ = Λ and F avg = I . The initial conﬁguration can be deﬁned as either the ﬁrstframe of the movie, or a selected movie frame where the cell is known to be in a relaxed state. Then, we can use thenormal equation to solve for the best ﬁt average deformation gradient as: F avg = ΛΛ T0 [ Λ Λ T0 ] − . (3)Once computed, the components of F avg can be plotted directly, or F avg can be manipulated to provide informationthat is more convenient to compare and interpret. For example, we can write F avg = R avg U avg (4)where R avg is a rotation tensor and U avg is a symmetric positive deﬁnite tensor that represents stretch. In the numericalsetting, R avg and U avg are computed with the scipy polar decomposition function [28]. We can also compute theeigenvalues and eigenvectors of U avg , λ avg1 , λ avg2 , u avg1 , and u avg2 respectively. In the numerical setting, this isdone with the numpy linalg package, and we always deﬁne λ avg1 < λ avg2 [21]. From a mechanical data interpretationperspective, the average stretch values λ avg1 and λ avg2 and the associated eigevectors are particularly convenient becausethey contain information about both the magnitude and direction of contraction. This is illustrated in Fig. 1b and 1c.7 PREPRINT - F

EBRUARY

5, 2021Because sarcomeres data is already processed through the time_series component of our framework, we use thesarcomeres as our ﬁducial markers to deﬁne Λ and Λ . We note that this approach would be just as effective with the z-discs chosen as ﬁducial markers. In addition, if 3D data was available, this approach would still work with F avg as a × matrix and Λ and Λ as × n matrices. Within the analysis_tools ﬁle, the function compute_F_whole_movie() will compute F avg for each movie frame. The function visualize_F_full_movie() will create a visualization of λ avg1 and λ avg2 as a function of the movie frame and schematically illustrate F avg overlaid on the original image data,as shown in Fig. 1. The computational cost of computing and processing F avg for all frames is on the order of secondsto s of seconds on a single laptop. The computational cost of data visualization is on the order of s of seconds tominutes. F avg ) to average sarcomere shortening ( ˜ s , s avg ) and OrientationalOrder Parameter (OOP) In addition to deﬁning tensor F avg as a dynamic metric that evolves over the course of the entire beating hiPSC-CMmovie, it is possible to formulate scalar metrics from F avg that are more straightforward to directly compare to wellestablished metrics for describing hiPSC-CMs [7]. Here we focus on deﬁning scalar metrics from F avg that are directlycomparable to average sarcomere shortening ( ˜ s , s avg ) [10], and the Orientational Order Parameter (OOP) [11, 16, 38].We compute average sarcomere shortening in two different ways. In both cases, we work with the normalized sarcom-ere length obtained from time_series y = ( L − ¯ L ) / ¯ L where L is sarcomere length in pixels. Given normalizedsarcomere length y for each movie frame, we deﬁne shortening as: s = y max − y min y max + 1 (5)where y max is the maximum normalized length and y min is the minimum normalized length. Then, we can deﬁne ˜ s as the median value of s for all sarcomeres tracked in the movie. We choose to use median rather than mean in thiscase because the mean is more susceptible to outliers and thus ˜ s is a better reﬂection of the ground truth sarcomeredeformation. In addition, we can compute the mean normalized length time series y avg and then compute s avg basedon the single mean time series. If all sarcomeres are contracting identically, ˜ s and s avg will be identical. However, ifthis is not the case, the two values will likely differ and ˜ s will reﬂect average individual sarcomere behavior while s avg will reﬂect both individual sarcomeres and the (lack of) synchrony between them. Functions to compute ˜ s and s avg are given in the functional_metrics component of Sarc-Graph.In addition to computing mean sarcomere shortening, we specify the method for computing OOP from hiPSC-CMmovies that we implement in the functional_metrics component of Sarc-Graph. When hiPSC-CMs are fullyrelaxed, individual sarcomere chains are no longer under systolic tension and can thus become wavy and lose localalignment. Therefore, the ﬁrst step to computing OOP is to deﬁne a frame of the movie where the lowest fractionof sarcomere chains will be in their relaxed and potentially wavy conﬁguration. To select this frame, we compute det( F avg ) for every movie frame with the ﬁrst frame of the movie used as the reference conﬁguration. Then, weidentify the frame where det( F avg ) is the greatest (i.e. the sarcomeres are most relaxed) and select that frame as thereference frame and recompute F avg . With the appropriately selected reference frame, we re-compute det( F avg ) anddeﬁne the most contracted frame as the frame with the lowest value of det( F avg ) .We compute OOP for the static image corresponding to the most contracted frame of the movie. To do this, we deﬁnestructural tensor T following the literature: T = (cid:28) (cid:20) r ix r ix r ix r iy r iy r ix r iy r iy (cid:21) − (cid:20) (cid:21) (cid:29) (6)where r i = [ r ix , r iy ] is the unit vector describing the orientation of the i th sarcomere [11]. Given T , with eigenvalues u max and u min and corresponding unit eigenvectors v max and v min , OOP = u max . When OOP = 0 . , sarcomeres areoriented randomly. When OOP = 1 . , sarcomeres are perfectly aligned. Corresponding eigenvector v max correspondsto the dominant direction of sarcomere orientation.In order to best relate F avg to OOP, we deﬁne two scalar metrics C iso and C || that describe average radial contraction,and contraction in the direction of dominant sarcomere orientation v max respectively. Unlike individual sarcomerecontraction or the average of individual sarcomere contraction, C iso and C || represent deformation throughout thewhole domain and not necessarily along the sarcomere axis. Metric C iso is deﬁned as C iso = 1 . − q det( F avg ) (7)where F avg is computed in the most contracted frame. Practically, C iso is the line shortening during contractionof an equivalent system where contraction is not directionally dependent. When C iso = 0 . , no shortening occurs.8 PREPRINT - F

EBRUARY

5, 2021As an example, a value C iso = 0 . corresponds to an equivalent isotropic line shortening of in all directions.Essentially, higher C iso indicates higher levels of hiPSC-CM contraction.To quantify shortening in the dominant direction speciﬁed by v max computed from T , we consider the relation F avg v = v max where v is the corresponding vector deﬁned in the reference frame. We can manipulate this equationto compute v = F − v max and then deﬁne C || as C || = | v | − | v max || v | . (8)With this deﬁnition, C || is the fractional shortening of a line in the direction of v max where direction is deﬁned in thecontracted conﬁguration. Conveniently, lower ˜ s and s avg , C iso , and C || all correspond to lower levels of contraction,and higher ˜ s and s avg , C iso , and C || all correspond to higher levels of contraction.Functions to compute and visualize ˜ s , s avg , OOP, C iso , and C || are all deﬁned in the functional_metrics script.The function “ compute_metrics ” will compute ˜ s , s avg , OOP, C iso , and C || and create a visualization of OOP and F avg . The function “ visualize_lambda_as_functional_metric ” will create a movie of λ avg1 and λ avg2 changingover the course of the movie. Examples of this are provided as Supplementary Information. The computational costof computing F avg (including computing C iso , and C || ) and ˜ s , s avg , and OOP for all frames is on the order of secondsto s of seconds on a single laptop. In addition to analyzing average cell and sarcomere behavior (ex: average z-disc and sarcomere morphological prop-erties, F avg ), we are interested in analyzing the spatiotemporal patterns that arise on the sub-cellular level. We imple-mented the spatial_graph component of Sarc-Graph to conveniently synthesize the outputs of the segmentation , tracking , and time_series steps. Speciﬁcally, running the function create_spatial_graph() will create aspatial graph where z-discs are represented as nodes and sarcomeres are represented as edges. With a spatial graphstructure, it is then possible to better quantify spatiotemporal patterns in hiPSC-CM behavior. For example, access toa spatial graph can help delineate sarcomere synchrony that depends on Euclidean distance, and sarcomere synchronythat depends on network distance. In addition, access to a spatial graph can help us better quantify both global andlocal sarcomere alignment. This functionality is illustrated in Fig. 4b.In the run_all_tracking() step, z-discs and sarcomeres are tracked independently. Every z-disc that is at leastpartially tracked (deﬁned as present in over of the movie frames) will have a unique local (frame-speciﬁc) ID anda unique global ID. Each node of the spatial graph corresponds to a unique global ID, and the local IDs speciﬁc to eachframe are linked to the corresponding global ID node. Each tracked sarcomere is associated with two local z-disc IDsin each frame. To add sarcomeres to the spatial graph, the global ID of the associated z-disc is ﬁrst determined. Then,either a new edge is created linking the two global IDs, or, if the edge already exists, the weight of the edge is increased.The spatial graph is created with the NetworkX python package, and all operations, such as computing the shortestpath between two nodes, rely on built-in NetworkX functionality [39]. The function create_spatial_graph() runsin seconds on a single laptop. Running create_spatial_graph() will also produce a schematic drawing of thespatial graph. In addition to the functions to compute and visualize F avg and other functional metrics ( compute_F_whole_movie() , visualize_F_full_movie() , compute_metrics , visualize_lambda_as_functional_metric ) andplot normalized sarcomere length with respect to time ( plot_normalized_tracked_timeseries() ), the analysis_tools component of Sarc-Graph contains multiple functions to further visualize and analyze the dataacquired from the hiPSC-CM movies. Example outputs from these functions are included onn the Sarc-Graph GitHubpage. Additional key analysis and visualization functions are as follows:• visualize_segmentation(folder_name, gaussian_filter_size, frame_num) : Visualiza-tion of the segmented z-discs and sarcomeres overlaid on the original image. The parameter gaussian_filter_size should match the value chosen in segmentation . The default for gaussian_filter_size = 1 and the default for frame_num = 0 .• visualize_contract_anim_movie(folder_name, re_run_timeseries,use_re_run_timeseries, keep_thresh=0.75) : Visualization of the results of tracking over-laid on the original movie of the contracting hiPSC-CMs. If re_run_timeseries = True and use_re_run_timeseries = True this function will re-run the timeseries_all() function with the newspeciﬁed keep_thresh . The saved tracking data will be designated for visualization purposes only.9

PREPRINT - F

EBRUARY

5, 2021• cluster_timeseries_plot_dendrogram(folder_name,compute_dist_DTW,compute_dist_euclidean) : Visualize hierarchical clustering of all normalized sarcomere lengthtime series data and plot a dendrogram that illustrates the clustering [28]. If compute_dist_DTW = True thesimilarity between each sarcomere time series will be measured with a Dynamic Time Warping algorithm[40]. Because this can be time consuming (order of minutes to s of minutes), we also provide an option toset compute_dist_euclidean = True and instead base clustering on the Euclidean distance between timeseries.• plot_normalized_tracked_timeseries(folder_name) : This function will create a plot of the normal-ized length of the tracked sarcomeres with respect to frame number.• plot_untracked_absolute_timeseries(folder_name) : This function will create a plot of the absolutesarcomere lengths in pixels with respect to frame number for all segmented sarcomeres. The number ofsegmented sarcomeres will exceed the number of tracked sarcomeres.• compute_timeseries_individual_parameters(folder_name) : This function will compute and savescalar values describing the normalized length time series data for the tracked sarcomeres (contraction time,relaxation time, ﬂat time, period, offset, etc.) The results of this analysis will be saved in a spreadsheet titled“ timeseries_parameters_info.xlsx .”• analyze_J_full_movie(folder_name) : This function will compute and save scalar values describing the F avg time series data. Speciﬁcally, it will compute time constants describing the plot of the scalar Jacobian( J = | F avg | ) with respect to frame number. This is illustrated in Fig. 3c.• compare_tracked_untracked(folder_name) : This function will compare the tracked and untracked pop-ulations through random sampling of the untracked population. These plots are important for understandingif the tracked data is signiﬁcantly biased compared to the segmented data.• preliminary_spatial_temporal_correlation_info(folder_name) : This function will perform apreliminary analysis of spatial/temporal correlation within the hiPSC-CMs. The main outcome is a plotof normalized cross-correlation score between normalized sarcomere length time series data with respect toboth Euclidean and network distances. Our ﬁrst investigation into the performance of Sarc-Graph on synthetic data is illustrated in Fig. 5. In this investigation,we deﬁne a sinusoidal chain of sarcomeres that deform homogeneously following the time series illustrated in Fig.2a. Under baseline conditions (modest amplitude, little angling out of plane, and low to moderate noise), Sarc-Graphis able to successfully track all out of sarcomeres. This is illustrated in 5a. Then, we adversely alter thesebaseline conditions to demonstrate the limitations of our approach. In the ﬁrst study, illustrated in Fig. 5b, we alterthe underlying 3D geometry of the sarcomere chain. As anticipated, performance degrades when we increase theamplitude of the sine wave, and when we increase the out of plane tilt of the sarcomere chain. That being said,we are able to segment and track at least half of the sarcomeres in all but four of the examples shown. In thesecond study, illustrated in Fig. 5c, we adversely alter baseline conditions by adding noise to the rendered image.Speciﬁcally, we add Perlin noise at varying levels of magnitude and with a varying number of octaves [20]. Fromthese results, it is clear that our computational framework is quite robust to all tested types of low magnitude noise,and Perlin noise with a low number of octaves (low frequency). However, performance degrades in the presence ofPerlin noise that has high gradients and is similar in size and brightness to the simulated z-discs themselves. We notethat if a speciﬁc experimental dataset has a characteristic noise pattern that can be removed with a known operation,implementing additional steps in the segmentation component of our computational framework to address it wouldbe straightforward. In some cases, simply increasing the parameter gaussian_filter_size will resolve the issue. Inthe experimental data that we have tested the computational framework on thus far, our current segmentation processis effective.Our second investigation into the performance of Sarc-Graph on synthetic data is illustrated in Fig. 6. In theseexamples, our aim is to design synthetic data that contains many of challenges associated with analyzing real hiPSC-CMs: out of plane deformation, sarcomere chain curvature, sarcomere chain overlap, inhomogeneous deformation,variable z-disc size, and irregular contraction. From S1 to S5 the examples are loosely organized from “least” to“most” challenging for the computational framework to capture. The corresponding movies are included in our GitHubrepository. Brieﬂy, example S1 shows four sarcomere chains of variable curvature (decreasing from left to right) andvariable out of plane deformation (increasing from left to right) deforming homogeneously. Our framework is able to10 PREPRINT - F

EBRUARY

5, 2021Figure 5: For each image, the number of sarcomeres successfully segmented and tracked (out of possible) isreported. Running the algorithm on synthetic data shows that the code is robust to geometry and noise up to a limit:(a) Illustration of successful z-disc and sarcomere segmentation and tracking for a curved geometry in the presenceof noise; (b) Performance degrades when the baseline geometry moves partially out of plane (here slanted in the z-direction), and when the sarcomere chain is too distorted (here high amplitude to period ratio for sinusoidal geometry);(c) Performance degrades in the presence of Perlin noise, in particular noise that is similar in size and brightness to thez-discs themselves.recover out of sarcomeres, match the ground truth mean time series data, and match the ground truth on λ avg1 and λ avg2 . This is consistent with the results shown in Fig. 5. Example S2 shows two families of sarcomere chains,an external ellipse and four internal chains that overlap in the center. Our framework is able to recover out of sarcomeres (error occurs primarily near the overlap region), match the ground truth mean time series data, and matchthe ground truth on λ avg1 and λ avg2 .Example S3 shows three closely positioned elliptical sarcomere chains that deform inhomogeneously. Our frameworkis able to recover out of sarcomeres, however, because there is bias in which sarcomeres are recovered (recoveryis worse towards the bottom of the image frame), both the mean time series data and the λ avg1 and λ avg2 data are unableto perfectly capture the ground truth. We note that the results for λ avg1 and λ avg2 are a better reﬂection of the groundtruth than the mean time series data. We also note that the amount of sarcomere contraction simulated in S3 exceedswhat is typically observed in the experimental setting (over contraction).Example S4 shows two families of sarcomere chains: an external ellipse, and multiple partially overlapping and curvedinternal chains in different z-planes. In Example S4, the sarcomeres deform inhomogeneously with a brief abrupt jumpin sarcomere deformation around frame . Our framework is able to recover out of sarcomeres, and is able tonearly recover the mean time series data, and the λ avg1 and λ avg2 data. We note that for both approaches, the abrupt jumpin sarcomere behavior seen in the ground truth is smoothed out in the automated time series processing because duringthe abrupt jump tracking temporarily fails for the entire outer ring of sarcomeres. There are two main implicationsof this result. First, that large abrupt motion may cause Sarc-Graph to fail. Second, that Sarc-Graph is able to tracksarcomeres on either end of a time period where signal is temporarily lost, and interpolate data across the missingframes. Similar to example S3, the plot of λ avg1 and λ avg2 is a better reﬂection of the ground truth than the mean timeseries curve.Example S5 shows multiple sarcomere chains that are all angled out of plane, appear to overlap at multiple points,and undergo inhomogeneous deformation. In this case, we are only able to recover time series data for out of sarcomeres. This results in mean time series data that is a poor reﬂection of the ground truth. However, the λ avg1 and λ avg2 data is a much better match to the ground truth. This is a positive sign towards the efﬁcacy of our proposedmetric even when the number of sarcomeres recovered is relatively low. Though only out of sarcomeres werefully tracked, segmented but only partially tracked sarcomeres appear on the spatial graph shown in Fig. 6 S5. Lookingforward, additional criteria to further reﬁne the spatial graph and further delineate “probable” and “improbable” linkscan readily be added to Sarc-Graph within the spatial_graph module.11 PREPRINT - F

EBRUARY

5, 2021 i n c r ea s i ng c o m p l e x i t y Figure 6: Example performance on synthetic data of increasing complexity with a known ground truth: Segmentationand tracking (marker color corresponds to sarcomere contraction in the illustrated frame); Constructing a spatial graph(line color corresponds to sarcomere orientation); Recovering individual time series data (both Sarc-Graph output andground truth shown); Recovering average deformation behavior (both Sarc-Graph output and ground truth shown).12

PREPRINT - F

EBRUARY

5, 2021 ˜ s ˜ s GT s avg s GTavg

OOP OOP GT C iso C GTiso C || C GT || S1 .

15 0 .

62 0 .

63 0 .

15 0 .

15 0 . S2 .

085 0 .

074 0 .

075 0 .

080 0 .

066 0 .

076 0 .

075 0 .

076 0 . S3 .

38 0 .

40 0 .

079 0 .

24 0 .

93 0 .

88 0 .

11 0 .

16 0 .

039 0 . S4 .

058 0 .

043 0 .

042 0 .

024 0 .

16 0 .

15 0 .

04 0 .

039 0 . S5 .

22 0 .

085 0 .

042 0 .

35 0 .

21 0 .

13 0 .

16 0 .

17 0 . Table 2: Comparison of Sarc-Graph computed ˜ s , s avg , OOP, C iso , and C || with the ground truth values of theseparameters for each synthetic data example.In Table 2, we report median individual sarcomere shortening ( ˜ s ), mean time series sarcomere shortening ( s avg ), Ori-entational Order Parameter (OOP), equivalent isotropic contraction ( C iso ), and contraction in the dominant direction ofsarcomere orientation ( C || ) for each synthetic data example. For all of the synthetic data cases, we are able to computea ground truth for each of these parameters from the known ground truth contraction, displacement, and orientation ofthe sarcomere geometry. For examples S1 and S2, the parameters recovered by Sarc-Graph are a near perfect match tothe ground truth. However, for all other samples minor discrepancies arise in ˜ s , OOP, C iso , and C || . For example S4, ˜ s is over estimated. For examples S3 and S5, OOP is over estimated, C iso is under estimated, and C || is under estimated.Consistent with the results shown in Fig. 6, we believe that this error is due to bias in the region where sarcomeresare recovered in S3, and failure of the algorithm for overlapping and severely out of plane sarcomere chains in S4 andS5. We note that for all examples but S1 and S2 – where sarcomere contraction is synchronous – recovered s avg is nota good match to the ground truth. This is consistent with the normalized sarcomere length time series plot shown inFig. 6. These errors indicate that care should be taken when comparing results across beating hiPSC-CM movies thatmay have systemic differences. The accessibility of our code for generating additional synthetic data can aid in thispreliminary check.In all examples, most glaringly example S5, more sarcomeres are segmentated and tracked for a small number offrames (and thus appear on the spatial graph) then are tracked for enough frames to reconstruct the full time series.The recovered time series ( λ avg1 and λ avg2 ) and parameters ( C iso , C || ) associated with F avg are typically a closer matchto the synthetic data ground truth behavior than the mean curve of the recovered time series data. Example S5 –with out of plane inhomogeneous deformation and overlapping sarcomere chains – is perhaps the best reﬂection ofthe challenges that we have seen in our experience thus far with the experimental data. We note that the extensiblecode used to generate the synthetic data is published on GitHub so implementing and testing additional examples isstraightforward. Here, we show the performance of Sarc-Graph on the ﬁve experimental data examples listed in Table 1. In Fig. 7 - 9,we illustrate the key Sarc-Graph outputs. In Table 3, we report ˜ s , s avg , OOP, C avg , and C || for each example. In theSupplementary Information section, we include a movie for each sample that shows both the change in length of thesuccessfully tracked sarcomeres, and how our computed metric F avg changes over the course of the movie.In Fig. 7, we show the performance of Sarc-Graph on two similar examples of beating hiPSC-CM movies, examplesE1 and E2. For example E1, we are able to segment ≈ sarcomeres per frame and successfully track sarcomeres.The time series plot of absolute sarcomere length change with respect to frame number shows three distinct contractionevents during the movie. Though the individual sarcomeres are clearly not perfectly in sync, there is enough of aunifying pattern for these three peaks to emerge. These three distinct contraction events are also reﬂected in the plotof λ avg1 and λ avg2 . For example E2, we are able to segment ≈ sarcomeres per frame and successfully track sarcomeres. The time series plots of absolute sarcomere length change and λ avg1 and λ avg2 with respect to frame numberalso show three distinct peaks. We also show the spatial graph representation of each example and a plot of normalizedcross-correlation between pairs of individual sarcomere time series curves with respect to network distance.Qualitatively, examples E1 and E2 are different in that the sarcomeres in example E1 appear vertically aligned through-out while the sarcomeres in example E2 appear tangentially aligned around the edge of the cell and unaligned in thecenter. This qualitative observation is consistent with the values of OOP reported in Table 3. In addition, we canuse our computed metric F avg to go beyond morphology alone and quantify the function of the observed hiPSC-CMs.By looking at the time series plots of λ avg1 and λ avg2 with respect to frame number, we can see that example E1 isdeforming quite anisotropically λ avg1 < λ avg2 while the deformation of example E2 is much closer to isotropic where λ avg1 ≈ λ avg2 . In example E1, C || is meaningfully greater than C iso while in example E2 the two parameters are much13 PREPRINT - F

EBRUARY

5, 2021Figure 7: Example performance on experimental data E1 and E2: Raw image visualization; z-disc and sarcomeresegmentation; Sarcomeres that are tracked for / or more of the movie, red corresponds to a contracted state, bluecorresponds to a relaxed state (frame 20); Sarcomeres that are tracked for / or more of the movie, red correspondsto a contracted state, blue corresponds to a relaxed state (frame 20), note that these sarcomeres are included in the timeseries analysis; Individual time series data for normalized sarcomere length; Average deformation behavior; Illustra-tion of the data represented as a spatial graph, color corresponds to sarcomere orientation; Normalized sarcomere timeseries cross-correlation score plotted as a function of the distance along the network. Supplementary Movies SI1 andSI2 are included for further visualization. 14 PREPRINT - F

EBRUARY

5, 2021Figure 8: Example performance on experimental data E3 and E4: Raw image visualization; z-disc and sarcomeresegmentation; Sarcomeres that are tracked for / or more of the movie, red corresponds to a contracted state, bluecorresponds to a relaxed state (frame 50 for E3, frame 40 for E4); Sarcomeres that are tracked for / or more of themovie, red corresponds to a contracted state, blue corresponds to a relaxed state (frame 50 for E3, frame 40 for E4),note that these sarcomeres are included in the time series analysis; Individual time series data for normalized sarcomerelength; Average deformation behavior; Illustration of the data represented as a spatial graph, color corresponds tosarcomere orientation; Normalized sarcomere time series cross-correlation score plotted as a function of the distancealong the network. Supplementary Movies SI3 and SI4 are included for further visualization.15 PREPRINT - F

EBRUARY

5, 2021Figure 9: Example performance on experimental data E5: Raw image visualization; z-disc and sarcomere segmenta-tion; Sarcomeres that are tracked for / or more of the movie, red corresponds to a contracted state, blue correspondsto a relaxed state (frame 50); Sarcomeres that are tracked for / or more of the movie, red corresponds to a con-tracted state, blue corresponds to a relaxed state (frame 50), note that these sarcomeres are included in the time seriesanalysis; Individual time series data for normalized sarcomere length; Average deformation behavior; Illustration ofthe data represented as a spatial graph, color corresponds to sarcomere orientation; Normalized sarcomere time seriescross-correlation score plotted as a function of the distance along the network. Supplementary Movie SI5 is includedfor further visualization.closer in value. Essentially, example E1 is experiencing more substantial oriented contraction than example E2. Wealso note that for these two examples, where ˜ s is similar but OOP is different, metrics derived from F avg are able tocapture functional differences in contraction. ˜ s s avg OOP C iso C || E1 .

17 0 .

049 0 .

48 0 .

015 0 . E2 .

15 0 .

07 0 .

29 0 .

025 0 . E3 .

18 0 .

061 0 .

68 0 .

069 0 . E4 .

13 0 .

041 0 .

16 0 .

021 0 . E5 .

13 0 .

02 0 .

38 0 . . Table 3: Sarc-Graph computed ˜ s , s avg , OOP, C iso , and C || for each experimental data example.In Fig. 8, we show the performance of Sarc-Graph on two contrasting examples of beating hiPSC-CM movies, ex-amples E3 and E4. Example E3 is of a single hiPSC-CM adhered to a patterned soft hydrogel substrate with highlyaligned sarcomeres and large deformation while Example E4 is of hiPSC-CMs in a monolayer culture without a cleardirection of sarcomere alignment. With Sarc-Graph, we are able to segment approximately , and track sarcom-eres from E3. From the plots of λ avg1 and λ avg2 , and scalar metrics OOP, C iso , and C || , we observe high sarcomerealignment and corresponding highly aligned contraction. Notably, the cell contracts substantially in the direction per-pendicular to ﬁber alignment, thus exhibiting auxetic deformation during contraction. For example E4 we are able tosegment approximately , and track sarcomeres. This is notable because the conditions in E4 are far from idealfor image analysis. In both cases, ﬁve distinct contraction events are observed over the course of the movie and arevisible from both the average normalized sarcomere length time series and the λ avg1 and λ avg2 time series.In Fig. 9, we show the performance of Sarc-Graph on example E5. Example E5 is of a hiPSC-CM on a glass slide withpoor ﬁber alignment, curved ﬁbers, and relatively low deformation over the course of the movie. With Sarc-Graph, we16 PREPRINT - F

EBRUARY

5, 2021are able to segment approximately and track sarcomeres for example E5. Though individual sarcomeres inthis movie contract substantially ( ˜ s = 0 . ), there is low overall synchrony, and overall average deformation is quitelow ( C iso = 0 . ). Despite low synchrony, Sarc-Graph still detects distinct contractions in both the average timeseries plot and the λ avg1 and λ avg2 time series plot. Notably, overall contraction is also slightly higher ( C || > C iso ) inthe dominant direction of sarcomere orientation. This example is notable in because Sarc-Graph is able to segmentand track a high number of sarcomeres from an irregular geometry and detect subtle sarcomere behavior beyond thelimits of what has been accomplished with the previous state of the art [10]. The objective of this work is to provide an open source computational framework to quantitatively analyze movies ofbeating hiPSC-CMs. To accomplish this, we introduce tools to segment and track z-discs and sarcomeres, and analyzetheir spatiotemporal behavior. Notably, we are able to automatically segment and track a high number of sarcomeresacross multiple experimental conditions without any input parameter tuning, and we introduce two important newapproaches for the analysis of beating hiPSC-CM: a method for computing average deformation gradient, and a methodfor treating the hiPSC-CMs as spatial graphs. Looking forward, we see this work as an important tool for substantialfuture study of hiPSC-CM behavior. Finally, our proposed computational framework is heavily documented and has amodular extensible design. We anticipate that many components of our open source software will be directly useful toother researchers.

Sarc-Graph, the code to generate the synthetic data, the example data, and examples of the full analysis output can allbe accessed through GitHub: https://github.com/elejeune11/Sarc-Graph . SI-1.

A movie of E1 with a schematic illustration of deformation gradient F avg and tracked sarcomeres overlayed(sarcomere color corresponds to contraction level with red indicating the highest level of contraction), magnitudes ofaverage principal stretches ( λ , λ computed from F avg ), and average normalized sarcomere length, both with respectto the movie frame number. This movie is a supplement to Fig. 7, sample E1. SI-2.

A movie of E2 with a schematic illustration of deformation gradient F avg and tracked sarcomeres overlayed(sarcomere color corresponds to contraction level with red indicating the highest level of contraction), magnitudes ofaverage principal stretches ( λ , λ computed from F avg ), and average normalized sarcomere length, both with respectto the movie frame number. This movie is a supplement to Fig. 7, sample E2. SI-3.

A movie of E3 with a schematic illustration of deformation gradient F avg and tracked sarcomeres overlayed(sarcomere color corresponds to contraction level with red indicating the highest level of contraction), magnitudes ofaverage principal stretches ( λ , λ computed from F avg ), and average normalized sarcomere length, both with respectto the movie frame number. This movie is a supplement to Fig. 8, sample E3. SI-4.

A movie of E4 with a schematic illustration of deformation gradient F avg and tracked sarcomeres overlayed(sarcomere color corresponds to contraction level with red indicating the highest level of contraction), magnitudes ofaverage principal stretches ( λ , λ computed from F avg ), and average normalized sarcomere length, both with respectto the movie frame number. This movie is a supplement to Fig. 8, sample E4. SI-5.

A movie of E5 with a schematic illustration of deformation gradient F avg and tracked sarcomeres overlayed(sarcomere color corresponds to contraction level with red indicating the highest level of contraction), magnitudes ofaverage principal stretches ( λ , λ computed from F avg ), and average normalized sarcomere length, both with respectto the movie frame number. This movie is a supplement to Fig. 9, sample E5. A still frame from this movie is alsoshown in Fig. 1. 17 PREPRINT - F

EBRUARY

5, 2021

This work was supported by the CELL-MET Engineering Research Center NSF ECC-1647837, the Boston Univer-sity Hariri Junior Faculty Fellowship (EL), and the David R. Dalton Career Development Professorship (EL). KZacknowledges fellowship support from the American Heart Association, award number 17PRE33660967.

References [1] Przemek A Gorski, Changwon Kho, and Jae Gyun Oh. Measuring cardiomyocyte contractility and calciumhandling in vitro. In

Experimental Models of Cardiovascular Diseases , pages 93–104. Springer, 2018.[2] Côme Pasqualin, François Gannier, Angèle Yu, Claire O Malécot, Pierre Bredeloux, and Véronique Maupoil. Sar-coptim for imagej: high-frequency online sarcomere length computing on stimulated cardiomyocytes.

AmericanJournal of Physiology-Cell Physiology , 311(2):C277–C283, 2016.[3] Kevin M Beussman, Marita L Rodriguez, Andrea Leonard, Nikita Taparia, Curtis R Thompson, and Nathan JSniadecki. Micropost arrays for measuring stem cell-derived cardiomyocyte contractility.

Methods , 94:43–50,2016.[4] Matthew Wheelwright, Zaw Win, Jennifer L Mikkila, Kamilah Y Amen, Patrick W Alford, and Joseph M Met-zger. Investigation of human ipsc-derived cardiac myocyte functional maturation by single cell traction forcemicroscopy.

PLoS One , 13(4):e0194909, 2018.[5] Kathleen A Clark, Abigail S McElhinny, Mary C Beckerle, and Carol C Gregorio. Striated muscle cytoarchitec-ture: an intricate web of form and function.

Annual review of cell and developmental biology , 18(1):637–706,2002.[6] Francesco Silvio Pasqualini, Sean Paul Sheehy, Ashutosh Agarwal, Yvonne Aratyn-Schaus, and Kevin Kit Parker.Structural phenotyping of stem cell-derived cardiomyocytes.

Stem cell reports , 4(3):340–347, 2015.[7] Sean P Sheehy, Francesco Pasqualini, Anna Grosberg, Sung Jin Park, Yvonne Aratyn-Schaus, and Kevin KitParker. Quality metrics for stem cell-derived cardiac myocytes.

Stem cell reports , 2(3):282–294, 2014.[8] Alexandre JS Ribeiro, Olivier Schwab, Mohammad A Mandegar, Yen-Sin Ang, Bruce R Conklin, Deepak Sri-vastava, and Beth L Pruitt. Multi-imaging method to assay the contractile mechanical output of micropatternedhuman ipsc-derived cardiac myocytes.

Circulation research , 120(10):1572–1583, 2017.[9] Thomas Eschenhagen and Lucie Carrier. Cardiomyopathy phenotypes in human-induced pluripotent stem cell-derived cardiomyocytes—a systematic review.

Pﬂügers Archiv-European Journal of Physiology , 471(5):755–768, 2019.[10] Christopher N Toepfer, Arun Sharma, Marcelo Cicconet, Amanda C Garﬁnkel, Michael Mücke, Meraj Neyazi,Jon AL Willcox, Radhika Agarwal, Manuel Schmid, Jyoti Rao, et al. Sarctrack: an adaptable software tool forefﬁcient large-scale analysis of sarcomere function in hipsc-cardiomyocytes.

Circulation research , 124(8):1172–1183, 2019.[11] Anna Grosberg, Po-Ling Kuo, Chin-Lin Guo, Nicholas A Geisse, Mark-Anthony Bray, William J Adams, Sean PSheehy, and Kevin Kit Parker. Self-organization of muscle cell structure and function.

PLoS computationalbiology , 7(2):e1001088, 2011.[12] Jan David Kijlstra, Dongjian Hu, Nikhil Mittal, Eduardo Kausel, Peter van der Meer, Arman Garakani, andIbrahim J Domian. Integrated analysis of contractile kinetics, force generation, and electrical activity in singlehuman stem cell-derived cardiomyocytes.

Stem cell reports , 5(6):1226–1238, 2015.[13] Scott D Lundy, Wei-Zhong Zhu, Michael Regnier, and Michael A Laﬂamme. Structural and functional matura-tion of cardiomyocytes derived from human pluripotent stem cells.

Stem cells and development , 22(14):1991–2002, 2013.[14] Alexandre JS Ribeiro, Yen-Sin Ang, Ji-Dong Fu, Renee N Rivas, Tamer MA Mohamed, Gadryn C Higgs,Deepak Srivastava, and Beth L Pruitt. Contractility of single cardiomyocytes differentiated from pluripotentstem cells depends on physiological shape and substrate stiffness.

Proceedings of the National Academy ofSciences , 112(41):12705–12710, 2015.[15] Nancy K Drew, Mackenzie A Eagleson, Danny B Baldo Jr, Kevin Kit Parker, and Anna Grosberg. Metrics forassessing cytoskeletal orientational correlations and consistency.

PLoS computational biology , 11(4):e1004190,2015. 18

PREPRINT - F

EBRUARY

5, 2021[16] Tessa Altair Morris, Jasmine Naik, Kirby Sinclair Fibben, Xiangduo Kong, Tohru Kiyono, Kyoko Yokomori,and Anna Grosberg. Striated myocyte structural integrity: Automated analysis of sarcomeric z-discs.

PLoScomputational biology , 16(3):e1007676, 2020.[17] Susan E Leggett, Mohak Patel, Thomas M Valentin, Lena Gamboa, Amanda S Khoo, Evelyn Kendall Williams,Christian Franck, and Ian Y Wong. Mechanophenotyping of 3d multicellular clusters using displacement arraysof rendered tractions.

Proceedings of the National Academy of Sciences , 117(11):5655–5663, 2020.[18] David A Stout, Eyal Bar-Kochba, Jonathan B Estrada, Jennet Toyjanova, Haneesh Kesari, Jonathan S Reichner,and Christian Franck. Mean deformation metrics for quantifying 3d cell–matrix interactions without requiringinformation about matrix material properties.

Proceedings of the National Academy of Sciences , 113(11):2898–2903, 2016.[19] Richard A Haddad, Ali N Akansu, et al. A class of fast gaussian binomial ﬁlters for speech and image processing.

IEEE Transactions on Signal Processing , 39(3):723–727, 1991.[20] Ken Perlin. An image synthesizer.

ACM Siggraph Computer Graphics , 19(3):287–296, 1985.[21] Charles R. Harris, K. Jarrod Millman, Stéfan J. van der Walt, Ralf Gommers, Pauli Virtanen, David Courna-peau, Eric Wieser, Julian Taylor, Sebastian Berg, Nathaniel J. Smith, Robert Kern, Matti Picus, Stephan Hoyer,Marten H. van Kerkwijk, Matthew Brett, Allan Haldane, Jaime Fernández del Río, Mark Wiebe, Pearu Peterson,Pierre Gérard-Marchant, Kevin Sheppard, Tyler Reddy, Warren Weckesser, Hameer Abbasi, Christoph Gohlke,and Travis E. Oliphant. Array programming with NumPy.

Nature , 585(7825):357–362, September 2020.[22] J. D. Hunter. Matplotlib: A 2d graphics environment.

Computing in Science & Engineering , 9(3):90–95, 2007.[23] Anant Chopra, Matthew L Kutys, Kehan Zhang, William J Polacheck, Calvin C Sheng, Rebeccah J Luu, JeroenEyckmans, J Travis Hinson, Jonathan G Seidman, Christine E Seidman, et al. Force generation via β -cardiacmyosin, titin, and α -actinin drives cardiac sarcomere assembly from cell-matrix adhesions. Developmental cell ,44(1):87–96, 2018.[24] Arun Sharma, Christopher N Toepfer, Tarsha Ward, Lauren Wasson, Radhika Agarwal, David A Conner,Johnny H Hu, and Christine E Seidman. Crispr/cas9-mediated ﬂuorescent tagging of endogenous proteins inhuman pluripotent stem cells.

Current protocols in human genetics , 96(1):21–11, 2018.[25] G. Bradski. The OpenCV Library.

Dr. Dobb’s Journal of Software Tools , 2000.[26] Nobuyuki Otsu. A threshold selection method from gray-level histograms.

IEEE transactions on systems, man,and cybernetics , 9(1):62–66, 1979.[27] D Allan, C van der Wel, N Keim, TA Caswell, D Wieker, R Verweij, C Reid, T Bottaro, L Grueter, K Ramos,apiszcz, zoeith, RW Perry, F Boulogne, P Sinha, pﬁgliozzi, N Bruot, L Uieda, J Katins, H Mary, and A Ahmadia.soft-matter/trackpy: Trackpy v0. 4.2 (2019).

URL https://doi. org/10.5281/zenodo , 3492186.[28] Pauli Virtanen, Ralf Gommers, Travis E. Oliphant, Matt Haberland, Tyler Reddy, David Cournapeau, EvgeniBurovski, Pearu Peterson, Warren Weckesser, Jonathan Bright, Stéfan J. van der Walt, Matthew Brett, JoshuaWilson, K. Jarrod Millman, Nikolay Mayorov, Andrew R. J. Nelson, Eric Jones, Robert Kern, Eric Larson, C JCarey, ˙Ilhan Polat, Yu Feng, Eric W. Moore, Jake VanderPlas, Denis Laxalde, Josef Perktold, Robert Cimrman,Ian Henriksen, E. A. Quintero, Charles R. Harris, Anne M. Archibald, Antônio H. Ribeiro, Fabian Pedregosa,Paul van Mulbregt, and SciPy 1.0 Contributors. SciPy 1.0: Fundamental Algorithms for Scientiﬁc Computing inPython.

Nature Methods , 17:261–272, 2020.[29] Stefan Van der Walt, Johannes L Schönberger, Juan Nunez-Iglesias, François Boulogne, Joshua D Warner, NeilYager, Emmanuelle Gouillart, and Tony Yu. scikit-image: image processing in python.

PeerJ , 2:e453, 2014.[30] John C Crocker and David G Grier. Methods of digital video microscopy for colloidal studies.

Journal of colloidand interface science , 179(1):298–310, 1996.[31] The pandas development team. pandas-dev/pandas: Pandas, February 2020.[32] Carl Edward Rasmussen and Christopher KI Williams. Gaussian process regression for machine learning, 2006.[33] Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Math-ieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, et al. Scikit-learn: Machine learning in python. the Journal of machine Learning research , 12:2825–2830, 2011.[34] Anne E Carpenter, Thouis R Jones, Michael R Lamprecht, Colin Clarke, In Han Kang, Ola Friman, David AGuertin, Joo Han Chang, Robert A Lindquist, Jason Moffat, et al. Cellproﬁler: image analysis software foridentifying and quantifying cell phenotypes.

Genome biology , 7(10):R100, 2006.[35] Emma Lejeune, Berkin Dortdivanlioglu, Ellen Kuhl, and Christian Linder. Understanding the mechanical linkbetween oriented cell division and cerebellar morphogenesis.

Soft matter , 15(10):2204–2215, 2019.19

PREPRINT - F

EBRUARY

5, 2021[36] Emma Lejeune and Christian Linder. Quantifying the relationship between cell division angle and morphogenesisthrough computational modeling.

Journal of theoretical biology , 418:1–7, 2017.[37] Gerhard A Holzapfel. Nonlinear solid mechanics a continuum approach for engineering, 2000.[38] Akinori Umeno, Hiroko Kotani, Masakazu Iwasaka, and Shoogo Ueno. Quantiﬁcation of adherent cell orienta-tion and morphology under strong magnetic ﬁelds.

IEEE transactions on magnetics , 37(4):2909–2911, 2001.[39] Aric A. Hagberg, Daniel A. Schult, and Pieter J. Swart. Exploring network structure, dynamics, and functionusing networkx. In Gaël Varoquaux, Travis Vaught, and Jarrod Millman, editors,

Proceedings of the 7th Pythonin Science Conference , pages 11 – 15, Pasadena, CA USA, 2008.[40] Pavel Senin. Dynamic time warping algorithm review.