A calibration-free method for biosensing in cell manufacturing
Jialei Chen, Zhaonan Liu, Kan Wang, Chen Jiang, Chuck Zhang, Ben Wang
AA calibration-free method for biosensing in cell manufacturing
Jialei Chen ∗† , Zhaonan Liu † ‡ , Kan Wang † , Chen Jiang †‡ ,Chuck Zhang ∗† , and Ben Wang ∗†‡ § July 30, 2020
Abstract
Chimeric antigen receptor T cell therapy has demonstrated innovative therapeuticeffectiveness in fighting cancers; however, it is extremely expensive due to the intrinsicpatient-to-patient variability in cell manufacturing. We propose in this work a novelcalibration-free statistical framework to effectively recover critical quality attributesunder the patient-to-patient variability. Specifically, we model this variability via apatient-specific calibration parameter, and use readings from multiple biosensors toconstruct a patient-invariance statistic, thereby alleviating the effect of the calibrationparameter. A carefully formulated optimization problem and an algorithmic frameworkare presented to find the best patient-invariance statistic and the model parameters.Using the patient-invariance statistic, we can recover the critical quality attributeof interest, free from the calibration parameter. We demonstrate improvements ofthe proposed calibration-free method in different simulation experiments. In thecell manufacturing case study, our method not only effectively recovers viable cellconcentration for monitoring, but also reveals insights for the cell manufacturingprocess.
Keywords:
CAR T cell therapy; Cell culture; Invariance; Patient-to-patient variability. ∗ H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology † Georgia Tech Manufacturing Institute, Georgia Institute of Technology ‡ School of Materials Science and Engineering, Georgia Institute of Technology § This work is supported by National Science Foundation CMaT ERC (NSF EEC-1648035). a r X i v : . [ q - b i o . Q M ] J u l Introduction
Cell therapy is one of the most promising new treatment approaches over the last decades,demonstrating great potential in treating cancers, including leukemia and lymphoma (Kimand De Vellis, 2009; Yin, 2017). Among those therapies, chimeric antigen receptor (CAR)T cell therapy (Bonifant et al., 2016; June et al., 2018), involving the reprogrammingof a patients T cells to effectively target and attack tumor cells, has shown innovativetherapeutic effects in clinical trials, leading to a recent approval (i.e., the treatment of CD19+hematological malignancies, see Prasad, 2018) by FDA as a new cancer treatment modality.As illustrated in Figure 1, a typical CAR T cell therapy involves four steps – deriving cellsfrom a patient, genetically modifying the cells, culturing the cells, and re-administeringback to the patient. With increasingly mature gene modification technology, more and moreresearchers focus on the culturing step (i.e., the red box in Figure 1), where the goal is tosubstantially increase the cell amount from a small batch to one dose for delivery to thepatient. However, a key challenge is the intrinsic patient-to-patient variability in the startingmaterial, i.e., cells derived from different patients vary in their viabilities, acceptance ratesof genetic modification, and reactions to culture media (Hinrichs and Restifo, 2013). Thesevariabilities introduce difficulties in cell culturing scale-up (i.e., cell manufacturing), andtherefore, the current CAR T cell therapy is hindered by low scalability, labor-intensiveprocesses, and extremely high cost (Harrison et al., 2019). To achieve high quality andacceptable vein-to-vein cost, we present in this work a statistical framework for onlinemonitoring in cell manufacturing, which can alleviate the negative effect of the intrinsicpatient-to-patient variability.There are two reasons why a new statistical method is needed for monitoring criticalquality attributes (exampled by the cell concentration) in cell manufacturing. Firstly, a direct measurement method for cell concentrations is not suitable in cell manufacturing (Sloukaet al., 2016). Such a method typically requires experienced technicians to collect culturemedia, take microscopic images, and perform computation via an image-based software2 igure 1:
An illustration of the four steps in a typical CAR T cell therapy. This work focuseson the cell culturing (or cell manufacturing) step, i.e., step 3. (e.g., ImageJ, see Collins, 2007). Therefore, it is labor-intensive, time-consuming, andmay introduce contamination to the culture media. Furthermore, the direct measurementmethod is oftentimes destructive – the collected cells would be killed for taking microscopicimages. Secondly, while there are non-destructive sensors available, these sensors need tobe calibrated due to the unknown parameters in the sensing relationship (Pan et al., 2019).For example, impedance sensors (adopted in this work, see Figure 2), which measure thedielectric relaxation of cell suspension, can be used to effectively estimate cell concentrations after the calibration of unknown electrical attributes, e.g., permittivity (Gheorghiu andAsami, 1998) and resistivity (Goh and Ram, 2010). However, due to the patient-to-patientvariability, those electrical attributes not only are unknown but also vary among differentpatients, leading to difficulties in recovering cell concentrations from sensor readings.We introduce in this work a calibration-free statistical method for online monitoringin cell manufacturing. Specifically, the intrinsic patient-to-patient variability is modeledby a patient-specific calibration parameter. We propose to use multiple sensor readings to3onstruct a patient- invariance statistic, where a transformation is adopted to isolate andalleviate the effect of the calibration parameter. The constructed invariance statistic is thenused to model the critical quality attribute of interest. In the training stage, we use thehistorical data to estimate a transformation and model parameters via a carefully formulatedoptimization problem, rather than estimate the calibration parameter as in the standardcalibration problem (Kennedy and O’Hagan, 2001; Tuo and Wu, 2015). In the monitoring stage, we use the online sensor readings to recover the underlying critical quality attributethrough the patient-invariance statistic, free from the calibration parameter. We demonstrateimprovements of the proposed calibration-free method in both simulation experiments anda real-world case study of monitoring viable cell concentrations in cell manufacturing. Theproposed approach provides an effective way to monitor cell manufacturing, and therefore,reduces the cost for the promising CAR T cell therapy in treating cancers.The remaining part of the article is organized as follows. In Section 2, we formulate thebiosensing problem in cell manufacturing, with an emphasis on its challenging aspects. InSection 3, we present the proposed calibration-free method. A detailed simulation study anda real-world cell manufacturing case study are conducted in Sections 4 and 5, respectively.We conclude this work with future directions in Section 6.
We first describe the biosensing problem of recovering the Viable Cell Concentration (VCC)in cell manufacturing. We then discuss the key challenge – the patient-to-patient variability,and related works.
As discussed in Section 1, the goal is to monitor VCC in cell manufacturing, thereby reducingthe cost of the CAR T cell therapy. One state-of-the-art approach is to use biosensors to4 igure 2:
An illustration of the adopted impedance-based biosensors in the cell manufacturingapplication: (a) shows a photo of the biosensor design and (b) shows the biosensing setup. measure impedance signals, as indicators for the VCC of interest (Gheorghiu and Asami,1998; Pan et al., 2019). As illustrated in Figure 2, we adopt impedance-based biosensorswith a facing-electrode (FE) design (Miura and Uno, 2019): Our FE biosensor consists ofa pair of parallel-plate electrodes and silicone at four corners to maintain a gap betweenthem; it would be soaked in media to monitor floating cells in between the electrodes.With the adopted biosensors, we need a biosensing method to recover VCCs fromimpedance readings. From physical knowledge, we know that the impedance readingbetween the two electrodes reflects the cell amount due to the capacitive property of viablecell membranes (Schwan, 1957). The sensing relationship is denoted by y = f ( x, θ, η ) , (1)where y is the impedance reading, x is the underlying VCC of interest, θ denotes thesensor geometry, e.g., the gap width, and η models the underlying electrical attributes, e.g.,permittivity and resistivity. Here, the relationship f ( · ) is unknown due to the dynamicinteraction between cells and biosensors, and extremely challenging to simulate via computercodes considering the micro size of cells. In the proposed calibration-free method, we firstlearn the sensing relationship f ( · ) from the historical data of different patients (i.e., the“training” stage), and then conduct online inference of the VCC using the impedance readingsfor new patients (i.e., the “monitoring” stage).5ethods f ( · ) Online η ∗ Historical η j Inverse problem (Aster et al., 2018) Known Unknown N.A.Supervised learning (Bishop, 2006) Unknown Same SameCalibration (Kennedy and O’Hagan, 2001) Unknown Unknown Known
Calibration-free (proposed) Unknown Unknown Unknown
Table 1:
A comparison of the application scenarios of the proposed calibration-free method andother standard methods in the literature.
The key challenge in biosensing is that the electrical attributes η are unknown in bothtraining and monitoring and different for each patient (also see an illustrating gravityapplication in Section 4.1). Note that it is impractical to compute η for a patient from firstprinciple since it represents the intrinsic properties from genetic material of the patient.One popular way is to model η as a calibration parameter, and estimate it from the trainingdataset { y j , x j , θ } Jj =1 . Existing approaches include Bayesian implementation (Kennedyand O’Hagan, 2001), maximal likelihood estimation (Joseph and Melkote, 2009), and aninterpretative l optimization (Tuo and Wu, 2015). However, such methods are proposedspecifically for data fusion of computer experiments and physical experiments, where η is available in the former and a constant in the latter. Whereas in biosensing, there is no effective computer simulator, and electrical attributes η vary among physical experimentalruns for different patients. In the literature, this challenge is also related to the functionalcalibration problem (Plumlee et al., 2016; Brown and Atamturktur, 2018; Ezzat et al., 2018),where the calibration parameter η = η ( x ) is modeled as a function of the input variables.In the biosensing application, however, the calibration parameter η varies among differentpatients yet is a constant over different input VCCs for each patient.The biosensing problem is also related to the inverse problem in the literature (Asteret al., 2018), where one would estimate both x and η via an optimization problem. However,such a method typically assumes that the sensing relationship f ( · ) is known or can beeasily learned with a complete data { y j , x j , θ, η j } nj =1 , whereas, the calibration parameter6 is unknown even in the training stage in biosensing. Furthermore, one may regard η asan additional model parameter in the unknown relationship f ( · ), and adopt a standardsupervised learning scheme (Bishop, 2006) for finding both f ( · ) and η ; this implicitlyassumes that η is a constant for different patients, which is not true in biosensing. Table1 summarizes the related methods discussed; with extensive efforts in literature search,we have not found a standard method, which can be directly adopted for the biosensingproblem in cell manufacturing. We present the proposed calibration-free method in four parts. First, we discuss the sensingrelationship for multiple sensors. We then introduce an invariance statistic to alleviatepatient-to-patient variability. In the online monitoring stage, we use the invariance statisticto recover VCCs. In the training stage, we propose a carefully constructed optimizationproblem and an algorithmic framework to estimate the underlying sensing model.
The key idea of the calibration-free method is to use multiple sensors to address the unknownand patient-specific calibration parameter. For the i -th sensor, we let θ i be its geometryparameter. Consider first the sensing relationship for a given patient (or experiment).Denote y i [ t ] as the scalar impedance reading (more details in Section 5) from the i -thsensor at experimental time t . Following (1), we model y i [ t ] = f ( x [ t ] , θ i , η ) via the sensingrelationship f ( · ) : R (cid:55)→ R . Here, x [ t ] is the VCC at experimental time t , and η is thecalibration parameter. It is important to note that measurements from different sensors { y [ t ] , y [ t ] , · · · } can be modeled by a same calibration parameter η , featuring the underlyingvalues for electrical attributes of the specific patient’s cells. This patient-specific property isthe key to “canceling out” the calibration parameter using multiple sensor readings (see7ection 3.2).We then introduce an additional superscript j for different patients: y ji [ t ] = f ( x j [ t ] , θ i , η j ) , i = 1 , , · · · , I, t = 1 , · · · T, and, j = 1 , · · · , J. (2)Equation (2) further layouts our biosensing settings with multiple sensors: (i) We assumethe homogeneity of VCC, i.e., at given time t and a given patient j , the VCC x j [ t ] isthe same for different senors θ i at different locations (see Figure 2 (b)). This is because,in suspension cell manufacturing, the culture media is constantly stirred to ensure thehomogeneity of nutrition, and thereby VCC (Haycock, 2011). (ii) We use the same set ofsensors with known parameters { θ i } Ii =1 for all J patients and at different experimental time t . Those parameters are known from the fabrication process or can be easily measured fromthe sensors. Note that the proposed method is also effective for different sets of sensors,as long as those sensor parameters are known – the same sensor assumption is only forfabrication convenience and notation simplicity. (iii) Besides for different sensors θ i , thecalibration parameter η j is the same among different measurement time t . This is because η j models the intrinsic property of the j -th patient’s cells, which typically does not changeduring cell manufacturing. After we clearly layout the above settings, we can then constructthe patient-invariance statistic and recover the underlying VCC. For notation simplicity, we drop the experimental time [ t ] and write θ i in the subscript inthis subsection. Furthermore, we rewrite (2) by decomposing f ( · ) into two parts: y ji = f θ i ( x j , η j ) = µ i ( x j ) + δ i ( x j , η j ) . (3)Here, for a given sensor θ i , µ i ( · ) : R (cid:55)→ R models the part of effect of VCC x on impedancereading y , without hampered by the patient-specific calibration parameter η ; and δ i ( · ) :8 (cid:55)→ R is the remaining effect of both x and η on y . Intuitively speaking, µ i ( · ) can be viewedas the mean process of f ( · ) by plugging in some population average of { η j } Jj =1 , ignoringthe patient-to-patient variability; it can also be interpreted as the physical understandingof the sensing relationship. In practice, the mean relationship µ i ( · ) is oftentimes known,at least to a certain degree, prior to experimentation according to the domain-specificknowledge (e.g., the known set of basis functions, see Section 5). On the other hand, δ i ( · ) isthe variability term, i.e., how patient-to-patient variability affects the impedance reading.Such a term leads to different readings, even when the VCC x is the same. Note that in oneof the considered baseline methods (see Sections 4 and 5), we ignore δ i ( · ), i.e., assuming thecalibration parameter is a constant; this will lead to noticeable errors when estimating x .This variability term δ i ( · ) is typically unknown. Such a decomposition of a mean trend anda variability term is widely assumed in different modeling methods (see, e.g., Guillas et al.,2018; Chen et al., 2019).Assume for now δ i ( x, η ) = δ i ( η ), which suggests that the mean relationship µ i ( · ) extractsall the dependency of x on y (further discussion in Section 3.4). In other words, f i ( x, η ) isassumed to be separable for each sensor θ i : y ji = µ i ( x j ) + δ i ( η j ) . (4)Now, we construct a statistic F which is invariant to the calibration parameter η . Togain intuition, consider the following illustrating example with known δ i ( η ) = θ i η for i = 1 , δ i ( η j ) = log θ i + log η j , the effect of the calibration parameter η j is further separated from θ i . Therefore, by subtracting the (log-transformed) variabilityat different sensors, one can obtain an invariance statistic F = log δ ( η j ) − log δ ( η j ) =log θ + log η j − (log θ + log η j ) = log( θ ) − log( θ ). Note that we incorporate the patient- specific property of the calibration parameter when constructing the invariance statistic.Following the above intuition yet with the unknown variability term δ i ( · ), we construct9he following statistic, via a linear combination of the transformed δ i ( · ): F ( η j ) = I (cid:88) i =1 c i F [ δ i ( η j )] = I (cid:88) i =1 c i F [ y ji − µ i ( x j )] , (5)for patient j with η j . Here, c , · · · , c I are pre-defined combination coefficients, and (cid:80) i c i = 0.With a properly selected transformation F [ · ] : R (cid:55)→ R , (5) gives the target invariance statistic F = F ( η j ). Note that here we adapt a general transformation F [ · ], instead of thespecific log-transformation in the above example. The transformation F [ · ] would be selectedso that the dependency of the invariance statistic F to η j is minimal; a detail estimationmethod for F [ · ] will be discussed in Section 3.4.It is important to note that we reconstruct the sensing model from (2) to (5) via theproposed invariance statistic. This is again due to the key challenge of patient-to-patientvariability. Consider first using (2) for VCC recovery (also see the discussion in Section2.2). Due to the unknown and patient-specific calibration parameter η j , it is challengingto either learn a sensing model from training data or recover VCCs for a new patient.However, the new model (5) contains only the invariance statistic, and is free from thecalibration parameter. Thanks to the properly selected transformation and the combination(see Section 3.4), the invariance statistic would be approximately a constant for differentpatients. Therefore, our new model (5) allows an effective estimation of the sensingrelationship (only the mean part needed) similar to the standard calibration problem witha constant calibration parameter (Tuo and Wu, 2015), and then a calibration-free recoveryof the VCC of interest. We present next the method for recovering the VCC of interest x ∗ , in the online monitoring stage for a new patient denoted by ∗ . At any time t , the sensor reading is denoted as D monitor = { y ∗ i , θ i } Ii =1 along with the unknown calibration parameter η ∗ . Assume for now10he mean sensing relationship µ i ( · ) and the transformation F [ · ] are known (see Section 3.4for the estimation). We adopt the new sensing model (5) with the invariance statistic F ( η ∗ ) = I (cid:88) i =1 c i F [ y ∗ i − µ i ( x ∗ )] , (6)where x ∗ is the target VCC. Note that the computed F ( η ∗ ) in online monitoring is also invariant to the calibration parameter η ∗ . Therefore, the VCC of interest x ∗ can be recoveredby minimizing the squared difference between the computed value and the underlying value¯ F (see Section 3.4 for the estimation):ˆ x ∗ = argmin x ∗ (cid:32) I (cid:88) i =1 c i F [ y ∗ i − µ i ( x ∗ )] − ¯ F (cid:33) . (7)In the cell manufacturing application, we are interested in recovering a VCC curve ˆ x ∗ [ t ]over the whole manufacturing period t = 1 , · · · , T . To this end, we perform optimization(7) for T times corresponding to each experimental time t . Note that here we have notincorporated the time-dependency (or smoothness) of the recovered function ˆ x ∗ [ t ] in onlinerecovering; one can use postprocessing methods or directly model x ∗ [ t ] via a parametricform in the optimization (7). Readers are referred to functional data analysis literature(Ramsay, 2004; Wang et al., 2016) for more discussion. We estimate the unknown transformation F [ · ], and the physical relationship µ i ( · ) usingthe training data D train = { y ji [ t ] , x j [ t ] , θ i } Tt =1 Jj =1 at hand. In our implementation, thetransformation F [ · ] is paramterized by the Box-Cox transformation (Box and Cox, 1964;Sakia, 1992) F λ [ z ] = log( z ) if λ = 0( z λ − ) λ if λ (cid:54) = 0 . (8)11ote that the log-transformation in the above example is a special case in the Box-Coxtransformation. Here, the Box-Cox transformation contains an unknown parameter λ . Atwo-parameter BoxCox or YeoJohnson transformation (Yeo and Johnson, 2000) can also beused if the data are not restricted to be nonnegative. Furthermore, in this article, we willfocus on the parametric transformation (8), but our proposed method are general and canbe applied to non-parametric cases.As for the physical relationship µ i ( · ), we adopt the following basis decomposition: µ i, β ( x ) = µ β ( x, θ i ) = K (cid:88) k =1 β k φ k ( x, θ i ) . (9)Here, φ k ( · ); k = 1 , , · · · , K are the pre-defined basis functions and β = [ β , · · · , β K ] T denotes the vector of corresponding coefficients. Such a set of basis { φ k ( · ) } Kk =1 is selectedby the physical knowledge of the cell manufacturing system or the observation from data.Meanwhile, to account for the separable assumption δ i ( x, η ) ≈ δ i ( η ) in Section 3.2, weintroduce slack variables ∆ = { ∆ ji } Ii =1 Jj =1 to account for the “goodness” of the assumption(more discussion below). Furthermore, the underlying value for the best invariance statistic¯ F is also unknown and need to be estimated from data (see Section 3.3).We propose to estimate the unknown parameters { λ, β , ¯ F , ∆ } via the following opti-mization problem with two penalization terms:min λ, β , ¯ F , ∆ l α ( λ, β , ¯ F , ∆ ) = min λ, β , ¯ F , ∆ (cid:88) t,j (cid:34) I (cid:88) i =1 c i F λ [ y ji [ t ] − µ β ( x j [ t ] , θ i )] − ¯ F (cid:35) + α (cid:88) i,j,t (cid:2) | y ji [ t ] − µ β ( x j [ t ] , θ i ) | − ∆ ji (cid:3) + α | β | . (10)Here, α and α are two penalization coefficients, and | · | denotes the vector l norm.The main objective term (i.e., the first term) in (10) is for achieving the best patient-invariance property of the constructed statistic F . Specifically, we minimize the mean-squared error (MSE) of its underlying truth ¯ F and the computed value from data. This is12quivalent to modeling the patient-invariance statistic F j for each patient as i.i.d. randomdraws from a normal distribution N ( ¯ F , σ ) with mean ¯ F and variance σ . Moreover, thefirst penalization term in (10) is for the separable assumption δ i ( x, η ) ≈ δ i ( η ). Here, weminimize the corresponding MSE of the set { δ i ( x j [ t ] , η j ) } Tt =1 to the underlying truth ∆ ji ,for each sensor i and patient j . Similarly, this can also be viewed as modeling δ i ( · , η j ) viai.i.d. normal random variables; the corresponding penalization α can then be interpretedas the ratio between the variances of the two normal distributions. Finally, the secondpenalization term α | β | is for basis selection, similar to the widely used LASSO method inthe literature (Tibshirani, 1996; Donoho, 2006). This is because, in the cell manufacturingapplication, one would only have an intuitive understanding of the sensing relationship; wewill collect a set of basis functions from experience and select the suitable ones via thispenalization.From the duality of the optimization problem, (10) can also be viewed as unpenalizedlog-likelihood combining both normal random variables with a sparsity constraint (cid:107) β (cid:107) ≤ γ (Boyd and Vandenberghe, 2004). The parameter α sets the variance ratio between thetwo random variables, and α controls the desired sparsity level as parameter γ . Since theobjective is to obtain a high recovery accuracy of the VCC of interest, α and α would betuned using cross-validation techniques (Friedman et al., 2001).Consider now estimating the parameters { λ, β , ¯ F , ∆ } via optimization (10) for fixed α > α >
0. We propose to use the following Blockwise Coordinate Descent (BCD)optimization algorithm, described below. First, assign initial values for { λ, β , ¯ F , ∆ } . Next,iterate the following three steps until the convergence is achieved: (i) for fixed λ and β ,update { ¯ F , ∆ } via the following estimates from first-order conditionsˆ¯ F = 1 J T (cid:88) i,j,t c i F λ (cid:2) y ji [ t ] − µ β ( x j [ t ] , θ i ) (cid:3) , ˆ∆ ji = 1 T T (cid:88) t =1 | y ji [ t ] − µ β ( x j [ t ] , θ i ) | ; (11)(ii) for fixed β and ¯ F , update the transformation parameter λ ignoring the two penalizationterms; and (iii) for fixed λ , ¯ F and ∆ , optimize for β via numerical line search methods, e.g.,13 lgorithm 1 The BCD algorithm for parameter estimation (10) • Set initial values λ ← β ← K • Set I × J × T × K tensor Φ with each element Φ ijtk ← φ k ( x j [ t ] , θ i ) repeat Optimizing F and ∆ : • Set I × J × T tensor D with each element D ijt ← y ji [ t ] − (cid:80) k Φ ijtk β k • Update ¯ F ← (cid:80) ijt c i F λ [ D ijt ] /J/T • Update ∆ ji ← (cid:80) t | c i F λ [ D ijt | /T for i = 1 , · · · I and j = 1 , · · · J Optimizing λ : • Set l ( λ, β , ¯ F , ∆ ) ← l α ( λ, β , ¯ F , ∆ ) with α = α = 0 • Update λ ← argmin λ l α ( λ, β , ¯ F , ∆ ) Optimizing β : • Update β ← argmin β l α ( λ, β , ¯ F , ∆ ) with the L-BFGS method until λ , ¯ F , ∆ and β converge • return λ , ¯ F , ∆ and β L-BFGS algorithm (Liu and Nocedal, 1989). The full optimization procedure is providedin Algorithm 1. Since (10) is a non-convex optimization problem, the proposed BCDalgorithm only converges to a stationary solution (Tseng, 2001). Because of this, we suggestperforming multiple runs of Algorithm 1 with random initializations for each run, thentaking the converged estimates for the run with smallest objective function. These runsshould be performed in parallel if possible, to take advantage of the parallel computingcapabilities in many computing systems.It is important to note the difference between the training stage in our calibration-free method and the calibration stage in the standard calibration problem (Kennedy andO’Hagan, 2001). In calibration methods, the calibration parameter η , assumed to be aconstant, is directly estimated from the training set. This can be viewed as estimating apopulation average of the historical { η j } Jj =1 , which would not be helpful in our cell manu-facturing application. Due to the patient-to-patient variability, the calibration parameter η ∗ corresponding to the new patient can be completely different from the historical averagevalue. In contrast, our calibration-free method adopts a patient-invariance statistic F ,constructed from multiple sensor readings, to alleviate this patient-to-patient variability. In14ur training setup, we learn the unknown mean relationship µ i ( · ) and the transformation F [ · ], which provide the best patient-invariance statistic F . We can then use the invariancestatistic F to effectively recover VCCs via (7), free from the patient-specific calibrationparameter η ∗ . A detailed simulation study is conducted in this section. We first look into a toy applicationof recovering gravitational acceleration coefficients, to show the applicability of the proposedmethod. We then discuss more simulation experiments with different sensing relationships.
Consider the following toy application, where the goal is to recover the gravitationalacceleration coefficient x , for a different planet. As illustrated in Figure 3, we drop a balland measure the traveling distance y of the ball after a certain period of time θ . From thephysical knowledge, we have the relationship y = f ( x, θ, η ) = 12 xθ + ηθ. (12)Here, y is the traveling distance measured by, e.g., taking a photo, η is the initial velocity ofthe ball, and θ is the time period between dropping the ball and taking the photo. Supposethe ball is dropped by an engineer, meaning that the initial velocity η is non-zero andchanges among different drops. With the collected data { y, θ } , typically, one cannot recoverthe gravitational acceleration x even with the known relationship (12). This is because theinitial velocity η is also unknown . The key idea of the proposed calibration-free method isto take multiple photos at different time θ i ; i = 1 , · · · , I . Therefore, more data { y i , θ i } Ii =1 iscollected with the same initial velocity η . We can then use the proposed invariance statisticand Algorithm 1 to “cancel out” η and conduct inference on the gravitational acceleration15 igure 3: An illustration and notations of the toy application of recovering the gravitationalacceleration coefficient. x of interest.The setup for recovering the gravity coefficient is as follows. We set the number ofphotos I = 3, with parameters (i.e., the time of taking photos) { θ i } i =1 = { . , , . } . Ahistorical dataset { y ji , x j ; η j } Jj =1 of size J = 100 is generated with calibration parameters(i.e., initial speed, unknown ) η j ∼ U nif (1 , x j ∼ N (9 . , ), andeach sensor reading (i.e., traveling distance) y ji simulated by (12) with an additional i.i.d.measurement error following N (0 , . ). To test the recovery accuracy, we let the underlyingtruth x ∗ = 9 . η ∗ ∼ U nif (1 ,
3) randomly generated, and y ∗ obtained by (12) with the samemeasurement error.The proposed calibration-free method (via Algorithm 1) is applied to find the besttransformation F ˆ λ [ · ] and then recover the gravitational acceleration ˆ x ∗ . The linear com-bination coefficients are { c i } i =1 = { , , − } , and the set of candidate basis functions isΦ = { x, θ, θx, θx , θ x } . The proposed calibration-free method is repeated, with newlygenerated test data { y ∗ , x ∗ = 9 . η ∗ } , for 20 times.We consider the following two baseline methods (also see Table 1). First, we implementthe supervised learning setting (Bishop, 2006), i.e., assuming the calibration parameter η = η j = η ∗ is the same in both training stage and monitoring stage. Such an assumptionis not true in the considered cell manufacturing application. Specifically, we use thehistorical dataset { y ji , x j } j =1 to estimate an ¯ η and a relationship g ( x, θ ) := f ( x, θ ; ¯ η ), which16 igure 4: Results of the gravity application: (a) shows the recovered gravitational accelerationby the three considered methods. The red line marks the underlying truth x ∗ = 9 . . (b) shows theboxplots of absolute error ratios between the proposed method and baseline baseline methods. Thered line marks the ratio of 1.0. would then be used to recover ˆ x ∗ . Here, we use the same set of basis functions Φ for g ( · ),and a similar optimization scheme with LASSO type penalization (Tibshirani, 1996) forestimating the coefficients β . This method is referred to as “SameCal”. The other baselinemethod is the standard l calibration method suggested by Tuo and Wu (2015). In orderto adopt this method, we need to assume that the calibration parameters { η j } j =1 in thehistorical data are known , which is not true in reality. Therefore, we refer this methodas “Oracle”. To estimate the sensing relationship f ( · ), we adopt the set of basis functionsΦ o = { x, θ, θx, θx , θ x, ηx, ηθ, ηxθ } and a similar optimization scheme with LASSO typepenalization. Both baseline methods are implemented to recover x ∗ of interest via minimizingthe squared difference similar to (7), using the same 20 simulated test data.Figure 4 (a) shows the boxplots of the estimated ˆ x ∗ using the proposed calibration-freemethod and the two baseline methods. The red line indicates the ground truth value x ∗ = 9 .
8. Among the three methods, the Oracle baseline preforms the best since it queries additional information of the calibration parameter in the training stage, which is again not feasible in reality. We notice that the proposed calibration-free method performs almostas good as Oracle. It can accurately recover the true value, with the mean over the 20estimates 9 . . not query the underlyingcalibration parameter in the training stage. Furthermore, the proposed calibration-freemethod is noticeably better in recovering the true x ∗ compared to SameCal. More specifically,our method outperforms SameCal with smaller errors in 17 estimates over 20 test runs intotal. This is not surprising since the calibration-free method, utilizing the patient-invariancestatistic, can address the patient-specific calibration parameters η ∗ . Here, we conduct more experiments on the proposed calibration-free method. Specifically,we consider the following four underlying sensing relationships f k ( x, θ, η ):1. f ( x, θ, η ) = xθ + ηθ ;2. f ( x, θ, η ) = 3 x + 2 θx + xθη ;3. f ( x, θ, η ) = θx + x + θη + √ θη + √ xη/ f ( x, θ, η ) = sin( x ) + ( x + η ) θ + xθ + η .Note that function f ( · ) is quite similar to the sensing relationship (12) in the gravityapplication in Section 4.1. For functions f ( · ) and f ( · ), we notice the existence of interactionterms between x and η , which means the separable assumption δ i ( x, η ) = δ i ( η ) in Section3.2 does not hold. However, f ( · ) and f ( · ) can still be approximately represented by theadopted set of basis functions Φ. For function f ( · ), it is quite complex, and cannot berepresented by Φ. We test all four functions, using the proposed calibration-free methodand the same two baseline methods – SameCal and Oracle – introduced in Section 4.1. Thedetailed test procedure is the same as that in Section 4.1.18 igure 5: Boxplots of error ratios between the proposed method and the considered baselines,under different sensing relationships. The red line marks the ratio of 1.0, indicating the baselinemethod achieves the same accuracy as the proposed method.
Figure 5 shows the boxplots of the absolute error ratios of the proposed method over thebaseline SameCal method (a) and the baseline Oracle method (b), under all four underlyingsensing functions f k ( x, θ, η ); k = 1 , · · · ,
4. We notice the error ratios in Figure 5 (a) aremostly smaller than 1 .
0, indicting that the proposed method can achieve smaller errorscompared to SameCal. This is because the assumption of constant calibration parameter inSameCal does not hold in cell manufacturing application (and thereby in this simulationstudy), whereas, our calibration-free method can address the patient-specific calibrationparameter via a proper combination of multiple sensor readings. Moreover, compared toOracle, the proposed method is only slightly worse. This shows the good performance of ourcalibration-free method: Though we do not know the values of the calibration parameter,we can recover the underlying parameter of interest similar to the Oracle baseline, wherewhose values are assumed accessible. Finally, we notice that for the sensing relationship f ( · ), while the proposed method adopts an inappropriate basis decomposition Φ, it stilloutperforms SameCal. This is again because the proposed calibration-free method introducesthe invariance statistic to alleviate the effect of patient-to-patient variability, and therefore,shows improved performances in recovering the quantity of interest.19 igure 6: The cell manufacturing application: (a) an illustration and (b) the actual experimentalsetup with an emphasis on the impedance measurement part.
In this section, we apply the proposed calibration-free method to the motivating case studyof cell manufacturing. As discussed in Sections 1 and 2, we are interested in recovering andmonitoring the Viable Cell Concentration (VCC) x [ t ] at different experimental time t . Thisis because the goal of cell manufacturing is to culture a small batch of cells to a significantamount, for an effective re-administering in the CAR T cell therapy (see Figure 1). In our experiment, human leukemic T cells (Jurkat E6-1; American Type Culture Col-lection, ATCC ® ) are cultured in ATCC-formulated culture medium (RPMI-1640; GEHealthcare) with 10% fetal bovine serum, 2 mM L-glutamine, 10 mM 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES), 1 mM sodium pyruvate, 4500 mg/L glucose, and1500 mg/L sodium bicarbonate in a 75 cm petri dish (Nunc ™ EasYFlask ™ ; ThermoFisherScientific ™ ). The cells are cultured in a humidified incubator controlled at 37 ° C and 5%CO , and the culture media is pre-heated to avoid the temperature effect on the impedancemeasurement (Dlugolecki et al., 2010).The impedance measurements are obtained by our electric cell-substrate impedancesensing. Figure 6 illustrates the experimental setting for the impedance measurement. Here,20e use I = 4 sandwich shape 3D impedance sensors, consisting a pair of parallel-plateelectrodes and PDMS (Sylgard 184, Tow Corning) to maintain a gap between two electrodes(see Figure 2 (b)). In our experimental setup, the geometry parameter θ of the sensors is theedge length of the electrode pads, and { θ i } i =1 = { mm, mm, mm, mm } . Impedancemeasurements are conducted by an LCR meter (E4980AL; Keysight Technologies) witha sinusoidal signal of 22 mVrms under multiple frequencies ranging from 500 Hz to 100kHz. We let impedance measurement y be the relaxation strength computed from theraw impedance readings over frequencies, i.e., the difference between permittivity valuesof high-frequency end and low-frequency end of the dielectric relaxation process, for itshigh dependency to the VCC of interest (Schwan, 1957). The measurement is taken every15 minutes for around 30-35 hours. This is because typically after 35 hours, we have tochange the culture media and expand the cells to a bigger cell culture flask, which wouldinevitably interrupt the online monitoring. This results in an online monitoring dataset D monitor = { y ji [ t ] , θ } Tt =1 Jj =1 with the underlying VCCs to be recovered.The ground truth VCCs are obtained by an automated cell counter (TC20 ™ ; Bio-RadLaboratories, Inc.), and the concentration is maintained between 1 10 and 1 10 cells/mL.Multiple repetitions are performed, with the averaged value reported as the underlyingVCC x . Note that the measurement procedure is not only labor-intensive but may alsointroduce contamination to the culture media (see Section 1). We will only measure VCCaround six times per cell culture experiment, which leads to a much smaller ( τ (cid:28) T ) yetfull training dataset D train = { y ji [ t ] , x j [ t ] , θ } τt =1 Jj =1 .We conduct the cell culturing for J = 5 experiments, each with different startingmaterials, i.e., different calibration parameters η j . We use the same I = 4 sensors { θ i } i =1 = { mm, mm, mm, mm } for the J = 5 experiments. In experiment j , we measure andcompute the relaxation strength of impedance y ji [ t ] for each sensor i , at different time t (every 15 minutes, T ≈
130 in total). Meanwhile, we measure the ground truth VCC x j [ t ]with a much lower resolution ( τ ≈ xp 1 Exp 2 Exp 3 Exp 4 Exp 5 Mean Proposed
SameCal
Table 2:
Cross-validation errors of the recovered VCCs for the cell manufacturing case study,using the proposed calibration-free method and the baseline SameCal method.
For the collected training dataset D train , we first preform a cross-validation test (Friedmanet al., 2001) on the recovered VCCs. More specifically, we apply the proposed calibration-free method (via Algorithm 1) to learn the sensing relationship using four out of fiveexperiments, and then recover VCC ˆ x j [ t ] for the remaining experiment via (7). We letthe linear combination coefficients { c i } i =1 = { , − , , − } and select the same set of basisfunctions Φ as in Section 4.1. Furthermore, a log-transformation on VCCs is performed priorto the analysis. We consider SameCal (see Section 4) as the baseline method. Such a methodintroduces an additional assumption that the calibration parameter is a constant. Notethat the Oracle baseline cannot be adopted here since the actual values of the calibrationparameter are always unknown , which is the key motivation of the proposed calibration-freemethod (also see Section 2.2 and Table 1).Table 2 shows the absolute errors of the recovered log VCCs when the ground truthVCCs are measured in each experiment. We observe that the proposed method outperformsthe baseline SameCal method four experiments out of five. Furthermore, the mean error ofthe five experiments by the proposed method is 0 . .
501 by SameCal. This is due to the fact that the calibration parameter, whichmodels the patient-to-patient variability, is not a constant in cell manufacturing (Hinrichsand Restifo, 2013); the proposed calibration-free method properly addresses this variabilityvia the construction of the patient-invariance statistic.22 igure 7:
The recovered VCC over time of two cell manufacturing experiments under the twoconsidered methods. The ground truth VCC measurements are shown in dots.
We then perform VCC recovery on the online monitoring set D monitor . Here, the sensingrelationship is estimated using all five experiments in the training set D train . Figure 7shows the two recovered log VCC curves over the whole culture time log ˆ x j [ t ], via theproposed calibration-free method (in red line) and the baseline SameCal method (in greendash line). The ground truth (log) VCC measurements in D train are also plotted in blackdots. We see that the proposed method recovers a meaningful estimation of VCC. Therecovered log ˆ x j [ t ] increases approximately linear over the culture time t , indicating ˆ x j [ t ]growths exponentially in time; this matches the preliminary understanding in the cell cultureliterature (Haycock, 2011). Furthermore, the recovered ˆ x j [ t ] approximately passes throughthe ground truth measurements. However, due to the huge patient-to-patient variability, thebaseline SameCal method struggles in either passing through the ground truth experimentsor providing reasonable estimates of VCC curves. Our calibration-free method, adoptingthe patient-invariance statistic, appears to alleviate such variability well.The proposed calibration-free method can also provide important biological insights forcell growth in cell manufacturing. From Figure 7 (b), we notice a decrease in the VCC curveat around hour 32. This may be due to the lack of nutrition in the media since the culturemedia typically needs to be changed after 30 hours. Furthermore, we observe from Figure 723hat the VCC curves decrease slightly in the first two hours in cell manufacturing. Thismay be because of the lack of viability of the cells at the beginning of the culture process –though we have already thaw cells and stood them still for several minutes, it seems that acertain portion of cells still do not gain full viability and die soon. As a result, we suggeststanding the cell still longer for future experiments. Last but not least, we notice a smallVCC decrease when conducting the ground truth VCC measurements. One reason for thisis that the measurement itself is not in-line and needs to contact the culture media; it mayintroduce contamination, and therefore, kill a small portion of cells (Haycock, 2011). Incontrast, the proposed calibration-free biosensing method, together with impedance-basedbiosensors, provides an in-line, non-destructive, and non-contact way for VCC monitoringin cell manufacturing. In this work, we propose a new calibration-free method for monitoring viable cell con-centration in cell manufacturing, which is a critical component in the promising CART cell therapy. The key challenge here is the patient-to-patient variability in the initialculturing material, leading to poor performances in recovering viable cell concentrations viaexisting methods. We propose to use multiple impedance-based biosensors with differentgeometries and an associated calibration-free statistical framework for online recovery ofviable cell concentrations. Specifically, we model the patient-to-patient variability via apatient-specific calibration parameter. We then construct a patient-invariance statistic,which uses a transformation and a linear combination of sensor readings to alleviate theeffect of the calibration parameter. In the training stage, we learn the best transformationand the sensing relationship via a carefully formulated optimization problem. In the onlinemonitoring stage, viable cell concentrations can be recovered via the invariance statistic, freefrom the patient-specific calibration parameter. We then apply the proposed calibration-freemethod in different simulation experiments and a real-world case study of cell manufacturing,24here the proposed method demonstrates substantial improvements against the existingmethods. Therefore, we believe the proposed calibration-free method can play an essentialrole in cell manufacturing and reduce the cost of the promising CAR T cell therapy.Looking ahead, there are several interesting directions for future exploration. To beginwith, a more thorough analysis of impedance-based sensors can be conducted, with a detailedcomparison of sensitivity using different experimental settings such as sensor geometries andelectrode materials. Moreover, we adopt in this work a parametric sensing relationship anda heuristic approach for parameter estimation. This is mainly due to the already improvedperformance compared to the baseline methods. A more flexible, and non-parametricGaussian process regression method (Santner et al., 2013; Lin and Joseph, 2019) with arigorous likelihood-based parameter estimation scheme may lead to further improvementsin recovering viable cell concentrations, as well as other critical quality attributes. Finally,micro cameras can also be used in cell manufacturing. Therefore, we are also interestedin monitoring cell manufacturing based on cell morphology. In this case, physics-informeddeep learning frameworks in the literature (Raissi et al., 2017; Chen et al., 2020) appear tobe suitable for recovering critical quality attributes in cell manufacturing.25 eferences
Aster, R. C., Borchers, B., and Thurber, C. H. (2018).
Parameter Estimation and Inverse Problems .Elsevier.Bishop, C. M. (2006).
Pattern Recognition and Machine Learning . Springer.Bonifant, C. L., Jackson, H. J., Brentjens, R. J., and Curran, K. J. (2016). Toxicity and managementin car t-cell therapy.
Molecular Therapy-Oncolytics , 3:16011.Box, G. E. P. and Cox, D. R. (1964). An analysis of transformations.
Journal of the RoyalStatistical Society, Series B , 26:211–252.Boyd, S. and Vandenberghe, L. (2004).
Convex Optimization . Cambridge University Press.Brown, D. A. and Atamturktur, S. (2018). Nonparametric functional calibration of computermodels.
Statistica Sinica , pages 721–742.Chen, J., Mak, S., Joseph, V. R., and Zhang, C. (2019). Function-on-function kriging, withapplications to 3d printing of aortic tissues. arXiv preprint arXiv:1910.01754 .Chen, J., Xie, Y., Wang, K., Zhang, C., Vannan, M. A., Wang, B., and Qian, Z. (2020). Activeimage synthesis for efficient labeling.
IEEE Transactions on Pattern Analysis and MachineIntelligence .Collins, T. J. (2007). ImageJ for microscopy.
Biotechniques , 43(S1):S25–S30.Dlugolecki, P., Ogonowski, P., Metz, S. J., Saakes, M., Nijmeijer, K., and Wessling, M. (2010).On the resistances of membrane, diffusion boundary layer and double layer in ion exchangemembrane transport.
Journal of Membrane Science , 349(1-2):369–379.Donoho, D. L. (2006). For most large underdetermined systems of linear equations the minimal l -norm solution is also the sparsest solution. Communications on Pure and Applied Mathematics:A Journal Issued by the Courant Institute of Mathematical Sciences , 59(6):797–829.Ezzat, A. A., Pourhabib, A., and Ding, Y. (2018). Sequential design for functional calibration ofcomputer models.
Technometrics , 60(3):286–296.Friedman, J., Hastie, T., and Tibshirani, R. (2001).
The Elements of Statistical Learning , volume 1.Springer Series in Statistics, NY.Gheorghiu, E. and Asami, K. (1998). Monitoring cell cycle by impedance spectroscopy: experi-mental and theoretical aspects.
Bioelectrochemistry and Bioenergetics , 45(2):139–143. oh, S. and Ram, R. (2010). Impedance spectroscopy for in situ biomass measurements inmicrobioreactor. In Proceedings of the 14th International Conference on Miniaturized Systemsfor Chemistry and Life Science, Groningen, The Netherlands , pages 3–7.Guillas, S., Sarri, A., Day, S. J., Liu, X., and Dias, F. (2018). Functional emulation of highresolution tsunami modelling over cascadia.
The Annals of Applied Statistics , 12(4):2023–2053.Harrison, R. P., Zylberberg, E., Ellison, S., and Levine, B. L. (2019). Chimeric antigen receptor–Tcell therapy manufacturing: modelling the effect of offshore production on aggregate cost ofgoods.
Cytotherapy , 21(2):224–233.Haycock, J. W. (2011). 3D cell culture: a review of current approaches and techniques. In , pages 1–15. Springer.Hinrichs, C. S. and Restifo, N. P. (2013). Reassessing target antigens for adoptive T-cell therapy.
Nature Biotechnology , 31(11):999.Joseph, V. R. and Melkote, S. N. (2009). Statistical adjustments to engineering models.
Journalof Quality Technology , 41(4):362–375.June, C. H., OConnor, R. S., Kawalekar, O. U., Ghassemi, S., and Milone, M. C. (2018). CAR Tcell immunotherapy for human cancer.
Science , 359(6382):1361–1365.Kennedy, M. C. and O’Hagan, A. (2001). Bayesian calibration of computer models.
Journal of theRoyal Statistical Society: Series B , 63(3):425–464.Kim, S. U. and De Vellis, J. (2009). Stem cell-based cell therapy in neurological diseases: a review.
Journal of Neuroscience Research , 87(10):2183–2200.Lin, L.-H. and Joseph, V. R. (2019). Transformation and additivity in gaussian processes.
Technometrics , pages 1–11.Liu, D. C. and Nocedal, J. (1989). On the limited memory BFGS method for large scale optimization.
Mathematical Programming , 45(1-3):503–528.Miura, T. and Uno, S. (2019). Computer simulation for electrochemical impedance of a livingcell adhered on the inter-digitated electrode sensors.
Japanese Journal of Applied Physics ,58(SB):SBBG15.Pan, Y., Hu, N., Wei, X., Gong, L., Zhang, B., Wan, H., and Wang, P. (2019). 3D cell-basedbiosensor for cell viability and drug assessment by 3D electric cell/matrigel-substrate impedancesensing.
Biosensors and Bioelectronics , 130:344–351. lumlee, M., Joseph, V. R., and Yang, H. (2016). Calibrating functional parameters in the ionchannel models of cardiac cells. Journal of the American Statistical Association , 111(514):500–509.Prasad, V. (2018). Tisagenlecleucel – the first approved CAR-T-cell therapy: implications forpayers and policy makers.
Nature Reviews Clinical Oncology , 15(1):11–12.Raissi, M., Perdikaris, P., and Karniadakis, G. E. (2017). Physics informed deep learning (part i):Data-driven solutions of nonlinear partial differential equations. arXiv preprint arXiv:1711.10561 .Ramsay, J. O. (2004). Functional data analysis.
Encyclopedia of Statistical Sciences , 4.Sakia, R. M. (1992). The Box-Cox transformation technique: a review.
Journal of the RoyalStatistical Society: Series D (The Statistician) , 41(2):169–178.Santner, T. J., Williams, B. J., and Notz, W. I. (2013).
The Design and Analysis of ComputerExperiments . Springer Science & Business Media.Schwan, H. P. (1957). Electrical properties of tissue and cell suspensions. In
Advances in Biologicaland Medical Physics , volume 5, pages 147–209. Elsevier.Slouka, C., Wurm, D. J., Brunauer, G., Welzl-Wachter, A., Spadiut, O., Fleig, J., and Herwig, C.(2016). A novel application for low frequency electrochemical impedance spectroscopy as anonline process monitoring tool for viable cell concentrations.
Sensors , 16(11):1900.Tibshirani, R. (1996). Regression shrinkage and selection via the lasso.
Journal of the RoyalStatistical Society: Series B (Methodological) , 58(1):267–288.Tseng, P. (2001). Convergence of a block coordinate descent method for nondifferentiable mini-mization.
Journal of Optimization Theory and Applications , 109(3):475–494.Tuo, R. and Wu, C. F. J. (2015). Efficient calibration for imperfect computer models.
The Annalsof Statistics , 43(6):2331–2352.Wang, J.-L., Chiou, J.-M., and M¨uller, H.-G. (2016). Functional data analysis.
Annual Review ofStatistics and its Application , 3:257–295.Yeo, I.-K. and Johnson, R. A. (2000). A new family of power transformations to improve normalityor symmetry.
Biometrika , 87(4):954–959.Yin, P. (2017). Keys to scale-up CAR T-cell therapy manufacturing.
BioProcess Online ..