arXiv [math.OC]

MONOTONE SMOOTHING SPLINES WITH BOUNDS
SARA MAAD SASANE
Abstract.
The problem of monotone smoothing splines with bounds is formulated as a constrained minimization problem of the calculus of variations. Existence and uniqueness of solutions of this problem are proved, as well as its equivalence to a finite dimensional but nonlinear optimization problem. A new algorithm, using a branch and bound technique, for computing the solution (which is a spline curve) is presented. The method is applied to examples in neuroscience and to the fitting of cumulative distribution functions from data.

1. Introduction
Splines and smoothing splines have a long history of application in many fields. The basic history is outlined in Egerstedt and Martin [3] and in Wahba [19]. See also [20] for an introduction to the use of smoothing splines in statistics. In this paper we return to the problem of monotone smoothing splines, which was previously studied in [3, 4, 8, 9]. A classical application is to determine the average growth curve of a population of juveniles. Suppose we have a population of perhaps 30 children whose heights are measured every 6 months from age 2 to 20. It is easy to fit a smoothing spline to the data set, but there is no guarantee of monotonicity. Since people just do not get shorter and then regrow, the smoothing spline is not appropriate for this and other similar situations. Instead, the appropriate tool is the monotone smoothing spline. Charles, Sun and Martin [2] used monotone smoothing splines in calculating distribution functions. There a problem can arise in that the spline may become greater than 1 at some point, and by monotonicity it can never decrease, and so this violates the condition that a cumulative distribution only takes values between 0 and 1, a problem that was not addressed in [2]. In this paper, we solve this problem by imposing the condition that the spline is bounded above by some number x_max (which would be 1 in the case of cumulative distribution functions). We also present an application of monotone smoothing splines in mathematical biology, where sigmoidally shaped functions commonly occur.

The main difficulty of the problem is the monotonicity constraint ẋ ≥ 0, which has to hold at every point t in the interval under consideration.
This infinite dimensional constraint is handled by a vector space version of the Karush–Kuhn–Tucker theorem, and using this, it is possible to reduce the original infinite dimensional problem to a finite dimensional problem which can be solved numerically.

The paper is organized as follows: In Section 2, we formulate the curve fitting problem as a constrained calculus of variations problem, and in Section 3, we show existence and uniqueness of the minimizer of this minimization problem, and hence existence and uniqueness of the monotone smoothing spline function. In Section 4, we formulate the Karush–Kuhn–Tucker conditions for this problem, and use these to prove a key lemma, saying that the second derivative ẍ of the optimal curve is essentially piecewise linear. In Section 5, we use the key lemma from Section 4 to reformulate the infinite dimensional problem of Section 2 into a finite dimensional but nonlinear problem. This was essentially done previously in [3], except that we provide more details in the proof. We also introduce a new branch and bound type algorithm for computing the optimal curve. Sections 6 and 7 contain applications and examples which show how the method can be used. In Section 6, we reconstruct sigmoidally shaped curves arising in an intracellular signalling model, while in Section 7, we apply the method to reconstructing cumulative distribution functions from data, and in particular for a cumulative distribution function arising in the cell cycle: the distribution that we reconstruct gives the time certain cells remain in a particular phase of the cell cycle.

2. The problem
Let T > 0, and let 0 = t_0 < t_1 < ··· < t_m = T. Consider a data set {(t_i, α_i)} with α_i ∈ R, with associated weights w_i > 0, i = 1, ..., m. Let x_max > 0, and consider the following optimization problem for functions defined on the interval [0, T]:

(1)   min  (1/2) ∫_0^T ẍ(t)^2 dt + (1/2) Σ_{i=1}^m w_i (x(t_i) − α_i)^2,

subject to

(2)   x ∈ H^2((0, T)),  x(0) = 0,  ẋ ≥ 0 on (0, T),  x(T) ≤ x_max,

where H^2((0, T)) is the Sobolev space of twice weakly differentiable functions on (0, T) with weak derivatives in L^2((0, T)). The condition x(0) = 0 is included because in many situations in applications, it is clear for modelling purposes that the curve must satisfy this condition. This happens for example in the application to the intracellular signalling model that we discuss in Section 6. The condition x(T) ≤ x_max arises from the requirement 0 ≤ x(t) ≤ x_max. Since the curve is monotonically increasing, it suffices to impose the condition at the endpoint t = T.

As (0, T) is a bounded interval, we can use ‖x‖ := (∫_0^T ẍ(t)^2 dt)^{1/2} as a norm on H^2((0, T)). The rest of the paper is devoted to solving this problem, and to applications of the developed method.

3. Existence and uniqueness of a minimizer
Theorem 1.
Let X = {x ∈ H^2((0, T)); x(0) = 0, ẋ ≥ 0, x(T) ≤ x_max}, and assume that m ≥ 1. There exists a unique x* ∈ X which solves the minimization problem

min_{x ∈ X}  (1/2) ∫_0^T ẍ(t)^2 dt + (1/2) Σ_{i=1}^m w_i (x(t_i) − α_i)^2.

Proof.
Let f : X → R be defined by

(3)   f(x) = (1/2) ∫_0^T ẍ(t)^2 dt + (1/2) Σ_{i=1}^m w_i (x(t_i) − α_i)^2.

We will use the direct method in the calculus of variations, which says that if a functional is coercive on H^2((0, T)) and weakly lower semicontinuous on a weakly closed set, then a minimizer exists (see e.g. [17], p. 4). It is easy to see that the set X is convex and closed in H^2((0, T)), and hence it is weakly closed by Mazur's lemma (see e.g. Theorem 3.13 of [12]).

We will first check that the first term of f is weakly lower semicontinuous and coercive on L^2([0, T]). Indeed, it is weakly lower semicontinuous since

0 ≤ ∫_0^T (ẍ_j(t) − ẍ(t))^2 dt = ∫_0^T ẍ_j(t)^2 dt − 2 ∫_0^T ẍ_j(t) ẍ(t) dt + ∫_0^T ẍ(t)^2 dt,

and so if x_j ⇀ x (i.e. x_j converges weakly to x in H^2((0, T))), then

0 ≤ liminf_{j→∞} ( ∫_0^T ẍ_j(t)^2 dt − 2 ∫_0^T ẍ_j(t) ẍ(t) dt + ∫_0^T ẍ(t)^2 dt ) = liminf_{j→∞} ( ∫_0^T ẍ_j(t)^2 dt − ∫_0^T ẍ(t)^2 dt ),

i.e.

∫_0^T ẍ(t)^2 dt ≤ liminf_{j→∞} ∫_0^T ẍ_j(t)^2 dt,

which shows that x ↦ ∫_0^T ẍ(t)^2 dt is weakly lower semicontinuous on H^2((0, T)). Coerciveness of the first term on H^2((0, T)) is obvious since it is the square of the norm on H^2((0, T)).

Weak lower semicontinuity of the second term of (1) follows since H^2((0, T)) is compactly embedded in C^1([0, T]) ⊂ C([0, T]). Indeed, if x_j ⇀ x in H^2((0, T)), then x_j is bounded in H^2((0, T)), and since the embedding of H^2((0, T)) into C([0, T]) is compact, x_j has a subsequence x_{j_l} which converges (to x, by uniqueness of the weak limit) in C([0, T]). Finally, note that for each i = 1, ..., m,

|x_j(t_i) − x(t_i)| ≤ max_{t ∈ [0, T]} |x_j(t) − x(t)| → 0

if x_j ⇀ x in H^2((0, T)), and so the sum in (1) is weakly continuous. As f is a sum of two weakly lower semicontinuous functions, it is clear that f is weakly lower semicontinuous on X.

Next, we prove that f is coercive on X.
As x ↦ ∫_0^T ẍ(t)^2 dt is coercive on L^2([0, T]), we see that

f(x) ≥ (1/2) ∫_0^T ẍ(t)^2 dt = (1/2) ‖x‖^2 → ∞  as ‖x‖ → ∞.

To show uniqueness, we will show that f is strictly convex. For this, we will use that u ↦ ∫_0^T u(t)^2 dt and x ↦ (x − α)^2 are strictly convex on L^2([0, T]) and on R, respectively. Let f_1(x) := (1/2) ∫_0^T ẍ(t)^2 dt, f_2(x) := (1/2) w_1 (x(t_1) − α_1)^2, and f_3(x) := (1/2) Σ_{j=2}^m w_j (x(t_j) − α_j)^2, so that f = f_1 + f_2 + f_3 on X. It is clear that f_1, f_2 and f_3 are convex (but not strictly convex) functions on X. To show that f is strictly convex, we need to prove that if

(4)   f(λx_1 + (1 − λ)x_2) = λf(x_1) + (1 − λ)f(x_2)

for some λ ∈ (0, 1) and x_1, x_2 ∈ X, then x_1 = x_2.

To prove this, we assume that (4) holds. Since the f_i are convex for i = 1, 2, 3 and f = f_1 + f_2 + f_3, equation (4) holds also when f is replaced by f_i, i = 1, 2, 3. Since u ↦ ∫_0^T u(t)^2 dt is strictly convex, the equality (4) for f_1 implies that ẍ_1 = ẍ_2. Then, by integration and using that x_1(0) = x_2(0) = 0, it follows that there exists a real constant A ∈ R such that x_1(t) = x_2(t) + At.

On the other hand, since x ↦ (x − α_1)^2 is strictly convex and w_1 > 0, equality (4) for f_2 implies that x_1(t_1) = x_2(t_1). Combining this with x_1 = x_2 + At, we obtain A = 0 (since t_1 > 0 and x_1(t_1) = x_2(t_1)). We have proved that x_1(t) = x_2(t) for all t ∈ [0, T], and hence x_1 = x_2 in X. This concludes the proof that f is strictly convex on X, and from this it also follows that the minimizer is unique. □

4. The Karush–Kuhn–Tucker conditions
The constrained optimization problem will be solved with a vector space version of the KKT method, cf. Theorem 1, p. 249 of [6]. Let X := {x ∈ H^2((0, T)); x(0) = 0}, and let Z := C([0, T]) × R, with a norm defined by

‖(u, v)‖_Z = ( ‖u‖_{C([0,T])}^2 + |v|^2 )^{1/2}.

We define G : X → Z by G(x) := (−ẋ, x(T) − x_max). We note that since x ∈ H^2((0, T)), it follows that ẋ ∈ H^1((0, T)) ⊂ C([0, T]), and so it is clear that G maps X into Z. It is straightforward to check that G is Fréchet differentiable, and its derivative is

G′(x)(h) = (−ḣ, h(T)).

By the Riesz representation theorem (see e.g. pp. 113–115 of [6], and pp. 146–150 of [18]), the dual space of C([0, T]) is identified with the normalized space of functions of bounded variation on [0, T], denoted by NBV([0, T]), consisting of functions ν of bounded variation on [0, T] which are right-continuous and satisfy ν(0) = 0, such that the functionals φ on C([0, T]) can be expressed as the Riemann–Stieltjes integral

φ(u) = ∫_0^T u(t) dν(t),

and the norm of φ is the total variation of ν on [0, T], denoted by ‖ν‖_{NBV([0,T])}.

We denote the dual space of Z by Z*. By the above result, Z* is identified with NBV([0, T]) × R, and the norm of an element (ν, μ) ∈ Z* is given by

‖(ν, μ)‖_{Z*} = ( ‖ν‖_{NBV([0,T])}^2 + μ^2 )^{1/2}.

The positive cone in Z is

P := {(w, α) ∈ Z; w ≥ 0 and α ≥ 0}.

It is clear that P has a nonempty interior. The corresponding positive cone P* in Z* is

P* := {(ν, μ) ∈ NBV([0, T]) × R; ν is nondecreasing and μ ≥ 0}.

We will derive the KKT conditions for the optimization problem of equation (1). In order to do this, we first show that all points satisfying the inequality G(x) ≤ 0 (i.e. G(x) ∈ −P) are regular points for this inequality (cf. [6], p. 248).

Lemma 1.
Every x ∈ X with G(x) ≤ 0 is a regular point of the inequality G(x) ≤ 0.
Proof.
Let x ∈ X be such that G(x) ∈ −P. We need to show that there exists an h ∈ X such that G(x) + G′(x)h is an interior point of −P, i.e. that h ∈ H^2((0, T)) satisfies

−ẋ − ḣ < 0,   x(T) + h(T) − x_max < 0.

There are clearly many choices for h, for example h(t) = (x_max/(2T)) t − x(t). With this choice, we have h ∈ X and

−ẋ(t) − ḣ(t) = −x_max/(2T) < 0,   x(T) + h(T) − x_max = −x_max/2 < 0,

i.e. G(x) + G′(x)h is an interior point of −P. □

The functional f : X → R defined by (3) is Fréchet differentiable, and its derivative is given by

f′(x)(h) = ∫_0^T ẍ(t) ḧ(t) dt + Σ_{i=1}^m w_i (x(t_i) − α_i) h(t_i).

Let x* ∈ X be the minimizer of f subject to G(x) ∈ −P. By the KKT theorem (see [6], p. 249), there exists a z* ∈ Z*, z* ≥ 0 (i.e. z* ∈ P*), such that the Lagrangian f(x) + ⟨G(x), z*⟩ is stationary at x*, and such that ⟨G(x*), z*⟩ = 0.

An explicit statement of the KKT conditions implies the following result, which will be used in the next section for constructing a numerical algorithm for the solution:

Lemma 2.
Let x* ∈ X be the minimizer of the minimization problem (1)–(2), and let u* := ẍ*. Then the following holds:

(1) u* is affine on each subinterval of [t_{i−1}, t_i) where ẋ* > 0,
(2) u*(0) = u*(T) = 0,
(3) if ẋ* = 0 on an interval, then u* = 0 on this interval (and hence it is an affine function also there).

Remark 1.
We cannot conclude directly from Lemma 2 that u* is piecewise linear, since we cannot yet rule out that there is an increasing sequence of points s_j → s such that ẋ*(s_j) = 0 while ẋ*(t) > 0 for t ∈ (s_{j−1}, s_j) (and u* is affine on each of the intervals (s_{j−1}, s_j)). We will see in Lemma 3 that this does not happen for the optimal curve, and u* is in fact piecewise linear.

Proof. By the KKT conditions ([6], p. 249), there exist a ν* ∈ NBV([0, T]) and a μ* ∈ R such that

(5)   ∫_0^T ẍ*(t) ḧ(t) dt + Σ_{i=1}^m w_i (x*(t_i) − α_i) h(t_i) − ∫_0^T ḣ(t) dν*(t) + μ* h(T) = 0

for all h ∈ X. Furthermore,

(6)   −∫_0^T ẋ*(t) dν*(t) + μ* (x*(T) − x_max) = 0,

where ν* is nondecreasing and μ* ≥ 0. Equation (6) is the complementary slackness condition, and together with the constraint G(x*) ≤ 0, it implies that ν*(t) is constant for t such that ẋ*(t) > 0, and that μ* = 0 if x*(T) − x_max < 0.

The Riemann–Stieltjes integral in (5) may be integrated by parts, and doing so and noting that dḣ(t) = ḧ(t) dt (since ḣ ∈ H^1((0, T)) and hence absolutely continuous), we obtain after collecting the two integral terms

(7)   ∫_0^T (u*(t) + ν*(t)) ḧ(t) dt + Σ_{i=1}^m w_i (x*(t_i) − α_i) h(t_i) − ν*(T) ḣ(T) + μ* h(T) = 0

for all h ∈ X. Choosing h(t) = 0 on all except one of the subintervals (t_{i−1}, t_i), i = 1, ..., m, we conclude that for each i = 1, ..., m,

∫_{t_{i−1}}^{t_i} (u*(t) + ν*(t)) ḧ(t) dt = 0

for all h ∈ C_c^∞((t_{i−1}, t_i)). Hence there exist β_i, γ_i, i = 1, ..., m, such that u*(t) + ν*(t) = β_i + γ_i t on (t_{i−1}, t_i). We may assume (by choosing a representative for the function u* ∈ L^2((0, T))) that u*(t) + ν*(t) = β_i + γ_i t on the half open interval [t_{i−1}, t_i) for i = 1, ..., m.
Hence u* is a right continuous function of bounded variation. Now with a general h ∈ X, the integral term of (7) can be rewritten using integration by parts as

Σ_{i=1}^m ∫_{t_{i−1}}^{t_i} (β_i + γ_i t) ḧ(t) dt
  = Σ_{i=1}^m ( (β_i + γ_i t_i) ḣ(t_i) − (β_i + γ_i t_{i−1}) ḣ(t_{i−1}) − γ_i (h(t_i) − h(t_{i−1})) )
  = Σ_{i=1}^{m−1} ( (β_i − β_{i+1} + (γ_i − γ_{i+1}) t_i) ḣ(t_i) − (γ_i − γ_{i+1}) h(t_i) ) − β_1 ḣ(0) + (β_m + γ_m T) ḣ(T) − γ_m h(T).

By choosing h appropriately (i.e. exactly one of h(t_i) and ḣ(t_i) not equal to zero), we conclude that

(8)   β_i − β_{i+1} + (γ_i − γ_{i+1}) t_i = 0,  for i = 1, ..., m−1,
      −(γ_i − γ_{i+1}) + w_i (x*(t_i) − α_i) = 0,  for i = 1, ..., m−1,
      β_1 = 0,
      β_m + γ_m T − ν*(T) = 0,
      −γ_m + μ* + w_m (x*(T) − α_m) = 0,

where we have also used that h(0) = 0. In particular, since ν*(0) = 0 and β_1 = 0, it follows that u*(0) = 0; similarly, the fourth equation of (8) implies that u*(T) = 0. Hence u* ∈ NBV([0, T]). Note that the first equation of (8) implies that u* + ν* is continuous at the spline knots t_i, i = 1, ..., m−1, and the second equation implies that the derivative of u* + ν* has jumps of size −w_i (x*(t_i) − α_i) at t_i, i = 1, ..., m−1.

Recall that ν* is (locally) constant for t such that ẋ*(t) > 0. Let us examine what happens at a point s such that ẋ*(s) = 0. If s is an isolated zero of ẋ*, then ν* may have a jump discontinuity at s (where it is right continuous). If ẋ* = 0 on an interval around s, then clearly also u* = ẍ* = 0 on this interval. In particular, u* is piecewise linear in this interval. □

5. An algorithm for computing the optimal solution
Using Lemma 2, we will now reformulate the optimization problem as a finite dimensional problem which we can solve numerically. This is essentially the approach of Section 7.3 of [3], and their method has been adapted to the extra constraint x(T) ≤ x_max. Instead of using dynamic programming as in [3], we give an outline of a branch and bound algorithm for finding an optimal solution to the problem. The variables of this new problem are x_0, ..., x_m, v_0, ..., v_m, which are the (unknown) values of the function x(t) and its derivative ẋ(t) at the spline knots.

Assuming initially that the values x_0, ..., x_m and v_0, ..., v_m are known, we will use the method of [3] to determine a curve with minimal cost under the constraint that it passes through the points (t_0, x_0), ..., (t_m, x_m) with derivatives at these points equal to v_0, ..., v_m, respectively. This will give us a new cost function depending on the variables x_0, ..., x_m and v_0, ..., v_m. Minimizing this function is equivalent to the original infinite dimensional problem.

Now we focus our attention on one interval [t_i, t_{i+1}], and rename it [t_0, t_F]. Without loss of generality, we assume that t_0 = 0. The corresponding values of x at the endpoints 0 and t_F are denoted by x_0 and x_F, respectively. We assume without loss of generality that x_0 = 0. The values of the derivatives at the endpoints are denoted by ẋ_0 and ẋ_F. The following lemma is essentially given in [3], but here we provide the full proof with more details, taking care of excluding the pathological case described in the remark after Lemma 2. Note also that a typo in formula (7.27) of [3] has been corrected.

Lemma 3. [3]
Suppose that ẋ_0, ẋ_F ≥ 0, x_F ≥ 0 and t_F > 0. Then the optimal control u which minimizes ∫_0^{t_F} u^2 dt subject to the constraints ẍ(t) = u(t) for t ∈ (0, t_F), x(0) = 0, x(t_F) = x_F, ẋ(0) = ẋ_0, ẋ(t_F) = ẋ_F and ẋ(t) ≥ 0 for t ∈ (0, t_F) is given by

u(t) = ( 6(ẋ_0 + ẋ_F)/t_F^2 − 12 x_F/t_F^3 ) t + 6 x_F/t_F^2 − 4 ẋ_0/t_F − 2 ẋ_F/t_F

if x_F ≥ (t_F/3)(ẋ_0 + ẋ_F − √(ẋ_0 ẋ_F)), and, writing S := ẋ_0^{3/2} + ẋ_F^{3/2},

u(t) =
  (2S^2/(9x_F^2)) ( t − 3 x_F ẋ_0^{1/2}/S )          if 0 ≤ t < 3 x_F ẋ_0^{1/2}/S,
  0                                                   if 3 x_F ẋ_0^{1/2}/S ≤ t ≤ t_F − 3 x_F ẋ_F^{1/2}/S,
  (2S^2/(9x_F^2)) ( t − t_F + 3 x_F ẋ_F^{1/2}/S )     if t_F − 3 x_F ẋ_F^{1/2}/S < t ≤ t_F,

if x_F < (t_F/3)(ẋ_0 + ẋ_F − √(ẋ_0 ẋ_F)). The contribution to the cost in the two cases is

∫_0^{t_F} u(t)^2 dt =
  4( (ẋ_0^2 + ẋ_F^2) t_F^2 − 3 x_F (ẋ_0 + ẋ_F) t_F + 3 x_F^2 + ẋ_0 ẋ_F t_F^2 ) / t_F^3   if x_F ≥ (t_F/3)(ẋ_0 + ẋ_F − √(ẋ_0 ẋ_F)),
  4 S^2 / (9 x_F)                                                                          otherwise.

The corresponding spline function x is given by

x(t) = ẋ_0 t + ( 3 x_F/t_F^2 − (2ẋ_0 + ẋ_F)/t_F ) t^2 + ( (ẋ_0 + ẋ_F)/t_F^2 − 2 x_F/t_F^3 ) t^3

if x_F ≥ (t_F/3)(ẋ_0 + ẋ_F − √(ẋ_0 ẋ_F)), and

(9)   x(t) =
  x_F ẋ_0^{3/2}/S + (S^2/(27 x_F^2)) ( t − 3 x_F ẋ_0^{1/2}/S )^3          if 0 ≤ t < 3 x_F ẋ_0^{1/2}/S,
  x_F ẋ_0^{3/2}/S                                                          if 3 x_F ẋ_0^{1/2}/S ≤ t ≤ t_F − 3 x_F ẋ_F^{1/2}/S,
  x_F ẋ_0^{3/2}/S + (S^2/(27 x_F^2)) ( t − t_F + 3 x_F ẋ_F^{1/2}/S )^3     if t_F − 3 x_F ẋ_F^{1/2}/S < t ≤ t_F,

if x_F < (t_F/3)(ẋ_0 + ẋ_F − √(ẋ_0 ẋ_F)).

Proof. The optimal control u which minimizes ∫_0^{t_F} u^2 dt subject to the constraints ẍ = u, x(0) = 0, x(t_F) = x_F, ẋ(0) = ẋ_0, ẋ(t_F) = ẋ_F (that is, all the constraints except the monotonicity constraint ẋ ≥ 0) is an affine function u(t) = Ct + D, where C and D are chosen so that the constraints are satisfied. This gives the expression u(t) =
( 6(ẋ_0 + ẋ_F)/t_F^2 − 12 x_F/t_F^3 ) t + 6 x_F/t_F^2 − 4 ẋ_0/t_F − 2 ẋ_F/t_F for all choices of x_F, ẋ_0, ẋ_F and t_F. By integration and using the remaining constraints, the corresponding curve x(t) on the interval (0, t_F) is given by

x(t) = ( (ẋ_0 + ẋ_F)/t_F^2 − 2 x_F/t_F^3 ) t^3 + ( 3 x_F/t_F^2 − (2ẋ_0 + ẋ_F)/t_F ) t^2 + ẋ_0 t.

Clearly, in the cases when the monotonicity constraint ẋ ≥ 0 is satisfied, this curve is optimal also for the original problem. We claim that ẋ ≥ 0 on (0, t_F) if and only if x_F/t_F ≥ (1/3)(ẋ_0 + ẋ_F − √(ẋ_0 ẋ_F)). To prove this, note that the quadratic function ẋ is given by

ẋ(t) = 3( (ẋ_0 + ẋ_F)/t_F^2 − 2 x_F/t_F^3 ) t^2 + 2( 3 x_F/t_F^2 − (2ẋ_0 + ẋ_F)/t_F ) t + ẋ_0 = (1/2) C t^2 + D t + ẋ_0.

We note that ẋ(0) = ẋ_0 ≥ 0 and ẋ(t_F) = ẋ_F ≥ 0 by the assumptions, and so ẋ(t) ≥ 0 for all t ∈ (0, t_F) if and only if the value of ẋ at an interior minimizer is nonnegative.

We study the cases C ≥ 0 and C < 0 separately. If C < 0, then ẋ does not have an interior minimizer, and so the monotonicity condition will always be satisfied. We note that in this case, the condition

x_F/t_F > (1/2)(ẋ_0 + ẋ_F) ≥ (1/3)(ẋ_0 + ẋ_F − √(ẋ_0 ẋ_F))

is always satisfied. If C ≥ 0, i.e. if x_F/t_F ≤ (1/2)(ẋ_0 + ẋ_F), then we need a necessary and sufficient condition for when the minimum value of ẋ is nonnegative and the minimizer belongs to the interval (0, t_F). The minimizer belongs to the interval if and only if −D ∈ [0, t_F C], i.e. if

0 ≤ (4ẋ_0 + 2ẋ_F)/t_F − 6 x_F/t_F^2 ≤ t_F ( 6(ẋ_0 + ẋ_F)/t_F^2 − 12 x_F/t_F^3 ),

which holds if and only if

(10)   x_F/t_F ≤ (1/3)( ẋ_0 + ẋ_F + min(ẋ_0, ẋ_F) ).
The minimum value ẋ_0 − D^2/(2C) is nonnegative if and only if 2 ẋ_0 C − D^2 ≥ 0, i.e.

0 ≤ 2 ẋ_0 ( 6(ẋ_0 + ẋ_F)/t_F^2 − 12 x_F/t_F^3 ) − ( 6 x_F/t_F^2 − (4ẋ_0 + 2ẋ_F)/t_F )^2 = −36 ( x_F/t_F^2 − (ẋ_0 + ẋ_F)/(3 t_F) )^2 + 4 ẋ_0 ẋ_F/t_F^2,

and this inequality holds if and only if

(1/3)( ẋ_0 + ẋ_F − √(ẋ_0 ẋ_F) ) ≤ x_F/t_F ≤ (1/3)( ẋ_0 + ẋ_F + √(ẋ_0 ẋ_F) ).

We note that the right inequality is always satisfied if the minimizer belongs to the interval (0, t_F), by (10) and since min(ẋ_0, ẋ_F) ≤ √(ẋ_0 ẋ_F). To summarize, we see that irrespective of the sign of C, a necessary and sufficient condition for the monotonicity of x on (0, t_F) is that

x_F/t_F ≥ (1/3)( ẋ_0 + ẋ_F − √(ẋ_0 ẋ_F) ),

as required. □

Let

V(Δx, v_l, v_r, Δt) :=
  4( (v_l^2 + v_r^2)(Δt)^2 − 3 Δx (v_l + v_r) Δt + 3 (Δx)^2 + v_l v_r (Δt)^2 ) / (Δt)^3   if Δx ≥ (Δt/3)( v_l + v_r − √(v_l v_r) ),
  4( v_l^{3/2} + v_r^{3/2} )^2 / (9 Δx)                                                    otherwise.

In view of Lemma 2, and by considering functions which on each subinterval are of the form given in Lemma 3, we have proved the following:
The infinite dimensional optimization problem of Section 2 is equivalent to the finite dimensional problem

(11)   min  (1/2) Σ_{i=0}^{m−1} V( x_{i+1} − x_i, ẋ_i, ẋ_{i+1}, t_{i+1} − t_i ) + (1/2) Σ_{i=1}^m w_i (x_i − α_i)^2

subject to the constraints

−ẋ_i ≤ 0  for i = 0, ..., m,
x_i − x_{i+1} ≤ 0  for i = 0, ..., m−1,
x_m ≤ x_max,
x_0 = 0.

The nonlinear optimization problem of Theorem 2 can be solved numerically, for example with fmincon in Matlab. Unfortunately, this method does not seem to give stable results when there are more than 10 subintervals, and it is hard to analyse due to the piecewise defined objective function.

To get around this problem, we suggest using a branch and bound approach, which is outlined below. We emphasise that the algorithm is guaranteed to terminate, since there are finitely many subproblems to solve (at most 2^m, but probably far fewer in practice), each of which is convex and can be solved within a fixed time, e.g. with Newton's method. We suggest that a breadth first search is used when going through the branches of the tree, and it is likely that for most problems it will not be necessary to search through many levels of the tree. Further investigations will be needed to find out how efficient the algorithm is and the limit of the size of the problem that can be solved in practice. These questions will be addressed in a future project. In the current paper (Sections 6 and 7) we give some examples where the method has been implemented for up to 30 data points with good results.

1. Start by fitting an ordinary cubic smoothing spline using the data points and with the constraints that x(0) = 0 and x(T) ≤ x_max. If this curve satisfies ẋ(t) ≥ 0 for every t ∈ (0, T), or, equivalently,

x(t_{i+1}) − x(t_i) ≥ ( (t_{i+1} − t_i)/3 )( v_i + v_{i+1} − √(v_i v_{i+1}) )

for every i = 0, ..., m−1, then this must be the optimal curve, and we can stop.
Otherwise, the optimal value for this step gives a lower bound for the optimal solution.

2. If the curve in step 1 was not optimal, we branch the problem into subproblems, one for each subinterval: in each subproblem, the spline is given by a piecewise defined curve as in (9) on that subinterval, whereas the spline should be an ordinary cubic spline on the other subintervals. A minimization problem is solved using (11), except that the first line in the definition of V is taken for the subintervals where the curve should be an ordinary spline, and the second line for the subinterval where the spline should be piecewise defined. The optimal values of these subproblems give lower bounds for the optima of the corresponding branches. If the curve is monotone, then we have an optimum for the current branch and do not need to branch any further. Otherwise, that node has to be branched into further subproblems, each with one more subinterval where we use a piecewise defined spline curve.

3. We continue branching and bounding. Branches whose lower bound is larger than a feasible optimum already found in another branch can be cut off, and do not need to be examined further. In the end, we compare the optima of the remaining branches; the smallest of these is the minimum of the full problem.

In Sections 6 and 7, we show how this method can be used to construct graphs of increasing curves relevant in applications from computational biology and for finding cumulative distribution functions.

6. Applications to an intracellular signaling model
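Before moving on to the applications, the two ingredients of the branch and bound algorithm of Section 5 can be made concrete in code. The following Python sketch (an illustration with hypothetical function names, not the Matlab code used for the experiments) implements the per-interval cost V of Lemma 3 and the knot-level monotonicity test used in step 1:

```python
import numpy as np

def monotone_ok(dx, vl, vr, dt):
    """Test from step 1: the plain interpolating cubic on an interval of
    length dt, rising by dx with endpoint slopes vl, vr >= 0, is
    nondecreasing iff dx >= (dt/3) * (vl + vr - sqrt(vl*vr)) (Lemma 3)."""
    return dx >= (dt / 3.0) * (vl + vr - np.sqrt(vl * vr))

def V(dx, vl, vr, dt):
    """Per-interval minimal energy, i.e. the integral of u(t)^2 from Lemma 3."""
    if monotone_ok(dx, vl, vr, dt):
        # unconstrained case: the interpolating cubic is already monotone
        return 4.0 * ((vl**2 + vr**2) * dt**2
                      - 3.0 * dx * (vl + vr) * dt
                      + 3.0 * dx**2
                      + vl * vr * dt**2) / dt**3
    # constrained case: cubic piece, flat piece, cubic piece, as in (9)
    return 4.0 * (vl**1.5 + vr**1.5)**2 / (9.0 * dx)
```

For example, V(2.0, 1.0, 1.0, 1.0) evaluates to 12.0, which agrees with integrating u(t)^2 = (6 − 12t)^2 over [0, 1] for the interpolating cubic with x_F = 2 and unit endpoint slopes.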
As a first application of the algorithm developed in Section 5, we suggest curve fitting using monotone smoothing splines as an alternative to the parametric models that are commonly used in modelling of pathways of the cell. This is expected to be particularly useful in cases where the underlying chemistry is not completely understood, but where certain monotonicity trends in the data can be observed. The ODE models or stochastic models that are commonly used can be very complicated, see e.g. [10] where the modelling process of this type of models is described. In Section 3 of this reference, a relatively simple modelling example of this type, occurring in neuroscience, is given, which we describe briefly here to give the reader context.

Calmodulin is an abbreviation for calcium-modulated protein, which is an intermediate calcium-binding messenger protein present in all eukaryotic cells. Once bound to a calcium ion, calmodulin acts as part of a calcium signal pathway by modifying its interactions with various target proteins such as kinases or phosphatases. Its importance in neuroscience stems from its crucial involvement in synaptic plasticity.

We consider five data sets with experimental data taken from [1, 11, 15, 16], corresponding to models describing this particular pathway. See [10, 5] for a model using a system of ordinary differential equations stemming from steady state equations for these reactions, and where the same data sets are used.

Figure 1. Reconstruction of the first five plots of Figure 12.3 of [10] with monotone smoothing splines, using the branch and bound algorithm outlined in Section 5. The data for the respective subfigures are taken (in order) from [16, 11, 16, 15, 1].

This particular model consists of the elementary species calcium (Ca), calmodulin (CaM), protein phosphatase 2B (PP2B), Ca/CaM-dependent protein kinase II (CaMKII) and protein phosphatase 1 (PP1).

Up to four Ca ions are bound by calmodulin, and the first data set that we consider describes how many (moles of) ions of calcium are bound to CaM per (mole of) CaM, and this quantity is plotted versus the Ca concentration. When the Ca concentration increases, more calcium ions are bound to the proteins, and it is therefore natural to assume that the curve corresponds to the graph of an increasing function. Each calmodulin molecule can bind at most four Ca ions, and hence the range of the function is naturally included in [0, 4]. When there is no Ca present in the system, there cannot be any bound ions, and so we require that the curve passes through the origin. The data set is taken from [16], and the data together with the fitted monotone spline is shown in the first subfigure of Figure 1.

The binding of Ca ions by calmodulin is a cooperative process. Ca-bound CaM activates PP2B, another protein implicated in molecular processes related to learning which also plays a role in striatal signaling. Dataset 2, which is taken from [11], describes the number of moles of apo calmodulin (apoCaM, i.e. calmodulin without calcium bound to it) bound to each mole of PP2B versus the concentration of apoCaM. Since one PP2B molecule can bind at most one calmodulin molecule, the range of the function is naturally between 0 and 1. As there has to be apoCaM in the system for this type of binding to occur, we require that the point (0, 0) is on the curve.
The more apoCaM there is in the system, the more likely it is for such a binding to occur, and it is hence natural to assume that the fitted function is increasing.

In the third subfigure, the percentage activation of PP2B is plotted versus the Ca concentration at two different concentrations of CaM (30 nM for the left curve and 300 nM for the right curve). The data for this subfigure is taken from [16]. Naturally, the range of the function is contained in [0, 100], and the data suggests that the binding is more likely to occur for higher concentrations of Ca, and hence it is natural to fit the data with a curve which is the graph of an increasing function. Again, Ca is needed for the activation to occur, and for this reason we require that the curve starts at the origin.

The third protein, CaMKII, is a kinase which is activated by the binding of Ca–CaM. In the fourth subfigure, we consider data from [15], representing the number of moles of Ca that are bound to CaM per mole of CaM in the presence of the enzyme CaMKII. As in subfigure 1, the curve is expected to be the graph of an increasing function whose range is contained in [0, 4] and which originates from the origin.

CaMKII molecules exist as dodecamers, consisting of two hexamer rings. A CaMKII unit that has bound CaM can autophosphorylate when sitting beside an active neighboring unit in the same hexamer ring. The phosphorylated unit can remain active even in the absence of Ca–CaM. In subfigure 5, the data comes from [1], and it describes the percentage of phosphorylated CaMKII (autonomy in CaMKII activity) versus the calcium concentration.

The method of the current paper is applied to these data sets in order to fit curves which come close to the data points. The weights w_i were set to 1 for all examples, and the parameter λ was chosen separately for the first two datasets and for the three last. The same range was used for the variables as in Figure 12.3 of [10]. Instead of imposing a new type of condition corresponding to the limit as the independent variable tends to infinity, an additional data point was introduced, which forces the curve to come close to the maximum bound at the right endpoint of the interval. For example, for dataset 1, we demand that the curve comes close to the point (6, 4). After doing this, the method of Section 5 could be used directly. The plots of Figure 1 were obtained, and these can be compared to the first five plots in Figure 12.3 of [10].

7. Applications for cumulative distribution functions and an example from cell cycle models
A second application of the technique of this paper is the reconstruction of an unknown distribution function from data points.

Suppose that it is known that a sample comes from a distribution with an absolutely continuous distribution function, but that the exact form of the distribution is unknown. Then we can reconstruct the cumulative distribution function by using monotone splines.

We first test the method on a sample of data points coming from a normal distribution. Using x_max = 1, random points were generated following a normal distribution with given expectation value and standard deviation. Then a histogram was created using Matlab's function histcounts, and the vector α of data values was created using Matlab's function cumsum. With λ = 50, the method of this paper was used, with the minor modification that the condition u(0) = 0 is replaced by u(t) ≥ 0, to create an approximation of the cumulative distribution function. By differentiating the obtained spline function, an estimate of the density function could also be obtained. The results can be seen in Figure 2.

The method is more useful when the data does not come from a standard distribution, and the following is an example of such a situation arising in cell biology. The cell cycle consists of four distinct phases: G1, S, G2 and M. For many types of cells, the time a cell spends in the G1 phase is highly variable, and it is of interest to find the distribution of this time. See for example [7], where such a distribution is used in an age structured cell cycle model.
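The data-preparation step for the normal-distribution test described above can be sketched as follows. This is a sketch under stated assumptions (Python with numpy standing in for Matlab's histcounts and cumsum; the sample size, bin count, and distribution parameters are illustrative, and the monotone spline fit itself is the method of Section 5, not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(seed=1)
samples = rng.normal(loc=0.0, scale=1.0, size=1000)  # assumed sample size/parameters

# Histogram counts (analogue of Matlab's histcounts)
counts, edges = np.histogram(samples, bins=30)

# Cumulative sums normalised so that the last value is x_max = 1
# (analogue of Matlab's cumsum), giving data values alpha_i that
# approximate the CDF at the right bin edges t_i
alpha = np.cumsum(counts) / counts.sum()
t = edges[1:]

# alpha is nondecreasing and bounded above by x_max = 1, so it is suitable
# input for the monotone smoothing spline with bounds; differentiating the
# fitted spline then yields an estimate of the density.
```

The pairs (t_i, alpha_i) then play the role of the data points in the constrained minimization problem, with u(t) ≥ 0 replacing u(0) = 0 as described above.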
Figure 2. Estimate of the cumulative distribution function using monotone splines (left), and comparison of the derived density function with the exact density function of the normal distribution and the histogram (right).

FUCCI is a fluorescence technology that can be used for tracking the time an individual cell spends in the G1 phase [13, 14]. Using the movie S1 of the supplementary material of [13], the histogram data for the time that each cell in that movie stays in the G1 phase was obtained. Using λ = 3, the cumulative distribution function could be estimated and is shown in Figure 3.

Figure 3.
Estimate of the cumulative distribution function using monotone splines (left), and comparison of the derived density function with the histogram (right).

8. Acknowledgements
The author would like to thank Clyde Martin for reading and commenting on an earlier version of the manuscript, and Olivia Eriksson for a useful discussion about the data sets occurring in Figure 1. She is also thankful to the reviewers for useful comments which led to an improvement of the paper.
References

[1] Bradshaw, M., Kubota, Y., Meyer, T., and Schulman, H. An ultrasensitive Ca2+/calmodulin-dependent protein kinase II–protein phosphatase 1 switch facilitates specificity in postsynaptic calcium signaling. Proceedings of the National Academy of Sciences 100, 18 (2003), 10512–10517.
[2] Charles, J., Sun, S., and Martin, C. Cumulative distribution estimation via control theoretic smoothing splines. In Three Decades of Progress in Control Sciences, X. Hu, U. Jönsson, B. Wahlberg, and B. Ghosh, Eds. Springer Berlin Heidelberg, 2010, ch. 7, pp. 95–104.
[3] Egerstedt, M., and Martin, C. Control Theoretic Splines. Princeton University Press, Princeton and Oxford, 2010.
[4] Elfving, T., and Andersson, L.-E. An algorithm for computing constrained smoothing spline functions. Numer. Math. 52, 5 (1988), 583–595.
[5] Eriksson, O., Jauhiainen, A., Maad Sasane, S., Kramer, A., Nair, A., Sartorius, C., and Hellgren Kotaleski, J. Uncertainty quantification, propagation and characterization by Bayesian analysis combined with global sensitivity analysis applied to dynamical intracellular pathway models. Bioinformatics 35, 2 (2019), 284–292.
[6] Luenberger, D. G. Optimization by Vector Space Methods. John Wiley & Sons, Inc., New York, London, Sydney, Toronto, 1969.
[7] Maad Sasane, S. An age structured cell cycle model with crowding. J. Math. Anal. Appl. 444, 1 (2016), 768–803.
[8] Mammen, E., and Thomas-Agnan, C. Smoothing splines and shape restrictions. Scand. J. Statist. 26, 2 (1999), 239–252.
[9] Nagahara, M., and Martin, C. Monotone smoothing splines using general linear systems. Asian Journal of Control 15 (2013), 461–468.
[10] Nair, A., Gutierrez-Arenas, O., Eriksson, O., Jauhiainen, A., Blackwell, K., and Kotaleski, J. Modeling intracellular signaling underlying striatal function in health and disease. In Progress in Molecular Biology and Translational Science, vol. 123. 2014, pp. 277–304.
[11] O'Donnell, S., Yu, L., Fowler, A., and Shea, M. Recognition of β-calcineurin by the domains of calmodulin: Thermodynamic and structural evidence for distinct roles. Proteins: Structure, Function, and Bioinformatics 79, 3 (2011), 765–786.
[12] Rudin, W. Functional Analysis. McGraw Hill international editions, 1991.
[13] Sakaue-Sawano, A., Kurokawa, H., Morimura, T., Hanyu, A., Hama, H., Osawa, H., Kashiwagi, S., Fukami, K., Miyata, T., Miyoshi, H., Imamura, T., Ogawa, M., Masai, H., and Miyawaki, A. Visualizing spatiotemporal dynamics of multicellular cell-cycle progression. Cell 132 (2008), 487–498.
[14] Sakaue-Sawano, A., Ohtawa, K., Hama, H., Kawano, M., Ogawa, M., and Miyawaki, A. Tracing the silhouette of individual cells in S/G2/M phases with fluorescence. Chemistry & Biology 15 (2008), 1243–1248.
[15] Shifman, J., Choi, M., Mihalas, S., Mayo, S., and Kennedy, M. Ca2+/calmodulin-dependent protein kinase II (CaMKII) is activated by calmodulin with two bound calciums. Proceedings of the National Academy of Sciences 103, 38 (2006), 13968–13973.
[16] Stemmer, P., and Klee, C. Dual calcium ion regulation of calcineurin by calmodulin and calcineurin B. Biochemistry 33, 22 (1994), 6859–6866.
[17] Struwe, M. Variational Methods. Springer Verlag, 1996.
[18] Taylor, A., and Lay, D. Introduction to Functional Analysis. John Wiley & Sons, 1980.
[19] Wahba, G. Spline Models for Observational Data, vol. 59 of CBMS-NSF Regional Conference Series in Applied Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1990.
[20] Wang, Y. Smoothing Splines, Methods and Applications, vol. 121 of