A singular stochastic control approach for optimal pairs trading with proportional transaction costs

Abstract

Optimal trading strategies for pairs trading have been studied by models that try to find either optimal shares of stocks by assuming no transaction costs or optimal timing of trading fixed numbers of shares of stocks with transaction costs. To find optimal strategies which determine optimally both trade times and number of shares in pairs trading process, we use a singular stochastic control approach to study an optimal pairs trading problem with proportional transaction costs. Assuming a cointegrated relationship for a pair of stock log-prices, we consider a portfolio optimization problem which involves dynamic trading strategies with proportional transaction costs. We show that the value function of the control problem is the unique viscosity solution of a nonlinear quasi-variational inequality, which is equivalent to a free boundary problem for the singular stochastic control value function. We then develop a discrete time dynamic programming algorithm to compute the transaction regions, and show the convergence of the discretization scheme. We illustrate our approach with numerical examples and discuss the impact of different parameters on transaction regions. We study the out-of-sample performance in an empirical study that consists of six pairs of U.S. stocks selected from different industry sectors, and demonstrate the efficiency of the optimal strategy.



Haipeng Xing

Department of Applied Mathematics and Statistics, State University of New York, Stony Brook, NY 11794


Keywords: Free-boundary problem, pairs trading, stochastic control, trading strategies, transaction costs, transaction regions.

1 Introduction

Pairs trading is one of the proprietary statistical arbitrage tools used by many hedge funds and investment banks. It is a short-term trading strategy that first identifies two stocks whose prices are associated in a long-run equilibrium and then trades on temporary deviations of stock prices from the equilibrium. Though pairs trading is a simple market neutral strategy, it has been used and discussed extensively by industrial practitioners in the last several decades; see detailed discussion in Vidyamurthy (2004), Whistler (2004), Ehrman (2006), Lai and Xing (2008), and references therein.

Besides its wide practice in the financial industry, pairs trading also draws much attention from academic researchers. For instance, Gatev et al. (2006) examined the risk and returns of pairs trading using daily data collected from the U.S. equity market and concluded that the strategy in general produces profit higher than transaction costs. To investigate the pairs trading strategy analytically, Elliott et al. (2005) modeled the spread of returns as a mean-reverting process and proposed a trading strategy based on the model. This motivated subsequent researchers to formulate pairs trading rules as stochastic control problems for an Ornstein-Uhlenbeck (OU) process and a correlated stock price process. In particular, Mudchanatongsuk et al. (2008) assumed that the log-relationship between a pair of stock prices follows a mean-reverting process, and considered a self-financing portfolio strategy that only allows positions that are long in one stock and short in the other with equal dollar amounts. They then formulated a portfolio optimization based stochastic control problem and obtained the optimal solution to this control problem in closed form via the corresponding Hamilton-Jacobi-Bellman (HJB) equation. Relaxing the equal dollar constraint, Tourin and Yan (2013) extended the approach of Mudchanatongsuk et al. (2008) and studied pairs trading strategies with arbitrary amounts in each stock without any transaction costs.

Instead of deriving the optimal weight of stocks in pairs trading, another line of study on pairs trading strategies fixes the number of traded shares for each stock during the entire trading process and considers only the optimal timing of trades in the presence of fixed or proportional transaction costs. Specifically, Leung and Li (2015) studied the optimal timing to open or close the position subject to fixed transaction costs and the effect of a stop-loss level under the OU process by constructing the value function directly. Zhang and Zhang (2008), Song and Yan (2013), and Ngo and Pham (2016) studied the optimal pairs trading rule that is based on optimal switching among two (buy and sell) or three (buy, sell, and flat) regimes with a fixed commission cost for each transaction, and solved the problem by finding viscosity solutions to the associated HJB equations (quasi-variational inequalities). Lei and Xu (2015) studied the optimal pairs trading rule of entering and exiting the asset market in a finite horizon with proportional transaction costs for two convergent assets. Note that, although transaction costs are considered in these strategies, the number of traded shares of each stock is fixed during the entire trading period, so these strategies are still far from traders' practical experience.

To bridge the gap between choosing the optimal weight of shares and deciding optimal trading times in pairs trading, we use a singular stochastic control approach to study an optimal pairs trading problem with proportional transaction costs, which allows us to choose not only the optimal weight but also the optimal trading times during the trading process. For convenience, we assume the same diffusion and Ornstein-Uhlenbeck processes for one stock and its spread with the other stock as those in Mudchanatongsuk et al. (2008). However, different from Mudchanatongsuk et al. (2008), who used a trading rule that requires shorting one stock and longing the other in equal dollar amounts, we consider a delta-neutral rule under which the ratio of traded shares for the two stocks is fixed and this fixed ratio is determined by the cointegration relationship of the two stocks. Hence, when the number of shares of one stock is determined, the delta-neutral rule also determines the number of shares of the other stock. Besides the weight of shares, which needs to be optimally chosen, we also assume a proportional transaction cost for each trade, and hence the optimal times of trading also need to be decided.

As the overall transaction cost based on the above assumption depends on both trading times and the numbers of shares in each trade, we compute the terminal utility of wealth over a fixed horizon and formulate the problem of choosing optimal trading times and the number of shares as a singular stochastic control problem. We derive the Hamilton-Jacobi-Bellman equations for this problem, and show that the value function of the problem is the unique viscosity solution of a quasi-variational inequality. We further argue that the quasi-variational inequality is equivalent to a free boundary problem, so that the state space consisting of one stock price and its spread with the other stock can be naturally divided into three transaction regions: long the first stock and short the second, short the first and long the second, and no transaction. The implied transaction regions help us determine not only the optimal times of each transaction, but also the optimal number of shares in each transaction. To compute the boundaries of these transaction regions, we develop a numerical algorithm that is based on discrete time dynamic programming to solve the equation for the negative exponential utility function, and show that the numerical solution converges to the unique continuous-time solution of the problem.

To investigate the performance of the optimal trading strategies implied by the transaction regions, we carry out both simulation and empirical studies. Specifically, we study the time-varying transaction regions (or trading boundaries) for a specific set of model parameters, and investigate the impact of variations of model parameters on transaction regions and performance of the optimal strategy. For comparison, we also consider a benchmark strategy which is based on the deviation of the spread from its long-term mean and is popular among practitioners. In both simulation studies and real data analysis, we show that the optimal trading strategy performs better than the benchmark strategy.

The rest of the paper is organized as follows. Section 2 first formulates the model and then derives the Hamilton-Jacobi-Bellman equations associated with the singular stochastic control problems. It shows the existence and uniqueness of the viscosity solution for the variational inequalities which are equivalent to the portfolio optimization problem, and reduces the problem to a free boundary problem. Section 2 also considers the optimal trading problem with exponential utility functions. In Section 3, we discretize the free boundary problem and propose a discrete time dynamic programming algorithm. We also demonstrate that the solution of the discretized problem converges to the viscosity solution of the variational inequalities. Sections 4 and 5 provide simulation and empirical studies of the model and the optimal trading strategy, and compare its performance with a benchmark trading strategy. Some concluding remarks are given in Section 6.

2 A pairs trading problem with proportional transaction costs

Consider a pair of stocks $P$ and $Q$, and let $p(t)$ and $q(t)$ denote their prices at time $t$, respectively. We assume that the price of stock $P$ follows a geometric Brownian motion,

$$dp(t) = \mu p(t)\,dt + \sigma p(t)\,dB(t), \tag{1}$$

where $\mu$ and $\sigma$ are the drift and the volatility of stock $P$, and $B(t)$ is a standard Brownian motion defined on a filtered probability space to be specified later. Denote by $x(t)$ the difference of the logarithms of the two stock prices, i.e.,

$$x(t) = \log q(t) - \log p(t) = \log\big(q(t)/p(t)\big). \tag{2}$$

We assume that the spread follows an Ornstein-Uhlenbeck process

$$dx(t) = \kappa\big(\theta - x(t)\big)\,dt + \nu\,dW(t), \tag{3}$$

where $\kappa > 0$ is the rate of mean reversion and $\theta$ is the long-term equilibrium level to which the spread reverts. We assume that $(B(t), W(t))$ is a two-dimensional Brownian motion defined on a filtered probability space $(\Omega, \mathcal{F}_t, \mathbb{P})$, and the instantaneous correlation coefficient between $B(t)$ and $W(t)$ is $\rho$, i.e.,

$$E[dW(t)\,dB(t)] = \rho\,dt. \tag{4}$$

The above assumptions are the same as those in Mudchanatongsuk et al. (2008). With these assumptions, we can express the dynamics of $q(t)$ as

$$dq(t) = \Big[\mu + \kappa\big(\theta - x(t)\big) + \tfrac{1}{2}\nu^2 + \rho\sigma\nu\Big] q(t)\,dt + \sigma q(t)\,dB(t) + \nu q(t)\,dW(t). \tag{5}$$

In the presence of proportional transaction costs, the investor pays proportions $0 < \zeta_p, \zeta_q < 1$ and $0 < \eta_p, \eta_q < 1$ of the dollar value transacted on purchases and sales, respectively, of stocks $P$ and $Q$. Let $L_p(t)$ and $M_p(t)$ be two nondecreasing and non-anticipating processes that represent the cumulative numbers of shares of stock $P$ bought and sold, respectively, within the time interval $[0, t]$, $0 \le t \le T$. Let $y_p(t)$ be the number of shares held in stock $P$, i.e., $y_p(t) = L_p(t) - M_p(t)$, and similarly, we define $L_q(t)$, $M_q(t)$, and $y_q(t) = L_q(t) - M_q(t)$ for stock $Q$. Denote by $g(t)$ the dollar value of the investment in a bond which pays a fixed risk-free rate of $r$.
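
The dynamics (1)-(4) can be simulated directly. The following is a minimal sketch, assuming an exact log-normal update for $p(t)$ and an Euler step for the spread; the function name `simulate_pair` and all parameter values are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def simulate_pair(mu, sigma, kappa, theta, nu, rho, T=1.0, n=100,
                  p0=1.0, x0=0.0, seed=0):
    """Simulate (p, x, q) on an evenly spaced grid of n steps over [0, T].

    p uses the exact GBM update implied by (1); x uses an Euler step of
    the OU equation (3); the two Gaussian shocks have correlation rho as
    in (4). q is recovered from (2) as q = p * exp(x).
    """
    rng = random.Random(seed)
    dt = T / n
    p, x = [p0], [x0]
    for _ in range(n):
        e1 = rng.gauss(0.0, 1.0)
        e2 = rng.gauss(0.0, 1.0)
        dB = math.sqrt(dt) * e1
        # Mixing two independent shocks gives Corr(dB, dW) = rho.
        dW = math.sqrt(dt) * (rho * e1 + math.sqrt(1.0 - rho * rho) * e2)
        p.append(p[-1] * math.exp((mu - 0.5 * sigma ** 2) * dt + sigma * dB))
        x.append(x[-1] + kappa * (theta - x[-1]) * dt + nu * dW)
    q = [pi * math.exp(xi) for pi, xi in zip(p, x)]
    return p, x, q


# Illustrative parameter values only (assumed, not the paper's scenarios).
p, x, q = simulate_pair(mu=0.1, sigma=0.3, kappa=1.0, theta=0.0,
                        nu=0.2, rho=0.5)
```

Only the standard library is used here so that the sketch stays self-contained; a vectorized version would be preferable for large simulation studies.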
Then the investor's position in the two stocks and the bond is driven by

$$dy_p(t) = dL_p(t) - dM_p(t), \qquad dy_q(t) = dL_q(t) - dM_q(t) \tag{6}$$

and

$$dg(t) = r g(t)\,dt + b_p\, p(t)\,dM_p(t) - a_q\, q(t)\,dL_q(t) + b_q\, q(t)\,dM_q(t) - a_p\, p(t)\,dL_p(t), \tag{7}$$

where $a_i = 1 + \zeta_i$ and $b_i = 1 - \eta_i$ for $i = p, q$.

We then need to choose a rule to determine the number of shares of stocks $P$ and $Q$ bought or sold at time $t$. Note that Mudchanatongsuk et al. (2008) assumed no transaction cost and considered the strategy that always shorts one stock and longs the other in equal dollar amount, i.e., $p(t)\,dL_p(t) + q(t)\,dM_q(t) = 0$ or $p(t)\,dM_p(t) + q(t)\,dL_q(t) = 0$ at time $t$. Lei and Xu (2015) and Ngo and Pham (2016) considered a delta-neutral strategy that always longs one share of a stock and shorts one share of the other stock, i.e., $dy_p(t) = -dy_q(t) = 1$ or $dy_p(t) = -dy_q(t) = -1$ at time $t$. Here, we also consider a delta-neutral strategy that requires the total of the positive and negative deltas of the two assets to be zero, so that the number of shares of stock $P$ bought (or sold) at time $t$ is the same as the number of shares of stock $Q$ sold (or bought), i.e.,

$$dL_p(t) = dM_q(t), \qquad dM_p(t) = dL_q(t). \tag{8}$$

Equation (8) implies that $dy_q(t) = -dy_p(t)$ at any time $t$. Compared to Lei and Xu (2015) and Ngo and Pham (2016), we remove the constraint $dy_p(t) = -dy_q(t) = 1$ or $-1$ and allow $y_p(t) = -y_q(t)$ to be a control variable.

Using equations (5) and (8), together with $q(t) = p(t)e^{x(t)}$ from (2), the dynamics of $g(t)$ in equation (7) can be simplified as

$$dg(t) = r g(t)\,dt - \big(a_p - b_q e^{x(t)}\big) p(t)\,dL_p(t) + \big(b_p - a_q e^{x(t)}\big) p(t)\,dM_p(t). \tag{9}$$

The process $(L_p(t), M_p(t))$ together with our delta-neutral strategy provides us an admissible trading strategy.
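
As a quick numerical check of the reduction from (7) to (9), the sketch below evaluates the trading part of $dg(t)$ both ways for one trade under the delta-neutral rule (8). The function names and the cost rates and prices used are hypothetical, chosen only for illustration.

```python
import math


def bond_increment_general(p, x, dL_p, dM_p, dL_q, dM_q,
                           zeta_p, zeta_q, eta_p, eta_q):
    """Trading part of dg in equation (7), with q = p * exp(x)."""
    a_p, a_q = 1.0 + zeta_p, 1.0 + zeta_q
    b_p, b_q = 1.0 - eta_p, 1.0 - eta_q
    q = p * math.exp(x)
    return b_p * p * dM_p - a_q * q * dL_q + b_q * q * dM_q - a_p * p * dL_p


def bond_increment_delta_neutral(p, x, dL_p, dM_p,
                                 zeta_p, zeta_q, eta_p, eta_q):
    """Trading part of dg in equation (9), i.e. (7) under rule (8)."""
    a_p, a_q = 1.0 + zeta_p, 1.0 + zeta_q
    b_p, b_q = 1.0 - eta_p, 1.0 - eta_q
    ex = math.exp(x)
    return -(a_p - b_q * ex) * p * dL_p + (b_p - a_q * ex) * p * dM_p


# Hypothetical numbers: buy 10 shares of P, hence sell 10 shares of Q.
costs = dict(zeta_p=0.001, zeta_q=0.001, eta_p=0.001, eta_q=0.001)
general = bond_increment_general(1.2, 0.05, dL_p=10, dM_p=0,
                                 dL_q=0, dM_q=10, **costs)
reduced = bond_increment_delta_neutral(1.2, 0.05, dL_p=10, dM_p=0, **costs)
```

The two values agree up to floating-point error, which reflects that (9) is (7) with $dL_q = dM_p$, $dM_q = dL_p$, and $q = p e^{x}$ substituted.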
For convenience, we denote by $\mathcal{T}(g)$ the set of admissible trading strategies for an investor who starts at time zero with amount $g$ invested in the bond and zero holdings in the two stocks (i.e., $y_p(0) = y_q(0) = 0$), which indicates that the numbers of shares held in stocks $P$ and $Q$ at time $t$ are $y_p(t)$ and $-y_p(t)$, respectively. For notational convenience, we omit the subscript of $y_p(t)$ and write $y(t)$ in our discussion. Then equations (1), (3), (6), and (9) compose the market model in the time interval $[0, T]$, which describes a stochastic process $(p(t), x(t), y(t), g(t))$ in $\mathbb{R}_+ \times \mathbb{R} \times \mathbb{R} \times \mathbb{R}$.

Denote the terminal value of the pairs trading portfolio by $J(p(T), x(T), y(T))$. Under our assumption, the investor's positions in stocks $P$ and $Q$ at time $T$ are $y(T)$ and $-y(T)$, respectively, so the liquidated value of the portfolio is

$$J(p(T), x(T), y(T)) = A^{+}(p(T), x(T))\, y(T)\, \mathbf{1}_{\{y(T) \ge 0\}} + A^{-}(p(T), x(T))\, y(T)\, \mathbf{1}_{\{y(T) < 0\}}, \tag{10}$$

where

$$A^{+}(p, x) = (b_p - a_q e^{x})\,p, \qquad A^{-}(p, x) = (a_p - b_q e^{x})\,p.$$

Furthermore, if the investment in the bond at terminal time $T$ is $g(T)$, the terminal wealth of the investor is given by $g(T) + J(p(T), x(T), y(T))$. Suppose that the investor's utility $U: \mathbb{R} \to \mathbb{R}$ is a concave and increasing function with $U(0) = 0$. We assume that the investor's goal is to maximize the expected utility of terminal wealth under the market model (1), (3), (6), and (9),

$$V(t, p, x, y, g) = \sup_{(L_p(t), M_p(t)) \in \mathcal{T}(g)} E\Big\{ U\big(g(T) + J(p(T), x(T), y(T))\big) \,\Big|\, p(t) = p,\ x(t) = x,\ y(t) = y,\ g(t) = g \Big\}. \tag{11}$$

Furthermore, given trading strategies $(L_p, M_p)$, the total trading cost incurred over $[t, T]$ can be expressed as

$$C(L_p, M_p; t, T) = \int_t^T e^{r(T-u)} A^{-}(p(u), x(u))\,dL_p(u) - \int_t^T e^{r(T-u)} A^{+}(p(u), x(u))\,dM_p(u) - J(p(T), x(T), y(T)), \tag{12}$$

and the total profit over $[t, T]$ is $-C(L_p, M_p; t, T)$.

2.2 The Hamilton-Jacobi-Bellman equations and free boundary problems

We now derive the Hamilton-Jacobi-Bellman (HJB) equations, associated with the stochastic control problems, for the utility maximization problem (11). Consider a class of trading strategies such that $L_p(t)$ and $M_p(t)$ are absolutely continuous processes, given by

$$L_p(t) = \int_0^t l(u)\,du, \qquad M_p(t) = \int_0^t m(u)\,du,$$

where $l(u)$ and $m(u)$ are positive and uniformly bounded by $\xi < \infty$. Then (1), (3), (6), and (9) provide us a system of stochastic differential equations with controlled drift, and the Bellman equation for a value function denoted by $V^{\xi}$ is

$$\mathcal{L}^{o} V^{\xi} + \sup_{0 \le l_t, m_t \le \xi} \Big\{ \big[\mathcal{L}^{b} V^{\xi}\big]\, l_t - \big[\mathcal{L}^{s} V^{\xi}\big]\, m_t \Big\} = 0,$$

for $(t, p, x, y, g) \in [0, T] \times \mathbb{R}_+ \times \mathbb{R} \times \mathbb{R} \times \mathbb{R}$, in which the operators $\mathcal{L}^{o}$, $\mathcal{L}^{b}$, and $\mathcal{L}^{s}$ are defined as

$$\mathcal{L}^{o} := \frac{\partial}{\partial t} + \kappa(\theta - x)\frac{\partial}{\partial x} + \mu p \frac{\partial}{\partial p} + r g \frac{\partial}{\partial g} + \frac{1}{2}\nu^2 \frac{\partial^2}{\partial x^2} + \rho\nu\sigma p \frac{\partial^2}{\partial p\,\partial x} + \frac{1}{2}\sigma^2 p^2 \frac{\partial^2}{\partial p^2},$$

$$\mathcal{L}^{b} := \frac{\partial}{\partial y} - \big(a_p - b_q e^{x}\big)\,p\,\frac{\partial}{\partial g}, \qquad \mathcal{L}^{s} := \frac{\partial}{\partial y} - \big(b_p - a_q e^{x}\big)\,p\,\frac{\partial}{\partial g}.$$

The optimal trading strategy is then determined by considering the following three possible cases:

(i) buying stock $P$ and selling stock $Q$ at the same rate $l(t) = \xi$ (i.e., $m(t) = 0$) when

$$\mathcal{L}^{b} V^{\xi} \ge 0, \qquad \mathcal{L}^{s} V^{\xi} > 0; \tag{13}$$

(ii) selling stock $P$ and buying stock $Q$ at rate $m(t) = \xi$ (i.e., $l(t) = 0$) when

$$\mathcal{L}^{b} V^{\xi} < 0, \qquad \mathcal{L}^{s} V^{\xi} \le 0; \tag{14}$$

(iii) doing nothing (i.e., $l(t) = m(t) = 0$) when

$$\mathcal{L}^{b} V^{\xi} \le 0, \qquad \mathcal{L}^{s} V^{\xi} \ge 0. \tag{15}$$

Note that the case $\mathcal{L}^{b} V^{\xi} > 0$, $\mathcal{L}^{s} V^{\xi} < 0$ cannot occur, because the value function is increasing in $g$.

The above argument shows that the optimization problem (11) is a free boundary problem in which the optimal trading strategy is defined by the inequalities (i), (ii), and (iii) for a given value function. Besides, the state space $[0, T] \times \mathbb{R}_+ \times \mathbb{R} \times \mathbb{R} \times \mathbb{R}$ is partitioned into buy, sell, and no-transaction regions for stock $P$, which are characterized by inequalities (13), (14), and (15), respectively. For sufficiently large $\xi$, the state space remains divided into a buy region $\mathcal{B}$, a sell region $\mathcal{S}$, and a no-transaction region $\mathcal{N}$ for stock $P$, which are correspondingly the sell region, the buy region, and the no-transaction region for stock $Q$ due to equation (8). The buy and sell regions for stock $P$ are disjoint, as it is not optimal to buy and sell the same stock at the same time. We denote the boundaries between the no-transaction region $\mathcal{N}$ and the buy and sell regions $\mathcal{B}$ and $\mathcal{S}$ as $\partial\mathcal{B}$ and $\partial\mathcal{S}$, respectively.

Letting $\xi \to \infty$, the class of admissible trading strategies becomes $\mathcal{T}(g)$. We can guess that the state space is still divided into three regions: a region of buying $P$ and selling $Q$, a region of selling $P$ and buying $Q$, and a no-transaction region. Then the optimal trading strategy requires an immediate move to the boundary of the buy or sell region if the state is in the buy region $\mathcal{B}$ or the sell region $\mathcal{S}$. In fact, we can obtain the equations that the value function should satisfy in each region as follows.

(i) In region $\mathcal{B}$ of buying $P$ and selling $Q$, the value function remains constant along the path of the state dictated by the optimal trading strategy, and therefore, for $\delta y \ge 0$,

$$V(t, p, x, y, g) = V\big(t, p, x, y + \delta y, g - (a_p - b_q e^{x})\,p\,\delta y\big), \tag{16}$$

where $\delta y$ is the number of shares of stock $P$ bought and stock $Q$ sold by the investor. $\delta y$ can be any positive value up to the number required to take the state to $\partial\mathcal{B}$, so letting $\delta y \to 0$ gives

$$\mathcal{L}^{b} V = 0. \tag{17}$$

(ii) Similarly, in region $\mathcal{S}$ of selling $P$ and buying $Q$, the value function obeys the following equation for $\delta y \ge 0$:

$$V(t, p, x, y, g) = V\big(t, p, x, y - \delta y, g + (b_p - a_q e^{x})\,p\,\delta y\big), \tag{18}$$

where $\delta y$ is the number of shares of stock $P$ sold and stock $Q$ bought by the investor. $\delta y$ can be any positive value up to the number required to take the state to $\partial\mathcal{S}$, so letting $\delta y \to 0$ gives

$$\mathcal{L}^{s} V = 0. \tag{19}$$

(iii) In the no-transaction region, the value function obeys the same set of equations obtained for the class of absolutely continuous trading strategies, and therefore the value function is given by

$$\mathcal{L}^{o} V = 0, \tag{20}$$

and the pair of inequalities shown above in (15) also holds. Note that, due to the continuity of the value function, if it is known in the no-transaction region, it can be determined in both the buy and sell regions by (17) and (19), respectively.

Consistent with (13) and (14), $\mathcal{L}^{s} V > 0$ in the buy region $\mathcal{B}$ and $\mathcal{L}^{b} V < 0$ in the sell region $\mathcal{S}$. Also, from the two pairs of inequalities (13) and (14), we may conjecture that $\mathcal{L}^{o} V$ in (20) is negative in both the buy region $\mathcal{B}$ and the sell region $\mathcal{S}$. Therefore, the above set of equations can be summarized as the following fully nonlinear partial differential equation (PDE):

$$\min\big\{ -\mathcal{L}^{b} V,\ \mathcal{L}^{s} V,\ -\mathcal{L}^{o} V \big\} = 0 \tag{21}$$

for $(t, p, x, y, g) \in [0, T] \times \mathbb{R}_+ \times \mathbb{R} \times \mathbb{R} \times \mathbb{R}$. Note that the above discussion also yields the following free boundary problem for the singular stochastic control value function:

$$\begin{cases} \mathcal{L}^{b} V = 0 & \text{in } \mathcal{B}, \\ \mathcal{L}^{s} V = 0 & \text{in } \mathcal{S}, \\ \mathcal{L}^{o} V = 0 & \text{in } \mathcal{N}, \\ V(T, p, x, y, g) = U\big(g + J(p, x, y)\big). \end{cases} \tag{22}$$

We next show that the value function given by (11) is a constrained viscosity solution of the variational inequality (21) on $[0, T] \times \mathbb{R}_+ \times \mathbb{R} \times \mathbb{R} \times \mathbb{R}$, and that it is the unique bounded constrained viscosity solution of (21). The proof is given in the appendix.

Theorem 1. The value function $V(t, p, x, y, g)$ is a constrained viscosity solution of (21) on $[0, T] \times \mathbb{R}_+ \times \mathbb{R} \times \mathbb{R} \times \mathbb{R}$.

Theorem 2. Let $u$ be a bounded upper semicontinuous viscosity subsolution of (21), and $v$ a bounded from below lower semicontinuous viscosity supersolution of (21), such that $u(T, x) \le v(T, x)$ for all $x \in \mathbb{R}_+ \times \mathbb{R} \times \mathbb{R} \times \mathbb{R}$. Then $u \le v$ on $[0, T] \times \mathbb{R}_+ \times \mathbb{R} \times \mathbb{R} \times \mathbb{R}$.

We next assume that the investor has the negative exponential utility function

$$U(z) = 1 - \exp(-\gamma z), \tag{23}$$

where $\gamma$ is the constant absolute risk aversion (CARA) parameter such that $-U''(z)/U'(z) = \gamma$. For equation (21), this utility function substantially reduces the computational effort and is easy to interpret.
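
To make the terminal condition in (22) concrete under the utility (23), the following sketch evaluates the liquidation value (10) and the resulting terminal utility $U(g + J(p, x, y))$. The function names are assumptions made for illustration, and the cost rates are supplied by the caller rather than taken from the paper.

```python
import math


def liquidation_value(p, x, y, zeta_p, zeta_q, eta_p, eta_q):
    """J(p, x, y) from equation (10): y shares of P and -y shares of Q."""
    a_p, a_q = 1.0 + zeta_p, 1.0 + zeta_q
    b_p, b_q = 1.0 - eta_p, 1.0 - eta_q
    ex = math.exp(x)
    a_plus = (b_p - a_q * ex) * p   # y >= 0: sell P, buy back Q
    a_minus = (a_p - b_q * ex) * p  # y < 0: buy back P, sell Q
    return a_plus * y if y >= 0 else a_minus * y


def cara_utility(z, gamma):
    """Negative exponential utility (23)."""
    return 1.0 - math.exp(-gamma * z)


def terminal_value(p, x, y, g, gamma, **costs):
    """Terminal condition V(T, p, x, y, g) = U(g + J(p, x, y)) in (22)."""
    return cara_utility(g + liquidation_value(p, x, y, **costs), gamma)
```

With positive cost rates, liquidating a position of $y$ shares and its mirror image $-y$ yields a negative total, which is one way to see that a round trip always loses the proportional costs.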
Note that for the utility function (23), the value function (11) can be expressed as

$$V(t, p, x, y, g) = 1 - \exp\big(-\gamma g e^{r(T-t)}\big)\, H(t, p, x, y), \tag{24}$$

where $H(t, p, x, y)$ is a convex, nonincreasing, continuous function in $y$ defined by

$$H(t, p, x, y) = \inf_{(L_p(t), M_p(t)) \in \mathcal{T}(g)} E\Big\{ \exp\big[-\gamma J(p(T), x(T), y(T))\big] \,\Big|\, p(t) = p,\ x(t) = x,\ y(t) = y \Big\} = 1 - V(t, p, x, y, 0).$$

Plug (24) into (21), and define the following operators acting on $H(t, p, x, y)$ on $[0, T] \times \mathbb{R}_+ \times \mathbb{R} \times \mathbb{R}$:

$$\mathcal{L}^{o} H = \frac{\partial H}{\partial t} + \kappa(\theta - x)\frac{\partial H}{\partial x} + \mu p \frac{\partial H}{\partial p} + \frac{1}{2}\nu^2 \frac{\partial^2 H}{\partial x^2} + \rho\nu\sigma p \frac{\partial^2 H}{\partial p\,\partial x} + \frac{1}{2}\sigma^2 p^2 \frac{\partial^2 H}{\partial p^2},$$

$$\mathcal{L}^{b} H = \frac{\partial H}{\partial y} + \gamma e^{r(T-t)} A^{-}(p, x)\, H, \qquad \mathcal{L}^{s} H = \frac{\partial H}{\partial y} + \gamma e^{r(T-t)} A^{+}(p, x)\, H.$$

Then (21) is transformed into the following PDE for $H(t, p, x, y)$:

$$\min\big\{ \mathcal{L}^{b} H,\ -\mathcal{L}^{s} H,\ \mathcal{L}^{o} H \big\} = 0 \tag{25}$$

with the terminal condition $H(T, p, x, y) = \exp\{-\gamma J(p, x, y)\}$. The free boundary problem (22) correspondingly becomes

$$\begin{cases} \mathcal{L}^{o} H = 0, & y \in [Y_b(t, p, x), Y_s(t, p, x)], \\ \mathcal{L}^{b} H = 0, & y \le Y_b(t, p, x), \\ \mathcal{L}^{s} H = 0, & y \ge Y_s(t, p, x), \\ H(T, p, x, y) = \exp\{-\gamma J(p, x, y)\}, \end{cases} \tag{26}$$

in which $Y_b(t, p, x)$ and $Y_s(t, p, x)$ are the buy and sell boundaries for stock $P$, respectively. Note that the function $H(t, p, x, y)$ is evaluated in the four-dimensional space $[0, T] \times \mathbb{R}_+ \times \mathbb{R} \times \mathbb{R}$. Furthermore, this suggests that while $(t, p(t), x(t))$ is inside the no-transaction region, the dynamics of $H(t, p, x, y)$ is driven by the two-dimensional Brownian motion $(B(t), W(t))$ with correlation $\rho$. In the buy and sell regions, it follows from (26) that

$$H(t, p, x, y) = \exp\big\{-\gamma e^{r(T-t)} A^{-}(p, x)\,[y - Y_b(t, p, x)]\big\}\, H(t, p, x, Y_b(t, p, x)), \qquad y \le Y_b(t, p, x),$$

$$H(t, p, x, y) = \exp\big\{-\gamma e^{r(T-t)} A^{+}(p, x)\,[y - Y_s(t, p, x)]\big\}\, H(t, p, x, Y_s(t, p, x)), \qquad y \ge Y_s(t, p, x).$$
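
The last two formulas say that, once $H$ is known on the no-transaction interval, it can be continued into the buy and sell regions by an exponential factor. The sketch below illustrates this at a fixed $(t, p, x)$; the function name `extend_H`, its argument layout, and the callable `H_inside` are assumptions made for illustration only.

```python
import math


def extend_H(y, Y_b, Y_s, H_at_Yb, H_at_Ys, H_inside,
             gamma, r, T, t, A_minus, A_plus):
    """Value of H(t, p, x, y) at a fixed (t, p, x).

    Inside [Y_b, Y_s] the no-transaction value H_inside(y) is used;
    outside, H is continued with the exponential factors implied by
    L^b H = 0 (y <= Y_b) and L^s H = 0 (y >= Y_s) in (26).
    """
    scale = gamma * math.exp(r * (T - t))
    if y < Y_b:
        return math.exp(-scale * A_minus * (y - Y_b)) * H_at_Yb
    if y > Y_s:
        return math.exp(-scale * A_plus * (y - Y_s)) * H_at_Ys
    return H_inside(y)
```

In a numerical solver, `H_at_Yb` and `H_at_Ys` would come from the computed no-transaction values at the boundaries, so the extension is continuous in $y$ by construction.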
The solution of the PDE (21) or (25) can be obtained by turning the stochastic differential equations (1), (3), (6), and (9) into Markov chains and then applying the discrete time dynamic programming algorithm. The discrete state is $\mathbf{X} = (\chi, \mathbf{p}, \mathbf{x}, \vartheta, g)$, whose elements denote time, price of stock $P$, spread, number of shares of stock $P$, and amount in the bank in a discrete space. The discrete value function, denoted by $\mathbf{V}$, is assigned its value at the final time by using the boundary conditions for the continuous value function over the discrete subspace $(\mathbf{p}, \mathbf{x}, \vartheta, g)$, and is then computed by proceeding backward in time with the discrete time algorithm. As in the continuous time case, this algorithm is the same for both value functions and is derived below for a value function denoted by $V^{\delta}(\chi, \mathbf{p}, \mathbf{x}, \vartheta, g)$, where $\delta$ is a discretization parameter, which depends on the discrete time interval $t_{\delta}$. If $t_{\delta}$ and the resolution of the $\vartheta$-axis $\vartheta_{\delta}$ are sent to zero, then the above discrete value function converges to a viscosity subsolution and a viscosity supersolution of the PDE (21). Therefore, all the discrete value functions converge to their continuous counterparts; this is due to the uniqueness of the viscosity solution.

Consider an evenly spaced partition of the time interval $[0, T]$: $\chi = \{\delta, 2\delta, \ldots, n\delta\}$, where $\delta = T/n$, and two evenly spaced partitions of the space intervals $\mathbf{z} = \{0, \pm\sqrt{\delta}, \pm 2\sqrt{\delta}, \ldots\}$ and $\mathbf{w} = \{0, \pm\sqrt{\delta}, \pm 2\sqrt{\delta}, \ldots\}$. The grid $\mathbf{p}$ is defined from $\mathbf{z}$ via the following transformation,

$$p_i = \exp\Big( \big(\mu - \tfrac{1}{2}\sigma^2\big) T + z_i\, \sigma \sqrt{T} \Big). \tag{27}$$

Since the SDE (3) implies that the asymptotic distribution of $x(t)$ is $\text{Normal}(\theta, \nu^2/(2\kappa))$, we define the grid $\mathbf{x}$ by

$$x_j = \theta + \frac{\nu}{\sqrt{2\kappa}}\, w_j. \tag{28}$$

Denote $\chi_i = i\delta$ for $i = 1, \ldots, n-1$. The dynamics (1) and (3) of $p(t)$ and $x(t)$ imply the following transition density for $(p(\chi_i), x(\chi_i))$:

$$\begin{pmatrix} \log p(\chi_{i+1}) \\ x(\chi_{i+1}) \end{pmatrix} \Bigg|\, \begin{pmatrix} p(\chi_i) \\ x(\chi_i) \end{pmatrix} \sim N\left( \begin{pmatrix} \log p(\chi_i) + \big(\mu - \tfrac{1}{2}\sigma^2\big)\delta \\ (1 - \delta\kappa)\, x(\chi_i) + \delta\kappa\theta \end{pmatrix},\ \begin{pmatrix} \delta\sigma^2 & \delta\rho\sigma\nu \\ \delta\rho\sigma\nu & \delta\nu^2 \end{pmatrix} \right). \tag{29}$$

We also note that the discrete time equation for the amount in the bank $g(\chi)$ is

$$g(\chi_{i+1}) = g(\chi_i) \exp(r\delta).$$

Given the grid defined above, the discrete time dynamic programming principle is invoked, and the following discretization scheme is proposed for PDE (21):

$$V^{\delta}\big(\chi_i, p(\chi_i), x(\chi_i), \vartheta, g(\chi_i)\big) = \max\Big\{ V^{\delta}\big(\chi_i, p(\chi_i), x(\chi_i), \vartheta + \xi, g(\chi_i) - (a_p - b_q e^{x(\chi_i)})\, p(\chi_i)\, \xi\big),\ V^{\delta}\big(\chi_i, p(\chi_i), x(\chi_i), \vartheta - \xi, g(\chi_i) + (b_p - a_q e^{x(\chi_i)})\, p(\chi_i)\, \xi\big),\ E\big\{ V^{\delta}\big(\chi_{i+1}, p(\chi_{i+1}), x(\chi_{i+1}), \vartheta, g(\chi_{i+1})\big) \big\} \Big\}, \tag{30}$$

where $\xi > 0$ and $i = 0, \ldots, n-1$. This scheme is based on the principle that the investor's policy is the choice of the optimum transaction. We next show that, as the discretization parameter $\delta \to 0$, the solution $V^{\delta}$ of (30) converges to the value function $V$, or, equivalently, to the unique constrained viscosity solution of (21).

Theorem 3. The solution $V^{\delta}$ of (30) converges locally uniformly as $\delta \to 0$ to the unique constrained viscosity solution of (21).

For the exponential utility function $U(z) = 1 - \exp(-\gamma z)$, the value function $V$ can be expressed as (24), and its discretization is given by

$$V^{\delta}\big(\chi_i, p(\chi_i), x(\chi_i), \vartheta, g(\chi_i)\big) = 1 - \exp\big(-\gamma\, g(\chi_i)\, e^{r(T - \chi_i)}\big)\, H^{\delta}\big(\chi_i, p(\chi_i), x(\chi_i), \vartheta\big).$$

Then the discretization scheme (30) can be reduced to

$$H^{\delta}\big(\chi_i, p(\chi_i), x(\chi_i), \vartheta\big) = \min\Big\{ F_b\big(p(\chi_i), x(\chi_i), \xi\big) \cdot H^{\delta}\big(\chi_i, p(\chi_i), x(\chi_i), \vartheta + \xi\big),\ F_s\big(p(\chi_i), x(\chi_i), \xi\big) \cdot H^{\delta}\big(\chi_i, p(\chi_i), x(\chi_i), \vartheta - \xi\big),\ E\big\{ H^{\delta}\big(\chi_{i+1}, p(\chi_{i+1}), x(\chi_{i+1}), \vartheta\big) \big\} \Big\}, \tag{31}$$

where

$$F_b\big(p(\chi_i), x(\chi_i), \xi\big) = \exp\big\{ \gamma \xi A^{-}(p(\chi_i), x(\chi_i))\, e^{r(T - \chi_i)} \big\}, \qquad F_s\big(p(\chi_i), x(\chi_i), \xi\big) = \exp\big\{ -\gamma \xi A^{+}(p(\chi_i), x(\chi_i))\, e^{r(T - \chi_i)} \big\}.$$

We use the numerical algorithm proposed in Section 3 to study the buy and sell boundaries of the pairs trading strategy. Our study focuses on two aspects of the problem. The first is the property of the buy and sell boundaries (or no-transaction regions) for a given set of model parameters, and the other is the impact of different model parameters on the shape of the buy and sell boundaries. Without loss of generality, we assume the time horizon $T = 1$ and $p(0) = 1$ in all our simulation studies.

We first consider a baseline scenario (S1). The parameter values in the baseline scenario are µ = 0 . , σ = 0 . , θ = 0 . , κ = 1 , ν = 0 . , ρ = 0 . , r = 0 . , γ = 5 and ζ p = ζ q = ξ p = ξ q = 0 . We discretize the state space $(t, p, x, y, g)$ and use the developed Markov chain approximation to solve the discretized optimization problem. Figure 1 shows the buy and sell surfaces of (S1) at time t = 0 . , . , .

65, and 0.95. To better read the figure, we also show in Figures 2 and 3 the buy and sell boundaries of (S1) at prices p = 0 . , . , . , . x = 0 . , . , . , . p ( T ) and asymptotic distribution of x ( t ), respectively. We find the following from these figures. First, at a given time and a given price level, the no-transaction region becomes narrower when the spread gets larger, and the no-transaction region moves from the negative to the positive when the spread turns from the negative to the positive. For example, at t = 0.05 and p ( t ) = 0 . − . , − . x ( t ) = 0.023 to [ − . , − .4] at x ( t ) = 0 . − . , .2] at x ( t ) = 0 . . , .7] at x ( t ) = 0 . p ( t ) gets larger, and the no-transaction region moves up when the price becomes larger. For instance, at t = 0.05 and x ( t ) = 0 . − . , − .0] at p ( t ) = 0.845 to [ − . , − .6] at p ( t ) = 1 . − . , − . p ( t ) = 1 . − . , − .0] at p ( t ) = 2 . p ( t ) , x ( t )) = (1 . , . t = 0 . , . , . .95 are [ − . , − . − . , − . − . , − . − . , − . µ = 0 . σ, θ, κ, ν, ρ, r, γ and ζ p = ζ q = ξ p = ξ q have the same values as those in (S1). We discretize the state space $(t, p, x, y, g)$, and use the developed Markov chain approximation to solve the discretized optimization problem for Scenarios 2-19.

To compare the buy and sell boundaries (or no-transaction regions) among different scenarios, we plot the buy and sell boundaries over time at four fixed points ( p (1) , x (1) ) = (0 . , . p (2) , x (2) ) = (0 . , . p (3) , x (3) ) = (1 . , . p (4) , x (4) ) = (1 . , . µ , σ , θ , κ , ν , ρ , r , γ , ζ p (= ζ q = ξ p = ξ q ), respectively. In each figure, we plot the buy and sell boundaries for $(p^{(i)}, x^{(i)})$, $i = 1, 2, 3, 4$.

µ = 0 . , σ = 0 . , θ = 0 . , κ = 1 , ν = 0 . , ρ = 0 . r = 0 . , γ = 5 and ζ p = ζ q = ξ p = ξ q = 0 . µ = 0 . κ = 0 . r = 0 . µ = 0 . κ = 1 . r = 0 . σ = 0 . ν = 0 . γ = 3 (S5) σ = 0 . ν = 0 . γ = 8 (S6) θ = − .05 (S12) ρ = − . ζ p = ζ q = ξ p = ξ q = 0 . θ = 0 . ρ = 0 . ζ p = ζ q = ξ p = ξ q = 0 .

Figure 4 shows that, when $\mu$ increases, the buy and sell boundaries move downward at all four points. Figure 5 indicates that, when $\sigma$ increases, the buy and sell boundaries move upward at $(p^{(1)}, x^{(1)})$ and $(p^{(2)}, x^{(2)})$, but move downward at $(p^{(3)}, x^{(3)})$ and $(p^{(4)}, x^{(4)})$. Figure 6 shows that, when $\theta$ increases, the buy and sell boundaries move downward at all four points. Figure 7 indicates that, when $\kappa$ increases, the buy and sell boundaries move downward, and the magnitude of such movement is larger at $(p^{(1)}, x^{(1)})$ than at the other three points. Figure 8 shows that, when $\nu$ increases, the buy and sell boundaries move upward at $(p^{(i)}, x^{(i)})$, $i = 1, 2, 3$, but move downward at $(p^{(4)}, x^{(4)})$. Figure 9 suggests that, when the correlation $\rho$ changes from negative to positive, the buy and sell boundaries move downward at $(p^{(1)}, x^{(1)})$ and $(p^{(2)}, x^{(2)})$, but move upward at $(p^{(3)}, x^{(3)})$ and $(p^{(4)}, x^{(4)})$. Figure 10 indicates that variations of the interest rate $r$ have little impact on the buy and sell boundaries. Figure 11 shows that, when the risk aversion parameter $\gamma$ increases, the buy and sell boundaries move upward at $(p^{(i)}, x^{(i)})$, $i = 1, 2, 3$, but move downward at $(p^{(4)}, x^{(4)})$. Figure 12 suggests that, when the transaction cost increases, the center of the no-transaction region does not seem to change, but the region gets wider.

We also perform simulation studies to investigate the performance of the optimal trading strategy. For comparison, we also consider a benchmark strategy which is analogous to the relative-value arbitrage strategy used in Gatev et al. (2006) and based on the standard deviation of the spread. Specifically, the strategy opens a position when the spread exceeds twice the standard deviation of the spread process, and closes the position when either the prices converge or the maturity is reached. As the benchmark strategy does not specify the number of shares of stocks that should be bought or sold, we assume that the number of shares of stocks traded each time is one.

We simulate the price process $p_t$ and the spread process $x_t$ to compare the performance of the benchmark strategy and our strategy in scenarios (S1)-(S19). Assume that $T = 1$, and we discretize the time interval $(0, 1]$ as $\{0.01, 0.02, \ldots, 0.99, 1\}$, so that we have 100 trading periods. For each scenario, we simulate 1000 paths of $\{(p_t, x_t) \mid t = 0, 0.01, \ldots, 0.99, 1,\ p_0 = 1\}$, and for each simulated path $(p_t, x_t)$, we implement the benchmark strategy and the optimal strategy at $t = 0.01, 0.02, \ldots, 0.99$ and close the position at $T = 1$. Let $i = b, o$ represent the benchmark and the optimal strategies, respectively. For each realized trading strategy, denote by $N^{(i)}$ the number of trades (i.e., buy and sell) among the 100 trading periods and $PL^{(i)} = -C^{(i)}(L_p, M_p; 0,$

1) the total proﬁt made during the trading process. Note that thebenchmark strategy trades only one share of stock each time while the number of sharesof stocks in the optimal strategy are “optimally” chosen based on the buy and sell regions,we deﬁne

P S ( i ) as the the average proﬁt (or loss) generated from the maximum number ofshares of stocks during the trading process. That is, P S ( i ) := − C ( i ) ( L p , M p ; 0 , / max t | Y ( i ) t | ,where Y ( i ) t is the number of shares of stock P at t = 0 . , . , . . . , . N ( i ) , P L ( i ) , and P S ( i ) ( i = o, b ) for1000 paths in each scenario. We note that the total numbers of trades N ( o ) in the optimalstrategy range from 45.736 to 55.821 for (S1)-(S17), and increases (or decreases) signiﬁcantlywhen the transaction costs decreases (or increases) in (S18) and (S19). In comparison to this,the total numbers of trades N ( b ) in the benchmark strategy are much smaller, essentially,between 1 and 2. This suggests the benchmark strategy is much more conservative than theoptimal strategy. For the realized proﬁt over the trading period, P L ( o ) is much larger than P L ( b ) as the optimal strategy can choose to buy or sell the “optimal” number of shares ofstock pairs, while the benchmark strategy only buy or sell one share of stock pair. 
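The benchmark rule and the simulation design described above can be illustrated with a short sketch. It is not the paper's implementation: the spread is assumed here to follow an Ornstein-Uhlenbeck process with hypothetical parameters `kappa`, `theta`, and `nu`, the threshold uses the stationary standard deviation of that process, "convergence" is taken to mean that the spread crosses back through its mean, and transaction costs enter only through a hypothetical flat `cost` per trade. The function names `simulate_ou`, `benchmark_rule`, and `run_study` are introduced only for this illustration.

```python
import numpy as np

def simulate_ou(x0, kappa, theta, nu, dt, n_steps, rng):
    """Exact discretization of dx = kappa*(theta - x) dt + nu dW (assumed model)."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    a = np.exp(-kappa * dt)
    step_sd = nu * np.sqrt((1.0 - a**2) / (2.0 * kappa))
    for k in range(n_steps):
        x[k + 1] = theta + a * (x[k] - theta) + step_sd * rng.standard_normal()
    return x

def benchmark_rule(x, mean, sd, cost=0.0):
    """Two-standard-deviation rule trading one unit of the spread.

    Opens when |x - mean| > 2*sd; closes when the spread crosses back
    through the mean or at the final time. Returns (n_trades, profit).
    """
    position = 0        # +1 long the spread, -1 short, 0 flat
    entry = 0.0
    n_trades = 0
    profit = 0.0
    last = len(x) - 1
    for k in range(1, last + 1):
        if position == 0 and k < last:
            if x[k] > mean + 2.0 * sd:
                position, entry = -1, x[k]      # spread too wide: short it
            elif x[k] < mean - 2.0 * sd:
                position, entry = 1, x[k]       # spread too narrow: long it
            if position != 0:
                n_trades += 1
                profit -= cost
        elif position != 0:
            converged = (position == 1 and x[k] >= mean) or \
                        (position == -1 and x[k] <= mean)
            if converged or k == last:          # close on convergence or at maturity
                profit += position * (x[k] - entry) - cost
                n_trades += 1
                position = 0
    return n_trades, profit

def run_study(n_paths=1000, n_steps=100, T=1.0,
              kappa=1.0, theta=0.0, nu=0.15, seed=0):
    """Monte Carlo mean and standard error of trade count and profit."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    sd_stationary = nu / np.sqrt(2.0 * kappa)   # stationary sd of the OU spread
    trades = np.empty(n_paths)
    profits = np.empty(n_paths)
    for j in range(n_paths):
        x = simulate_ou(theta, kappa, theta, nu, dt, n_steps, rng)
        trades[j], profits[j] = benchmark_rule(x, theta, sd_stationary)
    se = lambda v: v.std(ddof=1) / np.sqrt(len(v))
    return trades.mean(), se(trades), profits.mean(), se(profits)

if __name__ == "__main__":
    print(run_study())
```

Because this rule holds at most one unit at any time, the per-share measure $PS^{(b)}$ equals $PL^{(b)}$ on every path where a trade occurs, which matches the identical $PL^{(b)}$ and $PS^{(b)}$ columns reported in Table 2.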
$PS^{(o)}$ and $PS^{(b)}$ remove the impact of the number of shares of traded stocks and provide the average earning per traded stock, and we notice that $PS^{(o)}$ is still significantly higher than $PS^{(b)}$.

Table 2: Performance of strategies

    Scenario  N(o)            PL(o)          PS(o)          N(b)           PL(b)          PS(b)
    (S1)      52.289 (.247)   0.349 (.019)   0.048 (.004)   1.094 (.084)   0.005 (.002)   0.005 (.002)
    (S2)      53.218 (.241)   0.389 (.020)   0.051 (.004)   1.094 (.084)   0.006 (.002)   0.006 (.002)
    (S3)      51.348 (.253)   0.318 (.019)   0.046 (.004)   1.094 (.084)   0.004 (.002)   0.004 (.002)
    (S4)      52.999 (.208)   0.378 (.019)   0.054 (.003)   1.094 (.084)   0.007 (.002)   0.007 (.002)
    (S5)      51.896 (.275)   0.326 (.019)   0.040 (.005)   1.094 (.084)   0.003 (.004)   0.003 (.004)
    (S6)      49.299 (.235)   0.357 (.020)   0.032 (.003)   1.094 (.084)   0.003 (.002)   0.003 (.002)
    (S7)      54.233 (.262)   0.344 (.019)   0.064 (.005)   1.094 (.084)   0.008 (.003)   0.008 (.003)
    (S8)      55.821 (.304)   0.359 (.021)   0.062 (.006)   1.094 (.084)   0.005 (.003)   0.005 (.003)
    (S9)      45.736 (.254)   0.266 (.016)   0.046 (.003)   1.094 (.084)   0.005 (.002)   0.005 (.002)
    (S10)     46.347 (.292)   0.228 (.016)   0.042 (.005)   1.052 (.083)   0.004 (.002)   0.004 (.002)
    (S11)     57.689 (.212)   0.489 (.022)   0.053 (.003)   1.206 (.084)   0.007 (.002)   0.007 (.002)
    (S12)     46.774 (.248)   0.325 (.015)   0.065 (.003)   1.140 (.086)   0.008 (.001)   0.008 (.001)
    (S13)     53.516 (.245)   0.361 (.020)   0.045 (.004)   1.140 (.087)   0.006 (.002)   0.006 (.002)
    (S14)     54.027 (.232)   0.579 (.032)   0.048 (.004)   1.094 (.084)   0.005 (.002)   0.005 (.002)
    (S15)     50.031 (.266)   0.219 (.012)   0.049 (.004)   1.094 (.084)   0.005 (.002)   0.005 (.002)
    (S16)     52.300 (.247)   0.347 (.019)   0.048 (.004)   1.094 (.084)   0.005 (.002)   0.005 (.002)
    (S17)     52.261 (.247)   0.357 (.019)   0.050 (.004)   1.094 (.084)   0.006 (.002)   0.006 (.002)
    (S18)     73.801 (.286)   0.339 (.019)   0.045 (.004)   1.094 (.084)   0.006 (.002)   0.006 (.002)
    (S19)     42.996 (.222)   0.339 (.019)   0.049 (.004)   1.094 (.084)   0.004 (.002)   0.004 (.002)

We test our model with real market data in this section.
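The empirical procedure in this section builds the spread by regressing the log price of stock Q on the log price of stock P and then checks the spread for a unit root before trading. A minimal sketch of those two steps is given below, under stated assumptions: the text does not say which unit-root test is used, so a plain Dickey-Fuller regression with a constant is assumed; the data are synthetic rather than the stock prices used in the study; and the helper names `ols`, `spread_from_regression`, `dickey_fuller_t`, and `synthetic_pair` are hypothetical.

```python
import numpy as np

def ols(y, X):
    """Least-squares coefficients, residuals, and coefficient standard errors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta, resid, np.sqrt(np.diag(cov))

def spread_from_regression(log_q, log_p):
    """Regress log Q on log P; fitted values play the role of the
    'transformed' price of P, residuals the role of the spread."""
    X = np.column_stack([np.ones_like(log_p), log_p])
    beta, resid, _ = ols(log_q, X)
    fitted = X @ beta
    return beta, fitted, resid

def dickey_fuller_t(x):
    """t-statistic of rho in  dx_t = alpha + rho * x_{t-1} + e_t  (assumed test)."""
    dx = np.diff(x)
    X = np.column_stack([np.ones(len(dx)), x[:-1]])
    beta, _, se = ols(dx, X)
    return beta[1] / se[1]

def synthetic_pair(n=750, seed=0, phi=0.8):
    """Synthetic log prices: log P is a random walk, and log Q is a linear
    function of log P plus a stationary AR(1) spread (illustrative only)."""
    rng = np.random.default_rng(seed)
    log_p = np.cumsum(0.01 * rng.standard_normal(n))
    spread = np.zeros(n)
    for t in range(1, n):
        spread[t] = phi * spread[t - 1] + 0.02 * rng.standard_normal()
    log_q = 0.5 + 1.2 * log_p + spread
    return log_p, log_q

if __name__ == "__main__":
    log_p, log_q = synthetic_pair()
    beta, fitted, resid = spread_from_regression(log_q, log_p)
    t_stat = dickey_fuller_t(resid)
    # -2.86 is the asymptotic 5% Dickey-Fuller critical value (constant, no trend)
    decision = "trade" if t_stat < -2.86 else "do not trade"
    print("intercept, slope:", beta)
    print("DF t-statistic:", t_stat, "->", decision)
```

The threshold of -2.86 applies to a Dickey-Fuller test on an observed series; when the spread is itself a regression residual, residual-based cointegration tests use more negative critical values, so the cutoff here should be read as indicative only.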
We first present the sample and explain our methodology, and then show and discuss the results.

A key step in implementing a pairs trading strategy is to select the two stocks to trade. Gatev et al. (2006) illustrate how this can be done by using stock price data. An alternative to this approach is to use fundamental analysis to select two stocks that have almost the same risk factor exposures; see Vidyamurthy (2004). In this study, we consider a hybrid of these two approaches. Specifically, we restrict the two stocks P and Q to belong to the same industry sector. Table 3 lists six pairs of stocks selected from four different sectors. For each pair of stocks P and Q, we compute the spread by regressing the log price of stock Q on the log price of stock P, and the fitted values of the regression are considered as the “transformed” price of P. Figure 13 shows six pairs of the original prices of Q and transformed prices of P over time.

Table 3

    Sector             Stock Q                            Stock P
    Consumer goods     Apple Inc. (AAPL)                  Procter & Gamble Co. (PG)
    Consumer goods     Coca-Cola Co (KO)                  PepsiCo, Inc. (PEP)
    Technology         Alphabet Inc Class A (GOOGL)       Microsoft Corporation (MSFT)
    Technology         AT&T Inc. (T)                      Verizon Communications Inc. (VZ)
    Industrial goods   Boeing Corporation (BA)            General Electric Company (GE)
    Financial          Goldman Sachs Group Inc. (GS)      JPMorgan Chase & Co. (JPM)

We then apply the optimal strategy and the benchmark strategy in Section 4.2 to test the out-of-sample performance. Specifically, we use the past three years of historical data of each pair to estimate the model parameters, and run a unit-root test to determine whether the spread $x_t$ is a stationary process. If $x_t$ is not stationary, we do not implement any strategy. Otherwise, we implement both the optimal strategy and the benchmark strategy. Note that the optimal strategy can optimally choose the number of shares of stocks in each trade, while we still trade one unit of stock in the benchmark strategy. Table 4 shows the number of trades $N^{(i)}$, the accumulated profit (in U.S. dollars) at maturity $PL^{(i)}$, and the average profit per traded share $PS^{(i)}$ over two testing periods, for $i = o$ (the optimal strategy) and $i = b$ (the benchmark strategy). Table 4 suggests that the benchmark strategy is much more conservative than the optimal strategy. Moreover, the average profits per traded share $PS^{(o)}$ of the optimal strategy are much larger than those of the benchmark strategy, except for the stock pair (KO, PEP).

Table 4: Performance of strategies

    Pairs            Year   N(o)   PL(o)      PS(o)    N(b)   PL(b)     PS(b)
    (AAPL, PG)       2014   58     8.56       2.173    0      0         0
                     2015   70     25.439     3.91     0      0         0
    (BA, GE)         2014   97     27.866     1.292    0      0         0
                     2015   165    168.543    1.982    20     0.455     0.455
    (T, VZ)          2014   127    77.908     2.158    2      0.603     0.603
                     2015   131    115.587    2.883    0      0         0
    (GOOGL, MSFT)    2015   103    94.271     6.734    8      1.623     1.623
                     2016   135    65.957     6.296    0      0         0
    (GS, JPM)        2015   100    7.654      0.195    6      -2.54     -2.54
                     2016   200    94.542     2.375    8      -1.66     -1.66
    (KO, PEP)        2015   142    37.51      0.675    22     10.154    10.154
                     2016   165    217.878    4.059    4      5.983     5.983

The problem of optimal pairs trading has been studied by many academic researchers and financial practitioners. Existing models and methods try to find either the optimal shares of stocks by assuming no transaction costs, or the optimal timing of trading a fixed number of shares of stocks with transaction costs. To find optimal pairs trading strategies which determine optimally both the trade time and the number of shares during the trading process, we investigate an optimal pairs trading problem with proportional transaction costs. Using an approach that is based on maximization of the expected utility of terminal wealth, we transform the problem into a singular stochastic control problem and argue that the value function of the problem is the unique viscosity solution of a nonlinear quasi-variational inequality. We further show that the viscosity solution is equivalent to a free boundary problem for the singular stochastic control value function.
To solve the singular stochastic control problem associated with utility maximization and compute the value function and transaction regions, we develop a numerical algorithm based on dynamic programming. In simulation studies, we illustrate the numerical algorithm and investigate the impact of model parameters on the optimal trading strategies (or the transaction regions). We also demonstrate the out-of-sample performance of the optimal strategy via an empirical study which consists of six pairs of U.S. stocks from different industry sectors.

There are several directions in which our approach needs further investigation. First, our approach can be easily extended to nonexponential utility functions. In such a case, the optimization problem involves five (instead of four) variables, and the numerical algorithm in our paper needs to be modified to handle five variables. Second, our approach can be extended to solve the optimal cointegration trading problem which involves $n$ stocks with $m$ cointegration relationships. Third, many empirical studies suggest that stock price processes can be better approximated by incorporating jumps. Using the framework and algorithms developed in Xing et al. (2017), the method developed here can be extended to the case in which price processes follow geometric jump-diffusion processes. In such a case, the variational inequalities for the value function involve integro-differential equations, which can be solved by extending our numerical algorithm.

Acknowledgement

The author’s research is supported by National Science Foundation DMS-1612501.

References

D. Ehrman. The Handbook of Pairs Trading: Strategies Using Equities, Options, and Futures. John Wiley and Sons, New Jersey, 2006.

R. Elliott, J. Van der Hoek, and W. Malcolm. Pairs trading. Quantitative Finance, 5:271–276, 2005.

E. Gatev, W. N. Goetzmann, and K. G. Rouwenhorst. Pairs trading: Performance of a relative-value arbitrage rule. The Review of Financial Studies, 19:797–827, 2006.

T. L. Lai and H. Xing. Statistical Models and Methods for Financial Markets. Springer, New York, 2008.

Y. Lei and J. Xu. Costly arbitrage through pairs trading. Journal of Economic Dynamics and Control, 56:1–19, 2015.

T. Leung and X. Li. Optimal mean reversion trading with transaction costs and stop-loss exit. International Journal of Theoretical and Applied Finance, 18:1550020, 2015.

S. Mudchanatongsuk, J. Primbs, and W. Wong. Optimal pairs trading: A stochastic approach. American Control Conference, IEEE, 2008.

M. Ngo and H. Pham. Optimal switching for the pairs trading rule: A viscosity solutions approach. Journal of Mathematical Analysis and Applications, 441:403–425, 2016.

Q. Song and R. Yan. An optimal pairs-trading. Automatica, 49:3007–3014, 2013.

A. Tourin and R. Yan. Dynamic pairs trading using the stochastic control approach. Journal of Economic Dynamics and Control, 37:1972–1981, 2013.

G. Vidyamurthy. Pairs Trading: Quantitative Methods and Analysis. John Wiley and Sons, New York, 2004.

M. Whistler. Trading Pairs: Capturing Profits and Hedging Risk with Statistical Arbitrage Strategies. John Wiley and Sons, New York, 2004.

H. Xing, Y. Yu, and T. W. Lim. European option pricing under geometric Lévy processes with proportional transaction costs. Journal of Computational Finance, 21:101–127, 2017.

H. Zhang and Q. Zhang. Trading a mean-reverting asset: Buy low and sell high. Automatica, 44:1511–1518, 2008.

Appendix: Proof of Theorems

Proof of Theorem 1.

In our case, the state $X$ is $(s, \mathbf{x})$, where $\mathbf{x} = (p, x, y, g)$. Let $X_0 = (s_0, p_0, x_0, y_0, g_0)$; it follows that there exists an optimal trading strategy, dictated by the pair of processes $(L_p^*(t), M_p^*(t))$, where $X^*(t) = (t, p^*(t), x^*(t), y^*(t), g^*(t))$ is the optimal trajectory, with $X^*(s_0) = X_0$.

(i) First, we prove that $V$ is a viscosity subsolution of (21) on $[0, T] \times \mathbb{R}_+ \times \mathbb{R} \times \mathbb{R} \times \mathbb{R}$. For this, we must show that, for all smooth functions $\phi(X)$ such that $V(X) - \phi(X)$ has a local maximum at $X_0$, the following inequality holds:

$$\min\big\{ -\mathcal{B}\phi(X_0),\ \mathcal{S}\phi(X_0),\ -\mathcal{L}\phi(X_0) \big\} \le 0. \tag{32}$$

Without loss of generality, we assume that $V(X_0) = \phi(X_0)$ and $V \le \phi$ on $[0, T] \times \mathbb{R}_+ \times \mathbb{R} \times \mathbb{R} \times \mathbb{R}$. We argue by contradiction: if all arguments inside the minimum operator of (32) were positive, that is, $-\mathcal{B}\phi(X_0) > 0$, $\mathcal{S}\phi(X_0) > 0$, and $-\mathcal{L}\phi(X_0) > 0$, then there exists $\theta > 0$ such that $-\mathcal{L}\phi(X_0) > \theta$. Since $\phi$ is smooth, the above inequalities become $-\mathcal{B}\phi(X) > 0$, $\mathcal{S}\phi(X) > 0$, and $-\mathcal{L}\phi(X) > \theta$, where $X = (t, p, x, y, g) \in B(X_0)$, a neighborhood of $X_0$. In Lemma 1, it is shown that $X^*(t)$ has no jumps, P-a.s., at $X_0 = X^*(s_0)$. Hence $\tau(\omega)$, defined by

$$\tau(\omega) = \inf\{ t \in (s_0, T] : X^*(t) \notin B(X_0) \},$$

is positive P-a.s., and therefore the integral along $X^*(t)$ satisfies

$$-\theta \int_{s_0}^{\tau} dt > E \int_{s_0}^{\tau} \mathcal{B}\phi(X^*(t))\, dL^*(t) - E \int_{s_0}^{\tau} \mathcal{S}\phi(X^*(t))\, dM^*(t) + E \int_{s_0}^{\tau} \mathcal{L}\phi(X^*(t))\, dt = E\{I_1\} - E\{I_2\} + E\{I_3\}, \tag{33}$$

where $(L^*(t), M^*(t))$ is the optimal trading strategy at $X_0$. Applying Itô's formula to $\phi(X)$, where the state dynamics are given by (1)-(6), we get

$$E\{\phi(X^*(\tau))\} = \phi(X_0) + E\{I_1\} - E\{I_2\} + E\{I_3\}. \tag{34}$$

Since $V(X) \le \phi(X)$ for all $X \in B(X_0)$, and $V(X_0) = \phi(X_0)$, (33) and (34) yield

$$E\{V(X^*(\tau))\} \le V(X_0) + E\{I_1\} - E\{I_2\} + E\{I_3\} < V(X_0) - \theta \int_{s_0}^{\tau} dt,$$

which violates the dynamic programming principle, together with the optimality of $(L^*(t), M^*(t))$. Therefore, at least one of the arguments inside the minimum operator of (32) is nonpositive, and hence the value function is a viscosity subsolution of (21).

(ii) In the second part of the proof, we show that $V$ is a viscosity supersolution of (21). For this, we must show that, for all smooth functions $\phi(X)$ such that $V(X) - \phi(X)$ has a local minimum at $X_0$, the following inequality holds:

$$\min\big\{ -\mathcal{B}\phi(X_0),\ \mathcal{S}\phi(X_0),\ -\mathcal{L}\phi(X_0) \big\} \ge 0, \tag{35}$$

where, without loss of generality, $V(X_0) = \phi(X_0)$ and $V(X) \ge \phi(X)$ on $[0, T] \times \mathbb{R}_+ \times \mathbb{R} \times \mathbb{R} \times \mathbb{R}$. In this case, we prove that each argument of the minimum operator of (35) is nonnegative.

Consider the trading strategy $L(t) = L_0 > 0$, $s_0 \le t \le T$, and $M(t) = 0$, $s_0 \le t \le T$. By the dynamic programming principle,

$$V(s_0, p_0, x_0, y_0, g_0) \ge V\big(s_0, p_0, x_0, y_0 + L_0,\ g_0 - (a_p - b_q e^{x_0}) p_0 L_0\big).$$

This inequality holds for $\phi(s, p, x, y, g)$ as well, and, by taking the left-hand side to the right-hand side, dividing by $L_0$, and sending $L_0 \to 0$, we get $\mathcal{B}\phi(X_0) \le 0$. Similarly, by using the trading strategy $L(t) = 0$, $s_0 \le t \le T$, and $M(t) = M_0 > 0$, $s_0 \le t \le T$, the second argument inside the minimum operator is found to be nonnegative.

Finally, consider the case where no trading is applied. By the dynamic programming principle,

$$E\{V(X_d(t))\} \le V(s_0, p_0, x_0, y_0, g_0), \tag{36}$$

where $X_d(t)$ is the state trajectory starting at $s_0$ when $M(t) = L(t) = 0$, $s_0 \le t \le T$, given by (1)-(6) as $X_d(t) = (t, p(t), x(t), y_0, g(t))$, and $X_d(t) \in B(X_0)$. Therefore, by applying Itô's rule to $\phi(s, p, x, y, g)$, inequality (36) yields

$$E\Big\{ \int_{s_0}^{t} \mathcal{L}\phi(X_d(\xi))\, d\xi \Big\} \le 0,$$

and, by letting $t \to s_0$, the third argument inside the minimum operator is found to be nonnegative. This completes the proof. □

Lemma 1. Assume that $-\mathcal{B}\phi(X_0) > 0$, and denote by $A(\omega)$ the event that the optimal trajectory $X^*(t)$ has a jump of size $\epsilon$ along the direction $(0, 0, 0, 1, -(a_p - b_q e^{x}) p)$. Assume that the state (after the jump) is $(s_0, p_0, x_0, y_0 + \epsilon, g_0 - (a_p - b_q e^{x_0}) p_0 \epsilon) \in B(X_0)$. Then

$$\big( \mathcal{B}\phi(X_0) \big) P(A) \ge 0, \tag{37}$$

and therefore $P(A) = 0$. Similarly, if $\mathcal{S}\phi(X_0) > 0$, then the optimal trajectory has no jumps along the direction $(0, 0, 0, -1, (b_p - a_q e^{x}) p)$, P-a.s. at $X_0$.

Proof. By the principle of dynamic programming,

$$V(s_0, p_0, x_0, y_0, g_0) = E\big\{ V(s_0, p_0, x_0, y_0 + \epsilon,\ g_0 - (a_p - b_q e^{x_0}) p_0 \epsilon) \big\} = \int_{A(\omega)} V(s_0, p_0, x_0, y_0 + \epsilon,\ g_0 - (a_p - b_q e^{x_0}) p_0 \epsilon)\, dP + \int_{A^c(\omega)} V(s_0, p_0, x_0, y_0, g_0)\, dP,$$

and therefore

$$\int_{A(\omega)} \Big[ \phi(s_0, p_0, x_0, y_0 + \epsilon,\ g_0 - (a_p - b_q e^{x_0}) p_0 \epsilon) - \phi(s_0, p_0, x_0, y_0, g_0) \Big]\, dP \ge 0,$$

since $V(X) \le \phi(X)$ for all $X \in B(X_0)$ and $V(X_0) = \phi(X_0)$. Therefore,

$$\limsup_{\epsilon \to 0} \Big\{ \int_{A(\omega)} \frac{\phi(s_0, p_0, x_0, y_0 + \epsilon,\ g_0 - (a_p - b_q e^{x_0}) p_0 \epsilon) - \phi(s_0, p_0, x_0, y_0, g_0)}{\epsilon}\, dP \Big\} \ge 0,$$

and, by Fatou's lemma,

$$\int_{A(\omega)} \limsup_{\epsilon \to 0} \Big\{ \frac{\phi(s_0, p_0, x_0, y_0 + \epsilon,\ g_0 - (a_p - b_q e^{x_0}) p_0 \epsilon) - \phi(s_0, p_0, x_0, y_0, g_0)}{\epsilon} \Big\}\, dP \ge 0,$$

which implies (37). □

Proof of Theorem 3. Let

$$V^{\delta}(t, p, x, y, g) = \begin{cases} V^{\delta}(\chi, p, x, y, g) & \text{if } t \in [\chi, \chi + \delta),\ y \in [\nu, \nu + \kappa\delta), \\ Z(p, x, y, g) & \text{if } t = T, \end{cases}$$

and

$$\underline{V}(X) = \liminf_{Y \to X,\ \delta \to 0} \{ V^{\delta}(Y) \} \quad \text{and} \quad \overline{V}(X) = \limsup_{Y \to X,\ \delta \to 0} \{ V^{\delta}(Y) \}, \tag{38}$$

where $X = (t, p, x, y, g)$. We will show that $\underline{V}(X)$ and $\overline{V}(X)$ are a viscosity supersolution and a viscosity subsolution of (21), respectively. Combining this with the uniqueness of the viscosity solution of (21) yields $\underline{V}(X) \ge \overline{V}(X)$ on $[0, T] \times \mathbb{R}_+ \times \mathbb{R} \times \mathbb{R} \times \mathbb{R}$. The opposite inequality is true by the definition of $\underline{V}(X)$ and $\overline{V}(X)$, and therefore

$$\underline{V}(X) = \overline{V}(X) = V(X),$$

which, together with (38), also implies the local uniform convergence of $V^{\delta}$ to $V$.

Note that we only prove that $\underline{V}$ is a viscosity supersolution of (21), as the argument for $\overline{V}$ is identical. Let $X_0$ be a local minimum of $\underline{V} - \phi$ on $[0, T] \times \mathbb{R}_+ \times \mathbb{R} \times \mathbb{R} \times \mathbb{R}$, for $\phi \in C^{1,2}([0, T] \times \mathbb{R}_+ \times \mathbb{R} \times \mathbb{R} \times \mathbb{R})$.
Without loss of generality, we may assume that $X_0$ is a strict local minimum, that $\underline{V}(X_0) = \phi(X_0)$, and that $\phi \le -2 \times \sup_{\delta} \{ \|V^{\delta}\|_{\infty} \}$ outside the ball $B(X_0, R)$, $R > 0$, where $\underline{V}(X) - \phi(X) \ge 0$. Then there exist sequences $\delta_n \in \mathbb{R}_+$ and $Y_n \in [0, T] \times \mathbb{R}_+ \times \mathbb{R} \times \mathbb{R} \times \mathbb{R}$ such that

$$\delta_n \to 0, \quad Y_n \to X_0, \quad V^{\delta_n}(Y_n) \to \underline{V}(X_0), \quad Y_n \text{ is a global minimum point of } V^{\delta_n} - \phi.$$

Let $h_n = V^{\delta_n}(Y_n) - \phi(Y_n)$; then $h_n \to 0$ and

$$V^{\delta_n}(X) \ge \phi(X) + h_n \quad \text{for any } X \in [0, T] \times \mathbb{R}_+ \times \mathbb{R} \times \mathbb{R} \times \mathbb{R}. \tag{39}$$

To show that $\underline{V}$ is a viscosity supersolution of (21), it suffices to show that

$$\min\big\{ -\mathcal{B}\phi(X_0),\ \mathcal{S}\phi(X_0),\ -\mathcal{L}\phi(X_0) \big\} \ge 0. \tag{40}$$

Let $Y_n = (s_n, p_n, x_n, y_n, g_n)$, where $s_n \in [\chi_n, \chi_n + \delta_n)$ and $y_n \in [\vartheta_n, \vartheta_n + \kappa\delta_n)$. Denote

$$Y_n^{(0)} = (\chi_n, p_n, x_n, y_n, g_n),$$
$$Y_n^{(1)} = \big(\chi_n, p_n, x_n, \vartheta_n + \kappa\delta_n,\ g_n - (a_p - b_q e^{x_n}) p_n \kappa\delta_n\big),$$
$$Y_n^{(2)} = \big(\chi_n, p_n, x_n, \vartheta_n - \kappa\delta_n,\ g_n + (b_p - a_q e^{x_n}) p_n \kappa\delta_n\big).$$

Then

$$V^{\delta_n}(Y_n^{(0)}) = \max\Big\{ V^{\delta_n}(Y_n^{(1)}),\ V^{\delta_n}(Y_n^{(2)}),\ E\big\{ V^{\delta_n}(Y_{n+1}^{(0)}) \big\} \Big\}.$$

Now we look at the following three cases.

Case 1. It holds that $V^{\delta_n}(Y_n^{(0)}) = V^{\delta_n}(Y_n^{(1)})$. Then (39) implies that

$$V^{\delta_n}(Y_n^{(0)}) \ge \phi(Y_n^{(1)}) + V^{\delta_n}(Y_n^{(0)}) - \phi(Y_n^{(0)}),$$

and therefore

$$0 \ge \liminf_{n} \Big\{ \frac{\phi(Y_n^{(1)}) - \phi(Y_n^{(0)})}{\delta_n} \Big\} \ge \liminf_{\delta \to 0} \Big\{ \frac{\phi(Y_0^{(1)}) - \phi(Y_0^{(0)})}{\delta} \Big\} = \frac{\partial \phi(X_0)}{\partial y} - (a_p - b_q e^{x(t)})\, p(t)\, \frac{\partial \phi(X_0)}{\partial g}.$$

Case 2. It holds that $V^{\delta_n}(Y_n^{(0)}) = V^{\delta_n}(Y_n^{(2)})$. Arguing similarly to Case 1, we get

$$0 \ge -\Big( \frac{\partial \phi(X_0)}{\partial y} - (b_p - a_q e^{x(t)})\, p(t)\, \frac{\partial \phi(X_0)}{\partial g} \Big).$$

Case 3. It holds that $V^{\delta_n}(Y_n^{(0)}) = E\{ V^{\delta_n}(Y_{n+1}^{(0)}) \}$. Then (39) implies that

$$V^{\delta_n}(Y_n^{(0)}) \ge E\big\{ \phi(Y_{n+1}^{(0)}) \big\} + V^{\delta_n}(Y_n^{(0)}) - \phi(Y_n^{(0)}),$$

and therefore

$$0 \ge \liminf_{n} \Big\{ \frac{E\{\phi(Y_{n+1}^{(0)})\} - \phi(Y_n^{(0)})}{\delta_n} \Big\} \ge \liminf_{\delta \to 0} \Big\{ \frac{E\{\phi(Y_1^{(0)})\} - \phi(Y_0^{(0)})}{\delta} \Big\} = \mathcal{L}\phi(X_0).$$

Combining the results in Cases 1-3 yields (40), and the proof is complete. □

Figure 2: Buy and sell boundaries at price $P_t$ = 0.845 (top left), 1.095 (top right), 1.400 (bottom left), and 2.108 (bottom right) and different times.

Figure 3: Buy and sell boundaries at spread $X_t$ = 0.023 (top left), 0.092 (top right), 0.157 (bottom left), and 0.266 (bottom right) and different times.

Figure 4: Buy and sell boundaries at fixed price $(p^{(i)}, x^{(i)})$, $i = 1, 2, 3, 4$, for $\mu$ = 0.1 (dashed), 0.2 (solid), 0.3 (dotted).

Figure 5: Buy and sell boundaries at fixed price $(p^{(i)}, x^{(i)})$, $i = 1, 2, 3, 4$, for $\sigma$ = 0.2 (dashed), 0.4 (solid), 0.6 (dotted).

Figure 6: Buy and sell boundaries at fixed price $(p^{(i)}, x^{(i)})$, $i = 1, 2, 3, 4$, for $\theta$ = −.… […]

Figure 7: Buy and sell boundaries at fixed price $(p^{(i)}, x^{(i)})$, $i = 1, 2, 3, 4$, for $\kappa$ = 0.8 (dashed), 1 (solid), and 1.2 (dotted).

Figure 8: Buy and sell boundaries at fixed price $(p^{(i)}, x^{(i)})$, $i = 1, 2, 3, 4$, for $\nu$ = 0.1 (dashed), 0.15 (solid), and 0.2 (dotted).

Figure 9: Buy and sell boundaries at fixed price $(p^{(i)}, x^{(i)})$, $i = 1, 2, 3, 4$, for $\rho$ = −.… […]
Figure 10: Buy and sell boundaries at fixed price $(p^{(i)}, x^{(i)})$, $i = 1, 2, 3, 4$, for $r$ = 0.005 (dashed), 0.01 (solid), and 0.03 (dotted).

Figure 11: Buy and sell boundaries at fixed price $(p^{(i)}, x^{(i)})$, $i = 1, 2, 3, 4$, for $\gamma$ = 3 (dashed), 5 (solid), and 8 (dotted).

Figure 12: Buy and sell boundaries at fixed price $(p^{(i)}, x^{(i)})$, $i = 1, 2, 3, 4$, ζ p = ζ qq