Pairs Trading with Nonlinear and Non-Gaussian State Space Models

Abstract

This paper studies pairs trading using a nonlinear and non-Gaussian state-space model framework. We model the spread between the prices of two assets as an unobservable state variable and assume that it follows a mean-reverting process. This new model has two distinctive features: (1) the innovations to the spread are non-Gaussian and heteroskedastic; (2) the mean reversion of the spread is nonlinear. We show how to use the filtered spread as the trading indicator to carry out statistical arbitrage. We also propose a new trading strategy and present a Monte Carlo-based approach to select the optimal trading rule. As the first empirical application, we apply the new model and the new trading strategy to two examples: PEP vs KO and EWT vs EWH. The results show that the new approach can achieve a 21.86% annualized return for the PEP/KO pair and a 31.84% annualized return for the EWT/EWH pair. As the second empirical application, we consider all the possible pairs among the largest and the smallest five US banks listed on the NYSE. For these pairs, we compare the performance of the proposed approach with that of the existing popular approaches, both in-sample and out-of-sample. We find that our approach can significantly improve the return and the Sharpe ratio in almost all the cases considered.


Guang Zhang ∗†
Department of Economics, Boston University, Boston, MA, 02215

May 21, 2020


Keywords: pairs trading, nonlinear and non-Gaussian state space models, Quasi Monte Carlo Kalman filter.

JEL codes: C32, C41, G11, G17.

∗ I am grateful to Zhongjun Qu, Hiroaki Kaido, Jean-Jacques Forneron and seminar participants at the Boston University Economics Department.
† Email: [email protected].

1 Introduction

In the early 1980s, a group of physicists, mathematicians and computer scientists, led by the quantitative analyst Nunzio Tartaglia, tried to use a sophisticated statistical approach to find arbitrage trading opportunities (Gatev et al. 2006). Tartaglia’s strategy, later coined pairs trading, is to find a pair of stocks whose prices have moved similarly historically, and to make a profit by applying simple contrarian principles. Since then, pairs trading has become a popular short-term arbitrage strategy used by hedge funds and is often considered the “ancestor” of statistical arbitrage.

Pairs trading works by constructing a self-financing portfolio with a long position in one security and a short position in the other. Given that the two securities have moved together historically, when a temporary anomaly happens, one security becomes overvalued relative to the other with respect to the long-term equilibrium. Then, an investor may be able to make money by selling the overvalued security, buying the undervalued security, and clearing the exposure when the two securities settle back to their long-term equilibrium. Because the effect of market movements is hedged by this self-financing portfolio, pairs trading is market-neutral.

The methods for pairs trading can be broadly divided into nonparametric and parametric methods. In particular, Gatev et al. (2006) propose a nonparametric distance-based approach for determining the securities used to construct the pairs. They choose a pair by finding the securities that minimize the sum of squared deviations between the two normalized prices. They argue that this approach “best approximates the description of how traders themselves choose pairs”. They find that average annualized excess returns reach 11% for the top pairs portfolios using CRSP daily data from 1962 to 2002. Other nonparametric methods for pairs trading can be found in Bogomolov (2013), among others.
Overall, the nonparametric distance-based approach provides a simple and general method of selecting “good” pairs; however, as pointed out by Krauss (2016) and others, this selection metric is prone to picking pairs with a small spread variance, and therefore limits the profitability of pairs trading.

In contrast, the parametric approach tries to capture the mean-reverting characteristic of the spread using a parametric model. For example, Elliott et al. (2005) propose a mean-reverting Gaussian Markov chain model for the spread, which is observed in Gaussian noise. See Vidyamurthy (2004), Cummins and Bucca (2012), Tourin and Yan (2013), Moura et al. (2016), Stübinger and Endres (2018), Clegg and Krauss (2018), Elliott and Bradrania (2018), and Bai and Wu (2018) for other parametric methods for pairs trading. Overall, the parametric approach provides tractable methods for the analysis of pairs trading; however, most of the existing parametric models are too simple to capture the dynamics of asset prices, which substantially limits the returns from pairs trading.

Compared with the existing methods for pairs trading, the proposed approach has the following features: (1) It is based on a nonlinear and non-Gaussian state space model. This modelling can capture several stylized features of financial asset prices, including heavy-tailedness, heteroskedasticity, volatility clustering and nonlinear dependence. (2) The trading strategy is different from the existing ones. It utilizes features of the model such as heteroskedasticity and volatility clustering, and it can potentially achieve significantly higher returns and Sharpe ratios. (3) The optimal trading rule is also different from the existing ones. Although this rule has no analytic solution, we show that it can be computed effectively using simulations.
Finally, the optimal trading rule can adapt to various objectives, such as a high cumulative return, Sharpe ratio, or Calmar ratio.

We apply our approach to two pairs: PEP vs KO and EWT vs EWH. We find that our approach achieves an annualized return of 0.2186 and a Sharpe ratio of 2.9518 on the PEP/KO pair, and an annualized return of 0.3184 and a Sharpe ratio of 3.8892 on the EWT/EWH pair. In comparison, a conventional approach applied to the same pairs achieves only an annualized return of 0.1311 and a Sharpe ratio of 1.1003 for the PEP/KO pair, and an annualized return of 0.1480 and a Sharpe ratio of 1.1277 for the EWT/EWH pair. Next, we test our approach using all the possible pairs among the largest 5 banks and the smallest 5 banks listed on the NYSE. We find significant improvements over the conventional approach for almost all the pairs. We also find that the pairs between small banks produce higher returns than the pairs between large banks. This is likely because the spreads between small banks are more volatile, providing more opportunities for active trading.

The main contributions of this paper can be summarized as follows. On the theory side, we propose a complete set of tools for pairs trading that includes a model for the dynamics of the spread, a new trading strategy and a Monte Carlo method for determining the optimal trading rule. On the empirical side, we apply our approach to various pairs in practice. The results show that the new approach can achieve significant improvements in the performance of pairs trading.

The remainder of this paper is organized as follows. In Section 2, we propose a new model for pairs trading. In Section 3, we propose a new trading strategy based on the mean-reverting property of the spread, and compare it with conventional trading strategies using simulations. In Section 4, we apply the proposed approach to actual data, and in Section 5 we conclude the paper.

We propose the following nonlinear and non-Gaussian state space model for pairs trading:

$$P_{A,t} = \phi + \gamma P_{B,t} + x_t + \varepsilon_t \qquad (1)$$

$$x_{t+1} = f(x_t; \theta) + g(x_t; \theta)\, \eta_t \qquad (2)$$

where $P_A$ is the price of security A, $P_B$ is the price of security B, $\gamma$ is the hedge ratio between the two securities, and $x$ is the true spread between $P_A$ and $P_B$. We assume that $x$ follows a mean-reverting process as in (2), $\varepsilon_t \sim N(0, \sigma_\varepsilon^2)$ and $\eta_t \sim p(\eta_t; \theta)$, which could be non-Gaussian. Popular choices for $f$, $g$ and $p$ include the following. Our framework applies to all of them.

• Linear mean-reverting model (Ornstein–Uhlenbeck process): $f(x_t; \theta) = \theta_0 + \theta_1 x_t$
• Nonlinear mean-reverting model: $f(x_t; \theta) = \theta_0 + \theta_1 x_t + \theta_2 x_t^2$
• Ait-Sahalia’s nonlinear mean-reverting model (Ait-Sahalia, 1996): $f(x_t; \theta) = \theta_0 + \theta_1 x_t^{-1} + \theta_2 x_t + \theta_3 x_t^2$
• Homoskedasticity model: $g(x_t; \theta) = 1$
• ARCH($m$) model: $g(x_t; \theta) = \sqrt{\theta_0 + \sum_{i=1}^{m} \theta_i x_{t-i}^2}$
• APARCH($m, \delta$) model: $g(x_t; \theta) = \left(\theta_0 + \sum_{i=1}^{m} \theta_i |x_{t-i}|^{\delta}\right)^{1/\delta}$
• Gaussian distributed noise: $p(\eta; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(\mu - \eta)^2}{2\sigma^2}\right)$
• Student’s t distributed noise: $p(\eta; \nu) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\, \Gamma\left(\frac{\nu}{2}\right)} \left(1 + \frac{\eta^2}{\nu}\right)^{-\frac{\nu+1}{2}}$
• Generalized error distributed noise: $p(\eta; \alpha, \beta, \mu) = \frac{\beta}{2\alpha\, \Gamma(1/\beta)} \exp\left(-\left(|\eta - \mu| / \alpha\right)^{\beta}\right)$

In model (1)-(2), we consider $x$ as the unobservable true spread between securities A and B, which follows a mean-reverting process. $P_A$ is the observation and $P_B$ is the control variable. Since $\phi$ and the constant $\theta_0$ in the $f$ function cannot be identified simultaneously, we let $\phi = 0$ and denote $\psi = (\gamma, \theta, \sigma_\varepsilon^2)$ as the parameter of model (1)-(2). $\psi$ is to be determined from the data set $\{P_{A,t}, P_{B,t}\}_{t=0}^{T}$.

Our new model has three advantages compared with existing models for pairs trading, such as Elliott et al. (2005) and Moura et al. (2016). First, since $\eta$ can be non-Gaussian, $x$ can follow a non-Gaussian process. By allowing for this non-Gaussianity in $\eta$, the model can capture distributional deviations from Gaussianity and reproduce heavy-tailed returns.

Second, the model captures heteroskedasticity in financial data. A well-known feature of financial time series is volatility clustering: “large changes tend to be followed by large changes, of either sign, and small changes tend to be followed by small changes” (Mandelbrot, 1963). This feature was documented later in Ding, Granger and Engle (1993), and Ding and Granger (1996), among others. In model (2), the volatility persistence is represented by ARCH-style modeling. Details about the application of ARCH models in finance can be found in Bollerslev, Chou and Kroner (1992).

Third, in order to characterize the nonlinear dependence in financial data, we allow $f$ to be nonlinear. Scheinkman and LeBaron (1989) find evidence that indicates the presence of nonlinear dependence in weekly returns on the CRSP value-weighted index. Ait-Sahalia (1996) finds nonlinearity in the drift function of the interest rate and concludes that “the principal source of rejection of existing (linear drift) models is the strong nonlinearity of the drift”. We keep the functional form of $f$ flexible and, as a result, we can capture the nonlinear dependence in financial data.

In this section, we discuss the trading strategies and trading rules for pairs trading. In this paper, a trading strategy is the method of buying and selling assets in markets based on the estimate of the unobservable spread. A trading rule is the set of predefined values that generate the trading signal for a specific trading strategy with an investing objective.
To implement a strategy and rule for pairs trading, we need the following: (i) parameter estimates for model (1)-(2), (ii) an estimate of the spread, and (iii) a choice of a specific strategy and the optimal trading rule. We discuss these aspects in this section. More specifically, in Section 3.1, we present an algorithm for filtering the unobservable spread and estimating the parameters. In Section 3.2, we discuss two benchmark trading strategies. In Section 3.3, we present and compare three popular trading rules associated with the benchmark trading strategies. In Section 3.4, we propose a new trading strategy. In this new trading strategy, we change the way we open or close a trade, and we discuss the benefit of this new strategy compared with the benchmark strategies. Since the existing trading rules cannot simply be applied to model (1)-(2), we propose a new approach to calculate the optimal trading rule based on simulation of the spread. The details of this simulation-based method are in Section 3.5. In Section 3.6, we summarize the procedure of pairs trading. This procedure can be applied to pairs trading with all of the trading strategies and trading rules discussed in this paper.
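
Before turning to estimation, it may help to see what data generated by model (1)-(2) look like. The following is a minimal simulation sketch, not the paper's code. It assumes a linear drift $f$, an ARCH(1)-type scale $g$ and Student's t innovations; the function names, the parameter ordering `theta = (theta0, theta1, theta2, theta3)` and all numerical values in the usage lines are hypothetical choices made only for illustration.

```python
import math
import random


def f_linear(x, theta0, theta1):
    """Linear mean-reverting drift f(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def g_arch1(x, theta2, theta3):
    """ARCH(1)-type scale g(x) = sqrt(theta2 + theta3 * x^2)."""
    return math.sqrt(theta2 + theta3 * x * x)


def student_t(rng, nu):
    """Draw a Student's t variate as Z / sqrt(chi2_nu / nu)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(nu / 2.0, 2.0)  # chi-square with nu degrees of freedom
    return z / math.sqrt(chi2 / nu)


def simulate_pair(p_b, gamma, theta, sigma_eps, nu, x0=0.0, seed=0):
    """Simulate the spread x_t and the price P_A,t given a path of P_B,t.

    Assumption: phi = 0 in equation (1), as in the identification
    restriction stated in the text. All parameter values are illustrative.
    """
    rng = random.Random(seed)
    theta0, theta1, theta2, theta3 = theta
    x = x0
    spread, p_a = [], []
    for pb in p_b:
        spread.append(x)
        # Observation equation (1): P_A = gamma * P_B + x + eps.
        p_a.append(gamma * pb + x + rng.gauss(0.0, sigma_eps))
        # State equation (2): x' = f(x) + g(x) * eta.
        eta = student_t(rng, nu)
        x = f_linear(x, theta0, theta1) + g_arch1(x, theta2, theta3) * eta
    return spread, p_a


# Usage with made-up inputs: a slowly drifting P_B and hypothetical parameters.
p_b = [50.0 + 0.01 * t for t in range(500)]
spread, p_a = simulate_pair(p_b, gamma=1.2, theta=(0.0, 0.9, 0.01, 0.1),
                            sigma_eps=0.05, nu=5)
```

Swapping `f_linear`, `g_arch1` or `student_t` for another choice from the list of candidate specifications changes the model without touching the simulation loop, which is the sense in which the framework is meant to be generic.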

For a specification of model (1)-(2), we run the following Quasi Monte Carlo Kalman filter algorithm for nonlinear and non-Gaussian state space models to estimate the unobservable spread and the unknown parameters of the model, based on the observations $\{P_{A,t}, P_{B,t}\}_{t=0}^{T}$. Suppose the initial spread $x_0$ follows $N(\mu_0, \Sigma_0)$ for any reasonable choices of $\mu_0$ and $\Sigma_0$.

• Step 1: For a non-Gaussian density $p(\eta_t)$, we use a Gaussian mixture density to approximate its pdf and denote the approximation as
$$\tilde{p}(\eta_t) = \sum_{i=1}^{m} \alpha_i\, \phi(\eta_t - a_i, P_i), \qquad \sum_{i=1}^{m} \alpha_i = 1,$$
where $\phi$ is the Gaussian pdf defined by
$$\phi(v, \Sigma) = \frac{1}{(2\pi)^{1/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2} v^{T} \Sigma^{-1} v\right).$$
To get this approximation, we determine the values of $\{\alpha_i, a_i, P_i\}_{i=1}^{m}$ by minimizing the relative entropy between the true density $p(\eta_t)$ and its approximation $\tilde{p}(\eta_t)$. The relative entropy is defined by
$$H(p \mid \tilde{p}) = \int \left(\log \frac{p(\eta)}{\tilde{p}(\eta)}\right) p(\eta)\, d\eta.$$
If $\eta_t$ is Gaussian, then this step can be dropped.

• Step 2: Generate a Box-Muller transformed Halton sequence $\{x_t^{(g)}\}_{g=1}^{G}$ of size $G$ from $\phi(x_t - b_s^{t}, P_s^{t})$. Compute and store
$$Q_i^{t+1} = \frac{1}{G} \sum_{g=1}^{G} \left[\left(f(x_t^{(g)}) - c_i^{t+1}\right)^2 + \left(g(x_t^{(g)})\right)^2 P_k\right]$$
and
$$c_i^{t+1} = \frac{1}{G} \sum_{g=1}^{G} \left[f(x_t^{(g)}) + g(x_t^{(g)})\, a_k\right].$$
When $t = 0$, $\{x_0^{(g)}\}_{g=1}^{G}$ is sampled from $N(\mu_0, \Sigma_0)$.

• Step 3: Repeat Step 2 for $s = 1, 2, \ldots, J_{t+1}$, $J_{t+1} = m^{t}$, and $k = 1, \ldots, m$, and store $c_i^{t+1}$ and $Q_i^{t+1}$ for $i = 1, 2, \ldots, I_{t+1}$, $I_{t+1} = J_{t+1} \cdot m = m^{t+1}$.

• Step 4: Based on the results from Step 3, generate Box-Muller transformed Halton sequences $\{x_{t+1,i}^{(g)}\}_{g=1}^{G}$ from $\phi(x_{t+1} - c_i^{t+1}, Q_i^{t+1})$ for $i = 1, 2, \ldots, I_{t+1}$, $I_{t+1} = m^{t+1}$. Then generate $P_{A,t+1,i}^{(g)} = x_{t+1,i}^{(g)} + \gamma P_{B,t+1}$. Compute and store the following:
$$\bar{P}_{A,t+1,i} = \frac{1}{G} \sum_{g=1}^{G} P_{A,t+1,i}^{(g)},$$
$$V_i^{t+1} = \frac{1}{G} \sum_{g=1}^{G} \left(P_{A,t+1,i}^{(g)} - \bar{P}_{A,t+1,i}\right)^2 + \sigma_\varepsilon^2,$$
$$S_i^{t+1} = \frac{1}{G} \sum_{g=1}^{G} \left(x_{t+1,i}^{(g)} - c_i^{t+1}\right) \left(P_{A,t+1,i}^{(g)} - \bar{P}_{A,t+1,i}\right).$$

• Step 5: Compute
$$K_i^{t+1} = S_i^{t+1} \left(V_i^{t+1}\right)^{-1}, \qquad P_i^{t+1} = Q_i^{t+1} - \left(K_i^{t+1}\right)^2 V_i^{t+1}, \qquad b_i^{t+1} = c_i^{t+1} + K_i^{t+1} \left(P_{A,t+1} - \bar{P}_{A,t+1,i}\right).$$

• Step 6: Repeat Steps 4-5 for $i = 1, 2, \ldots, I_{t+1}$, $I_{t+1} = m^{t+1}$. Compute and store $\bar{x}_{t+1}$ and $\bar{P}_{t+1}$, where
$$\bar{x}_{t+1} = \sum_{i=1}^{I_{t+1}} \beta_i^{t+1} b_i^{t+1},$$
$$\bar{P}_{t+1} = \sum_{i=1}^{I_{t+1}} \beta_i^{t+1} \left(P_i^{t+1} + \left(b_i^{t+1}\right)^2\right) - \left(\sum_{i=1}^{I_{t+1}} \beta_i^{t+1} b_i^{t+1}\right)^2,$$
$$\beta_i^{t+1} = \frac{\phi\left(P_{A,t+1} - c_i^{t+1} - \gamma P_{B,t+1},\, V_i^{t+1}\right)}{\sum_{j=1}^{I_{t+1}} \phi\left(P_{A,t+1} - c_j^{t+1} - \gamma P_{B,t+1},\, V_j^{t+1}\right)}.$$

• Step 7: Repeat Steps 2-6 for $t = 0, 1, 2, \ldots, T$.

$\{\bar{x}_t\}_{t=1}^{T}$ from Step 6 is our estimate of the spread. To estimate the unknown parameters of the model, we first write the log-likelihood function as
$$L_T^{G}(\psi) \equiv \sum_{t=0}^{T} \log f^{G}(\psi; P_{A,t}, P_{B,t}) = \sum_{t=1}^{T} \log\left[\sum_{i=1}^{I_{t+1}} \frac{1}{\sqrt{2\pi |V_i^{t+1}|}} \exp\left(-\frac{\left(P_{A,t+1} - \bar{P}_{A,t+1,i}\right)^2}{2 V_i^{t+1}}\right)\right],$$
and the MLE of the unknown parameters is chosen to maximize this likelihood, that is,
$$\hat{\psi}_{MLE} = \arg\max_{\psi \in \Phi} L_T^{G}(\psi).$$

As we discussed in Section 1, the basic idea of pairs trading is to open a trade (short one asset and long the other) when the spread deviates from the equilibrium and to close the trade when the spread settles back to the equilibrium. The trading strategies for pairs trading are constructed based on this idea. We use Figure 1 and Figure 2 to illustrate two benchmark trading strategies (hereafter Strategy A and Strategy B). In Figure 1 and Figure 2, the same estimated spread is plotted as a solid line, and a preset upper boundary $U$ and a preset lower boundary $L$ are plotted as dashed lines. We will discuss how to choose the optimal $U$ and $L$ in Section 3.2. The upper boundary and lower boundary act as thresholds to determine whether the spread deviates enough from the long-term equilibrium, and we use these two criteria to open a trade. Also, a preset value $C$ acts as a threshold to determine whether the spread has settled back to the long-term equilibrium, and we use this criterion to close a trade. In this paper, we take $C$ as the mean of the spread, and plot it as a solid green line in both Figure 1 and Figure 2.

Figure 1: Trading Strategy A

In Strategy A (illustrated in Figure 1), a trade is opened at $t_1$ when the spread is higher than or equal to $U$. In this case, we sell 1 share of stock A and buy $\gamma$ shares of stock B. At $t_1'$, when the spread is less than or equal to the mean (i.e., $C$), we close the trade and clear the position. The return from this trade is thus $U - C$. At $t_2$, when the spread is less than or equal to $L$, we open a trade by buying 1 share of stock A and selling $\gamma$ shares of stock B. We close this trade and clear the position at $t_2'$ when the spread is higher than or equal to the mean.
The return from this trade is $C - L$.

In Strategy B (illustrated in Figure 2), we open a trade when the spread crosses the upper boundary from below (e.g., at $t_1$) or crosses the lower boundary from above (e.g., at $t_2$). Unlike in Strategy A, we hold the portfolio until we need to switch the position. Thus, in Strategy B, we clear the exposure at the same time as we open a new trade (i.e., the closing time $t'$ of one trade and the opening time $t$ of the next coincide).

In the implementation of pairs trading, the trading rule for a specific trading strategy is the computation of the optimal thresholds $U$ and $L$ for that strategy to fulfill an investing objective (the investing objective can vary, for example maximizing the expected cumulative return or maximizing the Sharpe ratio). There are three popular approaches for computing the optimal thresholds $U$ and $L$ when model (2) is linear, homoscedastic and Gaussian (i.e., $f$ is linear, $g$ is a constant and $\eta$ is Gaussian noise).

• Rule I: Ad hoc boundaries

Rule I takes $U$ to be one (1-$\sigma$ rule) or two (2-$\sigma$ rule) standard deviations above the mean, $L$ to be one or two standard deviations below the mean, and $C$ to be the mean of the spread. This rule is simple and popular in practice. In particular, the 2-$\sigma$ rule was first applied by Gatev et al. (2006) and later checked by Moura et al. (2016), Zeng and Lee (2014) and Cummins and Bucca (2012). The 1-$\sigma$ rule was discussed in Zeng and Lee (2014), and the performance of the 1-$\sigma$ rule and the 2-$\sigma$ rule was compared in the same paper.

• Rule II: Boundaries based on the first-passage time

This rule was first adopted by Elliott et al. (2005) and later by Moura et al. (2016). Suppose $Z_t$ follows a standardized Ornstein–Uhlenbeck process:
$$dZ_t = -Z_t\, dt + \sqrt{2}\, dW_t.$$
Let $T_{0,Z_0}$ be the first passage time of $Z_t$:
$$T_{0,Z_0} = \inf\{t \geq 0,\ Z(t) = 0 \mid Z(0) = Z_0\}.$$
$T_{0,Z_0}$ has a pdf known explicitly:
$$f_{0,Z_0}(t) = \sqrt{\frac{2}{\pi}}\, \frac{|Z_0|\, e^{-t}}{\left(1 - e^{-2t}\right)^{3/2}} \exp\left(-\frac{Z_0^2\, e^{-2t}}{2\left(1 - e^{-2t}\right)}\right).$$
$f_{0,Z_0}(t)$ is maximized at $t^*$ given by:
$$t^* = \frac{1}{2} \ln\left[1 + \frac{1}{2}\left(\sqrt{\left(Z_0^2 - 3\right)^2 + 4 Z_0^2} + Z_0^2 - 3\right)\right].$$
Here $t^*$ is the most likely time, given the value of the current spread, at which the spread will settle back to the mean. In model (2), if the spread $x$ follows a (discrete-time) Ornstein–Uhlenbeck process, then we can first standardize $x$, and the above formula for $t^*$ can then be used to construct the optimal $C$. A similar idea can be applied to compute the optimal upper boundary $U$ and lower boundary $L$.

• Rule III: Boundaries based on the renewal theorem

This rule was first proposed by Bertram (2010), and then extended by Zeng and Lee (2014). In this rule, each trading cycle is separated into two parts, where $\tau_1$ denotes the time from taking a (long or short) position to clearing the position, and $\tau_2$ denotes the time from clearing the position to opening the next trade. That is,
$$\tau_1 = \inf\{t;\ \hat{x}_t = C \mid \hat{x}_0 = U\}, \qquad \tau_2 = \inf\{t;\ \hat{x}_t = U \mid \hat{x}_0 = C\}.$$
Suppose $T$ is the total trading duration we have for a pair, and $N_T$ is the number of transactions we can have in the period $[0, T]$. Then, by the renewal theorem, the return per unit time is given by:
$$(U - C) \lim_{T \to \infty} \frac{E(N_T)}{T} = \frac{U - C}{E(\tau_1 + \tau_2)},$$
where $E(\tau_1)$ and $E(\tau_2)$ can be computed based on the density of the first passage time mentioned in Rule II.

The problem with this rule, as Zeng and Lee (2014) have pointed out, is that when there is no transaction cost, it implies that $U$ (and $L$) will be arbitrarily close to $C$. This means that the trader values the trading frequency more than the profit per trade. Consequently, this could increase the risk of the portfolio significantly.

We summarize the new trading strategy (hereafter Strategy C) in Figure 3.
The basic idea of Strategy C is similar to both Strategy A and Strategy B: open a trade when the spread is far away from the equilibrium and close the trade when the spread settles back to the equilibrium. Unlike in Strategy A and B, in Strategy C we open a trade when the spread crosses the upper boundary from above (or crosses the lower boundary from below), and we clear the position when the spread crosses the mean, or crosses the boundaries ($U$ and $L$) after a trade has been opened (i.e., the spread crosses the upper boundary from below or the lower boundary from above). For example, in Figure 3a for a homoscedastic model, we open a trade at $t_1$, $t_2$, $t_3$ and $t_4$, and we clear the exposure at $t_1'$, $t_2'$, $t_3'$ and $t_4'$. In Figure 3b for a heteroscedastic model, we open a trade at $t_1$ and $t_2$, and we close the trade at $t_1'$ and $t_2'$.

We now discuss the properties of this trading strategy when model (2) is homoscedastic (i.e., the $g$ function is constant) and when it is heteroscedastic (i.e., $g$ is a general function). In the first situation, the main benefit of Strategy C is that we avoid holding the portfolio when the spread is larger than the upper boundary (or smaller than the lower boundary). This significantly decreases the risk and drawdown of the portfolio. The main drawback of Strategy C is that the return can be lower, because we open the trade when the spread is closer to its mean than in Strategy A. Therefore, there is a tradeoff between risk and return. In the situation where model (2) is heteroscedastic, this strategy can not only reduce the risk but also improve the return. This is because the opening of a trade now depends on the level of the volatility and, as a result, the boundaries are no longer constant over time. The logic of this new strategy is illustrated in Figure 3a and 3b, for the homoscedastic and heteroscedastic cases, respectively.
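
The opening and closing conditions of the three strategies can be sketched as position-generating functions. This is an illustrative reading of the verbal descriptions above, not code from the paper. The position convention is an assumption introduced here: `-1` means short the spread (sell A, buy $\gamma$ shares of B), `+1` means long the spread, and `0` means no position. Constant boundaries are used, which corresponds to the homoscedastic case only.

```python
def strategy_a(spread, upper, lower, close):
    """Strategy A: open at or beyond a boundary, close at the mean."""
    pos, out = 0, []
    for x in spread:
        if pos == 0:
            if x >= upper:
                pos = -1          # spread too high: short the spread
            elif x <= lower:
                pos = 1           # spread too low: long the spread
        elif pos == -1 and x <= close:
            pos = 0
        elif pos == 1 and x >= close:
            pos = 0
        out.append(pos)
    return out


def strategy_b(spread, upper, lower):
    """Strategy B: enter on an outward crossing and hold until the switch."""
    pos, out, prev = 0, [], None
    for x in spread:
        if prev is not None:
            if prev < upper <= x:      # crosses U from below
                pos = -1
            elif prev > lower >= x:    # crosses L from above
                pos = 1
        out.append(pos)
        prev = x
    return out


def strategy_c(spread, upper, lower, close):
    """Strategy C: enter when the spread re-enters the band, exit at the
    mean or when the spread leaves the band again."""
    pos, out, prev = 0, [], None
    for x in spread:
        if prev is not None:
            if pos == 0:
                if prev > upper >= x:      # crosses U from above
                    pos = -1
                elif prev < lower <= x:    # crosses L from below
                    pos = 1
            elif pos == -1 and (x <= close or x > upper):
                pos = 0
            elif pos == 1 and (x >= close or x < lower):
                pos = 0
        out.append(pos)
        prev = x
    return out


# Usage on a short made-up path with U = 2, L = -2, C = 0.
path = [0.0, 1.5, 2.5, 1.8, 0.5, -0.2, -2.5, -1.5, 0.1]
print(strategy_a(path, 2.0, -2.0, 0.0))
print(strategy_b(path, 2.0, -2.0))
print(strategy_c(path, 2.0, -2.0, 0.0))
```

On this path Strategy C enters one step later than Strategy A on both sides, once the spread has turned back inside the band, which illustrates the risk and return tradeoff described above.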
For a general specification of model (1)-(2), the conventional trading rules in Section 3.2 are difficult to apply. For example, the 1-$\sigma$ rule or 2-$\sigma$ rule cannot be applied when model (2) is heteroscedastic; for a complicated specification of model (2), it is impossible to derive the density of the first passage time explicitly, so Rule II and Rule III are unavailable in this case.

To compute the optimal trading rule under model (2) for all of the trading strategies, we propose to select the optimal boundaries ($U$ and $L$; we set $C$ to the mean of the spread by default) based on Monte Carlo simulation of the spread (equation (2), given the estimates of the unknown parameters). Different criteria or investing objectives, such as the expected return, Sharpe ratio or Calmar ratio, could be used to determine the optimal boundaries for a given trading strategy.

The maximum drawdown underlying the Calmar ratio is defined as follows. Let $CR_{a,t}$ be the cumulative return of portfolio $a$ at time $t$. We define the maximum drawdown of the cumulative return across time 0 to $T$ as $MD_{a,T}$:
$$MD_{a,T} = \sup_{t \in [0,T]} \left[\sup_{\tau \in [0,t]} CR_{a,\tau} - CR_{a,t}\right].$$

Figure 3: (a) Trading Strategy C in Homoscedastic Model; (b) Trading Strategy C in Heteroscedastic Model

Now we use the following four specifications of model (2) to describe the details of the computation of the new trading rules.

• Model 1: $x_{t+1} = 0.[\ldots]\, x_t + 0.[\ldots]\, \eta_t$, $\eta_t \sim N(0, [\ldots])$
• Model 2: $x_{t+1} = 0.[\ldots]\, x_t + 0.[\ldots]\, x_t^2 + 0.[\ldots]\, \eta_t$, $\eta_t \sim N(0, [\ldots])$
• Model 3: $x_{t+1} = 0.[\ldots]\, x_t + \sqrt{[\ldots] + [\ldots]\, x_t^2}\; \eta_t$, $\eta_t \sim N(0, [\ldots])$
• Model 4: $x_{t+1} = 0.[\ldots]\, x_t + \frac{[\ldots]}{\sqrt{[\ldots]}}\, \eta_t$, $\eta_t \sim t_{[\ldots]}$

Model 1 is a linear, homoscedastic, and Gaussian model. This is the most popular model used for pairs trading. See Elliott et al. (2005) and Moura et al. (2016) for examples of this model. Model 2 is a nonlinear model, Model 3 is a heteroscedastic model, and Model 4 is a non-Gaussian model. The last three models are different extensions of Model 1 and have never been discussed in the literature on pairs trading.
These four models can be considered as the benchmark models for pairs trading. Further extensions are available based on combinations of these four models, and our simulation-based method for the optimal trading rule can also be applied to them.

For every specification of Model 1-4, we calculate the optimal trading rules through $N$ simulations of the spread for Strategy A, B and C respectively, and compare the resulting performances of the three strategies based on the expected return and the Sharpe ratio. More specifically, across all of the examples, we represent the optimal trading rule (upper boundary $U$ and lower boundary $L$) as a multiple of one standard deviation of the spread, and we consider the upper boundary $U$ in $[0.1, 2.5]$ and the lower boundary $L$ in $[-2.5, -0.1]$ with a grid size of 0.1.

For every specification of Model 1-4 and every realization of the spread process $\{x_t^{(m,n)}\}_{t=0}^{T}$, where $m = 1, 2, 3, 4$ and $n = 1, \ldots, N$, we choose $U_i$ from $[0.1, 2.5]$ and $L_j$ from $[-2.5, -0.1]$, where $i, j = 1, \ldots, 25$, and compute the resulting cumulative return and Sharpe ratio for the different strategies. More specifically, we denote the cumulative return and Sharpe ratio as $CR_{i,j}^{m,k,n}$ and $SR_{i,j}^{m,k,n}$ respectively, where $m$ indexes the models, $k$ indexes the strategies and $n$ indexes the realizations of the spread in the simulation. For Model $m$ and strategy $k$, the resulting expected cumulative return $CR_{i,j}^{m,k}$ and Sharpe ratio $SR_{i,j}^{m,k}$ are computed as
$$CR_{i,j}^{m,k} = \frac{1}{N} \sum_{n=1}^{N} CR_{i,j}^{m,k,n}, \qquad SR_{i,j}^{m,k} = \frac{1}{N} \sum_{n=1}^{N} SR_{i,j}^{m,k,n}.$$

$(U^*_{m,k}, L^*_{m,k})$ is selected to maximize $CR_{i,j}^{m,k}$ or $SR_{i,j}^{m,k}$, that is,
$$\left[U^*_{m,k}, L^*_{m,k}\right] = \arg\max_{U_i, L_j} z_{i,j}^{m,k},$$
where $z = CR$ or $SR$. Across all of the examples, we set the total trading period to 1000 trading days (approximately four years), and we set the simulation size to $N = 10000$. For simplicity, we assume the transaction cost is 20 bp (0.2%), and the annualized risk-free rate is set to 0. (This transaction cost is on one asset of the pair. Since a complete trade includes transactions on two assets, the total transaction cost of one complete trade is 40 bp.)

The Calmar ratio mentioned above is defined in a similar way to the Sharpe ratio:
$$\mathrm{Calmar}_a \equiv \frac{E(R_a)}{MD_{a,T}},$$
where $E(R_a)$ is the expected return of portfolio $a$.

In Table 1, we report the optimal trading rule for every combination of the 4 models and 3 strategies, and the resulting expected cumulative return and Sharpe ratio. (If the spread and the strategy are symmetric around the mean, then the optimal upper boundary and lower boundary should also be symmetric around the mean, i.e., $U^* = -L^*$. However, due to the approximation error of the grid, the absolute values of $U^*$ and $L^*$ may not be exactly the same in Table 1.) As this table shows, Strategy C outperforms the other two strategies in both the cumulative return and the Sharpe ratio when the model is heteroscedastic; for the homoscedastic models (Model 1, 2 and 4), the Sharpe ratio of Strategy C is competitive, although the cumulative return is not. This supports our discussion of this new strategy in Section 3.3.

We leave the detailed results of the simulation method to the appendix. More precisely, the expected cumulative returns and Sharpe ratios as functions of various choices of $U$ and $L$ are given in Figure A1-A4 for every possible combination of the three strategies and four models. The return is displayed as a number, not as a percentage, in all figures.

We are now in a position to summarize the procedure for pairs trading based on model (1)-(2) and conclude this section.

• Step 1: Choose a specific model for (1)-(2). Given this model and the observations $\{P_{A,t}, P_{B,t}\}_{t=0}^{T}$, we run the Quasi Monte Carlo Kalman filter (QMCKF) and get the filtered estimate of the spread $\{\bar{x}_t\}_{t=0}^{T}$ and the estimate $\hat{\psi}$ of the unknown parameters of the model. The details of running QMCKF have been discussed in Section 3.1.

• Step 2: Choose a trading strategy, and determine the optimal trading rule (the optimal $U$ and $L$) for a specific criterion by Monte Carlo simulation, using the data up to time $T$. The details of this step can be found in Section 3.2-3.5.

• Step 3: For $t > T$, we run QMCKF and estimate $\bar{x}_t$ with $\psi = \hat{\psi}$, the parameter estimate obtained in Step 1. We use this $\{\bar{x}_t\}_{t>T}$ and follow the preset trading strategy and optimal trading rule from Step 2 to generate the trading signal.

Table 1

Model     Strategy   U*     L*     CR        U*     L*     SR
Model 1   A          0.7    -0.7   0.2508    1.1    -1     0.0573
Model 1   B          0.5    -0.5   0.2745    0.5    -0.5   0.0522
Model 1   C          1      -1     0.1934    0.9    -0.9   0.0679
Model 2   A          0.8    -0.8   0.2749    1.2    -1.3   0.1302
Model 2   B          0.6    -0.6   0.3016    0.6    -0.6   0.1198
Model 2   C          1.2    -1.3   0.1640    1.2    -1.3   0.1162
Model 3   A          0.3    -0.2   3.9413    0.4    -0.4   0.0751
Model 3   B          0.1    -0.1   4.0139    0.1    -0.1   0.0743
Model 3   C          0.8    -0.8   6.6763    0.1    -0.1   0.2499
Model 4   A          0.6    -0.6   0.3792    1      -1     0.0881
Model 4   B          0.4    -0.5   0.4071    0.5    0.5    0.0782
Model 4   C          1      -1     0.2243    1      -1     0.0829

Note: The third and fourth columns are the optimal upper boundary and lower boundary based on maximizing the cumulative return, and the fifth column is the resulting cumulative return. The sixth and seventh columns are the optimal upper boundary and lower boundary based on maximizing the Sharpe ratio, and the eighth column is the resulting Sharpe ratio. The cumulative return is displayed as a number, not as a percentage.
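
The simulation-based selection of the trading rule can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the procedure that produced Table 1: the spread is a linear Gaussian process with hypothetical coefficients `rho` and `scale`, only Strategy A is evaluated, the search is restricted to symmetric boundaries $U = -L = k\sigma$ on the grid $k = 0.1, \ldots, 2.5$, the objective is the average cumulative profit in spread units, `cost` is a hypothetical per-round-trip charge, and positions still open at the end of a path are ignored.

```python
import math
import random


def simulate_ar1(n_steps, rho, scale, rng):
    """One path of x_{t+1} = rho * x_t + scale * eta_t, eta_t ~ N(0, 1)."""
    x, path = 0.0, []
    for _ in range(n_steps):
        path.append(x)
        x = rho * x + scale * rng.gauss(0.0, 1.0)
    return path


def strategy_a_return(path, upper, lower, close, cost):
    """Cumulative profit of Strategy A on one path, in spread units."""
    pos, entry, total = 0, 0.0, 0.0
    for x in path:
        if pos == 0:
            if x >= upper:
                pos, entry = -1, x      # short the spread
            elif x <= lower:
                pos, entry = 1, x       # long the spread
        elif (pos == -1 and x <= close) or (pos == 1 and x >= close):
            total += pos * (x - entry) - cost   # realized profit net of cost
            pos = 0
    return total


def optimal_boundary(rho, scale, n_paths=200, n_steps=250, cost=0.01, seed=0):
    """Grid search for the multiple k of the spread's standard deviation
    that maximizes the average simulated profit, with U = -L = k * sd."""
    rng = random.Random(seed)
    paths = [simulate_ar1(n_steps, rho, scale, rng) for _ in range(n_paths)]
    sd = scale / math.sqrt(1.0 - rho * rho)   # stationary std. dev. of the AR(1)
    grid = [round(0.1 * i, 1) for i in range(1, 26)]   # 0.1, 0.2, ..., 2.5
    best_k, best_val = None, -math.inf
    for k in grid:
        val = sum(strategy_a_return(p, k * sd, -k * sd, 0.0, cost)
                  for p in paths) / n_paths
        if val > best_val:
            best_k, best_val = k, val
    return best_k, best_val


# Usage with hypothetical coefficients.
k_star, expected_profit = optimal_boundary(rho=0.9, scale=0.1)
print(k_star, expected_profit)
```

The same set of simulated paths is reused for every grid point, so differences between grid points reflect the boundaries rather than simulation noise. Replacing `strategy_a_return` with a Sharpe-ratio or Calmar-ratio evaluator, or searching over $U$ and $L$ separately, changes only the objective and the grid.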

In this section, we test the performance of pairs trading through nonlinear and non-Gaussian state space modeling for different trading strategies. Across all of the applications in this section, we assume the transaction cost is 20 bp and the annualized risk-free rate is 2%, and we test the performance of Strategy A, B and C for two specifications of model (2):

• Model I: $x_{t+1} = \theta_0 + \theta_1 x_t + \theta_2\, \eta_t$, $\eta_t \sim N(0, 1)$
• Model II: $x_{t+1} = \theta_0 + \theta_1 x_t + \sqrt{\theta_2 + \theta_3 x_t^2}\; \eta_t$, $\eta_t \sim N(0, 1)$

In this example, we examine the performance of pairs trading for PEP (Pepsi) and KO (Coca-Cola). The data are the daily adjusted closing prices of PEP and KO from 01/03/2012 to 06/28/2019.

Table 2 reports the parameter estimates of both Model I and Model II for this pair. The trading signal for Model I is given in Figure A5 and that for Model II is given in Figure A6, and the annualized performance (annualized return, annualized Std Dev, annualized Sharpe ratio, Calmar ratio, and annualized Pain index) is given in Table 3. The plots of the cumulative return and drawdown of every strategy over the whole trading period for both models are given in Figure A7 and A8. In Model II, the annualized return of Strategy C is more than 50% higher than those of Strategy A and B, while the risk of Strategy C (measured by annualized Std Dev) is roughly 40% lower than that of Strategy A or B. Comparing the Sharpe ratio, Calmar ratio and Pain index shows that this improvement is significant. In contrast, the difference in the performance of Strategy A and Strategy B across the two models is limited. This implies that the effect of modelling heteroskedasticity on the performance of Strategy A and B is not significant. This is because in Strategy A and B, the hedging portfolio is held until the spread is around the mean, so the frequency of changing positions is lower in Strategy A or B than in Strategy C.
This can be easily confirmed by counting the number of trades in Figure A5 and Figure A6.

Table 2: Parameter estimation of Model I and Model II on PEP vs KO

              Model I    Model II
  γ
  σ_ε
  θ_0         -0.0001    -0.001
  θ_1
  θ_2
  θ_3         -          0.1283

Table 3: Annualized Performance of Pairs Trading on PEP vs KO

                          Return    Std Dev   Sharpe    Calmar    Pain index
  Strategy A, Model I     0.1311    0.0988    1.1003    1.3742    0.0195
  Strategy B, Model I     0.1385    0.1153    1.0052    1.2204    0.0334
  Strategy C, Model I     0.0618    0.0534    0.7649    0.8243    0.0087
  Strategy A, Model II    0.1340    0.1038    1.0751    1.4040    0.0200
  Strategy B, Model II    0.1407    0.1139    1.0366    1.2398    0.0258
  Strategy C, Model II    0.2186    0.0659    2.9518    8.2384    0.0030

Note: The data is from 01/03/2012 to 06/28/2019. The return is displayed as a number, not as a percentage.

In this example, we examine the performance of pairs trading for EWT and EWH. The data are the daily observations of adjusted closing prices of EWT and EWH from 01/01/2012 to 05/01/2019. EWT is the iShares MSCI Taiwan ETF managed by BlackRock, which seeks to track the investment results of an index composed of Taiwanese equities, and EWH is its counterpart for Hong Kong equities. Following the example of PEP vs KO, we test the performance of Strategies A, B and C for Model I and Model II. We report the parameter estimates in Table 4 and the trading signals in Figure A9 and Figure A10. Comparing the annualized performance in Table 5, we find that modeling heteroskedasticity improves the performance of Strategy C significantly, while it has no effect on Strategy A or B. Also, the riskiness of Strategy B (small Sharpe ratio and Calmar ratio and high annualized standard deviation) is confirmed again in this example. We also plot the cumulative return and drawdown of every strategy through the whole trading period for both models in Figures A11 and A12.

Table 4: Parameter estimation of Model I and Model II on EWT vs EWH

              Model I    Model II
  γ
  σ_ε
  θ_0         -0.0004    -0.0015
  θ_1
  θ_2
  θ_3         -          0.1136

We use this example to illustrate the improvement brought by our new modeling and strategy by implementing pairs trading on US banks listed on the NYSE during 01/01/2013-01/10/2019. To avoid data snooping and make our results more concrete, we use a simple way to choose assets and construct pairs. More precisely, based on market capitalization, we select the 5 largest banks to form the group of large banks and the 5 smallest banks to form the group of small banks. The large bank group includes JPM, BAC, WFC, C and USB, and the small bank group includes CPF, BANC, CUBI, NBHC and FCF. We compare the performance of Model I combined with Strategy A with that of Model II combined with Strategy C. Model I combined with Strategy A is a popular approach in the existing literature on pairs trading, so it serves as a good benchmark for comparison.

(JPM is JPMorgan Chase & Co.; BAC is Bank of America Corporation; WFC is Wells Fargo & Company; C is Citigroup Inc.; USB is U.S. Bancorp.)

In Table A1, we report the performance of these two approaches on the 10 pairs among the large banks. The performance on the 10 pairs among the small banks is given in Table A2. Model II combined with Strategy C outperforms Model I combined with Strategy A on almost all of the pairs, in terms of both annualized return and annualized Sharpe ratio, and its improvement in Sharpe ratio is much larger than that in return. For example, when trading is implemented on pairs among large banks, the improvement in return is 41.29% and the improvement in Sharpe ratio is 89.23%; when trading is implemented on pairs among small banks, the improvement in return is 74.41% and the improvement in Sharpe ratio is 151.8%.

Also, comparing the results in Tables A1 and A2, we find that pairs among small banks perform better than pairs among large banks, whether Model I combined with Strategy A or Model II combined with Strategy C is applied. For example, under Model I combined with Strategy A, the mean return over all pairs among large banks is 0.0703, while that among small banks is 0.1524; under Model II combined with Strategy C, switching from large banks to small banks raises the mean return by 0.1664 (from 0.0994 to 0.2658).
This is because the prices of small banks are more volatile than those of large banks, and thus the spread between small banks is more volatile than that between large banks.

In Table A3, we report the performance of the two approaches of pairs trading on all possible

(CPF is CPB Inc.; BANC is Banc of California, Inc.; CUBI is Customers Bancorp, Inc.; NBHC is National Bank Holdings Corporation; FCF is First Commonwealth Financial Corporation.)
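The improvement percentages quoted in this section can be checked against the definition given in the note to Table A1, namely the difference between the two approaches divided by the absolute value of the benchmark. The following is a minimal sketch of that arithmetic; the function name `improvement_pct` is ours and purely illustrative, and the inputs are the rounded mean annualized returns quoted in the text.

```python
def improvement_pct(new, old):
    """Relative improvement of `new` over the benchmark `old`, in percent.

    Implements 100 * (new - old) / |old|, the definition stated in the
    note to Table A1 (benchmark: Model I + Strategy A).
    """
    if old == 0:
        raise ValueError("improvement is undefined when the benchmark is zero")
    return 100.0 * (new - old) / abs(old)


# Mean annualized returns quoted in the text:
# Model I + Strategy A (benchmark) versus Model II + Strategy C.
small_banks = improvement_pct(0.2658, 0.1524)
large_banks = improvement_pct(0.0994, 0.0703)

print(f"small banks: {small_banks:.2f}%")
print(f"large banks: {large_banks:.2f}%")
```

With these rounded inputs the small-bank figure comes out at 74.41%, matching the text, while the large-bank figure comes out at about 41.39% rather than the reported 41.29%; the gap may simply reflect rounding in the reported means. Dividing by the absolute value of the benchmark keeps the sign of the improvement meaningful when the benchmark return or Sharpe ratio is negative, as happens in several out-of-sample cases.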

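To make the difference between the two specifications compared throughout this section more concrete, the sketch below simulates only the state equation of the spread, assuming Model I has a constant innovation scale $\theta_2$ and Model II has the state-dependent scale $\sqrt{\theta_2 + \theta_3 x_t^2}$. The function name `simulate_spread`, the seed and all parameter values are illustrative assumptions, not the paper's estimates, and the observation equation of model (2) is not included.

```python
import math
import random


def simulate_spread(theta, n_steps, x0=0.0, seed=0):
    """Simulate x_{t+1} = th0 + th1 * x_t + s(x_t) * eta_t, eta_t ~ N(0, 1).

    theta = (th0, th1, th2)       -> Model I,  s(x) = th2 (homoskedastic)
    theta = (th0, th1, th2, th3)  -> Model II, s(x) = sqrt(th2 + th3 * x**2)
    """
    rng = random.Random(seed)
    th0, th1, th2 = theta[:3]
    th3 = theta[3] if len(theta) == 4 else None
    path = [x0]
    for _ in range(n_steps):
        x = path[-1]
        # Innovation scale: constant in Model I, state-dependent in Model II.
        scale = th2 if th3 is None else math.sqrt(th2 + th3 * x * x)
        path.append(th0 + th1 * x + scale * rng.gauss(0.0, 1.0))
    return path


# Hypothetical parameter values chosen only for illustration.
model_1 = simulate_spread((0.0, 0.9, 0.02), n_steps=500)
model_2 = simulate_spread((0.0, 0.9, 0.0004, 0.1), n_steps=500)

print(max(abs(x) for x in model_1), max(abs(x) for x in model_2))
```

Under these assumptions, $\theta_2$ plays the role of a standard deviation in Model I but of a variance level in Model II, so setting $\theta_3 = 0$ in Model II and squaring Model I's $\theta_2$ recovers Model I. With $\theta_3 > 0$, the innovation variance grows as the spread moves away from zero, which is the heteroskedasticity that the comparison between the two models is meant to isolate.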
Pairs trading is a statistical arbitrage strategy that involves long and short positions in overpriced and underpriced assets. Our results in this paper show that careful attention to the modeling and the trading strategy can improve the performance of pairs trading significantly, which suggests the great potential of pairs trading in financial markets. This can help empirical research on the general profitability of pairs trading and the discussion of tests of market efficiency, and we leave this for future research.
Table A1: Performance of Pairs Trading on Intergroup Pairs of Big Banks
Columns: Pair; Stock 1; Stock 2; Model I + Strategy A (Return, Sharpe); Model II + Strategy C (Return, Sharpe); Improvement in % (Return, Sharpe).
Rows: JPM/BAC, JPM/WFC, JPM/C, JPM/USB, BAC/WFC, BAC/C, BAC/USB, WFC/C, WFC/USB, C/USB, followed by Mean, Min, Max and Median.
[Numerical entries not recoverable from the source.]
Note: Return is the annualized return, displayed as a number, not as a percentage. Sharpe is the annualized Sharpe ratio. Improvement is defined as $\frac{(\text{Model II + Strategy C}) - (\text{Model I + Strategy A})}{|\text{Model I + Strategy A}|}$ for return and Sharpe ratio respectively, measured in percentage.

Table A2: Performance of Pairs Trading on Intergroup Pairs of Small Banks
Columns: as in Table A1.
Rows: CPF/BANC, CPF/CUBI, CPF/NBHC, CPF/FCF, BANC/CUBI, BANC/NBHC, BANC/FCF, CUBI/NBHC, CUBI/FCF, NBHC/FCF, followed by Mean, Min, Max and Median.
[Numerical entries not recoverable from the source.]
Note: Return is the annualized return, displayed as a number, not as a percentage. Sharpe is the annualized Sharpe ratio. Improvement is defined as in Table A1.

Table: In-Sample Performance of Pairs Trading on Intergroup Pairs of Big Banks
Columns and rows: as in Table A1.
[Numerical entries not recoverable from the source.]
Note: The data is from 01/10/2012 to 01/01/2018. Return is the annualized return, displayed as a number, not as a percentage. Sharpe is the annualized Sharpe ratio. Improvement is defined as in Table A1.

Table: Out-of-Sample Performance of Pairs Trading on Intergroup Pairs of Big Banks
Columns and rows: as in Table A1.
[Numerical entries not recoverable from the source.]
Note: The data is from 01/01/2018 to 01/12/2019. Return is the annualized return, displayed as a number, not as a percentage. Sharpe is the annualized Sharpe ratio. Improvement is defined as in Table A1.

Table: In-Sample Performance of Pairs Trading on Intergroup Pairs of Small Banks
Columns and rows: as in Table A2.
[Numerical entries not recoverable from the source.]
Note: The data is from 01/10/2012 to 01/01/2018. Return is the annualized return, displayed as a number, not as a percentage. Sharpe is the annualized Sharpe ratio. Improvement is defined as in Table A1.

Table: Out-of-Sample Performance of Pairs Trading on Intergroup Pairs of Small Banks
Columns and rows: as in Table A2.
[Numerical entries not recoverable from the source.]
Note: The data is from 01/01/2018 to 01/12/2019. Return is the annualized return, displayed as a number, not as a percentage. Sharpe is the annualized Sharpe ratio. Improvement is defined as in Table A1. The returns for CUBI/NBHC and CUBI/FCF are 0 because no trade is opened for these two pairs during the out-of-sample period, and the Sharpe ratios are undefined.

[Figures for Models 1, 2, 3 and 4, each with six panels: (a) Return of Strategy A; (b) Sharpe Ratio of Strategy A; (c) Return of Strategy B; (d) Sharpe Ratio of Strategy B; (e) Return of Strategy C; (f) Sharpe Ratio of Strategy C.]

Figure A5: Trading signal of Strategy A, B and C on PEP vs KO for Model I
[Three panels: Trading Signal for Strategy A, Strategy B and Strategy C under Model I; horizontal axis: dates from Jan 03 2012 to Jun 28 2019.]
Note: When the trading signal is 1, we short PEP and long KO; when the trading signal is -1, we short KO and long PEP; when the trading signal is 0, we clear the position and hold no asset.

Figure A6: Trading signal of Strategy A, B and C on PEP vs KO for Model II
[Three panels: Trading Signal for Strategy A, Strategy B and Strategy C under Model II; horizontal axis: dates from Jan 03 2012 to Jun 28 2019.]
Note: When the trading signal is 1, we short PEP and long KO; when the trading signal is -1, we short KO and long PEP; when the trading signal is 0, we clear the position and hold no asset.

Figure A7: Trading Performance of Strategy A, B and C on PEP vs KO for Model I
[Three panels: Cumulative Return, Daily Return and Drawdown; horizontal axis: dates from Jan 03 2012 to Jun 28 2019.]
Note: Black curves are the results of Strategy C; red curves are the results of Strategy B; green curves are the results of Strategy A. The Daily Return diagram is only for Strategy C.

Figure A8: Trading Performance of Strategy A, B and C on PEP vs KO for Model II
[Three panels: Cumulative Return, Daily Return and Drawdown; horizontal axis: dates from Jan 04 2012 to Jun 28 2019.]
Note: Black curves are the results of Strategy C; red curves are the results of Strategy B; green curves are the results of Strategy A. The Daily Return diagram is only for Strategy C.

Figure A9: Trading signal of Strategy A, B and C on EWT vs EWH for Model I
[Three panels: Trading Signal for Strategy A, Strategy B and Strategy C under Model I; horizontal axis: dates from Jan 03 2012 to Dec 31 2018.]
Note: When the trading signal is 1, we short EWT and long EWH; when the trading signal is -1, we short EWH and long EWT; when the trading signal is 0, we clear the position and hold no asset.

Figure A10: Trading signal of Strategy A, B and C on EWT vs EWH for Model II
[Three panels: Trading Signal for Strategy A, Strategy B and Strategy C under Model II; horizontal axis: dates from Jan 03 2012 to Dec 31 2018.]
Note: When the trading signal is 1, we short EWT and long EWH; when the trading signal is -1, we short EWH and long EWT; when the trading signal is 0, we clear the position and hold no asset.

Figure A11: Trading Performance of Strategy A, B and C on EWT vs EWH for Model I
[Three panels: Cumulative Return, Daily Return and Drawdown; horizontal axis: dates from Jan 03 2012 to Dec 31 2018.]
Note: Black curves are the results of Strategy C; red curves are the results of Strategy B; green curves are the results of Strategy A. The Daily Return diagram is only for Strategy C.

Figure A12: Trading Performance of Strategy A, B and C on EWT vs EWH for Model II
[Three panels: Cumulative Return, Daily Return and Drawdown; horizontal axis: dates from Jan 03 2012 to Dec 31 2018.]
Note: Black curves are the results of Strategy C; red curves are the results of Strategy B; green curves are the results of Strategy A. The Daily Return diagram is only for Strategy C.

Note: Black circles are the performances of Model I + Strategy A on pairs of large banks, red circles are the performances of Model I + Strategy A on pairs of small banks, black triangles are the performances of Model II + Strategy C on pairs of large banks, and red triangles are the performances of Model II + Strategy C on pairs of small banks.