A Pre-Trade Algorithmic Trading Model under Given Volume Measures and Generic Price Dynamics (GVM-GPD)

Abstract

We make several improvements to the mean-variance framework for optimal pre-trade algorithmic execution, by working with volume measures and generic price dynamics. Volume measures are the continuum analogies for discrete volume profiles commonly implemented in the execution industry. Execution then becomes an absolutely continuous measure over such a measure space, and its Radon-Nikodym derivative is commonly known as the Participation of Volume (PoV) function. The four impact cost components are all consistently built upon the PoV function. Some novel efforts are made for these linear impact models by having market signals more properly expressed. For the opportunistic cost, we are able to go beyond the conventional Brownian-type motions. By working directly with the auto-covariances of the price dynamics, we remove the Markovian restriction associated with Brownians and thus allow potential memory effects in the price dynamics. In combination, the final execution model becomes a constrained quadratic programming problem in infinite-dimensional Hilbert spaces. Important linear constraints such as participation capping are all permissible. Uniqueness and existence of optimal solutions are established via the theory of positive compact operators in Hilbert spaces. Several typical numerical examples explain both the behavior and versatility of the model.


Jackie (Jianhong) Shen

Industrial and Enterprise Systems Engineering (ISE) and Master of Science in Financial Engineering (MSFE), 216D Transportation Building, University of Illinois, Urbana, IL 61801

Dedicated to Gil Strang on the Occasion of His 80th Birthday.


Keywords: Volume, Price, Impact, Risk, Compact positive operator, Hilbert space, Existence, Uniqueness, Quadratic Programming

Email address: [email protected] (Jackie (Jianhong) Shen)

URL: http://publish.illinois.edu/jackieshen (Jackie (Jianhong) Shen)

Jackie Shen had been a Vice President for the Equities Algorithmic Trading divisions at both J.P. Morgan and Barclays, New York, before joining the University of Illinois in 2013. Readers are encouraged to first read the disclaimer statements near the end.

Preprint, September 27, 2013 (arXiv, q-fin.TR)

1. Introduction

Algorithmic trading helps institutional investors liquidate or acquire big positions without incurring much adverse impact cost or opportunistic risk. To achieve this objective, a practical model must keep screening real-time market characteristics and adapt the execution strategy accordingly. This necessarily means that a realistic trading model has to be dynamic, as in the seminal work of Bertsimas and Lo [4], and many other important ones (e.g., Huberman and Stanzl [10], Almgren [1], Bouchard, Dang, and Lehalle [5], Azencott et al. [3], just to name a few).

On the other hand, both internal and external clients commonly rely on computationally less expensive static models to get pre-trade estimations of potential costs and risks for their positions. They may also rely on such pre-trade estimations to generate alerts if the actual execution is deviating too far, often expressed via confidence intervals, and needs immediate human intervention. Furthermore, due to the mounting computational challenges associated with fully dynamic algorithms, execution houses (e.g., broker-dealers, agency services, internal execution teams in hedge funds, etc.) also typically utilize static algorithms as the core for building heuristic but much faster dynamic models.

It is for these reasons that the study of utility-function based static algorithms is still very valuable in the execution industry. The current work focuses on some major improvements of the mean-variance framework, following largely the classical works of Almgren and Chriss [2], Huberman and Stanzl [10], and other closely related works (e.g., Kissell and Malamut [11], and Obizhaeva and Wang [12]).

The core of the current work is a single continuum model independent of any interval-based discrete grids. This does not mean that an in-house implementation has to avoid an interval-based setting. Instead, a single governing continuum model has numerous advantages, including for example, (i) ensuring consistency among different interval choices, (ii) staying invariant when technology advancements (e.g., on data servers, optimizer servers, or Direct Market Access (DMA)) allow executing at higher and higher frequencies, and (iii) welcoming more computational methodologies, including for example, basis-function based methods that are free of interval grids. Imaging Science, for instance, has witnessed a blossoming decade very much thanks to similar advantages as one goes from pixel or graph based discrete models to continuum models, as the latter are independent of camera resolutions and also befriend a wealth of mathematical tools such as PDE, variational calculus, and wavelets (e.g., Chan and Shen [6]).

The two major characteristics of the proposed model are, as the title suggests, (i) treating market volumes as Borel measures over the execution horizon, and (ii) permitting generic price dynamics (including generic Bid-Ask spread dynamics), rather than being restricted to the linear or geometric Brownians prevailing in the literature.

Generic price dynamics extend beyond the Markovian nature of conventional Brownian-type price movements, and allow short-term or long-term memories. This will be explained in more detail in Section 2. Two more specific examples involving mean reversion and stochastic volatilities are also presented in Section 6 to illustrate the flexibility of the model.

In the typical reductionism approach to static modeling of algorithmic trading [2, 12], the markets have usually been approximated by the volume profile (instead of more complex signals associated with the dynamics of limit order books (LOB)). Volume measures generalize discrete volume profiles to the continuum setting. We attempt to demonstrate that measure theory and functional analysis serve as a natural foundation for the continuum modeling.
Section 3 lays out the general setting for volume measures.

The proposed execution model is built upon four impact cost components, which are all linear. This is the third characteristic of the current work. Linear cost models allow faster calibration via linear regression techniques. They also lead to a Quadratic Programming (QP) formulation for the final execution model, which can be readily solved via robust commercial QP solvers (e.g., the IBM CPLEX). With volume measures, all four impact cost components are consistently built upon the Participation of Volume (PoV) rate function, which is the Radon-Nikodym derivative of the execution measure against the market volume measure. We have introduced a volume distance for the transient impact and a cumulative volume denomination for the permanent impact. These efforts are novel to the best knowledge of the author, and make the cost models more realistic. Throughout the model building process, we have also particularly emphasized:

(a) the influence of in-house Child Order Placement strategies on modeling parameters, and
(b) the role of quantitative market makers in shaping the exact forms of impact models.

All these subjects will be further elaborated in Section 4.

The rest of the paper is devoted to the analysis and computation of the established model. In Section 5, we apply the theory of positive compact operators in Hilbert spaces to establish the existence and uniqueness of solutions to the resulting QP problem with a quadratic objective and linear constraints. A computational scheme is then proposed in Section 6, and several numerical examples are also presented to reveal the effects of various factors, including degrees of risk aversion, shapes of volume measures, dominance of individual cost components, and non-Brownian price dynamics. A feature that is also somewhat novel in the proposed model is the explicit permission of volume capping, which is popularly demanded by both internal and external clients.

2. Generic Price Dynamics

Regardless of execution styles, trading is always exposed to the overall market movements. This is especially true for large institutional orders for which immediate filling via open exchanges is infeasible. In the current work we will not touch the subject of dark pools, where it is not impossible to have a large order filled via internal crossing networks run by broker-dealers, agency services, or even exchanges.

This part of the trading cost is often termed the opportunistic cost or risk, and is directly caused by the innate price fluctuations of the target securities. Trading models have to incorporate suitable price dynamics in order to quantify the associated opportunistic cost.

Let $p_0$ denote the arrival price at time $t = 0$ on the normalized trading horizon $I = [0, 1]$ [...] National Best Bid and Offer (NBBO). The mid-quote of the NBBO has been traditionally considered as the true price of the security [12], an assumption we shall follow as well. Let $p_t$ or $p(t)$ denote the mid-quote at time $t \in I$. Define the price fluctuation to be $\delta_t = p_t - p_0$, or its homogenized version to be $\delta^\circ_t = \delta_t / p_0$. It has been commonly assumed in the literature that without impact costs, $\delta_t$ is either linearly or geometrically Brownian [2, 12], i.e.,

\[ d\delta_t = \hat\mu\,dt + \hat\sigma\,dW_t, \quad\text{or}\quad d\delta_t / p_t = \mu\,dt + \sigma\,dW_t. \]

The model in the current work does not specifically address overnight risks, and is primarily designed for intraday executions. For the intraday horizon, the geometric Brownian can be well approximated by the linear substitute:

\[ d\delta_t / p_0 = d\delta^\circ_t = \mu\,dt + \sigma\,dW_t. \]

This works well for most non-penny common stocks whose prices are sufficiently positive (e.g., above $5.00) and whose intraday price changes are no more than a few percentage points, so that the dynamic denominator $p_t$ can be well approximated by the static reference price $p_0$.

The current work, however, makes no assumptions on the specific format of the price dynamics. Instead, we only rely on the auto-covariance function: for any $t, s \in I = [0, 1]$,

\[ K^\circ_\delta(t, s) = E[\delta^\circ_t \cdot \delta^\circ_s], \quad\text{or}\quad K_\delta(t, s) = E[\delta_t \cdot \delta_s] \;\bigl(= p_0^2\, K^\circ_\delta(t, s)\bigr). \]

For the intraday horizon, empirical evidence shows that the drift component $\mu$ in the Brownian framework can be assumed zero. We shall also assume that $\delta_t$ (or $\delta^\circ_t$) is a zero-mean process.

For the kernel function $K_\delta(t, s)$, the following natural assumptions are to be made:

(i) (Symmetry) $K_\delta(t, s) = K_\delta(s, t)$, for any $t, s \in I = [0, 1]$.
(ii) (Positivity) For any finite subset $\Lambda \subset I$, and any real function on the set, $a: \Lambda \to (-\infty, +\infty)$, $t \mapsto a_t$, one always has:

\[ \sum_{t, s \in \Lambda} K_\delta(t, s)\, a_t a_s \ge 0. \]

For the driftless linear Brownian model above, for example,

\[ K_\delta(t, s) = p_0^2 \cdot \sigma^2 \cdot \min(t, s) = p_0^2 \cdot \sigma^2 \cdot (t \wedge s). \]

(For convenience, we denote $\min(t, s)$ by $t \wedge s$, and $\max(t, s)$ by $t \vee s$.) Furthermore, by working with the auto-covariance kernel alone, the underlying price dynamics do not need to be Markovian or memoryless, which provides much more flexibility in applications. In Section 6, we have implemented two non-Brownian examples whose price dynamics are governed by mean reversion or stochastic volatility.
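As a hedged illustration of the two kernel assumptions, the following Python sketch tabulates the Brownian kernel $p_0^2\sigma^2 (t \wedge s)$ on a time grid and checks symmetry and positivity numerically. The function names, the grid, and the tolerance are assumptions of this sketch. The mean-reverting kernel shown is the textbook auto-covariance of an Ornstein-Uhlenbeck process started at zero, used here only as one possible non-Brownian choice and not necessarily the one implemented in Section 6.

```python
import numpy as np

def brownian_kernel(t, p0=1.0, sigma=1.0):
    """K(t, s) = p0^2 * sigma^2 * min(t, s), tabulated on the grid t."""
    return (p0 * sigma) ** 2 * np.minimum.outer(t, t)

def ou_kernel(t, p0=1.0, sigma=1.0, kappa=5.0):
    """Auto-covariance of a zero-mean Ornstein-Uhlenbeck fluctuation started at 0
    (illustrative assumption): p0^2 sigma^2 / (2 kappa) *
    (exp(-kappa |t - s|) - exp(-kappa (t + s)))."""
    diff = np.abs(np.subtract.outer(t, t))
    total = np.add.outer(t, t)
    scale = (p0 * sigma) ** 2 / (2.0 * kappa)
    return scale * (np.exp(-kappa * diff) - np.exp(-kappa * total))

def is_valid_kernel(K, tol=1e-10):
    """Check symmetry and positivity (all eigenvalues >= 0 up to round-off)."""
    symmetric = np.allclose(K, K.T)
    min_eig = np.linalg.eigvalsh((K + K.T) / 2.0).min()
    return bool(symmetric and min_eig >= -tol * max(1.0, np.abs(K).max()))

grid = np.linspace(0.01, 1.0, 50)   # hypothetical grid on the horizon I
print(is_valid_kernel(brownian_kernel(grid)), is_valid_kernel(ou_kernel(grid)))
```

Checking the eigenvalues of the tabulated matrix is only a finite-grid proxy for condition (ii), which is stated for every finite subset of $I$.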

3. Volume Measures and PoV

Besides the extension beyond the conventional Brownian-type price dynamics, another major methodological innovation in the current work is that we assume the market volume distribution $dV$ over the execution horizon $t \in I = [0, 1]$ to be a general Borel measure. Let $dt$ denote the traditional Lebesgue measure over the horizon $I$. If the volume measure is Lebesgue absolutely continuous, the market execution speed $v(t)$ is then well defined to be the Radon-Nikodym derivative:

\[ v(t) = \frac{dV}{dt}, \quad\text{and}\quad v(t) \in L^1(I) = L^1(I, dt). \]

Generally, embedded within a Borel volume measure could also be atomic measures and continuous singular measures. In particular, the latter means that market volumes could grow stealthily on a set of times whose total Lebesgue measure is zero. This could be particularly useful when studying the joint effects of lit and dark pools, though we will not expand on this topic in the present work.

Suppose a client intends to execute $X$ shares of some security, over a normalized time interval $I = [0, 1]$. We take $X$ positive if it is a buy to open a long position or to close an existing short position, and negative if it is a sell to liquidate an existing long position or to open a new short position. Since the current model assumes symmetry between buys and sells, from now on we shall only work with a buy order as the default setting, unless stated otherwise.

The pre-trade execution is then considered to be a Borel measure $dX$ over the horizon $I$. The following basic assumptions are made throughout the work.

(1) (Monotone) $dX$ is a positive measure for buys (and negative for sells). This is a basic request from clients who are generally against opportunistic selling for a buy order, and vice versa. This implies that the cumulative shares $X_t$ or $X(t)$ always change monotonically from 0 to $X$.

(2) (Completion) $\int_I dX = X$, i.e., the targeted order is entirely filled during the horizon.

(3) (Absolute Continuity) We assume that $dX$ is absolutely continuous with respect to the market volume measure $dV$, so that the Participation of Volume (PoV) rate function $h(t)$ is the Radon-Nikodym derivative:

\[ h(t) = \frac{dX}{dV}, \quad\text{and}\quad h \in L^1(I, dV). \]

The monotone assumption now simply states that $h$ never changes sign (almost everywhere (a.e.) with respect to $dV$). Similarly, the completion condition becomes

\[ \int_I h(t)\, dV(t) = X. \]

Recall that in measure theory [8, 9], the absolute continuity condition is equivalent to requiring that on any time subset $\Gamma \subseteq I$, $V(\Gamma) = \int_\Gamma dV = 0$ implies $X(\Gamma) = \int_\Gamma dX = 0$. We call this the fundamental principle of trading. In the reductionism approach the market volume alone represents the entire market. Traders therefore would naturally avoid trading whenever there are no market activities.

In what follows, we shall consistently build all the cost and risk components upon the PoV rate function $h(t)$ or $h_t$, which then becomes the decision variable to optimize.
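On a discrete volume profile the three assumptions reduce to simple per-bin checks. The Python sketch below is illustrative only: the function names, the bin volumes, and the tolerance are hypothetical. It computes the PoV rate as the per-bin ratio of executed to market shares and rejects any schedule that trades in a zero-volume bin.

```python
import numpy as np

def pov_rate(dX, dV):
    """Discrete analogue of h = dX/dV: per-bin ratio of executed to market shares."""
    dX = np.asarray(dX, dtype=float)
    dV = np.asarray(dV, dtype=float)
    if np.any(dV < 0):
        raise ValueError("market volume must be non-negative")
    # Absolute continuity: no execution where the market volume vanishes.
    if np.any((dV == 0) & (dX != 0)):
        raise ValueError("execution in a zero-volume bin violates absolute continuity")
    h = np.zeros_like(dX)
    np.divide(dX, dV, out=h, where=dV > 0)
    return h

def check_schedule(h, dV, X, tol=1e-9):
    """Return (monotone, complete) for a PoV schedule h against target X."""
    h = np.asarray(h, dtype=float)
    dV = np.asarray(dV, dtype=float)
    monotone = bool(np.all(np.sign(X) * h >= 0))   # h never changes sign
    filled = float(np.sum(h * dV))                 # discrete integral of h dV
    complete = abs(filled - X) <= tol * max(1.0, abs(X))
    return monotone, complete

dV = [1000, 0, 3000, 1000]        # hypothetical market volumes per bin
dX = [100, 0, 300, 100]           # hypothetical buy schedule, 500 shares in total
h = pov_rate(dX, dV)
print(h, check_schedule(h, dV, X=500))
```

Setting the PoV to zero in empty bins is a convention of this sketch; the Radon-Nikodym derivative is only determined up to sets of zero volume measure.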

4. Components of Linear Impact Models

The impact model in the current work includes four components: spread cost, instantaneous cost, transient cost, and permanent cost. The first two components contribute to realized impact cost, but leave no trailing imprints on market prices, while the latter two do. All components are not unfamiliar in the literature (e.g., [4, 2]), but some innovation efforts have been made:

(i) For the spread cost, stochastic spreads are allowed, and the component coefficient is explicitly linked to the Child Order Placement of each execution house.

(ii) For the transient cost, we rely on a quantitative market-maker mechanism to build a volume-distance based exponential transition model. This effort is very much inspired by the temporal exponential resilience of Obizhaeva and Wang [12], whose driving market force has however not been elucidated.

(iii) For the permanent cost, we employ the look-back PoV as the dependent variable, instead of the cumulative trading volume commonly used in the literature. This allows one to differentiate the permanent costs arising from different market volume environments, e.g., those due to trading 1,000 shares when the market volume is 5,000 shares versus when the market volume is 20,000 shares.

Furthermore, in the current work, we shall also stick to linear models for all the components, which result in a quadratic programming problem in Hilbert spaces for the ultimate trading model. Another advantage of linearity is that model calibration could directly rely upon linear regression (see Section 6.1), instead of nonlinear solvers which are slower or come with no guarantee on uniqueness.

Spread Cost: IC-sprd

Let $\theta_t$ denote the NBBO spread at time $t$ (in dollar amount), and $\theta^\circ_t = \theta_t / p_0$ the homogenized spread with respect to the reference arrival price. That is,

\[ \theta_t = \mathrm{NBO}_t - \mathrm{NBB}_t, \]

the gap between the national best offer (NBO) and best bid (NBB). For instance, for the S&P-500 (large-cap) universe, the median homogenized spread is about 5.5 basis points, i.e., $\theta^\circ_t$ roughly fluctuates around $5.5 / 10{,}000 = 0.00055$.

With the mid-quote taken as the true price, as one crosses the spread to consume market liquidity, one has to pay at least the half spread. More generally, we denote the signed spread cost (per share) by:

\[ \text{IC-sprd}_t = \alpha_1 \cdot \mathrm{sign}(h_t) \cdot \theta_t. \]

Here the sign of the PoV indicates paying higher for buying and receiving lower for selling. Typically the coefficient $\alpha_1$ is somewhere between 0.0 and 0.5. Regression results from real trades in some of the execution houses where the author has worked have confirmed this general behavior.

There is a practical reason why in reality $\alpha_1$ is less than 0.5. It lies in the Child Order Placement (COP), which typically involves a mixture of limit and market orders. In reality, COP could also access internal or external non-displayed crossing networks, or the dark pools. Successful limit orders eliminate the need to cross the half spread to take liquidity as market orders do, and actually gain the half spread. Most dark orders, on the other hand, are usually pegged to the mid price of the NBBO and are thus traded at the true price without spread gain or loss. In combination, the realized spread cost is typically less than the half spread.

Therefore, the coefficient $\alpha_1$ partially reflects the underlying COP strategy. Different houses may observe different $\alpha_1$ due to different COP strategies that are tailored to their accessible trading venues and proprietary anti-gaming strategies.

To proceed, we further assume that $\theta_t$ has a mean process $\bar\theta_t$, and auto-covariance kernel $K_\theta$:

\[ K_\theta(s, t) = \mathrm{COV}(\theta_s, \theta_t) = E\bigl[(\theta_s - \bar\theta_s)(\theta_t - \bar\theta_t)\bigr]. \]

We also assume that the NBO (or NBB) is the superposition of two independent processes:

\[ \mathrm{NBO}_t = p_t + \theta_t / 2 = p_0 + \delta_t + \theta_t / 2. \]

The independence is a non-essential technical assumption, and one could otherwise work with the joint auto-covariance kernel for $(\delta_t, \theta_t)$ in later sections.

Instantaneous Cost: IC-inst

Instantaneous cost represents the immediacy cost when execution consumes liquidity from the limit order books. In the reductionism approach where the market volume summarizes the overall market, market volume has to be somewhat proportional to the capacity or depth of the limit order book. This heuristic reasoning implies that a bigger participation of volume (PoV) would efface the limit order book more deeply and hence retreat the NBB more significantly (or NBO for sells). We therefore model this part of instantaneous cost (per share) by:

\[ \text{IC-inst}_t = \alpha_2 \cdot h_t, \quad\text{where } h_t = dX/dV \text{ is the PoV rate.} \]

Here the coefficient $\alpha_2$ is in dollars and can be homogenized to a dimensionless one via $\alpha_2 = \alpha^\circ_2 \cdot p_0$.

Similar to the discussion for the spread cost, in practice $\alpha_2$ must also depend on the underlying COP strategies, and different execution houses may obtain different $\alpha_2$ when regressing against their own real trading data.

Transient Cost: IC-tran

Transient cost represents the short-term market digestion of the most recent wave of trades. In the literature, this component is also called the resilience term. For example, in the important work of Obizhaeva and Wang [12], the authors attribute this term to the resilience of the limit order book.

What actually drives such a resilient force, however, has not been made very clear in the literature. Herein, we attribute it to the activities of quantitative market makers. A market maker adjusts her quoting levels based on the expected market behavior. As a common practice, the expected behavior is often based on the moving average of the observable. We assume that at any time $t$, a typical quantitative market maker is primarily interested in the expected PoV via the moving average of the observed PoV's:

\[ \bar h_t = \int_0^t h_s \, dM(s \mid t), \]

where $dM(s \mid t)$ is a backward moving average kernel. (Borrowed from probability theory, the vertical bar $(x \mid y)$ denotes variable or value $x$ given $y$.) As a moving average kernel, one requires $dM(s \mid t)$ to be a probability measure over $s \in [0, t]$:

\[ \int_{s \in [0, t]} dM(s \mid t) = 1, \quad\text{and}\quad M(A \mid t) = \int_{s \in A} dM(s \mid t) \ge 0 \quad\text{for any measurable } A \subseteq [0, t]. \]

The most popular kernel used either in academia or in practical quantitative trading firms is the exponential kernel (e.g., [12]):

\[ dM(s \mid t) = \frac{1}{Z_t} \exp(-\gamma (t - s)) \, ds, \quad 0 \le s \le t. \]

(The unital condition is often not strictly enforced, so that, for example, $Z_t \equiv \gamma^{-1}$ or any suitable constant [12].)

Based on the expected PoV $\bar h_t$, the quantitative market maker then raises the quotes by an amount proportional to $\bar h_t$:

\[ \text{IC-tran}_t = \alpha_3 \cdot \bar h_t = \alpha_3 \int_0^t h_s \cdot dM(s \mid t). \]

Any incoming trade on $[t, t + dt)$ has to pay this premium cost.

In this work, we assume that $dM$ is absolutely continuous under the volume measure $dV$, and take the exponential decay kernel under volume distance as its Radon-Nikodym derivative:

\[ dM(s \mid t) = \frac{1}{Z_t} \exp\left( -\frac{V_t - V_s}{V_*} \right) dV_s, \]

with the normalization constant $Z_t$ given by:

\[ Z_t = \int_0^t \exp\left( -\frac{V_t - V_s}{V_*} \right) dV_s = V_* \cdot \bigl(1 - \exp(-V_t / V_*)\bigr). \]

Here the scaling constant $V_*$ reflects the market maker's volume-based window size for averaging short-term PoV's. For example, a market maker could take $V_* = 1\% \cdot \mathrm{ADV}$, where ADV is the average daily volume of the target security. (In the United States, where the normal daily core session lasts 390 minutes, this is roughly the average trading volume over several minutes.)

Since for the majority of time $V_t \gg V_*$, $Z_t$ is quickly saturated to the level of $V_*$. For the rest of the work we will make the following simplification by directly setting:

\[ Z_t \equiv V_*. \]

This constancy approximation is also the default setting for most existing works involving the temporal exponential kernel [12]. Analytically it helps alleviate the hardship involving a nonlinear exponential denominator.
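The transient term can be evaluated on a discrete volume profile as follows. This Python sketch assumes a piecewise-constant PoV per bin and integrates the exponential volume-distance kernel exactly over each bin; the function name, the arguments, and the 390 one-minute bins are assumptions of the sketch, not part of the paper. The `saturated` flag switches between the simplification $Z_t \equiv V_*$ and the exact normalization.

```python
import numpy as np

def transient_cost(h, dV, v_star, alpha3=1.0, saturated=True):
    """IC-tran at the end of each bin for a piecewise-constant PoV h.

    weights[k, j] is the integral over bin j of exp(-(V_k - u) / V*) du,
    which is nonzero only for bins j <= k (the look-back window).
    """
    h = np.asarray(h, dtype=float)
    dV = np.asarray(dV, dtype=float)
    V_end = np.cumsum(dV)            # cumulative volume at bin ends
    V_start = V_end - dV             # cumulative volume at bin starts
    # Clipping at zero makes both exponentials equal for future bins (j > k),
    # so their weights vanish without any overflow.
    gap_end = np.maximum(V_end[:, None] - V_end[None, :], 0.0)
    gap_start = np.maximum(V_end[:, None] - V_start[None, :], 0.0)
    weights = v_star * (np.exp(-gap_end / v_star) - np.exp(-gap_start / v_star))
    # Exact Z_t is zero wherever V_t is zero; the sketch assumes V_t > 0 there.
    Z = v_star if saturated else v_star * (1.0 - np.exp(-V_end / v_star))
    return alpha3 * (weights @ h) / Z

dV = np.full(390, 1000.0)            # hypothetical flat one-minute profile
v_star = 0.01 * dV.sum()             # V* = 1% of the daily volume
cost = transient_cost(np.full(390, 0.1), dV, v_star)
print(cost[:3], cost[-1])
```

For a constant PoV $h$, the saturated version returns $\alpha_3 h\,(1 - e^{-V_t/V_*})$, so the gap to the exactly normalized value $\alpha_3 h$ is the factor $e^{-V_t/V_*}$, which is already below 1% once $V_t \ge 5 V_*$.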

Permanent Cost: IC-perm

By definition, as time elapses and market volume grows, the transient cost imposed by a specific trade at some past time $s$ will fade away. The permanent cost component then captures any permanent impact a trade might have left. In the current work, we assume it is proportional to the volume-weighted average PoV:

\[ \text{IC-perm}_t = \alpha_4 \, \frac{\int_0^t h_s \, dV_s}{V_t} = \alpha_4 \, \frac{X_t}{V_t}. \]

In the literature, the permanent cost is commonly taken to be proportional to the cumulative executed shares $X_t$ alone (e.g., [2, 12]). The volume normalization introduced herein captures the difference between the impact of 5,000 shares, say, executed in a period of cumulative market volume $V_t = 20{,}000$ shares vs. that of the same number of shares in another period of $V_t = 200{,}000$ shares. The permanent impacts are clearly different.

In theory the denominator $V_t$ vanishes at the beginning of the execution, as it is the cumulative market volume from the start at $t = 0$. This could potentially introduce some singularity when later on the total execution cost is assembled. In the scenario of flat profiling when $V_t = v_0 t$ for some constant market rate $v_0$, for instance, the singularity is of the order $O(1/t)$.

In practice, quantitative traders resolve such singularities through at least two general approaches. The first one is to slightly delay the computation in order to get stronger signals (with a higher signal-to-noise ratio (SNR)). For instance, many high-frequency strategies on the buy side will not open to trade within the first 5 to 30 minutes after the market opens, unless they specifically target the open auctions or opening moments. This "quiet" period allows them to collect stable signals based on moving averages. The other approach is to regularize the target signals. In our scenario, for example, one can "boost" $V_t$ to $V_t + \varepsilon$ for some thresholding volume level $\varepsilon$, so that the signal $V_t$ only becomes effective when $V_t \gg \varepsilon$. In practice, as discussed for the transient cost, $\varepsilon$ could be $1\% \times \mathrm{ADV}$, the one-minute average ADV, or even a few or several round lots depending on the liquidity level of the security. In the continuum setting, we use the $\varepsilon$ symbol to convey a sense of minuteness, as is common in mathematics.

As such, we finalize the permanent cost term via:

\[ \text{IC-perm}_t = \alpha_4 \, \frac{\int_0^t h_s \, dV_s}{V_t + \varepsilon} = \alpha_4 \, \frac{X_t}{V_t + \varepsilon}. \]

We point out that all the subsequent analysis also works for "hard-thresholding" when one takes instead:

\[ \alpha_4 \, \frac{\int_0^t h_s \, dV_s}{V_t \vee \varepsilon} = \alpha_4 \, \frac{X_t}{V_t \vee \varepsilon}, \]

where $V_t \vee \varepsilon = \max(V_t, \varepsilon)$ more explicitly regularizes $V_t$ near $t = 0$.

In actual discrete implementation, such regularization may not be necessary since $V_t$ is generally positive at the first grid point $t = t_1 > 0$.

Putting Together

Combining all the components, we arrive at a model for the execution price $\hat p_t$. Let $p_t$ denote the market price, which could be considered as the mid-quote of the NBBO at any time $t$. Then,

\[ p_t = p_0 + \delta_t + \text{IC-tran}_t + \text{IC-perm}_t; \tag{1} \]
\[ \hat p_t = p_t + \text{IC-sprd}_t + \text{IC-inst}_t. \tag{2} \]

Here, $\delta_t$ represents the intrinsic market price movement, regardless of "our" own trading activity. As discussed earlier, we only assume that its auto-covariance kernel $K_\delta(s, t)$ is known pre-trade.
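To make Eqs. (1)-(2) concrete, the Python sketch below computes the expected mid and execution prices at the end of each bin of a discrete volume profile, using $E[\delta_t] = 0$ and the simplification $Z_t \equiv V_*$. It is a rough illustration under stated assumptions: the function name, the coefficient tuple `alpha` (ordered spread, instantaneous, transient, permanent), and all numbers are hypothetical, and the transient integral is approximated crudely by placing each bin's volume at the bin end.

```python
import numpy as np

def expected_price_path(h, dV, theta_bar, alpha, v_star, eps, p0=0.0):
    """Expected (mid, execution) prices per bin under Eqs. (1)-(2), with E[delta_t] = 0."""
    h = np.asarray(h, dtype=float)
    dV = np.asarray(dV, dtype=float)
    a1, a2, a3, a4 = alpha
    V = np.cumsum(dV)                 # cumulative market volume V_t
    X = np.cumsum(h * dV)             # cumulative executed shares X_t
    side = np.sign(X[-1])             # +1 for a buy, -1 for a sell
    # Transient: exponential look-back in volume distance (point-mass approximation).
    gap = np.maximum(V[:, None] - V[None, :], 0.0)
    weights = np.tril(np.exp(-gap / v_star)) * dV[None, :] / v_star
    ic_tran = a3 * (weights @ h)
    # Permanent: soft-thresholded look-back PoV, X_t / (V_t + eps).
    ic_perm = a4 * X / (V + eps)
    mid = p0 + ic_tran + ic_perm                                        # Eq. (1)
    exec_price = mid + a1 * side * np.asarray(theta_bar, dtype=float) + a2 * h  # Eq. (2)
    return mid, exec_price

mid, px = expected_price_path(
    h=np.full(4, 0.1), dV=np.full(4, 1000.0), theta_bar=0.02,
    alpha=(0.3, 0.5, 0.0, 2.0), v_star=50_000.0, eps=0.0)
print(mid, px)
```

As a side check on the permanent term, 5,000 shares executed against $V_t = 20{,}000$ give a look-back PoV of 0.25, while the same shares against $V_t = 200{,}000$ give 0.025, a tenfold difference in modeled permanent impact.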

5. The Execution Model and Analysis

The implementation shortfall (IS) is the amount overpaid (in the default scenario of buying) compared with the initial paper cost when the security price is $p_0$, as first introduced by Perold [13]. In dollar amount, for any deterministic execution scheme $X_t$, $0 \le t \le 1$, it is defined by:

\[
\begin{aligned}
\mathrm{IS}_\$ &= \int_0^1 (\hat p_t - p_0) \cdot dX_t \\
&= \int_0^1 (p_t - p_0) \cdot dX_t + \int_0^1 \text{IC-sprd}_t \cdot dX_t + \int_0^1 \text{IC-inst}_t \cdot dX_t \\
&= \int_0^1 \delta_t \cdot dX_t + \int_0^1 \text{IC-sprd}_t \cdot dX_t + \int_0^1 \text{IC-inst}_t \cdot dX_t + \int_0^1 \text{IC-tran}_t \cdot dX_t + \int_0^1 \text{IC-perm}_t \cdot dX_t \\
&= \int_0^1 \bigl(\delta_t + \alpha_1 \theta_t \cdot \mathrm{sign}(h_t)\bigr) \cdot dX_t + \int_0^1 \bigl(\text{IC-inst}_t + \text{IC-tran}_t + \text{IC-perm}_t\bigr) \cdot dX_t.
\end{aligned}
\]

Due to the monotone condition discussed earlier, $\mathrm{sign}(h_t)$ is static and global since it is always $1$ for a buy order and $-1$ for a sell order. Denoting this constant by side, one has $\text{side} \cdot dX_t = |dX_t|$. To prepare for the mean-variance formulation, we first compute the mean $E[\mathrm{IS}_\$]$, which is:

\[ \int_0^1 \left( \alpha_1 \bar\theta_t \cdot \text{side} + \alpha_2 h_t + \alpha_3 \int_0^t h_s \, dM(s \mid t) + \alpha_4 \, \frac{\int_0^t h_s \, dV_s}{V_t + \varepsilon} \right) \cdot dX_t. \tag{3} \]

The variance is

\[ \mathrm{VAR}[\mathrm{IS}_\$] = \int_0^1 \!\! \int_0^1 K_{\delta,\theta}(s, t) \, dX_s \, dX_t, \quad\text{with} \tag{4} \]
\[ K_{\delta,\theta}(s, t) = K_\delta(s, t) + \alpha_1^2 K_\theta(s, t). \tag{5} \]

For any given level of risk aversion expressed via a positive weighting parameter $\lambda > 0$, the mean-variance execution model is to find the optimal PoV function $\hat h_t$, such that $d\hat X_t = \hat h_t \, dV_t$ and the following utility functional is minimized:

\[ J_\lambda[h_t] = E[\mathrm{IS}_\$] + \lambda \cdot \mathrm{VAR}[\mathrm{IS}_\$]. \tag{6} \]

Under the given volume measure $dV_t$, and after symmetrizing the two look-back integrals in (3) over $I \times I$ (which produces the factors of $1/2$), the objective functional becomes:

\[
\begin{aligned}
J_\lambda[h_t] ={}& \alpha_1 \cdot \text{side} \cdot \int_0^1 \bar\theta_t h_t \, dV_t + \alpha_2 \int_0^1 h_t^2 \, dV_t \\
&+ \frac{\alpha_3}{2 V_*} \int_0^1 \!\! \int_0^1 h_s h_t \exp\left( -\frac{|V_t - V_s|}{V_*} \right) dV_s \, dV_t \\
&+ \frac{\alpha_4}{2} \int_0^1 \!\! \int_0^1 \frac{h_s h_t}{V_t \vee V_s + \varepsilon} \, dV_s \, dV_t \\
&+ \lambda \int_0^1 \!\! \int_0^1 K_{\delta,\theta}(s, t) \, h_s h_t \, dV_s \, dV_t,
\end{aligned} \tag{7}
\]

with at least two common constraints discussed in previous sections,

\[ \text{side} \cdot h_t \ge 0, \quad\text{and}\quad \int_0^1 h_t \, dV_t = X, \tag{8} \]

for monotonicity and completion. Notice also that $\text{side} = \mathrm{sign}(X)$.

According to the published marketing or sales sheets, major execution houses always allow their clients to specify the PoV capping, i.e., the linear inequality constraint:

\[ \text{side} \cdot h_t \le \text{maxPoV}. \tag{9} \]

For most clients, the popular comfort zone for maxPoV is somewhere between 5% and 25%. If this constraint is turned on, there is an obvious compatibility condition for the model to yield a solution:

\[ \int_0^1 \text{maxPoV} \cdot dV_t \ge \text{side} \cdot X, \quad\text{or}\quad \text{maxPoV} \ge \frac{|X|}{V_1}, \tag{10} \]

where $V_1 = \int_I dV$ is the total market volume over the horizon. Otherwise, the execution may not be complete within the specified horizon. Except for very urgent or small orders, clients usually turn on this constraint as an ultimate safeguard.

Since both the transient and permanent cost terms explicitly involve the cumulative market volume function

\[ V_t = \int_{[0, t]} dV_s, \]

we will make the following assumptions about the regularity of the volume measure.

(i) (Absolute Continuity) We assume that $dV$ is absolutely continuous with respect to the ordinary Lebesgue measure $dt$. This amounts to saying that there exists a non-negative function $v_t \in L^1(I) = L^1(I, dt)$, such that $dV_t = v_t \, dt$. Then the cumulative market volume function

\[ V_t = \int_{[0, t]} v_s \, ds \]

is well defined pointwise, and is continuous and non-decreasing. We call $v_t$ the market (trading) rate function.

(ii) (Lipschitz) Furthermore, we assume that the market rate function is bounded from above, i.e., there exists some constant $A > 0$, such that

\[ v_t \le A, \quad\text{for almost every } t \in I = [0, 1] \text{ under } dt. \]

Equivalently, we say that $v_t \in L^\infty(I)$. In terms of the cumulative market volume, it is equivalent to requiring $V_t$ to be Lipschitz [8, 9].

Practically these are natural assumptions for most execution houses, where volume profiles are generated by overnight processes. One common component of these processes is a smoothing kernel to curb the effect of spurious spikes arising from direct historical averaging.

Define the kernel functions:

\[ K_3(s, t) = \frac{1}{V_*} \exp\left( -\frac{|V_t - V_s|}{V_*} \right), \quad\text{and}\quad K_4(s, t) = \frac{1}{V_t \vee V_s + \varepsilon}. \]

Then the objective functional $J_\lambda[h]$ involves three quadratic terms in the general form of

\[ Q_{K,dV}[h] = \iint_{I \times I} K(s, t) \, h(s) h(t) \, dV(s) \, dV(t), \]

with $K = K_3$, $K_4$, or $K_{\delta,\theta}$ and $I = [0, 1]$. We require that $K(s, t) = K(t, s)$ and that $K: I \times I \to \mathbb{R}$ is continuous. Here one notices the regularization role of $\varepsilon$ introduced for the permanent cost component, without which $K_4$ would not be continuous at $(t, s) = (0, 0)$. Under these conditions the quadratic form is bounded:

\[ \| Q_{K,dV} \| := \sup_{\|h\|_{L^2(dV)} \le 1} Q_{K,dV}[h] \le \|K\|_{L^2(dV \otimes dV)} \le \|K\|_{L^\infty} \cdot V_1. \]

Definition 1. $Q_{K,dV}[h]$ is said to be positive in the Hilbert space $L^2(I, dV)$ if for any $h \in L^2(I, dV)$, $Q_{K,dV}[h] \ge 0$. A symmetric and continuous kernel $K(s, t)$ on $I \times I$ is said to be positive, if for any finite sequence $(t_i \mid i = 1{:}N)$ in $I$ of arbitrary length $N$, and real scalars $(c_i \mid i = 1{:}N)$, one has

\[ \sum_{i,j=1}^N K(t_i, t_j) \, c_i c_j \ge 0. \]

Notice that in some literature, it is said to be nonnegative.

Proposition 1.

The quadratic form $Q_{K,dV}[h]$ induced by a positive kernel $K$ must be positive.

Proof. This is canonical in the context of the ordinary Lebesgue measure $dt$, which we refresh here first. Assume that $K$ is a positive kernel. First notice that

\[ Q_{K,dt}[g] = \iint_{I \times I} K(s, t) \, g(s) g(t) \, ds \, dt \le \|K\|_{L^\infty} \, \|g\|_{L^2(dt)}^2, \]

for any $g \in L^2(I, dt)$. Thus $Q_{K,dt}$ is positive if and only if it is positive on a dense set of $L^2(I, dt)$. It is well known in real analysis [9] that the set of continuous functions $C(I)$ is indeed dense in $L^2(I, dt)$. For any $\varphi(t) \in C(I)$, the Lebesgue integral in $L^1(I \times I, dt \otimes ds)$

\[ Q_{K,dt}[\varphi] = \iint_{I \times I} K(s, t) \, \varphi(s) \varphi(t) \, ds \, dt \]

is identical to its Riemann integral, since $K(s, t) \varphi(s) \varphi(t) \in C(I \times I)$. For any partition of $I$:

\[ 0 = t_0 < t_1 < t_2 < \cdots < t_N = 1, \]

the following Riemann sum is non-negative since $K$ is assumed positive (take $c_i = \varphi(t_i)(t_{i+1} - t_i)$ in Definition 1):

\[ \sum_{i,j=0}^{N-1} K(t_i, t_j) \, \varphi(t_i) \varphi(t_j) \, (t_{i+1} - t_i)(t_{j+1} - t_j) \ge 0. \]

Taking proper limits, one sees that $Q_{K,dt}$ is indeed positive on $C(I)$.

Now consider a general volume measure $dV$ with $v = dV/dt \in L^\infty(dt)$. For any $h_t \in L^2(dV)$, one has

\[ \int_I h^2 v^2 \, dt \le \|v\|_\infty \cdot \int_I h^2 v \, dt = \|v\|_\infty \, \|h\|_{L^2(dV)}^2, \]

implying that $g(t) = h(t) v(t) \in L^2(dt)$. Then the proof is complete following that

\[ Q_{K,dV}[h] = Q_{K,dt}[g] \ge 0. \qquad \square \]

Proposition 2.

The risk kernel $K = K_{\delta,\theta}$ is positive.

By definition, $K_{\delta,\theta}(t,s)$ is the auto-covariance function of the price-spread mixed process:

$$W_t = \delta_t + \alpha_0(\theta_t - \bar\theta_t), \quad\text{and}\quad K_{\delta,\theta}(t,s) = \mathrm{E}[W_t \cdot W_s].$$

Then for any finite sequence $(t_i \mid i = 1:N)$ and real scalars $(c_i \mid i = 1:N)$,

$$\sum_{i,j=1}^{N} K_{\delta,\theta}(t_i,t_j)\,c_i c_j = \mathrm{E}\left(\sum_{i=1}^{N} c_i W_{t_i}\right)^2 \ge 0,$$

which establishes the positivity of $K_{\delta,\theta}$.

To proceed further, we need a useful lemma whose proof follows directly from the definition of kernel positivity.

Lemma 1.

Suppose $K(T,S)$ is a positive kernel on a subset $D \subseteq \mathbb{R}$, and $\phi: I \to D$ any real function with $T = \phi(t)$. Then the pullback kernel $K_\phi(t,s) = K(\phi(t),\phi(s))$ is positive on $I$.

Proposition 3.

The transient kernel $K = K_2$ is positive.

Proof. Given a real function $g(t)$ on $t \in \mathbb{R}$, suppose its Fourier transform

$$G(\omega) = \int_{\mathbb{R}} g(t)\,e^{-\sqrt{-1}\,t\omega}\,dt$$

is real and non-negative for all $\omega \in \mathbb{R}$. Then the kernel defined via $K_g(t,s) = g(t-s)$ must be positive for $t,s \in \mathbb{R}$. This is because:

$$\sum_{i,j=1}^{N} K_g(t_i,t_j)\,c_i c_j = \sum_{i,j=1}^{N} g(t_i - t_j)\,c_i c_j = \sum_{i,j=1}^{N} \frac{1}{2\pi}\int_{\mathbb{R}} G(\omega)\,e^{\sqrt{-1}\,(t_i - t_j)\omega}\,d\omega \cdot c_i c_j = \frac{1}{2\pi}\int_{\mathbb{R}} G(\omega)\,\Big|\sum_{i=1}^{N} c_i\,e^{\sqrt{-1}\,t_i\omega}\Big|^2 d\omega \ge 0.$$

Since the Fourier transform of the exponential $g(T) = \gamma\exp(-\gamma|T|)$ with $T \in \mathbb{R}$ is

$$G(\omega) = \frac{2\gamma^2}{\gamma^2 + \omega^2},$$

we conclude that $K_g(T,S) = g(T-S) = \gamma\exp(-\gamma|T-S|)$ must be positive on $T,S \in \mathbb{R}$. The proof is then complete by taking $\gamma = 1/V_*$ and $\phi(t) = V_t$ in the preceding lemma. $\square$

Proposition 4.

The permanent kernel $K = K_3(t,s)$ is positive.

Proof. First, it is easy to see from the definition that if $K(T,S)$ is positive on a subset $D \subseteq \mathbb{R}$, and $f: D \to \mathbb{R}$ is a real function, the new kernel

$$K_f(T,S) = K(T,S)\,f(T)\,f(S)$$

must be positive on the same domain. Now define $D = (0,\infty)$, and

$$K_B(T,S) = T \wedge S, \quad\text{and}\quad f(T) = 1/T.$$

Notice that $K_B(T,S)$ is positive since it is the auto-covariance of the canonical Brownian motion $B_T$:

$$K_B(T,S) = \mathrm{E}[B_T \cdot B_S], \quad T,S > 0.$$

Hence

$$K(T,S) = \frac{1}{T \vee S} = \frac{T \wedge S}{T\cdot S} = K_B(T,S)\,f(T)\,f(S)$$

must be positive on $D$. The proof is complete via the preceding lemma with $\phi(t) = \varepsilon + V_t$. $\square$

Theorem 1 (Convexity and Uniqueness).

The objective functional $J_\lambda[h_t]$ is strictly convex in $L^2(I,dV)$. As a result, the optimal execution solution $h^*_t$ to the constrained model must be unique.

Proof. The preceding propositions establish that the last three quadratic forms in $J_\lambda[h]$ are all positive and thus convex in the Hilbert space $L^2(dV)$. The second quadratic term (from instantaneous cost)

$$\alpha_1\int_I h_t^2\,dV_t = \alpha_1\,\|h_t\|^2_{L^2(dV_t)}$$

is the squared norm and hence strictly convex. Since the first term on the average spread cost is linear, $J_\lambda[h]$ must be strictly convex. $\square$

For convenience, define the combined kernel

$$K_\lambda(t,s) = \alpha_2 K_2(t,s) + \alpha_3 K_3(t,s) + \lambda K_{\delta,\theta}(t,s) = \frac{\alpha_2}{V_*}\exp\left(-\frac{|V_t - V_s|}{V_*}\right) + \frac{\alpha_3}{V_t \vee V_s + \varepsilon} + \lambda K_{\delta,\theta}(t,s). \qquad (11)$$

Since $K_\lambda(t,s)$ is a continuous kernel over $I\times I = [0,1]^2$, we have

$$K_\lambda(t,s) \in L^2(I\times I, dV\otimes dV), \quad\text{and}\quad \|K_\lambda(t,s)\|_{L^2(dV\otimes dV)} \le \|K_\lambda(t,s)\|_\infty \cdot V_1.$$

We also use the same symbol to denote the induced linear operator in $L^2(I,dV)$:

$$K_\lambda h_t := \int_I K_\lambda(t,s)\,h(s)\,dV_s.$$

It is well known [7] that (i) the $L^2$ function norm dominates the operator norm: $\|K_\lambda\| \le \|K_\lambda\|_{L^2(dV\otimes dV)} < \infty$, and (ii) such a linear operator $K_\lambda$ must be compact in the Hilbert space $L^2(I,dV)$.

Theorem 2.

There exists a unique optimal execution solution $h^*_t$ to the mean-variance model with $J_\lambda[h]$ defined in Eqn. (7) as the objective function, and with the following constraints (Eqn. (8) and (9)): monotonicity, completion, and PoV capping, as long as the latter two are compatible as expressed in Eqn. (10).

Proof. By Theorem 1, it suffices to establish the existence portion. Define

$$\hat\phi_t = \alpha_0 \cdot \mathrm{side} \cdot \bar\theta_t.$$

Then the objective $J_\lambda[\cdot]$ can be expressed more compactly by:

$$J_\lambda[h] = \langle \hat\phi_t, h_t\rangle + \alpha_1\langle h_t, h_t\rangle + \langle K_\lambda h_t, h_t\rangle,$$

where the inner product is in the Hilbert space $L^2(I,dV)$. Let $\mathbf{I}$ denote the identity operator. Then we have

$$J_\lambda[h] = \langle \hat\phi_t, h_t\rangle + \langle (\alpha_1\mathbf{I} + K_\lambda)h_t, h_t\rangle.$$

Since $K_\lambda$ is a linear combination of three positive operators (or kernels) with positive coefficients, it must be positive. It is well known in spectral theory [7] that the spectrum set $\sigma(K_\lambda)$ of such a compact and positive operator must be a subset of the nonnegative half real axis $[0,\infty)$, and that for any $\mu \notin \sigma(K_\lambda)$, the inverse of $K_\lambda - \mu\mathbf{I}$ must exist and be bounded in $L^2(I,dV)$. In particular, with $\mu = -\alpha_1 \notin \sigma(K_\lambda)$, $(K_\lambda + \alpha_1\mathbf{I})^{-1}$ is a well-defined bounded linear operator, and one can define

$$\phi_t = (\alpha_1\mathbf{I} + K_\lambda)^{-1}\hat\phi_t \in L^2(I,dV).$$

We further introduce a new symmetric bilinear function: for $g,h \in L^2(I,dV)$,

$$(g,h) = \langle (\alpha_1\mathbf{I} + K_\lambda)g_t, h_t\rangle.$$

It is strictly positive since it dominates the ordinary inner product:

$$(h,h) \ge \alpha_1\langle h,h\rangle.$$

Thus it introduces a new inner product, which is actually equivalent to the natural one since it is also bounded above:

$$(h,h) = \alpha_1\langle h,h\rangle + \langle K_\lambda h,h\rangle \le (\alpha_1 + \|K_\lambda\|_\infty\cdot V_1)\,\langle h,h\rangle.$$

For convenience, we denote by $L^2(I,dV \mid (\cdot,\cdot))$ the same function space $L^2(I,dV)$ endowed with this new equivalent inner product, which is a Hilbert space. Furthermore, the original objective function $J_\lambda[h]$ simplifies to:

$$J_\lambda[h] = (\phi_t, h_t) + (h_t, h_t). \qquad (12)$$

Define $\psi_t = (\alpha_1\mathbf{I} + K_\lambda)^{-1}1$. Then the completion constraint becomes:

$$(\psi_t, h_t) = X.$$

And the constraints on monotonicity and participation limit remain the same:

$$0 \le \mathrm{side}\cdot h_t \le \mathrm{maxPoV}.$$

The constant PoV execution strategy

$$h^{\mathrm{const}}_t \equiv \frac{X}{V_1}, \quad t \in I = [0,1],$$

is admissible under the compatibility condition, so the infimum of $J_\lambda$ over the admissible set is finite. Take a minimizing sequence of $J_\lambda$: $h^{(1)}_t, h^{(2)}_t, \cdots$, in $L^2(I,dV)$, such that

$$\lim_{n\to\infty} J_\lambda[h^{(n)}] = \inf J_\lambda[h] < \infty,$$

and each meets the constraints. The sequence must be bounded since one can easily show from Eqn. (12) that

$$\|h_t\|_{(\cdot,\cdot)} \le \left(J_\lambda[h_t] + \frac14\|\phi_t\|^2_{(\cdot,\cdot)}\right)^{1/2} + \frac12\|\phi_t\|_{(\cdot,\cdot)}.$$

In Hilbert spaces, any bounded sequence must be weakly pre-compact. Possibly replaced by one of its subsequences, $(h^{(n)}_t \mid n = 1, 2, \cdots)$ can therefore be assumed to converge weakly to some element $h^*_t \in L^2(I,dV \mid (\cdot,\cdot))$, so that for any $g \in L^2(I,dV \mid (\cdot,\cdot))$:

$$(g_t, h^*_t) = \lim_{n\to\infty}(g_t, h^{(n)}_t).$$

Since the Hilbert norm is known to be lower semi-continuous (l.s.c.) under weak convergence, we have

$$J_\lambda[h^*] \le \lim_{n\to\infty}(\phi_t, h^{(n)}_t) + \liminf_{n\to\infty}(h^{(n)}_t, h^{(n)}_t) = \liminf_{n\to\infty} J_\lambda[h^{(n)}_t] = \inf J_\lambda[h].$$

Finally, the admissible space defined by the three constraints is easily seen to be a closed and convex set, which must be closed under weak convergence [7]. This means $h^*$ also meets all three constraints on completion, monotonicity, and PoV capping. Then $h^*$ must be the optimal execution strategy with $J_\lambda[h^*] = \inf J_\lambda[h]$. $\square$
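The positivity statements of Propositions 3 and 4 can be sanity-checked numerically on a discrete grid. The Python sketch below is only an illustration under assumed inputs: the U-shaped volume rate, the values of `V_star` and `eps`, and all function names are hypothetical and not taken from the paper. It evaluates the quadratic form of Definition 1 for random scalars $c_i$, and contrasts the transient and permanent kernels with $|V_t - V_s|$, which is symmetric and continuous but not positive.

```python
import math
import random

def transient_kernel(Vt, Vs, V_star):
    """Exponential kernel (1/V*) exp(-|Vt - Vs| / V*) from Proposition 3."""
    return math.exp(-abs(Vt - Vs) / V_star) / V_star

def permanent_kernel(Vt, Vs, eps):
    """Regularized kernel 1 / (max(Vt, Vs) + eps) from Proposition 4."""
    return 1.0 / (max(Vt, Vs) + eps)

def gram_matrix(kernel, V):
    """Kernel evaluated on all pairs of cumulative-volume grid values."""
    return [[kernel(a, b) for b in V] for a in V]

def quadratic_form(K, c):
    """sum_{i,j} K[i][j] c_i c_j, the expression in Definition 1."""
    n = len(c)
    return sum(K[i][j] * c[i] * c[j] for i in range(n) for j in range(n))

def min_quadratic_form(K, trials=300, seed=0):
    """Smallest quadratic form over random Gaussian scalar vectors c."""
    rng = random.Random(seed)
    n = len(K)
    return min(
        quadratic_form(K, [rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(trials)
    )

# Hypothetical U-shaped volume rate on N bins and its cumulative volume V_t.
N = 30
rate = [1.0 + 3.0 * (2.0 * (i + 0.5) / N - 1.0) ** 2 for i in range(N)]
V, total = [], 0.0
for r in rate:
    total += r / N
    V.append(total)

V_star, eps = 0.1 * V[-1], 0.01 * V[-1]  # assumed values, for illustration
K_tr = gram_matrix(lambda a, b: transient_kernel(a, b, V_star), V)
K_pm = gram_matrix(lambda a, b: permanent_kernel(a, b, eps), V)
K_bad = gram_matrix(lambda a, b: abs(a - b), V)  # not a positive kernel

print("transient kernel nonnegative:", min_quadratic_form(K_tr) >= 0.0)
print("permanent kernel nonnegative:", min_quadratic_form(K_pm) >= 0.0)
print("|Vt - Vs| goes negative:", min_quadratic_form(K_bad) < 0.0)
```

Random sampling can only falsify positivity, not prove it; the proofs above remain the actual argument, and the third kernel is included to show that the check is not vacuous.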

6. Computation and Numerical Examples

In this subsection, we briefly explain the major steps for model calibration. Actual implementation should depend on the trade database each execution house owns. For example, as explained in Section 4, COP strategies employed by an execution house may directly impact data bookkeeping and subsequent model calibration.

Although the proposed model involves a variety of kernels, its main characteristic is the linearity in all its four major parameters: $\alpha_0$, $\alpha_1$, $\alpha_2$, and $\alpha_3$. Model calibration can thus be directly based upon linear regression.

We first make the following assumptions about the empirical trades already made in the past and stored in the database of the execution house.

(i) (Universe Coverage) For targeted securities, one has enough samples of past trades. For any major broker-dealers, exchanges, or agency houses, this is typically not an issue. For example, in the United States, securities from the universes of all major indices (e.g., S&P 500) are heavily traded. (Due to the proprietary nature of trade information, this often poses a considerable challenge for academic researchers.)

(ii) (Data Cleaning) Trade data have been properly filtered. This has been a very common practice in execution houses. To filter is to remove erroneous or insignificant trades from the calibration. For example, trades lasting less than 5 minutes or trades whose average PoVs are below 0.1% can be considered insignificant and filtered out.

(iii) (Recording Intervals) For each trade, the house has kept on record its execution details, which may include, for example, shares traded over each 10 seconds and the associated average execution prices. The periodic recording interval should not be too long, so that the calibrated model can properly capture the transient effect.

(iv) (Profiles and Risks) Using statistical methods and consolidated exchange data, the house has already created the standard profiles for the Bid-Ask spread $\bar\theta_t$ and the volume measure $dV_t$ for each security, as well as the auto-covariance matrices $K_\delta(t,s)$ and $K_\theta(t,s)$ as risk metrics.

At the execution houses where the author had worked, trading models are typically calibrated daily, or weekly at the longest. (Risk parameters involved could stay longer, however.)

Implementation shortfalls (IS) are typically recorded and reported in basis points (bps). One basis point is 0.01% = 1/10,000, so that

$$\mathrm{IS}_{\mathrm{bps}} = \frac{\mathrm{IS}_\$}{p_0\,|X|}\cdot 10{,}000. \qquad (13)$$

For example, if a buying trade of targeted initial notional p ∗| X | = $10 , , bps , it means the final net cost the client actually has to pay is $10 , , bps in databases, and to have normalized magnitudes for the model coefficients, we make the following coefficient normalization based on proper dimensionality analysis (with the circular superscript $\circ$ representing normalized coefficients):

$$\bar\theta_t = \bar\theta^\circ_t \cdot p_0 \cdot \mathrm{bps}, \quad \alpha_0 = \alpha^\circ_0 \cdot 1, \quad \alpha_i = \alpha^\circ_i \cdot p_0 \cdot \mathrm{bps}, \quad i = 1, 2, 3.$$

Then, by the expression for the expected IS in dollars and the definition of $\mathrm{IS}_{\mathrm{bps}}$ in Eqn. (13),

$$\mathrm{E}[\mathrm{IS}_{\mathrm{bps}}[h]] = \alpha^\circ_0\int_I \bar\theta^\circ_t\,\frac{dX_t}{X} + \alpha^\circ_1\int_I h_t\,\frac{dX_t}{|X|} + \alpha^\circ_2\int_I \frac{dX_t}{|X|}\int_0^t h_s\,dM(s|t) + \alpha^\circ_3\int_I \frac{dX_t}{|X|}\int_0^t \frac{h_s\,dV_s}{V_t + \varepsilon}$$

$$= \alpha^\circ_0\cdot C_0[h] + \alpha^\circ_1\cdot C_1[h] + \alpha^\circ_2\cdot C_2[h] + \alpha^\circ_3\cdot C_3[h].$$

We thus apply linear regression to all validated historical trades (i.e., $h$'s already observed) based on the linear model:

$$\mathrm{IS}_{\mathrm{bps}}[h] \sim \alpha^\circ_0\cdot C_0[h] + \alpha^\circ_1\cdot C_1[h] + \alpha^\circ_2\cdot C_2[h] + \alpha^\circ_3\cdot C_3[h] + w[h].$$

Notice the heteroskedasticity of the model: the residual noise term $w[h]$ carries a variance that depends on the execution profile $h_t$, as shown in Eqn. (4). We thus apply either the weighted least-squares estimator (w-L.S.E.) or, equivalently, the ordinary L.S.E. to the variance-normalized data.

We leave some other auxiliary details to the execution houses who are interested in the current work and who can conduct actual model calibration from their proprietary trading databases. The author welcomes any feedback or suggestion from the industry.

In this section, we show how the proposed model can be efficiently computed via quadratic programming algorithms and available commercial software (e.g., MOSEK or IBM CPLEX).

Unlike dynamic trading models adapted to evolving market environments, pre-trade models are typically built upon historical "profiles" for spreads, volumes, correlations and volatilities. These profiles are often generated via robust statistical methods daily or weekly, and stored in data files. Typically they are discrete intraday vectors with intervals ranging from 30 seconds to 5 minutes, based on both the liquidity levels of the target securities and the COP machinery each house employs. Therefore, below we study the discrete implementation on a regular discrete time grid.

Throughout the work we have been using the normalized trading horizon $I = [0,1]$. In practice the horizon is an actual interval $I = [T_0, T_1]$, say, from 9:45am to 1:35pm. Given a time interval $\Delta t$, suppose the trading horizon is discretized to:

$$t_0 = T_0,\ t_1 = T_0 + \Delta t,\ \cdots,\ t_N = T_1.$$

Then the continuous volume measure $dV$ is discretized to a discrete volume "profile":

$$d_n = V([t_{n-1}, t_n)) = \int_{t_{n-1}}^{t_n} dV_t, \quad n = 1:N,$$

and let $D = \mathrm{diag}(d_1, \cdots, d_N)$ be the diagonal matrix. The target PoV function $h_t$ is similarly discretized to a column vector $H = (h_1, \cdots, h_N)^T$, with

$$h_n = \frac{1}{d_n}\int_{t_{n-1}}^{t_n} h_t\,dV_t, \quad n = 1:N,$$

whenever $d_n > 0$, and $h_n = 0$ otherwise.

Following the operator expression in Eqn. (11), we recycle the same symbol $K_\lambda$ to denote the matrix corresponding to the discretization of the continuous kernel $K_\lambda(t,s)$. Let $B = (b_1, \cdots, b_N)^T$ denote the column vector with

$$b_n = \alpha_0\cdot\mathrm{side}\cdot\bar\theta_{t_{n-1/2}}, \quad\text{with}\quad t_{n-1/2} = \frac{t_{n-1} + t_n}{2}, \quad n = 1:N.$$
Then the model is discretized to:

$$\begin{aligned}
\text{minimize over } H:\quad & B^T\cdot D\cdot H + H^T\cdot(\alpha_1 D + D\cdot K_\lambda\cdot D)\cdot H \\
\text{subject to:}\quad & \mathrm{side}\cdot H \le \mathrm{maxPoV}; \\
& \mathrm{diag}(D)^T\cdot H = X; \\
& \mathrm{side}\cdot h_n \ge 0 \text{ where } d_n > 0, \text{ and } h_n \equiv 0 \text{ where } d_n = 0.
\end{aligned}$$

Here $\mathrm{diag}(D)$ denotes the column vector consisting of the diagonals of $D$ (following MATLAB). The last condition can also be used to reduce the actual dimension of the problem by eliminating zero PoV's where there are no market volumes (i.e., with $d_n = 0$).

This discrete problem fits well into the framework of quadratic programming, and can be efficiently solved numerically by commercial optimizers such as MOSEK and IBM CPLEX, which are often integrated into the local Java or C++ libraries of in-house execution analytics.

In this section, we present several examples that help reveal the general behavior of the proposed model. We have relied on the built-in quadratic programming optimizer quadprog.m in MATLAB. The MATLAB software generating these examples is available from the author upon request.

Throughout we assume a hypothetical market that opens for 390 minutes from 9:30am to 4:00pm. The target security is assumed to have an arrival price of $p_0$ = \$30.00, and to be moderately liquid with an average daily volume (ADV) of about 5,000,000 shares. We also assume that the volume and spread profiles have been established discretely in minutes, and that the volume profile bears a typical U-shape with more volumes at the Open and Close. The primary task is to buy $X$ = 90,000 shares during a horizon that spans 90 minutes.

Under these general settings, we assume that the hypothetical client is allowed to customize the following three trading factors: (1) starting time $T_0$, (2) PoV capping maxPoV, and (3) level of risk aversion $\lambda$. The choice of starting time affects the "shape" of the volume measure over the trading horizon due to the daily U-shape. In most examples we set maxPoV = 20%, and the risk aversion to a medium level of λ = 10 − . To better compare with the existing literature, unless indicated otherwise, the price dynamics is assumed to be Brownian with constant spreads.

Plotted in Fig. 1 and Fig. 2 are the optimal execution solutions corresponding to three different levels of risk aversion: medium (Fig. 1), and high and low (left and right panels in Fig. 2). As is common for mean-variance models, more risk aversion implies more front-loading behavior. But unlike most results illustrated in the classical literature, front-loading is allowable only up to the level of the maxPoV (which is 20% in this series of examples) typically set by the clients.
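The discretized program above can be sketched in a few lines of Python. This is a minimal illustration and not the author's implementation: the paper relies on MATLAB's quadprog, whereas the sketch swaps in a plain projected-gradient iteration so that it needs only the standard library. It assumes a buy order (side = +1), that bins with $d_n = 0$ have already been removed, and that the kernel matrix is symmetric. The function names and the toy inputs (flat normalized volume, the values of `alpha1`, `alpha2`, `V_star`, `lam`) are hypothetical.

```python
import math

def project(y, d, X, cap, iters=60):
    """Euclidean projection of y onto {0 <= h_n <= cap, sum_n d_n h_n = X}.

    The projection has the form h_n = clip(y_n - mu * d_n, 0, cap); the
    multiplier mu is found by bisection since traded shares decrease in mu.
    """
    def clipped(mu):
        return [min(cap, max(0.0, yn - mu * dn)) for yn, dn in zip(y, d)]
    lo = min((yn - cap) / dn for yn, dn in zip(y, d))  # every h_n at the cap
    hi = max(yn / dn for yn, dn in zip(y, d))          # every h_n at zero
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if sum(dn * hn for dn, hn in zip(d, clipped(mid))) > X:
            lo = mid
        else:
            hi = mid
    return clipped(0.5 * (lo + hi))

def objective(H, B, d, K, alpha1):
    """B^T D H + H^T (alpha1 D + D K D) H for a buy order."""
    n = len(d)
    lin = sum(B[i] * d[i] * H[i] for i in range(n))
    quad = sum(alpha1 * d[i] * H[i] ** 2 for i in range(n))
    quad += sum(d[i] * K[i][j] * d[j] * H[i] * H[j]
                for i in range(n) for j in range(n))
    return lin + quad

def solve_pov(B, d, K, alpha1, X, cap, iters=500):
    """Projected gradient started from the constant-PoV strategy X / sum(d)."""
    n = len(d)
    A = [[d[i] * K[i][j] * d[j] + (alpha1 * d[i] if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    # 2 * (max absolute row sum) bounds the Lipschitz constant of the gradient.
    L = 2.0 * max(sum(abs(a) for a in row) for row in A)
    H = [X / sum(d)] * n
    for _ in range(iters):
        grad = [B[i] * d[i] + 2.0 * sum(A[i][j] * H[j] for j in range(n))
                for i in range(n)]
        H = project([H[i] - grad[i] / L for i in range(n)], d, X, cap)
    return H

# Toy inputs (all assumed): 12 bins, flat volume so that V_t = t, constant
# spread, exponential transient kernel plus Brownian risk kernel min(t, s).
N = 12
t_mid = [(n + 0.5) / N for n in range(N)]
d = [1.0 / N] * N
B = [0.5] * N
alpha1, alpha2, V_star, lam = 1.0, 1.0, 0.2, 50.0
K = [[alpha2 * math.exp(-abs(s - t) / V_star) / V_star + lam * min(s, t)
      for s in t_mid] for t in t_mid]
X, cap = 0.1, 0.2  # order is 10% of the horizon volume, maxPoV = 20%

H = solve_pov(B, d, K, alpha1, X, cap)
print([round(h, 3) for h in H])
```

For production-size grids a dedicated QP solver, as the text recommends, is the better choice; projected gradient is used here only because it makes the roles of the completion and capping constraints explicit in the projection step.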

[Figure 1 plot. Panels: optimal trading PoV (left axis) and optimal trading shares (right axis); market volume profile/measure (in shares). Horizontal axis: trading horizon (in minutes).]

Figure 1: Medium risk aversion with λ = 10 −

Demonstrated in Fig. 3 and Fig. 4 are the effects of volume measures. Under a fixed moderate level of risk aversion ( λ = 10 − ), trading in a high volume market environment can drive down the average PoV's and hence trading costs (e.g., the left panel of Fig. 4 which simulates typical risk-averse trading in a "morning" session).

Figure 2: (Effect of Risk Aversion) Left: high risk aversion with λ = 10 − ; Right: low with λ = 10 − . Higher risk aversion implies more front-loading behavior. Unlike many existing results, front loading is allowable only up to the level of the maxPoV set by a client.

In order to better understand the role of each cost component in the modeling, in the next three figures (Fig. 5 to 7), we plot optimal solutions corresponding to the boosting of the individual component coefficients: $\alpha^\circ_i$, $i = 1:3$. For each figure, we have boosted up one of the target coefficients by 10 times, while holding the other two at the original moderate level. The captions of the individual figures give more details and discussions.

In the two panels of Fig. 8, we also demonstrate the flexibility of the proposed model in dealing with price dynamics other than the classical (linear or geometric) Brownian motions. The left panel plots the optimal solution for the mean-reversal model, and the right panel for an asymmetric-volatility model.

Mean Reversion

In the mean reversal model, one assumes the homogenized price change obeys the following equation:

$$d\delta^\circ_t = -\kappa\,\delta^\circ_t\,dt + \alpha\,dW_t,$$

with the initial condition $\delta^\circ_{t=0} = 0.0$. Here $\kappa$ stands for the strength of mean reversal, and $\alpha\,dW_t$ for the Brownian component. It can be shown that the solution bears the closed form:

$$\delta^\circ_t = \alpha\cdot\int_0^t e^{-\kappa(t-s)}\,dW_s,$$

and that the auto-covariance function is given by

$$K^\circ_\delta(t,s) = \frac{\alpha^2}{2\kappa}\cdot e^{-\kappa|t-s|}\cdot\left(1 - e^{-2\kappa(s\wedge t)}\right). \qquad (14)$$

Notice that for bigger $t, s$, the covariance function behaves more like an exponential kernel.

Thus both conceptually and quantitatively, mean reversion erases long-term memory and only keeps a shift-invariant (when sufficiently away from $t = 0$) short-term correlation. Unlike Brownian motions, for which price uncertainties grow in the order of $\sqrt{t - t_0}$ as time elapses, mean reversion maintains an almost constant level of uncertainty at $\alpha/\sqrt{2\kappa}$. This implies that faster front-loading trading does not necessarily help reduce risks, as risks are time invariant. This is indeed confirmed by the left panel of Fig. 8. Furthermore, as the kernel asymptotically behaves like the exponential function, we observe the "wall" effect at the two boundaries, as well documented in the classical literature [12].

[Figure 3 plot. Panels: optimal trading PoV (left axis) and optimal trading shares (right axis); market volume profile/measure (in shares). Horizontal axis: trading horizon (in minutes).]

Figure 3: (Effect of Volume Measures) In a U-shaped daily volume measure, volume is relatively low and almost "flat" near the hypothetical "noon" time (12:00pm - 13:30pm). Under the given maxPoV = 20% and a moderate level of risk aversion, the combined effect of risk aversion and flat volumes encourages faster trading at the beginning of the horizon.

Asymmetric Stochastic Volatility

In the next example, we consider a simple form of stochastic volatility:

$$d\delta^\circ_t = \sigma\cdot\left(\exp\left(-\beta\,\delta^\circ_t\cdot 1_{\delta^\circ_t \le 0}\right) \wedge 2\right)\cdot dW_t, \qquad (15)$$

where $\sigma$ and $\beta$ are constant. Under this model, instantaneous volatility increases when the price delta is negative, but is capped at $2\sigma$. In the positive regime when $\delta^\circ_t > 0$, the instantaneous volatility remains at the constant level of $\sigma$. This behavioral transition is made possible by the indicator function $1_{\delta^\circ_t \le 0}$. The volatility is thus asymmetric with respect to the direction of price movements.

Figure 4: (Effect of Volume Measures) Model and trading parameters are held the same as for Fig. 3 except for the starting time, which affects the volume shape during the trading horizon. Left: trading in the "morning" session with more volumes skewed towards the Open; Right: trading in the "afternoon" session with volumes more concentrated towards the Close. For the "afternoon" session, risk aversion hinders taking full advantage of the richer volumes near the end of the trading horizon, and instead encourages participating more in the beginning even though the volume is comparatively lower.

As no closed form exists for the auto-covariance function, we turn to Monte-Carlo estimation using thousands of simulated paths (40,000 in this example). The estimated kernel function is then applied in the proposed execution model.

The resulting optimal solution is plotted in the right panel of Fig. 8, in which one can observe the non-smoothness associated with the Monte-Carlo kernel estimation. Since the asymmetry increases the effective volatility, risk aversion enhances the front-loading behavior compared with the symmetric case with a constant volatility $\sigma$, which is evident from the plot.
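The Monte-Carlo kernel estimation described above can be sketched as follows. This is a hedged illustration rather than the author's code: it assumes an Euler-Maruyama discretization, and the function names, the number of paths (2,000 instead of the 40,000 used in the paper), the grid size, and the values $\sigma = 1$, $\beta = 5$ are all hypothetical choices made to keep the example fast. The kernel is estimated as the sample average of $\delta_t\delta_s$, following the definition $K(t,s) = \mathrm{E}[W_t\cdot W_s]$.

```python
import math
import random

def simulate_paths(drift, diffusion, n_paths, n_steps, horizon=1.0, seed=0):
    """Euler-Maruyama paths of d(delta) = drift dt + diffusion dW, delta_0 = 0.

    Each path stores the values at times dt, 2*dt, ..., horizon.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    sqrt_dt = math.sqrt(dt)
    paths = []
    for _ in range(n_paths):
        x, path = 0.0, []
        for _ in range(n_steps):
            x += drift(x) * dt + diffusion(x) * sqrt_dt * rng.gauss(0.0, 1.0)
            path.append(x)
        paths.append(path)
    return paths

def second_moment_kernel(paths):
    """Sample estimate of K[i][j] = E[delta_{t_i} * delta_{t_j}]."""
    n_paths, n = len(paths), len(paths[0])
    K = [[0.0] * n for _ in range(n)]
    for p in paths:
        for i in range(n):
            pi = p[i]
            for j in range(i, n):
                K[i][j] += pi * p[j]
    for i in range(n):
        for j in range(i, n):
            K[i][j] /= n_paths
            K[j][i] = K[i][j]  # enforce symmetry
    return K

def asv_diffusion(sigma, beta):
    """Volatility sigma * min(exp(-beta * x * 1{x <= 0}), 2) from Eqn. (15)."""
    def vol(x):
        if x > 0.0:
            return sigma
        # The exponent is clipped at 1 (exp(1) > 2) only to avoid overflow;
        # the cap at 2 makes this identical to the unclipped expression.
        return sigma * min(math.exp(min(-beta * x, 1.0)), 2.0)
    return vol

# Assumed parameters for illustration only.
asv_paths = simulate_paths(lambda x: 0.0, asv_diffusion(1.0, 5.0),
                           n_paths=2000, n_steps=30)
K_asv = second_moment_kernel(asv_paths)
print("estimated variance at the end of the horizon:", round(K_asv[-1][-1], 3))
```

The same two functions apply to the mean-reversal model by passing `drift=lambda x: -kappa * x` and a constant diffusion, which allows the estimate to be compared against the closed form in Eqn. (14). Since the estimated matrix is an average of outer products, it stays positive in the sense of Definition 1 regardless of the sampling noise.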

7. Conclusion

While realistic trading models have to be dynamic, static pre-trade models are still important and widely implemented in the execution industry for a number of reasons and applications explained in the Introduction section.

The primary objective of the current work has been to build a continuum model that

(a) automatically accommodates the broadest varieties of price dynamics,

(b) more faithfully engages the role of market volumes in the general reductionism approach of static modeling,

(c) employs impact cost components that bear low complexities but with more market signals embedded and represented,

(d) is versatile enough to allow most popular constraints from clients, and yet

(e) is still analytically tractable and computationally feasible.

By working directly with the auto-covariance functions, the model allows virtually any price dynamics, in particular, Markovians like Brownians or non-Markovians with memories. Building upon the foundation of measure theory, we have also treated market volumes as Borel measures over the execution horizons. Pre-trade executions are then considered to be absolutely continuous measures over such measure spaces, which naturally results in the target decision variable to optimize: the PoV rate function. All four impact cost components have been consistently built upon the PoV function. They are all kept linear but with more market signals integrated, including volume distances for the transient costs and cumulative volume normalization for the permanent costs. We have also considered heuristically the influence of in-house Child Order Placement strategies and quantitative market makers on impact cost building.

In combination, the proposed pre-trade model has led to a constrained quadratic programming problem in infinite-dimensional Hilbert spaces, which accommodates most linear constraints frequently requested by internal or external clients. We have in particular worked with the following three primary constraints: (i) monotonicity, (ii) completion, and probably the most important, (iii) PoV capping or volume limits, which is frequently requested by clients.

We have applied the theories of positive quadratic operators and compact operators in Hilbert spaces to establish both the positivity and compactness of the operators involved, and hence also the existence and uniqueness of optimal executions. One possible numerical scheme for projecting the continuum model onto interval-based grids has also been provided, and several computational examples have been carefully designed to address the effects of all the major factors.

Despite the versatility of the proposed model, we have to make some necessary warnings. Firstly, the model has been primarily designed to be a pre-trade static model and to become a major service component in the pre-trade packages offered to both internal and external clients by execution houses. Real dynamic trading models could be heuristically built upon such pre-trade models, but have to be integrated with a sound re-optimization strategy. Next, the current work has been carried out still in the conventional reductionistic approach based on market volumes. With growing analytics and understanding about the limit order book (LOB) dynamics, it would be naturally interesting to develop LOB-based pre-trade models which are both analytically tractable and computationally feasible, and also to compare them with volume-based models (via real trading databases) to quantify the net improvements in pre-trade analytics.

We conclude the work by emphasizing that without the collective academic understanding and industrial practicing started by the many pioneers and leading practitioners, a portion of whose works have been frequently mentioned, the current work and modeling efforts would be absolutely impossible.

Figure 5: (Effect of Instantaneous Cost) The instantaneous cost coefficient $\alpha^\circ_1$ is boosted up by 10 times, with $\alpha^\circ_2$ and $\alpha^\circ_3$ fixed at the original moderate level. By definition, the instantaneous cost component is highly localized and trading now vs. later bears no extra penalties. The optimal execution is thus mostly shaped by risk aversion and is typically front-loading.

Figure 6: (Effect of Transient Cost) The transient cost coefficient $\alpha^\circ_2$ is boosted up by 10 times, with $\alpha^\circ_1$ and $\alpha^\circ_3$ held at the original moderate level. The notable boundary "wall" effect arising from the exponential transiency has been well documented in the classical literature (e.g., see Proposition 2 and 3 in Obizhaeva and Wang [12]).

Figure 7: (Effect of Permanent Cost) The permanent cost coefficient $\alpha^\circ_3$ is boosted up by 10 times, with $\alpha^\circ_1$ and $\alpha^\circ_2$ held at the original moderate level. Permanent cost alone embraces back loading, so that shares traded at later times could pay less permanent cost built up by the earlier shares. The optimal trading profile plotted here achieves a balance under the front-loading pressure from risk aversion.

8. Disclaimers

(1) The proposed model has not been the internal or external product of any execution houses where the author had worked. Any potential industrial conflict should be promptly directed to the attention of the author.

(2) Due to the proprietary nature of the industry and the resulting scarcity of real trading data to the public, general expressions like "based on the experience in the execution houses where the author has worked, ...", are purely for providing bona fide academic views based on the past working experience with real trading data and results.

(3) Any mentioning of certain brand names, e.g., MATLAB, IBM CPLEX Optimizer, or MOSEK, etc., is not a product endorsement from the author for purchasing or investment, but an indication of some popular practices in the contemporary industry.

(4) Execution houses who are interested in the current work and plan to implement it in their systems should be aware of any other operational risks, including for example, the complexities at market opens or closes, trading halts, stock splitting, or any extreme market events, etc.

Figure 8: (Effect of Price Dynamics) Left Panel: under a mean-reversal (MR) price dynamics; Right Panel: under a price dynamics driven by asymmetric stochastic volatilities (ASV). As the MR dynamics erases long-term memories and converges to time-invariant short-term correlations, execution delay does not necessarily increase risk. Consequently, the front-loading behavior normally associated with risk aversion is weakened. The "wall" effect is the classical behavior of the exponential kernel [12], which the MR auto-covariance function converges to (see Eqn. (14)). On the right panel, the asymmetric stochastic volatility model (15) increases the effective instantaneous volatility compared with the normal Brownian dynamics and hence the execution risk associated with delaying. Therefore, it enhances the front-loading behavior associated with risk aversion.

9. Acknowledgment

The author wishes to thank the following colleagues on Wall Street: Robert Almgren for explaining to the author, then a freshman on Wall Street in 2007, how Wall Street operates under the introduction of my dear friend Andrea Bertozzi; Adlar Kim, Max Hardy, Daniel Nehren, Xu Fan, Chang Lin, Jesus Ruiz-Mata, Kathryn Zhao, Calvin Kim, Harry Rana, Arun Rajasekhar, et al., for all the delightful days at J. P. Morgan working so hard together to build solid Delta One and portfolio hedging, risk, and trading products; Anlong Li, Iaryna Grynkiv, Paul Radovanovich, Federico de Francisco, Mark Skinner, Merrell Hora, Rishi Dhingra, Lada Kyj, Nazed Mannan, Allison Greene, Alan Chen, Ajit Kumar, Yihu Fang, Li Xu, Sunmbal Raza, Peter Ciaccio, Peter Norr, Hasan Ahmed, Bing Song, Ming Yang, Huaguang Feng, and George Liu, et al., for working so closely at the Barclays Capital from day to day as a grand team to deliver high-quality quantitative and technological solutions to modern algorithmic trading and portfolio analytics, and for the favorite donuts and coffees from the Dunkin Donuts, and all mini or max cupcakes from Melissa or Magnolia!

The author is particularly grateful to these colleagues and close friends, who had not only shaped his Wall Street career but also deeply and warmly touched his everyday life: Adlar Kim, Max Hardy, Daniel Nehren, Paul Radovanovich, Bing Song, and Huaguang Feng.

References

[1] R. Almgren. Optimal trading with stochastic liquidity and volatility. SIAM J. Financial Math., 3:163–181, 2012.

[2] R. Almgren and N. Chriss. Optimal execution of portfolio transactions. J. Risk, 3:5–39, 2000.

[3] R. Azencott, A. Beri, Y. Gadhyan, N. Joseph, C.-A. Lehalle, and M. Rowley. Realtime market microstructure analysis: online Transaction Cost Analysis. ArXiv e-prints, arXiv1302.6363, http://adsabs.harvard.edu/abs/2013arXiv1302.6363A, 2013.

[4] D. Bertsimas and A. W. Lo. Optimal control of execution costs. J. Financial Markets, 1(1):1–50, 1998.

[5] B. Bouchard, N.-M. Dang, and C.-A. Lehalle. Optimal control of trading algorithms: a general impulse control approach. SIAM J. Finan. Math., 2(1):404–438, 2011.

[6] T. F. Chan and J. Shen. Image Processing and Analysis: variational, PDE, wavelet, and stochastic methods. SIAM Publisher, Philadelphia, 2005.

[7] Y. Eidelman, V. Milman, and A. Tsolomitis. Functional Analysis - An Introduction. Amer. Math. Soc., 2004.

[8] L. C. Evans and R. F. Gariepy. Measure Theory and Fine Properties of Functions. CRC Press, Inc., 1992.

[9] G. B. Folland. Real Analysis - Modern Techniques and Their Applications. John Wiley & Sons, Inc., second edition, 1999.

[10] G. Huberman and W. Stanzl. Optimal liquidity trading. Yale School of Management Working Papers, YSM 165, 2001.

[11] R. Kissell and R. Malamut. Algorithmic decision making framework. J. Trading, 1(1):12–21, 2006.

[12] A. Obizhaeva and J. Wang. Optimal trading strategy and supply/demand dynamics. J. Financial Markets, 16(1):1–32, 2013.

[13] A. F. Perold. The implementation shortfall: Paper versus reality.