Artificial intelligence approach to momentum risk-taking
IVAN CHEREDNIK, UNC CHAPEL HILL †
Abstract. We propose a mathematical model of momentum risk-taking, which is essentially real-time risk management focused on short-term volatility of stock markets. Its implementation, our fully automated momentum equity trading system presented systematically, proved to be successful in extensive historical and real-time experiments. Momentum risk-taking is one of the key components of general decision-making, a challenge for artificial intelligence and machine learning with deep roots in cognitive science; its variants beyond stock markets are discussed. We begin with a new algebraic-type theory of news impact on share-prices, which describes well their power growth, periodicity, and the market phenomena like price targets and profit-taking. This theory generally requires Bessel and hypergeometric functions. Its discretization results in some tables of bids, which are basically expected returns for main investment horizons, the key in our trading system. The ML procedures we use are similar to those in neural networking. A preimage of our approach is the new contract card game provided at the end, a combination of bridge and poker. Relations to random processes and the fractional Brownian motion are outlined.
Key words: news impact, decision-making, risk management, stock market, short-term volatility, momentum trading, fractional Brownian motion, artificial intelligence, machine learning, neural networks, cognitive theory, behavioral finance, card games, Bessel functions, hypergeometric functions
MSC 2010: 33C90, 33C10, 60K37, 68R01, 90B50, 91A35, 91A80, 92C30, 93E35, 33D90, 91E10, 91E45, 68T27, 68T37
† March 18, 2020. Partially supported by NSF grant DMS–1901796.
Contents
Introduction
Objectives and tools
Organization of the paper
AI and risk-taking
Universality of MRT
Games as concepts
Momentum investing
Modeling news impact
Hierarchy of news
Adding price targets
Logistic modification
Investing regimes
Two events, comments
Profit-taking etc
Toward Macdonald processes
Market implementation
Major challenges
Forecasting
Tables of two-bids
Basic system operations
Testing the system
Some charts
Pont, a card model
General design
Description
Variants
Comments
Concluding remarks
MRT: main findings
1. Introduction
1.1. Objectives and tools.
We propose a new theory of momentum risk-taking, which is basically real-time risk management, one of the key components of general decision-making. We focus on momentum risk-taking, MRT, when our decisions must be fast and mostly short-term. This is a development of "thinking fast" from [Ka]. Stock markets are the key to us; a new approach to short-term volatility and high frequency trading is the main theoretical result of this paper. Its implementation is a momentum trading system, which was extensively tested in stock markets, including real-time trading. The discussion of its performance is an important part of the paper. Stock markets provide a unique opportunity to test our theory, but the core mechanisms of MRT seem quite universal, well beyond investing. We will argue that MRT is a major purpose of any intelligence (not only with humans). Our results indicate that modeling such mechanisms is within the reach of artificial intelligence systems; they can be natural "ends" and also indispensable research "means", as we try to demonstrate.
We propose a new continuous mathematical model of news impact on share prices, and then describe its "stratified discretization", necessary to deal with discontinuous functions. The news impact of a single event is basically $t^r$ in terms of time $t$ with fractional powers (exponents) $r$, multiplied by proper functions of the form $\cos(A\log(t)+Bt)$. The $\log(t)$-periodicity here resembles that of Elliott waves. The $t$-periodicity is related to profit-taking, which we associate with the asymptotic periodicity of Bessel functions. A mathematical understanding of profit-taking is of obvious importance in the theory of market volatility; see e.g. [ACH, EN, FL, FPSS] concerning the latter. Hypergeometric functions serve the case of two events, which is connected with certain types of hedging. Generally basic (difference) hypergeometric functions appear here, but we stick to continuous models in this part of the paper.
As with any theory, ours must be checked experimentally. Stock charts are the main examples for us; stock markets are quite a test for any risk-management theories. An obvious problem is that such charts are discontinuous, so the differential equations must be replaced by difference ones. Novel approaches to the discretization proved necessary; cf. [ChS]. We restrict ourselves to relatively short time periods after the event; high volatility right after the news is mostly avoided. Then the core of our approach is the usage of tables of bids, which is based on ranked collections of sample time-forecasts for different time-horizons.
Given a chart, these tables provide a short-term prediction of its evolution, which involves the prior behavior in some "non-linear" way; cf. [Gu].
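The single-event impact described above, $t^r\cos(A\log(t)+Bt)$, can be illustrated numerically. The following is a minimal sketch under stated assumptions, not the implementation used in the trading system: the function name `news_impact`, the scale constant `c`, and the sample value of `A` are hypothetical; only the exponent 0.137 is a value quoted in this paper (for day-trading). With $B=0$, rescaling time by $e^{2\pi/A}$ leaves the cosine factor unchanged and multiplies the impact by $e^{2\pi r/A}$, which is the $\log(t)$-periodicity mentioned in the text.

```python
import math


def news_impact(t, r, A, B, c=1.0):
    """Illustrative single-event impact c * t**r * cos(A*log(t) + B*t), t > 0.

    The constant c and the function name are assumptions made for this sketch.
    """
    if t <= 0:
        raise ValueError("t must be positive")
    return c * t**r * math.cos(A * math.log(t) + B * t)


# Illustrative parameters: r is the day-trading exponent quoted in the paper,
# A is a hypothetical frequency of the log-periodic factor.
r, A = 0.137, 4.0

# One full period in log(t): multiplying t by this factor shifts
# A*log(t) by exactly 2*pi.
scale = math.exp(2 * math.pi / A)

t0 = 2.0
ratio = news_impact(scale * t0, r, A, B=0.0) / news_impact(t0, r, A, B=0.0)

# With B = 0 the cosine factors cancel, so the ratio equals scale**r.
expected_ratio = scale**r
```

A nonzero `B` adds the ordinary $t$-periodicity, which the text relates to profit-taking; the exact self-similarity under rescaling then no longer holds.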
Forecasting here is not on the basis of derivatives of charts or their difference counterparts, though they are of course employed. The usage of "global factors" and prior experience is the key for us, which is no different from the way our brain works.
Our tables are actually similar to bidding tables in contract card games, though the role of time, the non-linearity of our tables, and some other features have no counterparts with cards. In the realm of stock markets, the tables determine optimal horizons, expected durations of the investments, and also provide the corresponding short-term forecasts for the share-prices. We think that our brain "employs" similar auction-type procedures when risk-taking, so such bidding tables are universal well beyond playing cards and trading stocks.
A usual approach to understanding the ways our brain works is via carefully designed experiments, which are mostly focused on very specific, basic, simplified and sometimes artificial tasks. However, the simpler the challenges, the more special and primitive the tools our brain invokes. So laboratory experiments can generally clarify only very basic features; they are games in a sense. With any game, our brain readily switches to the corresponding optimal thinking mode, at least upon some training; we are very good at this. So the experiments mostly measure our ways to play this particular game. The risks must be as real as possible to force our brain to use its full potential, which is hardly possible in experiments. Real behavior of people is difficult to recreate in artificially designed situations, even well crafted ones.
It seems that the most promising, if not the only, rigorous approach to understanding risk-taking and other processes of this kind is to do our best with creating artificial intelligence systems and then comparing their decisions in real situations with those of people.
Of course the "super-aim" here is to reach some "superhuman" levels, but even any "simple" reproduction of our real behavior is a breakthrough.
The automated momentum trading system based on our approach can be seen as a step in this direction; it is discussed in the second part of this paper. Its preimage is a new contract card game presented at the end. By design, this system uses only share-prices; i.e. it operates only on the basis of technical analysis. So it is inevitably "late" with any decisions vs. professional traders and investors, and is subject to the bid-ask spread and many other factors reducing profitability. In spite of such disadvantages, the system proved to be profitable, which is some justification of our approach. The success of any AI system can of course be only an indirect confirmation of its principles.
We discuss the main features of our trading system in the paper and provide some typical results of its performance. Designing historic experiments was a serious consideration, to prevent the usage of any kind of "future". At least, this is impossible with real-time trading, where the system was tested systematically (with about 1000 companies). The results we provide can supply those who will implement our approach and the tables practically with some benchmarks. We think that the pont-tables from Section 4.3 can significantly help to understand and get used to the tables from Section 3.3.
Importantly, we can always "explain", interpret to be exact, the trades our system makes; our system is not a black box and can shed light on the risk-taking preferences of the traders. This provides some quantitative model of "thinking fast" from [Ka]. Traders must quickly react to many unknown factors, which creates some special market intuition of obvious interest to cognitive theory and behavioral finance.
1.2. Organization of the paper.
In the current section, we describe our approach and discuss its general origins, including risk-management and cognitive theory. Momentum risk-taking is essentially short-term forecasting based on whatever information is available (frequently incomplete). We demonstrate that it can be modeled mathematically. Due to our focus on professional trading, we can disregard the expected utility hypothesis originated by Daniel Bernoulli, the asymmetry between loss and gain from prospect theory, and similar. The market agents are assumed to act "rationally" on the basis of the current news impact, so the purpose of our AI system is to capture their preferences.
Section … difference setting.
The connection with fractional Brownian motion (fBM) is briefly discussed at the end of Section 2.4. Our price-functions are related to the standard deviations and transition probability densities of the corresponding processes, which provides a statistical framework for our approach. See [Che, GJR, GNR] and [Bo] concerning fBM in studying market volatility and power-laws for price functions.
Using a single fBM with small Hurst exponent as a model for a price function creates a theoretical problem with the existence of arbitrage (some kind of "free lunch") due to the negatively correlated increments. However, mixed fBM are arbitrage free, and anyway we consider only relatively short time-intervals, where this concern is not quite relevant. The author thanks Patrick Cheridito for a discussion.
We mostly consider the impact of one or two events. Statistical ensembles of news are mathematically significantly more challenging. The corresponding stochastic processes are similar to those in [BC]. Also, our trading system provides some experimental support for our approach to modeling the impacts of isolated consecutive events. The "multi-dimensional" theory of ensembles of events seems difficult to check experimentally and use practically.
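The remark on negatively correlated increments can be checked directly from the standard covariance of fractional Brownian motion, $\mathrm{E}[B_H(s)B_H(t)] = \frac{1}{2}\left(|s|^{2H}+|t|^{2H}-|t-s|^{2H}\right)$. The sketch below is only an illustration of this textbook formula; the function names are hypothetical, and using 0.137 as a Hurst exponent is an assumption made here to have a small sample value. The covariance of two adjacent unit increments equals $2^{2H-1}-1$, which is negative exactly when $H<1/2$, and the standard deviation at time $t$ is $t^H$, a power law.

```python
import math


def fbm_cov(s, t, hurst):
    """Covariance E[B_H(s) B_H(t)] of standard fractional Brownian motion."""
    h2 = 2.0 * hurst
    return 0.5 * (abs(s) ** h2 + abs(t) ** h2 - abs(t - s) ** h2)


def adjacent_increment_cov(hurst):
    """Cov(B(1) - B(0), B(2) - B(1)) = Cov(B(1), B(2)) - Var(B(1))."""
    return fbm_cov(1.0, 2.0, hurst) - fbm_cov(1.0, 1.0, hurst)


def fbm_std(t, hurst):
    """Standard deviation of B_H(t); equals t**hurst for t >= 0."""
    return math.sqrt(fbm_cov(t, t, hurst))


# Illustrative small Hurst exponent (an assumption for this sketch).
small_h = 0.137
cov_small = adjacent_increment_cov(small_h)   # negative: anti-persistent
cov_bm = adjacent_increment_cov(0.5)          # zero: ordinary Brownian motion
cov_large = adjacent_increment_cov(0.8)       # positive: persistent
```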
Section … trading system. The results of extensive experiments, including real-time trading, serve two purposes. First, we provide evidence for power-laws for price functions with exponents depending on the investment horizons (say, the exponent can be 0.137 for day-trading). Second, we provide some performance benchmarks for those who will follow our approach in their own trading systems. Our system has many new features, including the simultaneous running of its multiple variants (sometimes even with identical opti-parameters but with different entry points), simultaneous pro-trend and contra-trend trading, the usage of the results of optimization for creating weights of companies, and so on. Potential followers must know what to expect, theoretically and practically. We also explain how testing the system was performed.
We do not discuss much the machine learning procedures we employ, namely the optimization of parameters and creating the company weights. The discretization parameters, counterparts of action potentials for neurons, are the key for us, but there are also important difference counterparts of first and second derivatives of the charts we use for forecasting and trading. There is vast ML literature on entropy, information theory, Bayesian predictive method and generative adversarial networks, GAN. The latter approach is somewhat similar to our auction-type procedure, when different decision-making "bids" from different investment horizons contest with each other; cf. [DB, HS]. See also [SC] about some general perspectives of deep learning.
Since we deal with a limited number of opti-parameters (all of them have theoretical meaning), a relatively straightforward gradient method is mostly used for the optimization. It is rare that our AI system cannot find the parameters providing a solid jump in performance for almost any "education periods", though their uniqueness is of course not guaranteed. This is for individual companies or portfolios. The weights of the companies we use are based on the prior optimization. We omit the discussion of the usage of correlations between equities in this paper, which is common in automated investing systems. The impact of our optimization-based weights can be significant, but using them generally restricts the trading volumes, which is a consideration for us.
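To make the phrase "relatively straightforward gradient method" concrete, here is a minimal sketch of gradient ascent with a finite-difference gradient over a small parameter vector. It is an illustration under stated assumptions, not the optimization procedure of the trading system: the objective `toy_performance` is a hypothetical stand-in for the performance over an "education period", and the step size, iteration count, and function names are arbitrary choices.

```python
def numerical_gradient(f, x, h=1e-5):
    """Central finite-difference gradient of f at the point x (a list)."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad


def gradient_ascent(f, x0, step=0.1, iters=500):
    """Plain gradient ascent: move along the gradient with a fixed step."""
    x = list(x0)
    for _ in range(iters):
        g = numerical_gradient(f, x)
        x = [xi + step * gi for xi, gi in zip(x, g)]
    return x


def toy_performance(params):
    """Hypothetical smooth objective with a single maximum at (1, -2)."""
    a, b = params
    return -(a - 1.0) ** 2 - (b + 2.0) ** 2


best = gradient_ascent(toy_performance, [0.0, 0.0])
```

A real performance function of a trading system is typically noisy and may have several local maxima, which is consistent with the remark that uniqueness of the optimal parameters is not guaranteed; the sketch only shows the basic iteration.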
Section … by designing a contract card game, pont, combining the elements of bridge and poker. It adds poker-style uncertainty to the bridge-type auction. The contract is declared on the basis of 6 cards, but the hand can consist of up to 9, which is determined by the declarer, the winner of the auction. Thus we add poker risk-taking to bridge-type bids, which is similar to the bids in our trading system. It appeared that the players can easily get used to such "fractional bids"; the size of the hand is the denominator. It is closely connected with our approach to discretization (the key in any neural networking); our bids are discrete, though the threshold is subject to optimization. The play (taking tricks) has nothing to do with stock markets: pont is just a game. However the play makes the bidding sufficiently non-trivial to consider pont as a model of real risk-taking; this is missing in poker, where there is no actual "play".
1.3. AI and risk-taking.
The purpose of artificial intelligence (AI) systems is to perform tasks that require human cognition. Actually, the aim here is to exceed human decision-making abilities using computers and machine learning at full potential. Even if the quality of automated decisions is mediocre, the cost efficiency, speed and the broad range of applications can be "superhuman" and result in great societal and economic benefits. There is a lot of progress with narrow AI, focusing on special tasks. However we are decades away from general purpose AI according to the conclusion from "The National Artificial Intelligence Research and Development Strategic Plan (2019 update)" by the National Science & Technology Council (USA). The astonishing versatility and flexibility of human intelligence remains quite a challenge, and not only because our brain contains about 100 billion neurons.
Decision-making is the key test for any AI systems. This is quite a complex process. Risk management is one of its important components, which is generally a system of protection measures aimed at reducing future risks. Here the focus is mostly on general adjustments, not on exact timing. There is a direct analogy with predicting earthquakes; even if potential places are known, we do not know when, especially in advance. See e.g. [FL] for various aspects of risk management, including high-frequency trading, and [EN] for mathematical aspects.
Momentum risk-taking can then be broadly defined as real-time risk management, prompt responses to events and developments, short-term forecasting. Here timing is the key, though a lot of prior knowledge and experience is needed; see e.g. [EN]. The events we are reacting to are mostly not really new; almost always they have occurred before. The problem is to address quickly their strength and other factors involved. Real-time monitoring of the developments before and after our actions is an important part of risk-taking. The response can be required immediately, so it can be difficult to understand what really affected our decision.
Thinking fast, intuition, subconscious processes are certainly involved; this can be not too transcendental, a special mode of our brain to quickly manage time-sensitive information.
If the subconscious processing of the signals is essentially similar to the usual (rational) one, then risk-taking AI systems can be quite relevant. Moreover, if this is true, then using AI can help a lot to understand which kind of "thinking fast" and "intuition" is involved here; this alone is quite a motivation of the present paper and our project, without any reference to stock markets. One of our main observations (based on machine modeling) is that core risk-taking is actually controlled by very few parameters. Moreover, these parameters seem to be of universal nature, though they are obviously adjusted to serve concrete situations.
Both findings can be seen in stock markets. As some confirmation of the broad nature of risk-taking, the results based on the optimization of individual companies are only modestly better than the results based on the optimization performed for the groups, "portfolios", of companies. Generally, the greater the variety of different risk-taking tasks you went through, games included, the greater your risk-taking skills. This sounds quite obvious, but is very difficult to implement in any automated systems; developing general purpose artificial intelligence systems is needed here, not just those focused on specific tasks.
1.4. Universality of MRT.
Let us try to outline which minimal tools seem necessary for any risk-taking. This will not be a biological (neuroscientific) or philosophical discussion. We mostly rely on the fundamental mathematical nature of the corresponding differential equations. Our brain does many things; decision-making is one of the keys to our (any) intelligence. Many aspects of our intellectual activities have no or almost no clear connection with risk-management; however the origins of quite a few of them still have something to do with decision-making. Science is an obvious example. Even if some research directions are focused on highly abstract objects, the value and motivation of such research can be not just our curiosity or sense of fulfillment; almost any research is somehow related to decision-making, at least historically. Also, many aspects of our intellectual life that seem or are "artificial", say the games we design and play, serve a clear purpose of developing and training our decision-making abilities (social skills included), including preparing us for real risk-taking.
With this pragmatic understanding of intelligence, there must be no significant difference between humans, other creatures, and artificial systems. Here long-term risk-management is quite different. Unconditional instincts and reflexes are generally the key for such management in nature. Our ability to "rationally" plan long-term is of course a special and cultural phenomenon. This requires at least a highly sophisticated concept of time, among many other intellectual abilities unique to humans. Generally, anything about long-term forecasting and planning seems beyond what we can hope to model mathematically, especially for the processes with high uncertainty.
However, upon restricting ourselves to relatively short intervals and to momentum risk-taking, MRT for short, mathematical modeling seems much more doable. Almost any creatures have some basic concepts of time in this range, sometimes at the level of chemical and physical processes. Thinking plays a significantly greater role for short-term forecasting and performing the corresponding actions vs. instincts and reflexes; this is not only for humans, though some special MRT-type intuition is of course heavily involved, humans included.
It is not impossible that the core mechanisms of MRT can be observed in the "neural architecture" of our brain. One of the main mechanisms is the well-studied action potential; its counterparts are the key for the discretization in this paper. The auction-type procedures between different options are present too. We claim that there must be at least one more important component here. Mathematically, any decision making requires some price-functions. Such "functions" must be present somehow, but it can be difficult to understand how such functions can be practically formed and "stored" in our brain.
Recall that MRT is news-driven short-term forecasting and the corresponding risk-taking. Our brain does a lot of things. For instance, about 50 percent of the cortex is doing vision. Our model is naturally focused only on newly emerged events. After news is determined (which is quite a process), the brain must do its initial classification and invoke the corresponding weight or rank of this news within the current task, which is basically the expected significance. The weight is of course based on the prior experience, learning included; having it somehow is a must for any decision-making systems.
This is actually the key. We demonstrate that the differential (or difference) equations for the news-function alone are mathematically insufficient without adding the price-function. Only with these two functions and their interaction does the corresponding mathematical model become satisfactory. The news-function basically measures the resources (the number of neurons) currently involved in the analysis of a particular event. The price-function measures an expected importance of this particular event vs. other events and the corresponding expected brain resources needed for its analysis. The latter depends on the news-function: the greater the number of neurons currently involved in the analysis of the news, the greater its potential importance. The initial "price of news" will increase when the news generates neural activities greater than was expected. Vice-versa, when the current number of neurons involved approaches some expected levels (provided by the price-function), the news "fades". Its impact can still continue to grow, as well as the price-function, but the brain will then attempt to reduce the resources used for its analysis.
This sort of interaction is essentially the system of differential equations we suggest, which can be generally solved in terms of power-type functions and Bessel functions.
Accordingly, the prediction is that the price-function must be somehow formed, constantly updated, and stored in the brain to be used again and again when the events of this kind emerge. We claim that such 2-function "interaction", the actually needed resources vs. those expected to be used, is necessary for any kind of MRT. This is of course a mathematical prediction, but the simplicity and fundamental nature of the corresponding differential and difference equations is a strong argument in its favor. Actually, these equations are relatively new, though with strong connection with classical special functions. Nonsymmetric Bessel functions, Dunkl operators and other latest tools in harmonic analysis are involved here.
An example: driving. Brain activity while driving a car is a convenient example of "general" MRT. Permanent basic visual information is a must here, and even more such and similar information can be "requested" to provide for MRT. This is very resource consuming, much more than MRT itself. Also, as with many tasks, driving can be combined with talking, listening to the radio, thinking about something etc.
The actual beginning of MRT is when our brain identifies events. They can be some turns or road curves, especially those requiring special attention, road signs, pedestrians, neighboring cars, navigation matters and so on; see e.g. arXiv:1711.06976, arXiv:1906.02939 on self-driving cars. Importantly, such categories of potential events are supposed to be analyzed constantly and simultaneously, even if the next news is or will be in a particular category. There is an almost exact analogy with trading stocks, especially when they are treated independently, the main regime for our trading system: all stocks considered for potential investing and investment horizons must be constantly monitored, regardless of the current or expected positions.
Only some events will reach in our brain the level of signals. The separation of the signals from noise is quite a problem, which requires a lot of prior experience. By noise, we mean "insignificant events", those that hardly require special consideration. After the signals are determined, our brain is supposed to estimate the resources needed for the analysis of the signal. This stage requires invoking from the memory the latest price of the event, which is essentially an estimate for the expected brain resources (neurons) needed for its analysis.
Then a systematic analysis begins, which can trigger a number of neurons significantly different from what was initially "allocated". This can be because of unexpected complexity of the event, due to changing its priority for driving and so on. The greater the neural activity, the greater the growth of the corresponding price-function. However, when the activity goes beyond the projected price-levels, our brain will attempt to reduce the number of neurons involved.
The risk-taking is this analysis and the corresponding driving maneuvers (when needed). The rank (category) of the news and the intensity of its impact must be high enough for this "action", which is similar to the usage of our bids. Our main assumption is that short-term news impact grows as a power of time with some fractional exponent. This provides predictions for the intensity of the corresponding brain activities, which can be used to end or restrict our analysis when they drop below some termination curves. Similar to trading stocks, this can help to optimize the usage of the brain resources by switching to new events when the importance of the current ones begins to "fade".
1.5. Games as concepts.
AI systems do not always follow the ways of our brain, even if the problems are human-related. However nature, our brain included, is definitely the prime source of concepts for any AI. Just to give an example, airplanes are very different from birds, but the concept of flying is from nature. This is no different for AI. Narrow AI systems, in specialized well-defined domains, can sometimes follow "non-human" ways. However general AI systems are expected to borrow a lot from human intelligence, though the final implementations can be quite different: "aircraft vs. birds".
Importantly, many faces of decision-making are reflected in the games we play. Some include timing, some do not. For instance, solving puzzles and playing chess are not focused on timing, unless in tournaments. On the other hand, poker and contract card games are time sensitive. The interaction and risks in card games are as close as possible to real life, for models of course, which they are. Investing is obviously closer to playing poker than to playing chess or bridge. Poker's bidding is a great model of dealing with uncertainties, but the risks are too "mathematical" and the actual "play" is missing. Solid rules and protocols make the stock market some kind of a game, but here the risks are more than real. From this perspective, it provides a highly developed and quite universal "model" of risk-taking, which is of obvious interest.
Psychologically, games reflect life in various ways, potentially preparing us for real challenges; we naturally discuss only "intellectual games". Some are designed to deal with real tasks; playing them can be more dangerous than life itself. Using game theory, especially mean field games, is quite common in financial mathematics; see e.g. [GLL]. From this perspective, one can look for the games reflecting our concept of momentum risk-taking. We found none and invented a new one, pont, which is essentially a version of bridge with poker-style bidding.
Stock markets by design mean that their "agents" look only to their own interests and to market prices [GLL], though investing is a complex and very much interactive process with solid grounds in our psychology. Nevertheless investing is a great confirmation of the universality of momentum risk-taking, MRT. It looks like there is some general purpose risk-taking source code in our brain in charge of all kinds of momentum risks, which constantly improves itself whatever the nature of the risks and uncertainty. If this is true, then we can try to use AI to understand this code and to model it!
Philosophically, we test here Kant's antinomy 2 (atomism), by considering risk-taking as composite substance, and his antinomy 3 (causal determinism) concerning the variability of decision-making. We study stock markets as "an end in themselves", disregarding their economic and societal purpose for the sake of mathematical modeling.
1.6. Momentum investing.
Marshmallow test. A well-known test for children, "one marshmallow now or two in 15 minutes", is actually one of the origins of our modeling risk-taking in stock markets. The latest psychological experiments found limited support for the thesis that delayed gratification with children leads to better outcomes in their future [WDQ]. Waiting for 15 minutes here can be expected for those who have already learned that "patience is rewarded", but not for all and not always. Also, the 15 minute interval can be not that short for little ones. Their impatience can depend on the age, social and economic background; some can simply favor short-term approaches. If the interval increases (say, days), some uncertainty is added, or the reward is diminished, the "impatience" can be well justified; the problem is "how much?".
Similar to [Ka], we began with some analysis of psychological roots. A starting point of our research was a postulate that we have quite a rigid "table" of risk preferences in our brain. To give the simplest example, if the return was 1% today and you count on an extra 1% tomorrow, then do not sell. However, if only a fraction is expected tomorrow and even this is not granted, then sell now and avoid extra risks.
This is very basic; and obviously the interval matters here. What justifies risks for a day-trader can be not acceptable to the traders with greater investment horizons. Of course, the better someone's trading experience, the more realistic the expectations, and many other factors are involved. However the mechanism in our brain that determines acceptable risks seems quite universal to us. To give a model example: 1.5 marshmallows tomorrow vs. 1 now "sounds reasonable". Indeed, the offer "2 tomorrow" is quite acceptable (for adults), but maybe just too good; however, receiving tomorrow only 1 makes no sense, since we can have it right now. So our brain possibly takes the average here, which makes 1.5 tomorrow a reasonable compensation for the delay.
The auction in pont does almost exactly this.
For instance, the smallest bid during the auction is 3/…; pont-bids are actually fractional; the next bid is 4/…, where your contract (if you win the auction) must be 4/…, …/…, …/…, …/9. The number of cards in your hand and the number of taken tricks reflect respectively the duration of the investment and the return.
The play itself is of course not market-related; this is simply a way to validate your bid. In real investing, the "contract" means opening a position and the "play" is finding the moment of its termination. The resulting return is similar to the value of the contract. There are many successful ways to invest; picking one of them resembles very much bidding in card games, but real "timing" is not well reflected in card games.
Pont somehow addresses this; it is a model of our approach to comparing returns for different durations of positions. Interesting mathematics is involved here, including Bessel and hypergeometric functions. This is not like "comparing apples and oranges", though our brain routinely compares everything with everything.
There are of course stock market features well beyond pont. For instance, the execution risks are connected with the investment risks [EF]. Also, opening short positions and terminations of long ones are based on the same sell signals, so they are related. Mathematically, pont-bids are linear (our tables are nonlinear), but they capture our method well.
Market implementation.
The termination rules we use are based on termination curves. These curves are directly linked to forecasting share prices. The hierarchy of basic pont-bids discussed above is such a curve (with 4 points): 3 tricks from 6 cards, 4 tricks from 7 or 8 cards, and 5 from 9. As with cards, the discretization of stock market bids is necessary for our trading system to work. The separation of the signals from noise, which we do successfully, absolutely requires such a discretization. Stock market charts are discontinuous functions by their nature, especially short-term ones. Also, the discretization of bids is closely related to the discretization of time, which is inevitable for finding the optimal time range of investments (in hours, days, weeks).

We will argue below that the prediction and termination curves are of type $\mathrm{const}\cdot t^{r}$, where $t$ is time and $r$ is some fraction (generally, below 1/2). This is for momentum investing, which can be defined as "investing on news". It works well for individual companies, portfolios of companies, market indexes, including SPY, the spider, and for commodities; it seems to us to be of quite general nature.

We note that the optimization becomes significantly more involved for our trading system when strict hedging is imposed, i.e. when for any open position, an equal amount is invested in the opposite direction in SPY or similar; see [BLSZ]. Theoretically, hypergeometric functions are needed here; using $\{t^{r}\}$ becomes too approximate. The system works, but the returns are less impressive. More generally, the correlations between companies are important; this is beyond the system we present, though we do the group optimization.

In our approach, we do not even try to evaluate the news itself. Its impact is measured through the response of the markets via stock prices and trading volumes. Thus, the parameters we find and use actually reflect investor risk-taking preferences, which can be expected to be sufficiently stable.
The trading frequency is one of the main factors here; see for instance [Al, ChS, CS]. The risk preferences of day-traders are quite different from those of mutual funds. The challenge is that a stock can be involved in trading with different frequencies and horizons, which is addressed in our bidding tables. This is especially applicable to trading indices; see e.g. [FPSS, GTW]. All kinds of trading patterns can be present here, but our system mostly manages them well. See [Bo, DB] on using typical time scales.

The design of our trading systems included many special market twists. For instance, the counter-trend (contrarian) variants of our trading system frequently outperform pro-trend ones. We actually used both variants simultaneously, which is some kind of hedging. Counter-trend trading can be successful for several reasons. Our system needs time to measure the impact of the news to be sure that this is not "noise"; large trade sizes are a consideration too (see [GRS]). Counter-trend trading is not unusual in stock markets [CK]. Let us mention here that the initial version of our trading system mostly relied on the intersections of termination curves with actual charts, changing the directions of positions correspondingly. It worked reasonably, but reacted slowly to fast market moves, which was improved via "start2-bids", where we used the same curves to produce signals for opening new positions; this complemented well using the intersections.

To trade in real time, our system was designed to be completely automated, a must for any AI system even if it is used interactively. See [CJP] concerning various aspects of automated high-frequency trading. We note that the trades of our system are fully explainable; it is not a "black box". Only such AI can be really trustworthy; see e.g. [HG].
Acknowledgements.
I am very thankful to David Kazhdan, who greatly contributed to the success of this project at many levels. The trading system discussed in the paper was tested (and improved many times) thanks to support and supervision by Alexander Sidorenko; I am very grateful to him. The author thanks Mikhail Khovanov for various suggestions and Jean-Pierre Fouque for his kind interest.

2. Modeling news impact
In this section, a simple mathematical model of the short-term impact of news is suggested. News-driven fluctuations of share prices are the core examples. We come to certain linear differential equations, which can be generally solved in terms of hypergeometric functions. We focus on elementary solutions only. They have market applications, which were extensively tested in various stock markets, including real-time experiments. There is another way to obtain essentially the same equations via random processes, but it will not be touched upon in this paper. See [EN]; e.g. compare their News Impact Curves with ours.

2.1. Hierarchy of news.
Let us briefly describe the types of company or industry news, which can be primary and secondary. The primary ones are basically core events and announcements, for instance: (a) new products or acquisitions, (b) significant changes of earning estimates by the company, (c) upgrades or downgrades by leading market analysts. Major sector, industry or economy news are of this kind too.

Almost any core news generates a flow of secondary news in the form of (highly correlated) reports, reviews, and commentaries. They mostly present the same core news, but sometimes can impact our behavior even more than the original event. In our model, commentaries will be generally treated on equal grounds with the core announcements.

By reports, we mean analysts' reports on the core event including perspectives and predictions. Then reviews collect and present the main findings in reports, mostly aiming at professional investors. Finally, the news itself and the findings above reach all consumers via mass media, mostly in the form of commentaries.

Importantly, consumers will be influenced by all primary and secondary news more or less regardless of the level, the "distance" from the actual event. The actual originality is not the point here. So the impact of the commentaries can be significant and quite comparable with the impact of the event itself.

2.1.1. The basic equation.
We assume that the impact of an event at the moment $t$ is proportional to the $t$-derivative of the total number of pieces of news reflecting the event after it and before $t$. The coefficient of proportionality $0 < c \le 1$ is called the reduction coefficient; it depends on time, but mostly it will be treated as a constant.

The value $c = 1$ can be reached right after the news, and then $c$ tends to 0 with time, depending on the "investment horizon" (hours, days, months); cf. [DB]. Let us comment on this. Generally,
(i) analytic reports and all secondary news tend to soften the expected implications of the core news,
(ii) commentaries of all kinds disperse the original core news and diminish the expectations even further,
(iii) the more time passes after the core event and the core news, the smaller their impact becomes.

All three mathematically mean that the coefficient $c$ approaches zero as $t \to \infty$. Indeed, putting news into perspective is the purpose of analysis and commentaries, but this almost always reduces the original expectations. In momentum investing, the impact of news fades faster for short-term investing vs. long-term. Approximately, if the trading positions are in days or weeks then $c \sim$ / $c = 1$ for months; it can be significantly smaller for high-frequency trading. Our tables provide some "natural" $c$-coefficients for different trading frequencies and investment categories.

From now on, news will be represented by a positive or negative real number, i.e. we assign a numerical value to it. Also, we assume that the time distribution of news is essentially uniform in the following sense. Let $N(t)$ be the total sum of news values (positive or negative numbers) released from 0 to the moment $t$. Then the number of pieces of news (their total value, to be exact) arriving from $t$ to $t+\delta$ for some $\delta$, i.e. $N(t+\delta) - N(t)$, equals approximately $c \cdot \delta \cdot N(t)/t$, which is $\delta$ times the reduced average of all previous news from 0 till $t$. The greater the intensity (time-density) of commentaries etc. triggered by an event, which is $N(t)/t$, the greater the number of new commentaries. We come to the following differential equation:

\[ \frac{dN(t)}{dt} = \frac{c}{t}\,N(t). \tag{2.1} \]

It can be solved immediately if $c$ is a constant: $N(t) = A\,t^{c}$ for a constant $A > 0$. When $c = 1$ the growth of $N(t)$ is linear, i.e. the event does not "fade" with time and continues to attract constant attention. We disregard that $N(t)$ can be bounded; adding the "saturation" will be addressed later. A physics-style argument in favor of this equation is its self-similarity: the solutions are multiplied by some constants when the time units change, and $c$ does not depend on the choice of units.

Tree growth.
Equations of this type can be expected to have manyapplications. Let us give one example. We will switch to a differencecounterpart of (2.1), naturally adding minimal ”maturity”: f n − f n − = cn − f n − − λf n − for n > , λ ≥ . (2.2)When λ = 0, this is a variant of the famous Fibonacci recurrence withthe birth rate cn − , i.e. when it is inversely proportional to ”time”. Theterm − λf n − restricts here over-population by allowing ”emigration”.Setting λ = 0 , c = 1 , the fundamental solutions are: f n = n, f n = D n / ( n − D n is the number of n -derangements , permutationsof n elements without fixed elements. The second solution approaches n/e as n → ∞ , so both have linear growth at infinity. We argue that f n in (2.2) basically describes the size of a tree at its n th year. I APPROACH TO MOMENTUM RISK-TAKING 17
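The two fundamental solutions quoted for $\lambda = 0$, $c = 1$ can be checked directly. The sketch below is only an illustration: it assumes that the birth rate in (2.2) reads $c/(n-2)$ and multiplies $f_{n-2}$, a reading chosen because it is the one compatible with the derangement recurrence $D_n = (n-1)(D_{n-1} + D_{n-2})$; the emigration term is switched off, and the function names are hypothetical.

```python
from math import factorial, e

def derangements(n_max):
    """Return [D_0, ..., D_{n_max}] via D_n = (n-1) * (D_{n-1} + D_{n-2})."""
    d = [1, 0]
    for n in range(2, n_max + 1):
        d.append((n - 1) * (d[n - 1] + d[n - 2]))
    return d[: n_max + 1]

def tree_recurrence(f1, f2, n_max, c=1.0):
    """Iterate f_n = f_{n-1} + c/(n-2) * f_{n-2} for n > 2 (assumed form, lambda = 0)."""
    f = {1: f1, 2: f2}
    for n in range(3, n_max + 1):
        f[n] = f[n - 1] + c / (n - 2) * f[n - 2]
    return f

N = 20
D = derangements(N)
# First solution: start from f_1 = 1, f_2 = 2 and compare with f_n = n.
linear = tree_recurrence(1.0, 2.0, N)
# Second solution: start from D_1/0! = 0, D_2/1! = 1 and compare with D_n/(n-1)!.
derang = tree_recurrence(D[1] / factorial(0), D[2] / factorial(1), N)

for n in (3, 5, 10, 20):
    print(n, linear[n], derang[n], D[n] / factorial(n - 1), n / e)
```

Both sequences grow linearly, and the second one stays close to $n/e$, in line with the statement that trees, unlike "Fibonacci rabbits", grow linearly at most.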
In contrast to the ”Fibonacci rabbits”, trees grow linearly at most.The corresponding f n − f n − is proportional to the corresponding f n − ,where the coefficient of proportionality, ”the birth rate”, is roughly thesurface area of the root system divided by the volume of the tree. I.e.it is approximately r /r = 1 /r , where the ”radius” of the tree r can beassumed linearly depending on n . The radius is qualitatively about thesame for the tree and its root system. It is proportional to the numberof tree rings ; we obtain cn − . This describes the ”middle stage” of treegrowth. In the beginning, the volume of the tree is rather r than r , sothe tree can grow exponentially for a short period of time. Also, onlythe ”active part” of its root system contributes to the growth, whicheventually diminishes r to r or so. Mathematically, this gives the term c ( n − f n − in (2.2) and results in the saturation of the tree size at thelate stage of its life cycle, which matches real tree growth.This is somewhat parallel to the claim above that the reduction co-efficient c for N ( t ) tends to 0 when t → ∞ . We note here that thesaturation here can be of course simply due to the upper bounds for N ( t ) or f n , i.e. not because of the ”geometric” argument we suggestedabove (via the root area divided by the volume); see Section 2.3.There are obvious differences between news impact and tree growth .For instance, adding − λf n − is secondary for trees (due to their aging orsimilar growth reductions), but this term is of fundamental importancefor the news. It reflects ”pricing news”; see Section 2.2. Surprisinglysuch different processes are quite similar mathematically, which clearlyindicates that (2.1) and (2.2) are of universal nature.Without going into detail, let us mention that solving (2.2) and simi-lar difference Dunkl-type equations generally requires basic (difference)hypergeometric functions and their variants. This is actually a rela-tively recent direction; see e.g. 
[Ch1].

2.2. Adding price targets.
So far we have not considered the following market-style response to news: when the news is already priced in, i.e. the current share-price already includes it, the effect of further (secondary) news goes down. Similarly, when the stock is considered underpriced, positive commentaries have greater impact. There is a specific market way to address this: upgrades and downgrades. They generally set new share-price targets. The main difference here from general news is the dependence on the current share-price. Generally, upgrades are all market, company or equity news of any level addressing (depending on) the share-prices.

Similar to $N(t)$, we represent upgrades by positive or negative numbers, using the notation $U(t)$ instead of $N(t)$. Thus $U(t)$, the sum of values of upgrades, depends on the share-price. The following normalization will be used: $u(t) = U(t)/U(0) - 1$.

Let $P_t$ be the share price and $p(t) = (P_t - P_0)/P_0$ be the rate of return from the price-level $P_0$. The equation above must be corrected for $u(t)$, since $U(t)$ goes down if the share-price "sufficiently" went up after the event, i.e. the news is already priced in. Similarly, it goes up if the stock is considered undervalued. This correction can be assumed proportional to $p(t)/t$, the average rate of change of $p(t)$ from 0, which is more "balanced" vs. taking $dp(t)/dt$ here. Thus we arrive at the differential equation:

\[ \frac{du(t)}{dt} = \frac{c}{t}\,u(t) - \frac{1}{\sigma t}\,p(t). \tag{2.3} \]

We note that the term $p(t)/t$ can be replaced by the "more aggressive" $p(t)\,t^{\nu-1}$ for $0 < \nu \le$
1; see system (2.20&2.21) below. For instance itmeans for c = 0: the longer p ( t ) grows as t ν − , the greater the numberof downgrades. I.e. p ( t ) ≈ Const t − ν is considered non-sustainable.We will switch from now on from N ( t ) to u ( t ). Here σ is qualitativelyproportional to the P/E or P/S.
More generally, it reflects the expected growth of the company. Mathematically, $\sigma$ is essentially as follows.

Let us assume that $p(t)$ is basically linear in terms of $t$ and "shift" $t = 0$ to the moment when the company is rated "strong buy". For sufficiently large $t$, we can assume that $u(t) \sim U(t)/U(0)$ and ignore $u(t)/t$; so $u(t) \sim 1 - p(t)/\sigma$ and $p(t_{max}) = \sigma$ at $t = t_{max}$ such that $u(t)$, the current rating of the company, becomes 0. This moment of time, $t_{max}$, is when analysts change their stock ratings from "buy" to "neutral" on the basis of its price valuation. So $\sigma$ is essentially the relative price-target, i.e. $\sigma \sim p(t_{max}) = (P_{targ} - P_0)/P_0$, where $t_{max}$ is the moment of time when the news is fully priced in. We will make this analysis somewhat more rigorous in Section 2.3.

Now let us involve the differential equation for the share-price. Almost no company event or news influences the share-price directly; this depends on the way the market reads the news. The simplest news-driven equation for $p(t)$ is as follows:

\[ \frac{dp(t)}{\sigma\,dt} = \frac{a}{t}\,u(t) + b\,\frac{du(t)}{dt}. \tag{2.4} \]

As with $N(t)$, here $u(t)/t$ is the average upgrade from the zero moment of time, which measures the global news impact from 0, essentially the commonly used consensus rating of the company shares. The term with $\frac{du(t)}{dt}$ is local: the response to the rate of change of $u(t)$ at $t$.
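A quick numerical experiment can make the coupled system (2.3)&(2.4) more tangible. The sketch below is an illustration under stated assumptions: the price term in (2.3) is read as $p(t)/(\sigma t)$, the parameter values are arbitrary, and `rhs` and `integrate` are hypothetical helper names. Substituting $u = t^{r}$, $p = \sigma (c - r)\,t^{r}$ into this reading shows that $r$ must satisfy $r^2 - (c-b)\,r + a = 0$, so a classical Runge-Kutta integration started on such a solution should stay on it.

```python
import math

def rhs(t, u, p, a, b, c, sigma):
    """Right-hand sides of (2.3)-(2.4), read as
    du/dt = (c*u - p/sigma)/t and dp/dt = sigma*(a*u/t + b*du/dt)."""
    du = (c * u - p / sigma) / t
    dp = sigma * (a * u / t + b * du)
    return du, dp

def integrate(a, b, c, sigma, u0, p0, t0=1.0, t1=8.0, steps=4000):
    """Classical RK4 from t0 to t1; returns (u(t1), p(t1))."""
    h = (t1 - t0) / steps
    t, u, p = t0, u0, p0
    for _ in range(steps):
        k1 = rhs(t, u, p, a, b, c, sigma)
        k2 = rhs(t + h / 2, u + h / 2 * k1[0], p + h / 2 * k1[1], a, b, c, sigma)
        k3 = rhs(t + h / 2, u + h / 2 * k2[0], p + h / 2 * k2[1], a, b, c, sigma)
        k4 = rhs(t + h, u + h * k3[0], p + h * k3[1], a, b, c, sigma)
        u += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        p += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += h
    return u, p

a, b, c, sigma = 0.05, 0.2, 0.9, 0.3      # illustrative values only
d = (c - b) / 2
r = d + math.sqrt(d * d - a)               # larger root of r^2 - (c-b) r + a = 0
u_end, p_end = integrate(a, b, c, sigma, u0=1.0, p0=sigma * (c - r))
print(u_end, 8.0 ** r)                     # numerical vs. power-law value of u(8)
print(p_end, sigma * (c - r) * 8.0 ** r)   # numerical vs. power-law value of p(8)
```

With these values both $u$ and $p$ follow $t^{r}$ with $r \approx 0.62$; changing $a$, $b$, $c$ changes the exponent but not the power-law shape, as long as $d^2 > a$.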
2.3. Logistic modification.
Before further analysis, let us touch upon the modification of equation (2.3) under the assumption that the number of upgrades or downgrades is limited. Let $\widetilde{U}(t)$ be the sum of $\pm 1$ over the upgrades and downgrades; its relation to $U(t)$ is basically as follows: $\widetilde{U}(t) = [U(t)]$ for the integer part $[x]$ of real $x$. Since $\widetilde{U}(t)$ is bounded, let $u(t) = U(t)/U_{top} < 1$ for the upper bound $U_{top}$. Then (2.3) must be modified if we want to use it for sufficiently large $t$. Namely, we must multiply the right-hand side of (2.3) by $(1 - u(t))$, which reflects the "number of remaining commentators". One has:

\[ \frac{du(t)}{dt} = (1 - u(t))\Big(\frac{c}{t}\,u(t) - \frac{1}{\sigma t}\,p(t)\Big). \tag{2.5} \]

In the absence of the price-term, it is a well-known logistic equation, with the following modification: the interaction coefficient is proportional here to $1/t$. When $p(t) \equiv 0$, it can be readily integrated.

Equation (2.4) remains unchanged:

\[ \frac{dp(t)}{\sigma\,dt} = \frac{a}{t}\,u(t) + b\,\frac{du(t)}{dt}. \tag{2.6} \]

System (2.5)&(2.6) has no elementary solutions for $a \neq 0$. Let us solve it when $a = 0$, for $b$-investing in the terminology below. One has:

\[ u(t) = \frac{\beta + B\,t^{\,r-\beta}}{r + B\,t^{\,r-\beta}}, \quad r = c - b, \tag{2.7} \]
\[ p(t) = \sigma\,(b\,u(t) + \beta), \quad 0 \le \beta < r, \ B \ge 0. \tag{2.8} \]

If $B > 0$, then $u(0) = \beta/r$, $p(0) = \sigma c \beta / r$, $u(\infty) = 1$, $p(\infty) = \sigma(b + \beta)$.

Let us assume that $u(0) = 0$, i.e. the rating of the company is "neutral" at $t = 0$. Then $\beta = 0$, $p(0) = 0$ and $p(\infty) = (P_{targ} - P_0)/P_0 = \sigma b$. So $\sigma$ is $(P_{targ} - P_0)/(b P_0)$ for the price-target $P_{targ}$, which matches the interpretation of $\sigma$ from Section 2.2 for $b \sim 1$.

For $a \neq 0$, the system can be solved numerically, but it is not clear whether the corresponding solutions are more relevant than those obtained from the original system (2.3)&(2.4). This is especially true if we do not focus on large $t$, and the simpler the better! Also, the stochastic and discontinuous nature of price fluctuations restricts using differential equations here. Furthermore, $a, b, c, \sigma$ can depend on time and do depend on the basic time-intervals, which is another reason to stick to the simplest assumptions.

Thus, we will continue with system (2.3)&(2.4). Furthermore, to address the discontinuous and discrete nature of share-prices, we will later switch from this system to "tables" of its "basic solutions". The main conclusion we will need from the analysis performed above is that $(P_t - P_{t_0})/P_{t_0}$ after the news at $t_0$ can be assumed $\mathrm{Const}\,(t - t_0)^{r}$ for some $r$ for short, but not too short, time intervals $[t_0, t]$.

2.4. Investing regimes.
Let us solve system (2.3)&(2.4). Recall that it describes fluctuations of share-prices under news-driven investing. Both $a, b$ there are non-negative. The term $b\,du/dt$ in (2.4) or (2.6) is typical for "local" pure momentum investing, when only the latest upgrades are taken into account. The term $a\,u(t)/t$ reflects a more "global", balanced and less "aggressive" approach, when the average of all news values after the event is considered.

We call the case $b = 0$ pure $a$-investing, and the case $a = 0$ pure $b$-investing. If both terms are non-zero, it is naturally mixed investing. The greater $t - t_0$ after the major event at $t_0$, the greater the chances that $a$-investing dominates.

Equations (2.3) and (2.4) can be readily integrated. Substituting $u(t) = t^{r}$, the roots of the characteristic equation are:

\[ r_{1,2} = d \pm \sqrt{D}, \quad d = (c - b)/2, \quad D = d^{2} - a. \tag{2.9} \]

Accordingly, unless $D = 0$, the formula for $p(t)$ is as follows:

\[ p(t) = C_1\,t^{r_1} + C_2\,t^{r_2} \ \text{ if } D > 0, \text{ for constants } C_1, C_2, \tag{2.10} \]
\[ p(t) = t^{d}\big(C_1 \sin(\sqrt{-D}\,\log(t)) + C_2 \cos(\sqrt{-D}\,\log(t))\big) \ \text{ if } D < 0. \tag{2.11} \]

We assume here that $d > 0$. For negative $d$, $p(t)$ approaches zero for large $t$ and therefore this is focused on "the final stage" of the impact of an event; our model and trading system are designed to serve mainly the beginning of this period. We also assume that $c$ is a constant and that $0 \le c \le 1$, so $d \le 1/2$. In fact, $c$ slowly goes to zero as $t$ increases and the impact of the event gradually diminishes, but we will not consider large $t$. Similarly, $c$ may be greater than 1 right after the major event, but this stage is disregarded too; this is addressed in our trading systems by proper "discretization".

Let us briefly discuss the oscillatory regime in (2.11). It can happen only for $a$-investing or for the mixed one. According to (2.11), the quasi-period in terms of $\log(t)$ is $2\pi/\sqrt{-D}$. So the durations of the oscillations form a geometric sequence. The magnitude will grow in time as a power function of degree $0 \le d \le 1/2$. If $b = c$, then $d = 0$ and the function $p(t)$ is bounded.

If the news is important for the share-price, $a$ can be significantly larger than $d^{2}$. Then $\sqrt{-D} \sim \sqrt{a}$, and the quasi-period for the logarithmic time $\log(t)$ is about $2\pi/\sqrt{a}$, which clarifies the role of $a$.

Let $d = 1/2$ for pure $a$-investing (when $b = 0$). This gives $c = 1$, i.e. the initial news-function $N(t)$ grows linearly. Then the $p$-function behaves as a sum of random independent jumps of the share-price by $\sigma$ or $-\sigma$ for proper $\sigma$, "heads or tails", distributed uniformly. So our equations have some statistical meaning; cf. [Gu].

For the pure $b$-investing with $b > 0$, $p(t) = C_1\,t^{c-b} + C_2$ and its leading term is $C_1\,t^{c-b}$, since $a = 0$ and $c > b$. By the way, $p(0)$ may be non-zero here; for instance, we can set $p(t) = (P_t - P_{t_0})/P_{t_0}$ for any point $t_0$ in equation (2.4).

Now let us assume that $c > 2b$, and let $\hat b = 0$, $\hat a = b(c - b)$. Then the corresponding $\hat D$ is $(c/2 - b)^{2}$, $\hat r_{1,2} = c/2 \pm (c/2 - b) = \{c - b,\, b\}$, and $\hat p(t) = \hat C_1\,t^{c-b} + \hat C_2\,t^{b}$. Since $c - b > b$, the leading term here coincides with that from the previous formula. We conclude that for $c > 2b$, pure $b$-investing gives essentially the same as pure $a$-investing for proper $a$. This happens for sufficiently large $t$: then $b$ eventually tends to zero.

The difference between these two regimes becomes significant only when $b < c < 2b$, i.e. during the middle stage of the "impact period". Indeed, the exponent $r$ cannot be made smaller than $c/2$ for $a$-investing. As to $b$-investing, $r = c - b < b$; here $b$ approaches zero when the news "fades" and the contribution of $du/dt$ to $p$ can be disregarded.

Discussion.
The leading exponent $r$ in $t^{r}$, which is $r_1$ in (2.10) and $d$ in (2.11), satisfies $d \le r \le 2d$ for $d = (c - b)/2$. Here $r \sim 2d$ occurs when $b$-investing dominates. The lower bound $r \sim d$ can be reached only when $a$-investing is strongly present. If $b = 0$, then $r = c/2$ for sufficiently large $a$. Recall that the news-reduction coefficient $c$ is generally from 0 to 1; it is close to 1 when the initial news-function $N(t)$ grows linearly. Practically, the values r < . c = 1, then r ∼ a ∼ ∼ b . Each type of investing has its own natural time-intervals, prime time-units, and its own typical average durations of positions. The time-unit can be from hours (or smaller) to months; it was mostly 2h in our trading system. Let us refer to [Bo] on power-laws for price functions, though our approach is different (we study short-term news impacts).

The $C$-constants above are essentially proportional to the value of the news and are related to the company momentum volatility, which depends on the investment horizons (reflected in the tables below). See e.g. [EN, ABL, FPSS]. The dependence of the volatility on the horizon is reflected in our tables below; it is connected with the exponent $r$. The $t$-periodicity due to profit-taking is an important factor here. Then the stochastic volatility can be generally modeled via Bessel processes, similarly to the usage of fBM discussed a bit below.
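The case analysis of the investing regimes can be condensed into a few lines. The following sketch is an illustration only, with hypothetical function names: it computes $d$, $D$ and the roots from (2.9), labels the regime, and returns the leading exponent of the envelope together with the quasi-period in $\log(t)$ when $D < 0$.

```python
import cmath
import math

def exponents(a, b, c):
    """Roots r_{1,2} = d +/- sqrt(D) of (2.9), with d = (c-b)/2 and D = d^2 - a."""
    d = (c - b) / 2
    D = d * d - a
    root = cmath.sqrt(D)              # imaginary when D < 0
    return d, D, (d + root, d - root)

def regime(a, b, c):
    """Return (label, leading exponent of the envelope of p(t))."""
    d, D, (r1, _) = exponents(a, b, c)
    if D > 0:
        return "power", r1.real       # p ~ C_1 t^{r_1}, as in (2.10)
    if D < 0:
        return "oscillatory", d       # envelope t^d, as in (2.11)
    return "degenerate", d

def log_quasi_period(a, b, c):
    """Quasi-period 2*pi/sqrt(-D) in log(t); None outside the oscillatory regime."""
    _, D, _ = exponents(a, b, c)
    return 2 * math.pi / math.sqrt(-D) if D < 0 else None

b, c = 0.2, 0.9                        # illustrative values with c > 2b
print(exponents(0.0, b, c)[2])         # pure b-investing: roots {c - b, 0}
print(exponents(b * (c - b), 0.0, c)[2])   # matching pure a-investing: roots {c - b, b}
print(regime(1.0, 0.0, 1.0), log_quasi_period(1.0, 0.0, 1.0))
```

The first two lines of output share the leading exponent $c - b$, which is the equivalence of pure $b$-investing and pure $a$-investing with $\hat a = b(c-b)$ stated above; the last line is an oscillatory example with envelope exponent $d = 1/2$.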
Connection to statistical framework. The leading term $t^{r}$ of our $p(t)$ is the square root of the variance $Var(B_H)$ of the fractional Brownian motion $B_H(t)$ (fBM for short) for the Hurst exponent $H = r$, where $r$ is as above. It also appears in the self-similarity property of fBM: $B_H(ts) \sim t^{r} B_H(s)$. One can try to introduce generalized fBM for the full solutions from (2.11) or even for those from Section 2.6 below in terms of Bessel functions. A more systematic way to link our ODE to SDE is via the Kolmogorov-type equations for the transition probability density; see e.g. equation (1.7) from [Kat] for Bessel processes.

We refer to [Che, GJR, GNR] for the basic properties of $B_H(t)$ and their applications in financial mathematics; fBM is an important tool for modeling volatility of stock markets. A qualitative reason for the connection with our approach is that the expected (percent) growth of the share-price is essentially proportional to the standard deviation of the corresponding stochastic process. Another (essentially equivalent) connection goes via expected values of options. We will not discuss the passage to SDE any further in this paper; at least, it explains that $r$ is closely correlated with the market volatility.

2.5. Two events, comments.
The impact of two events at $-\tau < 0$ and $0$ on $p(t)$ can be naturally described by the system

\[ \frac{du(t)}{dt} = \frac{c_0}{t}\,u(t) + \frac{c_\tau}{t+\tau}\,u(t) - \frac{1}{\sigma(t+\tau)}\,p(t), \tag{2.12} \]
\[ \frac{dp(t)}{\sigma\,dt} = \frac{a}{t}\,u(t) + b\,\frac{du(t)}{dt} \quad \text{for } c \stackrel{\rm def}{=} c_0 + c_\tau. \tag{2.13} \]

When $c_\tau = 0$, it describes the case when there is no news at $-\tau$, but this moment is taken as the support for the price-target; generally, price-targets do depend on historical levels. Let $b = 0$ here and below. We obtain:

\[ t(t+\tau)\,\frac{d^2p}{dt^2} + \big((1-c)\,t + (1-c_0)\,\tau\big)\frac{dp}{dt} + a\,p(t) = 0, \]

which can be integrated in terms of hypergeometric functions. Namely, $p_0(t) = F(\alpha, \beta; \gamma; -t/\tau)$ is a solution for $\gamma = 1 - c_0$, $\alpha + \beta = -c$, $\alpha, \beta = -c/2 \pm \sqrt{c^2/4 - a}$; see e.g. [AS], Ch. 15, or use the Mathematica function Hypergeometric2F1[α, β, γ, x]. One can also take here $p_1(t) = t^{-\beta} F(\beta, -\alpha - c_\tau; 1 + \beta - \alpha; -\tau/t)$ and $p_2(t)$ obtained upon $\alpha \leftrightarrow \beta$ in $p_1(t)$. When $\tau/t \sim 0$, such $p_{1,2}$ with proper coefficients of proportionality approach $t^{r_{1,2}}$ as $t \gg \tau$, for $r_{1,2}$ from (2.9) under $b = 0$.

Using deviations.
Hedging vs. SPY or some index is an example; see e.g. [BLSZ, BSV]. The assumption is that after the companies within the index reacted to some index news at the moment $-\tau$, a specific company's news arrives at 0. So if a position in a stock is hedged by investing an equal amount in the corresponding index in the opposite direction, the return will be $p(t) - p_{ind}(t)$, where $p(t)$ is governed by (2.12&2.13) and $p_{ind}$, the index' rate of return, is in the form of (2.10) or (2.11) for proper parameters. Practically, our trading system automatically determines $r_{ind}, C_{ind}, r, C$ such that $p(t) - p_{ind}(t) \simeq C\,t^{r} - C_{ind}\,t^{r_{ind}}$. However, more refined $p(t)$, solutions of (2.12&2.13) instead of $C\,t^{r}$, are significant here, especially for (relatively) small $t$.

We mention that it makes perfect sense to switch here to the corresponding difference equations. See e.g. [Ch2], Section 1, concerning the one-dimensional global hypergeometric function.

Practical matters.
Such adjustments are quite natural, but it appeared that the elementary solutions (2.10)&(2.11) already describe well the real market processes when we focus on the impact of a single event and when the time interval is not too large. The following key features of these solutions can be observed in stock markets:
(i) $t^{r}$-dependence of the envelope of the price-function for $0 < r \le 1$,
(ii) quasi-periodic oscillations of the price-function in terms of $\log(t)$.
Here $t$ is the time from the event. In our trading system, (i) is the key; the periodic oscillations are addressed using different tools, not really connected with solving differential equations.

Quasi-periodic oscillations (our second observation) are more difficult to observe and measure. Mathematically, such oscillations are typical for $a$-investing and do not appear for pure $b$-investing. They are "around" the mean values, and generally require involved statistical analysis; cf. [FPSS]. They can be mostly seen only for relatively big $t$, so they can be "overwritten" by other general market and company news and trends. As to (i), the market evidence is solid.

From the perspective of $a$-investing, the term $p(t)/t$ is some kind of profit-taking, though we will argue below that taking $p(t)$ instead of $p(t)/t$ is more relevant for "pure" profit-taking. Practically, the events or commentaries are sometimes used simply as triggers for profit-taking. Under $a$-investing, such "overreacting" mathematically means that the coefficient $a$ becomes relatively large; see (2.11).

Generally, our model "predicts" that in the absence of other major news, the intervals between consecutive rounds of $a$-type profit-taking tend to grow approximately as a geometric sequence, i.e. we arrive at some kind of Elliott waves (associated with Fibonacci numbers). Strictly speaking, the profit-taking is an effect of second order, i.e. for the share-price minus its expected average. Mathematically, the average satisfies (2.11) in our model. The oscillations of this difference are actually $t$-periodic, not just $\log t$-periodic as for $a$-investing, which will be addressed below using Bessel functions.

It is important that long-term returns of different companies become comparable in spite of very different trading patterns and volatility. I.e. they become closer to each other almost regardless of their short-term behavior. There are of course winners and losers, but the long-term rate of change is sufficiently uniform even for quite different types of companies. Mathematically, it means that the smaller $r$, the bigger the constants $C$ in (2.10,2.11). We will reflect this in our $g$-functions (3.1-3.4) and tables, making "basic returns" comparable after 3-4 months. This can be important for extending our system to trading options.

To finish this discussion, let us emphasize that the analysis above is by no means restricted to stock markets. Market instruments and tools have various counterparts beyond trading equities. For instance, short-trading, profit-taking, hedging, and doing derivatives are quite common in some forms, though they reach the most sophisticated levels in stock markets. The discontinuous nature of market data is not unusual too; it will be addressed "practically" in Section 3.

2.6. Profit-taking etc.
The model above addresses well quasi-periods under a -investing (or mixed investing). The periodicity with respect tolog( t ) is some kind of profit-taking, but the actual one is significantlymore momentum: sell when p ( t ) reaches some level . This is a ma-jor reason for short-term ”periodic” volatility, which is an importantfeature of stock markets; see also [ACH]. Its role is crucial not onlyfor short-term trading; see [FPSS]. Figures 3,8 there are the key forthem and for us. The short-term volatility is ”around” the mean value p ( t ) = p avrg ( t ). By the way, the periodicity of the volatility provides anexplanation of the profitability of counter-trend (contrarian) strategies.For ”pure” profit-taking, u ( t ) must be understood as some market”consensus” on keeping a stock at its current price. So the ”upgradefunction” must react here to p ( t ), not to p ( t ) /t as above. This isrelative to p , an effect of ”second order”, so we will need to switch to e p ( t ) = ( p ( t ) − p ( t )) /p ( t ) and the corresponding e u ( t ).The most natural assumption is the proportionality of d e p ( t ) /dt to e u ( t ). Adding the term a e u ( t ) /t to (2.15) is possible too (see (2.21)),but the key change is the replacement of p ( t ) /t there by p ( t ). One has: d e u ( t ) dt = ct e u ( t ) − σ e p ( t ) , (2.14) d e p ( t ) σdt = e e u ( t ) . (2.15)This is almost exactly (3.14) from [ChM]. Generally, the spinor Dunkleigenvalue problem is the differential equation for v = { v ( t ) , v ( t ) } : dv ( t ) dt def == { ddt v , ddt v } = { ct v , } − { λv , λ ′ v } , (2.16)See [ChM], Sections 2,3, e.g. Lemma 3.4 there, and (2.28,2.29) below. I APPROACH TO MOMENTUM RISK-TAKING 25
This is a spinor variant of the equation $\frac{dv(t)}{dt} = \frac{c}{t}\big(v(-t) - v\big) - \lambda v$, where we switch to the even and odd parts $\frac{v(t)+v(-t)}{2}$, $\frac{v(t)-v(-t)}{2}$ of $v$, considering them as independent functions. I.e. we switch from $v$ to a super-function, where $\lambda$ is extended to a pair $\{\lambda, \lambda'\}$ acting on it "diagonally".

To solve equations (2.14)&(2.15), we obtain:

\[ t^2\frac{d^2\tilde p}{dt^2} - c\,t\,\frac{d\tilde p}{dt} + e\,t^2\,\tilde p \;=\; 0 \;=\; t^2\frac{d^2\tilde u}{dt^2} - c\,t\,\frac{d\tilde u}{dt} + e\,t^2\,\tilde u + c\,\tilde u, \tag{2.17} \]
\[ \tilde p = A_1\,\tilde p_1 + A_2\,\tilde p_2, \quad \tilde p_{1,2}(t) = t^{|\alpha|} J_{\alpha_{1,2}}(\sqrt{e}\,t) \ \text{ for } \alpha_{1,2} = \pm\frac{1+c}{2}. \tag{2.18} \]

Here the parameters $a, c$ are assumed generic, $A_{1,2}$ are undetermined constants, and we use the Bessel functions of the first kind:

\[ J_\alpha(x) = \sum_{m=0}^{\infty} \frac{(-1)^m (x/2)^{2m+\alpha}}{m!\,\Gamma(m+\alpha+1)}. \]

See [Wa] (Ch. 3, §3.1). We will also need the asymptotic formula from §7.21 there:

\[ J_\alpha(x) \sim \sqrt{\frac{2}{\pi x}}\,\cos\Big(x - \frac{\pi\alpha}{2} - \frac{\pi}{4}\Big) \ \text{ for } x \gg |\alpha^2 - 1/4|. \]

The latter gives that $\tilde p_{1,2}(t)$ are approximately $\tilde C\,t^{c/2}\cos(\sqrt{e}\,t - \phi_{1,2})$ for some constants $\tilde C$, $\phi_{1,2}$. Interestingly, the phases $\phi_{1,2} = \pm\frac{1+c}{4}\pi + \frac{\pi}{4}$ are uniquely determined by $c$. We conclude that for sufficiently big $t$, the function $\tilde p(t)$ under the profit-taking as above is basically:

\[ \tilde p(t) \approx t^{c/2}\big(A\cos(\sqrt{e}\,t + \pi c/4) + B\sin(\sqrt{e}\,t - \pi c/4)\big), \tag{2.19} \]

for some constants $A, B$; the $t$-period is $\frac{2\pi}{\sqrt{e}}$.

Let us now replace $\tilde p/\sigma$ by $\hat p\,t^{\nu-1}/\sigma$ for $0 < \nu \le 1$ and $e\,\tilde u(t)$ by $e\,\hat u(t)/t$ in (2.14&2.15). The system becomes:

\[ \frac{d\hat u(t)}{dt} = \frac{c}{t}\,\hat u(t) - \frac{t^{\nu}}{\sigma t}\,\hat p(t), \tag{2.20} \]
\[ \frac{d\hat p(t)}{\sigma\,dt} = \frac{e}{t}\,\hat u(t). \tag{2.21} \]

It can be solved in terms of Bessel functions too. The corresponding fundamental solutions are $\hat p_{1,2}(t) = t^{c/2} J_{\pm c/\nu}\big(\frac{2\sqrt{e}}{\nu}\,t^{\nu/2}\big)$. One has: $\hat p_{1,2}(t) \approx \hat C\,t^{\frac{c}{2}-\frac{\nu}{4}} \cos\big(2\sqrt{e}\,t^{\nu/2}/\nu - \psi_{1,2}\big)$ as $t \gg 0$. I.e. $\hat p(t)$ grows slower than $\tilde p(t)$ from (2.19) and the periodicity is for $t^{\nu/2}$ in this case; it even tends to 0 as $t \to \infty$ for $c < \nu/2$.

With $p_{avrg} = p$ taken from (2.11), $p(t)$ can be assumed a linear combination of $t^{r}\cos(\rho\log(t) + \zeta)\big(1 - \epsilon\cos(\varrho t + \xi)\big)$ for proper parameters $r, \rho, \varrho, \zeta, \xi, \epsilon$. This holds asymptotically, but seems basically sufficient for practical modeling of momentum trading. We note here a connection with [Che], where the sum of Brownian motion, BM, with fBM was considered; see also the end of Section 2.4.

The $t$-periodicity of profit-taking is directly related to short-term volatility in stock markets. This is generally a stochastic phenomenon [EN, FL, FPSS]. However, as we see, the volatility due to profit-taking has solid "algebraic origins". Namely, relatively simple algebraic-type formulas with few parameters, which reflect investors' trading preferences, can look quite chaotic. This was actually the key for us: there are very many traders, but possibly only very few trading patterns.

Let us provide a numerical example of such "algebraic volatility". Using the g -functions from (3.1,3.2) with 1 ≤ t ≤ h (1 month), let : p ( t ) = 0 . − sin( t ) /
3) cos(2 π log( t )) g ( t, . − sin( t/ /
3) sin(2 π log( t )) g ( t +12 , . In spite of relatively simple formula, the fake chart in Figure 1 ex-hibits a lot of volatility, which is mathematically hardly surprising forsuch trigonometric expressions. Before managing real charts, the sys-tem was ”trained” to trade profitably such fake ones. It is momentum;catching the periods and quasi-periods was not an objective. We donot have sufficient ”stability theory” for the periods. However the ex-ponents r can be reasonably found by the system (automatically) forfake and real charts. In (2.22), r = 0 . , .
418 for g (1) , g (3).
Figure 1. Model chart: "algebraic volatility"
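As a consistency check of the phases in (2.19), here is a short worked computation. It is only a sketch: it assumes that the indices in (2.18) are $\alpha_{1,2} = \pm(1+c)/2$ (a reading of that formula, not an independent result) and uses nothing beyond the asymptotic formula for $J_\alpha$ quoted from [Wa].

```latex
\begin{aligned}
t^{(1+c)/2} J_{\alpha}(\sqrt{e}\,t)
  &\sim \sqrt{\tfrac{2}{\pi\sqrt{e}}}\; t^{c/2}
     \cos\!\Big(\sqrt{e}\,t - \tfrac{\pi\alpha}{2} - \tfrac{\pi}{4}\Big), \\
\alpha = +\tfrac{1+c}{2}: \quad
  & \cos\!\Big(\sqrt{e}\,t - \tfrac{\pi c}{4} - \tfrac{\pi}{2}\Big)
    = \sin\!\Big(\sqrt{e}\,t - \tfrac{\pi c}{4}\Big), \\
\alpha = -\tfrac{1+c}{2}: \quad
  & \cos\!\Big(\sqrt{e}\,t + \tfrac{\pi (1+c)}{4} - \tfrac{\pi}{4}\Big)
    = \cos\!\Big(\sqrt{e}\,t + \tfrac{\pi c}{4}\Big).
\end{aligned}
```

Under this assumption the two fundamental solutions give exactly the sine and cosine terms of (2.19), with the common growth factor $t^{c/2}$ and the $t$-period $2\pi/\sqrt{e}$.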
Toward Macdonald processes.
A natural passage from our differential equations to random processes is via considering ensembles of news-functions: $\{u^{(1)}, u^{(2)}, \ldots, u^{(n-1)}, u^{(n)}\}$, at the moments $t_1 > t_2 > \cdots > t_{n-1} > t_n$. Let us assume that they are governed by the same reduction coefficient $c$. Then a counterpart of (2.1) is the following system:
$$\partial u^{(i)}/\partial t_j = c\,\frac{u^{(i)} - u^{(j)}}{t_j - t_i} \ \text{ for } \ i \neq j, \tag{2.23}$$
$$\partial u^{(i)}/\partial t_i = c\sum_{j \neq i}\frac{u^{(i)} - u^{(j)}}{t_i - t_j}, \tag{2.24}$$
which is a special variant of the rational Knizhnik-Zamolodchikov equation of type $A$ in the standard $n$-dimensional representation of the symmetric group $S_n$. See e.g. [Ch1], Section 1.1. This means that news at $t_i$ is positively influenced by that at $j < i$ if $u^{(j)} > u^{(i)}$ or by that at $j > i$ if $u^{(j)} < u^{(i)}$; the impact is negative otherwise. Qualitatively, a stronger news before $t_i$ increases the impact of $u^{(i)}$ and diminishes its impact if it arrives after $t_i$ (and vice versa). Quantitatively, we divide here by $(t_i - t_j)$. Note that $u^{(1)} + \cdots + u^{(n)}$ does not depend on $t_j$, so this is a process of "news-redistribution". In the case of $n = 2$, we obtain:
$$\partial u^{(1)}/\partial t_1 = c\,\frac{u^{(1)} - u^{(2)}}{t_1 - t_2}, \qquad \partial u^{(1)}/\partial t_2 = c\,\frac{u^{(1)} - u^{(2)}}{t_2 - t_1},$$
$$\partial u^{(2)}/\partial t_1 = c\,\frac{u^{(2)} - u^{(1)}}{t_1 - t_2}, \qquad \partial u^{(2)}/\partial t_2 = c\,\frac{u^{(2)} - u^{(1)}}{t_2 - t_1}.$$

Spinor Dunkl problem.
This consideration is a special case of the following more general approach. Let $S_n$ be the symmetric group, $s_{ij}$ the transpositions. To link our systems to the Macdonald processes from [BC] (in the nonsymmetric setting), we consider an ensemble of scalar "news-functions" $\{u^{(w)}, w \in S_n\}$ depending on $\{t_i, 1 \le i \le n\}$ from the sector $t_1 > t_2 > \cdots > t_{n-1} > t_n$. We set $w(f) \stackrel{\mathrm{def}}{=} f(t_{w(1)}, \ldots, t_{w(n)})$ for $w = (w(1), \ldots, w(n))$. A counterpart of (2.1) is the following system:
$$\partial u^{(w)}/\partial t_i = c\sum_{j \neq i}\frac{u^{(w)} - u^{(ws_{ij})}}{t_i - t_j} + \lambda_{w(i)}u^{(w)} \ \text{ for } \ 1 \le i \le n. \tag{2.25}$$
It is equivariant under the action $u^{(w)} \mapsto u^{(vw)}$, $\lambda_i \mapsto \lambda_{v(i)}$ of $v \in S_n$ (without touching $\{t_i\}$). This is the spinor Dunkl eigenvalue problem of type $A$ in the terminology of [Ch2]. It can be integrated in terms of multi-dimensional Bessel functions; the theory of the latter is relatively new even in the (classical) symmetric setting [Op]. The original Dunkl eigenvalue problem is a system of differential-difference equations for a single $u = u^{(id)}$ above defined for all sufficiently general $\{t_i\} \in \mathbb{R}^n$, not only in the sector above, where we plug in: $u^{(w)} = w(u)$. This one is equivariant with respect to the action of $v \in S_n$ on $\{t_i\}$ and $\{\lambda_i\}$ and $u$ by permutations.

The $\lambda$-parameters are adjusted to the original non-spinor Dunkl eigenvalue problem. One can consider more general linear combinations in (2.25) (in the spinor case). Note that the price functions are not involved here; the usage of $\lambda$ serves as some substitute.

Let us consider the case of $n = 2$, setting $u^{(0)} = u^{(id)}$, $u^{(1)} = u^{(s_{12})}$:
$$\frac{\partial u^{(0)}}{\partial t_1} = c\,\frac{u^{(0)} - u^{(1)}}{t_1 - t_2} + \lambda_1 u^{(0)}, \qquad \frac{\partial u^{(0)}}{\partial t_2} = c\,\frac{u^{(0)} - u^{(1)}}{t_2 - t_1} + \lambda_2 u^{(0)}, \tag{2.26}$$
$$\frac{\partial u^{(1)}}{\partial t_1} = c\,\frac{u^{(1)} - u^{(0)}}{t_1 - t_2} + \lambda_2 u^{(1)}, \qquad \frac{\partial u^{(1)}}{\partial t_2} = c\,\frac{u^{(1)} - u^{(0)}}{t_2 - t_1} + \lambda_1 u^{(1)}. \tag{2.27}$$
Switching to $\tilde u^{(i)} = \exp(-(\lambda_1 + \lambda_2)(t_1 + t_2)/2)\,u^{(i)}$, $\lambda = (\lambda_1 - \lambda_2)/2$, $x_{1,2} = t_1 \pm t_2$, $t_{1,2} = (x_1 \pm x_2)/2$, we obtain that $\partial\tilde u^{(i)}/\partial x_1 = 0$. Setting $y = x_2$, $f_1 = \tilde u^{(0)} + \tilde u^{(1)}$, $f_2 = \tilde u^{(0)} - \tilde u^{(1)}$, we arrive at:
$$\frac{\partial\tilde u^{(0)}}{\partial y} = c\,\frac{\tilde u^{(0)} - \tilde u^{(1)}}{y} + \lambda\tilde u^{(0)}, \qquad \frac{\partial\tilde u^{(1)}}{\partial y} = c\,\frac{\tilde u^{(1)} - \tilde u^{(0)}}{y} - \lambda\tilde u^{(1)}, \tag{2.28}$$
$$\frac{\partial f_1}{\partial y} = \lambda f_2, \qquad \frac{\partial f_2}{\partial y} = \frac{2c f_2}{y} + \lambda f_1, \qquad \frac{\partial^2 f_1}{\partial y^2} = \frac{2c}{y}\frac{\partial f_1}{\partial y} + \lambda^2 f_1. \tag{2.29}$$
The last equation can be integrated in terms of Bessel functions; cf. (3.14) from [ChM], equation (2.17) above, and also [Ch1], Section 2.4. However in contrast to the solution from (2.18), the function $f_1$ will not oscillate due to the positivity of $\lambda^2$; we assume that $\lambda \in \mathbb{R}$. Similar to (2.18), adding the profit-taking term results in $\lambda^2 \mapsto \lambda^2 - a$ for some constant $a > \lambda^2$. This will provide the desired oscillations; we omit details and generalizations.

3. Market implementation
3.1. Major challenges.
The first major challenge with the mathematical analysis of stock charts and other market information is that the corresponding functions are of discontinuous nature. Automated high-frequency trading adds a lot of volatility too [CJP]. This makes the separation of the signals from noise and trading involved.
The second challenge is that even if the news has a clear meaning, the corresponding trading decisions can depend on many factors. For instance, it can be simply too late to invest in this particular news. Executing large orders right after the news can result in significant losses, and so on. By the way, the counter-trend (contrarian) variants of our trading system, i.e. those selling when the share-price goes up and so on, can outperform the pro-trend variants.
The third challenge is picking the right moments for closing positions. We use the termination curves discussed below and the "signals" opposite to the direction (long or short) of the position taken, determined automatically. Obviously, the bid-ask spread reduces the profitability; see [KS]. This is one of the reasons why we optimize returns per position; the positions generally last from 5 to 10 days.
The fourth challenge is that a significant variety of (profitable) strategies is needed to address market volatility. In our system, using counter-trend and pro-trend variants simultaneously, employing different opti-parameters, and varying the moments when the system receives quotes provide reasonable stability. The number of different profitable variants of the system is practically unlimited: 12 "production lines" were used in real-time experiments.
The fifth challenge is using weights, which for us are mainly those based on the results of the prior optimization. We obviously rely mostly on the equities most suitable for our system, i.e. those that performed best during the optimization process. However, the opti-parameters and weights based on the past performance can fail in the future.
The sixth is simply due to the novelty of our approach. The usage of our 2-bids for creating momentum trading systems, trading options and technical analysis of stocks requires experience. The point-tables from Section 4.3 can help to get used to our 2-bids. Also, we provide various performance results of our own system, which can be used as "benchmarks" for those who follow our approach.

3.2. Forecasting.
The work of our system is based on the forecasting curves, automatically produced time-predictions for share-prices. The termination curves are their shifts up or down with some coefficients of proportionality providing some room before their intersection with the actual share-price graphs. These intersections trigger the terminations of the taken positions (if any). This is similar to trading US-style options, when the termination curves are horizontal lines shifted up or down for calls or puts. The curves we use are essentially $b\,(\text{time})^r$ for "bids" $b$ and exponents $r$, assigned to the 7 "categories" discussed below (main 4 and their 3 consecutive averages).

The bids are discrete and must be large enough (at least 1) to form an admissible 2-bid, which is a pair $\{b = \text{bid}, c = \text{category}\}$. The 2-bids are ranked lexicographically, first with respect to $b$ (the bigger the better) and then, if the bids coincide, with respect to $c$: the smaller $c$ and its prime time-interval the better. The "winner" is the top bid. Bids below the threshold in their categories are ignored as noise.
The thresholds for prime-intervals are 1, 2, 3.5, 7 for $c = 1, 3, 5, 7$, up to the rescaling coefficient $\beta$; see the tables below.

The basic functions we use are as follows:
$$g(t,1) = 0.\ldots\cdot\mathrm{Floor}\big[1548\,(0.\ldots\,t + 0.\ldots)^x - \ldots\big]/100 + 1, \quad x = 0.\ldots, \ \text{ in the case of the super category } (c = 1), \tag{3.1}$$
$$g(t,3) = 2\cdot\mathrm{Floor}\big[10\,(2(t/d) - 1)^x\big]/10 \ \text{ for } \ x = 0.418, \ \text{ in the case of the ultra category } (c = 3), \tag{3.2}$$
$$g(t,5) = 0.\ldots\cdot\mathrm{Floor}\big[22.875\,(2.\ldots\,t/d - \ldots)^x + 12.\ldots\big], \quad x = 0.\ldots, \ \text{ in the case of the extra category } (c = 5), \tag{3.3}$$
$$g(t,7) = 3.5\cdot\big(\mathrm{Floor}[10.25\,(t/(22d))]/10 + 1\big), \ \text{ i.e. here } x = 1, \ \text{ which serves the regular category } (c = 7), \tag{3.4}$$
where $t$ is measured in hours; $1h$ is the prime time-interval in the super case, and $d = 6.5h$, the duration of one Wall Street business day, is that in the ultra category. Accordingly, the prime time-intervals are 1 week $= 5d$ in the extra category and 1 basic month $= 22d$ in the regular category. Here $x \approx \ldots137 + \log(u)/\ldots$ for $u = \{1, d, 2.\ldots d, 22d\}$, where $2.\ldots d$ (instead of $5d$ for $c = 5$) is due to some practical reasons. Qualitatively, $x$ is supposed to depend linearly on the logarithm of the corresponding prime time-interval, but this can vary.

Here $\mathrm{Floor}[z]$ means the maximal integer no greater than $z$. For $0 < t < t^\bullet$, where $t^\bullet = 1, d, 5d, 22d$ correspondingly ($t^\bullet_i$ will be used here for $i = 1, 3, 5, 7$), the $g$-functions are replaced by $g(t^\bullet)(2t + t^\bullet)/(3t^\bullet)$. Also, we define $g$-functions for even categories $c = 2i$, where $2i = 2, 4, 6$, as the averages of the neighboring $g$, i.e. $g(t, 2i) = (g(t, 2i-1) + g(t, 2i+1))/2$; the prime time-intervals are $t^\bullet_{2i} = 2t^\bullet_{2i-1}$ (not the corresponding averages).

Finally, the basic functions will be $b\,g(t, c)$, where $b$ is the bid (an integer), $c$ the category. The trading system automatically determines the bids backward as price-changes in percent divided by the corresponding $g$. This is performed at every moment when the system obtains quotes in all 7 categories, and with some depth, the number $m$ of steps back. I.e. it constantly calculates for the rescaling coefficient $\beta$:
$$b_i(m) = \mathrm{Floor}\Big[\frac{\beta\,|p_t - p_{t - mt^\bullet_i}|}{g(mt^\bullet_i, i)\,p_{t - mt^\bullet_i}}\Big], \quad 1 \le i \le 7, \ \beta \ge 1, \tag{3.5}$$
for the corresponding $t^\bullet_i$ and a sequence $m = 1, 2, 3, \ldots$ (mostly, 1 month back); here $p_t$ is the share-price at $t$, $|\cdot|$ the absolute value.

Then the highest 2-bid $b_{i^\circ}(m^\circ)$ among all $i$ and $m$ becomes the top 2-bid; if two 2-bids coincide, the smaller $m^\circ$ the better. The corresponding $b_{i^\circ}\,g(t - t^\circ)$ for $t^\circ = t - m^\circ t^\bullet_{i^\circ}$, shifted and with some proportionality coefficient, becomes the termination curve, which can be changed if a higher top 2-bid arrives. To improve the performance, the top 2-bids are renewed only when $\pm p(t)$ decelerates with some threshold (subject to optimization); $\pm$ for long/short or $\mp$ for the "counter-trend". Also, the system constantly produces top start 2-bids, changed when $\pm p(t)$ accelerates (with their threshold). They are used for opening positions, forecasting and terminations of the trades in the opposite mode.

Finally, the trading signals are the increases of the top 2-bids or top start 2-bids and the intersections with the termination curves.

Consecutive increases of top bids for the same equity in the same direction are used to open multiple positions: of level 1 on the first bid, of level 2 for the first increase and so on. The trades based on level 2,3 bids mostly outperform those of level 1. However omitting level 1 bids significantly reduces the total amount that can be invested; in professional trading, the greater the better.

3.3. Tables of two-bids.
Recall that there are four categories super (1), ultra (3), extra (5), and regular (7), and also intermediate even categories. They are governed by different bid-tables, where 2-bids are pairs $(b, c)$. Usually $b$ are integers from 1 to 5. Practically, 2-3 categories are mostly used for individual companies, though the system becomes less stable with 2 categories. This can be greater than 3 when trading indices, but 3 seems reasonably optimal. The average durations of positions are mostly in the range from 3-15 days for us, so the regular category rarely occurs in our simulations and real-time runs.

The termination can be only due to the signals, except for clear "hangs", which require a special consideration; see e.g. [BDM]. The signals here are intersections with termination curves or start bids in the opposite direction. So the average durations can be adjusted only by choosing proper combinations of categories and initial parameters; all parameters are subject to machine optimization. The system finds many "profitable" and stable combinations of parameters, which can be used to obtain desired durations of positions and for other adjustments. New positions are mostly opened due to the new start bids.

Using different initial values of parameters, pro-trend and counter-trend (contrarian) modes, weights and so on results in many different variants. Also, much depends on the moment the system enters the market, obtains quotes and the prior history. The system proved to be able to produce a lot of profitable trading lines, which resembles very much human decision-making. Even with playing simple games, there are almost always various ways to win; so one can choose.
Super table (c = 1):
b \ h | …

Here d equals 6.5 hours, 1m means 22 · 6.5h, and 3m = 65 · 6.5h. The expected return at t for a bid b and category i is simply b g(t, i), assuming that the initial moment is t = 0.

Ultra category (c = 3):
b \ d |  1d   2d   1w   3w   2m   6m
  1   |   2    3    5    8   13   20
  2   |   4    6   10   16   26   40
  3   |   6    9   15   24   39   60
  4   |   8   12   20   32   52   80
  5   |  10   15   25   40   65  100
  6   |  12   18   30   48   78  120

Here, additionally, 6m means 6 months, which is 126d, 2 months are (approximately) 45d; d always means 6.5h. Only working days are counted.

Extra category (c = 5):
b \ w |  1w   2w   1m   3m    …
  …
  4   |  14   22   34   62  112
  5   |   …

Regular category (c = 7):
b \ m |  1m   2m    4m     …
  …
  2   |  14   21    35     89
  3   |  21   31.5  52.5  133.5
  4   |  28   42    70    178
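The surviving table entries can be cross-checked against the g-functions. Below is a minimal sketch, not the system's actual code: it implements only (3.2) and (3.4) for t at or beyond the prime time-interval, and it assumes the exponent x = 0.418 for the ultra category (the value quoted for g(3) after (2.22)) and the column durations 1d, 2d, 1w, 3w, 2m, 6m for the ultra table. The names `g_ultra`, `g_regular` and `expected_return` are hypothetical.

```python
import math

D = 6.5  # hours in one Wall Street business day, as in the text


def g_ultra(t_hours, x=0.418):
    """Ultra category (c = 3), formula (3.2), for t >= d.

    The exponent x = 0.418 is an assumption taken from the remark on g(3).
    """
    base = 2.0 * (t_hours / D) - 1.0
    return 2.0 * math.floor(10.0 * base ** x) / 10.0


def g_regular(t_hours):
    """Regular category (c = 7), formula (3.4), for t >= 22d (here x = 1)."""
    return 3.5 * (math.floor(10.25 * t_hours / (22.0 * D)) / 10.0 + 1.0)


def expected_return(b, g_value):
    """Expected return (in percent) for a bid b: simply b * g(t, c)."""
    return b * g_value


# Assumed ultra-table columns, in business days: 1d, 2d, 1w, 3w, 2m, 6m.
ultra_days = [1, 2, 5, 15, 45, 126]
row5 = [expected_return(5, g_ultra(n * D)) for n in ultra_days]
print(row5)  # [10.0, 15.0, 25.0, 40.0, 65.0, 100.0]

# Minimal regular-category returns at 1m, 2m, 3m, 4m (22d, 45d, 65d, 86d).
print([g_regular(n * D) for n in (22, 45, 65, 86)])  # [7.0, 10.5, 14.0, 17.5]
```

Under these assumptions the sketch reproduces the b = 5 row of the ultra table and the minimal regular returns 7, 10.5, 14, 17.5 quoted in the comparison of categories; the linear rule for t below the prime time-interval is left out for brevity.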
Comparing the categories.
Let us compare the minimal admissible bids (basic returns) in the different categories for the 13 basic durations, mostly taken from the tables above. Those from the tables above are in bold; the others are calculated using the corresponding g-functions:

cat | 1h 2h 1d 2d 1w 2w 3w 1m 2m 3m 4m 6m 9m
 …
 7  | -  -  -  -  -  -  -  …

d = 6.5h, w = 5d, m = 22d, 2m = 45d, 3m = 65d, 4m = 86d, 6m = 126d, 9m = 191d. (3.6)

Also, recall that 2-bids are ranked naturally: first b, the bigger the better, then c (when b coincide) with the priority to smaller c, the shorter the durations of positions the better.

Note that for b = 1, which is the smallest bid, the returns after 3 or 4 months are approximately comparable for all 4 categories. This is by design. Also, the expected return at $2t^\bullet_i$ is 1.5 times that at $t^\bullet_i$, which is the prime time-interval for the corresponding category ($i = 1, 3, 5, 7$), except for $i = 5$ (the extra category). Also, the curves we use for prediction (and termination) heavily depend on the category, but they produce reasonably comparable returns after 3-4 months; we aim at using 2-bids and trading options here.

Any bid is automatically considered in all "higher" categories. For instance, the smallest possible bid, which is the return of 1% next hour, in the super category, is "equivalent to" 3% next day, so it "beats" the smallest ultra-bid, which is 2% a day. Then it is supposed to generate 6.5% next week (vs. minimal 3.5% in the extra category), and 11% next month (vs. 7% in the regular category). To make this table work, 2 times every bid in the same column from the comparison table (with the same durations) is supposed to be greater than any bid there, which holds. This matches well bidding in contract card games: the greatest bid wins regardless of the suit.

The functions we used above are designed to provide such natural logical inter-relations when comparing bids from different categories. Also, an integrality of some (not all) bids is a consideration. This can help to use these tables manually without computers, though the mathematical discretization is the main point here.
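The backward bid computation (3.5) and the ranking of 2-bids recalled above can be sketched together as follows. This is an illustration under stated assumptions, not the system's implementation: the factor 100 reflects our reading of "price-changes in percent", the g-value is passed in as a number rather than computed, and the names `TwoBid`, `bid` and `top_two_bid` are hypothetical.

```python
import math
from typing import NamedTuple


class TwoBid(NamedTuple):
    b: int  # bid (integer)
    c: int  # category, 1..7
    m: int  # depth: number of prime time-intervals looked back


def bid(price_now, price_then, g_value, beta=1.0):
    """Backward bid in the spirit of (3.5).

    Assumption: the price change is taken in percent of the earlier price,
    divided by the g-value for that look-back, and rescaled by beta.
    """
    pct_change = 100.0 * abs(price_now - price_then) / price_then
    return math.floor(beta * pct_change / g_value)


def top_two_bid(candidates):
    """Return the top 2-bid, or None if no candidate is admissible.

    Ranking: larger b first, then smaller category c, then smaller depth m.
    Bids below 1 are treated as inadmissible.
    """
    admissible = [x for x in candidates if x.b >= 1]
    if not admissible:
        return None
    return min(admissible, key=lambda x: (-x.b, x.c, x.m))


# A 3% move over one day (ultra, g = 2) and a 1% move over one hour (super, g = 1).
candidates = [
    TwoBid(bid(103.0, 100.0, 2.0), c=3, m=1),
    TwoBid(bid(101.0, 100.0, 1.0), c=1, m=1),
]
print(top_two_bid(candidates))  # TwoBid(b=1, c=1, m=1)
```

Both moves give the bid 1 here, so the tie is resolved in favor of the smaller category. With beta = 2 the same 3% daily move yields the bid 3, which illustrates how rescaling increases the number of admissible 2-bids.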
To avoid any misunderstanding, the bids above begin with 1 (1% per hour in the super category) mostly for the sake of readability. The trading system divides these tables (all of them) by the common rescaling coefficient β. For instance, the division of all bids by 2 makes sense: 0.5% per hour is more realistic than 1%. Such rescaling significantly increases the number of "admissible 2-bids", which is generally needed for the trading system to be stable and react promptly to the changes of share-prices. This coefficient β is subject to machine optimization, as well as all other parameters.

Finally, let us provide the table where we compare in the same way the minimal bids in all 7 categories:

cat |  1h    2h    3h    1d    2d    4d    1w    2w    1m     2m     3m     4m
 1  |  1.    1.49  2.27  3.    4.31  5.92  6.49  8.44  10.99  13.57  15.01  16.16
 2  |  -     1.28  1.87  2.5   3.65  5.16  5.74  7.62  10.29  13.28  15.1   16.57
 3  |  -     -     -     2.    3.    4.4   5.    6.8   9.6    13.    15.2   17.
 4  |  -     -     -     -     2.54  3.71  4.25  6.15  9.05   12.85  15.35  17.5
 5  |  -     -     -     -     -     -     3.5   5.5   8.5    12.7   15.5   18.
 6  |  -     -     -     -     -     -     -     4.97  7.75   11.6   14.75  17.75
 7  |  -     -     -     -     -     -     -     -     7.     10.5   14.    17.5

3.4. Basic system operations.
SIGNALS. Producing buy signals and sell signals is the main purpose of our (any) trading system. When trading, our system generally processes the quotes for the periods about one month backward, employing the parameters obtained during the prior optimization and the weights based on the optimization too.

There can be multiple signals in the same direction, the first, the second and so on. The consecutive number of a signal is called the level of the signal. Using such levels is a special feature of our system. Generally the signals of levels 2-3 are better "protected" than those of level 1, the first signals in a certain direction; only the signals of level 1,2,3,4 were used in real-time runs.

Statistically, the number of signals of level 1, $N_{L1}$, matches that for 2+3+4: $N_{L1} \sim N_{L2} + N_{L3} + N_{L4}$. Then $N_{L2} \sim N_{L3} + N_{L4}$, and so on. The combination of signals of level 2 and 3 gave better performance than the usage of all (statistically, about 20% better than that for level 1), but the signals of level 1 are also of good quality.

The signals are mostly treated as orders. For instance, one sells short on a sell signal and then buys to cover upon the first buy signal. This is the other way around for the counter-trend trading. The signals can be due to sufficiently big bids or intersections with termination curves. The positions can be opened on the first, second signal or the signals of higher levels. The positions of all levels are terminated altogether after the first signal comes in the opposite direction.
Practically, up to 4 simultaneous positions can be open with an equity if the signals of all 4 levels were present. All of them will be closed at once upon the first signal in the opposite direction. We suggested some ways to split the termination of big positions into several steps, say, involving "neighboring lines", however this was not tested. Executing large orders is a well-known market concern [MCL, GRS, CJP].

Using levels resembles using leverage, but the system does it in its own ways. Also, we note that the signals are produced independently for different equities, although the system can work in more sophisticated regimes, including different variants of hedging.
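The handling of levels described above can be sketched as a small state machine. This is only an illustration under assumptions: signals are encoded as +1 (buy) and -1 (sell), the pro-trend mode is used, and a signal in the opposite direction only closes the open positions without opening a new one (the text does not settle this point). The function name `run_levels` is hypothetical.

```python
def run_levels(signals, max_level=4):
    """Turn a stream of +1 (buy) / -1 (sell) signals into position events.

    Each consecutive signal in the same direction opens one more position,
    up to max_level; the first opposite signal closes all of them at once.
    Returns a list of (action, direction, level) tuples.
    """
    events = []
    direction = 0  # 0 = flat, +1 = long, -1 = short
    level = 0      # number of currently open positions
    for s in signals:
        if direction != 0 and s == -direction:
            # First signal in the opposite direction: terminate everything.
            events.append(("close_all", direction, level))
            direction, level = 0, 0
        elif level < max_level:
            # First signal, or one more signal in the same direction.
            direction = s
            level += 1
            events.append(("open", s, level))
        # Signals beyond max_level in the same direction are ignored.
    return events


events = run_levels([1, 1, 1, 1, 1, -1, -1])
for e in events:
    print(e)
```

In this run four long positions are opened (levels 1 to 4), the fifth buy signal is ignored, the first sell signal closes all four, and the second sell signal opens a short position of level 1.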
RETURNS. The return per one position is the main quantity the system optimizes. Here the ask-bid spreads, the slippage with execution of the orders, and the broker commission must be subtracted from the returns, practically, about 0.15-0.25% per one position for "professional trading". We always calculate pure returns, without taking the spread and similar losses into consideration. The returns we provide below are mostly pure returns per position, but we always calculate the usual (pure) returns during the periods under consideration too.

Pure returns like 0.4% per position are, generally, sufficient for the profitability; the system can do better than this in spite of relying on quotes only, as the source of market information, various delays and charges. The actual durations of the positions the system created were mostly in the range of 5-10 days.
OPTIMIZATION. The optimization procedures can be for trading Longs Only, Shorts Only, or (mostly) for trading both, L & S.

The optimization ("education") periods are of obvious importance. Our system does not have any prior information about the market and equities beyond the information that it can extract from the data provided during the optimization periods. They can be historical or based on prior trades by the system. Generally, the optimization periods have to be 1 year or longer. Ideally, they must be diverse, i.e., must contain sufficiently long periods when the stock goes up and when it goes down. The more "difficult" the optimization period, the better and more stable the out-of-sample returns.

These factors are of importance for choosing the optimization periods, and creating real "trading lines". However after this, the real-time adjustment of parameters becomes entirely automated. Mostly, the "real-time optimization" is for 6-month periods backward.

Generally, optimization periods of 1 to 2 years are statistically reasonable for reacting properly to different types of volatility and various market trends. However 6-month periods and a simplified optimization are good enough to keep "lines" running, until they are redesigned on the basis of more systematic optimization.
DURATIONS. The end-user can request the desired average durations of positions. For our system, the range from 5 to 10 days was considered reasonable. However, if the categories, trading modes and the companies to trade are prescribed, it is for the system to determine the optimal "lengths" of positions. The positions are opened and closed entirely on the basis of the signals, so the desired duration is not imposed in any form during trading and tests. Generally, if the actual duration (length) of positions during the control (out-of-sample) period appears sufficiently close to the desired duration, then this is just a confirmation that the optimization was relevant. Stable rhythm is an important indicator of stability of the system.

3.5. Testing the system.
Multiple experiments were conducted using the historical and real-time data. Special attention was paid to trading liquid companies and SPY, the trust that owns stocks in the same proportion as that represented by the SP500 stock index.
CONTROL PERIODS. The most systematic historical testing was for the period 2006/01/01-2007/04/13. More exactly, five 4-month control periods (out-of-sample!) were taken:
Period 1: 2006/01/01-2006/04/30,
Period 2: 2006/04/01-2006/07/30,
Period 3: 2006/07/01-2006/10/30,
Period 4: 2006/10/01-2007/01/30,
Period 5: 2007/01/01-2007/04/13.
The last period was a little shorter.

The historical testing consisted of
(i) optimization during the 12-month optimization period taken backward from the beginning of the control period,
(ii) "trading from scratch" during the next 4-month control period with closing all positions at the end of the period.

Note that the control periods overlap (1 month), to simulate continuous trading, without closing all open positions at the ends of periods; this is how the system really works. The optimization periods and the corresponding control periods do not overlap of course. The system was used in the pro-trend variant in this test.

We evaluate the AVERAGE 4 MONTH RETURN for five 4-month control periods by the formula:
$$\text{AVRG RETURN} = 88 \cdot \Big(\sum_{i=1}^{5}\text{RET}_i \cdot \text{NUM}_i\Big) \Big/ \Big(\sum_{i=1}^{5}\text{LNGTH}_i \cdot \text{NUM}_i\Big),$$
where 88 is the average number of business days during 4 months, and RET_i, NUM_i, LNGTH_i are the corresponding RET, NUM, LNGTH, the average return per position, the number of positions and the average length (duration in business days) of one position during the corresponding 4-month period.
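The formula can be checked numerically. The following minimal sketch applies it to the "ALL" rows (NUM, RET, LNGTH) of the five control periods reported below for SPY, long only; the function name `average_4_month_return` and the tuple layout are hypothetical choices made for this illustration.

```python
def average_4_month_return(periods, business_days=88):
    """AVRG RETURN for a list of (NUM, RET, LNGTH) control-period summaries.

    RET is the average return per position in percent, LNGTH the average
    duration of a position in business days, NUM the number of positions.
    """
    total_return = sum(num * ret for num, ret, _ in periods)
    total_days = sum(num * lngth for num, _, lngth in periods)
    return business_days * total_return / total_days


# "ALL" rows of the SPY (long only) results for the five control periods.
spy_long = [
    (18, 0.72, 3.0),
    (13, 0.45, 5.2),
    (23, 0.56, 2.2),
    (12, 0.59, 2.2),
    (17, 0.10, 2.4),
]
print(round(average_4_month_return(spy_long), 1))  # 14.9
```

The result agrees with the AVERAGE 4 MONTH RETURN of 14.9% stated for this test. The formula weights each period by the total number of position-days, so a period with many short positions contributes as much as one with few long positions.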
TRADING SPY (LONG ONLY). Let us provide the results of control "trading" SPY, without short positions and in the pro-trend regime. Generally, trading SPY is quite a challenge; see e.g. [FPSS] concerning some aspects of its fluctuations. Mathematically, long and short trading are on equal grounds; addressing possible negative developments is part of any risk management, which is quite universal.

The results for the signals of 4 levels are presented separately. By num, ret, lngth we denote the number of (long only) positions, the returns per position, and their durations for each level. The number in (·) is the corresponding standard deviation. The averages for all 5 periods, RETURN, LNGTH, and AVR CHANGE are provided. We mention that RETURN becomes 15.3% in the (well-tested) variant with LNGTH of at least 5 d, instead of about 3 d, which can be more suitable for end-users; the duration can be made even longer, but this can reduce profitability.

TRADING SPY (LONG ONLY)
AVERAGE POSITION LNGTH: 3.0 d;
AVERAGE 4 MONTH RETURN: 14.9%;
AVR SPY 4 MONTH CHANGE: 4.80%.

PERIOD: 20060101-20060430, SPY CHANGE=4.6%
NUM=18 RET=0.72(0.37) LNGTH=3.0d ALL
num=10 ret=0.58(0.38) lngth=3.1d lev=1
num=4 ret=0.87(0.23) lngth=4.0d lev=2
num=2 ret=0.79(0.19) lngth=3.1d lev=3
num=2 ret=1.1(0.15) lngth=0.5d lev=4

PERIOD: 20060401-20060730, SPY CHANGE=-1.0%
NUM=13 RET=0.45(1.26) LNGTH=5.2d ALL
num=4 ret=-0.23(1.15) lngth=7.0d lev=1
num=3 ret=0.17(1.05) lngth=6.3d lev=2
num=3 ret=0.97(1.12) lngth=3.7d lev=3
num=3 ret=1.11(1.19) lngth=3.3d lev=4

PERIOD: 20060701-20061030, SPY CHANGE=9.0%
NUM=23 RET=0.56(0.43) LNGTH=2.2d ALL
num=13 ret=0.44(0.42) lngth=2.1d lev=1
num=5 ret=0.44(0.26) lngth=2.2d lev=2
num=3 ret=0.8(0.15) lngth=2.9d lev=3
num=2 ret=1.28(0.22) lngth=2.0d lev=4

PERIOD: 20061001-20070130, SPY CHANGE=8.5%
NUM=12 RET=0.59(0.35) LNGTH=2.2d ALL
num=8 ret=0.46(0.33) lngth=2.4d lev=1
num=3 ret=0.89(0.12) lngth=2.3d lev=2
num=1 ret=0.8(0.09) lngth=0.8d lev=3

PERIOD: 20070101-20070413, SPY CHANGE=2.0%
NUM=17 RET=0.1(1.47) LNGTH=2.4d ALL
num=8 ret=0.08(1.58) lngth=2.4d lev=1
num=5 ret=0.22(1.7) lngth=2.2d lev=2
num=3 ret=0.31(0.52) lngth=2.2d lev=3
num=1 ret=-0.94(0.02) lngth=3.1d lev=4.
Short trading with a market that essentially goes up is quite a challenge for any trading system. Short trading here provides some "insurance" for the periods when SPY goes down. Some losses can be acceptable when it goes up, but the system actually remains profitable. Let us demonstrate this for the same periods and data. As we wrote, the bid-ask spread, which is not too high for liquid assets, is not counted.
TRADING SPY (SHRT ONLY)
AVERAGE POSITION LNGTH: 3.2 d;
AVERAGE 4 MONTH RETURN: 3.15%;
AVR SPY 4 MONTH CHANGE: 4.80%.

PERIOD: 20060101-20060430, SPY CHANGE=4.6%
NUM=33 RET=0.02(0.72) LNGTH=3.7d ALL
num=14 ret=-0.06(0.81) lngth=3.5d lev=1
num=10 ret=0.19(0.69) lngth=3.2d lev=2
num=5 ret=-0.11(0.51) lngth=4.6d lev=3
num=4 ret=0.(0.62) lngth=4.5d lev=4

PERIOD: 20060401-20060730, SPY CHANGE=-1.0%
NUM=46 RET=0.5(0.61) LNGTH=2.7d ALL
num=18 ret=0.31(0.65) lngth=2.8d lev=1
num=13 ret=0.6(0.58) lngth=2.8d lev=2
num=8 ret=0.65(0.49) lngth=2.7d lev=3
num=7 ret=0.64(0.53) lngth=2.0d lev=4

PERIOD: 20060701-20061030, SPY CHANGE=9.0%
NUM=66 RET=0.04(0.77) LNGTH=2.9d ALL
num=24 ret=0.01(0.83) lngth=2.7d lev=1
num=15 ret=0.03(0.75) lngth=3.4d lev=2
num=14 ret=0.04(0.75) lngth=3.1d lev=3
num=13 ret=0.09(0.65) lngth=2.6d lev=4

PERIOD: 20061001-20070130, SPY CHANGE=8.5%
NUM=42 RET=0.05(0.64) LNGTH=4.4d ALL
num=14 ret=-0.18(0.7) lngth=4.5d lev=1
num=12 ret=0.11(0.56) lngth=4.4d lev=2
num=10 ret=0.21(0.62) lngth=4.0d lev=3
num=6 ret=0.18(0.49) lngth=4.8d lev=4

PERIOD: 20070101-20070413, SPY CHANGE=2.0%
NUM=68 RET=0.(0.93) LNGTH=2.5d ALL
num=31 ret=0.09(0.96) lngth=2.0d lev=1
num=17 ret=0.06(1.08) lngth=2.6d lev=2
num=11 ret=-0.17(0.68) lngth=2.8d lev=3
num=9 ret=-0.22(0.7) lngth=3.2d lev=4.
TRADING LIQUID COMPANIES. For the same periods, let us present data for "trading" of 165 stocks, mostly liquid. It is for longs and shorts and pro-trend, i.e. essentially under the mean reversion trading. The AVERAGE LNGTH = 5 and RETURN = 9.56% are the averages over all 5 periods; NUM and num are the numbers of positions.

AVERAGE POSITION LNGTH: 5.0 d;
AVERAGE 4 MONTH RETURN: 9.56%;
AVR SPY 4 MONTH CHANGE: 4.80%.

PERIOD: 20060101-20060430, SPY CHANGE=4.6%
NUM=2236 RET=0.64(3.4) LNGTH=5.2d ALL
num=1105 ret=0.55(3.57) lngth=5.4d lev=1
num=602 ret=0.68(3.25) lngth=5.2d lev=2
num=344 ret=0.81(3.31) lngth=5.1d lev=3
num=185 ret=0.79(2.89) lngth=4.7d lev=4

PERIOD: 20060401-20060730, SPY CHANGE=-1.0%
NUM=2433 RET=0.14(4.08) LNGTH=5.4d ALL
num=1169 ret=0.13(4.19) lngth=5.3d lev=1
num=628 ret=0.16(4.12) lngth=5.6d lev=2
num=394 ret=0.09(3.89) lngth=5.4d lev=3
num=242 ret=0.25(3.78) lngth=4.9d lev=4

PERIOD: 20060701-20061030, SPY CHANGE=9.0%
NUM=2401 RET=0.66(3.93) LNGTH=4.5d ALL
num=1248 ret=0.64(3.92) lngth=4.4d lev=1
num=619 ret=0.74(3.91) lngth=4.5d lev=2
num=344 ret=0.53(3.98) lngth=4.5d lev=3
num=190 ret=0.7(3.99) lngth=4.2d lev=4

PERIOD: 20061001-20070130, SPY CHANGE=8.5%
NUM=2174 RET=0.71(3.67) LNGTH=5.2d ALL
num=1101 ret=0.67(3.73) lngth=5.2d lev=1
num=566 ret=0.77(3.66) lngth=5.2d lev=2
num=324 ret=0.74(3.54) lngth=5.d lev=3
num=183 ret=0.73(3.62) lngth=5.2d lev=4

PERIOD: 20070101-20070413, SPY CHANGE=2.0%
NUM=1812 RET=0.65(3.05) LNGTH=5.d ALL
num=934 ret=0.56(3.1) lngth=5.1d lev=1
num=476 ret=0.79(3.05) lngth=5.d lev=2
num=257 ret=0.71(3.06) lngth=4.9d lev=3
num=145 ret=0.62(2.63) lngth=4.9d lev=4.
The list of stock symbols of these companies is as follows: "AA", "AAP", "AAPL", "ABC", "ABT", "ACAS", "ADBE", "ADM", "ADP", "ADSK","AIG", "AIV", "ALL", "AMAT", "AMGN", "AMTD", "AMZN", "ANF", "ANN", "APA","APC", "ATI", "AVP", "AXP", "BA", "BAC", "BBBY", "BBY", "BEAS", "BEN","BHI", "BJS", "BMET", "BMY", "BNI", "BP", "BRCM", "BSC", "C", "CAL", "CAT","CCU", "CELG", "CEPH", "CFC", "CHK", "CHRW", "CHS", "CMCSA", "CMCSK", "CMI","COF", "COP", "COST", "CSCO", "CTSH", "CVS", "CVX", "D", "DE", "DELL","DO", "DVN", "EBAY", "EK", "EOG", "EQR", "ERTS", "ESRX", "FD", "FDO","FDX", "FNM", "FPL", "FRE", "GE", "GENZ", "GG", "GILD", "GLW", "GM", "GPS","GRMN", "GS", "GSF", "HD", "HON", "HPQ", "IBM", "INTC", "IP", "ITG", "ITW","JCP", "JNJ", "JPM", "JWN", "KLAC", "KO", "KR", "KSS", "LEH", "LLY", "LMT","LNCR", "LOW", "LRCX", "MCD", "MER", "MET", "MIL", "MMM", "MO", "MON","MOT", "MRO", "MRVL", "MSFT", "MXIM", "NBR", "NE", "NEM", "NKE", "NOV","NSC", "NUE", "ORCL", "OXY", "PEP", "PFE", "PG", "POT", "PRU", "QCOM","RIG", "ROK", "SBUX", "SLB", "SNDK", "SPG", "STN", "SU", "SUN", "SUNW","SYMC", "TEVA", "TGT", "TWX", "TXN", "UNH", "UNP", "UTX", "VLO", "VNO","VZ", "WAG", "WB", "WFMI", "WMT", "WYE", "X", "XLNX", "XOM", "XTO", "YHOO".
Let us combine all 5 control intervals in one period (avoiding terminations at the ends of the intervals) and show all levels and the corresponding numbers of positions taken, NUM for all and num for levels; the lengths are the average durations of the positions. One has:

Period: FROM 1/1/2006 TO 4/13/2007
NUM=9332 RET=0.6 LNGTH=5.5d ALL
num=4143 ret=0.52 lngth=5.6 lev=1
num=2228 ret=0.67 lngth=5.4 lev=2
num=1285 ret=0.63 lngth=5.3 lev=3
num=735 ret=0.69 lngth=5.1 lev=4
num=416 ret=0.76 lngth=5.3 lev=5
num=237 ret=0.6 lngth=5.6 lev=6
num=131 ret=0.55 lngth=5.7 lev=7
num=76 ret=0.57 lngth=5.5 lev=8
num=54 ret=0.99 lngth=4.9 lev=9
num=27 ret=0.52 lngth=5. lev=10.
A simplified optimization was performed here, with only 2 fixed cat-egories ( c = 2 ,
4) and reduced number of iterations. For this period,24 stocks (from 165) performed negatively, including INTC, DELL,EBAY. Trading such ”heavy-weighters” generally requires full opti-mization and at least 3 categories. However here we made the opti-mization fully uniform for all companies and fast, aiming at thousandsof companies. The optimization for INTC or similar, if this is the ob-jective, must be done more thoroughly. The following 24 companieshad negative returns:
ADBE num= 90  ret=-0.29% lngth=3.9
AMGN num= 49  ret=-0.48% lngth=9.1
APA  num= 66  ret=-0.25% lngth=5.6
BJS  num= 68  ret=-0.88% lngth=6.9
CHK  num= 58  ret=-0.7%  lngth=8.1
CHS  num= 74  ret=-0.4%  lngth=6.1
COF  num= 49  ret=-0.05% lngth=6.4
COP  num= 45  ret=-0.51% lngth=9.2
DELL num= 88  ret=-0.32% lngth=4.8
EBAY num= 101 ret=-0.51% lngth=4.3
EOG  num= 82  ret=-0.64% lngth=5.4
HD   num= 47  ret=-0.05% lngth=8.5
INTC num= 89  ret=-0.53% lngth=7.1
JNJ  num= 26  ret=-0.94% lngth=11.6
MMM  num= 50  ret=-0.53% lngth=7.2
MOT  num= 67  ret=-0.86% lngth=5.3
NBR  num= 80  ret=-0.99% lngth=5.3
NOV  num= 87  ret=-0.66% lngth=5.1
SNDK num= 90  ret=-0.73% lngth=2.8
SUN  num= 79  ret=-1.04% lngth=3.8
SYMC num= 83  ret=-0.69% lngth=4.2
TEVA num= 80  ret=-0.15% lngth=4.2
TWX  num= 46  ret=-0.27% lngth=10.4
XLNX num= 82  ret=-0.16% lngth=5.5.
Here and above only signals of levels no greater than 4 were used for trading. We invested a symbolic $100 in every position, so multiple signals in one direction increased this amount up to $400, which resembles trading on margin. The first signal in the opposite direction (for this stock) results in the termination of all positions. This regime can significantly improve profitability. Higher levels are more frequent for actively traded companies, so this is some kind of leverage.

We do not use weights here. Let us just mention that investing only in the 100 companies from the 165 above with the best optimization results consistently improves the performance of the system, which is a variant of using weights. However, some companies with solid optimization returns, i.e. suitable for our system, performed just so-so during the control periods. This is the nature of stock markets, discussed well in the literature; see e.g. [YZ].

Let us now provide some auto-generated results of real-time trading simulation with 170 companies, similar to those listed above, under long & short with 4 levels (L1, L2, L3, L4), and for 3 "production lines" (A, B, C). The lines were with different "opti-parameters" and/or different entry points; "B" was counter-trend. The first half, "no weights", describes the uniform trading of all companies; the second half is for the 100 companies with the best returns during the optimization:
TRADING FROM 2007, 2, 20 TO 2007, 6, 4; ALL, NO WEIGHTS:
RET AVR A: RETL1=0.68 RETL2=0.76 RETL3=0.89 RETL4=1.04
RET AVR B: RETL1=0.67 RETL2=0.7  RETL3=0.86 RETL4=0.84
RET AVR C: RETL1=0.61 RETL2=0.7  RETL3=0.75 RETL4=0.75
TRADING FROM 2007, 2, 20 TO 2007, 6, 4; FOR 100 FROM 170:
RET AVR A: RETL1=0.57 RETL2=0.79 RETL3=1.16 RETL4=1.4
RET AVR B: RETL1=0.96 RETL2=1.04 RETL3=1.23 RETL4=1.23
RET AVR C: RETL1=1.08 RETL2=1.11 RETL3=1.19 RETL4=1.17.
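The regime with multiple signals described before these tables (a symbolic $100 per signal, at most $400 in one direction, termination of all positions at the first opposite signal) can be sketched as follows. This is only an illustration under stated assumptions, not the actual trading program: the function name, the representation of signals as a dictionary, and the rule that the terminating signal also opens a new position in its own direction (long & short mode) are assumptions.

```python
def run_regime(prices, signals, stake=100.0, max_stack=4):
    """Sketch of the stacking regime for one stock.

    prices  -- list of quotes, one per trading point
    signals -- dict {index: +1 (buy) or -1 (sell)}
    Returns (realized profit in dollars, current direction, open positions).
    """
    direction = 0      # +1 long, -1 short, 0 flat
    entries = []       # entry prices of the currently open positions
    realized = 0.0
    for t, price in enumerate(prices):
        sig = signals.get(t, 0)
        if sig == 0:
            continue
        if direction != 0 and sig != direction:
            # First opposite signal: terminate all open positions.
            realized += sum(direction * stake * (price - e) / e for e in entries)
            entries = []
            direction = 0
        if direction == 0:
            # Assumption: the same signal opens a new position in its direction.
            direction = sig
        if len(entries) < max_stack:
            # Each further signal in the same direction adds one more stake.
            entries.append(price)
    return realized, direction, len(entries)


# Two buy signals, then a sell signal that terminates both long positions.
print(run_regime([100.0, 110.0, 121.0, 110.0], {0: +1, 1: +1, 3: -1}))
```

The cap `max_stack=4` reflects the use of levels no greater than 4, so that the invested amount never exceeds $400 per stock.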
The returns here are per position; the average position lasted about 5 days; SPY increased 5.5% during 2007/02/20-2007/06/04. Actually about 1000 companies were traded for this period, combined in groups based on trading volumes, with about 170 in each. Every company was traded in 12 different "lines", so the total was 72 lines. The average return was about 0.7% per position; the average position was about 5 days. The results above are for 3 lines only.

The optimization procedure is based on the gradient method and is actually not far from the methods used in neural networks; see [BBO, HG]. It was almost always with solid returns for any equities and "learning periods", in spite of using very few parameters. This alone is some discovery. However, predicting the future is of course much more subtle and much less certain, in spite of the fact that risk-taking preferences of investors are quite conservative. In our approach, we only try to predict the ways investors react to news, but not the news itself! See here e.g. [CT] for various algorithms used in financial mathematics.

3.6. Some charts.
To clarify the logic of the decision-making inside the system, we will provide the performance graphs describing in detail pro-trend, long&short "trading" of SPY and XAU (Gold & Silver) using the historical stock quotes once a day. All signals, trades, positions and returns can be seen under sufficiently high magnification. These charts are upon the optimization, so we provide them mostly to clarify the "logic" of the system. Generally, using day-quotes only is a serious demerit; the system works reasonably, but the performance is worse than trading SPY above with 3 quotes a day.

We use green, grey and cyan correspondingly for the price-change, the returns based on level 1 signals, and those based on level 2 signals. Correspondingly, buy-sell signals are marked by blue-red rectangles-ovals; large ones mark trades for level 1 signals. See Figures 2, 3.
Figure 2. SPY, Long-Short, Daily Historical Quotes
RETURNS LEV1:78.64, LEV2:68.36. SIMPLE RETURN:45.36. LENGTHS LEV1:10.8days, LEV2:11.4days.

Figure 3. XAU, Long-Short, Daily Historical Quotes

XAU fragment: from 2007/08/31 - [...]. The first large (i.e. lev=1) red oval was the start of a short position terminated at the large blue rectangle at a loss: grey strip down.

Trading indices and commodities generally requires special approaches; see e.g. [FPSS, GTW]. Our system manages them reasonably, but it appeared necessary to increase the number of used categories to 4, especially for SPY, versus our usual 2-3 for individual companies. This is natural, since indices and some commodities are subject to many kinds of investing and hedging.
This generally creates a lot of "noise" and makes it difficult to catch their "response to news" in time. Their charts, especially short-term, are of very stochastic nature. Nevertheless, our (automated) discretization procedures and other parts of our algorithms proved to be efficient.

The moments of buy signals (all of them, of all levels) are marked by blue rectangles; the large ones correspond to level 1 signals. Accordingly, the sell signals are marked by red ovals; large for level 1. The blue and red vertical lines connect the level 1 execution points in the middle of the grey graph with those of the green equity chart. The returns graphs are changed only upon the terminations. The cyan graph is for the trades based on level 2 signals; here vertical lines are not used. The returns are in percent from the beginning of the graph.

To help the readers, we provide a fragment of the XAU Figure 3. For example, here the first level 1 trade, marked by a large red oval (the first such), lasted till the first large blue rectangle and was executed at a loss: a vertical drop of the grey strip after the termination; XAU went up significantly and "unexpectedly" here. However, the next trade, which was short on the sell signal of level 2, shown by the next (small) red oval, appeared successful: a small increase of the cyan strip.

These two charts are upon the optimization, so they only evaluate the quality of the optimization, i.e. what our automated optimization procedure produced for this period. Only control periods (out-of-sample!) can be used to estimate real profitability. However, these charts clarify the "logic" of the system. By the way, its unstable performance in the beginning can be expected; the system needs sufficient "history".

By simple returns here, we mean the total returns of SPY and XAU during the considered period (green curves). Only signals of levels 1, 2 were used for "trading" (grey and cyan).
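The way the returns graphs are drawn (they change only upon the terminations and are measured in percent from the beginning of the graph) can be illustrated by a minimal sketch. It is not the code behind Figures 2, 3: the function names and the toy data are hypothetical, and it assumes that each signal of the opposite direction terminates the current position and opens the opposite one, and that the per-position returns are simply added (not compounded).

```python
def returns_curve(prices, signals):
    """Cumulative return in percent; it changes only when a position is terminated.

    prices  -- one quote per day
    signals -- dict {day: +1 (buy) or -1 (sell)}
    """
    curve, total = [], 0.0
    direction, entry = 0, None     # current position: +1 long, -1 short, 0 none
    for day, price in enumerate(prices):
        sig = signals.get(day, 0)
        if sig != 0 and sig != direction:
            if direction != 0:
                # Termination: the return of the closed position is added.
                total += direction * 100.0 * (price - entry) / entry
            direction, entry = sig, price   # assumption: open the opposite position
        curve.append(total)
    return curve


def simple_return(prices):
    """Total return of the instrument itself over the period, in percent."""
    return 100.0 * (prices[-1] - prices[0]) / prices[0]


toy_prices = [100.0, 110.0, 99.0, 90.0]
toy_signals = {0: +1, 1: -1, 3: +1}
print(returns_curve(toy_prices, toy_signals), simple_return(toy_prices))
```

In this toy example the curve is flat between the terminations, which corresponds to the step-like grey and cyan strips, while `simple_return` plays the role of the green curve's total.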
USING WEIGHTS.
Let us provide the performance results for the following 2 periods: 2001/3/21 (9:30) - 2001/6/14 (13:30) and 2000/10/24 (9:30) - 2001/6/10 (13:30), with correspondingly 60 and 113 days. The graphs below are in terms of "trading points", when the system "visits the market" (receives quotes), here 3 times a day. So the number of points is approximately 180 and 339 for these control periods. We focus on using weights based on the prior optimization returns. Namely, the better the optimization returns, the greater the amounts to invest in this stock. Picking the companies with optimization returns greater than some limit is a variant of using such weights. The 75 companies were traded, long & short, pro-trend (i.e. essentially under mean reversion trading); they were mainly taken from the list of the most liquid ones. Sharpe Ratio (SR) is Mean/Standard Deviation. By "straight", we mean that a symbolic $100 was invested per any position (long or short) for the companies with the optimization (prior!) returns > 0% and > [...]%.

[Statistics, 60 days: WEIGHTED DAILY RETURNS and STRAIGHT DAILY RETURNS for the two thresholds, each with the Mean and the Annual Sharpe Ratio (a multiple of the Daily SR); numerical values not recoverable.]
[Plot, 60 days, horizontal axis in trading points: STRAIGHT RETURN vs SIMPLE FOR RETURNS > 20: 106.36 vs 16.28.]
[Statistics, 113 days: WEIGHTED DAILY RETURNS and STRAIGHT DAILY RETURNS for the two thresholds, each with the Mean and the Annual Sharpe Ratio (a multiple of the Daily SR); numerical values not recoverable.]
[Plot, 113 days, horizontal axis in trading points: STRAIGHT RETURN vs SIMPLE FOR RETURNS > 20: 146.78 vs - [...].]

Figure 4. 75 Companies, L&S, 3 Quotes a day

4. Pont, a card model
General design.
This game is a combination of bridge and Russian preference with a poker-style auction. The name "bridge" was derived from the earlier "biritch", so we make it further from the origin (and shorter). It utilizes a standard deck of 52 cards or a smaller one of 36 cards. The auction is quite different from that of bridge and involves more risks. See here and below [Pa]. The bidding does not use the denomination of suits. The player who starts the auction has no advantage. The cards may be updated while bidding, which resembles draw poker. The winner of the auction, the declarer, determines the final number of cards per hand as part of the declaration of the contract: the trump and the minimal number of tricks to be taken.

Following suit and the use of trump cards is similar to bridge-type games. The scoring is simpler than that of bridge. The declarer's award is based on the value of the contract depending upon whether or not it was made. The game can be for 2, 3, 4 players, 2 partnerships, or for 1 versus computer. There is also a poker-like version. All variants are almost equally dynamic and playable.
Stock market connections.
The game, especially the auction, can be considered as a simple model of playing the market, especially under momentum "investing on news". The bids then are some counterparts of the forecasts of share-prices. The play checks the quality of the bid, but this is not related to real trading, where this quality is the return upon the termination of the position taken.

The number of cards per hand and the number of taken tricks reflect respectively the duration of the investment and the return. The downplay and misère resemble a bit selling short, but this is superficial. This is a game: just a model.

The suits are substitutes for the time-horizons of investments or the companies considered for investing. They are on equal grounds in pont, in contrast to other bridge-type games. Given a suit, the better the cards, the more reasons to make it a trump. In our trading system, the category of the top bid determines the time-horizon of the investment, though the categories are ranked, in contrast to suits in pont.

The players compete to become the declarer, which is somewhat similar to winning the "right" to invest. The upgrades and increases are designed to reflect real-time actions. The bids are actually fractions, which adds some "timing"; they depend on the size of hands (from 6 to 9), which has no counterparts in other bridge-type games.

The play itself has little to do with real playing the stock markets. For instance, the use of trump cards and positions of players around the table have no market analogues. The role of such special elements of card games is diminished in pont. However, they are inevitable; the game must be not too primitive. Also, more playable games have stronger roots in our psychology. Making pont playable was a challenge, since it uses unusual fractional bids, related to our approach to risk-taking. This was a test of the principles of our trading system. We think that playing pont can help to get used to our [...] and in real playing the stock markets, possibly better than playing poker or bridge.
Description.
The game uses a standard card deck of 52 cards for 4 players or a smaller, four-suit deck of 36 cards (from the ace down to the 6), when there are two or three players. In the case of 4 players, they may divide themselves into two partnerships; here the whole deck is used too. The dealers are changed clockwise after each game. The cards are dealt singly in the clockwise order and face down, giving each player six cards. After the players pick up their hands, the dealer starts the auction by making a bid or passing.
Auction.
A bid is a fraction N/D, where the denominator D is from 6 to 8 and the numerator N is no larger than D. Generally speaking, the bid is the expected number of tricks to be taken (N) divided by the final number of cards per hand (D). The latter may be from 6 to 9. The fraction must be no smaller than 3/6 for 3 or 4 individual players, and no smaller than 4/6 for 2 players or partnerships. The fractions 4/8, 7/7, 8/8 are excluded. The bids 3/6, 4/7, and 5/8 are not allowed for 2 players, but are accepted for 3 or 4 players.

The auction proceeds clockwise, with each player either making a bid that is not lower than the previous ones of other players (for instance, 4/6, 5/7, 6/8, 5/6, 6/7, 7/8, 6/6 may be claimed after 4/6) or saying "pass". A player who has already passed may not bid after the first bid was made. Passing is allowed after bidding only if there are other players who did not pass; also, the last remaining player may not pass. The round of bidding continues until the last bid, when a player (who then becomes the closer) repeats his/her previous bid for the first time, or simply says "close". If the others (two opponents for the team variant) passed after this, the closer becomes the declarer. Otherwise there is no declarer.

More rounds are necessary if all players passed or at least two of them claim the same bid. To start the next round, the dealer upgrades the cards, giving out a card per hand face down. Then each player picks up the card and after this removes one card from the hand by laying it face down, i.e. the hands must be of 6 cards again. Then the closer (or the dealer if all passed) claims first, repeating or enlarging his/her last bid, and the auction continues following the same rules until the first repetition. Those who passed during the previous rounds do not bid, unless all passed. The cards may be upgraded only twice.

Taking no tricks.
If all passed after the last (the second) upgrade, the dealer leads to start the downplay notrump, where the players are trying to win the smallest number of tricks. Also, the closer starts the downplay if two or more players (or both teams, if applicable) do not pass after the second upgrade, which is the last, but claim coinciding bids, i.e. neither of them is the winner.

At the end of the game, the numbers of taken tricks will be diminished by the minimum number, which is to make it zero at least for one player, and subtracted from the corresponding scores. In the case of 2 players, this diminished number must be divided by two before subtracting; e.g. the player who took 4 tricks will lose 1 point, which is 4 minus 2, the number of tricks of the opponent, divided by 2.

A player may claim misère, which means that no tricks will be taken. This may be done only before the first upgrade, and it can be beaten by 6/7 or higher for 2 players (teams), and by 5/6 or higher for 3 or 4 players. Misère is played notrump. The declarer makes the opening lead by placing the card on the table face up. If there are 3 or 4 individual players, all cards are placed face up on the table after this. It is the same for partnerships, but the partner does not participate, laying his/her cards face down. The misère contract is defeated if either of the opponents finds a way in which the declarer takes at least one trick.
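The list of admissible bids from the auction rules above can be enumerated directly. The sketch below (function and variable names are illustrative) applies the stated constraints on N/D and orders the bids by the value of the fraction, which reproduces the order 4/6, 5/7, 6/8, 5/6, 6/7, 7/8, 6/6 given in the text; misère is not included.

```python
from fractions import Fraction

# Fractions excluded by the rules of the auction.
EXCLUDED = {(4, 8), (7, 7), (8, 8)}


def admissible_bids(individual_players):
    """Bids (N, D) in increasing order.

    individual_players -- True for 3 or 4 individual players,
                          False for 2 players or partnerships.
    """
    lowest = Fraction(3, 6) if individual_players else Fraction(4, 6)
    bids = [
        (n, d)
        for d in (6, 7, 8)               # denominators allowed in bids
        for n in range(1, d + 1)         # the numerator is no larger than D
        if (n, d) not in EXCLUDED and Fraction(n, d) >= lowest
    ]
    # Keep (N, D) pairs, since Fraction(4, 6) would be reduced to 2/3.
    return sorted(bids, key=lambda bid: Fraction(*bid))


print(["%d/%d" % bid for bid in admissible_bids(True)])
print(["%d/%d" % bid for bid in admissible_bids(False)])
```

Note that the excluded fractions 4/8, 7/7, 8/8 are exactly those whose values coincide with 3/6 or 6/6, so no two admissible bids are equal as numbers and the ordering is unambiguous.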
The play.
After the auction, the declarer may increase, asking the dealer to deal out 1 card per player face down. The procedure can be repeated several times, but the maximal number of cards per hand must be no greater than 9. The declarer picks up the cards every time. The others will do this only after the declaration of the contract. Then the declarer declares the contract, choosing the trump suit or notrump, which is allowed, and stating the minimal number of tricks to be taken (including the partner's tricks for the partnerships). The denominator "D" equals the number of cards per hand after the last increase.

The number of tricks to win cannot be smaller than the final number of cards per player (after the last increase) times the fraction from the last declarer's bid. The bid "misère" can be replaced by the declarer with the contracts 6/7, 7/8, 8/9, 6/6, 7/7, 8/8, 9/9. It is the same for 2, 3, 4 players, and the partnerships. Also, the last bid 6/8 can be replaced by misère if there were no upgrades and increases. The partner's hand is discarded face down when playing misère in the team variant.

The declarer starts the play trying to take enough tricks to fulfill the contract or take no tricks for misère. For partnerships, anytime during the play the declarer may ask the partner to place all his/her cards face up on the table, and then the declarer starts playing both of the partnership hands (unless in misère). All players have to follow suit if they can. Otherwise they must trump. Only the declarer may lead a trump. Other players may do this only if they have no other suits left. The play lasts until the declarer (together with the partner if applicable) takes the necessary number of tricks or the contract is defeated.
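The lower bound on the contract stated above (the final number of cards per hand times the fraction from the last bid) can be computed with integer arithmetic. The following is a literal reading of that one sentence only; the helper names are illustrative, and any further restrictions of the bidding table, as well as misère, are not modeled.

```python
def minimal_tricks(last_bid, cards_per_hand):
    """Smallest integer not below cards_per_hand * N / D for the last bid N/D."""
    n, d = last_bid
    return -(-cards_per_hand * n // d)   # ceiling via floor division


def minimal_contracts(last_bid, max_cards=9):
    """Minimal contracts (tricks, cards) for 6, ..., max_cards cards per hand."""
    return [(minimal_tricks(last_bid, cards), cards)
            for cards in range(6, max_cards + 1)]


print(minimal_contracts((5, 6)))
print(minimal_contracts((4, 6)))
```

For the bid 5/6 this gives 5/6, 6/7, 7/8, 8/9, and for 4/6 it gives 4/6, 5/7, 6/8, 6/9.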
Score system.
At the end of the play, the declarer's score goes up by the value of the contract, which is the number of tricks from the contract minus 3 for 2 players (or partnerships) and minus 2 for 3-4 individual players, if the contract was made. Otherwise this value is subtracted from the score. If the last bid before the first upgrading was greater than or equal to 5/6, then this value goes up by one, called the premium (when adding or subtracting). The same premium is added to misère, treated as 5/6 when calculating the score (3 points for 2 players/teams and 4 points for 3, 4 individual players). A fulfilled contract of fraction N/D = 1 gives 1 bonus point for 2 players (partnerships) and 2 bonus points for 3-4 individual players. For 3 or 4 individual players, successful contracts 5/6, 6/7, 7/8, 8/9, 9/10, and misère add 1 bonus point to the declarer's score. In contrast to the premium, the bonus is not subtracted from the score if the contract fails.

There is another, somewhat more involved, variant of the pont score system with more "punishment" for defeated contracts. The play goes till the end. If the number of taken tricks is less than it was declared, then the score of the declarer is diminished by the value of the contract multiplied by the number of missed tricks. Say, if the declarer took the necessary tricks but one, the score becomes smaller by the value of the contract. This score system is for experienced players.

Finally, the rewards will be proportional to the scores of the players diminished by their arithmetic mean, that is the total of all scores divided by the number of players. The partners may redistribute the total partnership reward (the sum of their rewards). The standard recommended way is as follows. If both rewards are positive or negative, then it is the same as for individuals. If the first reward is positive, the second is negative, and the total is negative, then the first partner doesn't pay. If the total is positive here, then the second partner receives nothing (and pays nothing).
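The arithmetic of the basic score system can be summarized in a short sketch. It covers only the value of the contract with the optional premium, the downplay subtraction described earlier, and the final rewards; bonuses and the variant with extra "punishment" are left out. The function names are illustrative, and the proportionality factor of the rewards is taken to be 1, which is an assumption.

```python
def contract_value(tricks, individual_players, premium=False):
    """Tricks minus 2 (3-4 individual players) or minus 3 (2 players, partnerships)."""
    value = tricks - (2 if individual_players else 3)
    return value + 1 if premium else value


def downplay_penalties(tricks_taken):
    """Points subtracted after a downplay: tricks diminished by the minimum number."""
    lowest = min(tricks_taken)
    diminished = [t - lowest for t in tricks_taken]
    if len(tricks_taken) == 2:
        # For 2 players the diminished number is divided by two.
        diminished = [d / 2 for d in diminished]
    return diminished


def rewards(scores):
    """Scores diminished by their arithmetic mean (assumed factor 1)."""
    mean = sum(scores) / len(scores)
    return [s - mean for s in scores]


print(contract_value(6, True), contract_value(6, False))
print(downplay_penalties([4, 2]))
print(rewards([4, 1, 1]))
```

By construction the rewards always add up to zero. The misère values quoted above are recovered as `contract_value(5, False, premium=True)` and `contract_value(5, True, premium=True)`.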
Bidding table.
The following table is the list of bids in the increasing order and the corresponding minimum contracts for different numbers of cards per hand. The stars (adding 1 point each to the score) show the premium p for declaring during the first round of the auction and the bonus b for making the contract.

3-4 individual players      contracts               2 players (partnerships)
names   b   p   bids        tricks / cards          bids   p   b   names
1                                                   -              -
1+1                                                 -              -
1+2                                                 -              -
2                                                   4/6
                                                    5/7
                                                    6/8
        *   *   m/6         ...., 6/7, 7/8, 8/9     5/6    *
        *   *   5/6         ...., 6/7, 7/8, 8/9     m/6    *       m
        *   *   6/7                                 6/7    *
        *   *   7/8                                 7/8    *
        **  *   6/6                                 6/6    *   *

Here misère (m = m/6) has the same list of admissible contracts as 5/6, but is ranked higher for 2 players (partnerships) and lower for 3 or 4 individuals. Recall that the misère contract may be played after the last bid 6/8 or smaller; m/6 is omitted in the column of contracts.

The names of the bids are convenient when bidding. The name gives the number of additional cards (after +) and the value of the (lowest) contract coinciding with the bid, calculated without the premium and bonus. For instance, the value of 2+2 = 6/8 for 3, 4 players equals 2+2=4. For 2 players, a contract with 6 tricks gives 3 points.

4.3. Variants.
Basic-Pont (BP).
The simplest version of the game is the basic pont, which is played without misère and "premium". The table is also simplified by dropping the bids of denominator 8 (the +2-bids):

3-4 individuals         contracts           2 players (teams)
names   b   bids        tricks / cards      bids   b   names
1                                           -          -
1+1                                         -          -
2                                           4/6
                                            5/7
        *   5/6                             5/6
        *   6/7                             6/7
        **  6/6                             6/6    *

Poker-Pont (PP).
Another variant is poker pont for 2, 3, 4 individual players. It follows the table of the basic pont, without bonuses.
PP-betting.
As in poker, each player puts up an ante (one chip or more) to form a pool, which consists of the pot and the sectors, one for a player. A player always puts chips in the corresponding sector. The dealer starts betting, adding chips to the pool or putting nothing, passing. The player on the dealer's left may pass, call by putting the same, or raise by adding extra chips of his/her own. Other players continue clockwise until all have finally called any raises. A player may raise after passing if the latter was before the first raise. Passing is allowed after raising or calling if there are other players who did not pass.

The first player who calls without adding is the closer. If all other players passed, the closer is the declarer. If there is no declarer, the dealer upgrades cards and the closer starts another round of betting by raising or doing nothing. Those who passed before, if at least one player raised, may not bid. There are optional upgrades in poker pont; they may be omitted. Then the card will be dealt to the next player. A player who upgrades puts a chip to his/her sector (per each new card). The number of upgrades is no more than 3 (for 4 rounds of betting). If still all pass after the last upgrade, then the ante goes to the pot, the dealer is changed clockwise, and a new game starts.

PP-play.
If there are two or more players who put the same number of chips (regardless of the extra chips for upgrades, which may be different), then the closer begins one round of bidding among those players only. It is as in the basic pont; the declarer is a player claiming the highest bid. If still there is no declarer, there will be no more upgrades. Then the dealer moves all chips but the ante from the sectors to the pot, and the next dealer starts a new game.

The declarer may increase several times (no more than 3), adding a chip per increase to the pool. After the declaration of the contract, all opponents pick up the cards and respond or pass clockwise, starting with the first on the declarer's left. One must add one chip per each increase (totally, the current number of cards per hand minus 6) to the pool to respond and become an active opponent. Other opponents are passive. However, all participate in the play, which follows the standard rules.

The declarer leads and wins the pool (including the pot) when making the contract. If the latter is defeated, then all the opponents, active and passive, take chips back from their sectors, and the active opponents divide the declarer's chips and the pot among themselves proportionally to the number of taken tricks. The fractions are ignored and the remaining chips (if any) go to the pot. If there are no active opponents, the declarer takes his/her chips back even in the case of the failure (but not the pot). The contract has to be the minimal possible for the current number of cards per hand. Namely, 3/6, 4/7 for 3-4 players, 4/6, 5/7 for 2 players, and 6/8, 6/9 for either. It may not be lower than the last bid if there has been a round of bidding to determine the declarer.

4.4. Comments.
Additional rules.
Extra penalties can be added for breaking the rules. The opponents may decide to diminish the declarer's or partnership's score by the value of the contract if the declarer (partnership) made a mistake against the rules when playing. Vice versa, in the case of an opponent's mistake, the declarer has the right to consider the contract to be fulfilled, and the other player(s) may decide to subtract its value (or its doubled value) from the score of the opponent whose fault it is. In the poker pont, the contract is considered to be defeated in the case of a declarer's mistake. If it is an opponent's mistake, the chips from the pool are distributed as if the contract were defeated, and the opponent who made a mistake gives this very number of chips to the declarer. These are basics, to be developed by players.

The following regulation could improve the coordination of the opponents (for 3 or 4 individual players) and may be added to the rules. The opponents have to play the lowest card higher than the card of the declarer & partner to win the trick if they can. However, the card must be the lowest possible to leave the trick to an opponent whose card already beats the cards of the declarer & partner. As to the partnerships, a general regulation is to at least repeat the bid of your partner if you have 2 sure tricks or more, i.e. could win two tricks for any trump. For instance, it may be either "A A", or "A K" in the same suit, or "A" in one suit and "K Q" in another. When you pass, but the opponents don't, it can stimulate your partner to pass or claim misère. To avoid this, it makes sense to bid if you can count on 3 (or more) tricks upon declaring your trump, especially if you have honors and the hand seems good for the increases. This is just to give an example of such coordination.

A computer version.
The computer realization of the variant for two players is based on the following principles. The computer is programmed to select the best move considering several random choices of the hand of the player (taking into account all information about the cards of the player appearing during the play). It does the same when bidding and declaring, but diminishes the most likely bid and contract by one level. The simplest one-way version is when the computer never bids (and has no score), and the player either determines the contract (without upgrades) and then plays following the standard rules, or passes, subtracting 2 points from his/her score. It follows basic pont. However, the bidding scale and the admissible contracts start from 4/7, giving 1 point. More generally, the bid (3+k)/(6+k) is counted as k points for k=1,...,6.

The computer's basic strategy is to win the trick led by the player with the smallest possible card and to play the lowest card otherwise. If it has no proper suit and no trump left, the card can be the lowest (from the shortest suit, if there are several cards of the same rank). However, the suits where the player has no cards, according to the information during the process of play, are considered the best. When the computer leads, then the suit where the player has no cards is the first choice too. If the highest card (one of them if there are several of the same rank) has the adjacent one in the same suit (say, the pairs "A K" or "K Q" are adjacent), or the next card in the suit is lower by 4 or more (say, "A 10"), then this is the second choice for leading. Otherwise the suit must be the longest and the card the highest among the longest suits. However, the longest suit where the two highest cards are adjacent is considered first. If still there are several choices, the computer decides randomly. These are of course very basic considerations; the actual computer program can be significantly more developed.

4.5. Concluding remarks.
Let us stress that our trading system is not a black box; the logic of its decisions concerning trading stocks (any instruments) can be fully reconstructed and understood; cf. [HG]. We found few situations where its decisions could be questioned on the basis of the usual technical analysis, though the system uses the stock-charts and its own prior decisions in novel ways. Pont clarifies some principles of our approach and tests them "psychologically". We also hope that playing pont can help to get used to our [...].

The bidding table of pont and the one used for the system's 2-bids (b, c) are similar, and this is not just an analogy! The auction and bidding seem fundamental for any intelligence. This can be within some expert system, inside our brain or AI. Poker and contract card games serve humankind well as a risk-taking playground: they obviously capture something important about human cognition. See [Pa].

Obviously, using computers makes bidding formal and not "immediately understandable". The automated optimization and deep learning are even more difficult to interpret, even if every optimization step can be seen in detail, as in our programs. Generally, machine learning is fully "trustworthy" only if the results can be clearly interpreted "humanly". In our trading system, the optimization is mostly of this kind due to the small number of parameters our system deals with. The main ones are the categories, the modes (long/short, pro/counter), key thresholds, and some derived parameters like the average duration of positions; all are meaningful to investors. Our usage of power functions in the tables of 2-bids has solid grounds too, as we tried to demonstrate.

The discretization, which is necessary to separate noise from signals, is not really "intuitive", but the "action potentials" are always necessary and any usage of computers requires discretization. In our trading system we made the discretization as "human" as possible. The author of the paper is a specialist in discrete theories (mostly "integrable"), but the market reality resulted in a non-standard auction-style stratified discretization. It is new, though using the data stratification and sample curves is common in neural networking. It is likely that our approach reflects the risk-taking processes in our brain; its successful market implementation can be regarded as some confirmation.

The importance of finding an optimal relation between the decisions and sampling frequencies is well recognized. Let us quote [Si]:
Though available data are sampled at discrete intervals of time - daily, weekly, and so on - it need not be the case that economic agents make their decisions at the same sampling frequency. Yet it is not uncommon for the available data, including their sampling frequency, to dictate a modeler's assumption about the decision interval of the economic agents in the model. Almost exclusively, two cases are considered: discrete-time models typically match the sampling and decision intervals - monthly sampled data mean monthly decision intervals, and so on - whereas continuous-time models assume that agents make decisions continuously in time and then implications are derived for discretely sampled data. There is often no sound economic justification for either the coincidence of timing in discrete-time models, or the convenience of continuous decision making in continuous-time models.
This is actually the key problem we address in our trading system and this paper: how to coordinate different "decision intervals" and what is optimal decision-making based on a simultaneous analysis of several "frequencies". This is a must for AI systems focused on trading and of obvious importance well beyond stock markets.
Timing the market is and always was a great challenge, but now we have a new chapter: a systematic AI-based research and optimization of the process of investing. The usage of AI is a must here because the only reliable way to test the performance of any trading system is when (a) it is fully(!) automated (machine learning included), (b) someone else (not the creator) runs it, (c) the design and analysis of the experiments are as rigorous as possible, and (d) all findings are confirmed by real-time trading, which obviously requires full automation.
We provide a sufficiently complete description of the basic principles of our trading system and the ways it was tested. Not all aspects of our approach were addressed here. The system consists of a lot of programs; many are used for technical data processing, including but not limited to managing historic and real-time quotes, practical matters like splits and dividends, and so on. Quite a few serve the optimization, historical and real-time. The real-time optimization uses the system's own history of trades; the parameters are upgraded "while trading" (normally during weekends). Historic simulations require a lot of special software too. This is on top of the actual trading programs and those monitoring the performance. The coordination of such a ramified combination of service, optimization and action programs is quite a test for any system; this is no different from the ways our brain works.
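To make the notion of several "decision intervals" concrete, the following minimal Python sketch may help. It is an illustration only and is not taken from the trading system: it computes the returns of one sampled price series over several horizons simultaneously and compares each with a power-type threshold scale * h**exponent. The function names, the horizons, and the values of scale and exponent are hypothetical; they are not the tables of 2-bids.

```python
from typing import Dict, List, Sequence


def horizon_returns(prices: Sequence[float], horizons: Sequence[int]) -> Dict[int, float]:
    """Relative price change over the last h samples, for each horizon h."""
    last = prices[-1]
    result = {}
    for h in horizons:
        if h < 1 or h >= len(prices):
            raise ValueError(f"horizon {h} must satisfy 1 <= h < len(prices)")
        base = prices[-1 - h]          # price h samples before the last one
        result[h] = (last - base) / base
    return result


def power_threshold(h: int, scale: float, exponent: float) -> float:
    """Hypothetical expected move over horizon h: scale * h**exponent."""
    return scale * h ** exponent


def horizons_in_agreement(
    prices: Sequence[float],
    horizons: Sequence[int],
    scale: float = 0.002,      # assumed value, for illustration only
    exponent: float = 0.5,     # assumed value, for illustration only
) -> List[int]:
    """Horizons whose realized return exceeds the hypothetical threshold."""
    returns = horizon_returns(prices, horizons)
    return [h for h in horizons if returns[h] > power_threshold(h, scale, exponent)]


if __name__ == "__main__":
    # Toy series sampled once per time unit; the last few samples trend upward.
    toy_prices = [100.0, 100.1, 99.9, 100.0, 100.2, 100.1, 100.4, 100.9, 101.5]
    print(horizon_returns(toy_prices, [1, 2, 4, 8]))
    print(horizons_in_agreement(toy_prices, [1, 2, 4, 8]))
```

Using a single series for all horizons keeps the sampling frequency separate from the decision intervals, which is the distinction made in the quote above.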
Beyond stock markets.
The stock markets can be considered as a great model for many aspects of decision-making. In our approach, the impact of "events" is measured indirectly, via the responses of the "agents", which is quite standard in sociology and statistical physics. Many of our findings seem of universal nature. For instance, our equations connecting price-functions with the news-functions can be equally used to model the relation between the expected resources for a task and those actually used, presumably including our brain.
"Investing" has its special features. Under momentum risk-taking, the agents seek to optimize their actions: (a) entering the "game" quickly when a clear signal is detected, and (b) exiting when some price-function reaches expected levels. Practically, we use here tables of 2-bids and forecasting-termination curves. Theoretically, power functions and their generalizations, Bessel and hypergeometric functions, are of significance here, as was demonstrated in Section 2.
Importantly, we focus on the time-intervals when the news impact remains growing. The main reason is obvious: an objective of any trading is to capture local maxima of price-functions. Also, it is quite likely that other events and profit-taking will occur before the "natural decay" of the news impact. It is common in mathematical finance and physics to analyze the asymptotic behavior of correlation functions, which generally vanish at infinity; we model only short-term impacts.
Our brain can be considered as a kind of social system, though with a huge number of neurons and very complex interactions. Assuming this, "events" reveal themselves via some waves of "mass behavior". Accordingly, such waves are likely to be the main information available to individual neurons and the key source of their "decisions", governed by action potentials and similar mechanisms. Eventually our brain creates some "images" of the underlying events. We are even able to form this way abstract concepts, such as space-time.
Philosophically, let us at least mention here Kant; see e.g. [Ja].
Our analysis of stock markets, especially the simplicity of the basic differential equations we propose, can be an indication that the power-laws for the impacts of "events", auction-type procedures, and certain price-functions are present in the biology of the brain at the neural level. This is related to neural networking. The price-function generally measures the current importance of the event and the corresponding expected resources needed for its analysis. Our brain will try to diminish the neural activity when some "price-levels" are reached, though the price (the importance) varies depending on the intensity of the triggered brain activity. This can even result in periodic "waves of interest" in an event: an auto-mechanism for its abiding analysis, which we mathematically associate with Bessel functions. There is of course some "macro-management" too, say timely corrections of the failed decisions. The mechanisms of such conscious or unconscious "re-visiting" of the analysis of past events are obviously complicated.
4.6. MRT: main findings.
(1) Cognitive science. The origin of our approach in cognitive science is the concept of momentum risk-taking, MRT, which can be defined as short-term decision-making and forecasting based on the real-time monitoring of the actions of other agents. Poker and our pont are good examples of games with similar data, but stock markets are of course the main source of this concept. In contrast to thinking-fast and thinking-slow from [Ka], when the "agents" can generally choose between two modes of thinking (unless in specially crafted experiments), there is no such choice here and high uncertainty is generally involved. Investors are assumed to decide "optimally" on the basis of the current news impact. Our restriction to short-term decisions and forecasts makes it possible to propose a mathematical, quantitative model of MRT, in contrast to thinking-fast, which is generally qualitative.
(2) Toward general-purpose AI. The restriction to MRT seems a realistic approach to general-purpose AI systems. MRT is obviously one of the key parts of any intelligence, not only with humans. There is an astonishing universality of momentum risk-taking; those who master it in one field can generally use their expertise in other fields upon proper (sometimes little) training. We think that almost the same risk-taking curves (we call them forecasting or termination curves) govern quite a spectrum of short-term risk-management tasks and that the corresponding "learning" is quite uniform almost regardless of the concrete tasks. The neural action potentials provide some discretization and timing, but there must be other mechanisms in the biology of the brain serving MRT. Some are beyond the immediate purpose of MRT, for instance the analysis of prior decisions. Expecting errors and correcting them is what intelligence is about.
(3) Modeling MRT. Importantly, MRT can be modeled mathematically, which we perform thanks to our focus on short-term management. The power growth of our forecasting curves holds only for relatively short periods "after the event". The corresponding differential equations modeling news impact seem sufficiently reliable to us. The trading system described in Section 3 is an experimental confirmation: it is based on the "power-law" for price-functions with exponents depending on the corresponding investment horizons. Mathematically, an argument in favor of our approach is a model of profit-taking in terms of Bessel functions. This relates the periodicity of profit-taking to the asymptotic periodicity of Bessel functions: a new approach to the market volatility, one of the key subjects in quantitative finance.
(4) Market volatility. The closest approach to ours that we found in the vast literature on volatility in stock markets is based on the fractional Brownian motion, fBM, with Hurst exponents reflecting the investment horizons. For instance, the usage of fBM explains theoretically why the volatility is extreme for day-trading (with low Hurst exponents). Some statistical variant of our approach is a consideration of a linear combination of 2-3 fBM corresponding to "heterogeneous time scales". Let us refer here at least to [Che, DB]; see Section 2 above. The approach via fBM does not separate the profit-taking from the "stochastic" volatility of stock markets, which is of key importance for practical trading (and our system). Our theoretical analysis indicates that Bessel processes, generalizing fBM, are likely to emerge here.
(5) Some perspectives. As was quoted in the Introduction, we are decades away from general-purpose AI (USA National Science & Technology Council). However, one can hope that some "prototypes" can be designed faster than this. Even the limited "deep learning" we (mostly) use in our experiments described in Section 3 provided efficient "human-like" behavior of our trading system. It was entirely focused on investing, but designing this kind of MRT for various tasks (with uniform and sufficiently fast machine learning) seems quite doable. It will require (i) further developing the mathematical model of MRT we suggested, (ii) finding its roots in the biology of the brain and psychology, (iii) improving the learning and risk-taking algorithms and making them really universal, (iv) experiments, and more experiments.
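Two standard facts invoked in findings (3) and (4) can be checked numerically. The Python sketch below is an illustration under stated assumptions and is not part of the trading system or of the model from Section 2. First, it evaluates J_n(x) for integer n by Bessel's integral and compares it with the classical large-x form sqrt(2/(pi x)) cos(x - n pi/2 - pi/4), which is what "asymptotic periodicity" refers to. Second, it uses the covariance of standard fBM, (s^{2H} + t^{2H} - |t - s|^{2H})/2, to compute the correlation of two adjacent unit increments, which is negative for Hurst exponents H < 1/2. All function names are hypothetical.

```python
import math


def bessel_j(n: int, x: float, steps: int = 2000) -> float:
    """J_n(x) for integer n via Bessel's integral
    (1/pi) * integral_0^pi cos(n*t - x*sin(t)) dt, composite trapezoid rule."""
    h = math.pi / steps

    def integrand(t: float) -> float:
        return math.cos(n * t - x * math.sin(t))

    total = 0.5 * (integrand(0.0) + integrand(math.pi))
    for k in range(1, steps):
        total += integrand(k * h)
    return total * h / math.pi


def bessel_j_asymptotic(n: int, x: float) -> float:
    """Leading large-x term: a cosine of period 2*pi with amplitude ~ x**-0.5."""
    return math.sqrt(2.0 / (math.pi * x)) * math.cos(x - n * math.pi / 2 - math.pi / 4)


def fbm_cov(s: float, t: float, hurst: float) -> float:
    """Covariance E[B_H(s) B_H(t)] of standard fractional Brownian motion."""
    p = 2.0 * hurst
    return 0.5 * (s ** p + t ** p - abs(t - s) ** p)


def fbm_adjacent_increment_corr(hurst: float) -> float:
    """Correlation of B_H(1) - B_H(0) and B_H(2) - B_H(1).
    Both increments have unit variance, so this equals their covariance."""
    return fbm_cov(1.0, 2.0, hurst) - fbm_cov(1.0, 1.0, hurst)


if __name__ == "__main__":
    for x in (10.0, 25.0, 50.0):
        print(x, bessel_j(0, x), bessel_j_asymptotic(0, x))
    for hurst in (0.3, 0.5, 0.7):
        print(hurst, fbm_adjacent_increment_corr(hurst))
```

The integral representation is used only to keep the sketch free of external libraries; any standard special-function library could replace bessel_j.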
References
[AS] M. Abramowitz, and I.A. Stegun, editors, Handbook of mathematical functions with formulas, graphs, and mathematical tables, National Bureau of Standards, Applied Mathematics Series 55, Tenth Printing with corrections, 1972.
[Al] R. Almgren, Optimal trading with stochastic liquidity and volatility, SIAM Journal on Financial Mathematics 3:1 (2012), 163–181.
[ACH] T. Andersen, and G. Cebiroglu, and N. Hautsch, Volatility, information feedback and market microstructure noise: a tale of two regimes, SSRN 10.2139/ssrn.2921097, 2017.
[ABL] T. Andersen, and T. Bollerslev, and S. Lange, Forecasting financial market volatility: Sample frequency vis-a-vis forecast horizon, Journal of Empirical Finance 6:5 (1999), 457–477.
[BSV] P. Bank, and H.M. Soner, and M. Voß, Hedging with temporary price impact, Math Finan Econ 11 (2017), 215–239.
[BC] A. Borodin, and I. Corwin, Macdonald processes, Probability Theory and Related Fields 158:1-2 (2014), 225–400.
[BBO] A. Borovykh, and S. Bohte, and C.W. Oosterlee, Dilated convolutional neural networks for time series forecasting, Journal of Computational Finance 22:4 (2019), 73–101.
[Bo] J-P. Bouchaud, Power laws in economics and finance: some ideas from physics, Quantitative Finance 1:1 (2001), 105–112.
[BLSZ] B. Bouchard, and G. Loeper, and H.M. Soner, and Ch. Zhou, Second order stochastic target problems with generalized market impact, Preprint: arXiv:1806.08533v1 (2018).
[BDM] M. Broadie, and Y. Du, and C. Moallemi, Efficient risk estimation via nested sequential simulation, Management Science 57:6 (2011), 1172–1194.
[CJP] Á. Cartea, and S. Jaimungal, and J. Penalva, Algorithmic and high-frequency trading, Cambridge University Press, 2015.
[CS] P. Chan, and R. Sircar, Optimal trading with predictable return and stochastic volatility, SSRN 2623747, 2015.
[Ch1] I. Cherednik, Double Affine Hecke Algebras, Cambridge University Press, LMS Lecture Note Series, 319, 446 pgs, 2005.
[Ch2] I. Cherednik, Affine Hecke Algebras via DAHA, Arnold MJ 4:1 (2018), 69–85.
[ChM] I. Cherednik, and X. Ma, Spherical and Whittaker functions via DAHA II, Selecta Mathematica (N.S.) 19 (2013), 819–864.
[Che] P. Cheridito, Mixed fractional Brownian motion, Bernoulli 7 (2001), 913–934.
[ChS] P. Cheridito, and T. Sepin, Optimal trade execution under stochastic volatility and liquidity, Applied Mathematical Finance 21:4 (2014), 342–362.
[CT] V.L.R. Chinthalapati, and E. Tsang (editors), Special issue on algorithms in computational finance, Algorithms 12:4, 2019.
[CK] J. Conrad, and G. Kaul, An anatomy of trading strategies, The Review of Financial Studies 11:3 (1998), 489–519.
[DB] D. Delpini, and G. Bormetti, Stochastic volatility with heterogeneous time scales, Quantitative Finance 15 (2015), 1597–1608.
[EF] R.F. Engle, and R. Ferstenberg, Execution risk: it is the same as investment risk, Journal of Trading 2:2 (2007), 10–20.
[EN] R.F. Engle, and V.K. Ng, Measuring and testing the impact of news on volatility, Journal of Finance 48:5 (1993), 1749–1778.
[FL] J.-P. Fouque, and J. Langsam (editors), Handbook on systemic risk, Cambridge University Press, 2013.
[FPSS] J.-P. Fouque, and G. Papanicolaou, and R. Sircar, and K. Solna, Short time-scales in S&P 500 volatility, Journal of Computational Finance 6:4 (2003), 1–23.
[GJR] J. Gatheral, and T. Jaisson, and M. Rosenbaum, Volatility is rough, Quantitative Finance 18:6 (2018), 933–949.
[GRS] S. Gökay, and A.F. Roch, and H.M. Soner, Liquidity models in continuous and discrete time, in Advanced Mathematical Methods for Finance, editors G. Di Nunno and B. Øksendal, Springer-Verlag, 333–366, 2011.
[GNR] P. Guasoni, and Z. Nika, and M. Rásonyi, Trading fractional Brownian motion, SSRN Electronic Journal, 10.2139/ssrn.2991275 (2017).
[GTW] P. Guasoni, and A. Tolomeo, and Gu Wang, Should commodity investors follow commodities' prices?, SIAM Journal on Financial Mathematics 10:2 (2019), 466–490.
[Gu] O. Guéant, Permanent market impact can be nonlinear, Preprint: arxiv 1305.0413v4 (q-fin TR), 2013.
[GLL] O. Guéant, and J.-M. Lasry, and P-L. Lions, Mean field games and applications, in "Paris-Princeton Lectures on Mathematical Finance 2010", Lecture Notes in Mathematics 2003 (2011), 205–266, Springer, Berlin, Heidelberg.
[HS] J. Ho, and S. Ermon, Generative Adversarial Imitation Learning, Advances in Neural Information Processing Systems 29 (NIPS 2016), 4565–4573.
[HG] E. Horel, and K. Giesecke, Towards explainable AI: significance tests for neural networks, Preprint: arxiv 1902.06021v1 (2019).
[Ja] A. Janiak, Kant's views on space and time, Stanford Encyclopedia of Philosophy (Winter 2016 Edition), E. Zalta (ed.), 2016.
[Ka] D. Kahneman, Thinking, fast and slow, New York: Farrar, Straus and Giroux, 2011.
[Kat] M. Katori, Bessel process, Schramm-Loewner evolution, and Dyson model, Preprint: arxiv 1103.4728v1 (2011).
[KS] A. Korajczyk, and R. Sadka, Are momentum profits robust to trading costs?, Journal of Finance 59:3 (2004), 1039–1082.
[MCL] S. Moazeni, and T.F. Coleman, and Y. Li, Optimal portfolio execution strategies and sensitivity to price impact parameters, SIAM J. Optimization, 30 (2010), 1620–1654.
[Op] E.M. Opdam, Dunkl operators, Bessel functions and the discriminant of a finite Coxeter group, Compositio Mathematica 85:3 (1993), 333–373.
[Pa] D. Parlett, A history of card games, Oxford University Press, 1991.
[Si] K.J. Singleton, Empirical dynamic asset pricing: Model specification and econometric assessment, Princeton, N.J.: Princeton University Press, 2006.
[SC] J. Sirignano, and R. Cont, Universal features of price formation in financial markets: perspectives from deep learning, Quantitative Finance 19:9 (2019), 1449–1459.
[Wa] G.N. Watson, A Treatise on the Theory of Bessel Functions, 2nd Edition, Cambridge University Press, Cambridge, 1944.
[WDQ] T.W. Watts, and G.J. Duncan, and H. Quan, Revisiting the marshmallow test: a conceptual replication investigating links between early delay of gratification and later outcomes, Psychol. Sci. 29:7 (2018), 1159–1177.
[YZ] X. Yang, and H. Zhang, Extreme absolute strength of stocks and performance of momentum strategies?, Journal of Financial Markets 44 (2019), 71–90.
(I. Cherednik)