Artificial Intelligence Alter Egos: Who benefits from Robo-investing?

Abstract

Artificial intelligence, or AI, enhancements are increasingly shaping our daily lives. Financial decision-making is no exception to this. We introduce the notion of AI Alter Egos, which are shadow robo-investors, and use a unique data set covering brokerage accounts for a large cross-section of investors over a sample from January 2003 to March 2012, which includes the 2008 financial crisis, to assess the benefits of robo-investing. We have detailed investor characteristics and records of all trades. Our data set consists of investors typically targeted for robo-advising. We explore robo-investing strategies commonly used in the industry, including some involving advanced machine learning methods. The man versus machine comparison allows us to shed light on potential benefits the emerging robo-advising industry may provide to certain segments of the population, such as low income and/or high risk averse investors.


arXiv preprint, [q-fin.PM]

Catherine D’Hondt, Rudy De Winne, Eric Ghysels, Steve Raymond

July 9, 2019

Introduction

To assess the benefits of robo-investing we use a unique data set covering brokerage accounts for a large cross-section of 22,972 individual investors over a sample from January 2003 to March 2012, which therefore includes the 2008 financial crisis. We have records of all trades, and in addition have detailed information about each individual investor’s characteristics such as age, gender, education, annual net income, and most importantly, risk aversion assessed on the basis of responses to survey questions. Although we work with Belgian individual investors, most of their trading activities pertain to foreign stocks (86% are non-Belgian and roughly a quarter are US). Hence, our analysis pertains to international portfolio selection of stocks and ETFs.

To the best of our knowledge there has not been any assessment of the potential benefits of robo-investing over a long period of time for a heterogeneous panel of individual investors. We explore robo-investing strategies commonly used in the industry, including some involving advanced machine learning methods. The man versus machine comparison allows us to shed light on potential benefits the emerging robo-advising industry may provide to certain targeted segments of the population, such as low income investors and/or investors with relatively little financial literacy.

Our sample has a number of appealing features to study robo-investing. Many investment brokerage firms are now targeting individuals with modest savings as it is generally believed that smaller investors don’t get the investment advice they need. In fact, 71% or almost 90 million American families have investment account balances worth less than $100,000. The growth of automated investment advisory services is filling a need for such investors. Our data set consists of individual investors typically targeted by robo-advising. In terms of annual net income, approximately 70% of the investors in our sample declare an income between 20,000 and 75,000 euros.
The mean portfolio value in our sample is 29,244 euros and the average investor is about 48 years old.

Note that our paper does not directly address the effect on wealth management of adopting robo-advising, as studied by, for example, D’Acunto, Prabhala, and Rossi (2019). On the one hand, our data is richer in terms of details regarding the characteristics (such as income, education, gender, risk aversion, trading habits) of each individual investor. On the other hand, we study a sample where robo-advising was not adopted by the brokerage firm whose trading data we examine. Instead, we introduce the idea of shadow robo-investors to assess the potential benefits of robo-advising. Namely, we study various robo-investors that shadow the individuals in our data set, and the novelty of our approach is that we know what the investors have done in reality versus what a robo-investor would have done instead. In that sense our analysis is a real-time experiment with real data.

Footnote: In the US, robo-advisor start-ups saw an eight-fold increase in their AUM in recent years on the back of some retirement savings shifting to robo-advisor accounts. Cost advantages have been creating significant momentum for the industry. In addition, the success of passive investment strategies in recent years has also been beneficial. It is therefore fair to say that robo-advisors are posing a challenge to traditional financial advisory services. One expects that some robo-advisory start-ups will probably end up in partnerships or be the subject of takeovers by established asset management firms or banks in the coming years. Moreover, the traditional asset managers themselves are also adopting robo-investing strategies. In that respect, robo-advising will become more mainstream.

The robo-investors are limited to the set of stocks and ETFs in each individual investor’s trading history, using a rolling 2-year sample. This constraint ties each robot to a specific investor in our sample via their trading history. Note that the robo-investors use all the stocks/ETFs individual investor i held in the past two years, but may have sold in the meantime.
Hence, the rationale is that the investor knows about the stocks/ETFs held by the shadow robo-investor. We call these shadow robo-investors Artificial Intelligence Alter Egos, or AI Alter Egos.

The notion of AI Alter Egos is not unique to finance, although we might be the first to coin the term. To illustrate, let’s look at machine learning (ML) advances in other fields, such as literature and music. Today, an ML text mining algorithm can analyze the writings of a famous author and create entirely new literature in the style of the writer it was exposed to and trained on. The same can be done with music. For example, Franz Schubert started his Unfinished Symphony in B minor in 1822, but wrote only two complete movements, though he lived another six years. Now, deep-learning ML has produced a completed version of the entire symphony. We can characterize this as Schubert’s AI Alter Ego composing a new score. Would Schubert have done better than his AI Alter Ego? We prefer to leave that debate to the musicologists, but it’s fair to say it would probably be hard to address the question. Fortunately, it’s much easier to apply the notion of AI Alter Egos in a setting where comparing the outcomes of human and AI alternatives is more straightforward, such as in financial investments.

We consider three investment strategies. Two are based on a Markowitz (1952) mean-variance (MV) scheme and a third is based on DeMiguel, Garlappi, and Uppal (2007), involving the $1/N_{it}$ scheme, where $N_{it}$ is the number of stocks held by investor $i$ over a 2-year trailing sample up to time $t$. The two MV strategies differ in terms of the sophistication regarding the conditional mean and variance estimates. The first involves two-year rolling sample estimates for both the mean and variance. For the second we rev up the robot engines and replace the rolling sample estimators by, respectively, expected return predictions using machine learning algorithms and sophisticated conditional covariance estimators.
More specifically, for the conditional mean we use Elastic-Net, Random Forest, Neural Network, and model ensemble estimators. For the conditional covariance matrix, looking at a total of 683 stocks and 393 ETFs, we use the Engle, Ledoit, and Wolf (2019) nonlinear shrinkage method derived from random matrix theory to correct in-sample biases of sample eigenvalues. Finally, it is important to note that robo-investors have the option to hold cash, i.e. decide to avoid market risk exposure. No short selling is allowed, however.

We study three rebalancing schemes: once a year, quarterly and monthly. In the main body of the paper we focus exclusively on the quarterly rebalancing scheme. Note that robo-investors buy and hold at fixed sampling frequencies (end of quarter in the lead example). This is in contrast to the individual investors in our sample, who execute their trades at any point in time.

Footnote: The majority of trading occurs in either equity or ETFs as described in detail in the Appendix, see D’Hondt, De Winne, Ghysels, and Raymond (2019).

Footnote: Since the robo-investor schemes go beyond machine learning, as they involve portfolio allocation rules, we use the more general term of artificial intelligence. In our case the AI pertains to a set of computer-driven self-learning rules which determine portfolio allocations.

In contrast, the machine learning MV AI Alter Egos result in significant investment portfolio performance improvements for certain types of investors. In particular, those featuring high risk aversion benefit greatly from following the robo-investor strategies. Low income (low education) investors typically also gain from the AI advice. These results confirm the claims made by practitioners in the industry regarding the promise the use of AI holds for the future of the FinTech industry. More intriguing, and somewhat unexpected, are our results pertaining to the performance during the financial crisis. Robo-investors outperform a large swath of investors.
In fact, the median robo-investor moves into cash (because of negative expected returns using AI) whereas individuals feature behavioral biases, such as the disposition effect (cf. Odean (1998)), with unfortunate consequences during the onset of the financial crisis.

As a by-product of our analysis, we also identify which machine learning methods perform well. While deep learning is often the best across a large cross-section of stocks, a close second-best is a much simpler linear prediction model with elastic net penalty based on the same set of predictors, namely those suggested by Welch and Goyal (2007), which consist of a mixture of firm-specific and macroeconomic covariates. Put differently, the gains from non-linear models are marginal at best.

The paper is organized as follows. In section 2 we describe the brokerage data, with some of the details appearing in the Appendix. Section 3 describes the various robo-investor schemes. Section 4 reports the empirical results. Section 5 concludes the paper.

Our primary data set comes from a large Belgian online brokerage firm and consists of the trading accounts of 22,972 individual investors. This unique data spans about 10 years from January 2003 to March 2012, and therefore includes the 2008 financial crisis. We have detailed information about each trade, such as the instrument, the time-stamp, the trade direction, the executed quantity, the trade price, and explicit transaction costs. The details of the data are described in the Appendix of D’Hondt, De Winne, Ghysels, and Raymond (2019). We focus on common stock investments as well as ETFs and exclude other financial instruments. Trading of ETFs, mutual funds, options and warrants is more prevalent with high income/education investors. Trading of bonds is overall insignificant. Because we examine robo-advisors which are mean-variance investors, we focus exclusively on stocks and ETFs, which best fit the portfolio allocation model. For high income/education investors in particular this means we leave out, to a certain degree, other assets which we have available. After applying some filters described in the Appendix, we end up with a sample of 1,590,199 (stocks) + 60,344 (ETFs) = 1,650,543 trades, with more than 13 billion euros traded in stocks.

Footnote: Note that among the existing robo-investor practices there are a number which proclaim using MV allocations and most likely use some type of rolling sample scheme, although most white papers are rather vague on the actual implementation.

Footnote: For further details see D’Hondt, De Winne, Ghysels, and Raymond (2019).

Footnote: In the Appendix we document that 6,741 investors also traded options and warrants with an aggregate number of 602,833 trades and 6,665 investors traded mutual funds with an aggregate number of 260,120 trades. Only a few investors (i.e. 1,813) traded bonds, with an aggregate number of 5,999 trades.

The average investor is about 48 years old and executes monthly 2.76 trades across 2.05 different stocks for a volume of 18,237 euros.
Consistent with the literature, investors in our sample are under-diversified; the average (median) investor holds a five-stock (three-stock) portfolio. The average end-of-month portfolio value is about 28,003 euros (with a median value of about 7,552 euros). As for risk aversion, the majority of investors seem to be risk tolerant since 65.33% of them declare a medium risk aversion and 27.88% of them even a low risk aversion.

In terms of performance, our investors earn an average monthly gross return of 0.42% on stocks and ETFs (median return of 0.13%), with a volatility of 10.04%. This high average volatility of individual portfolio gross returns is not surprising given our sample period includes turbulent market conditions.

Footnote: The income measure reported in our data is recorded once, when the investor completed the MiFID tests. The classification may therefore be noisy over the 10 year sample period, particularly for the early entries.

Footnote: As detailed in the Appendix of D’Hondt, De Winne, Ghysels, and Raymond (2019), to calculate portfolio returns, we opt for an approximation of the Modified Dietz Method, aiming at delivering a return close to the money-weighted rate of return (e.g., Shestopaloff and Shestopaloff (2007)).

Robo-Investors

The robo-investors are limited to the set of stocks and ETFs in each individual investor’s history of trading, using a rolling 2-year sample. This constraint ties each robot to a specific investor in our sample via their trading history. We call these shadow robo-investors Artificial Intelligence Alter Egos. Robo-investors have the option to hold cash, i.e. decide to avoid market risk exposure, but no short-selling occurs in our sample nor is it allowed for in the design of the robots. The two-year window is arguably somewhat arbitrary. Our results hold for longer windows. Shorter windows are less appealing given the trading frequency of many investors, with only a median of 2 trades per month. The portfolio allocations of robo-investors occur at fixed intervals, either monthly, quarterly or annually. In the main body of the paper we focus exclusively on the quarterly results.

We construct three types of AI Alter Ego robo-investors. As we already noted, each setting only uses stocks and ETFs held by an individual investor over the past two years, not the entire universe of stocks. The table below provides two illustrative examples. When we refer to t, we mean end of the year, or quarter or month, depending on the case being considered.

Initial holdings | Trading t - 1 | Trading t | Investor holdings | Robo-investor potential holdings
Stocks 1 & 2 | Sells all of 2 | Buys stock 3 | Stocks 1 & 3 | Stocks 1, 2 & 3
Stock 1 | Sells all of 1 | Buys ETF 4 | ETF 4 | Stock 1 & ETF 4

The first line portrays an investor holding two stocks, say 1 and 2, at time t - 2 (column called Initial holdings). At the end of t - 1, the investor sells all holdings of stock 2 and at the end of the subsequent period t buys stock 3. Hence, at the end of t she/he holds stocks 1 and 3. The robo-investor has stocks 1, 2 and 3 to form a portfolio. The second case is similar, but the investor only holds stock 1, sells all of it in t - 1 and buys ETF 4 in t. The robo-investor has two assets to select from. It is important to stress that the robo-investor may hold cash, i.e. decide not to put all the money in the stock market. This will be important, as will become clear when discussing the empirical results.

To proceed, we need to introduce some notation. Let $S_{it}$ be the set of stocks/ETFs investor $i$ held over a two-year period up to time $t$. The above illustrative examples clarified that this does not mean that the investor holds these stocks/ETFs at the end of year/quarter/month $t$. It only means that the investor held these stocks/ETFs in the recent two-year history. We denote by $T_i$ the duration of time (months/quarters/years, whichever applies) investor $i$ appears in the sample. Moreover, we denote by $N_{it} = |S_{it}|$ the number of stocks/ETFs in the set. We only consider investors for whom $N_{it}$ meets a minimum size for all $t = 1, \dots, T_i$. This ensures that the investment opportunity set contains a minimally sufficient set of stocks/ETFs for the robo-investors. This leaves us with 20,622 investors who satisfy this criterion and are included in our analysis.

Footnote: In the Online Appendix of D’Hondt, De Winne, Ghysels, and Raymond (2019), the monthly and detailed quarterly results are reported. The annual results are available on request. In the computations of returns we ignore transaction costs. Since our focus is quarterly trading frequencies this is a reasonable abstraction. The monthly robo-investor results are arguably more suspect of being overstated because transaction costs are not accounted for.

The robo-investors buy at the end of $t$ and hold until the end of $t+1$, i.e. for a month, quarter or full year. We then compute holding period returns for the robo-investor, $r^{ae}_{i,t+1}$, and compute the alter-ego-less-investor’s realized return spread as $r^{s}_{i,t} = r^{ae}_{i,t} - r_{i,t}$.

The first type shadows each individual investor in our sample using the DeMiguel, Garlappi, and Uppal (2007) equal weighting rule; the second and third rely on a mean-variance Markowitz (1952) strategy with a short-sale constraint. The difference between the second and third variations is the sophistication of the expected return and risk estimators. In the second approach a simple rolling sample estimator is involved for expected returns and the linear shrinkage estimator of Ledoit and Wolf (2004) for second conditional moments. In the third case, machine learning and conditional covariance estimators are used. More specifically, for the conditional mean we use Elastic-Net, Random Forest, Neural Network, and model ensemble estimators. For the conditional covariance matrix we use the Engle, Ledoit, and Wolf (2019) nonlinear shrinkage method derived from random matrix theory to correct in-sample biases of sample eigenvalues.
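The construction of the opportunity set $S_{it}$ and of the return spread can be illustrated with a minimal Python sketch. The function names, the dictionary layout of the holdings, and the asset labels are hypothetical choices made for illustration, not taken from the paper; the example data reproduce the two rows of the illustrative table above.

```python
def opportunity_set(holdings_by_period, t, window):
    """Union of all assets held in periods t - window, ..., t.

    holdings_by_period maps a period index to the set of assets held at the
    end of that period (hypothetical data layout).
    """
    assets = set()
    for period in range(t - window, t + 1):
        assets |= holdings_by_period.get(period, set())
    return assets


def return_spread(alter_ego_return, investor_return):
    """Alter-ego-less-investor spread: r^s = r^ae - r."""
    return alter_ego_return - investor_return


# First table row: holds stocks 1 and 2, sells 2 at t - 1, buys 3 at t.
row_one = {0: {"stock1", "stock2"}, 1: {"stock1"}, 2: {"stock1", "stock3"}}
# Second table row: holds stock 1, sells it at t - 1, buys ETF 4 at t.
row_two = {0: {"stock1"}, 1: set(), 2: {"etf4"}}

robo_set_one = opportunity_set(row_one, t=2, window=2)
robo_set_two = opportunity_set(row_two, t=2, window=2)
```

The robo-investor's candidate set is larger than the investor's current holdings whenever an asset was sold inside the window, which is exactly the situation in both table rows.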

Equal Weights

We endow the robo-investor with a DeMiguel, Garlappi, and Uppal (2007) $1/N_{i,t}$ strategy. In particular, for each individual $i$, the Alter Ego buys and holds at time $t$ all the stocks in the set $S_{it}$ with equal allocations $1/N_{it}$. Henceforth we will refer to this as the EW portfolio rule.
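As a small illustration of the EW rule, the sketch below assigns weight $1/N_{it}$ to every asset in a given opportunity set and computes the buy-and-hold return over one holding period. The function names and the example returns are hypothetical and only serve the illustration.

```python
def equal_weights(assets):
    """Return the 1/N allocation over a non-empty collection of assets."""
    n = len(assets)
    if n == 0:
        raise ValueError("the opportunity set must contain at least one asset")
    return {asset: 1.0 / n for asset in sorted(assets)}


def buy_and_hold_return(weights, holding_period_returns):
    """Portfolio return over one holding period for weights set at the start."""
    return sum(w * holding_period_returns[asset] for asset, w in weights.items())


weights = equal_weights({"stock1", "stock2", "stock3"})
# Made-up holding-period returns, for illustration only.
ew_return = buy_and_hold_return(
    weights, {"stock1": 0.03, "stock2": -0.06, "stock3": 0.12}
)
```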

Rolling Sample Markowitz

The mean-variance optimal portfolio is constructed as the maximum Sharpe ratio subject to the short-sale constraint and the individual’s investment opportunity set. Investor $i$ selects from the set $S_{it}$ of stocks. Critical to the optimal portfolios are estimates of conditional expected returns ($\mu_{it}$) and the conditional covariance matrix of returns ($\Sigma_{it}$) for the stocks in the set $S_{it}$. The robo-investor solves for $\hat{w}_{i,t}$, selecting among these stocks according to:

$$\max_{w_{i,t}} \; w_{i,t}' \mu_{it} - \gamma \, w_{i,t}' \Sigma_{it} w_{i,t} \quad \text{subject to} \quad w_{i,t} \geq 0,$$

where $\gamma$ is often interpreted as a risk aversion parameter, which we set equal to one as it maximizes the Sharpe ratio. We estimate $\mu_{it}$ with two-year rolling-window historical averages, $\hat{\mu}_{it} = \frac{1}{k} \sum_{j=0}^{k-1} r^{d}_{t-j}$, where $r^{d}_{t}$ is an $N_{it} \times 1$ vector of daily returns and $k$ is the number of days in the two-year historical sample. For covariance, we also use rolling sample estimates, combined with the linear shrinkage estimator of Ledoit and Wolf (2004) mentioned above.

Footnote: In between rebalancing periods, the portfolio weights adjust according to the performance of an individual asset relative to the performance of the portfolio as a whole. In particular, when $t+1$ is not a rebalancing period, $w_{i,t+1} = w_{i,t}(1 + r_{i,t+1}) / [\sum_{i=1}^{N} w_{i,t}(1 + r_{i,t+1})]$.

Machine learning and Shrinkage

Continuing with the Markowitz allocation scheme, we explore whether increasing the complexity of the rolling sample estimators translates into improved robo-investor performance. We assume that each investor’s Alter Ego robo-investor has access to a common set of models that replace the rolling sample schemes. For expected return predictions we use machine learning algorithms applied to each of the 1076 assets (683 stocks and 393 ETFs) and the Alter Ego robo-investor picks the predictions pertaining to the stocks in the set $S_{it}$. More specifically, for the conditional mean estimates we use Elastic-Net (Zou and Hastie (2005)), Random Forest (Breiman (2001)), Neural Network (Friedman, Hastie, and Tibshirani, 2016, Chap. 11), and model ensemble estimators (Friedman, Hastie, and Tibshirani, 2016, Chap. 16). For the conditional covariance matrix, looking at a total of 1076 assets, we use the Engle, Ledoit, and Wolf (2019) nonlinear shrinkage method derived from random matrix theory to correct in-sample biases of sample eigenvalues.
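Both Markowitz-type robo-investors solve the same long-only allocation problem and differ only in the estimates of $\mu_{it}$ and $\Sigma_{it}$ they plug in. The Python sketch below illustrates that allocation step under stated assumptions: the objective is taken as $w'\mu - \gamma w'\Sigma w$, cash is modelled as a zero-return residual so that the weights may sum to less than one, and the solver is a plain projected gradient ascent chosen here for self-containedness. The paper does not state which solver the authors used, and all function names are hypothetical.

```python
def project_long_only_with_cash(v):
    """Euclidean projection onto {w >= 0, sum(w) <= 1}."""
    clipped = [max(x, 0.0) for x in v]
    if sum(clipped) <= 1.0:
        return clipped
    # Budget binds: project onto the simplex {w >= 0, sum(w) = 1}.
    ordered = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, value in enumerate(ordered, start=1):
        cumulative += value
        candidate = (cumulative - 1.0) / j
        if value - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


def mean_variance_weights(mu, sigma, gamma=1.0, iterations=5000):
    """Maximize w'mu - gamma * w'Sigma w over long-only weights with cash."""
    n = len(mu)
    # Upper bound on the gradient's Lipschitz constant (max absolute row sum).
    lipschitz = 2.0 * gamma * max(sum(abs(x) for x in row) for row in sigma)
    step = 1.0 / lipschitz if lipschitz > 0 else 1.0
    w = [0.0] * n
    for _ in range(iterations):
        gradient = [
            mu[a] - 2.0 * gamma * sum(sigma[a][b] * w[b] for b in range(n))
            for a in range(n)
        ]
        w = project_long_only_with_cash(
            [w[a] + step * gradient[a] for a in range(n)]
        )
    return w


# Made-up inputs: one asset with positive and one with negative expected return.
mixed = mean_variance_weights([0.02, -0.01], [[0.04, 0.0], [0.0, 0.09]])
# All expected returns negative: the sketch allocates everything to cash.
all_cash = mean_variance_weights([-0.01, -0.02], [[0.04, 0.0], [0.0, 0.09]])
```

In the second example every weight is zero, which mirrors the mechanism described in the introduction: negative expected return predictions push the robo-investor into cash.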

What we have in mind is a situation where the robo-investors rely on a modeling department within the brokerage house to provide them with estimates of conditional means and conditional covariances for the entire universe of stocks/ETFs, supplying the Alter Ego investor associated with each individual investor with the estimates $\mu_{it}$ and $\Sigma_{it}$ for the stocks in the set $S_{it}$. The modelers estimate a wide class of models and use out-of-sample performance metrics to determine the most appropriate panel of conditional means and conditional covariances to supply to the robo-investors. Our goal here is to provide a simple approximation to the comprehensive conditional modeling process that such a brokerage research group would undertake. In terms of expected returns models our analysis shares some of the methods also considered by Gu, Kelly, and Xiu (2018).

For the purpose of our analysis, let $r_{i,t-} = (r_{i,t}, \dots, r_{i,t-k+1})'$ be the $k \times 1$ vector of own-lagged stock returns for stock $i$. We have $N = 1076$ stocks/ETFs to consider and $T = 110$ monthly periods. We use 70% of the data for training, 20% of the data as a validation sample (for hyperparameter tuning), and 10% of the sample for testing out-of-sample performance. To maximize the use of our unique data set, we start building our models using returns data from January 1993 to December 2002, namely a 10-year sample prior to the start of our individual investor data. We augment the panel of monthly stock/ETF returns with the five Fama-French monthly factors (Mkt, SMB, HML, RMW, CMA) as well as their momentum factor (see Ken French website for definitions), and Welch and Goyal (2007) predictors: div. price ratio, div. yield, earnings price ratio, div. payout ratio, stock variance, BM DJ stocks, net equity expansion, TBill, long-term yield, term spread, default yield spread, inflation (see their paper for definitions).
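The 70/20/10 division into training, validation and test samples can be sketched as follows. This minimal Python example assumes the split is chronological (earliest observations for training, latest for testing), which is the usual choice for return forecasting but is not spelled out in the text; the function name and the integer-percentage interface are hypothetical.

```python
def chronological_split(n_periods, train_pct=70, validation_pct=20):
    """Split period indices 0..n_periods-1 into train, validation and test blocks.

    Percentages are integers so that block sizes come from exact integer
    arithmetic; whatever remains after the first two blocks is the test block.
    """
    n_train = n_periods * train_pct // 100
    n_validation = n_periods * validation_pct // 100
    periods = list(range(n_periods))
    train = periods[:n_train]
    validation = periods[n_train:n_train + n_validation]
    test = periods[n_train + n_validation:]
    return train, validation, test


# A 10-year monthly sample, e.g. January 1993 to December 2002, has 120 periods.
train, validation, test = chronological_split(120)
```

For a 120-month sample this yields 84 training months, 24 validation months and 12 test months, with no overlap between the blocks.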
We understand that a true data engineering group would likely create a much larger and more robust set of data sources. Our goal is not to replicate the true data-source generating process, but to provide a simple approximation to the set of all useful signals for prediction. Let $x_t$ represent an $M \times 1$ vector of these predictors.

In each model class we estimate individual models for each stock/ETF separately, rather than pooling across stocks/ETFs, in order to allow as much heterogeneity as possible in model parameter estimates. The common modeling objective is to estimate $E_t[r_{i,t+1}]$, where $E_t[r_{i,t+1}] = f_i(z_{i,t})$. The modelers therefore employ different approaches to estimate $f_i(\cdot)$, and also work to curate the best possible set of covariates $z_{i,t}$.

We separate our conditional mean models into (1) linear and (2) nonlinear model sets. Within the linear models, we consider OLS and elastic-net models. For nonlinear models we consider random forests of regression trees (nonparametric) and shallow feed-forward artificial neural networks (parametric), two popular machine learning models designed to introduce nonlinear interactions between covariates. Finally we consider a simple model ensemble across all models.

Linear Models

The linear models we estimate for each stock $i$ across time periods $t = k, \dots, T-1$ are of the form:

$$r_{i,t+1} = \beta_{i,0} + \beta_{i,r} r_{i,t} + \beta_{i,x} x_t + \epsilon_{i,t+1} \qquad (1)$$

In addition to estimating this model with OLS, we fit sets of linear models per stock $i$ using Elastic Net, involving two tuning parameters $(\alpha, \lambda)$ that we optimize over the validation sample:

$$L_i(\theta) = \frac{1}{T-k} \sum_{t=1}^{T} \epsilon_{i,t+1}^{2} + \alpha \lambda \sum_{m \in \beta} |\beta_m| + \frac{1}{2} (1 - \alpha) \lambda \sum_{m \in \beta} \beta_m^{2} \qquad (2)$$

where $\beta = (\beta_{i,0} \; \beta_{i,r} \; \beta_{i,x})'$.

Nonlinear Models

We consider two popular nonparametric and parametric machine learning models designed to introduce nonlinear interactions between covariates: random forests of regression trees (nonparametric) and artificial neural networks (parametric). We employ the algorithm of Breiman (2001) to estimate random forest models and we use stochastic gradient descent to minimize an $\ell_2$ objective function with regularization terms in order to train the neural networks. In both cases our estimation techniques are standard. Again we estimate the model on the training data and optimize all respective tuning parameters on the validation set.

Random Forest

A random forest is a combination of individual regression trees. It is a bootstrapping method that seeks both to avoid overfitting and to decrease correlation among trees by using random subsets of predictors at each branch of a given tree. Each tree can be classified as having $K$ terminal nodes (called “leaves”) with a depth of $L$. The prediction of a given tree can then be stated as:

$$h(z_{i,t}; \beta, K, L) = \sum_{k=1}^{K} \beta_k \mathbf{1}\{z_{i,t} \in P_k(L)\} \qquad (3)$$

where $P_k(L)$ is the $k$-th partition that has at most $L$ different branches that it considers. A set of branches for a given partition can be represented as a product of indicators for sequential branches. For a given partition, $\hat{\beta}_k$ is the average of the returns for all members of that given partition. A standard greedy search algorithm is used to maximize the information gained at each split. The recursive binary splitting algorithm continues until a set of stopping criteria is met, which typically rely on the maximal additional information gained from a split being less than a threshold, or a max number of leaves and/or depth of a tree being reached.

For the random forest models, the key tuning parameters are the number of bootstrapped trees, the depth of each tree, and the random subset of predictors that are considered at each potential split within a tree. The random forest prediction is then the bootstrapped average at any prediction point across trees.

Neural Network

Our neural network architecture is two hidden layers with 10 neurons per layer, sigmoid transfer functions in the input and hidden layers, and a linear transfer function in the output layer. We use stochastic gradient descent to minimize an $\ell_2$ objective function with regularization terms in order to train the neural networks. Again we estimate the model on the training data and optimize all respective tuning parameters on the validation set.

Finally we consider a model ensemble of the above linear and nonlinear estimators, restricting ourselves to an equal-weighting scheme across predicted expected returns so as to limit introducing additional estimation uncertainty.

Figure 1 displays a set of 10 bar plot clusters. Each displays end-of-year (last quarter) snapshots of forecasting performance. Each of the 10 rolling samples displayed pertains to a 10-year sample of return data used to estimate, validate and forecast returns. For each of the 10 rolling samples the relative performance of the competing models (only looking at equities) is displayed. The out-of-sample performance is measured in terms of MSE, and the height of each bar represents the percentage of stocks in the sample for which a particular model has the lowest MSE in predicting the cross-section of returns. For each cluster the heights of the bars add up to 100%, and each bar represents the fraction of the 683 stocks in the cross-section for which a particular class of models provides the best return prediction. We note that neural network models represent the most successful class of models, typically being the best for between 40 and 50 percent of the assets in the cross-section. Often a close second is the class of Elastic Net models. All other methods are less successful, although there is quite some variation across time.

The results displayed in Figure 1 may leave the impression that neural network models are dominant.
Let us turn our attention to Table 1, which sheds perhaps a different light on this result. Table 1 reports the average MSE and MAE of out-of-sample forecasts across all assets and rolling sample schemes. It shows that the elastic-net and neural net models deliver the lowest out-of-sample MSE when aggregating performance across stocks/ETFs. However, the differences between EN and NN are very small, indicating that while NN perhaps provides the best predictions, EN is typically a close second and arguably much easier to implement. Moreover, the EN is a linear model, whereas the NN is nonlinear. The presence of nonlinearities does not seem to substantially pay off.

All the models/estimators have dimensions along which they could be refined, but ultimately the modeling group delivers a set of conditional mean estimates by stock/ETF to the robo-investors. Each of these chosen conditional mean estimates comes from the model with the lowest out-of-sample MSE. A common model need not be chosen across stocks/ETFs, and indeed we can see that even in a few cases, OLS with all covariates included is the model with the best out-of-sample performance. The final panel of $(\hat{E}_t[r_{i,t+1}])_{i,t}$ is used in the robo-investors’ optimal portfolio problems.

The empirical results focus on answering a number of questions: (a) are more sophisticated models better, (b) who gains from robo-advice, (c) how does robo-investing perform during a major financial crisis, (d) how do AI Alter Egos compare to passive investment schemes and (e) are spreads due to behavioral biases? A subsection is devoted to answering each of these questions. In the main body of the paper we report a summary set of results pertaining to quarterly rebalancing. In the Online Appendix (D’Hondt, De Winne, Ghysels, and Raymond (2019)), the monthly and detailed quarterly results are reported.

In Table 2 we report for all investors in our sample the median, first (Q1) and third quartiles (Q3) of the cross-sectional distribution of return spreads $r^{s}_{i,t} = r^{ae}_{i,t} - r_{i,t}$, considering only equity holdings (left panel) or the entire universe of 683 stocks and 393 ETFs (right panel). The AI Alter Ego schemes are: (a) MV with Rolling Mean/Rolling Variance, (b) Machine Learning (ML) Mean/Rolling Variance, using the methods displayed in Table 1, (c) ML Mean/Nonlinear Smoothed Variance, and finally (d) the equally weighted (EW) portfolio scheme. Neither rolling sample mean nor equally weighted portfolios have positive median spreads. Hence, under these two schemes the median shadow robo-investor performs worse than the humans. The highest median spread is obtained from the ML Mean/Rolling Variance, namely 2.93% per year (equities only) and 3.37% for the universe of stocks and ETFs. Using the nonlinear smoothing approach to covariance estimation slightly reduces the median return, by 15 basis points, or even 41 basis points when ETFs are included. While there is a large cross-sectional heterogeneity, judging by the inter-quartile range, we also observe a right shift in the entire distribution. The first quartiles for MV Rolling Mean and EW are 3 percent lower, whereas Q3 is 5 percent lower compared to either type of MV ML. All the results reported so far pertain to quarterly portfolio rebalancing. In the Online Appendix, we provide detailed evidence showing that the findings extend to monthly rebalancing. Annual rebalancing yields qualitatively the same findings as well.

Overall, the results clearly show that the ML expected return scheme is superior to either of the two relatively naive and simple robo-investing schemes. Hence, the answer is clearly that more sophisticated models are better. In the remainder of this section we will therefore focus exclusively on the MV ML/Rolling Variance robo-investor AI Alter Egos.

Who gains from robo-advice?
Continuing with quarterly rebalancing and MV ML/Rolling Variance robo-investors, in Table 4 we report the median, Q1, Q3 as well as the confidence interval for the median of the cross-sectional distributions of the spreads between AI Alter Ego and individual investor returns, considering the entire universe of 683 stocks and 393 ETFs. Summary statistics are computed for separate samples with low/high education, low/high risk aversion and low/high income classification for investors.

Let us start with high and low risk aversion, which is the panel in the middle of the table. High risk averse median individual investors stand to gain 5.14 percent from robo-investor shadow Alter Egos. Their low risk aversion counterparts only gain 3.29 percent. Both clearly benefit, since the confidence intervals for either type of investor indicate that the median spreads are significantly different from zero. In addition, the 95% confidence interval for the difference in medians is [0.5546, 3.1111], and therefore excludes zero. Hence, the median high risk averse investor gains statistically significantly more from robo-investing than the median low risk averse investor does. A similar pattern emerges for high/low income, with the median low income investor gaining roughly one-half more (4.13 percent versus 2.76 percent) than the high income median investor. Low and high education differences are not as pronounced, with a wedge of 63 basis points. The inference indicates, however, that the high/low differences in median spreads for income and education are not statistically significant. Even so, a gap between median spreads of 2.76 (high income) and 4.13 (low income) percent per year is economically quite substantial.

In Table 5 we report the Alter Ego return spreads for stocks/ETFs as they relate to the financial crisis and Great Recession. The subsamples are benchmarked using the NBER chronology identifying the crisis period as 12/2007 - 6/2009. The focus is again on the MV ML/Rolling Variance AI Alter Ego scheme.
For each of the subsamples we compute the median, Q1 and Q3 realized returns along with the same statistics for the AI Alter Ego returns. Note that, since the median of a spread is not the difference in median returns, we are not inferring something directly related to the spreads reported in prior tables. We focus on the returns instead in order to highlight a very important finding. Prior to the crisis we note that the median investor had an annual return of 9.26%, more than double the return of the median AI Alter Ego (4.17%). We also note though that the inter-quartile spread for investors is twice as large as the same statistic for robo-investors using ML. For individual investors the Q1-Q3 range spans from -4.70 to 20.98 percent, whereas the AI Alter Egos feature a better Q1 of minus two percent, and a lower Q3 of almost eleven percent.

During the crisis things take a dramatic turn. The median robo-investor has zero return - meaning the median AI Alter Ego holds cash. In contrast, for individual investors the median is a 29 percent loss and even the Q3 investor still has a -3.59% annual return. Compare that with a 23.81 percent return for Q3 of the AI Alter Egos.

After the crisis, things revert to the pattern observed prior to the crisis - namely the median investor does better than the median AI Alter Ego, with again a much wider inter-quartile range for individual investors, even more than double the dispersion among robo-investors.

A more striking picture emerges when we turn our attention to Figure 2. The five lines correspond to (a) realized return of median investor, (b) median AI Alter Ego returns using MV Rolling Mean/Rolling Variance scheme, (c) median AI Alter Ego returns using MV ML/Rolling Variance scheme, (d) median AI Alter Ego returns using MV ML/Nonlinear Variance and finally (e) the median EW robo-investor. One word of caution: these medians do not represent the same investor or AI Alter Ego through time, so this is not the performance of a specific individual or robot. Each line starts out with one unit of investment at the beginning of the sample and the median returns are compounded subsequently. Prior to the crisis, the median investor reaches roughly 2.3. This means that the initial capital is more than doubled over a five year span from 2002 until 2007. By the time the devastation of the crisis took its toll, the median investor is under water by 20 percent and finally ends up with a meager 20 percent return over a 10-year period. It is remarkable that even the EW robo-investor, whom we know from prior analysis is neither sophisticated nor particularly successful, achieves a higher return at the end of the sample.

Footnote 11: To construct confidence intervals for aggregate summary statistics we do the following. We first randomly sample individuals according to an individual bootstrap method whereby each investor is assumed independent of each other investor, and sample the entire time-series path of each investor to maintain the dependence structure. For each bootstrap repetition we compute the relevant statistic per individual and aggregate the per-individual statistics over all of the sampled investors. Let $\{\tilde{\theta}_r\}_{r=1}^{R}$ be the constructed statistic over $R$ bootstrap repetitions, and let $\hat{\theta}$ be the point estimate of interest. Let $\tilde{\theta}_{(\alpha/2)}$ and $\tilde{\theta}_{(1-\alpha/2)}$ represent the $\alpha/2$ and $1-\alpha/2$ percentiles of the bootstrap statistic. We then construct pivotal $1-\alpha$ confidence intervals according to $[2\hat{\theta} - \tilde{\theta}_{(1-\alpha/2)},\; 2\hat{\theta} - \tilde{\theta}_{(\alpha/2)}]$.

Footnote on the crisis chronology: We also examined the more specifically targeted Belgian crisis dates related to the severe difficulties of the country's financial sector. The results are broadly speaking similar and not reported here.
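The investor-level bootstrap and pivotal interval described in the footnote on confidence intervals can be sketched as below. This is an illustration under stated assumptions rather than the authors' implementation: the per-investor statistic is taken to be the time average of that investor's spreads, the aggregate is the cross-sectional median, and the names `percentile` and `pivotal_ci` as well as the data are hypothetical.

```python
import random
from statistics import mean, median


def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 1]) of a sorted list."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def pivotal_ci(paths, alpha=0.05, reps=2000, seed=0):
    """Pivotal bootstrap CI for the median of per-investor average spreads.

    paths: one list of spreads per investor. Investors are resampled with
    replacement and each draw keeps the investor's entire path. Because the
    per-investor statistic depends only on that path, resampling paths is
    equivalent to resampling the per-investor statistics.
    """
    rng = random.Random(seed)
    per_investor = [mean(p) for p in paths]   # assumed per-investor statistic
    theta_hat = median(per_investor)          # point estimate
    draws = sorted(
        median(rng.choices(per_investor, k=len(per_investor)))
        for _ in range(reps)
    )
    lower = 2 * theta_hat - percentile(draws, 1 - alpha / 2)
    upper = 2 * theta_hat - percentile(draws, alpha / 2)
    return theta_hat, (lower, upper)


# Made-up spread paths for five investors (assumed data).
example_paths = [[1, 3], [2, 2], [0, 6], [5, 5], [4, 0]]
theta, (ci_low, ci_high) = pivotal_ci(example_paths, reps=500, seed=1)
print(theta, ci_low, ci_high)
```

Note that the pivotal interval reflects the bootstrap percentiles around the point estimate, so the upper bootstrap percentile enters the lower bound and vice versa.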
The best overall performance is obtained from the MV ML/Rolling Variance median robo-investor (again not always shadowing the same investor across time) with a 60 percent overall return. This median robo-investor has a relatively slow start and under-performs prior to the crisis, but features small losses during the tumultuous market conditions. Note also that the MV ML/Nonlinear Variance AI Alter Ego is almost identical to the ML/Rolling Variance scheme. Finally, the MV Rolling Mean/Rolling Variance scheme tracks the ML performance very closely until the financial crisis.

To shed further light on this we turn our attention to Table 3 displaying the ranking of the regressors based on their ℓ contribution across stocks for the Elastic Net regressions defined in equations (1) - (2). We focus on the EN regressions as they provide a fairly simple regression-based interpretation. In addition, EN is often the best or nearly the best prediction model. The ranks are computed for 10-year rolling samples starting with 93-03 and ending with 02-12. Of particular interest is the crisis period spanning the 96-06 through 99-09 samples. The top ranked predictor in all but the last of these rolling samples is dfy, namely the default yield spread. Another top-ranked series during the crisis is lty, or the long term yield. Looking across all samples we also see svar (stock variance), ntis (net equity expansion) and infl (inflation). Interestingly, the usual Fama-French regressors rarely appear among the top-ranked regressors. This should perhaps not come as a surprise, since the Fama-French factors are meant to price the cross-section of returns.

Finally, in Figure 3 we provide a time series plot of the fraction with negative expected returns among the cross-section of stocks, according to the best machine learning model. Early in the sample we see that typically between 20 and 30% of the stocks featured negative expected returns. The fraction shoots up above 50% in 2008 and goes as high as 60%.
As a result, the majority of stocks featured negative expected returns, which explains why the AI Alter Egos have a propensity to move out of the market.

How do AI Alter Egos measure up against passive investment strategies, in particular buying and holding a market-wide ETF? To address this question we turn our attention to Table 6. We report summary statistics for spreads with respect to two ETFs. One tracks the S&P 500 index and the other is the iShares MSCI Belgium ETF. Neither is ideal, but we did not find an index available throughout the entire sample period that mimics the basket of stocks held by the investors in the brokerage data set. Unfortunately, the results reported in Table 6 depend on which ETF is selected. The left panel displays the realized stocks+ETFs returns minus the benchmark ETF returns, for either the S&P 500 or the Belgian ETF, and the right panel displays the AI Alter Ego MV/ML/Nonlinear returns against the same benchmarks. Each panel contains the median, first and third quartile of the spreads. The full sample results appear in the top part of Table 6. Subsamples stratified according to NBER crisis dates appear in the lower part. The median investor has a spread of -8.50% against the S&P 500, meaning the median investor vastly under-performs the benchmark. For the Belgian ETF the results are not as dramatic, since the median investor does better with a positive spread of 1.37%. There is wide cross-sectional variation, although the third quartile for the US market index is only 2% (while 12% for the Belgian index). The AI Alter Ego spreads are better in both cases, although the US benchmark still yields a negative spread of -6.18%. Against the Belgian ETF, the AI Alter Ego has a positive median spread of almost 4 percent.

When we look at the pre-crisis sample we note that the median investor and AI Alter Ego have returns below the two benchmarks, more so for the Belgian ETF than its US counterpart. It is also worthwhile noting that the median AI Alter Ego performs worse than the median investor.
The crisis period is a totally different story. The median AI Alter Egos vastly outperform the benchmarks, by respectively 18.32% (SPDR) and 48.31% (Belgian ETF). Moreover, the median investor does better than the Belgian ETF by a substantial margin of 21.24% but is 8.78% below the S&P 500 ETF. In both cases we see significant improvements from the Alter Ego schemes. Post-crisis things return to the pre-crisis situation.
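The benchmark spreads summarized in Table 6 amount to subtracting the ETF return from each investor's (or Alter Ego's) return and then taking cross-sectional quartiles. A minimal sketch, assuming annualized percentage returns per investor as input and using made-up numbers; the function name `spread_quartiles` is hypothetical:

```python
from statistics import quantiles


def spread_quartiles(returns, benchmark_return):
    """Q1, median and Q3 of the cross-section of (return - benchmark) spreads."""
    spreads = [r - benchmark_return for r in returns]
    q1, med, q3 = quantiles(spreads, n=4, method="inclusive")
    return q1, med, q3


# Hypothetical annual returns (in percent) for five investors and one ETF.
investor_returns = [-10.0, 0.0, 5.0, 10.0, 20.0]
etf_return = 5.0
print(spread_quartiles(investor_returns, etf_return))
```

Because the same benchmark return is subtracted from every investor, the choice of ETF shifts the whole distribution of spreads, which is why the results depend on which ETF is selected.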

Are the sharp findings regarding the crisis related to well documented behavioral biases? In the Appendix (see D’Hondt, De Winne, Ghysels, and Raymond (2019)) we report that the investors in our sample feature the behavioral biases studied in the literature. Regarding the crisis results, we would like to focus on two key ones: (1) the disposition effect (DE) - selling winners too soon, holding on to losers too long - as in Odean (1998) for each investor and (2) trading frequency. Investors hold 26% of US stocks and 14% of Belgian stocks.

In Table 7 we document the results of cross-sectional median regressions, where the AI Alter Ego spreads from the MV ML/Rolling Variance robo-investors are a function of DE (left column) as well as DE combined with trading frequencies. We report the AI Alter Ego return spreads as well as the spreads vis-à-vis the S&P 500 ETF. The DE has a positive impact on the spreads, albeit not always statistically significant when combined with trading frequency indicator regressors. When we add dummies for the 2nd through 4th quartile of trading frequency we note that the spreads are affected in a statistically significant way, monotonically deteriorating for spreads and increasing for spreads vis-à-vis the S&P 500 ETF. This means that investors who trade a lot tend to have larger AI spreads against the ETF benchmark (in unreported results it is also the case for the Belgian ETF). Conversely, frequent traders tend to benefit less from AI Alter Egos, and DE is insignificant when combined with trading frequency. Overall, the results indicate that behavioral biases explain to a certain degree the cross-section of AI Alter Ego spreads. Particularly, spreads against a passive investment strategy increase with trading frequency and disposition effect. Additional results involving controls for investor characteristics appear in the Online Appendix. Besides adding controls, we also consider quantile regressions for 5%, 10%, 20%, 80%, 90% and 95%. Overall the findings remain, particularly for the right tail of the distribution. The DE is significant for the median and extreme right tail when looking at AI Alter Ego spreads with respect to the benchmark ETF.
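One way to write down the median regression with trading-frequency dummies is given below. The symbols are illustrative assumptions and not the paper's notation: $s_i$ is investor $i$'s spread, $DE_i$ the disposition effect measure, and $TF^{(q)}_i$ an indicator for investor $i$ belonging to trading-frequency quartile $q$. The inclusion of an intercept is also an assumption, and the controls reported in the Online Appendix are omitted.

```latex
% Conditional median specification (illustrative notation)
\operatorname{Med}\left(s_i \mid DE_i, TF_i\right)
  = \beta_0 + \beta_1 \, DE_i + \sum_{q=2}^{4} \gamma_q \, TF^{(q)}_i ,
\qquad
% estimated by least absolute deviations
(\hat{\beta}, \hat{\gamma})
  = \arg\min_{\beta, \gamma} \sum_{i=1}^{N}
    \Bigl| s_i - \beta_0 - \beta_1 \, DE_i - \sum_{q=2}^{4} \gamma_q \, TF^{(q)}_i \Bigr| .
```

Under this reading, the "DE Only" columns correspond to dropping the quartile dummies ($\gamma_q = 0$), and the first trading-frequency quartile serves as the omitted reference category.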

Artificial intelligence enhancements are increasingly shaping our daily lives. Financial decision-making is no exception to this. We introduce the notion of AI Alter Egos, machine-driven decision makers which shadow a particular individual, and apply it in the area of robo-investing using a brokerage accounts data set rich in both cross-sectional and time series features.

The purpose of our analysis is to assess the highly touted benefits of robo-advising. Through the AI Alter Ego scheme we address a number of questions: (a) are more sophisticated models better, (b) who gains from robo-advice, (c) how does robo-investing perform during a major financial crisis, (d) how do AI Alter Egos compare to passive investment schemes and (e) are spreads due to behavioral biases? Overall, we find that investors displaying certain characteristics - in particular high risk aversion and low income - stand to gain significantly. Moreover, machine learning methods provide important portfolio return improvements. AI Alter Ego spreads are related to behavioral biases - in particular the disposition effect and trading frequency. During the financial crisis, robo-investors have a greater propensity to cash out of the market, which contributes to their overall return superiority. Finally, compared to passive ETF investment, we find that the evidence is mixed, although during the financial crisis AI Alter Egos were vastly better than the passive strategy.

References

Breiman, L. (2001): “Random forests,” Machine Learning, 45, 5–32.

D’Acunto, F., N. Prabhala, and A. G. Rossi (2019): “The promises and pitfalls of robo-advising,” Review of Financial Studies, 32, 1983–2020.

DeMiguel, V., L. Garlappi, and R. Uppal (2007): “Optimal versus naive diversification: How inefficient is the 1/N portfolio strategy?,” Review of Financial Studies, 22, 1915–1953.

D’Hondt, C., R. De Winne, E. Ghysels, and S. Raymond (2019): “Artificial Intelligence Alter Egos: Who benefits from robo-investing? - Online Appendix,” Available at https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3415981.

Engle, R. F., O. Ledoit, and M. Wolf (2019): “Large dynamic covariance matrices,” Journal of Business and Economic Statistics (forthcoming).

Friedman, J., T. Hastie, and R. Tibshirani (2016): The elements of statistical learning - Second Ed. Springer.

Gu, S., B. Kelly, and D. Xiu (2018): “Empirical asset pricing via machine learning,” Discussion paper, National Bureau of Economic Research.

Ledoit, O., and M. Wolf (2004): “Honey, I shrunk the sample covariance matrix,” Journal of Portfolio Management, 30(4), 110–119.

Markowitz, H. (1952): “Portfolio selection,” Journal of Finance, 7, 77–91.

Odean, T. (1998): “Are investors reluctant to realize their losses?,” Journal of Finance, 53, 1775–1798.

Shestopaloff, Y., and A. Shestopaloff (2007): “A hierarchy of methods for calculating rates of return,” Journal of Performance Measurement, 12, 39–52.

Welch, I., and A. Goyal (2007): “A comprehensive look at the empirical performance of equity premium prediction,” Review of Financial Studies, 21, 1455–1508.

Zou, H., and T. Hastie (2005): “Regularization and variable selection via the elastic net,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67, 301–320.

TABLE 1: Out-of-Sample MSE Across Stocks

Cross-Sectional MSE

| | OLS | EN | RF | NN | Comb |
|---|---|---|---|---|---|
| Mean | 0.0207 | 0.0104 | 0.0114 | 0.0100 | 0.0101 |
| Median | 0.0147 | 0.0072 | 0.0081 | 0.0066 | 0.0070 |

Notes: Cross-sectional average and median MSE's on the out-of-sample testing data for: OLS, Elastic-Net (EN), Random Forest (RF), Neural Network (NN), and ensemble (Comb).

FIG. 1. Bar charts for 10-year rolling samples are displayed, where we only display yearly snapshots. The first covers the sample Jan 1993 - Jan 2003 and the last Jan 2002 - Jan 2012. For each of the 10 rolling samples the relative performance of the competing models (only looking at equities) is displayed. The bars add up to 100% for each of the 10 rolling samples. The out-of-sample (OOS) performance is measured in terms of MSE and the height of each bar represents the percentage a particular model has the lowest MSE in predicting the cross-section of returns for all the stocks in the sample. The models are OLS, Elastic Net (EN), Random Forest (RF), Neural Net (NN) and Ensemble (Comb). We use 70% of the data for training, 20% of the data as a validation sample (for hyperparameter tuning), and 10% of the sample for testing OOS performance. The bar charts pertain to the OOS performance.

[Figure: Cumulative Return Performance; horizontal axis: Month; legend: Realized, MV-Roll-Linear, MV-ML-Linear, MV-ML-NonLinear, EW]

FIG. 2. The lines correspond to (a) realized return of median investor, (b) median AI Alter Ego returns using MV Rolling Mean/Rolling Variance scheme, (c) median AI Alter Ego returns using MV ML/Rolling Variance scheme, (d) median AI Alter Ego returns using MV ML/Nonlinear Variance and finally (e) the median EW robo-investor. All start out with one unit of investment at the beginning of the sample and median returns are compounded.

TABLE 2: AI Alter Ego Return Spreads - All Investors

| Scheme | Equities only: Median | Equities only: Q1 | Equities only: Q3 | Equities + ETF: Median | Equities + ETF: Q1 | Equities + ETF: Q3 |
|---|---|---|---|---|---|---|
| Mean Variance: Rolling Mean/Rolling Variance | -0.08 | -13.60 | 12.84 | 0.58 | -12.72 | 13.16 |
| Mean Variance: ML Mean/Rolling Variance | 2.93 | -10.66 | 17.41 | 3.37 | -10.19 | 17.20 |
| Mean Variance: ML Mean/Nonlinear Smoothed Variance | 2.78 | -10.72 | 17.13 | 2.96 | -10.49 | 17.09 |
| Equally Weighted | -0.67 | -13.88 | 11.96 | -0.54 | -13.68 | 11.98 |

Notes: Entries are median, first (Q1) and third (Q3) quartiles of the cross-sectional distributions of the spreads between AI Alter Ego and individual investor returns, $r^{s}_{i,t} = r^{ae}_{i,t} - r_{i,t}$, for three Mean Variance (MV) types of robo-investors and one equally weighted (EW), considering only equity holdings (left panel) or the entire universe of 683 stocks and 393 ETFs (right panel). The AI Alter Ego schemes are: (a) MV with Rolling Mean/Rolling Variance, (b) Machine Learning (ML) Mean/Rolling Variance - using the methods displayed in Table 1, (c) ML Mean/Nonlinear Smoothed Variance, and finally the equally weighted portfolio scheme. The spreads are in percentage per year.

TABLE 3: Ranked Variables Based on Relative ℓ Contribution Across Stocks

| Rank | 93-03 | 94-04 | 95-05 | 96-06 | 97-07 | 98-08 | 99-09 | 00-10 | 01-11 | 02-12 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | SMB | svar | ntis | dfy | dfy | dfy | lty | Mkt-RF | ntis | svar |
| 2 | dfy | lty | lty | lty | svar | ntis | dfy | lty | svar | dfy |
| 3 | lty | infl | infl | svar | SPvwx | lty | svar | svar | Mkt-RF | lty |
| 4 | RMW | dfy | tbl | infl | infl | svar | tbl | dfy | infl | infl |
| 5 | infl | Mkt-RF | svar | HML | Mkt-RF | tbl | Mkt-RF | SPvw | SPvw | ntis |
| 6 | ntis | SMB | HML | tbl | dy | Mkt-RF | infl | ntis | CMA | RMW |
| 7 | HML | RMW | RMW | Mkt-RF | lty | infl | ntis | tbl | bm | tbl |
| 8 | Mkt-RF | ntis | dy | bm | RMW | dy | SPvw | CMA | lty | SPvwx |
| 9 | svar | HML | bm | SMB | ntis | SPvwx | RF | bm | ltr | SPvw |
| 10 | SPvw | bm | dfy | SPvw | dp | dp | SPvwx | RMW | SMB | Mkt-RF |
| 11 | Mom | Mom | Mkt-RF | RF | SMB | SPvw | HML | SMB | SPvwx | CMA |
| 12 | tbl | ltr | ep | ntis | tbl | bm | CMA | HML | dfy | HML |
| 13 | ltr | dy | RF | dy | bm | SMB | RMW | SPvwx | HML | SMB |
| 14 | de | SPvw | SMB | CMA | HML | RMW | bm | dy | dy | ltr |
| 15 | bm | CMA | dp | RMW | RF | HML | SMB | Mom | Mom | dp |
| 16 | SPvwx | de | CMA | SPvwx | SPvw | CMA | ltr | dp | dp | RF |
| 17 | RF | SPvwx | SPvw | Mom | Mom | RF | dy | ltr | tbl | Mom |
| 18 | CMA | dp | Mom | ltr | de | de | dp | infl | RMW | bm |
| 19 | dy | tbl | ltr | ep | ep | ep | Mom | ep | ep | dy |
| 20 | ep | RF | de | dp | CMA | ltr | de | de | RF | ep |
| 21 | dp | ep | SPvwx | de | ltr | Mom | ep | RF | de | de |

Notes: Elastic Net regressions defined in equations (1)-(2) involve the following set of regressors: dp Dividend/Price, dy Dividend Yield, ep Earnings/Price, de Dividend Payout, svar Stock Variance, bm Book-to-Market, ntis Net Equity Expansion, tbl T-Bill Rate, lty Long Term Yield, ltr Long Term Return, dfy Default Yield Spread, infl Inflation, SPvw S&P 500, SPvwx S&P 500 (excl. dividends), and the following Fama French factors: Mkt-RF Market, SMB, HML, RMW, CMA, RF, and Mom Momentum (see Ken French website and Welch and Goyal (2007) for definitions).

FIG. 3. Time series plot of the fraction among the cross-section of stocks with negative expected returns, according to the best machine learning model - see Figure 1 for details.

TABLE 4: AI Alter Egos Return Spreads - Education, Risk Aversion and Income

| Group | Return Spreads: Median | Return Spreads: Q1 | Return Spreads: Q3 | CI Median: C(2.5%) | CI Median: C(97.5%) |
|---|---|---|---|---|---|
| Education: Low | 2.83 | -10.47 | 16.79 | 1.18 | 4.22 |
| Education: High | 3.46 | -10.15 | 17.26 | 3.01 | 3.85 |
| Risk Aversion: Low | 3.29 | -10.86 | 17.43 | 2.39 | 4.38 |
| Risk Aversion: High | 5.14 | -9.12 | 17.28 | 3.63 | 6.36 |
| Income: Low | 4.13 | -10.00 | 17.57 | 3.01 | 5.20 |
| Income: High | 2.76 | -11.70 | 15.01 | 0.82 | 4.91 |

Confidence interval, difference in medians: Education [-0.5989, 2.1723]; Risk Aversion [0.5546, 3.1111]; Income [-3.1016, 0.6014].

Notes: Entries are median, first and third quartiles, as well as confidence interval for the median of the cross-sectional distributions of the spreads between AI Alter Ego and individual investor returns, considering the entire universe of 683 stocks and 393 ETFs. The AI Alter Ego scheme is Mean Variance (MV) with Machine Learning (ML) Mean/Rolling Variance - using the methods displayed in Table 1. Summary statistics are computed for separate samples with low/high education, low/high risk aversion and low/high income classification for investors. 95% confidence intervals for differences in medians are computed as described in footnote 11. The spreads are in percentages per year.

TABLE 5: Returns Pre-Crisis, Crisis and Post-Crisis

| Period | Series | Median | Q1 | Q3 |
|---|---|---|---|---|
| Pre | Realized | 9.26 | -4.70 | 20.98 |
| Pre | MV ML | 4.17 | -2.00 | 10.70 |
| During | Realized | -29.04 | -45.87 | -3.59 |
| During | MV ML | 0.00 | -13.81 | 23.81 |
| Post | Realized | 5.64 | -6.47 | 15.30 |
| Post | MV ML | 2.05 | -1.76 | 8.42 |

Notes: The subsamples are benchmarked based on the NBER Crisis Time Period 12/2007 - 6/2009. The Pre-crisis sample starts in 2002 and ends 11/2007, the post-crisis sample covers 7/2008 until end of sample, 2012. MV ML refers to the AI Alter Ego scheme, i.e. Mean Variance with Machine Learning Mean/Rolling Variance - using the methods displayed in Table 1. The spreads are in percentages per year.

TABLE 6: AI Alter Ego Return Spreads vis-à-vis benchmark ETFs

| Sample | Benchmark | Realized Stocks+ETFs minus ETF: Median | 25q | 75q | AI Alter Ego minus ETF: Median | 25q | 75q |
|---|---|---|---|---|---|---|---|
| Full sample | S&P 500 ETF | -8.50 | -21.05 | 2.00 | -6.18 | -13.99 | 1.32 |
| Full sample | Belgian ETF | 1.37 | -9.90 | 12.00 | 3.93 | -4.89 | 13.10 |
| Pre-Crisis | S&P 500 ETF | -2.22 | -16.02 | 9.18 | -7.20 | -13.88 | -0.41 |
| Pre-Crisis | Belgian ETF | -9.57 | -21.28 | 1.45 | -14.58 | -21.33 | -7.02 |
| During Crisis | S&P 500 ETF | -8.78 | -27.02 | 12.48 | 18.32 | -1.75 | 39.32 |
| During Crisis | Belgian ETF | 21.24 | 1.23 | 40.35 | 48.31 | 25.60 | 71.02 |
| Post-Crisis | S&P 500 ETF | -11.37 | -23.16 | -1.78 | -14.97 | -19.62 | -8.43 |
| Post-Crisis | Belgian ETF | -1.48 | -12.59 | 8.26 | -4.82 | -9.85 | 2.36 |

Notes: The AI Alter Ego scheme is the ML Mean/Nonlinear Smoothed Variance - using the methods displayed in Table 1. The spreads are in percentage per year. The subsamples are benchmarked based on the NBER Crisis Time Period 12/2007 - 6/2009. The Pre-crisis sample starts in 2002 and ends 11/2007, the post-crisis sample covers 7/2008 until end of sample, 2012. The benchmark ETFs are the SPDR ETF tracking the S&P 500 index and the iShares MSCI Belgium ETF.

TABLE 7: Disposition Effect and AI Alter Ego Return Spreads - Median regression

| | Spreads: Model DE Only | Spreads: Model DE + Trading Freq. | Spreads vis-à-vis S&P 500 ETF: Model DE Only | Spreads vis-à-vis S&P 500 ETF: Model DE + Trading Freq. |
|---|---|---|---|---|
| Trading frequency: 2nd quartile | | -1.361* | | 0.782*** |
| Trading frequency: 3rd quartile | | -1.944*** | | 1.168*** |
| Trading frequency: 4th quartile | | -2.850*** | | 1.175*** |
| Disposition Effect | 0.014* | 0.011 | 0.004* | 0.007*** |

Notes: Cross-sectional Median Regression Alter Ego Spreads with MV ML/Rolling Variance. Asterisks (*, **, ***) denote increasing levels of statistical significance. N = 19118. Detailed parameter estimates for the controls (gender, education, risk aversion, income, funds invested, ETF use) appear in the Online Appendix.