Adaptive l1-regularization for short-selling control in portfolio selection

Abstract

We consider the l1-regularized Markowitz model, where an l1-penalty term is added to the objective function of the classical mean-variance model to stabilize the solution process and promote sparsity in the solution. The l1-penalty term can also be interpreted in terms of short sales, on which several financial markets have imposed restrictions. The choice of the regularization parameter plays a key role in obtaining optimal portfolios that meet the financial requirements. We propose an updating rule for the regularization parameter in Bregman iteration to control both the sparsity and the number of short positions. We show that the modified scheme preserves the properties of the original one. Numerical tests are reported, which show the effectiveness of the approach.


arXiv [q-fin.PM]

S. Corsaro ∗, V. De Simone †

∗ Dipartimento di Studi aziendali e quantitativi, Università di Napoli “Parthenope”, Via Generale Parisi, 13, I-80133 Napoli, Italy.
† Dipartimento di Matematica e Fisica, Università della Campania “Luigi Vanvitelli”, Viale Lincoln, 5, I-81100 Caserta, Italy.

Keywords: Portfolio selection. Markowitz model. $l_1$-regularization. Bregman iteration.

1 Introduction

In the classical Markowitz mean-variance framework [1], portfolio selection aims at the construction of an investment portfolio that exposes the investor to minimum risk while providing a fixed expected return. This approach was proposed by Markowitz in his aforementioned seminal paper, where he stated that a portfolio selection strategy should provide an optimal trade-off between expected return and risk (mean-variance approach). In a subsequent work [2], Markowitz reinforced his theory, arguing that, under certain mild conditions, a portfolio from a mean-variance efficient frontier will approximately maximize the investor's expected utility.

The Markowitz model relies on information about the future, since expected returns should actually be computed by discounting future flows, which are clearly not available. A common choice is to use historical data as predictive of the future behavior of asset returns. This practice has certain drawbacks; indeed, a limited […] $l_1$ and squared-$l_2$ norm constraints are proposed for the minimum-variance criterion. In [5] an algorithm for the optimal minimum-variance portfolio selection with a weighted $l_1$ and squared-$l_2$ norm penalty is presented. In [6] the authors regularize the mean-variance objective function with a weighted elastic net penalty.

In this paper, we consider the $l_1$-regularized mean-variance model introduced in [7], where an $l_1$-penalty term is added to promote sparsity in the solution. Since solutions establish the amount of capital to be invested in each available security, sparsity means that money is invested in a few securities, the active positions. This allows the investor to reduce both the number of positions to be monitored and the transaction costs, which are particularly relevant for small investors and are not taken into account in the theoretical Markowitz model. Another useful interpretation of $l_1$ regularization is related to the amount of shorting in the portfolio; from the financial point of view, negative solutions correspond to short sales. In many markets, including Italy, Germany and Switzerland, restrictions on short sales have been established in recent years, thus short-controlling is desired as well. The choice of the regularization parameter is therefore crucial in order to provide sparse solutions, with either a limited or null number of negative components, while preserving fidelity to data.

In this paper we propose an iterative algorithm based on a modified Bregman iteration. Bregman iteration is a well established method for the solution of $l_1$-regularized optimization problems. It has been successfully applied in different fields, such as image restoration [8], matrix rank minimization [9], compressed sensing [10] and finance [6]. Our modification to the original scheme introduces an adaptive updating rule for the regularization parameter in the regularized model. The algorithm selects a value capable of providing solutions that satisfy a fixed financial target, formulated in terms of a limited number of active and/or short positions.

We show that our modified scheme preserves the properties of the original one and is able to select a good value of the regularization parameter within a negligible computational time. Numerical tests confirm the effectiveness of the proposed algorithm.

The paper is organized as follows. In section 2 we briefly recall the Markowitz mean-variance model. In section 3 we introduce Bregman iteration for portfolio selection. Our main results are in section 4, where we introduce our algorithm, based on a modified Bregman iteration, for the $l_1$-regularized Markowitz model. In section 5 we validate our approach by means of several numerical experiments. Finally, in section 6 we give some conclusions and outline future work.

2 Portfolio selection model

We refer to the classical Markowitz mean-variance framework. Given $n$ traded assets, the core of the problem is to establish the amount of capital to be invested in each available security.

We assume that one unit of capital is available and define $w = (w_1, w_2, \ldots, w_n)^T$ as the portfolio weight vector, that is, the amount $w_i$ is invested in the $i$-th security. Asset returns are assumed to be stationary. If we denote by $\mu = (\mu_1, \mu_2, \ldots, \mu_n)^T$ the expected asset returns, then the expected portfolio return is their weighted sum:

$$\sum_{i=1}^{n} w_i \mu_i. \qquad (1)$$

We moreover denote by $\sigma_{ij}$ the covariance between returns of securities $i$ and $j$. The portfolio risk is measured by means of its variance, given by:

$$V = \sum_{i=1}^{n} \sum_{j=1}^{n} \sigma_{ij} w_i w_j.$$

Let $\rho$ be the fixed expected portfolio return and $C$ the covariance matrix of returns. Portfolio selection is formulated as the following quadratic constrained optimization problem:

$$\min_{w} \; w^T C w \quad \text{s.t.} \quad w^T \mu = \rho, \quad w^T \mathbf{1}_n = 1, \qquad (2)$$

where $\mathbf{1}_n$ is the vector of ones of length $n$. The first constraint fixes the expected return, according to (1). The second one is a budget constraint which establishes that all the available capital is invested. The non-negativity constraint is often added to avoid short positions. We do not consider it here, since we aim at controlling short positions by tuning the regularization parameter, as discussed in the following.

Let us consider a set of $m$ evenly spaced dates $t = (t_1, t_2, \ldots, t_m)$ at which asset returns are estimated and build the matrix $R \in \mathbb{R}^{m \times n}$ that contains observed historical returns of asset $i$ on its $i$-th column. It can be shown that problem (2) can be stated in the following form:

$$\min_{w} \; \frac{1}{m} \|\rho \mathbf{1}_m - R w\|_2^2 \quad \text{s.t.} \quad w^T \mu = \rho, \quad w^T \mathbf{1}_n = 1. \qquad (3)$$

As the asset returns are typically correlated, the matrix $R$ could have some singular values close to zero; therefore regularization techniques, which add to the objective function some form of a priori knowledge about the solution, must be considered.
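The passage from (2) to (3) can be checked numerically on a toy example. The sketch below is only illustrative: the return matrix and the weights are hypothetical, and it assumes that $\mu$ and $C$ are the sample mean and the sample covariance with $1/m$ normalization computed from $R$, which is the setting in which the two objectives coincide for any $w$ satisfying $w^T \mu = \rho$.

```python
# Hypothetical toy data: m = 4 dates (rows), n = 3 assets (columns).
R = [
    [0.02, 0.01, -0.01],
    [0.00, 0.03, 0.02],
    [0.01, -0.02, 0.01],
    [0.03, 0.02, 0.00],
]
w = [0.5, 0.3, 0.2]  # hypothetical weights summing to one (budget constraint)

m, n = len(R), len(R[0])

# Assumption: mu is the sample mean of each asset (column means of R).
mu = [sum(R[t][i] for t in range(m)) / m for i in range(n)]

# Assumption: C is the sample covariance with 1/m normalization.
C = [[sum((R[t][i] - mu[i]) * (R[t][j] - mu[j]) for t in range(m)) / m
      for j in range(n)] for i in range(n)]

# Expected portfolio return, eq. (1), and portfolio variance w^T C w.
rho = sum(w[i] * mu[i] for i in range(n))
variance = sum(C[i][j] * w[i] * w[j] for i in range(n) for j in range(n))

# Objective of eq. (3): (1/m) * ||rho * 1_m - R w||_2^2.
portfolio_returns = [sum(R[t][i] * w[i] for i in range(n)) for t in range(m)]
least_squares = sum((rho - r) ** 2 for r in portfolio_returns) / m

print(rho, variance, least_squares)
```

Here $\rho$ is set to $w^T \mu$ so that the return constraint holds by construction; the two printed objective values then agree up to rounding.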
In this paper we consider the following $l_1$-regularized problem:

$$\min_{w} \; \|\rho \mathbf{1}_m - R w\|_2^2 + \tau \|w\|_1 \quad \text{s.t.} \quad w^T \mu = \rho, \quad w^T \mathbf{1}_n = 1, \qquad (4)$$

where the $1/m$ term has been incorporated into the regularization parameter. From the second constraint in (4) it follows that the objective function can be equivalently written as:

$$\|\rho \mathbf{1}_m - R w\|_2^2 + 2\tau \sum_{i : w_i < 0} |w_i| + \tau.$$

Indeed, since the weights sum to one, $\|w\|_1 = \sum_{i : w_i \ge 0} w_i - \sum_{i : w_i < 0} w_i = 1 + 2 \sum_{i : w_i < 0} |w_i|$. This form points out that the $l_1$ penalty is equivalent to a penalty on short positions. In the limit of very large values of the regularization parameter, we obtain a portfolio with only positive weights, as observed also in [11].

Portfolio selection can be formulated as the constrained nonlinear optimization problem:

$$\min_{w} \; E(w) \quad \text{s.t.} \quad A w = b, \qquad (5)$$

where $E(w) = \|\rho \mathbf{1}_m - R w\|_2^2 + \tau \|w\|_1$ is strictly convex and non-smooth due to the presence of the $l_1$ penalty term,

$$A = \begin{pmatrix} \mu^T \\ \mathbf{1}_n^T \end{pmatrix} \in \mathbb{R}^{2 \times n} \quad \text{and} \quad b = (\rho, 1)^T \in \mathbb{R}^2.$$

One way to solve (5) is to convert it into an unconstrained problem, for example by using a penalty function / continuation method, which approximates it by a sequence:

$$\min_{w} \; E(w) + \frac{\lambda_k}{2} \|A w - b\|_2^2, \quad \lambda_k \in \mathbb{R}^+. \qquad (6)$$

It is well known that, if the $k$-th subproblem (6) has solution $w_k$ and $\{\lambda_k\}$ is an increasing sequence tending to $\infty$ as $k \to \infty$, any limit point of $\{w_k\}$ is a solution of (5) [12]. Therefore, in many problems it is necessary to choose very large values of $\lambda_k$, which makes (6) extremely difficult to solve numerically. Alternatively, Bregman iteration can be used to reduce (5) to a short sequence of unconstrained problems by using the Bregman distance associated with $E$ [13], where, conversely, the value of $\lambda_k$ in (6) remains constant.

The Bregman distance [13] associated with a proper convex functional $E(w) : \mathbb{R}^n \to \mathbb{R}$ at point $v$ is defined as:

$$D_E^{p}(w, v) = E(w) - E(v) - \langle p, w - v \rangle, \qquad (7)$$

where $p \in \partial E(v)$ is a subgradient in the subdifferential of $E$ at point $v$ and $\langle \cdot, \cdot \rangle$ denotes the canonical inner product in $\mathbb{R}^n$.
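Definition (7) can be evaluated directly for the functional $E$ of problem (5). The following sketch is a hypothetical illustration, not the authors' code: the data are toy values, and the subgradient of the $l_1$ term is chosen as $\mathrm{sign}(v_i)$, taking $0$ where $v_i = 0$, which is one admissible element of the subdifferential.

```python
def objective(w, R, rho, tau):
    """E(w) = ||rho*1 - R w||_2^2 + tau * ||w||_1."""
    residuals = [rho - sum(r * x for r, x in zip(row, w)) for row in R]
    return sum(r * r for r in residuals) + tau * sum(abs(x) for x in w)


def subgradient(v, R, rho, tau):
    """One element of the subdifferential of E at v.

    Assumption: the l1 part uses sign(v_i), with 0 chosen where v_i == 0.
    """
    residuals = [sum(r * x for r, x in zip(row, v)) - rho for row in R]
    grad = [2.0 * sum(R[t][i] * residuals[t] for t in range(len(R)))
            for i in range(len(v))]
    sign = [(x > 0) - (x < 0) for x in v]
    return [g + tau * s for g, s in zip(grad, sign)]


def bregman_distance(w, v, R, rho, tau):
    """D_E^p(w, v) = E(w) - E(v) - <p, w - v>, with p in dE(v)."""
    p = subgradient(v, R, rho, tau)
    inner = sum(pi * (wi - vi) for pi, wi, vi in zip(p, w, v))
    return objective(w, R, rho, tau) - objective(v, R, rho, tau) - inner


# Hypothetical toy problem: R is the 2x2 identity, rho = 0.5, tau = 1.
R = [[1.0, 0.0], [0.0, 1.0]]
w = [1.0, 0.0]
v = [0.5, 0.5]
d_wv = bregman_distance(w, v, R, 0.5, 1.0)
d_vw = bregman_distance(v, w, R, 0.5, 1.0)
print(d_wv, d_vw)  # 0.5 1.0
```

In this toy case the two values differ, because the $l_1$ term contributes differently depending on the point at which the subgradient is taken.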
It is not a distance in the usual sense because it is not in general symmetric, but it does measure closeness between $w$ and $v$ in the sense that if $u$ lies on the line segment $(w, v)$, then the line segment $(w, u)$ has smaller Bregman distance than $(w, v)$ does. At each Bregman iteration $E(w)$ is replaced by the Bregman distance, so a subproblem in the form of (6) is solved according to the following iterative scheme:

$$\begin{cases} w_{k+1} = \arg\min_{w} \; D_E^{p_k}(w, w_k) + \dfrac{\lambda}{2} \|A w - b\|_2^2, \\ p_{k+1} = p_k - \lambda A^T (A w_{k+1} - b) \in \partial E(w_{k+1}). \end{cases} \qquad (8)$$

The updating rule of $p_{k+1}$ is chosen according to the first-order optimality condition for $w_{k+1}$ and ensures that $D_E^{p_{k+1}}(w, w_{k+1})$ is well defined. Under suitable hypotheses the convergence of the sequence $\{w_k\}$ to the solution of the constrained problem (5) is guaranteed in a finite number of steps [8]; furthermore, using the equivalence of Bregman iteration with the augmented Lagrangian method [14], convergence is proved also in [15]. Note that the convergence results guarantee the monotonic decrease of $\|A w_k - b\|_2$, thus for large $k$ the constraint conditions are satisfied to an arbitrarily high degree of accuracy. This yields a natural stopping criterion according to a discrepancy principle.

Since there is generally no explicit expression for the solution of the sub-minimization problem involved in (8), at each iteration the solution is computed inexactly using an iterative solver. In recent years there has therefore been a growing interest in the inexact solution of the subproblem involved in Bregman iteration. In recent papers it is proved that, for many applications, Bregman iterations yield very accurate solutions even if subproblems are not solved as accurately [8, 10, 16]. In [17] convergence results are obtained for piece-wise linear convex functionals. In [18] the inexactness in the inner solution is controlled by a criterion that preserves the convergence of the Bregman iteration and its features in image restoration.
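Scheme (8) with an inexact inner solve can be sketched as follows. This is a minimal sketch under stated assumptions rather than a reference implementation: a plain proximal-gradient loop with a fixed number of steps stands in for the inner solver, the step size uses a crude Frobenius-norm bound on the Lipschitz constant, and the function names, default parameters and toy data are all hypothetical.

```python
def matvec(M, x):
    """Return M x for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]


def rmatvec(M, y):
    """Return M^T y."""
    return [sum(M[t][i] * y[t] for t in range(len(M))) for i in range(len(M[0]))]


def soft_threshold(x, t):
    """Proximal map of t * ||.||_1, applied componentwise."""
    return [((v > 0) - (v < 0)) * max(abs(v) - t, 0.0) for v in x]


def bregman_iteration(R, mu, rho, tau, lam=1.0, outer=50, inner=200, tol=1e-4):
    """Sketch of scheme (8) for E(w) = ||rho*1 - R w||^2 + tau*||w||_1, A w = b."""
    n = len(mu)
    A = [list(mu), [1.0] * n]  # rows: expected returns, vector of ones
    b = [rho, 1.0]
    # Upper bound on the Lipschitz constant of the smooth part (Frobenius norms).
    L = (2.0 * sum(v * v for row in R for v in row)
         + lam * sum(v * v for row in A for v in row))
    w, p = [0.0] * n, [0.0] * n
    for _ in range(outer):
        # Inexact inner solve: minimize E(w) - <p, w> + (lam/2)*||A w - b||^2.
        for _ in range(inner):
            res_R = [r - rho for r in matvec(R, w)]
            res_A = [a - bi for a, bi in zip(matvec(A, w), b)]
            grad = [2.0 * g - pi + lam * h
                    for g, pi, h in zip(rmatvec(R, res_R), p, rmatvec(A, res_A))]
            w = soft_threshold([wi - gi / L for wi, gi in zip(w, grad)], tau / L)
        # Subgradient update from the first-order optimality condition.
        res_A = [a - bi for a, bi in zip(matvec(A, w), b)]
        p = [pi - lam * h for pi, h in zip(p, rmatvec(A, res_A))]
        residual = sum(r * r for r in res_A) ** 0.5
        if residual <= tol:  # discrepancy-type stopping rule
            break
    return w, residual


# Hypothetical toy data: returns in percent, m = 4 dates, n = 3 assets.
R = [[2.0, 1.0, -1.0], [0.0, 3.0, 2.0], [1.0, -2.0, 1.0], [3.0, 2.0, 0.0]]
mu = [sum(row[i] for row in R) / len(R) for i in range(3)]
rho = sum(mu) / len(mu)  # return of the evenly weighted portfolio
w, residual = bregman_iteration(R, mu, rho, tau=0.5)
print(w, residual)
```

The constant term $E(w_k) - \langle p_k, w_k \rangle$ of the Bregman distance does not affect the minimizer, which is why the inner loop only works with $E(w) - \langle p_k, w \rangle$ plus the quadratic penalty.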
A crucial issue in the solution of (4) is the choice of a suitable value for the regularization parameter $\tau$, as already pointed out. The aim is to select $\tau$ so as to realize a trade-off between sparsity and short-controlling (requiring sufficiently large values) and fidelity to data (requiring small values). While the literature offers a significant number of methods for Tikhonov regularization [19], $l_1$ regularization parameter selection is often based on problem-dependent criteria and related to iterative empirical estimates, which require a high computational cost. In [7] the least-angle regression (LARS) algorithm proceeds by decreasing the value of $\tau$ progressively from very large values, exploiting the fact that the dependence of the optimal weights on $\tau$ is piecewise linear.

In this section, we present a numerical algorithm, based on a modified Bregman iteration with an adaptive updating rule for $\tau$. Our basic idea for defining the rule for $\tau$ comes from the well-known properties of the $l_1$ norm and the following proposition [7]:

Proposition 1

Let $w_{\tau_1}$ and $w_{\tau_2}$ be solutions of the $l_1$-regularized problem (4) with $\tau_1$ and $\tau_2$ respectively. If some of the $(w_{\tau_1})_i$ are negative and all the entries in $w_{\tau_2}$ are positive or zero, we have $\tau_2 > \tau_1$.

We then propose an updating rule for $\tau$ that generates an increasing sequence of values. Our aim is to modify Bregman iteration in order to produce solutions satisfying a fixed financial target, defined in terms of sparsity or short-controlling or a combination of them.

Let $E_k(w) = \|R w - \rho \mathbf{1}_m\|_2^2 + \tau_k \|w\|_1$, $k = 0, 1, \ldots$ We now prove the main result of this paper:

Theorem 1

Given $(w_{k+1}, p_{k+1})$ provided by (8) applied to $E_k$, it holds

$$\tilde{p}_{k+1} = \frac{\tau_{k+1}}{\tau_k} p_{k+1} + 2 \left( 1 - \frac{\tau_{k+1}}{\tau_k} \right) R^T (R w_{k+1} - \rho \mathbf{1}_m) \in \partial E_{k+1}(w_{k+1}). \qquad (9)$$

Proof.

It holds $p_{k+1} \in \partial E_k(w_{k+1}) = \partial(\tau_k \|w_{k+1}\|_1) + \nabla \left( \|R w_{k+1} - \rho \mathbf{1}_m\|_2^2 \right)$, thus a vector $q_{k+1} \in \partial(\tau_k \|w_{k+1}\|_1)$ exists such that $p_{k+1} = q_{k+1} + 2 R^T (R w_{k+1} - \rho \mathbf{1}_m)$. It follows that $q_{k+1} = p_{k+1} - 2 R^T (R w_{k+1} - \rho \mathbf{1}_m) \in \partial(\tau_k \|w_{k+1}\|_1)$. It is easy to verify that:

$$\frac{\tau_{k+1}}{\tau_k} q_{k+1} \in \partial(\tau_{k+1} \|w_{k+1}\|_1).$$

Then $\frac{\tau_{k+1}}{\tau_k} q_{k+1} + 2 R^T (R w_{k+1} - \rho \mathbf{1}_m) \in \partial E_{k+1}(w_{k+1})$; substituting the expression of $q_{k+1}$ gives exactly $\tilde{p}_{k+1}$ in (9), which completes the proof. □

We propose the following modified Bregman iteration:

$$\begin{cases} p_k = \dfrac{\tau_k}{\tau_{k-1}} p_k + 2 \left( 1 - \dfrac{\tau_k}{\tau_{k-1}} \right) R^T (R w_k - \rho \mathbf{1}_m), \\ w_{k+1} = \arg\min_{w} \; D_{E_k}^{p_k}(w, w_k) + \dfrac{\lambda}{2} \|A w - b\|_2^2, \\ p_{k+1} = p_k - \lambda A^T (A w_{k+1} - b), \\ \tau_{k+1} = h(\tau_k), \end{cases} \qquad (10)$$

where $h : \mathbb{R}^+ \to \mathbb{R}^+$ is an increasing, bounded function and the first line overwrites $p_k$ with its rescaled value.

Note that relation (9) in Theorem 1 guarantees that the iterative scheme (10) is well defined, thus it preserves the properties of the original one.

In this paper we choose a multiplicative form for the function $h$. We set $\tau_{k+1} = \eta_{k+1} \tau_k$, where $\eta_{k+1}$ depends on $w_{k+1}$ according to the financial target, as shown in Algorithm 1. Note that there is no guarantee that a finite value of $\tau$ exists that satisfies the financial target, thus we force $h$ to be bounded by setting a maximum value $\tau_{max}$. If the financial target is met at a certain step, then $\eta_k = 1$ for all successive iterations. Conversely, $\tau$ is set to $\tau_{max}$. In any case, there exists an iteration $\bar{k}$ such that $\tau_k$ remains fixed at a value $\bar{\tau}$ for $k \ge \bar{k}$.

Algorithm 1

Modified Bregman Iteration for portfolio selection

Given $\tau_0 > 0$, $\tau_{max}$, $\lambda$, $\theta > 1$
$n_{short}$, $n_{act}$   % Financial target parameters
$k := 0$
$w_0 := 0$, $p_0 := 0$, $\tau_{-1} := \tau_0$
while “stopping rule not satisfied” do
    $p_k = \frac{\tau_k}{\tau_{k-1}} p_k + 2 \left( 1 - \frac{\tau_k}{\tau_{k-1}} \right) R^T (R w_k - \rho \mathbf{1}_m)$
    $w_{k+1} = \arg\min_{w} D_{E_k}^{p_k}(w, w_k) + \frac{\lambda}{2} \|A w - b\|_2^2$
    $p_{k+1} = p_k - \lambda A^T (A w_{k+1} - b)$
    $W^-_{k+1} = \{ i : (w_{k+1})_i < 0 \}$
    $W^a_{k+1} = \{ i : (w_{k+1})_i \neq 0 \}$
    if $|W^-_{k+1}| > n_{short}$ or $|W^a_{k+1}| > n_{act}$ then
        $\eta_{k+1} = \theta$
    else
        $\eta_{k+1} = 1$
    end if
    $\tau_{k+1} = \min \{ \eta_{k+1} \tau_k, \tau_{max} \}$
    $k := k + 1$
end while

Theorem 2

Let $\bar{\tau}$ be the regularization parameter value produced by Algorithm 1 at step $\bar{k}$. Suppose that at a certain step $k \ge \bar{k}$ the iterate $w_k$ satisfies $A w_k = b$. Then $w_k$ is a solution to the constrained problem

$$\min_{w} \; E_{\bar{k}}(w) \quad \text{s.t.} \quad A w = b. \qquad (11)$$

Proof.

We note that $\tau_k = \bar{\tau}$ $\forall k \ge \bar{k}$, thus the objective function is fixed for $k \ge \bar{k}$. Therefore, the proof follows the proof of Theorem 2.2 in [14].

This result shows that if the sequence provided by Algorithm 1 converges in the sense of $\lim_{k \to \infty} \|A w_k - b\|_2 = 0$, then the iterates $w_k$ will get arbitrarily close to a solution to the original constrained problem with $\tau = \bar{\tau}$.

5 Experimental results
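As a computational aside, the two ingredients that distinguish Algorithm 1 from the plain Bregman iteration, namely the multiplicative update of $\tau$ and the subgradient rescaling of relation (9), can be sketched in a few lines. The function names and the numerical values below are hypothetical and chosen only for illustration; `grad` stands for $2 R^T (R w - \rho \mathbf{1}_m)$ and is passed in as a plain list rather than computed from data.

```python
def update_tau(w, tau, theta, tau_max, n_short, n_act):
    """Multiplicative rule of Algorithm 1: enlarge tau while the target is missed."""
    shorts = sum(1 for x in w if x < 0)    # |W^-|: short positions
    active = sum(1 for x in w if x != 0)   # |W^a|: active positions
    eta = theta if (shorts > n_short or active > n_act) else 1.0
    return min(eta * tau, tau_max)


def rescale_subgradient(p, grad, tau_new, tau_old):
    """Relation (9): map p in dE_k(w) to an element of dE_{k+1}(w).

    grad is assumed to hold 2 R^T (R w - rho 1), the gradient of the
    quadratic term at w.
    """
    ratio = tau_new / tau_old
    return [ratio * pi + (1.0 - ratio) * gi for pi, gi in zip(p, grad)]


# Hypothetical iterate with one short position and one inactive asset.
w = [0.7, 0.0, 0.5, -0.2]
grad = [0.1, -0.3, 0.2, 0.4]                 # hypothetical gradient values
tau_old = 0.25
q = [tau_old * s for s in (1.0, 0.4, 1.0, -1.0)]   # element of d(tau_old*||w||_1)
p = [qi + gi for qi, gi in zip(q, grad)]           # element of dE_k(w)

tau_new = update_tau(w, tau_old, theta=2.0, tau_max=1.0, n_short=0, n_act=4)
p_new = rescale_subgradient(p, grad, tau_new, tau_old)

# Removing the gradient again leaves the rescaled l1 subgradient.
q_new = [pi - gi for pi, gi in zip(p_new, grad)]
print(tau_new, q_new)
```

With a no-short target ($n_{short} = 0$) the single negative weight triggers the update, so $\tau$ doubles from 0.25 to 0.5, and the $l_1$ part of the rescaled vector is a valid subgradient of $\tau_{new} \|w\|_1$: it equals $\tau_{new}\,\mathrm{sign}(w_i)$ on the active entries and stays within $[-\tau_{new}, \tau_{new}]$ on the inactive one.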

In this section, we discuss some computational issues and show the effectiveness of Algorithm 1 for solving the regularized portfolio optimization problem (4).

In Algorithm 1 we set $\lambda = 1$, $\tau_0 = 2^{-[\ldots]}$, $\tau_{max} = 1$ and $\theta = 2$. Iterations are stopped as soon as $\|A w_k - b\|_2 \le Tol$ with $Tol = 10^{-[\ldots]}$ which, from the financial point of view, guarantees the constraints to a sufficient accuracy. We implement the Fast Proximal Gradient method with backtracking stepsize rule (FISTA) [20] to solve the unconstrained subproblem at each modified Bregman iteration in Algorithm 1. FISTA is an accelerated variant of the Forward Backward (FB) algorithm, built upon the ideas of Güler [21] and Nesterov [22]. Note that FB is a first-order method for minimizing objective functions $F(x) \equiv f(x) + g(x)$, where $g : \mathbb{R}^n \to \mathbb{R}$ is a proper, convex, lower semicontinuous function with $dom(g)$ closed, $f : \mathbb{R}^n \to \mathbb{R}$ is convex and $\nabla f$ is $L$-Lipschitz continuous. It generates a sequence $(x_n)_{n \in \mathbb{N}}$ in two separate stages; the former performs a forward (explicit) step which involves only $f$, while the latter performs a backward (implicit) step involving a proximal map associated to $g$ [23]. In our case we set $f = \|\rho \mathbf{1}_m - R w\|_2^2 - \langle p, w \rangle + \frac{\lambda}{2} \|A w - b\|_2^2$ and $g = \tau \|w\|_1$; the proximal map of $g$ is then the simple and explicit soft threshold operator:

$$Prox_g(w_i) = sgn(w_i) \left( |w_i| - \min \{ |w_i|, \tau \} \right).$$

Inner iterations are stopped when the relative difference in Euclidean norm between two successive iterates is less than $Tol_{Inn} = 10^{-[\ldots]}$. All our experiments, some of which are reported in the following, show that it is not worth requiring great accuracy from the inner solver.

The tests have been performed in the Matlab R2015a (v. 8.5, 64-bit) environment, on a six-core Xeon processor with 24 GB of RAM and 12 MB of cache memory, running Ubuntu/Linux 12.04.5. We compare our optimal portfolios with the evenly weighted one (the naive portfolio), usually taken as benchmark in the literature [24]. This is essentially for three reasons: it is easy to implement, many investors still use such a simple rule to allocate their wealth across assets, and it allows one to diversify the risk.

We evaluate our approach by observing the out-of-sample performance of optimal portfolios as in [3]. This means that for each $T$-years period of asset returns, we use historical series to solve (4); the target return $\rho$ is fixed to the average return provided by the naive portfolio in those years. The optimal solution obtained in this way is used to build a portfolio that is retained for one year. We continue this process by moving one year ahead until we reach the end of the period, ending with a series of out-of-sample portfolios. We then compare the resulting average return $\hat{\rho}$ and standard deviation $\hat{\sigma}$ with the corresponding values of the naive portfolio. We moreover compute the Sharpe ratio $SR = \hat{\rho} / \hat{\sigma}$: since one would desire large return and small variance values, the Sharpe ratio can be taken as the reference value for the comparison. We present the results on three test problems; the first and the second one come from the Fama and French database (data available at http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html) […]
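The rolling out-of-sample protocol and the Sharpe ratio described above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, years are represented by integer indices, and the population standard deviation is an assumption, since the text does not say which estimator is used.

```python
from statistics import mean, pstdev


def rolling_windows(n_years, T):
    """Yield (training_years, holding_year) pairs, moving one year ahead each time."""
    for start in range(n_years - T):
        yield list(range(start, start + T)), start + T


def sharpe_ratio(returns):
    """SR = average return / standard deviation (no risk-free rate, as in the text).

    Assumption: the population standard deviation is used.
    """
    return mean(returns) / pstdev(returns)


# 45 years of data with T = 5 give 40 out-of-sample portfolios.
windows = list(rolling_windows(45, 5))
print(len(windows), windows[0], windows[-1][1])

# Hypothetical yearly out-of-sample returns of a portfolio.
sr = sharpe_ratio([0.1, 0.2, 0.3, 0.2])
print(sr)
```

In a full experiment each training window would be used to solve (4), and the resulting weights would be applied to the returns of the holding year to produce one entry of the out-of-sample return series.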

We consider the first database, FF48, which contains monthly returns of 48 industry sector portfolios from July 1926 to December 2015. Using data from 1970 to 2015, we construct optimal portfolios and analyze their out-of-sample performance. Starting from July 1970, we use $T = 5$ years of data, so that 40 optimal portfolios are built, until June 2015. Portfolios in FF48 exhibit moderate correlation; indeed, the condition number of $C$ is $O(10^{[\ldots]})$ for all simulations. We tested different values of $Tol_{Inn}$; all our experiments show that lower values of $Tol_{Inn}$ do not improve results, thus we show results obtained for $Tol_{Inn} = 10^{-[\ldots]}$.

In table 1, for both optimal and naive portfolio, expected return, standard deviation and Sharpe ratio are reported, all expressed on annual basis. The optimal portfolios are no-short ones ($n_{short} = 0$, $n_{act} = 48$), that is, the target is to obtain positive solutions. Values refer to average values computed over 8 years, grouped as described in the first column of the table. The first row contains average values computed over the whole 40-year period of simulation. In all cases, the optimal portfolio exhibits greater values of the Sharpe ratio than the naive one.

                      Optimal portfolio      Naive portfolio
Period                ρ̂     σ̂     SR        ρ̂     σ̂     SR
1975/07 - 2015/06     14%   38%   37%       15%   60%   26%
1975/07 - 1983/06     22%   44%   51%       29%   63%   47%
1983/07 - 1991/06     14%   39%   37%       7%    59%   12%
1991/07 - 1999/06     14%   29%   50%       15%   50%   30%
1999/07 - 2007/06     13%   34%   38%       17%   58%   29%
2007/07 - 2015/06     7%    44%   15%       10%   69%   14%

Table 1: Comparison between optimal no-short ($n_{short} = 0$, $n_{act} = 48$) and naive portfolio for FF48. Reported return and standard deviation are average values over 40 years (first line) and over groups of 8 years (lines 2-6).

In figure 1 we report the number of active positions in optimal no-short portfolios (top) and the number of modified Bregman iterations (bottom). The values of $\tau$ range between $2^{-[\ldots]}$ and $2^{-[\ldots]}$, promoting sparsity (the percentage of sparsity varies from 6% to 21%) and positivity.

Figure 1: Optimal portfolio for FF48, with $n_{short} = 0$, $n_{act} = 48$. Top: active positions. Bottom: number of modified Bregman iterations. Horizontal axes: year of simulation.

In table 2, for the same financial target, we report results on optimal portfolios containing at most ten active positions ($n_{short} = 48$, $n_{act} = 10$). Values are interpreted as in table 1. In all cases, the optimal portfolio again exhibits greater values of the Sharpe ratio than the naive one. In figure 2 we report the number of active and short positions in optimal portfolios (top) and the number of modified Bregman iterations (bottom). The values of $\tau$ range between $2^{-[\ldots]}$ and $2^{-[\ldots]}$.

                      Optimal portfolio      Naive portfolio
Period                ρ̂     σ̂     SR        ρ̂     σ̂     SR
1975/07 - 2015/06     14%   37%   38%       15%   60%   26%
1975/07 - 1983/06     21%   41%   49%       29%   63%   47%
1983/07 - 1991/06     15%   38%   40%       7%    59%   12%
1991/07 - 1999/06     14%   29%   49%       15%   50%   30%
1999/07 - 2007/06     12%   33%   37%       17%   58%   29%
2007/07 - 2015/06     8%    43%   18%       10%   69%   14%

Table 2: Comparison between optimal and naive portfolio for FF48. Optimal portfolios contain at most ten active positions ($n_{short} = 48$, $n_{act} = 10$). Reported return and standard deviation are average values over 40 years (first line) and over groups of 8 years (lines 2-6).

Figure 2: Optimal portfolio for FF48, with $n_{short} = 48$, $n_{act} = 10$. Top: active and short positions in optimal portfolios. Bottom: number of modified Bregman iterations. Horizontal axes: year of simulation.

Finally, in table 3 we report a comparison with the results exhibited in Table 1 of paper [7]. We denote with Algorithm 1 the results produced by our optimization procedure and with LARS the results provided in [7]. We refer to the same 30-years simulation period, with the average taken over 5 years for each break-out period. We note that our procedure of regularization parameter selection produces higher values of the Sharpe ratio; since the expected return is fixed by the constraint, this means that we obtain less risky portfolios.

                      Optimal portfolio
                      Algorithm 1            LARS
Period                ρ̂     σ̂     SR        σ̂     SR
1976/07 - 2006/06     17%   37%   46%       41%   41%
1976/07 - 1981/06     23%   43%   53%       48%   49%
1981/07 - 1986/06     23%   36%   64%       41%   57%
1986/07 - 1991/06     9%    45%   20%       45%   20%
1991/07 - 1996/06     16%   21%   76%       26%   62%
1996/07 - 2001/06     16%   38%   42%       40%   40%
2001/07 - 2006/06     13%   39%   33%       43%   30%

Table 3: Comparison between no-short optimal portfolios for FF48 produced by Algorithm 1 and LARS in [7], Table 1. Reported return and standard deviation are average values over 30 years (first line) and over groups of 5 years (lines 2-7).

We here show results on the second database by Fama and French, FF100, containing data of 100 portfolios which are the intersections of 10 portfolios formed on size and 10 portfolios formed on the ratio of book equity to market […] ($T = 5$ years, 40 optimal portfolios constructed). Correlation values observed in FF100 are higher than in the previous test; the conditioning of $C$ is $O(10^{[\ldots]})$.

In table 4, we show optimal no-short portfolios. We report the expected return, the standard deviation and the Sharpe ratio expressed on annual basis. On the overall period, the optimal portfolio outperforms the naive one. The values of $\tau$ range between $2^{-[\ldots]}$ and $2^{-[\ldots]}$, and the percentage of sparsity varies from 4% to 17% (Fig. 3). Note that, looking at details on each year of simulation, we observe negative returns for both optimal and naive portfolio. For instance, in the 8th year of simulation, the optimal portfolio produces a loss of 4%, the naive one of 12%. In the 10th year the losses are of 1% and 10% respectively. This happens because almost all components in portfolios show decreased returns. Finally, we note that in the period […] year of simulation the optimal portfolio, which contains 5 assets (56, […]) […]

                      Optimal portfolio      Naive portfolio
Period                ρ̂     σ̂     SR        ρ̂     σ̂     SR
[…]

Table 4: Comparison between no-short ($n_{short} = 0$, $n_{act} = 100$) optimal and naive portfolio for FF100. Reported return and standard deviation are average values over 40 years (first line) and over groups of 8 years (lines 2-6).

Figure 3: Optimal portfolio for FF100, with $n_{short} = 0$, $n_{act} = 100$. Top: active positions. Bottom: number of modified Bregman iterations. Horizontal axes: year of simulation.

                      Optimal portfolio
                      Algorithm 1            LARS
Period                ρ̂     σ̂     SR        σ̂     SR
1976/07 - 2006/06     16%   48%   33%       53%   30%
1976/07 - 1981/06     12%   54%   22%       59%   21%
1981/07 - 1986/06     24%   44%   55%       49%   49%
1986/07 - 1991/06     10%   61%   16%       65%   15%
1991/07 - 1996/06     19%   29%   66%       31%   61%
1996/07 - 2001/06     18%   52%   35%       52%   35%
2001/07 - 2006/06     11%   49%   22%       55%   21%

Table 5: Comparison between no-short optimal portfolios for FF100 produced by Algorithm 1 and LARS in [7], Table 3. Reported return and standard deviation are average values over 30 years (first line) and over groups of 5 years (lines 2-7).

We here consider a portfolio constructed on real data from the Italian market. It considers the monthly returns of 72 equities, from September 2009 to August 2016. Assets are reported in table 6; 25 assets are included in the FTSE MIB index computation. The FTSE MIB is the primary benchmark index for the Italian equity markets. The index is comprised of highly liquid, leading companies across different sectors; indeed, it captures about 80% of the domestic market capitalization. The FTSE MIB is computed on 40 Italian equities and seeks to replicate the broad sector weights of the Italian stock market.

A2A SPA                        EI TOWERS SPA                  PRIMA INDUSTRIE SPA
ACEA SPA                       EL.EN. SPA                     PRYSMIAN SPA
AUTOGRILL SPA                  ENEL SPA                       RECORDATI SPA
AMPLIFON SPA                   ENI SPA                        REPLY SPA
ATLANTIA SPA                   ERG SPA                        SABAF SPA
AZIMUT HOLDING SPA             EXOR SPA                       SALINI IMPREGILO SPA
BASICNET SPA                   ASSICURAZIONI GENERALI         SAFILO GROUP SPA
BIALETTI INDUSTRIE SPA         HERA SPA                       SAES GETTERS SPA
BANCA MEDIOLANUM SPA           INDUSTRIA MACCHINE AUTOMATIC   SIAS SPA
BANCA MONTE DEI PASCHI SIENA   INTESA SANPAOLO                SOGEFI
BANCO POPOLARE SC              INTESA SANPAOLO-RSP            SOL SPA
BANCA POPOL EMILIA ROMAGNA     ITALCEMENTI SPA                SAIPEM SPA
BREMBO SPA                     ITALMOBILIARE SPA              SNAM SPA
BUZZI UNICEM SPA               ITALMOBILIARE SPA-RSP          SARAS SPA
BUZZI UNICEM SPA-RSP           LEONARDO-FINMECCANICA SPA      ANSALDO STS SPA
CAIRO COMMUNICATIONS SPA       LUXOTTICA GROUP SPA            TELECOM ITALIA SPA
CEMENTIR HOLDING SPA           MARR SPA                       TELECOM ITALIA-RSP
DAVIDE CAMPARI-MILANO SPA      MEDIOBANCA SPA                 TOD’S SPA
CREDITO VALTELLINESE SCARL     MOLECULAR MEDICINE SPA         TERNA SPA
DATALOGIC SPA                  MEDIASET SPA                   UBI BANCA SPA
DANIELI & CO                   MAIRE TECNIMONT SPA            UNICREDIT SPA
DIASORIN SPA                   PANARIAGROUP INDUSTRIE CERAM   UNIPOL GRUPPO FINANZIARIO SP
D’AMICO INTERNATIONAL SHIPPI   PARMALAT SPA                   UNIPOLSAI SPA
DE’LONGHI SPA                  BANCA POPOLARE DI MILANO       ZIGNAGO VETRO SPA

Table 6: IT72 assets.

Starting from September 2009, we use $T = 6$ years of data to build the optimal portfolio from September 2015 until August 2016. The conditioning of $R^T R$ is $O(10^{[\ldots]})$. In figure 4 we graphically show the composition of the optimal portfolio we constructed. The optimization strategy allocates the investor wealth on 14 equities, with weights represented as percentages in the figure, among which 5 belong to the FTSE MIB set. The result is obtained in 10 Bregman iterations, with $\tau = 2^{-[\ldots]}$. We note that the optimal portfolio has return and standard deviation, on annual basis, given by 11% and 34% respectively. The same values for the naive portfolio are −14% and 60%, thus the latter provides a loss to the investor.

Figure 4: Optimal portfolio on Italian market equities. Built on monthly historical returns of 72 equities from September 2009 to August 2016. Weights shown in the chart: AMPLIFON SPA 4%, DAVIDE CAMPARI-MILANO SPA 13%, DATALOGIC SPA 5%, DIASORIN SPA 7%, ENI SPA 8%, LUXOTTICA GROUP SPA 0%, MOLECULAR MEDICINE SPA 2%, SABAF SPA 3%, SOL SPA 8%, SNAM SPA 19%, ANSALDO STS SPA 6%, TELECOM ITALIA-RSP 2%, TERNA SPA 14%, ZIGNAGO VETRO SPA 9%.

6 Conclusions

We have proposed an algorithm, which exploits the Bregman iteration method, for the portfolio selection problem formulated as an $l_1$-regularized mean-variance model. The choice of the regularization parameter is the key point in order to provide solutions with either a limited or null number of negative components and/or a limited number of active positions. Our main contribution is the modification of the Bregman iteration, which adaptively sets the value of the regularization parameter depending on the financial target. It is observed that both sparsity and short-controlling are obtained for sufficiently large values of the regularization parameter. The basic idea is then to generate an increasing sequence of values and to fix the value when the requirements are met. We show that our modification to the Bregman iteration preserves the convergence of the original scheme. Numerical experiments confirm the effectiveness of the proposed algorithm.

We saw in our experiments that sometimes the effectiveness of the optimization strategy can be affected by changes in market conditions. Future work could consider dynamic asset allocation, which involves frequent portfolio adjustments.

Acknowledgments

This work was partially supported by FFABR grant, annuity 2017, and INdAM-GNCS project.

References

[1] H. Markowitz, Portfolio selection, J. Financ. 7 (1) (1952) 77–91.
[2] H. Markowitz, Portfolio selection: efficient diversification of investments, Wiley, 1959.
[3] V. DeMiguel, L. Garlappi, F. Nogales, R. Uppal, A generalized approach to portfolio optimization: improving performance by constraining portfolio norms, Manage Sci. 55 (5) (2009) 798–812.
[4] M. Carrasco, N. Noumon, Optimal portfolio selection using regularization (2012).
[5] Y. Yen, T. Yen, Solving norm constrained portfolio optimization via coordinate-wise descent algorithms, Comput. Stat. Data An. 76 (2014) 737–759.
[6] M. Ho, Z. Sun, J. Xin, Weighted elastic net penalized mean-variance portfolio design and computation, SIAM J. Finan. Math. 6 (1) (2015) 1220–1244.
[7] J. Brodie, I. Daubechies, C. DeMol, D. Giannone, I. Loris, Sparse and stable Markowitz portfolios, PNAS 106 (30) (2009) 12267–12272.
[8] S. Osher, M. Burger, D. Goldfarb, J. Xu, W. Yin, An iterative regularization method for total variation-based image restoration, Multiscale Model Sim. 4 (2) (2005) 460–489.
[9] S. Ma, D. Goldfarb, L. Chen, Fixed point and Bregman iterative methods for matrix rank minimization, Math. Program. 128 (1) (2011) 321–353.
[10] W. Yin, S. Osher, D. Goldfarb, J. Darbon, Bregman iterative algorithms for l1-minimization with applications to compressed sensing, SIAM J. Imaging Sci. 1 (1) (2008) 143–168.
[11] R. Jagannathan, T. Ma, Risk reduction in large portfolios: why imposing the wrong constraints helps, J. Financ. 58 (4) (2003) 1651–1683.
[12] D. G. Luenberger, Y. Ye, Linear and Nonlinear Programming, Springer, 2008.
[13] L. Bregman, The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming, USSR Comput. Math. Math Phys 7 (1967) 200–217.
[14] T. Goldstein, S. Osher, The split Bregman for l1-regularization problems, SIAM J. Imaging Sci. 2 (2) (2009) 323–343.
[15] K. Frick, O. Scherzer, Regularization of ill-posed linear equations by the non-stationary augmented Lagrangian method, J. Integral Equ. Appl. 22 (2) (2010) 217–257.
[16] T. Goldstein, X. Bresson, S. Osher, Geometric applications of the split Bregman method: segmentation and surface reconstruction, J. Sci. Comput. 45 (1) (2010) 272–293.
[17] W. Yin, S. Osher, Error forgetting of Bregman iteration, J. Sci. Comput. 54 (2013) 684–695.
[18] A. Benfenati, V. Ruggiero, Inexact Bregman iteration with an application to Poisson data reconstruction, Inverse Probl. 29 (6) (2013) 065016.
[19] C. Vogel, Computational Methods for Inverse Problems, SIAM, 2002.
[20] A. Beck, M. Teboulle, A fast iterative shrinkage-thresholding algorithm for linear inverse problems, SIAM J. Imaging Sci. 2 (2009) 183–202.
[21] O. Güler, New proximal point algorithms for convex minimization, SIAM J. Optimiz. 2 (4) (1992) 649–664.
[22] Y. Nesterov, A method of solving a convex programming problem with convergence rate O(1/k^2), Soviet Mathematics Doklady 27 (1983) 372–376.
[23] A. Beck, M. Teboulle, Gradient-based algorithms with applications to signal recovery, Convex optimization in signal processing and communications (2009) 42–88.
[24] V. DeMiguel, L. Garlappi, R. Uppal, Optimal versus naive diversification: How inefficient is the 1/N portfolio strategy?, Review of Financial Studies 22 (5) (2009) 1915–1953.