Data-Driven Diverse Logistic Regression Ensembles

Abstract

A novel framework for statistical learning is introduced which combines ideas from regularization and ensembling. This framework is applied to learn an ensemble of logistic regression models for high-dimensional binary classification. In the new framework the models in the ensemble are learned simultaneously by optimizing a multi-convex objective function. To enforce diversity between the models the objective function penalizes overlap between the models in the ensemble. Measures of diversity in classifier ensembles are used to show how our method learns the ensemble by exploiting the accuracy-diversity trade-off for ensemble models. In contrast to other ensembling approaches, the resulting ensemble model is fully interpretable as a logistic regression model, asymptotically consistent, and at the same time yields excellent prediction accuracy as demonstrated in an extensive simulation study and gene expression data applications. The models found by the proposed ensemble methodology can also reveal alternative mechanisms that can explain the relationship between the predictors and the response variable. An open-source compiled software library implementing the proposed method is briefly discussed.


Split Modeling for High-Dimensional Logistic Regression

Anthony-Alexander Christidis, Department of Statistics, University of British Columbia
Stefan Van Aelst, Department of Mathematics, KU Leuven
Ruben Zamar, Department of Statistics, University of British Columbia

Abstract

A novel method is proposed to learn an ensemble of logistic classification models in the context of high-dimensional binary classification. The models in the ensemble are built simultaneously by optimizing a multi-convex objective function. To enforce diversity between the models, the objective function penalizes overlap between the models in the ensemble. We study the bias and variance of the individual models as well as their correlation and discuss how our method learns the ensemble by exploiting the accuracy-diversity trade-off for ensemble models. In contrast to other ensembling approaches, the resulting ensemble model is fully interpretable as a logistic regression model and at the same time yields excellent prediction accuracy as demonstrated in an extensive simulation study and gene expression data applications. An open-source compiled software library implementing the proposed method is briefly discussed.

Keywords:

High-Dimensional Classification; Split Ensemble Learning; Logistic Regression

arXiv preprint [stat.ME]

1 Introduction

High-dimensional classification problems are ubiquitous nowadays and many methods have been developed to tackle this problem. It is well-known that basic methods such as logistic regression for binary classification become unstable in high dimensions and do not generalize well. On the one hand, regularization methods have been developed to obtain sparse solutions (see e.g. Friedman et al. 2010; Hastie et al. 2015). These methods provide results that are interpretable and allow important variables to be identified. On the other hand, ensemble methods have been constructed which yield better prediction performance if the members of the ensemble are sufficiently diverse. This is often established by using randomization (Ho 1998; Breiman 2001; Song et al. 2013) or sequentially fitting the residuals of the previous fit (Friedman 2001; Bühlmann and Yu 2003; Schapire and Freund 2012; Yu et al. 2020). Ensemble methods have been successful in many different areas of applications, such as computer vision (Mu et al. 2009; Rani and Muneeswaran 2018), speech recognition (Krajewski et al. 2010; Rieger et al. 2014) and biological sciences (Dorani et al. 2018; Zahoor and Zafar 2020), among others. However, interpretation of the resulting prediction rules and identification of important predictor variables usually becomes more difficult or even infeasible for such ensemble models. Since predictions may have important consequences in decision-making processes, it becomes necessary, e.g. for legal reasons, to understand how a model arrives at its predictions so that decisions can be fully explained (see e.g. Rudin 2019). Therefore, our goal is to construct an interpretable and explainable model with prediction performance that is at least on par with that of ensemble methods.

We propose a new approach to optimally learn an ensemble of models for high-dimensional classification. As a base estimator for each of the models in the ensemble, we use penalized logistic regression with a sparsity inducing penalty such as the lasso (Donoho and Johnstone 1994; Tibshirani 1996) or the elastic net (Zou and Hastie 2005). Rather than resorting to randomization or other indirect methods to decorrelate the models, we jointly learn the models in the ensemble on the full training data and incorporate a diversity penalty (Christidis et al. 2020) in the objective function to induce low correlation between the models. In this way, the method efficiently exploits the so-called accuracy-diversity trade-off between the models, and generates an ensemble with high predictive performance that often outperforms popular state-of-the-art ensemble methods. Moreover, by ensembling the models at the level of their linear predictors we retain interpretability for the logistic regression coefficients in the ensembled model.

We construct a block coordinate descent algorithm to learn the optimal ensemble for any given number of models. We study the accuracy-diversity trade-off in ensemble models for our method, which essentially boils down to a variance-covariance trade-off in this case. These results provide insight into the effect of the number of models on the resulting ensemble and allow us to make recommendations regarding the choice of the number of models.

In Section 2 we first review the frameworks of learning via regularized risk minimization and learning via ensembling. Then, we formally describe the general split statistical learning framework. This framework is applied in Section 3 to optimally learn an ensemble of logistic classification models. An efficient algorithm to solve the resulting non-convex optimization problem is provided in Section 4, while Section 5 presents the results of an extensive simulation study to compare our method with a large number of state-of-the-art competitors for high-dimensional binary classification. In Section 6, we provide further insight into the good performance of our split statistical learning method based on the bias-variance-covariance trade-off for ensemble models. In Section 7, we perform a benchmark study of the proposed method on a large collection of high-dimensional gene expression data to demonstrate its excellent performance in high-dimensional classification problems. A discussion on optimal ensembles for high-dimensional classification and directions for further research are provided in Section 8.

The main objective in statistical learning is to find a function that yields good generalization performance, i.e. a function trained on observed data that will provide accurate predictions on future unobserved outcomes. This objective is particularly challenging when the number of training samples is relatively small and much smaller than the number of input variables in the data. More formally, let $\mathbf{X}$ be a vector of predictor variables for a binary response $Y$, then we wish to find a function $f$ in some hypothesis space $\mathcal{H}$ which minimizes the expected risk

$$V(f) = \int_{\mathcal{X} \times \mathcal{Y}} L(f(\mathbf{x}), y)\, p_{\mathbf{X},Y}(\mathbf{x}, y)\, d\mathbf{x}\, dy,$$

where $L$ is a given loss function and $p_{\mathbf{X},Y}$ the (true) joint density function of the variables. In practice, we only have training data $(\mathbf{x}_i, y_i)$, $1 \leq i \leq n$, and a proxy for the expected risk is given by the empirical risk

$$V_n(f) = \frac{1}{n} \sum_{i=1}^{n} L(f(\mathbf{x}_i), y_i). \tag{1}$$

When the number of predictors $p$ satisfies $p \gg n$, minimizing the empirical risk leads to overfitting of the training data and poor generalization. In the seminal work by Vapnik (1998), it is shown that generalization error bounds can be obtained effectively by restricting the hypothesis space $\mathcal{H}$ over which $V_n$ is minimized. Furthermore, Evgeniou et al. (1999) showed that Tikhonov regularization (Tikhonov and Arsenin 1977) can ensure good generalization performance. In this case, the solution is given by

$$\hat{f} = \arg\min_{f \in \mathcal{H}} \left\{ V_n(f) + \lambda R(f) \right\}, \tag{2}$$

for some positive parameter $\lambda > 0$ and functional $R : \mathcal{H} \mapsto \mathbb{R}$. The functional $R$ essentially acts as a guidance on where in the hypothesis space $\mathcal{H}$ the solution should be searched for, by penalizing functions $f$ which are more complex. Minimization of the regularized empirical risk (2) is at the heart of learning theory. The penalty $R$ can take many different forms which may or may not induce sparsity of the solution.
The purpose of regularization is to exploit a favorable trade-off between a decrease in variance and an increase of bias, resulting in a model with low generalization error.

Ensemble methods also aim to produce a classifier with good generalization performance. Decomposition of the generalization error of ensemble models in bias, variance and covariance components was first discussed by Ueda and Nakano (1996). Brown et al. (2005) provided an in-depth analysis of the accuracy-diversity and bias-variance-covariance trade-off of ensemble models. In particular, if the ensemble of a collection of estimators $\hat{f}_g \in \mathcal{H}$, $1 \leq g \leq G$, is their average $\bar{f} = \sum_{g=1}^{G} \hat{f}_g / G$, then the generalization error (GE) is given by the usual formula $\text{GE}(\bar{f}) = \text{Bias}(\bar{f})^2 + \text{Var}(\bar{f})$. The bias and variance of the ensemble can be decomposed further as

$$\text{Bias}(\bar{f}) = \overline{\text{Bias}}_G, \quad \text{and} \tag{3}$$

$$\text{Var}(\bar{f}) = \frac{1}{G}\, \overline{\text{Var}}_G + \frac{G-1}{G}\, \overline{\text{Cov}}_G, \tag{4}$$

with

$$\overline{\text{Bias}}_G = \frac{1}{G} \sum_{g=1}^{G} \text{Bias}(\hat{f}_g), \quad \overline{\text{Var}}_G = \frac{1}{G} \sum_{g=1}^{G} \text{Var}(\hat{f}_g), \quad \text{and} \quad \overline{\text{Cov}}_G = \frac{1}{G(G-1)} \sum_{g=1}^{G} \sum_{h \neq g} \text{Cov}(\hat{f}_g, \hat{f}_h).$$

From (4) it becomes clear that if the number of estimators increases, their correlations play a much more critical role than their individual variability to obtain a good ensemble estimator. Random forests are built precisely on this idea. The individual decision trees are weakened by the random selection of candidate features at each node, but the de-correlation of the trees strengthens the ensemble accuracy and results in a smaller upper bound on the generalization error (Breiman 2001).
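The variance decomposition (4) can be checked numerically. The following is a minimal sketch, assuming the joint covariance matrix of the $G$ estimators at a fixed input is known; the function names and the equicorrelated example are illustrative and not taken from the paper.

```python
import numpy as np

def ensemble_variance_terms(cov):
    """Return (average variance, average pairwise covariance, variance of the average)
    for G estimators whose joint covariance matrix is `cov`."""
    cov = np.asarray(cov, dtype=float)
    G = cov.shape[0]
    var_g = np.trace(cov) / G                              # average of the G variances
    cov_g = (cov.sum() - np.trace(cov)) / (G * (G - 1))    # average over ordered pairs h != g
    var_mean = cov.sum() / G**2                            # Var of (1/G) * sum of estimators
    return var_g, cov_g, var_mean

def decomposition_rhs(cov):
    """Right-hand side of (4): Var_G / G + (G - 1) / G * Cov_G."""
    G = np.asarray(cov).shape[0]
    var_g, cov_g, _ = ensemble_variance_terms(cov)
    return var_g / G + (G - 1) / G * cov_g

# Hypothetical example: unit variances and a common correlation rho.
G, rho = 10, 0.3
cov = np.full((G, G), rho) + (1 - rho) * np.eye(G)
var_g, cov_g, var_mean = ensemble_variance_terms(cov)
print(var_mean, decomposition_rhs(cov))  # both sides of (4)
```

In this example the average variance contributes 1/G of its value while the average covariance contributes (G - 1)/G, which is why the covariance term dominates once G is moderately large.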
The need for diversity among the models in an ensemble has been studied more broadly. For example, measures of diversity in classifier ensembles and their relation to ensemble prediction accuracy were investigated by Kuncheva and Whitaker (2003).

We now introduce split learning as a new framework to obtain an ensemble which achieves good generalization. For notational convenience, we define the index set $\mathcal{G} = \{g : 1 \leq g \leq G\}$. Split learning aims for a collection of sparse and diverse models which jointly satisfy

$$\left\{ \hat{f}_g \right\}_{g \in \mathcal{G}} = \arg\min_{f_g \in \mathcal{H}} \left\{ \sum_{g=1}^{G} \left[ V_n(f_g) + \lambda_s R(f_g) \right] + \lambda_d \sum_{g \neq h} D(f_g, f_h) \right\}, \tag{5}$$

where $\lambda_s, \lambda_d > 0$. Similarly as in (2), the functional $R$ regularizes each of the individual models $\hat{f}_g$ and induces sparsity in the models by using well-chosen penalties. We call the additional functional $D : \mathcal{H} \times \mathcal{H} \mapsto \mathbb{R}$ the diversity penalty. The diversity penalty $D$ serves to put a restriction on the learning functions such that the overlap between the models $\hat{f}_g$ is as small as possible. As will be shown in Section 6, the ensemble model then achieves good generalization performance due to reduction of its variance prompted by the diversity penalty.

The ensemble can also be interpreted more intuitively. Because of the sparsity penalty $R$, each of the models $\hat{f}_g$ contains a subset of the available predictors that work well together to model and predict the outcome. Moreover, the diversity penalty $D$ ensures that the different models in the ensemble complement each other well. Finally, the overall ensemble model will also be sparse in general, in the sense that for high-dimensional problems the set of predictors that appear in at least one of the models will be much smaller than the complete set of candidate predictors. Note that the diversity penalty $D$ should have two desirable properties. In the first place, it should encourage the selection of uncorrelated models. Secondly, it should be computationally tractable so that the minimization problem (5) can be solved in a stable and timely manner.

We consider the binary classification problem with the classes labeled as $\mathcal{Y} = \{-1, 1\}$. Let $(\mathbf{y}, \mathbf{X})$ denote the training data, where $\mathbf{y} \in \mathbb{R}^n$ is the vector of class labels and $\mathbf{X} \in \mathbb{R}^{n \times p}$ is the design matrix comprising $n$ measurements on $p$ features. We assume that the predictor variables have been standardized, i.e. $\sum_{i=1}^{n} x_{i,j}/n = 0$ and $\sum_{i=1}^{n} x_{i,j}^2/n = 1$, $1 \leq j \leq p$.
The logistic regression model represents the class-conditional probabilities through a nonlinear function of the predictor variables,

$$p_i = P(Y_i = 1 \mid \mathbf{x}_i) = S(\beta_0 + \mathbf{x}_i^T \boldsymbol{\beta}), \quad 1 \leq i \leq n, \tag{6}$$

where $\beta_0$ and $\boldsymbol{\beta} \in \mathbb{R}^p$ are the intercept and vector of regression coefficients, respectively. The function $S$ is the sigmoid activation function $S(t) = e^t / (1 + e^t)$.

3.1 Split Logistic Regression

We now apply the split statistical learning framework in (5) to logistic regression. Hence, for the loss function in (1) we use the logistic regression loss function with $f(\mathbf{x})$ a linear function, i.e.

$$L(f(\mathbf{x}_i), y_i) = L(\beta_0, \boldsymbol{\beta} \mid y_i, \mathbf{x}_i) = \log\left(1 + e^{-y_i f(\mathbf{x}_i)}\right), \quad f(\mathbf{x}_i) = \beta_0 + \mathbf{x}_i^T \boldsymbol{\beta}. \tag{7}$$

Moreover, the regularization functionals $R$ and $D$ in (5) now become functions of the regression parameters $\{\beta_0^g, \boldsymbol{\beta}^g\}_{g \in \mathcal{G}}$ of the $G$ models. Hence, the resulting objective function has the form

$$O\left(\{\beta_0^g, \boldsymbol{\beta}^g\}_{g \in \mathcal{G}} \,\middle|\, \mathbf{y}, \mathbf{X}\right) = \sum_{g=1}^{G} \left[ \frac{1}{n} \sum_{i=1}^{n} L(\beta_0^g, \boldsymbol{\beta}^g \mid y_i, \mathbf{x}_i) + \lambda_s P_s(\boldsymbol{\beta}^g) \right] + \lambda_d \sum_{h \neq g}^{G} P_d\left(\boldsymbol{\beta}^h, \boldsymbol{\beta}^g\right), \tag{8}$$

which needs to be jointly minimized with respect to all regression coefficients.

The regularization penalty $P_s(\boldsymbol{\beta})$ avoids overfitting of the individual models and induces sparsity in the individual models. In this article, we use the elastic net penalty (Zou and Hastie 2005),

$$P_s(\boldsymbol{\beta}^g) = \frac{1-\alpha}{2} \|\boldsymbol{\beta}^g\|_2^2 + \alpha \|\boldsymbol{\beta}^g\|_1, \quad \alpha \in [0, 1]. \tag{9}$$

Other penalties such as SCAD (Fan and Li 2001) or MCP (Zhang 2010) can be used as well.

The diversity penalty $P_d(\boldsymbol{\beta}^h, \boldsymbol{\beta}^g)$ should favor decorrelated models. We use the diversity penalty of Christidis et al. (2020),

$$P_d(\boldsymbol{\beta}^h, \boldsymbol{\beta}^g) = \sum_{j=1}^{p} |\beta_j^g|\,|\beta_j^h|. \tag{10}$$

Hence, models can share a predictor variable only if this sufficiently improves their fits.

The tuning constants satisfy $\lambda_s, \lambda_d \geq 0$. Letting $\lambda_d \to \infty$ enforces that the diversity penalty $P_d(\boldsymbol{\beta}^h, \boldsymbol{\beta}^g) \to 0$ for all $g \neq h$, such that the active variables in each model will be distinct. In this case the models in the ensemble will be fully diverse in the sense that they do not share any predictor variables. On the other hand, it can be seen that for $\lambda_d = 0$, the solution $(\hat{\beta}_0^g, \hat{\boldsymbol{\beta}}^g)$ for each model is equal to the solution of the logistic elastic net optimization problem with the same penalty parameter $\lambda_s$. Hence, split logistic regression is a generalization of penalized logistic regression. By applying the split statistical learning concept to logistic regression, this method allows automatic selection of $G > 1$ models. Through the tuning constants $\lambda_s$ and $\lambda_d$, split logistic regression optimally distributes variables across the models in the ensemble and at the same time shrinks their contributions to obtain stable models.

3.2 Ensembling models

Minimizing the split logistic regression objective function (8) yields $G$ models with corresponding estimated functions $\hat{f}_g(\mathbf{x}) = \hat{\beta}_0^g + \mathbf{x}^T \hat{\boldsymbol{\beta}}^g$ which are well-suited for creating an ensemble. We propose to use the ensembling function

$$F\left(\left\{\hat{f}_g\right\}_{g \in \mathcal{G}}\right) = S\left(\frac{1}{G} \sum_{g=1}^{G} \hat{f}_g\right). \tag{11}$$

The advantage of this ensembling function is that the actual ensembling takes place at the level of the linear fits, resulting in a linear ensembled fit with coefficients given by $\hat{\beta}_0 = \frac{1}{G} \sum_{g=1}^{G} \hat{\beta}_0^g$ and $\hat{\boldsymbol{\beta}} = \frac{1}{G} \sum_{g=1}^{G} \hat{\boldsymbol{\beta}}^g$. The sigmoid function is then applied to this ensembled fit, hence the resulting ensembled model is still a highly interpretable logistic regression model. Moreover, (11) allows the simple construction of coefficient solution paths for the ensembled model, as shown in the supplementary material.
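To make the pieces of the objective concrete, the following sketch evaluates the logistic loss (7), the elastic net penalty (9), the diversity penalty (10), the objective (8) and the ensembling function (11) with NumPy. It is an illustration only, not the SplitGLM implementation: all function names are hypothetical, and the diversity term is assumed to count each unordered pair of models once, a convention chosen here so that the per-model weight on $|\beta_j^g|$ is $\lambda_d \sum_{h \neq g} |\beta_j^h|$.

```python
import numpy as np

def logistic_loss(beta0, beta, X, y):
    """Average logistic loss (7) for labels y in {-1, +1}."""
    f = beta0 + X @ beta
    return np.mean(np.logaddexp(0.0, -y * f))  # log(1 + exp(-y f)), computed stably

def elastic_net(beta, alpha):
    """Sparsity penalty (9)."""
    return 0.5 * (1 - alpha) * np.sum(beta**2) + alpha * np.sum(np.abs(beta))

def diversity(beta_h, beta_g):
    """Diversity penalty (10): sum over j of |beta_gj| * |beta_hj|."""
    return np.sum(np.abs(beta_h) * np.abs(beta_g))

def split_objective(beta0s, betas, X, y, lam_s, lam_d, alpha):
    """Objective (8). beta0s has shape (G,), betas has shape (G, p).
    Assumption: each unordered pair {g, h} contributes once to the diversity term."""
    G = betas.shape[0]
    fit = sum(
        logistic_loss(beta0s[g], betas[g], X, y) + lam_s * elastic_net(betas[g], alpha)
        for g in range(G)
    )
    div = sum(diversity(betas[h], betas[g]) for g in range(G) for h in range(g + 1, G))
    return fit + lam_d * div

def ensemble_probability(beta0s, betas, X):
    """Ensembling function (11): sigmoid of the averaged linear predictors,
    i.e. a single logistic model with the averaged intercept and coefficients."""
    beta0_bar, beta_bar = beta0s.mean(), betas.mean(axis=0)
    return 1.0 / (1.0 + np.exp(-(beta0_bar + X @ beta_bar)))
```

With disjoint supports across models the diversity term is exactly zero, so the objective then reduces to a sum of separate elastic net logistic objectives.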
The difficulty of obtaining a global minimizer of the objective function (8) is primarily due to the non-convexity of the diversity penalty $P_d$. Note that a global minimum of the objective function (8) exists for any $\lambda_s > 0$, since $O\left(\{\beta_0^g, \boldsymbol{\beta}^g\}_{g \in \mathcal{G}} \,\middle|\, \mathbf{y}, \mathbf{X}\right) \to \infty$ as $\|\boldsymbol{\beta}^g\| \to \infty$ for any $1 \leq g \leq G$.

To construct an efficient computing algorithm, we observe that the objective function is multi-convex: the objective function parameters can be partitioned in such a way that the problem is convex on each set when the others are fixed. A modern and rigorous treatment of multi-convex programming can be found in Shen et al. (2017). In the case of (8), the optimization problem for the parameters $(\beta_0^g, \boldsymbol{\beta}^g)$ of a particular model is a penalized logistic regression problem with a weighted elastic net penalty. Indeed, ignoring constant terms, the objective function for $(\beta_0^g, \boldsymbol{\beta}^g)$ reduces to

$$O\left(\beta_0^g, \boldsymbol{\beta}^g \,\middle|\, \mathbf{y}, \mathbf{X}\right) = \frac{1}{n} \sum_{i=1}^{n} L(\beta_0^g, \boldsymbol{\beta}^g \mid y_i, \mathbf{x}_i) + \frac{\lambda_s (1-\alpha)}{2} \|\boldsymbol{\beta}^g\|_2^2 + \sum_{j=1}^{p} |\beta_j^g|\, u_{j,g},$$

where the weights $u_{j,g}$ in the $L_1$ penalty term are given by

$$u_{j,g} = \alpha \lambda_s + \lambda_d \sum_{h \neq g} |\beta_j^h|.$$

Recent work in non-convex optimization using block coordinate descent algorithms for applications in signal processing and machine learning has been very promising, see Yang et al. (2019) for examples. The block coordinate descent algorithm described in Xu and Yin (2013) is adapted to our problem. The key idea is to sequentially update the current estimate for each model using a quadratic approximation $L_Q$ for the loss function in the objective function

$$\min_{\beta_0^g \in \mathbb{R},\, \boldsymbol{\beta}^g \in \mathbb{R}^p} \left\{ \frac{1}{n} \sum_{i=1}^{n} L_Q(\beta_0^g, \boldsymbol{\beta}^g \mid y_i, \mathbf{x}_i) + \lambda_s P_s(\boldsymbol{\beta}^g) + \lambda_d \sum_{h \neq g}^{G} P_d\left(\boldsymbol{\beta}^h, \boldsymbol{\beta}^g\right) \right\}.$$
More precisely, we cycle through the coordinates of $(\beta_0^1, \boldsymbol{\beta}^1)$ by applying a single coordinate descent update to each variable, then through those of $(\beta_0^2, \boldsymbol{\beta}^2)$, and so on until we reach $(\beta_0^G, \boldsymbol{\beta}^G)$. Then, we check for convergence. The proposed algorithm converges to a coordinate-wise minimizer of (8) by Theorem 4.1 of Tseng (2001).

There are many variants of the block coordinate descent algorithm proposed by Xu and Yin (2013): the order of the updates can be deterministic or stochastic, a deterministic order can be cyclic or greedy, and the update applied to each block can consist of a limited number of coordinate descent iterations or can be an exact minimization by iterating until convergence. Based on our numerical experiments, we have decided to apply the updates in a deterministic, cyclic order, and the update applied to each block is obtained by a single iteration for each variable as summarized in the following proposition. A detailed description of the algorithm, including the derivation of the quadratic approximation of the loss function (7), is given in the supplementary material.

Proposition 1. Let $\{\tilde{\beta}_0^g, \tilde{\beta}_j^g\}_{g \in \mathcal{G}}$ denote the current estimates, then the coordinate descent updates for $\tilde{\beta}_0^g$ and $\tilde{\beta}_j^g$ are given by

$$\hat{\beta}_0^g = \tilde{\beta}_0^g + \frac{\langle \mathbf{z} - \tilde{\mathbf{p}}^g, \mathbf{1}_n \rangle}{\langle \tilde{\mathbf{w}}^g, \mathbf{1}_n \rangle},$$

$$\hat{\beta}_j^g = \frac{\text{Soft}\left( \frac{1}{n} \left( \tilde{r}_j^g + \tilde{\beta}_j^g \langle \mathbf{x}_j^2, \tilde{\mathbf{w}}^g \rangle \right),\; \alpha \lambda_s + \lambda_d \sum_{h \neq g} |\tilde{\beta}_j^h| \right)}{\frac{1}{n} \langle \mathbf{x}_j^2, \tilde{\mathbf{w}}^g \rangle + (1-\alpha) \lambda_s},$$

where $\mathbf{1}_n = (1, \dots, 1)^T \in \mathbb{R}^n$, $\mathbf{x}_j^2$ is the elementwise square of the $j$-th column of $\mathbf{X}$, and $\tilde{r}_j^g = \langle \mathbf{x}_j, \mathbf{z} \rangle - \langle \mathbf{x}_j, \tilde{\mathbf{p}}^g \rangle$ with $z_i = (y_i + 1)/2$. The elements of the $n$-dimensional vectors $\tilde{\mathbf{p}}^g$ and $\tilde{\mathbf{w}}^g$ are given by $\tilde{p}_i^g = S(\tilde{\beta}_0^g + \mathbf{x}_i^T \tilde{\boldsymbol{\beta}}^g)$ and $\tilde{w}_i^g = \tilde{p}_i^g (1 - \tilde{p}_i^g)$, $1 \leq i \leq n$, respectively.

Convergence is declared when successive estimates of the coefficients in the ensemble model show little difference, i.e.

$$\max_{1 \leq j \leq p} |\tilde{\beta}_j - \hat{\beta}_j| < \delta,$$

for some small tolerance level $\delta > 0$.

To select the tuning parameters we alternate between a grid search for the sparsity penalty and a grid search for the diversity penalty, such that the cross-validated loss of the ensemble classifier is minimized. By default, $K$-fold cross-validation is used with $K = 10$. The details of the alternating grid search are available in the supplementary material, including an illustration of the coefficient paths of the models in the ensemble using a sonar data application. Note that the value $\lambda_d = 0$ is included in the grid search for the diversity penalty, such that the (single model) lasso or elastic net is a possible solution of split logistic regression.

The warm-start and active-set cycling strategies proposed by Friedman et al. (2010) are well suited for our algorithm, and our implementation incorporates these cycling acceleration strategies. An open-source C++ software library with multithreading capability implementing the proposed method is available. Details are provided at the end of the article. The compiled library has also been wrapped in the R package SplitGLM, publicly available on the Comprehensive R Archive Network (CRAN). The computational cost of this implementation of our method is explored in Section 6.
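The cyclic updates of Proposition 1 can be sketched in a few lines of NumPy. This is a simplified illustration, not the SplitGLM implementation: it omits warm starts, active-set cycling and the tuning-parameter search; the function names, the zero starting values, and the floor of 1e-5 on the weights are assumptions made here; and the curvature term is taken to be the inner product of the squared $j$-th column with the weights.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def soft(a, thr):
    """Soft-thresholding operator: sign(a) * max(|a| - thr, 0)."""
    return np.sign(a) * max(abs(a) - thr, 0.0)

def split_logistic_cd(X, y, G, lam_s, lam_d, alpha=1.0, tol=1e-6, max_cycles=200):
    """Cyclic coordinate descent for G models.
    X: standardized (n, p) design matrix; y: labels in {-1, +1}."""
    n, p = X.shape
    z = (y + 1) / 2.0                      # labels recoded to {0, 1}
    X2 = X ** 2                            # elementwise squares of the columns
    beta0 = np.zeros(G)
    beta = np.zeros((G, p))
    for _ in range(max_cycles):
        previous = beta.mean(axis=0)       # ensemble coefficients before this cycle
        for g in range(G):
            eta = beta0[g] + X @ beta[g]   # linear predictor of model g
            prob = sigmoid(eta)
            w = np.maximum(prob * (1 - prob), 1e-5)   # assumed floor for stability
            step = np.sum(z - prob) / np.sum(w)       # intercept update
            beta0[g] += step
            eta += step
            for j in range(p):
                prob = sigmoid(eta)
                w = np.maximum(prob * (1 - prob), 1e-5)
                r = X[:, j] @ (z - prob)
                curv = X2[:, j] @ w
                # threshold: alpha * lam_s + lam_d * sum over the other models of |beta_hj|
                thr = alpha * lam_s + lam_d * (np.abs(beta[:, j]).sum() - abs(beta[g, j]))
                new = soft((r + beta[g, j] * curv) / n, thr) / (curv / n + (1 - alpha) * lam_s)
                eta += X[:, j] * (new - beta[g, j])   # keep the linear predictor in sync
                beta[g, j] = new
        if np.max(np.abs(beta.mean(axis=0) - previous)) < tol:
            break
    return beta0, beta
```

The linear predictor is updated incrementally after each coordinate change, so one pass over a model costs on the order of $np$ operations. With `lam_d = 0` every model follows the same sequence of updates, which mirrors the statement that split logistic regression then reduces to the logistic elastic net.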

We investigate the performance of split logistic regression in an extensive simulation study comprised of three scenarios with different configurations. In Scenario 1 all predictors are equally correlated. In Scenario 2 the correlation between active and inactive predictors is lower than the other correlations. In Scenario 3, the active predictors follow a block-correlation structure.
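As an illustration of the equicorrelated design of Scenario 1, the sketch below simulates predictors with a common pairwise correlation and draws labels from a logistic model. Everything beyond the equal correlation is an assumption made for this example: Gaussian predictors, the coefficient range `coef_scale`, equally likely coefficient signs, the zero intercept, and the function name itself.

```python
import numpy as np

def simulate_equicorrelated(n, p, zeta, rho, beta0=0.0, coef_scale=1.0, seed=0):
    """Simulate (X, y, beta) with equicorrelated predictors and labels in {-1, +1}.
    Assumed, not from the paper: Gaussian predictors, coef_scale, sign probability 0.5."""
    rng = np.random.default_rng(seed)
    # A shared factor gives corr(x_j, x_k) = rho for j != k and unit variances.
    common = rng.standard_normal((n, 1))
    X = np.sqrt(rho) * common + np.sqrt(1 - rho) * rng.standard_normal((n, p))
    n_active = int(round(p * zeta))                    # number of active predictors
    beta = np.zeros(p)
    signs = rng.choice([-1.0, 1.0], size=n_active)
    beta[:n_active] = signs * rng.uniform(0.0, coef_scale, size=n_active)
    prob = 1.0 / (1.0 + np.exp(-(beta0 + X @ beta)))   # logistic model for P(Y = 1 | x)
    y = np.where(rng.uniform(size=n) < prob, 1, -1)
    return X, y, beta

X, y, beta = simulate_equicorrelated(n=50, p=1500, zeta=0.1, rho=0.5)
print(X.shape, np.count_nonzero(beta))
```

In this sketch the event probability would be steered through the hypothetical `beta0` argument, and the predictors would still need to be standardized before fitting.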

For all three scenarios, the coefficients of the active variables are randomly generated as $(-1)^{z}\,|u|$, where $z$ is Bernoulli distributed and $u$ is uniformly distributed on an interval with lower endpoint 0. For a sparsity level $\zeta \in (0, 1)$, the number of active predictors is $p\zeta$ and the number of inactive predictors is $p(1 - \zeta)$. We consider the case $p$ = 1,500 and four sparsity levels $\zeta$. For all simulation settings we consider the sample sizes $n = 50$ and $n = 100$ for the training data, with three values of the event probability $P(Y = 1)$.

Scenario 1: Data are generated according to the logistic model

$$\log\left(\frac{p_i}{1 - p_i}\right) = \beta_0 + \mathbf{x}_{A,i}^T \boldsymbol{\beta}_A, \quad 1 \leq i \leq n,$$

where $\mathbf{x}_{A,i}$ are the active predictors and $\boldsymbol{\beta}_A$ the corresponding regression coefficients. In this scenario all predictors are equally correlated with correlation $\rho$, for which three values are considered, one of them being $\rho = 0.5$.

Scenario 2: Data are generated from the logistic model of Scenario 1 with correlation $\rho_1$ between active and inactive predictors and $\rho_2$ for all other correlations, where $\rho_1$ and $\rho_2$ each range over three values and all combinations with $\rho_1 < \rho_2$ are considered.

Scenario 3: Data are generated from a logistic model with active predictors in $B$ disjoint blocks,

$$\log\left(\frac{p_i}{1 - p_i}\right) = \beta_0 + \sum_{b=1}^{B} \mathbf{x}_{b,i}^T \boldsymbol{\beta}_b, \quad 1 \leq i \leq n,$$

where $\mathbf{x}_{b,i}$ are the predictor variables for block $b$ and $\boldsymbol{\beta}_b$ the corresponding regression coefficients. Each block contains 25 predictors, so the number of blocks in each setting equals $B = p\zeta/25$. The correlation between predictors in different blocks and between active and inactive predictors is given by $\rho_1$, while $\rho_2$ is the correlation between predictors in the same block and between inactive predictors. The correlations $\rho_1$ and $\rho_2$ each range over two values and all combinations with $\rho_1 < \rho_2$ are considered.

We compare the performance of split logistic regression with that of nine state-of-the-art competitors as implemented in the R packages listed below, using the default settings for tuning parameters.

1–2. Split-Lasso and Split-EN logistic regression based on G = 10 models, computed using the SplitGLM package.
3–4. Lasso and Elastic Net (EN) logistic regression, computed using the glmnet package.
5–6. Adaptive and Relaxed lasso (Zou 2006; Meinshausen 2007) for logistic regression, computed using the gcdnet and glmnet packages, respectively.
7. Minimum concave (MC+) penalized (Zhang 2010) logistic regression, computed using the ncvreg package.
8. Random Forest (RF) (Breiman 2001), computed using the randomForest package.
9. Random GLM (RGLM) (Song et al. 2013), computed using the RGLM package.
10. Sure Independence Screening (SIS) (Fan and Lv 2008) with the SCAD penalty, computed using the SIS package.
11. Extreme Gradient Boosting (XGBoost) (Chen and Guestrin 2016), computed using the xgboost package.

To compare the methods, we measure for each method the misclassification rate (MR), sensitivity (SE), specificity (SP), test-sample loss (TL) using (7), recall (RC) and precision (PR) on an independent test set of size n = 2,000. Note that recall (RC) and precision (PR) are defined as

$$\text{RC} = \frac{\sum_{j=1}^{p} I(\beta_j \neq 0, \hat{\beta}_j \neq 0)}{\sum_{j=1}^{p} I(\beta_j \neq 0)}, \quad \text{PR} = \frac{\sum_{j=1}^{p} I(\beta_j \neq 0, \hat{\beta}_j \neq 0)}{\sum_{j=1}^{p} I(\hat{\beta}_j \neq 0)},$$

where $\boldsymbol{\beta}$ and $\hat{\boldsymbol{\beta}}$ are the true and estimated regression coefficients, respectively. Since RF and XGBoost use all the predictors, we do not compute their RC and PR.

For each configuration, we randomly generate N = 50 training and test sets and for each of the methods measure average performance on the test sets. In Table 1 the simulation results are summarized by reporting for each performance metric the average rank of the competitors over all simulation settings. Lower ranks indicate better performance. Detailed simulation results are available in the supplementary material.

From Table 1 it can be seen that the split logistic regression methods overall performed best in terms of MR, with the "blackbox" ensemble methods RF and RGLM being the closest competitors. RF and RGLM ranked first in terms of specificity followed by the split regression methods, but the split logistic regression methods outperformed the other methods in terms of sensitivity (particularly for low event probabilities).

Method MR SE SP TL RC PR

Split-Lasso − −

RGLM 3.3 3.4

SIS-SCAD 10.6 10.7 10.5 10.3 8.8

XGBoost 9.0 9.1 8.6 8.3 − −
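The recall (RC) and precision (PR) of variable selection defined in this section depend only on which coefficients are nonzero. A minimal sketch, with a hypothetical function name and made-up coefficient vectors, is:

```python
import numpy as np

def selection_recall_precision(beta_true, beta_hat):
    """Recall and precision of the selected variables.
    Returns NaN for a ratio whose denominator is zero."""
    true_active = np.asarray(beta_true) != 0
    est_active = np.asarray(beta_hat) != 0
    hits = np.sum(true_active & est_active)      # truly active and selected
    recall = hits / true_active.sum() if true_active.any() else float("nan")
    precision = hits / est_active.sum() if est_active.any() else float("nan")
    return recall, precision

# Illustrative vectors: three truly active variables, three selected, two in common.
beta_true = np.array([1.0, -2.0, 0.0, 0.0, 0.5])
beta_hat = np.array([0.8, 0.0, 0.3, 0.0, 0.4])
rc, pr = selection_recall_precision(beta_true, beta_hat)
print(rc, pr)
```

For an ensemble, `beta_hat` would be the averaged coefficient vector, so a variable counts as selected when it appears in at least one model.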

Constructing accurate and diverse models for an ensemble are contradictory objectives (Krogh and Vedelsby 1995). The decomposition in (4) reveals that the generalization error of an ensemble can be made small if the correlation between the individual functions is low, particularly for a larger number of models. In this section, we explore this further for split logistic regression by providing an empirical study of the bias-variance-covariance trade-off. In particular, it turns out that the diversification in our method has little effect on the bias, thus the accuracy-diversity trade-off for ensembles essentially reduces to a variance-covariance trade-off for the individual models.

The analysis of microarray data via high-throughput technologies has generated the need for classi-ﬁcation algorithms that can handle high-dimensional data containing correlated predictors (genes)within diﬀerent pathways or networks, see Youseﬁ et al. (2011) and Zhang and Coombes (2012)for example. In light of this, to investigate the bias-variance-covariance trade-oﬀ of split logisticregression, we use the high-dimensional block correlation setting of Scenario 3 in the previous sec-tion with conﬁguration parameters ( n, p ) = (50 , ρ , ρ ) = (0 . , . ζ ∈ { . , . , . } and P ( Y = 1) = 0 .

4. The generalization error and its components are estimated by taking averagesover the test sets. 12n Table 2, we report the generalization error of the probabilities predicted by the ensembleas well as its components Bias G , Var G and Cov G , as a function of the number of models for splitlogistic regression. The results show that the average bias of the individual models remains smalland stable for all numbers of models. On the other hand, in all three settings the average varianceVar G increases with the number of models while the average covariance Cov G decreases. However,if the number of models increases, the average variance has a small impact on the variance of theensemble compared to the average covariance due to the scaling factor 1 /G in (4). This results ina smaller GE for the ensemble as the number of models increases.Table 2: Generalization error (GE), | Bias G | , Var G , and Cov G of the predicted probabilities as afunction of the number of models under Scenario 3. The output has been scaled by a factor of 100. ζ = . ζ = . ζ = .

4

[Table 2 columns, repeated for each sparsity level $\zeta$: G, GE, Bias_G, Var_G, Cov_G]

For a small number of models the average variance Var_G still plays an important role in the variance of the ensemble. In this case it can be beneficial for models to share important predictors to reduce their variance, although there is a penalty cost controlled by $\lambda_d$. Conversely, for a large number of models the average variance only plays a small role in the variance of the ensemble. In this case it becomes beneficial to minimize the correlation between the models to reduce the generalization error of the ensemble. To achieve this, the diversity penalty aims to minimize their overlap

$$\mathrm{OV} = \frac{\sum_{j=1}^{p} o_j \, I\{o_j \neq 0\}}{\sum_{j=1}^{p} I\{o_j \neq 0\}}, \qquad o_j = \frac{1}{G} \sum_{g=1}^{G} I\{\hat{\beta}_j^g \neq 0\}.$$

An appealing feature of the split learning method is that the penalty cost for overlap, and consequently the variance-covariance trade-off, is controlled by the diversity tuning parameter $\lambda_d$ in its objective function (8). That is, the optimal balance between individual model accuracy and diversity between the models is learned from the data. The results in Table 3 confirm that while the overlap is large in the case of only G = 2 models, this overlap quickly decreases as the number of models grows.

Table 3 shows that while the misclassification rate of the ensemble quickly decreases and stabilizes when the number of models increases, the number of models also affects the recall and precision of the resulting ensemble. For example, for the case $\zeta$ = 0. … 05 for G = 1 to RC = 0.62 for G = 25, while precision drops from PR = 0.53 to PR = 0.44. Hence, the gain in recall is much larger than the loss of precision. However, if prediction …

Table 3: MR, RC, PR and OV as a function of the number of models for Scenario 3.

[Table 3 columns, repeated for each sparsity level ($\zeta$ = . $\zeta$ = . $\zeta$ = .4): G, MR, RC, PR, OV]

$$A_k = \left\{ j : \sum_{g=1}^{G} I\left(\hat{\beta}_j^g \neq 0\right) \geq k \right\}, \qquad 1 \leq k \leq G,$$

which contain the predictors that appear in at least k models. Clearly, $A_G \subseteq A_{G-1} \subseteq \cdots \subseteq A_1$ and the smaller sets contain the more important variables. A high precision can thus be achieved by considering a set $A_k$ for sufficiently large k instead of $A_1$.

Tables 2 and 3 indicate that a larger number of models results in an ensemble with lower GE and MR. However, both errors stabilize quickly, so there are diminishing returns in terms of prediction accuracy versus computational cost. Indeed, we also ran split logistic regression using G = 50 models, but in all cases there is hardly any improvement in GE and MR compared to the ensemble with G = 25 models shown in Tables 2 and 3. In fact, with G = 25 models split logistic regression already achieves nearly full diversity (OV = 0), so little gain is expected by increasing the number of models, while computation time does grow as shown in Table 4. This table contains the average computation time (in CPU seconds) across all sparsity levels as a function of the number of models. This computation time seems to depend linearly on the number of models and is approximately given by 4.88 + 1. … × G for G ≥ …

[Table 4 columns: G, Time]

… G = 10. While this choice is potentially sub-optimal, the split logistic regression methods were already able to outperform state-of-the-art competitors in a large number of scenarios. In real data applications, we propose to determine the number of models in the ensemble by increasing G until the performance, measured either by cross-validation or on a test set, has stabilized. In the next section we apply split logistic regression to gene expression data using cross-validation to determine the optimal number of groups.

Split logistic regression and its competitors in Section 5.2 are applied to ten gene expression data sets. Some details about the data sets, collected from the Gene Expression Omnibus (GEO) database, are given in Table 5. The first eight data sets involve the classification of different types of cancerous cell tissues, whereas the last two data sets involve the identification of psoriasis and multiple sclerosis cell tissues from adjacent normal cell tissue. The data sets are preprocessed by selecting the 10,000 genes with highest mean expression level and then retaining the p genes which are most important in terms of two-sample t-tests (Tibshirani et al. 2003; Hall et al. 2009). We considered the choices p = 100, 250, 500, and 1,000.

Table 5: GEO identification (ID) codes, sample sizes (n), number of genes and data set descriptions.

| GEO ID | n | Genes | Description |
|---|---|---|---|
| GSE5364 | 29 | 22,283 | Esophageal cancerous cell tissue. |
| GSE20347 | 34 | 22,277 | Esophageal cancerous cell tissue. |
| GSE23400 | 106 | 22,283 | Esophageal cancerous cell tissue. |
| GSE23400 | 102 | 22,477 | Esophageal cancerous cell tissue. |
| GSE10245 | 58 | 54,675 | Lung cancerous cell tissue. |
| GSE5364 | 30 | 22,283 | Lung cancerous cell tissue. |
| GSE25869 | 75 | 27,578 | Gastric cancerous cell tissue. |
| GSE5364 | 51 | 22,283 | Thyroid cancerous cell tissue. |
| GSE21942 | 27 | 54,675 | Multiple sclerosis cell tissue. |
| GSE14905 | 54 | 54,675 | Psoriasis cell tissue. |

Each data set is split randomly N = 50 times into a training set and a test set. For the proportion of training data we considered both 0.35 and 0.50, where proportion 0.35 is only used if it results in at least 20 training samples. Each of the methods is applied to the training data and evaluated on the test data using both MR and TL. Methods are ranked according to their performance averaged over the 50 random splits for the four dimensions p and both training data proportions.

Table 6 summarizes the results by showing for each method the number of times it achieved top one, top three and lowest three rank among the ten data sets in terms of both MR and TL. The results clearly show that Split-Lasso and Split-EN are the most stable methods, as they are the only methods that are in the top three for the majority of the data sets and never belong to the worst three methods for both criteria. These methods thus are not only good classifiers, but they also minimize losses, which indicates that they yield good predicted probabilities. The closest competitors are the "blackbox" ensemble methods RF and RGLM, but they are less often in the top three and sometimes also belong to the worst three methods, so they are less stable. Moreover, these methods are also much harder to interpret than the split logistic ensemble models, which are still logistic regression models. Detailed results for the ten gene expression data sets can be found in the supplementary material.

Table 6: Number of top and lowest ranks for MR and TL over the ten gene expression data sets in Table 5.

| Method | Top 1 MR | Top 1 TL | Top 3 MR | Top 3 TL | Low 3 MR | Low 3 TL |
|---|---|---|---|---|---|---|
| Split-Lasso | 1 | 1 | 7 | 8 | 0 | 0 |
| Split-EN | 3 | 7 | 8 | 9 | 0 | 0 |
| Lasso | 0 | 0 | 0 | 0 | 0 | 0 |
| EN | 0 | 0 | 4 | 7 | 1 | 1 |
| Adaptive | 1 | 0 | 2 | 0 | 5 | 3 |
| Relaxed | 0 | 0 | 0 | 0 | 0 | 9 |
| MC+ | 0 | 1 | 1 | 1 | 6 | 3 |
| RF | 3 | 0 | 5 | 3 | 2 | 1 |
| RGLM | 2 | 1 | 2 | 2 | 1 | 0 |
| SIS-SCAD | 0 | 0 | 0 | 0 | 8 | 7 |
| XGBoost | 0 | 0 | 1 | 0 | 7 | 6 |
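The kind of rank summary reported in Table 6 can be computed from a table of averaged performance values. The sketch below is only an illustration and not the authors' code: the function name `rank_summary`, the tie-breaking rule (tied methods keep their input order) and the toy numbers are assumptions made for this example.

```python
def rank_summary(perf, top_k=3):
    """Count how often each method ranks first, in the top `top_k`,
    and in the lowest `top_k` across data sets.

    perf[m][d] is the averaged error (e.g. MR or TL) of method m on
    data set d; smaller values are better.
    """
    n_methods = len(perf)
    n_datasets = len(perf[0])
    top1 = [0] * n_methods
    top = [0] * n_methods
    low = [0] * n_methods
    for d in range(n_datasets):
        # Order methods from best to worst on data set d. sorted() is
        # stable, so ties keep the input order (an assumption here).
        order = sorted(range(n_methods), key=lambda m: perf[m][d])
        for rank, m in enumerate(order):
            top1[m] += int(rank == 0)
            top[m] += int(rank < top_k)
            low[m] += int(rank >= n_methods - top_k)
    return {"top1": top1, "top": top, "low": low}


# Toy example with 5 hypothetical methods and 2 hypothetical data sets.
toy_perf = [
    [0.10, 0.30],
    [0.20, 0.10],
    [0.30, 0.20],
    [0.40, 0.50],
    [0.50, 0.40],
]
toy_summary = rank_summary(toy_perf, top_k=2)
```

With ten data sets and `top_k=3`, the three returned lists correspond to the "Top 1", "Top 3" and "Low 3" columns for one criterion.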

We presented a new approach to learn, in a computationally feasible way, an ensemble of classification models for high-dimensional data with small sample size. The ensemble is learned by balancing the individual model strengths against the uncorrelatedness between the models. This is achieved by optimizing an objective function containing a diversity penalty which favorably exploits the bias-variance-covariance trade-off for ensemble methods. The extensive simulation and gene expression benchmark studies demonstrate the excellent prediction accuracy of the methodology. Its combination of ensemble prediction accuracy and interpretability makes split logistic regression a uniquely powerful tool for data analysis.

The block coordinate descent algorithm is an effective approach to solve the multi-convex optimization problem for split logistic regression. In future research we will investigate whether alternative approaches can further decrease the computational cost of the method. Split logistic regression ensembles the models at the level of the linear predictors. This guarantees high interpretability of the ensemble model, but is not necessarily optimal from a prediction point of view. In future research it will be examined whether alternative ensembling functions can improve on the prediction accuracy of the ensemble.

Split logistic regression can make use of different groups of variables in different models to build an ensemble. Allowing interactions among predictors can be beneficial for our method to further improve the prediction performance. Since split logistic regression can have much higher recall than single-model Lasso and elastic net, our methodology can also be useful to detect important interaction effects that would be missed by these single-model methods. This is important, for example, for gene expression data applications where it is known that gene interaction effects are common.

Ensemble methods are very popular to analyze small sample data with a large number of predictor variables. Split statistical learning provides a framework to build an optimal ensemble model. Similarly to logistic regression, the general split learning framework could be applied to multi-class classification problems, to transform a single model classification method into a powerful ensembling method. Split statistical learning could also be extended to generalized linear models in general.
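Because split logistic regression ensembles the models at the level of the linear predictors, averaging the G linear predictors gives the same result as a single logistic regression model whose intercept and coefficients are the averages over the G models. A minimal sketch of this equivalence follows; the function names and the toy coefficients are hypothetical and are not taken from the SplitGLM library.

```python
import math


def sigmoid(t):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-t))


def linear_predictor(beta0, beta, x):
    """Intercept plus inner product of coefficients and predictors."""
    return beta0 + sum(bj * xj for bj, xj in zip(beta, x))


def ensemble_coefficients(intercepts, betas):
    """Average G intercepts and G coefficient vectors into one model."""
    G = len(betas)
    p = len(betas[0])
    beta0 = sum(intercepts) / G
    beta = [sum(b[j] for b in betas) / G for j in range(p)]
    return beta0, beta


def predict_proba(beta0, beta, x):
    """Predicted probability of the positive class for one observation."""
    return sigmoid(linear_predictor(beta0, beta, x))


# Toy ensemble of G = 2 models using disjoint predictors (made-up values).
toy_intercepts = [0.5, -0.5]
toy_betas = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
x_new = [1.0, 1.0, 3.0]

ens_beta0, ens_beta = ensemble_coefficients(toy_intercepts, toy_betas)
# Average of the individual linear predictors ...
avg_eta = sum(
    linear_predictor(b0, b, x_new) for b0, b in zip(toy_intercepts, toy_betas)
) / len(toy_betas)
# ... coincides with the linear predictor of the averaged model.
ens_eta = linear_predictor(ens_beta0, ens_beta, x_new)
```

The ensemble is therefore still described by a single intercept and coefficient vector, which is what keeps it interpretable as a logistic regression model.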

Software and Computational Details

The details of the algorithm and its alternating grid search are available in the supplementary material. The open-source C++ library with multithreading capability implementing the method in this article is available at https://github.com/AnthonyChristidis/SplitGLM-CPP-Library. The library has also been wrapped in the R package SplitGLM, publicly available on CRAN.

Supplementary Material

The supplementary material contains the details of our algorithm, an illustration of the coefficient paths obtained by our algorithm in a sonar data application, and the full results of our simulation and gene expression data experiments. The data and scripts to replicate the simulations are available at https://doi.org/10.5281/zenodo.4518635.

ACKNOWLEDGMENTS

Part of this work was conducted while Anthony-Alexander Christidis was a UBC Doctoral Researcher at KU Leuven’s Department of Mathematics under a Mitacs Globalink Research Award.

References

Breiman, L. (2001). Random forests. Machine Learning, 45(1):5–32.

Brown, G., Wyatt, J. L., and Tiňo, P. (2005). Managing diversity in regression ensembles. Journal of machine learning research, 6(Sep):1621–1650.

Bühlmann, P. and Yu, B. (2003). Boosting with the L2 loss: regression and classification. Journal of the American Statistical Association, 98(462):324–339.

Chen, T. and Guestrin, C. (2016). Xgboost: A scalable tree boosting system. In Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining, pages 785–794.

Christidis, A.-A., Lakshmanan, L., Smucler, E., and Zamar, R. (2020). Split regularized regression. Technometrics, 62(3):330–338.

Donoho, D. L. and Johnstone, J. M. (1994). Ideal spatial adaptation by wavelet shrinkage. Biometrika, 81(3):425–455.

Dorani, F., Hu, T., Woods, M. O., and Zhai, G. (2018). Ensemble learning for detecting gene-gene interactions in colorectal cancer. PeerJ, 6:e5854.

Evgeniou, T., Pontil, M., and Poggio, T. (1999). A unified framework for regularization networks and support vector machines.

Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Statist. Ass., 96(456):1348–1360.

Fan, J. and Lv, J. (2008). Sure independence screening for ultrahigh dimensional feature space. J. R. Statist. Soc. B, 70(5):849–911.

Friedman, J., Hastie, T., and Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of statistical software, 33(1):1.

Friedman, J. H. (2001). Greedy function approximation: A gradient boosting machine. Ann. Statist., 29(5):1189–1232.

Hall, P., Titterington, D., and Xue, J.-H. (2009). Median-based classifiers for high-dimensional data. Journal of the American Statistical Association, 104(488):1597–1608.

Hastie, T., Tibshirani, R., and Wainwright, M. (2015). Statistical learning with sparsity: the lasso and generalizations. CRC press.

Ho, T. K. (1998). The random subspace method for constructing decision forests. IEEE transactions on pattern analysis and machine intelligence, 20(8):832–844.

Krajewski, J., Batliner, A., and Kessel, S. (2010). Comparing multiple classifiers for speech-based detection of self-confidence-a pilot study. In , pages 3716–3719. IEEE.

Krogh, A. and Vedelsby, J. (1995). Neural network ensembles, cross validation, and active learning. In Advances in neural information processing systems, pages 231–238.

Kuncheva, L. I. and Whitaker, C. J. (2003). Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy. Machine learning, 51(2):181–207.

Meinshausen, N. (2007). Relaxed lasso. Computational Statistics & Data Analysis, 52(1):374–393.

Mu, X., Lu, J., Watta, P., and Hassoun, M. H. (2009). Weighted voting-based ensemble classifiers with application to human face recognition and voice recognition. In , pages 2168–2171. IEEE.

Rani, P. I. and Muneeswaran, K. (2018). Emotion recognition based on facial components. Sādhanā, 43(3):48.

Rieger, S. A., Muraleedharan, R., and Ramachandran, R. P. (2014). Speech based emotion recognition using spectral feature extraction and an ensemble of knn classifiers. In The 9th International Symposium on Chinese Spoken Language Processing, pages 589–593. IEEE.

Rudin, C. (2019). Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, 1(5):206–215.

Schapire, R. E. and Freund, Y. (2012). Boosting: Foundations and Algorithms. The MIT Press.

Shen, X., Diamond, S., Udell, M., Gu, Y., and Boyd, S. (2017). Disciplined multi-convex programming. In , pages 895–900. IEEE.

Song, L., Langfelder, P., and Horvath, S. (2013). Random generalized linear model: a highly accurate and interpretable ensemble predictor. BMC bioinformatics, 14(1):5.

Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. R. Statist. Soc. B, 58(1):267–288.

Tibshirani, R., Hastie, T., Narasimhan, B., and Chu, G. (2003). Class prediction by nearest shrunken centroids, with applications to dna microarrays. Statistical Science, pages 104–117.

Tikhonov, A. N. and Arsenin, V. Y. (1977). Solutions of ill-posed problems. Halsted Press, New York, pages 1–30.

Tseng, P. (2001). Convergence of a block coordinate descent method for nondifferentiable minimization. Journal of optimization theory and applications, 109(3):475–494.

Ueda, N. and Nakano, R. (1996). Generalization error of ensemble estimators. In Proceedings of International Conference on Neural Networks (ICNN’96), volume 1, pages 90–95. IEEE.

Vapnik, V. (1998). Statistical learning theory. Wiley, New York, 1:624.

Xu, Y. and Yin, W. (2013). A block coordinate descent method for regularized multiconvex optimization with applications to nonnegative tensor factorization and completion. SIAM Journal on imaging sciences, 6(3):1758–1789.

Yang, Y., Pesavento, M., Luo, Z.-Q., and Ottersten, B. (2019). Inexact block coordinate descent algorithms for nonsmooth nonconvex optimization. IEEE Transactions on Signal Processing.

Yousefi, M. R., Hua, J., and Dougherty, E. R. (2011). Multiple-rule bias in the comparison of classification rules. Bioinformatics, 27(12):1675–1683.

Yu, B., Qiu, W., Chen, C., Ma, A., Jiang, J., Zhou, H., and Ma, Q. (2020). Submito-xgboost: predicting protein submitochondrial localization by fusing multiple feature information and extreme gradient boosting. Bioinformatics, 36(4):1074–1081.

Zahoor, J. and Zafar, K. (2020). Classification of microarray gene expression data using an infiltration tactics optimization (ito) algorithm. Genes, 11(7):819.

Zhang, C.-H. (2010). Nearly unbiased variable selection under minimax concave penalty. Ann. Statist., 38(2):894–942.

Zhang, J. and Coombes, K. R. (2012). Sources of variation in false discovery rate estimation include sample size, correlation, and inherent differences between groups. BMC bioinformatics, 13(S13):S1.

Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American statistical association, 101(476):1418–1429.

Zou, H. and Hastie, T. (2005). Regularization and variable selection via the elastic net. J. R. Statist. Soc. B, 67(2):301–320.

Appendix A: Details of Computing Algorithm

In this section, we provide the derivation for the quadratic approximation of the logistic regression loss, the high-level steps of the block coordinate descent algorithm, and a detailed description of the alternating grid search for the tuning parameters.

Appendix A.1: Quadratic Approximation

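For the binary classification problem with the classes labeled as $\mathcal{Y} = \{-1, 1\}$, let $\mathbf{y} \in \mathbb{R}^n$ be the vector of class labels and $\mathbf{X} \in \mathbb{R}^{n \times p}$ be the design matrix with sample size $n$ and number of features $p$. The logistic regression loss function is given by

$$\mathcal{L}(f(\mathbf{x}_i), y_i) = \mathcal{L}(\beta_0, \boldsymbol{\beta} \mid y_i, \mathbf{x}_i) = \log\left(1 + e^{-y_i f(\mathbf{x}_i)}\right), \qquad 1 \leq i \leq n,$$

where $f(\mathbf{x}_i) = \beta_0 + \mathbf{x}_i^T \boldsymbol{\beta}$ is a linear function of the predictor variables, and $\beta_0 \in \mathbb{R}$ and $\boldsymbol{\beta} \in \mathbb{R}^p$ are the intercept and vector of regression coefficients.

We denote the design matrix augmented by a column of ones by $\tilde{\mathbf{X}} \in \mathbb{R}^{n \times (p+1)}$ and the vector of regression coefficients augmented by the intercept term by $\boldsymbol{\beta}_A \in \mathbb{R}^{p+1}$. The quadratic approximation for the logistic regression loss at the current estimates $(\tilde{\beta}_0, \tilde{\boldsymbol{\beta}})$ is given by

$$\mathcal{L}_Q(\beta_0, \boldsymbol{\beta} \mid y_i, \mathbf{x}_i) = \nabla\mathcal{L}(\tilde{\beta}_0, \tilde{\boldsymbol{\beta}} \mid y_i, \mathbf{x}_i)^T \left(\boldsymbol{\beta}_A - \tilde{\boldsymbol{\beta}}_A\right) + \frac{1}{2} \left(\boldsymbol{\beta}_A - \tilde{\boldsymbol{\beta}}_A\right)^T H\left(\tilde{\beta}_0, \tilde{\boldsymbol{\beta}} \mid y_i, \mathbf{x}_i\right) \left(\boldsymbol{\beta}_A - \tilde{\boldsymbol{\beta}}_A\right),$$

where the gradient vector and Hessian matrix are given by

$$\nabla\mathcal{L}(\tilde{\beta}_0, \tilde{\boldsymbol{\beta}} \mid y_i, \mathbf{x}_i) = -\tilde{\mathbf{x}}_i^T (z_i - \tilde{p}_i) \quad \text{and} \quad H\left(\tilde{\beta}_0, \tilde{\boldsymbol{\beta}} \mid y_i, \mathbf{x}_i\right) = \tilde{\mathbf{x}}_i^T \tilde{w}_i \tilde{\mathbf{x}}_i.$$

The current probability and weight estimates are $\tilde{p}_i = S(\tilde{\beta}_0 + \mathbf{x}_i^T \tilde{\boldsymbol{\beta}})$ and $\tilde{w}_i = \tilde{p}_i (1 - \tilde{p}_i)$ respectively, and $z_i = (y_i + 1)/2$, $1 \leq i \leq n$. The quadratic approximation can subsequently be rewritten as the weighted least-squares problem

$$\mathcal{L}_Q(\beta_0, \boldsymbol{\beta} \mid y_i, \mathbf{x}_i) = \frac{1}{2} \tilde{w}_i \left(\tilde{y}_i - f(\mathbf{x}_i)\right)^2 + C\left(\tilde{\beta}_0, \tilde{\boldsymbol{\beta}}\right),$$

where $\tilde{y}_i = \tilde{\beta}_0 + \mathbf{x}_i^T \tilde{\boldsymbol{\beta}} + \frac{z_i - \tilde{p}_i}{\tilde{w}_i}$ and $C(\tilde{\beta}_0, \tilde{\boldsymbol{\beta}})$ is a constant term.

Appendix A.2: Block Coordinate Descent Algorithm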
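To make the link between the quadratic approximation of Appendix A.1 and the coordinate descent updates of this appendix concrete, the sketch below computes the current probabilities and weights and then performs one soft-thresholded update of a single coefficient in one group. It is an illustrative reading of the update formula under stated assumptions (labels in {-1, 1}, inner products taken with the squared predictor column, a 1/n scaling); the function names are hypothetical and this is not the SplitGLM implementation.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def soft(a, t):
    """Soft-thresholding operator: sign(a) * max(|a| - t, 0)."""
    return math.copysign(max(abs(a) - t, 0.0), a)


def update_coefficient(X, y, beta0, betas, g, j, lam_s, lam_d, alpha):
    """One coordinate descent update of coefficient j in group g.

    X: list of n rows, each a list of p floats.
    y: list of n labels in {-1, +1}.
    beta0: list of G intercepts; betas: list of G coefficient lists.
    Assumes the denominator below is positive.
    """
    n = len(X)
    z = [(yi + 1) / 2 for yi in y]  # labels mapped to {0, 1}
    # Current fitted probabilities and weights for group g.
    p_hat = [
        sigmoid(beta0[g] + sum(b * x for b, x in zip(betas[g], row)))
        for row in X
    ]
    w = [pi * (1.0 - pi) for pi in p_hat]
    # r_j = <x_j, z> - <x_j, p_hat>, and <x_j^2, w>.
    r_j = sum(row[j] * (zi - pi) for row, zi, pi in zip(X, z, p_hat))
    xw_j = sum(row[j] ** 2 * wi for row, wi in zip(X, w))
    # L1 weight: sparsity part plus diversity part from the other groups.
    u = alpha * lam_s + lam_d * sum(
        abs(betas[h][j]) for h in range(len(betas)) if h != g
    )
    numerator = soft((r_j + betas[g][j] * xw_j) / n, u)
    return numerator / (xw_j / n + (1.0 - alpha) * lam_s)


# Toy data: one predictor, two observations (made-up values).
X_toy = [[1.0], [-1.0]]
y_toy = [1, -1]
# Without penalties the update is a plain Newton-type step.
step_free = update_coefficient(X_toy, y_toy, [0.0], [[0.0]], 0, 0, 0.0, 0.0, 1.0)
# If another group already uses the predictor strongly, the diversity
# penalty shrinks the update for this group to zero.
step_div = update_coefficient(
    X_toy, y_toy, [0.0, 0.0], [[0.0], [5.0]], 0, 0, 0.0, 1.0, 1.0
)
```

In the second call the threshold equals the diversity penalty times the absolute coefficient held by the other group, which is how overlap between models is discouraged.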

The objective function in the main article is multi-convex and can be written as a weighted elastic net problem in each group, where the L1 penalty depends on the parameters in the other groups. In particular, for a fixed group $g$, the objective function is given by

$$O\left(\beta_0^g, \boldsymbol{\beta}^g \mid \mathbf{y}, \mathbf{X}\right) = \frac{1}{n} \sum_{i=1}^{n} \mathcal{L}(\beta_0^g, \boldsymbol{\beta}^g \mid y_i, \mathbf{x}_i) + \frac{\lambda_s (1 - \alpha)}{2} \|\boldsymbol{\beta}^g\|_2^2 + \sum_{j=1}^{p} |\beta_j^g| \, u_{j,g}, \qquad 1 \leq g \leq G,$$

where $u_{j,g} = \alpha \lambda_s + \lambda_d \sum_{h \neq g} |\beta_j^h|$. We apply a block coordinate descent algorithm by cycling through the parameters of one group at a time. When cycling through the parameters of a group, a single coordinate descent update is applied for each parameter.

For notational convenience, denote by $\tilde{\mathbf{p}}^g$, $\tilde{\mathbf{w}}^g$ and $\tilde{\mathbf{r}}^g$ the vectors of fitted probabilities, weights and transformed residuals for group $g$, $1 \leq g \leq G$. The coordinate descent updates are obtained by minimizing the penalized quadratic approximation of the fixed-group objective function at the current parameter estimates for the ensemble. For parameter $j$ of fixed group $g$, $1 \leq j \leq p$, the coordinate descent update is given by

$$\begin{aligned}
\hat{\beta}_j^g &= \arg\min_{\beta_j^g \in \mathbb{R}} \; \frac{1}{n} \sum_{i=1}^{n} \mathcal{L}_Q(\beta_0^g, \boldsymbol{\beta}^g \mid y_i, \mathbf{x}_i) + \frac{\lambda_s (1 - \alpha)}{2} \|\boldsymbol{\beta}^g\|_2^2 + \sum_{j=1}^{p} |\beta_j^g| \, u_{j,g} \\
&= \arg\min_{\beta_j^g \in \mathbb{R}} \; \frac{1}{2n} \sum_{i=1}^{n} \tilde{w}_i \left( \tilde{y}_i^g - \beta_0^g - \sum_{k \neq j} x_{ik} \tilde{\beta}_k^g - \beta_j^g x_{ij} \right)^2 + \frac{\lambda_s (1 - \alpha)}{2} \left(\beta_j^g\right)^2 + |\beta_j^g| \, u_{j,g} \\
&= \frac{\mathrm{Soft}\left( \frac{1}{n} \left( \tilde{r}_j^g + \tilde{\beta}_j^g \langle \mathbf{x}_j^2, \tilde{\mathbf{w}}^g \rangle \right), \; \alpha \lambda_s + \lambda_d \sum_{h \neq g} |\tilde{\beta}_j^h| \right)}{\frac{1}{n} \langle \mathbf{x}_j^2, \tilde{\mathbf{w}}^g \rangle + (1 - \alpha) \lambda_s},
\end{aligned}$$

where $\tilde{r}_j^g = \langle \mathbf{x}_j, \mathbf{z} \rangle - \langle \mathbf{x}_j, \tilde{\mathbf{p}}^g \rangle$, $\mathbf{x}_j^2$ denotes the elementwise square of the $j$-th column of $\mathbf{X}$, and the last equality follows from the optimality condition for subgradients. After each coordinate descent update for group $g$, the vectors $\tilde{\mathbf{p}}^g$, $\tilde{\mathbf{w}}^g$ and $\tilde{\mathbf{y}}^g$ must be updated. A similar derivation can be made for the coordinate descent update of the intercept term in Proposition 1 of the main article,

$$\hat{\beta}_0^g = \tilde{\beta}_0^g + \frac{\langle \mathbf{z} - \tilde{\mathbf{p}}^g, \mathbf{1}_n \rangle}{\langle \tilde{\mathbf{w}}^g, \mathbf{1}_n \rangle}.$$

After a complete cycle through the $G$ groups in the ensemble, a check for convergence is performed at the level of the final model coefficient $\hat{\boldsymbol{\beta}} = \frac{1}{G} \sum_{g=1}^{G} \hat{\boldsymbol{\beta}}^g$, where the convergence criterion is given by $\max_{1 \leq j \leq p} |\tilde{\beta}_j - \hat{\beta}_j| < \delta$. The high-level steps of the block coordinate descent algorithm are given in Algorithm 1.

Appendix A.3: Alternating Grid Search for Tuning Parameters
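The alternating search described in this appendix can be sketched as follows. This is a simplified illustration under stated assumptions: `cv_loss` is a hypothetical user-supplied function returning the cross-validated loss for a pair of tuning parameters, the grid maxima are passed in rather than estimated by an internal search, and the ratio `eps` between the smallest and largest grid value is a placeholder default.

```python
import math


def log_grid(lam_max, n_points=100, eps=1e-4):
    """Log-equispaced values from lam_max down to eps * lam_max."""
    hi = math.log(lam_max)
    lo = math.log(eps * lam_max)
    step = (hi - lo) / (n_points - 1)
    return [math.exp(hi - k * step) for k in range(n_points)]


def alternating_search(cv_loss, lam_s_max, lam_d_max, n_points=100, eps=1e-4):
    """Alternate grid searches over lam_s and lam_d until the
    cross-validated loss stops decreasing.

    cv_loss(lam_s, lam_d) -> float is supplied by the caller.
    """
    lam_s_opt, lam_d_opt = None, 0.0  # start with no diversity penalty
    best = math.inf
    search_sparsity = True
    while True:
        if search_sparsity:
            grid = log_grid(lam_s_max, n_points, eps)
            candidates = [(ls, lam_d_opt) for ls in grid]
        else:
            grid = log_grid(lam_d_max, n_points, eps)
            candidates = [(lam_s_opt, ld) for ld in grid]
        # Decreasing grid order matches the warm-start direction.
        losses = [cv_loss(ls, ld) for ls, ld in candidates]
        k = min(range(len(losses)), key=losses.__getitem__)
        if losses[k] >= best:
            break  # no further decrease: keep the previous optimum
        best = losses[k]
        lam_s_opt, lam_d_opt = candidates[k]
        search_sparsity = not search_sparsity
    return lam_s_opt, lam_d_opt, best


def toy_cv_loss(lam_s, lam_d):
    """Made-up loss with its minimum near lam_s = 0.1, lam_d = 0.01."""
    return (math.log10(lam_s) + 1.0) ** 2 + (math.log10(lam_d + 1e-12) + 2.0) ** 2


toy_s, toy_d, toy_best = alternating_search(toy_cv_loss, 1.0, 1.0, n_points=5)
```

In practice each call to `cv_loss` would run the block coordinate descent algorithm on the cross-validation folds, so the cost of the search is dominated by those fits.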

The selection of the sparsity and diversity tuning parameters, $\lambda_s$ and $\lambda_d$, is done by an alternating grid search. The first grid search is done over $\lambda_s$ with a fixed diversity tuning parameter $\lambda_d^{opt} = 0$, which yields the optimal value minimizing the cross-validated loss (CVL), $\lambda_s^{opt}$. Using the fixed sparsity value $\lambda_s^{opt}$, we similarly perform a grid search over $\lambda_d$ which yields $\lambda_d^{opt}$. This process is repeated until the CVL no longer decreases.

To construct a grid over $\lambda_s$, we estimate a value $\lambda_s^{max}$ that makes all models null. In the special case where $\lambda_d = 0$ and $\alpha > 0$, it can easily be shown that $\lambda_s^{max} = \frac{1}{\alpha} \max_{1 \leq j \leq p} |\bar{x}_j|$. For a fixed diversity penalty $\lambda_d > 0$, we estimate the smallest $\lambda_s^{max}$ that makes all models null by performing an internal grid search. We then build the grid for the sparsity penalty $\lambda_s$ for a fixed diversity penalty $\lambda_d$ similarly to the case of (single-group) penalized logistic regression: we use (by default) 100 log-equispaced points between $\epsilon \lambda_s^{max}$ and $\lambda_s^{max}$, where $\epsilon = 10^{-…}$ if $p < n$ and $10^{-…}$ otherwise. The smallest diversity penalty $\lambda_d^{max}$ that makes the models fully disjoint for some fixed $\lambda_s \geq$ … $\lambda_s$ analogously using (by default) 100 log-equispaced points between $\epsilon \lambda_d^{max}$ and $\lambda_d^{max}$. For a grid search over one of the tuning parameters while keeping the other one fixed, we use warm-starts by computing solutions for a decreasing sequence of $\lambda_s$ or $\lambda_d$, leading to a more stable algorithm. The high-level steps of the alternating grid search are given in Algorithm 2.

Algorithm 1: Block coordinate descent for split logistic regression

Inputs: Design matrix $\mathbf{X} \in \mathbb{R}^{n \times p}$, response vector $\mathbf{y} \in \mathbb{R}^n$, $\alpha \in [0, 1]$, number of groups $G$, tuning parameters $\lambda_s, \lambda_d \geq 0$, and warm-start initial estimates $\{\tilde{\beta}_0^g, \tilde{\boldsymbol{\beta}}^g\}_{g \in \mathcal{G}}$.

    function BlockCoordinate(X, y, α, G, λ_s, λ_d)
      do
        {β̂_0^g, β̂^g}_{g∈G} ← {β̃_0^g, β̃^g}_{g∈G}
        for g = 1 to G do
          β̃_0^g ← UpdateIntercept(g, {β̃_0^g, β̃^g}_{g∈G})
          (p̃^g, w̃^g, r̃^g) ← UpdateComponents(β̃_0^g, β̃^g)
          for j = 1 to p do
            β̃_j^g ← UpdateParameter(g, j, {β̃_0^g, β̃^g}_{g∈G})
            (p̃^g, w̃^g, r̃^g) ← UpdateComponents(β̃_0^g, β̃^g)
          end for
        end for
      while max_{1≤j≤p} | (1/G) Σ_{g=1}^{G} β̃_j^g − (1/G) Σ_{g=1}^{G} β̂_j^g | ≥ δ
      return {β̃_0^g, β̃^g}_{g∈G}
    end function

Appendix B: Sonar Data - Coefficient Paths

To illustrate the coefficient solution paths of our computing algorithm, we apply split logistic regression with only G = 2 groups to a sonar data application with a moderate number of variables. The sonar data set, publicly available on the UCI Machine Learning Repository under the name … n = 208 observations of p = 60 sonar energy measurements obtained by bouncing frequency-modulated chirp signals at 60 different angles on either a rock (n = 111) or a mine (n = 97), the targets for this binary classification problem. A preliminary exploratory analysis of the data reveals the presence of high correlation between consecutive angle measurements. We compare the coefficient paths for the eight highest coefficients in each group and the ensemble in Figure 1 as a function of $\lambda_s$ for the optimally selected $\lambda_d$. Tuning parameter selections for $\lambda_d$ and $\lambda_s$ were carried out by tenfold cross-validation (CV).

Algorithm 2: Alternating grid search for split logistic regression

Inputs: Design matrix $\mathbf{X} \in \mathbb{R}^{n \times p}$, response vector $\mathbf{y} \in \mathbb{R}^n$, $\alpha \in [0, 1]$, number of groups $G$, number of cross-validation folds $K$, number of tuning parameter elements for sparsity $L_s$ and diversity $L_d$.

    function cvSplitLogistic(X, y, α, G, K, L_s, L_d)
      λ_d^opt ← 0
      Sparsity ← true
      do
        if Sparsity then
          λ_s^max ← SparsitySearch(X, y, α, G, λ_d^opt)
          λ_s ← GenerateGrid(L_s, λ_s^max)
          for j = 0 to L_s do
            CVL_j ← cvBlockCoordinate(X, y, α, G, K, λ_s^j, λ_d^opt)
          end for
          if min(CVL) is decreasing then
            λ_s^opt ← λ_s^MinIndex(CVL)
          end if
          Sparsity ← false
        else
          λ_d^max ← DiversitySearch(X, y, α, G, λ_s^opt)
          λ_d ← GenerateGrid(L_d, λ_d^max)
          for j = 0 to L_d do
            CVL_j ← cvBlockCoordinate(X, y, α, G, K, λ_s^opt, λ_d^j)
          end for
          if min(CVL) is decreasing then
            λ_d^opt ← λ_d^MinIndex(CVL)
          end if
          Sparsity ← true
        end if
      while min(CVL) is decreasing
      return (λ_s^opt, λ_d^opt)
    end function

As a preliminary study of the effect of split modeling on prediction for logistic regression, we mimic a toy version of a high-dimensional scenario by randomly dividing the data set 100 times into a training and a testing set of 25 and 75 percent of the observations, such that the dimension of the data p = 60 is slightly greater than the sample size n_train = 52, and n_test = 155. We compare the prediction results of split logistic regression with G = 2 to the logistic elastic net model (G = 1) in Table 7. Even for a case where the dimension of the problem is relatively low with p = 60 and we use only G = 2 groups, the MR of the ensemble split model is lower than that of the logistic elastic net. It is interesting to note that the individual split models have similar prediction performances to the logistic elastic net, and they also have on average roughly the same size as the logistic elastic net.

Table 7: Average (standard deviation) of misclassification rates (MRs) and number of variables selected (NV) for the logistic elastic net model, the two individual split models, and the logistic split ensemble.

| Method | MR | NV |
|---|---|---|
| Elastic Net | 0.281 (0.045) | 15.7 |
| Split Model 1 | 0.278 (0.041) | 15.7 |
| Split Model 2 | 0.283 (0.043) | 16.1 |
| Split Ensemble | 0.276 (0.042) | 21.9 |

Energy measurements 35 and 51 are included in the set $A_1$ for approximately half of the replications, but only the latter energy measurement was included in $A_2$ in the majority of the cases when it was selected.

[Figure 1 panels: (a) coefficient path of the first split model, (b) coefficient path of the second split model, (c) ensemble coefficient path]

Figure 1: Sonar data coefficient paths for optimal $\lambda_d$ as a function of $\lambda_s$ for each split model using G = 2. The eight variables with highest absolute coefficient values are included in each plot. The dotted line in each plot represents the final model selected.

Appendix C: Full Results of Simulation Study

In this section, the full results of the simulation study are reported. We report the misclassification rate (MR), sensitivity (SE), specificity (SP), test-sample loss (TL), recall (RC) and precision (PR) for the eleven methods listed in the main paper. The simulation results are given for the three Scenarios, as well as the correlation parameters (ρ, ρ and ρ), sample sizes (n), sparsity levels (ζ), and probability of positive events (π).

• Scenario 1:
  – Tables 2-7 contain the results for the MR, SE and SP.
  – Tables 8-13 contain the results for the TL, RC and PR.
• Scenario 2:
  – Tables 14-25 contain the results for the MR, SE and SP.
  – Tables 26-37 contain the results for the TL, RC and PR.
• Scenario 3:
  – Tables 38-43 contain the results for the MR, SE and SP.
  – Tables 44-49 contain the results for the TL, RC and PR.

Appendix D: Full Results for Gene Expression Data

Tables 50-65 contain the full results for the gene expression data benchmark study. For eachcombination of data set and training set proportion, the relative performances for the MR andTL are reported for the number of genes preserved following the preprocessing step, where weconsidered p = 100, 250, 500 and 1,000. The entry 1.00 in each column corresponds to the bestmethod, and the results for the other methods are reported as a ratio over the performance of thebest performer. 28able 8: MR, SE and SP for Scenario 1, ρ = 0.2, n = 50, p = 1,500. ζ = . ζ = . ζ = . ζ = . π Method MR SE SP MR SE SP MR SE SP MR SE SP

Split-Lasso 0.16 0.44 0.95 0.12 0.49 0.97 0.10 0.60 0.98 0.09 0.57 0.99Split-EN 0.16 0.46 0.95 0.12 0.52 0.97 0.09 0.63 0.98 0.09 0.62 0.99Lasso 0.18 0.34 0.95 0.16 0.37 0.96 0.15 0.43 0.97 0.13 0.45 0.97EN 0.17 0.38 0.95 0.14 0.42 0.97 0.13 0.50 0.97 0.12 0.50 0.98Adaptive 0.21 0.07 0.99 0.19 0.09 0.99 0.19 0.14 0.99 0.18 0.16 0.990.2 Relaxed 0.19 0.34 0.94 0.17 0.38 0.95 0.15 0.50 0.94 0.14 0.51 0.95MC+ 0.21 0.20 0.95 0.20 0.18 0.96 0.20 0.22 0.96 0.19 0.22 0.96RF 0.17 0.27 0.98 0.16 0.23 1.00 0.16 0.26 1.00 0.15 0.23 1.00RGLM 0.16 0.38 0.97 0.15 0.32 0.99 0.14 0.36 0.99 0.14 0.33 1.00SIS-SCAD 0.22 0.26 0.92 0.20 0.19 0.95 0.20 0.22 0.95 0.19 0.21 0.95XGBoost 0.19 0.31 0.94 0.18 0.26 0.97 0.17 0.29 0.97 0.17 0.26 0.97 π Method MR SE SP MR SE SP MR SE SP MR SE SP

Split-Lasso 0.19 0.58 0.91 0.15 0.70 0.92 0.11 0.74 0.96 0.10 0.75 0.97Split-EN 0.19 0.58 0.91 0.14 0.71 0.92 0.10 0.76 0.96 0.09 0.77 0.97Lasso 0.23 0.52 0.88 0.19 0.61 0.90 0.16 0.62 0.93 0.16 0.62 0.94EN 0.21 0.53 0.90 0.17 0.65 0.91 0.14 0.67 0.95 0.13 0.67 0.95Adaptive 0.27 0.25 0.94 0.24 0.35 0.94 0.22 0.38 0.96 0.20 0.42 0.960.3 Relaxed 0.23 0.52 0.88 0.20 0.60 0.89 0.17 0.63 0.92 0.16 0.65 0.92MC+ 0.27 0.39 0.88 0.25 0.44 0.89 0.24 0.39 0.92 0.23 0.44 0.90RF 0.21 0.44 0.95 0.17 0.52 0.97 0.16 0.49 0.99 0.15 0.50 0.99RGLM 0.20 0.53 0.93 0.16 0.60 0.95 0.15 0.57 0.98 0.14 0.57 0.98SIS-SCAD 0.28 0.40 0.85 0.27 0.44 0.85 0.26 0.44 0.88 0.26 0.45 0.87XGBoost 0.25 0.44 0.89 0.22 0.48 0.91 0.21 0.48 0.93 0.21 0.48 0.92 π Method MR SE SP MR SE SP MR SE SP MR SE SP

Split-Lasso 0.21 0.70 0.86 0.16 0.77 0.89 0.11 0.83 0.93 0.09 0.84 0.95Split-EN 0.21 0.70 0.86 0.15 0.78 0.89 0.11 0.83 0.93 0.09 0.86 0.95Lasso 0.26 0.59 0.84 0.21 0.69 0.85 0.18 0.72 0.88 0.17 0.75 0.89EN 0.24 0.62 0.85 0.19 0.72 0.87 0.15 0.76 0.91 0.13 0.80 0.92Adaptive 0.31 0.41 0.88 0.26 0.52 0.88 0.24 0.55 0.90 0.21 0.61 0.900.4 Relaxed 0.26 0.60 0.84 0.21 0.71 0.84 0.19 0.74 0.86 0.18 0.75 0.86MC+ 0.31 0.50 0.82 0.28 0.57 0.82 0.26 0.61 0.83 0.26 0.61 0.83RF 0.21 0.64 0.89 0.17 0.68 0.93 0.14 0.71 0.96 0.12 0.74 0.97RGLM 0.21 0.67 0.87 0.17 0.72 0.91 0.14 0.74 0.94 0.12 0.76 0.95SIS-SCAD 0.32 0.56 0.75 0.30 0.58 0.78 0.29 0.59 0.79 0.28 0.59 0.80XGBoost 0.27 0.59 0.81 0.25 0.60 0.85 0.24 0.62 0.86 0.23 0.64 0.8629able 9: MR, SE and SP for Scenario 1, ρ = 0.5, n = 50, p = 1,500. ζ = . ζ = . ζ = . ζ = . π Method MR SE SP MR SE SP MR SE SP MR SE SP

Split-Lasso 0.10 0.69 0.96 0.07 0.76 0.97 0.05 0.80 0.98 0.05 0.80 0.99Split-EN 0.09 0.69 0.96 0.07 0.78 0.97 0.05 0.82 0.98 0.05 0.82 0.99Lasso 0.12 0.61 0.95 0.10 0.66 0.96 0.09 0.68 0.97 0.09 0.69 0.97EN 0.11 0.64 0.96 0.08 0.72 0.97 0.07 0.74 0.98 0.07 0.74 0.98Adaptive 0.15 0.33 0.98 0.13 0.45 0.98 0.11 0.49 0.98 0.13 0.43 0.990.2 Relaxed 0.12 0.61 0.95 0.10 0.69 0.95 0.09 0.72 0.95 0.10 0.72 0.95MC+ 0.16 0.41 0.94 0.16 0.44 0.94 0.15 0.40 0.96 0.16 0.38 0.96RF 0.11 0.55 0.98 0.09 0.58 0.99 0.08 0.59 1.00 0.09 0.56 1.00RGLM 0.10 0.60 0.97 0.08 0.64 0.99 0.07 0.65 0.99 0.08 0.63 1.00SIS-SCAD 0.17 0.32 0.96 0.16 0.35 0.96 0.16 0.31 0.97 0.16 0.31 0.97XGBoost 0.14 0.47 0.96 0.13 0.48 0.97 0.13 0.51 0.96 0.13 0.47 0.97 π Method MR SE SP MR SE SP MR SE SP MR SE SP

Split-Lasso 0.12 0.77 0.93 0.08 0.85 0.95 0.06 0.85 0.97 0.05 0.88 0.98Split-EN 0.12 0.78 0.93 0.08 0.86 0.95 0.06 0.87 0.98 0.05 0.89 0.98Lasso 0.14 0.73 0.91 0.11 0.79 0.93 0.10 0.77 0.96 0.09 0.80 0.95EN 0.13 0.75 0.92 0.10 0.83 0.94 0.08 0.81 0.97 0.07 0.84 0.97Adaptive 0.16 0.58 0.95 0.14 0.64 0.95 0.12 0.67 0.97 0.12 0.69 0.970.3 Relaxed 0.15 0.73 0.90 0.12 0.80 0.91 0.11 0.79 0.93 0.10 0.81 0.93MC+ 0.19 0.58 0.91 0.18 0.65 0.90 0.17 0.60 0.93 0.17 0.61 0.92RF 0.12 0.71 0.95 0.09 0.77 0.97 0.09 0.72 0.99 0.08 0.76 0.99RGLM 0.12 0.74 0.94 0.09 0.80 0.96 0.08 0.76 0.98 0.07 0.81 0.98SIS-SCAD 0.20 0.55 0.90 0.19 0.59 0.91 0.19 0.56 0.92 0.19 0.55 0.92XGBoost 0.16 0.62 0.93 0.15 0.68 0.93 0.15 0.61 0.95 0.14 0.66 0.95 π Method MR SE SP MR SE SP MR SE SP MR SE SP

Split-Lasso 0.13 0.83 0.89 0.09 0.89 0.92 0.06 0.92 0.95 0.05 0.92 0.96Split-EN 0.13 0.84 0.89 0.09 0.89 0.92 0.06 0.92 0.95 0.05 0.93 0.97Lasso 0.15 0.80 0.88 0.12 0.85 0.90 0.10 0.87 0.92 0.10 0.86 0.93EN 0.14 0.81 0.89 0.11 0.87 0.91 0.08 0.89 0.94 0.07 0.89 0.95Adaptive 0.17 0.73 0.90 0.14 0.80 0.90 0.11 0.84 0.92 0.11 0.82 0.940.4 Relaxed 0.16 0.80 0.87 0.14 0.84 0.87 0.11 0.86 0.91 0.11 0.86 0.91MC+ 0.20 0.73 0.85 0.18 0.77 0.86 0.16 0.78 0.88 0.16 0.77 0.88RF 0.13 0.81 0.91 0.09 0.85 0.94 0.07 0.88 0.96 0.06 0.88 0.98RGLM 0.13 0.81 0.90 0.10 0.86 0.93 0.07 0.88 0.95 0.07 0.89 0.97SIS-SCAD 0.22 0.69 0.84 0.20 0.73 0.84 0.19 0.73 0.86 0.19 0.72 0.87XGBoost 0.18 0.74 0.87 0.16 0.77 0.89 0.15 0.78 0.90 0.15 0.76 0.9130able 10: MR, SE and SP for Scenario 1, ρ = 0.8, n = 50, p = 1,500. ζ = . ζ = . ζ = . ζ = . π Method MR SE SP MR SE SP MR SE SP MR SE SP

Split-Lasso 0.07 0.83 0.96 0.05 0.87 0.98 0.03 0.91 0.98 0.03 0.88 0.99Split-EN 0.07 0.84 0.96 0.05 0.88 0.98 0.03 0.92 0.98 0.03 0.89 0.99Lasso 0.08 0.78 0.96 0.06 0.81 0.97 0.05 0.84 0.97 0.06 0.81 0.98EN 0.07 0.80 0.96 0.05 0.84 0.98 0.04 0.88 0.98 0.04 0.85 0.99Adaptive 0.10 0.64 0.97 0.09 0.65 0.98 0.07 0.70 0.98 0.08 0.69 0.990.2 Relaxed 0.09 0.79 0.95 0.07 0.83 0.96 0.06 0.84 0.96 0.06 0.83 0.97MC+ 0.13 0.60 0.95 0.11 0.61 0.96 0.11 0.60 0.96 0.11 0.59 0.96RF 0.07 0.79 0.97 0.05 0.80 0.99 0.04 0.83 0.99 0.04 0.81 1.00RGLM 0.07 0.80 0.96 0.05 0.81 0.98 0.04 0.84 0.99 0.05 0.81 0.99SIS-SCAD 0.13 0.53 0.96 0.12 0.54 0.97 0.11 0.54 0.98 0.11 0.54 0.98XGBoost 0.12 0.71 0.93 0.11 0.69 0.94 0.10 0.72 0.94 0.11 0.70 0.94 π Method MR SE SP MR SE SP MR SE SP MR SE SP

Split-Lasso 0.08 0.86 0.95 0.05 0.91 0.96 0.04 0.94 0.98 0.03 0.94 0.98Split-EN 0.08 0.86 0.95 0.05 0.91 0.96 0.03 0.94 0.98 0.03 0.95 0.98Lasso 0.09 0.83 0.94 0.07 0.86 0.95 0.06 0.88 0.96 0.06 0.89 0.96EN 0.08 0.84 0.95 0.06 0.89 0.96 0.04 0.91 0.97 0.04 0.92 0.98Adaptive 0.11 0.74 0.96 0.09 0.80 0.96 0.07 0.84 0.97 0.07 0.85 0.970.3 Relaxed 0.10 0.82 0.93 0.08 0.86 0.94 0.07 0.90 0.95 0.07 0.90 0.94MC+ 0.15 0.70 0.92 0.13 0.73 0.93 0.13 0.74 0.93 0.13 0.75 0.93RF 0.08 0.83 0.96 0.06 0.87 0.98 0.04 0.89 0.99 0.04 0.89 0.99RGLM 0.08 0.85 0.95 0.06 0.89 0.97 0.04 0.91 0.98 0.04 0.92 0.98SIS-SCAD 0.14 0.67 0.94 0.13 0.69 0.95 0.12 0.70 0.96 0.12 0.72 0.95XGBoost 0.13 0.77 0.92 0.13 0.78 0.91 0.12 0.79 0.92 0.12 0.77 0.92 π Method MR SE SP MR SE SP MR SE SP MR SE SP

0.4  Split-Lasso  0.09 0.90 0.92  0.06 0.92 0.96  0.04 0.95 0.97  0.03 0.96 0.97
0.4  Split-EN     0.09 0.90 0.92  0.06 0.93 0.96  0.04 0.96 0.97  0.03 0.96 0.97
0.4  Lasso        0.10 0.87 0.91  0.08 0.89 0.95  0.06 0.92 0.95  0.06 0.92 0.95
0.4  EN           0.09 0.89 0.92  0.06 0.91 0.95  0.05 0.94 0.96  0.04 0.94 0.96
0.4  Adaptive     0.11 0.84 0.93  0.09 0.86 0.95  0.07 0.90 0.95  0.07 0.91 0.95
0.4  Relaxed      0.11 0.87 0.90  0.09 0.89 0.93  0.07 0.91 0.93  0.07 0.92 0.93
0.4  MC+          0.15 0.80 0.89  0.14 0.79 0.91  0.12 0.84 0.90  0.13 0.83 0.90
0.4  RF           0.09 0.88 0.93  0.06 0.90 0.97  0.04 0.94 0.97  0.04 0.94 0.98
0.4  RGLM         0.09 0.89 0.93  0.06 0.91 0.96  0.04 0.94 0.97  0.04 0.95 0.97
0.4  SIS-SCAD     0.14 0.80 0.89  0.13 0.79 0.92  0.12 0.82 0.92  0.12 0.84 0.91
0.4  XGBoost      0.13 0.84 0.89  0.13 0.83 0.90  0.12 0.84 0.91  0.13 0.84 0.90

Table 11: MR, SE and SP for Scenario 1, ρ = 0.2, n = 100, p = 1,500.

                  ζ = .           ζ = .           ζ = .           ζ = .
π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.2  Split-Lasso  0.15 0.49 0.95  0.11 0.60 0.97  0.08 0.69 0.98  0.07 0.74 0.98
0.2  Split-EN     0.15 0.49 0.95  0.11 0.61 0.97  0.08 0.69 0.98  0.06 0.76 0.98
0.2  Lasso        0.17 0.42 0.94  0.14 0.46 0.96  0.12 0.57 0.97  0.11 0.58 0.97
0.2  EN           0.16 0.44 0.95  0.13 0.52 0.96  0.10 0.61 0.97  0.10 0.64 0.97
0.2  Adaptive     0.20 0.12 0.99  0.17 0.18 0.99  0.16 0.27 0.99  0.14 0.39 0.98
0.2  Relaxed      0.17 0.41 0.94  0.14 0.48 0.96  0.12 0.58 0.96  0.11 0.60 0.96
0.2  MC+          0.20 0.29 0.94  0.18 0.28 0.96  0.17 0.34 0.96  0.16 0.38 0.95
0.2  RF           0.17 0.29 0.99  0.15 0.29 1.00  0.15 0.31 1.00  0.13 0.36 1.00
0.2  RGLM         0.15 0.46 0.96  0.12 0.46 0.98  0.12 0.47 0.99  0.10 0.51 0.99
0.2  SIS-SCAD     0.20 0.30 0.94  0.18 0.31 0.94  0.18 0.35 0.95  0.17 0.34 0.95
0.2  XGBoost      0.18 0.34 0.96  0.16 0.33 0.98  0.15 0.34 0.98  0.14 0.36 0.98

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.3  Split-Lasso  0.18 0.61 0.91  0.13 0.74 0.92  0.09 0.81 0.95  0.08 0.82 0.97
0.3  Split-EN     0.18 0.61 0.91  0.13 0.75 0.92  0.09 0.82 0.95  0.07 0.83 0.97
0.3  Lasso        0.21 0.54 0.90  0.17 0.65 0.91  0.13 0.72 0.93  0.12 0.72 0.94
0.3  EN           0.20 0.56 0.90  0.16 0.68 0.91  0.12 0.75 0.94  0.11 0.75 0.95
0.3  Adaptive     0.24 0.34 0.95  0.20 0.48 0.95  0.16 0.61 0.95  0.15 0.59 0.96
0.3  Relaxed      0.21 0.54 0.90  0.17 0.65 0.91  0.14 0.72 0.92  0.13 0.73 0.93
0.3  MC+          0.24 0.45 0.90  0.21 0.55 0.90  0.19 0.59 0.90  0.19 0.55 0.92
0.3  RF           0.19 0.48 0.95  0.15 0.57 0.97  0.13 0.60 0.99  0.14 0.55 1.00
0.3  RGLM         0.18 0.59 0.91  0.14 0.67 0.94  0.11 0.71 0.97  0.11 0.67 0.99
0.3  SIS-SCAD     0.25 0.46 0.87  0.24 0.51 0.88  0.22 0.57 0.87  0.22 0.53 0.89
0.3  XGBoost      0.22 0.50 0.91  0.19 0.55 0.92  0.18 0.57 0.94  0.17 0.54 0.95

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.4  Split-Lasso  0.20 0.70 0.87  0.15 0.79 0.89  0.10 0.85 0.93  0.08 0.88 0.94
0.4  Split-EN     0.20 0.70 0.87  0.15 0.79 0.89  0.10 0.86 0.94  0.08 0.89 0.95
0.4  Lasso        0.23 0.66 0.85  0.18 0.73 0.88  0.15 0.78 0.90  0.13 0.81 0.91
0.4  EN           0.22 0.67 0.85  0.17 0.75 0.88  0.13 0.81 0.91  0.11 0.83 0.92
0.4  Adaptive     0.26 0.53 0.88  0.21 0.62 0.90  0.16 0.72 0.91  0.15 0.75 0.91
0.4  Relaxed      0.23 0.65 0.85  0.18 0.73 0.87  0.15 0.78 0.89  0.14 0.80 0.90
0.4  MC+          0.26 0.60 0.83  0.23 0.64 0.85  0.21 0.69 0.86  0.20 0.71 0.86
0.4  RF           0.21 0.62 0.90  0.16 0.69 0.93  0.12 0.75 0.96  0.10 0.78 0.97
0.4  RGLM         0.20 0.69 0.87  0.16 0.75 0.90  0.11 0.80 0.94  0.10 0.82 0.95
0.4  SIS-SCAD     0.29 0.58 0.80  0.26 0.62 0.81  0.25 0.65 0.82  0.24 0.66 0.83
0.4  XGBoost      0.24 0.61 0.86  0.21 0.67 0.87  0.19 0.69 0.89  0.18 0.71 0.90

Table 12: MR, SE and SP for Scenario 1, ρ = 0.5, n = 100, p = 1,500.

                  ζ = .           ζ = .           ζ = .           ζ = .
π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.2  Split-Lasso  0.09 0.72 0.96  0.06 0.79 0.97  0.04 0.86 0.98  0.04 0.86 0.99
0.2  Split-EN     0.09 0.72 0.96  0.06 0.80 0.97  0.04 0.87 0.98  0.04 0.87 0.99
0.2  Lasso        0.11 0.67 0.95  0.08 0.71 0.97  0.07 0.76 0.98  0.07 0.77 0.98
0.2  EN           0.10 0.69 0.96  0.08 0.74 0.97  0.06 0.80 0.98  0.05 0.80 0.98
0.2  Adaptive     0.13 0.48 0.98  0.10 0.56 0.98  0.08 0.65 0.98  0.08 0.64 0.99
0.2  Relaxed      0.11 0.67 0.95  0.09 0.71 0.96  0.07 0.77 0.97  0.07 0.79 0.96
0.2  MC+          0.15 0.50 0.94  0.14 0.49 0.95  0.13 0.52 0.96  0.14 0.51 0.95
0.2  RF           0.10 0.62 0.98  0.08 0.62 0.99  0.07 0.66 1.00  0.07 0.66 1.00
0.2  RGLM         0.09 0.66 0.97  0.08 0.67 0.99  0.06 0.71 0.99  0.06 0.71 1.00
0.2  SIS-SCAD     0.15 0.42 0.96  0.15 0.39 0.97  0.14 0.43 0.97  0.14 0.41 0.97
0.2  XGBoost      0.12 0.56 0.96  0.11 0.55 0.98  0.10 0.58 0.98  0.10 0.58 0.98

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.3  Split-Lasso  0.11 0.79 0.93  0.08 0.87 0.95  0.05 0.90 0.97  0.04 0.92 0.98
0.3  Split-EN     0.11 0.79 0.93  0.08 0.87 0.95  0.05 0.90 0.97  0.04 0.92 0.98
0.3  Lasso        0.13 0.74 0.93  0.10 0.82 0.94  0.08 0.84 0.96  0.07 0.85 0.96
0.3  EN           0.12 0.76 0.93  0.09 0.83 0.94  0.07 0.86 0.96  0.06 0.88 0.97
0.3  Adaptive     0.14 0.64 0.95  0.11 0.74 0.96  0.09 0.79 0.97  0.08 0.81 0.97
0.3  Relaxed      0.13 0.74 0.93  0.10 0.82 0.93  0.08 0.86 0.95  0.08 0.86 0.94
0.3  MC+          0.16 0.65 0.92  0.15 0.71 0.92  0.13 0.72 0.93  0.13 0.73 0.93
0.3  RF           0.12 0.72 0.95  0.09 0.79 0.97  0.07 0.80 0.99  0.06 0.81 0.99
0.3  RGLM         0.11 0.75 0.94  0.08 0.82 0.96  0.06 0.83 0.98  0.06 0.85 0.98
0.3  SIS-SCAD     0.17 0.62 0.92  0.15 0.69 0.92  0.14 0.70 0.93  0.14 0.68 0.93
0.3  XGBoost      0.14 0.67 0.94  0.12 0.73 0.94  0.11 0.72 0.96  0.11 0.73 0.96

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.4  Split-Lasso  0.12 0.83 0.91  0.08 0.90 0.93  0.05 0.93 0.96  0.05 0.94 0.96
0.4  Split-EN     0.12 0.83 0.91  0.08 0.90 0.93  0.05 0.93 0.96  0.04 0.95 0.97
0.4  Lasso        0.14 0.79 0.90  0.11 0.86 0.91  0.08 0.89 0.94  0.08 0.90 0.94
0.4  EN           0.13 0.80 0.91  0.10 0.87 0.92  0.07 0.90 0.95  0.06 0.92 0.95
0.4  Adaptive     0.15 0.73 0.92  0.12 0.83 0.92  0.09 0.86 0.94  0.08 0.89 0.94
0.4  Relaxed      0.15 0.78 0.90  0.11 0.86 0.91  0.09 0.88 0.93  0.09 0.90 0.93
0.4  MC+          0.17 0.74 0.89  0.15 0.81 0.89  0.13 0.82 0.90  0.13 0.82 0.90
0.4  RF           0.13 0.80 0.92  0.09 0.87 0.94  0.06 0.89 0.97  0.05 0.90 0.98
0.4  RGLM         0.13 0.82 0.91  0.09 0.88 0.93  0.06 0.90 0.96  0.06 0.91 0.97
0.4  SIS-SCAD     0.19 0.73 0.87  0.17 0.78 0.87  0.15 0.79 0.89  0.15 0.80 0.89
0.4  XGBoost      0.15 0.76 0.90  0.13 0.82 0.91  0.12 0.81 0.93  0.11 0.83 0.93

Table 13: MR, SE and SP for Scenario 1, ρ = 0.8, n = 100, p = 1,500.

                  ζ = .           ζ = .           ζ = .           ζ = .
π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.2  Split-Lasso  0.06 0.82 0.97  0.04 0.89 0.98  0.03 0.92 0.99  0.02 0.94 0.99
0.2  Split-EN     0.06 0.83 0.97  0.04 0.90 0.98  0.03 0.93 0.99  0.02 0.94 0.99
0.2  Lasso        0.07 0.79 0.96  0.05 0.85 0.97  0.04 0.86 0.98  0.04 0.87 0.98
0.2  EN           0.07 0.80 0.97  0.05 0.87 0.98  0.03 0.88 0.99  0.03 0.89 0.99
0.2  Adaptive     0.08 0.68 0.98  0.06 0.79 0.98  0.05 0.80 0.99  0.05 0.83 0.99
0.2  Relaxed      0.08 0.78 0.96  0.06 0.85 0.96  0.05 0.87 0.97  0.05 0.87 0.97
0.2  MC+          0.12 0.60 0.95  0.11 0.66 0.95  0.10 0.64 0.96  0.10 0.65 0.96
0.2  RF           0.06 0.78 0.98  0.04 0.84 0.99  0.03 0.84 1.00  0.03 0.86 1.00
0.2  RGLM         0.06 0.80 0.97  0.05 0.85 0.98  0.03 0.87 0.99  0.03 0.88 0.99
0.2  SIS-SCAD     0.12 0.50 0.97  0.11 0.54 0.98  0.11 0.55 0.98  0.11 0.55 0.98
0.2  XGBoost      0.08 0.72 0.97  0.07 0.76 0.98  0.07 0.75 0.98  0.07 0.75 0.98

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.3  Split-Lasso  0.08 0.86 0.95  0.05 0.92 0.97  0.03 0.95 0.98  0.03 0.96 0.98
0.3  Split-EN     0.08 0.86 0.95  0.05 0.92 0.97  0.03 0.95 0.98  0.02 0.96 0.98
0.3  Lasso        0.09 0.83 0.95  0.06 0.88 0.96  0.05 0.91 0.97  0.04 0.91 0.98
0.3  EN           0.08 0.84 0.95  0.06 0.90 0.97  0.04 0.93 0.98  0.03 0.93 0.98
0.3  Adaptive     0.10 0.77 0.96  0.07 0.85 0.97  0.05 0.88 0.97  0.05 0.89 0.98
0.3  Relaxed      0.09 0.82 0.94  0.07 0.88 0.95  0.06 0.91 0.95  0.05 0.92 0.96
0.3  MC+          0.13 0.72 0.93  0.12 0.76 0.94  0.11 0.77 0.94  0.10 0.78 0.94
0.3  RF           0.08 0.83 0.96  0.05 0.89 0.98  0.04 0.91 0.99  0.03 0.92 0.99
0.3  RGLM         0.08 0.84 0.95  0.05 0.90 0.97  0.04 0.92 0.98  0.03 0.93 0.99
0.3  SIS-SCAD     0.13 0.69 0.95  0.11 0.73 0.96  0.11 0.73 0.96  0.10 0.75 0.96
0.3  XGBoost      0.10 0.80 0.95  0.08 0.83 0.96  0.07 0.84 0.97  0.07 0.85 0.97

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.4  Split-Lasso  0.08 0.89 0.93  0.05 0.94 0.95  0.03 0.96 0.97  0.03 0.97 0.97
0.4  Split-EN     0.08 0.89 0.93  0.05 0.94 0.95  0.03 0.96 0.97  0.03 0.97 0.98
0.4  Lasso        0.09 0.87 0.93  0.07 0.92 0.94  0.05 0.93 0.96  0.05 0.94 0.96
0.4  EN           0.09 0.88 0.93  0.06 0.93 0.95  0.04 0.94 0.97  0.04 0.95 0.97
0.4  Adaptive     0.10 0.85 0.94  0.07 0.91 0.94  0.05 0.92 0.96  0.05 0.93 0.96
0.4  Relaxed      0.10 0.87 0.92  0.07 0.92 0.93  0.06 0.93 0.95  0.06 0.94 0.94
0.4  MC+          0.13 0.82 0.91  0.11 0.86 0.91  0.10 0.86 0.93  0.10 0.87 0.93
0.4  RF           0.08 0.88 0.94  0.05 0.93 0.96  0.03 0.94 0.98  0.03 0.96 0.98
0.4  RGLM         0.09 0.89 0.93  0.06 0.93 0.95  0.04 0.95 0.97  0.03 0.96 0.98
0.4  SIS-SCAD     0.12 0.83 0.91  0.10 0.87 0.92  0.09 0.87 0.94  0.09 0.88 0.93
0.4  XGBoost      0.10 0.86 0.92  0.08 0.89 0.94  0.07 0.89 0.95  0.07 0.90 0.95

Table 14: TL, RC and PR for Scenario 1, ρ = 0.2, n = 50, p = 1,500.

                  ζ = .           ζ = .           ζ = .           ζ = .
π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.2  Split-Lasso  0.71 0.17 0.09  0.55 0.20 0.16  0.43 0.19 0.27  0.40 0.18 0.49
0.2  Split-EN     0.70 0.21 0.08  0.54 0.23 0.15  0.41 0.26 0.25  0.36 0.25 0.47
0.2  Lasso        0.85 0.02 0.08  0.73 0.02 0.12  0.67 0.02 0.23  0.62 0.02 0.44
0.2  EN           0.79 0.03 0.08  0.66 0.03 0.13  0.57 0.03 0.22  0.51 0.03 0.43
0.2  Adaptive     0.97 0.02 0.07  0.89 0.02 0.12  0.86 0.02 0.22  0.80 0.02 0.43
0.2  Relaxed      1.08 0.02 0.08  0.97 0.01 0.12  1.15 0.01 0.22  0.90 0.01 0.43
0.2  MC+          1.00 0.00 0.05  0.94 0.00 0.11  0.97 0.00 0.16  0.91 0.00 0.41
0.2  RF           0.77 − − − − − − − −

0.2  RGLM         0.73 0.11 0.06  0.64 0.10 0.12  0.62 0.09 0.21  0.59 0.08 0.41
0.2  SIS-SCAD     1.09 0.00 0.11  0.98 0.00 0.11  0.93 0.00 0.19  0.89 0.00 0.39
0.2  XGBoost      0.91 − − − − − − − −

π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.3  Split-Lasso  0.84 0.16 0.09  0.64 0.20 0.14  0.47 0.21 0.26  0.42 0.20 0.47
0.3  Split-EN     0.84 0.21 0.08  0.63 0.25 0.13  0.45 0.28 0.24  0.40 0.27 0.46
0.3  Lasso        1.02 0.03 0.10  0.85 0.02 0.14  0.74 0.02 0.22  0.70 0.02 0.42
0.3  EN           0.93 0.04 0.09  0.76 0.04 0.13  0.62 0.04 0.23  0.58 0.03 0.42
0.3  Adaptive     1.07 0.03 0.10  0.98 0.02 0.14  0.90 0.02 0.22  0.86 0.02 0.42
0.3  Relaxed      1.23 0.02 0.09  1.05 0.02 0.13  0.94 0.02 0.22  0.91 0.02 0.42
0.3  MC+          1.12 0.01 0.08  1.06 0.00 0.12  1.05 0.00 0.20  1.02 0.00 0.43
0.3  RF           0.90 − − − − − − − −

0.3  RGLM         0.85 0.13 0.06  0.77 0.12 0.12  0.72 0.10 0.21  0.71 0.10 0.41
0.3  SIS-SCAD     1.33 0.00 0.11  1.21 0.00 0.11  1.15 0.00 0.23  1.10 0.00 0.41
0.3  XGBoost      1.08 − − − − − − − −

π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.4  Split-Lasso  0.89 0.21 0.08  0.69 0.20 0.14  0.50 0.21 0.25  0.43 0.19 0.46
0.4  Split-EN     0.90 0.25 0.08  0.68 0.26 0.13  0.48 0.27 0.24  0.41 0.25 0.45
0.4  Lasso        1.08 0.02 0.08  0.92 0.02 0.13  0.81 0.02 0.23  0.74 0.02 0.43
0.4  EN           1.02 0.03 0.07  0.82 0.04 0.13  0.67 0.04 0.23  0.58 0.04 0.42
0.4  Adaptive     1.16 0.02 0.08  1.02 0.02 0.12  0.98 0.02 0.23  0.91 0.02 0.43
0.4  Relaxed      1.15 0.02 0.08  1.16 0.02 0.13  1.11 0.02 0.25  1.07 0.02 0.42
0.4  MC+          1.20 0.00 0.06  1.13 0.00 0.12  1.10 0.00 0.24  1.07 0.00 0.44
0.4  RF           0.98 − − − − − − − −

0.4  RGLM         0.92 0.15 0.06  0.82 0.13 0.11  0.78 0.11 0.21  0.76 0.11 0.41
0.4  SIS-SCAD     1.44 0.00 0.08  1.32 0.00 0.11  1.25 0.00 0.21  1.22 0.00 0.38
0.4  XGBoost      1.15 − − − − − − − −

ρ = 0.5, n = 50, p = 1,500.

                  ζ = .           ζ = .           ζ = .           ζ = .
π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.2  Split-Lasso  0.45 0.20 0.09  0.31 0.20 0.17  0.24 0.20 0.30  0.23 0.17 0.54
0.2  Split-EN     0.45 0.23 0.09  0.30 0.26 0.15  0.23 0.27 0.28  0.22 0.25 0.51
0.2  Lasso        0.56 0.01 0.06  0.45 0.01 0.11  0.39 0.01 0.20  0.39 0.01 0.41
0.2  EN           0.50 0.03 0.06  0.36 0.03 0.12  0.31 0.03 0.20  0.31 0.03 0.41
0.2  Adaptive     0.67 0.01 0.05  0.56 0.01 0.11  0.50 0.01 0.20  0.58 0.01 0.41
0.2  Relaxed      0.97 0.01 0.06  0.72 0.01 0.13  0.72 0.01 0.20  0.71 0.01 0.42
0.2  MC+          0.79 0.00 0.12  0.76 0.00 0.06  0.70 0.00 0.25  0.78 0.00 0.44
0.2  RF           0.51 − − − − − − − −

0.2  RGLM         0.49 0.08 0.05  0.43 0.08 0.11  0.41 0.07 0.20  0.42 0.07 0.41
0.2  SIS-SCAD     0.80 0.00 0.06  0.73 0.00 0.11  0.70 0.00 0.24  0.73 0.00 0.36
0.2  XGBoost      0.69 − − − − − − − −

π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.3  Split-Lasso  0.55 0.16 0.09  0.37 0.19 0.15  0.28 0.19 0.28  0.25 0.18 0.50
0.3  Split-EN     0.54 0.20 0.08  0.36 0.25 0.14  0.27 0.27 0.26  0.24 0.25 0.48
0.3  Lasso        0.66 0.02 0.07  0.52 0.02 0.12  0.45 0.01 0.20  0.42 0.01 0.40
0.3  EN           0.59 0.03 0.07  0.43 0.04 0.12  0.36 0.03 0.21  0.32 0.03 0.42
0.3  Adaptive     0.73 0.02 0.07  0.67 0.02 0.12  0.56 0.01 0.20  0.55 0.01 0.40
0.3  Relaxed      1.01 0.02 0.07  0.86 0.01 0.12  0.84 0.01 0.20  0.68 0.01 0.40
0.3  MC+          0.83 0.00 0.11  0.78 0.00 0.12  0.74 0.00 0.26  0.74 0.00 0.37
0.3  RF           0.61 − − − − − − − −

0.3  RGLM         0.58 0.10 0.06  0.52 0.09 0.11  0.49 0.09 0.22  0.48 0.08 0.40
0.3  SIS-SCAD     0.88 0.00 0.03  0.82 0.00 0.10  0.80 0.00 0.25  0.80 0.00 0.45
0.3  XGBoost      0.74 − − − − − − − −

π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.4  Split-Lasso  0.61 0.17 0.08  0.41 0.18 0.14  0.29 0.19 0.27  0.26 0.17 0.48
0.4  Split-EN     0.61 0.22 0.07  0.40 0.24 0.13  0.28 0.26 0.25  0.26 0.24 0.46
0.4  Lasso        0.71 0.02 0.07  0.56 0.02 0.12  0.46 0.02 0.22  0.44 0.02 0.40
0.4  EN           0.66 0.03 0.06  0.48 0.04 0.12  0.36 0.04 0.22  0.34 0.03 0.40
0.4  Adaptive     0.80 0.02 0.07  0.67 0.02 0.12  0.59 0.02 0.22  0.56 0.02 0.41
0.4  Relaxed      1.11 0.02 0.07  1.09 0.01 0.12  0.86 0.01 0.24  0.71 0.01 0.40
0.4  MC+          0.86 0.00 0.07  0.78 0.00 0.13  0.71 0.00 0.23  0.73 0.00 0.35
0.4  RF           0.67 − − − − − − − −

0.4  RGLM         0.64 0.10 0.06  0.56 0.09 0.11  0.53 0.09 0.21  0.52 0.09 0.41
0.4  SIS-SCAD     0.93 0.00 0.04  0.86 0.00 0.13  0.83 0.00 0.21  0.83 0.00 0.33
0.4  XGBoost      0.79 − − − − − − − −

ρ = 0.8, n = 50, p = 1,500.

                  ζ = .           ζ = .           ζ = .           ζ = .
π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.2  Split-Lasso  0.31 0.23 0.11  0.21 0.25 0.23  0.16 0.23 0.40  0.17 0.17 0.62
0.2  Split-EN     0.31 0.31 0.10  0.21 0.33 0.19  0.16 0.35 0.35  0.16 0.27 0.57
0.2  Lasso        0.37 0.01 0.04  0.29 0.01 0.11  0.24 0.01 0.22  0.25 0.01 0.39
0.2  EN           0.33 0.02 0.05  0.24 0.03 0.12  0.19 0.03 0.21  0.19 0.03 0.40
0.2  Adaptive     0.47 0.01 0.04  0.41 0.01 0.12  0.35 0.01 0.22  0.37 0.01 0.39
0.2  Relaxed      0.68 0.01 0.05  0.59 0.01 0.10  0.56 0.01 0.21  0.58 0.01 0.38
0.2  MC+          0.58 0.00 0.06  0.52 0.00 0.10  0.51 0.00 0.36  0.50 0.00 0.60
0.2  RF           0.35 − − − − − − − −

0.2  RGLM         0.34 0.06 0.05  0.29 0.06 0.11  0.26 0.06 0.20  0.27 0.06 0.39
0.2  SIS-SCAD     0.58 0.00 0.06  0.55 0.00 0.05  0.51 0.00 0.22  0.54 0.00 0.46
0.2  XGBoost      0.64 − − − − − − − −

π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.3  Split-Lasso  0.38 0.17 0.10  0.25 0.20 0.18  0.19 0.18 0.33  0.18 0.16 0.57
0.3  Split-EN     0.37 0.24 0.09  0.24 0.29 0.16  0.18 0.28 0.29  0.17 0.26 0.52
0.3  Lasso        0.44 0.01 0.06  0.34 0.01 0.13  0.28 0.01 0.22  0.27 0.01 0.40
0.3  EN           0.39 0.03 0.06  0.28 0.03 0.12  0.22 0.03 0.22  0.21 0.03 0.41
0.3  Adaptive     0.54 0.01 0.05  0.46 0.01 0.13  0.38 0.01 0.22  0.35 0.01 0.39
0.3  Relaxed      1.07 0.01 0.05  0.92 0.01 0.11  0.56 0.01 0.22  0.56 0.01 0.40
0.3  MC+          0.64 0.00 0.07  0.60 0.00 0.19  0.58 0.00 0.29  0.58 0.00 0.44
0.3  RF           0.41 − − − − − − − −

0.3  RGLM         0.40 0.08 0.06  0.34 0.07 0.11  0.31 0.07 0.22  0.31 0.07 0.40
0.3  SIS-SCAD     0.66 0.00 0.06  0.62 0.00 0.10  0.62 0.00 0.26  0.59 0.00 0.29
0.3  XGBoost      0.72 − − − − − − − −

π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.4  Split-Lasso  0.42 0.14 0.08  0.27 0.15 0.15  0.20 0.16 0.28  0.19 0.14 0.50
0.4  Split-EN     0.42 0.17 0.08  0.26 0.23 0.13  0.19 0.25 0.25  0.18 0.23 0.46
0.4  Lasso        0.48 0.01 0.07  0.36 0.01 0.10  0.29 0.01 0.21  0.28 0.01 0.39
0.4  EN           0.44 0.03 0.06  0.30 0.03 0.10  0.23 0.03 0.22  0.22 0.03 0.39
0.4  Adaptive     0.58 0.01 0.07  0.47 0.01 0.10  0.40 0.01 0.21  0.38 0.01 0.39
0.4  Relaxed      0.96 0.01 0.06  0.97 0.01 0.11  0.89 0.01 0.22  0.79 0.01 0.39
0.4  MC+          0.66 0.00 0.12  0.63 0.00 0.13  0.58 0.00 0.22  0.60 0.00 0.42
0.4  RF           0.44 − − − − − − − −

0.4  RGLM         0.43 0.07 0.05  0.36 0.08 0.11  0.33 0.07 0.21  0.33 0.07 0.40
0.4  SIS-SCAD     0.68 0.00 0.07  0.62 0.00 0.12  0.59 0.00 0.18  0.59 0.00 0.41
0.4  XGBoost      0.75 − − − − − − − −

ρ = 0.2, n = 100, p = 1,500.

                  ζ = .           ζ = .           ζ = .           ζ = .
π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.2  Split-Lasso  0.68 0.29 0.09  0.49 0.30 0.14  0.37 0.31 0.26  0.31 0.28 0.48
0.2  Split-EN     0.67 0.31 0.08  0.49 0.35 0.14  0.36 0.37 0.25  0.30 0.36 0.46
0.2  Lasso        0.78 0.04 0.10  0.64 0.03 0.13  0.53 0.03 0.24  0.51 0.03 0.41
0.2  EN           0.75 0.05 0.09  0.59 0.04 0.13  0.48 0.05 0.23  0.44 0.04 0.42
0.2  Adaptive     0.88 0.04 0.10  0.79 0.03 0.13  0.69 0.03 0.24  0.60 0.03 0.41
0.2  Relaxed      0.84 0.04 0.10  0.70 0.03 0.13  0.59 0.03 0.24  0.62 0.02 0.40
0.2  MC+          0.91 0.01 0.13  0.80 0.01 0.14  0.76 0.01 0.24  0.75 0.01 0.38
0.2  RF           0.74 − − − − − − − −

0.2  RGLM         0.68 0.22 0.07  0.57 0.17 0.11  0.53 0.15 0.21  0.51 0.14 0.41
0.2  SIS-SCAD     1.06 0.01 0.12  0.93 0.00 0.12  0.88 0.00 0.23  0.82 0.00 0.41
0.2  XGBoost      0.90 − − − − − − − −

π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.3  Split-Lasso  0.79 0.28 0.09  0.60 0.31 0.14  0.42 0.32 0.25  0.34 0.30 0.46
0.3  Split-EN     0.78 0.32 0.08  0.60 0.37 0.14  0.41 0.38 0.24  0.33 0.38 0.45
0.3  Lasso        0.90 0.05 0.11  0.75 0.04 0.15  0.62 0.04 0.26  0.56 0.03 0.42
0.3  EN           0.87 0.06 0.11  0.70 0.06 0.15  0.54 0.06 0.25  0.49 0.05 0.41
0.3  Adaptive     0.97 0.05 0.11  0.85 0.04 0.15  0.71 0.04 0.26  0.68 0.03 0.42
0.3  Relaxed      0.95 0.04 0.12  0.79 0.04 0.15  0.81 0.03 0.26  0.66 0.03 0.42
0.3  MC+          1.01 0.01 0.10  0.91 0.01 0.17  0.86 0.01 0.25  0.84 0.01 0.41
0.3  RF           0.86 − − − − − − − −

0.3  RGLM         0.79 0.24 0.06  0.68 0.21 0.12  0.62 0.18 0.22  0.60 0.16 0.40
0.3  SIS-SCAD     1.10 0.01 0.10  1.04 0.00 0.12  1.00 0.00 0.24  1.01 0.00 0.40
0.3  XGBoost      1.01 − − − − − − − −

π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.4  Split-Lasso  0.85 0.28 0.08  0.65 0.34 0.14  0.44 0.32 0.25  0.37 0.31 0.45
0.4  Split-EN     0.85 0.32 0.08  0.65 0.38 0.13  0.44 0.37 0.23  0.36 0.38 0.44
0.4  Lasso        0.97 0.05 0.11  0.80 0.04 0.15  0.65 0.04 0.24  0.60 0.03 0.42
0.4  EN           0.94 0.07 0.11  0.76 0.06 0.15  0.58 0.06 0.25  0.52 0.05 0.42
0.4  Adaptive     1.04 0.05 0.11  0.90 0.04 0.16  0.76 0.04 0.25  0.72 0.03 0.42
0.4  Relaxed      0.98 0.05 0.12  0.88 0.04 0.16  0.78 0.03 0.24  0.71 0.03 0.43
0.4  MC+          1.07 0.02 0.11  0.96 0.02 0.18  0.89 0.01 0.26  0.87 0.01 0.40
0.4  RF           0.95 − − − − − − − −

0.4  RGLM         0.86 0.26 0.06  0.74 0.23 0.12  0.67 0.19 0.22  0.64 0.17 0.40
0.4  SIS-SCAD     1.17 0.01 0.13  1.13 0.01 0.16  1.07 0.00 0.20  1.03 0.00 0.42
0.4  XGBoost      1.10 − − − − − − − −

ρ = 0.5, n = 100, p = 1,500.

                  ζ = .           ζ = .           ζ = .           ζ = .
π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.2  Split-Lasso  0.42 0.30 0.10  0.29 0.31 0.17  0.20 0.30 0.32  0.19 0.26 0.56
0.2  Split-EN     0.42 0.35 0.09  0.29 0.38 0.16  0.19 0.38 0.30  0.18 0.34 0.53
0.2  Lasso        0.49 0.03 0.09  0.39 0.02 0.12  0.31 0.02 0.22  0.30 0.02 0.45
0.2  EN           0.47 0.04 0.08  0.35 0.04 0.12  0.25 0.04 0.22  0.24 0.04 0.43
0.2  Adaptive     0.56 0.03 0.09  0.47 0.02 0.12  0.39 0.02 0.22  0.39 0.02 0.45
0.2  Relaxed      0.61 0.03 0.09  0.47 0.02 0.11  0.40 0.02 0.23  0.43 0.02 0.45
0.2  MC+          0.73 0.00 0.11  0.67 0.00 0.14  0.63 0.00 0.26  0.64 0.00 0.43
0.2  RF           0.48 − − − − − − − −

0.2  RGLM         0.44 0.14 0.06  0.39 0.11 0.11  0.35 0.11 0.21  0.35 0.10 0.42
0.2  SIS-SCAD     0.73 0.00 0.09  0.66 0.00 0.13  0.66 0.00 0.22  0.64 0.00 0.44
0.2  XGBoost      0.57 − − − − − − − −

π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.3  Split-Lasso  0.52 0.24 0.09  0.36 0.25 0.15  0.23 0.28 0.28  0.21 0.25 0.51
0.3  Split-EN     0.51 0.28 0.08  0.35 0.31 0.14  0.23 0.34 0.26  0.20 0.33 0.49
0.3  Lasso        0.59 0.03 0.08  0.46 0.02 0.12  0.35 0.03 0.23  0.33 0.02 0.40
0.3  EN           0.56 0.04 0.08  0.42 0.04 0.12  0.30 0.05 0.22  0.27 0.04 0.41
0.3  Adaptive     0.65 0.03 0.08  0.55 0.02 0.11  0.43 0.03 0.22  0.40 0.02 0.40
0.3  Relaxed      0.63 0.03 0.08  0.53 0.02 0.11  0.45 0.03 0.23  0.58 0.02 0.40
0.3  MC+          0.73 0.01 0.10  0.66 0.01 0.13  0.63 0.00 0.24  0.60 0.00 0.42
0.3  RF           0.57 − − − − − − − −

0.3  RGLM         0.53 0.16 0.06  0.45 0.14 0.11  0.41 0.13 0.22  0.41 0.12 0.42
0.3  SIS-SCAD     0.77 0.01 0.09  0.69 0.00 0.11  0.63 0.00 0.23  0.64 0.00 0.42
0.3  XGBoost      0.66 − − − − − − − −

π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.4  Split-Lasso  0.56 0.23 0.08  0.39 0.25 0.14  0.26 0.26 0.25  0.22 0.25 0.48
0.4  Split-EN     0.56 0.29 0.08  0.39 0.31 0.13  0.25 0.33 0.24  0.22 0.32 0.46
0.4  Lasso        0.64 0.03 0.09  0.49 0.03 0.14  0.38 0.03 0.22  0.36 0.03 0.41
0.4  EN           0.61 0.05 0.09  0.45 0.05 0.14  0.32 0.05 0.22  0.29 0.05 0.41
0.4  Adaptive     0.72 0.03 0.09  0.58 0.03 0.14  0.47 0.03 0.22  0.43 0.03 0.41
0.4  Relaxed      0.68 0.03 0.09  0.62 0.03 0.14  0.53 0.02 0.22  0.50 0.02 0.41
0.4  MC+          0.76 0.01 0.11  0.67 0.01 0.15  0.61 0.01 0.23  0.59 0.01 0.39
0.4  RF           0.63 − − − − − − − −

0.4  RGLM         0.58 0.19 0.06  0.49 0.15 0.11  0.45 0.13 0.21  0.44 0.13 0.41
0.4  SIS-SCAD     0.84 0.00 0.06  0.74 0.00 0.13  0.73 0.00 0.22  0.66 0.00 0.40
0.4  XGBoost      0.71 − − − − − − − −

ρ = 0.8, n = 100, p = 1,500.

                  ζ = .           ζ = .           ζ = .           ζ = .
π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.2  Split-Lasso  0.30 0.36 0.13  0.20 0.37 0.23  0.14 0.34 0.41  0.13 0.26 0.64
0.2  Split-EN     0.29 0.42 0.11  0.19 0.44 0.20  0.14 0.42 0.37  0.13 0.35 0.60
0.2  Lasso        0.35 0.02 0.07  0.25 0.02 0.13  0.19 0.02 0.22  0.19 0.01 0.41
0.2  EN           0.32 0.04 0.07  0.22 0.04 0.12  0.16 0.04 0.21  0.15 0.04 0.42
0.2  Adaptive     0.40 0.02 0.06  0.30 0.02 0.13  0.26 0.02 0.23  0.24 0.01 0.41
0.2  Relaxed      0.50 0.01 0.07  0.39 0.02 0.13  0.29 0.01 0.22  0.32 0.01 0.43
0.2  MC+          0.56 0.00 0.10  0.51 0.00 0.15  0.46 0.00 0.19  0.49 0.00 0.55
0.2  RF           0.32 − − − − − − − −

0.2  RGLM         0.31 0.10 0.06  0.25 0.10 0.12  0.22 0.09 0.21  0.22 0.08 0.41
0.2  SIS-SCAD     0.59 0.00 0.08  0.53 0.00 0.11  0.49 0.00 0.18  0.49 0.00 0.40
0.2  XGBoost      0.41 − − − − − − − −

π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.3  Split-Lasso  0.37 0.25 0.10  0.23 0.28 0.19  0.16 0.27 0.35  0.15 0.23 0.58
0.3  Split-EN     0.36 0.33 0.09  0.22 0.36 0.16  0.16 0.37 0.31  0.15 0.34 0.54
0.3  Lasso        0.42 0.02 0.07  0.29 0.02 0.11  0.23 0.02 0.21  0.21 0.02 0.42
0.3  EN           0.39 0.03 0.07  0.25 0.04 0.11  0.19 0.04 0.21  0.17 0.04 0.42
0.3  Adaptive     0.49 0.02 0.07  0.37 0.02 0.11  0.30 0.02 0.21  0.28 0.02 0.42
0.3  Relaxed      0.53 0.01 0.06  0.44 0.01 0.10  0.45 0.01 0.20  0.32 0.02 0.43
0.3  MC+          0.61 0.00 0.05  0.53 0.00 0.12  0.50 0.00 0.28  0.49 0.00 0.45
0.3  RF           0.38 − − − − − − − −

0.3  RGLM         0.38 0.12 0.06  0.29 0.10 0.11  0.26 0.09 0.21  0.25 0.09 0.40
0.3  SIS-SCAD     0.59 0.00 0.06  0.54 0.00 0.10  0.54 0.00 0.16  0.50 0.00 0.38
0.3  XGBoost      0.46 − − − − − − − −

π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.4  Split-Lasso  0.40 0.17 0.08  0.25 0.22 0.15  0.17 0.22 0.28  0.16 0.19 0.50
0.4  Split-EN     0.39 0.22 0.07  0.25 0.28 0.14  0.17 0.31 0.26  0.16 0.30 0.48
0.4  Lasso        0.45 0.02 0.08  0.32 0.02 0.13  0.24 0.02 0.24  0.23 0.02 0.41
0.4  EN           0.42 0.04 0.07  0.28 0.04 0.12  0.20 0.04 0.22  0.19 0.04 0.41
0.4  Adaptive     0.50 0.02 0.08  0.38 0.02 0.12  0.31 0.02 0.23  0.29 0.02 0.42
0.4  Relaxed      0.62 0.02 0.07  0.48 0.02 0.14  0.37 0.02 0.23  0.39 0.02 0.42
0.4  MC+          0.57 0.00 0.11  0.49 0.00 0.10  0.46 0.00 0.23  0.45 0.00 0.49
0.4  RF           0.41 − − − − − − − −

0.4  RGLM         0.41 0.13 0.06  0.31 0.11 0.11  0.28 0.10 0.21  0.28 0.09 0.40
0.4  SIS-SCAD     0.57 0.00 0.07  0.47 0.00 0.15  0.45 0.00 0.20  0.44 0.00 0.42
0.4  XGBoost      0.49 − − − − − − − −

ρ = 0, ρ = 0.2, n = 50, p = 1,500.

                  ζ = .           ζ = .           ζ = .           ζ = .
π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.2  Split-Lasso  0.20 0.13 0.99  0.17 0.20 0.99  0.13 0.40 0.99  0.11 0.47 0.99
0.2  Split-EN     0.20 0.12 0.99  0.17 0.19 0.99  0.13 0.40 0.99  0.11 0.49 0.99
0.2  Lasso        0.21 0.14 0.97  0.18 0.20 0.98  0.16 0.35 0.97  0.14 0.42 0.97
0.2  EN           0.20 0.14 0.98  0.18 0.20 0.98  0.15 0.37 0.98  0.13 0.43 0.98
0.2  Adaptive     0.22 0.02 1.00  0.20 0.05 0.99  0.20 0.11 0.99  0.18 0.15 0.99
0.2  Relaxed      0.21 0.17 0.96  0.18 0.24 0.96  0.16 0.39 0.96  0.15 0.46 0.95
0.2  MC+          0.22 0.14 0.96  0.20 0.16 0.97  0.20 0.23 0.96  0.19 0.25 0.95
0.2  RF           0.22 0.00 1.00  0.20 0.01 1.00  0.20 0.07 1.00  0.17 0.13 1.00
0.2  RGLM         0.21 0.06 1.00  0.18 0.12 1.00  0.16 0.25 1.00  0.14 0.29 1.00
0.2  SIS-SCAD     0.23 0.22 0.92  0.21 0.22 0.94  0.21 0.24 0.94  0.20 0.24 0.95
0.2  XGBoost      0.21 0.20 0.95  0.19 0.22 0.96  0.18 0.29 0.97  0.17 0.27 0.97

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.3  Split-Lasso  0.24 0.33 0.95  0.18 0.51 0.95  0.13 0.63 0.97  0.11 0.70 0.97
0.3  Split-EN     0.24 0.30 0.96  0.18 0.51 0.96  0.13 0.62 0.98  0.11 0.70 0.98
0.3  Lasso        0.26 0.33 0.92  0.23 0.46 0.91  0.19 0.54 0.94  0.17 0.59 0.93
0.3  EN           0.25 0.33 0.93  0.21 0.51 0.91  0.17 0.58 0.95  0.14 0.63 0.95
0.3  Adaptive     0.28 0.15 0.97  0.26 0.25 0.96  0.24 0.29 0.97  0.21 0.40 0.96
0.3  Relaxed      0.26 0.35 0.91  0.23 0.48 0.90  0.20 0.56 0.91  0.18 0.61 0.91
0.3  MC+          0.28 0.30 0.91  0.26 0.38 0.90  0.24 0.43 0.91  0.23 0.45 0.90
0.3  RF           0.29 0.05 1.00  0.27 0.14 0.99  0.24 0.24 1.00  0.19 0.39 0.99
0.3  RGLM         0.26 0.23 0.97  0.20 0.41 0.97  0.17 0.49 0.98  0.14 0.56 0.98
0.3  SIS-SCAD     0.29 0.37 0.85  0.28 0.42 0.86  0.27 0.42 0.87  0.25 0.44 0.88
0.3  XGBoost      0.28 0.34 0.89  0.25 0.42 0.90  0.22 0.45 0.93  0.21 0.47 0.92

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.4  Split-Lasso  0.26 0.52 0.89  0.19 0.67 0.91  0.13 0.77 0.93  0.10 0.82 0.94
0.4  Split-EN     0.26 0.52 0.89  0.18 0.67 0.91  0.13 0.77 0.94  0.10 0.83 0.95
0.4  Lasso        0.30 0.50 0.84  0.25 0.59 0.85  0.20 0.69 0.88  0.18 0.73 0.88
0.4  EN           0.29 0.51 0.85  0.23 0.62 0.87  0.18 0.72 0.89  0.15 0.78 0.91
0.4  Adaptive     0.36 0.29 0.87  0.30 0.42 0.88  0.26 0.51 0.90  0.21 0.63 0.89
0.4  Relaxed      0.31 0.51 0.81  0.26 0.59 0.84  0.21 0.69 0.86  0.18 0.75 0.87
0.4  MC+          0.34 0.42 0.81  0.30 0.52 0.82  0.26 0.57 0.85  0.26 0.60 0.84
0.4  RF           0.34 0.24 0.93  0.27 0.38 0.96  0.21 0.53 0.96  0.15 0.67 0.97
0.4  RGLM         0.27 0.48 0.90  0.21 0.61 0.92  0.15 0.71 0.94  0.13 0.76 0.95
0.4  SIS-SCAD     0.33 0.52 0.76  0.31 0.55 0.78  0.30 0.57 0.79  0.29 0.58 0.80
0.4  XGBoost      0.32 0.50 0.80  0.28 0.56 0.83  0.24 0.61 0.86  0.23 0.64 0.86

Table 21: MR, SE and SP for Scenario 2, ρ = 0, ρ = 0.5, n = 50, p = 1,500.

                  ζ = .           ζ = .           ζ = .           ζ = .
π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.2  Split-Lasso  0.13 0.41 0.99  0.10 0.54 0.99  0.07 0.66 0.99  0.07 0.71 0.99
0.2  Split-EN     0.13 0.39 0.99  0.10 0.54 0.99  0.07 0.66 0.99  0.06 0.72 0.99
0.2  Lasso        0.15 0.39 0.97  0.12 0.49 0.97  0.10 0.59 0.97  0.10 0.62 0.98
0.2  EN           0.14 0.40 0.98  0.11 0.50 0.98  0.09 0.61 0.98  0.08 0.65 0.98
0.2  Adaptive     0.18 0.16 0.99  0.15 0.29 0.99  0.14 0.37 0.99  0.14 0.37 0.99
0.2  Relaxed      0.15 0.45 0.95  0.13 0.55 0.95  0.11 0.63 0.95  0.10 0.68 0.96
0.2  MC+          0.17 0.36 0.95  0.16 0.40 0.95  0.15 0.45 0.95  0.16 0.41 0.95
0.2  RF           0.17 0.18 1.00  0.13 0.35 1.00  0.10 0.49 1.00  0.10 0.52 1.00
0.2  RGLM         0.13 0.41 0.99  0.10 0.53 0.99  0.08 0.63 0.99  0.08 0.63 1.00
0.2  SIS-SCAD     0.18 0.29 0.96  0.17 0.29 0.97  0.16 0.30 0.97  0.16 0.32 0.97
0.2  XGBoost      0.14 0.47 0.96  0.13 0.51 0.97  0.12 0.50 0.97  0.13 0.48 0.98

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.3  Split-Lasso  0.14 0.63 0.95  0.10 0.77 0.96  0.08 0.78 0.98  0.06 0.84 0.98
0.3  Split-EN     0.14 0.63 0.96  0.10 0.76 0.97  0.08 0.78 0.98  0.06 0.85 0.98
0.3  Lasso        0.17 0.62 0.92  0.14 0.72 0.93  0.11 0.73 0.95  0.10 0.77 0.96
0.3  EN           0.16 0.62 0.94  0.12 0.74 0.94  0.10 0.74 0.97  0.08 0.80 0.97
0.3  Adaptive     0.19 0.45 0.96  0.17 0.55 0.96  0.14 0.61 0.97  0.13 0.64 0.97
0.3  Relaxed      0.18 0.64 0.90  0.15 0.72 0.91  0.12 0.76 0.93  0.11 0.77 0.94
0.3  MC+          0.20 0.56 0.90  0.18 0.63 0.91  0.17 0.62 0.92  0.17 0.63 0.92
0.3  RF           0.17 0.45 0.99  0.12 0.64 0.99  0.11 0.65 0.99  0.08 0.74 0.99
0.3  RGLM         0.14 0.64 0.96  0.10 0.75 0.97  0.08 0.75 0.98  0.07 0.80 0.98
0.3  SIS-SCAD     0.21 0.55 0.90  0.19 0.58 0.91  0.18 0.55 0.93  0.18 0.58 0.92
0.3  XGBoost      0.17 0.62 0.92  0.15 0.67 0.94  0.15 0.63 0.95  0.14 0.66 0.95

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.4  Split-Lasso  0.14 0.79 0.91  0.10 0.86 0.93  0.07 0.90 0.95  0.06 0.91 0.96
0.4  Split-EN     0.14 0.78 0.91  0.10 0.86 0.93  0.07 0.90 0.95  0.06 0.91 0.97
0.4  Lasso        0.18 0.74 0.88  0.15 0.81 0.88  0.11 0.85 0.91  0.11 0.84 0.93
0.4  EN           0.17 0.76 0.89  0.13 0.83 0.90  0.10 0.87 0.93  0.09 0.87 0.94
0.4  Adaptive     0.19 0.67 0.90  0.15 0.78 0.90  0.12 0.82 0.92  0.11 0.81 0.94
0.4  Relaxed      0.18 0.74 0.87  0.16 0.81 0.87  0.12 0.85 0.90  0.12 0.85 0.91
0.4  MC+          0.20 0.71 0.85  0.20 0.75 0.84  0.17 0.77 0.88  0.16 0.76 0.89
0.4  RF           0.15 0.71 0.94  0.11 0.80 0.95  0.08 0.86 0.96  0.07 0.87 0.98
0.4  RGLM         0.14 0.78 0.91  0.10 0.85 0.93  0.07 0.89 0.95  0.06 0.89 0.97
0.4  SIS-SCAD     0.22 0.70 0.84  0.21 0.73 0.83  0.20 0.73 0.85  0.19 0.71 0.87
0.4  XGBoost      0.18 0.75 0.86  0.17 0.78 0.87  0.16 0.78 0.88  0.15 0.77 0.91

Table 22: MR, SE and SP for Scenario 2, ρ = 0, ρ = 0.8, n = 50, p = 1,500.

                  ζ = .           ζ = .           ζ = .           ζ = .
π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.2  Split-Lasso  0.09 0.65 0.98  0.07 0.72 0.99  0.05 0.78 0.99  0.05 0.78 0.99
0.2  Split-EN     0.09 0.62 0.98  0.07 0.70 0.99  0.05 0.77 0.99  0.05 0.78 0.99
0.2  Lasso        0.11 0.58 0.97  0.10 0.62 0.98  0.08 0.67 0.98  0.08 0.69 0.98
0.2  EN           0.11 0.57 0.97  0.09 0.63 0.98  0.07 0.69 0.99  0.07 0.71 0.99
0.2  Adaptive     0.14 0.39 0.99  0.12 0.44 0.99  0.11 0.48 0.99  0.10 0.55 0.99
0.2  Relaxed      0.11 0.65 0.95  0.10 0.69 0.96  0.08 0.73 0.96  0.08 0.77 0.96
0.2  MC+          0.12 0.60 0.95  0.11 0.63 0.95  0.11 0.60 0.96  0.11 0.58 0.96
0.2  RF           0.08 0.69 0.98  0.05 0.78 0.99  0.04 0.82 0.99  0.04 0.81 1.00
0.2  RGLM         0.08 0.74 0.97  0.06 0.78 0.98  0.04 0.83 0.99  0.05 0.81 0.99
0.2  SIS-SCAD     0.14 0.47 0.97  0.12 0.51 0.97  0.11 0.50 0.98  0.12 0.52 0.98
0.2  XGBoost      0.09 0.74 0.95  0.10 0.70 0.96  0.09 0.73 0.95  0.11 0.69 0.94

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.3  Split-Lasso  0.09 0.77 0.96  0.07 0.84 0.97  0.05 0.88 0.98  0.04 0.91 0.98
0.3  Split-EN     0.10 0.76 0.96  0.07 0.83 0.97  0.05 0.88 0.98  0.04 0.91 0.98
0.3  Lasso        0.12 0.71 0.95  0.11 0.77 0.95  0.09 0.82 0.96  0.08 0.84 0.96
0.3  EN           0.12 0.72 0.96  0.10 0.78 0.96  0.07 0.83 0.97  0.06 0.87 0.97
0.3  Adaptive     0.14 0.63 0.97  0.11 0.70 0.97  0.09 0.77 0.98  0.08 0.80 0.97
0.3  Relaxed      0.13 0.74 0.93  0.11 0.81 0.93  0.10 0.85 0.93  0.09 0.85 0.94
0.3  MC+          0.15 0.68 0.92  0.14 0.72 0.92  0.13 0.73 0.93  0.13 0.73 0.93
0.3  RF           0.09 0.79 0.97  0.06 0.86 0.98  0.04 0.89 0.99  0.04 0.90 0.99
0.3  RGLM         0.09 0.81 0.96  0.06 0.86 0.97  0.04 0.91 0.98  0.04 0.92 0.98
0.3  SIS-SCAD     0.14 0.67 0.95  0.13 0.69 0.95  0.12 0.70 0.95  0.12 0.72 0.95
0.3  XGBoost      0.12 0.77 0.93  0.10 0.79 0.94  0.10 0.80 0.94  0.11 0.79 0.93

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.4  Split-Lasso  0.10 0.86 0.93  0.07 0.89 0.96  0.05 0.93 0.96  0.04 0.94 0.97
0.4  Split-EN     0.10 0.86 0.93  0.07 0.88 0.96  0.05 0.93 0.97  0.04 0.95 0.97
0.4  Lasso        0.13 0.81 0.90  0.11 0.83 0.93  0.09 0.88 0.94  0.08 0.90 0.94
0.4  EN           0.12 0.82 0.91  0.10 0.85 0.94  0.07 0.89 0.95  0.06 0.91 0.95
0.4  Adaptive     0.12 0.80 0.93  0.11 0.82 0.94  0.08 0.87 0.95  0.08 0.89 0.95
0.4  Relaxed      0.14 0.81 0.89  0.12 0.84 0.91  0.10 0.88 0.92  0.09 0.90 0.91
0.4  MC+          0.16 0.77 0.88  0.15 0.78 0.90  0.14 0.81 0.90  0.13 0.83 0.90
0.4  RF           0.09 0.87 0.94  0.06 0.90 0.96  0.04 0.94 0.97  0.04 0.94 0.98
0.4  RGLM         0.09 0.87 0.93  0.07 0.91 0.95  0.04 0.95 0.96  0.04 0.95 0.97
0.4  SIS-SCAD     0.14 0.79 0.90  0.13 0.80 0.92  0.12 0.82 0.92  0.12 0.83 0.91
0.4  XGBoost      0.12 0.84 0.91  0.11 0.85 0.92  0.10 0.86 0.92  0.12 0.85 0.90

Table 23: MR, SE and SP for Scenario 2, ρ = 0.2, ρ = 0.5, n = 50, p = 1,500.

                  ζ = .           ζ = .           ζ = .           ζ = .
π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.2  Split-Lasso  0.13 0.48 0.97  0.09 0.60 0.98  0.07 0.69 0.98  0.06 0.73 0.99
0.2  Split-EN     0.13 0.45 0.98  0.10 0.60 0.98  0.07 0.69 0.98  0.06 0.74 0.99
0.2  Lasso        0.14 0.43 0.96  0.12 0.53 0.97  0.10 0.61 0.97  0.10 0.62 0.98
0.2  EN           0.14 0.44 0.97  0.11 0.56 0.97  0.09 0.63 0.98  0.08 0.65 0.98
0.2  Adaptive     0.18 0.16 0.99  0.15 0.30 0.99  0.14 0.37 0.99  0.14 0.38 0.99
0.2  Relaxed      0.15 0.46 0.95  0.13 0.56 0.95  0.11 0.65 0.95  0.11 0.67 0.95
0.2  MC+          0.17 0.35 0.95  0.16 0.39 0.95  0.15 0.43 0.95  0.16 0.39 0.96
0.2  RF           0.16 0.25 0.99  0.13 0.38 1.00  0.10 0.49 1.00  0.10 0.52 1.00
0.2  RGLM         0.13 0.45 0.98  0.10 0.56 0.99  0.08 0.61 0.99  0.08 0.62 1.00
0.2  SIS-SCAD     0.17 0.28 0.97  0.17 0.30 0.97  0.16 0.30 0.98  0.16 0.32 0.97
0.2  XGBoost      0.14 0.45 0.96  0.13 0.50 0.97  0.13 0.49 0.97  0.13 0.48 0.98

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.3  Split-Lasso  0.15 0.67 0.93  0.10 0.78 0.95  0.08 0.80 0.98  0.06 0.85 0.98
0.3  Split-EN     0.15 0.66 0.94  0.10 0.78 0.95  0.08 0.80 0.98  0.06 0.85 0.98
0.3  Lasso        0.17 0.63 0.92  0.13 0.72 0.93  0.11 0.74 0.95  0.10 0.77 0.95
0.3  EN           0.16 0.64 0.93  0.12 0.75 0.94  0.10 0.76 0.96  0.08 0.80 0.96
0.3  Adaptive     0.20 0.44 0.95  0.17 0.55 0.96  0.14 0.60 0.97  0.13 0.63 0.97
0.3  Relaxed      0.18 0.64 0.90  0.15 0.74 0.91  0.12 0.78 0.93  0.11 0.76 0.94
0.3  MC+          0.21 0.55 0.90  0.18 0.63 0.91  0.17 0.61 0.92  0.17 0.63 0.92
0.3  RF           0.17 0.50 0.96  0.12 0.64 0.98  0.11 0.66 0.99  0.08 0.74 0.99
0.3  RGLM         0.14 0.65 0.94  0.10 0.76 0.96  0.09 0.75 0.98  0.07 0.80 0.98
0.3  SIS-SCAD     0.20 0.55 0.90  0.19 0.58 0.91  0.18 0.55 0.93  0.18 0.58 0.92
0.3  XGBoost      0.17 0.61 0.92  0.15 0.66 0.94  0.15 0.63 0.95  0.14 0.65 0.95

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.4  Split-Lasso  0.15 0.78 0.89  0.11 0.86 0.91  0.08 0.90 0.94  0.06 0.91 0.96
0.4  Split-EN     0.16 0.78 0.89  0.11 0.85 0.91  0.07 0.90 0.95  0.06 0.91 0.96
0.4  Lasso        0.18 0.74 0.87  0.15 0.81 0.88  0.12 0.84 0.91  0.10 0.85 0.93
0.4  EN           0.17 0.75 0.88  0.13 0.83 0.89  0.10 0.86 0.93  0.09 0.88 0.94
0.4  Adaptive     0.19 0.67 0.90  0.16 0.75 0.90  0.13 0.80 0.92  0.12 0.81 0.94
0.4  Relaxed      0.19 0.75 0.86  0.16 0.81 0.87  0.13 0.84 0.89  0.11 0.85 0.91
0.4  MC+          0.21 0.71 0.84  0.20 0.75 0.84  0.17 0.77 0.87  0.17 0.75 0.89
0.4  RF           0.17 0.71 0.91  0.12 0.80 0.94  0.08 0.85 0.96  0.07 0.86 0.98
0.4  RGLM         0.15 0.77 0.90  0.11 0.85 0.92  0.08 0.88 0.95  0.07 0.89 0.97
0.4  SIS-SCAD     0.22 0.69 0.84  0.21 0.73 0.83  0.20 0.73 0.85  0.20 0.71 0.87
0.4  XGBoost      0.18 0.75 0.86  0.17 0.77 0.87  0.16 0.78 0.88  0.15 0.77 0.91

Table 24: MR, SE and SP for Scenario 2, ρ = 0.2, ρ = 0.8, n = 50, p = 1,500.

                  ζ = .           ζ = .           ζ = .           ζ = .
π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.2  Split-Lasso  0.09 0.67 0.97  0.07 0.74 0.98  0.05 0.79 0.99  0.05 0.78 0.99
0.2  Split-EN     0.10 0.65 0.97  0.07 0.73 0.98  0.05 0.79 0.99  0.05 0.78 0.99
0.2  Lasso        0.12 0.60 0.96  0.10 0.65 0.97  0.08 0.70 0.98  0.08 0.70 0.98
0.2  EN           0.11 0.60 0.97  0.09 0.66 0.98  0.07 0.72 0.98  0.07 0.73 0.99
0.2  Adaptive     0.14 0.40 0.98  0.12 0.46 0.99  0.10 0.50 0.99  0.10 0.55 0.99
0.2  Relaxed      0.12 0.66 0.94  0.10 0.72 0.95  0.08 0.74 0.96  0.08 0.73 0.96
0.2  MC+          0.13 0.60 0.95  0.11 0.63 0.95  0.11 0.60 0.96  0.11 0.58 0.96
0.2  RF           0.08 0.69 0.98  0.06 0.77 0.99  0.04 0.82 0.99  0.04 0.80 1.00
0.2  RGLM         0.08 0.75 0.97  0.06 0.79 0.98  0.04 0.83 0.99  0.05 0.80 0.99
0.2  SIS-SCAD     0.14 0.47 0.97  0.12 0.51 0.97  0.11 0.50 0.98  0.12 0.52 0.98
0.2  XGBoost      0.10 0.74 0.95  0.10 0.70 0.96  0.09 0.73 0.95  0.11 0.69 0.94

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.3  Split-Lasso  0.10 0.78 0.95  0.07 0.84 0.96  0.05 0.88 0.97  0.04 0.91 0.98
0.3  Split-EN     0.10 0.77 0.96  0.07 0.84 0.97  0.05 0.88 0.97  0.04 0.91 0.98
0.3  Lasso        0.13 0.73 0.94  0.10 0.79 0.94  0.08 0.83 0.96  0.07 0.85 0.96
0.3  EN           0.11 0.74 0.95  0.10 0.79 0.95  0.07 0.84 0.96  0.06 0.87 0.97
0.3  Adaptive     0.14 0.63 0.97  0.12 0.69 0.97  0.09 0.78 0.97  0.08 0.80 0.97
0.3  Relaxed      0.13 0.75 0.92  0.11 0.81 0.93  0.10 0.85 0.93  0.08 0.87 0.94
0.3  MC+          0.15 0.69 0.92  0.14 0.73 0.92  0.13 0.73 0.93  0.13 0.73 0.93
0.3  RF           0.09 0.79 0.97  0.06 0.86 0.98  0.04 0.89 0.99  0.04 0.90 0.99
0.3  RGLM         0.09 0.82 0.95  0.06 0.87 0.97  0.04 0.91 0.98  0.04 0.92 0.98
0.3  SIS-SCAD     0.14 0.67 0.95  0.13 0.69 0.95  0.13 0.70 0.95  0.12 0.72 0.95
0.3  XGBoost      0.12 0.78 0.93  0.10 0.79 0.94  0.10 0.80 0.94  0.11 0.79 0.93

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.4  Split-Lasso  0.10 0.85 0.92  0.08 0.88 0.95  0.06 0.92 0.96  0.05 0.94 0.96
0.4  Split-EN     0.11 0.85 0.92  0.08 0.87 0.95  0.06 0.92 0.96  0.04 0.94 0.96
0.4  Lasso        0.14 0.81 0.90  0.11 0.83 0.93  0.09 0.88 0.93  0.08 0.90 0.94
0.4  EN           0.12 0.82 0.91  0.10 0.84 0.94  0.08 0.89 0.95  0.07 0.91 0.95
0.4  Adaptive     0.13 0.79 0.93  0.11 0.80 0.95  0.09 0.86 0.95  0.08 0.88 0.94
0.4  Relaxed      0.14 0.81 0.89  0.12 0.84 0.91  0.09 0.88 0.92  0.09 0.90 0.91
0.4  MC+          0.16 0.78 0.88  0.15 0.78 0.90  0.14 0.81 0.90  0.13 0.83 0.90
0.4  RF           0.09 0.87 0.94  0.06 0.89 0.97  0.04 0.94 0.97  0.04 0.94 0.98
0.4  RGLM         0.09 0.87 0.93  0.07 0.90 0.96  0.04 0.94 0.96  0.04 0.95 0.97
0.4  SIS-SCAD     0.14 0.79 0.90  0.13 0.79 0.92  0.12 0.82 0.92  0.12 0.83 0.91
0.4  XGBoost      0.12 0.84 0.91  0.10 0.85 0.93  0.10 0.85 0.92  0.12 0.85 0.90

Table 25: MR, SE and SP for Scenario 2, ρ = 0.5, ρ = 0.8, n = 50, p = 1,500.
                  ζ = .           ζ = .           ζ = .           ζ = .
π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.2  Split-Lasso  0.09 0.72 0.96  0.07 0.78 0.97  0.05 0.83 0.98  0.05 0.81 0.99
0.2  Split-EN     0.09 0.71 0.96  0.07 0.78 0.97  0.05 0.83 0.98  0.05 0.82 0.99
0.2  Lasso        0.11 0.66 0.95  0.09 0.71 0.96  0.07 0.75 0.97  0.07 0.74 0.97
0.2  EN           0.10 0.68 0.95  0.08 0.73 0.97  0.06 0.79 0.98  0.06 0.76 0.98
0.2  Adaptive     0.14 0.43 0.98  0.12 0.49 0.98  0.10 0.53 0.99  0.10 0.58 0.99
0.2  Relaxed      0.11 0.69 0.94  0.09 0.75 0.95  0.08 0.77 0.96  0.08 0.77 0.96
0.2  MC+          0.13 0.58 0.94  0.12 0.61 0.95  0.11 0.60 0.96  0.12 0.57 0.96
0.2  RF           0.09 0.67 0.97  0.06 0.75 0.98  0.04 0.80 0.99  0.05 0.79 1.00
0.2  RGLM         0.09 0.73 0.96  0.06 0.78 0.98  0.04 0.82 0.99  0.05 0.78 0.99
0.2  SIS-SCAD     0.14 0.48 0.96  0.12 0.52 0.97  0.12 0.49 0.98  0.12 0.52 0.98
0.2  XGBoost      0.10 0.71 0.95  0.10 0.70 0.96  0.09 0.73 0.95  0.11 0.69 0.94

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.3  Split-Lasso  0.10 0.80 0.94  0.08 0.85 0.95  0.05 0.90 0.97  0.04 0.91 0.98
0.3  Split-EN     0.10 0.79 0.94  0.08 0.85 0.95  0.05 0.89 0.97  0.04 0.91 0.98
0.3  Lasso        0.12 0.76 0.93  0.10 0.81 0.94  0.08 0.85 0.95  0.07 0.86 0.96
0.3  EN           0.11 0.77 0.94  0.09 0.82 0.95  0.07 0.86 0.96  0.06 0.88 0.97
0.3  Adaptive     0.14 0.63 0.96  0.12 0.70 0.96  0.09 0.78 0.97  0.08 0.80 0.97
0.3  Relaxed      0.12 0.77 0.92  0.10 0.83 0.93  0.09 0.87 0.93  0.08 0.87 0.94
0.3  MC+          0.15 0.70 0.92  0.14 0.74 0.92  0.13 0.74 0.93  0.13 0.74 0.93
0.3  RF           0.10 0.77 0.95  0.07 0.84 0.97  0.05 0.88 0.99  0.04 0.90 0.99
0.3  RGLM         0.10 0.81 0.95  0.07 0.86 0.96  0.05 0.90 0.98  0.04 0.92 0.98
0.3  SIS-SCAD     0.14 0.67 0.95  0.13 0.69 0.95  0.12 0.70 0.95  0.12 0.72 0.95
0.3  XGBoost      0.12 0.76 0.93  0.10 0.79 0.94  0.10 0.80 0.94  0.11 0.79 0.93

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.4  Split-Lasso  0.11 0.85 0.91  0.09 0.88 0.94  0.06 0.92 0.95  0.05 0.94 0.96
0.4  Split-EN     0.11 0.85 0.91  0.09 0.87 0.94  0.06 0.92 0.95  0.05 0.94 0.96
0.4  Lasso        0.13 0.82 0.90  0.11 0.84 0.92  0.09 0.88 0.93  0.08 0.90 0.93
0.4  EN           0.12 0.83 0.91  0.10 0.85 0.93  0.08 0.89 0.94  0.06 0.92 0.95
0.4  Adaptive     0.14 0.77 0.92  0.12 0.80 0.94  0.09 0.85 0.94  0.08 0.88 0.94
0.4  Relaxed      0.14 0.82 0.89  0.11 0.85 0.91  0.10 0.87 0.92  0.09 0.90 0.92
0.4  MC+          0.16 0.79 0.88  0.15 0.79 0.90  0.13 0.81 0.90  0.13 0.84 0.90
0.4  RF           0.11 0.85 0.92  0.07 0.88 0.96  0.05 0.93 0.97  0.04 0.94 0.98
0.4  RGLM         0.10 0.87 0.92  0.07 0.89 0.95  0.05 0.94 0.96  0.04 0.95 0.97
0.4  SIS-SCAD     0.14 0.79 0.90  0.13 0.80 0.92  0.12 0.82 0.92  0.12 0.83 0.91
0.4  XGBoost      0.12 0.84 0.91  0.11 0.85 0.93  0.11 0.85 0.92  0.12 0.85 0.90

Table 26: MR, SE and SP for Scenario 2, ρ = 0, ρ = 0.2, n = 100, p = 1,500.
                  ζ = .           ζ = .           ζ = .           ζ = .
π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.2  Split-Lasso  0.18 0.22 0.99  0.14 0.34 0.99  0.10 0.54 0.99  0.08 0.67 0.99
0.2  Split-EN     0.18 0.20 0.99  0.14 0.33 0.99  0.10 0.53 0.99  0.07 0.68 0.99
0.2  Lasso        0.19 0.27 0.97  0.15 0.34 0.97  0.13 0.50 0.97  0.11 0.57 0.97
0.2  EN           0.18 0.26 0.97  0.15 0.35 0.98  0.12 0.52 0.98  0.10 0.61 0.97
0.2  Adaptive     0.21 0.08 0.99  0.18 0.11 0.99  0.17 0.25 0.99  0.14 0.37 0.98
0.2  Relaxed      0.19 0.28 0.96  0.16 0.35 0.97  0.13 0.52 0.96  0.12 0.60 0.96
0.2  MC+          0.20 0.23 0.96  0.18 0.27 0.96  0.17 0.35 0.96  0.16 0.36 0.96
0.2  RF           0.22 0.00 1.00  0.20 0.03 1.00  0.19 0.11 1.00  0.15 0.25 1.00
0.2  RGLM         0.19 0.17 0.99  0.15 0.28 0.99  0.12 0.43 0.99  0.10 0.53 0.99
0.2  SIS-SCAD     0.21 0.30 0.93  0.18 0.29 0.95  0.18 0.33 0.95  0.17 0.40 0.94
0.2  XGBoost      0.20 0.24 0.96  0.16 0.29 0.97  0.16 0.35 0.98  0.14 0.37 0.98

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.3  Split-Lasso  0.20 0.45 0.95  0.15 0.61 0.95  0.10 0.76 0.96  0.08 0.79 0.97
0.3  Split-EN     0.21 0.43 0.96  0.15 0.61 0.95  0.10 0.76 0.96  0.08 0.79 0.98
0.3  Lasso        0.22 0.48 0.91  0.18 0.59 0.92  0.15 0.69 0.93  0.13 0.70 0.94
0.3  EN           0.21 0.49 0.92  0.17 0.60 0.92  0.13 0.71 0.93  0.11 0.73 0.95
0.3  Adaptive     0.25 0.28 0.96  0.22 0.39 0.96  0.17 0.57 0.95  0.16 0.58 0.96
0.3  Relaxed      0.22 0.49 0.90  0.19 0.59 0.91  0.15 0.70 0.92  0.14 0.71 0.93
0.3  MC+          0.24 0.44 0.91  0.21 0.53 0.90  0.19 0.59 0.90  0.19 0.56 0.92
0.3  RF           0.29 0.06 1.00  0.24 0.22 0.99  0.18 0.42 0.99  0.16 0.46 1.00
0.3  RGLM         0.21 0.39 0.97  0.16 0.58 0.96  0.11 0.71 0.97  0.10 0.69 0.98
0.3  SIS-SCAD     0.25 0.45 0.88  0.23 0.50 0.89  0.23 0.55 0.87  0.21 0.53 0.89
0.3  XGBoost      0.24 0.43 0.91  0.20 0.53 0.92  0.18 0.58 0.93  0.17 0.54 0.95

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.4  Split-Lasso  0.22 0.61 0.90  0.16 0.73 0.91  0.10 0.83 0.94  0.08 0.87 0.95
0.4  Split-EN     0.22 0.59 0.90  0.16 0.72 0.92  0.10 0.83 0.94  0.08 0.87 0.95
0.4  Lasso        0.24 0.62 0.85  0.20 0.69 0.87  0.15 0.78 0.89  0.14 0.79 0.91
0.4  EN           0.23 0.63 0.86  0.19 0.70 0.88  0.14 0.79 0.90  0.12 0.82 0.92
0.4  Adaptive     0.26 0.51 0.90  0.22 0.59 0.90  0.17 0.73 0.90  0.15 0.75 0.91
0.4  Relaxed      0.25 0.61 0.84  0.20 0.69 0.87  0.16 0.78 0.88  0.14 0.80 0.90
0.4  MC+          0.27 0.58 0.84  0.23 0.65 0.85  0.20 0.71 0.86  0.20 0.71 0.86
0.4  RF           0.30 0.29 0.97  0.23 0.47 0.97  0.16 0.65 0.97  0.12 0.73 0.98
0.4  RGLM         0.22 0.59 0.91  0.16 0.71 0.92  0.11 0.81 0.94  0.10 0.83 0.95
0.4  SIS-SCAD     0.28 0.58 0.81  0.26 0.60 0.82  0.24 0.65 0.83  0.24 0.66 0.83
0.4  XGBoost      0.26 0.58 0.84  0.22 0.65 0.86  0.19 0.70 0.89  0.18 0.70 0.90

Table 27: MR, SE and SP for Scenario 2, ρ = 0, ρ = 0.5, n = 100, p = 1,500.
                  ζ = .           ζ = .           ζ = .           ζ = .
π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.2  Split-Lasso  0.11 0.57 0.98  0.08 0.67 0.99  0.05 0.78 0.99  0.04 0.82 0.99
0.2  Split-EN     0.11 0.55 0.98  0.08 0.67 0.99  0.05 0.77 0.99  0.04 0.82 0.99
0.2  Lasso        0.12 0.55 0.96  0.10 0.63 0.97  0.08 0.70 0.98  0.07 0.74 0.98
0.2  EN           0.12 0.55 0.97  0.09 0.64 0.98  0.07 0.72 0.98  0.06 0.77 0.98
0.2  Adaptive     0.14 0.38 0.99  0.11 0.50 0.98  0.10 0.57 0.99  0.09 0.63 0.99
0.2  Relaxed      0.13 0.56 0.96  0.10 0.65 0.96  0.09 0.71 0.96  0.08 0.77 0.96
0.2  MC+          0.15 0.47 0.95  0.14 0.48 0.95  0.13 0.50 0.96  0.13 0.54 0.95
0.2  RF           0.14 0.33 1.00  0.11 0.48 1.00  0.08 0.58 1.00  0.07 0.63 1.00
0.2  RGLM         0.11 0.55 0.98  0.08 0.66 0.99  0.06 0.75 0.99  0.06 0.74 0.99
0.2  SIS-SCAD     0.15 0.47 0.95  0.15 0.45 0.96  0.14 0.43 0.97  0.14 0.43 0.97
0.2  XGBoost      0.12 0.56 0.96  0.11 0.57 0.97  0.10 0.59 0.98  0.10 0.58 0.98

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.3  Split-Lasso  0.12 0.71 0.96  0.08 0.82 0.96  0.06 0.87 0.98  0.04 0.90 0.98
0.3  Split-EN     0.12 0.70 0.96  0.08 0.82 0.96  0.05 0.87 0.98  0.04 0.90 0.98
0.3  Lasso        0.14 0.68 0.94  0.11 0.77 0.94  0.09 0.81 0.96  0.08 0.84 0.96
0.3  EN           0.13 0.69 0.94  0.11 0.78 0.95  0.08 0.82 0.97  0.07 0.86 0.97
0.3  Adaptive     0.15 0.59 0.96  0.12 0.71 0.96  0.09 0.76 0.97  0.08 0.80 0.97
0.3  Relaxed      0.15 0.68 0.93  0.12 0.77 0.93  0.09 0.81 0.95  0.08 0.85 0.95
0.3  MC+          0.16 0.63 0.92  0.15 0.70 0.93  0.13 0.71 0.94  0.13 0.71 0.93
0.3  RF           0.14 0.55 0.98  0.10 0.71 0.98  0.08 0.76 0.99  0.07 0.79 0.99
0.3  RGLM         0.12 0.70 0.96  0.08 0.81 0.96  0.06 0.85 0.98  0.06 0.85 0.98
0.3  SIS-SCAD     0.17 0.63 0.91  0.16 0.69 0.92  0.14 0.69 0.93  0.14 0.69 0.93
0.3  XGBoost      0.14 0.68 0.93  0.12 0.74 0.94  0.11 0.73 0.96  0.11 0.73 0.96

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.4  Split-Lasso  0.13 0.79 0.93  0.09 0.88 0.94  0.06 0.91 0.96  0.05 0.94 0.96
0.4  Split-EN     0.13 0.79 0.93  0.09 0.88 0.94  0.06 0.91 0.96  0.05 0.94 0.97
0.4  Lasso        0.15 0.77 0.90  0.12 0.83 0.91  0.09 0.87 0.93  0.08 0.89 0.93
0.4  EN           0.15 0.78 0.91  0.11 0.85 0.92  0.08 0.88 0.94  0.07 0.91 0.95
0.4  Adaptive     0.15 0.73 0.92  0.12 0.82 0.92  0.10 0.85 0.94  0.09 0.87 0.94
0.4  Relaxed      0.16 0.77 0.89  0.13 0.83 0.90  0.10 0.87 0.92  0.09 0.89 0.92
0.4  MC+          0.17 0.74 0.89  0.15 0.79 0.89  0.13 0.81 0.90  0.13 0.82 0.90
0.4  RF           0.14 0.72 0.95  0.09 0.83 0.96  0.07 0.87 0.98  0.05 0.90 0.98
0.4  RGLM         0.12 0.80 0.93  0.09 0.87 0.94  0.06 0.91 0.96  0.05 0.91 0.97
0.4  SIS-SCAD     0.18 0.74 0.87  0.16 0.78 0.87  0.15 0.80 0.89  0.15 0.80 0.88
0.4  XGBoost      0.16 0.77 0.90  0.13 0.81 0.91  0.12 0.82 0.93  0.11 0.83 0.93

Table 28: MR, SE and SP for Scenario 2, ρ = 0, ρ = 0.8, n = 100, p = 1,500.
                  ζ = .           ζ = .           ζ = .           ζ = .
π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.2  Split-Lasso  0.07 0.72 0.98  0.05 0.81 0.98  0.04 0.83 0.99  0.03 0.88 0.99
0.2  Split-EN     0.08 0.70 0.98  0.05 0.80 0.99  0.04 0.83 0.99  0.03 0.88 0.99
0.2  Lasso        0.09 0.67 0.97  0.07 0.75 0.98  0.06 0.77 0.98  0.05 0.80 0.98
0.2  EN           0.09 0.66 0.98  0.07 0.75 0.98  0.05 0.77 0.99  0.05 0.82 0.99
0.2  Adaptive     0.11 0.54 0.99  0.08 0.68 0.99  0.07 0.68 0.99  0.06 0.75 0.99
0.2  Relaxed      0.10 0.69 0.96  0.08 0.77 0.96  0.06 0.80 0.97  0.06 0.82 0.97
0.2  MC+          0.12 0.62 0.95  0.11 0.65 0.95  0.10 0.63 0.96  0.11 0.64 0.96
0.2  RF           0.07 0.72 0.98  0.05 0.82 0.99  0.03 0.84 1.00  0.03 0.86 1.00
0.2  RGLM         0.07 0.75 0.98  0.05 0.84 0.98  0.03 0.87 0.99  0.03 0.88 0.99
0.2  SIS-SCAD     0.12 0.52 0.97  0.11 0.59 0.97  0.11 0.53 0.98  0.11 0.53 0.98
0.2  XGBoost      0.08 0.74 0.97  0.07 0.77 0.97  0.06 0.76 0.98  0.07 0.76 0.98

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.3  Split-Lasso  0.09 0.81 0.96  0.05 0.89 0.97  0.04 0.91 0.98  0.03 0.93 0.98
0.3  Split-EN     0.09 0.80 0.97  0.05 0.89 0.97  0.04 0.91 0.98  0.03 0.94 0.99
0.3  Lasso        0.11 0.76 0.95  0.08 0.84 0.96  0.06 0.86 0.97  0.06 0.88 0.97
0.3  EN           0.10 0.76 0.96  0.07 0.85 0.96  0.06 0.87 0.98  0.05 0.90 0.98
0.3  Adaptive     0.11 0.70 0.97  0.08 0.82 0.97  0.06 0.84 0.98  0.06 0.86 0.98
0.3  Relaxed      0.11 0.77 0.94  0.08 0.85 0.95  0.07 0.88 0.95  0.06 0.89 0.95
0.3  MC+          0.14 0.70 0.93  0.12 0.77 0.93  0.11 0.75 0.94  0.11 0.76 0.94
0.3  RF           0.08 0.80 0.97  0.05 0.89 0.98  0.04 0.91 0.99  0.03 0.92 0.99
0.3  RGLM         0.08 0.82 0.96  0.05 0.90 0.97  0.03 0.93 0.98  0.03 0.93 0.99
0.3  SIS-SCAD     0.13 0.68 0.95  0.11 0.76 0.95  0.10 0.76 0.96  0.10 0.75 0.96
0.3  XGBoost      0.10 0.80 0.95  0.08 0.84 0.96  0.07 0.85 0.97  0.07 0.84 0.97

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.4  Split-Lasso  0.09 0.87 0.94  0.06 0.92 0.95  0.04 0.94 0.97  0.03 0.96 0.97
0.4  Split-EN     0.09 0.86 0.94  0.06 0.92 0.95  0.04 0.94 0.98  0.03 0.96 0.98
0.4  Lasso        0.11 0.83 0.92  0.09 0.89 0.93  0.07 0.90 0.95  0.06 0.92 0.95
0.4  EN           0.11 0.84 0.93  0.08 0.89 0.94  0.06 0.91 0.96  0.05 0.93 0.96
0.4  Adaptive     0.11 0.83 0.94  0.08 0.89 0.95  0.06 0.90 0.96  0.06 0.92 0.96
0.4  Relaxed      0.12 0.84 0.91  0.10 0.89 0.92  0.07 0.91 0.94  0.07 0.92 0.94
0.4  MC+          0.14 0.80 0.90  0.12 0.84 0.91  0.11 0.84 0.93  0.11 0.86 0.92
0.4  RF           0.08 0.86 0.95  0.05 0.93 0.96  0.03 0.94 0.98  0.03 0.96 0.98
0.4  RGLM         0.09 0.88 0.94  0.06 0.93 0.95  0.04 0.95 0.97  0.03 0.96 0.97
0.4  SIS-SCAD     0.13 0.81 0.91  0.10 0.86 0.92  0.10 0.86 0.93  0.09 0.87 0.93
0.4  XGBoost      0.11 0.85 0.92  0.09 0.89 0.93  0.07 0.89 0.95  0.07 0.90 0.95

Table 29: MR, SE and SP for Scenario 2, ρ = 0.2, ρ = 0.5, n = 100, p = 1,500.
                  ζ = .           ζ = .           ζ = .           ζ = .
π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.2  Split-Lasso  0.11 0.59 0.97  0.08 0.68 0.98  0.05 0.79 0.99  0.04 0.83 0.99
0.2  Split-EN     0.11 0.58 0.97  0.08 0.67 0.98  0.05 0.79 0.99  0.04 0.83 0.99
0.2  Lasso        0.12 0.57 0.96  0.10 0.63 0.97  0.08 0.72 0.97  0.07 0.74 0.98
0.2  EN           0.12 0.58 0.96  0.09 0.65 0.97  0.07 0.74 0.98  0.06 0.77 0.98
0.2  Adaptive     0.14 0.38 0.98  0.12 0.46 0.99  0.10 0.57 0.99  0.09 0.63 0.98
0.2  Relaxed      0.13 0.56 0.95  0.10 0.65 0.96  0.08 0.73 0.96  0.08 0.76 0.96
0.2  MC+          0.15 0.48 0.95  0.14 0.46 0.96  0.13 0.50 0.96  0.13 0.52 0.95
0.2  RF           0.14 0.38 0.99  0.11 0.47 1.00  0.08 0.59 1.00  0.07 0.63 1.00
0.2  RGLM         0.11 0.59 0.97  0.08 0.66 0.99  0.06 0.74 0.99  0.06 0.73 0.99
0.2  SIS-SCAD     0.15 0.47 0.95  0.15 0.40 0.97  0.13 0.44 0.97  0.14 0.43 0.97
0.2  XGBoost      0.12 0.55 0.96  0.11 0.56 0.97  0.10 0.59 0.98  0.10 0.58 0.98

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.3  Split-Lasso  0.12 0.72 0.94  0.09 0.82 0.95  0.06 0.87 0.97  0.05 0.90 0.98
0.3  Split-EN     0.13 0.71 0.94  0.09 0.82 0.95  0.06 0.87 0.97  0.05 0.90 0.98
0.3  Lasso        0.14 0.69 0.93  0.11 0.78 0.94  0.09 0.81 0.96  0.08 0.84 0.96
0.3  EN           0.13 0.70 0.94  0.11 0.79 0.94  0.08 0.83 0.96  0.07 0.86 0.96
0.3  Adaptive     0.15 0.58 0.96  0.12 0.70 0.96  0.10 0.75 0.97  0.09 0.79 0.97
0.3  Relaxed      0.15 0.68 0.92  0.12 0.78 0.93  0.09 0.82 0.95  0.09 0.85 0.94
0.3  MC+          0.17 0.63 0.92  0.15 0.70 0.92  0.13 0.71 0.94  0.13 0.71 0.93
0.3  RF           0.15 0.56 0.97  0.11 0.71 0.98  0.08 0.76 0.99  0.07 0.79 0.99
0.3  RGLM         0.13 0.72 0.94  0.09 0.81 0.96  0.06 0.84 0.98  0.06 0.85 0.98
0.3  SIS-SCAD     0.17 0.63 0.92  0.15 0.69 0.92  0.14 0.68 0.93  0.14 0.69 0.93
0.3  XGBoost      0.14 0.68 0.93  0.12 0.73 0.94  0.11 0.73 0.96  0.11 0.73 0.96

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.4  Split-Lasso  0.14 0.79 0.91  0.10 0.87 0.93  0.06 0.91 0.96  0.05 0.93 0.96
0.4  Split-EN     0.14 0.79 0.92  0.10 0.87 0.93  0.06 0.91 0.96  0.05 0.93 0.96
0.4  Lasso        0.15 0.77 0.90  0.12 0.84 0.91  0.09 0.87 0.93  0.08 0.89 0.93
0.4  EN           0.15 0.78 0.90  0.11 0.85 0.92  0.08 0.88 0.94  0.07 0.91 0.94
0.4  Adaptive     0.16 0.72 0.92  0.12 0.81 0.92  0.10 0.85 0.94  0.09 0.87 0.94
0.4  Relaxed      0.16 0.76 0.89  0.12 0.83 0.90  0.10 0.88 0.92  0.09 0.89 0.92
0.4  MC+          0.17 0.74 0.89  0.15 0.79 0.89  0.13 0.82 0.90  0.13 0.82 0.90
0.4  RF           0.16 0.71 0.93  0.10 0.82 0.95  0.07 0.87 0.97  0.05 0.90 0.98
0.4  RGLM         0.14 0.79 0.91  0.09 0.87 0.93  0.06 0.90 0.96  0.05 0.92 0.97
0.4  SIS-SCAD     0.19 0.73 0.87  0.17 0.78 0.87  0.15 0.80 0.88  0.15 0.79 0.88
0.4  XGBoost      0.16 0.76 0.90  0.13 0.81 0.91  0.12 0.82 0.93  0.11 0.83 0.93

Table 30: MR, SE and SP for Scenario 2, ρ = 0.2, ρ = 0.8, n = 100, p = 1,500.
                  ζ = .           ζ = .           ζ = .           ζ = .
π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.2  Split-Lasso  0.07 0.73 0.98  0.05 0.82 0.98  0.04 0.85 0.99  0.03 0.89 0.99
0.2  Split-EN     0.08 0.72 0.98  0.05 0.81 0.98  0.04 0.84 0.99  0.03 0.89 0.99
0.2  Lasso        0.09 0.68 0.97  0.07 0.77 0.97  0.06 0.79 0.98  0.05 0.82 0.98
0.2  EN           0.09 0.68 0.98  0.07 0.77 0.98  0.05 0.79 0.98  0.04 0.84 0.99
0.2  Adaptive     0.10 0.55 0.99  0.08 0.69 0.98  0.07 0.70 0.99  0.06 0.76 0.99
0.2  Relaxed      0.09 0.70 0.96  0.07 0.78 0.96  0.06 0.81 0.97  0.06 0.84 0.97
0.2  MC+          0.12 0.62 0.95  0.11 0.66 0.95  0.10 0.64 0.96  0.11 0.64 0.96
0.2  RF           0.07 0.72 0.98  0.05 0.82 0.99  0.04 0.84 0.99  0.03 0.86 1.00
0.2  RGLM         0.07 0.76 0.97  0.05 0.85 0.98  0.03 0.87 0.99  0.03 0.88 0.99
0.2  SIS-SCAD     0.12 0.52 0.97  0.11 0.58 0.98  0.11 0.54 0.98  0.11 0.53 0.98
0.2  XGBoost      0.08 0.74 0.97  0.07 0.77 0.97  0.06 0.76 0.98  0.07 0.76 0.98

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.3  Split-Lasso  0.09 0.81 0.96  0.06 0.88 0.97  0.04 0.92 0.98  0.03 0.94 0.98
0.3  Split-EN     0.09 0.80 0.96  0.06 0.87 0.97  0.04 0.92 0.98  0.03 0.94 0.98
0.3  Lasso        0.10 0.77 0.95  0.08 0.83 0.96  0.06 0.87 0.97  0.06 0.89 0.97
0.3  EN           0.10 0.77 0.96  0.08 0.84 0.96  0.06 0.88 0.97  0.05 0.90 0.98
0.3  Adaptive     0.11 0.71 0.97  0.08 0.80 0.97  0.07 0.84 0.97  0.06 0.86 0.98
0.3  Relaxed      0.11 0.77 0.94  0.08 0.84 0.95  0.07 0.88 0.95  0.06 0.89 0.95
0.3  MC+          0.14 0.71 0.93  0.12 0.75 0.94  0.11 0.76 0.94  0.11 0.76 0.94
0.3  RF           0.08 0.80 0.97  0.05 0.87 0.98  0.04 0.91 0.99  0.03 0.92 0.99
0.3  RGLM         0.08 0.83 0.96  0.05 0.89 0.97  0.03 0.93 0.98  0.03 0.93 0.99
0.3  SIS-SCAD     0.13 0.69 0.95  0.11 0.74 0.95  0.10 0.75 0.96  0.10 0.75 0.96
0.3  XGBoost      0.10 0.80 0.95  0.08 0.83 0.96  0.07 0.85 0.97  0.07 0.84 0.97

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.4  Split-Lasso  0.09 0.87 0.93  0.06 0.92 0.95  0.04 0.94 0.97  0.03 0.96 0.97
0.4  Split-EN     0.09 0.86 0.93  0.06 0.92 0.95  0.04 0.94 0.97  0.03 0.96 0.97
0.4  Lasso        0.12 0.84 0.92  0.09 0.89 0.93  0.07 0.90 0.95  0.06 0.92 0.95
0.4  EN           0.11 0.84 0.92  0.08 0.90 0.94  0.06 0.91 0.96  0.05 0.94 0.96
0.4  Adaptive     0.11 0.83 0.93  0.08 0.89 0.94  0.06 0.90 0.96  0.06 0.92 0.96
0.4  Relaxed      0.12 0.84 0.91  0.09 0.89 0.92  0.07 0.91 0.94  0.07 0.92 0.94
0.4  MC+          0.14 0.80 0.90  0.12 0.84 0.91  0.10 0.85 0.93  0.10 0.86 0.92
0.4  RF           0.09 0.86 0.95  0.05 0.93 0.96  0.03 0.94 0.98  0.03 0.96 0.98
0.4  RGLM         0.09 0.87 0.93  0.06 0.93 0.95  0.04 0.95 0.97  0.03 0.96 0.97
0.4  SIS-SCAD     0.13 0.82 0.91  0.10 0.86 0.92  0.10 0.85 0.93  0.09 0.87 0.93
0.4  XGBoost      0.11 0.85 0.92  0.08 0.89 0.93  0.07 0.89 0.95  0.07 0.90 0.94

Table 31: MR, SE and SP for Scenario 2, ρ = 0.5, ρ = 0.8, n = 100, p = 1,500.
                  ζ = .           ζ = .           ζ = .           ζ = .
π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.2  Split-Lasso  0.07 0.76 0.97  0.05 0.84 0.98  0.04 0.87 0.99  0.03 0.91 0.99
0.2  Split-EN     0.08 0.74 0.97  0.05 0.83 0.98  0.04 0.87 0.99  0.03 0.91 0.99
0.2  Lasso        0.09 0.71 0.97  0.07 0.79 0.97  0.05 0.81 0.98  0.05 0.84 0.98
0.2  EN           0.08 0.72 0.97  0.06 0.80 0.97  0.05 0.83 0.98  0.04 0.86 0.98
0.2  Adaptive     0.10 0.57 0.98  0.08 0.71 0.98  0.07 0.72 0.99  0.06 0.77 0.99
0.2  Relaxed      0.09 0.71 0.96  0.07 0.79 0.96  0.06 0.82 0.97  0.06 0.84 0.97
0.2  MC+          0.12 0.62 0.95  0.11 0.67 0.95  0.10 0.65 0.96  0.11 0.64 0.96
0.2  RF           0.08 0.71 0.98  0.05 0.81 0.99  0.04 0.84 0.99  0.03 0.86 1.00
0.2  RGLM         0.07 0.77 0.97  0.05 0.85 0.98  0.03 0.86 0.99  0.03 0.88 0.99
0.2  SIS-SCAD     0.12 0.52 0.97  0.11 0.58 0.97  0.11 0.54 0.98  0.11 0.53 0.98
0.2  XGBoost      0.08 0.73 0.97  0.07 0.78 0.97  0.06 0.76 0.98  0.07 0.76 0.98

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.3  Split-Lasso  0.09 0.82 0.95  0.06 0.88 0.96  0.04 0.93 0.97  0.03 0.94 0.98
0.3  Split-EN     0.09 0.81 0.95  0.06 0.88 0.96  0.04 0.92 0.97  0.03 0.95 0.98
0.3  Lasso        0.10 0.78 0.95  0.08 0.84 0.95  0.06 0.88 0.96  0.05 0.90 0.97
0.3  EN           0.10 0.79 0.95  0.07 0.85 0.96  0.05 0.89 0.97  0.04 0.91 0.98
0.3  Adaptive     0.11 0.72 0.97  0.08 0.80 0.97  0.07 0.85 0.97  0.06 0.86 0.98
0.3  Relaxed      0.11 0.78 0.94  0.08 0.85 0.95  0.07 0.89 0.95  0.06 0.90 0.96
0.3  MC+          0.14 0.72 0.93  0.12 0.75 0.94  0.11 0.77 0.94  0.11 0.77 0.94
0.3  RF           0.09 0.79 0.96  0.06 0.86 0.97  0.04 0.90 0.99  0.03 0.92 0.99
0.3  RGLM         0.09 0.83 0.95  0.06 0.88 0.97  0.04 0.92 0.98  0.03 0.93 0.99
0.3  SIS-SCAD     0.12 0.70 0.95  0.11 0.74 0.95  0.10 0.76 0.96  0.10 0.75 0.96
0.3  XGBoost      0.10 0.80 0.95  0.08 0.83 0.96  0.07 0.85 0.97  0.07 0.84 0.97

π    Method       MR   SE   SP    MR   SE   SP    MR   SE   SP    MR   SE   SP

0.4  Split-Lasso  0.10 0.86 0.93  0.07 0.92 0.94  0.04 0.94 0.97  0.04 0.96 0.97
0.4  Split-EN     0.10 0.86 0.93  0.07 0.92 0.94  0.04 0.94 0.97  0.03 0.96 0.97
0.4  Lasso        0.11 0.85 0.91  0.08 0.90 0.93  0.06 0.91 0.95  0.06 0.92 0.95
0.4  EN           0.11 0.85 0.92  0.08 0.91 0.94  0.06 0.92 0.96  0.05 0.94 0.96
0.4  Adaptive     0.11 0.82 0.93  0.08 0.89 0.94  0.06 0.90 0.96  0.06 0.92 0.96
0.4  Relaxed      0.12 0.84 0.91  0.09 0.89 0.92  0.07 0.91 0.94  0.07 0.93 0.94
0.4  MC+          0.14 0.81 0.90  0.11 0.85 0.91  0.10 0.85 0.93  0.10 0.87 0.92
0.4  RF           0.10 0.85 0.93  0.06 0.92 0.95  0.04 0.94 0.98  0.03 0.96 0.98
0.4  RGLM         0.09 0.87 0.93  0.06 0.93 0.95  0.04 0.94 0.97  0.03 0.96 0.97
0.4  SIS-SCAD     0.12 0.82 0.91  0.10 0.86 0.92  0.10 0.86 0.93  0.09 0.87 0.93
0.4  XGBoost      0.11 0.85 0.92  0.08 0.89 0.93  0.07 0.89 0.95  0.07 0.90 0.95

Table 32: TL, RC and PR for Scenario 2, ρ = 0, ρ = 0.2, n = 50, p = 1,500.
                  ζ = .           ζ = .           ζ = .           ζ = .
π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.2  Split-Lasso  0.88 0.28 0.37  0.71 0.37 0.40  0.55 0.33 0.54  0.45 0.24 0.73
0.2  Split-EN     0.86 0.38 0.29  0.70 0.40 0.36  0.54 0.39 0.51  0.44 0.33 0.67
0.2  Lasso        0.96 0.07 0.51  0.84 0.06 0.63  0.72 0.05 0.80  0.66 0.03 0.87
0.2  EN           0.93 0.11 0.37  0.80 0.09 0.58  0.66 0.08 0.74  0.58 0.05 0.86
0.2  Adaptive     1.04 0.06 0.46  0.96 0.06 0.61  0.90 0.05 0.78  0.82 0.03 0.90
0.2  Relaxed      1.08 0.07 0.54  1.18 0.05 0.71  1.10 0.04 0.82  1.10 0.03 0.87
0.2  MC+          1.00 0.03 0.58  1.00 0.02 0.67  0.95 0.01 0.85  0.91 0.01 0.88
0.2  RF           0.99 − − − − − − − −
0.2  RGLM         0.91 0.41 0.18  0.77 0.34 0.35  0.68 0.27 0.61  0.61 0.17 0.83
0.2  SIS-SCAD     1.12 0.02 0.62  1.09 0.01 0.82  1.02 0.01 0.91  0.92 0.00 0.96
0.2  XGBoost      1.04 − − − − − − − −

π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.3  Split-Lasso  0.96 0.44 0.26  0.76 0.46 0.33  0.56 0.41 0.48  0.47 0.29 0.70
0.3  Split-EN     0.96 0.43 0.25  0.76 0.51 0.33  0.56 0.50 0.44  0.46 0.38 0.66
0.3  Lasso        1.10 0.10 0.50  0.98 0.08 0.62  0.84 0.06 0.79  0.75 0.04 0.91
0.3  EN           1.06 0.14 0.44  0.90 0.13 0.54  0.74 0.10 0.71  0.63 0.07 0.87
0.3  Adaptive     1.16 0.10 0.49  1.07 0.08 0.65  0.98 0.06 0.79  0.88 0.04 0.90
0.3  Relaxed      1.25 0.09 0.56  1.41 0.07 0.70  1.43 0.05 0.86  1.06 0.03 0.93
0.3  MC+          1.15 0.05 0.66  1.09 0.03 0.77  1.05 0.02 0.87  0.99 0.01 0.97
0.3  RF           1.15 − − − − − − − −
0.3  RGLM         1.04 0.47 0.18  0.91 0.42 0.36  0.79 0.32 0.63  0.72 0.21 0.88
0.3  SIS-SCAD     1.28 0.03 0.72  1.56 0.02 0.81  1.17 0.01 0.87  1.10 0.00 0.91
0.3  XGBoost      1.23 − − − − − − − −

π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.4  Split-Lasso  1.01 0.57 0.21  0.78 0.49 0.30  0.58 0.42 0.48  0.47 0.29 0.69
0.4  Split-EN     1.02 0.56 0.21  0.78 0.56 0.28  0.57 0.50 0.45  0.45 0.39 0.65
0.4  Lasso        1.17 0.12 0.54  1.04 0.09 0.63  0.87 0.07 0.80  0.79 0.04 0.89
0.4  EN           1.16 0.16 0.46  0.96 0.13 0.56  0.77 0.11 0.71  0.65 0.07 0.85
0.4  Adaptive     1.26 0.10 0.51  1.15 0.08 0.65  1.03 0.06 0.80  0.93 0.04 0.88
0.4  Relaxed      1.29 0.09 0.63  1.26 0.08 0.68  1.12 0.06 0.83  1.20 0.03 0.92
0.4  MC+          1.29 0.04 0.60  1.17 0.03 0.79  1.07 0.02 0.88  1.09 0.01 0.93
0.4  RF           1.25 − − − − − − − −
0.4  RGLM         1.12 0.51 0.18  0.98 0.44 0.36  0.85 0.34 0.63  0.77 0.22 0.87
0.4  SIS-SCAD     1.33 0.03 0.71  1.33 0.02 0.77  1.34 0.01 0.89  1.22 0.00 0.92
0.4  XGBoost      1.30 − − − − − − − −

ρ = 0, ρ = 0.5, n = 50, p = 1,500.
                  ζ = .           ζ = .           ζ = .           ζ = .
π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.2  Split-Lasso  0.57 0.51 0.31  0.43 0.45 0.39  0.32 0.32 0.49  0.29 0.21 0.68
0.2  Split-EN     0.57 0.54 0.33  0.42 0.54 0.37  0.32 0.45 0.48  0.28 0.32 0.67
0.2  Lasso        0.67 0.10 0.56  0.56 0.07 0.64  0.47 0.04 0.74  0.44 0.02 0.87
0.2  EN           0.63 0.17 0.51  0.51 0.13 0.59  0.41 0.09 0.70  0.37 0.05 0.84
0.2  Adaptive     0.79 0.10 0.55  0.69 0.07 0.65  0.59 0.04 0.75  0.62 0.02 0.87
0.2  Relaxed      0.92 0.09 0.67  1.51 0.05 0.81  0.95 0.04 0.86  0.72 0.02 0.93
0.2  MC+          0.78 0.03 0.71  0.75 0.01 0.83  0.69 0.01 0.94  0.74 0.00 0.97
0.2  RF           0.73 − − − − − − − −
0.2  RGLM         0.60 0.51 0.33  0.50 0.39 0.56  0.43 0.28 0.84  0.42 0.16 0.95
0.2  SIS-SCAD     0.81 0.03 0.91  0.73 0.01 0.98  0.71 0.01 0.99  0.73 0.00 1.00
0.2  XGBoost      0.66 − − − − − − − −

π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.3  Split-Lasso  0.63 0.57 0.31  0.44 0.51 0.35  0.34 0.36 0.53  0.29 0.24 0.68
0.3  Split-EN     0.61 0.66 0.27  0.44 0.60 0.35  0.34 0.49 0.50  0.28 0.34 0.67
0.3  Lasso        0.76 0.12 0.52  0.62 0.09 0.63  0.53 0.05 0.75  0.46 0.03 0.87
0.3  EN           0.70 0.20 0.48  0.54 0.16 0.58  0.44 0.10 0.72  0.38 0.06 0.84
0.3  Adaptive     0.85 0.12 0.56  0.75 0.09 0.65  0.64 0.05 0.74  0.60 0.03 0.87
0.3  Relaxed      1.29 0.11 0.70  0.89 0.08 0.75  1.17 0.04 0.90  0.80 0.02 0.93
0.3  MC+          0.92 0.04 0.75  0.77 0.02 0.83  0.77 0.01 0.89  0.73 0.01 0.96
0.3  RF           0.85 − − − − − − − −
0.3  RGLM         0.69 0.59 0.33  0.58 0.46 0.58  0.50 0.30 0.89  0.47 0.18 0.98
0.3  SIS-SCAD     0.96 0.03 0.87  0.82 0.01 0.92  0.80 0.01 0.97  0.80 0.00 0.97
0.3  XGBoost      0.77 − − − − − − − −

π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.4  Split-Lasso  0.63 0.68 0.22  0.46 0.53 0.35  0.33 0.40 0.52  0.29 0.26 0.69
0.4  Split-EN     0.64 0.77 0.23  0.46 0.64 0.33  0.34 0.52 0.51  0.29 0.37 0.68
0.4  Lasso        0.80 0.15 0.59  0.68 0.09 0.64  0.52 0.06 0.78  0.49 0.03 0.87
0.4  EN           0.75 0.23 0.51  0.59 0.16 0.59  0.44 0.11 0.75  0.39 0.07 0.84
0.4  Adaptive     0.90 0.15 0.61  0.76 0.09 0.66  0.63 0.06 0.77  0.61 0.03 0.87
0.4  Relaxed      1.27 0.13 0.70  1.47 0.07 0.84  0.95 0.05 0.89  0.77 0.03 0.94
0.4  MC+          0.90 0.05 0.80  0.87 0.02 0.78  0.74 0.02 0.88  0.73 0.01 0.95
0.4  RF           0.92 − − − − − − − −
0.4  RGLM         0.75 0.63 0.33  0.62 0.48 0.59  0.53 0.33 0.93  0.51 0.19 0.99
0.4  SIS-SCAD     0.96 0.03 0.87  0.92 0.01 0.91  0.85 0.01 0.91  0.83 0.00 0.97
0.4  XGBoost      0.80 − − − − − − − −

ρ = 0, ρ = 0.8, n = 50, p = 1,500.
                  ζ = .           ζ = .           ζ = .           ζ = .
π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.2  Split-Lasso  0.40 0.55 0.24  0.31 0.40 0.33  0.24 0.29 0.43  0.23 0.19 0.57
0.2  Split-EN     0.41 0.79 0.25  0.31 0.65 0.35  0.24 0.47 0.48  0.23 0.34 0.63
0.2  Lasso        0.52 0.09 0.46  0.44 0.05 0.52  0.36 0.03 0.62  0.34 0.02 0.63
0.2  EN           0.49 0.20 0.50  0.40 0.14 0.55  0.32 0.08 0.67  0.30 0.05 0.72
0.2  Adaptive     0.66 0.08 0.52  0.58 0.05 0.53  0.51 0.03 0.63  0.49 0.02 0.65
0.2  Relaxed      1.08 0.07 0.66  1.02 0.04 0.77  0.83 0.02 0.85  0.73 0.01 0.91
0.2  MC+          0.57 0.01 0.80  0.54 0.01 0.84  0.53 0.00 0.92  0.52 0.00 0.93
0.2  RF           0.48 − − − − − − − −
0.2  RGLM         0.41 0.49 0.40  0.33 0.35 0.65  0.27 0.23 0.93  0.27 0.13 0.97
0.2  SIS-SCAD     0.61 0.02 0.99  0.56 0.01 0.97  0.53 0.00 1.00  0.55 0.00 0.98
0.2  XGBoost      0.46 − − − − − − − −

π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.3  Split-Lasso  0.44 0.60 0.23  0.33 0.45 0.33  0.26 0.29 0.42  0.23 0.19 0.56
0.3  Split-EN     0.44 0.82 0.23  0.34 0.68 0.34  0.27 0.50 0.47  0.23 0.35 0.62
0.3  Lasso        0.59 0.09 0.47  0.49 0.06 0.48  0.40 0.04 0.55  0.36 0.02 0.66
0.3  EN           0.54 0.23 0.49  0.44 0.16 0.55  0.35 0.10 0.63  0.30 0.06 0.72
0.3  Adaptive     0.66 0.09 0.45  0.59 0.06 0.51  0.50 0.04 0.59  0.45 0.02 0.65
0.3  Relaxed      1.61 0.08 0.75  1.05 0.05 0.77  0.78 0.03 0.84  0.86 0.02 0.83
0.3  MC+          0.67 0.01 0.62  0.62 0.01 0.67  0.59 0.00 0.82  0.59 0.00 0.89
0.3  RF           0.55 − − − − − − − −
0.3  RGLM         0.47 0.53 0.41  0.38 0.39 0.68  0.31 0.26 0.98  0.30 0.15 1.00
0.3  SIS-SCAD     0.66 0.03 0.96  0.64 0.01 1.00  0.60 0.01 1.00  0.59 0.00 1.00
0.3  XGBoost      0.60 − − − − − − − −

π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.4  Split-Lasso  0.46 0.60 0.21  0.35 0.43 0.31  0.27 0.30 0.44  0.24 0.19 0.56
0.4  Split-EN     0.46 0.86 0.21  0.35 0.72 0.34  0.27 0.52 0.49  0.24 0.37 0.63
0.4  Lasso        0.63 0.10 0.42  0.53 0.06 0.47  0.41 0.04 0.59  0.38 0.02 0.65
0.4  EN           0.58 0.24 0.45  0.46 0.16 0.55  0.35 0.10 0.65  0.32 0.07 0.73
0.4  Adaptive     0.71 0.10 0.42  0.61 0.06 0.48  0.51 0.04 0.60  0.49 0.02 0.65
0.4  Relaxed      1.05 0.08 0.63  0.96 0.05 0.81  0.99 0.03 0.85  1.07 0.02 0.89
0.4  MC+          0.72 0.02 0.59  0.67 0.01 0.71  0.63 0.01 0.74  0.60 0.00 0.85
0.4  RF           0.60 − − − − − − − −
0.4  RGLM         0.51 0.55 0.38  0.41 0.40 0.67  0.32 0.28 0.99  0.32 0.16 1.00
0.4  SIS-SCAD     0.69 0.03 0.92  0.63 0.01 0.99  0.58 0.01 0.99  0.58 0.00 0.99
0.4  XGBoost      0.57 − − − − − − − −

ρ = 0.2, ρ = 0.5, n = 50, p = 1,500.
                  ζ = .           ζ = .           ζ = .           ζ = .
π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.2  Split-Lasso  0.58 0.44 0.31  0.43 0.42 0.39  0.33 0.32 0.49  0.28 0.21 0.68
0.2  Split-EN     0.58 0.55 0.29  0.42 0.51 0.36  0.33 0.43 0.48  0.28 0.31 0.66
0.2  Lasso        0.66 0.09 0.47  0.56 0.06 0.62  0.49 0.04 0.66  0.45 0.02 0.82
0.2  EN           0.64 0.15 0.45  0.50 0.12 0.57  0.43 0.08 0.66  0.38 0.05 0.81
0.2  Adaptive     0.81 0.09 0.51  0.69 0.07 0.62  0.59 0.04 0.66  0.62 0.02 0.82
0.2  Relaxed      0.95 0.08 0.58  1.22 0.05 0.73  0.98 0.03 0.77  0.96 0.02 0.88
0.2  MC+          0.78 0.02 0.67  0.77 0.01 0.80  0.72 0.01 0.83  0.75 0.00 0.96
0.2  RF           0.71 − − − − − − − −
0.2  RGLM         0.60 0.47 0.29  0.49 0.37 0.53  0.44 0.25 0.73  0.43 0.15 0.90
0.2  SIS-SCAD     0.78 0.02 0.86  0.75 0.01 0.94  0.71 0.01 0.97  0.73 0.00 0.99
0.2  XGBoost      0.68 − − − − − − − −

π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.3  Split-Lasso  0.65 0.52 0.30  0.46 0.48 0.37  0.35 0.35 0.54  0.29 0.25 0.70
0.3  Split-EN     0.65 0.52 0.30  0.47 0.54 0.38  0.35 0.49 0.50  0.29 0.34 0.68
0.3  Lasso        0.77 0.11 0.48  0.61 0.08 0.65  0.52 0.05 0.76  0.46 0.03 0.85
0.3  EN           0.72 0.18 0.43  0.54 0.15 0.60  0.44 0.10 0.73  0.38 0.06 0.84
0.3  Adaptive     0.86 0.11 0.49  0.77 0.08 0.67  0.65 0.05 0.76  0.61 0.03 0.85
0.3  Relaxed      1.38 0.10 0.60  1.01 0.07 0.76  1.05 0.04 0.88  0.70 0.03 0.91
0.3  MC+          0.92 0.03 0.61  0.79 0.02 0.78  0.78 0.01 0.86  0.74 0.01 0.91
0.3  RF           0.83 − − − − − − − −
0.3  RGLM         0.69 0.55 0.31  0.57 0.44 0.55  0.50 0.28 0.81  0.48 0.17 0.94
0.3  SIS-SCAD     0.89 0.03 0.83  0.82 0.01 0.90  0.80 0.01 0.99  0.80 0.00 0.97
0.3  XGBoost      0.78 − − − − − − − −

π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.4  Split-Lasso  0.70 0.55 0.30  0.51 0.50 0.35  0.35 0.39 0.54  0.30 0.25 0.72
0.4  Split-EN     0.70 0.60 0.30  0.51 0.56 0.36  0.35 0.48 0.52  0.30 0.35 0.70
0.4  Lasso        0.82 0.13 0.54  0.69 0.08 0.61  0.53 0.06 0.74  0.49 0.03 0.86
0.4  EN           0.78 0.21 0.48  0.61 0.15 0.57  0.45 0.11 0.71  0.39 0.07 0.84
0.4  Adaptive     0.89 0.13 0.54  0.79 0.08 0.61  0.66 0.06 0.74  0.61 0.03 0.85
0.4  Relaxed      1.19 0.12 0.60  1.17 0.07 0.72  0.97 0.05 0.85  0.71 0.03 0.92
0.4  MC+          0.93 0.04 0.65  0.87 0.02 0.72  0.76 0.01 0.80  0.74 0.01 0.91
0.4  RF           0.90 − − − − − − − −
0.4  RGLM         0.73 0.60 0.32  0.62 0.45 0.55  0.54 0.31 0.84  0.52 0.19 0.96
0.4  SIS-SCAD     0.99 0.03 0.79  0.92 0.01 0.89  0.86 0.01 0.88  0.83 0.00 0.97
0.4  XGBoost      0.81 − − − − − − − −

ρ = 0.2, ρ = 0.8, n = 50, p = 1,500.
                  ζ = .           ζ = .           ζ = .           ζ = .
π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.2  Split-Lasso  0.42 0.55 0.26  0.32 0.42 0.36  0.24 0.29 0.45  0.24 0.19 0.59
0.2  Split-EN     0.44 0.72 0.30  0.33 0.64 0.36  0.25 0.47 0.49  0.24 0.33 0.63
0.2  Lasso        0.53 0.08 0.47  0.45 0.05 0.51  0.36 0.03 0.61  0.35 0.02 0.63
0.2  EN           0.51 0.19 0.50  0.40 0.14 0.55  0.31 0.08 0.66  0.30 0.05 0.71
0.2  Adaptive     0.65 0.08 0.50  0.56 0.05 0.51  0.50 0.03 0.63  0.49 0.02 0.62
0.2  Relaxed      1.09 0.07 0.68  1.02 0.04 0.75  0.87 0.02 0.81  0.80 0.01 0.84
0.2  MC+          0.57 0.01 0.80  0.53 0.01 0.78  0.52 0.00 0.93  0.52 0.00 0.87
0.2  RF           0.48 − − − − − − − −
0.2  RGLM         0.41 0.51 0.41  0.33 0.36 0.66  0.27 0.23 0.90  0.28 0.13 0.96
0.2  SIS-SCAD     0.61 0.02 0.99  0.56 0.01 0.99  0.53 0.00 1.00  0.55 0.00 0.97
0.2  XGBoost      0.46 − − − − − − − −

π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.3  Split-Lasso  0.46 0.60 0.25  0.35 0.44 0.35  0.27 0.30 0.47  0.23 0.19 0.59
0.3  Split-EN     0.47 0.77 0.27  0.35 0.67 0.34  0.27 0.49 0.48  0.23 0.35 0.65
0.3  Lasso        0.59 0.09 0.42  0.48 0.06 0.50  0.40 0.04 0.62  0.35 0.02 0.71
0.3  EN           0.54 0.21 0.49  0.44 0.15 0.55  0.35 0.09 0.63  0.30 0.06 0.74
0.3  Adaptive     0.67 0.09 0.44  0.60 0.06 0.49  0.50 0.04 0.62  0.46 0.02 0.71
0.3  Relaxed      1.26 0.08 0.64  1.04 0.05 0.73  0.84 0.03 0.86  0.83 0.02 0.90
0.3  MC+          0.67 0.01 0.66  0.63 0.01 0.65  0.60 0.00 0.76  0.59 0.00 0.84
0.3  RF           0.55 − − − − − − − −
0.3  RGLM         0.46 0.55 0.43  0.37 0.39 0.69  0.31 0.25 0.96  0.30 0.15 1.00
0.3  SIS-SCAD     0.66 0.02 0.96  0.64 0.01 0.99  0.60 0.01 1.00  0.59 0.00 0.99
0.3  XGBoost      0.60 − − − − − − − −

π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.4  Split-Lasso  0.49 0.60 0.25  0.38 0.41 0.32  0.28 0.30 0.45  0.24 0.18 0.59
0.4  Split-EN     0.49 0.80 0.25  0.39 0.64 0.36  0.29 0.51 0.49  0.24 0.36 0.64
0.4  Lasso        0.63 0.09 0.42  0.54 0.06 0.50  0.42 0.04 0.59  0.39 0.02 0.69
0.4  EN           0.58 0.22 0.47  0.48 0.14 0.56  0.36 0.10 0.65  0.32 0.06 0.75
0.4  Adaptive     0.70 0.09 0.43  0.63 0.06 0.50  0.53 0.04 0.60  0.47 0.02 0.68
0.4  Relaxed      0.94 0.08 0.64  0.96 0.05 0.73  0.83 0.03 0.83  1.06 0.02 0.87
0.4  MC+          0.72 0.02 0.52  0.67 0.01 0.62  0.62 0.01 0.70  0.60 0.00 0.82
0.4  RF           0.60 − − − − − − − −
0.4  RGLM         0.50 0.57 0.41  0.40 0.41 0.66  0.33 0.27 0.97  0.32 0.16 1.00
0.4  SIS-SCAD     0.69 0.03 0.96  0.63 0.01 0.98  0.59 0.01 0.98  0.58 0.00 1.00
0.4  XGBoost      0.57 − − − − − − − −

ρ = 0.5, ρ = 0.8, n = 50, p = 1,500.
                  ζ = .           ζ = .           ζ = .           ζ = .
π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.2  Split-Lasso  0.43 0.48 0.30  0.32 0.40 0.37  0.23 0.27 0.50  0.22 0.18 0.62
0.2  Split-EN     0.45 0.65 0.27  0.33 0.58 0.36  0.23 0.44 0.50  0.22 0.30 0.64
0.2  Lasso        0.53 0.07 0.40  0.42 0.05 0.51  0.34 0.03 0.60  0.34 0.01 0.59
0.2  EN           0.49 0.15 0.43  0.38 0.11 0.53  0.29 0.08 0.64  0.29 0.05 0.68
0.2  Adaptive     0.64 0.07 0.42  0.55 0.05 0.51  0.48 0.03 0.61  0.47 0.01 0.59
0.2  Relaxed      1.03 0.06 0.53  1.15 0.04 0.65  0.72 0.02 0.72  0.77 0.01 0.75
0.2  MC+          0.61 0.01 0.70  0.56 0.01 0.74  0.53 0.00 0.90  0.54 0.00 0.85
0.2  RF           0.49 − − − − − − − −
0.2  RGLM         0.43 0.46 0.36  0.34 0.33 0.58  0.28 0.21 0.77  0.30 0.12 0.84
0.2  SIS-SCAD     0.62 0.02 0.93  0.57 0.01 0.92  0.53 0.00 0.99  0.55 0.00 0.95
0.2  XGBoost      0.50 − − − − − − − −

π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.3  Split-Lasso  0.48 0.51 0.29  0.36 0.40 0.38  0.27 0.28 0.50  0.22 0.19 0.65
0.3  Split-EN     0.49 0.61 0.33  0.36 0.58 0.37  0.27 0.46 0.51  0.22 0.33 0.68
0.3  Lasso        0.57 0.08 0.43  0.47 0.05 0.50  0.39 0.03 0.61  0.33 0.02 0.74
0.3  EN           0.53 0.17 0.46  0.43 0.12 0.52  0.34 0.09 0.64  0.28 0.06 0.77
0.3  Adaptive     0.67 0.08 0.44  0.59 0.05 0.49  0.49 0.03 0.61  0.46 0.02 0.73
0.3  Relaxed      1.69 0.06 0.69  0.95 0.04 0.66  0.74 0.03 0.77  0.74 0.02 0.83
0.3  MC+          0.67 0.01 0.58  0.61 0.01 0.60  0.58 0.00 0.78  0.58 0.00 0.82
0.3  RF           0.55 − − − − − − − −
0.3  RGLM         0.46 0.53 0.40  0.38 0.37 0.62  0.33 0.23 0.84  0.31 0.14 0.95
0.3  SIS-SCAD     0.67 0.02 0.95  0.63 0.01 0.96  0.60 0.01 0.98  0.59 0.00 0.99
0.3  XGBoost      0.62 − − − − − − − −

π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.4  Split-Lasso  0.53 0.48 0.31  0.41 0.38 0.38  0.29 0.29 0.50  0.24 0.19 0.64
0.4  Split-EN     0.54 0.64 0.32  0.42 0.53 0.39  0.30 0.45 0.52  0.24 0.34 0.67
0.4  Lasso        0.60 0.09 0.45  0.53 0.05 0.49  0.42 0.04 0.61  0.37 0.02 0.71
0.4  EN           0.58 0.18 0.45  0.48 0.12 0.53  0.37 0.09 0.63  0.30 0.06 0.76
0.4  Adaptive     0.72 0.08 0.43  0.64 0.05 0.47  0.54 0.04 0.62  0.47 0.02 0.70
0.4  Relaxed      0.89 0.07 0.58  0.88 0.04 0.69  0.95 0.03 0.77  1.06 0.02 0.83
0.4  MC+          0.70 0.02 0.52  0.65 0.01 0.55  0.61 0.01 0.64  0.60 0.00 0.79
0.4  RF           0.60 − − − − − − − −
0.4  RGLM         0.50 0.55 0.39  0.42 0.37 0.59  0.35 0.25 0.83  0.33 0.16 0.96
0.4  SIS-SCAD     0.70 0.03 0.92  0.64 0.01 0.93  0.59 0.01 0.95  0.58 0.00 0.97
0.4  XGBoost      0.57 − − − − − − − −

ρ = 0, ρ = 0.2, n = 100, p = 1,500.
                  ζ = .           ζ = .           ζ = .           ζ = .
π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.2  Split-Lasso  0.77 0.49 0.32  0.59 0.58 0.33  0.44 0.54 0.42  0.34 0.39 0.64
0.2  Split-EN     0.76 0.58 0.26  0.59 0.59 0.32  0.44 0.59 0.40  0.33 0.48 0.60
0.2  Lasso        0.83 0.16 0.53  0.68 0.12 0.69  0.58 0.09 0.80  0.53 0.05 0.91
0.2  EN           0.80 0.21 0.49  0.66 0.16 0.64  0.55 0.12 0.73  0.46 0.08 0.87
0.2  Adaptive     0.93 0.16 0.58  0.83 0.12 0.71  0.72 0.09 0.81  0.64 0.05 0.91
0.2  Relaxed      0.83 0.13 0.69  0.75 0.11 0.73  0.72 0.08 0.84  0.65 0.05 0.94
0.2  MC+          0.88 0.07 0.77  0.84 0.04 0.87  0.77 0.03 0.95  0.76 0.01 0.99
0.2  RF           0.94 − − − − − − − −
0.2  RGLM         0.80 0.67 0.17  0.66 0.58 0.37  0.56 0.46 0.68  0.51 0.30 0.93
0.2  SIS-SCAD     0.97 0.05 0.70  0.92 0.03 0.81  0.89 0.01 0.95  0.84 0.01 0.96
0.2  XGBoost      0.99 − − − − − − − −

π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.3  Split-Lasso  0.85 0.61 0.23  0.66 0.67 0.26  0.45 0.59 0.40  0.37 0.44 0.63
0.3  Split-EN     0.85 0.68 0.20  0.65 0.73 0.23  0.45 0.65 0.37  0.36 0.51 0.60
0.3  Lasso        0.93 0.21 0.52  0.79 0.15 0.68  0.66 0.11 0.80  0.59 0.06 0.91
0.3  EN           0.93 0.25 0.42  0.77 0.19 0.61  0.61 0.14 0.75  0.51 0.09 0.86
0.3  Adaptive     1.00 0.21 0.55  0.92 0.15 0.71  0.76 0.11 0.79  0.69 0.06 0.91
0.3  Relaxed      0.98 0.18 0.63  0.85 0.13 0.73  0.77 0.10 0.84  0.68 0.06 0.93
0.3  MC+          0.97 0.10 0.76  0.90 0.06 0.88  0.84 0.04 0.95  0.81 0.02 0.98
0.3  RF           1.09 − − − − − − − −
0.3  RGLM         0.92 0.73 0.17  0.78 0.67 0.36  0.64 0.54 0.72  0.59 0.34 0.95
0.3  SIS-SCAD     1.04 0.05 0.76  0.98 0.03 0.88  1.00 0.02 0.92  1.01 0.01 0.97
0.3  XGBoost      1.08 − − − − − − − −

π    Method       TL   RC   PR    TL   RC   PR    TL   RC   PR    TL   RC   PR

0.4 Split-Lasso 0.90 0.75 0.16 0.70 0.72 0.22 0.47 0.62 0.39 0.38 0.46 0.63
Split-EN 0.92 0.83 0.13 0.69 0.79 0.20 0.47 0.68 0.37 0.38 0.55 0.59
Lasso 1.00 0.20 0.49 0.85 0.16 0.67 0.68 0.11 0.79 0.63 0.07 0.91
EN 0.99 0.25 0.44 0.82 0.20 0.59 0.63 0.16 0.74 0.55 0.10 0.87
Adaptive 1.06 0.20 0.53 0.96 0.16 0.71 0.77 0.11 0.79 0.71 0.07 0.90
Relaxed 1.04 0.16 0.69 0.87 0.15 0.70 0.80 0.11 0.83 0.74 0.06 0.94
MC+ 1.06 0.10 0.76 0.97 0.07 0.85 0.88 0.04 0.95 0.85 0.02 0.98
RF 1.18 − − − − − − − −
RGLM 0.99 0.77 0.16 0.84 0.70 0.36 0.68 0.57 0.73 0.63 0.37 0.96
SIS-SCAD 1.12 0.05 0.80 1.06 0.03 0.88 1.03 0.02 0.97 1.02 0.01 0.99
XGBoost 1.15 − − − − − − − −

ρ = 0, ρ = 0.5, n = 100, p = 1,500.
ζ = . ζ = . ζ = . ζ = .
π Method TL RC PR TL RC PR TL RC PR TL RC PR

0.2 Split-Lasso 0.47 0.68 0.23 0.34 0.61 0.31 0.24 0.46 0.43 0.21 0.32 0.64
Split-EN 0.48 0.73 0.23 0.34 0.71 0.28 0.24 0.58 0.41 0.21 0.42 0.63
Lasso 0.55 0.17 0.53 0.45 0.12 0.67 0.36 0.07 0.76 0.32 0.04 0.87
EN 0.53 0.26 0.47 0.42 0.18 0.63 0.32 0.12 0.71 0.28 0.08 0.84
Adaptive 0.63 0.17 0.52 0.52 0.12 0.66 0.45 0.07 0.75 0.41 0.04 0.87
Relaxed 0.64 0.15 0.65 0.56 0.10 0.78 0.69 0.06 0.88 0.47 0.04 0.95
MC+ 0.71 0.04 0.83 0.66 0.02 0.90 0.62 0.01 0.94 0.64 0.01 0.98
RF 0.65 − − − − − − − −
RGLM 0.52 0.72 0.28 0.41 0.58 0.59 0.34 0.42 0.95 0.34 0.23 1.00
SIS-SCAD 0.79 0.04 0.81 0.69 0.02 0.90 0.65 0.01 0.96 0.64 0.01 0.97
XGBoost 0.58 − − − − − − − −

π Method TL RC PR TL RC PR TL RC PR TL RC PR

0.3 Split-Lasso 0.53 0.79 0.18 0.38 0.66 0.26 0.27 0.51 0.44 0.23 0.35 0.64
Split-EN 0.53 0.87 0.15 0.38 0.76 0.25 0.27 0.62 0.42 0.23 0.45 0.63
Lasso 0.64 0.20 0.56 0.52 0.13 0.63 0.41 0.08 0.76 0.36 0.05 0.85
EN 0.61 0.28 0.47 0.48 0.19 0.57 0.35 0.14 0.70 0.30 0.09 0.83
Adaptive 0.70 0.20 0.53 0.60 0.13 0.60 0.48 0.09 0.77 0.43 0.05 0.84
Relaxed 0.71 0.17 0.64 0.59 0.12 0.72 0.50 0.08 0.83 0.45 0.05 0.90
MC+ 0.74 0.07 0.78 0.66 0.04 0.84 0.62 0.02 0.93 0.61 0.01 0.96
RF 0.76 − − − − − − − −
RGLM 0.60 0.80 0.28 0.49 0.66 0.55 0.39 0.48 0.98 0.39 0.27 1.00
SIS-SCAD 0.93 0.05 0.79 0.77 0.03 0.91 0.65 0.01 0.92 0.67 0.01 0.99
XGBoost 0.66 − − − − − − − −

π Method TL RC PR TL RC PR TL RC PR TL RC PR

0.4 Split-Lasso 0.57 0.79 0.18 0.40 0.69 0.28 0.29 0.54 0.46 0.24 0.37 0.66
Split-EN 0.57 0.87 0.15 0.41 0.79 0.25 0.29 0.66 0.43 0.24 0.47 0.64
Lasso 0.68 0.21 0.52 0.56 0.14 0.64 0.43 0.09 0.74 0.38 0.05 0.87
EN 0.66 0.29 0.46 0.51 0.21 0.60 0.37 0.15 0.71 0.32 0.09 0.85
Adaptive 0.74 0.21 0.53 0.62 0.14 0.66 0.51 0.09 0.75 0.46 0.05 0.86
Relaxed 0.80 0.18 0.66 0.72 0.13 0.73 0.59 0.08 0.86 0.54 0.05 0.94
MC+ 0.76 0.09 0.77 0.67 0.05 0.87 0.60 0.03 0.92 0.59 0.01 0.96
RF 0.83 − − − − − − − −
RGLM 0.66 0.80 0.25 0.52 0.69 0.55 0.42 0.51 0.98 0.42 0.28 1.00
SIS-SCAD 0.81 0.05 0.82 0.77 0.03 0.90 0.66 0.02 0.97 0.67 0.01 0.96
XGBoost 0.72 − − − − − − − −

ρ = 0, ρ = 0.8, n = 100, p = 1,500.
ζ = . ζ = . ζ = . ζ = .
π Method TL RC PR TL RC PR TL RC PR TL RC PR

0.2 Split-Lasso 0.34 0.71 0.18 0.24 0.57 0.27 0.19 0.42 0.40 0.17 0.28 0.54
Split-EN 0.34 0.89 0.16 0.25 0.76 0.27 0.20 0.57 0.41 0.17 0.41 0.57
Lasso 0.43 0.12 0.40 0.34 0.08 0.49 0.28 0.05 0.55 0.25 0.03 0.64
EN 0.41 0.26 0.45 0.31 0.18 0.56 0.25 0.11 0.61 0.21 0.07 0.71
Adaptive 0.50 0.12 0.41 0.40 0.08 0.48 0.36 0.05 0.57 0.31 0.03 0.64
Relaxed 0.66 0.09 0.77 0.60 0.07 0.78 0.50 0.04 0.82 0.44 0.02 0.88
MC+ 0.56 0.01 0.79 0.52 0.01 0.86 0.48 0.00 0.86 0.48 0.00 0.96
RF 0.42 − − − − − − − −
RGLM 0.36 0.64 0.37 0.27 0.51 0.68 0.21 0.33 0.99 0.21 0.18 1.00
SIS-SCAD 0.57 0.03 0.93 0.59 0.02 0.95 0.50 0.01 1.00 0.54 0.00 0.99
XGBoost 0.41 − − − − − − − −

π Method TL RC PR TL RC PR TL RC PR TL RC PR

0.3 Split-Lasso 0.39 0.72 0.17 0.26 0.56 0.26 0.21 0.40 0.38 0.18 0.26 0.53
Split-EN 0.39 0.91 0.15 0.27 0.79 0.27 0.21 0.61 0.42 0.18 0.42 0.58
Lasso 0.50 0.15 0.48 0.37 0.09 0.48 0.31 0.06 0.57 0.27 0.03 0.64
EN 0.47 0.28 0.45 0.34 0.19 0.52 0.28 0.13 0.64 0.23 0.08 0.71
Adaptive 0.57 0.15 0.49 0.44 0.09 0.45 0.38 0.06 0.55 0.34 0.03 0.65
Relaxed 0.72 0.13 0.66 0.52 0.07 0.75 0.46 0.05 0.88 0.46 0.02 0.95
MC+ 0.64 0.02 0.67 0.54 0.01 0.70 0.52 0.01 0.82 0.52 0.00 0.84
RF 0.50 − − − − − − − −
RGLM 0.42 0.70 0.35 0.31 0.54 0.66 0.25 0.37 1.00 0.25 0.20 1.00
SIS-SCAD 0.61 0.04 0.91 0.55 0.02 0.92 0.50 0.01 0.98 0.52 0.01 0.99
XGBoost 0.47 − − − − − − − −

π Method TL RC PR TL RC PR TL RC PR TL RC PR

0.4 Split-Lasso 0.41 0.75 0.16 0.29 0.56 0.26 0.21 0.40 0.39 0.19 0.26 0.55
Split-EN 0.41 0.91 0.14 0.29 0.78 0.27 0.22 0.62 0.43 0.20 0.44 0.60
Lasso 0.54 0.15 0.39 0.41 0.10 0.48 0.33 0.06 0.57 0.29 0.03 0.66
EN 0.50 0.28 0.42 0.37 0.20 0.52 0.29 0.13 0.63 0.25 0.08 0.71
Adaptive 0.58 0.14 0.40 0.48 0.10 0.46 0.38 0.06 0.56 0.34 0.03 0.65
Relaxed 0.72 0.12 0.65 0.76 0.08 0.77 0.55 0.05 0.87 0.48 0.03 0.91
MC+ 0.64 0.03 0.62 0.55 0.02 0.64 0.49 0.01 0.71 0.49 0.01 0.75
RF 0.54 − − − − − − − −
RGLM 0.45 0.73 0.32 0.34 0.56 0.65 0.27 0.38 1.00 0.28 0.21 1.00
SIS-SCAD 0.65 0.04 0.81 0.49 0.02 0.88 0.47 0.01 0.93 0.44 0.01 0.98
XGBoost 0.52 − − − − − − − −

ρ = 0.2, ρ = 0.5, n = 100, p = 1,500.
ζ = . ζ = . ζ = . ζ = .
π Method TL RC PR TL RC PR TL RC PR TL RC PR

0.2 Split-Lasso 0.49 0.63 0.24 0.36 0.61 0.31 0.24 0.46 0.45 0.21 0.32 0.67
Split-EN 0.49 0.69 0.22 0.36 0.69 0.29 0.24 0.57 0.43 0.21 0.42 0.65
Lasso 0.55 0.15 0.49 0.46 0.11 0.64 0.36 0.07 0.75 0.32 0.04 0.85
EN 0.53 0.23 0.45 0.42 0.17 0.60 0.31 0.12 0.71 0.27 0.07 0.84
Adaptive 0.63 0.15 0.48 0.54 0.11 0.63 0.45 0.07 0.76 0.40 0.04 0.85
Relaxed 0.62 0.14 0.57 0.61 0.10 0.71 0.62 0.06 0.86 0.49 0.03 0.92
MC+ 0.68 0.04 0.68 0.66 0.02 0.82 0.62 0.01 0.90 0.66 0.01 0.96
RF 0.64 − − − − − − − −
RGLM 0.51 0.71 0.28 0.41 0.57 0.57 0.35 0.40 0.86 0.35 0.23 0.98
SIS-SCAD 0.77 0.04 0.82 0.69 0.02 0.92 0.63 0.01 0.96 0.66 0.01 0.98
XGBoost 0.60 − − − − − − − −

π Method TL RC PR TL RC PR TL RC PR TL RC PR

0.3 Split-Lasso 0.57 0.70 0.24 0.40 0.63 0.30 0.27 0.50 0.46 0.23 0.35 0.66
Split-EN 0.57 0.77 0.22 0.41 0.73 0.27 0.27 0.62 0.43 0.23 0.45 0.64
Lasso 0.64 0.18 0.50 0.52 0.12 0.62 0.40 0.08 0.74 0.36 0.05 0.84
EN 0.61 0.25 0.48 0.48 0.18 0.58 0.35 0.13 0.72 0.31 0.08 0.83
Adaptive 0.71 0.18 0.51 0.60 0.12 0.61 0.49 0.08 0.75 0.43 0.05 0.84
Relaxed 0.74 0.16 0.60 0.58 0.12 0.70 0.51 0.08 0.84 0.47 0.04 0.89
MC+ 0.76 0.06 0.62 0.66 0.04 0.80 0.64 0.02 0.86 0.80 0.01 0.93
RF 0.75 − − − − − − − −
RGLM 0.60 0.79 0.28 0.48 0.64 0.55 0.39 0.46 0.90 0.39 0.27 0.99
SIS-SCAD 0.77 0.05 0.74 0.69 0.03 0.88 0.68 0.01 0.91 0.69 0.01 0.95
XGBoost 0.68 − − − − − − − −

π Method TL RC PR TL RC PR TL RC PR TL RC PR

0.4 Split-Lasso 0.62 0.68 0.24 0.44 0.64 0.31 0.29 0.53 0.48 0.25 0.36 0.68
Split-EN 0.62 0.72 0.25 0.44 0.72 0.28 0.30 0.63 0.45 0.25 0.47 0.66
Lasso 0.69 0.19 0.52 0.55 0.13 0.62 0.42 0.09 0.76 0.38 0.05 0.85
EN 0.67 0.26 0.46 0.51 0.19 0.58 0.38 0.14 0.72 0.33 0.09 0.84
Adaptive 0.76 0.19 0.53 0.62 0.13 0.64 0.51 0.09 0.75 0.46 0.05 0.85
Relaxed 0.78 0.17 0.64 0.64 0.12 0.70 0.62 0.08 0.84 0.53 0.05 0.92
MC+ 0.76 0.08 0.70 0.69 0.05 0.76 0.60 0.03 0.87 0.59 0.01 0.92
RF 0.83 − − − − − − − −
RGLM 0.65 0.83 0.26 0.51 0.68 0.54 0.43 0.48 0.91 0.43 0.28 0.99
SIS-SCAD 0.82 0.05 0.80 0.75 0.03 0.88 0.67 0.02 0.94 0.71 0.01 0.94
XGBoost 0.74 − − − − − − − −

ρ = 0.2, ρ = 0.8, n = 100, p = 1,500.
ζ = . ζ = . ζ = . ζ = .
π Method TL RC PR TL RC PR TL RC PR TL RC PR

0.2 Split-Lasso 0.34 0.71 0.19 0.25 0.57 0.28 0.19 0.41 0.41 0.17 0.26 0.54
Split-EN 0.35 0.87 0.19 0.25 0.75 0.28 0.19 0.56 0.43 0.17 0.40 0.58
Lasso 0.42 0.12 0.45 0.33 0.08 0.51 0.27 0.05 0.60 0.24 0.03 0.66
EN 0.40 0.25 0.47 0.30 0.18 0.56 0.24 0.11 0.65 0.21 0.07 0.73
Adaptive 0.49 0.12 0.44 0.40 0.08 0.50 0.34 0.05 0.59 0.31 0.03 0.67
Relaxed 0.60 0.10 0.69 0.52 0.07 0.75 0.52 0.04 0.80 0.41 0.02 0.88
MC+ 0.56 0.01 0.82 0.51 0.01 0.80 0.47 0.00 0.90 0.48 0.00 0.94
RF 0.41 − − − − − − − −
RGLM 0.35 0.68 0.40 0.26 0.52 0.70 0.21 0.32 0.99 0.21 0.18 1.00
SIS-SCAD 0.59 0.03 0.92 0.53 0.02 0.98 0.50 0.01 1.00 0.53 0.00 0.99
XGBoost 0.41 − − − − − − − −

π Method TL RC PR TL RC PR TL RC PR TL RC PR

0.3 Split-Lasso 0.40 0.73 0.18 0.28 0.56 0.28 0.21 0.40 0.41 0.18 0.27 0.56
Split-EN 0.41 0.88 0.17 0.29 0.77 0.28 0.21 0.60 0.44 0.18 0.42 0.60
Lasso 0.49 0.14 0.47 0.38 0.09 0.49 0.31 0.05 0.60 0.26 0.03 0.69
EN 0.48 0.27 0.49 0.35 0.20 0.53 0.27 0.12 0.65 0.22 0.08 0.73
Adaptive 0.56 0.14 0.49 0.44 0.09 0.47 0.37 0.06 0.58 0.33 0.03 0.67
Relaxed 0.67 0.11 0.69 0.53 0.08 0.69 0.45 0.05 0.81 0.46 0.03 0.92
MC+ 0.63 0.02 0.60 0.56 0.01 0.67 0.51 0.01 0.72 0.52 0.00 0.83
RF 0.50 − − − − − − − −
RGLM 0.41 0.73 0.37 0.31 0.55 0.69 0.25 0.37 0.99 0.25 0.20 1.00
SIS-SCAD 0.62 0.04 0.90 0.56 0.02 0.92 0.51 0.01 0.97 0.52 0.01 0.99
XGBoost 0.47 − − − − − − − −

π Method TL RC PR TL RC PR TL RC PR TL RC PR

0.4 Split-Lasso 0.43 0.73 0.18 0.30 0.57 0.29 0.21 0.40 0.42 0.19 0.27 0.58
Split-EN 0.43 0.90 0.15 0.30 0.77 0.29 0.22 0.62 0.44 0.19 0.44 0.62
Lasso 0.54 0.14 0.46 0.40 0.09 0.50 0.32 0.06 0.61 0.28 0.03 0.69
EN 0.51 0.27 0.43 0.37 0.19 0.53 0.28 0.12 0.63 0.24 0.08 0.74
Adaptive 0.58 0.15 0.41 0.47 0.10 0.51 0.37 0.06 0.60 0.34 0.03 0.69
Relaxed 0.70 0.11 0.68 0.62 0.08 0.74 0.52 0.05 0.88 0.44 0.03 0.90
MC+ 0.63 0.03 0.53 0.54 0.02 0.65 0.49 0.01 0.74 0.50 0.01 0.76
RF 0.54 − − − − − − − −
RGLM 0.44 0.76 0.33 0.33 0.57 0.68 0.27 0.38 1.00 0.28 0.21 1.00
SIS-SCAD 0.59 0.05 0.82 0.50 0.02 0.90 0.47 0.01 0.94 0.45 0.01 0.95
XGBoost 0.52 − − − − − − − −

ρ = 0.5, ρ = 0.8, n = 100, p = 1,500.
ζ = . ζ = . ζ = . ζ = .
π Method TL RC PR TL RC PR TL RC PR TL RC PR

0.2 Split-Lasso 0.34 0.68 0.19 0.24 0.53 0.29 0.18 0.40 0.45 0.16 0.28 0.60
Split-EN 0.35 0.81 0.19 0.24 0.71 0.28 0.18 0.54 0.44 0.16 0.38 0.61
Lasso 0.41 0.11 0.46 0.32 0.07 0.51 0.25 0.05 0.66 0.22 0.03 0.73
EN 0.39 0.22 0.48 0.29 0.16 0.55 0.22 0.10 0.68 0.19 0.06 0.78
Adaptive 0.48 0.11 0.46 0.39 0.07 0.51 0.33 0.05 0.66 0.29 0.03 0.72
Relaxed 0.64 0.09 0.65 0.55 0.06 0.67 0.43 0.04 0.76 0.42 0.02 0.87
MC+ 0.56 0.01 0.77 0.51 0.01 0.81 0.46 0.00 0.87 0.49 0.00 0.94
RF 0.41 − − − − − − − −
RGLM 0.35 0.66 0.38 0.27 0.50 0.65 0.22 0.31 0.91 0.22 0.18 0.98
SIS-SCAD 0.57 0.03 0.90 0.54 0.02 0.94 0.50 0.01 0.99 0.53 0.00 0.99
XGBoost 0.41 − − − − − − − −

π Method TL RC PR TL RC PR TL RC PR TL RC PR

0.3 Split-Lasso 0.42 0.69 0.21 0.29 0.54 0.31 0.20 0.41 0.46 0.17 0.28 0.63
Split-EN 0.42 0.81 0.19 0.29 0.73 0.29 0.21 0.57 0.47 0.17 0.41 0.65
Lasso 0.48 0.13 0.49 0.37 0.09 0.54 0.29 0.05 0.67 0.25 0.03 0.75
EN 0.47 0.23 0.49 0.34 0.17 0.57 0.25 0.12 0.68 0.21 0.07 0.78
Adaptive 0.55 0.13 0.48 0.44 0.09 0.53 0.36 0.06 0.65 0.33 0.03 0.75
Relaxed 0.66 0.10 0.67 0.51 0.08 0.63 0.43 0.05 0.75 0.42 0.03 0.88
MC+ 0.63 0.02 0.57 0.55 0.01 0.62 0.50 0.01 0.66 0.52 0.00 0.78
RF 0.50 − − − − − − − −
RGLM 0.42 0.73 0.35 0.31 0.52 0.64 0.26 0.35 0.94 0.25 0.20 0.99
SIS-SCAD 0.62 0.04 0.84 0.59 0.02 0.90 0.51 0.01 0.93 0.52 0.01 0.99
XGBoost 0.48 − − − − − − − −

π Method TL RC PR TL RC PR TL RC PR TL RC PR

0.4 Split-Lasso 0.45 0.66 0.23 0.30 0.55 0.34 0.21 0.41 0.48 0.19 0.27 0.65
Split-EN 0.45 0.80 0.22 0.31 0.72 0.31 0.22 0.59 0.48 0.19 0.42 0.67
Lasso 0.52 0.13 0.45 0.39 0.09 0.56 0.30 0.06 0.65 0.27 0.03 0.74
EN 0.50 0.23 0.46 0.35 0.17 0.58 0.26 0.12 0.69 0.23 0.07 0.78
Adaptive 0.58 0.13 0.42 0.46 0.09 0.57 0.36 0.06 0.65 0.33 0.03 0.75
Relaxed 0.65 0.10 0.64 0.61 0.08 0.74 0.55 0.05 0.81 0.42 0.03 0.86
MC+ 0.62 0.03 0.51 0.52 0.02 0.61 0.47 0.01 0.70 0.47 0.01 0.79
RF 0.54 − − − − − − − −
RGLM 0.45 0.76 0.33 0.33 0.56 0.65 0.28 0.37 0.96 0.28 0.21 1.00
SIS-SCAD 0.57 0.05 0.83 0.50 0.02 0.88 0.46 0.01 0.93 0.44 0.01 0.95
XGBoost 0.52 − − − − − − − −

ρ = 0.2, ρ = 0.5, n = 50, p = 1,500.
ζ = . ζ = . ζ = . ζ = .
π Method MR SE SP MR SE SP MR SE SP MR SE SP

0.2 Split-Lasso 0.15 0.46 0.95 0.13 0.48 0.97 0.11 0.53 0.98 0.09 0.62 0.98
Split-EN 0.16 0.45 0.95 0.13 0.50 0.97 0.10 0.56 0.98 0.08 0.66 0.98
Lasso 0.18 0.37 0.95 0.16 0.38 0.96 0.15 0.39 0.97 0.13 0.47 0.97
EN 0.17 0.39 0.95 0.15 0.42 0.96 0.13 0.43 0.98 0.11 0.54 0.98
Adaptive 0.21 0.09 0.99 0.19 0.15 0.99 0.18 0.14 0.99 0.17 0.18 0.99
Relaxed 0.18 0.37 0.94 0.17 0.41 0.94 0.15 0.43 0.95 0.13 0.54 0.95
MC+ 0.21 0.26 0.94 0.20 0.23 0.95 0.20 0.24 0.95 0.18 0.28 0.95
RF 0.18 0.26 0.98 0.16 0.24 0.99 0.16 0.23 1.00 0.15 0.27 1.00
RGLM 0.16 0.40 0.96 0.15 0.34 0.98 0.14 0.32 0.99 0.13 0.36 1.00
SIS-SCAD 0.21 0.31 0.92 0.20 0.22 0.95 0.20 0.20 0.96 0.19 0.20 0.96
XGBoost 0.18 0.34 0.95 0.18 0.27 0.96 0.17 0.27 0.97 0.16 0.31 0.97

π Method MR SE SP MR SE SP MR SE SP MR SE SP

0.3 Split-Lasso 0.19 0.58 0.92 0.15 0.65 0.94 0.12 0.70 0.96 0.10 0.76 0.96
Split-EN 0.19 0.58 0.91 0.15 0.66 0.93 0.11 0.72 0.96 0.09 0.78 0.96
Lasso 0.22 0.52 0.90 0.19 0.55 0.92 0.17 0.59 0.93 0.16 0.64 0.93
EN 0.21 0.53 0.91 0.18 0.60 0.92 0.15 0.63 0.95 0.13 0.69 0.94
Adaptive 0.26 0.27 0.95 0.25 0.27 0.96 0.23 0.33 0.96 0.21 0.38 0.96
Relaxed 0.22 0.50 0.90 0.20 0.57 0.90 0.17 0.62 0.92 0.17 0.66 0.91
MC+ 0.25 0.39 0.91 0.25 0.38 0.91 0.25 0.37 0.92 0.23 0.45 0.91
RF 0.21 0.43 0.95 0.18 0.46 0.97 0.18 0.45 0.99 0.15 0.51 0.99
RGLM 0.19 0.54 0.92 0.17 0.56 0.95 0.16 0.54 0.98 0.14 0.59 0.98
SIS-SCAD 0.26 0.45 0.87 0.26 0.43 0.87 0.26 0.43 0.89 0.25 0.44 0.88
XGBoost 0.24 0.47 0.89 0.22 0.45 0.92 0.22 0.43 0.93 0.20 0.47 0.93

π Method MR SE SP MR SE SP MR SE SP MR SE SP

0.4 Split-Lasso 0.20 0.70 0.86 0.16 0.76 0.89 0.12 0.82 0.91 0.10 0.84 0.93
Split-EN 0.20 0.70 0.86 0.16 0.77 0.89 0.12 0.83 0.92 0.10 0.85 0.94
Lasso 0.24 0.65 0.83 0.21 0.69 0.85 0.18 0.73 0.87 0.17 0.74 0.89
EN 0.23 0.67 0.84 0.19 0.72 0.86 0.16 0.77 0.89 0.14 0.78 0.91
Adaptive 0.29 0.47 0.88 0.28 0.48 0.88 0.22 0.63 0.89 0.21 0.63 0.90
Relaxed 0.25 0.66 0.81 0.22 0.69 0.84 0.19 0.74 0.86 0.18 0.74 0.87
MC+ 0.28 0.58 0.81 0.28 0.56 0.82 0.26 0.62 0.82 0.25 0.60 0.84
RF 0.22 0.62 0.88 0.18 0.66 0.92 0.14 0.73 0.94 0.13 0.72 0.96
RGLM 0.21 0.68 0.86 0.18 0.71 0.90 0.14 0.77 0.92 0.13 0.75 0.94
SIS-SCAD 0.30 0.57 0.78 0.29 0.57 0.80 0.29 0.60 0.79 0.28 0.60 0.80
XGBoost 0.26 0.62 0.82 0.25 0.60 0.84 0.23 0.64 0.85 0.22 0.64 0.86

Table 45: MR, SE and SP for Scenario 3, ρ = 0.2, ρ = 0.8, n = 50, p = 1,500.
ζ = . ζ = . ζ = . ζ = .
π Method MR SE SP MR SE SP MR SE SP MR SE SP

0.2 Split-Lasso 0.14 0.52 0.96 0.13 0.46 0.97 0.11 0.57 0.97 0.10 0.61 0.98
Split-EN 0.14 0.50 0.96 0.13 0.48 0.97 0.10 0.59 0.97 0.09 0.65 0.98
Lasso 0.17 0.42 0.95 0.16 0.35 0.97 0.14 0.43 0.96 0.14 0.46 0.96
EN 0.16 0.44 0.95 0.15 0.38 0.97 0.13 0.45 0.97 0.12 0.51 0.97
Adaptive 0.20 0.17 0.98 0.19 0.09 0.99 0.18 0.15 0.99 0.18 0.17 0.98
Relaxed 0.17 0.43 0.94 0.16 0.39 0.95 0.15 0.44 0.95 0.15 0.50 0.94
MC+ 0.19 0.38 0.94 0.19 0.25 0.95 0.19 0.25 0.95 0.19 0.27 0.95
RF 0.17 0.32 0.98 0.16 0.23 0.99 0.15 0.27 1.00 0.15 0.29 1.00
RGLM 0.14 0.50 0.96 0.15 0.36 0.98 0.13 0.38 0.99 0.13 0.40 0.99
SIS-SCAD 0.19 0.35 0.94 0.20 0.22 0.95 0.20 0.25 0.94 0.19 0.24 0.95
XGBoost 0.17 0.44 0.94 0.17 0.29 0.97 0.17 0.31 0.96 0.17 0.32 0.96

π Method MR SE SP MR SE SP MR SE SP MR SE SP

0.3 Split-Lasso 0.16 0.65 0.92 0.14 0.67 0.94 0.12 0.72 0.94 0.11 0.72 0.96
Split-EN 0.17 0.63 0.92 0.14 0.68 0.94 0.12 0.74 0.94 0.10 0.75 0.96
Lasso 0.20 0.58 0.91 0.19 0.56 0.92 0.18 0.58 0.92 0.17 0.59 0.93
EN 0.19 0.59 0.91 0.17 0.61 0.92 0.16 0.63 0.93 0.15 0.64 0.95
Adaptive 0.23 0.35 0.96 0.24 0.30 0.96 0.23 0.36 0.95 0.23 0.33 0.96
Relaxed 0.21 0.58 0.89 0.19 0.59 0.91 0.19 0.59 0.90 0.18 0.60 0.92
MC+ 0.21 0.55 0.90 0.23 0.48 0.90 0.24 0.42 0.91 0.23 0.43 0.91
RF 0.19 0.50 0.95 0.18 0.48 0.97 0.16 0.53 0.97 0.16 0.51 0.99
RGLM 0.17 0.63 0.93 0.16 0.59 0.95 0.15 0.60 0.95 0.15 0.58 0.97
SIS-SCAD 0.22 0.55 0.88 0.25 0.46 0.88 0.26 0.43 0.88 0.25 0.43 0.88
XGBoost 0.20 0.57 0.90 0.21 0.47 0.92 0.22 0.49 0.91 0.21 0.47 0.93

π Method MR SE SP MR SE SP MR SE SP MR SE SP

0.4 Split-Lasso 0.18 0.73 0.89 0.16 0.76 0.90 0.13 0.81 0.91 0.11 0.83 0.92
Split-EN 0.18 0.72 0.88 0.15 0.77 0.90 0.13 0.82 0.91 0.10 0.85 0.93
Lasso 0.21 0.68 0.86 0.22 0.66 0.87 0.19 0.73 0.87 0.17 0.75 0.88
EN 0.20 0.69 0.87 0.19 0.71 0.88 0.17 0.76 0.88 0.15 0.79 0.89
Adaptive 0.26 0.51 0.90 0.27 0.51 0.89 0.24 0.59 0.89 0.21 0.63 0.89
Relaxed 0.22 0.68 0.85 0.21 0.68 0.86 0.19 0.73 0.86 0.19 0.75 0.86
MC+ 0.23 0.65 0.85 0.25 0.60 0.84 0.25 0.62 0.83 0.25 0.64 0.82
RF 0.20 0.65 0.90 0.18 0.66 0.93 0.16 0.71 0.93 0.13 0.75 0.96
RGLM 0.17 0.73 0.89 0.17 0.73 0.90 0.16 0.75 0.91 0.14 0.76 0.93
SIS-SCAD 0.23 0.66 0.84 0.27 0.59 0.82 0.29 0.59 0.80 0.29 0.61 0.78
XGBoost 0.22 0.66 0.86 0.23 0.63 0.86 0.23 0.65 0.85 0.23 0.65 0.85

Table 46: MR, SE and SP for Scenario 3, ρ = 0.5, ρ = 0.8, n = 50, p = 1,500.
ζ = . ζ = . ζ = . ζ = .
π Method MR SE SP MR SE SP MR SE SP MR SE SP

0.2 Split-Lasso 0.10 0.69 0.95 0.08 0.74 0.97 0.06 0.79 0.98 0.05 0.80 0.98
Split-EN 0.10 0.68 0.95 0.08 0.75 0.96 0.06 0.80 0.98 0.05 0.82 0.98
Lasso 0.12 0.62 0.95 0.11 0.66 0.96 0.09 0.70 0.96 0.09 0.69 0.97
EN 0.11 0.64 0.95 0.09 0.69 0.96 0.08 0.73 0.97 0.07 0.75 0.98
Adaptive 0.15 0.35 0.98 0.14 0.39 0.98 0.13 0.46 0.98 0.11 0.51 0.98
Relaxed 0.13 0.61 0.94 0.12 0.66 0.94 0.10 0.72 0.94 0.09 0.72 0.96
MC+ 0.16 0.45 0.94 0.16 0.44 0.95 0.15 0.46 0.95 0.16 0.42 0.95
RF 0.11 0.57 0.97 0.10 0.59 0.99 0.09 0.59 0.99 0.08 0.60 1.00
RGLM 0.11 0.63 0.96 0.09 0.64 0.98 0.08 0.64 0.99 0.08 0.65 0.99
SIS-SCAD 0.17 0.34 0.96 0.16 0.35 0.96 0.16 0.36 0.97 0.16 0.36 0.96
XGBoost 0.14 0.52 0.95 0.13 0.50 0.96 0.13 0.50 0.96 0.13 0.49 0.97

π Method MR SE SP MR SE SP MR SE SP MR SE SP

0.3 Split-Lasso 0.12 0.78 0.92 0.09 0.84 0.94 0.07 0.87 0.96 0.06 0.88 0.97
Split-EN 0.12 0.77 0.92 0.09 0.84 0.94 0.07 0.87 0.96 0.05 0.89 0.97
Lasso 0.14 0.73 0.91 0.12 0.78 0.92 0.11 0.79 0.94 0.09 0.81 0.95
EN 0.14 0.75 0.92 0.11 0.81 0.93 0.09 0.82 0.95 0.07 0.85 0.96
Adaptive 0.17 0.56 0.94 0.14 0.65 0.95 0.12 0.69 0.95 0.12 0.68 0.96
Relaxed 0.15 0.73 0.90 0.13 0.80 0.91 0.12 0.80 0.92 0.10 0.82 0.93
MC+ 0.18 0.65 0.89 0.17 0.67 0.90 0.16 0.66 0.91 0.16 0.65 0.92
RF 0.13 0.71 0.94 0.10 0.75 0.96 0.08 0.77 0.98 0.07 0.77 0.99
RGLM 0.13 0.75 0.93 0.10 0.79 0.95 0.08 0.80 0.97 0.08 0.81 0.97
SIS-SCAD 0.19 0.58 0.91 0.19 0.59 0.91 0.19 0.59 0.91 0.19 0.58 0.91
XGBoost 0.16 0.67 0.91 0.16 0.67 0.92 0.15 0.66 0.93 0.14 0.66 0.95

π Method MR SE SP MR SE SP MR SE SP MR SE SP

0.4 Split-Lasso 0.13 0.83 0.90 0.10 0.86 0.92 0.08 0.90 0.94 0.07 0.92 0.95
Split-EN 0.13 0.82 0.89 0.10 0.86 0.92 0.07 0.90 0.95 0.06 0.93 0.95
Lasso 0.16 0.79 0.88 0.13 0.82 0.90 0.11 0.85 0.92 0.11 0.86 0.92
EN 0.15 0.80 0.89 0.12 0.84 0.91 0.09 0.87 0.93 0.08 0.88 0.94
Adaptive 0.17 0.72 0.91 0.15 0.75 0.91 0.13 0.79 0.92 0.13 0.79 0.93
Relaxed 0.16 0.79 0.87 0.14 0.83 0.89 0.12 0.85 0.90 0.11 0.86 0.90
MC+ 0.19 0.74 0.86 0.18 0.75 0.86 0.17 0.77 0.88 0.17 0.76 0.88
RF 0.14 0.79 0.90 0.11 0.82 0.94 0.08 0.86 0.96 0.07 0.87 0.97
RGLM 0.13 0.81 0.90 0.11 0.83 0.93 0.09 0.86 0.94 0.08 0.88 0.95
SIS-SCAD 0.20 0.71 0.86 0.21 0.69 0.86 0.20 0.72 0.86 0.20 0.71 0.87
XGBoost 0.17 0.74 0.88 0.17 0.74 0.89 0.15 0.77 0.90 0.15 0.76 0.90

Table 47: MR, SE and SP for Scenario 3, ρ = 0.2, ρ = 0.5, n = 100, p = 1,500.
ζ = . ζ = . ζ = . ζ = .
π Method MR SE SP MR SE SP MR SE SP MR SE SP

0.2 Split-Lasso 0.15 0.48 0.96 0.11 0.55 0.97 0.09 0.68 0.97 0.07 0.69 0.98
Split-EN 0.15 0.49 0.96 0.11 0.56 0.97 0.09 0.69 0.97 0.07 0.71 0.98
Lasso 0.16 0.47 0.94 0.14 0.47 0.96 0.12 0.56 0.96 0.11 0.56 0.97
EN 0.15 0.47 0.95 0.13 0.49 0.97 0.11 0.60 0.96 0.10 0.59 0.98
Adaptive 0.19 0.20 0.98 0.17 0.19 0.99 0.15 0.32 0.98 0.15 0.31 0.99
Relaxed 0.16 0.45 0.94 0.14 0.48 0.96 0.12 0.58 0.95 0.11 0.59 0.96
MC+ 0.18 0.37 0.94 0.18 0.28 0.96 0.17 0.36 0.95 0.17 0.31 0.96
RF 0.17 0.28 0.98 0.15 0.27 0.99 0.14 0.33 1.00 0.14 0.29 1.00
RGLM 0.14 0.49 0.96 0.13 0.46 0.98 0.11 0.49 0.99 0.11 0.43 1.00
SIS-SCAD 0.18 0.35 0.94 0.18 0.31 0.95 0.18 0.35 0.95 0.17 0.32 0.95
XGBoost 0.17 0.39 0.95 0.16 0.33 0.97 0.15 0.36 0.98 0.15 0.32 0.98

π Method MR SE SP MR SE SP MR SE SP MR SE SP

0.3 Split-Lasso 0.17 0.62 0.92 0.13 0.73 0.93 0.10 0.80 0.95 0.08 0.82 0.96
Split-EN 0.17 0.62 0.92 0.13 0.72 0.93 0.10 0.80 0.95 0.08 0.83 0.96
Lasso 0.19 0.58 0.92 0.16 0.66 0.91 0.14 0.70 0.93 0.13 0.72 0.94
EN 0.18 0.60 0.92 0.15 0.68 0.92 0.13 0.73 0.93 0.11 0.75 0.95
Adaptive 0.21 0.41 0.96 0.19 0.48 0.95 0.17 0.55 0.95 0.15 0.59 0.96
Relaxed 0.19 0.58 0.91 0.16 0.67 0.91 0.14 0.71 0.92 0.13 0.74 0.92
MC+ 0.22 0.52 0.90 0.20 0.55 0.90 0.20 0.57 0.91 0.19 0.57 0.91
RF 0.20 0.46 0.95 0.16 0.52 0.97 0.14 0.59 0.98 0.13 0.58 0.99
RGLM 0.17 0.62 0.92 0.14 0.66 0.94 0.12 0.70 0.96 0.11 0.69 0.98
SIS-SCAD 0.22 0.50 0.90 0.22 0.53 0.88 0.22 0.55 0.88 0.22 0.56 0.88
XGBoost 0.20 0.53 0.91 0.18 0.55 0.93 0.18 0.56 0.94 0.17 0.56 0.95

π Method MR SE SP MR SE SP MR SE SP MR SE SP

0.4 Split-Lasso 0.19 0.71 0.87 0.15 0.80 0.89 0.10 0.85 0.92 0.09 0.87 0.94
Split-EN 0.19 0.71 0.87 0.15 0.80 0.89 0.10 0.85 0.92 0.08 0.88 0.94
Lasso 0.21 0.69 0.86 0.18 0.75 0.87 0.15 0.79 0.89 0.14 0.80 0.91
EN 0.20 0.70 0.86 0.17 0.77 0.88 0.13 0.81 0.90 0.12 0.82 0.92
Adaptive 0.23 0.57 0.90 0.19 0.67 0.89 0.17 0.73 0.90 0.15 0.74 0.92
Relaxed 0.22 0.68 0.86 0.18 0.75 0.87 0.15 0.80 0.89 0.14 0.80 0.90
MC+ 0.24 0.64 0.85 0.22 0.67 0.85 0.21 0.70 0.86 0.20 0.70 0.87
RF 0.22 0.62 0.90 0.17 0.71 0.92 0.13 0.75 0.95 0.11 0.76 0.97
RGLM 0.19 0.71 0.87 0.15 0.77 0.90 0.12 0.81 0.93 0.10 0.81 0.95
SIS-SCAD 0.25 0.62 0.84 0.25 0.64 0.83 0.25 0.66 0.82 0.24 0.65 0.84
XGBoost 0.22 0.66 0.86 0.20 0.69 0.87 0.18 0.71 0.89 0.18 0.70 0.90

Table 48: MR, SE and SP for Scenario 3, ρ = 0.2, ρ = 0.8, n = 100, p = 1,500.
ζ = . ζ = . ζ = . ζ = .
π Method MR SE SP MR SE SP MR SE SP MR SE SP

0.2 Split-Lasso 0.12 0.59 0.96 0.10 0.63 0.97 0.08 0.72 0.97 0.08 0.73 0.97
Split-EN 0.12 0.57 0.96 0.10 0.62 0.97 0.08 0.73 0.97 0.07 0.74 0.97
Lasso 0.14 0.53 0.95 0.13 0.52 0.96 0.12 0.59 0.96 0.11 0.59 0.96
EN 0.14 0.53 0.96 0.12 0.53 0.97 0.11 0.63 0.96 0.10 0.63 0.97
Adaptive 0.18 0.26 0.99 0.16 0.24 0.99 0.15 0.32 0.98 0.15 0.34 0.98
Relaxed 0.15 0.54 0.95 0.13 0.51 0.96 0.12 0.61 0.95 0.11 0.62 0.95
MC+ 0.15 0.51 0.95 0.16 0.42 0.95 0.16 0.41 0.95 0.17 0.40 0.95
RF 0.15 0.37 0.98 0.14 0.32 0.99 0.13 0.37 1.00 0.13 0.37 1.00
RGLM 0.12 0.62 0.96 0.11 0.54 0.98 0.10 0.55 0.98 0.11 0.53 0.99
SIS-SCAD 0.16 0.47 0.95 0.16 0.41 0.95 0.17 0.40 0.94 0.18 0.37 0.94
XGBoost 0.15 0.52 0.95 0.15 0.41 0.97 0.14 0.39 0.97 0.15 0.38 0.98

π Method MR SE SP MR SE SP MR SE SP MR SE SP

0.3 Split-Lasso 0.15 0.71 0.92 0.11 0.75 0.94 0.10 0.80 0.95 0.09 0.82 0.95
Split-EN 0.15 0.69 0.93 0.12 0.75 0.94 0.10 0.81 0.95 0.08 0.83 0.96
Lasso 0.17 0.66 0.91 0.15 0.68 0.93 0.14 0.71 0.93 0.13 0.72 0.94
EN 0.16 0.67 0.92 0.14 0.69 0.93 0.13 0.74 0.93 0.12 0.75 0.94
Adaptive 0.18 0.55 0.95 0.18 0.49 0.96 0.16 0.57 0.95 0.16 0.59 0.96
Relaxed 0.17 0.67 0.90 0.15 0.68 0.92 0.14 0.71 0.92 0.13 0.72 0.93
MC+ 0.18 0.65 0.90 0.17 0.62 0.92 0.19 0.60 0.91 0.19 0.58 0.91
RF 0.17 0.56 0.95 0.16 0.55 0.97 0.14 0.59 0.98 0.13 0.60 0.99
RGLM 0.14 0.72 0.92 0.12 0.71 0.95 0.12 0.71 0.96 0.11 0.70 0.97
SIS-SCAD 0.18 0.62 0.91 0.19 0.56 0.92 0.21 0.56 0.89 0.22 0.56 0.88
XGBoost 0.17 0.64 0.91 0.17 0.58 0.94 0.17 0.59 0.93 0.17 0.56 0.94

π Method MR SE SP MR SE SP MR SE SP MR SE SP

0.4 Split-Lasso 0.15 0.78 0.90 0.13 0.83 0.90 0.11 0.86 0.92 0.10 0.86 0.93
Split-EN 0.16 0.77 0.89 0.13 0.83 0.90 0.11 0.86 0.92 0.09 0.87 0.94
Lasso 0.17 0.74 0.88 0.16 0.78 0.87 0.15 0.80 0.89 0.14 0.80 0.90
EN 0.17 0.74 0.89 0.16 0.79 0.88 0.14 0.81 0.89 0.13 0.81 0.91
Adaptive 0.19 0.66 0.91 0.18 0.72 0.89 0.17 0.75 0.89 0.16 0.73 0.91
Relaxed 0.18 0.74 0.87 0.17 0.78 0.87 0.15 0.80 0.88 0.14 0.80 0.90
MC+ 0.19 0.72 0.86 0.19 0.74 0.86 0.20 0.73 0.85 0.20 0.70 0.87
RF 0.17 0.70 0.91 0.16 0.74 0.91 0.13 0.77 0.93 0.12 0.76 0.96
RGLM 0.15 0.79 0.89 0.14 0.81 0.90 0.13 0.81 0.91 0.12 0.80 0.94
SIS-SCAD 0.18 0.73 0.87 0.22 0.70 0.84 0.24 0.67 0.82 0.24 0.65 0.83
XGBoost 0.19 0.73 0.87 0.19 0.72 0.87 0.19 0.72 0.87 0.18 0.69 0.90

Table 49: MR, SE and SP for Scenario 3, ρ = 0.5, ρ = 0.8, n = 100, p = 1,500.
ζ = . ζ = . ζ = . ζ = .
π Method MR SE SP MR SE SP MR SE SP MR SE SP

0.2 Split-Lasso 0.09 0.71 0.96 0.07 0.80 0.97 0.05 0.87 0.97 0.04 0.88 0.98
Split-EN 0.09 0.71 0.96 0.07 0.80 0.97 0.05 0.88 0.97 0.04 0.88 0.98
Lasso 0.11 0.67 0.95 0.09 0.72 0.96 0.07 0.77 0.97 0.07 0.79 0.97
EN 0.10 0.68 0.96 0.08 0.74 0.96 0.06 0.81 0.97 0.06 0.82 0.97
Adaptive 0.13 0.46 0.98 0.11 0.55 0.98 0.09 0.62 0.98 0.08 0.71 0.98
Relaxed 0.11 0.67 0.95 0.10 0.73 0.95 0.08 0.79 0.96 0.07 0.80 0.96
MC+ 0.14 0.52 0.95 0.14 0.52 0.95 0.13 0.53 0.95 0.13 0.57 0.95
RF 0.10 0.60 0.97 0.09 0.64 0.98 0.07 0.70 0.99 0.06 0.71 1.00
RGLM 0.10 0.68 0.96 0.08 0.70 0.98 0.06 0.74 0.99 0.06 0.75 0.99
SIS-SCAD 0.14 0.49 0.95 0.14 0.45 0.96 0.14 0.50 0.95 0.14 0.51 0.96
XGBoost 0.12 0.59 0.96 0.11 0.58 0.97 0.10 0.61 0.98 0.10 0.60 0.98

π Method MR SE SP MR SE SP MR SE SP MR SE SP

0.3 Split-Lasso 0.11 0.79 0.93 0.08 0.85 0.95 0.06 0.89 0.96 0.05 0.90 0.97
Split-EN 0.11 0.78 0.93 0.08 0.85 0.95 0.06 0.90 0.96 0.05 0.91 0.97
Lasso 0.13 0.75 0.93 0.10 0.80 0.94 0.09 0.83 0.95 0.08 0.84 0.96
EN 0.12 0.76 0.93 0.10 0.81 0.94 0.07 0.85 0.96 0.07 0.86 0.96
Adaptive 0.15 0.62 0.95 0.12 0.70 0.96 0.09 0.78 0.96 0.09 0.78 0.97
Relaxed 0.13 0.74 0.92 0.11 0.81 0.93 0.09 0.84 0.94 0.08 0.85 0.94
MC+ 0.16 0.68 0.91 0.14 0.70 0.92 0.13 0.73 0.93 0.13 0.72 0.93
RF 0.13 0.71 0.94 0.10 0.76 0.97 0.07 0.79 0.98 0.07 0.80 0.99
RGLM 0.11 0.77 0.94 0.09 0.80 0.96 0.07 0.83 0.97 0.06 0.83 0.98
SIS-SCAD 0.16 0.66 0.92 0.15 0.68 0.92 0.14 0.69 0.93 0.14 0.68 0.93
XGBoost 0.14 0.71 0.93 0.13 0.72 0.95 0.11 0.73 0.96 0.11 0.72 0.96

π Method MR SE SP MR SE SP MR SE SP MR SE SP

0.4 Split-Lasso 0.12 0.83 0.90 0.09 0.89 0.93 0.07 0.91 0.95 0.05 0.94 0.95
Split-EN 0.12 0.83 0.90 0.09 0.89 0.93 0.06 0.91 0.95 0.05 0.94 0.96
Lasso 0.14 0.81 0.89 0.11 0.86 0.91 0.09 0.87 0.93 0.08 0.90 0.93
EN 0.13 0.82 0.90 0.10 0.87 0.92 0.08 0.88 0.94 0.07 0.91 0.94
Adaptive 0.14 0.76 0.92 0.12 0.83 0.92 0.10 0.84 0.94 0.09 0.88 0.94
Relaxed 0.14 0.81 0.89 0.12 0.85 0.91 0.10 0.87 0.92 0.09 0.90 0.92
MC+ 0.16 0.78 0.89 0.14 0.81 0.89 0.13 0.81 0.91 0.13 0.82 0.90
RF 0.14 0.80 0.91 0.10 0.85 0.93 0.07 0.87 0.96 0.06 0.90 0.97
RGLM 0.12 0.83 0.91 0.09 0.87 0.93 0.08 0.88 0.95 0.06 0.90 0.96
SIS-SCAD 0.16 0.77 0.89 0.16 0.79 0.87 0.16 0.79 0.88 0.15 0.79 0.89
XGBoost 0.14 0.79 0.90 0.13 0.81 0.91 0.12 0.81 0.93 0.11 0.82 0.93

Table 50: TL, RC and PR for Scenario 3, ρ = 0.2, ρ = 0.5, n = 50, p = 1,500.
ζ = . ζ = . ζ = . ζ = .
π Method TL RC PR TL RC PR TL RC PR TL RC PR

0.2 Split-Lasso 0.69 0.29 0.18 0.58 0.25 0.22 0.48 0.21 0.33 0.39 0.19 0.52
Split-EN 0.70 0.33 0.17 0.56 0.32 0.20 0.45 0.30 0.31 0.36 0.26 0.50
Lasso 0.80 0.05 0.25 0.74 0.03 0.21 0.69 0.02 0.28 0.60 0.02 0.46
EN 0.77 0.08 0.23 0.67 0.05 0.22 0.61 0.04 0.29 0.50 0.03 0.47
Adaptive 0.95 0.05 0.25 0.87 0.03 0.21 0.84 0.02 0.28 0.79 0.02 0.46
Relaxed 0.91 0.04 0.27 0.96 0.02 0.23 0.88 0.02 0.29 0.86 0.01 0.47
MC+ 0.92 0.01 0.30 0.97 0.01 0.25 1.03 0.00 0.29 0.90 0.00 0.48
RF 0.79 − − − − − − − −
RGLM 0.72 0.26 0.13 0.66 0.16 0.18 0.62 0.12 0.27 0.58 0.10 0.47
SIS-SCAD 1.05 0.01 0.42 0.95 0.00 0.32 0.92 0.00 0.33 0.87 0.00 0.54
XGBoost 0.86 − − − − − − − −

π Method TL RC PR TL RC PR TL RC PR TL RC PR

0.3 Split-Lasso 0.80 0.32 0.17 0.66 0.29 0.21 0.52 0.25 0.32 0.44 0.20 0.50
Split-EN 0.81 0.36 0.16 0.65 0.32 0.21 0.50 0.34 0.31 0.41 0.28 0.48
Lasso 0.94 0.06 0.24 0.86 0.03 0.23 0.77 0.02 0.29 0.70 0.02 0.44
EN 0.89 0.09 0.24 0.78 0.06 0.23 0.66 0.05 0.30 0.59 0.04 0.45
Adaptive 1.04 0.06 0.23 1.01 0.03 0.24 0.92 0.02 0.30 0.89 0.02 0.44
Relaxed 1.19 0.06 0.30 1.13 0.03 0.27 1.02 0.02 0.31 1.01 0.02 0.46
MC+ 1.06 0.02 0.29 1.05 0.01 0.23 1.04 0.01 0.33 0.97 0.00 0.45
RF 0.93 − − − − − − − −
RGLM 0.84 0.29 0.13 0.77 0.19 0.18 0.73 0.14 0.28 0.70 0.11 0.45
SIS-SCAD 1.14 0.02 0.42 1.16 0.01 0.31 1.12 0.00 0.36 1.14 0.00 0.42
XGBoost 1.04 − − − − − − − −

π Method TL RC PR TL RC PR TL RC PR TL RC PR

0.4 Split-Lasso 0.88 0.33 0.18 0.72 0.29 0.22 0.54 0.25 0.32 0.47 0.22 0.50
Split-EN 0.88 0.37 0.17 0.72 0.33 0.21 0.52 0.33 0.31 0.44 0.30 0.48
Lasso 1.04 0.06 0.24 0.96 0.04 0.26 0.82 0.03 0.31 0.76 0.02 0.46
EN 0.96 0.10 0.24 0.86 0.07 0.26 0.70 0.05 0.32 0.63 0.04 0.46
Adaptive 1.11 0.07 0.25 1.07 0.04 0.26 0.93 0.03 0.31 0.89 0.02 0.46
Relaxed 1.43 0.06 0.27 1.12 0.04 0.27 1.01 0.03 0.32 1.07 0.02 0.46
MC+ 1.14 0.02 0.30 1.14 0.01 0.34 1.06 0.01 0.38 1.06 0.00 0.41
RF 1.00 − − − − − − − −
RGLM 0.91 0.32 0.14 0.83 0.22 0.19 0.78 0.16 0.29 0.76 0.12 0.47
SIS-SCAD 1.36 0.01 0.37 1.25 0.01 0.42 1.28 0.00 0.42 1.31 0.00 0.50
XGBoost 1.11 − − − − − − − −

ρ = 0.2, ρ = 0.8, n = 50, p = 1,500.
ζ = . ζ = . ζ = . ζ = .
π Method TL RC PR TL RC PR TL RC PR TL RC PR

0.2 Split-Lasso 0.65 0.40 0.21 0.60 0.28 0.28 0.49 0.22 0.33 0.44 0.18 0.49
Split-EN 0.65 0.49 0.22 0.57 0.39 0.26 0.46 0.32 0.33 0.41 0.27 0.50
Lasso 0.78 0.05 0.26 0.75 0.02 0.22 0.70 0.02 0.25 0.64 0.01 0.33
EN 0.73 0.11 0.28 0.71 0.06 0.28 0.62 0.04 0.30 0.56 0.03 0.38
Adaptive 0.90 0.05 0.26 0.90 0.02 0.24 0.81 0.02 0.25 0.82 0.01 0.35
Relaxed 0.83 0.04 0.34 0.97 0.02 0.22 1.34 0.01 0.25 0.94 0.01 0.33
MC+ 0.87 0.02 0.35 1.21 0.01 0.24 0.97 0.00 0.33 0.91 0.00 0.38
RF 0.76 − − − − − − − −
RGLM 0.66 0.36 0.20 0.65 0.21 0.24 0.61 0.13 0.31 0.60 0.10 0.45
SIS-SCAD 0.88 0.02 0.61 1.02 0.01 0.39 0.98 0.00 0.39 0.94 0.00 0.44
XGBoost 0.82 − − − − − − − −

π Method TL RC PR TL RC PR TL RC PR TL RC PR

0.3 Split-Lasso 0.73 0.45 0.21 0.63 0.35 0.26 0.56 0.26 0.34 0.50 0.20 0.49
Split-EN 0.74 0.52 0.21 0.63 0.43 0.25 0.53 0.37 0.33 0.47 0.29 0.49
Lasso 0.89 0.07 0.28 0.85 0.03 0.26 0.80 0.02 0.26 0.75 0.01 0.32
EN 0.85 0.13 0.31 0.77 0.07 0.29 0.71 0.05 0.31 0.64 0.03 0.37
Adaptive 0.98 0.06 0.29 0.98 0.03 0.26 0.97 0.02 0.26 0.95 0.01 0.32
Relaxed 1.08 0.05 0.40 1.06 0.03 0.27 1.50 0.02 0.28 0.90 0.01 0.34
MC+ 0.91 0.02 0.39 1.00 0.01 0.34 1.01 0.00 0.26 1.00 0.00 0.35
RF 0.88 − − − − − − − −
RGLM 0.75 0.42 0.21 0.74 0.26 0.26 0.73 0.16 0.32 0.71 0.11 0.46
SIS-SCAD 1.00 0.02 0.60 1.17 0.01 0.46 1.16 0.00 0.36 1.12 0.00 0.46
XGBoost 0.92 − − − − − − − −

π Method TL RC PR TL RC PR TL RC PR TL RC PR

0.4 Split-Lasso 0.79 0.47 0.19 0.70 0.36 0.26 0.59 0.29 0.34 0.51 0.20 0.48
Split-EN 0.79 0.54 0.19 0.69 0.43 0.25 0.57 0.39 0.34 0.48 0.30 0.49
Lasso 0.94 0.07 0.28 0.95 0.04 0.27 0.84 0.02 0.29 0.77 0.02 0.34
EN 0.89 0.12 0.30 0.85 0.08 0.30 0.75 0.06 0.34 0.66 0.04 0.41
Adaptive 1.06 0.07 0.29 1.05 0.04 0.29 0.98 0.02 0.29 0.91 0.02 0.35
Relaxed 1.15 0.06 0.38 1.10 0.04 0.29 1.06 0.02 0.32 1.03 0.01 0.37
MC+ 0.98 0.02 0.33 1.07 0.01 0.30 1.07 0.01 0.35 1.03 0.00 0.35
RF 0.95 − − − − − − − −
RGLM 0.80 0.43 0.21 0.80 0.29 0.27 0.79 0.18 0.35 0.76 0.13 0.47
SIS-SCAD 1.04 0.03 0.70 1.18 0.01 0.46 1.20 0.00 0.49 1.20 0.00 0.47
XGBoost 0.96 − − − − − − − −

ρ = 0.5, ρ = 0.8, n = 50, p = 1,500.
ζ = . ζ = . ζ = . ζ = .
π Method TL RC PR TL RC PR TL RC PR TL RC PR

0.2 Split-Lasso 0.48 0.31 0.17 0.37 0.23 0.22 0.28 0.21 0.34 0.24 0.17 0.54
Split-EN 0.48 0.38 0.16 0.36 0.33 0.20 0.27 0.30 0.32 0.23 0.27 0.52
Lasso 0.56 0.03 0.16 0.50 0.02 0.16 0.42 0.01 0.23 0.39 0.01 0.34
EN 0.52 0.07 0.17 0.43 0.04 0.17 0.35 0.03 0.25 0.31 0.03 0.38
Adaptive 0.70 0.03 0.16 0.64 0.02 0.16 0.55 0.01 0.23 0.52 0.01 0.34
Relaxed 0.74 0.03 0.18 1.24 0.01 0.15 0.77 0.01 0.23 1.13 0.01 0.38
MC+ 0.74 0.01 0.25 0.74 0.00 0.19 0.77 0.00 0.26 0.75 0.00 0.32
RF 0.54 − − − − − − − −
RGLM 0.51 0.21 0.14 0.46 0.12 0.16 0.42 0.09 0.26 0.41 0.08 0.44
SIS-SCAD 0.80 0.01 0.34 0.73 0.00 0.21 0.72 0.00 0.27 0.72 0.00 0.40
XGBoost 0.64 − − − − − − − −

π Method TL RC PR TL RC PR TL RC PR TL RC PR

0.3 Split-Lasso 0.58 0.30 0.20 0.42 0.26 0.23 0.32 0.21 0.32 0.27 0.17 0.50
Split-EN 0.57 0.35 0.18 0.41 0.36 0.22 0.31 0.31 0.32 0.26 0.27 0.50
Lasso 0.66 0.05 0.21 0.56 0.02 0.20 0.49 0.02 0.23 0.42 0.01 0.35
EN 0.62 0.09 0.22 0.49 0.06 0.22 0.40 0.04 0.27 0.34 0.03 0.39
Adaptive 0.77 0.05 0.22 0.68 0.02 0.20 0.60 0.02 0.24 0.57 0.01 0.34
Relaxed 0.99 0.04 0.29 0.78 0.02 0.20 0.81 0.01 0.26 0.82 0.01 0.35
MC+ 0.81 0.01 0.31 0.76 0.01 0.21 0.73 0.00 0.29 0.69 0.00 0.44
RF 0.64 − − − − − − − −
RGLM 0.59 0.27 0.16 0.53 0.16 0.19 0.50 0.11 0.27 0.48 0.09 0.44
SIS-SCAD 0.84 0.01 0.40 0.84 0.01 0.31 0.81 0.00 0.35 0.79 0.00 0.47
XGBoost 0.71 − − − − − − − −

π Method TL RC PR TL RC PR TL RC PR TL RC PR

0.4  Split-Lasso  0.61  0.34  0.20  0.48  0.25  0.23  0.34  0.21  0.32  0.31  0.17  0.50
0.4  Split-EN     0.61  0.37  0.20  0.47  0.31  0.22  0.33  0.30  0.31  0.29  0.26  0.49
0.4  Lasso        0.71  0.05  0.21  0.60  0.03  0.20  0.50  0.02  0.24  0.48  0.01  0.36
0.4  EN           0.67  0.10  0.23  0.54  0.06  0.21  0.42  0.04  0.28  0.38  0.03  0.40
0.4  Adaptive     0.82  0.05  0.21  0.74  0.03  0.20  0.64  0.02  0.24  0.62  0.01  0.36
0.4  Relaxed      0.85  0.05  0.25  0.90  0.03  0.21  0.95  0.01  0.22  0.78  0.01  0.39
0.4  MC+          0.84  0.01  0.27  0.81  0.01  0.19  0.74  0.00  0.26  0.73  0.00  0.45
0.4  RF           0.70  −     −           −     −           −     −           −     −
0.4  RGLM         0.63  0.30  0.17  0.58  0.16  0.19  0.54  0.12  0.28  0.53  0.10  0.46
0.4  SIS-SCAD     0.88  0.02  0.44  0.87  0.01  0.29  0.85  0.00  0.28  0.84  0.00  0.51
0.4  XGBoost      0.77  −     −           −     −           −     −           −     −

ρ = 0.2, ρ = 0.5, n = 100, p = 1,500.

                  ζ = .             ζ = .             ζ = .             ζ = .
π    Method       TL    RC    PR    TL    RC    PR    TL    RC    PR    TL    RC    PR

0.2  Split-Lasso  0.65  0.42  0.18  0.51  0.44  0.22  0.39  0.37  0.33  0.32  0.29  0.51
0.2  Split-EN     0.65  0.44  0.17  0.51  0.46  0.21  0.38  0.44  0.31  0.31  0.37  0.50
0.2  Lasso        0.74  0.10  0.28  0.64  0.06  0.28  0.55  0.04  0.34  0.51  0.03  0.44
0.2  EN           0.70  0.13  0.25  0.60  0.09  0.29  0.50  0.06  0.34  0.45  0.05  0.45
0.2  Adaptive     0.83  0.09  0.29  0.76  0.06  0.29  0.67  0.04  0.34  0.63  0.03  0.44
0.2  Relaxed      0.83  0.09  0.32  0.69  0.06  0.29  0.61  0.04  0.35  0.65  0.02  0.44
0.2  MC+          0.84  0.03  0.34  0.80  0.02  0.35  0.76  0.01  0.34  0.77  0.01  0.49
0.2  RF           0.75  −     −           −     −           −     −           −     −
0.2  RGLM         0.66  0.45  0.14  0.57  0.32  0.22  0.54  0.21  0.30  0.51  0.15  0.47
0.2  SIS-SCAD     0.85  0.03  0.49  0.85  0.01  0.46  0.86  0.01  0.45  0.81  0.00  0.56
0.2  XGBoost      0.84  −     −           −     −           −     −           −     −

π    Method       TL    RC    PR    TL    RC    PR    TL    RC    PR    TL    RC    PR

0.3  Split-Lasso  0.75  0.45  0.18  0.60  0.42  0.22  0.44  0.40  0.32  0.37  0.32  0.49
0.3  Split-EN     0.75  0.46  0.18  0.60  0.49  0.21  0.44  0.46  0.30  0.35  0.40  0.48
0.3  Lasso        0.82  0.12  0.30  0.73  0.07  0.31  0.63  0.05  0.36  0.57  0.03  0.47
0.3  EN           0.81  0.15  0.27  0.69  0.10  0.30  0.57  0.07  0.35  0.50  0.05  0.48
0.3  Adaptive     0.90  0.12  0.30  0.82  0.07  0.31  0.76  0.05  0.35  0.68  0.03  0.47
0.3  Relaxed      0.85  0.11  0.32  0.80  0.07  0.32  0.70  0.05  0.37  0.80  0.03  0.47
0.3  MC+          0.94  0.04  0.34  0.89  0.02  0.34  0.84  0.01  0.38  0.83  0.01  0.49
0.3  RF           0.89  −     −           −     −           −     −           −     −
0.3  RGLM         0.76  0.51  0.14  0.67  0.37  0.22  0.62  0.25  0.31  0.59  0.19  0.48
0.3  SIS-SCAD     0.95  0.04  0.56  0.98  0.02  0.47  0.98  0.01  0.47  1.00  0.00  0.56
0.3  XGBoost      0.96  −     −           −     −           −     −           −     −

π    Method       TL    RC    PR    TL    RC    PR    TL    RC    PR    TL    RC    PR

0.4  Split-Lasso  0.82  0.48  0.17  0.65  0.46  0.21  0.48  0.42  0.32  0.38  0.33  0.49
0.4  Split-EN     0.82  0.47  0.18  0.65  0.52  0.20  0.48  0.47  0.30  0.38  0.39  0.48
0.4  Lasso        0.89  0.12  0.28  0.78  0.08  0.31  0.67  0.05  0.37  0.62  0.03  0.46
0.4  EN           0.88  0.16  0.27  0.74  0.11  0.31  0.61  0.08  0.38  0.54  0.06  0.46
0.4  Adaptive     0.98  0.12  0.29  0.87  0.08  0.31  0.77  0.05  0.38  0.71  0.04  0.46
0.4  Relaxed      0.93  0.11  0.33  0.80  0.08  0.32  0.76  0.05  0.38  0.73  0.03  0.47
0.4  MC+          0.99  0.05  0.32  0.94  0.03  0.34  0.87  0.02  0.39  0.84  0.01  0.47
0.4  RF           0.97  −     −           −     −           −     −           −     −
0.4  RGLM         0.83  0.54  0.13  0.72  0.40  0.22  0.67  0.28  0.32  0.64  0.20  0.48
0.4  SIS-SCAD     1.04  0.04  0.54  1.02  0.02  0.50  1.05  0.01  0.55  1.07  0.00  0.57
0.4  XGBoost      1.01  −     −           −     −           −     −           −     −

ρ = 0.2, ρ = 0.8, n = 100, p = 1,500.

                  ζ = .             ζ = .             ζ = .             ζ = .
π    Method       TL    RC    PR    TL    RC    PR    TL    RC    PR    TL    RC    PR

0.2  Split-Lasso  0.55  0.60  0.17  0.46  0.51  0.25  0.36  0.39  0.35  0.35  0.29  0.50
0.2  Split-EN     0.57  0.64  0.18  0.46  0.60  0.24  0.35  0.49  0.33  0.33  0.37  0.50
0.2  Lasso        0.65  0.10  0.29  0.59  0.06  0.31  0.53  0.03  0.28  0.53  0.02  0.37
0.2  EN           0.63  0.17  0.32  0.57  0.10  0.34  0.49  0.07  0.33  0.47  0.04  0.42
0.2  Adaptive     0.75  0.09  0.29  0.72  0.06  0.31  0.66  0.03  0.28  0.66  0.02  0.37
0.2  Relaxed      0.74  0.09  0.42  0.63  0.06  0.33  0.61  0.03  0.30  0.61  0.02  0.37
0.2  MC+          0.68  0.03  0.39  0.79  0.02  0.40  0.70  0.01  0.36  0.78  0.00  0.42
0.2  RF           0.71  −     −           −     −           −     −           −     −
0.2  RGLM         0.56  0.58  0.20  0.53  0.42  0.31  0.51  0.26  0.36  0.51  0.17  0.49
0.2  SIS-SCAD     0.73  0.04  0.71  0.78  0.02  0.58  0.80  0.01  0.42  0.83  0.00  0.51
0.2  XGBoost      0.71  −     −           −     −           −     −           −     −

π    Method       TL    RC    PR    TL    RC    PR    TL    RC    PR    TL    RC    PR

0.3  Split-Lasso  0.65  0.59  0.16  0.52  0.55  0.24  0.44  0.43  0.35  0.39  0.32  0.50
0.3  Split-EN     0.66  0.65  0.17  0.52  0.65  0.22  0.43  0.54  0.34  0.37  0.42  0.50
0.3  Lasso        0.74  0.10  0.29  0.66  0.07  0.30  0.62  0.05  0.34  0.58  0.03  0.38
0.3  EN           0.73  0.17  0.31  0.63  0.12  0.34  0.57  0.08  0.38  0.53  0.05  0.43
0.3  Adaptive     0.80  0.11  0.29  0.79  0.07  0.31  0.74  0.04  0.34  0.70  0.03  0.37
0.3  Relaxed      0.78  0.09  0.40  0.74  0.07  0.35  0.71  0.04  0.36  0.65  0.02  0.37
0.3  MC+          0.80  0.03  0.29  0.78  0.02  0.40  0.82  0.01  0.39  0.84  0.01  0.42
0.3  RF           0.82  −     −           −     −           −     −           −     −
0.3  RGLM         0.65  0.60  0.19  0.60  0.49  0.31  0.61  0.30  0.38  0.60  0.20  0.50
0.3  SIS-SCAD     0.81  0.04  0.69  0.85  0.02  0.64  0.94  0.01  0.57  1.04  0.00  0.55
0.3  XGBoost      0.83  −     −           −     −           −     −           −     −

π    Method       TL    RC    PR    TL    RC    PR    TL    RC    PR    TL    RC    PR

0.4  Split-Lasso  0.68  0.60  0.15  0.58  0.56  0.23  0.49  0.43  0.35  0.43  0.32  0.48
0.4  Split-EN     0.69  0.67  0.16  0.59  0.63  0.23  0.48  0.54  0.34  0.42  0.41  0.49
0.4  Lasso        0.78  0.11  0.30  0.73  0.07  0.31  0.68  0.04  0.32  0.63  0.03  0.38
0.4  EN           0.76  0.18  0.31  0.69  0.12  0.34  0.63  0.08  0.37  0.56  0.05  0.44
0.4  Adaptive     0.87  0.11  0.30  0.82  0.07  0.31  0.78  0.04  0.32  0.75  0.03  0.39
0.4  Relaxed      0.84  0.09  0.51  0.79  0.07  0.33  0.76  0.04  0.33  0.68  0.03  0.39
0.4  MC+          0.84  0.03  0.29  0.81  0.02  0.35  0.86  0.02  0.39  0.87  0.01  0.42
0.4  RF           0.87  −     −           −     −           −     −           −     −
0.4  RGLM         0.68  0.63  0.18  0.65  0.51  0.30  0.66  0.32  0.37  0.64  0.22  0.50
0.4  SIS-SCAD     0.85  0.05  0.76  0.96  0.02  0.68  1.08  0.01  0.56  1.05  0.00  0.52
0.4  XGBoost      0.88  −     −           −     −           −     −           −     −

ρ = 0.5, ρ = 0.8, n = 100, p = 1,500.

                  ζ = .             ζ = .             ζ = .             ζ = .
π    Method       TL    RC    PR    TL    RC    PR    TL    RC    PR    TL    RC    PR

0.2  Split-Lasso  0.42  0.48  0.17  0.32  0.42  0.24  0.22  0.33  0.36  0.20  0.26  0.53
0.2  Split-EN     0.43  0.54  0.16  0.32  0.51  0.22  0.22  0.43  0.34  0.20  0.34  0.52
0.2  Lasso        0.50  0.06  0.21  0.43  0.04  0.22  0.34  0.02  0.24  0.31  0.02  0.38
0.2  EN           0.47  0.11  0.23  0.39  0.07  0.24  0.29  0.05  0.26  0.26  0.04  0.39
0.2  Adaptive     0.58  0.06  0.22  0.51  0.04  0.21  0.44  0.02  0.24  0.38  0.02  0.37
0.2  Relaxed      0.54  0.06  0.24  0.60  0.03  0.24  0.43  0.02  0.24  0.42  0.02  0.36
0.2  MC+          0.65  0.01  0.35  0.69  0.01  0.30  0.63  0.00  0.23  0.65  0.00  0.44
0.2  RF           0.50  −     −           −     −           −     −           −     −
0.2  RGLM         0.45  0.38  0.16  0.40  0.23  0.21  0.36  0.15  0.27  0.35  0.12  0.44
0.2  SIS-SCAD     0.67  0.02  0.38  0.67  0.01  0.35  0.65  0.00  0.26  0.64  0.00  0.44
0.2  XGBoost      0.57  −     −           −     −           −     −           −     −

π    Method       TL    RC    PR    TL    RC    PR    TL    RC    PR    TL    RC    PR

0.3  Split-Lasso  0.51  0.49  0.17  0.38  0.40  0.23  0.26  0.33  0.34  0.24  0.25  0.51
0.3  Split-EN     0.52  0.54  0.16  0.38  0.49  0.22  0.26  0.42  0.33  0.23  0.34  0.51
0.3  Lasso        0.59  0.08  0.24  0.48  0.04  0.23  0.39  0.03  0.25  0.35  0.02  0.35
0.3  EN           0.56  0.13  0.25  0.45  0.08  0.25  0.34  0.05  0.28  0.30  0.04  0.39
0.3  Adaptive     0.69  0.08  0.23  0.58  0.04  0.23  0.46  0.03  0.25  0.44  0.02  0.35
0.3  Relaxed      0.68  0.07  0.26  0.60  0.04  0.24  0.48  0.02  0.25  0.46  0.02  0.36
0.3  MC+          0.70  0.02  0.25  0.65  0.01  0.32  0.61  0.01  0.30  0.61  0.00  0.43
0.3  RF           0.61  −     −           −     −           −     −           −     −
0.3  RGLM         0.53  0.44  0.16  0.46  0.29  0.23  0.42  0.18  0.29  0.41  0.13  0.45
0.3  SIS-SCAD     0.77  0.03  0.46  0.69  0.01  0.39  0.64  0.01  0.32  0.64  0.00  0.42
0.3  XGBoost      0.64  −     −           −     −           −     −           −     −

π    Method       TL    RC    PR    TL    RC    PR    TL    RC    PR    TL    RC    PR

0.4  Split-Lasso  0.56  0.45  0.19  0.41  0.40  0.24  0.30  0.32  0.33  0.25  0.25  0.49
0.4  Split-EN     0.56  0.48  0.19  0.41  0.47  0.23  0.30  0.40  0.32  0.24  0.34  0.49
0.4  Lasso        0.62  0.08  0.25  0.51  0.05  0.24  0.43  0.03  0.26  0.38  0.02  0.38
0.4  EN           0.61  0.13  0.26  0.48  0.09  0.27  0.38  0.06  0.29  0.32  0.04  0.40
0.4  Adaptive     0.71  0.08  0.26  0.59  0.05  0.24  0.52  0.03  0.26  0.46  0.02  0.38
0.4  Relaxed      0.66  0.08  0.31  0.64  0.04  0.27  0.50  0.03  0.26  0.51  0.02  0.37
0.4  MC+          0.71  0.03  0.28  0.64  0.01  0.26  0.61  0.01  0.27  0.60  0.01  0.38
0.4  RF           0.66  −     −           −     −           −     −           −     −
0.4  RGLM         0.57  0.47  0.16  0.49  0.31  0.23  0.46  0.19  0.30  0.44  0.15  0.46
0.4  SIS-SCAD     0.72  0.04  0.58  0.70  0.01  0.45  0.71  0.01  0.36  0.68  0.00  0.52
0.4  XGBoost      0.68  −     −           −     −           −     −           −     −

             p =         p =         p =         p =
Method       MR    TL    MR    TL    MR    TL    MR    TL
Split-Lasso  1.38  1.22  1.66  1.27  1.58  1.25  1.38  1.29
Split-EN     1.00  1.00  1.00  1.00  1.00  1.00  1.03  1.00
Lasso        1.47  1.44  1.95  1.74  1.87  1.74  1.85  1.83
EN           1.06  1.05  1.21  1.08  1.24  1.11  1.13  1.09
Adaptive     2.81  2.08  3.08  2.21  3.95  2.72  4.08  2.86
Relaxed      1.79  5.47  2.05  6.08  2.21  6.69  2.03  7.15
MC+          2.83  2.32  3.55  2.68  3.58  2.79  3.72  3.05
RF           1.28  1.24  1.13  1.38  1.05  1.45  1.00  1.63
RGLM         1.83  1.43  1.89  1.43  1.76  1.44  1.74  1.57
SIS-SCAD     2.70  2.14  3.42  2.51  3.32  2.54  3.26  2.72
XGBoost      2.64  2.27  3.26  2.58  3.26  2.67  3.18  2.85

Table 57: MR and TL relative performances for GSE20347 and training proportion 0.50.

             p =         p =         p =         p =
Method       MR    TL    MR    TL    MR    TL    MR    TL
Split-Lasso  1.08  1.06  1.25  1.12  1.71  1.18  2.40  1.18
Split-EN     1.00  1.01  1.10  1.00  1.57  1.00  2.00  1.00
Lasso        1.36  1.29  1.80  1.41  2.50  1.54  3.60  1.57
EN           1.12  1.00  1.10  1.06  1.50  1.06  2.00  1.06
Adaptive     1.52  1.94  1.80  2.25  2.71  2.37  3.70  2.64
Relaxed      1.72  3.92  1.60  3.96  2.64  4.61  2.90  3.90
MC+          2.48  3.11  3.25  3.59  4.64  3.84  6.40  3.93
RF           1.04  1.39  1.00  1.61  1.00  1.86  1.00  2.25
RGLM         1.20  1.67  1.30  1.80  1.86  1.93  2.70  2.03
SIS-SCAD     2.72  3.58  3.50  4.17  5.07  4.46  7.20  4.61
XGBoost      2.68  4.08  3.35  4.64  4.79  4.98  6.70  5.10

Table 58: MR and TL relative performances for GSE23400 (part one) and training proportion 0.35.
             p =         p =         p =         p =
Method       MR    TL    MR    TL    MR    TL    MR    TL
Split-Lasso  1.00  1.02  1.01  1.03  1.00  1.02  1.00  1.02
Split-EN     1.02  1.00  1.00  1.00  1.00  1.00  1.00  1.00
Lasso        1.05  1.13  1.09  1.19  1.10  1.18  1.04  1.18
EN           1.05  1.04  1.06  1.08  1.10  1.09  1.02  1.08
Adaptive     1.04  1.42  1.08  1.45  1.11  1.44  1.08  1.49
Relaxed      1.21  3.45  1.16  2.98  1.17  2.92  1.23  3.91
MC+          1.48  1.68  1.44  1.74  1.51  1.75  1.50  1.75
RF           1.18  1.25  1.14  1.32  1.16  1.34  1.09  1.37
RGLM         1.22  1.17  1.12  1.23  1.17  1.22  1.10  1.22
SIS-SCAD     1.63  1.95  1.61  2.02  1.70  2.06  1.64  2.04
XGBoost      1.72  2.22  1.72  2.43  1.79  2.41  1.74  2.36

Table 59: MR and TL relative performances for GSE23400 (part one) and training proportion 0.50.

             p =         p =         p =         p = 1,000
Method       MR    TL    MR    TL    MR    TL    MR    TL
Split-Lasso  1.00  1.01  1.04  1.00  1.02  1.00  1.06  1.04
Split-EN     1.04  1.00  1.00  1.00  1.00  1.00  1.00  1.00
Lasso        1.13  1.19  1.18  1.20  1.16  1.19  1.16  1.18
EN           1.07  1.06  1.09  1.09  1.10  1.10  1.10  1.12
Adaptive     1.08  1.33  1.06  1.43  1.04  1.41  1.02  1.48
Relaxed      1.20  2.17  1.19  2.22  1.21  2.59  1.23  2.42
MC+          1.41  1.62  1.46  1.63  1.42  1.59  1.36  1.59
RF           1.19  1.28  1.14  1.28  1.09  1.28  1.06  1.32
RGLM         1.09  1.12  1.06  1.15  1.02  1.13  1.03  1.19
SIS-SCAD     1.41  1.78  1.39  1.83  1.38  1.84  1.36  1.88
XGBoost      1.50  1.84  1.44  1.86  1.46  1.87  1.44  1.89

Table 60: MR and TL relative performances for GSE23400 (part two) and training proportion 0.35.

             p =         p =         p =         p =
Method       MR    TL    MR    TL    MR    TL    MR    TL
Split-Lasso  1.04  1.06  1.00  1.01  1.01  1.00  1.10  1.00
Split-EN     1.01  1.02  1.00  1.00  1.04  1.00  1.06  1.00
Lasso        1.02  1.05  1.06  1.08  1.11  1.12  1.14  1.06
EN           1.00  1.00  1.02  1.01  1.06  1.05  1.12  1.03
Adaptive     1.05  1.22  1.00  1.31  1.00  1.31  1.00  1.28
Relaxed      1.07  2.74  1.08  2.84  1.15  2.74  1.12  2.78
MC+          1.14  1.30  1.11  1.30  1.12  1.30  1.12  1.28
RF           1.21  1.21  1.17  1.24  1.14  1.28  1.15  1.31
RGLM         1.14  1.03  1.13  1.08  1.10  1.12  1.16  1.15
SIS-SCAD     1.12  1.49  1.10  1.52  1.14  1.53  1.18  1.52
XGBoost      1.61  1.94  1.56  1.99  1.58  1.97  1.61  1.94

Table 61: MR and TL relative performances for GSE23400 (part two) and training proportion 0.50.
             p =         p =         p =         p =
Method       MR    TL    MR    TL    MR    TL    MR    TL
Split-Lasso  1.07  1.07  1.02  1.00  1.00  1.01  1.05  1.00
Split-EN     1.09  1.06  1.00  1.00  1.04  1.00  1.05  1.00
Lasso        1.08  1.04  1.01  1.04  1.11  1.09  1.11  1.13
EN           1.00  1.00  1.02  1.00  1.08  1.06  1.13  1.06
Adaptive     1.01  1.26  1.06  1.35  1.08  1.38  1.01  1.38
Relaxed      1.09  2.51  1.02  2.47  1.16  2.84  1.12  2.97
MC+          1.05  1.35  1.01  1.27  1.02  1.25  1.00  1.22
RF           1.27  1.28  1.19  1.26  1.18  1.29  1.13  1.29
RGLM         1.20  1.10  1.17  1.08  1.10  1.09  1.13  1.10
SIS-SCAD     1.07  1.49  1.08  1.55  1.10  1.58  1.05  1.44
XGBoost      1.51  1.73  1.48  1.67  1.48  1.67  1.39  1.59

Table 62: MR and TL relative performances for GSE10245 and training proportion 0.35.

             p =         p =         p =         p =
Method       MR    TL    MR    TL    MR    TL    MR    TL
Split-Lasso  1.00  1.03  1.02  1.05  1.02  1.05  1.01  1.04
Split-EN     1.01  1.00  1.00  1.00  1.00  1.00  1.00  1.00
Lasso        1.33  1.26  1.26  1.25  1.36  1.40  1.33  1.38
EN           1.03  1.03  1.07  1.08  1.09  1.11  1.08  1.13
Adaptive     2.13  1.73  2.19  1.80  2.28  1.89  2.31  1.90
Relaxed      1.45  2.95  1.28  2.41  1.43  3.46  1.52  3.71
MC+          2.69  2.02  2.81  2.12  2.79  2.08  2.55  1.99
RF           1.28  1.36  1.34  1.44  1.37  1.50  1.35  1.53
RGLM         1.53  1.40  1.43  1.38  1.34  1.40  1.22  1.40
SIS-SCAD     3.02  2.19  3.01  2.21  2.91  2.19  2.79  2.13
XGBoost      2.61  2.02  2.58  2.00  2.44  1.97  2.35  1.93

Table 63: MR and TL relative performances for GSE10245 and training proportion 0.50.

             p =         p =         p =         p = 1,000
Method       MR    TL    MR    TL    MR    TL    MR    TL
Split-Lasso  1.05  1.05  1.10  1.09  1.04  1.05  1.00  1.00
Split-EN     1.00  1.00  1.00  1.00  1.00  1.00  1.03  1.00
Lasso        1.23  1.17  1.43  1.25  1.17  1.14  1.23  1.16
EN           1.09  1.02  1.12  1.06  1.05  1.03  1.07  1.01
Adaptive     2.21  1.73  2.55  1.88  2.26  1.75  2.37  1.86
Relaxed      1.46  3.47  1.49  3.57  1.29  3.46  1.32  2.57
MC+          2.54  1.88  2.80  2.04  2.81  2.06  2.75  2.01
RF           1.30  1.38  1.36  1.48  1.32  1.47  1.31  1.51
RGLM         1.49  1.37  1.57  1.40  1.18  1.34  1.37  1.36
SIS-SCAD     2.88  2.17  3.00  2.21  2.74  2.11  2.87  2.11
XGBoost      2.68  2.17  2.93  2.36  2.71  2.29  2.76  2.26

Table 64: MR and TL relative performances for GSE5364 (lung) and training proportion 0.50.
             p =         p =         p =         p =
Method       MR    TL    MR    TL    MR    TL    MR    TL
Split-Lasso  1.12  1.01  1.33  1.02  1.50  1.02  1.65  1.02
Split-EN     1.10  1.00  1.28  1.00  1.39  1.00  1.60  1.00
Lasso        1.45  1.25  1.72  1.24  2.02  1.24  2.48  1.28
EN           1.29  1.08  1.44  1.08  1.61  1.08  1.75  1.07
Adaptive     2.39  1.56  2.95  1.53  3.11  1.61  3.85  1.64
Relaxed      1.39  3.62  1.73  3.66  1.90  3.27  2.56  3.84
MC+          1.65  1.34  2.09  1.40  2.25  1.39  2.79  1.43
RF           1.00  1.00  1.00  1.03  1.00  1.05  1.00  1.06
RGLM         1.28  1.04  1.41  1.04  1.61  1.06  1.69  1.07
SIS-SCAD     1.96  1.36  2.20  1.38  2.45  1.37  2.94  1.38
XGBoost      1.86  1.32  2.02  1.29  2.30  1.30  2.69  1.30

Table 65: MR and TL relative performances for GSE25869 and training proportion 0.35.

             p =         p =         p =         p = 1,000
Method       MR    TL    MR    TL    MR    TL    MR    TL
Split-Lasso  1.02  1.17  1.03  1.24  1.05  1.29  1.05  1.25
Split-EN     1.00  1.12  1.01  1.19  1.03  1.19  1.02  1.15
Lasso        1.11  1.30  1.06  1.36  1.12  1.36  1.14  1.43
EN           1.04  1.16  1.02  1.25  1.10  1.33  1.02  1.31
Adaptive     1.30  1.23  1.27  1.36  1.41  1.37  1.40  1.39
Relaxed      1.14  2.61  1.13  2.76  1.18  3.22  1.21  2.93
MC+          1.30  1.26  1.37  1.41  1.39  1.40  1.48  1.42
RF           1.04  1.00  1.08  1.09  1.05  1.11  1.04  1.11
RGLM         1.01  1.05  1.00  1.00  1.00  1.00  1.00  1.00
SIS-SCAD     1.35  1.29  1.35  1.39  1.36  1.40  1.35  1.40
XGBoost      1.33  1.39  1.35  1.56  1.43  1.63  1.44  1.61

Table 66: MR and TL relative performances for GSE25869 and training proportion 0.50.

             p =         p =         p =         p =
Method       MR    TL    MR    TL    MR    TL    MR    TL
Split-Lasso  1.09  1.13  1.11  1.22  1.08  1.28  1.02  1.14
Split-EN     1.08  1.09  1.09  1.23  1.04  1.20  1.01  1.10
Lasso        1.19  1.29  1.18  1.50  1.19  1.47  1.12  1.31
EN           1.06  1.17  1.13  1.31  1.13  1.30  1.07  1.18
Adaptive     1.17  1.22  1.22  1.34  1.26  1.40  1.33  1.33
Relaxed      1.20  2.25  1.25  2.66  1.20  2.63  1.19  2.46
MC+          1.45  1.32  1.41  1.41  1.44  1.48  1.35  1.41
RF           1.11  1.06  1.04  1.08  1.09  1.11  1.04  1.06
RGLM         1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
SIS-SCAD     1.53  1.36  1.43  1.39  1.45  1.46  1.44  1.41
XGBoost      1.13  1.11  1.17  1.21  1.18  1.25  1.14  1.17

Table 67: MR and TL relative performances for GSE5364 (thyroid) and training proportion 0.50.

             p =         p =         p =         p = 1,000
Method       MR    TL    MR    TL    MR    TL    MR    TL