Improved supervised prediction of aging-related genes via weighted dynamic network analysis
II MPROVING SUPERVISED PREDICTION OF AGING - RELATEDGENES VIA DYNAMIC NETWORK ANALYSIS
Qi Li and Tijana Milenkovi´c *
Department of Computer Science, Center for Network & Data Science, and Eck Institute for Global HealthUniversity of Notre DameNotre Dame, IN 46556, USA [email protected], [email protected] A BSTRACT
Motivation:
This study focuses on supervised prediction of aging-related genes from -omics data.Unlike gene expression methods that capture aging-specific information but study genes in isolation ,or protein-protein interaction (PPI) network methods that account for PPIs but the PPIs are context-unspecific , we recently integrated the two data types into an aging-specific PPI subnetwork, whichyielded more accurate aging-related gene predictions. However, a dynamic aging-specific subnetworkdid improve prediction performance compared to a static aging-specific subnetwork, despite the agingprocess being dynamic.
Results:
So, here, we propose computational advances towards improving prediction accuracy froma dynamic aging-specific subnetwork. We develop a supervised learning model that when applied toa dynamic subnetwork yields extremely high prediction performance, with F-score of 91.4%, whilethe best model on any static subnetwork yields F-score of “only” 74.3%. Hence, our predictive modelcould guide with high confidence the discovery of novel aging-related gene candidates for future wetlab validation.
Contact: [email protected]
Supplementary information:
Attached.
Incidence of many complex diseases, such as diabetes, cancer, osteoarthritis, cardiovascular, and Alzheimer’s disease,increases with age [1]. Even the most recent and widespread COVID-19 seems to be related to aging. According to theU.S. Centers for Disease Control and Prevention , as of April 17, 2020, 22.41% of all deaths related to COVID-19occurred in the wide age range of 64 and under, while 22.23% occurred in the short 65-74 age range alone, 27.23% inthe short 75-84 age range alone, and 28.13% in the short age range of 85 and over.Understanding of the molecular mechanisms behind the aging process, including identification of human genesimplicated in aging, is important for understanding and treating such aging-related diseases [2, 3]. However, wet labexperimental analyses of human aging are hard due to long human life span and ethical constraints [4,5]. Computationalanalyses can fill this gap. This includes identification (i.e., prediction) of aging-related genes via supervised learningfrom human -omics data [3, 6, 7], which is the task that we focus on in this paper. Approaches for achieving this taskcan be categorized as follows: (i) those that use gene expression data alone, (i) those that use PPI network data alone,and (iii) those that combine gene expression data with PPI network data, as follows.Approaches from the first category predict a gene as aging-related if its expression level varies with age [8–12]. Whilesuch approaches do capture aging-specific information, they study genes in isolation from each other. This is theirdrawback, because genes, i.e., their protein products, carry out cellular functioning, including the aging process, byinteracting with each other [13].Approaches from the second category predict a gene as aging-related if its position (i.e., node representa-tion/embedding/feature) in the PPI network is “similar enough” to the network positions of known aging-relatedgenes [6, 7, 14, 15]. While these approaches do capture PPIs that carry out cellular functioning, their drawback is thatthe PPIs are context-unspecific, i.e., they span different conditions, such as cell types, tissues, diseases, environments,patients, etc. In the context of our study, this means that the PPIs are not aging-specific, i.e., they span different ages. a r X i v : . [ q - b i o . M N ] M a y pproaches from the third category address the above drawbacks by considering both aging-specific gene expressiondata and context-unspecific PPI network data. Earlier approaches of this type extracted genes’ features from each ofgene expression data and PPI network data and then concatenated the features [6, 7]. As such, they integrated thefeatures rather than the data. Consequently, they still considered the context-unspecific PPI network. More recentefforts, including by our group, first integrated the two data types to infer an aging-specific subnetwork of the entirecontext-unspecific PPI network, and then analyzed the resulting subnetwork [4, 16]. Specifically, these studies useda traditional induced approach : given gene expression data for different ages and an entire context-unspecific PPInetwork (which happens to be static ), the studies formed a subnetwork snapshot for each age, where each snapshotconsisted of all genes that were significantly expressed (i.e., active) at the given age and all of their PPIs from the entirenetwork. That is, they formed a given age-specific snapshot by taking the induced subgraph on the active genes at thatage. Then, they combined snapshots for all ages into a dynamic aging-specific PPI subnetwork. Such a subnetwork,being dynamic, is meant to capture how network positions of genes change with age.These studies used their inferred dynamic subnetworks in unsupervised aging-related tasks [4, 16]. So, while theirinferred subnetworks are relevant for this paper, their unsupervised analyses of the subnetworks fall outside of its scope. Supervised analyses from our other, even more recent work [17, 18] are directly relevant for this paper. In that work, weperformed supervised prediction of aging-related genes from three networks: the dynamic aging-specific subnetworkby Faisal et al. [4], a static (still aging-specific) counterpart of the dynamic subnetwork, and the entire (also static)context-unspecific network from which the dynamic aging-specific subnetwork was inferred. Then, we examinedwhether using the dynamic or static aging-specific subnetwork improved the prediction accuracy compared to using theentire (static) context-unspecific PPI network. Indeed, we found this to be the case (which is why it is no longer neededto examine the entire context-unspecific PPI network when predicting aging-related genes). Also, because the agingprocess is dynamic, we examined whether using the dynamic aging-specific subnetwork was superior to using the staticaging-specific subnetwork. Surprisingly, we found this not to hold.This unexpected finding could be because the dynamic subnetwork was inferred using the induced approach, whichis quite naive. Recently, an alternative, methodologically more advanced notion of network propagation (NP) wasused for inference of a static context-specific subnetwork in the context of cancer [19], with prominent NP approachesbeing NetWalk [20] and HotNet2 [21]. NP maps expression levels (i.e., activities) onto the genes in the entire context-unspecific PPI network. Then, NP propagates the activities via random walks or diffusion, to assign condition-specificweights to the nodes (genes, i.e., their protein products) or edges (PPIs) in the entire network. Finally, NP assumes thatit is the highest-weighted network regions that are the most relevant for the condition of interest, i.e., that such regionsform the context-specific subnetwork. For information on methodological differences between NetWalk and HotNet2,please refer to their original publications or to work by Newaz et al. [22].Our group extended these NP approaches to allow for inference of a dynamic context-specific subnetwork in the contextof aging [22]. Then, the NP-based subnetworks were evaluated in an unsupervised learning context. So, while theinferred NP-based subnetworks are relevant for this paper, the unsupervised analyses of the networks by Newaz etal. [22] fall outside of its scope.Here, besides the induced dynamic aging-specific subnetwork by Faisal et al. [4], we also analyze the NP dynamicsubnetworks by Newaz et al. [22], plus static aging-specific counterparts of all dynamic subnetworks, with a hypothesisthat a dynamic aging-specific subnetwork will be superior to all static aging-specific subnetworks in the task ofsupervised prediction of aging-related genes.
In each (dynamic vs. static, induced vs. NP) aging-specific subnetwork, we label genes as aging- or (currently)non-aging-related by relying on GenAge, a trusted source of aging-related knowledge [23].For each subnetwork, we develop a predictive model that consists of a network feature, dimensionality choice, andclassifier. We consider 11 features, seven of which are dynamic, i.e., can be extracted from a dynamic subnetwork, andfour of which are static, i.e., can be extracted from a static subnetwork. For each feature, we examine its full version, aswell as its seven (non-)linear dimensionality-reduced versions. We couple each feature (version) with nine classifiers.For each dynamic subnetwork, we consider × (1 + 7) × predictive models, i.e., combinations of feature,dimensionality choice, and classifier. For each static subnetwork, we consider × (1 + 7) × predictive models.For each subnetwork, we evaluate each predictive model via cross-validation [18]: we train on a subset of the aging-and non-aging-related genes and test prediction accuracy on the remaining genes. We evaluate accuracy of a predictivemodel in terms of the area under the precision-recall curve (AUPR), precision, recall, and F-score. Then, for eachsubnetwork, we choose the most accurate predictive model. Finally, we compare accuracy of the different subnetworks.2o validate our hypothesis, it would suffice for the best dynamic subnetwork to yield higher prediction accuracy than allconsidered static subnetworks. Indeed, we find this to hold, which justifies the need for our study. Importantly, our bestmodel on a dynamic subnetwork yields extremely high accuracy when predicting aging-related genes, with AUPR of95.6%, precision of 93.8%, recall of 89.2%, and F-score 91.4%.During our comprehensive evaluation, we examine two additional questions. First, the above results are when usingGenAge to label genes as aging- vs. non-aging-related for classification. Human genes in GenAge are sequenceorthologs of aging-related genes in model species. Because not all human genes have sequence orthologs in modelspecies [24, 25], using GenAge may miss aspects of the aging process that are unique to human [26]. So, we consideranother source of aging-related knowledge obtained by studying human directly (rather than doing so indirectly frommodel species), namely the genotype-tissue expression (GTEx) project [8]. Second, the above results are when usingthe induced and NP subnetworks inferred from a particular context-unspecific PPI network [27]. We also considerinduced and NP subnetworks inferred from another context-unspecific PPI network [28, 29]. We discuss the findingswhen using the alternative aging-related data or context-unspecific PPI network in Section 3. In this study, we consider aging-specific subnetworks inferred from one of two entire context-unspecific human PPInetworks: HPRD [27] and HINT+HI2 [28, 29]. The latter is newer. HPRD contains PPIs from the correspondingdatabase. HINT combines PPIs from eight databases. Then, HINT+HI2 combines PPIs from HINT and the HI2database. The HPRD network has 8,938 nodes and 35,900 edges. The HINT+HI2 network has 9,858 nodes and 40,705edges. 6,989 nodes and 14,791 edges are common to the two networks.
For each of the two entire context-unspecific networks (HPRD or HINT+HI2), we consider three dynamic aging-specificsubnetworks (one induced and two NP-based), their compressed counterparts, and their static counterparts (describedbelow and summarized in Table 1).First, for HPRD and HINT+HI2, we use dynamic aging-specific subnetworks inferred by Faisal et al. [4] and Newaz etal. [22], respectively. Both were inferred using the induced approach: given human gene expression data at 37 differentages between 20 and 99 [10], a dynamic aging-specific subnetwork consisting of 37 subnetwork snapshots, one snapshotper age, was constructed; each subnetwork snapshot corresponded to the induced subgraph of those genes that aresignificantly expressed (i.e., are active) at the given age. Henceforth, we denote the resulting networks as Induced.Second, for a given entire PPI network (HPRD or HINT+HI2), we use two dynamic subnetworks inferred by Newaz etal. [22] using NP. Key differences between the induced approach and NP are as follows [22]. The induced approachconsiders all PPIs from the entire context-unspecific network that exist between only active genes . On the other hand,with an assumption that not all PPIs between the active genes might be equally important, NP’s weighting mechanismallows it to select only highest-weighted of all PPIs between the active genes. Also, with a different assumption that itis important to consider both active genes and non-active genes that critically connect the active genes in the entirenetwork, NP propagates activities of highly expressed (i.e., active) genes to other (i.e., non-active) genes in the network.As such, it could assign a high weight to a non-active gene if it (directly or indirectly) links sufficiently many activegenes. Consequently, NP could include such a high-weight non-active gene into its context-specific subnetwork.Newaz et al. [22] extended two prominent NP approaches originally proposed for inference of a static condition-specificsubnetwork in the context of cancer, NetWalk and HotNet2, to the problem of inferring a dynamic context-specificsubnetwork in the context of aging. They evaluated the subnetworks produced by NetWalk against those produced byHotNet2, and found the former to be better, i.e., to overlap more with aging-related ground truth data. Consequently, inthis paper, we only focus on the dynamic aging-specific subnetworks inferred using NetWalk. For this NP approach,Newaz et al. [22] considered two versions, referred to as NetWalk and NetWalk*. Their key difference is which geneexpression levels (i.e., activity scores) they assign to which genes prior to propagating the gene activities through thenetwork. Intuitively, the former assigns actual expression levels to only those genes that are significantly expressed(active) at a given age and “non-informative” (i.e., “dummy”) expression levels to all other genes in the entire network,while the latter assigns actual expression levels to both active and non-active genes; for more information, see [22].Multiple dynamic aging-specific subnetworks were produced by each of NetWalk and NetWalk*. Here, for each ofHPRD and HINT+HI2, we select the best of all NetWalk dynamic subnetworks, and the best of all NetWalk* dynamicsubnetworks; by “best”, we mean the subnetwork that overlapped the most with the aging-related ground truth data3ccording to the analysis by Newaz et al. [22] or that has resulted in the best prediction accuracy in our evaluation fromthis paper. Specifically, our selected NetWalk dynamic subnetwork based on HPRD was referred to as NetWalk-2 in theoriginal publication [22]; our selected NetWalk* dynamic subnetwork based on HPRD was referred to as NetWalk*;our selected NetWalk dynamic subnetwork based on HINT+HI2 was referred to as NetWalk; and our selected NetWalk*dynamic subnetwork based on HINT+HI2 was referred to as NetWalk*-2.So, for each of HPRD and HINT+HI2, we consider three dynamic aging-related subnetworks: Induced, NetWalk, andNetWalk*.Because our key question is whether aging-related genes can be predicted more accurately from a dynamic aging-relatedsubnetwork than from a static subnetwork, for each dynamic subnetwork, we create its two static counterparts, whichwe refer to as compressed and static, as follows.We obtain a compressed aging-related subnetwork by aggregating (i.e., taking the union of) all nodes and edges over all37 snapshots of the corresponding dynamic subnetwork. That is, the compressed subnetwork contains any node andedge that appears in any of the 37 snapshots. In other words, we form a compressed subnetwork by first integratinggene expression data at each individual age with the entire context-unspecific PPI network, and then aggregating thenetwork information over all ages. This allows for fairly studying the effect of dynamic vs. static network analysis bycomparing a dynamic subnetwork against its compressed counterpart, as the two contain the same nodes and edges.On the other hand, we form a static aging-related subnetwork by first aggregating gene expression data over all ages (sothat a node is considered active if it is active at any one or more of the 37 ages), and then integrating the aggregatedgene expression-based activity data with the entire context-unspecific PPI network. This way, we directly infer a staticsubnetwork, rather than doing so indirectly by first inferring a dynamic and only then a compressed subnetwork. ForInduced, this procedure yields a static subnetwork that is the same as its compressed counterpart. For each of NetWalkand NetWalk*, nodes or edges could differ between its compressed subnetwork and its static subnetwork. For thesetwo approaches, after the aggregated gene expression data is propagated over the entire context-unspecific networkand weights are assigned to all edges in the entire network, we choose the edge weight threshold that results in thestatic subnetwork matching the number of edges in its compressed counterpart as closely as possible. This way, thecompressed subnetwork and its static counterpart are fairly comparable to each other.In total, for each of the two entire networks (HPRD and HINT+HI2), we have three groups of aging-specific subnetworks(Induced, NetWalk, and NetWalk*), each with three versions (Dynamic, Compressed, and Static). That is, we consider × × aging-specific subnetworks, which are summarized in Table 1.Table 1: The sizes of the 18 considered aging-specific subnetworks. The size of a dynamic subnetwork is the averageover its 37 subnetwork snapshots. The purpose of using the following data sets is described in Section 2.1.4.• GenAge [23] is a highly trusted source of human aging-related genes, most of which are sequence orthologs ofexperimentally validated aging-related genes in model organisms. Of all 307 human genes from GenAge, 84are present in all 18 considered subnetworks. Henceforth, we denote the 84-gene set as GenAge.• GTEx [8] contains 863 genes whose expressions decrease with age (i.e., are down-regulated ), of which 131are present in all subnetworks. Henceforth, we denote the 131-gene set as GTEx-DAG.• GTEx contains 710 genes whose expressions increase with age (i.e., are up-regulated ), of which 33 are presentin all 18 considered subnetworks. Henceforth, we denote the 33-gene set as GTEx-UAG.• Lu et al. [9] identified 442 genes whose expressions vary with age, of which 87 are present in all 18 consideredsubnetworks.• Berchtold et al. [10] identified 8,277 genes whose expressions vary with age, of which 693 are present in all18 considered subnetworks. 4 Simpson et al. [12] identified 2,911 genes whose expressions vary across different stages of Alzheimer’sdisease, of which 272 genes are present in all 18 considered subnetworks.
For classification (discussed in the following sections), we need a set of genes labeled as aging-related (positive class)and a set of genes labeled as non-aging-related (a negative class). Because GenAge is considered to be one of the mostconfident aging-related ground truth data sources, our primary label definition is with respect to GenAge. However,human genes in GenAge are sequence orthologs of aging-related genes in model organisms, but human aging is acomplex process that has unique biological aspects compared to the aging process in a model species [26]. So, tohopefully capture the human-unique aspects, we separately consider a secondary label definition with respect to GTEx,which was obtained by studying aging directly in human. Specifically, of the entire GTEx, we focus on GTEx-DAG,because we analyze PPI data, and because genes in GTEx-DAG were shown to be relevant for PPIs, unlike genes inGTEx-UAG [8]. The two label definitions are as follows.
Primary definition with respect to GenAge.
We label as aging-related those 84 genes from GenAge that exist in all 18considered subnetworks. Because we want our non-aging-related gene set to be as confident as possible, we label asnon-aging-related those genes that exist in all 18 subnetworks but not in any of the six considered aging-related groundtruth data sets; there are 315 such genes. We intentionally use the same 84-gene positive class and the same 315-genenegative class for classification in all 18 subnetworks. This allows us to fairly compare classification accuracy (i.e.,accuracy of aging-related gene prediction) across all subnetworks.
Secondary definition with respect to GTEx-DAG.
We label as aging-related those 131 genes from GTEx-DAG thatexist in all 18 subnetworks. We label as non-aging-related those 315 genes that exist in all 18 subnetworks but not inany of the six considered aging-related ground truth data sets. Note that we use the GenAge definition separately fromthe GTEx-DAG definition, i.e., we perform classification for each of the two individually. This is because the two arevery different (sequence- vs. expression-based) data sets that possibly capture different aspects of the aging process.Indeed, of the 84 GenAge-based and 131 GTEx-based aging-related genes, only 14 genes are common to the two genesets. • DGDV, dynamic graphlet degree vector [30], counts how many times a node participates in each considereddynamic graphlet. Dynamic graphlets are an extension of static graphlets (small subgraphs on up to n nodes)to the dynamic context, where the temporal order in which events occur in a dynamic network is added ontoedges of a static graphlet. As in the original DGDV paper [30], we use up to 4-node and 6-event graphlets,resulting in a 3,727-dimensional DGDV node feature.• GoT, graphlet orbit transitions [31], of a node counts, for every pair of considered graphlets, how many timesone graphlet changes into the other. As in the original GoT publication [31], and for a fair comparison withDGDV, we use 4-node graphlets, resulting in a 121-dimensional GoT node feature.• GDC, graphlet degree centrality [32], is defined for a node in a static network (as discussed below). We useGDC to compute, for each node v , v ’s centrality in each of the 37 snapshots of a dynamic subnetwork. Then,we combine v ’s 37 GDC values into its 37-dimensional dynamic GDC feature. We do the same for all otherconsidered centrality features (listed below), each of which results in a 37-dimensional node feature. Goingback to the definition of GDC in a static network: it ranks a node v as central if v participates in many graphletsor in complex (large or dense) graphlets (or both).For GDC, we use the code by Faisal et al. [4], which considers up to 5-node graphlets without an option tocustomize the graphlet size.• ECC, eccentricity centrality [4], of a node v calculates distance (i.e., the shortest path length) between v andeach other node in the network, finds a node u that is the most distant to v , and computes the reciprocal of thedistance between u and v .• KC, k -coreness [4], of a node v is the size of the largest network core to which v belongs. A network coreis a subgraph in which each node is connected to at least k other nodes. When v belongs to a k -core, it alsobelongs to 1-core, 2-core, 3-core, ..., ( k -1)-core. Then, KC of v is the size of its largest k -core.• DegC, degree centrality [4], of a node v measures how many other nodes in the network v is connected to.5 CentraMV, centrality mean and variation [17, 18], of a node v measures, for each of the four consideredcentrality-based features (GDC, ECC, KC, and DegC), two quantities: the mean and variation over v ’s 37centrality values. These two quantities for each of the four centrality-based features combined form an8-dimensional CentraMV node feature. • SGDV, static graphlet degree vector [33], a static counterpart of DGDV, counts how many times a nodeparticipates in each considered static graphlet. Just as for DGDV and GoT, we consider up to 4-node staticgraphlets. This results in a 15-dimensional SGDV node feature. We have tested up to 5-node SGDV in thisstudy, but this has not yielded an improvement (results not shown). So, for simplicity, we just report results forup to 4-node SGDV.• cSGDV, colored SGDV [34], is a heterogeneous analog of SGDV. It counts how many times a node participatesin each heterogeneous graphlet on up to n nodes and with c colors. We consider up to four nodes and useaging-related information about genes to derive four colors, just as done in our recent work [18]. Thisresults in a 225-dimensional cSGDV feature. We have verified that up to 5-node cSGDV does not yielded animprovement.• UniNet [7] aggregates 14 node centralities (DegC, ECC, KC, average shortest path, betweenness, closeness,clustering coefficient, neighborhood connectivity, radiality, stress, topological coefficient, aging neighborcount, aging neighbor ratio, and binary aging neighbor representations) into a 14-dimensional node feature.• m BPIs [6] works as follows. First, m highest-degree nodes in the network are found. Then, for a node v , v ’sfeature has m dimensions corresponding to the m nodes, where each dimension d of node v ’s feature has avalue of 1 if v interacts with the top m highest-degree node d , or otherwise it has a value of 0 if v does notinteract with d . Just as [17, 18], and as suggested in the original m BPIs publication, we use m = 30 in thispaper. This results in a 30-dimensional m BPIs node feature.
A common problem in supervised classification is overfitting, often due to a high dimensions of the feature usedin the classification task [35]. Some of our considered features have high dimensions, such as DGDV, GoT, andcSGDV. So, as in our previous work [17, 18], for each feature, we consider: (i) the full feature (i.e., no dimensionalityreduction), (ii) linear dimensionality reduction via principal component analysis (PCA) that considers as few principalcomponents as needed to account for at least 90% of variation in the data corresponding to the given feature, and(iii)-(viii) nonlinear dimensionality reduction via t -distributed stochastic neighbor embedding (tSNE) under six differentperplexity parameters (5, 13, 21, 30, 40, 50). This totals to considered dimensionality reduction choices.Feature dimensionality reduction can be done prior to or during cross-validation [18]. There is a debate on which isbetter [36]. So, we have examined both options and found their results to be virtually indistinguishable. Because ofthis, and because reduction during cross-validation avoids any potential leakage of information from the testing data inthe training step of cross-validation, unlike reduction prior to cross-validation, we report results only for the former –feature dimensionality reduction during cross-validation. We consider the same nine classifiers under the same parameters as in our previous work [17, 18], all implementedusing a Phython library scikit-learn (version 0.20.3) [37]: Adaptive Boosting (AB), Random Forest (RF), DecisionTree (Dtree), Logistic Regression (LR), Gaussian Naïve Bayes (NB), K -Nearest Neighbors (KNN), Support VectorMachine with a linear kernel (SVM-linear) and a non-linear kernel (radial basis function (SVM-rbf)), and MultilayerPerceptron (MLP); see Supplementary Section S1. All methodology in this section is described when using the primary GenAge-based definition of aging- and non-aging-related gene labels. Everything is analogous when using the secondary GTEx-DAG definition.
For each of the 18 age-specific subnetworks, we develop a predictive model that consists of a network feature,dimensionality choice, and classifier. For each feature, we consider each of the eight dimensionality choices. We runeach feature with each dimensionality choice under each of the nine classifiers. Given that of the 11 features, seven are6ynamic, for each dynamic subnetwork, we consider × × combinations of a feature, dimensionality choice,and classifier, i.e., 504 predictive models. Given that of the 11 features, four are static, for each static subnetwork, weconsider × × predictive models.Note that given that we do this for the two entire networks (HPRD and HINT+HI2), and for their three dynamic, threecompressed, and three static subnetworks, this totals to × (3 ×
504 + 3 ×
288 + 3 × , predictive models.And that is just for the GenAge-based definition of aging- and non-aging-related gene labels. The number is the samefor the GTEx-based definition. So, over the entire study, we develop and evaluate a total of 12,960 predictive models.We present this number to give an idea to the reader of how comprehensive our study is. We run each predictive model under a systematic 5-fold cross-validation framework [18]. That is, we randomly splitthe 84 aging- and 315 non-aging-related genes into five equal-size subsets. We train on the union of four subsetsof the aging- and non-aging-related genes and test the prediction accuracy on the remaining subset. We repeat thisprocess five times, such that each time we are using a different subset as testing data. For each of the five folds ofcross-validation, the output of a predictive model is a ranked list of the genes from the testing data based on their theirpredicted likelihood of being aging-related. Then, to make aging-related gene predictions from a model, a thresholdneeds to be selected so that those genes whose likelihoods of being aging-related are above the threshold are predictedas aging-related. We vary the prediction threshold from to (cid:100) (84 + 315) / (cid:101) = 80 in increments of . For aging-related genes predicted at a given threshold, we use their actual aging- and non-aging-related labels tocompute the numbers of true positives (TPs), false positives (FPs), and false negatives (FNs). A TP is a gene that ispredicted to be aging-related and is also labeled as aging-related. A FP is a gene that is predicted to be aging-related butis labeled as non-aging-related. A FN is a gene that is not predicted to be aging-related but is labeled as aging-related.Given these numbers, we evaluate accuracy of a predictive model in the given fold in terms of AUPR over all predictionthresholds, as well as via precision, recall, and F-score at the prediction threshold where (i) F-score is maximized, whileat the same time (ii) precision is at least as large as recall. Precision is / ( + ) . Recall is / ( + ) . F-score is the harmonic mean of precision and recall. For all four measures, we report apredictive model’s accuracy as the average over the five cross-validation folds. Note that we force the above condition(ii) because we believe that for potential wet lab validation of predictions, precision should be favored over recall, aslong as recall is not too low [17, 18]. And maximizing F-score is exactly what ensures that recall is not too low, i.e., thatboth precision and recall are reasonably high. For each of the 18 subnetworks, we select its best predictive model, i.e., the one that maximizes AUPR, to give it thebest-case advantage. Then, we report AUPR, precision, recall, and F-score of the selected model.
First, we compare accuracy of each selected predictive model against accuracy of a random approach. The latter worksas follows. For a given testing data set (i.e., cross-validation fold) and each prediction threshold g , we randomly select g testing genes and predict them as aging-related. Then, we repeat this for each of the five folds. To account forrandomness, we run these two steps 30 times, which means that we perform random aging-related gene prediction ×
30 = 150 times. We report the accuracy (AUPR, precision, recall, and F-score) of the random approach as theaverage over the 150 runs. A predictive model is good only if its accuracy is statistically significantly higher than thatof a random approach.Second, we compare pairs of the selected predictive models (plus the random approach), to determine if one model’spredicted gene set is statistically significantly more accurate than another model’s set. We do this by comparing the twomodels’ five accuracy scores corresponding to the five folds via the Wilcoxon rank sum test. Because we perform thistest for multiple pairs of models, we apply the false discovery rate (FDR) correction to adjust the p -values. We considerone model to be statistically significantly better than another if the adjusted p -value is below 0.05. To evaluate potential complementarity of gene sets predicted by different models, we measure their overlap via theJaccard index. We compute the statistical significance of the overlap size using the hypergeometric test. Because we7un this test for multiple pairs of predictive models, we correct the p -values using the FDR correction. We consider anoverlap to be statistically significantly large if the adjusted p -value is below 0.05. Sections 3.1-3.3 are for the nine subnetworks inferred from HPRD, when using the GenAge-based definition of aging-and non-aging-related gene labels. Section 3.4 is for the other nine subnetworks, also when using the GenAge-basedlabel definition. Section 3.5 is for both the nine subnetworks inferred from HPRD and the nine subnetworks inferredfrom HINT+HI2, but when using the GTEx-DAG-based label definition.
Recall that for each dynamic subnetwork (Induced-Dynamic, NetWalk-Dynamic, and NetWalk*-Dynamic), we consider7 features × × = 504 predictive models, and for each static subnetwork (Induced-Compressed, Induced-Static, NetWalk-Compressed, NetWalk-Static, NetWalk*-Compressed, and NetWalk*-Static), weconsider × × predictive models. Then, for each subnetwork, we select its best (most accurate) predictivemodel (Table 2). Recall that Induced-Compressed and Induced-Static are the exact same network. So, their resultsmatch, which is why we report them jointly.Table 2: The selected (i.e., best) predictive model for each of the three dynamic (red), three compressed (blue), andthree static (green) aging-specific subnetworks inferred from HPRD. The predictive models are presented in the“feature + dimensionality choice + classifier” format. Note that for dimensionality choice, “None” corresponds to nodimensionality reduction.The selected model matches only between the three compressed subnetworks (and thus also Induced-Static). All otherselected models differ in at least one of feature, dimensionality choice, and classifier. In more detail, the nine selectedmodels (corresponding to the nine subnetworks) encompass four features, two out of the seven features for the dynamicsubnetworks and two out of the four features for the static subnetworks. They encompass two dimensionality choices,of which no reduction is preferred for eight out of the nine subnetworks. Finally, the selected models encompass threeclassifiers. So, while some individual components of a model are somewhat preferred over others (especially the UniNetfeature on the static subnetworks, or no dimensionality reduction on all subnetworks), there is no model as a whole thatis selected for more than one subnetwork except the one for the compressed subnetworks. However, as we will show inthe following sections, the most accurate of all selected models is the one for NetWalk*-Dynamic. So, in summary,there is no clear pattern regarding which component of which model works well on which network.This predictive model inconsistency across the subnetworks stresses out the need for our comprehensive evaluationof running all of the different combinations of feature, dimensionality choice, and classifier, in order to give eachsubnetwork the best-case advantage. Also, it emphasizes the importance of a shift in the field of machine learningtowards development of interpretable predictive models, in order to allow for understanding the effect of each modelcomponent on the prediction outcome. First, we note that all nine subnetworks result in significantly higher accuracy than the random approach (Fig. 1); alladjusted p -values are below . for all of AUPR, precision, recall, and F-score.Second, we test our key hypothesis: whether the best dynamic subnetwork predicts aging-related genes more accuratelythan all static subnetworks (when using for each subnetwork its best predictive model; Section 3.1). Indeed, wefind that the hypothesis holds. That is, the best of the dynamic subnetworks (NetWalk*-Dynamic) outperforms allstatic subnetworks in terms of all of AUPR, precision, recall, and F-score (Fig. 1). Its superiority is not statisticallysignificant, which is at least partly due to the prediction accuracy of all subnetworks being relatively high (i.e., thelowest AUPR over all nine subnetworks is 67.9%, the lowest precision is 68.8%, the lowest recall is 53.5%, and thelowest F-score is 61.8%). Also, this could partly be due to the fact that we compute the significance of the differencebetween subnetworks (i.e., their predictive models) via the Wilcoxon test applied to the models’ five accuracy scores(corresponding to the five folds of cross-validation). However, it is well known that comparing such small samples via8he Wilcoxon test tends to decrease (though not invalidate) the power of the test. Importantly, the best predictive modelyields extremely high accuracy when predicting aging-related genes. Namely, for NetWalk*-Dynamic, AUPR is 95.6%,precision is 93.8%, recall is 89.2%, and F-score is 91.4%.Figure 1: Prediction accuracy in terms of AUPR, precision, recall, and F-score for the nine aging-specific subnetworksinferred from HPRD, each under its best predictive model, when using the primary GenAge-based definition of aging-and non-aging-related gene labels.Because all subnetworks perform well, meaning that they predict a majority (over 53.5%) of the genes that are labeledas aging-related (i.e., of the known aging-related genes), we expect the pairwise overlaps of their true positives to bereasonably large. Indeed, we find this to be the case. Given all pairs of the nine true positive sets (corresponding to thenine subnetworks), overlaps are statistically significantly high for over two thirds of the pairs. The largest overlap overall pairs is 80.3%, the smallest one is 47.8%, and the average one is 65.4% (Supplementary Fig. S1).Similarly, we examine the pairwise overlaps of the subnetworks’ predicted genes that are labeled as non-aging-related,i.e., of their novel aging-related gene predictions. We find that none of the pairwise overlaps between the ninesubnetworks are statistically significant. The largest overlap over all pairs is 30.3%, the smallest one is 0%, and theaverage one is 14.0% (Supplementary Fig. S2). Intuitively, the higher the precision of a subnetwork, by definition thefewer novel predictions it makes. In fact, the best-performing subnetwork, NetWalk*-Dynamic, given its extremelyhigh precision score, makes only five novel predictions. The novel aging-related genes predicted from this subnetworkmay be good candidates for future experimental validation (the five genes are: CSK, RHOG, SH2B2, ELK1, andCAPN3). This subnetwork’s well-performing model can be used to predict even more of novel aging-related genes, e.g.,by applying it to the genes from the dynamic subnetwork that were not in the training or testing data, and then selectingthe genes with the highest predicted probabilities of being aging-related as the most confident novel predictions forfuture wet lab validation. Recall the following from Section 1.1. The key motivation behind using NP in this study is that Induced-Dynamic didnot improve upon Induced-Compressed (which is the same as Induced-Static) in our past work [17, 18]. So, we wantto use a better dynamic aging-specific subnetwork. NP is expected to produce such a subnetwork, given that unlikeInduced, it can remove low-weight PPIs between active genes (because such PPIs are hypothesized to be less importantfor the condition of interest, in our case, the aging process) or add to its subnetwork non-active genes that criticallyconnect the active genes in the entire PPI network (along with their PPIs). Note that by removing PPIs between activegenes, if all of the PPIs that a gene is incident to are removed, such a gene would become isolated and would thusremoved from the subnetwork. So, NP might end up losing some of the active genes compared to Induced.Indeed, as shown in Fig. 1, NP outperforms Induced. That is, NetWalk*-Dynamic, the best of all NP subnetworks, isbetter than all of the Induced subnetworks. 9ot only is NP better than Induced, but also, recall that in our past work [17, 18], we did not observe Induced-Dynamicto be better than Induced-Compressed/Induced-Static. We confirm this in the current study as well. So, while forInduced, using a dynamic subnetwork is not superior to using a static subnetwork, doing so is superior for NP, becauseNetWalk*-Dynamic is superior to both NetWalk*-Compressed and NetWalk*-Static (Fig. 1). This further stresses outthe power of NP.Figure 2: Overlaps of all genes (black numbers) and genes labeled as aging-related (red numbers) between Induced,NetWalk, and NetWalk* subnetworks that are inferred from HPRD. To be precise, for each circle (e.g., NetWalk),the pink number (e.g., 1,555 for NetWalk) corresponds to the size of the node intersection of all three correspondingHPRD-based subnetworks (e.g., NetWalk-Dynamic, NetWalk-Compressed, and NetWalk-Static for NetWalk). So, theoverlaps are between (i) the intersection of the three Induced subnetworks, (ii) the intersection of the three NetWalksubnetworks, and (iii) the intersection of the three NetWalk* subnetworks. Labeling only the 84 genes from the verycenter that are in all subnetworks and in GenAge as aging-related (and labeling the 315 genes that are in all subnetworksbut not in any of the six ground truth data sets as non-aging-related) for classification allows for a fair comparison ofprediction accuracy over all subnetworks.We want to comment on additional observations regarding (dis)similarity between Induced and NP. First, regardingInduced vs. NetWalk: All NetWalk nodes are in Induced (Fig. 2). This means the NetWalk does not add non-activenodes to its subnetworks, i.e., NetWalk “just” removes low-weight edges from Induced (and consequently some activenodes too, which become isolated because of the edge removal). That is, both Induced and NetWalk contain only activenodes. NetWalk results in similar performance as Induced (Fig. 1).Second, regarding Induced vs. NetWalk*: The latter has nodes that are not in Induced (Fig. 2). This means thatNetWalk* does add non-active genes to its subnetwork, unlike Induced (and NetWalk). Also, Induced has nodes thatare not in NetWalk*. As all genes in Induced have to be active, this means that NetWalk* is removing some low-weightedges between active genes (like NetWalk), because only removal of such edges can cause an active gene to be removedfrom NetWalk* by becoming isolated.Combing these two observations, it seems that it is both of the topological changes (removal of low-weight edgesbetween active genes and addition of non-active genes) of NetWalk* compared to Induced that result in NetWalk*outperforming Induced (Fig. 1). All results reported in Sections 3.1-3.3 are for the nine HPRD-based subnetworks, and when using the GenAge-basedgene labels. Here, we analyze the nine HINT+HI2-based subnetworks, also when using the GenAge-based labels.Note that for HINT+HI2, for NetWalk*-Dynamic, due to its large size (Table 1), we could not run one of the dynamicfeatures (DGDV). All other analyses remain the same as in Sections 3.1-3.3.We again select the best predictive model for each of the nine HINT+HI2-based subnetworks (Supplementary TableS1). We find that the key result for HINT+HI2 is qualitatively the same as the result for HPRD: the best dynamicsubnetwork is better than all static subnetworks (Supplementary Figs. S3-S6). One difference is that now NP does notnecessarily improve upon Induced, because both Induced-Dynamic and NetWalk*-Dynamic are the best of all networks,and the two are almost tied. Another difference is that the prediction accuracy of all subnetworks is somewhat lower forHINT+HI2 than for HPRD.So, some although not all of our results are robust to the choice of entire context-unspecific network used to infer anaging-specific subnetwork. 10 .5 Results when using GTEx-DAG to label genes
All results reported in Sections 3.1-3.4 are for when using the GenAge-based gene labels. Here, we report results whenusing the GTEx-DAG-based labels, for both the HPRD-based and HINT+HI2-based subnetworks. The subnetworks’best predictive models when using GTEx-DAG are shown in Supplementary Tables S2 and S3, respectively.First, we comment on results for the nine HPRD-based subnetworks. We already showed that for the three Inducedsubnetworks, prediction accuracy is not nearly as good under the GTEx-DAG-based gene labels as it is under theGenAge-based gene labels [18]. Here, we confirm that the same holds for the six NP (NetWalk and NetWalk*)subnetworks. That is, although in this study all nine subnetworks are statistically significantly more accurate than therandom approach (adjusted p -values below 0.04) in terms of precision, recall, and F-score, although not in terms ofAUPR, their prediction accuracy is only in the ∼ p -values below 0.04) interms of precision, recall, and F-score, although not in terms of AUPR. However, unlike in all of the above analyses,now it is a static subnetwork, NetWalk-Static, that is (marginally) superior to all other subnetworks, including thedynamic ones. This subnetwork is NP-based, which again justifies a need for NP.Because for both HPRD and HINT+HI2, using GTEx-DAG yields much lower accuracy than using GenAge, we do notanalyze any further the true positives and novel predictions resulting from using GTEx-DAG.The inferior performance of GTEx-DAG compared to GenAge is surprising for two reasons. First, genes in GTEx-DAGhave been linked to aging directly in human, while those in GenAge have been linked to aging indirectly based on theirsequence similarities to aging-related genes in model species. So, GTEx-DAG should better capture aspects of the agingprocess that are unique to human. Given this, and given that we have aimed to extract human aging-related knowledgefrom human PPI subnetworks, one would expect GTEx-DAG to perform better. However, it does not. This could bebecause GenAge may be more accurate than GTEx-DAG. Second, the considered subnetworks have been formed byintegrating gene expression data with PPI data. Given that GTEx-DAG is also gene expression-based, while GenAge issequence-based, one would expect GTEx-DAG to perform better. However, it does not. This could be because GTExand GenAge are highly complementary (only 14 genes are labeled as aging-related in both), and because our predictivemodels may be better suited for the aging-related knowledge present in GenAge.
Unlike traditional approaches that aim to extract aging-related knowledge from static PPI data, we develop a supervisedlearning model that when applied to a dynamic aging-specific PPI subnetwork yields extremely high predictionperformance, significantly outperforming all models on all static subnetworks. Hence, our model could guide with highconfidence the discovery of novel aging-related predictions for future web lab validation.We have used NP in hope of improving upon Induced that has been used traditionally. Indeed, this is what we observe.Our findings are partly robust to the choice of entire context-unspecific network that is used for inference of an aging-specific subnetwork, but not to the choice of ground truth aging-related data used to label genes for cross-validation.Namely, our predictive models works better for sequence-based aging-related data from model species than for geneexpression-based aging-related data from human. Development of novel predictive models that will be capable ofcapturing well human-unique aspects of the aging process that do not exist in model species is important.We have studied human aging from PPI data. Nonetheless, our work can be applied to other types of biological networks,other species, or other dynamic biological processes, such as disease progression.
We thank Khalique Newaz for discussion on how to infer the NP-based static subnetworks, and for sharing the dynamicsubnetworks from his recent unsupervised learning study [22].11
Funding
This work is supported by the U.S. National Science Foundation (NSF) CAREER award (CCF-1452795).
References [1] Judith Campisi. Aging, cellular senescence, and cancer.
Annual Review of Physiology , 75:685–705, 2013.[2] Davide Bolignano, Francesco Mattace-Raso, Eric JG Sijbrands, and Carmine Zoccali. The aging kidney revisited:a systematic review.
Ageing Research Reviews , 14:65–80, 2014.[3] Fabio Fabris, João Pedro De Magalhães, and Alex A Freitas. A review of supervised machine learning applied toageing research.
Biogerontology , 18(2):171–188, 2017.[4] Fazle E Faisal and Tijana Milenkovi´c. Dynamic networks reveal key players in aging.
Bioinformatics , 30(12):1721–1729, 2014.[5] Fazle Elahi Faisal, Han Zhao, and Tijana Milenkovi´c. Global network alignment in the context of aging.
IEEE/ACMTransactions on Computational Biology and Bioinformatics , 12(1):40–52, 2014.[6] Alex A Freitas, Olga Vasieva, and João Pedro de Magalhães. A data mining approach for classifying DNA repairgenes into ageing-related or non-ageing-related.
BMC Genomics , 12(1):27, 2011.[7] Csaba Kerepesi, Bálint Daróczy, Ádám Sturm, Tibor Vellai, and András Benczúr. Prediction and characterizationof human ageing-related proteins by using machine learning.
Scientific Reports , 8(1):4094, 2018.[8] Kaiwen Jia, Chunmei Cui, Yuanxu Gao, Yuan Zhou, and Qinghua Cui. An analysis of aging-related genes derivedfrom the genotype-tissue expression project (GTEx).
Cell Death Discovery , 5(1):26, 2018.[9] Tao Lu, Ying Pan, Shyan-Yuan Kao, Cheng Li, Isaac Kohane, Jennifer Chan, and Bruce A Yankner. Generegulation and DNA damage in the ageing human brain.
Nature , 429(6994):883, 2004.[10] Nicole C Berchtold, David H Cribbs, Paul D Coleman, Joseph Rogers, Elizabeth Head, Ronald Kim, Tom Beach,Carol Miller, Juan Troncoso, John Q Trojanowski, et al. Gene expression changes in the course of normal brainaging are sexually dimorphic.
Proceedings of the National Academy of Sciences , 105(40):15605–15610, 2008.[11] Inge R Holtman, Divya D Raj, Jeremy A Miller, Wandert Schaafsma, Zhuoran Yin, Nieske Brouwer, Paul DWes, Thomas Möller, Marie Orre, Willem Kamphuis, et al. Induction of a common microglia gene expressionsignature by aging and neurodegenerative conditions: a co-expression meta-analysis.
Acta NeuropathologicaCommunications , 3(1):31, 2015.[12] Julie E Simpson, Paul G Ince, Pamela J Shaw, Paul R Heath, Rohini Raman, Claire J Garwood, CatherineGelsthorpe, Lynne Baxter, Gillian Forster, Fiona E Matthews, et al. Microarray analysis of the astrocytetranscriptome in the aging brain: relationship to Alzheimer’s pathology and APOE genotype.
Neurobiology ofAging , 32(10):1795–1807, 2011.[13] Thomas BL Kirkwood. Understanding the odd science of aging.
Cell , 120(4):437–447, 2005.[14] Yaping Fang, Xinkun Wang, Elias K Michaelis, and Jianwen Fang. Classifying aging genes into DNA repair ornon-DNA repair-related categories. In
International Conference on Intelligent Computing , pages 20–29. Springer,2013.[15] Fabio Fabris, Alex A Freitas, and Jennifer Tullet. An extensive empirical comparison of probabilistic hierar-chical classifiers in datasets of ageing-related genes.
IEEE/ACM Transactions on Computational Biology andBioinformatics , 13(6):1045–1058, 2016.[16] Rasha Elhesha, AAisharjya Sarkar, Christina Boucher, , and Tamer Kahveci. Identification of co-evolving temporalnetworks.
BMC Genomics , 20(434), 2019.[17] Qi Li and Tijana Milenkovi´c. Supervised prediction of aging-related genes from a context-specific proteininteraction subnetwork.
IEEE International Conference on Bioinformatics and Biomedicine (BIBM) , pp:130–137,2019.[18] Qi Li and Tijana Milenkovi´c. Supervised prediction of aging-related genes from a context-specific proteininteraction subnetwork. arXiv preprint arXiv:1908.08135 , 2020. Note: This paper is under review, and it is anextended version of the IEEE BIBM 2019 paper with the same name.[19] Lenore Cowen, Trey Ideker, Benjamin J Raphael, and Roded Sharan. Network propagation: a universal amplifierof genetic associations.
Nature Reviews Genetics , 18(9):551, 2017.1220] Kakajan Komurov, Michael A. White, and Prahlad T. Ram. Use of data-biased random walks on graphs for theretrieval of context-specific networks from genomic data.
PLOS Computational Biology , 6(8):1–10, 2010.[21] Mark D M Leiserson et al. Pan-cancer network analysis identifies combinations of rare somatic mutations acrosspathways and protein complexes.
Nature Genetics , 47:106–114, 2015.[22] Khalique Newaz and Tijana Milenkovi´c. Improving inference of the dynamic biological network underlying agingvia network propagation. arXiv preprint arXiv:1807.05637 , 2020.[23] Robi Tacutu, Daniel Thornton, Emily Johnson, Arie Budovsky, Diogo Barardo, Thomas Craig, Eugene Diana,Gilad Lehmann, Dmitri Toren, Jingwei Wang, et al. Human Ageing Genomic Resources: new and updateddatabases.
Nucleic Acids Research , 46(D1):D1083–D1090, 2017.[24] R. Mazza, F. Strozzi, A. Caprera, P. Ajmone-Marsan, and J.L. Williams. The other side of comparative genomics:genes with no orthologs between the cow and other mammalian species.
BMC Genomics , 10(1):604, 2009.[25] K. Khalturin, G. Hemmrich, S. Fraune, R. Augustin, and T.C.G. Bosch. More than just orphans: are taxonomically-restricted genes important in evolution?
Trends in Genetics , 25(9):404–413, 2009.[26] Antoine Danchin. Bacteria in the ageing gut: did the taming of fire promote a long human lifespan?
EnvironmentalMicrobiology , 20(6):1966–1987, 2018.[27] TS Prasad, Renu Goel, Kumaran Kandasamy, Shivakumar Keerthikumar, Sameer Kumar, Suresh Mathivanan,Deepthi Telikicherla, Rajesh Raju, Beema Shafreen, Abhilash Venugopal, et al. Human Protein ReferenceDatabase—2009 update.
Nucleic Acids Research , 37(suppl_1):D767–D772, 2009.[28] Jishnu Das and Haiyuan Yu. HINT: High-quality protein interactomes and their applications in understandinghuman disease.
BMC Systems Biology , 6(1):92, 2012.[29] Thomas Rolland, Murat Ta¸san, Benoit Charloteaux, Samuel J Pevzner, Quan Zhong, Nidhi Sahni, Song Yi, IrmaLemmens, Celia Fontanillo, Roberto Mosca, et al. A proteome-scale map of the human interactome network.
Cell ,159(5):1212–1226, 2014.[30] Yuriy Hulovatyy, Huili Chen, and Tijana Milenkovi´c. Exploring the structure and function of temporal networkswith dynamic graphlets.
Bioinformatics , 31(12):i171–i180, 2015.[31] David Aparício, Pedro Ribeiro, and Fernando Silva. Graphlet-orbit transitions (GoT): A fingerprint for temporalnetwork comparison.
PLOS ONE , 13(10):e0205497, 2018.[32] Tijana Milenkovi´c, Vesna Memiševi´c, Anthony Bonato, and Nataša Pržulj. Dominating biological networks.
PLOS ONE , 6(8):e23016, 2011.[33] Tijana Milenkovi´c and Nataša Pržulj. Uncovering biological network function via graphlet degree signatures.
Cancer Informatics , 6:CIN–S680, 2008.[34] Shawn Gu, John Johnson, Fazle E Faisal, and Tijana Milenkovi´c. From homogeneous to heterogeneous networkalignment via colored graphlets.
Scientific Reports , 8(1):12524, 2018.[35] Jyothi Subramanian and Richard Simon. Overfitting in prediction models–is it a problem only in high dimensions?
Contemporary Clinical Trials , 36(2):636–641, 2013.[36] Roman Hornung, Christoph Bernau, Caroline Truntzer, Rory Wilson, Thomas Stadler, and Anne-Laure Boulesteix.A measure of the impact of CV incompleteness on prediction error estimation with application to PCA andnormalization.
BMC Medical Research Methodology , 15(1):95, 2015.[37] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss,V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn:Machine learning in Python.