Elizabeth Tapia
National Scientific and Technical Research Council
Publication
Featured research published by Elizabeth Tapia.
The Plant Genome | 2012
Leonardo Ornella; Sukhwinder Singh; Paulino Pérez; Juan Burgueño; Ravi P. Singh; Elizabeth Tapia; Sridhar Bhavani; Susanne Dreisigacker; Hans-Joachim Braun; Ky L. Mathews; José Crossa
Durable resistance to the rust diseases of wheat (Triticum aestivum L.) can be achieved by developing lines that have race‐nonspecific adult plant resistance conferred by multiple minor slow‐rusting genes. Genomic selection (GS) is a promising tool for accumulating favorable alleles of slow‐rusting genes. In this study, five CIMMYT wheat populations evaluated for resistance were used to predict resistance to stem rust (Puccinia graminis) and yellow rust (Puccinia striiformis) using Bayesian least absolute shrinkage and selection operator (LASSO) (BL), ridge regression (RR), and support vector regression with linear or radial basis function kernel models. All parents and populations were genotyped using 1400 Diversity Arrays Technology markers and different prediction problems were assessed. Results show that prediction ability for yellow rust was lower than for stem rust, probably due to differences in the conditions of infection of both diseases. For within population and environment, the correlation between predicted and observed values (Pearson’s correlation [ρ]) was greater than 0.50 in 90% of the evaluations whereas for yellow rust, ρ ranged from 0.0637 to 0.6253. The BL and RR models have similar prediction ability, with a slight superiority of the BL confirming reports about the additive nature of rust resistance. When making predictions between environments and/or between populations, including information from another environment or environments or another population or populations improved prediction.
Heredity | 2014
Leonardo Ornella; Paulino Pérez; Elizabeth Tapia; Juan Manuel González-Camacho; Juan Burgueño; Xuecai Zhang; Sukhwinder Singh; Felix San Vicente; David Bonnett; Susanne Dreisigacker; Ravi P. Singh; N Long; José Crossa
Pearson’s correlation coefficient (ρ) is the most commonly reported metric of the success of prediction in genomic selection (GS). However, in real breeding ρ may not be very useful for assessing the quality of the regression in the tails of the distribution, where individuals are chosen for selection. This research used 14 maize and 16 wheat data sets with different trait–environment combinations. Six different models were evaluated by means of a cross-validation scheme (50 random partitions each, with 90% of the individuals in the training set and 10% in the testing set). The predictive accuracy of these algorithms for selecting individuals belonging to the best α=10, 15, 20, 25, 30, 35, 40% of the distribution was estimated using Cohen’s kappa coefficient (κ) and an ad hoc measure, which we call relative efficiency (RE), which indicates the expected genetic gain due to selection when individuals are selected based on GS exclusively. We put special emphasis on the analysis for α=15%, because it is a percentile commonly used in plant breeding programmes (for example, at CIMMYT). We also used ρ as a criterion for overall success. The algorithms used were: Bayesian LASSO (BL), Ridge Regression (RR), Reproducing Kernel Hilbert Spaces (RHKS), Random Forest Regression (RFR), and Support Vector Regression (SVR) with linear (lin) and Gaussian kernels (rbf). The performance of regression methods for selecting the best individuals was compared with that of three supervised classification algorithms: Random Forest Classification (RFC) and Support Vector Classification (SVC) with linear (lin) and Gaussian (rbf) kernels. Classification methods were evaluated using the same cross-validation scheme but with the response vector of the original training sets dichotomised using a given threshold. 
For α=15%, SVC-lin presented the highest κ coefficients in 13 of the 14 maize data sets, with best values ranging from 0.131 to 0.722 (statistically significant in 9 data sets) and the best RE in the same 13 data sets, with values ranging from 0.393 to 0.948 (statistically significant in 12 data sets). RR produced the best mean for both κ and RE in one data set (0.148 and 0.381, respectively). Regarding the wheat data sets, SVC-lin presented the best κ in 12 of the 16 data sets, with outcomes ranging from 0.280 to 0.580 (statistically significant in 4 data sets) and the best RE in 9 data sets ranging from 0.484 to 0.821 (statistically significant in 5 data sets). SVC-rbf (0.235), RR (0.265) and RHKS (0.422) gave the best κ in one data set each, while RHKS and BL tied for the last one (0.234). Finally, BL presented the best RE in two data sets (0.738 and 0.750), RFR (0.636) and SVC-rbf (0.617) in one and RHKS in the remaining three (0.502, 0.458 and 0.586). The difference between the performance of SVC-lin and that of the rest of the models was not so pronounced at higher percentiles of the distribution. The behaviour of regression and classification algorithms varied markedly when selection was done at different thresholds, that is, κ and RE for each algorithm depended strongly on the selection percentile. Based on the results, we propose classification methods as a promising alternative for GS in plant breeding.
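The κ-based evaluation described above can be illustrated with a minimal sketch. It assumes that "best" means the largest trait values and that both observed and predicted values are dichotomised by marking their top α fraction; the function names and toy numbers are hypothetical, and the relative efficiency (RE) measure is not reproduced because its formula is not given here.

```python
def top_alpha_labels(values, alpha):
    """Label the best alpha fraction (largest values) as 1 and the rest as 0.

    Assumption: larger values are better and at least one individual is selected.
    """
    n_selected = max(1, round(alpha * len(values)))
    ranked = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    chosen = set(ranked[:n_selected])
    return [1 if i in chosen else 0 for i in range(len(values))]


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two binary label vectors of equal length."""
    n = len(labels_a)
    observed_agreement = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    p_a = sum(labels_a) / n
    p_b = sum(labels_b) / n
    # Agreement expected by chance from the marginal selection rates.
    chance_agreement = p_a * p_b + (1 - p_a) * (1 - p_b)
    if chance_agreement == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed_agreement - chance_agreement) / (1 - chance_agreement)


# Toy data (hypothetical): observed phenotypes and model predictions.
observed = [5.1, 3.2, 4.8, 2.0, 6.3, 3.9, 4.1, 2.7, 5.6, 3.0]
predicted = [4.2, 3.5, 4.0, 2.2, 5.8, 4.5, 3.8, 2.9, 5.0, 3.1]

truly_best = top_alpha_labels(observed, alpha=0.3)
picked = top_alpha_labels(predicted, alpha=0.3)
print(cohens_kappa(truly_best, picked))
```

Unlike ρ, which summarises agreement over the whole distribution, this score only rewards a model for placing the right individuals in the selected tail.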
Pattern Recognition Letters | 2012
Elizabeth Tapia; Pilar Bulacio; Laura Angelone
A method is described for performing sparse and stable gene selection from a number of unstable, but low cost, SVM-RFE units referred to as SVM-RFE subunits. Using a comprehensive simulation study, we show that the introduction of a consensus constraint with respect to variations in the policy of gene removal and a stability constraint with respect to perturbations in the training data can remarkably improve gene selection precision, dimensionality reduction ratio and stability of low cost SVM-RFE subunits still guaranteeing affordable computational costs. The method, which does not require the preselection of the number of selected genes, is divided into two stages. Multiple rough gene removal policies are first applied to multiple surrogate training datasets (spreading). Multiple consensus gene sets with respect to variations in the gene removal policy are then obtained and passed through a stability filter which selects the best performing gene set (despreading). Hence, while the consensus constraint performs strong dimensionality reduction at affordable computational costs, the stability constraint ensures acceptable indexes of gene selection stability and further dimensionality reduction. The method is validated on three benchmark microarray datasets.
Brazilian Symposium on Bioinformatics | 2007
Ariel E. Bayá; Mónica G. Larese; Pablo M. Granitto; Juan Carlos Gómez; Elizabeth Tapia
Gene Set Enrichment Analysis (GSEA) is a well-known technique used for studying groups of functionally related genes and their correlation with phenotype. This method creates a ranked list of genes, which is used to calculate an enrichment score. In this work, we introduce two different metrics for gene ranking in GSEA, namely the Wilcoxon and the Baumgartner-Weiss-Schindler tests. The advantage of these metrics is that they do not assume any particular distribution on the data. We compared them with the signal-to-noise ratio metric originally proposed by the developers of GSEA on a type 2 diabetes mellitus (DM2) database. Statistical significance is evaluated by means of false discovery rate and p-value calculations. Results show that the Baumgartner-Weiss-Schindler test detects more pathways with statistical significance. One of them could be related to DM2, according to the literature, but further research is needed.
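To make the ranking step concrete, the sketch below scores each gene with a plain signal-to-noise ratio and with a Wilcoxon rank-sum z-score, then sorts genes by either score. It is an illustration under assumptions: the signal-to-noise formula omits any variance adjustments a GSEA implementation may apply, the z-score uses the normal approximation without a tie correction, the Baumgartner-Weiss-Schindler statistic is not implemented, and all names and values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def signal_to_noise(group_a, group_b):
    """Difference of class means divided by the sum of class standard deviations."""
    return (mean(group_a) - mean(group_b)) / (stdev(group_a) + stdev(group_b))


def average_ranks(values):
    """1-based ranks in which tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def wilcoxon_z(group_a, group_b):
    """Rank-sum statistic of group_a, standardised by the normal approximation."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum = sum(ranks[:n_a])
    expected = n_a * (n_a + n_b + 1) / 2
    spread = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    return (rank_sum - expected) / spread


# Toy expression values (hypothetical): gene -> (phenotype A, phenotype B).
expression = {
    "G1": ([5.0, 6.0, 7.0], [1.0, 2.0, 3.0]),
    "G2": ([2.0, 3.0, 4.0], [2.5, 3.5, 4.5]),
}

for metric in (signal_to_noise, wilcoxon_z):
    ranked = sorted(expression, key=lambda g: metric(*expression[g]), reverse=True)
    print(metric.__name__, ranked)
```

The rank-based score depends only on the ordering of the pooled values, which is why it makes no assumption about the underlying distribution.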
Information Fusion | 2013
Javier Murillo; Serge Guillaume; Elizabeth Tapia; Pilar Bulacio
An important limitation of fuzzy integrals for information fusion is the exponential growth of coefficients for an increasing number of information sources. To overcome this problem, a variety of fuzzy measure identification algorithms have been proposed. HLMS is a simple gradient-based algorithm for fuzzy measure identification which suffers from some convergence problems. In this paper, two proposals for HLMS convergence improvement are presented: a modified formula for coefficient updates and a new policy for monotonicity checks. Comprehensive experimental work shows that these proposals indeed contribute to HLMS convergence, accuracy and robustness.
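As a rough illustration of why the number of coefficients grows exponentially, the sketch below evaluates a discrete Choquet integral, a common fuzzy integral, against a fuzzy measure stored with one coefficient per subset of sources (2^n entries for n sources). Choosing the Choquet integral is an assumption, the measure values are hypothetical, and neither HLMS nor the update and monotonicity policies proposed in the paper are implemented here.

```python
def choquet_integral(scores, measure):
    """Discrete Choquet integral of source scores with respect to a fuzzy measure.

    scores:  dict mapping source name -> value in [0, 1]
    measure: dict mapping frozenset of source names -> coefficient
    """
    ordered = sorted(scores, key=scores.get)  # sources by ascending score
    total = 0.0
    previous = 0.0
    for position, source in enumerate(ordered):
        # Coalition of all sources whose score is at least the current one.
        coalition = frozenset(ordered[position:])
        total += (scores[source] - previous) * measure[coalition]
        previous = scores[source]
    return total


def is_monotone(measure, sources):
    """Check that adding a source to any coalition never lowers its coefficient."""
    for subset, value in measure.items():
        for extra in sources:
            if extra not in subset and measure[subset | {extra}] < value:
                return False
    return True


# Toy measure on two sources (hypothetical values): 2**2 = 4 coefficients.
sources = ["a", "b"]
measure = {
    frozenset(): 0.0,
    frozenset({"a"}): 0.3,
    frozenset({"b"}): 0.5,
    frozenset({"a", "b"}): 1.0,
}

print(len(measure))                                       # number of coefficients
print(is_monotone(measure, sources))
print(choquet_integral({"a": 0.6, "b": 0.2}, measure))
```

With ten sources the same dictionary would already need 1024 entries, which is the identification burden that algorithms such as HLMS try to manage.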
Pattern Recognition Letters | 2016
Flavio E. Spetale; Pilar Bulacio; Serge Guillaume; Javier Murillo; Elizabeth Tapia
Unsupervised feature selection towards effective SVM-RFE on IR data is considered. Unsupervised feature selection is guided by spectral envelope functions of IR data. Spectral windows are induced from peaks of the spectral envelope functions. SVM-RFE is applied to individual spectral windows. Promising results are observed across three different NIR/MIR application domains. Infrared spectroscopy data is characterized by the presence of a huge number of variables. Applications of infrared spectroscopy in the mid-infrared (MIR) and near-infrared (NIR) bands are of widespread use in many fields. To effectively handle this type of data, suitable dimensionality reduction methods are required. In this paper, a dimensionality reduction method designed to enable effective Support Vector Machine Recursive Feature Elimination (SVM-RFE) on NIR/MIR datasets is presented. The method exploits the information content at peaks of the spectral envelope functions which characterize NIR/MIR spectra datasets. Experimental evaluation across different NIR/MIR application domains shows that the proposed method is useful for the induction of compact and accurate SVM classifiers for qualitative NIR/MIR applications involving stringent interpretability or time processing requirements.
G3: Genes, Genomes, Genetics | 2016
Flavia J Krsticevic; Débora P Arce; Joaquín Ezpeleta; Elizabeth Tapia
In plants, fruit maturation and oxidative stress can induce small heat shock protein (sHSP) synthesis to maintain cellular homeostasis. Although the tomato reference genome was published in 2012, the actual number and functionality of sHSP genes remain unknown. Using a transcriptomic (RNA-seq) and evolutionary genomic approach, putative sHSP genes in the Solanum lycopersicum (cv. Heinz 1706) genome were investigated. A sHSP gene family of 33 members was established. Remarkably, roughly half of the members of this family can be explained by nine independent tandem duplication events that determined, evolutionarily, their functional fates. Within a mitochondrial class subfamily, only one duplicated member, Solyc08g078700, retained its ancestral chaperone function, while the others, Solyc08g078710 and Solyc08g078720, likely degenerated under neutrality and lack ancestral chaperone function. Functional conservation occurred within a cytosolic class I subfamily, whose four members, Solyc06g076570, Solyc06g076560, Solyc06g076540, and Solyc06g076520, support ∼57% of the total sHSP mRNA in the red ripe fruit. Subfunctionalization occurred within a new subfamily, whose two members, Solyc04g082720 and Solyc04g082740, show heterogeneous differential expression profiles during fruit ripening. These findings, involving the birth/death of some genes or the preferential/plastic expression of some others during fruit ripening, highlight the importance of tandem duplication events in the expansion of the sHSP gene family in the tomato genome. Despite its evolutionary diversity, the sHSP gene family in the tomato genome seems to be endowed with a core set of four homeostasis genes: Solyc05g014280, Solyc03g082420, Solyc11g020330, and Solyc06g076560, which appear to provide a baseline protection during both fruit ripening and heat shock stress in different tomato tissues.
Archive | 2015
D. P. Arce; Flavia Krsticevic; M. R. Bertolaccini; Joaquín Ezpeleta; S. D. Ponce; Elizabeth Tapia
Small Heat Shock Proteins (sHSPs) are low molecular weight chaperones that play an important role during stress response and development in all living organisms. Fruit maturation and oxidative stress can induce sHSP synthesis both in Arabidopsis and tomato plants. RNA-Seq technology is becoming widely used in various transcriptomics studies; however, analyzing and interpreting RNA-Seq data poses serious challenges. In the present work, we de novo assembled the Solanum lycopersicum transcriptome for three different maturation stages (mature green, breaker and red ripe). Differential gene expression analysis was carried out during tomato fruit development. We identified differentially expressed sHSPs that might be involved in breaker and red ripe fruit maturation. Interestingly, these sHSPs have different subcellular localizations, suggesting a complex regulation of the fruit maturation network.
Fuzzy Sets and Systems | 2015
Javier Murillo; Serge Guillaume; Flavio E. Spetale; Elizabeth Tapia; Pilar Bulacio
In many real-world datasets both the individual and coordinated action of features may be relevant for class identification. In this paper, a computational strategy for relevant feature selection based on the characterization of redundant or complementary features is proposed. The characterization is achieved using fuzzy measures and an interaction index computed from fuzzy measure coefficients. Fuzzy measure identification requires raw data to be turned into confidence degrees. This key step is carried out considering the distributions of feature values across all the classes. Fuzzy measure coefficients are then estimated with an improved version of the Heuristic Least Mean Squares algorithm that includes an efficient management of untouched coefficients. Then, a generalization of the Shapley index for an arbitrary number of features is used. Simulation experiments on synthetic datasets are performed to study the behavior of this generalized interaction index. For extreme datasets, containing either redundant or complementary features as well as noise, the index value is defined by a mathematical formula. This result is used to motivate feature selection guidelines that take into account feature interactions. Experimental results on benchmark datasets show that the proposal allows for the design of compact, interpretable and competitive classification models.
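The idea of reading redundancy or complementarity from fuzzy measure coefficients can be sketched with the standard pairwise interaction index, which weights the joint contribution of two features over all coalitions of the remaining ones. This is only the two-feature case and is an assumption about the starting point: the generalization to an arbitrary number of features used in the paper is not reproduced, and the measures below are hypothetical extremes.

```python
from itertools import combinations
from math import factorial


def interaction_index(measure, features, i, j):
    """Pairwise interaction index of features i and j under a fuzzy measure.

    measure:  dict mapping frozenset of feature names -> coefficient,
              defined for every subset of `features` (including the empty set)
    Negative values suggest redundancy, positive values complementarity.
    """
    others = [f for f in features if f not in (i, j)]
    n = len(features)
    total = 0.0
    for size in range(len(others) + 1):
        # Shapley-style weight for coalitions of this size.
        weight = factorial(n - size - 2) * factorial(size) / factorial(n - 1)
        for subset in combinations(others, size):
            k = frozenset(subset)
            joint_gain = (
                measure[k | {i, j}] - measure[k | {i}] - measure[k | {j}] + measure[k]
            )
            total += weight * joint_gain
    return total


features = ["x", "y"]

# Redundant features: either one alone already gives full confidence.
redundant = {frozenset(): 0.0, frozenset({"x"}): 1.0,
             frozenset({"y"}): 1.0, frozenset({"x", "y"}): 1.0}

# Complementary features: only the pair together is informative.
complementary = {frozenset(): 0.0, frozenset({"x"}): 0.0,
                 frozenset({"y"}): 0.0, frozenset({"x", "y"}): 1.0}

print(interaction_index(redundant, features, "x", "y"))      # -1.0
print(interaction_index(complementary, features, "x", "y"))  # 1.0
```

An additive measure, where each feature contributes independently, yields an index of zero, so the sign of the index separates the two extreme cases mentioned in the abstract.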
OncoImmunology | 2017
Gabriela Vera-Lozada; Carolina Minnicelli; Priscilla Segges; Gustavo Stefanoff; Flavia Krsticevic; Joaquín Ezpeleta; Elizabeth Tapia; Gerald Niedobitek; Mário Henrique M. Barros; Rocio Hassan
Interleukin-10 (IL10) is an immune regulatory cytokine. Single nucleotide polymorphisms (SNPs) in IL10 promoter have been associated with prognosis in adult classical Hodgkin lymphoma (cHL). We analyzed IL10 SNPs −1082 and −592 in respect of therapy response, gene expression and tumor microenvironment (TME) composition in 98 pediatric patients with cHL. As confirmatory results, we found that −1082AA/AG; −592CC genotypes and ATA haplotype were associated with unfavourable prognosis: Progression-free survival (PFS) was shorter in −1082AA+AG (72.2%) than in GG patients (100%) (P = 0.024), and in −592AA (50%) and AC (74.2%) vs. CC patients (87.0%) (P = 0.009). In multivariate analysis, the −592CC genotype and the ATA haplotype retained prognostic impact (HR: 0.41, 95% CI 0.2–0.86; P = 0.018, and HR: 3.06, 95% CI 1.03–9.12; P = 0.044, respectively). Our analysis further led to some new observations, namely: (1) Low IL10 mRNA expression was associated with −1082GG genotype (P = 0.014); (2) IL10 promoter polymorphisms influence TME composition; −1082GG/−592CC carriers showed low numbers of infiltrating cells expressing MAF transcription factor (20 vs. 78 and 49 vs. 108 cells/mm², respectively; P < 0.05); while ATA haplotype (high expression) associated with high numbers of MAF+ cells (P = 0.005). Specifically, −1082GG patients exhibited low percentages of CD68+MAF+ (M2-like) intratumoral macrophages (15.04% vs. 47.26%, P = 0.017). Considering ours as an independent validation cohort, our results give support to the clinical importance of IL10 polymorphisms in the full spectrum of cHL, and advance the concept of genetic control of microenvironment composition as a basis for susceptibility and therapeutic response.