Publication


Featured research published by Alberto Ferrer.


Genome Research | 2011

Differential expression in RNA-seq: A matter of depth

Sonia Tarazona; Fernando Garcia-Alcalde; Joaquín Dopazo; Alberto Ferrer; Ana Conesa

Next-generation sequencing (NGS) technologies are revolutionizing genome research, and in particular, their application to transcriptomics (RNA-seq) is increasingly being used for gene expression profiling as a replacement for microarrays. However, the properties of RNA-seq data have not yet been fully established, and additional research is needed to understand how these data respond to differential expression analysis. In this work, we set out to gain insights into the characteristics of RNA-seq data analysis by studying an important parameter of this technology: the sequencing depth. We have analyzed how sequencing depth affects the detection of transcripts and their identification as differentially expressed, looking at aspects such as transcript biotype, length, expression level, and fold-change. We have evaluated different algorithms available for the analysis of RNA-seq and proposed a novel approach, NOISeq, that differs from existing methods in that it is data-adaptive and nonparametric. Our results reveal that most existing methodologies suffer from a strong dependency on sequencing depth for their differential expression calls and that this results in a considerable number of false positives that increases as the number of reads grows. In contrast, our proposed method models the noise distribution from the actual data, can therefore better adapt to the size of the data set, and is more effective in controlling the rate of false discoveries. This work discusses the true potential of RNA-seq for studying regulation at low expression ranges, the noise within RNA-seq data, and the issue of replication.
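
The idea of modelling the noise distribution from the actual data can be illustrated with a minimal sketch. It is loosely inspired by the description above and is not the NOISeq package implementation: the function names, the pseudo-count of 0.5, and the toy expression values are all assumptions made for illustration. Each gene gets a log2 fold change (M) and an absolute difference (D) between conditions, and these are compared against the (M, D) values observed between two replicates of the same condition, where any difference is treated as noise.

```python
import math

def md_values(a, b, pseudo=0.5):
    """Return (M, D): log2 fold change and absolute difference of two values.

    The pseudo-count (an assumed value) avoids division by zero."""
    a, b = a + pseudo, b + pseudo
    return math.log2(a / b), abs(a - b)

def noise_distribution(rep1, rep2):
    """(M, D) pairs between two replicates of the SAME condition."""
    return [md_values(x, y) for x, y in zip(rep1, rep2)]

def de_probability(m, d, noise):
    """Fraction of noise points whose |M| and D are both below the observed ones."""
    below = sum(1 for nm, nd in noise if abs(nm) < abs(m) and nd < d)
    return below / len(noise)

# Hypothetical expression values for five genes.
cond_a_rep1 = [10, 200, 35, 0, 80]
cond_a_rep2 = [12, 190, 30, 1, 85]
cond_b = [11, 60, 33, 0, 300]

noise = noise_distribution(cond_a_rep1, cond_a_rep2)
cond_a_mean = [(x + y) / 2 for x, y in zip(cond_a_rep1, cond_a_rep2)]
probs = [de_probability(*md_values(a, b), noise)
         for a, b in zip(cond_a_mean, cond_b)]

for gene, prob in enumerate(probs):
    print(f"gene {gene}: probability of differential expression = {prob:.2f}")
```

In this toy data set only the second and fifth genes change by more than the replicate-to-replicate noise, so only they receive a high score. Because the reference distribution is empirical, no parametric count model is assumed.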


Bioinformatics | 2006

maSigPro: a method to identify significantly differential expression profiles in time-course microarray experiments

Ana Conesa; María José Nueda; Alberto Ferrer; Manuel Talon

MOTIVATION: Multi-series time-course microarray experiments are useful approaches for exploring biological processes. In this type of experiment, the researcher is frequently interested in studying gene expression changes along time and in evaluating trend differences between the various experimental groups. The large amount of data, the multiplicity of experimental conditions and the dynamic nature of the experiments pose great challenges to data analysis. RESULTS: In this work, we propose a statistical procedure to identify genes that show different gene expression profiles across analytical groups in time-course experiments. The method is a two-step regression approach in which the experimental groups are identified by dummy variables. The procedure first fits a global regression model with all the defined variables to identify differentially expressed genes; second, a variable selection strategy is applied to study differences between groups and to find statistically significant different profiles. The methodology is illustrated on both a real and a simulated microarray dataset.


Nucleic Acids Research | 2015

Data quality aware analysis of differential expression in RNA-seq with NOISeq R/Bioc package

Sonia Tarazona; Pedro Furió-Tarí; David Turrà; Antonio Di Pietro; María José Nueda; Alberto Ferrer; Ana Conesa

As the use of RNA-seq has become widespread, there is increasing awareness of the importance of experimental design, bias removal, accurate quantification and control of false positives for proper data analysis. We introduce the NOISeq R-package for quality control and analysis of count data. We show how the available diagnostic tools can be used to monitor quality issues, make pre-processing decisions and improve analysis. We demonstrate that the non-parametric NOISeqBIO efficiently controls false discoveries in experiments with biological replication and outperforms state-of-the-art methods. NOISeq is a comprehensive resource that meets current needs for robust data-aware analysis of RNA-seq differential expression.


Quality Engineering | 2007

Multivariate Statistical Process Control Based on Principal Component Analysis (MSPC-PCA): Some Reflections and a Case Study in an Autobody Assembly Process

Alberto Ferrer

In modern manufacturing processes, massive amounts of multivariate data are routinely collected through automated in-process sensing. These data often exhibit high correlation, rank deficiency, low signal-to-noise ratio and missing values. Conventional univariate and multivariate statistical process control techniques are not suitable to be used in these environments. This article discusses these issues and advocates the use of multivariate statistical process control based on principal component analysis (MSPC-PCA) as an efficient statistical tool for process understanding, monitoring and diagnosing assignable causes for special events in these contexts. Data from an autobody assembly process are used to illustrate the practical benefits of using MSPC-PCA rather than conventional SPC in manufacturing processes.
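
As a rough illustration of PCA-based monitoring, the sketch below computes the two statistics commonly used in this setting, Hotelling's T² in the score space and the squared prediction error (SPE) in the residual space, for a one-component model. It is not taken from the article: the data are invented, the single component is extracted by plain power iteration, and control limits are omitted.

```python
import math

def center(rows):
    """Return column means and the mean-centered data."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    return means, [[r[j] - means[j] for j in range(m)] for r in rows]

def first_component(xc, iters=200):
    """Leading loading vector and its variance via power iteration."""
    n, m = len(xc), len(xc[0])
    cov = [[sum(r[i] * r[j] for r in xc) / (n - 1) for j in range(m)]
           for i in range(m)]
    p = [1.0] * m
    for _ in range(iters):
        q = [sum(cov[i][j] * p[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(v * v for v in q))
        p = [v / norm for v in q]
    lam = sum(p[i] * cov[i][j] * p[j] for i in range(m) for j in range(m))
    return p, lam

def monitor(x, means, p, lam):
    """Return (T2, SPE) for a new observation under a one-component model."""
    xc = [v - mu for v, mu in zip(x, means)]
    t = sum(a * b for a, b in zip(xc, p))        # score on the component
    resid = [a - t * b for a, b in zip(xc, p)]   # part the model cannot explain
    return t * t / lam, sum(e * e for e in resid)

# Hypothetical in-control data: three strongly correlated sensors.
reference = [
    [1.0, 2.1, 2.9], [2.0, 3.9, 6.1], [3.0, 6.0, 9.0],
    [4.0, 8.1, 11.9], [5.0, 9.9, 15.1], [6.0, 12.0, 18.0],
]
means, xc = center(reference)
p, lam = first_component(xc)

t2_ok, spe_ok = monitor([3.5, 7.1, 10.4], means, p, lam)
t2_fault, spe_fault = monitor([3.5, 7.0, 14.0], means, p, lam)
print(f"in-control: T2={t2_ok:.3f}, SPE={spe_ok:.3f}")
print(f"fault:      T2={t2_fault:.3f}, SPE={spe_fault:.3f}")
```

The second observation keeps a modest T² but breaks the correlation among the three sensors, so its SPE is far larger than that of the in-control observation. In practice both statistics would be compared against control limits estimated from reference data.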


Bioinformatics | 2007

Discovering gene expression patterns in time course microarray experiments by ANOVA–SCA

María José Nueda; Ana Conesa; Johan A. Westerhuis; Huub C. J. Hoefsloot; Age K. Smilde; Manuel Talon; Alberto Ferrer

MOTIVATION: Designed microarray experiments are used to investigate the effects that controlled experimental factors have on gene expression and learn about the transcriptional responses associated with external variables. In these datasets, signals of interest coexist with varying sources of unwanted noise in a framework of (co)relation among the measured variables and with the different levels of the studied factors. Discovering experimentally relevant transcriptional changes requires methodologies that take all these elements into account. RESULTS: In this work, we develop the application of analysis of variance-simultaneous component analysis (ANOVA-SCA; Smilde et al., Bioinformatics, 2005) to the analysis of multiple series time course microarray data as an example of multifactorial gene expression profiling experiments. We denote this implementation ASCA-genes. We show how the combination of ANOVA modeling and a dimension reduction technique is effective in extracting targeted signals from data while bypassing structural noise. The methodology is valuable for identifying main and secondary responses associated with the experimental factors and spotting relevant experimental conditions. We additionally propose a novel approach for gene selection in the context of the relation of individual transcriptional patterns to global gene expression signals. We demonstrate the methodology on both real and synthetic datasets. AVAILABILITY: ASCA-genes has been implemented in the statistical language R and is available at http://www.ivia.es/centrodegenomica/bioinformatics.htm. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Technometrics | 1999

Integration of statistical and engineering process control in a continuous polymerization process

Carmen Capilla; Alberto Ferrer; Rafael Romero; Angel Hualda

The purpose of this article is to present a case study of integrating statistical process control (SPC) and engineering process control (EPC) in an industrial polymerization process. We develop and compare the effectiveness of several regulation strategies to reduce polymer viscosity deviations from target. Controllers are derived using the constrained minimum variance criterion. Robustness properties and stability conditions of the controllers are discussed. The performance and the adequacy of the regulation schemes in a simulation of realistic assignable causes affecting the process are studied. In this context the benefits of implementing an integrated EPC/SPC system are compared with those of using an EPC system alone.


Metabolomics | 2012

Chemometric approaches to improve PLSDA model outcome for predicting human non-alcoholic fatty liver disease using UPLC-MS as a metabolic profiling tool

Guillermo Quintás; Nuria Portillo; Juan Carlos García-Cañaveras; José V. Castell; Alberto Ferrer; Agustín Lahoz

An MS-based metabolomics strategy including variable selection and PLSDA analysis has been assessed as a tool to discriminate between non-steatotic and steatotic human liver profiles. Different chemometric approaches for uninformative variable elimination were performed by using two of the most common software packages employed in the field of metabolomics (i.e., MATLAB and SIMCA-P). The first approach considered was performed with MATLAB, where the PLS regression vector coefficient values were used to classify variables as informative or not. The second approach was run under SIMCA-P, where variable selection was performed according to both the PLS regression vector coefficients and VIP scores. PLSDA model performance features, such as model validation, variable selection criteria, and potential biomarker output, were assessed for comparison purposes. One interesting finding is that variable selection improved the classification predictiveness of all the models by facilitating metabolite identification and providing enhanced insight into the metabolic information acquired by the UPLC-MS method. The results prove that the proposed strategy is a potentially straightforward approach to improve model performance. Among others, GSH, lysophospholipids and bile acids were found to be the most important altered metabolites in the metabolomic profiles studied. However, further research and more in-depth biochemical interpretations are needed to unambiguously propose them as disease biomarkers.


Pattern Recognition | 2008

Performance evaluation of soft color texture descriptors for surface grading using experimental design and logistic regression

Fernando López; José Miguel Valiente; José Manuel Prats; Alberto Ferrer

This paper presents a novel approach to the question of surface grading, the soft color texture descriptors method. This method is derived from an extensive evaluation process of several factors based on the use of two well-established statistical tools: experimental design and logistic regression. The utility of different combinations of factors is evaluated in regard to the problem of automatic classification of materials such as ceramic tiles that need to be grouped according to homogeneous visual appearance, that is, the surface grading application. The set of factors includes the number of neighbors in the k-NN classifier (several values of the k parameter), color space representation schemes (CIE Lab, CIE Luv, RGB, and grayscale), and color texture features (mean, standard deviation, 2nd-5th histogram moments). A factorial experimental design is performed testing all combinations of the above factors on a large image database of ceramic tiles. Accuracy estimates are computed using logistic regression to determine the best combinations of factors. From the point of view of machine learning, the overall process constitutes a wrapper approach able to select significant design choices (k parameter in the k-NN classifier and color space) and carry out a feature selection within the set of color texture features at the same time. Experiments were repeated with alternate color texture schemes from the literature: color histograms and centile-LBP. Comparisons of methods are presented describing both accuracy estimates and runtimes.
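
The building blocks named in the abstract, simple statistical features per color channel and a k-NN classifier, can be sketched as follows. This is an assumption-laden toy rather than the paper's method: it uses a single grayscale channel, only three features (mean, standard deviation, third central moment), unscaled Euclidean distance, and invented pixel values and grade labels.

```python
import math
from collections import Counter

def channel_features(pixels):
    """Mean, standard deviation and third central moment of one channel."""
    n = len(pixels)
    mean = sum(pixels) / n

    def central(k):
        return sum((v - mean) ** k for v in pixels) / n

    return [mean, math.sqrt(central(2)), central(3)]

def knn_predict(train, query, k=3):
    """Majority vote among the k training samples nearest to the query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical tiles: grade "A" is lighter, grade "B" is darker.
train = [
    (channel_features([200, 202, 198, 201]), "A"),
    (channel_features([199, 203, 197, 200]), "A"),
    (channel_features([100, 104, 96, 101]), "B"),
    (channel_features([102, 98, 99, 103]), "B"),
]
query = channel_features([201, 199, 202, 198])
predicted = knn_predict(train, query, k=3)
print(predicted)
```

With unscaled features the mean dominates the distance, which is one reason the choice of features, color space and k is worth evaluating systematically, as the abstract describes.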


Journal of Chemometrics | 2012

Cross-validation in PCA models with the element-wise k-fold (ekf) algorithm: theoretical aspects

José Camacho; Alberto Ferrer

Cross‐validation has become one of the principal methods to adjust the meta‐parameters in predictive models. Extensions of the cross‐validation idea have been proposed to select the number of components in principal components analysis (PCA). The element‐wise k‐fold (ekf) cross‐validation is among the most used algorithms for principal components analysis cross‐validation. This is the method programmed in the PLS_Toolbox, and it has been stated to outperform other methods under most circumstances in a numerical experiment. The ekf algorithm is based on missing data imputation, and it can be programmed using any method for this purpose. In this paper, the ekf algorithm with the simplest missing data imputation method, trimmed score imputation, is analyzed. A theoretical study is conducted to identify in which situations the application of ekf is adequate and, more importantly, in which situations it is not. The results presented show that the ekf method may be unable to assess the extent to which a model represents a test set and may lead to discarding principal components with important information. In a second paper of this series, other imputation methods are studied within the ekf algorithm.
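
To make the mechanics concrete, here is a minimal sketch of the element-wise step with trimmed score imputation, under several assumptions: the loadings are taken as given (in a full procedure they would be fitted on training rows), the observation is already mean-centered, and the function names and the two-variable example are invented. Held-out elements are set to zero, scores are computed from the trimmed vector, and the held-out values are read from the reconstruction.

```python
import math

def tsi_estimate(x, loadings, missing):
    """Trimmed score imputation for one mean-centered observation.

    x: list of values; loadings: one orthonormal loading vector per
    component; missing: set of indices treated as missing."""
    m = len(x)
    trimmed = [0.0 if j in missing else v for j, v in enumerate(x)]
    scores = [sum(p[j] * trimmed[j] for j in range(m)) for p in loadings]
    recon = [sum(t * p[j] for t, p in zip(scores, loadings)) for j in range(m)]
    return {j: recon[j] for j in missing}

def ekf_press_row(x, loadings, n_folds):
    """Element-wise k-fold squared prediction error for a single test row."""
    m = len(x)
    press = 0.0
    for fold in range(n_folds):
        missing = {j for j in range(m) if j % n_folds == fold}
        est = tsi_estimate(x, loadings, missing)
        press += sum((x[j] - est[j]) ** 2 for j in missing)
    return press

# One component along the diagonal; the test row lies exactly on it.
loadings = [[1 / math.sqrt(2), 1 / math.sqrt(2)]]
row = [2.0, 2.0]

estimate = tsi_estimate(row, loadings, {1})[1]
press = ekf_press_row(row, loadings, n_folds=2)
print(f"imputed value: {estimate:.3f} (true value 2.0)")
print(f"element-wise PRESS: {press:.3f}")
```

In this toy case the row is reproduced exactly by the one-component model, yet the element-wise prediction error is clearly non-zero because zeroing an element shrinks the score. This is only meant to show how the computation behaves; it does not reproduce the paper's theoretical analysis.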


BMC Bioinformatics | 2009

Functional assessment of time course microarray data

María José Nueda; Patricia Sebastián; Sonia Tarazona; Francisco García-García; Joaquín Dopazo; Alberto Ferrer; Ana Conesa

Motivation: Time-course microarray experiments study the progress of gene expression along time across one or several experimental conditions. Most developed analysis methods focus on the clustering or the differential expression analysis of genes and do not integrate functional information. The assessment of the functional aspects of time-course transcriptomics data requires the use of approaches that exploit the activation dynamics of the functional categories to which genes are annotated. Methods: We present three novel methodologies for the functional assessment of time-course microarray data. i) maSigFun derives from the maSigPro method, a regression-based strategy to model time-dependent expression patterns and identify genes with differences across series. maSigFun fits a regression model for groups of genes labeled by a functional class and selects those categories which have a significant model. ii) PCA-maSigFun fits a PCA model of each functional class-defined expression matrix to extract orthogonal patterns of expression change, which are then assessed for their fit to a time-dependent regression model. iii) ASCA-functional uses the ASCA model to rank genes according to their correlation to principal time expression patterns and assess functional enrichment in a GSA fashion. We used simulated and experimental datasets to study these novel approaches. Results were compared to alternative methodologies. Results: Synthetic and experimental data showed that the different methods are able to capture different aspects of the relationship between genes, functions and co-expression that are biologically meaningful. The methods should not be considered as competing; rather, they provide different insights into the molecular and functional dynamic events taking place within the biological system under study.

Collaboration


Dive into Alberto Ferrer's collaborations.

Top Co-Authors

José Manuel Prats-Montalbán
Polytechnic University of Valencia

Abel Folch-Fortuny
Polytechnic University of Valencia

Jesús Picó
Polytechnic University of Valencia

Raffaele Vitale
Polytechnic University of Valencia

Ana Conesa
Polytechnic University of Valencia

Francisco Arteaga
Universidad Católica de Valencia San Vicente Mártir

Sonia Tarazona
Polytechnic University of Valencia

D. Aguado
Polytechnic University of Valencia