Ricardo Cao | Researchain

Publication

Featured research published by Ricardo Cao.

Computational Statistics & Data Analysis | 1994

A comparative study of several smoothing methods in density estimation

Ricardo Cao; Antonio Cuevas; Wenceslao González-Manteiga

Abstract The theory of bandwidth choice in density estimation is developing very fast. Several methods (with plenty of varieties and subvarieties) have recently been proposed as alternatives to least squares cross-validation, the standard for years. This paper includes (a) a critical up-to-date review of the main methods currently available, in which the discussion provides some new insights on the important problem of estimating the minimization criteria and on the choice of pilot bandwidths in bootstrap-based methods; (b) an extensive simulation study of ten selected bandwidths; and (c) a final discussion with some recommendations for practitioners. The conclusions are not easily summarized in a few words, because different cases have to be considered and important nuances must be pointed out. However, we could mention that the classical cross-validation bandwidths show, generally speaking, a relatively poor behavior (this is especially clear for the pseudo-likelihood method). On the other hand, although no selector appears to be uniformly better, the plug-in (in a version similar to that proposed by Sheather and Jones, J. Royal Statist. Soc. Ser. B 53, 1991) and the (smoothed) bootstrap-based selectors show a fairly satisfactory performance, which suggests that they could be the new standard methods for the problem of smoothing in density estimation. Interesting results are also obtained for a new type of bandwidth based on the number of inflection points.

Test | 1997

Universal smoothing factor selection in density estimation: theory and practice

Luc Devroye; Jan Beirlant; Ricardo Cao; Ricardo Fraiman; Peter Hall; M. C. Jones; Gábor Lugosi; Enno Mammen; J. S. Marron; César Sánchez-Sellero; J. Uña; Frederic Udina

Abstract In earlier work with Gábor Lugosi, we introduced a method to select a smoothing factor for kernel density estimation such that, for all densities in all dimensions, the $L_1$ error of the corresponding kernel estimate is not larger than $3+\epsilon$ times the error of the estimate with the optimal smoothing factor plus a constant times $\sqrt{\log n/n}$, where $n$ is the sample size, and the constant only depends on the complexity of the kernel used in the estimate. The result is nonasymptotic, that is, the bound is valid for each $n$. The estimate uses ideas from the minimum distance estimation work of Yatracos. We present a practical implementation of this estimate, report on some comparative results, and highlight some key properties of the new method.

Journal of the American Statistical Association | 1996

Bootstrap Selection of the Smoothing Parameter in Nonparametric Hazard Rate Estimation

Wenceslao González-Manteiga; Ricardo Cao; J. S. Marron

Abstract An asymptotic representation of the mean weighted integrated squared error for the kernel-based estimator of the hazard rate in the presence of right-censored samples is obtained for different bootstrap resampling methods. As a consequence, a new bandwidth selector based on the bootstrap is introduced. Very satisfactory simulation results are obtained in comparison to the cross-validation selector for different models, using WARPed (i.e., binned) versions of the estimators.

Annals of Human Genetics | 2009

Evaluating the Ability of Tree‐Based Methods and Logistic Regression for the Detection of SNP‐SNP Interaction

Manuel García-Magariños; Ignacio López-de-Ullibarri; Ricardo Cao; Antonio Salas

Most common human diseases are likely to have complex etiologies. Methods of analysis that allow for the phenomenon of epistasis are of growing interest in the genetic dissection of complex diseases. By allowing for epistatic interactions between potential disease loci, we may succeed in identifying genetic variants that might otherwise have remained undetected. Here we aimed to analyze the ability of logistic regression (LR) and two tree‐based supervised learning methods, classification and regression trees (CART) and random forest (RF), to detect epistasis. Multifactor‐dimensionality reduction (MDR) was also used for comparison. Our approach involves first the simulation of datasets of autosomal biallelic unphased and unlinked single nucleotide polymorphisms (SNPs), each containing a two‐loci interaction (causal SNPs) and 98 ‘noise’ SNPs. We modelled interactions under different scenarios of sample size, missing data, minor allele frequencies (MAF) and several penetrance models: three involving both (indistinguishable) marginal effects and interaction, and two simulating pure interaction effects. In total, we have simulated 99 different scenarios. Although CART, RF, and LR yield similar results in terms of detection of true association, CART and RF perform better than LR with respect to classification error. MAF, penetrance model, and sample size are greater determining factors than percentage of missing data in the ability of the different techniques to detect true association. In pure interaction models, only RF detects association. In conclusion, tree‐based methods and LR are important statistical tools for the detection of unknown interactions among true risk‐associated SNPs with marginal effects and in the presence of a significant number of noise SNPs. In pure interaction models, RF performs reasonably well in the presence of large sample sizes and low percentages of missing data. However, when the study design is suboptimal (unfavourable to detect interaction in terms of e.g. sample size and MAF) there is a high chance of detecting false, spurious associations.

Test | 1993

Testing the hypothesis of a general linear model using nonparametric regression estimation

Wenceslao González-Manteiga; Ricardo Cao

Summary Given the model $Y_i = m(X_i) + \varepsilon_i$, where $E(\varepsilon_i) = 0$, $X_i \in C$, $i = 1, \ldots, n$, and $C$ is a $p$-dimensional compact set, we have designed a new method for testing the hypothesis that the regression function follows a general linear model, $m(\cdot) \in \{m_\theta(\cdot) = A^t(\cdot)\theta\}_{\theta \in \Theta \subset \mathbb{R}^q}$, with $A$ a function from $\mathbb{R}^p$ to $\mathbb{R}^q$. The statistic, denoted $\Delta\mathrm{ASE}$, used for testing the given hypothesis is defined to be the difference between the average squared errors (ASE) associated with the non-parametric estimator

International Journal of Legal Medicine | 1993

Population genetics of three VNTR polymorphisms in two different Spanish populations

Emilio Valverde; Carmen Cabrero; Ricardo Cao; M. S. Rodríguez-Calvo; A. Díez; Francisco Barros; Jorge Alemany; Angel Carracedo

Technometrics | 1995

Predicting using Box-Jenkins, nonparametric, and bootstrap techniques

Ignacio García-Jurado; Wenceslao González-Manteiga; J. M. Prada-Sánchez; Manuel Febrero-Bande; Ricardo Cao

Stochastic Processes and their Applications | 1999

Rate of convergence of a convolution-type estimator of the marginal density of a MA(1) process

Ángeles Saavedra; Ricardo Cao

Computational Statistics & Data Analysis | 1995

Minimum distance density-based estimation

Ricardo Cao; Antonio Cuevas; Ricardo Fraiman

Journal of Nonparametric Statistics | 2005