Publication


Featured research published by Huey-miin Hsueh.


Bioinformatics | 2004

Analysis of variance components in gene expression data

James J. Chen; Robert R. Delongchamp; Chen-An Tsai; Huey-miin Hsueh; Frank D. Sistare; Karol L. Thompson; Varsha G. Desai; James C. Fuscoe

Motivation: A microarray experiment is a multi-step process, and each step is a potential source of variation. There are two major sources of variation: biological variation and technical variation. This study presents a variance-components approach to investigating animal-to-animal, between-array, within-array and day-to-day variations for two data sets. The first data set involved estimation of technical variances for pooled control and pooled treated RNA samples. The variance components included between-array variance and two nested within-array variances: between-section (the upper and lower sections of the array are replicates) and within-section (two adjacent spots of the same gene are printed within each section). The second experiment was conducted in four different weeks. Each week there were reference and test samples with a dye-flip replicate on two hybridization days. The variance components included week-to-week, animal-to-animal, between-array and within-array variances.

Results: We applied the linear mixed-effects model to quantify different sources of variation. In the first data set, we found that the between-array variance is greater than the between-section variance, which, in turn, is greater than the within-section variance. In the second data set, for the reference samples, the week-to-week variance is larger than the between-array variance, which, in turn, is slightly larger than the within-array variance. For the test samples, the week-to-week variance is the largest component. The animal-to-animal variance is slightly larger than the between-array and within-array variances. However, in a gene-by-gene analysis, the animal-to-animal variance is smaller than the between-array variance in four out of five housekeeping genes. In summary, the largest variation observed is the week-to-week effect. Another important source of variability is the animal-to-animal variation. Finally, we describe the use of variance-component estimates to determine optimal numbers of animals, arrays per animal and sections per array in planning microarray experiments.
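To make the idea of a variance component concrete, here is a minimal sketch for the simplest case: one grouping factor (say, arrays) with the same number of replicate measurements in each group. It uses the classical ANOVA method-of-moments estimator, not the linear mixed-effects fit described in the abstract, and the function name and the data are hypothetical.

```python
from statistics import mean


def variance_components(groups):
    """Balanced one-way random-effects ANOVA estimates.

    groups: list of equal-length lists, one list of replicate
            measurements per group (e.g. per array).
    Returns (between_group_variance, within_group_variance).
    """
    k = len(groups)
    n = len(groups[0]) if k else 0
    if k < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need >= 2 groups, each with the same n >= 2 replicates")

    grand_mean = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]

    # Mean square between groups, on k - 1 degrees of freedom.
    msb = n * sum((m - grand_mean) ** 2 for m in group_means) / (k - 1)
    # Mean square within groups, on k * (n - 1) degrees of freedom.
    msw = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    ) / (k * (n - 1))

    # E[MSB] = within + n * between, so solve for the between component
    # and truncate at zero, since a variance cannot be negative.
    between = max(0.0, (msb - msw) / n)
    return between, msw


# Hypothetical log-intensities: 3 arrays, 2 replicate spots each.
between, within = variance_components([[8.1, 8.3], [8.9, 9.0], [7.6, 7.9]])
print(f"between-array: {between:.4f}, within-array: {within:.4f}")
```

Nested designs such as sections within arrays or animals within weeks need one such component per level, which is where a mixed-effects model becomes the more practical tool.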


Journal of Biopharmaceutical Statistics | 2003

Comparison of Methods for Estimating the Number of True Null Hypotheses in Multiplicity Testing

Huey-miin Hsueh; James J. Chen; Ralph L. Kodell

When a large number of statistical tests is performed, the chance of false positive findings could increase considerably. The traditional approach is to control the probability of rejecting at least one true null hypothesis, the familywise error rate (FWE). To improve the power of detecting treatment differences, an alternative approach is to control the expected proportion of errors among the rejected hypotheses, the false discovery rate (FDR). When some of the hypotheses are not true, the error rate from either the FWE- or the FDR-controlling procedure is usually lower than the designed level. This paper compares five methods used to estimate the number of true null hypotheses over a large number of hypotheses. The estimated number of true null hypotheses is then used to improve the power of FWE- or FDR-controlling methods. Monte Carlo simulations are conducted to evaluate the performance of these methods. The lowest slope method, developed by Benjamini and Hochberg (2000) on the adaptive control of the FDR in multiple testing with independent statistics, and the mean of differences method appear to perform the best. These two methods control the FWE properly when the number of nontrue null hypotheses is small. A data set from a toxicogenomic microarray experiment is used for illustration.
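The role of the number of true null hypotheses can be illustrated with the Benjamini-Hochberg step-up procedure. The sketch below is an illustration under stated assumptions, not the paper's code: `adaptive_bh` is a hypothetical name, and `m0` is taken as given rather than estimated by any of the five methods the paper compares. With `m0` left at its default the procedure is the ordinary one; supplying a smaller `m0` relaxes every threshold from `i * alpha / m` to `i * alpha / m0`.

```python
def adaptive_bh(pvalues, alpha=0.05, m0=None):
    """Benjamini-Hochberg step-up procedure with an optional m0.

    pvalues: list of p-values, one per hypothesis.
    m0:      assumed number of true null hypotheses (defaults to m,
             which gives the ordinary, non-adaptive procedure).
    Returns a list of booleans, True where the hypothesis is rejected.
    """
    m = len(pvalues)
    if m == 0:
        return []
    m0 = m if m0 is None else m0
    if not 1 <= m0 <= m:
        raise ValueError("m0 must lie between 1 and the number of tests")

    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])

    # Step-up rule: find the largest rank whose p-value is under its threshold.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / m0:
            cutoff_rank = rank

    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values; a smaller assumed m0 can only add rejections.
pvals = [0.001, 0.03, 0.04, 0.5]
print(adaptive_bh(pvals))
print(adaptive_bh(pvals, m0=2))
```

In practice `m0` is unknown and must be estimated from the p-values, so whether the adaptive procedure keeps its nominal error rate depends on the estimator, which is what the simulations in the paper evaluate.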


BMC Bioinformatics | 2010

Power and sample size estimation in microarray studies

Wei-Jiun Lin; Huey-miin Hsueh; James J. Chen

Background: Before conducting a microarray experiment, one important issue that needs to be determined is the number of arrays required in order to have adequate power to identify differentially expressed genes. This paper discusses some crucial issues in the problem formulation, parameter specifications, and approaches that are commonly proposed for sample size estimation in microarray experiments. Common methods for sample size estimation are formulated as the minimum sample size necessary to achieve a specified sensitivity (proportion of detected truly differentially expressed genes) on average at a specified false discovery rate (FDR) level and specified expected proportion (π1) of the truly differentially expressed genes in the array. Unfortunately, the probability of detecting the specified sensitivity in such a formulation can be low. We formulate the sample size problem as the number of arrays needed to achieve a specified sensitivity with 95% probability at the specified significance level. A permutation method using a small pilot dataset to estimate sample size is proposed. This method accounts for correlation and effect size heterogeneity among genes.

Results: A sample size estimate based on the common formulation, to achieve the desired sensitivity on average, can be calculated using a univariate method without taking the correlation among genes into consideration. This formulation of the sample size problem is inadequate because the probability of detecting the specified sensitivity can be lower than 50%. On the other hand, the needed sample size calculated by the proposed permutation method will ensure detecting at least the desired sensitivity with 95% probability. The method is shown to perform well for a real example dataset using a small pilot dataset with 4-6 samples per group.

Conclusions: We recommend that the sample size problem should be formulated to detect a specified proportion of differentially expressed genes with 95% probability. This formulation ensures finding the desired proportion of true positives with high probability. The proposed permutation method takes the correlation structure and effect size heterogeneity into consideration and works well using only a small pilot dataset.
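The two formulations contrasted in this abstract can be written side by side. The notation is an assumption introduced here for illustration and is not taken from the paper: S_n is the sensitivity obtained with n arrays, and λ is the target sensitivity.

```latex
% Assumed notation (not from the paper):
%   S_n     = proportion of truly differentially expressed genes detected with n arrays
%   \lambda = target sensitivity
% Common formulation: reach the target sensitivity on average.
n_{\mathrm{avg}} = \min\bigl\{\, n : \mathbb{E}[S_n] \ge \lambda \,\bigr\}
% Proposed formulation: reach the target sensitivity with 95% probability.
n_{95} = \min\bigl\{\, n : \Pr(S_n \ge \lambda) \ge 0.95 \,\bigr\}
```

Meeting the target on average says nothing directly about how often a single experiment meets it, which is why the abstract reports that the first formulation can leave the probability of reaching the target below 50%.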


Journal of Biopharmaceutical Statistics | 2002

BAYESIAN APPROACH TO EVALUATION OF BRIDGING STUDIES

Jen-pei Liu; Chin-Fu Hsiao; Huey-miin Hsueh

We address the issue of analysis of clinical data generated by the bridging study conducted in the new region to evaluate the similarity for extrapolation of the foreign clinical data. A bridging study is usually conducted in the new region only after the test product is approved for commercial marketing in the original region due to its proven efficacy and safety. Sufficient information on efficacy, safety, dosage, and dose regimen has already been generated in the original region. The empirical Bayesian approach is proposed to synthesize the data generated by the bridging study and foreign clinical data generated in the original region for assessment of similarity between the new and the original regions. A method for sample size determination for the bridging study is also suggested. It can be shown that the total sample size is inversely proportional to the strength of the evidence for the efficacy presented in the original region and the proportion of the patients assigned to receive the test product in the bridging study.

*The views expressed in this article are personal opinions of the authors and may not necessarily represent the position of the National Cheng-Kung University, the National Health Research Institutes, and the National Cheng-Chi University, Taiwan.


Journal of Biopharmaceutical Statistics | 2004

A Bayesian Noninferiority Approach to Evaluation of Bridging Studies

Jen-pei Liu; Huey-miin Hsueh; Chin-Fu Hsiao

A bridging study, as defined by the International Conference on Harmonization E5, is usually conducted in the new region only after the test product has been approved for commercial marketing in the original region due to its proven efficacy and safety. In this paper, we address the issue of analysis of clinical data generated by the bridging study conducted in the new region to evaluate the similarity for extrapolation of the foreign clinical data to the population of the new region. Information on efficacy, safety, dosage, and dose regimen of the original region cannot be concurrently obtained from the local bridging studies but is available in the trials conducted in the original region. A Bayesian noninferiority approach is therefore proposed to incorporate the data generated in the original region to evaluate bridging evidence by the local bridging studies and assess similarity between the new and original regions. Methods for sample size determination for the bridging study are also proposed.


Journal of Biopharmaceutical Statistics | 2004

A Generalized Additive Model For Microarray Gene Expression Data Analysis

Chen-An Tsai; Huey-miin Hsueh; James J. Chen

Microarray technology allows the measurement of expression levels of a large number of genes simultaneously. There are inherent biases in microarray data generated from an experiment. Various statistical methods have been proposed for data normalization and data analysis. This paper proposes a generalized additive model for the analysis of gene expression data. This model consists of two sub-models: a non-linear model and a linear model. We propose a two-step normalization algorithm to fit the two sub-models sequentially. The first step involves a non-parametric regression using lowess fits to adjust for non-linear systematic biases. The second step uses a linear ANOVA model to estimate the remaining effects, including the interaction effect of genes and treatments, which is the effect of interest in a study. The proposed model is a generalization of the ANOVA model for microarray data analysis. We show correspondences between the lowess fit and the ANOVA model methods. The normalization procedure does not assume that the majority of genes do not change their expression levels, nor does it assume that the two channel intensities from the same spot are independent. The procedure can be applied to either one-channel or two-channel data from experiments with multiple treatments or multiple nuisance factors. Two toxicogenomic experiment data sets and a simulated data set are used to contrast the proposed method with the commonly known lowess fit and ANOVA methods.


Journal of Statistical Computation and Simulation | 2007

Incorporating the number of true null hypotheses to improve power in multiple testing: application to gene microarray data

Huey-miin Hsueh; Chen-An Tsai; James J. Chen

Testing for significance with gene expression data from DNA microarray experiments involves simultaneous comparisons of hundreds or thousands of genes. In common exploratory microarray experiments, most genes are not expected to be differentially expressed. The family-wise error (FWE) rate and false discovery rate (FDR) are two common approaches used to account for multiple hypothesis tests to identify differentially expressed genes. When the number of hypotheses is very large and some null hypotheses are expected to be true, the power of an FWE or FDR procedure can be improved if the number of true null hypotheses is known. The mean of differences (MD) of ranked p-values has been proposed to estimate the number of true null hypotheses under the independence model. This article proposes to incorporate the MD estimate into an FWE or FDR approach for gene identification. Simulation results show that the procedure appears to control the FWE and FDR well at the FWE=0.05 and FDR=0.05 significance levels; it exceeds the nominal level for FDR=0.01 when the null hypotheses are highly correlated (a correlation of 0.941). The proposed approach is applied to a public colon tumor data set for illustration.
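For the FWE side, a minimal sketch of how knowing the number of true null hypotheses raises power is a Bonferroni-type rule that divides the significance level by `m0` instead of by the total number of tests. The function name is hypothetical, and `m0` is supplied directly here; the MD estimator described in the abstract is not implemented.

```python
def adaptive_bonferroni(pvalues, alpha=0.05, m0=None):
    """Bonferroni-type FWE rule using an assumed number of true nulls.

    pvalues: list of p-values, one per hypothesis.
    m0:      assumed number of true null hypotheses (defaults to the
             total number of tests, i.e. the ordinary Bonferroni rule).
    Returns a list of booleans, True where the hypothesis is rejected.
    """
    m = len(pvalues)
    if m == 0:
        return []
    m0 = m if m0 is None else m0
    if not 1 <= m0 <= m:
        raise ValueError("m0 must lie between 1 and the number of tests")

    # Only true nulls can produce false rejections, so the per-test level
    # is alpha / m0 rather than alpha / m.
    threshold = alpha / m0
    return [p <= threshold for p in pvalues]


# Hypothetical p-values: the per-test threshold moves from 0.0125 to 0.025.
pvals = [0.004, 0.02, 0.6, 0.9]
print(adaptive_bonferroni(pvals))
print(adaptive_bonferroni(pvals, m0=2))
```

When `m0` is replaced by an estimate, the guarantee is no longer automatic, which is consistent with the abstract's report that the nominal level can be exceeded under strong correlation.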


Biometrics | 2003

Estimation of False Discovery Rates in Multiple Testing: Application to Gene Microarray Data

Chen-An Tsai; Huey-miin Hsueh; James J. Chen


BMC Bioinformatics | 2007

Reproducibility of microarray data: a further analysis of microarray quality control (MAQC) data

James J. Chen; Huey-miin Hsueh; Robert R. Delongchamp; Chien-Ju Lin; Chen-An Tsai


Statistics in Medicine | 2002

Tests for equivalence or non-inferiority for paired binary data

Jen-pei Liu; Huey-miin Hsueh; Eric Hsieh; James J. Chen

Collaboration


Top Co-Authors

James J. Chen, National Center for Toxicological Research

Jen-pei Liu, National Taiwan University

Chen-An Tsai, National Center for Toxicological Research

Robert R. Delongchamp, University of Arkansas for Medical Sciences

Chin-Fu Hsiao, National Health Research Institutes

Chien-Ju Lin, National Center for Toxicological Research

James C. Fuscoe, National Center for Toxicological Research

Ralph L. Kodell, University of Arkansas for Medical Sciences

Varsha G. Desai, Food and Drug Administration