Publication


Featured research published by Arief Gusnanto.


Genetic Epidemiology | 2008

Estimation of Significance Thresholds for Genomewide Association Scans

Frank Dudbridge; Arief Gusnanto

The question of what significance threshold is appropriate for genomewide association studies is somewhat unresolved. Previous theoretical suggestions have yet to be validated in practice, whereas permutation testing does not resolve a discrepancy between the genomewide multiplicity of the experiment and the subset of markers actually tested. We used genotypes from the Wellcome Trust Case‐Control Consortium to estimate a genomewide significance threshold for the UK Caucasian population. We subsampled the genotypes at increasing densities, using permutation to estimate the nominal P‐value for 5% family‐wise error. By extrapolating to infinite density, we estimated the genomewide significance threshold to be about 7.2 × 10⁻⁸. To reduce the computation time, we considered Patterson's eigenvalue estimator of the effective number of tests, but found it to be an order of magnitude too low for multiplicity correction. However, by fitting a Beta distribution to the minimum P‐value from permutation replicates, we showed that the effective number is a useful heuristic and suggest that its estimation in this context is an open problem. We conclude that permutation is still needed to obtain genomewide significance thresholds, but with subsampling, extrapolation and estimation of an effective number of tests, the threshold can be standardized for all studies of the same population. Genet. Epidemiol. 2008.
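The Beta-distribution heuristic mentioned in this abstract can be illustrated with a minimal sketch. It assumes the minimum P-value across tests follows Beta(1, M), which holds exactly for M independent tests, so that M can be read as an effective number of tests. Fixing the first shape parameter at 1 is a simplification made here, not necessarily the authors' fitting procedure, and the function names and simulated data are hypothetical.

```python
import math
import random


def effective_number_of_tests(min_pvalues):
    """Maximum-likelihood estimate of b when min P-values are modelled as Beta(1, b)."""
    # For density b * (1 - p)^(b - 1), the MLE is b = -n / sum(log(1 - p_i)).
    log_sum = sum(math.log1p(-p) for p in min_pvalues)
    return -len(min_pvalues) / log_sum


def fwer_threshold(m_eff, alpha=0.05):
    """Per-test P-value threshold giving family-wise error alpha under Beta(1, m_eff)."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m_eff)


def simulate_min_pvalues(n_tests, n_replicates, seed=0):
    """Stand-in for permutation replicates: minimum of n_tests independent uniform P-values."""
    rng = random.Random(seed)
    return [min(rng.random() for _ in range(n_tests)) for _ in range(n_replicates)]


# Hypothetical example: 200 independent tests, 2000 replicates.
min_p = simulate_min_pvalues(n_tests=200, n_replicates=2000)
m_eff = effective_number_of_tests(min_p)
threshold = fwer_threshold(m_eff)
```

With independent simulated tests the estimate should land near the true number of tests. With correlated markers, as in real genotype data, it would be expected to be smaller than the number of markers tested.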


Bioinformatics | 2005

False discovery rate, sensitivity and sample size for microarray studies

Yudi Pawitan; Stefan Michiels; Serge Koscielny; Arief Gusnanto; Alexander Ploner

MOTIVATION In microarray data studies most researchers are keenly aware of the potentially high rate of false positives and the need to control it. One key statistical shift is the move away from the well-known P-value to false discovery rate (FDR). Less discussion perhaps has been spent on the sensitivity or the associated false negative rate (FNR). The purpose of this paper is to explain in simple ways why the shift from P-value to FDR for statistical assessment of microarray data is necessary, to elucidate the determining factors of FDR and, for a two-sample comparative study, to discuss its control via sample size at the design stage. RESULTS We use a mixture model, involving differentially expressed (DE) and non-DE genes, that captures the most common problem of finding DE genes. Factors determining FDR are (1) the proportion of truly differentially expressed genes, (2) the distribution of the true differences, (3) measurement variability and (4) sample size. Many current small microarray studies are plagued with large FDR, but controlling FDR alone can lead to unacceptably large FNR. In evaluating a design of a microarray study, sensitivity or FNR curves should be computed routinely together with FDR curves. Under certain assumptions, the FDR and FNR curves coincide, thus simplifying the choice of sample size for controlling the FDR and FNR jointly.
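How the four factors listed in this abstract drive FDR can be sketched under strong simplifying assumptions: a two-group mixture with one fixed standardized difference for all DE genes (true difference divided by the measurement standard deviation), a two-sided z-test with known variance, and equal group sizes. FNR is taken here as the share of truly DE genes among those not declared DE, which is one common convention and may differ from the paper's definition. All names and numbers are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def power_two_sample(effect_size, n_per_group, alpha):
    """Power of a two-sided two-sample z-test for a standardized difference."""
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    shift = effect_size * (n_per_group / 2.0) ** 0.5
    return (1.0 - _STD_NORMAL.cdf(z_crit - shift)) + _STD_NORMAL.cdf(-z_crit - shift)


def fdr_and_fnr(prop_de, effect_size, n_per_group, alpha=0.01):
    """Expected FDR and FNR in a mixture of DE and non-DE genes."""
    power = power_two_sample(effect_size, n_per_group, alpha)
    false_pos = (1.0 - prop_de) * alpha        # non-DE genes declared DE
    true_pos = prop_de * power                 # DE genes declared DE
    false_neg = prop_de * (1.0 - power)        # DE genes missed
    true_neg = (1.0 - prop_de) * (1.0 - alpha) # non-DE genes correctly not declared
    fdr = false_pos / (false_pos + true_pos)
    fnr = false_neg / (false_neg + true_neg)
    return fdr, fnr


# Hypothetical design comparison: 5% DE genes, standardized difference of 1.
small_fdr, small_fnr = fdr_and_fnr(prop_de=0.05, effect_size=1.0, n_per_group=5)
large_fdr, large_fnr = fdr_and_fnr(prop_de=0.05, effect_size=1.0, n_per_group=50)
```

In this sketch, raising the sample size from 5 to 50 per group lowers both FDR and FNR at the same nominal P-value cutoff, which is the design-stage trade-off the abstract describes.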


Journal of Clinical Investigation | 2005

Platelet genomics and proteomics in human health and disease

Iain C. Macaulay; Philippa Carr; Arief Gusnanto; Willem H. Ouwehand; Des Fitzgerald; Nicholas A. Watkins

Proteomic and genomic technologies provide powerful tools for characterizing the multitude of events that occur in the anucleate platelet. These technologies are beginning to define the complete platelet transcriptome and proteome as well as the protein-protein interactions critical for platelet function. The integration of these results provides the opportunity to identify those proteins involved in discrete facets of platelet function. Here we summarize the findings of platelet proteome and transcriptome studies and their application to diseases of platelet function.


Blood | 2010

Transcription profiling in human platelets reveals LRRFIP1 as a novel protein regulating platelet function.

Alison H. Goodall; Philippa Burns; Isabelle I. Salles; Iain C. Macaulay; Chris I. Jones; Diego Ardissino; Bernard de Bono; Sarah L. Bray; Hans Deckmyn; Frank Dudbridge; Desmond J. Fitzgerald; Stephen F. Garner; Arief Gusnanto; Kerstin Koch; Cordelia Langford; Marie N. O'Connor; Catherine M. Rice; Derek L. Stemple; Jonathan Stephens; Mieke D. Trip; Jaap-Jan Zwaginga; Nilesh J. Samani; Nicholas A. Watkins; Patricia B. Maguire; Willem H. Ouwehand

Within the healthy population, there is substantial, heritable, and interindividual variability in the platelet response. We explored whether a proportion of this variability could be accounted for by interindividual variation in gene expression. Through a correlative analysis of genome-wide platelet RNA expression data from 37 subjects representing the normal range of platelet responsiveness within a cohort of 500 subjects, we identified 63 genes in which transcript levels correlated with variation in the platelet response to adenosine diphosphate and/or the collagen-mimetic peptide, cross-linked collagen-related peptide. Many of these encode proteins with no reported function in platelets. An association study of 6 of the 63 genes in 4235 cases and 6379 controls showed a putative association with myocardial infarction for COMMD7 (COMM domain-containing protein 7) and a major deviation from the null hypothesis for LRRFIP1 [leucine-rich repeat (in FLII) interacting protein 1]. Morpholino-based silencing in Danio rerio identified a modest role for commd7 and a significant effect for lrrfip1 as positive regulators of thrombus formation. Proteomic analysis of human platelet LRRFIP1-interacting proteins indicated that LRRFIP1 functions as a component of the platelet cytoskeleton, where it interacts with the actin-remodeling proteins Flightless-1 and Drebrin. Taken together, these data reveal novel proteins regulating the platelet response.


Bioinformatics | 2006

Multidimensional local false discovery rate for microarray studies

Alexander Ploner; Stefano Calza; Arief Gusnanto; Yudi Pawitan

MOTIVATION The false discovery rate (fdr) is a key tool for statistical assessment of differential expression (DE) in microarray studies. Overall control of the fdr alone, however, is not sufficient to address the problem of genes with small variance, which generally suffer from a disproportionally high rate of false positives. It is desirable to have an fdr-controlling procedure that automatically accounts for gene variability. METHODS We generalize the local fdr as a function of multiple statistics, combining a common test statistic for assessing DE with its standard error information. We use a non-parametric mixture model for DE and non-DE genes to describe the observed multi-dimensional statistics, and estimate the distribution for non-DE genes via the permutation method. We demonstrate this fdr2d approach for simulated and real microarray data. RESULTS The fdr2d allows objective assessment of DE as a function of gene variability. We also show that the fdr2d performs better than commonly used modified test statistics. AVAILABILITY An R-package OCplus containing functions for computing fdr2d() and other operating characteristics of microarray data is available at http://www.meb.ki.se/~yudpaw.


Current Opinion in Lipidology | 2007

Identification of differentially expressed genes and false discovery rate in microarray studies

Arief Gusnanto; Stefano Calza; Yudi Pawitan

Purpose of review To highlight the development in microarray data analysis for the identification of differentially expressed genes, particularly via control of false discovery rate. Recent findings The emergence of high-throughput technology such as microarrays raises two fundamental statistical issues: multiplicity and sensitivity. We focus on the biological problem of identifying differentially expressed genes. First, multiplicity arises due to testing tens of thousands of hypotheses, rendering the standard P value meaningless. Second, known optimal single-test procedures such as the t-test perform poorly in the context of highly multiple tests. The standard approach of dealing with multiplicity is too conservative in the microarray context. The false discovery rate concept is fast becoming the key statistical assessment tool replacing the P value. We review the false discovery rate approach and argue that it is more sensible for microarray data. We also discuss some methods to take into account additional information from the microarrays to improve the false discovery rate. Summary There is growing consensus on how to analyse microarray data using the false discovery rate framework in place of the classical P value. Further research is needed on the preprocessing of the raw data, such as the normalization step and filtering, and on finding the most sensitive test procedure.


Human Genomics | 2006

Detecting multiple associations in genome-wide studies

Frank Dudbridge; Arief Gusnanto; Bobby P. C. Koeleman

Recent developments in the statistical analysis of genome-wide studies are reviewed. Genome-wide analyses are becoming increasingly common in areas such as scans for disease-associated markers and gene expression profiling. The data generated by these studies present new problems for statistical analysis, owing to the large number of hypothesis tests, comparatively small sample size and modest number of true gene effects. In this review, strategies are described for optimising the genotyping cost by discarding unpromising genes at an earlier stage, saving resources for the genes that show a trend of association. In addition, there is a review of new methods of analysis that combine evidence across genes to increase sensitivity to multiple true associations in the presence of many non-associated genes. Some methods achieve this by including only the most significant results, whereas others model the overall distribution of results as a mixture of distributions from true and null effects. Because genes are correlated even when having no effect, permutation testing is often necessary to estimate the overall significance, but this can be very time consuming. Efficiency can be improved by fitting a parametric distribution to permutation replicates, which can be re-used in subsequent analyses. Methods are also available to generate random draws from the permutation distribution. The review also includes discussion of new error measures that give a more reasonable interpretation of genome-wide studies, together with improved sensitivity. The false discovery rate allows a controlled proportion of positive results to be false, while detecting more true positives; and the local false discovery rate and false-positive report probability give clarity on whether or not a statistically significant test represents a real discovery.


Bioinformatics | 2007

Robust smooth segmentation approach for array CGH data analysis

Jian Huang; Arief Gusnanto; Kathleen O'Sullivan; Johan Staaf; Åke Borg; Yudi Pawitan

MOTIVATION Array comparative genomic hybridization (aCGH) provides a genome-wide technique to screen for copy number alteration. The existing segmentation approaches for analyzing aCGH data are based on modeling data as a series of discrete segments with unknown boundaries and unknown heights. Although the biological process of copy number alteration is discrete, in reality a variety of biological and experimental factors can cause the signal to deviate from a stepwise function. To take this into account, we propose a smooth segmentation (smoothseg) approach. METHODS To achieve a robust segmentation, we use a doubly heavy-tailed random-effect model. The first heavy-tailed structure on the errors deals with outliers in the observations, and the second deals with possible jumps in the underlying pattern associated with different segments. We develop a fast and reliable computational procedure based on the iterative weighted least-squares algorithm with band-limited matrix inversion. RESULTS Using simulated and real data sets, we demonstrate how smoothseg can aid in identification of regions with genomic alteration and in classification of samples. For the real data sets, smoothseg leads to smaller false discovery rate and classification error rate than the circular binary segmentation (CBS) algorithm. In a realistic simulation setting, smoothseg is better than wavelet smoothing and CBS in identification of regions with genomic alterations and better than CBS in classification of samples. For comparative analyses, we demonstrate that segmenting the t-statistics performs better than segmenting the data. AVAILABILITY The R package smoothseg to perform smooth segmentation is available from http://www.meb.ki.se/~yudpaw.


Platelets | 2008

Identification of variation in the platelet transcriptome associated with Glycoprotein 6 haplotype

Philippa Burns; Arief Gusnanto; Iain C. Macaulay; A. Rankin; Brian D. M. Tom; Cordelia Langford; Frank Dudbridge; Willem H. Ouwehand; Nicholas A. Watkins

Platelet Glycoprotein VI (GPVI) is the activatory collagen signalling receptor that transmits an outside-in signal via the FcR γ-chain. In Caucasians two GP6 haplotypes have been identified which encode GPVI isoforms that differ by five amino-acids. The minor haplotype is associated with a modest but statistically significant reduction in GPVI abundance and reduced downstream signalling events. As GPVI is also expressed on megakaryocytes, different GPVI isoforms may imprint on the platelet transcriptome. We investigated the association of GP6 haplotype with transcription by comparing the transcriptomes of platelets from individuals homozygous for the major (‘a’) and minor (‘b’) haplotypes to identify differentially expressed (DE) transcripts. Platelet RNA was isolated from apheresis concentrates from 16 ‘aa’ donors and eight ‘bb’ donors. mRNA was amplified using a template-switching PCR based protocol and fluorescently labelled. Samples were randomly paired both within and between haplotypes and compared on a cDNA microarray. No consistently DE transcripts were identified within the ‘aa’ haplotype but 52 significantly DE transcripts were observed between haplotypes. Generally the fold differences were low (two to four-fold) but were confirmed by qRT-PCR for selected transcripts (TUBB1, P = 0.0004; VWF, P = 0.0126). The results of this study indicate that there are subtle differences between the platelet transcriptomes of individuals who differ by GP6 haplotype. The identification of DE genes may identify critical pathways and nodes not previously known to be involved in platelet development and function.


Journal of Biomedical Informatics | 2013

Partial least squares and logistic regression random-effects estimates for gene selection in supervised classification of gene expression data

Arief Gusnanto; Alexander Ploner; Farag Shuweihdi; Yudi Pawitan

Our main interest in supervised classification of gene expression data is to infer whether the expressions can discriminate biological characteristics of samples. With thousands of gene expressions to consider, gene selection has been advocated to decrease classification error by including only the discriminating genes. We propose to make the gene selection based on partial least squares (PLS) and logistic regression random-effects (RE) estimates before the selected genes are evaluated in classification models. We compare the selection with that based on the two-sample t-statistics, a current practice, and modified t-statistics. The results indicate that gene selection based on logistic regression RE estimates is recommended in a general situation, while the selection based on the PLS estimates is recommended when the number of samples is low. Gene selection based on the modified t-statistics performs well when the genes exhibit moderate-to-high variability with moderate group separation. Respecting the characteristics of the data is a key aspect to consider in gene selection.

Collaboration


Dive into Arief Gusnanto's collaborations.

Top Co-Authors

Frank Dudbridge

Laboratory of Molecular Biology

Cordelia Langford

Wellcome Trust Sanger Institute

Jaap-Jan Zwaginga

Catholic University of Leuven

Michael Steward

Catholic University of Leuven
