Publication


Featured research published by Zhifa Liu.


Journal of Investigative Dermatology | 2015

The Genomic Landscape of Childhood and Adolescent Melanoma

Charles Lu; Jinghui Zhang; Panduka Nagahawatte; John Easton; Seungjae Lee; Zhifa Liu; Li Ding; Matthew A. Wyczalkowski; Marcus B. Valentine; Fariba Navid; Heather L. Mulder; Ruth G. Tatevossian; James Dalton; James Davenport; Zhirong Yin; Michael Edmonson; Michael Rusch; Gang Wu; Yongjin Li; Matthew Parker; Erin Hedlund; Sheila A. Shurtleff; Susana C. Raimondi; Bhavin Vadodaria; Donald Yergeau; Elaine R. Mardis; Richard Wilson; William E. Evans; David W. Ellison; Stanley Pounds

Despite remarkable advances in the genomic characterization of adult melanoma, the molecular pathogenesis of pediatric melanoma remains largely unknown. We analyzed 15 conventional melanomas (CMs), 3 melanomas arising in congenital nevi (CNMs), and 5 spitzoid melanomas (SMs), using various platforms, including whole genome or exome sequencing, the molecular inversion probe assay, and/or targeted sequencing. CMs demonstrated a high burden of somatic single-nucleotide variations (SNVs), with each case containing a TERT promoter (TERT-p) mutation, 13/15 containing an activating BRAF V600 mutation, and >80% of the identified SNVs consistent with UV damage. In contrast, the three CNMs contained an activating NRAS Q61 mutation and no TERT-p mutations. SMs were characterized by chromosomal rearrangements resulting in activated kinase signaling in 40%, and an absence of TERT-p mutations, except for the one SM that succumbed to hematogenous metastasis. We conclude that pediatric CM has a very similar UV-induced mutational spectrum to that found in the adult counterpart, emphasizing the need to promote sun protection practices in early life and to improve access to therapeutic agents being explored in adults in young patients. In contrast, the pathogenesis of CNM appears to be distinct. TERT-p mutations may identify the rare subset of spitzoid melanocytic lesions prone to disseminate.


Nature Communications | 2015

Genomic landscape of paediatric adrenocortical tumours

Emilia M. Pinto; Xiang Chen; John Easton; David Finkelstein; Zhifa Liu; Stanley Pounds; Carlos Rodriguez-Galindo; Troy C. Lund; Elaine R. Mardis; Richard Wilson; Kristy Boggs; Donald Yergeau; Jinjun Cheng; Heather L. Mulder; Jayanthi Manne; Jesse J. Jenkins; Maria José Mastellaro; Bonald C. Figueiredo; Michael A. Dyer; Alberto S. Pappo; Jinghui Zhang; James R. Downing; Raul C. Ribeiro; Gerard P. Zambetti

Pediatric adrenocortical carcinoma is a rare malignancy with poor prognosis. Here we analyze 37 adrenocortical tumors (ACTs) by whole genome, whole exome and/or transcriptome sequencing. Most cases (91%) show loss of heterozygosity (LOH) of chromosome 11p, with uniform selection against the maternal chromosome. IGF2 on chromosome 11p is overexpressed in 100% of the tumors. TP53 mutations and chromosome 17 LOH with selection against wild-type TP53 are observed in 28 ACTs (76%). Chromosomes 11p and 17 undergo copy-neutral LOH early during tumorigenesis, suggesting tumor-driver events. Additional genetic alterations include recurrent somatic mutations in ATRX and CTNNB1 and integration of human herpesvirus-6 in chromosome 11p. A dismal outcome is predicted by concomitant TP53 and ATRX mutations and associated genomic abnormalities, including massive structural variations and frequent background mutations. Collectively, these findings demonstrate the nature, timing and potential prognostic significance of key genetic alterations in pediatric ACT and outline a hypothetical model of pediatric adrenocortical tumorigenesis.


BMC Bioinformatics | 2014

An R package that automatically collects and archives details for reproducible computing

Zhifa Liu; Stan Pounds

Background: It is scientifically and ethically imperative that the results of statistical analysis of biomedical research data be computationally reproducible in the sense that the reported results can be easily recapitulated from the study data. Some statistical analyses are computationally a function of many data files, program files, and other details that are updated or corrected over time. In many applications, it is infeasible to manually maintain an accurate and complete record of all these details about a particular analysis. Results: Therefore, we developed the rctrack package that automatically collects and archives read-only copies of program files, data files, and other details needed to computationally reproduce an analysis. Conclusions: The rctrack package uses the trace function to temporarily embed detail collection procedures into functions that read files, write files, or generate random numbers so that no special modifications of the primary R program are necessary. At the conclusion of the analysis, rctrack uses these details to automatically generate a read-only archive of data files, program files, result files, and other details needed to recapitulate the analysis results. Information about this archive may be included as an appendix of a report generated by Sweave or knitr. Here, we describe the usage, implementation, and other features of the rctrack package. The rctrack package is freely available from http://www.stjuderesearch.org/site/depts/biostats/rctrack under the GPL license.
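The abstract above describes temporarily embedding detail collection into functions that read or write files, so the primary analysis program needs no changes. The sketch below illustrates that general idea in Python rather than R; it is not the rctrack implementation. The names `track_file_access` and `run_demo` are hypothetical, and wrapping `builtins.open` is an assumed stand-in for what R's `trace` function does.

```python
import builtins
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def track_file_access(log):
    """Temporarily wrap builtins.open so each opened file is recorded in `log`."""
    original_open = builtins.open

    def traced_open(file, mode="r", *args, **kwargs):
        handle = original_open(file, mode, *args, **kwargs)
        # Any mode that can modify the file is recorded as a write.
        kind = "write" if any(c in mode for c in "wax+") else "read"
        log.append((str(file), kind))
        return handle

    builtins.open = traced_open
    try:
        yield log
    finally:
        builtins.open = original_open  # always restore the untraced function


def run_demo():
    """Run a tiny 'analysis' and return the (file name, access kind) records."""
    log = []
    with tempfile.TemporaryDirectory() as tmp:
        data_path = os.path.join(tmp, "input.csv")
        with open(data_path, "w") as fh:  # setup step, not tracked
            fh.write("x,y\n1,2\n")
        with track_file_access(log):
            with open(data_path) as fh:  # recorded as a read
                fh.read()
            with open(os.path.join(tmp, "result.txt"), "w") as fh:  # recorded as a write
                fh.write("done\n")
    return [(os.path.basename(path), kind) for path, kind in log]
```

The analysis code inside the `with track_file_access(log):` block calls `open` as usual, and the record of inputs and outputs accumulates as a side effect. An archiving step could then copy the listed files to read-only storage.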


Bioinformatics | 2013

A genomic random interval model for statistical analysis of genomic lesion data

Stan Pounds; Cheng Cheng; Shaoyu Li; Zhifa Liu; Jinghui Zhang; Charles G. Mullighan

Motivation: Tumors exhibit numerous genomic lesions such as copy number variations, structural variations and sequence variations. It is difficult to determine whether a specific constellation of lesions observed across a cohort of multiple tumors provides statistically significant evidence that the lesions target a set of genes that may be located across different chromosomes but yet are all involved in a single specific biological process or function. Results: We introduce the genomic random interval (GRIN) statistical model and analysis method that evaluates the statistical significance of the abundance of genomic lesions that overlap a specific locus or a pre-defined set of biologically related loci. The GRIN model retains certain biologically important properties of genomic lesions that are ignored by other methods. In a simulation study and two example analyses of leukemia genomic lesion data, GRIN more effectively identified important loci as significant than did three methods based on a permutation-of-markers model. GRIN also identified biologically relevant pathways with a significant abundance of lesions in both examples. Availability: An R package will be freely available at CRAN and www.stjuderesearch.org/site/depts/biostats/software.
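The abstract does not give the GRIN formulas, so the following Python sketch only illustrates the general notion of a random interval model under simplifying assumptions that are ours, not the authors'. Each lesion keeps its observed length, its position is assumed uniform along a single chromosome, and subjects are assumed independent with at most one lesion each. The function names are hypothetical.

```python
def overlap_probability(lesion_len, locus_start, locus_end, chrom_len):
    """P(a fixed-length lesion placed uniformly on the chromosome overlaps the locus)."""
    if lesion_len >= chrom_len:
        return 1.0
    max_start = chrom_len - lesion_len  # lesion start is uniform on [0, max_start]
    lo = max(locus_start - lesion_len, 0.0)  # earliest start that still reaches the locus
    hi = min(locus_end, max_start)  # latest start that still touches the locus
    return max(hi - lo, 0.0) / max_start


def tail_probability(hit_probs, observed_hits):
    """P(at least `observed_hits` hits) for independent, non-identical Bernoulli trials."""
    dist = [1.0]  # dist[k] = probability of exactly k hits so far
    for p in hit_probs:
        new = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            new[k] += mass * (1.0 - p)  # this subject's lesion misses the locus
            new[k + 1] += mass * p  # this subject's lesion overlaps the locus
        dist = new
    return sum(dist[observed_hits:])
```

Under these assumptions, longer lesions have a larger chance of overlapping a locus, so a hit from a short focal lesion counts as stronger evidence than a hit from a chromosome-arm-sized one. This is one example of a lesion property that a marker-permutation approach would not retain.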


Clinical Cancer Research | 2016

Prognostic Significance of Major Histocompatibility Complex Class II Expression in Pediatric Adrenocortical Tumors: A St. Jude and Children's Oncology Group Study

Emilia M. Pinto; Carlos Rodriguez-Galindo; John K. Choi; Stanley Pounds; Zhifa Liu; Geoffrey Neale; David Finkelstein; John Hicks; Alberto S. Pappo; Bonald C. Figueiredo; Raul C. Ribeiro; Gerard P. Zambetti

Purpose: Histologic markers that differentiate benign and malignant pediatric adrenocortical tumors are lacking. Previous studies have implicated an association of MHC class II expression with adrenocortical tumor prognosis. Here, we determined the expression of MHC class II as well as the cell of origin of these immunologic markers in pediatric adrenocortical tumor. The impact of MHC class II gene expression on outcome was determined in a cohort of uniformly treated children with adrenocortical carcinomas. Experimental Design: We analyzed the expression of MHC class II and a selected cluster of differentiation genes in 63 pediatric adrenocortical tumors by Affymetrix Human U133 Plus 2.0 or HT HG-U133+PM gene chip analyses. Cells expressing MHC class II were identified by morphologic and immunohistochemical assays. Results: MHC class II expression was significantly greater in adrenocortical adenomas than in carcinomas (P = 4.8 × 10⁻⁶) and was associated with a higher progression-free survival (PFS) estimate (P = 0.003). Specifically, HLA-DPA1 expression was most significantly associated with PFS after adjustment for tumor weight and stage. HLA-DPA1 was predominantly expressed by hematopoietic infiltrating cells and undetectable in tumor cells in 23 of 26 cases (88%). Conclusions: MHC class II expression, which is produced by tumor-infiltrating immune cells, is an indicator of disease aggressiveness in pediatric adrenocortical tumor. Our results suggest that immune responses modulate adrenocortical tumorigenesis and may allow the refinement of risk stratification and treatment for this disease. Clin Cancer Res; 22(24); 6247–55. ©2016 AACR.


Annals of Human Genetics | 2015

SVSI: Fast and Powerful Set‐Valued System Identification Approach to Identifying Rare Variants in Sequencing Studies for Ordered Categorical Traits

Wenjian Bi; Guolian Kang; Yanlong Zhao; Yuehua Cui; Song Yan; Yun Li; Cheng Cheng; Stanley Pounds; Michael J. Borowitz; Mary V. Relling; Jun Yang; Zhifa Liu; Ching-Hon Pui; Stephen P. Hunger; Christine Hartford; Wing Leung; Ji-Feng Zhang

In genetic association studies of an ordered categorical phenotype, it is usual to either regroup multiple categories of the phenotype into two categories and then apply logistic regression (LG), or apply ordered logistic (oLG) or ordered probit (oPRB) regression, which accounts for the ordinal nature of the phenotype. However, these methods may lose statistical power or fail to control the type I error rate because of their model assumptions and/or unstable parameter estimation algorithms when the genetic variant is rare or the sample size is limited. To solve this problem, we propose a set‐valued (SV) system model to identify genetic variants associated with an ordinal categorical phenotype. We couple this model with an SV system identification algorithm to identify all the key system parameters. Simulations and two real data analyses show that SV and LG accurately controlled the type I error rate even at a significance level of 10⁻⁶, whereas oLG and oPRB did not in some cases. LG had significantly less power than the other three methods because it disregards the ordinal nature of the phenotype, and SV had similar or greater power than oLG and oPRB. We argue that SV should be employed in genetic association studies of ordered categorical phenotypes.


BMC Bioinformatics | 2013

Feature selection and prediction with a Markov blanket structure learning algorithm

Yuan-zhen Tan; Zhifa Liu

Background: Classification and prediction are common tasks in machine learning. For example, many studies have attempted to predict gene expression given information, such as DNA sequence, expression of other genes or epigenetic modifications. Many existing methods, such as neural networks and support vector machines, have been used to make these predictions. Unfortunately, these black box techniques offer little insight into the reasoning behind the predictions. In many cases, relatively few attributes contribute to the classification accuracy. Bayesian networks explicitly encode the relationships among attributes to make predictions. In a Bayesian network, the Markov blanket (MB) of the class variable gives all of the


BMC Bioinformatics | 2015

Dunn Index Bootstrap (DIBS): A procedure to empirically select a cluster analysis method that identifies biologically and clinically relevant molecular disease subgroups

Iwona Pawlikowska; Zhifa Liu; Lei Shi; Tong Lin; Tanja A. Gruber; Giles W. Robinson; Arzu Onar-Thomas; Stan Pounds

Background: Cluster analysis is widely used in cancer research to discover molecular subgroups that inform subsequent laboratory investigations and define risk classification criteria for subsequent clinical trials. However, for any data set, there are a very large number of candidate cluster analysis methods (CCAMs) due to the many choices for feature selection criteria, number of selected features, number of clusters to define, etc. Frequently, a specific CCAM is chosen without quantifying the validity of its results in terms of reproducibility or distinctiveness of the reported subgroups.
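The excerpt above does not define the Dunn index or the bootstrap step named in the title. As background only, here is a minimal Python sketch of the classical Dunn index, one common way to quantify how distinct a set of clusters is: the smallest distance between points in different clusters divided by the largest distance between points in the same cluster. The Euclidean metric and the function name `dunn_index` are assumptions for illustration, and the variant used by DIBS may differ.

```python
import math


def dunn_index(points, labels):
    """Smallest between-cluster distance divided by largest within-cluster diameter."""
    min_between = math.inf
    max_within = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])  # Euclidean distance (Python 3.8+)
            if labels[i] == labels[j]:
                max_within = max(max_within, d)
            else:
                min_between = min(min_between, d)
    if max_within == 0.0 or math.isinf(min_between):
        raise ValueError("need two or more clusters and a cluster with two distinct points")
    return min_between / max_within
```

Higher values indicate compact, well-separated clusters, so competing clusterings of the same data can be ranked by this score.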


BMC Bioinformatics | 2013

Our strategy to achieve and document reproducible computing

Nisrine Enyinda; Zhifa Liu; Areg Negatu; Stan Pounds

Materials and methods: Here, we describe our three-component strategy to achieve and document permanently reproducible computing in our research environment. First, we use the Sweave literate programming infrastructure to embed R code and report text in the same file. Sweave performs the specified calculations in R, inserts those results directly into a LaTeX typesetting command file, and finally compiles the LaTeX typesetting file into a PDF file. Thus, a Sweave file internally documents the top-level R code that produces the reported results. However, a Sweave report does not retain its reproducibility if the input data files and lower-level R code are modified later. Therefore, as the second component of our strategy, we developed the Igloo system to archive and freeze files for permanent reproducibility. The Igloo system requests that the user document every file that is transferred to a frozen archive. The Igloo system freezes the files in an archive with a directory structure that annotates the files by research team (leukemia, brain tumor, etc.) and category (code file, type of data file, etc.). The archive directory is visible in our Windows and Linux high-performance computing environments and has permission controls to ensure appropriate access to the files. However, neither Sweave nor Igloo assists with the cumbersome task of identifying specific input files that should be frozen to ensure permanent reproducibility. As the third component of our strategy, we developed the R package rctrack that computationally tracks the accession and generation of files by an R analysis program. The rctrack package defines a function that identifies files that need to be frozen in order to ensure permanent reproducibility. Additionally, rctrack provides mechanisms to track and document the usage of other software for some calculations. Finally, the rctrack package defines a function that generates a Sweave appendix with details regarding the input data and code files and their impact on the reproducibility of the report.


BMC Bioinformatics | 2013

A powerful association for comorbidity analysis based on score based test

Zhifa Liu

Background: In studies of mental and behavioral disorders, comorbidity is an important issue, since multiple correlated disorders are usually recorded to understand the etiology of substance dependence, which is imperative to the development of effective treatment and prevention strategies. Many studies have reported comorbidity between substance abuse disorders and psychiatric disorders such as anxiety and major depression. To investigate the association between the comorbidity of complex diseases and genetic loci, it is critical to develop a computationally efficient and powerful multiple-trait association test. Recently, taking advantage of high-throughput genomic data, genome-wide association studies (GWAS) have identified many genetic variants for individual drug addictions. Despite many successes, current GWAS may be insufficient to detect genetic variants with moderate-to-small effects, since a stringent significance threshold is used to control the false discovery rate, which is a key factor in the missing heritability problem. Materials and methods: To detect novel genetic variants, jointly analyzing correlated traits is a promising solution. First, jointly analyzing correlated traits may increase the power to detect genetic variants with moderate effects across multiple traits by exploiting the correlation between traits. Second, joint analysis can alleviate the multiple comparison problems incurred when analyzing individual traits separately. Comprehensive studies have demonstrated that jointly testing correlated traits is more powerful than testing a single trait at a time. Therefore, it is important to consider the comorbidity of correlated traits in order to detect novel genetic loci in GWAS. We introduce a multiple-trait association test based on least squares. The proposed method requires specifying only the marginal distribution of the multiple traits. We systematically investigate the strengths and weaknesses of the multiple-trait association test through simulation and comprehensive real data analysis. We demonstrate the advantage of the multiple-trait association test when multiple traits share common genetic variation. However, when multiple traits share no or only weak common genetic variation, the multiple-trait association test has no advantage.

Collaboration


Dive into Zhifa Liu's collaborations.

Top Co-Authors

Stan Pounds, St. Jude Children's Research Hospital
Jinghui Zhang, St. Jude Children's Research Hospital
Stanley Pounds, St. Jude Children's Research Hospital
Alberto S. Pappo, St. Jude Children's Research Hospital
Carlos Rodriguez-Galindo, St. Jude Children's Research Hospital
Cheng Cheng, St. Jude Children's Research Hospital
David Finkelstein, St. Jude Children's Research Hospital
Elaine R. Mardis, Nationwide Children's Hospital
Emilia M. Pinto, St. Jude Children's Research Hospital
Gang Wu, St. Jude Children's Research Hospital