Pora Kim
University of Texas Health Science Center at Houston
Publication
Featured research published by Pora Kim.
Nucleic Acids Research | 2016
Min Zhao; Pora Kim; Ramkrishna Mitra; Junfei Zhao; Zhongming Zhao
Tumor suppressor genes (TSGs) are a major type of gatekeeper genes in cell growth. A knowledgebase with the systematic collection and curation of TSGs in multiple cancer types is critically important for further studying their biological functions as well as for developing therapeutic strategies. Since its development in 2012, the Tumor Suppressor Gene database (TSGene) has become a popular resource in the cancer research community. Here, we report TSGene version 2.0, which has substantial updates of contents (e.g. up-to-date literature and pan-cancer genomic data collection and curation), data types (noncoding RNAs and protein-coding genes) and content accessibility. Specifically, the current TSGene 2.0 contains 1217 human TSGs (1018 protein-coding and 199 non-coding genes) curated from over 9000 articles. Additionally, TSGene 2.0 provides thousands of expression and mutation patterns derived from pan-cancer data of The Cancer Genome Atlas. A new web interface is available at http://bioinfo.mc.vanderbilt.edu/TSGene/. Systematic analyses of 199 non-coding TSGs provide numerous cancer-specific non-coding mutational events for further screening and clinical use. Intriguingly, we identified 49 protein-coding TSGs that were consistently down-regulated in 11 cancer types. In summary, TSGene 2.0, which is the only available database for TSGs, provides the most up-to-date collection of TSGs and their pan-cancer features.
Nucleic Acids Research | 2016
Pora Kim; Feixiong Cheng; Junfei Zhao; Zhongming Zhao
Accumulating evidence has demonstrated that rewiring of metabolism in cells is an important hallmark of cancer. Metabolic disorder has been estimated to kill 30% of advanced-stage cancer patients. Thus, a systematic annotation of cancer cell metabolism genes is imperative. Here, we present ccmGDB (Cancer Cell Metabolism Gene DataBase), a comprehensive annotation database for cell metabolism genes in cancer, available at http://bioinfo.mc.vanderbilt.edu/ccmGDB. We assembled, curated, and integrated genetic, genomic, transcriptomic, proteomic, biological network and functional information for over 2000 cell metabolism genes in more than 30 cancer types. In total, we integrated over 260 000 somatic alterations including non-synonymous mutations, copy number variants and structural variants. We also integrated RNA-Seq data in various primary tumors, gene expression microarray data in over 1000 cancer cell lines and protein expression data. Furthermore, we constructed cancer or tissue type-specific, gene co-expression based protein interaction networks and drug-target interaction networks. Using these systematic annotations, the ccmGDB portal site provides 6 categories: gene summary, phenotypic information, somatic mutations, gene and protein expression, gene co-expression network and drug pharmacological information, with a user-friendly interface for browsing and searching. ccmGDB is developed and maintained as a useful resource for the cancer research community.
Nucleic Acids Research | 2016
Myunggyo Lee; Kyubum Lee; Namhee Yu; Insu Jang; Ikjung Choi; Pora Kim; Ye Eun Jang; Byounggun Kim; Sunkyu Kim; Byungwook Lee; Jaewoo Kang; Sanghyuk Lee
Fusion genes are an important class of therapeutic targets and prognostic markers in cancer. ChimerDB is a comprehensive database of fusion genes encompassing analysis of deep sequencing data and manual curation. In this update, the database coverage was enhanced considerably by adding two new modules: The Cancer Genome Atlas (TCGA) RNA-Seq analysis and PubMed abstract mining. ChimerDB 3.0 is composed of three modules: ChimerKB, ChimerPub and ChimerSeq. ChimerKB is a knowledgebase of 1066 manually curated fusion genes compiled from public resources of fusion genes with experimental evidence. ChimerPub includes 2767 fusion genes obtained from text mining of PubMed abstracts. The ChimerSeq module is designed to archive the fusion candidates from deep sequencing data. Importantly, we have analyzed RNA-Seq data of the TCGA project covering 4569 patients in 23 cancer types using two reliable programs, FusionScan and TopHat-Fusion. The new user interface supports diverse search options and graphic representation of fusion gene structure. ChimerDB 3.0 is available at http://ercsb.ewha.ac.kr/fusiongene/.
Methods | 2015
Timothy D. O’Brien; Peilin Jia; Junfeng Xia; Uma Saxena; Hailing Jin; Huy Vuong; Pora Kim; Qingguo Wang; Martin J. Aryee; Mari Mino-Kenudson; Jeffrey A. Engelman; Long P. Le; A. John Iafrate; Rebecca S. Heist; William Pao; Zhongming Zhao
Whole exome sequencing (WES) and RNA sequencing (RNA-Seq) are two main platforms used for next-generation sequencing (NGS). While WES is primarily for DNA variant discovery and RNA-Seq is mainly for measurement of gene expression, both can be used for detection of genetic variants, especially single nucleotide variants (SNVs). How consistently variants can be detected from WES and RNA-Seq has not been systematically evaluated. In this study, we examined the technical and biological inconsistencies in SNV detection using WES and RNA-Seq data from 27 pairs of tumor and matched normal samples. We analyzed SNVs in three categories: WES unique (those only detected in WES), RNA-Seq unique (those only detected in RNA-Seq), and shared (those detected in both). We found a small overlap (average ∼14%) between the SNVs called in WES and RNA-Seq. The WES unique SNVs were mainly due to low coverage, low expression, or their location on the non-transcribed strand in RNA-Seq data, while the RNA-Seq unique SNVs were primarily due to their location outside the WES-capture boundary regions (accounting for ∼71%), as well as low coverage of the regions, low coverage of the mutant alleles or RNA-editing. The shared SNVs had high locus-specific coverage in both WES and RNA-Seq and high gene expression levels. Additionally, WES unique and RNA-Seq unique SNVs showed different nucleotide substitution patterns, e.g., ∼55% of RNA-Seq unique variants were A:T→G:C, a hallmark of RNA editing. This study provides an important evaluation of the inconsistencies of somatic SNVs called in WES and RNA-Seq data.
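The three SNV categories described in this abstract amount to set operations on two call sets. The following is a minimal sketch, not the authors' pipeline: the `Snv` record, the function names, and the sample calls are hypothetical, and the overlap is computed here as shared calls divided by the union of both call sets, which is an assumption because the abstract does not state the denominator.

```python
from typing import Dict, Iterable, NamedTuple, Set


class Snv(NamedTuple):
    """Hypothetical SNV key: chromosome, 1-based position, ref and alt alleles."""
    chrom: str
    pos: int
    ref: str
    alt: str


def categorize_snvs(wes_calls: Iterable[Snv],
                    rnaseq_calls: Iterable[Snv]) -> Dict[str, Set[Snv]]:
    """Split calls into WES unique, RNA-Seq unique, and shared sets."""
    wes, rna = set(wes_calls), set(rnaseq_calls)
    return {
        "wes_unique": wes - rna,      # called only in WES
        "rnaseq_unique": rna - wes,   # called only in RNA-Seq
        "shared": wes & rna,          # called in both platforms
    }


def overlap_fraction(categories: Dict[str, Set[Snv]]) -> float:
    """Shared calls over the union of all calls (assumed definition)."""
    total = sum(len(calls) for calls in categories.values())
    return len(categories["shared"]) / total if total else 0.0


# Made-up calls for illustration only.
wes = [Snv("chr1", 100, "A", "G"), Snv("chr2", 200, "C", "T"),
       Snv("chr3", 300, "G", "A")]
rna = [Snv("chr2", 200, "C", "T"), Snv("chr4", 400, "A", "G")]

cats = categorize_snvs(wes, rna)
print({name: len(calls) for name, calls in cats.items()})
print(overlap_fraction(cats))  # 1 shared of 4 distinct calls -> 0.25
```

Keying each variant on chromosome, position, and both alleles means two calls at the same locus with different alternate alleles are treated as distinct, which is one reasonable choice among several.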
Nucleic Acids Research | 2018
Pora Kim; Aekyung Park; Guangchun Han; Hua Sun; Peilin Jia; Zhongming Zhao
Tissue-specific gene expression is critical in understanding biological processes, physiological conditions, and disease. The identification and appropriate use of tissue-specific genes (TissGenes) will provide important insights into disease mechanisms and organ-specific therapeutic targets. To better understand the tissue-specific features for each cancer type and to advance the discovery of clinically relevant genes or mutations, we built TissGDB (Tissue specific Gene DataBase in cancer) available at http://zhaobioinfo.org/TissGDB. We collected and curated 2461 tissue specific genes (TissGenes) across 22 tissue types that matched the 28 cancer types of The Cancer Genome Atlas (TCGA) from three representative tissue-specific gene expression resources: The Human Protein Atlas (HPA), Tissue-specific Gene Expression and Regulation (TiGER), and Genotype-Tissue Expression (GTEx). For these 2461 TissGenes, we performed gene expression, somatic mutation, and prognostic marker-based analyses across 28 cancer types using TCGA data. Our analyses identified hundreds of TissGenes, including genes that universally kept or lost tissue-specific gene expression, with other features: cancer type-specific isoform expression, fusion with oncogenes or tumor suppressor genes, and markers for protective or risk prognosis. TissGDB provides seven categories of annotations: TissGeneSummary, TissGeneExp, TissGene-miRNA, TissGeneMut, TissGeneNet, TissGeneProg, TissGeneClin.
Nucleic Acids Research | 2017
Pora Kim; Junfei Zhao; Pinyi Lu; Zhongming Zhao
Mutations at the ligand binding sites (LBSs) can influence protein structure stability, binding affinity with small molecules, and drug resistance in cancer patients. Our recent analysis revealed that ligand binding residues had a significantly higher mutation rate than other parts of the protein. Here, we built mutLBSgeneDB (mutated Ligand Binding Site gene DataBase) available at http://zhaobioinfo.org/mutLBSgeneDB. We collected and curated over 2300 genes (mutLBSgenes) having ∼12 000 somatic mutations at ∼10 000 LBSs across 16 cancer types and selected 744 drug targetable genes (targetable_mutLBSgenes) by incorporating kinases, transcription factors, pharmacological genes, and cancer driver genes. We analyzed LBS mutation information, differential gene expression network, drug response correlation with gene expression, and protein stability changes for all mutLBSgenes using integrated genetic, genomic, transcriptomic, proteomic, network and functional information. We calculated and compared the binding affinities of 20 carefully selected genes with their drugs in wild type and mutant forms. mutLBSgeneDB provides a user-friendly web interface for searching and browsing through seven categories of annotations: Gene summary, Mutated information, Protein structure related information, Differential gene expression and gene-gene network, Phenotype information, Pharmacological information, and Conservation information. mutLBSgeneDB provides a useful resource for functional genomics, protein structure, drug and disease research communities.
Neuro-oncology | 2018
Ae Kyung Park; Pora Kim; Leomar Y. Ballester; Yoshua Esquenazi; Zhongming Zhao
Background: High heterogeneity and activation of multiple oncogenic pathways have been implicated in the failure of targeted therapies in glioblastoma (GBM).
Methods: Using The Cancer Genome Atlas data, we identified subtype-specific prognostic core genes by a combined approach of genome-wide Cox regression and Gene Set Enrichment Analysis. The results were validated with 8 combined public datasets containing 608 GBMs. We further examined prognostic chromosome aberrations and mutations.
Results: In classical and mesenchymal subtypes, 2 receptor tyrosine kinases (RTKs) (MET and IGF1R), and the genes in RTK downstream pathways such as phosphatidylinositol-3 kinase (PI3K)/Akt/mammalian target of rapamycin (mTOR), and nuclear factor-kappaB (NF-kB), were commonly detected as prognostic core genes. Classical subtype-specific prognostic core genes included those in cell cycle, DNA repair, and the Janus kinase/signal transducers and activators of transcription (JAK-STAT) pathway. Immune-related genes were enriched in the prognostic genes showing negative promoter cytosine-phosphate-guanine (CpG) methylation/expression correlations. Mesenchymal subtype-specific prognostic genes were those related to mesenchymal cell movement, PI3K/Akt, mitogen-activated protein kinase (MAPK)/extracellular signal-regulated kinase (ERK), Wnt/β-catenin, and Wnt/Ca2+ pathways. In copy number alterations and mutations, 6p loss and TP53 mutation were associated with poor and good survival, respectively, in the classical subtype. In the mesenchymal subtype, patients with PIK3R1 or PCLO mutations showed poor prognosis. In the glioma CpG island methylator phenotype (G-CIMP) subtype, patients harboring 10q loss, 12p gain, or 14q loss exhibited poor survival. Furthermore, 10q loss was significantly associated with the recently recognized G-CIMP subclass showing relatively low CpG methylation and poor prognosis.
Conclusion: These subtype-specific alterations have promising potential as new prognostic biomarkers and therapeutic targets combined with surrogate markers of GBM subtypes. However, considering the small number of events, the results of copy number alterations and mutations require further validation.
Briefings in Bioinformatics | 2018
Hua Sun; Pora Kim; Peilin Jia; Ae Kyung Park; Han Liang; Zhongming Zhao
Testicular germ cell tumors (TGCTs) are classified into two main subtypes, seminoma (SE) and non-seminoma (NSE), but their molecular distinctions remain largely unexplored. Here, we used expression data for mRNAs and microRNAs (miRNAs) from The Cancer Genome Atlas (TCGA) to perform a systematic investigation to explain the different telomere length (TL) features between NSE (n = 48) and SE (n = 55). We found that TL elongation was dominant in NSE, whereas TL shortening prevailed in SE. We further showed that both mRNA and miRNA expression profiles could clearly distinguish these two subtypes. Notably, four telomere-related genes (TelGenes) showed significantly higher expression in NSE than in SE and were positively correlated with telomere elongation: three telomerase activity-related genes (TERT, WRAP53 and MYC) and an independent telomerase activity gene (ZSCAN4). We also found that the expression of genes encoding Yamanaka factors was positively correlated with telomere lengthening in NSE. Among them, SOX2 and MYC were highly expressed in NSE versus SE, while POU5F1 and KLF4 had the opposite patterns. These results suggested that enhanced expression of both TelGenes (TERT, WRAP53, MYC and ZSCAN4) and Yamanaka factors might induce telomere elongation in NSE. Conversely, the relative lack of telomerase activation and low expression of the independent telomerase activity pathway during cell division may contribute to telomere shortening in SE. Taken together, our results revealed the potential molecular profiles and regulatory roles involved in the TL difference between NSE and SE, and provided a better molecular understanding of this complex disease.
Oncotarget | 2017
Pora Kim; Leomar Y. Ballester; Zhongming Zhao
Genomic rearrangements involving transcription factors (TFs) can form fusion proteins, resulting in enhanced, weakened, or even lost TF activity. Functional domain (FD) retention is a critical factor in the activity of transcription factor fusion genes (TFFGs). A systematic investigation of FD retention in TFFGs and their outcome (e.g. expression changes) in a pan-cancer study has not yet been completed. Here, we examined the FD retention status in 386 TFFGs across 13 major cancer types and identified 83 TFFGs involving 67 TFs that retained FDs. To measure the potential biological relevance of TFs in TFFGs, we introduced a Major Active Isofusion Index (MAII) and built a prioritized TFFG network using MAII scores and the observed frequency of fusion positive samples. Interestingly, the four TFFGs (PML-RARA, RUNX1-RUNX1T1, TMPRSS2-ERG, and SFPQ-TFE3) with the highest MAII scores showed 50 differentially expressed target genes (DETGs) in fusion-positive versus fusion-negative cancer samples. DETG analysis revealed that they were involved in tumorigenesis-related processes in each cancer type. PLAU, which encodes plasminogen activator urokinase and serves as a biomarker for tumor invasion, was found to be consistently activated in the samples with the highest MAII scores. Among the 50 DETGs, 21 were drug targetable genes. Fourteen of these 21 DETGs were expressed in acute myeloid leukemia (AML) samples. Accordingly, we constructed an AML-specific TFFG network, which included 38 DETGs in RUNX1-RUNX1T1 or PML-RARA positive samples. In summary, this study revealed several TFFGs and their potential target genes, and provided insights into the clinical implications of TFFGs.
Briefings in Bioinformatics | 2016
Pora Kim; Peilin Jia; Zhongming Zhao