Publication


Featured research published by Hae Kyung Im.


RNA Biology | 2011

Population differences in microRNA expression and biological implications.

R. Stephanie Huang; Eric R. Gamazon; Dana Ziliak; Yujia Wen; Hae Kyung Im; Wei Zhang; Claudia Wing; Shiwei Duan; Wasim K. Bleibel; Nancy J. Cox; M. Eileen Dolan

Population differences observed for complex traits may be attributed to the combined effect of socioeconomic, environmental, genetic and epigenetic factors. To better understand population differences in complex traits, genome-wide genetic and gene expression differences among ethnic populations have been studied. Here we set out to evaluate population differences in small non-coding RNAs through an evaluation of microRNA (miRNA) baseline expression in HapMap lymphoblastoid cell lines (LCLs) derived from 53 CEU (Utah residents with northern and western European ancestry) and 54 YRI (African from Ibadan, Nigeria). Using the Exiqon miRCURY™ LNA arrays, we found that 16% of all miRNAs evaluated in our study differ significantly between these 2 ethnic groups (Bonferroni-corrected p < 0.05). Furthermore, we explored the potential biological function of these differentially expressed miRNAs by comprehensively examining their effect on the transcriptome and their relationship with cellular drug-sensitivity phenotypes. After multiple testing adjustment (false discovery rate (FDR) < 0.1), we found that 55% and 88% of the differentially expressed miRNAs were significantly and inversely correlated with an mRNA expression phenotype in the CEU and YRI samples, respectively. Interestingly, a substantial proportion (64%) of these miRNAs correlated with cellular sensitivity to chemotherapeutic agents (FDR < 0.05). Lastly, upon performing a genome-wide association study between SNPs and miRNA expression, we identified a large number of SNPs exhibiting different allele frequencies that affect the expression of these differentially expressed miRNAs, suggesting the role of genetic variants in mediating the observed population differences.
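The abstract above relies on two multiple-testing criteria: a Bonferroni-corrected p-value threshold and a false discovery rate (FDR) threshold. As a minimal sketch of how the two differ, the code below applies a Bonferroni cut-off and the Benjamini-Hochberg step-up procedure to a list of p-values. The abstract does not say which FDR procedure the authors used, so Benjamini-Hochberg is an assumption here, and the function names and example p-values are purely illustrative.

```python
def bonferroni(pvals, alpha=0.05):
    """Flag tests whose p-value is below alpha divided by the number of tests."""
    m = len(pvals)
    return [p < alpha / m for p in pvals]


def benjamini_hochberg(pvals, q=0.1):
    """Flag tests rejected by the Benjamini-Hochberg step-up procedure at FDR level q.

    Assumption: BH is used only as a common FDR procedure; the source does not
    name the method behind its FDR thresholds.
    """
    m = len(pvals)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    # Find the largest rank k such that p_(k) <= (k / m) * q.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * q:
            k_max = rank
    # Reject every hypothesis whose rank is at most k_max.
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            rejected[i] = True
    return rejected


# Made-up p-values, for illustration only.
example_pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.065, 0.074, 0.205, 0.212, 0.216]
strict = bonferroni(example_pvals, alpha=0.05)
lenient = benjamini_hochberg(example_pvals, q=0.1)
```

On these illustrative values the Bonferroni rule keeps far fewer tests than the FDR rule, which is why the choice of criterion matters when reporting the fraction of differentially expressed miRNAs.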


PLOS ONE | 2010

ExprTarget: An Integrative Approach to Predicting Human MicroRNA Targets

Eric R. Gamazon; Hae Kyung Im; Shiwei Duan; Yves A. Lussier; Nancy J. Cox; M. Eileen Dolan; Wei Zhang

Variation in gene expression has been observed in natural populations and associated with complex traits or phenotypes such as disease susceptibility and drug response. Gene expression itself is controlled by various genetic and non-genetic factors. The binding of a class of small RNA molecules, microRNAs (miRNAs), to mRNA transcript targets has recently been demonstrated to be an important mechanism of gene regulation. Because individual miRNAs may regulate the expression of multiple gene targets, a comprehensive and reliable catalogue of miRNA-regulated targets is critical to understanding gene regulatory networks. Though experimental approaches have been used to identify many miRNA targets, due to cost and efficiency, current miRNA target identification still relies largely on computational algorithms that aim to take advantage of different biochemical/thermodynamic properties of the sequences of miRNAs and their gene targets. A novel approach, ExprTarget, therefore, is proposed here to integrate some of the most frequently invoked methods (miRanda, PicTar, TargetScan) as well as the genome-wide HapMap miRNA and mRNA expression datasets generated in our laboratory. To our knowledge, this dataset constitutes the first miRNA expression profiling in the HapMap lymphoblastoid cell lines. We conducted diagnostic tests of the existing computational solutions using the experimentally supported targets in TarBase as gold standard. To gain insight into the biases that arise from such an analysis, we investigated the effect of the choice of gold standard on the evaluation of the various computational tools. We analyzed the performance of ExprTarget using both ROC curve analysis and cross-validation. We show that ExprTarget greatly improves miRNA target prediction relative to the individual prediction algorithms in terms of sensitivity and specificity. 
We also developed an online database, ExprTargetDB, of human miRNA targets predicted by our approach that integrates gene expression profiling into a broader framework involving important features of miRNA target site predictions.
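The ROC curve analysis mentioned in the ExprTarget abstract compares prediction scores against a gold standard of supported targets. A minimal sketch of the area under the ROC curve (AUC), computed as the probability that a randomly chosen true target scores higher than a randomly chosen non-target, is shown below. The function name, scores, and labels are hypothetical and are not taken from ExprTarget or TarBase.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    scores: prediction scores (higher means more likely to be a target).
    labels: 1 for gold-standard targets, 0 for non-targets.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count a win when the positive outranks the negative, half a win for ties.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Illustrative scores for four candidate miRNA-mRNA pairs.
auc = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so this single number gives one way to compare an integrated score against each individual prediction algorithm on the same gold standard.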


American Journal of Human Genetics | 2012

On Sharing Quantitative Trait GWAS Results in an Era of Multiple-omics Data and the Limits of Genomic Privacy

Hae Kyung Im; Eric R. Gamazon; Dan L. Nicolae; Nancy J. Cox

Recent advances in genome-scale, system-level measurements of quantitative phenotypes (transcriptome, metabolome, and proteome) promise to yield unprecedented biological insights. In this environment, broad dissemination of results from genome-wide association studies (GWASs) or deep-sequencing efforts is highly desirable. However, summary results from case-control studies (allele frequencies) have been withdrawn from public access because it has been shown that they can be used for inferring participation in a study if the individual's genotype is available. A natural question that follows is how much private information is contained in summary results from quantitative trait GWAS such as regression coefficients or p values. We show that regression coefficients for many SNPs can reveal a person's participation and, for participants, his or her phenotype with high accuracy. Our power calculations show that regression coefficients contain as much information on individuals as allele frequencies do, if the person's phenotype is rather extreme or if multiple phenotypes are available, as has been increasingly facilitated by the use of multiple-omics data sets. These findings emphasize the need to devise a mechanism that allows data sharing that will facilitate scientific progress without sacrificing privacy protection.


PLOS Genetics | 2015

Identification and Functional Characterization of G6PC2 Coding Variants Influencing Glycemic Traits Define an Effector Transcript at the G6PC2-ABCB11 Locus

Anubha Mahajan; Xueling Sim; Hui Jin Ng; Alisa K. Manning; Manuel A. Rivas; Heather M Highland; Adam E. Locke; Niels Grarup; Hae Kyung Im; Pablo Cingolani; Jason Flannick; Pierre Fontanillas; Christian Fuchsberger; Kyle J. Gaulton; Tanya M. Teslovich; N. William Rayner; Neil R. Robertson; Nicola L. Beer; Jana K. Rundle; Jette Bork-Jensen; Claes Ladenvall; Christine Blancher; David Buck; Gemma Buck; Noël P. Burtt; Stacey Gabriel; Anette P. Gjesing; Christopher J. Groves; Mette Hollensted; Jeroen R. Huyghe

Genome-wide association studies (GWAS) for fasting glucose (FG) and insulin (FI) have identified common variant signals which explain 4.8% and 1.2% of trait variance, respectively. It is hypothesized that low-frequency and rare variants could contribute substantially to unexplained genetic variance. To test this, we analyzed exome-array data from up to 33,231 non-diabetic individuals of European ancestry. We found exome-wide significant (P < 5×10⁻⁷) evidence for two loci not previously highlighted by common variant GWAS: GLP1R (p.Ala316Thr, minor allele frequency (MAF) = 1.5%) influencing FG levels, and URB2 (p.Glu594Val, MAF = 0.1%) influencing FI levels. Coding variant associations can highlight potential effector genes at (non-coding) GWAS signals. At the G6PC2/ABCB11 locus, we identified multiple coding variants in G6PC2 (p.Val219Leu, p.His177Tyr, and p.Tyr207Ser) influencing FG levels, conditionally independent of each other and the non-coding GWAS signal. In vitro assays demonstrate that these associated coding alleles result in reduced protein abundance via proteasomal degradation, establishing G6PC2 as an effector gene at this locus. Reconciliation of single-variant associations and functional effects was only possible when haplotype phase was considered. In contrast to earlier reports suggesting that, paradoxically, glucose-raising alleles at this locus are protective against type 2 diabetes (T2D), the p.Val219Leu G6PC2 variant displayed a modest but directionally consistent association with T2D risk. Coding variant associations for glycemic traits in GWAS signals highlight PCSK1, RREB1, and ZHX3 as likely effector transcripts. These coding variant association signals do not have a major impact on the trait variance explained, but they do provide valuable biological insights.


Translational Research | 2011

Germline Polymorphisms Discovered via a Cell-based Genome-wide Approach Predict Platinum Response in Head and Neck Cancers

Dana Ziliak; Peter H. O'Donnell; Hae Kyung Im; Eric R. Gamazon; Peixian Chen; Shannon M. Delaney; Sunita J. Shukla; Soma Das; Nancy J. Cox; Everett E. Vokes; Ezra E.W. Cohen; M. Eileen Dolan; R. Stephanie Huang

Identifying patients prior to treatment who are more likely to benefit from chemotherapeutic agents or more likely to experience adverse events is an aim of personalized medicine. Pharmacogenomics offers a potential means of achieving this goal through the discovery of predictive germline genetic biomarkers. When applied particularly to the treatment of head and neck cancers, such information could offer significant benefit to patients as a means of potentially reducing morbidity associated with platinum-based chemotherapy. We developed a genome-wide, cell-based approach to identify single nucleotide polymorphisms (SNPs) associated with platinum susceptibility and then evaluated these SNPs as predictors for response and toxicity in head and neck cancer patients treated with platinum-based therapy as part of a phase II clinical trial. Sixty head and neck cancer patients were evaluated. Of 45 genome-wide SNPs examined, we found that 2 SNPs, rs6870861 (P = 0.004; false discovery rate [FDR] < 0.05) and rs2551038 (P = 0.005; FDR < 0.05), were associated significantly with overall response to carboplatin-based induction chemotherapy when incorporated into a model along with total carboplatin exposure. Interestingly, these 2 SNPs are associated strongly with the baseline expression of >20 genes (all P ≤ 10⁻⁴), and 2 of these genes (SLC22A5 and SLCO4C1) are important organic cation/anion transporters known to affect platinum uptake and clearance. Several other SNPs were associated nominally with carboplatin-related hematologic toxicities. These findings demonstrate importantly that a genome-wide, cell-based model can identify novel germline genetic biomarkers of platinum susceptibility, which are replicable in a clinical setting with treated cancer patients and seem clinically meaningful for potentially enabling future personalization of care in such patients.


bioRxiv | 2016

Integrating tissue specific mechanisms into GWAS summary results

Alvaro N. Barbeira; Scott P. Dickinson; Jason Torres; Eric S. Torstenson; Jiamao Zheng; Heather E. Wheeler; Kaanan P. Shah; Todd L. Edwards; Dan L. Nicolae; Nancy J. Cox; Hae Kyung Im

Scalable, integrative methods to understand mechanisms that link genetic variants with phenotypes are needed. Here we derive a mathematical expression to compute PrediXcan (a gene mapping approach) results using summary data (S-PrediXcan) and show its accuracy and general robustness to misspecified reference sets. We apply this framework to 44 GTEx tissues and 100+ phenotypes from GWAS and meta-analysis studies, creating a growing public catalog of associations that seeks to capture the effects of gene expression variation on human phenotypes. Replication in an independent cohort is shown. Most of the associations were tissue specific, suggesting context specificity of the trait etiology. Colocalized significant associations in unexpected tissues underscore the need for an agnostic scanning of multiple contexts to improve our ability to detect causal regulatory mechanisms. Monogenic disease genes are enriched among significant associations for related traits, suggesting that smaller alterations of these genes may cause a spectrum of milder phenotypes.

To gain biological insight into the discoveries made by GWAS and meta-analysis studies, effective integration of functional data generated by large-scale efforts such as the GTEx Project is needed. PrediXcan is a gene-level approach that addresses this need by estimating the genetically determined component of gene expression. These predicted expression traits can then be tested for association with phenotype in order to test for a mediating role of gene expression levels. Furthermore, due to the polygenic nature of many complex traits, efforts to aggregate multiple GWAS studies and conduct meta-analyses have successfully increased our ability to identify variants of small effect sizes. To take advantage of the results generated by these efforts and to avoid the problems associated with accessing and handling individual-level data (e.g. consent limitations, large computational/storage costs) we have developed an extension of PrediXcan. The new method, MetaXcan, infers the results of PrediXcan using only summary statistics from large-scale GWAS or meta-analyses. Here we show that the concordance between PrediXcan and MetaXcan is excellent when the right reference population is used (R² > 0.95) and robust to population mismatches (R² > 0.85). We provide open source local and web-based software for easy implementation (https://github.com/hakyimlab/MetaXcan).

To understand the mechanistic underpinnings of type 2 diabetes (T2D) loci mapped through GWAS, we performed a tissue-specific gene association study in a cohort of over 100K individuals (n_cases ≈ 26K, n_controls ≈ 84K) across 44 human tissues using MetaXcan, a summary statistics extension of PrediXcan. We found 90 genes significantly (FDR < 0.05) associated with T2D, of which 24 are previously reported T2D genes, 29 are novel in established T2D loci, and 37 are novel genes in novel loci. Of these, 13 reported genes, 15 novel genes in known loci, and 6 genes in novel loci replicated (FDR_rep < 0.05) in an independent study (n_cases ≈ 10K, n_controls ≈ 62K). We also found enrichment of significant associations in expected tissues such as liver, pancreas, adipose, and muscle but also in tibial nerve, fibroblasts, and breast. Finally, we found that monogenic diabetes genes are enriched in T2D genes from our analysis, suggesting that moderate alterations in monogenic (severe) diabetes genes may promote milder and later onset type 2 diabetes.

To understand the biological mechanisms underlying thousands of genetic variants robustly associated with complex traits, scalable methods that integrate GWAS and functional data generated by large-scale efforts are needed. Here we propose a method termed MetaXcan that addresses this need by inferring the downstream consequences of genetically regulated components of molecular traits on complex phenotypes using summary data only. MetaXcan allows multiple causal variants and flexible multivariate models, extending the capabilities of existing methods and enabling the testing of more complex processes. As an example application, we trained prediction models of gene expression levels in 44 human tissues and inferred the consequences of their regulation in 40 complex phenotypes. Our examination of this broad set of human tissues revealed many novel genes and re-identified known ones with patterns of regulation in expected as well as unexpected tissues.
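The abstract describes inferring gene-level PrediXcan results from GWAS summary statistics but does not spell out the expression. As a rough sketch, the code below uses the approximation commonly cited for S-PrediXcan, Z_g ≈ Σ_l w_lg (σ_l / σ_g) Z_l, where w_lg are the SNP weights of the expression prediction model for gene g, Z_l are the GWAS z-scores, σ_l is the standard deviation of SNP l in a reference panel, and σ_g² = wᵀ Γ w with Γ the reference SNP covariance matrix. Treat this form as an assumption rather than a quotation from the source; the function name and inputs are hypothetical and this is not the MetaXcan software's API.

```python
import math


def spredixcan_z(weights, gwas_z, cov):
    """Approximate gene-level z-score from SNP-level GWAS z-scores.

    weights: prediction-model weight of each SNP for one gene (assumed input).
    gwas_z:  GWAS z-score (beta / standard error) of each SNP, same order.
    cov:     SNP-by-SNP covariance matrix from a reference panel (list of lists).
    """
    n = len(weights)
    # Variance of predicted expression: w' * cov * w.
    var_g = sum(weights[i] * cov[i][j] * weights[j]
                for i in range(n) for j in range(n))
    if var_g <= 0:
        raise ValueError("predicted expression has non-positive variance")
    sigma_g = math.sqrt(var_g)
    # Weighted sum of SNP z-scores, each scaled by sigma_l / sigma_g.
    return sum(weights[l] * math.sqrt(cov[l][l]) / sigma_g * gwas_z[l]
               for l in range(n))


# With a single SNP in the model, the gene z-score equals the SNP z-score
# up to the sign of the weight.
z_single = spredixcan_z([2.0], [1.7], [[0.3]])
z_flipped = spredixcan_z([-2.0], [1.7], [[0.3]])
```

The reference covariance matrix is where a mismatched reference population would enter, which is consistent with the abstract's point that concordance with PrediXcan depends on the reference set used.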


Genetic Epidemiology | 2014

Poly-Omic Prediction of Complex Traits: OmicKriging

Heather E. Wheeler; Keston Aquino-Michaels; Eric R. Gamazon; Vassily Trubetskoy; M. Eileen Dolan; R. Stephanie Huang; Nancy J. Cox; Hae Kyung Im

High‐confidence prediction of complex traits such as disease risk or drug response is an ultimate goal of personalized medicine. Although genome‐wide association studies have discovered thousands of well‐replicated polymorphisms associated with a broad spectrum of complex traits, the combined predictive power of these associations for any given trait is generally too low to be of clinical relevance. We propose a novel systems approach to complex trait prediction, which leverages and integrates similarity in genetic, transcriptomic, or other omics‐level data. We translate the omic similarity into phenotypic similarity using a method called Kriging, commonly used in geostatistics and machine learning. Our method called OmicKriging emphasizes the use of a wide variety of systems‐level data, such as those increasingly made available by comprehensive surveys of the genome, transcriptome, and epigenome, for complex trait prediction. Furthermore, our OmicKriging framework allows easy integration of prior information on the function of subsets of omics‐level data from heterogeneous sources without the sometimes heavy computational burden of Bayesian approaches. Using seven disease datasets from the Wellcome Trust Case Control Consortium (WTCCC), we show that OmicKriging allows simple integration of sparse and highly polygenic components yielding comparable performance at a fraction of the computing time of a recently published Bayesian sparse linear mixed model method. Using a cellular growth phenotype, we show that integrating mRNA and microRNA expression data substantially increases performance over either dataset alone. Using clinical statin response, we show improved prediction over existing methods. We provide an R package to implement OmicKriging (http://www.scandb.org/newinterface/tools/OmicKriging.html).
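To illustrate how Kriging can turn omic similarity into a phenotype prediction, the sketch below implements the standard simple-Kriging predictor: the predicted phenotype of a new sample is the training mean plus a similarity-weighted combination of training residuals, ŷ_new = ȳ + S_new,train · S_train,train⁻¹ · (y_train − ȳ). It assumes the similarity matrix has already been built (for example from genetic or expression data, possibly combined across data types); how OmicKriging weights and combines its matrices is not reproduced here. All names are hypothetical and this is not the OmicKriging R package's interface.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def krige(sim_train, sim_new, y_train):
    """Predict phenotypes for new samples from their similarity to training samples.

    sim_train: train-by-train similarity matrix (assumed positive definite).
    sim_new:   new-by-train similarity matrix, one row per new sample.
    y_train:   observed phenotypes of the training samples.
    """
    mean = sum(y_train) / len(y_train)
    resid = [y - mean for y in y_train]
    # alpha = sim_train^{-1} * (y_train - mean)
    alpha = solve(sim_train, resid)
    return [mean + sum(s * a for s, a in zip(row, alpha)) for row in sim_new]


# Toy similarity matrix and phenotypes, for illustration only.
S = [[1.0, 0.2, 0.1],
     [0.2, 1.0, 0.3],
     [0.1, 0.3, 1.0]]
y = [1.0, 2.0, 4.0]
# First new sample mirrors training sample 2; second is unrelated to everyone.
preds = krige(S, [[0.2, 1.0, 0.3], [0.0, 0.0, 0.0]], y)
```

A new sample that looks exactly like a training sample inherits that sample's phenotype, while a sample with no similarity to anyone falls back to the training mean, which is the intuition behind translating omic similarity into phenotypic similarity.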


Molecular Cancer Therapeutics | 2012

Genetic Variation That Predicts Platinum Sensitivity Reveals the Role of miR-193b* in Chemotherapeutic Susceptibility

Dana Ziliak; Eric R. Gamazon; Bonnie LaCroix; Hae Kyung Im; Yujia Wen; Rong Stephanie Huang

Platinum agents are the backbone of cancer chemotherapy. Recently, we identified and replicated the role of a single nucleotide polymorphism (SNP, rs1649942) in predicting platinum sensitivity both in vitro and in vivo. Using the CEU samples from the International HapMap Project, we found the same SNP to be a master regulator of multiple gene expression phenotypes, prompting us to investigate whether rs1649942-mediated regulation of miRNAs may in part contribute to variation in platinum sensitivity. To these ends, 60 unrelated HapMap CEU I/II samples were used for our discovery-phase study using high-throughput genome-wide miRNA and gene expression profiling. Examining the relationships among rs1649942, its gene expression targets, genome-wide miRNA expression, and cellular sensitivity to carboplatin and cisplatin, we identified 2 platinum-associated miRNAs (miR-193b* and miR-320) that inhibit the expression of 5 platinum-associated genes (CRIM1, IFIT2, OAS1, KCNMA1, and GRAMD1B). We further replicated the relationship between the expression of miR-193b*, CRIM1, IFIT2, KCNMA1, and GRAMD1B, and platinum sensitivity in a separate HapMap CEU III dataset. We then showed that overexpression of miR-193b* in a randomly selected HapMap cell line results in resistance to both carboplatin and cisplatin. This relationship was also found in 7 ovarian cancer cell lines from the NCI60 dataset and confirmed in OVCAR-3 cells, in which overexpression of miR-193b* leads to increased resistance to carboplatin. Our findings highlight a potential mechanism of action for a previously observed genotype-survival outcome association. Further examination of miR-193b* in platinum sensitivity in ovarian cancer is warranted. Mol Cancer Ther; 11(9); 2054–61. ©2012 AACR.


PLOS Genetics | 2012

Mixed Effects Modeling of Proliferation Rates in Cell-Based Models: Consequence for Pharmacogenomics and Cancer

Hae Kyung Im; Eric R. Gamazon; Amy L. Stark; R. Stephanie Huang; Nancy J. Cox; M. Eileen Dolan

The International HapMap project has made publicly available extensive genotypic data on a number of lymphoblastoid cell lines (LCLs). Building on this resource, many research groups have generated a large amount of phenotypic data on these cell lines to facilitate genetic studies of disease risk or drug response. However, one problem that may reduce the usefulness of these resources is the biological noise inherent to cellular phenotypes. We developed a novel method, termed Mixed Effects Model Averaging (MEM), which pools data from multiple sources and generates an intrinsic cellular growth rate phenotype. This intrinsic growth rate was estimated for each of over 500 HapMap cell lines. We then examined the association of this intrinsic growth rate with gene expression levels and found that almost 30% (2,967 out of 10,748) of the genes tested were significant with FDR less than 10%. We probed further to demonstrate evidence of a genetic effect on intrinsic growth rate by determining a significant enrichment in growth-associated genes among genes targeted by top growth-associated SNPs (as eQTLs). The estimated intrinsic growth rate as well as the strength of the association with genetic variants and gene expression traits are made publicly available through a cell-based pharmacogenomics database, PACdb. This resource should enable researchers to explore the mediating effects of proliferation rate on other phenotypes.


PLOS Genetics | 2016

Survey of the Heritability and Sparse Architecture of Gene Expression Traits across Human Tissues.

Heather E. Wheeler; Kaanan P. Shah; Jonathon Brenner; Tzintzuni Garcia; Keston Aquino-Michaels; Nancy J. Cox; Dan L. Nicolae; Hae Kyung Im

Understanding the genetic architecture of gene expression traits is key to elucidating the underlying mechanisms of complex traits. Here, for the first time, we perform a systematic survey of the heritability and the distribution of effect sizes across all representative tissues in the human body. We find that local h² can be relatively well characterized with 59% of expressed genes showing significant h² (FDR < 0.1) in the DGN whole blood cohort. However, current sample sizes (n ≤ 922) do not allow us to compute distal h². Bayesian Sparse Linear Mixed Model (BSLMM) analysis provides strong evidence that the genetic contribution to local expression traits is dominated by a handful of genetic variants rather than by the collective contribution of a large number of variants each of modest size. In other words, the local architecture of gene expression traits is sparse rather than polygenic across all 40 tissues (from DGN and GTEx) examined. This result is confirmed by the sparsity of optimal performing gene expression predictors via elastic net modeling. To further explore the tissue context specificity, we decompose the expression traits into cross-tissue and tissue-specific components using a novel Orthogonal Tissue Decomposition (OTD) approach. Through a series of simulations we show that the cross-tissue and tissue-specific components are identifiable via OTD. Heritability and sparsity estimates of these derived expression phenotypes show similar characteristics to the original traits. Consistent properties relative to prior GTEx multi-tissue analysis results suggest that these traits reflect the expected biology. Finally, we apply this knowledge to develop prediction models of gene expression traits for all tissues. The prediction models, heritability, and prediction performance R² for original and decomposed expression phenotypes are made publicly available (https://github.com/hakyimlab/PrediXcan).

Collaboration


Dive into Hae Kyung Im's collaborations.

Top Co-Authors


Nancy J. Cox

Vanderbilt University Medical Center
