Publication


Featured research published by Ajay Yesupriya.


Nature Genetics | 2008

A navigator for human genome epidemiology

Wei Yu; Marta Gwinn; Melinda Clyne; Ajay Yesupriya; Muin J. Khoury

To the Editor: Recent successes in large-scale genetic association studies call for renewed attention to integrating research results, not only among studies, but across disciplines [1]. At the molecular level, genetic polymorphisms provide a starting point for investigating the functions of complex biological systems. At the population level, epidemiologists can begin to use data on genetic variation, associations and interactions to interpret population attributable fractions and estimate the potential health impact of genetically directed interventions [2]. Publicly available genetic sequence databases have demonstrated their value in accelerating the Human Genome Project and advancing the field of molecular genetics; newer efforts, such as dbGaP and CGEMS, are now beginning to make genotype-phenotype data broadly available to the scientific community [3]. The published scientific literature also reflects rapid growth in studies of human genetic factors in relation to health and disease.
Since 2001, the Human Genome Epidemiology Network (HuGENet) has maintained a database of published, population-based epidemiologic studies of human genes extracted from PubMed [4]. We recently replaced our PubMed search strategy with a new approach using machine learning, which has reduced manual effort and increased both the sensitivity and specificity of screening. Our curator updates the database weekly with articles newly added to PubMed and assigns to them one or more study types (for example, observational study, meta-analysis or genome-wide association study) and data categories (for example, gene-disease association, gene-environment interaction or pharmacogenomics). Each article is indexed in the database with MeSH terms (using the MeSH hierarchical structure) and gene information from the National Center for Biotechnology Information (NCBI) Entrez Gene database. As of November 2007, the database has indexed more than 30,000 articles, referencing more than 3,000 genes and nearly 2,000 disease terms (Table 1). Most articles (80%) describe genetic associations. Approximately 20% of all articles were published in 2007, including 68 of 82 genome-wide association studies.
To make this database more accessible and useful to interdisciplinary researchers, we have developed an integrated set of applications known collectively as the HuGE Navigator (http://www.hugenavigator.net). Using PubMed abstracts as the core data source, we have developed data and text mining algorithms to create a knowledge base for exploring genetic associations, candidate gene selection and investigator networks. Genetic information can be displayed whenever needed from major gene-centered databases (for example, Entrez Gene, SwissProt, OMIM and GeneCards), as well as from databases of genetic variation and prevalence (for example, dbSNP and HapMap Project), pathways (for example, CGAP, KEGG and BioCarta), and other aspects (for example, Gene Ontology and Gene Clinics). The HuGE Navigator is constructed according to the principles of open source, standardization, interoperability and extensibility, so that new applications can be easily incorporated [5]. Currently, the HuGE Navigator allows users to navigate and search the database in an integrated manner by using the six applications discussed below.
The HuGE Literature Finder is a search engine for finding published literature on human genome epidemiology, including genetic association studies. The search query can include disease terms, environmental factors, genes, or author names and affiliations. The search results can be further refined by using filtering features, including disease, gene, category, study type, author, year, journal, and country. Filters can be applied repeatedly until the desired result is obtained. The results (PubMed IDs) can be exported to the PubMed Web site for further exploration and downloading to bibliographic software.
The HuGE Investigator Browser is a search engine for finding investigators or collaborators on the basis of research interests, such as diseases, risk factors, or genes. We extract investigator data by using an accessory utility that automatically parses the affiliation data provided by PubMed [6].
GeneSelectAssist is a search tool for finding possible candidate genes associated with the subject of interest. Search terms can include diseases and exposures. GeneSelectAssist selects and prioritizes genes on the basis of genetic association studies in the HuGE Navigator database, as well as other PubMed abstracts, and evidence from animal models in the NCBI Entrez Gene database.
HuGE Watch is a tool for tracking the evolution of human genome epidemiology research dynamically, on the basis of the literature database. It allows users to view temporal trends in publication by gene, disease, and number of investigators, as well as by the geographic distribution of authors.
HuGEpedia is an online encyclopedia that summarizes research on gene-disease associations. We are currently developing a system for extracting data from meta-analyses and published genome-wide associations that will form the basis for a disease-specific synopsis written by domain experts. HuGEpedia can be searched by gene or disease.
HuGE Risk Translator is a tool that assesses the validity of genetic variants for predicting health outcomes by calculating epidemiologic measures such as population attributable risk, sensitivity, specificity and positive and negative predictive values.
The HuGE Navigator offers a new way to navigate and mine the growing scientific literature on human gene-disease associations and related data in human genome epidemiology. As an interconnected system of applications that users can enter by using genes, diseases, or risk factors as the starting point, HuGE Navigator provides a potential bridge between epidemiologic and genetic research domains. Disease and gene names are mapped to standardized vocabularies, so investigators can use their preferred terms to query the knowledge base. By linking to disease-specific databases, such as AlzGene [7], HuGE Navigator aims to be the vehicle for navigating the ‘network of networks’ of investigators now working to
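
The epidemiologic measures named for the HuGE Risk Translator can be illustrated with a short sketch. This is not the tool's own code, which the letter does not show. It uses the textbook definitions from a 2x2 table of genotype by outcome, and Levin's formula for the population attributable risk. The function names and the example counts are hypothetical.

```python
def screening_measures(tp, fp, fn, tn):
    """Validity measures from a 2x2 table of genotype (test) by outcome.

    tp: carriers with the outcome      fp: carriers without it
    fn: non-carriers with the outcome  tn: non-carriers without it
    """
    return {
        "sensitivity": tp / (tp + fn),  # affected people who carry the variant
        "specificity": tn / (tn + fp),  # unaffected people who do not carry it
        "ppv": tp / (tp + fp),          # carriers who develop the outcome
        "npv": tn / (tn + fn),          # non-carriers who stay unaffected
    }


def population_attributable_risk(prevalence, relative_risk):
    """Levin's formula: p(RR - 1) / (1 + p(RR - 1)).

    prevalence is the proportion of the population carrying the risk
    genotype; relative_risk compares carriers with non-carriers.
    """
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the functions.
    print(screening_measures(tp=30, fp=70, fn=10, tn=890))
    print(population_attributable_risk(prevalence=0.2, relative_risk=2.0))
```

A common variant with a modest relative risk can yield a sizeable attributable risk while still having a low positive predictive value, which is why the measures are reported together.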


American Journal of Epidemiology | 2008

Prevalence in the United States of Selected Candidate Gene Variants: Third National Health and Nutrition Examination Survey, 1991–1994

Man-huei Chang; Mary Lou Lindegren; Mary Ann Butler; Stephen J. Chanock; Nicole F. Dowling; Margaret Gallagher; Ramal Moonesinghe; Cynthia A. Moore; Renée M. Ned; Mary Reichler; Christopher L. Sanders; Robert Welch; Ajay Yesupriya; Muin J. Khoury

Population-based allele frequencies and genotype prevalence are important for measuring the contribution of genetic variation to human disease susceptibility, progression, and outcomes. Population-based prevalence estimates also provide the basis for epidemiologic studies of gene–disease associations, for estimating population attributable risk, and for informing health policy and clinical and public health practice. However, such prevalence estimates for genotypes important to public health remain undetermined for the major racial and ethnic groups in the US population. DNA was collected from 7,159 participants aged 12 years or older in Phase 2 (1991–1994) of the Third National Health and Nutrition Examination Survey (NHANES III). Certain age and minority groups were oversampled in this weighted, population-based US survey. Estimates of allele frequency and genotype prevalence for 90 variants in 50 genes chosen for their potential public health significance were calculated by age, sex, and race/ethnicity among non-Hispanic whites, non-Hispanic blacks, and Mexican Americans. These nationally representative data on allele frequency and genotype prevalence provide a valuable resource for future epidemiologic studies in public health in the United States.
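
As a rough illustration of how allele frequency and genotype prevalence can be estimated from a weighted survey, the sketch below sums sampling weights instead of counting participants. It is a simplification under stated assumptions: the record layout and function name are hypothetical, and it produces point estimates only, without the design-based variance estimation that a complex survey such as NHANES III requires.

```python
from collections import defaultdict


def weighted_genotype_summary(records, allele):
    """Weighted allele frequency and genotype prevalence.

    records: iterable of (genotype, weight) pairs, where genotype is a
             two-letter string such as "CT" and weight is the sampling weight.
    allele:  the single-letter allele whose frequency is wanted.
    """
    total_weight = 0.0
    allele_weight = 0.0
    genotype_weight = defaultdict(float)

    for genotype, weight in records:
        g = "".join(sorted(genotype))  # treat "TC" and "CT" as the same genotype
        total_weight += weight
        allele_weight += weight * g.count(allele)  # 0, 1 or 2 copies per person
        genotype_weight[g] += weight

    # Each person carries two alleles, hence the factor of 2.
    frequency = allele_weight / (2.0 * total_weight)
    prevalence = {g: w / total_weight for g, w in genotype_weight.items()}
    return frequency, prevalence


if __name__ == "__main__":
    # Hypothetical records: (genotype, sampling weight).
    sample = [("CC", 1000.0), ("CT", 500.0), ("TC", 500.0), ("TT", 2000.0)]
    freq, prev = weighted_genotype_summary(sample, allele="T")
    print(freq, prev)
```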


BMC Medical Research Methodology | 2008

Reporting of human genome epidemiology (HuGE) association studies: an empirical assessment

Ajay Yesupriya; Evangelos Evangelou; Fotini K. Kavvoura; Nikolaos A. Patsopoulos; Melinda Clyne; Matthew C. Walsh; Bruce K. Lin; Wei Yu; Marta Gwinn; John P. A. Ioannidis; Muin J. Khoury

Background: Several thousand human genome epidemiology association studies are published every year investigating the relationship between common genetic variants and diverse phenotypes. Transparent reporting of study methods and results allows readers to better assess the validity of study findings. Here, we document reporting practices of human genome epidemiology studies.
Methods: Articles were randomly selected from a continuously updated database of human genome epidemiology association studies to be representative of genetic epidemiology literature. The main analysis evaluated 315 articles published in 2001–2003. For a comparative update, we evaluated 28 more recent articles published in 2006, focusing on issues that were poorly reported in 2001–2003.
Results: During both time periods, most studies comprised relatively small study populations and examined one or more genetic variants within a single gene. Articles were inconsistent in reporting the data needed to assess selection bias and the methods used to minimize misclassification (of the genotype, outcome, and environmental exposure) or to identify population stratification. Statistical power, the use of unrelated study participants, and the use of replicate samples were reported more often in articles published during 2006 when compared with the earlier sample.
Conclusion: We conclude that many items needed to assess error and bias in human genome epidemiology association studies are not consistently reported. Although some improvements were seen over time, reporting guidelines and online supplemental material may help enhance the transparency of this literature.


European Journal of Human Genetics | 2014

A systematic review of cancer GWAS and candidate gene meta-analyses reveals limited overlap but similar effect sizes.

Christine Q. Chang; Ajay Yesupriya; Jessica L. Rowell; Camilla B. Pimentel; Melinda Clyne; Marta Gwinn; Muin J. Khoury; Anja Wulf; Sheri D. Schully

Candidate gene and genome-wide association studies (GWAS) represent two complementary approaches to uncovering genetic contributions to common diseases. We systematically reviewed the contributions of these approaches to our knowledge of genetic associations with cancer risk by analyzing the data in the Cancer Genome-wide Association and Meta Analyses database (Cancer GAMAdb). The database catalogs studies published since January 1, 2000, by study and cancer type. In all, we found that meta-analyses and pooled analyses of candidate genes reported 349 statistically significant associations and GWAS reported 269, for a total of 577 unique associations. Only 41 (7.1%) associations were reported in both candidate gene meta-analyses and GWAS, usually with similar effect sizes. When considering only noteworthy associations (defined as those with false-positive report probabilities ≤0.2) and accounting for indirect overlap, we found 202 associations, with 27 of those appearing in both meta-analyses and GWAS. Our findings suggest that meta-analyses of well-conducted candidate gene studies may continue to add to our understanding of the genetic associations in the post-GWAS era.
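
The abstract does not spell out how the false-positive report probability (FPRP) was computed. The sketch below uses the commonly cited formulation, FPRP = α(1 − π) / (α(1 − π) + (1 − β)π), where α is the observed significance level, 1 − β the statistical power and π the prior probability of a true association. The prior and power values in the example are hypothetical. The last lines only re-check the overlap arithmetic reported in the abstract.

```python
def fprp(alpha, power, prior):
    """False-positive report probability for a significant finding.

    alpha: observed p-value (significance level of the finding)
    power: probability of detecting the association if it is real
    prior: prior probability that the association is real
    """
    false_signal = alpha * (1.0 - prior)
    true_signal = power * prior
    return false_signal / (false_signal + true_signal)


def is_noteworthy(alpha, power, prior, threshold=0.2):
    """Apply the noteworthiness cut-off quoted in the abstract (FPRP <= 0.2)."""
    return fprp(alpha, power, prior) <= threshold


if __name__ == "__main__":
    # Hypothetical inputs: the same p-value under two different priors.
    print(fprp(alpha=0.01, power=0.8, prior=0.01))
    print(fprp(alpha=0.01, power=0.8, prior=0.10))

    # Overlap arithmetic from the abstract: 349 + 269 - 41 shared = 577 unique.
    unique = 349 + 269 - 41
    print(unique, round(100 * 41 / unique, 1))
```

The same p-value can pass or fail the cut-off depending on the prior, which is why noteworthy counts are smaller than counts of nominally significant associations.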


Neurotoxicology and Teratology | 2009

Lead and cognitive function in ALAD genotypes in the third National Health and Nutrition Examination Survey.

Edward F. Krieg; Mary Ann Butler; Man-huei Chang; Tiebin Liu; Ajay Yesupriya; Mary Lou Lindegren; Nicole F. Dowling

The relationship between the blood lead concentration and cognitive function in children and adults with different ALAD genotypes who participated in the third National Health and Nutrition Examination Survey was investigated. The relationship between blood lead and serum homocysteine concentrations was also investigated. In children 12 to 16 years old, no difference in the relationship between cognitive function and blood lead concentration between genotypes was found. In adults 20 to 59 years old, mean reaction time decreased as the blood lead concentration increased in the ALAD rs1800435 CC/CG group. This represents an improvement in performance. In adults 60 years and older, no difference in the relationship between cognitive function and blood lead concentration between genotypes was found. The serum homocysteine concentration increased as the blood lead concentration increased in adults 20 to 59 years old and 60 years and older, but there were no differences between genotypes. The mean blood lead concentration of children with the ALAD rs1800435 CC/CG genotype was less than that of children with the GG genotype.


BMC Bioinformatics | 2008

GAPscreener: An automatic tool for screening human genetic association literature in PubMed using the support vector machine technique

Wei Yu; Melinda Clyne; Siobhan M. Dolan; Ajay Yesupriya; Anja Wulf; Tiebin Liu; Muin J. Khoury; Marta Gwinn

Background: Synthesis of data from published human genetic association studies is a critical step in the translation of human genome discoveries into health applications. Although genetic association studies account for a substantial proportion of the abstracts in PubMed, identifying them with standard queries is not always accurate or efficient. Further automating the literature-screening process can reduce the burden of a labor-intensive and time-consuming traditional literature search. The Support Vector Machine (SVM), a well-established machine learning technique, has been successful in classifying text, including biomedical literature. The GAPscreener, a free SVM-based software tool, can be used to assist in screening PubMed abstracts for human genetic association studies.
Results: The data source for this research was the HuGE Navigator, formerly known as the HuGE Pub Lit database. Weighted SVM feature selection based on a keyword list obtained by the two-way z score method demonstrated the best screening performance, achieving 97.5% recall, 98.3% specificity and 31.9% precision in performance testing. Compared with the traditional screening process based on a complex PubMed query, the SVM tool reduced by about 90% the number of abstracts requiring individual review by the database curator. The tool also ascertained 47 articles that were missed by the traditional literature screening process during the 4-week test period. We examined the literature on genetic associations with preterm birth as an example. Compared with the traditional, manual process, the GAPscreener both reduced effort and improved accuracy.
Conclusion: GAPscreener is the first free SVM-based application available for screening the human genetic association literature in PubMed with high recall and specificity. The user-friendly graphical user interface makes this a practical, stand-alone application. The software can be downloaded at no charge.
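
High recall and specificity alongside a precision of only 31.9% may look inconsistent, but it is what one would expect when relevant abstracts are rare. The sketch below shows the relationship via Bayes' rule. It assumes that all three reported figures come from the same test set, which the abstract does not state, and the function names are hypothetical.

```python
def precision_from_rates(recall, specificity, prevalence):
    """Precision implied by recall, specificity and the share of relevant items."""
    true_pos = recall * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def implied_prevalence(recall, specificity, precision):
    """Invert the relationship: share of relevant items consistent with all three."""
    odds = precision * (1.0 - specificity) / (recall * (1.0 - precision))
    return odds / (1.0 + odds)


if __name__ == "__main__":
    # Figures quoted in the abstract; the prevalence itself is not reported there.
    p = implied_prevalence(recall=0.975, specificity=0.983, precision=0.319)
    print(p)
    print(precision_from_rates(0.975, 0.983, p))
```

Under that assumption, the reported figures are consistent with relevant abstracts making up under 1% of those screened, so even a 1.7% false-positive rate produces more false hits than true ones.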


Circulation: Cardiovascular Genetics | 2011

Racial/ethnic variation in the association of lipid-related genetic variants with blood lipids in the US adult population.

Man-huei Chang; Renée M. Ned; Yuling Hong; Ajay Yesupriya; Quanhe Yang; Tiebin Liu; A. Cecile J.W. Janssens; Nicole F. Dowling

Background— Genome-wide association studies (GWAS) have identified a number of single-nucleotide polymorphisms (SNPs) associated with serum lipid level in populations of European descent. The individual and the cumulative effect of these SNPs on blood lipids are largely unclear for the US population. Methods and Results— Using data from the second phase (1991–1994) of the Third National Health and Nutrition Examination Survey (NHANES III), a nationally representative survey of the US population, we examined associations of 57 GWAS-identified or well-established lipid-related genetic loci with plasma concentrations of high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol, total cholesterol, triglycerides, total cholesterol/HDL-C ratio, and non-HDL-C. We used multivariable linear regression to examine single SNP associations and the cumulative effect of multiple SNPs (using a genetic risk score [GRS]) on blood lipid levels. Analyses were conducted in adults from each of the 3 major racial/ethnic groups in the United States: non-Hispanic whites (n=2296), non-Hispanic blacks (n=1699), and Mexican Americans (n=1713). Allele frequencies for all SNPs varied significantly by race/ethnicity, except rs3764261 in CETP. Individual SNPs had very small effects on lipid levels, effects that were generally consistent in direction across racial/ethnic groups. More GWAS-validated SNPs were replicated in non-Hispanic whites (<67%) than in non-Hispanic blacks (<44%) or Mexican Americans (<44%). GRSs were strongly associated with increased lipid levels in each racial/ethnic group. The combination of all SNPs into a weighted GRS explained no more than 11% of the total variance in blood lipid levels. 
Conclusions— Our findings show that the combined association of SNPs, based on a GRS, was strongly associated with increased blood lipid measures in all major race/ethnic groups in the United States, which may help in identifying subgroups with a high risk for an unfavorable lipid profile.
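
A genetic risk score of the kind described above is, in its simplest form, a sum of risk-allele counts across SNPs, optionally weighted per SNP. The sketch below is a minimal illustration. The abstract does not say how the weights were derived, so per-allele effect sizes are assumed here, and the SNP labels and values are hypothetical.

```python
def genetic_risk_score(allele_counts, weights=None):
    """Sum risk-allele counts across SNPs, optionally weighted.

    allele_counts: dict mapping SNP label -> number of risk alleles (0, 1 or 2)
    weights:       dict mapping SNP label -> per-allele weight (assumed to be
                   an effect size); None gives the unweighted count.
    """
    score = 0.0
    for snp, count in allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"{snp}: risk-allele count must be 0, 1 or 2")
        weight = 1.0 if weights is None else weights[snp]
        score += weight * count
    return score


if __name__ == "__main__":
    # Hypothetical genotypes and weights for one participant.
    genotypes = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
    effect_sizes = {"snp_a": 0.5, "snp_b": 0.2, "snp_c": 0.9}
    print(genetic_risk_score(genotypes))
    print(genetic_risk_score(genotypes, effect_sizes))
```

The score would then enter a regression model as a single predictor in place of the individual SNPs.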


European Journal of Human Genetics | 2011

GWAS Integrator: a bioinformatics tool to explore human genetic associations reported in published genome-wide association studies

Wei Yu; Ajay Yesupriya; Anja Wulf; Lucia A. Hindorff; Nicole F. Dowling; Muin J. Khoury; Marta Gwinn

Genome-wide association studies (GWAS) have successfully identified numerous genetic loci that are associated with phenotypic traits and diseases. GWAS Integrator is a bioinformatics tool that integrates information on these associations from the National Human Genome Research Institute (NHGRI) Catalog, SNAP (SNP Annotation and Proxy Search), and the Human Genome Epidemiology (HuGE) Navigator literature database. This tool includes robust search and data mining functionalities that can be used to quickly identify relevant associations from GWAS, as well as proxy single-nucleotide polymorphisms (SNPs) and potential candidate genes. Query-based University of California Santa Cruz (UCSC) Genome Browser custom tracks are generated dynamically on the basis of users’ selected GWAS hits or candidate genes from the HuGE Navigator literature database (http://www.hugenavigator.net/HuGENavigator/gWAHitStartPage.do). The GWAS Integrator may help enhance inference on potential genetic associations identified from GWAS.


BMC Medical Genetics | 2012

Race-ethnic differences in the association of genetic loci with HbA1c levels and mortality in U.S. adults: the third National Health and Nutrition Examination Survey (NHANES III)

Jonna Grimsby; Bianca Porneala; Jason L. Vassy; Quanhe Yang; Jose C. Florez; Josée Dupuis; Tiebin Liu; Ajay Yesupriya; Man-huei Chang; Renée M. Ned; Nicole F. Dowling; Muin J. Khoury; James B. Meigs

Background: Hemoglobin A1c (HbA1c) levels diagnose diabetes, predict mortality and are associated with ten single nucleotide polymorphisms (SNPs) in white individuals. Genetic associations in other race groups are not known. We tested the hypotheses that there is race-ethnic variation in 1) HbA1c-associated risk allele frequencies (RAFs) for SNPs near SPTA1, HFE, ANK1, HK1, ATP11A, FN3K, TMPRSS6, G6PC2, GCK, MTNR1B; 2) association of SNPs with HbA1c and 3) association of SNPs with mortality.
Methods: We studied 3,041 non-diabetic individuals in the NHANES (National Health and Nutrition Examination Survey) III. We stratified the analysis by race/ethnicity (NHW: non-Hispanic white; NHB: non-Hispanic black; MA: Mexican American) to calculate RAF, calculated a genotype score by adding risk SNPs, and tested associations with SNPs and the genotype score using an additive genetic model, with type 1 error = 0.05.
Results: RAFs varied widely and at six loci race-ethnic differences in RAF were significant (p < 0.0002), with NHB usually the most divergent. For instance, at ATP11A, the SNP RAF was 54% in NHB, 18% in MA and 14% in NHW (p < .0001). The mean genotype score differed by race-ethnicity (NHW: 10.4, NHB: 11.0, MA: 10.7, p < .0001), and was associated with increase in HbA1c in NHW (β = 0.012 HbA1c increase per risk allele, p = 0.04) and MA (β = 0.021, p = 0.005) but not NHB (β = 0.007, p = 0.39). The genotype score was not associated with mortality in any group (NHW: OR (per risk allele increase in mortality) = 1.07, p = 0.09; NHB: OR = 1.04, p = 0.39; MA: OR = 1.03, p = 0.71).
Conclusion: At many HbA1c loci in NHANES III there is substantial RAF race-ethnic heterogeneity. The combined impact of common HbA1c-associated variants on HbA1c levels varied by race-ethnicity, but did not influence mortality.


BMC Medical Genetics | 2010

Gene polymorphisms in association with emerging cardiovascular risk markers in adult women

Amy Z. Fan; Ajay Yesupriya; Man-huei Chang; Meaghan House; Jing Fang; Renée M. Ned; Donald K. Hayes; Nicole F. Dowling; Ali H. Mokdad

Background: Evidence on the associations of emerging cardiovascular disease risk factors/markers with genes may help identify intermediate pathways of disease susceptibility in the general population. This population-based study aimed to determine the presence of associations between a wide array of genetic variants and emerging cardiovascular risk markers among adult US women.
Methods: The current analysis was performed among the National Health and Nutrition Examination Survey (NHANES) III phase 2 samples of adult women aged 17 years and older (sample size n = 3409). Variants in fourteen candidate genes (ADRB2, ADRB3, CAT, CRP, F2, F5, FGB, ITGB3, MTHFR, NOS3, PON1, PPARG, TLR4, and TNF) were examined for associations with emerging cardiovascular risk markers such as serum C-reactive protein, homocysteine, uric acid, and plasma fibrinogen. Linear regression models were fitted using SAS-callable SUDAAN 9.0. The covariates included age, race/ethnicity, education, menopausal status, female hormone use, aspirin use, and lifestyle factors.
Results: In covariate-adjusted models, serum C-reactive protein concentrations were significantly (P value controlling for false-discovery rate ≤ 0.05) associated with polymorphisms in CRP (rs3093058, rs1205), MTHFR (rs1801131), and ADRB3 (rs4994). Serum homocysteine levels were significantly associated with MTHFR (rs1801133).
Conclusion: The significant associations of certain gene variants with variation in serum C-reactive protein and homocysteine concentrations among adult women need to be confirmed in further genetic association studies.
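
The abstract reports P values controlled for the false-discovery rate but does not name the procedure. As an illustration only, the sketch below implements the widely used Benjamini-Hochberg step-up adjustment. Whether the authors used this exact method is an assumption, and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    For the p-value of rank k (1 = smallest) among m tests, the raw
    adjustment is p * m / k; a running minimum taken from the largest
    rank downwards keeps the adjusted values monotone.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical p-values for four SNP-marker tests.
    raw = [0.01, 0.04, 0.03, 0.20]
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(p, round(q, 4), "significant" if q <= 0.05 else "not significant")
```

A test is then called significant when its adjusted value is at or below the chosen false-discovery rate, 0.05 in the abstract.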

Collaboration


Top co-authors of Ajay Yesupriya.

Muin J. Khoury, Office of Public Health Genomics
Nicole F. Dowling, Centers for Disease Control and Prevention
Man-huei Chang, Centers for Disease Control and Prevention
Marta Gwinn, Centers for Disease Control and Prevention
Renée M. Ned, Centers for Disease Control and Prevention
Wei Yu, Centers for Disease Control and Prevention
Tiebin Liu, Centers for Disease Control and Prevention
Anja Wulf, Centers for Disease Control and Prevention
Melinda Clyne, Centers for Disease Control and Prevention
Quanhe Yang, Centers for Disease Control and Prevention