Publication


Featured research published by Jing Huang.


Ophthalmic Epidemiology | 2018

Evaluation of Approaches to Analyzing Continuous Correlated Eye Data When Sample Size Is Small

Jing Huang; Jiayan Huang; Yong Chen; Gui-shuang Ying

ABSTRACT Purpose: To evaluate the performance of commonly used statistical methods for analyzing continuous correlated eye data when sample size is small. Methods: We simulated correlated continuous data from two designs: (1) two eyes of a subject in two comparison groups; (2) two eyes of a subject in the same comparison group, under various sample sizes (5–50), inter-eye correlations (0–0.75), and effect sizes (0–0.8). Simulated data were analyzed using the paired t-test, the two-sample t-test, the Wald and score tests from generalized estimating equations (GEE), and the F-test from a linear mixed effects model (LMM). We compared type I error rates and statistical power, and demonstrated the analysis approaches on two real datasets. Results: In design 1, the paired t-test and LMM perform better than GEE, with nominal type I error rate and higher statistical power. In design 2, no test performs uniformly well: the two-sample t-test (using the average of two eyes or a random eye) achieves better control of type I error but yields lower statistical power. In both designs, the GEE Wald test inflates the type I error rate and the GEE score test has lower power. Conclusion: When sample size is small, some commonly used statistical methods do not perform well. The paired t-test and LMM perform best when two eyes of a subject are in two different comparison groups, and the t-test using the average of two eyes performs best when the two eyes are in the same comparison group. The study design should be considered when selecting the analysis approach.
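The first simulation design in this abstract (two eyes of a subject in two comparison groups) can be sketched in a few lines. This is a minimal illustration, not the authors' simulation code: the function names, the shared-subject-effect construction of the inter-eye correlation, and the parameter values are all assumptions chosen to fall inside the ranges quoted above.

```python
import math
import random
import statistics


def simulate_eye_pairs(n_subjects, rho, effect, rng):
    """Simulate (control eye, treated eye) pairs with inter-eye correlation rho.

    Each eye is a shared subject effect plus eye-specific noise, scaled so
    that every measurement has unit variance and the two eyes of a subject
    correlate at rho (an assumed data-generating model).
    """
    pairs = []
    for _ in range(n_subjects):
        subject = rng.gauss(0.0, math.sqrt(rho))
        control = subject + rng.gauss(0.0, math.sqrt(1.0 - rho))
        treated = effect + subject + rng.gauss(0.0, math.sqrt(1.0 - rho))
        pairs.append((control, treated))
    return pairs


def paired_t_statistic(pairs):
    """Paired t statistic on within-subject differences (df = n - 1)."""
    diffs = [treated - control for control, treated in pairs]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


rng = random.Random(2018)
pairs = simulate_eye_pairs(n_subjects=20, rho=0.5, effect=0.8, rng=rng)
t_stat = paired_t_statistic(pairs)
print(f"paired t = {t_stat:.2f} on {len(pairs) - 1} degrees of freedom")
```

Repeating the simulation many times with `effect=0.0` and counting how often the statistic exceeds the critical value would give an empirical type I error rate, which is the kind of comparison the abstract describes.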


Journal of the American Medical Informatics Association | 2018

PIE: A prior knowledge guided integrated likelihood estimation method for bias reduction in association studies using electronic health records data

Jing Huang; Rui Duan; Rebecca A. Hubbard; Yonghui Wu; Jason H. Moore; Hua Xu; Yong Chen

Objectives: This study proposes a novel Prior knowledge guided Integrated likelihood Estimation (PIE) method to correct bias in estimations of associations due to misclassification of electronic health record (EHR)-derived binary phenotypes, and evaluates the performance of the proposed method by comparing it to 2 methods in common practice. Methods: We conducted simulation studies and data analysis of real EHR-derived data on diabetes from Kaiser Permanente Washington to compare the estimation bias of associations using the proposed method, the method ignoring phenotyping errors, the maximum likelihood method with misspecified sensitivity and specificity, and the maximum likelihood method with correctly specified sensitivity and specificity (gold standard). The proposed method effectively leverages available information on phenotyping accuracy to construct a prior distribution for sensitivity and specificity, and incorporates this prior information through the integrated likelihood for bias reduction. Results: Our simulation studies and real data application demonstrated that the proposed method effectively reduces the estimation bias compared to the 2 current methods. It performed almost as well as the gold standard method when the prior had highest density around true sensitivity and specificity. The analysis of EHR data from Kaiser Permanente Washington showed that the estimated associations from PIE were very close to the estimates from the gold standard method and reduced bias by 60%–100% compared to the 2 commonly used methods in current practice for EHR data. Conclusions: This study demonstrates that the proposed method can effectively reduce estimation bias caused by imperfect phenotyping in EHR-derived data by incorporating prior information through integrated likelihood.
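To see why ignoring phenotyping errors biases an association, the sketch below applies the standard misclassification identity, observed rate = Se × true rate + (1 − Sp) × (1 − true rate), to two exposure groups. It is not the PIE method, which integrates the likelihood over a prior on sensitivity and specificity; it only illustrates the bias PIE targets. The function names, the use of a risk difference as the association measure, and all numeric values are hypothetical.

```python
def observed_positive_rate(true_rate, sensitivity, specificity):
    """Probability that the error-prone EHR-derived phenotype is positive."""
    return sensitivity * true_rate + (1.0 - specificity) * (1.0 - true_rate)


def corrected_rate(observed_rate, sensitivity, specificity):
    """Invert the misclassification identity when accuracy is known exactly."""
    return (observed_rate + specificity - 1.0) / (sensitivity + specificity - 1.0)


# Hypothetical true phenotype rates in exposed and unexposed patients.
true_exposed, true_unexposed = 0.30, 0.10
# Hypothetical phenotyping accuracy, identical in both groups.
se, sp = 0.80, 0.95

obs_exposed = observed_positive_rate(true_exposed, se, sp)
obs_unexposed = observed_positive_rate(true_unexposed, se, sp)

true_diff = true_exposed - true_unexposed
naive_diff = obs_exposed - obs_unexposed
print(f"true risk difference:  {true_diff:.3f}")
print(f"naive risk difference: {naive_diff:.3f}")
```

Under these assumptions the naive difference shrinks by the factor Se + Sp − 1. The inversion in `corrected_rate` recovers the true rates only when sensitivity and specificity are specified correctly, which corresponds to the gold standard comparator in the abstract.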


BMC Medical Informatics and Decision Making | 2017

A signal detection method for temporal variation of adverse effect with vaccine adverse event reporting system data

Yi Cai; Jingcheng Du; Jing Huang; Susan S. Ellenberg; Sean Hennessy; Cui Tao; Yong Chen

Background: Identifying safety signals by manual review of individual reports in large surveillance databases is time consuming; such an approach is very unlikely to reveal complex relationships between medications and adverse events. Since the late 1990s, efforts have been made to develop data mining tools to systematically and automatically search for safety signals in surveillance databases. Influenza vaccines present special challenges to safety surveillance because the vaccine changes every year in response to the influenza strains predicted to be prevalent that year. Therefore, it may be expected that reporting rates of adverse events following flu vaccines (number of reports for a specific vaccine-event combination/number of reports for all vaccine-event combinations) may vary substantially across reporting years. Current surveillance methods seldom consider these variations in signal detection, and reports from different years are typically collapsed together to conduct safety analyses. However, merging reports from different years ignores the potential heterogeneity of reporting rates across years and may miss important safety signals. Method: Reports of adverse events between 1990 and 2013 were extracted from the Vaccine Adverse Event Reporting System (VAERS) database and formatted into a three-dimensional data array with types of vaccine, groups of adverse events and reporting time as the three dimensions. We propose a random effects model to test the heterogeneity of reporting rates for a given vaccine-event combination across reporting years. The proposed method provides a rigorous statistical procedure to detect differences of reporting rates among years. We also introduce a new visualization tool to summarize the result of the proposed method when applied to multiple vaccine-adverse event combinations. Result: We applied the proposed method to detect safety signals of FLU3, an influenza vaccine containing three flu strains, in the VAERS database. We showed that it had high statistical power to detect the variation in reporting rates across years. The identified vaccine-event combinations with significantly different reporting rates over years suggested potential safety issues due to changes in vaccines which require further investigation. Conclusion: We developed a statistical model to detect safety signals arising from heterogeneity of reporting rates of a given vaccine-event combination across reporting years. This method detects variation in reporting rates over years with high power. The temporal trend of reporting rate across years may reveal the impact of vaccine update on occurrence of adverse events and provide evidence for further investigations.
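The reporting rate defined in this abstract (reports for one vaccine-event combination divided by reports for all vaccine-event combinations) can be computed per year as in the sketch below. The function name, the dictionary layout, and the counts are assumptions made for illustration; the counts are invented and are not VAERS data, and the sketch does not implement the random effects test itself.

```python
def reporting_rates(counts_by_year, vaccine, event):
    """Reporting rate per year for one vaccine-event combination.

    counts_by_year maps year -> {(vaccine, event): number of reports}.
    The rate is the combination's reports divided by all reports that year.
    """
    rates = {}
    for year, counts in sorted(counts_by_year.items()):
        total = sum(counts.values())
        rates[year] = counts.get((vaccine, event), 0) / total if total else 0.0
    return rates


# Made-up counts for illustration only.
toy_counts = {
    2011: {("FLU3", "fever"): 30, ("FLU3", "rash"): 20, ("OTHER", "fever"): 50},
    2012: {("FLU3", "fever"): 10, ("FLU3", "rash"): 40, ("OTHER", "fever"): 50},
}
rates = reporting_rates(toy_counts, "FLU3", "fever")
for year, rate in rates.items():
    print(year, f"{rate:.2f}")
```

Collapsing the two toy years would give a single rate of 0.20 and hide the drop from 0.30 to 0.10, which is the kind of year-to-year heterogeneity the proposed model is designed to test for.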


Statistics in Medicine | 2018

A Bayesian latent class approach for EHR-based phenotyping: Bayesian latent class phenotyping

Rebecca A. Hubbard; Jing Huang; Joanna Harton; Arman Oganisian; Grace Choi; Levon Utidjian; Ihuoma Eneli; L. Charles Bailey; Yong Chen

Phenotyping, i.e., identification of patients possessing a characteristic of interest, is a fundamental task for research conducted using electronic health records. However, challenges to this task include imperfect sensitivity and specificity of clinical codes and inconsistent availability of more detailed data such as laboratory test results. Despite these challenges, most existing electronic health records–derived phenotypes are rule‐based, consisting of a series of Boolean arguments informed by expert knowledge of the disease of interest and its coding. The objective of this paper is to introduce a Bayesian latent phenotyping approach that accounts for imperfect data elements and missing not at random missingness patterns that can be used when no gold‐standard data are available. We conducted simulation studies to compare alternative phenotyping methods under different patterns of missingness and applied these approaches to a cohort of 68 265 children at elevated risk for type 2 diabetes mellitus (T2DM). In simulation studies, the latent class approach had similar sensitivity to a rule‐based approach (95.9% vs 91.9%) while substantially improving specificity (99.7% vs 90.8%). In the PEDSnet cohort, we found that biomarkers and clinical codes were strongly associated with latent T2DM status. The latent T2DM class was also strongly predictive of missingness in biomarkers. Glucose was missing in 83.4% of patients (odds ratio for latent T2DM status = 0.52) while hemoglobin A1c was missing in 91.2% (odds ratio for latent T2DM status = 0.03), suggesting missing not at random missingness. The latent phenotype approach may substantially improve on rule‐based phenotyping.


Frontiers in Pharmacology | 2018

Characterization of the Differential Adverse Event Rates by Race/Ethnicity Groups for HPV Vaccine by Integrating Data From Different Sources

Jing Huang; Jingcheng Du; Rui Duan; Xinyuan Zhang; Cui Tao; Yong Chen

Data from the Vaccine Adverse Event Reporting System (VAERS) contain spontaneously reported adverse events (AEs) from the public. It has been a major data source for detecting AEs and monitoring vaccine safety. As one major limitation of spontaneous surveillance systems, the VAERS reports by themselves sometimes do not provide enough information to answer certain research questions. For example, patient level demographics are very limited in VAERS due to the protection of patient privacy, such that investigation of differential AE rates across race/ethnicity groups cannot be conducted using VAERS data only. For many vaccines, racial and ethnical difference in immune responses has been found in studies based on racially diverse cohorts. It is of great interest to characterize the differential AE rates by race and ethnicity groups for vaccines. In this study, we propose a novel statistical method to integrate VAERS data with data from other resources for vaccine pharmacovigilance research. Specifically, we integrate VAERS data with CDC survey data of vaccine coverage and U.S. census data of race/ethnicity distribution to quantify differential AE rates by race/ethnicity groups for HPV vaccine. We utilize the difference of race/ethnicity distributions across U.S. states to investigate the association between AE reporting rate and race/ethnicity groups at the population level. We identify 9 AEs with significantly different reporting rates between non-Hispanic White females and other race/ethnicity groups.


Biometrika | 2018

A conditional composite likelihood ratio test with boundary constraints

Yong Chen; Jing Huang; Yang Ning; Kung-Yee Liang; Bruce G. Lindsay

Summary: Composite likelihood has been widely used in applications. The asymptotic distribution of the composite likelihood ratio statistic at the boundary of the parameter space is a complicated mixture of weighted χ² distributions. In this paper we propose a conditional test with data‐dependent degrees of freedom. We consider a modification of the composite likelihood which satisfies the second‐order Bartlett identity. We show that the modified composite likelihood ratio statistic given the number of estimated parameters lying on the boundary converges to a simple χ² distribution. This conditional testing procedure is validated through simulation studies.


International Conference on Data Mining | 2017

Post-marketing drug safety evaluation using data mining based on FAERS

Rui Duan; Xinyuan Zhang; Jingcheng Du; Jing Huang; Cui Tao; Yong Chen

Healthcare is going through a big data revolution. The amount of data generated by healthcare is expected to increase significantly in the coming years. Therefore, efficient and effective data processing methods are required to transform data into information. In addition, applying statistical analysis can transform the information into useful knowledge. We developed a data mining method that can uncover new knowledge in this enormous field for clinical decision making while generating scientific methods and hypotheses. The proposed pipeline can be generally applied to a variety of data mining tasks in medical informatics. For this study, we applied the proposed pipeline for post-marketing surveillance on drug safety using FAERS, the data warehouse created by FDA. We used 14 kinds of neurology drugs to illustrate our methods. Our result indicated that this approach can successfully reveal insight for further drug safety evaluation.


Genetic Epidemiology | 2017

On meta‐ and mega‐analyses for gene–environment interactions

Jing Huang; Yulun Liu; Steve Vitale; Trevor M. Penning; Alexander S. Whitehead; Ian A. Blair; Anil Vachani; Margie L. Clapper; Joshua E. Muscat; Philip Lazarus; Paul Scheet; Jason H. Moore; Yong Chen

Gene‐by‐environment (G × E) interactions are important in explaining the missing heritability and understanding the causation of complex diseases, but a single, moderately sized study often has limited statistical power to detect such interactions. With the increasing need for integrating data and reporting results from multiple collaborative studies or sites, debate over choice between mega‐ versus meta‐analysis continues. In principle, data from different sites can be integrated at the individual level into a “mega” data set, which can be fit by a joint “mega‐analysis.” Alternatively, analyses can be done at each site, and results across sites can be combined through a “meta‐analysis” procedure without integrating individual level data across sites. Although mega‐analysis has been advocated in several recent initiatives, meta‐analysis has the advantages of simplicity and feasibility, and has recently led to several important findings in identifying main genetic effects. In this paper, we conducted empirical and simulation studies, using data from a G × E study of lung cancer, to compare the mega‐ and meta‐analyses in four commonly used G × E analyses under the scenario that the number of studies is small and sample sizes of individual studies are relatively large. We compared the two data integration approaches in the context of fixed effect models and random effects models separately. Our investigations provide valuable insights in understanding the differences between mega‐ and meta‐analyses in practice of combining small number of studies in identifying G × E interactions.
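For readers unfamiliar with the meta-analysis side of this comparison, the sketch below shows the generic fixed-effect, inverse-variance combination of site-level estimates. It is an assumed textbook formulation rather than the specific procedures evaluated in the paper, and the function name and the site-level numbers are hypothetical.

```python
import math


def fixed_effect_meta(estimates, std_errors):
    """Inverse-variance weighted pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# Hypothetical site-level G x E interaction estimates (log odds ratios)
# and their standard errors from three sites.
site_estimates = [0.20, 0.35, 0.10]
site_std_errors = [0.10, 0.20, 0.10]

pooled, pooled_se = fixed_effect_meta(site_estimates, site_std_errors)
print(f"pooled estimate = {pooled:.3f}, SE = {pooled_se:.3f}")
```

Only the per-site estimate and standard error cross site boundaries here, which is the feasibility advantage the abstract attributes to meta-analysis; a mega-analysis would instead fit one model to the pooled individual-level records.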


American Medical Informatics Association Annual Symposium | 2016

An Empirical Study for Impacts of Measurement Errors on EHR based Association Studies

Rui Duan; Ming Cao; Yonghui Wu; Jing Huang; Joshua C. Denny; Hua Xu; Yong Chen


IEEE International Conference on Healthcare Informatics | 2018

Comparing Pharmacovigilance Outcomes Between FAERS and EMR Data for Acute Mania Patients

Xinyuan Zhang; Rui Duan; Jingcheng Du; Jing Huang; Yong Chen; Cui Tao

Collaboration


Top Co-Authors

Yong Chen, University of Pennsylvania

Rui Duan, University of Pennsylvania

Jingcheng Du, University of Texas Health Science Center at Houston

Cui Tao, University of Texas Health Science Center at Houston

Xinyuan Zhang, Pennsylvania State University

Jason H. Moore, University of Pennsylvania

Hua Xu, University of Texas Health Science Center at Houston

Yonghui Wu, University of Texas Health Science Center at Houston