Publication


Featured research published by Zhi Geng.


Journal of the Royal Statistical Society: Series B (Statistical Methodology) | 2002

Criteria for confounders in epidemiological studies

Zhi Geng; Jianhua Guo; Wing-Kam Fung

The paper addresses a formal definition of a confounder based on the qualitative definition that is commonly used in standard epidemiology textbooks. To derive the criterion for a factor to be a confounder given by Miettinen and Cook and to clarify inconsistency between various criteria for a confounder, we introduce the concepts of an irrelevant factor, an occasional confounder and a uniformly irrelevant factor. We discuss criteria for checking these and show that Miettinen and Cook's criterion can also be applied to occasional confounders. Moreover, we consider situations with multiple potential confounders, and we obtain two necessary conditions that are satisfied by each confounder set. None of the definitions and results presented in this paper require the ignorability and sufficient control confounding assumptions which are commonly employed in observational and epidemiological studies.


Fuzzy Systems and Knowledge Discovery | 2005

Structural learning of graphical models and its applications to traditional Chinese medicine

Ke Deng; Delin Liu; Shan Gao; Zhi Geng

Bayesian networks and undirected graphical models are often used to cope with uncertainty for complex systems with a large number of variables. They can be applied to discover causal relationships and associations between variables. In this paper, we present heuristic algorithms for structural learning of undirected graphical models from observed data. These algorithms are applied to traditional Chinese medicine.


International Symposium on Neural Networks | 2006

Combination of network construction and cluster analysis and its application to traditional Chinese medicine

Mingfeng Wang; Zhi Geng; Miqu Wang; Feng Chen; Weijun Ding; Ming Liu

Bayesian networks and cluster analysis are widely applied to network construction, data mining and causal discovery in bioinformatics and medical research. A Bayesian network is used to describe associations among a large number of variables, such as a gene network and a network describing relationships among symptoms. Cluster analysis is used to cluster associated variables. For example, genes with similar expressions or associated symptoms are grouped into a cluster. In this paper, we combine these approaches of network construction and cluster analysis. On the one hand, we use Bayesian networks to explain relationships among variables in each cluster; on the other hand, we use a hierarchical clustering approach to assist network construction, and we propose a stepwise structure learning approach. In this stepwise approach, a subnetwork over a larger cluster is constructed by combining several subnetworks over small clusters whenever these small clusters are grouped together. The proposed approach is applied to a traditional Chinese medical study on a kidney disease.


Statistics in Medicine | 2012

Discovering herbal functional groups of traditional Chinese medicine

Ping He; Ke Deng; Zhihai Liu; Delin Liu; Jun S. Liu; Zhi Geng

In traditional Chinese medicine (TCM), a prescription for a patient often contains several herbs. Some herbs are often used together in prescriptions, and these herbs can be considered as a functional group. In this paper, we propose an approach for discovering herbal functional groups from a large set of prescriptions recorded in TCM books. These functional groups are allowed to overlap with each other. Our approach is validated with a simulation study and applied to a data set containing thousands of TCM prescriptions.


International Journal of Environmental Research and Public Health | 2018

A Bayesian Approach to Real-Time Monitoring and Forecasting of Chinese Foodborne Diseases

Xueli Wang; Moqin Zhou; Jinzhu Jia; Zhi Geng; Gexin Xiao

Foodborne diseases have a large impact on public health and are often underreported, because many patients delay treatment when they suffer from foodborne diseases. In Hunan Province (China), a total of 21,226 confirmed foodborne disease cases were reported from 1 March 2015 to 28 February 2016 by the Foodborne Surveillance Database (FSD) of the China National Centre for Food Safety Risk Assessment (CFSA). The purpose of this study was to make use of the daily number of visiting patients to forecast the daily true number of patients. Our main contribution is that we take the reporting delays into consideration and propose a Bayesian hierarchical model for this forecast problem. The data show that there were 21,226 confirmed cases reported among 21,866 visiting patients, a proportion as high as 97%. Given this observation, the Bayesian hierarchical model was established to predict the daily true number of patients using the number of visiting patients. We propose several scoring rules to assess the performance of different nowcasting procedures. We conclude that Bayesian nowcasting that accounts for right truncation of the reporting delays performs well for short-term forecasting and could effectively predict the epidemic trends of foodborne diseases. This approach could also provide a methodological basis for future foodborne disease monitoring and control strategies, which are crucial for public health.
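
The 97% figure quoted in the abstract above can be checked directly from the two counts it reports. The short sketch below only reproduces that arithmetic; the variable names are illustrative and do not come from the paper.

```python
# Counts quoted in the abstract (Hunan Province, 1 March 2015 to 28 February 2016).
confirmed_cases = 21_226
visiting_patients = 21_866

# Share of visiting patients that were reported as confirmed cases.
proportion = confirmed_cases / visiting_patients
print(f"{proportion:.1%}")
```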


ACM Transactions on Intelligent Systems and Technology | 2016

Semiparametric Inference of the Complier Average Causal Effect with Nonignorable Missing Outcomes

Hua Chen; Peng Ding; Zhi Geng; Xiao Hua Zhou

Noncompliance and missing data often occur in randomized trials, which complicates the inference of causal effects. When both noncompliance and missing data are present, previous papers proposed moment and maximum likelihood estimators for binary and normally distributed continuous outcomes under the latent ignorable missing data mechanism. However, the latent ignorable missing data mechanism may be violated in practice, because the missing data mechanism may depend directly on the missing outcome itself. Under noncompliance and an outcome-dependent nonignorable missing data mechanism, previous studies showed the identifiability of the complier average causal effect for discrete outcomes. In this article, we study the semiparametric identifiability and estimation of the complier average causal effect in randomized clinical trials with both all-or-none noncompliance and outcome-dependent nonignorable missing continuous outcomes, and propose a two-step maximum likelihood estimator in order to eliminate the infinite-dimensional nuisance parameter. Our method does not need to specify a parametric form for the missing data mechanism. We also evaluate the finite-sample properties of our method via extensive simulation studies and sensitivity analysis, with an application to a double-blinded psychiatric clinical trial.


Statistics in Medicine | 2014

Identifiability of subgroup causal effects in randomized experiments with nonignorable missing covariates

Peng Ding; Zhi Geng

Although randomized experiments are widely regarded as the gold standard for estimating causal effects, missing data of the pretreatment covariates makes it challenging to estimate the subgroup causal effects. When the missing data mechanism of the covariates is nonignorable, the parameters of interest are generally not point identifiable, and we can only get bounds for the parameters of interest, which may be too wide for practical use. In some real cases, we have prior knowledge that some restrictions may be plausible. We show the identifiability of the causal effects and joint distributions for four interpretable missing data mechanisms and evaluate the performance of the statistical inference via simulation studies. One application of our methods to a real data set from a randomized clinical trial shows that one of the nonignorable missing data mechanisms fits better than the ignorable missing data mechanism, and the results conform to the study's original expert opinions. We also illustrate the potential applications of our methods to observational studies using a data set from a job-training program.


Communications in Statistics - Theory and Methods | 2012

Using Auxiliary Data for Binomial Parameter Estimation with Nonignorable Nonresponse

Xueli Wang; Hua Chen; Zhi Geng; Xiaohua Zhou

Nonignorable nonresponse is a nonresponse mechanism that depends on the values of the variable having nonresponse. When observed data from a binomial distribution suffer from missing values due to a nonignorable nonresponse mechanism, the binomial distribution parameters become unidentifiable without any other auxiliary information or assumption. To address the problem of nonidentifiability, existing methods are mostly based on the log-linear regression model. In this article, we focus on the model when the nonresponse is nonignorable and we consider using auxiliary data to improve identifiability; furthermore, we derive the maximum likelihood estimator (MLE) for the binomial proportion and its associated variance. We present results for an analysis of real-life data from the SARS study in China. Finally, the simulation study shows that the proposed method gives promising results.
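
The unidentifiability described in the abstract above can be illustrated with a small numerical sketch. It is not the paper's model or estimator. It assumes a single binary outcome Y with P(Y=1) = p and response probabilities r1 = P(respond | Y=1) and r0 = P(respond | Y=0); these parameter names and the helper observed_cells are hypothetical. The observable data have only three cells (responded with Y=1, responded with Y=0, did not respond), which sum to one, so three unknowns cannot be pinned down without auxiliary information.

```python
import math

def observed_cells(p, r1, r0):
    """Observable cell probabilities under outcome-dependent nonresponse.

    p  : P(Y = 1)                (hypothetical parameter name)
    r1 : P(respond | Y = 1)      (hypothetical parameter name)
    r0 : P(respond | Y = 0)      (hypothetical parameter name)
    Returns (P(Y=1, responded), P(Y=0, responded), P(not responded)).
    """
    responded_1 = p * r1
    responded_0 = (1 - p) * r0
    not_responded = 1 - responded_1 - responded_0
    return (responded_1, responded_0, not_responded)

# Two different parameter settings, including different values of p ...
model_a = observed_cells(p=0.5, r1=0.8, r0=0.6)
model_b = observed_cells(p=0.4, r1=1.0, r0=0.5)

# ... that imply the same distribution for everything that can be observed.
same = all(math.isclose(a, b) for a, b in zip(model_a, model_b))
print(model_a, model_b, same)
```

Because both settings produce identical observable cell probabilities, no amount of data of this form can distinguish p = 0.5 from p = 0.4, which is why extra information such as auxiliary data is needed.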


International Conference on Intelligent Computing | 2009

Hidden Markov Models with Multiple Observers

Hua Chen; Zhi Geng; Jinzhu Jia

Hidden Markov models (HMMs) usually assume that the state transition matrices and the output models are time-invariant. Without this assumption, the parameters in an HMM may not be identifiable. In this paper, we propose an HMM with multiple observers such that its parameters are locally identifiable without the time-invariance assumption. We show a sufficient condition for local identifiability of parameters in HMMs.


Lecture Notes in Computer Science | 2006

Identifiability and estimation of probabilities from multiple databases with incomplete data and sampling selection

Jinzhu Jia; Zhi Geng; Mingfeng Wang

For an application problem, there may be multiple databases, and each database may not contain complete variables or attributes, that is, some variables are observed but some others are missing. Further, data of a database may be collected conditionally on some designed variables. In this paper, we discuss problems related to data mining from such multiple databases. We propose an approach for detecting identifiability of a joint distribution from multiple databases. For an identifiable joint distribution, we further present the expectation-maximization (EM) algorithm for calculating the maximum likelihood estimates (MLEs) of the joint distribution.

Collaboration


Dive into Zhi Geng's collaborations.

Top Co-Authors

Jinzhu Jia

University of California


Xueli Wang

Beijing University of Posts and Telecommunications


Zongming Ma

University of Pennsylvania
