Publication


Featured research published by Jingjing Yin.


Biometrical Journal | 2016

Improved nonparametric estimation of the optimal diagnostic cut-off point associated with the Youden index under different sampling schemes.

Jingjing Yin; Hani M. Samawi; Daniel F. Linder

A diagnostic cut-off point for a biomarker measurement is needed to classify a randomly selected subject as either diseased or healthy. However, the cut-off point is usually unknown and must be estimated according to some optimization criterion. One important criterion is the Youden index, which has been widely adopted in practice. The Youden index, defined as the maximum of (sensitivity + specificity - 1) over all cut-off points, directly measures the largest total diagnostic accuracy a biomarker can achieve. Therefore, it is desirable to estimate the optimal cut-off point associated with the Youden index. Sometimes, taking the actual measurements of a biomarker is difficult and expensive, while ranking subjects without the actual measurement is relatively easy. In such cases, ranked set sampling can give more precise estimation than simple random sampling, as ranked set samples are more likely to span the full range of the population. In this study, kernel density estimation is used to numerically solve for an estimate of the optimal cut-off point. The asymptotic distributions of the kernel estimators based on the two sampling schemes are derived analytically, and we prove that the estimators based on ranked set sampling are more efficient than those based on simple random sampling and that both estimators are asymptotically unbiased. Furthermore, asymptotic confidence intervals are derived. Extensive simulations are carried out to compare the proposed method under ranked set sampling with the same method under simple random sampling, and ranked set sampling performs better in all cases. A real data set is analyzed to illustrate the proposed method.
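The definition of the Youden index in the abstract above can be illustrated with a minimal sketch. It uses the plain empirical estimator, not the kernel-based estimator the paper proposes, and it assumes that higher biomarker values indicate disease. The function name `youden_index` and the sample values are hypothetical, chosen only for illustration.

```python
def youden_index(healthy, diseased):
    """Return (J, cutoff) maximizing sensitivity + specificity - 1.

    Assumption: a subject is classified as diseased when the
    measurement is strictly greater than the cutoff.
    """
    best_j, best_cut = -1.0, None
    # Every observed value is a candidate cutoff.
    for c in sorted(set(healthy) | set(diseased)):
        sensitivity = sum(x > c for x in diseased) / len(diseased)
        specificity = sum(x <= c for x in healthy) / len(healthy)
        j = sensitivity + specificity - 1
        if j > best_j:
            best_j, best_cut = j, c
    return best_j, best_cut


# Made-up measurements, not data from the paper.
healthy = [1.0, 1.2, 1.9, 2.1, 2.4]
diseased = [2.0, 2.8, 3.1, 3.5, 4.0]
j, cutoff = youden_index(healthy, diseased)
print(j, cutoff)
```

Scanning the observed values is enough here because the empirical sensitivity and specificity only change at data points.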


Communications in Statistics - Theory and Methods | 2017

On kernel density estimation based on different stratified sampling with optimal allocation

Hani M. Samawi; Arpita Chatterjee; Jingjing Yin; Haresh Rochani

Kernel density estimation is probably the most widely used nonparametric statistical method for estimating probability densities. In this paper, we investigate the performance of the kernel density estimator based on stratified simple random sampling and stratified ranked set sampling. Some asymptotic properties of the kernel estimator are established under both sampling schemes. Simulation studies are designed to examine the performance of the proposed estimators under varying distributional assumptions. These findings are also illustrated with a dataset on bilirubin levels in babies in a neonatal intensive care unit.
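For readers unfamiliar with kernel density estimation, the following is a minimal sketch of the ordinary Gaussian kernel estimator for a single unstratified sample. It does not implement the stratified or ranked set versions studied in the paper. The function name `gaussian_kde`, the bandwidth, and the data values are illustrative assumptions, not taken from the paper or its bilirubin dataset.

```python
import math


def gaussian_kde(data, bandwidth):
    """Return a function x -> estimated density at x.

    f_hat(x) = (1 / (n * h)) * sum_i K((x - x_i) / h),
    with K the standard normal density and h the bandwidth.
    """
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return density


# Made-up observations for illustration only.
data = [4.1, 5.0, 5.3, 6.2, 7.8]
f_hat = gaussian_kde(data, bandwidth=0.8)
print(f_hat(5.0))
```

The bandwidth is fixed by hand here; in practice it would be chosen by a data-driven rule.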


Journal of Statistical Computation and Simulation | 2018

A Simpler Approach for Mediation Analysis for Dichotomous Mediators in Logistic Regression

Hani M. Samawi; Jingxian Cai; Daniel F. Linder; Haresh Rochani; Jingjing Yin

Mediation is a hypothesized causal chain among three variables. Mediation analysis for continuous response variables is well developed in the literature, and it can be shown that the indirect effect equals the total effect minus the direct effect. However, mediation analysis for categorical responses is still not fully developed. The purpose of this article is to propose a simpler method of analysing the mediation effect among three variables when the dependent and mediator variables are both dichotomous. We propose using the latent variable technique, which in turn adjusts for the necessary condition that the indirect effect equals the total effect minus the direct effect. An intensive simulation study is conducted to compare the proposed method with other methods in the literature. Our theoretical derivation and simulation study show that the proposed approach is simpler to use and at least as good as other approaches in the literature. We illustrate our approach, in comparison with the method of Winship and Mare, by testing for potential mediators of the relationship between depression and obesity among children and adolescents, using data from the 2011–2012 national children's health survey.
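The identity mentioned in the abstract for continuous responses can be written out as follows. The notation (a, b, c, c') is the conventional one for linear mediation models and is an assumption here, since the abstract does not give symbols.

```latex
% Linear mediation model with exposure X, mediator M, outcome Y.
\begin{align}
  Y &= i_1 + c\,X + e_1            && \text{(total effect } c\text{)} \\
  M &= i_2 + a\,X + e_2            && \text{(effect of } X \text{ on the mediator)} \\
  Y &= i_3 + c'\,X + b\,M + e_3    && \text{(direct effect } c'\text{)}
\end{align}
% For least-squares fits of these linear models the decomposition is exact:
\[
  \underbrace{a\,b}_{\text{indirect}} \;=\; \underbrace{c}_{\text{total}} \;-\; \underbrace{c'}_{\text{direct}} .
\]
```

With logistic models for a dichotomous outcome and mediator, the coefficients are on different scales and this equality generally fails, which is the gap the article addresses.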


Statistics in Medicine | 2017

Notes on the overlap measure as an alternative to the Youden index: How are they related?

Hani M. Samawi; Jingjing Yin; Haresh Rochani; Viral Panchal

The receiver operating characteristic (ROC) curve is frequently used to evaluate and compare diagnostic tests. As one of the ROC summary indices, the Youden index measures the effectiveness of a diagnostic marker and enables the selection of an optimal threshold value (cut-off point) for the marker. Recently, the overlap coefficient, which directly captures the similarity between two distributions, has been considered as an alternative index for determining the diagnostic performance of markers. In this case, a larger overlap indicates worse diagnostic accuracy, and vice versa. This paper provides a graphical demonstration and mathematical derivation of the relationship between the Youden index and the overlap coefficient and states their advantages over the most popular diagnostic measure, the area under the ROC curve. Furthermore, we outline the differences between the Youden index and the overlap coefficient and identify situations in which the overlap coefficient outperforms the Youden index. Numerical examples and a real data analysis are provided.
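A small numerical check can make the link between the two indices concrete. The sketch below assumes two normal biomarker distributions with equal variance, so that their densities cross exactly once; in that special case the Youden index equals one minus the overlap coefficient. The distribution parameters are made up, and the abstract does not say in which form the paper states the relationship.

```python
from statistics import NormalDist

# Hypothetical biomarker distributions (equal variance, one crossing point).
healthy = NormalDist(mu=0.0, sigma=1.0)
diseased = NormalDist(mu=1.0, sigma=1.0)

# Overlap coefficient: area under min(f_healthy, f_diseased).
ovl = healthy.overlap(diseased)

# With equal variances the optimal cutoff is the midpoint of the means.
cutoff = (healthy.mean + diseased.mean) / 2

# J = specificity + sensitivity - 1 = F_healthy(c) - F_diseased(c).
youden = healthy.cdf(cutoff) - diseased.cdf(cutoff)

print(round(youden, 4), round(ovl, 4), round(youden + ovl, 4))
```

When the densities cross more than once, for example with unequal variances, the two indices no longer add up to one.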


Biometrical Journal | 2015

Improved Estimation of Diagnostic Cut-Off Point Associated with Youden Index Using Ranked Set Sampling

Jingjing Yin; Hani M. Samawi; Chen Mo; Daniel F. Linder

A diagnostic cut-off point for a biomarker measurement is needed to classify a randomly selected subject as either diseased or healthy. However, the cut-off point is usually unknown and must be estimated according to some optimization criterion. One important criterion is the Youden index, which has been widely adopted in practice. The Youden index, defined as the maximum of (sensitivity + specificity - 1) over all cut-off points, directly measures the largest total diagnostic accuracy a biomarker can achieve. Therefore, it is desirable to estimate the optimal cut-off point associated with the Youden index. Sometimes, taking the actual measurements of a biomarker is difficult and expensive, while ranking subjects without the actual measurement is relatively easy. In such cases, ranked set sampling can give more precise estimation than simple random sampling, as ranked set samples are more likely to span the full range of the population. In this study, kernel density estimation is used to numerically solve for an estimate of the optimal cut-off point. The asymptotic distributions of the kernel estimators based on the two sampling schemes are derived analytically, and we prove that the estimators based on ranked set sampling are more efficient than those based on simple random sampling and that both estimators are asymptotically unbiased. Furthermore, asymptotic confidence intervals are derived. Extensive simulations are carried out to compare the proposed method under ranked set sampling with the same method under simple random sampling, and ranked set sampling performs better in all cases. A real data set is analyzed to illustrate the proposed method.
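Ranked set sampling, as referred to in the abstract above, can be sketched in a few lines. This is a minimal illustration of the standard balanced scheme under the simplifying assumption of perfect ranking, where units are ordered by their true values; in practice ranking is done by judgment or a cheap auxiliary variable, without measuring. The function name `ranked_set_sample` and its parameters are hypothetical.

```python
import random


def ranked_set_sample(draw, set_size, cycles, rng):
    """Draw a balanced ranked set sample of size set_size * cycles.

    In each cycle and for each rank r, draw `set_size` units, order
    them, and keep (measure) only the unit with rank r.
    """
    sample = []
    for _ in range(cycles):
        for r in range(set_size):
            candidates = sorted(draw(rng) for _ in range(set_size))
            sample.append(candidates[r])
    return sample


rng = random.Random(0)  # fixed seed so the sketch is reproducible
rss = ranked_set_sample(
    lambda g: g.gauss(0.0, 1.0), set_size=3, cycles=4, rng=rng
)
print(len(rss), min(rss), max(rss))
```

Only one unit per ranked set is measured, which is why the scheme is attractive when measurement is costly but ranking is cheap.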


Journal of Statistical Theory and Practice | 2018

Reducing Sample Size Needed for Accelerated Failure Time Model Using More Efficient Sampling Methods

Hani M. Samawi; Amal Helu; Haresh Rochani; Jingjing Yin; Lili Yu; Robert L. Vogel

Survival data are time-to-event data, such as time to death, time to appearance of a tumor, or time to recurrence of a disease. Accelerated failure time (AFT) models provide a linear relationship between the log of the failure time and covariates that affect the expected time to failure by contracting or expanding the time scale. The AFT model has extensive applications in the social, medical, behavioral, and public health sciences. In this article we propose a more efficient sampling method for recruiting subjects for survival analysis. We propose using a Moving Extreme Ranked Set Sampling (MERSS) or an Extreme Ranked Set Sampling (ERSS) scheme with ranking based on an easy-to-evaluate baseline auxiliary variable known to be associated with survival time. This article demonstrates that these approaches provide a more powerful testing procedure, as well as a more efficient estimate of the hazard ratio, than those based on simple random sampling (SRS). Theoretical derivations and simulation studies are provided. The Iowa 65+ Rural Health Study data are used to illustrate the methods developed in this article.
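The linear relationship described in the abstract is usually written in the following standard form. The symbols are conventional notation assumed here for illustration; the abstract itself does not define them.

```latex
% Accelerated failure time model for subject i:
%   T_i         failure time
%   x_i         covariate vector
%   beta        regression coefficients
%   sigma       scale parameter
%   epsilon_i   error term with a specified distribution
\[
  \log T_i \;=\; \beta_0 + \boldsymbol{\beta}^{\top}\mathbf{x}_i + \sigma\,\varepsilon_i .
\]
% Equivalently, covariates rescale time: T_i = \exp(\beta_0 + \boldsymbol{\beta}^{\top}\mathbf{x}_i)\, T_{0i},
% where T_{0i} = \exp(\sigma\,\varepsilon_i) is the baseline failure time.
```

The second form shows the "contracting or expanding the time scale" interpretation: a positive coefficient stretches the time to failure, a negative one shortens it.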


Communications in Statistics - Theory and Methods | 2018

On quantiles estimation based on different stratified sampling with optimal allocation

Hani M. Samawi; Arpita Chatterjee; Jingjing Yin; Haresh Rochani

This work considers the problem of estimating a quantile function based on different stratified sampling mechanisms. First, we develop an estimate for population quantiles based on stratified simple random sampling (SSRS) and extend the discussion to stratified ranked set sampling (SRSS). Furthermore, the asymptotic behavior of the proposed estimators is presented. In addition, we derive an analytical expression for the optimal allocation under both sampling schemes. Simulation studies are designed to examine the performance of the proposed estimators under varying distributional assumptions. The efficiency of the proposed estimates is further illustrated by analyzing a real data set from CHNS.


Communications in Statistics - Theory and Methods | 2018

Increased Fisher's Information for Parameters of Association in Count Regression via Extreme Ranks

Daniel F. Linder; Jingjing Yin; Haresh Rochani; Hani M. Samawi; Sanjay Sethi

The article details a sampling scheme that can lead to a reduction in sample size and cost in clinical and epidemiological studies of association between a count outcome and a risk factor. We show that inference in two common generalized linear models for count data, Poisson and negative binomial regression, is improved by using a ranked auxiliary covariate, which guides the sampling procedure. This type of sampling has typically been used to improve inference on a population mean. The novelty of the current work is its extension to log-linear models and derivations showing that the sampling technique results in an increase in information compared to simple random sampling. Specifically, we show that under the proposed sampling strategy the maximum likelihood estimate of the risk factor’s coefficient is improved through an increase in the Fisher information. A simulation study is performed to compare the mean squared error, bias, variance, and power of the sampling routine with simple random sampling under various data-generating scenarios. We also illustrate the merits of the sampling scheme on a real data set from a clinical setting of males with chronic obstructive pulmonary disease. Empirical results from the simulation study and data analysis agree with the theoretical derivations, suggesting that a significant reduction in sample size, and hence study cost, can be realized while achieving the same precision as a simple random sample.


Communications in Statistics - Simulation and Computation | 2018

Methods Improving the Estimate of Diagnostic Odds Ratio

Yisong Huang; Jingjing Yin; Hani M. Samawi

The diagnostic odds ratio is defined as the ratio of the odds of a positive diagnostic test result in the diseased population to the odds in the non-diseased population. It is a function of sensitivity and specificity and can be seen as an indicator of diagnostic accuracy for the evaluation of a biomarker or test. The naïve estimator of the diagnostic odds ratio fails when either sensitivity or specificity is close to one, which drives the denominator of the diagnostic odds ratio to zero. We propose several methods to adjust for such situations. Agresti and Coull’s adjustment is a common and straightforward way to handle extreme binomial proportions. Alternatively, estimation methods based on a more advanced sampling design can be applied, one which systematically selects samples from the underlying population based on judgment ranks. Under such a design, the odds can be estimated by a sum of indicator functions, which avoids division by zero and provides a valid estimate. The asymptotic mean and variance of the proposed estimators are derived. All methods are readily applied to confidence interval estimation and hypothesis testing for the diagnostic odds ratio. A simulation study is conducted to compare the efficiency of the proposed methods. Finally, the proposed methods are illustrated using a real dataset.
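The failure of the naïve estimator, and the effect of a simple adjustment, can be seen in a short sketch. It assumes the familiar "add two successes and two failures" form of the Agresti and Coull adjustment applied separately to sensitivity and specificity; whether the paper uses exactly this form is an assumption. The function name `diagnostic_odds_ratio`, its `add` parameter, and the counts are hypothetical.

```python
def diagnostic_odds_ratio(tp, fn, tn, fp, add=0.0):
    """DOR = [sens / (1 - sens)] / [(1 - spec) / spec].

    `add` pseudo-observations are added to each cell of each binomial
    proportion (add=0 gives the naive estimator; add=2 is the assumed
    Agresti-Coull style adjustment).
    """
    sens = (tp + add) / (tp + fn + 2 * add)
    spec = (tn + add) / (tn + fp + 2 * add)
    try:
        return (sens / (1 - sens)) / ((1 - spec) / spec)
    except ZeroDivisionError:
        # Naive estimator is undefined when sens or spec hits 0 or 1.
        return float("inf")


# Made-up counts: every diseased subject tests positive (fn = 0).
naive = diagnostic_odds_ratio(tp=50, fn=0, tn=45, fp=5)
adjusted = diagnostic_odds_ratio(tp=50, fn=0, tn=45, fp=5, add=2)
print(naive, adjusted)
```

The adjusted proportions can never equal exactly zero or one, so the adjusted estimate is always finite.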


Communications for Statistical Applications and Methods | 2016

Evaluating the Efficiency of Treatment Comparison in Crossover Design by Allocating Subjects Based On Ranked Auxiliary Variable

Yisong Huang; Hani M. Samawi; Robert L. Vogel; Jingjing Yin; Worlanyo Eric Gato; Daniel F. Linder

The validity of statistical inference depends on proper randomization methods. However, even with proper randomization, treatment groups can be imbalanced with respect to important characteristics. In this paper, we introduce a method based on ranked auxiliary variables for treatment allocation in crossover designs using Latin square models. We evaluate the improvement in the efficiency of treatment comparisons using the proposed method. Our simulation study reveals that the proposed method provides a more powerful test than simple randomization with the same sample size. The proposed method is illustrated by an experiment comparing the effect of two different concentrations of titanium dioxide nanofiber (TDNF) on weight gain in rats.

Collaboration

Top Co-Authors

Hani M. Samawi, Georgia Southern University
Haresh Rochani, Georgia Southern University
Daniel F. Linder, Georgia Southern University
Robert L. Vogel, Georgia Southern University
Arpita Chatterjee, Georgia Southern University
Lili Yu, Georgia Southern University
Amal Helu, Carnegie Mellon University
Jingxian Cai, Georgia Southern University
Yi Hao, Georgia Southern University
Yisong Huang, Georgia Southern University