Jianping Dong
Michigan Technological University
Publication
Featured research published by Jianping Dong.
American Journal of Human Genetics | 2003
Shuanglin Zhang; Qiuying Sha; Huann Sheng Chen; Jianping Dong; Renfang Jiang
Studies using haplotypes of multiple tightly linked markers are more informative than those using a single marker. However, studies based on multimarker haplotypes have some difficulties. First, if we consider each haplotype as an allele and use the conventional single-marker transmission/disequilibrium test (TDT), then the rapid increase in the degrees of freedom with an increasing number of markers means that the statistical power of the conventional tests will be low. Second, the parental haplotypes cannot always be unambiguously reconstructed. In the present article, we propose a haplotype-sharing TDT (HS-TDT) for linkage or association between a disease-susceptibility locus and a chromosome region in which several tightly linked markers have been typed. This method is applicable to both quantitative traits and qualitative traits. It is applicable to any size of nuclear family, with or without ambiguous phase information, and it is applicable to any number of alleles at each of the markers. The degrees of freedom (in a broad sense) of the test increase linearly as the number of markers considered increases but do not increase as the number of alleles at the markers increases. Our simulation results show that the HS-TDT has the correct type I error rate in structured populations and that, in most cases, the power of HS-TDT is higher than the power of the existing single-marker TDTs and haplotype-based TDTs.
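A minimal sketch of the kind of haplotype-sharing contrast that underlies the HS-TDT, assuming a simple shared-segment-length similarity; the function names and toy haplotypes are illustrative, and this is not the exact test statistic of the paper.

```python
# Illustrative haplotype-sharing contrast (not the exact HS-TDT statistic).
from itertools import combinations

def shared_interval(h1, h2, i):
    """Number of consecutive matching markers around marker i (0 if the
    haplotypes differ at marker i itself)."""
    if h1[i] != h2[i]:
        return 0
    left, right = i, i
    while left > 0 and h1[left - 1] == h2[left - 1]:
        left -= 1
    while right < len(h1) - 1 and h1[right + 1] == h2[right + 1]:
        right += 1
    return right - left + 1

def mean_pairwise_sharing(haplotypes, i):
    """Average sharing around marker i over all pairs of haplotypes."""
    pairs = list(combinations(haplotypes, 2))
    return sum(shared_interval(a, b, i) for a, b in pairs) / len(pairs)

# The HS-TDT idea: at each marker, contrast the sharing among transmitted
# haplotypes with the sharing among nontransmitted haplotypes.
transmitted = [(1, 2, 1, 1), (1, 2, 1, 2), (1, 2, 1, 1)]
nontransmitted = [(2, 1, 1, 2), (1, 1, 2, 2), (2, 2, 2, 1)]
for marker in range(4):
    delta = (mean_pairwise_sharing(transmitted, marker)
             - mean_pairwise_sharing(nontransmitted, marker))
    print(marker, round(delta, 3))
```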
Annals of Human Genetics | 2005
Qiuying Sha; Jianping Dong; Renfang Jiang; Shuanglin Zhang
Candidate gene association tests are currently performed using several intragenic SNPs simultaneously, by testing SNP haplotype or genotype effects in multifactorial diseases or traits. The number of haplotypes drastically increases with an increase in the number of typed SNPs. As a result, large numbers of haplotypes will introduce large degrees of freedom in haplotype‐based tests, and thus limit the power of the tests.
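For biallelic SNPs this growth is exponential, which is what inflates the degrees of freedom of a saturated haplotype test; a short illustration:

```python
# Up to 2**m distinct haplotypes (and 2**m - 1 degrees of freedom in a
# saturated haplotype test) for m biallelic SNPs.
for m in (2, 5, 10, 20):
    print(m, 2 ** m, 2 ** m - 1)
```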
Journal of Computational and Graphical Statistics | 1994
Jianping Dong; Jeffrey S. Simonoff
In recent years several authors have investigated the use of smoothing methods for sparse multinomial data. In particular, Hall and Titterington (1987) studied kernel smoothing in detail. It is pointed out here that the bias of kernel estimates of probabilities for cells near the boundaries of the multinomial vector can dominate the mean sum of squared error of the estimator for most true probability vectors. Fortunately, boundary kernels devised to correct boundary effects for kernel regression estimators can achieve the same result for these estimators. Properties of estimates based on boundary kernels are investigated and compared to unmodified kernel estimates and maximum penalized likelihood estimates. Monte Carlo evidence indicates that the boundary-corrected kernel estimates usually outperform uncorrected kernel estimates and are quite competitive with penalized likelihood estimates.
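A minimal sketch of boundary-aware kernel smoothing for sparse multinomial cell probabilities, assuming a simple discrete kernel whose weights are renormalized at the edges; this illustrates one common form of boundary correction, not the specific boundary kernels analyzed in the paper.

```python
import numpy as np

def smooth_cell_probabilities(counts, bandwidth=2):
    """Kernel-smooth multinomial cell proportions with a discrete
    Epanechnikov-type kernel, renormalizing the weights near the boundaries
    (one simple form of boundary correction)."""
    counts = np.asarray(counts, dtype=float)
    p_hat = counts / counts.sum()
    k = len(p_hat)
    offsets = np.arange(-bandwidth, bandwidth + 1)
    weights = 1.0 - (offsets / (bandwidth + 1)) ** 2      # Epanechnikov-like weights
    smoothed = np.empty(k)
    for i in range(k):
        idx = i + offsets
        inside = (idx >= 0) & (idx < k)                   # drop cells beyond the boundary
        w = weights[inside] / weights[inside].sum()       # renormalize at the edges
        smoothed[i] = np.dot(w, p_hat[idx[inside]])
    return smoothed

# Sparse multinomial example: many cells, few observations.
counts = [0, 3, 1, 0, 0, 2, 0, 0, 1, 0, 0, 4]
print(np.round(smooth_cell_probabilities(counts), 3))
```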
Genetic Epidemiology | 2001
Jinming Li; Dai Wang; Jianping Dong; Renfang Jiang; Kui Zhang; Shuanglin Zhang; Hongyu Zhao; Fengzhu Sun
We develop a score statistic to test for linkage in the presence of linkage disequilibrium for quantitative traits. We then extend this method to analyze multiple tightly linked markers. One potential limitation with the use of many genetic markers is the large number of degrees of freedom involved that may reduce the overall power to detect linkage. To overcome this limitation, we propose to group haplotypes on the basis of haplotype similarity before performing transmission disequilibrium tests. Finally, we apply these methods to the Genetic Analysis Workshop 12 simulated data and compare their power.
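A sketch of the haplotype-grouping step, assuming a simple matching-proportion similarity and a single-linkage rule with an arbitrary threshold; the paper's actual grouping criterion is not reproduced here.

```python
# Illustrative grouping of haplotypes by pairwise similarity (single linkage);
# the similarity function and threshold are assumptions, not the paper's rule.
def matching_proportion(h1, h2):
    return sum(a == b for a, b in zip(h1, h2)) / len(h1)

def group_haplotypes(haplotypes, threshold=0.75):
    groups = []
    for h in haplotypes:
        for g in groups:
            if any(matching_proportion(h, member) >= threshold for member in g):
                g.append(h)
                break
        else:
            groups.append([h])
    return groups

haps = [(1, 1, 2, 1), (1, 1, 2, 2), (2, 2, 1, 1), (2, 2, 1, 2), (1, 1, 2, 1)]
for g in group_haplotypes(haps):
    print(g)
```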
Forest Ecology and Management | 2001
E Kuuseoks; Jianping Dong; David D. Reed
Age data from hazel (Corylus spp.) and total shrub populations of seven aspen stands in northern Minnesota were used to investigate two age-density distribution models: a negative exponential model and a power function model. The negative exponential model, which implies a constant mortality rate for shrubs of all ages, is the better model for describing population dynamics of both hazel and total shrub populations in the study area. The mortality rate within a stand was independent of stem density and age, decreased when overstory basal area increased, and increased as site quality increased. The mortality rate of hazel increased with light availability, but the total shrub population mortality rate decreased with increasing light availability. The regeneration rate of new aerial stems differed among the sampled stands, indicating its importance in regulating shrub dynamics. The regeneration rate was negatively related to overstory basal area, and positively related to light availability and site index. Hazel regeneration was also positively related to overstory age, implying more regeneration of aerial stems in older stands. Coefficients of the negative exponential shrub age structure relationships are estimated from overstory characteristics and site conditions.
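The negative exponential model corresponds to an age-density curve density(age) = N0 · exp(-m · age) with a constant mortality rate m; a minimal sketch of fitting it by log-linear regression on made-up stem counts:

```python
import numpy as np

# Negative exponential age-density model: density(age) = N0 * exp(-m * age),
# where the same mortality rate m applies to stems of every age.
ages = np.arange(1, 9)                                       # age classes (years); made-up data
density = np.array([420, 300, 230, 160, 120, 85, 60, 45])    # stems per hectare

# Log-linear least squares: log(density) = log(N0) - m * age
slope, intercept = np.polyfit(ages, np.log(density), 1)
m_hat, n0_hat = -slope, np.exp(intercept)
print(f"estimated mortality rate m = {m_hat:.3f} per year, N0 = {n0_hat:.1f}")
```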
Statistics & Probability Letters | 2001
Jianping Dong; Chuang Zheng
Jones et al. (Biometrika 85, 235) proposed an edge frequency polygon estimator to estimate a probability density function. Their estimator has a smaller asymptotic mean integrated squared error than that of the frequency polygon estimator. In this paper we introduce a generalized edge frequency polygon estimator. Instead of averaging the heights of the two bins at each bin edge, we take weighted averages of the heights of the neighboring 2k (k ≥ 1) bins, which further reduces the asymptotic mean integrated squared error.
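A sketch of the generalized edge frequency polygon under simple equal weights over the 2k neighboring bins (the paper derives weights that minimize the asymptotic MISE); padding empty bins beyond the ends of the histogram is an assumption made here for simplicity.

```python
import numpy as np

def edge_frequency_polygon(data, n_bins=20, k=2, weights=None):
    """Generalized edge frequency polygon: at each bin edge, estimate the
    density as a weighted average of the 2k neighbouring bin heights, then
    interpolate linearly between edges.  Default weights are a simple equal
    choice, not the MISE-optimal ones from the paper."""
    heights, edges = np.histogram(data, bins=n_bins, density=True)
    if weights is None:
        weights = np.ones(2 * k) / (2 * k)
    # Pad the histogram with zero-height bins so every edge has k bins per side.
    padded = np.concatenate([np.zeros(k), heights, np.zeros(k)])
    edge_values = np.array([
        np.dot(weights, padded[j:j + 2 * k]) for j in range(len(edges))
    ])
    return edges, edge_values   # density is linear between (edges, edge_values)

rng = np.random.default_rng(0)
edges, vals = edge_frequency_polygon(rng.normal(size=500))
print(np.round(vals[:5], 3))
```

With k = 1 and equal weights this reduces to the original edge frequency polygon, which averages the two bins adjacent to each edge.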
Communications in Statistics-theory and Methods | 2009
Jianping Dong; Renfang Jiang
In this article, we introduce a wavelet threshold estimator to estimate multinomial probabilities. The advantages of the estimator are its adaptability to the roughness and sparseness of the data. The asymptotic behavior of the estimator is investigated through an often-used criterion: the mean sum of squared error (MSSE). We show that the MSSE of the estimator achieves the optimal rate of convergence. Its performance on finite samples is examined through simulation studies, which show favorable results for the new estimator over the commonly used kernel estimator.
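A minimal sketch of wavelet thresholding for multinomial cell probabilities using a hand-rolled Haar transform and a fixed soft threshold; the threshold value and the power-of-two cell count are illustrative assumptions, not the estimator's actual tuning.

```python
import numpy as np

def haar_smooth_probabilities(counts, threshold=0.05):
    """Haar-transform the raw cell proportions, soft-threshold the detail
    coefficients, and invert.  The number of cells must be a power of two
    here; the threshold is an illustrative constant."""
    p = np.asarray(counts, dtype=float)
    approx = p / p.sum()
    details = []
    while len(approx) > 1:
        even, odd = approx[0::2], approx[1::2]
        d = (even - odd) / np.sqrt(2)                    # Haar detail coefficients
        approx = (even + odd) / np.sqrt(2)               # Haar approximation coefficients
        d = np.sign(d) * np.maximum(np.abs(d) - threshold, 0.0)   # soft threshold
        details.append(d)
    for d in reversed(details):                          # inverse Haar transform
        rec = np.empty(2 * len(d))
        rec[0::2] = (approx + d) / np.sqrt(2)
        rec[1::2] = (approx - d) / np.sqrt(2)
        approx = rec
    return np.clip(approx, 0, None)

counts = [0, 3, 1, 0, 0, 2, 0, 0, 1, 0, 0, 4, 2, 0, 0, 1]   # 16 = 2**4 cells
print(np.round(haar_smooth_probabilities(counts), 3))
```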
American Journal of Human Genetics | 2004
Shuanglin Zhang; Qiuying Sha; Huann Sheng Chen; Jianping Dong; Renfang Jiang
To the Editor: Knapp and Becker (2004 [in this issue]) have argued that genotyping errors may lead to an inflated type I error rate for the haplotype-sharing transmission/disequilibrium test (HS-TDT) that we proposed (Zhang et al. 2003). The reason is that transmitted haplotypes are partially checked for genotyping errors by Mendelian inconsistency (MI), whereas there is no such checking at all for nontransmitted haplotypes. As a result of this unbalanced checking, nontransmitted haplotypes appear less similar than transmitted haplotypes, which may lead to an inflated type I error rate for the HS-TDT. This is especially true when there is only one child per nuclear family. As noted by Gordon et al. (2001), the original TDT also has this problem. The HS-TDT that we proposed is applicable to any size of nuclear family and to different traits. To quantify the magnitude of the type I error inflation of the HS-TDT, Knapp and Becker (2004) performed a simulation study of nuclear families with one child. In fact, the magnitude of the type I error inflation caused by the unbalanced checking of genotyping errors depends on the genotyping error rate as well as on the following factors:

1. The number of children. If there is more than one child in the nuclear family, genotyping errors in the haplotypes that are not transmitted to the first child may still be detectable, because these haplotypes may be transmitted to the other children. So the inclusion of families with more than one child can reduce the type I error inflation.

2. The allele frequencies. A smaller minor allele frequency leads to a larger probability of homozygous genotypes and, therefore, a larger probability of detectable genotyping errors (MI). Consequently, it leads to larger type I error inflation (see table 3 of Gordon et al. 2001). For the HS-TDT, a marker with a small minor allele frequency in the middle part of the haplotype has a bigger effect than a marker with a small minor allele frequency at the edge of the haplotype.

3. The haplotype similarity measure.

We believe that the reasons for the high type I error rate of the HS-TDT in Knapp and Becker's simulation studies are the following: (1) only families with one child were used; (2) the minor allele frequencies are small for the markers in the middle part of the haplotypes (of the 19 markers in total, the minor allele frequencies of markers 7 through 16 are 0.16, 0.125, 0.143, 0.143, 0.11, 0.268, 0.089, 0.143, 0.143, and 0.036, respectively); and (3) the haplotype similarity measure that we proposed in Zhang et al. (2003) is not robust to genotyping errors.

To compare different haplotype similarity measures, we propose another measure (the "new similarity measure") as follows. For two haplotypes, H and h, let Hi (hi) denote the allele of haplotype H (h) at marker i. To find the similarity of the two haplotypes around marker i, we compare alleles of the two haplotypes at the right-hand markers, beginning with marker i+1, until marker i+r satisfies Hi+r ≠ hi+r and either Hi+r+1 ≠ hi+r+1 or Hi+r+2 ≠ hi+r+2. Then, similarly, we compare alleles of the two haplotypes at the left-hand markers, beginning with marker i-1, until marker i-l satisfies Hi-l ≠ hi-l and either Hi-l-1 ≠ hi-l-1 or Hi-l-2 ≠ hi-l-2. The new similarity measure is defined as the distance between marker i-l and marker i+r (a code sketch of this measure follows Table 1 below). Note that a genotyping error that occurs at one marker but not at the nearby markers will not affect the new similarity measure, and the probability that genotyping errors occur at several consecutive markers is very small.

To compare the effect of the number of children and of the different haplotype similarity measures, we performed simulation studies using the data and the error options EO2 and EO3 given by Knapp and Becker (2004). We did not use EO1 because our program automatically deletes families with MI genotyping errors. The simulation results are summarized in Table 1. The table shows good agreement between the nominal and estimated type I error rates for all simulated samples when there are three children in each nuclear family. In the case of one child per family, the inflation of the type I error rate is greatly reduced by using the new similarity measure. We are currently investigating methods that are more robust to genotyping errors.

Table 1. Parameters and Results of a Simulation Study of the Type I Error Rate of the HS-TDT in the Presence of Genotyping Error

Similarity  No. of Children  No. of Nuclear    Typing Error  EO2      EO2      EO3      EO3
Measure     per Family       Families/Sample   Rate (e)      α=.05    α=.01    α=.05    α=.01
Original    1                100               .01           .457     .227     .364     .147
Original    1                100               .005          .228     .08      .193     .075
Original    1                200               .005          .364     .147     .315     .120
Original    3                100               .01           .053     .012     .06      .016
Original    3                100               .005          .056     .013     .046     .011
Original    3                200               .005          .044     .008     .053     .006
New         1                100               .01           .117     .037     .092     .019
New         1                100               .005          .079     .016     .073     .014
New         1                200               .005          .081     .016     .101     .029
New         3                100               .01           .059     .016     .042     .006
New         3                100               .005          .059     .015     .043     .009
New         3                200               .005          .047     .010     .045     .003

Note.—The "original similarity measure" is the one used by Zhang et al. (2003); EO2 and EO3 are the error options of Knapp and Becker (2004); α is the nominal type I error rate. Simulation studies were based on 1,000 replicated samples.
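The sketch promised above: a direct transcription of the new similarity measure in code. Marker-index distance stands in for physical distance, and the haplotype ends are treated as stopping points; both are assumptions the letter leaves open.

```python
def new_similarity(H, h, i):
    """New similarity measure around marker i: scan outward from marker i until
    a mismatch that is confirmed by a second mismatch at one of the next two
    markers in the same direction; the similarity is the distance between the
    left and right stopping markers (marker-index units here)."""
    n = len(H)

    def stops_scan(j, step):
        # A mismatch at j stops the scan only if confirmed by a second mismatch
        # at j + step or j + 2 * step; a lone mismatch (e.g., an isolated
        # genotyping error) is ignored.
        if H[j] == h[j]:
            return False
        return any(0 <= j + m * step < n and H[j + m * step] != h[j + m * step]
                   for m in (1, 2))

    right = i + 1
    while right < n - 1 and not stops_scan(right, +1):
        right += 1
    left = i - 1
    while left > 0 and not stops_scan(left, -1):
        left -= 1
    return right - left

# A lone mismatch at marker 4 does not shorten the shared region around marker 2.
H = (1, 1, 2, 1, 1, 1, 2, 2)
h = (1, 1, 2, 1, 2, 1, 2, 2)
print(new_similarity(H, h, 2))   # 7: the full typed region
```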
Communications in Statistics-theory and Methods | 2000
Jianping Dong; Renfang Jiang
We define and compute a boundary kernel for local polynomial regression. We prove that the new kernel provides an improvement over the existing kernels. Simulations show the improvement in finite samples.
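A sketch of the local polynomial (here local linear) estimator that a boundary kernel would plug into, using an Epanechnikov weight; this is not the boundary kernel constructed in the paper.

```python
import numpy as np

def local_linear(x, y, x0, bandwidth=0.2):
    """Local linear regression estimate at x0 with Epanechnikov weights
    (the generic estimator; not the paper's boundary kernel)."""
    u = (x - x0) / bandwidth
    w = np.where(np.abs(u) <= 1, 0.75 * (1 - u ** 2), 0.0)   # kernel weights
    X = np.column_stack([np.ones_like(x), x - x0])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)  # weighted LS
    return beta[0]            # intercept = fitted value at x0

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 1, 200))
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=200)
# Estimate at the boundary (x0 = 0) and in the interior (x0 = 0.5).
print(round(local_linear(x, y, 0.0), 3), round(local_linear(x, y, 0.5), 3))
```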
Communications in Statistics-theory and Methods | 1996
Jianping Dong; Qian Ye
This paper introduces two estimators, a boundary-corrected minimum variance kernel estimator based on a uniform kernel and a discrete frequency polygon estimator, for the cell probabilities of ordinal contingency tables. Simulation results show that the minimum variance boundary kernel estimator has a smaller average sum of squared error than the existing boundary kernel estimators. The discrete frequency polygon estimator is simple and easy to interpret, and it is competitive with the minimum variance boundary kernel estimator. It is proved that both estimators have an optimal rate of convergence in terms of mean sum of squared error. The estimators are also defined for high-dimensional tables.
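A sketch of uniform-kernel smoothing of ordinal contingency-table cell probabilities with a simple boundary correction (averaging only over the in-table neighborhood); the bandwidth and the correction are illustrative, not the paper's minimum variance boundary kernel or its discrete frequency polygon.

```python
import numpy as np

def smooth_ordinal_table(counts, bandwidth=1):
    """Uniform-kernel smoothing of cell probabilities in a two-way ordinal
    table; near the edges the kernel mass is renormalized by averaging only
    over the cells that fall inside the table."""
    counts = np.asarray(counts, dtype=float)
    p = counts / counts.sum()
    r, c = p.shape
    out = np.empty_like(p)
    for i in range(r):
        for j in range(c):
            block = p[max(i - bandwidth, 0):i + bandwidth + 1,
                      max(j - bandwidth, 0):j + bandwidth + 1]
            out[i, j] = block.mean()     # uniform weights over in-table neighbours
    return out

table = np.array([[5, 2, 0, 1],
                  [3, 0, 1, 0],
                  [0, 1, 0, 2],
                  [1, 0, 2, 6]])
print(np.round(smooth_ordinal_table(table), 3))
```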