Publication


Featured research published by Weihua Zhang.


Proceedings of the National Academy of Sciences of the United States of America | 2003

Linkage disequilibrium in human populations

Christine Lonjou; Weihua Zhang; Andrew Collins; William Tapper; Eiram Elahi; Nikolas Maniatis; N. E. Morton

Whereas the human linkage map appears on limited evidence to be constant over populations, maps of linkage disequilibrium (LD) vary among populations that differ in gene history. The greatest difference is between populations of sub-Saharan origin and populations remotely derived from Africa after a major bottleneck that reduced their heterozygosity and altered their Malecot parameters, increasing the intercept M that reflects association in founders and decreasing the exponential decline ɛ. Variation among populations within this ethnic dichotomy is much smaller. These observations validate use of a cosmopolitan LD map based on a sizeable sample representing a large population reliably typed for markers at high density. Then an LD map for a region or isolate within an ethnic group may be created by fitting the sample LD to the cosmopolitan map, estimating Malecot parameters simultaneously. The cosmopolitan map scaled by ɛ recovers 95% of the information that a local map at the same density gives and therefore more than the information in a low-resolution local map. Relative to a Eurasian cosmopolitan map the scaling factors are estimated to be 0.82 for isolates of European descent, 1.53 for Yorubans, and 1.74 for African Americans. These observations are consistent with a common bottleneck (perhaps but not necessarily speciation) ≈173,500 years ago, if the bottleneck associated with migration out of Africa was 100,000 years ago. Eurasian populations (especially isolates with numerous cases) are efficient for genome scans, and populations of recent African origin (such as African Americans) are efficient for identification of causal polymorphisms within a candidate sequence.
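The Malecot parameters named in this abstract can be illustrated with a minimal sketch. It assumes the form of the Malecot model commonly given in the LD-map literature, ρ = (1 − L)·M·exp(−ε·d) + L. The asymptote L is not mentioned in the abstract and is an assumption here, as is the idea that scaling a cosmopolitan map is a simple multiplication of LDU locations. The function names and marker locations are hypothetical; only the scaling factors come from the abstract.

```python
import math

def malecot_rho(d, M, eps, L=0.0):
    """Expected association at distance d under the assumed Malecot form.

    M   : intercept (association in founders)
    eps : rate of exponential decline with distance
    L   : asymptote at large distance (an assumption; not named in the abstract)
    """
    return (1.0 - L) * M * math.exp(-eps * d) + L

def scale_ldu_map(ldu_locations, factor):
    """Rescale cosmopolitan LDU locations by a population-specific factor."""
    return [factor * x for x in ldu_locations]

# Scaling factors relative to a Eurasian cosmopolitan map, from the abstract.
SCALING = {"European isolates": 0.82, "Yorubans": 1.53, "African Americans": 1.74}

# Hypothetical cosmopolitan LDU locations for four markers.
cosmopolitan = [0.0, 0.5, 1.0, 2.0]
yoruban = scale_ldu_map(cosmopolitan, SCALING["Yorubans"])
```

A factor above 1 stretches the map, which corresponds to faster decline of LD with physical distance in that population.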


American Journal of Human Genetics | 2004

Positional Cloning by Linkage Disequilibrium

Nikolas Maniatis; Andrew Collins; Jane Gibson; Weihua Zhang; William Tapper; N. E. Morton

Recently, metric linkage disequilibrium (LD) maps that assign an LD unit (LDU) location for each marker have been developed (Maniatis et al. 2002). Here we present a multiple pairwise method for positional cloning by LD within a composite likelihood framework and investigate the operating characteristics of maps in physical units (kb) and LDU for two bodies of data (Daly et al. 2001; Jeffreys et al. 2001) on which current ideas of blocks are based. False-negative indications of a disease locus (type II error) were examined by selecting one single-nucleotide polymorphism (SNP) at a time as causal and taking its allelic count (0, 1, or 2, for the three genotypes) as a pseudophenotype, Y. By use of regression and correlation, association between every pseudophenotype and the allelic count of each SNP locus (X) was based on an adaptation of the Malecot model, which includes a parameter for location of the putative gene. By expressing locations in kb or LDU, greater power for localization was observed when the LDU map was fitted. The efficiency of the kb map, relative to the LDU map, to describe LD varied from a maximum of 0.87 to a minimum of 0.36, with a mean of 0.62. False-positive indications of a disease locus (type I error) were examined by simulating an unlinked causal SNP and using its allele count as a pseudophenotype. The type I error was in good agreement with Wald's likelihood theorem for both metrics and all models that were tested. Unlike tests that select only the most significant marker, haplotype, or haploset, these methods are robust to large numbers of markers in a candidate region. Contrary to predictions from tagging SNPs that retain haplotype diversity, the sample with smaller size but greater SNP density gave less error. The locations of causal SNPs were estimated with the same precision in blocks and steps, suggesting that block definition may be less useful than anticipated for mapping a causal SNP. These results provide a guide to efficient positional cloning by SNPs and a benchmark against which the power of positional cloning by haplotype-based alternatives may be measured.
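The pseudophenotype coding described in this abstract can be sketched as follows. This is only an illustration of coding genotypes as allelic counts and relating Y to X by simple regression and correlation; it does not implement the composite likelihood or the Malecot location parameter, and the genotypes and function names are hypothetical.

```python
def allelic_count(genotype, allele):
    """Code a genotype such as "AG" as 0, 1, or 2 copies of `allele`."""
    return genotype.count(allele)

def mean(values):
    return sum(values) / len(values)

def slope_and_correlation(x, y):
    """Least-squares slope of y on x and the Pearson correlation."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx, sxy / (sxx * syy) ** 0.5

# Hypothetical genotypes for five individuals at two SNPs.
causal_snp = ["AA", "AG", "GG", "AG", "AA"]
marker_snp = ["CC", "CT", "TT", "CT", "CT"]

y = [allelic_count(g, "G") for g in causal_snp]  # pseudophenotype Y
x = [allelic_count(g, "T") for g in marker_snp]  # marker allelic count X
slope, r = slope_and_correlation(x, y)
```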


Human Genetics | 2004

Does haplotype diversity predict power for association mapping of disease susceptibility?

Weihua Zhang; Andrew Collins; N. E. Morton

Many recent studies have established that haplotype diversity in a small region may not be greatly diminished when the number of markers is reduced to a smaller set of “haplotype-tagging” single-nucleotide polymorphisms (SNPs) that identify the most common haplotypes. These studies are motivated by the assumption that retention of haplotype diversity assures retention of power for mapping disease susceptibility by allelic association. Using two bodies of real data, three proposed measures of diversity, and regression-based methods for association mapping, we found no scenario for which this assumption was tenable. We compared the chi-square for composite likelihood and the maximum chi-square for single SNPs in diplotypes, excluding the marker designated as causal. All haplotype-tagging methods conserve haplotype diversity by selecting common SNPs. When the causal marker has a range of allele frequencies as in real data, chi-square decreases faster than under random selection as the haplotype-tagging set diminishes. Selecting SNPs by maximizing haplotype diversity is inefficient when their frequency is much different from the unknown frequency of the causal variant. Loss of power is minimized when the difference between minor allele frequencies of the causal SNP and a closely associated marker SNP is small, which is unlikely in ignorance of the frequency of the causal SNP unless dense markers are used. Therefore retention of haplotype diversity in simulations that do not mirror genomic allele frequencies has no relevance to power for association mapping. TagSNPs that are assigned to bins instead of haplotype blocks also lose power compared with random SNPs. This evidence favours a multi-stage design in which both models and density change adaptively.


American Journal of Human Genetics | 2007

Genome scanning by composite likelihood

N. E. Morton; Nikolas Maniatis; Weihua Zhang; Sarah Ennis; Andrew Collins

Ambitious programs have recently been advocated or launched to create genomewide databases for meta-analysis of association between DNA markers and phenotypes of medical and/or social concern. A necessary but not sufficient condition for success in association mapping is that the data give accurate estimates of both genomic location and its standard error, which are provided for multifactorial phenotypes by composite likelihood. That class includes the Malecot model, which we here apply with an illustrative example. This preliminary analysis leads to five inferences: permutation of cases and controls provides a test of association free of autocorrelation; two hypotheses give similar estimates, but one is consistently more accurate; estimation of the false-discovery rate is extended to causal genes in a small proportion of regions; the minimal data for successful meta-analysis are inferred; and power is robust for all genomic factors except minor-allele frequency. An extension to meta-analysis is proposed. Other approaches to genome scanning and meta-analysis should, if possible, be similarly extended so that their operating characteristics can be compared.


Annals of Human Genetics | 2006

Refined Association Mapping for a Quantitative Trait: Weight in the H19-IGF2-INS-TH Region

Weihua Zhang; Nikolas Maniatis; Santiago Rodriguez; G. J. Miller; I. N. M. Day; Tom R. Gaunt; Andrew Collins; N. E. Morton

Previous analyses have provided evidence for one or more loci affecting body weight in the H19‐IGF2‐INS‐TH region on chromosome 11p15. To identify the location of a possible causal locus or loci we applied association analysis by composite likelihood to a large cohort under the Malecot model for body weight. A random sample of 2731 men in the UK were typed for eleven single nucleotide polymorphisms (SNPs) in IGF2, two SNPs in H19, one SNP in INS and one microsatellite marker in the TH gene. Using F tests appropriate to small marker sets, the superiority of regression over correlation was confirmed. All the evidence for association came from IGF2, with P = 0.007 for height‐adjusted weight and P = 0.019 for weight additionally adjusted for smoking and alcohol drinking. Although the estimated point location for the suspected causal variant was close to IGF2 ApaI, the 95% confidence and support intervals covered most of IGF2 but none of the other loci. Identification of the causal SNP or SNPs within IGF2 will require typing of more variants in this region.


Human Heredity | 2001

A tournament of linkage tests in complex inheritance

Weihua Zhang; William Tapper; Andrew Collins; Kevin B. Jacobs; R.C. Elston; N. E. Morton

The performance of some weakly parametric linkage tests in common use was compared on 200 replicates of oligogenic inheritance from Genetic Analysis Workshop 10. Each random sample for the quantitative trait was dichotomized at different thresholds and also selected through 2 affected sibs, generating 8 combinations of sample and variable. The variance component program SOLAR performed best with a continuous trait, even in selected samples, when the population mean was used. The sib-pair program SIBPAL2 was best in most other cases when the phenotype product, population mean, and empirical estimates of pair correlations were used. The BETA program that introduced phenotype products was slightly more powerful than maximum likelihood scores under the null hypothesis and approached but did not exceed SIBPAL2 under its optimal conditions. Type I errors generally exceeded expectations from a χ² test, but were conservative with respect to bounds on lods. All methods can be improved by use of the population mean, empirical correlations, logistic representation for affection status, and correct lods for samples that favour the null hypothesis. It remains uncertain whether all information can be extracted by weakly parametric methods and whether correction for ascertainment bias demands a strongly parametric model. Performance on a standard set of simulated data is indispensable for recognising optimal methods.


Human Genomics | 2005

Cosmopolitan linkage disequilibrium maps

Jane Gibson; William Tapper; Weihua Zhang; N. E. Morton; Andrew Collins

Linkage maps have been invaluable for the positional cloning of many genes involved in severe human diseases. Standard genetic linkage maps have been constructed for this purpose from the Centre d'Étude du Polymorphisme Humain and other panels, and have been widely used. Now that attention has shifted towards identifying genes predisposing to common disorders using linkage disequilibrium (LD) and maps of single nucleotide polymorphisms (SNPs), it is of interest to consider a standard LD map which is somewhat analogous to the corresponding map for linkage. We have constructed and evaluated a cosmopolitan LD map by combining samples from a small number of populations using published data from a 10-megabase region on chromosome 20. In support of a pilot study, which examined a number of small genomic regions with a lower density of markers, we have found that a cosmopolitan map, which serves all populations when appropriately scaled, recovers 91 to 95 per cent of the information within population-specific maps. Recombination hot spots appear to have a dominant role in shaping patterns of LD. The success of the cosmopolitan map might be attributed to the co-localisation of hot spots in all populations. Although there must be finer scale differences between populations due to other processes (mutation, drift, selection), the results suggest that a whole-genome standard LD map would indeed be a useful resource for disease gene mapping.


Annals of Human Genetics | 2002

Mapping quantitative effects of oligogenes by allelic association

Weihua Zhang; Andrew Collins; Gonçalo R. Abecasis; Lon R. Cardon; N. E. Morton

Regression analysis of a quantitative trait as a function of a single diallelic polymorphism has been extended to allelic association by composite likelihood under the Malecot model for multiple markers. We applied the method to 10 single nucleotide polymorphisms (SNPs) spanning 27 kb of the angiotensin‐I converting enzyme (ACE) gene in British families, localising a causal SNP between G2530A and 4656(CT)3/2 in the 3′ region, at a distance of 21.6±0.9 kb from the most proximal SNP T‐5491C. Neither they nor the I/D polymorphism is causal. To clarify genetic parameters we applied combined segregation, linkage and association analysis. Stronger evidence for the 3′ region was obtained, with significant evidence of a lesser 5′ effect as reported in French and Nigerian families. However, rigorous confirmation requires that the causal SNPs be identified. Both Malecot and parametric analysis appear to have high power by comparison with alternative methods for localizing oligogenes and their causal polymorphisms.


Annals of Human Genetics | 2002

A linkage tournament: affection status, parametric analysis, multivariate traits, and enhancements to variance components and relative pairs

Weihua Zhang; Andrew Collins; C. Lonjou; N. E. Morton

Linkage tests to localize oligogenes have been extended during the past year. Using simulated data and multiplex selection we find that several tests on affected sib pairs have comparable power and type I error. Three variants of SIBPAL2 are favoured when substantial numbers of normal sibs are included, but performance relative to the BETA benchmark degrades rapidly as normal sibs are depleted by selective sampling or typing. Neglect of this fact may explain recent failure of retrospective collaboration to confirm asthma candidates in the 5q cytokine region that are supported by other studies. A fully quantitative trait favours variance components under complete ascertainment and two options in SIBPAL2 under multiplex selection, with substantial gain in power from covariance analysis if the covariate is independent of the candidate locus. A dichotomy and liability threshold give virtually identical results in the SOLAR variance components program. Comparison with single‐marker parametric analysis suggests that extension to multiple markers would be competitive with nonparametric methods in power, and superior in depth of genetic analysis. The simulated examples illustrate common problems encountered with linkage scans for oligogenes. They show that nonparametric methods provide no panacea for analytical problems posed by different phenotypes and methods of ascertainment.


Human Heredity | 2001

Combination of Linkage Evidence in Complex Inheritance

Weihua Zhang; Andrew Collins; N. E. Morton

The central problem of complex inheritance is to combine evidence from data that typically differ in markers, phenotypes, ascertainment, and other factors, without sacrificing the reliability that lods have given to linkage mapping for major loci. Here we evaluate 5 possible solutions on 200 replicates simulated in Genetic Analysis Workshop 10. Two methods differ from less efficient ones by distinguishing the tails of a normal distribution. Maximum likelihood scores (currently implemented only for the BETA model) and the approach of Self and Liang perform about as well as pooling samples, which is not feasible with heterogeneous data. With moderately heterogeneous data the Self and Liang method appears to be more efficient than maximum likelihood scores. Although improvements are being made in sample design and statistical analysis, the problem of combining linkage evidence from multiple data sets appears to have been solved. Allelic association presents different problems not yet addressed.

Collaboration


Dive into Weihua Zhang's collaboration.

Top Co-Authors

Andrew Collins

University of Southampton

N. E. Morton

Southampton General Hospital

William Tapper

University of Southampton

G. J. Miller

University of Southampton

I. N. M. Day

University of Bristol

Jane Gibson

University of Southampton
