Fereydoun Hormozdiari
University of California, Davis
Publication
Featured research published by Fereydoun Hormozdiari.
Nature | 2011
Ryan E. Mills; Klaudia Walter; Chip Stewart; Robert E. Handsaker; Ken Chen; Can Alkan; Alexej Abyzov; Seungtai Yoon; Kai Ye; R. Keira Cheetham; Asif T. Chinwalla; Donald F. Conrad; Yutao Fu; Fabian Grubert; Iman Hajirasouliha; Fereydoun Hormozdiari; Lilia M. Iakoucheva; Zamin Iqbal; Shuli Kang; Jeffrey M. Kidd; Miriam K. Konkel; Joshua M. Korn; Ekta Khurana; Deniz Kural; Hugo Y. K. Lam; Jing Leng; Ruiqiang Li; Yingrui Li; Chang-Yun Lin; Ruibang Luo
Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide-resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, copy number variants) based on whole-genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications. Most SVs (53%) were mapped to nucleotide resolution, which facilitated analysing their origin and functional impact. We examined numerous whole and partial gene deletions with a genotyping approach and observed a depletion of gene disruptions amongst high-frequency deletions. Furthermore, we observed differences in the size spectra of SVs originating from distinct formation mechanisms, and constructed a map of SV hotspots formed by common mechanisms. Our analytical framework and SV map serve as a resource for sequencing-based association studies.
Nature Genetics | 2009
Can Alkan; Jeffrey M. Kidd; Tomas Marques-Bonet; Gozde Aksay; Francesca Antonacci; Fereydoun Hormozdiari; Jacob O. Kitzman; Carl Baker; Maika Malig; Onur Mutlu; S. Cenk Sahinalp; Richard A. Gibbs; Evan E. Eichler
Despite their importance in gene innovation and phenotypic variation, duplicated regions have remained largely intractable owing to difficulties in accurately resolving their structure, copy number and sequence content. We present an algorithm (mrFAST) to comprehensively map next-generation sequence reads, which allows for the prediction of absolute copy-number variation of duplicated segments and genes. We examine three human genomes and experimentally validate genome-wide copy number differences. We estimate that, on average, 73–87 genes vary in copy number between any two individuals and find that these genic differences overwhelmingly correspond to segmental duplications (odds ratio = 135; P < 2.2 × 10⁻¹⁶). Our method can distinguish between different copies of highly identical genes, providing a more accurate assessment of gene content and insight into functional constraint without the limitations of array-based technology.
Nature | 2015
Peter H. Sudmant; Tobias Rausch; Eugene J. Gardner; Robert E. Handsaker; Alexej Abyzov; John Huddleston; Yan Zhang; Kai Ye; Goo Jun; Markus Hsi-Yang Fritz; Miriam K. Konkel; Ankit Malhotra; Adrian M. Stütz; Xinghua Shi; Francesco Paolo Casale; Jieming Chen; Fereydoun Hormozdiari; Gargi Dayama; Ken Chen; Maika Malig; Mark Chaisson; Klaudia Walter; Sascha Meiers; Seva Kashin; Erik Garrison; Adam Auton; Hugo Y. K. Lam; Xinmeng Jasmine Mu; Can Alkan; Danny Antaki
Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association.
Nature | 2013
Javier Prado-Martinez; Peter H. Sudmant; Jeffrey M. Kidd; Heng Li; Joanna L. Kelley; Belen Lorente-Galdos; Krishna R. Veeramah; August E. Woerner; Timothy D. O’Connor; Gabriel Santpere; Alexander Cagan; Christoph Theunert; Ferran Casals; Hafid Laayouni; Kasper Munch; Asger Hobolth; Anders E. Halager; Maika Malig; Jessica Hernandez-Rodriguez; Irene Hernando-Herraez; Kay Prüfer; Marc Pybus; Laurel Johnstone; Michael Lachmann; Can Alkan; Dorina Twigg; Natalia Petit; Carl Baker; Fereydoun Hormozdiari; Marcos Fernandez-Callejo
Most great ape genetic variation remains uncharacterized; however, its study is critical for understanding population history, recombination, selection and susceptibility to disease. Here we sequence to high coverage a total of 79 wild- and captive-born individuals representing all six great ape species and seven subspecies and report 88.8 million single nucleotide polymorphisms. Our analysis provides support for genetically distinct populations within each species, signals of gene flow, and the split of common chimpanzees into two distinct groups: Nigeria–Cameroon/western and central/eastern populations. We find extensive inbreeding in almost all wild populations, with eastern gorillas being the most extreme. Inferred effective population sizes have varied radically over time in different lineages and this appears to have a profound effect on the genetic diversity at, or close to, genes in almost all species. We discover and assign 1,982 loss-of-function variants throughout the human and great ape lineages, determining that the rate of gene loss has not been different in the human branch compared to other internal branches in the great ape phylogeny. This comprehensive catalogue of great ape genome diversity provides a framework for understanding evolution and a resource for more effective management of wild and captive great ape populations.
Nature Genetics | 2014
Laurent C. Francioli; Androniki Menelaou; Sara L. Pulit; Freerk van Dijk; Pier Francesco Palamara; Clara C. Elbers; Pieter B. T. Neerincx; Kai Ye; Victor Guryev; Wigard P. Kloosterman; Patrick Deelen; Abdel Abdellaoui; Elisabeth M. van Leeuwen; Mannis van Oven; Martijn Vermaat; Mingkun Li; Jeroen F. J. Laros; Lennart C. Karssen; Alexandros Kanterakis; Najaf Amin; Jouke-Jan Hottenga; Eric-Wubbo Lameijer; Mathijs Kattenberg; Martijn Dijkstra; Heorhiy Byelas; Jessica van Setten; Barbera D. C. van Schaik; Jan Bot; Isaac J. Nijman; Ivo Renkens
Whole-genome sequencing enables complete characterization of genetic variation, but geographic clustering of rare alleles demands many diverse populations be studied. Here we describe the Genome of the Netherlands (GoNL) Project, in which we sequenced the whole genomes of 250 Dutch parent-offspring families and constructed a haplotype map of 20.4 million single-nucleotide variants and 1.2 million insertions and deletions. The intermediate coverage (∼13×) and trio design enabled extensive characterization of structural variation, including midsize events (30–500 bp) previously poorly catalogued and de novo mutations. We demonstrate that the quality of the haplotypes boosts imputation accuracy in independent samples, especially for lower frequency alleles. Population genetic analyses demonstrate fine-scale structure across the country and support multiple ancient migrations, consistent with historical changes in sea level and flooding. The GoNL Project illustrates how single-population whole-genome sequencing can provide detailed characterization of genetic variation and may guide the design of future population studies.
PLOS Computational Biology | 2011
Andrew McPherson; Fereydoun Hormozdiari; Abdalnasser Zayed; Ryan Giuliany; Gavin Ha; Mark Sun; Malachi Griffith; Alireza Heravi Moussavi; Janine Senz; Nataliya Melnyk; Marina Pacheco; Marco A. Marra; Martin Hirst; Torsten O. Nielsen; S. Cenk Sahinalp; David Huntsman; Sohrab P. Shah
Gene fusions created by somatic genomic rearrangements are known to play an important role in the onset and development of some cancers, such as lymphomas and sarcomas. RNA-Seq (whole transcriptome shotgun sequencing) is proving to be a useful tool for the discovery of novel gene fusions in cancer transcriptomes. However, algorithmic methods for the discovery of gene fusions using RNA-Seq data remain underdeveloped. We have developed deFuse, a novel computational method for fusion discovery in tumor RNA-Seq data. Unlike existing methods that use only unique best-hit alignments and consider only fusion boundaries at the ends of known exons, deFuse considers all alignments and all possible locations for fusion boundaries. As a result, deFuse is able to identify fusion sequences with demonstrably better sensitivity than previous approaches. To increase the specificity of our approach, we curated a list of 60 true positive and 61 true negative fusion sequences (as confirmed by RT-PCR), and have trained an adaboost classifier on 11 novel features of the sequence data. The resulting classifier has an estimated value of 0.91 for the area under the ROC curve. We have used deFuse to discover gene fusions in 40 ovarian tumor samples, one ovarian cancer cell line, and three sarcoma samples. We report herein the first gene fusions discovered in ovarian cancer. We conclude that gene fusions are not infrequent events in ovarian cancer and that these events have the potential to substantially alter the expression patterns of the genes involved; gene fusions should therefore be considered in efforts to comprehensively characterize the mutational profiles of ovarian cancer transcriptomes.
Nature Methods | 2010
Faraz Hach; Fereydoun Hormozdiari; Can Alkan; Farhad Hormozdiari; Inanc Birol; Evan E. Eichler; S. Cenk Sahinalp
In addition to single-nucleotide variations and small insertions-deletions (indels), larger structural variations (for example, insertions, deletions, inversions, segmental duplications and copy-number polymorphisms) contribute to human genetic diversity. In almost all recent structural variation discovery (SVD) studies, short reads from a donor genome have been mapped to a reference genome as a first step. The accuracy of such an SVD study is directly correlated to the accuracy of this mapping step, which is also the main computational bottleneck of the SVD study.
Bioinformatics | 2010
Fereydoun Hormozdiari; Iman Hajirasouliha; Phuong Dao; Faraz Hach; Deniz Yorukoglu; Can Alkan; Evan E. Eichler; S. Cenk Sahinalp
Recent years have witnessed an increase in research activity for the detection of structural variants (SVs) and their association with human disease. The advent of next-generation sequencing technologies makes it possible to extend the scope of structural variation studies to a point previously unimaginable, as exemplified by the 1000 Genomes Project. Although various computational methods have been described for the detection of SVs, no such algorithm is yet fully capable of discovering transposon insertions, a very important class of SVs for the study of human evolution and disease. In this article, we provide a complete and novel formulation to discover both loci and classes of transposons inserted into genomes sequenced with high-throughput sequencing technologies. In addition, we present ‘conflict resolution’ improvements to our earlier combinatorial SV detection algorithm (VariationHunter) by taking the diploid nature of the human genome into consideration. We test our algorithms with simulated data from the Venter genome (HuRef) and are able to discover >85% of transposon insertion events with precision of >90%. We also demonstrate that our conflict resolution algorithm (denoted as VariationHunter-CR) outperforms current state-of-the-art algorithms (such as the original VariationHunter, BreakDancer and MoDIL) when tested on the genome of the Yoruba African individual (NA18507). Availability: An implementation of the algorithms is available at http://compbio.cs.sfu.ca/strvar.htm. Contact: [email protected]; [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
Research in Computational Molecular Biology | 2006
Fereydoun Hormozdiari; Petra Berenbrink; Nataša Pržulj; S. Cenk Sahinalp
The (asymptotic) degree distributions of the best-known “scale-free” network models are all similar and are independent of the seed graph used; hence, it has been tempting to assume that networks generated by these models are generally similar. In this paper, we observe that several key topological features of such networks depend heavily on the specific model and the seed graph used. Furthermore, we show that starting with the “right” seed graph (typically a dense subgraph of the protein–protein interaction network analyzed), the duplication model captures many topological features of publicly available protein–protein interaction networks very well.
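The role of the seed graph in the abstract above can be explored with a small simulation. The following is a minimal sketch of one common formulation of a duplication model, not necessarily the exact model analysed in the paper: each new node copies a randomly chosen existing node, keeps each of its edges with probability `p`, and additionally links to any existing node with probability `r / n`. The function name, the parameters `p` and `r`, and the integer node labels are all assumptions made for illustration.

```python
import random


def duplication_model(seed_edges, n_nodes, p=0.4, r=0.1, rng=None):
    """Grow an undirected graph from a seed graph by node duplication.

    seed_edges: iterable of (u, v) pairs with integer node labels (assumed).
    p: probability of keeping each edge of the duplicated node (assumed name).
    r: expected number of extra random edges per new node (assumed name).
    Returns an adjacency dict {node: set(neighbours)}.
    """
    rng = rng or random.Random()
    adj = {}
    for u, v in seed_edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    next_id = max(adj) + 1
    while len(adj) < n_nodes:
        nodes = list(adj)
        n = len(nodes)
        template = rng.choice(nodes)
        # Inherit each neighbour of the duplicated node with probability p.
        new_neighbours = {w for w in adj[template] if rng.random() < p}
        # Add a few uniformly random edges, each with probability r / n.
        new_neighbours |= {w for w in nodes if rng.random() < r / n}
        adj[next_id] = set()
        for w in new_neighbours:
            adj[next_id].add(w)
            adj[w].add(next_id)
        next_id += 1
    return adj


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    graph = duplication_model(triangle, 200, rng=random.Random(0))
    degrees = sorted((len(nb) for nb in graph.values()), reverse=True)
    print("largest degrees:", degrees[:5])
```

Running the same function with different seed graphs (for example a sparse path versus a dense clique) and comparing the resulting degree sequences is one simple way to see the seed dependence the abstract describes.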
Nature Methods | 2009
Seunghak Lee; Fereydoun Hormozdiari; Can Alkan; Michael Brudno
MoDIL: detecting small indels from clone-end sequencing with mixtures of distributions
… deletion at this location, the distribution p(Ci) will shift (Fig. 1a). If the observed cluster is the site of a heterozygous indel, approximately half of the observed mate pairs will be generated from the shifted distribution, and the other half will come from the original, unshifted p(Y) (Fig. 1b). MoDIL represents the random variable of the expected size of indel (mean of insert size minus the mapped distance) with two random variables, one for each haplotype. Given a cluster, MoDIL identifies the two distributions, {D1, D2}, with the fixed shape of p(Y) and arbitrary means, that best fit the observed data using the Kolmogorov-Smirnov test. To find the means of the two distributions, MoDIL uses the expectation-maximization algorithm and appropriate Bayesian priors to prevent over-fitting. For each distribution Dk, k ∈ {1,2}, the size of the indel event can be estimated with high confidence: its expected size follows a …
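The two-haplotype mixture fit described in the MoDIL excerpt can be illustrated with a simplified expectation-maximization loop. This is a sketch under stated assumptions, not MoDIL's implementation: a Gaussian with known standard deviation `sigma` stands in for the fixed shape of p(Y), the two components have equal weight (the heterozygous case), and the Bayesian priors and the Kolmogorov-Smirnov goodness-of-fit step are omitted. The function and parameter names are hypothetical.

```python
import math
import random


def em_two_shifts(distances, library_mean, sigma, n_iter=200):
    """Estimate two indel sizes from the mapped distances in one cluster.

    Assumptions (not from the source): Gaussian component shape with known
    sigma, equal mixing weights, no priors. Following the excerpt, the indel
    size is the mean insert size minus the mean mapped distance.
    """
    # Start the two component means at the extremes of the data.
    mu = [min(distances), max(distances)]
    for _ in range(n_iter):
        # E-step: responsibility of component 0 for each observation.
        resp = []
        for x in distances:
            w0 = math.exp(-0.5 * ((x - mu[0]) / sigma) ** 2)
            w1 = math.exp(-0.5 * ((x - mu[1]) / sigma) ** 2)
            total = w0 + w1
            resp.append(0.5 if total == 0.0 else w0 / total)
        # M-step: update only the means; the shape (sigma) stays fixed.
        s0 = sum(resp)
        s1 = len(distances) - s0
        if s0 > 0:
            mu[0] = sum(r * x for r, x in zip(resp, distances)) / s0
        if s1 > 0:
            mu[1] = sum((1 - r) * x for r, x in zip(resp, distances)) / s1
    return sorted(library_mean - m for m in mu)


if __name__ == "__main__":
    rng = random.Random(0)
    # Simulated cluster: half the mate pairs unshifted, half shifted by 50 bp.
    data = [rng.gauss(200, 10) for _ in range(200)]
    data += [rng.gauss(250, 10) for _ in range(200)]
    print(em_two_shifts(data, library_mean=200, sigma=10))
```

Keeping `sigma` fixed and updating only the means mirrors the excerpt's "fixed shape of p(Y) and arbitrary means"; a fuller treatment would replace the Gaussian with the empirical insert-size distribution of the library.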