Publication


Featured research published by Kateryna D. Makova.


Nature | 2005

Initial sequence of the chimpanzee genome and comparison with the human genome

Tarjei S. Mikkelsen; LaDeana W. Hillier; Evan E. Eichler; Michael C. Zody; David B. Jaffe; Shiaw-Pyng Yang; Wolfgang Enard; Ines Hellmann; Kerstin Lindblad-Toh; Tasha K. Altheide; Nicoletta Archidiacono; Peer Bork; Jonathan Butler; Jean L. Chang; Ze Cheng; Asif T. Chinwalla; Pieter J. de Jong; Kimberley D. Delehaunty; Catrina C. Fronick; Lucinda L. Fulton; Yoav Gilad; Gustavo Glusman; Sante Gnerre; Tina Graves; Toshiyuki Hayakawa; Karen E. Hayden; Xiaoqiu Huang; Hongkai Ji; W. James Kent; Mary Claire King

Here we present a draft genome sequence of the common chimpanzee (Pan troglodytes). Through comparison with the human genome, we have generated a largely complete catalogue of the genetic differences that have accumulated since the human and chimpanzee species diverged from our common ancestor, constituting approximately thirty-five million single-nucleotide changes, five million insertion/deletion events, and various chromosomal rearrangements. We use this catalogue to explore the magnitude and regional variation of mutational forces shaping these two genomes, and the strength of positive and negative selection acting on their genes. In particular, we find that the patterns of evolution in human and chimpanzee protein-coding genes are highly correlated and dominated by the fixation of neutral and slightly deleterious alleles. We also use the chimpanzee genome as an outgroup to investigate human population genetics and identify signatures of selective sweeps in recent human evolution.


Nature | 2010

Complete Khoisan and Bantu genomes from southern Africa

Stephan C. Schuster; Webb Miller; Aakrosh Ratan; Lynn P. Tomsho; Belinda Giardine; Lindsay R. Kasson; Robert S. Harris; Desiree C. Petersen; Fangqing Zhao; Ji Qi; Can Alkan; Jeffrey M. Kidd; Yazhou Sun; Daniela I. Drautz; Pascal Bouffard; Donna M. Muzny; Jeffrey G. Reid; Lynne V. Nazareth; Qingyu Wang; Richard Burhans; Cathy Riemer; Nicola E. Wittekindt; Priya Moorjani; Elizabeth A. Tindall; Charles G. Danko; Wee Siang Teo; Anne M. Buboltz; Zhenhai Zhang; Qianyi Ma; Arno Oosthuysen

The genetic structure of the indigenous hunter-gatherer peoples of southern Africa, the oldest known lineage of modern human, is important for understanding human diversity. Studies based on mitochondrial and small sets of nuclear markers have shown that these hunter-gatherers, known as Khoisan, San, or Bushmen, are genetically divergent from other humans. However, until now, fully sequenced human genomes have been limited to recently diverged populations. Here we present the complete genome sequences of an indigenous hunter-gatherer from the Kalahari Desert and a Bantu from southern Africa, as well as protein-coding regions from an additional three hunter-gatherers from disparate regions of the Kalahari. We characterize the extent of whole-genome and exome diversity among the five men, reporting 1.3 million novel DNA differences genome-wide, including 13,146 novel amino acid variants. In terms of nucleotide substitutions, the Bushmen seem to be, on average, more different from each other than, for example, a European and an Asian. Observed genomic differences between the hunter-gatherers and others may help to pinpoint genetic adaptations to an agricultural lifestyle. Adding the described variants to current databases will facilitate inclusion of southern Africans in medical research efforts, particularly when family and medical histories can be correlated with genome-wide data.


Nature | 2002

Strong male-driven evolution of DNA sequences in humans and apes

Kateryna D. Makova; Wen-Hsiung Li

Studies of human genetic diseases have suggested a higher mutation rate in males than in females and the male-to-female ratio (α) of mutation rate has been estimated from DNA sequence and microsatellite data to be about 4–6 in higher primates. Two recent studies, however, claim that α is only about 2 in humans. This is even smaller than the estimates (α > 4) for carnivores and birds; humans should have a higher α than carnivores and birds because of a longer generation time and a larger sex difference in the number of germ cell cycles. To resolve this issue, we sequenced a noncoding fragment on Y of about 10.4 kilobases (kb) and a homologous region on chromosome 3 in humans, greater apes, and lesser apes. Here we show that our estimate of α from the internal branches of the phylogeny is 5.25 (95% confidence interval (CI) 2.44 to ∞), similar to the previous estimates, but significantly higher than the two recent ones. In contrast, for the external (short, species-specific) branches, α is only 2.23 (95% CI: 1.47–3.84). We suggest that closely related species are not suitable for estimating α, because of ancient polymorphism and other factors. Moreover, we provide an explanation for the small estimate of α in a previous study. Our study reinstates a high α in hominoids and supports the view that DNA replication errors are the primary source of germline mutation.
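
The abstract above does not spell out how α is derived from the Y-chromosome and chromosome 3 sequences. A minimal sketch, assuming the standard relation for this kind of comparison (a Y-linked sequence is always transmitted through males, while an autosomal sequence spends half its time in each sex, so the Y/autosome divergence ratio equals 2α / (1 + α)), is given below. The function name and the input value are hypothetical and are not taken from the paper.

```python
import math


def alpha_from_y_autosome_ratio(ratio: float) -> float:
    """Estimate the male-to-female mutation rate ratio (alpha).

    Assumes Y/A = 2 * alpha / (1 + alpha), where `ratio` is the
    divergence on the Y chromosome divided by the divergence on a
    homologous autosomal region. Solving for alpha gives
    alpha = ratio / (2 - ratio).
    """
    if ratio <= 0:
        raise ValueError("divergence ratio must be positive")
    if ratio >= 2:
        # The relation saturates at Y/A = 2, so alpha is unbounded.
        return math.inf
    return ratio / (2.0 - ratio)


# Hypothetical illustrative ratio, not a value reported in the abstract.
example_alpha = alpha_from_y_autosome_ratio(1.68)
print(round(example_alpha, 2))
```

Because the ratio cannot exceed 2 under this model, estimates close to that limit yield an unbounded α, which is consistent with a confidence interval whose upper end is ∞.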


Current Opinion in Genetics & Development | 2002

Male-driven evolution.

Wen-Hsiung Li; Soojin V. Yi; Kateryna D. Makova

The strength of male-driven evolution - that is, the magnitude of the sex ratio of mutation rate - has been a controversial issue, particularly in primates. While earlier studies estimated the male-to-female ratio (alpha) of mutation rate to be about 4-6 in higher primates, two recent studies claimed that alpha is only about 2 in humans. However, a more recent comparison of mutation rates between a noncoding fragment on Y and a homologous region on chromosome 3 gave an estimate of alpha = 5.3, reinstating strong male-driven evolution in hominoids. Several studies investigated variation in mutation rates among genomic regions that may not be related to sex differences and found strong evidence for such variation. The causes for regional variation in mutation rate are not clear but GC content and recombination are two possible causes. Thus, while the strong male-driven evolution in higher primates suggests that errors during DNA replication in the germ cells are the major source of mutation, the contribution of some replication-independent factors such as recombination may also be important.


Nature | 2002

Chromosome-wide SNPs reveal an ancient origin for Plasmodium falciparum.

Jianbing Mu; Junhui Duan; Kateryna D. Makova; Deirdre A. Joy; Chuong Q. Huynh; Oralee H. Branch; Wen-Hsiung Li; Xin-zhuan Su

The Malaria's Eve hypothesis, proposing a severe recent population bottleneck (about 3,000–5,000 years ago) of the human malaria parasite Plasmodium falciparum, has prompted a debate about the origin and evolution of the parasite. The hypothesis implies that the parasite population is relatively homogeneous, favouring malaria control measures. Other studies, however, suggested an ancient origin and large effective population size. To test the hypothesis, we analysed single nucleotide polymorphisms (SNPs) from 204 genes on chromosome 3 of P. falciparum. We have identified 403 polymorphic sites, including 238 SNPs and 165 microsatellites, from five parasite clones, establishing chromosome-wide haplotypes and a dense map with one polymorphic marker per ∼2.3 kilobases. On the basis of synonymous SNPs and non-coding SNPs, we estimate the time to the most recent common ancestor to be ∼100,000–180,000 years, significantly older than the proposed bottleneck. Our estimated divergence time coincides approximately with the start of human population expansion, and is consistent with a genetically complex organism able to evade host immunity and other antimalarial efforts.


Genome Research | 2013

The origin, evolution, and functional impact of short insertion–deletion variants identified in 179 human genomes

Stephen B. Montgomery; David L. Goode; Erika Kvikstad; Cornelis A. Albers; Zhengdong D. Zhang; Xinmeng Jasmine Mu; Guruprasad Ananda; Bryan Howie; Konrad J. Karczewski; Kevin S. Smith; Vanessa Anaya; Rhea Richardson; Joseph S. Davis; Daniel G. MacArthur; Arend Sidow; Laurent Duret; Mark Gerstein; Kateryna D. Makova; Jonathan Marchini; Gil McVean; Gerton Lunter

Short insertions and deletions (indels) are the second most abundant form of human genetic variation, but our understanding of their origins and functional effects lags behind that of other types of variants. Using population-scale sequencing, we have identified a high-quality set of 1.6 million indels from 179 individuals representing three diverse human populations. We show that rates of indel mutagenesis are highly heterogeneous, with 43%-48% of indels occurring in 4.03% of the genome, whereas in the remaining 96% their prevalence is 16 times lower than SNPs. Polymerase slippage can explain upwards of three-fourths of all indels, with the remainder being mostly simple deletions in complex sequence. However, insertions do occur and are significantly associated with pseudo-palindromic sequence features compatible with the fork stalling and template switching (FoSTeS) mechanism more commonly associated with large structural variations. We introduce a quantitative model of polymerase slippage, which enables us to identify indel-hypermutagenic protein-coding genes, some of which are associated with recurrent mutations leading to disease. Accounting for mutational rate heterogeneity due to sequence context, we find that indels across functional sequence are generally subject to stronger purifying selection than SNPs. We find that indel length modulates selection strength, and that indels affecting multiple functionally constrained nucleotides undergo stronger purifying selection. We further find that indels are enriched in associations with gene expression and find evidence for a contribution of nonsense-mediated decay. Finally, we show that indels can be integrated in existing genome-wide association studies (GWAS); although we do not find direct evidence that potentially causal protein-coding indels are enriched with associations to known disease-associated SNPs, our findings suggest that the causal variant underlying some of these associations may be indels.


Nature Biotechnology | 2011

Harnessing cloud computing with Galaxy Cloud

Enis Afgan; Dannon Baker; Nate Coraor; Hiroki Goto; Ian M. Paul; Kateryna D. Makova; Anton Nekrutenko; James Taylor

As next-generation sequencing becomes an indispensable tool for biomedical research, it is crucial to provide analysis solutions that are usable and cost effective for biomedical researchers. Galaxy Cloud addresses this by combining the accessible Galaxy interface with automated management of cloud computing resources. Unlike purpose-built solutions, Galaxy allows users either to use existing tested best practices in the form of workflows or to construct their own analyses for novel tasks. Galaxy Cloud instances are owned and controlled entirely by the user who created them and can be used effectively in secure private clouds. Thus, Galaxy Cloud provides a solution that retains user control and privacy, makes complex analysis accessible and enables the use of practically limitless on-demand computing resources.

Continuing evolution of DNA sequencing has transformed modern biology. Reduced sequencing costs coupled with novel sequencing-based assays have led to rapid adoption of next-generation sequencing (NGS) across diverse areas of life science research. Sequencing has moved out of the genome centers into core facilities and individual labs where any investigator can access it for modest and progressively declining cost. While easy to generate in tremendous quantities, sequence data is still difficult to manage and analyze. Sophisticated informatics techniques and supporting infrastructure are needed to make sense of even conceptually simple sequencing experiments, let alone the more complex analysis techniques being developed. The most pressing challenge facing the sequencing community today is providing the informatics infrastructure and accessible analysis methods needed to make it possible for all investigators to realize the power of high-throughput sequencing to advance their research.


Genome Biology | 2011

Dynamics of mitochondrial heteroplasmy in three families investigated via a repeatable re-sequencing study

Hiroki Goto; Benjamin Dickins; Enis Afgan; Ian M. Paul; James Taylor; Kateryna D. Makova; Anton Nekrutenko

Background: Originally believed to be a rare phenomenon, heteroplasmy - the presence of more than one mitochondrial DNA (mtDNA) variant within a cell, tissue, or individual - is emerging as an important component of eukaryotic genetic diversity. Heteroplasmies can be used as genetic markers in applications ranging from forensics to cancer diagnostics. Yet the frequency of heteroplasmic alleles may vary from generation to generation due to the bottleneck occurring during oogenesis. Therefore, to understand the alterations in allele frequencies at heteroplasmic sites, it is of critical importance to investigate the dynamics of maternal mtDNA transmission. Results: Here we sequenced, at high coverage, mtDNA from blood and buccal tissues of nine individuals from three families with a total of six maternal transmission events. Using simulations and re-sequencing of clonal DNA, we devised a set of criteria for detecting polymorphic sites in heterogeneous genetic samples that is resistant to the noise originating from massively parallel sequencing technologies. Application of these criteria to nine human mtDNA samples revealed four heteroplasmic sites. Conclusions: Our results suggest that the incidence of heteroplasmy may be lower than estimated in some other recent re-sequencing studies, and that mtDNA allelic frequencies differ significantly both between tissues of the same individual and between a mother and her offspring. We designed our study in such a way that the complete analysis described here can be repeated by anyone either at our site or directly on the Amazon Cloud. Our computational pipeline can be easily modified to accommodate other applications, such as viral re-sequencing.


Genome Research | 2012

A genome-wide analysis of common fragile sites: What features determine chromosomal instability in the human genome?

Arkarachai Fungtammasan; Erin Walsh; Francesca Chiaromonte; Kristin A. Eckert; Kateryna D. Makova

Chromosomal common fragile sites (CFSs) are unstable genomic regions that break under replication stress and are involved in structural variation. They frequently are sites of chromosomal rearrangements in cancer and of viral integration. However, CFSs are undercharacterized at the molecular level and thus difficult to predict computationally. Newly available genome-wide profiling studies provide us with an unprecedented opportunity to associate CFSs with features of their local genomic contexts. Here, we contrasted the genomic landscape of cytogenetically defined aphidicolin-induced CFSs (aCFSs) to that of nonfragile sites, using multiple logistic regression. We also analyzed aCFS breakage frequencies as a function of their genomic landscape, using standard multiple regression. We show that local genomic features are effective predictors both of regions harboring aCFSs (explaining ∼77% of the deviance in logistic regression models) and of aCFS breakage frequencies (explaining ∼45% of the variance in standard regression models). In our optimal models (having highest explanatory power), aCFSs are predominantly located in G-negative chromosomal bands and away from centromeres, are enriched in Alu repeats, and have high DNA flexibility. In alternative models, CpG island density, transcription start site density, H3K4me1 coverage, and mononucleotide microsatellite coverage are significant predictors. Also, aCFSs have high fragility when colocated with evolutionarily conserved chromosomal breakpoints. Our models are predictive of the fragility of aCFSs mapped at a higher resolution. Importantly, the genomic features we identified here as significant predictors of fragility allow us to draw valuable inferences on the molecular mechanisms underlying aCFSs.


Genome Biology and Evolution | 2010

What Is a Microsatellite: A Computational and Experimental Definition Based upon Repeat Mutational Behavior at A/T and GT/AC Repeats

Yogeshwar D. Kelkar; Noelle Strubczewski; Suzanne E. Hile; Francesca Chiaromonte; Kristin A. Eckert; Kateryna D. Makova

Microsatellites are abundant in eukaryotic genomes and have high rates of strand slippage-induced repeat number alterations. They are popular genetic markers, and their mutations are associated with numerous neurological diseases. However, the minimal number of repeats required to constitute a microsatellite has been debated, and a definition of a microsatellite that considers its mutational behavior has been lacking. To define a microsatellite, we investigated slippage dynamics for a range of repeat sizes, utilizing two approaches. Computationally, we assessed length polymorphism at repeat loci in ten ENCODE regions resequenced in four human populations, assuming that the occurrence of polymorphism reflects strand slippage rates. Experimentally, we determined the in vitro DNA polymerase-mediated strand slippage error rates as a function of repeat number. In both approaches, we compared strand slippage rates at tandem repeats with the background slippage rates. We observed two distinct modes of mutational behavior. At small repeat numbers, slippage rates were low and indistinguishable from background measurements. A marked transition in mutability was observed as the repeat array lengthened, such that slippage rates at large repeat numbers were significantly higher than the background rates. For both mononucleotide and dinucleotide microsatellites studied, the transition length corresponded to a similar number of nucleotides (approximately 10). Thus, microsatellite threshold is determined not by the presence/absence of strand slippage at repeats but by an abrupt alteration in slippage rates relative to background. These findings have implications for understanding microsatellite mutagenesis, standardization of genome-wide microsatellite analyses, and predicting polymorphism levels of individual microsatellite loci.

Collaboration


Dive into Kateryna D. Makova's collaborations.

Top Co-Authors

Francesca Chiaromonte
Pennsylvania State University

Anton Nekrutenko
Pennsylvania State University

Kristin A. Eckert
Pennsylvania State University

Ian M. Paul
Pennsylvania State University

Guruprasad Ananda
Pennsylvania State University

Marta Tomaszkiewicz
Pennsylvania State University

Hiroki Goto
Pennsylvania State University

Paul Medvedev
Pennsylvania State University