Publication


Featured research published by Tatiana V. Tatarinova.


PLOS ONE | 2012

DNA Barcoding the Native Flowering Plants and Conifers of Wales

Natasha de Vere; Tim C. G. Rich; Col R. Ford; Sarah A. Trinder; Charlotte Long; Christopher Moore; Danielle Satterthwaite; Helena Davies; Joel Allainguillaume; Sandra Ronca; Tatiana V. Tatarinova; Hannah Garbett; Kevin J. Walker; Mike J. Wilkinson

We present the first national DNA barcode resource that covers the native flowering plants and conifers for the nation of Wales (1143 species). Using the plant DNA barcode markers rbcL and matK, we have assembled 97.7% coverage for rbcL, 90.2% for matK, and a dual-locus barcode for 89.7% of the native Welsh flora. We have sampled multiple individuals for each species, resulting in 3304 rbcL and 2419 matK sequences. The majority of our samples (85%) are from DNA extracted from herbarium specimens. Recoverability of DNA barcodes is lower using herbarium specimens, compared to freshly collected material, mostly due to lower amplification success, but this is balanced by the increased efficiency of sampling species that have already been collected, identified, and verified by taxonomic experts. The effectiveness of the DNA barcodes for identification (level of discrimination) is assessed using four approaches: the presence of a barcode gap (using pairwise and multiple alignments), formation of monophyletic groups using Neighbour-Joining trees, and sequence similarity in BLASTn searches. These approaches yield similar results, providing relative discrimination levels of 69.4 to 74.9% of all species and 98.6 to 99.8% of genera using both markers. Species discrimination can be further improved using spatially explicit sampling. Mean species discrimination using barcode gap analysis (with a multiple alignment) is 81.6% within 10×10 km squares and 93.3% for 2×2 km squares. Our database of DNA barcodes for Welsh native flowering plants and conifers represents the most complete coverage of any national flora, and offers a valuable platform for a wide range of applications that require accurate species identification.
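The barcode gap criterion mentioned in the abstract can be illustrated with a minimal sketch. It assumes pre-aligned sequences and a simple p-distance (proportion of differing sites); the function names and the choice of distance are illustrative assumptions, not the authors' actual pipeline. A species is counted as having a barcode gap when its largest within-species distance is smaller than its smallest distance to any other species.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences (gap columns skipped)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        return 0.0
    return sum(x != y for x, y in compared) / len(compared)


def species_with_barcode_gap(records):
    """records: list of (species, aligned_sequence) pairs.

    Returns the set of species whose maximum intraspecific distance is
    strictly smaller than their minimum interspecific distance.
    """
    max_intra = {}
    min_inter = {}
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        d = p_distance(s1, s2)
        if sp1 == sp2:
            max_intra[sp1] = max(max_intra.get(sp1, 0.0), d)
        else:
            for sp in (sp1, sp2):
                min_inter[sp] = min(min_inter.get(sp, 1.0), d)
    species = {sp for sp, _ in records}
    return {
        sp for sp in species
        if max_intra.get(sp, 0.0) < min_inter.get(sp, 1.0)
    }


# Toy data (hypothetical): species "C" shares a sequence with species "A",
# so neither of them can be discriminated; only "B" shows a gap.
toy = [
    ("A", "ACGTACGT"),
    ("A", "ACGTACGA"),
    ("B", "TTGTACGT"),
    ("C", "ACGTACGA"),
]
discriminated = species_with_barcode_gap(toy)
```

The discrimination level reported in such studies would then correspond to the share of species in the returned set, computed over whichever sampling area is considered.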


Nature Communications | 2014

Geographic population structure analysis of worldwide human populations infers their biogeographical origins

Eran Elhaik; Tatiana V. Tatarinova; Dmitri Chebotarev; Ignazio Piras; Carla Maria Calò; Antonella De Montis; Manuela Atzori; Monica Marini; Sergio Tofanelli; Paolo Francalacci; Luca Pagani; Chris Tyler-Smith; Yali Xue; Francesco Cucca; Theodore G. Schurr; Jill B. Gaieski; Carlalynne Melendez; Miguel Vilar; Amanda C. Owings; Rocío Gómez; Ricardo Fujita; Fabrício R. Santos; David Comas; Oleg Balanovsky; Elena Balanovska; Pierre Zalloua; Himla Soodyall; Ramasamy Pitchappan; ArunKumar GaneshPrasad; Michael F. Hammer

The search for a method that utilizes biological information to predict humans’ place of origin has occupied scientists for millennia. Over the past four decades, scientists have employed genetic data in an effort to achieve this goal but with limited success. While biogeographical algorithms using next-generation sequencing data have achieved an accuracy of 700 km in Europe, they were inaccurate elsewhere. Here we describe the Geographic Population Structure (GPS) algorithm and demonstrate its accuracy with three data sets using 40,000–130,000 SNPs. GPS placed 83% of worldwide individuals in their country of origin. Applied to over 200 Sardinian villagers, GPS placed a quarter of them in their villages and most of the rest within 50 km of their villages. GPS’s accuracy and power to infer the biogeography of worldwide individuals down to their country or, in some cases, village of origin underscores the promise of admixture-based methods for biogeography and has ramifications for genetic ancestry testing.


BMC Genomics | 2015

Differential Evolution approach to detect recent admixture

Konstantin Kozlov; Dmitri Chebotarev; Mehedi Hassan; Martin Triska; Petr Triska; Pavel Flegontov; Tatiana V. Tatarinova

The genetic structure of human populations is extraordinarily complex and of fundamental importance to studies of anthropology, evolution, and medicine. As increasingly many individuals are of mixed origin, there is an unmet need for tools that can infer multiple origins. Misclassification of such individuals can lead to incorrect and costly misinterpretations of genomic data, primarily in disease studies and drug trials. We present reAdmix, an advanced tool to infer ancestry that can identify the biogeographic origins of highly mixed individuals. reAdmix can incorporate an individual's knowledge of their ancestors (e.g., having some ancestors from Turkey or a Scottish grandmother). reAdmix is an online tool available at http://chcb.saban-chla.usc.edu/reAdmix/.


Journal of Pharmacokinetics and Pharmacodynamics | 2013

Two general methods for population pharmacokinetic modeling: non-parametric adaptive grid and non-parametric Bayesian

Tatiana V. Tatarinova; Michael Neely; Jay Bartroff; Michael Van Guilder; Walter M. Yamada; David S. Bayard; Roger W. Jelliffe; Robert Leary; Alyona Chubatiuk; Alan Schumitzky

Population pharmacokinetic (PK) modeling methods can be statistically classified as either parametric or nonparametric (NP). Each classification can be divided into maximum likelihood (ML) or Bayesian (B) approaches. In this paper we discuss the nonparametric case using both maximum likelihood and Bayesian approaches. We present two nonparametric methods for estimating the unknown joint population distribution of model parameter values in a pharmacokinetic/pharmacodynamic (PK/PD) dataset. The first method is the NP Adaptive Grid (NPAG). The second is the NP Bayesian (NPB) algorithm with a stick-breaking process to construct a Dirichlet prior. Our objective is to compare the performance of these two methods using a simulated PK/PD dataset. Our results showed excellent performance of NPAG and NPB in a realistically simulated PK study. This simulation allowed us to have benchmarks in the form of the true population parameters to compare with the estimates produced by the two methods, while incorporating challenges like unbalanced sample times and sample numbers as well as the ability to include the covariate of patient weight. We conclude that both NPML and NPB can be used in realistic PK/PD population analysis problems. The advantages of one versus the other are discussed in the paper. NPAG and NPB are implemented in R and freely available for download within the Pmetrics package from www.lapk.org.
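The stick-breaking construction named in the abstract can be sketched in a few lines. This is a generic truncated stick-breaking draw, not the Pmetrics implementation: the function name, the truncation level `n_atoms`, and the concentration parameter `alpha` are assumptions chosen for illustration. Each break fraction is drawn from Beta(1, alpha), and the weight of atom k is that fraction times whatever remains of the stick.

```python
import random


def stick_breaking_weights(alpha, n_atoms, rng):
    """Draw mixture weights from a truncated stick-breaking process.

    v_k ~ Beta(1, alpha); w_k = v_k * prod_{j<k} (1 - v_j).
    The last atom receives the leftover stick so the weights sum to 1.
    """
    if n_atoms < 1:
        raise ValueError("n_atoms must be at least 1")
    weights = []
    remaining = 1.0
    for _ in range(n_atoms - 1):
        v = rng.betavariate(1.0, alpha)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    weights.append(remaining)
    return weights


rng = random.Random(0)  # fixed seed so the draw is reproducible
weights = stick_breaking_weights(alpha=2.0, n_atoms=20, rng=rng)
```

In a nonparametric Bayesian population model, each weight would be paired with a support point (a vector of PK parameter values) drawn from a base distribution; smaller `alpha` concentrates the mass on fewer support points.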


Biochemical and Biophysical Research Communications | 2011

Artificial microRNAs (amiRNAs) engineering - On how microRNA-based silencing methods have affected current plant silencing research.

Gaurav Sablok; Álvaro Luis Pérez-Quintero; Mehedi Hassan; Tatiana V. Tatarinova; Camilo López

In recent years, endogenous microRNAs have been described as important regulators of gene expression in eukaryotes. Artificial microRNAs (amiRNAs) represent a recently developed miRNA-based strategy to silence endogenous genes. amiRNAs can be created by exchanging the miRNA/miRNA(∗) sequence within a miRNA precursor with a sequence designed to match the target gene; this is possible as long as the secondary RNA structure of the precursor is kept intact. In this review, we summarize the basic methodologies to design amiRNAs and detail their applications in plant genetic functional studies as well as their potential for crop genetic improvement.


Genome Biology and Evolution | 2013

Cross-Species Analysis of Genic GC3 Content and DNA Methylation Patterns

Tatiana V. Tatarinova; Eran Elhaik; Matteo Pellegrini

The GC content in the third codon position (GC3) exhibits a unimodal distribution in many plant and animal genomes. Interestingly, grasses and homeotherm vertebrates exhibit a unique bimodal distribution. High GC3 was previously found to be associated with variable expression, higher frequency of upstream TATA boxes, and an increase of GC3 from 5′ to 3′. Moreover, GC3-rich genes are predominant in certain gene classes and are enriched in CpG dinucleotides that are potential targets for methylation. Based on the GC3 bimodal distribution we hypothesize that GC3 has a regulatory role involving methylation and gene expression. To test that hypothesis, we selected diverse taxa (rice, thale cress, bee, and human) that varied in the modality of their GC3 distribution and tested the association between GC3, DNA methylation, and gene expression. We examine the relationship between cytosine methylation levels and GC3, gene expression, genome signature, gene length, and other gene compositional features. We find a strong negative correlation (Pearson’s correlation coefficient r = −0.67, P value < 0.0001) between GC3 and genic CpG methylation. The comparison between 5′-3′ gradients of CG3-skew and genic methylation for the taxa in the study suggests interplay between gene-body methylation and transcription-coupled cytosine deamination effect. Compositional features are correlated with methylation levels of genes in rice, thale cress, human, bee, and fruit fly (which acts as an unmethylated control). These patterns allow us to generate evolutionary hypotheses about the relationships between GC3 and methylation and how these affect expression patterns. Specifically, we propose that the opposite effects of methylation and compositional gradients along coding regions of GC3-poor and GC3-rich genes are the products of several competing processes.
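For readers unfamiliar with the two quantities central to this abstract, the following is a minimal sketch of how GC3 and a Pearson correlation coefficient can be computed. The function names are hypothetical, the input is assumed to be an in-frame coding sequence, and the example does not reproduce the study's data or its reported r = −0.67.

```python
import math


def gc3(cds):
    """Fraction of G or C at third codon positions of an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    third_bases = [cds[i] for i in range(2, usable, 3)]
    if not third_bases:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long numeric lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Codons ATG GCC AAA TTC have third bases G, C, A, C.
example_gc3 = gc3("ATGGCCAAATTC")
```

In an analysis of the kind described, `gc3` would be evaluated per gene and `pearson_r` applied to the per-gene GC3 values and the corresponding methylation levels.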


BMC Bioinformatics | 2017

Seqping: gene prediction pipeline for plant genomes using self-training gene models and transcriptomic data

Kuang Lim Chan; Rozana Rosli; Tatiana V. Tatarinova; Michael Hogan; Mohd Firdaus-Raih; Eng Ti Leslie Low

Background: Gene prediction is one of the most important steps in the genome annotation process. A large number of software tools and pipelines developed by various computing techniques are available for gene prediction. However, these systems have yet to accurately predict all or even most of the protein-coding regions. Furthermore, none of the currently available gene-finders has a universal Hidden Markov Model (HMM) that can perform gene prediction for all organisms equally well in an automatic fashion.

Results: We present an automated gene prediction pipeline, Seqping, that uses self-training HMM models and transcriptomic data. The pipeline processes the genome and transcriptome sequences of the target species using GlimmerHMM, SNAP, and AUGUSTUS pipelines, followed by the MAKER2 program to combine predictions from the three tools in association with the transcriptomic evidence. Seqping generates species-specific HMMs that are able to offer unbiased gene predictions. The pipeline was evaluated using the Oryza sativa and Arabidopsis thaliana genomes. Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis showed that the pipeline was able to identify at least 95% of BUSCO’s plantae dataset. Our evaluation shows that Seqping was able to generate better gene predictions compared to three HMM-based programs (MAKER2, GlimmerHMM and AUGUSTUS) using their respective available HMMs. Seqping had the highest accuracy in rice (0.5648 for CDS, 0.4468 for exon, and 0.6695 for nucleotide structure) and A. thaliana (0.5808 for CDS, 0.5955 for exon, and 0.8839 for nucleotide structure).

Conclusions: Seqping provides researchers a seamless pipeline to train species-specific HMMs and predict genes in newly sequenced or less-studied genomes. We conclude that the Seqping pipeline predictions are more accurate than gene predictions using the other three approaches with the default or available HMMs.


European Journal of Human Genetics | 2014

The 'extremely ancient' chromosome that isn't: a forensic bioinformatic investigation of Albert Perry's X-degenerate portion of the Y chromosome.

Eran Elhaik; Tatiana V. Tatarinova; Anatole A Klyosov; Dan Graur

Mendez and colleagues reported the identification of a Y chromosome haplotype (the A00 lineage) that lies at the basal position of the Y chromosome phylogenetic tree. Incorporating this haplotype, the authors estimated the time to the most recent common ancestor (TMRCA) for the Y tree to be 338 000 years ago (95% CI=237 000–581 000). Such an extraordinarily early estimate contradicts all previous estimates in the literature and is over 100 000 years older than the earliest fossils of anatomically modern humans. This estimate raises two astonishing possibilities: either the novel Y chromosome was inherited after ancestral humans interbred with another species, or anatomically modern Homo sapiens emerged earlier than previously estimated and quickly became subdivided into genetically differentiated subpopulations. We demonstrate that the TMRCA estimate was reached through inadequate statistical and analytical methods, each of which contributed to its inflation. We show that the authors ignored previously inferred Y-specific rates of substitution, incorrectly derived the Y-specific substitution rate from autosomal mutation rates, and compared unequal lengths of the novel Y chromosome with the previously recognized basal lineage. Our analysis indicates that the A00 lineage was derived from all the other lineages 208 300 (95% CI=163 900–260 200) years ago.


Molecular Oncology | 2013

Tumor Suppressors Status in Cancer Cell Line Encyclopedia

Dmitriy Sonkin; Mehedi Hassan; Denis J. Murphy; Tatiana V. Tatarinova

Tumor suppressors play a major role in the etiology of human cancer, and typically achieve a tumor‐promoting effect upon complete functional inactivation. Bi‐allelic inactivation of tumor suppressors may occur through genetic mechanisms (such as loss of function mutation, copy number (CN) loss, or loss of heterozygosity (LOH)), epigenetic mechanisms (such as promoter methylation or histone modification), or a combination of the two. We report the systematically derived status of 69 known or putative tumor suppressors across 799 samples of the Cancer Cell Line Encyclopedia. To generate this resource, we constructed a novel comprehensive computational framework for the assessment of tumor suppressor functional “status”. This approach utilizes several orthogonal genomic data types, including mutation data, copy number, LOH and expression. Through correlation with additional data types (compound sensitivity and gene set activity) we show that this integrative method provides a more accurate assessment of tumor suppressor status than can be inferred by expression, copy number, or mutation alone. This approach has the potential for a more realistic assessment of tumor suppressor genes for both basic and translational oncology research.


DNA Research | 2013

Evaluation of Codon Biology in Citrus and Poncirus trifoliata Based on Genomic Features and Frame Corrected Expressed Sequence Tags

Touqeer Ahmad; Gaurav Sablok; Tatiana V. Tatarinova; Qiang Xu; Xiuxin Deng; Wen-Wu Guo

Citrus, as one of the globally important fruit trees, has been an object of interest for understanding genetics and evolutionary processes in fruit crops. Meta-analyses of 19 Citrus species, including 4 globally and economically important Citrus sinensis, Citrus clementina, Citrus reticulata, and 1 Citrus relative Poncirus trifoliata, were performed. We observed that codons ending with A- or T- at the wobble position were preferred in contrast to C- or G- ending codons, indicating a close association with AT richness of Citrus species and P. trifoliata. The present study postulates a large repertoire of a set of optimal codons for the Citrus genus and P. trifoliata and demonstrates that GCT and GGT are evolutionarily conserved optimal codons. Our observations suggest that mutational bias is the dominating force in shaping the codon usage bias (CUB) in Citrus and P. trifoliata. Correspondence analysis (COA) revealed that the principal axis [axis 1; COA/relative synonymous codon usage (RSCU)] contributes only a minor portion (∼10.96%) of the recorded variance. In all analysed species, except P. trifoliata, Gravy and aromaticity played minor roles in resolving CUB. Compositional constraints were found to be strongly associated with the amino acid signatures in Citrus species and P. trifoliata. Our present analysis postulates compositional constraints in Citrus species and P. trifoliata and plausible role of the stress with GC3 and coevolution pattern of amino acid.
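Relative synonymous codon usage (RSCU), referred to in the abstract, is conventionally defined as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below assumes an in-frame coding sequence and, to stay short, covers only the alanine and glycine codon families (which contain the GCT and GGT codons named above); the function name and the restricted table are illustrative assumptions, not the study's code.

```python
from collections import Counter

# Two synonymous codon families from the standard genetic code.
# A full analysis would include every degenerate amino acid.
SYNONYMOUS = {
    "Ala": ["GCT", "GCC", "GCA", "GCG"],
    "Gly": ["GGT", "GGC", "GGA", "GGG"],
}


def rscu(cds, families=SYNONYMOUS):
    """Return {codon: RSCU} for each family that occurs in the sequence.

    RSCU = observed count / (family total / family size).
    A value above 1 means the codon is used more often than expected
    under equal usage of its synonyms.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    values = {}
    for family in families.values():
        total = sum(counts[codon] for codon in family)
        if total == 0:
            continue  # amino acid absent, RSCU undefined
        expected = total / len(family)
        for codon in family:
            values[codon] = counts[codon] / expected
    return values


# Codons: GCT GCT GCC GGT
example_rscu = rscu("GCTGCTGCCGGT")
```

Correspondence analysis of the kind mentioned in the abstract is typically run on a genes-by-codons matrix of such RSCU values.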

Collaboration


Explore Tatiana V. Tatarinova's collaborations.

Top Co-Authors

Eran Elhaik
University of Sheffield

Martin Triska
University of Southern California

Mehedi Hassan
University of New South Wales

Alan Schumitzky
University of Southern California

Sergey Bruskin
Moscow Institute of Physics and Technology

Denis J. Murphy
University of New South Wales