Publication


Featured research published by Alexei Fedorov.


Proceedings of the National Academy of Sciences of the United States of America | 2002

The origin of the eukaryotic cell: A genomic investigation

Hyman Hartman; Alexei Fedorov

We have collected a set of 347 proteins that are found in eukaryotic cells but have no significant homology to proteins in Archaea and Bacteria. We call these proteins eukaryotic signature proteins (ESPs). The dominant hypothesis for the formation of the eukaryotic cell is that it is a fusion of an archaeon with a bacterium. If this hypothesis is accepted, then the three cellular domains, Eukarya, Archaea, and Bacteria, would collapse into two cellular domains. We have used the existence of this set of ESPs to test this hypothesis. The evidence of the ESPs implicates a third cell (chronocyte) in the formation of the eukaryotic cell. The chronocyte had a cytoskeleton that enabled it to engulf prokaryotic cells and a complex internal membrane system where lipids and proteins were synthesized. It also had a complex internal signaling system involving calcium ions, calmodulin, inositol phosphates, ubiquitin, cyclin, and GTP-binding proteins. The nucleus was formed when a number of archaea and bacteria were engulfed by a chronocyte. This formation of the nucleus would restore the three cellular domains, as the chronocyte was not a cell that belonged to the Archaea or to the Bacteria.


Proceedings of the National Academy of Sciences of the United States of America | 2003

Large-scale comparison of intron positions in mammalian genes shows intron loss but no gain

Scott W. Roy; Alexei Fedorov; Walter Gilbert

We compared intron–exon structures in 1,560 human–mouse orthologs and 360 mouse–rat orthologs. The origin of differences in intron positions between species was inferred by comparison with an outgroup, Fugu for human–mouse and human for mouse–rat. Among 10,020 intron positions in the human–mouse comparison, we found unequivocal evidence for five independent intron losses in the mouse lineage but no evidence for intron loss in humans or for intron gain in either lineage. Among 1,459 positions in rat–mouse comparisons, we found evidence for one loss in rat but neither loss in mouse nor gain in either lineage. In each case, the intron losses were exact, without change in the surrounding coding sequence, and involved introns that are extremely short, with an average of 200 bp, an order of magnitude shorter than the mammalian average. These results favor a model whereby introns are lost through gene conversion with intronless copies of the gene. In addition, the finding of widespread conservation of intron–exon structure, even over large evolutionary distances, suggests that comparative methods employing information about gene structures should be very successful in correctly predicting exon boundaries in genomic sequences.
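The outgroup reasoning described in this abstract can be illustrated with a minimal parsimony sketch. It assumes each intron position has already been scored as present or absent in two sister species and one outgroup; the function name `infer_event` and its labels are hypothetical and are not taken from the paper.

```python
def infer_event(in_a, in_b, in_outgroup, name_a="human", name_b="mouse"):
    """Classify one intron position by simple parsimony (illustrative only).

    in_a, in_b, in_outgroup: True if an intron is present at this position
    in species A, species B, and the outgroup (e.g. Fugu for human-mouse).
    """
    if in_a == in_b:
        return "no difference"
    lacking = name_b if in_a else name_a
    having = name_a if in_a else name_b
    if in_outgroup:
        # The outgroup shares the intron, so the simplest explanation is
        # a single loss in the lineage that lacks it.
        return f"loss in {lacking}"
    # The outgroup lacks the intron, so the simplest explanation is
    # a single gain in the lineage that has it.
    return f"gain in {having}"


print(infer_event(True, False, True))   # intron in human and Fugu, absent in mouse
print(infer_event(False, True, False, name_a="mouse", name_b="rat"))
```

A real analysis would also need to confirm that the alignment around the position is reliable before counting an event; this sketch only captures the final classification step.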


Genetica | 2003

Introns in gene evolution

Larisa Fedorova; Alexei Fedorov

Introns are integral elements of eukaryotic genomes that perform various important functions and actively participate in gene evolution. We review six distinct roles of spliceosomal introns: (1) sources of non-coding RNA; (2) carriers of transcription regulatory elements; (3) actors in alternative and trans-splicing; (4) enhancers of meiotic crossing over within coding sequences; (5) substrates for exon shuffling; and (6) signals for mRNA export from the nucleus and nonsense-mediated decay. We consider transposable capacities of introns and the current state of the long-lasting debate on the ‘early-or-late’ origin of introns. Cumulative data on known types of contemporary exon shuffling and the estimation of the size of the underlying exon universe are also discussed. We argue that the processes central to introns-early (exon shuffling) and introns-late (intron insertion) theories are entirely compatible. Each has provided insight: the latter through elucidating the transposon capabilities of introns, and the former through understanding the importance of introns in genomic recombination leading to gene rearrangements and evolution.


Nucleic Acids Research | 2000

EID: the Exon–Intron Database—an exhaustive database of protein-coding intron-containing genes

Serge Saxonov; Iraj Daizadeh; Alexei Fedorov; Walter Gilbert

To aid studies of molecular evolution and to assist in gene prediction research, we have constructed an Exon-Intron Database (EID) in FASTA format. Currently, the database is derived from GenBank release 112, and it contains 51 289 protein-coding genes (287 209 exons) that harbor introns, along with extensive descriptions of each gene and its DNA and protein sequences, as well as splice motif information. There is 17% redundancy inherited from GenBank; a purge at the 99% identity level reduced the database to 42 460 genes (243 589 exons). We have created subdatabases of genes whose intron positions have been experimentally determined. One such database, constructed by comparing genomic and mRNA sequences, contains 11 242 genes (62 474 exons). A larger database of 22 196 genes (105 595 exons) was constructed by selecting on keywords to eliminate computer-predicted genes. By examining the two nucleotides adjacent to the intron boundary, we infer that there is a 2% rate of errors or other deviations from the standard GT…AG motif in nuclear genes. This criterion can be used to eliminate 4921 genes from the overall database. Various tools are provided to enable generation of user-specific subsets of the EID. The EID distribution can be obtained from http://mcb.harvard.edu/gilbert/EID
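The boundary check mentioned in this abstract (the two nucleotides at each end of an intron) could be sketched as follows. This is an illustration under stated assumptions, not the EID tooling: introns are assumed to be supplied as plain DNA strings on the coding strand, and the function names are hypothetical.

```python
def has_canonical_splice_sites(intron):
    """True if the intron starts with GT and ends with AG."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


def deviation_rate(introns):
    """Fraction of introns that deviate from the GT...AG motif."""
    if not introns:
        return 0.0
    deviant = sum(1 for s in introns if not has_canonical_splice_sites(s))
    return deviant / len(introns)


# Hypothetical toy introns: the third starts with GC instead of GT.
toy = ["GTAAGTCCTTTTCAG", "gtgagtttctctacag", "GCAAGTCCTTTTCAG", "GTATGTTTTTAG"]
print(deviation_rate(toy))
```

A gene could then be flagged for removal if any of its introns fails the check, which is one way to read the filtering criterion the abstract describes.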


Nucleic Acids Research | 2011

Critical association of ncRNA with introns

David Rearick; Ashwin Prakash; Andrew McSweeny; Samuel Shepard; Larisa Fedorova; Alexei Fedorov

It has been widely acknowledged that non-coding RNAs are master-regulators of genomic functions. However, the significance of the presence of ncRNA within introns has not received proper attention. ncRNA within introns are commonly produced through the post-splicing process and are specific signals of gene transcription events, impacting many other genes and modulating their expression. This study, along with the following discussion, details the association of thousands of ncRNAs—snoRNA, miRNA, siRNA, piRNA and long ncRNA—within human introns. We propose that such an association between human introns and ncRNAs has a pronounced synergistic effect with important implications for fine-tuning gene expression patterns across the entire genome.


Proceedings of the National Academy of Sciences of the United States of America | 2001

Intron distribution difference for 276 ancient and 131 modern genes suggests the existence of ancient introns

Alexei Fedorov; Xiaohong Cao; Serge Saxonov; Sandro J. de Souza; Scott William Roy; Walter Gilbert

Do introns delineate elements of protein tertiary structure? This issue is crucial to the debate about the role and origin of introns. We present an analysis of the full set of proteins with known three-dimensional structures that have homologs with intron positions recorded in GenBank. A computer program was generated that maps on a reference sequence the positions of all introns in homologous genes. We have applied this program to a set of 665 nonredundant protein sequences with defined three-dimensional structures in the Protein Data Bank (PDB), which yielded 8,217 introns in 407 proteins. For the subset of proteins corresponding to ancient conserved regions (ACR), we find that there is a correlation of phase-zero introns with the boundary regions of modules and no correlation for the phase-one and phase-two positions. However, for a subset of proteins without prokaryotic counterparts (131 non-ACR proteins), a set of presumably modern proteins (or proteins that have diverged extremely far from any ancestral form), we do not find any correlation of phase-zero intron positions with three-dimensional structure. Furthermore, we find an anticorrelation of phase-one intron positions with module boundaries: they actually have a preference for the interior of modules. This finding is explicable as a preference for phase-one introns to lie in glycines, between G|G sequences, the preference for glycines being anticorrelated with the three-dimensional modules. We interpret this anticorrelation as a sign that a number of phase-one introns, and hence many modern introns, have been inserted into G|G “protosplice” sequences.
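For readers unfamiliar with intron phase, the standard definition is that a phase-zero intron lies between two codons, a phase-one intron after the first base of a codon, and a phase-two intron after the second. A minimal sketch, assuming the coding lengths of consecutive exons are known (the function name is hypothetical):

```python
def intron_phases(coding_exon_lengths):
    """Return the phase of each intron separating consecutive coding exons.

    The phase of an intron is the number of coding nucleotides upstream
    of it, modulo 3.
    """
    phases = []
    upstream = 0
    for length in coding_exon_lengths[:-1]:  # no intron after the last exon
        upstream += length
        phases.append(upstream % 3)
    return phases


# Toy gene with three coding exons of 90, 100 and 50 nucleotides.
print(intron_phases([90, 100, 50]))
```

Here the first intron follows 90 coding bases (phase zero) and the second follows 190 (phase one).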


PLOS ONE | 2009

The peculiarities of large intron splicing in animals

Samuel Shepard; Mark McCreary; Alexei Fedorov

In mammals, 92% of genes contain introns, and hundreds of these introns exceed 50,000 nucleotides in length. These “large introns” must be spliced out of the pre-mRNA in a timely fashion, which involves bringing together distant 5′ donor and 3′ acceptor splice sites. In invertebrates, especially Drosophila, it has been shown that larger introns can be spliced efficiently through a process known as recursive splicing: a consecutive splicing from the 5′-end at a series of combined donor-acceptor splice sites called RP-sites. Using a computational analysis of the genomic sequences, we show that vertebrates lack the proper enrichment of RP-sites in their large introns and therefore require some other method to aid splicing. We analyzed over 15,000 non-redundant large introns from six mammals, 1,600 from chicken and zebrafish, and 560 non-redundant large introns from five invertebrates. Our bioinformatic investigation demonstrates that, unlike the studied invertebrates, the studied vertebrate genomes contain consistently abundant direct and complementary-strand interspersed repetitive elements (mainly SINEs and LINEs) that may form stems with each other in large introns. This examination showed that predicted stems are indeed abundant and stable in the large introns of mammals. We hypothesize that such stems with long loops within large introns allow intron splice sites to find each other more quickly by folding the intronic RNA upon itself at smaller intervals, thus reducing the distance between donor and acceptor sites.


Bioinformatics | 2006

Bioinformatic analysis of exon repetition, exon scrambling and trans-splicing in humans

Xiang Shao; Valery Shepelev; Alexei Fedorov

MOTIVATION: Using bioinformatic approaches we aimed to characterize poorly understood abnormalities in splicing known as exon scrambling, exon repetition and trans-splicing. RESULTS: We developed a software package that allows large-scale comparison of all human expressed sequence tag (EST) sequences to the entire set of human gene sequences. Among 5,992,495 EST sequences, 401 cases of exon repetition and 416 cases of exon scrambling were found. The vast majority of identified ESTs contain fragments rather than full-length repeated or scrambled exons. Their structures suggest that the scrambled or repeated exon fragments may have arisen in the process of cDNA cloning and not from splicing abnormalities. Nevertheless, we found 11 cases of full-length exon repetition showing that this phenomenon is real yet very rare. In searching for examples of trans-splicing, we looked only at reproducible events where at least two independent ESTs represent the same putative trans-splicing event. We found 15 ESTs representing five types of putative trans-splicing. However, all 15 cases were derived from human malignant tissues and could have resulted from genomic rearrangements. Our results provide support for a very rare but physiological occurrence of exon repetition, but suggest that apparent exon scrambling and trans-splicing result, respectively, from in vitro artifact and gene-level abnormalities. AVAILABILITY: Exon-Intron Database (EID) is available at http://www.meduohio.edu/bioinfo/eid. Programs are available at http://www.meduohio.edu/bioinfo/software.html. The Laboratory website is available at http://www.meduohio.edu/medicine/fedorov. SUPPLEMENTARY INFORMATION: Supplementary file is available at http://www.meduohio.edu/bioinfo/software.html.
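To make the two exon-order abnormalities concrete, here is a small sketch of how an EST could be labeled once its matched exons are known. It assumes the EST has already been aligned to a single gene and reduced to a list of genomic exon numbers in the order they appear in the EST; the function `classify_exon_order` is hypothetical and is not the software package described in the abstract.

```python
def classify_exon_order(exons):
    """Label an EST by the order of the exon numbers it contains.

    exons: genomic exon indices (1-based) in the order found in the EST.
    Skipped exons (e.g. 1, 3, 4) are treated as ordinary alternative
    splicing and reported as canonical.
    """
    pairs = list(zip(exons, exons[1:]))
    labels = []
    if any(a == b for a, b in pairs):
        labels.append("repetition")   # same exon joined to itself
    if any(a > b for a, b in pairs):
        labels.append("scrambling")   # a later exon precedes an earlier one
    return "+".join(labels) if labels else "canonical"


print(classify_exon_order([2, 3, 3, 4]))
print(classify_exon_order([1, 3, 2]))
```

As the abstract notes, most such hits involve exon fragments and may be cloning artifacts, so a label from a check like this would only be a starting point for manual inspection.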


Nucleic Acids Research | 2005

Computer identification of snoRNA genes using a Mammalian Orthologous Intron Database

Alexei Fedorov; Jesse Stombaugh; Michael W. Harr; Saihua Yu; Lorena Nasalean; Valery Shepelev

Based on comparative genomics, we created a bioinformatic package for computer prediction of small nucleolar RNA (snoRNA) genes in mammalian introns. The core of our approach was the use of the Mammalian Orthologous Intron Database (MOID), which contains all known introns within the human, mouse and rat genomes. Introns from orthologous genes from these three species, that have the same position relative to the reading frame, are grouped in a special orthologous intron table. Our program SNO.pl searches for conserved snoRNA motifs within MOID and reports all cases when characteristic snoRNA-like structures are present in all three orthologous introns of human, mouse and rat sequences. Here we report an example of the SNO.pl usage for searching a particular pattern of conserved C/D-box snoRNA motifs (canonical C- and D-boxes and the 6 nt long terminal stem). In this computer analysis, we detected 57 triplets of snoRNA-like structures in three mammals. Among them were 15 triplets that represented known C/D-box snoRNA genes. Six triplets represented snoRNA genes that had only been partially characterized in the mouse genome. One case represented a novel snoRNA gene, and another three cases, putative snoRNAs. Our programs are publicly available and can be easily adapted and/or modified for searching any conserved motifs within mammalian introns.
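The kind of pattern search described here can be sketched in a few lines. This is not SNO.pl: it is a simplified illustration that assumes the commonly cited C-box consensus RUGAUGA and D-box consensus CUGA (written in the DNA alphabet), a perfectly complementary terminal stem placed immediately outside the two boxes, and arbitrary spacing limits. All function names and the `min_gap`/`max_gap` parameters are hypothetical.

```python
import re

C_BOX = re.compile(r"[AG]TGATGA")  # assumed C-box consensus (RUGAUGA)
D_BOX = re.compile(r"CTGA")        # assumed D-box consensus (CUGA)
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_cd_candidates(intron, stem_len=6, min_gap=20, max_gap=150):
    """Return (start, end) spans of C/D-box snoRNA-like candidates."""
    intron = intron.upper()
    hits = []
    for c in C_BOX.finditer(intron):
        if c.start() < stem_len:
            continue  # no room for the 5' half of the terminal stem
        left = intron[c.start() - stem_len:c.start()]
        for d in D_BOX.finditer(intron, c.end()):
            gap = d.start() - c.end()
            if gap < min_gap:
                continue
            if gap > max_gap:
                break
            right = intron[d.end():d.end() + stem_len]
            # Require a full-length, perfectly base-paired terminal stem.
            if len(right) == stem_len and left == revcomp(right):
                hits.append((c.start() - stem_len, d.end() + stem_len))
    return hits


def conserved_in_all(orthologous_introns):
    """True if every orthologous intron contains at least one candidate."""
    return all(find_cd_candidates(s) for s in orthologous_introns)


toy = "TT" + "GGCACC" + "ATGATGA" + "A" * 30 + "CTGA" + "GGTGCC" + "TT"
print(find_cd_candidates(toy))
```

Requiring a hit in each of the human, mouse and rat introns, as `conserved_in_all` does, mirrors the conservation filter the abstract describes, although a real search would tolerate mismatches and variable spacing.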


Trends in Genetics | 2001

Footprints of primordial introns on the eukaryotic genome

Scott William Roy; Benjamin Peter Lewis; Alexei Fedorov; Walter Gilbert

Another expected legacy of ancient phase-zero introns is that putatively ancient introns (those widely phylogenetically distributed) should have a stronger phase-zero bias. We tested this notion by examining widely distributed C. elegans intron positions: those present in two or more kingdoms. We used a database of ancient genes with known corresponding protein structures (culled from the Protein Data Bank) and the introns from their homologues, compiled by Fedorov and collaborators (unpublished). Of 6568 total intron positions, C. elegans introns are found at 1539 positions within 280 gene families. At 233 of these C. elegans positions there are also introns in the copies of the gene from non-animal organisms. Such coincident positions are candidates for the positions of ancient introns, based on this very wide phylogenetic distribution. (These coincident intron positions do, in fact, show a stronger correlation with module boundaries; see ref. 6: Roy, S.W. et al. Centripetal modules and ancient introns. Gene 1999; 238: 85–91.) If these positions are indeed ancient, and if the phase-zero bias of introns is due to the presence of ancient introns, this ancient subset of C. elegans introns should have a higher phase-zero bias than should the set of animal-unique C. elegans introns.

Table 2 shows that this ancient subset does have a significantly stronger phase-zero bias (62.2% versus 50.8%), with a χ² of 10.4 (P = 0.0013). This is a striking verification of the expectation that part of the phase-zero bias is due to the presence of truly ancient introns. Table 3 shows further that this is a general phenomenon. There is a highly significant difference between the phase bias of animal intron positions at which introns are found in two or more kingdoms and those positions that are exclusively animal. The same is true for invertebrates and vertebrates separately, for plants, and for fungi, and there is a significant difference for protists. This trend is seen even further in positions where introns are found in three or more kingdoms. This result suggests that there were in fact ancient intron positions and that the observed excess of phase-zero positions is, at least in part, due to their presence in modern eukaryotes.

Table 2. Phase patterns of introns in Caenorhabditis elegans genes

| Positions | Phase zero | χ² | Phase one | Phase two | Total introns |
| Exclusively animal C. elegans positions | 663 (50.8%) | | 306 (23.4%) | 337 (25.8%) | 1306 |
| Widely distributed C. elegans positions | 145 (62.2%) | 10.4 | 46 (19.7%) | 42 (18.0%) | 233 |

Widely distributed signifies that an intron is also found at that position in a non-animal copy of the gene. The introns are from the database of 6568 intron positions in the set of ancient genes that have three-dimensional structures for the corresponding proteins compiled by Fedorov and collaborators (unpublished). The χ² is calculated comparing phase zero with (phase one + phase two).

Table 3. Phase patterns of introns in different kingdoms

| Group | Positions | Phase zero | χ² | Phase one | Phase two | Total introns |
| Animal | Exclusively animal | 1607 (50.2%) | | 860 (26.9%) | 735 (23.0%) | 3202 |
| Animal | Animal and other kingdom(s) | 329 (59.8%) | 17.4 | 127 (23.1%) | 94 (17.1%) | 550 |
| Animal | Exclusively vertebrate | 504 (48.2%) | | 324 (31.0%) | 217 (20.8%) | 1045 |
| Animal | Vertebrate and other kingdom(s) | 194 (61.8%) | 17.8 | 69 (22.0%) | 51 (16.2%) | 314 |
| Animal | Exclusively invertebrate | 906 (51.6%) | | 423 (24.1%) | 428 (24.4%) | 1757 |
| Animal | Invertebrate and other kingdom(s) | 246 (58.9%) | 7.2 | 96 (23.0%) | 76 (18.2%) | 418 |
| Plant | Exclusively plant | 1166 (59.9%) | | 398 (20.5%) | 381 (19.6%) | 1945 |
| Plant | Plant and other kingdom(s) | 323 (66.1%) | 6.1 | 99 (20.2%) | 67 (13.7%) | 489 |
| Fungi | Exclusively fungi | 303 (42.7%) | | 236 (33.3%) | 170 (24.0%) | 709 |
| Fungi | Fungi and other kingdom(s) | 118 (54.1%) | 8.7 | 56 (25.7%) | 44 (20.2%) | 218 |
| Protist | Exclusively protist | 34 (38.6%) | | 28 (31.8%) | 26 (29.5%) | 88 |
| Protist | Protists and other kingdom(s) | 41 (56.9%) | 5.3 | 16 (22.2%) | 15 (20.8%) | 72 |
| All | Only one kingdom | 3110 (52.3%) | | 1522 (25.6%) | 1312 (22.1%) | 5944 |
| All | Two or more kingdoms | 371 (60.4%) | 14.7 | 142 (23.1%) | 101 (16.4%) | 614 |
| All | Three or more kingdoms | 51 (64.6%) | | 13 (16.5%) | 15 (19.0%) | 79 |

The introns are from the database of 6568 intron positions in the set of ancient genes that have three-dimensional structures for the corresponding proteins compiled by Fedorov and collaborators (unpublished). The χ² is calculated comparing phase zero with (phase one + phase two).

These twin findings (first, that the intron phases in ancient genes, defined by BLAST scores, differ from those of recently transferred genes by having a stronger phase-zero bias, and second, that phylogenetically ancient intron positions also show a stronger phase-zero bias) support the theory that introns, mostly in phase zero, were present in the LUCA.

Furthermore, as phase-zero, widely phylogenetically distributed intron positions are highly correlated with the boundaries of modules (ref. 6: Roy, S.W. et al. Gene 1999; 238: 85–91), the subset of introns possessing one characteristic expected of ancient intron positions, wide phylogenetic distribution, also possesses two other traits expected of ancient positions. This is a result easily explained on a mixed intron-origin model, but not easily explained on strictly insertional ones.
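The χ2 values quoted for Tables 2 and 3 compare phase zero with (phase one + phase two) between two sets of positions, which amounts to a 2×2 contingency test. The sketch below assumes a Pearson chi-square without continuity correction (an assumption on our part, though it does reproduce the reported values to one decimal place) and uses the counts given in the text; the function names are hypothetical.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Table 2: exclusively animal vs widely distributed C. elegans positions.
# Columns are phase zero and (phase one + phase two).
table2 = chi2_2x2(663, 306 + 337, 145, 46 + 42)
print(round(table2, 1), p_value_1df(table2))

# Table 3, "All": only one kingdom vs two or more kingdoms.
table3_all = chi2_2x2(3110, 1522 + 1312, 371, 142 + 101)
print(round(table3_all, 1))
```

The first call gives a statistic that rounds to 10.4 and the second one that rounds to 14.7, in line with the values reported in the tables.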

Collaboration


Dive into Alexei Fedorov's collaboration.

Top Co-Authors


Scott William Roy

National Institutes of Health
