Publication


Featured research published by Maria A. Korotkova.


Physics Letters A | 2003

Information decomposition method to analyze symbolical sequences

Eugene V. Korotkov; Maria A. Korotkova; Nikolai A. Kudryashov

The information decomposition (ID) method to analyze symbolical sequences is presented. This method allows us to reveal a latent periodicity of any symbolical sequence. The ID method is shown to have advantages in comparison with application of the Fourier transformation, the wavelet transform and the dynamic programming method to look for latent periodicity. Examples of the latent periods for poetic texts, DNA sequences and amino acids are presented. Possible origin of a latent periodicity for different symbolical sequences is discussed. We developed a non-parametric method of Information Decomposition (ID) of a content of any symbolical sequence. The method is based on the calculation of Shannon mutual information between analyzed and artificial symbolical sequences, and allows the revealing of latent periodicity in any symbolical sequence. We show the stability of the ID method in the case of a large number of random letter changes in an analyzed symbolic sequence. We demonstrate the possibilities of the method, analyzing both poems, and DNA and protein sequences. In DNA and protein sequences we show the existence of many DNA and amino acid sequences with different types and lengths of latent periodicity. The possible origin of latent periodicity for different symbolical sequences is discussed.


Bioinformatics | 1997

Latent sequence periodicity of some oncogenes and DNA-binding protein genes

Eugene V. Korotkov; Maria A. Korotkova; J. S. Tulko

A method of latent periodicity search is developed. We use mutual information to reveal the latent periodicity of mRNA sequences. The latent periodicity of an mRNA sequence is a periodicity with a low level of similarity between any two periods inside the mRNA sequence. The mutual information between an artificial numerical sequence and an mRNA sequence is calculated. The length of the artificial sequence period is varied from 2 to 150. The high level of the mutual information between artificial and mRNA sequences allows us to find any type of latent periodicity of mRNA sequence. The latent periodicity of many mRNA coding regions has been found. For example, the retinoblastoma gene of HSRBS clone contains a region with a latent period equal to 45 bases. The A-RAF oncogene of HSARAFIR clone contains a region with a latent period equal to 84 bases. Integrated sequences for the regions with latent periodicity are determined. The potential significance of latent periodicity is discussed.
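
The core computation described in this abstract can be illustrated with a minimal sketch. It assumes the artificial sequence is simply the position index modulo the trial period (0, 1, ..., p-1, 0, 1, ...) and reports raw Shannon mutual information in bits. The function names `mutual_information` and `period_spectrum` are hypothetical, and the abstract does not say how statistical significance is assessed, so none is attempted here.

```python
import math
from collections import Counter


def mutual_information(seq, period):
    """Mutual information (bits) between seq and an artificial periodic
    sequence 0, 1, ..., period-1, 0, 1, ... of the same length.
    Assumption: the artificial sequence is the index modulo the period."""
    n = len(seq)
    # Joint counts of (artificial symbol, real symbol) and their marginals.
    joint = Counter((i % period, ch) for i, ch in enumerate(seq))
    pos_counts = Counter(i % period for i in range(n))
    sym_counts = Counter(seq)
    mi = 0.0
    for (pos, ch), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) )
        mi += (c / n) * math.log2(c * n / (pos_counts[pos] * sym_counts[ch]))
    return mi


def period_spectrum(seq, min_period=2, max_period=150):
    """Mutual information for every trial period in the given range."""
    max_period = min(max_period, len(seq) // 2)
    return {p: mutual_information(seq, p)
            for p in range(min_period, max_period + 1)}


if __name__ == "__main__":
    spectrum = period_spectrum("ACGT" * 30, 2, 10)
    for p, value in spectrum.items():
        print(p, round(value, 3))
```

For a finite sequence the raw value tends to grow with the trial period, so comparing periods of different lengths would need some normalisation or a significance test that this sketch leaves out.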


Molecular Biology | 2003

The Informational Concept of Searching for Periodicity in Symbol Sequences

Eugene V. Korotkov; Maria A. Korotkova; F. E. Frenkel; Nikolai A. Kudryashov

A method of informational decomposition has been developed, allowing one to reveal hidden periodicity in any symbol sequence. The informational decomposition is calculated without conversion of a symbol sequence into a numerical one, which facilitates finding periodicities in a symbol sequence. The method permits introducing an analog of the autocorrelation function of a symbol sequence. The method developed by us has been applied to reveal hidden periodicities in nucleotide and amino acid sequences, as well as in different poetical texts. Hidden periodicity has been detected in various genes, testifying to their quantum structure. The functional and structural role of hidden periodicity is discussed.


Journal of Integrative Bioinformatics | 2010

Study of the triplet periodicity phase shifts in genes.

Eugene V. Korotkov; Maria A. Korotkova

The definition of a phase shift of triplet periodicity (TP) is introduced. A mathematical algorithm for detecting TP phase shifts in nucleotide sequences has been developed. Gene sequences from the Kegg-46 data bank were analyzed to search for genes with a TP phase shift. A TP phase shift was found in 318329 genes (approximately 10% of the genes in Kegg-46). We suppose that shifts of the TP phase may indicate shifts of the reading frame (RF) in genes. The relationship between TP phase shifts and frame shifts in genes is discussed.
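
The idea of a TP phase shift can be illustrated with a small sketch. The abstract does not give the authors' definition or similarity measure, so the code below makes its own assumptions: the TP matrix is a 3x4 table of nucleotide frequencies by position modulo 3, and two matrices are compared with a plain sum of squared differences. The names `tp_matrix` and `phase_shift` are hypothetical.

```python
NUC = {"A": 0, "C": 1, "G": 2, "T": 3}


def tp_matrix(seq, offset=0):
    """3x4 matrix of nucleotide frequencies by (index + offset) mod 3.
    Symbols other than A, C, G, T are skipped."""
    counts = [[0.0] * 4 for _ in range(3)]
    total = 0
    for i, ch in enumerate(seq.upper()):
        col = NUC.get(ch)
        if col is None:
            continue
        counts[(i + offset) % 3][col] += 1
        total += 1
    if total == 0:
        return counts
    return [[c / total for c in row] for row in counts]


def phase_shift(left, right):
    """Cyclic row rotation (0, 1 or 2) that best maps the TP matrix of
    `right` onto that of `left`. Positions in `right` continue the
    numbering of `left`, so 0 means the phase is unchanged.
    Assumption: squared-difference distance, not the authors' measure."""
    m_left = tp_matrix(left)
    m_right = tp_matrix(right, offset=len(left))

    def distance(shift):
        return sum((m_left[r][c] - m_right[(r + shift) % 3][c]) ** 2
                   for r in range(3) for c in range(4))

    return min(range(3), key=distance)


if __name__ == "__main__":
    gene_half = "ACG" * 40
    print(phase_shift(gene_half, gene_half))         # phase unchanged
    print(phase_shift(gene_half, "T" + gene_half))   # one inserted base
```

In this toy setting the detected rotation equals the length of the inserted fragment modulo 3, which is why insertions whose lengths are multiples of three would go unnoticed by a check of this kind.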


Molecular Biology | 2000

MIR: Family of repeats common to vertebrate genomes

Eugene V. Korotkov; Maria A. Korotkova; V. M. Rudenko

We studied the occurrence of mammalian interspersed repeats (MIRs) in DNA and RNA of vertebrates, invertebrates, and bacteria using the data from GenBank. A special algorithm based on a weight position matrix with optimal alignment using dynamic programming was developed to search for the traces of MIR dissemination. This allowed us to search for highly divergent MIRs carrying deletions and insertions. MIRs were detected in genomes of various fishes, including Latimeria. This suggests that the origin of MIRs dates back more than 400 million years. The method to search for similarity between highly divergent sequences may be used to find the genome fragments from various ancient repeat families and from various gene families.
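
Aligning a sequence against a position weight matrix with dynamic programming, so that insertions and deletions are tolerated, can be sketched as follows. This is only an illustration under stated assumptions: the alignment is global rather than a search along a genome, the gap penalty is a constant, and the matrix is built from a consensus string with made-up match and mismatch weights. The helpers `consensus_pwm` and `align_to_pwm` are hypothetical names, not the authors' algorithm.

```python
def consensus_pwm(consensus, match=2.0, mismatch=-1.0):
    """Toy position weight matrix: one dict of base weights per column.
    Assumption: weights come from a consensus, not from real repeat data."""
    return [{b: (match if b == c else mismatch) for b in "ACGT"}
            for c in consensus]


def align_to_pwm(seq, pwm, gap=-2.0):
    """Best global alignment score of seq against the PWM columns,
    allowing gaps in either the sequence or the matrix."""
    m = len(pwm)
    # prev[j]: best score aligning the processed prefix of seq to j columns.
    prev = [j * gap for j in range(m + 1)]
    for i in range(1, len(seq) + 1):
        cur = [i * gap] + [0.0] * m
        for j in range(1, m + 1):
            weight = pwm[j - 1].get(seq[i - 1], -1.0)  # unknown symbol
            cur[j] = max(prev[j - 1] + weight,  # base placed in column j
                         prev[j] + gap,         # extra base (insertion)
                         cur[j - 1] + gap)      # skipped column (deletion)
        prev = cur
    return prev[m]


if __name__ == "__main__":
    pwm = consensus_pwm("ACGT")
    for s in ("ACGT", "ACT", "ACGGT"):
        print(s, align_to_pwm(s, pwm))
```

Because the weight for a base depends on the matrix column rather than on a single reference sequence, a divergent copy can still score well as long as it matches the position-specific preferences.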


BioMed Research International | 2017

Database of Periodic DNA Regions in Major Genomes

F. E. Frenkel; Maria A. Korotkova; Eugene V. Korotkov

We analyzed several prokaryotic and eukaryotic genomes for the presence of periodic sequences using a new mathematical method. The method uses random position weight matrices and dynamic programming. Insertions and deletions were allowed inside periodicities, which is the novel aspect of the results obtained. The period length, one of the key features of a periodicity, varied from 2 to 50 nt. In total, over 60,000 periodic sequences were found in 15 genomes, including some chromosomes of the H. sapiens (partial), C. elegans, D. melanogaster, and A. thaliana genomes.


Genomics, Proteomics & Bioinformatics | 2011

An Approach for Searching Insertions in Bacterial Genes Leading to the Phase Shift of Triplet Periodicity

Maria A. Korotkova; Nikolay A. Kudryashov; Eugene V. Korotkov

The concept of the phase shift of triplet periodicity (TP) was used to search for potential DNA insertions in genes from 17 bacterial genomes. A mathematical algorithm for detecting these insertions has been developed. This approach can detect potential insertions and deletions with lengths that are not multiples of three bases, especially insertions of relatively large DNA fragments (>100 bases). A new similarity measure between triplet matrices was employed to improve the sensitivity of TP phase shift detection. Sequences of 17,220 bacterial genes, each longer than 1,200 bases, were analyzed, and a TP phase shift was found in ~16% of the analyzed genes (2,809 genes), which is about 4 times more than was detected in our previous work. We propose that shifts of the TP phase may indicate shifts of the reading frame in genes after insertions of DNA fragments with lengths that are not multiples of three bases. The relationship between TP phase shifts and frame shifts in genes is discussed.


IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2014

Study of the paired change points in bacterial genes

Yulia M. Suvorova; Maria A. Korotkova; Eugene V. Korotkov

It is known that nucleotide sequences are not totally homogeneous and that this heterogeneity cannot be due to random fluctuations only. Such heterogeneity poses the problem of segmenting a sequence into a set of homogeneous parts divided by points called “change points”. In this work we investigated a special case of change points: paired change points (PCP). We used a well-known property of coding sequences, triplet periodicity (TP). The sequences that we are especially interested in consist of three successive parts: the first and the last parts have similar TP, while the middle part has a different TP type. We aimed to find genes with PCP and to provide an explanation for this phenomenon. We developed a mathematical method for PCP detection based on a new measure of similarity between TP matrices. We investigated 66,936 bacterial genes from 17 bacterial genomes and revealed 2,700 genes with PCP and 6,459 genes with a single change point (SCP). We developed a mathematical approach to visualize the PCP cases. We suppose that PCP could be associated with double fusion or insertion events. The results of investigating sequences with artificial insertions/fusions and the distribution of TP inside the genome support the idea that the real number of genes formed by insertion/fusion events could be 5-7 times greater than the number of genes revealed in the present work.


international conference on bioinformatics | 2017

Search of Periodicity Regions in the Genome A.thaliana - Periodicity Regions in the A.thaliana Genomes.

Eugene V. Korotkov; F. E. Frenkel; Maria A. Korotkova

A mathematical method was developed in this study to determine tandem repeats in a DNA sequence. A multiple alignment of periods was calculated by direct optimization of the position-weight matrix (PWM) without using pairwise alignments or searching for similarity between periods. Random PWMs were used to develop a new mathematical algorithm for periodicity search. The developed algorithm was applied to analyze the DNA sequences of the A.thaliana genome. In total, 13997 regions having a periodicity with a length of 2 to 50 bases were found. The average distance between regions with periodicity is ~9000 nucleotides. A significant portion of the revealed regions have periods consisting of 2 nucleotides, 10-11 nucleotides and periods in the vicinity of 30 nucleotides. No more than ~30% of the periods found had been discovered previously. The sequences found were collected in a data bank available at the website: http://victoria.biengi.ac.ru/cgiin/indelper/index.cgi. This study discussed the origin of periodicity with insertions and deletions.


International Journal of Data Mining and Bioinformatics | 2017

Search for regions with periodicity using the random position weight matrices in the C. elegans genome

Eugene V. Korotkov; Maria A. Korotkova

The present study developed a mathematical method for determining tandem repeats in a DNA sequence. A multiple alignment of periods was calculated by direct optimisation of the position-weight matrix (PWM) without using pairwise alignments or searching for similarity between periods. A new mathematical algorithm for periodicity search was developed using random PWMs. The developed algorithm was applied to analyse the DNA sequences of the C. elegans genome. A total of 25,360 regions were found to possess a periodicity with a length of 2 to 50 bases. On average, a periodicity of ~4000 nucleotides was found to be associated with each region. A significant portion of the revealed regions possess periods consisting of 10 and 11 nucleotides, multiples of 10 nucleotides and periods in the vicinity of 35 nucleotides. Only ~30% of the periods found had been discovered previously. This study discussed the origin of periodicity with insertions and deletions.

Collaboration


Top Co-Authors

Eugene V. Korotkov, Russian Academy of Sciences

F. E. Frenkel, Russian Academy of Sciences

Nikolai A. Kudryashov, National Research Nuclear University MEPhI

Yulia M. Suvorova, Russian Academy of Sciences

J. S. Tulko, Russian Academy of Sciences