Publication


Featured research published by Georgy Sofronov.


IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2015

Multiple break-points detection in array CGH data via the cross-entropy method

W. J. R. M. Priyadarshana; Georgy Sofronov

Array comparative genomic hybridization (aCGH) is a widely used methodology for detecting copy number variations of a genome in high resolution. Knowing the number of break-points and their corresponding locations in genomic sequences serves different biological needs. Primarily, it helps to identify disease-causing genes that have functional importance in characterizing genome-wide diseases. For human autosomes the normal copy number is two, whereas at the sites of oncogenes it increases (gain of DNA) and at the tumour suppressor genes it decreases (loss of DNA). The majority of current detection methods are deterministic in their set-up and use dynamic programming or various smoothing techniques to obtain estimates of copy number variations. These approaches limit the search space of the problem because of the assumptions built into the methods, and they do not represent the true nature of the uncertainty associated with the unknown break-points in genomic sequences. We propose the Cross-Entropy method, a model-based stochastic optimization technique, as an exact search method to estimate both the number and the locations of the break-points in aCGH data. We model the continuous-scale log-ratio data obtained by the aCGH technique as a multiple break-point problem. The proposed methodology is compared with well-established, publicly available methods using both artificially generated data and real data. Results show that the proposed procedure is an effective way of estimating the number and especially the locations of break-points with a high level of precision. Availability: The methods described in this article are implemented in the new R package breakpoint, which is available from the Comprehensive R Archive Network at http://CRAN.R-project.org/package=breakpoint.
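To make the idea of a Cross-Entropy search over break-point locations concrete, here is a minimal Python sketch. It is not the algorithm of the paper or of the breakpoint package: it assumes the number of break-points is known, uses a residual sum of squares as the score, and samples locations from independent normal distributions. The function names `piecewise_rss` and `ce_breakpoints` and all default parameters are hypothetical.

```python
import random
import statistics

def piecewise_rss(data, breaks):
    """Residual sum of squares of a piecewise-constant fit (assumed score)."""
    edges = [0] + list(breaks) + [len(data)]
    total = 0.0
    for lo, hi in zip(edges, edges[1:]):
        seg = data[lo:hi]
        mean = sum(seg) / len(seg)
        total += sum((x - mean) ** 2 for x in seg)
    return total

def ce_breakpoints(data, n_breaks, n_samples=200, n_elite=20, n_iter=30, seed=0):
    """Cross-Entropy style search for a fixed number of break-points."""
    rng = random.Random(seed)
    n = len(data)
    # One normal sampling distribution per break-point, spread evenly at first.
    mus = [n * (i + 1) / (n_breaks + 1) for i in range(n_breaks)]
    sigmas = [n / 4.0] * n_breaks
    best_cost, best_breaks = float("inf"), None
    for _ in range(n_iter):
        scored = []
        for _ in range(n_samples):
            # Draw locations, round to indices, clip to the valid range 1..n-1.
            draw = {min(n - 1, max(1, round(rng.gauss(m, s))))
                    for m, s in zip(mus, sigmas)}
            if len(draw) == n_breaks:  # skip draws with coinciding locations
                cand = sorted(draw)
                scored.append((piecewise_rss(data, cand), cand))
        if not scored:
            break
        scored.sort(key=lambda pair: pair[0])
        if scored[0][0] < best_cost:
            best_cost, best_breaks = scored[0]
        # Refit the sampling distributions to the elite (lowest-cost) samples.
        elite = [cand for _, cand in scored[:n_elite]]
        mus = [statistics.fmean([c[i] for c in elite]) for i in range(n_breaks)]
        sigmas = [statistics.pstdev([c[i] for c in elite]) for i in range(n_breaks)]
    return best_breaks
```

Calling `ce_breakpoints(data, 2)` on a sequence with two mean shifts returns a sorted list of two estimated break-point indices. Estimating the number of break-points as well, as the abstract describes, would need an additional model-selection step that this sketch omits.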


European Journal of Operational Research | 2013

An optimal sequential procedure for a multiple selling problem with independent observations

Georgy Sofronov

We consider a sequential problem of selling K identical assets over a finite time horizon with a fixed number of offers per time period and no recall of past offers. The objective is to find an optimal sequential procedure that maximizes the total expected revenue. In this paper, we derive an effective number of stoppings for an optimal sequential procedure for the selling problem with independent observations.
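A simplified version of this selling problem can be solved by backward induction. The Python sketch below is only an illustration under stated assumptions, not the procedure derived in the paper: it assumes exactly one offer per period, offers that are independent and Uniform(0, 1), and no recall. The function name `selling_values` is hypothetical.

```python
def selling_values(n_periods, k_assets):
    """value[k][n]: optimal expected revenue with k assets and n offers left.

    Assumes one Uniform(0, 1) offer per period and no recall (illustrative only).
    """
    value = [[0.0] * (n_periods + 1) for _ in range(k_assets + 1)]
    for k in range(1, k_assets + 1):
        for n in range(1, n_periods + 1):
            keep = value[k][n - 1]      # continuation value if the offer is rejected
            sell = value[k - 1][n - 1]  # continuation value after selling one asset
            # Accept an offer x exactly when x + sell >= keep, i.e. x >= t.
            t = min(1.0, max(0.0, keep - sell))
            # E[max(x + sell, keep)] for x ~ Uniform(0, 1).
            value[k][n] = keep * t + (1.0 - t * t) / 2.0 + sell * (1.0 - t)
    return value
```

With one asset and two offers, `selling_values(2, 1)[1][2]` gives 0.625: the seller accepts the first offer only if it exceeds 0.5, the expected value of waiting for the last offer.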


Congress on Evolutionary Computation | 2011

Change-point detection in biological sequences via genetic algorithm

Tatiana Polushina; Georgy Sofronov

Genome research is one of the most interesting and important areas of science today. It is well known that the genomes of complex organisms are highly organized. Many studies show that a DNA sequence can be divided into a few segments, which have various properties of interest. Detection of these segments is extremely significant for practical applications, as well as for understanding evolutionary processes. We model genome sequences as a multiple change-point process, that is, a process in which sequential data are divided into segments by an unknown number of change-points, with each segment supposed to have been generated by a process with different parameters. Multiple change-point models are important in many biological applications and, specifically, in the analysis of biomolecular sequences. In this paper, we propose to use a genetic algorithm to identify change-points. Numerical experiments illustrate the effectiveness of our approach to the problem. We obtain estimates for the positions of change-points in artificially generated sequences and compare the accuracy of these estimates to those obtained via Markov chain Monte Carlo and the Cross-Entropy method. We also provide examples with real data sets to illustrate the usefulness of our method.


Symposium on Neural Network Applications in Electrical Engineering | 2012

Sequential change-point detection via the Cross-Entropy method

Georgy Sofronov; Tatiana V. Polushina; M. W. J. R Priyadarshana

Change-point problems (or break point problems, disorder problems) can be considered one of the central issues of statistics, connecting asymptotic statistical theory and Monte Carlo methods, frequentist and Bayesian approaches, fixed and sequential procedures. In many real applications, observations are taken sequentially over time, or can be ordered with respect to some other criterion. The basic question, therefore, is whether the data obtained are generated by one or by many different probabilistic mechanisms. Change-point problems arise in a wide variety of fields, including biomedical signal processing, speech and image processing, climatology, industry (e.g. fault detection) and financial mathematics. In this paper, we apply the Cross-Entropy method to a sequential change-point problem. We obtain estimates for thresholds in the Shiryaev-Roberts procedure and the CUSUM procedure. We provide examples with generated sequences to illustrate the effectiveness of our approach to the problem.
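The two sequential procedures named in the abstract can be written as simple recursions. The Python sketch below assumes unit-variance normal observations whose mean shifts from 0 to a known value `mu`, and it takes the alarm thresholds as given inputs; the abstract's use of the Cross-Entropy method to estimate those thresholds is not reproduced here. The function names are hypothetical.

```python
import math

def cusum_alarm(xs, mu, threshold):
    """Return the first time (1-based) the CUSUM statistic reaches the threshold."""
    w = 0.0
    for n, x in enumerate(xs, start=1):
        # Log-likelihood ratio of N(mu, 1) against N(0, 1) for one observation.
        llr = mu * x - mu * mu / 2.0
        w = max(0.0, w + llr)
        if w >= threshold:
            return n
    return None  # no alarm raised

def shiryaev_roberts_alarm(xs, mu, threshold):
    """Return the first time (1-based) the Shiryaev-Roberts statistic reaches the threshold."""
    r = 0.0
    for n, x in enumerate(xs, start=1):
        lr = math.exp(mu * x - mu * mu / 2.0)
        r = (1.0 + r) * lr
        if r >= threshold:
            return n
    return None  # no alarm raised
```

For ten observations equal to 0 followed by ten equal to 1, with `mu = 1`, the CUSUM statistic stays at 0 before the change and then grows by 0.5 per step, so a threshold of 2.0 raises the alarm at observation 14.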


Congress on Evolutionary Computation | 2012

A modified cross entropy method for detecting multiple change points in DNA Count Data

M. W. J. R Priyadarshana; Georgy Sofronov

We model DNA count data as a multiple change point problem, in which the data are divided into different segments by an unknown number of change points. Each segment is supposed to be generated by unique distribution characteristics inherent to the underlying process. In this paper, we propose a modified version of the Cross-Entropy (CE) method, which utilizes a Beta distribution to simulate locations of change points. Several stopping criteria are also discussed. The proposed CE method is applied to over-dispersed count data, in which the observations are distributed as independent negative binomial random variables. Furthermore, we incorporate the Bayesian Information Criterion to identify the optimal number of change points within the CE method without fixing the maximum number of change points in the data sequence. We obtain estimates for the artificial data by using the modified CE method and compare the results with the general CE method, which utilizes a normal distribution to simulate locations of the change points. The methods are applied to a real DNA count data set in order to illustrate the usefulness of the proposed modified CE method.


International Conference on Artificial Intelligence and Applications | 2013

A hybrid genetic algorithm for change-point detection in binary biomolecular sequences

Tatiana V. Polushina; Georgy Sofronov

Genomes of eukaryotic organisms vary in GC ratio, that is, the share of DNA bases that are C or G as opposed to T or A. Statistical identification of segments that are internally homogeneous with respect to GC ratio is essential for understanding evolutionary processes and the different functional characteristics of the genome. DNA segmentation is one of the most important applications of change-point detection. Problems of this type arise in various areas, such as speech and image processing, biomedical applications, econometrics, industry and seismology. In this study, we develop a hybrid genetic algorithm for detecting change-points in binary sequences. We apply our algorithm to both synthetic and real data sets, and demonstrate that it is more effective than other well-known methods such as Markov chain Monte Carlo, Cross-Entropy and genetic algorithms.
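The binary sequences studied here come from recoding each base by whether it is G or C. A minimal Python sketch of that recoding follows; the function names and the choice to skip characters other than A, C, G and T are assumptions made for illustration.

```python
def gc_indicator(seq):
    """Map a DNA string to a list of 0/1 flags: 1 for G or C, 0 for A or T.

    Characters other than A, C, G, T (for example N) are skipped (an assumption).
    """
    return [1 if base in "GC" else 0 for base in seq.upper() if base in "ACGT"]

def gc_ratio(seq):
    """Share of G/C bases among the recognised bases of the sequence."""
    bits = gc_indicator(seq)
    return sum(bits) / len(bits) if bits else 0.0
```

A change-point method for binary data can then be run on the output of `gc_indicator`, with segments corresponding to regions of roughly constant GC ratio.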


Advances in Experimental Medicine and Biology | 2015

Hybrid Algorithms for Multiple Change-Point Detection in Biological Sequences

M. W. J. R Priyadarshana; Tatiana Polushina; Georgy Sofronov

Array comparative genomic hybridization (aCGH) is one of the techniques that can be used to detect copy number variations in DNA sequences in high resolution. It has been identified that abrupt changes in the human genome play a vital role in the progression and development of many complex diseases. In this study we propose two distinct hybrid algorithms that combine efficient sequential change-point detection procedures (the Shiryaev-Roberts procedure and the cumulative sum control chart (CUSUM) procedure) with the Cross-Entropy method, an evolutionary stochastic optimization technique, to estimate both the number of change-points and their corresponding locations in aCGH data. The proposed hybrid algorithms are applied to both artificially generated data and real aCGH experimental data to illustrate their usefulness. Our results show that the proposed methodologies are effective in detecting multiple change-points in biological sequences of continuous measurements.


2013 International Symposium on Computational Models for Life Sciences | 2013

A hybrid algorithm for multiple change-point detection in continuous measurements

W. J. R. M. Priyadarshana; Tatiana V. Polushina; Georgy Sofronov

Array comparative genomic hybridization (aCGH) is one of the techniques that can be used to detect copy number variations in DNA sequences. It has been identified that abrupt changes in the human genome play a vital role in the progression and development of many diseases. We propose a hybrid algorithm that utilizes both the sequential techniques and the Cross-Entropy method to estimate the number of change points as well as their locations in aCGH data. We applied the proposed hybrid algorithm to both artificially generated data and real data to illustrate the usefulness of the methodology. Our results show that the proposed algorithm is an effective method to detect multiple change-points in continuous measurements.


Federated Conference on Computer Science and Information Systems | 2014

Change-point detection in binary Markov DNA sequences by the Cross-Entropy method

Tatiana Polushina; Georgy Sofronov

A deoxyribonucleic acid (DNA) sequence can be represented as a sequence with 4 characters. If a particular property of the DNA is studied, for example, GC content, then it is possible to consider a binary sequence. In many cases, if the probabilistic properties of a segment differ from those of the neighbouring ones, this means that the segment can play a structural role. Therefore, DNA segmentation is given special attention, and it is one of the most significant applications of change-point detection. Problems of this type also arise in a wide variety of areas, for example, seismology, industry (e.g., fault detection), biomedical signal processing, financial mathematics, and speech and image processing. In this study, we have developed a Cross-Entropy algorithm for identifying change-points in binary sequences with first-order Markov dependence. We propose a statistical model for this problem and show the effectiveness of our algorithm for synthetic and real datasets.
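Scoring a candidate segmentation of a binary sequence with first-order Markov dependence requires a per-segment likelihood. The abstract does not spell out the paper's statistical model, so the Python sketch below is only one plausible choice: the log-likelihood of a segment conditional on its first symbol, with transition probabilities replaced by their maximum-likelihood estimates. The function name `markov_loglik` is hypothetical.

```python
import math

def markov_loglik(bits):
    """Maximised log-likelihood of a 0/1 segment under a first-order Markov chain.

    Conditions on the first symbol; transition probabilities are estimated
    by the observed transition frequencies within the segment.
    """
    counts = [[0, 0], [0, 0]]  # counts[a][b] = number of transitions a -> b
    for a, b in zip(bits, bits[1:]):
        counts[a][b] += 1
    loglik = 0.0
    for row in counts:
        total = sum(row)
        for c in row:
            if c > 0:
                loglik += c * math.log(c / total)
    return loglik
```

Summing `markov_loglik` over the segments defined by a set of candidate change-points gives a score that a search method such as Cross-Entropy could maximise, typically together with a penalty on the number of segments.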


Federated Conference on Computer Science and Information Systems | 2017

Binary Segmentation Methods for Identifying Boundaries of Spatial Domains

Nishanthi Raveendran; Georgy Sofronov

Spatial clustering is an important component of spatial data analysis that aims to identify the boundaries of domains and their number. It is commonly used in disease surveillance, spatial epidemiology, population genetics, landscape ecology, crime analysis and many other fields. In this paper, we focus on identifying homogeneous sub-regions in binary data, which indicate the presence or absence of a certain plant species observed over a two-dimensional lattice. To solve this clustering problem we propose to use the change-point methodology. We develop new methods based on a binary segmentation algorithm, which is a well-known multiple change-point detection method. The proposed algorithms are applied to artificially generated data to illustrate their usefulness. Our results show that the proposed methodologies are effective in identifying multiple domains and their boundaries in two-dimensional spatial data.
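Binary segmentation itself is easy to state in one dimension: find the single best split of a segment, keep it if the improvement is large enough, and recurse on both halves. The Python sketch below applies that idea to a one-dimensional 0/1 sequence with a Bernoulli log-likelihood gain. It does not reproduce the paper's two-dimensional lattice methods, and the function names and the `penalty` parameter are hypothetical.

```python
import math

def bernoulli_loglik(ones, n):
    """Maximised Bernoulli log-likelihood of n observations containing `ones` ones."""
    if ones == 0 or ones == n:
        return 0.0
    p = ones / n
    return ones * math.log(p) + (n - ones) * math.log(1.0 - p)

def best_split(bits, lo, hi):
    """Return (gain, position) of the best single split of bits[lo:hi]."""
    total = sum(bits[lo:hi])
    base = bernoulli_loglik(total, hi - lo)
    best_gain, best_pos = 0.0, None
    left = 0
    for pos in range(lo + 1, hi):
        left += bits[pos - 1]  # number of ones in bits[lo:pos]
        gain = (bernoulli_loglik(left, pos - lo)
                + bernoulli_loglik(total - left, hi - pos) - base)
        if gain > best_gain:
            best_gain, best_pos = gain, pos
    return best_gain, best_pos

def binary_segmentation(bits, penalty, lo=0, hi=None):
    """Recursively split bits[lo:hi] while the likelihood gain exceeds the penalty."""
    if hi is None:
        hi = len(bits)
    gain, pos = best_split(bits, lo, hi)
    if pos is None or gain <= penalty:
        return []
    return (binary_segmentation(bits, penalty, lo, pos)
            + [pos]
            + binary_segmentation(bits, penalty, pos, hi))
```

For the sequence of 20 zeros, 20 ones and 20 zeros, `binary_segmentation(bits, 3.0)` returns `[20, 40]`. The penalty controls how many boundaries are reported; the value 3.0 is arbitrary and would need tuning on real data.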

Collaboration


Dive into Georgy Sofronov's collaborations.

Top Co-Authors


Tatiana V. Polushina

Norwegian University of Science and Technology


Dirk P. Kroese

University of Queensland
