Ulykbek Kairov
Nazarbayev University
Publication
Featured research published by Ulykbek Kairov.
Biochemical and Biophysical Research Communications | 2013
Andrei Zinovyev; Ulykbek Kairov; Tatiana Karpenyuk; Erlan Ramanculov
Two blind source separation methods (Independent Component Analysis and Non-negative Matrix Factorization), initially developed for signal processing in engineering, have recently found a number of applications in the analysis of large-scale data in molecular biology. In this short review, we present the common idea behind these methods, describe ways of implementing and applying them, and point out their advantages over more traditional statistical approaches. We focus specifically on the analysis of gene expression in cancer. The review concludes with a list of available software implementations of the methods described.
Bioinformation | 2012
Ulykbek Kairov; Tatyana Karpenyuk; Erlan Ramanculov; Andrei Zinovyev
Many genome-scale studies in molecular biology deliver results in the form of a ranked list of gene names, ordered according to some scoring method. There is always the question of how many top-ranked genes to consider for further analysis, for example, in order to create a diagnostic or predictive gene signature for a disease. This question is usually approached from a statistical point of view, without considering any biological properties of the top-ranked genes or how they are related to each other functionally. Here we suggest a new method for selecting a number of genes in a ranked gene list such that this set forms the Optimally Functionally Enriched Network (OFTEN), formed by known physical interactions between genes or their products. The method allows a network to be associated with the gene list, providing easier interpretation of the results and classifying the genes or proteins according to their position in the resulting network. We demonstrate the method on four breast cancer datasets and show that 1) the resulting gene signatures are more reproducible from one dataset to another compared to standard statistical procedures and 2) the overlap of these signatures has significant prognostic potential. The method is implemented in the BiNoM Cytoscape plugin (http://binom.curie.fr).
Journal of Cellular and Molecular Medicine | 2014
Petr Dmitriev; Ulykbek Kairov; Thomas Robert; Ana Barat; Vladimir Lazar; Gilles Carnac; Dalila Laoudj-Chenivesse; Yegor Vassetzky
Muscular dystrophy is a condition potentially predisposing to cancer; however, currently, only myotonic dystrophy patients are known to have a higher risk of cancer. Here, we have searched for a link between facioscapulohumeral dystrophy (FSHD) and cancer by comparing published transcriptome signatures of FSHD and various malignant tumours and have found a significant enrichment of cancer-related genes among the genes differentially expressed in FSHD. The analysis has shown that gene expression profiles of FSHD myoblasts and myotubes resemble that of Ewing's sarcoma more than those of the other cancer types tested. This is the first study demonstrating a similarity between FSHD and cancer cell expression profiles, a finding that might indicate the existence of a common step in the pathogenesis of these two diseases.
Central Asian Journal of Global Health | 2014
Maxat Zhabagin; Zhannur Abilova; Ayken Askapuli; Saule Rakhimova; Ulykbek Kairov; Kulzhan Berikkhanova; Assel Terlikbayeva; Meruert Darisheva; Arike Alenova; Ainur Akilzhanova
Introduction: The vitamin D receptor (VDR) plays an important role in activating the immune response against various infectious agents. It is known that the active metabolite of vitamin D (1,25-dihydroxyvitamin D), the ligand of the receptor encoded by VDR, helps mononuclear phagocytes to suppress the intracellular growth of M. tuberculosis. The VDR gene harbors approximately 200 polymorphisms, some of which are linked to differences in receptor vitamin D uptake and can therefore be considered candidate disease risk variants. The relation between VDR gene polymorphisms and susceptibility to TB has been studied in different populations, but little information is available on the association of these SNPs with TB risk in the Kazakh population. The four VDR polymorphisms most commonly investigated in association with different diseases, including susceptibility to tuberculosis, are located in exon 2 (rs2228570 or FokI), intron 8 (rs1544410 or BsmI and rs7975232 or ApaI), and exon 9 (rs731236 or TaqI). The aim of our study was to determine whether these four VDR single nucleotide polymorphisms were associated with TB and whether they were a risk factor for the development of TB in the Kazakh population of Almaty city and the Almaty area. Methods: This study was a hospital-based case-control analysis of 283 individuals (99 TB patients and 184 healthy controls). Genotyping was performed by TaqMan SNP allelic discrimination using commercial TaqMan SNP Genotyping assays. Statistical analysis was conducted using SPSS Version 19.0 software. Results: Genotype frequencies for the Kazakh population are close to world (HapMap) data on Asian populations. FokI and ApaI genotypes tended to be associated with TB risk under the co-dominant model [OR=1.18; 95%CI: (0.68, 2.07), p=0.15] for FokI and [OR=1.33; 95%CI: (0.61, 2.91), p=0.6] for ApaI. No significant association between the disease and the TaqI or BsmI genotypes was observed. Conclusions: In summary, we explored potential associations between SNPs in the VDR gene (FokI, ApaI) and susceptibility to tuberculosis in the Kazakh population; these associations require further detailed analysis with a larger sample size and greater geographic diversity, including other regions of Kazakhstan.
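As a reading aid for the odds ratios and confidence intervals reported above, here is a minimal Python sketch of an odds ratio with a Wald (log-scale) 95% confidence interval from a 2x2 case-control table. The abstract does not report genotype counts, so the counts below are purely hypothetical and are not the study's data; the function name `odds_ratio_ci` is also an illustrative choice, and the study itself used SPSS rather than this code.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI for a 2x2 table.

    a: cases with the risk genotype      b: cases without it
    c: controls with the risk genotype   d: controls without it
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR), Woolf's method
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# HYPOTHETICAL counts, chosen only to illustrate the calculation
or_value, ci_low, ci_high = odds_ratio_ci(a=40, b=60, c=70, d=130)
print(f"OR={or_value:.2f}, 95% CI: ({ci_low:.2f}, {ci_high:.2f})")

# An interval that contains 1 means the association is not
# statistically significant at the 5% level.
print("CI contains 1:", ci_low < 1 < ci_high)
```

Both intervals reported in the abstract, (0.68, 2.07) and (0.61, 2.91), contain 1, which is consistent with the authors describing a tendency rather than a significant association.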
BMC Genomics | 2017
Ulykbek Kairov; Laura Cantini; Alessandro Greco; Askhat Molkenov; Urszula Czerwinska; Emmanuel Barillot; Andrei Zinovyev
Background: Independent Component Analysis (ICA) is a method that models gene expression data as an action of a set of statistically independent hidden factors. The output of ICA depends on a fundamental parameter: the number of components (factors) to compute. The optimal choice of this parameter, related to determining the effective data dimension, remains an open question in the application of blind source separation techniques to transcriptomic data. Results: Here we address the question of optimizing the number of statistically independent components in the analysis of transcriptomic data for reproducibility of the components in multiple runs of ICA (within the same or within varying effective dimensions) and in multiple independent datasets. To this end, we introduce ranking of independent components based on their stability in multiple ICA computation runs and define a distinguished number of components (Most Stable Transcriptome Dimension, MSTD) corresponding to the point of the qualitative change of the stability profile. Based on a large body of data, we demonstrate that a sufficient number of dimensions is required for biological interpretability of the ICA decomposition and that the most stable components with ranks below MSTD have more chances to be reproduced in independent studies compared to the less stable ones. At the same time, we show that a transcriptomics dataset can be reduced to a relatively high number of dimensions without losing the interpretability of ICA, even though higher dimensions give rise to components driven by small gene sets. Conclusions: We suggest a protocol of ICA application to transcriptomics data with a possibility of prioritizing components with respect to their reproducibility that strengthens the biological interpretation. Computing too few components (much less than MSTD) is not optimal for interpretability of the results. The components ranked within MSTD range have more chances to be reproduced in independent studies.
Central Asian Journal of Global Health | 2014
Ulykbek Kairov; Ulan Kozhamkulov; Saule Rakhimova; Ayken Askapuli; Maxat Zhabagin; Venera Bismilda; Leyla Chingissova; Zhaxybay Zhumadilov; Ainur Akilzhanova
Background: Tuberculosis is a major public health problem which infects one third of the world’s population, resulting in more than two million deaths every year. The emergence of whole genome sequencing (WGS) technologies as a primary research tool has allowed for the detection of genetic diversity in Mycobacterium tuberculosis (MTB) with unprecedented resolution. WGS has been used to address a broad range of topics, including the dynamics of evolution, transmission, and treatment. To our knowledge, studies involving WGS of Kazakhstani strains of M. tuberculosis have not yet been performed. Aim: To perform whole genome sequencing of M. tuberculosis strains isolated in Kazakhstan and analyze the sequence data (first experience and preliminary data). Results: In the present report, we announce the whole-genome sequences of two clinical isolates of Mycobacterium tuberculosis, MTB-489 and MTB-476, isolated from the Almaty region. These strains were part of a repository that was created during our project “Creating prerequisites of personalized approach in the diagnosis and treatment of tuberculosis, based on whole genome-sequencing of M. tuberculosis”. The two strains were isolated from sputum samples of patients P1 and P2. Phenotypically, both isolates were drug-susceptible M. tuberculosis. Sequence data were compared with the publicly available data on the M. tuberculosis laboratory strain H37Rv and others. The strains were sequenced on a Roche 454 GS FLX+ next-generation sequencing platform using a standard protocol for a shotgun genome library. For MTB-476, 96 Mbp were generated with an average read length of 520 bp and approximately 21.8X coverage; for MTB-489, 104.2 Mbp were generated with an average read length of 589 bp and approximately 23.7X coverage. The genome of MTB-476 consists of 257 contigs, 4204 CDS, 46 tRNAs and 3 rRNAs. MTB-489 has 187 contigs, 4183 CDS, 45 tRNAs and 3 rRNAs. Conclusion: The genome assemblies have been submitted to NCBI GenBank and are available for public access under the accession numbers AZBA00000000 and AZAZ00000000. These assemblies can be useful for comparative genome analysis and for identification of novel SNPs and gene variants in genomes of M. tuberculosis.
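The coverage figures quoted above follow from dividing the total sequenced bases by the genome size. The short Python sketch below reproduces that arithmetic. The abstract does not state the genome size used, so the value of about 4.4 Mbp (the approximate size of the M. tuberculosis H37Rv reference genome) is an assumption, and the helper name `coverage` is illustrative.

```python
# ASSUMPTION: approximate M. tuberculosis genome size (about 4.4 Mbp);
# the abstract does not say which value the authors used.
GENOME_SIZE_BP = 4.4e6

def coverage(total_bases, genome_size):
    """Average sequencing depth: total sequenced bases / genome size."""
    return total_bases / genome_size

# (total bases, average read length in bp), as reported in the abstract
runs = {
    "MTB-476": (96e6, 520),
    "MTB-489": (104.2e6, 589),
}

for isolate, (total_bases, mean_read_len) in runs.items():
    depth = coverage(total_bases, GENOME_SIZE_BP)
    # Approximate read count implied by the total bases and mean read length
    approx_reads = round(total_bases / mean_read_len)
    print(f"{isolate}: ~{depth:.1f}X coverage, ~{approx_reads} reads")
```

Under this assumed genome size the computed depths round to the reported 21.8X and 23.7X. The read counts are only implied estimates, since the abstract does not report them.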
Research in Computational Molecular Biology | 2018
Laura Cantini; Ulykbek Kairov; Aurélien de Reyniès; Emmanuel Barillot; François Radvanyi; Andrei Zinovyev
Motivation: Matrix factorization methods are widely exploited in order to reduce the dimensionality of transcriptomic datasets to the action of a few hidden factors (metagenes). Applying such methods to similar independent datasets should yield reproducible inter-series outputs, though this has not yet been demonstrated. Results: We systematically test state-of-the-art methods of matrix factorization on several transcriptomic datasets of the same cancer type. Inspired by concepts of evolutionary bioinformatics, we design a new framework based on Reciprocally Best Hit (RBH) graphs in order to benchmark the reproducibility of each method. We show that a particular protocol of application of Independent Component Analysis (ICA), accompanied by a stabilisation procedure, leads to a significant increase in the inter-series output reproducibility. Moreover, we show that the signals detected through this method are systematically more interpretable than those of other state-of-the-art methods. We developed a user-friendly tool, BIODICA, for performing the Stabilized ICA-based RBH meta-analysis. We apply this methodology to the study of colorectal cancer (CRC), for which 14 independent publicly available transcriptomic datasets can be collected. The resulting RBH graph maps the landscape of interconnected factors that can be associated with biological processes or with technological artefacts. These factors can be used as clinical biomarkers or as robust and tumor-type specific transcriptomic signatures of tumoral cells or the tumoral microenvironment. Their intensities in different samples shed light on the mechanistic basis of CRC molecular subtyping. Availability: The BIODICA tool is available from https://github.com/LabBandSB/BIODICA. Supplementary information: Supplementary data are available at Bioinformatics online.
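The reciprocal-best-hit idea mentioned above can be sketched in a few lines of Python. This is a minimal illustration and not the BIODICA implementation: it assumes a precomputed similarity matrix between the components of two datasets (for example, correlations between metagenes), and it compares absolute values on the assumption that the sign of an ICA component is arbitrary. The function name and the toy matrix are hypothetical.

```python
def reciprocal_best_hits(sim):
    """Return (i, j) pairs where component i of dataset A and component j
    of dataset B are each other's best match.

    sim[i][j] is the similarity (e.g. a correlation) between component i
    of dataset A and component j of dataset B. Absolute values are
    compared because the sign of a component is treated as arbitrary
    (an assumption of this sketch).
    """
    n_a, n_b = len(sim), len(sim[0])
    # Best partner in B for every component of A
    best_for_a = [max(range(n_b), key=lambda j: abs(sim[i][j]))
                  for i in range(n_a)]
    # Best partner in A for every component of B
    best_for_b = [max(range(n_a), key=lambda i: abs(sim[i][j]))
                  for j in range(n_b)]
    # Keep only pairs that choose each other
    return [(i, j) for i, j in enumerate(best_for_a) if best_for_b[j] == i]

# HYPOTHETICAL similarity matrix between 3 components of A and 3 of B
toy_sim = [
    [0.90, 0.10, 0.20],
    [0.30, 0.20, -0.80],
    [0.40, 0.35, 0.50],
]
print(reciprocal_best_hits(toy_sim))
```

In this toy case, component 2 of A prefers component 2 of B, but that component's best match is component 1 of A, so the pair is not reciprocal and is dropped. Repeating the comparison for every pair of datasets and linking the reciprocal pairs as edges would give a graph of the kind described in the abstract.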
International Conference on Latent Variable Analysis and Signal Separation | 2018
Urszula Czerwinska; Laura Cantini; Ulykbek Kairov; Emmanuel Barillot; Andrei Zinovyev
Independent Component Analysis (ICA) can be used to model gene expression data as an action of a set of statistically independent hidden factors. ICA followed by a downstream component analysis has previously been applied successfully to decompose bulk transcriptomic data into interpretable hidden factors. Some of these factors reflect the presence of an immune infiltrate in the tumor environment. However, no previous studies have focused on the reproducibility of the ICA-based immune-related signal in the tumor transcriptome. In this work, we use ICA to detect immune signals in six independent transcriptomic datasets. We observe several strongly reproducible immune-related signals when ICA is applied in a sufficiently high-dimensional space (close to one hundred dimensions). Interestingly, we can interpret these signals as cell-type specific signals reflecting the presence of T-cells, B-cells and myeloid cells, which are of high interest in the field of oncoimmunology. Further quantification of these signals in tumoral transcriptomes has therapeutic potential.
Genome Announcements | 2015
Samat Kozhakhmetov; Almagul Kushugulova; Saule Saduakhasova; Gulnara Shakhabayeva; Zhanagul R. Khassenbekova; Askhat Molkenov; Ulykbek Kairov; Raushan B. Issayeva; Talgat Nurgozhin; Zhaxybay Zhumadilov
ABSTRACT We announce the draft genome sequence of the type strain Lactobacillus rhamnosus CLS17 (2,889,314 nt, with a GC content of 46.8%), which is one of the most prevalent lactic acid bacteria present during the manufacturing process of dairy products; the genome consists of 71 large contigs (>100 bp in size). It contains 2,643 protein-coding sequences, single predicted copies of the 5S, 16S, and 23S rRNA genes, and 51 predicted tRNAs.
Genome Announcements | 2015
Ulykbek Kairov; Ulan Kozhamkulov; Askhat Molkenov; Saule Rakhimova; Ayken Askapuli; Maxat Zhabagin; Ainur Akhmetova; Dauren Yerezhepov; Zhannur Abilova; Aliya Abilmazhinova; Venera Bismilda; Leila Chingisova; Zhaxybay Zhumadilov; Ainur Akilzhanova
ABSTRACT Here, we report the draft genome sequences of two clinical isolates of Mycobacterium tuberculosis (MTB-476 and MTB-489) isolated from sputum of Kazakh patients.