Featured Research

Genomics

Chaos in DNA Evolution

In this paper, we explain why the chaotic model (CM) of Bahi and Michel (2008) accurately simulates gene mutations over time. First, we demonstrate that the CM is truly chaotic, as defined by Devaney. Then, we show that gene mutations have the same chaotic dynamics, which makes chaotic models relevant for studying genome evolution.

Characterization of Methicillin-resistant Staphylococcus aureus Isolates from Fitness Centers in Memphis Metropolitan Area, USA

Indoor skin-contact surfaces of public fitness centers may serve as reservoirs for potential human transmission of methicillin-resistant Staphylococcus aureus (MRSA). In surface swab samples collected from inanimate surfaces of fitness centers in the Memphis metropolitan area, USA, we found a high prevalence of multi-drug resistant (MDR) MRSA of the CC59 lineage harboring a variety of extracellular toxin genes. Our findings underscore the role of inanimate surfaces as potential sources of transmission of MDR-MRSA strains with considerable genetic diversity.

Characterizing Halloumi cheese bacterial communities through metagenomic analysis

Halloumi is a semi-hard cheese that has been produced in Cyprus for centuries, and its popularity has risen significantly over the past years. In the present research, high-throughput sequencing (HTS) was applied to characterize the bacterial diversity of traditional Cyprus Halloumi. Eighteen samples, made from different milk mixtures and produced in different areas of the country, were analyzed. The analysis revealed that the Halloumi microbiome was composed mainly of lactic acid bacteria (LAB), including Lactobacillus, Leuconostoc, and Pediococcus, as well as halophilic bacteria, such as Marinilactibacillus and Halomonas. Spore-forming bacteria and spoilage bacteria were also detected. Halloumi produced with the traditional method had significantly richer bacterial diversity than Halloumi produced with the industrial method. Variations detected among the bacterial communities highlight how the initial microbiome that existed in the milk and survived pasteurization, together with factors associated with Halloumi manufacturing conditions, shapes the final microbiota composition. Identification and characterization of the Halloumi microbiome provides an additional, useful tool to characterize its typicity and possibly safeguard it from fraudulent products that may appear on the market. It may also help producers further improve its quality and guarantee consumer safety.

Characterizing SARS-CoV-2 mutations in the United States

The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been mutating since it was first sequenced in early January 2020. The genetic variants have developed into a few distinct clusters with different properties. Since the United States (US) has the highest number of infected patients globally, it is essential to understand the US SARS-CoV-2. Using genotyping, sequence alignment, time evolution, k-means clustering, protein-folding stability, algebraic topology, and network theory, we reveal that the US SARS-CoV-2 has four substrains and that the five top US SARS-CoV-2 mutations were first detected in China (2 cases), Singapore (2 cases), and the United Kingdom (1 case). The next three top US SARS-CoV-2 mutations were first detected in the US. These eight top mutations belong to two disconnected groups. The first group, consisting of five concurrent mutations, is prevailing, while the other group, with three concurrent mutations, is gradually fading out. Our analysis suggests that female immune systems are more active than those of males in responding to SARS-CoV-2 infections. We identify that one of the top mutations, 27964C>T-(S24L) on ORF8, has an unusually strong gender dependence. Based on the analysis of all mutations on the spike protein, we further uncover that three of the four US SARS-CoV-2 substrains have become more infectious. Our study calls for effective viral control and containment strategies in the US.

Chloroplast Genome Yields Unusual Seven-Cluster Structure

We studied the structuredness of the chloroplast genome of Siberian larch. Clusters in 63-dimensional space were identified with the elastic map technique, where the objects to be clustered are different fragments of the genome. We found the seven-cluster structure in the distribution of those fragments that was reported previously. Unlike the previous results, however, we found a drastically different composition of the clusters, which comprise fragments extracted from both coding and non-coding regions of the genome.

Circuits with broken fibration symmetries perform core logic computations in biological networks

We show that logic computational circuits in gene regulatory networks arise from a fibration symmetry breaking in the network structure. From this idea we implement a constructive procedure that reveals a hierarchy of genetic circuits, ubiquitous across species, that are surprising analogues of the emblematic circuits of solid-state electronics: starting from the transistor and progressing to ring oscillators, current-mirror circuits, toggle switches, and flip-flops. These canonical variants serve the fundamental operations of synchronization and clocks (in their symmetric states) and memory storage (in their broken-symmetry states). These conclusions introduce a theoretically principled strategy to search for computational building blocks in biological networks, and present a systematic route to designing synthetic biological circuits.

Class-Conditional VAE-GAN for Local-Ancestry Simulation

Local ancestry inference (LAI) allows identification of the ancestry of all chromosomal segments in admixed individuals, and it is a critical step in the analysis of human genomes, with applications ranging from pharmacogenomics and precision medicine to genome-wide association studies. In recent years, many LAI techniques have been developed in both industry and academic research. However, these methods require large training data sets of human genomic sequences from the ancestries of interest. Such reference data sets are usually limited, proprietary, protected by privacy restrictions, or otherwise not accessible to the public. Techniques to generate training samples that resemble real haploid sequences from ancestries of interest can be useful tools in such scenarios, since a generalized model can often be shared, but the unique human sample sequences cannot. In this work we present a class-conditional VAE-GAN to generate new human genomic sequences that can be used to train LAI algorithms. We evaluate the quality of our generated data by comparing the performance of a state-of-the-art LAI method when trained with generated versus real data.

Classification of large DNA methylation datasets for identifying cancer drivers

DNA methylation is a well-studied genetic modification crucial for regulating the functioning of the genome. Its alterations play an important role in tumorigenesis and tumor suppression. Thus, studying DNA methylation data may help biomarker discovery in cancer. Since public data on DNA methylation have become abundant, and considering the high number of methylated sites (features) present in the genome, it is important to have a method for efficiently processing such large datasets. Relying on big data technologies, we propose BIGBIOCL, an algorithm that can apply supervised classification methods to datasets with hundreds of thousands of features. It is designed for the extraction of alternative and equivalent classification models through iterative deletion of selected features. We run experiments on DNA methylation datasets extracted from The Cancer Genome Atlas, focusing on three tumor types: breast, kidney, and thyroid carcinomas. We perform classifications with accurate performance, extracting several methylated sites and their associated genes. Results suggest that BIGBIOCL can perform hundreds of classification iterations on hundreds of thousands of features in a few hours. Moreover, we compare the performance of our method with other state-of-the-art classifiers and with a widespread DNA methylation analysis method based on network analysis. Finally, we are able to efficiently compute multiple alternative classification models and extract, from large DNA methylation datasets, a set of candidate genes to be further investigated to determine their active role in cancer. BIGBIOCL, the results of the experiments, and a guide to carrying out new experiments are freely available on GitHub.

Co-evolution between Codon Usage and Protein-Protein Interaction in Bacteria

We study the correlation between the codon usage bias of genetic sequences and the network features of protein-protein interaction (PPI) in bacterial species. We use PCA techniques in the space of codon bias indices to show that genes with similar patterns of codon usage have a significantly higher probability that their encoded proteins are functionally connected and interacting. Importantly, this signal emerges when multiple aspects of codon bias are taken into account at the same time. The present study extends our previous observations on E. coli to a wide set of 34 bacteria. These findings could allow for future investigations on the possible effects of codon bias on the topology of the PPI network, with the aim of improving existing bioinformatics methods for predicting protein interactions.

Codon Bias Patterns of E. coli's Interacting Proteins

Synonymous codons, i.e., DNA nucleotide triplets coding for the same amino acid, are used differently across the variety of living organisms. The biological meaning of this phenomenon, known as codon usage bias, is still controversial. To shed light on this point, we propose a new codon bias index, CompAI, that is based on the competition between cognate and near-cognate tRNAs during translation, without being tuned to the usage bias of highly expressed genes. We perform a genome-wide evaluation of codon bias for E. coli, comparing CompAI with other widely used indices: tAI, CAI, and Nc. We show that CompAI and tAI capture similar information by being positively correlated with gene conservation, measured by ERI, and essentiality, whereas CAI and Nc appear to be less sensitive to evolutionary-functional parameters. Notably, the rate of variation of tAI and CompAI with ERI allows one to obtain sets of genes that consistently belong to specific clusters of orthologous genes (COGs). We also investigate the correlation of codon bias at the genomic level with the network features of protein-protein interactions in E. coli. We find that the most densely connected communities of the network share a similar level of codon bias (as measured by CompAI and tAI). Conversely, a small difference in codon bias between two genes is, statistically, a prerequisite for the corresponding proteins to interact. Importantly, among all codon bias indices, CompAI turns out to have the most coherent distribution over the communities of the interactome, pointing to the significance of competition among cognate and near-cognate tRNAs for explaining codon usage adaptation.
