Featured Research

Quantitative Methods

Discovering novel cancer bio-markers in acquired lapatinib resistance using Bayesian methods

Genes and proteins do not work alone in the body; rather, they act in groups to perform specific activities, known as pathways. Signal transduction pathways (STPs) are important pathways that transmit biological signals from protein to protein, controlling several cellular activities. However, many diseases such as cancer target some of these signalling pathways for their growth and malignancy, and demystifying the underlying mechanisms is a very complicated task. In this study, we use a fully Bayesian approach to develop methodologies for discovering novel driver bio-markers in aberrant STPs given two-condition high-throughput gene expression data. This project, named PathTurbEr (Pathway Perturbation Driver), is applied to a global gene expression dataset derived from lapatinib (an EGFR/HER dual inhibitor) sensitive and resistant samples from breast cancer cell lines (SKBR3). Differential expression analysis revealed 512 differentially expressed genes (DEGs), and signalling pathway enrichment analysis of these genes revealed 22 aberrant signalling pathways, including the PI3K-AKT, Hippo, Chemokine, and TGF-beta signalling pathways, as highly dysregulated in lapatinib resistance. Next, we model the aberrant activities in the TGF-beta STP as a causal Bayesian network (BN) from the given observational datasets using three Markov Chain Monte Carlo (MCMC) sampling methods, including the Neighbourhood sampler (NS) and the Hit-and-Run (HAR) sampler, which have already been shown to give more robust inference, with lower chances of getting stuck at local optima and faster convergence, than other state-of-the-art methods. Next, we examined the structural features of the optimal BN as a statistical process that generates the global structure, using the p1-model, a special class of Exponential Random Graph Models (ERGMs), and MCMC methods for their hyper-parameter sampling...

Distance Correlation Based Brain Functional Connectivity Estimation and Non-Convex Multi-Task Learning for Developmental fMRI Studies

Resting-state functional magnetic resonance imaging (rs-fMRI)-derived functional connectivity patterns have been extensively utilized to delineate global functional organization of the human brain in health, development, and neuropsychiatric disorders. In this paper, we investigate how functional connectivity in males and females differs in an age prediction framework. We first estimate functional connectivity between regions-of-interest (ROIs) using distance correlation instead of Pearson's correlation. Distance correlation, as a multivariate statistical method, explores spatial relations of voxel-wise time courses within individual ROIs and measures both linear and nonlinear dependence, capturing more complex information of between-ROI interactions. Then, a novel non-convex multi-task learning (NC-MTL) model is proposed to study age-related gender differences in functional connectivity, where age prediction for each gender group is viewed as one task. Specifically, in the proposed NC-MTL model, we introduce a composite regularizer with a combination of non-convex ℓ_{2,1−2} and ℓ_{1−2} regularization terms for selecting both common and task-specific features. Finally, we validate the proposed NC-MTL model along with distance correlation based functional connectivity on rs-fMRI of the Philadelphia Neurodevelopmental Cohort for predicting ages of both genders. The experimental results demonstrate that the proposed NC-MTL model outperforms other competing MTL models in age prediction, as well as characterizing developmental gender differences in functional connectivity patterns.
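
As a rough illustration of why distance correlation can capture dependence that Pearson's correlation misses, the sketch below computes the standard sample distance correlation from double-centered pairwise distance matrices. It is a minimal sketch and not the authors' pipeline: the function names are hypothetical, and each input is assumed to be a list of time points, where every time point is a tuple of values (for an ROI, one value per voxel; a scalar signal is wrapped as a 1-tuple).

```python
import math


def _centered_distances(samples):
    """Pairwise Euclidean distances between samples, double-centered."""
    n = len(samples)
    d = [[math.dist(samples[i], samples[j]) for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in d]
    grand_mean = sum(row_mean) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [
        [d[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation between paired samples x and y.

    x and y must have the same number of samples; the vector dimension
    of x may differ from that of y (e.g. ROIs with different voxel counts).
    """
    n = len(x)
    if n != len(y):
        raise ValueError("x and y must contain the same number of samples")
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2_xy = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2
    dvar2_x = sum(v * v for row in a for v in row) / n**2
    dvar2_y = sum(v * v for row in b for v in row) / n**2
    denom = math.sqrt(dvar2_x * dvar2_y)
    if denom == 0.0:
        return 0.0  # at least one sample is constant
    return math.sqrt(max(dcov2_xy, 0.0) / denom)


# Toy usage: a purely quadratic relation has zero Pearson correlation on
# symmetric inputs, yet its distance correlation is clearly positive.
xs = [(float(t),) for t in range(-3, 4)]
linear = [(2.0 * t[0] + 1.0,) for t in xs]
quadratic = [(t[0] ** 2,) for t in xs]
print(distance_correlation(xs, linear))
print(distance_correlation(xs, quadratic))
```

The quadratic case is the point of the example: the dependence is entirely nonlinear, which is the kind of between-ROI interaction the abstract says Pearson's correlation cannot capture.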

Distance-aware Molecule Graph Attention Network for Drug-Target Binding Affinity Prediction

Accurately predicting the binding affinity between drugs and proteins is an essential step for computational drug discovery. Since graph neural networks (GNNs) have demonstrated remarkable success in various graph-related tasks, GNNs have been considered a promising tool to improve binding affinity prediction in recent years. However, most of the existing GNN architectures can only encode the topological graph structure of drugs and proteins without considering the relative spatial information among their atoms. Unlike in other graph datasets such as social networks and commonsense knowledge graphs, however, the relative spatial positions and chemical bonds among atoms have a significant impact on the binding affinity. To this end, in this paper, we propose a diStance-aware Molecule graph Attention Network (S-MAN) tailored to drug-target binding affinity prediction. As a dedicated solution, we first propose a position encoding mechanism to integrate the topological structure and spatial position information into the constructed pocket-ligand graph. Moreover, we propose a novel edge-node hierarchical attentive aggregation structure which has edge-level aggregation and node-level aggregation. The hierarchical attentive aggregation can capture spatial dependencies among atoms, as well as fuse the position-enhanced information with the capability of discriminating multiple spatial relations among atoms. Finally, we conduct extensive experiments on two standard datasets to demonstrate the effectiveness of S-MAN.

DiviK: Divisive intelligent K-means for hands-free unsupervised clustering in biological big data

Investigation of molecular heterogeneity provides insights about tumor origin and metabolomics. The increasing amount of data gathered makes manual analyses infeasible, so automated unsupervised learning approaches are used for this purpose. However, this kind of analysis requires a lot of experience in setting its hyperparameters and usually upfront knowledge of the number of expected substructures. Moreover, the large number of measured molecules requires an additional feature engineering step to provide valuable results. In this work we propose DiviK: a scalable auto-tuning algorithm for segmentation of high-dimensional datasets, and a method to assess the quality of the unsupervised analysis. DiviK is validated on two separate high-throughput datasets acquired by Mass Spectrometry Imaging in 2D and 3D. The proposed algorithm could be one of the default choices to consider during initial exploration of Mass Spectrometry Imaging data. With comparable clustering quality, it makes it possible to focus on different levels of dataset nuance, while requiring no number of expected structures to be specified upfront. Finally, due to its simplicity, DiviK is easily generalizable to an even more flexible framework, with another clustering algorithm used instead of k-means. A generic implementation is freely available under the Apache 2.0 license.
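
The general idea of divisive clustering without a preset number of clusters can be sketched as a recursive two-way k-means split. This is only a toy illustration and not the DiviK algorithm itself: the stopping rule used here (a minimum relative drop in within-cluster sum of squares, `min_gain`, plus a minimum cluster size, `min_size`) is an assumption made for the example, as are all function names and default values.

```python
import math


def _mean(points):
    dim = len(points[0])
    return tuple(sum(p[k] for p in points) / len(points) for k in range(dim))


def _sse(points):
    """Within-cluster sum of squared distances to the centroid."""
    center = _mean(points)
    return sum(math.dist(p, center) ** 2 for p in points)


def _two_means(points, iters=50):
    """Plain 2-means with a deterministic start from two far-apart points."""
    center = _mean(points)
    c0 = max(points, key=lambda p: math.dist(p, center))
    c1 = max(points, key=lambda p: math.dist(p, c0))
    centers = [c0, c1]
    groups = ([], [])
    for _ in range(iters):
        groups = ([], [])
        for p in points:
            nearer_first = math.dist(p, centers[0]) <= math.dist(p, centers[1])
            groups[0 if nearer_first else 1].append(p)
        if not groups[0] or not groups[1]:
            break  # degenerate split, e.g. all points identical
        new_centers = [_mean(groups[0]), _mean(groups[1])]
        if new_centers == centers:
            break  # assignments are stable
        centers = new_centers
    return groups


def divisive_kmeans(points, min_gain=0.5, min_size=3):
    """Recursively bisect points; return a list of clusters.

    Hypothetical stopping rule: keep a split only if both halves have at
    least min_size points and the split removes at least a min_gain
    fraction of the parent's sum of squares.
    """
    if len(points) < 2 * min_size:
        return [points]
    left, right = _two_means(points)
    if len(left) < min_size or len(right) < min_size:
        return [points]
    total = _sse(points)
    if total == 0.0:
        return [points]
    gain = 1.0 - (_sse(left) + _sse(right)) / total
    if gain < min_gain:
        return [points]
    return (divisive_kmeans(left, min_gain, min_size)
            + divisive_kmeans(right, min_gain, min_size))


# Toy usage: three small, well-separated groups; no cluster count is given.
offsets = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2)]
data = [(cx + dx, dy) for cx in (0.0, 10.0, 30.0) for dx, dy in offsets]
for cluster in divisive_kmeans(data):
    print(sorted(cluster))
```

The number of clusters emerges from the stopping rule rather than being supplied, which mirrors the "no number of expected structures specified upfront" property described above; a simple variance-based rule like this one is much cruder than a dedicated quality index.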

Drug Repurposing for COVID-19 using Graph Neural Network with Genetic, Mechanistic, and Epidemiological Validation

Amid the pandemic of 2019 novel coronavirus disease (COVID-19) caused by SARS-CoV-2, a vast amount of drug research for prevention and treatment has been quickly conducted, but these efforts have been unsuccessful thus far. Our objective is to prioritize repurposable drugs using a drug repurposing pipeline that systematically integrates multiple SARS-CoV-2 and drug interactions, deep graph neural networks, and in vitro/population-based validations. We first collected all the available drugs (n = 3,635) involved in COVID-19 patient treatment through CTDbase. We built a SARS-CoV-2 knowledge graph based on the interactions among virus baits, host genes, pathways, drugs, and phenotypes. A deep graph neural network approach was used to derive the candidate representation based on the biological interactions. We prioritized the candidate drugs using clinical trial history, and then validated them with their genetic profiles, in vitro experimental efficacy, and electronic health records. We highlight the top 22 drugs, including Azithromycin, Atorvastatin, Aspirin, Acetaminophen, and Albuterol. We further pinpointed drug combinations that may synergistically target COVID-19. In summary, we demonstrated that the integration of extensive interactions, deep neural networks, and rigorous validation can facilitate the rapid identification of candidate drugs for COVID-19 treatment.

Dynamic causal modelling of immune heterogeneity

An interesting inference drawn by some Covid-19 epidemiological models is that there exists a proportion of the population who are not susceptible to infection, even at the start of the current pandemic. This paper introduces a model of the immune response to a virus, based upon the same sort of mean-field dynamics as used in epidemiology. However, in place of the location, clinical status, and other attributes of people in an epidemiological model, we consider the state of a virus, B and T-lymphocytes, and the antibodies they generate. Our aim is to formalise some key hypotheses as to the mechanism of resistance. We present a series of simple simulations illustrating changes to the dynamics of the immune response under these hypotheses. These include attenuated viral cell entry, pre-existing cross-reactive humoral (antibody-mediated) immunity, and enhanced T-cell dependent immunity. Finally, we illustrate the potential application of this sort of model by demonstrating its variational inversion (using simulated data) and its use in testing hypotheses. In principle, this furnishes a fast and efficient immunological assay, based on sequential serology, that provides (i) a quantitative measure of latent immunological responses and (ii) a Bayes optimal classification of the different kinds of immunological response (cf. glucose tolerance tests used to test for insulin resistance). This may be especially useful in assessing SARS-CoV-2 vaccines.

Dynamics-based peptide-MHC binding optimization by a convolutional variational autoencoder: a use-case model for CASTELO

An unsolved challenge in the development of antigen specific immunotherapies is determining the optimal antigens to target. Comprehension of antigen-MHC binding is paramount towards achieving this goal. Here, we present CASTELO, a combined machine learning-molecular dynamics (ML-MD) approach to design novel antigens of increased MHC binding affinity for a Type 1 diabetes (T1D)-implicated system. We build upon a small molecule lead optimization algorithm by training a convolutional variational autoencoder (CVAE) on MD trajectories of 48 different systems across 4 antigens and 4 HLA serotypes. We develop several new machine learning metrics including a structure-based anchor residue classification model as well as cluster comparison scores. ML-MD predictions agree well with experimental binding results and free energy perturbation-predicted binding affinities. Moreover, ML-MD metrics are independent of traditional MD stability metrics such as contact area and RMSF, which do not reflect binding affinity data. Our work supports the role of structure-based deep learning techniques in antigen specific immunotherapy design.

Early Biomarkers and Intervention Programs for the Infant Exposed to Prenatal Stress

Functional development of affective and reward circuits, cognition and response inhibition later in life exhibits vulnerability periods during gestation and early childhood. Extensive evidence supports the model that exposure to stressors in the gestational period and early postnatal life increases an individual's susceptibility to future impairments of functional development. Recent versions of this model integrate epigenetic mechanisms of the developmental response. Understanding these mechanisms will guide the future treatment of the associated neuropsychiatric disorders. A combination of non-invasively obtainable physiological signals and epigenetic biomarkers related to the principal systems of the stress response, the hypothalamic-pituitary-adrenal (HPA) axis and the Autonomic Nervous System (ANS), is emerging as the key predictor of neurodevelopmental outcomes. Such electrophysiological and epigenetic biomarkers may allow timely identification of the children who would benefit most from early intervention programs. Such programs should ameliorate future disorders in otherwise apparently healthy children. The recently developed Early Family-Centered Intervention Programs aim to influence the care and stimuli provided daily by the family and to improve parent/child attachment, a key element for a healthy socio-emotional adult life. Although frequently underestimated, such a biomarker-guided early intervention strategy represents a crucial first step in the prevention of future neuropsychiatric problems and in reducing their personal and societal impact.

Early warning signals for desynchronization in periodically forced systems

Conditions such as insomnia, cardiac arrhythmia and jet-lag share a common feature: they are all related to the ability of biological systems to synchronize with the day-night cycle. When organisms lose resilience, this ability to synchronize can weaken until they eventually become desynchronized, in a state of malfunction or sickness. It would be useful to measure this loss of resilience before full desynchronization takes place. Several dynamical indicators of resilience (DIORs) have been proposed to account for the loss of resilience of a dynamical system. The performance of these indicators depends on the underlying mechanism of the critical transition, usually a saddle-node bifurcation. Before such a bifurcation, the system's recovery rate from perturbations becomes slower, a mechanism known as critical slowing down. Here we show that, for a wide class of biological systems, desynchronization happens through another bifurcation, namely the saddle-node of cycles, for which critical slowing down cannot be directly detected. Such a bifurcation represents a system transitioning from a synchronized (phase-locked) state to a desynchronized state, or vice versa. We show that after an appropriate transformation we can also detect this bifurcation using dynamical indicators of resilience. We test this method with data generated by models of sleep-wake cycles.
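
Critical slowing down is commonly monitored through rising lag-1 autocorrelation in a time series: the slower a system recovers from perturbations, the more each value resembles the previous one. The sketch below shows only that generic indicator on a simulated first-order autoregressive process; it does not implement the transformation for the saddle-node of cycles that the abstract refers to. The function names, the AR(1) toy model, and the parameter values are assumptions chosen for illustration.

```python
import random


def lag1_autocorrelation(series):
    """Lag-1 autocorrelation of a sequence of numbers."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(d * d for d in dev)
    if denom == 0.0:
        return 0.0  # constant series carries no information
    return sum(dev[i] * dev[i + 1] for i in range(n - 1)) / denom


def simulate_ar1(phi, n, seed=0):
    """Toy AR(1) process x[t+1] = phi * x[t] + noise.

    phi close to 1 mimics a slow recovery rate (low resilience);
    phi close to 0 mimics a fast recovery rate.
    """
    rng = random.Random(seed)
    x = 0.0
    out = []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


# Toy usage: the indicator is higher for the slowly recovering system.
resilient = simulate_ar1(phi=0.2, n=5000)
fragile = simulate_ar1(phi=0.9, n=5000)
print(lag1_autocorrelation(resilient))
print(lag1_autocorrelation(fragile))
```

In practice such an indicator is usually tracked in a sliding window, so that a gradual increase over time can serve as the warning signal.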

Ecological notes on the Annulated Treeboa (Corallus annulatus) from a Costa Rican Lowland Tropical Wet Forest

The Annulated Treeboa (Corallus annulatus) is one of nine currently recognized species in the boid genus Corallus. Its disjunct range extends from eastern Guatemala into northern Honduras, southeastern Nicaragua, northeastern Costa Rica, and southwestern Panama to northern Colombia west of the Andes. It is the only species of Corallus found on the Caribbean versant of Costa Rica, where it occurs at elevations to at least 650 m and perhaps as high as 1,000 m. Corallus annulatus occurs mostly in primary and secondary lowland tropical wet and moist rainforest, and it appears to be genuinely rare. Besides C. cropanii and C. blombergi (the latter closely related to C. annulatus), it is the rarest member of the genus. Aside from information on habitat and activity, little is known regarding its natural history.
