Featured Research

Quantitative Methods

Generating a Heterosexual Bipartite Network Embedded in Social Network

We describe how to generate a heterosexual network with a prescribed joint-degree distribution that is embedded in a prescribed large-scale social contact network. The structure of a sexual network plays an important role in how sexually transmitted infections (STIs) spread. Generating an ensemble of networks that mimics the real world is crucial to evaluating robust mitigation strategies for controlling STIs. Most current algorithms for generating sexual networks use only sexual activity data, such as the number of partners per month. Real-world sexual networks also depend on biased mixing based on age, location, and social and work activities. We describe an approach that uses a broad range of social activity data to generate possible heterosexual networks. We start with a large-scale simulation of thousands of people in a city as they go through their daily activities, including work, school, shopping, and activities at home. We extract a social network from these activities where the nodes are the people and the edges indicate a social interaction, such as working in the same location. This social network captures the correlations between people of different ages, living in different locations, their economic status, and other demographic factors. We use the social contact network to define a bipartite heterosexual network that is embedded within an extended social network. The resulting sexual network captures the biased mixing inherent in the social network, and models based on this pairing of networks can be used to investigate novel intervention strategies based on the social contacts of infected people. We illustrate the approach in a model for the spread of Chlamydia in the heterosexual network representing the young sexually active community in New Orleans.
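
The embedding idea can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes each person has a single target number of partners (rather than a joint-degree distribution), uses only direct social contacts (not the extended social network), and fills targets greedily in random order. The function name `embed_bipartite_network` and all argument names are hypothetical.

```python
import random


def embed_bipartite_network(social_edges, sex, target_degree, seed=0):
    """Pick partnership edges from the social network (illustrative only).

    social_edges  : iterable of (u, v) social contacts
    sex           : dict mapping node -> group label (e.g. "F" or "M")
    target_degree : dict mapping node -> desired number of partners
    Returns (chosen_edges, unmet), where unmet[n] is the part of the
    target degree of node n that could not be satisfied.
    """
    rng = random.Random(seed)
    # Keep only social contacts that cross the two groups, without duplicates.
    candidates = sorted(
        {tuple(sorted((u, v))) for u, v in social_edges if sex[u] != sex[v]}
    )
    rng.shuffle(candidates)

    unmet = dict(target_degree)
    chosen = []
    for u, v in candidates:
        # Accept an edge only while both endpoints still need partners.
        if unmet.get(u, 0) > 0 and unmet.get(v, 0) > 0:
            chosen.append((u, v))
            unmet[u] -= 1
            unmet[v] -= 1
    return chosen, unmet
```

Every edge returned is also a social contact, so the sexual network is a subgraph of the social network by construction. A greedy pass like this can leave some targets unmet, which is why the sketch reports `unmet` instead of assuming the prescribed degrees are always reached.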


Geometric Uncertainty in Patient-Specific Cardiovascular Modeling with Convolutional Dropout Networks

We propose a novel approach to generate samples from the conditional distribution of patient-specific cardiovascular models given a clinically acquired image volume. A convolutional neural network architecture with dropout layers is first trained for vessel lumen segmentation using a regression approach, to enable Bayesian estimation of vessel lumen surfaces. This network is then integrated into a path-planning patient-specific modeling pipeline to generate families of cardiovascular models. We demonstrate our approach by quantifying the effect of geometric uncertainty on the hemodynamics for three patient-specific anatomies: an aorto-iliac bifurcation, an abdominal aortic aneurysm, and a sub-model of the left coronary arteries. A key innovation introduced in the proposed approach is the ability to learn geometric uncertainty directly from training data. The results show that geometric uncertainty produces coefficients of variation comparable to or larger than other sources of uncertainty for wall shear stress and velocity magnitude, but has limited impact on pressure. Specifically, this is true for anatomies characterized by small vessel sizes, and for local vessel lesions seen infrequently during network training.
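
Sampling from a network with dropout can be sketched as follows. This is a toy illustration, not the paper's model: a two-layer ReLU network with hypothetical hand-picked weights stands in for the convolutional segmentation network, and the names `dropout_forward` and `mc_dropout_samples` are assumptions. The point is only that keeping dropout active at prediction time turns one input into a family of outputs whose spread can be read as uncertainty.

```python
import random


def dropout_forward(x, w_hidden, w_out, drop_prob, rng):
    """One stochastic forward pass with dropout on the hidden layer."""
    hidden = []
    for weights in w_hidden:
        pre_activation = sum(w * xi for w, xi in zip(weights, x))
        activation = max(0.0, pre_activation)  # ReLU
        keep = rng.random() >= drop_prob
        # Inverted dropout: rescale kept units so the expected value is unchanged.
        hidden.append(activation / (1.0 - drop_prob) if keep else 0.0)
    return sum(w * h for w, h in zip(w_out, hidden))


def mc_dropout_samples(x, w_hidden, w_out, drop_prob=0.5, n_samples=200, seed=0):
    """Repeat the stochastic pass to obtain a sample of predictions."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    rng = random.Random(seed)
    return [
        dropout_forward(x, w_hidden, w_out, drop_prob, rng)
        for _ in range(n_samples)
    ]
```

The mean and standard deviation of the returned list (for example via `statistics.mean` and `statistics.stdev`) give a point estimate and an uncertainty estimate for the toy output. In the setting described above, each sample would instead be a vessel lumen surface fed into the modeling pipeline.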


Geometric algorithms for sampling the flux space of metabolic networks

Systems Biology is a fundamental field and paradigm that introduces a new era in Biology. The crux of its functionality and usefulness relies on metabolic networks that model the reactions occurring inside an organism and provide the means to understand the underlying mechanisms that govern biological systems. Moreover, metabolic networks have a broader impact that ranges from resolution of ecosystems to personalized medicine. The analysis of metabolic networks is a computational geometry oriented field, as one of the main operations it depends on is sampling points uniformly from polytopes; the latter provide a representation of the steady states of the metabolic networks. However, the polytopes that result from biological data are of very high dimension (on the order of thousands) and in most, if not all, cases are considerably skinny. Therefore, to perform uniform random sampling efficiently in this setting, we need a novel algorithmic and computational framework specially tailored to the properties of metabolic networks. We present a complete software framework to handle sampling in metabolic networks. Its backbone is a Multiphase Monte Carlo Sampling (MMCS) algorithm that unifies rounding and sampling in one pass, obtaining both upon termination. It exploits an improved variant of the Billiard Walk that enjoys faster arithmetic complexity per step. We demonstrate the efficiency of our approach by performing extensive experiments on various metabolic networks. Notably, sampling on the most complicated human metabolic network accessible today, Recon3D, corresponding to a polytope of dimension 5,335, took less than 30 hours. To our knowledge, that is out of reach for existing software.
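
To make "sampling points uniformly from a polytope" concrete, here is a minimal sketch of a random walk inside a polytope {x : Ax <= b}. It uses hit-and-run, a simpler and well-known walk, as a stand-in for the Billiard Walk variant mentioned above, and it includes none of the rounding or multiphase machinery of MMCS. It assumes the polytope is bounded and the starting point is strictly inside it; the function name and arguments are hypothetical.

```python
import math
import random


def hit_and_run(a_rows, b, start, n_steps=1000, seed=0):
    """Hit-and-run walk in the polytope {x : a_rows @ x <= b}.

    Assumes a bounded polytope and a strictly interior starting point.
    Returns the list of visited points (one per step).
    """
    rng = random.Random(seed)
    x = list(start)
    dim = len(x)
    samples = []
    for _ in range(n_steps):
        # Uniformly random direction on the unit sphere.
        d = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(di * di for di in d))
        d = [di / norm for di in d]

        # Find the chord {x + t d} inside the polytope: t_lo <= t <= t_hi.
        t_lo, t_hi = -math.inf, math.inf
        for row, bi in zip(a_rows, b):
            ad = sum(r * di for r, di in zip(row, d))
            slack = bi - sum(r * xi for r, xi in zip(row, x))
            if ad > 1e-12:
                t_hi = min(t_hi, slack / ad)
            elif ad < -1e-12:
                t_lo = max(t_lo, slack / ad)

        # Move to a uniformly random point on the chord.
        t = rng.uniform(t_lo, t_hi)
        x = [xi + t * di for xi, di in zip(x, d)]
        samples.append(list(x))
    return samples
```

On a skinny polytope the chords in most directions are short, so a walk like this moves slowly along the long axis. That is the practical reason rounding the polytope and choosing the walk carefully matter at the dimensions quoted above.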


Global analysis of more than 50,000 SARS-CoV-2 genomes reveals epistasis between 8 viral genes

Genome-wide epistasis analysis is a powerful tool to infer gene interactions, which can guide drug and vaccine development and lead to a deeper understanding of microbial pathogenesis. We have considered all complete SARS-CoV-2 genomes deposited in the GISAID repository until four different cut-off dates, and used Direct Coupling Analysis together with an assumption of Quasi-Linkage Equilibrium to infer epistatic contributions to fitness from polymorphic loci. We find eight interactions, of which three are between pairs where one locus lies in gene ORF3a, both loci holding non-synonymous mutations. We also find interactions between two loci in gene nsp13, both holding non-synonymous mutations, and four interactions involving one locus holding a synonymous mutation. Altogether we infer interactions between loci in viral genes ORF3a and nsp2, nsp12 and nsp6, between ORF8 and nsp4, and between loci in genes nsp2, nsp13 and nsp14. The paper opens the prospect of using prominent epistatically linked pairs as a starting point to search for combinatorial weaknesses of recombinant viral pathogens.


Global sensitivity analysis informed model reduction and selection applied to a Valsalva maneuver model

In this study, we develop a methodology for model reduction and selection informed by global sensitivity analysis (GSA) methods. We apply these techniques to a control model that takes systolic blood pressure and thoracic tissue pressure data as inputs and predicts heart rate in response to the Valsalva maneuver (VM). The study compares four GSA methods based on Sobol' indices (SIs) quantifying the parameter influence on the difference between the model output and the heart rate data. The GSA methods include standard scalar SIs determining the average parameter influence over the time interval studied and three time-varying methods analyzing how parameter influence changes over time. The time-varying methods include a new technique, termed limited-memory SIs, predicting parameter influence using a moving window approach. Using the limited-memory SIs, we perform model reduction and selection to analyze the necessity of modeling both the aortic and carotid baroreceptor regions in response to the VM. We compare the original model to three systematically reduced models including (i) the aortic and carotid regions, (ii) the aortic region only, and (iii) the carotid region only. Model selection is done quantitatively using the Akaike and Bayesian Information Criteria and qualitatively by comparing the neurological predictions. Results show that it is necessary to incorporate both the aortic and carotid regions to model the VM.
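
For readers unfamiliar with Sobol' indices, the following sketch estimates standard scalar first-order indices with a pick-freeze Monte Carlo estimator. It is not the limited-memory method introduced in the study and does not involve the Valsalva model: it assumes independent inputs uniform on [0, 1], and the function name `first_order_sobol` is hypothetical.

```python
import random
from statistics import fmean, pvariance


def first_order_sobol(model, dim, n_samples=50000, seed=0):
    """Estimate first-order Sobol' indices of model(x) for x uniform on [0, 1]^dim.

    Uses two independent sample matrices A and B; for input i, column i of A
    is replaced by column i of B and the change in output is correlated
    with the output on B.
    """
    rng = random.Random(seed)
    a = [[rng.random() for _ in range(dim)] for _ in range(n_samples)]
    b = [[rng.random() for _ in range(dim)] for _ in range(n_samples)]
    y_a = [model(row) for row in a]
    y_b = [model(row) for row in b]

    all_y = y_a + y_b
    mean_y = fmean(all_y)
    total_variance = pvariance(all_y)

    indices = []
    for i in range(dim):
        acc = 0.0
        for row_a, row_b, ya, yb in zip(a, b, y_a, y_b):
            mixed = list(row_a)
            mixed[i] = row_b[i]  # take input i from B, everything else from A
            # Centering yb leaves the expectation unchanged and lowers the noise.
            acc += (yb - mean_y) * (model(mixed) - ya)
        indices.append(acc / n_samples / total_variance)
    return indices
```

As a sanity check, for the additive model `x[0] + 2 * x[1]` the exact first-order indices are 0.2 and 0.8, because the two inputs contribute variances 1/12 and 4/12. A parameter whose index stays near zero is a natural candidate for fixing or removal, which is the sense in which sensitivity analysis can inform model reduction.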


Graph Neural Network Based Coarse-Grained Mapping Prediction

The selection of coarse-grained (CG) mapping operators is a critical step for CG molecular dynamics (MD) simulation. What is optimal for this choice remains an open question, and there is a need for theory. The current state-of-the-art method is for experts to select mapping operators manually. In this work, we demonstrate an automated approach by viewing this problem as supervised learning where we seek to reproduce the mapping operators produced by experts. We present a graph neural network based CG mapping predictor called Deep Supervised Graph Partitioning Model (DSGPM) that treats mapping operators as a graph segmentation problem. DSGPM is trained on a novel dataset, Human-annotated Mappings (HAM), consisting of 1,206 molecules with expert annotated mapping operators. HAM can be used to facilitate further research in this area. Our model uses a novel metric learning objective to produce high-quality atomic features that are used in spectral clustering. The results show that DSGPM outperforms state-of-the-art methods in the field of graph segmentation. Finally, we find that predicted CG mapping operators indeed result in good CG MD models when used in simulation.


GraphKKE: Graph Kernel Koopman Embedding for Human Microbiome Analysis

More and more diseases have been found to be strongly correlated with disturbances in the microbiome constitution, e.g., obesity, diabetes, or some cancer types. Thanks to modern high-throughput omics technologies, it has become possible to directly analyze the human microbiome and its influence on health status. Microbial communities are monitored over long periods of time and the associations between their members are explored. These relationships can be described by a time-evolving graph. To understand the responses of microbial community members to a distinct range of perturbations, such as antibiotics exposure or diseases, as well as their general dynamical properties, the time-evolving graph of the human microbial communities has to be analyzed. This becomes especially challenging due to dozens of complex interactions among microbes and metastable dynamics. The key to solving this problem is the representation of the time-evolving graphs as fixed-length feature vectors preserving the original dynamics. We propose a method for learning the embedding of the time-evolving graph that is based on the spectral analysis of transfer operators and graph kernels. We demonstrate that our method can capture temporary changes in the time-evolving graph on both synthetic and real-world data. Our experiments demonstrate the efficacy of the method. Furthermore, we show that our method can be applied to human microbiome data to study dynamic processes.


Group testing as a strategy for the epidemiologic monitoring of COVID-19

Sample pooling consists of combining samples from multiple individuals into a single pool that is then tested using a single test kit. A positive test means that at least one individual within the pool is infected. Here, we propose an analysis and applications of sample pooling to the epidemiologic monitoring of COVID-19. We first introduce a model of the RT-qPCR process used to test for the presence of virus in a sample and construct a statistical model for the viral load in a typical infected individual inspired by the clinical data from Jones et al. (2020). We then propose a method for measuring the prevalence in a population, based on group testing, taking into account the increased number of false negatives associated with this method. Finally, we present an application of sample pooling for the prevention of epidemic outbreaks in closed connected communities (e.g. nursing homes).
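
The basic arithmetic behind prevalence estimation from pooled tests can be sketched as below. This is a deliberately idealized version: it assumes a perfect test and independent infections, so it ignores the dilution-related false negatives that the paper models explicitly. With prevalence p and pools of size n, a pool is positive with probability 1 - (1 - p)^n, and inverting that relation gives an estimate of p from the observed fraction of positive pools. All function names are hypothetical.

```python
import random


def pool_positive_probability(prevalence, pool_size):
    """Probability that a pool holds at least one infected sample (perfect test)."""
    return 1.0 - (1.0 - prevalence) ** pool_size


def estimate_prevalence(positive_pools, total_pools, pool_size):
    """Invert the pool-positivity formula to estimate individual prevalence."""
    if total_pools <= 0 or pool_size <= 0:
        raise ValueError("total_pools and pool_size must be positive")
    if not 0 <= positive_pools <= total_pools:
        raise ValueError("positive_pools must lie between 0 and total_pools")
    fraction = positive_pools / total_pools
    return 1.0 - (1.0 - fraction) ** (1.0 / pool_size)


def simulate_positive_pools(prevalence, pool_size, total_pools, seed=0):
    """Simulate pooled testing and count positive pools (perfect test)."""
    rng = random.Random(seed)
    positives = 0
    for _ in range(total_pools):
        if any(rng.random() < prevalence for _ in range(pool_size)):
            positives += 1
    return positives
```

For instance, calling `simulate_positive_pools(0.02, 10, 20000)` and passing the count to `estimate_prevalence` should recover a value close to 0.02 while using one test per ten individuals. Once false negatives are taken into account, the observed positive fraction is biased downward, so a realistic estimator needs a correction of the kind the paper develops.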


Guiding Deep Molecular Optimization with Genetic Exploration

De novo molecular design attempts to search over the chemical space for molecules with a desired property. Recently, deep learning has gained considerable attention as a promising approach to solve the problem. In this paper, we propose genetic expert-guided learning (GEGL), a simple yet novel framework for training a deep neural network (DNN) to generate highly rewarding molecules. Our main idea is to design a "genetic expert improvement" procedure, which generates high-quality targets for imitation learning of the DNN. Extensive experiments show that GEGL significantly improves over state-of-the-art methods. For example, GEGL manages to solve the penalized octanol-water partition coefficient optimization with a score of 31.40, while the best-known score in the literature is 27.22. In addition, for the GuacaMol benchmark with 20 tasks, our method achieves the highest score for 19 tasks, in comparison with state-of-the-art methods, and newly obtains the perfect score for three tasks.


HAN-ECG: An Interpretable Atrial Fibrillation Detection Model Using Hierarchical Attention Networks

Atrial fibrillation (AF) is one of the most prevalent cardiac arrhythmias. It affects the lives of more than 3 million people in the U.S. and over 33 million people around the world, and is associated with a five-fold increased risk of stroke and mortality. As with other problems in the healthcare domain, artificial intelligence (AI)-based algorithms have been used to reliably detect AF from patients' physiological signals. Deep learning-based methods often achieve cardiologist-level performance in detecting this arrhythmia; however, they suffer from a lack of interpretability. In other words, these approaches are unable to explain the reasons behind their decisions. The lack of interpretability is a common obstacle to the wide application of machine learning-based approaches in healthcare, and it limits the trust of clinicians in such methods. To address this challenge, we propose HAN-ECG, an interpretable bidirectional-recurrent-neural-network-based approach for the AF detection task. HAN-ECG employs three attention mechanism levels to provide a multi-resolution analysis of the patterns in ECG leading to AF. The first level, the wave level, computes the wave weights; the second level, the heartbeat level, calculates the heartbeat weights; and the third level, the window (i.e., multiple heartbeats) level, produces the window weights in triggering a class of interest. The patterns detected by this hierarchical attention model facilitate the interpretation of the neural network decision process by identifying the patterns in the signal that contributed the most to the final prediction. Experimental results on two AF databases demonstrate that our proposed model performs significantly better than the existing algorithms. Visualization of these attention layers illustrates that our model bases its decisions on the important waves and heartbeats that are clinically meaningful in the detection task.
