Featured Research

Quantitative Methods

HIV transmission in men who have sex with men in England: on track for elimination by 2030?

Background: After a decade of a treatment as prevention (TasP) strategy based on progressive HIV testing scale-up and earlier treatment, a reduction in the estimated number of new infections in men who have sex with men (MSM) in England had yet to be identified by 2010. To achieve internationally agreed targets for HIV control and elimination, test-and-treat prevention efforts were dramatically intensified over the period 2010-2015 and, from 2016, further strengthened by pre-exposure prophylaxis (PrEP). Methods: Application of a novel age-stratified back-calculation approach to data on new HIV diagnoses and CD4 count-at-diagnosis enabled age-specific estimation of HIV incidence, undiagnosed infections and mean time-to-diagnosis across both the 2010-2015 and 2016-2018 periods. Estimated incidence trends were then extrapolated to quantify the likelihood of achieving HIV elimination by 2030. Findings: A fall in HIV incidence in MSM is estimated to have started in 2012/3, eighteen months before the observed fall in new diagnoses. A steep decrease from 2,770 annual infections (95% credible interval 2,490-3,040) in 2013 to 1,740 (1,500-2,010) in 2015 is estimated, followed by a steady decline from 2016, reaching 854 (441-1,540) infections in 2018. A decline is consistently estimated in all age groups, with a fall particularly marked in the 24-35 age group and slowest in the 45+ group. Comparable declines are estimated in the number of undiagnosed infections. Interpretation: The peak and subsequent sharp decline in HIV incidence occurred prior to the phase-in of PrEP. Defining elimination as a public health threat to be < 50 new infections (1.1 infections per 10,000 at risk), 40% of incidence projections hit this threshold by 2030. In practice, targeted policies will be required, particularly among those aged 45+, where STIs are increasing most rapidly.

Quantitative Methods

Hamiltonian Dynamics of Saturated Elongation in Amyloid Fiber Formation

Elongation is a fundamental process in amyloid fiber growth, normally characterized by a linear relationship between the fiber elongation rate and the monomer concentration. At high concentrations, however, a sub-linear dependence is often observed, which can be explained by a universal saturation mechanism. In this paper, we model the saturated elongation process through a Michaelis-Menten-like mechanism consisting of two sub-steps: unspecific association and dissociation of a monomer with the fibril end, followed by a conformational change of the associated monomer to fit the fibrillar structure. Typical saturation concentrations were found to be 7-70 μM for Aβ40, α-synuclein, etc. Furthermore, using a novel Hamiltonian formulation, analytical solutions valid under both weakly and strongly saturated conditions were constructed and applied to the fibrillation kinetics of α-synuclein and silk fibroin.
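The two-step scheme described in this abstract has the same algebraic form as Michaelis-Menten kinetics. The following is a minimal sketch, not the paper's Hamiltonian solution: it assumes a quasi-steady state for the bound monomer, and the rate constants `k_on`, `k_off` and `k_conf`, along with their values, are hypothetical names chosen for illustration. Under that assumption the elongation rate per fibril end is linear in the monomer concentration at low concentration and levels off at `k_conf` at high concentration.

```python
def saturation_concentration(k_on, k_off, k_conf):
    """Concentration at which the rate reaches half its maximum.

    k_on   : association rate constant (1 / (uM * s)), hypothetical
    k_off  : dissociation rate constant (1 / s), hypothetical
    k_conf : conformational-change rate constant (1 / s), hypothetical
    """
    return (k_off + k_conf) / k_on


def elongation_rate(m, k_on, k_off, k_conf):
    """Quasi-steady-state elongation rate per fibril end at monomer concentration m (uM)."""
    k_s = saturation_concentration(k_on, k_off, k_conf)
    return k_conf * m / (k_s + m)


if __name__ == "__main__":
    # Illustrative values only; they give a saturation concentration of 20 uM.
    k_on, k_off, k_conf = 1.0, 15.0, 5.0
    for m in (0.1, 1.0, 20.0, 200.0, 2000.0):
        print(m, elongation_rate(m, k_on, k_off, k_conf))
```

With these illustrative constants the saturation concentration is 20 μM, inside the 7-70 μM range quoted above, but the values themselves are not taken from the paper.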

Quantitative Methods

High-sensitivity COVID-19 group testing by digital PCR

Background: Worldwide demand for SARS-CoV-2 RT-PCR testing is increasing as more countries are impacted by COVID-19 and as testing remains central to containing the spread of the disease, both in countries where the disease is emerging and in countries that are past the first wave but exposed to re-emergence. Group testing has been proposed as a solution to expand testing capabilities, but sensitivity concerns have limited its impact on the management of the pandemic. Digital PCR (RT-dPCR) has been shown to be more sensitive than RT-PCR and could help in this context. Methods: We implemented RT-dPCR-based COVID-19 group testing on a commercially available system and assay (Naica System from Stilla Technologies) and investigated the sensitivity of the method under real-life conditions in a university hospital in Paris, France, in May 2020. We tested the protocol in a direct comparison with reference RT-PCR testing on 448 samples split into groups of 3 sizes for RT-dPCR analysis: 56 groups of 8 samples, 28 groups of 16 samples and 14 groups of 32 samples. Results: Individual RT-PCR testing identified 25 positive samples. Using groups of 8, testing by RT-dPCR identified 23 groups as positive, corresponding to 26 true positive samples, including 2 samples not initially detected by individual RT-PCR but confirmed positive by further RT-PCR and RT-dPCR investigation. For groups of 16, 15 groups tested positive, corresponding to 25 true positive samples identified. 100% concordance was found for groups of 32, but with limited data points.
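The group counts reported in this abstract allow a rough estimate of testing workload. The sketch below assumes a classical two-stage (Dorfman) scheme, in which every sample in a positive group is retested individually. The abstract does not say that this retesting protocol was used, and the function name `dorfman_tests` is illustrative.

```python
def dorfman_tests(n_groups, group_size, positive_groups):
    """Total tests in a two-stage pooling scheme (assumed, not stated in the abstract).

    Stage 1: one test per group.
    Stage 2: one individual test per sample in every positive group.
    """
    if positive_groups > n_groups:
        raise ValueError("positive_groups cannot exceed n_groups")
    return n_groups + positive_groups * group_size


if __name__ == "__main__":
    total_samples = 448
    # (number of groups, group size, positive groups) as reported in the abstract
    for n_groups, size, positives in ((56, 8, 23), (28, 16, 15)):
        tests = dorfman_tests(n_groups, size, positives)
        print(f"groups of {size}: {tests} tests for {total_samples} samples")
```

Under that assumption, groups of 8 would require 240 tests and groups of 16 would require 268, against 448 individual tests. The number of positive groups of 32 is not given in the abstract, so that case is left out.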

Quantitative Methods

Higher Criticism Tuned Regression For Weak And Sparse Signals

Here we propose a novel search scheme for the tuning parameter in high-dimensional penalized regression methods, addressing variable selection and modeling when sample sizes are small relative to the data dimension. Our method is motivated by high-throughput biological data such as genome-wide association studies (GWAS) and epigenome-wide association studies (EWAS). We propose a new estimate of the regularization parameter λ in penalized regression methods based on an estimated lower bound of the proportion of false null hypotheses with confidence (1−α). The bound is estimated by applying the empirical null distribution of the higher criticism statistic, a second-level significance test constructed from dependent p-values using a multi-split regression and aggregation method. The tuning parameter estimate λ in penalized regression corresponds to this lower bound of the proportion of false null hypotheses. Different penalized regression methods with varied signal sparsity and strength are compared in the multi-split method setting. We demonstrate the performance of our method using both simulation experiments and applications to real data on (1) lipid-trait genetics from the Action to Control Cardiovascular Risk in Diabetes (ACCORD) clinical trial and (2) epigenetic analysis evaluating smoking's influence on differential methylation in the Agricultural Lung Health Study. The proposed algorithm is included in the HCTR package, available at this https URL.
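For readers unfamiliar with the higher criticism statistic, the sketch below computes it in its commonly cited Donoho-Jin form from a list of p-values. It is an illustration only: it does not reproduce the HCTR package, the multi-split aggregation, or the empirical null distribution described in this abstract, and the function name and the `alpha0` search fraction are assumptions.

```python
import math


def higher_criticism(p_values, alpha0=0.5):
    """Higher criticism statistic over the smallest alpha0 fraction of p-values.

    For sorted p-values p_(1) <= ... <= p_(n), returns the maximum of
    sqrt(n) * (i/n - p_(i)) / sqrt(p_(i) * (1 - p_(i))) over i <= alpha0 * n.
    P-values equal to 0 or 1 are skipped to avoid division by zero.
    """
    p_sorted = sorted(p_values)
    n = len(p_sorted)
    if n == 0:
        raise ValueError("p_values must not be empty")
    limit = max(1, int(alpha0 * n))
    best = -math.inf  # returned unchanged if every candidate p-value is skipped
    for i, p in enumerate(p_sorted[:limit], start=1):
        if not 0.0 < p < 1.0:
            continue
        z = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1.0 - p))
        best = max(best, z)
    return best


if __name__ == "__main__":
    null_like = [i / 101 for i in range(1, 101)]   # evenly spread p-values
    with_signal = [1e-6] * 5 + null_like[5:]       # five very small p-values
    print(higher_criticism(null_like), higher_criticism(with_signal))
```

Evenly spread p-values give a value near zero, while a handful of very small p-values among many null ones push the statistic up sharply, which is the weak-and-sparse regime the method targets.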

Quantitative Methods

Higher-order interactions in fitness landscapes are sparse

Biological fitness arises from interactions between molecules, genes, and organisms. To discover the causative mechanisms of this complexity, we must differentiate the significant interactions from a large number of possibilities. Epistasis is the standard way to identify interactions in fitness landscapes. However, this intuitive approach breaks down in higher dimensions, for example because the sign of epistasis takes on an arbitrary meaning and the false discovery rate becomes high. These limitations make it difficult to evaluate the role of epistasis in higher dimensions. Here we develop epistatic filtrations, a dimensionally normalized approach to defining fitness landscape topography for higher-dimensional spaces. We apply the method to higher-dimensional datasets from genetics and the gut microbiome. This reveals a sparse higher-order structure that often arises from lower-order interactions. Despite their sparsity, these higher-order effects have a significant impact on biological fitness and are consequential for ecology and evolution.
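As background for the two-locus case that the abstract calls intuitive, the sketch below computes the standard pairwise epistasis value: the deviation of the double mutant's fitness from the additive expectation. It does not implement the epistatic filtrations introduced in the paper, and the function name and fitness values are illustrative assumptions.

```python
def pairwise_epistasis(w00, w10, w01, w11):
    """Two-locus epistasis on an additive scale.

    w00 : fitness of the wild type
    w10 : fitness with only the first mutation
    w01 : fitness with only the second mutation
    w11 : fitness with both mutations
    Zero means the two mutations combine additively.
    """
    return w11 - w10 - w01 + w00


if __name__ == "__main__":
    print(pairwise_epistasis(0.0, 1.0, 2.0, 3.0))  # additive landscape
    print(pairwise_epistasis(0.0, 1.0, 2.0, 5.0))  # double mutant fitter than expected
    print(pairwise_epistasis(0.0, 1.0, 2.0, 2.0))  # double mutant less fit than expected
```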

Quantitative Methods

HistomicsML2.0: Fast interactive machine learning for whole slide imaging data

Extracting quantitative phenotypic information from whole-slide images presents significant challenges for investigators who are not experienced in developing image analysis algorithms. We present new software that enables rapid learn-by-example training of machine learning classifiers for detection of histologic patterns in whole-slide imaging datasets. HistomicsML2.0 uses convolutional networks to be readily adaptable to a variety of applications, provides a web-based user interface, and is available as a software container to simplify deployment.

Quantitative Methods

How initial distribution affects symmetry breaking induced by panic in ants: experiment and flee-pheromone model

Collective escaping is a ubiquitous phenomenon in animal groups. Symmetry breaking caused by panic escape is a feature shared across species: one exit is used more than the other when agents escape from a closed space with two symmetrically located exits. Intuitively, an exit will be used more if more individuals start close to it, that is, if the initial distribution is asymmetric. We used ant groups to investigate how the initial distribution of colonies influences symmetry breaking in collective escaping. Surprisingly and counter-intuitively, there was no positive correlation between symmetry breaking and the asymmetric initial distribution. In the experiments, a flee stage was observed, and accordingly a flee-pheromone model was introduced to describe this special behavior in the early stage of escaping. Simulation results fitted the experiment well. Furthermore, the flee stage duration was calibrated quantitatively, and the model reproduced the observation reported in our previous work. This paper explicitly distinguishes two stages in ant panic escaping for the first time, thus enhancing the understanding of the escaping behavior of ant colonies.

Quantitative Methods

Hypergraph Models of Biological Networks to Identify Genes Critical to Pathogenic Viral Response

Background: Representing biological networks as graphs is a powerful approach to reveal underlying patterns, signatures, and critical components from high-throughput biomolecular data. However, graphs do not natively capture the multi-way relationships present among genes and proteins in biological systems. Hypergraphs are generalizations of graphs that naturally model multi-way relationships and have shown promise in modeling systems such as protein complexes and metabolic reactions. In this paper we seek to understand how hypergraphs can more faithfully identify, and potentially predict, important genes based on complex relationships inferred from genomic expression data sets. Results: We compiled a novel data set of transcriptional host response to pathogenic viral infections and formulated relationships between genes as a hypergraph where hyperedges represent significantly perturbed genes and vertices represent individual biological samples with specific experimental conditions. We find that hypergraph betweenness centrality is a superior method for identifying genes important to viral response when compared with graph centrality. Conclusions: Our results demonstrate the utility of using hypergraphs to represent complex biological systems and highlight central, important responses common to a variety of highly pathogenic viruses.
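The construction in this abstract, with genes as hyperedges and samples as vertices, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it uses one common notion of hyperedge betweenness, in which two hyperedges are linked when they share at least `s` vertices and ordinary shortest-path betweenness (Brandes' algorithm) is then computed on that derived graph. The gene and sample names are hypothetical.

```python
from collections import deque
from itertools import combinations


def s_line_graph(hyperedges, s=1):
    """Link two hyperedges (genes) when they share at least s vertices (samples)."""
    adj = {edge: set() for edge in hyperedges}
    for a, b in combinations(hyperedges, 2):
        if len(hyperedges[a] & hyperedges[b]) >= s:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def betweenness(adj):
    """Unnormalized betweenness centrality of an undirected, unweighted graph."""
    score = {v: 0.0 for v in adj}
    for source in adj:
        order = []                                  # nodes in BFS visiting order
        preds = {v: [] for v in adj}                # shortest-path predecessors
        n_paths = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        n_paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        dependency = {v: 0.0 for v in adj}
        for w in reversed(order):                   # farthest nodes first
            for v in preds[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1.0 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each unordered pair of endpoints was counted once from each end.
    return {v: value / 2.0 for v, value in score.items()}


if __name__ == "__main__":
    # Hypothetical data: each gene maps to the samples in which it is perturbed.
    genes = {
        "geneA": {"sample1"},
        "geneB": {"sample1", "sample2"},
        "geneC": {"sample2"},
    }
    print(betweenness(s_line_graph(genes, s=1)))
```

In this toy example `geneB` is the only gene that connects the other two, so it is the only one with a non-zero score.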

Quantitative Methods

Identification and Validation of the SNV Biomarkers Based on Multi-Dimensional Patterns

Background: Single nucleotide variants (SNVs) show different distributions in DNA samples from patients with distinct types of cancer. Even so, selecting an appropriate method to identify cancer from SNVs is an exacting task. Results: In this paper, we propose a biomarker concept based on SNV patterns in different feature dimensions. A raw dataset (2761 samples) covering twelve different cancers was obtained from TCGA (The Cancer Genome Atlas). After preliminary screening of 562,321 DNA mutation sites in the samples, the mutation sites were extracted and characterized by cancer type in six different SNV feature dimensions. We found that the extracted features showed similar distributions around the cluster center of each sample's disease type. After the initial processing of the raw data, the samples were more concentrated on the distribution of the cancer or cancer subtype at the SNV level. We used k-nearest neighbors (KNN) to classify the extracted features and validated the results with leave-one-out cross-validation. Classification accuracy was stable at around 97%, peaking at 97.43%. During the validation phase, we found validated oncogenes at the loci of the features with the highest importance among nine cancers. Conclusions: In summary, the samples showed consistent patterns according to the cancer to which they belong. It is feasible to classify the cancer of a sample with high accuracy from the distribution of its SNVs across different dimensions, which also has potential implications for the discovery of cancer-causing genes.
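The evaluation protocol named in this abstract, k-nearest neighbors scored by leave-one-out cross-validation, can be sketched in a few lines. The code below is an illustration under stated assumptions: the Euclidean distance, the value of `k`, and the toy two-dimensional data are not from the paper, whose six-dimensional SNV features and settings are not given here.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of query by majority vote among its k nearest training points."""
    nearest = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(xs, ys, k=3):
    """Hold out each sample in turn, classify it from the rest, and return the hit rate."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if knn_predict(train_x, train_y, xs[i], k) == ys[i]:
            correct += 1
    return correct / len(xs)


if __name__ == "__main__":
    # Toy data: two well-separated clusters standing in for two cancer types.
    xs = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
    ys = ["typeA"] * 4 + ["typeB"] * 4
    print(leave_one_out_accuracy(xs, ys, k=3))
```

Leave-one-out uses every sample as a test case exactly once, which makes the most of a limited number of samples at the cost of one model fit per sample.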

Quantitative Methods

Identifying and Analyzing Sepsis States: A Retrospective Study on Patients with Sepsis in ICUs

Sepsis accounts for more than 50% of hospital deaths, and the associated cost ranks the highest among hospital admissions in the US. Improved understanding of disease states, severity, and clinical markers has the potential to significantly improve patient outcomes and reduce cost. We develop a computational framework that identifies disease states in sepsis using clinical variables and samples in the MIMIC-III database. We identify six distinct patient states in sepsis, each associated with different manifestations of organ dysfunction. We find that the patient populations in different sepsis states differ statistically significantly in their demographic and comorbidity profiles. Collectively, our framework provides a holistic view of sepsis, and our findings provide the basis for future development of clinical trials and therapeutic strategies for sepsis.
