Featured Research

Quantitative Methods

BioStatFlow - Statistical Analysis Workflow for "Omics" Data

BioStatFlow is a free web application that facilitates statistical analysis of "omics" data, including metabolomics data, using R packages. It is a fast and easy online tool for biologists who are not experts in univariate and multivariate statistics, do not have time to learn the R language, and have only basic notions of biostatistics. It guides the biologist through the steps of a statistical workflow, from data normalization and imputation of missing data to univariate and multivariate analyses. It also includes tools to reconstruct and visualize correlation-based networks. All outputs can be saved in a session or downloaded. New analytical modules can be added upon request. BioStatFlow is available online: this http URL

Quantitative Methods

Biologically-informed neural networks guide mechanistic modeling from sparse experimental data

Biologically-informed neural networks (BINNs), an extension of physics-informed neural networks [1], are introduced and used to discover the underlying dynamics of biological systems from sparse experimental data. In the present work, BINNs are trained in a supervised learning framework to approximate in vitro cell biology assay experiments while respecting a generalized form of the governing reaction-diffusion partial differential equation (PDE). By allowing the diffusion and reaction terms to be multilayer perceptrons (MLPs), the nonlinear forms of these terms can be learned while simultaneously converging to the solution of the governing PDE. Further, the trained MLPs are used to guide the selection of biologically interpretable mechanistic forms of the PDE terms, which provides new insights into the biological and physical mechanisms that govern the dynamics of the observed system. The method is evaluated on sparse real-world data from wound healing assays with varying initial cell densities [2].
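
As a rough illustration of what a "generalized form" of a reaction-diffusion PDE can look like, the equation below is written in one spatial dimension for a cell density u(x, t). The symbols D and G are assumed names for the diffusion and reaction (growth) terms that the abstract says are represented by MLPs; the exact form used in the paper is not stated here, so this is a sketch only.

```latex
\frac{\partial u}{\partial t}
  = \frac{\partial}{\partial x}\!\left( D(u)\,\frac{\partial u}{\partial x} \right)
  + G(u)\,u
```

Under this assumed form, the classical Fisher-KPP equation is the special case with constant D and G(u) = r(1 - u/K), which shows how a learned D and G might later be matched to an interpretable mechanistic expression.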

Quantitative Methods

COVID-19 Image Data Collection: Prospective Predictions Are the Future

Across the world's coronavirus disease 2019 (COVID-19) hot spots, the need to streamline patient diagnosis and management has become more pressing than ever. As one of the main imaging tools, chest X-rays (CXRs) are common, fast, non-invasive, relatively cheap, and can potentially be taken at the bedside to monitor the progression of the disease. This paper describes the first public COVID-19 image data collection as well as a preliminary exploration of possible use cases for the data. This dataset currently contains hundreds of frontal view X-rays and is the largest public resource for COVID-19 image and prognostic data, making it a necessary resource to develop and evaluate tools to aid in the treatment of COVID-19. It was manually aggregated from publication figures as well as various web-based repositories into a machine learning (ML) friendly format with accompanying dataloader code. We collected frontal and lateral view imagery and metadata such as the time since first symptoms, intensive care unit (ICU) status, survival status, intubation status, and hospital location. We present multiple possible use cases for the data, such as predicting the need for the ICU, predicting patient survival, and understanding a patient's trajectory during treatment. Data can be accessed here: this https URL

Quantitative Methods

COVID-19 Related Mobility Reduction: Heterogenous Effects on Sleep and Physical Activity Rhythms

Mobility restrictions imposed to suppress coronavirus transmission can alter physical activity (PA) and sleep patterns. Characterization of response heterogeneity and its underlying reasons may assist in tailoring customized interventions. We obtained wearable data covering baseline, incremental movement restriction and lockdown periods from 1824 city-dwelling, working adults aged 21 to 40 years, incorporating 206,381 nights of sleep and 334,038 days of PA. Four distinct rest activity rhythms (RARs) were identified using k-means clustering of participants' temporally distributed step counts. Hierarchical clustering of the proportion of time spent in each of these RARs revealed four groups who expressed different mixtures of RAR profiles before and during the lockdown. Substantial but asymmetric delays in bedtime and waketime resulted in a 24 min increase in weekday sleep duration with no loss in sleep efficiency. Resting heart rate declined by 2 bpm. PA dropped by an average of 38%. Of the four groups, three were better able to maintain PA and weekday/weekend differentiation during lockdown. The least active group, comprising 51 percent of the sample, was younger and predominantly single. Habitually less active already, this group showed the greatest reduction in PA during lockdown, with little weekday/weekend difference. Among different mobility restrictions, removal of habitual social cues by lockdown had the largest effect on PA and sleep. Sleep and resting heart rate unexpectedly improved. RAR evaluation uncovered heterogeneity of responses to lockdown and can identify characteristics of persons at risk of decline in health and wellbeing.
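
The k-means step described above can be illustrated with a minimal sketch. It assumes each participant-day is summarized as a short vector of step counts per time bin. The bin layout, the toy values, the function name `kmeans`, and the choice of plain Lloyd iterations with squared Euclidean distance are all assumptions for illustration, not details given in the abstract.

```python
import random

def kmeans(profiles, k, iters=50, seed=0):
    """Plain Lloyd's k-means on equal-length numeric profiles (illustrative only)."""
    rng = random.Random(seed)
    # Initialise the centers from k randomly chosen profiles.
    centers = [list(p) for p in rng.sample(profiles, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: index of the nearest center by squared Euclidean distance.
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in profiles
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each center becomes the mean of its assigned profiles.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Hypothetical step counts in four time bins (morning, midday, afternoon, evening).
profiles = [
    [900, 100, 50, 50],
    [800, 150, 60, 40],
    [50, 60, 120, 850],
    [40, 50, 100, 900],
]
labels, centers = kmeans(profiles, k=2)
print(labels)  # the two morning-heavy days share one label, the two evening-heavy days the other
```

In the study the same idea would be applied with four clusters; the proportion of days each participant spends in each cluster could then feed the hierarchical clustering step.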

Quantitative Methods

COVID-19 and the difficulty of inferring epidemiological parameters from clinical data

Knowing the infection fatality ratio (IFR) is of crucial importance for evidence-based epidemic management: for immediate planning; for balancing the life years saved against the life years lost due to the consequences of management; and for evaluating the ethical issues associated with the tacit willingness to pay substantially more for life years lost to the epidemic than for those lost to other diseases. Against this background, Verity et al. (2020, Lancet Infectious Diseases) have rapidly assembled case data and used statistical modelling to infer the IFR for COVID-19. We have attempted an in-depth statistical review of their approach, to identify to what extent the data are sufficiently informative about the IFR to play a greater role than the modelling assumptions, and have tried to identify those assumptions that appear to play a key role. Given the difficulties with other data sources, we provide a crude alternative analysis based on the Diamond Princess cruise ship data and case data from China, and argue that, given the data problems, modelling of clinical data to obtain the IFR can only be a stop-gap measure. What is needed is near-direct measurement of epidemic size by PCR and/or antibody testing of random samples of the at-risk population.

Quantitative Methods

COVID-19 transmission risk factors

We analyze risk factors correlated with the initial transmission growth rate of the recent COVID-19 pandemic in different countries. The number of cases follows in its early stages an almost exponential expansion; we chose as a starting point in each country the first day $d_i$ with 30 cases and we fitted for 12 days, capturing thus the early exponential growth. We looked then for linear correlations of the exponents $\alpha$ with other variables, for a sample of 126 countries. We find a positive correlation, i.e. faster spread of COVID-19, with high confidence level with the following variables, with respective $p$-values: low temperature ($4\cdot 10^{-7}$), high ratio of old vs. working-age people ($3\cdot 10^{-6}$), life expectancy ($8\cdot 10^{-6}$), number of international tourists ($1\cdot 10^{-5}$), earlier epidemic starting date $d_i$ ($2\cdot 10^{-5}$), high level of physical contact in greeting habits ($6\cdot 10^{-5}$), lung cancer prevalence ($6\cdot 10^{-5}$), obesity in males ($1\cdot 10^{-4}$), share of population in urban areas ($2\cdot 10^{-4}$), cancer prevalence ($3\cdot 10^{-4}$), alcohol consumption ($0.0019$), daily smoking prevalence ($0.0036$), UV index ($0.004$, 73 countries). We also find a correlation with low Vitamin D levels ($0.002$ to $0.006$, smaller sample, $\sim 50$ countries, to be confirmed on a larger sample). There is highly significant correlation also with blood types: positive correlation with types RH- ($3\cdot 10^{-5}$) and A+ ($3\cdot 10^{-3}$), negative correlation with B+ ($2\cdot 10^{-4}$). Several of the above variables are intercorrelated and likely to have common interpretations. We performed a Principal Component Analysis, in order to find their significant independent linear combinations. We also analyzed a possible bias: countries with low GDP per capita might have less testing, and we discuss correlation with the above variables.
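
The fitting procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions: "first day with 30 cases" is read as the first day with at least 30 cumulative cases, the exponent is obtained by ordinary least squares on the logarithm of the counts over 12 consecutive days, and the case series and temperatures are synthetic. The function names `growth_exponent` and `pearson_r` are hypothetical, and the p-value computation is not reproduced.

```python
import math

def growth_exponent(cases, threshold=30, window=12):
    """Return (start_day, alpha) from a least-squares fit of log(cases) against day."""
    # Starting day d_i: first day with at least `threshold` cases (assumed reading).
    start = next(i for i, c in enumerate(cases) if c >= threshold)
    ys = [math.log(c) for c in cases[start:start + window]]
    ts = range(len(ys))
    t_mean = sum(ts) / len(ys)
    y_mean = sum(ys) / len(ys)
    # Slope of the regression line, i.e. the exponent alpha in cases ~ exp(alpha * t).
    alpha = (sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
             / sum((t - t_mean) ** 2 for t in ts))
    return start, alpha

def pearson_r(xs, ys):
    """Pearson linear correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Synthetic countries: exactly exponential case counts with known rates.
true_rates = [0.30, 0.25, 0.20, 0.15]
temperatures = [2.0, 8.0, 15.0, 25.0]  # hypothetical mean temperatures
alphas = []
for rate in true_rates:
    series = [5 * math.exp(rate * t) for t in range(40)]
    _, alpha = growth_exponent(series)
    alphas.append(alpha)

r = pearson_r(temperatures, alphas)
print(alphas, r)  # r is negative here: lower temperature pairs with faster growth
```

Because the synthetic series are exactly exponential, the fitted exponents recover the input rates; real case counts would of course be noisier.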

Quantitative Methods

Calibration of Biophysical Models for tau-Protein Spreading in Alzheimer's Disease from PET-MRI

Aggregates of misfolded tau proteins (or just 'tau' for brevity) play a crucial role in the progression of Alzheimer's disease (AD) as they correlate with cell death and accelerated tissue atrophy. Longitudinal positron emission tomography (PET) scans can be used to quantify the extent of abnormal tau spread. Such PET-based image biomarkers are a promising technology for AD diagnosis and prognosis. Here, we propose to calibrate an organ-scale biophysical mathematical model using longitudinal PET scans to extract characteristic growth patterns and spreading of tau. The biophysical model is a reaction-advection-diffusion partial differential equation (PDE) with only two scalar unknown parameters, one representing the spreading (the diffusion part of the PDE) and the other one the growth of tau (the reaction part of the PDE). The advection term captures tissue atrophy and is obtained from diffeomorphic registration of longitudinal magnetic resonance imaging (MRI) scans. We describe the method, present a numerical scheme for the calibration of the growth and spreading parameters, perform a sensitivity study using synthetic data, and perform a preliminary evaluation on clinical scans from the ADNI dataset. We study whether such model calibration is possible and investigate the sensitivity of such calibration to the time between consecutive scans and the presence of atrophy. Our findings show that despite using only two calibration parameters, the model can reconstruct clinical scans quite accurately. We discovered that small time intervals between scans and the presence of background noise create difficulties. Our reconstructed model fits the data well, yet the study on clinical data also reveals shortcomings of the simplistic model. Interestingly, the parameters show significant variability across patients, an indication that these parameters could be useful biomarkers.
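
For orientation, a reaction-advection-diffusion PDE with two scalar parameters can be written schematically as below. All symbols are assumed for illustration: c is the tau concentration, v the velocity field obtained from MRI registration, kappa the spreading (diffusion) parameter, rho the growth (reaction) parameter, and R(c) an unspecified reaction function. The abstract does not give the exact form used, so this is a sketch only.

```latex
\frac{\partial c}{\partial t} + \nabla \cdot (c\,\mathbf{v})
  = \kappa\,\Delta c + \rho\,R(c)
```

In this reading, calibration amounts to finding the pair (kappa, rho) for which the solution at the time of the later scan best matches the observed PET image.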

Quantitative Methods

Can We Detect Mastitis earlier than Farmers?

The aim of this study was to build a modelling framework, based on machine learning techniques, that detects mastitis infections before farmers would normally find them. We created two modelling frameworks. The first, SMA, detects sub-clinical mastitis infections one Somatic Cell Count recording in advance. The second, AMA, tries to detect both sub-clinical and clinical mastitis (CM) infections at any time the cow is milked. We also introduce two feature sets representing different characteristics that should be taken into account when detecting infections: how a cow differs from the farm mean, and trends in the lactation. SMA gives better results than AMA for sub-clinical infections, yet it has the significant disadvantage of only being able to classify sub-clinical infections. This is because we recorded a sub-clinical infection whenever a Somatic Cell Count measurement went above a certain threshold, whereas CM could appear at any stage of lactation. Thus, in some cases AMA, despite its lower accuracy values, might in fact be more beneficial to farmers.

Quantitative Methods

Can tumor location on pre-treatment MRI predict likelihood of pseudo-progression versus tumor recurrence in Glioblastoma? A feasibility study

A significant challenge in Glioblastoma (GBM) management is distinguishing pseudo-progression (PsP), a benign radiation-induced effect, from tumor recurrence on routine imaging following conventional treatment. Previous studies have linked tumor lobar presence and laterality to GBM outcomes, suggesting that disease etiology and progression in GBM may be impacted by tumor location. Hence, in this feasibility study, we investigate the following question: Can tumor location on treatment-naïve MRI provide early cues regarding the likelihood of a patient developing pseudo-progression versus tumor recurrence? In this study, 74 pre-treatment Glioblastoma MRI scans with PsP (33) and tumor recurrence (41) were analyzed. First, the enhancing lesion on Gd-T1w MRI and peri-lesional hyperintensities on T2w/FLAIR were segmented by experts and then registered to a brain atlas. Using patients from the two phenotypes, we constructed two atlases by quantifying the frequency of occurrence of enhancing lesion and peri-lesional hyperintensities, averaging voxel intensities across the population. Analysis of differential involvement was then performed to compute voxel-wise significant differences (p-value<0.05) across the atlases. Statistically significant clusters were finally mapped to a structural atlas to provide their anatomic localization. Our results demonstrate that patients with tumor recurrence showed prominence of their initial tumor in the parietal lobe, while patients with PsP showed a multi-focal distribution of the initial tumor in the frontal and temporal lobes, insula, and putamen. These preliminary results suggest that lateralization of pre-treatment lesions towards certain anatomical areas of the brain may provide early cues for assessing the likelihood of pseudo-progression versus tumor recurrence on MRI scans.
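
The atlas construction described above can be illustrated with a small sketch. It assumes each patient's segmentation has already been registered to a common space and flattened into a list of 0/1 voxel values; the toy masks and the function name `frequency_atlas` are hypothetical, and the voxel-wise significance test is not reproduced.

```python
def frequency_atlas(masks):
    """Voxel-wise mean of registered binary lesion masks (flat lists of 0/1)."""
    n = len(masks)
    # zip(*masks) groups the values of all patients for each voxel position.
    return [sum(voxel) / n for voxel in zip(*masks)]

# Hypothetical registered masks with four voxels each.
psp_masks = [
    [1, 0, 0, 1],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
]
rec_masks = [
    [0, 0, 1, 1],
    [0, 0, 1, 0],
]

psp_atlas = frequency_atlas(psp_masks)  # fraction of PsP patients with lesion at each voxel
rec_atlas = frequency_atlas(rec_masks)  # same for tumor recurrence patients
# Positive values: voxel more often involved in PsP; negative: more often in recurrence.
difference = [p - r for p, r in zip(psp_atlas, rec_atlas)]
print(difference)
```

A statistical test applied per voxel to the two groups would then decide which of these differences are significant.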

Quantitative Methods

Casework applications of probabilistic genotyping methods for DNA mixtures that allow relationships between contributors

In both criminal cases and civil cases there is an increasing demand for the analysis of DNA mixtures involving relationships. The goal might be, for example, to identify the contributors to a DNA mixture where the donors may be related, or to infer the relationship between individuals based on a DNA mixture. This paper applies a recent approach to modelling and computation for DNA mixtures involving contributors with arbitrarily complex relationships to two real cases from the Spanish Forensic Police.
