Featured Researches

Quantitative Methods

Ensemble Deep Learning on Large, Mixed-Site fMRI Datasets in Autism and Other Tasks

Deep learning models for MRI classification face two recurring problems: they are typically limited by small sample sizes, and they are obscured by their own complexity (the "black box problem"). In this paper, we train a convolutional neural network (CNN) with the largest multi-source functional MRI (fMRI) connectomic dataset ever compiled, consisting of 43,858 datapoints. We apply this model to a cross-sectional comparison of autism (ASD) vs typically developing (TD) controls, a contrast that has proved difficult to characterise with inferential statistics. To contextualise these findings, we additionally perform classifications of gender and task vs rest. Employing class balancing to build a training set, we trained 3 × 300 modified CNNs in an ensemble model to classify fMRI connectivity matrices, with overall AUROCs of 0.6774, 0.7680, and 0.9222 for ASD vs TD, gender, and task vs rest, respectively. Additionally, we aim to address the black box problem in this context using two visualization methods. First, class activation maps show which functional connections of the brain our models focus on when performing classification. Second, by analyzing maximal activations of the hidden layers, we explore how the model organizes a large, mixed-centre dataset, finding that it dedicates specific areas of its hidden layers to processing different covariates of the data (depending on the independent variable analyzed) and other areas to mixing data from different sources. Our study finds that deep learning models that distinguish ASD from TD controls focus broadly on temporal and cerebellar connections, with a particularly high focus on the right caudate nucleus and paracentral sulcus.
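
To make the two ideas concrete, here is a minimal sketch (not the paper's 3 × 300-member architecture) of an ensemble of small CNNs over connectivity matrices, with a class activation map read off a global-average-pooling head; the matrix size and all hyperparameters are illustrative assumptions.

```python
# A minimal sketch, assuming hypothetical shapes: an ensemble of small CNNs
# classifying synthetic "connectivity matrices", plus a class activation map
# (CAM) computed from the weights of a global-average-pooling classifier head.
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        self.gap = nn.AdaptiveAvgPool2d(1)   # global average pooling
        self.fc = nn.Linear(32, n_classes)   # CAM weights live here

    def forward(self, x):
        f = self.features(x)                 # (B, 32, H, W)
        logits = self.fc(self.gap(f).flatten(1))
        return logits, f

def class_activation_map(model, x, cls):
    """Weight each feature map by the fc weight of the target class."""
    _, fmaps = model(x)                      # (B, C, H, W)
    w = model.fc.weight[cls]                 # (C,)
    return torch.einsum("c,bchw->bhw", w, fmaps)

# Ensemble prediction: average softmax probabilities over member models
# (5 members here for brevity; the paper trains 3 x 300).
models = [TinyCNN() for _ in range(5)]
x = torch.randn(4, 1, 116, 116)              # e.g. atlas-sized matrices
probs = torch.stack([m(x)[0].softmax(-1) for m in models]).mean(0)
cam = class_activation_map(models[0], x, cls=1)
print(probs.shape, cam.shape)
```

Averaging member probabilities is one common way to combine an ensemble; the paper's exact aggregation rule may differ.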

Read more
Quantitative Methods

Ensemble Transfer Learning for the Prediction of Anti-Cancer Drug Response

Transfer learning has been shown to be effective in many applications in which training data for the target problem are limited but data for a related (source) problem are abundant. In this paper, we apply transfer learning to the prediction of anti-cancer drug response. Previous transfer learning studies for drug response prediction focused on building models that predict the response of tumor cells to a specific drug treatment. We target the more challenging task of building general prediction models that can make predictions for both new tumor cells and new drugs. We apply the classic transfer learning framework, which trains a prediction model on the source dataset and refines it on the target dataset, and extend the framework through ensembling. The ensemble transfer learning pipeline is implemented using LightGBM and two deep neural network (DNN) models with different architectures. Uniquely, we investigate its power for three application settings, namely drug repurposing, precision oncology, and new drug development, through different data partition schemes in cross-validation. We test the proposed ensemble transfer learning on benchmark in vitro drug screening datasets, taking one dataset as the source domain and another as the target domain. The results demonstrate the benefit of applying ensemble transfer learning for predicting anti-cancer drug response in all three applications with both LightGBM and DNN models. Among the prediction models, a DNN model with two subnetworks that take tumor features and drug features as separate inputs outperforms LightGBM and the DNN model that concatenates tumor and drug features into a single input in the drug repurposing and precision oncology applications. In the more challenging application of new drug development, LightGBM performs better than the two DNN models.
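
A rough sketch of the train-on-source / refine-on-target step using LightGBM's continued training (`init_model`), with the ensemble formed by refining on bootstrap resamples of the target set; the data here are synthetic stand-ins, and the paper's exact refinement and ensembling scheme may differ.

```python
# A minimal sketch of ensemble transfer learning with LightGBM: pre-train on
# the abundant source data, then continue boosting on (resamples of) the
# scarce target data, and average the refined members' predictions.
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(0)
X_src, y_src = rng.normal(size=(2000, 50)), rng.normal(size=2000)  # source
X_tgt, y_tgt = rng.normal(size=(200, 50)), rng.normal(size=200)    # target

params = {"objective": "regression", "verbosity": -1}
source_model = lgb.train(params, lgb.Dataset(X_src, y_src),
                         num_boost_round=200)

# Refine several copies on bootstrap resamples of the target set, then
# average their predictions -- one simple way to form the ensemble.
members = []
for seed in range(5):
    idx = rng.integers(0, len(y_tgt), size=len(y_tgt))
    refined = lgb.train(params, lgb.Dataset(X_tgt[idx], y_tgt[idx]),
                        num_boost_round=50, init_model=source_model)
    members.append(refined)

pred = np.mean([m.predict(X_tgt) for m in members], axis=0)
print(pred.shape)
```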

Read more
Quantitative Methods

EoN (Epidemics on Networks): a fast, flexible Python package for simulation, analytic approximation, and analysis of epidemics on networks

We provide a description of the Epidemics on Networks (EoN) Python package, designed for studying disease spread on static networks. The package provides over 100 methods with which users can perform stochastic simulation of a range of processes, including SIS and SIR disease spread and generic simple or complex contagions.
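
A short usage example (EoN is available on PyPI; the network and parameter values below are illustrative only):

```python
# Simulate a stochastic SIR epidemic on a random contact network with EoN.
import networkx as nx
import EoN

G = nx.fast_gnp_random_graph(10000, 0.001)         # contact network
t, S, I, R = EoN.fast_SIR(G, tau=0.3, gamma=1.0,   # transmission/recovery rates
                          rho=0.005)               # initial infected fraction
print(f"final epidemic size: {R[-1] / G.number_of_nodes():.2%}")
```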

Read more
Quantitative Methods

Equating the NESSCA and SARA scales using Item Response Theory to assess impairment in Machado-Joseph disease

Background: Scale equating is a statistical technique used to establish equivalence relations between different scales. Although widely used in educational assessment, it remains uncommon in the health field, where measurement scales are tools integrated into clinical practice. When different scales are used, comparing scientific results becomes difficult, as with the NESSCA and SARA scales, both tools for assessing impairment in Machado-Joseph disease (SCA3/MJD). Objective: To explore the method of scale equating and demonstrate its application to the NESSCA and SARA scales, using the Item Response Theory (IRT) approach to assess SCA3/MJD impairment. Methods: Data came from 227 patients with SCA3/MJD from the Hospital de Clínicas de Porto Alegre who had complete measures on the NESSCA and/or SARA scales. The equating design used was non-equivalent groups with common items, with separate calibration. The IRT model used to estimate the parameters was the generalized partial credit model, for both NESSCA and SARA. The linear transformation was performed using the Mean/Mean, Mean/Sigma, Haebara, and Stocking-Lord methods, and true-score equating was applied to obtain an estimated relationship between the scores of the scales. Results: The difference between the NESSCA score estimated from SARA and the observed NESSCA score had a median of 0.82 points using the Mean/Sigma method, the best-performing of the linear transformation methods tested. Conclusions: This study extended the use of scale equating under the IRT approach to health outcomes and established an equivalence relationship between NESSCA and SARA scores, making comparisons between patients and between scientific results feasible.
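
For illustration, the standard Mean/Mean and Mean/Sigma transformation constants can be computed from the common items' parameter estimates as in the sketch below; the item parameters shown are made up, not the study's fitted NESSCA/SARA values.

```python
# A minimal sketch of the Mean/Mean and Mean/Sigma linear transformations
# used to place two separately calibrated IRT scales on a common metric.
import numpy as np

# Hypothetical common-item estimates: locations b and discriminations a,
# calibrated separately on scale X (new) and scale Y (reference).
b_x = np.array([-1.2, -0.3, 0.4, 1.1])
b_y = np.array([-1.0, -0.1, 0.7, 1.5])
a_x = np.array([0.8, 1.1, 0.9, 1.3])
a_y = np.array([0.7, 1.0, 1.0, 1.2])

# Mean/Sigma: slope from the spread of the common-item locations.
A_ms = b_y.std(ddof=1) / b_x.std(ddof=1)
B_ms = b_y.mean() - A_ms * b_x.mean()

# Mean/Mean: slope from the ratio of mean discriminations.
A_mm = a_x.mean() / a_y.mean()
B_mm = b_y.mean() - A_mm * b_x.mean()

# Parameters map to the reference metric as b* = A*b + B and a* = a / A.
print(A_ms * b_x + B_ms)
```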

Read more
Quantitative Methods

Error Correction Codes for COVID-19 Virus and Antibody Testing: Using Pooled Testing to Increase Test Reliability

We consider a novel method to increase the reliability of COVID-19 virus or antibody tests by using specially designed pooled tests. Instead of testing nasal swab or blood samples from individual persons, we propose to test mixtures of samples from many individuals. The pooled sample testing method proposed in this paper serves a purpose different from conventional group testing: it increases test reliability and provides accurate diagnoses even if the tests themselves are not very accurate. Our method uses ideas from compressed sensing and error-correction coding to correct for a certain number of errors in the test results. The intuition is that when each individual's sample is part of many pooled sample mixtures, the test results from all of the sample mixtures contain redundant information about each individual's diagnosis, which can be exploited to automatically correct wrong test results in exactly the same way that error correction codes correct errors introduced in noisy communication channels. While such redundancy can also be achieved by simply testing each individual's sample multiple times, we present simulations and theoretical arguments showing that our method is significantly more efficient at increasing diagnostic accuracy. In contrast to group testing and compressed sensing, which aim to reduce the number of required tests, the proposed error correction code idea purposefully uses pooled testing to increase test accuracy, and works not only in the "undersampling" regime but also in the "oversampling" regime, where the number of tests is larger than the number of subjects. The results in this paper run against the traditional belief that, even though pooled testing increases test capacity, it is less reliable than testing individuals separately.
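
A toy illustration of the decoding idea, using a small random pooling design and brute-force minimum-distance decoding; the paper's constructions and decoders are more sophisticated, and recovery with a random design like this is likely but not guaranteed.

```python
# Each person appears in several pools; a single wrong pool result can then
# be corrected by choosing the infection pattern whose ideal pool results
# are closest (in Hamming distance) to the observed ones. Brute force, so
# this only works for tiny n.
import itertools
import numpy as np

rng = np.random.default_rng(1)
n, m = 8, 12                                # 8 people, 12 pooled tests
M = (rng.random((m, n)) < 0.4).astype(int)  # random pooling design
truth = np.zeros(n, int)
truth[[2, 5]] = 1                           # two infected individuals

pools = (M @ truth > 0).astype(int)         # ideal OR-style pool results
noisy = pools.copy()
noisy[4] ^= 1                               # one erroneous test result

def decode(results):
    best, best_d = None, m + 1
    for bits in itertools.product([0, 1], repeat=n):
        x = np.array(bits)
        d = np.sum(((M @ x > 0).astype(int)) ^ results)
        if d < best_d:
            best, best_d = x, d
    return best

print("recovered:", decode(noisy), "truth:", truth)
```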

Read more
Quantitative Methods

Estimate Metabolite Taxonomy and Structure with a Fragment-Centered Database and Fragment Network

Metabolite structure identification has become the major bottleneck of mass spectrometry-based metabolomics research. To date, a number of mass spectral databases and search algorithms have been developed to address this issue. However, two critical problems remain: the low coverage of chemical compounds in databases, and significant MS/MS spectral variation related to experimental equipment and parameter settings. In this work, we treat molecular fragments as the basic building blocks of metabolic compounds, since they have relatively consistent signatures in MS/MS spectra. From this bottom-up point of view, we built a fragment-centered database, MSFragDB, by reorganizing data from the Human Metabolome Database (HMDB), and developed an intensity-free search algorithm that ranks the most relevant metabolites for a user's query. We also propose the concept of a fragment network, a graph structure that encodes the relationships between molecular fragments in order to find closely related motifs that indicate a specific chemical structure. Although based on the same data as the HMDB, MSFragDB achieved a higher hit ratio in validation and, furthermore, can estimate the likely taxonomy of a query spectrum when the corresponding compound is missing from the database. Aided by the fragment network, MSFragDB also proved able to estimate the correct structure even when the MS/MS spectrum suffers from precursor contamination. The proposed strategy is general and can be adopted in existing databases. We believe MSFragDB and the fragment network can improve the performance of structure identification with existing data. The beta version of the database is freely available at this http URL.
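
A minimal sketch of the fragment-centered lookup idea: index compounds by binned fragment m/z values, then rank candidates by how many query fragments they share, ignoring intensities. The entries and tolerance below are invented for illustration and are not the MSFragDB schema.

```python
# Intensity-free fragment matching: one vote per shared fragment, with a
# simple m/z binning scheme (neighbouring bins checked to honour tolerance).
from collections import defaultdict

TOL = 0.01  # m/z bin width in Da, an assumed tolerance

def binned(mz):
    return round(mz / TOL)

# Hypothetical database: compound -> fragment m/z list.
db = {
    "compound_A": [91.054, 105.070, 119.085],
    "compound_B": [91.054, 133.101],
    "compound_C": [77.039, 105.070],
}

# Inverted index: fragment bin -> compounds containing that fragment.
index = defaultdict(set)
for name, frags in db.items():
    for mz in frags:
        index[binned(mz)].add(name)

def search(query_mz):
    hits = defaultdict(int)
    for mz in query_mz:
        b = binned(mz)
        names = set().union(*(index.get(b + d, set()) for d in (-1, 0, 1)))
        for name in names:
            hits[name] += 1          # one vote per shared fragment
    return sorted(hits.items(), key=lambda kv: -kv[1])

print(search([91.055, 105.071]))     # compound_A ranked first
```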

Read more
Quantitative Methods

Evaluation of the Penetration Process of Fluorescent Collagenase Nanocapsules in a 3D Collagen Gel

One of the major limitations of nanomedicine is the poor penetration of nanoparticles into tumoral tissues. Different strategies have been tried to overcome these constraints, such as employing polyethylene glycol (PEG) to avoid opsonization or reducing the density of the extracellular matrix (ECM). Our research group has developed strategies to overcome these limitations, such as pH-sensitive collagenase nanocapsules that digest the collagen-rich ECM present in most tumoral tissues. However, a deeper understanding of the physicochemical kinetics of nanocapsule degradation during tissue penetration is needed. To this end, this work employs a double fluorescent labelling strategy for the polymeric enzyme nanocapsules, a chemical tool that allows nanocapsules and free collagenase to be tracked as they diffuse through a tumour-like collagen matrix. This extrinsic labelling strategy offers clear advantages for observing biological processes. To detect the enzyme, collagenase was labelled with fluorescein isothiocyanate (FITC), whereas the nanocapsule surface was labelled with rhodamine isothiocyanate (RITC). It was thus possible to monitor the hydrolysis of the nanocapsules and their diffusion through a thick 3D collagen gel over time, yielding a detailed temporal evaluation of the behaviour of the pH-sensitive collagenase nanocapsules. The nanocapsules displayed high enzymatic activity at low concentrations under acidic pH, and their efficient penetration into tissue models paves the way for a wide range of nanomedical applications, especially in cancer therapy.

Read more
Quantitative Methods

Exact maximal reduction of stochastic reaction networks by species lumping

Motivation: Stochastic reaction networks are a widespread model for describing biological systems in which the presence of noise is relevant, such as cell regulatory processes. Unfortunately, in all but the simplest models the resulting discrete state-space representation hinders analytical tractability and makes numerical simulations expensive. Reduction methods can lower complexity by computing model projections that preserve the dynamics of interest to the user. Results: We present an exact lumping method for stochastic reaction networks with mass-action kinetics. It hinges on an equivalence relation between the species, resulting in a reduced network where the dynamics of each macro-species is stochastically equivalent to the sum of the original species in each equivalence class, for any choice of the initial state of the system. Furthermore, by an appropriate encoding of kinetic parameters as additional species, the method can establish equivalences that do not depend on specific values of the parameters. The method is supported by an efficient algorithm to compute the largest species equivalence, and thus the maximal lumping. The effectiveness and scalability of our lumping technique, as well as the physical interpretability of the resulting reductions, are demonstrated in several models of signaling pathways and epidemic processes on complex networks. Availability: The algorithms for species equivalence have been implemented in the software tool ERODE, freely available for download from this https URL.
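
A toy check of the lumping property, not the ERODE algorithm itself: in the network A1 -> B, A2 -> B with equal mass-action rates, A1 and A2 lump into a macro-species A = A1 + A2, and the reduced network A -> B has the same law for the sum process. A Gillespie simulation makes the equivalence visible.

```python
# Compare the original two-species network against its lumped reduction by
# stochastic (Gillespie) simulation; the distributions of A1 + A2 and of
# the macro-species A at time T should agree up to Monte Carlo noise.
import numpy as np

rng = np.random.default_rng(0)
K, T = 1.0, 1.0  # rate constant and observation time (illustrative values)

def gillespie_original(a1, a2):
    t = 0.0
    while a1 + a2 > 0:
        total = K * a1 + K * a2
        t += rng.exponential(1 / total)
        if t > T:
            break
        if rng.random() < a1 / (a1 + a2):   # pick which reaction fires
            a1 -= 1
        else:
            a2 -= 1
    return a1 + a2

def gillespie_lumped(a):
    t = 0.0
    while a > 0:
        t += rng.exponential(1 / (K * a))
        if t > T:
            break
        a -= 1
    return a

orig = [gillespie_original(10, 10) for _ in range(5000)]
lump = [gillespie_lumped(20) for _ in range(5000)]
print(np.mean(orig), np.mean(lump))  # both approx. 20 * exp(-K*T)
```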

Read more
Quantitative Methods

Expiratory variability index (EVI) is associated with asthma risk, wheeze and lung function in infants with recurrent respiratory symptoms

Recurrent respiratory symptoms are common in infants, but the paucity of lung function tests suitable for routine use in infants is a widely acknowledged clinical problem. In this study we evaluated tidal breathing variability (the expiratory variability index, EVI), measured at home during sleep using impedance pneumography (IP), as a marker of lower airway obstruction in 36 infants (mean age 12.8 [range 6-23] months) with recurrent respiratory symptoms. Lowered EVI was associated with lower lung function (VmaxFRC), higher asthma risk, and obstructive symptoms, but not with nasal congestion. Measuring EVI with IP is a promising technique for lung function testing in infants.

Read more
Quantitative Methods

Explaining Chemical Toxicity using Missing Features

Chemical toxicity prediction using machine learning is important in drug development to reduce repeated animal and human testing, thus saving cost and time. It is highly recommended that the predictions of computational toxicology models be mechanistically explainable. Current state-of-the-art machine learning classifiers are based on deep neural networks, which tend to be complex and hard to interpret. In this paper, we apply a recently developed method, the contrastive explanations method (CEM), to explain why a chemical or molecule is predicted to be toxic or not. In contrast to popular methods that provide explanations based on what features are present in the molecule, the CEM additionally explains which features missing from the molecule are crucial for the prediction, known as pertinent negatives. The CEM does this by optimizing for a minimal perturbation using a projected fast iterative shrinkage-thresholding algorithm (FISTA). We verified that the explanations from CEM match known toxicophores and findings from other work.
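
A rough sketch of how a pertinent-negative-style search can be set up with projected FISTA: an L1-penalized hinge loss pushes a small classifier away from its original prediction, with a soft-thresholding proximal step and a box projection keeping the perturbed input in range. The toy model and all hyperparameters below are placeholders, not the paper's setup.

```python
# Projected FISTA for a sparse perturbation delta that flips a tiny binary
# classifier's prediction: gradient step on a hinge loss, soft-thresholding
# for the L1 penalty, projection so x + delta stays in [0, 1].
import torch

torch.manual_seed(0)
model = torch.nn.Sequential(torch.nn.Linear(16, 8), torch.nn.ReLU(),
                            torch.nn.Linear(8, 2))
x = torch.rand(16)
c = model(x).argmax().item()                 # original predicted class

lam, step, kappa = 0.01, 0.1, 0.2            # L1 weight, step size, margin
delta = torch.zeros(16)                      # current iterate
y = delta.clone()                            # FISTA momentum iterate
t_prev = 1.0
for _ in range(300):
    y_var = y.clone().requires_grad_(True)
    logits = model(x + y_var)
    # hinge loss: positive while the original class still wins by margin kappa
    loss = torch.clamp(logits[c] - logits[1 - c] + kappa, min=0)
    grad, = torch.autograd.grad(loss, y_var)
    z = y - step * grad
    # proximal step for the L1 penalty (soft-thresholding) ...
    z = torch.sign(z) * torch.clamp(z.abs() - step * lam, min=0)
    # ... then project so the perturbed input stays in the [0, 1] box
    z = torch.min(torch.max(z, -x), 1 - x)
    t = (1 + (1 + 4 * t_prev ** 2) ** 0.5) / 2
    y = z + ((t_prev - 1) / t) * (z - delta)  # FISTA momentum update
    delta, t_prev = z, t

flipped = model(x + delta).argmax().item() != c
print("class flipped:", flipped,
      "| nonzero features:", int((delta.abs() > 1e-6).sum()))
```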

Read more
