European journal of pharmaceutical sciences : official journal of the European Federation for Pharmaceutical Sciences | 2019

Enabling design of screening libraries for antibiotic discovery by modeling ChEMBL data.

 

Abstract


It is critical to identify novel antibiotics. Yet, the scientific community has struggled in this pursuit because we do not understand which molecules will penetrate the bacterial outer envelope. In this work, we have identified a large dataset of compounds known to reach their targets in bacterial cells (penetrators) and compared them with molecules that do not (non-penetrators). Our dataset, extracted from the ChEMBL database, is a useful tool to guide the selection of molecules for antibiotic screening. Simple random forest classification models are able to correctly distinguish penetrators from non-penetrators. For Gram-positive bacteria, the model demonstrated ∼87% accuracy, with high precision (∼88%) and recall (∼97%) in identifying penetrators. A paucity of data for non-penetrators was a major hurdle to model-building; we observed a ∼86% negative predictive value, but only a ∼57% specificity. Accumulation of data on non-penetrators is therefore necessary. Data for Gram-negative bacteria were also sparse, but a larger fraction of these data represented non-penetrators. Correspondingly, the resultant models performed well in predicting those molecules that would fail to enter Gram-negative cells, but were relatively weaker in correctly predicting penetrators. A comparison of physicochemical properties of penetrators and non-penetrators suggests that only marginal differences exist. Therefore, it may be difficult to identify overarching rules for generating screening libraries for antibiotic discovery based on physicochemical properties alone. Instead, models such as ours should be of use. Our models are highly preliminary and based on phenotypic data, but a similarly large dataset directly addressing accumulation of chemical matter in bacterial cells is currently unavailable. Hence, our models represent the cutting edge in design of screening libraries for antibiotic discovery until appropriate data can be compiled.
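The relationship between the metrics quoted in the abstract can be illustrated with a small calculation. The following is a minimal Python sketch, assuming a hypothetical confusion matrix whose counts were chosen only to roughly match the percentages reported for the Gram-positive model. The counts are not the study's data, and the function name `classification_metrics` is illustrative. It shows how a scarcity of non-penetrators can leave specificity low even when accuracy, precision, and recall are high.

```python
def classification_metrics(tp, fn, fp, tn):
    """Standard binary-classification metrics; 'positive' means penetrator."""
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "precision": tp / (tp + fp),      # predicted penetrators that are real
        "recall": tp / (tp + fn),         # real penetrators that were found
        "specificity": tn / (tn + fp),    # real non-penetrators that were found
        "npv": tn / (tn + fn),            # predicted non-penetrators that are real
    }


# Hypothetical counts (NOT taken from the paper), chosen so that
# penetrators greatly outnumber non-penetrators.
metrics = classification_metrics(tp=970, fn=30, fp=132, tn=175)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With these assumed counts, penetrators outnumber non-penetrators by roughly three to one, so misclassifying 132 of 307 non-penetrators has little effect on accuracy while pulling specificity down to about 57%.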

Pages 105166
DOI 10.1016/j.ejps.2019.105166
Language English
Journal European journal of pharmaceutical sciences : official journal of the European Federation for Pharmaceutical Sciences

Full Text