Gilles Marcou | Researchain

Publication

Featured research published by Gilles Marcou.

Journal of Chemical Information and Modeling | 2007

Optimizing Fragment and Scaffold Docking by Use of Molecular Interaction Fingerprints

Gilles Marcou; Didier Rognan

Protein-ligand interaction fingerprints have been used to postprocess docking poses of three ligand data sets: a set of 40 low-molecular-weight compounds from the Protein Data Bank, a collection of 40 scaffolds from pharmaceutically relevant protein ligands, and a database of 19 scaffolds extracted from true cdk2 inhibitors seeded in 2230 scaffold decoys. Four popular docking tools (FlexX, Glide, Gold, and Surflex) were used to generate poses for ligands of the three data sets. In all cases, scoring by the similarity of interaction fingerprints to a given reference was statistically superior to conventional scoring functions in posing low-molecular-weight fragments, predicting protein-bound scaffold coordinates according to the known binding mode of related ligands, and screening a scaffold library to enrich a hit list in true cdk2-targeted scaffolds.
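The rescoring idea described above can be illustrated with a minimal sketch. It assumes interaction fingerprints are plain bit lists and that similarity is measured with the Tanimoto coefficient; the abstract does not name the metric, and the pose names and 8-bit fingerprints below are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two equal-length bit sequences."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have equal length")
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def rank_poses(reference, poses):
    """Sort (pose_id, fingerprint) pairs by similarity to the reference, best first."""
    scored = [(pose_id, tanimoto(reference, fp)) for pose_id, fp in poses]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical data: each bit flags one protein-ligand interaction.
reference = [1, 1, 0, 1, 0, 0, 1, 0]
poses = [
    ("pose_1", [1, 0, 0, 1, 0, 0, 0, 0]),
    ("pose_2", [1, 1, 0, 1, 0, 0, 1, 0]),
    ("pose_3", [0, 0, 1, 0, 1, 1, 0, 0]),
]
ranking = rank_poses(reference, poses)
```

Here the pose that reproduces every reference interaction ranks first, regardless of what a conventional scoring function would assign to it.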

Journal of Computer-aided Molecular Design | 2011

Online chemical modeling environment (OCHEM): web platform for data storage, model development and publishing of chemical information.

Iurii Sushko; Sergii Novotarskyi; Robert Körner; Anil Kumar Pandey; Matthias Rupp; Wolfram Teetz; Stefan Brandmaier; Ahmed Abdelaziz; Volodymyr V. Prokopenko; Vsevolod Yu. Tanchuk; Roberto Todeschini; Alexandre Varnek; Gilles Marcou; Peter Ertl; Vladimir Potemkin; Maria A. Grishina; Johann Gasteiger; Christof H. Schwab; I. I. Baskin; V. A. Palyulin; E. V. Radchenko; William J. Welsh; Vladyslav Kholodovych; Dmitriy Chekmarev; Artem Cherkasov; João Aires-de-Sousa; Qingyou Zhang; Andreas Bender; Florian Nigsch; Luc Patiny

The Online Chemical Modeling Environment (OCHEM) is a web-based platform that aims to automate and simplify the typical steps required for QSAR modeling. The platform consists of two major subsystems: the database of experimental measurements and the modeling framework. The user-contributed database contains a set of tools for easy input, search and modification of thousands of records. The OCHEM database is based on the wiki principle and focuses primarily on the quality and verifiability of the data. The database is tightly integrated with the modeling framework, which supports all the steps required to create a predictive model: data search, calculation and selection of a vast variety of molecular descriptors, application of machine learning methods, validation, analysis of the model and assessment of the applicability domain. Compared with similar systems, OCHEM is not intended to re-implement existing tools or models but rather to invite the original authors to contribute their results, make them publicly available, share them with other users and become members of the growing research community. Our intention is to make OCHEM a widely used platform for performing QSPR/QSAR studies online and sharing them with other users on the Web. The ultimate goal of OCHEM is to collect all possible chemoinformatics tools within one simple, reliable and user-friendly resource. OCHEM is free for web users and is available online at http://www.ochem.eu.

Current Computer-Aided Drug Design | 2008

ISIDA - Platform for Virtual Screening Based on Fragment and Pharmacophoric Descriptors

Alexandre Varnek; Denis Fourches; Dragos Horvath; Olga Klimchuk; Cédric Gaudin; Philippe Vayer; Vitaly P. Solov'ev; Frank Hoonakker; Igor V. Tetko; Gilles Marcou

In this paper we illustrate the application of the ISIDA (In SIlico design and Data Analysis) software to perform virtual screening of large databases of compounds and reactions and to assess some ADME/Tox properties. ISIDA is an ensemble of tools that allows users to store, search and analyze data, to perform similarity searches in large databases of molecules and reactions, to build and validate QSAR models, and to generate and screen virtual combinatorial libraries. It uses its own descriptors (substructural molecular fragments and fuzzy pharmacophore triplets). Workflows can be easily organized by combining different ISIDA modules. Several examples of ISIDA applications are discussed: similarity search of potent benzodiazepine ligands with FPT; QSAR modeling of aqueous solubility, aquatic toxicity, tissue-air partition coefficients and anti-HIV activity; and screening of the “Chimiotheque Nationale” database. Particular attention is paid to mining reaction databases using the Condensed Reaction Graphs approach.

Molecular Informatics | 2010

ISIDA Property-Labelled Fragment Descriptors.

Fiorella Ruggiu; Gilles Marcou; Alexandre Varnek; Dragos Horvath

ISIDA Property-Labelled Fragment Descriptors (IPLF) were introduced as a general framework to numerically encode molecular structures in chemoinformatics, as counts of specific subgraphs in which atom vertices are coloured with respect to some local property/feature. By combining various colouring strategies of the molecular graph (notably pH-dependent pharmacophore and electrostatic potential-based flagging) with several fragmentation schemes, the different subtypes of IPLFs may range from classical atom pair and sequence counts to monitoring population levels of branched fragments or feature multiplets. The pH-dependent feature flagging, pursued at the level of each significantly populated microspecies involved in the protolytic equilibrium, may furthermore add some competitive advantage over classical descriptors, even when the chosen fragmentation scheme is one of the state-of-the-art pattern extraction procedures (feature sequence or pair counts, etc.) in chemoinformatics. The implemented fragmentation schemes support counting (1) linear feature sequences, (2) feature pairs, (3) circular feature fragments, a.k.a. “augmented atoms”, or (4) feature trees. Fuzzy rendering, which optionally allows nonterminal fragment atoms to be counted as wildcards, ignoring their specific colours/features, ensures a seamless transition between the “strict” counts (sequences or circular fragments) and the “fuzzy” multiplet counts (pairs or trees). Also, bond information may be represented or ignored, leaving the user a vast choice in terms of the level of resolution at which chemical information should be extracted into the descriptors. Selected IPLF subsets, tree descriptors in particular, were successfully tested in both neighbourhood behaviour and QSAR modelling challenges, with very promising results. They showed excellent results in similarity-based virtual screening for analogue protease inhibitors, and generated highly predictive octanol-water partition coefficient and hERG channel inhibition models.
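As a rough illustration of the simplest scheme listed above, feature pair counts, the following sketch counts pairs of coloured atoms separated by a given topological distance. It is not the ISIDA implementation: the adjacency-dictionary representation, the colour labels and the `max_distance` parameter are all assumptions made for the example.

```python
from collections import Counter, deque


def topological_distances(adjacency, start):
    """Breadth-first search: number of bonds from `start` to every reachable atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def feature_pair_counts(adjacency, colours, max_distance=3):
    """Count (colour, distance, colour) pairs; colours are sorted so order is canonical."""
    counts = Counter()
    for i in adjacency:
        for j, d in topological_distances(adjacency, i).items():
            if i < j and 1 <= d <= max_distance:
                a, b = sorted((colours[i], colours[j]))
                counts[(a, d, b)] += 1
    return counts


# Hypothetical 4-atom chain 0-1-2-3 with made-up pharmacophore colours.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
colours = {0: "donor", 1: "hydrophobe", 2: "hydrophobe", 3: "acceptor"}
pair_counts = feature_pair_counts(adjacency, colours)
```

Swapping the `colours` dictionary for another labelling (for example, one per protonation microspecies) changes the descriptor without touching the fragmentation code, which is the separation of concerns the abstract describes.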

Journal of Chemical Information and Modeling | 2008

Hot-Spots-Guided Receptor-Based Pharmacophores (HS-Pharm): A Knowledge-Based Approach to Identify Ligand-Anchoring Atoms in Protein Cavities and Prioritize Structure-Based Pharmacophores

Caterina Barillari; Gilles Marcou; Didier Rognan

The design of biologically active compounds from ligand-free protein structures using a structure-based approach is still a major challenge. In this paper, we present a fast knowledge-based approach (HS-Pharm) that allows the prioritization of cavity atoms that should be targeted for ligand binding, by training machine learning algorithms with atom-based fingerprints of known ligand-binding pockets. The knowledge of hot spots for ligand binding is here used for focusing structure-based pharmacophore models. Three targets of pharmacological interest (neuraminidase, beta2 adrenergic receptor, and cyclooxygenase-2) were used to test the evaluated methodology, and the derived structure-based pharmacophores were used in retrospective virtual screening studies. The current study shows that structure-based pharmacophore screening is a powerful technique for the fast identification of potential hits in a chemical library, and that it is a valid alternative to virtual screening by molecular docking.

Molecular Informatics | 2012

Generative Topographic Mapping (GTM): Universal Tool for Data Visualization, Structure-Activity Modeling and Dataset Comparison

Natalia Kireeva; I. I. Baskin; Héléna A. Gaspar; Dragos Horvath; Gilles Marcou; Alexander Varnek

Here, the utility of Generative Topographic Maps (GTM) for data visualization, structure-activity modeling and database comparison is evaluated using subsets of the Directory of Useful Decoys (DUD). Unlike other popular dimensionality reduction approaches such as Principal Component Analysis, Sammon Mapping or Self-Organizing Maps, the great advantage of GTMs is that they provide data probability distribution functions (PDF), both in the high-dimensional space defined by molecular descriptors and in 2D latent space. PDFs for the molecules of different activity classes were successfully used to build classification models in the framework of the Bayesian approach. Because PDFs are represented by a mixture of Gaussian functions, the Bhattacharyya kernel has been proposed as a measure of the overlap of datasets, which leads to an elegant method of global comparison of chemical libraries.
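To make the overlap measure concrete, here is a simplified sketch. Instead of the Gaussian-mixture kernel mentioned in the abstract, it assumes each library has already been summarised as a discrete probability distribution over the same map nodes, and computes the Bhattacharyya coefficient between those distributions. The node weights are hypothetical.

```python
import math


def normalise(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def bhattacharyya_coefficient(p, q):
    """Overlap of two discrete distributions: 1 for identical, 0 for disjoint."""
    if len(p) != len(q):
        raise ValueError("distributions must cover the same nodes")
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


# Hypothetical per-node weights for three compound libraries on a 4-node map.
library_a = normalise([5, 5, 0, 0])
library_b = normalise([0, 0, 3, 3])
library_c = normalise([5, 5, 0, 0])

overlap_ab = bhattacharyya_coefficient(library_a, library_b)
overlap_ac = bhattacharyya_coefficient(library_a, library_c)
```

Libraries occupying different regions of the map give a coefficient near 0, and libraries with the same distribution give 1.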

Journal of Chemical Information and Modeling | 2013

Generative topographic mapping-based classification models and their applicability domain: application to the Biopharmaceutics Drug Disposition Classification System (BDDCS).

Héléna A. Gaspar; Gilles Marcou; Dragos Horvath; Alban Arault; Sylvain Lozano; Philippe Vayer; Alexandre Varnek

Earlier (Kireeva et al. Mol. Inf. 2012, 31, 301-312), we demonstrated that generative topographic mapping (GTM) can be efficiently used both for data visualization and for building classification models in the initial D-dimensional space of molecular descriptors. Here, we describe the modeling in two-dimensional latent space for the four classes of the BioPharmaceutics Drug Disposition Classification System (BDDCS) involving VolSurf descriptors. Three new definitions of the applicability domain (AD) of models have been suggested: one class-independent AD, which considers the GTM likelihood, and two class-dependent ADs, which consider either the predominant class in a given node of the map or the informational entropy. The class entropy AD was found to be the most efficient for the BDDCS modeling. The predominant class AD can be directly visualized on GTM maps, which helps the interpretation of the model.
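A class-entropy applicability domain can be sketched as follows. The abstract does not give the exact definition, so this example assumes that a prediction is accepted when the Shannon entropy of the predicted class probabilities, normalised by its maximum, stays below a threshold. The threshold value and the probability vectors are hypothetical.

```python
import math


def class_entropy(probabilities):
    """Shannon entropy (natural log) of a class probability vector."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def in_domain(probabilities, max_normalised_entropy=0.5):
    """Accept a prediction only if the class distribution is sufficiently peaked."""
    n_classes = len(probabilities)
    if n_classes < 2:
        return True  # a single class carries no ambiguity
    normalised = class_entropy(probabilities) / math.log(n_classes)
    return normalised <= max_normalised_entropy


# Hypothetical predictions for the four BDDCS classes.
confident = [0.97, 0.01, 0.01, 0.01]
ambiguous = [0.25, 0.25, 0.25, 0.25]

confident_ok = in_domain(confident)
ambiguous_ok = in_domain(ambiguous)
```

A compound whose probability mass sits on one class is kept, while one spread evenly over all four classes is flagged as outside the domain.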

Journal of Chemical Information and Modeling | 2009

Inductive Transfer of Knowledge: Application of Multi-Task Learning and Feature Net Approaches to Model Tissue-Air Partition Coefficients

Alexandre Varnek; Cédric Gaudin; Gilles Marcou; I. I. Baskin; Anil Kumar Pandey; Igor V. Tetko

Two inductive knowledge transfer approaches, multitask learning (MTL) and Feature Net (FN), have been used to build predictive neural network (ASNN) and PLS models for 11 types of tissue-air partition coefficients (TAPC). Unlike conventional single-task learning (STL) modeling, which focuses on a single target property without any relation to other properties, the inductive transfer approach views the individual models as nodes in a network of interrelated models built in parallel (MTL) or sequentially (FN). It has been demonstrated that MTL and FN techniques are extremely useful in structure-property modeling on small and structurally diverse data sets, when conventional STL modeling is unable to produce any predictive model. Predictive individual STL models were obtained for 4 out of 11 TAPC, whereas application of inductive knowledge transfer techniques resulted in models for 9 TAPC. Differences in the prediction performance of the models as a function of the machine-learning method and of the number of properties simultaneously involved in the learning are discussed.

Journal of Chemical Information and Modeling | 2015

Chemical data visualization and analysis with incremental generative topographic mapping: big data challenge.

Héléna A. Gaspar; I. I. Baskin; Gilles Marcou; Dragos Horvath; Alexandre Varnek

This paper is devoted to the analysis and visualization in 2-dimensional space of large data sets of millions of compounds using the incremental version of generative topographic mapping (iGTM). The iGTM algorithm implemented in the in-house ISIDA-GTM program was applied to a database of more than 2 million compounds combining data sets of 36 chemical suppliers and the NCI collection, encoded either by MOE descriptors or by MACCS keys. Taking advantage of the probabilistic nature of GTM, several approaches to data analysis were proposed. The chemical space coverage was evaluated using the normalized Shannon entropy. Different views of the data (property landscapes) were obtained by mapping various physical and chemical properties (molecular weight, aqueous solubility, LogP, etc.) onto the iGTM map. The superposition of these views helped to identify the regions in the chemical space populated by compounds with desirable physicochemical profiles and the suppliers providing them. The similarity of data sets in the latent space was assessed by applying several metrics (Euclidean distance, Tanimoto and Bhattacharyya coefficients) to data probability distributions based on cumulated responsibility vectors. As a complementary approach, data sets were compared by considering them as individual objects on a meta-GTM map, built on cumulated responsibility vectors or property landscapes produced with iGTM. We believe that the iGTM methodology described in this article represents a fast and reliable way to analyze and visualize large chemical databases.
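The coverage measure mentioned above can be sketched under simple assumptions: each compound contributes a responsibility vector over the map nodes, the vectors are summed and normalised into a cumulated responsibility distribution, and coverage is the Shannon entropy of that distribution divided by its maximum. The function names and the toy 4-node data are hypothetical and do not come from ISIDA-GTM.

```python
import math


def cumulated_responsibilities(responsibilities):
    """Sum per-compound responsibility vectors and normalise to a distribution."""
    n_nodes = len(responsibilities[0])
    totals = [0.0] * n_nodes
    for row in responsibilities:
        for k, r in enumerate(row):
            totals[k] += r
    grand_total = sum(totals)
    return [t / grand_total for t in totals]


def normalised_shannon_entropy(distribution):
    """Entropy divided by log(number of nodes): 0 = one node, 1 = uniform coverage."""
    n_nodes = len(distribution)
    if n_nodes < 2:
        return 0.0
    entropy = -sum(p * math.log(p) for p in distribution if p > 0)
    return entropy / math.log(n_nodes)


# Hypothetical libraries of four compounds each on a 4-node map.
spread = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
clustered = [[1, 0, 0, 0]] * 4

coverage_spread = normalised_shannon_entropy(cumulated_responsibilities(spread))
coverage_clustered = normalised_shannon_entropy(cumulated_responsibilities(clustered))
```

The same cumulated vectors could then be compared between libraries with a distance or overlap metric, as the abstract describes.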

Molecular Informatics | 2012

QSPR Approach to Predict Nonadditive Properties of Mixtures. Application to Bubble Point Temperatures of Binary Mixtures of Liquids

I. Oprisiu; Ekaterina V. Varlamova; Eugene N. Muratov; Anatoly G. Artemenko; Gilles Marcou; Pavel G. Polishchuk; Victor E. Kuz'min; Alexander Varnek

This paper is devoted to the development of a methodology for QSPR modeling of mixtures and its application to vapor/liquid equilibrium diagrams for bubble point temperatures of binary liquid mixtures. Two types of special mixture descriptors based on the SiRMS and ISIDA approaches were developed. SiRMS-based fragment descriptors involve atoms belonging to both components of the mixture, whereas the ISIDA fragments belong to only one of these components. The models were built on a data set containing the phase diagrams for 167 mixtures represented by different combinations of 67 pure liquids. Consensus models were developed using nonlinear Support Vector Machine (SVM), Associative Neural Network (ASNN), and Random Forest (RF) approaches. For SVM and ASNN calculations, the ISIDA fragment descriptors were used, whereas Simplex descriptors were employed in RF models. The models have been validated using three different protocols, “Points out”, “Mixtures out” and “Compounds out”, based on specific rules for forming the training/test sets in each fold of cross-validation. A final validation of the models has been performed on an additional set of 94 mixtures represented by combinations of 34 novel compounds and modeling set chemicals with each other. The root mean squared error of predictions for new mixtures of already known liquids does not exceed 5.7 K, which outperforms COSMO-RS models. The developed QSAR methodology can be applied to the modeling of any nonadditive property of binary mixtures (antiviral activities, drug formulation, etc.).
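The three validation protocols differ in what is held out of training. The sketch below is one plausible reading of their names, not the authors' exact rules: "Points out" holds out individual data points, "Mixtures out" holds out every point of a given mixture, and "Compounds out" holds out every mixture that contains a given compound. The record layout (compound A, compound B, mole fraction) and the sample data are hypothetical.

```python
def split_points_out(points, test_indices):
    """Hold out individual data points; the same mixture may appear on both sides."""
    test = [p for i, p in enumerate(points) if i in test_indices]
    train = [p for i, p in enumerate(points) if i not in test_indices]
    return train, test


def split_mixtures_out(points, test_mixtures):
    """Hold out all points of the listed mixtures (unordered compound pairs)."""
    def is_test(p):
        return frozenset(p[:2]) in test_mixtures
    return [p for p in points if not is_test(p)], [p for p in points if is_test(p)]


def split_compounds_out(points, test_compounds):
    """Hold out every mixture that contains at least one listed compound."""
    def is_test(p):
        return p[0] in test_compounds or p[1] in test_compounds
    return [p for p in points if not is_test(p)], [p for p in points if is_test(p)]


# Hypothetical records: (compound A, compound B, mole fraction of A).
points = [
    ("water", "ethanol", 0.2),
    ("water", "ethanol", 0.8),
    ("water", "acetone", 0.5),
    ("ethanol", "acetone", 0.5),
]

train_p, test_p = split_points_out(points, {0})
train_m, test_m = split_mixtures_out(points, {frozenset({"water", "ethanol"})})
train_c, test_c = split_compounds_out(points, {"acetone"})
```

Under this reading the protocols are increasingly strict: in the last one, the training set never sees the held-out compound in any mixture.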
