Publication


Featured research published by Qingyou Zhang.


Journal of Computer-aided Molecular Design | 2011

Online chemical modeling environment (OCHEM): web platform for data storage, model development and publishing of chemical information.

Iurii Sushko; Sergii Novotarskyi; Robert Körner; Anil Kumar Pandey; Matthias Rupp; Wolfram Teetz; Stefan Brandmaier; Ahmed Abdelaziz; Volodymyr V. Prokopenko; Vsevolod Yu. Tanchuk; Roberto Todeschini; Alexandre Varnek; Gilles Marcou; Peter Ertl; Vladimir Potemkin; Maria A. Grishina; Johann Gasteiger; Christof H. Schwab; I. I. Baskin; V. A. Palyulin; E. V. Radchenko; William J. Welsh; Vladyslav Kholodovych; Dmitriy Chekmarev; Artem Cherkasov; João Aires-de-Sousa; Qingyou Zhang; Andreas Bender; Florian Nigsch; Luc Patiny

The Online Chemical Modeling Environment (OCHEM) is a web-based platform that aims to automate and simplify the typical steps required for QSAR modeling. The platform consists of two major subsystems: the database of experimental measurements and the modeling framework. The user-contributed database contains a set of tools for easy input, search and modification of thousands of records. The OCHEM database is based on the wiki principle and focuses primarily on the quality and verifiability of the data. The database is tightly integrated with the modeling framework, which supports all the steps required to create a predictive model: data search, calculation and selection of a vast variety of molecular descriptors, application of machine learning methods, validation, analysis of the model and assessment of the applicability domain. Compared with similar systems, OCHEM is not intended to re-implement the existing tools or models but rather to invite the original authors to contribute their results, make them publicly available, share them with other users and become members of the growing research community. Our intention is to make OCHEM a widely used platform for performing QSPR/QSAR studies online and sharing them with other users on the Web. The ultimate goal of OCHEM is to collect all possible chemoinformatics tools within one simple, reliable and user-friendly resource. OCHEM is free for web users and is available online at http://www.ochem.eu.


Journal of Chemical Information and Modeling | 2007

Random forest prediction of mutagenicity from empirical physicochemical descriptors.

Qingyou Zhang; João Aires-de-Sousa

Fast-to-calculate empirical physicochemical descriptors were investigated for their ability to predict mutagenicity (positive or negative Ames test) from the molecular structure. Fast methods are highly desired for the screening of large libraries of compounds. Global molecular descriptors and MOLMAP descriptors of bond properties were used to train random forests. Error percentages as low as 15% and 16% were achieved for an external test set with 472 compounds and for the training set with 4083 structures, respectively. High sensitivity and specificity were observed. Random forests were able to associate meaningful probabilities to the predictions and to explain the predictions in terms of similarities between query structures and compounds in the training set.
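The error percentage, sensitivity and specificity quoted in this abstract are standard confusion-matrix quantities. As a minimal sketch (the function name and the toy labels below are illustrative assumptions, not data from the paper), they could be computed as follows, with 1 standing for a positive Ames test:

```python
def classification_metrics(y_true, y_pred):
    """Return error rate, sensitivity and specificity for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "error": (fp + fn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # fraction of positives recovered
        "specificity": tn / (tn + fp),  # fraction of negatives recovered
    }


# Hypothetical labels, for illustration only.
toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
toy_pred = [1, 1, 0, 0, 0, 1, 0, 1]
toy_metrics = classification_metrics(toy_true, toy_pred)
```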


Journal of Chemical Information and Modeling | 2005

Structure-Based Classification of Chemical Reactions without Assignment of Reaction Centers

Qingyou Zhang; João Aires-de-Sousa

The automatic classification of chemical reactions is of high importance for the analysis of reaction databases, reaction retrieval, reaction prediction, or synthesis planning. In this work, the classification of photochemical reactions was investigated with no explicit assignment of the reacting centers. Classifications were explored with Random Forests or Kohonen neural networks in three different situations, using different levels of information: (a) pairs of reactants were classified according to the type of reaction they produce, (b) products were classified according to the type of reaction from which they can be synthesized, and (c) reactions were classified from the difference between the descriptors of the product and the descriptors of the reactants. In all cases molecular maps of atom-level properties (MOLMAPs) were used as descriptors. They are generated by a self-organizing map and encode physicochemical properties of the bonds available in a molecule. Correct classification could be achieved for approximately 90% of the 78 reactions in an independent test set.
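Situation (c) in this abstract, where a reaction is described by the difference between product and reactant descriptors, can be sketched as below. The sketch assumes each molecule is already encoded as a fixed-length numeric vector and that the vectors of several reactants (or products) are simply summed; both the summation and the toy vectors are assumptions, not details given in the abstract.

```python
def reaction_descriptor(reactant_maps, product_maps):
    """Element-wise (sum of product descriptors) - (sum of reactant descriptors)."""
    length = len(product_maps[0])
    total_products = [sum(m[i] for m in product_maps) for i in range(length)]
    total_reactants = [sum(m[i] for m in reactant_maps) for i in range(length)]
    return [p - r for p, r in zip(total_products, total_reactants)]


# Hypothetical 4-element descriptors for two reactants and one product.
toy_reactants = [[1, 0, 2, 0], [0, 1, 0, 0]]
toy_products = [[1, 0, 1, 1]]
toy_difference = reaction_descriptor(toy_reactants, toy_products)
```

Positive entries mark features gained in the reaction and negative entries mark features lost, so no reaction center has to be assigned explicitly.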


Bioinformatics | 2008

Genome-scale classification of metabolic reactions and assignment of EC numbers with self-organizing maps

Diogo A. R. S. Latino; Qingyou Zhang; João Aires-de-Sousa

Motivation: The automatic perception of chemical similarities between metabolic reactions is required for a variety of applications ranging from the computer-aided validation of classification systems, to genome-scale reconstruction (or comparison) of metabolic pathways, to the classification of enzymatic mechanisms. Comparison of metabolic reactions has been mostly based on Enzyme Commission (EC) numbers, which are extremely useful and widespread, but not always straightforward to apply, and often problematic when an enzyme catalyzes several reactions, when the same reaction is catalyzed by different enzymes, when official full EC numbers are unavailable or when reactions are not catalyzed by enzymes. Different methods should be available to compare metabolic reactions. Simultaneously, methods are required for the automatic assignment of EC numbers to reactions still not officially classified. Results: We have proposed the MOLMAP reaction descriptors to numerically encode the structural transformations resulting from a chemical reaction. Here, such descriptors are applied to the mapping of a genome-scale database of almost 4000 metabolic reactions by Kohonen self-organizing maps (SOMs), and its screening for inconsistencies in EC numbers. This approach allowed for the SOMs to assign EC numbers at the class, subclass and sub-subclass levels for reactions of independent test sets with accuracies up to 92, 80 and 70%, respectively. Different levels of similarity between training and test sets were explored. The approach also led to the identification of a number of similar reactions bearing differences at the EC class level. Availability: The programs to generate MOLMAP descriptors from atomic properties included in SDF files are available upon request for evaluation.


Journal of Chemical Information and Modeling | 2006

Physicochemical Stereodescriptors of Atomic Chiral Centers

Qingyou Zhang; João Aires-de-Sousa

Physicochemical atomic stereodescriptors (PAS) were implemented that represent the chirality of an atomic chiral center on the basis of empirical physicochemical properties of the ligands. The ligands are ranked according to a specific property, and the chiral center takes an S/R-like descriptor relative to that property. The procedure is performed for a series of properties, yielding a chirality profile. Application of the PAS descriptors to the prediction of enantioselectivity in chemical reactions, from the molecular structures, is illustrated here. The relationship between the molecular structures, represented by the PAS descriptors, and the enantioselectivity was learned by neural networks, decision trees, or random forests. In a first application, a data set was employed with chiral amino alcohols that enantioselectively catalyze the addition of diethylzinc to benzaldehyde. Prediction of the major enantiomer obtained in the reaction, from the molecular structure of the catalyst, was achieved with accuracy up to 90%. The second application investigated the enantiopreference of Pseudomonas cepacia lipase (PCL) toward primary alcohols. The learned models could make correct predictions about the preferred enantiomer, from the molecular structure of the substrate, in up to 93% of the cases. These included substrates with and without O-atoms bonded to the chiral center. The properties automatically selected to build the models can give indications on the relevant factors guiding the observed chemical behavior.
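One way to picture the "S/R-like descriptor relative to a property" is through permutation parity: re-ranking the four ligands by a different property either preserves the chirality sense (even permutation) or inverts it (odd permutation). The sketch below rests on that idea only; it is not the authors' implementation, and the property names, values and the +1/-1 encoding are hypothetical.

```python
def permutation_parity(order):
    """Return +1 for an even permutation of 0..n-1 and -1 for an odd one."""
    inversions = sum(
        1
        for i in range(len(order))
        for j in range(i + 1, len(order))
        if order[i] > order[j]
    )
    return 1 if inversions % 2 == 0 else -1


def property_chirality(reference_sign, property_values):
    """Chirality sense (+1/-1) after re-ranking ligands by decreasing property.

    reference_sign: sense observed when ligands are listed in a reference order.
    property_values: one value per ligand, in that reference order (no ties assumed).
    """
    order = sorted(range(len(property_values)), key=lambda i: -property_values[i])
    return reference_sign * permutation_parity(order)


def chirality_profile(reference_sign, properties):
    """Apply property_chirality to every property, giving a profile of signs."""
    return {name: property_chirality(reference_sign, vals) for name, vals in properties.items()}


# Hypothetical ligand properties, listed in the reference ligand order.
toy_properties = {
    "charge": [0.4, 0.3, 0.2, 0.1],
    "polarizability": [0.3, 0.4, 0.2, 0.1],
}
toy_profile = chirality_profile(+1, toy_properties)
```

In this toy case the two properties rank the first two ligands differently, so the center receives opposite signs for them.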


Journal of Chemical Information and Modeling | 2017

Machine Learning Methods to Predict Density Functional Theory B3LYP Energies of HOMO and LUMO Orbitals

Florbela Pereira; Kaixia Xiao; Diogo A. R. S. Latino; Chengcheng Wu; Qingyou Zhang; João Aires-de-Sousa

Machine learning algorithms were explored for the fast estimation of HOMO and LUMO orbital energies calculated by DFT B3LYP, on the basis of molecular descriptors exclusively based on connectivity. The whole project involved the retrieval and generation of molecular structures, quantum chemical calculations for a database with >111 000 structures, development of new molecular descriptors, and training/validation of machine learning models. Several machine learning algorithms were screened, and an applicability domain was defined based on Euclidean distances to the training set. Random forest models predicted an external test set of 9989 compounds achieving mean absolute error (MAE) up to 0.15 and 0.16 eV for the HOMO and LUMO orbitals, respectively. The impact of the quantum chemical calculation protocol was assessed with a subset of compounds. Inclusion of the orbital energy calculated by PM7 as an additional descriptor significantly improved the quality of estimations (reducing the MAE by >30%).
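The abstract does not say exactly how the Euclidean distances define the applicability domain. A minimal sketch, assuming the simplest variant (a query is inside the domain when its nearest training descriptor lies within a chosen threshold), together with the MAE used to report accuracy, might look like this; the threshold and all numbers are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def in_domain(query, training_set, threshold):
    """True if the nearest training descriptor is within `threshold` (assumed rule)."""
    return min(euclidean(query, t) for t in training_set) <= threshold


def mean_absolute_error(y_true, y_pred):
    """Mean of absolute differences between reference and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical 2-D descriptors and orbital energies in eV.
toy_training = [[0.0, 0.0], [1.0, 1.0]]
toy_mae = mean_absolute_error([-6.0, -5.5], [-6.1, -5.2])
```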


Journal of Hazardous Materials | 2017

Facile preparation of 3D GO/CNCs composite with adsorption performance towards [BMIM][Cl] from aqueous solution

Hua Zhou; Bin Gao; Yanmei Zhou; Han Qiao; Wenli Gao; Haonan Qu; Shanhu Liu; Qingyou Zhang; Xiaoqiang Liu

A novel three-dimensional crumpled graphene oxide/cellulose nanocrystals (GO/CNCs) composite was successfully synthesized and used for the first time as an adsorbent for the removal of the ionic liquid [BMIM][Cl] from aqueous solution. The 3D crumpled structure and the abundant oxygen-containing functional groups on the GO/CNCs composite provide more opportunity for the sorption of [BMIM][Cl] than CNCs or GO alone. A series of batch experiments was therefore carried out to evaluate the adsorptive properties of the 3D GO/CNCs composite towards [BMIM][Cl] as a function of GO mass content, pH and contact time. The sorption kinetics were well fitted by the pseudo-second-order kinetic model and the Elovich model. The isotherm data were better described by the Langmuir model, with a maximum sorption capacity of 0.455 mmol/g. This work provides a facile method for preparing a 3D adsorbent from graphene oxide and cellulose nanocrystals that has a high adsorption capacity for [BMIM][Cl] in aqueous solution.
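For readers unfamiliar with the models named in this abstract, the Langmuir isotherm and the pseudo-second-order rate law have the standard closed forms sketched below. Only the maximum capacity of 0.455 mmol/g comes from the abstract; the Langmuir constant, rate constant and concentrations are hypothetical placeholders.

```python
Q_MAX = 0.455  # mmol/g, maximum sorption capacity reported in the abstract


def langmuir(c_e, q_max, k_l):
    """Equilibrium uptake q_e = q_max * K_L * C_e / (1 + K_L * C_e)."""
    return q_max * k_l * c_e / (1.0 + k_l * c_e)


def pseudo_second_order(t, q_e, k2):
    """Uptake at time t: q_t = k2 * q_e**2 * t / (1 + k2 * q_e * t)."""
    return k2 * q_e ** 2 * t / (1.0 + k2 * q_e * t)


# Hypothetical Langmuir constant; at C_e = 1 / K_L the uptake is half of q_max.
toy_k_l = 2.0
half_capacity = langmuir(1.0 / toy_k_l, Q_MAX, toy_k_l)
```

Both curves saturate: the isotherm approaches q_max at high concentration, and the kinetic curve approaches q_e at long contact time.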


Journal of Chemical Information and Modeling | 2015

Extension of a Highly Discriminating Topological Index.

Qingyou Zhang; Chengcheng Wu; Fangfang Zheng; Tanfeng Zhao; Yanmei Zhou; Lu Xu

A highly discriminating topological index, EAID, was developed in our laboratory. A systematic search for degeneracy was performed on a total of over 14 million structures, and no duplicate occurred. These structures are as follows: over 3.8 million alkane trees with 1-22 carbon atoms; over 0.38 million structures containing heteroatoms; over 4 million benzenoids with 1-13 benzene rings; and over 5.9 million compounds from three real-world databases. However, in a search of over 20 million alkane trees with 23 and 24 carbon atoms, five and 13 duplicates occurred, respectively, and for over 20 million compounds from the ZINC database, 10 duplicates occurred. To increase the discriminating power of the index, EAID has been extended, and the resulting index is termed 2-EAID. All of the over 55 million structures mentioned above were uniquely identified by 2-EAID except for two duplicates that occurred for the ZINC database. EAID and 2-EAID are the most highly discriminating indices examined to date. Thus, the two indices possess not only theoretical significance but also potential applications. For example, they could possibly be used as a supplementary reference for CAS Registry Numbers for structure documentation.
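A degeneracy search of the kind described here amounts to grouping structures by index value and reporting any value shared by more than one structure. The sketch below shows that bookkeeping only. EAID itself is not defined in this abstract, so a deliberately weak stand-in index (the sorted vertex-degree sequence of the hydrogen-suppressed graph) is used to make a duplicate appear; the function names are illustrative.

```python
from collections import defaultdict


def degree_sequence(n_atoms, bonds):
    """Stand-in index (NOT EAID): sorted tuple of vertex degrees."""
    degrees = [0] * n_atoms
    for a, b in bonds:
        degrees[a] += 1
        degrees[b] += 1
    return tuple(sorted(degrees))


def find_degeneracies(structures, index_fn):
    """Group structure names by index value; return groups sharing a value."""
    groups = defaultdict(list)
    for name, (n_atoms, bonds) in structures.items():
        groups[index_fn(n_atoms, bonds)].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Carbon skeletons of three hexane isomers as (atom count, bond list).
HEXANES = {
    "n-hexane": (6, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]),
    "2-methylpentane": (6, [(0, 1), (1, 2), (2, 3), (3, 4), (1, 5)]),
    "3-methylpentane": (6, [(0, 1), (1, 2), (2, 3), (3, 4), (2, 5)]),
}
degenerate_groups = find_degeneracies(HEXANES, degree_sequence)
```

The two methylpentanes share a degree sequence, so the stand-in index reports them as one degenerate pair. A highly discriminating index is one for which this list stays empty over millions of structures.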


Journal of Chemometrics | 2016

Development of a highly selective molecular topological index

Qingyou Zhang; Chengcheng Wu; Jingjie Suo; Yanmei Zhou; Lu Xu

The highly selective molecular topological indices EAID and 2‐EAID were extended in order to further improve their discrimination capability. The new 3‐EAID index is obtained as a combination of the extended EAID index and the Wiener index. The indices were tested by screening three data sets of structures comprising over 36 million alkane trees with 25 vertices, 15 million benzenoids with 14 benzene rings, and 20 million compounds taken from real data sets. While the EAID index exhibited 75, 29, and 10 pair degeneracies in the three data sets, respectively, and the 2‐EAID index exhibited 15, 1, and 2 pair degeneracies, the 3‐EAID index could discriminate all unique molecules in virtual and real data sets with >107 million compounds, including the molecules stated earlier. Therefore, the new index has not only theoretical significance but also practical value for confirming new compounds (the number of registered substances in Chemical Abstracts Service in June 2015 was over 99 million). Also, 3‐EAID and 2‐EAID, as well as EAID, could be used for the administration of chemical information systems such as large structural data sets, the evaluation of organic structures, and computer‐aided synthesis.
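The Wiener index mentioned in this abstract is the sum of shortest-path distances over all pairs of vertices in the hydrogen-suppressed molecular graph. A minimal sketch using breadth-first search is given below; it assumes a connected graph, and it covers only the Wiener index, since the abstract does not say how that index is combined with the extended EAID.

```python
from collections import deque


def wiener_index(n_atoms, bonds):
    """Sum of shortest-path distances over all unordered vertex pairs."""
    adjacency = [[] for _ in range(n_atoms)]
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)
    total = 0
    for start in range(n_atoms):
        # Breadth-first search gives distances from `start` in an unweighted graph.
        distance = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
        total += sum(distance.values())
    return total // 2  # every pair was counted from both ends


# Carbon skeletons of three hexane isomers.
w_hexane = wiener_index(6, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)])
w_2_methylpentane = wiener_index(6, [(0, 1), (1, 2), (2, 3), (3, 4), (1, 5)])
w_3_methylpentane = wiener_index(6, [(0, 1), (1, 2), (2, 3), (3, 4), (2, 5)])
```

For these skeletons the values are 35, 32 and 31, so the Wiener index separates the two methylpentanes even though their vertex-degree sequences coincide.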


Molecular Informatics | 2016

Machine Learning Estimation of Atom Condensed Fukui Functions

Qingyou Zhang; Fangfang Zheng; Tanfeng Zhao; Xiaohui Qu; João Aires-de-Sousa

To enable the fast estimation of atom condensed Fukui functions, machine learning algorithms were trained with databases of DFT pre‐calculated values for ca. 23,000 atoms in organic molecules. The problem was approached as the ranking of atom types with the Bradley‐Terry (BT) model, and as the regression of the Fukui function. Random Forests (RF) were trained to predict the condensed Fukui function, to rank atoms in a molecule, and to classify atoms as high/low Fukui function. Atomic descriptors were based on counts of atom types in spheres around the kernel atom. The BT coefficients assigned to atom types enabled the identification (93–94 % accuracy) of the atom with the highest Fukui function in pairs of atoms in the same molecule with differences ≥0.1. In whole molecules, the atom with the top Fukui function could be recognized in ca. 50 % of the cases and, on average, about 3 of the top 4 atoms could be recognized in a shortlist of 4. Regression RF yielded predictions for test sets with R² = 0.68–0.69, improving the ability of BT coefficients to rank atoms in a molecule. Atom classification (as high/low Fukui function) was obtained with RF with sensitivity of 55–61 % and specificity of 94–95 %.
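In the Bradley-Terry model each item carries a coefficient, and the probability that item i outranks item j depends only on the difference between their coefficients. The sketch below assumes the common log-strength parametrization and uses made-up atom-type names and coefficients; the paper's actual atom typing and fitted values are not given in the abstract.

```python
import math


def bt_probability(beta_i, beta_j):
    """P(i outranks j) = exp(beta_i) / (exp(beta_i) + exp(beta_j))."""
    return 1.0 / (1.0 + math.exp(beta_j - beta_i))


def rank_atoms(atom_types, coefficients):
    """Atom indices sorted from highest to lowest coefficient of their type."""
    return sorted(
        range(len(atom_types)),
        key=lambda i: coefficients[atom_types[i]],
        reverse=True,
    )


# Hypothetical atom types and coefficients, for illustration only.
toy_coefficients = {"C.aromatic": 0.2, "O.carbonyl": 1.5, "N.amine": 0.9}
toy_molecule = ["C.aromatic", "O.carbonyl", "N.amine"]
toy_ranking = rank_atoms(toy_molecule, toy_coefficients)
```

Because the coefficients belong to atom types rather than individual atoms, ranking the atoms of a new molecule needs only a table lookup and a sort.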

Collaboration


Dive into Qingyou Zhang's collaboration.

Top Co-Authors


Lu Xu

Chinese Academy of Sciences
