Bo-Han Su
National Taiwan University
Publication
Featured research published by Bo-Han Su.
Journal of Chemical Information and Modeling | 2010
Bo-Han Su; Meng-yu Shen; Emilio Xavier Esposito; Anton J. Hopfinger; Yufeng J. Tseng
Blockage of the human ether-a-go-go related gene (hERG) potassium ion channel is a major factor related to cardiotoxicity. Hence, drugs binding to this channel have become an important biological end point in side effects screening. A set of 250 structurally diverse compounds screened for hERG activity from the literature was assembled using a set of reliability filters. This data set was used to construct a set of two-state hERG QSAR models. The descriptor pool used to construct the models consisted of 4D-fingerprints generated from the thermodynamic distribution of conformer states available to a molecule, 204 traditional 2D descriptors and 76 3D VolSurf-like descriptors computed using the Molecular Operating Environment (MOE) software. One model is a continuous partial least-squares (PLS) QSAR hERG binding model. Another related model is an optimized binary classification QSAR model that classifies compounds as active or inactive. This binary model achieves 91% accuracy over a large range of molecular diversity spanning the training set. Two external test sets were constructed. One test set is the condensed PubChem bioassay database containing 876 compounds, and the other test set consists of 106 additional compounds found in the literature. Both of the test sets were used to validate the binary QSAR model. The binary QSAR model permits a structural interpretation of possible sources for hERG activity. In particular, the presence of a polar negative group at a distance of 6-8 Å from a hydrogen bond donor in a compound is predicted to be a quite structure-specific pharmacophore that increases hERG blockage. Since a data set of high chemical diversity was used to construct the binary model, it is applicable for performing general virtual hERG screening.
Journal of Chemical Information and Modeling | 2013
Chi-Yu Shao; Sing-Zuo Chen; Bo-Han Su; Yufeng J. Tseng; Emilio Xavier Esposito; Anton J. Hopfinger
Little attention has been given to the selection of trial descriptor sets when designing a QSAR analysis even though a great number of descriptor classes, and often a greater number of descriptors within a given class, are now available. This paper reports an effort to explore interrelationships between QSAR models and descriptor sets. Zhou and co-workers (Zhou et al., Nano Lett. 2008, 8 (3), 859-865) designed, synthesized, and tested a combinatorial library of 80 surface modified, that is decorated, multi-walled carbon nanotubes for their composite nanotoxicity using six endpoints all based on a common 0 to 100 activity scale. Each of the six endpoints for the 29 most nanotoxic decorated nanotubes were incorporated as the training set for this study. The study reported here includes trial descriptor sets for all possible combinations of MOE, VolSurf, and 4D-fingerprints (FP) descriptor classes, as well as including and excluding explicit spatial contributions from the nanotube. Optimized QSAR models were constructed from these multiple trial descriptor sets. It was found that (a) both the form and quality of the best QSAR models for each of the endpoints are distinct and (b) some endpoints are quite dependent upon 4D-FP descriptors of the entire nanotube-decorator complex. However, other endpoints yielded equally good models only using decorator descriptors with and without the decorator-only 4D-FP descriptors. Lastly, and most importantly, the quality, significance, and interpretation of a QSAR model were found to be critically dependent on the trial descriptor sets used within a given QSAR endpoint study.
Journal of Chemical Information and Modeling | 2012
Bo-Han Su; Yi-Shu Tu; Emilio Xavier Esposito; Yufeng J. Tseng
The inclusion and accessibility of different methodologies to explore chemical data sets has been beneficial to the field of predictive modeling, specifically in the chemical sciences in the field of Quantitative Structure-Activity Relationship (QSAR) modeling. This study discusses using contemporary protocols and QSAR modeling methods to properly model two biomolecular systems that have historically not performed well using traditional and three-dimensional QSAR methodologies. Herein, we explore, analyze, and discuss the creation of a classification human Ether-a-go-go Related Gene (hERG) potassium channel model and a continuous Tetrahymena pyriformis (T. pyriformis) model using Support Vector Machine (SVM) and Support Vector Regression (SVR), respectively. The models are constructed with three types of molecular descriptors that capture the gross physicochemical features of the compounds: (i) 2D, 2½D, and 3D physical features, (ii) VolSurf-like molecular interaction fields, and (iii) 4D-Fingerprints. The best hERG SVM model achieved 89% accuracy and the three best SVM models were able to screen a PubChem data set with an accuracy of 97%. The best T. pyriformis model had an R² value of 0.924 for the training set and was able to predict the continuous end points for two test sets with R² values of 0.832 and 0.620, respectively. The studies presented within demonstrate the predictive ability (classification and continuous end points) of QSAR models constructed from curated data sets, biologically relevant molecular descriptors, and Support Vector Machines and Support Vector Regression. The ability of these protocols and methodologies to accommodate large data sets (several thousand compounds) that are chemically diverse and, in the case of classification modeling, unbalanced (one experimental outcome dominates the data set) allows scientists to further explore a remarkable amount of biological and chemical information.
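The two kinds of figures quoted in this abstract, classification accuracy and R², can be made concrete with a short sketch. This is only an illustration: the abstract does not state which definition of R² the authors used, so the common coefficient-of-determination form (1 minus the ratio of residual to total sum of squares) is assumed here, and the function names `r_squared` and `accuracy` are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: squared error of the model's predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: squared deviation of observations from their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observations are identical")
    return 1.0 - ss_res / ss_tot


def accuracy(true_labels, predicted_labels):
    """Fraction of compounds whose predicted class matches the true class."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)
```

On an unbalanced data set, accuracy alone can look high even when the minority class is predicted poorly, so it is usually read alongside per-class measures.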
Bioinformatics | 2015
Chi-Yu Shao; Bo-Han Su; Yi-Shu Tu; Chieh Lin; Olivia A. Lin; Yufeng J. Tseng
Cytochrome P450 enzymes (CYPs) are the major enzymes involved in drug metabolism and bioactivation. Inhibition models were constructed for five of the most popular enzymes from the CYP superfamily in the human liver. The five enzymes chosen for this study, namely CYP1A2, CYP2D6, CYP2C19, CYP2C9 and CYP3A4, account for 90% of the xenobiotic and drug metabolism in the human body. CYP enzymes can be inhibited or induced by various drugs or chemical compounds. In this work, a rule-based CYP inhibition prediction online server, CypRules, was created based on predictive models generated by the rule-based C5.0 algorithm. CypRules can predict and provide structural rulesets for CYP inhibition for each compound uploaded to the server. Capable of fast execution, it can be used for virtual high-throughput screening (VHTS) of a large set of testing compounds. Availability and implementation: CypRules is freely accessible at http://cyprules.cmdm.tw/ and models, descriptor and program files for all compounds are publicly available at http://cyprules.cmdm.tw/sources/sources.rar.
European Journal of Medicinal Chemistry | 2015
Mengi Lin; Bo-Han Su; Chia-Hsin Lee; Suz-Ting Wang; Wen-Chun Wu; Prasad S. Dangate; Shi-Yun Wang; Wen-I Huang; Ting-Jen Cheng; Olivia A. Lin; Yih-Shyun E. Cheng; Yufeng J. Tseng; Chung-Ming Sun
The influenza nucleoprotein (NP) is a single-strand RNA-binding protein and the core of the influenza ribonucleoprotein (RNP) particle that serves many critical functions for influenza replication. NP has been considered a promising anti-influenza target. A new class of anti-influenza compounds, nucleozin and its analogues, was reported recently by several laboratories to inhibit the synthesis of influenza macromolecules and prevent the cytoplasmic trafficking of the influenza RNP. In this study, pyrimido-pyrrolo-quinoxalinedione (PPQ) analogues are reported as a new class of anti-influenza agents. Compound PPQ-581 was identified as a potential anti-influenza lead with an EC50 value of 1 μM for preventing virus-induced cytopathic effects. PPQ produces anti-influenza effects similar to those of nucleozin in influenza-infected cells. Treatment with PPQ at the beginning of H1N1 infection inhibited viral protein synthesis, while treatment at later times blocked the RNP nuclear export and the appearance of cytoplasmic RNP aggregation. PPQ-resistant H1N1 (WSN) viruses were isolated and found to have an NP S377G mutation. Recombinant WSN carrying the S377G NP is resistant to PPQ in anti-influenza and RNA polymerase assays. The WSN virus with the NP S377G mutation is also devoid of the PPQ-mediated RNP nuclear retention and cytoplasmic aggregation. The NP S377G-expressing WSN virus is not resistant to the reported NP inhibitor nucleozin. Similarly, the nucleozin-resistant WSN viruses are not resistant to PPQ, suggesting that PPQ targets a different site from the nucleozin-binding site. Our results also suggest that NP can be targeted through various binding sites to interrupt the crucial RNP trafficking, resulting in influenza replication inhibition.
Journal of Chemical Information and Modeling | 2015
Bo-Han Su; Yi-Shu Tu; Chieh Lin; Chi-Yu Shao; Olivia A. Lin; Yufeng J. Tseng
Hepatotoxicity, drug-induced liver injury, and competitive Cytochrome P-450 (CYP) isozyme binding are serious problems associated with drug use. It would be favorable to avoid or to understand potential CYP inhibition at the developmental stages. However, current in silico CYP prediction models or available public prediction servers can provide only yes/no classification results for just one or a few CYP enzymes. In this study, we utilized a rule-based C5.0 algorithm with different descriptors, including PaDEL, Mold2, and PubChem fingerprints, to construct rule-based inhibition prediction models for five major CYP enzymes (CYP1A2, CYP2C9, CYP2C19, CYP2D6 and CYP3A4) that account for 90% of drug oxidation or hydrolysis. We also developed a rational sampling algorithm for the selection of compounds in the training data set, to enhance the performance of these CYP prediction models. The optimized models include several improved features. First, the final models significantly outperformed all of the currently available models. Second, the final models can also be used for rapid virtual screening of a large set of compounds due to their ruleset-based nature. Moreover, such rule-based prediction models can provide rulesets for structural features related to the five major CYP enzymes. The five most significant rules for CYP inhibition were identified for each CYP enzyme and discussed. An example was chosen for each of the five CYP enzymes to demonstrate how rule-based models can be used to gain insights into structural features that correspond with CYP inhibition. A newer version of the freely accessible CYP prediction server, CypRules, is presented here as a result of the aforementioned improvements.
Toxicology and Applied Pharmacology | 2015
Emilio Xavier Esposito; Anton J. Hopfinger; Chi-Yu Shao; Bo-Han Su; Sing-Zuo Chen; Yufeng J. Tseng
Carbon nanotubes have become widely used in a variety of applications including biosensors and drug carriers. Therefore, the issue of carbon nanotube toxicity is increasingly an area of focus and concern. While previous studies have focused on the gross mechanisms of action relating to nanomaterials interacting with biological entities, this study proposes detailed mechanisms of action, relating to nanotoxicity, for a series of decorated (functionalized) carbon nanotube complexes based on previously reported QSAR models. Possible mechanisms of nanotoxicity for six endpoints (bovine serum albumin, carbonic anhydrase, chymotrypsin, hemoglobin along with cell viability and nitrogen oxide production) have been extracted from the corresponding optimized QSAR models. The molecular features relevant to the respective mechanism of action of each endpoint for the decorated nanotubes are also discussed. Based on the molecular information contained within the optimal QSAR models for each nanotoxicity endpoint, either the decorator attached to the nanotube is directly responsible for the expression of a particular activity, irrespective of the decorator's 3D geometry and independent of the nanotube, or those decorators having structures that place the functional groups of the decorators as far as possible from the nanotube surface most strongly influence the biological activity. These molecular descriptors are further used to hypothesize specific interactions involved in the expression of each of the six biological endpoints.
Journal of Chemical Information and Modeling | 2015
Bo-Han Su; Yi-Shu Tu; Olivia A. Lin; Yeu-Chern Harn; Meng-yu Shen; Yufeng J. Tseng
Fluorescence-based detection has been commonly used in high-throughput screening (HTS) assays. Autofluorescent compounds, which can emit light in the absence of artificial fluorescent markers, often interfere with the detection of fluorophores and result in false positive signals in these assays. This interference presents a major issue in fluorescence-based screening techniques. In an effort to reduce the time and cost that will be spent on prescreening of autofluorescent compounds, in silico autofluorescence prediction models were developed for selected fluorescence-based assays in this study. Five prediction models were developed based on the respective fluorophores used in these HTS assays, which absorb and emit light at specific wavelengths (excitation/emission): Alexa Fluor 350 (A350) (340 nm/450 nm), 7-amino-4-trifluoromethyl-coumarin (AFC) (405 nm/520 nm), Alexa Fluor 488 (A488) (480 nm/540 nm), Rhodamine (547 nm/598 nm), and Texas Red (547 nm/618 nm). The C5.0 rule-based classification algorithm and PubChem 2D chemical structure fingerprints were used to develop prediction models. To optimize the accuracies of these prediction models despite the highly imbalanced ratio of fluorescent versus nonfluorescent compounds presented in the collected data sets, oversampling and undersampling strategies were applied. The average final accuracy achieved for the training set was 97%, and that for the testing set was 92%. In addition, five external data sets were used to further validate the models. Ultimately, 14 representative structural features (or rules) were determined to efficiently predict autofluorescence in data sets containing both fluorescent and nonfluorescent compounds. Several cases were illustrated in this study to demonstrate the applicability of these rules.
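The oversampling and undersampling step mentioned in this abstract can be sketched as follows. The abstract does not say which variant the authors applied, so plain random resampling is assumed here; the function name `rebalance`, its `mode` parameter, and the fixed seed are hypothetical choices for illustration only.

```python
import random


def rebalance(samples, labels, mode="over", seed=0):
    """Return (sample, label) pairs with equal class sizes.

    mode="over":  minority classes are grown by sampling with replacement.
    mode="under": majority classes are shrunk by sampling without replacement.
    """
    if mode not in ("over", "under"):
        raise ValueError("mode must be 'over' or 'under'")
    rng = random.Random(seed)

    # Group the samples by class label.
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)

    sizes = [len(xs) for xs in by_class.values()]
    target = max(sizes) if mode == "over" else min(sizes)

    balanced = []
    for y, xs in by_class.items():
        if len(xs) >= target:
            # Keep `target` distinct samples (all of them if already at target).
            chosen = rng.sample(xs, target)
        else:
            # Keep every original sample and add random duplicates.
            chosen = xs + rng.choices(xs, k=target - len(xs))
        balanced.extend((x, y) for x in chosen)

    rng.shuffle(balanced)
    return balanced
```

Resampling of this kind is normally applied to the training split only, so that the test set keeps the original class ratio.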
PLOS ONE | 2016
Kuo-Hsiang Hsu; Bo-Han Su; Yi-Shu Tu; Olivia A. Lin; Yufeng J. Tseng
With advances in the development and application of Ames mutagenicity in silico prediction tools, the International Conference on Harmonisation (ICH) has amended its M7 guideline to reflect the use of such prediction models for the detection of mutagenic activity in early drug safety evaluation processes. Since current Ames mutagenicity prediction tools only focus on functional group alerts or side chain modifications of an analog series, these tools are unable to identify mutagenicity derived from core structures or specific scaffolds of a compound. In this study, a large collection of 6512 compounds is used to perform scaffold tree analysis. By relating different scaffolds on constructed scaffold trees with Ames mutagenicity, four major and one minor novel mutagenic scaffold groups are identified. The recognized mutagenic scaffold groups can serve as a guide for medicinal chemists to prevent the development of potentially mutagenic therapeutic agents in early drug design or development phases, by modifying the core structures of mutagenic compounds to form non-mutagenic compounds. In addition, five series of substructures are provided as recommendations for direct modification of potentially mutagenic scaffolds to decrease associated mutagenic activities.
Journal of Cheminformatics | 2017
Alioune Schurz; Bo-Han Su; Yi-Shu Tu; Tony Tsung-Yu Lu; Olivia A. Lin; Yufeng J. Tseng
GPU acceleration is useful in solving complex chemical information problems. Identifying unknown structures from the mass spectra of natural product mixtures has been a desirable yet unresolved issue in metabolomics. However, this elucidation process has been hampered by complex experimental data and the inability of instruments to completely separate different compounds. Fortunately, with current high-resolution mass spectrometry, one feasible strategy is to define this problem as extending a scaffold database with sidechains of different probabilities to match the high-resolution mass obtained from a high-resolution mass spectrum. By introducing a dynamic programming (DP) algorithm, it is possible to solve this NP-complete problem in pseudo-polynomial time. However, the running time of the DP algorithm grows by orders of magnitude as the number of mass decimal digits increases, thus limiting the boost in structural prediction capabilities. By harnessing the heavily parallel architecture of modern GPUs, we designed a “compute unified device architecture” (CUDA)-based GPU-accelerated mixture elucidator (G.A.M.E.) that considerably improves the performance of the DP, allowing up to five decimal digits for input mass data. As exemplified by four testing datasets with verified constitutions from natural products, G.A.M.E. allows for efficient and automatic structural elucidation of unknown mixtures for practical procedures.
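The pseudo-polynomial dynamic programming idea in this abstract can be illustrated with a small CPU-only sketch. It is not the G.A.M.E. implementation: the function name `count_decompositions`, the example masses, and the integer scaling scheme are assumptions, and sidechain probabilities are ignored. Masses are multiplied by 10^digits and rounded to integers, so the DP table grows tenfold with every extra decimal digit, which is the cost the abstract describes.

```python
def count_decompositions(scaffold_mass, sidechain_masses, target_mass, digits):
    """Count multisets of sidechains that, added to the scaffold, match the target mass.

    Masses are compared after rounding to `digits` decimal places.
    """
    scale = 10 ** digits
    # Remaining integer mass that the sidechains must account for.
    remaining = round(target_mass * scale) - round(scaffold_mass * scale)
    if remaining < 0:
        return 0
    weights = [round(m * scale) for m in sidechain_masses]

    # ways[t] = number of sidechain multisets whose scaled masses sum to t.
    ways = [0] * (remaining + 1)
    ways[0] = 1  # the empty multiset
    for w in weights:
        if w <= 0:
            continue  # skip masses that vanish at this precision
        # Ascending t lets each sidechain be reused any number of times.
        for t in range(w, remaining + 1):
            ways[t] += ways[t - w]
    return ways[remaining]
```

For example, `count_decompositions(100.00, [15.02, 17.00], 132.02, digits=2)` fills a table of 3203 entries, whereas the same query at five decimal digits would need roughly 3.2 million entries per sidechain pass. Work of that shape, many independent table updates, is the kind that maps well onto GPU threads.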