Yufeng J. Tseng
National Taiwan University
Publication
Featured research published by Yufeng J. Tseng.
BMC Systems Biology | 2013
Tien-Chueh Kuo; Tze-Feng Tian; Yufeng J. Tseng
Background: Integrative and comparative analyses of multiple transcriptomics, proteomics and metabolomics datasets require an intensive knowledge of tools and background concepts. Thus, it is challenging for users to perform such analyses, highlighting the need for a single tool for such purposes. The 3Omics one-click web tool was developed to visualize and rapidly integrate multiple human inter- or intra-transcriptomic, proteomic, and metabolomic data by combining five commonly used analyses: correlation networking, coexpression, phenotyping, pathway enrichment, and GO (Gene Ontology) enrichment. Results: 3Omics generates inter-omic correlation networks to visualize relationships in data with respect to time or experimental conditions for all transcripts, proteins and metabolites. If only two of three omics datasets are input, then 3Omics supplements the missing transcript, protein or metabolite information related to the input data by text-mining the PubMed database. 3Omics’ coexpression analysis assists in revealing functions shared among different omics datasets. 3Omics’ phenotype analysis integrates Online Mendelian Inheritance in Man with available transcript or protein data. Pathway enrichment analysis on metabolomics data by 3Omics reveals enriched pathways in the KEGG/HumanCyc database. 3Omics performs statistical Gene Ontology-based functional enrichment analyses to display significantly overrepresented GO terms in transcriptomic experiments. Although the principal application of 3Omics is the integration of multiple omics datasets, it is also capable of analyzing individual omics datasets. The information obtained from the analyses of 3Omics in Case Studies 1 and 2 is also in accordance with comprehensive findings in the literature. Conclusions: 3Omics incorporates the advantages and functionality of existing software into a single platform, thereby simplifying data analysis and enabling the user to perform a one-click integrated analysis. Visualization and analysis results are downloadable for further user customization and analysis. The 3Omics software can be freely accessed at http://3omics.cmdm.tw.
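The correlation-networking idea can be illustrated with a minimal sketch. It assumes each transcript, protein, or metabolite is a series of measurements over the same time points, and it links two entities when the absolute Pearson correlation reaches a threshold. The function names, the 0.9 cutoff, and the toy values are illustrative assumptions, not the 3Omics implementation.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_edges(profiles, threshold=0.9):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    edges = []
    for (a, xa), (b, xb) in combinations(sorted(profiles.items()), 2):
        r = pearson(xa, xb)
        if abs(r) >= threshold:
            edges.append((a, b, r))
    return edges


# Hypothetical toy profiles measured at four time points.
profiles = {
    "transcript_A": [1.0, 2.0, 3.0, 4.0],
    "protein_A": [2.1, 3.9, 6.2, 8.0],
    "metabolite_X": [4.0, 3.0, 2.0, 1.0],
    "metabolite_Y": [1.0, 3.0, 1.0, 3.0],
}
edges = correlation_edges(profiles)
for a, b, r in edges:
    print(f"{a} -- {b}: r = {r:.3f}")
```

Both strongly positive and strongly negative correlations become edges here; a real tool would also need to handle missing values and constant profiles, which this sketch does not.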
Analytical Chemistry | 2013
San Yuan Wang; Ching-Hua Kuo; Yufeng J. Tseng
Metabolomics is a powerful tool for understanding phenotypes and discovering biomarkers. Combinations of multiple batches or data sets in large cross-sectional epidemiology studies are frequently utilized in metabolomics, but various systematic biases can introduce both batch and injection order effects and often require proper calibration prior to chemometric analyses. We present a novel algorithm, Batch Normalizer, to calibrate large-scale metabolomic data. Batch Normalizer utilizes a regression model that considers the total abundance of each sample to improve its calibration performance, and it is able to remove both batch and injection order effects. This calibration method was tested using liquid chromatography/time-of-flight mass spectrometry (LC/TOF-MS) chromatograms of 228 plasma samples and 23 pooled quality control (QC) samples. We evaluated the performance of Batch Normalizer by examining the distribution of relative standard deviation (RSD) for all peaks detected in the pooled QC samples, the average Pearson correlation coefficients for all peaks between any two QC samples, and the distribution of QC samples in the scores plot of a principal component analysis (PCA). After calibration by Batch Normalizer, the number of peaks in QC samples with RSD less than 15% increased from 11 to 914, all of the QC samples were closely clustered in the PCA scores plot, and the average Pearson correlation coefficient for all peaks of QC samples increased from 0.938 to 0.976. This method was compared to 7 commonly used calibration methods. We found that using Batch Normalizer to calibrate LC/TOF-MS data produced the best calibration results.
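The RSD-based evaluation described in this abstract can be sketched as follows. The example computes the RSD of each peak across pooled QC injections and counts the peaks under the 15% cutoff. It illustrates the evaluation metric only, not the Batch Normalizer regression itself, and the function names and toy intensities are assumptions made for illustration.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (sample SD / mean) as a percentage."""
    return 100.0 * stdev(values) / mean(values)


def count_stable_peaks(qc_peaks, cutoff=15.0):
    """Count peaks whose RSD across QC injections is below the cutoff."""
    return sum(
        1 for intensities in qc_peaks.values()
        if rsd_percent(intensities) < cutoff
    )


# Hypothetical peak intensities across four pooled QC injections.
qc_peaks = {
    "peak_1": [100.0, 102.0, 98.0, 100.0],  # stable
    "peak_2": [100.0, 150.0, 60.0, 90.0],   # strongly drifting
    "peak_3": [50.0, 52.0, 49.0, 51.0],     # stable
}
stable = count_stable_peaks(qc_peaks)
print(f"{stable} of {len(qc_peaks)} peaks have RSD < 15%")
```

Running the same count before and after a calibration step gives a before/after comparison of the kind reported in the abstract.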
Journal of Chemical Information and Modeling | 2010
Bo-Han Su; Meng-yu Shen; Emilio Xavier Esposito; Anton J. Hopfinger; Yufeng J. Tseng
Blockage of the human ether-a-go-go related gene (hERG) potassium ion channel is a major factor related to cardiotoxicity. Hence, drugs binding to this channel have become an important biological end point in side effects screening. A set of 250 structurally diverse compounds screened for hERG activity from the literature was assembled using a set of reliability filters. This data set was used to construct a set of two-state hERG QSAR models. The descriptor pool used to construct the models consisted of 4D-fingerprints generated from the thermodynamic distribution of conformer states available to a molecule, 204 traditional 2D descriptors and 76 3D VolSurf-like descriptors computed using the Molecular Operating Environment (MOE) software. One model is a continuous partial least-squares (PLS) QSAR hERG binding model. Another related model is an optimized binary classification QSAR model that classifies compounds as active or inactive. This binary model achieves 91% accuracy over a large range of molecular diversity spanning the training set. Two external test sets were constructed. One test set is the condensed PubChem bioassay database containing 876 compounds, and the other test set consists of 106 additional compounds found in the literature. Both of the test sets were used to validate the binary QSAR model. The binary QSAR model permits a structural interpretation of possible sources for hERG activity. In particular, the presence of a polar negative group at a distance of 6-8 Å from a hydrogen bond donor in a compound is predicted to be a quite structure-specific pharmacophore that increases hERG blockage. Since a data set of high chemical diversity was used to construct the binary model, it is applicable for performing general virtual hERG screening.
Journal of Chemical Information and Computer Sciences | 2003
Dahua Pan; Yufeng J. Tseng; Anton J. Hopfinger
A method for performing quantitative structure-based design has been developed by extending the current receptor-independent RI-4D-QSAR methodology to include receptor geometry. The resultant receptor-dependent RD-4D-QSAR approach employs a novel receptor-pruning technique to permit effective processing of ligands with the lining of the binding site wrapped about them. Data reduction, QSAR model construction, and identification of possible pharmacophore sites are achieved by a three-step statistical analysis consisting of genetic algorithm optimization followed by backward elimination multidimensional regression and ending with another genetic algorithm optimization. The RD-4D-QSAR method is applied to a series of glucose inhibitors of glycogen phosphorylase b, GPb. The statistical quality of the best RI- and RD-4D-QSAR models is about the same. However, the predictivity of the RD- model is quite superior to that of the RI-4D-QSAR model for a test set. The superior predictive performance of the RD- model is due to its dependence on receptor geometry. There is a unique induced-fit between each inhibitor and the GPb binding site. This induced-fit results in the side chain of Asn-284 serving as both a hydrogen bond acceptor and donor site depending upon inhibitor structure. The RD-4D-QSAR model strongly suggests that quantitative structure-based design cannot be successful unless the receptor is allowed to be completely flexible.
Journal of Chemical Information and Computer Sciences | 2004
Craig L. Senese; José S. Duca; Dahua Pan; Anton J. Hopfinger; Yufeng J. Tseng
An elusive goal in the field of chemoinformatics and molecular modeling has been the generation of a set of descriptors that, once calculated for a molecule, may be used in a wide variety of applications. Since such universal descriptors are generated free from external constraints, they are inherently independent of the data set in which they are employed. The realization of a set of universal descriptors would significantly streamline such chemoinformatics tasks as virtual high-throughput screening (VHTS) and toxicity profiling. The current study reports the derivation and validation of a potential set of universal descriptors, referred to as the 4D-fingerprints. The 4D-fingerprints are derived from the 4D-molecular similarity analysis. To evaluate the applicability of the 4D-fingerprints as universal descriptors, they are used to generate descriptive QSAR models for 5 independent training sets. Each of the training sets has been analyzed previously by several varying QSAR methods, and the results of the models generated using the 4D-fingerprints are compared to the results of the previous QSAR analyses. It was found that the models generated using the 4D-fingerprints are comparable in quality, based on statistical measures of fit and test set prediction, to the previously reported models for the other QSAR methods. This finding is particularly significant considering the 4D-fingerprints are generated independent of external constraints such as alignment, while the QSAR methods used for comparison all require an alignment analysis.
Journal of Chemical Information and Modeling | 2013
Chi-Yu Shao; Sing-Zuo Chen; Bo-Han Su; Yufeng J. Tseng; Emilio Xavier Esposito; Anton J. Hopfinger
Little attention has been given to the selection of trial descriptor sets when designing a QSAR analysis even though a great number of descriptor classes, and often a greater number of descriptors within a given class, are now available. This paper reports an effort to explore interrelationships between QSAR models and descriptor sets. Zhou and co-workers (Zhou et al., Nano Lett. 2008, 8 (3), 859-865) designed, synthesized, and tested a combinatorial library of 80 surface-modified (that is, decorated) multi-walled carbon nanotubes for their composite nanotoxicity using six endpoints, all based on a common 0 to 100 activity scale. Each of the six endpoints for the 29 most nanotoxic decorated nanotubes was incorporated into the training set for this study. The study reported here includes trial descriptor sets for all possible combinations of MOE, VolSurf, and 4D-fingerprints (FP) descriptor classes, as well as including and excluding explicit spatial contributions from the nanotube. Optimized QSAR models were constructed from these multiple trial descriptor sets. It was found that (a) both the form and quality of the best QSAR models for each of the endpoints are distinct and (b) some endpoints are quite dependent upon 4D-FP descriptors of the entire nanotube-decorator complex. However, other endpoints yielded equally good models only using decorator descriptors with and without the decorator-only 4D-FP descriptors. Lastly, and most importantly, the quality, significance, and interpretation of a QSAR model were found to be critically dependent on the trial descriptor sets used within a given QSAR endpoint study.
International Journal of Obesity | 2015
H. H. Chen; Yufeng J. Tseng; San Yuan Wang; Yau Sheng Tsai; Chin-Sung Chang; Tien-Chueh Kuo; W. J. Yao; C. C. Shieh; Chih-Hsing Wu; Po-Hsiu Kuo
Objectives: Mechanisms of the development of abnormal metabolic phenotypes among the obese population are not yet clear. In this study, we aimed to screen the metabolomes of both metabolically healthy and metabolically abnormal obese subjects to identify potential metabolic pathways that may regulate the different metabolic characteristics of obesity. Methods: We recruited subjects with body mass index (BMI) over 25 from the weight-loss clinic of a central hospital in Taiwan. Metabolic healthy obesity (MHO) is defined as the absence of any form of hyperglycemia, hypertension and dyslipidemia, while metabolic abnormal obesity (MAO) is defined as having one or more abnormal metabolic indexes. Serum-based metabolomic profiling using both liquid chromatography–mass spectrometry and gas chromatography–mass spectrometry of 34 MHO and MAO individuals with matching age, sex and BMI was performed. Conditional logistic regression and partial least squares discriminant analysis were applied to identify significant metabolites between the two groups. Pathway enrichment and topology analyses were conducted to evaluate the regulated pathways. Results: A panel of metabolites was found to differ significantly between the MHO and MAO groups, including L-kynurenine, glycerophosphocholine (GPC), glycerol 1-phosphate, glycolic acid, tagatose, methyl palmitate and uric acid. Moreover, several metabolic pathways were relevant in distinguishing MHO from MAO groups, including fatty acid biosynthesis, phenylalanine metabolism, propanoate metabolism, and valine, leucine and isoleucine degradation. Conclusion: Different metabolomic profiles and metabolic pathways are important for distinguishing between MHO and MAO groups. We have identified and discussed the key metabolites and pathways that may prove important in the regulation of metabolic traits among the obese, which could provide useful clues to study the underlying mechanisms of the development of abnormal metabolic phenotypes.
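Pathway enrichment is commonly framed as an over-representation test. The sketch below uses a one-sided hypergeometric test as a generic illustration; the abstract does not state which enrichment statistic the study used, so the method, the function name `enrichment_p_value`, and the toy counts are all assumptions.

```python
from math import comb


def enrichment_p_value(background, in_pathway, selected, overlap):
    """One-sided hypergeometric p-value: the probability of seeing at least
    `overlap` pathway members among `selected` metabolites drawn without
    replacement from `background` metabolites, `in_pathway` of which belong
    to the pathway."""
    upper = min(in_pathway, selected)
    total = comb(background, selected)
    hits = sum(
        comb(in_pathway, i) * comb(background - in_pathway, selected - i)
        for i in range(overlap, upper + 1)
    )
    return hits / total


# Hypothetical counts: 100 measured metabolites, 10 in the pathway,
# 7 significant metabolites, 3 of which fall in the pathway.
p = enrichment_p_value(background=100, in_pathway=10, selected=7, overlap=3)
print(f"enrichment p-value: {p:.4f}")
```

In practice the p-values for many pathways would also be corrected for multiple testing, which this sketch omits.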
Journal of Analytical Toxicology | 2013
I-Lin Tsai; Te-I Weng; Yufeng J. Tseng; Happy Kuy-Lok Tan; Hsiao-Ju Sun; Ching-Hua Kuo
An ultra-high-performance liquid chromatography-quadrupole time-of-flight mass spectrometry (UHPLC-QTOF-MS) method for the screening and confirmation of 62 drugs of abuse and their metabolites in urine was developed in this study. The most commonly abused drugs, including amphetamines, opioids, cocaine, benzodiazepines (BZDs) and barbiturates, and many other new and emerging abused drugs, were selected as the analytes for this study. Urine samples were diluted 5-fold with deionized water before analysis. Using a superficially porous micro-particulate column and an acetic acid-based mobile phase, 54 basic and 8 acidic analytes could be detected within 15 and 12 min in positive and negative ionization modes, respectively. The MS collision energies for the 62 analytes were optimized, and their respective fragmentation patterns were constructed in the in-house library for confirmatory analysis. The coefficients of variation of the intra- and inter-day precision of the analyte responses were all <17.39%. All analytes, except barbital, showed matrix effects of 77-121%. The limits of detection of the 62 analytes were between 2.8 and 187.5 ng/mL, which were lower than their respective cut-off concentrations (20-500 ng/mL). Ten urine samples from patients undergoing methadone treatment were analyzed by the developed UHPLC-QTOF-MS method, and the results were compared with the immunoassay method.
Journal of Agricultural and Food Chemistry | 2015
Yi-Syuan Lai; Wei-Cheng Chen; Tien-Chueh Kuo; Chi-Tang Ho; Ching-Hua Kuo; Yufeng J. Tseng; Kuan-Hung Lu; Shih-Hang Lin; Suraphan Panyod; Lee-Yan Sheen
Obesity, dyslipidemia, insulin resistance, oxidative stress, and inflammation are key clinical risk factors for the progression of non-alcoholic fatty liver disease (NAFLD). Currently, there is no comprehensive metabolic profile of a well-established animal model that effectively mimics the etiology and pathogenesis of NAFLD in humans. Here, we report the pathophysiological and metabolomic changes associated with NAFLD development in a C57BL/6J mouse model in which NAFLD was induced by feeding a high-fat diet (HFD) for 4, 8, 12, and 16 weeks. Serum metabolomic analysis was conducted using ultrahigh-performance liquid chromatography coupled with quadrupole time-of-flight mass spectrometry (UHPLC-QTOF-MS) and gas chromatography-mass spectrometry (GC-MS) to establish a metabolomic profile. Analysis of the metabolomic profile in combination with principal component analysis revealed marked differences in metabolites between the control and HFD group depending upon NAFLD severity. A total of 30 potential biomarkers were strongly associated with the development of NAFLD. Among these, 11 metabolites were mainly related to carbohydrate metabolism, hepatic biotransformation, collagen synthesis, and gut microbial metabolism, which are characteristics of obesity, as well as significantly increased serum glucose, total cholesterol, and hepatic triglyceride levels during the onset of NAFLD (4 weeks). At 8 weeks, 5 additional metabolites that are chiefly involved in perturbation of lipid metabolism and insulin secretion were found to be associated with hyperinsulinemia, hyperlipidemia, and hepatic steatosis in the mid-term of NAFLD progression. At the end of 12 and 16 weeks, 14 additional metabolites were predominantly correlated to abnormal bile acid synthesis, oxidative stress, and inflammation, representing hepatic inflammatory infiltration during NAFLD development. These results provide potential biomarkers for early risk assessment of NAFLD and further insights into NAFLD development.
Journal of Chemical Information and Modeling | 2013
Chia-Yun Chang; Ming-Tsung Hsu; Emilio Xavier Esposito; Yufeng J. Tseng
The traditional biological assay is very time-consuming, and thus the ability to quickly screen large numbers of compounds against a specific biological target is appealing. To speed up the biological evaluation of compounds, high-throughput screening is widely used in the fields of biomedicine, bioinformatics, and drug discovery. The research presented in this study focuses on the use of support vector machines (a machine learning method), various classes of molecular descriptors, and different sampling techniques to overcome overfitting when classifying compounds for cytotoxicity with respect to the Jurkat cell line. The cell cytotoxicity data set is imbalanced (a few active compounds and very many inactive compounds), and the ability of the predictive modeling methods is adversely affected in these situations. Commonly, imbalanced data sets are overfit with respect to the dominant classified end point; in this study the models routinely overfit toward inactive (noncytotoxic) compounds when the imbalance was substantial. Support vector machine (SVM) models were used to probe the proficiency of different classes of molecular descriptors and oversampling ratios. The SVM models were constructed from 4D-FPs, MOE (1D, 2D, and 2½D), noNP+MOE, and CATS2D trial descriptor pools and compared to the predictive abilities of CATS2D-based random forest models. Compared to previous results in the literature, the SVM models built from oversampled data sets exhibited better predictive abilities for the training and external test sets.
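One simple sampling technique for imbalanced data of this kind is random oversampling of the minority class. The sketch below is a generic illustration under that assumption; the abstract does not specify the exact oversampling procedure used in the study, and `oversample_minority`, the target ratio, and the toy labels are hypothetical.

```python
import random


def oversample_minority(samples, labels, minority_label, ratio=1.0, seed=0):
    """Randomly duplicate minority-class samples until their count reaches
    int(ratio * majority count). Assumes a two-class problem and returns
    new (samples, labels) lists; the inputs are not modified."""
    rng = random.Random(seed)
    minority = [s for s, y in zip(samples, labels) if y == minority_label]
    if not minority:
        raise ValueError("no minority-class samples to oversample")
    n_majority = len(samples) - len(minority)
    n_extra = max(0, int(ratio * n_majority) - len(minority))
    extra = [rng.choice(minority) for _ in range(n_extra)]
    new_samples = list(samples) + extra
    new_labels = list(labels) + [minority_label] * n_extra
    return new_samples, new_labels


# Hypothetical data set: 2 active (1) and 8 inactive (0) compounds.
samples = [f"compound_{i}" for i in range(10)]
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
new_samples, new_labels = oversample_minority(samples, labels, minority_label=1)
print(new_labels.count(1), "active vs", new_labels.count(0), "inactive")
```

Oversampling of this kind should be applied to the training set only, so that duplicated compounds do not leak into the external test set. The `ratio` parameter stands in for the oversampling ratios the abstract mentions.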