Publication


Featured research published by Yong-Huan Yun.


Analytica Chimica Acta | 2016

Chemometric methods in data processing of mass spectrometry-based metabolomics: A review

Lunzhao Yi; Naiping Dong; Yong-Huan Yun; Baichuan Deng; Dabing Ren; Shao Liu; Yi-Zeng Liang

This review focuses on recent and potential advances in chemometric methods for data processing in metabolomics, especially for data generated by mass spectrometric techniques. Metabolomics is gradually being regarded as a valuable and promising biotechnology rather than an ambitious advancement. Herein, we outline significant developments in metabolomics, especially in combination with modern chemical analysis techniques and dedicated statistical and chemometric data analysis strategies. Advanced techniques for the preprocessing of raw data, identification of metabolites, variable selection, and modeling are illustrated. We believe that insights from these developments will help narrow the gap between the original dataset and current biological knowledge. We also discuss the limitations and perspectives of extracting information from high-throughput datasets.


Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy | 2013

An efficient method of wavelength interval selection based on random frog for multivariate spectral calibration.

Yong-Huan Yun; Hong-Dong Li; Leslie R. E. Wood; Wei Fan; Jia-Jun Wang; Dong-Sheng Cao; Qing-Song Xu; Yi-Zeng Liang

Wavelength selection is a critical step for improving prediction performance on spectral data. Because vibrational and rotational spectra have continuous spectral bands, we propose a novel method of wavelength interval selection based on random frog, called interval random frog (iRF). To obtain all possible continuous intervals, the spectra are first divided into intervals by moving a window of fixed width over the whole spectrum. These overlapping intervals are ranked by applying random frog coupled with PLS, and the optimal ones are chosen. The method has been applied to two near-infrared spectral datasets and shows higher efficiency in wavelength interval selection than other methods. The source code of iRF can be freely downloaded for academic research at the website: http://code.google.com/p/multivariate-calibration/downloads/list.
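
The interval-generation step described above can be illustrated with a minimal sketch. It only covers the moving window of fixed width; the ranking by random frog coupled with PLS is not shown. The function name `moving_window_intervals` and the half-open index convention are assumptions made for illustration and are not taken from the iRF source code.

```python
def moving_window_intervals(n_points, width):
    """Return every overlapping interval of a fixed width.

    Each interval is a half-open index range (start, stop) over a
    spectrum with n_points wavelengths. Illustrative helper only.
    """
    if not 1 <= width <= n_points:
        raise ValueError("width must be between 1 and n_points")
    # Slide the window one wavelength at a time across the spectrum.
    return [(start, start + width) for start in range(n_points - width + 1)]


# A spectrum with 10 wavelengths and a window of width 4.
intervals = moving_window_intervals(10, 4)
print(len(intervals), intervals[0], intervals[-1])
```

A spectrum with n wavelengths and a window of width w yields n - w + 1 candidate intervals, which is the set a ranking procedure would then score.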


Analytica Chimica Acta | 2014

A strategy that iteratively retains informative variables for selecting optimal variable subset in multivariate calibration

Yong-Huan Yun; Wei-Ting Wang; Min-Li Tan; Yi-Zeng Liang; Hong-Dong Li; Dong-Sheng Cao; Hongmei Lu; Qing-Song Xu

With the high dimensionality of modern datasets, creating effective methods that can select an optimal variable subset is a great challenge. In this study, a strategy that considers the possible interaction effects among variables through random combinations is proposed, called iteratively retaining informative variables (IRIV). The variables are classified into four categories: strongly informative, weakly informative, uninformative, and interfering. On this basis, IRIV retains both the strongly and weakly informative variables in every iterative round until no uninformative or interfering variables remain. Three datasets were employed to investigate the performance of IRIV coupled with partial least squares (PLS). The results show that IRIV is a good alternative for variable selection when compared with three outstanding and frequently used variable selection methods: genetic algorithm-PLS, Monte Carlo uninformative variable elimination by PLS (MC-UVE-PLS), and competitive adaptive reweighted sampling (CARS). The MATLAB source code of IRIV can be freely downloaded for academic research at the website: http://code.google.com/p/multivariate-calibration/downloads/list.
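
The retention loop described above can be sketched as follows. This is a schematic illustration only: the abstract does not state how variables are assigned to the four categories, so the classifier is passed in as a hypothetical callable, and `toy_classify` is an invented stand-in rather than the IRIV decision rule.

```python
KEEP = {"strong", "weak"}  # categories retained in every round


def iteratively_retain(variables, classify):
    """Repeat classification until only informative variables remain.

    classify is a hypothetical callable mapping the current variable
    list to a dict {variable: category}; labels may change from round
    to round because they depend on the other variables present.
    """
    retained = list(variables)
    while True:
        labels = classify(retained)
        kept = [v for v in retained if labels[v] in KEEP]
        if len(kept) == len(retained):
            return retained  # no uninformative or interfering variables left
        retained = kept


def toy_classify(current):
    """Invented example: 'c' looks informative only while 'd' is present."""
    labels = {}
    for v in current:
        if v == "a":
            labels[v] = "strong"
        elif v == "b":
            labels[v] = "weak"
        elif v == "c":
            labels[v] = "weak" if "d" in current else "uninformative"
        else:
            labels[v] = "interfering"
    return labels


print(iteratively_retain(["a", "b", "c", "d"], toy_classify))
```

Reclassifying after every removal is what lets the loop react to interactions among variables: a variable that looked useful in one round can be dropped in the next.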


Analyst | 2013

A perspective demonstration on the importance of variable selection in inverse calibration for complex analytical systems

Yong-Huan Yun; Yi-Zeng Liang; Gui-Xiang Xie; Hong-Dong Li; Dong-Sheng Cao; Qing-Song Xu

Classical calibration and inverse calibration are two kinds of multivariate calibration in chemical modeling. They model in component spectral space and in measured variable space, respectively. However, the intrinsic difference between these two calibration models has not been fully investigated. Moreover, in complex analytical systems, the net analyte signal (NAS) cannot be well defined in inverse calibration because of uninformative and/or interfering variables. Therefore, applying the NAS cannot improve the predictive performance for this kind of calibration, since it is essentially a full-spectrum technique. From our perspective, variable selection can significantly improve the predictive performance by removing uninformative and/or interfering variables. Although the need for variable selection in the inverse calibration model has already been experimentally demonstrated, it has not attracted much attention. In this study, we first clarify the intrinsic difference between these two calibration models and then use a new perspective to demonstrate the importance of variable selection in the inverse calibration model for complex analytical systems. In addition, we have experimentally validated our viewpoint using one UV dataset and two generated near-infrared (NIR) datasets.


Chromatographia | 2013

Comparisons of Five Algorithms for Chromatogram Alignment

Wei Jiang; Zhimin Zhang; Yong-Huan Yun; De-Jian Zhan; Yi-Bao Zheng; Yi-Zeng Liang; Zhenyu Yang; Ling Yu

In this study, five frequently used warping algorithms [correlation optimized warping (COW), recursive alignment by fast Fourier transform (RAFFT), dynamic time warping, variable penalty dynamic warping, and parametric time warping (PTW)] are compared for their ability to align chromatograms with retention time shifts. Five datasets consisting of chromatograms of herbal medicines analyzed by high-performance liquid chromatography (HPLC) (Kudzuvine Root, White Paeony Root, Rehmannia Root, Ligusticum wallichii, Scutellaria baicalensis) are chosen to test these five alignment algorithms. The comparison shows that all five methods produce misalignments to different degrees, but the correlations of the aligned datasets are all improved, especially for the datasets aligned by the segment-wise methods, COW and RAFFT. In the comprehensive comparison, RAFFT achieves the highest score, followed by COW, whereas PTW is not recommended for aligning HPLC chromatograms.
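
As an illustration of one of the five compared algorithms, the following is a minimal sketch of textbook dynamic time warping on two one-dimensional signals. It uses an absolute-difference cost and no warping constraints; both are assumptions, and the sketch is not the implementation evaluated in the study.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping cost between two sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats a point
                cost[i][j - 1],      # b advances, a repeats a point
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# The second signal is the first one with a delayed peak.
peak = [0, 1, 2, 1, 0]
delayed_peak = [0, 0, 1, 2, 1, 0]
print(dtw_distance(peak, delayed_peak))
```

Because points may be repeated, a pure time shift costs nothing, which is why warping of this kind can absorb retention time drift between chromatograms.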


Analytica Chimica Acta | 2015

A new strategy to prevent over-fitting in partial least squares models based on model population analysis

Baichuan Deng; Yong-Huan Yun; Yi-Zeng Liang; Dong-Sheng Cao; Qing-Song Xu; Lunzhao Yi; Xin Huang

Partial least squares (PLS) is one of the most widely used methods for chemical modeling. However, like many other methods with tunable parameters, it has a strong tendency to over-fit. Thus, a crucial step in PLS model building is to select the optimal number of latent variables (nLVs). Cross-validation (CV) is the most popular method for PLS model selection because it selects a model from the perspective of prediction ability. However, CV may not yield a clear minimum of prediction errors, which makes model selection difficult. To solve this problem, we propose a new strategy for PLS model selection that combines the cross-validated coefficient of determination (Q²cv) and model stability (S). S is defined as the stability of the PLS regression vectors, which is obtained using model population analysis (MPA). The results show that, when a clear maximum of Q²cv is not obtained, S can provide additional information about over-fitting and helps in finding the optimal nLVs. Compared with other regression-vector-based indicators such as the Euclidean 2-norm (B2), the Durbin-Watson statistic (DW), and the jaggedness (J), S is more sensitive to over-fitting. The model selected by our method has both good prediction ability and stability.
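
For readers unfamiliar with the cross-validated coefficient of determination, the following minimal sketch computes it in its common form, 1 - PRESS/TSS, from observed values and cross-validated predictions. The function name `q2_cv` is an assumption, the abstract does not spell out the exact formula used, and the stability measure S is not shown because its definition is not given here.

```python
def q2_cv(y_true, y_cv_pred):
    """Cross-validated R^2 in the common form 1 - PRESS / TSS.

    y_cv_pred holds, for each sample, the prediction made by a model
    that did not see that sample during fitting.
    """
    mean_y = sum(y_true) / len(y_true)
    # Predictive residual sum of squares.
    press = sum((y - p) ** 2 for y, p in zip(y_true, y_cv_pred))
    # Total sum of squares around the mean.
    tss = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - press / tss


print(q2_cv([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
```

Computing this value for each candidate number of latent variables gives the curve whose maximum is sought; a flat curve without a clear maximum is the situation the abstract describes.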


Journal of Proteome Research | 2017

Deep-Learning-Based Drug–Target Interaction Prediction

Ming Wen; Zhimin Zhang; Shaoyu Niu; Haozhi Sha; Ruihan Yang; Yong-Huan Yun; Hongmei Lu

Identifying interactions between known drugs and targets is a major challenge in drug repositioning. In silico prediction of drug-target interaction (DTI) can speed up the expensive and time-consuming experimental work by providing the most potent DTIs. In silico prediction of DTI can also provide insights into potential drug-drug interactions and promote the exploration of drug side effects. Traditionally, the performance of DTI prediction depends heavily on the descriptors used to represent the drugs and the target proteins. In this paper, to accurately predict new DTIs between approved drugs and targets without separating the targets into different classes, we developed a deep-learning-based algorithmic framework named DeepDTIs. It first abstracts representations from raw input descriptors using unsupervised pretraining and then applies known label pairs of interaction to build a classification model. DeepDTIs is found to match or outperform other state-of-the-art methods. DeepDTIs can be further used to predict whether a new drug targets some existing targets or whether a new target interacts with some existing drugs.


Analytica Chimica Acta | 2016

A bootstrapping soft shrinkage approach for variable selection in chemical modeling.

Baichuan Deng; Yong-Huan Yun; Dong-Sheng Cao; Yu-Long Yin; Wei-Ting Wang; Hongmei Lu; Qianyi Luo; Yi-Zeng Liang

In this study, a new variable selection method called bootstrapping soft shrinkage (BOSS) is developed. It is derived from the ideas of weighted bootstrap sampling (WBS) and model population analysis (MPA). The weights of variables are determined from the absolute values of regression coefficients. WBS is applied according to the weights to generate sub-models, and MPA is used to analyze the sub-models and update the variable weights. The optimization procedure follows the rule of soft shrinkage, in which less important variables are not eliminated directly but are assigned smaller weights. The algorithm runs iteratively and terminates when the number of variables reaches one. The optimal variable set with the lowest root mean squared error of cross-validation (RMSECV) is selected. The method was tested on three groups of near-infrared (NIR) spectroscopic datasets: corn, diesel fuel, and soy datasets. Three high-performing variable selection methods, namely Monte Carlo uninformative variable elimination (MCUVE), competitive adaptive reweighted sampling (CARS), and genetic algorithm partial least squares (GA-PLS), are used for comparison. The results show that BOSS is promising, with improved prediction performance. The MATLAB code for implementing BOSS is freely available on the website: http://www.mathworks.com/matlabcentral/fileexchange/52770-boss.
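
The weighted bootstrap sampling step can be illustrated with a minimal sketch. It shows one draw only: weights are taken from absolute regression coefficients, variables are sampled with replacement according to those weights, and the unique variables form a sub-model. The function names, the normalization to unit sum, and the example coefficients are assumptions; the sub-model fitting and the weight update through MPA are not shown.

```python
import random


def coefficient_weights(coefficients):
    """Turn regression coefficients into sampling weights that sum to 1."""
    magnitudes = [abs(c) for c in coefficients]
    total = sum(magnitudes)
    return [m / total for m in magnitudes]


def weighted_bootstrap_variables(weights, n_draws, rng):
    """Draw variable indices with replacement and keep the unique ones."""
    indices = range(len(weights))
    draws = rng.choices(indices, weights=weights, k=n_draws)
    return sorted(set(draws))


# Invented coefficients: the third variable has zero weight.
weights = coefficient_weights([0.8, -0.1, 0.0, 0.4])
rng = random.Random(0)
print(weighted_bootstrap_variables(weights, n_draws=4, rng=rng))
```

Sampling with replacement means low-weight variables appear less often across sub-models but are not removed outright, which is the sense in which the shrinkage is soft.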


Journal of Chromatography A | 2013

Application of fast Fourier transform cross-correlation and mass spectrometry data for accurate alignment of chromatograms.

Yi-Bao Zheng; Zhimin Zhang; Yi-Zeng Liang; De-Jian Zhan; Jian-Hua Huang; Yong-Huan Yun; Hua-Lin Xie

Chromatography has been established as one of the most important analytical methods in the modern analytical laboratory. However, preprocessing of the chromatograms, especially peak alignment, is usually a time-consuming task prior to extracting useful information from the datasets because of the small unavoidable differences in the experimental conditions caused by minor changes and drift. Most alignment algorithms are performed on reduced datasets using only the detected peaks in the chromatograms, which means a loss of data and introduces the problem of extracting peak data from the chromatographic profiles. These disadvantages can be overcome by using the full chromatographic information that is generated by hyphenated chromatographic instruments. A new alignment algorithm called CAMS (Chromatogram Alignment via Mass Spectra) is presented here to correct the retention time shifts among chromatograms accurately and rapidly. In this report, the peaks of each chromatogram were detected based on the Continuous Wavelet Transform (CWT) with the Haar wavelet and were aligned against the reference chromatogram via the correlation of mass spectra. The aligning procedure was accelerated by Fast Fourier Transform cross-correlation (FFT cross-correlation). This approach has been compared with several well-known alignment methods on real chromatographic datasets, which demonstrates that CAMS can preserve the shape of peaks and achieve a high-quality alignment result. Furthermore, the CAMS method was implemented in the MATLAB language and is available as an open source package at http://www.github.com/matchcoder/CAMS.
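
The idea of FFT cross-correlation can be illustrated with a minimal sketch that estimates the shift between two one-dimensional signals. It is written in pure Python with a small radix-2 FFT so that it is self-contained; the function names and the toy signals are assumptions, and the sketch does not reproduce the CAMS package, its peak detection, or its use of mass spectra.

```python
import cmath


def _fft(values, sign):
    """Recursive radix-2 FFT; sign=-1 is forward, sign=+1 is unscaled inverse."""
    n = len(values)
    if n == 1:
        return list(values)
    even = _fft(values[0::2], sign)
    odd = _fft(values[1::2], sign)
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def estimate_shift(reference, signal):
    """Return the lag (in points) at which signal best matches reference.

    A positive result means the signal is delayed relative to the reference.
    """
    # Zero-pad to a power of two so circular correlation does not wrap.
    size = 1
    while size < len(reference) + len(signal):
        size *= 2
    ref = list(reference) + [0.0] * (size - len(reference))
    sig = list(signal) + [0.0] * (size - len(signal))
    ref_f = _fft(ref, -1)
    sig_f = _fft(sig, -1)
    # Correlation theorem: multiply by the conjugate, then invert.
    product = [s * r.conjugate() for s, r in zip(sig_f, ref_f)]
    corr = [v.real / size for v in _fft(product, +1)]
    best = max(range(size), key=lambda i: corr[i])
    # Indices in the upper half correspond to negative lags.
    return best if best <= size // 2 else best - size


reference = [0.0] * 16
reference[4], reference[5], reference[6] = 0.5, 1.0, 0.5
delayed = [0.0] * 3 + reference[:-3]
print(estimate_shift(reference, delayed))
```

Computing the correlation through the FFT costs O(n log n) instead of O(n²) for the direct sum, which is the source of the speed-up the abstract refers to.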


Talanta | 2016

A potential tool for diagnosis of male infertility: Plasma metabolomics based on GC-MS

Xinyi Zhou; Yang Wang; Yong-Huan Yun; Zian Xia; Hongmei Lu; Jiekun Luo; Yi-Zeng Liang

Male infertility has become an important public health problem worldwide. At present, the diagnosis of male infertility frequently depends on semen quality results or requires more invasive surgical intervention. Therefore, it is necessary to develop a novel approach for early diagnosis of male infertility. According to the presence or absence of normal sexual function, male infertility is classified into two phenotypes, erectile dysfunction (ED) and semen abnormalities (SA). The aim of this study was to investigate the GC-MS plasma profiles of infertile males with ED and with SA and to discover potential biomarkers. Plasma samples from healthy controls (HC) (n=61) and infertility patients with ED (n=26) or with SA (n=44) were analyzed by gas chromatography-mass spectrometry (GC-MS) for discrimination and screening of potential biomarkers. Partial least squares-discriminant analysis (PLS-DA) was performed on the GC-MS dataset. The results showed that HC could be discriminated from infertile cases with SA (AUC=86.96%, sensitivity=78.69%, specificity=84.09%, accuracy=80.95%) and infertile cases with ED (AUC=94.33%, sensitivity=80.33%, specificity=100%, accuracy=87.36%). Some potential biomarkers were successfully discovered by two commonly used variable selection methods, variable importance on projection (VIP) and the original coefficients of PLS-DA (β). 1,5-Anhydro-sorbitol and α-hydroxyisovaleric acid were identified as potential biomarkers for distinguishing HC from male infertility patients. Meanwhile, lactate, glutamate, and cholesterol were found to be the important variables for distinguishing patients with ED from those with SA. Plasma metabolomics may be developed as a novel approach for fast, noninvasive, and acceptable diagnosis and characterization of male infertility.
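
The relationship between group sizes and the reported rates can be made concrete with a short worked example. The abstract does not report the underlying classification counts; the values 48 of 61 and 37 of 44 used below are back-calculated assumptions that happen to reproduce the percentages reported for the HC versus SA comparison, with the 61-sample HC group treated as the class whose rate is called sensitivity.

```python
def classification_rates(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Assumed counts (not stated in the abstract): 48 of 61 samples in the
# first group and 37 of 44 samples in the second group classified correctly.
sens, spec, acc = classification_rates(tp=48, fn=13, tn=37, fp=7)
print(f"{sens:.2%} {spec:.2%} {acc:.2%}")
```

Accuracy is a size-weighted combination of the two rates, so with unequal group sizes it sits closer to the rate of the larger group.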

Collaboration


Dive into Yong-Huan Yun's collaboration.

Top Co-Authors

Yi-Zeng Liang, Central South University
Baichuan Deng, South China Agricultural University
Hongmei Lu, Central South University
Dong-Sheng Cao, Central South University
Qing-Song Xu, Central South University
Lunzhao Yi, Kunming University of Science and Technology
Zhimin Zhang, Central South University
Dabing Ren, Kunming University of Science and Technology
Wei Fan, Central South University
Wei-Ting Wang, Central South University