
Publication


Featured research published by Imhoi Koo.


Analytical Chemistry | 2011

Wavelet- and Fourier-Transform-Based Spectrum Similarity Approaches to Compound Identification in Gas Chromatography/Mass Spectrometry

Imhoi Koo; Xiang Zhang; Seongho Kim

The high-throughput gas chromatography/mass spectrometry (GC/MS) technology offers a powerful means of analyzing a large number of chemical and biological samples. One of the important analyses of GC/MS data is compound identification. In this work, novel spectral similarity measures based on the discrete wavelet and Fourier transforms were proposed. The proposed methods are composite similarities that are composed of weighted intensities and wavelet/Fourier coefficients using cosine correlation. The performance of the proposed approaches along with the existing similarity measures was evaluated using the NIST Chemistry WebBook mass database maintained by the National Institute of Standards and Technology (NIST) as a library of reference spectra and repetitive mass spectral data as query spectra. The analysis results showed that the identification accuracies of the wavelet- and Fourier-transform-based methods were improved by 2.02% and 1.95%, respectively, compared to that of the weighted dot product (cosine correlation) and by 3.01% and 3.08%, respectively, compared to that of the composite similarity measure. The improved identification accuracy demonstrates that the proposed approaches outperformed the existing similarity measures in the literature.
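The composite idea described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the exact formula, so the mixing weight `alpha`, the weight factors `a` and `b`, and the use of the real part of a plain discrete Fourier transform are all assumptions chosen for illustration.

```python
import math

def weight(spectrum, a=0.5, b=1.0):
    """Weighted intensities I**a * mz**b; a and b are illustrative values only."""
    return {mz: (intensity ** a) * (mz ** b) for mz, intensity in spectrum.items()}

def to_vector(spectrum, mz_axis):
    """Place a sparse spectrum (dict of m/z -> intensity) on a shared m/z axis."""
    return [spectrum.get(mz, 0.0) for mz in mz_axis]

def cosine(u, v):
    """Cosine correlation between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def dft_real(x):
    """Real part of the discrete Fourier transform (naive O(n^2) version)."""
    n = len(x)
    return [sum(x[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def composite_similarity(query, reference, mz_axis, alpha=0.5):
    """Hypothetical composite: mix of intensity-domain and Fourier-domain cosine."""
    q = to_vector(weight(query), mz_axis)
    r = to_vector(weight(reference), mz_axis)
    return alpha * cosine(q, r) + (1 - alpha) * cosine(dft_real(q), dft_real(r))
```

A wavelet-based variant would replace `dft_real` with a discrete wavelet decomposition; the structure of the composite score stays the same under these assumptions.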


Analytical Chemistry | 2011

MetSign: a computational platform for high-resolution mass spectrometry-based metabolomics.

Xiaoli Wei; Wenlong Sun; Xue Shi; Imhoi Koo; Bing Wang; Jun Zhang; Xinmin Yin; Yunan Tang; Bogdan Bogdanov; Seongho Kim; Zhanxiang Zhou; Craig J. McClain; Xiang Zhang

Data analysis in metabolomics is currently a major challenge, particularly when large sample sets are analyzed. Herein, we present a novel computational platform entitled MetSign for high-resolution mass spectrometry-based metabolomics. By converting the instrument raw data into mzXML format as its input data, MetSign provides a suite of bioinformatics tools to perform raw data deconvolution, metabolite putative assignment, peak list alignment, normalization, statistical significance tests, unsupervised pattern recognition, and time course analysis. MetSign uses a modular design and an interactive visual data mining approach to enable efficient extraction of useful patterns from data sets. Analysis steps, designed as containers, are presented with a wizard for the user to follow analyses. Each analysis step might contain multiple analysis procedures and/or methods and serves as a pausing point where users can interact with the system to review the results, to shape the next steps, and to return to previous steps to repeat them with different methods or parameter settings. Analysis of metabolite extract of mouse liver with spiked-in acid standards shows that MetSign outperforms the existing publicly available software packages. MetSign has also been successfully applied to investigate the regulation and time course trajectory of metabolites in hepatic liver.


Bioinformatics | 2012

A method of finding optimal weight factors for compound identification in gas chromatography–mass spectrometry

Seongho Kim; Imhoi Koo; Xiaoli Wei; Xiang Zhang

MOTIVATION: Compound identification in gas chromatography-mass spectrometry (GC-MS) is achieved by matching the experimental mass spectrum to the mass spectra in a spectral library. It is known that the intensities with higher m/z value in the GC-MS mass spectrum are the most diagnostic. Therefore, to increase the relative significance of peak intensities of higher m/z value, the intensities and m/z values are usually transformed with a set of weight factors. Poor-quality weight factors can significantly decrease the accuracy of compound identification. With the significant enrichment of the mass spectral database and the broad application of GC-MS, it is important to revisit the methods of discovering the optimal weight factors for high-confidence compound identification. RESULTS: We developed a novel approach to finding the optimal weight factors using only a reference library for high-accuracy compound identification. The developed approach first calculates the ratio of skewness to kurtosis of the mass spectral similarity scores among spectra (compounds) in a reference library and then considers a weight factor with the maximum ratio as the optimal weight factor. We examined our approach by comparing the accuracy of compound identification using the mass spectral library maintained by the National Institute of Standards and Technology. The results demonstrate that the optimal weight factors for fragment ion peak intensity and m/z value found by the developed approach outperform the current weight factors for compound identification. AVAILABILITY: The results and R package are available at http://stage.louisville.edu/faculty/x0zhan17/software/software-development.
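The selection rule in the abstract (pick the weight factor that maximizes the skewness-to-kurtosis ratio of library-internal similarity scores) can be sketched as below. This is an illustrative sketch only: it assumes cosine similarity as the score, population (biased) moments, non-excess kurtosis, and a small hand-picked grid of candidate `(a, b)` pairs. None of these choices are confirmed by the abstract, and the authors' R package may differ.

```python
import math
from itertools import combinations

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def transform(intensities, mz_axis, a, b):
    """Apply weight factors: intensity**a * mz**b for each peak."""
    return [(i ** a) * (mz ** b) for i, mz in zip(intensities, mz_axis)]

def central_moment(xs, order):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** order for x in xs) / len(xs)

def skewness(xs):
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5

def kurtosis(xs):
    # Non-excess kurtosis (a normal distribution gives 3); an assumption here.
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2

def skew_kurt_ratio(library, mz_axis, a, b):
    """Ratio computed over all pairwise similarity scores within the library."""
    spectra = [transform(s, mz_axis, a, b) for s in library]
    scores = [cosine(u, v) for u, v in combinations(spectra, 2)]
    if central_moment(scores, 2) == 0:
        return float("-inf")  # degenerate case: all scores identical
    return skewness(scores) / kurtosis(scores)

def best_weight_factor(library, mz_axis, candidates):
    """Return the candidate (a, b) pair with the largest ratio."""
    return max(candidates, key=lambda ab: skew_kurt_ratio(library, mz_axis, *ab))
```

Because the rule needs only the reference library, no labeled query spectra are required to run the search under these assumptions.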


Analytical Chemistry | 2012

Compound identification using partial and semipartial correlations for gas chromatography-mass spectrometry data.

Seongho Kim; Imhoi Koo; Jaesik Jeong; Shiwen Wu; Xue Shi; Xiang Zhang

Compound identification is a key component of data analysis in the applications of gas chromatography-mass spectrometry (GC-MS). Currently, the most widely used compound identification is mass spectrum matching, in which the dot product and its composite version are employed as spectral similarity measures. Several forms of transformations for fragment ion intensities have also been proposed to increase the accuracy of compound identification. In this study, we introduced partial and semipartial correlations as mass spectral similarity measures and applied them to identify compounds along with different transformations of peak intensity. The mixture versions of the proposed method were also developed to further improve the accuracy of compound identification. To demonstrate the performance of the proposed spectral similarity measures, the National Institute of Standards and Technology (NIST) mass spectral library and replicate spectral library were used as the reference library and the query spectra, respectively. Identification results showed that the mixture partial and semipartial correlations always outperform both the dot product and its composite measure. The mixture similarity with semipartial correlation has the highest accuracy of 84.6% in compound identification with a transformation of (0.53,1.3) for fragment ion intensity and m/z value, respectively.
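For readers unfamiliar with the two measures, the textbook first-order definitions are sketched below. This is only the standard statistical formula with a single control vector `z`; how the paper chooses the conditioning spectra and builds its "mixture" versions is not stated in the abstract, so treating `z` as one competing library spectrum is an assumption.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length intensity vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def partial_corr(x, y, z):
    """Correlation of x and y after removing the influence of z from both."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

def semipartial_corr(x, y, z):
    """Correlation of x with y after removing the influence of z from y only."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt(1 - ryz ** 2)
```

In this reading, `x` would be the query spectrum, `y` a candidate reference spectrum, and `z` a spectrum whose shared signal should be discounted.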


Journal of Proteome Research | 2014

Metabolomic Analysis of the Effects of Chronic Arsenic Exposure in a Mouse Model of Diet-Induced Fatty Liver Disease

Xue Shi; Xiaoli Wei; Imhoi Koo; Robin H. Schmidt; Xinmin Yin; Seong Ho Kim; Andrew Vaughn; Craig J. McClain; Gavin E. Arteel; Xiang Zhang; Walter H. Watson

Arsenic is a widely distributed environmental component that is associated with a variety of cancer and non-cancer adverse health effects. Additional lifestyle factors, such as diet, contribute to the manifestation of disease. Recently, arsenic was found to increase inflammation and liver injury in a dietary model of fatty liver disease. The purpose of the present study was to investigate potential mechanisms of this diet-environment interaction via a high-throughput metabolomics approach. GC×GC-TOF MS was used to identify metabolites that were significantly increased or decreased in the livers of mice fed a Western diet (a diet high in fat and cholesterol) and co-exposed to arsenic-contaminated drinking water. The results showed that there are distinct hepatic metabolomic profiles associated with eating a high fat diet, drinking arsenic-contaminated water, and the combination of the two. Among the metabolites that were decreased when arsenic exposure was combined with a high fat diet were short-chain and medium-chain fatty acid metabolites and the anti-inflammatory amino acid, glycine. These results are consistent with the observed increase in inflammation and cell death in the livers of these mice and point to potentially novel mechanisms by which these metabolic pathways could be altered by arsenic in the context of diet-induced fatty liver disease.


Bioinformatics | 2013

MetPP: a computational platform for comprehensive two-dimensional gas chromatography time-of-flight mass spectrometry-based metabolomics

Xiaoli Wei; Xue Shi; Imhoi Koo; Seongho Kim; Robin H. Schmidt; Gavin E. Arteel; Walter H. Watson; Craig J. McClain; Xiang Zhang

MOTIVATION: Due to the high complexity of the metabolome, comprehensive 2D gas chromatography time-of-flight mass spectrometry (GC×GC-TOF MS) is considered a powerful analytical platform for metabolomics studies. However, applications of GC×GC-TOF MS in metabolomics are not widespread owing to the lack of a bioinformatics system for data analysis. RESULTS: We developed a computational platform entitled metabolomics profiling pipeline (MetPP) for analysis of metabolomics data acquired on a GC×GC-TOF MS system. MetPP can process peak filtering and merging, retention index matching, peak list alignment, normalization, statistical significance tests and pattern recognition, using the peak lists deconvoluted from the instrument data as its input. The performance of MetPP software was tested with two sets of experimental data acquired in a spike-in experiment and a biomarker discovery experiment, respectively. MetPP not only correctly aligned the spiked-in metabolite standards from the experimental data, but also correctly recognized their concentration difference between sample groups. For analysis of the biomarker discovery data, 15 metabolites were recognized with significant concentration difference between the sample groups and these results agree with the literature results of histological analysis, demonstrating the effectiveness of applying MetPP software for disease biomarker discovery. AVAILABILITY: The source code of MetPP is available at http://metaopen.sourceforge.net SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Journal of Chromatography A | 2013

Comparative analysis of mass spectral matching-based compound identification in gas chromatography-mass spectrometry

Imhoi Koo; Seongho Kim; Xiang Zhang

Compound identification in gas chromatography-mass spectrometry (GC-MS) is usually achieved by matching query spectra to spectra present in a reference library. Although several spectral similarity measures have been developed and compared using a small reference library, it remains unknown how the relationship between the spectral similarity measure and the size of the reference library affects the identification accuracy as well as the optimal weight factor. We used three reference libraries to investigate the dependency among the optimal weight factor, the spectral similarity measure and the size of the reference library. Our study demonstrated that the optimal weight factor depends not only on the spectral similarity measure but also on the size of the reference library. The mixture semi-partial correlation measure outperforms all existing spectral similarity measures in all tested reference libraries, in spite of the computational expense. Furthermore, the accuracy of compound identification using a larger reference library in the future is estimated by varying the size of the reference library. The simulation study indicates that the mixture semi-partial correlation measure will have the best performance as reference libraries grow in the future.


BMC Bioinformatics | 2011

Smith-Waterman peak alignment for comprehensive two-dimensional gas chromatography-mass spectrometry.

Seongho Kim; Imhoi Koo; Aiqin Fang; Xiang Zhang

Background: Comprehensive two-dimensional gas chromatography coupled with mass spectrometry (GC × GC-MS) is a powerful technique which has gained increasing attention over the last two decades. The GC × GC-MS provides much increased separation capacity, chemical selectivity and sensitivity for complex sample analysis and brings more accurate information about compound retention times and mass spectra. Despite these advantages, the retention times of the resolved peaks on the two-dimensional gas chromatographic columns are always shifted due to experimental variations, introducing difficulty in the data processing for metabolomics analysis. Therefore, the retention time variation must be adjusted in order to compare multiple metabolic profiles obtained from different conditions. Results: We developed novel peak alignment algorithms for both homogeneous (acquired under the identical experimental conditions) and heterogeneous (acquired under the different experimental conditions) GC × GC-MS data using modified Smith-Waterman local alignment algorithms along with mass spectral similarity. Compared with literature reported algorithms, the proposed algorithms eliminated the detection of landmark peaks and the usage of retention time transformation. Furthermore, an automated peak alignment software package was established by implementing a likelihood function for optimal peak alignment. Conclusions: The proposed Smith-Waterman local alignment-based algorithms are capable of aligning both the homogeneous and heterogeneous data of multiple GC × GC-MS experiments without the transformation of retention times and the selection of landmark peaks. An optimal version of the SW-based algorithms was also established based on the associated likelihood function for the automatic peak alignment. The proposed alignment algorithms outperform the literature reported alignment method by analyzing the experiment data of a mixture of compound standards and a metabolite extract of mouse plasma with spiked-in compound standards.
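To make the alignment idea concrete, here is a sketch of the textbook Smith-Waterman local alignment applied to two ordered peak lists, scoring each pair of peaks by mass spectral cosine similarity. It is not the paper's modified algorithm: the `min_sim` offset, the linear `gap` penalty, and the absence of any retention-time term or likelihood function are simplifying assumptions made for illustration.

```python
import math

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def align_peaks(peaks_a, peaks_b, min_sim=0.8, gap=0.5):
    """Return index pairs (i, j) of peaks matched by local alignment.

    peaks_a and peaks_b are lists of spectra (equal-length intensity vectors)
    ordered by elution. A pair scores positively only if cosine > min_sim.
    """
    n, m = len(peaks_a), len(peaks_b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    move = [["stop"] * (m + 1) for _ in range(n + 1)]
    best, best_cell = 0.0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = cosine(peaks_a[i - 1], peaks_b[j - 1]) - min_sim
            options = [
                (0.0, "stop"),                       # restart the local alignment
                (score[i - 1][j - 1] + s, "diag"),   # match peak i with peak j
                (score[i - 1][j] - gap, "up"),       # skip a peak in list A
                (score[i][j - 1] - gap, "left"),     # skip a peak in list B
            ]
            score[i][j], move[i][j] = max(options, key=lambda o: o[0])
            if score[i][j] > best:
                best, best_cell = score[i][j], (i, j)
    # Trace back from the best-scoring cell until the alignment restarts.
    pairs = []
    i, j = best_cell
    while move[i][j] != "stop":
        if move[i][j] == "diag":
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move[i][j] == "up":
            i -= 1
        else:
            j -= 1
    return pairs[::-1]
```

Because matches are driven by spectral similarity and peak order alone, this sketch needs neither landmark peaks nor a retention-time transformation, which is the property the abstract highlights.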


Analyst | 2014

Compound identification in GC-MS by simultaneously evaluating the mass spectrum and retention index

Xiaoli Wei; Imhoi Koo; Seongho Kim; Xiang Zhang

We report a compound identification method (SimMR), which simultaneously evaluates the mass spectrum similarity and the retention index distance using an empirical mixture score function, for the analysis of GC-MS data. The performance of the developed SimMR method was compared to that of two existing compound identification strategies. One is the mass spectrum matching method without incorporation of retention index information (SM). The other is the method that sequentially evaluates the mass spectrum similarity and retention index distance (SeqMR). For comparison purposes, we used the NIST/EPA/NIH Mass Spectral Library 2005. Our study demonstrates that SimMR performs the best among the three compound identification methods, by improving the overall identification accuracy up to 1.53% and 4.81% compared to SeqMR and SM, respectively.
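The difference between the three strategies can be illustrated with a small sketch. The abstract does not give the empirical mixture score function, so the form used in `rank_simmr` (a weighted sum of spectral similarity and an exponentially decaying retention index term), along with the parameters `w`, `ri_scale` and `ri_window`, is purely hypothetical.

```python
import math

# Each candidate is a dict with a library name, a precomputed mass spectral
# similarity to the query ("sim"), and a library retention index ("ri").

def rank_sm(candidates):
    """SM: spectrum matching only, retention index ignored."""
    return max(candidates, key=lambda c: c["sim"])["name"]

def rank_seqmr(candidates, query_ri, ri_window=20.0):
    """SeqMR: filter by retention index window first, then rank by similarity."""
    kept = [c for c in candidates if abs(c["ri"] - query_ri) <= ri_window]
    # Fall back to all candidates if the window removes everything.
    return max(kept or candidates, key=lambda c: c["sim"])["name"]

def rank_simmr(candidates, query_ri, w=0.7, ri_scale=20.0):
    """SimMR-style: score similarity and RI distance together (hypothetical form)."""
    def mixture(c):
        ri_term = math.exp(-abs(c["ri"] - query_ri) / ri_scale)
        return w * c["sim"] + (1 - w) * ri_term
    return max(candidates, key=mixture)["name"]
```

For example, with a query retention index of 1205 and candidates `A` (similarity 0.95, RI 1500) and `B` (similarity 0.93, RI 1210), `rank_sm` picks `A` while the two retention-index-aware strategies pick `B`.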


Journal of Chromatography A | 2012

A large scale test dataset to determine optimal retention index threshold based on three mass spectral similarity measures.

Jun Zhang; Imhoi Koo; Bing Wang; Qingwei Gao; Chun-Hou Zheng; Xiang Zhang

Retention index (RI) is useful for metabolite identification. However, when RI is integrated with mass spectral similarity for metabolite identification, conflicting RI threshold settings are reported in the literature. In this study, a large-scale test dataset of 5844 compounds with both mass spectra and RI information was created from the National Institute of Standards and Technology (NIST) repetitive mass spectra (MS) and RI library. Three MS similarity measures, the NIST composite measure, the real part of the Discrete Fourier Transform (DFT.R) and the detail of the Discrete Wavelet Transform (DWT.D), were used to investigate the accuracy of compound identification using the test dataset. To imitate real identification experiments, the NIST MS main library was employed as the reference library and the test dataset was used as search data. Our study shows that the optimal RI thresholds are 22, 15, and 15 i.u. for the NIST composite, DFT.R and DWT.D measures, respectively, when the RI and mass spectral similarity are integrated for compound identification. Compared to mass spectrum matching alone, using both RI and mass spectral matching can improve the identification accuracy by 1.7%, 3.5%, and 3.5% for the three mass spectral similarity measures, respectively. It is concluded that the improvement from RI matching for compound identification heavily depends on the mass spectral similarity measure and the accuracy of the RI data.
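Finding an optimal RI threshold amounts to sweeping candidate thresholds over a labeled test set and keeping the one with the highest identification accuracy. The sketch below shows that sweep under stated assumptions: the data layout (`truth`, `query_ri`, `candidates` as `(name, similarity, ri)` tuples), the hard RI window, and the toy thresholds are all hypothetical and are not taken from the paper.

```python
def identify(candidates, query_ri, threshold):
    """Best-similarity candidate whose RI lies within the threshold, or None."""
    kept = [c for c in candidates if abs(c[2] - query_ri) <= threshold]
    if not kept:
        return None
    return max(kept, key=lambda c: c[1])[0]

def accuracy(cases, threshold):
    """Fraction of test cases identified correctly at a given RI threshold."""
    hits = sum(
        identify(case["candidates"], case["query_ri"], threshold) == case["truth"]
        for case in cases
    )
    return hits / len(cases)

def best_threshold(cases, thresholds):
    """Threshold with the highest accuracy (first one wins on ties)."""
    return max(thresholds, key=lambda t: accuracy(cases, t))
```

A threshold that is too tight discards the true compound, while one that is too loose lets high-similarity impostors back in, which is why an intermediate value tends to win in this kind of sweep.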

Collaboration


Dive into Imhoi Koo's collaborations.

Top Co-Authors

Xiang Zhang, University of Louisville
Seongho Kim, University of Louisville
Xiaoli Wei, University of Louisville
Xue Shi, University of Louisville
Xinmin Yin, University of Louisville
Biyun Shi, University of Louisville
Ming Song, University of Louisville