Publication


Featured research published by Huixiao Hong.


Environmental Health Perspectives | 2003

ArrayTrack--supporting toxicogenomic research at the U.S. Food and Drug Administration National Center for Toxicological Research.

Weida Tong; Xiaoxi Cao; Stephen Harris; Hongmei Sun; Hong Fang; James C. Fuscoe; Angela J. Harris; Huixiao Hong; Qian Xie; Roger Perkins; Leming Shi; Dan Casciano

The mapping of the human genome and the determination of corresponding gene functions, pathways, and biological mechanisms are driving the emergence of the new research fields of toxicogenomics and systems toxicology. Many technological advances such as microarrays are enabling this paradigm shift that indicates an unprecedented advancement in the methods of understanding the expression of toxicity at the molecular level. At the National Center for Toxicological Research (NCTR) of the U.S. Food and Drug Administration, core facilities for genomic, proteomic, and metabonomic technologies have been established that use standardized experimental procedures to support centerwide toxicogenomic research. Collectively, these facilities are continuously producing an unprecedented volume of data. NCTR plans to develop a toxicoinformatics integrated system (TIS) for the purpose of fully integrating genomic, proteomic, and metabonomic data with the data in public repositories as well as conventional in vitro and in vivo toxicology data. The TIS will enable data curation in accordance with standard ontology and provide or interface a rich collection of tools for data analysis and knowledge mining. In this article the design, practical issues, and functions of the TIS are discussed through presenting its prototype version, ArrayTrack, for the management and analysis of DNA microarray data. ArrayTrack is logically constructed of three linked components: a) a library (LIB) that mirrors critical data in public databases; b) a database (MicroarrayDB) that stores microarray experiment information that is Minimal Information About a Microarray Experiment (MIAME) compliant; and c) tools (TOOL) that operate on experimental and public data for knowledge discovery. Using ArrayTrack, we can select an analysis method from the TOOL and apply the method to selected microarray data stored in the MicroarrayDB; the analysis results can be linked directly to gene information in the LIB.


Nature Biotechnology | 2014

The concordance between RNA-seq and microarray data depends on chemical treatment and transcript abundance

Charles Wang; Binsheng Gong; Pierre R. Bushel; Jean Thierry-Mieg; Danielle Thierry-Mieg; Joshua Xu; Hong Fang; Huixiao Hong; Jie Shen; Zhenqiang Su; Joe Meehan; Xiaojin Li; Lu Yang; Haiqing Li; Paweł P. Łabaj; David P. Kreil; Dalila B. Megherbi; Stan Gaj; Florian Caiment; Joost H.M. van Delft; Jos Kleinjans; Andreas Scherer; Viswanath Devanarayan; Jian Wang; Yong Yang; Hui-Rong Qian; Lee Lancashire; Marina Bessarabova; Yuri Nikolsky; Cesare Furlanello

The concordance of RNA-sequencing (RNA-seq) with microarrays for genome-wide analysis of differential gene expression has not been rigorously assessed using a range of chemical treatment conditions. Here we use a comprehensive study design to generate Illumina RNA-seq and Affymetrix microarray data from the same liver samples of rats exposed in triplicate to varying degrees of perturbation by 27 chemicals representing multiple modes of action (MOAs). The cross-platform concordance in terms of differentially expressed genes (DEGs) or enriched pathways is linearly correlated with treatment effect size (R² ≈ 0.8). Furthermore, the concordance is also affected by transcript abundance and biological complexity of the MOA. RNA-seq outperforms microarray (93% versus 75%) in DEG verification as assessed by quantitative PCR, with the gain mainly due to its improved accuracy for low-abundance transcripts. Nonetheless, classifiers to predict MOAs perform similarly when developed using data from either platform. Therefore, the endpoint studied and its biological complexity, transcript abundance and the genomic application are important factors in transcriptomic research and for clinical and regulatory decision making.


BMC Bioinformatics | 2005

Cross-platform comparability of microarray technology: Intra-platform consistency and appropriate data analysis procedures are essential

Leming Shi; Weida Tong; Hong Fang; Uwe Scherf; Jing Han; Raj K. Puri; Felix W. Frueh; Federico Goodsaid; Lei Guo; Zhenqiang Su; Tao Han; James C. Fuscoe; Z Alex Xu; Tucker A. Patterson; Huixiao Hong; Qian Xie; Roger Perkins; James J. Chen; Daniel A. Casciano

Background: The acceptance of microarray technology in regulatory decision-making is being challenged by the existence of various platforms and data analysis methods. A recent report (E. Marshall, Science, 306, 630–631, 2004), by extensively citing the study of Tan et al. (Nucleic Acids Res., 31, 5676–5684, 2003), portrays a disturbingly negative picture of the cross-platform comparability, and, hence, the reliability of microarray technology.

Results: We reanalyzed Tan's dataset and found that the intra-platform consistency was low, indicating a problem in experimental procedures from which the dataset was generated. Furthermore, by using three gene selection methods (i.e., p-value ranking, fold-change ranking, and Significance Analysis of Microarrays (SAM)) on the same dataset we found that p-value ranking (the method emphasized by Tan et al.) results in much lower cross-platform concordance compared to fold-change ranking or SAM. Therefore, the low cross-platform concordance reported in Tan's study appears to be mainly due to a combination of low intra-platform consistency and a poor choice of data analysis procedures, instead of inherent technical differences among different platforms, as suggested by Tan et al. and Marshall.

Conclusion: Our results illustrate the importance of establishing calibrated RNA samples and reference datasets to objectively assess the performance of different microarray platforms and the proficiency of individual laboratories as well as the merits of various data analysis procedures. Thus, we are progressively coordinating the MAQC project, a community-wide effort for microarray quality control.
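The fold-change ranking and cross-platform concordance discussed in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact concordance measure, so the overlap fraction below is an assumption, and the gene IDs and log ratios are made-up illustrative values rather than data from the study.

```python
def top_genes_by_fold_change(log_ratios, n):
    """Return the n gene IDs with the largest absolute log ratio."""
    ranked = sorted(log_ratios, key=lambda gene: abs(log_ratios[gene]), reverse=True)
    return set(ranked[:n])


def overlap_fraction(genes_a, genes_b):
    """Fraction of selected genes shared by two lists (assumed concordance measure)."""
    if not genes_a or not genes_b:
        return 0.0
    return len(genes_a & genes_b) / min(len(genes_a), len(genes_b))


# Hypothetical log2 ratios for the same five genes measured on two platforms.
platform_a = {"g1": 2.1, "g2": -1.8, "g3": 0.2, "g4": 0.1, "g5": 1.5}
platform_b = {"g1": 1.9, "g2": -0.1, "g3": 0.3, "g4": -1.7, "g5": 1.4}

top_a = top_genes_by_fold_change(platform_a, 3)
top_b = top_genes_by_fold_change(platform_b, 3)
concordance = overlap_fraction(top_a, top_b)
print(sorted(top_a), sorted(top_b), round(concordance, 3))
```

Swapping the ranking key for a p-value would give the p-value ranking variant, which is how the two selection methods could be compared on the same data.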


Journal of Chemical Information and Computer Sciences | 2003

Decision Forest: Combining the Predictions of Multiple Independent Decision Tree Models

Weida Tong; Huixiao Hong; Hong Fang; Qian Xie; Roger Perkins

The techniques of combining the results of multiple classification models to produce a single prediction have been investigated for many years. In earlier applications, the multiple models to be combined were developed by altering the training set. The use of these so-called resampling techniques, however, poses the risk of reducing predictivity of the individual models to be combined and/or overfitting the noise in the data, which might result in poorer prediction of the composite model than the individual models. In this paper, we suggest a novel approach, named Decision Forest, that combines multiple Decision Tree models. Each Decision Tree model is developed using a unique set of descriptors. When models of similar predictive quality are combined using the Decision Forest method, quality compared to the individual models is consistently and significantly improved in both training and testing steps. An example will be presented for prediction of binding affinity of 232 chemicals to the estrogen receptor.
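The idea of combining models that each use a different set of descriptors can be sketched as follows. This is not the published Decision Forest implementation: one-split stumps stand in for full Decision Trees, averaging the per-model probabilities is an assumed combination rule, and the descriptor names (`logp`, `rings`, `mw`) and training values are hypothetical.

```python
from statistics import mean


def fit_stump(rows, labels, feature):
    """Fit a one-split model on a single descriptor.

    Returns (feature, threshold, p_low, p_high), where p_low and p_high are
    the fractions of active chemicals on each side of the threshold.
    """
    values = sorted({row[feature] for row in rows})
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        low = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
        high = [y for row, y in zip(rows, labels) if row[feature] > threshold]
        p_low, p_high = mean(low), mean(high)
        error = sum(
            abs(y - (p_low if row[feature] <= threshold else p_high))
            for row, y in zip(rows, labels)
        )
        if best is None or error < best[0]:
            best = (error, (feature, threshold, p_low, p_high))
    if best is None:  # descriptor is constant: fall back to the base rate
        base = mean(labels)
        return (feature, values[0], base, base)
    return best[1]


def predict_stump(stump, row):
    """Probability of activity from one single-descriptor model."""
    feature, threshold, p_low, p_high = stump
    return p_low if row[feature] <= threshold else p_high


def forest_predict(stumps, row):
    """Combine the models by averaging their probabilities (assumed rule)."""
    return mean(predict_stump(stump, row) for stump in stumps)


# Hypothetical training set: 1 = binder, 0 = non-binder.
rows = [
    {"logp": 1.0, "rings": 1, "mw": 150},
    {"logp": 1.5, "rings": 1, "mw": 300},
    {"logp": 3.0, "rings": 3, "mw": 180},
    {"logp": 3.5, "rings": 2, "mw": 320},
]
labels = [0, 0, 1, 1]

# Each model sees a different descriptor, so no descriptor is shared.
forest = [fit_stump(rows, labels, name) for name in ("logp", "rings", "mw")]

p_active = forest_predict(forest, {"logp": 3.2, "rings": 3, "mw": 200})
p_inactive = forest_predict(forest, {"logp": 1.2, "rings": 1, "mw": 160})
print(p_active > 0.5, p_inactive < 0.5)
```

Here the weakly informative `mw` model is outvoted by the two stronger models, which is the kind of benefit a consensus prediction is meant to provide.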


Expert Review of Molecular Diagnostics | 2011

Next-generation sequencing and its applications in molecular diagnostics.

Zhenqiang Su; Baitang Ning; Hong Fang; Huixiao Hong; Roger Perkins; Weida Tong; Leming Shi

DNA sequencing is a powerful approach for decoding a number of human diseases, including cancers. The advent of next-generation sequencing (NGS) technologies has reduced sequencing cost by orders of magnitude and significantly increased the throughput, making whole-genome sequencing a possible way for obtaining global genomic information about patients on whom clinical actions may be taken. However, the benefits offered by NGS technologies come with a number of challenges that must be adequately addressed before they can be transformed from research tools to routine clinical practices. This article provides an overview of four commonly used NGS technologies from Roche Applied Science/454 Life Sciences, Illumina, Life Technologies and Helicos Biosciences. The challenges in the analysis of NGS data and their potential applications in clinical diagnosis are also discussed.


Journal of Chemical Information and Modeling | 2008

Mold², molecular descriptors from 2D structures for chemoinformatics and toxicoinformatics.

Huixiao Hong; Qian Xie; Weigong Ge; Feng Qian; Hong Fang; Leming Shi; Zhenqiang Su; Roger Perkins; Weida Tong

Research applications in chemoinformatics and toxicoinformatics increasingly use representations of molecules in the form of numerical descriptors that capture the structural characteristics and properties of molecules. These representations are useful for ADME/toxicity prediction, diversity analysis, library design, QSAR/QSPR, virtual screening, and other purposes. Molecular descriptors have ranged from relatively simple forms calculated from simple two-dimensional (2D) chemical structures to more complex forms representing three-dimensional (3D) chemical structures or complex molecular fingerprints consisting of numerous bit positions to represent specific chemical information. The Mold² software was developed to enable the rapid calculation of a large and diverse set of descriptors encoding two-dimensional chemical structure information. Comparative analysis of Mold² descriptors with those calculated by Cerius², Dragon, and Molconn-Z on several data sets using Shannon entropy analysis demonstrated that Mold² descriptors convey a similar amount of information. In addition, using the same classification method, slightly better models were generated using Mold² descriptors compared to those generated using descriptors from the compared commercial software packages. The low computing cost for Mold² makes it suitable not only for small data sets, such as in QSAR, but also for large databases in virtual screening. High reproducibility and reliability are expected because Mold² does not require 3D structures. Mold² is freely available to the public (http://www.fda.gov/nctr/science/centers/toxicoinformatics/index.htm).
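As a rough illustration of how Shannon entropy can measure the information carried by one descriptor, the sketch below bins the descriptor's values across a data set and computes the entropy of the bin frequencies. The equal-width binning and the default of four bins are assumptions; the abstract does not describe the binning used in the paper.

```python
import math
from collections import Counter


def shannon_entropy(values, bins=4):
    """Entropy in bits of a descriptor's values after equal-width binning."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant descriptor carries no information
    width = (hi - lo) / bins
    # The maximum value would land in bin index `bins`, so clamp it.
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


constant_descriptor = [1, 1, 1, 1]
spread_descriptor = [0, 1, 2, 3]
print(shannon_entropy(constant_descriptor), shannon_entropy(spread_descriptor))
```

A descriptor whose values spread evenly over the bins reaches the maximum of log2(bins) bits, while a constant descriptor scores zero.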


Nature Communications | 2014

A rat RNA-Seq transcriptomic BodyMap across 11 organs and 4 developmental stages

James C. Fuscoe; Chen Zhao; Chao Guo; Meiwen Jia; Tao Qing; Desmond I. Bannon; Lee Lancashire; Wenjun Bao; Tingting Du; Heng Luo; Zhenqiang Su; Wendell D. Jones; Carrie L. Moland; William S. Branham; Feng Qian; Baitang Ning; Yan Li; Huixiao Hong; Lei Guo; Nan Mei; Tieliu Shi; Kenneth Wang; Russell D. Wolfinger; Yuri Nikolsky; Stephen J. Walker; Penelope Jayne Duerksen-Hughes; Christopher E. Mason; Weida Tong; Jean Thierry-Mieg; Danielle Thierry-Mieg

The rat has been used extensively as a model for evaluating chemical toxicities and for understanding drug mechanisms. However, its transcriptome across multiple organs, or developmental stages, has not yet been reported. Here we show, as part of the SEQC consortium efforts, a comprehensive rat transcriptomic BodyMap created by performing RNA-Seq on 320 samples from 11 organs of both sexes of juvenile, adolescent, adult and aged Fischer 344 rats. We catalogue the expression profiles of 40,064 genes, 65,167 transcripts, 31,909 alternatively spliced transcript variants and 2,367 non-coding genes/non-coding RNAs (ncRNAs) annotated in AceView. We find that organ-enriched, differentially expressed genes reflect the known organ-specific biological activities. A large number of transcripts show organ-specific, age-dependent or sex-specific differential expression patterns. We create a web-based, open-access rat BodyMap database of expression profiles with crosslinks to other widely used databases, anticipating that it will serve as a primary resource for biomedical research using the rat model.


BMC Bioinformatics | 2005

Microarray scanner calibration curves: characteristics and implications

Leming Shi; Weida Tong; Zhenqiang Su; Tao Han; Jing Han; Raj K. Puri; Hong Fang; Felix W. Frueh; Federico Goodsaid; Lei Guo; William S. Branham; James J. Chen; Z Alex Xu; Stephen Harris; Huixiao Hong; Qian Xie; Roger Perkins; James C. Fuscoe

Background: Microarray-based measurement of mRNA abundance assumes a linear relationship between the fluorescence intensity and the dye concentration. In reality, however, the calibration curve can be nonlinear.

Results: By scanning a microarray scanner calibration slide containing known concentrations of fluorescent dyes under 18 PMT gains, we were able to evaluate the differences in calibration characteristics of Cy5 and Cy3. First, the calibration curve for the same dye under the same PMT gain is nonlinear at both the high and low intensity ends. Second, the degree of nonlinearity of the calibration curve depends on the PMT gain. Third, the two PMTs (for Cy5 and Cy3) behave differently even under the same gain. Fourth, the background intensity for the Cy3 channel is higher than that for the Cy5 channel. The impact of such characteristics on the accuracy and reproducibility of measured mRNA abundance and the calculated ratios was demonstrated. Combined with simulation results, we provided explanations to the existence of ratio underestimation, intensity-dependence of ratio bias, and anti-correlation of ratios in dye-swap replicates. We further demonstrated that although Lowess normalization effectively eliminates the intensity-dependence of ratio bias, the systematic deviation from true ratios largely remained. A method of calculating ratios based on concentrations estimated from the calibration curves was proposed for correcting ratio bias.

Conclusion: It is preferable to scan microarray slides at fixed, optimal gain settings under which the linearity between concentration and intensity is maximized. Although normalization methods improve reproducibility of microarray measurements, they appear less effective in improving accuracy.
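The idea of calculating ratios from concentrations estimated via the calibration curves, rather than from raw intensities, can be sketched as below. The abstract does not specify how the curves are inverted, so the piecewise-linear interpolation is an assumption, and the calibration points are invented for illustration.

```python
def estimate_concentration(intensity, curve):
    """Invert a calibration curve by linear interpolation.

    curve is a list of (concentration, intensity) points sorted by
    increasing intensity. Values outside the measured range are clamped.
    """
    if intensity <= curve[0][1]:
        return curve[0][0]
    if intensity >= curve[-1][1]:
        return curve[-1][0]
    for (c_lo, i_lo), (c_hi, i_hi) in zip(curve, curve[1:]):
        if i_lo <= intensity <= i_hi:
            fraction = (intensity - i_lo) / (i_hi - i_lo)
            return c_lo + fraction * (c_hi - c_lo)
    raise ValueError("curve must be sorted by increasing intensity")


# Hypothetical nonlinear calibration points, one curve per dye channel.
CY5_CURVE = [(1, 100), (10, 900), (100, 6000), (1000, 20000)]
CY3_CURVE = [(1, 150), (10, 1200), (100, 9000), (1000, 30000)]

cy5_intensity, cy3_intensity = 6000, 1200
raw_ratio = cy5_intensity / cy3_intensity
corrected_ratio = (
    estimate_concentration(cy5_intensity, CY5_CURVE)
    / estimate_concentration(cy3_intensity, CY3_CURVE)
)
print(raw_ratio, corrected_ratio)
```

With these invented curves the raw intensity ratio is smaller than the concentration-based ratio, mirroring the ratio underestimation the abstract describes.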


Nature Biotechnology | 2006

Evaluation of external RNA controls for the assessment of microarray performance

Weida Tong; Richard Shippy; Xiaohui Fan; Hong Fang; Huixiao Hong; Michael S. Orr; Tzu-Ming Chu; Xu Guo; Patrick J. Collins; Yongming Andrew Sun; Sue-Jane Wang; Wenjun Bao; Russell D. Wolfinger; Svetlana Shchegrova; Lei Guo; Janet A. Warrington; Leming Shi

External RNA controls (ERCs), although important for microarray assay performance assessment, have yet to be fully implemented in the research community. As part of the MicroArray Quality Control (MAQC) study, two types of ERCs were implemented and evaluated; one was added to the total RNA in the samples before amplification and labeling; the other was added to the copyRNAs (cRNAs) before hybridization. ERC concentration-response curves were used across multiple commercial microarray platforms to identify problematic assays and potential sources of variation in the analytical process. In addition, the behavior of different ERC types was investigated, resulting in several important observations, such as the sample-dependent attributes of performance and the potential of using these control RNAs in a combinatorial fashion. This multiplatform investigation of the behavior and utility of ERCs provides a basis for articulating specific recommendations for their future use in evaluating assay performance across multiple platforms.


Environmental Health Perspectives | 2004

Assessment of prediction confidence and domain extrapolation of two structure-activity relationship models for predicting estrogen receptor binding activity.

Weida Tong; Qian Xie; Huixiao Hong; Leming Shi; Hong Fang; Roger Perkins

Quantitative structure–activity relationship (QSAR) methods have been widely applied in drug discovery, lead optimization, toxicity prediction, and regulatory decisions. Despite major advances in algorithms and software, QSAR models have inherent limitations associated with the size and chemical-structure diversity of the training set, experimental error, and many characteristics of structure representation and correlation algorithms. Whereas excellent fit to the training data may be readily attainable, often models fail to predict accurately chemicals that are outside their domain of applicability. A QSAR’s utility and, in the case of regulatory decisions, justification for usage increasingly depend on the ability to quantify a model’s potential for predicting unknown chemicals with some known degree of certainty. It is never possible to predict an unknown chemical with absolute certainty. Here we report on two QSAR models based on different data sets for classification of chemicals according to their ability to bind to the estrogen receptor. The models were developed by using a novel QSAR method, Decision Forest, which combines the results of multiple heterogeneous but comparable Decision Tree models to produce a consensus prediction. We used an extensive cross-validation process to define an applicability domain for model predictions based on two quantitative measures: prediction confidence and domain extrapolation. Together, these measures quantify the accuracy of each prediction within and outside of the training domain. Despite being based on large and diverse training sets, both QSAR models had poor accuracy for chemicals within the domain of low confidence, whereas good accuracy was obtained for those within the domain of high confidence. For prediction in the high confidence domain, accuracy was inversely proportional to the degree of domain extrapolation. The model with a larger training set of 1,092, compared with 232 for the other, was more accurate in predicting chemicals at larger domain extrapolation, and could be particularly useful for rapidly prioritizing potential endocrine disruptors from a large chemical universe.
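The notion of prediction confidence can be sketched by scaling the distance between a predicted probability and the 0.5 decision boundary, then splitting predictions into high-confidence and low-confidence groups. The abstract does not give the formula or a threshold, so both the scaling and the `cutoff` of 0.4 are assumptions made for illustration, as are the chemical names and probabilities.

```python
def prediction_confidence(p_active):
    """Scaled distance from the 0.5 decision boundary, from 0 to 1 (assumed form)."""
    return abs(p_active - 0.5) / 0.5


def split_by_confidence(predictions, cutoff=0.4):
    """Split {chemical: probability} into high- and low-confidence names."""
    high, low = [], []
    for chemical, p_active in predictions.items():
        target = high if prediction_confidence(p_active) >= cutoff else low
        target.append(chemical)
    return high, low


# Hypothetical consensus probabilities from a classification model.
predictions = {"chem_a": 0.95, "chem_b": 0.55, "chem_c": 0.10, "chem_d": 0.45}
high_confidence, low_confidence = split_by_confidence(predictions)
print(high_confidence, low_confidence)
```

Reporting accuracy separately for the two groups is one way to check whether accuracy is indeed higher in the high-confidence domain.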

Collaboration


Dive into Huixiao Hong's collaborations.

Top Co-Authors

Weida Tong, Food and Drug Administration
Roger Perkins, Food and Drug Administration
Hong Fang, Food and Drug Administration
Leming Shi, National Center for Toxicological Research
Zhenqiang Su, Food and Drug Administration
Weigong Ge, Food and Drug Administration
Qian Xie, National Center for Toxicological Research
Hui Wen Ng, Food and Drug Administration
James C. Fuscoe, National Center for Toxicological Research
Baitang Ning, National Center for Toxicological Research