Publication


Featured research published by Joon Jin Song.


BMC Bioinformatics | 2007

Bayesian meta-analysis models for microarray data: a comparative study

Erin M. Conlon; Joon Jin Song; Anna Liu

Background: With the growing abundance of microarray data, statistical methods are increasingly needed to integrate results across studies. Two common approaches for meta-analysis of microarrays include either combining gene expression measures across studies or combining summaries such as p-values, probabilities or ranks. Here, we compare two Bayesian meta-analysis models that are analogous to these methods.
Results: Two Bayesian meta-analysis models for microarray data have recently been introduced. The first model combines standardized gene expression measures across studies into an overall mean, accounting for inter-study variability, while the second combines probabilities of differential expression without combining expression values. Both models produce the gene-specific posterior probability of differential expression, which is the basis for inference. Since the standardized expression integration model includes inter-study variability, it may improve accuracy of results versus the probability integration model. However, due to the small number of studies typical in microarray meta-analyses, the variability between studies is challenging to estimate. The probability integration model eliminates the need to model variability between studies, and thus its implementation is more straightforward. We found in simulations of two and five studies that combining probabilities outperformed combining standardized gene expression measures for three comparison values: the percent of true discovered genes in meta-analysis versus individual studies; the percent of true genes omitted in meta-analysis versus separate studies; and the number of true discovered genes for fixed levels of Bayesian false discovery. We identified similar results when pooling two independent studies of Bacillus subtilis. We assumed that each study was produced from the same microarray platform with only two conditions: a treatment and control, and that the data sets were pre-scaled.
Conclusion: The Bayesian meta-analysis model that combines probabilities across studies does not aggregate gene expression measures, thus an inter-study variability parameter is not included in the model. This results in a simpler modeling approach than aggregating expression measures, which accounts for variability across studies. The probability integration model identified more true discovered genes and fewer true omitted genes than combining expression measures, for our data sets.
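As a rough illustration of the third comparison value above, the sketch below computes an expected Bayesian false discovery rate from gene-specific posterior probabilities of differential expression and finds the largest top-ranked gene list that stays under a fixed level. It assumes the common definition in which the rate is the average of (1 - posterior probability) over the selected genes. The function names and toy probabilities are hypothetical and are not taken from the paper.

```python
def bayesian_fdr(selected_probs):
    """Expected false discovery rate: mean of (1 - p) over the selected genes."""
    if not selected_probs:
        return 0.0
    return sum(1.0 - p for p in selected_probs) / len(selected_probs)


def select_genes(post_probs, fdr_level):
    """Indices of the largest top-ranked gene list with Bayesian FDR <= fdr_level.

    post_probs: posterior probabilities of differential expression, one per gene.
    """
    # Rank genes from most to least likely to be differentially expressed.
    order = sorted(range(len(post_probs)), key=lambda g: post_probs[g], reverse=True)
    chosen, error_sum = [], 0.0
    for g in order:
        error_sum += 1.0 - post_probs[g]
        # Genes are added in order of decreasing probability, so the running
        # mean never decreases and we can stop at the first violation.
        if error_sum / (len(chosen) + 1) > fdr_level:
            break
        chosen.append(g)
    return chosen


# Toy posterior probabilities (hypothetical values, for illustration only).
probs = [0.99, 0.20, 0.95, 0.60, 0.90]
genes = select_genes(probs, fdr_level=0.10)
```

Counting how many of the selected genes are truly differentially expressed, at the same level for each model, would then give the kind of comparison described in the abstract.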


Computational Biology and Chemistry | 2007

Clustering of time-course gene expression data using functional data analysis

Joon Jin Song; Ho Jin Lee; Jeffrey S. Morris; Sanghoon Kang

Clustering of gene expression data collected across time is receiving growing attention in the biological literature, since time-course experiments allow one to understand dynamic biological processes and identify genes governed by the same processes. It is believed that genes demonstrating similar expression profiles over time might give informative insight into how the underlying biological mechanisms work. In this paper, we propose a method based on functional data analysis (FNDA) to cluster time-dependent gene expression profiles. Treating the clustering problem in the FNDA setting provides a way to take time dependency into account by using a basis function expansion to describe the partially observed curves. We also discuss how to choose the number of bases in the basis function expansion in FNDA. A synthetic cycle data set and a real data set are used to demonstrate the proposed method, and the proposed and existing approaches are compared using adjusted Rand indices.
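The adjusted Rand index used for the comparisons above measures agreement between two clusterings, corrected for chance. The following is a minimal sketch of the standard pair-counting formula; the function name is hypothetical, and this is not the authors' code.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two clusterings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs to disagree on

    # Contingency counts and marginal cluster sizes.
    pair_counts = Counter(zip(labels_a, labels_b))
    size_a = Counter(labels_a)
    size_b = Counter(labels_b)

    # Number of item pairs placed together, jointly and in each clustering.
    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_a = sum(comb(c, 2) for c in size_a.values())
    sum_b = sum(comb(c, 2) for c in size_b.values())

    expected = sum_a * sum_b / comb(n, 2)
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:
        # Degenerate case, e.g. both clusterings use a single cluster.
        return 1.0
    return (sum_pairs - expected) / (max_index - expected)
```

A value of 1 means the two clusterings agree up to relabeling, and values near 0 mean agreement no better than chance.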


BMC Genomics | 2010

Transcriptional profiling of host gene expression in chicken embryo lung cells infected with laryngotracheitis virus.

Jeong Yoon Lee; Joon Jin Song; Ann Wooming; Xianyao Li; Huaijun Zhou; Walter Bottje; Byung-Whi Kong

Background: Infection by infectious laryngotracheitis virus (ILTV; gallid herpesvirus 1) causes acute respiratory diseases in chickens, often with high mortality. To better understand host-ILTV interactions at the host transcriptional level, a microarray analysis was performed using 4 × 44 K Agilent chicken custom oligo microarrays.
Results: Microarrays were hybridized using the two-color hybridization method with total RNA extracted from ILTV-infected chicken embryo lung cells at 0, 1, 3, 5, and 7 days post infection (dpi). Results showed that 789 genes were differentially expressed in response to ILTV infection, including genes involved in the immune system (cytokines, chemokines, MHC, and NF-κB), cell cycle regulation (cyclin B2, CDK1, and CKI3), matrix metalloproteinases (MMPs) and cellular metabolism. Differential expression for 20 out of 789 genes was confirmed by quantitative reverse transcription-PCR (qRT-PCR). A bioinformatics tool (Ingenuity Pathway Analysis) used to analyze biological functions and pathways on the group of 789 differentially expressed genes revealed 21 possible gene networks with intermolecular connections among 275 functionally identified genes. These 275 genes were classified into a number of functional groups that included cancer, genetic disorder, cellular growth and proliferation, and cell death.
Conclusion: The results of this study provide comprehensive knowledge on global gene expression and the biological functionalities of differentially expressed genes in chicken embryo lung cells in response to ILTV infections.


Proteomics | 2008

A Novel Wavelet-based Thresholding Method for the Pre-processing of Mass Spectrometry Data that Accounts for Heterogeneous Noise

Deukwoo Kwon; Marina Vannucci; Joon Jin Song; Jaesik Jeong; Ruth M. Pfeiffer

In recent years there has been increased interest in using protein mass spectroscopy to discriminate diseased from healthy individuals, with the aim of discovering molecular markers for disease. A crucial step before any statistical analysis is the pre‐processing of the mass spectrometry data. Statistical results are typically strongly affected by the specific pre‐processing techniques used. One important pre‐processing step is the removal of chemical and instrumental noise from the mass spectra. Wavelet denoising techniques are a standard method for denoising. Existing techniques, however, do not accommodate errors that vary across the mass spectrum, but instead assume a homogeneous error structure. In this paper we propose a novel wavelet denoising approach that deals with heterogeneous errors by incorporating a variance change point detection method in the thresholding procedure. We study our method on real and simulated mass spectrometry data and show that it improves the performance of peak detection methods.
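To make the idea of thresholding with heterogeneous noise concrete, here is a simplified sketch: a one-level Haar transform, a separate robust noise estimate for each segment of detail coefficients, and soft thresholding with a segment-specific threshold. Several things are assumptions rather than the paper's method: the change points are supplied by hand instead of being detected, the Haar wavelet and a universal-style threshold are used for simplicity, and all function names are hypothetical.

```python
import math
from statistics import median


def haar_forward(signal):
    """One-level Haar transform of an even-length signal: (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def soft_threshold(x, t):
    """Shrink x toward zero by t; values with |x| <= t become zero."""
    return math.copysign(max(abs(x) - t, 0.0), x)


def denoise(signal, change_points):
    """Soft-threshold detail coefficients with a separate noise level per segment.

    change_points: indices into the detail coefficients where the noise
    variance is assumed to change (given by hand here, not detected).
    """
    approx, detail = haar_forward(signal)
    bounds = [0] + sorted(change_points) + [len(detail)]
    shrunk = []
    for lo, hi in zip(bounds, bounds[1:]):
        seg = detail[lo:hi]
        if not seg:
            continue
        # Robust noise estimate: median absolute coefficient, scaled for Gaussian noise.
        sigma = median(abs(d) for d in seg) / 0.6745
        # Universal-style threshold using the segment's own noise level.
        t = sigma * math.sqrt(2.0 * math.log(max(len(seg), 2)))
        shrunk.extend(soft_threshold(d, t) for d in seg)
    return haar_inverse(approx, shrunk)
```

With a single segment (an empty `change_points` list) this reduces to ordinary homogeneous thresholding, which is the assumption the abstract says existing techniques make.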


Computational Statistics & Data Analysis | 2011

Analysis of zero-inflated clustered count data: A marginalized model approach

Keunbaik Lee; Yongsung Joo; Joon Jin Song; Dee Wood Harper

Min and Agresti (2005) proposed random effect hurdle models for zero-inflated clustered count data with two-part random effects for a binary component and a truncated count component. In this paper, we propose new marginalized models for zero-inflated clustered count data using random effects. The marginalized models are similar to Dobbie and Welsh's (2001) model, in which generalized estimating equations were exploited to find estimates. However, our proposed models are based on a likelihood-based approach. A quasi-Newton algorithm is developed for estimation. We use these methods to carefully analyze two real datasets.
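The two-part structure of a hurdle model can be sketched as follows: a binary component decides whether the count is zero, and a zero-truncated count distribution generates the positive values. This is a minimal Poisson version without random effects or covariates, so it is much simpler than the models discussed above; the function and parameter names are hypothetical.

```python
import math


def hurdle_poisson_pmf(y, p_positive, lam):
    """P(Y = y) under a Poisson hurdle model without random effects.

    p_positive: probability that the count clears the hurdle (Y > 0).
    lam: rate (> 0) of the zero-truncated Poisson for positive counts.
    """
    if y < 0:
        return 0.0
    if y == 0:
        return 1.0 - p_positive
    # Zero-truncated Poisson: renormalize the Poisson mass over y >= 1.
    log_pois = -lam + y * math.log(lam) - math.lgamma(y + 1)
    return p_positive * math.exp(log_pois) / (1.0 - math.exp(-lam))


def hurdle_mean(p_positive, lam):
    """Marginal mean E[Y] implied jointly by the two parts of the model."""
    return p_positive * lam / (1.0 - math.exp(-lam))
```

The marginal mean depends on both components at once. A marginalized model, roughly speaking, places the regression structure on that overall mean rather than on each part separately.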


Computational Biology and Chemistry | 2008

Research article: Optimal classification for time-course gene expression data using functional data analysis

Joon Jin Song; Weiguo Deng; Ho-Jin Lee; Deukwoo Kwon

Classification problems have received considerable attention in biological and medical applications. In particular, classification methods combined with microarray technology play an important role in diagnosing and predicting disease, such as cancer, in medical research. The primary objective in classification is to build an optimal classifier based on the training sample in order to predict the unknown class in the test sample. In this paper, we propose a unified approach for optimal gene classification in conjunction with functional principal component analysis (FPCA) in a functional data analysis (FNDA) framework to classify time-course gene expression profiles based on information from the patterns. To derive an optimal classifier in FNDA, we also propose to find the optimal number of bases in the smoothing step and of functional principal components in FPCA using a cross-validation technique, and compare the performance of some popular classification techniques in the proposed setting. We illustrate the proposed method with a simulation study and a real-world data analysis.


British Journal of Obstetrics and Gynaecology | 2016

Gestational weight gain and preterm birth in obese women: a systematic review and meta‐analysis

Mary Ann Faucher; Marie Hastings-Tolsma; Joon Jin Song; Darryn S. Willoughby; S Gerding Bader

Prepregnant obesity is a global concern and gestational weight gain has been found to influence the risks of preterm birth.


BMC Immunology | 2012

Understanding mechanisms of vitiligo development in Smyth line of chickens by transcriptomic microarray analysis of evolving autoimmune lesions

Fengying Shi; Byung-Whi Kong; Joon Jin Song; Jeong Yoon Lee; Robert L. Dienglewicz; G. F. Erf

Background: The Smyth line (SL) of chicken is an excellent avian model for human autoimmune vitiligo. The etiology of vitiligo is complicated and far from clear. In order to better understand critical components leading to vitiligo development, cDNA microarray technology was used to compare gene expression profiles in the target tissue (the growing feather) of SL chickens at different vitiligo (SLV) states.
Results: Compared to the reference sample, which was from Brown line chickens (the parental control), 395, 522, 524 and 526 out of the 44 k genes were differentially expressed (DE) (P ≤ 0.05) in feather samples collected from SL chickens that never developed SLV (NV), from SLV chickens prior to SLV onset (EV), during active loss of pigmentation (AV), and after complete loss of melanocytes (CV). Comparisons of gene expression levels within SL samples (NV, EV, AV and CV) revealed 206 DE genes, which could be categorized into immune system-, melanocyte-, stress-, and apoptosis-related genes based on the biological functions of their corresponding proteins. The autoimmune nature of SLV was supported by predominant presence of immune system related DE genes and their remarkably elevated expression in AV samples compared to NV, EV and/or CV samples. Melanocyte loss was confirmed by decreased expression of genes for melanocyte related proteins in AV and CV samples compared to NV and EV samples. In addition, SLV development was also accompanied by altered expression of genes associated with disturbed redox status and apoptosis. Ingenuity Pathway Analysis of DE genes provided functional interpretations involving but not limited to innate and adaptive immune response, oxidative stress and cell death.
Conclusions: The microarray results provided comprehensive information at the transcriptome level supporting the multifactorial etiology of vitiligo, where together with apparent inflammatory/innate immune activity and oxidative stress, the adaptive immune response plays a predominant role in melanocyte loss.


BioMed Research International | 2013

Assessing the Detection Capacity of Microarrays as Bio/Nanosensing Platforms

Ju Seok Lee; Joon Jin Song; Russell J. Deaton; Jin-Woo Kim

The microarray is one of the most powerful detection systems, with multiplexing and high-throughput capability. It has significant potential as a versatile biosensing platform for environmental monitoring, pathogen detection, medical therapeutics, and drug screening, to name a few. To date, however, microarray applications are still limited to preliminary screening of genome-scale transcription profiling or gene ontology analysis. Expanding the utility of microarrays as a detection tool for various biological and biomedical applications requires information about performance, such as the limits of detection and quantification, which are considered essential for determining the detection sensitivity of sensing devices. Here we present a calibration design that integrates detection limit theory and linear dynamic range to obtain a performance index of a microarray detection platform, using oligonucleotide arrays as a model system. Two different types of limits of detection and quantification are proposed, based on the prediction or tolerance interval, for two common cyanine fluorescence dyes, Cy3 and Cy5. Besides oligonucleotides, the proposed method can be generalized to other microarray formats with various biomolecules such as complementary DNA, protein, peptide, carbohydrate, tissue, or other small biomolecules. It can also be easily applied to other fluorescence dyes for further dye chemistry improvement.
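For orientation, the sketch below computes limits of detection and quantification from a linear calibration curve using the widely used 3.3σ/slope and 10σ/slope convention, where σ is the residual standard deviation of the fit. This is a swapped-in, simpler technique: the paper's limits are based on prediction or tolerance intervals, which are not reproduced here. The function names and calibration values are hypothetical.

```python
import math
from statistics import fmean


def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x; returns (slope, intercept)."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return slope, my - slope * mx


def detection_limits(conc, signal):
    """Return (LOD, LOQ) in concentration units from a linear calibration.

    Uses the conventional 3.3 * sigma / slope and 10 * sigma / slope rules,
    with sigma the residual standard deviation (n - 2 degrees of freedom).
    """
    if len(conc) < 3:
        raise ValueError("need at least three calibration points")
    slope, intercept = fit_line(conc, signal)
    residuals = [s - (intercept + slope * c) for c, s in zip(conc, signal)]
    sigma = math.sqrt(sum(r * r for r in residuals) / (len(conc) - 2))
    return 3.3 * sigma / slope, 10.0 * sigma / slope


# Hypothetical calibration data: concentration versus fluorescence signal.
conc = [0.0, 1.0, 2.0, 3.0]
signal = [0.1, 1.9, 4.2, 5.8]
lod, loq = detection_limits(conc, signal)
```

Both limits are only meaningful inside the linear dynamic range of the calibration, which is why the abstract treats the two together.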


Computational Biology and Chemistry | 2009

Brief communication: Classification for high-throughput data with an optimal subset of principal components

Joon Jin Song; Yuan Ren; Fenglan Yan

High-throughput data have been widely used in biological and medical studies to discover gene and protein functions. Because of the high dimensionality, principal component analysis (PCA) is often used for dimension reduction. When a few principal components (PCs) are selected for dimension reduction or considered for dimension determination, they are typically ranked by their variances (eigenvalues). However, this approach is not always effective in subsequent multivariate analysis, particularly classification. To maximize the information retained by a subset of the components, we apply a different ranking criterion, the canonical variate criterion, which considers within- and between-group variance rather than the total variance used in the classical criterion. Four prevalent classification methods are considered and compared using leave-one-out cross-validation. These methods are illustrated with three real high-throughput data sets: two microarray data sets and a nuclear magnetic resonance spectra data set.
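The difference between ranking components by total variance and ranking them by group separation can be sketched with a simple per-component ratio of between-group to within-group sum of squares. This is an illustrative stand-in for the canonical variate criterion, whose exact form is not given in the abstract. It assumes the component scores have already been computed, and the function names and toy scores are hypothetical.

```python
from statistics import fmean


def between_within_ratio(scores, labels):
    """Between-group over within-group sum of squares for one component's scores."""
    overall = fmean(scores)
    groups = {}
    for s, g in zip(scores, labels):
        groups.setdefault(g, []).append(s)
    means = {g: fmean(v) for g, v in groups.items()}
    between = sum(len(v) * (means[g] - overall) ** 2 for g, v in groups.items())
    within = sum((s - means[g]) ** 2 for g, v in groups.items() for s in v)
    if within == 0:
        # Perfect separation, or a constant component that carries no information.
        return float("inf") if between > 0 else 0.0
    return between / within


def rank_components(score_columns, labels):
    """Component indices ordered by decreasing between/within ratio."""
    ratios = [between_within_ratio(col, labels) for col in score_columns]
    return sorted(range(len(ratios)), key=lambda k: ratios[k], reverse=True)


# Toy scores: component 0 has the larger variance but does not separate the
# two groups, while component 1 has a small variance but separates them well.
labels = ["a", "a", "b", "b"]
pc_scores = [
    [-3.0, 3.0, -3.1, 3.1],
    [-1.0, -0.9, 1.0, 0.9],
]
ranking = rank_components(pc_scores, labels)
```

In this toy case a variance-based ranking would put component 0 first, whereas the group-aware ratio prefers component 1, which is the situation the abstract describes for classification.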

Collaboration


Dive into Joon Jin Song's collaborations.

Top Co-Authors


Jin-Woo Kim

University of Arkansas


Ju Seok Lee

University of Arkansas
