Publication


Featured research published by Mina Rho.


Nucleic Acids Research | 2010

FragGeneScan: predicting genes in short and error-prone reads

Mina Rho; Haixu Tang; Yuzhen Ye

The advances of next-generation sequencing technology have facilitated metagenomics research that attempts to determine directly the whole collection of genetic material within an environmental sample (i.e. the metagenome). Identification of genes directly from short reads has become an important yet challenging problem in annotating metagenomes, since the assembly of metagenomes is often not available. Gene predictors developed for whole genomes (e.g. Glimmer) and recently developed for metagenomic sequences (e.g. MetaGene) show a significant decrease in performance as the sequencing error rates increase, or as reads get shorter. We have developed a novel gene prediction method, FragGeneScan, which combines sequencing error models and codon usages in a hidden Markov model to improve the prediction of protein-coding regions in short reads. The performance of FragGeneScan was comparable to Glimmer and MetaGene for complete genomes. For short reads, however, FragGeneScan consistently outperformed MetaGene (accuracy improved ∼62% for reads of 400 bases with 1% sequencing errors, and ∼18% for short reads of 100 bases that are error free). When applied to metagenomes, FragGeneScan recovered substantially more genes than MetaGene predicted (>90% of the genes identified by homology search), and many novel genes with no homologs in current protein sequence databases.
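The abstract above relies on decoding a hidden Markov model. As a rough illustration of that general technique only, the sketch below runs Viterbi decoding on a toy two-state model. The state names, probabilities, and function name are assumptions made up for this example; FragGeneScan's actual model (codon usage and sequencing error states) is far richer and is not reproduced here.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state path for a sequence of observations."""
    if not observations:
        return []
    # Work in log space so that long sequences do not underflow.
    scores = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
               for s in states}]
    back = [{}]  # back[i][s] = best predecessor of state s at position i
    for obs in observations[1:]:
        row, ptr = {}, {}
        for s in states:
            best_prev = max(
                states, key=lambda p: scores[-1][p] + math.log(trans_p[p][s]))
            row[s] = (scores[-1][best_prev]
                      + math.log(trans_p[best_prev][s])
                      + math.log(emit_p[s][obs]))
            ptr[s] = best_prev
        scores.append(row)
        back.append(ptr)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: scores[-1][s])]
    for ptr in reversed(back[1:]):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Toy parameters (hypothetical): "coding" regions are GC-rich,
# "noncoding" regions are AT-rich, and states tend to persist.
STATES = ("noncoding", "coding")
START = {"noncoding": 0.5, "coding": 0.5}
TRANS = {"noncoding": {"noncoding": 0.9, "coding": 0.1},
         "coding": {"noncoding": 0.1, "coding": 0.9}}
EMIT = {"noncoding": {"A": 0.4, "T": 0.4, "G": 0.1, "C": 0.1},
        "coding": {"A": 0.1, "T": 0.1, "G": 0.4, "C": 0.4}}

if __name__ == "__main__":
    print(viterbi("AAAAGGGG", STATES, START, TRANS, EMIT))
```

In this toy setting the decoder labels the AT-rich prefix as noncoding and the GC-rich suffix as coding; a real gene finder would emit codons and model insertion and deletion errors rather than single bases.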


PLOS Genetics | 2012

Diverse CRISPRs Evolving in Human Microbiomes

Mina Rho; Yu Wei Wu; Haixu Tang; Thomas G. Doak; Yuzhen Ye

CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) loci, together with cas (CRISPR–associated) genes, form the CRISPR/Cas adaptive immune system, a primary defense strategy that eubacteria and archaea mobilize against foreign nucleic acids, including phages and conjugative plasmids. Short spacer sequences separated by the repeats are derived from foreign DNA and direct interference to future infections. The availability of hundreds of shotgun metagenomic datasets from the Human Microbiome Project (HMP) enables us to explore the distribution and diversity of known CRISPRs in human-associated microbial communities and to discover new CRISPRs. We propose a targeted assembly strategy to reconstruct CRISPR arrays, which whole-metagenome assemblies fail to identify. For each known CRISPR type (identified from reference genomes), we use its direct repeat consensus sequence to recruit reads from each HMP dataset and then assemble the recruited reads into CRISPR loci; the unique spacer sequences can then be extracted for analysis. We also identified novel CRISPRs or new CRISPR variants in contigs from whole-metagenome assemblies and used targeted assembly to more comprehensively identify these CRISPRs across samples. We observed that the distributions of CRISPRs (including 64 known and 86 novel ones) are largely body-site specific. We provide detailed analysis of several CRISPR loci, including novel CRISPRs. For example, known streptococcal CRISPRs were identified in most oral microbiomes, totaling ∼8,000 unique spacers: samples resampled from the same individual and oral site shared the most spacers; different oral sites from the same individual shared significantly fewer, while different individuals had almost no common spacers, indicating the impact of subtle niche differences on the evolution of CRISPR defenses. We further demonstrate potential applications of CRISPRs to the tracing of rare species and the virus exposure of individuals. 
This work indicates the importance of effective identification and characterization of CRISPR loci to the study of the dynamic ecology of microbiomes.
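The recruit-then-extract idea in this abstract can be sketched in a few lines. This is a simplified illustration under stated assumptions: reads are recruited by exact matching of the repeat on either strand (the study presumably tolerates mismatches), the assembly step is omitted entirely, and the function names and the toy repeat sequence are hypothetical.

```python
def reverse_complement(seq):
    """Reverse-complement a DNA string; unknown symbols become N."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp.get(base, "N") for base in reversed(seq))


def recruit_reads(reads, repeat):
    """Keep reads containing the repeat on either strand.

    Recruited reads are returned in the orientation of the repeat.
    Exact matching is an assumption made to keep the sketch short.
    """
    rc_repeat = reverse_complement(repeat)
    recruited = []
    for read in reads:
        if repeat in read:
            recruited.append(read)
        elif rc_repeat in read:
            recruited.append(reverse_complement(read))
    return recruited


def extract_spacers(sequence, repeat):
    """Return the sequences lying between consecutive copies of the repeat."""
    pieces = sequence.split(repeat)
    # The first and last pieces are flanking sequence, not spacers.
    return [piece for piece in pieces[1:-1] if piece]


if __name__ == "__main__":
    repeat = "GTTTTAGAGCTA"  # toy repeat, for illustration only
    locus = "CC" + repeat + "AAAC" + repeat + "GGGT" + repeat + "TT"
    reads = [locus, reverse_complement(locus), "ACGTACGT"]
    for read in recruit_reads(reads, repeat):
        print(extract_spacers(read, repeat))
```

Collecting the extracted spacers into a set per sample would then allow the kind of shared-spacer comparison between samples that the abstract describes.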


Annual Review of Genomics and Human Genetics | 2011

The repatterning of eukaryotic genomes by random genetic drift.

Michael Lynch; Francesco Catania; Jean-François Gout; Mina Rho

Recent observations on rates of mutation, recombination, and random genetic drift highlight the dramatic ways in which fundamental evolutionary processes vary across the divide between unicellular microbes and multicellular eukaryotes. Moreover, population-genetic theory suggests that the range of variation in these parameters is sufficient to explain the evolutionary diversification of many aspects of genome size and gene structure found among phylogenetic lineages. Most notably, large eukaryotic organisms that experience elevated magnitudes of random genetic drift are susceptible to the passive accumulation of mutationally hazardous DNA that would otherwise be eliminated by efficient selection. Substantial evidence also suggests that variation in the population-genetic environment influences patterns of protein evolution, with the emergence of certain kinds of amino-acid substitutions and protein-protein complexes only being possible in populations with relatively small effective sizes. These observations imply that the ultimate origins of many of the major genomic and proteomic disparities between prokaryotes and eukaryotes and among eukaryotic lineages have been molded as much by intrinsic variation in the genetic and cellular features of species as by external ecological forces.


BMC Genomics | 2007

De novo identification of LTR retrotransposons in eukaryotic genomes

Mina Rho; Jeong Hyeon Choi; Sun Kim; Michael Lynch; Haixu Tang

Background: LTR retrotransposons are a class of mobile genetic elements containing two similar long terminal repeats (LTRs). Currently, LTR retrotransposons are annotated in eukaryotic genomes mainly through the conventional homology searching approach, which is limited to annotating known elements.

Results: In this paper, we report a de novo computational method that can identify new LTR retrotransposons without relying on a library of known elements. Specifically, our method identifies intact LTR retrotransposons by using an approximate string matching technique and protein domain analysis. In addition, it identifies partially deleted or solo LTRs using profile Hidden Markov Models (pHMMs). As a result, this method can de novo identify all types of LTR retrotransposons. We tested this method on two pairs of eukaryotic genomes, C. elegans vs. C. briggsae and D. melanogaster vs. D. pseudoobscura. LTR retrotransposons in C. elegans and D. melanogaster have been intensively studied using conventional annotation methods. Compared with previous work, we identified new intact LTR retroelements and new putative families, which implies that there may still be retroelements left to be discovered even in well-studied organisms. To assess the sensitivity and accuracy of our method, we compared our results with a previously published method, LTR_STRUC, which predominantly identifies full-length LTR retrotransposons. In summary, both methods identified a comparable number of intact LTR retroelements. However, our method can identify nearly all known elements in C. elegans, while LTR_STRUC missed about 1/3 of them. Our method also identified more known LTR retroelements than LTR_STRUC in the D. melanogaster genome. We also identified some LTR retroelements in the other two genomes, C. briggsae and D. pseudoobscura, which have not been completely finished. In contrast, the conventional method failed to identify those elements. Finally, the phylogenetic and chromosomal distributions of the identified elements are discussed.

Conclusion: We report a novel method for de novo identification of LTR retrotransposons in eukaryotic genomes with favorable performance over the existing methods.


Scientific Reports | 2015

Stability of Gut Enterotypes in Korean Monozygotic Twins and Their Association with Biomarkers and Diet

Mi Young Lim; Mina Rho; Yun-Mi Song; Kayoung Lee; Joohon Sung; GwangPyo Ko

Studies on the human gut microbiota have suggested that human individuals could be categorized into enterotypes based on the compositions of their gut microbial communities. Here, we report that the gut microbiota of healthy Koreans are clustered into two enterotypes, dominated by either Bacteroides (enterotype 1) or Prevotella (enterotype 2). More than 72% of the paired fecal samples from monozygotic twin pairs were assigned to the same enterotype. Our longitudinal analysis of these twins indicated that more than 80% of the individuals belonged to the same enterotype after about a 2-year interval. Microbial functions based on KEGG pathways were also divided into two clusters. For enterotype 2, 100% of the samples belonged to the same functional cluster, while for enterotype 1, approximately half of the samples belonged to each functional cluster. Enterotype 2 was significantly associated with long-term dietary habits that were high in dietary fiber, various vitamins, and minerals. Among anthropometrical and biochemical traits, the level of serum uric acid was associated with enterotype. These results suggest that host genetics as well as host properties such as long-term dietary patterns and a particular clinical biomarker could be important contributors to the enterotype of an individual.


Nucleic Acids Research | 2009

MGEScan-non-LTR: computational identification and classification of autonomous non-LTR retrotransposons in eukaryotic genomes

Mina Rho; Haixu Tang

Computational methods for genome-wide identification of mobile genetic elements (MGEs) have become increasingly necessary for both genome annotation and evolutionary studies. Non-long terminal repeat (non-LTR) retrotransposons are a class of MGEs that have been found in most eukaryotic genomes, sometimes in extremely high numbers. In this article, we present a computational tool, MGEScan-non-LTR, for the identification of non-LTR retrotransposons in genomic sequences, following a computational approach inspired by a generalized hidden Markov model (GHMM). Three different states represent two different protein domains and inter-domain linker regions encoded in the non-LTR retrotransposons, and their scores are evaluated by using profile hidden Markov models (for protein domains) and Gaussian Bayes classifiers (for linker regions), respectively. In order to classify the non-LTR retrotransposons into one of the 12 previously characterized clades using the same model, we defined separate states for different clades. MGEScan-non-LTR was tested on the genome sequences of four eukaryotic organisms, Drosophila melanogaster, Daphnia pulex, Ciona intestinalis and Strongylocentrotus purpuratus. For the D. melanogaster genome, MGEScan-non-LTR found all known ‘full-length’ elements and simultaneously classified them into the clades CR1, I, Jockey, LOA and R1. Notably, for the D. pulex genome, in which no non-LTR retrotransposon has been annotated, MGEScan-non-LTR found a significantly larger number of elements than did RepeatMasker, using the current version of the RepBase Update library. We also identified novel elements in the other two genomes, which have only been partially studied for non-LTR retrotransposons.


Biochemical and Biophysical Research Communications | 2014

Detection of PIWI and piRNAs in the mitochondria of mammalian cancer cells.

ChangHyuk Kwon; Hyosun Tak; Mina Rho; Hae Ryung Chang; Yon Hui Kim; Kyung Tae Kim; Curt Balch; Eun Kyung Lee; Seungyoon Nam

Piwi-interacting RNAs (piRNAs) are 26-31 nt small noncoding RNAs that are processed from their longer precursor transcripts by Piwi proteins. Localization of Piwi and piRNA has been reported mostly in the nucleus and cytoplasm of germ-line cells of higher eukaryotes, where known piRNA sequences are believed to be located in repeat regions of the nuclear genome. However, the localization of PIWI and piRNA in the mitochondria of mammalian somatic cells remains largely unknown. We identified 29 piRNA sequence alignments from various regions of the human mitochondrial genome. Twelve out of 29 piRNA sequences matched stem-loop fragment sequences of seven distinct tRNAs. We observed their actual expression in mitochondrial subcellular fractions by inspecting mitochondrial-specific small RNA-Seq datasets. Of interest, the majority of the 29 piRNAs overlapped with multiple longer transcripts (expressed sequence tags) that are unique to the human mitochondrial genome. The presence of mature piRNAs in mitochondria was detected by qRT-PCR of mitochondrial subcellular RNAs. Further validation showed detection of Piwi by colocalization using anti-Piwil1 and mitochondria organelle-specific protein antibodies.


International Conference on Cloud Computing | 2009

Biomedical Case Studies in Data Intensive Computing

Geoffrey C. Fox; Xiaohong Qiu; Scott Beason; Jong Youl Choi; Jaliya Ekanayake; Thilina Gunarathne; Mina Rho; Haixu Tang; Neil Devadasan; Gilbert C. Liu

Many areas of science are seeing a data deluge coming from new instruments, myriads of sensors and exponential growth in electronic records. We take two examples: one is the analysis of gene sequence data (35339 Alu sequences), and the other is a study of medical information (over 100,000 patient records) in Indianapolis and their relationship to Geographic Information System and Census data available for 635 Census Blocks in Indianapolis. We look at initial processing (such as Smith-Waterman dissimilarities), clustering (using robust deterministic annealing) and multidimensional scaling to map high-dimensional data to 3D for convenient visualization. We show how scaling pipelines can be produced that can be implemented using either cloud technologies or MPI, and we compare the two. This study illustrates challenges in integrating data exploration tools with a variety of different architectural requirements and natural programming models. We present preliminary results for an end-to-end study of two complete applications.
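The initial processing step mentioned in this abstract, Smith-Waterman dissimilarities, can be illustrated with a minimal sketch. The scoring parameters and the normalization used to turn a local alignment score into a dissimilarity are assumptions chosen for this example; the abstract does not state which scoring scheme or formula the study used.

```python
def smith_waterman_score(a, b, match=3, mismatch=-3, gap=-2):
    """Best local alignment score of a and b with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def dissimilarity(a, b, **scoring):
    """Map an alignment score to [0, 1]; 0 means one sequence aligns fully.

    Assumed normalization: divide by the smaller self-alignment score.
    """
    self_min = min(smith_waterman_score(a, a, **scoring),
                   smith_waterman_score(b, b, **scoring))
    if self_min == 0:
        return 0.0
    return 1.0 - smith_waterman_score(a, b, **scoring) / self_min


if __name__ == "__main__":
    print(smith_waterman_score("GTTAC", "GTTGAC"))
    print(dissimilarity("GTTAC", "GTTGAC"))
```

Computing this value for every pair of sequences yields the kind of dissimilarity matrix that clustering and multidimensional scaling take as input. The pairwise step is quadratic in the number of sequences, which is why the abstract frames it as a data-intensive workload.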


Methods | 2015

Deciphering the human microbiome using next-generation sequencing data and bioinformatics approaches

Yihwan Kim; InSong Koh; Mina Rho

The human microbiome is one of the key factors affecting the host immune system and metabolic functions that are not encoded in the human genome. Culture-independent analysis of the human microbiome using a metagenomics approach allows us to investigate the compositions and functions of the human microbiome. Computational methods analyze the microbial community by using specific marker genes or by using shotgun sequencing of the entire microbial community. Taxonomy profiling is conducted by using reference sequences or by de novo clustering of a specific region of the sequences. Functional profiling, which is mainly based on sequence similarity, is more challenging since about half of the ORFs predicted in metagenomic data show no homology with known protein families. This review examines computational methods that are valuable for the analysis of the human microbiome, and highlights the results of several large-scale human microbiome studies. It is becoming increasingly evident that dysbiosis of the gut microbiome is strongly associated with the development of immune disorders and metabolic dysfunction.


BMC Genomics | 2010

LTR retroelements in the genome of Daphnia pulex

Mina Rho; Sarah Schaack; Xiang Gao; Sun Kim; Michael Lynch; Haixu Tang

Background: Long terminal repeat (LTR) retroelements represent a successful group of transposable elements (TEs) that have played an important role in shaping the structure of many eukaryotic genomes. Here, we present a genome-wide analysis of LTR retroelements in Daphnia pulex, a cyclical parthenogen and the first crustacean for which the whole genomic sequence is available. In addition, we analyze transcriptional data and perform transposon display assays of lab-reared lineages and natural isolates to identify potential influences on TE mobility and differences in LTR retroelement loads among individuals reproducing with and without sex.

Results: We conducted a comprehensive de novo search for LTR retroelements and identified 333 intact LTR retroelements representing 142 families in the D. pulex genome. While nearly half of the identified LTR retroelements belong to the gypsy group, we also found copia (95), BEL/Pao (66) and DIRS (19) retroelements. Phylogenetic analysis of reverse transcriptase sequences showed that LTR retroelements in the D. pulex genome form many lineages distinct from known families, suggesting that the majority are novel. Our investigation of transcriptional activity of LTR retroelements using tiling array data obtained from three different experimental conditions found that 71 LTR retroelements are actively transcribed. Transposon display assays of mutation-accumulation lines showed evidence for putative somatic insertions for two DIRS retroelement families. Losses of presumably heterozygous insertions were observed in lineages in which selfing occurred, but never in asexuals, highlighting the potential impact of reproductive mode on TE abundance and distribution over time. The same two families were also assayed across natural isolates (both cyclical parthenogens and obligate asexuals), and there were more retroelements in populations capable of reproducing sexually for one of the two families assayed.

Conclusions: Given the importance of LTR retroelement activity in the evolution of other genomes, this comprehensive survey provides insight into the potential impact of LTR retroelements on the genome of D. pulex, a cyclically parthenogenetic microcrustacean that has served as an ecological model for over a century.

Collaboration

Top Co-Authors

Haixu Tang, Indiana University Bloomington
Geoffrey C. Fox, Indiana University Bloomington
Michael Lynch, Arizona State University
Yuzhen Ye, Indiana University Bloomington
Mi Young Lim, Seoul National University
Sun Kim, Indiana University Bloomington
Thomas G. Doak, Indiana University Bloomington
Judy Qiu, Indiana University Bloomington