Publication


Featured research published by Mark Rojas.


Bioinformatics | 2009

PIQA: pipeline for Illumina G1 genome analyzer data quality assessment

Antonio Martínez-Alcántara; Efren Ballesteros; Chen Feng; Mark Rojas; Heather Koshinsky; Viacheslav Y. Fofanov; Paul Havlak; Yuriy Fofanov

Summary: PIQA is a quality analysis pipeline designed to examine genomic reads produced by Next Generation Sequencing technology (Illumina G1 Genome Analyzer). A short statistical summary, as well as tile-by-tile and cycle-by-cycle graphical representations of cluster density, quality scores and nucleotide frequencies, allows easy identification of various technical problems, including defective tiles, mistakes in sample/library preparation and abnormalities in the frequencies of appearance of sequenced genomic reads. PIQA is written in the R statistical programming language and is compatible with bustard, fastq and scarf Illumina G1 Genome Analyzer data formats. Availability: The PIQA pipeline, installation instructions and examples are available at the supplementary web site (http://bioinfo.uh.edu/PIQA).
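The cycle-by-cycle nucleotide frequency summary mentioned in the abstract can be illustrated with a minimal sketch. This is not PIQA's code (PIQA is written in R); the function name `per_cycle_base_frequencies` and the toy reads are assumptions made for illustration only.

```python
from collections import Counter


def per_cycle_base_frequencies(reads):
    """Return one dict per sequencing cycle mapping base -> fraction of reads.

    Hypothetical helper: each read is a plain string of base calls, and the
    position within the read is treated as the sequencing cycle.
    """
    counts = []
    for read in reads:
        for cycle, base in enumerate(read):
            if cycle == len(counts):
                counts.append(Counter())
            counts[cycle][base] += 1
    return [
        {base: n / sum(c.values()) for base, n in c.items()}
        for c in counts
    ]


# Toy reads, invented for this example.
reads = ["ACGT", "ACGA", "TCGA", "ACGN"]
freqs = per_cycle_base_frequencies(reads)
```

A cycle in which one base suddenly dominates, or in which the fraction of `N` calls jumps, is the kind of abnormality such a per-cycle table would make visible.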


Microbial Ecology | 2015

Metagenomic Analysis of the Airborne Environment in Urban Spaces

Nicholas A. Be; James B. Thissen; Viacheslav Y. Fofanov; Jonathan E. Allen; Mark Rojas; George Golovko; Yuriy Fofanov; Heather Koshinsky; Crystal Jaing

The organisms in aerosol microenvironments, especially densely populated urban areas, are relevant to maintenance of public health and detection of potential epidemic or biothreat agents. To examine aerosolized microorganisms in this environment, we performed sequencing on the material from an urban aerosol surveillance program. Whole metagenome sequencing was applied to DNA extracted from air filters obtained during periods from each of the four seasons. The composition of bacteria, plants, fungi, invertebrates, and viruses demonstrated distinct temporal shifts. Bacillus thuringiensis serovar kurstaki was detected in samples known to be exposed to aerosolized spores, illustrating the potential utility of this approach for identification of intentionally introduced microbial agents. Together, these data demonstrate the temporally dependent metagenomic complexity of urban aerosols and the potential of genomic analytical techniques for biosurveillance and monitoring of threats to public health.


BMC Microbiology | 2014

The presence of nitrate dramatically changed the predominant microbial community in perchlorate degrading cultures under saline conditions

Victor G. Stepanov; Yeyuan Xiao; Quyen Tran; Mark Rojas; Richard C. Willson; Yuriy Fofanov; George E. Fox; Deborah J. Roberts

Background: Perchlorate contamination has been detected in both ground water and drinking water. An attractive treatment option is the use of ion-exchange to remove and concentrate perchlorate in brine. Biological treatment can subsequently remove the perchlorate from the brine. When nitrate is present, it will also be concentrated in the brine and must also be removed by biological treatment. The primary objective was to obtain an in-depth characterization of the microbial populations of two salt-tolerant cultures each of which is capable of metabolizing perchlorate. The cultures were derived from a single ancestral culture and have been maintained in the laboratory for more than 10 years. One culture was fed perchlorate only, while the other was fed both perchlorate and nitrate. Results: A metagenomic characterization was performed using Illumina DNA sequencing technology, and the 16S rDNA of several pure strains isolated from the mixed cultures were sequenced. In the absence of nitrate, members of the Rhodobacteraceae constituted the prevailing taxonomic group. Second in abundance were the Rhodocyclaceae. In the nitrate fed culture, the Rhodobacteraceae are essentially absent. They are replaced by a major expansion of the Rhodocyclaceae and the emergence of the Alteromonadaceae as a significant community member. Gene sequences exhibiting significant homology to known perchlorate and nitrate reduction enzymes were found in both cultures. Conclusions: The structure of the two microbial ecosystems of interest has been established and some representative strains obtained in pure culture. The results illustrate that under favorable conditions a group of organisms can readily dominate an ecosystem and yet be effectively eliminated when their advantage is lost. Almost all known perchlorate-reducing organisms can also effectively reduce nitrate. This is certainly not the case for the Rhodobacteraceae that were found to dominate in the absence of nitrate, but effectively disappeared in its presence. This study is significant in that it reveals the existence of a novel group of organisms that play a role in the reduction of perchlorate under saline conditions. These Rhodobacteraceae especially, as well as other organisms present in these communities may be a promising source of unique salt-tolerant enzymes for perchlorate reduction.


Journal of Clinical Microbiology | 2011

Development and Characterization of a Highly Specific and Sensitive SYBR Green Reverse Transcriptase PCR Assay for Detection of the 2009 Pandemic H1N1 Influenza Virus on the Basis of Sequence Signatures

Rafael A. Medina; Mark Rojas; Astrid Tuin; Stephen Huff; Marcela Ferrés; Constanza Martínez-Valdebenito; Paula Godoy; Adolfo García-Sastre; Yuriy Fofanov; John SantaLucia

The emergence and rapid spread of the 2009 H1N1 pandemic influenza virus showed that many diagnostic tests were unsuitable for detecting the novel virus isolates. In most countries the probe-based TaqMan assay developed by the U.S. Centers for Disease Control and Prevention was used for diagnostic purposes. The substantial sequence data that became available during the course of the pandemic created the opportunity to utilize bioinformatics tools to evaluate the unique sequence properties of this virus for the development of diagnostic tests. We used a comprehensive computational approach to examine conserved 2009 H1N1 sequence signatures that are at least 20 nucleotides long and contain at least two mismatches compared to any other known H1N1 genome. We found that the hemagglutinin (HA) and neuraminidase (NA) genes contained sequence signatures that are highly conserved among 2009 H1N1 isolates. Based on the NA gene signatures, we used Visual-OMP to design primers with optimal hybridization affinity and we used ThermoBLAST to minimize amplification artifacts. This procedure resulted in a highly sensitive and discriminatory 2009 H1N1 detection assay. Importantly, we found that the primer set can be used reliably in both a conventional TaqMan and a SYBR green reverse transcriptase (RT)-PCR assay with no loss of specificity or sensitivity. We validated the diagnostic accuracy of the NA SYBR green assay with 125 clinical specimens obtained between May and August 2009 in Chile, and we showed diagnostic efficacy comparable to the CDC assay. Our approach highlights the use of systematic computational approaches to develop robust diagnostic tests during a viral pandemic.
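The signature criterion in the abstract (subsequences conserved among target isolates that differ by at least two mismatches from every other known genome) can be sketched as a brute-force search. This is only an illustration under stated assumptions, not the authors' pipeline: the function names, the tiny sequences, and the use of k = 4 instead of 20 are all hypothetical, and a real analysis would need an indexed search rather than all-against-all comparison.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def kmers(seq, k):
    """Set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def signatures(targets, background, k, min_mismatches=2):
    """k-mers present in every target sequence that differ from every
    background k-mer by at least min_mismatches positions.

    Assumes targets is non-empty.
    """
    conserved = set.intersection(*(kmers(t, k) for t in targets))
    other = set().union(*(kmers(b, k) for b in background))
    return sorted(
        s for s in conserved
        if all(hamming(s, o) >= min_mismatches for o in other)
    )


# Toy sequences, invented for this example (k = 4 keeps it readable).
targets = ["AACCGGTT", "TAACCGGA"]
background = ["AACCTTTT"]
found = signatures(targets, background, k=4)
```

Here `AACC` is rejected because it occurs in the background, `ACCG` because it is one mismatch away from `ACCT`, and only `CCGG` survives.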


BMC Bioinformatics | 2012

Slim-Filter: an interactive windows-based application for illumina genome analyzer data assessment and manipulation

Georgiy Golovko; Kamil Khanipov; Mark Rojas; Antonio Martínez-Alcántara; Jesse J. Howard; Efren Ballesteros; Sharu Gupta; William R. Widger; Yuriy Fofanov

Background: The emergence of Next Generation Sequencing technologies has made it possible for individual investigators to generate gigabases of sequencing data per week. Effective analysis and manipulation of these data is limited due to large file sizes, so even simple tasks such as data filtration and quality assessment have to be performed in several steps. This requires (potentially problematic) interaction between the investigator and a bioinformatics/computational service provider. Furthermore, such services are often performed using specialized computational facilities. Results: We present a Windows-based application, Slim-Filter designed to interactively examine the statistical properties of sequencing reads produced by Illumina Genome Analyzer and to perform a broad spectrum of data manipulation tasks including: filtration of low quality and low complexity reads; filtration of reads containing undesired subsequences (such as parts of adapters and PCR primers used during the sample and sequencing libraries preparation steps); excluding duplicated reads (while keeping each read’s copy number information in a specialized data format); and sorting reads by copy numbers allowing for easy access and manual editing of the resulting files. Slim-Filter is organized as a sequence of windows summarizing the statistical properties of the reads. Each data manipulation step has roll-back abilities, allowing for return to previous steps of the data analysis process. Slim-Filter is written in C++ and is compatible with fasta, fastq, and specialized AS file formats presented in this manuscript. Setup files and a user’s manual are available for download at the supplementary web site (https://www.bioinfo.uh.edu/Slim_Filter/). Conclusion: The presented Windows-based application has been developed with the goal of providing individual investigators with integrated sequencing reads analysis, curation, and manipulation capabilities.
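Three of the manipulation steps listed in the abstract (dropping reads that contain undesired subsequences, collapsing duplicates while keeping copy numbers, and sorting by copy number) can be sketched in a few lines. This is not Slim-Filter's C++ implementation or its AS file format; the function name `collapse_reads`, the adapter fragment, and the reads are hypothetical.

```python
from collections import Counter


def collapse_reads(reads, unwanted=()):
    """Drop reads containing any unwanted subsequence, collapse duplicates,
    and return (read, copy_number) pairs sorted by descending copy number.

    Ties are broken alphabetically so the output is deterministic.
    """
    kept = [r for r in reads if not any(u in r for u in unwanted)]
    counts = Counter(kept)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Toy input, invented for this example; "AGATCGG" stands in for an adapter fragment.
reads = ["ACGT", "ACGT", "TTTT", "GGAGATCGG", "ACGT", "TTTT", "CCCC"]
collapsed = collapse_reads(reads, unwanted=("AGATCGG",))
```

Keeping the copy number next to each unique read preserves the abundance information even though the duplicated sequences themselves are stored only once.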


Scientific Reports | 2016

Small Regulatory RNAs of Rickettsia conorii

Hema P. Narra; Casey L. C. Schroeder; Abha Sahni; Mark Rojas; Kamil Khanipov; Yuriy Fofanov; Sanjeev K. Sahni

Small regulatory RNAs comprise critically important modulators of gene expression in bacteria, yet very little is known about their prevalence and functions in Rickettsia species. R. conorii, the causative agent of Mediterranean spotted fever, is a tick-borne pathogen that primarily infects microvascular endothelium in humans. We have determined the transcriptional landscape of R. conorii during infection of Human Microvascular Endothelial Cells (HMECs) by strand-specific RNA sequencing to identify 4 riboswitches, 13 trans-acting (intergenic), and 22 cis-acting (antisense) small RNAs (termed ‘Rc_sR’s). Independent expression of four novel trans-acting sRNAs (Rc_sR31, Rc_sR33, Rc_sR35, and Rc_sR42) and known bacterial sRNAs (6S, RNaseP_bact_a, ffs, and α-tmRNA) was next confirmed by Northern hybridization. Comparative analysis during infection of HMECs vis-à-vis tick AAE2 cells revealed significantly higher expression of Rc_sR35 and Rc_sR42 in HMECs, whereas Rc_sR31 and Rc_sR33 were expressed at similar levels in both cell types. We further predicted a total of 502 genes involved in all important biological processes as potential targets of Rc_sRs and validated the interaction of Rc_sR42 with cydA (cytochrome d ubiquinol oxidase subunit I). Our findings constitute the first evidence of the existence of post-transcriptional riboregulatory mechanisms in R. conorii and interactions between a novel Rc_sR and its target mRNA.


Frontiers in Microbiology | 2016

Identification and Characterization of Novel Small RNAs in Rickettsia prowazekii

Casey L. C. Schroeder; Hema P. Narra; Abha Sahni; Mark Rojas; Kamil Khanipov; Jignesh Patel; Riya Shah; Yuriy Fofanov; Sanjeev K. Sahni

Emerging evidence implicates a critically important role for bacterial small RNAs (sRNAs) as post-transcriptional regulators of physiology, metabolism, stress/adaptive responses, and virulence, but the roles of sRNAs in pathogenic Rickettsia species remain poorly understood. Here, we report on the identification of both novel and well-known bacterial sRNAs in Rickettsia prowazekii, known to cause epidemic typhus in humans. RNA sequencing of human microvascular endothelial cells (HMECs), the preferred targets during human rickettsioses, infected with R. prowazekii revealed the presence of 35 trans-acting and 23 cis-acting sRNAs, respectively. Of these, expression of two trans-acting (Rp_sR17 and Rp_sR60) and one cis-acting (Rp_sR47) novel sRNAs and four well-characterized bacterial sRNAs (RNaseP_bact_a, α-tmRNA, 4.5S RNA, 6S RNA) was further confirmed by Northern blot or RT-PCR analyses. The transcriptional start sites of five novel rickettsial sRNAs and 6S RNA were next determined using 5′ RLM-RACE yielding evidence for their independent biogenesis in R. prowazekii. Finally, computational approaches were employed to determine the secondary structures and potential mRNA targets of novel sRNAs. Together, these results establish the presence and expression of sRNAs in R. prowazekii during host cell infection and suggest potential functional roles for these important post-transcriptional regulators in rickettsial biology and pathogenesis.


Frontiers in Microbiology | 2018

Microbiome interaction networks and community structure from laboratory-reared and field-collected Aedes aegypti, Aedes albopictus, and Culex quinquefasciatus mosquito vectors

Shivanand Hegde; Kamil Khanipov; Levent Albayrak; George Golovko; Maria Pimenova; Miguel A. Saldaña; Mark Rojas; Emily A. Hornett; Greg C. Motl; Chris L. Fredregill; James A. Dennett; Mustapha Debboun; Yuriy Fofanov; Grant L. Hughes

Microbial interactions are an underappreciated force in shaping insect microbiome communities. Although pairwise patterns of symbiont interactions have been identified, we have a poor understanding regarding the scale and the nature of co-occurrence and co-exclusion interactions within the microbiome. To characterize these patterns in mosquitoes, we sequenced the bacterial microbiome of Aedes aegypti, Ae. albopictus, and Culex quinquefasciatus caught in the field or reared in the laboratory and used these data to generate interaction networks. For collections, we used traps that attracted host-seeking or ovipositing female mosquitoes to determine how physiological state affects the microbiome under field conditions. Interestingly, we saw few differences in species richness or microbiome community structure in mosquitoes caught in either trap. Co-occurrence and co-exclusion analysis identified 116 pairwise interactions substantially increasing the list of bacterial interactions observed in mosquitoes. Networks generated from the microbiome of Ae. aegypti often included highly interconnected hub bacteria. There were several instances where co-occurring bacteria co-excluded a third taxa, suggesting the existence of tripartite relationships. Several associations were observed in multiple species or in field and laboratory-reared mosquitoes indicating these associations are robust and not influenced by environmental or host factors. To demonstrate that microbial interactions can influence colonization of the host, we administered symbionts to Ae. aegypti larvae that either possessed or lacked their resident microbiota. We found that the presence of resident microbiota can inhibit colonization of particular bacterial taxa. Our results highlight that microbial interactions in mosquitoes are complex and influence microbiome composition.


BMC Genomics | 2016

The ability of human nuclear DNA to cause false positive low-abundance heteroplasmy calls varies across the mitochondrial genome

Levent Albayrak; Kamil Khanipov; Maria Pimenova; George Golovko; Mark Rojas; Ioannis T. Pavlidis; Sergei Chumakov; Gerardo Aguilar; Arturo Chávez; William R. Widger; Yuriy Fofanov

Background: Low-abundance mutations in mitochondrial populations (mutations with minor allele frequency ≤ 1%) are associated with cancer, aging, and neurodegenerative disorders. While recent progress in high-throughput sequencing technology has significantly improved the heteroplasmy identification process, the ability of this technology to detect low-abundance mutations can be affected by the presence of similar sequences originating from nuclear DNA (nDNA). To determine to what extent nDNA can cause false positive low-abundance heteroplasmy calls, we have identified mitochondrial locations of all subsequences that are common or similar (one mismatch allowed) between nDNA and mitochondrial DNA (mtDNA). Results: The analysis revealed up to a 25-fold variation in the lengths of the longest common and longest similar (one mismatch allowed) subsequences across the mitochondrial genome. The size of the longest subsequences shared between nDNA and mtDNA in several regions of the mitochondrial genome was found to be as low as 11 bases, which not only allows using these regions to design new, very specific PCR primers, but also supports the hypothesis of the non-random introduction of mtDNA into the human nuclear DNA. Conclusion: Analysis of the mitochondrial locations of the subsequences shared between nDNA and mtDNA suggested that even very short (36 bases) single-end sequencing reads can be used to identify low-abundance variation in 20.4% of the mitochondrial genome. For longer (76 and 150 bases) reads, the proportion of the mitochondrial genome where nDNA presence will not interfere was found to be 44.5% and 67.9%, respectively, while low-abundance mutations at 100% of locations can be identified using single reads 417 bases long. This observation suggests that the analysis of low-abundance variations in mitochondrial populations can be extended to a variety of large data collections such as NCBI Sequence Read Archive, European Nucleotide Archive, The Cancer Genome Atlas, and International Cancer Genome Consortium.


Bioinformatics and Biomedicine | 2015

CoCo: An application to store High-Throughput Sequencing data in compact text and binary file formats

Kamil Khanipov; Georgiy Golovko; Mark Rojas; Levent Albayrak; Otto Dobretsberger; Maria Pimenova; Nels Olson; Sergei Chumakov; Yuriy Fofanov

The storage, manipulation, and especially internet transfer of large amounts of data produced by High-Throughput Sequencing (HTS) instruments present major obstacles to utilizing the full potential of this promising technology. The current standard is based on storing all data, which are produced in text (FASTQ and FASTA) and often stored in binary (SRA and BAM) formats. To date, significant effort has been devoted to efficiently compressing these cumbersome sequencing data sets in their existing formats. However, given the substantial improvements in the quality of HTS data, we believe that if one can afford to exclude low quality data and read headers, new much more compressed data formats can be used to reduce the size of HTS data files by at least two orders of magnitude. Here we present several examples of file formats specifically designed to store only high quality sequencing reads in space efficient text and binary form. The basic principles used to decrease file size include storage of only one copy of a sequence when reads are present in multiple copies; alphabetical sorting of all reads and storage of only the differences (suffixes) between consecutive reads; and optimization of the number of bits/bytes required to store the information in binary formats. While file size reduction depends on properties of the sequencing data, the size of the resulting files can be as low as 0.1%-5% of the original FASTQ, SRA, or BAM files. The greatest advantage of the proposed formats, however, is their time and memory efficiency. Converting reads from FASTQ/FASTA files into the proposed formats is up to 10 times faster than gzip and SRA. The conversion of files in the proposed formats back to FASTA is limited only by the time required to read the file from the hard drive. We present the source code of the C++ object (class) implemented to store, sort, and perform I/O operations with equal length subsequences; and two executable Linux command line applications (CoCo and CoCo-Plus) able to work with all types of sequencing data including paired-end and flexible size reads. Source code, Linux executables, as well as a user manual can be downloaded from http://bgl.utmb.edu/publications/34cocoplus.
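The first two principles in the abstract (one stored copy per distinct read, then alphabetical sorting with only the differing suffix stored relative to the previous read) can be sketched as follows. This is a minimal illustration of the idea, not the CoCo file format or its C++ class; the record layout `(shared_prefix_length, suffix, copies)` and the function names are assumptions, and the bit-level packing of the binary formats is not modeled.

```python
from collections import Counter


def encode(reads):
    """Collapse duplicates, sort alphabetically, and store each read as
    (shared_prefix_length, differing_suffix, copy_number) relative to the
    previous read in sorted order."""
    counts = Counter(reads)
    records = []
    previous = ""
    for read in sorted(counts):
        shared = 0
        limit = min(len(previous), len(read))
        while shared < limit and previous[shared] == read[shared]:
            shared += 1
        records.append((shared, read[shared:], counts[read]))
        previous = read
    return records


def decode(records):
    """Rebuild the sorted list of reads, duplicates included."""
    reads = []
    previous = ""
    for shared, suffix, copies in records:
        read = previous[:shared] + suffix
        reads.extend([read] * copies)
        previous = read
    return reads


# Toy reads, invented for this example.
reads = ["ACGTAC", "ACGTTT", "ACGTAC", "ACGAAA"]
records = encode(reads)
restored = decode(records)
```

Because sorted neighbours tend to share long prefixes, most records hold only a short suffix. The original read order and any quality scores are not recoverable from this representation, in line with the abstract's premise of storing only high quality reads without headers.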

Collaboration

Top Co-Authors

Yuriy Fofanov, University of Texas Medical Branch
Kamil Khanipov, University of Texas Medical Branch
Levent Albayrak, University of Texas Medical Branch
George Golovko, University of Texas Medical Branch
Georgiy Golovko, University of Texas Medical Branch
Maria Pimenova, University of Texas Medical Branch
Sergei Chumakov, University of Guadalajara
Abha Sahni, University of Texas Medical Branch