Bioinformatics | 2019

iTerm-PseKNC: a sequence-based tool for predicting bacterial transcriptional terminators

Abstract


MOTIVATION
Transcription termination is an important regulatory step in gene expression. If a gene lacks a terminator, transcription cannot stop, which results in abnormal gene expression. Detecting terminators helps determine operon structure in bacteria and improves genome annotation. Accurate identification of transcriptional terminators is therefore essential for research on transcriptional regulation.

RESULTS
In this study, we developed a new predictor, iTerm-PseKNC, based on a support vector machine to identify transcription terminators. The binomial distribution approach was used to select the optimal feature subset derived from pseudo k-tuple nucleotide composition (PseKNC). In 5-fold cross-validation, the proposed method achieved an accuracy of 95%. To further evaluate the generalization ability of iTerm-PseKNC, the model was tested on independent datasets of experimentally confirmed Rho-independent terminators from the Escherichia coli and Bacillus subtilis genomes. All of the terminators in E. coli and 87.5% of the terminators in B. subtilis were correctly identified, suggesting that the proposed model could become a powerful tool for bacterial terminator recognition.

AVAILABILITY AND IMPLEMENTATION
For the convenience of wet-lab researchers, a web server for iTerm-PseKNC was established at http://lin-group.cn/server/iTerm-PseKNC/, through which users can obtain predictions without working through the underlying mathematical equations.
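
The abstract names the feature-selection technique but gives no formula. As a rough illustration only, the sketch below ranks k-tuple features with a binomial-distribution confidence level. It assumes a common formulation (not confirmed by the abstract): for each k-tuple, the upper-tail probability of seeing at least its observed count in one class by chance, with the class's share of all k-tuple occurrences as the prior. The function names are hypothetical, and the sketch covers plain k-tuple counts only; the pseudo (correlation) components of PseKNC and the support vector machine step are omitted.

```python
from collections import Counter
from itertools import product
from math import comb


def kmer_counts(seq, k):
    """Count overlapping k-tuples in a DNA sequence (A, C, G, T only)."""
    seq = seq.upper()
    kmers = (seq[i:i + k] for i in range(len(seq) - k + 1))
    return Counter(km for km in kmers if set(km) <= set("ACGT"))


def upper_tail(n, m, q):
    """P(X >= m) for X ~ Binomial(n, q)."""
    return sum(comb(n, i) * q**i * (1 - q) ** (n - i) for i in range(m, n + 1))


def binomial_confidence(pos_seqs, neg_seqs, k):
    """Return {k-tuple: confidence level}; higher means more class-specific.

    Assumed formulation: CL = 1 - P(X >= n_class), taking the larger
    value over the positive and negative classes.
    """
    pos, neg = Counter(), Counter()
    for s in pos_seqs:
        pos.update(kmer_counts(s, k))
    for s in neg_seqs:
        neg.update(kmer_counts(s, k))

    total_pos, total_neg = sum(pos.values()), sum(neg.values())
    q_pos = total_pos / (total_pos + total_neg)  # prior for the positive class
    q_neg = 1 - q_pos

    scores = {}
    for kmer in map("".join, product("ACGT", repeat=k)):
        n_pos, n_neg = pos[kmer], neg[kmer]
        n_all = n_pos + n_neg
        if n_all == 0:
            scores[kmer] = 0.0  # never observed, so no evidence either way
            continue
        cl_pos = 1 - upper_tail(n_all, n_pos, q_pos)
        cl_neg = 1 - upper_tail(n_all, n_neg, q_neg)
        scores[kmer] = max(cl_pos, cl_neg)
    return scores


if __name__ == "__main__":
    # Toy sequences, invented for illustration; not data from the paper.
    terminators = ["AAAAAA", "AAAAAT"]
    non_terminators = ["CCCCCC", "CCCCCG"]
    ranked = sorted(
        binomial_confidence(terminators, non_terminators, 1).items(),
        key=lambda item: item[1],
        reverse=True,
    )
    for kmer, cl in ranked:
        print(kmer, round(cl, 4))
```

In a pipeline of this kind, features would typically be sorted by confidence level and added to the classifier one at a time, keeping the subset that gives the best cross-validated accuracy. Whether iTerm-PseKNC follows exactly this procedure is not stated in the abstract.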

Volume 35, Issue 9
Pages 1469-1477
DOI 10.1093/bioinformatics/bty827
Language English
Journal Bioinformatics
