Neural Computing and Applications | 2019

A modified Henry gas solubility optimization for solving motif discovery problem

 
 
 
 
 

Abstract


The DNA motif discovery (MD) problem is the main challenge of genome biology, and its importance is directly proportional to increasing sequencing technologies. MD plays a vital role in the identification of transcription factor binding sites that help in learning the mechanisms for regulation of gene expression. Metaheuristic algorithms are promising techniques for eliciting motif from DNA genomic sequences, but often fail to demonstrate robust performance by overcoming the inherent challenges in complex gene sequences, making search environment extremely non-convex for optimization methods. This paper proposes a novel modified Henry gas solubility optimization (MHGSO) algorithm for motif discovery which elicits a functional motif in DNA genomic sequences. In our approach, a new stage that captures the main characteristics of the motifs in DNA sequences is proposed, and MHGSO imitates the motifs characteristics for accurate detection of target motif. The performance of the MHGSO algorithm is validated using both synthetic and real datasets. Results confirm the stability and superiority of the proposed algorithm compared to state-of-the-art algorithms including MEME, DREME, XXmotif, PMbPSO, and MACS. Based on several evaluation matrices, MHGSO outperforms the competitor techniques in terms of nucleotide-level correlation coefficient, recall, precision, F -score, Cohen’s Kappa, and statistical validation measures.

Volume 32
Pages 10759-10771
DOI 10.1007/s00521-019-04611-0
Language English
Journal Neural Computing and Applications

Full Text