Cluster Computing | 2019

NvPD: novel parallel edit distance algorithm, correctness, and performance evaluation


Abstract


Edit distance has applications in many domains, such as bioinformatics, spell checking, plagiarism checking, query optimization, speech recognition, and data mining. Traditionally, edit distance is computed by a sequential dynamic-programming solution, which becomes infeasible for large problems. In this paper, we introduce NvPD, a novel algorithm for parallel edit distance computation that resolves the dependencies in the conventional dynamic-programming solution. We also establish the correctness of the modified dependencies. NvPD offers a balanced workload among processors, reduced synchronization overhead, and maximum utilization of resources, and it can exploit spatial locality. It requires $$\min(m,n)$$ steps to complete, compared with the diagonal-based approach, which completes in $$\max(m,n)$$ steps. Experimental evaluation using a variety of random and real-life data sets on shared-memory multi-core systems and graphics processing units (GPUs) shows that NvPD outperforms state-of-the-art parallel edit distance algorithms.
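
For orientation, the conventional sequential dynamic-programming solution mentioned in the abstract can be sketched as follows. This is the textbook Levenshtein recurrence, not NvPD itself; the function name `edit_distance` is illustrative, and the sketch assumes unit costs for insertion, deletion, and substitution, with m and n taken to be the lengths of the two input strings. Each cell depends on its left, upper, and upper-left neighbours, which is the dependency pattern a parallel algorithm has to work around.

```python
def edit_distance(a: str, b: str) -> int:
    """Textbook sequential edit distance (unit costs assumed)."""
    m, n = len(a), len(b)
    # prev holds row i-1 of the DP table; row 0 is the cost of inserting b[:j].
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        # Column 0 of row i: cost of deleting the first i characters of a.
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: depends on the cell above
                curr[j - 1] + 1,     # insertion: depends on the cell to the left
                prev[j - 1] + cost,  # match or substitution: diagonal cell
            )
        prev = curr
    return prev[n]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
```

The dependency on `curr[j - 1]` is what forces the inner loop to run in order within a row; diagonal-based parallel schemes sidestep it by processing cells along anti-diagonals, where no cell depends on another in the same diagonal.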

Volume 23
Pages 879-894
DOI 10.1007/s10586-019-02962-w
Language English
Journal Cluster Computing
