Automatic Documentation and Mathematical Linguistics | 2021
DNA Clustering Algorithms
Abstract
This paper presents two generalizations of the author's previously published algorithms, which are based on the principles of information coding in molecular genetics. The first takes into account the frequency characteristics of subalphabetic representations of polynucleotides; the second extends the algorithm to the processing of arbitrary information presented in a quaternary code. The second generalization points to the general significance of the proposed algorithms, which the author calls molecular genetic, or DNA, algorithms to emphasize their difference from the well-known genetic algorithms of the Holland type. An example is given in which the results of the DNA algorithms are displayed in the frequency domain with the cluster structure visualized. The example makes it possible to trace a structure that is quite common for DNA: a main cluster and several satellite clusters. Natural language texts processed by the DNA algorithms in the structural and frequency domains are analyzed and compared.