Desalination and Water Treatment | 2021
Water quality assessment based on interval-valued data cluster analysis
Abstract
Interval-valued data cluster analysis was adopted to evaluate the water quality of the Huaihe River. The method reduces the dimension of the water quality data, which simplifies the calculation, and it does so without loss of information. Twenty-six sites along the river were selected for sampling, and four indicators were recorded at each site every week in 2012. First, the original 1,326 × 4 matrix was converted into a 26 × 4 matrix in which the conventional scalar elements were replaced by interval-valued data. The interval data were then standardized, and hierarchical cluster analysis was performed on the standardized data using the Euclidean–Hausdorff distance. The corrected Rand index (CRI) was used to determine the number of clusters; according to its value, the 26 sites were divided into six clusters. Samples in different clusters had different interval radii and pollution levels. Samples in cluster 1 are mildly polluted, and their indicator concentrations fluctuate less violently than those in the other clusters. Although the samples in clusters 2 and 4 both rank at middle levels in terms of pollution, pollution in cluster 4 is relatively stable. Clusters 5 and 6 have higher concentrations of DO, CODMn, and NH3–N, due to insufficient DO. The interval data cluster analysis method based on the Euclidean–Hausdorff distance thus classifies the sampling sites into multiple clusters without loss of information.
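The pipeline summarized in the abstract (weekly readings collapsed to intervals, standardization, Euclidean–Hausdorff distance, hierarchical clustering) can be illustrated with a minimal Python sketch. The abstract does not specify several details, so the following are assumptions made only for illustration: each interval is taken as the [min, max] of a site's weekly readings, standardization is a min-max rescaling of the interval bounds per indicator, and clusters are merged by average linkage. The Euclidean–Hausdorff distance is written in its common form, the square root of the summed squares of per-indicator Hausdorff distances max(|a_l - b_l|, |a_u - b_u|). All function names and the toy data are hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import sqrt


def to_intervals(records):
    """Collapse weekly rows into one (lower, upper) interval per site and indicator.

    records maps a site name to a list of weekly rows (one value per indicator).
    Assumption: the interval is the [min, max] of the weekly readings.
    """
    return {site: [(min(col), max(col)) for col in zip(*rows)]
            for site, rows in records.items()}


def standardize(intervals):
    """Rescale every indicator so that all interval bounds lie in [0, 1].

    Assumption: min-max scaling; the paper's standardization may differ.
    """
    sites = list(intervals)
    p = len(intervals[sites[0]])
    lo = [min(intervals[s][j][0] for s in sites) for j in range(p)]
    hi = [max(intervals[s][j][1] for s in sites) for j in range(p)]
    span = [(hi[j] - lo[j]) or 1.0 for j in range(p)]  # avoid division by zero
    return {s: [((l - lo[j]) / span[j], (u - lo[j]) / span[j])
                for j, (l, u) in enumerate(intervals[s])]
            for s in sites}


def eh_distance(a, b):
    """Euclidean-Hausdorff distance between two vectors of intervals."""
    return sqrt(sum(max(abs(al - bl), abs(au - bu)) ** 2
                    for (al, au), (bl, bu) in zip(a, b)))


def agglomerate(intervals, k):
    """Naive average-linkage agglomerative clustering down to k clusters."""
    clusters = [[s] for s in intervals]

    def linkage(c1, c2):
        # Mean pairwise distance between members of the two clusters.
        return sum(eh_distance(intervals[x], intervals[y])
                   for x in c1 for y in c2) / (len(c1) * len(c2))

    while len(clusters) > k:
        # Find the closest pair (i < j), merge j into i, then drop j.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Toy data (invented): 3 sites, 2 weekly rows each, 2 indicators per row.
toy = {
    "A": [[8.0, 2.0], [9.0, 3.0]],
    "B": [[8.5, 2.5], [9.5, 3.5]],
    "C": [[3.0, 9.0], [4.0, 12.0]],
}
print(agglomerate(standardize(to_intervals(toy)), 2))  # [['A', 'B'], ['C']]
```

In the study's setting, `records` would hold 26 sites with four indicators per weekly row, and the number of clusters `k` would be chosen by comparing candidate partitions with the CRI rather than fixed in advance. For larger data sets, an established hierarchical clustering routine fed with the precomputed distance matrix would be preferable to this quadratic-per-merge loop.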