Boosting rare benthic macroinvertebrates taxa identification with one-class classification
Fahad Sohrab, Jenni Raitoharju †
Faculty of Information Technology and Communication Sciences, Tampere University, Tampere, Finland
† Programme for Environmental Information, Finnish Environment Institute, Jyväskylä, Finland
ABSTRACT
Insect monitoring is crucial for understanding the consequences of rapid ecological changes, but taxa identification currently requires tedious manual expert work and cannot be scaled up efficiently. Deep convolutional neural networks (CNNs) provide a viable way to significantly increase the biomonitoring volumes. However, taxa abundances are typically very imbalanced and the amounts of training images for the rarest classes are simply too low for deep CNNs. As a result, the samples from the rare classes are often completely missed, while detecting them has biological importance. In this paper, we propose combining the trained deep CNN with one-class classifiers to improve the rare species identification. One-class classification models are traditionally trained with far fewer samples and they can provide a mechanism to indicate samples potentially belonging to the rare classes for human inspection. Our experiments confirm that the proposed approach may indeed support moving towards partial automation of the taxa identification task.
Index Terms - Biomonitoring, Taxa Identification, Machine Learning, One-Class Classification, Support Vector Data Description
1. INTRODUCTION
To understand the consequences of climate change and other anthropogenic changes in different aquatic ecosystems, it is crucial to widely monitor different animal groups. International environmental legislation, such as the EU Water Framework Directive (WFD) [1], also acknowledges the task of monitoring aquatic ecosystems. Since changes in the abundances of benthic macroinvertebrate species can provide an early warning sign of environmental problems in aquatic ecosystems, they are widely used as indicating factors in WFD-compliant ecological status assessment and environmental decision making [2, 3]. At the same time, they have also been identified as one of the most difficult groups to monitor [4]. The task currently requires tedious manual expert work, making it expensive, time-consuming, and error-prone. The recent advances in machine learning, especially deep convolutional neural networks (CNNs), provide a viable way to scale up monitoring and provide faster information for environmental decision making. In the future, the samples of benthic macroinvertebrates may be imaged with an automated imaging device and then identified using a deep learning model trained with a sample dataset.

The overall accuracy obtained by automatic identification of benthic macroinvertebrates is approaching human expert level [5] and, already in the near future, it may be possible to use machines to handle the majority of the samples, while human experts manually identify only the difficult and interesting cases, such as specimens potentially belonging to rare species. A major challenge that needs to be addressed is induced by the very imbalanced taxa abundances. For some rare species, the number of training images is simply too low for a deep CNN and, as a result, the identification often fails. This problem is largely overlooked in the recent works [5, 6] that consider only the overall identification accuracy. The low number of misclassified specimens from rare species hardly affects the overall accuracy, although these specimens are important for monitoring biodiversity. In this paper, we propose a mechanism that can indicate a reasonably sized subset of specimens as potential samples of rare species for human expert inspection. To this end, we propose combining the trained deep CNN with one-class classifiers. One-class classifiers are traditionally trained with far fewer samples than deep networks, and our experimental results support the assumption that they can help in detecting samples from the rare species.

Contact: {fahad.sohrab, jenni.raitoharju}@tuni.fi
2. RELATED WORK
2.1. Machine learning in biomonitoring
Machine learning is rapidly gaining recognition as a promising tool for many biomonitoring applications, such as identifying fish species [7], forest surveillance [8], or monitoring Arctic flowering seasons [9]. In this paper, we concentrate on benthic macroinvertebrate identification. Nevertheless, the main challenges are similar for most biomonitoring applications and the solutions may be easily applied to other applications. For example, the identification task is very fine-grained. For a non-expert it may be hard to see any difference between similar species. At the same time, the intra-class variance may be large due to different development stages [10]. Taxa distributions in nature, and thus also the available reference datasets, are very imbalanced [5]. Furthermore, some taxonomists continue to object to the shift toward automated methods due to various doubts and fears [11]. The last problem may be eased by providing better mechanisms for dividing the identification task between machines and human experts in such a manner that the machine first handles only the most routine-like cases [6].

Efforts to develop automated taxa identification techniques have progressed from using handcrafted features with shallow networks [12, 13] towards using deep neural networks, which operate on images as inputs [5, 6]. A major challenge with deep neural networks is the need for huge amounts of training data. This has led to efforts to create imaging devices capable of providing high quality images with minimal manual effort [10, 14]. Nevertheless, the existing datasets, such as FIN-Benthic2 [5] used in this paper, have very imbalanced classes. The smallest taxa simply do not provide enough information for training deep neural networks.
However, such rare taxa and changes in their abundances may be biologically and environmentally interesting. The performance of the deep neural networks for the very rare species may be enhanced, e.g., by data augmentation [15] or special loss functions [16], but these approaches also tend to overfit to the few training samples and do not generalize well to unseen samples. In this paper, we suggest combining one-class classifiers with the trained deep neural network to provide an additional mechanism for detecting samples potentially belonging to the rare classes for human inspection.

2.2. One-class classification

The main idea in one-class classification is to create a representative model of a class of interest, typically called the target class, using data from this class only. During inference, the model is used to predict whether unseen samples belong to the target class or are outliers. We denote the target data as $\mathbf{X} = [\mathbf{x}_1, \dots, \mathbf{x}_n]$, where $n$ is the number of target items and $\mathbf{x}_i$ are $D$-dimensional vectors. One-Class Support Vector Machine (OC-SVM) [17] separates all the data points from the origin with a hyperplane and maximizes the distance from this hyperplane to the origin:

$$\min_{\mathbf{w},\,\xi_i,\,\rho} \; \frac{1}{2}\|\mathbf{w}\|^2 + \frac{1}{Cn}\sum_{i=1}^{n}\xi_i - \rho \quad \text{s.t.} \quad \mathbf{w}\cdot\mathbf{x}_i \ge \rho - \xi_i, \;\; \xi_i \ge 0, \;\; \forall i \in \{1,\dots,n\}, \tag{1}$$

where $\mathbf{w}$ is a weight vector, slack variables $\xi_i$ allow some data points to lie within the margin, and hyper-parameter $C$ sets an upper bound on the fraction of training samples allowed within the margin and a lower bound on the fraction of training samples used as support vectors.

Another classical one-class classification method is Support Vector Data Description (SVDD) [18]. An SVDD model is trained by forming the smallest hypersphere which includes all the target data. SVDD minimizes the following function:

$$\min \; F(R,\mathbf{a}) = R^2 + C\sum_{i=1}^{n}\xi_i \quad \text{s.t.} \quad \|\mathbf{x}_i - \mathbf{a}\|_2^2 \le R^2 + \xi_i, \;\; \xi_i \ge 0, \;\; \forall i \in \{1,\dots,n\}, \tag{2}$$

where $R$ is the radius, $\mathbf{a}$ is the center of the hypersphere, $\xi_i$ are slack variables allowing some training samples to be left outside the hypersphere, and hyper-parameter $C$ controls the amount of allowed outliers. Both OC-SVM and SVDD can be solved in one step using Lagrange multipliers.

A recent extension of SVDD, Subspace Support Vector Data Description (S-SVDD) [19], maps the data to an optimized $d$-dimensional subspace suitable for one-class classification as $\mathbf{Q}\mathbf{x}_i$. S-SVDD is solved iteratively, alternating between solving SVDD in the current subspace and improving the subspace projection $\mathbf{Q}$. The second step computes the gradient of the Lagrangian of Eq. (2), $\Delta L$, and updates $\mathbf{Q}$ as

$$\mathbf{Q} \leftarrow \mathbf{Q} - \eta(\Delta L + \beta\Delta\Psi), \tag{3}$$

where $\Psi = \mathrm{Tr}(\mathbf{Q}\mathbf{X}\boldsymbol{\lambda}\boldsymbol{\lambda}^T\mathbf{X}^T\mathbf{Q}^T)$ is an additional regularization term enforcing more variance, $\beta$ is a weight for it, and $\eta$ is a learning rate. Different values for $\boldsymbol{\lambda}$ result in different versions of S-SVDD. In this paper, we use the unregularized version (i.e., $\lambda_i = 0$), denoted as S-SVDD, and two regularized versions: S-SVDDr1 with $\lambda_i = 1$ and S-SVDDr2, where $\boldsymbol{\lambda}$ is used to select only the support vectors.
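To make the "solved using Lagrange multipliers" remark concrete, the standard textbook derivation of the SVDD dual from Eq. (2) can be sketched as follows. The multipliers $\alpha_i$ and $\gamma_i$ are notation introduced here for illustration and do not appear in the text above.

```latex
% Lagrangian of Eq. (2) with multipliers \alpha_i \ge 0, \gamma_i \ge 0
L = R^2 + C\sum_{i=1}^{n}\xi_i
    - \sum_{i=1}^{n}\alpha_i\left(R^2 + \xi_i - \|\mathbf{x}_i - \mathbf{a}\|_2^2\right)
    - \sum_{i=1}^{n}\gamma_i\xi_i

% Stationarity conditions
\frac{\partial L}{\partial R} = 0 \;\Rightarrow\; \sum_{i=1}^{n}\alpha_i = 1, \qquad
\frac{\partial L}{\partial \mathbf{a}} = 0 \;\Rightarrow\; \mathbf{a} = \sum_{i=1}^{n}\alpha_i\mathbf{x}_i, \qquad
\frac{\partial L}{\partial \xi_i} = 0 \;\Rightarrow\; C - \alpha_i - \gamma_i = 0

% Resulting dual problem
\max_{\boldsymbol{\alpha}} \; \sum_{i=1}^{n}\alpha_i\,\mathbf{x}_i^T\mathbf{x}_i
    - \sum_{i=1}^{n}\sum_{j=1}^{n}\alpha_i\alpha_j\,\mathbf{x}_i^T\mathbf{x}_j
\quad \text{s.t.} \quad 0 \le \alpha_i \le C, \;\; \sum_{i=1}^{n}\alpha_i = 1
```

Under this sketch, a test sample $\mathbf{z}$ would be assigned to the target class when $\|\mathbf{z} - \mathbf{a}\|_2^2 \le R^2$. For S-SVDD, the same expressions apply with $\mathbf{x}_i$ replaced by the projected vectors $\mathbf{Q}\mathbf{x}_i$.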
3. PROPOSED SYSTEM
Our work aims to enable a move from fully manual taxa identification of benthic macroinvertebrates to a semi-automated approach, where a trained machine learning model can handle most of the specimens, while the human experts can concentrate on difficult and potentially most interesting cases. As our starting point, we assume the typical scenario where we have a trained deep neural network model that gives a satisfactory overall accuracy, while it fails to correctly identify specimens from rare species, which have biological/environmental importance. We propose a mechanism that can be used together with the deep neural network to pinpoint specimens that potentially belong to the rare species for human expert inspection.

The proposed general framework is shown in Fig. 1. In the first phase, collected macroinvertebrate samples are imaged and the images are preprocessed as needed (e.g., normalization, resizing). The images are fed to a trained deep neural network for initial identification. Features extracted from the second-to-last layer of the network are projected to a lower dimensionality with Principal Component Analysis (PCA) to make the one-class models smaller and more focused on the key features. The PCA-processed features are classified using a one-class classifier. Finally, the specimens which are classified to the target class are re-identified by a human expert, while otherwise the initial CNN identification is used in the subsequent biological assessment of the results. Note that the experts use the actual specimens with a microscopic analysis, while the machine learning components rely on images and features extracted from the images.

Fig. 1. The proposed taxa identification pipeline. Specimens -> Imaging and preprocessing -> Images -> CNN identification (initial labels) -> Features -> PCA -> Lower-dimensional features -> One-class classification. Samples labeled as target go to manual identification (expert labels update the initial labels); samples labeled as outlier keep the initial labels. The resulting labels are used in the biological assessment.

As one-class classifiers use only target class data for training, they may not be able to accurately distinguish unseen target samples from outliers that are highly similar to the target class. However, this may even be a benefit in our application. Trying to separate target samples from very similar outliers is naturally error-prone. Therefore, it is better to also direct these unclear cases to expert identification instead of trying to make the model as accurate as possible. In general, our goal is to detect as many samples from the target class as possible with the minimum number of overall samples that require manual identification. However, it is not straightforward to evaluate the performance of different one-class classifiers on the given task. How much human effort is acceptable to maximize the number of detected target class specimens may vary depending on the biomonitoring goals and the importance of the target class.
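The final routing step of the pipeline can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function name `route_samples`, the `ask_expert` callback, and the toy labels `taxon_A` and `taxon_B` are hypothetical, and the one-class decisions are assumed to be available as booleans.

```python
from typing import Callable, List, Sequence, Tuple


def route_samples(
    cnn_labels: Sequence[str],
    flagged_as_target: Sequence[bool],
    ask_expert: Callable[[int], str],
) -> List[Tuple[str, str]]:
    """Return (final_label, source) for every sample.

    Samples that the one-class classifier labels as target are sent to the
    expert, whose label replaces the initial one. Samples labeled as outlier
    keep the initial CNN label.
    """
    if len(cnn_labels) != len(flagged_as_target):
        raise ValueError("inputs must have the same length")
    final = []
    for index, (label, flagged) in enumerate(zip(cnn_labels, flagged_as_target)):
        if flagged:
            final.append((ask_expert(index), "expert"))
        else:
            final.append((label, "cnn"))
    return final


# Toy usage: the dictionary lookup stands in for manual identification.
expert_answers = {1: "Capnopsis schilleri"}
result = route_samples(
    ["taxon_A", "taxon_A", "taxon_B"],
    [False, True, False],
    lambda i: expert_answers[i],
)
```

Only the second sample reaches the expert in this toy run, which mirrors the goal of keeping the manually inspected subset small.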
4. EXPERIMENTAL SETUP AND RESULTS
4.1. Dataset
We used the FIN-Benthic2 dataset [5] in our experiments. The dataset is publicly available and consists of 460004 images of 9631 benthic macroinvertebrate specimens belonging to 39 different taxa. The number of images per taxon varies from 490 to 44240, making the dataset very imbalanced. The images are of varying size and in PNG format. FIN-Benthic2 provides 10 different data splits for training, validation, and testing. Each split has been formed so that the images of a single specimen (max 50) are in the same set (train/validation/test). In this paper, we consider only image-based identification and we leave it for future work to investigate how to exploit the fact that we actually have several images corresponding to the same specimen. We used Split 1 as our data splitting.

For one-class classification, we selected three different taxa, Capnopsis schilleri, Nemoura cinerea, and Leuctra nigra, as our target classes. Each of these taxa is rare and VGG16 has poor performance on them. The target classes were selected as a proof-of-concept, not based on their environmental importance. The image numbers for the selected classes are shown in Table 1.
Table 1. Image numbers in Split 1 of the FIN-Benthic2 dataset

                      Train   Validation   Test
Capnopsis schilleri     600          100    350
Nemoura cinerea         650          100     50
Leuctra nigra
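The constraint that all images of one specimen stay in the same set can be illustrated with a short sketch. FIN-Benthic2 ships its own predefined splits, so the code below is not how those splits were produced: the function name `split_by_specimen` and the split fractions are assumptions chosen only to show the grouping idea.

```python
import random
from typing import Dict, List, Tuple


def split_by_specimen(
    image_to_specimen: Dict[str, str],
    fractions: Tuple[float, float] = (0.7, 0.1),  # hypothetical train/validation shares
    seed: int = 0,
) -> Dict[str, List[str]]:
    """Assign whole specimens (not single images) to train/validation/test."""
    specimens = sorted(set(image_to_specimen.values()))
    random.Random(seed).shuffle(specimens)

    n_train = int(fractions[0] * len(specimens))
    n_val = int(fractions[1] * len(specimens))

    # Every specimen gets exactly one split name; the remainder goes to test.
    assignment = {}
    for position, specimen in enumerate(specimens):
        if position < n_train:
            assignment[specimen] = "train"
        elif position < n_train + n_val:
            assignment[specimen] = "validation"
        else:
            assignment[specimen] = "test"

    # Images inherit the split of their specimen.
    splits: Dict[str, List[str]] = {"train": [], "validation": [], "test": []}
    for image, specimen in image_to_specimen.items():
        splits[assignment[specimen]].append(image)
    return splits
```

Splitting at the specimen level avoids near-duplicate views of the same individual appearing in both training and test data, which would otherwise inflate the measured accuracy.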
As our base model, we fine-tuned a VGG16 network [20] pre-trained on ImageNet using the FIN-Benthic2 dataset. To make VGG16 suitable for our task, we added two dense layers on top of the VGG16 convolutional output. The first added layer is composed of 4096 neurons with ReLU activation. The second added layer is the output layer composed of 39 neurons using softmax activation. We also added two dropout layers on top of the mentioned dense layers to avoid overfitting. The dropout rate was set to 40 percent. We fine-tuned the whole network for 50 epochs using Stochastic Gradient Descent with a learning rate of 0.007 and selected the final network based on the validation set accuracy. As the original images are of varying size, we first scaled them to 64x64. The overall accuracy of the network on the test set was 0.872. This is similar to earlier published results [5], although we did not concentrate on optimizing this step in this work.

We extracted the output of the second-to-last VGG16 layer (i.e., 4096-d) for further analysis and first applied PCA on it. We used only the target class training samples to obtain the PCA mapping and then applied this mapping to all the remaining data. We kept the first 100 principal components as our final feature vectors used for training and testing the one-class classifiers. Finally, we trained different one-class classifiers (separate models for each target species) using feature vectors of the training images of the target species. The hyper-parameters were optimized using the validation set. At the end, we tested the models with the full test set, where all the images not belonging to the target class were considered as outliers. The one-class classifiers considered were OC-SVM, SVDD, S-SVDD, S-SVDDr1, and S-SVDDr2 (see Section 2.2). We used both linear and non-linear (kernel) versions. For the kernel version, we used the RBF kernel, i.e., $K_{ij} = \exp\left(-\frac{\|\mathbf{x}_i - \mathbf{x}_j\|_2^2}{2\sigma^2}\right)$, where $\sigma$ is an additional hyper-parameter. The hyper-parameters $C$, $d$, $\eta$, $\beta$, and $\sigma$ were selected from predefined sets of candidate values (five values for $C$, nine for $d$, four for $\eta$, five for $\beta$, and seven for $\sigma$).

We report our results in terms of four different criteria. True Positive Rate (TPR) is the fraction of target class samples that are correctly classified. Geometric Mean (GM) is the square root of the product of TPR and True Negative Rate (TNR). GM reflects both the ability of the model to detect target class samples and its ability to keep the overall amount of samples to be manually identified low. Therefore, it was used as our main performance measure and also for optimizing the hyper-parameters. Furthermore, we report the total number of correctly identified target samples, i.e., True Positives (TP), and the total number of samples needing manual identification, i.e., True Positives and False Positives (TP+FP).

Table 2. One-class classifier results for different target species

                Capnopsis schilleri          Nemoura cinerea              Leuctra nigra
Method          TPR    GM     TP   TP+FP     TPR    GM     TP   TP+FP     TPR    GM     TP   TP+FP
CNN classification
VGG16           0.046  0.214   16    101     0.020  0.141    1     39     0.170  0.412   34    174
Linear one-class classification
OC-SVM          0.906  0.613  317  54367     0.660  0.357   33  74739     0.625  0.437  125  64304
SVDD            0.346  0.586  121    701     0.280  0.525   14   1422     0.730  0.832  146   4860
S-SVDD          0.557  0.740  195   1893     0.480  0.676   24   4385     0.805  0.838  161  11910
S-SVDDr1        0.609  0.773  213   1977     0.340  0.567   17   5209     0.805  0.837  161  12103
S-SVDDr2        0.706         247   3573     0.560          28  11178     0.855         171   9625
Non-linear one-class classification
OC-SVM          0.034  0.185   12     87     0.000  0.000    0     51     0.220  0.469   44    102
SVDD            0.331  0.574  116    658     0.300  0.543   15   1441     0.730  0.832  146   4904
S-SVDD          0.503  0.705  176   1169     0.440  0.649   22   3890     0.815  0.853  163  10085
S-SVDDr1        0.540  0.730  189   1404     0.400  0.622   20   3138     0.780  0.854  156   6221
S-SVDDr2        1.000  0.000  350  92685     0.220  0.465   11   1762     0.995  0.003  199  92683
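The four reported criteria follow directly from the confusion counts. The sketch below shows one way to compute them; the function name `one_class_metrics` and the boolean encoding (True for target, False for outlier) are assumptions made for this illustration, and the toy labels are not taken from the experiments.

```python
import math
from typing import Dict, Sequence


def one_class_metrics(
    is_target_true: Sequence[bool],
    is_target_pred: Sequence[bool],
) -> Dict[str, float]:
    """Compute TPR, GM, TP and TP+FP for a one-class classifier."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(is_target_true, is_target_pred):
        if truth and pred:
            tp += 1
        elif truth and not pred:
            fn += 1
        elif not truth and pred:
            fp += 1
        else:
            tn += 1

    tpr = tp / (tp + fn) if (tp + fn) else 0.0  # detected share of target samples
    tnr = tn / (tn + fp) if (tn + fp) else 0.0  # share of outliers left to the CNN
    return {
        "TPR": tpr,
        "TNR": tnr,
        "GM": math.sqrt(tpr * tnr),
        "TP": tp,
        "TP+FP": tp + fp,  # images that would need manual identification
    }


# Toy example: 4 target samples and 6 outliers.
truth = [True] * 4 + [False] * 6
pred = [True, True, True, False, True, False, False, False, False, False]
metrics = one_class_metrics(truth, pred)
```

Because GM multiplies TPR and TNR, a model that flags everything as target gets a GM of zero even though its TPR is one.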
We give the experimental results in Table 2. We see that one-class classifiers, using the same features as VGG16, can indeed detect samples from rare species much better than the deep network with a reasonable overhead (TP+FP). Here, it should be remembered that up to 50 images can represent the same specimen and, therefore, the actual number of specimens needing manual inspection may be significantly smaller than the reported number of images. The best one-class classifier in terms of GM is the linear S-SVDDr2 model.
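The remark about images versus specimens can be made concrete with a small sketch that counts how many distinct specimens stand behind a set of flagged images. The function name `specimens_needing_inspection` and the toy identifiers are hypothetical, and the sketch assumes that a specimen identifier is available for every image.

```python
from typing import Hashable, Sequence, Set


def specimens_needing_inspection(
    specimen_ids: Sequence[Hashable],
    flagged: Sequence[bool],
) -> Set[Hashable]:
    """Return the distinct specimens with at least one flagged image."""
    if len(specimen_ids) != len(flagged):
        raise ValueError("inputs must have the same length")
    return {sid for sid, is_flagged in zip(specimen_ids, flagged) if is_flagged}


# Toy data: six images of three specimens, three of the images flagged.
ids = ["s1", "s1", "s1", "s2", "s2", "s3"]
flags = [True, True, False, False, False, True]
to_inspect = specimens_needing_inspection(ids, flags)
print(len(to_inspect), "specimens behind", sum(flags), "flagged images")
```

In this toy case three flagged images correspond to only two physical specimens, so the expert workload is lower than the image count suggests.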
5. CONCLUSION AND FUTURE WORK
We proposed a taxa identification framework, where specimens potentially representing rare species are directed for human expert inspection. We showed that one-class classifiers can complement a deep neural network with high overall classification accuracy in a way that allows dividing the tasks between machine and human expert. This supports moving from fully manual to semi-automated taxa identification in biomonitoring. The best one-class classification model in terms of Geometric Mean was the regularized linear Subspace Support Vector Data Description.

In this paper, we considered images separately, while we actually have multiple images of a single specimen. In our future work, we will consider how to exploit this information. For example, we may require a certain fraction of images to be classified as target class to assign the specimen for human inspection, or we may use multi-modal one-class classifiers, e.g., [21], by considering each image as a separate modality. We will experiment on how to use classification confidences of both the CNN and one-class classifiers to further reduce the number of samples requiring human inspection. We will also experiment with different classifier types, such as class-specific classifiers, in our general identification framework.

6. REFERENCES

[1] EU Water Framework Directive (WFD), “Directive 2000/60/EC,” Journal of the European Communities, vol. L327/1, pp. 1–72, 2000.
[2] D. Buchner, A.J. Beermann, A. Laini, P. Rolauffs, S. Vitecek, D. Hering, and F. Leese, “Analysis of 13,312 benthic invertebrate samples from German streams reveals minor deviations in ecological status class between abundance and presence/absence data,” PloS one, vol. 14, no. 12, 2019.
[3] Y. Sun, Y. Takemon, and Y. Yamashiki, “Freshwater spring indicator taxa of benthic invertebrates,” Ecohydrology & Hydrobiology, 2019.
[4] S. Poikane, R.K. Johnson, L. Sandin, A.K. Schartau, et al., “Benthic macroinvertebrates in lake ecological assessment: a review of methods, intercalibration and practical recommendations,” Science of the total environment, vol. 543, pp. 123–134, 2016.
[5] J. Ärje, J. Raitoharju, A. Iosifidis, V. Tirronen, K. Meissner, M. Gabbouj, S. Kiranyaz, and S. Kärkkäinen, “Human experts vs. machines in taxa recognition,” arXiv preprint arXiv:1708.06899v4, 2019.
[6] J. Raitoharju and K. Meissner, “On confidences and their use in (semi-)automatic multi-image taxa identification,” in IEEE Symposium Series on Computational Intelligence, 2019.
[7] A. Aparecido dos Santos and W. Nunes Gonçalves, “Improving Pantanal fish species recognition through taxonomic ranks in convolutional neural networks,” Ecological Informatics, vol. 53, pp. 100977, 2019.
[8] N.S. Wyniawskyj, M. Napiorkowska, D. Petit, P. Podder, and P. Marti, “Forest monitoring in Guatemala using satellite imagery and deep learning,” in IEEE International Geoscience and Remote Sensing Symposium, 2019, pp. 6598–6601.
[9] J. Ärje, D. Milioris, D.T. Tran, A. Iosifidis, J. Raitoharju, M. Gabbouj, J.U. Jepsen, and T.T. Høye, “Automatic flower detection and classification system using a light-weight convolutional neural network,” in EUSIPCO workshop on Signal Processing, Computer Vision and Deep Learning for Autonomous Systems, 2019.
[10] J. Raitoharju, E. Riabchenko, I. Ahmad, A. Iosifidis, M. Gabbouj, S. Kiranyaz, V. Tirronen, J. Ärje, S. Kärkkäinen, and K. Meissner, “Benchmark database for fine-grained image classification of benthic macroinvertebrates,” Image and Vision Computing, vol. 78, pp. 73–83, 2018.
[11] M.G. Kelly, S.C. Schneider, and L. King, “Customs, habits, and traditions: the role of nonscientific factors in the development of ecological assessment methods,” Wiley Interdisciplinary Reviews: Water, vol. 2, no. 3, pp. 159–165, 2015.
[12] D.A. Lytle, G. Martinez-Munoz, W. Zhang, N. Larios, L. Shapiro, R. Paasch, A. Moldenke, E.N. Mortensen, S. Todorovic, and T.G. Dietterich, “Automated processing and identification of benthic invertebrate samples,” Journal of the North American Benthological Society, vol. 29, no. 3, pp. 867–874, 2010.
[13] S. Kiranyaz, T. Ince, J. Pulkkinen, M. Gabbouj, J. Ärje, S. Kärkkäinen, V. Tirronen, M. Juhola, T. Turpeinen, and K. Meissner, “Classification and retrieval on macroinvertebrate image databases,” Computers in biology and medicine, vol. 41, no. 7, pp. 463–472, 2011.
[14] J. Ärje, C. Melvad, M. Rosenhøj Jeppesen, S. Agerskov Madsen, J. Raitoharju, M. Strandgård Rasmussen, A. Iosifidis, V. Tirronen, K. Meissner, M. Gabbouj, and T.T. Høye, “Automatic image-based identification and biomass estimation of invertebrates,” arXiv preprint arXiv:2002.03807, 2020.
[15] J. Raitoharju, E. Riabchenko, K. Meissner, I. Ahmad, A. Iosifidis, M. Gabbouj, and S. Kiranyaz, “Data enrichment in fine-grained classification of aquatic macroinvertebrates,” in ICPR Workshop on Computer Vision for Analysis of Underwater Imagery, 2016, pp. 43–48.
[16] C. Huang, Y. Li, C. Change Loy, and X. Tang, “Learning deep representation for imbalanced classification,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016.
[17] B. Schölkopf, R.C. Williamson, A. Smola, and J. Shawe-Taylor, “SV estimation of a distribution’s support,” 1999.
[18] D.M.J. Tax and R.P.W. Duin, “Support vector data description,” Machine learning, vol. 54, no. 1, pp. 45–66, 2004.
[19] F. Sohrab, J. Raitoharju, M. Gabbouj, and A. Iosifidis, “Subspace support vector data description,” in International Conference on Pattern Recognition, 2018, pp. 722–727.
[20] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” in International Conference on Learning Representations, 2015.
[21] F. Sohrab, J. Raitoharju, A. Iosifidis, and M. Gabbouj, “Multimodal subspace support vector data description,” arXiv preprint arXiv:1904.07698.