Limnology and Oceanography: Methods | 2021
Automatic plankton image classification—Can capsules and filters help cope with data set shift?
Abstract
The general task of image classification seems to be solved due to the development of modern convolutional neural networks (CNNs). However, the high intraclass variability and interclass similarity of plankton images still prevent the practical identification of morphologically similar organisms. This holds especially for rare organisms. Every CNN requires a vast number of manually validated training images, which makes it inefficient to train study-specific classifiers. In most follow-up studies, the plankton community differs from the one used for training, and this data set shift (DSS) reduces the correct classification rates. A common solution is to discard all uncertain images and hope that the remainder still resembles the true field situation. The intention of this North Sea Video Plankton Recorder (VPR) study is, first, to assess whether a combination of a Capsule Neural Network (CapsNet) with probability filters can improve the classification success in applications with DSS and, second, to provide a guideline on how to customize automated CNN and CapsNet deep learning image analysis methods according to specific research objectives. In community analyses, our approach discarded only 5% of the predictions as uncertain. CapsNet and CNN reach similar precision scores, but the CapsNet has lower recall scores despite similar discard ratios. This is due to a higher discard ratio in rare classes. The recall advantage of the CNN decreases with increasing DSS. We present an alternative method to handle rare classes with a CNN, achieving a mean recall of 96% by manually validating an average of 6.5% of the original images.

State-of-the-art sampling with towed optical devices provides anthropocenic marine planktologists with a wealth of data that even their most recent ancestors could only have dreamed of. Old-school planktologists had to spend hours sitting over the microscope hand-sorting net samples.
They were rewarded with snapshots of plankton communities in space and time at the highest possible taxonomic level, sometimes even down to ontogenetic life stages, sex, and clutch sizes (Hansson et al. 1990; Ston et al. 2002; Johansson et al. 2004; Vuorio et al. 2005; Renz and Hirche 2006; Peters et al. 2013). Modern plankton sampling devices provide information from the other ends of these scales: millions of images at a spatiotemporal resolution of centimeters and seconds (Davis et al. 1992; Wiebe and Benfield 2003; Benfield et al. 2007), sampled continuously over transects tens (Floeter et al. 2017) or even hundreds (Davis and McGillicuddy 2006) of nautical miles long. The automatic plankton image classification that this volume of data makes necessary has followed the trends in machine learning, from Support Vector Machines (SVMs; Hu and Davis 2005; Sosik and Olson 2007) through Neural Networks (NNs; Tang and Stewart 1996) to modern Random Forests (Bell and Hopcroft 2008; Orenstein et al. 2015; Faillettaz et al. 2016) and convolutional neural networks (CNNs; LeCun et al. 2015; Krizhevsky et al. 2017), though manually engineered features, such as those used with SVMs, are still relatively common (e.g., Nanni et al. 2019). Since 2015, when the Microsoft Research Asia team (He et al. 2015) won the annual ImageNet challenge (Russakovsky et al. 2015) by reaching an accuracy of 96.4% in classifying high-resolution color images into 1000 different categories, image classification has seemed to be solved (Chollet 2017). At first sight, plankton images are no exception, because recent efforts have resulted in >90% average classification accuracy (Al-Barazanchi et al. 2016; Luo et al. 2018). However, the taxonomic resolution is also almost always diametrically opposed to the increasing scales, providing
Additional Supporting Information may be found in the online version of this article.
This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.