Anomalous Sound Detection with Machine Learning: A Systematic Review
January 2021
Eduardo Carvalho NUNES, ALGORITMI Centre, Department of Information Systems, University of Minho, Guimarães, Portugal
Abstract.
Anomalous sound detection (ASD) is the task of identifying whether the sound emitted by an object is normal or anomalous. In some cases, early detection of such an anomaly can prevent several problems. This article presents a Systematic Review (SR) of studies related to anomalous sound detection using Machine Learning (ML) techniques. The SR was conducted over a selection of 31 accepted studies published in journals and conference proceedings between 2010 and 2020. The state of the art was addressed, collecting the datasets, audio feature extraction methods, ML models, and evaluation methods used for ASD. The results showed that the ToyADMOS, MIMII, and Mivia datasets, the Mel-frequency cepstral coefficients (MFCC) feature extraction method, the Autoencoder (AE) and Convolutional Neural Network (CNN) ML models, and the AUC and F1-score evaluation methods were the most cited.
Keywords.
Anomalous Sound Detection, Machine Learning, Systematic Review
1. Introduction
Anomalous Sound Detection (ASD) has received a lot of attention from the machine learning community in recent years [1]. An anomaly in sound can indicate an error or defect, and detecting the anomaly early can avoid a series of problems in applications such as predictive maintenance of industrial equipment and audio surveillance of roads [2,3].

Anomaly detection techniques can be categorized as supervised, semi-supervised, and unsupervised. Supervised anomaly detection requires the entire dataset to be labeled "normal" or "abnormal", making it essentially a binary classification task. Semi-supervised anomaly detection requires only the data considered "normal" to be labeled; the model learns what "normal" data look like. Unsupervised anomaly detection involves unlabeled data; the model learns which data are "normal" and which are "abnormal" [4].

This paper presents a systematic review (SR) aiming to verify the state of the art in audio anomaly detection using machine learning techniques. Additionally, it analyzes which datasets, audio feature extraction methods, ML models, and evaluation methods were most used in the accepted primary studies. Thus, this survey enables a general analysis of the scope of the work.

In addition to this introductory section, the paper is organized as follows. The Research Methodology section presents the concept of SR, the protocol defined and used, and the process of conducting the review. The Results and Discussion section presents and discusses the results.

Corresponding Author: Eduardo Carvalho Nunes, ALGORITMI Centre, Department of Information Systems, University of Minho, 4804-533 Guimarães, Portugal; E-mail: [email protected]
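The semi-supervised setting described above can be illustrated with a minimal, hypothetical detector: fit simple statistics (mean and standard deviation of a frame-level RMS energy feature) on "normal" recordings only, then flag frames whose energy deviates beyond a threshold. The feature, the threshold rule, and the toy signals below are illustrative assumptions, not a method taken from any of the reviewed studies.

```python
import math

def rms(frame):
    """Root-mean-square energy of one audio frame (list of samples)."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def fit_normal_model(normal_frames):
    """Learn mean/std of RMS energy from 'normal' frames only."""
    energies = [rms(f) for f in normal_frames]
    mean = sum(energies) / len(energies)
    var = sum((e - mean) ** 2 for e in energies) / len(energies)
    return mean, math.sqrt(var)

def is_anomalous(frame, model, k=3.0):
    """Flag a frame whose energy is more than k standard deviations from normal."""
    mean, std = model
    return abs(rms(frame) - mean) > k * std

# Toy data: 'normal' sinusoids of slightly varying amplitude vs. one loud frame.
normal = [[(0.1 + 0.005 * j) * math.sin(0.2 * i) for i in range(256)]
          for j in range(20)]
model = fit_normal_model(normal)
loud = [2.0 * math.sin(0.2 * i) for i in range(256)]
print(is_anomalous(normal[0], model), is_anomalous(loud, model))  # False True
```

In practice, the reviewed studies replace the RMS feature with richer representations (MFCC, log-mel energies) and the threshold rule with learned models such as autoencoders or one-class SVMs, but the train-on-normal-only principle is the same.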
2. Research Methodology
Unlike traditional literature reviews, an SR is a rigorous and reliable research method that aims to select relevant research, collect and analyze data, and allow evaluation [5]. Following Kitchenham's suggestion, this paper was developed considering three phases: planning, execution, and analysis of results (Figure 1) [6]. In the planning phase, a protocol is defined specifying the research questions, keywords, inclusion and exclusion criteria for primary studies, and other topics of interest. In the execution phase, the bibliographic research is conducted following the defined protocol; it is in this phase that the inclusion and exclusion of primary studies is done. Finally, in the results analysis phase, the data are extracted and the results are compared.
Figure 1.
SR phases, adapted from [7].
First, a detailed protocol was designed to describe the process and method to be applied in this SR (Table 1). This protocol contains: objective, main question, keywords and synonyms, study language, source search methods, study selection criteria, source list, quality form fields, and data extraction form fields.
The searches were carried out between November and December 2020. Only recent studies (published since 2010) were considered for assessing the state of the art. The primary studies returned from the electronic databases were identified through the search string:

("anomalous sound detection" OR "detecting anomalous audio" OR "detection of anomalous sound" OR "anomaly detection") AND ("machine learning" OR "supervised anomaly detection" OR "semi-supervised anomaly detection" OR "unsupervised anomaly detection") AND ("sound" OR "audio")
Table 1.
Defined Protocol for this SR.
Objective
This Systematic Literature Review Protocol (SLRP) presents the methodological structure for the implementation of the literature review stage on audio anomaly detection with machine learning techniques.
Main Question
What machine learning techniques are used for audio anomaly detection?
Keywords and Synonyms
Anomalous Sound Detection; Anomaly Detection; Detecting Anomalous Audio; Detection of Anomalous Sounds; Machine Learning; Self-Supervised Anomaly Detection; Semi-Supervised Anomaly Detection; Unsupervised Anomaly Detection; Audio; Sound.
Study Language
English.
Sources Search Methods
The sources should be available via the web, preferably in scientific databases in the area (computer science, computing, electronics). In addition to traditional databases, others can be included according to the results found. Primary studies in other media can also be selected, as long as they meet the requirements of the SR. This process will be carried out by means of searches composed of keywords. Primary studies will be found from searches carried out on search portals for articles, theses, dissertations, and journals. During the information retrieval procedure, the strings found preferably in the Titles, Abstracts, and Keywords of each database will be considered. After checking the relevance of each work, it will be selected for full-text reading. Primary studies will then be accepted or rejected. There will be Inclusion (I) and Exclusion (E) criteria for each primary study analyzed.
Study Selection Criteria
Inclusion: detects anomalies in audio; uses a machine learning technique; the primary study is written in English.
Exclusion: does not detect anomalies in audio; does not use a machine learning technique; is not written in English; full paper not found; does not present an abstract; has a publication year outside the specified period (i.e., earlier than 2010).
Sources
In addition to the sources below, a search was made for papers in the research community on Detection and Classification of Acoustic Scenes and Events (DCASE).
ACM : http://portal.acm.org
IEEE : https://ieeexplore.ieee.org/Xplore/home.jsp/
SCOPUS
DCASE2020 : http://dcase.community/
Quality Form Fields
Textual coherence and cohesion; uses a machine learning technique in an objective way; machine learning techniques are cited.
Extraction Form Fields
Which machine learning category? Which anomaly detection category? Which dataset was used? Which programming language was used? Which libraries or frameworks were used?
The selection process for primary studies is illustrated in Figure 2. In the first step, 3150 primary studies were identified. In the second step, the titles and abstracts were read and the inclusion and exclusion criteria were applied; 109 studies were accepted, 3002 were rejected, and 38 were duplicates. In the third step, the introductions and conclusions were read and the inclusion and exclusion criteria were applied again; 34 studies were accepted, 72 were rejected, and 3 were duplicates. In the fourth step, 25 studies were fully read and 25 studies were accepted. After completing the selection of studies, it was noted that DCASE was widely cited. Therefore, a manual search was made and 7 more studies were accepted.
Figure 2.
Selection Process of Primary Studies.
This phase consists of a review and extraction of information. Table 2 shows the number of primary studies collected in each indexed database. It is important to note that Scopus covers some results from the ACM and IEEE. For each primary study, a summary was written with the main study topics.
Table 2.
Number of studies obtained in the indexed databases.
Source | Nº of Studies | Accepted - Selection Phase | Accepted - Extraction Phase
ACM | 376 | 10 (2.65%) | 2 (20%)
IEEE | 1100 | 25 (2.27%) | 3 (12%)
SCOPUS | 1674 | 74 (4.42%) | 19 (25.67%)
DCASE | 49 | 7 (14.28%) | 7 (100%)
Total | 3199 | 116 (3.62%) | 31 (26.72%)
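The totals and acceptance rates in Table 2 can be reproduced with a short sketch; the per-source counts below are copied directly from the table.

```python
# Per-source counts from Table 2: (retrieved, accepted in selection, accepted in extraction)
sources = {
    "ACM": (376, 10, 2),
    "IEEE": (1100, 25, 3),
    "SCOPUS": (1674, 74, 19),
    "DCASE": (49, 7, 7),
}

total_retrieved = sum(r for r, _, _ in sources.values())
total_selected = sum(s for _, s, _ in sources.values())
total_extracted = sum(e for _, _, e in sources.values())

for name, (retrieved, selected, extracted) in sources.items():
    # Selection rate is relative to retrieved; extraction rate relative to selected.
    print(f"{name}: {100 * selected / retrieved:.2f}% selected, "
          f"{100 * extracted / selected:.2f}% extracted")

print(total_retrieved, total_selected, total_extracted)  # 3199 116 31
```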
In the selection phase, 116 studies were selected for the next phase. After completing the selection phase, 31 studies were selected for the extraction phase. It is important to note that the selection of the DCASE-related studies was done manually, and all of those selected were accepted for the extraction phase. The main topics of interest were: machine learning technique, anomaly type, dataset, audio feature extraction method, anomaly detection model, and machine learning model evaluation method. The main results of the SR are described in the results section.
The StArt tool (State of the Art through Systematic Review) was used to support the SR process [22]. This tool was created by LaPES (Software Engineering Research Lab, Federal University of São Carlos) and was developed with the purpose of automating the phases of an SR. The tool offers full support for SRs and is divided into: Planning, Execution, and Summary.
3. Results and Discussion
All primary studies were retrieved from scientific journals and conference proceedings. Table 3 lists the primary studies published in journals. In general, the journals affiliated with IEEE provided the most studies, with 5 (50%) primary studies, two of which have the highest impact factors. Table 4 lists the primary studies published in proceedings. As shown in the table, DCASE 2020 contains the most studies, with 7 (33%) primary studies. It is important to highlight that the event held a competition (Task 2) entirely related to the detection of anomalies in audio. Another important observation is that the proceedings affiliated with IEEE had 8 primary studies (38%) accepted in this SR.

The studies were published by 133 different researchers. Table 8 (Appendix B) shows a summary of the researchers responsible for two or more studies. In this table we can see that laboratories from Japan stand out. Figure 4 (Appendix A) shows the total number (per country) of researchers with published studies, highlighting Japan with 38 (28%) researchers and Italy with 29 (22%) researchers.

Figure 3 shows the evolution of the research area in relation to the number of published studies. According to this SR, the first study on this theme was published only in 2014. In 2017 and 2018, there were few published studies; however, in 2019 and 2020 many more studies began to emerge.
Table 3.
Journals that provided the included primary studies.
Journal | Impact Factor | Nº of Studies
IEEE Transactions on Intelligent Transportation Systems (ISSN 1558-0016) | 6.319 | 1
IEEE Transactions on Information Forensics and Security (ISSN 1556-6021) | 6.013 | 1
Engineering Applications of Artificial Intelligence (ISSN 0952-1976) | 4.201 | 2
IEEE Access (ISSN 2169-3536) | 3.745 | 1
IEEE/ACM Transactions on Audio, Speech, and Language Processing (ISSN 2329-9304) | 3.398 | 2
Electronics (ISSN 2079-9292) | 2.412 | 1
Expert Systems (ISSN 1468-0394) | 1.546 | 1
Table 4.
Proceedings that provided the included primary studies.
Proceeding | Nº of Studies
Proceedings of the Fifth Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE 2020) | 7
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 4
IEEE Workshop on Evolving and Adaptive Intelligent Systems (EAIS) | 1
IEEE International Conference on Machine Learning and Applications (ICMLA) | 1
IEEE Workshop on Machine Learning for Signal Processing | 1
IEEE International Conference on Advanced Trends in Information Theory (ATIT) | 1
International Conference on Industrial Engineering and Applications (ICIEA) | 1
International Conference on Soft Computing Models in Industrial and Environmental Applications (SOCO) | 1
European Signal Processing Conference (EUSIPCO) | 1
International Conference on Intelligent Networking and Collaborative Systems (INCOS) | 1
Proceedings of the International Conference on Pattern Recognition and Artificial Intelligence | 1
International Conference on Soft Computing and Machine Intelligence (ISCMI) | 1
Tables 5, 6, and 7 show a synthesis of the 31 primary studies analyzed, including the dataset, audio features, ML model, and evaluation method for each one. A global analysis is presented in this section taking into account the most relevant topics.

Regarding datasets, 9 studies created and used their own dataset, and 22 studies used public and private datasets.
Table 5.
Anomaly Detection in audio presented in the included studies of the SR (Automatic Search).
Study | Year | Dataset | Audio Features | ML Model | Evaluation Method
[23] | 2020 | Mivia Dataset [8] | STFT, MFCC, Mel-Scale | DenseNet-121, MobileNetV2, ResNet-50 | RR, MDR, ER, FPR, Accuracy
[24] | 2020 | ToyADMOS [11], MIMII [12] | Mel-Filterbank | SPIDERnet, AE, Naive MSE, PROTOnet | AUC, ROC, TPR, FPR, F-measure
[25] | 2020 | Mivia Dataset [8] | Audio power, audio harmonicity, total loudness in Bark scale, autocorrelation coefficient, ZCR, log-attack time, temporal centroid, audio spectrum roll-off, audio spectrum spread, audio spectrum centroid, MFCC, audio spectral flatness | One-Class SVM, DNN | Accuracy, F1-score, Precision
[26] | 2020 | Own Dataset | MFCC, Mel filterbank energies | LSTM | Accuracy, F1-score
[27] | 2020 | Own Dataset, Freesound [12] | MFCC, DWT, ZCR, SR, GFCC | SVM, Random Forest, CNN, KNN, Gradient Boosting | Precision, Recall, F1-score, Accuracy, p-value
[28] | 2020 | Own Dataset | Mel-spectrogram | Conv-LSTM AE, CAE | ROC-AUC, F1-score
[29] | 2020 | Own Dataset | Mel-spectrogram | CAE, One-Class SVM | ROC-AUC, F1-score
[30] | 2020 | Mivia Dataset [8] | Gammatonegram images | AReN (CNN) | Accuracy, RR, MDR, ER, FPR
[31] | 2019 | Own Dataset | Mel-spectrogram | Deep AE | ROC-AUC curve
[32] | 2019 | Toy Car Running Dataset [13] | Acoustic time series | AE | AUC
[33] | 2019 | UrbanSound8K [14], TUT Dataset [15] | LPC, MFCC, GFCC | Agglomerative Clustering, BIRCH | Precision, Recall, F1-score, TP, FP, FN
[34] | 2019 | TUT Dataset [15] | Raw audio | WaveNet, CAE | ROC-AUC curve
Table 6.
Continued. Anomaly Detection in audio presented in the included studies of the SR (AutomaticSearch).
Study | Year | Dataset | Audio Features | ML Model | Evaluation Method
[44] | 2019 | DCASE 2018 Task 1 [19], DCASE 2018 Task 2 [20] | FFT, log-mel spectrogram | CNN | F1-score, AUC, mAP, AP, ER
[45] | 2019 | TUT Dataset [15], NAB Data Corpus [21] | Raw data | One-Class SVM, LSTM-AE | Accuracy
[35] | 2019 | Own Dataset | Log-mel energy | Deep AE | AUC
[36] | 2019 | Own Dataset, Effects Library [16] | Log-mel spectrum, MFCC, General Sound, i-vector | WaveNet, AE, BLSTM-AE, AR-LSTM | F1-score
[37] | 2019 | DCASE 2016 Dataset [17] | MFCC | AE, VAE, VAEGAN | AUC, TPR, pAUC
[38] | 2018 | General Sound Effects Library [16] | Log-mel filterbank | WaveNet, AE, BLSTM-AE, AR-LSTM | F1-score
[43] | 2018 | A3FALL [18] | Log mel-energies, DWT | Siamese NN, SVM, One-Class SVM | F1-score, Recall, Precision
[46] | 2018 | TUT Dataset [15] | MFCC | Elliptic Envelope, Isolation Forest | F1-score, ER
[39] | 2017 | Own Dataset | Mel-spectrogram | LSTM-AE | ROC
[40] | 2017 | Own Dataset, UrbanSound8K [14] | Mel-spectrogram, Gammatone filterbanks | KNN | ROC, AUC
[41] | 2016 | Mivia Dataset [8] | MFCC, ZCR, energy ratios in Bark sub-bands, audio spectrum centroid, audio spectrum roll-off, audio spectrum spread | SVM | RR, MDR, ER, FPR, ROC, AUC, Sensitivity
[42] | 2014 | Own Dataset | ZCR, FFT, DWT, MFCC, SD, LPC, LPCC | One-Class SVM | F1-score
Figure 3.
Number of primary studies published per year.

In total, 14 datasets were identified, and the three most cited were ToyADMOS [11], MIMII [12], and Mivia [8]. The ToyADMOS dataset is a machine-operating-sound dataset with approximately 540 hours of normal sound and approximately 12,000 hours of anomalous sound; it was designed for research on detecting audio anomalies in machine operation [11]. The MIMII dataset is a dataset for the investigation and inspection of defective industrial machines; it contains the sounds generated by four types of industrial machines (valves, pumps, fans, and slide rails) [12]. The Mivia dataset is an audio dataset composed of 6,000 events of interest for surveillance (glass breaking, gunshots, and screams) [8].

In ML, features are the independent variables that serve as input to an ML system or model; the model uses features to learn and make predictions. Regarding audio feature extraction, 34 methods were identified. The main ones in the analyzed studies were Mel-frequency cepstral coefficients (MFCC), Log-Mel Energy, and Mel-spectrogram.

To answer the main question of this SR, 33 machine learning techniques for detecting anomalies in audio were identified. However, two stand out: the Autoencoder (AE) and the Convolutional Neural Network (CNN). The most recent studies use transfer learning, an ML method in which a model developed for one task is reused as the starting point for another task. The pre-trained models identified in this study were
DenseNet-121, MobileNetV2, and ResNet-50.

Regarding the evaluation methods, 23 studies (approximately 75%) used
AUC-ROC and F1-score. AUC-ROC is a performance measure for classification problems across decision thresholds: ROC is a probability curve and AUC represents the degree of separability it achieves. The higher the AUC, the better the model is at predicting a particular class. F1-score, the harmonic mean of precision and recall, summarizes the accuracy of an ML model and is widely used in classification, in this case between "normal" and "abnormal" sounds.
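As a concrete illustration of these two metrics, the sketch below computes ROC-AUC (via the rank-statistic formulation, which equals the area under the ROC curve) and F1-score in plain Python for a toy set of anomaly scores; the scores and labels are invented for the example.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def f1_score(labels, predictions):
    """Harmonic mean of precision and recall for binary labels."""
    tp = sum(1 for l, p in zip(labels, predictions) if l == 1 and p == 1)
    fp = sum(1 for l, p in zip(labels, predictions) if l == 0 and p == 1)
    fn = sum(1 for l, p in zip(labels, predictions) if l == 1 and p == 0)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy anomaly scores: label 1 = anomalous, 0 = normal.
labels = [0, 0, 0, 1, 0, 1, 1, 1]
scores = [0.1, 0.3, 0.2, 0.8, 0.4, 0.9, 0.35, 0.7]
preds = [1 if s >= 0.5 else 0 for s in scores]

print(roc_auc(labels, scores))  # 0.9375
print(f1_score(labels, preds))  # about 0.857
```

Note that AUC is threshold-free (it uses the raw scores), while F1 requires committing to a decision threshold first, which is why the reviewed studies often report both.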
Table 7.
Anomaly Detection in audio presented in the included studies of the SR (Manual Search).
Study | Year | Dataset | Audio Features | ML Model | Evaluation Method
[47] | 2020 | ToyADMOS [11], MIMII [12] | Log-mel energies | ResNet | AUC
[48] | 2020 | ToyADMOS [11], MIMII [12] | Gammatone spectrogram | AE | ROC, AUC, pAUC
[49] | 2020 | ToyADMOS [11], MIMII [12] | Spectrogram | AE | AUC, pAUC
[50] | 2020 | ToyADMOS [11], MIMII [12] | Log-mel energies | CNN | AUC, pAUC
[51] | 2020 | ToyADMOS [11], MIMII [12] | Log-mel energies | CAE | ROC, AUC, pAUC
[52] | 2020 | ToyADMOS [11], MIMII [12] | Log-mel energies | ResNet, MobileNetV2, Group MADE | AUC, pAUC
[53] | 2020 | ToyADMOS [11], MIMII [12] | Log-mel energies | CNN, PCA, RLDA, PLDA | AUC
4. Conclusions
This paper presented a systematic review on anomaly detection in audio using machine learning techniques. The main objective of this research was to obtain the state of the art, enabling an organization of ideas and a summarization of information.

In total, 31 studies were selected to study machine learning techniques for anomalous sound detection. After the analysis, 33 machine learning techniques were identified, of which AE and CNN were the most cited. We also analyzed the most-used datasets for anomaly detection, as well as their respective feature extraction methods and the evaluation methods for the machine learning models.

It is hoped that this study, the result of a secondary study, may provide some direction for work and research related to this theme. In particular, the author's interest is in the use of machine learning techniques for in-vehicle anomalous sound detection.
References

[1] Kawaguchi, Y., & Endo, T. (2017, September). How can we detect anomalies from subsampled audio signals? In 2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP) (pp. 1-6). IEEE.
[2] Henze, D., Gorishti, K., Bruegge, B., & Simen, J. P. (2019, December). AudioForesight: A Process Model for Audio Predictive Maintenance in Industrial Environments. In 2019 18th IEEE International Conference on Machine Learning and Applications (ICMLA) (pp. 352-357). IEEE.
[3] Foggia, P., Petkov, N., Saggese, A., Strisciuglio, N., & Vento, M. (2015). Audio surveillance of roads: A system for detecting anomalous sounds. IEEE Transactions on Intelligent Transportation Systems, 17(1), 279-288.
[4] Alla, S., & Adari, S. K. (2019). Beginning Anomaly Detection Using Python-Based Deep Learning. Apress.
[5] Almeida Biolchini, J. C., Mian, P. G., Natali, A. C. C., Conte, T. U., & Travassos, G. H. (2007). Scientific research ontology to support systematic review in software engineering. Advanced Engineering Informatics, 21(2), 133-151.
[6] Brereton, P., Kitchenham, B. A., Budgen, D., Turner, M., & Khalil, M. (2007). Lessons from applying the systematic literature review process within the software engineering domain. Journal of Systems and Software, 80(4), 571-583.
[24] Koizumi, Y., Yasuda, M., Murata, S., Saito, S., Uematsu, H., & Harada, N. (2020, May). SPIDERnet: Attention Network for One-Shot Anomaly Detection in Sounds. In ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 281-285). IEEE.
[25] Rovetta, S., Mnasri, Z., & Masulli, F. (2020, May). Detection of hazardous road events from audio streams: An ensemble outlier detection approach. In 2020 IEEE Conference on Evolving and Adaptive Intelligent Systems (EAIS) (pp. 1-6). IEEE.
[26] Becker, P., Roth, C., Roennau, A., & Dillmann, R. (2020, April). Acoustic Anomaly Detection in Additive Manufacturing with Long Short-Term Memory Neural Networks. In 2020 IEEE 7th International Conference on Industrial Engineering and Applications (ICIEA) (pp. 921-926). IEEE.
[27] Vafeiadis, A., Votis, K., Giakoumis, D., Tzovaras, D., Chen, L., & Hamzaoui, R. (2020). Audio content analysis for unobtrusive event detection in smart homes. Engineering Applications of Artificial Intelligence, 89, 103226.
[28] Bayram, B., Duman, T. B., & Ince, G. (2020). Real time detection of acoustic anomalies in industrial processes using sequential autoencoders. Expert Systems, e12564.
[29] Duman, T. B., Bayram, B., & Ince, G. (2019, May). Acoustic Anomaly Detection Using Convolutional Autoencoders in Industrial Processes. In International Workshop on Soft Computing Models in Industrial and Environmental Applications (pp. 432-442). Springer, Cham.
[30] Greco, A., Petkov, N., Saggese, A., & Vento, M. (2020). AReN: A Deep Learning Approach for Sound Event Recognition using a Brain inspired Representation. IEEE Transactions on Information Forensics and Security.
[31] Henze, D., Gorishti, K., Bruegge, B., & Simen, J. P. (2019, December). AudioForesight: A Process Model for Audio Predictive Maintenance in Industrial Environments. In 2019 18th IEEE International Conference on Machine Learning and Applications (ICMLA) (pp. 352-357). IEEE.
[32] Koizumi, Y., Saito, S., Yamaguchi, M., Murata, S., & Harada, N. (2019, October). Batch uniformization for minimizing maximum anomaly score of DNN-based anomaly detection in sounds. In 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) (pp. 6-10). IEEE.
[33] Janjua, Z. H., Vecchio, M., Antonini, M., & Antonelli, F. (2019). IRESE: An intelligent rare-event detection system using unsupervised learning on the IoT edge. Engineering Applications of Artificial Intelligence, 84, 41-50.
[34] Rushe, E., & Mac Namee, B. (2019, May). Anomaly detection in raw audio using deep autoregressive networks. In ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 3597-3601). IEEE.
[35] Kawaguchi, Y., Tanabe, R., Endo, T., Ichige, K., & Hamada, K. (2019, May). Anomaly detection based on an ensemble of dereverberation and anomalous sound extraction. In ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 865-869). IEEE.
[36] Komatsu, T., Hayashi, T., Kondo, R., Toda, T., & Takeda, K. (2019, May). Scene-dependent Anomalous Acoustic-event Detection Based on Conditional WaveNet and I-vector. In ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 870-874). IEEE.
[37] Koizumi, Y., Saito, S., Uematsu, H., Kawachi, Y., & Harada, N. (2018). Unsupervised detection of anomalous sound based on deep learning and the Neyman-Pearson lemma. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 27(1), 212-224.
[38] Hayashi, T., Komatsu, T., Kondo, R., Toda, T., & Takeda, K. (2018, September). Anomalous sound event detection based on WaveNet. In 2018 26th European Signal Processing Conference (EUSIPCO) (pp. 2494-2498). IEEE.
[39] Kawaguchi, Y., & Endo, T. (2017, September). How can we detect anomalies from subsampled audio signals? In 2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP) (pp. 1-6). IEEE.
[40] Marchegiani, L., & Posner, I. (2017, May). Leveraging the urban soundscape: Auditory perception for smart vehicles. In 2017 IEEE International Conference on Robotics and Automation (ICRA) (pp. 6547-6554). IEEE.
[41] Foggia, P., Petkov, N., Saggese, A., Strisciuglio, N., & Vento, M. (2015). Audio surveillance of roads: A system for detecting anomalous sounds. IEEE Transactions on Intelligent Transportation Systems, 17(1), 279-288.
[42] Aurino, F., Folla, M., Gargiulo, F., Moscato, V., Picariello, A., & Sansone, C. (2014, September). One-class SVM based approach for detecting anomalous audio events. In 2014 International Conference on Intelligent Networking and Collaborative Systems (pp. 145-151). IEEE.
[43] Droghini, D., Vesperini, F., Principi, E., Squartini, S., & Piazza, F. (2018, August). Few-shot siamese neural networks employing audio features for human-fall detection. In Proceedings of the International Conference on Pattern Recognition and Artificial Intelligence (pp. 63-69).
[44] Kong, Q., Xu, Y., Sobieraj, I., Wang, W., & Plumbley, M. D. (2019). Sound event detection and time-frequency segmentation from weakly labelled data. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 27(4), 777-787.
[45] Provotar, O. I., Linder, Y. M., & Veres, M. M. (2019, December). Unsupervised Anomaly Detection in Time Series Using LSTM-Based Autoencoders. In 2019 IEEE International Conference on Advanced Trends in Information Theory (ATIT) (pp. 513-517). IEEE.
[46] Antonini, M., Vecchio, M., Antonelli, F., Ducange, P., & Perera, C. (2018). Smart audio sensors in the internet of things edge for anomaly detection. IEEE Access, 6, 67594-67610.
[47] Primus, P., Haunschmid, V., Praher, P., & Widmer, G. (2020). Anomalous Sound Detection as a Simple Binary Classification Problem with Careful Selection of Proxy Outlier Examples. arXiv preprint arXiv:2011.02949.
[48] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., & Cobos, M. (2020). Anomalous Sound Detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321.
[49] Park, J., & Yoo, S. DCASE 2020 Task 2: Anomalous sound detection using relevant spectral feature and focusing techniques in the unsupervised learning scenario.
[50] Inoue, T., Vinayavekhin, P., Morikuni, S., Wang, S., Trong, T. H., Wood, D., ... & Tachibana, R. (2020). Detection of Anomalous Sounds for Machine Condition Monitoring using Classification Confidence (Vol. 2). Tech. report in DCASE2020 Challenge Task.
[51] Kapka, S. (2020). ID-Conditioned Auto-Encoder for Unsupervised Anomaly Detection. arXiv preprint arXiv:2007.05314.
[52] Giri, R., Tenneti, S. V., Cheng, F., Helwani, K., Isik, U., & Krishnaswamy, A. Self-supervised classification for detecting anomalous sounds.
[53] Wilkinghoff, K. Using Look, Listen, and Learn embeddings for detecting anomalous sounds in machine condition monitoring.

A. Image - Number of authors per year
Figure 4.
Number of authors per year.
B. Table 8 - Researchers of studies
Table 8.
Researchers (author and co-author) and their publications contained in this SR.
Author | Pub | Affiliation | Country
Koizumi, Y. | 4 | NTT Media Intelligence Laboratories | Japan
Saito, S. | 3 | NTT Media Intelligence Laboratories | Japan
Uematsu, H. | 3 | NTT Media Intelligence Laboratories | Japan
Harada, N. | 3 | NTT Media Intelligence Laboratories | Japan
Vafeiadis, A. | 2 | Center for Research and Technology Hellas - Information Technologies Institute | Greece
Votis, K. | 2 | Center for Research and Technology Hellas - Information Technologies Institute | Greece
Tzovaras, D. | 2 | Center for Research and Technology Hellas - Information Technologies Institute | Greece
Saggese, A. | 2 | Department of Information Engineering, Electrical Engineering and Applied Mathematics, University of Salerno | Italy
Vento, M. | 2 | Department of Information Engineering, Electrical Engineering and Applied Mathematics, University of Salerno | Italy
Vecchio, M. | 2 | OpenIoT research unit, FBK CREATE-NET | Italy
Antonini, M. | 2 | OpenIoT research unit, FBK CREATE-NET | Italy
Antonelli, F. | 2 | OpenIoT research unit, FBK CREATE-NET | Italy
Murata, S. | 2 | NTT Media Intelligence Laboratories | Japan
Kawaguchi, Y. | 2 | Research and Development Group, Hitachi | Japan
Endo, T. | 2 | Research and Development Group, Hitachi | Japan
Komatsu, T. | 2 | Data Science Research Laboratories, NEC Corporation | Japan
Hayashi, T. | 2 | Department of Information Science, Nagoya University | Japan
Kondo, R. | 2 | Data Science Research Laboratories, NEC Corporation | Japan
Toda, T. | 2 | Information Technology Center, Nagoya University | Japan
Takeda, K. | 2 | Department of Information Science, Nagoya University | Japan
Petkov, N. | 2 | Faculty of Science and Engineering, University of Groningen | Netherlands
Bayram, B. | 2 | Faculty of Computer and Informatics Engineering, Istanbul Technical University | Turkey
Duman, T.B. | 2 | Faculty of Computer and Informatics Engineering, Istanbul Technical University | Turkey
Ince, G. | 2 | Faculty of Computer and Informatics Engineering, Istanbul Technical University | Turkey
Table 9.
Continuation of Table 8 (part 1).
Author | Pub | Affiliation | Country
Plumbley, M. D. | 1 | Centre for Vision, Speech and Signal Processing, University of Surrey | England
Becker, P. | 1 | FZI Research Center for Information Technology | Germany
Roth, C. | 1 | FZI Research Center for Information Technology | Germany
Roennau, A. | 1 | FZI Research Center for Information Technology | Germany
Dillmann, R. | 1 | FZI Research Center for Information Technology | Germany
Henze, D. | 1 | Technische Universität München | Germany
Gorishti, K. | 1 | Technische Universität München | Germany
Bruegge, B. | 1 | Technische Universität München | Germany
Simen, J.-P. | 1 | Carl Zeiss AG | Germany
Stemmer, G. | 1 | Intel Corp, Intel Labs | Germany
Rushe, E. | 1 | Insight Centre for Data Analytics, University College Dublin | Ireland
Namee, B.M. | 1 | Insight Centre for Data Analytics, University College Dublin | Ireland
Rovetta, S. | 1 | DIBRIS, Università degli Studi di Genova | Italy
Mnasri, Z. | 1 | DIBRIS, Università degli Studi di Genova | Italy
Masulli, F. | 1 | DIBRIS, Università degli Studi di Genova | Italy
Greco, A. | 1 | Department of Information Engineering, Electrical Engineering and Applied Mathematics, University of Salerno | Italy
Janjua, Z.H. | 1 | OpenIoT research unit, FBK CREATE-NET | Italy
Foggia, P. | 1 | Department of Information Engineering, Electrical Engineering and Applied Mathematics, University of Salerno | Italy
Strisciuglio, N. | 1 | Department of Information Engineering, Electrical Engineering and Applied Mathematics, University of Salerno | Italy
Aurino, F. | 1 | Dipartimento di Ingegneria Elettrica e delle Tecnologie dell'Informazione, University of Naples Federico II | Italy
Folla, M. | 1 | Dipartimento di Ingegneria Elettrica e delle Tecnologie dell'Informazione, University of Naples Federico II | Italy
Gargiulo, F. | 1 | Dipartimento di Ingegneria Elettrica e delle Tecnologie dell'Informazione, University of Naples Federico II | Italy
Moscato, V. | 1 | Dipartimento di Ingegneria Elettrica e delle Tecnologie dell'Informazione, University of Naples Federico II | Italy
Picariello, A. | 1 | Dipartimento di Ingegneria Elettrica e delle Tecnologie dell'Informazione, University of Naples Federico II | Italy
Sansone, C. | 1 | Dipartimento di Ingegneria Elettrica e delle Tecnologie dell'Informazione, University of Naples Federico II | Italy
Droghini, D. | 1 | Università Politecnica delle Marche | Italy
Vesperini, F. | 1 | Università Politecnica delle Marche | Italy
Principi, E. | 1 | Università Politecnica delle Marche | Italy
Squartini, S. | 1 | Università Politecnica delle Marche | Italy
Piazza, F. | 1 | Università Politecnica delle Marche | Italy
Ducange, P. | 1 | SMARTEST Research Centre, eCampus University | Italy
Yasuda, M. | 1 | NTT Media Intelligence Laboratories | Japan
Yamaguchi, M. | 1 | NTT Media Intelligence Laboratories | Japan
Tanabe, R. | 1 | Research and Development Group, Hitachi | Japan
Table 10.
Continuation of Table 8 (part 2).
Author | Pub | Affiliation | Country
Ichige, K. | 1 | Research and Development Group, Hitachi | Japan
Hamada, K. | 1 | Research and Development Group, Hitachi | Japan
Inoue, T. | 1 | IBM Research | Japan
Vinayavekhin, P. | 1 | IBM Research | Japan
Morikuni, S. | 1 | IBM Research | Japan
Tachibana, R. | 1 | IBM Research | Japan
Lopez-Meyer, P. | 1 | Intel Corp, Intel Labs | Mexico
Kapka, S. | 1 | Samsung R and D Institute Poland | Poland
Park, J. | 1 | Advanced Robot Research Laboratory, LG Electronics | South Korea
Yoo, S. | 1 | Advanced Robot Research Laboratory, LG Electronics | South Korea
Perez-Castanos, S. | 1 | Visualfy, Benisano | Spain
Naranjo-Alcazar, J. | 1 | Visualfy, Benisano | Spain
Zuccarello, P. | 1 | Visualfy, Benisano | Spain
Cobos, M. | 1 | Universitat de Valencia | Spain
Provotar, O. I. | 1 | Faculty of Computer Science and Cybernetics, Taras Shevchenko National University of Kyiv | Ukraine
Linder, Y. M. | 1 | Faculty of Computer Science and Cybernetics, Taras Shevchenko National University of Kyiv | Ukraine
Veres, M. M. | 1 | Faculty of Computer Science and Cybernetics, Taras Shevchenko National University of Kyiv | Ukraine
Giri, R. | 1 | Amazon Web Services | USA
Tenneti, S. V. | 1 | Amazon Web Services | USA
Cheng, F. | 1 | Amazon Web Services | USA
Helwani, K. | 1 | Amazon Web Services | USA
Isik, U. | 1 | Amazon Web Services | USA
Krishnaswamy, A. | 1 | Amazon Web Services | USA
Wang, S. | 1 | IBM Research | USA
Trong, T. H. | 1 | IBM Research | USA
Wood, D. | 1 | IBM Research | USA
Lopez, J. A. | 1 | Intel Corp, Intel Labs | USA
Lu, H. | 1 | Intel Corp, Intel Labs | USA
Nachman, L. | 1 | Intel Corp, Intel Labs | USA
Huang, J. | 1 | Intel Corp, Intel Labs | USA
Perera, C. | 1 | School of Computer Science and Informatics, Cardiff University | Wales
Primus, P. | 1 | Institute of Computational Perception | Austria
Haunschmid, V. | 1 | Institute of Computational Perception | Austria
Praher, P. | 1 | Software Competence Center Hagenberg GmbH | Austria
Widmer, G. | 1 | LIT Artificial Intelligence Lab, Johannes Kepler University | Austria
Chen, L. | 1 | Faculty of Computing, Engineering and Media, De Montfort University | England
Hamzaoui, R. | 1 | Faculty of Computing, Engineering and Media, De Montfort University | England
Marchegiani, L. | 1 | Oxford Robotics Institute, University of Oxford | England
Papadimitriou, I. | 1 | Center for Research and Technology Hellas - Information Technologies Institute | Greece
Lalas, A. | 1 | Center for Research and Technology Hellas - Information Technologies Institute | Greece
Giakoumis, D. | 1 | Center for Research and Technology Hellas - Information Technologies Institute | Greece
Table 11.
Continuation of Table 8 (part 3).