Deep Learning -- A first Meta-Survey of selected Reviews across Scientific Disciplines and their Research Impact
Jan Egger, Antonio Pepe, Christina Gsaxner, Jianning Li
Institute of Computer Graphics and Vision, Faculty of Computer Science and Biomedical Engineering, Graz University of Technology, Inffeldgasse 16, 8010 Graz, Austria. Department of Oral & Maxillofacial Surgery, Medical University of Graz, Auenbruggerplatz 5/1, 8036 Graz, Styria, Austria. Computer Algorithms for Medicine Laboratory, Graz, Austria.
Corresponding author: [email protected]
Abstract
Deep learning belongs to the field of artificial intelligence, where machines perform tasks that typically require some kind of human intelligence. Deep learning tries to achieve this by mimicking the learning of a human brain. Similar to the basic structure of a brain, which consists of billions of neurons and connections between them, a deep learning algorithm consists of an artificial neural network, which resembles the biological brain structure. Mimicking the learning process of humans with their senses, deep learning networks are fed with (sensory) data, like texts, images, videos or sounds. These networks outperform the state-of-the-art methods in different tasks and, because of this, the whole field has seen exponential growth in recent years. This growth has resulted in well over 10 000 publications per year. For example, the search engine PubMed alone, which covers only a subset of all publications in the medical field, provides over 11 000 results for the search term ‘deep learning’ in Q3 2020, and ~90% of these results are from the last three years. Consequently, a complete overview of the field of deep learning is already impossible to obtain and, in the near future, it may become difficult to obtain an overview of even a single subfield. However, there are several review articles about deep learning that focus on specific scientific fields or applications, for example deep learning advances in computer vision or in specific tasks like object detection. With these surveys as a foundation, the aim of this contribution is to provide a first high-level, categorized meta-analysis of selected reviews on deep learning across different scientific disciplines and to outline the research impact that they have already had within a short period of time.
Keywords: Deep Learning, Artificial Neural Networks, Machine Learning, Data Analysis, Image Analysis, Language Processing, Speech Recognition, Big Data, Detection, Segmentation, Registration, Generative Adversarial Network, Medical Image Analysis, Meta-Survey, Meta-Review.
1. Introduction
Deep learning belongs to the field of artificial intelligence, where machines execute tasks that usually require human intelligence. Deep learning tries to achieve this by mimicking the learning of a human brain. Imitating the physiological structure of a brain, which consists of billions of neurons and connections between them, a deep learning algorithm consists of an artificial neural network of interconnected neurons [1], [2]. Similar to the learning process of humans with their senses, deep neural networks are fed with sensory or sensor data like texts, images, videos or sounds [3]. These networks outperform the state-of-the-art methods in different tasks and, thanks to this, the whole field has seen exponential growth [4]-[6]. This has resulted in well over 10 000 publications per year in recent years. For example, the search engine PubMed alone, which covers only a subset of all publications in the medical field, returns over 11 000 results for the search term deep learning in Q3 2020, and around 90% of these publications are from the last three years. Consequently, a complete overview of the field of deep learning is already impossible to obtain and, in the near future, it will probably become difficult even for single sub-fields. However, there are several review or survey articles about deep learning that focus on specific scientific fields or applications, for example, covering only deep learning approaches from computer vision, or specific tasks like object detection [7]-[9] or object segmentation [10], [11]. With these surveys as a foundation, the aim of this contribution is to provide a first categorized and high-level meta-analysis of selected deep learning reviews and surveys. On the top level, four main categories have been chosen for this contribution, namely: computer vision, natural language processing, medical informatics and additional works.
The reason behind this course of action was to have about the same number of reviews for every main category with a well-balanced distribution. Although the last category could be further divided, this would lead to main categories with a small number of reviews, in some niche fields even only one. Table 1 gives an overview of the four main categories and the number of screened reviews for each of them. Further, it presents the total number of references and citations per category, to provide an impression of how comprehensive and influential the fields are. Subsequently, Tables 2-5 present more details for each main category. The tables present the sub-categories and the corresponding publications, again together with the number of references and citations for each of these categories. Hence, these selected deep learning reviews and surveys across scientific disciplines depict the research impact they have already had within a relatively short time. Note that the deep learning reviews selected for this contribution mostly present an overview of (selected) deep learning works in a specific field themselves and categorize them in sub-sections or areas. Therefore, this course of action is also applied to this meta-survey/review. The reason for this is that deep learning algorithms have often been applied to completely different datasets and modalities, which makes it difficult to combine them in a systematic survey, as can be seen in the referenced reviews.
Search Strategy
For this meta-analysis, a search was performed in IEEE Xplore Digital Library, Scopus, DBLP, PubMed, and Google Scholar for the keyword ‘Deep Learning’ in combination with either of the keywords ‘Review’ or ‘Survey’. Based on titles and abstracts, all records that were not actual review or survey contributions were excluded. This ultimately resulted in a total of 58 review or survey publications about deep learning, which are covered within this meta-survey. In summary, this high-level meta-survey gives a snapshot overview of published deep learning reviews (status as of August 2020), and a compact overview of the search results can be found in Tables 1-5. Note that this meta-survey includes a few preprints. However, some of these already have up to one hundred or even several hundred citations and have hence proven to be of high interest for the community, so it can be expected that they will be published in a peer-reviewed venue sooner or later. These reviews were included because they cover specific and interesting research areas that have not been covered elsewhere yet.
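The keyword combination described above can be sketched as a small script. This is only an illustration: the Boolean query syntax is generic and not the actual syntax of any of the listed databases, and the function name `build_query` is a hypothetical helper that does not appear in the original work.

```python
# Keywords as stated in the search strategy.
PRIMARY = "Deep Learning"
SECONDARY = ["Review", "Survey"]

# Databases named in the text. The query string built below uses a generic
# Boolean syntax (an assumption), not the real syntax of these search engines.
DATABASES = [
    "IEEE Xplore Digital Library",
    "Scopus",
    "DBLP",
    "PubMed",
    "Google Scholar",
]


def build_query(primary, secondary):
    """Combine the primary keyword with any one of the secondary keywords."""
    alternatives = " OR ".join(f'"{term}"' for term in secondary)
    return f'"{primary}" AND ({alternatives})'


query = build_query(PRIMARY, SECONDARY)

# One (database, query) pair per search that would have to be run.
searches = [(database, query) for database in DATABASES]

for database, q in searches:
    print(f"{database}: {q}")
```

The subsequent screening step (excluding records that are not actual reviews based on titles and abstracts) is a manual judgement and is deliberately not modelled here.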
Manuscript Outline
The core of this meta-contribution explores exclusively reviews and surveys on deep learning. Because some of the included reviews cover up to several hundred publications themselves, only high-level summaries and excerpts are given to keep the manuscript concise for the reader. Hence, every review publication is summarized in around 100 to 200 words and, thus, every sub-category has from around 100 up to a few hundred words, depending on the number of review contributions in this area. The classification and arrangement of the presented deep learning reviews should enable the interested reader to dive deeper into specific categories and sub-categories by pointing to the associated publications. The remainder of this meta-review is organized as follows: Section 2 introduces the deep learning reviews or surveys divided into four main categories: computer vision, language processing, medical informatics and additional works. Section 3 concludes and discusses the contributions and outlines areas of future directions. Furthermore, for readers with a particular interest in the medical field, an in-depth systematic meta-review about medical deep learning surveys, which are only partially covered within this contribution, is also available [12].
2. Deep Learning: A survey summary of selected reviews across scientific disciplines
This section presents selected review and survey publications in deep learning. For a better overview, the publications are arranged in four categories: computer vision, language processing, medical informatics, and additional works. The first category introduces deep learning reviews and surveys in the field of computer vision. Among others, this includes publications about object detection, image segmentation, object recognition and a survey about inpainting with generative adversarial networks. The second category presents the deep learning review publications in natural language processing. This covers areas like language understanding but also language generation and answer selection. The third category presents deep learning reviews in the medical field, thus covering reviews about different aspects of medical image processing, medical imaging and computer-aided diagnosis. Finally, the last category of this section closes with additional deep learning reviews in areas like big data, networking, multimedia, agriculture and reviews that cover multiple scientific areas or applications. In line with these sections and sub-sections, Tables 1-5 are divided into the same categories and sub-categories, and also present the current citations for every publication according to Google Scholar (status as of mid-August 2020). Note that the review about data augmentation is listed in the first category about computer vision topics, because it mainly discusses approaches for images. Some review publications would fit in more than one main category. For example, reviews about medical image analysis could also fit into the category computer vision. However, the final assignment and arrangement was also made with the aim of balancing the number of publications per category.
This sub-section deals with the deep learning reviews in the area of computer vision. It is divided into nine sub-categories: object detection, image segmentation, face recognition, action/motion recognition, biometric recognition, image super-resolution, image captioning, data augmentation, and generative adversarial networks.
Object detection is one of the most basic, but also most challenging, problems in computer vision. Object detection deals with the localization of objects from predefined categories, like cats, dogs, etc., in natural images. Li et al. [7] outline more than 300 research contributions in their survey about object detection. In doing so, they cover many general aspects in the field of object detection, including, for example, detection frameworks, but also object feature representation and object proposal generation. In addition, they address context modelling, training strategies, and, eventually, evaluation metrics for object detection. A second review paper in the area of computer vision and object detection is from Zhao et al. [8], which first provides a short introduction to convolutional neural networks, deep learning, and their history. They focus on typical generic object detection architectures and briefly survey various particular tasks. These cover salient object detection, but also pedestrian and face detection. In addition, an experimental analysis is provided, which allows different approaches to be compared and constructive conclusions to be drawn from them. Finally, Jiao et al. [9] review existing approaches of general detection models and additionally introduce a common benchmark dataset for them. The authors also outline a comprehensive and systematic overview of numerous approaches for object detection, which include one-stage and multi-stage object detectors. Furthermore, they list and analyse established, as well as new, applications of object detection and its most representative branches.
Image segmentation is usually the first step in different computer vision applications, like scene understanding, video surveillance, and robotic perception. Further applications include medical image analysis, augmented reality and image compression, to name a few. Garcia-Garcia et al. [10] provide a survey of deep learning approaches for semantic segmentation that can be translated and applied to numerous areas. In doing so, they outline datasets, but also challenges, to guide researchers in deciding which method is most suitable for their needs and aims. Subsequently, the existing approaches and methods are surveyed in the contribution. Additionally, they review common loss functions and error metrics and provide quantitative results for the introduced methods, but also for the datasets that have been used for an evaluation. Minaee et al. [11] present a comprehensive review that covers a wide spectrum of contributions in the area of semantic and instance-level segmentation. This includes fully convolutional pixel-labelling networks and encoder-decoder architectures, recurrent networks, multi-scale and pyramid-based methods, as well as visual attention and generative models in an adversarial setting. They studied the strengths and challenges of the proposed deep learning models, but also their similarity. Finally, the authors investigated the most commonly applied datasets and present performance results for them.
Face recognition is a significant biometric method for identity authentication that has been applied in numerous application areas. This includes public security, daily life, military, but also finance. Masi et al. [13] introduce the main benefits of face recognition with deep learning, also called deep face recognition. They focus on identification and verification by learning representations of the face. The review gives a structured overview of works from the past years, covering the principles and state-of-the-art in face recognition methods. Li and Deng [14] provide a survey on facial expression recognition with deep learning, including datasets and algorithms. They present datasets that are available and have been commonly applied in previous works. Further, they outline commonly recognised data selection and evaluation concepts that have been used for these datasets. Next, they outline the general pipeline and workflow for a deep facial expression recognition approach, covering the corresponding background knowledge, and finally propose a feasible implementation for each stage. Mei and Deng [15] also provide a survey of the latest trends in deep facial expression detection, including the design of algorithms, but also possible applications, protocols and databases. They outline various network architectures and loss functions that have been introduced in the general field of deep facial expression. They categorized the face processing approaches into two different classes: “one-to-many augmentation” and “many-to-one normalization”. They also give a summarized overview of common databases and compare them concerning model training and model evaluation. Finally, they explored further scenarios for deep facial expression, like cross-factor, heterogeneous, industrial and multiple-media.
Understanding human actions and motions in visual data, like surveillance videos, is closely connected to research fields like object recognition, semantic segmentation, human dynamics and domain adaptation. Herath et al. [16] review notable steps that have been taken towards recognizing human actions. To this end, they start with a discussion of early approaches that applied handcrafted representations. Subsequently, they review deep learning-based approaches suggested in this field. Wang et al. [17] give an outline of the latest trends and improvements in motion recognition in RGB-D images. They categorized the surveyed approaches into four groups. The groups are based on the particular modality used for recognition, and can be RGB-, skeleton-, depth-, or RGB+D-based. Finally, they discuss the advantages and limitations, with a focus on approaches that encode spatial-temporal-structural information, which is inherent in video sequences.
Biometric recognition, or biometrics, studies the identification of people utilizing their unique phenotypical characteristics, like fingerprints or the iris, for applications ranging from cell phone authentication to airport security systems. Sundararajan and Woodard [18] review one hundred distinct methods that study recognizing individuals with deep learning applied to different biometric modalities. They conclude that the majority of research in biometrics based on deep learning has been conducted around face recognition and speaker recognition so far. Minaee et al. [19] present a survey of more than 120 works on biometric recognition, including face, fingerprint, iris, palm print, ear, voice, signature, and gait recognition. For each biometric recognition task, they present the available datasets used in the literature and their characteristics and outline the performance on popular public benchmarks.
Image super-resolution is a basic task in image processing that has seen a rise in popularity with the advent of deep learning. Image super-resolution methods and algorithms are used to improve the resolution of (low-resolution) images and videos. Wang et al. [20] give a comprehensive overview of the latest trends and advances in the field of image super-resolution, focusing on deep learning methods. They divide the existing papers on image super-resolution methods into three main categories, namely supervised image super-resolution, unsupervised image super-resolution and, finally, domain-specific image super-resolution. Moreover, they cover other topics in the field of image super-resolution, like publicly accessible benchmark data collections and metrics for performance evaluation.
Image captioning refers to the generation of a description for an image. To this end, the primary objects in the image, but also their attributes and their relationships to each other, need to be recognized. Furthermore, image captioning must produce sentences that are syntactically and semantically correct. Hossain et al. [21] introduce a broad survey of works for image captioning based on deep learning. They analyse their main strengths, performances, but also their limitations. In addition, they explore the datasets and the evaluation metrics that have been used for automatic image captioning with deep learning.
Data augmentation can be used for the expansion of (limited) datasets to obtain larger training and evaluation sets. Shorten and Khoshgoftaar [22] review image augmentation algorithms that cover geometric transformations, but also colour space augmentations and feature space augmentation, as well as techniques like kernel filters, mixing images, random erasing, neural style transfer and meta-learning. They also cover generative adversarial network-based augmentation methods. In addition, they explore and study further characteristics in the area of data augmentation, like test-time augmentation, final size of the dataset, the impact of the resolution, but also curriculum learning. Finally, they give an overview of available approaches for meta-level decisions for implementing data augmentation.
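To make three of the named technique families more concrete, the following minimal sketch applies a geometric transformation (horizontal flip), random erasing and image mixing to a tiny grey-scale image stored as a nested list. The function names, the patch size and the mixing weight are illustrative assumptions and are not taken from the reviewed survey; practical pipelines would typically rely on an array or imaging library instead.

```python
import random


def horizontal_flip(image):
    """Geometric transformation: mirror every row of the image."""
    return [row[::-1] for row in image]


def random_erase(image, height, width, fill=0, rng=random):
    """Random erasing: overwrite one randomly placed height x width patch."""
    rows, cols = len(image), len(image[0])
    top = rng.randint(0, rows - height)    # randint is inclusive on both ends
    left = rng.randint(0, cols - width)
    erased = [row[:] for row in image]     # copy, so the input stays intact
    for r in range(top, top + height):
        for c in range(left, left + width):
            erased[r][c] = fill
    return erased


def mix_images(image_a, image_b, weight=0.5):
    """Mixing images: pixel-wise weighted average of two equally sized images."""
    return [
        [weight * a + (1.0 - weight) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flipped = horizontal_flip(image)
erased = random_erase(image, 2, 2, rng=random.Random(0))
mixed = mix_images(image, flipped)

print(flipped)
print(erased)
print(mixed)
```

Each function returns a new image, so a single original sample can yield several augmented training samples without being modified itself.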
Generative adversarial networks or GANs belong to the field of generative models in machine learning. GANs have experienced an in-depth exploration during the last few years with the most significant impact in the field of computer vision. Wang et al. [23] survey three real-world problems that have been approached with GANs: the generation of high-quality images, diversity of image generation, and stable training. They give a detailed overview of the current state-of-the-art in generative adversarial networks. Furthermore, they structure their review using a specific taxonomy, which they have adopted based on variations in generative adversarial network-based architectures and loss functions.
This sub-section deals with the deep learning reviews in the area of natural language processing. It is divided into eight sub-categories: general language processing, language generation and conversation, named entity recognition, sentiment analysis, text summarization, answer selection, word embedding, and financial forecasting.
2.2.1 Natural language processing
Natural language processing is a theory-inspired range of computational methods and algorithms for the automatic analysis of human language that can be used, for example, for voice commands in all kinds of applications. Young et al. [24] survey several important deep learning-based approaches that have been utilized for various tasks of natural language processing. They provide a walk-through of their evolution during the last years and summarize, compare and contrast the numerous models. Finally, they provide an in-depth overview of the past, present and future role of deep learning in natural language processing.
One task of natural language processing is natural language generation (NLG), the generation of text or speech from a non-linguistic input. This can also be the generation of new texts from existing (often human-written) ones, for example from one language to another by machine translation, or a summarization and fusion of texts with the goal of making them more concise. Gatt and Krahmer [25] provide an overview of the published research in common NLG tasks and the corresponding neural architectures. They explore common research areas between NLG and general artificial intelligence and outline the particular challenges of evaluation in the field. Santhanam and Shaikh [26] provide an overview of classical methods, statistical methods, and methods that utilize deep neural networks and review publications on open domain dialogue systems. In doing so, they identify three further research directions for the development of more effective dialogue systems: incorporating conversation context and world knowledge, including larger contexts; enhancing an NLG system with personae or personality; and eliminating aspects that lower the quality of system-produced responses, like generic or dull responses. Gao et al. [27] present a tutorial that surveys neural approaches to conversational artificial intelligence and arrange conversational systems into three different categories. These are agents for question-answering, task-oriented dialogues and social bots. For each of these categories, they give an overview of state-of-the-art neural methods, but also make connections between these methods and classical methods. Finally, they outline the current progress in this field and present the remaining challenges by applying certain models and systems in case studies. Chen et al. [28] present a survey of the current progress in the area of dialogue systems from different angles and explore future research areas and topics.
Established dialogue systems are grouped by the authors into two model categories: task-oriented and non-task-oriented. Then, they outline how deep learning-based approaches can support these with specific algorithms. Finally, they highlight several further research areas, which can support and advance the field of dialogue systems.
Named entity recognition deals with the identification of named entities (for example, real-world objects like persons or locations) and, furthermore, their classification into specific categories. It functions as a foundation for natural language approaches, like the summarization of text, answering questions or machine translation. Li et al. [29] provide a broad panorama of established deep learning-based approaches for named entity recognition. They introduce named entity recognition resources, including tagged named entity recognition corpora, and further off-the-shelf named entity recognition applications. Available works are systematically arranged according to a taxonomy along three axes, namely the distributed representations for the input, the context encoder, and the tag decoder. Finally, they review the main approaches for deep learning-based techniques that have recently been applied and outline the particular challenges for named entity recognition systems in their contribution. Yadav and Bethard [30] present a broad overview of deep neural network approaches for the field of named entity recognition. They compare them with existing methods for named entity recognition that use feature engineering and further supervised or semi-supervised learning methods. Finally, they outline the advantages that have been gained by neural networks and depict how including specific works on feature-based named entity recognition systems can yield further improvements.
Sentiment analysis is the (automatic) recognition and analysis of people’s opinions, sentiments, emotions and appraisals. This can be used in data mining applications to explore this subjective source of information and, therefore, opinions about specific entities, like products, events or services, but also topics, individuals or organizations. In their contribution, Zhang et al. [31] start with an introduction to deep learning and then provide an extensive review of its recent sentiment analysis applications. They divide the area of recent sentiment analysis into different sub-categories, like sentiment classification on a document level, sentence level and aspect level, and, further, opinion expression extraction. Do et al. [32] wrote a comprehensive and context-based overview of deep learning methods that have been used in aspect-based sentiment analysis. For their review contribution, they categorised and summarised 40 approaches by their main deep learning architecture and their specific classification tasks. The reviewed works consist of general versions, but also adaptations, of common convolutional neural networks, long short-term memory methods, and gated recurrent units.
Text summarization targets the summarization of (long) documents into shorter ones while, at the same time, keeping the key meaning and information of the original text documents. Shi et al. [33] give a broad technical literature review on various seq2seq (sequence-to-sequence learning) methods for the summarization of text from the angle of training strategies, network structures and summary generation methods. In addition to the review, they implemented an open-source library, called the Neural Abstractive Text Summarizer (NATS) toolkit, which can be used for abstractive text summarization. Further, they conducted a set of experiments on the common CNN/Daily Mail dataset to evaluate the capabilities and results of diverse neural network components. They conclude by benchmarking two implemented NATS methods on the Newsroom and Bytecup datasets.
The aim of (automatic) answer selection is identifying correct (and incorrect) answers. For example, for a given question and a number of possible candidate answers, answer selection identifies which of the candidates answer the question correctly (and, concurrently, which of the candidates do not). In their survey, Lai et al. [34] outline an extensive, systematic analysis of numerous deep learning-based approaches for answer selection along two main dimensions: (1) neural network architectures, such as attentive architecture, siamese architecture and compare-aggregate architecture, and (2) learning strategies, such as listwise, pairwise, and pointwise. Moreover, they examined the most common datasets for answer selection and their evaluation metrics, and present various possible research directions for the future in this field.
2.2.7 Word embedding
Word embedding, or embedding for short, covers feature learning and modelling strategies for representing words or phrases as numbers or vectors. In their work, Zhang et al. [35] review the state-of-the-art of neural information retrieval research, with a focus on the usage of learned query and document representations, like neural embeddings. In doing so, they outline the achievements in the field of neural information retrieval, but also point out limitations for a broader usage, and conclude by proposing possible and favourable future research directions. The survey contribution of Almeida and Xexéo [36] depicts and delineates the main recent strategies in the field of word embedding. The authors introduce two main categories for word embeddings and the corresponding publications: prediction-based models and count-based models.
Financial forecasting tries to (automatically) predict financial market trends, like stock market predictions. Financial forecasting can, for example, be based on financial statements and reports, but also news articles and press releases, with the goal of keeping a competitive business advantage. Xing et al. [37] present in their work the scope of natural language-based financial forecasting (NLFF) research by arranging and organizing the methods and approaches from the reviewed works. Their review publication aims to provide a better understanding of the advancements and hotspots in NLFF.
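As a toy illustration of the count-based category, the sketch below represents each word by the counts of its neighbouring words and compares two words by cosine similarity. The corpus, the window size and the function names are assumptions chosen for illustration only and do not come from the reviewed surveys; prediction-based models, which learn vectors by training a network, are not shown.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=1):
    """Represent every word by the counts of words within `window` positions."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            stop = min(len(tokens), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[key] * v[key] for key in u if key in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical three-sentence corpus, purely for illustration.
corpus = ["the cat sat", "the dog sat", "a cat ran"]
vectors = cooccurrence_vectors(corpus)

print(dict(vectors["cat"]))
print(round(cosine(vectors["cat"], vectors["dog"]), 3))
```

Words that appear in similar contexts, here "cat" and "dog", end up with similar count vectors, which is the basic intuition behind count-based embeddings.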
This sub-section deals with the deep learning reviews in the medical field. It is divided into eight sub-categories, namely: medical image analysis, medical imaging, health-record analysis, cancer detection and diagnosis, bioinformatics, radiotherapy, pharmacogenomics, and radiology.
Medical image analysis is the task of automatically or semi-automatically extracting information from (patient-specific) medical images. For example, this could be an automatic determination of the tumour volume from a patient’s magnetic resonance imaging scan with the aim of choosing a therapy strategy. The publication of Litjens et al. [38] gives an overview of the main deep learning techniques in medical image analysis. It presents over 300 works in that area and reviews the application of deep learning in topics like organ or disease detection, image classification, segmentation, registration, and further tasks. Furthermore, compact outlines of studies are presented by application area: abdominal, breast, cardiac, digital pathology, musculoskeletal, neuro, pulmonary and retinal. Finally, they summarize recent works at that time and discuss remaining challenges and areas for future research work. Shen et al. [39] present the basics of deep learning-based approaches and analyse the reported results in fields like medical image registration, tissue segmentation, anatomical and cell structure detection, computer-aided disease diagnosis and prognosis. They conclude their work by raising remaining research questions and proposing future research directions for additional improvements in the field of medical image analysis. Xing et al. [40] give an overview of microscopy image analysis. They start with a brief introduction to common deep neural networks and provide a compact outline of recent deep learning successes in numerous applications, including detection, classification and segmentation in the area of microscopy image analysis. They present the background of (fully) convolutional neural networks, deep belief networks, recurrent neural networks and stacked autoencoders, and connect their basic principles and modelling to certain applications in different microscopy images.
They conclude by discussing remaining research challenges and outline possible directions for future research in deep learning-based microscopy image analysis. The review of Haskins et al. [41] presents the progress of deep learning-based approaches in the field of medical image registration by outlining research challenges, but also significant advancements in recent years. They divide their article into three main categories, namely unsupervised transformation estimation, supervised transformation estimation and deep iterative registration. Each main category is then divided again into sub-categories. They conclude by surveying highlights of future research directions in this field.
Medical imaging deals with the acquisition of human and animal images from cellular to body scale. Common examples include computed tomography (CT) and magnetic resonance imaging (MRI), which allow the acquisition of images at vascular and organ level for diagnostic and therapeutic purposes [42]-[44]. In this context, Lundervold and Lundervold [45] provide an analysis of deep learning-based methods in medical imaging with a focus on MRI acquisitions. The goal of their review is threefold: First, they provide a compact background introduction of deep learning and its main contributions. Then, they outline how deep learning-based methods have been used for the whole MRI workflow (from image acquisition to diagnosis), and give a starting point for researchers in the area of deep learning-based medical imaging. Finally, they point to open-source code repositories, educational resources, and further related data sources that are of interest for medical imaging.
Health record analysis explores the digital information stored in electronic health records. These records were originally intended to archive patient information and to perform administrative tasks in healthcare, such as billing, but researchers have started to utilize them for numerous other applications in clinical informatics. Shickel et al. [46] survey the applications of deep learning for the analysis of health-record data. They report several deep learning-based methods and techniques that have been used for different clinical applications, such as information extraction, outcome prediction, representation learning, but also de-identification and phenotyping. In their review, they identify a number of limitations of the current research regarding data heterogeneity, model interpretability, but also missing universal benchmarks. They conclude their review by summarizing the state of the field and proposing directions for upcoming deep health-record analysis research.
As previously mentioned, medical image acquisitions, like MRI or CT, can be used for the diagnosis of pathologies, such as tumours, and internal injuries like bone fractures. However, the manual processing of these images can be cumbersome and time-consuming, even for experts [47]. This led to the investigation of automatic approaches. Hu et al. [48] review deep learning-based applications for cancer detection and diagnosis. In their survey, they start with a background on deep learning and common architectures applied to the detection and diagnosis of cancer. They focus on four common deep learning architectures: convolutional neural networks, fully convolutional networks, deep belief networks and auto-encoders. Additionally, they present a review of studies that exploit deep learning for cancer detection and diagnosis, grouped by cancer type. Finally, they provide a summary of and personal comments on the reviewed works and suggest future research directions.
Bioinformatics is an interdisciplinary field developing approaches and software tools for the understanding of biological data with a strong focus on large and complex data sets. Lan et al. [49] concentrate on the review of research contributions using deep learning and data mining approaches for the analysis of domain-specific knowledge in bioinformatics. Their review article provides a summary of several data mining methods that have been utilized for pre-processing, classification and clustering, along with different optimized neural network architectures and deep learning methods. Furthermore, they present the advantages and disadvantages of such methods in practical applications, and discuss and compare them in terms of their industrial usage.
Radiotherapy can be used, for example, to treat cancer patients. However, planning and delivering radiotherapy treatment is a complicated procedure, which artificial intelligence tries to automate and therefore facilitate. In their review, Meyer et al. [50] introduce the basics of deep learning and its position in the overall context of machine learning. They introduce popular neural architectures with a particular emphasis on classic convolutional neural networks. Subsequently, they give an overview of contributions on deep learning-based methods that can be utilized for radiotherapy. They classify these contributions into seven different categories with regard to the overall patient workflow.
Pharmacogenomics (pharmaco- + genomics) is an interdisciplinary field between pharmacology and genomics that studies the role of the genome in drug response. In their review, Kalinin et al. [51] introduce recent works and future applications for deep learning in the area of pharmacogenomics. This includes the exploration of new regulatory variants situated in noncoding genomic regions and their role not only in pharmacoepigenomics, but also in patient stratification from clinical records. They aim at the application of deep learning for the prediction of patient-specific drug responses to optimize the process of drug selection and dosing. This process is automated by applying data-driven deep learning algorithms to large and complex data collections, which can provide different sets of information, ranging from the micro- to the macroscopic level, such as from molecular to epidemiological and from clinical to demographic domains.
Radiology is the medical field that deals with the extraction of useful information from images, like CT or MRI patient acquisitions, for diagnosis and treatment of humans and animals. Mazurowski et al. [52] give an overview of the common fields of radiology and present the opportunities for deep learning-based approaches there. They also present fundamental deep learning concepts, such as convolutional neural networks, before presenting research contributions focused on deep learning and its application to radiology. The reviewed works are grouped by application task. They conclude their work by discussing opportunities and challenges for the inclusion of deep learning-based approaches into clinical practice.
This sub-section deals with additional deep learning reviews, which were not covered within the other categories. It is divided into eleven sub-categories: big data, reinforcement learning, mobile and wireless networking, mobile multimedia, multimodal learning, remote sensing, graphs, anomaly detection, recommender systems, agriculture, and multiple areas.
Big data is the field that analyses data that is too large or too complex to handle with classic data-processing tools. The aim of big data is to systematically extract information from large and complex data sets. Application areas include e-commerce, industrial control, and precision medicine. Zhang et al. [53] review emerging deep learning models for feature learning with big data. They review deep learning-based methods and models that have been used with large data collections, heterogeneous data, but also real-time and low-quality data. Mohammadi et al. [54] give an overview of deep learning-based approaches that have been used to support learning and analytics in the domain of the internet of things (IoT). They begin with the characteristics of IoT data and its treatment, namely the analytics and streaming of big data in the IoT domain. Next, they review the promising aspects of deep learning-based methods for analytics in applications of these data types. In addition, they outline the potential of upcoming deep learning-based methods for data analytics in the IoT domain. Besides a comprehensive background on different deep learning methods, they review further research efforts that applied deep learning in the IoT domain. Furthermore, they state how smart IoT devices have incorporated deep learning and review methods for fog computing and cloud computing in aiding IoT approaches. Emmert-Streib et al. [55] start their contribution with a background analysis of deep learning-based methods, like convolutional neural networks, deep feed-forward neural networks, but also deep belief networks, long short-term memory networks and autoencoders, because, according to the authors, these are currently the most commonly used architectures. 
Additionally, they introduce related concepts such as restricted Boltzmann machines and resilient backpropagation and discuss the differences when dealing with big data vs. small data and specific data types. They state that the adaptiveness of these (network) architectures enables a “Lego-like” generation of countless new neural networks.
2.4.2 Reinforcement learning
Reinforcement learning is an algorithmic learning strategy where the algorithm tries to maximize an agent’s performance via rewards, based on observations within the task environment. Reinforcement learning-based approaches have proven to be successful in numerous fields, like the robotics domain. The article of Mousavi et al. [56] surveys the recent advances in supervised and unsupervised deep reinforcement learning, putting an emphasis on the most commonly applied deep architectures, like convolutional neural networks, recurrent neural networks, but also autoencoders that have effectively been incorporated within reinforcement learning-based frameworks. They structure their review in three main categories: supervised reinforcement learning, unsupervised reinforcement learning, and deep reinforcement learning in partially observable Markov decision process environments, where only parts of the process can be observed. Li [57] also gives an overview of deep reinforcement learning strategies, discussing six main elements, six significant mechanisms, but also twelve related applications. After presenting the fundamentals of machine and deep learning, he also introduces the main elements of reinforcement learning, such as value function, policy, reward, and exploration strategies. Afterwards, the mechanisms, including attention and memory, transfer learning, unsupervised learning, multiagent reinforcement learning, but also hierarchical reinforcement learning and learning to learn, are presented. Finally, numerous possible applications are outlined, such as games (like AlphaGo), natural language processing, covering dialogue systems, text generation, machine translation, but also computer vision, robotics, finance, business management, education, healthcare, Industry 4.0, intelligent transportation systems, smart grids and further computer systems. Arulkumaran et al. 
[58] start their review by giving a general overview of the reinforcement learning field and then continue with the central areas of value-based and policy-based methods. The review covers the main approaches and methods in the field of deep reinforcement learning, such as deep Q-networks, asynchronous advantage actor-critic and trust region policy optimization. Additionally, they outline the main benefits of combining deep neural networks with reinforcement learning, in particular for visual understanding.
2.4.3 Mobile and wireless networking
Mobile and wireless networking, or networking for short, has rapidly evolved during the last years, thanks to the spread of mobile devices and mobile applications [59], [60]. In addition, the recently released 5G technology is expected to massively increase mobile traffic volumes. Zhang et al. [61] present a comprehensive survey of deep learning-based research in mobile and wireless networking. They start with an introduction to the fundamentals of deep learning-based methods that could lead to networking applications and review certain approaches and platforms with the potential to promote the progression of mobile systems using deep learning. Next, they give an overview of deep learning-based research in mobile and wireless networking, categorizing it into several domains.
Mobile multimedia refers to various applications that can be accessed or created by portable devices, like smartphones [62]. This includes mobile applications such as audio and video players, games and e-healthcare. Ota et al. [63] introduce the basics of deep learning for multimedia, thereby focusing on the core parts of deep learning in regard to mobile environments, namely low-complexity deep learning methods, software tools and frameworks for mobile and other resource-constrained environments, but also specific hardware available in mobile devices, which can be used to facilitate the computationally intensive training and inference of deep networks. In addition, they present numerous deep learning-based, mobile applications to show possible real-life scenarios for such a technology.
Multimodal learning uses data of different modalities in a learning strategy. An example of data from different modalities are the acquisitions from positron emission tomography-computed tomography (PET-CT) scanners, where the tissue data from the CT and metabolically active regions from the PET are acquired from a patient [64]. In their survey, Ramachandram and Taylor [65] first classify the architectures for deep multimodal learning. Afterwards, they introduce certain methods to combine multimodal representations that have been learned with these deep learning-based architectures. In particular, they outline two main research fields for potential upcoming works, namely regularization methods and strategies that learn and optimize structures in the domain of multimodal fusion.
2.4.6 Remote sensing
Remote sensing covers technologies for the remote analysis of objects or scenes. Examples are satellite-based imaging, aerial imaging, crowdsourcing (such as tweets or phone imagery), but also advanced driver-assistance systems and unmanned aerial vehicles. Ball et al. [66] provide a compact analysis of recent deep learning-based research in the domain of remote sensing. They introduce the theories and tools, but also challenges within the remote sensing field. Thereafter, they present remaining research questions and opportunities, like modelling physical phenomena with human-understandable solutions, inadequate data sets and big data. Furthermore, they focus on specific non-traditional data sources that are heterogeneous, deep learning-based architectures and algorithms to learn spatial, spectral and temporal data, but also transfer learning. Further, they provide a fundamental insight into deep learning-based systems, and outline obstacles for training, but also optimizing deep learning-based methods.
Graphs are representations of objects and of their relationships with other objects. Common examples include social networks, traffic networks, e-commerce networks, but also biological networks. Zhang et al. [67] provide a survey on various types of deep learning-based approaches on graphs. They split the existing approaches into five different categories with regard to the underlying model architectures, but also training strategies, namely graph convolutional networks, graph recurrent neural networks, graph reinforcement learning, graph autoencoders and graph adversarial methods. They give a systematic outline of these techniques, mostly following the order of their historical appearance, and review their structures and differences. They conclude their review by outlining the applications in this area.
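To make the idea behind graph convolutional networks more concrete, the sketch below performs one round of neighbourhood averaging on a small graph. It is a simplified illustration under stated assumptions and not the formulation of any specific method surveyed in [67]: learned weights and non-linearities are omitted, and the names `mean_aggregate`, `adjacency` and `features` are hypothetical.

```python
def mean_aggregate(adjacency, features):
    """Average each node's feature vector with those of its neighbours.

    adjacency: dict mapping a node to the list of its neighbours.
    features: dict mapping a node to a list of floats (same length for all nodes).
    Returns a new dict with one smoothed feature vector per node.
    """
    smoothed = {}
    for node, neighbours in adjacency.items():
        # Include the node itself (a self-loop), as is common in graph convolutions.
        group = [node] + list(neighbours)
        dim = len(features[node])
        smoothed[node] = [
            sum(features[n][d] for n in group) / len(group) for d in range(dim)
        ]
    return smoothed


# Toy path graph a - b - c with one feature per node (hypothetical data).
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
features = {"a": [1.0], "b": [2.0], "c": [6.0]}
smoothed = mean_aggregate(adjacency, features)
print(smoothed)
```

A trainable graph convolutional layer would additionally multiply the aggregated features by a learned weight matrix and apply a non-linearity; stacking several such layers lets information travel across multiple hops.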
Anomaly detection is a strategy used to detect unexpected events or items in data sets. It can be used in areas like signal processing, statistics, finance, manufacturing, econometrics, networking, but also data mining. Kwon et al. [68] give an outline of deep learning-based methods, covering deep neural networks, restricted Boltzmann machine-based deep belief networks and recurrent neural networks. They also cover machine learning techniques that are related to network anomaly detection. Furthermore, they present the latest work from the literature that used deep learning-based techniques with an emphasis on network anomaly detection. Finally, they outline their own results with deep learning-based methods for the analysis of network traffic.
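The basic principle of anomaly detection, flagging observations that deviate strongly from the rest of the data, can be sketched with a classical statistical baseline. Note that this is a plain z-score threshold and not one of the deep learning-based methods reviewed in [68]; the function name `zscore_anomalies`, the threshold of 2.0 and the traffic values are assumptions chosen for illustration.

```python
import statistics


def zscore_anomalies(values, threshold=2.0):
    """Return the indices of values whose z-score exceeds the threshold.

    The z-score measures how many (population) standard deviations a value
    lies away from the mean of all values.
    """
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # All values are identical, so nothing can be flagged as unusual.
        return []
    return [i for i, v in enumerate(values) if abs(v - mean) / std > threshold]


# Hypothetical network traffic measurements (packets per second).
packets_per_second = [10, 11, 9, 10, 12, 10, 50]
anomalous = zscore_anomalies(packets_per_second)
print(anomalous)
```

Deep learning-based detectors replace the fixed mean and standard deviation with a learned model of normal behaviour, for example by thresholding the reconstruction error of an autoencoder, but the final thresholding step is conceptually similar.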
Recommender systems are utilized to predict the preferences of a user, for example, to provide web users with personalized information about products and services, like movies, insurance, or restaurants. Zhang et al. [69] present a survey of the latest deep learning-based research for recommender systems. They formulate and introduce a deep learning-based taxonomy for recommendation models, together with an outline of recent contributions from the literature. They conclude their contribution by outlining current trends and proposing new perspectives with regard to the development of the field.
Agriculture is the scientific process that deals with the cultivation of plants and livestock to produce products like food, feed and fibres. Kamilaris and Prenafeta-Boldú [70] review 40 research works in deep learning, which have been used for numerous agricultural, but also food production problems. For each problem, they explored the specific agricultural challenge, the frameworks and models that have been used, but also the data source, data nature and data pre-processing methods. They also report and present the overall performance results for the selected metrics. In addition, they survey the comparison of deep learning-based approaches with other existing common methods, with a focus on the differences in the performances for classification and regression.
Finally, this sub-section presents deep learning review contributions that span multiple disciplines and applications with no main focus on a specific area. Pouyanfar et al. [71] present a more general review in the field of deep learning-based algorithms and techniques, but also their applications. Their survey provides an in-depth analysis of historical, but also novel methods in visual processing, audio processing, and text processing, but also the analysis of social networks, and further the processing of natural language. Next, they provide a comprehensive review of improvements in deep learning-based approaches and survey deep learning challenges, like unsupervised learning and online learning, but also black-box models, and show how these remaining demands may be addressed in upcoming works. Dargan et al. [72] focus in their contribution on common deep learning concepts, including fundamental and more sophisticated architectures, characteristics, techniques, limitations and motivational aspects. They introduce several main differences amongst classical machine learning, deep learning and approaches in conventional learning, but also main challenges that still need to be solved. They chronologically analyse and provide an extensive review of significant deep learning-based applications, thereby including numerous fields, techniques, methods and architectures that have been applied, and discuss their usage in real-world scenarios. In their review of deep learning-based scientific discovery, Raghu and Schmidt [73] provide an analysis of several deep learning-based models that have been applied to topics like sequential, visual, but also graph structured data. Further, they present related tasks and varying training approaches, together with methods that enable the usage of deep learning-based methods for sparse data, and how to get a better understanding of complex models. 
They also give different outlines of overall design processes, hints for implementations, and point to tutorials, open-sourced pipelines in deep learning, research summaries, but also pretrained models that have been implemented within the community, with the aim to speed up the application of deep learning-based approaches across multiple scientific fields and domains.
3. Conclusion and Discussion
In this contribution, selected reviews and surveys on deep learning have been presented in a compact categorized meta-survey. A systematic search has been performed in common libraries and search engines, like IEEE Xplore Digital Library, Scopus, DBLP, PubMed, and Google Scholar, which resulted in around 60 review publications for this meta-survey contribution about deep learning during the last three to four years (status as of August 2020). In addition to the identified review publications, which have been arranged in different categories and sub-categories, the references and citations of these reviews have been retrieved and are presented. Even if deep learning is still a relatively young scientific field and technology, there have already been a few hundred reviews within the last years, which can be seen as an indicator for the number of breakthroughs that have been achieved with these methods. This categorized meta-survey shows that, based on the selected works of this contribution, on average more than one deep learning review per month was published during this time period, and it can be expected that these numbers will increase in the near future. Moreover, the number of references (>10 000) and citations (>15 000) of the selected works can be seen as an indicator for the current importance of deep learning. The medical field currently has the overall highest number of citations (>6 000), but interestingly, also the lowest number of overall references (<2 000) included in the reviews (note that the fewest reviews were selected for the medical category). A reason for the massive research activities in deep learning is probably given by the relatively easy usage and extension of these approaches: Comprehensive and user-friendly libraries and toolkits, like TensorFlow or PyTorch, do not necessarily require an in-depth education in computer science anymore. 
This was not the case in the years preceding the diffusion of deep learning, when very good technical skills and programming experience, in languages like C or C++, were necessary to implement efficient algorithms with a reasonable runtime. Additionally, nowadays, many researchers share their source code using online repositories, like GitHub, making it available to the research community. Following this trend, some publication venues started to require that the source code be made available alongside the publication of the paper. On top of that, most deep learning libraries and toolkits are built for Python, a high-level programming language that is easier to learn than languages like C++ or Java. The widespread availability of graphics processing units also contributed to the rise and impact of deep learning: Most deep learning libraries and toolkits support the training and execution of deep learning on graphics processing units, which strongly reduce the computation time thanks to their parallel architecture. Additionally, graphics processing units became less expensive during the last years and graphics processing unit clusters are more and more common in universities, research centers, and companies. Moreover, companies like Google, Microsoft, and Amazon offer online cloud computing services at little to no cost for private users. Deep learning certainly has already had a massive impact on the daily life of most people via the countless applications that are based on this technique. It will be interesting to see what the future brings for us in the area of deep learning. However, deep learning is not perfect, as seen in tragic real-life incidents, like car accidents, racist misclassification of images, or the machine learning bot Tay from Microsoft [74]-[76]. 
In addition, tasks for which deep learning is claimed to have outperformed humans are often performed under so-called laboratory conditions, which means that a specific sample set is used for testing, but real-life tests are lacking, or that there are other shortcomings [77]. There exist of course specific scenarios where machine learning has undoubtedly outperformed humans, as in chess with Deep Blue [78] and in Go with AlphaGo [79]. Here, algorithms were able to surpass the best known living human players. However, these games have fixed rules and strong restrictions within which players and algorithms have to operate. This stands in strong contrast to scenarios and tasks with almost unlimited possibilities. A human face, a spoken sentence, a driving scenario, or a pathology is always distinct and at least slightly different, which makes it harder to predict. In chess, an algorithm can rely on the fact that the king can only move one square at a time and will not jump over the “chessboard cliff”. In principle, deep learning is trying to imitate the human brain and how it functions and learns, although on a very basic level [80]. This course of action can be seen as a blessing and a curse at the same time: just as we cannot take a look into someone’s brain, the behavior of a trained deep neural network, with millions of neurons, connections and weights, is not fully understandable in every detail. This, in turn, makes it hard to predict exceptions and failures. As a consequence, deep learning is seen as a black box approach and this is often difficult to accept, especially when not all behaviors are foreseeable and there remains uncertainty. Thus, not everyone may feel comfortable in a self-driving car. We can agree that deep learning is an exciting and relatively new machine learning technique, which has already gained a lot of influence and has entered the life of most humans, like through virtual personal assistants (Amazon’s Alexa, Apple’s Siri, Google Now, etc.) 
or automatic number-plate recognition for toll roads, parking garages or law enforcement, just to name a few. On the other hand, like most new technologies with such a fast and massive impact, deep learning is not free of failures and controversies. We hope, however, that this very first meta-survey of deep learning provides a quick and comprehensive reference for interested readers, giving them a high-level overview of this overwhelming field and stimuli for further exploration. The contribution of this meta-survey is fourfold:
• providing an overview of current deep learning reviews from various scientific domains,
• categorized arrangement of the works, for a domain-specific and historical picture,
• extraction of referenced works and citations to show the research influence of deep learning within these domains,
• conclusion and critical discussion of past and future directions for deep learning.
Acknowledgements
This work received funding from the Austrian Science Fund (FWF) KLI 678-B31: “enFaced: Virtual and Augmented Reality Training and Navigation Module for 3D-Printed Facial Defect Reconstructions” and the TU Graz Lead Project (Mechanics, Modeling and Simulation of Aortic Dissection). Moreover, this work was supported by CAMed (COMET K-Project 871132), which is funded by the Austrian Federal Ministry of Transport, Innovation and Technology (BMVIT), and the Austrian Federal Ministry for Digital and Economic Affairs (BMDW), and the Styrian Business Promotion Agency (SFG). Finally, we want to thank S. M. for the inspiration for this contribution.
Additional Information
Competing financial interests: The authors declare no competing financial interests. eferences [1]
McCulloch WS, Pitts W. A logical calculus of the ideas immanent in nervous activity. The bulletin of mathematical biophysics. 1943 Dec;5(4):115-33. [2]
LeCun Y, Bengio Y, Hinton G. Deep learning. nature. 2015 May;521(7553):436-44. [3]
Ravì D, Wong C, Deligianni F, Berthelot M, Andreu-Perez J, Lo B, Yang GZ. Deep learning for health informatics. IEEE journal of biomedical and health informatics. 2016 Dec 29;21(1):4-21. [4]
Wang J, Ma Y, Zhang L, Gao RX, Wu D. Deep learning for smart manufacturing: Methods and applications. Journal of Manufacturing Systems. 2018 Jul 1;48:144-56. [5]
Gibson E, Li W, Sudre C, Fidon L, Shakir DI, Wang G, Eaton-Rosen Z, Gray R, Doel T, Hu Y, Whyntie T. NiftyNet: a deep-learning platform for medical imaging. Computer methods and programs in biomedicine. 2018 May 1;158:113-22. [6]
Pepe A, Li J, Rolf-Pissarczyk M, Gsaxner C, Chen X, Holzapfel GA, Egger J. Detection, segmentation, simulation and visualization of aortic dissections: A review. Medical Image Analysis. 2020 Oct 1;65:101773. [7]
Liu L, Ouyang W, Wang X, Fieguth P, Chen J, Liu X, Pietikäinen M. Deep learning for generic object detection: A survey. International journal of computer vision. 2020 Feb;128(2):261-318. [8]
Zhao ZQ, Zheng P, Xu ST, Wu X. Object detection with deep learning: A review. IEEE transactions on neural networks and learning systems. 2019 Jan 28;30(11):3212-32. [9]
Jiao L, Zhang F, Liu F, Yang S, Li L, Feng Z, Qu R. A survey of deep learning-based object detection. IEEE Access. 2019 Sep 5;7:128837-68. [10]
Garcia-Garcia A, Orts-Escolano S, Oprea S, Villena-Martinez V, Martinez-Gonzalez P, Garcia-Rodriguez J. A survey on deep learning techniques for image and video semantic segmentation. Applied Soft Computing. 2018 Sep 1;70:41-65. 11]
Minaee S, Boykov Y, Porikli F, Plaza A, Kehtarnavaz N, Terzopoulos D. Image segmentation using deep learning: A survey. arXiv preprint arXiv:2001.05566. 2020 Jan 15. [12]
Egger J, Gsaxner C, Pepe A, Li J. Medical Deep Learning--A systematic Meta-Review. arXiv preprint arXiv:2010.14881. 2020 Oct 28. [13]
Masi I, Wu Y, Hassner T, Natarajan P. Deep face recognition: A survey. In2018 31st SIBGRAPI conference on graphics, patterns and images (SIBGRAPI) 2018 Oct 29 (pp. 471-478). IEEE. [14]
Li S, Deng W. Deep facial expression recognition: A survey. IEEE Transactions on Affective Computing. 2020 Mar 17. [15]
Mei W, Deng W. Deep face recognition: A survey. arXiv preprint arXiv:1804.06655. 2018;1. [16]
Herath S, Harandi M, Porikli F. Going deeper into action recognition: A survey. Image and vision computing. 2017 Apr 1;60:4-21. [17]
Wang P, Li W, Ogunbona P, Wan J, Escalera S. RGB-D-based human motion recognition with deep learning: A survey. Computer Vision and Image Understanding. 2018 Jun 1;171:118-39. [18]
Sundararajan K, Woodard DL. Deep learning for biometrics: A survey. ACM Computing Surveys (CSUR). 2018 May 23;51(3):1-34. [19]
Minaee S, Abdolrashidi A, Su H, Bennamoun M, Zhang D. Biometric recognition using deep learning: A survey. arXiv preprint arXiv:1912.00271. 2019 Nov 30. [20]
Wang Z, Chen J, Hoi SC. Deep learning for image super-resolution: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2020 Mar 23. [21]
Hossain MZ, Sohel F, Shiratuddin MF, Laga H. A comprehensive survey of deep learning for image captioning. ACM Computing Surveys (CSUR). 2019 Feb 4;51(6):1-36. [22]
Shorten C, Khoshgoftaar TM. A survey on image data augmentation for deep learning. Journal of Big Data. 2019 Dec 1;6(1):60. [23]
Wang Z, She Q, Ward TE. Generative adversarial networks in computer vision: A survey and taxonomy. arXiv preprint arXiv:1906.01529. 2019 Jun 4. [24]
Young T, Hazarika D, Poria S, Cambria E. Recent trends in deep learning based natural language processing. ieee Computational intelligenCe magazine. 2018 Jul 20;13(3):55-75. 25]
Gatt A, Krahmer E. Survey of the state of the art in natural language generation: Core tasks, applications and evaluation. Journal of Artificial Intelligence Research. 2018 Jan 27;61:65-170. [26]
Santhanam S, Shaikh S. A survey of natural language generation techniques with a focus on dialogue systems-past, present and future directions. arXiv preprint arXiv:1906.00500. 2019 Jun 2. [27]
Gao J, Galley M, Li L. Neural approaches to conversational AI. InThe 41st International ACM SIGIR Conference on Research & Development in Information Retrieval 2018 Jun 27 (pp. 1371-1374). [28]
Chen H, Liu X, Yin D, Tang J. A survey on dialogue systems: Recent advances and new frontiers. Acm Sigkdd Explorations Newsletter. 2017 Nov 21;19(2):25-35. [29]
Li J, Sun A, Han J, Li C. A survey on deep learning for named entity recognition. IEEE Transactions on Knowledge and Data Engineering. 2020 Mar 17. [30]
Yadav V, Bethard S. A survey on recent advances in named entity recognition from deep learning models. arXiv preprint arXiv:1910.11470. 2019 Oct 25. [31]
Zhang L, Wang S, Liu B. Deep learning for sentiment analysis: A survey. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery. 2018 Jul;8(4):e1253. [32]
Do HH, Prasad PW, Maag A, Alsadoon A. Deep learning for aspect-based sentiment analysis: a comparative review. Expert Systems with Applications. 2019 Mar 15;118:272-99. [33]
Shi T, Keneshloo Y, Ramakrishnan N, Reddy CK. Neural abstractive text summarization with sequence-to-sequence models. arXiv preprint arXiv:1812.02303. 2018 Dec 5. [34]
Lai T, Bui T, Li S. A review on deep learning techniques applied to answer selection. In Proceedings of the 27th international conference on computational linguistics 2018 Aug (pp. 2132-2144). [35]
Zhang Y, Rahman MM, Braylan A, Dang B, Chang HL, Kim H, McNamara Q, Angert A, Banner E, Khetan V, McDonnell T. Neural information retrieval: A literature review. arXiv preprint arXiv:1611.06792. 2016 Nov 18. [36]
Almeida F, Xexéo G. Word embeddings: A survey. arXiv preprint arXiv:1901.09069. 2019 Jan 25. [37]
Xing FZ, Cambria E, Welsch RE. Natural language based financial forecasting: a survey. Artificial Intelligence Review. 2018 Jun 1;50(1):49-73. [38]
Litjens G, Kooi T, Bejnordi BE, Setio AA, Ciompi F, Ghafoorian M, Van Der Laak JA, Van Ginneken B, Sánchez CI. A survey on deep learning in medical image analysis. Medical image analysis. 2017 Dec 1;42:60-88. [39]
Shen D, Wu G, Suk HI. Deep learning in medical image analysis. Annual review of biomedical engineering. 2017 Jun 21;19:221-48. [40]
Xing F, Xie Y, Su H, Liu F, Yang L. Deep learning in microscopy image analysis: A survey. IEEE Transactions on Neural Networks and Learning Systems. 2017 Nov 22;29(10):4550-68. [41]
Haskins G, Kruger U, Yan P. Deep learning in medical image registration: a survey. Machine Vision and Applications. 2020 Jan 1;31(1):8. [42]
Pepe A, Trotta GF, Mohr-Ziak P, Gsaxner C, Wallner J, Bevilacqua V, Egger J. A Marker-Less Registration Approach for Mixed Reality–Aided Maxillofacial Surgery: a Pilot Evaluation. Journal of digital imaging. 2019 Dec 1;32(6):1008-18. [43]
Gsaxner C, Pepe A, Wallner J, Schmalstieg D, Egger J. Markerless image-to-face registration for untethered augmented reality in head and neck surgery. In International Conference on Medical Image Computing and Computer-Assisted Intervention 2019 Oct 13 (pp. 236-244). Springer, Cham. [44]
Gsaxner C, Pfarrkirchner B, Lindner L, Pepe A, Roth PM, Egger J, Wallner J. PET-train: Automatic ground truth generation from PET acquisitions for urinary bladder segmentation in CT images using deep learning. In 2018 11th Biomedical Engineering International Conference (BMEiCON) 2018 Nov 21 (pp. 1-5). IEEE. [45]
Lundervold AS, Lundervold A. An overview of deep learning in medical imaging focusing on MRI. Zeitschrift für Medizinische Physik. 2019 May 1;29(2):102-27. [46]
Shickel B, Tighe PJ, Bihorac A, Rashidi P. Deep EHR: a survey of recent advances in deep learning techniques for electronic health record (EHR) analysis. IEEE journal of biomedical and health informatics. 2017 Oct 27;22(5):1589-604. [47]
Hahn LD, Mistelbauer G, Higashigaito K, Koci M, Willemink MJ, Sailer AM, Fischbein M, Fleischmann D. CT-based True- and False-Lumen Segmentation in Type B Aortic Dissection Using Machine Learning. Radiology: Cardiothoracic Imaging. 2020 Jun 25;2(3):e190179. [48]
Hu Z, Tang J, Wang Z, Zhang K, Zhang L, Sun Q. Deep learning for image-based cancer detection and diagnosis − a survey. Pattern Recognition. 2018 Nov 1;83:134-49. [49]
Lan K, Wang DT, Fong S, Liu LS, Wong KK, Dey N. A survey of data mining and deep learning in bioinformatics. Journal of medical systems. 2018 Aug 1;42(8):139. [50]
Meyer P, Noblet V, Mazzara C, Lallement A. Survey on deep learning for radiotherapy. Computers in biology and medicine. 2018 Jul 1;98:126-46. [51]
Kalinin AA, Higgins GA, Reamaroon N, Soroushmehr S, Allyn-Feuer A, Dinov ID, Najarian K, Athey BD. Deep learning in pharmacogenomics: from gene regulation to patient stratification. Pharmacogenomics. 2018 May;19(7):629-50. [52]
Mazurowski MA, Buda M, Saha A, Bashir MR. Deep learning in radiology: An overview of the concepts and a survey of the state of the art with focus on MRI. Journal of magnetic resonance imaging. 2019 Apr;49(4):939-54. [53]
Zhang Q, Yang LT, Chen Z, Li P. A survey on deep learning for big data. Information Fusion. 2018 Jul 1;42:146-57. [54]
Mohammadi M, Al-Fuqaha A, Sorour S, Guizani M. Deep learning for IoT big data and streaming analytics: A survey. IEEE Communications Surveys & Tutorials. 2018 Jun 6;20(4):2923-60. [55]
Emmert-Streib F, Yang Z, Feng H, Tripathi S, Dehmer M. An introductory review of deep learning for prediction models with big data. Frontiers in Artificial Intelligence. 2020;3:4. [56]
Mousavi SS, Schukat M, Howley E. Deep reinforcement learning: an overview. In Proceedings of SAI Intelligent Systems Conference 2016 Sep 21 (pp. 426-440). Springer, Cham. [57]
Li Y. Deep reinforcement learning: An overview. arXiv preprint arXiv:1701.07274. 2017 Jan 25. [58]
Arulkumaran K, Deisenroth MP, Brundage M, Bharath AA. Deep reinforcement learning: A brief survey. IEEE Signal Processing Magazine. 2017 Nov 9;34(6):26-38. [59]
Bevilacqua V, Biasi L, Pepe A, Mastronardi G, Caporusso N. A computer vision method for the Italian finger spelling recognition. In International Conference on Intelligent Computing 2015 Aug 20 (pp. 264-274). Springer, Cham. [60]
Labini MS, Gsaxner C, Pepe A, Wallner J, Egger J, Bevilacqua V. Depth-Awareness in a System for Mixed-Reality Aided Surgical Procedures. In International Conference on Intelligent Computing 2019 Aug 3 (pp. 716-726). Springer, Cham. [61]
Zhang C, Patras P, Haddadi H. Deep learning in mobile and wireless networking: A survey. IEEE Communications Surveys & Tutorials. 2019 Mar 13;21(3):2224-87. [62]
Karner F, Gsaxner C, Pepe A, Li J, Fleck P, Arth C, Wallner J, Egger J. Single-shot Deep Volumetric Regression for Mobile Medical Augmented Reality. MICCAI CLIP, 2020 Oct 4 (pp. 1-11), Springer, Lecture Notes in Computer Science (LNCS). [63]
Ota K, Dao MS, Mezaris V, Natale FG. Deep learning for mobile multimedia: A survey. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM). 2017 Jun 28;13(3s):1-22. [64]
Gsaxner C, Roth PM, Wallner J, Egger J. Exploit fully automatic low-level segmented PET data for training high-level deep learning algorithms for the corresponding CT data. PloS one. 2019 Mar 5;14(3):e0212550. [65]
Ramachandram D, Taylor GW. Deep multimodal learning: A survey on recent advances and trends. IEEE Signal Processing Magazine. 2017 Nov 9;34(6):96-108. [66]
Ball JE, Anderson DT, Chan CS. Comprehensive survey of deep learning in remote sensing: theories, tools, and challenges for the community. Journal of Applied Remote Sensing. 2017 Sep;11(4):042609. [67]
Zhang Z, Cui P, Zhu W. Deep learning on graphs: A survey. IEEE Transactions on Knowledge and Data Engineering. 2020 Mar 17. [68]
Kwon D, Kim H, Kim J, Suh SC, Kim I, Kim KJ. A survey of deep learning-based network anomaly detection. Cluster Computing. 2019 Jan:1-3. [69]
Zhang S, Yao L, Sun A, Tay Y. Deep learning based recommender system: A survey and new perspectives. ACM Computing Surveys (CSUR). 2019 Feb 25;52(1):1-38. [70]
Kamilaris A, Prenafeta-Boldú FX. Deep learning in agriculture: A survey. Computers and electronics in agriculture. 2018 Apr 1;147:70-90. [71]
Pouyanfar S, Sadiq S, Yan Y, Tian H, Tao Y, Reyes MP, Shyu ML, Chen SC, Iyengar SS. A survey on deep learning: Algorithms, techniques, and applications. ACM Computing Surveys (CSUR). 2018 Sep 18;51(5):1-36. [72]
Dargan S, Kumar M, Ayyagari MR, Kumar G. A survey of deep learning and its applications: A new paradigm to machine learning. Archives of Computational Methods in Engineering. 2019 Jun 1:1-22. [73]
Raghu M, Schmidt E. A survey of deep learning for scientific discovery. arXiv preprint arXiv:2003.11755. 2020 Mar 26. [74]
Hong JW. Why Is Artificial Intelligence Blamed More? Analysis of Faulting Artificial Intelligence for Self-Driving Car Accidents in Experimental Settings. International Journal of Human–Computer Interaction. 2020 Nov 7;36(18):1768-74. [75]
Kohl C, Knigge M, Baader G, Böhm M, Krcmar H. Anticipating acceptance of emerging technologies using twitter: the case of self-driving cars. Journal of Business Economics. 2018 Jul 1;88(5):617-42. [76]
Hong JW, Choi S, Williams D. Sexist AI: An Experiment Integrating CASA and ELM. International Journal of Human–Computer Interaction. 2020 Aug 15:1-4. [77]
Liu X, Faes L, Kale AU, Wagner SK, Fu DJ, Bruynseels A, Mahendiran T, Moraes G, Shamdas M, Kern C, Ledsam JR. A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis. The lancet digital health. 2019 Oct 1;1(6):e271-97. [78]
Campbell M, Hoane Jr AJ, Hsu FH. Deep blue. Artificial intelligence. 2002 Jan 1;134(1-2):57-83. [79]
Chen JX. The evolution of computing: AlphaGo. Computing in Science & Engineering. 2016 Jul;18(4):4-7. [80]
Fan J, Fang L, Wu J, Guo Y, Dai Q. From Brain Science to Artificial Intelligence. Engineering. 2020 Mar 1;6(3):248-52.

Tables
| Category | Publications | Years | Number of references | Citations (until August 2020) | Preprints |
|---|---|---|---|---|---|
| Computer vision | 16 | 2017-2020 | 3294 | 2423 | Yes |
| Language processing | 14 | 2017-2020 | 2109 | 2490 | Yes |
| Medical informatics | 11 | 2017-2020 | 2065 | 6022 | No |
| Additional works | 17 | 2016-2020 | 3481 | 4171 | Yes |
| Sum | 58 | - | 10949 | 15106 | - |

Table 1. Overview of published reviews in deep learning divided into the categories: Computer vision, language processing, medical informatics and additional deep learning surveys. The table presents also the sum of the overall references and citations for the single categories.

| Computer vision | Publications | Year | Number of references | Citations (until August 2020) | Preprint |
|---|---|---|---|---|---|
| Object detection | Liu et al. [7] | 2019 | 332 | 269 | No |
| | Zhao et al. [8] | 2019 | 230 | 491 | No |
| | Jiao et al. [9] | 2019 | 317 | 45 | No |
| Image segmentation | Garcia-Garcia et al. [10] | 2018 | 126 | 127 | No |
| | Minaee et al. [11] | 2020 | 172 | 24 | Yes |
| Face recognition | Masi et al. [13] | 2018 | 81 | 220 | No |
| | Li and Deng [14] | 2020 | 253 | 189 | No |
| | Mei and Deng [15] | 2020 | 305 | 11 | Yes |
| Action/motion recognition | Herath et al. [16] | 2017 | 161 | 339 | No |
| | Wang et al. [17] | 2018 | 182 | 122 | No |
| Biometric recognition | Sundararajan and Woodard [18] | 2018 | 176 | 66 | No |
| | Minaee et al. [19] | 2020 | 282 | 8 | Yes |
| Image super-resolution | Wang et al. [20] | 2020 | 214 | 74 | No |
| Image captioning | Hossain et al. [21] | 2019 | 161 | 118 | No |
| Data augmentation | Shorten and Khoshgoftaar [22] | 2019 | 140 | 274 | No |
| Generative adversarial networks | Wang et al. [23] | 2019 | 162 | 46 | Yes |
| Sum | - | - | 3294 | 2423 | - |

Table 2. List of published reviews in deep learning in the category computer vision.

| Language processing | Publications | Year | Number of references | Citations (until August 2020) | Preprint |
|---|---|---|---|---|---|
| General language processing | Young et al. [24] | 2018 | 164 | 922 | No |
| Language generation and conversation | Gatt and Krahmer [25] | 2018 | 548 | 270 | No |
| | Santhanam and Shaikh [26] | 2019 | 137 | 8 | Yes |
| | Gao et al. [27] | | | | No |
| | Chen et al. [28] | 2017 | 111 | 202 | No |
| Named entity recognition | Li et al. [29] | 2018 | 211 | 43 | No |
| | Yadav and Bethard [30] | 2019 | 83 | 145 | Yes |
| Sentiment analysis | Zhang et al. [31] | 2018 | 150 | 398 | No |
| | Do et al. [32] | 2018 | 135 | 86 | No |
| Text summarization | Shi et al. [33] | 2018 | 131 | 30 | Yes |
| Answer selection | Lai et al. [34] | 2018 | 56 | 33 | No |
| Word embedding | Zhang et al. [35] | 2017 | 187 | 27 | Yes |
| | Almeida and Xexéo [36] | 2019 | 48 | 18 | Yes |
| Financial forecasting | Xing et al. [37] | 2018 | 128 | 93 | No |
| Sum | - | - | 2109 | 2490 | - |

Table 3. List of published reviews in deep learning in the category language processing.

| Medical informatics | Publications | Year | Number of references | Citations (until August 2020) | Preprint |
|---|---|---|---|---|---|
| Medical image analysis | Litjens et al. [38] | 2017 | 439 | 3696 | No |
| | Shen et al. [39] | 2017 | 117 | 1200 | No |
| | Xing et al. [40] | 2017 | 207 | 94 | No |
| | Haskins et al. [41] | 2020 | 122 | 49 | No |
| Medical imaging | Lundervold and Lundervold [45] | 2019 | 359 | 199 | No |
| Health-record analysis | Shickel et al. [46] | 2017 | 63 | 366 | No |
| Cancer detection and diagnosis | Hu et al. [48] | 2018 | 144 | 108 | No |
| Bioinformatics | Lan et al. [49] | 2018 | 127 | 85 | No |
| Radiotherapy | Meyer et al. [50] | 2018 | 234 | 84 | No |
| Pharmacogenomics | Kalinin et al. [51] | 2018 | 128 | 44 | No |
| Radiology | Mazurowski et al. [52] | 2018 | 125 | 97 | No |
| Sum | - | - | 2065 | 6022 | - |

Table 4. List of published reviews in deep learning in the category medical informatics.

| Additional works | Publications | Year | Number of references | Citations (until August 2020) | Preprint |
|---|---|---|---|---|---|
| Big data | Zhang et al. [53] | 2017 | 102 | 383 | No |
| | Mohammadi et al. [54] | 2018 | 229 | 340 | No |
| | Emmert-Streib et al. [55] | 2020 | 154 | 3 | No |
| Reinforcement learning | Mousavi et al. [56] | 2016 | 45 | 57 | No |
| | Li [57] | 2017 | 604 | 463 | Yes |
| | Arulkumaran et al. [58] | 2017 | 100 | 438 | No |
| Mobile and wireless networking | Zhang et al. [61] | 2018 | 574 | 372 | No |
| Mobile multimedia | Ota et al. [63] | 2017 | 111 | 70 | No |
| Multimodal learning | Ramachandram and Taylor [65] | 2017 | 103 | 114 | No |
| Remote sensing | Ball et al. [66] | 2017 | 419 | 184 | No |
| Graphs | Zhang et al. [67] | 2020 | 170 | 142 | No |
| Anomaly detection | Kwon et al. [68] | 2019 | 45 | 196 | No |
| Recommender systems | Zhang et al. [69] | 2019 | 210 | 583 | No |
| Agriculture | Kamilaris and Prenafeta-Boldú [70] | 2018 | 72 | 612 | No |
| Multiple areas | Pouyanfar et al. [71] | 2018 | 181 | 188 | No |
| | Dargan et al. [72] | 2019 | 87 | 22 | No |
| | Raghu and Schmidt [73] | 2020 | 275 | 4 | Yes |