A View of Regularized Approaches for Image Segmentation
Laura Antonelli, Institute for High Performance Computing and Networking (ICAR), CNR, Naples, Italy; [email protected]
University of Campania “Luigi Vanvitelli”, Department of Mathematics and Physics, Caserta, Italy; [email protected]
University of Naples Federico II, Department of Mathematics and Applications “R. Caccioppoli”, Naples, Italy; daniela.diserafi[email protected]
* Correspondence: [email protected]
† Current address: National Research Council of Italy, via P. Castellino 111, I-80131 Napoli, Italy
‡ All the authors contributed equally to this work.
Abstract:
Image segmentation is a central topic in image processing and computer vision and a key issue in many applications, e.g., in medical imaging, microscopy, document analysis and remote sensing. According to human perception, image segmentation is the process of dividing an image into non-overlapping regions. These regions, which may correspond to different objects, are fundamental for the correct interpretation and classification of the scene represented by the image. The division into regions is not unique: it depends on the application, i.e., it must be driven by the final goal of the segmentation and hence by the most significant features with respect to that goal. Image segmentation is an inherently ill-posed problem. A classical approach to deal with ill-posedness consists in the use of regularization, which allows us to incorporate in the model a-priori information about the solution. In this work we provide a brief overview of regularized mathematical models for image segmentation, considering edge-based and region-based variational models, as well as statistical and machine-learning approaches. We also sketch numerical methods that are applied in computing solutions coming from those techniques.

Keywords: image segmentation; ill-posed problems; regularization; variational models; machine-learning techniques; numerical methods.

MSC:
1. Introduction and applications

Image segmentation is a fundamental task of image processing, image analysis, image understanding, and pattern recognition. It has a very long history, whose origin may be dated back to about 50 years ago. A seminal paper is [1], where the authors pointed out that an important component of the Stanford Research Institute automation project was a set of programs providing the automaton with a means of interpreting visual data. While it is possible to accurately represent the information in a real scene by an image, this representation alone does not enable us to highlight specific properties of the scene. Conversely, a description in terms of “natural” elements of the image, such as regions and boundaries of the visualized objects, represented in a uniform manner, provides easy access to useful global information, thus allowing recognition and extraction of specific image features. Thus, to generate a description of specific elements of the image, it is necessary to segment the image into parts (or segments). Since different types of images (real scenes, synthetic images, medical images, etc.; see Figure 1) may require different partitions to extract significant features, there is no single standard method for image segmentation. On the other hand, different methods are not equally effective in segmenting a specific type of image, and the criteria defining a successful segmentation depend on the desired goal of the segmentation itself. Thus, the segmentation problem does not have a unique solution, as shown in Figure 2, where different segmentations of the same image, resulting from different segmentation criteria, are displayed. Therefore, segmentation is an inherently ill-posed problem and remains a challenging problem in image processing and computer vision, in spite of several decades of research.

Figure 1.
Examples of different types of images: (a) natural scene [2], (b) star-forming nursery taken by the NASA/ESA Hubble Space Telescope [3], (c) human knee depicted by magnetic resonance imaging [4], (d) multiple projections and sinogram of a human brain acquired by Single Photon Emission Tomography [5], (e) handwritten text [6].
Figure 2.
Segmentations of the Berkeley database image produced by different users, consisting of 6, 18 and 36 segments, respectively.
With the advances in computer technology and mathematical models, segmentation methods have been refined and improved, maintaining strong relationships with other computer vision methods such as image classification and edge detection, while differing from them by their aims. Despite their strong relationships, those methods address different classes of problems and produce different outputs. Following [7], we can distinguish four main computer vision problems, which are listed next in increasing order of difficulty:
• image classification: classify the main object category within an image;
• object detection: identify the object category and locate its position using a bounding box, for every known object within an image (edge detection may be considered a subset of this problem);
• semantic segmentation: identify the object category of each pixel for every known object within an image;
• instance segmentation: identify each object instance of each pixel for every known object within an image.
An illustration of the previous problems is given in Figure 3.

Figure 3.
Illustration of the computer vision problems listed in Section 1: (a) image classification, (b) object detection, (c) semantic segmentation, (d) instance segmentation. The results were produced by using Adobe Photoshop.

Image segmentation is used in many application fields, such as biomedicine, remote sensing, public security, transportation, agriculture, environmental analysis, ecology, geology, weather prediction, disaster assessment, and search and rescue. The theoretical and practical results of the increasing research activity on image segmentation promote their application to new problems, which in turn pose further challenges in the design of mathematical models and numerical solution algorithms. This is also confirmed by the increasing number of documents in the Scopus database that include the word “segmentation” in their titles (see Figure 4). In the following, we outline the use of image segmentation in some application fields. Of course, we are very far from being exhaustive.
Figure 4.
Number of documents indexed in the Scopus database in the last thirty years, which contain the word “segmentation” in the title.
The segmentation of medical images is commonly used for measuring and visualizing anatomical structures, assessing the functionality of human organs, outlining pathological regions, analyzing biological and metabolic processes, setting therapy plans, and image-guided surgery [8,9]. Manual segmentation by experts is not just a tedious and time-consuming process, but also error-prone, especially with the increasing complexity of medical imaging technologies and the huge amount of images to be processed. It is therefore necessary to develop accurate and reliable automatic image segmentation methods, and also to collect and share image data and segmentation results, in order to better understand complex diseases and to design new therapies (see, e.g., [10] and the references therein). The main medical imaging techniques are listed below together with a very simple description; for further details the reader is referred to [11–15].
• Radiography (X-Rays) is a technique using X-rays to produce an image of the internal structure of human organs;
• Computed Tomography (CT) refers to a computerized X-ray imaging procedure in which a narrow beam of X-rays is aimed at a patient and quickly rotated around his/her body, producing signals that are processed to generate cross-sectional images (or slices) of human organs, reproducing their three-dimensional structure;
• Positron Emission Tomography (PET) provides three-dimensional images describing functional processes of human organs by means of the distribution map of a positron-emitting radiopharmaceutical injected into a patient;
• Single Photon Emission Computed Tomography (SPECT) is similar to PET, but produces three-dimensional images with lower resolution, using single-photon-emitting radiopharmaceuticals;
• Magnetic Resonance Imaging (MRI) uses a magnetic field and radio waves to produce structural images of organs and tissues of the human body.
At a different scale from the previous ones, microscopy images are generally produced using light microscopes [16,17], which provide structural and temporal information about biological and non-biological specimens. In the most widely used light microscopy techniques, the light is transmitted from a source on the opposite side of the specimen to the objective lens. On the contrary, fluorescence microscopy uses the light reflected by the specimen. Electron microscopes are used to produce higher resolution images than light microscopes, providing information that is otherwise inaccessible [18]. There are two main types of electron microscope, the Scanning Electron Microscope (SEM) and the Transmission Electron Microscope (TEM), which have similar components. Each of them has an electron source that emits an electron stream towards a sample as a source of illumination, and each contains a series of electromagnetic and electrostatic lenses and electron apertures to control the electron beam and capture images. SEM sweeps the electron beam across the sample and records the electrons bouncing back, while TEM works by recording electrons passing through a sample to a detector, providing details of the inner structure as small as individual atoms. Moreover, SEM provides a three-dimensional image of the observed sample, while TEM provides two-dimensional projections of the sample.

Applications of microscopy imaging range from life science to nanotechnology, and from manufacturing processes to environmental monitoring, with very different data and objectives. For example, in biology, research based on microscopy imaging requires methods for quantitative, unbiased, and reproducible extraction of meaningful measurements, to quantify morphological properties as well as to investigate intra- and inter-cellular dynamics.
To address this need, new technologies have been developed, such as microscopy-based screening, sequencing and imaging with automated analysis (including high-throughput screening [19] and high-content screening [20]), where image segmentation is a fundamental task. Microscopy imaging techniques are also widely used for studying structural and morphological properties of materials at different length scales (from micrometers to angstroms) [21]. Material properties are probed by light microscopy or electron microscopy, and the segmentation aims to detect morphology, texture, microstructure, and chemical composition, and to identify defects in a structure or a mechanical behavior.
Document images denote the output of scanning or photographing paper documents, as well as video frames where captions are present or pictures of scenes where text is present. Document image analysis is concerned with the transformation of any information present in a document image into an equivalent symbolic representation accessible to computer information processing [22]. In this context, segmentation is a fundamental process to extract information from the document image, such as its basic components (characters, lines, words, pictures). The methods used to achieve this goal generally exploit the differences in the properties of textual and image regions within the document [23,24]. Applications range from historical document validation and signature authentication to document compression and digital document processing [25].
Segmentation has been used in remote sensing image processing since the advent of the Landsat-1 satellite. Satellite imaging, as a part of remote sensing, consists in scanning the Earth by satellites equipped with different kinds of sensors, which collect the electromagnetic radiation coming from the Earth itself. There exist two main groups of remote sensing systems, classified according to the source of the signal they use, passive and active. Passive remote sensing instruments (e.g., optical systems) rely on the electromagnetic radiation reflected or emitted from the surface of the Earth. They are further divided into two groups based on the spectral resolution of the sensors, multispectral and hyperspectral remote sensing [26]. Conversely, active remote sensing instruments (e.g., RADAR and LIDAR) operate with their own electromagnetic energy, transmitted towards the Earth’s surface (see [27] for further details). Advances and applications of remote sensing segmentation, e.g., in environmental monitoring (agriculture and forestry, rural and urban planning, climate change, weather forecasting), hydrology (water resources, soil moisture maps, geology), and oceanography (evolution of the ocean basins, monitoring of ship traffic, detection of oil slicks), among others, can be found in [28–31].

Table 1 lists some research projects related to image segmentation in the application fields previously described.
2. Contribution and outline of this work
We present image segmentation as an ill-posed problem, and discuss widely used models based on regularization approaches, attempting to put them into a coherent mathematical framework. We also show how the key idea of incorporating a-priori information is ubiquitous in regularized models, from well-established ones to newer machine learning approaches, revealing links and similarities between them. Finally, we give a quick overview of some numerical methods used in the application of the various models. Although this is our (partial) view of image segmentation, we believe that it may contribute to a better understanding of this huge field.

The rest of this paper is organized as follows. In Section 3 we present a mathematical formulation of image segmentation, and in Section 4 we discuss regularized segmentation models, focusing on edge-based, region-based, statistical and machine learning ones. In Section 5 we give a quick overview of numerical techniques that may be used to solve the aforementioned models. Finally, we give some conclusions in Section 6.

Landsat-1 was the first Earth-observing satellite, launched by NASA on July 23, 1972, to study and monitor our planet’s landmasses. RADAR (RAdio Detection And Ranging) and LIDAR (LIght Detection And Ranging) have the same purpose, i.e., the detection of distant objects, but they differ in the transmitted wavelength used to scan the Earth’s surface.
3. Mathematical formulation of image segmentation
Let I be the space of the images defined on a domain Ω ⊂ R^d (d ≥ 2), I ∈ I the observed image, and P_1, . . . , P_n some propositions representing the features driving the segmentation of I. For example, P_k may represent the smoothness or the texture of an image, or may distinguish the objects in the image from the background. Note that P_k may also be associated with a function that maps any subset A ⊂ Ω to a logical value, indicating whether the proposition P_k is true or false for all the pixels of I corresponding to A. Henceforth, we identify P_k with that function to simplify the notation.

Table 1.
Research projects related to image segmentation in different application fields.
Project & Repository: Description
Medical imaging
BRAINS [32]: Brain Images of Normal Subjects bank
INbreast [33]: Mammographic database for breast cancer imaging and study
TCIA [34]: The Cancer Imaging Archive
PPMI [35]: Parkinson Progression Markers Initiative
ADNI [36]: Alzheimer Disease Neuroimaging Initiative
OAI [4]: Osteoarthritis Initiative
Microscopy imaging
UCSB Bio [37]: Segmentation benchmark dataset of 2D/3D images and sequences
BBBC [38]: Broad Bioimage Benchmark Collection
EVICAN [39]: Expert VIsual Cell ANnotation of different cell lines
PH2 [40]: Dermoscopic image database for research and benchmarking
EMPIAR [41]: Electron Microscopy Pilot Image ARchive
GeoMod2008 [42]: SEM image dataset for natural and artificial granular materials

Document image analysis
Layout Analysis Dataset [43]: Realistic document database (magazines and technical/scientific publications)
IMPACT [44]: Historical documents and books of European libraries
REID2019 [45]: Recognition of Early Indian Printed Documents
MNIST [46]: Database of handwritten digits
CEDAR [47]: Database for handwritten text recognition research
Remote sensing imaging
SpaceNet [48]: Multi-Temporal Urban Development Challenge
95-Cloud [49]: Scene images from satellite Landsat 8
SEN12MS [50]: Georeferenced multispectral Sentinel-1/2 imagery
LandCoverNet [51]: Global Land Cover Classification Training Dataset

Generalizing the definition in [52], the segmentation of I according to the propositions P_k, k = 1, . . . , n, consists of finding a decomposition of Ω into m connected components Ω_i, with i = 1, . . . , m and m ≥ n, such that

1. Ω_i ≠ ∅ for all i ∈ {1, . . . , m};
2. int(Ω_i) ∩ int(Ω_j) = ∅ for all i, j ∈ {1, . . . , m} with i ≠ j, where int(Ω_k) denotes the interior of Ω_k;
3. ∪_{i=1}^{m} Ω_i = Ω;
4. for all i ∈ {1, . . . , m} there exists a unique k ∈ {1, . . . , n} such that
   i.   P_k(I|Ω_i) = true, where I|Ω_i denotes the restriction of I to Ω_i,
   ii.  P_k(I|Ω_j) = false for all j ∈ {1, . . . , m} with j ≠ i,
   iii. P_k(I|Ω_i ∪ Ω_j) = false for all j ∈ {1, . . . , m} with j ≠ i.

For example, the propositions P_1 = {men}, P_2 = {moon} and P_3 = {background} provide the semantic segmentation of Figure 3(a) into the three components shown in Figure 3(c), but if P_1 is changed into P'_1 = {man} the result is the instance segmentation given by the five connected components of Figure 3(d).

Let Σ be the space of the possible segmentations of the images in I, and S ∈ Σ a particular segmentation of I. Then S can also be expressed as

    S = (u*, I*),

where u* is a curve that matches the boundaries of the decomposition of Ω, i.e., u* = ∪_i ∂Ω_i, and I* is a piecewise-smooth function defined on Ω that approximates I. In particular, we may assume that the restriction of I* to any set int(Ω_i) is differentiable. The segmentation may also be identified directly by using a labeling operator Φ, i.e.,

    S = Φ(I*),    (1)

where Φ(I*(x)) = l_i if x ∈ Ω_i, I*(x) is the value of I* associated with x, and l_i ∈ N = {l_1, l_2, . . . , l_m} is a label.
4. Regularized segmentation models
Image segmentation is an ill-posed problem, whose solution is highly undetermined, or highly ill-conditioned, or both. Classical approaches for computing a solution of an ill-posed problem require additional information that enforces uniqueness and stability. To this end, regularization methods are widely used. In this case, the solution is generally obtained by minimizing an energy functional E containing a fidelity term F, which measures the consistency of the candidate segmentation with the observed image, and a regularization term R, which promotes solutions with suitable properties:

    (I*, u*) := arg min_{(I,u)} E(I, u; Ī) = arg min_{(I,u)} ( F(I, u; Ī) + λ R(I, u) ),    (2)

where Ī denotes the observed image and λ > 0 is a regularization parameter balancing F and R (see, e.g., [53] and the references therein).

The minimization problem (2) can be solved by writing the Euler-Lagrange equations, which can be derived by integrating the energy functional by parts and using the Gauss theorem along with the fundamental lemma of the calculus of variations. Then a numerical solution can be computed by applying a gradient descent approach, where the descent direction is parameterized by an artificial time, and a discretization by finite differences. A widely used and effective alternative consists in discretizing problem (2) and then solving it by a numerical optimization method. We will come back to these two approaches in Section 5.

Recently, machine learning techniques have been successfully applied to segmentation problems. The key idea is to tune a generic model to a specific solution through learning against sample data (training data). The learning phase extracts the prior information to be embedded into the regularization term from a large dataset containing pairs of images and ground-truth labels [54]. Machine learning approaches avoiding the use of a training dataset are also available.

In the next subsections, we provide a few examples of regularized models widely used in image segmentation.
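As a minimal illustration of the discretize-then-optimize route for a functional of the form (2), the sketch below minimizes a simple smooth energy, a quadratic fidelity term plus a quadratic smoothness regularizer, by plain gradient descent on a 1-D signal. The energy, step length and iteration count are our own illustrative choices, not a model from the literature.

```python
import numpy as np

rng = np.random.default_rng(0)
# Noisy piecewise-constant "observed image" (1-D for simplicity)
I_obs = np.repeat([0.2, 0.8], 32) + 0.05 * rng.standard_normal(64)

lam, tau = 1.0, 0.1   # regularization weight lambda and step length

def energy(I):
    # E(I) = F(I; I_obs) + lam * R(I), both terms quadratic (hence smooth)
    fidelity = np.sum((I - I_obs) ** 2)
    regularity = np.sum((np.roll(I, -1) - I) ** 2)   # periodic forward differences
    return fidelity + lam * regularity

I = I_obs.copy()
for _ in range(200):
    lap = np.roll(I, 1) + np.roll(I, -1) - 2 * I     # discrete periodic Laplacian
    grad = 2 * (I - I_obs) - 2 * lam * lap           # gradient of E
    I -= tau * grad                                  # gradient descent step
```

Starting from the observed signal, each step trades a little fidelity for smoothness, so the total energy decreases monotonically for a sufficiently small step length.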
Edge-based models aim at finding u* = ∪_i ∂Ω_i by solving the minimization problem (2) with respect to the curve u (in this case, I and I* are not explicitly considered). Among those models, the Active Contours [55], or Snakes, are the most common ones. Here u in (2) is a parameterized curve, and the fidelity and regularization terms act as an external force and an internal force, respectively, which move the curve within the image to find the boundaries of the sets Ω_i. More precisely, the energy functional takes the form

    E_AC(u) = ∫ g(|∇Ī(u(s))|) ds  +  λ ∫ |u'(s)| ds,    (3)

where the first integral is the fidelity term F and the second one the regularization term R, Ī is the observed image, g is an edge-detector function, and the curve u is parametrized by s ∈ [0, 1]. The first term attracts the curve toward the boundaries, whereas the second one controls its smoothness; as a result, the curve u changes its shape like a snake. The evolving curve is driven by surface properties, such as curvature and normal direction, and by image features, such as gray levels and intensity gradient. For example, the mean curvature can be used, and in this case the edge-detector function is also responsible for stopping the curve on the edges. In particular, g may be defined as

    g(|∇Ī|) = 1 / (1 + |∇(G_σ ∗ Ī)|²),

where g is a positive and decreasing function, G_σ is the Gaussian kernel with standard deviation σ, and ∗ denotes the convolution operator. In a Lagrangian approach, an initial curve is evolved by

    ∂u/∂t + L(u) =
0,    (4)

where L is a differential operator. The simplest evolution is given by L(u) = F·N, where N is the normal to the curve and F is a constant that determines the speed of evolution. More generally, the evolution is driven by an external force. For example, in the mean-curvature evolution, L(u) = κN, where κ is the Euclidean curvature of u [56]. When u has an explicit representation, it is not easy to deal with topological changes like merging and splitting, and a re-parametrization of the curve may be required. Therefore, the evolution of the curve u is commonly described by level-set methods [57], because of their ability to follow topology changes, cusps and corners. In a level-set approach, the curve u is implicitly represented by the zero level set of a function φ(t, x), i.e., u = {x ∈ Ω : φ(t, x) = 0}. The level-set formulations of the simplest evolution and the mean-curvature one read, respectively,

    ∂φ/∂t = F |∇φ|   and   ∂φ/∂t = div( ∇φ / |∇φ| ) |∇φ|.

Region-based models directly provide the segmentation by means of the image partition {Ω_i, i =
1, . . . , n}. Region-growing models are among the simplest models falling into this class [58]. Since aggregation criteria based only on gray-level measurements may not be sufficient to obtain accurate segmentations, region-growing methods have been merged with variational approaches, where the region evolves according to the minimization of an energy functional including region-based terms [59].

One of the most popular models was proposed by Mumford and Shah [60]. In this case, the functional E in (2) takes the form (with a little abuse of notation, we identify the curve u with its parametrized representation u(s)):

    E_MS(I, u) = ∫_Ω (I − Ī)² dx  +  λ ∫_{Ω∖u} |∇I|² dx + µ len(u),    (5)

where the first integral is the fidelity term F and the remaining terms form the regularization term R, Ī is the observed image, len(u) denotes the length of u, and λ, µ > 0. The fidelity term measures the distance between the observed image Ī and the optimal piecewise-smooth approximation I, and the regularization term attempts to reduce the variation of I within each set Ω_i while keeping the curve u as short as possible. Minimizing (5) in a suitable space provides an optimal pair (I*, u*) representing a simplified description of Ī by means of a function with bounded variation and a set of edges [60]. Finally, in [61] the Mumford and Shah model is formulated as a deterministic refinement of a probabilistic model for image restoration.

A simplified version of the Mumford-Shah model is its restriction to piecewise-constant functions. The Chan-Vese model [62] is a particular case of that simplified version, aimed at obtaining a two-phase segmentation, where the piecewise-constant function assumes only two values.
Its functional E takes the following form:

    E_CV(I, c_in, c_out) = ∫_Ω H(I) (c_in − Ī)² dx + ∫_Ω (1 − H(I)) (c_out − Ī)² dx  +  λ ∫_Ω |∇H(I)| dx,    (6)

where the first two integrals form the fidelity term F and the last one is the regularization term R, Ī is the observed image, H is the Heaviside function, and c_in and c_out are the average values of the intensity in the foreground and background of the image, respectively. The solution I* is the best approximation to Ī among all the functions that take only two values.

The functional (6) is nonconvex, thus minimization methods may get stuck in local minima and produce unsatisfactory segmentations. Aiming to overcome this drawback, some strategies have been proposed, including the convexification of the functional by taking advantage of its geometric properties. An example is given by the two-phase partitioning model introduced by Chan, Esedoḡlu and Nikolova [63]:

    E_CEN(I, c_in, c_out) = ∫_Ω ( (c_in − Ī)² I + (c_out − Ī)² (1 − I) ) dx  +  λ ∫_Ω |∇I| dx,    (7)

with 0 ≤ I ≤ 1; for fixed c_in and c_out, the functional (7) is convex with respect to I.

Statistical models usually provide a conditional probability, P(S | I), of a segmentation S ∈ Σ given the observed image I, and then select the segmentation with the highest probability. In the Maximum a Posteriori (MAP) approach the segmentation is given by

    S* = arg max_{S ∈ Σ} P(S | I),

where the posterior probability P(S | I) can be expressed through the Bayes theorem as

    P(S | I) ∝ P(I | S) P(S).

Here P(S) is the prior probability, measuring how well S satisfies certain properties of the given image, and the conditional probability P(I | S) measures the likelihood of I given S. Markov Random Field (MRF) models offer a framework to define prior and likelihood by capturing properties of the image such as texture, color, etc. [64].
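Going back to the two-phase model (6), its fidelity term suggests a simple alternating scheme: fix the partition and update the region averages c_in and c_out, then fix the averages and reassign each pixel to the closer one. The sketch below implements only this fidelity-driven alternation on a synthetic image, ignoring the length/TV regularization term; it illustrates the structure of the model, and is not a full Chan-Vese solver.

```python
import numpy as np

rng = np.random.default_rng(1)
img = 0.1 * rng.standard_normal((32, 32))
img[8:24, 8:24] += 0.8                       # bright square = foreground

inside = img > img.mean()                    # initial two-phase partition
for _ in range(10):
    c_in, c_out = img[inside].mean(), img[~inside].mean()   # region averages
    # Reassign each pixel to the closer average: this decreases the
    # fidelity term of (6); the length/TV term is ignored in this sketch.
    inside = (img - c_in) ** 2 < (img - c_out) ** 2
```

With the regularization term included, isolated misclassified pixels would additionally be penalized for the boundary length they create.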
The segmentation is formulated within an image labeling framework, i.e., S = Φ(I*) as in (1), where the problem is reduced to finding the labeling which maximizes the posterior probability. Label dependencies are modeled by an MRF; then, using the Hammersley-Clifford theorem, we get the Gibbs distribution

    P(S) = (1/Z) exp(−U(S)),

where the energy function U takes the form

    U(S) = ∑_{c ∈ C} V_c(S_c),

C is the set of cliques of S, V_c(S_c) is the potential of the clique c ∈ C having the label configuration S_c, and Z is a normalizing constant. The conditional probability P(I | S) is modeled by a Gaussian distribution. Then the original MAP estimation is equivalent to the following energy minimization problem:

    S* = arg max_S P(S | I) = arg min_S U(S).

Recently, machine learning approaches, and in particular deep learning ones, have been used in solving image segmentation problems, often outperforming the previous approaches. Machine learning approaches do not benefit from prior information on the solution, but “learn” the segmentation from large training datasets.

More in detail, the aim of machine learning methodologies is to define a segmentation model f_θ : I → Σ such that the segmentation of I can be obtained as I* = f_θ(I). The function f_θ is usually nonlinear and θ is a large set of unknown parameters. The learning phase selects a set of parameters θ in order to minimize a loss (or cost) functional that measures the accuracy of the predicted segmentation f_θ(I). In supervised machine learning, training data are available from databases of manual or annotated segmentations, which provide a large number of pairs (I, I*) ∈ X × Y ⊂ I × Σ (X is named the training set). Thus, θ is obtained by minimizing a loss function that often takes the form of a mean square error plus a regularization term:

    θ* = arg min_θ L(X, Y, θ) = arg min_θ ∑_X ‖f_θ(I) − I*‖² + R_θ(f_θ(I)).
(8)

In unsupervised machine learning, the training set is not available and the goal is to train f_θ to recognize specific patterns or image features in the data. This approach is also referred to as self-supervised learning, because the information is extracted from the data themselves rather than from a set of “predictions” (i.e., given segmentations). Then the fidelity term in (8) takes the form

    ∑_I ‖f_θ(I) − Φ(f_θ(I))‖²,

where Φ is the labeling operator defined in (1).

In order to progressively extract higher-level features from data, machine learning models use a multi-layer structure called neural network, consisting of successive function compositions. The number of layers is the depth of the model, hence the terminology deep learning. A neural network with L layers is a function f_θ : I × (H_1 × . . . × H_L) → Σ,

    f_θ(I) = (f_L ∘ f_{L−1} ∘ . . . ∘ f_1)(I),

where f_i : R^{d_{i−1}} × H_i → R^{d_i} are the layer functions, each depending on θ_i, d_0 = d and d_L = n, with n equal to the number of features. The adjective “neural” comes from the fact that these networks are loosely inspired by neuroscience.

Neural network structures successfully used in image segmentation are the Multilayer Perceptron (MLP), the Deep Auto-Encoder (DAE) and the Convolutional Neural Network (CNN) [65–68]. Their basic schemes are shown in Figure 5.
Figure 5.
Basic deep neural network architectures: Multilayer Perceptron, Deep Auto-Encoder, Convolutional Neural Network.
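The layer composition f_θ = f_L ∘ · · · ∘ f_1 introduced above can be sketched in a few lines. Here a two-layer network with random placeholder weights (standing in for the learned parameters θ) maps a flattened 16-pixel "image" to 3 output features; the sizes are our illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)

def layer(W, b, activation):
    """One layer function f_i: x -> activation(W x + b)."""
    return lambda x: activation(W @ x + b)

relu = lambda z: np.maximum(z, 0.0)              # rectified linear unit
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))     # maps values into (0, 1)

f1 = layer(rng.standard_normal((8, 16)), np.zeros(8), relu)     # R^16 -> R^8
f2 = layer(rng.standard_normal((3, 8)), np.zeros(3), sigmoid)   # R^8 -> R^3

x = rng.standard_normal(16)     # a flattened 4x4 "image" (d_0 = 16)
y = f2(f1(x))                   # forward pass: (f_2 o f_1)(x), d_L = 3
```

Training would adjust the weight matrices and bias vectors by minimizing a loss of the form (8); here only the forward composition is shown.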
MLP is the simplest neural network, composed of multiple layers of perceptrons. A perceptron [69] consists of four main parts: the input values (the image), a matrix of weights and a bias vector, the net sum (a matrix-vector multiplication), and an activation function, i.e., a nonlinear function that maps values into (0, 1) or (−1, 1). Popular basic choices for the activation function are the sigmoid, the hyperbolic tangent, and the rectified linear unit (ReLU) function. The DAE network structure typically consists of 2L layer functions, where the first L layers act as an encoding function, with the input to each layer being of lower dimension than the input to the previous layer, and the remaining L layers increase the size of their inputs until the final layer produces the same dimension as the image input. The first L layers are an MLP. Image segmentation by CNN relies on feeding a small area (window) of an image as input to the neural network, which labels the pixels. The CNN scans the image, one area at a time, identifies and extracts features, and uses them to classify the image. A CNN mainly consists of three layers:
• convolutional layer: the image is analyzed a few pixels at a time to extract low-level features (edges, color, gradient orientation, etc.);
• nonlinear layer: an element-wise activation function creates a feature map with probabilities that each feature belongs to the required class;
• pooling or downsampling layer: the amount of features and computation in the network is reduced, hence controlling overfitting.
Among the most well-known CNN architectures successfully used in image segmentation, we mention AlexNet [70], GoogLeNet [71], VGGNet [72], and Fully CNN [73].
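The three CNN layer types can be sketched on a single-channel image: a convolution (written with explicit loops for clarity), an element-wise ReLU nonlinearity, and 2x2 max pooling. The Sobel kernel below is our illustrative stand-in for a learned edge filter.

```python
import numpy as np

def conv2d(img, kernel):
    """Valid (no padding) 2-D convolution, written as explicit loops."""
    kh, kw = kernel.shape
    h, w = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def maxpool2(x):
    """2x2 max pooling: halves the resolution (downsampling layer)."""
    h, w = x.shape[0] // 2, x.shape[1] // 2
    return x[:2 * h, :2 * w].reshape(h, 2, w, 2).max(axis=(1, 3))

img = np.zeros((8, 8)); img[:, 4:] = 1.0                   # vertical edge
sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)

feat = np.maximum(conv2d(img, sobel_x), 0.0)               # conv + ReLU
pooled = maxpool2(feat)                                    # downsampling
```

The feature map responds only where the filter straddles the edge, and pooling keeps the strongest responses while shrinking the map.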
5. Numerical techniques for segmentation models
The minimization in (2) is usually not trivial and requires appropriate methods, taking into account the specific application. In this section we provide a brief summary of numerical methods that can be applied to segmentation models. We consider two approaches: first discretize then optimize and first optimize then discretize. In the former, all the quantities in (2) are discretized a priori and then optimization methods are applied to the resulting minimization problem in R^n. In the latter, we first write first-order optimality conditions for (2), which are generally partial differential equations (PDEs), and then solve those equations by suitable numerical methods, which discretize the equations. Finally, we also sketch some filtering techniques used in image segmentation, although they are not directly applied to the minimization problem (2). This is motivated by their use in some segmentation approaches, such as those based on deep learning.

For the sake of simplicity, here we consider d = S = I (i.e., we neglect u in the segmentation S = (I, u)). We denote by Ω_{n_x,n_y} the discretization of Ω consisting of a grid of n_x × n_y pixels,

Ω_{n_x,n_y} = { (i, j) : i = 1, ..., n_x, j = 1, ..., n_y }.

We also identify each pixel with its center and denote by S_{i,j} the value of S in (i, j). Finally, we consider the forward and backward difference operators defined as follows:

D+x I_{i,j} = I_{i+1,j} − I_{i,j},   D−x I_{i,j} = I_{i,j} − I_{i−1,j},
D+y I_{i,j} = I_{i,j+1} − I_{i,j},   D−y I_{i,j} = I_{i,j} − I_{i,j−1},

where we assume I_{i−1,j} = I_{i,j} for i = 1, I_{i,j−1} = I_{i,j} for j = 1, I_{i+1,j} = I_{i,j} for i = n_x, and I_{i,j+1} = I_{i,j} for j = n_y, i.e., we define by replication the values of I with indices outside Ω_{n_x,n_y}.

Numerical optimization offers a large variety of methods to compute the segmentation by solving the minimization problem coming from a discretization of (2), possibly subject to constraints that can drive the segmentation towards particular features. The choice of the optimization method depends on the properties of the objective function and/or the constraints. Roughly speaking, at iteration k, optimization methods for nonlinear problems generate a suitable function Ẽ(I; I^k) that approximates the discretized objective function E around I^k, and minimize it to obtain the next iterate (see, e.g., [74]). For example, given I^k, the (k+1)-st iteration may be written as

define Ẽ(·; I^k) approximating E around I^k,
Ĩ^{k+1} = arg min_I Ẽ(I; I^k),
I^{k+1} = I^k + α_k (Ĩ^{k+1} − I^k),

where the step length α_k satisfies some criterion.

Classical optimization techniques, such as gradient or Newton-type methods [75,76], require regularity assumptions on the objective function (and the constraints, if any). However, many segmentation models are formulated as non-smooth optimization problems. There are two main approaches to deal with non-differentiability: smoothing and non-smoothing [77]. The former reformulates the problem as a suitable smooth one and applies the aforementioned classical optimization methods.
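As a deliberately simple instance of this iteration pattern, the sketch below applies it to a smooth quadratic stand-in objective E(I) = ½‖AI − d‖², for which the standard quadratic model Ẽ(·; I^k) is minimized by one gradient step; A, d, and all names here are illustrative, not part of the survey:

```python
import numpy as np

# Stand-in smooth objective E(I) = 0.5 * ||A I - d||^2 (illustrative data).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))
d = rng.standard_normal(20)

def grad_E(I):
    return A.T @ (A @ I - d)

I = np.zeros(10)
L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of grad E
for k in range(2000):
    # Minimize the quadratic model of E around I^k: one gradient step.
    I_tilde = I - grad_E(I) / L
    alpha_k = 1.0  # fixed step length, satisfying the criterion trivially
    I = I + alpha_k * (I_tilde - I)

# The gradient norm tends to 0: I approaches the least-squares solution.
print(np.linalg.norm(grad_E(I)))
```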
The latter does not introduce further regularization, and thus uses methods not requiring smoothness. For the purpose of illustration, here we focus on (7), where non-smoothness comes from a discretization of the TV term. A regularized discrete TV may be obtained as follows:

∫_Ω |∇H(I)| dx ≈ ∑_{i,j} sqrt( (D+x I_{i,j})² + (D+y I_{i,j})² + ε ),

where ε > 0 is a small smoothing parameter. Non-smooth approaches include splitting techniques, such as proximal-gradient methods [86,87], and the forward-backward Expectation Maximization (EM) method in [88]. ADMM and split Bregman methods also avoid smooth approximations [53,89–93]. The difficulties associated with the non-differentiability of the TV functional may also be overcome by reformulating the minimization problem as a saddle-point problem and solving it by a primal-dual algorithm such as the Chambolle-Pock one [94,95]. EM algorithms [96] are also widely used to solve statistical models. They are based on the idea of splitting the (negative) log-likelihood into two terms and alternating between the computation of the expectation and its minimization. Finally, stochastic versions of the previous methods are used in segmentation with deep learning, to limit the computational cost. The idea is to use only random samples of the data at each iteration, to estimate first-order and possibly second-order information according to the loss function, with the aim of significantly reducing the computation and hence the time [97–99].

Reducing imaging problems to PDEs has a long history, because of the availability of a large number of methods and software tools for solving PDEs. Over time, some PDE-based methods have been introduced in different ways, such as the Perona-Malik filtering [100], directly based on properties of the PDE [101], and the axiomatic scale space theory [102,103]. In a variational approach, one derives the first-order optimality conditions, via smoothing regularization if needed.
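A NumPy sketch of this smoothed discrete TV follows, applied directly to I for simplicity, with the parameter added under the square root as in the formula above; the function name is ours:

```python
import numpy as np

def smoothed_tv(I, eps=1e-3):
    """Regularized discrete TV: sum of sqrt((D+x I)^2 + (D+y I)^2 + eps),
    with forward differences and replication at the right/bottom borders."""
    Dx = np.diff(I, axis=0, append=I[-1:, :])  # D+x, with I_{n_x+1,j} = I_{n_x,j}
    Dy = np.diff(I, axis=1, append=I[:, -1:])  # D+y, with I_{i,n_y+1} = I_{i,n_y}
    return np.sum(np.sqrt(Dx**2 + Dy**2 + eps))

# Piecewise-constant image: a unit jump across 4 rows gives TV = 4.
I = np.zeros((4, 4))
I[:, 2:] = 1.0
print(smoothed_tv(I, eps=0.0))  # 4.0
```

With eps > 0 the value is slightly larger than the exact TV, which is the price paid for differentiability at zero-gradient pixels.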
Let us consider, for example, the level-set formulation of the Chan-Vese model (6), where the segmentation is represented by a level-set function φ whose zero level set identifies the contour separating the two regions. Keeping c_in and c_out fixed and writing the Euler-Lagrange equations in a gradient-flow approach, we get

∂φ/∂t (t, x) = δ_ε(φ) [ λ div( ∇φ / |∇φ| ) − (c_in − I)² + (c_out − I)² ]  in (0, +∞) × Ω,
φ(0, x) = φ_0(x)  in Ω,
(δ_ε(φ) / |∇φ|) ∂φ/∂N = 0  on ∂Ω,    (9)

where δ_ε is a regularized version of the Dirac measure and N is the exterior normal to the boundary ∂Ω [62].

Finite Difference (FD) schemes [104] are popular methods for the numerical solution of (9). Of course, the discretization used in image segmentation must take into account the nature and the properties of the operators involved in the model. For example, the edge-preserving property is similar to shock-capturing in computational fluid dynamics, and hence finite-difference schemes based on hyperbolic conservation laws can be used [105]. Just to give an example, a level-set equation of the form

∂φ/∂t = F |∇φ|

can be solved by using an upwind numerical scheme:

φ^{n+1} = Ψ(φ^n),   Ψ(φ^n_{i,j}) = φ^n_{i,j} − Δt ( max(F, 0) ∇⁺φ^n_{i,j} + min(F, 0) ∇⁻φ^n_{i,j} ),

where

∇⁺φ^n_{i,j} = [ max( max(D−x φ^n_{i,j}, 0), −min(D+x φ^n_{i,j}, 0) )² + max( max(D−y φ^n_{i,j}, 0), −min(D+y φ^n_{i,j}, 0) )² ]^{1/2},
∇⁻φ^n_{i,j} = [ max( max(D+x φ^n_{i,j}, 0), −min(D−x φ^n_{i,j}, 0) )² + max( max(D+y φ^n_{i,j}, 0), −min(D−y φ^n_{i,j}, 0) )² ]^{1/2}.

Discrete filters are often used in image segmentation, e.g., in machine learning approaches. A digital filter is an operator

L : I ∈ I → Ĩ ∈ I,   Ĩ_{i,j} = L[I; W_{i,j}],

where W_{i,j} ⊂ Ω_{n_x,n_y}.
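One update of the upwind scheme above can be sketched in NumPy as follows; zeroing the one-sided differences at the borders is exactly the replication convention adopted for the operators D±, and the function name is ours:

```python
import numpy as np

def upwind_step(phi, F, dt):
    """One step: phi - dt*(max(F,0)*grad_plus + min(F,0)*grad_minus)."""
    # One-sided differences; setting them to 0 at the borders replicates phi.
    Dmx = phi - np.roll(phi, 1, axis=0); Dmx[0, :] = 0.0    # D-x
    Dpx = np.roll(phi, -1, axis=0) - phi; Dpx[-1, :] = 0.0  # D+x
    Dmy = phi - np.roll(phi, 1, axis=1); Dmy[:, 0] = 0.0    # D-y
    Dpy = np.roll(phi, -1, axis=1) - phi; Dpy[:, -1] = 0.0  # D+y
    grad_plus = np.sqrt(
        np.maximum(np.maximum(Dmx, 0), -np.minimum(Dpx, 0)) ** 2
        + np.maximum(np.maximum(Dmy, 0), -np.minimum(Dpy, 0)) ** 2)
    grad_minus = np.sqrt(
        np.maximum(np.maximum(Dpx, 0), -np.minimum(Dmx, 0)) ** 2
        + np.maximum(np.maximum(Dpy, 0), -np.minimum(Dmy, 0)) ** 2)
    return phi - dt * (np.maximum(F, 0) * grad_plus + np.minimum(F, 0) * grad_minus)

# Sanity check: for phi(i,j) = i (unit slope), |grad phi| = 1, so away from
# the top border each value decreases by dt*F.
phi = np.tile(np.arange(5.0), (5, 1)).T
print(upwind_step(phi, 1.0, 0.1)[2, 2])  # 1.9
```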
A popular filter in image segmentation is the convolution filter, defined by

Ĩ_{i,j} = L_{a,b}[I; W_{i,j}] = ∑_{s=−a}^{a} ∑_{t=−b}^{b} h_{s,t} I_{i−s,j−t},    (10)

with a and b positive integers such that a ≤ (m − 1)/2 and b ≤ (n − 1)/2, where m × n is the image size, W_{i,j} = { (s, t) : s = −a, ..., a, t = −b, ..., b }, and h_{s,t} ∈ R. The matrix H ∈ R^{(2a+1)×(2b+1)} such that H_{i,j} = h_{−a−1+i, −b−1+j} is called kernel matrix and depends on the features we want to extract from the image. Common choices are a = b = 1 and a = b = 2; the kernel matrix H determines the type of features to be extracted. For example, the kernel matrix

H = ( −1  1  0
      −1  1  0
      −1  1  0 )

is a vertical edge-detection kernel [106]. Another example is the Sobel operator, used to create an image emphasizing the edges [107]. It allows us to obtain either the gradient amplitude or the gradient direction of the image intensity at each point, by convolving the image with the kernel matrices

H^x_S = ( −1  0  1        H^y_S = ( −1  −2  −1
          −2  0  2                   0   0   0
          −1  0  1 ),                1   2   1 ).

The gradient magnitude, G, and the angle of orientation of the edges, θ, are given by

|G_{i,j}| = sqrt( (H^x_S ∗ I)²_{i,j} + (H^y_S ∗ I)²_{i,j} ),   θ_{i,j} = arctan( (H^y_S ∗ I)_{i,j} / (H^x_S ∗ I)_{i,j} ).

A padding process is commonly used to preserve the dimension of the image after the convolution. It consists in adding zeros symmetrically around the border of the image. A pooling layer is usually inserted between two successive convolution layers; it is obtained by applying basic functions, such as max and mean, in a small window.
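A worked instance of (10) with the Sobel kernels, including the zero padding described above so that the output keeps the image size (the helper names are ours):

```python
import numpy as np

# Standard Sobel kernels, as in [107].
HxS = np.array([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
HyS = np.array([[-1., -2., -1.], [0., 0., 0.], [1., 2., 1.]])

def convolve(I, H):
    """Convolution as in Equation (10), with zero padding so that the
    output has the same size as the input image."""
    a, b = H.shape[0] // 2, H.shape[1] // 2
    P = np.pad(I, ((a, a), (b, b)))  # zeros symmetrically around the border
    out = np.zeros_like(I, dtype=float)
    for s in range(-a, a + 1):
        for t in range(-b, b + 1):
            # Accumulate h_{s,t} * I_{i-s, j-t} for all pixels (i, j) at once.
            out += H[s + a, t + b] * P[a - s:a - s + I.shape[0],
                                       b - t:b - t + I.shape[1]]
    return out

I = np.zeros((6, 6)); I[:, 3:] = 1.0  # vertical step edge
G = np.sqrt(convolve(I, HxS) ** 2 + convolve(I, HyS) ** 2)
print(G[2, 2], G[2, 3])  # 4.0 4.0 : strongest response along the edge
```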
6. Conclusions
We presented a view of image segmentation, focusing on its mathematical modeling and attempting to put different segmentation models into a coherent framework where regularization plays an important role. We first introduced image segmentation and some of its applications and then discussed edge-based, region-based, statistical and machine learning models. We also provided a summary of numerical methods that are often employed to compute solutions to those models. Our presentation is far from being exhaustive, but nevertheless we think that it can help the reader gain some knowledge in the huge and useful field of image segmentation.
Funding:
This work was partially supported by Istituto Nazionale di Alta Matematica - Gruppo Nazionale per il Calcolo Scientifico (INdAM-GNCS), by the Italian Ministry of University and Research under grant no. PON03PE_00060_5, and by the VALERE Program of the University of Campania “L. Vanvitelli”.
Acknowledgments:
The authors would like to thank G. Trerotola for his technical support.
Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.
References
1. Brice, C.R.; Fennema, C.L. Scene analysis using regions.
Artificial Intelligence, , 205–226. doi:10.1016/0004-3702(70)90008-1.
2. Arbelaez, P.; Maire, M.; Fowlkes, C.; Malik, J. Contour Detection and Hierarchical Image Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, Osteoarthritis and Cartilage, , 1433–1441. doi:10.1016/j.joca.2008.06.016.
5. Murli, A.; D’Amore, L.; Carracciuolo, L.; Ceccarelli, M.; Antonelli, L. High performance edge-preserving regularization in 3D SPECT imaging. Parallel Computing, , 115–132. doi:10.1016/j.parco.2007.12.004.
6. International Conference on Document Analysis and Recognition. https://scriptnet.iit.demokritos.gr/competitions/~icdar2017htr/.
7. Garcia-Garcia, A.; Orts-Escolano, S.; Oprea, S.; Villena-Martinez, V.; Martinez-Gonzalez, P.; Garcia-Rodriguez, J. A survey on deep learning techniques for image and video semantic segmentation. Applied Soft Computing, , 41–65. doi:10.1016/j.asoc.2018.05.018.
8. Suri, J.S.; Setarehdan, S.K.; Singh, S., Eds. Advanced Algorithmic Approaches to Medical Image Segmentation; Springer: Berlin, Heidelberg, 2001. doi:10.1007/978-0-85729-333-6.
9. El-Baz, A.; Jiang, X.; Suri, J.
Biomedical Image Segmentation: Advances and Trends; CRC Press, 2016.
10. Antonelli, L.; Guarracino, M.R.; Maddalena, L.; Sangiovanni, M. Integrating imaging and omics data: A review. Biomedical Signal Processing and Control, , 264–280. doi:10.1016/j.bspc.2019.04.032.
11. Epstein, C.L. Introduction to the Mathematics of Medical Imaging, 2nd ed.; SIAM, 2007. doi:10.1137/9780898717792.
12. Suetens, P. Fundamentals of Medical Imaging, 2nd ed.; Cambridge University Press, 2009. doi:10.1017/CBO9780511596803.
13. Farncombe, T.; Iniewski, K. Medical Imaging, 1st ed.; Taylor & Francis Group, 2014. doi:10.1201/b15511.
14. Lancaster, J.; Hasegawa, B. Fundamental Mathematics and Physics of Medical Imaging, 1st ed.; Taylor & Francis Group, 2016. doi:10.1201/9781315368214.
15. Chappell, M. Principles of Medical Imaging for Engineers, 1st ed.; Springer, 2019. doi:10.1007/978-3-030-30511-6.
16. Murphy, D.; Davidson, M. Fundamentals of Light Microscopy. In Fundamentals of Light Microscopy and Electronic Imaging; John Wiley & Sons, Ltd, 2012; chapter 1, pp. 1–19. doi:10.1002/9781118382905.ch1.
17. Paddock, S.W. Confocal Microscopy, 2nd ed.; Humana Press, 2014. doi:10.1007/978-1-60761-847-8.
18. Zaefferer, S. A critical review of orientation microscopy in SEM and TEM. Crystal Research and Technology, , 607–628. doi:10.1002/crat.201100125.
19. Blay, V.; Tolani, B.; Ho, S.P.; Arkin, M.R. High-Throughput Screening: today’s biochemical and cell-based approaches. Drug Discovery Today, , 1807–1821. doi:10.1016/j.drudis.2020.07.024.
20. Boutros, M.; Heigwer, F.; Laufer, C. Microscopy-Based High-Content Screening. Cell, , 1314–1325. doi:10.1016/j.cell.2015.11.007.
21. Li, W.; Field, K.G.; Morgan, D. Automated defect analysis in electron microscopic images. npj Computational Materials, . doi:10.1038/s41524-018-0093-8.
22. Baird, H.S.; Yamamoto, K.; Bunke, H. Structured Document Image Analysis; Springer: Berlin, Heidelberg, 1992.
23. Eskenazi, S.; Gomez-Kramer, P.; Ogier, J.M. A comprehensive survey of mostly textual document segmentation algorithms since 2008.
Pattern Recognition, , 1–14. doi:10.1016/j.patcog.2016.10.023.
24. Nobile, N.; Suen, C.Y. Text Segmentation for Document Recognition. Handbook of Document Image Processing and Recognition, 2014.
25. Hussain, R.; Raza, A.; Siddiqi, I.; Khurshid, K.; Djeddi, C. A comprehensive survey of handwritten document benchmarks: structure, usage and evaluation. EURASIP Journal on Image and Video Processing, , 46. doi:10.1186/s13640-015-0102-5.
26. Jensen, J. Remote Sensing of the Environment: An Earth Resource Perspective, 2nd ed.; Pearson Education, 2009.
27. Wang, K.; Franklin, S.E.; Guo, X.; Cattet, M. Remote Sensing of Ecology, Biodiversity and Conservation: A Review from the Perspective of Remote Sensing Specialists.
Sensors, , 9647–9667. doi:10.3390/s101109647.
28. Joshi, N.; Baumann, M.; Ehammer, A.; Fensholt, R.; Grogan, K.; Hostert, P.; Jepsen, M.; Kuemmerle, T.; Meyfroidt, P.; Mitchard, E.; et al. A Review of the Application of Optical and Radar Remote Sensing Data Fusion to Land Use Mapping and Monitoring. Remote Sensing, , 70. doi:10.3390/rs8010070.
29. Hossain, M.D.; Chen, D. Segmentation for Object-Based Image Analysis (OBIA): A review of algorithms and challenges from remote sensing perspective. ISPRS Journal of Photogrammetry and Remote Sensing, , 115–134. doi:10.1016/j.isprsjprs.2019.02.009.
30. Petropoulos, G.P.; Balzter, H.; Srivastava, P.K.; Pandey, P.C.; Bhattacharya, B. Hyperspectral Remote Sensing; Elsevier, 2020. doi:10.1016/C2018-0-01850-2.
31. Booysen, R.; Gloaguen, R.; Lorenz, S.; Zimmermann, R.; Nex, P.A. Geological Remote Sensing. In Encyclopedia of Geology, 2nd ed.; Alderton, D.; Elias, S.A., Eds.; Academic Press, 2021; pp. 301–314. doi:10.1016/B978-0-12-409548-9.12127-X.
32. Job, D.E.; Dickie, D.A.; Rodriguez, D.; Robson, A.; Danso, S.; Pernet, C.; Bastin, M.E.; Boardman, J.P.; Murray, A.D.; Ahearn, T.; Waiter, G.D.; Staff, R.T.; Deary, I.J.; Shenkin, S.D.; Wardlaw, J.M. A brain imaging repository of normal structural MRI across the life course: Brain Images of Normal Subjects (BRAINS). NeuroImage, , 299–304. Data Sharing Part II, doi:10.1016/j.neuroimage.2016.01.027.
33. Moreira, I.C.; Amaral, I.; Domingues, I.; Cardoso, A.; Cardoso, M.J.; Cardoso, J.S. INbreast: Toward a Full-field Digital Mammographic Database. Academic Radiology, , 236–248. doi:10.1016/j.acra.2011.09.014.
34. Clark, K.W.; Vendt, B.A.; Smith, K.E.; Freymann, J.B.; Kirby, J.S.; Koppel, P.; Moore, S.M.; Phillips, S.R.; Maffitt, D.R.; Pringle, M.; Tarbox, L.; Prior, F.W. The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository. Journal of Digital Imaging, , 1045–1057. doi:10.1007/s10278-013-9622-7.
35. Irwin, D.; Felder, J.; Coffey, C.; Caspell-Garcia, C.; Kang, J.H.; Simuni, T.; Foroud, T.; Toga, A.; Tanner, C.; Kieburtz, K.; Chahine, L.; Reimer, A.; Hutten, S.; Weintraub, D.; Mollenhauer, B.; Galasko, D.; Siderowf, A.; Marek, K.; Trojanowski, J.; Shaw, L. Evolution of Alzheimer’s Disease Cerebrospinal Fluid Biomarkers in Early Parkinson’s Disease. Annals of Neurology, , 574–587.
36. Petersen, R.C.; Aisen, P.S.; Beckett, L.A.; Donohue, M.C.; Gamst, A.C.; Harvey, D.J.; Jack, C.R.; Jagust, W.J.; Shaw, L.M.; Toga, A.W.; Trojanowski, J.Q.; Weiner, M.W. Alzheimer’s Disease Neuroimaging Initiative (ADNI). Neurology, , 201–209. doi:10.1212/WNL.0b013e3181cb3e25.
37. Gelasca, E.D.; Byun, J.; Obara, B.; Manjunath, B. Evaluation and Benchmark for Biological Image Segmentation. IEEE International Conference on Image Processing, 2008.
38. Ljosa, V.; Sokolnicki, K.L.; Carpenter, A.E. Annotated high-throughput microscopy image sets for validation. Nature Methods, , 637–637. doi:10.1038/nmeth.2083.
39. Schwendy, M.; Unger, R.E.; Parekh, S.H. EVICAN–a balanced dataset for algorithm development in cell and nucleus segmentation. Bioinformatics, , 3863–3870. doi:10.1093/bioinformatics/btaa225.
40. Mendonça, T.; Ferreira, P.; Marques, J.; Marçal, A.; Rozeira, J. PH2 - A dermoscopic image database for research and benchmarking. , pp.
5437–5440. doi:10.1109/EMBC.2013.6610779.
41. Iudin, A.; Korir, P.K.; Salavert-Torres, J.; Kleywegt, G.J.; Patwardhan, A. EMPIAR: a public archive for raw electron microscopy image data. Nature Methods, . doi:10.1038/nmeth.3806.
42. Klinkmüller, M.; Schreurs, G.; Rosenau, M. GeoMod2008 materials benchmark: The axial test dataset. GFZ Data Services. doi:10.5880/GFZ.4.1.2016.006.
43. Antonacopoulos, A.; Bridson, D.; Papadopoulos, C.; Pletschacher, S. A Realistic Dataset for Performance Evaluation of Document Layout Analysis. 10th International Conference on Document Analysis and Recognition, 2009. doi:10.1109/ICDAR.2009.271.
44. Papadopoulos, C.; Pletschacher, S.; Clausner, C.; Antonacopoulos, A. The IMPACT Dataset of Historical Document Images. HIP@ICDAR 2013. ACM, 2013. doi:10.1145/2501115.2501130.
45. Clausner, C.; Antonacopoulos, A.; Derrick, T.; Pletschacher, S. REID2019 - ICDAR2019 Competition on Recognition of Early Indian Printed Documents. 2019 ICDAR, 2019, pp. 1527–1532. doi:10.1109/ICDAR.2019.00246.
46. Lecun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proceedings of the IEEE, , 2278–2324. doi:10.1109/5.726791.
47. Hull, J.J. A database for handwritten text recognition research. IEEE Transactions on Pattern Analysis and Machine Intelligence, , 550–554. doi:10.1109/34.291440.
48. Etten, A.V.; Lindenbaum, D.; Bacastow, T.M. SpaceNet: A Remote Sensing Dataset and Challenge Series. arXiv e-prints, [arXiv:cs.CV/1807.01232].
49. Mohajerani, S.; Saeedi, P. Cloud-Net+: A Cloud Segmentation CNN for Landsat 8 Remote Sensing Imagery Optimized with Filtered Jaccard Loss Function. arXiv e-prints, .
50. Schmitt, M.; Hughes, L.H.; Qiu, C.; Zhu, X.X. SEN12MS – A Curated Dataset of Georeferenced Multispectral Sentinel-1/2 Imagery for Deep Learning and Data Fusion. ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences, IV-2/W7, 153–160. doi:10.5194/isprs-annals-IV-2-W7-153-2019.
51. Alemohammad, H.; Booth, K. LandCoverNet: A global benchmark land cover classification training dataset. arXiv e-prints, [arXiv:cs.CV/2012.03111].
52. Pal, N.R.; Pal, S.K. A review on image segmentation techniques. Pattern Recognition, . doi:10.1016/0031-3203(93)90135-J.
53. Antonelli, L.; De Simone, V.; di Serafino, D. Spatially Adaptive Regularization in Image Segmentation. Algorithms, . doi:10.3390/a13090226.
54. Lucas, A.; Iliadis, M.; Molina, R.; Katsaggelos, A.K. Using Deep Neural Networks for Inverse Problems in Imaging: Beyond Analytical Methods. IEEE Signal Processing Magazine, . doi:10.1109/MSP.2017.2760358.
55. Kass, M.; Witkin, A.; Terzopoulos, D. Snakes: active contour model. International Journal of Computer Vision, . doi:10.1007/BF00133570.
56. Alvarez, L.; Morel, J. Formalization and computational aspects of image analysis. Acta Numerica, , 1–59. doi:10.1017/S0962492900002415.
57. Osher, S.; Sethian, J.A. Fronts propagating with curvature-dependent speed: algorithms based on Hamilton-Jacobi formulation. Journal of Computational Physics, , 12–49. doi:10.1016/0021-9991(88)90002-2.
58. Pratt, W.K. Digital Image Processing, 4th ed.; John Wiley & Sons, Inc., 2007.
59. Revol-Muller, C.; Grenier, T.; Rose, J.L.; Pacureanu, A.; Peyrin, F.; Odet, C. Region Growing: When Simplicity Meets Theory – Region Growing Revisited in Feature Space and Variational Framework. Computer Vision, Imaging and Computer Graphics. Theory and Application; Csurka, G.; Kraus, M.; Laramee, R.S.; Richard, P.; Braz, J., Eds.; Springer: Berlin, Heidelberg, 2013; pp. 426–444.
60. Mumford, D.; Shah, J. Optimal approximations by piecewise smooth functions and associated variational problems.
Communications on Pure and Applied Mathematics, , 577–685. doi:10.1002/cpa.3160420503.
61. Geman, S.; Geman, D. Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-6, 721–741. doi:10.1109/TPAMI.1984.4767596.
62. Chan, T.F.; Vese, L.A. Active contours without edges. IEEE Transactions on Image Processing, , 266–277. doi:10.1109/83.902291.
63. Chan, T.F.; Esedoğlu, S.; Nikolova, M. Algorithms for Finding Global Minimizers of Image Segmentation and Denoising Models. SIAM Journal on Applied Mathematics, , 1632–1648. doi:10.1137/040615286.
64. Kato, Z.; Pong, T. A Markov Random Field Image Segmentation Model for Color Textured Images. Image and Vision Computing, , 1103–1114.
65. Furat, O.; Wang, M.; Neumann, M.; Petrich, L.; Weber, M.; Krill, C.E.; Schmidt, V. Machine Learning Techniques for the Segmentation of Tomographic Image Data of Functional Materials. Frontiers in Materials, , 145. doi:10.3389/fmats.2019.00145.
66. Liao, D.; Lu, H.; Xu, X.; Gao, Q. Image Segmentation Based on Deep Learning Features. 2019 Eleventh International Conference on Advanced Computational Intelligence (ICACI), 2019, pp. 296–301. doi:10.1109/ICACI.2019.8778464.
67. Minaee, S.; Boykov, Y.; Porikli, F.; Plaza, A.J.; Kehtarnavaz, N.; Terzopoulos, D. Image Segmentation Using Deep Learning: A Survey. ArXiv.
68. Rizwan I Haque, I.; Neubert, J. Deep learning approaches to biomedical image segmentation. Informatics in Medicine Unlocked, . doi:10.1016/j.imu.2020.100297.
69. Rosenblatt, F. The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, . doi:10.1037/h0042519.
70. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. The 25th International Conference on Neural Information Processing Systems. Curran Associates Inc., 2012, Vol. 1, pp. 1097–1105. doi:10.1109/CVPR.2015.7298594.
71. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going Deeper with Convolutions. arXiv e-prints, [arXiv:cs.CV/1409.4842].
72. Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv e-prints, [arXiv:cs.CV/1409.1556].
73. Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015. doi:10.1109/CVPR.2015.7298965.
74. Fountoulakis, K.; Gondzio, J. Performance of first- and second-order methods for ℓ1-regularized least squares problems. Computational Optimization and Applications, , 605–635. doi:10.1007/s10589-016-9853-x.
75. Bertsekas, D.P. Nonlinear Programming, 2nd ed.; Athena Scientific, 1999.
76. Nocedal, J.; Wright, S.J. Numerical Optimization, 2nd ed.; Springer: New York, NY, USA, 2006.
77. Antonelli, L.; De Simone, V. Comparison of minimization methods for nonsmooth image segmentation.
Communications in Applied and Industrial Mathematics, , 68–96.
78. Weiss, P.; Blanc-Féraud, L.; Aubert, G. Efficient Schemes for Total Variation Minimization Under Constraints in Image Processing. SIAM Journal on Scientific Computing, , 2047–2080.
79. Birgin, E.; Martínez, J.; Raydan, M. Nonmonotone spectral projected gradient methods on convex sets. SIAM Journal on Optimization, , 1196–1211.
80. Bonettini, S.; Zanella, R.; Zanni, L. A scaled gradient projection method for constrained image deblurring. Inverse Problems, , 015002.
81. Antonelli, L.; De Simone, V.; di Serafino, D. On the Application of the Spectral Projected Gradient Method in Image Segmentation. Journal of Mathematical Imaging and Vision, , 106–116. doi:10.1007/s10851-015-0591-y.
82. di Serafino, D.; Ruggiero, V.; Toraldo, G.; Zanni, L. On the steplength selection in gradient methods for unconstrained optimization. Applied Mathematics and Computation, , 176–195.
83. di Serafino, D.; Landi, G.; Viola, M. ACQUIRE: an inexact iteratively reweighted norm approach for TV-based Poisson image restoration. Applied Mathematics and Computation, , 124678. doi:10.1016/j.amc.2019.124678.
84. Figueiredo, M.; Nowak, R.; Wright, S. Gradient projection for sparse reconstruction: application to compressed sensing and other inverse problems. IEEE Journal of Selected Topics in Signal Processing, , 586–598.
85. Fountoulakis, K.; Gondzio, J.; Zhlobich, P. Matrix-free interior point method for compressed sensing problems. Mathematical Programming Computation, , 1–31. doi:10.1007/s12532-013-0063-6.
86. Parikh, N.; Boyd, S. Proximal Algorithms. Foundations and Trends in Optimization, , 123–231. doi:10.1561/2400000003.
87. Bonettini, S.; Loris, I.; Porta, F.; Prato, M. Variable metric inexact line-search-based methods for nonsmooth optimization. SIAM Journal on Optimization, , 891–921.
88. Sawatzky, A.; Brune, C.; Wübbeling, F.; Kösters, T.; Schäfers, K.; Burger, M. Accurate EM-TV algorithm in PET with low SNR.
2008 IEEE Nuclear Science Symposium Conference Record, 2008. doi:10.1109/NSSMIC.2008.4774392.
89. Boyd, S.; Parikh, N.; Chu, E.; Peleato, B.; Eckstein, J. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, , 1–122. doi:10.1561/2200000016.
90. Figueiredo, M.; Bioucas-Dias, J. Restoration of Poissonian images using alternating direction optimization. IEEE Transactions on Image Processing, , 3133–3145. doi:10.1109/TIP.2010.2053941.
91. Goldstein, T.; Bresson, X.; Osher, S. Geometric applications of the split Bregman method: segmentation and surface reconstruction. Journal of Scientific Computing, , 272–293. doi:10.1007/s10915-009-9331-z.
92. Setzer, S. Operator splittings, Bregman methods and frame shrinkage in image processing. International Journal of Computer Vision, , 265–280. doi:10.1007/s11263-010-0357-3.
93. De Simone, V.; di Serafino, D.; Viola, M. A subspace-accelerated split Bregman method for sparse data recovery with joint ℓ1-type regularizers. Electronic Transactions on Numerical Analysis, , 406–425. doi:10.1553/etna_vol53s406.
94. Chambolle, A.; Pock, T. A First-Order Primal-Dual Algorithm for Convex Problems with Applications to Imaging. Journal of Mathematical Imaging and Vision, , 120–145. doi:10.1007/s10851-010-0251-1.
95. Malitsky, Y.; Pock, T. A first-order primal-dual algorithm with linesearch. SIAM Journal on Optimization, , 411–432. doi:10.1137/16M1092015.
96. Dempster, A.P.; Laird, N.M.; Rubin, D. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, , 1–38.
97. Jing, Y.; Guanci, Y. Modified Convolutional Neural Network Based on Dropout and the Stochastic Gradient Descent Optimizer. Algorithms, . doi:10.3390/a11030028.
98. Marin, D.; Tang, M.; Ayed, I.; Boykov, Y. Beyond gradient descent for regularized segmentation losses. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 10187–10196.
99. Yaqub, M.; Jinchao, F.; Zia, M.S.; Arshid, K.; Jia, K.; Rehman, Z.; Mehmood, A. State-of-the-Art CNN Optimizer for Brain Tumor Segmentation in Magnetic Resonance Images. Brain Sciences, , 427. doi:10.3390/brainsci10070427.
100. Perona, P.; Malik, J. Scale space and edge detection using anisotropic diffusion. IEEE Transactions on Pattern Analysis and Machine Intelligence, , 629–639.
101. Witkin, A.P. Scale-space filtering. International Joint Conference on Artificial Intelligence, 1983, pp. 1019–1022.
102. Koenderink, J. The structure of images. Biological Cybernetics, , 363–370. doi:10.1007/BF00336961.
103. Alvarez, L.; Guichard, F.; Lions, P.; Morel, J. Axioms and fundamental equations of image processing. Archive for Rational Mechanics and Analysis, , 199–257. doi:10.1007/BF00375127.
104. Thomas, J. Numerical Partial Differential Equations; Vol. 1 and 2, Springer, 1995 and 1999.
105. Sethian, J.
Level Set Methods and Fast Marching Methods, 2nd ed.; Cambridge University Press, 1999.
106. Baum, K.G. Signal Filtering: Noise Reduction and Detail Enhancement. In Handbook of Visual Display Technology; Springer: Berlin, Heidelberg, 2012; pp. 325–343. doi:10.1007/978-3-540-79567-4_27.
107. IEEE Journal of Solid-State Circuits, 23