Parameter-free and fast nonlinear piecewise filtering. Application to experimental physics
Barbara Pascal, Nelly Pustelnik, Patrice Abry, Jean-Christophe Géminard, Valérie Vidal
May 2020
Signals or images collected from numerous experiments in physics can be, at least as a first-order approximation, described as piecewise homogeneous (piecewise constant, piecewise linear, . . . ). Detecting and estimating such piecewise homogeneous regions thus constitutes a crucial goal in extracting the physically relevant information conveyed by such data. This remains, however, often challenging, as signals or images are usually altered by superimposed noise, possibly with low signal-to-noise ratio, which may hinder the interpretation of the corresponding experiments. The joint need to denoise data while preserving the edges and discontinuities pertaining to phase changes or region boundaries often precludes the use of classical linear filtering and calls for advanced nonlinear signal and image processing techniques.

Solid friction provides a first representative example. When two nominally flat solid surfaces in contact are forced to slide against one another, the shear force at the contact surface generally exhibits a characteristic tooth-shaped signal (Fig. 1a): the force signal consists of an alternation of slow linear rises, corresponding to the loading of elastic energy in the driving system while the surfaces in contact do not move with respect to each other, followed by sudden drops, corresponding to fast energy releases when the surfaces slide [3]. When the solids are strongly pressed against each other, these two phases (rest and sliding at the contact surface) can easily be identified. However, high confinement pressures tend to damage the surfaces, a major limitation in the study of the microscopic mechanisms at play. Probing frictional material properties effectively and accurately therefore requires that experiments be performed at low confinement pressure.
This, however, induces that collected signals have low to very low signal-to-noise ratios [13], thus requiring advanced non

∗ Work supported by Defi Imag'in SIROCCO and by ANR-16-CE33-0020 MultiFracs, France.
† Univ Lyon, ENS de Lyon, Univ Lyon 1, CNRS, Laboratoire de Physique, F-69342 Lyon, France ([email protected]).

(a) Stick-slip: piecewise linear signal. (b) Multiphase flow: piecewise homogeneous texture.

Figure 1:
Experimental data in nonlinear physics. (a): Normalized driving force F∗ = kΔx/(mg) [see Section 4.1] as a function of time t in a solid friction experiment. (b): Direct image of a gas bubble in porous media multiphase flows (gas corresponds to the most scrambled region, whereas liquid corresponds to smoother regions).

linear filtering signal processing techniques to detect and analyse the piecewise linear shape of the data.

Multiphase flows in porous media constitute another rich example. Hydrodynamics in porous media, notably mass transfer, is of prominent practical relevance in industry, e.g., for catalytic process studies. Hydrodynamics and mass transfer studies of multiphase flows in porous media traditionally involve packed beds and are well characterized. However, recent experiments [40, 48] consider innovative materials such as open-cell solid foams which, due to their high porosity and the resulting low pressure loss, are promising for industrial applications. In such experiments, a liquid and a gas are forced to flow simultaneously through the foam, and the characterization of the multiphase spatiotemporal dynamics stems from image analysis (Fig. 1b). The challenge here is to discriminate liquid flows from gas bubbles. The rationale is that each phase can be associated with a homogeneous texture in the images, and a crucial stake consists in identifying gas bubble contours precisely so as to measure their lengths. Liquid and gas are both transparent and the foam itself introduces a scrambled background, yielding low-contrast and blurred images, thus requiring advanced nonlinear image processing techniques for texture segmentation and contour estimation.

These two emblematic examples share in common that the key aspects of the physics to be understood are driven by piecewise homogeneous phases. On the one hand, studying friction requires identifying the stick and the slip phases, each associated with a piecewise linear signal.
On the other hand, studying multiphase flows implies detecting fluid phases, each associated with a piecewise homogeneous texture. Piecewise homogeneous signals or images are very common in numerous fields of nonlinear physics, very different in nature, such as time reversal of the magnetic field in turbulent dynamo [6], on-off intermittency in creeping granular matter [19], DNA detection during translocation through nuclear pores [1], . . .

The present work proposes a generic nonlinear signal/image filtering unified framework for the analysis of piecewise homogeneous (piecewise-constant, piecewise-linear) experimental datasets. The major challenges here are both to obtain fast algorithms, so as to handle the large amount of data that needs to be analyzed to yield accurate and relevant conclusions (e.g., in producing a phase diagram or in analyzing video frames of large-size images), and to perform an automated and data-driven tuning of the hyperparameters that unavoidably enter any nonlinear filtering procedure, and whose arbitrary selection (by expert visual inspection) might have drastic impacts on the achieved outcomes and thus on the physical interpretations drawn a posteriori.

The unified signal/image nonlinear filtering framework proposed here is based on proximal schemes [4, 15] to obtain fast algorithms, and on the Stein unbiased estimator framework to design an automated data-driven hyperparameter tuning procedure.

Section 2 is dedicated to the formulation of this framework as an inverse problem, and recalls state-of-the-art strategies with a focus on piecewise constant or linear estimation in both signals and images. Section 3 details the proposed algorithmic framework.
Section 4 illustrates the performance on the two examples discussed above, solid friction and multiphase flows.

A documented toolbox (in Matlab), implementing this signal/image nonlinear filtering and data-driven hyperparameter tuning, is freely available at https://github.com/bpascal-fr/stein-piecewise-filtering.

Let S = {n = (n₁, n₂) : 1 ≤ n₁ ≤ N₁, 1 ≤ n₂ ≤ N₂} denote a lattice, supporting x = (xₙ)_{n∈S}, the unknown signal/image of size N = N₁ × N₂ (N₂ = 1 for univariate 1D signal analysis, and N₂ = K for multivariate 1D signal analysis with K components). The observation z = (z_{m₁,m₂})_{1≤m₁≤M₁, 1≤m₂≤M₂}, of size M = M₁ × M₂, consists of a degraded version of x, which stems both from a linear degradation (e.g., filtering), denoted A ∈ R^{M×N}, and from additive random noise, denoted B. Handling an inverse problem relies first on an accurate design of the observation (or direct) model, which here takes the following form:

z = B(Ax).

In this work, S is endowed with a homogeneous neighborhood system. For instance, when considering a 1D signal, a site n ∈ {2, . . . , N − 1} has two nearest neighbors, N_n = {n − 1, n + 1}. In a general regular rectangular lattice S and for a 4-neighborhood system, every interior point has four neighbors, which yields N_n = {(n₁ − 1, n₂), (n₁ + 1, n₂), (n₁, n₂ − 1), (n₁, n₂ + 1)}. The pair (S, E) constitutes a graph, where S contains the nodes and E determines the links between the nodes according to the neighboring relationship.

We detail this direct model on the two nonlinear physics problems described in the Introduction (low confinement solid friction and porous media multiphase flows) and illustrated in Figure 1. For solid friction, the challenging question consists in denoising the observation z (Figure 1(a)), with A = Id and in the presence of additive impulsive noise. For multiphase flows, the challenging issue pertains to the segmentation of textures such as the one in Figure 1(b).
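As a minimal numerical illustration of the direct model z = B(Ax), the following Python sketch degrades a piecewise-constant signal; the moving-average blur standing in for A and the noise level are illustrative choices of ours, not values taken from the experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 256
# Piecewise-constant ground truth x, with a single jump.
x = np.concatenate([np.zeros(N // 2), np.ones(N - N // 2)])

# Linear degradation A: a moving-average blur, implemented as a circular
# convolution (hence a circulant matrix, diagonalized by the Fourier basis).
kernel = np.ones(5) / 5.0

def A(v):
    return np.real(np.fft.ifft(np.fft.fft(v) * np.fft.fft(kernel, len(v))))

sigma = 0.1                                  # assumed noise standard deviation
z = A(x) + sigma * rng.standard_normal(N)    # observation z = B(Ax)
```

For the friction signals, A = Id and the noise is impulsive; for the multiphase-flow images, the observation is instead obtained through a nonlinear multiscale transform, as discussed next.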
In such a case, the information x refers to piecewise constant scale-free texture features, and the observation z is obtained from a nonlinear multiscale transform (cf. Section 4.2). Noise is assumed additive and Gaussian, with spatial and multiscale correlations.

Solving an inverse problem consists in providing an estimator x̂, as close as possible to the information x. This has been largely addressed in the literature, and we first propose a brief overview of the main inverse-problem-solving streams (see also [8, 43]), before focusing on the specific assumptions on the model required to design parameter-free and fast nonlinear piecewise filtering.

Bayesian arguments and most standard models – On the one hand, Markov Random Fields (MRF) have been introduced in visual labelling to establish probabilistic distributions of interacting labels, aiming to analyze dependencies of physical phenomena [32]. In such a formalism, x and z are considered as realizations of random vectors X and Z defined on the set S. X is said to be an MRF on S with respect to a neighborhood system E if and only if positivity (i.e., P(X = x) >
0) and Markovianity (P(x_n | x_{S−n}) = P(x_n | x_{N_n})), which models the local characteristics of X, are satisfied. Other properties, such as homogeneity and isotropy, can also be considered.

The link between the MRF, characterized by its local properties, and another standard random field, the Gibbs random field, characterized by global properties, has been provided by Hammersley and Clifford [36, Theorem 1]. We recall that a Gibbs distribution relative to the graph (S, E) is a probability measure with the following representation:

P(ω) = (1/C) exp(−U(ω)/T)     (1)

where C is the normalizing constant, called the partition function, such that C = Σ_ω exp(−U(ω)/T), and T stands for the temperature, which controls the sharpness of the distribution. A high temperature leads to all configurations being equally distributed, while a temperature close to 0 concentrates the distribution around the global energy minima. U(ω) denotes the energy function and P(ω) measures the probability of occurrence of a specific configuration ω. The more probable configurations are those with the lower energies. The terminology comes from statistical physics, where such measures are equilibrium states for physical systems (e.g., ferromagnets). U(ω) can be formulated with contributions from external fields (i.e., x_{n₁,n₂}) and pair interactions (e.g., x_{n₁,n₂} x_{n₁+1,n₂}). For instance, the Ising model reads

U(x) = −α Σ_{n₁,n₂} x_{n₁,n₂} − β ( Σ_{n₁,n₂} x_{n₁,n₂} x_{n₁+1,n₂} + Σ_{n₁,n₂} x_{n₁,n₂} x_{n₁,n₂+1} )     (2)

considering ω = x and for some parameters α ∈ R and β >
0, which measure the external magnetic moment and the bonding strengths.

Geman and Geman [27] handle the maximization of the conditional probability distribution of (x, e) ∈ {S, E} given the data z (i.e., finding the mode of the posterior distribution), which is known as maximum a posteriori estimation or penalized maximum likelihood. In [25–27], the authors prove that the posterior is a Gibbs distribution over {S, E} with energy function

U(x, e) = (1/(2σ²)) ‖Ax − z‖² + β Σ_{(n,n′): n′∈N_n} ϕ(x_n − x_{n′})(1 − e_{n,n′}) + α ψ(e)     (3)

so that ω = (x, e) when B designates an additive zero-mean Gaussian noise with variance σ². Here e ∈ E denotes the coded line states, and ϕ(η) = 0 if η = 0, and 1 if η ≠ 0. The first term acts as a data-fidelity term and forces the approximation x to be close to z, the second term allows small variations of x except at the locations where e_{n,n′} = 1, and ψ is constructed to organize the line process. Finally, α and β are positive weighting parameters. For specific choices of ϕ and ψ [26, 33], an alternative equivalent formulation is the Blake–Zisserman model, formulated as

U(x) = (1/(2σ²)) ‖Ax − z‖² + λ Σ_{(n,n′): n′∈N_n} min((x_n − x_{n′})², η²)     (4)

where λ, η >
0, leading to the so-called truncated ℓ₂ penalization, whose interest is to favor piecewise smooth solutions. Another very close model is the Potts model, which can be interpreted as an ℓ₀ penalization of x_n − x_{n′}, designed to provide piecewise-constant estimates [51]. For numerical reasons detailed below, the most standard convex relaxation is the anisotropic total-variation penalization, which reads [12, 46]:

U(x) = (1/(2σ²)) ‖Ax − z‖² + λ Σ_{(n,n′): n′∈N_n} |x_n − x_{n′}|.     (5)

Solving inverse problems – Once an energy (or functional) has been designed to fit the considered problem, numerical strategies have to be devised to implement both the hyperparameter selection and the computation of the minimizing solution x, also corresponding to the most probable ω or to moments of P.

On one hand, Markov Chain Monte Carlo (MCMC) algorithms address simulation from a probability distribution P. The function P can be written in a closed-form expression, but the objective is generally to access the moments of P, which are not computable analytically. The two main techniques used in MCMC are Metropolis–Hastings, which relies on an accept/reject mechanism, and the Gibbs sampler, which simplifies the high-dimensional problem by successively simulating from smaller-dimensional components. The main limitation of these techniques is that they are computationally intensive for solving large-size inverse problems (see a contrario [34, 52]).
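The Gibbs distribution (1) with the Ising energy (2) can be simulated with a minimal Gibbs sampler, sketched below in Python at temperature T = 1 with free boundary conditions; the lattice size and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def gibbs_ising(shape, alpha, beta, n_sweeps):
    """Gibbs sampler for the Ising model (2) at T = 1: each site is resampled
    from its conditional distribution given its nearest neighbors."""
    x = rng.choice([-1, 1], size=shape)
    n1, n2 = shape
    for _ in range(n_sweeps):
        for i in range(n1):
            for j in range(n2):
                s = 0.0  # sum over the available nearest neighbors
                if i > 0:
                    s += x[i - 1, j]
                if i < n1 - 1:
                    s += x[i + 1, j]
                if j > 0:
                    s += x[i, j - 1]
                if j < n2 - 1:
                    s += x[i, j + 1]
                # Conditional induced by (1)-(2): P(x_ij = +1 | neighbors)
                p_plus = 1.0 / (1.0 + np.exp(-2.0 * (alpha + beta * s)))
                x[i, j] = 1 if rng.random() < p_plus else -1
    return x

sample = gibbs_ising((20, 20), alpha=1.0, beta=0.0, n_sweeps=5)
```

Each site update draws x_{n₁,n₂} = +1 with probability 1/(1 + e^{−2(α+βs)}), where s is the sum of the nearest-neighbor spins, which is exactly the conditional distribution of one site under (1)-(2).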
We should also mention some specific configurations where a closed-form expression is available, such as for the Ising model in 1D and 2D, but which are not adapted to the general inverse problem solving considered in this work.

When one wants to estimate jointly the maximum a posteriori and its hyperparameters, Bayesian hierarchical inference frameworks are particularly adapted, and they have received considerable interest for addressing change-point detection or piecewise denoising problems [20, 21, 42] or texture segmentation [52]. However, to the best of our knowledge, for the unified 1D-2D framework considered in this work, such a general efficient strategy has not yet been designed.

On the other hand, during the last twenty years, important research efforts have been dedicated to convex but non-smooth energies, generally formulated as a sum of two or three terms: a data-fidelity term, a penalization, and a constraint [4, 14, 16]. This framework is thus especially adapted when dealing with an energy such as (5). These algorithmic strategies are particularly efficient when A has full rank, which is rarely the case in standard restoration/reconstruction problems, but is more often encountered in experimental physics processing, either for denoising (i.e., A = Id, as in friction experiments) or for texture reconstruction, especially adapted to the study of multiphase flow dynamics, as described later. However, when one uses such an optimization strategy to find the minimizer of the energy U, the automated selection of the regularization parameter(s) is not addressed. For automated selection, one could consider either an empirical rule that consists in setting λ ∼ √N σ/
4, with N the signal size and σ the noise standard deviation, estimated, e.g., from the median of the absolute values of the wavelet coefficients [22]; or a hybrid Bayesian hierarchical inference framework combined with an ℓ₁-minimization strategy [24] in the specific case of piecewise-constant denoising; or the recourse to the Stein Unbiased Risk Estimator (SURE), which provides an unbiased estimator of the mean square error [5, 18, 44]. Our contribution focuses on the latter strategy, its implementability, and its applicability to real physics experiments.

In this work, we consider an estimator x̂(z; Λ) of the quantity of interest x̄, from a corrupted observation z, parametrized by Λ. The estimate is obtained from the minimization of an energy, inspired from (5), and defined as:

x̂(z; Λ) ∈ Argmin_{x∈R^N} ‖Ax − z‖² + ‖D_Λ x‖,     (6)

where the matrix D_Λ models a weighted discrete differentiation operator, so that the penalization ‖D_Λ x‖ enforces some regularity of the estimate x̂(z; Λ). Specific instances of (6) are provided:

• To favor joint piecewise constancy of K multivariate signals, the regularization parameters are stored in a vector Λ = (λ₁, . . . , λ_K) ∈ (R*₊)^K, and the operator D_Λ is a first-order differentiation operator acting componentwise, also called the discrete gradient, writing, for k ∈ {1, . . . , K} and n ∈ {1, . . . , N − 1},

(D_Λ x)_{k,n} = λ_k (x_{k,n+1} − x_{k,n})     (7)

and where ‖·‖ is the mixed ℓ₂,₁-norm

‖D_Λ x‖₂,₁ = Σ_{n=1}^{N−1} √( Σ_{k=1}^{K} (D_Λ x)²_{k,n} ).     (8)

• Enforcing joint piecewise linearity requires a second-order differentiation operator, named the discrete Laplacian, defined, for k ∈ {1, . . . , K} and n ∈ {2, . . . , N − 1}, as

(D_Λ x)_{k,n} = λ_k (x_{k,n+1} − 2 x_{k,n} + x_{k,n−1})     (9)

composed with the ℓ₂,₁-norm defined in (8).
• Image segmentation is performed by imposing a piecewise constancy prior, using a two-dimensional difference operator. For an image x ∈ R^{N₁×N₂}, horizontal and vertical gradients are computed at each pixel, for 1 ≤ n₁ ≤ N₁ − 1 and 1 ≤ n₂ ≤ N₂ − 1,

(D_Λ x)_{n₁,n₂} = λ ( (D₁x)_{n₁,n₂} ; (D₂x)_{n₁,n₂} ) = λ ( x_{n₁,n₂+1} − x_{n₁,n₂} ; x_{n₁+1,n₂} − x_{n₁,n₂} )     (10)

and coupled into an ℓ₂,₁-norm

‖D_Λ x‖ = λ Σ_{n₁=1}^{N₁−1} Σ_{n₂=1}^{N₂−1} √( (D₁x)²_{n₁,n₂} + (D₂x)²_{n₁,n₂} ) := λ TV(x).     (11)

The above penalization is known as the isotropic Total Variation [46].

The estimate x̂(z; Λ) is the result of a trade-off between the fidelity to the observation model and some regularity priors, the balance being tuned by the hyperparameter Λ. Hence, our purpose is twofold. First, solving the minimization problem (6), that is, for a fixed hyperparameter Λ and a given observation z, computing the minimizer x̂(z; Λ) of (6). Second, finding the best hyperparameter Λ†, minimizing the quadratic error E{‖x̂(z; Λ) − x̄‖²}, i.e., Problem 1.
Find

Λ† = argmin_{Λ∈(R*₊)^K} E{‖x̂(z; Λ) − x̄‖²}     (12)

where x̂(z; Λ) is defined by (6) and the expectation is taken over all realizations of the noise B corrupting the observation z = B(Ax̄).

In the next sections, we specify the assumptions on A and B allowing us to derive a fast algorithmic scheme to estimate Λ†.

3.2 Convex non-smooth minimization

The objective function appearing in Problem (6) is convex, since the composition of a linear operator and a norm is convex. Yet, because of the presence of the norm ‖·‖, it is non-smooth. Consequently, the minimization of (6) requires proximal algorithms [4, 14, 38], which in general suffer from low convergence rates. Nevertheless, provided that the operator A is injective, it is possible to design accelerated primal-dual schemes [12] with improved convergence rates toward the minimizer of (6). Such algorithms rely on proximity operators [38], whose definition is recalled in Definition 1. Further, having closed-form expressions for the proximity operators of the data-fidelity term and of the penalization is a key element in designing fast implementations of primal-dual algorithms.

Definition 1.
For a convex, lower semi-continuous function ϕ : R^N → R ∪ {+∞} and τ >
0, the proximity operator is defined as

(∀x ∈ R^N)  prox_{τϕ}(x) = argmin_{x̃} ½‖x̃ − x‖² + τ ϕ(x̃)     (13)

where ‖·‖ denotes the Euclidean norm on R^N.

A few examples of well-established closed-form expressions for proximity operators of interest in this work are now recalled.

Example 1.
The proximity operator of the data-fidelity term ‖A· − z‖² has a closed-form expression, that is, for every τ > 0,

(∀x ∈ R^N)  prox_{τ‖A·−z‖²}(x) = (Id + 2τA⊤A)⁻¹ (x + 2τA⊤z).     (14)

Example 2.
The proximity operator of the multivariate 1D ℓ₂,₁-norm defined in (8) is, for every y ∈ R^{K×N},

(prox_{τ‖·‖₂,₁}(y))_{k,n} = (1 − τ/‖y_{·,n}‖) y_{k,n}  if ‖y_{·,n}‖ > τ,  and 0 otherwise,     (15)

where ‖y_{·,n}‖ := √( Σ_{k=1}^{K} y²_{k,n} ).

Example 3.
The proximity operator of the 2D ℓ₂,₁-norm of Equation (11) is, for y = (y^(H), y^(V)) ∈ R^{2×N₁×N₂},

(prox_{τ‖·‖₂,₁}(y))_{n₁,n₂} = (1 − τ/‖y_{n₁,n₂}‖) y_{n₁,n₂}  if ‖y_{n₁,n₂}‖ > τ,  and 0 otherwise,     (16)

where ‖y_{n₁,n₂}‖ := √( (y^(H)_{n₁,n₂})² + (y^(V)_{n₁,n₂})² ).

Theorem 1.
Assuming that the deformation operator A is injective, and denoting by µ > 0 the smallest eigenvalue of A⊤A, the sequence (x^[t])_{t∈N} defined in Algorithm 1 converges toward the solution x̂(z; Λ) of

minimize_{x∈R^N} ‖Ax − z‖² + ‖D_Λ x‖.     (17)

Further, it has been proven in [12] that, for any ε > 0, there exists t₀ such that

(∀t ≥ t₀)  ‖x̂(z; Λ) − x^[t]‖² ≤ (ε/t²) ( ‖x̂(z; Λ) − x^[0]‖²/(µ²τ₀²) + ‖A‖² ‖ŷ(z; Λ) − y^[0]‖²/µ² )     (18)

where ŷ(z; Λ) denotes the solution of the dual problem of Problem (17). Hence, the convergence rate of the iterates (x^[t])_{t∈N} scales as O(1/t²).

Proof. This theorem is a direct application of [12, Theorem 2], stated and demonstrated for the minimization of objective functionals defined as the sum of a µ-strongly convex data-fidelity term and a convex, proper, lower semi-continuous penalization, which is the case here. Indeed, thanks to the assumption that A is full-rank, the considered data-fidelity term ‖Ax − z‖² is µ-strongly convex with modulus µ = 2 min Sp(A⊤A) >
0. Further, the penalization, being the composition of a linear operator and a norm, satisfies the aforementioned conditions. Then, the primal-dual updates of Algorithm 1 correspond to the customization of Algorithm 2 of [12] to the problem of finding the solution x̂(z; Λ) of (6), hence corresponding to G(x) = ‖Ax − z‖², the linear operator K = D_Λ, and F = ‖·‖₂,₁.

We note that, because of the operator (Id + 2τA⊤A)⁻¹, the proximity operator of the data term might be uneasy to evaluate. However, for numerous configurations, this expression has a closed form: first, when A = Id; second, when A is diagonalizable in a specific basis, such as the Fourier basis for circulant matrices (leading to O(MN) operations). Another specific choice of A will be discussed in Section 4.2.

Once an efficient algorithmic strategy has been identified to estimate x̂(z; Λ), the second major difficulty raised by Problem 1 is that, in practice, one does not have access to the true signal/image x̄, and hence cannot compute E{‖x̂(z; Λ) − x̄‖²}. To handle this limitation, Stein proposed an Unbiased Risk Estimator [50], denoted SURE(Λ), providing a usable approximation of the quadratic risk in the case of i.i.d. Gaussian noise. This estimator was later widely extended to more general noise models [23, 41].

Theorem 2 (Stein Unbiased Risk Estimate). We denote by x̂(z; Λ) the parametric estimator defined in (17) of the ground truth x̄ from the observation z = B(Ax̄), corrupted by a full-rank deformation operator A and additive (possibly correlated) Gaussian noise B, with covariance matrix S. The Stein Unbiased Risk Estimate, defined as

SURE(Λ) := ‖
Φ(Ax̂(z; Λ) − z)‖² + 2 Tr( S Φ⊤ ∂x̂(z; Λ)/∂z ) − Tr(Φ S Φ⊤),     (19)

satisfies the following unbiasedness property:

E{SURE(Λ)} = E{‖x̂(z; Λ) − x̄‖²},     (20)

where Φ := (A⊤A)⁻¹A⊤ and ∂x̂(z; Λ)/∂z denotes the Jacobian of x̂(z; Λ) with respect to the observation z.

Proof. A complete and detailed proof was proposed in [41].
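The mechanics of Theorem 2 can be checked on the simplest textbook case, A = Id (so Φ = Id) with i.i.d. noise, S = σ²Id, taking soft-thresholding as the estimator x̂ (our illustrative choice, not the estimator (6)); the Jacobian trace is then simply the number of entries above the threshold.

```python
import numpy as np

def soft_threshold(z, tau):
    """Componentwise soft-thresholding with threshold tau."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def sure_soft_threshold(z, tau, sigma):
    """SURE (19) specialized to A = Id, Phi = Id, S = sigma^2 Id:
    ||x_hat - z||^2 + 2 sigma^2 Tr(d x_hat / d z) - N sigma^2,
    where, for soft-thresholding, Tr(d x_hat / d z) = #{n : |z_n| > tau}."""
    x_hat = soft_threshold(z, tau)
    dof = np.count_nonzero(np.abs(z) > tau)
    return np.sum((x_hat - z) ** 2) + 2 * sigma ** 2 * dof - z.size * sigma ** 2
```

For z = (3, 0.5, −2), τ = 1 and σ = 1, the estimate is (2, 0, −1), two coefficients survive, and the three terms of (19) can be read off directly.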
Definition 2 (Degrees of freedom). The second term in the definition of SURE(Λ) in Equation (19),

2 Tr( S Φ⊤ ∂x̂(z; Λ)/∂z ),     (21)

is called the degrees of freedom.

Since Φ ∈ R^{N×M} and ∂x̂(z; Λ)/∂z ∈ R^{N×M} are large-size matrices, computing the trace of Φ⊤ ∂x̂(z; Λ)/∂z is very expensive, and hence the evaluation of the degrees of freedom requires additional tools. This difficulty is overcome using a Monte Carlo (MC) strategy, only requiring the evaluation of the Jacobian on a random vector δ ∈ R^M. Hence, it is only necessary to store a vector of size N, instead of manipulating a matrix of size N × M, which drastically decreases the computational and memory costs. Moreover, when x̂(z; Λ) is obtained from a minimization scheme, there is often no closed-form expression for the Jacobian; hence we approximate (∂x̂(z; Λ)/∂z)[δ] using a Finite Difference (FD) approximation. Altogether, the Monte Carlo and Finite Difference strategies lead to the following FDMC Stein Unbiased Risk Estimate.

Theorem 3 (Finite Difference Monte Carlo SURE). Let x̂(z; Λ) denote a parametric estimator of the ground truth x̄ from the observation z = B(Ax̄), corrupted by a full-rank deformation operator A and additive (possibly correlated) Gaussian noise B, with covariance matrix S, and let Φ = (A⊤A)⁻¹A⊤. The FDMC Stein Unbiased Risk Estimate is defined as

SURE_{ε,δ}(Λ) := ‖
Φ(Ax̂(z; Λ) − z)‖² + (2/ε) ⟨ Φ⊤( x̂(z + εδ; Λ) − x̂(z; Λ) ), Sδ ⟩ − Tr(Φ S Φ⊤).     (22)

Provided that x̂(z; Λ) is uniformly Lipschitz w.r.t. the observation z and integrable against a Gaussian density, SURE_{ε,δ}(Λ) is an asymptotically unbiased estimator of the quadratic risk, i.e.,

lim_{ε→0} E{SURE_{ε,δ}(Λ)} = E{‖x̂(z; Λ) − x̄‖²},     (23)

where the expectation on the left-hand side is taken over both the noise B and the Monte Carlo vector δ ∼ N(0, Id).

Proof.
The proof directly follows from Theorem 2 in [41].

Thus, Problem 1 is replaced by
Problem 2.
Find Λ̂† = argmin_{Λ∈(R*₊)^K} SURE_{ε,δ}(Λ).
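Before turning to minimization strategies, the Finite Difference Monte Carlo approximation of the degrees of freedom (21) can be sanity-checked on the soft-thresholding toy case (again an illustrative choice with A = Id and S = σ²Id, where the exact Jacobian trace is known and can be compared to the FD/MC estimate).

```python
import numpy as np

rng = np.random.default_rng(1)

def soft_threshold(z, tau):
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def trace_fdmc(estimator, z, eps, delta):
    """Finite-difference Monte Carlo estimate of Tr(d x_hat / d z):
    <(x_hat(z + eps*delta) - x_hat(z)) / eps, delta>, with delta ~ N(0, Id)."""
    return np.dot(estimator(z + eps * delta) - estimator(z), delta) / eps

tau = 1.0
z = 2.0 * rng.standard_normal(10_000)
delta = rng.standard_normal(z.size)

est = trace_fdmc(lambda v: soft_threshold(v, tau), z, eps=1e-3, delta=delta)
exact = np.count_nonzero(np.abs(z) > tau)   # exact trace for soft-thresholding
```

With 10⁴ samples, a single Monte Carlo vector δ already yields a trace estimate within a few percent of the exact count, while only ever storing vectors, never the full Jacobian.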
In order to solve Problem 2, two strategies can be considered. The first is a grid search: computing SURE_{ε,δ}(Λ) over a large range of hyperparameter values, corresponding to a discrete set Λ = (Λᵢ)ᵢ₌₁ᴵ, and selecting a posteriori the hyperparameter of the grid for which SURE_{ε,δ}(Λᵢ) is minimal, denoted Λ̂_G, as described in Algorithm 2. The major drawback of this approach is its computational cost, which increases algebraically with the dimension of the hyperparameter Λ.

Algorithm 1
Accelerated primal dual scheme for minimization of (6).
Require:
Set ε > 0, δ ∈ R^M, τ₀ > 0, σ₀ > 0, such that τ₀σ₀‖D_Λ‖² < 1.
for z̃ ∈ {z, z + εδ} do
    Choose x̃^[0] ∈ R^N, x^[0] ∈ R^N, y^[0] = D_Λ x^[0]
    for t = 0 to T_max − 1 do    {Primal-dual updates}
        y^[t+1] = prox_{σ_t(‖·‖)*}( y^[t] + σ_t D_Λ x̃^[t] )
        x^[t+1] = prox_{τ_t‖A·−z̃‖²}( x^[t] − τ_t D_Λ⊤ y^[t+1] )
        ϑ_t = 1/√(1 + 2µτ_t), τ_{t+1} = ϑ_t τ_t and σ_{t+1} = σ_t/ϑ_t
        x̃^[t+1] = x^[t+1] + ϑ_t( x^[t+1] − x^[t] )
    end for
    x̂(z̃; Λ) ← x^[T_max]
end for
Compute SURE_{ε,δ}(Λ) by injecting x̂(z; Λ) and x̂(z + εδ; Λ) into Formula (22)
return SURE_{ε,δ}(Λ)
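As a concrete instance of the primal-dual updates of Algorithm 1, the Python sketch below solves the inner minimization (6) for 1D total-variation denoising, i.e., A = Id with the first-order difference operator (7) (the weight λ is kept outside D here, which is equivalent). It implements only the minimization for a single observation, without the SURE evaluation; step sizes and the strong-convexity modulus µ = 2 follow from the fidelity ‖x − z‖².

```python
import numpy as np

def D(x):
    """Discrete gradient (7) of a 1D signal (unit weight)."""
    return np.diff(x)

def Dt(y):
    """Adjoint of D."""
    out = np.zeros(len(y) + 1)
    out[:-1] -= y
    out[1:] += y
    return out

def tv_denoise(z, lam, n_iter=500):
    """Accelerated primal-dual scheme for
    min_x ||x - z||^2 + lam * sum_n |x[n+1] - x[n]|   (A = Id)."""
    x = z.copy()
    x_bar = x.copy()
    y = np.zeros(len(z) - 1)
    tau, sigma = 0.99 / 2.0, 0.99 / 2.0   # tau * sigma * ||D||^2 < 1 (||D||^2 <= 4)
    mu = 2.0                              # strong-convexity modulus of ||x - z||^2
    for _ in range(n_iter):
        # Dual ascent, then projection onto {||y||_inf <= lam} (prox of the
        # conjugate of lam * l1-norm).
        y = np.clip(y + sigma * D(x_bar), -lam, lam)
        # Primal descent: prox of tau * ||. - z||^2, cf. Example 1 with A = Id.
        x_new = (x - tau * Dt(y) + 2.0 * tau * z) / (1.0 + 2.0 * tau)
        # Acceleration of the step sizes and extrapolation.
        theta = 1.0 / np.sqrt(1.0 + 2.0 * mu * tau)
        tau, sigma = theta * tau, sigma / theta
        x_bar = x_new + theta * (x_new - x)
        x = x_new
    return x
```

On a two-level step signal the minimizer is known in closed form (each plateau is shrunk toward the other by λ/(2n), with n the plateau length), which gives a direct check of convergence.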
Algorithm 2
Grid search for SURE minimization.
Require:
Grid Λ = (Λᵢ)ᵢ₌₁ᴵ, ε > 0, δ ∼ N(0, Id) ∈ R^M.
for i = 1 to I do
    ERROR(i) ← SURE_{ε,δ}(Λᵢ), computed from Algorithm 1
end for
î_G ← argmin_{1≤i≤I} ERROR(i)
return Λ̂_G = Λ_{î_G}

In order to provide faster implementations, we consider an automated selection of the hyperparameters. To that aim, the number of hyperparameters is assumed to be K = O(1), and hence quasi-Newton algorithms are appropriate, since they handle minimization in low dimension very efficiently. This requires computing the gradient of SURE_{ε,δ}(Λ) w.r.t. the hyperparameter Λ. For this purpose, a Stein Unbiased GrAdient Risk estimate, denoted SUGAR_{ε,δ}(Λ), was proposed [18, 41], which, under the conditions of Theorem 3, writes

SUGAR_{ε,δ}(Λ) := 2 ( Φ A ∂x̂(z; Λ)/∂Λ )⊤ ( Φ(Ax̂(z; Λ) − z) ) + (2/ε) ⟨ Φ⊤( ∂x̂(z + εδ; Λ)/∂Λ − ∂x̂(z; Λ)/∂Λ ), Sδ ⟩.     (24)

A sketch of a quasi-Newton descent [37], particularized to Problem 2, is detailed in Algorithm 4. It generates a sequence (Λ^[j])_{j∈N} converging toward a minimizer of SURE_{ε,δ}(Λ), denoted Λ̂_BFGS. This algorithm relies on a gradient-descent step involving a descent direction d^[j], obtained as the product of the BFGS approximated inverse Hessian matrix H^[j] and the gradient SUGAR_{ε,δ}(Λ) obtained from Algorithm 3. The descent step size α^[j] is obtained from a line search, which stops when the Wolfe conditions are fulfilled [17, 37]. Finally, the approximated inverse Hessian matrix H^[j] is updated according to the BFGS strategy.

Remark. The line search is the most time-consuming step. Indeed, the routines SURE and SUGAR are called for several hyperparameters of the form Λ^[j] + αd^[j], each call requiring to run the differentiated primal-dual scheme twice.

Algorithm 3
Accelerated primal-dual scheme for minimization of (6) with iterative forward differentiation.
Require:
Set ε > 0, δ ∈ R^M, τ₀ > 0, σ₀ > 0, such that τ₀σ₀‖D_Λ‖² < 1.
for z̃ ∈ {z, z + εδ} do
    Choose x̃^[0] ∈ R^N, x^[0] ∈ R^N, y^[0] = D_Λ x^[0]
    ∂_Λ x̃^[0] ← 0_N, ∂_Λ x^[0] ← 0_N, ∂_Λ y^[0] ← D_Λ ∂_Λ x^[0]
    for t = 0 to T_max − 1 do
        {Primal-dual updates}
        y^[t+1] = prox_{σ_t(‖·‖)*}( y^[t] + σ_t D_Λ x̃^[t] )
        x^[t+1] = prox_{τ_t‖A·−z̃‖²}( x^[t] − τ_t D_Λ⊤ y^[t+1] )
        ϑ_t = 1/√(1 + 2µτ_t), τ_{t+1} = ϑ_t τ_t and σ_{t+1} = σ_t/ϑ_t
        x̃^[t+1] = x^[t+1] + ϑ_t( x^[t+1] − x^[t] )
        {Forward iterative differentiation}
        ∂_Λ y^[t+1] = ∂_y prox_{σ_t(‖·‖)*}( ∂_Λ y^[t] + σ_t D_Λ ∂_Λ x̃^[t] + σ_t (∂_Λ D_Λ) x̃^[t] )
        ∂_Λ x^[t+1] = ∂_x prox_{τ_t‖A·−z̃‖²}( ∂_Λ x^[t] − τ_t D_Λ⊤ ∂_Λ y^[t+1] − τ_t (∂_Λ D_Λ⊤) y^[t+1] )
        ∂_Λ x̃^[t+1] = ∂_Λ x^[t+1] + ϑ_t( ∂_Λ x^[t+1] − ∂_Λ x^[t] )
    end for
    x̂(z̃; Λ) ← x^[T_max], ∂_Λ x̂(z̃; Λ) ← ∂_Λ x^[T_max]
end for
Compute SURE_{ε,δ}(Λ) by injecting x̂(z; Λ) and x̂(z + εδ; Λ) into Formula (22)
Compute SUGAR_{ε,δ}(Λ) by injecting x̂(z; Λ), x̂(z + εδ; Λ), ∂_Λ x̂(z; Λ) and ∂_Λ x̂(z + εδ; Λ) into Formula (24)
return SURE_{ε,δ}(Λ) and SUGAR_{ε,δ}(Λ)

Algorithm 4
Automated selection of hyperparameters minimizing the quadratic risk.
Require: ε > 0, δ ∼ N(0, Id) ∈ R^M
Ensure: Λ^[0] ∈ (R₊)^K, H^[0] ∈ R^{K×K}
SUGAR^[0] ← SUGAR_{ε,δ}(Λ^[0]), computed from Algorithm 3
for j = 0 to J_max − 1 do
    d^[j] = −H^[j] SUGAR^[j]
    α^[j] ∈ Argmin_{α∈R} SURE_{ε,δ}(Λ^[j] + αd^[j]), SURE computed from Algorithm 3
    Λ^[j+1] = Λ^[j] + α^[j] d^[j]
    SUGAR^[j+1] ← SUGAR_{ε,δ}(Λ^[j+1]), computed from Algorithm 3
    u^[j] = SUGAR^[j+1] − SUGAR^[j]
    H^[j+1] = ( Id − d^[j](u^[j])⊤/((u^[j])⊤d^[j]) ) H^[j] ( Id − u^[j](d^[j])⊤/((u^[j])⊤d^[j]) ) + α^[j] d^[j](d^[j])⊤/((u^[j])⊤d^[j])
end for
return Λ̂_BFGS = Λ^[J_max]

Context – Friction experiments aim at probing not only the characteristics of materials, but also the dynamics of systems involving surfaces in contact. In particular, they are paradigms for modeling and attempting to predict earthquake dynamics [35]. The classical solid friction experiment consists in towing a mass m (the so-called slider) over a substrate via a spring of stiffness k pulled at velocity V (see for instance Figure 2 in [13]). The signal representative of the slider dynamics is the force measured at the contact point between the spring and the slider. Among the different regimes described in solid friction, one can distinguish stick-slip, characterized by a tooth-shaped signal alternating slow linear rises and fast drops; the inertial regime, in which the signal becomes periodic and resembles a sine curve; and the continuous sliding regime, characterized by an almost constant signal superimposed with noise [3]. The appearance of creep, a slow forward motion of the slider prior to a slip event, may also modify the signal shape. The challenge in such studies is to establish a regime diagram describing (and therefore predicting) the system dynamics depending on the parameters (m, k, V).
While the recognition of the different regimes is easy for a large mass m, experiments at low confinement pressure, necessary to avoid surface wear, are challenging as they add noise to the experimental signals [13]. In this context, new signal processing tools are required. Here we focus in particular on signal denoising, approximating, at first order, the stick-slip signals by piecewise linear signals.

Data – Experiments of solid friction (taken from [13]) were performed by pulling a mass m = 30.… over a solid substrate. Both surfaces in contact consist of paper samples (Canson®, characterized by its roughness). A cantilever spring (metallic blade of stiffness k between 168 and 3337 N/m) is pulled at constant velocity V (between 42 and 7200 µm/s) and is in contact with the slider through a steel ball glued onto the latter, ensuring a point contact and the free motion of the contact point. The slider dynamics is quantified through the measurement of the blade deflection, ∆x, by an inductive sensor (Baumer, IPRM 12I9505/S14). In all experiments, the mass m is kept constant. We vary the parameters (k, V) and, for each experiment, record the normalized force signal F* from the blade deflection, F* = k∆x/(mg), where g = 9.81 m/s² is the gravitational acceleration. This signal is recorded with a sampling frequency of 2 kHz, and its size varies from about 4.×… to 7.×… samples.

Piecewise linear denoising – In [13], stick-slip signals were processed using an optimization formalism, falling under formulation (6), in order to enforce a piecewise linear behavior. The observation z corresponds to the measured force signals, A = Id, the linear operator D_Λ is chosen to be the discrete Laplacian described in Equation (9) (for a univariate signal, i.e., K = 1), and ‖·‖ is the ℓ_{2,1}-norm defined in (8), which reduces to the ℓ₁-norm in the context of univariate signals. The tedious task of tuning the regularization parameter λ was performed by expert visual inspection and led to a choice λ_expert = 0.…, with the corresponding regularized signals displayed in red in Figure 2. The regularized signals appear to capture well the transitions between the stick and slip regimes. Here, we propose to illustrate the use of the automated regularization parameter tuning strategy presented in Section 3.4 for piecewise linear denoising of stick-slip signals. SURE_{ε,δ}(λ) is used as the quality criterion and minimized over λ. Both grid search and automated tuning are implemented and compared.

Automated data-driven hyperparameter tuning – The finite difference step ε, involved in the computation of SURE_{ε,δ} and SUGAR_{ε,δ} [see Equations (22) and (24)], is set to

    ε = 2σ/N, (25)

with N the length of the considered stick-slip signal and σ the estimated variance of the additive noise corrupting the signal. Since no additional information about the noise is available, σ is estimated by applying the sample variance estimator to the observations. SURE_{ε,δ} (black curve in Figure 3) is first computed over a grid of 15 logarithmically spaced values of the regularization parameter λ, using Algorithm 2. Then, λ_grid is defined as the minimizer of SURE_{ε,δ}(λ) over the grid and indicated by the '+' symbol.
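To make the finite-difference Stein machinery concrete, the sketch below evaluates a SURE estimate in the spirit of Equation (22) for a generic denoiser; soft-thresholding stands in here for the paper's piecewise linear estimator, and the names `soft` and `sure_fd` are illustrative, not from the toolbox. Because soft-thresholding is piecewise linear, its finite-difference divergence can be checked against a closed form:

```python
import numpy as np

def soft(z, lam):
    """Soft-thresholding: proximity operator of lam * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def sure_fd(z, denoise, sigma, eps, delta):
    """Finite-difference SURE (Gaussian noise, variance sigma^2):
    ||f(z) - z||^2 - M*sigma^2 + 2*sigma^2 * div f(z),
    with div f(z) approximated by delta^T (f(z + eps*delta) - f(z)) / eps."""
    fz = denoise(z)
    fd_div = delta @ (denoise(z + eps * delta) - fz) / eps
    sure = np.sum((fz - z) ** 2) - z.size * sigma ** 2 + 2 * sigma ** 2 * fd_div
    return sure, fd_div

rng = np.random.default_rng(0)
z = np.array([0.2, -2.0, 1.5, 0.5, -0.3, 3.0])   # no |z_i| close to the threshold
delta = rng.standard_normal(z.size)
lam, sigma = 1.0, 0.1
eps = 2 * sigma / z.size                          # step choice in the spirit of Eq. (25)
sure, fd_div = sure_fd(z, lambda u: soft(u, lam), sigma, eps, delta)

# Away from the kinks, the finite difference is exact for soft-thresholding:
# it equals the sum of delta_i^2 over active coordinates (|z_i| > lam).
exact_div = np.sum(delta ** 2 * (np.abs(z) > lam))
```

The same Monte Carlo trick extends to any black-box denoiser, which is precisely what makes the SURE/SUGAR framework attractive when no closed-form divergence is available.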
Finally, the quasi-Newton Algorithm 4 for automated tuning of the regularization parameter is run, providing λ_BFGS, represented by the '∗' symbol. The regularization parameter chosen by the expert is displayed for comparison purposes and indicated by the '×' marker. For each experimental setting (k, V), the grid search optimal regularization parameter λ_grid and the automatically tuned regularization parameter λ_BFGS(k, V), obtained respectively from Algorithms 2 and 4, are compared in Table 1, showing satisfactory agreement.

Figure 2: Stick-slip force signals. Experimental data (in grey) for three different experimental settings and nonlinear filtering enforcing piecewise linear behavior, with automated hyperparameter tuning (blue) and expert-selected hyperparameter (red).

Denoised data analysis – Table 1 shows first that λ_BFGS is of the same order of magnitude as λ_expert. This is consistent with Figure 2, which further shows that the denoised experimental signals obtained from nonlinear filtering enforcing piecewise linear behavior, with automated hyperparameter tuning (blue) and expert-selected hyperparameter (red), display similar shapes and behaviors. This is a very satisfactory outcome, as the proposed data-driven and automated hyperparameter tuning yields results very consistent with those obtained from expert choices, without making use of any a priori information, relying on data only.

Table 1: Grid search λ_grid(k, V) and automatically tuned λ_BFGS(k, V) regularization parameters, as functions of the stiffness k ∈ {168, 1002, 2254} N/m and the velocity V (µm/s).

Figure 3: SURE_{ε,δ}(λ). Grid search vs. automated tuning of the regularization parameter.

Table 1 also shows that the automated procedure yields different regularization parameters for the different (k, V) configurations, illustrating an ability to finely adapt to the data, which would not be possible, or would be far too time-consuming, for an expert. Table 1 further reveals that the automatically selected regularization parameters, λ_BFGS, are, for almost all (k, V) configurations, slightly larger than the expert-selected ones, λ_expert, hence yielding overall more regular signals. Figure 2(f) shows that the red signal, obtained with λ_expert, displays discontinuities (e.g., around t = 5.… s) that are absent from the signal obtained with λ_BFGS, hence showing the interest of tuning the hyperparameter to each specific signal. However, Figures 2(b) and (d) also show small yet visible differences during the slip phase (fast decrease) between the red signals, obtained with λ_expert, and the blue signals, obtained with the automated selection λ_BFGS. To decide which one is the most relevant requires returning to a detailed analysis of solid friction: the stick phases actually produce force signals that are exactly linearly increasing; as for the slip phases, while they can be described in first approximation as abrupt linear decreases, detailed analysis indicates that they actually consist of arches of sinusoidal functions that connect the stick phases.
Therefore, it can be considered that the expert-driven signals (red) better fit the experimental data, at the price though of concatenating a series of short linear segments that are irrelevant with respect to the underlying physics, whereas the data-driven signals (blue) yield more stylized piecewise linear approximations of the data, which may however better capture the times of transition between the stick and slip phases, a piece of information of prime importance for the analysis of solid friction regimes. In sum, deciding between expert and automated tuning of the hyperparameters involves several issues, ranging from feasibility (expert tuning is time-consuming, prone to errors, and may lack reproducibility) to relevance (denoised signals must give relevant access to the quantities of interest for the physics).
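The piecewise linear denoiser used throughout this section solves min_x ½‖x − z‖² + λ‖Dx‖₁, with D the discrete Laplacian. The sketch below implements it with plain (unaccelerated) Chambolle-Pock primal-dual iterations; the paper's Algorithm 3 additionally accelerates and differentiates these updates, and the function names here are illustrative only:

```python
import numpy as np

def laplacian(x):
    """Discrete Laplacian (second difference) D x, of length N - 2."""
    return x[:-2] - 2.0 * x[1:-1] + x[2:]

def laplacian_T(y, n):
    """Adjoint operator D^T y."""
    out = np.zeros(n)
    out[:-2] += y
    out[1:-1] -= 2.0 * y
    out[2:] += y
    return out

def pw_linear_denoise(z, lam, n_iter=2000):
    """min_x 0.5*||x - z||^2 + lam*||D x||_1 by primal-dual iterations."""
    n = z.size
    tau = sigma = 0.24          # tau*sigma*||D||^2 < 1, since ||D||^2 <= 16
    x = z.copy()
    x_bar = x.copy()
    y = np.zeros(n - 2)
    for _ in range(n_iter):
        # dual prox: projection onto the l_inf ball of radius lam
        y = np.clip(y + sigma * laplacian(x_bar), -lam, lam)
        # primal prox of 0.5*||. - z||^2 with step tau
        x_new = (x - tau * laplacian_T(y, n) + tau * z) / (1.0 + tau)
        x_bar = 2.0 * x_new - x  # over-relaxation, theta = 1
        x = x_new
    return x
```

An exactly linear input is a fixed point (its Laplacian vanishes), while a noisy input is mapped to a signal with much sparser second differences, i.e. a piecewise linear approximation.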
Context – Understanding and predicting the dynamics of multiphase flows is a major issue in geosciences (soil decontamination, CO₂ sequestration) and in industry (enhanced oil recovery, heterogeneous catalysis) [2, 28, 30, 45]. Among these processes, many involve a joint gas and liquid flow through a porous medium. Quantifying the contact areas between the different phases, where chemical reactions take place, is of tremendous importance for analyzing and predicting the efficiency of such processes [31]. However, even when direct visualization is possible, the porous medium generates a global, multiscale texture on the images, which makes it difficult to extract the gas-liquid interfaces. The segmentation techniques based on morphological tools used so far to differentiate phases in multiphase flows [49] present severe limitations: arbitrary threshold setting, non-physical irregular bubble contours, non-detection of small bubbles. In addition, recent developments in high-resolution and high-speed imaging yield large images and large data sets, thus bringing forward issues of memory and computational costs. Here, we focus on the identification of the different phases (liquid and gas) in textured images. As a first approximation, the liquid and the gas appear as homogeneous fractal textures. Hence, discriminating the phases requires solving a texture segmentation problem.

Data – Experiments of joint gas and liquid flow through a porous medium were performed in a quasi-2D vertical Hele-Shaw cell of width 210 mm, height 410 mm and gap 1.75 mm (see Figure 1 in [7]). The porous medium is an open-cell solid foam of NiCrFeAl alloy (Alantum), with a typical pore diameter of 580 µm. Constant gas and liquid flow rates are injected at the bottom of the cell through nine injectors (air) and a homogeneous slit (water).
Images of the multiphase flow are acquired by a high-resolution camera (Basler A2040-90um, 2048 × 2048 pixels).

Fractal features – We consider fractal, or scale-free, features, consisting of the local behavior, as a function of the scales 2^j, j ∈ {1, …, J}, of the wavelet leader coefficients L_{j,n}, built as local suprema of wavelet coefficients [53, 54]. For each pixel n ∈ Ω = {1, …, N₁} × {1, …, N₂}, the leader coefficients of the image X to analyze, denoted L_{j,n}, evidence the following local scaling property [29]:

    L_{j,n} ∼ η_n 2^{j h_n}, as 2^j → 0, (26)

where 2^j denotes the scale of the multiscale transform. The quantity h_n measures the local regularity of the texture at pixel n. In log-log coordinates, Equation (26) corresponds to a linear behavior through the octaves j:

    log₂(L_{j,n}) ≃ log₂(η_n) + j h_n, as 2^j → 0. (27)

Setting v_n := log₂(η_n), called in the following the local power of the texture, a texture X is characterized by (h_n, v_n)_{n∈Ω}.

Definition 3. A homogeneous texture is characterized by a uniform local regularity h_n ≡ H and a uniform local power v_n ≡ V.

Texture segmentation then consists in identifying a partition of the image domain,

    Ω = Ω₁ ∪ ⋯ ∪ Ω_Q, with Ω_q ∩ Ω_{q'} = ∅ for q ≠ q', (28)

for which both h_n and v_n are uniform on each Ω_q. In other words, it consists in obtaining piecewise constant maps of local regularity and local power.

Regularized estimates – The linear regression on the log-leaders (27) can be formulated as the minimization of the least-squares functional

    Φ(h, v; L) = (1/2) Σ_{j=j₁}^{j₂} ‖ j h + v − log₂ L_j ‖², (29)

which provides the estimates (ĥ_LR, v̂_LR) of the fractal features. As an example, the linear regression estimate of the local regularity of the (zoomed) flow image of Figure 4(a) (Figure 5(a)) is presented in Figure 4(b) (Figure 5(b)).
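Since the least-squares functional (29) decouples per pixel, minimizing it amounts to an ordinary linear regression of log₂ L_{j,n} on j at each pixel n. A minimal numpy sketch on synthetic log-leaders obeying (27) exactly (the name `linreg_features` is illustrative):

```python
import numpy as np

def linreg_features(logL, j1=1):
    """Per-pixel OLS fit of Eq. (27): log2 L_{j,n} ~ v_n + j * h_n.

    logL: array of shape (J, N1, N2), scales j = j1, ..., j1 + J - 1.
    Returns the (h, v) maps, each of shape (N1, N2)."""
    J = logL.shape[0]
    j = np.arange(j1, j1 + J, dtype=float)
    jm = j.mean()
    Lm = logL.mean(axis=0)
    # closed-form OLS slope and intercept, vectorized over pixels
    h = ((j - jm)[:, None, None] * (logL - Lm)).sum(axis=0) / ((j - jm) ** 2).sum()
    v = Lm - h * jm
    return h, v

# Synthetic log-leaders with exact scaling: the regression recovers (h, v) exactly.
h_true = np.full((4, 5), 0.7)     # homogeneous local regularity
v_true = np.full((4, 5), -2.0)    # homogeneous local power
j = np.arange(1, 6, dtype=float)
logL = v_true[None] + j[:, None, None] * h_true[None]
h_hat, v_hat = linreg_features(logL)
```

On real leaders the fit is only approximate, and, as discussed next, the resulting pixelwise estimates have a large variance, which is what motivates the regularized formulations.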
These estimates turn out to suffer from large variances, precluding their use for actual segmentation and thus calling for nonlinear estimation tools. To favor piecewise homogeneous segmentations, we enforce piecewise constancy of the estimated features via two different Total Variation-based penalizations, leading to the minimization of the Joint and Coupled functionals:

    (ĥ^{J/C}, v̂^{J/C}) = argmin_{h,v} Φ(h, v; L) + Ψ^{J/C}(h, v; λ, α). (30)

The Joint and Coupled penalizations are defined as

    Ψ^J(h, v) := λ ( α TV(h) + TV(v) ), (31)

    Ψ^C(h, v) := λ Σ_{n₁=1}^{N₁−1} Σ_{n₂=1}^{N₂−1} √( α²(Hh)²_{n₁,n₂} + α²(Vh)²_{n₁,n₂} + (Hv)²_{n₁,n₂} + (Vv)²_{n₁,n₂} ), (32)

where the total variation (TV) is defined in Equation (11) and the horizontal and vertical discrete gradients, H and V, are defined in Equation (10). While the Joint penalization imposes piecewise constancy of the local regularity h and of the local power v independently, the Coupled penalization is more restrictive and favors co-localized changes in h and v. The trade-off between fidelity to the mathematical model (27) and piecewise constancy of h and v is controlled by the regularization parameter λ > 0, while α > 0 balances the regularization of h against that of v.

Iterated thresholding – From the regularized estimates ĥ^{J/C}, one can obtain a segmentation by applying a thresholding post-processing. The iterated thresholding procedure proposed in [9, 10], which benefits from theoretical guarantees, is customized to the gas/liquid segmentation problem in Algorithm 5. It is used systematically in the following, leading to the proposed T-Joint and T-
Coupled segmentation procedures introduced in [39].
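The two penalizations (31)-(32) are straightforward to evaluate numerically. The sketch below is a hedged reconstruction (the α² weighting inside the coupled square root is inferred from the garbled extraction and should be checked against [39]; all function names are illustrative); it also illustrates that, gradient field by gradient field, the Coupled penalization never exceeds the Joint one:

```python
import numpy as np

def grads(u):
    """Horizontal and vertical first differences, cropped to a common size."""
    gh = (u[:, 1:] - u[:, :-1])[:-1, :]
    gv = (u[1:, :] - u[:-1, :])[:, :-1]
    return gh, gv

def tv_iso(u):
    """Isotropic total variation, as in Eq. (11)."""
    gh, gv = grads(u)
    return np.sum(np.sqrt(gh ** 2 + gv ** 2))

def psi_joint(h, v, lam, alpha):
    """Eq. (31): independent TV regularization of h and v."""
    return lam * (alpha * tv_iso(h) + tv_iso(v))

def psi_coupled(h, v, lam, alpha):
    """Eq. (32): a single l2 coupling of the four gradient fields."""
    gh_h, gv_h = grads(h)
    gh_v, gv_v = grads(v)
    return lam * np.sum(np.sqrt(alpha ** 2 * (gh_h ** 2 + gv_h ** 2)
                                + gh_v ** 2 + gv_v ** 2))
```

In particular, when v is constant the two penalizations coincide (both reduce to λα TV(h)), while for co-localized jumps in h and v the Coupled penalty is strictly cheaper, which is why it favors aligned discontinuities.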
Algorithm 5
T-ROF: iterative thresholding of ĥ^ROF
Require: ĥ
Initialize: m[0]₀ = min_{n∈Ω} ĥ_n, m[0]₁ = max_{n∈Ω} ĥ_n
for t ∈ N* do
    {Compute the threshold:} T[t] = ( m[t−1]₀ + m[t−1]₁ ) / 2
    {Threshold ĥ:} Ω[t]₀ = { n | ĥ_n ≤ T[t] }, Ω[t]₁ = { n | ĥ_n > T[t] }
    {Update the region means:} m[t]₀ = (1/|Ω[t]₀|) Σ_{n∈Ω[t]₀} ĥ_n, m[t]₁ = (1/|Ω[t]₁|) Σ_{n∈Ω[t]₁} ĥ_n
end for
return Ω₀ = Ω[∞]₀ (liquid), Ω₁ = Ω[∞]₁ (gas)

Compared texture segmentation procedures – Four procedures falling under Model (6) and satisfying the assumptions of Theorem 1 and Theorem 2 will be compared for texture segmentation; they are summarized in Table 2. Note that they differ both by the functional minimized and by the noise model, which is of crucial importance in Stein-based procedures. The first one, denoted ROF-Id, is a state-of-the-art piecewise constant denoising method, applied to ĥ_LR seen as an observation of h̄ corrupted by additive i.i.d. zero-mean Gaussian noise of variance σ², hence with scalar covariance matrix S = σ²Id. The three procedures ROF-S, Joint and
Coupled take into account the covariance structure of the log-leader coefficients, evidencing both inter-scale and spatial correlations, encapsulated in a non-diagonal covariance matrix S. The linear operator intervening in the data fidelity term of the Joint and Coupled procedures, denoted J, acts on the double variable (h, v) as J(h, v) := (jh + v)_{j=j₁}^{j₂}. We showed in a previous work [39] that it is full-rank; hence, Theorem 1 applies. Moreover, the strong convexity modulus µ = 2 min Sp(JᵀJ), where Sp denotes the spectrum of a linear operator, only depends on the octave range {j₁, …, j₂}, and its numerical values are provided for fixed j₁ = 1 and varying j₂ in Table 3 [39].

Method  | Figures 4, 5 | Observation z | Operator A | Variable x | Penalization | Covariance
ROF-Id  | (c), (d)     | ĥ_LR          | Id         | h          | TV           | σ²Id
ROF-S   | (e), (f)     | ĥ_LR          | Id         | h          | TV           | S
Joint   | (g), (h)     | log₂(L)       | J          | (h, v)     | Ψ^J          | S
Coupled | (i), (j)     | log₂(L)       | J          | (h, v)     | Ψ^C          | S

Table 2: Four different settings considered in the experiments of local regularity-based texture segmentation with automated hyperparameter selection. ĥ_LR stands for the minimizer of (29), L denotes the wavelet leaders of the image to analyze, TV stands for the total variation penalization defined in (11), and h, v, Ψ^J, Ψ^C and S, defined in this section, denote respectively the local regularity, the local power, the Joint penalization, the Coupled penalization and the covariance matrix.

Automated hyperparameter tuning – The Stein-based formalism described in Section 3.4 is used, first, to obtain an estimate of the quadratic risk from SURE_{ε,δ}(λ, α) and, second, for the automated tuning of the regularization parameters thanks to the SUGAR_{ε,δ}(λ, α) estimate. For this purpose, it is necessary to provide an estimate of the covariance matrix of the noise. The estimated noise variance σ² involved in ROF-Id is obtained from the variance of ĥ_LR, while the covariance matrix S is assimilated to the covariance of the log-leaders of the textured image X to be segmented. The finite difference step ε, involved in the computation of SURE_{ε,δ} and SUGAR_{ε,δ} [see Equations (22) and (24)], is set to

    ε = 2 √(max S) / M, (33)

where M is the size of the observation vector and the maximum is taken over all coefficients of the covariance matrix, with M = N₁ × N₂ in the case of ROF-Id and ROF-S, and M = (j₂ − j₁ + 1) × N₁ × N₂ in the case of the Joint and Coupled procedures.
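The iterated thresholding of Algorithm 5 is, in essence, a two-class Lloyd (k-means) iteration on the scalar values of ĥ: the threshold is repeatedly set to the midpoint of the two class means. A minimal sketch (the name `t_rof` is illustrative):

```python
import numpy as np

def t_rof(h_hat, n_iter=100):
    """Algorithm 5 (T-ROF): iterate threshold = midpoint of the class means."""
    h = np.asarray(h_hat, dtype=float).ravel()
    m0, m1 = h.min(), h.max()
    T = 0.5 * (m0 + m1)
    for _ in range(n_iter):
        T = 0.5 * (m0 + m1)
        low, high = h[h <= T], h[h > T]
        if low.size == 0 or high.size == 0:
            break                    # degenerate split: keep current threshold
        m0, m1 = low.mean(), high.mean()
    mask_low = np.asarray(h_hat) <= T
    return T, mask_low, ~mask_low    # threshold, "liquid" mask, "gas" mask
```

On a clearly bimodal regularity map, the threshold settles between the two modes within a few iterations, yielding the liquid/gas partition.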
Accuracy of the automated tuning – Grid search minimization of SURE_{ε,δ}(λ, α) (Algorithm 2) being costly, due to the large number of runs of Algorithm 1 required, it is performed on a zoomed image of 281 × 231 pixels, presented in Figure 5(a). Then, the automated tuning of λ and α from Algorithm 4, based on SUGAR_{ε,δ}(λ, α), is performed on the same zoomed image.

j₂ :  2    3     4     5     6
µ  :  …   0.72  1.20  1.69  2.…

Table 3: Strong convexity modulus µ of the data-fidelity term of (6), for fixed j₁ = 1 and varied j₂. The bold entry corresponds to the range of scales used in the experiments.

In practice, SURE_{ε,δ}(λ) is computed over 15 values of the hyperparameter λ for the ROF-Id (Figure 6(a)) and ROF-S (Figure 6(b)) procedures, and over a 15 × 15 grid of (λ, α) values for the Joint (Figure 6(c)) and
Coupled (Figure 6(d)) methods. The grid search minimum, Λ_grid, indicated by the '+' symbol, is compared to the optimal regularization parameters found by applying Algorithm 4, Λ_BFGS, indicated by the '∗' symbol. The optimal parameters Λ_grid and Λ_BFGS appear to coincide perfectly for the ROF-Id and Joint procedures. As for the ROF-S and Coupled strategies, even though they are different, they are consistent with the SURE_{ε,δ} profile, in the sense that they correspond to similar values of SURE_{ε,δ}. We observed that, while grid search minimization (Algorithm 2) required 225 runs of Algorithm 1 for the Joint and Coupled methods, the automated tuning via BFGS quasi-Newton minimization (Algorithm 4) needed no more than 50 runs of Algorithm 3. Hence, when several parameters are involved, the automated strategy (Algorithm 4) is significantly faster than a grid search (Algorithm 2).
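The quasi-Newton outer loop of Algorithm 4 hinges on the rank-two inverse-Hessian update. The toy sketch below exercises exactly that update, with a quadratic "risk" and its exact gradient standing in for SURE_{ε,δ} and SUGAR_{ε,δ} (all names are illustrative; the real line search minimizes SURE along the search direction):

```python
import numpy as np

def bfgs_update(H, d, u, alpha):
    """Rank-two inverse-Hessian update of Algorithm 4 (step alpha*d, gradient change u)."""
    rho = 1.0 / (u @ d)
    I = np.eye(H.shape[0])
    return (I - rho * np.outer(d, u)) @ H @ (I - rho * np.outer(u, d)) \
        + alpha * rho * np.outer(d, d)

def quasi_newton(grad, lam0, line_search, n_iter=20):
    """Outer loop of Algorithm 4 with a user-supplied line search."""
    lam = np.array(lam0, dtype=float)
    H = np.eye(lam.size)
    g = grad(lam)
    for _ in range(n_iter):
        if np.linalg.norm(g) < 1e-10:
            break                        # converged
        d = -H @ g                       # quasi-Newton direction
        alpha = line_search(lam, d)
        lam_new = lam + alpha * d
        g_new = grad(lam_new)
        u = g_new - g                    # gradient change
        if abs(u @ d) > 1e-12:
            H = bfgs_update(H, d, u, alpha)
        lam, g = lam_new, g_new
    return lam

# Toy quadratic "risk" standing in for SURE; its gradient stands in for SUGAR.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
lam_star = np.array([0.5, 1.5])
grad = lambda lam: A @ (lam - lam_star)
exact_ls = lambda lam, d: -(grad(lam) @ d) / (d @ A @ d)   # exact for quadratics
lam_hat = quasi_newton(grad, np.zeros(2), exact_ls)
```

For such a quadratic, exact line search makes the iteration terminate in at most two steps, which mirrors why only a few dozen runs of Algorithm 3 suffice in practice compared to the hundreds needed by a grid search.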
Segmentation results – Figure 6 indicates that the automated selection of the regularization parameters is consistent with SURE_{ε,δ} minimization. Hence, the complete images of 1626 × … pixels are processed with the automatically tuned hyperparameters. The ROF-Id and ROF-S procedures yield regularized estimates ĥ^ROF presenting artifacts, as observed in Figures 4(d) and 4(f), and hence lead to inaccurate segmentations, cf. Figures 4(c) and 4(e). In addition, a key point in such experiments is to estimate precisely the contact surface between the liquid and the gas. Both T-ROF-Id and T-ROF-S (Figures 4(c) and (e)) present irregular contours, which are not representative of the real contours and strongly overestimate bubble perimeters. The Joint and Coupled procedures, taking into account both the local regularity and the local power, yield more regular contours. In addition, the Joint and Coupled methods produce fewer artifacts (see Figures 4(g) and 4(i)). However, the Joint estimate ĥ^J appears to be over-regularized, leading to the non-detection of small bubbles in the segmentation of Figure 4(g). The Coupled procedure turns out to achieve a satisfactory compromise, avoiding artifacts while still detecting small gas bubbles, as illustrated in Figures 4(i) and 4(j).
The present work has described a unified framework for signal/image nonlinear filtering, formulated as an inverse problem, which can be affiliated to several functional minimization problems encountered in statistical (nonlinear) physics. This inverse problem formulation aims at favoring piecewise homogeneous signals and images, which naturally correspond to the solutions of numerous problems in nonlinear physics, often very different in nature. Piecewise homogeneity assessment entails nonsmooth convex optimization, here handled via proximal operators. In addition to yielding relevant piecewise homogeneous estimates, the proposed framework also achieves an automated and data-driven tuning of the hyperparameters inherently present in inverse problems and nonlinear filtering, thus avoiding the burden of an error-prone and sometimes poorly reproducible expert inspection. The potential and interest of nonlinear filtering have been illustrated at work on two real nonlinear physics experiments, different in nature (low confinement solid friction and multiphase flow in porous media). However, the approach has a fairly general level of applicability, and a documented Matlab toolbox for both multivariate signals and images, implementing both the nonlinear filtering favoring piecewise homogeneity and the automated data-driven hyperparameter selection, has been made publicly available at https://github.com/bpascal-fr/stein-piecewise-filtering.

References

[1] Thomas Auger, Jerome Mathe, Virgile Viasnoff, Gaelle Charron, Jean-Marc Di Meglio, Loic Auvray, and Fabien Montel. Zero-mode waveguide detection of flow-driven DNA translocation through nanopores.
Phys. Rev. Lett., 113:028302, 2014. [2] A. Babchin, I. Brailovsky, P. Gordon, and G. Sivashinsky. Fingering instability in immiscible displacement.
Phys. Rev. E , 77:026301, 2008.[3] T. Baumberger and C. Caroli. Solid friction from stick-slip down to pinningand aging.
Adv. Phys. , 55(3-4):279–348, 2006.[4] H. H. Bauschke and P. L. Combettes.
Convex analysis and monotone op-erator theory in Hilbert spaces . Springer, New York, second edition, 2017.[5] A. Benazza-Benyahia and J.-C. Pesquet. Building robust wavelet estimatorsfor multicomponent images using Stein’s principle.
IEEE Trans. ImageProcess. , 14(11):1814–1830, 2005.[6] M. Berhanu, R. Monchaux, S. Fauve, N. Mordant, F. Petrelis, A. Chif-faudel, F. Daviaud, B. Dubrulle, L. Marie, F. Ravelet, M. Bourgoin, Ph.Odier, J.-F. Pinton, and R. Volk. Magnetic field reversals in an experimen-tal turbulent dynamo.
Eur. Phys. Lett. , 77:59001, 2007.[7] T. Busser, M. Serres, R. Philippe, and V. Vidal. Hydrodynamics of gas-liquid co-current flow through a thin sheet of highly porous open cell solidfoam. in revision at Chem. Eng. Sci. , 2020.[8] J.-F. Cai, B. Dong, S. Osher, and Z. Shen. Image restoration: Total varia-tion, wavelet frames, and beyond.
J. Amer. Math. Soc., 25:1033–1089, May 2012. [9] X. Cai, R. Chan, C.-B. Schönlieb, G. Steidl, and T. Zeng. Linkage between piecewise constant Mumford-Shah model and ROF model and its virtue in image segmentation. Preprint arXiv:1807.10194, 2018. [10] X. Cai and G. Steidl. Multiclass segmentation by iterated ROF thresholding. In
Int. Workshop on Energy Minimization Methods in Comp. Vis. andPat. Rec. , pages 237–250. Springer, 2013.[11] A. Chambolle. Image segmentation by variational methods: Mumford andShah functional and the discrete approximations.
SIAM J. Appl Math. ,55:827–863, 1995.[12] A. Chambolle and T. Pock. A first-order primal-dual algorithm for convexproblems with applications to imaging.
J. Math. Imag. Vis., 40(1):120–145, 2011. [13] J. Colas, N. Pustelnik, C. Oliver, P. Abry, J.-C. Géminard, and V. Vidal. Nonlinear denoising for characterization of solid friction under low confinement pressure.
Phys. Rev. E , 100:032803, 2019.[14] P. L. Combettes and J.-C. Pesquet. Proximal splitting methods in signalprocessing. In H. H. Bauschke, R. S. Burachik, P. L. Combettes, V. Elser,D. R. Luke, and H. Wolkowicz, editors,
Fixed-Point Algorithms for InverseProblems in Science and Engineering , pages 185–212. Springer-Verlag, NewYork, 2011.[15] P. L. Combettes and V. R. Wajs. Signal recovery by proximal forward-backward splitting.
Multiscale Model. and Simul. , 4(4):1168–1200, 2005.[16] L. Condat. A primal-dual splitting method for convex optimization involv-ing Lipschitzian, proximable and linear composite terms.
J. Optim. TheoryAppl. , 158(2):460–479, 2013.[17] F. E. Curtis, T. Mitchell, and M. L. Overton. A BFGS-SQP method fornonsmooth, nonconvex, constrained optimization and its evaluation usingrelative minimization profiles.
Optim. Methods Softw., 32(1):148–181, 2017. [18] C.-A. Deledalle, S. Vaiter, J. Fadili, and G. Peyré. Stein Unbiased GrAdient estimator of the Risk (SUGAR) for multiple parameter selection.
SIAM J.Imaging Sci. , 7(4):2448–2487, 2014.[19] Thibaut Divoux, Herve Gayvallet, and Jean-Christophe Geminard. Creepmotion of a granular pile induced by thermal cycling.
Phys. Rev. Lett. ,101:148303, 2008.[20] N. Dobigeon and J.-Y. Tourneret. Joint segmentation of wind speed and di-rection using a hierarchical model.
Comput. Stat. Data Anal. , 51(12):5603–5621, Aug. 2007.[21] N. Dobigeon, J.-Y. Tourneret, and M. Davy. Joint segmentation of piece-wise constant autoregressive processes by using a hierarchical model anda Bayesian sampling approach.
IEEE Trans. Signal Process. , 55(4):1251–1263, Apr. 2007.[22] D. L. Donoho and J. M. Johnstone. Ideal spatial adaptation by waveletshrinkage.
Biometrika, 81(3):425–455, 1994. [23] Yonina C. Eldar. Generalized SURE for exponential families: Applications to regularization. IEEE Trans. Signal Process., 57(2):471–481, 2008. [24] J. Frecon, N. Pustelnik, N. Dobigeon, H. Wendt, and P. Abry. Bayesian selection for the ℓ₂-Potts model regularization parameter: 1-D piecewise constant signal denoising. IEEE Trans. Signal Process., 65(19):5215–5224, Oct. 2017. [25] D. Geman and S. Geman. Bayesian image analysis. In
Disordered systemsand biological organization , pages 301–319. Springer, 1986.[26] D. Geman and G. Reynolds. Constrained image restoration and the recov-ery of discontinuities.
IEEE Trans. Pattern Anal. Match. Int. , 14(3):367–383, 1992.[27] S. Geman and D. Geman. Stochastic relaxation, Gibbs distributions, andthe Bayesian restoration of images. In
Readings in Computer Vision, pages 564–584. Elsevier, 1987. [28] V. Hessel, P. Angeli, A. Gavriilidis, and H. Löwe. Gas-liquid and gas-liquid-solid microstructured reactors: Contacting principles and applications.
In-dustrial & engineering chemistry research , 44:9750–9769, 2005.[29] S. Jaffard. Wavelet techniques in multifractal analysis.
Fractal geometry and applications: A jubilee of Benoit Mandelbrot, M. Lapidus and M. van Frankenhuysen, Eds., Proceedings of Symposia in Pure Mathematics (AMS), 72(2):91–152, 2004. [30] Q. Kang, I. N. Tsimpanogiannis, D. Zhang, and P. C. Lichtner. Numerical modeling of pore-scale phenomena during CO₂ sequestration in oceanic sediments. Fuel Process. Technol., 86:1647–1665, 2005. [31] M. T. Kreutzer, F. Kapteijn, J. A. Moulijn, and J. J. Heiszwolf. Multiphase monolith reactors: Chemical reaction engineering of segmented flow in microchannels.
Chemical Engineering Science , 60:5895–5916, 2005.[32] S. Z. Li.
Markov Random Field modeling in image analysis. Springer, 2009. [33] P. Lobel, C. Pichot, L. Blanc-Féraud, and M. Barlaud. Conjugate gradient algorithm with edge-preserving regularization for image reconstruction from Ipswich data for mystery objects.
IEEE Antennas Propag. Mag. ,39(2):12–13, 1997.[34] Y. Marnissi, E. Chouzenoux, A. Benazza-Benyahia, and J.-C. Pesquet. Anauxiliary variable method for MCMC algorithms in high dimension.
En-tropy , 20(110):35p., 2018.[35] C. Marone. Laboratory-derived friction laws and their application to seis-mic faulting.
Ann. Rev. Earth Planet. Sci. , 26:643–696, 1998.[36] J. Møller.
Spatial statistics and computational methods . Lecture Notes inStatistics, Springer, 2003.[37] J. Nocedal and S. Wright.
Numerical optimization . Springer Science &Business Media, 2006.[38] N. Parikh and S. Boyd. Proximal algorithms.
Foundations and Trends® in Optimization, 1(3):127–239, 2014. [39] B. Pascal, N. Pustelnik, and P. Abry. How joint fractal features estimation and texture segmentation can be cast into a strongly convex optimization problem? Preprint arXiv:1910.05246, 2019. [40] B. Pascal, N. Pustelnik, P. Abry, M. Serres, and V. Vidal. Joint estimation of local variance and local regularity for texture segmentation. Application to multiphase flow characterization. In
Proc. Int. Conf. Image Process. ,pages 2092–2096, Athens, Greece, 2018. IEEE.[41] B. Pascal, S. Vaiter, N. Pustelnik, and P. Abry. Automated data-drivenselection of the hyperparameters for total-variation based texture segmen-tation.
Preprint arXiv:2004.09434 , 2020.[42] M. Pereyra, N. Dobigeon, H. Batatia, and J.-Y. Tourneret. Estimating thegranularity coefficient of a Potts-Markov random field within an MCMCalgorithm.
IEEE Trans. Image Process. , 22(6):2385–2397, June 2013.[43] N. Pustelnik, A. Benazza-Benhayia, Y. Zheng, and J.-C. Pesquet. Wavelet-based image deconvolution and reconstruction.
Wiley Encyclopedia of Elec-trical and Electronics Engineering , Feb. 2016.[44] S. Ramani, T. Blu, and M. Unser. Monte-carlo SURE: A black-box op-timization of regularization parameters for general denoising algorithms.
IEEE Trans. Image Process. , 17(9):1540–1554, 2008.[45] K. R Reddy and J. A Adams. Effects of soil heterogeneity on airflowpatterns and hydrocarbon removal during in situ air sparging.
J. Geotech.Geoenviron. Eng. , 127(3):234–247, 2001.[46] L. I. Rudin, S. Osher, and E. Fatemi. Nonlinear total variation based noiseremoval algorithms.
Physica D: Nonlinear Phenomena, 60(1-4):259–268, 1992. [47] M. Serres. Étude hydrodynamique d'un écoulement gaz-liquide dans un milieu poreux confiné [Hydrodynamic study of a gas-liquid flow in a confined porous medium]. PhD Thesis, École Normale Supérieure de Lyon, Université de Lyon, page 203, 2017. [48] M. Serres, T. Maison, R. Philippe, and V. Vidal. A phenomenological model for bubble coalescence in confined highly porous media.
Int. J. Multiph.Flow , 105:134–141, 2018.[49] M. Serres, M.-L. Zanota, R. Philippe, and V. Vidal. On the stability ofTaylor bubbles inside a confined highly porous medium.
Int. J. Multiph.Flow , 85:157–163, 2016.[50] C. M. Stein. Estimation of the mean of a multivariate normal distribution.
Ann. Stat. , pages 1135–1151, 1981.[51] M. Storath, A. Weinmann, J. Frikel, and M. Unser. Joint image recon-struction and segmentation using the Potts model.
Inverse Problems, 31(2):025003, 2015. [52] C. Vacar and J.-F. Giovannelli. Unsupervised joint deconvolution and segmentation method for textured images: A Bayesian approach and an advanced sampling algorithm.
EURASIP J. Adv. Signal Process., special issue on Advanced Computational Methods for Bayesian Signal Processing, (17), March 2019. [53] H. Wendt, P. Abry, and S. Jaffard. Bootstrap for empirical multifractal analysis. IEEE Signal Process. Mag., 24(4):38–48, 2007. [54] H. Wendt, S. G. Roux, P. Abry, and S. Jaffard. Wavelet leaders and bootstrap for multifractal analysis of images. Signal Process., 89(6):1100–1114, 2009.

Figure 4: Porous media multiphase flow texture segmentation based on fractal features. Comparisons between the different approaches summarized in Table 2. Panels: (a) flow image, (b) linear regression, (c) T-ROF-Id, (d) ROF-Id, (e) T-ROF-S, (f) ROF-S, (g) T-Joint, (h) Joint, (i) T-Coupled, (j) Coupled.

Figure 5: Porous media multiphase flow texture segmentation based on fractal features. Comparisons between the different approaches summarized in Table 2. Zoom on the area marked by the black rectangle in Figure 4(a). Panels as in Figure 4, with (a) the zoomed flow image.

Figure 6: Grid search strategy to minimize SURE for the segmentation of a zoomed flow image. Panels: (a) ROF-Id, (b) ROF-S, (c) Joint, (d) Coupled.