Parameter-free and fast nonlinear piecewise filtering. Application to experimental physics
Barbara Pascal, Nelly Pustelnik, Patrice Abry, Jean-Christophe Géminard, Valérie Vidal
May 2020
Signals or images collected from numerous experiments in physics can be, at least as a first-order approximation, described as piecewise homogeneous (piecewise constant, piecewise linear, . . . ). Detecting and estimating such piecewise homogeneous regions thus constitutes a crucial goal in extracting the physically relevant information conveyed by such data. This remains, however, often challenging, as signals or images are usually altered by superimposed noise, possibly with low signal-to-noise ratio, which may hinder the interpretation of the corresponding experiments. The joint need to denoise data while preserving the edges and discontinuities pertaining to phase changes or region boundaries often precludes the use of classical linear filtering and calls for advanced nonlinear signal and image processing techniques.

Solid friction provides a first representative example. When two nominally flat solid surfaces in contact are forced to slide against one another, the shear force at the contact surface generally exhibits a characteristic tooth-shaped signal (Fig. 1a): the force signal consists of an alternation of slow linear rises, corresponding to the loading of elastic energy in the driving system while the surfaces in contact do not move with respect to each other, followed by sudden drops, corresponding to fast energy releases when the surfaces slide [3]. When the solids are strongly pressed against each other, these two phases (rest and sliding at the contact surface) can easily be identified. However, high confinement pressures tend to damage the surfaces, a major limitation in the study of the microscopic mechanisms at play. Probing frictional material properties effectively and accurately therefore requires that experiments be performed at low confinement pressure.
This, however, induces that collected signals have low to very low signal-to-noise ratios [13], thus requiring advanced non

∗ Work supported by Defi Imag'in SIROCCO and by ANR-16-CE33-0020 MultiFracs, France.
† Univ Lyon, ENS de Lyon, Univ Lyon 1, CNRS, Laboratoire de Physique, F-69342 Lyon, France ([email protected]).

(a) Stick-slip: piecewise linear signal. (b) Multiphase flow: piecewise homogeneous texture.

Figure 1:
Experimental data in nonlinear physics. (a): Normalized driving force F∗ = kΔx/(mg) [see Section 4.1] as a function of time t in a solid friction experiment. (b): Direct image of a gas bubble in porous media multiphase flows (gas corresponds to the most scrambled region, whereas liquid corresponds to smoother regions).

linear filtering signal processing techniques to detect and analyse the piecewise linear shape of the data.

Multiphase flows in porous media constitute another rich example. Hydrodynamics in porous media, notably mass transfer, is of prominent practical relevance in industry, e.g., for catalytic process studies. Hydrodynamics and mass transfer studies of multiphase flows in porous media traditionally involve packed beds and are well characterized. However, recent experiments [40, 48] consider innovative materials such as open-cell solid foams which, due to their high porosity and the resulting low pressure loss, are promising for industrial applications. In such experiments, a liquid and a gas are forced to flow simultaneously through the foam, and the characterization of the multiphase spatiotemporal dynamics stems from image analysis (Fig. 1b). The challenge here is to discriminate liquid flows from gas bubbles. The rationale is that each phase can be associated with a homogeneous texture in the images, and a crucial stake consists in identifying gas bubble contours precisely so as to measure their lengths. Liquid and gas are both transparent and the foam itself introduces a scrambled background, yielding low-contrast and blurred images, thus requiring advanced nonlinear image processing techniques for texture segmentation and contour estimation.

These two emblematic examples share in common that the key aspects of the physics to be understood are driven by piecewise homogeneous phases. On the one hand, studying friction requires identifying the stick and the slip phases, each associated with a piecewise linear signal.
On the other hand, studying multiphase flows implies detecting fluid phases, each associated with a piecewise homogeneous texture. Piecewise homogeneous signals or images are very common in numerous fields of nonlinear physics, very different in nature, such as time reversal of the magnetic field in turbulent dynamo [6], on-off intermittency in creeping granular matter [19], DNA detection during translocation through nuclear pores [1], . . .

The present work proposes a generic nonlinear signal/image filtering unified framework for the analysis of piecewise homogeneous (piecewise-constant, piecewise-linear) experimental datasets. The major challenges here are both to obtain fast algorithms, so as to handle the large amount of data that needs to be analyzed to yield accurate and relevant conclusions (e.g., in producing a phase diagram or in analyzing video frames of large-size images), and to perform an automated and data-driven tuning of the hyperparameters that unavoidably enter any nonlinear filtering procedure, and whose arbitrary selection (by expert visual inspection) might have drastic impacts on the achieved outcomes and thus on the physical interpretations drawn a posteriori.

The unified signal/image nonlinear filtering framework proposed here is based on proximal schemes [4, 15] to obtain fast algorithms, and on the Stein unbiased estimator framework to design an automated data-driven hyperparameter tuning procedure.

Section 2 is dedicated to the formulation of this framework as an inverse problem, and recalls state-of-the-art strategies with a focus on piecewise constant or linear estimation in both signals and images. Section 3 details the proposed algorithmic framework.
Section 4 illustrates the performance on the two examples discussed above, solid friction and multiphase flows.

A documented toolbox (in Matlab), implementing this signal/image nonlinear filtering and data-driven hyperparameter tuning, is freely available at https://github.com/bpascal-fr/stein-piecewise-filtering.

Let S = {n = (n₁, n₂) : 1 ≤ n₁ ≤ N₁, 1 ≤ n₂ ≤ N₂} denote a lattice, supporting x = (xₙ)_{n∈S}, the unknown signal/image of size N = N₁ × N₂ (N₂ = 1 for univariate 1D signal analysis, and N₂ = K for multivariate 1D signal analysis with K components). The observation z = (z_{m₁,m₂})_{1≤m₁≤M₁, 1≤m₂≤M₂}, of size M = M₁ × M₂, consists of a degraded version of x, which stems both from a linear degradation (e.g., filtering), denoted A ∈ R^{M×N}, and from additive random noise, denoted B. Handling an inverse problem relies first on an accurate design of the observation (or direct) model, which here takes the following form:

z = B(Ax).

In this work, S is endowed with a homogeneous neighborhood system. For instance, when considering a 1D signal, a site n ∈ {2, . . . , N − 1} has two nearest neighbors, N_n = {n − 1, n + 1}. In a general regular rectangular lattice S and for a 4-neighborhood system, every interior point has four neighbors, which yields N_n = {(n₁ − 1, n₂), (n₁ + 1, n₂), (n₁, n₂ − 1), (n₁, n₂ + 1)}. The pair (S, E) constitutes a graph, where S contains the nodes and E determines the links between the nodes according to the neighboring relationship.

We detail this direct model on the two nonlinear physics problems described in the Introduction (low confinement solid friction and porous media multiphase flows) and illustrated in Figure 1. For solid friction, the challenging question consists in denoising the observation z (Figure 1(a)), with A = Id and in the presence of additive impulsive noise. For multiphase flows, the challenging issue pertains to the segmentation of textures such as the one in Figure 1(b).
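As a minimal numerical illustration of the direct model z = B(Ax), the following Python sketch degrades a piecewise-constant signal; the moving-average blur standing in for A and the noise level are illustrative choices of ours, not values taken from the experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 256
# Piecewise-constant ground truth x, with a single jump.
x = np.concatenate([np.zeros(N // 2), np.ones(N - N // 2)])

# Linear degradation A: a moving-average blur, implemented as a circular
# convolution (hence a circulant matrix, diagonalized by the Fourier basis).
kernel = np.ones(5) / 5.0

def A(v):
    return np.real(np.fft.ifft(np.fft.fft(v) * np.fft.fft(kernel, len(v))))

sigma = 0.1                                  # assumed noise standard deviation
z = A(x) + sigma * rng.standard_normal(N)    # observation z = B(Ax)
```

For the friction signals, A = Id and the noise is impulsive; for the multiphase-flow images, the observation is instead obtained through a nonlinear multiscale transform, as discussed next.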
In such a case, the information x refers to piecewise constant scale-free texture features, and the observation z is obtained from a nonlinear multiscale transform (cf. Section 4.2). Noise is assumed additive and Gaussian, with spatial and multiscale correlations.

Solving an inverse problem consists in providing an estimator x̂, as close as possible to the information x. This has been largely addressed in the literature, and we first propose a brief overview of the main inverse-problem-solving streams (see also [8, 43]), before focusing on the specific assumptions on the model required to design parameter-free and fast nonlinear piecewise filtering.

Bayesian arguments and most standard models – On the one hand, Markov Random Fields (MRF) have been introduced in visual labelling to establish probabilistic distributions of interacting labels, aiming to analyze dependencies of physical phenomena [32]. In such a formalism, x and z are considered as realizations of random vectors X and Z defined on the set S. X is said to be an MRF on S with respect to a neighborhood system E if and only if positivity (i.e., P(X = x) >
0) and Markovianity (P(x_n | x_{S−n}) = P(x_n | x_{N_n})), which models the local characteristics of X, are satisfied. Other properties, such as homogeneity and isotropy, can also be considered.

The link between the MRF, characterized by its local properties, and another standard random field, the Gibbs random field, characterized by global properties, has been provided by Hammersley and Clifford [36, Theorem 1]. We recall that a Gibbs distribution relative to the graph (S, E) is a probability measure with the following representation:

P(ω) = (1/C) exp(−U(ω)/T)     (1)

where C is the normalizing constant, called the partition function, such that C = Σ_ω exp(−U(ω)/T), and T stands for the temperature, which controls the sharpness of the distribution. A high temperature leads to all configurations being equally distributed, while a temperature close to 0 concentrates the distribution around the global energy minima. U(ω) denotes the energy function and P(ω) measures the probability of occurrence of a specific configuration ω. The more probable configurations are those with the lower energies. The terminology comes from statistical physics, where such measures are equilibrium states for physical systems (e.g., ferromagnets). U(ω) can be formulated with contributions from external fields (i.e., x_{n₁,n₂}) and pair interactions (e.g., x_{n₁,n₂} x_{n₁+1,n₂}). For instance, the Ising model reads

U(x) = −α Σ_{n₁,n₂} x_{n₁,n₂} − β ( Σ_{n₁,n₂} x_{n₁,n₂} x_{n₁+1,n₂} + Σ_{n₁,n₂} x_{n₁,n₂} x_{n₁,n₂+1} )     (2)

considering ω = x and for some parameters α ∈ R and β >
0, which measure the external magnetic moment and the bonding strengths.

Geman and Geman [27] handle the maximization of the conditional probability distribution of (x, e) ∈ {S, E} given the data z (i.e., finding the mode of the posterior distribution), which is known as maximum a posteriori estimation or penalized maximum likelihood. In [25–27], the authors prove that the posterior is a Gibbs distribution over {S, E} with energy function

U(x, e) = (1/(2σ²)) ‖Ax − z‖² + β Σ_{(n,n′): n′∈N_n} ϕ(x_n − x_{n′})(1 − e_{n,n′}) + α ψ(e)     (3)

so that ω = (x, e) when B designates an additive zero-mean Gaussian noise with variance σ². Here e ∈ E denotes the coded line states, and ϕ(η) = 0 if η = 0, and 1 if η ≠ 0. The first term acts as a data-fidelity term and forces the approximation x to be close to z, the second term allows small variations of x except at the locations where e_{n,n′} = 1, and ψ is constructed to organize the line process. Finally, α and β are positive weighting parameters. For specific choices of ϕ and ψ [26, 33], an alternative equivalent formulation is the Blake–Zisserman model, formulated as

U(x) = (1/(2σ²)) ‖Ax − z‖² + λ Σ_{(n,n′): n′∈N_n} min((x_n − x_{n′})², η²)     (4)

where λ, η >
0, leading to the so-called truncated ℓ₂ penalization, whose interest is to favor piecewise smooth solutions. Another very close model is the Potts model, which can be interpreted as an ℓ₀ penalization of x_n − x_{n′}, designed to provide piecewise-constant estimates [51]. For numerical reasons detailed below, the most standard convex relaxation is the anisotropic total-variation penalization, which reads [12, 46]:

U(x) = (1/(2σ²)) ‖Ax − z‖² + λ Σ_{(n,n′): n′∈N_n} |x_n − x_{n′}|.     (5)

Solving inverse problems – Once an energy (or functional) has been designed to fit the considered problem, numerical strategies have to be devised to implement both the hyperparameter selection and the computation of the minimizing solution x, also corresponding to the most probable ω or to moments of P.

On one hand, Markov Chain Monte Carlo (MCMC) algorithms address simulation from a probability distribution P. The function P can be written in a closed-form expression, but the objective is generally to access the moments of P, which are not computable analytically. The two main techniques used in MCMC are Metropolis–Hastings, which relies on an accept/reject mechanism, and the Gibbs sampler, which simplifies the high-dimensional problem by successively simulating from smaller-dimensional components. The main limitation of these techniques is that they are computationally intensive for solving large-size inverse problems (see a contrario [34, 52]).
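The Gibbs distribution (1) with the Ising energy (2) can be simulated with a minimal Gibbs sampler, sketched below in Python at temperature T = 1 with free boundary conditions; the lattice size and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def gibbs_ising(shape, alpha, beta, n_sweeps):
    """Gibbs sampler for the Ising model (2) at T = 1: each site is resampled
    from its conditional distribution given its nearest neighbors."""
    x = rng.choice([-1, 1], size=shape)
    n1, n2 = shape
    for _ in range(n_sweeps):
        for i in range(n1):
            for j in range(n2):
                s = 0.0  # sum over the available nearest neighbors
                if i > 0:
                    s += x[i - 1, j]
                if i < n1 - 1:
                    s += x[i + 1, j]
                if j > 0:
                    s += x[i, j - 1]
                if j < n2 - 1:
                    s += x[i, j + 1]
                # Conditional induced by (1)-(2): P(x_ij = +1 | neighbors)
                p_plus = 1.0 / (1.0 + np.exp(-2.0 * (alpha + beta * s)))
                x[i, j] = 1 if rng.random() < p_plus else -1
    return x

sample = gibbs_ising((20, 20), alpha=1.0, beta=0.0, n_sweeps=5)
```

Each site update draws x_{n₁,n₂} = +1 with probability 1/(1 + e^{−2(α+βs)}), where s is the sum of the nearest-neighbor spins, which is exactly the conditional distribution of one site under (1)-(2).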
We should also mention some specific configurations where a closed-form expression is available, such as for the Ising model in 1D and 2D, but which are not adapted to the general inverse problem solving considered in this work.

When one wants to estimate jointly the maximum a posteriori and its hyperparameters, Bayesian hierarchical inference frameworks are particularly adapted, and they have received considerable interest for addressing change-point detection or piecewise denoising problems [20, 21, 42] or texture segmentation [52]. However, to the best of our knowledge, for the unified 1D-2D framework considered in this work, such a general efficient strategy has not yet been designed.

On the other hand, during the last twenty years, important research efforts have been dedicated to convex but non-smooth energies, generally formulated as a sum of two or three terms: a data-fidelity term, a penalization, and a constraint [4, 14, 16]. This framework is thus especially adapted when dealing with an energy such as (5). These algorithmic strategies are particularly efficient when A has full rank, which is rarely the case in standard restoration/reconstruction problems, but is more often encountered in experimental physics processing, either for denoising (i.e., A = Id, as in friction experiments) or for texture reconstruction, especially adapted to the study of multiphase flow dynamics, as described later. However, when one uses such an optimization strategy to find the minimizer of the energy U, the automated selection of the regularization parameter(s) is not addressed. For automated selection, one could consider either an empirical rule that consists in setting λ ∼ √N σ/
4, with N the signal size and σ the noise standard deviation, estimated, e.g., from the median of the absolute values of the wavelet coefficients [22]; or a hybrid Bayesian hierarchical inference framework combined with an ℓ₁-minimization strategy [24] in the specific case of piecewise-constant denoising; or the recourse to the Stein Unbiased Risk Estimator (SURE), which provides an unbiased estimator of the mean square error [5, 18, 44]. Our contribution focuses on the latter strategy, its implementability, and its applicability to real physics experiments.

In this work, we consider an estimator x̂(z; Λ) of the quantity of interest x̄, from a corrupted observation z, parametrized by Λ. The estimate is obtained from the minimization of an energy, inspired from (5), and defined as:

x̂(z; Λ) ∈ Argmin_{x∈R^N} ‖Ax − z‖² + ‖D_Λ x‖,     (6)

where the matrix D_Λ models a weighted discrete differentiation operator, so that the penalization ‖D_Λ x‖ enforces some regularity of the estimate x̂(z; Λ). Specific instances of (6) are provided:

• To favor joint piecewise constancy of K multivariate signals, the regularization parameters are stored in a vector Λ = (λ₁, . . . , λ_K) ∈ (R*₊)^K, and the operator D_Λ is a first-order differentiation operator acting componentwise, also called the discrete gradient, writing, for k ∈ {1, . . . , K} and n ∈ {1, . . . , N − 1},

(D_Λ x)_{k,n} = λ_k (x_{k,n+1} − x_{k,n})     (7)

and where ‖·‖ is the mixed ℓ₂,₁-norm

‖D_Λ x‖₂,₁ = Σ_{n=1}^{N−1} √( Σ_{k=1}^{K} (D_Λ x)²_{k,n} ).     (8)

• Enforcing joint piecewise linearity requires a second-order differentiation operator, named the discrete Laplacian, defined, for k ∈ {1, . . . , K} and n ∈ {2, . . . , N − 1}, as

(D_Λ x)_{k,n} = λ_k (x_{k,n+1} − 2 x_{k,n} + x_{k,n−1})     (9)

composed with the ℓ₂,₁-norm defined in (8).
• Image segmentation is performed by imposing a piecewise constancy prior, using a two-dimensional difference operator. For an image x ∈ R^{N₁×N₂}, horizontal and vertical gradients are computed at each pixel, for 1 ≤ n₁ ≤ N₁ − 1 and 1 ≤ n₂ ≤ N₂ − 1,

(D_Λ x)_{n₁,n₂} = λ ( (D₁x)_{n₁,n₂} ; (D₂x)_{n₁,n₂} ) = λ ( x_{n₁,n₂+1} − x_{n₁,n₂} ; x_{n₁+1,n₂} − x_{n₁,n₂} )     (10)

and coupled into an ℓ₂,₁-norm

‖D_Λ x‖ = λ Σ_{n₁=1}^{N₁−1} Σ_{n₂=1}^{N₂−1} √( (D₁x)²_{n₁,n₂} + (D₂x)²_{n₁,n₂} ) := λ TV(x).     (11)

The above penalization is known as the isotropic Total Variation [46].

The estimate x̂(z; Λ) is the result of a trade-off between the fidelity to the observation model and some regularity priors, the balance being tuned by the hyperparameter Λ. Hence, our purpose is twofold. First, solving the minimization problem (6), that is, for a fixed hyperparameter Λ and a given observation z, computing the minimizer x̂(z; Λ) of (6). Second, finding the best hyperparameter Λ†, minimizing the quadratic error E{‖x̂(z; Λ) − x̄‖²}, i.e., Problem 1.
Find

Λ† = argmin_{Λ∈(R*₊)^K} E{‖x̂(z; Λ) − x̄‖²}     (12)

where x̂(z; Λ) is defined by (6) and the expectation is taken over all realizations of the noise B corrupting the observation z = B(Ax̄).

In the next sections, we specify the assumptions on A and B allowing us to derive a fast algorithmic scheme to estimate Λ†.

3.2 Convex non-smooth minimization

The objective function appearing in Problem (6) is convex, since the composition of a linear operator and a norm is convex. Yet, because of the presence of the norm ‖·‖, it is non-smooth. Consequently, the minimization of (6) requires proximal algorithms [4, 14, 38], which in general suffer from low convergence rates. Nevertheless, provided that the operator A is injective, it is possible to design accelerated primal-dual schemes [12] with improved convergence rates toward the minimizer of (6). Such algorithms rely on proximity operators [38], whose definition is recalled in Definition 1. Further, having closed-form expressions for the proximity operators of the data-fidelity term and of the penalization is a key element in designing fast implementations of primal-dual algorithms.

Definition 1.
For a convex, lower semi-continuous function ϕ : R^N → R ∪ {+∞} and τ >
0, the proximity operator is defined as

(∀x ∈ R^N)  prox_{τϕ}(x) = argmin_{x̃} ½‖x̃ − x‖² + τ ϕ(x̃)     (13)

where ‖·‖ denotes the Euclidean norm on R^N.

A few examples of well-established closed-form expressions for proximity operators of interest in this work are now recalled.

Example 1.
The proximity operator of the data-fidelity term ‖A· − z‖² has a closed-form expression, that is, for every τ > 0,

(∀x ∈ R^N)  prox_{τ‖A·−z‖²}(x) = (Id + 2τA⊤A)⁻¹ (x + 2τA⊤z).     (14)

Example 2.
The proximity operator of the multivariate 1D ℓ₂,₁-norm defined in (8) is, for every y ∈ R^{K×N},

(prox_{τ‖·‖₂,₁}(y))_{k,n} = (1 − τ/‖y_{·,n}‖) y_{k,n}  if ‖y_{·,n}‖ > τ,  and 0 otherwise,     (15)

where ‖y_{·,n}‖ := √( Σ_{k=1}^{K} y²_{k,n} ).

Example 3.
The proximity operator of the 2D ℓ₂,₁-norm of Equation (11) is, for y = (y^(H), y^(V)) ∈ R^{2×N₁×N₂},

(prox_{τ‖·‖₂,₁}(y))_{n₁,n₂} = (1 − τ/‖y_{n₁,n₂}‖) y_{n₁,n₂}  if ‖y_{n₁,n₂}‖ > τ,  and 0 otherwise,     (16)

where ‖y_{n₁,n₂}‖ := √( (y^(H)_{n₁,n₂})² + (y^(V)_{n₁,n₂})² ).

Theorem 1.
Assuming that the deformation operator A is injective, and denoting by µ > 0 the smallest eigenvalue of A⊤A, the sequence (x^[t])_{t∈N} defined in Algorithm 1 converges toward the solution x̂(z; Λ) of

minimize_{x∈R^N} ‖Ax − z‖² + ‖D_Λ x‖.     (17)

Further, it has been proven in [12] that, for any ε > 0, there exists t₀ such that

(∀t ≥ t₀)  ‖x̂(z; Λ) − x^[t]‖² ≤ (ε/t²) ( ‖x̂(z; Λ) − x^[0]‖²/(µ²τ₀²) + ‖A‖² ‖ŷ(z; Λ) − y^[0]‖²/µ² )     (18)

where ŷ(z; Λ) denotes the solution of the dual problem of Problem (17). Hence, the convergence rate of the iterates (x^[t])_{t∈N} scales as O(1/t²).

Proof. This theorem is a direct application of [12, Theorem 2], stated and demonstrated for the minimization of objective functionals defined as the sum of a µ-strongly convex data-fidelity term and a convex, proper, lower semi-continuous penalization, which is the case here. Indeed, thanks to the assumption that A is full-rank, the considered data-fidelity term ‖Ax − z‖² is µ-strongly convex with modulus µ = 2 min Sp(A⊤A) >
0. Further, the penalization, being the composition of a linear operator and a norm, satisfies the aforementioned conditions. Then, the primal-dual updates of Algorithm 1 correspond to the customization of Algorithm 2 of [12] to the problem of finding the solution x̂(z; Λ) of (6), hence corresponding to G(x) = ‖Ax − z‖², the linear operator K = D_Λ, and F = ‖·‖₂,₁.

We note that, because of the operator (Id + 2τA⊤A)⁻¹, the proximity operator of the data term might be uneasy to evaluate. However, for numerous configurations, this expression has a closed form: first, when A = Id; second, when A is diagonalizable in a specific basis, such as the Fourier basis for circulant matrices (leading to O(MN) operations). Another specific choice of A will be discussed in Section 4.2.

Once an efficient algorithmic strategy has been identified to estimate x̂(z; Λ), the second major difficulty raised by Problem 1 is that, in practice, one does not have access to the true signal/image x̄, and hence cannot compute E{‖x̂(z; Λ) − x̄‖²}. To handle this limitation, Stein proposed an Unbiased Risk Estimator [50], denoted SURE(Λ), providing a usable approximation of the quadratic risk in the case of i.i.d. Gaussian noise. This estimator was later widely extended to more general noise models [23, 41].

Theorem 2 (Stein Unbiased Risk Estimate). We denote by x̂(z; Λ) the parametric estimator defined in (17) of the ground truth x̄ from the observation z = B(Ax̄), corrupted by a full-rank deformation operator A and additive (possibly correlated) Gaussian noise B, with covariance matrix S. The Stein Unbiased Risk Estimate, defined as

SURE(Λ) := ‖
Φ(Ax̂(z; Λ) − z)‖² + 2 Tr( S Φ⊤ ∂x̂(z; Λ)/∂z ) − Tr(Φ S Φ⊤),     (19)

satisfies the following unbiasedness property:

E{SURE(Λ)} = E{‖x̂(z; Λ) − x̄‖²},     (20)

where Φ := (A⊤A)⁻¹A⊤ and ∂x̂(z; Λ)/∂z denotes the Jacobian of x̂(z; Λ) with respect to the observation z.

Proof. A complete and detailed proof was proposed in [41].
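The mechanics of Theorem 2 can be checked on the simplest textbook case, A = Id (so Φ = Id) with i.i.d. noise, S = σ²Id, taking soft-thresholding as the estimator x̂ (our illustrative choice, not the estimator (6)); the Jacobian trace is then simply the number of entries above the threshold.

```python
import numpy as np

def soft_threshold(z, tau):
    """Componentwise soft-thresholding with threshold tau."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def sure_soft_threshold(z, tau, sigma):
    """SURE (19) specialized to A = Id, Phi = Id, S = sigma^2 Id:
    ||x_hat - z||^2 + 2 sigma^2 Tr(d x_hat / d z) - N sigma^2,
    where, for soft-thresholding, Tr(d x_hat / d z) = #{n : |z_n| > tau}."""
    x_hat = soft_threshold(z, tau)
    dof = np.count_nonzero(np.abs(z) > tau)
    return np.sum((x_hat - z) ** 2) + 2 * sigma ** 2 * dof - z.size * sigma ** 2
```

For z = (3, 0.5, −2), τ = 1 and σ = 1, the estimate is (2, 0, −1), two coefficients survive, and the three terms of (19) can be read off directly.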
Definition 2 (Degrees of freedom). The second term in the definition of SURE(Λ) in Equation (19),

2 Tr( S Φ⊤ ∂x̂(z; Λ)/∂z ),     (21)

is called the degrees of freedom.

Since Φ ∈ R^{N×M} and ∂x̂(z; Λ)/∂z ∈ R^{N×M} are large-size matrices, computing the trace of Φ⊤ ∂x̂(z; Λ)/∂z is very expensive, and hence the evaluation of the degrees of freedom requires additional tools. This difficulty is overcome using a Monte Carlo (MC) strategy, only requiring the evaluation of the Jacobian on a random vector δ ∈ R^M. Hence, it is only necessary to store a vector of size N, instead of manipulating a matrix of size N × M, which drastically decreases the computational and memory costs. Moreover, when x̂(z; Λ) is obtained from a minimization scheme, there is often no closed-form expression for the Jacobian; hence we approximate (∂x̂(z; Λ)/∂z)[δ] using a Finite Difference (FD) approximation. Altogether, the Monte Carlo and Finite Difference strategies lead to the following FDMC Stein Unbiased Risk Estimate.

Theorem 3 (Finite Difference Monte Carlo SURE). Let x̂(z; Λ) denote a parametric estimator of the ground truth x̄ from the observation z = B(Ax̄), corrupted by a full-rank deformation operator A and additive (possibly correlated) Gaussian noise B, with covariance matrix S, and let Φ = (A⊤A)⁻¹A⊤. The FDMC Stein Unbiased Risk Estimate is defined as

SURE_{ε,δ}(Λ) := ‖
Φ(Ax̂(z; Λ) − z)‖² + (2/ε) ⟨ Φ⊤( x̂(z + εδ; Λ) − x̂(z; Λ) ), Sδ ⟩ − Tr(Φ S Φ⊤).     (22)

Provided that x̂(z; Λ) is uniformly Lipschitz w.r.t. the observation z and integrable against a Gaussian density, SURE_{ε,δ}(Λ) is an asymptotically unbiased estimator of the quadratic risk, i.e.,

lim_{ε→0} E{SURE_{ε,δ}(Λ)} = E{‖x̂(z; Λ) − x̄‖²},     (23)

where the expectation on the left-hand side is taken over both the noise B and the Monte Carlo vector δ ∼ N(0, Id).

Proof.
The proof directly follows from Theorem 2 in [41].

Thus, Problem 1 is replaced by
Problem 2.
Find Λ̂† = argmin_{Λ∈(R*₊)^K} SURE_{ε,δ}(Λ).
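Before turning to minimization strategies, the Finite Difference Monte Carlo approximation of the degrees of freedom (21) can be sanity-checked on the soft-thresholding toy case (again an illustrative choice with A = Id and S = σ²Id, where the exact Jacobian trace is known and can be compared to the FD/MC estimate).

```python
import numpy as np

rng = np.random.default_rng(1)

def soft_threshold(z, tau):
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def trace_fdmc(estimator, z, eps, delta):
    """Finite-difference Monte Carlo estimate of Tr(d x_hat / d z):
    <(x_hat(z + eps*delta) - x_hat(z)) / eps, delta>, with delta ~ N(0, Id)."""
    return np.dot(estimator(z + eps * delta) - estimator(z), delta) / eps

tau = 1.0
z = 2.0 * rng.standard_normal(10_000)
delta = rng.standard_normal(z.size)

est = trace_fdmc(lambda v: soft_threshold(v, tau), z, eps=1e-3, delta=delta)
exact = np.count_nonzero(np.abs(z) > tau)   # exact trace for soft-thresholding
```

With 10⁴ samples, a single Monte Carlo vector δ already yields a trace estimate within a few percent of the exact count, while only ever storing vectors, never the full Jacobian.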
In order to solve Problem 2, two strategies can be considered. The first is a grid search: computing SURE_{ε,δ}(Λ) over a large range of hyperparameter values, corresponding to a discrete set Λ = (Λᵢ)ᵢ₌₁ᴵ, and selecting a posteriori the hyperparameter of the grid for which SURE_{ε,δ}(Λᵢ) is minimal, denoted Λ̂_G, as described in Algorithm 2. The major drawback of this approach is its computational cost, which increases algebraically with the dimension of the hyperparameter Λ.

Algorithm 1
Accelerated primal dual scheme for minimization of (6).
Require:
Set ε > 0, δ ∈ R^M, τ₀ > 0, σ₀ > 0, such that τ₀σ₀‖D_Λ‖² < 1.
for z̃ ∈ {z, z + εδ} do
    Choose x̃^[0] ∈ R^N, x^[0] ∈ R^N, y^[0] = D_Λ x^[0]
    for t = 0 to T_max − 1 do    {Primal-dual updates}
        y^[t+1] = prox_{σ_t(‖·‖)*}( y^[t] + σ_t D_Λ x̃^[t] )
        x^[t+1] = prox_{τ_t‖A·−z̃‖²}( x^[t] − τ_t D_Λ⊤ y^[t+1] )
        ϑ_t = 1/√(1 + 2µτ_t), τ_{t+1} = ϑ_t τ_t and σ_{t+1} = σ_t/ϑ_t
        x̃^[t+1] = x^[t+1] + ϑ_t( x^[t+1] − x^[t] )
    end for
    x̂(z̃; Λ) ← x^[T_max]
end for
Compute SURE_{ε,δ}(Λ) by injecting x̂(z; Λ) and x̂(z + εδ; Λ) into Formula (22)
return SURE_{ε,δ}(Λ)
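As a concrete instance of the primal-dual updates of Algorithm 1, the Python sketch below solves the inner minimization (6) for 1D total-variation denoising, i.e., A = Id with the first-order difference operator (7) (the weight λ is kept outside D here, which is equivalent). It implements only the minimization for a single observation, without the SURE evaluation; step sizes and the strong-convexity modulus µ = 2 follow from the fidelity ‖x − z‖².

```python
import numpy as np

def D(x):
    """Discrete gradient (7) of a 1D signal (unit weight)."""
    return np.diff(x)

def Dt(y):
    """Adjoint of D."""
    out = np.zeros(len(y) + 1)
    out[:-1] -= y
    out[1:] += y
    return out

def tv_denoise(z, lam, n_iter=500):
    """Accelerated primal-dual scheme for
    min_x ||x - z||^2 + lam * sum_n |x[n+1] - x[n]|   (A = Id)."""
    x = z.copy()
    x_bar = x.copy()
    y = np.zeros(len(z) - 1)
    tau, sigma = 0.99 / 2.0, 0.99 / 2.0   # tau * sigma * ||D||^2 < 1 (||D||^2 <= 4)
    mu = 2.0                              # strong-convexity modulus of ||x - z||^2
    for _ in range(n_iter):
        # Dual ascent, then projection onto {||y||_inf <= lam} (prox of the
        # conjugate of lam * l1-norm).
        y = np.clip(y + sigma * D(x_bar), -lam, lam)
        # Primal descent: prox of tau * ||. - z||^2, cf. Example 1 with A = Id.
        x_new = (x - tau * Dt(y) + 2.0 * tau * z) / (1.0 + 2.0 * tau)
        # Acceleration of the step sizes and extrapolation.
        theta = 1.0 / np.sqrt(1.0 + 2.0 * mu * tau)
        tau, sigma = theta * tau, sigma / theta
        x_bar = x_new + theta * (x_new - x)
        x = x_new
    return x
```

On a two-level step signal the minimizer is known in closed form (each plateau is shrunk toward the other by λ/(2n), with n the plateau length), which gives a direct check of convergence.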
Algorithm 2
Grid search for SURE minimization.
Require:
Grid Λ = (Λᵢ)ᵢ₌₁ᴵ, ε > 0, δ ∼ N(0, Id) ∈ R^M.
for i = 1 to I do
    ERROR(i) ← SURE_{ε,δ}(Λᵢ), computed from Algorithm 1
end for
î_G ← argmin_{1≤i≤I} ERROR(i)
return Λ̂_G = Λ_{î_G}

In order to provide faster implementations, we consider an automated selection of the hyperparameters. To that aim, the number of hyperparameters is assumed to be K = O(1), and hence quasi-Newton algorithms are appropriate, since they handle minimization in low dimension very efficiently. This requires computing the gradient of SURE_{ε,δ}(Λ) w.r.t. the hyperparameter Λ. For this purpose, a Stein Unbiased GrAdient Risk estimate, denoted SUGAR_{ε,δ}(Λ), was proposed [18, 41], which, under the conditions of Theorem 3, writes

SUGAR_{ε,δ}(Λ) := 2 ( Φ A ∂x̂(z; Λ)/∂Λ )⊤ ( Φ(Ax̂(z; Λ) − z) ) + (2/ε) ⟨ Φ⊤( ∂x̂(z + εδ; Λ)/∂Λ − ∂x̂(z; Λ)/∂Λ ), Sδ ⟩.     (24)

A sketch of a quasi-Newton descent [37], particularized to Problem 2, is detailed in Algorithm 4. It generates a sequence (Λ^[j])_{j∈N} converging toward a minimizer of SURE_{ε,δ}(Λ), denoted Λ̂_BFGS. This algorithm relies on a gradient-descent step involving a descent direction d^[j], obtained as the product of the BFGS approximated inverse Hessian matrix H^[j] and the gradient SUGAR_{ε,δ}(Λ) obtained from Algorithm 3. The descent step size α^[j] is obtained from a line search, which stops when the Wolfe conditions are fulfilled [17, 37]. Finally, the approximated inverse Hessian matrix H^[j] is updated according to the BFGS strategy.

Remark. The line search is the most time-consuming step. Indeed, the routines SURE and SUGAR are called for several hyperparameters of the form Λ^[j] + αd^[j], each call requiring to run the differentiated primal-dual scheme twice.

Algorithm 3
Accelerated primal-dual scheme for minimization of (6) with iterative forward differentiation.
Require:
Set ε > 0, δ ∈ R^M, τ₀ > 0, σ₀ > 0, such that τ₀σ₀‖D_Λ‖² < 1.
for z̃ ∈ {z, z + εδ} do
    Choose x̃^[0] ∈ R^N, x^[0] ∈ R^N, y^[0] = D_Λ x^[0]
    ∂_Λ x̃^[0] ← 0_N, ∂_Λ x^[0] ← 0_N, ∂_Λ y^[0] ← D_Λ ∂_Λ x^[0]
    for t = 0 to T_max − 1 do
        {Primal-dual updates}
        y^[t+1] = prox_{σ_t(‖·‖)*}( y^[t] + σ_t D_Λ x̃^[t] )
        x^[t+1] = prox_{τ_t‖A·−z̃‖²}( x^[t] − τ_t D_Λ⊤ y^[t+1] )
        ϑ_t = 1/√(1 + 2µτ_t), τ_{t+1} = ϑ_t τ_t and σ_{t+1} = σ_t/ϑ_t
        x̃^[t+1] = x^[t+1] + ϑ_t( x^[t+1] − x^[t] )
        {Forward iterative differentiation}
        ∂_Λ y^[t+1] = ∂_y prox_{σ_t(‖·‖)*}( ∂_Λ y^[t] + σ_t D_Λ ∂_Λ x̃^[t] + σ_t (∂_Λ D_Λ) x̃^[t] )
        ∂_Λ x^[t+1] = ∂_x prox_{τ_t‖A·−z̃‖²}( ∂_Λ x^[t] − τ_t D_Λ⊤ ∂_Λ y^[t+1] − τ_t (∂_Λ D_Λ⊤) y^[t+1] )
        ∂_Λ x̃^[t+1] = ∂_Λ x^[t+1] + ϑ_t( ∂_Λ x^[t+1] − ∂_Λ x^[t] )
    end for
    x̂(z̃; Λ) ← x^[T_max], ∂_Λ x̂(z̃; Λ) ← ∂_Λ x^[T_max]
end for
Compute SURE_{ε,δ}(Λ) by injecting x̂(z; Λ) and x̂(z + εδ; Λ) into Formula (22)
Compute SUGAR_{ε,δ}(Λ) by injecting x̂(z; Λ), x̂(z + εδ; Λ), ∂_Λ x̂(z; Λ) and ∂_Λ x̂(z + εδ; Λ) into Formula (24)
return SURE_{ε,δ}(Λ) and SUGAR_{ε,δ}(Λ)

Algorithm 4
Automated selection of hyperparameters minimizing the quadratic risk.
Require: ε > 0, δ ∼ N(0, Id) ∈ R^M
Ensure: Λ^[0] ∈ (R₊)^K, H^[0] ∈ R^{K×K}
SUGAR^[0] ← SUGAR_{ε,δ}(Λ^[0]), computed from Algorithm 3
for j = 0 to J_max − 1 do
    d^[j] = −H^[j] SUGAR^[j]
    α^[j] ∈ Argmin_{α∈R} SURE_{ε,δ}(Λ^[j] + αd^[j]), SURE computed from Algorithm 3
    Λ^[j+1] = Λ^[j] + α^[j] d^[j]
    SUGAR^[j+1] ← SUGAR_{ε,δ}(Λ^[j+1]), computed from Algorithm 3
    u^[j] = SUGAR^[j+1] − SUGAR^[j]
    H^[j+1] = ( Id − d^[j](u^[j])⊤/((u^[j])⊤d^[j]) ) H^[j] ( Id − u^[j](d^[j])⊤/((u^[j])⊤d^[j]) ) + α^[j] d^[j](d^[j])⊤/((u^[j])⊤d^[j])
end for
return Λ̂_BFGS = Λ^[J_max]

Context – Friction experiments aim at probing not only the characteristics of materials, but also the dynamics of systems involving surfaces in contact. In particular, they are paradigms for modeling and attempting to predict earthquake dynamics [35]. The classical solid friction experiment consists in towing a mass m (the so-called slider) over a substrate via a spring of stiffness k pulled at velocity V (see for instance Figure 2 in [13]). The signal representative of the slider dynamics is the force measured at the contact point between the spring and the slider. Among the different regimes described in solid friction, one can distinguish stick-slip, characterized by a tooth-shaped signal alternating slow linear rises and fast drops; the inertial regime, in which the signal becomes periodic and resembles a sine curve; and the continuous sliding regime, characterized by an almost constant signal superimposed with noise [3]. The appearance of creep, a slow forward motion of the slider prior to a slip event, may also modify the signal shape. The challenge in such studies is to establish a regime diagram describing (and therefore predicting) the system dynamics depending on the parameters (m, k, V).
While the recognition of the different regimes is easy for a large mass m, experiments at low confinement pressure, necessary to avoid surface wear, are challenging as they add noise to the experimental signals [13]. In this context, new signal processing tools are required. Here we focus in particular on signal denoising, approximating, at first order, the stick-slip signals by piecewise linear signals.

Data – Experiments of solid friction (taken from [13]) were performed by pulling a mass m = 30.… over a solid substrate. Both surfaces in contact consist of paper samples (Canson®, characterized by its roughness). A cantilever spring (metallic blade of stiffness k between 168 and 3337 N/m) is pulled at constant velocity V (between 42 and 7200 µm/s) and is in contact with the slider through a steel ball glued onto the latter, ensuring a point contact and the free motion of the contact point. The slider dynamics is quantified through the measurement of the blade deflection, ∆x, by an inductive sensor (Baumer, IPRM 12I9505/S14). In all experiments, the mass m is kept constant. We vary the parameters (k, V) and, for each experiment, record the normalized force signal F* from the blade deflection, F* = k∆x/(mg), where g = 9.81 m/s² is the gravitational acceleration. This signal is recorded with a sampling frequency of 2 kHz, and its size varies from about 4.×… to 7.×… samples.

Piecewise linear denoising – In [13], stick-slip signals were processed using an optimization formalism, falling under formulation (6), in order to enforce a piecewise linear behavior. The observation z corresponds to the measured force signals, A = Id, the linear operator D_Λ is chosen to be the discrete Laplacian described in Equation (9) (for a univariate signal, i.e., K = 1), and ‖·‖ is the ℓ_{2,1}-norm defined in (8), which reduces to the ℓ₁-norm in the context of univariate signals. The tedious task of tuning the regularization parameter λ was performed by expert visual inspection and led to a choice λ_expert = 0.…, with the corresponding regularized signals displayed in red in Figure 2. The regularized signals appear to capture well the transitions between the stick and slip regimes. Here, we propose to illustrate the use of the automated regularization parameter tuning strategy presented in Section 3.4 for piecewise linear denoising of stick-slip signals. SURE_{ε,δ}(λ) is used as the quality criterion and minimized over λ. Both grid search and automated tuning are implemented and compared.

Automated data-driven hyperparameter tuning – The finite difference step ε, involved in the computation of SURE_{ε,δ} and SUGAR_{ε,δ} [see Equations (22) and (24)], is set to

    ε = 2σ/N, (25)

with N the length of the considered stick-slip signal and σ the estimated variance of the additive noise corrupting the signal. Since no additional information about the noise is available, σ is estimated by applying the sample variance estimator to the observations. SURE_{ε,δ} (black curve in Figure 3) is first computed over a grid of 15 logarithmically spaced values of the regularization parameter λ, using Algorithm 2. Then, λ_grid is defined as the minimizer of SURE_{ε,δ}(λ) over the grid and indicated by the '+' symbol.
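To make the finite-difference Stein machinery concrete, the sketch below evaluates a SURE estimate in the spirit of Equation (22) for a generic denoiser; soft-thresholding stands in here for the paper's piecewise linear estimator, and the names `soft` and `sure_fd` are illustrative, not from the toolbox. Because soft-thresholding is piecewise linear, its finite-difference divergence can be checked against a closed form:

```python
import numpy as np

def soft(z, lam):
    """Soft-thresholding: proximity operator of lam * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def sure_fd(z, denoise, sigma, eps, delta):
    """Finite-difference SURE (Gaussian noise, variance sigma^2):
    ||f(z) - z||^2 - M*sigma^2 + 2*sigma^2 * div f(z),
    with div f(z) approximated by delta^T (f(z + eps*delta) - f(z)) / eps."""
    fz = denoise(z)
    fd_div = delta @ (denoise(z + eps * delta) - fz) / eps
    sure = np.sum((fz - z) ** 2) - z.size * sigma ** 2 + 2 * sigma ** 2 * fd_div
    return sure, fd_div

rng = np.random.default_rng(0)
z = np.array([0.2, -2.0, 1.5, 0.5, -0.3, 3.0])   # no |z_i| close to the threshold
delta = rng.standard_normal(z.size)
lam, sigma = 1.0, 0.1
eps = 2 * sigma / z.size                          # step choice in the spirit of Eq. (25)
sure, fd_div = sure_fd(z, lambda u: soft(u, lam), sigma, eps, delta)

# Away from the kinks, the finite difference is exact for soft-thresholding:
# it equals the sum of delta_i^2 over active coordinates (|z_i| > lam).
exact_div = np.sum(delta ** 2 * (np.abs(z) > lam))
```

The same Monte Carlo trick extends to any black-box denoiser, which is precisely what makes the SURE/SUGAR framework attractive when no closed-form divergence is available.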
Finally, the quasi-Newton Algorithm 4 for automated tuning of the regularization parameter is run, providing λ_BFGS, represented by the '∗' symbol. The regularization parameter chosen by the expert is displayed for comparison purposes and indicated by the '×' marker. For each experimental setting (k, V), the grid search optimal regularization parameter λ_grid and the automatically tuned regularization parameter λ_BFGS(k, V), obtained respectively from Algorithms 2 and 4, are compared in Table 1, showing satisfactory agreement.

Figure 2: Stick-slip force signals. Experimental data (in grey) for three different experimental settings and nonlinear filtering enforcing piecewise linear behavior, with automated hyperparameter tuning (blue) and expert-selected hyperparameter (red).

Denoised data analysis – Table 1 shows first that λ_BFGS is of the same order of magnitude as λ_expert. This is consistent with Figure 2, which further shows that the denoised experimental signals obtained from nonlinear filtering enforcing piecewise linear behavior, with automated hyperparameter tuning (blue) and expert-selected hyperparameter (red), display similar shapes and behaviors. This is a very satisfactory outcome, as the proposed data-driven and automated hyperparameter tuning yields results very consistent with those obtained from expert choices, without making use of any a priori information, relying on data only.

Table 1: Grid search λ_grid(k, V) and automatically tuned λ_BFGS(k, V) regularization parameters, as functions of the stiffness k ∈ {168, 1002, 2254} N/m and the velocity V (µm/s).

Figure 3: SURE_{ε,δ}(λ). Grid search vs. automated tuning of the regularization parameter.

Table 1 also shows that the automated procedure yields different regularization parameters for the different (k, V) configurations, illustrating an ability to finely adapt to the data, which would not be possible, or would be far too time-consuming, for an expert. Table 1 further reveals that the automatically selected regularization parameters, λ_BFGS, are, for almost all (k, V) configurations, slightly larger than the expert-selected ones, λ_expert, hence yielding overall more regular signals. Figure 2(f) shows that the red signal, obtained with λ_expert, displays discontinuities (e.g., around t = 5.… s) that are absent from the signal obtained with λ_BFGS, hence showing the interest of tuning the hyperparameter to each specific signal. However, Figures 2(b) and (d) also show small yet visible differences during the slip phase (fast decrease) between the red signals, obtained with λ_expert, and the blue signals, obtained with the automated selection λ_BFGS. To decide which one is the most relevant requires returning to a detailed analysis of solid friction: the stick phases actually produce force signals that are exactly linearly increasing; as for the slip phases, while they can be described in first approximation as abrupt linear decreases, detailed analysis indicates that they actually consist of arches of sinusoidal functions that connect the stick phases.
Therefore, it can be considered that the expert-driven signals (red) better fit the experimental data, at the price though of concatenating a series of short linear segments that are irrelevant with respect to the underlying physics, whereas the data-driven signals (blue) yield more stylized piecewise linear approximations of the data, which may however better capture the times of transition between the stick and slip phases, a piece of information of prime importance for the analysis of solid friction regimes. In sum, deciding between expert and automated tuning of the hyperparameters involves several issues, ranging from feasibility (expert tuning is time-consuming, prone to errors, and may lack reproducibility) to relevance (denoised signals must give relevant access to the quantities of interest for the physics).
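The piecewise linear denoiser used throughout this section solves min_x ½‖x − z‖² + λ‖Dx‖₁, with D the discrete Laplacian. The sketch below implements it with plain (unaccelerated) Chambolle-Pock primal-dual iterations; the paper's Algorithm 3 additionally accelerates and differentiates these updates, and the function names here are illustrative only:

```python
import numpy as np

def laplacian(x):
    """Discrete Laplacian (second difference) D x, of length N - 2."""
    return x[:-2] - 2.0 * x[1:-1] + x[2:]

def laplacian_T(y, n):
    """Adjoint operator D^T y."""
    out = np.zeros(n)
    out[:-2] += y
    out[1:-1] -= 2.0 * y
    out[2:] += y
    return out

def pw_linear_denoise(z, lam, n_iter=2000):
    """min_x 0.5*||x - z||^2 + lam*||D x||_1 by primal-dual iterations."""
    n = z.size
    tau = sigma = 0.24          # tau*sigma*||D||^2 < 1, since ||D||^2 <= 16
    x = z.copy()
    x_bar = x.copy()
    y = np.zeros(n - 2)
    for _ in range(n_iter):
        # dual prox: projection onto the l_inf ball of radius lam
        y = np.clip(y + sigma * laplacian(x_bar), -lam, lam)
        # primal prox of 0.5*||. - z||^2 with step tau
        x_new = (x - tau * laplacian_T(y, n) + tau * z) / (1.0 + tau)
        x_bar = 2.0 * x_new - x  # over-relaxation, theta = 1
        x = x_new
    return x
```

An exactly linear input is a fixed point (its Laplacian vanishes), while a noisy input is mapped to a signal with much sparser second differences, i.e. a piecewise linear approximation.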
Context – Understanding and predicting the dynamics of multiphase flows is a major issue in geosciences (soil decontamination, CO₂ sequestration) and in industry (enhanced oil recovery, heterogeneous catalysis) [2, 28, 30, 45]. Among these processes, many involve a joint gas and liquid flow through a porous medium. Quantifying the contact areas between the different phases, where chemical reactions take place, is of tremendous importance for analyzing and predicting the efficiency of such processes [31]. However, even when direct visualization is possible, the porous medium generates a global, multiscale texture on the images, which makes it difficult to extract the gas-liquid interfaces. The segmentation techniques based on morphological tools used so far to differentiate phases in multiphase flows [49] present severe limitations: arbitrary threshold setting, non-physical irregular bubble contours, non-detection of small bubbles. In addition, recent developments in high-resolution and high-speed imaging yield large images and large data sets, thus bringing forward issues of memory and computational costs. Here, we focus on the identification of the different phases (liquid and gas) in textured images. As a first approximation, the liquid and the gas appear as homogeneous fractal textures. Hence, discriminating the phases requires solving a texture segmentation problem.

Data – Experiments of joint gas and liquid flow through a porous medium were performed in a quasi-2D vertical Hele-Shaw cell of width 210 mm, height 410 mm and gap 1.75 mm (see Figure 1 in [7]). The porous medium is an open-cell solid foam of NiCrFeAl alloy (Alantum), with a typical pore diameter of 580 µm. Constant gas and liquid flow rates are injected at the bottom of the cell through nine injectors (air) and a homogeneous slit (water).
Images of the multiphase flow are acquired by a high-resolution camera (Basler A2040-90um, 2048 × 2048 pixels).

Fractal features – We consider fractal, or scale-free, features, consisting of the local behavior, as a function of the scales 2^j, j ∈ {1, …, J}, of the wavelet leader coefficients L_{j,n}, built as local suprema of wavelet coefficients [53, 54]. For each pixel n ∈ Ω = {1, …, N₁} × {1, …, N₂}, the leader coefficients of the image X to analyze, denoted L_{j,n}, evidence the following local scaling property [29]:

    L_{j,n} ∼ η_n 2^{j h_n}, as 2^j → 0, (26)

where 2^j denotes the scale of the multiscale transform. The quantity h_n measures the local regularity of the texture at pixel n. In log-log coordinates, Equation (26) corresponds to a linear behavior through the octaves j:

    log₂(L_{j,n}) ≃ log₂(η_n) + j h_n, as 2^j → 0. (27)

Setting v_n := log₂(η_n), called in the following the local power of the texture, a texture X is characterized by (h_n, v_n)_{n∈Ω}.

Definition 3. A homogeneous texture is characterized by a uniform local regularity h_n ≡ H and a uniform local power v_n ≡ V.

Texture segmentation then consists in identifying a partition of the image domain,

    Ω = Ω₁ ∪ ⋯ ∪ Ω_Q, with Ω_q ∩ Ω_{q'} = ∅ for q ≠ q', (28)

for which both h_n and v_n are uniform on each Ω_q. In other words, it consists in obtaining piecewise constant maps of local regularity and local power.

Regularized estimates – The linear regression on the log-leaders (27) can be formulated as the minimization of the least-squares functional

    Φ(h, v; L) = (1/2) Σ_{j=j₁}^{j₂} ‖ j h + v − log₂ L_j ‖², (29)

which provides the estimates (ĥ_LR, v̂_LR) of the fractal features. As an example, the linear regression estimate of the local regularity of the (zoomed) flow image of Figure 4(a) (Figure 5(a)) is presented in Figure 4(b) (Figure 5(b)).
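Since the least-squares functional (29) decouples per pixel, minimizing it amounts to an ordinary linear regression of log₂ L_{j,n} on j at each pixel n. A minimal numpy sketch on synthetic log-leaders obeying (27) exactly (the name `linreg_features` is illustrative):

```python
import numpy as np

def linreg_features(logL, j1=1):
    """Per-pixel OLS fit of Eq. (27): log2 L_{j,n} ~ v_n + j * h_n.

    logL: array of shape (J, N1, N2), scales j = j1, ..., j1 + J - 1.
    Returns the (h, v) maps, each of shape (N1, N2)."""
    J = logL.shape[0]
    j = np.arange(j1, j1 + J, dtype=float)
    jm = j.mean()
    Lm = logL.mean(axis=0)
    # closed-form OLS slope and intercept, vectorized over pixels
    h = ((j - jm)[:, None, None] * (logL - Lm)).sum(axis=0) / ((j - jm) ** 2).sum()
    v = Lm - h * jm
    return h, v

# Synthetic log-leaders with exact scaling: the regression recovers (h, v) exactly.
h_true = np.full((4, 5), 0.7)     # homogeneous local regularity
v_true = np.full((4, 5), -2.0)    # homogeneous local power
j = np.arange(1, 6, dtype=float)
logL = v_true[None] + j[:, None, None] * h_true[None]
h_hat, v_hat = linreg_features(logL)
```

On real leaders the fit is only approximate, and, as discussed next, the resulting pixelwise estimates have a large variance, which is what motivates the regularized formulations.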
These estimates turn out to suffer from large variances, precluding their use for actual segmentation and thus calling for nonlinear estimation tools. To favor piecewise homogeneous segmentations, we enforce piecewise constancy of the estimated features via two different Total Variation-based penalizations, leading to the minimization of the Joint and Coupled functionals:

    (ĥ^{J/C}, v̂^{J/C}) = argmin_{h,v} Φ(h, v; L) + Ψ^{J/C}(h, v; λ, α). (30)

The Joint and Coupled penalizations are defined as

    Ψ^J(h, v) := λ ( α TV(h) + TV(v) ), (31)

    Ψ^C(h, v) := λ Σ_{n₁=1}^{N₁−1} Σ_{n₂=1}^{N₂−1} √( α²(Hh)²_{n₁,n₂} + α²(Vh)²_{n₁,n₂} + (Hv)²_{n₁,n₂} + (Vv)²_{n₁,n₂} ), (32)

where the total variation (TV) is defined in Equation (11) and the horizontal and vertical discrete gradients, H and V, are defined in Equation (10). While the Joint penalization imposes piecewise constancy of the local regularity h and of the local power v independently, the Coupled penalization is more restrictive and favors co-localized changes in h and v. The trade-off between fidelity to the mathematical model (27) and piecewise constancy of h and v is controlled by the regularization parameter λ > 0, while α > 0 balances the regularization of h against that of v.

Iterated thresholding – From the regularized estimates ĥ^{J/C}, one can obtain a segmentation by applying a thresholding post-processing. The iterated thresholding procedure proposed in [9, 10], which benefits from theoretical guarantees, is customized to the gas/liquid segmentation problem in Algorithm 5. It is used systematically in the following, leading to the proposed T-Joint and T-
Coupled segmentation procedures introduced in [39].
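The two penalizations (31)-(32) are straightforward to evaluate numerically. The sketch below is a hedged reconstruction (the α² weighting inside the coupled square root is inferred from the garbled extraction and should be checked against [39]; all function names are illustrative); it also illustrates that, gradient field by gradient field, the Coupled penalization never exceeds the Joint one:

```python
import numpy as np

def grads(u):
    """Horizontal and vertical first differences, cropped to a common size."""
    gh = (u[:, 1:] - u[:, :-1])[:-1, :]
    gv = (u[1:, :] - u[:-1, :])[:, :-1]
    return gh, gv

def tv_iso(u):
    """Isotropic total variation, as in Eq. (11)."""
    gh, gv = grads(u)
    return np.sum(np.sqrt(gh ** 2 + gv ** 2))

def psi_joint(h, v, lam, alpha):
    """Eq. (31): independent TV regularization of h and v."""
    return lam * (alpha * tv_iso(h) + tv_iso(v))

def psi_coupled(h, v, lam, alpha):
    """Eq. (32): a single l2 coupling of the four gradient fields."""
    gh_h, gv_h = grads(h)
    gh_v, gv_v = grads(v)
    return lam * np.sum(np.sqrt(alpha ** 2 * (gh_h ** 2 + gv_h ** 2)
                                + gh_v ** 2 + gv_v ** 2))
```

In particular, when v is constant the two penalizations coincide (both reduce to λα TV(h)), while for co-localized jumps in h and v the Coupled penalty is strictly cheaper, which is why it favors aligned discontinuities.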
Algorithm 5
T-ROF: iterative thresholding of ĥ^ROF
Require: ĥ
Initialize: m[0]₀ = min_{n∈Ω} ĥ_n, m[0]₁ = max_{n∈Ω} ĥ_n
for t ∈ N* do
    {Compute the threshold:} T[t] = ( m[t−1]₀ + m[t−1]₁ ) / 2
    {Threshold ĥ:} Ω[t]₀ = { n | ĥ_n ≤ T[t] }, Ω[t]₁ = { n | ĥ_n > T[t] }
    {Update the region means:} m[t]₀ = (1/|Ω[t]₀|) Σ_{n∈Ω[t]₀} ĥ_n, m[t]₁ = (1/|Ω[t]₁|) Σ_{n∈Ω[t]₁} ĥ_n
end for
return Ω₀ = Ω[∞]₀ (liquid), Ω₁ = Ω[∞]₁ (gas)

Compared texture segmentation procedures – Four procedures falling under Model (6) and satisfying the assumptions of Theorem 1 and Theorem 2 will be compared for texture segmentation; they are summarized in Table 2. Note that they differ both by the functional minimized and by the noise model, which is of crucial importance in Stein-based procedures. The first one, denoted ROF-Id, is a state-of-the-art piecewise constant denoising method, applied to ĥ_LR seen as an observation of h̄ corrupted by additive i.i.d. zero-mean Gaussian noise of variance σ², hence with scalar covariance matrix S = σ²Id. The three procedures ROF-S, Joint and
Coupled take into account the covariance structure of the log-leader coefficients, evidencing both inter-scale and spatial correlations, encapsulated in a non-diagonal covariance matrix S. The linear operator intervening in the data fidelity term of the Joint and Coupled procedures, denoted J, acts on the double variable (h, v) as J(h, v) := (jh + v)_{j=j₁}^{j₂}. We showed in a previous work [39] that it is full-rank; hence, Theorem 1 applies. Moreover, the strong convexity modulus µ = 2 min Sp(JᵀJ), where Sp denotes the spectrum of a linear operator, only depends on the octave range {j₁, …, j₂}, and its numerical values are provided for fixed j₁ = 1 and varying j₂ in Table 3 [39].

Method  | Figures 4, 5 | Observation z | Operator A | Variable x | Penalization | Covariance
ROF-Id  | (c), (d)     | ĥ_LR          | Id         | h          | TV           | σ²Id
ROF-S   | (e), (f)     | ĥ_LR          | Id         | h          | TV           | S
Joint   | (g), (h)     | log₂(L)       | J          | (h, v)     | Ψ^J          | S
Coupled | (i), (j)     | log₂(L)       | J          | (h, v)     | Ψ^C          | S

Table 2: Four different settings considered in the experiments of local regularity-based texture segmentation with automated hyperparameter selection. ĥ_LR stands for the minimizer of (29), L denotes the wavelet leaders of the image to analyze, TV stands for the total variation penalization defined in (11), and h, v, Ψ^J, Ψ^C and S, defined in this section, denote respectively the local regularity, the local power, the Joint penalization, the Coupled penalization and the covariance matrix.

Automated hyperparameter tuning – The Stein-based formalism described in Section 3.4 is used, first, to obtain an estimate of the quadratic risk from SURE_{ε,δ}(λ, α) and, second, for the automated tuning of the regularization parameters thanks to the SUGAR_{ε,δ}(λ, α) estimate. For this purpose, it is necessary to provide an estimate of the covariance matrix of the noise. The estimated noise variance σ² involved in ROF-Id is obtained from the variance of ĥ_LR, while the covariance matrix S is assimilated to the covariance of the log-leaders of the textured image X to be segmented. The finite difference step ε, involved in the computation of SURE_{ε,δ} and SUGAR_{ε,δ} [see Equations (22) and (24)], is set to

    ε = 2 √(max S) / M, (33)

where M is the size of the observation vector and the maximum is taken over all coefficients of the covariance matrix, with M = N₁ × N₂ in the case of ROF-Id and ROF-S, and M = (j₂ − j₁ + 1) × N₁ × N₂ in the case of the Joint and Coupled procedures.
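The iterated thresholding of Algorithm 5 is, in essence, a two-class Lloyd (k-means) iteration on the scalar values of ĥ: the threshold is repeatedly set to the midpoint of the two class means. A minimal sketch (the name `t_rof` is illustrative):

```python
import numpy as np

def t_rof(h_hat, n_iter=100):
    """Algorithm 5 (T-ROF): iterate threshold = midpoint of the class means."""
    h = np.asarray(h_hat, dtype=float).ravel()
    m0, m1 = h.min(), h.max()
    T = 0.5 * (m0 + m1)
    for _ in range(n_iter):
        T = 0.5 * (m0 + m1)
        low, high = h[h <= T], h[h > T]
        if low.size == 0 or high.size == 0:
            break                    # degenerate split: keep current threshold
        m0, m1 = low.mean(), high.mean()
    mask_low = np.asarray(h_hat) <= T
    return T, mask_low, ~mask_low    # threshold, "liquid" mask, "gas" mask
```

On a clearly bimodal regularity map, the threshold settles between the two modes within a few iterations, yielding the liquid/gas partition.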
Accuracy of the automated tuning – Grid search minimization of SURE_{ε,δ}(λ, α) (Algorithm 2) being costly, due to the large number of runs of Algorithm 1 required, it is performed on a zoomed image of 281 × 231 pixels, presented in Figure 5(a). Then, the automated tuning of λ and α from Algorithm 4, based on SUGAR_{ε,δ}(λ, α), is performed on the same zoomed image.

j₂ :  2    3     4     5     6
µ  :  …   0.72  1.20  1.69  2.…

Table 3: Strong convexity modulus µ of the data-fidelity term of (6), for fixed j₁ = 1 and varied j₂. The bold entry corresponds to the range of scales used in the experiments.

In practice, SURE_{ε,δ}(λ) is computed over 15 values of the hyperparameter λ for the ROF-Id (Figure 6(a)) and ROF-S (Figure 6(b)) procedures, and over a 15 × 15 grid of (λ, α) values for the Joint (Figure 6(c)) and
Coupled (Figure 6(d)) methods. The grid search minimum, Λ_grid, indicated by the '+' symbol, is compared to the optimal regularization parameters found by applying Algorithm 4, Λ_BFGS, indicated by the '∗' symbol. The optimal parameters Λ_grid and Λ_BFGS appear to coincide perfectly for the ROF-Id and Joint procedures. As for the ROF-S and Coupled strategies, even though they are different, they are consistent with the SURE_{ε,δ} profile, in the sense that they correspond to similar values of SURE_{ε,δ}. We observed that, while grid search minimization (Algorithm 2) required 225 runs of Algorithm 1 for the Joint and Coupled methods, the automated tuning via BFGS quasi-Newton minimization (Algorithm 4) needed no more than 50 runs of Algorithm 3. Hence, when several parameters are involved, the automated strategy (Algorithm 4) is significantly faster than a grid search (Algorithm 2).
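The quasi-Newton outer loop of Algorithm 4 hinges on the rank-two inverse-Hessian update. The toy sketch below exercises exactly that update, with a quadratic "risk" and its exact gradient standing in for SURE_{ε,δ} and SUGAR_{ε,δ} (all names are illustrative; the real line search minimizes SURE along the search direction):

```python
import numpy as np

def bfgs_update(H, d, u, alpha):
    """Rank-two inverse-Hessian update of Algorithm 4 (step alpha*d, gradient change u)."""
    rho = 1.0 / (u @ d)
    I = np.eye(H.shape[0])
    return (I - rho * np.outer(d, u)) @ H @ (I - rho * np.outer(u, d)) \
        + alpha * rho * np.outer(d, d)

def quasi_newton(grad, lam0, line_search, n_iter=20):
    """Outer loop of Algorithm 4 with a user-supplied line search."""
    lam = np.array(lam0, dtype=float)
    H = np.eye(lam.size)
    g = grad(lam)
    for _ in range(n_iter):
        if np.linalg.norm(g) < 1e-10:
            break                        # converged
        d = -H @ g                       # quasi-Newton direction
        alpha = line_search(lam, d)
        lam_new = lam + alpha * d
        g_new = grad(lam_new)
        u = g_new - g                    # gradient change
        if abs(u @ d) > 1e-12:
            H = bfgs_update(H, d, u, alpha)
        lam, g = lam_new, g_new
    return lam

# Toy quadratic "risk" standing in for SURE; its gradient stands in for SUGAR.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
lam_star = np.array([0.5, 1.5])
grad = lambda lam: A @ (lam - lam_star)
exact_ls = lambda lam, d: -(grad(lam) @ d) / (d @ A @ d)   # exact for quadratics
lam_hat = quasi_newton(grad, np.zeros(2), exact_ls)
```

For such a quadratic, exact line search makes the iteration terminate in at most two steps, which mirrors why only a few dozen runs of Algorithm 3 suffice in practice compared to the hundreds needed by a grid search.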
Segmentation results – Figure 6 indicates that the automated selection of the regularization parameters is consistent with SURE_{ε,δ} minimization. Hence, the complete images of 1626 × … pixels are processed with the automatically tuned hyperparameters. The ROF-Id and ROF-S procedures yield regularized estimates ĥ^ROF presenting artifacts, as observed in Figures 4(d) and 4(f), and hence lead to inaccurate segmentations, cf. Figures 4(c) and 4(e). In addition, a key point in such experiments is to estimate precisely the contact surface between the liquid and the gas. Both T-ROF-Id and T-ROF-S (Figures 4(c) and (e)) present irregular contours, which are not representative of the real contours and strongly overestimate bubble perimeters. The Joint and Coupled procedures, taking into account both the local regularity and the local power, yield more regular contours. In addition, the Joint and Coupled methods produce fewer artifacts (see Figures 4(g) and 4(i)). However, the Joint estimate ĥ^J appears to be over-regularized, leading to the non-detection of small bubbles in the segmentation of Figure 4(g). The Coupled procedure turns out to achieve a satisfactory compromise, avoiding artifacts while still detecting small gas bubbles, as illustrated in Figures 4(i) and 4(j).
The present work has described a unified framework for signal/image nonlinear filtering, formulated as an inverse problem, which can be affiliated to several functional minimization problems encountered in statistical (nonlinear) physics. This inverse problem formulation aims at favoring piecewise homogeneous signals and images, which naturally correspond to the solutions of numerous problems in nonlinear physics, often very different in nature. Piecewise homogeneity assessment entails nonsmooth convex optimization, here handled via proximal operators. In addition to yielding relevant piecewise homogeneous estimates, the proposed framework also achieves an automated and data-driven tuning of the hyperparameters inherently present in inverse problems and nonlinear filtering, thus avoiding the burden of an error-prone and sometimes poorly reproducible expert inspection. The potential and interest of nonlinear filtering have been illustrated at work on two real nonlinear physics experiments, different in nature (low confinement solid friction and multiphase flow in porous media). However, the approach has a fairly general level of applicability, and a documented Matlab toolbox for both multivariate signals and images, implementing both the nonlinear filtering favoring piecewise homogeneity and the automated data-driven hyperparameter selection, has been made publicly available at https://github.com/bpascal-fr/stein-piecewise-filtering.

References

[1] Thomas Auger, Jerome Mathe, Virgile Viasnoff, Gaelle Charron, Jean-Marc Di Meglio, Loic Auvray, and Fabien Montel. Zero-mode waveguide detection of flow-driven DNA translocation through nanopores.
Phys. Rev. Lett., 113:028302, 2014. [2] A. Babchin, I. Brailovsky, P. Gordon, and G. Sivashinsky. Fingering instability in immiscible displacement.
Phys. Rev. E , 77:026301, 2008.[3] T. Baumberger and C. Caroli. Solid friction from stick-slip down to pinningand aging.
Adv. Phys. , 55(3-4):279–348, 2006.[4] H. H. Bauschke and P. L. Combettes.
Convex analysis and monotone op-erator theory in Hilbert spaces . Springer, New York, second edition, 2017.[5] A. Benazza-Benyahia and J.-C. Pesquet. Building robust wavelet estimatorsfor multicomponent images using Stein’s principle.
IEEE Trans. ImageProcess. , 14(11):1814–1830, 2005.[6] M. Berhanu, R. Monchaux, S. Fauve, N. Mordant, F. Petrelis, A. Chif-faudel, F. Daviaud, B. Dubrulle, L. Marie, F. Ravelet, M. Bourgoin, Ph.Odier, J.-F. Pinton, and R. Volk. Magnetic field reversals in an experimen-tal turbulent dynamo.
Eur. Phys. Lett. , 77:59001, 2007.[7] T. Busser, M. Serres, R. Philippe, and V. Vidal. Hydrodynamics of gas-liquid co-current flow through a thin sheet of highly porous open cell solidfoam. in revision at Chem. Eng. Sci. , 2020.[8] J.-F. Cai, B. Dong, S. Osher, and Z. Shen. Image restoration: Total varia-tion, wavelet frames, and beyond.
J. Amer. Math. Soc., 25:1033–1089, May 2012. [9] X. Cai, R. Chan, C.-B. Schönlieb, G. Steidl, and T. Zeng. Linkage between piecewise constant Mumford-Shah model and ROF model and its virtue in image segmentation. Preprint arXiv:1807.10194, 2018. [10] X. Cai and G. Steidl. Multiclass segmentation by iterated ROF thresholding. In
Int. Workshop on Energy Minimization Methods in Comp. Vis. andPat. Rec. , pages 237–250. Springer, 2013.[11] A. Chambolle. Image segmentation by variational methods: Mumford andShah functional and the discrete approximations.
SIAM J. Appl Math. ,55:827–863, 1995.[12] A. Chambolle and T. Pock. A first-order primal-dual algorithm for convexproblems with applications to imaging.
J. Math. Imag. Vis., 40(1):120–145, 2011. [13] J. Colas, N. Pustelnik, C. Oliver, P. Abry, J.-C. Géminard, and V. Vidal. Nonlinear denoising for characterization of solid friction under low confinement pressure.
Phys. Rev. E , 100:032803, 2019.[14] P. L. Combettes and J.-C. Pesquet. Proximal splitting methods in signalprocessing. In H. H. Bauschke, R. S. Burachik, P. L. Combettes, V. Elser,D. R. Luke, and H. Wolkowicz, editors,
Fixed-Point Algorithms for InverseProblems in Science and Engineering , pages 185–212. Springer-Verlag, NewYork, 2011.[15] P. L. Combettes and V. R. Wajs. Signal recovery by proximal forward-backward splitting.
Multiscale Model. and Simul. , 4(4):1168–1200, 2005.[16] L. Condat. A primal-dual splitting method for convex optimization involv-ing Lipschitzian, proximable and linear composite terms.
J. Optim. TheoryAppl. , 158(2):460–479, 2013.[17] F. E. Curtis, T. Mitchell, and M. L. Overton. A BFGS-SQP method fornonsmooth, nonconvex, constrained optimization and its evaluation usingrelative minimization profiles.
Optim. Methods Softw., 32(1):148–181, 2017. [18] C.-A. Deledalle, S. Vaiter, J. Fadili, and G. Peyré. Stein Unbiased GrAdient estimator of the Risk (SUGAR) for multiple parameter selection.
SIAM J.Imaging Sci. , 7(4):2448–2487, 2014.[19] Thibaut Divoux, Herve Gayvallet, and Jean-Christophe Geminard. Creepmotion of a granular pile induced by thermal cycling.
Phys. Rev. Lett. ,101:148303, 2008.[20] N. Dobigeon and J.-Y. Tourneret. Joint segmentation of wind speed and di-rection using a hierarchical model.
Comput. Stat. Data Anal. , 51(12):5603–5621, Aug. 2007.[21] N. Dobigeon, J.-Y. Tourneret, and M. Davy. Joint segmentation of piece-wise constant autoregressive processes by using a hierarchical model anda Bayesian sampling approach.
IEEE Trans. Signal Process. , 55(4):1251–1263, Apr. 2007.[22] D. L. Donoho and J. M. Johnstone. Ideal spatial adaptation by waveletshrinkage.
Biometrika, 81(3):425–455, 1994. [23] Yonina C. Eldar. Generalized SURE for exponential families: Applications to regularization. IEEE Trans. Signal Process., 57(2):471–481, 2008. [24] J. Frecon, N. Pustelnik, N. Dobigeon, H. Wendt, and P. Abry. Bayesian selection for the ℓ₂-Potts model regularization parameter: 1-D piecewise constant signal denoising. IEEE Trans. Signal Process., 65(19):5215–5224, Oct. 2017. [25] D. Geman and S. Geman. Bayesian image analysis. In
Disordered systemsand biological organization , pages 301–319. Springer, 1986.[26] D. Geman and G. Reynolds. Constrained image restoration and the recov-ery of discontinuities.
IEEE Trans. Pattern Anal. Match. Int. , 14(3):367–383, 1992.[27] S. Geman and D. Geman. Stochastic relaxation, Gibbs distributions, andthe Bayesian restoration of images. In
Readings in Computer Vision, pages 564–584. Elsevier, 1987. [28] V. Hessel, P. Angeli, A. Gavriilidis, and H. Löwe. Gas-liquid and gas-liquid-solid microstructured reactors: Contacting principles and applications.
In-dustrial & engineering chemistry research , 44:9750–9769, 2005.[29] S. Jaffard. Wavelet techniques in multifractal analysis.
Fractal geometry and applications: A jubilee of Benoit Mandelbrot, M. Lapidus and M. van Frankenhuysen, Eds., Proceedings of Symposia in Pure Mathematics (AMS), 72(2):91–152, 2004. [30] Q. Kang, I. N. Tsimpanogiannis, D. Zhang, and P. C. Lichtner. Numerical modeling of pore-scale phenomena during CO₂ sequestration in oceanic sediments. Fuel Process. Technol., 86:1647–1665, 2005. [31] M. T. Kreutzer, F. Kapteijn, J. A. Moulijn, and J. J. Heiszwolf. Multiphase monolith reactors: Chemical reaction engineering of segmented flow in microchannels.
Chemical Engineering Science , 60:5895–5916, 2005.[32] S. Z. Li.
Markov Random Field modeling in image analysis. Springer, 2009. [33] P. Lobel, C. Pichot, L. Blanc-Féraud, and M. Barlaud. Conjugate gradient algorithm with edge-preserving regularization for image reconstruction from Ipswich data for mystery objects.
IEEE Antennas Propag. Mag. ,39(2):12–13, 1997.[34] Y. Marnissi, E. Chouzenoux, A. Benazza-Benyahia, and J.-C. Pesquet. Anauxiliary variable method for MCMC algorithms in high dimension.
En-tropy , 20(110):35p., 2018.[35] C. Marone. Laboratory-derived friction laws and their application to seis-mic faulting.
Ann. Rev. Earth Planet. Sci. , 26:643–696, 1998.[36] J. Møller.
Spatial statistics and computational methods . Lecture Notes inStatistics, Springer, 2003.[37] J. Nocedal and S. Wright.
Numerical optimization . Springer Science &Business Media, 2006.[38] N. Parikh and S. Boyd. Proximal algorithms.
Foundations and Trends® in Optimization, 1(3):127–239, 2014. [39] B. Pascal, N. Pustelnik, and P. Abry. How joint fractal features estimation and texture segmentation can be cast into a strongly convex optimization problem? Preprint arXiv:1910.05246, 2019. [40] B. Pascal, N. Pustelnik, P. Abry, M. Serres, and V. Vidal. Joint estimation of local variance and local regularity for texture segmentation. Application to multiphase flow characterization. In
Proc. Int. Conf. Image Process. ,pages 2092–2096, Athens, Greece, 2018. IEEE.[41] B. Pascal, S. Vaiter, N. Pustelnik, and P. Abry. Automated data-drivenselection of the hyperparameters for total-variation based texture segmen-tation.
Preprint arXiv:2004.09434 , 2020.[42] M. Pereyra, N. Dobigeon, H. Batatia, and J.-Y. Tourneret. Estimating thegranularity coefficient of a Potts-Markov random field within an MCMCalgorithm.
IEEE Trans. Image Process. , 22(6):2385–2397, June 2013.[43] N. Pustelnik, A. Benazza-Benhayia, Y. Zheng, and J.-C. Pesquet. Wavelet-based image deconvolution and reconstruction.
Wiley Encyclopedia of Elec-trical and Electronics Engineering , Feb. 2016.[44] S. Ramani, T. Blu, and M. Unser. Monte-carlo SURE: A black-box op-timization of regularization parameters for general denoising algorithms.
IEEE Trans. Image Process. , 17(9):1540–1554, 2008.[45] K. R Reddy and J. A Adams. Effects of soil heterogeneity on airflowpatterns and hydrocarbon removal during in situ air sparging.
J. Geotech.Geoenviron. Eng. , 127(3):234–247, 2001.[46] L. I. Rudin, S. Osher, and E. Fatemi. Nonlinear total variation based noiseremoval algorithms.
Physica D: Nonlinear Phenomena, 60(1-4):259–268, 1992. [47] M. Serres. Étude hydrodynamique d'un écoulement gaz-liquide dans un milieu poreux confiné [Hydrodynamic study of a gas-liquid flow in a confined porous medium]. PhD Thesis, École Normale Supérieure de Lyon, Université de Lyon, page 203, 2017. [48] M. Serres, T. Maison, R. Philippe, and V. Vidal. A phenomenological model for bubble coalescence in confined highly porous media.
Int. J. Multiph.Flow , 105:134–141, 2018.[49] M. Serres, M.-L. Zanota, R. Philippe, and V. Vidal. On the stability ofTaylor bubbles inside a confined highly porous medium.
Int. J. Multiph.Flow , 85:157–163, 2016.[50] C. M. Stein. Estimation of the mean of a multivariate normal distribution.
Ann. Stat. , pages 1135–1151, 1981.[51] M. Storath, A. Weinmann, J. Frikel, and M. Unser. Joint image recon-struction and segmentation using the Potts model.
Inverse Problems, 31(2):025003, 2015. [52] C. Vacar and J.-F. Giovannelli. Unsupervised joint deconvolution and segmentation method for textured images: A Bayesian approach and an advanced sampling algorithm.
EURASIP J. Adv. Signal Process., special issue on Advanced Computational Methods for Bayesian Signal Processing, (17), March 2019. [53] H. Wendt, P. Abry, and S. Jaffard. Bootstrap for empirical multifractal analysis. IEEE Signal Process. Mag., 24(4):38–48, 2007. [54] H. Wendt, S. G. Roux, P. Abry, and S. Jaffard. Wavelet leaders and bootstrap for multifractal analysis of images. Signal Process., 89(6):1100–1114, 2009.

Figure 4: Porous media multiphase flow texture segmentation based on fractal features. Comparisons between the different approaches summarized in Table 2. Panels: (a) flow image, (b) linear regression, (c) T-ROF-Id, (d) ROF-Id, (e) T-ROF-S, (f) ROF-S, (g) T-Joint, (h) Joint, (i) T-Coupled, (j) Coupled.

Figure 5: Porous media multiphase flow texture segmentation based on fractal features. Comparisons between the different approaches summarized in Table 2. Zoom on the area marked by the black rectangle in Figure 4(a). Panels as in Figure 4, with (a) the zoomed flow image.

Figure 6: Grid search strategy to minimize SURE for the segmentation of a zoomed flow image. Panels: (a) ROF-Id, (b) ROF-S, (c) Joint, (d) Coupled.