Accelerating Auxiliary Function-based Independent Vector Analysis
Andreas Brendel and Walter Kellermann
Multimedia Communications and Signal Processing, Friedrich-Alexander-Universität Erlangen-Nürnberg,
Cauerstr. 7, D-91058 Erlangen, Germany
ABSTRACT
Independent Vector Analysis (IVA) is an effective approach for Blind Source Separation (BSS) of convolutive mixtures of audio signals. As a practical realization of an IVA-based BSS algorithm, the so-called AuxIVA update rules based on the Majorize-Minimize (MM) principle have been proposed, which allow for fast and computationally efficient optimization of the IVA cost function. For many real-time applications, however, update rules for IVA exhibiting even faster convergence are highly desirable. To this end, we investigate techniques which accelerate the convergence of the AuxIVA update rules without extra computational cost. The efficacy of the proposed methods is verified in experiments representing real-world acoustic scenarios.
Index Terms — Independent Vector Analysis, MM Algorithm, Convergence Acceleration
1. INTRODUCTION
In daily-life situations, acoustic sources are usually observed as a mixture, e.g., multiple simultaneously active speakers in the much-quoted cocktail party scenario, or a desired acoustic source mixed with interferers and background noise such as, e.g., street noise. Blind Source Separation (BSS) [1, 2] methods aim at separating such mixtures while using only very little information about the given scenario. As typical acoustic scenes within enclosures involve multipath propagation, Independent Component Analysis (ICA)-based approaches relying on instantaneous demixing models [3] have been extended to demixing models that represent a circular convolution by solving the instantaneous BSS problem in individual Discrete Fourier Transform (DFT) bins [4]. However, the performance of such narrowband methods strongly relies on effective solutions for the well-known internal permutation ambiguity [5]. As a state-of-the-art method to cope with the internal permutation problem, Independent Vector Analysis (IVA), which uses a multivariate Probability Density Function (PDF) as a source model for jointly describing all DFT bins, has been proposed [6].

Real-time applicability of IVA calls for fast and efficient optimization, and a large variety of methods has been developed since IVA was proposed: Starting with simple gradient and natural gradient algorithms [6], step size control mechanisms have been considered to obtain fast and stable convergence [7]. A fast fixed-point algorithm, following the ideas of FastICA [3], has been proposed in [8]. An Expectation Maximization (EM)-based optimization scheme has been proposed for IVA considering additive noise [9]. Based on the Majorize-Minimize (MM) principle [10], fast and stable update rules have been proposed using the iterative projection principle under the name Auxiliary Function IVA (AuxIVA) [11], which do not require tuning parameters such as a step size. The latter can be considered as the gold standard for optimizing the IVA cost function. For the special case of two sources and two microphones, even faster update rules based on a generalized eigenvalue decomposition have been developed [12].

In this paper, we investigate three methods for further acceleration of the AuxIVA update rules. The first method considered here is a Quasi-Newton scheme [13], which approximates the differential of the AuxIVA update rules using previous MM iterates [14]. The second approach uses a gradient-type scheme, also called Overrelaxed Bound Optimization [15], which is motivated by the intuition that extending the update of the algorithm into the direction of the current MM update may provide accelerated convergence [16]. As a third approach, we use the Squared Iterative Methods (SQUAREM) technique [17, 18], which has been developed for the acceleration of EM algorithms and is based on ideas of extrapolation for increasing the convergence speed of sequences [19]. All investigated acceleration methods are shown to provide faster convergence than the original AuxIVA update rules in experiments with measured Room Impulse Responses (RIRs), at the same computational cost.

This work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – 282835863.
2. INDEPENDENT VECTOR ANALYSIS
In the following, we consider an array of $K$ microphones recording the convolutive mixture of $K$ acoustic sources, i.e., a determined scenario. Using the observed microphone signals in the Short-Time Fourier Transform (STFT) domain with frequency bin $f \in \{1,\dots,F\}$ and time frame index $n \in \{1,\dots,N\}$

$$\mathbf{x}_{f,n} = [x_{1,f,n},\dots,x_{K,f,n}]^\mathrm{T} \in \mathbb{C}^K, \quad (1)$$

the demixed signals $\mathbf{y}_{f,n} \in \mathbb{C}^K$ are obtained according to

$$\mathbf{y}_{f,n} = [y_{1,f,n},\dots,y_{K,f,n}]^\mathrm{T} = \mathbf{W}_f \mathbf{x}_{f,n}, \quad (2)$$

by the demixing matrix

$$\mathbf{W}_f = [\mathbf{w}_{1,f},\dots,\mathbf{w}_{K,f}]^\mathrm{H} \in \mathbb{C}^{K \times K}, \quad (3)$$

with $\mathbf{w}_{k,f}$ capturing the weights of the $K$-channel MISO system producing the $f$-th DFT bin of the $k$-th demixed signal. For notational convenience, we also introduce the broadband demixed signal vector of output channel $k$

$$\mathbf{y}_{k,n} = [y_{k,1,n},\dots,y_{k,F,n}]^\mathrm{T} \in \mathbb{C}^F. \quad (4)$$

Using a broadband source model $G(\mathbf{y}_{k,n}) = -\log p(\mathbf{y}_{k,n})$, where $p(\cdot)$ is the multivariate PDF capturing all complex-valued STFT bins of the $k$-th output channel at time frame $n$, IVA aims at separating the sources using the demixing matrices $\mathbf{W}_f$ of all frequency bins, determined by minimizing the cost function

$$\mathcal{J}(\mathbf{w}) = \sum_{k=1}^{K} \hat{\mathrm{E}}\left\{ G\left(\mathbf{y}_{k,n}\right) \right\} - \sum_{f=1}^{F} \log \left| \det \mathbf{W}_f \right|, \quad (5)$$

where $\hat{\mathrm{E}}\{\cdot\} = \frac{1}{N}\sum_{n=1}^{N}(\cdot)$ denotes the averaging operator and

$$\mathbf{w} = [\mathbf{w}_{1,1}^\mathrm{T},\dots,\mathbf{w}_{K,F}^\mathrm{T}]^\mathrm{T} \in \mathbb{C}^{KF} \quad (6)$$

the concatenation of the demixing vectors of all channels and frequency bins. For minimizing the cost function (5), the MM principle is used in [11]. Hereby, an upper bound $Q$ for the cost function $\mathcal{J}$ is constructed which is easier to optimize and fulfills the properties of majorization and tangency, i.e.,

$$\mathcal{J}(\mathbf{w}) \le Q(\mathbf{w}\,|\,\mathbf{w}^{(l)}) \quad \text{and} \quad \mathcal{J}(\mathbf{w}^{(l)}) = Q(\mathbf{w}^{(l)}\,|\,\mathbf{w}^{(l)}), \quad (7)$$

where $\mathbf{w}^{(l)}$ denotes the concatenation of all demixing vectors (6) determined in iteration $l \in \{1,\dots,L\}$.

The MM algorithm iterates between two steps: construction of the upper bound $Q(\mathbf{w}\,|\,\mathbf{w}^{(l)})$ at the recent update $\mathbf{w}^{(l)}$ to ensure (7), and optimization of this upper bound to obtain $\mathbf{w}^{(l+1)}$. To construct the upper bound for supergaussian source models $G(\cdot)$, the following inequality has been proposed [11]

$$\hat{\mathrm{E}}\left\{ G\left(\mathbf{y}_{k,n}\right) \right\} \le \frac{1}{2}\sum_{f=1}^{F} \mathbf{w}_{k,f}^\mathrm{H} \mathbf{C}_f^{k,(l)} \mathbf{w}_{k,f} + \text{const.} \quad (8)$$

Hereby, $\mathbf{C}_f^{k,(l)}$ denotes a covariance matrix of the observed signals

$$\mathbf{C}_f^{k,(l)} = \hat{\mathrm{E}}\left\{ \frac{G'\big(r_{k,n}^{(l)}\big)}{r_{k,n}^{(l)}}\, \mathbf{x}_{f,n} \mathbf{x}_{f,n}^\mathrm{H} \right\}, \quad (9)$$

weighted by a factor dependent on the short-time broadband signal magnitude of source $k$

$$r_{k,n}^{(l)} = \left\| \mathbf{y}_{k,n}^{(l)} \right\| = \sqrt{ \sum_{f=1}^{F} \left| \big( \mathbf{w}_{k,f}^{(l)} \big)^\mathrm{H} \mathbf{x}_{f,n} \right|^2 }. \quad (10)$$

Application of inequality (8) to the cost function (5) yields the upper bound $Q$, which can be minimized using the iterative projection technique [11], stipulating the following update

$$\mathbf{w}_{k,f}^{(l+1)} = \frac{ \big( \mathbf{W}_f^{(l)} \mathbf{C}_f^{k,(l)} \big)^{-1} \mathbf{e}_k }{ \sqrt{ \mathbf{e}_k^\mathrm{T} \big( \mathbf{W}_f^{(l)} \mathbf{C}_f^{k,(l)} \big( \mathbf{W}_f^{(l)} \big)^\mathrm{H} \big)^{-1} \mathbf{e}_k } }, \quad (11)$$

where $\mathbf{e}_k$ is the canonical basis vector with a one at the $k$-th position. A complete iteration of the AuxIVA update is summarized in Alg. 1.
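For illustration, the MM map of Alg. 1 can be sketched in NumPy for the Laplacian source model $G(r) = r$ (the model also used in the experiments below). This is a minimal readability-oriented sketch, not the authors' implementation; the helper `iva_cost`, which evaluates (5), is our addition for checking the monotone descent of the MM iteration.

```python
import numpy as np

def auxiva_iteration(W, X, eps=1e-12):
    """One AuxIVA MM update w^(l+1) = f(w^(l)) for the Laplacian model G(r) = r.

    W : (F, K, K) complex demixing matrices; row k of W[f] is w_{k,f}^H
    X : (F, K, N) complex STFT microphone signals x_{f,n}
    """
    F, K, N = X.shape
    W = W.copy()
    Y = np.einsum('fkj,fjn->fkn', W, X)            # y_{f,n} = W_f x_{f,n}, Eq. (2)
    r = np.sqrt((np.abs(Y) ** 2).sum(axis=0))      # r_{k,n}, Eq. (10), shape (K, N)
    for k in range(K):
        weight = 1.0 / np.maximum(r[k], eps)       # G'(r)/r = 1/r for G(r) = r
        e_k = np.zeros(K)
        e_k[k] = 1.0
        for f in range(F):
            # weighted covariance C_f^{k,(l)}, Eq. (9)
            C = (weight * X[f]) @ X[f].conj().T / N
            # iterative projection update, Eq. (11); rows are updated in place
            w = np.linalg.solve(W[f] @ C, e_k.astype(complex))
            scale = np.linalg.inv(W[f] @ C @ W[f].conj().T)[k, k].real
            W[f, k, :] = (w / np.sqrt(scale)).conj()
    return W

def iva_cost(W, X):
    """IVA cost function, Eq. (5), for G(r) = r."""
    Y = np.einsum('fkj,fjn->fkn', W, X)
    r = np.sqrt((np.abs(Y) ** 2).sum(axis=0))
    return r.mean(axis=1).sum() - np.log(np.abs(np.linalg.det(W))).sum()
```

Since each row update exactly minimizes the majorizer, `iva_cost` should be non-increasing over iterations, which also provides a convenient check for any accelerated variant.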
3. ACCELERATION SCHEMES
In the following, we present three methods for accelerating the convergence of AuxIVA. For convenience, we denote one MM map in accordance with Alg. 1 by $\mathbf{w}^{(l+1)} = f(\mathbf{w}^{(l)})$. After convergence, the MM algorithm attains a fixed point

$$f\big(\mathbf{w}^{(\infty)}\big) = \mathbf{w}^{(\infty)}. \quad (12)$$

Algorithm 1 AuxIVA: $\mathbf{w}^{(l+1)} = f(\mathbf{w}^{(l)})$
INPUT: $\mathbf{w}^{(l)}$
for $k = 1$ to $K$ do
  $r_{k,n}^{(l)} = \sqrt{\sum_{f=1}^{F} \big|(\mathbf{w}_{k,f}^{(l)})^\mathrm{H}\mathbf{x}_{f,n}\big|^2} \quad \forall n$
  for $f = 1$ to $F$ do
    $\mathbf{C}_f^{k,(l)} = \hat{\mathrm{E}}\Big\{\frac{G'(r_{k,n}^{(l)})}{r_{k,n}^{(l)}}\,\mathbf{x}_{f,n}\mathbf{x}_{f,n}^\mathrm{H}\Big\}$
    $\mathbf{w}_{k,f}^{(l+1)} = \big(\mathbf{W}_f^{(l)}\mathbf{C}_f^{k,(l)}\big)^{-1}\mathbf{e}_k \Big/ \sqrt{\mathbf{e}_k^\mathrm{T}\big(\mathbf{W}_f^{(l)}\mathbf{C}_f^{k,(l)}(\mathbf{W}_f^{(l)})^\mathrm{H}\big)^{-1}\mathbf{e}_k}$
  end for
end for
OUTPUT: $\mathbf{w}^{(l+1)}$

Hence, determining this final value $\mathbf{w}^{(\infty)}$ corresponds to finding a root of

$$\Delta f(\mathbf{w}) = f(\mathbf{w}) - \mathbf{w} = \mathbf{0}_{KF \times 1}. \quad (13)$$

This problem can be solved by Newton's method [14]

$$\mathbf{w}^{(l+1)} = \mathbf{w}^{(l)} - \left[ \mathrm{d}\Delta f\big(\mathbf{w}^{(l)}\big) \right]^{-1} \Delta f\big(\mathbf{w}^{(l)}\big), \quad (14)$$

where the differential of $\Delta f(\mathbf{w}^{(l)})$ is denoted by $\mathrm{d}\Delta f(\mathbf{w}^{(l)}) = \mathrm{d}f(\mathbf{w}^{(l)}) - \mathbf{I}_{KF}$. In the following, we present three acceleration methods which can be derived from the Newton-type update (14).

As a first acceleration scheme, we apply a Quasi-Newton approximation (see, e.g., [14]) to (14). Here, the differential of the MM map $\mathrm{d}f(\mathbf{w}^{(l)})$ is approximated by a matrix $\mathbf{M}$

$$\mathrm{d}f\big(\mathbf{w}^{(l)}\big) \approx \mathbf{M} \in \mathbb{C}^{KF \times KF}, \quad (15)$$

which is constructed by so-called secant approximations [13]

$$\mathbf{M}\,\Delta f\big(\mathbf{w}^{(l)}\big) = \Delta^2 f\big(\mathbf{w}^{(l)}\big). \quad (16)$$

Hereby, we introduced the following abbreviation

$$\Delta^2 f\big(\mathbf{w}^{(l)}\big) = f \circ f\big(\mathbf{w}^{(l)}\big) - f\big(\mathbf{w}^{(l)}\big), \quad (17)$$

and $(\cdot) \circ (\cdot)$ denotes the concatenation of functions. Multiple secant approximations, whose number we denote by $q$, have to be chosen in order to obtain decent results. This can conveniently be expressed in matrix notation as

$$\mathbf{M}\mathbf{U} = \mathbf{V}, \quad \text{where} \quad \mathbf{U}, \mathbf{V} \in \mathbb{C}^{KF \times q}, \quad (18)$$

i.e., we would obtain, e.g., $\mathbf{U} = [\Delta f(\mathbf{w}^{(l)}), \Delta f(\mathbf{w}^{(l-1)})]$ and $\mathbf{V} = [\Delta^2 f(\mathbf{w}^{(l)}), \Delta^2 f(\mathbf{w}^{(l-1)})]$ for $q = 2$. As a solution for $\mathbf{M}$ which minimizes its Frobenius norm and obeys (18), the following expression has been derived [14]

$$\mathbf{M} = \mathbf{V}\left( \mathbf{U}^\mathrm{H}\mathbf{U} \right)^{-1}\mathbf{U}^\mathrm{H}. \quad (19)$$
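As a quick numerical check of (18) and (19) (our addition, not from the paper), the minimum-Frobenius-norm solution reproduces the secant constraints exactly whenever $\mathbf{U}$ has full column rank; on a toy affine map $f(\mathbf{w}) = \mathbf{A}\mathbf{w} + \mathbf{b}$ with a full set of secants, $\mathbf{M}$ recovers the differential $\mathbf{A}$ exactly:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 6                                                  # stand-in for KF
A = 0.5 * rng.standard_normal((d, d)) / np.sqrt(d)     # contraction, spectral radius ~0.5
b = rng.standard_normal(d)
f = lambda w: A @ w + b                                # toy MM map (illustration only)
df = lambda w: f(w) - w                                # Delta f, Eq. (13)

# q = d secant pairs built from random probe points (columns of P)
P = rng.standard_normal((d, d))
U = np.column_stack([df(P[:, i]) for i in range(d)])
V = np.column_stack([f(f(P[:, i])) - f(P[:, i]) for i in range(d)])

M = V @ np.linalg.inv(U.conj().T @ U) @ U.conj().T     # Eq. (19)
assert np.allclose(M @ U, V)                           # secant constraints (18) hold
assert np.allclose(M, A)                               # M recovers df of the affine map
```

With exact secants, the corresponding Newton step lands on the fixed point $(\mathbf{I} - \mathbf{A})^{-1}\mathbf{b}$ in a single update, which is the behavior the Quasi-Newton scheme approximates for the nonlinear AuxIVA map.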
Insertion into (14) and application of the matrix inversion lemma yields [14]

$$\mathbf{w}^{(l+1)} = f\big(\mathbf{w}^{(l)}\big) - \mathbf{V}\left[ \mathbf{U}^\mathrm{H}\mathbf{V} - \mathbf{U}^\mathrm{H}\mathbf{U} \right]^{-1} \mathbf{U}^\mathrm{H} \Delta f\big(\mathbf{w}^{(l)}\big). \quad (20)$$

Note that the matrix to be inverted here is of dimension $q \times q$, i.e., small relative to the length of $\mathbf{w}$, and hence the inversion is computationally cheap. One update of the Quasi-Newton algorithm is summarized in Alg. 2.

Algorithm 2 Quasi-Newton
INPUT: $\mathbf{w}^{(l)}$
  $\Delta f(\mathbf{w}^{(l)}) = f(\mathbf{w}^{(l)}) - \mathbf{w}^{(l)}$
  Construct $\mathbf{V}$ and $\mathbf{U}$
  $\mathbf{w}^{(l+1)} = f(\mathbf{w}^{(l)}) - \mathbf{V}\big[\mathbf{U}^\mathrm{H}\mathbf{V} - \mathbf{U}^\mathrm{H}\mathbf{U}\big]^{-1}\mathbf{U}^\mathrm{H}\Delta f(\mathbf{w}^{(l)})$
OUTPUT: $\mathbf{w}^{(l+1)}$

By approximating the differential of the left-hand side of (13) by a scaled identity matrix

$$\mathrm{d}\Delta f\big(\mathbf{w}^{(l)}\big) \approx \mu^{-1}\,\mathbf{I}_{KF}, \quad (21)$$

we obtain with (14) a gradient-type algorithm with step size $\mu \le -1$

$$\mathbf{w}^{(l+1)} = \mathbf{w}^{(l)} - \mu\,\Delta f\big(\mathbf{w}^{(l)}\big), \quad (22)$$

which operates on the results of the MM iterations. Note that a step size of $\mu = -1$ corresponds to the original MM algorithm, and values above $-1$ will slow down convergence. There are many options for the choice of $\mu$ (see, e.g., [18]), where line search methods [13] would be a natural choice. However, the calculation of an adaptive step size adds significant computational load to the algorithm, e.g., caused by the evaluation of the cost function (5) for line search approaches. Hence, we will use a fixed step size here.

In the following, we review the SQUAREM method, which has been introduced and extensively used for the acceleration of EM algorithms [17, 18]. Let $\mathbf{z}^{(l)}$ denote the outcome of one gradient update according to (22) with step size $\alpha$

$$\mathbf{z}^{(l)} = \mathbf{w}^{(l)} - \alpha\,\Delta f\big(\mathbf{w}^{(l)}\big). \quad (23)$$

The main idea of SQUAREM is to square this update, i.e., to subsequently perform another gradient update to obtain the next iterate

$$\mathbf{w}^{(l+1)} = \mathbf{z}^{(l)} - \alpha\,\Delta f\big(\mathbf{z}^{(l)}\big) \quad (24)$$
$$\approx \mathbf{w}^{(l)} - \alpha\,\Delta f\big(\mathbf{w}^{(l)}\big) - \alpha\left[ \Big( f\big(\mathbf{w}^{(l)}\big) - \alpha\,\Delta^2 f\big(\mathbf{w}^{(l)}\big) \Big) - \Big( \mathbf{w}^{(l)} - \alpha\,\Delta f\big(\mathbf{w}^{(l)}\big) \Big) \right]$$
$$= \mathbf{w}^{(l)} - 2\alpha\,\Delta f\big(\mathbf{w}^{(l)}\big) + \alpha^2\,\Delta g\big(\mathbf{w}^{(l)}\big), \quad (25)$$

where we introduced the term

$$\Delta g\big(\mathbf{w}^{(l)}\big) = \Delta^2 f\big(\mathbf{w}^{(l)}\big) - \Delta f\big(\mathbf{w}^{(l)}\big). \quad (26)$$

One iteration of the SQUAREM algorithm is summarized in Alg. 3.

Algorithm 3
SQUAREM
INPUT: $\mathbf{w}^{(l)}$
  $\Delta f(\mathbf{w}^{(l)}) = f(\mathbf{w}^{(l)}) - \mathbf{w}^{(l)}$
  $\Delta g(\mathbf{w}^{(l)}) = \Delta^2 f(\mathbf{w}^{(l)}) - \Delta f(\mathbf{w}^{(l)})$
  $\alpha = -\,\big\| \Delta f(\mathbf{w}^{(l)}) \big\| \,\big/\, \big\| \Delta g(\mathbf{w}^{(l)}) \big\|$
  $\mathbf{w}^{(l+1)} = \mathbf{w}^{(l)} - 2\alpha\,\Delta f(\mathbf{w}^{(l)}) + \alpha^2\,\Delta g(\mathbf{w}^{(l)})$
OUTPUT: $\mathbf{w}^{(l+1)}$
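To make the wrapper concrete, one SQUAREM iteration (Alg. 3) can be sketched around a generic fixed-point map; the toy affine contraction below is our illustration only (the AuxIVA map of Alg. 1 would take the place of `f` in practice) and is chosen so that the fixed point is known in closed form:

```python
import numpy as np

def squarem_step(f, w):
    """One SQUAREM iteration (Alg. 3) around a generic fixed-point map f."""
    fw = f(w)                                   # first MM map, f(w)
    ffw = f(fw)                                 # second MM map ("squaring")
    d_f = fw - w                                # Delta f(w), Eq. (13)
    d_g = (ffw - fw) - d_f                      # Delta g(w), Eq. (26)
    alpha = -np.linalg.norm(d_f) / np.linalg.norm(d_g)   # step size, Eq. (27)
    return w - 2.0 * alpha * d_f + alpha ** 2 * d_g      # Eq. (25)

# Toy affine contraction f(w) = A w + b with fixed point w* = (I - A)^{-1} b.
rng = np.random.default_rng(1)
d = 8
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
A = Q @ np.diag(np.linspace(0.85, 0.9, d)) @ Q.T        # eigenvalues in [0.85, 0.9]
b = rng.standard_normal(d)
f = lambda w: A @ w + b
w_star = np.linalg.solve(np.eye(d) - A, b)

w_sq = np.zeros(d)
for _ in range(10):                             # 10 SQUAREM steps = 20 map evaluations
    w_sq = squarem_step(f, w_sq)
w_mm = np.zeros(d)
for _ in range(20):                             # 20 plain MM steps, same budget
    w_mm = f(w_mm)
```

At an equal number of map evaluations, the squared update drives the residual far below that of the plain fixed-point iteration, mirroring the acceleration observed for AuxIVA in Sec. 4.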
4. EXPERIMENTS
In the following, we discuss the practical realization of the acceleration methods introduced above and present experimental results. For the Quasi-Newton method, we constructed the matrices $\mathbf{U}$ and $\mathbf{V}$ representing the secant constraints from the three most recent values of $\Delta f(\cdot)$ and two values of $\Delta^2 f(\cdot)$ prior to the current iteration, i.e., we computed only one MM update in each iteration. Using two MM updates per iteration as suggested in [14] did not yield better results in our experiments.

The step size $\mu$ of the gradient algorithm is chosen to be constant for simplicity. For the choice of a step size, convergence speed has to be traded off against stability; a fixed value of $\mu < -1$ showed good results in our experiments. The step size $\alpha$ for the SQUAREM algorithm is chosen to be [18]

$$\alpha = - \frac{\big\| \Delta f\big(\mathbf{w}^{(l)}\big) \big\|}{\big\| \Delta g\big(\mathbf{w}^{(l)}\big) \big\|}, \quad (27)$$

which is a quite common choice for the SQUAREM algorithm [20]. This expression compares the relative change in $\mathbf{w}$ caused by applying the MM map once with the corresponding change caused by applying it twice, and weights the first-order update $\Delta f(\mathbf{w}^{(l)})$ and the second-order update $\Delta g(\mathbf{w}^{(l)})$ accordingly.

For the experimental evaluation, we simulated microphone signals by convolving speech signals of about 10 s duration, randomly chosen from a set of 4 male and 4 female speech signals, with RIRs measured in three different rooms: two meeting rooms and a seminar room with different reverberation times $T_{60}$. The RIRs were measured with a linear microphone array with uniform spacing between the microphones. Two configurations of RIRs have been measured in the mentioned enclosures, with the sources placed at two different distances from the microphone array and at different angles w.r.t. the array axis. As we consider only determined scenarios, the number of sources and microphones was equal in all measurements. White Gaussian noise was added to obtain a Signal-to-Noise Ratio (SNR) of 30 dB at the microphones.

The microphone signals have been transformed into the STFT domain by employing a Hamming window at a sampling frequency of 16 kHz. The performance of the algorithms has been measured by the Signal-to-Distortion Ratio (SDR), Signal-to-Interference Ratio (SIR) and Signal-to-Artefact Ratio (SAR) w.r.t. the unprocessed signals [21]. Note that these performance measures are indirect indicators for the convergence of the algorithm, as they do not express the cost to be minimized. However, they can be seen as a strong indicator for the separation quality as experienced by a user. We used a Laplacian source model, i.e., $G(r_{k,n}(\mathbf{w}_k)) = r_{k,n}(\mathbf{w}_k)$, which is a common choice for IVA applied to audio signals [6, 11]. The results of the experiments described above are shown in Fig. 1 in terms of SDR and SIR.

Fig. 1. Performance of the discussed algorithmic variants (AuxIVA, SQUAREM, Gradient, Quasi-Newton) in terms of SDR and SIR w.r.t. runtime of the algorithms for a segment of 10 s of speech. The plots are created by averaging the results for all three rooms and the two source-array distances. Each experiment corresponding to a certain room and distance has been repeated 20 times, choosing the source signals randomly from a set of four male and four female speech signals. The first row of plots shows results for a determined scenario comprising 2 sources and 2 microphones, the second row for 3 sources and 3 microphones.

Results for the improvement of the SAR are omitted due to space constraints; however, the SAR improvement was roughly the same for the investigated methods. Fig. 1 shows the results for scenarios comprising 2 sources and 2 microphones as well as 3 sources and 3 microphones. All three rooms and the two source-array distances have been evaluated by repeating the experiment 20 times for each configuration, where the source signals are drawn randomly from the set of four male and four female speech signals. The mean performance values over these different acoustic conditions are shown for the discussed algorithms as a function of runtime in Fig. 1.

The SQUAREM-based method converged after fewer iterations than the other methods. To take into account the additional computational cost of more advanced algorithms which increase the convergence rate per iteration, the runtime per iteration has been considered in order to obtain a fair comparison. Here, it turned out that the runtime is dominated by the evaluation of the MM map, and the additional runtime caused by operations added to the MM map was negligible. The runtime per iteration for AuxIVA, the gradient-based and the Quasi-Newton-based method was roughly 0.16 s for two sources and 0.27 s for three sources on average. Due to the second required MM map, the SQUAREM method needed roughly twice as much runtime per iteration. These observations have been incorporated into Fig. 1 by showing the performance of the algorithms as a function of runtime. It can be observed that all algorithms converge to similar final values, with a slight advantage for the acceleration methods. However, all acceleration schemes provide significantly faster convergence than AuxIVA itself. The gradient-type method and the Quasi-Newton method, both using only a single MM map per iteration, showed similar convergence speed. The SQUAREM method, based on two MM maps, outperforms these methods especially for the three-source case and provides SDR and SIR improvements in the early convergence phase which are higher by several dB than the AuxIVA results at the same runtime.
5. CONCLUSIONS
We investigated the application of three different schemes for accelerating the convergence of the AuxIVA update rules. We showed that all three methods increased the convergence speed in terms of SDR and SIR improvement at the same runtime requirements as AuxIVA. The gradient-based approach represents a simple but effective modification of the original algorithm, but requires the selection of a suitable step size. In our experiments, a fixed step size showed promising results; future work should investigate mechanisms to choose this step size automatically. The Quasi-Newton method performed similarly to the gradient-based method and was slightly outperformed by the SQUAREM method.

Future work will include an in-depth investigation of other acceleration methods (e.g., [22]). Also the application of such acceleration schemes to other BSS algorithms which suffer from slow convergence, e.g., Multichannel NMF (MNMF) [23] and TRIple-N Independent component analysis for CONvolutive mixtures (TRINICON) [24], will be part of future work.
6. REFERENCES

[1] E. Vincent, T. Virtanen, and S. Gannot, Eds., Audio Source Separation and Speech Enhancement, John Wiley & Sons, Hoboken, NJ, 2018.
[2] S. Makino, T.-W. Lee, and H. Sawada, Eds., Blind Speech Separation, Signals and Communication Technology. Springer, Dordrecht, 2007.
[3] A. Hyvärinen, J. Karhunen, and E. Oja, Independent Component Analysis, J. Wiley, New York, 2001.
[4] P. Smaragdis, "Blind Separation of Convolved Mixtures in the Frequency Domain," Neurocomputing, vol. 22, pp. 21–34, 1998.
[5] H. Sawada, R. Mukai, S. Araki, and S. Makino, "A Robust and Precise Method for Solving the Permutation Problem of Frequency-Domain Blind Source Separation," IEEE Transactions on Speech and Audio Processing, vol. 12, no. 5, pp. 530–538, Sept. 2004.
[6] T. Kim, H. T. Attias, S.-Y. Lee, and T.-W. Lee, "Blind Source Separation Exploiting Higher-Order Frequency Dependencies," IEEE Transactions on Audio, Speech and Language Processing, vol. 15, no. 1, pp. 70–79, Jan. 2007.
[7] Y. Liang, S. M. Naqvi, and J. A. Chambers, "Adaptive step size independent vector analysis for blind source separation," Corfu, Greece, July 2011.
[8] I. Lee, T. Kim, and T.-W. Lee, "Fast fixed-point independent vector analysis algorithms for convolutive blind source separation," Signal Processing, vol. 87, no. 8, pp. 1859–1871, Aug. 2007.
[9] J. Hao, I. Lee, T.-W. Lee, and T. J. Sejnowski, "Independent Vector Analysis for Source Separation Using a Mixture of Gaussians Prior," Neural Computation, vol. 22, no. 6, pp. 1646–1673, June 2010.
[10] D. R. Hunter and K. Lange, "A Tutorial on MM Algorithms," The American Statistician, vol. 58, no. 1, pp. 30–37, Feb. 2004.
[11] N. Ono, "Stable and fast update rules for independent vector analysis based on auxiliary function technique," in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, Oct. 2011, pp. 189–192.
[12] N. Ono, "Fast Stereo Independent Vector Analysis and its Implementation on Mobile Phone," in International Workshop on Acoustic Signal Enhancement (IWAENC), Aachen, Germany, Sept. 2012.
[13] J. Nocedal and S. J. Wright, Numerical Optimization, Springer Series in Operations Research. Springer, New York, 2nd edition, 2006.
[14] H. Zhou, D. Alexander, and K. Lange, "A quasi-Newton acceleration for high-dimensional optimization algorithms," Statistics and Computing, vol. 21, no. 2, pp. 261–273, Apr. 2011.
[15] R. Salakhutdinov and S. T. Roweis, "Adaptive Overrelaxed Bound Optimization Methods," in Proceedings of the Twentieth International Conference on Machine Learning (ICML-2003), Washington, DC, USA, 2003.
[16] K. Lange, "A gradient algorithm locally equivalent to the EM algorithm," Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 57, no. 2, pp. 425–437, Jan. 1995.
[17] R. Varadhan and C. Roland, "Squared Extrapolation Methods (SQUAREM): A New Class of Simple and Efficient Numerical Schemes for Accelerating the Convergence of the EM Algorithm," Working Paper 63, Johns Hopkins University, Nov. 2004.
[18] R. Varadhan and C. Roland, "Simple and Globally Convergent Methods for Accelerating the Convergence of Any EM Algorithm," Scandinavian Journal of Statistics, vol. 35, no. 2, pp. 335–353, June 2008.
[19] C. Brezinski and M. Redivo Zaglia, Extrapolation Methods: Theory and Practice, Number 2 in Studies in Computational Mathematics. North-Holland, New York, NY, USA, 1991.
[20] L. Zhao, J. Song, P. Babu, and D. P. Palomar, "A Unified Framework for Low Autocorrelation Sequence Design via Majorization-Minimization," IEEE Transactions on Signal Processing, vol. 65, no. 2, pp. 438–453, Jan. 2017.
[21] E. Vincent, R. Gribonval, and C. Févotte, "Performance measurement in blind audio source separation," IEEE Transactions on Audio, Speech and Language Processing, vol. 14, no. 4, pp. 1462–1469, July 2006.
[22] A. Berlinet and C. Roland, "Parabolic acceleration of the EM algorithm," Statistics and Computing, vol. 19, no. 1, pp. 35–47, Mar. 2009.
[23] H. Sawada, H. Kameoka, S. Araki, and N. Ueda, "Multichannel Extensions of Non-Negative Matrix Factorization With Complex-Valued Data," IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 5, pp. 971–982, May 2013.
[24] H. Buchner, R. Aichner, and W. Kellermann, "A generalization of blind source separation algorithms for convolutive mixtures based on second-order statistics,"