Privacy-Preserving Near Neighbor Search via Sparse Coding with Ambiguation
Behrooz Razeghi*, Sohrab Ferdowsi†, Dimche Kostadinov‡, Flavio P. Calmon§, Slava Voloshynovskiy*
*University of Geneva, †HES-SO Geneva, ‡University of Zurich, §Harvard University
Abstract—In this paper, we propose a framework for privacy-preserving approximate near neighbor search via stochastic sparsifying encoding. The core of the framework relies on a sparse coding with ambiguation (SCA) mechanism that introduces the notion of inherent shared secrecy based on the support intersection of sparse codes. This approach is 'fairness-aware', in the sense that any point in the neighborhood has an equiprobable chance to be chosen. Our approach can be applied to raw data, latent representations of autoencoders, and aggregated local descriptors. The proposed method is tested on both synthetic i.i.d. data and real large-scale image databases.
I. INTRODUCTION
Many modern signal processing, machine learning and data mining applications, such as biometric authentication/identification, pattern recognition, speech processing and recommender systems, require near neighbor search of a query with respect to a given dataset and a distance measure. Many search services are outsourced to third parties (service providers) who possess powerful storage, communication and computing facilities. The major challenge is to satisfy the privacy constraints of the owner's data and the clients' interests, while still being capable of performing a fast search service over multi-billion entry datasets.

Let (S, d_S) be a metric space. Given a set X ⊆ S of M points, a parameter r, and a query point y ∈ S, the goal of the exact near neighbor (NN) problem is to find a point x ∈ X such that d_S(x, y) ≤ r, if such a point exists. In the approximate variant of this problem (ANN), given c > 1, the problem is relaxed to finding a point x ∈ X such that d_S(x, y) ≤ cr. These problems can be generalized to the k-NN and k-ANN settings. In this context, we assume S to be the N-dimensional Euclidean space, i.e., S = R^N, with the distance given by the ℓ₂-norm, d_S(x, y) = ‖x − y‖₂. NN search based on the naïve solution, i.e., the linear scan, is the bottleneck of the system in large-scale high-dimensional data sets [1]. Alternatively, approximate near neighbor (ANN) search is more efficient in terms of query time and space complexity [2]–[4]. Perhaps the most popular solution to the ANN problem is via hashing, where the aim is to transform the data points to a lower dimensional space and then perform similarity search in the lower dimensional representation. The two main research directions are (1) Locality Sensitive Hashing (LSH): indexing points using a hash table with the property that
similar (closer) data points have a higher probability of collision than dissimilar (far) points [1]–[7]; (2) learning to hash: performing NN similarity search in a low dimensional space with a lower search complexity [1], [8]–[11]. The objective in the latter methodology is to preserve semantic or distance similarity between the original space and the transformed space. Subsequent research showed that quantization-based solutions are preferred in terms of query time, space cost and search accuracy [1].

Fig. 1: The general block diagram of our framework.

Concretely, this work brings the following contributions: (1) We consider the methodology of learning to hash for privacy-preserving proximity search, which entails minimum information loss for authorized users. The authorized parties can purify the ambiguation noise using the shared secrecy based on the support intersection of sparse codes [12]. (2) We adopt the notion of fairness addressed in [7], but in a privacy-preserving setup. Our notion of fairness differs from machine learning algorithms where the goal is to handle the bias introduced at the training phase; we consider the bias in the stored data and in the querying response. By doing this, any point in the neighborhood has an equal chance to be chosen. Moreover, in some cases, it may suffice to return any of the points in the near neighborhood, rather than the computationally expensive nearest one. The equiprobable nearby scheme can also be utilized in privacy protection mechanisms. That is, instead of reporting the nearest neighbor, which leaks more information, the service provider just sends back a random or a typical data point close to the query point.

In comparison to [13], [14], our work has the following fundamental differences: a) In [13]–[16], a dimensionality reduction transform with random entries is utilized, while our sparsifying transform may keep, extend, or reduce the dimension of the original data. Moreover, our transform is
learned using the sparsifying transform problem to ensure an optimal sparse representation that is information preserving in general, whereas the transform in [13]–[15] might preserve the distances only under certain conditions of the Johnson-Lindenstrauss lemma. b) In [13], [14], [16], the codes are dense and binary, whereas in our method the codes are sparse (and possibly ternary), which forms the basis of our ambiguation framework. Last but not least, the embedding based on the universal quantization scheme [13] has information leakage in terms of clustering, i.e., the curious server can still perform clustering on the data points. Moreover, we impose no restrictions on the input data, i.e., we assume that as input we might have raw data, features extracted using any known hand-crafted method, aggregated local descriptors based on BoW, FV or VLAD [17]–[19], features from the last layers of deep nets [20], or the latent space of autoencoders [21]. We apply our model on the latent representation of the network designed in [12].

Throughout this paper, superscript (·)^T stands for the transpose. Vectors and matrices are denoted by boldface lower-case (x) and upper-case (X) letters, respectively. We use the same notation for a random vector x and its realization; the difference should be clear from the context. x_i denotes the i-th entry of vector x. For a matrix X, x(j) denotes the j-th column of X. We use the notation [N] for the set
{1, 2, ..., N}.

II. PRELIMINARIES
A. Problem Setup
Consider a three-party data release scenario involving (a) a data owner, (b) data users, and (c) a service provider (server). The data owner possesses a database X = [x(1), ..., x(M)] consisting of M data points x(m) ∈ R^N, m ∈ [M]. The database is used to offer some utility for the authorized data users. The data users seek some utility from the data owner based on their query y. The server provides a pre-determined service to the data users on behalf of the data owner. We assume that both the server and the data users are honest-but-curious, and we treat them as adversaries. The service provider may try to infer some information about the original data collection X from the disclosed public storage and/or the querying data sent to the server. For instance, the server may estimate the original data from the disclosed representations and query, or may establish links between the closest entries in the database. The data users may try to infer some information about the public representations and/or the original data via multiple varied queries, guessing the data manifold by inspecting the returned responses. A general diagram of our framework is depicted in Fig. 1.

Therefore, we study the problem of disclosing database X to a third party (public storage) in order to derive some utility, in terms of near neighbor search, for the authorized data users based on the public representations while, at the same time, protecting the privacy of the data owner (against the honest-but-curious server and data users) and of the data users (against the honest-but-curious server).
Fig. 2: Visualization of the desired property of the mapping scheme φ : S → T in the privacy-preserving near neighbor search setup: distances d_S(x, y) up to radius r are preserved (up to a factor (1 ± ε)) by d_T(φ(x), φ(y)), while distances beyond r become ambiguous.
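As a concrete baseline for the exact NN primitive defined in the introduction, the linear scan over X (the O(M·N)-per-query bottleneck that ANN methods avoid) can be sketched as follows; the function name is illustrative, not from the paper:

```python
import numpy as np

def near_neighbor_scan(X, y, r):
    """Exact r-near neighbor by linear scan: return the index of any
    point x in X with ||x - y||_2 <= r, or None if no such point exists."""
    dists = np.linalg.norm(X - y, axis=1)   # one pass over all M points
    hits = np.flatnonzero(dists <= r)
    return int(hits[0]) if hits.size else None

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 32))            # M = 1000 points in R^32
y = X[42] + 0.01 * rng.standard_normal(32)     # noisy copy of point 42
print(near_neighbor_scan(X, y, r=0.5))         # -> 42
```

In high dimensions, random Gaussian points are mutually far apart, so only the noisy copy falls inside the radius-0.5 ball around the query.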
B. Fair Near Neighbor
Let (S, d_S) be a metric space and let X ⊆ S be a set of M data points. Let B_S(c, r) = {x ∈ S | d_S(c, x) ≤ r} be the closed ball of radius r > 0 around a point c ∈ S. Let N(c, r) = B_S(c, r) ∩ X be the r-neighborhood of c in X, with size |N(c, r)|.

Definition 1.
Fair Near Neighbor (FNN) [7]. Given a data set X ⊆ S of M data points, a parameter r > 0, and a query point y, the goal is to find a data point x ∈ N(y, r) with probability µ, where 1/(|N(y, r)|(1 + ε)) ≤ µ ≤ (1 + ε)/|N(y, r)|, i.e., µ is an approximately uniform probability distribution.

C. Fair Privacy-Preserved Approximate Near Neighbor
Let (T, d_T) be a metric space, where d_T(a, b), ∀a, b ∈ T, is defined on supp(a) = {l ∈ [L] : a_l ≠ 0}. Let P ⊆ T be a set of M data points. Let B_T(g(y), r) = {f(x) ∈ T | d_T(g(y), f(x)) ≤ r}. We now define the Fair Privacy-preserved Approximate Near Neighbor method as follows:
Definition 2. Fair Privacy-preserved Approximate Near Neighbor (FPANN). Given a data set X ⊆ S of M data points and a parameter r > 0, the goal is to design a randomized privacy-preserving data release mechanism f : X → P and a (randomized) query processing g : S → T such that, for a given authorized query y, one can report a point f(x) ∈ N_T(g(y), r) with probability µ_p, where N_T(g(y), r) = B_T(g(y), r) ∩ P is the approximate r-neighborhood of g(y) in P, and 1/(|N_T(g(y), r)|(1 + ε)) ≤ µ_p ≤ (1 + ε)/|N_T(g(y), r)|.

Definition 3. (β, γ)-recoverable privacy mechanism. For 0 ≤ γ ≤ 1, β > 0, and given an authorized query y_auth and an unauthorized query y_unauth, a privacy-preserving data release mechanism f : X → P is (β, γ)-recoverable if:

(i): P_e^auth = Pr[E[d(x, x̂)] ≥ β | g(y_auth)] < γ,
(ii): P_e^unauth = Pr[E[d(x, x̂)] ≥ β | g(y_unauth)] ≥ γ,

where g(·) is the data user's query function to the service provider.

III. PROPOSED FRAMEWORK
A. Framework Overview
Our framework is composed of the following steps:

1) Preparation at Owner Side: The owner generates the sparse codewords from the data that s/he owns using the trained sparsifying transform. Next, he shares the privacy-protected sparse codebook with the service provider (server). Following Kerckhoffs's principle in cryptography, the data owner makes the learned sparsifying transform publicly available.

2) Indexing at Server Side: The server indexes the received sparse codes in a database.

3) Querying at Data User Side: The data user generates a sparse representation from his query data using the shared transform. Then, the client sends a function of his sparse representation to the server.

4) Near Neighbor Search at Server Side: Given the requested probe, the server runs a near neighbor search to find the stored sparse codes that are most similar (closest) to the probe. Finally, based on the pre-determined service to the data users, the server sends back an answer to the data user.

Next, we describe in more detail the fundamental elements of our mechanism.
B. Sparse Data Representation
The goal of sparsification is to obtain an information-preserving sparse representation of the original data. Our sparsifying transform consists of a linear mapper followed by an element-wise nonlinearity. We consider a joint learning problem to obtain the sparsifying transform W ∈ R^{L×N} as well as the sparse codebook A ∈ R^{L×M}, which can be formulated as:

(Ŵ, Â) = arg min_{(W, A)} ‖WX − A‖_F² + β₁ Ω(W) + β₂ Ω(A),   (1)

where β₁ ≥ 0 and β₂ ≥ 0 are regularization parameters, Ω(W) = β₃‖W‖_F² + β₄‖WW^T − I‖_F² − β₅ log|det W^T W| penalizes the information loss in order to avoid trivial solutions, and Ω(A) is the sparsity constraint on the compressed codebook A [22]. The term ‖WX − A‖_F² is the sparsification error, which represents the deviation of the transformed data from an exact sparse representation in the transform domain. Our algorithm for solving (1) alternates between an ℓ₀-"norm"-based sparse coding step and a non-convex transform update step [26]. Therefore, one can write the closed-form formulation of the encoder as [26], [27]:

a(m) = φ(x(m)) = ψ_λ(Wx(m)), ∀m ∈ [M],   (2)

where [ψ_λ(f)]_l = 1{|f_l| ≥ λ} f_l, ∀l ∈ [L], λ ≥ 0, and a(m) is S_x-sparse, i.e., ‖a(m)‖₀ ≈ S_x, ∀m ∈ [M]. The decoder (reconstruction mapper) R ∈ R^{N×L} can be formulated as follows:

min_R ‖RA − X‖_F² + β_R ‖R − (W^T W + βI)^{−1} W^T‖_F²,  s.t. R^T R = I,   (3)

We refer the reader to [23]–[25] for applications in group membership verification.
where A ∈ R^{L×M} is the sparse codebook, X ∈ R^{N×M} contains the original data points, and W ∈ R^{L×N} is the encoder transform. Since R has orthonormal columns, we have ‖RA − X‖_F² = tr[X^T X − 2X^T RA + A^T A] and ‖R − (W^T W + βI)^{−1} W^T‖_F² = tr[I − 2C^T R + C^T C], where C = (W^T W + βI)^{−1} W^T. Consequently, (3) is equivalent to the problem of maximizing tr[X^T RA] + β_R tr[C^T R] = tr[(AX^T + β_R C^T) R]. Considering the singular value decomposition AX^T + β_R C^T = UΣV^T, this formulation reduces to maximizing tr[UΣV^T R] = tr[ΣZ] = Σ_i z_ii Σ_ii ≤ Σ_i Σ_ii, where Z = V^T R U. Note that the last inequality holds because Z is an orthonormal matrix, so Σ_j z_ij² = 1 and z_ii ≤ 1. Therefore, the maximum is achieved if Z = I, i.e., the closed-form solution is R = VU^T, where AX^T + β_R ((W^T W + βI)^{−1} W^T)^T = UΣV^T.
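A minimal NumPy sketch of the closed-form encoder (2) and the Procrustes-style decoder update above, under the shape conventions X ∈ R^{N×M}, W ∈ R^{L×N} with L ≤ N so that the orthonormality constraint R^T R = I is satisfiable; the function names are ours, not the paper's:

```python
import numpy as np

def sca_encode(W, x, lam):
    """Encoder of Eq. (2): linear map followed by element-wise hard
    thresholding [psi_lambda(f)]_l = f_l * 1{|f_l| >= lambda}."""
    f = W @ x
    return np.where(np.abs(f) >= lam, f, 0.0)

def decoder_update(A, X, W, beta_R, beta):
    """Closed-form decoder step for (3): R = V U^T, where U S V^T is the
    thin SVD of A X^T + beta_R C^T and C = (W^T W + beta I)^{-1} W^T."""
    N = X.shape[0]
    C = np.linalg.solve(W.T @ W + beta * np.eye(N), W.T)       # N x L
    U, _, Vt = np.linalg.svd(A @ X.T + beta_R * C.T, full_matrices=False)
    return Vt.T @ U.T            # R in R^{N x L}, with R^T R = I_L
```

The returned R always has exactly orthonormal columns, since it is the product of (truncated) orthonormal factors of the SVD.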
C. Ambiguation Mechanism

The idea of ambiguation is to add (pseudo-)random noise to the orthogonal complement, i.e., the non-informative components, of the sparse code. The integration of 'sparse lossy coding' with 'ambiguation' yields a generalized randomization technique, namely Sparse Coding with Ambiguation (SCA) [27]. SCA provides an information-theoretically and computationally private mechanism. The information-theoretic privacy guarantee originates from the lossy compression induced at the sparsification stage, and the computational privacy guarantee originates from the ambiguation stage. The curious server faces a combinatorial complexity budget requirement to guess the informative components. The ambiguation noise is required to have the same distribution as the sparse codes, to guarantee indistinguishability in terms of statistical properties. We refer the reader to [27] for more details. The randomized privacy-preserving data release mechanism f : X → P ⊆ T can be formulated as:

p(m) = f(x(m)) = φ(x(m)) ⊕ n_supp^p, ∀m ∈ [M],   (4)

where ‖n_supp^p‖₀ ≈ S_p. Given a query point y, the (randomized) query release mechanism g : S → T can be formulated as:

q = g(y) = φ(y) ⊕ n_supp^q,   (5)

where ‖n_supp^q‖₀ ≤ S_q, 0 ≤ S_q ≤ S_p. If S_q = 0, the query is disclosed as an in-the-clear sparse code without ambiguation noise. Let us consider two hypotheses for near neighbor search as follows. H₁: the authorized query is related to one of the M data points in the database; for instance, it is a noisy version of one data point. H₀: the unauthorized query is not related to any data point; for instance, it is a synthetic query generated by an adversary.
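A sketch of the ambiguation step (4)-(5): noise is placed only on the complement of the support, with magnitudes resampled from the code itself so that the fake components mimic the marginal statistics of genuine ones. This is our illustrative implementation, not the exact noise model of [27]:

```python
import numpy as np

def ambiguate(a, S_noise, rng):
    """SCA ambiguation: add S_noise fake components at randomly chosen
    positions OUTSIDE supp(a); the informative support is untouched."""
    p = a.copy()
    comp = np.flatnonzero(a == 0)                 # orthogonal complement
    pos = rng.choice(comp, size=S_noise, replace=False)
    mags = np.abs(a[a != 0])                      # mimic code magnitudes
    p[pos] = rng.choice(mags, size=S_noise) * rng.choice([-1.0, 1.0], size=S_noise)
    return p
```

An authorized user holding the support of the query code can discard every component outside the expected support intersection, while an adversary observing S_x + S_noise nonzeros must search over the combinatorially many candidate supports.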
D. Near Neighbor Search

The near neighbor search is performed in the latent space T. Given a data set P of M embedded disclosed representations {p(m)}, m ∈ [M], a parameter r, and an embedded query point q, the service provider performs approximate near neighbor search and reports a point randomly and uniformly from B_T(q, r) ∩ P.

Fig. 3: a) local distance preserving; b) local robustness; c) and d) comparison of distortion-sparsity behavior for authorized (c) and unauthorized (d) parties; e) and f) R-recall@T curves, for sparsity levels S_x ∈ {16, 32, 64, 128}, on a subset of 10K CelebA images.
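The server-side search of this subsection can be sketched with a support-based distance; here we take d_T(p, q) = |supp(q)| − |supp(p) ∩ supp(q)| (one possible choice; the text only requires d_T to be defined on supports) and report a uniformly random point from the ball, which is exactly the fairness property of Definition 1:

```python
import numpy as np

def fpann_report(P, q, r, rng):
    """Server-side FPANN search sketch: P is an L x M matrix of disclosed
    (ambiguated) codes, q the disclosed query. Distance is measured on
    supports; a point is drawn uniformly from B_T(q, r) ∩ P."""
    sq = q != 0                                        # supp(q)
    overlap = (P != 0)[sq, :].sum(axis=0)              # |supp(p(m)) ∩ supp(q)|
    ball = np.flatnonzero(sq.sum() - overlap <= r)     # indices inside the ball
    return int(rng.choice(ball)) if ball.size else None
```

Because every index in the ball is drawn with equal probability, any two codes at distance at most r from the query are reported equally often, rather than the server always returning the single closest one.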
IV. DISCUSSION

We now discuss various properties of our method. One desired property of an embedding scheme in privacy-preserving near neighbor search is to preserve distance information only up to a specified radius, while quickly flattening after this distance threshold. Therefore, on the one hand, the information rate is spent on encoding local distances and, on the other hand, the curious server/data user cannot recover any distance information about signals that are far apart. Fig. 2 visualizes this locally isometric mapping, where d_S(·,·) and d_T(·,·) denote the distance measures in the original domain and the transform domain, respectively.

Suppose x ∼ N(0, σ_x² I_N) and y_auth = x + z, where z ∼ N(0, σ_z² I_N) and x ∈ X. Fig. 3a depicts the behavior of our embedding for two sparsity levels and compares them with linear embeddings, which preserve all distances equally. Let us define:

P_c = (1/(M·S_x)) Σ_{m=1}^{M} Pr{supp(φ(x(m))) = supp(φ(y(m)))},
P_m = (1/(M·S_x)) Σ_{m=1}^{M} Pr{supp(φ(x(m))) ≠ supp(φ(y(m)))}.

Fig. 3b illustrates the probability of correct support and missed support for the learned linear map W (problem (1)) and for W = I. The learned transform performs better on local distances.

A. Reconstruction Leakage
A potential threat is that the adversary may try to reconstruct the original data points from the disclosed representations. To get insight into the SCA model, we first provide results on a synthetic database and establish its connection to classical Shannon rate-distortion theory. Next, we validate our model on real image databases. For the sake of completeness, we also include the results provided in [11] on synthetic i.i.d. data. Note that the sparsity level S_x controls the information encoding rate, or equivalently, the distinguishability of data points in the transform domain. The ambiguation level S_p controls the randomness imposed on the informative data.

Fig. 3c and Fig. 3d illustrate and compare the distortion-sparsity behavior at authorized and unauthorized parties, respectively. Fig. 3c depicts reconstruction fidelity for four cases: 1) unquantized sparsifying encoding (2); 2) Sparse Ternary Coding (STC) for independent and identically distributed (i.i.d.) data [27]; 3) STC for i.i.d. data re-scaled in the original domain [11]; and 4) STC for correlated data drawn from an AR(1) model with parameter ρ = 0.5. We used the same experimental setting as [11]. Fig. 3d shows the reconstruction leakage at a curious server (or an adversary) who knows the encoder and its parameters, but has no knowledge of the correct indices to purify the ambiguated representations. The terminology 'half ambiguation' is defined as S_p = 0.5(L − S_x), and the terminology 'full ambiguation' is defined as S_p = L − S_x. Note that the information security guarantee addressed in [13] requires keeping the projection parameters secret.

TABLE I: Reconstruction quality (normalized MSE and SSIM on MNIST, F-MNIST and CIFAR-10) vs. sparsity level (S_x ∈ {1, 2, 4, 8, 16} at S_p = 0) and ambiguation level (S_p ∈ {0, 8, 16, 24, 32} at S_x = 16).

Fig. 4: t-SNE visualizations of the MNIST dataset: a) original space; b) transformed space with ambiguation; c) transformed space after purification; d) reconstructed space without knowledge of support; e) reconstructed space with knowledge of support; f) conceptual visualization of data clustering leakage.

Table I provides a quantitative comparison of reconstructed images using normalized MSE and SSIM (Structural Similarity Index) on the MNIST [28], Fashion-MNIST [29], and CIFAR-10 [30] databases, where we applied the proposed method to the latent representation of the convolutional autoencoder designed in [12], setting L = 128 and considering one code-map for the MNIST and Fashion-MNIST databases and four code-maps for the CIFAR-10 database. Finally, note that, based on these results, our model follows the notion of (β, γ)-recoverable privacy mechanism, which we defined in Section II.

As a large-scale retrieval experiment, Figs. 3e-3f depict the recall measure for the CelebA database. The ground truth is given by pixel-domain Euclidean distances, and the latent code of the network in [12] is used to measure the approximate distances.
Another potential threat is that the adversary may establish links between the closest disclosed representations. We now discuss database clustering leakage under our model. Note that the proposed mechanism can be applied to privacy-preserving clustering applications, where the goal is to perform clustering without disclosing the original data. The significant benefit of our method is that the authorized data users can purify the imposed ambiguation noise, whereas the adversary faces a combinatorial problem to guess the correct components.

Fig. 4 provides a qualitative visualization of clustering leakage on the MNIST database [28], for which t-distributed stochastic neighbor embedding (t-SNE) [31] is used to project the underlying space to 2D. As illustrated, our model prevents database clustering leakage. Denoting by P_intra and P_inter the probability density functions of 'intra-cluster' and 'inter-cluster' distances, respectively, Fig. 4f provides a conceptual visualization of database clustering leakage, where D(P_inter ‖ P_intra) = E_{P_inter}[log(P_inter/P_intra)].
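The clustering-leakage measure D(P_inter ‖ P_intra) can be estimated from distance samples with a simple histogram plug-in. This is a rough sketch; the divergence is used here only as a conceptual measure, and the estimator, binning, and smoothing below are our own choices:

```python
import numpy as np

def kl_plugin(inter, intra, bins=30, eps=1e-9):
    """Histogram plug-in estimate of D(P_inter || P_intra) from samples of
    inter- and intra-cluster distances. Near 0 when the two distance
    distributions overlap (no clustering leakage); large when separable."""
    lo = min(inter.min(), intra.min())
    hi = max(inter.max(), intra.max())
    p, _ = np.histogram(inter, bins=bins, range=(lo, hi))
    q, _ = np.histogram(intra, bins=bins, range=(lo, hi))
    p = (p + eps) / (p + eps).sum()          # smooth to avoid log(0)
    q = (q + eps) / (q + eps).sum()
    return float(np.sum(p * np.log(p / q)))
```

Well-separated clusters give a large divergence; after full ambiguation, the intra- and inter-cluster distance histograms overlap and the estimate drops toward zero.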
V. CONCLUSION

We present a computationally efficient, fairness-aware, privacy-preserving near neighbor search scheme that can be utilized in cloud-based applications. The key insight behind our mechanism is that, by approximating a sparse representation of the data points and adding random noise to its orthogonal complement, we can control the privacy-utility trade-off in terms of dataset reconstruction and dataset clustering. The authorized data users can purify the ambiguated public representations thanks to the knowledge of the correct support of the query.

REFERENCES
[1] Jingdong Wang, Ting Zhang, Nicu Sebe, Heng Tao Shen, et al., "A survey on learning to hash," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 4, pp. 769–790, 2017.
[2] Piotr Indyk and Rajeev Motwani, "Approximate nearest neighbors: towards removing the curse of dimensionality," in Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing. ACM, 1998, pp. 604–613.
[3] Aristides Gionis, Piotr Indyk, Rajeev Motwani, et al., "Similarity search in high dimensions via hashing," in VLDB, 1999, vol. 99, pp. 518–529.
[4] Jun Wang, Wei Liu, Sanjiv Kumar, and Shih-Fu Chang, "Learning to hash for indexing big data—a survey," Proceedings of the IEEE, vol. 104, no. 1, pp. 34–57, 2015.
[5] Mayur Datar, Nicole Immorlica, Piotr Indyk, and Vahab S. Mirrokni, "Locality-sensitive hashing scheme based on p-stable distributions," in Annual Symposium on Computational Geometry. ACM, 2004, pp. 253–262.
[6] Tobias Christiani, "Fast locality-sensitive hashing frameworks for approximate near neighbor search," in Int. Conf. on Similarity Search and Applications. Springer, 2019, pp. 3–17.
[7] Sariel Har-Peled and Sepideh Mahabadi, "Near neighbor: Who is the fairest of them all?," in Advances in Neural Information Processing Systems (NeurIPS), 2019, pp. 13176–13187.
[8] Ruslan Salakhutdinov and Geoffrey Hinton, "Semantic hashing," International Journal of Approximate Reasoning, vol. 50, no. 7, pp. 969–978, 2009.
[9] Yair Weiss, Antonio Torralba, and Rob Fergus, "Spectral hashing," in Advances in Neural Information Processing Systems, 2009, pp. 1753–1760.
[10] Sohrab Ferdowsi, Slava Voloshynovskiy, Dimche Kostadinov, and Taras Holotyak, "Sparse ternary codes for similarity search have higher coding gain than dense binary codes," in IEEE Int. Symp. on Inf. Theory (ISIT), 2017.
[11] Behrooz Razeghi and Slava Voloshynovskiy, "Privacy-preserving outsourced media search using secure sparse ternary codes," in IEEE Int. Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018, pp. 1–5.
[12] Sohrab Ferdowsi, Behrooz Razeghi, Taras Holotyak, Flavio P. Calmon, and Slava Voloshynovskiy, "Privacy-preserving image sharing via sparsifying layers on convolutional groups," in IEEE Int. Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020.
[13] Petros Boufounos and Shantanu Rane, "Secure binary embeddings for privacy preserving nearest neighbors," in IEEE Int. Work. on Inf. Forensics and Security (WIFS), 2011, pp. 1–6.
[14] Li Weng, Laurent Amsaleg, and Teddy Furon, "Privacy-preserving outsourced media search," IEEE Transactions on Knowledge and Data Engineering, vol. 28, no. 10, pp. 2738–2751, 2016.
[15] Krishnaram Kenthapadi, Aleksandra Korolova, Ilya Mironov, and Nina Mishra, "Privacy via the Johnson-Lindenstrauss transform," arXiv preprint arXiv:1204.2606, 2012.
[16] Shantanu Rane and Petros T. Boufounos, "Privacy-preserving nearest neighbor methods: Comparing signals without revealing them," IEEE Signal Processing Magazine, vol. 30, no. 2, pp. 18–28, 2013.
[17] Hervé Jégou, Matthijs Douze, and Cordelia Schmid, "On the burstiness of visual elements," in IEEE Conf. on Comp. Vision and Pattern Recog. (CVPR), 2009, pp. 1169–1176.
[18] Florent Perronnin and Christopher Dance, "Fisher kernels on visual vocabularies for image categorization," in IEEE Conf. on Comp. Vision and Pattern Recog. (CVPR), 2007, pp. 1–8.
[19] Hervé Jégou, Matthijs Douze, Cordelia Schmid, and Patrick Pérez, "Aggregating local descriptors into a compact image representation," in IEEE Conf. on Comp. Vision and Pattern Recog. (CVPR), 2010, pp. 3304–3311.
[20] Artem Babenko, Anton Slesarev, Alexandr Chigorin, and Victor Lempitsky, "Neural codes for image retrieval," in Europ. Conf. on Comp. Vision. Springer, 2014, pp. 584–599.
[21] Diederik P. Kingma and Max Welling, "Auto-encoding variational Bayes," in International Conference on Learning Representations (ICLR), 2014.
[22] Saiprasad Ravishankar and Yoram Bresler, "Learning sparsifying transforms," IEEE Trans. on Signal Processing, vol. 61, no. 5, pp. 1072–1086, 2013.
[23] Marzieh Gheisari, Teddy Furon, and Laurent Amsaleg, "Joint learning of assignment and representation for biometric group membership," in IEEE Int. Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 2922–2926.
[24] Marzieh Gheisari, Teddy Furon, and Laurent Amsaleg, "Group membership verification with privacy: Sparse or dense?," IEEE, 2019, pp. 1–7.
[25] Marzieh Gheisari, Teddy Furon, Laurent Amsaleg, Behrooz Razeghi, and Slava Voloshynovskiy, "Aggregation and embedding for group membership verification," in IEEE Int. Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019, pp. 2592–2596.
[26] Dimche Kostadinov, Slava Voloshynovskiy, and Sohrab Ferdowsi, "Learning overcomplete and sparsifying transform with approximate and exact closed form solutions," in European Workshop on Visual Information Processing, 2018.
[27] Behrooz Razeghi, Slava Voloshynovskiy, Dimche Kostadinov, and Olga Taran, "Privacy preserving identification using sparse approximation with ambiguization," in IEEE Int. Work. on Info. Forensics and Security (WIFS), 2017, pp. 1–6.
[28] Yann LeCun and Corinna Cortes, "MNIST handwritten digit database," 2010.
[29] Han Xiao, Kashif Rasul, and Roland Vollgraf, "Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms," arXiv preprint arXiv:1708.07747, 2017.
[30] Alex Krizhevsky, Geoffrey Hinton, et al., "Learning multiple layers of features from tiny images," 2009.
[31] Laurens van der Maaten and Geoffrey Hinton, "Visualizing data using t-SNE,"