Group Membership Verification with Privacy: Sparse or Dense?
Marzieh Gheisari
Univ Rennes, Inria, CNRS, IRISA, France

Teddy Furon
Univ Rennes, Inria, CNRS, IRISA, France

Laurent Amsaleg
Univ Rennes, Inria, CNRS, IRISA, France
Abstract—Group membership verification checks if a biometric trait corresponds to one member of a group without revealing the identity of that member. Recent contributions provide privacy for group membership protocols through the joint use of two mechanisms: quantizing templates into discrete embeddings, and aggregating several templates into one group representation. However, this scheme has one drawback: the data structure representing the group has a limited size and cannot recognize noisy queries when many templates are aggregated. Moreover, the sparsity of the embeddings seemingly plays a crucial role in the verification performance.

This paper proposes a mathematical model for group membership verification that reveals the impact of sparsity on security, compactness, and verification performance. This model bridges the gap towards a Bloom filter robust to noisy queries. It shows that a dense solution is more competitive unless the queries are almost noiseless.
I. INTRODUCTION
Group membership verification is a procedure checking whether an item or an individual is a member of a group. If membership is positively established, then access to some resources (a building, a file, ...) is granted; otherwise access is refused. This paper focuses on privacy-preserving group membership verification procedures where members must be distinguished from non-members, but where the members of a group should not be distinguished from one another.

To this aim, a few recent contributions have proposed to rely on the aggregation and the embedding of several distinctive templates into a unique and compact high-dimensional feature representing the members of a group [1], [2]. It has been demonstrated that this allows a good assessment of the membership property at test time. It has also been shown that this provides privacy and security. Privacy is enforced because it is impossible to infer from the aggregated feature which original distinctive template matches the one used to probe the system. Security is preserved since nothing meaningful leaks from embedded data [3], [4].

However, [1] and [2] face severe limitations. Basically, it seems impossible to create features representing groups having many members. In this case, the probability to identify true positives vanishes and the false negative rate grows accordingly. Furthermore, the robustness of the matching procedure
fades and becomes unable to absorb even the smallest amount of noise that inherently differentiates the enrolled template of one member from the template captured at query time for this same member. In contrast, features representing only a few group members are robust to noise and cause almost no false negatives. A detailed analysis of [1] and [2] suggests that these limitations originate from the sparsity level of the features representing group members.

This paper investigates the impact of the sparsity level of the high-dimensional features representing group members on the quality of (true positive) matches and on their robustness to noise. It shows that it is possible to trade compactness and sparsity for better security or better verification performance. Sect. II first considers the aggregation of discrete random sequences, and models this compromise with information-theoretic tools. Sect. III applies this viewpoint to binary random sequences and shows that the impact of the noise on the query depends on the sparsity of the sequences. Sect. IV bridges the gap between the templates, i.e. real d-dimensional vectors, and the discrete sequences considered in the previous sections. Sect. V gathers the experimental results for a group membership verification based on faces.

II. DISCRETE SEQUENCES
This section considers the problem of creating a representation Y of a group of n sequences {X_1, ..., X_n}, whose use is to test whether a query sequence Q is a noisy version of one of these n original sequences. This test is done at query time, when the original sequences are no longer available and all that remains is the representation Y.

The sequences are elements of X^m, where X is a finite alphabet of cardinality |X|, say X := {0, 1, ..., |X| − 1}. The sequences follow a statistical model giving a central role to the symbol 0. The symbols of the sequences are independent and identically distributed with

  P(X = s) = 1 − p(|X| − 1) if s = 0,  and  P(X = s) = p otherwise,   (1)

for p ∈ (0, 1/|X|]. Sparsity means that probability p is small; density means that p is close to 1/|X| so that X is uniformly distributed over X.

A. Structure of the group representation

We impose the following conditions on the aggregation a(·) computing the group representation Y = a(X_1, ..., X_n):
• Y is a discrete sequence of the same length, Y ∈ Y^m,
• symbol Y(i) only depends on the symbols {X_1(i), ..., X_n(i)},
• the same aggregation is made index-wise: with abuse of notation, Y(i) = a(X_1(i), ..., X_n(i)), ∀i ∈ [m],
• Y(i) does not depend on any ordering of the set {X_1(i), ..., X_n(i)}.
These requirements are well known in traitor tracing and group testing, where they usually model the collusion attack or the test results over groups. Here, they simplify the analysis, reducing the problem to a single-letter formulation where index i is dropped, involving the symbols {X_1, ..., X_n}, Y and Q.

These conditions motivate a 2-stage construction. The first stage computes the type (a.k.a. histogram or tally) T of the symbols {X_1, ..., X_n}. Denote by T_{|X|,n} the set of possible type values. Its cardinality equals |T_{|X|,n}| = binom(n + |X| − 1, |X| − 1), which might be too big. The second stage applies a surjective function r : T_{|X|,n} → Y, where Y is a much smaller set.
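As an illustration, here is a minimal sketch (ours, not from the paper; all function names are hypothetical) of this two-stage construction for the binary alphabet of Sect. III: the type is computed index-wise, then an optional surjection shrinks it.

```python
import numpy as np

def aggregate(X, r=None):
    """Two-stage group representation: index-wise type, then optional surjection.

    X: (n, m) array of binary symbols (one row per enrolled sequence).
    r: optional dict mapping a type value t in {0..n} to a symbol of Y.
    """
    # Stage 1: the type T(i) counts the '1' symbols among X_1(i), ..., X_n(i).
    # It is index-wise and invariant to any ordering of the sequences.
    T = X.sum(axis=0)
    if r is None:
        return T                      # identity surjection: Y = T
    # Stage 2: surjection r onto a smaller alphabet Y.
    return np.vectorize(r.get)(T)

rng = np.random.default_rng(0)
n, m, p = 16, 1024, 0.5               # dense setup of Sect. III
X = rng.binomial(1, p, size=(n, m))
Y_type = aggregate(X)                                              # Y = T
Y_maj  = aggregate(X, {t: int(t >= n / 2) for t in range(n + 1)})  # majority vote
Y_all1 = aggregate(X, {t: int(t >= 1) for t in range(n + 1)})      # 'All-1'
```

The two surjections shown here (majority vote and 'All-1') are exactly the ones analysed in Sect. III-B.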
B. Noisy query

At enrollment time, the system receives n sequences, aggregates them into the compact representation Y, and then forgets the n sequences. At query time, the system receives a new sequence Q conforming with one of the following hypotheses:
• H1: Q is a noisy version of one of the enrolled sequences. Without loss of generality, Q = X_1 + N.
• H0: Q = X_0 + N, where X_0 shares the same statistical model but is independent of {X_1, ..., X_n}.
We model the source of noise (due to different acquisition conditions) by a discrete communication channel. It is defined by the function W : X × X → [0, 1] with W(q|x) := P(Q = q | X = x). We impose some symmetry w.r.t. the symbol 0: W(s|0) = η0 and W(0|s) = η1, ∀s ∈ X \ {0}.

At query time, the system computes a score S = s(Q, Y) and compares it to a threshold: hypothesis H1 is deemed true if S ≥ τ. This test leads to two probabilities of error:
• P_fp(n, m) is the probability of false positive: P_fp(n, m) := P(S ≥ τ | H0),
• P_fn(n, m) is the probability of false negative: P_fn(n, m) := P(S < τ | H1).
The emphasis on (n, m) is natural. It is expected that: i) the more sequences are aggregated, the less reliable the test is; ii) the longer the sequences are, the more reliable the test is.

C. Figures of merit (C, S, V)

This section presents three information-theoretic quantities (expressed in nats) measuring the performances of the scheme. The first two depend on the statistical model of X (especially p) and the aggregation mechanism a. The last one moreover depends on the channel.
1) Compactness C: The compactness of the group representation is measured by the entropy C := H(Y). It roughly means that the number of typical sequences Y scales exponentially as e^{mH(Y)}, which can theoretically be compressed down to the rate of H(Y) nats per symbol.
2) Security S: We consider an insider aiming at disclosing one of the n enrolled sequences. Observing the group representation Y, the insider's uncertainty is measured by the equivocation S := H(X|Y). This means that the insider does not know which of the e^{mH(X|Y)} typical sequences the enrolled sequences are.
3) Verification V: In our application, the requirement of utmost importance is to have a very small probability of false positive. We are interested in an asymptotic setup where m → +∞. This motivates the use of the false positive error exponent as a figure of merit:

  E_fp(n) := lim_{m → +∞} −(1/m) log P_fp(n, m).   (2)

If E_fp(n) > 0, then P_fp(n, m) exponentially vanishes as m becomes larger. The theory of hypothesis testing shows that E_fp(n) is upper bounded by the mutual information V := I(Y; Q), where Q is a symbol of the query sequence, i.e. a noisy version of X. It means that the necessary length for achieving the requirement P_fp(n, m) < ε is [5]

  m ≥ −log(ε) / V.   (3)

D. Noiseless setup
The bigger V and S, the better the performance in terms of verifiability and security. Yet, they cannot both be big at the same time. The noiseless case, when the channel introduces no error and Q = X, simply illustrates the trade-off:

  V ≤ C,   (4)
  V + S = H(X),   (5)

with H(X) = −p0 log p0 − (1 − p0) log p and p0 := P(X = 0) given by (1). For a given |X|, H(X) is maximised by the dense solution: H(X) ≤ log |X| with equality for p = 1/|X|.
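The trade-off is easy to probe numerically. The following sketch (ours; H_X is a hypothetical helper name) evaluates H(X) under model (1) and checks that the dense choice p = 1/|X| maximizes it:

```python
import numpy as np

def H_X(p, card):
    """Entropy (nats) of the source model (1): P(X=0) = 1 - p*(card-1)."""
    p0 = 1 - p * (card - 1)
    return -p0 * np.log(p0) - (1 - p0) * np.log(p)

card = 4
ps = np.linspace(1e-4, 1 / card, 1000)
H = np.array([H_X(p, card) for p in ps])
assert H.argmax() == len(ps) - 1          # maximum reached at the dense p = 1/|X|
print(H_X(1 / card, card), np.log(card))  # both equal log|X|
```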
III. BINARY ALPHABET

This section explores the binary case where X = {0, 1}. We first set the surjection to the identity function, so that Y = T. Then, the impact of the surjection is investigated.

A. Working with types
In the binary case, there are n + 1 type values. They can be uniquely labelled by the number of symbols '1' in {X_1, ..., X_n}, i.e. T = Σ_{i=1}^{n} X_i ∼ B(n, p).
1) Verification:
In the noiseless case, after some rewriting:

  V = h(p) − Σ_{t=0}^{n} P(T = t) h(t/n),   (6)

with h(p) := −p log(p) − (1 − p) log(1 − p), the entropy of a Bernoulli r.v. B(p). If p = 1/2 and n is large:

  V = 1/(2n) + o(1/n).   (7)

This is not the maximum of this quantity. For large n, the best option is to set p = α/n:

  V = β/n + o(1/n),   (8)

with α ≈ 1.3 and β ≈ 0.57. This was proven in the totally different application of traitor tracing [6, Prop. 3.8].

This section outlines two setups: the dense setup where p = 1/2, and the sparse setup where p goes to 0 when more sequences are packed in the group representation. Both setups share the asymptotic property that V ≈ κ/n for large n. According to (3), we can pack a big number n of sequences into one group representation provided that their length m scales proportionally to n.
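Both regimes can be checked numerically from (6). A short sketch (ours; it assumes scipy is available, and the helper names are our own) reproduces (7) and locates the sparse optimum of (8):

```python
import numpy as np
from scipy.stats import binom

def h(p):
    """Binary entropy in nats."""
    return -p * np.log(p) - (1 - p) * np.log(1 - p)

def V_types(n, p):
    """Noiseless V = I(T;Q) of (6); the t = 0 and t = n terms vanish since h(0)=h(1)=0."""
    t = np.arange(1, n)
    return h(p) - binom.pmf(t, n, p) @ h(t / n)

n = 1000
print(n * V_types(n, 0.5))                     # ~0.5, as in (7)
alphas = np.linspace(0.5, 3.0, 251)
a_star = max(alphas, key=lambda a: V_types(n, a / n))
print(a_star, n * V_types(n, a_star / n))      # sparse optimum of (8)
```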
2) Compactness:
The figure of merit for compactness of types is just C = H(T), where T follows a binomial distribution T ∼ B(n, p). In the dense setup p = 1/2, the binomial distribution is approximated by a Gaussian distribution N(n/2, n/4), providing:

  C = (1/2) log(πen/2) + O(1/n).   (9)

In the sparse setup p = α/n, the binomial distribution is approximated by a Poisson distribution P(α) [7]:

  C ≈ α(1 − log(α)) + e^{−α} Σ_{j=0}^{+∞} α^j log(j!) / j!.   (10)

This shows that the types are not compact in the dense setup, where C grows with n; C remains approximately constant in the sparse setup.
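A quick numerical comparison (our sketch) of the exact entropy of T against the approximations (9) and (10):

```python
import numpy as np
from scipy.stats import binom

def H_binom(n, p):
    """Exact entropy (nats) of T ~ B(n, p)."""
    pmf = binom.pmf(np.arange(n + 1), n, p)
    pmf = pmf[pmf > 0]
    return -(pmf * np.log(pmf)).sum()

alpha = 1.3                                      # sparsity level p = alpha/n
for n in (16, 64, 256, 1024):
    dense, sparse = H_binom(n, 0.5), H_binom(n, alpha / n)
    print(n, dense, 0.5 * np.log(np.pi * np.e * n / 2), sparse)
    # dense C grows like (9); sparse C stays near the Poisson entropy (10)
```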
3) Security:
Thanks to (5), we only need to calculate H(X) = h(p). In the dense setup, H(X) = log(2) and S converges to H(X) as n increases. Merging into a single representation protects an individual sequence. If sparse,

  H(X) = (α/n)(1 − log(α/n)) + o(1/n).   (11)

Therefore, S converges to zero as n increases, contrary to the dense setup. It might be more insightful to see that the ratio of uncertainties before and after observing T, i.e. H(X)/H(X|T), converges to 1 in both cases. Merging does provide some security, but sparsity is more detrimental.
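The following sketch (ours; it reuses the same helpers as above) computes S via (5) and the ratio of uncertainties for both setups:

```python
import numpy as np
from scipy.stats import binom

def h(p):
    return -p * np.log(p) - (1 - p) * np.log(1 - p)

def V_types(n, p):                        # noiseless V of (6)
    t = np.arange(1, n)
    return h(p) - binom.pmf(t, n, p) @ h(t / n)

for n in (10, 100, 1000):
    for p in (0.5, 1.3 / n):              # dense, then sparse
        S = h(p) - V_types(n, p)          # equivocation via (5)
        print(n, p, S, S / h(p))          # S/H(X) tends to 1 in both setups
```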
B. Adding a surjection

The motivation of the surjection onto a smaller set Y is to bound C as C ≤ log |Y|, ∀n. The Markov chain Q → X → T → Y imposes that V ≤ I(T; Q). The surjection thus provokes a loss in verification, as depicted in Fig. 1. App. A shows that for |Y| = 2, this loss is minimized for:

  r(t) = 0 if t < t_p, and 1 otherwise,   (12)

where t_p is a threshold depending on p.

Fig. 1. The trade-off (S, V, C) for X = {0, 1}, n = 16: Y = T (blue), Y = r(T) for 'All-1' (red) and majority vote (green). The dashed plot represents the projection onto C = 0. Triangles and stars summarize results (7) to (14).

In the dense setup, t_p = n/2 and the surjection corresponds to a majority vote collusion in traitor tracing (a threshold model in group testing). Hence, by [6, Prop. 3.4]:

  V = 1/(πn) + o(1/n).   (13)

In the sparse setup, t_p = 1, which corresponds to an 'All-1' attack in traitor tracing (the perfect model in group testing). Then the best option is to set p = log(2)/n and [6, Prop. 3.3]:

  V = (log 2)²/n + o(1/n).   (14)

From (3), the necessary length is m ≥ −n log(ε)/(log 2)². The main property V ≈ κ/n still holds, but the surjection lowers κ from 0.5 to 1/π ≈ 0.32 (dense), and from ≈0.57 to (log 2)² ≈ 0.48 (sparse). The sparse setup is still the best option w.r.t. V.
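Equations (13) and (14) can be verified by computing I(Y;Q) exactly in the noiseless case. A sketch (ours; the joint-table construction is our own way of expressing the model of Sect. II):

```python
import numpy as np
from scipy.stats import binom

def V_surjection(n, p, r):
    """Noiseless V = I(Y;Q) with Q = X_1 and Y = r(T)."""
    rest = binom.pmf(np.arange(n), n - 1, p)     # T' = T - X_1 ~ B(n-1, p)
    joint_xt = np.zeros((2, n + 1))              # joint P(X_1 = x, T = t)
    joint_xt[0, :n] = (1 - p) * rest             # x = 0: T = T'
    joint_xt[1, 1:] = p * rest                   # x = 1: T = T' + 1
    labels = np.array([r(t) for t in range(n + 1)])
    joint_xy = np.stack([joint_xt[:, labels == y].sum(axis=1)
                         for y in np.unique(labels)], axis=1)
    px = joint_xy.sum(axis=1, keepdims=True)
    py = joint_xy.sum(axis=0, keepdims=True)
    mask = joint_xy > 0
    return (joint_xy[mask] * np.log(joint_xy[mask] / (px @ py)[mask])).sum()

n = 1000
print(n * V_surjection(n, 0.5, lambda t: int(t >= n / 2)))        # ~1/pi, cf. (13)
print(n * V_surjection(n, np.log(2) / n, lambda t: int(t >= 1)))  # ~(log 2)^2, cf. (14)
```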
C. Relationship with the Bloom filter

A Bloom filter is a well-known data structure Y ∈ {0, 1}^m designed for set membership, embedding the items to be enrolled into Y thanks to k hash functions. Its probability of false negative is exactly 0, whereas its probability of false positive is not null. The number of hash functions minimizing P_fp(n, m) is k = ⌊log(2) m/n⌋. Then, the necessary length to meet a required false positive level ε is m ≥ −n log(ε)/(log 2)². These numbers show the connection with our scheme (14).

At the enrollment phase, the hash functions indeed associate to the j-th item a binary sequence X_j indicating which bits of Y have to be set. This sequence is indeed sparse, with k/m ≈ log(2)/n. The necessary length is the same. Indeed, the enrollment phase of a Bloom filter is nothing more than the 'All-1' surjection.

The only difference resides in the statistical model. There are at most k symbols '1' in sequence X_j whereas, in our model, their number follows a binomial distribution B(m, p). Yet, asymptotically as m → ∞, by some concentration phenomenon, the two models get similar. This explains why we end up with similar optimal parameters. Yet, the Bloom filter only works when the query object is exactly one enrolled item, whereas the next section shows that our scheme is robust to noise.
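For concreteness, a textbook Bloom filter implementing these formulas (our sketch; the salted-hash construction is one common way to simulate k hash functions, not something prescribed by the paper):

```python
import math, hashlib

class BloomFilter:
    """m = ceil(-n log(eps) / log(2)^2) bits, k = floor(log(2) m / n) hashes."""
    def __init__(self, n, eps):
        self.m = math.ceil(-n * math.log(eps) / math.log(2) ** 2)
        self.k = max(1, int(math.log(2) * self.m / n))
        self.bits = bytearray(self.m)

    def _indices(self, item):
        # k hash functions simulated by salting a single hash.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item):              # enrollment = the 'All-1' aggregation
        for i in self._indices(item):
            self.bits[i] = 1

    def __contains__(self, item):     # exact queries only: no noise robustness
        return all(self.bits[i] for i in self._indices(item))

bf = BloomFilter(n=1000, eps=0.01)
for j in range(1000):
    bf.add(f"user{j}")
print("user42" in bf, "impostor" in bf)   # True, then False w.h.p.
```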
IV. REAL VECTORS

This section deals with real vectors: n vectors to be enrolled {x_1, ..., x_n} ⊂ R^d, and the query vector q ∈ R^d. All have unit norm. An embedding mechanism E : R^d → X^m makes the connection with the previous section. As in [8], this study models the embedding as a probabilistic function.

A. Binary embedding
For instance, for X = {0, 1}, a popular embedding is:

  X(i) = [x^T U_i > λ_x], ∀i ∈ [m],   (15)

where the U_i are i.i.d. ∼ N(0_d, I_d). This in turn gives i.i.d. Bernoulli symbols {X(i)} with p = 1 − Φ(λ_x) if ‖x‖ = 1.

At query time, the embedding mechanism uses the same random vectors but a different threshold:

  Q(i) = [q^T U_i > λ_q], ∀i ∈ [m].   (16)

Under H1, suppose that q^T x = c < 1. This correlation defines the channel X → Q with the error rates:

  η0 = P(q^T U > λ_q | x^T U ≤ λ_x),   (17)
  η1 = P(q^T U ≤ λ_q | x^T U > λ_x).   (18)

The error rate η0 has the following expression (and similarly for η1):

  η0 = 1 − 1/((1 − p)√(2π)) ∫_{−∞}^{λ_x} Φ((λ_q − cx)/√(1 − c²)) e^{−x²/2} dx.   (19)
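The induced channel is easy to estimate by Monte Carlo. A sketch (ours; the thresholds and the correlation are arbitrary picks for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
d, m, c = 256, 20_000, 0.8
lam_x, lam_q = 1.0, 1.2                  # illustrative thresholds

x = rng.normal(size=d); x /= np.linalg.norm(x)
z = rng.normal(size=d); z -= (z @ x) * x; z /= np.linalg.norm(z)
q = c * x + np.sqrt(1 - c ** 2) * z      # unit query at correlation c with x

U = rng.normal(size=(m, d))              # shared random projections of (15)-(16)
X = U @ x > lam_x
Q = U @ q > lam_q

print(X.mean())                          # p, close to 1 - Phi(lam_x)
print(Q[~X].mean())                      # eta0 = P(Q=1 | X=0), cf. (17) and (19)
print((~Q)[X].mean())                    # eta1 = P(Q=0 | X=1), cf. (18)
```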
B. Induced channel

For this embedding, the parameters (λ_x, λ_q, c, d) of the vectors define the setup (p, η0, η1) of the sequences. It is a priori difficult to find the best tuning (λ_x, λ_q). For a fixed λ_x, η0 decreases with λ_q while η1 increases. App. B reveals that V is sensitive to η0, especially with the 'All-1' surjection of the sparse solution. Fig. 2 shows indeed that the dense solution (λ_x, λ_q) = (0, 0) is more robust, unless c is very close to 1. Here, we enforce a surjection (identity, All-1, or majority vote) and make a grid search to find the optimum (λ_x, λ_q) for a given c. It happens that these parameters are better set to 0, i.e. the dense solution, for the identity and the majority vote. As for the 'All-1' surjection, we observe that λ_x is such that p ≈ 1/n and λ_q is slightly bigger than λ_x to lower η0. Yet, this sparse solution is not as good as the dense solution unless c is close to 1, i.e. the query vector is very close to the enrolled vector.

This observation holds only for the embedding function (15). Hashing functions less prone to error η0 may exist.

V. EXPERIMENTAL WORK
We evaluate our scheme on face recognition. Face images come from the LFW [9], CFP [10] and FEI [11] databases. For each dataset, N individuals are enrolled into random groups. There is the same number N_q of positive and negative (impostor) queries.
Fig. 2. V as a function of the correlation c, for d = 256 and n = 15, with the three surjections: types, 'All-1', and majority vote.

Labeled Faces in the Wild:
These are pictures of celebrities under all sorts of viewpoints and in an uncontrolled environment. We use pre-aligned LFW images. The enrollment set consists of the N = 1680 individuals with at least two images in the LFW database. One random template of each individual is enrolled in the system, playing the role of x_i. Some other N_q = 263 individuals were randomly picked in the database to play the role of impostors.

Celebrities in Frontal-Profile:
These are frontal and profile views of celebrities taken in an uncontrolled environment. We only use the frontal images, with N = 400 individuals enrolled in the system. The impostor set is a random selection of N_q = 100 other individuals.

Faculdade de Engenharia Industrial:
The FEI database contains images in frontal view taken in a controlled environment. We use pre-aligned images. Each subject has two frontal images (one with a neutral expression and the other with a smiling facial expression). The database is created by randomly sampling N = 150 individuals to be enrolled, and N_q = 50 impostors.

A. Experimental Setup
Face descriptors are obtained from a pre-trained network based on the VGG-Face architecture, followed by PCA [12]. FEI corresponds to the scenario of employees entering a building with face recognition, whereas CFP is more difficult, and LFW even more difficult. To equalize the difficulty, we apply a dimension reduction (Probabilistic Principal Component Analysis [13]) to d = 128 for FEI and to dataset-specific dimensions for CFP and LFW. The parameters of PPCA are learned on a different set of images, not on the enrolled templates and queries. The vectors are also L2-normalized. With such post-processing, the average correlation between positive pairs equals 0.83 (FEI), 0.78 (CFP), and 0.68 (LFW), with a small standard deviation. Despite the dimension reduction, the hardest dataset is LFW and the easiest FEI.

In one simulation run, the enrollment phase makes random groups with the same number n of members. A user claims she/he belongs to group g. This claim is true under hypothesis H1 and false under hypothesis H0 (i.e. the user is an impostor). Her/his template is quantized to the sequence Q, and (Q, g) is sent to the system, which compares Q to the group
representation Y_g. This is done for all impostors and all queries of enrolled people. One Monte-Carlo simulation is composed of several such runs. The figure of merit is P_fn at a fixed level of P_fp.

Fig. 3. Verification performance P_fn (at fixed P_fp) vs. group size n for the baselines (see Sect. V-B) and our scheme.

B. Exp. 1
Our scheme is compared to the following baselines:
• EoA-SP and AoE-SP [1] (signal processing approach),
• EoA-ML and AoE-ML [2] (machine learning approach).
The drawback of these baselines is that the length m of the data structure is bounded. Here, it is set to its maximum value, i.e. m = d, the dimension of the templates.

Our scheme allows more freedom. Setting m = 8 × d produces a much bigger representation. It is not surprising that our scheme is better than the baselines. Fig. 3 validates our motivation to get rid of the drawback of the baselines with limited m, to achieve better verification performance. These results are obtained with the dense solution. Indeed, despite all our efforts, we could not achieve better results with the sparse solution. This confirms the lesson learnt from Fig. 2: the dense solution outperforms the sparse solution when the average correlation between positive pairs is not very close to 1, as is the case here (0.68 to 0.83).

The improvement is also better as the size of the groups increases. We explain this by the use of the types, i.e. Y = T. Equation (9) shows that C increases with n for the dense solution, compensating for aggregating more templates.

C. Exp. 2
There are two ways of reducing the size of the group representation. The first is to decrease m; the second is to lower C thanks to a surjection. Sect. III-B presented optimal surjections from T_{2,n} to Y = {0, 1}. We found experimentally good surjections onto sets Y of small cardinality. This is done according to the following heuristic, sketched below. Starting from T_{2,n}, we iteratively decrease the size of Y by one. This amounts to merging two symbols of Y. By brute force, we analyse all the pairs of symbols, measuring the loss in V induced by their merging. By merging the best pair, we decrease the number of symbols in Y by one. This process is iterated until the targeted size of Y is achieved. This heuristic is not optimal, but it is tractable. Fig. 4 compares these two means. Employing a coarser surjection is slightly better in terms of verification performance.
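A sketch of this greedy merging heuristic (ours; the joint table comes from the type construction of Sect. III, and all names are hypothetical):

```python
import numpy as np
from itertools import combinations
from scipy.stats import binom

def mutual_info(j):
    """Noiseless V from a joint table j[x, y], in nats."""
    px = j.sum(axis=1, keepdims=True)
    py = j.sum(axis=0, keepdims=True)
    mask = j > 0
    return (j[mask] * np.log(j[mask] / (px @ py)[mask])).sum()

def merge(j, a, b):
    """Fuse two symbols of Y: add columns a and b of the joint table."""
    keep = [k for k in range(j.shape[1]) if k not in (a, b)]
    return np.column_stack([j[:, keep], j[:, a] + j[:, b]])

def greedy_surjection(j, target):
    """Iteratively fuse the pair of Y-symbols whose merging costs the least V."""
    while j.shape[1] > target:
        a, b = max(combinations(range(j.shape[1]), 2),
                   key=lambda ab: mutual_info(merge(j, *ab)))
        j = merge(j, a, b)
    return j

# demo: joint table P(X_1 = x, T = t) of Sect. III, merged down to |Y| = 4
n, p = 16, 0.5
rest = binom.pmf(np.arange(n), n - 1, p)
joint = np.zeros((2, n + 1))
joint[0, :n], joint[1, 1:] = (1 - p) * rest, p * rest
print(mutual_info(joint), mutual_info(greedy_surjection(joint, 4)))
```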
D. Unexpected results

We have argued that FEI < CFP < LFW in terms of difficulty, due to the opposite ordering of the datasets' typical correlation c between positive pairs. Eq. (19) shows that a lower c produces a higher η0 (and η1), whence a lower V. In Fig. 3, the experimental results contradict this intuition.

This may be explained by the Signal-to-Noise Ratio at the template level. We define it as c²/v, where c is the average correlation for positive pairs and v is the variance of this correlation for negative pairs. If a negative query is uniformly distributed over the hypersphere, then its correlation with an enrolled template is approximately distributed as a centered Gaussian distribution with variance v = 1/d.

Yet, d has no impact on p, η0, and η1. We suppose that its impact is tangible on the entropy of the template vectors. Sect. II assumes that the enrolled sequences are statistically independent. This assumption is not granted with the embedding of Sect. IV. Yet, a bigger d favors the independence (or at least the decorrelation) between real template vectors.

VI. CONCLUSION
Our theoretical study justifies that the dense setup is more interesting in terms of verification performance V and security level S, unless we are operating in the high-SNR regime where the positive queries are very well correlated with the enrolled templates. This statement holds for any embedding, yet some are certainly more suited than others depending on d, c, and the geometrical relationship among positive pairs.

ACKNOWLEDGMENT
This work is supported by the CHIST-ERA project ID_IOT (20CH21 167534).
APPENDIX
Let us first explain how V is computed. Denote P_i(q, y) := P(Q = q, Y = y | H_i) and the channel W(q|x) := P(Q = q | X = x), ∀y ∈ Y, q ∈ X and i ∈ {0, 1}. Then,

  V = Σ_{q,y} P_1(q, y) log [P_1(q, y) / P_0(q, y)],   (20)

with P_0(q, y) = P(Q = q) P(Y = y) and

  P_1(q, y) = Σ_{x ∈ X} P(Y = y, X = x) W(q|x).   (21)

A. Surjection to Y = {0, 1}

We assume here the noiseless setup, allowing us to write P(Y = y, X = x) as P(x, y). Inspired by traitor tracing, we consider a probabilistic surjection where P(r(t) = 1) = θ_t. The vector θ ∈ [0, 1]^{n+1} parametrizes the surjection. Denote by ∇_θ V(t) the derivative w.r.t. θ_t. After some lengthy calculus:

  ∇_θ V(t) = n⁻¹ K_1(p, θ) (t − n K_2(p, θ)),   (22)

with K_1(p, θ) = P(T = t) Δ, K_2(p, θ) = [h′(P(0, 1)) − h′(P(Y = 1))] / Δ, and Δ = h′(P(0, 1)) − h′(P(1, 1)).

It is not possible to cancel the gradient ∇_θ V. The optimal θ thus lies on the boundary of the hypercube [0, 1]^{n+1}. This makes the surjection deterministic. Assuming P(Y = 1 | X = 0) < P(Y = 1 | X = 1), then 0 < K_1(p, θ) and 0 < K_2(p, θ) ≤ 1 because h′(·) is strictly decreasing. This makes ∇_θ V(0) < 0, and θ_0 must be set to the lowest possible value, i.e. θ_0 = 0, to increase V at most. This is indeed the case for any θ_t with t < n K_2(p, θ). In the same way, θ_n = 1, and so is θ_t if t > n K_2(p, θ). Yet, for a given θ, n K_2(p, θ) ranges from 0 to n as p increases from 0 to 1. Therefore, θ = (0, ..., 0, 1, ..., 1) is optimal only over an interval of p.
Fig. 4. Verification performance P_fn (at fixed P_fp) vs. mC, for n = 16, on CFP and LFW. This quantity is reduced by decreasing m (dashed lines) or by decreasing C thanks to a surjection (solid lines).

For n odd and p = 1/2, the choice θ_t = 0 if t < (n + 1)/2 and 1 otherwise (i.e. the majority vote) is optimal because K_2(1/2, θ) = 1/2 (P(Y = 1) = 1/2 and P(0, 1) = 1 − P(1, 1)). The 'All-1' surjection θ = (0, 1, ..., 1) makes P(1, 1) = 1, so that ∇_θ V(t) = +∞ if t > 0 and < 0 for t = 0.

B. Impact of the channel
Suppose that η is a parameter of the channel W(·|·). Then

  ∂V/∂η = Σ_{q,y} [∂P_1(q, y)/∂η] log [P_1(q, y) / P_0(q, y)],   (23)

because Σ_{q,y} ∂P_1(q, y)/∂η = ∂[Σ_{q,y} P_1(q, y)]/∂η = 0 and Σ_{q,y} [P_1(q, y)/P_0(q, y)] ∂P_0(q, y)/∂η = Σ_q ∂P(Q = q)/∂η = 0.

Suppose now that η = η0 := W(q|0), ∀q ∈ X \ {0}. Then,

  ∂P_1(q, y)/∂η = P(X = 0, Y = y), ∀q ∈ X \ {0}.   (24)

Taking (23) around the noiseless channel, where η = 0 and P(X = 0, Y = y) = P(0, y) because Q = X:

  ∂V/∂η |_{η=0} = Σ_{y, x≠0} P(0, y) log [P_1(x, y) / P_0(x, y)] + ...   (25)

We only express the first terms to outline that if P_1(x, y) = 0 while P(0, y), and hence P_0(x, y), are not null, then this derivative goes to −∞. A small deviation from the noiseless case with η ≠ 0 has a major detrimental impact on V. That situation happens for sure when working with types, i.e. Y = T: consider the null type t_0 obtained when X_1 = ... = X_n = 0: P(0, t_0) > 0 while P(x, t_0) = 0, ∀x ≠ 0.

One can prove that the surjection can mitigate this effect if ∃t ≠ t_0 : r(t) = r(t_0) and P(x, t) > 0 for some x ≠ 0. This happens with the majority vote of the dense setup but, unfortunately, not with the 'All-1' surjection of the sparse setup.
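This sensitivity can also be observed numerically. The sketch below (ours) feeds the noiseless joint distribution through a binary symmetric channel with parameter η (a simplification: the model above allows η0 ≠ η1) and compares the majority vote against the 'All-1' surjection:

```python
import numpy as np
from scipy.stats import binom

def V_noisy(n, p, r, eta):
    """V = I(Y;Q) when the query passes through a symmetric channel BSC(eta)."""
    rest = binom.pmf(np.arange(n), n - 1, p)
    joint_xt = np.zeros((2, n + 1))              # joint P(X_1 = x, T = t)
    joint_xt[0, :n] = (1 - p) * rest
    joint_xt[1, 1:] = p * rest
    labels = np.array([r(t) for t in range(n + 1)])
    joint_xy = np.stack([joint_xt[:, labels == y].sum(axis=1)
                         for y in np.unique(labels)], axis=1)
    W = np.array([[1 - eta, eta], [eta, 1 - eta]])   # W[q, x] = P(Q=q | X=x)
    joint_qy = W @ joint_xy                           # P(Q = q, Y = y)
    pq = joint_qy.sum(axis=1, keepdims=True)
    py = joint_qy.sum(axis=0, keepdims=True)
    mask = joint_qy > 0
    return (joint_qy[mask] * np.log(joint_qy[mask] / (pq @ py)[mask])).sum()

n = 64
for eta in (0.0, 0.01, 0.05):
    maj  = V_noisy(n, 0.5, lambda t: int(t >= n / 2), eta)
    all1 = V_noisy(n, np.log(2) / n, lambda t: int(t >= 1), eta)
    print(eta, n * maj, n * all1)   # the 'All-1' V collapses much faster
```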
REFERENCES

[1] M. Gheisari, T. Furon, L. Amsaleg, B. Razeghi, and S. Voloshynovskiy, "Aggregation and embedding for group membership verification," in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019.
[2] M. Gheisari, T. Furon, and L. Amsaleg, "Privacy preserving group membership verification and identification," in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, June 2019.
[3] B. Razeghi, S. Voloshynovskiy, D. Kostadinov, and O. Taran, "Privacy preserving identification using sparse approximation with ambiguization," in Proceedings of the IEEE International Workshop on Information Forensics and Security, 2017.
[4] B. Razeghi and S. Voloshynovskiy, "Privacy-preserving outsourced media search using secure sparse ternary codes," in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2018.
[5] C. E. Shannon, "Probability of error for optimal codes in a Gaussian channel," Bell System Tech. J., vol. 38, pp. 611–656, 1959.
[6] T. Laarhoven, "Search problems in cryptography: from fingerprinting to lattice sieving," Ph.D. dissertation, Eindhoven University of Technology, 2015.
[7] J. Boersma, "Solution to problem 87-6*: The entropy of a Poisson distribution," SIAM Review, vol. 30, no. 2, pp. 314–317, 1988.
[8] A. Andoni, P. Indyk, T. Laarhoven, I. P. Razenshteyn, and L. Schmidt, "Practical and optimal LSH for angular distance," NIPS, 2015. [Online]. Available: http://arxiv.org/abs/1509.02897
[9] G. B. Huang, M. Mattar, T. Berg, and E. Learned-Miller, "Labeled faces in the wild: A database for studying face recognition in unconstrained environments," in Workshop on Faces in 'Real-Life' Images: Detection, Alignment, and Recognition, 2008.
[10] S. Sengupta, J.-C. Chen, C. Castillo, V. M. Patel, R. Chellappa, and D. W. Jacobs, "Frontal to profile face verification in the wild," in Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2016.
[11] C. E. Thomaz and G. A. Giraldi, "A new ranking method for principal components analysis and its application to face image analysis," Image and Vision Computing, vol. 28, no. 6, pp. 902–913, 2010.
[12] O. M. Parkhi, A. Vedaldi, A. Zisserman et al., "Deep face recognition," in Proceedings of the British Machine Vision Conference, 2015.
[13] M. E. Tipping and C. M. Bishop, "Probabilistic principal component analysis," Journal of the Royal Statistical Society: Series B, vol. 61, no. 3, pp. 611–622, 1999.