Joint Learning of Assignment and Representation for Biometric Group Membership
Marzieh Gheisari, Teddy Furon, Laurent Amsaleg,
Univ Rennes, Inria, CNRS, IRISA, France
{marzieh.gheisari-khorasgani, teddy.furon}@inria.fr, laurent.amsaleg@irisa.fr

ABSTRACT
This paper proposes a framework for group membership protocols preventing an honest-but-curious server from reconstructing the enrolled biometric signatures and from inferring the identity of querying clients. This framework learns the embedding parameters, the group representations and the assignments simultaneously. Experiments show the trade-off between security/privacy and verification/identification performance.
Index Terms— Group Representation, Verification, Identification, Security, Data Privacy.
1. INTRODUCTION
Group membership verification is a procedure checking whether an item or an individual is a member of a group. If membership is positively established, then access to some resource (buildings, wifi, payment, conveyor units, ...) is granted; otherwise access is refused. Granting this shared privileged access requires that the members of the group can be distinguished from non-members, but it does not require distinguishing members from one another. Indeed, privacy concerns suggest that the verification should be carried out anonymously.

This paper studies group verification and also group identification. In this latter setup, there are several groups of members and one needs to identify which group a user belongs to. This paper focuses on a privacy-preserving group identification procedure where the group identity of a member is found without disclosing the identity of that individual.

In computer vision, it is very common to aggregate signals into one representation [1, 2, 3], but these works do not consider security or privacy. For instance, in [4], Iscen et al. use the group testing paradigm to pack a random set of image signatures into a unique high-dimensional vector such that the similarities between the original non-aggregated signatures and a query signature are preserved through the aggregation.

Recently, [5, 6, 7] proposed a framework based on aggregation and embedding of several biometric signatures into a unique vector representing the members of a group. It has been demonstrated that this allows a good assessment of the membership property at test time, provided that the groups are small. It has also been shown that this provides privacy and
security. Privacy is enforced because it is impossible to infer from the aggregated feature which original signature matches the one used to probe the system. Security is preserved since nothing meaningful leaks from the embedded data [8, 9].

This paper revisits the core mechanism proposed by [6]. That work, however, is deterministic in the sense that it learns group representations based on predefined groups. This paper shows that jointly learning the group representations and the group assignments results in better performance without damaging the security. This addresses scenarios where the number of members is too big for their signatures to be packed into one unique group representation with a technique like [6]. Therefore, members are automatically assigned to different groups. A light cryptographic protocol is deployed to protect their privacy during group verification.

Research supported by the ERA-Net project ID IoT 20CH21 167534.
2. GROUP MEMBERSHIP

2.1. Notations
The embedding, the assignment, and the group representations are learned jointly at enrolment, and given to a server. Biometric signatures are modelled as vectors in R^d. X ∈ R^{d×N} is the matrix of the signatures to be enrolled into M groups. The group representations are stored column-wise in the ℓ×M matrix R. The group representations are quantized and sparse, i.e., r_g ∈ A^ℓ with A = {−1, 0, +1} and ‖r_g‖_0 ≤ S < ℓ, ∀g ∈ [M].

At query time, the user computes a sparse representation of his biometric signature q ∈ R^d. For that purpose, a function e : R^d → A^ℓ maps a vector to a sequence of ℓ discrete symbols. We use the sparsifying transform coding [8, 9]: p := e(q) = T_S(W^⊤ q). After projecting q ∈ R^d on the column vectors of W ∈ R^{d×ℓ}, the output alphabet A is imposed by the ternarization function T_S: the ℓ − S components having the lowest amplitude are set to 0, and the S remaining ones are quantized to +1 or −1 according to their sign.

Our group membership protocol aims at jointly learning the partition, the embedding and the group representations. The key is to introduce the auxiliary data E = [e_1, ..., e_N] ∈ A^{ℓ×N}, the hash codes of the enrolled signatures, and Y ∈ R^{M×N}, the group indicator matrix (y_{i,j} = 1 if e_j is assigned to the i-th group). The optimization problem is then composed of a cost for embedding, C_E, and a cost for partitioning, C_{A,G}:

    min_{W,R,Y} C_E(X, W, E) + C_{A,G}(E, Y, R).   (1)

The embedding cost is the loss for quantizing the signatures:

    C_E(X, W, E) := \sum_{i=1}^{N} ‖e_i − W^⊤ x_i‖².   (2)

The assignment aims at grouping together signatures sharing similar hash codes: the overall dissimilarity between members and their group representation is minimized while the separation between two groups is maximized.
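Before detailing the partitioning cost, the embedding function e(·) = T_S(W^⊤ ·) is simple to prototype. Below is a minimal sketch; the dimensions and the random stand-in for the learned projection W are illustrative assumptions, not values from the paper:

```python
import numpy as np

def ternarize(v, S):
    """T_S: zero out the (len(v) - S) smallest-magnitude components,
    quantize the S remaining ones to +1/-1 according to their sign."""
    p = np.zeros_like(v)
    keep = np.argsort(np.abs(v))[-S:]   # indices of the S largest amplitudes
    p[keep] = np.sign(v[keep])
    return p

# Embedding a query q: p = T_S(W^T q), with a random stand-in for W
rng = np.random.default_rng(0)
d, ell, S = 16, 8, 3
W = rng.standard_normal((d, ell))
q = rng.standard_normal(d)
p = ternarize(W.T @ q, S)               # ternary code with at most S nonzeros
```

The resulting code `p` lies in A^ℓ with ‖p‖_0 ≤ S, as required of both queries and group representations.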
Inspired by Linear Discriminant Analysis, we consider variance to measure dissimilarity. The within-group scatter matrix S_w and the between-group scatter matrix S_b are defined as

    S_w = \sum_{g=1}^{M} \sum_{i ∈ Y_g} (e_i − r_g)(e_i − r_g)^⊤ = (E − RY)(E − RY)^⊤,
    S_b = RY (RY)^⊤ = \sum_{g=1}^{M} |Y_g| r_g r_g^⊤,

where Y_g = {i ∈ [N] : y_{g,i} = 1}. The cost for partitioning is C_{A,G} = λ Tr(S_w) − γ Tr(S_b) for some λ, γ ∈ R_+.

In the end, the objective function is formulated as:

    min_{W,R,Y} ‖E − W^⊤ X‖_F² + λ Tr(S_w) − γ Tr(S_b)
    s.t. W^⊤ W = I_ℓ,
         Y ∈ {0,1}^{M×N}, ‖y_i‖_0 = 1 ∀i ∈ [N],
         e_i ∈ A^ℓ, ‖e_i‖_0 ≤ S,
         r_g ∈ A^ℓ, ‖r_g‖_0 ≤ S.   (3)

The constraint on Y ensures that each signature belongs to exactly one group. The solution of (3) is found by iterating the following steps:

W-Step. We fix E, R, Y and update W by solving:

    min_W ‖E − W^⊤ X‖_F²  s.t.  W^⊤ W = I_ℓ.   (4)

This is a least-squares Procrustes problem with an orthogonality constraint. Setting S := X E^⊤, [10] shows that W = U V^⊤, where U contains the eigenvectors corresponding to the ℓ (ℓ < d) largest eigenvalues of S S^⊤ and V contains the eigenvectors of S^⊤ S.

E-Step.
Given W, Y and R, (3) amounts to:

    min_E ‖E − W^⊤ X‖_F² + λ ‖E − RY‖_F²  s.t.  e_i ∈ A^ℓ, ‖e_i‖_0 ≤ S.   (5)

We first find the solution relaxing the constraints and then apply the ternarization function T_S to obtain the sparse codes:

    E = T_S( W^⊤ X + λ RY ).   (6)

(R,Y)-Step. When fixing W and E, the assignment and the group representations are found by minimizing:

    min_{R,Y} ‖E − RY‖_F² − (γ/λ) Tr( R Y Y^⊤ R^⊤ )
    s.t. Y ∈ {0,1}^{M×N}, ‖y_i‖_0 = 1 ∀i ∈ [N],
         r_g ∈ A^ℓ, ‖r_g‖_0 ≤ S.   (7)

As E is fixed, Tr(E E^⊤) is irrelevant to Y; thus minimizing (7) is equivalent to:

    min_{R,Y} ‖ (λ/(λ−γ)) E − RY ‖_F².   (8)

Relaxing the ternarization constraint, (8) is solved by a k-means clustering algorithm, i.e., iteratively:

• Update assignments: each item is assigned to its nearest group representative.
• Update centroids: the g-th centroid is the mean of all ẽ_i in group g.

Then the group representation r_g is found by applying the ternarization function to the g-th centroid.
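Putting the three steps together, the alternating minimization can be sketched as follows. This is only an illustration of the structure of the algorithm: the random initialization, the fixed iteration counts and the toy sizes are our assumptions, since the paper does not specify these details.

```python
import numpy as np

def ternarize_cols(M, S):
    """Column-wise T_S: keep the S largest magnitudes, take their signs."""
    out = np.zeros_like(M)
    for j in range(M.shape[1]):
        keep = np.argsort(np.abs(M[:, j]))[-S:]
        out[keep, j] = np.sign(M[keep, j])
    return out

def joint_learning(X, M_groups, ell, S, lam=1.0, gamma=0.5, iters=10, seed=0):
    """Sketch of the alternating minimization of (3); requires lam > gamma."""
    rng = np.random.default_rng(seed)
    d, N = X.shape
    E = ternarize_cols(rng.standard_normal((ell, N)), S)
    assign = rng.integers(0, M_groups, N)          # group index per signature
    R = ternarize_cols(rng.standard_normal((ell, M_groups)), S)
    for _ in range(iters):
        # W-step: orthogonal Procrustes, W = U V^T from the SVD of X E^T
        U, _, Vt = np.linalg.svd(X @ E.T, full_matrices=False)
        W = U @ Vt                                 # d x ell, W^T W = I_ell
        # E-step: relaxed solution, then ternarize (eq. (6))
        E = ternarize_cols(W.T @ X + lam * R[:, assign], S)
        # (R,Y)-step: k-means on the rescaled codes (eq. (8))
        Et = (lam / (lam - gamma)) * E
        for _ in range(5):
            d2 = ((Et[:, :, None] - R[:, None, :]) ** 2).sum(axis=0)  # N x M
            assign = d2.argmin(axis=1)             # nearest representative
            C = np.zeros((ell, M_groups))
            for g in range(M_groups):
                members = assign == g
                if members.any():
                    C[:, g] = Et[:, members].mean(axis=1)
            R = ternarize_cols(C, S)               # ternarized centroids
    return W, E, R, assign

# toy demo (sizes are illustrative)
X = np.random.default_rng(1).standard_normal((10, 30))
W, E, R, assign = joint_learning(X, M_groups=3, ell=6, S=2)
```

Each iteration leaves W with orthonormal columns and keeps every code and representation in A^ℓ with at most S nonzeros, matching the constraints of (3).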
3. EXPERIMENTS
This section presents the datasets used in our experiments and investigates the performance of the proposed method for two application scenarios. We compare our scheme with EoA-SP, AoE-SP [5] and EoA-ML, AoE-ML [6]. For the baselines, the N individuals of each dataset are enrolled into M random groups; for our scheme, the algorithm learns how to partition the enrolled templates.

Face descriptors are obtained from a pre-trained network based on the VGG-Face architecture [11], followed by PCA and then L2-normalization, with d = 1,.

LFW [12].
These are pictures of celebrities under all sorts of viewpoints and in an uncontrolled environment. We use the pre-aligned LFW images. The enrollment set consists of the N = 1680 individuals with at least two images in the LFW database. One random template of each individual is enrolled in the system, playing the role of x_i. Another N_q = 263 individuals were randomly picked in the database to play the role of impostors.

CFP [13].
These are frontal and profile views of celebrities taken in an uncontrolled environment. We use N = 400 frontal images to be enrolled in the system. The impostor set is a random selection of N_q = 100 other individuals.

Iris images are preprocessed by the following steps: iris localization, iris normalization and image enhancement. Then the feature vectors are extracted by Gabor filters.

Fig. 1: Performance comparison for varying group size m: P_fn at P_fp = 0. for group verification, on CFP, LFW, CASIA and MMU2 (curves: EoA-SP, AoE-SP, EoA-ML, AoE-ML, Ours).

CASIA-IrisV1 [14].
The database includes 756 iris images from 108 eyes of Chinese persons. The images stored in the database were captured within a highly constrained capturing environment: 3 images were collected in a first session and 4 images in a second session. Our setup is created by randomly sampling N = 80 individuals to be enrolled, and N_q = 28 impostors.

MMU2 [15].
This dataset contains iris images of people of different ages and nationalities from Asia, the Middle East, Africa and Europe. Each of them contributes 5 iris images for each eye. We exclude 5 left-eye iris images due to cataract disease.
A user claims she/he belongs to group g. This claim is true under hypothesis H_1 and false under hypothesis H_0 (i.e., the user is an impostor). Her/his signature q is embedded into p = e(q), and (p, g) is sent to the system, which compares p to the group representation r_g. The system accepts (t = 1) or rejects (t = 0) the claim. This is a two-hypothesis test with two probabilities of error: P_fp := P(t = 1 | H_0) is the false positive rate and P_fn := P(t = 0 | H_1) is the false negative rate. The figure of merit is P_fn when P_fp = 0.

Fig. 1 compares the performance of our scheme with the baselines for group membership verification. Overall, our scheme gives a better verification performance, especially on CASIA. Since our method simultaneously learns the group representations and the assignment, it aggregates similar embedded vectors, and this loses less information.

Note that, although LFW and CFP are difficult datasets due to the "in the wild" variations, the group membership verification task is handled well even for large group sizes. This is not the case for the iris datasets. As mentioned before, we make use of VGG-Face for the face datasets while, for iris, traditional feature extraction algorithms are used. So, the big difference in the overall analysis shows how the feature space affects the performance of group membership tasks.

Fig. 2: Performance comparison for varying group size m on group identification for CFP (left) and LFW (right): P_fn at P_fp = 0. for the first step of group identification (solid) and P_ε for the second step of group identification (dashed).

The scenario is an open-set identification where the querying user is either enrolled or an impostor. The system proceeds in two steps.
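The server-side decision of the verification test reduces to a single distance comparison. A minimal sketch, with made-up ternary codes and an illustrative threshold:

```python
import numpy as np

def verify_claim(p, r_g, tau):
    """Accept (t = 1) iff the distance between the query hash p and the
    claimed group representation r_g is below the threshold tau."""
    return int(np.linalg.norm(p - r_g) <= tau)

# toy ternary codes; values and threshold are illustrative
p   = np.array([1, 0, -1, 1, 0], dtype=float)
r_g = np.array([1, 0, -1, 0, 0], dtype=float)
t_accept = verify_claim(p, r_g, tau=1.5)    # distance 1.0 -> claim accepted
t_reject = verify_claim(p, -r_g, tau=1.5)   # distance 3.0 -> claim rejected
```

Sweeping the threshold τ over enrolled and impostor queries traces the (P_fp, P_fn) operating curve used in Fig. 1.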
First, the system decides whether or not this user is enrolled. This is verification as above, except that the group is unknown: the system computes δ_j = ‖p − r_j‖, ∀j ∈ [M], and accepts (t = 1) if the minimum of these M distances is below a given threshold τ. The figure of merit is P_fn when P_fp = 0.

When t = 1, the system proceeds to the second step. The estimated group is given by ĝ = argmin_{j ∈ [M]} δ_j. The figure of merit for this second step is P_ε := P(ĝ ≠ g), or the Detection and Identification Rate DIR := (1 − P_ε)(1 − P_fn).

Fig. 2 shows that our scheme brings an improvement compared to the baselines, and the improvement grows as the size of the groups increases.

The impact of the group size on DIR is illustrated in Fig. 3. Obviously, packing more signatures into one group representation is detrimental. It gets worse when the queries are not well correlated with the enrolled signatures.

Fig. 3: The Detection and Identification Rate (DIR) vs. P_fp for group identification on CASIA-IrisV1, for m = 10 and m = 16 (curves: EoA-SP, AoE-SP, EoA-ML, AoE-ML, Ours).

A curious server can only reconstruct a single vector r̂_g = rec(r_g) from the group representation r_g, and this vector serves as an estimation of any signature in the group. We measure the security by the mean square error over the dataset:

    MSE_S = (dN)^{-1} \sum_{g=1}^{M} \sum_{i ∈ Y_g} E( ‖x_i − r̂_g‖² ).   (9)

For the privacy of the query template, a curious server can reconstruct the query template q from its embedding:

    MSE_P = d^{-1} E( ‖Q − rec(e(Q))‖² ).   (10)

These reconstructions are possible only if the matrix W is known. This is not the case in practice, so we give here an extra advantage to the curious server. Figure 4 compares security with AoE-ML [6], where the assignment was imposed randomly, i.e., not learned. Different levels of sparsity are tested. The reconstruction errors of the queries are close in either case, yet learning the assignment improves the verification performance. Reconstructing the enrolled signatures is more difficult due to the aggregation. However, learning the assignment by similarity correspondence in the embedded domain decreases the security slightly while improving the performance a lot.

Fig. 4: Investigation of the trade-off between security and performance for varying sparsity level S on CFP (with m = 25) and CASIA-IrisV1 (with m = 16): P_fn vs. MSE for queries and enrolled signatures, for AoE-ML and Ours.
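For concreteness, MSE_P of (10) can be estimated by Monte Carlo. The reconstruction function rec(·) is not spelled out above; the sketch below assumes the least-squares linear reconstruction rec(p) = α W p with the best scalar α per query, which is only one plausible choice, and uses a random orthonormal W rather than a learned one:

```python
import numpy as np

rng = np.random.default_rng(0)
d, ell, S, n_q = 64, 32, 8, 200

# orthonormal-column stand-in for the learned projection W
W, _ = np.linalg.qr(rng.standard_normal((d, ell)))

def ternarize(v, S):
    p = np.zeros_like(v)
    keep = np.argsort(np.abs(v))[-S:]
    p[keep] = np.sign(v[keep])
    return p

err = 0.0
for _ in range(n_q):
    q = rng.standard_normal(d)
    q /= np.linalg.norm(q)              # L2-normalized signature, as enrolled
    p = ternarize(W.T @ q, S)           # what the curious server sees
    Wp = W @ p
    alpha = (q @ Wp) / (Wp @ Wp)        # best least-squares scale (assumption)
    err += np.sum((q - alpha * Wp) ** 2)
mse_p = err / (d * n_q)                 # empirical MSE_P of eq. (10)
```

Larger S lowers this reconstruction error (weaker privacy) while improving verification, which is exactly the trade-off plotted in Fig. 4.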
4. SECURITY PROTOCOLS
This section gives an example of a cryptographic protocol exploiting the group representations. The experimental section showed that grouping secures the enrolled signatures, but ternarization alone provides less protection to the query. Therefore, this protocol strengthens the protection of the querying user. For security reasons, the server only manipulates the query and the distances in the encrypted domain. For privacy reasons, the server only learns that the query is close enough to one group representation, but it cannot tell which group exactly. We assume an honest-but-curious user and server.

This protocol also justifies choices of our scheme: queries and group representations are heavily quantized onto a small alphabet A. They are long vectors but sparse: only S components will be processed in the encrypted domain. Moreover, we have ‖p − r‖² ∈ [0, 4S]. These facts ease the use of partially homomorphic encryptions with a limited modulus, whence a low complexity and expansion factor. The group representations remain in the clear on the server side, and we do not need fully homomorphic encryption.

The user generates a pair of secret and public keys (sk_U, pk_U) for an additively homomorphic cryptosystem e(·) (say [16]), and sends the query encrypted component-wise. The server computes its correlation with the group representation r_g:

    e( p^⊤ r_g, pk_U ) = \prod_{i : r_g(i) ≠ 0} e( p(i), pk_U )^{r_g(i)}.   (11)

The server also generates a key pair (sk_S, pk_S) for a multiplicatively homomorphic cryptosystem E(·) (say [17]), and sends the user ( E( e( p^⊤ r_g, pk_U ), pk_S ) )_g. The user randomly permutes the order of these quantities and masks them by multiplying them by E(1, pk_S). This yields another semantically secure version of the ciphertexts thanks to the multiplicative homomorphy of E(·).
The server decrypts ( e( p^⊤ r_k, pk_U ) )_k, but the permutation prevents connecting k back to the group index g. Again thanks to the homomorphy, the server computes ( e( a_k (2S − 2 p^⊤ r_k − τ) + b_k, pk_U ) )_k, where (a_k, b_k) are random signed integers. The user decrypts and sends ( a_k ( ‖p − r_k‖² − τ ) + b_k )_k to the server (note that ‖p − r_k‖² = 2S − 2 p^⊤ r_k when both vectors have exactly S nonzero components). The user cannot guess the distances ‖p − r_k‖² thanks to the masking (a_k, b_k)_k, nor even the sign of ( ‖p − r_k‖² − τ ). The server can (since it knows (a_k, b_k)) and thus learns whether there is one group for which ( ‖p − r_k‖² − τ ) is negative.
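To make (11) concrete, here is a toy run with a textbook Paillier cryptosystem [16]. The tiny primes, the seeded randomness and the example vectors are purely illustrative (and insecure); the point is the structure of (11): the exponent r_g(i) = −1 is realized as a modular inverse of the ciphertext.

```python
import math
import random

# Minimal textbook Paillier (toy primes; illustration only, NOT secure)
pp, qq = 1789, 1913
n, n2 = pp * qq, (pp * qq) ** 2
g = n + 1
lam = math.lcm(pp - 1, qq - 1)
mu = pow((pow(g, lam, n2) - 1) // n, -1, n)
rnd = random.Random(42)

def enc(m):
    r = rnd.randrange(1, n)
    while math.gcd(r, n) != 1:          # r must be a unit mod n
        r = rnd.randrange(1, n)
    return pow(g, m % n, n2) * pow(r, n, n2) % n2

def dec(c):
    m = (pow(c, lam, n2) - 1) // n * mu % n
    return m - n if m > n // 2 else m   # map back to signed values

# Eq. (11): the server multiplies the component-wise encryptions of p,
# raised to r_g(i) in {-1, +1} (inverse ciphertext for -1), over the
# nonzero components of the clear group representation r_g.
p_vec = [1, 0, -1, 1, 0, -1]            # encrypted query hash p (example)
r_g   = [1, -1, 0, 0, 1, -1]            # group representation (example)
cts = [enc(x) for x in p_vec]
acc = enc(0)                            # neutral element: encryption of 0
for c, r in zip(cts, r_g):
    if r == 1:
        acc = acc * c % n2
    elif r == -1:
        acc = acc * pow(c, -1, n2) % n2
corr = dec(acc)                         # equals p^T r_g = 2 here
```

Only the S nonzero components of r_g trigger a ciphertext operation, which is why the sparsity constraint keeps the encrypted-domain cost low.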
5. CONCLUSION
We proposed a method for group membership verification and identification that jointly learns the group representations and the assignments. The idea is to minimize the overall distance between group members while maximizing the separation between groups in the embedded domain. Yet, the method still has some rigidity: the prototyping of the embedding (the sparse ternary quantization), taking means as group centroids, and assigning a signature to only one group.

6. REFERENCES

[1] J. Sivic and A. Zisserman, "Video Google: a text retrieval approach to object matching in videos," in Proceedings of the IEEE International Conference on Computer Vision, 2003.
[2] Hervé Jégou, Florent Perronnin, Matthijs Douze, Jorge Sánchez, Patrick Pérez, and Cordelia Schmid, "Aggregating local image descriptors into compact codes," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 9, pp. 1704–1716, 2012.
[3] F. Perronnin and C. Dance, "Fisher kernels on visual vocabularies for image categorization," in Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition, 2007.
[4] Ahmet Iscen, Teddy Furon, Vincent Gripon, Michael Rabbat, and Hervé Jégou, "Memory vectors for similarity search in high-dimensional spaces," IEEE Transactions on Big Data, 2017.
[5] Marzieh Gheisari, Teddy Furon, Laurent Amsaleg, Behrooz Razeghi, and Slava Voloshynovskiy, "Aggregation and embedding for group membership verification," in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019.
[6] Marzieh Gheisari, Teddy Furon, and Laurent Amsaleg, "Privacy preserving group membership verification and identification," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, June 2019.
[7] Marzieh Gheisari, Teddy Furon, and Laurent Amsaleg, "Group membership verification with privacy: Sparse or dense?," in Proceedings of the IEEE International Workshop on Information Forensics and Security (WIFS), 2019.
[8] Behrooz Razeghi, Slava Voloshynovskiy, Dimche Kostadinov, and Olga Taran, "Privacy preserving identification using sparse approximation with ambiguization," in Proceedings of the IEEE International Workshop on Information Forensics and Security, 2017.
[9] Behrooz Razeghi and Slava Voloshynovskiy, "Privacy-preserving outsourced media search using secure sparse ternary codes," in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2018.
[10] Peter H. Schönemann, "A generalized solution of the orthogonal Procrustes problem," Psychometrika, vol. 31, no. 1, pp. 1–10, 1966.
[11] Omkar M. Parkhi, Andrea Vedaldi, Andrew Zisserman, et al., "Deep face recognition," in Proceedings of the British Machine Vision Conference, 2015.
[12] Gary B. Huang, Marwan Mattar, Tamara Berg, and Eric Learned-Miller, "Labeled faces in the wild: A database for studying face recognition in unconstrained environments," in Workshop on Faces in 'Real-Life' Images: Detection, Alignment, and Recognition, 2008.
[13] Soumyadip Sengupta, Jun-Cheng Chen, Carlos Castillo, Vishal M. Patel, Rama Chellappa, and David W. Jacobs, "Frontal to profile face verification in the wild," in Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2016.
[14] Chinese Academy of Sciences Institute of Automation, "CASIA-IrisV1 iris image database [online]."
[15] The Multimedia University, "MMU2 iris image database [online]," Available: http://pesona.mmu.edu.my/ccteo/.
[16] Pascal Paillier, "Public-key cryptosystems based on composite degree residuosity classes," in Advances in Cryptology — EUROCRYPT '99, Jacques Stern, Ed., Berlin, Heidelberg, 1999, pp. 223–238, Springer Berlin Heidelberg.
[17] T. ElGamal, "A public key cryptosystem and a signature scheme based on discrete logarithms," IEEE Transactions on Information Theory, vol. 31, no. 4, pp. 469–472, 1985.